Welcome to SOFA documentation!
Introduction
Skywater Opensource FpgA (SOFA) is a fully open-source embedded FPGA IP library, from the architecture description to production-ready layouts. As illustrated in Fig. 1, SOFA IPs are designed using the Skywater 130nm PDK, the OpenFPGA framework and Synopsys IC Compiler II. The runtime of the design flow for each IP is within 24 hours.
All the SOFA FPGAs are designed to interface with the Caravel SoC. We aim to empower embedded applications with a low-cost design approach and a high-density architecture.

24-hour FPGA IP development: from PDK to production-ready layout
HD FPGAs
Device Comparison
The High Density (HD) FPGAs are embedded FPGAs built with the Skywater 130nm High Density Standard Cell library (sky130_fd_sc_hd).
Resource/Capacity | SOFA HD | QLSOFA HD | SOFA CHD |
---|---|---|---|
Look-Up Tables [1] | 1152 | 1152 | 1152 |
Flip-flops | 2304 | 2304 | 2304 |
Soft Adders [2] | N/A | 1152 | 1152 |
Routing Channel Width [3] | 40 | 60 | 60 |
Max. Configuration Speed [4] | 50MHz | 50MHz | 50MHz |
Max. Operating Speed [4] | 50MHz | 50MHz | 50MHz |
User I/O Pins [5] | 144 | 144 | 144 |
Max. I/O Speed [4] | 33MHz | 33MHz | 33MHz |
Core Voltage | 1.8V | 1.8V | 1.8V |
DC and AC Characteristics
Recommended Operating Conditions
Symbol | Description | Min | Typical | Max | Units |
---|---|---|---|---|---|
VDD_io | Supply voltage for I/Os | 1.8 | 3.3 | 5.0 | V |
VDD_core | Supply voltage for FPGA core | 1.62 | 1.8 | 1.98 | V |
V_in | Input voltage for other I/Os | TBD | 3.3 | TBD | V |
I_in | Maximum current through pins | N/A | TBD | TBD | mA |
f_max | Maximum frequency of I/Os | N/A | TBD | TBD | MHz |
Note
Threshold voltage of logic 1 for I/O (V_OH) is 0.8 * VDD_io. In other words, with the typical VDD_io of 3.3V, V_in should be at least 2.64V in order to be sensed as logic 1.
Note
Threshold voltage of logic 0 for I/O (V_OL) is 0.4V. In other words, V_in should not exceed 0.4V in order to be sensed as logic 0.
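The two notes above can be checked with a short calculation. This is only a sketch: the function and constant names are illustrative and not part of SOFA, and the 0.4V low threshold is treated as an absolute value, as the note states it.

```python
# Illustrative helper (not part of SOFA): I/O thresholds from the notes above.
HIGH_RATIO = 0.8        # logic-1 threshold as a fraction of VDD_io
LOW_THRESHOLD_V = 0.4   # logic-0 threshold in volts (absolute, per the note)

def io_thresholds(vdd_io: float) -> tuple[float, float]:
    """Return (minimum V_in sensed as logic 1, maximum V_in sensed as logic 0)."""
    return HIGH_RATIO * vdd_io, LOW_THRESHOLD_V

high, low = io_thresholds(3.3)  # typical VDD_io from the operating conditions
print(f"logic 1 if V_in >= {high:.2f} V, logic 0 if V_in <= {low:.2f} V")
```

With a different I/O supply, for example 1.8V, the same ratio gives a logic-1 threshold of 1.44V.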
Typical AC Characteristics
Symbol | Description | Min | Max | Units |
---|---|---|---|---|
V_in Overshoot | Maximum allowed overshoot voltage for V_in | TBD | TBD | V |
V_in Undershoot | Maximum allowed undershoot voltage for V_in | TBD | TBD | V |
I_VDD_core | Quiescent VDD_core supply current | TBD | TBD | mA |
I_VDD_io | Quiescent VDD_io supply current | TBD | TBD | mA |
Chip Gallery
This section shows the layout images of each HD FPGA chip.
SOFA HD
SOFA HD is the base design of the SOFA high-density eFPGA IPs

Layout view of SOFA HD device in Caravel SoC
QLSOFA HD
QLSOFA HD is the arithmetic-enhanced design of the SOFA high-density eFPGA IPs

Layout view of QLSOFA HD device in Caravel SoC
SOFA CHD
SOFA CHD is the performance-optimized design of the SOFA high-density eFPGA IPs

Layout view of SOFA CHD device in Caravel SoC
SOFA HD
Architecture
Floorplan
Fig. 5 shows an overview of the architecture of the embedded FPGA fabric. The FPGA follows a homogeneous architecture which contains only a single type of tile in the center fabric. I/O tiles are placed at the boundary of the FPGA to interface with GPIOs and RISC-V processors (see details in I/O Resources).
Tile-based FPGA architecture
Tiles
The FPGA architecture follows a tile-based organization to exploit fine granularity in physical design. Three types of tiles are built:
Type | Capacity | Description |
---|---|---|
CLB | 144 | Each CLB tile consists of a Configurable Logic Block (CLB), an X-direction Connection Block (CBx), a Y-direction Connection Block (CBy) and a Switch Block (SB). This is the majority tile across the fabric, used to implement logic and registers. |
IO-A | 36 | The type-A I/O is a low-density I/O tile designed mainly to interface with the GPIOs of the SoC. Each I/O-A tile consists of 1 digital I/O cell. |
IO-B | 12 | The type-B I/O is a high-density I/O tile designed mainly to interface with the Wishbone interface and logic analyzer of the SoC. Each I/O-B tile consists of 9 digital I/O cells. |
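The tile capacities above can be cross-checked against the headline figures in the Device Comparison table. The sketch below combines them with the per-CLB figures given in the Configurable Logic Block section (8 logic elements per CLB, each with one fracturable LUT4 and two flip-flops); the constant names are illustrative only.

```python
# Tile capacities from the table above; per-CLB figures from the
# Configurable Logic Block section (8 LEs, each with 1 LUT4 and 2 FFs).
CLB_TILES, IOA_TILES, IOB_TILES = 144, 36, 12
LES_PER_CLB, LUTS_PER_LE, FFS_PER_LE = 8, 1, 2
IOS_PER_IOA, IOS_PER_IOB = 1, 9

luts = CLB_TILES * LES_PER_CLB * LUTS_PER_LE
ffs = CLB_TILES * LES_PER_CLB * FFS_PER_LE
ios = IOA_TILES * IOS_PER_IOA + IOB_TILES * IOS_PER_IOB

print(luts, ffs, ios)  # 1152 2304 144
```

These totals match the 1152 Look-Up Tables, 2304 flip-flops and 144 user I/O pins listed for the device.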
Routing Architecture
The routing architecture is based on uni-directional routing tracks, which are interconnected by routing multiplexers. Fig. 6 illustrates the detailed organization of the routing architecture.
Detailed routing architecture
The routing architecture consists of the following types of routing tracks:
- Length-1 wires (L1 wires), which hop over 1 logic block (including I/O blocks)
- Length-2 wires (L2 wires), which hop over 2 logic blocks (including I/O blocks)
- Length-4 wires (L4 wires), which hop over 4 logic blocks (including I/O blocks)
Each tile includes two routing channels, i.e., the X-direction routing channel and the Y-direction routing channel, providing horizontal and vertical connections to adjacent tiles. Each routing channel consists of 40 routing tracks. See details in Table 5.
Track type | Number of tracks per channel |
---|---|
Length-1 | 4 (10%) |
Length-2 | 4 (10%) |
Length-4 | 32 (80%) |
Total | 40 |
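The track counts follow directly from the 10%/10%/80% split applied to the channel width. A minimal sketch, with an illustrative function name that is not part of any SOFA or OpenFPGA tool:

```python
# Split a channel width into L1/L2/L4 tracks using the 10%/10%/80% ratios
# from the table above. The function name is illustrative.
def split_tracks(channel_width: int) -> dict[str, int]:
    l1 = channel_width // 10        # 10% Length-1 wires
    l2 = channel_width // 10        # 10% Length-2 wires
    l4 = channel_width - l1 - l2    # remaining 80% Length-4 wires
    return {"L1": l1, "L2": l2, "L4": l4}

print(split_tracks(40))  # {'L1': 4, 'L2': 4, 'L4': 32}
```

Applying the same split to a channel width of 60 gives 6, 6 and 48 tracks.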
Scan-chain
A built-in scan-chain in the FPGA connects the sc_in and sc_out ports of the CLBs in a chain (see details in Scan Chain), as illustrated in Fig. 7.
When the Test_en signal is active, users can
- overwrite the contents of all the D-type flip-flops in the FPGA by feeding signals to the SC_HEAD port
- read back the contents of all the D-type flip-flops in the FPGA through the SC_TAIL port.
Built-in scan-chain across FPGA
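The load and read-back behaviour of the scan-chain can be pictured as a plain shift register. The following is a behavioural sketch only: the chain length, the bit ordering and the one-bit-per-clock timing are assumptions for illustration, not taken from the SOFA netlist.

```python
# Behavioural sketch of a scan chain: one bit per flip-flop, shifted once per clock.
# Chain length and bit ordering are assumptions for illustration.
def scan_shift(chain: list[int], sc_head_bits: list[int]) -> tuple[list[int], list[int]]:
    """Shift bits in at SC_HEAD; return (new chain contents, bits seen at SC_TAIL)."""
    chain = list(chain)
    tail_bits = []
    for bit in sc_head_bits:
        tail_bits.append(chain[-1])   # the last flip-flop drives SC_TAIL
        chain = [bit] + chain[:-1]    # every flip-flop takes its predecessor's value
    return chain, tail_bits

state = [1, 0, 1, 1]                  # hypothetical 4-flip-flop chain
new_state, readback = scan_shift(state, [0, 0, 0, 0])
print(readback)   # [1, 1, 0, 1]
print(new_state)  # [0, 0, 0, 0]
```

Shifting as many bits as there are flip-flops both reads back the old contents (last flip-flop first) and overwrites them with the new bits.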
I/O Resources
Pin Assignment
The High-Density (HD) FPGA IP has 144 data I/O pins as shown in Fig. 8.
Among the 144 I/Os,
- 29 external I/Os are accessible through the Caravel SoC’s General-Purpose I/Os (GPIOs).
- 115 internal I/Os are accessible through the Caravel SoC’s logic analyzer and Wishbone interfaces, which are controlled by the RISC-V processor. See Debug Mode and Accelerator Mode for details.
Warning
For all the unused GPIOs, please set them to input mode, so that the FPGA will not output any noise signals to damage other SoC components.
Note
The connectivity of the 115 internal I/Os can be switched through a GPIO of Caravel SoC. As a result, the FPGA can operate in different modes.
Warning
The internal I/O pins will drive either Wishbone or the logic analyzer, following the same truth table as mode-switch bit in Fig. 8.
I/O arrangement of High-Density (HD) FPGA IP: switchable between logic analyzer and wishbone bus interface
External I/Os
A SOFA HD FPGA IP contains 37 external I/O pins, including 29 data I/Os and 8 control I/Os.
Full details are summarized in the following table.
I/O Type | Description | No. of Pins |
---|---|---|
Data I/O | Datapath I/Os of FPGA fabric | 29 |
CLK | Operating clock of FPGA core | 1 |
PROG_CLK | Clock used by configuration protocol to program FPGA fabric | 1 |
CCFF_HEAD | Input of configuration protocol to load bitstream | 1 |
CCFF_TAIL | Output of configuration protocol to read back bitstream | 1 |
TEST_EN | Activate the test mode of FPGA fabric | 1 |
SC_HEAD | Input of built-in scan-chain to load data to flip-flops of FPGA fabric | 1 |
SC_TAIL | Output of built-in scan-chain to read back flip-flops from FPGA fabric | 1 |
IO_ISOL_N | Active-low signal to enable I/O datapath isolation from external ports | 1 |
Total | | 37 |
Accelerator Mode
When the Wishbone interface is enabled, the FPGA can operate as an accelerator for the RISC-V processor. Fig. 9 illustrates the detailed I/O arrangement for the FPGA, where the wishbone bus signals are connected to fixed FPGA I/O locations.
Note
Not all the 115 internal I/Os are used by the Wishbone interface. In particular, I/O[21:29] are not connected.
Warning
The FPGA does not contain a Wishbone slave IP. Users have to implement a soft Wishbone slave when using the FPGA as an accelerator.
I/O arrangement of High-Density (HD) FPGA IP when interfacing wishbone bus
Debug Mode
When the logic analyzer interface is enabled, the FPGA can operate in debug mode, where its internal signals can be read back through the registers of the RISC-V processor. Fig. 10 illustrates the detailed I/O arrangement for the FPGA, where the logic analyzer signals are connected to fixed FPGA I/O locations.
Note
The logic analyzer is 128-bit, while 115 bits can drive or be driven by the FPGA I/O. The other 14 bits are connected to internal spots of the FPGA fabric, monitoring critical signal activities of the FPGA for debugging purposes.
Warning
If the logic analyzer is not used, please configure both the management SoC and the FPGA as follows:
- all the I/O directionality is set to input mode.
- all the output ports are pulled down to logic 0.
I/O arrangement of High-Density (HD) FPGA IP when interfacing logic analyzer
Configurable Logic Block
Generality
Each Logic Block (CLB) consists of 8 Logic Elements (LEs) as shown in Fig. 11. All the pins of the LEs are directly wired to CLB pins without a local routing architecture. Feedback connections between LEs are implemented by the global routing architecture outside the CLBs.
Configurable logic block schematic
Multi-mode Logic Element
Physical Implementation
As shown in Fig. 12, each Logic Element (LE) consists of
- a fracturable 4-input Look-Up Table (LUT)
- two D-type Flip-Flops (FF)
Detailed schematic of a logic element
The LE can operate in different modes to map logic functions efficiently:
- 4-input LUT and single FF (see details in Operating mode: LUT4 + FF).
- Dual 3-input LUTs and 2 FFs (see details in Operating mode: Dual-LUT3).
- 2-bit shift registers (see details in Operating mode: Shift-Register).
Operating mode: LUT4 + FF
The logic element can operate in the Look-Up Table (LUT) + Flip-flop (FF) mode, like many classical FPGA logic elements. As depicted in Fig. 13, the fracturable LUT operates as a single-output 4-input LUT and the upper FF is used to implement sequential logic.
The operating mode is designed to efficiently implement 4-input functions.
Resource usage of the logic element operating in LUT4 + FF mode (Grey blocks and lines are unused resources).
Operating mode: Dual-LUT3
The logic element can operate in the dual Look-Up Tables (LUTs) and Flip-flops (FFs) mode, like many modern FPGA logic elements. As depicted in Fig. 14, the fracturable LUT operates as two 3-input LUTs with shared inputs.
The operating mode is designed to efficiently implement two 3-input functions with shared input variables. A popular example is the adder function, where the carry logic can be mapped to the upper LUT3 and the sum logic can be mapped to the lower LUT3.
Resource usage of the logic element operating in dual LUT3 + FFs mode (Grey blocks and lines are unused resources).
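To make the adder example concrete, the sketch below generates the two 3-input truth tables that would be loaded into the upper and lower LUT3. The input order (a, b, cin) and the bit ordering of the tables are assumptions for illustration; the actual bitstream layout is defined by the OpenFPGA architecture files.

```python
# Truth tables for a full adder mapped to two 3-input LUTs with shared inputs.
# Input order (a, b, cin) and table bit ordering are assumptions.
from itertools import product

carry_lut3 = []  # upper LUT3: carry-out
sum_lut3 = []    # lower LUT3: sum
for a, b, cin in product((0, 1), repeat=3):
    carry_lut3.append((a & b) | (a & cin) | (b & cin))  # majority function
    sum_lut3.append(a ^ b ^ cin)                        # 3-input XOR

print(carry_lut3)  # [0, 0, 0, 1, 0, 1, 1, 1]
print(sum_lut3)    # [0, 1, 1, 0, 1, 0, 0, 1]
```

Both functions depend on the same three inputs, which is why they fit the shared-input dual-LUT3 mode.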
Operating mode: Shift-Register
As depicted in Fig. 15, the Flip-flops (FFs) can be connected through dedicated routing wires to implement high-performance shift registers.
The operating mode is designed to efficiently implement shift registers which are widely used in buffer logic, e.g., FIFOs.
Resource usage of the logic element operating in shift register mode (Grey blocks and lines are unused resources).
Scan Chain
A built-in scan-chain in the CLB connects all the sc_in and sc_out ports of the LEs in a chain, as illustrated in Fig. 11. When the Test_en signal is active, users can read back the contents of all the D-type flip-flops of the LEs through the scan-chain. When the Test_en signal is disabled, the D-type flip-flops of the LEs operate in regular mode and propagate datapath signals from the LUT outputs.
Note
The scan-chains of the CLBs are connected in a chain at the top level. See details in Scan-chain.
Circuit Designs
I/O Circuit
As shown in Fig. 16, the I/O circuit used in the I/O tiles of the FPGA fabric (see Fig. 5) is a digital I/O cell with
- An active-low I/O isolation signal IO_ISOL_N to set the I/O in input mode. This prevents unexpected output signals from damaging circuits outside the FPGA when the configurable memories are not properly initialized.
  Warning
  This feature may not be needed if the configurable memory cell has a built-in set/reset functionality!
- An internal protection circuitry to ensure clean signals at all the SoC I/O ports. This prevents
  - the SOC_OUT port from outputting any random signal when the I/O is in input mode
  - the FPGA_IN port from being driven by any random signal when the I/O is in output mode
- An internal configurable memory element to control the direction of the I/O cell
The truth table of the I/O cell is consistent with the GPIO cell of Caravel SoC (which requires an active-low signal to enable output directionality), where
- When the configuration bit (FF output) is logic 1, the I/O cell is in input mode
- When the configuration bit (FF output) is logic 0, the I/O cell is in output mode
Schematic of embedded I/O cell used in FPGA
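The behaviour described above can be summarized in a small behavioural model. This is a sketch under stated assumptions: the function name is hypothetical, and the value that the protection circuitry forces on the unused port (logic 0 here) is an assumption, since the text only states that the port carries a clean signal.

```python
# Behavioural sketch of the I/O cell. Assumption: the protection circuitry
# holds the unused port at logic 0; the source only says the signal is "clean".
def io_cell(io_isol_n: int, cfg_bit: int, soc_in: int, fpga_out: int) -> dict:
    # Input mode if isolation is asserted (active low) or the config bit is 1.
    input_mode = io_isol_n == 0 or cfg_bit == 1
    if input_mode:
        return {"mode": "input", "FPGA_IN": soc_in, "SOC_OUT": 0}
    return {"mode": "output", "FPGA_IN": 0, "SOC_OUT": fpga_out}

# Isolation asserted: the cell stays in input mode whatever the config bit holds.
print(io_cell(io_isol_n=0, cfg_bit=0, soc_in=1, fpga_out=1))
# Isolation released and config bit 0: the cell drives SOC_OUT.
print(io_cell(io_isol_n=1, cfg_bit=0, soc_in=1, fpga_out=1))
```

The first call illustrates why the isolation signal protects the SoC before the configuration memory is initialized.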
Fig. 17 shows an example waveform of how the I/O cell works:
- When IO_ISOL_N is enabled/disabled
- When operating in input mode
- When operating in output mode
An example of waveforms of embedded I/O cell used in FPGA
Multiplexer
Routing multiplexers are designed using the Skywater High-Density (HD) 2-input MUX cell, as shown in Fig. 18. The tree-like multiplexer design is applied to all the routing multiplexers in logic elements, connection blocks and switch blocks across the FPGA fabric.
Schematic of multiplexer design in SOFA HD FPGA
Note
Each routing multiplexer has a dedicated input which is connected to ground (GND) signal. When it is not used, the output will be driven by the ground, working as a constant generator.
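A tree-like multiplexer built from 2-input cells, including the dedicated GND input, can be sketched as follows. The assumptions are that one select bit is shared by all 2-input cells of a level and that GND is the last input; the input names are placeholders, not SOFA signal names.

```python
# Sketch of a tree-like multiplexer built from 2-input MUX cells.
# Assumptions: one select bit per tree level, GND wired as the last input.
def mux2(a, b, sel):
    return b if sel else a

def mux_tree(inputs, select_bits, gnd="GND"):
    """Reduce the inputs level by level; select_bits[0] drives the first level."""
    level = list(inputs)
    for sel in select_bits:
        if len(level) % 2:
            level.append(gnd)  # pad an odd level with GND
        level = [mux2(level[i], level[i + 1], sel) for i in range(0, len(level), 2)]
    if len(level) != 1:
        raise ValueError("not enough select bits for this input count")
    return level[0]

inputs = ["in0", "in1", "in2", "GND"]  # three routing inputs plus the GND input
print(mux_tree(inputs, (0, 1)))  # in2
print(mux_tree(inputs, (1, 1)))  # GND: the output acts as a constant 0
```

With N inputs, such a tree needs ceil(log2(N)) levels, so each signal passes through that many 2-input cells.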
Timing Annotation
Configurable Logic Block
The path delays in Fig. 19 are listed in Table 7.
Schematic of a logic element used in SOFA HD FPGA
Path / Delay | TT (unit: ns) |
---|---|
in0 -> LUT3_out[0] | 0.85 |
in1 -> LUT3_out[0] | 0.57 |
in2 -> LUT3_out[0] | 0.30 |
in0 -> LUT3_out[1] | 0.86 |
in1 -> LUT3_out[1] | 0.59 |
in2 -> LUT3_out[1] | 0.31 |
in0 -> LUT4_out | 1.14 |
in1 -> LUT4_out | 0.86 |
in2 -> LUT4_out | 0.58 |
in3 -> LUT4_out | 0.51 |
LUT3_out[0] -> A | 0.56 |
LUT4_out[0] -> A | 0.58 |
A -> out[0] | 0.88 |
A -> FF[0] | 0.56 |
FF[0] -> out[0] | 0.88 |
LUT3_out[1] -> out[1] | 0.89 |
LUT3_out[1] -> FF[1] | 0.56 |
FF[1] -> out[1] | 0.89 |
regin -> FF[0] | 0.58 |
FF[0] -> FF[1] | 0.56 |
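As a rough illustration of how these figures combine, the sketch below sums the segment delays along two paths through a logic element in LUT4 + FF mode. It assumes that the listed delays simply add along chained segments; a real static timing analysis may treat the segments differently.

```python
# Segment delays (ns, TT corner) copied from the table above.
# LUT4_out and LUT4_out[0] are treated as the same node here.
DELAY_NS = {
    ("in0", "LUT4_out"): 1.14,
    ("LUT4_out", "A"): 0.58,
    ("A", "out[0]"): 0.88,
    ("A", "FF[0]"): 0.56,
}

def path_delay(nodes):
    """Sum the delays of consecutive segments along a path."""
    return sum(DELAY_NS[(src, dst)] for src, dst in zip(nodes, nodes[1:]))

comb = path_delay(["in0", "LUT4_out", "A", "out[0]"])  # combinational output
reg = path_delay(["in0", "LUT4_out", "A", "FF[0]"])    # path into the upper FF
print(f"{comb:.2f} ns, {reg:.2f} ns")  # 2.60 ns, 2.28 ns
```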
I/O Block
The path delays in Fig. 16 are listed in Table 8.
Path / Delay | TT (unit: ns) |
---|---|
SOC_IN -> FPGA_IN | 0.11 |
FPGA_OUT -> SOC_OUT | 0.11 |
Routing Architecture
The path delays in Fig. 6 are listed in Table 9.
Path / Delay | TT (unit: ns) |
---|---|
A -> B | 1.61 |
A -> C | 1.61 |
A -> D | 1.61 |
B -> E | 1.38 |
QLSOFA HD
Architecture
Floorplan
QLSOFA HD FPGA shares the same floorplan as SOFA HD FPGA. See details at Floorplan.
Tiles
The FPGA architecture follows a tile-based organization to exploit fine granularity in physical design. Three types of tiles are built:
Type | Capacity | Description |
---|---|---|
CLB | 144 | Each CLB tile consists of a Configurable Logic Block (CLB), an X-direction Connection Block (CBx), a Y-direction Connection Block (CBy) and a Switch Block (SB). This is the majority tile across the fabric, used to implement logic and registers. |
IO-A | 36 | The type-A I/O is a low-density I/O tile designed mainly to interface with the GPIOs of the SoC. Each I/O-A tile consists of 1 digital I/O cell. |
IO-B | 12 | The type-B I/O is a high-density I/O tile designed mainly to interface with the Wishbone interface and logic analyzer of the SoC. Each I/O-B tile consists of 9 digital I/O cells. |
Routing Architecture
The routing architecture shares the same principle as the SOFA HD routing architecture (See details in Routing Architecture).
Note
Different from SOFA HD, each routing channel consists of 60 routing tracks. See details in Table 11.
Track type | Number of tracks per channel |
---|---|
Length-1 | 6 (10%) |
Length-2 | 6 (10%) |
Length-4 | 48 (80%) |
Total | 60 |
Scan-chain
QLSOFA HD FPGA shares the same scan-chain organization as SOFA HD FPGA. See details at Scan-chain.
I/O Resources
Pin Assignment
The QLSOFA HD FPGA IP has 144 data I/O pins as shown in Fig. 20.
Among the 144 I/Os,
- 29 external I/Os are accessible through the Caravel SoC’s General-Purpose I/Os (GPIOs).
- 115 internal I/Os are accessible through the Caravel SoC’s logic analyzer and Wishbone interfaces, which are controlled by the RISC-V processor. See Debug Mode and Accelerator Mode for details.
Warning
For all the unused GPIOs, please set them to input mode, so that the FPGA will not output any noise signals to damage other SoC components.
Note
The connectivity of the 115 internal I/Os can be switched through a GPIO of Caravel SoC. As a result, the FPGA can operate in different modes.
Warning
The internal I/O pins will drive either Wishbone or the logic analyzer, following the same truth table as mode-switch bit in Fig. 20.
I/O arrangement of QLSOFA HD FPGA IP: switchable between logic analyzer and wishbone bus interface
External I/Os
A QLSOFA HD FPGA IP contains 37 external I/O pins, including 27 data I/Os and 10 control I/Os.
Full details are summarized in the following table.
I/O Type | Description | No. of Pins |
---|---|---|
Data I/O | Datapath I/Os of FPGA fabric | 27 |
CLK | Operating clock of FPGA core | 1 |
PROG_CLK | Clock used by configuration protocol to program FPGA fabric | 1 |
RESET | Active-low reset for datapath flip-flops in the FPGA | 1 |
PROG_RESET | Active-low reset for configuration flip-flops in the FPGA | 1 |
CCFF_HEAD | Input of configuration protocol to load bitstream | 1 |
CCFF_TAIL | Output of configuration protocol to read back bitstream | 1 |
TEST_EN | Activate the test mode of FPGA fabric | 1 |
SC_HEAD | Input of built-in scan-chain to load data to flip-flops of FPGA fabric | 1 |
SC_TAIL | Output of built-in scan-chain to read back flip-flops from FPGA fabric | 1 |
IO_ISOL_N | Active-low signal to enable I/O datapath isolation from external ports | 1 |
Total | | 37 |
Accelerator Mode
When the Wishbone interface is enabled, the FPGA can operate as an accelerator for the RISC-V processor. Fig. 21 illustrates the detailed I/O arrangement for the FPGA, where the wishbone bus signals are connected to fixed FPGA I/O locations.
Note
Not all the 115 internal I/Os are used by the Wishbone interface. In particular, I/O[21:29] are not connected.
Warning
The FPGA does not contain a Wishbone slave IP. Users have to implement a soft Wishbone slave when using the FPGA as an accelerator.
I/O arrangement of High-Density (HD) FPGA IP when interfacing wishbone bus
Debug Mode
When the logic analyzer interface is enabled, the FPGA can operate in debug mode, where its internal signals can be read back through the registers of the RISC-V processor. Fig. 22 illustrates the detailed I/O arrangement for the FPGA, where the logic analyzer signals are connected to fixed FPGA I/O locations.
Note
The logic analyzer is 128-bit, while 115 bits can drive or be driven by the FPGA I/O. The other 14 bits are connected to internal spots of the FPGA fabric, monitoring critical signal activities of the FPGA for debugging purposes.
Warning
If the logic analyzer is not used, please configure both the management SoC and the FPGA as follows:
- all the I/O directionality is set to input mode.
- all the output ports are pulled down to logic 0.
I/O arrangement of High-Density (HD) FPGA IP when interfacing logic analyzer
Configurable Logic Block
Generality
Each Logic Block (CLB) consists of 8 Logic Elements (LEs) as shown in Fig. 23. All the pins of the LEs are directly wired to CLB pins without a local routing architecture. Feedback connections between LEs are implemented by the global routing architecture outside the CLBs.
Configurable logic block schematic
Multi-mode Logic Element
Physical Implementation
As shown in Fig. 24, each Logic Element (LE) consists of
- a fracturable 4-input Look-Up Table (LUT)
- two D-type Flip-Flops (FF)
Detailed schematic of a logic element
The LE can operate in different modes to map logic functions efficiently:
- 4-input LUT and single FF (see details in Operating mode: LUT4 + FF).
- Dual 3-input LUTs and 2 FFs (see details in Operating mode: Dual-LUT3).
- 2-bit shift registers (see details in Operating mode: Shift-Register).
Operating mode: LUT4 + FF
The logic element can operate in the Look-Up Table (LUT) + Flip-flop (FF) mode, like many classical FPGA logic elements. As depicted in Fig. 25, the fracturable LUT operates as a single-output 4-input LUT and the upper FF is used to implement sequential logic.
The operating mode is designed to efficiently implement 4-input functions.
Resource usage of the logic element operating in LUT4 + FF mode (Grey blocks and lines are unused resources).
Operating mode: Dual-LUT3
The logic element can operate in the dual Look-Up Tables (LUTs) and Flip-flops (FFs) mode, like many modern FPGA logic elements. As depicted in Fig. 26, the fracturable LUT operates as two 3-input LUTs with shared inputs.
The operating mode is designed to efficiently implement two 3-input functions with shared input variables. A popular example is the adder function, where the carry logic can be mapped to the upper LUT3 and the sum logic can be mapped to the lower LUT3.
Resource usage of the logic element operating in dual LUT3 + FFs mode (Grey blocks and lines are unused resources).
Operating mode: Shift-Register
As depicted in Fig. 27, the Flip-flops (FFs) can be connected through dedicated routing wires to implement high-performance shift registers.
The operating mode is designed to efficiently implement shift registers which are widely used in buffer logic, e.g., FIFOs.
Resource usage of the logic element operating in shift register mode (Grey blocks and lines are unused resources).
Operating mode: Soft Adder
As depicted in Fig. 28, the 4-input LUT can implement 1-bit adder logic, where carry inputs and outputs are connected through dedicated carry chain wires cin and cout across logic elements. This is more delay-efficient than implementing adders through the dual LUT3 mode (see details in Operating mode: Dual-LUT3).
The operating mode is designed to efficiently implement multi-bit adders.
Resource usage of the logic element operating in soft adder mode (Grey blocks and lines are unused resources).
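The way a multi-bit adder is formed by chaining cout to cin across logic elements can be sketched as a ripple-carry model. The function names and the little-endian bit order are illustrative assumptions; the model describes the logic only, not the timing of the dedicated carry wires.

```python
# Behavioural sketch of a multi-bit adder built from chained logic elements.
def le_soft_adder(a, b, cin):
    """One logic element in soft adder mode: returns (sum, cout)."""
    return a ^ b ^ cin, (a & b) | (cin & (a ^ b))

def ripple_add(a_bits, b_bits, cin=0):
    """Add two little-endian bit lists by chaining cout -> cin across LEs."""
    sum_bits = []
    carry = cin
    for a, b in zip(a_bits, b_bits):
        s, carry = le_soft_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry

# 4-bit example: 6 + 7 = 13, bits listed least-significant first.
print(ripple_add([0, 1, 1, 0], [1, 1, 1, 0]))  # ([1, 0, 1, 1], 0)
```

Each bit of the result occupies one logic element, so an N-bit adder spans N logic elements along the carry chain.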
Scan Chain
A built-in scan-chain in the CLB connects all the sc_in and sc_out ports of the LEs in a chain, as illustrated in Fig. 23. When the Test_en signal is active, users can read back the contents of all the D-type flip-flops of the LEs through the scan-chain. When the Test_en signal is disabled, the D-type flip-flops of the LEs operate in regular mode and propagate datapath signals from the LUT outputs.
Note
The scan-chains of the CLBs are connected in a chain at the top level. See details in Scan-chain.
Circuit Designs
I/O Circuit
QLSOFA HD FPGA shares the same I/O circuit design as SOFA HD FPGA. See details at I/O Circuit.
Multiplexer
QLSOFA HD FPGA shares the same multiplexer design as SOFA HD FPGA. See details at Multiplexer.
Timing Annotation
Configurable Logic Block
The path delays in Fig. 29 are listed in Table 13.
Schematic of a logic element used in QLSOFA HD FPGA
Path / Delay | TT (unit: ns) |
---|---|
in0 -> LUT3_out[0] | 0.85 |
in1 -> LUT3_out[0] | 0.57 |
in2 -> B | 0.60 |
B -> LUT3_out[0] | 0.32 |
in0 -> LUT3_out[1] | 0.90 |
in1 -> LUT3_out[1] | 0.62 |
B -> LUT3_out[1] | 0.33 |
in0 -> LUT4_out | 1.17 |
in1 -> LUT4_out | 0.89 |
in2 -> LUT4_out | 1.21 |
in3 -> LUT4_out | 0.79 |
LUT3_out[0] -> A | 0.56 |
LUT4_out[0] -> A | 0.58 |
A -> out[0] | 0.88 |
A -> FF[0] | 0.56 |
FF[0] -> out[0] | 0.88 |
LUT3_out[1] -> out[1] | 0.89 |
LUT3_out[1] -> FF[1] | 0.56 |
FF[1] -> out[1] | 0.89 |
regin -> FF[0] | 0.58 |
FF[0] -> FF[1] | 0.56 |
I/O Block
The path delays of I/O blocks in QLSOFA HD FPGA are the same as in SOFA HD FPGA. See details in I/O Block.
Routing Architecture
The path delays in Fig. 6 are listed in Table 14.
Path / Delay | TT (unit: ns) |
---|---|
A -> B | 1.44 |
A -> C | 1.44 |
A -> D | 1.44 |
B -> E | 1.38 |
SOFA CHD
Architecture
SOFA CHD FPGA shares the same architecture as QLSOFA HD FPGA. See full details at Architecture.
I/O Resources
The SOFA CHD FPGA IP shares the same I/O resource arrangement as QLSOFA HD FPGA IP. See details at I/O Resources.
Configurable Logic Block
The SOFA CHD FPGA IP shares the same Configurable Logic Block (CLB) architecture as QLSOFA HD FPGA IP. See details at Configurable Logic Block.
Circuit Designs
I/O Circuit
SOFA CHD FPGA shares the same I/O circuit design as SOFA HD FPGA. See details at I/O Circuit.
Multiplexer
Routing multiplexers are designed using a few custom cells based on the Skywater High-Density (HD) PDK, as shown in Fig. 30. The multiplexer design follows a two-level structure, which is applied to all the routing multiplexers in logic elements, connection blocks and switch blocks across the FPGA fabric.
Schematic of multiplexer design in SOFA CHD FPGA
Each primitive in the two-level structure could be a 2/3/4-input custom cell, depending on the input size of the routing multiplexer. Each custom cell is built with input inverters and transmission-gates. For instance, Fig. 31 shows the transistor-level design of a 3-input custom cell.
Detailed schematic of a 3-input custom cell in SOFA CHD FPGA
Note
Each routing multiplexer has a dedicated input which is connected to ground (GND) signal. When it is not used, the output will be driven by the ground, working as a constant generator.
Timing Annotation
Configurable Logic Block
The path delays in Fig. 32 are listed in Table 15.
Schematic of a logic element used in SOFA CHD FPGA
Path / Delay | TT (unit: ns) |
---|---|
in0 -> LUT3_out[0] | 0.86 |
in1 -> LUT3_out[0] | 0.58 |
in2 -> B | 0.16 |
B -> LUT3_out[0] | 0.32 |
in0 -> LUT3_out[1] | 0.91 |
in1 -> LUT3_out[1] | 0.63 |
B -> LUT3_out[1] | 0.34 |
in0 -> LUT4_out | 1.20 |
in1 -> LUT4_out | 0.92 |
in2 -> LUT4_out | 0.78 |
in3 -> LUT4_out | 0.52 |
LUT3_out[0] -> A | 0.17 |
LUT4_out[0] -> A | 0.18 |
A -> out[0] | 0.48 |
A -> FF[0] | 0.15 |
FF[0] -> out[0] | 0.48 |
LUT3_out[1] -> out[1] | 0.47 |
LUT3_out[1] -> FF[1] | 0.16 |
FF[1] -> out[1] | 0.37 |
regin -> FF[0] | 0.15 |
FF[0] -> FF[1] | 0.16 |
I/O Block
The path delays of I/O blocks in SOFA CHD FPGA are the same as in SOFA HD FPGA. See details in I/O Block.
Routing Architecture
The path delays in Fig. 6 are listed in Table 16.
Path / Delay | TT (unit: ns) |
---|---|
A -> B | 0.81 |
A -> C | 0.81 |
A -> D | 0.81 |
B -> E | 0.57 |
Custom Cells
Skywater Custom Multiplexer Cells
Background
Traditionally, larger multiplexers are built using trees of smaller multiplexers as illustrated below:

Multiplexer trees incur large power and timing costs that limit FPGA performance. FPGA fabrics use complementary pass gate logic (CPL) to replace multiplexer trees with single-level multiplexers derived from inverted transmission gates, as illustrated below:

Single-level multiplexers are controlled through configuration SRAM cells which enable high-impedance connections throughout the multiplexer hierarchy, removing the need for hierarchical multiplexer designs. CPL multiplexers therefore increase performance and reduce power consumption throughout FPGA fabrics. The standard cells required for CPL multiplexers are not commonly included in PDKs, so custom cells must be created to enable FPGA multiplexer hierarchies. The remainder of this document is dedicated to the architecture and performance evaluation of our sky130_uuopenfpga_cc_hd_invmux2_1 and sky130_uuopenfpga_cc_hd_invmux3_1 custom cells, generated using the Skywater 130nm PDK. The comparison is made by building 4-to-1 and 6-to-1 multiplexers from our cells and from standard cells provided within the Skywater 130nm PDK.
SKY130_UUOPENFPGA_CC_HD_INVMUX2_1
- Usage: 2-Input Transmission Gate Multiplexer with Unity Drive Strength Inverter Input
- Pins:
Q1/Q2 - Inverter Input
S0/S1 - NMOS Select Input
S0B/S1B - PMOS Select Input
Schematic:

Layout:

Comparison: To demonstrate the performance gains of CPL multiplexers, we built a 4-to-1 single-level multiplexer using our custom sky130_uuopenfpga_cc_hd_invmux2_1 cell, along with a 4-to-1 multiplexer using the sky130_fd_sc_hd__mux2_1 as the root cell of the multiplexer tree.
The sky130_fd_sc_hd__mux2_1 multiplexer is built using a static CMOS structure with a single select input, whereas our cell uses a fractured select hierarchy. To perform the comparison, we tabulated power, area, and timing values for the 4-to-1 multiplexers using Cadence ADE XL.
- Power:
sky130_uuopenfpga_cc_hd_invmux2_1: 2.37 μW
sky130_fd_sc_hd__mux2_1: 3.03 μW
Our custom multiplexer provides a 22% reduction in power consumption.
- Area:
sky130_uuopenfpga_cc_hd_invmux2_1: 33.78 μm²
sky130_fd_sc_hd__mux2_1: 33.78 μm²
Our multiplexer implementation requires equal area neglecting interconnect overhead.
- Timing:
sky130_uuopenfpga_cc_hd_invmux2_1: 211.1 ps
sky130_fd_sc_hd__mux2_1: 304.3 ps
Our custom multiplexer provides approximately a 31% reduction in propagation delay.
SKY130_UUOPENFPGA_CC_HD_INVMUX3_1
- Usage: 3-Input Transmission Gate Multiplexer with Unity Drive Strength Inverter Input
- Pins:
Q2/Q3 - Inverted Input
S0/S1/S2 - NMOS Select Input
S0B/S1B/S2B - PMOS Select Input
Schematic:

Layout:

Comparison: To demonstrate the performance gains of CPL multiplexers, we built a 6-to-1 single-level multiplexer using our custom sky130_uuopenfpga_cc_hd_invmux3_1 cell, along with a 6-to-1 multiplexer using the sky130_fd_sc_hd__mux4/2_1 as the root cells of the multiplexer tree.
To perform the comparison, we tabulated power, area, and timing values for the 6-to-1 multiplexers using Cadence ADE XL.
- Power:
sky130_uuopenfpga_cc_hd_invmux3_1: 2.96 μW
sky130_fd_sc_hd__mux2_1: 3.31 μW
Our custom multiplexer provides a 10.5% reduction in power consumption.
- Area:
sky130_uuopenfpga_cc_hd_invmux3_1: 61.31 μm²
sky130_fd_sc_hd__mux2_1: 48.80 μm²
The Skywater multiplexer provides a 20% reduction in area.
- Timing:
sky130_uuopenfpga_cc_hd_invmux3_1: 272.6 ps
sky130_fd_sc_hd__mux2_1: 374.2 ps
Our custom multiplexer provides over a 27% reduction in propagation delay.
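The reduction percentages quoted in the two comparisons can be reproduced from the tabulated values. The sketch below uses a hypothetical helper, `reduction_pct`, and computes each reduction relative to the standard-cell implementation.

```python
# Reproduce the power and delay reductions from the values listed above.
def reduction_pct(custom: float, reference: float) -> float:
    """Percentage by which `custom` is lower than `reference`."""
    return (reference - custom) / reference * 100

# (custom cell value, standard cell value)
mux4 = {"power_uW": (2.37, 3.03), "delay_ps": (211.1, 304.3)}
mux6 = {"power_uW": (2.96, 3.31), "delay_ps": (272.6, 374.2)}

for name, data in (("4-to-1", mux4), ("6-to-1", mux6)):
    for metric, (custom, std) in data.items():
        print(f"{name} {metric}: {reduction_pct(custom, std):.1f}% reduction")
```

The area figures run in the opposite direction for the 6-to-1 case: `reduction_pct(48.80, 61.31)` gives the roughly 20% area advantage of the standard-cell multiplexer.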
Contacts
- General Questions
Prof. Pierre-Emmanuel Gaillardon
- Technical Questions about OpenFPGA
Prof. Xifan Tang
- Technical Questions about Physical Design
Ganesh Gore
Acknowledgment


Supported by DARPA PoSH program

For more information on the OpenFPGA see openfpga_doc or openfpga_github
For more information on the VPR architecture description language see xml_vtr
For more information on the Skywater 130nm PDK see skywater_pdk_github