Deep Learning Card Comparison – 2022

NVIDIA’s cards are still the best choice for deep learning, and NVIDIA offers a variety of cards and modules to choose from. We collected some specifications that may be useful when choosing among them.

| Name | Form Factor | Core | Core Architecture | Chip Manufacturing | Die Size (mm^2) | Transistor# | SM | Tensor Cores | FP64 CUDA Cores | FP32 CUDA Cores | Memory | TDP (W) | FP64 FLOPS | FP32 FLOPS | TF32 FLOPS | FP16 FLOPS | BF16 FLOPS | INT16 FLOPS | INT8 FLOPS | INT4 FLOPS | Fmax (GHz) | Memory Bandwidth |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| NVIDIA H100 SXM | SXM5.0 Card | 1x GH100 | Hopper | TSMC 4N | 814 | 80B | 132 | 528 | 8448 | 16896 | 80GB/HBM3 | 700W | 30T; 60T (Tensor Core) | 60T; 1000/500T (Tensor Core) | 500T | 120T; 1000T (Tensor Core) | 1000T (Tensor Core) |  |  |  |  | 3TB/s |
| NVIDIA H100 PCIe | PCIe Card | 1x GH100 | Hopper | TSMC 4N | 814 | 80B | 114 | 456 | 7296 | 14592 | 80GB HBM2e | 350W | 24T; 48T (Tensor Core) | 48T; 400T (Tensor Core) | 400T (Tensor Core) | 800T (Tensor Core) | 800T (Tensor Core) |  |  |  |  | 2TB/s |
| NVIDIA A100 80GB | SXM4.0 Card | 1x GA100 | Ampere | TSMC N7 | 826 | 54.2B | 108 | 432 | 3456 | 6912 | 80GB HBM2e | 400W | 9.7/TC: 19.5 | 19.5T | TC: 156/312* | TC: 312/624* | TC: 312/624* |  | TC: 624/1248* | TC: 1248/2496* | 1.41 | 2TB/s; 5Kb; 3.2Gbps |
| NVIDIA A100 40GB | SXM4.0 Card | 1x GA100 | Ampere | TSMC N7 | 826 | 54.2B | 108 | 432 | 3456 | 6912 | HBM2 : 40GB | 400W | 9.7/TC: 19.5 | 19.5 | TC: 156/312* | TC: 312/624* | TC: 312/624* |  | TC: 624/1248* | TC: 1248/2496* | 1.41 | 1.6TB/s; 5Kb; 2.56Gbps |
| NVIDIA A100-PCIe | PCIe Card; FHFL Dual-Slot | 1x GA100-883AA-A1 | Ampere | TSMC N7 | 826 | 54.2B | 108 | 432 | 3456 | 6912 | 40/80GB HBM2 | 250W |  |  |  |  |  |  |  |  |  |  |
| NVIDIA A40 | PCIe Card; FHFL Dual-Slot | 1x GA102 | Ampere | Samsung 8nm | 628.4 | 28.3B |  |  |  | 10752 | GDDR6 : 48GB | 300W | 335G | 38T | 75 | 151 | NA |  | 302 | NA | NA | 696GB/s; 384b; 14.5Gbps |
| NVIDIA RTX A6000 | PCIe Card; FHFL Dual-Slot | 1x GA102 | Ampere | Samsung 8nm | 628.4 | 28.3B | 84 | 336(v3) |  | 10752 | GDDR6 : 48GB | 300W | 335G; 1,210G(1:32) | 38.7T | 75 | 38.7T(1:1) | NA |  | 302 | NA | 1.8 | 768GB/s; 384b; 16Gbps |
| NVIDIA RTX 3090Ti | PCIe Card; FHFL Three-Slot | 1x GA102-350-A1 | Ampere | Samsung 8nm | 628.4 | 28.3B | 84 | 336(v3) |  | 10752 | GDDR6X : 24GB | 450W[2] | 625.0G(1:64) | 40T |  | 40T(1:1) |  |  |  |  | 1.86 | 768GB/s; 384b; 1,008 GB/s |
| NVIDIA RTX 3090 | PCIe Card; FHFL Three-Slot | 1x GA102-300-A1 | Ampere | Samsung 8nm | 628.4 | 28.3B | 82 | 328(v3) |  | 10496 | GDDR6X : 24GB | 350W | 556.0G(1:64) | 35.58T | TC: 35.6/71* | 35.6T; TC: 71/142* (w/ FP32+); TC: 142/284* (w/ FP16+) | 35.6T; TC: 71/142* (w/ FP32+) |  | TC: 284/568* | TC: 568/1136* | 1.695 | 936GB/s; 384b; 19.5Gbps |
| NVIDIA RTX 3080Ti | PCIe Card; FHFL Dual-Slot | 1x GA102-225-A1 | Ampere | Samsung 8nm | 628.4 | 28.3B | 80 | 320(v3) |  | 10240 | GDDR6X : 12GB | 350W | 532.8G | 34.1 |  | 34.1 |  |  |  |  | 1.67 | 912.4 GB/s; 384-bit |
| NVIDIA RTX 3080 | PCIe Card; FHFL Dual-Slot | 1x GA102-200-KD-A1 | Ampere | Samsung 8nm | 628.4 | 28.3B | 68 | 272(v3) |  | 8704 | GDDR6X : 10GB | 320W | 465.1G(1:64) | 29.77T |  | 29.77T |  |  |  |  | 1.71 | 760GB/s; 320-bit |
| NVIDIA RTX 3070 Ti | PCIe Card; FHFL Dual-Slot | 1x GA104-400-A1 | Ampere | Samsung 8nm | 392.5 | 17.4B | 48 | 192(v3) |  | 6144 | GDDR6X : 8GB | 290W | 339.8G(1:64) | 21.75T |  | 21.75T |  |  |  |  | 1.77 | 256-bit |
| NVIDIA RTX 3070 | PCIe Card; FHFL Dual-Slot | 1x GA104 | Ampere | Samsung 8nm | 392.5 | 17.4B | 46 | 184(v3) |  | 5888 | GDDR6 : 8GB | 220W | 317.4G(1:64) | 20.31T | TC: 20.3/40.6* | 20.31; TC: 40.6/81.3* (w/FP32+); TC: 81.3/162.6* (w/FP16+) | 20.3; TC: 40.6/81.3* (w/FP32+) |  | TC: 162.6/325.2* | TC: 325.2/650.4* | 1.725 | 448GB/s; 256-bit; 14Gbps |
| NVIDIA RTX 3060TI | PCIe Card; FHFL Dual-Slot | 1x GA104-200-A1 | Ampere | Samsung 8nm | 392.5 | 17.4B | 38 | 152(v3) |  | 4864 | GDDR6: 8 GB | 200W | 253.1G(1:64) | 16.20T |  | 16.20T |  |  |  |  | 1.67 | 448GB/s; 256-bit |
| NVIDIA RTX 3060 | PCIe Card; FHFL Dual-Slot | 1x GA106-300-A1 | Ampere | Samsung 8nm | 276 | 13.3B | 28 | 112(v3) |  | 3584 | GDDR6: 12GB | 170W | 199.0G(1:64) | 12.74T |  | 12.74T |  |  |  |  | 1.78 | 360.0 GB/s; 192-bit |
| NVIDIA TITAN RTX | PCIe Card; FHFL Dual-Slot | 1x TU102-400-A1 | Turing | 12 nm FFN | 754 | 18.6B | 72 | 576 |  | 4608 | GDDR6: 24GB | 280W | 509.8G(1:32) | 16.30T |  | 32.62T |  |  |  |  | 1.77 | 672 GB/s; 384-bit; 14 Gbps |
| NVIDIA RTX 2080 Ti | PCIe Card; FHFL Dual-Slot | 1x TU102-300 | Turing | TSMC 12nm | 754 | 18.6B | 68 | 544 |  | 4352 | GDDR6: 11GB | 250W | 420G(1:32) | 13.45 |  | 26.90T |  |  |  |  | 1.75 | 616.0 GB/s; 352-bit; 14Gbps |
| NVIDIA RTX 2070 Super | PCIe Card; FHFL Dual-Slot | 1x TU104 | Turing | TSMC 12nm | 545 | 13.6B | 40 | 320 |  | 2560 | GDDR6: 8 GB | 215W | 283.2G(1:32) | 9.062 |  | 18.12T |  |  |  |  | 1.77 | 448GB/s; 256-bit; 14Gbps |
| NVIDIA RTX 2070 | PCIe Card; FHFL Dual-Slot | 1x TU106-400A-A1 | Turing | TSMC 12nm | 445 | 10.8B | 36 | 288 |  | 2304 | GDDR6: 8 GB | 175W | 233.3G(1:32) | 7.465T |  | 14.93T |  |  |  |  |  | 448.0 GB/s; 256 bit |
| NVIDIA V100S-PCIe | PCIe Card; FHFL Dual-Slot | 1x GV100 | Volta | TSMC 12 FFN | 815 | 21.1B | 80 | 640(v1) | 2560 | 5120 | HBM2 : 32GB | 250W | 8.2 | 16.4 | / | TC: 130 | / |  | 62 | / | 1.64 | 1.134TB/s; 4Kb; 2.2Gbps |
| NVIDIA V100 | SXM3.0 Card | 1x GV100 | Volta | TSMC 12 FFN | 815 | 21.1B | 80 | 640(v1) | 2560 | 5120 | HBM2 : 32/16GB | 300W | 7.8 | 15.7 | / | TC: 125 | / |  | 62 | / | 1.53 | 900GB/s; 4Kb; 1.75Gbps |
| NVIDIA V100-PCIe | PCIe Card; FHFL Dual-Slot | 1x GV100 | Volta | TSMC 12 FFN | 815 | 21.1B | 80 | 640(v1) | 2560 | 5120 | HBM2 : 32/16GB | 250W | 7 | 14 | / | TC: 112 | / |  | 62 | / | 1.4 | 900GB/s; 4Kb; 1.8Gbps |
Table 1. NVIDIA Deep Learning Cards
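
Several columns in Table 1 can be cross-checked against each other. The sketch below is one way to do that, under two common assumptions that the table itself does not state: each FP32 CUDA core retires one fused multiply-add (counted as 2 FLOPs) per clock at Fmax, and memory bandwidth equals bus width times per-pin data rate divided by 8. The function names are hypothetical and chosen only for illustration; the inputs are the RTX 3090 values listed in the table.

```python
def peak_fp32_tflops(fp32_cuda_cores: int, fmax_ghz: float) -> float:
    """Peak FP32 throughput in TFLOPS, assuming 2 FLOPs (one FMA) per core per clock."""
    return fp32_cuda_cores * 2 * fmax_ghz / 1000.0


def peak_fp64_gflops(fp32_tflops: float, ratio: int) -> float:
    """Peak FP64 throughput in GFLOPS for a 1:ratio FP64-to-FP32 rate (e.g. 1:64)."""
    return fp32_tflops * 1000.0 / ratio


def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Memory bandwidth in GB/s from bus width (bits) and per-pin data rate (Gbps)."""
    return bus_width_bits * data_rate_gbps / 8.0


if __name__ == "__main__":
    # RTX 3090 values as listed in Table 1: 10496 cores, 1.695 GHz, 384b, 19.5Gbps.
    fp32 = peak_fp32_tflops(10496, 1.695)
    print(f"FP32: {fp32:.2f} TFLOPS")
    print(f"FP64 (1:64): {peak_fp64_gflops(fp32, 64):.1f} GFLOPS")
    print(f"Bandwidth: {memory_bandwidth_gbs(384, 19.5):.0f} GB/s")
```

For the RTX 3090 inputs this gives 35.58 TFLOPS, 556.0 GFLOPS, and 936 GB/s, which agrees with the FP32, FP64 (1:64), and memory bandwidth entries in that row. Tensor Core figures (the "TC:" entries) depend on per-architecture throughput per Tensor Core and are not covered by this sketch.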