Nvidia Next-Gen GPU: Blackwell Ultra with PCIe 6.0 Details
After Nvidia CEO Jensen Huang presented the company's GPU roadmap for the next-generation architectures Blackwell Ultra, Rubin, Rubin Ultra and Feynman at Computex 2024 and fleshed it out at this year's Nvidia GTC 2025, further details on the upcoming Blackwell refresh were made public at Hot Chips 2025. Blackwell Ultra is intended to serve as an interim solution until Rubin arrives.
As Nvidia now reveals on its developer blog in a post titled "Inside Nvidia Blackwell Ultra", Blackwell Ultra is still manufactured on TSMC's custom "4NP" node and still consists of 208 billion transistors. As a graphics chip, Blackwell Ultra therefore largely corresponds to what Nvidia already offers with Blackwell. Nvidia is, however, raising the number of compute units from 144 to 160 streaming multiprocessors.
Whereas Blackwell was still comparatively heavily cut down in the form of the B100, B200 and GB200, Blackwell Ultra can draw on considerably more resources with the GB300. Nvidia has now presented the B300 GPU in detail for the first time.
The B300 GPU is a further development of the B200 GPU based on the same GPU microarchitecture. As a dual-die GPU, it continues to connect two dies via a die-to-die link and can address up to 288 GB of HBM3e memory, which is 50 percent more than its predecessor. To make this expansion possible, each HBM3e stack now contains 12 dies instead of 8.
- 15 PetaFLOPS for dense FP4 and 20 PetaFLOPS for sparse FP4
- 1.5x FP4 inferencing performance compared to Blackwell
- Optimized for AI inferencing and AI reasoning models
- Eight HBM3e stacks with 12 HBM3e dies each
- 288 GB instead of 192 GB of HBM3e memory
- PCI Express 6.0 instead of PCI Express 5.0
- 1,400 instead of 1,200 watts
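
The memory figures above can be cross-checked with a little arithmetic. The following is only a sketch: it assumes the quoted capacity is spread evenly across every die in every stack, and the variable names are invented for this example.

```python
# Figures quoted in the article.
STACKS = 8                  # HBM3e stacks per GPU
DIES_PER_STACK_B200 = 8     # 8-high stacks on the predecessor
DIES_PER_STACK_B300 = 12    # 12-high stacks on the B300
CAPACITY_B300_GB = 288      # total HBM3e capacity of the B300

# Assumption: every die contributes the same share of the capacity.
per_die_gb = CAPACITY_B300_GB / (STACKS * DIES_PER_STACK_B300)

# Same die capacity, fewer dies per stack, gives the predecessor's total.
capacity_b200_gb = STACKS * DIES_PER_STACK_B200 * per_die_gb

# Relative capacity increase from B200 to B300.
increase = CAPACITY_B300_GB / capacity_b200_gb - 1

print(f"{per_die_gb:.0f} GB per die, "
      f"{capacity_b200_gb:.0f} GB on B200, "
      f"+{increase:.0%} on B300")
```

Under that assumption, the step from 8 to 12 dies per stack accounts for the full 50 percent increase without any change to the number of stacks.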
The B300 graphics processor debuts in the GB300 NVL72, a rack-scale system that combines a total of 72 Blackwell Ultra GPUs and 36 Grace CPUs in a single server rack.
Even though the B300 graphics processor is still not a fully enabled chip, it may now draw 1,200 watts as a standalone GPU (B300) instead of 1,000 watts (B200), and 1,400 watts as a superchip with Grace CPU (GB300) instead of 1,200 watts (GB200). Nvidia is positioning Blackwell Ultra more broadly than Blackwell and promises up to 50 percent more FP4 performance.
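
To put the higher power limits in relation to the promised FP4 gain, the quoted numbers can be combined into a rough throughput-per-watt ratio. This is a back-of-the-envelope sketch, not a measurement: it assumes the "up to 50 percent" figure applies at the rated power limit, and the helper name `perf_per_watt_gain` is made up for illustration.

```python
# Power limits quoted in the article, in watts: (Blackwell, Blackwell Ultra).
POWER_W = {
    "GPU only (B200 -> B300)": (1000, 1200),
    "Superchip (GB200 -> GB300)": (1200, 1400),
}

# "Up to 50 percent more FP4 performance" expressed as a factor.
FP4_GAIN = 1.5


def perf_per_watt_gain(perf_gain: float, old_watts: float, new_watts: float) -> float:
    """Relative change in peak throughput per watt of rated power."""
    return perf_gain / (new_watts / old_watts)


gains = {
    name: perf_per_watt_gain(FP4_GAIN, old, new)
    for name, (old, new) in POWER_W.items()
}

for name, gain in gains.items():
    print(f"{name}: about {gain:.2f}x peak FP4 per rated watt")
```

Peak throughput divided by rated power is only a coarse proxy for efficiency, since real workloads rarely sit at either limit.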
Compared to Blackwell, Blackwell Ultra also brings an upgrade from PCIe 5.0 to PCIe 6.0, which doubles the bandwidth from 128 GB/s to 256 GB/s. The figures compare as follows.
Specification | Nvidia GB300 | Nvidia GB200 |
---|---|---|
Codename | Blackwell Ultra | Blackwell |
Design | Dual-die graphics chip | Dual-die graphics chip |
Transistors | 208 billion | 208 billion |
Streaming Multiprocessors | 160 | 144 |
FP4 Tensor Core (Dense/Sparse) | 15/20 PFLOPS | 10/20 PFLOPS |
FP8/FP6 Tensor Core (Dense/Sparse) | 5/10 PFLOPS | 5/10 PFLOPS |
INT8 Tensor Core (Dense/Sparse) | 157.5/315 TOPS | 105/210 TOPS |
FP16/BF16 Tensor Core (Dense/Sparse) | 2.5/5 PFLOPS | 2.5/5 PFLOPS |
TF32 Tensor Core (Dense/Sparse) | 1.25/2.5 PFLOPS | 1.25/2.5 PFLOPS |
FP32 | 80 TFLOPS | 80 TFLOPS |
Memory | 288 GB HBM3e | 192 GB HBM3e |
Memory bandwidth | 8 TB/s | 8 TB/s |
NVLink interface | 1.8 TB/s | 1.8 TB/s |
Host interface | PCIe 6.0 with 256 GB/s | PCIe 5.0 with 128 GB/s |
TDP | 1,400 W | 1,200 W |
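
The PCIe figures in the last rows follow from the per-lane signaling rates of the two standards (32 GT/s for PCIe 5.0, 64 GT/s for PCIe 6.0). The sketch below reproduces them under two assumptions that the article does not state explicitly: a x16 link, and raw bandwidth summed over both directions with encoding and protocol overhead ignored. The function name is hypothetical.

```python
def pcie_raw_bidirectional_gb_per_s(gt_per_s_per_lane: float, lanes: int = 16) -> float:
    """Raw link bandwidth in GB/s (10^9 bytes), summed over both directions.

    Assumes one bit per transfer per lane and ignores encoding,
    FLIT and protocol overhead, so real throughput will be lower.
    """
    gbit_per_s_one_way = gt_per_s_per_lane * lanes
    gbyte_per_s_one_way = gbit_per_s_one_way / 8
    return gbyte_per_s_one_way * 2


pcie5 = pcie_raw_bidirectional_gb_per_s(32)  # PCIe 5.0: 32 GT/s per lane
pcie6 = pcie_raw_bidirectional_gb_per_s(64)  # PCIe 6.0: 64 GT/s per lane

print(f"PCIe 5.0 x16: {pcie5:.0f} GB/s, PCIe 6.0 x16: {pcie6:.0f} GB/s")
```

In each direction, a x16 link therefore carries half of the quoted value, that is, 64 GB/s for PCIe 5.0 and 128 GB/s for PCIe 6.0 before overhead.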