At its recent GTC conference, NVIDIA announced its next-generation processors for AI computing: the H100 GPU and the Grace CPU Superchip. Based on the NVIDIA Hopper architecture, the H100 includes a Transformer Engine for faster training of AI models. The Grace CPU Superchip features 144 Arm cores and outperforms NVIDIA's current dual-processor offering on the SPECrate 2017_int_base benchmark.
NVIDIA founder and CEO Jensen Huang made the announcement during his keynote presentation. The Hopper architecture for accelerating AI training includes several new features, including Tensor Cores with increased floating-point operations per second (FLOPS) as well as NVIDIA Confidential Computing technology for improved security and privacy. The H100 GPU, built on this architecture, is the first GPU to support PCI Express Gen 5 (PCIe 5) and HBM3. The Grace CPU Superchip is a single-socket package containing two CPU chips connected via NVIDIA's high-bandwidth NVLink-C2C technology. Huang's keynote positioned NVIDIA's new chips as "the engine of the global AI infrastructure that enterprises use to accelerate their AI-driven businesses."
The Transformer deep-learning model is a common choice for many AI tasks, especially large language models such as GPT-3. Training these models requires massive datasets and many days, if not weeks, of computation. The H100 GPU includes a Transformer Engine that can dynamically mix 8-bit (FP8) and 16-bit (FP16) floating-point arithmetic. By operating at lower precision and delivering higher overall FLOPS, the H100 can achieve an order-of-magnitude speedup over the previous-generation Ampere GPUs. Overall, NVIDIA claims that training a 175B-parameter GPT-3 model could be accelerated 6x, and training a 395B-parameter mixture-of-experts model up to 9x: reduced from 7 days to 20 hours.
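The core tradeoff behind mixed-precision training is that lower-precision formats are faster and smaller but lose accuracy to rounding. The sketch below illustrates this in plain NumPy; it is not NVIDIA's Transformer Engine API, and since NumPy has no FP8 type, `float16` stands in for the lower-precision format.

```python
import numpy as np

# Illustrative only: downcasting values to a lower-precision float
# trades accuracy for speed and memory, the tradeoff the Transformer
# Engine manages dynamically per layer. float16 stands in for FP8 here.
weights = np.random.default_rng(0).normal(size=(4, 4)).astype(np.float32)

low_precision = weights.astype(np.float16)  # lossy downcast
error = np.abs(weights - low_precision.astype(np.float32))

print("max rounding error:", error.max())  # small but nonzero
```

The Transformer Engine's job, per NVIDIA, is to decide automatically which layers can tolerate this rounding error in FP8 and which need FP16, so the speedup comes without a loss in model accuracy.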
The chip also contains new dynamic-programming instructions (DPX) that can speed up dynamic-programming algorithms up to 7x compared to Ampere, delivering increased performance in applications such as routing optimization and protein folding. To support multi-tenant operation in a cloud environment, the H100 includes secure Multi-Instance GPU (MIG) and Confidential Computing technologies, which allow it to be partitioned into up to seven virtual GPUs while keeping each tenant's data private.
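Routing optimization is a classic dynamic-programming workload: the Floyd-Warshall all-pairs shortest-path algorithm, for example, repeatedly applies a fused min/add recurrence, the kind of inner step DPX instructions accelerate in hardware. A minimal plain-Python sketch of that recurrence (illustrative, not NVIDIA code):

```python
INF = float("inf")

def floyd_warshall(dist):
    """All-pairs shortest paths; dist is a square matrix of edge
    weights with INF where no direct edge exists."""
    n = len(dist)
    d = [row[:] for row in dist]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Core DP step: a fused add-then-min, the recurring
                # pattern that DPX instructions speed up on the H100
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

graph = [
    [0,   3,   INF, 7],
    [8,   0,   2,   INF],
    [5,   INF, 0,   1],
    [2,   INF, INF, 0],
]
print(floyd_warshall(graph)[0][3])  # → 6, via the path 0 → 1 → 2 → 3
```

The same add-and-compare pattern appears in sequence-alignment algorithms such as Smith-Waterman, which is why DPX also benefits protein-folding and genomics workloads.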
The Grace CPU Superchip is the next iteration of the Grace Hopper Superchip announced last year, which combines a Grace CPU with a Hopper-based GPU in a single module. The new chip instead combines two Grace CPUs connected by NVIDIA's NVLink-C2C interconnect. The processors are based on the Arm v9 architecture; the combined chip provides 1 TB/s of memory bandwidth while consuming only 500 W of power. The chip supports all NVIDIA software stacks, including Omniverse, NVIDIA AI, and NVIDIA HPC. Using NVIDIA's ConnectX-7 network cards, the chip can support up to eight external Hopper-based GPUs.
Several users commented on the announcement in a thread on Hacker News. One noted:
NVIDIA continues to vertically integrate its data-center offerings. They bought Mellanox to get InfiniBand. They tried to buy Arm – it didn't work. But they build and bundle CPUs anyway. I guess when you're so far ahead on the compute, it's all the peripherals that hold you back, so they put together a complete solution.
NVIDIA GPUs are a popular choice for accelerating AI workloads. Earlier this year, InfoQ reported on the latest MLPerf benchmarks, where NVIDIA posted the best results on seven out of eight tasks.