Nvidia and Microsoft to build massive AI cloud computer
Nvidia announced a collaboration with Microsoft to build a “massive” cloud computer focused on AI. Their plan is to use tens of thousands of high-end Nvidia GPUs for applications like deep learning and large language models. The companies aim to make it one of the most powerful AI supercomputers in the world.
For its part, Microsoft will contribute its Azure cloud infrastructure and its ND- and NC-series virtual machines.
The new supercomputer will feature thousands of units of the Hopper H100, arguably the most powerful GPU in the world, which Nvidia launched in October. Nvidia will also provide its second most powerful GPU, the A100, along with its Quantum-2 InfiniBand networking platform, which can transfer data between servers at 400 gigabits per second and links them into a single powerful cluster. Nvidia’s AI Enterprise software platform will tie the system together. The companies will also collaborate on DeepSpeed, Microsoft’s deep learning optimization software.
Nvidia and Microsoft’s cloud computer will let customers deploy thousands of GPUs in a single cluster, enough to “train even the most massive large language models, build the most complex recommender systems at scale, and enable generative AI at scale,” according to Nvidia.
At the time of writing, the companies have not said when the new supercomputer will be ready, but they noted that the announcement marks the beginning of a multi-year collaboration. The supercomputer will very likely scale up in capacity over time as development continues.