Nvidia wants to speed up data transfers by connecting datacenter GPUs to SSDs 

This week, Microsoft introduced DirectStorage for Windows PCs. The API promises faster loading times and more detailed graphics by letting game developers load graphics data from an SSD directly to the GPU. Now Nvidia and IBM have created a similar SSD/GPU technology, but it is aimed at handling massive datasets in data centers.

Rather than focusing on console or PC games like DirectStorage, Big Accelerator Memory (BaM) is designed to give data centers fast access to vast amounts of data in GPU-intensive applications like machine-learning training, analytics, and high-performance computing, according to a research paper spotted by The Register this week. The paper, entitled “BaM: The case for providing fine-grained, high-bandwidth, GPU-driven storage access” (PDF) and written by researchers at Nvidia, IBM, and several US universities, proposes a more efficient way to run next-generation applications in data centers with massive amounts of processing power and memory bandwidth.

BaM also differs from DirectStorage in that the system architects plan to make it open source.

The paper states that while CPU-driven storage access works well for “classic” GPU applications, such as dense neural network training with “predefined, regular, dense” data access patterns, it incurs too much overhead due to “CPU-GPU synchronization and/or amplification of I/O traffic.” This makes it less suitable for next-generation applications that use graph and data analytics, recommender systems, graph neural networks, and other “fine-grained data-dependent access patterns,” the authors write.

Like DirectStorage, BaM works alongside an NVMe SSD. According to the paper, BaM “reduces I/O traffic amplification by allowing GPU threads to read or write small amounts of compute-determined data on demand.”

Specifically, BaM uses a portion of the GPU’s onboard memory as a software-controlled cache, together with a GPU thread library. The threads fetch data from the SSD and move it around, relying on a custom Linux kernel driver. The researchers tested a prototype system with a 40GB Nvidia A100 PCIe GPU, two AMD EPYC 7702 processors with 64 cores each, and 1TB of DDR4-3200 memory, running Ubuntu 20.04 LTS.
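To make the idea concrete, here is a minimal sketch in Python of the general pattern described above: a small software-controlled cache of fixed-size blocks sitting in front of a larger, slower backing store, serving fine-grained on-demand reads. This only illustrates the concept; it is not Nvidia/IBM's implementation or API, and all names (`SoftwareCache`, block sizes, the FIFO eviction policy) are invented for illustration.

```python
# Conceptual sketch: a software-controlled block cache in front of a
# slow backing store, serving small on-demand reads. This mimics the
# *idea* behind BaM's GPU-resident cache, not its actual implementation.

class SoftwareCache:
    def __init__(self, storage: bytes, block_size: int = 4096, capacity: int = 8):
        self.storage = storage      # stand-in for the NVMe SSD
        self.block_size = block_size
        self.capacity = capacity    # cache budget ("GPU memory" stand-in)
        self.cache = {}             # block index -> block bytes
        self.hits = 0
        self.misses = 0

    def _fetch_block(self, idx: int) -> bytes:
        """Return one block, loading it from storage on a cache miss."""
        if idx in self.cache:
            self.hits += 1
        else:
            self.misses += 1
            if len(self.cache) >= self.capacity:
                # Evict the oldest block (simple FIFO policy for the sketch)
                self.cache.pop(next(iter(self.cache)))
            start = idx * self.block_size
            self.cache[idx] = self.storage[start:start + self.block_size]
        return self.cache[idx]

    def read(self, offset: int, length: int) -> bytes:
        """Fine-grained read: touches only the blocks it actually needs,
        instead of staging one large CPU-driven transfer."""
        out = bytearray()
        while length > 0:
            idx, within = divmod(offset, self.block_size)
            chunk = self._fetch_block(idx)[within:within + length]
            if not chunk:           # read past end of storage
                break
            out += chunk
            offset += len(chunk)
            length -= len(chunk)
        return bytes(out)
```

For example, two reads of the same 8 bytes at offset 4100 cause one miss (the 4KB block is loaded once) followed by one hit, which is the traffic-amplification saving the paper describes, scaled down to toy size.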

The authors noted that even a “consumer-grade” SSD can support BaM with application performance that is “competitive with a much more expensive DRAM-only solution.”
