New IBM PowerAI Software Toolkit Paired with NVIDIA NVLink and GPUDL Libraries Optimized for IBM Power Architecture Helps Enable 2X Performance Breakthroughs on AlexNet with Caffe
SALT LAKE CITY, UT – 14 Nov 2016: IBM (NYSE: IBM) and NVIDIA (NASDAQ: NVDA) today announced a collaboration on a new deep learning tool optimized for the latest IBM and NVIDIA technologies to help train computers to think and learn in more human-like ways at a faster pace.
Deep learning is a fast-growing machine learning method that extracts information by crunching through millions of pieces of data to detect and rank the most important aspects of that data. Publicly supported by leading consumer web and mobile application companies, deep learning is quickly being adopted by more traditional business enterprises.
Deep learning and other artificial intelligence capabilities are being used across a wide range of industry sectors: in banking, to advance fraud detection through facial recognition; in automotive, for self-driving automobiles; and in retail, for fully automated call centers with computers that can better understand speech and answer questions.
A new deep learning software toolkit available today called IBM PowerAI runs on the recently announced IBM server built for artificial intelligence that features NVIDIA® NVLink™ interconnect technology optimized for IBM’s Power architecture. The hardware-software solution provides more than 2X performance over comparable servers with 4 GPUs running AlexNet with Caffe (1). The same 4-GPU Power-based configuration running AlexNet with BVLC Caffe can also outperform x86 configurations with 8 M40 GPUs (2), making it the world’s fastest commercially available enterprise systems platform on two versions of a key deep learning framework.
Caffe is a widely used deep learning framework developed by the Berkeley Vision and Learning Center (BVLC) and is recognized within the technology industry as one of the most popular deep learning community applications. Caffe is one of five deep learning software frameworks available in the IBM PowerAI toolkit. The toolkit leverages NVIDIA GPUDL libraries, including cuDNN, cuBLAS and NCCL, as part of NVIDIA SDKs to deliver multi-GPU acceleration on IBM servers.
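To illustrate the kind of multi-GPU work that a collective communication library such as NCCL accelerates, here is a minimal sketch in plain Python of gradient averaging in data-parallel training. It does not call NCCL or any GPU library. The function name `allreduce_mean` and the list-based "devices" are hypothetical and exist only for illustration.

```python
from typing import List


def allreduce_mean(per_device_grads: List[List[float]]) -> List[List[float]]:
    """Average gradients elementwise across devices; every device gets a copy.

    Hypothetical helper for illustration only. It simulates the effect of an
    all-reduce on the CPU and is not part of PowerAI, Caffe, or NCCL.
    """
    if not per_device_grads:
        raise ValueError("need at least one device")
    n_devices = len(per_device_grads)
    length = len(per_device_grads[0])
    if any(len(g) != length for g in per_device_grads):
        raise ValueError("all devices must hold gradients of the same length")

    # Reduce step: sum each gradient component over all devices.
    summed = [sum(g[i] for g in per_device_grads) for i in range(length)]

    # Average, then "broadcast": each device receives its own copy.
    mean = [s / n_devices for s in summed]
    return [list(mean) for _ in range(n_devices)]


# Four simulated GPUs, each holding gradients from its own shard of a batch.
# The values are made up for the example.
grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
synced = allreduce_mean(grads)
```

After the call, every simulated device holds the same averaged gradient, which is what lets the replicas of a model stay in step during training. On real hardware this exchange runs over the GPU interconnect, so its cost depends on link bandwidth.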
IBM PowerAI is designed to run on IBM’s highest performing server in its OpenPOWER LC lineup, the IBM Power S822LC for High Performance Computing (HPC), which features NVIDIA NVLink technology optimized for the Power architecture and NVIDIA’s latest GPU technology. The new solution supports emerging computing methods of artificial intelligence, particularly deep learning. IBM PowerAI also provides a continued path for Watson, IBM’s cognitive solutions platform, to extend its artificial intelligence expertise in the enterprise by using several deep learning methods to train Watson.
“PowerAI democratizes deep learning and other advanced analytic technologies by giving enterprise data scientists and research scientists alike an easy to deploy platform to rapidly advance their journey on AI,” said Ken King, General Manager, OpenPOWER. “Coupled with our high performance computing servers built for AI, IBM provides what we believe is the best platform for enterprises building AI-based software, whether it’s chatbots for customer engagement, or real-time analysis of social media data.”
“Our innovation with IBM on NVIDIA NVLink has created new opportunities for POWER in the deep learning and analytics market,” said Ian Buck, VP and GM of Accelerated Computing Group. “NVIDIA’s GPUDL libraries in PowerAI will provide world class high-performance tools to power GPU-accelerated deep learning applications.”
IBM PowerAI is available immediately at no charge to customers of IBM’s Power S822LC for HPC server. PowerAI is designed to run on a single S822LC server and also to scale to large supercomputing clusters consisting of dozens, hundreds or thousands of servers.
The NVLink Advantage
PowerAI is a set of binary distributions of popular deep learning frameworks including Caffe, Torch and Theano. Additional distributions include the IBM and NVIDIA versions of the Caffe deep learning framework, IBM-Caffe and NVCaffe. IBM has optimized each of the distributions to take advantage of the recently announced IBM POWER8 chip with the NVIDIA NVLink interface featured on the IBM Power S822LC for HPC server.
The POWER8 with NVIDIA NVLink chip is a technology-leading processor design that is the result of open collaboration between OpenPOWER Foundation members IBM and NVIDIA. The new chip enables tight integration between IBM’s POWER8 CPU server architecture and NVIDIA’s new Pascal-architecture Tesla P100 GPU accelerators. The CPUs and GPUs integrated into the Power S822LC for HPC are connected to each other via the high-speed NVIDIA NVLink interconnect. This industry-unique interface between the CPUs and GPUs, and also between the GPUs, removes potential bottlenecks created by the PCIe interface found in most Intel x86-based servers. The PowerAI toolkit of deep learning applications takes advantage of this new NVLink-based server architecture to optimize performance of the leading artificial intelligence, deep learning and machine learning applications.
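The interconnect bottleneck described above can be made concrete with a back-of-the-envelope model: the time to move a buffer is roughly its size divided by the link bandwidth. The sketch below is illustrative only. The function name `transfer_seconds`, the buffer size, and both bandwidth figures are assumptions (nominal per-direction peak rates, not measurements from the Power S822LC for HPC), and real transfers also incur latency and protocol overhead.

```python
def transfer_seconds(size_bytes: float, bandwidth_gb_per_s: float) -> float:
    """Idealized transfer time: size divided by bandwidth, ignoring latency."""
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return size_bytes / (bandwidth_gb_per_s * 1e9)


# ASSUMED nominal per-direction peak rates, used purely for illustration.
PCIE_GEN3_X16_GB_PER_S = 16.0  # assumption: approximate PCIe 3.0 x16 peak
NVLINK_GB_PER_S = 40.0         # assumption: two ganged NVLink links at 20 GB/s each

# Hypothetical 1 GiB buffer moved from CPU memory to a GPU.
buffer_bytes = 1 * 1024**3

pcie_time = transfer_seconds(buffer_bytes, PCIE_GEN3_X16_GB_PER_S)
nvlink_time = transfer_seconds(buffer_bytes, NVLINK_GB_PER_S)

# Ratio of idealized transfer times; equals the ratio of the two bandwidths.
ratio = pcie_time / nvlink_time
```

Under these assumed rates the ratio simply mirrors the bandwidth gap. It says nothing about end-to-end training speed, which also depends on compute time and on how well transfers overlap with computation.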
Growing Momentum for the Power S822LC for High Performance Computing
The hardware pairing for PowerAI, the IBM Power S822LC for HPC server, was launched in early September. The server’s raw performance advantages drew immediate interest from leading research institutions, cloud service providers and business enterprises. This led to very strong demand in the third quarter, contributing to Power’s 2x year-over-year growth in Linux systems revenues.
Initial client uses for the new IBM Power S822LC for HPC servers include:
- Human Brain Project – In support of the Human Brain Project, a research project funded by the European Commission to advance understanding of the human brain, IBM and NVIDIA deployed a pilot system at the Juelich Supercomputing Centre as part of the Pre-Commercial Procurement process. Called JURON, the new supercomputer leverages Power S822LC for HPC systems.
- Cloud provider Nimbix – HPC cloud platform provider Nimbix expanded its cloud supercomputing offerings this month, putting IBM Power S822LC for HPC systems with PowerAI in the hands of developers and data scientists to achieve enhanced performance.
- City of Yachay, Ecuador – Ecuador’s “City of Knowledge,” Yachay, is a planned city designed to push the nation’s economy away from commodities and towards knowledge-based innovation. Last week the city announced it is using a cluster of Power S822LC servers to build the country’s first supercomputer for the purpose of creating new forms of energy, predicting climates, and pioneering food genomics.
- SC3 Electronics – A leading cloud supercomputing center in Turkey, SC3 Electronics announced last month at the OpenPOWER Summit Europe that it is creating the largest HPC cluster in the Middle East and North Africa region based on Power S822LC for HPC servers.
To download IBM PowerAI, go to www.ibm.biz/powerai.
To learn more about the IBM Power S822LC for High Performance Computing server, go to: www.ibm.biz/s822lc-hpc.
To learn more about NVIDIA deep learning, go to www.nvidia.com/deeplearning.
# # #
(1) Based on AlexNet training for Top-1 50% accuracy. IBM Power S822LC for HPC configuration: 16 cores (8 cores/socket) at 4.025 GHz with 4x NVIDIA Pascal P100 GPUs; 512 GB memory; Ubuntu 16.04.1 running NVCaffe 0.14.5 compared to IBM Power S822L configuration: 20 cores (10 cores/socket) at 3.694 GHz with 4x NVIDIA M40 GPUs; 512 GB memory; Ubuntu 16.04 running BVLC-Caffe f28f5ae2f2453f42b5824723efc326a04dd16d85. Software stack details for both configurations: G++ – 5.3.1, Gfortran – 5.3.1, OpenBLAS – 0.2.18, Boost – 1.58.0, CUDA 8.0 Toolkit, LAPACK – 3.6.0, HDF5 – 1.8.16, OpenCV – 2.4.9.
(2) IBM Power S822LC for HPC configuration: 20 cores (10 cores/socket) at 3.95 GHz with 4x NVIDIA Pascal P100 GPUs; 512 GB memory; Ubuntu 16.04 LE running IBM version BVLC 1.0.0-rc3 compared to Intel E5-2640v4 (Broadwell): 20 cores (10 cores/socket) at 3.6 GHz with 8x NVIDIA M40 GPUs; 512 GB memory; Ubuntu 16.04 LE running BVLC-Caffe 985493e9ce3e8b61e06c072a16478e6a74e3aa5a. Software stack details for both configurations: G++ – 5.4, Gfortran – 5.4, OpenBLAS – 0.2.19, Boost – 1.58.0, CUDA 8.0 Toolkit, LAPACK – 3.6.0, HDF5 – 1.8.16, OpenCV – 2.4.9.