Bill Dally

Domain-Specific Accelerators

Abstract

Increasing computing performance enables new applications and greater value from computing. With the end of Moore’s Law and Dennard Scaling, continued performance scaling will come primarily from specialization. Specialized hardware engines can achieve performance and efficiency from 10x to 10,000x a CPU through specialization, parallelism, and optimized memory access. Graphics processing units are an ideal platform on which to build domain-specific accelerators. They provide very efficient, high performance communication and memory subsystems - which are needed by all domains. Specialization is provided via “cores”, such as tensor cores that accelerate deep learning or ray-tracing cores that accelerate specific applications. This talk will describe some common characteristics of domain-specific accelerators via case studies.

About the speaker

Bill Dally is Professor (Research) of Computer Science and of Electrical Engineering at the Department of Computer Science at the Stanford University and Chief Scientist at Nvidia. Dally develops efficient hardware for demanding information processing problems and sustainable energy systems. His current projects include domain-specific accelerators for deep learning, bioinformatics, and SAT solving; redesigning memory systems for the data center; developing efficient methods for video perception; and developing efficient sustainable energy systems. His research involves demonstrating novel concepts with working systems. Previous systems include the MARS Hardware Accelerator, the Torus Routing Chip, the J-Machine, M-Machine, the Reliable Router, the Imagine signal and image processor, the Merrimac supercomputer, and the ELM embedded processor. His work on stream processing led to GPU computing. His group has pioneered techniques including fast capability-based addressing, processor coupling, virtual channel flow control, wormhole routing, link-level retry, message-driven processing, deadlock-free routing, pruning neural networks, and quantizing neural networks.