Blog

Welcome to the SSAIL Lab at UIUC. At SSAIL, we’re driven by a bold vision: to redefine how large-scale machine learning systems are built, optimized, and applied. On the systems side, we build efficient, effective, and easy-to-use systems that power the future of AI, from high-performance distributed training to ultra-fast inference engines. On the machine learning side, we explore new frontiers, from model compression to vision-language models, agents, scientific AI, and beyond, pushing the boundaries of what’s possible while uncovering the principles that make it all work. What excites us most is the synergy between systems and algorithms: every algorithmic insight drives new system breakthroughs, and every system we build opens the door to new algorithmic capabilities. This virtuous cycle is where innovation thrives and where SSAIL is helping shape the next generation of intelligent systems. Explore our latest blog posts to see what we’re working on.

VoltanaLLM: Feedback-Driven Frequency Control and Routing for Energy-Efficient LLM Serving

This blog post presents the motivation, insights, and key optimizations behind VoltanaLLM, our system for energy-efficient LLM inference. We’ll walk through why energy matters, how conventional GPU frequency scaling falls short, the surprising behaviors we uncovered when profiling LLM serving, how prefill/decode (P/D) disaggregated serving creates unique opportunities, and how VoltanaLLM’s co-design of frequency control and routing achieves up to 36.3% GPU energy savings while maintaining near-perfect Service Level Objective (SLO) attainment.

llm-inference gpu-optimization gpu

September 14, 2025 · 9-minute read