The Complete Guide to Needle: How Cactus Distilled Gemini Tool Calling into a 26M Model
Cactus has open-sourced Needle, a 26-million-parameter model that strips away the unnecessary complexity of large language models to focus purely on tool calling. Running at 6,000 tokens per second on consumer devices, Needle challenges the assumption that agentic AI requires massive models.










