Soft Merging of Experts with Adaptive Routing
Mixture-of-Experts model where blocks of layers within the network are computed as weighted averages of the parameters of different experts, rather than routing each input to a single discrete expert.
Read more →
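To make the idea concrete, here is a minimal NumPy sketch of parameter-level soft merging: the router's probabilities are used to average the experts' weights, and the input is then passed once through the single merged expert. All names and sizes (`expert_w`, `router_w`, `d_in`, and so on) are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(x):
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
d_in, d_out, num_experts = 8, 4, 3                      # toy sizes (made up)
expert_w = rng.normal(size=(num_experts, d_in, d_out))  # one weight matrix per expert
expert_b = rng.normal(size=(num_experts, d_out))        # one bias vector per expert
router_w = rng.normal(size=(d_in, num_experts))         # router producing expert logits

x = rng.normal(size=(d_in,))                            # a single token representation

# Routing: a probability distribution over experts for this input.
p = softmax(x @ router_w)                               # shape (num_experts,)

# Soft merging: average the experts' *parameters* with the routing weights,
# then run the input through the single merged expert once.
merged_w = np.tensordot(p, expert_w, axes=1)            # (d_in, d_out)
merged_b = p @ expert_b                                 # (d_out,)
y = x @ merged_w + merged_b

print(p, y)
```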
Presentation for: Unified Scaling Laws for Routed Language Models. Shows how MoE models scale with respect to parameter count and number of experts.
Read more →
Survey paper on MoE (Mixture-of-Experts) models from 2022. Overview of MoE variants, their strengths, and future research directions.
Read more →
Notes on the classic: Adaptive Mixtures of Local Experts (1991) by Jacobs, Jordan, Nowlan & Hinton.
Read more →
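For contrast with the soft-merging entry above, here is a minimal NumPy sketch of the classic formulation, where a gating network mixes the experts' outputs rather than their parameters. Names and sizes are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def softmax(x):
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(1)
d_in, d_out, num_experts = 8, 4, 3                      # toy sizes (made up)
expert_w = rng.normal(size=(num_experts, d_in, d_out))  # one linear expert per slot
gate_w = rng.normal(size=(d_in, num_experts))           # gating network weights

x = rng.normal(size=(d_in,))
g = softmax(x @ gate_w)                                 # gating probabilities over experts

# Classic mixture: every expert produces its own output; the gate mixes the *outputs*.
expert_out = np.einsum('i,eio->eo', x, expert_w)        # (num_experts, d_out)
y = g @ expert_out                                      # (d_out,)

print(g, y)
```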