code helix

Learning monosemantic features in DNA regulatory sequence models via sparse autoencoders

Published December 15, 2025
Contact: anya@codehelix.ai

Deep learning models excel at predicting genomic activity—from transcription to chromatin features—but their internal representations remain opaque. In our new work presented at the NeurIPS 2025 Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences, we apply sparse autoencoders (SAEs) to decode what these models actually learn.

The approach

We analyzed Borzoi, a state-of-the-art CNN-transformer that predicts genome-wide transcriptional and epigenetic profiles from DNA sequence (Linder et al., 2025). By training TopK-SAEs on activations from Borzoi's early convolutional layers, we systematically decomposed the model's learned representations into interpretable components.
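Extracting activations from intermediate convolutional layers is typically done with forward hooks. The sketch below is illustrative only: it uses a toy two-conv stand-in for Borzoi's early conv tower (the real model is not loaded here), and the layer names are assumptions.

```python
import torch
import torch.nn as nn

def capture_activations(model: nn.Module, layer_names):
    """Attach forward hooks that cache each named layer's output."""
    cache, handles = {}, []
    for name, module in model.named_modules():
        if name in layer_names:
            def hook(mod, inp, out, name=name):
                cache[name] = out.detach()
            handles.append(module.register_forward_hook(hook))
    return cache, handles

# Toy stand-in for the early conv tower: one-hot DNA has 4 channels.
model = nn.Sequential(
    nn.Conv1d(4, 8, kernel_size=5, padding=2),
    nn.ReLU(),
    nn.Conv1d(8, 16, kernel_size=5, padding=2),
)
cache, handles = capture_activations(model, {"0", "2"})
_ = model(torch.randn(1, 4, 100))  # (batch, channels, sequence length)
for h in handles:
    h.remove()
# cache["0"] and cache["2"] now hold the two conv layers' activations
```

The cached per-position activations are what the SAE is then trained to reconstruct.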



Figure 1: SAE framework for discovering monosemantic regulatory features. We extract activations from Borzoi's first four convolutional layers, train TopK-SAEs with expansion factor ℱ=4 and 5% sparsity to decompose them into sparse features, then identify biological concepts by extracting top-activating sequence regions (seqlets), discovering position weight matrices (PWMs) via MEME, and matching to known TF/RBP databases.
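The seqlet step in the pipeline above can be sketched as picking non-overlapping windows around a feature's highest activation peaks. This is a simplified illustration, not the paper's exact extraction procedure (the MEME/TomTom matching steps are omitted), and the window size is an assumption.

```python
import numpy as np

def top_seqlets(acts, sequence, n_seqlets=3, flank=5):
    """Return subsequences centred on the highest, non-overlapping
    activation peaks of a single SAE feature (illustrative only)."""
    order = np.argsort(acts)[::-1]          # positions by activation, descending
    used = np.zeros(len(acts), dtype=bool)
    seqlets = []
    for pos in order:
        lo, hi = max(0, pos - flank), min(len(acts), pos + flank + 1)
        if used[lo:hi].any():               # skip windows overlapping a taken peak
            continue
        used[lo:hi] = True
        seqlets.append(sequence[lo:hi])
        if len(seqlets) == n_seqlets:
            break
    return seqlets

# Synthetic example: a feature that fires wherever "TATAAA" starts
seq = "GCGC" + "TATAAA" + "GCGCGCGCGC" + "TATAAA" + "GCGC"
acts = np.array([6.0 if seq[i:i + 6] == "TATAAA" else 0.1
                 for i in range(len(seq))])
seqlets = top_seqlets(acts, seq, n_seqlets=2, flank=5)
```

In practice the collected seqlets are aligned and summarized into a PWM before database matching.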

Sparse autoencoder architecture

The TopK sparse autoencoder expands the channel dimension by factor ℱ > 1, retains only the top k percent of autoencoder latents, and reconstructs the original activations (Gao et al., 2024):

z = ReLU(TopK(W_enc(x − b_dec) + b_enc))
x̂ = W_dec z + b_dec

where z represents the sparse features and W_enc, b_enc, W_dec, b_dec are the learned encoder and decoder weights and biases. The loss minimizes the reconstruction error: ℒ = ‖x − x̂‖²₂.
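A minimal PyTorch sketch of this encoder/decoder pair, following Gao et al. (2024) and the post's hyperparameters (ℱ=4, 5% sparsity); dimensions and class/variable names are illustrative, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class TopKSAE(nn.Module):
    """Sketch of a TopK sparse autoencoder: expand by factor F,
    keep the top (sparsity * F * d_in) latents, reconstruct."""

    def __init__(self, d_in: int, expansion: int = 4, sparsity: float = 0.05):
        super().__init__()
        d_hidden = expansion * d_in
        self.k = max(1, int(sparsity * d_hidden))
        self.W_enc = nn.Linear(d_in, d_hidden)   # includes b_enc
        self.W_dec = nn.Linear(d_hidden, d_in)   # includes b_dec

    def forward(self, x: torch.Tensor):
        # Encode: subtract decoder bias, project up, keep top-k, rectify.
        pre = self.W_enc(x - self.W_dec.bias)
        vals, idx = torch.topk(pre, self.k, dim=-1)
        z = torch.zeros_like(pre).scatter_(-1, idx, torch.relu(vals))
        x_hat = self.W_dec(z)                    # x_hat = W_dec z + b_dec
        return z, x_hat

# Reconstruction loss L = ||x - x_hat||_2^2 on toy 512-channel activations
x = torch.randn(8, 512)
sae = TopKSAE(d_in=512)
z, x_hat = sae(x)
loss = ((x - x_hat) ** 2).sum(dim=-1).mean()
```

Note that `z` has at most `k` active latents per position by construction, which is what makes the learned features sparse and, ideally, monosemantic.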

What did we discover?

Our analysis revealed 2,173 monosemantic features corresponding to distinct biological concepts:

  • 585 nodes matched transcription factor (TF) binding motifs
  • 42 nodes recognized RNA binding protein (RBP) motifs
  • 1,042 nodes captured transposable element sequences
  • 352 nodes identified candidate cis-regulatory elements (cCREs)

Fundamental regulatory motifs—TATA boxes, poly-A signals, SINE/Alu elements, and GATA motifs—exhibited the highest activation values, suggesting preferential signal propagation through the network.



Figure 2: Landscape of SAE-discovered regulatory features. (A) Feature importance plot showing activation frequency versus mean activation strength for layer 2 nodes. High-activation nodes capture fundamental regulatory elements: TATA/RBMS3, SINE/Alu, poly-A, and GATA1+TAL1 composite elements. (B) Sankey diagram showing distribution of 2,173 discovered features across biological categories.

Explore the features

We built an interactive visualization platform at motifscout.com where you can explore SAE features across layers L1-L4, view position weight matrices (PWMs), TomTom matches to known TF databases, and analyze motif-level strand-specific patterns.

Why this matters

This work demonstrates that sparse autoencoders can systematically decompose deep learning representations in regulatory genomics into biologically meaningful features. By revealing what models learn—from basic TATA boxes to complex transcription factor binding patterns—we provide a framework for mechanistic understanding of how deep learning encodes regulatory biology.

The comprehensive recognition of transposable elements by Borzoi is particularly intriguing and warrants further investigation, as these repetitive regions have extensively rewired regulatory networks and may influence nearby transcription and chromatin accessibility.

All code is available at github.com/calico/sae-borzoi.

References

Gao, L., la Tour, T. D., Tillman, H., Goh, G., Troll, R., Radford, A., Sutskever, I., Leike, J., & Wu, J. (2024). Scaling and evaluating sparse autoencoders. arXiv preprint arXiv:2406.04093. https://doi.org/10.48550/arXiv.2406.04093

Linder, J., Srivastava, D., Yuan, H., Agarwal, V., & Kelley, D. R. (2025). Predicting RNA-seq coverage from DNA sequence as a unifying model of gene regulation. Nature Genetics, 1–13. https://doi.org/10.1038/s41588-024-02053-6