fano-embedding

read any word against a fixed set of named semantic axes instead of letting PCA pick them

python, embeddings, nlp

A semantic instrumentation framework. Instead of letting PCA discover axes from variance, it fixes a set of named axes up front and reads any word or text against them.

Each axis is the unit vector pointing from the centroid of a negative pole to the centroid of a positive pole, where each pole is a handful of words. Project a concept onto all of the axes and you get a fingerprint showing how it aligns with each dimension.
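A minimal sketch of that construction, assuming each word already has an embedding vector. The helper names (`centroid`, `make_axis`, `project`) and the 3-d toy vectors are invented for illustration and are not taken from the project's code.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def make_axis(pos_vectors, neg_vectors):
    """Unit vector pointing from the negative-pole centroid to the positive one."""
    pos = centroid(pos_vectors)
    neg = centroid(neg_vectors)
    diff = [p - q for p, q in zip(pos, neg)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0:
        raise ValueError("pole centroids coincide; axis is undefined")
    return [d / norm for d in diff]

def project(vector, axis):
    """Signed length of `vector` along a unit-length `axis`."""
    return sum(v * a for v, a in zip(vector, axis))

# Toy 3-d vectors standing in for real embeddings (made up for illustration).
toy = {
    "create": [1.0, 0.2, 0.0],
    "generate": [0.8, 0.0, 0.2],
    "copy": [-1.0, 0.1, 0.0],
    "repeat": [-0.8, 0.1, 0.2],
}
generativity = make_axis(
    [toy["create"], toy["generate"]],
    [toy["copy"], toy["repeat"]],
)

# One entry per axis; a real fingerprint would have all eight.
probe = [0.5, 0.3, 0.1]
fingerprint = {"generativity": project(probe, generativity)}
```

A positive value means the probe leans toward the positive pole, a negative value toward the negative pole.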

The axes

Eight axes survived cross-model validation, each built from centroid pairs of opposing word sets:

Axis         | Positive pole                             | Negative pole
boundary     | Input, Receive, Observe                   | Output, Emit, Express
state        | Memory, Belief, Model, State              | Signal, Output, Action
update       | Feedback, Learning, Attention, Reflection | Commit, Choice, Action, Control
agency       | Goal, Intention, Choice, Action           | Perceiving, Receiving, Feedback
affect       | Desire, Fear, Curiosity, Feeling          | Planning, Model, Evaluation
prediction   | Prediction, Expectation, Planning         | Memory, Reflection, Knowing
control      | Control, Agency, Direction, Constraint    | Adaptation, Drift, Chance
generativity | Create, Generate, Compose, Originate      | Copy, Repeat, Maintain, Preserve

How it works

a few methodological choices that make the readings hold up:

  • per-word centroids for axis construction, not phrase embeddings. averaging individual word vectors gives a cleaner pole than embedding the whole pole as one string
  • mean-centering the vectors first, to strip out the anisotropic bias that embedding models bake into every vector
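  • a least-squares fit against all axes at once rather than separate dot products, so the projections account for the fact that the axes aren't orthogonal, with a residual for whatever the axes can't explain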
  • the residual doubles as a centrality meter: low residual means the concept is a hub that the frame already explains, high residual means it's a leaf living mostly off-frame
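The centering and least-squares steps above can be sketched as follows. This is an illustration under assumptions, not the project's implementation: the set of vectors the mean is taken over, the normalisation of the residual (here, relative to the vector's own length), and all function names are choices made for this example. It solves the normal equations in plain Python so it runs without dependencies; in practice `numpy.linalg.lstsq` would be the natural tool.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mean_center(vectors):
    """Subtract the mean vector of the set from every vector in it."""
    n = len(vectors)
    mean = [sum(col) / n for col in zip(*vectors)]
    return [[x - m for x, m in zip(v, mean)] for v in vectors]

def solve(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("axes are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x

def fit(vector, axes):
    """Least-squares coefficients of a nonzero `vector` on `axes`, plus relative residual."""
    gram = [[dot(a, b) for b in axes] for a in axes]  # axis-to-axis overlaps
    rhs = [dot(a, vector) for a in axes]              # the naive dot products
    coeffs = solve(gram, rhs)
    recon = [sum(c * a[i] for c, a in zip(coeffs, axes)) for i in range(len(vector))]
    leftover = [v - r for v, r in zip(vector, recon)]
    residual = math.sqrt(dot(leftover, leftover)) / math.sqrt(dot(vector, vector))
    return coeffs, residual

# Centering a tiny made-up set: each column sums to zero afterwards.
centered = mean_center([[1.0, 2.0], [3.0, 4.0]])

# Two unit-length axes that overlap (their dot product is 0.6).
axes = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0]]

# "hub": built as 2*axis0 + 1*axis1, so it lies entirely inside the frame.
hub_coeffs, hub_residual = fit([2.6, 0.8, 0.0], axes)

# "leaf": mostly along a direction neither axis covers.
leaf_coeffs, leaf_residual = fit([0.1, 0.0, 2.0], axes)
```

For the hub vector, separate dot products would report 2.6 and 2.2 because the overlap between the axes gets counted twice, while the joint fit recovers the coefficients 2 and 1 it was built from.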

Validation

ran the whole thing across four embedding models. most axes reproduce consistently (generativity held at +0.32 to +0.44 everywhere). a ninth axis, transmission for communication, never converged and got cut. the nice part is the frame can flag its own gaps: an axis with no signal across models is one the word poles just don't capture.
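One way such a check could look, as a sketch only: the source does not say how convergence was judged, so the rule here (same sign in every model, and a minimum magnitude `min_abs`) is an assumption, as are the function name and the placeholder model names. The generativity values are just the two endpoints quoted above; the transmission values are invented to show a failing case and are not results from the project.

```python
def axis_has_signal(scores_by_model, min_abs=0.1):
    """True when every model agrees on sign and clears `min_abs` (assumed rule)."""
    values = list(scores_by_model.values())
    same_sign = all(v > 0 for v in values) or all(v < 0 for v in values)
    strong_enough = min(abs(v) for v in values) >= min_abs
    return same_sign and strong_enough

# Endpoints of the range reported for generativity; model names are placeholders.
generativity = {"model_a": 0.32, "model_b": 0.44}

# Invented values illustrating an axis that flips sign between models.
transmission = {"model_a": 0.21, "model_b": -0.05}

keep_generativity = axis_has_signal(generativity)
keep_transmission = axis_has_signal(transmission)
```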

Run it

uv run python main.py

swap the embedding model by setting the MODEL constant to any embedding model Ollama can serve.
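For orientation, fetching embeddings from a local Ollama server could look roughly like this. It is a sketch, not a copy of main.py: the model name `nomic-embed-text` is only an example value for MODEL, the helper names are invented here, and it assumes Ollama is running on its default port with the model already pulled.

```python
import json
import urllib.request

MODEL = "nomic-embed-text"  # example value; any embedding model pulled into Ollama
OLLAMA_URL = "http://localhost:11434/api/embed"  # Ollama's default local address

def build_request(texts, model=MODEL, url=OLLAMA_URL):
    """Build the POST request for Ollama's embed endpoint."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def embed(texts, model=MODEL):
    """Return one embedding per input string; needs a running Ollama server."""
    with urllib.request.urlopen(build_request(texts, model)) as response:
        return json.loads(response.read())["embeddings"]
```

Calling `embed(["create", "copy"])` with the server up should return two vectors, one per word, which can then be centered and projected.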
