on the clojure move


The System Doesn't Know Its Own Shape

There's something quietly broken in how we build Clojure systems. Not in any single piece — each piece is actually quite good — but in the space between the pieces.

1. Data-oriented, separately

Clojure got data-orientation right at every layer. Lifecycle libraries like Integrant or Component give you a system map — data describing what components exist and how they depend on each other. Routing libraries like reitit or bidi give you route tables — vectors of paths and handlers, inspectable, composable. Spec gives you a registry of data shapes — predicates with names, checkable, generatable. Protocols give you named behavioral contracts.

Each of these is individually excellent. Each produces data you can inspect, manipulate, reason about.
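To make the layers concrete, here is a sketch of the data each one produces, modeled as plain Clojure data with hypothetical names (in real Integrant the dependency would be an `ig/ref`, and the spec would live in spec's registry via `s/def`):

```clojure
;; Integrant-style system map: component keys and their dependencies.
(def system-config
  {:app.system/db           {:jdbc-url "jdbc:postgresql://localhost/app"}
   :app.system/user-service {:db :app.system/db}}) ; ig/ref in real Integrant

;; reitit-style route table: a vector of paths and handler data.
(def routes
  [["/api/users/:id" {:get :handler/get-user}]
   ["/api/users"     {:post :handler/create-user}]])

;; spec-style registry entry: a named, checkable data shape.
;; In real spec: (s/def :user/email string?)
(def spec-registry {:user/email string?})
```

Each of these maps and vectors can be printed, walked, and queried — that is the point of the next paragraph: the layers are data, but the links between them are not.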

But the thing that connects a route to a handler to a business function to a stateful component? That's Clojure code. A handler closes over a component injected at startup. A function calls another function by name. A route vector contains a keyword that resolves to a handler somewhere. The wiring is code, not data.

So we end up in this odd place: a system built entirely from data-oriented pieces, whose overall shape is not itself data. The system doesn't know its own topology. You can inspect any single layer — what routes exist, what components start, what specs are registered — but you can't ask the system "what does this endpoint actually depend on, transitively, all the way down to the database?"

That answer lives in the code, but the code doesn't offer it up. The developer either already knows how the wiring works, or has to trace through closures, call chains, and init methods to reconstruct it manually. The topology is there. You just can't ask for it.
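The wiring the text describes looks something like this sketch (all names hypothetical). The chain route → handler → service → db exists, but only as a closure:

```clojure
;; Stub service lookup, standing in for a real db-backed fetch.
(defn fetch-user [service id]
  (get-in service [:users id]))

;; The handler closes over a component injected at startup.
(defn make-get-user [user-service]
  (fn [request]
    {:status 200
     :body   (fetch-user user-service
                         (get-in request [:path-params :id]))}))

;; Nothing in any route vector or system map records that this route
;; transitively depends on whatever user-service closes over.
```

Once `make-get-user` has been called, the dependency is invisible: you can inspect the returned function, but not what it captured.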

2. Names pointing at names pointing at names

Identity in a typical Clojure project is spatial and name-based. A namespace path, a component key, a spec name, a route string — each is a single name pointing at a single thing. This works fine until you realize that the same conceptual entity has different names in different layers. :user/email is a spec. UserFetchable is a protocol. :app.system/user-service is a component key. /api/users/:id is a route. Each layer independently invents a name that references "user" in its own way, but the relationship between these names is implicit — encoded in code proximity, naming conventions, and developer knowledge.

As the system grows, each layer proliferates independently. More specs, more component keys, more routes, more protocol methods. The relationships between names multiply faster than the names themselves. But those relationships exist only in assembly code — in the init method that wires a component, in the handler function that calls a service, in the middleware that checks auth before a route fires.

The topology is real. It's just not declared.

3. The fear line

This has consequences for how we think about change. We hide behind narrow interfaces — expose as little as possible, because every exposed detail is a potential commitment. We're afraid of change, and instead of learning to manage change structurally, we minimize our exposure. Build a wall. Put an interface in front of it. Only let consumers see what they absolutely must.

But the wall doesn't actually solve the problem. We break consumers anyway — a major version bump, a changelog entry that says "breaking: removed X." The interface was supposed to protect us, but all it did was narrow the channel through which breakage eventually flows. The fear was justified. The response was misdirected.

Rich Hickey's framing in Spec-ulation is clarifying here: change is not one thing, it's two. Growth (provide more, require less) and breakage (provide less, require more). If we could structurally distinguish between these two at the level of system definitions, we wouldn't need to hide behind interfaces out of fear. We could expose more, confidently, because we'd know when a change was safe.
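If a definition's contract were data — what it requires and what it provides — the growth/breakage distinction becomes a set comparison. A minimal sketch (the contract shape here is my own assumption, not an existing library's):

```clojure
(require '[clojure.set :as set])

;; :growth   = provide more and/or require less
;; :breakage = provide less and/or require more
;; A no-op change classifies as :growth in this sketch.
(defn classify-change [old new]
  (let [provides-lost  (set/difference (:provides old) (:provides new))
        requires-added (set/difference (:requires new) (:requires old))]
    (if (or (seq provides-lost) (seq requires-added))
      :breakage
      :growth)))

(classify-change {:requires #{:user/id} :provides #{:user/email}}
                 {:requires #{}         :provides #{:user/email :user/name}})
;; => :growth  (requires less, provides more)
```

The check is trivial once the contracts are data; the hard part, as the rest of the piece argues, is that today they aren't.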

But to know that, the system would need to know its own shape.

4. The aggregate illusion

There's a subtler version of this same problem in how we think about data shapes. Spec's s/keys lets you define a named aggregate: ::user is a map that contains these keys with these specs. This seems natural — of course you'd want to say "this is what a user looks like."

But it creates what you might call a false center. The aggregate pretends there's a canonical "user" — a platonic form that all actual user maps are subsets of. In reality, there's no such thing. There's the user data your login function needs (just email and password hash). There's the user data your profile endpoint returns (name, avatar, preferences). There's the user data your admin panel shows (everything, plus audit logs). These aren't subsets of some ideal user. They're just different collections of keys that happen to overlap.
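The three "users" above, written as key enumerations rather than subsets of one canonical aggregate (key names are illustrative):

```clojure
(require '[clojure.set :as set])

(def login-keys   #{:user/email :user/password-hash})
(def profile-keys #{:user/name :user/avatar :user/preferences})
(def admin-keys   (into profile-keys #{:user/email :user/audit-log}))

;; They overlap without any one of them being "the" user:
(set/intersection login-keys profile-keys) ; => #{}
(set/intersection login-keys admin-keys)   ; => #{:user/email}
```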

Hickey saw this too, in his Maybe Not talk. Putting optionality in the aggregate definition was a mistake, because there's no usage context there. "When don't we know the model and the year? Who knows when?" The s/keys definition says forever which keys are required and which are optional, but that's a lie — it depends on where you are in the program.

His proposed fix was schema/select — separate the shape definition from the contextual subsetting. But this leads somewhere uncomfortable. Each selection needs a name or a reference. If you name each one, you're back to proliferation — ::user-for-login, ::user-for-profile, ::user-for-admin. If you don't name them, you can't talk about them or compose them.

Spec was genuinely trying to approach the broader identity problem. At the attribute level, it got things right — (s/def :user/email string?) creates an independent, globally addressable spec. Each attribute has its own qualified name, its own semantics, resolvable on its own. You can add more keys to maps freely. Spec is open-world at the attribute level by design.

The problem is specifically s/keys. The moment you write (s/def ::user (s/keys :req [:user/email :user/name])), you've created a named pointer to a fixed collection. And then everything references that name. The individual specs are still independent, still open, still composable. But the s/keys name becomes the thing people reach for, and it ossifies. The name attracts dependencies. It becomes a versioning bottleneck — not because the attributes aren't independent, but because the grouping got a name and names are what code couples to.
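The two levels side by side, using clojure.spec.alpha directly:

```clojure
(require '[clojure.spec.alpha :as s])

;; Attribute-level specs: independent, globally addressable.
(s/def :user/email string?)
(s/def :user/name  string?)

;; Each is checkable on its own, no aggregate needed:
(s/valid? :user/email "a@b.com") ; => true

;; The aggregate: a named pointer to a fixed collection of keys.
;; Once code couples to ::user, this grouping is what ossifies.
(s/def ::user (s/keys :req [:user/email :user/name]))
(s/valid? ::user {:user/email "a@b.com" :user/name "Ada"}) ; => true
```

Note that deleting `::user` would break every consumer of the name, while the attribute specs underneath it would remain perfectly usable.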

5. Specification is just enumeration

But maybe the whole pattern is backwards. In a system built on qualified keywords, every keyword is already a global address. :user/email is independently resolvable — you don't need to go through ::user to reach it. You already know what you need. You just list the keys.

The aggregate adds a layer of indirection that doesn't carry information. Going through ::user to get to :user/email is like going through a phone book to find a number you already have. The "context" that filters keys down to what you need? It's just… the keys you need. The selection is the specification. The map already knows what it knows — Hickey's own point about maps being self-describing.

Spec already got this right at the individual attribute level. Each s/def for a qualified keyword is independent, globally addressable, composable. The problem isn't that spec missed some deeper insight about properties being first-class — it has that. The problem is that s/keys introduced a named aggregation layer that undermines the very openness the attributes provide. The individual specs are fine. The name that points to a collection of them is the bottleneck.

In a qualified-keyword system, you don't need to predefine the group and then subtract from it. You enumerate what you need, the system resolves each part independently, and grouping is whatever you requested. No named intermediary. No false center.
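"The selection is the specification" can be sketched in a few lines — enumerate qualified keys and check each one independently, with no named aggregate in between (`conforms-to-keys?` is a hypothetical helper, not a spec API):

```clojure
(require '[clojure.spec.alpha :as s])

(s/def :user/email string?)
(s/def :user/password-hash string?)

;; Check only the keys you enumerate; a missing key yields nil and fails.
(defn conforms-to-keys? [m ks]
  (every? (fn [k] (s/valid? k (get m k))) ks))

;; What login needs is just... what login needs:
(conforms-to-keys? {:user/email "a@b.com" :user/password-hash "x"}
                   [:user/email :user/password-hash]) ; => true
```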

But this raises a practical question: if you dissolve the named aggregate, how do humans communicate? We think in concepts — "pass me a user," "this is an endpoint." We need names. The trick is recognizing that you might need two kinds of identity for the same thing: one that's human-oriented — a name you can say, type, remember — and one that's machine-oriented — a set of semantic attributes that can be queried, intersected, compared. The human name gives you communication. The machine identity gives you computation. Neither needs to be the other.
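One way to hold both identities at once, sketched with entirely hypothetical names: the human name is an alias, and what it points at is a machine-queryable set of semantic attributes.

```clojure
(require '[clojure.set :as set])

;; Human-oriented names mapping to machine-oriented attribute sets.
(def concepts
  {:concept/login-user   #{:user/email :user/password-hash}
   :concept/profile-user #{:user/name :user/avatar :user/preferences}})

;; Humans say :concept/login-user; machines intersect and compare:
(set/intersection (concepts :concept/login-user)
                  (concepts :concept/profile-user)) ; => #{}
```

The alias carries no structure of its own — renaming a concept changes what people say, not what any computation depends on.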

6. What we're missing

So here's where we are in a typical Clojure project. We have excellent data-oriented tools at each layer. We have qualified keywords that could, in principle, serve as universal addresses. We have Hickey's growth/breakage framework telling us how to think about change. We even have spec getting the attribute-level identity right.

But we're missing the connective tissue. The topology between layers is code, not data. The relationships between naming systems are implicit. Our aggregates create false centers that force us into either proliferation or indirection. And we hide behind minimal interfaces because we can't structurally tell growth from breakage.

None of this is unique to Clojure — it's the normal state of medium-to-large systems in any language. But it's sharper in Clojure because Clojure got so many other things right. The gap between "each layer is data" and "the system is data" is conspicuous precisely because the first part works so well.

The question isn't whether we need more tools. It's whether the architecture itself — the shape of the system, the relationships between its parts, the history of how it changed — can become data too. Queryable, diffable, validatable. Not a diagram on a wiki. Not a README that's three months stale. Data that emerges from how the system is actually written.

Atlas is an experimental project in which I'm exploring this gap. It tries to make the implicit topology between layers explicit through a different approach to defining system entities, so that architecture becomes a projection of the code rather than a separate artifact maintained alongside it.