Overdesigned at first — simplifying the suggestion system, then adding precision

VauDium·May 12, 2026

Four trigger types, a signal-spec abstraction, all records re-evaluated on every event. It looked well-designed until using it produced an endless cycle of suggestions. A day of cutting it back, then adding the precision the simpler shape could now hold.

Overdesigned at first

When a record is completed and its description differs from the template’s, the diff should become a suggestion that can flow back to the template. A one-line idea. The structure I’d built for it was anything but.

Four trigger types: RECORD_COMPLETED, RECORD_UNCOMPLETED, TEMPLATE_FIELD_CHANGED, RECORD_DELETED
An ImproveSignalSpec dataclass: triggers, monitored fields, needs_records, record_state_filter, compute_from_template, compute_from_record, is_dismissed_match, apply
A REGISTRY: signal type → spec lookup
An orchestrator that, on every trigger, walked every completed record for the template and re-evaluated every signal

It looked like clean architecture. “Adding a new signal type is easy” — that kind of pattern. Using it broke the model.

The cycle

A user found a scenario:

Edit a record’s description text
Apply that change to the template
After applying, the old records’ contents reappear as fresh suggestions

The screen had five identical cards proposing to remove the line “hahaha”, plus one final card with a different change. Five identical cards.

The cause was simple once I traced it. The TEMPLATE_FIELD_CHANGED trigger fired after each apply, the orchestrator re-walked every completed record, and each one generated a new hunk against the now-changed template. Apply, regenerate, apply, regenerate. Reversion suggestions from older records piled up endlessly.
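
The loop can be reproduced in a few lines. This is only a sketch under assumptions: the helper names (diff_hunks, reevaluate_all) and the sample descriptions are invented for illustration, and the real orchestrator did more than this. It shows how re-walking every completed record after an apply turns each older record into a suggestion to undo the change.

```python
import difflib

def diff_hunks(template_text, record_text):
    """Non-equal opcodes between template and record, as (tag, old, new)."""
    t = template_text.splitlines()
    r = record_text.splitlines()
    sm = difflib.SequenceMatcher(a=t, b=r, autojunk=False)
    return [
        (tag, t[i1:i2], r[j1:j2])
        for tag, i1, i2, j1, j2 in sm.get_opcodes()
        if tag != "equal"
    ]

def reevaluate_all(template_text, completed_records):
    """Old behaviour (hypothetical reconstruction): walk every record."""
    suggestions = []
    for record_text in completed_records:
        suggestions.extend(diff_hunks(template_text, record_text))
    return suggestions

old_template = "warm up\n30 min workout"
old_records = [old_template] * 5            # five records matching the old template
edited_record = old_template + "\nhahaha"   # one record with an extra line
all_records = old_records + [edited_record]

before = reevaluate_all(old_template, all_records)  # one "insert hahaha" hunk
new_template = edited_record                        # the user applies it
after = reevaluate_all(new_template, all_records)   # five "delete hahaha" hunks
print(len(before), len(after))
```

Each of the five older records now disagrees with the updated template in the same way, so the re-walk emits five identical reversion hunks.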

The user’s one-liner clinched it: “Why are we auto-generating these? Why are we solving a simple problem this hard?”

They were right. And one more: “Isn’t it enough to line-by-line diff at COMPLETED time, save the line numbers and contents, and just show them?”

Cutting

I cut the architecture down.

One trigger only: RECORD_COMPLETED. When a record completes, compute only that record’s hunks.
All TEMPLATE_FIELD_CHANGED call sites removed — apply doesn’t trigger a recompute. No cycle to make.
RECORD_UNCOMPLETED only cleans up that record’s PENDINGs (no recompute).
RECORD_DELETED only cleans up.
The orchestrator’s “iterate every record” loop: gone. One record at a time.
ImproveSignalSpec dataclass / REGISTRY / ImproveTrigger enum: all removed.
In their place: three dispatch dicts — PER_RECORD_SIGNALS, PER_TEMPLATE_SIGNALS, APPLY_HANDLERS.
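
A minimal sketch of what the three-dict shape might look like. Only the dict names come from the real code; the signal key "description_diff", the function names, and the dict-shaped records are assumptions made for illustration.

```python
def description_diff(record, template):
    """Per-record signal (hypothetical): suggest the record's description."""
    if record["description"] == template["description"]:
        return []
    return [{
        "signal": "description_diff",
        "record_id": record["id"],
        "proposed": record["description"],
    }]

def apply_description_diff(template, suggestion):
    """Apply handler (hypothetical): copy the proposal into the template."""
    return {**template, "description": suggestion["proposed"]}

# Plain dicts instead of a spec dataclass plus registry.
PER_RECORD_SIGNALS = {"description_diff": description_diff}
PER_TEMPLATE_SIGNALS = {}  # left empty in this sketch
APPLY_HANDLERS = {"description_diff": apply_description_diff}

def on_record_completed(record, template):
    """The single trigger: compute hunks for this one record only."""
    suggestions = []
    for compute in PER_RECORD_SIGNALS.values():
        suggestions.extend(compute(record, template))
    return suggestions

def apply_suggestion(template, suggestion):
    """Apply never calls back into the signal functions, so no cycle."""
    return APPLY_HANDLERS[suggestion["signal"]](template, suggestion)

template = {"id": "t1", "description": "warm up"}
record = {"id": "r1", "description": "warm up\nstretch"}
pending = on_record_completed(record, template)
template = apply_suggestion(template, pending[0])
```

Adding a signal means writing one function and adding one dict entry; nothing iterates over other records.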

About 250 lines came out. The same job in less code, and with more precision, not less.

On apply, I added one thing. If several records had proposed different alternatives for the same source line (current_value), picking one of them auto-clears the rest. “Once a version is chosen, the alternatives for the same source line are moot” — a natural rule.
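
As a sketch of that rule, assuming PENDINGs are held as a list of dicts and that current_value is a list of source lines (the id and proposed_value field names are hypothetical):

```python
def clear_alternatives(pendings, chosen_id):
    """Return the PENDINGs that survive once `chosen_id` is applied.

    The chosen suggestion and every alternative for the same source
    line(s) are dropped; suggestions for other lines stay.
    """
    chosen = next(p for p in pendings if p["id"] == chosen_id)
    return [
        p for p in pendings
        if p["current_value"] != chosen["current_value"]
    ]

pendings = [
    {"id": 1, "current_value": ["30 min workout"], "proposed_value": ["30 min run"]},
    {"id": 2, "current_value": ["30 min workout"], "proposed_value": ["30 min swim"]},
    {"id": 3, "current_value": ["stretch"], "proposed_value": ["stretch 10 min"]},
]
remaining = clear_alternatives(pendings, chosen_id=1)
```

Here suggestion 2 disappears along with suggestion 1, and suggestion 3 is untouched.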

Precision through position

Then the user pushed further:

“With grouping by current_value, what happens when the same content appears in consecutive lines?”

If a template description has “30 min workout” in two places and some record proposes changing it to “30 min run”, which occurrence does that target? The current structure couldn’t tell. description.indexOf(currentValue) always picked the first one.

The fix was storing position. difflib’s opcodes already gave me (tag, i1, i2, j1, j2) tuples with line indices; I just had to keep i1 as current_line_start:

The dedupe key includes line_start → same content at a different position = a distinct PENDING.
Apply verifies lines[ls:ls+len(cv)] and replaces at the exact position.
The client groups by (line_start, current_value) instead of current_value alone.
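
A sketch of the hunk computation with position kept, using difflib.SequenceMatcher.get_opcodes(). The field names other than current_line_start and current_value, the list-of-lines representation, and the dedupe_key helper are assumptions for illustration.

```python
import difflib

def compute_hunks(template_text, record_text):
    """One hunk per non-equal opcode, with the template line index kept."""
    t_lines = template_text.split("\n")
    r_lines = record_text.split("\n")
    sm = difflib.SequenceMatcher(a=t_lines, b=r_lines, autojunk=False)
    hunks = []
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag == "equal":
            continue
        hunks.append({
            "current_line_start": i1,          # position in the template
            "current_value": t_lines[i1:i2],   # lines being replaced or removed
            "proposed_value": r_lines[j1:j2],  # hypothetical field name
        })
    return hunks

def dedupe_key(hunk):
    """Same content at a different position is a distinct PENDING."""
    return (
        hunk["current_line_start"],
        tuple(hunk["current_value"]),
        tuple(hunk["proposed_value"]),
    )

template = "warm up\n30 min workout\nstretch\n30 min workout"
record = "warm up\n30 min workout\nstretch\n30 min run"
hunks = compute_hunks(template, record)
print(hunks[0]["current_line_start"])
```

The hunk points at the second "30 min workout" (line index 3), where a content-only lookup would have landed on index 1.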

And again the user pushed:

“Can it handle when a change adds or removes lines?”

Yes. Applying one hunk can change the line count of the description. Replacing one line with three is +2. PENDINGs after that point need their current_line_start shifted by 2. PENDINGs within the applied range refer to lines that no longer exist — stale.

The apply handler now:

Deletes overlapping PENDINGs: current_line_start ∈ [applied_start, applied_end)
Shifts following PENDINGs: $inc: { current_line_start: delta } for all with current_line_start ≥ applied_end
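
The two steps can be sketched in memory as follows. The real handler does the shift with a database $inc; here plain lists stand in for the store, and the id and proposed_value fields are hypothetical.

```python
def apply_hunk(lines, pendings, hunk):
    """Apply `hunk` to `lines`; return (new_lines, surviving_pendings)."""
    start = hunk["current_line_start"]
    cv, pv = hunk["current_value"], hunk["proposed_value"]
    end = start + len(cv)
    if lines[start:end] != cv:
        raise ValueError("hunk no longer matches the template")

    new_lines = lines[:start] + pv + lines[end:]
    delta = len(pv) - len(cv)  # e.g. one line replaced by three: +2

    surviving = []
    for p in pendings:
        if p["id"] == hunk["id"]:
            continue  # the applied suggestion itself is done
        p_start = p["current_line_start"]
        if start <= p_start < end:
            continue  # overlapping: its lines no longer exist
        if p_start >= end:
            p = {**p, "current_line_start": p_start + delta}  # shift
        surviving.append(p)
    return new_lines, surviving

lines = ["a", "b", "c", "d"]
applied = {"id": 1, "current_line_start": 1,
           "current_value": ["b"], "proposed_value": ["b1", "b2", "b3"]}
pendings = [
    applied,
    {"id": 2, "current_line_start": 1, "current_value": ["b"], "proposed_value": ["x"]},
    {"id": 3, "current_line_start": 3, "current_value": ["d"], "proposed_value": ["D"]},
]
new_lines, surviving = apply_hunk(lines, pendings, applied)
```

Suggestion 2 is dropped as overlapping, and suggestion 3 moves from index 3 to index 5, where "d" now sits. The overlap test looks only at each PENDING's start line, as in the rule above, so a PENDING that starts earlier but spans into the applied range is not removed here and would have to be caught by a later verification step.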

The stale filter got line-aware too — at fetch time, a suggestion is shown only if lines[ls:ls+cvLines].join("") === cv. If drift breaks the verification, it disappears automatically.
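
The same check, sketched in Python for consistency with the server side. The real filter runs in the client; the function names are hypothetical and current_value is assumed to be a list of lines.

```python
def is_still_valid(lines, pending):
    """True if the template lines at the stored position still equal cv."""
    ls = pending["current_line_start"]
    cv = pending["current_value"]
    if not 0 <= ls <= len(lines):
        return False
    return lines[ls:ls + len(cv)] == cv

def visible_suggestions(description, pendings):
    """Keep only suggestions whose source lines have not drifted."""
    lines = description.split("\n")
    return [p for p in pendings if is_still_valid(lines, p)]

description = "warm up\n30 min run"
pendings = [
    {"id": 1, "current_line_start": 0, "current_value": ["warm up"]},
    {"id": 2, "current_line_start": 1, "current_value": ["30 min workout"]},
]
shown = visible_suggestions(description, pendings)
```

Suggestion 2 still refers to "30 min workout", which is no longer at line 1, so it is filtered out without any cleanup job.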

A unified diff view

I also rewrote the UI from scratch. The old view was one card per hunk (red + green + buttons). The new one is GitHub PR style — a unified diff:

unchanged line
- line to remove
+ alternative A    [>][X]
+ alternative B    [>][X]
+ alternative C    [>][X]
unchanged line
unchanged line
+ line to insert    [>][X]

Multiple alternatives for the same source share one red line and stack as green lines, each with its own apply/dismiss inline. Unchanged context lines come along, so where and how the change applies is visible in one read.

With current_line_start, positional grouping works; the client’s verify step (does the real line still match cv?) hides stale entries automatically.
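
A rough sketch of the grouping behind that view, written in Python although the real rendering is client code. It assumes groups do not overlap each other, that stale entries were already filtered out, and that proposed_value is a hypothetical field holding a list of lines.

```python
from collections import defaultdict

def render_unified(lines, pendings):
    """Build unified-diff rows: one red block per group, stacked greens."""
    groups = defaultdict(list)  # (line_start, current_value) -> alternatives
    for p in pendings:
        key = (p["current_line_start"], tuple(p["current_value"]))
        groups[key].append(p)
    by_start = defaultdict(list)
    for (start, cv), alts in groups.items():
        by_start[start].append((cv, alts))

    rows, i = [], 0
    while i <= len(lines):  # <= so insertions at the very end are shown
        consumed = 0
        for cv, alts in by_start.get(i, []):
            rows.extend(f"- {line}" for line in cv)
            for alt in alts:
                rows.extend(f"+ {line}" for line in alt["proposed_value"])
            consumed = max(consumed, len(cv))
        if consumed:
            i += consumed  # skip the lines shown as removed
        else:
            if i < len(lines):
                rows.append(f"  {lines[i]}")  # unchanged context
            i += 1
    return rows

lines = ["warm up", "30 min workout", "stretch"]
pendings = [
    {"current_line_start": 1, "current_value": ["30 min workout"], "proposed_value": ["30 min run"]},
    {"current_line_start": 1, "current_value": ["30 min workout"], "proposed_value": ["30 min swim"]},
    {"current_line_start": 3, "current_value": [], "proposed_value": ["cool down"]},
]
rows = render_unified(lines, pendings)
print("\n".join(rows))
```

The two alternatives for line 1 share one removed line, and the pure insertion at the end appears as a lone green row.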

Looking back

The first design reached for future extensibility too soon. Four triggers, a signal-spec abstraction, a registry: pattern-friendly for adding new signal types, but shaped for what the system might look like later, not for what it needs to do now.

The user pointed at it precisely. “Why auto-generate?” Narrow auto-generation to one record at a time and there’s no cycle. “Why solve it this hard?” Drop the spec dataclass and the registry, and what’s left is three dispatch dicts.

And once it was simpler, precision fit. Position info, drift correction, verify. Adding these to the old over-engineered structure would have multiplied complexity. On top of the simple core, they slotted in naturally.

The takeaway is small but real: don’t grab for the abstraction first. With two signals, write two functions; don’t extract a dataclass. Wait until a third shows up and the pattern is obvious. Today I had to relearn that.