research
What we are learning.
Field notes, experiments, and working theories on how models behave inside real systems.
A matched four-run comparison of `Qwen 3.5 27B` and `Gemma 4 31B` shows that both families recover the same two colour axes under controlled short-response extraction, but they organise that control differently across depth: Qwen is late-stack and prompt-stable, while Gemma keeps a mid-stack band active and sharpens the colour axes during rollout.
A research note on whether inference-time persona steering can push Qwen 3.5 9B into colour-matched working modes that perform better on the tasks they are matched to. In a 40-task cross-colour benchmark, matched personas outperformed both the base model and mismatched personas on every task family, with the clearest steering gains in red and green.