
Research

What we measure, why we measure it, what we found.

Refactron's thesis is that refactoring belongs to deterministic tools with formal safety guarantees, not to text generators. These papers are how we hold that claim to scrutiny: published benchmarks, published methodology, published source.

Published: 2

Planned: 2

Last update: 2026-05-15

Papers

  1. 01

    2026-05-15

    live

    Refactron 0.2.0. A measured look at deterministic refactoring at scale.

    Wall-clock benchmarks for analyze, plan, and the 3-gate verifier on synthetic and real Python fixtures. 45% faster on 100k LOC vs the 0.1 baseline. All scripts and raw runs in the public repo.

  2. 02

    2026-05-15

    live

    Refactron vs the codemod baseline. A head-to-head on var → const/let and format → f-string.

    Two transforms, measured against jscodeshift, Comby, ESLint --fix, and LibCST on identical inputs across speed, coverage, and safety. The unverified codemod tools run in under a second and write code that does not compile; Refactron is the slowest tool measured and the only one that reaches top coverage on both transforms without ever producing an unsafe rewrite.

  3. 03

    Target 2026-06

    planned

    Legacy patterns in the wild. An empirical survey of the top 100 PyPI packages.

    How prevalent are the patterns Refactron transforms target? Which packages would benefit most from a deterministic refactoring pass? Distribution by transform, by package age, and by test coverage.

  4. 04

    Target 2026-06

    planned

    Cross-file preconditions for callback_to_async_await. A method paper.

    Why this transform is the hardest of the ten and how the precondition set is constructed. Walks through the call-graph, the safety constraints, and the cases the transform deliberately refuses.

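For readers unfamiliar with the format → f-string transform named in paper 02, here is a minimal hand-written illustration of what such a rewrite means in Python. It is not Refactron output; the variable names are hypothetical, and the note on the second case is an assumption about how a conservative tool would behave, not documented Refactron behavior.

```python
# Hypothetical values used only for illustration.
name, count = "refactron", 3

# Before: str.format with positional placeholders on a string literal.
before = "{} ran {} transforms".format(name, count)

# After: the equivalent f-string. The result is identical.
after = f"{name} ran {count} transforms"

assert before == after

# A harder case: the format string is not a literal at the call site,
# so there is no direct f-string equivalent. A conservative tool would
# plausibly leave this call untouched (assumption, not documented behavior).
template = "{} ran {} transforms"
indirect = template.format(name, count)

assert indirect == before
```

The second case hints at why coverage and safety are measured separately: a rewrite that is correct for a literal can be wrong, or simply impossible, once the string comes from elsewhere.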

About this stream

Every Refactron research piece commits to three rules.

  • Reproducible: Every published number ships with the script that produced it. If you can't reproduce a claim from bench/, it shouldn't be in the paper.
  • Defensible: Methodology is described in detail upfront. No cherry-picked runs, no proprietary inputs, no mystery hardware.
  • Honest: Each paper has a Discussion section listing what it does not measure. We'd rather ship a narrow paper with sharp edges than a broad one with caveats hidden in the small print.
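As a rough illustration of the first two rules, a wall-clock measurement that reports every run rather than a single best one could be sketched as follows. This is a generic sketch and not the contents of bench/; the function names `time_runs` and `summarize` and the stand-in workload are hypothetical.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_runs(fn: Callable[[], object], repeats: int = 5) -> List[float]:
    """Return wall-clock durations in seconds for `repeats` calls of `fn`."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> Dict[str, object]:
    """Keep every raw run next to the median so no run is cherry-picked."""
    return {
        "runs": durations,
        "median_s": statistics.median(durations),
        "n": len(durations),
    }


if __name__ == "__main__":
    # Hypothetical workload standing in for an analyze/plan/verify step.
    print(summarize(time_runs(lambda: sum(range(100_000)))))
```

Publishing the raw `runs` list alongside the median lets a reader check that the headline number was not selected from a favorable run.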