Notes

Method write-ups.

Dense reward
Why step-by-step grading beats pass/fail grading.
Note
Environments under RL
What a graded world does to a model.
Study
Shelf life
Representing a codebase as an environment.
Note