TL;DR - A subset of Standard ML, including dynamic semantics, is implemented in Lean 4. Using a variety of newly implemented tactics, this allows for writing machine checkable proofs about SML code similarly to how one would on paper.

If you are unfamiliar with Lean 4, one nice feature is its meta-programming system (which is shockingly easy to use, in my opinion). I won't go into how to use it here1, but instead focus on what you can use it to implement. Namely, using this system it was relatively easy to embed a version (subset) of Standard ML within Lean. This means that normal SML code can be written and Lean will expand it into an AST representation:

def fact := [sml|
  fun fact (n : int) : int =
    if n = 0
    then 1
    else n * fact (n - 1)

This is Lean code that can be ran and it generates the corresponding AST representation of 'fact'. Meta-programming features like this are already pretty useful2, but we don't have to stop at just embedding SML in Lean. We can take advantage of the "theorem proving" part of Lean to write proofs about pieces of SML code. For instance, we could prove the following statement (where 'env' is the state of the program after adding the 'fact' declaration):

set_option maxRecDepth 5000 in  -- lets `repeat` run longer since there's lots of steps
example : [smlprop|
           env ⊢ fact 5
      ==>* env ⊢ 120
    ] := by
  repeat sml_step               -- step the code until we prove the theorem

Importantly, this is not just an interpreter! This is a theorem that states that 'fact 5' evaluates to '120' in 0 or more steps. The 'repeat sml_step' proves this theorem by repeatedly applying rules from SML's dynamic semantics. This is similar to how one might prove this on by hand, by writing out various code-steps and justifying why they can be made.

By itself this isn't too useful or interesting, but we can use this to prove larger and more meaningful theorems about SML programs. For instance, we could try and prove that 'fact' is total for all natural numbers:

∀ n : Nat, ∃ v : Nat,
    env ⊢ fact ↑n ==>* env ⊢ ↑v

To help prove this statement in Lean, we add a '↑e' notation to SML, which allows us to more easily embed Lean terms into the SML programs. Now, think about how we might prove this statement on paper. That proof can essentially be mirrored in Lean:

theorem fact_nat_total
  : ∀ n : Nat, ∃ v : Nat,
      env ⊢ fact ↑n ==>* env ⊢ ↑v
    ] := by
  intro n
  induction n with
  | zero =>
    apply Exists.intro 1                -- v = 1 in this case
    repeat sml_step

  | succ n' ih =>
    let ⟨v', ih⟩ := ih                  -- `ih` is of the form `∃ v', ...` this pulls out the `v'`

    -- now `ih` is the proposition `fact ↑n' ==>* ↑v'`
    apply Exists.intro ((n' + 1) * v')  -- the resulting value for this case

             (env, [sml_exp| fact (↑(n' + 1))])
      _ ==>* (env, [sml_exp| ↑(n' + 1) * (fn n => if n = 0 then 1 else n * fact (n - 1)) ↑n'])
          := by repeat sml_step
      _ ==>* (env, [sml_exp| ↑(n' + 1) * ↑v'])
          := by repeat sml_congr        -- this focuses us on the second argument to `*`
                sml_apply_ih ih
      _ ==>* (env, [sml_exp| ↑((n' + 1) * v')])
          := by repeat sml_step

If you can follow the Lean code (without the info-view) then hopefully this seems fairly similar to what would be expected from a handwritten version.

To give a quick run-down: The zero (base) case is simply just stepping the code to the value '1'3. The succ n' (inductive) case is more interesting. The built-in calc tactic is used to guide the evaluation by writing out where we'd like to stop evaluating. In the step that applies the IH, the sml_congr tactic helps focus the proof so that the IH can be applied and the sml_apply_ih tactic tries to apply the ih (and does some basic transformations incase it can't be directly applied).

Hopefully in the future this initial framework can be further expanded to faciliate even more complex and interesting proofs (adding theorems and tactics for reasoning about extensional equivalence comes to mind). For now, hopefully this serves as nice peek at what can be done in Lean and what the future might hold.

Currently, this project isn't really in a state that I would describe as generally usable (unless already familiar with Lean). Some parts probably need to be reworked or just weren't implemented (e.g. typechecking). That being said, I don't intend on working on this project too much more. I mostly was using this as a proof-of-concept before I try something similar for C0 programs. So, if this has peaked your interest, feel free to reach out and take the reins. I'd love to help where I can : )

The source of the Lean implementation as well as examples can be found on GitHub (for instance: here is an example that the power function is correct). Also, thanks to James Gallicchio for their initial implementation of the AST and some of the semantic rules.

Some thoughts on pedagogy

Preface: I'm going to be talking about my experiences TAing Carnegie Mellon's intro to functional programming course called 15-150 or commonly 150. The specifics don't matter too much beyond that the course teaches the language Standard ML and students are expected to write proofs about their code in SML.

From my experiences, one of the main barriers for students and learning proofs is the time between making a mistake and receiving feedback. In the most ideal world, the fastest a student could get feedback about proofs from us (other than office hours, which has a host of other issues) is 3 days. A tool like this eliminates that delay since rather than needing a TA to grade and leave meaningful feedback (that a student might not even bother to read since it has been so long) Lean will tell the student what is wrong before they even save the file4.

There still is room for handwritten proofs. The way I see things, writing a proof in Lean is ideal for helping understanding how proofs structurally work. That new understanding should be translatable to handwritten work and actually requiring students to write some proofs by hand would be the ideal way to make sure students are actually learning (and not just bashing their head against the code until it works).

Likewise, having the ability to automatically assess student code allows for deeper and more interesting problems to be ran on homeworks. Since feedback could be delivered instantly, core concepts/learning objectives could be given in Lean. Then all of those concepts can be synthesised together in a final, handwritten/LaTeX'd proof.

But yeah, those are just some of my thoughts and I'm excited to see where these kinds of tools/ideas take us in the future!

↩1 The Metaprogramming in Lean 4 books provides an introduction to metaprogramming, with a bunch of examples.

↩2 Another project of mine already uses an embedding of WebAssembly text format to inline WASM programs and the lean-verbose project uses meta-programming to just let you write Lean proofs in plain english (and french). And these are only a taste of what has already been done.

↩3 Lean can generally infer this. For instance, replacing the base case with exact ⟨_, by repeat sml_step⟩ also works (likewise in the inductive case). This, and other code simplifications, were not done here for the sake of readability.

↩4 One interesting avenue I could think of is embedding something like this into an education game/website, similar to the already existing Natural Numbers Game but instead for SML.