Functional Programming at St Andrews
RSS icon Email icon Home icon
  • “Provisional” Definitions in Idris

    Posted on February 9th, 2009 Edwin Brady 2 comments

    I’ve been doing a fair bit of programing in Idris lately, mostly experimenting with implementing Domain Specific Embedded Languages. I’ll say more on that some other time (and I’ll talk informally about it at lunch on Wednesday) – here I wanted to mention a feature that was fairly simple to implement, but I now find I cannot live without, that I am provisionally called “provisional definitions.”

    When your type system gets sufficiently expressive, and you begin to write down properties of your programs, you will sooner or later find that you need to do some theorem proving. The challenges for any dependently typed language then include:

    • Can we separate these theorem proving tasks from the rest of the program?
    • How do we know what the theorems we need to prove are?
    • How much of the theorem proving can be automated?

    ..among several others.

    The first question is important because we don’t want the presence of proofs to affect the readability of programs — proof terms may distract from real computational content. The second and third are important because we don’t want to affect the writability (if there is such a word) of programs. I’d like to write down what I think the program should do (i.e. its type, in as much detail as I feel is necessary), then how I think it should do it (i.e. its computational content), then have a machine tell me what it thinks is necessary to justify my computational content.

    To this end, Idris includes the following two features:

    • Programs may contain holes, marked with a ?.
    • Pattern match clauses may be marked as provisional, by defining with ?= rather than =. Such clauses may be incorrectly typed, and the type checker will insert a lemma (to be proved by the programmer) to convert them to the correct type.

    Since Idris is built with a theorem proving library, Ivor, generated holes may be proved interactively, and are written separately from the program.

    For example, imagine we’re implementing a new programming language. At first, we’ve kept things simple in the implementation, just to make sure things work, without worrying too much about efficiency. But soon we find that the typechecker is too slow. Profiling reveals that the culprit is the function which looks up names in the context. (Yes, this is a true story, and of course you already knew that the first thing to do to try to speed up a program is to run it with a profiler!)

    In our hypothetical compiler, we implemented the context as an association list mapping names to types:

    data List a = Nil | Cons a (List a);
    data Pair a b = mkPair a b;
    
    insert : List (Pair a b) -> a -> b -> List (Pair a b)

    When programs become large, this becomes a bottleneck because name lookup is O(n). So instead of this, we decide to implement the context as a binary search tree. We would like to be certain that our new implementation is equivalent to the old one, and so we use the type system to link binary trees with the list representing their content, in order.

    data BST : (List A) -> # where
        Leaf : Tree Nil
      | Node : {A:#}->{ls,rs:List A} ->
               (l:BSTls) -> (x:A) -> (r:BST rs) ->
               (BST (app ls (Cons x rs)));

    Inserting into the tree involves ensuring that the resulting tree is indexed over a permuation of an insertion into the list. I’ll skip the details of this for the moment — more on this some other time — to note an important part of the efficiency of this operation which involves balancing the tree. However we choose to do this (e.g. red-black trees or AVL trees, or some other way), we’ll have to rotate about some node in the tree. When we do this, we need to ensure that the tree ordering is preserved, otherwise we break the property that makes searching work. So we say:

    rotateLeft : {xs:List A} -> BST xs -> BST xs;

    This is an example of a lightweight invariant. it does not guarantee the correctness of tree balancing, but does at least guarantee that the resulting tree will function in the same way as the original. A first attempt at implementing this might be:

    rotateLeft : {xs:List A} -> (BST xs) -> (BST xs);
    rotateLeft (Node bl b (Node al a ar))
        = Node (Node bl b al) a ar;
    rotateLeft t = t;

    Unfortuntaely, this isn’t going to work, even if you believe the code does the correct computation. There is a type error, because the typechecker does not automatically know the associativity properties of list append. A correct program would involve rewriting by an auxilliary lemma:

    rlLemma : (a,b:A)->(bls,als,ars:List A) ->
      ((app (app bls (Cons b als)) (Cons a ars))
       = (app bls (Cons b (app als (Cons a ars)))));
    rlLemma a b Nil als ars = refl _;
    rlLemma a b (Cons x xs) als ars
        = eq_resp_Cons (rlLemma a b xs als ars);
    
    rewrite : {A:B->#} -> A m -> m=n -> A n;
    
    rotateLeft (Node bl b (Node al a ar))
        = rewrite (Node (Node bl b al) a ar)
             (rlLemma a b bls als ars);

    This is now correct, but rather ugly, and we can see a lot of detail we don’t wish to know about. The details of the program have become lost in the details of the proof. Not only that, but how did we know what we needed to prove in the first place? Sure, we can work this out and write rlLemma by hand, but we have a perfectly good type checker that ought to be able to tell us these things!

    Instead, we write the program as follows, using a provisional pattern match clause:

    rotateLeft : {xs:List A} -> (BST xs) -> (BST xs);
    rotateLeft (Node bl b (Node al a ar))
        ?= Node (Node bl b al) a ar;  [rlLemma]

    The provisional match clause gives the desired definition, and the name of a lemma to generate (rlLemma) which will hold the proof. The typechecker then has the job of generating the correct type for the lemma.

    On running Idris on this program, we are given a command prompt (much like with hugs or ghci), from which we can ask to prove the lemma:

    Idris> :p rlLemma

    This drops us into a Coq-style interactive mode, which tells us the value we have, and the type we need:

    value : BST (app (app zs (Cons b ys)) (Cons a xs))
    --------------------------------
    H0 ? BST (app zs (Cons b (app ys (Cons a xs))))

    We can convert this into an equality proof with the “use” tactic:

    rlLemma> use value
    
    value : BST (app (app zs (Cons b ys)) (Cons a xs))
    --------------------------------
    equality ? app zs (Cons b (app ys (Cons a xs))) =
                   app (app zs (Cons b ys)) (Cons a xs)

    From here, the proof is fairly simple, and proceeds by induction on xs. When we have finished, Idris outputs the complete proof script, which can then be pasted into the program. So our code looks like this:

    rotateLeft : {xs:List A} -> (BST xs) -> (BST xs);
    rotateLeft (Node bl b (Node al a ar))
        ?= Node (Node bl b al) a ar;  [rlLemma]
    rotateLeft t = t;
    {- ... rest of program ... -}
    
    rlLemma proof {
       %intro A, xs, ar, a, ys, al, zs, bl, b, value;
       %use value; %induction zs; %refl;
       %intro x,xs',xsIH; %compute;
       %rewrite xsIH; %refl; %qed;
    };

    In this way, we can keep the details of the computation in rotateLeft, and postpone the logical part, rlLemma, until a separate part of the code.

    Here’s another complete example, the binary adder we describe in this paper, rewritten as an Idris program with provisional definitions. Note that the computational content and the proofs are entirely separated.

    There is a lot of work going on at the minute in trying to extend Haskell’s type system to allow more properties to be statically guaranteed, and this is clearly a good thing if you want to make strong correctness claims about your programs. But I think that there are limits to this kind of technology, unless you allow interactive construction of proofs and a clean separation between programming and proof construction. I might even go further and say that you need a way of plugging in decision procedures for those kinds of properties for which a decision procedure exists. We’re not quite there yet though…

    (Edit: I should add, by the way, that this is not entirely unlike Russell, in that proof obligations are generated from computational content. The main difference as I see it is that full dependent pattern matching in Idris has some influence on the you might write programs, and the way that obligations are generated. It would be fun to see what would happen if this sort of thing was supported in Haskell…)

     

    1 responses to ““Provisional” Definitions in Idris” RSS icon

    • Why is the “Dependent Types” tag in the blog twice the size of the others? An attempt to attract attention? :-)


    1 Trackbacks / Pingbacks

    Leave a reply

    You must be logged in to post a comment.