Formal methods: Reverse

Tim Newsham
2007-Aug-11

Introduction

This is a brief introduction to some small program proofs in Haskell and Isabelle/HOL. The material in this lesson borrow heavily from the Isabelle tutorial [Isabelle] and chapter 13 of Programming in Haskell [Hutton].

This small lesson will focus on a function for reversing lists. Here's a Haskell function for reversing a list:

    reverse :: [a] -> [a]
    reverse [] = []
    reverse (x:xs) = reverse xs ++ [x]
This is a really simple definition, but may look weird if you're unfamiliar with functional programming or Haskell. The first line is just a type description of the function and is optional (Haskell can figure this out from the definition). It says that the function reverse takes a single argument which is a list of some type (the type is called "a" here) and returns a list of the same type (also "a"). It should be fairly obvious that this is the type of function we want to write (note: in Haskell all elements in a list must have the same type so this really is the obvious type you'd want).

The next two lines give a definition and an algorithm for reversing the list. The definition is inductive and is based on the recursive definition of lists in Haskell. A list is either empty (represented as []) or has a head element and a tail list (represented as x : xs where x is the head element and xs is the tail list). The definition of reverse takes care of both of these cases. The first line says that the reverse of the empty list is the empty list. This is the simplest case and is the base case of the induction. The next line says what to do when you have a non-empty list. In this case the list is made up of two parts -- the head (called "x"), which is an element of type "a", and the tail (called "xs", pronounced as the plural of "x") which is a list of elements of type "a". In Haskell the colon (":") represents a list constructed of a head part and a tail part. The definition says to reverse the list you take the tail part and reverse it (recursively) and concatenate it (the "++" operator is list concatenation) with a list containing the head element. This recursive definition works because we have a base case that can reverse the empty list, and we always recurse with a list that is one size smaller. This guarantees that the "reverse" function will terminate for any finite input list that it operates on.

A quick test shows that this behaves as we would expect. In Haskell a string is just a list of characters so we can use reverse on it (here reverse acts as a function from [Char] -> [Char]):

    $ ghci rev.hs
       ___         ___ _
      / _ \ /\  /\/ __(_)
     / /_\// /_/ / /  | |      GHC Interactive, version 6.6.1, for Haskell 98.
    / /_\\/ __  / /___| |      http://www.haskell.org/ghc/
    \____/\/ /_/\____/|_|      Type :? for help.
    
    Loading package base ... linking ... done.
    [1 of 1] Compiling Main             ( rev.hs, interpreted )
    Ok, modules loaded: Main.
    *Main> reverse "testing 1 2 3"
    "3 2 1 gnitset"

An interesting property

Often when we're dealing with a function we may want to take advantage of certain properties of that function. For example, when performing a proof about reversing strings we may want to rely on the property that reversing a string twice gives back the same string. We'll use some techniques to test and prove that this property holds. It might seem a bit heavy handed to test and prove such an obvious property, but the same techniques can be used for other more complicated properties as well.

Haskell provides a library that helps us test properties. The library is called Test.QuickCheck [Quick] and it can automatically generate test cases and use them to verify the property. All we have to do as the programmer is tell it what the property is. Here's the property we care about:

    import Test.QuickCheck

    prop_reverev :: [Int] -> Bool
    prop_revrev xs = reverse (reverse xs) == xs
Our property is defined as a function that takes a list of integers and returns a boolean. In order to generate test cases QuickCheck needs to know what type of inputs to generate, which is why this property is written in terms of integer lists as opposed to a list of some abstract element type "a". The prop_revrev function returns true if reversing the list twice results in the same list. We would like to test that this is always the case. QuickCheck provides a function quickCheck which generates 100 random test inputs and verifies that the property always returns true. It will report any inputs that cause false to be returned instead. In our case the property passes the tests:
    *Main> quickCheck prop_revrev
    OK, passed 100 tests.

As an aside, another way of writing the property we are trying to prove is:

    prop_revrev = (reverse . reverse) == id
That is, reverse composed with itself is the identity operator. However, we cannot directly compare functions, but we can compare the results of functions:
    prop_revrev xs = (reverse . reverse) xs == id xs
which is the same as either of these:
    prop_revrev xs = (reverse . reverse) xs == xs
    prop_revrev xs = reverse (reverse xs) == xs

Have we proven that the property holds? No, we haven't! We've only shown that for some random inputs the property appears to hold. We could increase our confidence by running more and more tests. However, we will never be able to cover all cases since lists can have unbounded sizes. In fact, there is a case in Haskell when reverse . reverse is not the same as id! The property does not always hold! This case occurs when the input list is infinite. In this case the reverse function will never terminate and return an answer, while the id function will.

  *Main> id [1..]
   [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,
   28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,
   Interupted.
  *Main> (reverse . reverse) [1..]
  *** Exception: stack overflow
If we want to get a better answer, testing alone is not going to help us.

A proof with Isabelle

To prove that the property holds for all inputs (all finite inputs, at least) we will use the Isabelle/HOL proof assistant. Isabelle [Isabelle] is a proof assistant supporting several logic systems, and HOL is a logic system over total functions that is similar to the Haskell programming language. This program helps us explore and write proofs while checking our work and automating some of the steps. We could construct the same proof without the use of Isabelle (and for small proofs such as this one that might be a better idea) but the proof process may be more time consuming and more error prone.

An important detail about using HOL is that it is a logic for total functions, that is, functions that return a value for all inputs. Haskell lets you write partial functions that are not defined for all inputs or do not terminate for all inputs. These are not allowed by HOL. Such functions can lead to inconsistencies when using equational reasoning. For example the function:

    inf x = inf x + 1
does not terminate. If we manipulated the function definition as an equation using standard mathematics we could subtract inf x from both sides and arive at the nonsense statement 0 = 1. In HOL we will only consider total functions operating on finite inputs, thus sidestepping the earlier issue with reverse . reverse and infinite lists. Isabelle will let us know if any of our defintitions are not total. Its important to keep these little details in mind -- whatever we prove will be true for finite inputs.

Isabelle/HOL has its own language which we must use to specify the definitions we want to use and the properties we want to prove. It is similar in nature to Haskell's but not identical. Here is a set of definitions for the list datatype, list concatenation and the reverse function:

    datatype 'a list = Nil                        ("[]")
                     | Cons 'a "'a list"          (infixr "#" 65)
  
    consts app :: "'a list => 'a list => 'a list" (infixr "@" 65)
           rev :: "'a list => 'a list"

    primrec
    "[] @ xs = xs"
     "(x # xs) @ ys = x # (xs @ ys)"

    primrec
    "rev [] = []"
    "rev (x # xs) = (rev xs) @ (x # [])"
The first two lines define a list of any type "a" as being either the empty list (Nil, or "[]") or being a head value of type "a" followed by a tail list of "as" (which can be abbreviated as x # xs). This is similar to the definition of lists in Haskell except that "#" is used instead of ":" to represent a head element and tail of a list. The next two lines give the type definitions for two functions: app (short for append) takes two lists of "as" and returns a list of "as" while rev (reverse) takes a list of "as" and returns a list of "as". The app function can be abbreviated as xs @ ys. The next section gives a recursive definition of the app function: the empty list concatenated with xs is just xs while the list (x # xs) concatenated with ys is the list with x at the head and xs concatenated with ys at the end. Finally the last section gives the definition of rev that you should already be familiar with: reversing an empty list results in the empty list while reverse a list (x # xs) results in the reverse of xs concatenated with the list containing just x.

First attempt

The task of constructing a proof is not always straightforward and Isabelle provides great interactive features to help explore the proof. The first step in constructing the proof is to say what it is we want to prove. Next we give some directions to Isabelle to tell it how to procede. We're going to skip over some of the details of directing the proof (interested readers are encouraged to read more at [Isabelle]):
    lemma rev_rev [simp]: "rev (rev xs) = xs"
    apply(induct_tac xs)
    apply(auto)
    done
The first line states that we're going to try to prove that reversing a list twice results in the same list (and ask Isabelle to remember that for future proofs as a simplifying rule). The next line tells Isabelle to perform induction on the list "xs" and the next line tells Isabelle to continue solving as much as it can automatically after that.

When these instructions are loaded into Isabelle's ProofGeneral interface we can evaluate these lines one at a time. At each step the interface shows us what needs to be proven to complete the proof and we can use the forward and backward arrows to evaluate more rules or back out of rules that didn't work as anticipated. Here is the display after evaluating the first line:



At this point Isabelle gives us a single goal to prove, the lemma itself.



After applying the induction rule the goals change. Now we need to prove the base case (goal 1) and the inductive case (goal 2). The base case states that reversing the empty list twice results in the empty list. This goal can be proven directly from the definition of rev. The inductive case states that we can assume that the lemma is valid for "list" while we try to prove that it is also valid for a longer list "a # list". This case will be more complicated to prove.



After telling Isabelle to try to prove the rest automatically we notice that base case has been removed as a goal -- Isabelle was able to figure out that it was true by simplification. However, Isabelle wasn't able to automatically prove the inductive case. After some simplification it got stuck. The prover doesn't know how to reduce "rev (xs @ ys)" further (for example when xs is "rev list" and ys is "a # []"). We will have to back out of this proof and add a new theorem in order to progress further. Since the proof is incomplete, we'll get an error if we try to mark it as done:



Another theory

To help the theorem prover along we add another theorem:
    lemma rev_app [simp]: "rev(xs @ ys) = (rev ys) @ (rev xs)"
    apply(induct_tac xs)
    apply(auto)
    done
This theory says that when you reverse two strings that are concatenated you can distribute the reverse to the two strings provided you concatenate the result in the opposite order. For example:
    rev ("abc" @ "def") = (rev "def") @ (rev "abc")
                        = "fed" @ "cba"
                        = "fedcba"
We tell Isabelle to prove this using induction on the list xs and then to try to prove the rest automatically. As before the induction rule provides two subgoals:



and once again Isabelle is unable to complete the proof:



This time it wasn't able to prove either the base case or the inductive case. For the first goal the prover doesn't yet realize that "xs @ []" is the same as "xs". The second case may look a little puzzling. What it is saying is that it needs to prove that "(rev ys @ rev list) @ a # []" is the same as "rev ys @ (rev list @ a # [])". (Isabelle didn't show the parenthesis in the second expression because they are not needed due to the precedence rules). The reason the second goal couldn't be proven is because the prover isn't aware that "(xs @ ys) @ zs" is the same as "xs @ (ys @ zs)" (for example when zs is "a # []"). We'll need to back out of this proof and create and prove two more theorems.

Completing the proof

We define two more theories:
    lemma app_Nil2 [simp]: "xs @ [] = xs"
    apply(induct_tac xs)
    apply(auto)
    done

    lemma app_assoc [simp]: "(xs @ ys) @ zs = xs @ (ys @ zs)"
    apply(induct_tac xs)
    apply(auto)
    done
The first says that concatenating an empty list at the end of a list results in the same list. The second theory says that concatenation is associative and we can perform the operations in any order we choose. Isabelle is instructed to solve both of these by induction on the "xs" list. Unlike the previous theories, these are easy work for Isabelle and their proofs are accepted and recorded. Here's the state after app_assoc has been worked on:



With these theories proved we can continue on and complete the earlier rev_app and finally rev_rev proofs:



At this point all the theories have been proven and accepted by Isabelle and we are done.

Refining code

We've spent a lot of time working with a particular implementation of list reversal. It turns out that this implementation, while straightforward, is not very efficient in Haskell. The problem is that appending a single character at the end of a list requires building an entirely new list and our code does this for every character in the input list. Here is a faster implementation that avoids appending to the end of a list:
    revHelper :: [a] -> [a] -> [a]
    revHelper [] ys = ys
    revHelper (x:xs) ys = revHelper xs (x:ys)

    reverse2 :: [a] -> [a]
    reverse2 xs = revHelper xs []
This implementation works by using a second argument as an intermediate accumulator. As list elements are removed from the head of the input list they are added to the head of the accumulator list. When the input list is empty, the result is whatever is in the accumulator.

Does this new function reverse lists? Does it have the same properties as our earlier implementation? These questions can be answered by proving that the two functions are the same. We wish to prove that reverse2 == reverse. Written in the language of QuickCheck we want to prove the property:

    proprevrev2 :: [Int] -> Bool
    prop_revrev2 xs = reverse2 xs == reverse xs
and sure enough, quickCheck prop_revrev2 passes with flying colors.

A Proof

Not satisfied with a few random tests, lets build a stronger proof that these two functions are identical. We'll do this proof using equational reasoning directly on Haskell code rather than using Isabelle. In this proof we restrict ourselves to total functions and we will only consider finite inputs, so our reasoning will use higher order logic (HOL) and avoid inconsistencies.

The key to understanding how reverse2 and reverse are related is to note that the result of revhelper xs ys is always the same as the result of (reverse xs) ++ ys. Here's a property which states this:

    prop_revrevh :: [Int] -> [Int] -> Bool
    prop_revrevh xs ys = reverse xs ++ ys == revHelper xs ys
This is easily verified with quickCheck. A proof can be done by induction on xs, but before we get into the proof, let's look at the definition of concatenation since it will be relevant to our proof:
    (++) :: [a] -> [a] -> [a]
    [] ++ ys = ys
    (x:xs) ++ ys = x : (xs ++ ys)
This definition is identical to the definition of "@" used earlier in Isabelle.

Now lets look at a proof of prop_revrevh. First consider the case when xs is the empty list:

    revHelper [] ys = ys                             definition of revHelper
                    = [] ++ ys                       definition of concat
                    = reverse [] ++ ys               definition of reverse
showing that prop_revrevh holds when xs is []. Notice at each step we are replacing terms in the expression with equivalent terms based on the function definitions. This is valid (for total functions!) because Haskell is a pure language and has referential transparency.

Next we have to prove the inductive case where xs may be made up of a head element and a tail list which I'll write as (z:zs). We can use the inductive hypothesis that our property holds for lists shorter than xs (such as the list zs):

    revHelper (z:zs) ys = revHelper zs (z:ys)        definition of revHelper
                        = revHelper zs ([z] ++ ys)   definition of concat
                        = reverse zs ++ ([z] ++ ys)  inductive hypothesis
                        = (reverse zs ++ [z]) ++ ys  associativity of concat
                        = reverse (z:zs) ++ ys       definition of reverse
In this proof we had to make use of a property of concat that we haven't discussed yet (concatenating [x] with xs is the same as x:xs) and one that we've proved earlier in Isabelle (that concat is associative). The first property follows directly from the definition of concat. Having proved both the base case and the inductive case we've shown that prop-revrevh always holds. Proving that reverse2 and reverse are the same is now trivial:
    reverse2 xs = revHelper xs []                    definition of reverse2
                = reverse xs ++ []                   defintition of revHelper
                = reverse xs                         by app_Nil2, proved earlier
We've shown that reverse and reverse2 behave identically (for finite inputs). The two can be interchanged in equations, and the properties of reverse also hold for the function reverse2.

Isabelle proof

We could also use Isabelle to verify this proof by adding the following to our existing proof:

    consts revH :: "'a list => 'a list => 'a list"

    primrec
    "revH [] ys = ys"
    "revH (x # xs) ys = revH xs (x # ys)"

    lemma rev_revH: "ALL ys. revH xs ys = rev xs @ ys"
    apply(induct_tac xs)
    apply(auto)
    done

    lemma rev_rev2: "rev xs = revH xs []"
    apply(simp add: rev_revH)
    done
This proof shows a few variations in how Isabelle is used. First we stated the lemma rev_revH in a slightly different way. If we state the proof as "revH xs ys = rev xs @ ys" and use induction we get:



After performing simplification the proof is just about done, but Isabelle gets stuck. It doesn't realize that it can use the induction hypothesis "revH list ys = rev list @ ys" to complete the proof because "ys" is not the same thing as "a # ys". We know, however, that the lemma is valid for any "ys" (we didn't put any conditions on what "ys" is) including "a # ys". The solution is we prove the slightly more general lemma "ALL ys . revH xs ys = rev xs @ ys" and Isabelle is able to complete the proof.

Another difference from earlier proofs shows up in "rev_rev2". Here instead of telling Isabelle to complete the proof automatically we explicitely told Isabelle to perform simplification using the "rev_revH" lemma. "Auto" would not have worked directly because we didnt mark rev_revH as a "[simp]" (simplifying) rule so Isabelle doesn't automatically try to use it.

Further reading

The examples in this lesosn are based heavily on several of the following sources:

These are the sources used in this lesson: