Fit 2: On Berry's Majority Function

Just to remind you all, I'm talking about

 maj :: Bool -> Bool -> Bool -> Bool
 maj    True    True    True  = True   -- 1
 maj    x       True    False = x      -- 2
 maj    False   y       True  = y      -- 3
 maj    True    False   z     = z      -- 4
 maj    False   False   False = False  -- 5

This is a perfectly respectable function definition in Haskell. It can
be given the standard naive 'proceed with the first line which yields
a matching substitution' semantics (see the thesis of F. McBride;
actually, you can see it on my web page, just under my mouse; I really
must scan it some time). The maj function is total: its patterns are
exhaustive and disjoint. However, these patterns do not arise as the
fringe of a splitting tree. Hence it is not possible to give an
implementation of maj with conventional eliminators or in Coquand's
pattern matching language (henceforth, Alf) which satisfies those five
equations definitionally on open terms.
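
(A quick sanity check, for those who like to see things run: majSpec
and checkMaj below are just names I'm making up for this message, and
GHC is assumed. The point is only that maj agrees with the obvious
boolean-majority formula at all eight inputs.)

 majSpec :: Bool -> Bool -> Bool -> Bool
 majSpec a b c = (a && b) || (b && c) || (a && c)

 checkMaj :: Bool   -- evaluates to True
 checkMaj = and [ maj a b c == majSpec a b c
                | a <- [False, True], b <- [False, True], c <- [False, True] ]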

Of course, you can still compile this program Augustsson-style into a
tree of Burstallian case expressions. Here's how. As in the previous
fit, associate problem patterns with programs, starting with

prob: maj x y z
prog: {- as above -}

Match the prob pattern to the prog patterns in turn. Seek the first
variable matched to a constructor form. If there's no such animal,
you're done: hopefully you have one program line and your matching
substitution is a bijective renaming, in which case you've got your
right-hand-side; otherwise it's an error. But if some topmost leftmost
variable is matched to a constructor, split the problem pattern with
respect to it. For each resulting subproblem, filter the program for
the patterns which /unify/ with it, and /instantiate them by the
unifier/. The effect of this is to split program lines into separate
constructor cases wherever it is necessary to build a splitting tree.
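
If it helps to see that recipe as code, here is a minimal sketch in
Haskell, flattened to depth-one patterns over Bool, which is all maj
needs. The names (Pat, Rule, Tree, refine, splitPoint, compile) are
mine for this message only, and I skip the check that the leaf
substitution is a bijective renaming.

 import Data.Maybe (listToMaybe, mapMaybe)

 -- Depth-one patterns over Bool: a variable or a constructor.
 data Pat = V String | C Bool deriving (Show, Eq)

 -- A program line: left-hand patterns and a right-hand side, which
 -- for maj is always itself a variable or a constructor.
 type Rule = ([Pat], Pat)

 -- A splitting tree: case on a column, or deliver a right-hand side.
 data Tree = Split Int Tree Tree  -- case on column i: True, False branches
           | Leaf Pat             -- the first surviving right-hand side
           | Missed               -- no line survives: a coverage failure
   deriving Show

 subst :: String -> Bool -> Pat -> Pat
 subst x b (V y) | x == y = C b
 subst _ _ p              = p

 -- Unify a line with the subproblem fixing column i to constructor b,
 -- instantiating the line (left and right) by the unifier; Nothing
 -- means the line is filtered out.
 refine :: Int -> Bool -> Rule -> Maybe Rule
 refine i b (ps, r) = case ps !! i of
   C c | c == b    -> Just (ps, r)
       | otherwise -> Nothing
   V x             -> Just (map (subst x b) ps, subst x b r)

 -- The topmost leftmost position where the problem pattern still has
 -- a variable but some program line demands a constructor.
 splitPoint :: [Pat] -> [Rule] -> Maybe Int
 splitPoint prob rules = listToMaybe
   [ i | (ps, _) <- rules, (i, (V _, C _)) <- zip [0 ..] (zip prob ps) ]

 compile :: [Pat] -> [Rule] -> Tree
 compile prob rules = case splitPoint prob rules of
     Just i  -> Split i (branch i True) (branch i False)
     Nothing -> case rules of
       (_, r) : _ -> Leaf r  -- first match wins
       []         -> Missed
   where
     branch i b = compile (set i (C b) prob) (mapMaybe (refine i b) rules)
     set i x xs = take i xs ++ x : drop (i + 1) xs

 -- The five lines of maj, transcribed:
 majProg :: [Rule]
 majProg =
   [ ([C True,  C True,  C True ], C True )  -- 1
   , ([V "x",   C True,  C False], V "x"  )  -- 2
   , ([C False, V "y",   C True ], V "y"  )  -- 3
   , ([C True,  C False, V "z"  ], V "z"  )  -- 4
   , ([C False, C False, C False], C False)  -- 5
   ]

 -- compile [V "x", V "y", V "z"] majProg builds precisely the
 -- seven-leaf splitting tree shown further down.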

For this example, we see from line 1 that we must split x. Line 2
unifies with both the True subproblem and the False subproblem, hence it
splits into two more specific versions.

prob: maj True y     z
prog: maj True True  True  = True   -- 1
      maj True True  False = True   -- 2[True/x]
      maj True False z     = z      -- 4

[Now split y, and only when y is True, split z.]

prob: maj False y     z
prog: maj False True  False = False -- 2[False/x]
      maj False y     True  = y     -- 3
      maj False False False = False -- 5

[Now split y, then split z in both cases.]

Note that the more instantiated patterns arising through the
unification do not make the program any stricter than it already
was. In a Haskell-like language, where the non-canonical value we're
scared of is bottom, this compilation strategy preserves the naive semantics
of pattern matching.

We can deploy exactly the same strategy for programs in type theory,
but at a cost. Given the original maj function as input, we can
systematically construct a term which implements this version:

 maj    True    True    True  = True   -- 1
 maj    True    True    False = True   -- 2[True/x]
 maj    True    False   z     = z      -- 4
 maj    False   True    True  = True   -- 3[True/y]
 maj    False   True    False = False  -- 2[False/x]
 maj    False   False   True  = False  -- 3[False/y]
 maj    False   False   False = False  -- 5
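
(Leaf for leaf, this is what the toy compile above delivers from
majProg.)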

This is just the splitting tree program which minimally refines the
program we started with. I'm guessing it's also the procedure which
Edwin followed when constructing the covering function for the view he
wrote.  At run time, its operational behaviour is just the same as the
original, but its behaviour on open terms is stricter: line 2, for
instance, no longer rewrites maj x True False to x when x is an open
variable, because the tree must case on x first. Here, splitting on a
non-canonical value isn't an error: we just get stuck.

So Yong is quite right to observe that there are formulations of
pattern matching which allow more control over the computational
behaviour of open terms than Coquand's formulation. This raises the
obvious question: are there any good examples where this extra
flexibility buys you something useful? I'm willing to believe there
might be, but I can't quite think of any just now.  The sorts of
examples which spring to mind don't quite have disjoint patterns (but
they're confluent anyway). The classic (which I learned from James
many years ago) is

 data Nat = Zero | Suc Nat

 plus :: Nat -> Nat -> Nat
 plus    x       Zero   = x
 plus    Zero    y      = y
 plus   (Suc x) (Suc y) = Suc (Suc (plus x y))

which satisfies both left and right absorption laws definitionally,
and is very easily shown to be commutative.
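
(Again a toy check, with conversions of my own devising: despite its
greedy third line, this plus really is addition on the small cases.)

 toNat :: Int -> Nat
 toNat 0 = Zero
 toNat n = Suc (toNat (n - 1))

 fromNat :: Nat -> Int
 fromNat Zero    = 0
 fromNat (Suc n) = 1 + fromNat n

 checkPlus :: Bool   -- evaluates to True
 checkPlus = and [ fromNat (plus (toNat i) (toNat j)) == i + j
                 | i <- [0 .. 5], j <- [0 .. 5] ]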

Let me be clear about the example I am asking for. I'm not asking why
it's convenient to *write* non-splitting tree programs: it's no big
deal to allow people to write them, then let the elaborator produce a
splitting-tree refinement which actually gets executed. I'm asking why
it's convenient to *run* non-splitting tree programs. It's not a trick
question. I'm well aware that careful management of the definitional
equality can make the difference between a clumsy program/proof and an
elegant one.

Let me also be clear about this fact: if you add the refutation
notation, coverage is no more of a problem for these maj-like examples
than it is for splitting tree programs. The above procedure is
basically a coverage checker. Let us therefore separate the questions
of how to treat non-splitting-tree definitions and whether to have
partial functions. I shall leave the latter question aside in this
message, save to reiterate that the desirability of
non-splitting-tree functions is not an argument for partial functions.
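
(In the toy sketch above, a coverage failure is precisely a Missed
leaf: a subproblem from which every program line has been filtered
away.)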

We might consider treating maj-like functions in a number of ways:
(1) forbid them;
(2) construct them by defining views (ie by constructing the refined
splitting manually);
(3) allow them, but elaborate them to splitting trees (hence
conventional eliminators), with the correspondingly strict
operational behaviour on open terms;
(4) allow them with their direct operational behaviour, effectively
replacing the individual case operator with a general scheme for
partitioning.

As Edwin pointed out, the Epigram situation is basically (2). For us,
(1) is not an option, as we can't prevent people from using any scheme
of pattern matching which they know how to justify. (3) might well be
worth doing; (4) is clearly possible, but it requires quite a bit of
extra machinery in the underlying theory, compared to 'there's your
elim operator, the rest is up to you'. But I'm open to persuasion that
the power can justify the weight.

From here, two other questions suggest themselves to me:
(a) how can we support more helpful notions of pattern matching within our
current set-up?
(b) how can we give useful treatments to equational theories which go beyond
the power of built-in computation?

Of course, James and I have been banging on about (a) for
ages. Programming with derived eliminators can be very convenient
indeed. Check out the revised version of my old unification example,
posted to this list not so long ago, here:

 http://www.mail-archive.com/epigram@durham.ac.uk/msg00206.html

The point here is that the somewhat primitive nature of eliminators in
the core theory does not prevent flexible and sophisticated
functionality being delivered to the programmer. This is because we use the
core theory as a language in which to explain high-level functionality.

Meanwhile, (b) is a serious can of worms which we should open, but
with thought and care. There are all sorts of candidates out
there. We've already been fiddling about with eta-rules, but then
you've got various other kinds of algebraic stuff (functor laws?) as
well as more ad hoc rewriting systems. Bear in mind also that the
problem isn't just about reduction and checking things are equal; more
generally, it's about unification, ie solving unknowns to make things
equal. The question 'should we allow maj-like extensions to the
definitional equality' should be considered in the wider context of
increasing the system's power to compute solutions to algebraic
problems.

All the best from Aberystwyth

Conor

