Re: The Use and Abuse of Liskov

Luke Palmer Wed, 27 Jul 2005 04:51:05 -0700

On 7/19/05, Damian Conway <[EMAIL PROTECTED]> wrote:
> > And now maybe you see why I am so disgusted by this metric.  You see,
> > I'm thinking of a class simply as the set of all of its possible
> > instances.
> 
> There's your problem. Classes are not isomorphic to sets of instances and
> derived classes are not isomorphic to subsets.


Ahh, I understand now.  If you think that way, then there is no way to
convince you, since that is the piece of mathematics that my whole
argument is based on.  Please seriously consider this world model and
its implications (especially regarding my new thread about
superclassing).  I'll give up on the theoretical side now.

~~~~~

I've just released Class::Multimethods::Pure as an account of how
pure ordering works in practice.  As the first case study, see
Class::Multimethods::Pure itself (it bootstraps itself :-).  The
ambiguities that it pointed out to me turned out to be very important
design-wise, and I noticed that under Manhattan distance it would have
silently worked and then broken (in ambiguity) later.

Readers, do your best to follow along.  This is pretty complex, but
that's exactly what I'm arguing: that a derivation metric like
Manhattan will deceive you when things get complex.

The piece in question was the junction factoring that I described in my other
thread (I use junctions in MMD to implement type junctions).  At first
I had this model:

    Object
    |- Junction
       |- Disjunction
       |- Conjunction
       |- Injunction
    |- Constrained
       |- Subtype
    |- PackageType
    |- ...

And the multis defined as:

    multi subset (Junction, Object)   {...}
    multi subset (Object, Junction)   {...}
    multi subset (Junction, Junction) {...}

Which made recursive calls to subset on their constituent types.  The
various Junction subclasses have a "logic" method which knows how to
evaluate the junction in boolean context.  I also had:

    multi subset (Subtype, Object)  {...}
    multi subset (Object, Subtype)  {...}
    multi subset (Subtype, Subtype) {...}

Then:

    multi subset (PackageType, PackageType) {...}

Etc. for all the other non-combinatoric types, and:

    multi subset (Object, Object) { 0 }

As the fallback.  Naturally, when I called:

    subset(Disjunction.new(...),  Subtype.new(...))

I got an ambiguity.  Did you mean (Junction, Object) or (Object,
Subtype)?  Something was wrong with my design: I needed to structure
my types to tell the MMD system which one I wanted to thread first. 
This is an error that you'd expect, right?  I didn't tell the compiler
something it needed to know.
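
The ambiguity above can be reproduced with a small model of pure
ordering.  This is only a sketch in Python, not the
Class::Multimethods::Pure implementation: the PARENT table, the string
type names, and the dispatch_pure helper are all assumptions made for
illustration.  It keeps the applicable candidates that no other
applicable candidate is more specific than, and reports an ambiguity
unless exactly one remains.

```python
# Hypothetical model of the first hierarchy: each type maps to its parent.
PARENT = {
    "Object": None,
    "Junction": "Object", "Disjunction": "Junction",
    "Conjunction": "Junction", "Injunction": "Junction",
    "Constrained": "Object", "Subtype": "Constrained",
    "PackageType": "Object",
}

def isa(t, u):
    """True if type t is u or derives from u."""
    while t is not None:
        if t == u:
            return True
        t = PARENT[t]
    return False

def at_least_as_specific(a, b):
    """True if every position of signature a is a subtype of the same position of b."""
    return all(isa(x, y) for x, y in zip(a, b))

def dispatch_pure(candidates, args):
    # A candidate is applicable if the argument types fit its signature.
    live = [c for c in candidates if at_least_as_specific(args, c)]
    # Drop every candidate that some other applicable candidate is more specific than.
    best = [c for c in live
            if not any(o != c and at_least_as_specific(o, c) for o in live)]
    if len(best) != 1:
        raise TypeError(f"ambiguous or unmatched call {args}: {best}")
    return best[0]

CANDIDATES = [
    ("Junction", "Object"), ("Object", "Junction"), ("Junction", "Junction"),
    ("Subtype", "Object"), ("Object", "Subtype"), ("Subtype", "Subtype"),
    ("PackageType", "PackageType"), ("Object", "Object"),
]

try:
    dispatch_pure(CANDIDATES, ("Disjunction", "Subtype"))
except TypeError as err:
    print(err)  # names both (Junction, Object) and (Object, Subtype)
```

Neither (Junction, Object) nor (Object, Subtype) is a subtype of the
other in both positions, so the sketch refuses to choose between them.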

However, look at the applicable candidates under Manhattan distance:

    subset(Junction, Object)   #  1 + 2  =  3
    subset(Object, Subtype)    #  2 + 0  =  2
    subset(Object, Object)     #  2 + 2  =  4

The second variant, (Object, Subtype), wins with the lowest total.  Oh
goody, it worked!  Now I can go on my merry way documenting and
releasing my module.
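
The tally for this call can be checked mechanically.  The following is
a minimal Python sketch, assuming that Manhattan distance means the sum
over argument positions of the number of inheritance edges between the
argument type and the parameter type.  The PARENT table and the
distance, score, and dispatch_manhattan helpers are hypothetical names,
not part of any released module.

```python
# Hypothetical model of the first hierarchy: each type maps to its parent.
PARENT = {
    "Object": None,
    "Junction": "Object", "Disjunction": "Junction",
    "Constrained": "Object", "Subtype": "Constrained",
    "PackageType": "Object",
}

def distance(t, u):
    """Number of inheritance edges from t up to u, or None if t is not a u."""
    steps = 0
    while t is not None:
        if t == u:
            return steps
        t = PARENT[t]
        steps += 1
    return None

def score(sig, args):
    """Manhattan total for one candidate, or None if it does not apply."""
    dists = [distance(a, p) for a, p in zip(args, sig)]
    return None if None in dists else sum(dists)

def dispatch_manhattan(candidates, args):
    scored = [(score(c, args), c) for c in candidates]
    scored = [(s, c) for s, c in scored if s is not None]
    if not scored:
        raise TypeError(f"no applicable candidate for {args}")
    low = min(s for s, _ in scored)
    winners = [c for s, c in scored if s == low]
    if len(winners) != 1:
        raise TypeError(f"ambiguous call {args}: {winners}")
    return winners[0]

CANDIDATES = [("Junction", "Object"), ("Object", "Subtype"), ("Object", "Object")]
CALL = ("Disjunction", "Subtype")

for sig in CANDIDATES:
    print(sig, score(sig, CALL))
print("winner:", dispatch_manhattan(CANDIDATES, CALL))
```

Under that assumption the totals are 3, 2, and 4, and (Object, Subtype)
is selected without any complaint.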

Now Mr. Joe Schmoe comes along and decides that he wants to write a
new subtype type -- one that accepts his new statically-analyzable
subtyping language or something.  He decides to reuse code and derive
from the existing Subtype type.  The new type hierarchy follows:

    Object
    |- Junction
       |- Disjunction
       |- Conjunction
       |- Injunction
    |- Constrained
       |- Subtype
          |- MagicSubtype    # the new type
    |- PackageType
    |- ...

Now look at what happens for subset(Disjunction.new(...),
MagicSubtype.new(...)):

    subset(Junction, Object)     # 1 + 2 = 3
    subset(Object, Subtype)      # 2 + 1 = 3
    subset(Object, Object)       # 2 + 2 = 4

Oh no!  An ambiguity!  What the hell, Joe's just trying to extend
Subtype a little, and now he has to write a specialized MMD variant
just for that, which delegates *exactly* to the (Object, Subtype)
variant.

I'll also point out that if you remove the Constrained intermediate
type, which I did (!), you also end up in ambiguity for the call
subset(Disjunction.new(...), Subtype.new(...)).
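
The effect of removing Constrained can be sketched in Python as well.
This covers only the flattened-hierarchy case from the paragraph above,
and it assumes distance is counted in inheritance edges.  The PARENT
table and the lowest_scoring helper are hypothetical names introduced
for the example.

```python
# Hypothetical flattened hierarchy: Subtype now derives directly from Object.
PARENT = {
    "Object": None,
    "Junction": "Object", "Disjunction": "Junction",
    "Subtype": "Object",
}

def distance(t, u):
    """Number of inheritance edges from t up to u, or None if t is not a u."""
    steps = 0
    while t is not None:
        if t == u:
            return steps
        t = PARENT[t]
        steps += 1
    return None

def lowest_scoring(candidates, args):
    """Return (lowest total, sorted candidates tied at that total)."""
    totals = {}
    for sig in candidates:
        dists = [distance(a, p) for a, p in zip(args, sig)]
        if None not in dists:          # skip candidates that do not apply
            totals[sig] = sum(dists)
    low = min(totals.values())
    return low, sorted(s for s, t in totals.items() if t == low)

CANDIDATES = [("Junction", "Object"), ("Object", "Subtype"), ("Object", "Object")]
low, tied = lowest_scoring(CANDIDATES, ("Disjunction", "Subtype"))
print(low, tied)
```

With Subtype one step from Object, (Junction, Object) scores 1 + 1 and
(Object, Subtype) scores 2 + 0, so the two tie at 2 and a Manhattan
dispatcher has no single winner to pick.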

And that's it.  Two innocent changes, and a working program breaks
into ambiguity errors.  And the person who sees the ambiguity errors
is not the person who wrote -- or even touched -- the multimethods. 
Keep in mind: these multimethods could be for internal use, so the
extender may not even know they exist.

Using pure ordering, we saw the ambiguity early and were forced to
think about the design and come up with one that passed the tests. 
When I did that, I was able to factor things to avoid duplication and
needless disambiguating variants[1].  It is impossible to break the
new factoring by simply deriving from any class.  You would have to
add a new generic, like Junction, to the top in order to break
existing code.  Manhattan distance suffers from the same problem.  See
the supertyping thread for a solution :-)

What I'm seeing after this case study, and what I suspected all
along, is that Manhattan MMD is to pure ordering as mixins are to roles. 
Roles don't provide any extra semantics over mixins: they add only
errors.  But those errors are very important to large-scale
development.  They help you catch things that only extenders of your
module would normally find out.

Luke

[1]  My biggest fear is that the average user won't be able to come up
with such a factoring.  However, I assume that there are relatively
few techniques that you need to know in order to keep things safe, and
these are things that you ought to be doing under a Manhattan metric
anyway, as I have just demonstrated.  The early error gets you to ask
your local mailing list instead of publishing your fragile module.
