Thu, 24 Feb 2000 14:17:43 +0300 (MSK), S.D.Mechveliani <[EMAIL PROTECTED]> writes:

> > That is, f receives a dictionary of Eq methods on the type a, as
> > specified in its type. It builds a dictionary of Eq methods on the
> > type Maybe a itself, but the fact that it uses instance Eq (Maybe a)
> > is not visible outside.
> 
> No. Probably, here is how it should be.
> Seeing                               `Just x1 == Just x2'
> the compiler *silently* extends the context for  f:

It would mean that the type no longer determines the way a function
is physically used.

Well, compilers already depend on function bodies today (strictness
analysis, inlining), but only for optimization: it does not change
the semantics and is skipped where it is impossible (e.g. for higher
order functions).

Compilers can't be _required_ to make the usage of a function depend
on its definition (not on its type alone), because it is sometimes
impossible.

Surprisingly, separate compilation is not what makes this impossible.
An imported module must be compiled before an importing module can
use it, and a compiler puts derived extra information about compiled
functions in the interface file. Such inter-module optimizations are
mostly impossible in languages where the programmer writes the
interface file himself, as in C.

The other issue is efficiency, if you want f to behave the same way
even when the instance Eq (Maybe String) is not visible at f's
definition. It would mean that every overloaded function must be
extended to directly receive all the dictionaries it needs. This can
require not just very many dictionaries but, with polymorphic
recursion, even infinitely many. I cannot imagine a sensible
implementation of the function:
    h :: Eq a => a -> a -> Int
    h x y = if x == y then 0 else 1 + h [x] [y]
which would make h "1" "2" return 7 if in some other module there is:
    instance Eq [[[[[[[String]]]]]]] where
        x == y = length x == length y
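Here is a minimal sketch of the dictionary-passing translation I have
in mind, with EqDict as a hypothetical stand-in for the compiler's
internal Eq dictionary and hD for the compiled h:

    -- Hypothetical representation of an Eq dictionary:
    data EqDict a = EqDict { eq :: a -> a -> Bool }

    -- The generic list instance, as a dictionary constructor:
    eqList :: EqDict a -> EqDict [a]
    eqList d = EqDict $ \xs ys ->
        length xs == length ys && and (zipWith (eq d) xs ys)

    -- The compiled h: each recursive call needs an Eq dictionary for
    -- a type one list-level deeper, so dictionaries are constructed
    -- at runtime, without bound.
    hD :: EqDict a -> a -> a -> Int
    hD d x y = if eq d x y then 0 else 1 + hD (eqList d) [x] [y]

Since hD can only ever build the deeper dictionaries with eqList, the
special instance at seven levels of nesting could take effect only if
every constructed dictionary were somehow checked against it at
runtime.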

> Let us put some concrete example. Suppose the module  G  contains
>   g :: Eq a => ([a] -> [a]) -> [[a]] -> Int
>   g            h               (xs:_) =
>                                     let ... <something complex>
>                                         l = length $ h xs
>                                                      ----
>                                         ys = ...<something complex>
>                                     in  l+(length ys)
> g  is compiled straightforwardly.

So the caller passes just the Eq a dictionary to f to obtain the
argument h.

> Suppose other module  F  contains  f  and the  Eq  instances
> discussed above.
> And the third module  E  imports  F, G  and exports the 
> expressions                      g1 = g f [[1,2],[3]]
>                                  g2 = g f [[Just "ab",Just "c"], []]
> So,  g1  leads to  f [1,2],
> g2  leads to       f [Just "ab",Just "c"]

Exactly.
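
To make that concrete, here is a hedged sketch of how E's definitions
elaborate, reusing the hypothetical EqDict from above (fD and gD
stand for the compiled f and g):

    -- Dictionaries E can build for the element types that occur:
    dEqInt :: EqDict Int
    dEqInt = EqDict (==)

    dEqString :: EqDict String
    dEqString = EqDict (==)

    -- The generic "instance Eq (Maybe a)" as a dictionary constructor:
    eqMaybe :: EqDict a -> EqDict (Maybe a)
    eqMaybe d = EqDict cmp
      where cmp (Just x) (Just y) = eq d x y
            cmp Nothing  Nothing  = True
            cmp _        _        = False

    -- g1 = g f [[1,2],[3]]                becomes, roughly:
    -- g1 = gD dEqInt (fD dEqInt) [[1,2],[3]]
    -- g2 = g f [[Just "ab",Just "c"], []] becomes, roughly:
    -- g2 = gD (eqMaybe dEqString) (fD (eqMaybe dEqString))
    --         [[Just "ab",Just "c"], []]

The choice between the generic eqMaybe dEqString and a dictionary for
F's specific Eq (Maybe String) instance is made here, in E, once and
for all; the compiled f never revisits it.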

> - this can be discovered by scanning all the function bodies in the
> modules F,G,E.

Ugh. It requires a kind of global analysis that I believe is
impossible to implement (efficiently or at all).

What if g itself is passed to another higher order function, which
in turn applies it to f? Or not to f, but to a lambda which uses f
in a complicated way and happens to sometimes apply f to the lambda's
own argument? The possibilities are extremely wide.
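
A tiny self-contained illustration (hypothetical code) of the kind of
indirection I mean:

    -- g is not applied to f directly but through a wrapper; whether
    -- the wrapped f' ever runs, and on which lists, depends on
    -- runtime data.
    twist :: (([a] -> [a]) -> [[a]] -> Int)
          -> ([a] -> [a]) -> [[a]] -> Int
    twist g' f' = g' (\xs -> if even (length xs)
                               then f' (reverse xs)
                               else xs)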

What if all this goes through extensions that make it even harder to
trace what is applied to what: a global variable (made using
unsafePerformIO) or a foreign function written in C?
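
For reference, the global-variable trick looks like this (a sketch
with today's module names):

    import Data.IORef (IORef, newIORef, readIORef, writeIORef)
    import System.IO.Unsafe (unsafePerformIO)

    -- A function smuggled through a mutable cell: no static analysis
    -- of call sites can tell what will eventually be applied to what.
    {-# NOINLINE cell #-}
    cell :: IORef ([Int] -> [Int])
    cell = unsafePerformIO (newIORef id)

    stash :: ([Int] -> [Int]) -> IO ()
    stash = writeIORef cell

    applyStashed :: [Int] -> IO [Int]
    applyStashed xs = do f' <- readIORef cell
                         return (f' xs)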

Sorry, you can't statically check which function can ever receive
which arguments, even if false positives are permitted. Values can be
built at runtime, and so can functions and instances.

The point of the type system is to extract the information that can
be statically determined. When a type no longer describes exactly how
a value can be used, more things must be determined at runtime.

> I think so far of the following alternatives, listed
> "preferable first".
> 
> (1) Extend the interface format for each exported function with 
>     the "skeleton".

It would not help even if the complete definition of g were put in
the interface. IMHO higher order functions are enough to make the
problem undecidable even within a single module.

> (2) Run-time resolution.
> Provide  f  in  g1,g2  with both dictionary values for  eqMb  
> and choose the one for the most specific instance at run-time.
> Is this hard to implement?

I don't know, but it would mean that all parts of the program, even
those that will never be called at types that depend on overlapping
instances, must use runtime instance resolution. I certainly don't
want to pay for overlapping instances when I don't use them. Haskell
is already not too fast, because of laziness among other things.
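
To make the cost tangible, here is a hedged sketch of what per-use
runtime resolution could amount to (written with today's
Data.Typeable; eqSpecial is an arbitrary stand-in for the body of the
overlapping instance):

    import Data.Typeable (Typeable, cast)

    -- Every (==) on Maybe values first checks, at runtime, whether
    -- the more specific Eq (Maybe String) instance applies.
    eqMaybeDyn :: Typeable a
               => (a -> a -> Bool) -> Maybe a -> Maybe a -> Bool
    eqMaybeDyn eqElem mx my =
        case (cast (mx, my) :: Maybe (Maybe String, Maybe String)) of
          Just (sx, sy) -> eqSpecial sx sy   -- specific instance wins
          Nothing       -> case (mx, my) of  -- generic instance
            (Just x,  Just y)  -> eqElem x y
            (Nothing, Nothing) -> True
            _                  -> False
      where
        eqSpecial :: Maybe String -> Maybe String -> Bool
        eqSpecial sx sy = fmap length sx == fmap length sy

The cast test runs on every single comparison, even in programs that
never mention overlapping instances.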

It's not too bad when a compiler can optimize the common and easy
cases, but every such thing makes the compiler yet more complicated,
harder to maintain, more error-prone, etc.

I'm not a compiler implementor, but I guess it would be practically
impossible not to impose a significant penalty on functions that
don't use overlapping instances, if overlapping instances were to
work sanely and still allow "f :: Eq a => ..." with a definition as
before.

> (3) Un-separate partially the compilation.
> In the case like considered, the compiler reports an error and says
>   "module E ... compiling  E.g2 ...
>    Need to see the body of  G.g  to resolve the overlap for the 
>    argument `f'.
>    Re-compile  G  with such and such flag for `g'.
>   "

Again it does not help for higher order functions in general.

Moreover, it makes currently legal programs illegal (programs that I
believe should remain legal without special treatment, provided that
I don't care about overlapping instances).

> For, to my mind, separate compilation is not a sacred cow.
> It is only a technical feature, not a theoretical principle.

I don't agree. Some people would even say that the current system,
where an imported module must be compiled before an importing one,
is too limited a version of separate compilation; but I accept this
principle, as it allows inter-module optimizations. I cannot see how
it could be relaxed further.

The problem is hard enough even without separate compilation.

> > Unless one accepts that subtle differences in contexts, ones
> > depending on the implementation rather than the interface, change
> > the meaning. And that definition like "g :: some signature; g = f"
> > can cause g to be defined on exactly the same types as f, but with
> > a different meaning. And that the same polymorphic function used on
> > the same type has different meanings in various places.
> 
> First, could you provide an example?
> Maybe, the one constructed from above `f' ...

It depends on what rules we have. For example, if f were required
to have the type signature:
    f :: (Eq a, Eq (Maybe a)) => [a] -> [a]
to see the more specific instances, and nevertheless:
    g :: Eq a => [a] -> [a]
    g = f
were allowed, then this is an example. It's bad because an error is
easier to spot when it is caught at compile time ("this is not
allowed") than when it merely changes behavior (the code is accepted,
but it could be resolved in more than one way and the compiler has
silently picked one).
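
For concreteness: modern GHC's (deliberately off-by-default)
incoherent instances can already produce this effect. A small sketch,
not the hypothetical rules above:

    {-# LANGUAGE FlexibleInstances, IncoherentInstances #-}

    class Describe a where
      describe :: a -> String

    instance Describe a      where describe _ = "generic"
    instance Describe String where describe _ = "specific"

    f' :: Describe a => a -> String
    f' = describe      -- the instance is chosen at each call site

    g' :: a -> String  -- no context: the instance is chosen here, once
    g' = describe

    -- f' "x" yields "specific" while g' "x" yields "generic": the
    -- same polymorphic function, at the same type, with different
    -- meanings.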

> Second, the people in this mail list often say "different meaning"
> when speaking on the overlapping instances, and generally, on
> different *compilations*.
> To my mind, the overlapping instances is only the instrument to
> provide the alternative evaluation ways for a thing - preserving
> the *meaning*.

The compiler can't determine whether various user-defined functions
are equivalent. From its point of view they are different, and it
really does matter which one is chosen. IMHO it would be too
dangerous to blindly assume that, whenever something can be
implemented in several different ways, the programmer really made all
the versions give the same result and does not care which one is
chosen where. It is too easy to make a subtle error, thinking that
there is no ambiguity and that the compiler just has to pick the
version I have in mind.

Unless the place is clearly marked as one where the programmer is
responsible for ensuring the equivalence, as in GHC's {-# RULES #-}.
Or where he will happily accept every possible variant, as in
concurrency, which is non-deterministic in spirit: nobody should
complain that the order of IO actions differed from what he wanted
if he did not ensure it himself.

There is a delicate issue of defaulting of numeric types: it's not
clear how far it should go, the concept is a bit controversial, and
ghc -Wall warns when it fires. It's unfortunate that we don't know a
cleaner and still convenient solution.
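
For example (today's ghc -Wall warns at the marked spot):

    -- The constraints on the literal are only (Num a, Show a), which
    -- pin nothing down; Haskell silently defaults a to Integer here.
    main :: IO ()
    main = print (2 ^ 64)   -- defaulting fires; -Wall warns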

The less guessing and automatic ambiguity resolution by the compiler
is needed, the better. Fortunately Haskell generally does not leave
much unspecified: for example, the order of evaluation of function
arguments really does not matter, there are no implicit type
conversions, overloading is systematic, rules in the style of "it
applies every time _unless_ something is true" are avoided, depending
on the absence of something is avoided, and conflicts lead to compile
errors (e.g. conflicting imports or repeated instances).
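
For instance, importing two functions with the same name is an error
only at the point of ambiguous use; a sketch with today's module
names, disambiguating through a qualified import:

    import Data.List (insert)
    import qualified Data.Map as Map   -- Map.insert would clash

    xs :: [Int]
    xs = insert 3 [1,2,4]              -- unambiguously Data.List's

    m :: Map.Map String Int
    m = Map.insert "k" 1 Map.empty
    -- Importing Data.Map unqualified as well and then writing plain
    -- `insert' would be rejected with "Ambiguous occurrence".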

-- 
 __("<    Marcin Kowalczyk * [EMAIL PROTECTED] http://qrczak.ids.net.pl/
 \__/              GCS/M d- s+:-- a22 C+++$ UL++>++++$ P+++ L++>++++$ E-
  ^^                  W++ N+++ o? K? w(---) O? M- V? PS-- PE++ Y? PGP+ t
QRCZAK                  5? X- R tv-- b+>++ DI D- G+ e>++++ h! r--%>++ y-
