overlapping instances

S.D.Mechveliani Thu, 24 Feb 2000 03:29:16 -0800
I thank  Marcin 'Qrczak' Kowalczyk  <[EMAIL PROTECTED]>
for the helpful example on the subject of overlapping instances.
The example was against the overlapping instances.
And the whole discussion aims to investigate to what extent the 
overlaps can be treated consistently in Haskell.
Sorry, if my suggestions occur rubbish; probably, the people would
correct or criticize it.
You see, I am not an implementor; once only dealt with the 
implementation of certain simple *typeless* functional language.

And this letter contains the question for the implementors. It 
concerns the example by Marcin when  f x  needs overlapping 
instances for the argument  x  and  f  is passed as the argument to 
some polymorphic high-order function  g.

Marcin writes on Feb 6 2000:

> f:: Eq a => [a] -> [a]
> f (x1:x2:xs) = if Just x1 == Just x2 then xs else []
> [..]
> -- Translation of class Eq:
> data Eq a = Eq {(==), (/=) :: a -> a -> Bool}
> [..]
> -- Translation of instance Eq a => Eq (Maybe a):
> eqMaybe:: Eq a -> Eq (Maybe a)
> eqMaybe eqOnArg = Eq {eq, (not .) . eq}
>    where
>    eq Nothing  Nothing  = True
>    eq (Just a) (Just b) = (==) eqOnArg a b
>    eq _        _        = False

All right.

> -- Translation of f :: Eq a => [a] -> [a]:
> f:: Eq a -> [a] -> [a]
> f eq (x1:x2:xs) = if (==) (eqMaybe eq) x1 x2 then xs else []
>
> That is, f receives a dictionary of Eq methods on the type a, as
> specified in its type. It builds a dictionary of Eq methods on the
> type Maybe a itself, but the fact that it uses instance Eq (Maybe a)
> is not visible outside.

No. Probably, here how it should be.
Seeing                               `Just x1 == Just x2'  
the compiler extends *silently* the context for  f:

  -- Translation of f :: Eq a => [a] -> [a]:
  --
  --> f :: (Eq a, Eq (Maybe a)) => [a] -> [a]
  --
  f :: Eq a -> Eq (Maybe a) -> [a] -> [a]
  f    eq      eqMb            (x1:x2:xs) = 
                             if  (==) eqMb x1 x2  then  xs  else  []

1. The compiler does *not* presume here that  eqMb  is derived
standardly from  eq.  Here  eq  and  eqMb  are independent.
And  eq  is not used at all - due to the choice of example.

2. To my mind, the *user should not* add the above context 
                                                       Eq (Maybe a).
`Just x1 == Just x2'  is sufficient for the compiler to rebuild this
context automatically.

>    but the fact that it uses instance Eq (Maybe a)
> is not visible outside.

And after this, the fact that  f  uses  instance Eq (Maybe a)
becomes visible outside.


> Now, what happens with overlapping instances, when the user defines:
>
> instance Eq (Maybe String) where
>    Nothing == Nothing = True
>    Just a  == Just b  = length a == length b
>    _       == _       = False

For the uniformity, let us provide the translation:
                       eqMaybeString :: Eq (Maybe String)
                       eqMaybeString = Eq {eq, (not .) . eq}
                         where
                         eq Nothing  Nothing  = True 
                         eq (Just a) (Just b) = length a == length b
                         eq _        _        = False
Does this look reasonable?

> and uses f ["foo","bar","baz"]? f has the implementation like above
> and has only one (polymorphic) implementation, so it produces the
> dictionary of Eq (Maybe String) basing on the general eqMaybe and
> equality on strings (synthesized from the equality on characters and
> lists), ignoring the more specific instance.

No. 
The philosophy should be:
--------------
seeing in the program     f ["foo","bar","baz"]
the compiler judges that  f  applies to certain  xs :: [String].
According to the compiled type of  f,  
the instances                          Eq String,  Eq (Maybe String)
are required. The instance  Eq String  is standard and unique.
Then, the compiler chooses the most special definition for the  Eq
instance for  Maybe  among those which are in the scope and which 
match the type  Maybe String.
So, if the second  Eq  instance for  Maybe  occurs in the scope, 
then, clearly, it is applied. 
And how the scope forms?
In Haskell-98, any mentioning of a module name <M> under `import' 
brings to the scope all the instances defined in <M>.
Is this how the existing implementations treat the overlaps?

Technique:
----------
the compiler has to solve, what dictionary value to substitute for
Eq (Maybe a)  declared by the compiled  f.
According to `philosophy', the choice may be between the above
dictionaries values from 
                               eqMaybe       :: Eq a -> Eq (Maybe a)
or                             eqMaybeString :: Eq (Maybe String).
The second is chosen as the most special.
What does this mean "chosen" ?
The  *compiled  f  has two extra arguments* : 
for the Eq class dictionary values for `a' (corresponds to the `eq' 
local variable) and for `Maybe a'  (`eqMb'). 
Now,  
              f ["foo","bar","baz"]
compiles to   f_compiled eqChar eqMaybeString ["foo","bar","baz"].

Hence, ignoring so far various optimization possibilities, we see 
that  f  is compiled *once*. But each its application requires the 
appropriate dictionary value for the additional argument.
The overlapping instances cause a non-trivial choice for this 
additional argument - which is resolved quite naturally.
Is this how  GHC, Hugs  process the overlaps?


> It's not enough for the compiler to just remember after compiling
> f that it needs an explicit dictionary of Eq (Maybe a) passed, and
> generate different calls to f than to other functions of the same type,
> because a function like
>    g :: Eq a => ([a] -> [a]) -> [[a]] -> Int
> does not know what kind of f will receive (the one that depends
> on Eq (Maybe a), or possibly Eq (a,Bool) etc.) - it is determined
> at runtime.

Here, I am about stuck.
Let us put some concrete example. Suppose the module  G  contains
  g :: Eq a => ([a] -> [a]) -> [[a]] -> Int
  g            h               (xs:_) =
                                    let ... <something complex>
                                        l = length $ h xs
                                                     ----
                                        ys = ...<something complex>
                                    in  l+(length ys)
g  is compiled straightforward.
Suppose other module  F  contains  f  and the  Eq  instances
discussed above.
And the third module  E  imports  F, G  and exports the 
expressions                      g1 = g f [[1,2],[3]]
                                 g2 = g f [[Just "ab",Just "c"], []]
So,  g1  leads to  f [1,2],
g2  leads to       f [Just "ab",Just "c"]  
- this can be discovered by scanning all the function bodies in the
modules F,G,E.
The separate compilation leads to the following question.
g  is large and not in-lined. Hence, compiling the module  E,
only the type              g :: Eq a => ([a] -> [a]) -> [[a]] -> Int
is visible for  g.
Looking at this type, and at         g f [[Just "ab",Just "c"], []],
how the compiler solves which dictionary to pass to  f ?

The type of  f_compiled  is in the scope:

    f :: (Eq a, Eq (Maybe a)) => [a] -> [a],

    f_compiled eq eqMb xs =  <something not visible from E>
For 
    g f [[Just "ab",Just "c"], []],
which value of  eqMb  to give to  f_compiled ?

The question to the implementors: 
---------------------------------
indeed, how
         ghc -c -fglasgow-exts -optC-fallow-overlapping-instances 
                -optC-fallow-undecidable-instances 
             E.hs
solves this?
I tried to compile all this and observed for  E:
  __interface ... 407 where
  __export ...;
  import F 1 ! :: f 1 zdfEqMaybe 1;
         -------------------------
  ...
What does this mean?

I think so far of the following alternatives, listed
"preferable first".

(1) Extend the interface format for each exported function with 
    the "skeleton".
Thus, for  G.g,  the type skeleton  g_skel  of `g' has to go to the
module interface - in addition to the type description of `g'.
To compile a value needs knowing the type and skeleton of a value.
g_skel  is the body of  g  reduced so that 
  * g_skel  is desirably small 
  * for the argument variables `f',x,y...  of  g
    and for any usage of  g  in other modules
    the overlapping instances for  f,x,y...  can be resolved only
    from  g_skel.
g_skel  has the same type as  g,  but may define different map.
Example.
For                 g f xs = let  x:y:z:u:vs = xs
                                  xs'        = sortBy f [x,y,z,u]
                             in   xs'++(reverse vs)
may the skeleton be 
                    g_skel f [x] = case  f x x  of  LT -> [x]
                                                    _  -> []
?
That is, if  g  is applied in any way in other module, would it 
suffice to observe this  g_skel  to resolve the appeared overlaps 
for the argument values of  f, xs  in  g ?  I wonder.

(2) Run-time resolution.
Provide  f  in  g1,g2  with both dictionary values for  eqMb  
and choose the one of the most special instance at the run-time.
It this hard to implement?

(3) Un-separate partially the compilation.
In the case like considered, the compiler reports an error and says
  "module E ... compiling  E.g2 ...
   Need to see the body of  G.g  to resolve the overlap for the 
   argument `f'.
   Re-compile  G  with such and such flag for `g'.
  "
Then, the second compilation of  E  will find how  f  is used in
g  and choose the appropriate Eq instance in  f.

For, to my mind, the separate compilation is not a sacred cow.
It is only a technical feature, something that is not theoretically
principal.
Imagine you are reading some computational task solution in some 
book, in Volum 2.
And its discourse refers to some Sections of Volum 1. In principle, 
you may hope to derive the needed referred information only from the 
short *abstracts* in the head of each Section of Volum 1 
(suppose, they are provided).
This corresponds to separate compilation. And reading and 
understanding the method in Vol.2 corresponds to compiling of a 
program module.
To understand Vol.2 it may suffice to see the referred abstract in 
Vol.1. 
This is the level 1 modularity.
The level 2 modularity means reading only the definitions and 
formulations of the theorems in Vol.1.
But it is often needed to read some part of the "body" 
(theorem proofs and various comments) of  Vol.1  to understand how 
the task is solved in  Vol.2
(or, do you say such a book is badly written?).
Therefore the compiles have, probably, to prepare in future to relax 
the modularity, depending on the situation.
?


> Unless one accepts that subtle differences in contexts, ones
> depending on the implementation rather than the interface, change
> the meaning. And that definition like "g :: some signature; g = f"
> can cause g to be defined on exactly the same types as f, but with
> a different meaning. And that the same polymorphic function used on
> the same type has different meanings in various places.


First, could you provide an example?
Maybe, the one constructed from above `f' ...

Second, the people in this mail list often say "different meaning"
when speaking on the overlapping instances, and generally, on 
different *compilations*.
To my mind, the overlapping instances is only the instrument to 
provide the alternative evaluation ways for a thing - 
preserving the *meaning*.
The meaning in the functional programming is the result value, not
the way to obtain it.
Haskell allows to write millions different programs having the same
meaning.
Therefore, it is more correct to say "different evaluation (way)"
or                                   "different compilation"
than "different meaning".
I wonder what is the Haskell Report position in respect to this bit
of terminology.

------------------
Sergey Mechveliani
[EMAIL PROTECTED]
overlapping instances

Reply via email to