See Note [ClassOp/DFun selection] in TcInstDcls, which I reproduce below. Does
that help?
Simon
Note [ClassOp/DFun selection]
One thing we see a lot is stuff like
op2 (df d1 d2)
where 'op2' is a ClassOp and 'df' is DFun. Now, we could inline *both*
'op2' and 'df' to get
case (MkD ($cop1 d1 d2) ($cop2 d1 d2) ... of
MkD _ op2 _ _ _ -> op2
And that will reduce to ($cop2 d1 d2) which is what we wanted.
But it's tricky to make this work in practice, because it requires us to
inline both 'op2' and 'df'. But neither is keen to inline without having
seen the other's result; and it's very easy to get code bloat (from the
big intermediate) if you inline a bit too much.
Instead we use a cunning trick.
* We arrange that 'df' and 'op2' NEVER inline.
* We arrange that 'df' is ALWAYS defined in the sylised form
df d1 d2 = MkD ($cop1 d1 d2) ($cop2 d1 d2) ...
* We give 'df' a magical unfolding (DFunUnfolding [$cop1, $cop2, ..])
that lists its methods.
* We make CoreUnfold.exprIsConApp_maybe spot a DFunUnfolding and return
a suitable constructor application -- inlining df "on the fly" as it
were.
* We give the ClassOp 'op2' a BuiltinRule that extracts the right piece
iff its argument satisfies exprIsConApp_maybe. This is done in
MkId mkDictSelId
* We make 'df' CONLIKE, so that shared uses stil match; eg
let d = df d1 d2
in ...(op2 d)...(op1 d)...
From: [email protected] [mailto:[email protected]] On
Behalf Of Thomas Schilling
Sent: 15 November 2012 18:01
To: [email protected]
Subject: Optimisation of type class methods / dictionaries
Hi,
I was investigating how GHC optimises dictionaries. So, I'm using the
following variant of the Eq type class:
class XEq a where
xeq, xne :: a -> a -> Bool
xne a b = not (a `xeq` b) }
instance XEq Bool where
xeq True True = True
xeq False False = True
xeq _ _ = False
instance XEq a => XEq [a] where
xeq [] [] = True
xeq (x:xs) (y:ys) = xeq x y && xeq xs ys
xeq _ _ = False
I then compile this with:
ghc -c -fforce-recomp -O2 -ddump-simpl -ddump-types <filename.hs>
I now have two questions regarding the output of the compiler.
Question 1: Why are the instance method selectors marked as NOINLINE?
Main.xeq [InlPrag=[NEVER]]
:: forall a_aeV. Main.XEq a_aeV => a_aeV -> a_aeV -> GHC.Types.Bool
Question 2: The optimised code for the list implementation looks as follows:
Main.$fXEq[]_$cxeq [Occ=LoopBreaker]
:: forall a_an3.
Main.XEq a_an3 =>
[a_an3] -> [a_an3] -> GHC.Types.Bool
[GblId, Arity=3, Caf=NoCafRefs, Str=DmdType LSS]
Main.$fXEq[]_$cxeq =
\ (@ a_an3)
($dXEq_auY :: Main.XEq a_an3)
(ds_dw8 :: [a_an3])
(ds1_dw9 :: [a_an3]) ->
case ds_dw8 of _ {
[] ->
case ds1_dw9 of _ {
[] -> GHC.Types.True;
: ipv_swB ipv1_swC -> GHC.Types.False
};
: x_an4 xs_an5 ->
case ds1_dw9 of _ {
[] -> GHC.Types.False;
: y_an6 ys_an7 ->
--- AAA
case Main.xeq @ a_an3 $dXEq_auY x_an4 y_an6 of _ {
GHC.Types.False -> GHC.Types.False;
GHC.Types.True ->
--- BBB
Main.$fXEq[]_$cxeq @ a_an3 $dXEq_auY xs_an5 ys_an7
}
}
}
If I turn off optimisation there are two calls to "xeq" which correspond
exactly to the two calls from the implementation to either side of "&&". In
the optimised version, the second "xeq" has been inlined (marker BBB), but not
the first one (marker AAA). This is despite "xeq" being marked NOINLINE, by
the way. I would expect the code after AAA to look something like:
case $dXEq_auY of _ {
Main.D:XEq tpl_B2 _ ->
case tpl_B2 x_an4 y_an6 of _ {
... ->
Adding -fdicts-strict to the command line also doesn't seem to convince the
compiler to apply the worker/wrapper transformation and create a loop that just
passes around the implementation of "xeq" instead of the whole dictionary.
Does anyone know what the issue is here?
Thanks,
/ Thomas
_______________________________________________
Cvs-ghc mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/cvs-ghc