#3181: Regression in unboxing
-------------------------------------+--------------------------------------
Reporter: dolio | Owner:
Type: run-time performance bug | Status: new
Priority: normal | Component: Compiler
Version: 6.10.2 | Severity: normal
Keywords: unboxing boxing | Testcase:
Os: Linux | Architecture: x86_64 (amd64)
-------------------------------------+--------------------------------------
Greetings,
Finally having added some benchmarking to my uvector-algorithms package, I
came across a regression in performance in the heap sort implementation.
With 6.10.2, it did significantly more allocation than I'd remembered, and
looking at the core, I spotted a boxed type that I have since determined
to be the culprit. I'll attach the relevant portion of the algorithm as a
test case (I apologize that it's rather large, but I haven't yet figured
out how to make a smaller example that works).
The key lines are:
{{{
do (child' :*: ac) <- maximumChild cmp a off child len
case cmp val ac of
LT -> writeMU a (root + off) ac >> sift val child' len
_ -> writeMU a (root + off) val
}}}
The monadic (ST) pattern match against the strict pair gets compiled to a
function that accepts the arguments of the pair (as well as the ST state
parameter) like so:
{{{
$w$j_s118 :: State# RealWorld
-> Int
-> Int#
-> (# State# RealWorld, () #)
[Arity 3
$w$j_s118 =
\ (w_s108 :: State# RealWorld)
(ww_s10b :: Int)
(ww1_s10e :: Int#) ->
case <# sc3_s11t ww1_s10e of wild11_aY2 {
False ->
case writeIntArray#
@ RealWorld
marr#_aVT
(+# sc2_s11s 40)
sc3_s11t
w_s108
of s2#1_aX6 { __DEFAULT ->
(# s2#1_aX6, () #)
};
True ->
case writeIntArray#
@ RealWorld
marr#_aVT
(+# sc2_s11s 40)
ww1_s10e
w_s108
of s2#1_aX6 { __DEFAULT ->
case ww_s10b of w1_X10F { I# ww2_X11k ->
$s$wa_s11D s2#1_aX6 sc1_s11r ww2_X11k sc3_s11t
}
}
}
}}}
As can be seen, all that is done with the boxed Int is that it is taken
apart in one branch (this identifies the boxed argument as the `child'`
variable, the `Int` being returned by `maximumChild`; I verified this by
modifying the code to use a custom type:
{{{
data IP e = IP {-# UNPACK #-} !Int !e
}}}
this results in the desired unboxing behavior). Further, all calls to this
function look like:
{{{
$w$j_s118 s2#3_XZ1 (I# x_aTU) r#_aXD
}}}
So this seems to be unnecessary boxing. Further, I have reports that 6.8.2
*did* spot this unboxing opportunity, which would explain why my algorithm
wasn't performing as well as I had remembered, since the last time I'd
seriously inspected the performance was in the 6.8 series.
One theory I had was that the fact that the value was only used in one
branch of the case was causing it to look non-strict somehow, despite the
pair being strict in its fields (perhaps the pair was eliminated before
strictness analysis was done?). However, I've added bang patterns to the
`child'` match, and changed the case statement to:
{{{
case child' `seq` cmp val ac of ...
}}}
without it making a difference. So I am a bit stumped as to why exactly
GHC isn't eliminating this box.
As noted, I can get the desired unboxing with a custom unpacked type, but
fixing the regression would be preferable. Thanks for your time and help!
--
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/3181>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
_______________________________________________
Glasgow-haskell-bugs mailing list
Glasgow-haskell-bugs@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs