I've looked further into this.

The bad news is that the edge/block ratio is likely not a useful indicator for compile time.

The good news is that I spent the last day and a half profiling and optimizing the code and found the major reason why performance degraded so quickly. There was a pattern with O(blocks * branches) complexity, which should now be O(blocks * log branches) with a negligible increase in memory usage.
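
Schematically, the kind of change involved (a minimal sketch, not the actual patch; I'm assuming the hot pattern was a per-block linear scan over the branch edges, and the names here are invented for illustration):

    import qualified Data.Map.Strict as M

    -- Hypothetical illustration: looking up the weight of a successor
    -- edge by scanning an association list is O(branches) per query,
    -- so doing it once per block gives O(blocks * branches) overall.
    lookupWeightList :: Eq b => b -> [(b, Int)] -> Maybe Int
    lookupWeightList = lookup

    -- Keeping the edges in a Map instead makes each query
    -- O(log branches), i.e. O(blocks * log branches) overall, at the
    -- cost of the Map's tree structure (hence the small memory increase).
    lookupWeightMap :: Ord b => b -> M.Map b Int -> Maybe Int
    lookupWeightMap = M.lookup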

The worst case I know of (a function with 1500 guards compiled with -O0) is now only about 3% slower than with the old algorithm, which seems reasonable.

I will check how this affects compile time for regular code and rerun benchmarks to make sure I haven't introduced regressions. But if it holds up we should at least enable it at -O1/-O2, and depending on how compile times look for regular code maybe at -O0 as well.

Cheers
Andreas




Andreas Klebinger wrote:

Simon Peyton Jones wrote:
Good work.

| However, it will also be possible to fall back to the old behavior to
| avoid the large compile times for modules known to trigger this
| edge case.

Not for specific modules, I hope; rather fall back when the number of cases becomes large, or something like that? That'd be fine.
As it stands it's a regular flag, and hence per module (just like worker/wrapper or similar) and user controlled. In the example I have in mind it was a large case expression that gets desugared into a long chain of if/elseif/elseif/..../else.
So it's not strictly about large case expressions, but usually that's what it is.
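
For illustration, a contrived example of the shape in question (not taken from the actual report):

    -- A guard-heavy definition like this (imagine hundreds of
    -- alternatives) desugars into one long chain of conditional
    -- branches:
    classify :: Int -> String
    classify n
      | n < 10    = "tiny"
      | n < 100   = "small"
      | n < 1000  = "medium"
      | otherwise = "large"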

I actually hadn't considered doing a "dynamic" fallback so far. That's a great idea! If we are lucky we can use the ratio of edges to blocks, and fall back to the old algorithm if we are above a certain function size and ratio,
with the threshold in turn depending on the optimization level.
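
Something like this hypothetical heuristic (all names and thresholds here are invented for illustration):

    -- Decide whether to use the new layout algorithm; fall back to
    -- the old one for big functions with a high edge/block ratio.
    useNewLayout :: Int -> Int -> Int -> Bool
    useNewLayout optLevel numBlocks numEdges =
        numBlocks < sizeThreshold || ratio < ratioThreshold
      where
        ratio = fromIntegral numEdges / fromIntegral numBlocks :: Double
        -- The thresholds would in turn depend on the optimization level.
        sizeThreshold  = if optLevel >= 1 then 2000 else 500
        ratioThreshold = 1.5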

But I'm not sure that's good design, as we then end up with cases where performance suddenly gets 5% worse just because
someone adds a constructor or the code gets slightly larger for some other reason.

The only other potential worry is how tricky/understandable your patch is. Hopefully it's not hard.
I hope so. The algorithm itself isn't that complicated compared to some of the stuff in GHC, and I tried to document it well.
But it's also far from being trivial code.

| 1. Always, since on average both compile time and runtime are faster.
| 2. At -O1/-O2, since users often use these when performance matters.
| 3. Only at -O2, since users expect compilation times to blow up with it?
| 4. Never, as users experience compile time slowdowns when they hit
| these edge cases.

Well, if you can robustly fall back in edge cases, you'll never hit the blow-ups. So then (1) would be fine, would it not?
I guess I will have to see how well the edge/block ratio correlates with compile-time blowups. If we can use that to rule out the really bad cases, then (1) should be fine.
If not, I will have to come back to the question.

Cheers
Andreas

Simon




| -----Original Message-----
| From: ghc-devs<ghc-devs-boun...@haskell.org>  On Behalf Of Andreas
| Klebinger
| Sent: 30 August 2018 18:08
| To: ghc-devs@haskell.org
| Subject: Compile times: Average vs worst case runtime
|
| Hello Devs,
|
| I developed a patch improving on GHC's code layout during GSoC:
| https://ghc.haskell.org/trac/ghc/ticket/15124
| The gains are pretty nice, with most library benchmarks showing
| improvements in the 1-2% range or more.
|
| Even compile times went down when comparing HEAD against a stage2
| compiler built with, and using, my patch!
| This makes compilation of nofib overall faster by 0.5-1%, depending
| on the exact setup.
|
| Now the bad news:
| In some cases the algorithm has bad big-O performance. In practice
| this seems to be code with things like case expressions with 200+
| alternatives, where these cases also represent most of the code
| compiled.
|
| The worst case I saw was a doubled compile time in a module which
| consisted only of a 500-deep if/else chain merely selecting
| a value.
|
| While there are some small optimizations possible to improve on this,
| I don't expect to make these edge cases much faster overall.
| However, it will also be possible to fall back to the old behavior to
| avoid the large compile times for modules known to trigger this
| edge case.
|
| Which brings me to my main question: when should we use the new code
| layout?
| 1. Always, since on average both compile time and runtime are faster.
| 2. At -O1/-O2, since users often use these when performance matters.
| 3. Only at -O2, since users expect compilation times to blow up with it?
| 4. Never, as users experience compile time slowdowns when they hit
| these edge cases.
|
| I would prefer 2, 3, 1, 4 in that order, but I wonder what the wider
| community thinks.
|
| Cheers
| Andreas

