Thanks Simon for linking that issue! Does the patch linked there <https://ghc.haskell.org/trac/ghc/attachment/ticket/8905/0001-EXPERIMENTAL-save-the-continuation-after-the-tag-che.patch> already cover what you suggested in your last mail? I think no, it's a more limited change, but I'm having trouble understanding exactly what.
I've also got one really basic question -- was it decided long ago that all these stack limit checks are cheaper than having a guard page at the end of each stack and faulting to grow the stack? (I couldn't find a place that this rationale was described on wiki.) Best, -Ryan On Sun, Nov 1, 2015 at 10:05 AM, Simon Marlow <marlo...@gmail.com> wrote: > Yes, I think we can probably do a better job of compiling case > expressions. The current compilation strategy optimises for code size, but > at the expense of performance in the fast path. We can tweak this > tradeoff, perhaps under the control of a flag. > > Ideally the sequence should start by assuming that the closure is already > evaluated, e.g. > > loop: > tag = R2 & 7; > if (tag == 1) then // code for [] > else if (tag == 2) then // code for (:) > else evaluate; jump back to loop > > The nice thing is that now that we don't need proc points, "loop" is just > a label that we can directly jump to. Unfortunately this only works when > using the NCG, not with LLVM, because LLVM requires proc points, so we > might need to fall back to a different strategy for LLVM. > > Similar topics came up here: https://ghc.haskell.org/trac/ghc/ticket/8905 > and I think there was another ticket but I can't find it now. > > Cheers > Simon > > On 23/10/2015 19:00, Ryan Newton wrote: > >> 1. Small tweaks: The CMM code above seems to be /betting/ than the >> thunk is unevaluated, because it does the stack check and stack >> write /before/ the predicate test that checks if the thunk is >> evaluated (if(R1 & 7!= 0) gotoc3aO; elsegotoc3aP;). With a >> bang-pattern function, couldn't it make the opposite bet? That >> is, branch on whether the thunk is evaluated first, and then the >> wasted computation is only a single correctly predicted branch >> (and a read of a tag that we need to read anyway). >> >> Oh, a small further addition would be needed for this tweak. In the >> generated code above "Sp = Sp + 8;" happens /late/, but I think it could >> happen right after the call to the thunk. In general, does it seem >> feasible to separate the slowpath from fastpath as in the following >> tweak of the example CMM? >> >> >> * // Skip to the chase if it's already evaluated:* >> * start:* >> * if (R2 & 7 != 0) goto fastpath; else goto slowpath;* >> * >> * >> * slowpath: // Formerly c3aY* >> * if ((Sp + -8) < SpLim) goto c3aZ; else goto c3b0;* >> * c3aZ:* >> * // nop* >> * R1 = PicBaseReg + foo_closure;* >> * call (I64[BaseReg - 8])(R2, R1) args: 8, res: 0, upd: 8;* >> * c3b0:* >> * I64[Sp - 8] = PicBaseReg + block_c3aO_info;* >> * R1 = R2;* >> * Sp = Sp - 8;* >> >> * call (I64[R1])(R1) returns to fastpath, args: 8, res: 8, upd: 8;* >> * // Sp bump moved to here so it's separate from "fastpath"* >> * Sp = Sp + 8;* >> * >> * >> * fastpath: // Formerly c3aO* >> * if (R1 & 7 >= 2) goto c3aW; else goto c3aX;* >> * c3aW:* >> * R1 = P64[R1 + 6] & (-8);* >> * call (I64[R1])(R1) args: 8, res: 0, upd: 8;* >> * c3aX:* >> * R1 = PicBaseReg + lvl_r39S_closure;* >> * call (I64[R1])(R1) args: 8, res: 0, upd: 8;* >> >> >> >> >> >> _______________________________________________ >> ghc-devs mailing list >> ghc-devs@haskell.org >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> >>
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs