Hi Bulat,

This seems quite reasonable to me. Have you eyeballed the assembly GCC
produces to see that the hot path is improved? If you can submit a patch,
that would be great!

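For anyone who wants to try the idea outside the RTS, here is a minimal standalone sketch of the fast-path/slow-path split proposed in the quoted message. It is not GHC's code: word_t, block_t, arena_t, arena_init, BLOCK_WORDS and the malloc-backed slow path are all made up for illustration, and __attribute__((noinline)) assumes GCC or Clang. Instead of initialising free and end to NULL, the sketch points them at an empty sentinel block, which should give the same single-test fast path while avoiding arithmetic on null pointers.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the RTS types; not GHC's real definitions. */
typedef uintptr_t word_t;

typedef struct {
    word_t *start;  /* first word of the block */
    word_t *free;   /* next unallocated word */
    word_t *end;    /* one past the last usable word */
} block_t;

typedef struct {
    block_t current;  /* block that allocations are currently served from */
} arena_t;

enum { BLOCK_WORDS = 512 };  /* illustrative block size, in words */

/* Empty sentinel block: free == end, so the first request always
   falls through to the slow path without a separate NULL check. */
static word_t empty_block[1];

static void arena_init(arena_t *a)
{
    a->current.start = empty_block;
    a->current.free  = empty_block;
    a->current.end   = empty_block;
}

/* Slow path, kept out of line so the fast path stays small.
   This sketch simply grabs a fresh block with malloc and abandons
   the old one; a real allocator would recycle or chain blocks. */
__attribute__((noinline))
static word_t *slow_allocate(arena_t *a, size_t n)
{
    size_t words = n > BLOCK_WORDS ? n : BLOCK_WORDS;
    word_t *mem = malloc(words * sizeof(word_t));
    if (mem == NULL)
        return NULL;
    a->current.start = mem;
    a->current.free  = mem + n;
    a->current.end   = mem + words;
    return mem;
}

/* Fast path: one comparison, then a pointer bump. */
static inline word_t *allocate(arena_t *a, size_t n)
{
    block_t *bd = &a->current;
    /* Compare against the remaining space rather than computing
       bd->free + n, which could point past the end of the block. */
    if (n > (size_t)(bd->end - bd->free))
        return slow_allocate(a, n);
    word_t *p = bd->free;
    bd->free += n;
    return p;
}
```

Compiling a small caller with "gcc -O2 -S" and reading the output is one way to check whether the inlined fast path really comes down to a compare, a branch and an add.
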
Cheers,
Edward

Excerpts from Bulat Ziganshin's message of 2014-10-14 10:08:59 -0700:
> Hello Glasgow-haskell-users,
>
> i'm looking at
> https://github.com/ghc/ghc/blob/23bb90460d7c963ee617d250fa0a33c6ac7bbc53/rts/sm/Storage.c#L680
>
> if i understand correctly, it's a speed-critical routine?
>
> i think that it may be improved in this way:
>
> StgPtr allocate (Capability *cap, W_ n)
> {
>     bdescr *bd;
>     StgPtr p;
>
>     TICK_ALLOC_HEAP_NOCTR(WDS(n));
>     CCS_ALLOC(cap->r.rCCCS,n);
>
>     /// here starts new improved code:
>
>     bd = cap->r.rCurrentAlloc;
>     if (bd == NULL || bd->free + n > bd->end) {
>         if (n >= LARGE_OBJECT_THRESHOLD/sizeof(W_)) {
>             ....
>         }
>         if (bd->free + n <= bd->start + BLOCK_SIZE_W) {
>             bd->end = min (bd->start + BLOCK_SIZE_W,
>                            bd->free + LARGE_OBJECT_THRESHOLD);
>             goto usual_alloc;
>         }
>         ....
>     }
>
>     /// and here it stops
>
> usual_alloc:
>     p = bd->free;
>     bd->free += n;
>
>     IF_DEBUG(sanity, ASSERT(*((StgWord8*)p) == 0xaa));
>     return p;
> }
>
>
> i think it's obvious - we consolidate the two if's on the critical path
> into a single one, plus avoid one ADD by keeping the highly useful
> bd->end pointer
>
> further improvements may include removing the bd==NULL check by
> initializing bd->free=bd->end=NULL, and moving the entire "if" body
> into a separate slow_allocate() procedure marked "noinline", with
> allocate() probably marked as forceinline:
>
> StgPtr allocate (Capability *cap, W_ n)
> {
>     bdescr *bd;
>     StgPtr p;
>
>     TICK_ALLOC_HEAP_NOCTR(WDS(n));
>     CCS_ALLOC(cap->r.rCCCS,n);
>
>     bd = cap->r.rCurrentAlloc;
>     if (bd->free + n > bd->end)
>         return slow_allocate(cap,n);
>
>     p = bd->free;
>     bd->free += n;
>
>     IF_DEBUG(sanity, ASSERT(*((StgWord8*)p) == 0xaa));
>     return p;
> }
>
> this change will greatly simplify the optimizer's work. in my
> experience, current C++ compilers are weak at optimizing large
> functions with complex execution paths, and such transformations really
> improve the generated code