On Mon, Apr 21, 2008 at 8:25 PM, Patrick R. Michaud <[EMAIL PROTECTED]> wrote:
> On Mon, Apr 21, 2008 at 10:19:08AM -0700, chromatic wrote:
>  > I'm still exploring the Rakudo build progress as a profiling target for
>  > likely optimizations.  After this weekend's work, I have
>  > src/gen_actions.pir generation down to 27,788,055,796 instructions (with
>  > an optimized Parrot).  A big chunk of that time goes to support bsr_ic:
>  >
>  >  7,784,136,854  core.ops:Parrot_bsr_ic
>  >  7,775,231,886  stacks.c:stack_push
>  >  7,763,569,145  stack_common.c:stack_prepare_push
>  >  7,754,735,042  stack_common.c:cst_new_stack_chunk
>  >
>  >
>  > Why is it expensive?  *Every* call to cst_new_stack_chunk() requests a free
>  > bufferlike object from the GC.  98% of the inclusive cost of these four
>  > functions is in running the GC.
>  >
>  > Someone who's familiar with the stack code (or wants to be) might be able
>  > to find a big optimization here.
>
>  To me, the scary part of src/stacks.c is at the beginning:
>
>     The stack is stored as a linked list of chunks (C<Stack_Chunk>),
>     where each chunk has room for one entry.
>
>  Eek!  For something like bsr_ic, which is really just pushing a
>  return address (i.e., opcode_t *) onto a stack, we're allocating a
>  new GC-able object for every bsr invocation, and freeing it on ret.
>  Since PGE uses bsr/ret for its backtracking, that's a lot of allocations.
>
>  In fact, this seems to be the case for everything using the
>  "generic stack", which AFAICT is the &interp->dynamic_env structure.
>  So, everything that gets pushed onto this stack (exceptions,
>  continuations, coroutines, bsr/ret calls) ends up with a separate
>  gc-able structure (Stack_Chunk) to hold the stack entry.
>  So, switching PGE from bsr/ret to another control structure doesn't
>  give us a win here.
>
>  I think we'd get a BIG win if we changed the dynamic_env stack to
>  have an approach similar to ResizableIntegerArray, where we allocate
>  arrays of Stack_Chunk entries and manage them that way, instead of
>  a separate allocation per element in the stack.

What about actually using a ResizableIntegerArray for this purpose
(or PMC, if Integer is not suitable for storing addresses)?
Or is this a really dumb idea?
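
To make the idea concrete, here is a minimal sketch in plain C of an
array-backed return-address stack that grows by doubling, so a push only
allocates when the buffer is full rather than once per entry. This is not
Parrot code: RetStack, ret_stack_push, ret_stack_pop and ret_stack_free are
hypothetical names, opcode_t is a stand-in typedef, and the sketch uses
malloc-style memory instead of GC-managed buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef long opcode_t;          /* stand-in for Parrot's opcode_t (assumption) */

/* Hypothetical array-backed stack of return addresses. */
typedef struct {
    opcode_t **items;           /* contiguous buffer of saved addresses */
    size_t     used;            /* number of entries currently on the stack */
    size_t     capacity;        /* number of entries the buffer can hold */
} RetStack;

/* Push a return address; returns 1 on success, 0 if memory ran out. */
int ret_stack_push(RetStack *s, opcode_t *addr)
{
    assert(s != NULL);
    if (s->used == s->capacity) {
        /* Grow geometrically so allocations are rare, not per push. */
        size_t new_cap = s->capacity ? s->capacity * 2 : 16;
        opcode_t **grown = realloc(s->items, new_cap * sizeof *grown);
        if (grown == NULL)
            return 0;
        s->items    = grown;
        s->capacity = new_cap;
    }
    s->items[s->used++] = addr;
    return 1;
}

/* Pop the most recent return address, or NULL if the stack is empty. */
opcode_t *ret_stack_pop(RetStack *s)
{
    assert(s != NULL);
    return s->used ? s->items[--s->used] : NULL;
}

/* Release the buffer and reset the stack to its empty state. */
void ret_stack_free(RetStack *s)
{
    assert(s != NULL);
    free(s->items);
    s->items    = NULL;
    s->used     = 0;
    s->capacity = 0;
}
```

A zero-initialized RetStack (RetStack s = {0};) is a valid empty stack. A
pop never frees anything, so a bsr/ret pair that stays within the current
capacity costs no allocation at all. The sketch does not cover the other
kinds of entries that go on the dynamic_env stack (exceptions,
continuations, coroutines), which would presumably need more than a bare
address per entry.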

kjs
