Re: [perl #17549] [PATCH] direct accesss for intlist 10 times faster

Nicholas Clark Fri, 27 Sep 2002 12:04:12 -0700

Disclaimer: I am very out of touch with what's actually going on on
perl6-internals at the moment. (And even more out of touch with the source
code)

On Tue, Sep 24, 2002 at 04:56:25AM +0000, Leopold Toetsch wrote:

> +    return list->chunk_list[idx / INTLIST_CHUNK_SIZE];

On Tue, Sep 24, 2002 at 05:22:33PM +0200, Leopold Toetsch wrote:
> Sean O'Rourke wrote:

> >If we expect these (especially shift) to be as frequent as push/pop, and
> >we want fast indexing as well, then maybe something like the SGI STL
> >implementation of a dequeue (dequeueue?) would be best:  keep an array of
> >(pointers to) fixed-size chunks (rather than intlist's linked-list).
> 
> 
> Exactly this is, what my recent patch actually did: list->chunk_list 
> holds pointers to chunks, an index lookup is one "div" more expensive 
> then in array.

I see that you define INTLIST_CHUNK_SIZE as 256 currently. Do all compiler
optimisers pick up on the fact that this is a multiple of 2, and can be
converted to a shift (probably faster)?
Would it be wise to express it as log base 2 in the header, and explicitly
write it as a shift here (and idx = idx % INTLIST_CHUNK_SIZE; becomes a
logical and) ?

> >This would also allow gaps in the middle to remain unallocated, and could
> >speed up some splices.
> 
> 
> Yep. With some additional overhead we could also have sparse arrays and 
> cheap splice.

I like the sound of sparse arrays.

I suspect that by having arrays stored in chunks, we also have cheaper
extension (no need to allocate a new huge block for 1024 elements, and copy
512 into it, merely allocate another 256 and chain them on the end) at the
cost of one indirection more on lookup, and more wasted overhead by having
smaller blocks. (or are they fixed size, and so come from an arena?)

> >It's an improvement over the current implementation, but I don't think
> >it's the best solution for perlarray's requirements:
> >
> >1 - front insertion/removal
> >2 - back insertion/removal
> >3 - indexing
> >4 - less-frequent insertion/removal in the middle (splice)
> >
> >The current approach is necessarily bad for 1, good for 3, and
> >unnecessarily bad for 2.  IntList is good for 1 and 2, but bad for 3.

I'm guessing that the current implementation is a contiguous array that
extends at the end. That's not necessarily bad for front insertion if you
can store pointers to both the true start of allocated space and the
current first element, and move the current first element around for front
insertion and removal. (OK, with some major copying needed occasionally)

I believe that perl5 is using this approach. (but I've not checked)

> >Both are not-so-hot for 4, but I don't think a lot of people use splice.
> 
> 
> Also for splice the chunk based solution is much better.

For the case where you are splicing in or out elements equal to the chunk
size, it's easy to see that you can just drop or gain entire chunk.

Why is it better for splices of arbitrary sizes? It's not obvious to me that
there is any particular advantage or disadvantage of a chunked approach for
this.

Nicholas Clark
-- 
Even better than the real thing:        http://nms-cgi.sourceforge.net/

Re: [perl #17549] [PATCH] direct accesss for intlist 10 times faster

Reply via email to