Just for the sake of discussion I've attached a performance graph for various 
C++ data structures plus the unrolled linked list (ULL). The tests were run on 
a Dell Vostro 1400 laptop. As you can see, the graphs show the ULL to be quite 
efficient for insert/delete at the front/back of the list. I believe this is 
mainly because a new node does not have to be allocated for each insert 
operation.

Without an effective processor cache, though, the ULL will likely not perform 
well.

As John and Pavel have pointed out, it's probably not ideal for the page cache, 
since the pages need to point back into the list. So this would be very 
problematic for page movement.

Ken




--- On Tue, 10/27/09, John Crenshaw <johncrens...@priacta.com> wrote:

> From: John Crenshaw <johncrens...@priacta.com>
> Subject: Re: [sqlite] Idea for improving page cache
> To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
> Date: Tuesday, October 27, 2009, 8:29 PM
> I don't think ULL buys you as much extra cache space as you think, and
> it certainly isn't anywhere near as fast as LL. Consider the following
> problems that still remain:
> 1. If you need to point to an element (common with linked lists) it gets
> really messy. Sorted inserts and deletes will change the memory location
> of elements, as well as their offset in the node, so you can't point to
> them by memory location or node + offset. Instead, you have to point to
> the list, plus something static and uniquely identifiable about the
> element. Many times this is going to involve additional space once per
> node. Assuming you care about word alignment (which you should if you're
> talking memory locality and stuff), the additional space needs to be the
> size of a pointer. This would give a net savings of one pointer for
> double LL, and ZERO net savings for single LL, unless each element
> already stored something uniquely identifying.
> 2. Unless you never need to add OR remove elements in the middle, you
> need wasted space to avoid very expensive operations. You still have to
> store the data. If the data is a pointer to other data, the whole
> business of cache locality is largely academic anyway, since you've got
> plenty of chances for cache misses when you look at the other data.
> Let's say that the data stored is 16 bytes (small, but useful), and
> pointers are 4 bytes. The data in this case uses 66% of the space, so
> unless you can keep the wasted space per node below 33% you don't gain
> anything at all over a straight double LL with good locality. If you're
> replacing a single LL, you have to keep wasted space below 20% to see
> any gain. At larger and larger data sizes, this number gets smaller and
> smaller.
> 3. If data is arbitrarily removed at approximately the same rate that
> data is added, the wasted space will naturally gravitate towards 50%,
> which is no gain in footprint in most cases. Even for a single LL with
> pointer-sized data this is a zero gain.
> 4. You CAN improve wasted space with balancing to consolidate adjacent
> nodes when possible, but this can quickly become expensive, with lots of
> memory copy operations just to avoid a little wasted space in the cache.
> If we measure the value of that space in terms of the percentage of a
> cache miss that it might avoid, times the cost of a cache miss in time,
> it is unclear if there is a real gain here.
> 5. Insert and delete operations will require, on average, a copy of half
> the elements in a node. Compared to the 2-operation insert/delete of a
> single LL, this is not cheap. Again, compare it to the value of the
> cache you are trying to save by avoiding the pointers, and this doesn't
> look like a wise spend to me.
> 6. Linked lists are designed for use in cases where inserts and removals
> should be fast and simple, but ULL ignores this. Deletes are especially
> problematic, requiring a full scan to the element being deleted,
> followed by a memory copy and possibly node balancing. In a double
> linked list, if you have a pointer to the node you want to delete, the
> delete costs only 2 operations.
> 7. ULL will outperform LL when traversing nodes to find a sorted
> insertion point, but will likely lose that gain when it actually does
> the insert. In any case, a btree will outperform either when finding a
> sorted insertion point, still gets the superfast insert, and can get
> appropriate locality by pooling.
> 
> I guess my point is, avoiding cache misses is a fine goal, but I don't
> believe ULL is the tool to use. A pool allocation scheme that uses the
> same pool size as a ULL node will likely have the same or better memory
> footprint, faster insertion, faster deletion, and the same cache
> locality. In practice, pools could safely be much larger, because you
> don't have to balance the additional cost for inserts and deletes. This
> means pool allocation could experience even better cache locality than
> ULL, due to larger allocation blocks. Additionally, the allocation cost
> per node in these pools is unbelievably low (a couple of operations),
> and if you have a separate pool per list, access to the pools can be
> controlled by the same critical sections that already control the linked
> lists, making it automatically threadsafe.
> 
> Pool allocation works without all the added complexity of ULL, and
> rather than hidden costs (memcpy for insert and delete), there are
> hidden savings (nearly zero per-node allocation cost unless you hit the
> end of a pool).
> 
> Anyway, if someone wants to tackle the problem of improving cache
> locality, and I get a vote on how it's done, I vote pools. They're easy
> (could probably be done in < 100 lines of new code), uncomplicated
> (logically the LL works the same as it always has), flexible (can
> convert to btree), and give a clear benefit (as opposed to questionable
> gains once space, pointers to nodes, and memcpy are looked at).
> 
> John
> 
> -----Original Message-----
> From: sqlite-users-boun...@sqlite.org
> [mailto:sqlite-users-boun...@sqlite.org]
> On Behalf Of Nicolas Williams
> Sent: Tuesday, October 27, 2009 6:29 PM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] Idea for improving page cache
> 
> On Tue, Oct 27, 2009 at 04:28:11PM -0400, John Crenshaw wrote:
> > "advantage" kind of depends. ULL is more specialized. You gain some
> > benefit, but also lose some as well. For example, consider what is
> > involved in doing a sorted insert into an ULL. On the other hand, you
> > can get all of the same locality benefit with a pool allocation
> > scheme. You don't reduce the pointer overhead this way, but that isn't
> > really much of an issue.
> 
> I don't know that you get the same locality benefit with pools: with a
> ULL there are fewer prev/next pointers to take up valuable cache space.
> 
> The need to "pullup"/"breakup" a ULL at times is annoying, and to
> minimize it you probably have to waste some space in order to amortize
> the cost.  As you say, there's a trade-off.  Many optimizations result
> in more complex, error-prone, brittle software.
> 
> Wasting space is not much of a problem for cache locality, and if you
> keep the wasted space well under 50% you're ahead of plain lists in
> terms of memory footprint.  So, ignoring code complexity, ULL seems like
> a likely win-win performance-wise.  Even code complexity-wise, ULLs
> allow random access to be much faster than with plain linked lists,
> which means you're more likely to use random access, which means you
> probably win code complexity-wise too.  Of course, once you start
> leaving room for additions in each chunk, ULLs start resembling B-trees,
> which, I think, is your point: might as well go whole-hog.
> 
> > ULL requires less time to walk all nodes, but since you can't afford
> > to keep ULL sorted, there aren't a lot of scenarios where that saves
> 
> I don't think it's true that you can't afford to keep it sorted, but
> you probably have to waste space if you want it sorted while allowing
> most random inserts to not require spills.  Also, just because a chunk
> is full, or even all chunks are full, does not mean you must move lots
> of memory around: you can just insert a new chunk and move some entries
> from the neighboring ones into it, thus creating some breathing space
> for further additions (at the cost of wasted space).
> 
> Nico
> -- 
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users