On Aug 23, 2011, at 2:03 AM, Heikki Linnakangas wrote:
> While looking at Alexander's GiST fastbuild patch, which adds buffers to 
> internal nodes to avoid random I/O during index build, it occurred to me that 
> inserting the tuples to the leaf pages one at a time is quite inefficient 
> too, even if the leaf pages are in cache. There's still the overhead of 
> locking and WAL-logging each insertion separately. I think we could get a 
> nice further speedup if we attach a small buffer (one block or so) to every 
> leaf page we're currently writing tuples to, and update the leaf page in 
> bulk. Conveniently, the code to insert multiple tuples to a page already 
> exists in GiST code (because inserting a tuple sometimes splits the page into 
> more than two parts, so you need to insert multiple downlinks to the parent), 
> so this requires no changes to the low-level routines and WAL-logging.
> 
> Let's finish off the main fastbuild patch first, but I wanted to get the idea 
> out there.

I've often wondered about the per-tuple overhead of all kinds of operations, 
not just GiST index builds. For example, if you're doing a seqscan, ISTM it 
would be a lot more efficient to memcpy an entire page into backend-local 
memory and operate off of that lock-free. Similarly for an index scan, you'd 
want to copy a full leaf page if you think you'll be hitting it more than once 
or twice.
--
Jim C. Nasby, Database Architect                   j...@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to