On Thu, Sep 8, 2011 at 8:35 PM, Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> wrote:
> Now that the main GiST index build patch has been committed, there's a few
> further improvements that could make it much faster still:
>
> Better management of the buffer pages on disk. At the moment, the
> temporary file is used as a heap of pages belonging to all the buffers in
> random order. I think we could speed up the algorithm considerably by
> reading/writing the buffer pages sequentially. For example, when an
> internal page is split, and all the tuples in its buffer are relocated,
> that would be a great chance to write the new pages of the new buffers in
> sequential order, instead of writing them back to the pages freed up by the
> original buffer, which can be spread all around the temp file. I wonder if
> we could use a separate file for each buffer? Or at least, a separate file
> for all buffers that are larger than, say, 100 MB in size.
>
> Better management of in-memory buffer pages. When we start emptying a
> buffer, we currently flush all the buffer pages in memory to the temporary
> file, to make room for new buffer pages. But that's a waste of time, if
> some of the pages we had in memory belonged to the buffer we're about to
> empty next, or that we empty tuples to. Also, if while emptying a buffer,
> all the tuples go to just one or two lower-level buffers, it would be
> beneficial to keep more than one page in memory for those buffers.
>
> Buffering leaf pages. This I posted on a separate thread already:
> http://archives.postgresql.org/message-id/4E5350DB.3060...@enterprisedb.com
>
> Also, at the moment there is one issue with the algorithm that we have
> glossed over this far: for each buffer, we keep some information in memory,
> in the hash table, and in the auxiliary lists. That means that the amount
> of memory needed for the build scales with the size of the index.
> If you're dealing with very large indexes, hopefully you also have a lot
> of RAM in your system, so I don't think this is a problem in practice.
> Still, it would be nice to do something about that. A straightforward idea
> would be to swap some of the information to disk. Another idea, simpler to
> implement, would be to completely destroy a buffer, freeing all the memory
> it uses, when it becomes completely empty. Then, if you're about to run out
> of memory (as defined by maintenance_work_mem), you can empty some
> low-level buffers to disk to free some up.

Unfortunately, I haven't had enough time to implement any of this before the 9.2 release; work on my PhD thesis and web projects takes too much of it.

But I think there is one thing we should fix before the 9.2 release. We assume that the GiST index build has at least effective_cache_size/4 of cache available. This assumption can easily be false on high-concurrency systems. I don't see a way to make a convincing estimate here, but we could at least document this behaviour, so that users can tune effective_cache_size for GiST index builds on high-concurrency systems.

------
With best regards,
Alexander Korotkov.