Re: [PERFORM] GiST index performance

Yeb Havinga Wed, 17 Mar 2010 02:26:47 -0700

Yeb Havinga wrote:

Matthew Wakeling wrote:
Matthew Wakeling wrote:
A second quite distinct issue is the general performance of GiST indexes
which is also mentioned in the old thread linked from Open Items. For
that, we have a test case at
http://archives.postgresql.org/pgsql-performance/2009-04/msg00276.php
for btree_gist indexes. I have a similar example with the bioseg GiST
index. I have completely reimplemented the same algorithms in Java for
algorithm investigation and instrumentation purposes, and it runs about
a hundred times faster than in Postgres. I think this is a problem, and
I'm willing to do some investigation to try and solve it.
I have not made any progress on this issue. I think Oleg and Teodor would be better placed working it out. All I can say is that I implemented the exact same indexing algorithm in Java, and it performed 100 times faster than Postgres. Now, Postgres has to do a lot of additional work, like mapping the index onto disc, locking pages, and abstracting to plugin user functions, so I would expect some difference - I'm not sure 100 times is reasonable though. I tried to do some profiling, but couldn't see any one section of code that was taking too much time. Not sure what I can further do.
Hello Matthew and list,
A lot of time is spent in gistget.c code, and there are a lot of FunctionCall5 calls to the gist's consistent function, which is out of sight for gprof. Something different but related, since it also concerns gist: we noticed before that gist indexes that use a compressed form for index entries suffer from repeated compress calls on query operands (see http://archives.postgresql.org/pgsql-hackers/2009-05/msg00078.php).
The btree_gist int4 compress function calls the generic gbt_num_compress, which does a palloc. Maybe this palloc is also hit a lot when scanning the index, because the constants that are queried with are repeatedly compressed and palloced.

Looked in the code a bit more - only the index nodes are compressed at index creation; the consistent functions do not compress queries, so no pallocs there. However, when running Matthew's example from http://archives.postgresql.org/pgsql-performance/2009-04/msg00276.php with the gist index, the coverage shows in gistget.c: 1000000 palloc0's of gistsearchstack at line 152 and 2010982 pallocs, also of the gistsearchstack, on line 342. Two pfrees are also hit a lot: line 195: 1010926 times, of a stackentry, and line 293: 200056 times. My $0.02 is that the pain is here. My knowledge of gistget or the other sources in access/gist is zero, but couldn't it be possible to determine the maximum needed size of the stack, allocate it at once, and use a pop/push kind of API?
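
To make the pop/push idea concrete, here is a minimal standalone sketch in plain C. It is not the gistget.c code: StackEntry, EntryStack and the stack_* functions are hypothetical names, and malloc/realloc stand in for palloc/repalloc. The entries live in one array that is allocated up front, so a push or pop normally costs no allocation. The array doubles when it fills up, as a fallback in case the maximum size cannot be determined in advance.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a search stack entry; not the real GiST struct. */
typedef struct StackEntry
{
    unsigned int block;     /* page still to be visited */
} StackEntry;

/* Array-backed stack: one allocation holds all entries. */
typedef struct EntryStack
{
    StackEntry *items;
    size_t      count;      /* entries currently on the stack */
    size_t      capacity;   /* entries the array can hold */
} EntryStack;

/* Allocate room for 'capacity' entries at once. Returns false on failure. */
bool
stack_init(EntryStack *s, size_t capacity)
{
    assert(s != NULL);
    if (capacity == 0)
        capacity = 1;       /* avoid a zero-sized allocation */
    s->items = malloc(capacity * sizeof(StackEntry));
    s->count = 0;
    s->capacity = (s->items != NULL) ? capacity : 0;
    return s->items != NULL;
}

/* Push a copy of 'e'; only allocates when the array is full. */
bool
stack_push(EntryStack *s, StackEntry e)
{
    assert(s != NULL);
    if (s->count == s->capacity)
    {
        size_t      newcap = s->capacity ? s->capacity * 2 : 1;
        StackEntry *grown = realloc(s->items, newcap * sizeof(StackEntry));

        if (grown == NULL)
            return false;
        s->items = grown;
        s->capacity = newcap;
    }
    s->items[s->count++] = e;
    return true;
}

/* Pop the most recently pushed entry into '*out'; false if the stack is empty. */
bool
stack_pop(EntryStack *s, StackEntry *out)
{
    assert(s != NULL && out != NULL);
    if (s->count == 0)
        return false;
    *out = s->items[--s->count];
    return true;
}

/* Release the single backing allocation. */
void
stack_free(EntryStack *s)
{
    assert(s != NULL);
    free(s->items);
    s->items = NULL;
    s->count = 0;
    s->capacity = 0;
}
```

A scan would call stack_init once, push the child pages that match, pop the next page to visit, and call stack_free when the scan ends. Whether this actually removes the cost seen in the coverage numbers above would have to be measured.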


regards,
Yeb Havinga





--
Sent via pgsql-performance mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
