Re: [HACKERS] WIP: Fast GiST index build

Heikki Linnakangas Thu, 01 Sep 2011 02:38:31 -0700

On 01.09.2011 12:23, Alexander Korotkov wrote:

On Thu, Sep 1, 2011 at 12:59 PM, Heikki Linnakangas<
[email protected]>  wrote:

So I changed the test script to generate the table as:

CREATE TABLE points AS SELECT random() as x, random() as y FROM
generate_series(1, $NROWS);

The unordered results are in:

          testname           |   nrows   |    duration     | accesses
-----------------------------+-----------+-----------------+----------
  points unordered buffered   | 250000000 | 05:56:58.575789 |  2241050
  points unordered auto       | 250000000 | 05:34:12.187479 |  2246420
  points unordered unbuffered | 250000000 | 04:38:48.663952 |  2244228

Although the buffered build doesn't lose as badly as it did with more
overlap, it still doesn't look good :-(. Any ideas?



But it's still a lot of overlap. It's about 220 accesses per small area
request. That's about 10 - 20 times more than there should be without
overlaps.

Hmm, those "accesses" numbers are actually quite bogus for this test. I
changed the creation of the table as you suggested, so that all x and y
values are in the range 0.0 - 1.0, but I didn't change the loop to
calculate those accesses, so it still queried for boxes in the range 0 -
100000. That makes me wonder, why does it need 220 accesses on average
to satisfy queries most of which lie completely outside the range of
actual values in the index? I would expect such queries to just look at
the root node, conclude that there can't be any matching tuples, and
return immediately.
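
To put a rough number on "most of which lie completely outside", here is a small Python sketch. It is not the actual test script, which isn't shown in this thread. It assumes the query boxes have their lower-left corner drawn uniformly from 0 - 100000 on both axes and a fixed side length (the `query_size` value is a made-up placeholder), and it counts how many of them overlap the unit square that holds all the indexed points. The function names are hypothetical too.

```python
import random

def boxes_overlap(a, b):
    """Return True if axis-aligned boxes a and b intersect.

    Each box is a tuple (xmin, ymin, xmax, ymax).
    """
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def count_overlapping_queries(nqueries, data_box, query_range, query_size, seed=0):
    """Count random query boxes that overlap data_box.

    The query distribution is an assumption: the lower-left corner is
    uniform in [0, query_range] on both axes, the side is query_size.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(nqueries):
        x = rng.uniform(0, query_range)
        y = rng.uniform(0, query_range)
        query = (x, y, x + query_size, y + query_size)
        if boxes_overlap(query, data_box):
            hits += 1
    return hits

# All indexed points have x and y in 0.0 - 1.0, so the root's bounding
# box can be no larger than the unit square.
DATA_BOX = (0.0, 0.0, 1.0, 1.0)

hits = count_overlapping_queries(10000, DATA_BOX, 100000.0, 100.0)
print(hits, "of 10000 query boxes overlap the indexed data")
```

Under these assumptions a query overlaps the data only if both corner coordinates fall at or below 1.0, which has a probability of about 1e-10 per query. Nearly every query should therefore be rejectable by one overlap check against the root's bounding box.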


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

