Re: [PERFORM] sequential scan on select distinct

Pierre-Frédéric Caillaud Wed, 06 Oct 2004 10:32:15 -0700


    There are even three questions here :

        - given that 'SELECT DISTINCT field FROM table' is exactly
the same as 'SELECT field FROM table GROUP BY field", postgres could
transform the first into the second and avoid itself a (potentially
killer) sort.

On my example the table was not too large but on a very large table, sorting all the values and then discinct'ing them does not look too appealing.

Currently Postgres does Sort+Unique, but there could be a DistinctSort instead of a Sort, that is a thing that sorts and removes the duplicates at the same time. Not that much complicated to code than a sort, and much faster in this case. Or there could be a DistinctHash, which would be similar or rather identical to a HashAggregate and would again skip the sort.

It would (as a bonus) speed up queries like UNION (not ALL), that kind of things. For example :

explain (select number from dummy) union (select number from dummy); Unique (cost=287087.62..297087.62 rows=2000000 width=4) -> Sort (cost=287087.62..292087.62 rows=2000000 width=4) Sort Key: number -> Append (cost=0.00..49804.00 rows=2000000 width=4) -> Subquery Scan "*SELECT* 1" (cost=0.00..24902.00 rows=1000000 width=4) -> Seq Scan on dummy (cost=0.00..14902.00 rows=1000000 width=4) -> Subquery Scan "*SELECT* 2" (cost=0.00..24902.00 rows=1000000 width=4) -> Seq Scan on dummy (cost=0.00..14902.00 rows=1000000 width=4)

        This is scary !

I can rewrite it as such (and the planner could, too) :

explain select * from ((select number from dummy) union all (select number from dummy)) as foo group by number; HashAggregate (cost=74804.00..74804.00 rows=200 width=4) -> Subquery Scan foo (cost=0.00..69804.00 rows=2000000 width=4) -> Append (cost=0.00..49804.00 rows=2000000 width=4) -> Subquery Scan "*SELECT* 1" (cost=0.00..24902.00 rows=1000000 width=4) -> Seq Scan on dummy (cost=0.00..14902.00 rows=1000000 width=4) -> Subquery Scan "*SELECT* 2" (cost=0.00..24902.00 rows=1000000 width=4) -> Seq Scan on dummy (cost=0.00..14902.00 rows=1000000 width=4)

which avoids a large sort...

However there must be cases in which performing a sort is faster, like when there are a lot of distinct values and the HashAggregate becomes huge too.

Well there are two questions here. Why given the current plans available does postgres choose a sequential scan instead of an index scan. And why isn't

Well because it needs to get all the rows in the table in order. in this case seq scan+sort is about twice as fast as index scan. Interestingly, once I ANALYZED the table, postgres will chooses to index-scan, which is slower.

there this kind of "skip index scan" available.


    It would be really nice to have a skip index scan available.

I have an other idea, lets call it the indexed sequential scan : When pg knows there are a lot of rows to access, it will ignore the index and seqscan. This is because index access is very random, thus slow. However postgres could implement an "indexed sequential scan" where : - the page numbers for the matching rows are looked up in the index (this is fast as an index has good locality) - the page numbers are grouped so we have a list of pages with one and only one instance of each page number - the list is then sorted so we have page numbers in-order - the pages are loaded in sorted order (doing a kind of partial sequential scan) which would be faster than reading them randomly.

        Other ideas later

Postgres chooses a sequential scan with a sort (or hash aggregate) over an index scan because it expects it to be faster. sequential scans are much faster than random access scans of indexes, plus index scans need to read many more blocks. If you're finding the index scan to be just as fast as sequential scans you might consider lowering random_page_cost closer to 1.0. But note that you may be getting fooled by a testing methodology where more things are cached than would be in production.

why isn't a "skip index scan" plan available? Well, nobody's written the code yet. It would part of the same code needed to get an index scan used for:
    select y,min(x) from bar group by y
And possibly also related to the TODO item:
    Use index to restrict rows returned by multi-key index when used with
    non-consecutive keys to reduce heap accesses
For an index on col1,col2,col3, and a WHERE clause of col1 = 5 and col3 = 9, spin though the index checking for col1 and col3 matches, rather than just col1

Note that the optimizer would have to make a judgement call based on the expected number of distinct values. If you had much more than 256 distinct values then the your plpgsql function wouldn't have performed well at all.


---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

Re: [PERFORM] sequential scan on select distinct

Reply via email to