(This doesn't belong on the pgadmin-hackers list, but here goes anyway..)

Berkowitz Eric wrote:
When postgresql implements the following query:

Select * from <table> where <condition> order by <ordinal expression> limit <X>

It appears to do a select, then a sort, then return the top X rows.

This works fine for small results but not for tables with tens of millions of rows and queries that may return tens of thousands or even hundreds of thousands of rows.

The sort is superfluous and incredibly expensive.

What should be done on this query is to do the select saving X rows in a save-bucket that is ranked by the ordinal expression.

Starting with version 8.3, the server can do just that. It's implemented within the Sort node, but you can tell by looking at the EXPLAIN ANALYZE output if that optimization has taken effect:

postgres=# explain analyze SELECT * FROM foo ORDER BY a LIMIT 10;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------
Limit (cost=7.16..7.19 rows=10 width=2) (actual time=0.581..0.625 rows=10 loops=1) -> Sort (cost=7.16..7.41 rows=100 width=2) (actual time=0.577..0.592 rows=10 loops=1)
         Sort Key: a
         Sort Method:  top-N heapsort  Memory: 17kB
-> Seq Scan on foo (cost=0.00..5.00 rows=100 width=2) (actual time=0.013..0.207 rows=103 loops=1)
 Total runtime: 0.694 ms
(6 rows)

The "top-N heapsort" is exactly what you're looking for.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgadmin-hackers mailing list (pgadmin-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgadmin-hackers

Reply via email to