[GENERAL] Indexing problem with OFFSET LIMIT
Hello I have problem in my applications and don't know how to fix it. This is the table and one of the indexes: CREATE TABLE foo ( id serial NOT NULL, foo_name character varying(100), realm_id integer ... and about 50 other columns ) CREATE INDEX idx_foo_name_realm ON foo USING btree (realm_id, foo_name); Table foo contains about 8 Million Rows. The problem: Consider this query: SELECT * FROM foo WHERE realm_id = 228 order by foo_name LIMIT 200 OFFSET 15000 And it's execution plan: Limit (cost=57527.13..58294.16 rows=200 width=575) (actual time=182.302..184.971 rows=200 loops=1) - Index Scan using idx_foo_name_realm on foo (cost=0.00..62159.98 rows=16208 width=575) (actual time=0.085..166.861 rows=15200 loops=1) Index Cond: (realm_id = 228) Total runtime: 185.591 ms And now look at this: SELECT * FROM foo WHERE realm_id = 228 order by foo_name LIMIT 200 OFFSET 15999 Limit (cost=59601.92..59602.42 rows=200 width=575) (actual time=1069.759..1072.310 rows=200 loops=1) - Sort (cost=59561.92..59602.44 rows=16208 width=575) (actual time=929.948..1052.620 rows=16199 loops=1) Sort Key: foo_name Sort Method: external merge Disk: 8984kB - Bitmap Heap Scan on foo (cost=306.69..54270.62 rows=16208 width=575) (actual time=9.612..235.902 rows=21788 loops=1) Recheck Cond: (realm_id = 228) - Bitmap Index Scan on foo_realm_id (cost=0.00..302.64 rows=16208 width=0) (actual time=8.733..8.733 rows=21810 loops=1) Index Cond: (realm_id = 228) Total runtime: 1084.706 ms Execution time increases tenfold because postgres stopped using the index. Can anybody explain to me what's going on and what can be done? Is this a memory problem?
Re: [GENERAL] Indexing problem with OFFSET LIMIT
I'm no expert at reading query plans, but I'm guessing the planner chose the other plan because your offset + limit went beyond the row estimate. Look's like it's then doing a disk based sort in the other plan which probably explain why it's slow. Someone please correct me if I'm wrong. Perhaps if you can spare a little more work_mem in postgesql.conf it might go back to a memory sort. David. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Oliver Weichhold Sent: 29 August 2008 21:38 To: pgsql-general@postgresql.org Subject: [GENERAL] Indexing problem with OFFSET LIMIT Hello I have problem in my applications and don't know how to fix it. This is the table and one of the indexes: CREATE TABLE foo ( id serial NOT NULL, foo_name character varying(100), realm_id integer ... and about 50 other columns ) CREATE INDEX idx_foo_name_realm ON foo USING btree (realm_id, foo_name); Table foo contains about 8 Million Rows. The problem: Consider this query: SELECT * FROM foo WHERE realm_id = 228 order by foo_name LIMIT 200 OFFSET 15000 And it's execution plan: Limit (cost=57527.13..58294.16 rows=200 width=575) (actual time=182.302..184.971 rows=200 loops=1) - Index Scan using idx_foo_name_realm on foo (cost=0.00..62159.98 rows=16208 width=575) (actual time=0.085..166.861 rows=15200 loops=1) Index Cond: (realm_id = 228) Total runtime: 185.591 ms And now look at this: SELECT * FROM foo WHERE realm_id = 228 order by foo_name LIMIT 200 OFFSET 15999 Limit (cost=59601.92..59602.42 rows=200 width=575) (actual time=1069.759..1072.310 rows=200 loops=1) - Sort (cost=59561.92..59602.44 rows=16208 width=575) (actual time=929.948..1052.620 rows=16199 loops=1) Sort Key: foo_name Sort Method: external merge Disk: 8984kB - Bitmap Heap Scan on foo (cost=306.69..54270.62 rows=16208 width=575) (actual time=9.612..235.902 rows=21788 loops=1) Recheck Cond: (realm_id = 228) - Bitmap Index Scan on foo_realm_id (cost=0.00..302.64 rows=16208 width=0) (actual time=8.733..8.733 rows=21810 loops=1) Index Cond: (realm_id = 228) Total runtime: 1084.706 ms Execution time increases tenfold because postgres stopped using the index. Can anybody explain to me what's going on and what can be done? Is this a memory problem?
Re: [GENERAL] Indexing problem with OFFSET LIMIT
On Fri, Aug 29, 2008 at 4:38 PM, Oliver Weichhold [EMAIL PROTECTED] wrote: Hello I have problem in my applications and don't know how to fix it. This is the table and one of the indexes: CREATE TABLE foo ( id serial NOT NULL, foo_name character varying(100), realm_id integer ... and about 50 other columns ) CREATE INDEX idx_foo_name_realm ON foo USING btree (realm_id, foo_name); Table foo contains about 8 Million Rows. The problem: Consider this query: SELECT * FROM foo WHERE realm_id = 228 order by foo_name LIMIT 200 OFFSET 15000 try this: SELECT * FROM foo WHERE realm_id = 228 order by realm_id, foo_name LIMIT 200 OFFSET 15000 Or even better don't use 'offset' at all. It's simply lousy. If you want to skip ahead 200 rows at a time, save off the previous last extracted rows in the app: 1st time: select * from foo order by realm_id, foo_name limit 200; times after that: select * from foo where (realm_id, foo_name) (last_realm_id, last_foo_name) order by realm_id, foo_name limit 200; you should be pleasantly surprised :-). This is also a little bit more graceful if other sessions are deleting/inserting rows while you are browsing. merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Indexing problem with OFFSET LIMIT
Merlin Moncure [EMAIL PROTECTED] writes: On Fri, Aug 29, 2008 at 4:38 PM, Oliver Weichhold [EMAIL PROTECTED] wrote: Consider this query: SELECT * FROM foo WHERE realm_id = 228 order by foo_name LIMIT 200 OFFSET 15000 try this: SELECT * FROM foo WHERE realm_id = 228 order by realm_id, foo_name LIMIT 200 OFFSET 15000 Or even better don't use 'offset' at all. It's simply lousy. If you want to skip ahead 200 rows at a time, save off the previous last extracted rows in the app: Yeah, large OFFSET values are always going to suck performance-wise, because there is no magic way to skip over those rows --- the backend has to read them, and then throw them away internally. Avoid OFFSET. Aside from the type of trick Merlin mentions, have you considered using a cursor? regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general