Re: [PERFORM] max() versus order/limit (WAS: High update activity, PostgreSQL vs BigDBMS)
Adam Rich wrote: Did anybody get a chance to look at this? Is it expected behavior? Everyone seemed so incredulous, I hoped maybe this exposed a bug that would be fixed in a near release. Actually, the planner is only able to do the min()/max() transformation into order by/limit in the case of a single table being scanned. Since you have a join here, the optimization is obviously not used: select max(item_id) from events e, receipts r, receipt_items ri where e.event_id=r.event_id and r.receipt_id=ri.receipt_id plan/planagg.c says /* * We also restrict the query to reference exactly one table, since join * conditions can't be handled reasonably. (We could perhaps handle a * query containing cartesian-product joins, but it hardly seems worth the * trouble.) */ so you should keep using your hand-written order by/limit query. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] max() versus order/limit (WAS: High update
Luke Lonergan wrote: Adam, This optimization would require teaching the planner to use an index for MAX/MIN when available. It seems like an OK thing to do to me. This optimization already exists, albeit for queries that use a single table. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] max() versus order/limit (WAS: High update
Luke Lonergan wrote: Adam, This optimization would require teaching the planner to use an index for MAX/MIN when available. It seems like an OK thing to do to me. Uhmmm I thought we did that already in 8.1? Joshua D. Drake - Luke -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Adam Rich Sent: Sunday, January 14, 2007 8:52 PM To: 'Joshua D. Drake'; 'Tom Lane' Cc: 'Craig A. James'; 'PostgreSQL Performance' Subject: Re: [PERFORM] max() versus order/limit (WAS: High update activity, PostgreSQL vs BigDBMS) Did anybody get a chance to look at this? Is it expected behavior? Everyone seemed so incredulous, I hoped maybe this exposed a bug that would be fixed in a near release. -Original Message- From: Adam Rich [mailto:[EMAIL PROTECTED] Sent: Sunday, January 07, 2007 11:53 PM To: 'Joshua D. Drake'; 'Tom Lane' Cc: 'Craig A. James'; 'PostgreSQL Performance' Subject: RE: [PERFORM] High update activity, PostgreSQL vs BigDBMS Here's another, more drastic example... Here the order by / limit version runs in less than 1/7000 the time of the MAX() version. select max(item_id) from events e, receipts r, receipt_items ri where e.event_id=r.event_id and r.receipt_id=ri.receipt_id Aggregate (cost=10850.84..10850.85 rows=1 width=4) (actual time=816.382..816.383 rows=1 loops=1) - Hash Join (cost=2072.12..10503.30 rows=139019 width=4) (actual time=155.177..675.870 rows=147383 loops=1) Hash Cond: (ri.receipt_id = r.receipt_id) - Seq Scan on receipt_items ri (cost=0.00..4097.56 rows=168196 width=8) (actual time=0.009..176.894 rows=168196 loops=1) - Hash (cost=2010.69..2010.69 rows=24571 width=4) (actual time=155.146..155.146 rows=24571 loops=1) - Hash Join (cost=506.84..2010.69 rows=24571 width=4) (actual time=34.803..126.452 rows=24571 loops=1) Hash Cond: (r.event_id = e.event_id) - Seq Scan on receipts r (cost=0.00..663.58 rows=29728 width=8) (actual time=0.006..30.870 rows=29728 loops=1) - Hash (cost=469.73..469.73 rows=14843 width=4) (actual time=34.780..34.780 rows=14843 loops=1) - Seq Scan on events e (cost=0.00..469.73 rows=14843 width=4) (actual time=0.007..17.603 rows=14843 loops=1) Total runtime: 816.645 ms select item_id from events e, receipts r, receipt_items ri where e.event_id=r.event_id and r.receipt_id=ri.receipt_id order by item_id desc limit 1 Limit (cost=0.00..0.16 rows=1 width=4) (actual time=0.047..0.048 rows=1 loops=1) - Nested Loop (cost=0.00..22131.43 rows=139019 width=4) (actual time=0.044..0.044 rows=1 loops=1) - Nested Loop (cost=0.00..12987.42 rows=168196 width=8) (actual time=0.032..0.032 rows=1 loops=1) - Index Scan Backward using receipt_items_pkey on receipt_items ri (cost=0.00..6885.50 rows=168196 width=8) (actual time=0.016..0.016 rows=1 loops=1) - Index Scan using receipts_pkey on receipts r (cost=0.00..0.02 rows=1 width=8) (actual time=0.010..0.010 rows=1 loops=1) Index Cond: (r.receipt_id = ri.receipt_id) - Index Scan using events_pkey on events e (cost=0.00..0.04 rows=1 width=4) (actual time=0.009..0.009 rows=1 loops=1) Index Cond: (e.event_id = r.event_id) Total runtime: 0.112 ms -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Joshua D. Drake Sent: Sunday, January 07, 2007 9:10 PM To: Adam Rich Cc: 'Craig A. James'; 'Guy Rouillier'; 'PostgreSQL Performance' Subject: Re: [PERFORM] High update activity, PostgreSQL vs BigDBMS On Sun, 2007-01-07 at 20:26 -0600, Adam Rich wrote: I'm using 8.2 and using order by limit is still faster than MAX() even though MAX() now seems to rewrite to an almost identical plan internally. Gonna need you to back that up :) Can we get an explain analyze? Count(*) still seems to use a full table scan rather than an index scan. There is a TODO out there to help this. Don't know if it will get done. Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner
Re: [PERFORM] max() versus order/limit (WAS: High update
Adam, This optimization would require teaching the planner to use an index for MAX/MIN when available. It seems like an OK thing to do to me. - Luke -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Adam Rich Sent: Sunday, January 14, 2007 8:52 PM To: 'Joshua D. Drake'; 'Tom Lane' Cc: 'Craig A. James'; 'PostgreSQL Performance' Subject: Re: [PERFORM] max() versus order/limit (WAS: High update activity, PostgreSQL vs BigDBMS) Did anybody get a chance to look at this? Is it expected behavior? Everyone seemed so incredulous, I hoped maybe this exposed a bug that would be fixed in a near release. -Original Message- From: Adam Rich [mailto:[EMAIL PROTECTED] Sent: Sunday, January 07, 2007 11:53 PM To: 'Joshua D. Drake'; 'Tom Lane' Cc: 'Craig A. James'; 'PostgreSQL Performance' Subject: RE: [PERFORM] High update activity, PostgreSQL vs BigDBMS Here's another, more drastic example... Here the order by / limit version runs in less than 1/7000 the time of the MAX() version. select max(item_id) from events e, receipts r, receipt_items ri where e.event_id=r.event_id and r.receipt_id=ri.receipt_id Aggregate (cost=10850.84..10850.85 rows=1 width=4) (actual time=816.382..816.383 rows=1 loops=1) - Hash Join (cost=2072.12..10503.30 rows=139019 width=4) (actual time=155.177..675.870 rows=147383 loops=1) Hash Cond: (ri.receipt_id = r.receipt_id) - Seq Scan on receipt_items ri (cost=0.00..4097.56 rows=168196 width=8) (actual time=0.009..176.894 rows=168196 loops=1) - Hash (cost=2010.69..2010.69 rows=24571 width=4) (actual time=155.146..155.146 rows=24571 loops=1) - Hash Join (cost=506.84..2010.69 rows=24571 width=4) (actual time=34.803..126.452 rows=24571 loops=1) Hash Cond: (r.event_id = e.event_id) - Seq Scan on receipts r (cost=0.00..663.58 rows=29728 width=8) (actual time=0.006..30.870 rows=29728 loops=1) - Hash (cost=469.73..469.73 rows=14843 width=4) (actual time=34.780..34.780 rows=14843 loops=1) - Seq Scan on events e (cost=0.00..469.73 rows=14843 width=4) (actual time=0.007..17.603 rows=14843 loops=1) Total runtime: 816.645 ms select item_id from events e, receipts r, receipt_items ri where e.event_id=r.event_id and r.receipt_id=ri.receipt_id order by item_id desc limit 1 Limit (cost=0.00..0.16 rows=1 width=4) (actual time=0.047..0.048 rows=1 loops=1) - Nested Loop (cost=0.00..22131.43 rows=139019 width=4) (actual time=0.044..0.044 rows=1 loops=1) - Nested Loop (cost=0.00..12987.42 rows=168196 width=8) (actual time=0.032..0.032 rows=1 loops=1) - Index Scan Backward using receipt_items_pkey on receipt_items ri (cost=0.00..6885.50 rows=168196 width=8) (actual time=0.016..0.016 rows=1 loops=1) - Index Scan using receipts_pkey on receipts r (cost=0.00..0.02 rows=1 width=8) (actual time=0.010..0.010 rows=1 loops=1) Index Cond: (r.receipt_id = ri.receipt_id) - Index Scan using events_pkey on events e (cost=0.00..0.04 rows=1 width=4) (actual time=0.009..0.009 rows=1 loops=1) Index Cond: (e.event_id = r.event_id) Total runtime: 0.112 ms -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Joshua D. Drake Sent: Sunday, January 07, 2007 9:10 PM To: Adam Rich Cc: 'Craig A. James'; 'Guy Rouillier'; 'PostgreSQL Performance' Subject: Re: [PERFORM] High update activity, PostgreSQL vs BigDBMS On Sun, 2007-01-07 at 20:26 -0600, Adam Rich wrote: I'm using 8.2 and using order by limit is still faster than MAX() even though MAX() now seems to rewrite to an almost identical plan internally. Gonna need you to back that up :) Can we get an explain analyze? Count(*) still seems to use a full table scan rather than an index scan. There is a TODO out there to help this. Don't know if it will get done. Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan