Re: [SQL] Wrong query plan when using a left outer join
BTW, add a foreign key and index on handhistory_plain.hand_id (unless you have it already). It's there already: feiketracker=# \d+ handhistory_plain; Table stage.handhistory_plain Column | Type | Modifiers | Storage | Description -+-+---+--+- hand_id | integer | not null | plain| history | text| not null | extended | Indexes: handhistory_plain_pkey PRIMARY KEY, btree (hand_id) CLUSTER Foreign-key constraints: handhistory_plain_hand_id_fkey FOREIGN KEY (hand_id) REFERENCES hand_meta(hand_id) BTW2, if you really don't care on handhistory you can just use original query with no join. Well, sometimes I do, sometimes I don't. For easier application access I wanted to create a view that joins both these tables together: easier application design and better performance, as the analyzer should know best when not to use the handhistory_plain table. The design is as follows: hand_meta - holds all metadata for a pokerhand handhistory_plain holds the history for a pokerhand hand_meta is going to be used the most, it is around 165 bytes per tuple handhistory_plain is not going to be used often (it is there as a reference); it is around 5000 bytes per tuple. They both hold the same column as primary key, handhistory_plain holds a fraction of the tuples of hand_meta, the split was only made to make sure the processed data (hand_meta) is smaller in size and should therefore require less I/O and thus increase performance. I'm not sure what to make of: imagine that the view on the right side of join has some side effects. I can see some side effects may occur, but as it is a left join, the left hand side will always be part of the returning set (there is no where clause), so the index should be used. Even though I don't understand, you seem to be right, a natural join is 30 times faster: feiketracker=# explain analyze select max(hand_id) from hand_meta left join handhistory_plain using(hand_id); QUERY PLAN - Aggregate (cost=1049261.00..1049261.01 rows=1 width=4) (actual time=31179.238..31179.241 rows=1 loops=1) - Seq Scan on hand_meta (cost=100.00..1043062.40 rows=2479440 width=4) (actual time=0.131..16039.886 rows=2479440 loops=1) Total runtime: 31179.725 ms (3 rows) Time: 31185.088 ms feiketracker=# explain analyze select max(hand_id) from hand_meta join handhistory_plain using(hand_id); QUERY PLAN -- Aggregate (cost=53043.61..53043.62 rows=1 width=4) (actual time=962.242..962.245 rows=1 loops=1) - Nested Loop (cost=0.00..53029.93 rows=5470 width=4) (actual time=0.400..920.582 rows=5470 loops=1) - Index Scan using handhistory_plain_pkey on handhistory_plain (cost=0.00..14494.27 rows=5470 width=4) (actual time=0.215..101.177 rows=5470 loops=1) - Index Scan using hand_meta_pkey on hand_meta (cost=0.00..7.03 rows=1 width=4) (actual time=0.100..0.115 rows=1 loops=5470) Index Cond: (hand_meta.hand_id = handhistory_plain.hand_id) Total runtime: 962.968 ms try to experiment with SET enable_seqscan TO false; - and see what happens. Didn't make a difference; therefore I think postgres determines it is unable to use the index, is that correct? Thank you for now: I'll use the inner join (or natural join in this case) for this specific view -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Wrong query plan when using a left outer join
oops, but ofcourse, a natural view will not give the correct answer, back to the drawing board ... On Tue, Jan 17, 2012 at 19:53, Feike Steenbergen feikesteenber...@gmail.com wrote: BTW, add a foreign key and index on handhistory_plain.hand_id (unless you have it already). It's there already: feiketracker=# \d+ handhistory_plain; Table stage.handhistory_plain Column | Type | Modifiers | Storage | Description -+-+---+--+- hand_id | integer | not null | plain | history | text | not null | extended | Indexes: handhistory_plain_pkey PRIMARY KEY, btree (hand_id) CLUSTER Foreign-key constraints: handhistory_plain_hand_id_fkey FOREIGN KEY (hand_id) REFERENCES hand_meta(hand_id) BTW2, if you really don't care on handhistory you can just use original query with no join. Well, sometimes I do, sometimes I don't. For easier application access I wanted to create a view that joins both these tables together: easier application design and better performance, as the analyzer should know best when not to use the handhistory_plain table. The design is as follows: hand_meta - holds all metadata for a pokerhand handhistory_plain holds the history for a pokerhand hand_meta is going to be used the most, it is around 165 bytes per tuple handhistory_plain is not going to be used often (it is there as a reference); it is around 5000 bytes per tuple. They both hold the same column as primary key, handhistory_plain holds a fraction of the tuples of hand_meta, the split was only made to make sure the processed data (hand_meta) is smaller in size and should therefore require less I/O and thus increase performance. I'm not sure what to make of: imagine that the view on the right side of join has some side effects. I can see some side effects may occur, but as it is a left join, the left hand side will always be part of the returning set (there is no where clause), so the index should be used. Even though I don't understand, you seem to be right, a natural join is 30 times faster: feiketracker=# explain analyze select max(hand_id) from hand_meta left join handhistory_plain using(hand_id); QUERY PLAN - Aggregate (cost=1049261.00..1049261.01 rows=1 width=4) (actual time=31179.238..31179.241 rows=1 loops=1) - Seq Scan on hand_meta (cost=100.00..1043062.40 rows=2479440 width=4) (actual time=0.131..16039.886 rows=2479440 loops=1) Total runtime: 31179.725 ms (3 rows) Time: 31185.088 ms feiketracker=# explain analyze select max(hand_id) from hand_meta join handhistory_plain using(hand_id); QUERY PLAN -- Aggregate (cost=53043.61..53043.62 rows=1 width=4) (actual time=962.242..962.245 rows=1 loops=1) - Nested Loop (cost=0.00..53029.93 rows=5470 width=4) (actual time=0.400..920.582 rows=5470 loops=1) - Index Scan using handhistory_plain_pkey on handhistory_plain (cost=0.00..14494.27 rows=5470 width=4) (actual time=0.215..101.177 rows=5470 loops=1) - Index Scan using hand_meta_pkey on hand_meta (cost=0.00..7.03 rows=1 width=4) (actual time=0.100..0.115 rows=1 loops=5470) Index Cond: (hand_meta.hand_id = handhistory_plain.hand_id) Total runtime: 962.968 ms try to experiment with SET enable_seqscan TO false; - and see what happens. Didn't make a difference; therefore I think postgres determines it is unable to use the index, is that correct? Thank you for now: I'll use the inner join (or natural join in this case) for this specific view -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
[SQL] Wrong query plan when using a left outer join
I have the following setup: A table called hand: Table stage.hand_meta Column | Type | Modifiers ---+--+- hand_id | integer | not null default nextval('hand_meta_hand_id_seq'::regclass) hand_no | bigint | not null site_id | smallint | not null game_id | smallint | not null time | timestamp with time zone | not null tournament_id | bigint | Indexes: hand_meta_pkey PRIMARY KEY, btree (hand_id) CLUSTER hand_meta_hand_no_site_unq UNIQUE, btree (hand_no, site_id) hand_meta_time_idx btree (time) hand_meta_tournament_id_idx btree (tournament_id) Referenced by: TABLE handhistory_plain CONSTRAINT handhistory_plain_hand_id_fkey FOREIGN KEY (hand_id) REFERENCES hand_meta(hand_id) TABLE handhistory_staged CONSTRAINT staged_hand_hand_id_fkey FOREIGN KEY (hand_id) REFERENCES hand_meta(hand_id) Getting the max hand_id (primary key) results in using an index: feiketracker= explain analyze select max(hand_id) from stage.hand; QUERY PLAN --- Result (cost=0.03..0.04 rows=1 width=0) (actual time=0.379..0.383 rows=1 loops=1) InitPlan 1 (returns $0) - Limit (cost=0.00..0.03 rows=1 width=4) (actual time=0.337..0.340 rows=1 loops=1) - Index Scan Backward using hand_meta_pkey on hand_meta (cost=0.00..82667.12 rows=2479440 width=4) (actual time=0.319..0.319 rows=1 loops=1) Index Cond: (hand_id IS NOT NULL) Total runtime: 0.823 ms (6 rows) Now, if i create a view which left outer joins another table and select max hand_id it uses a seq_scan, which I think it should'nt use, as it only needs to query hand_meta and then use the index: feiketracker= create view seqscan_example as (select * from hand_meta left join handhistory_plain using(hand_id)); CREATE VIEW Time: 72.736 ms feiketracker= explain analyze select max(hand_id) from seqscan_example; QUERY PLAN - Aggregate (cost=49261.00..49261.01 rows=1 width=4) (actual time=34672.052..34672.054 rows=1 loops=1) - Seq Scan on hand_meta (cost=0.00..43062.40 rows=2479440 width=4) (actual time=0.180..16725.109 rows=2479440 loops=1) Total runtime: 34672.874 ms (3 rows) feiketracker= select version(); version PostgreSQL 9.0.6 on armv5tejl-unknown-linux-gnueabi, compiled by GCC gcc (GCC) 3.4.4 (release) (CodeSourcery ARM 2005q3-2), 32-bit (1 row) I cannot think of a reason to use a seqscan, the left join should indicate all results from hand_meta should be used, hand_id is the primary key, so selecting max(hand_id) from the table or the view should result in the same execution plan or am I thinking wrong? -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Current transaction is aborted, commands ignored until end of transaction block
I recently started receiving this error as well, this was because I disabled autocommit. With the following option in .psqlrc the error doesn't wait for a rollback but automatically creates a savepoint allowing you to fix the error and continue: This is now in my .psqlrc: \set AUTOCOMMIT off \set ON_ERROR_ROLLBACK on http://www.postgresql.org/docs/9.1/static/app-psql.html look for ON_ERROR_ROLLBACK When on, if a statement in a transaction block generates an error, the error is ignored and the transaction continues. When interactive, such errors are only ignored in interactive sessions, and not when reading script files. -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql