Re: [GENERAL] View with an outer join - is there any way to optimise
Tom Lane wrote: Rich Doughty <[EMAIL PROTECTED]> writes: I have a view vw_tokens defined as ... I cannot however perform a meaningful join against this view. ... PG forms the full output of the view. You seem to be wishing that PG would push the INNER JOIN down inside the nested LEFT JOINs. In general, rearranging inner and outer joins like that can change the results. There are limited cases where it can be done without breaking the query semantics, but the planner doesn't currently have any logic to analyze whether it's safe or not, so it just doesn't try. Improving this situation is (or ought to be) on the TODO list, but I dunno when it will happen. ok, thanks. as i suspected, i don't think i'm going to be able to views for this. when the query is ultimately returning only 100 or so rows, i cannot afford a full 4 million row table scan to form the full view when a nested loop might make more sense (anything makes more sense than the full view :-) i have a workaround (of sorts). instead of WHERE token_id IN (SELECT token_id FROM ta_tokenhist WHERE sarreport_id = 9) if i perform the subquery manually, then create a second query of the form WHERE token_id IN (?,?,?,?,?) i get decent results. it's pretty ugly but it works. i doubt that it will scale up to 500 or more results (if that), but thankfully in general, neither do the query results. cheers anyway - Rich Doughty ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] View with an outer join - is there any way to optimise this?
Rich Doughty <[EMAIL PROTECTED]> writes: > I have a view vw_tokens defined as > ... > I cannot however perform a meaningful join against this view. > ... > PG forms the full output of the view. You seem to be wishing that PG would push the INNER JOIN down inside the nested LEFT JOINs. In general, rearranging inner and outer joins like that can change the results. There are limited cases where it can be done without breaking the query semantics, but the planner doesn't currently have any logic to analyze whether it's safe or not, so it just doesn't try. Improving this situation is (or ought to be) on the TODO list, but I dunno when it will happen. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] View with an outer join - is there any way to optimise
John McCawley wrote: You should be able to use my trick...the join that is giving you the problem is: SELECT * FROM tokens.ta_tokenhist h INNER JOIN tokens.vw_tokenst ON h.token_id = t.token_id WHERE h.sarreport_id = 9 ; ta_tokenhist is already part of your view, right? So you should be able to include the sarreport_id as part of your view, and then restructure your query as: SELECT * FROM tokens.ta_tokenhist INNER JOIN tokens.vw_tokens ON tokens.ta_tokenhist.token_id = tokens.vw_tokens.token_id WHERE tokens.vw_tokens.sarreport_id = 9 ; I removed the aliases because they confuse me ;) i don't think i can do that. basically i want to run a variety of queries on the vw_tokens view. for example, joins i hope to do may include: tokens.ta_tokenhist h INNER JOIN tokens.vw_tokens WHERE h.customer_id = ? tokens.ta_tokenhist h INNER JOIN tokens.vw_tokens WHERE h.histdate between 'then' and 'now' tokens.vw_tokens WHERE number = ? i just want vw_tokens to give me a constant resultset. i have a feeling though that views aren't go to be able to give me what i need. i suppose i could go for a set returning function, or just write the queries manually. -- - Rich Doughty ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] View with an outer join - is there any way to optimise
You should be able to use my trick...the join that is giving you the problem is: SELECT * FROM tokens.ta_tokenhist h INNER JOIN tokens.vw_tokenst ON h.token_id = t.token_id WHERE h.sarreport_id = 9 ; ta_tokenhist is already part of your view, right? So you should be able to include the sarreport_id as part of your view, and then restructure your query as: SELECT * FROM tokens.ta_tokenhist INNER JOIN tokens.vw_tokens ON tokens.ta_tokenhist.token_id = tokens.vw_tokens.token_id WHERE tokens.vw_tokens.sarreport_id = 9 ; I removed the aliases because they confuse me ;) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] View with an outer join - is there any way to optimise
First of all, check out this thread: http://archives.postgresql.org/pgsql-general/2005-11/msg00734.php I had a similar performance issue with a view. Look at my solution and it might help you out. Second, you might want to change your token history status from a string to an integer that references a status table. If your view is causing a sequential scan, you're going to end up will bazillions of string comparisons. I don't know if Postgres has some form of black magic string comparison optimization, but I generally avoid string comparisons when I am dealing with a few known values, as would be the case in a status table. Rich Doughty wrote: I have a view vw_tokens defined as CREATE VIEW tokens.vw_tokens AS SELECT -- too many columns to mention FROM tokens.ta_tokens t LEFT JOIN tokens.ta_tokenhist i ON t.token_id = i.token_id AND i.status = 'issued' LEFT JOIN tokens.ta_tokenhist s ON t.token_id = s.token_id AND s.status = 'sold' LEFT JOIN tokens.ta_tokenhist r ON t.token_id = r.token_id AND r.status = 'redeemed' ; the ta_tokens table contains approx 4 million records, and ta_tokenhist approx 10 millions. queries against the view itself on the primary key execute with no issues at all. I cannot however perform a meaningful join against this view. when i execute SELECT * FROM tokens.ta_tokenhist h INNER JOIN tokens.vw_tokenst ON h.token_id = t.token_id WHERE h.sarreport_id = 9 ; PG forms the full output of the view. the query plan is Hash Join (cost=1638048.47..3032073.73 rows=1 width=702) Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer) -> Hash Left Join (cost=1638042.45..3011803.15 rows=4052907 width=470) Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer) -> Hash Left Join (cost=1114741.93..2011923.86 rows=4052907 width=322) Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer) -> Hash Left Join (cost=559931.55..1093783.71 rows=4052907 width=174) Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer) -> Seq Scan on ta_tokens t (cost=0.00..73250.07 rows=4052907 width=26) -> Hash (cost=459239.41..459239.41 rows=4114456 width=152) -> Seq Scan on ta_tokenhist i (cost=0.00..459239.41 rows=4114456 width=152) Filter: ((status)::text = 'issued'::text) -> Hash (cost=459239.41..459239.41 rows=3905186 width=152) -> Seq Scan on ta_tokenhist s (cost=0.00..459239.41 rows=3905186 width=152) Filter: ((status)::text = 'sold'::text) -> Hash (cost=459239.41..459239.41 rows=2617645 width=152) -> Seq Scan on ta_tokenhist r (cost=0.00..459239.41 rows=2617645 width=152) Filter: ((status)::text = 'redeemed'::text) -> Hash (cost=6.01..6.01 rows=1 width=236) -> Index Scan using fkx_tokenhist__sarreports on ta_tokenhist h (cost=0.00..6.01 rows=1 width=236) Index Cond: ((sarreport_id)::integer = 9) I have also tried explicitly querying token_id in the view, hoping to force a nested loop: EXPLAIN SELECT * FROM tokens.vw_tokens__user WHERE token_id IN (SELECT token_id FROM tokens.ta_tokenhist WHERE sarreport_id = 9); QUERY PLAN -- Hash IN Join (cost=1638048.47..3032073.73 rows=1 width=470) Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer) -> Hash Left Join (cost=1638042.45..3011803.15 rows=4052907 width=470) Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer) -> Hash Left Join (cost=1114741.93..2011923.86 rows=4052907 width=322) Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer) -> Hash Left Join (cost=559931.55..1093783.71 rows=4052907 width=174) Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer) -> Seq Scan on ta_tokens t (cost=0.00..73250.07 rows=4052907 width=26) -> Hash (cost=459239.41..459239.41 rows=4114456 width=152) -> Seq Scan on ta_tokenhist i (cost=0.00..459239.41 rows=4114456 width=152) Filter: ((status)::text = 'issued'::text) -> Hash (cost=459239.41..459239.41 rows=3905186 width=152) -> Seq Scan on ta_tokenhist s (cost=0.00..459239.41 rows=3905186 width=152)
Re: [GENERAL] View with an outer join - is there any way to optimise
John McCawley wrote: First of all, check out this thread: http://archives.postgresql.org/pgsql-general/2005-11/msg00734.php I had a similar performance issue with a view. Look at my solution and it might help you out. i'm not sure that'll help in my case as ta_tokens has a 1-to-many relationship with ta_tokenhist. there are various indexed tokenhist columns that i want to query on (in addition to salesorder_id). none of them will return more than 100 or so rows. it'd probably be easier to abandon the view altogether (which isn't something i'd really like to do) Second, you might want to change your token history status from a string to an integer that references a status table. If your view is causing a sequential scan, you're going to end up will bazillions of string comparisons. I don't know if Postgres has some form of black magic string comparison optimization, but I generally avoid string comparisons when I am dealing with a few known values, as would be the case in a status table. interesting thought. of course, i'd rather postgres didn't do a full sequential scan ;-) Rich Doughty wrote: I have a view vw_tokens defined as CREATE VIEW tokens.vw_tokens AS SELECT -- too many columns to mention FROM tokens.ta_tokens t LEFT JOIN tokens.ta_tokenhist i ON t.token_id = i.token_id AND i.status = 'issued' LEFT JOIN tokens.ta_tokenhist s ON t.token_id = s.token_id AND s.status = 'sold' LEFT JOIN tokens.ta_tokenhist r ON t.token_id = r.token_id AND r.status = 'redeemed' ; the ta_tokens table contains approx 4 million records, and ta_tokenhist approx 10 millions. queries against the view itself on the primary key execute with no issues at all. I cannot however perform a meaningful join against this view. when i execute SELECT * FROM tokens.ta_tokenhist h INNER JOIN tokens.vw_tokenst ON h.token_id = t.token_id WHERE h.sarreport_id = 9 ; PG forms the full output of the view. the query plan is Hash Join (cost=1638048.47..3032073.73 rows=1 width=702) Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer) -> Hash Left Join (cost=1638042.45..3011803.15 rows=4052907 width=470) Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer) -> Hash Left Join (cost=1114741.93..2011923.86 rows=4052907 width=322) Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer) -> Hash Left Join (cost=559931.55..1093783.71 rows=4052907 width=174) Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer) -> Seq Scan on ta_tokens t (cost=0.00..73250.07 rows=4052907 width=26) -> Hash (cost=459239.41..459239.41 rows=4114456 width=152) -> Seq Scan on ta_tokenhist i (cost=0.00..459239.41 rows=4114456 width=152) Filter: ((status)::text = 'issued'::text) -> Hash (cost=459239.41..459239.41 rows=3905186 width=152) -> Seq Scan on ta_tokenhist s (cost=0.00..459239.41 rows=3905186 width=152) Filter: ((status)::text = 'sold'::text) -> Hash (cost=459239.41..459239.41 rows=2617645 width=152) -> Seq Scan on ta_tokenhist r (cost=0.00..459239.41 rows=2617645 width=152) Filter: ((status)::text = 'redeemed'::text) -> Hash (cost=6.01..6.01 rows=1 width=236) -> Index Scan using fkx_tokenhist__sarreports on ta_tokenhist h (cost=0.00..6.01 rows=1 width=236) Index Cond: ((sarreport_id)::integer = 9) I have also tried explicitly querying token_id in the view, hoping to force a nested loop: EXPLAIN SELECT * FROM tokens.vw_tokens__user WHERE token_id IN (SELECT token_id FROM tokens.ta_tokenhist WHERE sarreport_id = 9); QUERY PLAN -- Hash IN Join (cost=1638048.47..3032073.73 rows=1 width=470) Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer) -> Hash Left Join (cost=1638042.45..3011803.15 rows=4052907 width=470) Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer) -> Hash Left Join (cost=1114741.93..2011923.86 rows=4052907 width=322) Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer) -> Hash Left Join (cost=559931.55..1093783.71 rows=4052907 width=174) Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer) -> Seq Scan on ta_tokens t (cost=0.00..73250.07 rows=4052907 width=26)
[GENERAL] View with an outer join - is there any way to optimise this?
I have a view vw_tokens defined as CREATE VIEW tokens.vw_tokens AS SELECT -- too many columns to mention FROM tokens.ta_tokens t LEFT JOIN tokens.ta_tokenhist i ON t.token_id = i.token_id AND i.status = 'issued' LEFT JOIN tokens.ta_tokenhist s ON t.token_id = s.token_id AND s.status = 'sold' LEFT JOIN tokens.ta_tokenhist r ON t.token_id = r.token_id AND r.status = 'redeemed' ; the ta_tokens table contains approx 4 million records, and ta_tokenhist approx 10 millions. queries against the view itself on the primary key execute with no issues at all. I cannot however perform a meaningful join against this view. when i execute SELECT * FROM tokens.ta_tokenhist h INNER JOIN tokens.vw_tokenst ON h.token_id = t.token_id WHERE h.sarreport_id = 9 ; PG forms the full output of the view. the query plan is Hash Join (cost=1638048.47..3032073.73 rows=1 width=702) Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer) -> Hash Left Join (cost=1638042.45..3011803.15 rows=4052907 width=470) Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer) -> Hash Left Join (cost=1114741.93..2011923.86 rows=4052907 width=322) Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer) -> Hash Left Join (cost=559931.55..1093783.71 rows=4052907 width=174) Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer) -> Seq Scan on ta_tokens t (cost=0.00..73250.07 rows=4052907 width=26) -> Hash (cost=459239.41..459239.41 rows=4114456 width=152) -> Seq Scan on ta_tokenhist i (cost=0.00..459239.41 rows=4114456 width=152) Filter: ((status)::text = 'issued'::text) -> Hash (cost=459239.41..459239.41 rows=3905186 width=152) -> Seq Scan on ta_tokenhist s (cost=0.00..459239.41 rows=3905186 width=152) Filter: ((status)::text = 'sold'::text) -> Hash (cost=459239.41..459239.41 rows=2617645 width=152) -> Seq Scan on ta_tokenhist r (cost=0.00..459239.41 rows=2617645 width=152) Filter: ((status)::text = 'redeemed'::text) -> Hash (cost=6.01..6.01 rows=1 width=236) -> Index Scan using fkx_tokenhist__sarreports on ta_tokenhist h (cost=0.00..6.01 rows=1 width=236) Index Cond: ((sarreport_id)::integer = 9) I have also tried explicitly querying token_id in the view, hoping to force a nested loop: EXPLAIN SELECT * FROM tokens.vw_tokens__user WHERE token_id IN (SELECT token_id FROM tokens.ta_tokenhist WHERE sarreport_id = 9); QUERY PLAN -- Hash IN Join (cost=1638048.47..3032073.73 rows=1 width=470) Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer) -> Hash Left Join (cost=1638042.45..3011803.15 rows=4052907 width=470) Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer) -> Hash Left Join (cost=1114741.93..2011923.86 rows=4052907 width=322) Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer) -> Hash Left Join (cost=559931.55..1093783.71 rows=4052907 width=174) Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer) -> Seq Scan on ta_tokens t (cost=0.00..73250.07 rows=4052907 width=26) -> Hash (cost=459239.41..459239.41 rows=4114456 width=152) -> Seq Scan on ta_tokenhist i (cost=0.00..459239.41 rows=4114456 width=152) Filter: ((status)::text = 'issued'::text) -> Hash (cost=459239.41..459239.41 rows=3905186 width=152) -> Seq Scan on ta_tokenhist s (cost=0.00..459239.41 rows=3905186 width=152) Filter: ((status)::text = 'sold'::text) -> Hash (cost=459239.41..459239.41 rows=2617645 width=152) -> Seq Scan on ta_tokenhist r (cost=0.00..459239.41 rows=2617645 width=152) Filter: ((status)::text = 'redeemed'::text) -> Hash (cost=6.01..6.01 rows=1 width=4) -> Index Scan using fkx_tokenhist__sarreports on ta_tokenhist (cost=0.00..6.01 rows=1 width=4) Index Cond: ((sarreport_id)::integer = 9) Setting enable_mergejoin and enable_hashjoin to off results in a nested but still forms the view output. I can achieve the results i need be eliminating the view and writing the query manually