Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-28 Thread Tom Lane
"Kevin Grittner" writes: > Tom Lane wrote: >> The approach contemplated in the comment, of assembling some stats >> on-the-fly from the stats for individual child tables, doesn't >> seem real practical from a planning-time standpoint. > Can you give a thumbnail sketch of why that is? Well, it

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-28 Thread Kevin Grittner
Tom Lane wrote: > Yeah, that is expected. Nestloop inner indexscans have a rowcount > estimate that is different from that of the parent table --- the > parent's rowcount is what would be applicable for another type of > join, such as merge or hash, where the join condition is applied > at the

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-27 Thread Tom Lane
"Kevin Grittner" writes: > Tom Lane wrote: >> That does look weird. Do we have a self-contained test case? > I've been tinkering with this and I now have a self-contained test > case (SQL statements and run results attached). I've debugged through > it and things don't seem right in set_append_r

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-26 Thread Kevin Grittner
Tom Lane wrote: > That does look weird. Do we have a self-contained test case? I've been tinkering with this and I now have a self-contained test case (SQL statements and run results attached). I've debugged through it and things don't seem right in set_append_rel_pathlist, since childrel->rows

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-02 Thread Richard Neill
Kevin Grittner wrote: Richard Neill wrote: Regarding pg_statistic, I don't understand how to find the relevant rows - what am I looking for? (the pg_statistic table is 247M in size). I think the only relevant rows would be the ones with starelid = pg_class.oid for a table used in the qu

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-02 Thread Kevin Grittner
Richard Neill wrote: > Regarding pg_statistic, I don't understand how to find the > relevant rows - what am I looking for? (the pg_statistic table is > 247M in size). I think the only relevant rows would be the ones with starelid = pg_class.oid for a table used in the query, and I think you c

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-02 Thread Richard Neill
Kevin Grittner wrote: Tom Lane wrote: That does look weird. Do we have a self-contained test case? Not at the moment. It seems to only occur with relatively complex joins. Richard, could you capture the schema for the affected tables and views with pg_dump -s and also the related row

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-02 Thread Kevin Grittner
Tom Lane wrote: > That does look weird. Do we have a self-contained test case? Richard, could you capture the schema for the affected tables and views with pg_dump -s and also the related rows from pg_statistic? (The actual table contents aren't needed to see this issue.) -Kevin -- Sent v

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-02 Thread Tom Lane
"Kevin Grittner" writes: > Estimates extracted from the problem plan: > Nested Loop Left Join (rows=806903677108) > -> Nested Loop Left Join (rows=203176856) > -> Nested Loop Left Join (rows=51160) > -> Nested Loop Left Join (rows=28) > ->

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-02 Thread Kevin Grittner
Tom Lane wrote: > And yeah, I think the statistical support is pretty crummy. Do you know, off-hand, why the estimated row count for a "Nested Loop Left Join" is not the product of the estimates for the two sides? (I fear I'm missing something important which lead to the current estimates.)

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-02 Thread Tom Lane
"Kevin Grittner" writes: > Craig Ringer wrote: >> Doesn't the planner have some ... issues ... with estimation of >> row counts on joins over unions? Or is my memory just more faulty >> than usual? > So far I can't tell if it's views with unions or (as I suspect) > inheritance. As of recent ve

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-02 Thread Kevin Grittner
Craig Ringer wrote: > Some of those tables are views composed of multiple unions, too, > by the looks of things. > > Doesn't the planner have some ... issues ... with estimation of > row counts on joins over unions? Or is my memory just more faulty > than usual? So far I can't tell if it's vi

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-02 Thread Craig Ringer
On 2/12/2009 7:08 PM, Matthew Wakeling wrote: On Tue, 1 Dec 2009, Jean-Michel Pouré wrote: PostgreSQL query analyzer needs to run a couple of times before it can rewrite and optimize the query. Make sure demand_id, id and join IDs carry indexes. Huh? At what point does the planner carry over p

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-02 Thread Matthew Wakeling
On Tue, 1 Dec 2009, Jean-Michel Pouré wrote: PostgreSQL query analyzer needs to run a couple of times before it can rewrite and optimize the query. Make sure demand_id, id and join IDs carry indexes. Huh? At what point does the planner carry over previous plans and use them to further optimise

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-01 Thread Richard Neill
Dear Kevin, Thanks for a very helpful reply. Kevin Grittner wrote: Richard Neill wrote: Am I wrong in thinking that ORDER BY is always applied after the main query is run? Yes, you are wrong to think that. It compares the costs of various plans, and when it has an index with the high or

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-01 Thread Kevin Grittner
Richard Neill wrote: > Am I wrong in thinking that ORDER BY is always applied after the > main query is run? Yes, you are wrong to think that. It compares the costs of various plans, and when it has an index with the high order portion matching your ORDER BY clause, it may think that it can s

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-01 Thread Richard Neill
Jean-Michel Pouré wrote: Le mardi 01 décembre 2009 à 18:52 +, Richard Neill a écrit : Is this normal? Have I hit a bug? PostgreSQL query analyzer needs to run a couple of times before it can rewrite and optimize the query. Make sure demand_id, id and join IDs carry indexes. I did, and

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-01 Thread Jean-Michel Pouré
Le mardi 01 décembre 2009 à 18:52 +, Richard Neill a écrit : > Is this normal? Have I hit a bug? PostgreSQL query analyzer needs to run a couple of times before it can rewrite and optimize the query. Make sure demand_id, id and join IDs carry indexes. Run EXPLAIN ANALYSE your_query to underst

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-01 Thread Kevin Grittner
Richard Neill wrote: > I'd expect the ORDER BY to be the last thing that runs > Nested Loop Left Join (cost=0.00..727737158.77 > rows=806903677108 width=195) (actual time=31739.052..32862.322 > rows=15 loops=1) It probably would if it knew there were going to be 15 rows to sort. It is est

[PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-01 Thread Richard Neill
Dear All, I don't know if this is a stupid question, or not, but I can't understand the following. I have a pretty simple query, which runs in about 7ms SELECT * FROM h.inventory WHERE demand_id =289276563; The result of this is a 15 row x 17 column table. However, I want this to be sort