On Mon, Jul 20, 2009 at 12:29 PM, Greg Stark <gsst...@mit.edu> wrote:
> On Mon, Jul 20, 2009 at 7:40 PM, Alan Li<a...@truviso.com> wrote: > > Attached is an updated patch that removes the O(n^2) behavior and the > > awkwardness of optimizing the seqscan path as the plan was about to be > > created. Now, the optimization is considered when appendrel is > generating > > the paths. > > > > I also changed the plan as you had suggested. It now looks like this: > > Hm, that's not quite the plan I described either. I had in mind to > mirror the existing min/max optimization which put the whole thing in > an InitPlan and put a Result node in place of the actual plan. Your > original patch did that for each subplan but I thought it would be > better to do it for the whole aggregate. > > However the more I think about it the more I don't understand why Tom > arranged to do that for the min/max optimization anyways. For > subqueries where that makes sense that would surely happen anyways > such as in the first example below. And for joins where it's necessary > the planner knows to put a Materialize node which sounds just as good. > > Here's what happens with your current patch in the case I was > concerned about -- note that the planner automatically detects the > case and turns the whole subplan into an initplan anyways: > > postgres=# explain select * from y where j = (select min(i) from x) ; > QUERY PLAN > > ---------------------------------------------------------------------------------------------- > Seq Scan on y (cost=40.12..80.12 rows=12 width=4) > Filter: (j = $0) > InitPlan 1 (returns $0) > -> Aggregate (cost=40.11..40.12 rows=1 width=4) > -> Append (cost=0.00..34.10 rows=2403 width=4) > -> Limit (cost=0.00..0.03 rows=1 width=4) > -> Index Scan using xi on x (cost=0.00..80.25 > rows=2400 width=4) > -> Limit (cost=0.00..0.03 rows=1 width=4) > -> Index Scan using xi2 on x2 x > (cost=0.00..80.25 rows=2400 width=4) > -> Limit (cost=0.00..0.03 rows=1 width=4) > -> Index Scan using xi3 on x3 x > (cost=0.00..80.25 rows=2400 width=4) > -> Seq Scan on x1 x (cost=0.00..34.00 rows=2400 width=4) > (12 rows) > > > And here's another case where you wouldn't want multiple execution -- > but the planner here figures out to materialize the result: > > postgres=# explain select * from y left outer join (select min(i) as i > from x) as x on (i=j); > QUERY PLAN > > -------------------------------------------------------------------------------------------------- > Nested Loop Left Join (cost=40.13..128.13 rows=2400 width=8) > Join Filter: ((min(public.x.i)) = y.j) > -> Seq Scan on y (cost=0.00..34.00 rows=2400 width=4) > -> Materialize (cost=40.13..40.14 rows=1 width=4) > -> Aggregate (cost=40.11..40.12 rows=1 width=4) > -> Append (cost=0.00..34.10 rows=2403 width=4) > -> Limit (cost=0.00..0.03 rows=1 width=4) > -> Index Scan using xi on x > (cost=0.00..80.25 rows=2400 width=4) > -> Limit (cost=0.00..0.03 rows=1 width=4) > -> Index Scan using xi2 on x2 x > (cost=0.00..80.25 rows=2400 width=4) > -> Limit (cost=0.00..0.03 rows=1 width=4) > -> Index Scan using xi3 on x3 x > (cost=0.00..80.25 rows=2400 width=4) > -> Seq Scan on x1 x (cost=0.00..34.00 rows=2400 > width=4) > (13 rows) > > > So I'm a bit puzzled why Tom's min/max optimization bothers with the > whole Initplan/Result business itself anyways. > > -- > greg > http://mit.edu/~gsstark/resume.pdf <http://mit.edu/%7Egsstark/resume.pdf> > > > Yes, this is what I saw when I was testing this, and I think that it would be best to leave the decision of creating an initPlan/materialize to the optimization of the super-query. That's why I didn't bother to specifically put an InitPlan on top of the Aggregate in the new patch. Alan