2011/6/4 Simon Riggs <si...@2ndquadrant.com>: > > I like your simple patch and looks like it fixes your concern.
Thanks for your interest. I forgot to mention but this type of query is quite general in one-to-many entities and likely to be generated by simple ORMappers. > Your problem statement ignores the fact that most people would not > write the "original query" like this > > select m_id, sum_len from size_m m inner join(select m_id, > sum(length(val)) as sum_len from size_l group by m_id)l on m.id = > l.m_id where val = '1'; > > they would write it like this > > select m_id, sum(length(val)) > from size_m m join size_l l on m.id = l.m_id > where val = '1' > group by m_id; Hm, I didn't notice this hand transformation. Will check what's happened. But my example is simplified and it might be likely that some other joins (not uniquely) to size_m. > In terms of costing, I think it would be likely that we can apply > simple heuristics. We already assume that applying quals down to the > lowest level possible make sense. I would guess that anything that > reduces the number of rows should be pushed down as far as possible. > I'm sure there are cases where that isn't true, but lets not stop from > solving simple general cases because of the theoretical existence of > complex cases Agreed. After more thought, push-down-qual approach would be better than push down/pull up aggregates. The only concern was multiple aggregate call case in such cases like more rows than one are qualified in size_m. But it is clear that each aggregate call scans only qualified size_l rows if we can push down parameter qual to the subquery. Nestloop with parameter push down to aggregate subquery appoach is more general because it doesn't concern about "primary key" issue. You can push it down whenever the total execution cost is smaller. So, I'm now walking through planner code and finally I found the clue to start. First, the current problem of parameterized nestloop idea in general case is that while nested index scan pushes parameter to the other join relation, more general approach needs to do it with subquery. A nested index scan Path is generated in match_unsorted_outer(), which is at much deeper of make_rel_from_joinlist(), which is after set_base_rel_pathlist(), which contains set_subquery_pathlist(), which calls subquery_planner(). This means that if you want to add code to generate "general nestloop with parameter push down" during join search process, it is too late to push down the parameter to subquery, because subquery was already planned at that time. So we need to add new operation before subquery_planner(). It is hard because any join-relevant information is not ready at the stage. But I bet some simple cases like my aggregate-join can find it possible to make parameter from join qual. (I haven't yet written any code nor proven my theory). In this case we need to plan subquery twice, one with pure and the other with parameterized. Other than subquery case, LATERAL will be ok with near the nested index scan approach, since the joinned relation is FunctionScan, which is not planned lie subquery. "s JOIN(l1 LEFT JOIN l2)" case is unclear which of subquery or index scan. Maybe the third way, because l1 LEFT JOIN l2 is inside deconstructed jointree which is not planned subquery but also not plain RelOptInfo like base relation / function scan. Regards, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers