(2018/11/27 21:55), Etsuro Fujita wrote: > While working on [1], I noticed that since we don't set the selectivity > and cost of the local_conds (i.e., fpinfo->local_conds_sel and > fpinfo->local_conds_cost) properly in add_foreign_grouping_paths and > foreign_grouping_ok, estimate_path_cost_size produces considerably > underestimated results when use_remote_estimate. I think this would > cause problems in later planning steps. So, why not add something like > this to add_foreign_grouping_paths, as done in other functions such as > postgresGetForeignJopinPaths? > > --- a/contrib/postgres_fdw/postgres_fdw.c > +++ b/contrib/postgres_fdw/postgres_fdw.c > @@ -5496,6 +5496,19 @@ add_foreign_grouping_paths(PlannerInfo *root, > RelOptInfo\ > *input_rel, > if (!foreign_grouping_ok(root, grouped_rel, extra->havingQual)) > return; > > + /* > + * Compute the selectivity and cost of the local_conds, so we don't have > + * to do it over again for each path. The best we can do for these > + * conditions is to estimate selectivity on the basis of local > statistics. > + */ > + fpinfo->local_conds_sel = clauselist_selectivity(root, > + fpinfo->local_conds, > + 0, > + JOIN_INNER, > + NULL); > + > + cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root); > + > /* Estimate the cost of push down */ > estimate_path_cost_size(root, grouped_rel, NIL, NIL,&rows, > &width,&startup_cost,&total_cost);
Here is a patch for that. BTW another thing I noticed is this comment on costing aggregate pushdown paths using local statistics in estimate_path_cost_size: * Also, core does not care about costing HAVING expressions and * adding that to the costs. So similarly, here too we are not * considering remote and local conditions for costing. I think this was true when aggregate pushdown went in, but isn't anymore because of commit 7b6c07547190f056b0464098bb5a2247129d7aa2. So we should update estimate_path_cost_size so that it accounts for the selectivity and cost of the HAVING expressions as well? I think this comment needs to be updated at least, though. Best regards, Etsuro Fujita
*** a/contrib/postgres_fdw/expected/postgres_fdw.out --- b/contrib/postgres_fdw/expected/postgres_fdw.out *************** *** 3209,3214 **** select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 --- 3209,3216 ---- Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" WHERE (("C 1" < 100)) AND ((c2 = 6)) (6 rows) + -- Update local stats on ft2 + ANALYZE ft2; -- Add into extension alter extension postgres_fdw add operator class my_op_class using btree; alter extension postgres_fdw add function my_op_cmp(a int, b int); *** a/contrib/postgres_fdw/postgres_fdw.c --- b/contrib/postgres_fdw/postgres_fdw.c *************** *** 5496,5501 **** add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel, --- 5496,5514 ---- if (!foreign_grouping_ok(root, grouped_rel, extra->havingQual)) return; + /* + * Compute the selectivity and cost of the local_conds, so we don't have + * to do it over again for each path. The best we can do for these + * conditions is to estimate selectivity on the basis of local statistics. + */ + fpinfo->local_conds_sel = clauselist_selectivity(root, + fpinfo->local_conds, + 0, + JOIN_INNER, + NULL); + + cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root); + /* Estimate the cost of push down */ estimate_path_cost_size(root, grouped_rel, NIL, NIL, &rows, &width, &startup_cost, &total_cost); *** a/contrib/postgres_fdw/sql/postgres_fdw.sql --- b/contrib/postgres_fdw/sql/postgres_fdw.sql *************** *** 807,812 **** create operator class my_op_class for type int using btree family my_op_family a --- 807,815 ---- explain (verbose, costs off) select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2; + -- Update local stats on ft2 + ANALYZE ft2; + -- Add into extension alter extension postgres_fdw add operator class my_op_class using btree; alter extension postgres_fdw add function my_op_cmp(a int, b int);