Konstantin Knizhnik <k.knizh...@postgrespro.ru> writes: > I think that the best approach is to generate two different paths: > original one, when projection is always done before sort and another one > with postponed projection of non-trivial columns. Then we compare costs > of two paths and choose the best one. > Unfortunately, I do not understand now how to implement it with existed > grouping_planner. > Do you think that it is possible?
After fooling with this awhile, I don't think it's actually necessary to do that. See attached proof-of-concept patch. Although this patch gets through our regression tests, that's only because it's conservative about deciding to postpone evaluation; if it tried to postpone evaluation in a query with window functions, it would fail miserably, because pull_var_clause doesn't know about window functions. I think that that's a clear oversight and we should extend it to offer the same sorts of behaviors as it does for Aggrefs. But that would be slightly invasive, there being a dozen or so callers; so I didn't bother to do it yet pending comments on this patch. I think it's probably also broken for SRFs in the tlist; we need to work out what semantics we want for those. If we postpone any SRF to after the Sort, we can no longer assume that a query LIMIT enables use of bounded sort (because the SRF might repeatedly return zero rows). I don't have a huge problem with that, but I think now would be a good time to nail down some semantics. Some other work that would be useful would be to refactor so that the cost_qual_eval operations aren't so redundant. But that's just a small time savings not a question of functionality. And we'd have to document that this changes the behavior for volatile functions. For the better, I think: this will mean that you get consistent results no matter whether the query is implemented by indexscan or seqscan-and-sort, which has never been true before. But it is a change. Do people approve of this sort of change in general, or this patch approach in particular? Want to bikeshed the specific when-to-postpone-eval policies implemented here? Other comments? regards, tom lane
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index 8937e71..b15fca1 100644 *** a/src/backend/optimizer/plan/planner.c --- b/src/backend/optimizer/plan/planner.c *************** static RelOptInfo *create_distinct_paths *** 126,131 **** --- 126,132 ---- RelOptInfo *input_rel); static RelOptInfo *create_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel, + PathTarget *target, double limit_tuples); static PathTarget *make_group_input_target(PlannerInfo *root, List *tlist); static List *postprocess_setop_tlist(List *new_tlist, List *orig_tlist); *************** static PathTarget *make_window_input_tar *** 134,139 **** --- 135,142 ---- List *tlist, List *activeWindows); static List *make_pathkeys_for_window(PlannerInfo *root, WindowClause *wc, List *tlist); + static PathTarget *make_sort_input_target(PlannerInfo *root, + PathTarget *final_target); /***************************************************************************** *************** grouping_planner(PlannerInfo *root, bool *** 1378,1383 **** --- 1381,1387 ---- int64 offset_est = 0; int64 count_est = 0; double limit_tuples = -1.0; + PathTarget *final_target; RelOptInfo *current_rel; RelOptInfo *final_rel; ListCell *lc; *************** grouping_planner(PlannerInfo *root, bool *** 1437,1442 **** --- 1441,1449 ---- /* Save aside the final decorated tlist */ root->processed_tlist = tlist; + /* Also extract the PathTarget form of the setop result tlist */ + final_target = current_rel->cheapest_total_path->pathtarget; + /* * Can't handle FOR [KEY] UPDATE/SHARE here (parser should have * checked already, but let's make sure). *************** grouping_planner(PlannerInfo *root, bool *** 1461,1467 **** else { /* No set operations, do regular planning */ ! PathTarget *final_target; PathTarget *grouping_target; PathTarget *scanjoin_target; bool have_grouping; --- 1468,1474 ---- else { /* No set operations, do regular planning */ ! PathTarget *sort_input_target; PathTarget *grouping_target; PathTarget *scanjoin_target; bool have_grouping; *************** grouping_planner(PlannerInfo *root, bool *** 1657,1678 **** final_target = create_pathtarget(root, tlist); /* * If we have window functions to deal with, the output from any * grouping step needs to be what the window functions want; ! * otherwise, it should just be final_target. */ if (activeWindows) grouping_target = make_window_input_target(root, tlist, activeWindows); else ! grouping_target = final_target; /* * If we have grouping or aggregation to do, the topmost scan/join ! * plan node must emit what the grouping step wants; otherwise, if ! * there's window functions, it must emit what the window functions ! * want; otherwise, it should emit final_target. */ have_grouping = (parse->groupClause || parse->groupingSets || parse->hasAggs || root->hasHavingQual); --- 1664,1694 ---- final_target = create_pathtarget(root, tlist); /* + * If ORDER BY was given, consider whether we should use a post-sort + * projection, and compute the adjusted target for preceding steps if + * so. + */ + if (parse->sortClause) + sort_input_target = make_sort_input_target(root, final_target); + else + sort_input_target = final_target; + + /* * If we have window functions to deal with, the output from any * grouping step needs to be what the window functions want; ! * otherwise, it should be sort_input_target. */ if (activeWindows) grouping_target = make_window_input_target(root, tlist, activeWindows); else ! grouping_target = sort_input_target; /* * If we have grouping or aggregation to do, the topmost scan/join ! * plan node must emit what the grouping step wants; otherwise, it ! * should emit grouping_target. */ have_grouping = (parse->groupClause || parse->groupingSets || parse->hasAggs || root->hasHavingQual); *************** grouping_planner(PlannerInfo *root, bool *** 1733,1739 **** current_rel = create_window_paths(root, current_rel, grouping_target, ! final_target, tlist, wflists, activeWindows); --- 1749,1755 ---- current_rel = create_window_paths(root, current_rel, grouping_target, ! sort_input_target, tlist, wflists, activeWindows); *************** grouping_planner(PlannerInfo *root, bool *** 1793,1798 **** --- 1809,1815 ---- { current_rel = create_ordered_paths(root, current_rel, + final_target, limit_tuples); } *************** create_distinct_paths(PlannerInfo *root, *** 3704,3715 **** --- 3721,3734 ---- * cheapest-total existing path. * * input_rel: contains the source-data Paths + * target: the output tlist the result Paths must emit * limit_tuples: estimated bound on the number of output tuples, * or -1 if no LIMIT or couldn't estimate */ static RelOptInfo * create_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel, + PathTarget *target, double limit_tuples) { Path *cheapest_input_path = input_rel->cheapest_total_path; *************** create_ordered_paths(PlannerInfo *root, *** 3737,3742 **** --- 3756,3767 ---- root->sort_pathkeys, limit_tuples); } + + /* Add projection step if needed */ + if (path->pathtarget != target) + path = apply_projection_to_path(root, ordered_rel, + path, target); + add_path(ordered_rel, path); } } *************** make_pathkeys_for_window(PlannerInfo *ro *** 4140,4145 **** --- 4165,4357 ---- } /* + * make_sort_input_target + * Generate appropriate PathTarget for initial input to Sort step. + * + * If the query has ORDER BY, this function chooses the target list to be + * computed by the node just below the Sort (and Distinct, if any) steps. + * This might or might not be identical to the query's final output tlist. + * + * The main argument for keeping the sort-input tlist the same as the final + * is that we avoid a separate projection node (which will be needed if + * they're different, because Sort can't project). However, there are also + * advantages to postponing tlist evaluation till after the Sort: it ensures + * a consistent order of evaluation for any volatile functions in the tlist, + * and if there's also a LIMIT, we can stop the query without ever computing + * tlist functions for later rows, which is beneficial for both volatile and + * expensive functions. + * + * Our current policy is to postpone volatile expressions till after the sort + * unconditionally (assuming that that's possible, ie they are in plain tlist + * columns and not ORDER/GROUP BY/DISTINCT columns). Expensive expressions + * are postponed if there is a LIMIT, or if root->tuple_fraction shows that + * partial evaluation of the query is possible (if neither is true, we expect + * to have to evaluate the expressions for every row anyway), or if there are + * any volatile expressions (since once we've put in a projection at all, + * it won't cost any more to postpone more stuff). + * + * Another issue that could potentially be considered here is that + * evaluating tlist expressions could result in data that's either wider + * or narrower than the input Vars, thus changing the volume of data that + * has to go through the Sort. However, we usually have only a very bad + * idea of the output width of any expression more complex than a Var, + * so for now it seems too risky to try to optimize on that basis. + * + * Note that if we do produce a modified sort-input target, and then the + * query ends up not using an explicit Sort, no particular harm is done: + * we'll initially use the modified target for the preceding path nodes, + * but then change them to the final target with apply_projection_to_path. + * Moreover, in such a case the guarantees about evaluation order of + * volatile functions still hold, since the rows are sorted already. + * + * This function has some things in common with make_group_input_target and + * make_window_input_target, though the detailed rules for what to do are + * different. We never flatten/postpone any grouping or ordering columns; + * those are needed before the sort. If we do flatten a particular + * expression, we leave Aggref and WindowFunc nodes alone, since those were + * computed earlier. + * + * 'final_target' is the query's final target list (in PathTarget form) + * + * The result is the PathTarget to be computed by the plan node immediately + * below the Sort step (and the Distinct step, if any). This will be + * exactly final_target if we decide a projection step wouldn't be helpful. + */ + static PathTarget * + make_sort_input_target(PlannerInfo *root, + PathTarget *final_target) + { + Query *parse = root->parse; + PathTarget *input_target; + int ncols; + bool *postpone_col; + bool have_volatile; + bool have_expensive; + List *flattenable_cols; + List *flattenable_vars; + int i; + ListCell *lc; + + /* Shouldn't get here unless query has ORDER BY */ + Assert(parse->sortClause); + + /* Inspect tlist and collect per-column information */ + ncols = list_length(final_target->exprs); + postpone_col = (bool *) palloc0(ncols * sizeof(bool)); + have_volatile = have_expensive = false; + + i = 0; + foreach(lc, final_target->exprs) + { + Expr *expr = (Expr *) lfirst(lc); + + /* + * If the column has a sortgroupref, assume it has to be evaluated + * before sorting. Generally such columns would be ORDER BY, GROUP + * BY, etc targets. One exception is columns that were removed from + * GROUP BY by remove_useless_groupby_columns() ... but those would + * only be Vars anyway. + */ + if (final_target->sortgrouprefs[i] == 0) + { + /* + * If it's volatile, that's an unconditional reason to postpone. + */ + if (contain_volatile_functions((Node *) expr)) + { + postpone_col[i] = true; + have_volatile = true; + } + else + { + /* + * Else check the cost. XXX it's annoying to have to do this + * when set_pathtarget_cost_width() just did it. Refactor to + * allow sharing the work? + */ + QualCost cost; + + cost_qual_eval_node(&cost, (Node *) expr, root); + + /* + * We arbitrarily define "expensive" as "more than 10X + * cpu_operator_cost". Note this will take in any PL function + * with default cost. + */ + if (cost.per_tuple >= 10 * cpu_operator_cost) + { + postpone_col[i] = true; + have_expensive = true; + } + } + } + i++; + } + + /* + * If we don't need a post-sort projection, just return final_target. + */ + if (!(have_volatile || + (have_expensive && + (parse->limitCount || root->tuple_fraction > 0)))) + return final_target; + + /* + * Otherwise, construct the sort-input target, taking all non-postponable + * columns and then adding Vars, PlaceHolderVars, Aggrefs, and WindowFuncs + * found in the postponable ones. XXX define a create_empty_pathtarget() + */ + input_target = palloc0(sizeof(PathTarget)); + flattenable_cols = NIL; + + i = 0; + foreach(lc, final_target->exprs) + { + Expr *expr = (Expr *) lfirst(lc); + + if (postpone_col[i]) + { + flattenable_cols = lappend(flattenable_cols, expr); + } + else + { + add_column_to_pathtarget(input_target, expr, + final_target->sortgrouprefs[i]); + } + i++; + } + + /* + * Pull out all the Vars and Aggrefs mentioned in flattenable columns, and + * add them to the result tlist if not already present. (Some might be + * there already because they're used directly as window/group clauses.) + * + * Note: it's essential to use PVC_INCLUDE_AGGREGATES here, so that the + * Aggrefs are placed in the Agg node's tlist and not left to be computed + * at higher levels. + * + * XXX need to extend pull_var_clause to handle windowfuncs specially. + */ + flattenable_vars = pull_var_clause((Node *) flattenable_cols, + PVC_INCLUDE_AGGREGATES, + PVC_INCLUDE_PLACEHOLDERS); + foreach(lc, flattenable_vars) + { + Expr *expr = (Expr *) lfirst(lc); + + if (!list_member(input_target->exprs, expr)) + add_column_to_pathtarget(input_target, expr, 0); + } + + /* clean up cruft */ + list_free(flattenable_vars); + list_free(flattenable_cols); + + /* XXX this represents even more redundant cost calculation ... */ + return set_pathtarget_cost_width(root, input_target); + } + + /* * get_cheapest_fractional_path * Find the cheapest path for retrieving a specified fraction of all * the tuples expected to be returned by the given relation.
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers