On Sat, Jul 18, 2009 at 12:04 PM, Greg Stark <gsst...@mit.edu> wrote:
> On Sat, Jul 18, 2009 at 6:35 PM, Alan Li<a...@truviso.com> wrote: > > Yeah, are you running into the same issue as well? I tried to figure out > a > > way around the O(n^2) behavior, but there were no existing direct > > relationship between the child subpath and its associated AppendRelInfo. > I > > think an AppenRelInfo dynahash would work and just have it hash by the > > child_relid. > > I don't see any place in my patch where I had to do anything like > this. I do have to loop through all the appendrel elements and skip > over the ones which don't belong to this appendrel which seems weird > to me but it isn't n^2. > Generating a normal Append node you're generating a brand new path for > each child and attaching them to the append path. It looks like you're > allowing the append rel to generate paths and then digging around > looking at those paths. I wonder if that's the right approach. > > > >> The other thing is it would be nice if we could avoid making separate > >> subplans for each child and instead make one for the whole structure > >> including the aggregate. It would at the very least make the explain > >> output prettier, but I think it would avoid repeated execution of the > >> aggregate in some cases as well. > > > > How would this plan look? I think the repeated execution of the > aggregate > > comes from having to process the output of each child. So it's O(n) > > executions where n is the number of children, assuming each child has the > > appropriate index for this optimization. > > No I just mean instead of generating > > InitPlan 1 > Limit > Index Scan > InitPlan 2 > Limit > Index Scan > Aggregate > Append > Result > Result > > I think it would be better to generate this: > > InitPlan 1 > Aggregate > Append > Limit 1 > IndexScan > Limit 1 > IndexScan > Result > > The reason I think this is better is because if the subquery is > executed multiple times it will just return the one precalculated > result. In your plan it will have all the child minimums precalculated > but will still have to re-execute the aggregate and append node. > That's not terribly expensive but we might as well generate the > simpler more efficient plan. > > > > The other optimization that I thought was that the optimizer might use > the > > check constraints (in the range partitioning) on the column that I want > to > > do a MIN/MAX on. Then it'll only have to execute the subplan in one > child > > partition and the parent partition. But it was more of a wild idea on my > > part, not sure if that's possible. > > Yes, I think this is the long-term goal. That's the whole "better > representation of partitions" plotline. To do this efficiently the > planner should know what the partition key is and what the > partitioning structure is. > > In an ideal world we would be able to collapse the whole structure and > eliminate the append relation entirely. To do that we need some way to > indicate that the parent relation is provably empty. I had in mind to > mark it as read-only and the statistics as authoritative since that > seems more useful than just being able to mark it empty. I think there > are a lot of other interesting things we could do with statistics if > we knew they were authoritative for a read-only table. (The read-only > property we would need here is very strong. There would have to be a > vacuum freeze and moreover we would have to know that all tuples were > successfully frozen.) > > -- > greg > http://mit.edu/~gsstark/resume.pdf <http://mit.edu/%7Egsstark/resume.pdf> > > > Attached is an updated patch that removes the O(n^2) behavior and the awkwardness of optimizing the seqscan path as the plan was about to be created. Now, the optimization is considered when appendrel is generating the paths. I also changed the plan as you had suggested. It now looks like this: postgres=# explain select max(b) from foo_archive; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=1.22..1.23 rows=1 width=8) -> Append (cost=0.00..1.13 rows=32 width=8) -> Limit (cost=0.00..0.04 rows=1 width=8) -> Index Scan Backward using foo_archive_idx on foo_archive (cost=0.00..73.35 rows=1940 width=8) -> Limit (cost=0.00..0.03 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_01_idx on foo_archive_2007_01_01 foo_archive (cost=0.00..2723.24 rows=86399 width=8) -> Limit (cost=0.00..0.03 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_02_idx on foo_archive_2007_01_02 foo_archive (cost=0.00..2719.26 rows=86400 width=8) -> Limit (cost=0.00..0.03 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_03_idx on foo_archive_2007_01_03 foo_archive (cost=0.00..2719.26 rows=86400 width=8) -> Limit (cost=0.00..0.03 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_04_idx on foo_archive_2007_01_04 foo_archive (cost=0.00..2719.26 rows=86400 width=8) -> Limit (cost=0.00..0.03 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_05_idx on foo_archive_2007_01_05 foo_archive (cost=0.00..2719.26 rows=86400 width=8) -> Limit (cost=0.00..0.03 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_06_idx on foo_archive_2007_01_06 foo_archive (cost=0.00..2719.26 rows=86400 width=8) -> Limit (cost=0.00..0.03 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_07_idx on foo_archive_2007_01_07 foo_archive (cost=0.00..2719.26 rows=86400 width=8) -> Limit (cost=0.00..0.03 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_08_idx on foo_archive_2007_01_08 foo_archive (cost=0.00..2719.26 rows=86400 width=8) -> Limit (cost=0.00..0.03 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_09_idx on foo_archive_2007_01_09 foo_archive (cost=0.00..2719.26 rows=86400 width=8) -> Limit (cost=0.00..0.03 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_10_idx on foo_archive_2007_01_10 foo_archive (cost=0.00..2719.26 rows=86400 width=8) -> Limit (cost=0.00..0.03 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_11_idx on foo_archive_2007_01_11 foo_archive (cost=0.00..2719.26 rows=86400 width=8) -> Limit (cost=0.00..0.03 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_12_idx on foo_archive_2007_01_12 foo_archive (cost=0.00..1568.27 rows=49601 width=8) -> Limit (cost=0.00..0.04 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_13_idx on foo_archive_2007_01_13 foo_archive (cost=0.00..73.35 rows=1940 width=8) -> Limit (cost=0.00..0.04 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_14_idx on foo_archive_2007_01_14 foo_archive (cost=0.00..73.35 rows=1940 width=8) -> Limit (cost=0.00..0.04 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_15_idx on foo_archive_2007_01_15 foo_archive (cost=0.00..73.35 rows=1940 width=8) -> Limit (cost=0.00..0.04 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_16_idx on foo_archive_2007_01_16 foo_archive (cost=0.00..73.35 rows=1940 width=8) -> Limit (cost=0.00..0.04 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_17_idx on foo_archive_2007_01_17 foo_archive (cost=0.00..73.35 rows=1940 width=8) -> Limit (cost=0.00..0.04 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_18_idx on foo_archive_2007_01_18 foo_archive (cost=0.00..73.35 rows=1940 width=8) -> Limit (cost=0.00..0.04 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_19_idx on foo_archive_2007_01_19 foo_archive (cost=0.00..73.35 rows=1940 width=8) -> Limit (cost=0.00..0.04 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_20_idx on foo_archive_2007_01_20 foo_archive (cost=0.00..73.35 rows=1940 width=8) -> Limit (cost=0.00..0.04 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_21_idx on foo_archive_2007_01_21 foo_archive (cost=0.00..73.35 rows=1940 width=8) -> Limit (cost=0.00..0.04 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_22_idx on foo_archive_2007_01_22 foo_archive (cost=0.00..73.35 rows=1940 width=8) -> Limit (cost=0.00..0.04 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_23_idx on foo_archive_2007_01_23 foo_archive (cost=0.00..73.35 rows=1940 width=8) -> Limit (cost=0.00..0.04 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_24_idx on foo_archive_2007_01_24 foo_archive (cost=0.00..73.35 rows=1940 width=8) -> Limit (cost=0.00..0.04 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_25_idx on foo_archive_2007_01_25 foo_archive (cost=0.00..73.35 rows=1940 width=8) -> Limit (cost=0.00..0.04 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_26_idx on foo_archive_2007_01_26 foo_archive (cost=0.00..73.35 rows=1940 width=8) -> Limit (cost=0.00..0.04 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_27_idx on foo_archive_2007_01_27 foo_archive (cost=0.00..73.35 rows=1940 width=8) -> Limit (cost=0.00..0.04 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_28_idx on foo_archive_2007_01_28 foo_archive (cost=0.00..73.35 rows=1940 width=8) -> Limit (cost=0.00..0.04 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_29_idx on foo_archive_2007_01_29 foo_archive (cost=0.00..73.35 rows=1940 width=8) -> Limit (cost=0.00..0.04 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_30_idx on foo_archive_2007_01_30 foo_archive (cost=0.00..73.35 rows=1940 width=8) -> Limit (cost=0.00..0.04 rows=1 width=8) -> Index Scan Backward using foo_archive_2007_01_31_idx on foo_archive_2007_01_31 foo_archive (cost=0.00..73.35 rows=1940 width=8) (66 rows)
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index b375902..0c839b9 100644 *** a/src/backend/optimizer/path/allpaths.c --- b/src/backend/optimizer/path/allpaths.c *************** *** 27,32 **** --- 27,33 ---- #include "optimizer/pathnode.h" #include "optimizer/paths.h" #include "optimizer/plancat.h" + #include "optimizer/planmain.h" #include "optimizer/planner.h" #include "optimizer/prep.h" #include "optimizer/restrictinfo.h" *************** set_append_rel_pathlist(PlannerInfo *roo *** 289,294 **** --- 290,296 ---- double *parent_attrsizes; int nattrs; ListCell *l; + bool can_optimize_minmax; /* * Initialize to compute size estimates for whole append relation. *************** set_append_rel_pathlist(PlannerInfo *roo *** 309,314 **** --- 311,319 ---- nattrs = rel->max_attr - rel->min_attr + 1; parent_attrsizes = (double *) palloc0(nattrs * sizeof(double)); + + can_optimize_minmax = can_optimize_append_minmax(root); + /* * Generate access paths for each member relation, and pick the cheapest * path for each one. *************** set_append_rel_pathlist(PlannerInfo *roo *** 427,433 **** --- 432,454 ---- subpaths = list_concat(subpaths, ((AppendPath *) childpath)->subpaths); else + { + if (can_optimize_minmax) + { + Path *minmax_path; + List *targetlist = (List *) + adjust_appendrel_attrs((Node *) root->parse->targetList, + appinfo); + + minmax_path = optimize_append_minmax(root, targetlist, childrel, + childpath); + + if (minmax_path) + childpath = minmax_path; + } + subpaths = lappend(subpaths, childpath); + } /* * Accumulate size information from each child. diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c index 6e5c251..6643486 100644 *** a/src/backend/optimizer/plan/createplan.c --- b/src/backend/optimizer/plan/createplan.c *************** *** 20,25 **** --- 20,26 ---- #include <math.h> #include "access/skey.h" + #include "catalog/pg_type.h" #include "nodes/makefuncs.h" #include "nodes/nodeFuncs.h" #include "optimizer/clauses.h" *************** create_append_plan(PlannerInfo *root, Ap *** 546,551 **** --- 547,554 ---- List *tlist = build_relation_tlist(best_path->path.parent); List *subplans = NIL; ListCell *subpaths; + bool can_optimize_minmax; + Node *limitCount; /* * It is possible for the subplans list to contain only one entry, or even *************** create_append_plan(PlannerInfo *root, Ap *** 566,577 **** NULL); } /* Normal case with multiple subpaths */ foreach(subpaths, best_path->subpaths) { Path *subpath = (Path *) lfirst(subpaths); ! subplans = lappend(subplans, create_plan(root, subpath)); } plan = make_append(subplans, false, tlist); --- 569,591 ---- NULL); } + can_optimize_minmax = can_optimize_append_minmax(root); + if (can_optimize_minmax) + limitCount = (Node *) makeConst(INT8OID, -1, sizeof(int64), + Int64GetDatum(1), false, + FLOAT8PASSBYVAL); + /* Normal case with multiple subpaths */ foreach(subpaths, best_path->subpaths) { Path *subpath = (Path *) lfirst(subpaths); + Plan *subplan = create_plan(root, subpath); ! /* apply the LIMIT 1 on top of the index scan */ ! if (can_optimize_minmax && IsA(subplan, IndexScan)) ! subplan = (Plan *) make_limit(subplan, NULL, limitCount, 0, 1); ! ! subplans = lappend(subplans, subplan); } plan = make_append(subplans, false, tlist); diff --git a/src/backend/optimizer/plan/planagg.c b/src/backend/optimizer/plan/planagg.c index 34e9617..f7cdc5e 100644 *** a/src/backend/optimizer/plan/planagg.c --- b/src/backend/optimizer/plan/planagg.c *************** optimize_minmax_aggregates(PlannerInfo * *** 214,219 **** --- 214,320 ---- } /* + * can_optimize_append_minmax - check whether optimizing MIN/MAX via indexes + * can potentially be pushed down to child relations. + * + * This checks whether the query is of the form: + * 1. SELECT MAX(col) FROM tab WHERE ...; + * 2. SELECT MIN(col) FROM tab WHERE ...; + */ + bool + can_optimize_append_minmax(PlannerInfo *root) + { + Query *parse = root->parse; + List *tlist = parse->targetList; + List *aggs_list; + + /* Nothing to do if query has no aggregates */ + if (!parse->hasAggs) + return false; + + Assert(!parse->setOperations); /* shouldn't get here if a setop */ + Assert(parse->rowMarks == NIL); /* nor if FOR UPDATE */ + + /* + * Reject unoptimizable cases. + * + * We don't handle GROUP BY or windowing, because our current + * implementations of grouping require looking at all the rows anyway, and + * so there's not much point in optimizing MIN/MAX. + */ + if (parse->groupClause || parse->hasWindowFuncs) + return false; + + /* We're only supporting a single MIN/MAX in the targetlist. */ + if (list_length(tlist) > 1 || parse->havingQual) + return false; + + aggs_list = NIL; + if (find_minmax_aggs_walker((Node *) tlist, &aggs_list)) + return false; + + if (list_length(aggs_list) > 1) + return false; + + return true; + } + + /* + * optimize_append_minmax - returns the IndexPath for a child relation as an + * optimization for MIN/MAX. See can_optimize_append_minmax() above. Returns + * NULL if no IndexPath is available or it is not profitable compared to the + * existing best_path. + */ + Path * + optimize_append_minmax(PlannerInfo *root, List *tlist, RelOptInfo *rel, + Path *best_path) + { + List *aggs_list; + Cost total_cost; + MinMaxAggInfo *info; + Path agg_p; + Path minmax_agg_p; + + /* Pass 1: Get the one MIN/MAX from the targetlist */ + aggs_list = NIL; + if (find_minmax_aggs_walker((Node *) tlist, &aggs_list)) + return NULL; + + Assert(list_length(aggs_list) == 1); + + /* Pass 2: see if MIN/MAX is optimizable */ + info = (MinMaxAggInfo *) linitial(aggs_list); + if (!build_minmax_path(root, rel, info)) + return NULL; + total_cost = info->pathcost; + + /* + * Make the cost comparison. + * + * We are comparing the cost of the best_path that has to pass all of its + * output as input to the aggregate node against the one output tuple in + * the optimized path. + * + * Note that we don't include evaluation cost of the tlist here; this is + * OK since it isn't included in best_path's cost either, and should be + * the same in either case. + */ + cost_agg(&agg_p, root, AGG_PLAIN, list_length(aggs_list), + 0, 0, + best_path->startup_cost, best_path->total_cost, + best_path->parent->rows); + + /* only one input tuple expected from the minmax index path */ + cost_agg(&minmax_agg_p, root, AGG_PLAIN, list_length(aggs_list), + 0, 0, 0, total_cost, 1); + + if (minmax_agg_p.total_cost > agg_p.total_cost) + return NULL; /* too expensive */ + + return (Path *) info->path; + } + + /* * find_minmax_aggs_walker * Recursively scan the Aggref nodes in an expression tree, and check * that each one is a MIN/MAX aggregate. If so, build a list of the diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h index 0dd2bcc..61dd0a5 100644 *** a/src/include/optimizer/planmain.h --- b/src/include/optimizer/planmain.h *************** extern void query_planner(PlannerInfo *r *** 34,39 **** --- 34,42 ---- */ extern Plan *optimize_minmax_aggregates(PlannerInfo *root, List *tlist, Path *best_path); + extern bool can_optimize_append_minmax(PlannerInfo *root); + extern Path *optimize_append_minmax(PlannerInfo *root, List *tlist, + RelOptInfo *rel, Path *best_path); /* * prototypes for plan/createplan.c
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers