> So, I am thinking about your approach of creating PartitionJoinPaths
> without actually creating child paths and then, at a later stage,
> actually planning the child joins. Here's a rough sketch of how that
> may be done.
>
> At the time of creating regular paths, we identify the join orders
> which can use partition-wise join and save those in the RelOptInfo of
> the parent table. If no such join order exists, we do not create
> PartitionJoinPaths for that relation. Otherwise, once we have
> considered all the join orders, i.e. in
> generate_partition_wise_join_paths(), we create one PartitionJoinPath
> for every path that has survived in the parent, or at least one for
> every path that has distinct properties like pathkeys or
> parameterisation, with those properties.
>
> At the time of creating plans, if a PartitionJoinPath is chosen, we
> actually create paths for every partition of that relation
> recursively. The path creation logic is carried out in a different
> memory context. Amongst the paths that survive, we choose the best
> path that has the same properties as the PartitionJoinPath. We would
> expect all parameterized paths to be retained, and any unparameterized
> path can be sorted to match the pathkeys of the reference
> PartitionJoinPath. We then create the plan out of this path, copy it
> into the outer memory context, and release the memory context used
> for path creation. This is similar to how prepared statements save
> their plans. Once we have the plan, the memory consumed by the paths
> won't be referenced, and hence cannot cause problems. At the end we
> create an Append/MergeAppend plan with all the child plans and return
> it.
>
> Costing a PartitionJoinPath needs more thought so that we don't end
> up with bad overall plans. Here's an idea. Partition-wise joins are
> better than unpartitioned ones because of the smaller sizes of the
> partitions. If we think of a join as an O(MN) operation, where M and
> N are the sizes of the unpartitioned tables being joined, a
> partition-wise join computes P joins, each of average order
> O(M/P * N/P), where P is the number of partitions; that is still
> O(MN), with the constant factor reduced P times. I think we need to
> apply similar logic to costing. Let's say the cost of a join is
> J(M, N) = S(M, N) + R(M, N), where S and R are the setup cost and the
> joining cost (for M, N rows) respectively. The cost of the
> partition-wise join would then be
> P * J(M/P, N/P) = P * S(M/P, N/P) + P * R(M/P, N/P). Each of the join
> methods has different S and R functions, which may not be linear in
> the number of rows. So, PartitionJoinPath costs are obtained from the
> corresponding regular path costs subjected to the above
> transformation. This way, we are protected from choosing a
> PartitionJoinPath when it is not optimal. Take the example of a join
> where the joining relations are very small, so that a hash join on
> the full relations is optimal compared to a hash join of each pair of
> partitions because of the setup cost. In such a case, the function
> which calculates the cost of hash table setup would produce almost
> the same cost for the full table as for each of the partitions, thus
> inflating P * S(M/P, N/P) compared to S(M, N).
>
> Let me know your comments.
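To make the quoted costing idea concrete, here is a toy, self-contained
illustration (the cost functions below are invented for demonstration and
are not PostgreSQL's costing code) of both sides of that argument: with a
largely fixed setup cost S and a run cost R that grows like M * N, the
transformed cost P * J(M/P, N/P) loses to J(M, N) on small relations and
wins on large ones.

#include <stdio.h>

/* Invented setup cost: a fixed overhead plus a small per-inner-row term. */
static double
setup_cost(double m, double n)
{
	return 100.0 + 0.01 * n;
}

/* Invented run cost: proportional to M * N, as in the O(MN) argument above. */
static double
run_cost(double m, double n)
{
	return 0.0001 * m * n;
}

/* J(M, N) = S(M, N) + R(M, N) */
static double
join_cost(double m, double n)
{
	return setup_cost(m, n) + run_cost(m, n);
}

int
main(void)
{
	double		sizes[] = {100.0, 10000.0};
	int			P = 10;
	int			i;

	for (i = 0; i < 2; i++)
	{
		double		M = sizes[i];
		double		N = sizes[i];

		printf("M = N = %6.0f: J(M, N) = %8.2f, P * J(M/P, N/P) = %8.2f\n",
			   M, join_cost(M, N), P * join_cost(M / P, N / P));
	}
	return 0;
}

With these made-up numbers, M = N = 100 gives J(M, N) of roughly 102
against P * J(M/P, N/P) of roughly 1001, so the setup term dominates and
the plain join is preferred, while M = N = 10000 gives about 10200 against
2100, where the partition-wise join wins.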
I tried to measure the impact of having a memory context reset 1000
times (once for each partition) with the attached patch. Without this
patch, make check in regress/ takes about 24 seconds on my laptop, and
with this patch it takes 26 seconds. That is almost a 10% increase in
time. I hope that's fine.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index d8c5dd3..abc34aa 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -54,6 +54,7 @@
 #include "utils/rel.h"
 #include "utils/selfuncs.h"
 #include "utils/lsyscache.h"
+#include "utils/memutils.h"
 #include "utils/syscache.h"
@@ -192,6 +193,9 @@ standard_planner(Query *parse, int cursorOptions, ParamListInfo boundParams)
 	Plan	   *top_plan;
 	ListCell   *lp,
 			   *lr;
+	MemoryContext temp_context;
+	int			i;
+	MemoryContext default_context;
 
 	/* Cursor options may come from caller or from DECLARE CURSOR stmt */
 	if (parse->utilityStmt &&
@@ -432,6 +436,24 @@ standard_planner(Query *parse, int cursorOptions, ParamListInfo boundParams)
 	result->invalItems = glob->invalItems;
 	result->nParamExec = glob->nParamExec;
+	temp_context = AllocSetContextCreate(CurrentMemoryContext,
+										 "TemporaryContext",
+										 ALLOCSET_DEFAULT_SIZES);
+
+	/* Test the time impact of creating and destroying 1000 memory contexts. */
+	for (i = 0; i < 1000; i++)
+	{
+		RelOptInfo *rel;
+		default_context = MemoryContextSwitchTo(temp_context);
+		rel = makeNode(RelOptInfo);
+		pfree(rel);
+		MemoryContextSwitchTo(default_context);
+		MemoryContextResetAndDeleteChildren(temp_context);
+	}
+
+	MemoryContextSwitchTo(default_context);
+	MemoryContextDelete(temp_context);
+
 	return result;
 }
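For reference, here is a minimal sketch (not part of the attached patch;
make_plan_for_child_join() is a placeholder name, and the real code would
live in the optimizer and include utils/memutils.h) of the usage pattern
the quoted proposal describes: build one child join's paths in a
short-lived context, copy the chosen plan back into the caller's context,
and throw everything else away with a single reset. That per-partition
switch-and-reset is the overhead the 1000-iteration loop above tries to
measure.

static Plan *
plan_child_join_in_temp_context(PlannerInfo *root, RelOptInfo *child_joinrel,
								MemoryContext temp_context)
{
	MemoryContext default_context;
	Plan	   *child_plan;
	Plan	   *result;

	/* Create and cost this child join's paths in the short-lived context. */
	default_context = MemoryContextSwitchTo(temp_context);
	child_plan = make_plan_for_child_join(root, child_joinrel);	/* placeholder */

	/* Copy the chosen plan into the caller's context ... */
	MemoryContextSwitchTo(default_context);
	result = copyObject(child_plan);

	/* ... and release all the paths and the temporary plan in one reset. */
	MemoryContextResetAndDeleteChildren(temp_context);

	return result;
}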