On Mon, Jul 31, 2017 at 8:32 AM, Robert Haas <robertmh...@gmail.com> wrote: > On Fri, Jul 14, 2017 at 3:02 AM, Ashutosh Bapat > <ashutosh.ba...@enterprisedb.com> wrote: >> Here's revised patch set with only 0004 revised. That patch deals with >> creating multi-level inheritance hierarchy from multi-level partition >> hierarchy. The original logic of recursively calling >> inheritance_planner()'s guts over the inheritance hierarchy required >> that for every such recursion we flatten many lists created by that >> code. Recursion also meant that root->append_rel_list is traversed as >> many times as the number of partitioned partitions in the hierarchy. >> Instead the revised version keep the iterative shape of >> inheritance_planner() intact, thus naturally creating flat lists, >> iterates over root->append_rel_list only once and is still easy to >> read and maintain. > > 0001-0003 look basically OK to me, modulo some cosmetic stuff. Regarding > 0004: > > + if (brel->reloptkind != RELOPT_BASEREL && > + brte->relkind != RELKIND_PARTITIONED_TABLE) > > I spent a lot of time staring at this code before I figured out what > was going on here. We're iterating over simple_rel_array, so the > reloptkind must be RELOPT_OTHER_MEMBER_REL if it isn't RELOPT_BASEREL. > But does that guarantee that rtekind is RTE_RELATION such that > brte->relkind will be initialized to a value? I'm not 100% sure.
Comment in RangeTblEntry says 952 /* 953 * Fields valid for a plain relation RTE (else zero): 954 * ... clipped portion for RTE_NAMEDTUPLESTORE related comment 960 Oid relid; /* OID of the relation */ 961 char relkind; /* relation kind (see pg_class.relkind) */ This means that relkind will be 0 when rtekind != RTE_RELATION. So, the condition holds. But code creating an RTE somewhere which is not in sync with this comment would create a problem. So your suggestion makes sense. > I > think it would be clearer to write this test like this: > > Assert(IS_SIMPLE_REL(brel)); > if (brel->reloptkind == RELOPT_OTHER_MEMBER_REL && > (brte->rtekind != RELOPT_BASEREL || Do you mean (brte_>rtekind != RTE_RELATION)? > brte->relkind != RELKIND_PARTITIONED_TABLE)) > continue; > > Note that the way you wrote the comment is says if it *is* another > REL, not if it's *not* a baserel; it's good if those kinds of little > details match between the code and the comments. I find the existing comment and code in this part of the function differ. The comment just above the loop on simple_rel_array[], talks about changing something in the child, but the very next line skips child relations and later a loop on append_rel_list makes changes to appropriate children. I guess, it's done that way to keep the code working even after we introduce some RELOPTKIND other than BASEREL or OTHER_MEMBER_REL for a simple rel. But your suggestion makes more sense. Changed it according to your suggestion. > > It is not clear to me what the motivation is for the API changes in > expanded_inherited_rtentry. They don't appear to be necessary. expand_inherited_rtentry() creates AppendRelInfos for all the children of a given parent and collects them in a list. The list is appended to root->append_rel_list at the end of the function. Now that function needs to do this recursively. This means that for a partitioned partition table its children's AppendRelInfos will be added to root->append_rel_list before AppendRelInfo of that partitioned partition table. inheritance_planner() assumes that the parent's AppendRelInfo comes before its children in append_rel_list.This assumption allows it to be simplified greately, retaining its iterative form. My earlier patches had recursive version of inheritance_planner(), which is complex. I have comments in this patch explaining this. Adding AppendRelInfos to root->append_rel_list as and when they are created would keep parent AppendRelInfos before those of children. But that function throws away the AppendRelInfo it created when their are no real children i.e. in partitioned table's case when has no leaf partitions. So, we can't do that. Hence, I chose to change the API to return the list of AppendRelInfos when the given RTE has real children. > If > they are necessary, you need to do a more thorough job updating the > comments. This one, in particular: > > * If so, add entries for all the child tables to the query's > * rangetable, and build AppendRelInfo nodes for all the child tables > * and add them to root->append_rel_list. If not, clear the entry's Done. > > And the comments could maybe say something like "We return the list of > appinfos rather than directly appending it to append_rel_list because > $REASON." Done. Please check the attached version. > > - * is a partitioned table. > + * RTE simply duplicates the parent *partitioned* table. > */ > - if (childrte->relkind != RELKIND_PARTITIONED_TABLE) > + if (childrte->relkind != RELKIND_PARTITIONED_TABLE || childrte->inh) > > This is another case where it's hard to understand the test from the comments. The current comment says it all, but it very cryptic manner. 1526 /* 1527 * Build an AppendRelInfo for this parent and child, unless the child 1528 * RTE simply duplicates the parent *partitioned* table. 1529 */ The comment makes sense in the context of this paragraph in the prologue 1364 * Note that the original RTE is considered to represent the whole 1365 * inheritance set. The first of the generated RTEs is an RTE for the same 1366 * table, but with inh = false, to represent the parent table in its role 1367 * as a simple member of the inheritance set. 1368 * The code avoids creating AppendRelInfos for a child which represents the parent in its role as a simple member of inheritance set. I have reworded it as 1526 /* 1527 * Build an AppendRelInfo for this parent and child, unless the child 1528 * RTE represents the parent as a simple member of inheritance set. 1529 */ > > + * In case of multi-level inheritance hierarchy, for every child we > require > + * PlannerInfo of its immediate parent. Hence we save those in a an array > > Say why. Also, need to fix "a an". Done. > > I'm a little bit surprised that this patch doesn't make any changes to > allpaths.c or relnode.c. > It looks to me like we'll generate paths for > the new RTEs that are being added. Are we going to end up with > multiple levels of Append nodes, then? Does the consider the way > consider_parallel is propagated up and down in set_append_rel_size() > and set_append_rel_pathlist() really work with multiple levels? Maybe > this is all fine; I haven't tried to verify it in depth. This has been discussed before, but I can not locate the mail answering these questions. accumulate_append_subpath() called from add_paths_to_append_rel() takes care of flattening Merge/Append paths. The planner code deals with the multi-level inheritance hierarchy created for subqueries with set operations. There is code in relnode.c to build the RelOptInfos for such subqueries recursively through using RangeTblEntry::inh flag. So there are no changes in allpaths.c and relnode.c. Are you looking for some other changes? > > Overall I think this is a reasonable direction to go but I'm worried > that there may be bugs lurking -- other code that needs adjusting that > hasn't been found, really. > Planner code is already aware of such hierarchies except DMLs, which this patch adjusts. We have fixed issues revealed by mine and Rajkumar's testing. What kinds of things you suspect? -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers