(looking at the v5 patch but replying to an older email) On 2018/07/31 16:03, David Rowley wrote: > I've attached a complete v4 patch. > >> By the way, when going over the updated code, I noticed that the code >> around child_parent_tupconv_maps could use some refactoring too. >> Especially, I noticed that ExecSetupChildParentMapForLeaf() allocates >> child-to-parent map array needed for transition tuple capture even if not >> needed by any of the leaf partitions. I'm attaching here a patch that >> applies on top of your v3 to show what I'm thinking we could do. > > Maybe we can do that as a follow-on patch.
We probably could, but I think it would be a good idea get rid of *all* redundant allocations due to tuple routing in one patch, if that's the mission of this thread and the patch anyway. > I think what we have so far > is already ended up quite complex to review. What do you think? Yeah, it's kind of complex, but at least it seems that we're clear on the point that what we're trying to do here is to try to get rid of redundant allocations. Parts of the patch that appear complex seems to be around the allocation of various maps. Especially the child-to-parent maps, which as things stand today, come from two arrays -- a per-update-subplan array that's needed by update tuple routing proper and per-leaf partition array (one in PartitionTupleRouting) that's needed by transition capture machinery. The original coding was such the update tuple routing handling code would try to avoid allocating the per-update-subplan array if it saw that per-leaf partition array was already set up in PartitionTupleRouting, because transition capture is active in the query. For update-tuple-routing code to be able to use maps from the per-leaf array, it would have to know which update-subplans mapped to which tuple-routing-initialized partitions. That was maintained in the subplan_partition_offset array that's now gone with this patch, because we no longer want to fix the tuple-routing-initialized-partition offsets in advance. So, it's better to dissociate per-subplan maps which are initialized during ExecInitModifyTable from per-leaf maps which are initialized lazily when tuple routing initializes a partition, which is what my portion of the patch did. As mentioned in my last email, I still think it would be a good idea to simplify the handling of child-to-parent maps in PartitionTupleRouting even further, while we're at improving the code in this area. I revised the patch such that it makes the handling of maps in PartitionTupleRouting even more uniform. With that patch, we no longer have two completely unrelated places in the code managing parent-to-child and child-to-parent maps, even though both arrays are in the same PartitionTupleRouting. Please find the updated patch attached with this email. Thanks, Amit
From 1a814f5a40774a51bf702757ec91e02f418a5aba Mon Sep 17 00:00:00 2001 From: amit <amitlangot...@gmail.com> Date: Fri, 3 Aug 2018 14:09:51 +0900 Subject: [PATCH v2 2/2] Refactor handling of child_parent_tupconv_maps They're currently created and handed out by TupConvMapForLeaf, which makes them look somewhat different from parent_to_child_tupconv_maps. In fact, both contain conversion maps possibly needed between a partition initialized by tuple routing and the root parent in one or the other direction, so it seems odd that parent-to-child ones are created in ExecInitRoutingInfo, whereas child-to-parent ones in TupConvMapForLeaf. The child-to-parent ones are only needed if transition capture is active, but we can already check that in ExecInitRoutingInfo via the incoming ModifyTableState (sure, we need to teach CopyFrom to add the necessary information into its dummy ModifyTableState, but that doesn't seem too bad). This way, we can manage both parent-to-child and child-to-parent maps in similar ways, and more importantly, use the same criterion of checking whether a partition's slot in the respective array is NULL or not to conclude if tuple conversion is necessary or not. --- src/backend/commands/copy.c | 37 +++++------- src/backend/executor/execPartition.c | 102 +++++++++------------------------ src/backend/executor/nodeModifyTable.c | 11 ++-- src/include/executor/execPartition.h | 33 ++++++----- 4 files changed, 64 insertions(+), 119 deletions(-) diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c index 752ba3d767..6f4069d321 100644 --- a/src/backend/commands/copy.c +++ b/src/backend/commands/copy.c @@ -2510,8 +2510,12 @@ CopyFrom(CopyState cstate) /* * If there are any triggers with transition tables on the named relation, * we need to be prepared to capture transition tuples. + * + * Because partition tuple routing would like to know about whether + * transition capture is active, we also set it in mtstate, which is + * passed to ExecFindPartition below. */ - cstate->transition_capture = + cstate->transition_capture = mtstate->mt_transition_capture = MakeTransitionCaptureState(cstate->rel->trigdesc, RelationGetRelid(cstate->rel), CMD_INSERT); @@ -2521,19 +2525,8 @@ CopyFrom(CopyState cstate) * CopyFrom tuple routing. */ if (cstate->rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) - { proute = ExecSetupPartitionTupleRouting(NULL, cstate->rel); - /* - * If we are capturing transition tuples, they may need to be - * converted from partition format back to partitioned table format - * (this is only ever necessary if a BEFORE trigger modifies the - * tuple). - */ - if (cstate->transition_capture != NULL) - ExecSetupChildParentMapForLeaf(proute); - } - /* * It's more efficient to prepare a bunch of tuples for insertion, and * insert them in one heap_multi_insert() call, than call heap_insert() @@ -2835,8 +2828,7 @@ CopyFrom(CopyState cstate) */ cstate->transition_capture->tcs_original_insert_tuple = NULL; cstate->transition_capture->tcs_map = - TupConvMapForLeaf(proute, target_resultRelInfo, - leaf_part_index); + PROUTE_CHILD_TO_PARENT_MAP(proute, leaf_part_index); } else { @@ -2854,16 +2846,13 @@ CopyFrom(CopyState cstate) * partition rowtype. Don't free the already stored tuple as it * may still be required for a multi-insert batch. */ - if (proute->parent_child_tupconv_maps) - { - TupleConversionMap *map = - proute->parent_child_tupconv_maps[leaf_part_index]; - - tuple = ConvertPartitionTupleSlot(map, tuple, - proute->partition_tuple_slot, - &slot, - false); - } + tuple = + ConvertPartitionTupleSlot(PROUTE_PARENT_TO_CHILD_MAP(proute, + leaf_part_index), + tuple, + proute->partition_tuple_slot, + &slot, + false); tuple->t_tableOid = RelationGetRelid(resultRelInfo->ri_RelationDesc); } diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c index 7849e04bdb..4242f81548 100644 --- a/src/backend/executor/execPartition.c +++ b/src/backend/executor/execPartition.c @@ -113,7 +113,6 @@ ExecSetupPartitionTupleRouting(ModifyTableState *mtstate, Relation rel) /* We only allocate these arrays when we need to store the first map */ proute->parent_child_tupconv_maps = NULL; proute->child_parent_tupconv_maps = NULL; - proute->child_parent_map_not_required = NULL; /* * Initialize this table's PartitionDispatch object. Here we pass in the @@ -719,9 +718,7 @@ ExecInitPartitionInfo(ModifyTableState *mtstate, { TupleConversionMap *map; - map = proute->parent_child_tupconv_maps ? - proute->parent_child_tupconv_maps[part_result_rel_index] : - NULL; + map = PROUTE_PARENT_TO_CHILD_MAP(proute, part_result_rel_index); Assert(node->onConflictSet != NIL); Assert(rootResultRelInfo->ri_onConflict != NULL); @@ -872,6 +869,33 @@ ExecInitRoutingInfo(ModifyTableState *mtstate, } /* + * Also, if transition capture is active, store a map to convert tuples + * from partition's rowtype to parent's. + */ + if (mtstate && mtstate->mt_transition_capture) + { + map = + convert_tuples_by_name(RelationGetDescr(partRelInfo->ri_RelationDesc), + RelationGetDescr(partRelInfo->ri_PartitionRoot), + gettext_noop("could not convert row type")); + + /* Allocate parent child map array only if we need to store a map */ + if (map) + { + if (proute->child_parent_tupconv_maps == NULL) + { + int size; + + size = proute->partitions_allocsize; + proute->child_parent_tupconv_maps = (TupleConversionMap **) + palloc0(sizeof(TupleConversionMap *) * size); + } + + proute->child_parent_tupconv_maps[partidx] = map; + } + } + + /* * If the partition is a foreign table, let the FDW init itself for * routing tuples to the partition. */ @@ -964,76 +988,6 @@ ExecInitPartitionDispatchInfo(PartitionTupleRouting *proute, Oid partoid, } /* - * ExecSetupChildParentMapForLeaf -- Initialize the per-leaf-partition - * child-to-root tuple conversion map array. - * - * This map is required for capturing transition tuples when the target table - * is a partitioned table. For a tuple that is routed by an INSERT or UPDATE, - * we need to convert it from the leaf partition to the target table - * descriptor. - */ -void -ExecSetupChildParentMapForLeaf(PartitionTupleRouting *proute) -{ - int size; - - Assert(proute != NULL); - - size = proute->partitions_allocsize; - - /* - * These array elements get filled up with maps on an on-demand basis. - * Initially just set all of them to NULL. - */ - proute->child_parent_tupconv_maps = - (TupleConversionMap **) palloc0(sizeof(TupleConversionMap *) * size); - - /* Same is the case for this array. All the values are set to false */ - proute->child_parent_map_not_required = (bool *) palloc0(sizeof(bool) * - size); -} - -/* - * TupConvMapForLeaf -- Get the tuple conversion map for a given leaf partition - * index. - */ -TupleConversionMap * -TupConvMapForLeaf(PartitionTupleRouting *proute, - ResultRelInfo *rootRelInfo, int leaf_index) -{ - TupleConversionMap **map; - TupleDesc tupdesc; - - /* If nobody else set up the per-leaf maps array, do so ourselves. */ - if (proute->child_parent_tupconv_maps == NULL) - ExecSetupChildParentMapForLeaf(proute); - - /* If it's already known that we don't need a map, return NULL. */ - else if (proute->child_parent_map_not_required[leaf_index]) - return NULL; - - /* If we've already got a map, return it. */ - map = &proute->child_parent_tupconv_maps[leaf_index]; - if (*map != NULL) - return *map; - - /* No map yet; try to create one. */ - tupdesc = RelationGetDescr(proute->partitions[leaf_index]->ri_RelationDesc); - *map = - convert_tuples_by_name(tupdesc, - RelationGetDescr(rootRelInfo->ri_RelationDesc), - gettext_noop("could not convert row type")); - - /* - * If it turns out no map is needed, remember that so we don't try making - * one again next time. - */ - proute->child_parent_map_not_required[leaf_index] = (*map == NULL); - - return *map; -} - -/* * ConvertPartitionTupleSlot -- convenience function for tuple conversion. * The tuple, if converted, is stored in new_slot, and *p_my_slot is * updated to point to it. new_slot typically should be one of the diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c index bbffbd722e..f592e4c51a 100644 --- a/src/backend/executor/nodeModifyTable.c +++ b/src/backend/executor/nodeModifyTable.c @@ -1760,7 +1760,7 @@ ExecPrepareTupleRouting(ModifyTableState *mtstate, */ mtstate->mt_transition_capture->tcs_original_insert_tuple = NULL; mtstate->mt_transition_capture->tcs_map = - TupConvMapForLeaf(proute, targetRelInfo, partidx); + PROUTE_CHILD_TO_PARENT_MAP(proute, partidx); } else { @@ -1775,16 +1775,15 @@ ExecPrepareTupleRouting(ModifyTableState *mtstate, if (mtstate->mt_oc_transition_capture != NULL) { mtstate->mt_oc_transition_capture->tcs_map = - TupConvMapForLeaf(proute, targetRelInfo, partidx); + PROUTE_CHILD_TO_PARENT_MAP(proute, partidx); } /* * Convert the tuple, if necessary. */ - if (proute->parent_child_tupconv_maps) - ConvertPartitionTupleSlot(proute->parent_child_tupconv_maps[partidx], - tuple, proute->partition_tuple_slot, &slot, - true); + ConvertPartitionTupleSlot(PROUTE_PARENT_TO_CHILD_MAP(proute, partidx), + tuple, proute->partition_tuple_slot, &slot, + true); /* Initialize information needed to handle ON CONFLICT DO UPDATE. */ Assert(mtstate != NULL); diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h index 0b03b9dd76..0bb84a27aa 100644 --- a/src/include/executor/execPartition.h +++ b/src/include/executor/execPartition.h @@ -112,21 +112,13 @@ typedef struct PartitionDispatchData *PartitionDispatch; * routing. Maintained separately because partitions * may have different rowtype. * - * Note: The following fields are used only when UPDATE ends up needing to - * do tuple routing. - * * child_parent_tupconv_maps As 'parent_child_tupconv_maps' but stores * conversion maps to translate partition tuples into - * partition_root's rowtype. + * partition_root's rowtype, needed if transition + * capture is active * - * child_parent_map_not_required True if the corresponding - * child_parent_tupconv_maps element has been - * determined to require no translation or set to - * NULL when child_parent_tupconv_maps is NULL. This - * is required in order to distinguish tuple - * translations which have been seen to not be - * required due to the TupleDescs being compatible - * with transactions which have yet to be determined. + * Note: The following fields are used only when UPDATE ends up needing to + * do tuple routing. * * subplan_resultrel_hash Hash table to store subplan ResultRelInfos by Oid. * This is used to cache ResultRelInfos from subplans @@ -159,6 +151,20 @@ typedef struct PartitionTupleRouting } PartitionTupleRouting; /* + * Accessor macros for tuple conversion maps contained in + * PartitionTupleRouting + */ +#define PROUTE_CHILD_TO_PARENT_MAP(proute, partidx) \ + ((proute)->child_parent_tupconv_maps != NULL ? \ + proute->child_parent_tupconv_maps[(partidx)] : \ + NULL) + +#define PROUTE_PARENT_TO_CHILD_MAP(proute, partidx) \ + ((proute)->parent_child_tupconv_maps != NULL ? \ + proute->parent_child_tupconv_maps[(partidx)] : \ + NULL) + +/* * PartitionedRelPruningData - Per-partitioned-table data for run-time pruning * of partitions. For a multilevel partitioned table, we have one of these * for the topmost partition plus one for each non-leaf child partition. @@ -260,9 +266,6 @@ extern void ExecInitRoutingInfo(ModifyTableState *mtstate, PartitionTupleRouting *proute, ResultRelInfo *partRelInfo, int partidx); -extern void ExecSetupChildParentMapForLeaf(PartitionTupleRouting *proute); -extern TupleConversionMap *TupConvMapForLeaf(PartitionTupleRouting *proute, - ResultRelInfo *rootRelInfo, int leaf_index); extern HeapTuple ConvertPartitionTupleSlot(TupleConversionMap *map, HeapTuple tuple, TupleTableSlot *new_slot, -- 2.11.0