(looking at the v5 patch but replying to an older email)

On 2018/07/31 16:03, David Rowley wrote:
> I've attached a complete v4 patch.
> 
>> By the way, when going over the updated code, I noticed that the code
>> around child_parent_tupconv_maps could use some refactoring too.
>> Especially, I noticed that ExecSetupChildParentMapForLeaf() allocates
>> child-to-parent map array needed for transition tuple capture even if not
>> needed by any of the leaf partitions.  I'm attaching here a patch that
>> applies on top of your v3 to show what I'm thinking we could do.
> 
> Maybe we can do that as a follow-on patch.

We probably could, but I think it would be a good idea get rid of *all*
redundant allocations due to tuple routing in one patch, if that's the
mission of this thread and the patch anyway.

> I think what we have so far
> is already ended up quite complex to review. What do you think?

Yeah, it's kind of complex, but at least it seems that we're clear on the
point that what we're trying to do here is to try to get rid of redundant
allocations.

Parts of the patch that appear complex seems to be around the allocation
of various maps.  Especially the child-to-parent maps, which as things
stand today, come from two arrays -- a per-update-subplan array that's
needed by update tuple routing proper and per-leaf partition array (one in
PartitionTupleRouting) that's needed by transition capture machinery.  The
original coding was such the update tuple routing handling code would try
to avoid allocating the per-update-subplan array if it saw that per-leaf
partition array was already set up in PartitionTupleRouting, because
transition capture is active in the query.  For update-tuple-routing code
to be able to use maps from the per-leaf array, it would have to know
which update-subplans mapped to which tuple-routing-initialized
partitions.  That was maintained in the subplan_partition_offset array
that's now gone with this patch, because we no longer want to fix the
tuple-routing-initialized-partition offsets in advance.  So, it's better
to dissociate per-subplan maps which are initialized during
ExecInitModifyTable from per-leaf maps which are initialized lazily when
tuple routing initializes a partition, which is what my portion of the
patch did.

As mentioned in my last email, I still think it would be a good idea to
simplify the handling of child-to-parent maps in PartitionTupleRouting
even further, while we're at improving the code in this area.  I revised
the patch such that it makes the handling of maps in PartitionTupleRouting
even more uniform.  With that patch, we no longer have two completely
unrelated places in the code managing parent-to-child and child-to-parent
maps, even though both arrays are in the same PartitionTupleRouting.
Please find the updated patch attached with this email.

Thanks,
Amit
From 1a814f5a40774a51bf702757ec91e02f418a5aba Mon Sep 17 00:00:00 2001
From: amit <amitlangot...@gmail.com>
Date: Fri, 3 Aug 2018 14:09:51 +0900
Subject: [PATCH v2 2/2] Refactor handling of child_parent_tupconv_maps

They're currently created and handed out by TupConvMapForLeaf, which
makes them look somewhat different from parent_to_child_tupconv_maps.
In fact, both contain conversion maps possibly needed between a
partition initialized by tuple routing and the root parent in one or
the other direction, so it seems odd that parent-to-child ones are
created in ExecInitRoutingInfo, whereas child-to-parent ones in
TupConvMapForLeaf.

The child-to-parent ones are only needed if transition capture is
active, but we can already check that in ExecInitRoutingInfo via
the incoming ModifyTableState (sure, we need to teach CopyFrom to
add the necessary information into its dummy ModifyTableState, but
that doesn't seem too bad).

This way, we can manage both parent-to-child and child-to-parent maps
in similar ways, and more importantly, use the same criterion of
checking whether a partition's slot in the respective array is NULL
or not to conclude if tuple conversion is necessary or not.
---
 src/backend/commands/copy.c            |  37 +++++-------
 src/backend/executor/execPartition.c   | 102 +++++++++------------------------
 src/backend/executor/nodeModifyTable.c |  11 ++--
 src/include/executor/execPartition.h   |  33 ++++++-----
 4 files changed, 64 insertions(+), 119 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 752ba3d767..6f4069d321 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -2510,8 +2510,12 @@ CopyFrom(CopyState cstate)
        /*
         * If there are any triggers with transition tables on the named 
relation,
         * we need to be prepared to capture transition tuples.
+        *
+        * Because partition tuple routing would like to know about whether
+        * transition capture is active, we also set it in mtstate, which is
+        * passed to ExecFindPartition below.
         */
-       cstate->transition_capture =
+       cstate->transition_capture = mtstate->mt_transition_capture =
                MakeTransitionCaptureState(cstate->rel->trigdesc,
                                                                   
RelationGetRelid(cstate->rel),
                                                                   CMD_INSERT);
@@ -2521,19 +2525,8 @@ CopyFrom(CopyState cstate)
         * CopyFrom tuple routing.
         */
        if (cstate->rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
-       {
                proute = ExecSetupPartitionTupleRouting(NULL, cstate->rel);
 
-               /*
-                * If we are capturing transition tuples, they may need to be
-                * converted from partition format back to partitioned table 
format
-                * (this is only ever necessary if a BEFORE trigger modifies the
-                * tuple).
-                */
-               if (cstate->transition_capture != NULL)
-                       ExecSetupChildParentMapForLeaf(proute);
-       }
-
        /*
         * It's more efficient to prepare a bunch of tuples for insertion, and
         * insert them in one heap_multi_insert() call, than call heap_insert()
@@ -2835,8 +2828,7 @@ CopyFrom(CopyState cstate)
                                         */
                                        
cstate->transition_capture->tcs_original_insert_tuple = NULL;
                                        cstate->transition_capture->tcs_map =
-                                               TupConvMapForLeaf(proute, 
target_resultRelInfo,
-                                                                               
  leaf_part_index);
+                                               
PROUTE_CHILD_TO_PARENT_MAP(proute, leaf_part_index);
                                }
                                else
                                {
@@ -2854,16 +2846,13 @@ CopyFrom(CopyState cstate)
                         * partition rowtype.  Don't free the already stored 
tuple as it
                         * may still be required for a multi-insert batch.
                         */
-                       if (proute->parent_child_tupconv_maps)
-                       {
-                               TupleConversionMap *map =
-                               
proute->parent_child_tupconv_maps[leaf_part_index];
-
-                               tuple = ConvertPartitionTupleSlot(map, tuple,
-                                                                               
                  proute->partition_tuple_slot,
-                                                                               
                  &slot,
-                                                                               
                  false);
-                       }
+                       tuple =
+                               
ConvertPartitionTupleSlot(PROUTE_PARENT_TO_CHILD_MAP(proute,
+                                                                               
  leaf_part_index),
+                                                                               
  tuple,
+                                                                               
  proute->partition_tuple_slot,
+                                                                               
  &slot,
+                                                                               
  false);
 
                        tuple->t_tableOid = 
RelationGetRelid(resultRelInfo->ri_RelationDesc);
                }
diff --git a/src/backend/executor/execPartition.c 
b/src/backend/executor/execPartition.c
index 7849e04bdb..4242f81548 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -113,7 +113,6 @@ ExecSetupPartitionTupleRouting(ModifyTableState *mtstate, 
Relation rel)
        /* We only allocate these arrays when we need to store the first map */
        proute->parent_child_tupconv_maps = NULL;
        proute->child_parent_tupconv_maps = NULL;
-       proute->child_parent_map_not_required = NULL;
 
        /*
         * Initialize this table's PartitionDispatch object.  Here we pass in 
the
@@ -719,9 +718,7 @@ ExecInitPartitionInfo(ModifyTableState *mtstate,
                {
                        TupleConversionMap *map;
 
-                       map = proute->parent_child_tupconv_maps ?
-                               
proute->parent_child_tupconv_maps[part_result_rel_index] :
-                               NULL;
+                       map = PROUTE_PARENT_TO_CHILD_MAP(proute, 
part_result_rel_index);
 
                        Assert(node->onConflictSet != NIL);
                        Assert(rootResultRelInfo->ri_onConflict != NULL);
@@ -872,6 +869,33 @@ ExecInitRoutingInfo(ModifyTableState *mtstate,
        }
 
        /*
+        * Also, if transition capture is active, store a map to convert tuples
+        * from partition's rowtype to parent's.
+        */
+       if (mtstate && mtstate->mt_transition_capture)
+       {
+               map =
+               
convert_tuples_by_name(RelationGetDescr(partRelInfo->ri_RelationDesc),
+                                                          
RelationGetDescr(partRelInfo->ri_PartitionRoot),
+                                                          gettext_noop("could 
not convert row type"));
+
+               /* Allocate parent child map array only if we need to store a 
map */
+               if (map)
+               {
+                       if (proute->child_parent_tupconv_maps == NULL)
+                       {
+                               int                     size;
+
+                               size = proute->partitions_allocsize;
+                               proute->child_parent_tupconv_maps = 
(TupleConversionMap **)
+                                       palloc0(sizeof(TupleConversionMap *) * 
size);
+                       }
+
+                       proute->child_parent_tupconv_maps[partidx] = map;
+               }
+       }
+
+       /*
         * If the partition is a foreign table, let the FDW init itself for
         * routing tuples to the partition.
         */
@@ -964,76 +988,6 @@ ExecInitPartitionDispatchInfo(PartitionTupleRouting 
*proute, Oid partoid,
 }
 
 /*
- * ExecSetupChildParentMapForLeaf -- Initialize the per-leaf-partition
- * child-to-root tuple conversion map array.
- *
- * This map is required for capturing transition tuples when the target table
- * is a partitioned table. For a tuple that is routed by an INSERT or UPDATE,
- * we need to convert it from the leaf partition to the target table
- * descriptor.
- */
-void
-ExecSetupChildParentMapForLeaf(PartitionTupleRouting *proute)
-{
-       int                     size;
-
-       Assert(proute != NULL);
-
-       size = proute->partitions_allocsize;
-
-       /*
-        * These array elements get filled up with maps on an on-demand basis.
-        * Initially just set all of them to NULL.
-        */
-       proute->child_parent_tupconv_maps =
-               (TupleConversionMap **) palloc0(sizeof(TupleConversionMap *) * 
size);
-
-       /* Same is the case for this array. All the values are set to false */
-       proute->child_parent_map_not_required = (bool *) palloc0(sizeof(bool) *
-                                                                               
                                         size);
-}
-
-/*
- * TupConvMapForLeaf -- Get the tuple conversion map for a given leaf partition
- * index.
- */
-TupleConversionMap *
-TupConvMapForLeaf(PartitionTupleRouting *proute,
-                                 ResultRelInfo *rootRelInfo, int leaf_index)
-{
-       TupleConversionMap **map;
-       TupleDesc       tupdesc;
-
-       /* If nobody else set up the per-leaf maps array, do so ourselves. */
-       if (proute->child_parent_tupconv_maps == NULL)
-               ExecSetupChildParentMapForLeaf(proute);
-
-       /* If it's already known that we don't need a map, return NULL. */
-       else if (proute->child_parent_map_not_required[leaf_index])
-               return NULL;
-
-       /* If we've already got a map, return it. */
-       map = &proute->child_parent_tupconv_maps[leaf_index];
-       if (*map != NULL)
-               return *map;
-
-       /* No map yet; try to create one. */
-       tupdesc = 
RelationGetDescr(proute->partitions[leaf_index]->ri_RelationDesc);
-       *map =
-               convert_tuples_by_name(tupdesc,
-                                                          
RelationGetDescr(rootRelInfo->ri_RelationDesc),
-                                                          gettext_noop("could 
not convert row type"));
-
-       /*
-        * If it turns out no map is needed, remember that so we don't try 
making
-        * one again next time.
-        */
-       proute->child_parent_map_not_required[leaf_index] = (*map == NULL);
-
-       return *map;
-}
-
-/*
  * ConvertPartitionTupleSlot -- convenience function for tuple conversion.
  * The tuple, if converted, is stored in new_slot, and *p_my_slot is
  * updated to point to it.  new_slot typically should be one of the
diff --git a/src/backend/executor/nodeModifyTable.c 
b/src/backend/executor/nodeModifyTable.c
index bbffbd722e..f592e4c51a 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -1760,7 +1760,7 @@ ExecPrepareTupleRouting(ModifyTableState *mtstate,
                         */
                        
mtstate->mt_transition_capture->tcs_original_insert_tuple = NULL;
                        mtstate->mt_transition_capture->tcs_map =
-                               TupConvMapForLeaf(proute, targetRelInfo, 
partidx);
+                               PROUTE_CHILD_TO_PARENT_MAP(proute, partidx);
                }
                else
                {
@@ -1775,16 +1775,15 @@ ExecPrepareTupleRouting(ModifyTableState *mtstate,
        if (mtstate->mt_oc_transition_capture != NULL)
        {
                mtstate->mt_oc_transition_capture->tcs_map =
-                       TupConvMapForLeaf(proute, targetRelInfo, partidx);
+                       PROUTE_CHILD_TO_PARENT_MAP(proute, partidx);
        }
 
        /*
         * Convert the tuple, if necessary.
         */
-       if (proute->parent_child_tupconv_maps)
-               
ConvertPartitionTupleSlot(proute->parent_child_tupconv_maps[partidx],
-                                                                 tuple, 
proute->partition_tuple_slot, &slot,
-                                                                 true);
+       ConvertPartitionTupleSlot(PROUTE_PARENT_TO_CHILD_MAP(proute, partidx),
+                                                         tuple, 
proute->partition_tuple_slot, &slot,
+                                                         true);
 
        /* Initialize information needed to handle ON CONFLICT DO UPDATE. */
        Assert(mtstate != NULL);
diff --git a/src/include/executor/execPartition.h 
b/src/include/executor/execPartition.h
index 0b03b9dd76..0bb84a27aa 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -112,21 +112,13 @@ typedef struct PartitionDispatchData *PartitionDispatch;
  *                                                     routing.  Maintained 
separately because partitions
  *                                                     may have different 
rowtype.
  *
- * Note: The following fields are used only when UPDATE ends up needing to
- * do tuple routing.
- *
  * child_parent_tupconv_maps   As 'parent_child_tupconv_maps' but stores
  *                                                     conversion maps to 
translate partition tuples into
- *                                                     partition_root's 
rowtype.
+ *                                                     partition_root's 
rowtype, needed if transition
+ *                                                     capture is active
  *
- * child_parent_map_not_required       True if the corresponding
- *                                                     
child_parent_tupconv_maps element has been
- *                                                     determined to require 
no translation or set to
- *                                                     NULL when 
child_parent_tupconv_maps is NULL.  This
- *                                                     is required in order to 
distinguish tuple
- *                                                     translations which have 
been seen to not be
- *                                                     required due to the 
TupleDescs being compatible
- *                                                     with transactions which 
have yet to be determined.
+ * Note: The following fields are used only when UPDATE ends up needing to
+ * do tuple routing.
  *
  * subplan_resultrel_hash      Hash table to store subplan ResultRelInfos by 
Oid.
  *                                                     This is used to cache 
ResultRelInfos from subplans
@@ -159,6 +151,20 @@ typedef struct PartitionTupleRouting
 } PartitionTupleRouting;
 
 /*
+ * Accessor macros for tuple conversion maps contained in
+ * PartitionTupleRouting
+ */
+#define PROUTE_CHILD_TO_PARENT_MAP(proute, partidx) \
+                       ((proute)->child_parent_tupconv_maps != NULL ? \
+                               proute->child_parent_tupconv_maps[(partidx)] : \
+                                                       NULL)
+
+#define PROUTE_PARENT_TO_CHILD_MAP(proute, partidx) \
+                       ((proute)->parent_child_tupconv_maps != NULL ? \
+                               proute->parent_child_tupconv_maps[(partidx)] : \
+                                                       NULL)
+
+/*
  * PartitionedRelPruningData - Per-partitioned-table data for run-time pruning
  * of partitions.  For a multilevel partitioned table, we have one of these
  * for the topmost partition plus one for each non-leaf child partition.
@@ -260,9 +266,6 @@ extern void ExecInitRoutingInfo(ModifyTableState *mtstate,
                                        PartitionTupleRouting *proute,
                                        ResultRelInfo *partRelInfo,
                                        int partidx);
-extern void ExecSetupChildParentMapForLeaf(PartitionTupleRouting *proute);
-extern TupleConversionMap *TupConvMapForLeaf(PartitionTupleRouting *proute,
-                                 ResultRelInfo *rootRelInfo, int leaf_index);
 extern HeapTuple ConvertPartitionTupleSlot(TupleConversionMap *map,
                                                  HeapTuple tuple,
                                                  TupleTableSlot *new_slot,
-- 
2.11.0

Reply via email to