Hi Kaigai-san,

Thanks for the report and the patch.

On 2018/07/24 11:43, Kohei KaiGai wrote:
> Further investigation I did:
> 
> CopyFrom() calls ExecFindPartition() to identify the destination child
> table of partitioned table.
> Then, it internally calls get_partition_for_tuple() to get partition
> index according to the key value.
> This invocation is not under the per-tuple context.
> 
> In case of hash-partitioning, get_partition_for_tuple() calls
> hash-function of key data type; which is hash_numeric in my case.
> The hash_numeric fetches the key value using PG_GETARG_NUMERIC(0). It
> internally calls pg_detoast_datum() which may allocate new memory if
> varlena datum is not uncompressed long (32bit) format.
> 
> Once this patch attached, PostgreSQL backend process has been working
> with about 130MB memory consumption for 20min right now (not finished
> yet...)
> Before the patch applied, its memory consumption grows up about
> 10BM/sec, then terminated a few hours later.
> 
> P.S,
> I think do_convert_tuple() in ExecFindPartition() and
> ConvertPartitionTupleSlot() may also allocate memory out of the
> per-tuple context, however, I could not confirmed yet, because my test
> case has TupleConversionMap == NULL.

Your patch takes care of allocation happening inside
get_partition_for_tuple, but as you mention there might be others in its
caller ExecFindPartition.  So, I think we should switch to the per-tuple
context in ExecFindPartition.

When I tried to do that, I discovered that we have to be careful about
releasing some of the memory that's allocated in ExecFindPartition
ourselves instead of relying on the reset of per-tuple context to take
care of it.  That's because some of the structures that ExecFindPartition
assigns the allocated memory to are cleaned up at the end of the query, by
when it's too late to try to release per-tuple memory.  So, the patch I
ended up with is slightly bigger than simply adding a
MemoryContextSwitchTo() call at the beginning of ExecFindPartition.
Please find it attached.

Thanks,
Amit
From 8c42a8dcab7e2f835eff5c001d9be2829301429d Mon Sep 17 00:00:00 2001
From: amit <amitlangot...@gmail.com>
Date: Tue, 24 Jul 2018 15:11:25 +0900
Subject: [PATCH v1] Use per-tuple memory in ExecFindPartition

---
 src/backend/executor/execPartition.c | 38 ++++++++++++++++++++++++++++++++----
 1 file changed, 34 insertions(+), 4 deletions(-)

diff --git a/src/backend/executor/execPartition.c 
b/src/backend/executor/execPartition.c
index 7a4665cc4e..f5d9b1755a 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -196,6 +196,15 @@ ExecFindPartition(ResultRelInfo *resultRelInfo, 
PartitionDispatch *pd,
        PartitionDispatch parent;
        ExprContext *ecxt = GetPerTupleExprContext(estate);
        TupleTableSlot *ecxt_scantuple_old = ecxt->ecxt_scantuple;
+       TupleTableSlot *myslot = NULL;
+       MemoryContext   oldcontext;
+       HeapTuple               tuple = ExecFetchSlotTuple(slot);
+
+       /*
+        * Anything below that needs to allocate memory should use per-tuple
+        * memory.
+        */
+       oldcontext = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
 
        /*
         * First check the root table's partition constraint, if any.  No point 
in
@@ -209,7 +218,6 @@ ExecFindPartition(ResultRelInfo *resultRelInfo, 
PartitionDispatch *pd,
        while (true)
        {
                PartitionDesc partdesc;
-               TupleTableSlot *myslot = parent->tupslot;
                TupleConversionMap *map = parent->tupmap;
                int                     cur_index = -1;
 
@@ -220,11 +228,9 @@ ExecFindPartition(ResultRelInfo *resultRelInfo, 
PartitionDispatch *pd,
                 * Convert the tuple to this parent's layout so that we can do 
certain
                 * things we do below.
                 */
+               myslot = parent->tupslot;
                if (myslot != NULL && map != NULL)
                {
-                       HeapTuple       tuple = ExecFetchSlotTuple(slot);
-
-                       ExecClearTuple(myslot);
                        tuple = do_convert_tuple(tuple, map);
                        ExecStoreTuple(tuple, myslot, InvalidBuffer, true);
                        slot = myslot;
@@ -269,9 +275,32 @@ ExecFindPartition(ResultRelInfo *resultRelInfo, 
PartitionDispatch *pd,
                        break;
                }
                else
+               {
                        parent = pd[-parent->indexes[cur_index]];
+
+                       /*
+                        * If we used the dedicated slot, must call 
ExecClearTuple now
+                        * to release the tuple contained in it and set its
+                        * tts_shouldFree to false so that nobody attempts to 
release it
+                        * later.  We have to do that because the tuple uses 
per-tuple
+                        * memory.
+                        */
+                       if (slot == myslot)
+                       {
+                               /*
+                                * Copy the tuple to be used for further 
routing before
+                                * clearing the slot itself.
+                                */
+                               tuple = ExecCopySlotTuple(myslot);
+                               ExecClearTuple(myslot);
+                       }
+               }
        }
 
+       /* Release the tuple in the lowest parent's dedicated slot. */
+       if (slot == myslot)
+               ExecClearTuple(myslot);
+
        /* A partition was not found. */
        if (result < 0)
        {
@@ -287,6 +316,7 @@ ExecFindPartition(ResultRelInfo *resultRelInfo, 
PartitionDispatch *pd,
                                 val_desc ? errdetail("Partition key of the 
failing row contains %s.", val_desc) : 0));
        }
 
+       MemoryContextSwitchTo(oldcontext);
        ecxt->ecxt_scantuple = ecxt_scantuple_old;
        return result;
 }
-- 
2.11.0

Reply via email to