Here is my further investigation: CopyFrom() calls ExecFindPartition() to identify the destination child table of the partitioned table, which in turn calls get_partition_for_tuple() to get the partition index for the key value. This invocation is not under the per-tuple memory context.
In the case of hash partitioning, get_partition_for_tuple() calls the hash function of the key data type, which is hash_numeric in my case. hash_numeric fetches the key value using PG_GETARG_NUMERIC(0), which internally calls pg_detoast_datum(); that may allocate new memory if the varlena datum is not in uncompressed long (32bit) format.

With this patch applied, the PostgreSQL backend process has been running at about 130MB of memory consumption for 20min so far (not finished yet...). Before the patch was applied, its memory consumption grew by about 10MB/sec, and the process was terminated a few hours later.

P.S. I think do_convert_tuple() in ExecFindPartition() and ConvertPartitionTupleSlot() may also allocate memory outside the per-tuple context; however, I could not confirm this yet, because my test case has TupleConversionMap == NULL.

Thanks,

2018-07-24 10:43 GMT+09:00 Michael Paquier <mich...@paquier.xyz>:
> On Tue, Jul 24, 2018 at 09:41:52AM +0900, Kohei KaiGai wrote:
>> In PG11beta2, my backend process gets terminated during COPY FROM of
>> large text file (544GB) on partitioned table.
>> The kernel log says OOM Killer send SIGKILL due to memory pressure.
>> In fact, 63GB of system physical 64GB was consumed by the PostgreSQL
>> backend just before the termination.
>
> Hmm.. That's not nice. Let's add an open item.
> --
> Michael

--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kai...@heterodb.com>
fix-leak-copyfrom.patch
Description: Binary data