Hi,
On 08/20/2015 04:15 AM, Tomas Vondra wrote:
Hello KaiGai-san,
On 08/19/2015 03:19 PM, Kohei KaiGai wrote:
Unless we have a fail-safe mechanism for when the planner estimates a much
larger number of tuples than are actually needed, a bad estimate will
consume a massive amount of RAM. That's a bad side effect.
My previous patch didn't pay attention to that scenario, so it needs to be
revised.
I agree we need to put a few more safeguards there (e.g. make sure we
don't overflow INT when counting the buckets, which may happen with the
amounts of work_mem we'll see in the wild soon).
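Rough numbers, assuming 8-byte pointers:

    max_pointers = work_mem[KB] * 1024 / 8 = work_mem * 128
    work_mem =  8GB  ->  ~1.07e9 pointers  (about INT_MAX / 2)
    work_mem = 16GB  ->  ~2.15e9 pointers  (past INT_MAX)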
But I think we should not do any extensive changes to how we size the
hashtable - that's not something we should do in a bugfix I think.
Attached are two alternative versions of a backpatch. Both limit nbuckets
so that the bucket array never exceeds MaxAllocSize - effectively 512MB due
to the doubling (so ~67M buckets on 64-bit architectures).
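For reference, the arithmetic behind those numbers (assuming 8-byte
pointers):

    MaxAllocSize = 0x3fffffff bytes (1GB - 1)
    2^27 buckets would need a 1GB pointer array, just over that limit,
    so the largest usable power of two is 2^26 = 67,108,864 buckets,
    i.e. a 512MB bucket array.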
The only difference is that the 'alternative' patch limits max_pointers
+ /* ensure we don't exceed the maximum allocation size */
+ max_pointers = Min(max_pointers, MaxAllocSize / sizeof(void*));
so it affects both nbuckets and nbatch. That seems a bit more correct,
but I guess whoever gets this many batches would be grateful even for
the quick crash.
For master, I think the right patch is what KaiGai-san posted in June. I
don't think we should really try making it smarter about handling
overestimates at this point - that's 9.6 stuff IMNSHO.
I find it a bit awkward that we only have MemoryContextAllocHuge and
repalloc_huge, especially as nodeHash.c needs MemoryContextAllocHuge +
memset to zero the chunk.
So I think we should extend the memutils API by adding palloc0_huge (and
possibly palloc_huge, although that's not needed by nodeHash.c).
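To illustrate, a minimal sketch of what palloc0_huge could look like
(hypothetical wiring, simply mirroring what nodeHash.c open-codes today;
presumably it would live in mcxt.c next to the other palloc variants):

void *
palloc0_huge(Size size)
{
	void	   *ret;

	/* same as palloc0, but without the MaxAllocSize limit */
	ret = MemoryContextAllocHuge(CurrentMemoryContext, size);
	memset(ret, 0, size);

	return ret;
}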
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c
index 6917d3f..e72ac8b 100644
--- a/src/backend/executor/nodeHash.c
+++ b/src/backend/executor/nodeHash.c
@@ -475,6 +475,7 @@ ExecChooseHashTableSize(double ntuples, int tupwidth, bool useskew,
lbuckets = (hash_table_bytes / tupsize) / NTUP_PER_BUCKET;
lbuckets = Min(lbuckets, max_pointers);
+ lbuckets = Min(lbuckets, MaxAllocSize / sizeof(void*));
nbuckets = (int) lbuckets;
dbatch = ceil(inner_rel_bytes / hash_table_bytes);
@@ -491,6 +492,7 @@ ExecChooseHashTableSize(double ntuples, int tupwidth, bool useskew,
dbuckets = ceil(ntuples / NTUP_PER_BUCKET);
dbuckets = Min(dbuckets, max_pointers);
+ dbuckets = Min(dbuckets, MaxAllocSize / sizeof(void*));
nbuckets = (int) dbuckets;
nbatch = 1;
@@ -500,7 +502,8 @@ ExecChooseHashTableSize(double ntuples, int tupwidth, bool useskew,
* Both nbuckets and nbatch must be powers of 2 to make
* ExecHashGetBucketAndBatch fast. We already fixed nbatch; now inflate
* nbuckets to the next larger power of 2. We also force nbuckets to not
- * be real small, by starting the search at 2^10. (Note: above we made
+ * be real small, by starting the search at 2^10, nor too large - the buckets
+ * are allocated as a single chunk, so they must fit in MaxAllocSize. (Note: above we made
* sure that nbuckets is not more than INT_MAX / 2, so this loop cannot
* overflow, nor can the final shift to recalculate nbuckets.)
*/
diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c
index 6917d3f..ca90aed 100644
--- a/src/backend/executor/nodeHash.c
+++ b/src/backend/executor/nodeHash.c
@@ -465,6 +465,8 @@ ExecChooseHashTableSize(double ntuples, int tupwidth, bool useskew,
max_pointers = (work_mem * 1024L) / sizeof(void *);
/* also ensure we avoid integer overflow in nbatch and nbuckets */
max_pointers = Min(max_pointers, INT_MAX / 2);
+ /* ensure we don't exceed the maximum allocation size */
+ max_pointers = Min(max_pointers, MaxAllocSize / sizeof(void*));
if (inner_rel_bytes > hash_table_bytes)
{
@@ -500,7 +502,8 @@ ExecChooseHashTableSize(double ntuples, int tupwidth, bool useskew,
* Both nbuckets and nbatch must be powers of 2 to make
* ExecHashGetBucketAndBatch fast. We already fixed nbatch; now inflate
* nbuckets to the next larger power of 2. We also force nbuckets to not
- * be real small, by starting the search at 2^10. (Note: above we made
+ * be real small, by starting the search at 2^10, nor too large - the buckets
+ * are allocated as a single chunk, so they must fit in MaxAllocSize. (Note: above we made
* sure that nbuckets is not more than INT_MAX / 2, so this loop cannot
* overflow, nor can the final shift to recalculate nbuckets.)
*/