Hi,
On 08/20/2015 04:15 AM, Tomas Vondra wrote:
Hello KaiGai-san,
On 08/19/2015 03:19 PM, Kohei KaiGai wrote:
Unless we have a fail-safe mechanism for when the planner estimates a much
larger number of tuples than are actually needed, a bad estimate will
consume a massive amount of RAM. That's a bad side effect.
My previous patch didn't pay attention to that scenario, so it needs to be
revised.
I agree we need to put a few more safeguards there (e.g. make sure we
don't overflow INT when counting the buckets, which may happen with the
amounts of work_mem we'll see in the wild soon).
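Rough numbers, assuming 8-byte pointers:

    max_pointers = work_mem[KB] * 1024 / 8 = work_mem * 128
    work_mem =  8GB  ->  ~1.07e9 pointers  (about INT_MAX / 2)
    work_mem = 16GB  ->  ~2.15e9 pointers  (past INT_MAX)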
But I think we should not do any extensive changes to how we size the
hashtable - that's not something we should do in a bugfix I think.
Attached are two alternative versions of a backpatch. Both limit nbuckets
so that the bucket array never exceeds MaxAllocSize - effectively 512MB due
to the doubling (so ~67M buckets on 64-bit architectures).
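For reference, the arithmetic behind those numbers (assuming 8-byte
pointers):

    MaxAllocSize = 0x3fffffff bytes (1GB - 1)
    2^27 buckets would need a 1GB pointer array, just over that limit,
    so the largest usable power of two is 2^26 = 67,108,864 buckets,
    i.e. a 512MB bucket array.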
The only difference is that the 'alternative' patch limits max_pointers
+ /* ensure we don't exceed the maximum allocation size */
+ max_pointers = Min(max_pointers, MaxAllocSize / sizeof(void*));
so it affects both nbuckets and nbatch. That seems a bit more correct,
but I guess whoever gets this many batches would be grateful even for
the quick crash.
For master, I think the right patch is what KaiGai-san posted in June. I
don't think we should really try making it smarter about handling
overestimates at this point - that's 9.6 stuff IMNSHO.
I find it a bit awkward that we only have MemoryContextAllocHuge and
repalloc_huge, especially as nodeHash.c needs MemoryContextAllocHuge +
memset to zero the chunk.
So I think we should extend the memutils API by adding palloc0_huge (and
possibly palloc_huge, although that's not needed by nodeHash.c).
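To illustrate, a minimal sketch of what palloc0_huge could look like
(hypothetical wiring, simply mirroring what nodeHash.c open-codes today;
presumably it would live in mcxt.c next to the other palloc variants):

void *
palloc0_huge(Size size)
{
	void	   *ret;

	/* same as palloc0, but without the MaxAllocSize limit */
	ret = MemoryContextAllocHuge(CurrentMemoryContext, size);
	memset(ret, 0, size);

	return ret;
}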
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c
index 6917d3f..e72ac8b 100644
--- a/src/backend/executor/nodeHash.c
+++ b/src/backend/executor/nodeHash.c
@@ -475,6 +475,7 @@ ExecChooseHashTableSize(double ntuples, int tupwidth, bool useskew,
lbuckets = (hash_table_bytes / tupsize) / NTUP_PER_BUCKET;
lbuckets = Min(lbuckets, max_pointers);
+ lbuckets = Min(lbuckets, MaxAllocSize / sizeof(void*));
nbuckets = (int) lbuckets;
dbatch = ceil(inner_rel_bytes / hash_table_bytes);
@@ -491,6 +492,7 @@ ExecChooseHashTableSize(double ntuples, int tupwidth, bool useskew,
dbuckets = ceil(ntuples / NTUP_PER_BUCKET);
dbuckets = Min(dbuckets, max_pointers);
+ dbuckets = Min(dbuckets, MaxAllocSize / sizeof(void*));
nbuckets = (int) dbuckets;
nbatch = 1;
@@ -500,7 +502,8 @@ ExecChooseHashTableSize(double ntuples, int tupwidth, bool useskew,
* Both nbuckets and nbatch must be powers of 2 to make
* ExecHashGetBucketAndBatch fast. We already fixed nbatch; now inflate
* nbuckets to the next larger power of 2. We also force nbuckets to not
- * be real small, by starting the search at 2^10. (Note: above we made
+ * be real small, by starting the search at 2^10, nor too large - the buckets
+ * are allocated as a single chunk, so they must fit in MaxAllocSize. (Note: above we made
* sure that nbuckets is not more than INT_MAX / 2, so this loop cannot
* overflow, nor can the final shift to recalculate nbuckets.)
*/
diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c
index 6917d3f..ca90aed 100644
--- a/src/backend/executor/nodeHash.c
+++ b/src/backend/executor/nodeHash.c
@@ -465,6 +465,8 @@ ExecChooseHashTableSize(double ntuples, int tupwidth, bool useskew,
max_pointers = (work_mem * 1024L) / sizeof(void *);
/* also ensure we avoid integer overflow in nbatch and nbuckets */
max_pointers = Min(max_pointers, INT_MAX / 2);
+ /* ensure we don't exceed the maximum allocation size */
+ max_pointers = Min(max_pointers, MaxAllocSize / sizeof(void*));
if (inner_rel_bytes > hash_table_bytes)
{
@@ -500,7 +502,8 @@ ExecChooseHashTableSize(double ntuples, int tupwidth, bool useskew,
* Both nbuckets and nbatch must be powers of 2 to make
* ExecHashGetBucketAndBatch fast. We already fixed nbatch; now inflate
* nbuckets to the next larger power of 2. We also force nbuckets to not
- * be real small, by starting the search at 2^10. (Note: above we made
+ * be real small, by starting the search at 2^10, nor too large - the buckets
+ * are allocated as a single chunk, so they must fit in MaxAllocSize. (Note: above we made
* sure that nbuckets is not more than INT_MAX / 2, so this loop cannot
* overflow, nor can the final shift to recalculate nbuckets.)
*/