A customer of ours was hit by this recently and I'd like to get it fixed for good.
Robert Haas wrote: > On Mon, Apr 6, 2015 at 1:51 PM, Jim Nasby <jim.na...@bluetreble.com> wrote: > > In any case, I don't think it would be terribly difficult to allow a bit > > more than 1GB in a StringInfo. Might need to tweak palloc too; ISTR there's > > some 1GB limits there too. > > The point is, those limits are there on purpose. Changing things > arbitrarily wouldn't be hard, but doing it in a principled way is > likely to require some thought. For example, in the COPY OUT case, > presumably what's happening is that we palloc a chunk for each > individual datum, and then palloc a buffer for the whole row. Right. There's a serious problem here already, and users are being bitten by it in existing releases. I think we should provide a non-invasive fix for back-branches which we could apply as a bug fix. Here's a proposed patch for the localized fix; it just adds a flag to StringInfo allowing the string to grow beyond the 1GB limit, but only for strings which are specifically marked. That way we avoid having to audit all the StringInfo-using code. In this patch, only the COPY path is allowed to grow beyond 1GB, which is enough to close the current problem -- or at least my test case for it. If others can try this patch to ensure it enables pg_dump to work on their databases, it would be great. (In this patch there's a known buglet which is that the UINT64_FORMAT patched in the error message doesn't allow for translatability.) Like Robert, I don't like this approach for the long term, however. I think it would be saner to have CopySendData not keep a single gigantic string in memory; it would be better to get the bytes out to the client sooner than end-of-row. To avoid causing a performance hit, we would only flush when the size of the output buffer is about to reach the allocation limit (MaxAllocSize); so for regular-sized tuples, we would only do it at end-of-row, keeping the current behavior. I don't have a patch for this yet; I'm going to try that next. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c index 3eba9ef..fcc4fe6 100644 --- a/src/backend/commands/copy.c +++ b/src/backend/commands/copy.c @@ -445,6 +445,15 @@ SendCopyEnd(CopyState cstate) } } +/* + * Prepare for output + */ +static void +CopyStartSend(CopyState cstate) +{ + allowLongStringInfo(cstate->fe_msgbuf); +} + /*---------- * CopySendData sends output data to the destination (file or frontend) * CopySendString does the same for null-terminated strings @@ -1865,6 +1874,8 @@ CopyOneRowTo(CopyState cstate, Oid tupleOid, Datum *values, bool *nulls) MemoryContextReset(cstate->rowcontext); oldcontext = MemoryContextSwitchTo(cstate->rowcontext); + CopyStartSend(cstate); + if (cstate->binary) { /* Binary per-tuple header */ diff --git a/src/backend/lib/stringinfo.c b/src/backend/lib/stringinfo.c index 7d03090..8c08eb7 100644 --- a/src/backend/lib/stringinfo.c +++ b/src/backend/lib/stringinfo.c @@ -47,12 +47,24 @@ initStringInfo(StringInfo str) { int size = 1024; /* initial default buffer size */ - str->data = (char *) palloc(size); + str->data = (char *) palloc(size); /* no need for "huge" at this point */ str->maxlen = size; + str->allowlong = false; resetStringInfo(str); } /* + * allocLongStringInfo + * + * Mark the StringInfo as a candidate for a "huge" allocation + */ +void +allowLongStringInfo(StringInfo str) +{ + str->allowlong = true; +} + +/* * resetStringInfo * * Reset the StringInfo: the data buffer remains valid, but its @@ -64,6 +76,7 @@ resetStringInfo(StringInfo str) str->data[0] = '\0'; str->len = 0; str->cursor = 0; + str->allowlong = false; } /* @@ -244,7 +257,8 @@ appendBinaryStringInfo(StringInfo str, const char *data, int datalen) void enlargeStringInfo(StringInfo str, int needed) { - int newlen; + int64 newlen; + Size limit; /* * Guard against out-of-range "needed" values. Without this, we can get @@ -252,16 +266,19 @@ enlargeStringInfo(StringInfo str, int needed) */ if (needed < 0) /* should not happen */ elog(ERROR, "invalid string enlargement request size: %d", needed); - if (((Size) needed) >= (MaxAllocSize - (Size) str->len)) + + /* choose the proper limit and verify this allocation wouldn't exceed it */ + limit = str->allowlong ? MaxAllocHugeSize : MaxAllocSize; + if (((Size) needed) >= (limit - (Size) str->len)) ereport(ERROR, (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED), errmsg("out of memory"), - errdetail("Cannot enlarge string buffer containing %d bytes by %d more bytes.", + errdetail("Cannot enlarge string buffer containing "INT64_FORMAT" bytes by %d more bytes.", str->len, needed))); needed += str->len + 1; /* total space required now */ - /* Because of the above test, we now have needed <= MaxAllocSize */ + /* Because of the above test, we now have needed <= limit */ if (needed <= str->maxlen) return; /* got enough space already */ @@ -276,14 +293,14 @@ enlargeStringInfo(StringInfo str, int needed) newlen = 2 * newlen; /* - * Clamp to MaxAllocSize in case we went past it. Note we are assuming - * here that MaxAllocSize <= INT_MAX/2, else the above loop could + * Clamp to the limit in case we went past it. Note we are assuming + * here that MaxAllocHugeSize <= INT64_MAX/2, else the above loop could * overflow. We will still have newlen >= needed. */ - if (newlen > (int) MaxAllocSize) - newlen = (int) MaxAllocSize; + if (newlen > (int64) limit) + newlen = (int64) limit; - str->data = (char *) repalloc(str->data, newlen); + str->data = (char *) repalloc_huge(str->data, newlen); str->maxlen = newlen; } diff --git a/src/include/lib/stringinfo.h b/src/include/lib/stringinfo.h index 4fff10a..cac6741 100644 --- a/src/include/lib/stringinfo.h +++ b/src/include/lib/stringinfo.h @@ -30,14 +30,17 @@ * cursor is initialized to zero by makeStringInfo or initStringInfo, * but is not otherwise touched by the stringinfo.c routines. * Some routines use it to scan through a StringInfo. + * allowlong boolean flag indicating whether this StringInfo can allocate + * more than MaxAllocSize bytes. *------------------------- */ typedef struct StringInfoData { char *data; - int len; - int maxlen; + int64 len; + int64 maxlen; int cursor; + bool allowlong; } StringInfoData; typedef StringInfoData *StringInfo; @@ -79,6 +82,11 @@ extern StringInfo makeStringInfo(void); extern void initStringInfo(StringInfo str); /*------------------------ + * Set flag to allow "huge" stringinfos. + */ +extern void allowLongStringInfo(StringInfo str); + +/*------------------------ * resetStringInfo * Clears the current content of the StringInfo, if any. The * StringInfo remains valid.
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers