Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
FWIW I ended up reverting the whole thing, even from master. A more complete solution would have to be researched. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Alvaro Herrera wrote: > With that, pushed and I hope this is closed for good. Great! I understand the fix couldn't be backpatched because we are not allowed to change the StringInfo struct in existing releases. But it occurs to me that the new "long_ok" flag might not be necessary after all now that it's settled that the length remains an int32. The flag is used to enforce a rule that the string can't normally grow past 1GB, and can reach 2GB only as an exception that we choose to exercise for COPY starting with v10. But that 1GB rule is never mentioned in stringinfo.[ch] AFAICS. I know that 1GB is the varlena size and is a limit for various things, but I don't know to what extent StringInfo is concerned by that. Is it the case that users of StringInfo in existing releases and extensions are counting on its allocator to fail and abort if the buffer grows over the varlena range? If it's true, then we're stuck with the current situation for existing releases. OTOH if this seems like a nonlegit expectation, couldn't we say that the size limit is 2^31 for all, and suppress the flag introduced by the fix? Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Alvaro Herrera wrote: > Daniel Verite wrote: > > > My tests are OK too but I see an issue with the code in > > enlargeStringInfo(), regarding integer overflow. > > The bit of comment that says: > > > > Note we are assuming here that limit <= INT_MAX/2, else the above > > loop could overflow. > > > > is obsolete, it's now INT_MAX instead of INT_MAX/2. > > I would keep this comment but use UINT_MAX/2 instead. I rephrased the comment instead. As you say, the current code no longer depends on that, but we will depend on something similar when we enlarge the other variables. With that, pushed and I hope this is closed for good. If you or anyone else wants to revisit things so that pg10 can load & dump even larger tuples, be my guest. There are quite a few places that will need to be touched, though (in particular, as I recall, the FE/BE protocol.) -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Daniel Verite wrote: > My tests are OK too but I see an issue with the code in > enlargeStringInfo(), regarding integer overflow. > The bit of comment that says: > > Note we are assuming here that limit <= INT_MAX/2, else the above > loop could overflow. > > is obsolete, it's now INT_MAX instead of INT_MAX/2. I would keep this comment but use UINT_MAX/2 instead. > There's a related problem here: > newlen = 2 * str->maxlen; > while (needed > newlen) > newlen = 2 * newlen; > str->maxlen is an int going up to INT_MAX so [2 * str->maxlen] now > *will* overflow when [str->maxlen > INT_MAX/2]. > Eventually it somehow works because of this: > if (newlen > limit) > newlen = limit; > but newlen is wonky (when resulting from int overflow) > before being brought back to limit. Yeah, you're right. We also need to cast "needed" to Size in the while test; and the repalloc_huge() call no longer needs a cast. I propose the attached. Not sure if we also need to cast the assignment to str->maxlen in the last line. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services commit 1327c2eea9c7ca074d88bb167bd4c35338d2de0b[m Author: Alvaro Herrera AuthorDate: Tue Jan 10 02:46:42 2017 -0300 CommitDate: Tue Jan 10 02:46:42 2017 -0300 Fix overflow check in StringInfo A thinko I introduced in fa2fa9955280. Also, amend a similarly broken comment. Report and patch by Daniel Vérité. Discussion: https://postgr.es/m/1706e85e-60d2-494e-8a64-9af1e1b21...@manitou-mail.org diff --git a/src/backend/lib/stringinfo.c b/src/backend/lib/stringinfo.c index bdc204e..11d751a 100644 --- a/src/backend/lib/stringinfo.c +++ b/src/backend/lib/stringinfo.c @@ -313,19 +313,19 @@ enlargeStringInfo(StringInfo str, int needed) * for efficiency, double the buffer size each time it overflows. * Actually, we might need to more than double it if 'needed' is big... */ - newlen = 2 * str->maxlen; - while (needed > newlen) + newlen = 2 * (Size) str->maxlen; + while ((Size) needed > newlen) newlen = 2 * newlen; /* * Clamp to the limit in case we went past it. Note we are assuming here -* that limit <= INT_MAX/2, else the above loop could overflow. We will +* that limit <= UINT_MAX/2, else the above loop could overflow. We will * still have newlen >= needed. */ if (newlen > limit) newlen = limit; - str->data = (char *) repalloc_huge(str->data, (Size) newlen); + str->data = (char *) repalloc_huge(str->data, newlen); str->maxlen = newlen; } -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Daniel Verite wrote: > My tests are OK too but I see an issue with the code in > enlargeStringInfo(), regarding integer overflow. > The bit of comment that says: > > Note we are assuming here that limit <= INT_MAX/2, else the above > loop could overflow. > > is obsolete, it's now INT_MAX instead of INT_MAX/2. Hmm, I think what it really needs to say there is UINT_MAX/2, which is what we really care about. I may be all wet, but what I see is that the expression (((Size) 1) << (sizeof(int32) * 8 - 1)) - 1 which is what we use as limit is exactly half the unsigned 32-bit integer range. So I would just update the constant in that comment instead of removing it completely. (We're still relying on the loop not to overflow in 32-bit machines, surely). > There's a related problem here: > newlen = 2 * str->maxlen; > while (needed > newlen) > newlen = 2 * newlen; > str->maxlen is an int going up to INT_MAX so [2 * str->maxlen] now > *will* overflow when [str->maxlen > INT_MAX/2]. > Eventually it somehow works because of this: > if (newlen > limit) > newlen = limit; > but newlen is wonky (when resulting from int overflow) > before being brought back to limit. Yeah, this is bogus and your patch looks correct to me. I propose this: diff --git a/src/backend/lib/stringinfo.c b/src/backend/lib/stringinfo.c index b618b37..a1d786d 100644 --- a/src/backend/lib/stringinfo.c +++ b/src/backend/lib/stringinfo.c @@ -313,13 +313,13 @@ enlargeStringInfo(StringInfo str, int needed) * for efficiency, double the buffer size each time it overflows. * Actually, we might need to more than double it if 'needed' is big... */ - newlen = 2 * str->maxlen; + newlen = 2 * (Size) str->maxlen; while (needed > newlen) newlen = 2 * newlen; /* * Clamp to the limit in case we went past it. Note we are assuming here -* that limit <= INT_MAX/2, else the above loop could overflow. We will +* that limit <= UINT_MAX/2, else the above loop could overflow. We will * still have newlen >= needed. */ if (newlen > limit) -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Alvaro Herrera wrote: > > I have now pushed this to 9.5, 9.6 and master. It could be backpatched > to 9.4 with ease (just a small change in heap_form_tuple); anything > further back would require much more effort. I had to revert this on 9.5 and 9.6 -- it is obvious (in hindsight) that changing StringInfoData is an ABI break, so we can't do it in back branches; see https://www.postgresql.org/message-id/27737.1480993...@sss.pgh.pa.us The patch still remains in master, with the bugs you pointed out. I suppose if somebody is desperate about getting data out from a table with large tuples, they'd need to use pg10's pg_dump for it. We could use the highest-order bit in StringInfoData->maxlen to represent the boolean flag instead, if we really cared. But I'm not going to sweat over it ... -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
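For illustration, here is a minimal standalone sketch (not proposed code; the packed representation and helper names are invented for this example) of the idea floated above of stashing the boolean flag in the high-order bit of a 32-bit maxlen, so the struct layout would not change:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Illustrative only: pack the "long strings allowed" flag into the top bit
 * of a 32-bit maxlen so that StringInfoData would not need a new field.
 */
#define LONG_FLAG  UINT32_C(0x80000000)

static uint32_t
pack_maxlen(uint32_t maxlen, bool allow_long)
{
	return maxlen | (allow_long ? LONG_FLAG : 0);
}

static uint32_t
get_maxlen(uint32_t packed)
{
	return packed & ~LONG_FLAG;
}

static bool
get_allow_long(uint32_t packed)
{
	return (packed & LONG_FLAG) != 0;
}

int
main(void)
{
	uint32_t	packed = pack_maxlen(1024, true);

	printf("maxlen=%u allow_long=%d\n",
		   (unsigned) get_maxlen(packed), (int) get_allow_long(packed));
	return 0;
}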
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Daniel Verite wrote: > My tests are OK too but I see an issue with the code in > enlargeStringInfo(), regarding integer overflow. Rats. I'll have a look later. You're probably right. In the meantime I added this problem to the Open Items wiki page for pg10, which I just created: https://wiki.postgresql.org/wiki/Open_Items I noticed that there's one open item in the 9.6 page that wasn't closed, so I carried it forward. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Alvaro Herrera wrote: > I have now pushed this to 9.5, 9.6 and master. It could be backpatched > to 9.4 with ease (just a small change in heap_form_tuple); anything > further back would require much more effort. > > I used a 32-bit limit using sizeof(int32). I tested and all the > mentioned cases seem to work sanely; if you can spare some more time to > test what was committed, I'd appreciate it. My tests are OK too but I see an issue with the code in enlargeStringInfo(), regarding integer overflow. The bit of comment that says: Note we are assuming here that limit <= INT_MAX/2, else the above loop could overflow. is obsolete, it's now INT_MAX instead of INT_MAX/2. There's a related problem here: newlen = 2 * str->maxlen; while (needed > newlen) newlen = 2 * newlen; str->maxlen is an int going up to INT_MAX so [2 * str->maxlen] now *will* overflow when [str->maxlen > INT_MAX/2]. Eventually it somehow works because of this: if (newlen > limit) newlen = limit; but newlen is wonky (when resulting from int overflow) before being brought back to limit. PFA a minimal fix. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite diff --git a/src/backend/lib/stringinfo.c b/src/backend/lib/stringinfo.c index b618b37..b01afbe 100644 --- a/src/backend/lib/stringinfo.c +++ b/src/backend/lib/stringinfo.c @@ -313,14 +313,13 @@ enlargeStringInfo(StringInfo str, int needed) * for efficiency, double the buffer size each time it overflows. * Actually, we might need to more than double it if 'needed' is big... */ - newlen = 2 * str->maxlen; + newlen = 2 * (Size)str->maxlen; /* avoid integer overflow */ while (needed > newlen) newlen = 2 * newlen; /* - * Clamp to the limit in case we went past it. Note we are assuming here - * that limit <= INT_MAX/2, else the above loop could overflow. We will - * still have newlen >= needed. + * Clamp to the limit in case we went past it. We will still have + * newlen >= needed. */ if (newlen > limit) newlen = limit; -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
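To make the overflow concrete, here is a small standalone sketch (illustrative only; the variable names merely echo enlargeStringInfo()) showing how the doubling step misbehaves once maxlen exceeds INT_MAX/2, and how widening the arithmetic — as the minimal fix above does — avoids it:

#include <limits.h>
#include <stddef.h>
#include <stdio.h>

int
main(void)
{
	int		maxlen = INT_MAX - 100;	/* a buffer already close to 2GB */
	int		needed = INT_MAX - 50;	/* caller asks for slightly more */

	/*
	 * Buggy form (all int arithmetic): 2 * maxlen exceeds INT_MAX, which is
	 * signed overflow, i.e. undefined behavior; in practice newlen comes out
	 * negative and the clamp below cannot repair it.
	 */
	/* int newlen = 2 * maxlen; */

	/* Fixed form: widen to size_t (PostgreSQL's Size) before doubling. */
	size_t	newlen = 2 * (size_t) maxlen;

	while ((size_t) needed > newlen)
		newlen = 2 * newlen;

	/* Clamp back to the permitted maximum, as enlargeStringInfo() does. */
	if (newlen > (size_t) INT_MAX)
		newlen = (size_t) INT_MAX;

	printf("newlen = %zu\n", newlen);	/* 2147483647 */
	return 0;
}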
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
I have now pushed this to 9.5, 9.6 and master. It could be backpatched to 9.4 with ease (just a small change in heap_form_tuple); anything further back would require much more effort. I implemented the 32-bit limit using sizeof(int32). I tested and all the mentioned cases seem to work sanely; if you can spare some more time to test what was committed, I'd appreciate it. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Daniel Verite wrote: > If we consider what would happen with the latest patch on a platform > with sizeof(int)=8 and a \copy invocation like this: > > \copy (select big,big,big,big,big,big,big,big,.. FROM > (select lpad('', 1024*1024*200) as big) s) TO /dev/null > > if we put enough copies of "big" in the select-list to grow over 2GB, > and then over 4GB. Oh, right, I was forgetting that. > On a platform with sizeof(int)=4 this should normally fail over 2GB with > "Cannot enlarge string buffer containing $X bytes by $Y more bytes" > > I don't have an ILP64 environment myself to test, but I suspect > it would malfunction instead of cleanly erroring out on such > platforms. I suspect nobody has such platforms, as they are mostly defunct. But I see an easy way to fix it, which is to use sizeof(int32). > Also, without this limit, we can "COPY FROM/TO file" really huge rows, 4GB > and beyond, like I tried successfully during the tests mentioned upthread > (again with len as int64 on x86_64). > So such COPYs would succeed or fail depending on whether they deal with > a file or a network connection. > Do we want this difference in behavior? Such a patch would be for master only. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Alvaro Herrera wrote: > But I realized that doing it this way is simple enough; > and furthermore, in any platforms where int is 8 bytes (ILP64), this > would automatically allow for 63-bit-sized stringinfos On such platforms there's the next problem that we can't send COPY rows through the wire protocol when they're larger than 2GB. Based on the tests with previous iterations of the patch that used int64 for the StringInfo length, the COPY backend code does not gracefully fail in that case. What happened when trying (linux x86_64) with a 2GB-4GB row is that the size seems to overflow and is sent as a 32-bit integer with the MSB set, which is confusing for the client side (at least libpq cannot deal with it). If we consider what would happen with the latest patch on a platform with sizeof(int)=8 and a \copy invocation like this: \copy (select big,big,big,big,big,big,big,big,.. FROM (select lpad('', 1024*1024*200) as big) s) TO /dev/null if we put enough copies of "big" in the select-list to grow over 2GB, and then over 4GB. On a platform with sizeof(int)=4 this should normally fail over 2GB with "Cannot enlarge string buffer containing $X bytes by $Y more bytes" I don't have an ILP64 environment myself to test, but I suspect it would malfunction instead of cleanly erroring out on such platforms. One advantage of hardcoding the StringInfo limit to 2GB independently of sizeof(int) is to basically avoid the problem. Also, without this limit, we can "COPY FROM/TO file" really huge rows, 4GB and beyond, like I tried successfully during the tests mentioned upthread (again with len as int64 on x86_64). So such COPYs would succeed or fail depending on whether they deal with a file or a network connection. Do we want this difference in behavior? (keeping in mind that they will both fail anyway if any individual field in the row is larger than 1GB) Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
I just wrote: > The big advantage of your v3 patch is that it can be backpatched without > fear of breaking ABI, so I've struggled to maintain that property in my > changes. We can further tweak in HEAD-only; for example change the API > to use Size instead of int. I think that would be desirable, but let's > not do it until we have backpatched this one. One thing I just noticed while trying to backpatch this is that we can do so only to 9.5, because older branches do not have MemoryContextAllocExtended(). They do have MemoryContextAllocHuge(), but the caller in heaptuple.c wants zeroed memory too, so we'd need to memset; I think that could get us back to 9.4. 9.3 and older is not possible because we don't have "huge alloc" there. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Daniel Verite wrote: > And in enlargeStringInfo() the patch adds this: > /* >* Maximum size for strings with allow_long=true. >* It must not exceed INT_MAX, as the StringInfo routines >* expect offsets into the buffer to fit into an int. >*/ > const int max_alloc_long = 0x7fff; > > On a 32-bit arch, we can expect max_alloc_long == MaxAllocHugeSize, > but on a 64-bit arch, it will be much smaller with MaxAllocHugeSize > at (2^63)-1. Yeah, you're right. Here's a v4 of this patch, in which I've done some further very small adjustments to your v3: * I changed the 0x7fff hardcoded value with some arithmetic which sholud have the same result on most architectures: /* * Determine the upper size limit. The fact that the StringInfo API uses * "int" everywhere limits us to int's width even for "long_ok" strings. */ limit = str->long_ok ? (((Size) 1) << (Min(sizeof(int), sizeof(Size)) * 8 - 1)) - 1 : MaxAllocSize; Initially I just had "sizeof(int)" instead of the Min(), and a StaticAssert for sizeof(int) <= sizeof(Size), on the grounds that no actual platform is likely to break that (though I think the C standard does allow it). But I realized that doing it this way is simple enough; and furthermore, in any platforms where int is 8 bytes (ILP64), this would automatically allow for 63-bit-sized stringinfos. I don't think this exists today, but wikipedia[1] lists two obsolete ones, [2] and [3]. [1] https://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_models [2] https://en.wikipedia.org/wiki/HAL_SPARC64 [3] https://en.wikipedia.org/wiki/UNICOS * I reverted the enlargement looping logic in enlargeStringInfo() to pretty much the original code (minus the cast). * I tweaked a few comments. The big advantage of your v3 patch is that it can be backpatched without fear of breaking ABI, so I've struggled to maintain that property in my changes. We can further tweak in HEAD-only; for example change the API to use Size instead of int. I think that would be desirable, but let's not do it until we have backpatched this one. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c index e27ec78..631e555 100644 --- a/src/backend/access/common/heaptuple.c +++ b/src/backend/access/common/heaptuple.c @@ -741,7 +741,9 @@ heap_form_tuple(TupleDesc tupleDescriptor, * Allocate and zero the space needed. Note that the tuple body and * HeapTupleData management structure are allocated in one chunk. */ - tuple = (HeapTuple) palloc0(HEAPTUPLESIZE + len); + tuple = MemoryContextAllocExtended(CurrentMemoryContext, + HEAPTUPLESIZE + len, + MCXT_ALLOC_HUGE | MCXT_ALLOC_ZERO); tuple->t_data = td = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE); /* diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c index 3c81906..ec5d6f1 100644 --- a/src/backend/commands/copy.c +++ b/src/backend/commands/copy.c @@ -385,7 +385,7 @@ ReceiveCopyBegin(CopyState cstate) pq_sendint(&buf, format, 2);/* per-column formats */ pq_endmessage(&buf); cstate->copy_dest = COPY_NEW_FE; - cstate->fe_msgbuf = makeStringInfo(); + cstate->fe_msgbuf = makeLongStringInfo(); } else { @@ -1907,7 +1907,7 @@ CopyTo(CopyState cstate) cstate->null_print_client = cstate->null_print; /* default */ /* We use fe_msgbuf as a per-row buffer regardless of copy_dest */ - cstate->fe_msgbuf = makeStringInfo(); + cstate->fe_msgbuf = makeLongStringInfo(); /* Get info about the columns we need to process. 
*/ cstate->out_functions = (FmgrInfo *) palloc(num_phys_attrs * sizeof(FmgrInfo)); @@ -2742,8 +2742,8 @@ BeginCopyFrom(ParseState *pstate, cstate->cur_attval = NULL; /* Set up variables to avoid per-attribute overhead. */ - initStringInfo(&cstate->attribute_buf); - initStringInfo(&cstate->line_buf); + initLongStringInfo(&cstate->attribute_buf); + initLongStringInfo(&cstate->line_buf); cstate->line_buf_converted = false; cstate->raw_buf = (char *) palloc(RAW_BUF_SIZE + 1); cstate->raw_buf_index = cstate->raw_buf_len = 0; diff --git a/src/backend/lib/stringinfo.c b/src/backend/lib/stringinfo.c index 7382e08..35f2eab 100644 --- a/src/backend/lib/stringinfo.c +++ b/src/backend/lib/stringinfo.c @@ -37,10 +37,28 @@ makeStringInfo(void) } /* + * makeLongStringInfo + * + * Same as makeStringInfo for larger strings. + */ +StringInfo +
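As a sanity check of the limit expression quoted above, this throwaway sketch (with a local Min macro standing in for the one in c.h) evaluates the proposed ceiling under the usual data models: it yields 2^31-1 wherever int is 32 bits, and would yield 2^63-1 on a hypothetical ILP64 machine:

#include <stddef.h>
#include <stdio.h>

#define Min(a, b)  ((a) < (b) ? (a) : (b))

typedef size_t Size;			/* PostgreSQL's Size is a size_t typedef */

int
main(void)
{
	/* The upper-limit expression from the v4 patch, quoted above. */
	Size	limit = (((Size) 1) << (Min(sizeof(int), sizeof(Size)) * 8 - 1)) - 1;

	/*
	 * ILP32 and LP64: sizeof(int) == 4, so limit == 2^31 - 1 (0x7fffffff).
	 * On a hypothetical ILP64 platform, sizeof(int) == 8 and the same
	 * expression would give 2^63 - 1.
	 */
	printf("long_ok limit = %zu bytes\n", limit);
	return 0;
}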
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Alvaro Herrera wrote: > I propose to rename allow_long to huge_ok. "Huge" is the terminology > used by palloc anyway. I'd keep makeLongStringInfo() and > initLongStringInfo() though as interface, because using Huge instead of > Long there looks strange. Not wedded to that, though (particularly as > it's a bit inconsistent). "Long" makes sense to me as qualifying a limit greater than MaxAllocSize but lower than (or equal to) MaxAllocHugeSize. In memutils.h we have these definitions: #define MaxAllocSize ((Size) 0x3fffffff) /* 1 gigabyte - 1 */ #define MaxAllocHugeSize ((Size) -1 >> 1) /* SIZE_MAX / 2 */ And in enlargeStringInfo() the patch adds this: /* * Maximum size for strings with allow_long=true. * It must not exceed INT_MAX, as the StringInfo routines * expect offsets into the buffer to fit into an int. */ const int max_alloc_long = 0x7fffffff; On a 32-bit arch, we can expect max_alloc_long == MaxAllocHugeSize, but on a 64-bit arch, it will be much smaller with MaxAllocHugeSize at (2^63)-1. IOW, the patch only doubles the maximum size of StringInfo whereas we could say that it should multiply it by 2^32 to lay claim to the "huge" qualifier. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
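To put the three ceilings side by side, here is a small illustrative program (the first two constants are copied from the memutils.h definitions quoted above, the third from the patch) showing that the allow_long limit is only twice MaxAllocSize, while MaxAllocHugeSize is vastly larger on 64-bit builds:

#include <stddef.h>
#include <stdio.h>

typedef size_t Size;

/* Copied from the memutils.h definitions quoted above. */
#define MaxAllocSize		((Size) 0x3fffffff)		/* 1 gigabyte - 1 */
#define MaxAllocHugeSize	((Size) -1 >> 1)		/* SIZE_MAX / 2 */

/* The ceiling the patch applies to allow_long stringinfos. */
static const int max_alloc_long = 0x7fffffff;		/* 2 gigabytes - 1 */

int
main(void)
{
	printf("MaxAllocSize     = %zu\n", MaxAllocSize);
	printf("max_alloc_long   = %zu\n", (Size) max_alloc_long);
	printf("MaxAllocHugeSize = %zu\n", MaxAllocHugeSize);
	/* On a 64-bit build the last value is 2^63 - 1, i.e. about 8 EiB. */
	return 0;
}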
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
I propose to rename allow_long to huge_ok. "Huge" is the terminology used by palloc anyway. I'd keep makeLongStringInfo() and initLongStringInfo() though as interface, because using Huge instead of Long there looks strange. Not wedded to that, though (particularly as it's a bit inconsistent). I'm not terribly sure about enlargeStringInfo(). I think I would propose that it takes Size instead of int. That has rather more fallout than I'd like, but if we don't do that, then I'm not sure that appendStringInfo continues to make sense -- considering that it uses the return value from appendStringInfoVA (which I'm redefining as returning Size) to pass to enlargeStringInfo. I'm not really sure how to ensure that the values passed fit both in an int and Size ... which they would, given that all the callers use not-huge stringinfos anyway. But it'd be better if the compiler knew. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Daniel Verite wrote: > Here's an updated patch. Compared to the previous version: > > - removed CopyStartSend (per comment #1 in review) > > - renamed flag to allow_long (comment #2) > > - resetStringInfo no longer resets the flag (comment #3). > > - allowLongStringInfo() is removed (comment #3 and #5), > makeLongStringInfo() and initLongStringInfo() are provided > instead, as alternatives to makeStringInfo() and initStringInfo(). > > - StringInfoData.len is back to int type, 2GB max. > (addresses comment #4 incidentally). > This is safer because many routines peeking > into StringInfoData use int for offsets into the buffer, > for instance most of the stuff in backend/libpq/pqformat.c > Although these routines are not supposed to be called on long > buffers, this assertion was not enforced in the code, and > with a 64-bit length effectively over 2GB, they would misbehave > pretty badly. Hmm, this looks a lot better I think. One change we overlooked and I just noticed is that we need to change appendStringInfoVA() to return size_t rather than int. This comment at the bottom of it gave that up: /* * Return pvsnprintf's estimate of the space needed. (Although this is * given as a size_t, we know it will fit in int because it's not more * than MaxAllocSize.) */ return (int) nprinted; The parenthetical comment is no longer true. This means we need to update all its callers to match. I suppose it won't be a problem for most of them since they are not using long stringinfos anyway, but it seems better to keep everything consistent. I hope that callers that do not adjust the type of the declared variable would get a compiler warning. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Hi Tomas and Gerdan, This is a gentle reminder: you are both assigned as reviewers of the current patch in the 11-2016 commitfest, but you haven't shared any reviews yet. Could you please share your views on the patch? This will help the commitfest run more smoothly. Please ignore this if you have already shared your review. Regards, Hari Babu Fujitsu Australia
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
On Fri, Sep 30, 2016 at 10:12 PM, Daniel Verite wrote: > Tomas Vondra wrote: > >> A few minor comments regarding the patch: >> >> 1) CopyStartSend seems pretty pointless - It only has one function call >> in it, and is called on exactly one place (and all other places simply >> call allowLongStringInfo directly). I'd get rid of this function and >> replace the call in CopyOneRowTo(() with allowLongStringInfo(). >> >> 2) allowlong seems awkward, allowLong or allow_long would be better >> >> 3) Why does resetStringInfo reset the allowLong flag? Are there any >> cases when we want/need to forget the flag value? I don't think so, so >> let's simply not reset it and get rid of the allowLongStringInfo() >> calls. Maybe it'd be better to invent a new makeLongStringInfo() method >> instead, which would make it clear that the flag value is permanent. >> >> 4) HandleParallelMessage needs a tweak, as it uses msg->len in a log >> message, but with '%d' and not '%ld' (as needed after changing the type >> to Size). >> >> 5) The comment at allowLongStringInfo talks about allocLongStringInfo >> (i.e. wrong function name). > > Here's an updated patch. Compared to the previous version: > > - removed CopyStartSend (per comment #1 in review) > > - renamed flag to allow_long (comment #2) > > - resetStringInfo no longer resets the flag (comment #3). > > - allowLongStringInfo() is removed (comment #3 and #5), > makeLongStringInfo() and initLongStringInfo() are provided > instead, as alternatives to makeStringInfo() and initStringInfo(). > > - StringInfoData.len is back to int type, 2GB max. > (addresses comment #4 incidentally). > This is safer because many routines peeking > into StringInfoData use int for offsets into the buffer, > for instance most of the stuff in backend/libpq/pqformat.c > Altough these routines are not supposed to be called on long > buffers, this assertion was not enforced in the code, and > with a 64-bit length effectively over 2GB, they would misbehave > pretty badly. The patch status is "Waiting on Author" in the CF app, but a new patch has been sent 2 days ago, so this entry has been moved to next CF with "Needs Review". -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Tomas Vondra wrote: > A few minor comments regarding the patch: > > 1) CopyStartSend seems pretty pointless - It only has one function call > in it, and is called on exactly one place (and all other places simply > call allowLongStringInfo directly). I'd get rid of this function and > replace the call in CopyOneRowTo(() with allowLongStringInfo(). > > 2) allowlong seems awkward, allowLong or allow_long would be better > > 3) Why does resetStringInfo reset the allowLong flag? Are there any > cases when we want/need to forget the flag value? I don't think so, so > let's simply not reset it and get rid of the allowLongStringInfo() > calls. Maybe it'd be better to invent a new makeLongStringInfo() method > instead, which would make it clear that the flag value is permanent. > > 4) HandleParallelMessage needs a tweak, as it uses msg->len in a log > message, but with '%d' and not '%ld' (as needed after changing the type > to Size). > > 5) The comment at allowLongStringInfo talks about allocLongStringInfo > (i.e. wrong function name). Here's an updated patch. Compared to the previous version: - removed CopyStartSend (per comment #1 in review) - renamed flag to allow_long (comment #2) - resetStringInfo no longer resets the flag (comment #3). - allowLongStringInfo() is removed (comment #3 and #5), makeLongStringInfo() and initLongStringInfo() are provided instead, as alternatives to makeStringInfo() and initStringInfo(). - StringInfoData.len is back to int type, 2GB max. (addresses comment #4 incidentally). This is safer because many routines peeking into StringInfoData use int for offsets into the buffer, for instance most of the stuff in backend/libpq/pqformat.c Altough these routines are not supposed to be called on long buffers, this assertion was not enforced in the code, and with a 64-bit length effectively over 2GB, they would misbehave pretty badly. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c index 6d0f3f3..ed9cabd 100644 --- a/src/backend/access/common/heaptuple.c +++ b/src/backend/access/common/heaptuple.c @@ -741,7 +741,9 @@ heap_form_tuple(TupleDesc tupleDescriptor, * Allocate and zero the space needed. Note that the tuple body and * HeapTupleData management structure are allocated in one chunk. */ - tuple = (HeapTuple) palloc0(HEAPTUPLESIZE + len); + tuple = MemoryContextAllocExtended(CurrentMemoryContext, + HEAPTUPLESIZE + len, + MCXT_ALLOC_HUGE | MCXT_ALLOC_ZERO); tuple->t_data = td = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE); /* diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c index 432b0ca..7419362 100644 --- a/src/backend/commands/copy.c +++ b/src/backend/commands/copy.c @@ -397,7 +397,7 @@ ReceiveCopyBegin(CopyState cstate) pq_sendint(&buf, format, 2);/* per-column formats */ pq_endmessage(&buf); cstate->copy_dest = COPY_NEW_FE; - cstate->fe_msgbuf = makeStringInfo(); + cstate->fe_msgbuf = makeLongStringInfo(); } else if (PG_PROTOCOL_MAJOR(FrontendProtocol) >= 2) { @@ -1892,7 +1892,7 @@ CopyTo(CopyState cstate) cstate->null_print_client = cstate->null_print; /* default */ /* We use fe_msgbuf as a per-row buffer regardless of copy_dest */ - cstate->fe_msgbuf = makeStringInfo(); + cstate->fe_msgbuf = makeLongStringInfo(); /* Get info about the columns we need to process. 
*/ cstate->out_functions = (FmgrInfo *) palloc(num_phys_attrs * sizeof(FmgrInfo)); @@ -2707,8 +2707,8 @@ BeginCopyFrom(ParseState *pstate, cstate->cur_attval = NULL; /* Set up variables to avoid per-attribute overhead. */ - initStringInfo(&cstate->attribute_buf); - initStringInfo(&cstate->line_buf); + initLongStringInfo(&cstate->attribute_buf); + initLongStringInfo(&cstate->line_buf); cstate->line_buf_converted = false; cstate->raw_buf = (char *) palloc(RAW_BUF_SIZE + 1); cstate->raw_buf_index = cstate->raw_buf_len = 0; diff --git a/src/backend/lib/stringinfo.c b/src/backend/lib/stringinfo.c index 7382e08..0125047 100644 --- a/src/backend/lib/stringinfo.c +++ b/src/backend/lib/stringinfo.c @@ -37,6 +37,24 @@ makeStringInfo(void) } /* + * makeLongStringInfo + * + * Same as makeStringInfo for larger strings. + */ +StringInfo +makeLongStringInfo(void) +{ + StringInfo res; + + res = (StringInfo) palloc(sizeof(StringInfoData)); + + initLongStringInfo(res); + + return res; +} + + +/* * initStringInfo * * Initialize a StringInfoData struct (with previously undefine
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Tomas Vondra wrote: > 4) HandleParallelMessage needs a tweak, as it uses msg->len in a log > message, but with '%d' and not '%ld' (as needed after changing the type > to Size). This could be %zu for the Size type. It's supported by elog(). However, it occurs to me that if we don't claim the 2GB..4GB range for the CopyData message, because its Int32 length is not explicitly unsigned as mentioned upthread, then it should follow that we don't need to change StringInfo.{len,maxlen} from int type to Size type. We should just set the upper limit for StringInfo.maxlen to (0x7fffffff-1) instead of MaxAllocHugeSize for stringinfos with the allow_long flag set, and leave the len and maxlen fields at their original int type. Does that make sense? Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
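A rough, hypothetical sketch of the simplification suggested above — the struct keeps plain int fields and only the growth ceiling changes with the flag; the type and helper names here are stand-ins for illustration, not actual PostgreSQL code:

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

typedef size_t Size;

#define MaxAllocSize  ((Size) 0x3fffffff)	/* 1 gigabyte - 1 */

/* Hypothetical stand-in for StringInfoData: len/maxlen stay plain int. */
typedef struct
{
	char	   *data;
	int			len;
	int			maxlen;
	bool		allow_long;
} StringInfoSketch;

/* Pick the growth ceiling as suggested above: just below INT_MAX. */
static Size
growth_limit(const StringInfoSketch *str)
{
	return str->allow_long ? (Size) 0x7fffffff - 1 : MaxAllocSize;
}

int
main(void)
{
	StringInfoSketch s = {NULL, 0, 1024, true};

	printf("ceiling = %zu bytes\n", growth_limit(&s));	/* 2147483646 */
	return 0;
}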
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
On 09/03/2016 02:21 AM, Tomas Vondra wrote: A few minor comments regarding the patch: 1) CopyStartSend seems pretty pointless - it only has one function call in it, and is called in exactly one place (and all other places simply call allowLongStringInfo directly). I'd get rid of this function and replace the call in CopyOneRowTo() with allowLongStringInfo(). 2) allowlong seems awkward, allowLong or allow_long would be better 3) Why does resetStringInfo reset the allowLong flag? Are there any cases when we want/need to forget the flag value? I don't think so, so let's simply not reset it and get rid of the allowLongStringInfo() calls. Maybe it'd be better to invent a new makeLongStringInfo() method instead, which would make it clear that the flag value is permanent. 4) HandleParallelMessage needs a tweak, as it uses msg->len in a log message, but with '%d' and not '%ld' (as needed after changing the type to Size). 5) The comment at allowLongStringInfo talks about allocLongStringInfo (i.e. wrong function name). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Hi, I've subscribed to review this patch in the current CF, so let me start with a brief summary of the thread as it started more than a year ago. In general the thread is about hitting the 1GB allocation size limit (and 32-bit length in FE/BE protocol) in various ways: 1) attributes are smaller than 1GB, but row exceeds the limit This causes issues in COPY/pg_dump, as we allocate a single StringInfo for the whole row, and send it as a single message (which includes int32 length, just like most FE/BE messages). 2) attribute values that get over the 1GB due to encoding For example it's possible to construct values that are OK in hex encoding but >1GB in escape (and vice versa). Those values get stored just fine, but there's no way to dump / COPY them. And if you happen to have both, you can't do pg_dump :-/ I think it's OK not to be able to store extremely large values, but only if we make it predictable (ideally by rejecting the value before storing in in the database). The cases discussed in the thread are particularly annoying exactly because we do the opposite - we allow the value to be stored, but then fail when retrieving the value or trying to backup the database (which is particularly nasty, IMNSHO). So although we don't have that many reports about this, it'd be nice to improve the behavior a bit. The trouble is, this rabbit hole is fairly deep - wherever we palloc or detoast a value, we're likely to hit those issues with wide values/rows. Honestly, I don't think it's feasible to make all the code paths work with such wide values/rows - as TL points out, particularly with the wide values it would be enormously invasive. So the patch aims to fix just the simplest case, i.e. when the values are smaller than 1GB but the total row is larger. It does so by allowing StringInfo to exceed the MaxAllocSize in special cases (currently only COPY FROM/TO), and using MCXT_ALLOC_HUGE when forming the heap tuple (to make it possible to load the data). It seems to work and I do think it's a reasonable first step to make things work. A few minor comments regarding the patch: 1) CopyStartSend seems pretty pointless - It only has one function call in it, and is called on exactly one place (and all other places simply call allowLongStringInfo directly). I'd get rid of this function and replace the call in CopyOneRowTo(() with allowLongStringInfo(). 2) allowlong seems awkward, allowLong or allow_long would be better 3) Why does resetStringInfo reset the allowLong flag? Are there any cases when we want/need to forget the flag value? I don't think so, so let's simply not reset it and get rid of the allowLongStringInfo() calls. Maybe it'd be better to invent a new makeLongStringInfo() method instead, which would make it clear that the flag value is permanent. 4) HandleParallelMessage needs a tweak, as it uses msg->len in a log message, but with '%d' and not '%ld' (as needed after changing the type to Size). I however wonder whether it wouldn't be better to try Robert's proposal, i.e. not building the huge StringInfo for the whole row, but sending the attribute data directly. I don't mean to send messages for each attribute - the current FE/BE protocol does not allow that (it says that each 'd' message is a whole row), but just sending the network message in smaller chunks. That would make the StringInfo changes unnecessary, reduce the memory consumption considerably (right now we need 2x the memory, and we know we're dealing with gigabytes of data). 
Of course, it'd require significant changes of how copy works internally (instead of accumulating data for the whole row, we'd have to send them right in smaller chunks). Which would allow us getting rid of the StringInfo changes, but we'd not be able to backpatch this. Regarding interpreting the Int32 length field in FE/BE protocol as unsigned, I'm a bit worried it might qualify as breaking the protocol. It's true we don't really say whether it's signed or unsigned, and we handle it differently depending on the message type, but I wonder how many libraries simply use int32. OTOH those clients are unlikely to handle even the 2GB we might send them without breaking the protocol, so I guess this is fine. And 4GB seems like the best we can do. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
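On the client-library point above, the standard libpq COPY-out loop is a concrete example: PQgetCopyData() reports each row's length as a plain int, so even a client willing to buffer a row larger than 2GB cannot see its true length through this API. A minimal sketch (connection string and table name are placeholders):

#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

int
main(void)
{
	/* Placeholder connection string and table name. */
	PGconn	   *conn = PQconnectdb("dbname=postgres");
	PGresult   *res;
	char	   *buf;
	int			len;

	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
		return 1;
	}

	res = PQexec(conn, "COPY bigtable TO STDOUT");
	if (PQresultStatus(res) != PGRES_COPY_OUT)
	{
		fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));
		PQclear(res);
		PQfinish(conn);
		return 1;
	}
	PQclear(res);

	/* One successful call per data row; the row length arrives as an int. */
	while ((len = PQgetCopyData(conn, &buf, 0)) > 0)
	{
		fwrite(buf, 1, len, stdout);
		PQfreemem(buf);
	}
	/* len == -1 means COPY is done; -2 means an error occurred. */

	while ((res = PQgetResult(conn)) != NULL)
		PQclear(res);
	PQfinish(conn);
	return 0;
}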
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
On 24 March 2016 at 01:14, Daniel Verite wrote: > > It provides a useful mitigation to dump/reload databases having > rows in the 1GB-2GB range, but it works under these limitations: > > - no single field has a text representation exceeding 1GB. > - no row as text exceeds 2GB (\copy from fails beyond that. AFAICS we > could push this to 4GB with limited changes to libpq, by > interpreting the Int32 field in the CopyData message as unsigned). This seems like worthwhile mitigation for an issue multiple people have hit in the wild, and more will. Giving Pg more generally graceful handling for big individual datums is going to be a bit of work, though. Support for wide-row, big-Datum COPY in and out. Efficient lazy fetching of large TOASTed data by follow-up client requests. Range fetching of large compressed TOASTed values (possibly at the price of worse compression) without having to decompress the whole thing up to the start of the desired range. Lots of fun. At least we have lob / pg_largeobject. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
On Sat, Apr 23, 2016 at 9:58 AM, Bruce Momjian wrote: > On Thu, Mar 3, 2016 at 10:31:26AM +0900, Michael Paquier wrote: >> On Thu, Mar 3, 2016 at 12:47 AM, Alvaro Herrera >> wrote: >> > Well, the CopyData message has an Int32 field for the message length. >> > I don't know the FE/BE protocol very well but I suppose each row >> > corresponds to one CopyData message, or perhaps each column corresponds >> > to one CopyData message. In either case, it's not possible to go beyond >> > 2GB without changing the protocol ... >> >> Based on what I know from this stuff (OOM libpq and other stuff >> remnants), one 'd' message means one row. fe-protocol3.c and >> CopySendEndOfRow in backend's copy.c are confirming that as well. I am >> indeed afraid that having extra logic to get chunks of data will >> require extending the protocol with a new message type for this >> purpose. > > Is there any documentation that needs updating based on this research? Perhaps. In the docs, the two sections referring to the CopyData messages are in protocol.sgml - with this portion for the 'd' message itself: CopyData (F & B) - and a more precise description here: COPY Operations We could state explicitly in one of them that the maximum size of a CopyData message can be up to 1GB. Thoughts? -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
On Thu, Mar 3, 2016 at 10:31:26AM +0900, Michael Paquier wrote: > On Thu, Mar 3, 2016 at 12:47 AM, Alvaro Herrera > wrote: > > Well, the CopyData message has an Int32 field for the message length. > > I don't know the FE/BE protocol very well but I suppose each row > > corresponds to one CopyData message, or perhaps each column corresponds > > to one CopyData message. In either case, it's not possible to go beyond > > 2GB without changing the protocol ... > > Based on what I know from this stuff (OOM libpq and other stuff > remnants), one 'd' message means one row. fe-protocol3.c and > CopySendEndOfRow in backend's copy.c are confirming that as well. I am > indeed afraid that having extra logic to get chunks of data will > require extending the protocol with a new message type for this > purpose. Is there any documentation that needs updating based on this research? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Alvaro Herrera wrote: > > tuple = (HeapTuple) palloc0(HEAPTUPLESIZE + len); > > > > which fails because (HEAPTUPLESIZE + len) is again considered > > an invalid size, the size being 1468006476 in my test. > > Um, it seems reasonable to make this one be a huge-zero-alloc: > > MemoryContextAllocExtended(CurrentMemoryContext, > HEAPTUPLESIZE + len, > MCXT_ALLOC_HUGE | MCXT_ALLOC_ZERO) Good, this allows the tests to go to completion! The tests in question are dump/reload of a row with several fields totalling 1.4GB (deflated), with COPY TO/FROM file and psql's \copy in both directions, as well as pg_dump followed by pg_restore|psql. The modified patch is attached. It provides a useful mitigation to dump/reload databases having rows in the 1GB-2GB range, but it works under these limitations: - no single field has a text representation exceeding 1GB. - no row as text exceeds 2GB (\copy from fails beyond that. AFAICS we could push this to 4GB with limited changes to libpq, by interpreting the Int32 field in the CopyData message as unsigned). It's also possible to go beyond 4GB per row with this patch, but when not using the protocol. I've managed to get a 5.6GB single-row file with COPY TO file. That doesn't help with pg_dump, but that might be useful in other situations. In StringInfo, I've changed int64 to Size, because on 32 bits platforms the downcast from int64 to Size is problematic, and as the rest of the allocation routines seems to favor Size, it seems more consistent anyway. I couldn't test on 32 bits though, as I seem to never have enough free contiguous memory available on a 32 bits VM to handle that kind of data. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c index 6d0f3f3..ed9cabd 100644 --- a/src/backend/access/common/heaptuple.c +++ b/src/backend/access/common/heaptuple.c @@ -741,7 +741,9 @@ heap_form_tuple(TupleDesc tupleDescriptor, * Allocate and zero the space needed. Note that the tuple body and * HeapTupleData management structure are allocated in one chunk. 
*/ - tuple = (HeapTuple) palloc0(HEAPTUPLESIZE + len); + tuple = MemoryContextAllocExtended(CurrentMemoryContext, + HEAPTUPLESIZE + len, + MCXT_ALLOC_HUGE | MCXT_ALLOC_ZERO); tuple->t_data = td = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE); /* diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c index 3201476..ce681d7 100644 --- a/src/backend/commands/copy.c +++ b/src/backend/commands/copy.c @@ -446,6 +446,15 @@ SendCopyEnd(CopyState cstate) } } +/* + * Prepare for output + */ +static void +CopyStartSend(CopyState cstate) +{ + allowLongStringInfo(cstate->fe_msgbuf); +} + /*-- * CopySendData sends output data to the destination (file or frontend) * CopySendString does the same for null-terminated strings @@ -2015,6 +2024,8 @@ CopyOneRowTo(CopyState cstate, Oid tupleOid, Datum *values, bool *nulls) MemoryContextReset(cstate->rowcontext); oldcontext = MemoryContextSwitchTo(cstate->rowcontext); + CopyStartSend(cstate); + if (cstate->binary) { /* Binary per-tuple header */ @@ -3203,6 +3214,7 @@ CopyReadLine(CopyState cstate) bool result; resetStringInfo(&cstate->line_buf); + allowLongStringInfo(&cstate->line_buf); cstate->line_buf_valid = true; /* Mark that encoding conversion hasn't occurred yet */ @@ -3272,6 +3284,7 @@ CopyReadLine(CopyState cstate) { /* transfer converted data back to line_buf */ resetStringInfo(&cstate->line_buf); + allowLongStringInfo(&cstate->line_buf); appendBinaryStringInfo(&cstate->line_buf, cvt, strlen(cvt)); pfree(cvt); } @@ -3696,7 +3709,7 @@ CopyReadAttributesText(CopyState cstate) } resetStringInfo(&cstate->attribute_buf); - + allowLongStringInfo(&cstate->attribute_buf); /* * The de-escaped attributes will certainly not be longer than the input * data line, so we can just force attribute_buf to be large enough and diff --git a/src/backend/lib/stringinfo.c b/src/backend/lib/stringinfo.c index 7382e08..6e451b2 100644 --- a/src/backend/lib/stringinfo.c +++ b/src/backend/lib/stringinfo.c @@ -47,12 +47,24 @@ initStringInfo(StringInfo str) { int size = 1024; /* initial default buffer size */ - str->data = (char *) palloc(size); + str->data = (char *) palloc(size); /* no need for "huge" at this point */ str->maxlen = size; + str->allowlong = false; resetStringInfo(str); } /* + * allocLongStringInfo + * + * Mark the StringInfo as a candidate for a "huge" allocation + */ +void +allowLongStringInfo(StringInfo str) +{ + str->allowlong = true; +} + +/* * resetStringInfo * * Reset the StringInfo: the data buffer remains valid, but its @@ -64,6 +76,7 @@ resetStringInfo(StringInfo str) str->data[0] = '\0'; str->len = 0; str->cursor = 0; + str->allowl
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Daniel Verite wrote: > # \copy bigtext2 from '/var/tmp/bigtext.sql' > ERROR: 54000: out of memory > DETAIL: Cannot enlarge string buffer containing 1073741808 bytes by 8191 > more bytes. > CONTEXT: COPY bigtext2, line 1 > LOCATION: enlargeStringInfo, stringinfo.c:278 To go past that problem, I've tried tweaking the StringInfoData used for COPY FROM, like the original patch does in CopyOneRowTo. It turns out that it fails a bit later when trying to make a tuple from the big line, in heap_form_tuple(): tuple = (HeapTuple) palloc0(HEAPTUPLESIZE + len); which fails because (HEAPTUPLESIZE + len) is again considered an invalid size, the size being 1468006476 in my test. At this point it feels like a dead end, at least for the idea that extending StringInfoData might suffice to enable COPYing such large rows. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Daniel Verite wrote: > To go past that problem, I've tried tweaking the StringInfoData > used for COPY FROM, like the original patch does in CopyOneRowTo. > > It turns out that it fails a bit later when trying to make a tuple > from the big line, in heap_form_tuple(): > > tuple = (HeapTuple) palloc0(HEAPTUPLESIZE + len); > > which fails because (HEAPTUPLESIZE + len) is again considered > an invalid size, the size being 1468006476 in my test. Um, it seems reasonable to make this one be a huge-zero-alloc: MemoryContextAllocExtended(CurrentMemoryContext, HEAPTUPLESIZE + len, MCXT_ALLOC_HUGE | MCXT_ALLOC_ZERO) -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Tom Lane wrote: > BTW, is anyone checking the other side of this, ie "COPY IN" with equally > wide rows? There doesn't seem to be a lot of value in supporting dump > if you can't reload ... Indeed, the lines bigger than 1GB can't be reloaded :( My test: with a stock 9.5.1 plus Alvaro's patch with my additional int64 fix mentioned upthread, creating this table and filling all columns with 200MB of text each: create table bigtext(t1 text ,t2 text, t3 text, t4 text, t5 text, t6 text, t7 text); # \copy bigtext to '/var/tmp/bigtext.sql' That part does work as expected, producing this huge single line: $ wc /var/tmp/bigtext.sql 1 7 1468006407 /var/tmp/bigtext.sql But reloading it fails: # create table bigtext2 (like bigtext); CREATE TABLE # copy bigtext2 from '/var/tmp/bigtext.sql'; ERROR: 54000: out of memory DETAIL: Cannot enlarge string buffer containing 1073676288 bytes by 65536 more bytes. CONTEXT: COPY bigtext2, line 1 LOCATION: enlargeStringInfo, stringinfo.c:278 # \copy bigtext2 from '/var/tmp/bigtext.sql' ERROR: 54000: out of memory DETAIL: Cannot enlarge string buffer containing 1073741808 bytes by 8191 more bytes. CONTEXT: COPY bigtext2, line 1 LOCATION: enlargeStringInfo, stringinfo.c:278 Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
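The reported byte count is consistent with the layout of a single COPY text line: seven 200MB fields, six tab separators, and a trailing newline. A quick illustrative check:

#include <stdint.h>
#include <stdio.h>

int
main(void)
{
	uint64_t	field = 200ULL * 1024 * 1024;	/* lpad('', 200MB) per column */
	uint64_t	line = 7 * field + 6 /* tabs */ + 1 /* newline */;

	printf("expected COPY line length = %llu bytes\n",
		   (unsigned long long) line);			/* 1468006407 */
	return 0;
}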
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
On Thu, Mar 3, 2016 at 12:47 AM, Alvaro Herrera wrote: > Well, the CopyData message has an Int32 field for the message length. > I don't know the FE/BE protocol very well but I suppose each row > corresponds to one CopyData message, or perhaps each column corresponds > to one CopyData message. In either case, it's not possible to go beyond > 2GB without changing the protocol ... Based on what I know from this stuff (OOM libpq and other stuff remnants), one 'd' message means one row. fe-protocol3.c and CopySendEndOfRow in backend's copy.c are confirming that as well. I am indeed afraid that having extra logic to get chunks of data will require extending the protocol with a new message type for this purpose. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Tomas Vondra wrote: > My guess is this is a problem at the protocol level - the 'd' message is > CopyData, and all the messages use int32 to define length. So if there's > a 2GB row, it's likely to overflow. Yes. Besides, the full message includes a negative length: > postgres=# \copy big2 to /dev/null > lost synchronization with server: got message type "d", length -1568669676 which happens to be the correct size if interpreted as an unsigned int32 -1568669676 = (int) (1300UL*1024*1024*2 + 3 + 3*4 + 1 + 4) One interpretation would be that putting an unsigned length in a CopyData message is a protocol violation. However it's not clear to me that Int32 in the doc necessarily designates a signed integer. Int32 is defined as: Intn(i) An n-bit integer in network byte order (most significant byte first). If i is specified it is the exact value that will appear, otherwise the value is variable. Eg. Int16, Int32(42). There's at least one example where we use Int16 as unsigned: the number of parameters in Bind (F) can be up to 65535. This maximum is tested explicitly and referred to at several places in fe-exec. In some instances, Int32 is clearly signed, because -1 is accepted to indicate NULLness, such as again in Bind (F) for the length of the parameter value. From this it seems to me that Intn is to be interpreted as signed or unsigned on a case by case basis. Back to CopyData (F & B), it's documented as: Byte1('d') Identifies the message as COPY data. Int32 Length of message contents in bytes, including self. Byten Data that forms part of a COPY data stream. Messages sent from the backend will always correspond to single data rows, but messages sent by frontends might divide the data stream arbitrarily. I don't see any hint that this length is signed, nor any reason for having it signed. I guess before the patch it didn't matter, for the B case at least, because the backend never sent more than 1GB. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
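The unsigned reinterpretation is easy to verify in a few lines of standalone C (the byte count is the one given above; the cast shows the two's-complement wraparound a client performs when it reads the length as a signed Int32):

#include <stdint.h>
#include <stdio.h>

int
main(void)
{
	/* The CopyData message size computed in the message above. */
	uint64_t	true_size = (uint64_t) 1300 * 1024 * 1024 * 2 + 3 + 3 * 4 + 1 + 4;

	/* What a client sees if it reads the Int32 length field as signed. */
	int32_t		as_signed = (int32_t) true_size;

	printf("actual size = %llu\n", (unsigned long long) true_size);	/* 2726297620 */
	printf("as int32    = %d\n", (int) as_signed);						/* -1568669676 */
	return 0;
}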
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Daniel Verite wrote: > The cause of the crash turns out to be, in enlargeStringInfo(): > > needed += str->len + 1; /* total space required now */ > > needed is an int and str->len is an int64, so it overflows when the > size has to grow beyond 2^31 bytes, fails to enlarge the buffer and > overwrites memory after it. > > When fixing it with a local int64 copy of the variable, the backend > no longer crashes and COPY big2 TO 'file' appears to work. Great, thanks for debugging. > However, getting it to the client with \copy big2 to 'file' > still produces the error in psql: > lost synchronization with server: got message type "d" > and leaves an empty file, so there are more problems to solve to > go beyond 2GB text per row. Well, the CopyData message has an Int32 field for the message length. I don't know the FE/BE protocol very well but I suppose each row corresponds to one CopyData message, or perhaps each column corresponds to one CopyData message. In either case, it's not possible to go beyond 2GB without changing the protocol ... -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
On 03/02/2016 04:23 PM, Tom Lane wrote: Tomas Vondra writes: On 03/02/2016 03:18 PM, Daniel Verite wrote: However, getting it to the client with \copy big2 to 'file' still produces the error in psql: lost synchronization with server: got message type "d" and leaves an empty file, so there are more problems to solve to go beyond 2GB text per row. My guess is this is a problem at the protocol level - the 'd' message is CopyData, and all the messages use int32 to define length. So if there's a 2GB row, it's likely to overflow. I'm too lazy to check the exact wording, but I don't think there's a hard and fast promise in the protocol doc that one CopyData message == one row. So we could probably subdivide a very wide line into multiple messages. Well, actually we claim this [1]: Data that forms part of a COPY data stream. Messages sent from the backend will always correspond to single data rows, but messages sent by frontends might divide the data stream arbitrarily. So that suggests 1:1 messages to rows, although I'm not sure how difficult it would be to relax this (or how much the clients might rely on this). [1] http://www.postgresql.org/docs/9.5/static/protocol-message-formats.html regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Tomas Vondra writes: > On 03/02/2016 03:18 PM, Daniel Verite wrote: >> However, getting it to the client with \copy big2 to 'file' >> still produces the error in psql: >> lost synchronization with server: got message type "d" >> and leaves an empty file, so there are more problems to solve to >> go beyond 2GB text per row. > My guess is this is a problem at the protocol level - the 'd' message is > CopyData, and all the messages use int32 to define length. So if there's > a 2GB row, it's likely to overflow. I'm too lazy to check the exact wording, but I don't think there's a hard and fast promise in the protocol doc that one CopyData message == one row. So we could probably subdivide a very wide line into multiple messages. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Hi, On 03/02/2016 03:18 PM, Daniel Verite wrote: I wrote: postgres=# \copy big2 to /dev/null lost synchronization with server: got message type "d", length -1568669676 The backend aborts with the following backtrace: Program terminated with signal 6, Aborted. #0 0x7f82ee68e165 in raise () from /lib/x86_64-linux-gnu/libc.so.6 #1 0x7f82ee6913e0 in abort () from /lib/x86_64-linux-gnu/libc.so.6 #2 0x7f82ee6c839b in ?? () from /lib/x86_64-linux-gnu/libc.so.6 #3 0x7f82ee6d1be6 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 #4 0x7f82ee6d698c in free () from /lib/x86_64-linux-gnu/libc.so.6 #5 0x007b5a89 in AllocSetDelete (context=0x) at aset.c:643 #6 0x007b63e3 in MemoryContextDelete (context=0x1fa58c8) at mcxt.c:229 #7 0x0055fa25 in CopyTo (cstate=0x1fb1050) at copy.c:1967 The cause of the crash turns out to be, in enlargeStringInfo(): needed += str->len + 1; /* total space required now */ needed is an int and str->len is an int64, so it overflows when the size has to grow beyond 2^31 bytes, fails to enlarge the buffer and overwrites memory after it. When fixing it with a local int64 copy of the variable, the backend no longer crashes and COPY big2 TO 'file' appears to work. However, getting it to the client with \copy big2 to 'file' still produces the error in psql: lost synchronization with server: got message type "d" and leaves an empty file, so there are more problems to solve to go beyond 2GB text per row. My guess is this is a problem at the protocol level - the 'd' message is CopyData, and all the messages use int32 to define length. So if there's a 2GB row, it's likely to overflow. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
I wrote: > postgres=# \copy big2 to /dev/null > lost synchronization with server: got message type "d", length -1568669676 > > The backend aborts with the following backtrace: > > Program terminated with signal 6, Aborted. > #0 0x7f82ee68e165 in raise () from /lib/x86_64-linux-gnu/libc.so.6 > #1 0x7f82ee6913e0 in abort () from /lib/x86_64-linux-gnu/libc.so.6 > #2 0x7f82ee6c839b in ?? () from /lib/x86_64-linux-gnu/libc.so.6 > #3 0x7f82ee6d1be6 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 > #4 0x7f82ee6d698c in free () from /lib/x86_64-linux-gnu/libc.so.6 > #5 0x007b5a89 in AllocSetDelete (context=0x) >at aset.c:643 > #6 0x007b63e3 in MemoryContextDelete (context=0x1fa58c8) at > mcxt.c:229 > #7 0x0055fa25 in CopyTo (cstate=0x1fb1050) at copy.c:1967 The cause of the crash turns out to be, in enlargeStringInfo(): needed += str->len + 1; /* total space required now */ needed is an int and str->len is an int64, so it overflows when the size has to grow beyond 2^31 bytes, fails to enlarge the buffer and overwrites memory after it. When fixing it with a local int64 copy of the variable, the backend no longer crashes and COPY big2 TO 'file' appears to work. However, getting it to the client with \copy big2 to 'file' still produces the error in psql: lost synchronization with server: got message type "d" and leaves an empty file, so there are more problems to solve to go beyond 2GB text per row. Or maybe another approach would be to advertise that this is the maximum for a row in text mode, and limit the backend's string buffer to this size for the time being? Notwithstanding that there's still the other direction client->server to test. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
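A standalone illustration of the overflow Daniel describes (the values are invented; this is not stringinfo.c): with str->len widened to int64 but "needed" still a plain int, the sum is truncated back to int and can go negative, so enlargeStringInfo() under-sizes the buffer and the subsequent copy scribbles past it. Accumulating in an int64 local, along the lines of Daniel's fix, keeps the arithmetic exact:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int64_t str_len = INT64_C(1900000000);   /* bytes already in the buffer (made-up value) */
    int     needed  = 400000000;             /* bytes the caller wants to append (made-up value) */

    /* what "needed += str->len + 1" stores while "needed" is an int: the int64 sum
     * 2300000001 is truncated and, on common platforms, wraps negative */
    int     truncated = (int) (needed + str_len + 1);

    /* a local int64 accumulator, as in the tested fix, keeps the true value */
    int64_t widened = (int64_t) needed + str_len + 1;

    printf("int accumulator:   %d\n", truncated);
    printf("int64 accumulator: %lld\n", (long long) widened);
    return 0;
}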
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
"Daniel Verite" writes: > I've tried adding another large field to see what happens if the whole row > exceeds 2GB, and data goes to the client rather than to a file. > My idea was to check if the client side was OK with that much data on > a single COPY row, but it turns out that the server is not OK anyway. BTW, is anyone checking the other side of this, ie "COPY IN" with equally wide rows? There doesn't seem to be a lot of value in supporting dump if you can't reload ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
I wrote: > If splitting the table into 3 fields, each smaller than 512MB: > > postgres=# create table big2 as select > substring(binarycol from 1 for 300*1024*1024) as b1, > substring(binarycol from 1+300*1024*1024 for 300*1024*1024) as b2 , > substring(binarycol from 1+600*1024*1024 for 400*1024*1024) as b3 > from big; I've tried adding another large field to see what happens if the whole row exceeds 2GB, and data goes to the client rather than to a file. postgres=# alter table big2 add b4 bytea; postgres=# update big2 set b4=b1; So big2 has 4 bytea columns with 300+300+400+300 MB on a single row. Then I'm trying to \copy this from a 9.5.1 backend with patch applied, configured with --enable-debug, on Debian8 x86-64 with 64GB of RAM and all defaults in postgresql.conf My idea was to check if the client side was OK with that much data on a single COPY row, but it turns out that the server is not OK anyway. postgres=# \copy big2 to /dev/null lost synchronization with server: got message type "d", length -1568669676 The backend aborts with the following backtrace: Program terminated with signal 6, Aborted. #0 0x7f82ee68e165 in raise () from /lib/x86_64-linux-gnu/libc.so.6 #1 0x7f82ee6913e0 in abort () from /lib/x86_64-linux-gnu/libc.so.6 #2 0x7f82ee6c839b in ?? () from /lib/x86_64-linux-gnu/libc.so.6 #3 0x7f82ee6d1be6 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 #4 0x7f82ee6d698c in free () from /lib/x86_64-linux-gnu/libc.so.6 #5 0x007b5a89 in AllocSetDelete (context=0x) at aset.c:643 #6 0x007b63e3 in MemoryContextDelete (context=0x1fa58c8) at mcxt.c:229 #7 0x0055fa25 in CopyTo (cstate=0x1fb1050) at copy.c:1967 #8 DoCopyTo (cstate=cstate@entry=0x1fb1050) at copy.c:1778 #9 0x00562ea9 in DoCopy (stmt=stmt@entry=0x1f796d0, queryString=queryString@entry=0x1f78c60 "COPY big2 TO STDOUT ", processed=processed@entry=0x7fff22103338) at copy.c:961 #10 0x006b4440 in standard_ProcessUtility (parsetree=0x1f796d0, queryString=0x1f78c60 "COPY big2 TO STDOUT ", context=, params=0x0, dest=0x1f79a30, completionTag=0x7fff22103680 "") at utility.c:541 #11 0x006b1937 in PortalRunUtility (portal=0x1f26680, utilityStmt=0x1f796d0, isTopLevel=1 '\001', dest=0x1f79a30, completionTag=0x7fff22103680 "") at pquery.c:1183 #12 0x006b2535 in PortalRunMulti (portal=portal@entry=0x1f26680, isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x1f79a30, altdest=altdest@entry=0x1f79a30, completionTag=completionTag@entry=0x7fff22103680 "") at pquery.c:1314 #13 0x006b3022 in PortalRun (portal=portal@entry=0x1f26680, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x1f79a30, altdest=altdest@entry=0x1f79a30, completionTag=completionTag@entry=0x7fff22103680 "") at pquery.c:812 #14 0x006b0393 in exec_simple_query ( query_string=0x1f78c60 "COPY big2 TO STDOUT ") at postgres.c:1104 #15 PostgresMain (argc=, argv=argv@entry=0x1f0d240, dbname=0x1f0d0f0 "postgres", username=) at postgres.c:4030 #16 0x0047072b in BackendRun (port=0x1f2d230) at postmaster.c:4239 #17 BackendStartup (port=0x1f2d230) at postmaster.c:3913 #18 ServerLoop () at postmaster.c:1684 #19 0x0065b8cd in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x1f0c330) at postmaster.c:1292 #20 0x00471161 in main (argc=3, argv=0x1f0c330) at main.c:223 The server log has this: STATEMENT: COPY big2 TO STDOUT *** glibc detected *** postgres: postgres postgres [local] COPY: double free or corruption (out): 0x7f8234929010 *** === Backtrace: = /lib/x86_64-linux-gnu/libc.so.6(+0x75be6)[0x7f82ee6d1be6] 
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x6c)[0x7f82ee6d698c] postgres: postgres postgres [local] COPY[0x7b5a89] postgres: postgres postgres [local] COPY(MemoryContextDelete+0x43)[0x7b63e3] postgres: postgres postgres [local] COPY[0x55fa25] postgres: postgres postgres [local] COPY(DoCopy+0x479)[0x562ea9] postgres: postgres postgres [local] COPY(standard_ProcessUtility+0x590)[0x6b4440] postgres: postgres postgres [local] COPY[0x6b1937] postgres: postgres postgres [local] COPY[0x6b2535] postgres: postgres postgres [local] COPY(PortalRun+0x202)[0x6b3022] postgres: postgres postgres [local] COPY(PostgresMain+0x1493)[0x6b0393] postgres: postgres postgres [local] COPY[0x47072b] postgres: postgres postgres [local] COPY(PostmasterMain+0xe7d)[0x65b8cd] postgres: postgres postgres [local] COPY(main+0x3e1)[0x471161] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd)[0x7f82ee67aead] postgres: postgres postgres [local] COPY[0x4711c9] I will try other tests without bytea exported in text format but with several huge text columns. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
"Daniel Verite" writes: > Alvaro Herrera wrote: >> If others can try this patch to ensure it enables pg_dump to work on >> their databases, it would be great. > It doesn't seem to help if one field exceeds 1Gb, for instance when > inflated by a bin->hex translation. It's not going to be possible to fix that without enormously invasive changes (affecting individual datatype I/O functions, for example). And in general, fields approaching that size are going to give you problems in all kinds of ways, not only COPY. I think we should be satisfied if we can make COPY deal with the sum of a line's fields exceeding 1GB. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Alvaro Herrera wrote: > If others can try this patch to ensure it enables pg_dump to work on > their databases, it would be great. It doesn't seem to help if one field exceeds 1Gb, for instance when inflated by a bin->hex translation. postgres=# create table big as select pg_read_binary_file('data') as binarycol; postgres=# select octet_length(binarycol) from big; octet_length -- 107370 postgres=# copy big to '/var/tmp/big.copy'; ERROR: XX000: invalid memory alloc request size 214743 LOCATION: palloc, mcxt.c:903 Same problem with pg_dump. OTOH, it improves the case where the cumulative size of field contents for a row exceeds 1 Gb, but not any single field exceeds that size. If splitting the table into 3 fields, each smaller than 512MB: postgres=# create table big2 as select substring(binarycol from 1 for 300*1024*1024) as b1, substring(binarycol from 1+300*1024*1024 for 300*1024*1024) as b2 , substring(binarycol from 1+600*1024*1024 for 400*1024*1024) as b3 from big; postgres=# copy big2 to '/var/tmp/big.copy'; COPY 1 then that works, producing a single line of 2097152012 chars in the output file. By contrast, it fails with an unpatched 9.5: postgres=# copy big2 to '/var/tmp/big.copy'; ERROR: 54000: out of memory DETAIL: Cannot enlarge string buffer containing 629145605 bytes by 629145602 more bytes. LOCATION: enlargeStringInfo, stringinfo.c:260 If setting bytea_output to 'escape', it also fails with the patch applied, as it tries to allocate 4x the binary field size, and it exceeds 1GB again. postgres=# set bytea_output =escape; SET postgres=# copy big2 to '/var/tmp/big.copy'; ERROR: invalid memory alloc request size 1258291201 LOCATION: palloc, mcxt.c:821 1258291201 = 300*1024*1024*4+1 Also, the COPY of both tables work fine if using (FORMAT BINARY), on both the patched and unpatched server. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
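A back-of-the-envelope check of the two allocation sizes in the report above. The per-byte expansion factors are my assumptions about the bytea text output formats (hex emits "\x" plus two characters per byte; escape can need up to four characters per byte), not code lifted from the backend:

#include <stdio.h>

int main(void)
{
    unsigned long long field = 300ULL * 1024 * 1024;   /* one 300 MB bytea column */

    unsigned long long hex    = 2 + 2 * field;   /* 629145602: under MaxAllocSize, so the patched COPY succeeds */
    unsigned long long escape = 4 * field + 1;   /* 1258291201: matches the failed palloc request */

    printf("hex output:    %llu bytes\n", hex);
    printf("escape output: %llu bytes\n", escape);
    return 0;
}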
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
A customer of ours was hit by this recently and I'd like to get it fixed for good. Robert Haas wrote: > On Mon, Apr 6, 2015 at 1:51 PM, Jim Nasby wrote: > > In any case, I don't think it would be terribly difficult to allow a bit > > more than 1GB in a StringInfo. Might need to tweak palloc too; ISTR there's > > some 1GB limits there too. > > The point is, those limits are there on purpose. Changing things > arbitrarily wouldn't be hard, but doing it in a principled way is > likely to require some thought. For example, in the COPY OUT case, > presumably what's happening is that we palloc a chunk for each > individual datum, and then palloc a buffer for the whole row. Right. There's a serious problem here already, and users are being bitten by it in existing releases. I think we should provide a non-invasive fix for back-branches which we could apply as a bug fix. Here's a proposed patch for the localized fix; it just adds a flag to StringInfo allowing the string to grow beyond the 1GB limit, but only for strings which are specifically marked. That way we avoid having to audit all the StringInfo-using code. In this patch, only the COPY path is allowed to grow beyond 1GB, which is enough to close the current problem -- or at least my test case for it. If others can try this patch to ensure it enables pg_dump to work on their databases, it would be great. (In this patch there's a known buglet which is that the UINT64_FORMAT patched in the error message doesn't allow for translatability.) Like Robert, I don't like this approach for the long term, however. I think it would be saner to have CopySendData not keep a single gigantic string in memory; it would be better to get the bytes out to the client sooner than end-of-row. To avoid causing a performance hit, we would only flush when the size of the output buffer is about to reach the allocation limit (MaxAllocSize); so for regular-sized tuples, we would only do it at end-of-row, keeping the current behavior. I don't have a patch for this yet; I'm going to try that next. 
-- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c index 3eba9ef..fcc4fe6 100644 --- a/src/backend/commands/copy.c +++ b/src/backend/commands/copy.c @@ -445,6 +445,15 @@ SendCopyEnd(CopyState cstate) } } +/* + * Prepare for output + */ +static void +CopyStartSend(CopyState cstate) +{ + allowLongStringInfo(cstate->fe_msgbuf); +} + /*-- * CopySendData sends output data to the destination (file or frontend) * CopySendString does the same for null-terminated strings @@ -1865,6 +1874,8 @@ CopyOneRowTo(CopyState cstate, Oid tupleOid, Datum *values, bool *nulls) MemoryContextReset(cstate->rowcontext); oldcontext = MemoryContextSwitchTo(cstate->rowcontext); + CopyStartSend(cstate); + if (cstate->binary) { /* Binary per-tuple header */ diff --git a/src/backend/lib/stringinfo.c b/src/backend/lib/stringinfo.c index 7d03090..8c08eb7 100644 --- a/src/backend/lib/stringinfo.c +++ b/src/backend/lib/stringinfo.c @@ -47,12 +47,24 @@ initStringInfo(StringInfo str) { int size = 1024; /* initial default buffer size */ - str->data = (char *) palloc(size); + str->data = (char *) palloc(size); /* no need for "huge" at this point */ str->maxlen = size; + str->allowlong = false; resetStringInfo(str); } /* + * allocLongStringInfo + * + * Mark the StringInfo as a candidate for a "huge" allocation + */ +void +allowLongStringInfo(StringInfo str) +{ + str->allowlong = true; +} + +/* * resetStringInfo * * Reset the StringInfo: the data buffer remains valid, but its @@ -64,6 +76,7 @@ resetStringInfo(StringInfo str) str->data[0] = '\0'; str->len = 0; str->cursor = 0; + str->allowlong = false; } /* @@ -244,7 +257,8 @@ appendBinaryStringInfo(StringInfo str, const char *data, int datalen) void enlargeStringInfo(StringInfo str, int needed) { - int newlen; + int64 newlen; + Size limit; /* * Guard against out-of-range "needed" values. Without this, we can get @@ -252,16 +266,19 @@ enlargeStringInfo(StringInfo str, int needed) */ if (needed < 0)/* should not happen */ elog(ERROR, "invalid string enlargement request size: %d", needed); - if (((Size) needed) >= (MaxAllocSize - (Size) str->len)) + + /* choose the proper limit and verify this allocation wouldn't exceed it */ + limit = str->allowlong ? MaxAllocHugeSize : MaxAllocSize; + if (((Size) needed) >= (limit - (Size) str->len)) ereport(ERROR, (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED), errmsg("out of memory"), - errdetail("Cannot enlarge string buffer containing %d bytes by %d more bytes.", + errdetail("Cannot enlarge string buffer containing "INT64_FORMAT" bytes by %d more bytes.", str->len, needed))); needed += str->len + 1; /* total space required now */ - /* Because of the above test, we
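For the longer-term direction Álvaro describes at the end of his message above (flush the output buffer to the client before it approaches MaxAllocSize instead of holding the whole row), here is a rough standalone sketch. The buffer type, threshold and flush callback are invented for illustration; this is not the copy.c code, and flushing mid-row would break the documented one-row-per-CopyData-message behaviour discussed elsewhere in this thread:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define FLUSH_THRESHOLD (64u * 1024 * 1024)  /* stand-in for "close to MaxAllocSize" */

typedef struct RowBuf
{
    char   *data;
    size_t  len;
    size_t  maxlen;
} RowBuf;

/* stand-in for sending one CopyData message to the client */
static void flush_to_client(RowBuf *buf)
{
    fprintf(stderr, "would send %zu bytes\n", buf->len);
    buf->len = 0;
}

/* append bytes, flushing first if the buffer would grow past the threshold;
 * regular-sized rows never hit the threshold, so they still flush only at end of row.
 * A single datum larger than the threshold would still grow the buffer, which
 * mirrors the per-field limit discussed upthread. */
static void copy_send_data(RowBuf *buf, const char *bytes, size_t n)
{
    if (buf->len > 0 && buf->len + n > FLUSH_THRESHOLD)
        flush_to_client(buf);

    if (buf->len + n > buf->maxlen)
    {
        while (buf->maxlen < buf->len + n)
            buf->maxlen *= 2;
        buf->data = realloc(buf->data, buf->maxlen);
        if (buf->data == NULL)
            abort();
    }
    memcpy(buf->data + buf->len, bytes, n);
    buf->len += n;
}

int main(void)
{
    static char chunk[1024 * 1024];          /* 1 MB of zero bytes, arriving in pieces */
    RowBuf buf = { malloc(1024), 0, 1024 };

    for (int i = 0; i < 200; i++)            /* simulate a 200 MB field */
        copy_send_data(&buf, chunk, sizeof(chunk));
    flush_to_client(&buf);                   /* end of row */
    free(buf.data);
    return 0;
}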
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
On 4/7/15 10:29 PM, Michael Paquier wrote: On Wed, Apr 8, 2015 at 11:53 AM, Robert Haas wrote: On Mon, Apr 6, 2015 at 1:51 PM, Jim Nasby wrote: In any case, I don't think it would be terribly difficult to allow a bit more than 1GB in a StringInfo. Might need to tweak palloc too; ISTR there's some 1GB limits there too. The point is, those limits are there on purpose. Changing things arbitrarily wouldn't be hard, but doing it in a principled way is likely to require some thought. For example, in the COPY OUT case, presumably what's happening is that we palloc a chunk for each individual datum, and then palloc a buffer for the whole row. Now, we could let the whole-row buffer be bigger, but maybe it would be better not to copy all of the (possibly very large) values for the individual columns over into a row buffer before sending it. Some refactoring that avoids the need for a potentially massive (1.6TB?) whole-row buffer would be better than just deciding to allow it. I think that something to be aware of is that this is as well going to require some rethinking of the existing libpq functions that are here to fetch a row during COPY with PQgetCopyData, to make them able to fetch chunks of data from one row. The discussion about upping the StringInfo limit was for cases where a change in encoding blows up because it's now larger. My impression was that these cases don't expand by a lot, so we wouldn't be significantly expanding StringInfo. I agree that buffering 1.6TB of data would be patently absurd. Handling the case of COPYing a row that's >1GB clearly needs work than just bumping up some size limits. That's why I was wondering whether this was a real scenario or just hypothetical... I'd be surprised if someone would be happy with the performance of 1GB tuples, let alone even larger than that. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
On Wed, Apr 8, 2015 at 11:53 AM, Robert Haas wrote: > On Mon, Apr 6, 2015 at 1:51 PM, Jim Nasby wrote: >> In any case, I don't think it would be terribly difficult to allow a bit >> more than 1GB in a StringInfo. Might need to tweak palloc too; ISTR there's >> some 1GB limits there too. > > The point is, those limits are there on purpose. Changing things > arbitrarily wouldn't be hard, but doing it in a principled way is > likely to require some thought. For example, in the COPY OUT case, > presumably what's happening is that we palloc a chunk for each > individual datum, and then palloc a buffer for the whole row. Now, we > could let the whole-row buffer be bigger, but maybe it would be better > not to copy all of the (possibly very large) values for the individual > columns over into a row buffer before sending it. Some refactoring > that avoids the need for a potentially massive (1.6TB?) whole-row > buffer would be better than just deciding to allow it. Something to be aware of is that this is also going to require some rethinking of the existing libpq functions used to fetch a row during COPY, namely PQgetCopyData, so that they can fetch chunks of data from one row. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
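As a concrete reminder of what Michael is referring to, this is roughly what a libpq client does today: PQgetCopyData() returns exactly one CopyData message per call, i.e. one whole row for COPY TO STDOUT, in a single malloc'd buffer, so a larger-than-2GB row has nowhere to go without protocol and API changes. A minimal sketch; the connection string and table name are invented and error handling is kept short:

#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn *conn = PQconnectdb("dbname=postgres");           /* made-up connection string */
    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        return 1;
    }

    PGresult *res = PQexec(conn, "COPY big2 TO STDOUT");
    if (PQresultStatus(res) != PGRES_COPY_OUT)
    {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        return 1;
    }
    PQclear(res);

    char *row;
    int   len;
    while ((len = PQgetCopyData(conn, &row, 0)) > 0)          /* one call == one row's data */
    {
        fwrite(row, 1, (size_t) len, stdout);
        PQfreemem(row);
    }
    if (len == -2)                                            /* -1 means normal end of COPY */
        fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));

    PQclear(PQgetResult(conn));                               /* collect the final command status */
    PQfinish(conn);
    return 0;
}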
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
On Mon, Apr 6, 2015 at 1:51 PM, Jim Nasby wrote: > In any case, I don't think it would be terribly difficult to allow a bit > more than 1GB in a StringInfo. Might need to tweak palloc too; ISTR there's > some 1GB limits there too. The point is, those limits are there on purpose. Changing things arbitrarily wouldn't be hard, but doing it in a principled way is likely to require some thought. For example, in the COPY OUT case, presumably what's happening is that we palloc a chunk for each individual datum, and then palloc a buffer for the whole row. Now, we could let the whole-row buffer be bigger, but maybe it would be better not to copy all of the (possibly very large) values for the individual columns over into a row buffer before sending it. Some refactoring that avoids the need for a potentially massive (1.6TB?) whole-row buffer would be better than just deciding to allow it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
On 3/31/15 3:46 AM, Ronan Dunklau wrote: >StringInfo uses int's to store length, so it could possibly be changed, >but then you'd just error out due to MaxAllocSize. > >Now perhaps those could both be relaxed, but certainly not to the extent >that you can shove an entire 1.6TB row into an output buffer. Another way to look at it would be to work in small chunks. For the first test case (rows bigger than 1GB), maybe the copy command could be rewritten to work in chunks, flushing the output more often if needed. Possibly; I'm not sure how well the FE/BE protocol or code would actually support that. >The other issue is that there's a LOT of places in code that blindly >copy detoasted data around, so while we technically support 1GB toasted >values you're probably going to be quite unhappy with performance. I'm >actually surprised you haven't already seen this with 500MB objects. > >So long story short, I'm not sure how worthwhile it would be to try and >fix this. We probably should improve the docs though. > I think that having data that can't be output by pg_dump is quite surprising, and if this is not fixable, I agree that it should clearly be documented. >Have you looked at using large objects for what you're doing? (Note that >those have their own set of challenges and limitations.) Yes I do. This particular customer of ours did not mind the performance penalty of using bytea objects as long as it was convenient to use. What do they do when they hit 1GB? Presumably if they're this close to the limit they're already hitting 1GB, no? Or is this mostly hypothetical? > > >We also hit a second issue, this time related to bytea encoding. > >There's probably several other places this type of thing could be a >problem. I'm thinking of conversions in particular. Yes, thats what the two other test cases I mentioned are about: any conversion leadng to a size greater than 1GB results in an error, even implicit conversions like doubling antislashes in the output. I think the big issue with encoding is going to be the risk of changing encoding and ending up with something too large to fit back into storage. They might need to consider using something like bytea(990MB). In any case, I don't think it would be terribly difficult to allow a bit more than 1GB in a StringInfo. Might need to tweak palloc too; ISTR there's some 1GB limits there too. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
Le lundi 30 mars 2015 18:45:41 Jim Nasby a écrit : > On 3/30/15 5:46 AM, Ronan Dunklau wrote: > > Hello hackers, > > > > I've tried my luck on pgsql-bugs before, with no success, so I report > > these > > problem here. > > > > The documentation mentions the following limits for sizes: > > > > Maximum Field Size 1 GB > > Maximum Row Size1.6 TB > > > > However, it seems like rows bigger than 1GB can't be COPYed out: > > > > ro=# create table test_text (c1 text, c2 text); > > CREATE TABLE > > ro=# insert into test_text (c1) VALUES (repeat('a', 536870912)); > > INSERT 0 1 > > ro=# update test_text set c2 = c1; > > UPDATE 1 > > > > Then, trying to dump or copy that results in the following error: > > > > ro=# COPY test_text TO '/tmp/test'; > > ERROR: out of memory > > DÉTAIL : Cannot enlarge string buffer containing 536870913 bytes by > > 536870912 more bytes. > > > > In fact, the same thing happens when using a simple SELECT: > > > > ro=# select * from test_text ; > > ERROR: out of memory > > DÉTAIL : Cannot enlarge string buffer containing 536870922 bytes by > > 536870912 more bytes. > > > > In the case of COPY, the server uses a StringInfo to output the row. The > > problem is, a StringInfo is capped to MAX_ALLOC_SIZE (1GB - 1), but a row > > should be able to hold much more than that. > > Yeah, shoving a whole row into one StringInfo is ultimately going to > limit a row to 1G, which is a far cry from what the docs claim. There's > also going to be problems with FE/BE communications, because things like > pq_sendbyte all use StringInfo as a buffer too. So while Postgres can > store a 1.6TB row, you're going to find a bunch of stuff that doesn't > work past around 1GB. > > > So, is this a bug ? Or is there a caveat I would have missed in the > > documentation ? > > I suppose that really depends on your point of view. The real question > is whether we think it's worth fixing, or a good idea to change the > behavior of StringInfo. > > StringInfo uses int's to store length, so it could possibly be changed, > but then you'd just error out due to MaxAllocSize. > > Now perhaps those could both be relaxed, but certainly not to the extent > that you can shove an entire 1.6TB row into an output buffer. Another way to look at it would be to work in small chunks. For the first test case (rows bigger than 1GB), maybe the copy command could be rewritten to work in chunks, flushing the output more often if needed. For the conversion related issues, I don't really see any other solution than extending StrinigInfo to allow for more than 1GB of data. On the other hand, those one can easily be circumvented by using a COPY ... WITH binary. > > The other issue is that there's a LOT of places in code that blindly > copy detoasted data around, so while we technically support 1GB toasted > values you're probably going to be quite unhappy with performance. I'm > actually surprised you haven't already seen this with 500MB objects. > > So long story short, I'm not sure how worthwhile it would be to try and > fix this. We probably should improve the docs though. > I think that having data that can't be output by pg_dump is quite surprising, and if this is not fixable, I agree that it should clearly be documented. > Have you looked at using large objects for what you're doing? (Note that > those have their own set of challenges and limitations.) Yes I do. This particular customer of ours did not mind the performance penalty of using bytea objects as long as it was convenient to use. 
> > > We also hit a second issue, this time related to bytea encoding. > > There's probably several other places this type of thing could be a > problem. I'm thinking of conversions in particular. Yes, that's what the two other test cases I mentioned are about: any conversion leading to a size greater than 1GB results in an error, even implicit conversions like doubling backslashes in the output. -- Ronan Dunklau http://dalibo.com - http://dalibo.org signature.asc Description: This is a digitally signed message part.
Re: [HACKERS] pg_dump / copy bugs with "big lines" ?
On 3/30/15 5:46 AM, Ronan Dunklau wrote: Hello hackers, I've tried my luck on pgsql-bugs before, with no success, so I report these problem here. The documentation mentions the following limits for sizes: Maximum Field Size 1 GB Maximum Row Size1.6 TB However, it seems like rows bigger than 1GB can't be COPYed out: ro=# create table test_text (c1 text, c2 text); CREATE TABLE ro=# insert into test_text (c1) VALUES (repeat('a', 536870912)); INSERT 0 1 ro=# update test_text set c2 = c1; UPDATE 1 Then, trying to dump or copy that results in the following error: ro=# COPY test_text TO '/tmp/test'; ERROR: out of memory DÉTAIL : Cannot enlarge string buffer containing 536870913 bytes by 536870912 more bytes. In fact, the same thing happens when using a simple SELECT: ro=# select * from test_text ; ERROR: out of memory DÉTAIL : Cannot enlarge string buffer containing 536870922 bytes by 536870912 more bytes. In the case of COPY, the server uses a StringInfo to output the row. The problem is, a StringInfo is capped to MAX_ALLOC_SIZE (1GB - 1), but a row should be able to hold much more than that. Yeah, shoving a whole row into one StringInfo is ultimately going to limit a row to 1G, which is a far cry from what the docs claim. There's also going to be problems with FE/BE communications, because things like pq_sendbyte all use StringInfo as a buffer too. So while Postgres can store a 1.6TB row, you're going to find a bunch of stuff that doesn't work past around 1GB. So, is this a bug ? Or is there a caveat I would have missed in the documentation ? I suppose that really depends on your point of view. The real question is whether we think it's worth fixing, or a good idea to change the behavior of StringInfo. StringInfo uses int's to store length, so it could possibly be changed, but then you'd just error out due to MaxAllocSize. Now perhaps those could both be relaxed, but certainly not to the extent that you can shove an entire 1.6TB row into an output buffer. The other issue is that there's a LOT of places in code that blindly copy detoasted data around, so while we technically support 1GB toasted values you're probably going to be quite unhappy with performance. I'm actually surprised you haven't already seen this with 500MB objects. So long story short, I'm not sure how worthwhile it would be to try and fix this. We probably should improve the docs though. Have you looked at using large objects for what you're doing? (Note that those have their own set of challenges and limitations.) We also hit a second issue, this time related to bytea encoding. There's probably several other places this type of thing could be a problem. I'm thinking of conversions in particular. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pg_dump / copy bugs with "big lines" ?
Hello hackers, I've tried my luck on pgsql-bugs before, with no success, so I report these problem here. The documentation mentions the following limits for sizes: Maximum Field Size 1 GB Maximum Row Size1.6 TB However, it seems like rows bigger than 1GB can't be COPYed out: ro=# create table test_text (c1 text, c2 text); CREATE TABLE ro=# insert into test_text (c1) VALUES (repeat('a', 536870912)); INSERT 0 1 ro=# update test_text set c2 = c1; UPDATE 1 Then, trying to dump or copy that results in the following error: ro=# COPY test_text TO '/tmp/test'; ERROR: out of memory DÉTAIL : Cannot enlarge string buffer containing 536870913 bytes by 536870912 more bytes. In fact, the same thing happens when using a simple SELECT: ro=# select * from test_text ; ERROR: out of memory DÉTAIL : Cannot enlarge string buffer containing 536870922 bytes by 536870912 more bytes. In the case of COPY, the server uses a StringInfo to output the row. The problem is, a StringInfo is capped to MAX_ALLOC_SIZE (1GB - 1), but a row should be able to hold much more than that. So, is this a bug ? Or is there a caveat I would have missed in the documentation ? We also hit a second issue, this time related to bytea encoding. This test case is a bit more complicated, since I had to use an external (client) program to insert my data. It involves inserting a string that fit into 1GB when encoded in escape format, but is larger than that in hex, and another string which fits in 1GB using the hex format, but is larger than that in escape: from psycopg2 import connect from io import BytesIO conn = connect(dbname="ro") cur = conn.cursor() fullcontent = BytesIO() # Write a binary string that weight less # than 1 GB when escape encoded, but more than # that if hex encoded for i in range(200): fullcontent.write(b"aaa" * 100) fullcontent.seek(0) cur.copy_from(fullcontent, "test_bytea") fullcontent.seek(0) fullcontent.truncate() # Write another binary string that weight # less than 1GB when hex encoded, but more than # that if escape encoded cur.execute("SET bytea_output = 'hex'") fullcontent.write(b"x") for i in range(300): fullcontent.write(b"00" * 100) fullcontent.seek(0) cur.copy_from(fullcontent, "test_bytea") cur.execute("COMMIT;") cur.close() I couldn't find an invocation of pg_dump which would allow me to dump both lines: ro@ronan_laptop /tmp % PGOPTIONS="-c bytea_output=escape" pg_dump -Fc > /dev/null pg_dump: Dumping the contents of table "test_bytea" failed: PQgetResult() failed. pg_dump: Error message from server: ERROR: invalid memory alloc request size 120001 pg_dump: The command was: COPY public.test_bytea (c1) TO stdout; ro@ronan_laptop /tmp % PGOPTIONS="-c bytea_output=hex" pg_dump -Fc > /dev/null pg_dump: Dumping the contents of table "test_bytea" failed: PQgetResult() failed. pg_dump: Error message from server: ERROR: invalid memory alloc request size 120003 pg_dump: The command was: COPY public.test_bytea (c1) TO stdout; Using a COPY with binary format works: ro=# COPY test_bytea TO '/tmp/test' WITH BINARY; There seems to be a third issue, with regards to escape encoding: the backslash character is escaped, by adding another backslash. This means that a field which size is less than 1GB using the escape sequence will not be able to be output once the backslash are escaped. 
For example, let's consider a string consisting of 3 '\' characters: ro=# select length(c1) from test_bytea; length --- 3 (1 ligne) ro=# select length(encode(c1, 'escape')) from test_bytea ; length --- 6 (1 ligne) ro=# set bytea_output to escape; SET ro=# copy test_bytea to '/tmp/test.csv' ; ERROR: out of memory DÉTAIL : Cannot enlarge string buffer containing 1073741822 bytes by 1 more bytes. I think pg_dump should not error out on any data which was valid upon insertion. It seems the fix would be non-trivial, since StringInfo structures rely on a limit of MaxAllocSize. Or am I missing something? Thank you. -- Ronan Dunklau http://dalibo.com - http://dalibo.org signature.asc Description: This is a digitally signed message part.