Re: Invalid memory alloc request size for repeat()

2022-05-26 Thread Tom Lane
Japin Li  writes:
> Do those mean we cannot store 1GB to a field [1] and send 1GB of data to the 
> client?

That's what I said upthread.  I'm not terribly excited about that.
Shoving gigabyte-sized field values around as atomic strings is not
going to lead to anything but pain: even if the server can manage
it, clients will likely fall over.  (Try a string a little smaller
than that, and notice how much psql sucks at handling it.)

There's been speculation from time to time about creating some
sort of streaming interface that would allow processing enormous
field values more reasonably.  You can kinda-sorta do that now
with large objects, but those have enough other limitations and
issues that they're not very recommendable as a general solution.
Someone should try to develop a better version of that.

regards, tom lane




Re: Invalid memory alloc request size for repeat()

2022-05-26 Thread Japin Li


On Thu, 26 May 2022 at 09:03, Japin Li  wrote:
> On Wed, 25 May 2022 at 22:50, Tom Lane  wrote:
>> Japin Li  writes:
>>> Today, I try to use repeat() to generate 1GB text, and it occurs invalid 
>>> memory
>>> alloc request size [1].  It is a limit from palloc(), then I try to reduce 
>>> it,
>>> it still complains out of memory which comes from enlargeStringInfo() [2].  
>>> The
>>> documentation about repect() [3] doesn't mentaion the limitation.
>>
>> It would probably make sense for repeat() to check this explicitly:
>>
>> if (unlikely(pg_mul_s32_overflow(count, slen, )) ||
>> -   unlikely(pg_add_s32_overflow(tlen, VARHDRSZ, )))
>> +   unlikely(pg_add_s32_overflow(tlen, VARHDRSZ, )) ||
>> +   unlikely(!AllocSizeIsValid(tlen)))
>> ereport(ERROR,
>> (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
>>  errmsg("requested length too large")));
>>
>
> LGTM.  Thanks for your patch!

After some analysis, I found it might not easy to solve this.

For example,

```
postgres=# CREATE TABLE myrepeat AS SELECT repeat('a', 1024 * 1024 * 1024 - 5);
ERROR:  invalid memory alloc request size 1073741871
```

Here is the backtrace:

#0  palloc0 (size=1073741871) at 
/mnt/workspace/postgresql/build/../src/backend/utils/mmgr/mcxt.c:1103
#1  0x561925199faf in heap_form_tuple (tupleDescriptor=0x561927cb4310, 
values=0x561927cb4470, isnull=0x561927cb4478)
at 
/mnt/workspace/postgresql/build/../src/backend/access/common/heaptuple.c:1069
#2  0x5619254879aa in tts_virtual_copy_heap_tuple (slot=0x561927cb4428) at 
/mnt/workspace/postgresql/build/../src/backend/executor/execTuples.c:280
#3  0x56192548a40c in ExecFetchSlotHeapTuple (slot=0x561927cb4428, 
materialize=true, shouldFree=0x7ffc9cc5197f)
at /mnt/workspace/postgresql/build/../src/backend/executor/execTuples.c:1660
#4  0x56192520e0c9 in heapam_tuple_insert (relation=0x7fdacaa9d3e0, 
slot=0x561927cb4428, cid=5, options=2, bistate=0x561927cb83f0)
at 
/mnt/workspace/postgresql/build/../src/backend/access/heap/heapam_handler.c:245
#5  0x5619253b2edf in table_tuple_insert (rel=0x7fdacaa9d3e0, 
slot=0x561927cb4428, cid=5, options=2, bistate=0x561927cb83f0)
at /mnt/workspace/postgresql/build/../src/include/access/tableam.h:1376
#6  0x5619253b3d73 in intorel_receive (slot=0x561927cb4428, 
self=0x561927c82430) at 
/mnt/workspace/postgresql/build/../src/backend/commands/createas.c:586
#7  0x561925478d86 in ExecutePlan (estate=0x561927cb3ed0, 
planstate=0x561927cb4108, use_parallel_mode=false, operation=CMD_SELECT, 
sendTuples=true, numberTuples=0,
direction=ForwardScanDirection, dest=0x561927c82430, execute_once=true) at 
/mnt/workspace/postgresql/build/../src/backend/executor/execMain.c:1667
#8  0x561925476735 in standard_ExecutorRun (queryDesc=0x561927cab990, 
direction=ForwardScanDirection, count=0, execute_once=true)
at /mnt/workspace/postgresql/build/../src/backend/executor/execMain.c:363
#9  0x56192547654b in ExecutorRun (queryDesc=0x561927cab990, 
direction=ForwardScanDirection, count=0, execute_once=true)
at /mnt/workspace/postgresql/build/../src/backend/executor/execMain.c:307
#10 0x5619253b3711 in ExecCreateTableAs (pstate=0x561927c35b00, 
stmt=0x561927b5aa70, params=0x0, queryEnv=0x0, qc=0x7ffc9cc52370)

The heap_form_tupe() function need extra 48 bytes for HeapTupleHeaderData and 
HEAPTUPLESIZE.

If we use the following, everything is ok.

postgres=# CREATE TABLE myrepeat AS SELECT repeat('a', 1024 * 1024 * 1024 - 5 - 
48);
SELECT 1


If we want to send the result to the client.  Here is an error.

postgres=# SELECT repeat('a', 1024 * 1024 * 1024 - 5);
ERROR:  out of memory
DETAIL:  Cannot enlarge string buffer containing 6 bytes by 1073741819 more 
bytes.

This is because the printtup() needs send to the number of attributes (2-byte)
and the length of data (4-byte) and then the data (1073741819-byte).

Do those mean we cannot store 1GB to a field [1] and send 1GB of data to the 
client?

[1] https://www.postgresql.org/docs/current/limits.html

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: Invalid memory alloc request size for repeat()

2022-05-25 Thread Japin Li


On Wed, 25 May 2022 at 22:50, Tom Lane  wrote:
> Japin Li  writes:
>> Today, I try to use repeat() to generate 1GB text, and it occurs invalid 
>> memory
>> alloc request size [1].  It is a limit from palloc(), then I try to reduce 
>> it,
>> it still complains out of memory which comes from enlargeStringInfo() [2].  
>> The
>> documentation about repect() [3] doesn't mentaion the limitation.
>
> It would probably make sense for repeat() to check this explicitly:
>
> if (unlikely(pg_mul_s32_overflow(count, slen, )) ||
> -   unlikely(pg_add_s32_overflow(tlen, VARHDRSZ, )))
> +   unlikely(pg_add_s32_overflow(tlen, VARHDRSZ, )) ||
> +   unlikely(!AllocSizeIsValid(tlen)))
> ereport(ERROR,
> (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
>  errmsg("requested length too large")));
>

LGTM.  Thanks for your patch!


-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: Invalid memory alloc request size for repeat()

2022-05-25 Thread Japin Li


On Wed, 25 May 2022 at 22:41, David G. Johnston  
wrote:
> On Wednesday, May 25, 2022, Japin Li  wrote:
>
>>
>> Hi,
>>
>> Today, I try to use repeat() to generate 1GB text, and it occurs invalid
>> memory
>> alloc request size [1].  It is a limit from palloc(), then I try to reduce
>> it,
>> it still complains out of memory which comes from enlargeStringInfo()
>> [2].  The
>> documentation about repect() [3] doesn't mentaion the limitation.
>>
>
> That is still a “field” even if it is not stored.
>
> https://www.postgresql.org/docs/current/limits.html
>

I mean this is a limitation about repect() function, it isn't really about 1GB,
we can only use 1GB - 4 for it.

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: Invalid memory alloc request size for repeat()

2022-05-25 Thread Tom Lane
Japin Li  writes:
> Today, I try to use repeat() to generate 1GB text, and it occurs invalid 
> memory
> alloc request size [1].  It is a limit from palloc(), then I try to reduce it,
> it still complains out of memory which comes from enlargeStringInfo() [2].  
> The
> documentation about repect() [3] doesn't mentaion the limitation.

It would probably make sense for repeat() to check this explicitly:

if (unlikely(pg_mul_s32_overflow(count, slen, )) ||
-   unlikely(pg_add_s32_overflow(tlen, VARHDRSZ, )))
+   unlikely(pg_add_s32_overflow(tlen, VARHDRSZ, )) ||
+   unlikely(!AllocSizeIsValid(tlen)))
ereport(ERROR,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
 errmsg("requested length too large")));

The failure in enlargeStringInfo is probably coming from trying to
construct an output message to send back to the client.  That's
going to be a lot harder to do anything nice about (and even if
the backend didn't fail, the client might).

regards, tom lane




Re: Invalid memory alloc request size for repeat()

2022-05-25 Thread David G. Johnston
On Wednesday, May 25, 2022, Japin Li  wrote:

>
> Hi,
>
> Today, I try to use repeat() to generate 1GB text, and it occurs invalid
> memory
> alloc request size [1].  It is a limit from palloc(), then I try to reduce
> it,
> it still complains out of memory which comes from enlargeStringInfo()
> [2].  The
> documentation about repect() [3] doesn't mentaion the limitation.
>

That is still a “field” even if it is not stored.

https://www.postgresql.org/docs/current/limits.html

David J.