genfile.c)

Ranier Vilela Tue, 15 Sep 2020 10:46:59 -0700

Em sex., 11 de set. de 2020 às 18:43, Ranier Vilela <[email protected]>
escreveu:

> Em sex., 11 de set. de 2020 às 17:44, Tom Lane <[email protected]>
> escreveu:
>
>> Ranier Vilela <[email protected]> writes:
>> > New version, with support to read Virtual File (pipe, FIFO and socket).
>> > With assert, in case, erroneous, of trying to read a pipe, with offset.
>>
>> Really, could you do a little more thinking and testing of your own,
>> rather than expecting the rest of us to point out holes in your thinking?
>>
> Yes, I can.
>
>
>> * bytes_to_read == 0 is a legal case.
>>
> Ok. Strange call to read a file with zero bytes.
>
>
>> * "Assert(seek_offset != 0)" is an equally awful idea.
>>
> I was trying to predict the case of reading a Virtual File, with
> bytes_to_read == -1 and seek_offset == 0,
> which is bound to fail in fseeko.
> In any case, 96d1f423f95d  it will fail with any Virtual File.
>
>
>>
>> * Removing the seek code from the negative-bytes_to_read path
>> is just broken.
>>
> Ok.
>
>
>> * The only reason this code is shorter than the previous version is
>> you took out all the comments explaining what it was doing, and
>> failed to replace them.  This is just as subtle as before, so I
>> don't find that acceptable.
>>
> It was not my intention.
>
>
>>
>> I do agree that it might be worth skipping the fseeko call in the
>> probably-common case where seek_offset is zero.  Otherwise I don't
>> see much reason to change what's there.
>>
> Well, I think that v3 is a little better, but it’s just my opinion.
> v3 try to copy directly in the final buffer, which will certainly be a
> little faster.
>
v4 patch, simple benchmark, read binary file with size > 6GB.

PostgreSQL main:
postgres=# \timing on
Timing is on.
postgres=# select
pg_read_file('c:\users\ranier\downloads\macOS_Catalina.7z');
ERROR:  file length too large
Time: 3068,459 ms (00:03,068)
postgres=#

PostgreSQL patched (v4 read_binary_file):
postgres=# \timing on
Timing is on.
postgres=# select
pg_read_file('c:\users\ranier\downloads\macOS_Catalina.7z');
ERROR:  file length too large
Time: 701,035 ms
postgres=#

4.37x faster, very good.

v4 handles pipes very well.
Tested with
https://docs.microsoft.com/en-us/windows/win32/ipc/multithreaded-pipe-server

Terminal one:
C:\usr\src\tests\pipes>pipe_server

Pipe Server: Main thread awaiting client connection on \\.\pipe\mynamedpipe

Terminal two:
postgres=# \timing on
Timing is on.
postgres=# select pg_read_file('\\.\pipe\mynamedpipe');

Terminal one:
Client connected, creating a processing thread.

Pipe Server: Main thread awaiting client connection on \\.\pipe\mynamedpipe
InstanceThread created, receiving and processing messages.

Pipe Server: Main thread awaiting client connection on \\.\pipe\mynamedpipe
InstanceThread created, receiving and processing messages.
^C
C:\usr\src\tests\pipes>

Terminal two:
postgres=# select pg_read_file('\\.\pipe\mynamedpipe');
pg_read_file
--------------

(1 row)

Time: 77267,481 ms (01:17,267)
postgres=#

regards,
Ranier Vilela

v4-0001-simplified_read_binary_file.patch
Description: Binary data

Re: Simplified version of read_binary_file (src/backend/utils/adt/genfile.c)

Reply via email to