Hi,

On 2024-04-06 14:34:17 +1300, David Rowley wrote:
> I don't see any issues with v5, so based on the performance numbers
> shown on this thread for the latest patch, it would make sense to push
> it.  The problem is, I just can't recreate the performance numbers.
>
> I've tried both on my AMD 3990x machine and an Apple M2 with a script
> similar to the test.sh from above.  I mostly just stripped out the
> buffer size stuff and adjusted the timing code to something that would
> work with mac.

I think there are a few issues with the test script leading to not seeing a
gain:

1) I think using the textual protocol, with the text datatype, will make it
   harder to spot differences. That's a lot of overhead.

2) Afaict the test connects over the unix socket, and I think we expect
   bigger wins for TCP.

3) The larger string in particular is bottlenecked by pglz compression in
   TOAST.


Where I had badly noticed the overhead of the current approach was streaming
out basebackups, which is all binary, of course.


I added WITH BINARY, SET STORAGE EXTERNAL and tested both unix socket and
localhost. I also reduced row counts and iteration counts, because I am
impatient, and I don't think it matters much here. Attached the modified
version.


On a dual xeon Gold 5215, turbo boost disabled, server pinned to one core,
script pinned to another:


unix:

master:
Run 100 100 1000000: 0.058482377
Run 1024 10240 100000: 0.120909810
Run 1024 1048576 2000: 0.153027916
Run 1048576 1048576 1000: 0.154953512

v5:
Run 100 100 1000000: 0.058760126
Run 1024 10240 100000: 0.118831396
Run 1024 1048576 2000: 0.124282503
Run 1048576 1048576 1000: 0.123894962


localhost:

master:
Run 100 100 1000000: 0.067088000
Run 1024 10240 100000: 0.170894273
Run 1024 1048576 2000: 0.230346632
Run 1048576 1048576 1000: 0.230336078

v5:
Run 100 100 1000000: 0.067144036
Run 1024 10240 100000: 0.167950948
Run 1024 1048576 2000: 0.135167027
Run 1048576 1048576 1000: 0.135347867
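
For anyone wanting to reproduce the pinning used for these runs, here is a rough
sketch, assuming Linux with taskset from util-linux. The helper name `pinned`,
the core numbers, the data directory and the script name are all placeholders,
not what was actually used for the numbers above.

```shell
#!/bin/bash
# Hypothetical helper: run a command pinned to one CPU core when taskset is
# available, and fall back to running it unpinned otherwise.
pinned() {
    local core=$1
    shift
    if command -v taskset >/dev/null 2>&1; then
        taskset -c "$core" "$@"
    else
        "$@"
    fi
}

# Example invocations (placeholders, adjust to the machine at hand):
#   pinned 2 pg_ctl -D "$PGDATA" -l logfile start   # server on core 2
#   pinned 3 ./test.sh                              # benchmark script on core 3
#
# With the intel_pstate driver, turbo boost can be disabled via
#   echo 1 | sudo tee /sys/devices/system/cpu/intel_pstate/no_turbo
# (assumption: other cpufreq drivers expose a different knob).
```

Backends forked by a pinned postmaster inherit its CPU affinity, so pinning the
server at startup is enough.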


The perf difference for 1MB via TCP is really impressive.
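
To put a number on that, the 1MB timings above can be turned into ratios. A
small sketch, where `speedup` is just a made-up helper name and the inputs are
copied from the results listed earlier:

```shell
#!/bin/bash
# speedup BASELINE PATCHED: print BASELINE / PATCHED with two decimals.
speedup() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.2f\n", base / new }'
}

# "Run 1048576 1048576 1000" timings, master first, then v5
echo "unix:      $(speedup 0.154953512 0.123894962)x"
echo "localhost: $(speedup 0.230336078 0.135347867)x"
```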

The small regression for small results is still kinda visible. I haven't yet
tested the patch downthread.

Greetings,

Andres Freund
#!/bin/bash

set -e

dbname=postgres
port=5440
host=/tmp        # unix socket directory
host=localhost   # TCP; overrides the line above, comment out to test the unix socket

test_cases=(
"100 100 1000000"       # only 100 bytes
"1024 10240 100000"     # 1kB and 10kB
"1024 1048576 2000"     # 1kB and 1MB
"1048576 1048576 1000"  # all 1MB
)

insert_rows(){
        psql -d $dbname -p $port -h $host  -c "
        DO \$\$
        DECLARE
            counter INT;
        BEGIN
            FOR counter IN 1..$3 LOOP
                IF counter % 2 = 1 THEN
                    INSERT INTO test_table VALUES (repeat('a', $1)::text);
                ELSE
                    INSERT INTO test_table VALUES (repeat('b', $2)::text);
                END IF;
            END LOOP;
        END \$\$;
        " > /dev/null
}


psql -d $dbname -p $port -c "CREATE EXTENSION IF NOT EXISTS pg_prewarm;" > /dev/null

for case in "${test_cases[@]}"
do
        psql -d $dbname -p $port -h $host -c "DROP TABLE IF EXISTS test_table;" > /dev/null
        psql -d $dbname -p $port -h $host  -c "CREATE UNLOGGED TABLE test_table(data text not null);" > /dev/null
        psql -d $dbname -p $port -h $host  -c "ALTER TABLE test_table ALTER data SET STORAGE EXTERNAL;" > /dev/null

        insert_rows $case

        psql -d $dbname -p $port -h $host  -c "select pg_prewarm('test_table');" > /dev/null

        echo -n "Run $case: "

        elapsed_time=0
        for a in {1..5}
        do
                start_time=$(perl -MTime::HiRes=time -e 'printf "%.9f\n", time')
                psql -d $dbname -p $port -h $host -c "COPY test_table TO STDOUT WITH BINARY;" > /dev/null
                end_time=$(perl -MTime::HiRes=time -e 'printf "%.9f\n", time')
                elapsed_time=$(perl -e "printf('%.9f', ($end_time - $start_time) + $elapsed_time)")
        done

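        # Note: the divisor is 30 although the loop above runs 5 iterations, so
        # this prints total/30 in seconds, not a per-run average in ms. Relative
        # comparisons between builds are unaffected.
        avg_elapsed_time_in_ms=$(perl -e "printf('%.9f', ($elapsed_time / 30))")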
        echo $avg_elapsed_time_in_ms
done
