Hi Michael, On Thu, Feb 1, 2024 at 9:58 AM Michael Paquier <mich...@paquier.xyz> wrote: > > On Wed, Jan 31, 2024 at 02:39:54PM +0900, Michael Paquier wrote: > > Thanks, I'm looking into that now. > > I have much to say about the patch, but for now I have begun running > some performance tests using the patches, because this thread won't > get far until we are sure that the callbacks do not impact performance > in some kind of worst-case scenario. First, here is what I used to > setup a set of tables used for COPY FROM and COPY TO (requires [1] to > feed COPY FROM's data to the void, and note that default values is to > have a strict control on the size of the StringInfos used in the copy > paths): > CREATE EXTENSION blackhole_am; > CREATE OR REPLACE FUNCTION create_table_cols(tabname text, num_cols int) > RETURNS VOID AS > $func$ > DECLARE > query text; > BEGIN > query := 'CREATE UNLOGGED TABLE ' || tabname || ' ('; > FOR i IN 1..num_cols LOOP > query := query || 'a_' || i::text || ' int default 1'; > IF i != num_cols THEN > query := query || ', '; > END IF; > END LOOP; > query := query || ')'; > EXECUTE format(query); > END > $func$ LANGUAGE plpgsql; > -- Tables used for COPY TO > SELECT create_table_cols ('to_tab_1', 1); > SELECT create_table_cols ('to_tab_10', 10); > INSERT INTO to_tab_1 SELECT FROM generate_series(1, 10000000); > INSERT INTO to_tab_10 SELECT FROM generate_series(1, 10000000); > -- Data for COPY FROM > COPY to_tab_1 TO '/tmp/to_tab_1.bin' WITH (format binary); > COPY to_tab_10 TO '/tmp/to_tab_10.bin' WITH (format binary); > COPY to_tab_1 TO '/tmp/to_tab_1.txt' WITH (format text); > COPY to_tab_10 TO '/tmp/to_tab_10.txt' WITH (format text); > -- Tables used for COPY FROM > SELECT create_table_cols ('from_tab_1', 1); > SELECT create_table_cols ('from_tab_10', 10); > ALTER TABLE from_tab_1 SET ACCESS METHOD blackhole_am; > ALTER TABLE from_tab_10 SET ACCESS METHOD blackhole_am; > > Then I have run a set of tests using HEAD, v7 and v10 with queries > like that (adapt them depending on the format and table): > COPY to_tab_1 TO '/dev/null' WITH (FORMAT text) \watch count=5 > SET client_min_messages TO error; -- for blackhole_am > COPY from_tab_1 FROM '/tmp/to_tab_1.txt' with (FORMAT 'text') \watch count=5 > COPY from_tab_1 FROM '/tmp/to_tab_1.bin' with (FORMAT 'binary') \watch count=5 > > All the patches have been compiled with -O2, without assertions, etc. > Postgres is run in tmpfs mode, on scissors, without fsync. Unlogged > tables help a bit in focusing on the execution paths as we don't care > about WAL, of course. I have also included v7 in the test of tests, > as this version uses more simple per-row callbacks. > > And here are the results I get for text and binary (ms, average of 15 > queries after discarding the three highest and three lowest values): > test | master | v7 | v10 > -----------------+--------+------+------ > from_bin_1col | 1575 | 1546 | 1575 > from_bin_10col | 5364 | 5208 | 5230 > from_text_1col | 1690 | 1715 | 1684 > from_text_10col | 4875 | 4793 | 4757 > to_bin_1col | 1717 | 1730 | 1731 > to_bin_10col | 7728 | 7707 | 7513 > to_text_1col | 1710 | 1730 | 1698 > to_text_10col | 5998 | 5960 | 5987 > > I am getting an interesting trend here in terms of a speedup between > HEAD and the patches with a table that has 10 attributes filled with > integers, especially for binary and text with COPY FROM. COPY TO > binary also gets nice numbers, while text looks rather stable. Hmm. > > These were on my buildfarm animal, but we need to be more confident > about all this. Could more people run these tests? I am going to do > a second session on a local machine I have at hand and see what > happens. Will publish the numbers here, the method will be the same. > > [1]: https://github.com/michaelpq/pg_plugins/tree/main/blackhole_am > -- > Michael
I'm running the benchmark, but I got some strong numbers: postgres=# \timing Timing is on. postgres=# COPY to_tab_10 TO '/dev/null' WITH (FORMAT binary) \watch count=15 COPY 10000000 Time: 3168.497 ms (00:03.168) COPY 10000000 Time: 3255.464 ms (00:03.255) COPY 10000000 Time: 3270.625 ms (00:03.271) COPY 10000000 Time: 3285.112 ms (00:03.285) COPY 10000000 Time: 3322.304 ms (00:03.322) COPY 10000000 Time: 3341.328 ms (00:03.341) COPY 10000000 Time: 3621.564 ms (00:03.622) COPY 10000000 Time: 3700.911 ms (00:03.701) COPY 10000000 Time: 3717.992 ms (00:03.718) COPY 10000000 Time: 3708.350 ms (00:03.708) COPY 10000000 Time: 3704.367 ms (00:03.704) COPY 10000000 Time: 3724.281 ms (00:03.724) COPY 10000000 Time: 3703.335 ms (00:03.703) COPY 10000000 Time: 3728.629 ms (00:03.729) COPY 10000000 Time: 3758.135 ms (00:03.758) The first 6 rounds are like 10% better than the later 9 rounds, is this normal? -- Regards Junwang Zhao