On Thu, Mar 26, 2026 at 1:35 AM Hayato Kuroda (Fujitsu)
<[email protected]> wrote:
>
> Dear Sawada-san,
> (Sending again because blocked by some rules)
>
> I ran the performance testing independently for the 0001 patch. Overall, the
> performance looked very nice: the new function took O(1) time with respect to
> the total number of tables. It seems good enough.
>
> Source code:
> ----------------
> HEAD (4287c50f) + v4-0001 patch.
>
> Setup:
> ---------
> A database cluster was set up with shared_buffers=100GB. Several tables were
> defined in the public schema, and the same number of tables in sch1.
> The total number of tables was {50, 500, 5000, 50000}.
> A publication included the schema sch1 and all public tables individually.
>
> The attached script sets up the same environment. Its suffix was changed to
> .txt to pass the rule.
>
> Workload Run:
> --------------------
> I ran two types of SQL queries and measured the execution time via the
> \timing meta-command. The cases emulate what a tablesync worker would do.
>
> Case 1: old SQL
> ```
> SELECT DISTINCT
>   (CASE WHEN (array_length(gpt.attrs, 1) = c.relnatts)
>    THEN NULL ELSE gpt.attrs END)
>   FROM pg_publication p,
>   LATERAL pg_get_publication_tables(p.pubname) gpt,
>   pg_class c
>  WHERE gpt.relid = 17885 AND c.oid = gpt.relid
>    AND p.pubname IN ( 'pub' );
> ```
>
> Case 2: new SQL
> ```
> SELECT DISTINCT
>   (CASE WHEN (array_length(gpt.attrs, 1) = c.relnatts)
>    THEN NULL ELSE gpt.attrs END)
>   FROM pg_publication p,
>   LATERAL pg_get_publication_tables(p.pubname, 16535) gpt,
>   pg_class c
>  WHERE c.oid = gpt.relid
>    AND p.pubname IN ( 'pub' );
> ```
>
> Result Observations:
> ---------------
> The attached bar graph shows the result. A logarithmic scale is used for the
> execution time (y-axis) so that both the small- and large-scale cases are
> visible. With the old SQL, the time spent became approximately 10x longer for
> 500->5000 and for 5000->50000. In contrast, the time spent by the new SQL is
> mostly stable regardless of the number of tables.
>
> Detailed Result:
> --------------
> Each cell is the median of 10 runs.
>
> Total tables    Execution time for the old SQL [ms]             Execution time for the new SQL [ms]
> 50              5.77                                            4.19
> 500             15.75                                           4.28
> 5000            120.39                                          4.22
> 50000           1741.89                                         4.60
> 500000          73287.16                                        4.95

Thank you for doing the performance tests! These observations match the
results of my local performance test.
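
To make the scaling easier to read, the growth factors can be derived
from the medians quoted above. The query below is only a rough sketch:
the numbers are copied from the quoted table, and the CTE name and the
column names (results, old_growth, speedup) are made up for
illustration.

```
-- Medians copied from the quoted table; all names here are illustrative.
WITH results(total_tables, old_ms, new_ms) AS (
    VALUES (50,     5.77,     4.19),
           (500,    15.75,    4.28),
           (5000,   120.39,   4.22),
           (50000,  1741.89,  4.60),
           (500000, 73287.16, 4.95)
)
SELECT total_tables,
       old_ms,
       new_ms,
       -- growth of the old SQL relative to the previous (10x smaller) table count
       round(old_ms / lag(old_ms) OVER (ORDER BY total_tables), 2) AS old_growth,
       -- ratio of old to new execution time at this table count
       round(old_ms / new_ms, 1) AS speedup
  FROM results
 ORDER BY total_tables;
```

old_growth is NULL for the first row, since there is no smaller table
count to compare against.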

BTW the new is_table_publishable_in_publication() can also be useful in
other places where we check whether a particular table is published by
a publication, for example get_rel_sync_entry(). It would be a separate
patch though.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com