> On Feb 26, 2026, at 03:03, Masahiko Sawada <[email protected]> wrote:
>
> On Mon, Jan 26, 2026 at 12:30 PM Masahiko Sawada <[email protected]>
> wrote:
>>
>> On Mon, Jan 19, 2026 at 9:44 AM Marcos Pegoraro <[email protected]> wrote:
>>>
>>> On Fri, Dec 19, 2025 at 10:59 PM Masahiko Sawada
>>> <[email protected]> wrote:
>>>>
>>>> Yeah, if we pass a publication that a lot of tables belong to to
>>>> pg_get_publication_tables(), it could take a long time to return as it
>>>> needs to construct many entries.
>>>
>>>
>>> Well, I don't know how to help, but I'm sure it's behaving badly.
>>> Today I added some columns on my server, and looking at the logs I could
>>> see how slow this process is.
>>>
>>> duration: 2213.872 ms statement: SELECT DISTINCT (CASE WHEN
>>> (array_length(gpt.attrs, 1) = c.relnatts) THEN NULL ELSE gpt.attrs END)
>>> FROM pg_publication p, LATERAL pg_get_publication_tables(p.pubname) gpt,
>>> pg_class c WHERE gpt.relid = 274376788 AND c.oid = gpt.relid AND
>>> p.pubname IN ( 'mypub' )
>>>
>>> Two seconds to get the list of columns of a table is really too slow.
>>> How can we solve this?
>>
>> After more investigation of slowness, it seems that the
>> list_concat_unique_oid() called below is quite slow when the database
>> has a lot of tables to publish:
>>
>> relids = GetPublicationRelations(pub_elem->oid,
>>                                  pub_elem->pubviaroot ?
>>                                  PUBLICATION_PART_ROOT :
>>                                  PUBLICATION_PART_LEAF);
>> schemarelids = GetAllSchemaPublicationRelations(pub_elem->oid,
>>                                                 pub_elem->pubviaroot ?
>>                                                 PUBLICATION_PART_ROOT :
>>                                                 PUBLICATION_PART_LEAF);
>> pub_elem_tables = list_concat_unique_oid(relids, schemarelids);
>>
>> This is simply because it's O(n^2), where n is the number of oids in
>> schemarelids in the test case. A simple change would be to do sort &
>> dedup instead. With the attached experimental patch, the
>> pg_get_publication_tables() execution time gets halved in my
>> environment (796ms -> 430ms with 50k tables). If the number of tables
>> is not large, this method might be slower than today but it's not a
>> huge regression.
>>
>> In the initial tablesync cases, it could be optimized further in a way
>> that we introduce a new SQL function that gets the column list and
>> expr of the specific table. This way, we can filter the result by
>> relid at an early stage instead of getting all information and
>> filtering by relid as the tablesync worker does today, avoiding
>> overheads of gathering system catalog scan results.
>
> I've drafted this idea, and it looks like a better approach. The
> patch introduces a pg_get_publication_table_info() SQL function that
> returns the column list and row filter expression just as
> pg_get_publication_tables() does, but checks only the specified
> table. On my environment, the tablesync worker's query in question
> goes from 288 ms to 0.6 ms with 50k tables in one publication.
> Feedback is very welcome.
>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com
> <0001-Add-pg_get_publication_table_info-to-optimize-logica.patch>
A few comments:

1. pg_publication.c needs a pgindent run. When I ran pgindent against it, the
patch code got a lot of formatting changes.

2.
```
+ values[0] = ObjectIdGetDatum(pub->oid);
+ values[1] = ObjectIdGetDatum(relid);
+
+ values[0] = ObjectIdGetDatum(pub->oid);
+ values[1] = ObjectIdGetDatum(relid);
```
Looks like a copy-pasto.
3. I think we can optimize pg_get_publication_table_info() a little for
non-published tables by setting funcctx->max_calls = 0. I tried this code
locally, and the tests still passed:
```
	if (publish)
	{
		pubrel = palloc_object(published_rel);
		pubrel->relid = relid;
		pubrel->pubid = pub->oid;
		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
		funcctx->user_fctx = pubrel;
		funcctx->max_calls = 1;	/* return one row */
	}
	else
		funcctx->max_calls = 0;	/* return no rows */

	MemoryContextSwitchTo(oldcontext);
}

/* stuff done on every call of the function */
funcctx = SRF_PERCALL_SETUP();

if (funcctx->call_cntr < funcctx->max_calls)
{
	HeapTuple	rettuple;

	table_info = (published_rel *) funcctx->user_fctx;
	rettuple = construct_published_rel_tuple(table_info,
											 funcctx->tuple_desc);

	SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum(rettuple));
}

SRF_RETURN_DONE(funcctx);
```
4.
```
+ int i;
+
+ attnums = palloc_array(int16, desc->natts);
+
+ for (i = 0; i < desc->natts; i++)
```
Nit: Peter (E) did some cleanup a while ago, changing loop variables to be
declared inside the for statement. So I think that style is preferred now:
```
for (int i = 0; i < desc->natts; i++)
```
5.
```
+ attnums = palloc_array(int16, desc->natts);
```
Nit: the attnums array is never freed. Is that intentional? Actually, I'm a
bit lost about when memory should be freed explicitly and when not.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/