On Wed, Feb 25, 2026 at 11:03 AM Masahiko Sawada <[email protected]> wrote:
>
> On Mon, Jan 26, 2026 at 12:30 PM Masahiko Sawada <[email protected]>
> wrote:
> >
> > On Mon, Jan 19, 2026 at 9:44 AM Marcos Pegoraro <[email protected]> wrote:
> > >
> > > Em sex., 19 de dez. de 2025 às 22:59, Masahiko Sawada
> > > <[email protected]> escreveu:
> > >>
> > >> Yeah, if we pass a publication that a lot of tables belong to to
> > >> pg_get_publication_tables(), it could take a long time to return as it
> > >> needs to construct many entries.
> > >
> > >
> > > Well, I don't know how to help but I'm sure it's working badly.
> > > Today I added some fields on my server, then seeing logs I could see how
> > > slow this process is.
> > >
> > > duration: 2213.872 ms statement: SELECT DISTINCT (CASE WHEN
> > > (array_length(gpt.attrs, 1) = c.relnatts) THEN NULL ELSE gpt.attrs END)
> > > FROM pg_publication p, LATERAL pg_get_publication_tables(p.pubname)
> > > gpt, pg_class c WHERE gpt.relid = 274376788 AND c.oid = gpt.relid AND
> > > p.pubname IN ( 'mypub' )
> > >
> > > 2 seconds to get the list of fields of a table is really too slow.
> > > How can we solve this ?
> >
> > After more investigation of slowness, it seems that the
> > list_concat_unique_oid() called below is quite slow when the database
> > has a lot of tables to publish:
> >
> > relids = GetPublicationRelations(pub_elem->oid,
> > pub_elem->pubviaroot ?
> > PUBLICATION_PART_ROOT :
> > PUBLICATION_PART_LEAF);
> > schemarelids = GetAllSchemaPublicationRelations(pub_elem->oid,
> > pub_elem->pubviaroot ?
> > PUBLICATION_PART_ROOT :
> > PUBLICATION_PART_LEAF);
> > pub_elem_tables = list_concat_unique_oid(relids, schemarelids);
> >
> > This is simply because it's O(n^2), where n is the number of oids in
> > schemarelids in the test case. A simple change would be to do sort &
> > dedup instead. With the attached experimental patch, the
> > pg_get_publication_tables() execution time gets halved in my
> > environment (796ms -> 430ms with 50k tables). If the number of tables
> > is not large, this method might be slower than today but it's not a
> > huge regression.
> >
> > In the initial tablesync cases, it could be optimized further in a way
> > that we introduce a new SQL function that gets the column list and
> > expr of the specific table. This way, we can filter the result by
> > relid at an early stage instead of getting all information and
> > filtering by relid as the tablesync worker does today, avoiding
> > overheads of gathering system catalog scan results.
>
> I've drafted this idea and I find it looks like a better approach. The
> patch introduces the pg_get_publication_table_info() SQL function that
> returns the column list and row filter expression like
> pg_get_publication_tables() returns but it checks only the specific
> table unlike pg_get_publication_tables(). On my env, the tablesync
> worker's query in question becomes 0.6ms from 288 ms with 50k tables
> in one publication. Feedback is very welcome.
Another variant of this approach is to extend
pg_get_publication_table() so that it can accept a relid to get the
publication information of the specific table. I've attached the patch
for this idea. I'm going to add regression test cases.
pg_get_publication_table() is a VARIACID array function so the patch
changes its signature to {text[] [, oid]}, breaking the tool
compatibility. Given this function is mostly an internal-use function
(we don't have the documentation for it), it would probably be okay
with it. I find it's clearer than the other approach of introducing
pg_get_publication_table_info(). Feedback is very welcome.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
From 152b798903cd181b4e9b5ca39409d5616ade1bbd Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <[email protected]>
Date: Fri, 27 Feb 2026 15:42:38 -0800
Subject: [PATCH] Avoid full table scans when getting publication table
information by tablesync workers.
Author:
Reviewed-by:
Discussion: https://postgr.es/m/
---
src/backend/catalog/pg_publication.c | 147 ++++++++++++++++----
src/backend/catalog/system_views.sql | 2 +-
src/backend/commands/subscriptioncmds.c | 4 +-
src/backend/replication/logical/tablesync.c | 9 +-
src/include/catalog/pg_proc.dat | 15 +-
5 files changed, 138 insertions(+), 39 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 9a4791c573e..2d48580ad9a 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -1116,13 +1116,83 @@ GetPublicationByName(const char *pubname, bool missing_ok)
return OidIsValid(oid) ? GetPublication(oid) : NULL;
}
+/*
+ * Returns true if the table of the given relid is published by the publication.
+ *
+ * Note that being published here means we actually use its OID as the published
+ * table OID, which depends on publication's publish_via_partition_root value.
+ * For example, even if pg_publication_rel has the entry for the parent table,
+ * this function returns false as we use its leaf partitions' OIDs as the
+ * published OIDs.
+ */
+static bool
+is_table_publishable_in_publication(Oid relid, Publication *pub)
+{
+ if (pub->pubviaroot)
+ {
+ /*
+ * For ALL TABLES publication with pubviaroot, the table is published
+ * if not a partition.
+ */
+ if (pub->alltables)
+ return !get_rel_relispartition(relid);
+
+ /*
+ * For pubviaroot publications, we can simply check if the given
+ * relation's OIS exists on either pg_publication_rel or
+ * pg_publication_namespace.
+ */
+ return (SearchSysCacheExists2(PUBLICATIONRELMAP,
+ ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pub->oid)) ||
+ SearchSysCacheExists2(PUBLICATIONNAMESPACEMAP,
+ ObjectIdGetDatum(get_rel_namespace(relid)),
+ ObjectIdGetDatum(pub->oid)));
+ }
+
+ /*
+ * For non-pubviaroot publications, partitioned table's OID can never be a
+ * published OID.
+ */
+ if (get_rel_relkind(relid) == RELKIND_PARTITIONED_TABLE)
+ return false;
+
+ if (pub->alltables)
+ return true;
+
+ /*
+ * For the partition in the !pubviaroot publication, we need to check its
+ * ancestors instead of the given relation itself.
+ */
+ if (get_rel_relispartition(relid))
+ {
+ List *ancestors = get_partition_ancestors(relid);
+
+ Oid topmost = GetTopMostAncestorInPublication(pub->oid, ancestors,
+ NULL);
+
+ return OidIsValid(topmost);
+ }
+
+ return (SearchSysCacheExists2(PUBLICATIONRELMAP,
+ ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(pub->oid)) ||
+ SearchSysCacheExists2(PUBLICATIONNAMESPACEMAP,
+ ObjectIdGetDatum(get_rel_namespace(relid)),
+ ObjectIdGetDatum(pub->oid)));
+}
+
/*
* Get information of the tables in the given publication array.
*
* Returns pubid, relid, column list, row filter for each table.
+ *
+ * If relid is an valid OID, it returns only these information of the table
+ * of the given relid instead of all tables in the given publication array,
+ * returning at most one tuple.
*/
-Datum
-pg_get_publication_tables(PG_FUNCTION_ARGS)
+static Datum
+pg_get_publication_tables(FunctionCallInfo fcinfo, Oid relid)
{
#define NUM_PUBLICATION_TABLES_ELEM 4
FuncCallContext *funcctx;
@@ -1161,29 +1231,38 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
pub_elem = GetPublicationByName(TextDatumGetCString(elems[i]), false);
- /*
- * Publications support partitioned tables. If
- * publish_via_partition_root is false, all changes are replicated
- * using leaf partition identity and schema, so we only need
- * those. Otherwise, get the partitioned table itself.
- */
- if (pub_elem->alltables)
- pub_elem_tables = GetAllPublicationRelations(RELKIND_RELATION,
- pub_elem->pubviaroot);
+ if (OidIsValid(relid))
+ {
+ if (is_table_publishable_in_publication(relid, pub_elem))
+ pub_elem_tables = list_make1_oid(relid);
+ }
else
{
- List *relids,
- *schemarelids;
-
- relids = GetPublicationRelations(pub_elem->oid,
- pub_elem->pubviaroot ?
- PUBLICATION_PART_ROOT :
- PUBLICATION_PART_LEAF);
- schemarelids = GetAllSchemaPublicationRelations(pub_elem->oid,
- pub_elem->pubviaroot ?
- PUBLICATION_PART_ROOT :
- PUBLICATION_PART_LEAF);
- pub_elem_tables = list_concat_unique_oid(relids, schemarelids);
+ /*
+ * Publications support partitioned tables. If
+ * publish_via_partition_root is false, all changes are
+ * replicated using leaf partition identity and schema, so we
+ * only need those. Otherwise, get the partitioned table
+ * itself.
+ */
+ if (pub_elem->alltables)
+ pub_elem_tables = GetAllPublicationRelations(RELKIND_RELATION,
+ pub_elem->pubviaroot);
+ else
+ {
+ List *relids,
+ *schemarelids;
+
+ relids = GetPublicationRelations(pub_elem->oid,
+ pub_elem->pubviaroot ?
+ PUBLICATION_PART_ROOT :
+ PUBLICATION_PART_LEAF);
+ schemarelids = GetAllSchemaPublicationRelations(pub_elem->oid,
+ pub_elem->pubviaroot ?
+ PUBLICATION_PART_ROOT :
+ PUBLICATION_PART_LEAF);
+ pub_elem_tables = list_concat_unique_oid(relids, schemarelids);
+ }
}
/*
@@ -1246,8 +1325,8 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
HeapTuple rettuple;
Publication *pub;
published_rel *table_info = (published_rel *) list_nth(table_infos, funcctx->call_cntr);
- Oid relid = table_info->relid;
- Oid schemaid = get_rel_namespace(relid);
+ Oid tableoid = table_info->relid;
+ Oid schemaid = get_rel_namespace(tableoid);
Datum values[NUM_PUBLICATION_TABLES_ELEM] = {0};
bool nulls[NUM_PUBLICATION_TABLES_ELEM] = {0};
@@ -1258,7 +1337,7 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
pub = GetPublication(table_info->pubid);
values[0] = ObjectIdGetDatum(pub->oid);
- values[1] = ObjectIdGetDatum(relid);
+ values[1] = ObjectIdGetDatum(tableoid);
/*
* We don't consider row filters or column lists for FOR ALL TABLES or
@@ -1269,7 +1348,7 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
ObjectIdGetDatum(schemaid),
ObjectIdGetDatum(pub->oid)))
pubtuple = SearchSysCacheCopy2(PUBLICATIONRELMAP,
- ObjectIdGetDatum(relid),
+ ObjectIdGetDatum(tableoid),
ObjectIdGetDatum(pub->oid));
if (HeapTupleIsValid(pubtuple))
@@ -1293,7 +1372,7 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
/* Show all columns when the column list is not specified. */
if (nulls[2])
{
- Relation rel = table_open(relid, AccessShareLock);
+ Relation rel = table_open(tableoid, AccessShareLock);
int nattnums = 0;
int16 *attnums;
TupleDesc desc = RelationGetDescr(rel);
@@ -1342,6 +1421,18 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
SRF_RETURN_DONE(funcctx);
}
+Datum
+pg_get_publication_tables_a(PG_FUNCTION_ARGS)
+{
+ return pg_get_publication_tables(fcinfo, InvalidOid);
+}
+
+Datum
+pg_get_publication_tables_b(PG_FUNCTION_ARGS)
+{
+ return pg_get_publication_tables(fcinfo, PG_GETARG_OID(1));
+}
+
/*
* Returns Oids of sequences in a publication.
*/
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 1ea8f1faa9e..0c867cf0bf0 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -411,7 +411,7 @@ CREATE VIEW pg_publication_tables AS
) AS attnames,
pg_get_expr(GPT.qual, GPT.relid) AS rowfilter
FROM pg_publication P,
- LATERAL pg_get_publication_tables(P.pubname) GPT,
+ LATERAL pg_get_publication_tables(ARRAY[P.pubname]) GPT,
pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE C.oid = GPT.relid;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 5e3c0964d38..0bf7db71d5a 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -2577,7 +2577,7 @@ check_publications_origin_tables(WalReceiverConn *wrconn, List *publications,
appendStringInfoString(&cmd,
"SELECT DISTINCT P.pubname AS pubname\n"
"FROM pg_publication P,\n"
- " LATERAL pg_get_publication_tables(P.pubname) GPT\n"
+ " LATERAL pg_get_publication_tables(ARRAY[P.pubname]) GPT\n"
" JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid OR"
" GPT.relid IN (SELECT relid FROM pg_partition_ancestors(PS.srrelid) UNION"
" SELECT relid FROM pg_partition_tree(PS.srrelid))),\n"
@@ -2956,7 +2956,7 @@ fetch_relation_list(WalReceiverConn *wrconn, List *publications)
appendStringInfo(&cmd, "SELECT DISTINCT n.nspname, c.relname, c.relkind, gpt.attrs\n"
" FROM pg_class c\n"
" JOIN pg_namespace n ON n.oid = c.relnamespace\n"
- " JOIN ( SELECT (pg_get_publication_tables(VARIADIC array_agg(pubname::text))).*\n"
+ " JOIN ( SELECT (pg_get_publication_tables(array_agg(pubname::text))).*\n"
" FROM pg_publication\n"
" WHERE pubname IN ( %s )) AS gpt\n"
" ON gpt.relid = c.oid\n",
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 2f2f0121ecf..a7f52755d05 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -801,9 +801,9 @@ fetch_remote_table_info(char *nspname, char *relname, LogicalRepRelation *lrel,
" (CASE WHEN (array_length(gpt.attrs, 1) = c.relnatts)"
" THEN NULL ELSE gpt.attrs END)"
" FROM pg_publication p,"
- " LATERAL pg_get_publication_tables(p.pubname) gpt,"
+ " LATERAL pg_get_publication_tables(ARRAY[p.pubname], %u) gpt,"
" pg_class c"
- " WHERE gpt.relid = %u AND c.oid = gpt.relid"
+ " WHERE c.oid = gpt.relid"
" AND p.pubname IN ( %s )",
lrel->remoteid,
pub_names->data);
@@ -983,9 +983,8 @@ fetch_remote_table_info(char *nspname, char *relname, LogicalRepRelation *lrel,
appendStringInfo(&cmd,
"SELECT DISTINCT pg_get_expr(gpt.qual, gpt.relid)"
" FROM pg_publication p,"
- " LATERAL pg_get_publication_tables(p.pubname) gpt"
- " WHERE gpt.relid = %u"
- " AND p.pubname IN ( %s )",
+ " LATERAL pg_get_publication_tables(ARRAY[p.pubname], %u) gpt"
+ " WHERE p.pubname IN ( %s )",
lrel->remoteid,
pub_names->data);
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index dac40992cbc..f6b775fe25b 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12382,12 +12382,21 @@
{ oid => '6119',
descr => 'get information of the tables that are part of the specified publications',
proname => 'pg_get_publication_tables', prorows => '1000',
- provariadic => 'text', proretset => 't', provolatile => 's',
+ proretset => 't', provolatile => 's',
prorettype => 'record', proargtypes => '_text',
proallargtypes => '{_text,oid,oid,int2vector,pg_node_tree}',
- proargmodes => '{v,o,o,o,o}',
+ proargmodes => '{i,o,o,o,o}',
proargnames => '{pubname,pubid,relid,attrs,qual}',
- prosrc => 'pg_get_publication_tables' },
+ prosrc => 'pg_get_publication_tables_a' },
+{ oid => '8060',
+ descr => 'get information of the tables that are part of the specified publications',
+ proname => 'pg_get_publication_tables', prorows => '1',
+ proretset => 't', provolatile => 's',
+ prorettype => 'record', proargtypes => '_text oid',
+ proallargtypes => '{_text,oid,oid,oid,int2vector,pg_node_tree}',
+ proargmodes => '{i,i,o,o,o,o}',
+ proargnames => '{pubname,relid,pubid,relid,attrs,qual}',
+ prosrc => 'pg_get_publication_tables_b' },
{ oid => '8052', descr => 'get OIDs of sequences in a publication',
proname => 'pg_get_publication_sequences', prorows => '1000', proretset => 't',
provolatile => 's', prorettype => 'oid', proargtypes => 'text',
--
2.53.0