Re: Improve logical replication usability when tables lack primary keys

Chao Li Mon, 13 Apr 2026 20:58:50 -0700

> On Apr 8, 2026, at 19:43, shveta malik <[email protected]> wrote:
> 
> On Wed, Feb 4, 2026 at 12:51 PM Chao Li <[email protected]> wrote:
>> 
>> 
>> 
>>> On Dec 30, 2025, at 16:07, Chao Li <[email protected]> wrote:
>>> 
>>> 
>>> 
>>>> On Dec 22, 2025, at 19:48, Amit Kapila <[email protected]> wrote:
>>>> 
>>>> On Fri, Dec 19, 2025 at 1:39 PM Chao Li <[email protected]> wrote:
>>>>> 
>>>>>> On Dec 18, 2025, at 22:49, Euler Taveira <[email protected]> wrote:
>>>>>> 
>>>>>> On Wed, Dec 17, 2025, at 6:43 PM, GRANT ZHOU wrote:
>>>>>>> On Wed, Dec 17, 2025 at 12:50 PM Euler Taveira <[email protected]> 
>>>>>>> wrote:
>>>>>>>> Each table needs to say what's its row identifier. The user created a 
>>>>>>>> table
>>>>>>>> without primary key. Well, create a primary key. There are dozens of 
>>>>>>>> thousands
>>>>>>>> of objects. Use a script.
>>>>>>> However, I’d like to share a user perspective regarding the "use a
>>>>>>> script" approach. The main value of `FOR TABLES IN SCHEMA` is
>>>>>>> *in-database automation*. If users still need to maintain external
>>>>>>> scripts to monitor and `ALTER` new tables to prevent replication
>>>>>>> errors, it significantly diminishes the value of that automation.
>>>>>>> 
>>>>>> 
>>>>>> As I tried to explain in the previous email, the problem with FOR ALL 
>>>>>> TABLES
>>>>>> and FOR TABLES IN SCHEMA syntax is that the is no catalog information 
>>>>>> about the
>>>>>> relations; the list of relations is collected at runtime.
>>>>>> 
>>>>>> When I suggested "use a script" I was referring to fix the logical 
>>>>>> replication
>>>>>> setup regarding the lack of primary key. There is no need to have an 
>>>>>> automation
>>>>>> outside the database, use an event trigger. If your lazy user doesn't 
>>>>>> create
>>>>>> the primary key, assign REPLICA IDENTITY FULL. Something like
>>>>>> 
>>>>>> -- This example is far from being a complete solution for fixing the 
>>>>>> lack of
>>>>>> -- primary key in a logical replication scenario.
>>>>>> -- ALTER TABLE should be supported too
>>>>>> CREATE OR REPLACE FUNCTION event_trigger_for_replica_identity()
>>>>>>     RETURNS event_trigger LANGUAGE plpgsql AS $$
>>>>>> DECLARE
>>>>>> obj record;
>>>>>> rec record;
>>>>>> ricnt integer := 0;
>>>>>> BEGIN
>>>>>> FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands()
>>>>>> LOOP
>>>>>>     IF obj.command_tag = 'CREATE TABLE' THEN
>>>>>>         SELECT COUNT(*) INTO ricnt FROM pg_index WHERE indrelid = 
>>>>>> obj.objid AND indisprimary;
>>>>>>         RAISE NOTICE 'ricnt: %', ricnt;
>>>>>>         IF ricnt = 0 THEN
>>>>>>             EXECUTE 'ALTER TABLE ' || obj.object_identity || ' REPLICA 
>>>>>> IDENTITY FULL';
>>>>>>         END IF;
>>>>>>     END IF;
>>>>>> END LOOP;
>>>>>> END;
>>>>>> $$;
>>>>>> 
>>>>>> CREATE EVENT TRIGGER event_trigger_for_replica_identity
>>>>>> ON ddl_command_end
>>>>>> EXECUTE FUNCTION event_trigger_for_replica_identity();
>>>>>> 
>>>>>> CREATE TABLE event_trigger_test_1 (a int);
>>>>>> \d+ event_trigger_test_1
>>>>>> CREATE TABLE event_trigger_test_2 (a int primary key);
>>>>>> \d+ event_trigger_test_2
>>>>>> CREATE TABLE event_trigger_test_3 (a int, b text not null, primary 
>>>>>> key(b));
>>>>>> \d+ event_trigger_test_3
>>>>>> --ALTER TABLE event_trigger_test_3 DROP CONSTRAINT 
>>>>>> event_trigger_test_3_pkey;
>>>>>> --\d+ event_trigger_test_3
>>>>>> 
>>>>>> DROP EVENT TRIGGER event_trigger_for_replica_identity;
>>>>>> DROP FUNCTION event_trigger_for_replica_identity;
>>>>>> DROP TABLE event_trigger_test_1, event_trigger_test_2, 
>>>>>> event_trigger_test_3;
>>>>>> 
>>>>>> 8<----------------------------------------------------------------------------8<
>>>>>> 
>>>>>>> Additionally, tables without Primary Keys are valid SQL and extremely
>>>>>>> common in enterprise environments (e.g., audit logs, data warehousing).
>>>>>>> In large-scale deployments, enforcing PKs on every single table isn't
>>>>>>> always practical.
>>>>>>> 
>>>>>> 
>>>>>> I'm not saying users shouldn't create tables without a primary key. I'm 
>>>>>> arguing
>>>>>> that this decision should take into account what adjustments need to be 
>>>>>> made to
>>>>>> use these tables in logical replication.
>>>>>> 
>>>>>>> 
>>>>>>> I think the goal of this proposal is not to change the underlying table
>>>>>>> property design, but rather to seek a mechanism (like a Publication
>>>>>>> option) to ensure this automation functions safely without external
>>>>>>> intervention. It is simply about allowing the database to handle these
>>>>>>> valid, common scenarios gracefully when automation is enabled.
>>>>>>> 
>>>>>> 
>>>>>> You didn't get it. You already have one property to handle it and you are
>>>>>> proposing to add a second property to handle it.
>>>>>> 
>>>>>> I think you are pursuing the wrong solution. IMO we need a solution to 
>>>>>> enforce
>>>>>> that the logical replication contract is valid. If you create or modify 
>>>>>> a table
>>>>>> that is part of a publication, there is no validation that that table 
>>>>>> complies
>>>>>> with the publication properties (update and delete properties should 
>>>>>> require an
>>>>>> appropriate replica identity). We should close the gaps in both 
>>>>>> publication and
>>>>>> table.
>>>>>> 
>>>> 
>>>> If we want, we can ensure that any table added to that specific
>>>> publication (that has an option replica_identy='full') would
>>>> automatically override the default to FULL, if PK is not available.
>>>> This information can be cached to avoid overhead.
>>>> 
>>>>> 
>>>>> If I summarize Euler’s position in short words: discipline over 
>>>>> convenience. I actually strongly agree with that. In PG we generally 
>>>>> prefer explicit over implicit behavior, and predictability over magic.
>>>>> 
>>>> 
>>>> You haven't told why we can't consider a custom event trigger as
>>>> suggested by Euler for customers who are not willing to change the RI
>>>> default explicitly for each table. I think it is worth considering
>>>> providing a custom solution outside core-postgres for your customers
>>>> for this specific case.
>>> 
>>> Thanks for raising this. Let me clarify why we don’t consider a custom 
>>> event trigger a satisfactory solution in practice, even though it is 
>>> technically possible.
>>> 
>>> I discussed this with our field teams, and some customers have indeed 
>>> experimented with event-trigger-based solutions before. However, they 
>>> generally don’t prefer them for this use case.
>>> 
>>> First, the required logic is non-trivial and fragile. The trigger would 
>>> need to track table creation, primary key creation and removal, and 
>>> distinguish between cases where REPLICA IDENTITY FULL was set implicitly 
>>> versus explicitly by the user. Handling all these cases correctly makes the 
>>> solution feel like a workaround rather than a robust enforcement mechanism.
>>> 
>>> Second, event triggers introduce operational risk. They need to be 
>>> installed, monitored, and maintained separately from the core system. If a 
>>> trigger is accidentally dropped, disabled, or modified, the behavior 
>>> silently changes, which is particularly risky for replication semantics.
>>> 
>>> Third, customers place much higher trust in core PostgreSQL behavior than 
>>> in custom scripts layered on top. Issues caused by core behavior are seen 
>>> as something that can be understood, worked around, or fixed by upgrading, 
>>> whereas failures caused by custom triggers are harder to diagnose and are 
>>> often attributed to the overall solution quality.
>>> 
>>> For these reasons, while event triggers can work as a stopgap, our 
>>> customers strongly prefer a solution where the replication contract is 
>>> enforced by core PostgreSQL rather than external mechanisms.
>>> 
>>>> 
>>>>> Based on the discussion so far, I think we share the following design 
>>>>> goals:
>>>>> 
>>>>> 1) Keep replica identity as a table property.
>>>>> 2) Avoid silent runtime failures when FOR TABLES IN SCHEMA pulls in 
>>>>> tables without primary keys.
>>>>> 3) Avoid global or implicit behavior changes.
>>>>> 4) Preserve explicit opt-in for higher WAL cost.
>>>>> 5) Keep the logical replication contract explicit and enforceable.
>>>>> 
>>>>> I’ve been thinking about whether adding a new replica identity could meet 
>>>>> these goals.
>>>>> 
>>>>> Today we have four replica identities: DEFAULT (PK, fallback to NONE), 
>>>>> INDEX, FULL, and NONE.
>>>>> 
>>>>> What if we introduce a new replica identity, tentatively called “FORCE”: 
>>>>> PK with fallback to FULL. (Let’s keep our focus on the design, not argue 
>>>>> the name for now.)
>>>>> 
>>>>> With this approach:
>>>>> 
>>>>> 1) Replica identity remains a table property.
>>>>> 2) Publication membership is still evaluated at runtime, so FOR TABLES IN 
>>>>> SCHEMA is not special-cased.
>>>>> 3) No new GUCs are required.
>>>>> 4) The user must explicitly opt in by setting the replica identity. Once 
>>>>> FORCE is chosen, adding or dropping a primary key later does not silently 
>>>>> break UPDATE/DELETE replication.
>>>>> 
>>>>> 5) The logical replication contract remains explicit; the table declares 
>>>>> that it is safe for UPDATE/DELETE replication even without a PK, at the 
>>>>> cost of higher WAL volume.
>>>>> 
>>>>> This feels like a small, explicit extension of the existing RI semantics. 
>>>>> Notably, REPLICA IDENTITY DEFAULT already has conditional behavior (PK 
>>>>> fallback
>>>>> to NONE), so conditional RI behavior is not new, this would just make a 
>>>>> different fallback explicit and user-chosen.
>>>>> 
>>>>> After that, we could consider a database-level default_replica_identity 
>>>>> setting, applied at table creation time, for environments that want this 
>>>>> behavior consistently. But that would only make sense if we first agree 
>>>>> on the table-level mechanism.
>>>>> 
>>>> 
>>>> I don't much like the database-level option as it expects a new
>>>> default to be introduced. I think the internal working will almost be
>>>> same as the option at publication-level.
>>> 
>>> That’s fair. I agree that a database-level option wouldn’t be fundamentally 
>>> different from a publication-level solution and would likely share most of 
>>> the same internal mechanics.
>>> 
>>> At this point nothing is decided yet; we’re still exploring different 
>>> approaches and trying to understand the trade-offs.
>>> 
>>> I have a question to better understand how a publication-level approach 
>>> would behave in edge cases.
>>> 
>>> Since replica identity is defined on tables and a table can belong to 
>>> multiple publications, how should UPDATE/DELETE be handled if the same 
>>> table is added to two publications with different expectations?
>>> 
>>> For example, suppose a table without a PK is added to:
>>> - pub_a, which does not require FULL (or effectively falls back to NONE)
>>> - pub_b, which requires FULL for UPDATE/DELETE
>>> 
>>> In this case, should UPDATE/DELETE on the table be allowed at all, and if 
>>> so, based on which publication’s semantics? What do you think?
>>> 
>>> Best regards,
>>> --
>>> Chao Li (Evan)
>>> HighGo Software Co., Ltd.
>>> https://www.highgo.com/
>> 
>> 
>> Hi Amit,
>> 
>> Following your suggestion, I implemented a PoC that adds a new publication 
>> parameter (tentatively named fallbackfull) to make the DEFAULT → FULL 
>> fallback behavior per-publication. I’m not attached to the parameter name — 
>> if we decide to go with the publication approach, I’m happy to adjust naming 
>> based on feedback.
>> 
>> After playing with this implementation for a couple of days, I ran into a 
>> few concerns:
>> 
>> 1. Protocol extension required
>> 
>> If the DEFAULT → FULL fallback is triggered, the subscriber needs to know 
>> whether the corresponding publication has fallbackfull enabled in order to 
>> decide how to apply UPDATE/DELETE. That means we’d need to extend the 
>> logical replication protocol, e.g., by adding a new field to the RELATION 
>> message to carry the fallbackfull flag.
> 
> If the DEFAULT → FULL fallback is triggered, can we send 'full'  to
> the subscriber? I think subscriber need not to know if FULl was set by
> user or is FORCED and the protocol need not be extended.
> 
>> 
>> 2. Impact on decoding plugins
>> 
>> Decoding plugins would need to understand this new flag. In my PoC, I 
>> updated pgoutput, but there may be third-party plugins that would also need 
>> changes. That feels like a compatibility risk.
> 
> If we make the change described in pt1, I think the protocol and
> subscriber sides will not need to change, but pgoutput would still
> need to be updated when sending the RELATION metadata. Specifically,
> instead of using the table’s configured replica identity directly, it
> should use the effective replica identity.
> 
> pq_sendbyte(out, rel->rd_rel->relreplident -->effect _RI);
> 
> So IIUC, third-party plugins may also need such change.
> 
>> 
>> 3. Potential data-integrity issues
>> 
>> This is the most concerning part to me.
>> 
>> Consider a table t1 with REPLICA IDENTITY DEFAULT and no primary key, 
>> included in publication p1. By design, UPDATE/DELETE on t1 are not allowed.
>> 
>> However, a user could work around this by creating a dummy publication, 
>> adding t1 to it, and setting fallbackfull = true on that publication. This 
>> would effectively enable UPDATE/DELETE on t1.
>> 
>> Later, if the owner of p1 decides to enable fallbackfull on p1 to replicate 
>> t1, the subscriber of p1 may already be out of sync due to the earlier 
>> updates/deletes performed via the dummy publication. At that point, 
>> subsequent UPDATE/DELETE replication may fail or behave incorrectly.
>> 
>> From this perspective, allowing fallbackfull at the publication level seems 
>> to open the door to cross-publication interference and data divergence.
>> 
>> Given these concerns, I’m leaning toward keeping fallbackfull as a per-table 
>> option rather than a per-publication one. Curious to hear your thoughts.
>> 
> 
> I see your point. One possible approach could be that when any
> publication is altered to enable the fallback option, the effective
> replica identity for the affected table is promoted to FULL across all
> publications.
> 
> i.e. for t1:
> If RI_DEFAULT + no PK + pub->fallbackfull=true;
> effective_replica_identity for t1 = FULL (across all pubs)
> 
> But then this brought us to the point that if we are deciding
> effective_replica_identity for a table irrespective of publication,
> then why not to make it table property?
> 

This is true, but going back to the motivation for this feature, users do not 
want just another table-level property, because they can already do ALTER TABLE 
t REPLICA IDENTITY FULL today. A table-level property would therefore not help 
much. What users want from this feature is a mechanism that allows new tables 
to fall back from DEFAULT to FULL automatically.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
Re: Improve logical replication usability when tables lack primary keys

Reply via email to