On Mon, Jan 22, 2024, at 4:06 AM, Hayato Kuroda (Fujitsu) wrote:
> I analyzed this and found the reason: publications are invisible to
> some transactions.
> 
> To begin with, the following operations were executed in this case.
> Tuples were inserted after getting consistent_lsn, but before starting the
> standby.
> After running the workload, I confirmed again that the publication was created.
> 
> 1. on primary, logical replication slots were created.
> 2. on primary, another replication slot was created.
> 3. ===on primary, some tuples were inserted. ===
> 4. on standby, a server process was started
> 5. on standby, the process waited until all changes had arrived.
> 6. on primary, publications were created.
> 7. on standby, subscriptions were created.
> 8. on standby, the replication progress for each subscription was set to the
> given LSN (obtained at step 2).
> =====pg_subscriber finished here=====
> 9. on standby, a server process was started again
> 10. on standby, subscriptions were enabled. They referred to the slots
> created at step 1.
> 11. on primary, decoding started but an ERROR was raised.

Good catch! It is a design flaw.

> In this case, tuples were inserted *before creating the publication*.
> So I think the decoded transaction could not see the publication because
> the publication was committed after the insertions.
> 
> One solution is to create the publication before creating a consistent slot.
> Changes that arrived before the slot was created have surely been replicated
> to the standby, so upcoming transactions can see the object. We are planning
> a patch set that fixes the issue with this approach.

I'll include similar code in the next patch and also explain why we should
create the publication earlier. (I'm renaming
create_all_logical_replication_slots to setup_publisher, calling
create_publication from there, and also adding the proposed GUC checks to it.)
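
To illustrate why the order matters, here is a toy model in Python. It is
only a sketch, not PostgreSQL code: ToyPublisher, failing_order and
fixed_order are hypothetical names, and LSNs are plain counters. It assumes
that decoding looks up the publication using the catalog as it was when each
change was made, so a change made before the publication existed cannot be
decoded.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyPublisher:
    """Toy model of a primary; every action consumes the next LSN."""
    next_lsn: int = 1
    publication_lsn: Optional[int] = None  # LSN at which the publication was created
    inserts: List[int] = field(default_factory=list)  # LSNs of inserted tuples

    def _advance(self) -> int:
        lsn = self.next_lsn
        self.next_lsn += 1
        return lsn

    def create_publication(self) -> int:
        self.publication_lsn = self._advance()
        return self.publication_lsn

    def create_slot(self) -> int:
        """Return the slot's consistent LSN; decoding starts from here."""
        return self._advance()

    def insert_tuple(self) -> int:
        lsn = self._advance()
        self.inserts.append(lsn)
        return lsn

    def decode_from(self, start_lsn: int) -> List[int]:
        """Decode inserts at or after start_lsn, checking the catalog as of each change."""
        decoded = []
        for lsn in self.inserts:
            if lsn < start_lsn:
                continue
            # Assumption: the publication must already exist at the change's LSN.
            if self.publication_lsn is None or self.publication_lsn > lsn:
                raise LookupError(f"publication does not exist as of LSN {lsn}")
            decoded.append(lsn)
        return decoded


def failing_order() -> List[int]:
    """Slot first, then inserts, then publication (the reported sequence)."""
    p = ToyPublisher()
    consistent_lsn = p.create_slot()
    p.insert_tuple()
    p.create_publication()
    return p.decode_from(consistent_lsn)  # raises LookupError


def fixed_order() -> List[int]:
    """Publication first, then the consistent slot, then inserts."""
    p = ToyPublisher()
    p.create_publication()
    consistent_lsn = p.create_slot()
    p.insert_tuple()
    return p.decode_from(consistent_lsn)
```

In this model, failing_order() raises because the insert predates the
publication, while fixed_order() decodes the insert. Every change at or after
the consistent LSN is then newer than the publication.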


--
Euler Taveira
EDB   https://www.enterprisedb.com/
