On Fri, Jul 3, 2026 at 9:24 AM Dilip Kumar <[email protected]> wrote:
>
> On Fri, Jul 3, 2026 at 9:09 AM shveta malik <[email protected]> wrote:
> >
> > On Thu, Jul 2, 2026 at 5:30 PM Dilip Kumar <[email protected]> wrote:
> > >
> > > On Thu, Jul 2, 2026 at 4:48 PM Shlok Kyal <[email protected]> 
> > > wrote:
> > > >
> > > > On Thu, 2 Jul 2026 at 13:39, Dilip Kumar <[email protected]> wrote:
> > > > >
> > > > > On Wed, Jul 1, 2026 at 6:55 PM Dilip Kumar <[email protected]> 
> > > > > wrote:
> > > > > >
> > > > > > On Wed, Jul 1, 2026 at 1:16 PM shveta malik 
> > > > > > <[email protected]> wrote:
> > > > > > >
> > > > > > > On Wed, Jul 1, 2026 at 12:45 PM Dilip Kumar 
> > > > > > > <[email protected]> wrote:
> > > > > > > >
> > > > > > > > Thanks I have fixed these comments and also merged the docs.
> > > > > > >
> > > > > > > Thanks. Sharing the version with pgindent run on v61.
> > > > > >
> > > > > > Here is the updated patch version. It fixes offlist review comments
> > > > > > provided by Amit and Shveta, bumps the catversion and adds authors 
> > > > > > and
> > > > > > reviewers details.
> > > > >
> > > > > I rebased the remaining patches on top of HEAD. So far, I have run
> > > > > pgindent and completed the doc merge for 0001.  The 0002 and 0003 are
> > > > > just rebased.
> > > >
> > > > Hi Dilip,
> > > >
> > > > While testing I hit the following assert:
> > > > +           if (OidIsValid(relid))
> > > > +           {
> > > > +               /* Existing table from upgrade */
> > > > +               Assert(IsBinaryUpgrade);
> > > > +           }
> > > >
> > > > Steps to reproduce:
> > > > 1. Suppose initially we have a logical replication setup and the
> > > > subscriber is created with 'conflict_log_destination = log'
> > > > 2. Now we have two psql sessions of the subscriber node. Attach GDB to
> > > > first session and add breakpoint to line:
> > > > AlterSubscription->LockSharedObject(SubscriptionRelationId, subid, 0,
> > > > AccessExclusiveLock);
> > > > 3.Execute 'ALTER SUBSCRIPTION sub1 SET (conflict_log_destination =
> > > > 'all')' on the first session. It will stop at the breakpoint.
> > > > 4. Now execute 'ALTER SUBSCRIPTION sub1 SET (conflict_log_destination
> > > > = 'all');' on the second session. The execution on the second session
> > > > is successful and a conflict log table is created.
> > > > 5. Continue the execution of the first session. This will hit the
> > > > above Assert condition.
> > > >
> > > > I think this happens because for the first session, we have the stale
> > > > copy of 'sub'.
> > > > While the first session is stopped at the breakpoint, the second
> > > > session successfully executes ALTER SUBSCRIPTION and creates the
> > > > conflict log table. When the first session resumes, it still sees the
> > > > old value of old_dest (i.e CONFLICT_LOG_DEST_LOG), so
> > > > 'opts.conflictlogdest != old_dest' evaluates to true and
> > > > 'alter_sub_conflict_log_dest()' is invoked.
> > > > Since the second session has already created the conflict log table
> > > > and 'want_table' is true, the code finds an existing conflict log
> > > > table '(OidIsValid(relid))', causing the assert to fire.
> > >
> > > In short, if I understand correctly, the issue is that altering a
> > > subscription that tries to create the table might hit this assert
> > > because we assume that if we are switching from 'log' to 'table' or
> > > 'all', there should not be an existing table except in binary upgrade
> > > mode.  This happens because we read the destination information from
> > > subscription cache while checking the table status in real time.  In
> > > fact in other cases, it might cause alter subscription to error out if
> > > the table is created while we are creating the table.  Therefore, we
> > > need to handle concurrent table creation during ALTER SUBSCRIPTION.
> > > One option is: if the table exists and it's not a binary upgrade,
> > > should we just error out?
> > >
> >
> > No, I feel it should be Assert only. IMO, we should re-fetch 'sub'
> > beofre CLT creation or move first 'sub' fetching itself post
> > LockSharedObject if there is nothing blocking it. Thoughts?
>
> Yeah, IMHO once we fix [1] this issue should be automatically resolved.
>
> [1] https://www.postgresql.org/message-id/akZUpiDa1UfmzYxL%40bdtpg
>

Oh yes, it seems to point to the issue due to the exact same root
cause (stale sub). Yes, it will be fix this.

thanks
Shveta


Reply via email to