On Mon, Jan 10, 2022 at 6:27 PM vignesh C <vignes...@gmail.com> wrote: > > On Fri, Jan 7, 2022 at 11:23 AM Masahiko Sawada <sawada.m...@gmail.com> wrote: > > > > On Fri, Jan 7, 2022 at 10:04 AM Masahiko Sawada <sawada.m...@gmail.com> > > wrote: > > > > > > On Wed, Jan 5, 2022 at 12:31 PM Amit Kapila <amit.kapil...@gmail.com> > > > wrote: > > > > > > > > On Mon, Dec 27, 2021 at 9:54 AM Masahiko Sawada <sawada.m...@gmail.com> > > > > wrote: > > > > > > > > > > On Thu, Dec 16, 2021 at 2:42 PM Masahiko Sawada > > > > > <sawada.m...@gmail.com> wrote: > > > > > > > > > > > > On Thu, Dec 16, 2021 at 2:21 PM Amit Kapila > > > > > > <amit.kapil...@gmail.com> wrote: > > > > > > > > > > > > > > On Thu, Dec 16, 2021 at 10:37 AM Masahiko Sawada > > > > > > > <sawada.m...@gmail.com> wrote: > > > > > > > > > > > > > > > > On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila > > > > > > > > <amit.kapil...@gmail.com> wrote: > > > > > > > > > > > > > > > > > > I thought we just want to lock before clearing the skip_xid > > > > > > > > > something > > > > > > > > > like take the lock, check if the skip_xid in the catalog is > > > > > > > > > the same > > > > > > > > > as we have skipped, if it is the same then clear it, > > > > > > > > > otherwise, leave > > > > > > > > > it as it is. How will that disallow users to change skip_xid > > > > > > > > > when we > > > > > > > > > are skipping changes? > > > > > > > > > > > > > > > > Oh I thought we wanted to keep holding the lock while skipping > > > > > > > > changes > > > > > > > > (changing skip_xid requires acquiring the lock). > > > > > > > > > > > > > > > > So if skip_xid is already changed, the apply worker would do > > > > > > > > replorigin_advance() with WAL logging, instead of committing the > > > > > > > > catalog change? > > > > > > > > > > > > > > > > > > > > > > Right. BTW, how are you planning to advance the origin? Normally, > > > > > > > a > > > > > > > commit transaction would do it but when we are skipping all > > > > > > > changes, > > > > > > > the commit might not do it as there won't be any transaction id > > > > > > > assigned. > > > > > > > > > > > > I've not tested it yet but replorigin_advance() with wal_log = true > > > > > > seems to work for this case. > > > > > > > > > > I've tested it and realized that we cannot use replorigin_advance() > > > > > for this purpose without changes. That is, the current > > > > > replorigin_advance() doesn't allow to advance the origin by the owner: > > > > > > > > > > /* Make sure it's not used by somebody else */ > > > > > if (replication_state->acquired_by != 0) > > > > > { > > > > > ereport(ERROR, > > > > > (errcode(ERRCODE_OBJECT_IN_USE), > > > > > errmsg("replication origin with OID %d is already > > > > > active for PID %d", > > > > > replication_state->roident, > > > > > replication_state->acquired_by))); > > > > > } > > > > > > > > > > So we need to change it so that the origin owner can advance its > > > > > origin, which makes sense to me. > > > > > > > > > > Also, when we have to update the origin instead of committing the > > > > > catalog change while updating the origin, we cannot record the origin > > > > > timestamp. > > > > > > > > > > > > > Is it because we currently update the origin timestamp with commit > > > > record? > > > > > > Yes. > > > > > > > > > > > > This behavior makes sense to me because we skipped the > > > > > transaction. But ISTM it’s not good if we emit the origin timestamp > > > > > only when directly updating the origin. So probably we need to always > > > > > omit origin timestamp. > > > > > > > > > > > > > Do you mean to say that you want to omit it even when we are > > > > committing the changes? > > > > > > Yes, it would be better to record only origin lsn in terms of consistency. > > > > > > > > > > > > Apart from that, I'm vaguely concerned that the logic seems to be > > > > > getting complex. Probably it comes from the fact that we store > > > > > skip_xid in the catalog and update the catalog to clear/set the > > > > > skip_xid. It might be worth revisiting the idea of storing skip_xid on > > > > > shmem (e.g., ReplicationState)? > > > > > > > > > > > > > IIRC, the problem with that idea was that we won't remember skip_xid > > > > information after server restart and the user won't even know that it > > > > has to set it again. > > > > > > Right, I agree that it’s not convenient when the server restarts or > > > crashes, but these problems could not be critical in the situation > > > where users have to use this feature; the subscriber already entered > > > an error loop so they can know xid again and it’s an uncommon case > > > that they need to restart during skipping changes. > > > > > > Anyway, I'll submit an updated patch soon so we can discuss complexity > > > vs. convenience. > > > > Attached an updated patch. Please review it.
Thank you for the comments! > > Thanks for the updated patch, few comments: > 1) Should this be case insensitive to support NONE too: > + /* Setting xid = NONE is treated as resetting xid */ > + if (strcmp(xid_str, "none") == 0) > + xid = InvalidTransactionId; I think the string value is always small cases so we don't need to do strcacsecmp here. > > 2) Can we have an option to specify last_error_xid of > pg_stat_subscription_workers. Something like: > alter subscription sub1 skip ( XID = 'last_subscription_error'); > > When the user specified last_subscription_error, it should pick > last_error_xid from pg_stat_subscription_workers. > As this operation is a critical operation, if there is an option which > could automatically pick and set from pg_stat_subscription_workers, it > would be useful. As I mentioned before in another mail, I think we can do that in a separate patch. > > 3) Currently the following syntax is being supported, I felt this > should throw an error: > postgres=# alter subscription sub1 set ( XID = 100); > ALTER SUBSCRIPTION Fixed. > > 4) You might need to rebase the patch: > git am v2-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patch > Applying: Add ALTER SUBSCRIPTION ... SKIP to skip the transaction on > subscriber nodes > error: patch failed: doc/src/sgml/logical-replication.sgml:333 > error: doc/src/sgml/logical-replication.sgml: patch does not apply > Patch failed at 0001 Add ALTER SUBSCRIPTION ... SKIP to skip the > transaction on subscriber nodes > hint: Use 'git am --show-current-patch=diff' to see the failed patch > > 5) You might have to rename 027_skip_xact to 028_skip_xact as > 027_nosuperuser.pl already exists > diff --git a/src/test/subscription/t/027_skip_xact.pl > b/src/test/subscription/t/027_skip_xact.pl > new file mode 100644 > index 0000000000..a63c9c345e > --- /dev/null > +++ b/src/test/subscription/t/027_skip_xact.pl I've resolved these conflicts. These comments are incorporated into the latest v3 patch I just submitted[1]. Regards, [1] https://www.postgresql.org/message-id/CAD21AoD9JXah2V8uFURUpZbK_ewsut%2Bjb1ESm6YQkrhQm3nJRg%40mail.gmail.com -- Masahiko Sawada EDB: https://www.enterprisedb.com/