On Mon, 29 Aug 2022 at 11:59, Peter Smith <smithpb2...@gmail.com> wrote: > > Here are some review comments for patch v42-0001: > > ====== > > 1. Commit message > > A later review comment below suggests some changes to the WARNING > message so if those changes are made then the example in this commit > message also needs to be modified.
Modified > ====== > > 2. doc/src/sgml/ref/alter_subscription.sgml - copy_data > > Refer to the Notes for the usage of true for copy_data parameter and > its interaction with the origin parameter. > > I think saying "usage of true" sounded a bit strange. > > SUGGESTION > Refer to the Notes about how copy_data = true can interact with the > origin parameter. Modified > ====== > > 3. doc/src/sgml/ref/create_subscription.sgml - copy data > > Refer to the Notes for the usage of true for copy_data parameter and > its interaction with the origin parameter. > > SUGGESTION > (same as #2) Modified > ~~ > > 4. doc/src/sgml/ref/create_subscription.sgml - origin > > Refer to the Notes for the usage of true for copy_data parameter and > its interaction with the origin parameter. > > SUGGESTION > (same as #2) Modified > ~~~ > > 5. doc/src/sgml/ref/create_subscription.sgml - Notes > > + <para> > + If the subscription is created with <literal>origin = none</literal> and > + <literal>copy_data = true</literal>, it will check if the publisher has > + subscribed to the same table from other publishers and, if so, throw a > + warning to notify user to check the publisher tables. The user can ensure > > "throw a warning to notify user" -> "log a warning to notify the user" Modified > ~~ > > 6. > > + warning to notify user to check the publisher tables. The user can ensure > + that publisher tables do not have data which has an origin associated > before > + continuing with any other operations to prevent inconsistent data being > + replicated. > + </para> > > 6a. > > I'm not so sure about this. IMO the warning is not about the > replication – really it is about the COPY which has already happened > anyway. So the user can't really prevent anything from going wrong; > instead, they have to take some action to clean up if anything did go > wrong. > > SUGGESTION > Before continuing with other operations the user should check that > publisher tables did not have data with different origins, otherwise > inconsistent data may have been copied. Modified based on the suggestion provided by Amit. > ~ > > 6b. > > I am also wondering what can the user do now. Assuming there was bad > COPY then the subscriber table has already got unwanted stuff copied > into it. Is there any advice we can give to help users fix this mess? There is nothing much a user can do in this case. Only option would be to take a backup before the operation and restore it and then recreate the replication setup. I was not sure if we should document these steps as similar inconsistent data could get created without the origin option . Thoughts? > ====== > > 7. src/backend/commands/subscriptioncmds.c - AlterSubscription_refresh > > + /* Check whether we can allow copy of newly added relations. */ > + check_pub_table_subscribed(wrconn, sub->publications, copy_data, > + sub->origin, subrel_local_oids, > + subrel_count); > > "whether we can allow" seems not quite the right wording here anymore, > because now there is no ERROR stopping this - so if there was unwanted > data the COPY will proceed to copy it anyhow... Removed the comment as the function details the necessary things. > ~~~ > > 8. src/backend/commands/subscriptioncmds.c - check_pub_table_subscribed > > @@ -1781,6 +1794,121 @@ AlterSubscriptionOwner_oid(Oid subid, Oid newOwnerId) > table_close(rel, RowExclusiveLock); > } > > +/* > + * Check and throw a warning if the publisher has subscribed to the same > table > + * from some other publisher. This check is required only if "copy_data = > true" > + * and "origin = none" for CREATE SUBSCRIPTION and > + * ALTER SUBSCRIPTION ... REFRESH statements to notify user that data having > + * origin might have been copied. > > "throw a warning" -> "log a warning" > > "to notify user" -> "to notify the user" ? Modified > ~~~ > > 9. > > + if (copydata == false || !origin || > + (pg_strcasecmp(origin, LOGICALREP_ORIGIN_NONE) != 0)) > + return; > > SUGGESTION > if (!copydata || !origin || > (pg_strcasecmp(origin, LOGICALREP_ORIGIN_NONE) != 0)) Modified > ~~~ > > 10. (Question) > > I can't tell just by reading the code if FOR ALL TABLES case is > handled – e.g. will this recognise the case were the publisher might > have table data from other origins because it has a subscription on > some other node that was publishing "FOR ALL TABLES"? Yes it handles it. > ~~~ > > 11. > > + ereport(WARNING, > + errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > + errmsg("publisher has subscribed table \"%s.%s\" from some other publisher", > + nspname, relname), > + errdetail("Publisher might have subscribed one or more tables from > some other publisher."), > + errhint("Verify that these publisher tables do not have data that > has an origin associated before proceeding to avoid inconsistency.")); > + break; > > 11a. > I'm not sure having that "break" code is correct logic. Won't that > mean the user will only get a warning for the first potential problem > encountered but then other potential problems tables will have no > warnings at all so the user may not be made aware of them? Perhaps you > need to first gather the list of all the suspicious tables before > logging a single warning which shows that list? Or perhaps you need to > log multiple warnings – one warning per suspicious table? Modified to get the list of all the tables, I have limited it to display 100 tables followed by ... to indicate there might be more tables. I did not want to loga warning message with many 1000's of tables which will not help the readability of the message. > ~ > > 11b. > The WARNING message seems a bit inside-out because the errmsg is > giving more details than the errdetail. > > SUGGESTIONS (or something similar) > errmsg - "subscription XXX requested origin=NONE but may have copied > data that had a different origin." > errdetail – Publisher YYY has subscribed table \"%s.%s\" from some > other publisher" We don't have the tables based on each publisher, it is for all the publishers. I have changed the errdetail message slightly. > ~ > > 11c. > The errhint sentence is unusual. > > BEFORE > "Verify that these publisher tables do not have data that has an > origin associated before proceeding to avoid inconsistency." > > SUGGESTION (or something similar) > Before proceeding, verify that initial data copied from the publisher > tables did not come from other origins. Modified > ====== > > 12. src/test/subscription/t/030_origin.pl > > +# have remotely originated data from node_A. We throw a warning, in this > case, > +# to draw attention to there being possible remote data. > > "throw a warning" -> "log a warning" (this occurs 2x) Modified The attached v43 patch has the changes for the same. Regards, Vignesh
v43-0001-Check-and-log-a-warning-if-the-publisher-has-sub.patch
Description: Binary data
v43-0002-Document-the-steps-for-replication-between-prima.patch
Description: Binary data