On Wed, Jan 6, 2021 at 3:39 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Wed, Jan 6, 2021 at 2:13 PM Peter Smith <smithpb2...@gmail.com> wrote:
> >
> > I think it makes sense. If there can be a race between the tablesync
> > re-launching (after error) and the AlterSubscription_refresh removing
> > some table's relid from the subscription, then there could be lurking
> > slot/origin tablesync resources (of the removed table) which a
> > subsequent DROP SUBSCRIPTION cannot discover. I will think more about
> > how/if it is possible to make this happen. Anyway, I suppose I ought
> > to refactor/isolate some of the tablesync cleanup code in case it
> > needs to be commonly called from DropSubscription and/or from
> > AlterSubscription_refresh.
> >
>
> Fair enough.
>
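For reference, leftover tablesync resources of the kind described above can be inspected and dropped by hand; a rough sketch follows (the slot and origin names are placeholders, not the actual tablesync naming scheme):

```sql
-- On the publisher: list replication slots and drop an orphaned one.
-- 'orphaned_sync_slot' is a placeholder name.
SELECT slot_name, active FROM pg_replication_slots;
SELECT pg_drop_replication_slot('orphaned_sync_slot');

-- On the subscriber: list replication origins and drop an orphaned one.
-- 'orphaned_sync_origin' is a placeholder name.
SELECT roname FROM pg_replication_origin;
SELECT pg_replication_origin_drop('orphaned_sync_origin');
```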
I think before implementing, we should first try to reproduce this case. I
understand this is a timing issue and can probably be reproduced only with
the help of a debugger, but we should do that.

> BTW, I have analyzed whether we need any modifications to
> pg_dump/restore for this patch, as it changes the state of one of the
> fields in the system table, and concluded that we don't need any
> change. For subscriptions, we don't dump any of the information from
> pg_subscription_rel; rather, we just dump subscriptions with the
> connect option set to false, which means users need to enable the
> subscription and refresh the publication after restore. I have checked
> this in the code and tested it as well. The related information is
> present in the pg_dump doc page [1]; see from "When dumping logical
> replication subscriptions ...".
>

I have further analyzed that we don't need to do anything w.r.t.
pg_upgrade either, because it uses pg_dump/pg_dumpall to dump the schema
info of the old cluster and then restores it to the new cluster. And we
know that pg_dump ignores the info in pg_subscription_rel, so we don't
need to change anything, as our changes are specific to the state of one
of the columns in pg_subscription_rel. I have not tested this, but we
should test it by having some relations in a not-ready state and then
upgrading the old cluster (<= PG13) to the new one (PG14), both with and
without this patch, and see if there is any change in behavior.

--
With Regards,
Amit Kapila.
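To make the post-restore steps mentioned above concrete, here is a minimal sketch ('mysub' is a placeholder subscription name; depending on how the subscription was created, the remote replication slot may also need to be set up before enabling):

```sql
-- After restore, a subscription dumped with connect = false must be
-- re-enabled and its publication refreshed by hand.
ALTER SUBSCRIPTION mysub ENABLE;
ALTER SUBSCRIPTION mysub REFRESH PUBLICATION;
```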