On Wed, Jan 13, 2021 at 11:08 AM Bharath Rupireddy <bharath.rupireddyforpostg...@gmail.com> wrote: > > On Wed, Jan 13, 2021 at 10:33 AM Amit Kapila <amit.kapil...@gmail.com> wrote: > > > > On Tue, Jan 12, 2021 at 5:23 PM japin <japi...@hotmail.com> wrote: > > > > > > On Tue, 12 Jan 2021 at 19:32, Bharath Rupireddy wrote: > > > > On Tue, Jan 12, 2021 at 4:47 PM Li Japin <japi...@hotmail.com> wrote: > > > >> IIUC the logical replication only replicate the tables in publication, > > > >> I think > > > >> when the tables that aren't in publication should not be replicated. > > > >> > > > >> Attached the patch that fixes it. Thought? > > > > > > > > With that change, we don't get the behaviour that's stated in the > > > > document - "The ADD TABLE and DROP TABLE clauses will add and remove > > > > one or more tables from the publication. Note that adding tables to a > > > > publication that is already subscribed to will require a ALTER > > > > SUBSCRIPTION ... REFRESH PUBLICATION action on the subscribing side in > > > > order to become effective" - > > > > https://www.postgresql.org/docs/devel/sql-alterpublication.html. > > > > > > > > > > The documentation only emphasize adding tables to a publication, not > > > include dropping tables from a publication. > > > > > > > Right and I think that is ensured by the subscriber by calling > > should_apply_changes_for_rel() which won't return true unless the > > newly added relation is not synced via Refresh Publication. So, this > > means with or without this patch we should be sending the changes of > > the newly published table from the publisher? > > Oh, my bad, alter subscription...refresh publication is required only when > the tables are added to the publisher. Patch by japin makes the walsender > process to stop sending the data to the subscriber/apply worker. The patch is > based on the idea of looking at the PUBLICATIONRELMAP in get_rel_sync_entry > when the entries have been invalidated in rel_sync_cache_publication_cb > because of alter publication...drop table. > , > When the alter subscription...refresh publication is run on the subscriber, > the SUBSCRIPTIONRELMAP catalogue gets invalidated but the corresponding cache > entries in the LogicalRepRelMap which is used by logicalrep_rel_open are not > invalidated. LogicalRepRelMap is used to know the relations that are > associated with the subscription. But we seem to have not taken care of > invalidating those entries, though we have the invalidation callback > invalidate_syncing_table_states registered for SUBSCRIPTIONRELMAP in > ApplyWorkerMain. So, we miss to work on updating the entries in > LogicalRepRelMap. > > IMO, the ideal way to fix this issue is 1) stop the walsender sending the > changes to dropped tables, for this japin patch works 2) we must mark all the > LogicalRepRelMap entries as invalid in invalidate_syncing_table_states so > that in the next logicalrep_rel_open, if the entry is invalidated, then we > have to call GetSubscriptionRelState to get the latest state, as shown in > [1]. Likewise, we might also have to mark the cache entries invalid in > subscription_change_cb which is invalidation callback for pg_subscription >
Is the second point you described here is related to the original bug-reported or will it cause some other problem? > Thoughts? > > [1] - > if (entry->state != SUBREL_STATE_READY || entry->invalid) > entry->state = GetSubscriptionRelState(MySubscription->oid, > entry->localreloid, > &entry->statelsn); > > if (entry->invalid) > entry->invalid = false; > > return entry; > > > I have another question on your patch which is why in some cases like > > when we have not inserted in step-5 (as mentioned by you) the > > following insertions are not sent. Is somehow we are setting the > > pubactions as false in that case, if so, how? > > The reason is that the issue reported in this thread occurs - when we have > the walsender process running, RelationSyncCache is initialized, we inserted > some data into the table that's sent to the subscriber and the table is > dropped, we miss to set the pubactions to false in get_rel_sync_entry, though > the cache entries have been invalidated. > > In some cases, it works properly because the old walsender process was > stopped and when a new walsender is started, then the cache RelationSyncCache > gets initialized again and the pubactions will be set to false in > get_rel_sync_entry. > Why is walsender process was getting stopped in one case but not in another? -- With Regards, Amit Kapila.