On Wed, Jan 13, 2021 at 11:08 AM Bharath Rupireddy
<bharath.rupireddyforpostg...@gmail.com> wrote:
>
> On Wed, Jan 13, 2021 at 10:33 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
> >
> > On Tue, Jan 12, 2021 at 5:23 PM japin <japi...@hotmail.com> wrote:
> > >
> > > On Tue, 12 Jan 2021 at 19:32, Bharath Rupireddy wrote:
> > > > On Tue, Jan 12, 2021 at 4:47 PM Li Japin <japi...@hotmail.com> wrote:
> > > >> IIUC the logical replication only replicate the tables in publication, 
> > > >> I think
> > > >> when the tables that aren't in publication should not be replicated.
> > > >>
> > > >> Attached the patch that fixes it.  Thought?
> > > >
> > > > With that change, we don't get the behaviour that's stated in the
> > > > document - "The ADD TABLE and DROP TABLE clauses will add and remove
> > > > one or more tables from the publication. Note that adding tables to a
> > > > publication that is already subscribed to will require a ALTER
> > > > SUBSCRIPTION ... REFRESH PUBLICATION action on the subscribing side in
> > > > order to become effective" -
> > > > https://www.postgresql.org/docs/devel/sql-alterpublication.html.
> > > >
> > >
> > > The documentation only emphasize adding tables to a publication, not
> > > include dropping tables from a publication.
> > >
> >
> > Right and I think that is ensured by the subscriber by calling
> > should_apply_changes_for_rel() which won't return true unless the
> > newly added relation is not synced via Refresh Publication. So, this
> > means with or without this patch we should be sending the changes of
> > the newly published table from the publisher?
>
> Oh, my bad, alter subscription...refresh publication is required only when 
> the tables are added to the publisher. Patch by japin makes the walsender 
> process to stop sending the data to the subscriber/apply worker. The patch is 
> based on the idea of looking at the PUBLICATIONRELMAP in get_rel_sync_entry 
> when the entries have been invalidated in rel_sync_cache_publication_cb 
> because of alter publication...drop table.
> ,
> When the alter subscription...refresh publication is run on the subscriber, 
> the SUBSCRIPTIONRELMAP catalogue gets invalidated but the corresponding cache 
> entries in the LogicalRepRelMap which is used by logicalrep_rel_open are not 
> invalidated. LogicalRepRelMap is used to know the relations that are 
> associated with the subscription. But we seem to have not taken care of 
> invalidating those entries, though we have the invalidation callback 
> invalidate_syncing_table_states registered for SUBSCRIPTIONRELMAP in 
> ApplyWorkerMain. So, we miss to work on updating the entries in 
> LogicalRepRelMap.
>
> IMO, the ideal way to fix this issue is 1) stop the walsender sending the 
> changes to dropped tables, for this japin patch works 2) we must mark all the 
> LogicalRepRelMap entries as invalid in invalidate_syncing_table_states so 
> that in the next logicalrep_rel_open, if the entry is invalidated, then we 
> have to call GetSubscriptionRelState to get the latest state, as shown in 
> [1]. Likewise, we might also have to mark the cache entries invalid in  
> subscription_change_cb which is invalidation callback for pg_subscription
>

Is the second point you described here is related to the original
bug-reported or will it cause some other problem?

> Thoughts?
>
> [1] -
>     if (entry->state != SUBREL_STATE_READY || entry->invalid)
>         entry->state = GetSubscriptionRelState(MySubscription->oid,
>                                                entry->localreloid,
>                                                &entry->statelsn);
>
>    if (entry->invalid)
>        entry->invalid = false;
>
>     return entry;
>
> > I have another question on your patch which is why in some cases like
> > when we have not inserted in step-5 (as mentioned by you) the
> > following insertions are not sent. Is somehow we are setting the
> > pubactions as false in that case, if so, how?
>
> The reason is that the issue reported in this thread occurs - when we have 
> the walsender process running, RelationSyncCache is initialized, we inserted 
> some data into the table that's sent to the subscriber and the table is 
> dropped, we miss to set the pubactions to false in get_rel_sync_entry, though 
> the cache entries have been invalidated.
>
> In some cases, it works properly because the old walsender process was 
> stopped and when a new walsender is started, then the cache RelationSyncCache 
> gets initialized again and the pubactions will be set to false in 
> get_rel_sync_entry.
>

Why is walsender process was getting stopped in one case but not in another?

-- 
With Regards,
Amit Kapila.


Reply via email to