On Mon, Sep 30, 2024 at 12:02 PM Zhijie Hou (Fujitsu) <houzj.f...@fujitsu.com> wrote:
>
> On Wednesday, September 25, 2024 2:23 AM Masahiko Sawada
> <sawada.m...@gmail.com> wrote:
> >
> > I think the remote wal flush location is asked using a replication protocol.
> > Therefore, if a new worker is responsible for asking wal flush location from
> > multiple publishers (like the idea (b)), the corresponding process would need
> > to be launched on publisher sides and logical replication would also need to
> > start on each connection. I think it would be better to get the remote wal flush
> > location using the existing logical replication connection (i.e., between the
> > logical wal sender and the apply worker), and advertise the locations on the
> > shared memory. Then, the central process who holds the slot to retain the
> > deleted row versions traverses them and increases slot.xmin if possible.
> >
> > The cost of requesting the remote wal flush location would not be huge if we
> > don't ask it very frequently. So probably we can start by having each apply
> > worker (in the retain_sub_list) ask the remote wal flush location and can leave
> > the optimization of avoiding sending the request for the same publisher.
>
> Agreed. Here is the POC patch set based on this idea.
>
> The implementation is as follows:
>
> A subscription option is added to allow users to specify whether dead
> tuples on the subscriber, which are useful for detecting update_deleted
> conflicts, should be retained. The default setting is false. If set to true,
> the detection of update_deleted will be enabled,
>

I find the option name retain_dead_tuples a bit misleading because one can't make out its purpose from the name. It would be better to name it detect_update_deleted or something along those lines.

> and an additional replication
> slot named pg_conflict_detection will be created on the subscriber to prevent
> dead tuples from being removed. Note that if multiple subscriptions on one node
> enable this option, only one replication slot will be created.
>

In general, we should have done this by default, but since detecting update_deleted conflicts has some overhead in terms of retaining dead tuples for longer, having an option seems reasonable. However, I suggest keeping this as a separate, last patch. If we can make the core idea work by default, then we can enable it via an option at the end.

--
With Regards,
Amit Kapila.