On Wed, Sep 11, 2024 at 11:07 AM Zhijie Hou (Fujitsu) <houzj.f...@fujitsu.com> wrote: > > On Wednesday, September 11, 2024 1:03 PM shveta malik > <shveta.ma...@gmail.com> wrote: > > > > > > > > > > Another query is about 3 node setup. I couldn't figure out what > > > > would be feedback_slots setting when it is not bidirectional, as in > > > > consider the case where there are three nodes A,B,C. Node C is > > > > subscribing to both Node A and Node B. Node A and Node B are the > > > > ones doing concurrent "update" and "delete" which will both be > > > > replicated to Node C. In this case what will be the feedback_slots > > > > setting on Node C? We don't have any slots here which will be > > > > replicating changes from Node C to Node A and Node C to Node B. This > > > > is given in [3] in your first email ([1]) > > > > > > Thanks for pointing this, the link was a bit misleading. I think the > > > solution proposed in this thread is only used to allow detecting > > > update_deleted reliably in a bidirectional cluster. For non- > > > bidirectional cases, it would be more tricky to predict the timing till > > > when > > should we retain the dead tuples. > > > > > > > So in brief, this solution is only for bidrectional setup? For > > non-bidirectional, > > feedback_slots is non-configurable and thus irrelevant. > > Right. >
One possible idea to address the non-bidirectional case raised by Shveta is to use a time-based cut-off to remove dead tuples. As mentioned earlier in my email [1], we can define a new GUC parameter say vacuum_committs_age which would indicate that we will allow rows to be removed only if the modified time of the tuple as indicated by committs module is greater than the vacuum_committs_age. We could keep this parameter a table-level option without introducing a GUC as this may not apply to all tables. I checked and found that some other replication solutions like GoldenGate also allowed similar parameters (tombstone_deletes) to be specified at table level [2]. The other advantage of allowing it at table level is that it won't hamper the performance of hot-pruning or vacuum in general. Note, I am careful here because to decide whether to remove a dead tuple or not we need to compare its committs_time both during hot-pruning and vacuum. Note that tombstones_deletes is a general concept used by replication solutions to detect updated_deleted conflict and time-based purging is recommended. See [3][4]. We previously discussed having tombstone tables to keep the deleted records information but it was suggested to prevent the vacuum from removing the required dead tuples as that would be simpler than inventing a new kind of tables/store for tombstone_deletes [5]. So, we came up with the idea of feedback slots discussed in this email but that didn't work out in all cases and appears difficult to configure as pointed out by Shveta. So, now, we are back to one of the other ideas [1] discussed previously to solve this problem. Thoughts? [1] - https://www.postgresql.org/message-id/CAA4eK1Lj-PWrP789KnKxZydisHajd38rSihWXO8MVBLDwxG1Kg%40mail.gmail.com [2] - BEGIN DBMS_GOLDENGATE_ADM.ALTER_AUTO_CDR( schema_name => 'hr', table_name => 'employees', tombstone_deletes => TRUE); END; / [3] - https://en.wikipedia.org/wiki/Tombstone_(data_store) [4] - https://docs.oracle.com/en/middleware/goldengate/core/19.1/oracle-db/automatic-conflict-detection-and-resolution1.html#GUID-423C6EE8-1C62-4085-899C-8454B8FB9C92 [5] - https://www.postgresql.org/message-id/e4cdb849-d647-4acf-aabe-7049ae170fbf%40enterprisedb.com -- With Regards, Amit Kapila.