On Tue, Jun 18, 2024 at 12:11 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Tue, Jun 18, 2024 at 11:54 AM Dilip Kumar <dilipbal...@gmail.com> wrote: > > > > On Mon, Jun 17, 2024 at 8:51 PM Robert Haas <robertmh...@gmail.com> wrote: > > > > > > On Mon, Jun 17, 2024 at 1:42 AM Amit Kapila <amit.kapil...@gmail.com> > > > wrote: > > > > The difference w.r.t the existing mechanisms for holding deleted data > > > > is that we don't know whether we need to hold off the vacuum from > > > > cleaning up the rows because we can't say with any certainty whether > > > > other nodes will perform any conflicting operations in the future. > > > > Using the example we discussed, > > > > Node A: > > > > T1: INSERT INTO t (id, value) VALUES (1,1); > > > > T2: DELETE FROM t WHERE id = 1; > > > > > > > > Node B: > > > > T3: UPDATE t SET value = 2 WHERE id = 1; > > > > > > > > Say the order of receiving the commands is T1-T2-T3. We can't predict > > > > whether we will ever get T-3, so on what basis shall we try to prevent > > > > vacuum from removing the deleted row? > > > > > > The problem arises because T2 and T3 might be applied out of order on > > > some nodes. Once either one of them has been applied on every node, no > > > further conflicts are possible. > > > > If we decide to skip the update whether the row is missing or deleted, > > we indeed reach the same end result regardless of the order of T2, T3, > > and Vacuum. Here's how it looks in each case: > > > > Case 1: T1, T2, Vacuum, T3 -> Skip the update for a non-existing row > > -> end result we do not have a row. > > Case 2: T1, T2, T3 -> Skip the update for a deleted row -> end result > > we do not have a row. > > Case 3: T1, T3, T2 -> deleted the row -> end result we do not have a row. > > > > In case 3, how can deletion be successful? The row required to be > deleted has already been updated.
Hmm, I was considering this case in the example given by you above[1], so we have updated some fields of the row with id=1, isn't this row still detectable by the delete because delete will find this by id=1 as we haven't updated the id? I was making the point w.r.t. the example used above. [1] > > > > Node A: > > > > T1: INSERT INTO t (id, value) VALUES (1,1); > > > > T2: DELETE FROM t WHERE id = 1; > > > > > > > > Node B: > > > > T3: UPDATE t SET value = 2 WHERE id = 1; -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com