On Sat, Dec 18, 2021 at 1:33 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Fri, Dec 17, 2021 at 5:29 PM Greg Nancarrow <gregn4...@gmail.com> wrote: > > > > On Fri, Dec 17, 2021 at 7:20 PM Ajin Cherian <itsa...@gmail.com> wrote: > > > > > > On Fri, Dec 17, 2021 at 5:46 PM Greg Nancarrow <gregn4...@gmail.com> > > > wrote: > > > > > > > So using the v47 patch-set, I still find that the UPDATE above results > > > > in publication of an INSERT of (2,1), rather than an UPDATE of (1,1) to > > > > (2,1). > > > > This is according to the 2nd UPDATE rule below, from patch 0003. > > > > > > > > + * old-row (no match) new-row (no match) -> (drop change) > > > > + * old-row (no match) new row (match) -> INSERT > > > > + * old-row (match) new-row (no match) -> DELETE > > > > + * old-row (match) new row (match) -> UPDATE > > > > > > > > This is because the old row (1,1) doesn't match the UPDATE filter > > > > "(a>1)", but the new row (2,1) does. > > > > This functionality doesn't seem right to me. I don't think it can be > > > > assumed that (1,1) was never published (and thus requires an INSERT > > > > rather than UPDATE) based on these checks, because in this example, > > > > (1,1) was previously published via a different operation - INSERT (and > > > > using a different filter too). > > > > I think the fundamental problem here is that these UPDATE rules assume > > > > that the old (current) row was previously UPDATEd (and published, or > > > > not published, according to the filter applicable to UPDATE), but this > > > > is not necessarily the case. > > > > Or am I missing something? > > > > > > But it need not be correct in assuming that the old-row was part of a > > > previous INSERT either (and published, or not published according to > > > the filter applicable to an INSERT). > > > For example, change the sequence of inserts and updates prior to the > > > last update: > > > > > > truncate tbl1 ; > > > insert into tbl1 values (1,5); ==> not replicated since insert and ! (b < > > > 2); > > > update tbl1 set b = 1; ==> not replicated since update and ! (a > 1) > > > update tbl1 set a = 2; ==> replicated and update converted to insert > > > since (a > 1) > > > > > > In this case, the last update "update tbl1 set a = 2; " is updating a > > > row that was previously updated and not inserted and not replicated to > > > the subscriber. > > > How does the replication logic differentiate between these two cases, > > > and decide if the update was previously published or not? > > > I think it's futile for the publisher side to try and figure out the > > > history of published rows. In fact, if this level of logic is required > > > then it is best implemented on the subscriber side, which then defeats > > > the purpose of a publication filter. > > > > > > > I think it's a concern, for such a basic example with only one row, > > getting unpredictable (and even wrong) replication results, depending > > upon the order of operations. > > > > I am not sure how we can deduce that. The results are based on current > and new values of row which is what I think we are expecting here. > > > Doesn't this problem result from allowing different WHERE clauses for > > different pubactions for the same table? > > My current thoughts are that this shouldn't be allowed, and also WHERE > > clauses for INSERTs should, like UPDATE and DELETE, be restricted to > > using only columns covered by the replica identity or primary key. > > > > Hmm, even if we do that one could have removed the insert row filter > by the time we are evaluating the update. So, we will get the same > result. I think the behavior in your example is as we expect as per > the specs defined by the patch and I don't see any problem, in this > case, w.r.t replication results. Let us see what others think on this? >
I think currently there could be a problem with user perceptions. IMO a user would be mostly interested in predictability and getting results that are intuitive. So, even if all strange results can (after careful examination) be after-the-fact explained away as being "correct" according to a spec, I don't think that is going to make any difference. e.g. regardless of correctness, even if it just "appeared" to give unexpected results then a user may just decide that row-filtering is not worth their confusion... Perhaps there is a slightly dumbed-down RF design that can still be useful, but which can give much more comfort to the user because the replica will be more like what they were expecting? ------ Kind Regards, Peter Smith. Fujitsu Australia