I will try to summarize the discussion to clarify the outcome.

Mick is in favor of #4
Summanth is in favor of #4
Sylvain's answer was not clear to me. I understood it as "I prefer #3 to #4
and I am also fine with #1"
Jeff is in favor of #3 and would understand #4
David is in favor of #3 (fix bug and add flag to roll back to old behavior) in
4.0 and #4 in 3.0 and 3.11

Do not hesitate to correct me if I misunderstood your answer.

Based on these answers it seems clear that most people prefer to go for #3
or #4.

The choice between #3 (fix correctness, opt-in to current behavior) and #4
(keep current behavior, opt-in to correctness) is a bit less clear, especially
once we consider the 3.X branches and 4.0.

Does anybody have ideas on how to choose between those two options, or any
further opinions on #3 versus #4?

On Wed, Nov 18, 2020 at 9:45 PM David Capwell <dcapw...@gmail.com> wrote:

> I feel that #4 (fix bug and add flag to roll back to old behavior) is best.
>
> About the alternative implementation, I am fine adding it to 3.x and 4.0,
> but should treat it as a different path disabled by default that you can
> opt-into, with a plan to opt-in by default "eventually".
>
> On Wed, Nov 18, 2020 at 11:10 AM Benedict Elliott Smith <bened...@apache.org>
> wrote:
>
> > Perhaps there might be broader appetite to weigh in on which major
> > releases we might target for work that fixes the correctness bug without
> > serious performance regression?
> >
> > i.e., if we were to fix the correctness bug now, introducing a serious
> > performance regression (either opt-in or opt-out), but were to land work
> > without this problem for 5.0, would there be appetite to backport this
> > work to any of 4.0, 3.11 or 3.0?
> >
> >
> > On 18/11/2020, 18:31, "Jeff Jirsa" <jji...@gmail.com> wrote:
> >
> >     This is complicated and relatively few people on earth understand it,
> >     so having little feedback is mostly expected, unfortunately.
> >
> >     My normal emotional response is "correctness is required, opt-in to
> >     performance improvements that sacrifice strict correctness", but I'm
> >     also sure this is going to surprise people, and would understand /
> >     accept #4 (default to current, opt-in to correct).
> >
> >
> >     On Wed, Nov 18, 2020 at 4:54 AM Benedict Elliott Smith <bened...@apache.org>
> >     wrote:
> >
> >     > It doesn't seem like there's much enthusiasm for any of the options
> >     > available here...
> >     >
> >     > On 12/11/2020, 14:37, "Benedict Elliott Smith" <bened...@apache.org>
> >     > wrote:
> >     >
> >     >     > Is the new implementation a separate, distinctly modularized
> >     >     > new body of work
> >     >
> >     >     It’s primarily a distinct, modularised and new body of work,
> >     >     however there is some shared code that has been modified - namely
> >     >     PaxosState, in which legacy code is maintained but modified for
> >     >     compatibility, and the system.paxos table (which receives a new
> >     >     column, and slightly modified serialization code).  It is
> >     >     conceptually an optimised version of the existing algorithm.
> >     >
> >     >     If there's a chance of being of value to 4.0, I can try to put
> >     >     up a patch next week alongside a high level description of the
> >     >     changes.
> >     >
> >     >     > But a performance regression is a regression, I'm not
> >     >     > shrugging it off.
> >     >
> >     >     I don't want to give the impression I'm shrugging off the
> >     >     correctness issue either. It's a serious issue to fix, but since
> >     >     all successful updates to the database are linearizable, I think
> >     >     it's likely that many applications behave correctly with the
> >     >     present semantics, or at least encounter only transient errors. No
> >     >     doubt many also do not, but I have no idea of the ratio.
> >     >
> >     >     The regression isn't itself a simple issue either - depending on
> >     >     the topology and message latencies it is not difficult to produce
> >     >     inescapable contention, i.e. guaranteed timeouts - that might
> >     >     persist as long as clients continue to retry. It could be quite a
> >     >     serious degradation of service to impose on our users.
> >     >
> >     >     I don't pretend to know the correct way to make a decision
> >     >     balancing these considerations, but I am perhaps more concerned
> >     >     about imposing service outages than I am temporarily maintaining
> >     >     semantics our users have apparently accepted for years - though I
> >     >     absolutely share your embarrassment there.
> >     >
> >     >
> >     >     On 12/11/2020, 12:41, "Joshua McKenzie" <jmcken...@apache.org>
> >     >     wrote:
> >     >
> >     >         Is the new implementation a separate, distinctly modularized
> >     >         new body of work or does it make substantial changes to
> >     >         existing implementation and subsume it?
> >     >
> >     >         On Thu, Nov 12, 2020 at 3:56 AM Sylvain Lebresne <lebre...@gmail.com>
> >     >         wrote:
> >     >
> >     >         > Regarding option #4, I'll remark that experience tends to
> >     >         > suggest users don't consistently read the `NEWS.txt` file on
> >     >         > upgrade, so option #4 will likely essentially mean "LWT has a
> >     >         > correctness issue, but once it broke your data enough that
> >     >         > you'll notice, you'll be able to dig the proper flag to fix it
> >     >         > for next time". I guess it's better than nothing, of course,
> >     >         > but I'll admit that defaulting to "opt-in correctness",
> >     >         > especially for a feature (LWT) that exists uniquely to provide
> >     >         > additional guarantees, is something I have a hard time
> >     >         > rallying behind.
> >     >         >
> >     >         > But a performance regression is a regression, I'm not
> >     >         > shrugging it off. Still, I feel we shouldn't leave LWT with a
> >     >         > fairly serious known correctness bug and I frankly feel bad
> >     >         > for "the project" that this has been known for so long without
> >     >         > action, so I'm a bit biased in wanting to get it fixed asap.
> >     >         >
> >     >         > But maybe I'm overstating the urgency here, and maybe option
> >     >         > #1 is a better way forward.
> >     >         >
> >     >         > --
> >     >         > Sylvain
> >     >         >
> >     >
> >     >
> >     >
> >     >
> >  ---------------------------------------------------------------------
> >     >     To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >     >     For additional commands, e-mail: dev-h...@cassandra.apache.org
