Re: [DISCUSS] Considering when to push tickets out of 4.0

Joshua McKenzie Tue, 16 Jun 2020 16:09:16 -0700

I completely respect and agree with the need for a drumbeat to change our
culture around testing and quality; I also agree we haven't done much to
materially change that for 4.0 specifically. The 40_quality_testing epic is our
first step in that direction, though I have some personal concerns about
leaning on bespoke manual testing for quality since we humans are
infinitely fallible. :)


What elicited that response from me is the claim that we haven't yet tested
the software, implicitly invalidating the time and energy the community has
put into that thus far. I wouldn't argue that we've adequately tested for a
GA release, certainly, but we're discussing beta in this thread. As a
project, the advice we have about the testing and usage of the beta is
something along the lines of "use this in test/QA and only in cases where
minutes of downtime is acceptable." Perhaps we should consider revising the
release lifecycle on the wiki if this is something we're not aligned on?

To your point above, the problems found to date were largely with 3.0 and
found by user report and not by project developer testing. The sooner we
can get the 4.0 beta into the hands of the community, the sooner we can get
more of those reports while we also work to broaden and deepen our
programmatic testing frameworks and platforms. (To acknowledge: I presume
that a majority of the user testing that surfaced defects in 3.0 came from
one large user's investment of time and resources. Historically, however,
the project has had a large number of defects surfaced by a diverse
collection of users, and I'd like to see us move in that direction again for
the long-term health of the project. Hence my attempts to move us towards
beta and to take on an awareness campaign and call to action for the community
to engage in testing.)


On Tue, Jun 16, 2020 at 6:37 PM Benedict Elliott Smith <bened...@apache.org>
wrote:

> > Further, we have thousands of tests across all our suites
>
> I think most here would agree that our testing remains inadequate, and
> that this (modest, even in pure numerical terms for such a large project)
> number of often poorly-written unit tests does not really change that fact.
>
> Most of the problems found to date have been found with 3.0, not with 4.0,
> and found by user report.  We agreed a long time ago that we would aim for
> 4.0 to be a more stable release than any prior.  Today I think the only
> reason that might be true is the amount of work invested in fixing problems
> found in _earlier releases_, not due to verification of 4.0.
>
> I say this not to influence the decision about when and what lands in
> beta, only to ensure we stay honest with ourselves about our progress on
> quality.  I hope the software itself is higher quality today, but I do not
> believe it is honest to (yet) claim that our testing is significantly
> higher quality than those releases we all agree were inadequate.  There
> exists some wider external use case testing, but being mostly invisible to
> the community it is unclear how much broader our coverage is with these
> included.
>
> On 16/06/2020, 23:08, "David Capwell" <dcapw...@apple.com.INVALID> wrote:
>
>     Inline
>
>     > On Jun 16, 2020, at 2:17 PM, Joshua McKenzie <jmcken...@apache.org>
> wrote:
>     >
>     >>
>     >> we still produce incorrect results as shown by CASSANDRA-15313;
> this is a
>     >> correctness issue, so must be a blocker for v5 protocol.
>     >
>     > That makes complete sense; I'd somehow missed the incorrect results
> aspect
>     > in trying to get context on the work. I'd be eager to hear about
> progress
>     > on it as well.
>     >
>     > Regarding the question of "why would users test if we haven't tested
> yet",
>     > I respectfully disagree both on the assertion we haven't tested yet
> as well
>     > as on the distinction between an "us vs. them" in the community.
> We're all
>     > users and participants in the Cassandra community and ecosystem so
> anyone
>     > downloading the DB to test it out is just as vital as one of us from
> the
>     > dev list, committer list, or pmc list testing out the DB.
>
>     I apologize if I came off as discriminatory. I will try to absorb your
> words carefully; thank you for correcting my behavior.
>
>     > While we can
>     > reasonably expect a dev paid full time working on the project with a
> large
>     > amount of infrastructure doing testing to be crucial to getting a
> release
>     > out and doing certain kinds of testing, there are literally
> thousands of
>     > different companies out in the world basing their critical
> infrastructure
>     > on this project and them testing out their use-cases and migration
> is just
>     > as critical to this release being ready. It takes a village.
>
>     I do agree that user validation is important for the release, I was
> mostly trying to question why start here before the testing work in JIRA is
> complete.  Maybe I am in the wrong; I have been heads down working on data
> corruption issues in 3.x and have become more risk averse.
>
>     >
>     > Further, we have thousands of tests across all our suites, hundreds
> of new
>     > use-case testing that has been done against 4.0 at this point, and
> 30+%
>     > more bugs fixed in this release than 3.0; the blanket assertion that
> we
>     > haven't tested 4.0 yet doesn't resonate with me. While we haven't
> done the
>     > entirety of our final 40 beta phase testing yet, testing is
> constantly
>     > going on against this codebase by both people on the ML and off.
>     >
>     > Now, if there are major known glaring issues where we have problems
> that
>     > would prevent users from actually testing out the beta and kicking
> the
>     > tires, that's a different story entirely and I'd argue those tickets
> should
>     > be reflected in the alpha phase (see: CASSANDRA-15299 apparently ;) )
>     >
>     > Does that make sense?
>
>     I have been meaning to ask about this; I have mostly been asking people
> in Slack, and this actually confuses me.
>
>     I was working off the assumption that the fix version meant it was a
> blocker for that release, and that Alpha was special-cased and would have
> releases even with blocking issues (which is documented in the Release
> Lifecycle).  When I ask around I hear that this is not correct and that
> alpha means “blocks beta”, beta means “blocks RC”, etc. (Is any of this
> documented? I couldn’t find anything the last time I was talking to others
> about this.)
>
>     Now, let’s say we close alpha and cut a beta release; my understanding
> is that tickets which block the next beta release are alpha…. So do we
> still mark them alpha (even though we won’t have an alpha release)?
>
>     This has been confusing me since beta has a lot of work pending… sorry
> for not bringing this up in a dedicated dev@ thread.
>
>
>     >
>     > On Tue, Jun 16, 2020 at 4:58 PM Benedict Elliott Smith <
> bened...@apache.org>
>     > wrote:
>     >
>     >> So, if it helps matters: I am explicitly -1 the prior version of
> this work
>     >> due to the technical concerns expressed here and on the ticket.  So
> we
>     >> either need to revert that patch or incorporate 15299.
>     >>
>     >> On 16/06/2020, 21:48, "Mick Semb Wever" <m...@apache.org> wrote:
>     >>
>     >>>
>     >>> 2) Alternatively, it's been 3 years, 4 months, 13 days since the
>     >> release of
>     >>> 3.10.0 (the last time we added new features to the DB)
>     >>>
>     >>
>     >>
>     >>    We did tick-tock, pushing feature releases too quickly, and
> without
>     >>    supporting them for long enough to get stable. And then we've
> done "a
>     >> la no
>     >>    feature releases" for over 3 years. It feels like the bar went
> from
>     >> too low
>     >>    to too high.
>     >>
>     >>    I understand the importance of CASSANDRA-15299. But it hasn't
> had any
>     >>    comments in 12 days, and at this stage of the feature
> freeze,
>     >> with
>     >>    so few alpha bugs remaining, that's a long time. Sam, can you
> speak to
>     >> its
>     >>    eta?
>     >>
>     >>
>     >>
>     >>> 4) If we plan on releasing 4.1 six months after the release of 4.0
>     >> (i.e.
>     >>> calendar scope vs. feature scope - not yet agreed upon but an
>     >> option),
>     >>
>     >>
>     >>
>     >>    I like this. I think it's worth appreciating the different
>     >> perspectives of
>     >>    this community: those involved with private clusters that don't
> rely on
>     >>    official releases, versus those involved with the public and
> other
>     >> people's
>     >>    clusters. The latter group needs those official releases much
> more, but
>     >>    this also ties into putting those users more in focus and
> figuring out
>     >>    where the bar best sits. This isn't meant to divide, we all care
> and
>     >> voice
>     >>    for the user, but just to utilise the different strengths
> brought to
>     >> the
>     >>    table.
>     >>
>     >>
>     >>> If we want 4.0.0 out faster, the biggest gains would be to get the
>     >> test
>     >>    plans written up and get more people working on automated
> testing.
>     >>
>     >>
>     >>    Yes, 110%.  Though, as long as this continues to improve, as it
> has,
>     >> does
>     >>    it need to be a blocker on 4.0?
>     >>
>     >>
>     >>
>     >>
> ---------------------------------------------------------------------
>     >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>     >> For additional commands, e-mail: dev-h...@cassandra.apache.org
>     >>
>     >>
>
>
