A second RC is appropriate given the revert of CASSANDRA-15899 necessitated by 
the discovery of CASSANDRA-16735: Adding columns via ALTER TABLE can generate 
corrupt sstables.

Ekaterina's and Benedict's points about the true positive rate of flaky tests
also show the value of resolving them. It would be good to pay this debt down
as far as we reasonably can without unnecessarily withholding the release.

I do think it's possible that an RC2 build is a candidate for nomination as our 
GA release. I don't think the RC2 phase needs to be drawn-out, but believe it 
would build confidence for the project to have positive feedback from a release 
containing the fix for CASSANDRA-16735. If work paying down the remaining flaky
tests surfaces a similar true positive rate, a third build might be warranted,
and it would be to the benefit of our users - but I don't think we're far off.

I hope others are working to deploy the beta/RC builds and integrate + deploy 
changes from trunk into the releases they're deploying, as heavy contributors 
doing so provides us the best opportunity to catch these issues before our 
users do.

We're getting close.

________________________________________
From: bened...@apache.org <bened...@apache.org>
Sent: Monday, June 14, 2021 3:03 PM
To: dev@cassandra.apache.org
Subject: Re: Are we ready for 4.0.0 (GA) ?

A rate of 4/30 is a rate of 13% true bugs, which worries me with respect to our 
promise of shipping a bug-free GA.  In past releases we have ensured no flaky 
tests, I think.

That said, I’ve not had the time to contribute to the fixing of flaky tests, so 
I’ll leave the decision to those who have, or otherwise have a strong opinion.


From: Ekaterina Dimitrova <e.dimitr...@gmail.com>
Date: Monday, 14 June 2021 at 20:51
To: dev@cassandra.apache.org <dev@cassandra.apache.org>
Subject: Re: Are we ready for 4.0.0 (GA) ?
To give some context around the flaky tests, I pulled a quick report for the 
fixed ones during the past two months. It is attached for your reference.

To summarize, in two months 30 tickets for flaky tests were closed and only 4
of them were Cassandra bugs (marked in red in the report); the rest were test
fixes.

I think Butler, and running any new tests in a loop before adding them to our
test suite, will help a lot. Also, Mick did a lot of work to stabilize Jenkins.
Timeouts and resource issues are less common than before, which is a win! Thank
you Mick!

Best regards,
Ekaterina


On Mon, 14 Jun 2021 at 13:08, Adam Holmberg 
<adam.holmb...@datastax.com> wrote:
To the point of "long-term observability over flakies":

I will mention here that we intend to deploy a tool called Butler that we
have developed and used internally for a while. It complements Jenkins to
present different views of test results, allowing developers to better
ascertain those tests that are flaky vs failing vs new regressions. We
already have a server provisioned for public hosting. The application
requires a bit of work to generalize for this project. We've been putting
it off while focused on getting 4.0 over the line, but should be getting to
it soon after.

On Mon, Jun 14, 2021 at 11:33 AM Mick Semb Wever 
<m...@apache.org> wrote:

> Are we ready to cut 4.0.0 (GA) once the following tickets land?
>
>  CASSANDRA-16733 – Allow operators to disable 'ALTER ... DROP COMPACT
> STORAGE' statements
>  CASSANDRA-16669 – Password obfuscation for DCL audit log statements
>  CASSANDRA-16735 – Adding columns via ALTER TABLE can generate corrupt
> sstables
>
>
> A bit more background.
>
> 1. On our 4.0 GA board there are a few other tickets, which have priority
> but are not blockers for a GA release.
>
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&quickFilter=1661
>
>  CASSANDRA-16715 – WEBSITE - June 2021 updates
>  CASSANDRA-12519 – dtest failure in
> offline_tools_test.TestOfflineTools.sstableofflinerelevel_test
>  CASSANDRA-16681 – org.apache.cassandra.utils.memory.LongBufferPoolTest -
> tests are flaky
>  CASSANDRA-16689 – Flaky LeaveAndBootstrapTest
>
>
> 2. We also said we would get 5 green CI runs in a row. Progress on that front
> has been slow and risks delaying GA for our user base. It has had priority
> and there's been lots of momentum, which is persisting: lots of flaky fixes
> committed, and the following are being discussed to keep pushing it in the
> right direction…
>  - Long-term observability over flakies
>  - Jenkins agent observability (infra stability)
>
> The past weeks have seen good progress on the stability of ci-cassandra.a.o,
> with the introduction of Docker CPU limits and better monitoring of the
> agents so we can ensure we get the saturation and load we want. Dockerising
> the cqlshlib tests is also in progress.
>
> The alternative to a 4.0.0 GA release is a 4.0-rc2 release.
> Should the next release be: 4.0.0 (GA) or 4.0-rc2 ?
>


--
Adam Holmberg
e. adam.holmb...@datastax.com
w. www.datastax.com

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org
