A second RC is appropriate given the revert of CASSANDRA-15899 necessitated by the discovery of CASSANDRA-16735: Adding columns via ALTER TABLE can generate corrupt sstables.
Ekaterina and Benedict's statements regarding the true positive rate of flaky tests also show the value of resolving these, and that it would be good to pay this down as far as we reasonably can without unnecessarily withholding the release.

I do think it's possible that an RC2 build is a candidate for nomination as our GA release. I don't think the RC2 phase needs to be drawn out, but I believe it would build confidence for the project to have positive feedback from a release containing the fix for C-16735.

If work paying down the remaining flaky tests surfaces a similar true positive rate, a third build might be warranted, and it would be to the benefit of our users - but I don't think we're far off.

I hope others are working to deploy the beta/RC builds and integrate + deploy changes from trunk into the releases they're deploying, as heavy contributors doing so provides us the best opportunity to catch these issues before our users do.

We're getting close.

________________________________________
From: bened...@apache.org <bened...@apache.org>
Sent: Monday, June 14, 2021 3:03 PM
To: dev@cassandra.apache.org
Subject: Re: Are we ready for 4.0.0 (GA) ?

A rate of 4/30 is a rate of 13% true bugs, which worries me with respect to our promise of shipping a bug-free GA. In past releases we have ensured no flaky tests, I think.

That said, I’ve not had the time to contribute to the fixing of flaky tests, so I’ll leave the decision to those who have, or otherwise have a strong opinion.

From: Ekaterina Dimitrova <e.dimitr...@gmail.com>
Date: Monday, 14 June 2021 at 20:51
To: dev@cassandra.apache.org <dev@cassandra.apache.org>
Subject: Re: Are we ready for 4.0.0 (GA) ?

To give some context around the flaky tests, I pulled a quick report for the fixed ones during the past two months. It is attached for your reference.
To summarize, in two months 30 tickets for flaky tests were closed and only 4 of them were Cassandra bugs (marked in red in the report); the rest were test fixes.

I think Butler and running any new tests in a loop before adding them to our test suite will help a lot. Also, Mick did a lot of work to stabilize Jenkins. Timeouts and resource issues are less common than before, that is a win! Thank you Mick!

Best regards,
Ekaterina

On Mon, 14 Jun 2021 at 13:08, Adam Holmberg <adam.holmb...@datastax.com> wrote:

To the point of "long-term observability over flakies": I will mention here that we intend to deploy a tool called Butler that we have developed and used internally for a while. It complements Jenkins to present different views of test results, allowing developers to better ascertain those tests that are flaky vs failing vs new regressions. We already have a server provisioned for public hosting. The application requires a bit of work to generalize for this project. We've been putting it off while focused on getting 4.0 over the line, but should be getting to it soon after.

On Mon, Jun 14, 2021 at 11:33 AM Mick Semb Wever <m...@apache.org> wrote:

> Are we ready to cut 4.0.0 (GA) once the following tickets land?
>
> CASSANDRA-16733 – Allow operators to disable 'ALTER ... DROP COMPACT STORAGE' statements
> CASSANDRA-16669 – Password obfuscation for DCL audit log statements
> CASSANDRA-16735 – Adding columns via ALTER TABLE can generate corrupt sstables
>
>
> A bit more background.
>
> 1. On our 4.0 GA board there are a few other tickets, which have priority but
> are not blockers for a GA release.
>
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&quickFilter=1661
>
> CASSANDRA-16715 – WEBSITE - June 2021 updates
> CASSANDRA-12519 – dtest failure in offline_tools_test.TestOfflineTools.sstableofflinerelevel_test
> CASSANDRA-16681 – org.apache.cassandra.utils.memory.LongBufferPoolTest - tests are flaky
> CASSANDRA-16689 – Flaky LeaveAndBootstrapTest
>
>
> 2. We also said we would get 5 green CI runs in a row. Progress on that front
> has been slow and risks delaying GA and our user base. It has had priority
> and there's been lots of momentum which is persisting: lots of flaky fixes
> committed; and the following are being discussed to keep pushing it in the
> right direction…
> - Long-term observability over flakies
> - Jenkins agent observability (infra stability)
>
> The past weeks have seen good progress on stability of ci-cassandra.a.o with
> the introduction of cpu docker limits imposed, and better monitoring of the
> agents so we can ensure we get the saturation and load we want. Dockerising
> the cqlshlib tests is also in progress.
>
> The alternative to a 4.0.0 GA release is a 4.0-rc2 release.
> Should the next release be: 4.0.0 (GA) or 4.0-rc2 ?
>

--
Adam Holmberg
e. adam.holmb...@datastax.com
w. www.datastax.com