> Would it make sense to only block commits on the test strategy you've listed, 
> and shift the entire massive test suite to post-commit? 

> Lots and lots of other emails

;)

There's an interesting broad question of: What config do we consider 
"recommended" going forward, the "conservative" (i.e. old) or the "performant" 
(i.e. new)? And what JDK do we consider "recommended" going forward, the oldest 
we support or the newest?

Since those recommendations apply to new clusters, people need to qualify 
their setups, and we have a high quality bar on pre-merge testing, my gut 
tells me "performant + newest JDK". This would impact what we'd test pre-commit 
IMO.

Having been doing a lot of CI stuff lately, some observations:
 • Our True North needs to be releasing a database that's free of defects 
violating the core properties we commit to for our users: no data loss and no 
data resurrection, transient or otherwise, due to defects in our code (meteors, 
tsunamis, etc. notwithstanding).
 • The relationship between time spent on CI and the stability of final full 
*post-commit* runs is asymptotic. It's not even 90/10; we're probably somewhere 
like 98% of the value gained from 10% of the work, and the remaining 2% of 
"stability" (i.e. green test suites, not "our database works") is a long-tail 
slog. Especially in the current ASF CI heterogeneous env w/its current 
orchestration.
 • Thus: Pre-commit and post-commit should be different. The following points 
all apply to pre-commit:
 • The goal of pre-commit tests should be some number of 9's of no test 
failures post-commit (i.e. for every 20 green pre-commit runs we introduce at 
most 1 flake post-commit; a rough back-of-envelope on what that implies is 
sketched after this list). Not full perfection; it's not worth the compute and 
complexity.
 • We should **build** all branches on all supported JDKs (8 + 11 for older, 
11 + 17 for newer, etc).
 • We should **run** all test suites with the *recommended configuration* 
against the *highest versioned JDK a branch supports*. And we should formally 
recommend our users run on that JDK.
 • We should *at least* run all JVM-based test suites on the highest 
supported JDK version with the "not recommended but still supported" 
configuration.
 • I'm open to being persuaded that we should at least run jvm-unit tests on 
the older JDK w/the conservative config pre-commit, but not much beyond that.
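
To put a rough number on that "1 flake per 20 green pre-commit runs" goal, here's a quick back-of-envelope (illustrative only; the merge volume below is a made-up assumption, not a measured figure):

```python
# Illustrative arithmetic for the pre-commit flake budget.
merges_per_cycle = 200         # hypothetical merge volume, not a real number
flake_rate_per_merge = 1 / 20  # target: at most 1 in 20 green pre-commit runs adds a post-commit flake

print(f"clean-merge rate: {1 - flake_rate_per_merge:.0%}")                              # 95%
print(f"expected new flakes per cycle: {merges_per_cycle * flake_rate_per_merge:.0f}")  # ~10
```
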
That would leave us with the following distilled:

*Pre-commit:*
 • Build on all supported JDKs
 • All test suites on the highest supported JDK using the recommended config
 • Repeat testing of new or changed tests on the highest supported JDK 
w/recommended config
 • JVM-based test suites on the highest supported JDK using the other config
*Post-commit:*
 • Run everything. All suites, all supported JDKs, both config files.
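
To make the split concrete, here's a rough sketch of that matrix expressed as data. This is a sketch only; the branch names, JDK versions, and suite names are illustrative placeholders, not actual job definitions:

```python
# Illustrative sketch of the proposed pre-/post-commit split (placeholders, not real jobs).
SUITES = ["unit", "jvm-dtest", "dtest", "cqlsh"]
CONFIGS = ["recommended", "conservative"]            # i.e. new vs. old yaml
SUPPORTED_JDKS = {"5.0": [11, 17], "4.1": [8, 11]}   # example branches only

def pre_commit_jobs(branch):
    jdks = SUPPORTED_JDKS[branch]
    highest = max(jdks)
    jobs = [("build", jdk, None) for jdk in jdks]                    # build on every supported JDK
    jobs += [(suite, highest, "recommended") for suite in SUITES]    # all suites, recommended config
    jobs += [("repeat-new-or-changed", highest, "recommended")]      # repeat runs of touched tests
    jobs += [(suite, highest, "conservative")                        # JVM-based suites, other config
             for suite in SUITES if suite in ("unit", "jvm-dtest")]
    return jobs

def post_commit_jobs(branch):
    # Run everything: all suites x all supported JDKs x both configs.
    return [(suite, jdk, cfg)
            for suite in SUITES
            for jdk in SUPPORTED_JDKS[branch]
            for cfg in CONFIGS]
```
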
With Butler + the *jenkins-jira* integration script 
(https://github.com/apache/cassandra-builds/blob/trunk/jenkins-jira-integration/jenkins_jira_integration.py; 
it needs dusting off but should remain good to go), we should have a pretty 
clear view as to when any consistent regressions are introduced and why. We'd 
remain exposed to JDK-specific flake introductions and to flakes in unchanged 
tests, but there's no getting around the latter, and I expect the former to be 
rare enough not to warrant the compute to prevent it.
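
As a side note on what that "clear view" looks like: the distinction the tooling needs to make is roughly "consistent regression vs. flake" over post-commit run history. A toy sketch of that classification follows (this is not the actual Butler or jenkins_jira_integration.py logic, just an illustration):

```python
# Toy classifier over one test's post-commit outcomes, oldest first (True = pass).
# Not the real Butler / jenkins-jira integration logic, just an illustration.
def classify(results):
    if all(results):
        return "healthy"
    first_failure = results.index(False)
    if not any(results[first_failure:]):
        return "consistent regression"   # has failed every run since it first broke
    return "flaky"                       # passes and failures interleaved

print(classify([True, True, False, False, False]))  # consistent regression
print(classify([True, False, True, False, True]))   # flaky
```
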

On Thu, Feb 15, 2024, at 10:02 AM, Jon Haddad wrote:
> Would it make sense to only block commits on the test strategy you've listed, 
> and shift the entire massive test suite to post-commit?  If there really is 
> only a small % of times the entire suite is useful this seems like it could 
> unblock the dev cycle but still have the benefit of the full test suite.  
> 
> 
> 
> On Thu, Feb 15, 2024 at 3:18 AM Berenguer Blasi <berenguerbl...@gmail.com> 
> wrote:
>> On reducing circle ci usage during dev while iterating, not with the 
>> intention to replace the pre-commit CI (yet), we could get by with testing 
>> only dtests, jvm-dtests, units and cqlsh for a _single_ configuration imo. 
>> That would greatly reduce usage. I hacked it quickly here for illustration 
>> purposes: 
>> https://app.circleci.com/pipelines/github/bereng/cassandra/1164/workflows/3a47c9ef-6456-4190-b5a5-aea2aff641f1
>>  The good thing is that we have the tooling to dial in whatever we decide 
>> atm.
>> 
>> Changing pre-commit is a different discussion, to which I agree btw. But the 
>> above could save time and $ big time during dev and be done and merged in a 
>> matter of days imo.
>> 
>> I can open a DISCUSS thread if we feel it's worth it.
>> 
>> On 15/2/24 10:24, Mick Semb Wever wrote:
>>>      
>>>> Mick and Ekaterina (and everyone really) - any thoughts on what test 
>>>> coverage, if any, we should commit to for this new configuration? 
>>>> Acknowledging that we already have *a lot* of CI that we run.
>>> 
>>> 
>>> 
>>> Branimir in this patch has already done some basic cleanup of test 
>>> variations, so this is not a duplication of the pipeline.  It's a 
>>> significant improvement.
>>> 
>>> I'm ok with cassandra_latest being committed and added to the pipeline, 
>>> *if* the authors genuinely believe there's significant time and effort 
>>> saved in doing so.
>>> 
>>> How many broken tests are we talking about ? 
>>> Are they consistently broken or flaky ? 
>>> Are they ticketed up and 5.0-rc blockers ? 
>>> 
>>> Having to deal with flakies and broken tests is an unfortunate reality of 
>>> having a pipeline of 170k tests.  
>>> 
>>> Despite real frustrations I don't believe the broken windows analogy is 
>>> appropriate here – it's more of a leave the campground cleaner…   That 
>>> being said, knowingly introducing a few broken tests is not that either, 
>>> but still having to deal with a handful of consistently breaking tests for 
>>> a short period of time is not the same cognitive burden as flakies.    
>>> There are currently other broken tests in 5.0: VectorUpdateDeleteTest, 
>>> upgrade_through_versions_test; are these compounding the frustration? 
>>> 
>>> It's also been asked why we don't just enable the settings we recommend. 
>>> These are settings we recommend for new clusters. Our existing 
>>> cassandra.yaml needs to be tailored to existing clusters being upgraded, 
>>> where we are very conservative about changing defaults.  
>>> 
