Revert only for trunk patches, right?
I'd say we need to completely stabilize the environment, with no noise, before
we go in that direction.

On Wed, 12 Jul 2023 at 8:55, Jacek Lewandowski <lewandowski.ja...@gmail.com>
wrote:

> Would it be re-opening the ticket or creating a new ticket with "revert of
> fix"?
>
>
>
> On Wed, 12 Jul 2023 at 14:51, Ekaterina Dimitrova <e.dimitr...@gmail.com>
> wrote:
>
>> “jenkins_jira_integration
>> <https://github.com/apache/cassandra-builds/blob/trunk/jenkins-jira-integration/jenkins_jira_integration.py>
>> script
>> updating the JIRA ticket with test results if you cause a regression + us
>> building a muscle around reverting your commit if they break tests.”
>>
>> I am not sure this solves the problem of people finding the time to fix
>> their breakages, but at least they will be pinged automatically. Hopefully
>> many of them follow Jira updates.
>>
>> “I don't take the past as strongly indicative of the future here since
>> we've been allowing circle to validate pre-commit and haven't been
>> multiplexing.”
>> I am interested in comparing how many tickets for flaky tests we will have
>> pre-5.0 compared to how many we had pre-4.1.
>>
>>
>> On Wed, 12 Jul 2023 at 8:41, Josh McKenzie <jmcken...@apache.org> wrote:
>>
>>> (This response ended up being a bit longer than intended; sorry about
>>> that)
>>>
>>> What is more common though is packaging errors,
>>> cdc/compression/system_ks_directory targeted fixes, CI w/wo
>>> upgrade tests, being less responsive post-commit as you already
>>> moved on
>>>
>>> *Three that should be resolved in the new regime:*
>>> * Packaging errors should be caught pre-commit, as we're making the
>>> artifact builds part of pre-commit.
>>> * I'm hoping to merge the commit log segment allocation so the CDC
>>> allocator is the only one for 5.0 (and it just bypasses the cdc-related
>>> work on allocation when cdc is disabled, thus not impacting perf); the
>>> existing targeted testing of cdc-specific functionality should be
>>> sufficient to confirm its correctness, as it doesn't vary from the primary
>>> allocation path when it comes to mutation space in the buffer.
>>> * Upgrade tests are going to be part of the pre-commit suite.
>>>
>>> *Outstanding issues:*
>>> * Compression: if we just run with defaults we won't test all cases, so
>>> errors could pop up here.
>>> * system_ks_directory related things: is this still ongoing, or did we
>>> have a transient burst of these types of issues? And would we expect these
>>> to vary based on different JDKs, non-default configurations, etc.?
>>> * Being less responsive post-commit: My only ideas here are a
>>> combination of the jenkins_jira_integration
>>> <https://github.com/apache/cassandra-builds/blob/trunk/jenkins-jira-integration/jenkins_jira_integration.py>
>>> script updating the JIRA ticket with test results if you cause a regression
>>> + us building a muscle around reverting your commit if they break tests.
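>>>
>>> Roughly what I have in mind, as a minimal sketch in Python (the base URL,
>>> helper name, and message format are my assumptions for illustration, not
>>> the actual structure of that script): when post-commit CI detects a
>>> regression, post a comment on the JIRA ticket that introduced the change.
>>>
>>> import requests
>>>
>>> JIRA_BASE = "https://issues.apache.org/jira"  # assumed base URL
>>>
>>> def ping_ticket_on_regression(ticket_key, failed_tests, auth):
>>>     # Hypothetical helper; the real jenkins_jira_integration script
>>>     # may structure this very differently.
>>>     body = "Post-commit CI detected regressions:\n" + \
>>>         "\n".join(f"- {t}" for t in failed_tests)
>>>     # JIRA REST API v2 comment endpoint.
>>>     resp = requests.post(
>>>         f"{JIRA_BASE}/rest/api/2/issue/{ticket_key}/comment",
>>>         json={"body": body},
>>>         auth=auth,
>>>     )
>>>     resp.raise_for_status()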
>>>
>>> To quote Jacek:
>>>
>>> why don't we run dtests w/wo sstable compression x w/wo internode
>>> encryption x w/wo vnodes,
>>> w/wo off-heap buffers x j8/j11/j17 x w/wo CDC x RedHat/Debian/SUSE, etc.?
>>> I think this is a matter of cost vs result.
>>>
>>>
>>> I think we've organically made these decisions and tradeoffs in the past
>>> without being methodical about it. If we can:
>>> 1. Multiplex changed or new tests (see the sketch below),
>>> 2. Tighten the feedback loop of "tests were green, now they're
>>> *consistently* not, you're the only one who changed something", and
>>> 3. Instill a culture of "if you can't fix it immediately, revert your
>>> commit"
>>>
>>> Then I think we'll only be vulnerable to flaky failures introduced
>>> across different non-default configurations as side effects in tests that
>>> aren't touched, which *intuitively* feels like a lot less than we're
>>> facing today. We could even get clever as a day 2 effort and define
>>> packages in the primary codebase where changes take place and multiplex (on
>>> a smaller scale) their respective packages of unit tests in the future if
>>> we see problems in this area.
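>>>
>>> For point 1, a minimal sketch of the multiplexing idea, assuming we diff
>>> against the merge base to find touched test classes and repeat them; the
>>> paths, the 100x repetition budget, and using ant's testsome target as the
>>> runner are all assumptions for illustration:
>>>
>>> import subprocess
>>>
>>> def changed_test_classes(base="origin/trunk"):
>>>     # Test files touched by this branch relative to the merge base.
>>>     out = subprocess.check_output(
>>>         ["git", "diff", "--name-only", f"{base}...HEAD"], text=True)
>>>     return [p for p in out.splitlines()
>>>             if p.startswith("test/") and p.endswith("Test.java")]
>>>
>>> # Multiplex: run each touched test class repeatedly to shake out flakiness.
>>> for path in changed_test_classes():
>>>     name = path.rsplit("/", 1)[-1].removesuffix(".java")
>>>     for _ in range(100):
>>>         subprocess.run(["ant", "testsome", f"-Dtest.name={name}"])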
>>>
>>> Flaky tests are a giant pain in the ass and a huge drain on
>>> productivity, don't get me wrong. *And* we have to balance how much
>>> cost we're paying before each commit with the benefit we expect to gain
>>> from that.
>>>
>>> Does the above make sense? Are there things you've seen in the trenches
>>> that challenge or invalidate any of those perspectives?
>>>
>>> On Wed, Jul 12, 2023, at 7:28 AM, Jacek Lewandowski wrote:
>>>
>>> Isn't novnodes a special case of vnodes with n=1?
>>>
>>> We should rather select a subset of tests for which it makes sense to
>>> run with different configurations.
>>>
>>> The set of configurations against which we currently run the tests is
>>> still only a subset of all possible cases.
>>> I could ask: why don't we run dtests w/wo sstable compression x w/wo
>>> internode encryption x w/wo vnodes,
>>> w/wo off-heap buffers x j8/j11/j17 x w/wo CDC x RedHat/Debian/SUSE, etc.?
>>> I think this is a matter of cost vs result.
>>> This equation contains the likelihood of failure in configuration X,
>>> given there was no failure in the default
>>> configuration; the cost of running those tests; the time we delay
>>> merging; and the likelihood that we wait for
>>> the test results so long that our branch diverges and we have to
>>> rerun them, or accept that we merge
>>> code that was tested on an outdated base. Finally, there is the overall
>>> new contributor experience - whether they
>>> want to participate in the future.
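>>>
>>> To make that equation concrete, here is a toy model in Python; every
>>> number is invented purely for illustration:
>>>
>>> # Toy expected-cost model; all values are made-up placeholders.
>>> p_fail_given_default_green = 0.02  # P(config X fails | default passed)
>>> cost_missed_failure = 40.0         # hours to chase a post-commit break
>>> cost_ci_run = 1.5                  # machine/engineer hours per extra config
>>> cost_merge_delay = 0.5             # rebases, diverged branches, waiting
>>>
>>> expected_cost_of_skipping = p_fail_given_default_green * cost_missed_failure
>>> cost_of_running = cost_ci_run + cost_merge_delay
>>> if expected_cost_of_skipping > cost_of_running:
>>>     print("run config X pre-commit")
>>> else:
>>>     print("defer config X to post-commit")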
>>>
>>>
>>>
>>> On Wed, 12 Jul 2023 at 07:24, Berenguer Blasi <berenguerbl...@gmail.com>
>>> wrote:
>>>
>>> On our 4.0 release I remember a number of such failures, but not
>>> recently. What is more common though is packaging errors,
>>> cdc/compression/system_ks_directory targeted fixes, CI w/wo upgrade tests,
>>> being less responsive post-commit as you already moved on, ... Either the
>>> smoke pre-commit has approval steps for everything, or IMO we should give a
>>> devBranch-like job to the dev pre-commit. I find it terribly useful. My
>>> 2cts.
>>> On 11/7/23 18:26, Josh McKenzie wrote:
>>>
>>> 2: Pre-commit 'devBranch' full suite for high risk/disruptive merges: at
>>> reviewer's discretion
>>>
>>> In general, maybe offering a dev the option of choosing either
>>> "pre-commit smoke" or "post-commit full" at their discretion for any work
>>> would be the right play.
>>>
>>> A follow-on thought: even with something as significant as Accord, TCM,
>>> Trie data structures, etc., I'd be a bit surprised to see tests fail on
>>> JDK17 that didn't fail on 11, or with vs. without vnodes, in ways where it
>>> wasn't immediately clear that the patch had stumbled across something
>>> surprising, and that weren't trivially attributable if not fixable. *In
>>> theory* the things we're talking about excluding from the pre-commit smoke
>>> test suite are all things that are supposed to be identical across
>>> environments and thus opaque / interchangeable by default (JDK version,
>>> outside checking the build, which we will; vnodes vs. non-vnodes; etc.).
>>>
>>> Has that not proven to be the case in your experience?
>>>
>>> On Tue, Jul 11, 2023, at 10:15 AM, Derek Chen-Becker wrote:
>>>
>>> A strong +1 to getting to a single CI system. CircleCI definitely has
>>> some niceties and I understand why it's currently used, but right now we
>>> get 2 CI systems for twice the price. +1 on the proposed subsets.
>>>
>>> Derek
>>>
>>> On Mon, Jul 10, 2023 at 9:37 AM Josh McKenzie <jmcken...@apache.org>
>>> wrote:
>>>
>>>
>>> I'm personally not thinking about CircleCI at all; I'm envisioning a
>>> world where all of us have 1 CI *software* system (i.e. reproducible on
>>> any env) that we use for pre-commit validation, and then post-commit
>>> happens on reference ASF hardware.
>>>
>>> So:
>>> 1: Pre-commit subset of tests (suites + matrices + env) runs. On green,
>>> merge.
>>> 2: Post-commit tests (all suites, matrices, env) run. On failure, link
>>> back to the JIRA where the commit took place.
>>>
>>> Circle would need to remain in lockstep with the requirements for point
>>> 1 here.
>>>
>>> On Mon, Jul 10, 2023, at 1:04 AM, Berenguer Blasi wrote:
>>>
>>> +1 to Josh, which is exactly my line of thought as well. But that is only
>>> valid if we have a solid Jenkins that will eventually run all test configs.
>>> So I think I lost track a bit here. Are you proposing:
>>>
>>> 1- CircleCI: runs a single (the most common/meaningful, TBD) config of
>>> tests pre-commit
>>>
>>> 2- Jenkins: Runs post-commit _all_ test configs and emails/notifies you
>>> in case of problems?
>>>
>>> Or something different, like having 1 also run in Jenkins?
>>> On 7/7/23 17:55, Andrés de la Peña wrote:
>>>
>>> I think 500 runs combining all configs could be reasonable, since it's
>>> unlikely to have config-specific flaky tests. As in five configs with 100
>>> repetitions each.
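>>>
>>> A quick sketch of that split in Python (the config names are examples
>>> only, not a proposed list):
>>>
>>> REPETITION_BUDGET = 500
>>> # Example configs; the real set is whatever we agree to support.
>>> configs = ["default", "novnode", "compression", "cdc", "offheap"]
>>>
>>> reps_per_config = REPETITION_BUDGET // len(configs)  # 100 each
>>> runs = [(cfg, rep) for cfg in configs for rep in range(reps_per_config)]
>>> assert len(runs) == REPETITION_BUDGET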
>>>
>>> On Fri, 7 Jul 2023 at 16:14, Josh McKenzie <jmcken...@apache.org> wrote:
>>>
>>> Maybe. Kind of depends on how long we write our tests to run, doesn't it?
>>> :)
>>>
>>> But point taken. Any non-trivial test would start to be something of a
>>> beast under this approach.
>>>
>>> On Fri, Jul 7, 2023, at 11:12 AM, Brandon Williams wrote:
>>>
>>> On Fri, Jul 7, 2023 at 10:09 AM Josh McKenzie <jmcken...@apache.org>
>>> wrote:
>>> > 3. Multiplexed tests (changed, added) run against all JDKs and a
>>> broader range of configs (no-vnode, vnode default, compression, etc.)
>>>
>>> I think this is going to be too heavy... we're taking 500 iterations
>>> and multiplying that by like 4 or 5?
>>>
>>>
>>>
>>>
>>>
>>> --
>>> +---------------------------------------------------------------+
>>> | Derek Chen-Becker                                             |
>>> | GPG Key available at https://keybase.io/dchenbecker and       |
>>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>>> +---------------------------------------------------------------+
>>>
>>>
>>>
>>>
