Would it be re-opening the ticket or creating a new ticket with "revert of
fix"?



Wed, 12 Jul 2023 at 14:51 Ekaterina Dimitrova <e.dimitr...@gmail.com>
wrote:

> “jenkins_jira_integration
> <https://github.com/apache/cassandra-builds/blob/trunk/jenkins-jira-integration/jenkins_jira_integration.py>
> script updating the JIRA ticket with test results if you cause a regression
> + us building a muscle around reverting commits if they break tests.”
>
> I am not sure this solves people finding the time to fix their breakages,
> but at least they will be pinged automatically. Hopefully many people
> follow the Jira updates.
>
> “I don't take the past as strongly indicative of the future here since
> we've been allowing circle to validate pre-commit and haven't been
> multiplexing.”
> I am interested in comparing how many flaky-test tickets we end up with
> pre-5.0 compared to pre-4.1.
>
>
> On Wed, 12 Jul 2023 at 8:41, Josh McKenzie <jmcken...@apache.org> wrote:
>
>> (This response ended up being a bit longer than intended; sorry about
>> that)
>>
>> What is more common though is packaging errors,
>> cdc/compression/system_ks_directory targeted fixes, CI w/wo
>> upgrade tests, being less responsive post-commit as you already
>> moved on
>>
>> *Three that **should** be resolved in the new regime:*
>> * Packaging errors should be caught pre-commit, as we're making the
>> artifact builds part of pre-commit.
>> * I'm hoping to merge the commit log segment allocation so the CDC
>> allocator is the only one for 5.0 (and it just bypasses the cdc-related
>> work on allocation if it's disabled, thus not impacting perf); the
>> existing targeted testing of cdc-specific functionality should be
>> sufficient to confirm its correctness, as it doesn't vary from the
>> primary allocation path when it comes to mutation space in the buffer.
>> * Upgrade tests are going to be part of the pre-commit suite.
>>
>> *Outstanding issues:*
>> * Compression: if we just run with defaults we won't test all cases, so
>> errors could pop up here.
>> * system_ks_directory-related things: is this still ongoing, or did we
>> have a transient burst of these types of issues? And would we expect these
>> to vary based on different JDKs, non-default configurations, etc.?
>> * Being less responsive post-commit: my only ideas here are a combination
>> of the jenkins_jira_integration
>> <https://github.com/apache/cassandra-builds/blob/trunk/jenkins-jira-integration/jenkins_jira_integration.py>
>> script updating the JIRA ticket with test results if you cause a regression
>> + us building a muscle around reverting commits if they break tests.
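>>
>> As a purely illustrative sketch of that integration - not the actual
>> script; the function name, comment format, and credentials handling are
>> all assumptions - the core of it is just a POST against JIRA's REST
>> comment endpoint:
>>
>> import requests
>>
>> JIRA_BASE = "https://issues.apache.org/jira"  # assumption: ASF JIRA
>>
>> def comment_on_regression(ticket, failed_tests, build_url, auth):
>>     """Post a comment on the JIRA ticket whose commit broke the build."""
>>     body = (f"Jenkins found {len(failed_tests)} new test failure(s) "
>>             f"after this commit:\n"
>>             + "\n".join(f"- {t}" for t in failed_tests)
>>             + f"\nBuild: {build_url}")
>>     resp = requests.post(
>>         f"{JIRA_BASE}/rest/api/2/issue/{ticket}/comment",
>>         json={"body": body},   # JIRA REST API v2 comment payload
>>         auth=auth,             # (user, token) tuple
>>         timeout=30)
>>     resp.raise_for_status()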
>>
>> To quote Jacek:
>>
>> why don't we run dtests w/wo sstable compression x w/wo internode
>> encryption x w/wo vnodes,
>> w/wo off-heap buffers x j8/j11/j17 x w/wo CDC x RedHat/Debian/SUSE, etc.?
>> I think this is a matter of cost vs result.
>>
>>
>> I think we've organically made these decisions and tradeoffs in the past
>> without being methodical about it. If we can:
>> 1. Multiplex changed or new tests
>> 2. Tighten the feedback loop of "tests were green, now they're
>> *consistently* not, you're the only one who changed something", and
>> 3. Instill a culture of "if you can't fix it immediately, revert your
>> commit"
>>
>> Then I think we'll only be vulnerable to flaky failures introduced across
>> different non-default configurations as side effects in tests that aren't
>> touched, which *intuitively* feels like a lot less than we're facing
>> today. We could even get clever as a day 2 effort and define packages in
>> the primary codebase where changes take place and multiplex (on a smaller
>> scale) their respective packages of unit tests in the future if we see
>> problems in this area.
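>>
>> To make point 1 concrete, a multiplexer only needs the diff against trunk
>> to know what to repeat. A minimal sketch, assuming the usual test/ layout;
>> the config list and repetition count are made up for illustration:
>>
>> import subprocess
>>
>> def changed_test_files(base="origin/trunk"):
>>     """Test files touched by this branch, relative to the merge base."""
>>     out = subprocess.check_output(
>>         ["git", "diff", "--name-only", f"{base}...HEAD"], text=True)
>>     return [p for p in out.splitlines()
>>             if p.startswith("test/") and p.endswith(".java")]
>>
>> CONFIGS = ["default", "novnodes", "compression", "cdc"]  # illustrative
>> for path in changed_test_files():
>>     for config in CONFIGS:
>>         for rep in range(100):  # ~100 reps x 5 configs = a 500-run budget
>>             print(f"schedule {path} under {config}, rep {rep}")  # hand off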
>>
>> Flaky tests are a giant pain in the ass and a huge drain on
>> productivity, don't get me wrong. *And* we have to balance how much cost
>> we're paying before each commit with the benefit we expect to gain from
>> that.
>>
>> Does the above make sense? Are there things you've seen in the trenches
>> that challenge or invalidate any of those perspectives?
>>
>> On Wed, Jul 12, 2023, at 7:28 AM, Jacek Lewandowski wrote:
>>
>> Isn't novnodes a special case of vnodes with n=1?
>>
>> We should rather select a subset of tests that it makes sense to run
>> with different configurations.
>>
>> The set of configurations we currently run the tests against is still
>> only a subset of all possible cases.
>> I could ask - why don't we run dtests w/wo sstable compression x w/wo
>> internode encryption x w/wo vnodes,
>> w/wo off-heap buffers x j8/j11/j17 x w/wo CDC x RedHat/Debian/SUSE, etc.?
>> I think this is a matter of cost vs result.
>> This equation contains the likelihood of failure in configuration X given
>> there was no failure in the default configuration, the cost of running
>> those tests, the time we delay merging, and the likelihood that we wait
>> for the test results so long that our branch diverges and we have to
>> rerun them - or accept that we merge code which was tested on an
>> outdated base. Eventually, also the overall new-contributor experience -
>> whether they want to participate in the future.
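>>
>> One hedged way to write that equation down (a sketch only - every
>> quantity here would have to be estimated, none appear in this thread):
>>
>> def extra_config_net_value(
>>         p_fail_given_default_green,  # P(config X fails | default green)
>>         cost_escaped_failure,        # cost if the bug lands on trunk
>>         cost_ci_run,                 # machine + wall-clock cost of the run
>>         p_rerun_due_to_divergence):  # chance the branch goes stale
>>     """Positive result: the extra pre-commit config pays for itself."""
>>     expected_benefit = p_fail_given_default_green * cost_escaped_failure
>>     expected_cost = cost_ci_run * (1 + p_rerun_due_to_divergence)
>>     return expected_benefit - expected_cost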
>>
>>
>>
>> Wed, 12 Jul 2023 at 07:24 Berenguer Blasi <berenguerbl...@gmail.com>
>> wrote:
>>
>> On our 4.0 release I remember a number of such failures, but not recently.
>> What is more common though is packaging errors,
>> cdc/compression/system_ks_directory targeted fixes, CI w/wo upgrade tests,
>> being less responsive post-commit as you've already moved on,... Either
>> the smoke pre-commit has approval steps for everything, or imo we should
>> give a devBranch-like job to the dev pre-commit. I find it terribly
>> useful. My 2cts.
>> On 11/7/23 18:26, Josh McKenzie wrote:
>>
>> 2: Pre-commit 'devBranch' full suite for high risk/disruptive merges: at
>> reviewer's discretion
>>
>> In general, maybe offering a dev the option of choosing either
>> "pre-commit smoke" or "post-commit full" at their discretion for any work
>> would be the right play.
>>
>> A follow-on thought: even with something as significant as Accord, TCM,
>> Trie data structures, etc., I'd be a bit surprised to see tests fail on
>> JDK17 that didn't on 11, or with vs. without vnodes, in ways where it
>> wasn't immediately clear the patch had stumbled across something
>> surprising, and that weren't immediately trivially attributable if not
>> fixable. *In theory* the things we're talking about excluding from the
>> pre-commit smoke test suite are all things that are supposed to be
>> identical across environments and thus opaque / interchangeable by
>> default (JDK version - outside checking the build, which we will -
>> vnodes vs. non, etc).
>>
>> Has that not proven to be the case in your experience?
>>
>> On Tue, Jul 11, 2023, at 10:15 AM, Derek Chen-Becker wrote:
>>
>> A strong +1 to getting to a single CI system. CircleCI definitely has
>> some niceties and I understand why it's currently used, but right now we
>> get 2 CI systems for twice the price. +1 on the proposed subsets.
>>
>> Derek
>>
>> On Mon, Jul 10, 2023 at 9:37 AM Josh McKenzie <jmcken...@apache.org>
>> wrote:
>>
>>
>> I'm personally not thinking about CircleCI at all; I'm envisioning a
>> world where all of us have 1 CI *software* system (i.e. reproducible on
>> any env) that we use for pre-commit validation, and then post-commit
>> happens on reference ASF hardware.
>>
>> So:
>> 1: Pre-commit subset of tests (suites + matrices + env) runs. On green,
>> merge.
>> 2: Post-commit tests (all suites, matrices, env) run. If they fail, link
>> back to the JIRA where the commit took place.
>>
>> Circle would need to remain in lockstep with the requirements for point 1
>> here.
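>>
>> A minimal sketch of that "link back" step, assuming the convention of
>> referencing the CASSANDRA-NNNNN ticket in the commit message (the helper
>> name is made up):
>>
>> import re
>> import subprocess
>>
>> TICKET = re.compile(r"CASSANDRA-\d+")
>>
>> def ticket_for_commit(sha):
>>     """Return the JIRA ticket referenced by a commit message, if any."""
>>     msg = subprocess.check_output(
>>         ["git", "log", "-1", "--format=%B", sha], text=True)
>>     match = TICKET.search(msg)
>>     return match.group(0) if match else None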
>>
>> On Mon, Jul 10, 2023, at 1:04 AM, Berenguer Blasi wrote:
>>
>> +1 to Josh, which is exactly my line of thought as well. But that is only
>> valid if we have a solid Jenkins that will eventually run all test
>> configs. So I think I lost track a bit here. Are you proposing:
>>
>> 1- CircleCI: runs a single (the most common/meaningful, TBD) config of
>> tests pre-commit
>>
>> 2- Jenkins: runs _all_ test configs post-commit and emails/notifies you
>> in case of problems?
>>
>> Or something different, like having 1 also run in Jenkins?
>> On 7/7/23 17:55, Andrés de la Peña wrote:
>>
>> I think 500 runs combining all configs could be reasonable, since it's
>> unlikely to have config-specific flaky tests. As in five configs with 100
>> repetitions each.
>>
>> On Fri, 7 Jul 2023 at 16:14, Josh McKenzie <jmcken...@apache.org> wrote:
>>
>> Maybe. Kind of depends on how long we write our tests to run, doesn't it?
>> :)
>>
>> But point taken. Any non-trivial test would start to be something of a
>> beast under this approach.
>>
>> On Fri, Jul 7, 2023, at 11:12 AM, Brandon Williams wrote:
>>
>> On Fri, Jul 7, 2023 at 10:09 AM Josh McKenzie <jmcken...@apache.org>
>> wrote:
>> > 3. Multiplexed tests (changed, added) run against all JDKs and a
>> > broader range of configs (no-vnode, vnode default, compression, etc)
>>
>> I think this is going to be too heavy... we're taking 500 iterations
>> and multiplying that by like 4 or 5?
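>>
>> (For scale, under those assumed numbers: 500 iterations x 5 configs
>> would be ~2,500 runs for every changed or added test.)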
>>
