Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-14 Thread Jacek Lewandowski
I fully understand you. Although I have the luxury of using more containers,
I simply feel that rerunning the same code with different configurations
that do not impact that code is just a waste of resources and money.

- - -- --- -  -
Jacek Lewandowski



Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-14 Thread Štefan Miklošovič
By the way, I am not sure if it is all completely transparent and
understood by everybody, but let me guide you through a typical patch which
is meant to be applied from 4.0 to trunk (4 branches) to see what it looks
like.

I do not have the luxury of running CircleCI on 100 containers; I have just
25. So what takes around 2.5 hours on 100 containers takes around 6-7 hours
on 25. That is a typical java11_pre-commit_tests run for trunk. Then I have
to provide builds for java17_pre-commit_tests too, which takes around 3-4
hours because it simply tests less; let's round it up to 10 hours for trunk.

Then I need to do this for 5.0 as well, basically doubling the time,
because as I am writing this the difference between these two branches is
not too big. So 20 hours.

Then I need to build 4.1 and 4.0 too. 4.0 is very similar to 4.1 when it
comes to the number of tests; nevertheless, there are workflows for Java 8
and Java 11 for each, so let's say this takes 10 hours again. So together
I'm at 35 hours.

Scheduling all the builds, triggering them, monitoring their progress etc.
is work in itself. I am scripting this like crazy to avoid touching the
Circle UI at all; I made custom scripts which call the Circle API and
trigger the builds from the console to speed this up, because when a
developer is meant to be clicking around all day, needing to track the
progress, it gets old pretty quickly.

Thank god this is just a patch from 4.0; when it comes to 3.0 and 3.11,
just add more hours to that.

So all in all, a typical 4.0 - trunk patch is tested for two days at least,
and that's when all goes well and I do not need to rework it and rerun it
again... Does this all sound flexible and speedy enough for people?

If we dropped the formal necessity to build on various JVMs, it would
significantly speed up development.
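Štefan mentions custom scripts that call the Circle API so builds can be triggered from the console instead of the UI. A minimal sketch of that kind of helper is below; the endpoint shape is CircleCI's v2 pipeline-trigger API, but the project slug, branch name, token, and pipeline parameter names are made-up placeholders, not the actual scripts from the thread:

```python
import json
import urllib.request

CIRCLE_API = "https://circleci.com/api/v2/project/{slug}/pipeline"

def build_trigger_request(slug, branch, token, parameters):
    """Build (but do not send) a pipeline-trigger request for the CircleCI v2 API."""
    payload = {"branch": branch, "parameters": parameters}
    return urllib.request.Request(
        CIRCLE_API.format(slug=slug),
        data=json.dumps(payload).encode(),
        headers={"Circle-Token": token, "Content-Type": "application/json"},
        method="POST",
    )

# e.g. kick off a hypothetical java11 pre-commit workflow on a feature branch:
req = build_trigger_request(
    "gh/instaclustr/cassandra", "my-feature-branch", "<api-token>",
    {"run_java11_pre_commit_tests": True},
)
```

Sending the request would just be `urllib.request.urlopen(req)`; looping this over branches and workflows is what removes the all-day clicking.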



Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-14 Thread Jacek Lewandowski
>
> Excellent point, I was saying for some time that IMHO we can reduce
> to running in CI at least pre-commit:
> 1) Build J11 2) build J17
> 3) run tests with build 11 + runtime 11
> 4) run tests with build 11 and runtime 17.


Ekaterina, I was thinking more about:
1) build J11
2) build J17
3) run tests with build J11 + runtime J11
4) run smoke tests with build J17 and runtime J17

Again, I don't see value in running the J11 build with the J17 runtime in
addition to the J11 runtime - just pick one unless we change something
JVM-specific.

If we need to decide whether to test the latest or default, I think we
should pick the latest because this is actually Cassandra 5.0 defined as a
set of new features that will shine on the website.

Also - we have configurations which test some features, but they are more
like dimensions:
- commit log compression
- sstable compression
- CDC
- Trie memtables
- Trie SSTable format
- Extended deletion time
...

Currently, what we call the default configuration is tested with:
- no compression, no CDC, no extended deletion time
- *commit log compression + sstable compression*, no CDC, no extended
deletion time
- no compression, *CDC enabled*, no extended deletion time
- no compression, no CDC, *enabled extended deletion time*

This applies only to unit tests, of course.

Then, are we going to test all of those scenarios with the "latest"
configuration? I'm asking because the latest configuration is mostly about
tries and UCS and has nothing to do with compression or CDC. Then why
should the default configuration be tested more thoroughly than the latest
one, which enables essential Cassandra 5.0 features?

I propose to significantly reduce that stuff. Let's distinguish the
packages of tests that need to be run with CDC enabled/disabled and with
commitlog compression enabled/disabled, plus the tests that verify sstable
formats (mostly io and index, I guess), and leave the other parameters set
as in the latest configuration - this is the easiest way, I think.

For dtests we have vnodes/no-vnodes, offheap/onheap, and nothing about the
other stuff. To me, running no-vnodes makes no sense because no-vnodes is
just the special case of vnodes=1. On the other hand, offheap/onheap
buffers could be tested in unit tests. In short, I'd run dtests only with
the default and latest configurations.
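The reduction Jacek sketches - run everything once against the latest configuration, and add the enabled/disabled variants of a dimension only for the packages that actually exercise it - could be encoded roughly like this. The package names and dimension labels are purely illustrative, not Cassandra's real test layout:

```python
# Hypothetical mapping from test packages to the configuration dimensions
# they actually exercise; names are illustrative only.
DIMENSIONS_BY_PACKAGE = {
    "org.apache.cassandra.db.commitlog": ["commitlog_compression"],
    "org.apache.cassandra.cdc": ["cdc"],
    "org.apache.cassandra.io.sstable": ["sstable_compression", "sstable_format"],
}

def configs_for(package):
    """Config variants to run a package under: the base 'latest' run, plus
    on/off variants for each dimension the package is known to touch."""
    extra = DIMENSIONS_BY_PACKAGE.get(package, [])
    return ["latest"] + [f"latest+{dim}={v}" for dim in extra for v in ("on", "off")]
```

A package outside the map (the common case) gets exactly one run, which is the point: the combinatorial matrix is paid only where a dimension can change behavior.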

Sorry for being too wordy,



Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-14 Thread Štefan Miklošovič
Something along what Paulo is proposing makes sense to me. To sum it up,
knowing what workflows we have now:

java17_pre-commit_tests
java11_pre-commit_tests
java17_separate_tests
java11_separate_tests

We would have a couple more; together:

java17_pre-commit_tests
java17_pre-commit_tests-latest-yaml
java11_pre-commit_tests
java11_pre-commit_tests-latest-yaml
java17_separate_tests
java17_separate_tests-default-yaml
java11_separate_tests
java11_separate_tests-latest-yaml

To go over Paulo's plan, his steps 1-3 for 5.0 would result in requiring
just one workflow

java11_pre-commit_tests

when no configuration is touched and two workflows

java11_pre-commit_tests
java11_pre-commit_tests-latest-yaml

when there is some configuration change.

Now, the term "some configuration change" is quite tricky, and it is not
always easy to evaluate whether both the default and latest yaml workflows
need to be executed. It might happen that a change does not touch the
configuration itself, yet it is still necessary to verify that things work
under both scenarios. The -latest.yaml config might also be such that a
change which makes sense in isolation for the default config would not
work with -latest.yaml. I don't know if this is just a theoretical problem
or not, but my gut feeling is that we would be safer if we simply required
both the default and latest yaml workflows together.

Even if we do, we basically replace "two JVMs" builds with "two yamls"
builds, but I consider "two yamls" builds more valuable in general than
"two JVMs" builds. It would take basically the same amount of time; we
would just reorient our build matrix from different JVMs to different
yamls.

For releases we would of course still need to run it across JVMs too.
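The selection rule above - one default-yaml workflow normally, plus the latest-yaml twin when a patch touches configuration, plus the second JVM for releases - can be sketched as a small function. The workflow names come from the thread; treating the decision as two booleans is my simplification, not something proposed there:

```python
def required_workflows(config_changed: bool, release: bool = False):
    """Which pre-commit workflows a patch would need under this proposal.
    The cautious variant discussed in the thread would instead always
    include the latest-yaml twin regardless of config_changed."""
    workflows = ["java11_pre-commit_tests"]
    if config_changed:
        workflows.append("java11_pre-commit_tests-latest-yaml")
    if release:
        # releases still need coverage across JVMs
        workflows.append("java17_pre-commit_tests")
    return workflows
```

The hard part, as noted above, is that `config_changed` is not mechanically decidable from a diff, which is the argument for always running both yaml variants.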


Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-14 Thread Paulo Motta
> Perhaps it is also a good opportunity to distinguish subsets of tests
which make sense to run with a configuration matrix.

Agree. I think we should define a “standard/golden” configuration for each
branch and minimally require precommit tests for that configuration.
Assignees and reviewers can determine if additional test variants are
required based on the patch scope.

Nightly and prerelease tests can be run to catch any issues outside the
standard configuration based on the supported configuration matrix.

On Wed, 14 Feb 2024 at 15:32 Jacek Lewandowski 
wrote:

> śr., 14 lut 2024 o 17:30 Josh McKenzie  napisał(a):
>
>> When we have failing tests people do not spend the time to figure out if
>> their logic caused a regression and merge, making things more unstable… so
>> when we merge failing tests that leads to people merging even more failing
>> tests...
>>
>> What's the counter position to this Jacek / Berenguer?
>>
>
> For how long are we going to deceive ourselves? Are we shipping those
> features or not? Perhaps it is also a good opportunity to distinguish
> subsets of tests which make sense to run with a configuration matrix.
>
> If we don't add those tests to the pre-commit pipeline, "people do not
> spend the time to figure out if their logic caused a regression and merge,
> making things more unstable…"
> I think it is much more valuable to test those various configurations
> rather than test against j11 and j17 separately. I see really little
> value in doing that.
>
>
>


Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-14 Thread Paulo Motta
> If there’s an “old compatible default” and “latest recommended settings”,
when does the value in “old compatible default” get updated? Never?

How about replacing cassandra.yaml with cassandra_latest.yaml on trunk when
cutting cassandra-6.0 branch? Any new default changes on trunk go to
cassandra_latest.yaml.

Basically, major branch creation syncs cassandra_latest.yaml with
cassandra.yaml on trunk, and subsequent default changes on trunk are added
to cassandra_latest.yaml, which will eventually be synced to cassandra.yaml
when the next major is cut.
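As a toy model of that lifecycle (a dict standing in for each yaml file; this is only an illustration of the sync rule, not anything from the codebase):

```python
def cut_major(default_yaml: dict, latest_yaml: dict):
    """At a major branch cut, cassandra.yaml on trunk is replaced by the
    accumulated cassandra_latest.yaml; afterwards the two files agree and
    new default changes go into the latest file only, until the next cut."""
    synced = dict(latest_yaml)       # latest becomes the new default
    return synced, dict(synced)      # (cassandra.yaml, cassandra_latest.yaml)
```

So a default flipped on trunk is first visible only to users of the latest file, and graduates to everyone one major later.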

On Wed, 14 Feb 2024 at 13:42 Jeff Jirsa  wrote:

> 1) If there’s an “old compatible default” and “latest recommended
> settings”, when does the value in “old compatible default” get updated?
> Never?
> 2) If there are test failures with the new values, it seems REALLY
> IMPORTANT to make sure those test failures are discovered + fixed IN THE
> FUTURE TOO. If pushing new yaml into a different file makes us less likely
> to catch the failures in the future, it seems like we’re hurting ourselves.
> Branimir mentions this, but how do we ensure that we don’t let this pattern
> disguise future bugs?
>
>
>
>
>
> On Feb 13, 2024, at 8:41 AM, Branimir Lambov  wrote:
>
> Hi All,
>
> CASSANDRA-18753 introduces a second set of defaults (in a separate
> "cassandra_latest.yaml") that enable new features of Cassandra. The
> objective is two-fold: to be able to test the database in this
> configuration, and to point potential users that are evaluating the
> technology to an optimized set of defaults that give a clearer picture of
> the expected performance of the database for a new user. The objective is
> to get this configuration into 5.0 to have the extra bit of confidence that
> we are not releasing (and recommending) options that have not gone through
> thorough CI.
>
> The implementation has already gone through review, but I'd like to get
> people's opinion on two things:
> - There are currently a number of test failures when the new options are
> selected, some of which appear to be genuine problems. Is the community
> okay with committing the patch before all of these are addressed? This
> should prevent the introduction of new failures and make sure we don't
> release before clearing the existing ones.
> - I'd like to get an opinion on what's suitable wording and documentation
> for the new defaults set. Currently, the patch proposes adding the
> following text to the yaml (see
> https://github.com/apache/cassandra/pull/2896/files):
> # NOTE:
> #   This file is provided in two versions:
> # - cassandra.yaml: Contains configuration defaults for a "compatible"
> #   configuration that operates using settings that are backwards-compatible
> #   and interoperable with machines running older versions of Cassandra.
> #   This version is provided to facilitate pain-free upgrades for existing
> #   users of Cassandra running in production who want to gradually and
> #   carefully introduce new features.
> # - cassandra_latest.yaml: Contains configuration defaults that enable
> #   the latest features of Cassandra, including improved functionality as
> #   well as higher performance. This version is provided for new users of
> #   Cassandra who want to get the most out of their cluster, and for users
> #   evaluating the technology.
> #   To use this version, simply copy this file over cassandra.yaml, or
> #   specify it using the -Dcassandra.config system property, e.g. by running
> # cassandra -Dcassandra.config=file:/$CASSANDRA_HOME/conf/cassandra_latest.yaml
> # /NOTE
> Does this sound sensible? Should we add a pointer to this defaults set
> elsewhere in the documentation?
>
> Regards,
> Branimir
>
>
>


Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-14 Thread Paulo Motta
I share Jacek’s and Stefan’s sentiment about the low value of requiring
precommit j11+j17 tests for all changes.

Perhaps this was needed during j17 stabilization but is no longer required?
Please correct me if I'm missing some context.

To have a practical proposal to address this, how about:

1) Define “standard” java version for branch (11 or 17).
2) Define “standard” cassandra.yaml variant for branch (legacy
cassandra.yaml or shiny cassandra_latest.yaml).
3) Require green CI on precommit on standard java version + standard
cassandra.yaml variant.
4) Any known java-related changes require precommit j11 + j17.
5) Any known configuration changes require precommit tests on all
cassandra.yaml variants.
6) All supported java versions + cassandra.yaml variants need to be checked
before a release is proposed, to catch any issues missed in steps 4) or 5).

For example:
- If j17 is set as the “default” java version of the cassandra-5.0 branch,
then j11 tests are no longer required for patches that don't touch
java-related stuff
- If cassandra_latest.yaml becomes the new default configuration for 6.0,
then precommit only needs to be run against that version - prerelease needs
to be run against all cassandra.yaml variants.

Wdyt?
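Paulo's steps 1-5 amount to expanding an axis of the test matrix only when the patch touches that axis. A sketch of that policy follows; the java/yaml values mirror the thread's examples, but the function and its parameters are an illustration, not an agreed rule:

```python
def precommit_requirements(java_related: bool, config_related: bool,
                           standard_java="17",
                           standard_yaml="cassandra.yaml",
                           all_javas=("11", "17"),
                           all_yamls=("cassandra.yaml", "cassandra_latest.yaml")):
    """(java, yaml) combinations to run pre-commit: the standard pair by
    default, the full java axis for java-related patches, the full yaml
    axis for configuration patches. Step 6 (prerelease) would be the full
    cross product of all_javas x all_yamls regardless of the patch."""
    javas = all_javas if java_related else (standard_java,)
    yamls = all_yamls if config_related else (standard_yaml,)
    return [(j, y) for j in javas for y in yamls]
```

An ordinary refactoring patch thus runs one combination instead of four, while a patch that is both java- and config-related still pays for the whole matrix.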



Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-14 Thread Štefan Miklošovič
Jon,

I was mostly referring to Circle CI where we have two pre-commit workflows.
(just click on anything here
https://app.circleci.com/pipelines/github/instaclustr/cassandra)

java17_pre-commit_tests

This workflow is compiling & testing everything with Java 17

java11_pre-commit_tests

This workflow compiles with Java 11 and contains jobs which run with Java
11 and another set of jobs which run with Java 17.

My workflow so far is that when I want to merge something, I am required to
formally provide builds for both workflows. Maybe I am doing more work than
necessary here, but my understanding is that this has to be done and is
required.

I think Jacek was also talking about this, and it is questionable what
value it brings.



On Thu, Feb 15, 2024 at 12:13 AM Jon Haddad  wrote:

> Stefan, can you elaborate on what you are proposing?  It's not clear (at
> least to me) what level of testing you're advocating for.  Dropping testing
> both on dev branches, every commit, just on release?  In addition, can you
> elaborate on what is a hassle about it?  It's been a long time since I
> committed anything but I don't remember 2 JVMs (8 & 11) being a problem.
>
> Jon
>
>
>
> On Wed, Feb 14, 2024 at 2:35 PM Štefan Miklošovič <
> stefan.mikloso...@gmail.com> wrote:
>
>> I agree with Jacek; I don't quite understand why we are running the
>> pipeline for j17 and j11 every time. I think this should be opt-in. The
>> majority of the time we are just refactoring and coding stuff for
>> Cassandra where testing it on both JVMs is pointless, and we _know_ it
>> will be fine on 11 and 17 too because we do not do anything special. If
>> we find some subsystems where testing on both JVMs is crucial, we might
>> do that; I just do not remember the last time that testing in both j17
>> and j11 uncovered some bug. Seems more like a hassle.
>>
>> We might then test the whole pipeline with a different config in
>> basically the same time as we currently do.


Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-14 Thread Ekaterina Dimitrova
>
> I'm ok with breaking trunk CI temporarily as long as failures are tracked
> and triaged/addressed before the next release.


From the ticket, I understand it is meant for 5.0-rc

I share this sentiment for the release we decide to ship with:

> The failures should block release or we should not advertise we have those
> features at all, and the configuration should be named "experimental"
> rather than "latest".


> Is the community okay with committing the patch before all of these are
> addressed?

If we aim to fix everything before the next release, 5.0-rc, we can commit
CASSANDRA-18753 after the fixes are applied. If we are not going to do all
the fixes anytime soon, I prefer to commit and have the failures and the
tickets open. Otherwise, I can guarantee that I, personally, will forget
some of those failures and miss them in time... and I suspect I won't be
the only one :-)

> This version is provided for new users of Cassandra who want to get the
> most out of their cluster and for users evaluating the technology.

From reading this thread, we do not recommend using it straight into
production but to experiment, gain trust, and then use it in production.
Did I get it correctly? We need to confirm what it is and be sure it is
clearly stated in the docs.

Announcing this new yaml file under the NEWS.txt features section sounds
reasonable to me. Or could we add a separate section at the top of NEWS.txt
5.0, dedicated only to announcing this new configuration file?

> Mick and Ekaterina (and everyone really) - any thoughts on what test
> coverage we should commit to for this new configuration? Acknowledging that
> we already have *a lot* of CI that we run.

I do not have an immediate answer. I see there is some proposed CI
configuration in the ticket. As far as I can tell from a quick look, the
suggestion is to replace unit-trie with unit-latest (which also exercises
tries), and the additional new jobs will be Python and Java DTests (no new
upgrade tests).
Off the top of my head, we probably need a cost-benefit analysis, risk
analysis, and a discussion of tradeoffs: burnt resources vs. manpower, early
detection vs. late discovery or even production issues, experimental vs.
production-ready, etc.

Now, this question can have different answers depending on whether this is
an experimental config or we recommend it for production use.

I would expect new features to be enabled in this configuration and all
tests to be run pre-commit with both the default and the new YAML files. Is
this a correct assumption? It should probably be confirmed with a note on
the ML.

The question is, do we have enough resources in Jenkins to facilitate all
this testing post-commit?

> I think it is much more valuable to test those various configurations
> rather than test against j11 and j17 separately. I can see a really little
> value in doing that.

Excellent point. I have been saying for some time that IMHO we can reduce
what we run in CI, at least pre-commit, to:
1) build J11
2) build J17
3) run tests with build 11 + runtime 11
4) run tests with build 11 + runtime 17.

Technically, that is also what we ship in 5.0 (except for 2, the JDK17
build, which we should not remove from CI).
Does it make sense to reduce to what I mentioned in 1, 2, 3, 4 and instead
add the suggested jobs with the new configuration from CASSANDRA-18753 to
pre-commit? Please correct me if I am wrong, but I understand that running
the tests with JDK17 on the JDK17 build is experimental in CI, so that we
can gain confidence until the release in which we drop 11. No? If that is
correct, I do not see why we run those tests on every pre-commit rather
than only on what we ship.
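For illustration, the reduced build/test matrix in steps 1-4 above could be
sketched in a CircleCI-style workflow. All job names and parameters below are
hypothetical, invented for this sketch; they are not the project's actual
CircleCI configuration:

```yaml
# Hypothetical sketch only: two builds, but tests run against the JDK11
# build on both runtimes, mirroring steps 1-4 above.
workflows:
  pre-commit:
    jobs:
      - build-j11          # 1) build with JDK 11
      - build-j17          # 2) build with JDK 17 (shipped, so kept in CI)
      - test:
          name: test-build11-runtime11   # 3) JDK 11 build + JDK 11 runtime
          requires: [build-j11]
          runtime-jdk: "11"
      - test:
          name: test-build11-runtime17   # 4) JDK 11 build + JDK 17 runtime
          requires: [build-j11]
          runtime-jdk: "17"
```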

Best regards,
Ekaterina

On Wed, 14 Feb 2024 at 17:35, Štefan Miklošovič 
wrote:

> I agree with Jacek, I don't quite understand why we are running the
> pipeline for j17 and j11 every time. I think this should be opt-in.
> Majority of the time, we are just refactoring and coding stuff for
> Cassandra where testing it for both jvms is just pointless and we _know_
> that it will be fine in 11 and 17 too because we do not do anything
> special. If we find some subsystems where testing that on both jvms is
> crucial, we might do that, I just do not remember when it was last time
> that testing it in both j17 and j11 suddenly uncovered some bug. Seems more
> like a hassle.
>
> We might then test the whole pipeline with a different config basically
> for same time as we currently do.
>
> On Wed, Feb 14, 2024 at 9:32 PM Jacek Lewandowski <
> lewandowski.ja...@gmail.com> wrote:
>
>> On Wed, Feb 14, 2024 at 17:30 Josh McKenzie wrote:
>>
>>> When we have failing tests people do not spend the time to figure out if
>>> their logic caused a regression and merge, making things more unstable… so
>>> when we merge failing tests that leads to people merging even more failing
>>> tests...
>>>
>>> What's the counter position to this Jacek / Berenguer?
>>>
>>
>> For how long are we going to deceive ourselves? Are we shipping those
>> features or not? Perhaps it is 

Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-14 Thread Jon Haddad
Stefan, can you elaborate on what you are proposing?  It's not clear (at
least to me) what level of testing you're advocating for.  Dropping testing
of both JVMs on dev branches, on every commit, or only on release?  In
addition, can you elaborate on what is a hassle about it?  It's been a long
time since I committed anything, but I don't remember 2 JVMs (8 & 11) being
a problem.

Jon



On Wed, Feb 14, 2024 at 2:35 PM Štefan Miklošovič <
stefan.mikloso...@gmail.com> wrote:

> I agree with Jacek, I don't quite understand why we are running the
> pipeline for j17 and j11 every time. I think this should be opt-in.
> Majority of the time, we are just refactoring and coding stuff for
> Cassandra where testing it for both jvms is just pointless and we _know_
> that it will be fine in 11 and 17 too because we do not do anything
> special. If we find some subsystems where testing that on both jvms is
> crucial, we might do that, I just do not remember when it was last time
> that testing it in both j17 and j11 suddenly uncovered some bug. Seems more
> like a hassle.
>
> We might then test the whole pipeline with a different config basically
> for same time as we currently do.
>
> On Wed, Feb 14, 2024 at 9:32 PM Jacek Lewandowski <
> lewandowski.ja...@gmail.com> wrote:
>
>> On Wed, Feb 14, 2024 at 17:30 Josh McKenzie wrote:
>>
>>> When we have failing tests people do not spend the time to figure out if
>>> their logic caused a regression and merge, making things more unstable… so
>>> when we merge failing tests that leads to people merging even more failing
>>> tests...
>>>
>>> What's the counter position to this Jacek / Berenguer?
>>>
>>
>> For how long are we going to deceive ourselves? Are we shipping those
>> features or not? Perhaps it is also a good opportunity to distinguish
>> subsets of tests which make sense to run with a configuration matrix.
>>
>> If we don't add those tests to the pre-commit pipeline, "people do not
>> spend the time to figure out if their logic caused a regression and merge,
>> making things more unstable…"
>> I think it is much more valuable to test those various configurations
>> rather than test against j11 and j17 separately. I can see a really little
>> value in doing that.
>>
>>
>>


Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-14 Thread Štefan Miklošovič
I agree with Jacek, I don't quite understand why we are running the
pipeline for j17 and j11 every time. I think this should be opt-in. The
majority of the time, we are just refactoring and coding stuff for
Cassandra where testing it on both JVMs is just pointless, and we _know_
that it will be fine on 11 and 17 too because we are not doing anything
special. If we find some subsystems where testing on both JVMs is
crucial, we might do that; I just do not remember the last time that
testing on both j17 and j11 suddenly uncovered some bug. Seems more like a
hassle.

We might then test the whole pipeline with a different config in basically
the same time as we currently do.

On Wed, Feb 14, 2024 at 9:32 PM Jacek Lewandowski <
lewandowski.ja...@gmail.com> wrote:

> On Wed, Feb 14, 2024 at 17:30 Josh McKenzie wrote:
>
>> When we have failing tests people do not spend the time to figure out if
>> their logic caused a regression and merge, making things more unstable… so
>> when we merge failing tests that leads to people merging even more failing
>> tests...
>>
>> What's the counter position to this Jacek / Berenguer?
>>
>
> For how long are we going to deceive ourselves? Are we shipping those
> features or not? Perhaps it is also a good opportunity to distinguish
> subsets of tests which make sense to run with a configuration matrix.
>
> If we don't add those tests to the pre-commit pipeline, "people do not
> spend the time to figure out if their logic caused a regression and merge,
> making things more unstable…"
> I think it is much more valuable to test those various configurations
> rather than test against j11 and j17 separately. I can see a really little
> value in doing that.
>
>
>


Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-14 Thread Jacek Lewandowski
On Wed, Feb 14, 2024 at 17:30 Josh McKenzie wrote:

> When we have failing tests people do not spend the time to figure out if
> their logic caused a regression and merge, making things more unstable… so
> when we merge failing tests that leads to people merging even more failing
> tests...
>
> What's the counter position to this Jacek / Berenguer?
>

For how long are we going to deceive ourselves? Are we shipping those
features or not? Perhaps it is also a good opportunity to distinguish
subsets of tests which make sense to run with a configuration matrix.

If we don't add those tests to the pre-commit pipeline, "people do not
spend the time to figure out if their logic caused a regression and merge,
making things more unstable…"
I think it is much more valuable to test those various configurations
rather than test against j11 and j17 separately. I see very little value in
doing that.


Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-14 Thread Jeff Jirsa
1) If there’s an “old compatible default” and “latest recommended settings”, 
when does the value in “old compatible default” get updated? Never? 
2) If there are test failures with the new values, it seems REALLY IMPORTANT to 
make sure those test failures are discovered + fixed IN THE FUTURE TOO. If 
pushing new yaml into a different file makes us less likely to catch the 
failures in the future, it seems like we’re hurting ourselves. Branimir 
mentions this, but how do we ensure that we don’t let this pattern disguise 
future bugs? 





> On Feb 13, 2024, at 8:41 AM, Branimir Lambov  wrote:
> 
> Hi All,
> 
> CASSANDRA-18753 introduces a second set of defaults (in a separate 
> "cassandra_latest.yaml") that enable new features of Cassandra. The objective 
> is two-fold: to be able to test the database in this configuration, and to 
> point potential users that are evaluating the technology to an optimized set 
> of defaults that give a clearer picture of the expected performance of the 
> database for a new user. The objective is to get this configuration into 5.0 
> to have the extra bit of confidence that we are not releasing (and 
> recommending) options that have not gone through thorough CI.
> 
> The implementation has already gone through review, but I'd like to get 
> people's opinion on two things:
> - There are currently a number of test failures when the new options are 
> selected, some of which appear to be genuine problems. Is the community okay 
> with committing the patch before all of these are addressed? This should 
> prevent the introduction of new failures and make sure we don't release 
> before clearing the existing ones.
> - I'd like to get an opinion on what's suitable wording and documentation for 
> the new defaults set. Currently, the patch proposes adding the following text 
> to the yaml (see https://github.com/apache/cassandra/pull/2896/files):
> # NOTE:
> #   This file is provided in two versions:
> # - cassandra.yaml: Contains configuration defaults for a "compatible"
> #   configuration that operates using settings that are backwards-compatible
> #   and interoperable with machines running older versions of Cassandra.
> #   This version is provided to facilitate pain-free upgrades for existing
> #   users of Cassandra running in production who want to gradually and
> #   carefully introduce new features.
> # - cassandra_latest.yaml: Contains configuration defaults that enable
> #   the latest features of Cassandra, including improved functionality as
> #   well as higher performance. This version is provided for new users of
> #   Cassandra who want to get the most out of their cluster, and for users
> #   evaluating the technology.
> #   To use this version, simply copy this file over cassandra.yaml, or specify
> #   it using the -Dcassandra.config system property, e.g. by running
> #     cassandra -Dcassandra.config=file:/$CASSANDRA_HOME/conf/cassandra_latest.yaml
> # /NOTE
> Does this sound sensible? Should we add a pointer to this defaults set 
> elsewhere in the documentation?
> 
> Regards,
> Branimir



[RELEASE] Apache Cassandra 4.1.4 released

2024-02-14 Thread Štefan Miklošovič
The Cassandra team is pleased to announce the release of Apache Cassandra
version 4.1.4.

Apache Cassandra is a fully distributed database. It is the right choice
when you need scalability and high availability without compromising
performance.

 https://cassandra.apache.org/

Downloads of source and binary distributions are listed in our download
section:

 https://cassandra.apache.org/download/

This version is a bug fix release[1] on the 4.1 series. As always, please
pay attention to the release notes[2] and let us know[3] if you encounter
any problems.

[WARNING] Debian and RedHat package repositories have moved! Debian
/etc/apt/sources.list.d/cassandra.sources.list and RedHat
/etc/yum.repos.d/cassandra.repo files must be updated to the new repository
URLs. For Debian it is now https://debian.cassandra.apache.org . For RedHat
it is now https://redhat.cassandra.apache.org/41x/ .
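For reference, the updated repository definitions would look roughly like the
following. This is an illustrative sketch based on the URLs above; the exact
file contents (series path such as 41x, GPG settings) should be checked
against the official installation documentation:

```
# /etc/apt/sources.list.d/cassandra.sources.list (Debian, illustrative)
deb https://debian.cassandra.apache.org 41x main

# /etc/yum.repos.d/cassandra.repo (RedHat, illustrative)
[cassandra]
name=Apache Cassandra
baseurl=https://redhat.cassandra.apache.org/41x/
gpgcheck=1
gpgkey=https://downloads.apache.org/cassandra/KEYS
```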

Enjoy!

[1]: CHANGES.txt
https://github.com/apache/cassandra/blob/cassandra-4.1.4/CHANGES.txt
[2]: NEWS.txt
https://github.com/apache/cassandra/blob/cassandra-4.1.4/NEWS.txt
[3]: https://issues.apache.org/jira/browse/CASSANDRA


Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-14 Thread Paulo Motta
Cool stuff! This will make it easier to advance configuration defaults
without affecting stable configuration.

Wording looks good to me. +1 to include a NEWS.txt note. I'm ok with
breaking trunk CI temporarily as long as failures are tracked and
triaged/addressed before the next release.

I haven't had the chance to look into CASSANDRA-18753 yet so apologies if
this was already discussed but I have the following questions about
handling 2 configuration files moving forward:
1) Will cassandra.yaml remain the default test config? Is the plan moving
forward to require green CI for both configurations on pre-commit, or
pre-release?
2) What will this mean for the release artifact, is the idea to continue
shipping with the current cassandra.yaml or eventually switch to the
optimized configuration (ie. 6.X) while making the legacy default
configuration available via an optional flag?

On Tue, Feb 13, 2024 at 11:42 AM Branimir Lambov  wrote:

> Hi All,
>
> CASSANDRA-18753 introduces a second set of defaults (in a separate
> "cassandra_latest.yaml") that enable new features of Cassandra. The
> objective is two-fold: to be able to test the database in this
> configuration, and to point potential users that are evaluating the
> technology to an optimized set of defaults that give a clearer picture of
> the expected performance of the database for a new user. The objective is
> to get this configuration into 5.0 to have the extra bit of confidence that
> we are not releasing (and recommending) options that have not gone through
> thorough CI.
>
> The implementation has already gone through review, but I'd like to get
> people's opinion on two things:
> - There are currently a number of test failures when the new options are
> selected, some of which appear to be genuine problems. Is the community
> okay with committing the patch before all of these are addressed? This
> should prevent the introduction of new failures and make sure we don't
> release before clearing the existing ones.
> - I'd like to get an opinion on what's suitable wording and documentation
> for the new defaults set. Currently, the patch proposes adding the
> following text to the yaml (see
> https://github.com/apache/cassandra/pull/2896/files):
> # NOTE:
> #   This file is provided in two versions:
> # - cassandra.yaml: Contains configuration defaults for a "compatible"
> #   configuration that operates using settings that are backwards-compatible
> #   and interoperable with machines running older versions of Cassandra.
> #   This version is provided to facilitate pain-free upgrades for existing
> #   users of Cassandra running in production who want to gradually and
> #   carefully introduce new features.
> # - cassandra_latest.yaml: Contains configuration defaults that enable
> #   the latest features of Cassandra, including improved functionality as
> #   well as higher performance. This version is provided for new users of
> #   Cassandra who want to get the most out of their cluster, and for users
> #   evaluating the technology.
> #   To use this version, simply copy this file over cassandra.yaml, or specify
> #   it using the -Dcassandra.config system property, e.g. by running
> #     cassandra -Dcassandra.config=file:/$CASSANDRA_HOME/conf/cassandra_latest.yaml
> # /NOTE
> Does this sound sensible? Should we add a pointer to this defaults set
> elsewhere in the documentation?
>
> Regards,
> Branimir
>


Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-14 Thread Josh McKenzie
> When we have failing tests people do not spend the time to figure out if 
> their logic caused a regression and merge, making things more unstable… so 
> when we merge failing tests that leads to people merging even more failing 
> tests...
What's the counter position to this Jacek / Berenguer?

Mick and Ekaterina (and everyone really) - any thoughts on what test coverage, 
if any, we should commit to for this new configuration? Acknowledging that we 
already have *a lot* of CI that we run.


On Wed, Feb 14, 2024, at 5:11 AM, Berenguer Blasi wrote:
> +1 to not doing, imo, the ostrich lol
> 
> On 14/2/24 10:58, Jacek Lewandowski wrote:
>> We should not block merging configuration changes given it is a valid 
>> configuration - which I understand as it is correct, passes all config 
>> validations, it matches documented rules, etc. And this provided latest 
>> config matches those requirements I assume.
>> 
>> The failures should block release or we should not advertise we have those 
>> features at all, and the configuration should be named "experimental" rather 
>> than "latest".
>> 
>> The config changes are not responsible for broken features and we should not 
>> bury our heads in the sand pretending that everything is ok.
>> 
>> Thanks,
>> 
>> On Wed, Feb 14, 2024 at 10:47 Štefan Miklošovič wrote:
>>> Wording looks good to me. I would also put that into NEWS.txt but I am not 
>>> sure what section. New features, Upgrading nor Deprecation does not seem to 
>>> be a good category. 
>>> 
>>> On Tue, Feb 13, 2024 at 5:42 PM Branimir Lambov  wrote:
 Hi All,
 
 CASSANDRA-18753 introduces a second set of defaults (in a separate 
 "cassandra_latest.yaml") that enable new features of Cassandra. The 
 objective is two-fold: to be able to test the database in this 
 configuration, and to point potential users that are evaluating the 
 technology to an optimized set of defaults that give a clearer picture of 
 the expected performance of the database for a new user. The objective is 
 to get this configuration into 5.0 to have the extra bit of confidence 
 that we are not releasing (and recommending) options that have not gone 
 through thorough CI.
 
 The implementation has already gone through review, but I'd like to get 
 people's opinion on two things:
 - There are currently a number of test failures when the new options are 
 selected, some of which appear to be genuine problems. Is the community 
 okay with committing the patch before all of these are addressed? This 
 should prevent the introduction of new failures and make sure we don't 
 release before clearing the existing ones.
 - I'd like to get an opinion on what's suitable wording and documentation 
 for the new defaults set. Currently, the patch proposes adding the 
 following text to the yaml (see 
 https://github.com/apache/cassandra/pull/2896/files):
 # NOTE:
 #   This file is provided in two versions:
 # - cassandra.yaml: Contains configuration defaults for a "compatible"
 #   configuration that operates using settings that are backwards-compatible
 #   and interoperable with machines running older versions of Cassandra.
 #   This version is provided to facilitate pain-free upgrades for existing
 #   users of Cassandra running in production who want to gradually and
 #   carefully introduce new features.
 # - cassandra_latest.yaml: Contains configuration defaults that enable
 #   the latest features of Cassandra, including improved functionality as
 #   well as higher performance. This version is provided for new users of
 #   Cassandra who want to get the most out of their cluster, and for users
 #   evaluating the technology.
 #   To use this version, simply copy this file over cassandra.yaml, or specify
 #   it using the -Dcassandra.config system property, e.g. by running
 #     cassandra -Dcassandra.config=file:/$CASSANDRA_HOME/conf/cassandra_latest.yaml
 # /NOTE
 Does this sound sensible? Should we add a pointer to this defaults set 
 elsewhere in the documentation?
 
 Regards,
 Branimir


[RESULT][VOTE] Release Apache Cassandra 4.1.4

2024-02-14 Thread Štefan Miklošovič
The vote has passed with three binding +1s and no vetoes.


Re: [Discuss] Introducing Flexible Authentication in Cassandra via Feature Flag

2024-02-14 Thread Jacek Lewandowski
Hi,

I think what Gaurav means is what we know at DataStax as transitional
authenticator, which temporarily allows for partially enabled
authentication - when the system allows the clients to authenticate but
does not enforce it.

All in all, that should be included in CEP-31 - also CEP-31 aims to let the
administrators enable/disable and reconfigure authentication without a
restart so we could discuss whether such transitional mode would be needed
at all in that case.

Thanks,
- - -- --- -  -
Jacek Lewandowski


On Tue, Feb 13, 2024 at 07:04 Jeff Jirsa wrote:

> Auth is one of those things that needs to be a bit more concrete
>
> In the scenario you describe, you already have an option to deploy the
> auth in piece partially during the rollout (pause halfway through) in the
> cluster and look for asymmetric connections, and the option to drop in a
> new Authenticator jar in the class path that does the flexible auth you
> describe
>
> I fear that the extra flexibility this allows for 1% of operations exposes
> people to long term problems
>
> Have you considered just implementing the feature flag you describe using
> the existing plugin infrastructure ?
>
> On Feb 12, 2024, at 9:47 PM, Gaurav Agarwal 
> wrote:
>
> 
> Dear Dinesh and Abe,
>
> Thank you for reviewing the document on enabling Cassandra authentication.
> I apologize that I didn't initially include the following failure scenarios
> where this feature could be particularly beneficial (I've included them
> now):
>
> *Below are the failure scenarios:*
>
>    - Incorrect credentials: If a client accidentally uses the wrong
>    username/password combination during the rollout, then as the servers
>    restart with authentication enabled, they will refuse connections with
>    incorrect credentials. This can temporarily interrupt the service until
>    correct credentials are sent.
>- Missed service auth updates: In a large-scale system, a service "X"
>might miss the credential update during rollout. After some server nodes
>restart, service "X" might finally realize it needs correct credentials,
>but it's too late. Nodes are already expecting authorized requests, and
>this mismatch causes "X" to stop working on auth enabled and restarted
>nodes.
>- Infrequent traffic:  Suppose one of the services only interacts with
>the server once a week. Suppose it starts sending requests with incorrect
>credentials after authentication is enabled. Since the entire cluster is
>now running on authentication, the service's outdated credentials cause it
>to be denied access, resulting in a service-wide outage.
>
>
> The proposed feature flag would allow clients to connect momentarily
> without authentication during the rollout, mitigating these risks and
> ensuring a smoother transition.
>
> Thanks in advance for your continued review of the proposal.
>
>
>
> On Mon, Feb 12, 2024 at 2:24 PM Abe Ratnofsky  wrote:
>
>> Hey Guarav,
>>
>> Thanks for your proposal.
>>
>> > disruptive, full-cluster restart, posing significant risks in live
>> environments
>>
>> For configuration that isn't hot-reloadable, like providing a new
>> IAuthenticator implementation, a rolling restart is required. But rolling
>> restarts are zero-downtime and safe in production, as long as you pace them
>> accordingly.
>>
>> In general, changing authenticators is a risky thing because it requires
>> coordination with clients. To mitigate this risk and support clients while
>> they transition between authenticators, I like the approach taken by
>> MutualTlsWithPasswordFallbackAuthenticator:
>>
>> https://github.com/apache/cassandra/blob/bec6bfde1f3b6a782f123f9f9ff18072a97e379f/src/java/org/apache/cassandra/auth/MutualTlsWithPasswordFallbackAuthenticator.java#L34
>>
>> If client certificates are available, then use those, otherwise use the
>> existing PasswordAuthenticator that clients are already using. The existing
>> IAuthenticator interface supports this transitional behavior well.
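>> As a toy illustration of that fallback pattern (a self-contained sketch
>> only: the class and method names below are invented for the example and
>> are not the real Cassandra IAuthenticator API, which works through SASL
>> negotiators rather than a simple credentials map):

```java
import java.util.Map;
import java.util.Optional;

// Self-contained model of "mTLS with password fallback": prefer a validated
// client-certificate identity when one is present, otherwise fall back to
// password authentication. Names are invented for illustration only.
public class FallbackAuthDemo {
    interface Authenticator {
        Optional<String> authenticate(Map<String, String> credentials);
    }

    // Pretend "cert" carries an already-validated certificate identity.
    static final Authenticator MTLS =
        creds -> Optional.ofNullable(creds.get("cert"));

    // Pretend password auth that accepts a single known user.
    static final Authenticator PASSWORD =
        creds -> "cassandra".equals(creds.get("user"))
                 && "cassandra".equals(creds.get("password"))
            ? Optional.of(creds.get("user"))
            : Optional.empty();

    // The fallback decision: try mTLS first, then the password path.
    static Optional<String> authenticate(Map<String, String> creds) {
        Optional<String> byCert = MTLS.authenticate(creds);
        return byCert.isPresent() ? byCert : PASSWORD.authenticate(creds);
    }

    public static void main(String[] args) {
        System.out.println(authenticate(Map.of("cert", "svc-x")));       // Optional[svc-x]
        System.out.println(authenticate(
            Map.of("user", "cassandra", "password", "cassandra")));      // Optional[cassandra]
        System.out.println(authenticate(
            Map.of("user", "svc-y", "password", "wrong")));              // Optional.empty
    }
}
```

>> The real implementation delegates to full MutualTlsAuthenticator and
>> PasswordAuthenticator instances; the sketch only models the ordering
>> decision that lets clients migrate between the two mechanisms.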
>>
>> Your proposal to include a new configuration for auth_enforcement_flag
>> doesn't clearly cover how to transition from one authenticator to another.
>> It says:
>>
>> > Soft: Operates in a monitoring mode without enforcing authentication
>>
>> Most users use authentication today, so auth_enforcement_flag=Soft would
>> allow unauthenticated clients to connect to the database.
>>
>> --
>> Abe
>>
>> On Feb 12, 2024, at 2:44 PM, Gaurav Agarwal 
>> wrote:
>>
>> Dear Cassandra Community,
>>
>> I'm excited to share a proposal for a new feature that I believe would
>> significantly enhance the platform's security and operational flexibility: *a
>> flexible authentication mechanism implemented through a feature flag *.
>>
>> Currently, enforcing authentication in Cassandra requires a disruptive,
>> full-cluster restart, posing significant risks in live environments. My
>> proposal, the *auth_enforcement_flag*, addresses this challenge 

Re: [VOTE] Release Apache Cassandra 4.1.4

2024-02-14 Thread Mick Semb Wever
>
> The vote will be open for 72 hours (longer if needed). Everyone who has
> tested the build is invited to vote. Votes by PMC members are considered
> binding. A vote passes if there are at least three binding +1s and no -1's.
>


+1

Checked
- signing correct
- checksums are correct
- source artefact builds (JDK 8+11)
- binary artefact runs (JDK 8+11)
- debian package runs (JDK 8+11)
- debian repo runs (JDK 8+11)
- redhat* package runs (JDK 8+11)
- redhat* repo runs (JDK 8+11)


Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-14 Thread Berenguer Blasi

+1 to not doing, imo, the ostrich lol

On 14/2/24 10:58, Jacek Lewandowski wrote:
We should not block merging configuration changes given it is a valid 
configuration - which I understand as it is correct, passes all config 
validations, it matches documented rules, etc. And this provided 
latest config matches those requirements I assume.


The failures should block release or we should not advertise we have 
those features at all, and the configuration should be named 
"experimental" rather than "latest".


The config changes are not responsible for broken features and we 
should not bury our heads in the sand pretending that everything is ok.


Thanks,

On Wed, Feb 14, 2024 at 10:47 Štefan Miklošovič wrote:


Wording looks good to me. I would also put that into NEWS.txt but
I am not sure what section. New features, Upgrading nor
Deprecation does not seem to be a good category.

On Tue, Feb 13, 2024 at 5:42 PM Branimir Lambov
 wrote:

Hi All,

CASSANDRA-18753 introduces a second set of defaults (in a
separate "cassandra_latest.yaml") that enable new features of
Cassandra. The objective is two-fold: to be able to test the
database in this configuration, and to point potential users
that are evaluating the technology to an optimized set of
defaults that give a clearer picture of the expected
performance of the database for a new user. The objective is
to get this configuration into 5.0 to have the extra bit of
confidence that we are not releasing (and recommending)
options that have not gone through thorough CI.

The implementation has already gone through review, but I'd
like to get people's opinion on two things:
- There are currently a number of test failures when the new
options are selected, some of which appear to
be genuine problems. Is the community okay with committing the
patch before all of these are addressed? This should prevent
the introduction of new failures and make sure we don't
release before clearing the existing ones.
- I'd like to get an opinion on what's suitable wording and
documentation for the new defaults set. Currently, the patch
proposes adding the following text to the yaml (see
https://github.com/apache/cassandra/pull/2896/files):
# NOTE:
#   This file is provided in two versions:
#     - cassandra.yaml: Contains configuration defaults for a "compatible"
#       configuration that operates using settings that are backwards-compatible
#       and interoperable with machines running older versions of Cassandra.
#       This version is provided to facilitate pain-free upgrades for existing
#       users of Cassandra running in production who want to gradually and
#       carefully introduce new features.
#     - cassandra_latest.yaml: Contains configuration defaults that enable
#       the latest features of Cassandra, including improved functionality as
#       well as higher performance. This version is provided for new users of
#       Cassandra who want to get the most out of their cluster, and for users
#       evaluating the technology.
#       To use this version, simply copy this file over cassandra.yaml, or specify
#       it using the -Dcassandra.config system property, e.g. by running
#         cassandra -Dcassandra.config=file:/$CASSANDRA_HOME/conf/cassandra_latest.yaml
# /NOTE
Does this sound sensible? Should we add a pointer to this
defaults set elsewhere in the documentation?

Regards,
Branimir


Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-14 Thread Jacek Lewandowski
We should not block merging configuration changes as long as it is a valid
configuration - which I understand as: it is correct, passes all config
validations, matches the documented rules, etc. And I assume this provided
latest config meets those requirements.

The failures should block the release - otherwise we should not advertise that
we have those features at all, and the configuration should be named
"experimental" rather than "latest".

The config changes are not responsible for broken features and we should
not bury our heads in the sand pretending that everything is ok.
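The kind of validity check described above can be sketched mechanically: the "latest" file should only override options that the default configuration already knows about. This is a hypothetical check, not an existing Cassandra tool, and the option names below are illustrative rather than the actual file contents.

```python
# Hypothetical validity check: every option overridden by the "latest"
# defaults must be a known option from the default configuration file.
# Option names and values below are illustrative placeholders.

def unknown_options(default_config: dict, latest_config: dict) -> set[str]:
    """Return options that appear in latest but not in the default file."""
    return set(latest_config) - set(default_config)

default_config = {"num_tokens": 16, "memtable_allocation_type": "heap_buffers"}
latest_config = {"num_tokens": 16, "memtable_allocation_type": "offheap_objects"}

# An empty result means the latest file introduces no unknown options.
assert not unknown_options(default_config, latest_config)
```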

Thanks,

śr., 14 lut 2024, 10:47 użytkownik Štefan Miklošovič <
stefan.mikloso...@gmail.com> napisał:

> Wording looks good to me. I would also put that into NEWS.txt, but I am not
> sure which section: neither New features, Upgrading, nor Deprecation seems
> to be the right category.
>
> On Tue, Feb 13, 2024 at 5:42 PM Branimir Lambov 
> wrote:
>
>> Hi All,
>>
>> CASSANDRA-18753 introduces a second set of defaults (in a separate
>> "cassandra_latest.yaml") that enable new features of Cassandra. The
>> objective is two-fold: to be able to test the database in this
>> configuration, and to point potential users that are evaluating the
>> technology to an optimized set of defaults that give a clearer picture of
>> the expected performance of the database for a new user. The objective is
>> to get this configuration into 5.0 to have the extra bit of confidence that
>> we are not releasing (and recommending) options that have not gone through
>> thorough CI.
>>
>> The implementation has already gone through review, but I'd like to get
>> people's opinion on two things:
>> - There are currently a number of test failures when the new options are
>> selected, some of which appear to be genuine problems. Is the community
>> okay with committing the patch before all of these are addressed? This
>> should prevent the introduction of new failures and make sure we don't
>> release before clearing the existing ones.
>> - I'd like to get an opinion on what's suitable wording and documentation
>> for the new defaults set. Currently, the patch proposes adding the
>> following text to the yaml (see
>> https://github.com/apache/cassandra/pull/2896/files):
>> # NOTE:
>> #   This file is provided in two versions:
>> #     - cassandra.yaml: Contains configuration defaults for a "compatible"
>> #       configuration that operates using settings that are backwards-compatible
>> #       and interoperable with machines running older versions of Cassandra.
>> #       This version is provided to facilitate pain-free upgrades for existing
>> #       users of Cassandra running in production who want to gradually and
>> #       carefully introduce new features.
>> #     - cassandra_latest.yaml: Contains configuration defaults that enable
>> #       the latest features of Cassandra, including improved functionality as
>> #       well as higher performance. This version is provided for new users of
>> #       Cassandra who want to get the most out of their cluster, and for users
>> #       evaluating the technology.
>> #       To use this version, simply copy this file over cassandra.yaml, or specify
>> #       it using the -Dcassandra.config system property, e.g. by running
>> #         cassandra -Dcassandra.config=file:/$CASSANDRA_HOME/conf/cassandra_latest.yaml
>> # /NOTE
>> Does this sound sensible? Should we add a pointer to this defaults set
>> elsewhere in the documentation?
>>
>> Regards,
>> Branimir
>>
>


Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-14 Thread Štefan Miklošovič
Wording looks good to me. I would also put that into NEWS.txt, but I am not
sure which section: neither New features, Upgrading, nor Deprecation seems
to be the right category.

On Tue, Feb 13, 2024 at 5:42 PM Branimir Lambov  wrote:

> Hi All,
>
> CASSANDRA-18753 introduces a second set of defaults (in a separate
> "cassandra_latest.yaml") that enable new features of Cassandra. The
> objective is two-fold: to be able to test the database in this
> configuration, and to point potential users that are evaluating the
> technology to an optimized set of defaults that give a clearer picture of
> the expected performance of the database for a new user. The objective is
> to get this configuration into 5.0 to have the extra bit of confidence that
> we are not releasing (and recommending) options that have not gone through
> thorough CI.
>
> The implementation has already gone through review, but I'd like to get
> people's opinion on two things:
> - There are currently a number of test failures when the new options are
> selected, some of which appear to be genuine problems. Is the community
> okay with committing the patch before all of these are addressed? This
> should prevent the introduction of new failures and make sure we don't
> release before clearing the existing ones.
> - I'd like to get an opinion on what's suitable wording and documentation
> for the new defaults set. Currently, the patch proposes adding the
> following text to the yaml (see
> https://github.com/apache/cassandra/pull/2896/files):
> # NOTE:
> #   This file is provided in two versions:
> #     - cassandra.yaml: Contains configuration defaults for a "compatible"
> #       configuration that operates using settings that are backwards-compatible
> #       and interoperable with machines running older versions of Cassandra.
> #       This version is provided to facilitate pain-free upgrades for existing
> #       users of Cassandra running in production who want to gradually and
> #       carefully introduce new features.
> #     - cassandra_latest.yaml: Contains configuration defaults that enable
> #       the latest features of Cassandra, including improved functionality as
> #       well as higher performance. This version is provided for new users of
> #       Cassandra who want to get the most out of their cluster, and for users
> #       evaluating the technology.
> #       To use this version, simply copy this file over cassandra.yaml, or specify
> #       it using the -Dcassandra.config system property, e.g. by running
> #         cassandra -Dcassandra.config=file:/$CASSANDRA_HOME/conf/cassandra_latest.yaml
> # /NOTE
> Does this sound sensible? Should we add a pointer to this defaults set
> elsewhere in the documentation?
>
> Regards,
> Branimir
>


Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-14 Thread Branimir Lambov
> is there a reason all guardrails and reliability (aka repair retries)
> configs are off by default?  They are off by default in the normal config
> for backwards compatibility reasons, but if we are defining a config saying
> what we recommend, we should enable these things by default IMO.

This is one more question to be answered by this discussion. Are there
other options that should be enabled by the "latest" configuration? To what
values should they be set?
Is there something that is currently enabled that should not be?
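To make that review concrete, one way to enumerate the candidates is simply to diff the two flattened defaults maps. This is a sketch under assumed, illustrative values, not the actual contents of either file:

```python
# Sketch: list every option whose default differs between the two
# defaults files. Option names and values are illustrative placeholders.

def diff_defaults(compat: dict, latest: dict) -> dict:
    """Map option -> (compat_value, latest_value) for differing defaults."""
    return {k: (compat.get(k), latest.get(k))
            for k in sorted(set(compat) | set(latest))
            if compat.get(k) != latest.get(k)}

compat = {"materialized_views_enabled": False, "concurrent_compactors": 2}
latest = {"materialized_views_enabled": False, "concurrent_compactors": 4}

for option, (old, new) in diff_defaults(compat, latest).items():
    print(f"{option}: {old} -> {new}")
# → concurrent_compactors: 2 -> 4
```

Running such a diff against the real files would give a checklist for deciding, option by option, whether "latest" should differ from the compatible defaults.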

> Should we merge the configs breaking these tests?  No…. When we have
> failing tests people do not spend the time to figure out if their logic
> caused a regression and merge, making things more unstable… so when we
> merge failing tests that leads to people merging even more failing tests...

In this case it also means that people will not see the failures they
introduce in any of the advanced features, because those features are not
tested at all. Also, since CASSANDRA-19167 and CASSANDRA-19168 already have
fixes, the non-latest test suite will remain clean after the merge. Note that
these two problems demonstrate that we have failures in the configuration we
ship with, because we are not actually testing it at all. IMHO this is a
problem we should not delay fixing.

Regards,
Branimir

On Wed, Feb 14, 2024 at 1:07 AM David Capwell  wrote:

> so can cause repairs to deadlock forever
>
>
> Small correction: I finished fixing the tests in CASSANDRA-19042, and we
> don’t deadlock - we time out and fail the repair if any of those messages
> are dropped.
>
> On Feb 13, 2024, at 11:04 AM, David Capwell  wrote:
>
> and to point potential users that are evaluating the technology to an
> optimized set of defaults
>
>
> Left this comment in the GH… is there a reason all guardrails and
> reliability (aka repair retries) configs are off by default?  They are
> off by default in the normal config for backwards compatibility reasons,
> but if we are defining a config saying what we recommend, we should enable
> these things by default IMO.
>
> There are currently a number of test failures when the new options are
> selected, some of which appear to be genuine problems. Is the community
> okay with committing the patch before all of these are addressed?
>
>
> I was tagged on CASSANDRA-19042; the paxos repair message handling does
> not have the repair reliability improvements that 5.0 has, so it can cause
> repairs to deadlock forever (same as current 4.x repairs).  Bringing these
> up to par with the rest of repair would be very much welcome (they are also
> lacking visibility, so one needs to fall back to heap dumps to see what’s
> going on; same as 4.0.x but not 4.1.x), but I doubt I have cycles to do
> that…. This refactor is not 100% trivial, as it has fun subtle concurrency
> issues to address (message retries and deduping), and making sure this
> logic works with the existing repair simulation tests does require
> refactoring how the paxos cleanup state is tracked, which could have
> subtle consequences.
>
> I do think this should be fixed, but should it block 5.0?  Not sure… will
> leave to others….
>
> Should we merge the configs breaking these tests?  No…. When we have
> failing tests people do not spend the time to figure out if their logic
> caused a regression and merge, making things more unstable… so when we
> merge failing tests that leads to people merging even more failing tests...
>
> On Feb 13, 2024, at 8:41 AM, Branimir Lambov  wrote:
>
> Hi All,
>
> CASSANDRA-18753 introduces a second set of defaults (in a separate
> "cassandra_latest.yaml") that enable new features of Cassandra. The
> objective is two-fold: to be able to test the database in this
> configuration, and to point potential users that are evaluating the
> technology to an optimized set of defaults that give a clearer picture of
> the expected performance of the database for a new user. The objective is
> to get this configuration into 5.0 to have the extra bit of confidence that
> we are not releasing (and recommending) options that have not gone through
> thorough CI.
>
> The implementation has already gone through review, but I'd like to get
> people's opinion on two things:
> - There are currently a number of test failures when the new options are
> selected, some of which appear to be genuine problems. Is the community
> okay with committing the patch before all of these are addressed? This
> should prevent the introduction of new failures and make sure we don't
> release before clearing the existing ones.
> - I'd like to get an opinion on what's suitable wording and documentation
> for the new defaults set. Currently, the patch proposes adding the
> following text to the yaml (see
> https://github.com/apache/cassandra/pull/2896/files):
> # NOTE:
> #   This file is provided in two versions:
> #     - cassandra.yaml: Contains configuration defaults for a "compatible"
> #       configuration that operates using settings that are backwards-compatible
> #