Re: Cassandra PMC Chair Rotation, 2024 Edition

2024-06-20 Thread Jacek Lewandowski
Congratulations!!!


- - -- --- -  -
Jacek Lewandowski


Thu, 20 Jun 2024 at 21:00 lorinapoland wrote:

> Congrats, Dinesh!
>
>
>
> Sent from my Verizon, Samsung Galaxy smartphone
>
>
>  Original message 
> From: Josh McKenzie 
> Date: 6/20/24 08:51 (GMT-08:00)
> To: dev 
> Subject: Cassandra PMC Chair Rotation, 2024 Edition
>
> Another PMC Chair baton pass incoming! On behalf of the Apache Cassandra
> Project Management Committee (PMC) I would like to welcome and congratulate
> our next PMC Chair Dinesh Joshi (djoshi).
>
> Dinesh has been a member of the PMC for a few years now and many of you
> likely know him from his thoughtful, measured presence on many of our
> collective discussions as we've grown and evolved over the past few years.
>
> I appreciate the project trusting me as liaison with the board over the
> past year and look forward to supporting Dinesh in the role in the future.
>
> Repeating Mick (repeating Paulo's) words from last year: The chair is an
> administrative position that interfaces with the Apache Software Foundation
> Board, by submitting regular reports about project status and health. Read
> more about the PMC chair role on Apache projects:
> - https://www.apache.org/foundation/how-it-works.html#pmc
> - https://www.apache.org/foundation/how-it-works.html#pmc-chair
> - https://www.apache.org/foundation/faq.html#why-are-PMC-chairs-officers
>
> The PMC as a whole is the entity that oversees and leads the project and
> any PMC member can be approached as a representative of the committee. A
> list of Apache Cassandra PMC members can be found on:
> https://cassandra.apache.org/_/community.html
>


Re: [DISCUSS] Stream Pipelines on hot paths

2024-05-31 Thread Jacek Lewandowski
Uses of them in tests are ok, I think. We have a separate checkstyle file
for the test code.

- - -- --- -  -
Jacek Lewandowski


Fri, 31 May 2024 at 19:14 David Capwell wrote:

> I am cool with forbidding, with a callout that tests are ok.  I am cool with
> forbidding in tests as well, but that's just for consistency reasons more than
> anything.
>
> On May 31, 2024, at 8:12 AM, Brandon Williams  wrote:
>
>
> On Fri, May 31, 2024 at 9:35 AM Abe Ratnofsky  wrote:
>
>> +1 to forbidding Stream usage entirely; the convenience of using them
>> outside of hot paths is less than the burden of figuring out whether or not
>> a particular path is hot.
>>
>
> I think I have most frequently appreciated them in tests, which I think we
> could except, since these are categorically not in the hot path.
>
> Kind Regards,
> Brandon
>
>
>
>


Re: [DISCUSS] Stream Pipelines on hot paths

2024-05-31 Thread Jacek Lewandowski
+1 to either forbidding them entirely or not at all.


Fri, 31 May 2024, 16:05 user Benjamin Lerer wrote:

> For me the definition of hot path is too vague. We have had arguments with
> Berenguer about it multiple times, and in the end it is more a waste of time
> than anything else. If we are truly concerned about stream efficiency then we
> should simply forbid them. That will avoid lengthy discussions about what
> constitutes the hot path and what does not.
>
> On Fri, 31 May 2024 at 11:08, Berenguer Blasi wrote:
>
>> +1 on avoiding streams in hot paths
>> On 31/5/24 9:48, Benedict wrote:
>>
>> My concept of hot path is simply anything we can expect to be called
>> frequently enough in normal operation that it might show up in a profiler.
>> If it’s a library method then it’s reasonable to assume it should be able
>> to be used in a hot path unless clearly labelled otherwise.
>>
>> In my view this includes things that might normally be masked by caching
>> but under supported workloads may not be - such as query preparation.
>>
>> In fact, I’d say the default assumption should probably be that a method
>> is “in a hot path” unless there’s good argument they aren’t - such as that
>> the operation is likely to be run at some low frequency and the slow part
>> is not part of any loop. Repair setup messages perhaps aren’t a hot path
>> for instance (unless many of them are sent per repair), but validation
>> compaction or merkle tree construction definitely is.
>>
>> I think it’s fine to not have perfect agreement about edge cases, but if
>> anyone in a discussion thinks something is a hot path then it should be
>> treated as one IMO.
>>
>> On 30 May 2024, at 18:39, David Capwell 
>>  wrote:
>>
>>  As a general statement I agree with you (same for String.format as
>> well), but one thing to call out is that it can be hard to tell what is the
>> hot path and what isn’t.  When you are doing background work (like repair)
>> it's clear, but when touching something internal it can be hard to tell;
>> this can also be hard with shared code as it gets authored outside the hot
>> path then later used in the hot path…
>>
>> Also, what defines hot path?  Is this user facing only?  What about
>> Validation/Streaming (stuff processing a large dataset)?
>>
>> On May 30, 2024, at 9:29 AM, Benedict 
>>  wrote:
>>
>> Since it’s related to the logging discussion we’re already having, I have
>> seen stream pipelines showing up in a lot of traces recently. I am
>> surprised; I thought it was understood that they shouldn’t be used on hot
>> paths as they are not typically as efficient as old skool for-each
>> constructions done sensibly, especially for small collections that may
>> normally take zero or one items.
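The contrast Benedict describes can be sketched as follows. This is a purely illustrative, hypothetical example (not Cassandra code, and all names are invented): a stream pipeline allocates pipeline objects, a collector, and lambda captures on every call, while the equivalent for-each loop allocates only the result list, which matters for collections that usually hold zero or one items.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class HotPathExample {
    // Stream pipeline: concise, but each call pays the pipeline-setup cost.
    static List<String> upperNonEmptyStream(List<String> names) {
        return names.stream()
                    .filter(n -> !n.isEmpty())
                    .map(String::toUpperCase)
                    .collect(Collectors.toList());
    }

    // Equivalent for-each version: only the result list is allocated.
    static List<String> upperNonEmptyLoop(List<String> names) {
        List<String> result = new ArrayList<>(names.size());
        for (String n : names)
            if (!n.isEmpty())
                result.add(n.toUpperCase());
        return result;
    }

    public static void main(String[] args) {
        List<String> in = List.of("a", "", "b");
        System.out.println(upperNonEmptyStream(in)); // [A, B]
        System.out.println(upperNonEmptyLoop(in));   // [A, B]
    }
}
```

Both versions compute the same result; the argument above is only about the per-call overhead of the stream form on frequently executed paths.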
>>
>> I would like to propose forbidding the use of streams on hot paths
>> without good justification that the benefit outweighs the cost.
>>
>> It looks like it was nominally agreed two years ago that we would include
>> words to this effect in the code style guide, but I forgot to include them
>> when I transferred the new contents from the Google Doc proposal. So we
>> could just include the “Performance” section that was meant to be included
>> at the time.
>>
>> lists.apache.org
>>
>>
>> On 30 May 2024, at 13:33, Štefan Miklošovič 
>>  wrote:
>>
>> 
>> I see the feedback is overall positive. I will merge that and I will
>> improve the documentation on the website along with what Benedict suggested.
>>
>> On Thu, May 30, 2024 at 10:32 AM Mick Semb Wever  wrote:
>>
>>>
>>>
>>>
 Based on these findings, I went through the code and I have
 incorporated these rules and I rewrote it like this:

 1) no wrapping in "if" if we are not logging more than 2 parameters.
 2) rewritten log messages to not contain any string concatenation,
 moving it all to placeholders ({}).
 3) wrap it in "if" if we need to execute method(s) on parameter(s)
 that are resource-consuming.
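A self-contained sketch of rules 2 and 3 above. The real code is assumed to use slf4j's `{}` placeholders; a minimal stand-in Logger replaces slf4j here only so the example runs on its own, and the counter shows the expensive call is skipped when the level is disabled.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LoggingGuardExample {
    static final AtomicInteger expensiveCalls = new AtomicInteger();

    // Stand-in for an slf4j-style logger, with DEBUG disabled as in production.
    static class Logger {
        boolean isDebugEnabled() { return false; }
        void debug(String fmt, Object... args) {
            if (isDebugEnabled())
                System.out.println(fmt); // placeholder substitution elided
        }
    }

    static String expensiveSummary() {
        expensiveCalls.incrementAndGet(); // track whether we paid the cost
        return "huge summary";
    }

    public static void main(String[] args) {
        Logger logger = new Logger();

        // Rule 2: placeholders, no string concatenation. Cheap arguments need
        // no guard - when the level is disabled, nothing is formatted.
        int a = 1, b = 2;
        logger.debug("values {} {}", a, b);

        // Rule 3: guard when an argument requires a resource-consuming call.
        if (logger.isDebugEnabled())
            logger.debug("state {}", expensiveSummary());

        // With DEBUG disabled, the expensive method was never invoked.
        System.out.println("expensive calls: " + expensiveCalls.get());
    }
}
```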

>>>
>>>
>>> +1
>>>
>>>
>>> It's a shame slf4j botched it with lambdas, their 2.0 fluent api doesn't
>>> impress me.
>>>
>>
>>


Re: CCM and CASSANDRA_USE_JDK11

2024-05-24 Thread Jacek Lewandowski
BTW. I've created a simple workflow for GitHub Actions -
https://github.com/riptano/ccm/pull/771 - feel free to review

- - -- --- -  -
Jacek Lewandowski


Re: CCM and CASSANDRA_USE_JDK11

2024-05-24 Thread Jacek Lewandowski
Thank you for all the opinions. That is useful for future work on CCM.

When I implemented the changes that caused recent headaches, I wasn't aware
that the CCM code was so patchworked, which resulted in many surprises. I
apologize for that. Anyway, there is no reason to sit and complain instead
of fixing what I already found.

There are some unit tests attached to CCM, and I wonder whether they are
ever run before merging a patch because I'm unaware of any CI. Along with
my recent patches, I've implemented quite comprehensive tests verifying
whether the env updating function works as "expected", but it turned out
the problems were outside that function. I put the word
"expected" in quotes because precisely defined contracts for CCM do not seem to
exist. I'd like us to invest in unit tests and make them the contract
guards. It is tough to verify whether a patch would cause problems in some
environments; we usually have to run all workloads to confirm that, and
this would not work in the future if we expect the community to get more
involved. This recent incident was an excellent example of such a problem -
I'm really thankful to Mick and Ariel for running all the Apache and Apple
workloads to verify the patch, which consumed much of their time, and I
cannot expect such involvement in the future.

To the point, let's pay attention to the unit tests, make sure they are
mocking Cassandra instead of running real programs so that they are fast,
and add some CI—if they are fast unit tests, I bet we can rely on some free
tier of GitHub Actions. Encoding the expectations we assumed when
implementing the CI systems for Cassandra will make it less likely that we
break something accidentally.

In the end, I feel we strayed off the topic a bit - my question was quite
concrete - I'd like to remove the CASSANDRA_USE_JDK11 knob from CCM - CCM
should set that variable appropriately for Cassandra 4.x so that the CCM user
does not have to bother. However, CCM should not use it to decide which Java version
to use. I'm unaware of any release cycle of CCM versions anywhere. Perhaps
we should do the following - tag a new version before the change and then
add the proposed change.

There are also other problems related to env setup - for example, when a
user or the dtest framework wants to force a certain Java version, it is
honored only for running a Cassandra node - it is not applied for running
nodetool or any other command line tool. Therefore, there is a broader
question about when the explicit Java version should be set - it feels like
the correct approach would be to set it up when a node is created rather
than when it is started so that the selection applies to running the server
and all the commands. This would simplify things significantly - instead of
resolving env and checking Java distributions each time we are about to run
a node or a tool - resolve the required env changes once we create a node
or when we update the installation directory, which we do when testing
upgrades. Such simplification would also remove some clutter from the logs.
Remember the whole environment being logged, frequently and twice, when
running a node?

Can we make this discussion conclusive?

Thanks!

- - -- --- -  -
Jacek Lewandowski


Fri, 24 May 2024 at 20:55 Josh McKenzie wrote:

> The scripts that are in cassandra-builds seem like a starting point for
> converging different CI systems so that they run the same set of tests in
> as similar environments as possible
>
> Yeah, I took a superset of circle and ASF tests to try and run
> :allthethings:. Part of how the checkstyle dependency check got in the way
> too, since we weren't running that on ASF CI. :)
>
> Strong +1 to more convergence on what we're running in CI for sure.
>
> On Fri, May 24, 2024, at 11:59 AM, Ariel Weisberg wrote:
>
> Hi,
>
> There is definitely a mismatch between how the full range of dtests work
> and the direction CCM is going in and we have some difficulty getting those
> to match. I fully empathize with several of those CI systems not being
> publicly visible/accessible, and the behavior of upgrade paths being
> absolutely inscrutable relative to the environment variables that are set.
>
> I am happy to volunteer to test things in advance on Apple's CI. I'll also
> try to get on top of responding faster :-)
>
> The window where reverting is useful is slightly past now that all the
> issues I am aware of have been fixed, but in the future I think the burden
> for revert might need to be lower. It's tough, though, because putting the
> burden on ASF for non-ASF CI is not necessarily a given.
>
> There is a big gap between CI systems where how they invoke the dtests
> determines the exact set of tests they run and how they invoke CCM (and
> which CCM bugs they expose). I really don't like this approach including
> relying on environment var

CCM and CASSANDRA_USE_JDK11

2024-05-23 Thread Jacek Lewandowski
Hi,

When starting Cassandra nodes, CCM uses the current env Java distribution
(defined by the JAVA_HOME env variable). This behavior is overridden in
three cases:

- Java version is not supported by the selected Cassandra distribution - in
which case, CCM looks for supported Java distribution across JAVAx_HOME env
variables

- Java version is specified explicitly (--jvm-version arg or jvm_version
param if used in Python)

- CASSANDRA_USE_JDK11 is defined in env, in which case, for Cassandra 4.x,
CCM forces the use of JDK 11 only

I want to ask you guys whether you are okay with removing the third
exception. If we remove it, Cassandra 4.x will not be treated in any
special way—CCM will use the current Java version, so if it is Java 11, it
will use Java 11 (and automatically set CASSANDRA_USE_JDK11), and if it is
Java 8, it will use Java 8 (and automatically unset CASSANDRA_USE_JDK11).

I think there is no need for CCM to use CASSANDRA_USE_JDK11 to decide
which Java version to use. It adds complexity, makes Cassandra 4.x behave
differently from other Cassandra versions, and actually provides no value
at all: if our env is configured for Java 11, CASSANDRA_USE_JDK11 must be
set, and if not, it must be unset. Therefore, CCM can rely solely on the
current Java version and exclude the existence of CASSANDRA_USE_JDK11
from the Java version selection process.
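The proposed selection order can be sketched as follows. CCM itself is Python, so this Java sketch is purely illustrative, and every name in it (the method, the version sets, the paths) is hypothetical: an explicit --jvm-version wins, then the current JAVA_HOME if the Cassandra version supports it, then a scan of JAVAx_HOME variables, with no special-casing of CASSANDRA_USE_JDK11.

```java
import java.util.List;
import java.util.Map;

public class JavaHomeSelection {
    static String resolveJavaHome(Map<String, String> env,
                                  List<Integer> supportedVersions,
                                  Integer explicitVersion,
                                  int currentJavaVersion) {
        // 1. An explicit request (--jvm-version / jvm_version) always wins.
        if (explicitVersion != null)
            return env.get("JAVA" + explicitVersion + "_HOME");
        // 2. Otherwise use the current env Java (JAVA_HOME) if the selected
        //    Cassandra distribution supports it.
        if (supportedVersions.contains(currentJavaVersion))
            return env.get("JAVA_HOME");
        // 3. Otherwise scan JAVAx_HOME variables for a supported version.
        for (int v : supportedVersions) {
            String home = env.get("JAVA" + v + "_HOME");
            if (home != null)
                return home;
        }
        throw new IllegalStateException("no supported JDK found");
    }

    public static void main(String[] args) {
        Map<String, String> env = Map.of("JAVA_HOME", "/opt/jdk17",
                                         "JAVA8_HOME", "/opt/jdk8",
                                         "JAVA11_HOME", "/opt/jdk11");
        // Cassandra 4.x supports 8 and 11; the current env runs Java 17,
        // so the resolver falls back to scanning JAVAx_HOME.
        System.out.println(resolveJavaHome(env, List.of(8, 11), null, 17)); // /opt/jdk8
        // An explicit --jvm-version 11 overrides everything.
        System.out.println(resolveJavaHome(env, List.of(8, 11), 11, 17));   // /opt/jdk11
    }
}
```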

WDYT?

- - -- --- -  -
Jacek Lewandowski


Re: [DISCUSS] ccm as a subproject

2024-05-16 Thread Jacek Lewandowski
+1 (my personal opinion)

How to deal with the DSE-supporting code is a separate discussion IMO

- - -- --- -  -
Jacek Lewandowski


Thu, 16 May 2024 at 10:21 Berenguer Blasi wrote:

> +1 ccm is super useful
> On 16/5/24 10:09, Mick Semb Wever wrote:
>
>
>
> On Wed, 15 May 2024 at 16:24, Josh McKenzie  wrote:
>
>> Right now ccm isn't formally a subproject of Cassandra or under
>> governance of the ASF. Given it's an integral components of our CI as well
>> as for local testing for many devs, and we now have more experience w/our
>> muscle on IP clearance and ingesting / absorbing subprojects where we can't
>> track down every single contributor to get an ICLA, seems like it might be
>> worth revisiting the topic of donation of ccm to Apache.
>>
>> For what it's worth, Sylvain originally and then DataStax after transfer
>> have both been incredible and receptive stewards of the projects and repos,
>> so this isn't about any response to any behavior on their part.
>> Structurally, however, it'd be better for the health of the project(s)
>> long-term to have ccm promoted in. As far as I know there was strong
>> receptivity to that donation in the past but the IP clearance was the
>> primary hurdle.
>>
>> Anyone have any thoughts for or against?
>>
>> https://github.com/riptano/ccm
>>
>
>
>
> We've been working on this along with the python-driver (just haven't
> raised it yet).  It is recognised, like the python-driver, as a key
> dependency that would best be in the project.
>
> Obtaining the CLAs should be much easier; the contributors to ccm are less
> diverse, being mostly people we know already.
>
> We do still have the issues of DSE-supporting code in it, as we do with
> the drivers.  I doubt any of us strongly object to it: there's no trickery
> happening here on the user; but we should be aware of it and have a rough
> direction sketched out for when someone else comes along wanting to add
> support for their proprietary product.  We also don't want to push
> downstream users into creating their own forks either.
>
> Great to see general consensus (so far) in receiving it :)
>
>
>
>


Re: Schema Disagreement Issue for Cassandra 4.1

2024-04-05 Thread Jacek Lewandowski
Cheng, as far as I remember, CASSANDRA-18291 was a fix for a problem we
observed with a driver, which could notice the updated schema
version before it was applied.

It should be easy to implement a jvm-dtest for the problem you mentioned -
would you like to provide such a test?

Thanks
- - -- --- -  -
Jacek Lewandowski


Tue, 2 Apr 2024 at 20:26 Cheng Wang via dev wrote:

> And it seems that the main fix for the
> https://issues.apache.org/jira/browse/CASSANDRA-18291 is to change the
> order of updating the schema version in the system.local table.
> Before the patch, it was at the beginning of
> the Schema::mergeAndUpdateVersion(), vs. with the patch, it is moved to the
> end of the method.
> Jacek - can you elaborate a little more on why the order matters? I am just
> wondering if there might be race conditions that cause the schema
> disagreement.
>
> Cheng
>
> On Tue, Apr 2, 2024 at 10:13 AM Cheng Wang  wrote:
>
>> Thanks Jacek,
>>
>> In our C* version 4.1.1.1 we have your change
>> https://issues.apache.org/jira/browse/CASSANDRA-17044
>> However, I noticed that you had another fix
>> https://issues.apache.org/jira/browse/CASSANDRA-18291 that we didn't
>> pick up yet.
>> Do you think it might be the cause? Could you elaborate a little more on
>> the CASSANDRA-18291?
>>
>> Thanks
>> Cheng
>>


Re: Schema Disagreement Issue for Cassandra 4.1

2024-04-02 Thread Jacek Lewandowski
Please cc me when you create a ticket

Mon, 1 Apr 2024, 14:13 user Bowen Song via dev <dev@cassandra.apache.org> wrote:

> It sounds worthy of a Jira ticket.
>
> On 01/04/2024 06:23, Cheng Wang via dev wrote:
> > Hello,
> >
> > I have recently encountered a problem concerning schema disagreement
> > in Cassandra 4.1. It appears that the schema versions do not reconcile
> > as expected.
> >
> > The issue can be reproduced by following these steps:
> > - Disable the gossip in Node A.
> > - Make a schema change in Node B, such as creating a new table.
> > - Re-enable the gossip in Node A.
> >
> > My expectation was that the schema versions would eventually
> > reconcile. However, in Cassandra 4.1, it seems that reconciliation
> > hangs indefinitely unless I reboot the node. Interestingly, when
> > performing the same steps in Cassandra 3.0, the schema version
> > synchronizes within about a minute.
> >
> > Has anyone else experienced this issue with Cassandra 4.x? It appears
> > to me that this could be a regression in the 4.x series.
> >
> > Any insights or suggestions would be greatly appreciated.
> >
> > Thanks,
> > Cheng
>


Re: Default table compression defined in yaml.

2024-03-21 Thread Jacek Lewandowski
Only indented items below "sstable" belong to "sstable". It is commented
out by default to make it clear that it is not required and the default
values apply.

There are a number of sstable parameters which are historically spread
across the yaml with no structure. The point is that we should not add to
that mess, and should instead try to group the new stuff.

"default_compression" under the "sstable" key sounds good to me.
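For illustration, a hypothetical sketch of that grouping, following the nesting convention of the existing "sstable" section quoted below. The parameter names and values here are illustrative only, not a final schema:

```yaml
# Hypothetical layout only - names and values are illustrative.
sstable:
  selected_format: big
  default_compression:
    class_name: LZ4Compressor
    parameters:
      chunk_length: 16KiB
```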

- - -- --- -  -
Jacek Lewandowski


Thu, 21 Mar 2024 at 08:32 Claude Warren, Jr via dev <dev@cassandra.apache.org> wrote:

> Jacek,
>
> I am a bit confused here.  I find a key for "sstable" in the yaml but it
> is commented out by default.  There are a number of options under it that
> are commented out and then one that is not and then the
> "default_compaction" section, which I assume is supposed to apply to the
> "sstable" section.  Are you saying that the "sstable_compression" section
> that we introduced should be placed as a child to the "sstable" key (and
> probably renamed to "default_compression")?
>
> I have included the keys from the trunk yaml below with non-key comments
> excluded.  The way I read it either the "sstable" key is not required and a
> user can just uncomment "column_index_size"; or "column_index_cache_size"
> is not really used because it would be under
> "sstable/column_index_cache_size" in the Config; or the "sstable:" is only
> intended to be a visual break / section for the human editor.
>
> Can you or someone clarify this for me?
>
> #sstable:
> #  selected_format: big
> # column_index_size: 4KiB
> column_index_cache_size: 2KiB
> # default_compaction:
> #   class_name: SizeTieredCompactionStrategy
> #   parameters:
> # min_threshold: 4
> #     max_threshold: 32
>


Re: Default table compression defined in yaml.

2024-03-20 Thread Jacek Lewandowski
Compression params for sstables should be under the "sstable" key.


- - -- --- -  -
Jacek Lewandowski


Tue, 19 Mar 2024 at 13:10 Ekaterina Dimitrova wrote:

> Any new settings are expected to be added in the new format
>
> On Tue, 19 Mar 2024 at 5:52, Bowen Song via dev 
> wrote:
>
>> I believe the `foobar_in_kb: 123` format in the cassandra.yaml file is
>> deprecated, and the new format is `foobar: 123KiB`. Is there a need to
>> introduce new settings entries with the deprecated format only to be
>> removed at a later version?
>>
>>
>> On 18/03/2024 14:39, Claude Warren, Jr via dev wrote:
>>
>> After much work by several people, I have pulled together the changes to
>> define the default compression in the cassandra.yaml file and have created
>> a pull request [1].
>>
>> If you are interested in this topic, please take a look at the changes
>> and give at least a cursory review.
>>
>> [1]  https://github.com/apache/cassandra/pull/3168
>>
>> Thanks,
>> Claude
>>
>>


[DISCUSS] Types compatibility

2024-03-08 Thread Jacek Lewandowski
Hey,

I was going through the different kinds of types compatibility and found
that I need help in understanding them. Therefore, I tried to figure out
what is used where, in particular, what means that two types are
"compatible", "valueCompatible" or "serializationCompatible".

I found https://issues.apache.org/jira/browse/CASSANDRA-14476 as a place
where I can put my work. The ticket includes a comment with some problems
I've seen. I've prepared a pull request that has not fixed anything yet,
but it adds a bunch of tests that verify whether the assumed properties of
different types are confirmed.

There are some apparent problems with missing compatibilities of primitive
types. However, for me, there are more critical questions:

1. "isCompatibleWith" is used when replacing function/aggregate to check if
the new return type is more generic than the old one. To me, it is wrong,
and the condition should be the opposite - the old return type should be
more generic than the new one
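The substitution argument behind point 1 can be illustrated with plain Java (not Cassandra's type API; the interface names here are invented): a replacement whose return type is narrower keeps existing callers working, exactly like Java's covariant override rule, while a wider return type would break them.

```java
public class ReturnTypeCovariance {
    interface OldFn { Number apply(); }

    // Legal: the replacement narrows Number -> Integer. Every existing
    // caller that expects a Number still works when handed an Integer.
    interface NarrowedFn extends OldFn { @Override Integer apply(); }

    // Widening would break callers, and does not even compile as an override:
    // interface WidenedFn extends OldFn { @Override Object apply(); }

    public static void main(String[] args) {
        NarrowedFn g = () -> 42;
        OldFn f = g;              // a narrowed replacement is substitutable
        System.out.println(f.apply());

        // The same check at the class level: old.isAssignableFrom(new)
        // must hold for the replacement to be safe.
        System.out.println(Number.class.isAssignableFrom(Integer.class)); // true
        System.out.println(Integer.class.isAssignableFrom(Number.class)); // false
    }
}
```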

2. For all multi-cell types, "isCompatibleWith" should be equivalent to
their frozen version, but it is not. It should be because, in reality, the
"compareCustom" method does not consider whether the type is frozen.

Can anybody answer those questions?

Thanks,
Jacek


Re: Welcome Brad Schoening as Cassandra Committer

2024-02-22 Thread Jacek Lewandowski
Congrats Brad!


- - -- --- -  -
Jacek Lewandowski


Thu, 22 Feb 2024 at 01:29 Štefan Miklošovič wrote:

> Congrats Brad, great work in the Python department :)
>
> On Wed, Feb 21, 2024 at 9:46 PM Josh McKenzie 
> wrote:
>
>> The Apache Cassandra PMC is pleased to announce that Brad Schoening has
>> accepted
>> the invitation to become a committer.
>>
>> Your work on the integrated python driver, launch script environment, and
>> tests
>> has been a big help to many. Congratulations and welcome!
>>
>> The Apache Cassandra PMC members
>>
>


Re: [Discuss] Improving CI workflows

2024-02-20 Thread Jacek Lewandowski
Let's not start another thread and discuss that directly on the ticket.


- - -- --- -  -
Jacek Lewandowski


Tue, 20 Feb 2024 at 11:31 Berenguer Blasi wrote:

> Hi All,
>
> after the recent discussion that came up in CASSANDRA-18753's discuss
> thread and given it is a long aspiration of mine it's probably a good
> time to thread this needle.
>
> Jacek was kind enough to open CASSANDRA-19406 for us, so I would suggest
> that everybody interested watch those tickets. I will prepare some
> preliminary work with CASSANDRA-19414, which I hope is going to be
> pretty non-controversial: during my hacky, dirty dev iterations (this is
> not a pre-commit substitute), give me the smallest possible test matrix,
> i.e.:
>
> https://app.circleci.com/pipelines/github/bereng/cassandra/1164/workflows/3a47c9ef-6456-4190-b5a5-aea2aff641f1
> This will be 100% opt-in
>
> With the experience gained from that, I will then gather all viewpoints
> and come up with a proposal for pre-commit.
>
> Hope it makes sense.
>
>


Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-16 Thread Jacek Lewandowski
We should conclude this discussion by answering Branimir's original
question. *I vote for merging that and exposing issues to the CI.*

For pre-commit optimization I've opened
https://issues.apache.org/jira/browse/CASSANDRA-19406 epic and we should
add exact tasks there to make this valuable discussion result in some
concrete actions. Then, we can discuss each task in a more organized way.

czw., 15 lut 2024 o 21:29 Štefan Miklošovič 
napisał(a):

> I love David's idea of making this dual config stuff directly part
> of the tests. I'll just leave this here, where I quickly put a super
> primitive runner together
>
>
> https://github.com/smiklosovic/cassandra/commit/693803772218b52c424491b826c704811d606a31
>
> We could just run by default with one config and annotate it with all
> configs if we think this is crucial to test in both scenarios.
>
> Anyway, happy to expand on this but I do not want to block any progress in
> the ticket, might come afterwards, just showing what is possible.
>
> On Thu, Feb 15, 2024 at 7:59 PM David Capwell  wrote:
>
>> This thread got large quick, yay!
>>
>> is there a reason all guardrails and reliability (aka repair retries)
>> configs are off by default?  They are off by default in the normal config
>> for backwards compatibility reasons, but if we are defining a config saying
>> what we recommend, we should enable these things by default IMO.
>>
>> This is one more question to be answered by this discussion. Are there
>> other options that should be enabled by the "latest" configuration? To what
>> values should they be set?
>> Is there something that is currently enabled that should not be?
>>
>>
>> Very likely, we should try to figure that out.  We should also answer how
>> conservative we want to be by default.  There are many configs we need
>> to flesh out here, glad to help with the configs I authored (prob best for
>> JIRA rather than this thread)
>>
>>
>> Should we merge the configs breaking these tests?  No…. When we have
>> failing tests people do not spend the time to figure out if their logic
>> caused a regression and merge, making things more unstable… so when we
>> merge failing tests that leads to people merging even more failing tests...
>>
>> In this case this also means that people will not see at all failures
>> that they introduce in any of the advanced features, as they are not tested
>> at all. Also, since CASSANDRA-19167 and 19168 already have fixes, the
>> non-latest test suite will remain clean after merge. Note that these two
>> problems demonstrate that we have failures in the configuration we ship
>> with, because we are not actually testing it at all. IMHO this is a problem
>> that we should not delay fixing.
>>
>>
>> I am not arguing we should not get this into CI, but more we should fix
>> the known issues before getting into CI… its what we normally do, I don’t
>> see a reason to special case this work.
>>
>> I am 100% cool blocking 5.0 on these bugs found (even if they are test
>> failures), but don’t feel we should enable in CI until these issues are
>> resolved; we can add the yaml now, but not the CI pipelines.
>>
>>
>> 1) If there’s an “old compatible default” and “latest recommended
>> settings”, when does the value in “old compatible default” get updated?
>> Never?
>>
>>
>> How about replacing cassandra.yaml with cassandra_latest.yaml on trunk
>> when cutting cassandra-6.0 branch? Any new default changes on trunk go to
>> cassandra_latest.yaml.
>>
>>
>> I feel it's dangerous to define this at the file level; we should do it at
>> the config level… I personally see us adding new features disabled by
>> default in cassandra.yaml and the recommended values in
>> cassandra_latest.yaml… If I add a config in 5.1.2, should it get enabled by
>> default in 6.0?  I don’t feel that's wise.
>>
>> Maybe it makes sense to annotate the configs with the target version for
>> the default change?
>>
>> Let's distinguish the packages of tests that need to be run with CDC
>> enabled / disabled, with commitlog compression enabled / disabled, tests
>> that verify sstable formats (mostly io and index I guess), and leave other
>> parameters set as with the latest configuration - this is the easiest way I
>> think.
>>
>>
>> Yes please!  I really hate having a pipeline per config, we should
>> annotate this somehow in the tests that matter… junit can param the tests
>> for us so we cover the different configs the test supports… I have written
>> many tests that are costly and run on all these other pipelines but have 0
>> change in the config… just wasting resources rerunning…
>>
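The parameterization David describes could be sketched roughly like this. This is an illustrative Python sketch, not the project's actual (Java/JUnit) test API; the config names and options are made up. The point is that a test declares which options it depends on, and configs that are identical in those options are collapsed instead of rerun:

```python
# Sketch: run one test body against several named configurations instead of
# cloning a whole CI pipeline per configuration. Config names and options
# here are illustrative placeholders.

CONFIGS = {
    "default": {"commitlog_compression": False, "cdc": False},
    "latest": {"commitlog_compression": False, "cdc": False, "memtable": "trie"},
    "compressed": {"commitlog_compression": True, "cdc": False},
}

def run_parameterized(test_fn, relevant_keys):
    """Run test_fn once per config whose relevant options actually differ.

    Configs identical in the options the test cares about are collapsed,
    so unchanged tests are not rerun for no reason.
    """
    seen, results = set(), {}
    for name, cfg in CONFIGS.items():
        fingerprint = tuple(sorted((k, cfg.get(k)) for k in relevant_keys))
        if fingerprint in seen:
            continue  # same effective config already covered
        seen.add(fingerprint)
        results[name] = test_fn(cfg)
    return results

# A test that only depends on commitlog compression runs twice, not three times:
# "latest" is skipped because it matches "default" in that dimension.
results = run_parameterized(lambda cfg: cfg.get("commitlog_compression", False),
                            ["commitlog_compression"])
```

A real implementation would live in the test framework (e.g. as a JUnit parameterized-test source), but the dedup logic is the part that saves the wasted reruns.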
>> Pushing this to the test also is a better author/maintainer experience…
>> running the test in your IDE and seeing all the params and their results is
>> so much better than monkeying around with yaml files and ant…. My repair
>> simulation tests have a hack flag to try to switch the yaml to make it
>> easier to test against the other configs and I loathe it so much…
>>
>> To 

Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-15 Thread Jacek Lewandowski
Brandon, that should be doable with the current filters I think - that is,
select only those tests which do not support vnodes. Do you know about such
in-jvm dtests as well?

- - -- --- -  -
Jacek Lewandowski


czw., 15 lut 2024 o 18:21 Brandon Williams  napisał(a):

> On Thu, Feb 15, 2024 at 1:10 AM Jacek Lewandowski
>  wrote:
> > For dtests we have vnodes/no-vnodes, offheap/onheap, and nothing about
> other stuff. To me running no-vnodes makes no sense because no-vnodes is
> just a special case of vnodes=1. On the other hand offheap/onheap buffers
> could be tested in unit tests. In short, I'd run dtests only with the
> default and latest configuration.
>
> I largely agree that no-vnodes isn't useful, but there are some
> non-vnode operations like moving a token that don't work with vnodes
> and still need to be tested.  I think we could probably get quick
> savings by breaking out the @no_vnodes tests to a separate suite run
> so we aren't completely doubling our effort for little gain with every
> commit.
>
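The split Brandon suggests can be sketched as a simple partition of the collected tests on a marker. The `no_vnodes` attribute below is a stand-in for the dtest suite's actual annotation mechanism; this is an assumption-laden illustration, not the real test runner:

```python
# Sketch: partition a test collection into a vnodes suite and a small
# no-vnodes suite, so the handful of single-token operations (e.g. moving
# a token) still run without doubling the whole matrix.

def test_read_repair():
    pass  # placeholder test that works with vnodes

def test_move_token():
    pass  # placeholder for a non-vnode-only operation

test_move_token.no_vnodes = True  # stand-in for a @no_vnodes annotation

def split_suites(tests):
    vnodes = [t for t in tests if not getattr(t, "no_vnodes", False)]
    no_vnodes = [t for t in tests if getattr(t, "no_vnodes", False)]
    return vnodes, no_vnodes

vnodes_suite, no_vnodes_suite = split_suites([test_read_repair, test_move_token])
```

Only the (small) no-vnodes suite then needs a separate run, instead of duplicating every test for both topologies.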


Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-15 Thread Jacek Lewandowski
Great summary Josh,

>
>- JDK-based test suites on highest supported jdk using other config
>
> Do you mean a smoke test suite by that ^ ?

- - -- --- -  -----
Jacek Lewandowski


czw., 15 lut 2024 o 18:12 Josh McKenzie  napisał(a):

> Would it make sense to only block commits on the test strategy you've
> listed, and shift the entire massive test suite to post-commit?
>
>
> Lots and lots of other emails
>
>
> ;)
>
> There's an interesting broad question of: What config do we consider
> "recommended" going forward, the "conservative" (i.e. old) or the
> "performant" (i.e. new)? And what JDK do we consider "recommended" going
> forward, the oldest we support or the newest?
>
> Since those recommendations apply for new clusters, people need to qualify
> their setups, and we have a high bar of quality on testing pre-merge, my
> gut tells me "performant + newest JDK". This would impact what we'd test
> pre-commit IMO.
>
> Having been doing a lot of CI stuff lately, some observations:
>
>- Our True North needs to be releasing a database that's free of
>defects that violate our core properties we commit to our users. No data
>loss, no data resurrection, transient or otherwise, due to defects in our
>code (meteors, tsunamis, etc notwithstanding).
>- The relationship of time spent on CI and stability of final full
>*post-commit* runs is asymptotic. It's not even 90/10; we're probably
>somewhere like 98% value gained from 10% of work, and the other 2%
>"stability" (i.e. green test suites, not "our database works") is a
>long-tail slog. Especially in the current ASF CI heterogenous env w/its
>current orchestration.
>- Thus: Pre-commit and post-commit should be different. The following
>points all apply to pre-commit:
>- The goal of pre-commit tests should be some number of 9's of no test
>failures post-commit (i.e. for every 20 green pre-commit we introduce 1
>flake post-commit). Not full perfection; it's not worth the compute and
>complexity.
>- We should *build *all branches on all supported JDK's (8 + 11 for
>older, 11 + 17 for newer, etc).
>- We should *run *all test suites with the *recommended *
>*configuration* against the *highest versioned JDK a branch supports. *And
>we should formally recommend our users run on that JDK.
>- We should *at least* run all jvm-based configurations on the highest
>supported JDK version with the "not recommended but still supported"
>configuration.
>- I'm open to being persuaded that we should at least run jvm-unit
>tests on the older JDK w/the conservative config pre-commit, but not much
>beyond that.
>
> That would leave us with the following distilled:
>
> *Pre-commit:*
>
>- Build on all supported jdks
>- All test suites on highest supported jdk using recommended config
>- Repeat testing on new or changed tests on highest supported JDK
>w/recommended config
>- JDK-based test suites on highest supported jdk using other config
>
> *Post-commit:*
>
>- Run everything. All suites, all supported JDK's, both config files.
>
> With Butler + the *jenkins-jira* integration script
> <https://github.com/apache/cassandra-builds/blob/trunk/jenkins-jira-integration/jenkins_jira_integration.py>(need
> to dust that off but it should remain good to go), we should have a pretty
> clear view as to when any consistent regressions are introduced and why.
> We'd remain exposed to JDK-specific flake introductions and flakes in
> unchanged tests, but there's no getting around the 2nd one and I expect the
> former to be rare enough to not warrant the compute to prevent it.
>
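A rough way to see the pre-commit/post-commit asymmetry described above is to count jobs in each mode. The suite, JDK, and config names below are placeholders chosen to mirror the distilled lists, not an exact inventory of the project's pipelines:

```python
# Sketch: job counts for the distilled pre-commit vs post-commit plans.
SUITES = ["unit", "jvm-dtest", "dtest", "cqlsh"]   # illustrative suite list
JDKS = [11, 17]                                    # supported JDKs for a branch
CONFIGS = ["default", "latest"]                    # the two config files
JVM_SUITES = ["unit", "jvm-dtest"]                 # jvm-based suites only

# Pre-commit: all suites on the highest JDK with the recommended config,
# plus jvm-based suites on the highest JDK with the other config.
pre_commit_runs = len(SUITES) + len(JVM_SUITES)

# Post-commit: everything — all suites, all JDKs, both configs.
post_commit_runs = len(SUITES) * len(JDKS) * len(CONFIGS)
```

Even with these small placeholder lists the post-commit matrix is several times the pre-commit one, which is the compute the proposal avoids spending on every patch.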
> On Thu, Feb 15, 2024, at 10:02 AM, Jon Haddad wrote:
>
> Would it make sense to only block commits on the test strategy you've
> listed, and shift the entire massive test suite to post-commit?  If there
> really is only a small % of times the entire suite is useful this seems
> like it could unblock the dev cycle but still have the benefit of the full
> test suite.
>
>
>
> On Thu, Feb 15, 2024 at 3:18 AM Berenguer Blasi 
> wrote:
>
>
> On reducing circle ci usage during dev while iterating, not with the
> intention to replace the pre-commit CI (yet), we could get away with testing
> only dtests, jvm-dtests, units and cqlsh for a _single_ configuration imo.
> That would greatly reduce usage. I hacked it quickly here for illustration
> purposes:
> https://app.circleci.com/pipelines/github/bereng/cassandra/1164/workflows/3a47c9ef-6456-4190-b5a5-aea2aff641f1
> The good thing is that we have the tooling to dial 

Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-14 Thread Jacek Lewandowski
I fully understand you. Although I have the luxury of using more containers,
I simply feel that rerunning the same code with different configurations
which do not impact that code is just a waste of resources and money.

- - -- --- -  -
Jacek Lewandowski


czw., 15 lut 2024 o 08:41 Štefan Miklošovič 
napisał(a):

> By the way, I am not sure if it is all completely transparent and
> understood by everybody but let me guide you through a typical patch which
> is meant to be applied from 4.0 to trunk (4 branches) to see what it looks
> like.
>
> I do not have the luxury of running CircleCI on 100 containers, I have
> just 25. So what takes around 2.5h for 100 containers takes around 6-7 for
> 25. That is a typical java11_pre-commit_tests for trunk. Then I have to
> provide builds for java17_pre-commit_tests too, that takes around 3-4 hours
> because it just tests less, let's round it up to 10 hours for trunk.
>
> Then I need to do this for 5.0 as well, basically double the time because
> as I am writing this the difference is not too big between these two
> branches. So 20 hours.
>
> Then I need to build 4.1 and 4.0 too, 4.0 is very similar to 4.1 when it
> comes to the number of tests, nevertheless, there are workflows for Java 8
> and Java 11 for each so lets say this takes 10 hours again. So together I'm
> 35.
>
> To schedule all the builds, trigger them, monitor their progress etc is
> work in itself. I am scripting this like crazy to not touch the UI in
> Circle at all and I made my custom scripts which call Circle API and it
> triggers the builds from the console to speed this up because as soon as a
> developer is meant to be clicking around all day, needing to track the
> progress, it gets old pretty quickly.
>
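The kind of scripting described above — triggering builds via the CircleCI API instead of the UI — can be sketched as below. The endpoint shape follows CircleCI's v2 API; the org, repo, and token are placeholders, and the request is only constructed here, not sent:

```python
import json
import urllib.request

# Sketch: build (but do not send) a CircleCI v2 pipeline-trigger request,
# the kind of automation used to avoid clicking around the UI all day.
# org/repo/token values are illustrative placeholders.

def build_trigger_request(org, repo, branch, token):
    url = f"https://circleci.com/api/v2/project/gh/{org}/{repo}/pipeline"
    body = json.dumps({"branch": branch}).encode()
    return urllib.request.Request(
        url,
        data=body,
        headers={"Circle-Token": token, "Content-Type": "application/json"},
        method="POST",
    )

req = build_trigger_request("myorg", "cassandra", "cassandra-5.0", "XXXX")
# urllib.request.urlopen(req) would actually fire the build.
```

Wrapping this in a loop over branches and workflows is what turns a day of clicking into one console command.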
> Thank god this is just a patch from 4.0, when it comes to 3.0 and 3.11
> just add more hours to that.
>
> So all in all, a typical 4.0 - trunk patch is tested for two days at
> least, that's when all is nice and I do not need to rework it and rerun it
> again ... Does this all sound flexible and speedy enough for people?
>
> If we dropped the formal necessity to build various jvms it would
> significantly speed up the development.
>
>
> On Thu, Feb 15, 2024 at 8:10 AM Jacek Lewandowski <
> lewandowski.ja...@gmail.com> wrote:
>
>> Excellent point, I was saying for some time that IMHO we can reduce
>>> to running in CI at least pre-commit:
>>> 1) Build J11 2) build J17
>>> 3) run tests with build 11 + runtime 11
>>> 4) run tests with build 11 and runtime 17.
>>
>>
>> Ekaterina, I was thinking more about:
>> 1) build J11
>> 2) build J17
>> 3) run tests with build J11 + runtime J11
>> 4) run smoke tests with build J17 and runtime J17
>>
>> Again, I don't see value in running build J11 with J17 runtime in addition
>> to J11 runtime - just pick one unless we change something specific to the
>> JVM
>>
>> If we need to decide whether to test the latest or default, I think we
>> should pick the latest because this is actually Cassandra 5.0 defined as a
>> set of new features that will shine on the website.
>>
>> Also - we have configurations which test some features but they are more like
>> dimensions:
>> - commit log compression
>> - sstable compression
>> - CDC
>> - Trie memtables
>> - Trie SSTable format
>> - Extended deletion time
>> ...
>>
>> Currently, what we call the default configuration is tested with:
>> - no compression, no CDC, no extended deletion time
>> - *commit log compression + sstable compression*, no cdc, no extended
>> deletion time
>> - no compression, *CDC enabled*, no extended deletion time
>> - no compression, no CDC, *enabled extended deletion time*
>>
>> This applies only to unit tests of course
>>
>> Then, are we going to test all of those scenarios with the "latest"
>> configuration? I'm asking because the latest configuration is mostly about
>> tries and UCS and has nothing to do with compression or CDC. Then why the
>> default configuration should be tested more thoroughly than latest which
>> enables essential Cassandra 5.0 features?
>>
>> I propose to significantly reduce that stuff. Let's distinguish the
>> packages of tests that need to be run with CDC enabled / disabled, with
>> commitlog compression enabled / disabled, tests that verify sstable formats
>> (mostly io and index I guess), and leave other parameters set as with the
>> latest configuration - this is the easiest way I think.
>>
>> For dtests we have vnodes/no-vnodes, offheap/onheap, and nothing about
>> other

Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-14 Thread Jacek Lewandowski
>
> Excellent point, I was saying for some time that IMHO we can reduce
> to running in CI at least pre-commit:
> 1) Build J11 2) build J17
> 3) run tests with build 11 + runtime 11
> 4) run tests with build 11 and runtime 17.


Ekaterina, I was thinking more about:
1) build J11
2) build J17
3) run tests with build J11 + runtime J11
4) run smoke tests with build J17 and runtime J17

Again, I don't see value in running build J11 with J17 runtime in addition
to J11 runtime - just pick one unless we change something specific to the JVM

If we need to decide whether to test the latest or default, I think we
should pick the latest because this is actually Cassandra 5.0 defined as a
set of new features that will shine on the website.

Also - we have configurations which test some features but they are more like
dimensions:
- commit log compression
- sstable compression
- CDC
- Trie memtables
- Trie SSTable format
- Extended deletion time
...

Currently, what we call the default configuration is tested with:
- no compression, no CDC, no extended deletion time
- *commit log compression + sstable compression*, no cdc, no extended
deletion time
- no compression, *CDC enabled*, no extended deletion time
- no compression, no CDC, *enabled extended deletion time*

This applies only to unit tests of course

Then, are we going to test all of those scenarios with the "latest"
configuration? I'm asking because the latest configuration is mostly about
tries and UCS and has nothing to do with compression or CDC. Then why the
default configuration should be tested more thoroughly than latest which
enables essential Cassandra 5.0 features?

I propose to significantly reduce that stuff. Let's distinguish the
packages of tests that need to be run with CDC enabled / disabled, with
commitlog compression enabled / disabled, tests that verify sstable formats
(mostly io and index I guess), and leave other parameters set as with the
latest configuration - this is the easiest way I think.

For dtests we have vnodes/no-vnodes, offheap/onheap, and nothing about
other stuff. To me running no-vnodes makes no sense because no-vnodes is
just a special case of vnodes=1. On the other hand offheap/onheap buffers
could be tested in unit tests. In short, I'd run dtests only with the
default and latest configuration.
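The arithmetic behind treating these as dimensions is worth making concrete: k independent boolean dimensions give 2^k combinations, while the current unit-test setup covers only four hand-picked rows. A small sketch (dimension names taken from the list above, row choices from the "currently tested" list):

```python
from itertools import product

# Illustrative dimensions from the discussion; a full cross-product explodes
# while the current unit-test setup covers only four hand-picked rows.
dimensions = ["commitlog_compression", "sstable_compression", "cdc",
              "extended_deletion_time"]

full_matrix = list(product([False, True], repeat=len(dimensions)))

currently_tested = [
    (False, False, False, False),  # plain default
    (True,  True,  False, False),  # commitlog + sstable compression
    (False, False, True,  False),  # CDC enabled
    (False, False, False, True),   # extended deletion time
]
```

This is why scoping each dimension to the packages of tests it can actually affect, rather than rerunning everything per row, is the cheap way out.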

Sorry for being too wordy,


czw., 15 lut 2024 o 07:39 Štefan Miklošovič 
napisał(a):

> Something along what Paulo is proposing makes sense to me. To sum it up,
> knowing what workflows we have now:
>
> java17_pre-commit_tests
> java11_pre-commit_tests
> java17_separate_tests
> java11_separate_tests
>
> We would have couple more, together like:
>
> java17_pre-commit_tests
> java17_pre-commit_tests-latest-yaml
> java11_pre-commit_tests
> java11_pre-commit_tests-latest-yaml
> java17_separate_tests
> java17_separate_tests-default-yaml
> java11_separate_tests
> java11_separate_tests-latest-yaml
>
> To go over Paulo's plan, his steps 1-3 for 5.0 would result in requiring
> just one workflow
>
> java11_pre-commit_tests
>
> when no configuration is touched and two workflows
>
> java11_pre-commit_tests
> java11_pre-commit_tests-latest-yaml
>
> when there is some configuration change.
>
> Now the term "some configuration change" is quite tricky and it is not
> always easy to evaluate if both default and latest yaml workflows need to
> be executed. It might happen that a change is of such a nature that it does
> not change the configuration but it is necessary to verify that it still
> works with both scenarios. -latest.yaml config might be such that a change
> would make sense to do in isolation for default config only but it would
> not work with -latest.yaml too. I don't know if this is just a theoretical
> problem or not but my gut feeling is that we would be safer if we just
> required both default and latest yaml workflows together.
>
> Even if we do, we basically replace "two jvms" builds for "two yamls"
> builds but I consider "two yamls" builds to be more valuable in general
> than "two jvms" builds. It would take basically the same amount of time, we
> would just reorient our build matrix from different jvms to different
> yamls.
>
> For releases we would for sure need to just run it across jvms too.
>
> On Thu, Feb 15, 2024 at 7:05 AM Paulo Motta  wrote:
>
>> > Perhaps it is also a good opportunity to distinguish subsets of tests
>> which make sense to run with a configuration matrix.
>>
>> Agree. I think we should define a “standard/golden” configuration for
>> each branch and minimally require precommit tests for that configuration.
>> Assignees and reviewers can determine if additional test variants are
>> required based on the patch scope.
>>
>

Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-14 Thread Jacek Lewandowski
śr., 14 lut 2024 o 17:30 Josh McKenzie  napisał(a):

> When we have failing tests people do not spend the time to figure out if
> their logic caused a regression and merge, making things more unstable… so
> when we merge failing tests that leads to people merging even more failing
> tests...
>
> What's the counter position to this Jacek / Berenguer?
>

For how long are we going to deceive ourselves? Are we shipping those
features or not? Perhaps it is also a good opportunity to distinguish
subsets of tests which make sense to run with a configuration matrix.

If we don't add those tests to the pre-commit pipeline, "people do not
spend the time to figure out if their logic caused a regression and merge,
making things more unstable…"
I think it is much more valuable to test those various configurations
rather than test against j11 and j17 separately. I can see very little
value in doing that.


Re: [Discuss] Introducing Flexible Authentication in Cassandra via Feature Flag

2024-02-14 Thread Jacek Lewandowski
Hi,

I think what Gaurav means is what we know at DataStax as transitional
authenticator, which temporarily allows for partially enabled
authentication - when the system allows the clients to authenticate but
does not enforce it.

All in all, that should be included in CEP-31 - also CEP-31 aims to let the
administrators enable/disable and reconfigure authentication without a
restart so we could discuss whether such transitional mode would be needed
at all in that case.

Thanks,
- - -- --- -  -
Jacek Lewandowski


wt., 13 lut 2024 o 07:04 Jeff Jirsa  napisał(a):

> Auth is one of those things that needs to be a bit more concrete
>
> In the scenario you describe, you already have an option to deploy the
> auth piecemeal during the rollout (pause halfway through) in the
> cluster and look for asymmetric connections, and the option to drop in a
> new Authenticator jar in the class path that does the flexible auth you
> describe
>
> I fear that the extra flexibility this allows for 1% of operations exposes
> people to long term problems
>
> Have you considered just implementing the feature flag you describe using
> the existing plugin infrastructure ?
>
> On Feb 12, 2024, at 9:47 PM, Gaurav Agarwal 
> wrote:
>
> 
> Dear Dinesh and Abe,
>
> Thank you for reviewing the document on enabling Cassandra authentication.
> I apologize that I didn't initially include the following failure scenarios
> where this feature could be particularly beneficial (I've included them
> now):
>
> *Below are the failure scenarios:*
>
>- Incorrect credentials: If a client accidentally uses the wrong
>username/password combination during the rollout, then once a server node
>restarts with authentication enabled it will refuse connections with incorrect
>credentials. This can temporarily interrupt the service until correct
>credentials are sent.
>- Missed service auth updates: In a large-scale system, a service "X"
>might miss the credential update during rollout. After some server nodes
>restart, service "X" might finally realize it needs correct credentials,
>but it's too late. Nodes are already expecting authorized requests, and
>this mismatch causes "X" to stop working on auth enabled and restarted
>nodes.
>- Infrequent traffic:  Suppose one of the services only interacts with
>the server once a week. Suppose it starts sending requests with incorrect
>credentials after authentication is enabled. Since the entire cluster is
>now running on authentication, the service's outdated credentials cause it
>to be denied access, resulting in a service-wide outage.
>
>
> The overall aim of the proposed feature flag is to allow clients to
> connect temporarily without authentication during the rollout, mitigating
> these risks and ensuring a smoother transition.
>
> Thanks in advance for your continued review of the proposal.
>
>
>
> On Mon, Feb 12, 2024 at 2:24 PM Abe Ratnofsky  wrote:
>
>> Hey Gaurav,
>>
>> Thanks for your proposal.
>>
>> > disruptive, full-cluster restart, posing significant risks in live
>> environments
>>
>> For configuration that isn't hot-reloadable, like providing a new
>> IAuthenticator implementation, a rolling restart is required. But rolling
>> restarts are zero-downtime and safe in production, as long as you pace them
>> accordingly.
>>
>> In general, changing authenticators is a risky thing because it requires
>> coordination with clients. To mitigate this risk and support clients while
>> they transition between authenticators, I like the approach taken by
>> MutualTlsWithPasswordFallbackAuthenticator:
>>
>> https://github.com/apache/cassandra/blob/bec6bfde1f3b6a782f123f9f9ff18072a97e379f/src/java/org/apache/cassandra/auth/MutualTlsWithPasswordFallbackAuthenticator.java#L34
>>
>> If client certificates are available, then use those, otherwise use the
>> existing PasswordAuthenticator that clients are already using. The existing
>> IAuthenticator interface supports this transitional behavior well.
>>
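The fallback pattern Abe points to — the real class is Java (`MutualTlsWithPasswordFallbackAuthenticator`) — boils down to: prefer client certificates when present, otherwise fall back to password authentication. A heavily simplified Python sketch of that decision, with made-up names and no relation to the actual Cassandra `IAuthenticator` API:

```python
# Sketch: fallback authentication decision. Prefer mTLS when the client
# presents a certificate; otherwise fall back to password auth, so clients
# can migrate between authenticators gradually. All names are illustrative.

def authenticate(client_cert, username=None, password=None,
                 trusted_certs=frozenset(), passwords=None):
    """Return True if the connection should be accepted."""
    passwords = passwords or {}
    if client_cert is not None:
        # Certificate presented: authenticate via mTLS only.
        return client_cert in trusted_certs
    # No certificate: fall back to the existing password check.
    return passwords.get(username) == password

trusted = frozenset({"cert-A"})
creds = {"app": "s3cr3t"}
```

The transitional property is that password-only clients keep working while certificate-enabled clients are validated the new way — no flag day required.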
>> Your proposal to include a new configuration for auth_enforcement_flag
>> doesn't clearly cover how to transition from one authenticator to another.
>> It says:
>>
>> > Soft: Operates in a monitoring mode without enforcing authentication
>>
>> Most users use authentication today, so auth_enforcement_flag=Soft would
>> allow unauthenticated clients to connect to the database.
>>
>> --
>> Abe
>>
>> On Feb 12, 2024, at 2:44 PM, Gaurav Agarwal 
>> wrote:
>>
>> Dear Cassandra Community,
>>
>>

Re: [Discuss] "Latest" configuration for testing and evaluation (CASSANDRA-18753)

2024-02-14 Thread Jacek Lewandowski
We should not block merging configuration changes given it is a valid
configuration - by which I mean it is correct, passes all config
validations, matches documented rules, etc. And the provided latest
config matches those requirements, I assume.

The failures should block release or we should not advertise we have those
features at all, and the configuration should be named "experimental"
rather than "latest".

The config changes are not responsible for broken features and we should
not bury our heads in the sand pretending that everything is ok.

Thanks,

śr., 14 lut 2024, 10:47 użytkownik Štefan Miklošovič <
stefan.mikloso...@gmail.com> napisał:

> Wording looks good to me. I would also put that into NEWS.txt but I am not
> sure what section. Neither New features, Upgrading, nor Deprecation seems to
> be a good category.
>
> On Tue, Feb 13, 2024 at 5:42 PM Branimir Lambov 
> wrote:
>
>> Hi All,
>>
>> CASSANDRA-18753 introduces a second set of defaults (in a separate
>> "cassandra_latest.yaml") that enable new features of Cassandra. The
>> objective is two-fold: to be able to test the database in this
>> configuration, and to point potential users that are evaluating the
>> technology to an optimized set of defaults that give a clearer picture of
>> the expected performance of the database for a new user. The objective is
>> to get this configuration into 5.0 to have the extra bit of confidence that
>> we are not releasing (and recommending) options that have not gone through
>> thorough CI.
>>
>> The implementation has already gone through review, but I'd like to get
>> people's opinion on two things:
>> - There are currently a number of test failures when the new options are
>> selected, some of which appear to be genuine problems. Is the community
>> okay with committing the patch before all of these are addressed? This
>> should prevent the introduction of new failures and make sure we don't
>> release before clearing the existing ones.
>> - I'd like to get an opinion on what's suitable wording and documentation
>> for the new defaults set. Currently, the patch proposes adding the
>> following text to the yaml (see
>> https://github.com/apache/cassandra/pull/2896/files):
>> # NOTE:
>> #   This file is provided in two versions:
>> # - cassandra.yaml: Contains configuration defaults for a "compatible"
>> #   configuration that operates using settings that are
>> backwards-compatible
>> #   and interoperable with machines running older versions of
>> Cassandra.
>> #   This version is provided to facilitate pain-free upgrades for
>> existing
>> #   users of Cassandra running in production who want to gradually and
>> #   carefully introduce new features.
>> # - cassandra_latest.yaml: Contains configuration defaults that enable
>> #   the latest features of Cassandra, including improved
>> functionality as
>> #   well as higher performance. This version is provided for new
>> users of
>> #   Cassandra who want to get the most out of their cluster, and for
>> users
>> #   evaluating the technology.
>> #   To use this version, simply copy this file over cassandra.yaml,
>> or specify
>> #   it using the -Dcassandra.config system property, e.g. by running
>> # cassandra
>> -Dcassandra.config=file:/$CASSANDRA_HOME/conf/cassandra_latest.yaml
>> # /NOTE
>> Does this sound sensible? Should we add a pointer to this defaults set
>> elsewhere in the documentation?
>>
>> Regards,
>> Branimir
>>
>


Re: Welcome Maxim Muzafarov as Cassandra Committer

2024-01-08 Thread Jacek Lewandowski
Congratulations Maxim, well deserved, it's a pleasure to work with you!

- - -- --- -  -
Jacek Lewandowski


pon., 8 sty 2024 o 19:35 Lorina Poland  napisał(a):

> Congratulations Maxim!
>
> On 2024/01/08 18:19:04 Josh McKenzie wrote:
> > The Apache Cassandra PMC is pleased to announce that Maxim Muzafarov has
> accepted
> > the invitation to become a committer.
> >
> > Thanks for all the hard work and collaboration on the project thus far,
> and we're all looking forward to working more with you in the future.
> Congratulations and welcome!
> >
> > The Apache Cassandra PMC members
> >
> >
>


Re: Harry in-tree (Forked from "Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?")

2023-12-22 Thread Jacek Lewandowski
Obviously +1

Thank you Alex

pt., 22 gru 2023, 16:45 użytkownik Sumanth Pasupuleti <
sumanth.pasupuleti...@gmail.com> napisał:

> +1, thank you for your efforts in bringing Harry in-tree. Anything that
> improves the testing ecosystem for Cassandra, particularly around complex
> scenarios / edge cases goes a long way in improving reliability, and with
> a powerful tool like Harry in-tree, it is a lot more accessible to
> the developers than it has been. Also, thank you for keeping in mind the
> onboarding experience of developers.
>
> - Sumanth
>
> On Fri, Dec 22, 2023 at 1:11 AM Alex Petrov  wrote:
>
>> Some follow-up tickets to establish the project direction:
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-19229
>>
>> Two other things that we will work on in Tree are:
>> https://issues.apache.org/jira/browse/CASSANDRA-18275 (model and in-JVM
>> test for partition-restricted 2i queries)
>> https://issues.apache.org/jira/browse/CASSANDRA-18667 (multi-threaded
>> SAI read and write fuzz test)
>>
>> If you would like to get your recently added feature tested with Harry
>> model, please let me know!
>>
>> On Fri, Dec 22, 2023, at 12:41 AM, Joseph Lynch wrote:
>>
>> +1
>>
>> Sounds like a great change that will help us unify around a common
>> testing paradigm, and even pave the path to in-tree load testing plus
>> integrated correctness checking which would be extremely valuable!
>>
>> -Joey
>>
>> On Thu, Dec 21, 2023 at 1:35 PM Caleb Rackliffe 
>> wrote:
>>
>> +1
>>
>> Agree w/ all the justifications mentioned above.
>>
>> As a reviewer on CASSANDRA-19210
>> , my goals were
>> to a.) look at the directory, naming, and package structure of the ported
>> code, b.) make sure IDE integration was working, and c.) make sure any
>> modifications to existing code (rather than direct code movements from
>> cassandra-harry) were straightforward.
>>
>> On Thu, Dec 21, 2023 at 3:23 PM Alex Petrov  wrote:
>>
>>
>> Hey folks,
>>
>> I am mostly done with a patch that brings Harry in-tree [1]. I will
>> trigger one more CI run overnight, and my intention was to merge it some
>> time soon, but I wanted to give a fair warning here, since this is a
>> relatively large patch.
>>
>> Good news for everyone that it:
>>   a) touches no production code whatsoever. Only test (in-jvm dtest
>> namely) code that was using Harry already.
>>   b) the only tests that are changed are ones that used a duplicate
>> version of placement simulator we had both for testing TCM, and in Harry
>>   c) in addition, I have converted 3 existing TCM tests to a new API to
>> have some base for examples/usage.
>>
>> Since we were effectively relying on this code for a while now, and the
>> intention now is to converge to:
>>   a) fewer different generators, and a shareable version of
>> generators for everyone to use across the codebase
>>   b) a testing tool that can be useful for both trivial cases, and
>> complex scenarios
>> I and many other Cassandra contributors have expressed the opinion
>> that bringing Harry in-tree will be highly beneficial.
>>
>> I strongly believe that bringing Harry in-tree will help to lower the
>> barrier for fuzz test and simplify co-development of Cassandra and Harry.
>> Previously, it has been rather difficult to debug edge cases because I had
>> to either re-compile an in-jvm dtest jar and bring it to Harry, or
>> re-compile a Harry jar and bring it to Cassandra, which is both tedious and
>> time consuming. Moreover, I believe we have missed at very least one RT
>> regression [2] because Harry was not in-tree, as its tests would've caught
>> the issue even with the model that existed.
>>
>> For other recently found issues, I think having Harry in-tree would have
>> substantially lowered a turnaround time, and allowed me to share repros
>> with developers of corresponding features much quicker.
>>
>> I do expect a slight learning curve for Harry, but my intention is to
>> build a web of simple tests (worked on some of them yesterday after
>> conversation with David already), which can follow the in-jvm-dtest pattern
>> of find-similar-test / copy / modify. There's already copious
>> documentation, so I do not believe a lack of docs for Harry has ever been
>> an issue.
>>
>> You all are aware of my dedication to testing and quality of Apache
>> Cassandra, and I hope you also see the benefits of having a model checker
>> in-tree.
>>
>> Thank you and happy upcoming holidays,
>> --Alex
>>
>> [1] https://issues.apache.org/jira/browse/CASSANDRA-19210
>> [2] https://issues.apache.org/jira/browse/CASSANDRA-18932
>>
>>
>>


Re: [ATTENTION] Forced push on cassandra-5.0 branch !!!

2023-12-16 Thread Jacek Lewandowski
It looks like it happened due to my recent merge. We were discussing on
Slack what could have gone wrong, but we have no clues so far.

I've pushed cassandra-5.0 and trunk with the command (I've looked into the
bash history):
git push --atomic apache cassandra-5.0 trunk

My 5.0 branch which I pushed has no such merge (I pushed this branch also
to my personal fork:
https://github.com/jacek-lewandowski/cassandra/commits/cassandra-5.0)

I am really curious about what went wrong as I want to avoid that in the
future

- - -- --- -  -
Jacek Lewandowski


On Sun, Dec 17, 2023 at 05:00 Dinesh Joshi  wrote:

> thanks for the heads up. Is there anything we could do to avoid bad merges
> in the future?
>
> Dinesh
>
> On Dec 16, 2023, at 3:26 PM, Mick Semb Wever  wrote:
>
>
> The cassandra-5.0 branch accidentally got 229 trunk merge commits brought
> into it.
>
> This has been fixed now, but required a forced push.  I've gone ahead and
> done this quickly to avoid most folks seeing it.
>
> The fix was
>
> git switch cassandra-5.0
> git reset --hard 2fc2be5
> git push --force origin cassandra-5.0
>
>
>
>
>


Moving Semver4j from test to main dependencies

2023-12-15 Thread Jacek Lewandowski
Hi,

I'd like to add Semver4j to the production dependencies. It is currently on
the test classpath. The library is pretty lightweight, licensed with MIT
and has no transitive dependencies.

We need to represent the kernel version somehow in CASSANDRA-19196 and
Semver4j looks like the right tool for it. Maybe at some point we can replace
our custom implementation of CassandraVersion as well.
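For illustration, the version comparison Semver4j would take over can be
sketched with plain JDK code (the class below is a hypothetical stand-in for
the comparison logic, not the Semver4j API):

```java
import java.util.Arrays;

final class KernelVersion implements Comparable<KernelVersion> {
    final int[] parts; // major.minor.patch

    KernelVersion(String version) {
        // Drop a distro suffix such as "-generic" before parsing the numbers
        String core = version.split("-", 2)[0];
        this.parts = Arrays.stream(core.split("\\.")).mapToInt(Integer::parseInt).toArray();
    }

    @Override
    public int compareTo(KernelVersion other) {
        // Missing components count as zero, so "6.5" equals "6.5.0"
        for (int i = 0; i < Math.max(parts.length, other.parts.length); i++) {
            int a = i < parts.length ? parts[i] : 0;
            int b = i < other.parts.length ? other.parts[i] : 0;
            if (a != b)
                return Integer.compare(a, b);
        }
        return 0;
    }
}
```

Semver4j provides this kind of comparison (plus range expressions) behind a
tested API, which is the argument for promoting it to the production classpath
instead of maintaining another hand-rolled version class.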

Thanks,
- - -- --- -  -
Jacek Lewandowski


Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

2023-12-12 Thread Jacek Lewandowski
First of all - when you want to have a parameterized test case you do not
have to make the whole test class parameterized - it is per test case.
Also, each method can have different parameters.

For the extensions - we can have extensions which provide Cassandra
configuration, extensions which provide a running cluster and others. We
could for example apply some extensions to all test classes externally
without touching those classes, something like logging the begin and end of
each test case.
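An illustration of the per-method parameterization described above, as it
would look with JUnit 5 (a sketch only: it requires the org.junit.jupiter
artifacts on the classpath, and the test subjects are hypothetical):

```java
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.CsvSource;
import org.junit.jupiter.params.provider.ValueSource;

class ParameterizationSketchTest {
    // Parameters are declared per method, not per class as in JUnit 4,
    // so methods in the same class can use different parameter sets.
    @ParameterizedTest
    @ValueSource(strings = { "big", "bti" })
    void readsSSTableFormat(String format) {
        // open an sstable of the given format and assert on its contents
    }

    @ParameterizedTest
    @CsvSource({ "0, 0 B", "1536, 1.50 KiB" })
    void formatsSizes(long bytes, String expected) {
        // assert the human-readable rendering matches 'expected'
    }
}
```

The global hooks mentioned above map to JUnit 5 extensions
(`BeforeEachCallback`/`AfterEachCallback`), which can be registered for every
test class via `junit.jupiter.extensions.autodetection.enabled=true` without
touching the classes themselves.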



On Tue, Dec 12, 2023 at 12:07 Benedict  wrote:

> Could you give (or link to) some examples of how this would actually
> benefit our test suites?
>
> On 12 Dec 2023, at 10:51, Jacek Lewandowski 
> wrote:
>
> 
> I have two major pros for JUnit 5:
> - much better support for parameterized tests
> - global test hooks (automatically detectable extensions) +
> multi-inheritance
>
>
>
>
> On Mon, Dec 11, 2023 at 13:38 Benedict  wrote:
>
>> Why do we want to move to JUnit 5?
>>
>> I’m generally opposed to churn unless well justified, which it may be -
>> just not immediately obvious to me.
>>
>> On 11 Dec 2023, at 08:33, Jacek Lewandowski 
>> wrote:
>>
>> 
>> Nobody referred so far to the idea of moving to JUnit 5, what are the
>> opinions?
>>
>>
>>
>> On Sun, Dec 10, 2023 at 11:03 Benedict  wrote:
>>
>>> Alex’s suggestion was that we meta randomise, ie we randomise the config
>>> parameters to gain better rather than lesser coverage overall. This means
>>> we cover these specific configs and more - just not necessarily on any
>>> single commit.
>>>
>>> I strongly endorse this approach over the status quo.
>>>
>>> On 8 Dec 2023, at 13:26, Mick Semb Wever  wrote:
>>>
>>> 
>>>
>>>
>>>
>>>>
>>>> I think everyone agrees here, but…. these variations are still
>>>>> catching failures, and until we have an improvement or replacement we
>>>>> do rely on them.   I'm not in favour of removing them until we have
>>>>> proof /confidence that any replacement is catching the same failures.
>>>>> Especially oa, tries, vnodes. (Not tries and offheap is being
>>>>> replaced with "latest", which will be valuable simplification.)
>>>>
>>>>
>>>> What kind of proof do you expect? I cannot imagine how we could prove
>>>> that because the ability to detect failures results from the randomness
>>>> of those tests. That's why when such a test fails you usually cannot
>>>> reproduce that easily.
>>>>
>>>
>>>
>>> Unit tests that fail consistently but only on one configuration, should
>>> not be removed/replaced until the replacement also catches the failure.
>>>
>>>
>>>
>>>> We could extrapolate that to - why we only have those configurations?
>>>> why don't we test trie / oa + compression, or CDC, or system memtable?
>>>>
>>>
>>>
>>> Because, along the way, people have decided a certain configuration
>>> deserves additional testing and it has been done this way in lieu of any
>>> other more efficient approach.
>>>
>>>
>>>


Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

2023-12-12 Thread Jacek Lewandowski
I have two major pros for JUnit 5:
- much better support for parameterized tests
- global test hooks (automatically detectable extensions) +
multi-inheritance




On Mon, Dec 11, 2023 at 13:38 Benedict  wrote:

> Why do we want to move to JUnit 5?
>
> I’m generally opposed to churn unless well justified, which it may be -
> just not immediately obvious to me.
>
> On 11 Dec 2023, at 08:33, Jacek Lewandowski 
> wrote:
>
> 
> Nobody referred so far to the idea of moving to JUnit 5, what are the
> opinions?
>
>
>
> On Sun, Dec 10, 2023 at 11:03 Benedict  wrote:
>
>> Alex’s suggestion was that we meta randomise, ie we randomise the config
>> parameters to gain better rather than lesser coverage overall. This means
>> we cover these specific configs and more - just not necessarily on any
>> single commit.
>>
>> I strongly endorse this approach over the status quo.
>>
>> On 8 Dec 2023, at 13:26, Mick Semb Wever  wrote:
>>
>> 
>>
>>
>>
>>>
>>> I think everyone agrees here, but…. these variations are still catching
>>>> failures, and until we have an improvement or replacement we do rely
>>>> on them.   I'm not in favour of removing them until we have proof
>>>> /confidence that any replacement is catching the same failures.  Especially
>>>> oa, tries, vnodes. (Not tries and offheap is being replaced with
>>>> "latest", which will be valuable simplification.)
>>>
>>>
>>> What kind of proof do you expect? I cannot imagine how we could prove
>>> that because the ability to detect failures results from the randomness
>>> of those tests. That's why when such a test fails you usually cannot
>>> reproduce that easily.
>>>
>>
>>
>> Unit tests that fail consistently but only on one configuration, should
>> not be removed/replaced until the replacement also catches the failure.
>>
>>
>>
>>> We could extrapolate that to - why we only have those configurations?
>>> why don't we test trie / oa + compression, or CDC, or system memtable?
>>>
>>
>>
>> Because, along the way, people have decided a certain configuration
>> deserves additional testing and it has been done this way in lieu of any
>> other more efficient approach.
>>
>>
>>


Re: Ext4 data corruption in stable kernels

2023-12-11 Thread Jacek Lewandowski
Frankly, there are only two kernel versions mentioned there. I've created
https://issues.apache.org/jira/browse/CASSANDRA-19196 to do something with
that.


On Mon, Dec 11, 2023 at 21:05 Jon Haddad  wrote:

> Like I said, I didn't have time to verify the full scope and what's
> affected, just that some stable kernels are affected.  Adding to the
> problem is that it might be vendor specific as well.  For example, RH might
> backport an upstream patch in the kernel they ship that's non-standard.
>
> Hopefully someone compiles a list.
>
> Jon
>
> On Mon, Dec 11, 2023 at 11:51 AM Jacek Lewandowski <
> lewandowski.ja...@gmail.com> wrote:
>
>> Aren't only specific kernels affected? If we can detect the kernel
>> version, the feature can be force disabled with the problematic kernels
>>
>>
>> On Mon, Dec 11, 2023 at 20:45 Jon Haddad  wrote:
>>
>>> Hey folks,
>>>
>>> Just wanted to raise awareness about an I/O issue that seems to be
>>> affecting some Linux kernel releases that were listed as STABLE, causing
>>> corruption when using the ext4 filesystem with direct I/O.  I don't have
>>> time to get a great understanding of the full scope of the issue, what
>>> versions are affected, etc, I just want to get this in front of the
>>> project.  I am disappointed that this might negatively affect our ability
>>> to leverage direct I/O for both the commitlog (recently merged) and
>>> SSTables (potentially a future use case), since users won't be able to
>>> discern between a bug we ship and one that we hit as a result of our
>>> filesystem choices.
>>>
>>> I think it might be worth putting a note in our docs and in the config
>>> to warn the user to ensure they're not affected, and we may even want to
>>> consider hiding this feature if the blast radius is significant enough that
>>> users would be affected.
>>>
>>> https://lwn.net/Articles/954285/
>>>
>>> Jon
>>>
>>


Re: Ext4 data corruption in stable kernels

2023-12-11 Thread Jacek Lewandowski
Aren't only specific kernels affected? If we can detect the kernel version,
the feature can be force disabled with the problematic kernels
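A sketch of what such a guard could look like. The version ranges below are
placeholders, not an authoritative list of affected kernels (establishing
that list is exactly what CASSANDRA-19196 is for):

```java
final class DirectIOGuard {
    // Kernel ranges [min, max) suspected of ext4 + direct I/O corruption.
    // Placeholder values for illustration only.
    private static final String[][] BAD_RANGES = { { "6.1.64", "6.1.66" }, { "6.5.0", "6.5.1" } };

    static boolean kernelLooksSafe(String kernelRelease) {
        long v = key(kernelRelease);
        for (String[] range : BAD_RANGES)
            if (v >= key(range[0]) && v < key(range[1]))
                return false;
        return true;
    }

    // Collapse "major.minor.patch[-suffix]" into a single comparable number
    private static long key(String version) {
        String[] parts = version.split("-", 2)[0].split("\\.");
        long k = 0;
        for (int i = 0; i < 3; i++)
            k = k * 100_000 + (i < parts.length ? Long.parseLong(parts[i]) : 0);
        return k;
    }
}
```

On Linux the JVM exposes the kernel release as
`System.getProperty("os.version")` (e.g. `6.1.64-generic`), so a check like
this could refuse to enable commitlog direct I/O at startup rather than risk
corruption at runtime.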


On Mon, Dec 11, 2023 at 20:45 Jon Haddad  wrote:

> Hey folks,
>
> Just wanted to raise awareness about an I/O issue that seems to be
> affecting some Linux kernel releases that were listed as STABLE, causing
> corruption when using the ext4 filesystem with direct I/O.  I don't have
> time to get a great understanding of the full scope of the issue, what
> versions are affected, etc, I just want to get this in front of the
> project.  I am disappointed that this might negatively affect our ability
> to leverage direct I/O for both the commitlog (recently merged) and
> SSTables (potentially a future use case), since users won't be able to
> discern between a bug we ship and one that we hit as a result of our
> filesystem choices.
>
> I think it might be worth putting a note in our docs and in the config to
> warn the user to ensure they're not affected, and we may even want to
> consider hiding this feature if the blast radius is significant enough that
> users would be affected.
>
> https://lwn.net/Articles/954285/
>
> Jon
>


Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

2023-12-11 Thread Jacek Lewandowski
Nobody referred so far to the idea of moving to JUnit 5, what are the
opinions?



On Sun, Dec 10, 2023 at 11:03 Benedict  wrote:

> Alex’s suggestion was that we meta randomise, ie we randomise the config
> parameters to gain better rather than lesser coverage overall. This means
> we cover these specific configs and more - just not necessarily on any
> single commit.
>
> I strongly endorse this approach over the status quo.
>
> On 8 Dec 2023, at 13:26, Mick Semb Wever  wrote:
>
> 
>
>
>
>>
>> I think everyone agrees here, but…. these variations are still catching
>>> failures, and until we have an improvement or replacement we do rely on
>>> them.   I'm not in favour of removing them until we have proof /confidence
>>> that any replacement is catching the same failures.  Especially oa, tries,
>>> vnodes. (Not tries and offheap is being replaced with "latest", which
>>> will be valuable simplification.)
>>
>>
>> What kind of proof do you expect? I cannot imagine how we could prove
>> that because the ability to detect failures results from the randomness
>> of those tests. That's why when such a test fails you usually cannot
>> reproduce that easily.
>>
>
>
> Unit tests that fail consistently but only on one configuration, should
> not be removed/replaced until the replacement also catches the failure.
>
>
>
>> We could extrapolate that to - why we only have those configurations? why
>> don't we test trie / oa + compression, or CDC, or system memtable?
>>
>
>
> Because, along the way, people have decided a certain configuration
> deserves additional testing and it has been done this way in lieu of any
> other more efficient approach.
>
>
>


Re: Welcome Mike Adamson as Cassandra committer

2023-12-08 Thread Jacek Lewandowski
Awesome, congratulations Mike !!!

- - -- --- -  -
Jacek Lewandowski


On Fri, Dec 8, 2023 at 16:10 Miklosovic, Stefan via dev  wrote:

> Wow, great news! Congratulations on your committership, Mike.
>
> 
> From: Benjamin Lerer 
> Sent: Friday, December 8, 2023 15:41
> To: dev@cassandra.apache.org
> Subject: Welcome Mike Adamson as Cassandra committer
>
> EXTERNAL EMAIL - USE CAUTION when clicking links or attachments
>
>
>
> The PMC members are pleased to announce that Mike Adamson has accepted
> the invitation to become committer.
>
> Thanks a lot, Mike, for everything you have done for the project.
>
> Congratulations and welcome
>
> The Apache Cassandra PMC members
>


Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

2023-12-08 Thread Jacek Lewandowski
>
> It would be great to setup a JUnitRunner using the simulator and find out
> though.
>

I like this idea - this is what I meant when asking about the current unit
tests - to me, a test is either a simulation or a fuzz test. Due to the pretty
random execution order of unit tests, all of them can be considered really
fragile fuzz tests, implemented with the intention to be simulation tests
(with exact execution order, testing a very specific behaviour).

I think everyone agrees here, but…. these variations are still catching
> failures, and until we have an improvement or replacement we do rely on
> them.   I'm not in favour of removing them until we have proof /confidence
> that any replacement is catching the same failures.  Especially oa, tries,
> vnodes. (Not tries and offheap is being replaced with "latest", which
> will be valuable simplification.)


What kind of proof do you expect? I cannot imagine how we could prove that
because the ability to detect failures results from the randomness of
those tests. That's why when such a test fails you usually cannot reproduce
that easily. We could extrapolate that to - why do we only have those
configurations? Why don't we test trie / oa + compression, or CDC, or system
memtable? Each random run of any test can find a new problem. I'm in
favour of parametrizing the "clients" of a certain feature - like
parameterizing storage engine tests, streaming and tools tests against
different sstable formats; though it makes no sense to parameterize gossip
tests, utility class tests or dedicated tests for certain storage
implementations.



On Fri, Dec 8, 2023 at 07:51 Alex Petrov  wrote:

> My logic here was that CQLTester tests would probably be the best
> candidate as they are largely single-threaded and single-node. I'm sure
> there are background processes that might slow things down when serialised
> into a single execution thread, but my expectation would be that it will
> not be as significant as with other tests such as multinode in-jvm dtests.
>
> On Thu, Dec 7, 2023, at 7:44 PM, Benedict wrote:
>
>
> I think the biggest impediment to that is that most tests are probably not
> sufficiently robust for simulation. If things happen in a surprising order
> many tests fail, as they implicitly rely on the normal timing of things.
>
> Another issue is that the simulator does potentially slow things down a
> little at the moment. Not sure what the impact would be overall.
>
> It would be great to setup a JUnitRunner using the simulator and find out
> though.
>
>
> On 7 Dec 2023, at 15:43, Alex Petrov  wrote:
>
> 
> We have been extensively using the simulator for TCM, and I think we have to
> make simulator tests more approachable. I think many of the existing tests
> should be run under the simulator instead of CQLTester, for example. This will
> both strengthen the simulator, and make things better in terms of
> determinism. Of course not to say that CQLTester tests are the biggest
> beneficiary there.
>
> On Thu, Dec 7, 2023, at 4:09 PM, Benedict wrote:
>
> To be fair, the lack of coherent framework doesn’t mean we can’t merge
> them from a naming perspective. I don’t mind losing one of burn or fuzz,
> and merging them.
>
> Today simulator tests are kept under the simulator test tree but that
> primarily exists for the simulator itself and testing it. It’s quite a
> complex source tree, as you might expect, and it exists primarily for
> managing its own complexity. It might make sense to bring the Paxos and
> Accord simulator entry points out into the burn/fuzz trees, though not sure
> it’s all that important.
>
>
> > On 7 Dec 2023, at 15:05, Benedict  wrote:
> >
> > Yes, the only system/real-time timeout is a progress one, wherein if
> nothing happens for ten minutes we assume the simulation has locked up.
> Hitting this is indicative of a bug, and the timeout is so long that no
> realistic system variability could trigger it.
> >
> >> On 7 Dec 2023, at 14:56, Brandon Williams  wrote:
> >>
> >> On Thu, Dec 7, 2023 at 8:50 AM Alex Petrov  wrote:
>  I've noticed many "sleeps" in the tests - is it possible with
> simulation tests to artificially move the clock forward by, say, 5 seconds
> instead of sleeping just to test, for example whether TTL works?)
> >>>
> >>> Yes, simulator will skip the sleep and do a simulated sleep with a
> simulated clock instead.
> >>
> >> Since it uses an artificial clock, does this mean that the simulator
> >> is also impervious to timeouts caused by the underlying environment?
> >>
> >> Kind Regards,
> >> Brandon
>
>
>
>
>


Re: [DISCUSS] CASSANDRA-19104: Standardize tablestats formatting and data units

2023-12-04 Thread Jacek Lewandowski
This looks great,

I'd consider limiting the number of significant digits to 3 in the
human-readable format. In the above example it would translate to:

Space used (live): 1.46 TiB
Space used (total): 1.46 TiB

Bytes repaired: 0.00 KiB
Bytes unrepaired: 4.31 TiB
Bytes pending repair: 0.00 KiB

I just think with the human-readable format we expect to get a quick view
of the stats, and the 4th significant digit has very little meaning in that case.
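The 3-significant-digit rendering proposed here can be sketched as follows
(unit names and rounding thresholds are assumptions, not taken from the
CASSANDRA-19104 patch):

```java
import java.util.Locale;

final class HumanUnits {
    private static final String[] UNITS = { "B", "KiB", "MiB", "GiB", "TiB", "PiB" };

    // Render a byte count with roughly three significant digits,
    // e.g. "1.46 TiB" rather than "1.4568 TiB".
    static String format(double bytes) {
        int unit = 0;
        while (bytes >= 1024 && unit < UNITS.length - 1) {
            bytes /= 1024;
            unit++;
        }
        // Scale the decimal places with the magnitude of the value
        String pattern = bytes >= 100 ? "%.0f %s" : bytes >= 10 ? "%.1f %s" : "%.2f %s";
        return String.format(Locale.ROOT, pattern, bytes, UNITS[unit]);
    }
}
```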


thanks,
Jacek


Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

2023-12-01 Thread Jacek Lewandowski
Thanks for the exhaustive response, Alex :)

Let me bring my point of view:

1. Since long tests are just unit tests that take a long time to run, it
makes sense to separate them for efficient parallelization in CI. Since we
are adding new tests, modifying the existing ones, etc., that should be
something maintainable; otherwise, the distinction makes no sense to me.
For example - adjust timeouts on CI to 1 minute per test class for "short"
tests and more for "long" tests. To satisfy CI, the contributor will have
to either make the test run faster or move it to the "long" tests. The
opposite enforcement could be more difficult, though it is doable as well -
failing the "long" test if it takes too little time and should be qualified
as a regular unit test. As I'm reading what I've just written, it sounds
stupid :/ We should get rid of long-running unit tests altogether. They
should run faster or be split.

2. I'm still confused about the distinction between burn and fuzz tests -
it seems to me that fuzz tests are just modern burn tests - should we
refactor the existing burn tests to use the new framework?

3. Simulation tests - since you say they provide a way to execute a test
deterministically, it should be a property of unit tests - well, a unit
test is either deterministic or a fuzz test. Is the simulation framework
usable for CQLTester-based tests? (side question here: I've noticed many
"sleeps" in the tests - is it possible with simulation tests to
artificially move the clock forward by, say, 5 seconds instead of sleeping
just to test, for example whether TTL works?)

4. Yeah, running a complete suite for each artificially crafted
configuration brings little value compared to the maintenance and
infrastructure costs. It feels like we are running all tests a bit blindly,
hoping we catch something accidentally. I agree this is not the purpose of
the unit tests and should be covered instead by fuzz. For features like
CDC, compression, different sstable formats, trie memtable, commit log
compression/encryption, system directory keyspace, etc... we should have
dedicated tests that verify just that functionality

With more and more functionality offered by Cassandra, they will become a
significant pain shortly. Let's start thinking about concrete actions.
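The clock manipulation asked about in point 3 can be illustrated with a
sketch like the one below; this is a hypothetical stand-in for the idea, not
the simulator's actual mechanism, which intercepts time at a lower level:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// A manually advanced clock: code under test reads time through it,
// and a test "waits" by advancing the clock instantly instead of blocking.
final class SimulatedClock {
    private final AtomicLong nowMillis = new AtomicLong(0);

    long currentTimeMillis() {
        return nowMillis.get();
    }

    void advance(long duration, TimeUnit unit) {
        nowMillis.addAndGet(unit.toMillis(duration));
    }
}
```

A TTL test would then write a cell, call `advance(5, TimeUnit.SECONDS)` and
assert expiry immediately, instead of a real `Thread.sleep(5000)`.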

Also, as we start refactoring the tests, it will be an excellent
opportunity to move to JUnit 5.

thanks,
Jacek


Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

2023-11-30 Thread Jacek Lewandowski
How those burn tests then compare to the fuzz tests? (the new ones)

On Thu, Nov 30, 2023 at 20:22 Benedict  wrote:

> By “could run indefinitely” I don’t mean by default they run forever.
> There will be parameters that change how much work is done for a given run,
> but just running repeatedly (each time with a different generated seeds) is
> the expected usage. Until you run out of compute or patience.
>
> I agree they are only of value pre-commit to check they haven’t been
> broken in some way by changes.
>
>
>
> On 30 Nov 2023, at 18:36, Josh McKenzie  wrote:
>
> 
>
> that may be long-running and that could be run indefinitely
>
> Perfect. That was the distinction I wasn't aware of. Also means having the
> burn target as part of regular CI runs is probably a mistake, yes? i.e. if
> someone adds a burn test that runs indefinitely, are there any guardrails
> or built-in checks or timeouts to keep it from running right up to job
> timeout and then failing?
>
> On Thu, Nov 30, 2023, at 1:11 PM, Benedict wrote:
>
>
> A burn test is a randomised test targeting broad coverage of a single
> system, subsystem or utility, that may be long-running and that could be
> run indefinitely, each run providing incrementally more assurance of
> quality of the system.
>
> A long test is a unit test that sometimes takes a long time to run, no
> more no less. I’m not sure any of these offer all that much value anymore,
> and perhaps we could look to deprecate them.
>
> On 30 Nov 2023, at 17:20, Josh McKenzie  wrote:
>
> 
> Strongly agree. I started working on a declarative refactor out of our CI
> configuration so circle, ASFCI, and other systems could inherit from it
> (for instance, see pre-commit pipeline declaration here
> <https://github.com/apache/cassandra/pull/2554/files#diff-a4c4d1d91048841f76d124386858bda9944644cfef8ccb4ab84319cedaf5b3feR71-R89>);
> I had to set that down while I finished up implementing an internal CI
> system since the code in neither the ASF CI structure nor circle structure
> (.sh embedded in .yml /cry) was re-usable in their current form.
>
> Having a jvm.options and cassandra.yaml file per suite and referencing
> them from a declarative job definition
> <https://github.com/apache/cassandra/pull/2554/files#diff-a4c4d1d91048841f76d124386858bda9944644cfef8ccb4ab84319cedaf5b3feR237-R267>
> would make things a lot easier to wrap our heads around and maintain I
> think.
>
> As for what qualifies as burn vs. long... /shrug couldn't tell you. Would
> have to go down the git blame + dev ML + JIRA rabbit hole. :) Maybe someone
> else on-list knows.
>
> On Thu, Nov 30, 2023, at 4:25 AM, Jacek Lewandowski wrote:
>
> Hi,
>
> I'm getting a bit lost - what are the exact differences between those
> test scenarios? What are the criteria for qualifying a test to be part of a
> certain scenario?
>
> I'm working a little bit with tests and build scripts and the number of
> different configurations for which we have a separate target in the build
> starts to be problematic, I cannot imagine how problematic it is for a new
> contributor.
>
> It is not urgent, but we should at least have a plan on how to
> simplify and unify things.
>
> I'm in favour of reducing the number of test targets to the minimum - for
> different configurations I think we should provide a parameter pointing to
> jvm options file and maybe to cassandra.yaml. I know that we currently do
> some super hacky things with cassandra yaml for different configs - like
> concatenating parts of it. I presume it is not necessary - we can have a
> default test config yaml and a directory with overriding yamls; while
> building we could have a tool which is able to load the default
> configuration, apply the override and save the resulting yaml somewhere in
> the build/test/configs for example. That would allow us to easily use
> those yamls in IDE as well - currently it is impossible.
>
> What do you think?
>
> Thank you and my apologies for bothering you about lower priority stuff while
> we have a 5.0 release headache...
>
> Jacek
>
>
>
>


Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

2023-11-30 Thread Jacek Lewandowski
Hi,

I'm getting a bit lost - what are the exact differences between those
test scenarios? What are the criteria for qualifying a test to be part of a
certain scenario?

I'm working a little bit with tests and build scripts and the number of
different configurations for which we have a separate target in the build
starts to be problematic, I cannot imagine how problematic it is for a new
contributor.

It is not urgent, but we should at least have a plan on how to simplify and
unify things.

I'm in favour of reducing the number of test targets to the minimum - for
different configurations I think we should provide a parameter pointing to
jvm options file and maybe to cassandra.yaml. I know that we currently do
some super hacky things with cassandra yaml for different configs - like
concatenating parts of it. I presume it is not necessary - we can have a
default test config yaml and a directory with overriding yamls; while
building we could have a tool which is able to load the default
configuration, apply the override and save the resulting yaml somewhere in
the build/test/configs for example. That would allow us to easily use
those yamls in IDE as well - currently it is impossible.
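The load-default-then-apply-override step described above boils down to a
recursive map merge once both yaml files are parsed into nested maps
(snakeyaml-style `Map<String, Object>` output assumed; the helper below is a
sketch, not part of the build):

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ConfigMerge {
    // Overlay 'override' onto 'base'; nested maps are merged recursively,
    // scalar values in 'override' win.
    @SuppressWarnings("unchecked")
    static Map<String, Object> merge(Map<String, Object> base, Map<String, Object> override) {
        Map<String, Object> result = new LinkedHashMap<>(base);
        for (Map.Entry<String, Object> e : override.entrySet()) {
            Object existing = result.get(e.getKey());
            if (existing instanceof Map && e.getValue() instanceof Map)
                result.put(e.getKey(), merge((Map<String, Object>) existing, (Map<String, Object>) e.getValue()));
            else
                result.put(e.getKey(), e.getValue());
        }
        return result;
    }
}
```

The merged map would then be dumped back to yaml under something like
build/test/configs, which an IDE run configuration could point at directly.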

What do you think?

Thank you and my apologies for bothering you about lower priority stuff while
we have a 5.0 release headache...

Jacek


Re: [VOTE] Release Apache Cassandra 5.0-beta1

2023-11-29 Thread Jacek Lewandowski
>
> If we end up not releasing a final 5.0 artifact by a Cassandra Summit it
> will signal to the community that we’re prioritizing stability and it could
> be a good opportunity to get people to test the beta or RC before we stamp
> it as production ready.
>

I agree with Paulo's comment

On Thu, Nov 30, 2023 at 04:44 Paulo Motta  wrote:

> > if any contributor has an opinion which is not technically refuted it
> will usually be backed by a PMC via a binding -1
>
> clarifying a bit my personal view: if any contributor has an opinion
> against a proposal (in this case this release proposal) that is not refuted
> it will usually be backed by a PMC via binding -1
>
> Opinions supporting the proposal are also valuable, provided there are no
> valid claims against a proposal.
>
> On Wed, 29 Nov 2023 at 22:27 Paulo Motta  wrote:
>
>> To me, the goal of a beta is to find unknown bugs. If no new bugs are
>> found during a beta release, then it can be automatically promoted to RC
>> via re-tagging. Likewise, if no new bugs are found during a RC after X
>> time, then it can be promoted to final.
>>
>> If we end up not releasing a final 5.0 artifact by a Cassandra Summit it
>> will signal to the community that we’re prioritizing stability and it could
>> be a good opportunity to get people to test the beta or RC before we stamp
>> it as production ready.
>>
>> WDYT?
>>
>> >  Aaron (and anybody who takes the time to follow this list, really),
>> your opinion matters, that's why we discuss it here.
>>
>> +1, PMC are just officers who endorse community decisions, so if any
>> contributor has an opinion which is not technically refuted it will usually
>> be backed by a PMC via a binding -1 (as seen on this thread)
>>
>> On Wed, 29 Nov 2023 at 20:04 Nate McCall  wrote:
>>
>>>
>>>
>>> On Thu, Nov 30, 2023 at 3:28 AM Aleksey Yeshchenko 
>>> wrote:
>>>
 -1 on cutting a beta1 in this state. An alpha2 would be acceptable now,
 but I’m not sure there is significant value to be had from it. Merge the
 fixes for outstanding issues listed above, then cut beta1.

>>> 
>>>
>>> Agree with Aleksey. -1 on a beta we know has issues with a top-line new
>>> feature.
>>>
>>>
>>>
>>


Re: Welcome Francisco Guerrero Hernandez as Cassandra Committer

2023-11-28 Thread Jacek Lewandowski
Congrats!!!

On Tue, Nov 28, 2023 at 23:08 Abe Ratnofsky  wrote:

> Congrats Francisco!
>
> > On Nov 28, 2023, at 1:56 PM, C. Scott Andreas 
> wrote:
> >
> > Congratulations, Francisco!
> >
> > - Scott
> >
> >> On Nov 28, 2023, at 10:53 AM, Dinesh Joshi  wrote:
> >>
> >> The PMC members are pleased to announce that Francisco Guerrero
> Hernandez has accepted
> >> the invitation to become committer today.
> >>
> >> Congratulations and welcome!
> >>
> >> The Apache Cassandra PMC members
>
>


Re: Include CASSANDRA-18464 in 5.0-beta1 (direct I/O support for commitlog write)

2023-11-28 Thread Jacek Lewandowski
Should I assume it's an agreement?

- - -- --- -  -
Jacek Lewandowski


On Tue, Nov 28, 2023 at 12:05 Mick Semb Wever  wrote:

>
>
> So, assuming beta2, can it be merged to cassandra-5.0 now?
>>
>
>
> Yes from me.
> And I would interpret Brandon and Maxwell's statements as supporting too.
>
>
>


Re: Include CASSANDRA-18464 in 5.0-beta1 (direct I/O support for commitlog write)

2023-11-28 Thread Jacek Lewandowski
So, assuming beta2, can it be merged to cassandra-5.0 now?

- - -- --- -  -
Jacek Lewandowski


Mon, 27 Nov 2023 at 16:02 Mick Semb Wever  wrote:

>
> I don't want to veto the 5.0-beta1 release for this.
>
> I would rather cut and include it in the next: 5.0-beta2; release.
> Given it's an add-on and does not change default behaviour, I believe this
> would be ok – if we agree.
>
> I feel this would be a better use of our time.  And would mean, yes commit
> to cassandra-5.0
>
>
>
>
> On Mon, 27 Nov 2023 at 14:04, Jacek Lewandowski <
> lewandowski.ja...@gmail.com> wrote:
>
>> Hey,
>>
>> I'd like to ask if we can include
>> https://issues.apache.org/jira/browse/CASSANDRA-18464 in the 5.0-beta1
>> release. This introduces the ability to write to the commitlog using direct
>> I/O and bringing some noticeable performance improvements when enabled
>> (disabled by default).
>>
>> Since it introduces a change in the yaml config, it probably cannot be
>> delivered in RC or 5.0.x - hence my question.
>>
>> The ticket has been reviewed and tested. It is basically in the
>> ready-to-commit state.
>>
>> thanks,
>> Jacek
>>
>>


Re: [DISCUSS] Harry in-tree

2023-11-27 Thread Jacek Lewandowski
+1

- - -- --- -  -
Jacek Lewandowski


Mon, 27 Nov 2023 at 22:11 Ekaterina Dimitrova  wrote:

> +1, also, Alex, just an idea - maybe you want to make a virtual talk, as
> part of the contributors meetings?
>
>
> On Monday, 27 November 2023, Yifan Cai  wrote:
>
>> +1
>> --
>> *From:* Sam Tunnicliffe 
>> *Sent:* Tuesday, November 28, 2023 2:43:51 AM
>> *To:* dev 
>> *Subject:* Re: [DISCUSS] Harry in-tree
>>
>> Definite +1 to bringing harry-core in tree.
>>
>> On 24 Nov 2023, at 15:43, Alex Petrov  wrote:
>>
>> Hi everyone,
>>
>> With TCM landed, there will be way more Harry tests in-tree: we are using
>> it for many coordination tests, and there's now a simulator test that uses
>> Harry. During development, Harry has allowed us to uncover and resolve
>> numerous elusive edge cases.
>>
>> I had conversations with several folks, and wanted to propose to move
>> harry-core to Cassandra test tree. This will substantially
>> simplify/streamline co-development of Cassandra and Harry. With a new
>> HistoryBuilder API that has helped to find and trigger [1] [2] and [3], it
>> will also be much more approachable.
>>
>> Besides making it easier for everyone to develop new fuzz tests, it will
>> also substantially lower the barrier to entry. Currently, debugging an
>> issue found by Harry involves a cumbersome process of rebuilding and
>> transferring jars between Cassandra and Harry, depending on which side you
>> modify. This not only hampers efficiency but also deters broader adoption.
>> By merging harry-core into the Cassandra test tree, we eliminate this
>> barrier.
>>
>> Thank you,
>> --Alex
>>
>> [1] https://issues.apache.org/jira/browse/CASSANDRA-19011
>> [2] https://issues.apache.org/jira/browse/CASSANDRA-18993
>> [3] https://issues.apache.org/jira/browse/CASSANDRA-18932
>>
>>
>>


Re: [VOTE] Release Apache Cassandra 5.0-beta1

2023-11-27 Thread Jacek Lewandowski
I propose to consider including
https://issues.apache.org/jira/browse/CASSANDRA-18464 (started a separate
thread)




Mon, 27 Nov 2023 at 13:06 Benjamin Lerer  wrote:

> Looking at the board it is unclear to me why CASSANDRA-19011,
> CASSANDRA-19018, CASSANDRA-18796 and CASSANDRA-18940 are not beta tickets.
> SAI being one of the important features of 5.0 it seems to me that those
> tickets should have been handled for the beta release.
> CASSANDRA-19039 could also be a real problem.
>
>
> On Sun, 26 Nov 2023 at 13:35, Mick Semb Wever  wrote:
>
>>
>> Proposing the test build of Cassandra 5.0-beta1 for release.
>>
>> sha1: e0c0c31c7f6db1e3ddb80cef842b820fc27fd0eb
>> Git: https://github.com/apache/cassandra/tree/5.0-beta1-tentative
>> Maven Artifacts:
>> https://repository.apache.org/content/repositories/orgapachecassandra-1319/org/apache/cassandra/cassandra-all/5.0-beta1/
>>
>> The Source and Build Artifacts, and the Debian and RPM packages and
>> repositories, are available here:
>> https://dist.apache.org/repos/dist/dev/cassandra/5.0-beta1/
>>
>> The vote will be open for 72 hours (longer if needed). Everyone who has
>> tested the build is invited to vote. Votes by PMC members are considered
>> binding. A vote passes if there are at least three binding +1s and no -1's.
>>
>> Remaining tickets to get us to 5.0-rc1 can be found on this jira board:
>> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=593&view=detail
>>
>> [1]: CHANGES.txt:
>> https://github.com/apache/cassandra/blob/5.0-beta1-tentative/CHANGES.txt
>> [2]: NEWS.txt:
>> https://github.com/apache/cassandra/blob/5.0-beta1-tentative/NEWS.txt
>>
>


Include CASSANDRA-18464 in 5.0-beta1 (direct I/O support for commitlog write)

2023-11-27 Thread Jacek Lewandowski
Hey,

I'd like to ask if we can include
https://issues.apache.org/jira/browse/CASSANDRA-18464 in the 5.0-beta1
release. This introduces the ability to write to the commitlog using direct
I/O, bringing some noticeable performance improvements when enabled
(disabled by default).

Since it introduces a change in the yaml config, it probably cannot be
delivered in RC or 5.0.x - hence my question.

The ticket has been reviewed and tested. It is basically in the
ready-to-commit state.

thanks,
Jacek
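
To illustrate the kind of yaml change involved - the option name and values
below are a sketch only; the authoritative definition lives in CASSANDRA-18464:

```yaml
# Hypothetical sketch of a disabled-by-default commitlog I/O switch in
# cassandra.yaml; the real option name/values are defined by CASSANDRA-18464.
commitlog_disk_access_mode: auto   # e.g. 'direct' would opt in to direct I/O
```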


Re: CEP-21 - Transactional cluster metadata merged to trunk

2023-11-27 Thread Jacek Lewandowski
Hi,

I'm happy to hear that the feature got merged. Though, I share Benjamin's
worries about that being a bad precedent.

I don't think it makes sense to do repeated runs in this particular case.
Detecting flaky tests would not prove anything; they can be caused by this
patch, but we would not know that for sure. We would have to have a similar
build with the same tests repeated to compare. It would take time and
resources, and in the end, we will have to fix those flaky tests regardless
of whether they were caused by this change. IMO, it makes sense to do a
repeated run of the new tests, though. Aside from that, we can also
consider making it easier and more automated for the developer to determine
whether a particular flakiness comes from a feature branch one wants to
merge.

thanks,
Jacek


Mon, 27 Nov 2023 at 10:15 Benjamin Lerer  wrote:

> Hi,
>
> I must admit that I have been surprised by this merge and this following
> email. We had lengthy discussions recently and the final agreement was that
> the requirement for a merge was a green CI.
> I could understand that for some reasons as a community we could wish to
> make some exceptions. In this present case there was no official discussion
> to ask for an exception.
> I believe that this merge creates a bad precedent where anybody can feel
> entitled to merge without a green CI and disregard any previous community
> agreement.
>
> On Sat, 25 Nov 2023 at 09:22, Mick Semb Wever  wrote:
>
>>
>> Great work Sam, Alex & Marcus !
>>
>>
>>
>>> There are about 15-20 flaky or failing tests in total, spread over
>>> several test jobs[2] (i.e. single digit failures in a few of these). We
>>> have filed JIRAs for the failures and are working on getting those fixed as
>>> a top priority. CASSANDRA-19055[3] is the umbrella ticket for this follow
>>> up work.
>>>
>>> There are also a number of improvements we will work on in the coming
>>> weeks, we will file JIRAs for those early next week and add them as
>>> subtasks to CASSANDRA-19055.
>>>
>>
>>
>> Can we get these tests temporarily annotated as skipped while all the
>> subtickets to 19055 are being worked on ?
>>
>> As we have seen from CASSANDRA-18166 and CASSANDRA-19034 there's a lot of
>> overhead now on 5.0 tickets having to navigate around these failures in
>> trunk CI runs.
>>
>> Also, we're still trying to figure out how to do repeated runs for a
>> patch so big… (the list of touched tests was too long for circleci, i need
>> to figure out what the limit is and chunk it into separate circleci
>> configs) … and it probably makes sense to wait until most of 19055 is done
>> (or tests are temporarily annotated as skipped).
>>
>>
>>


Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-10-27 Thread Jacek Lewandowski
I've been thinking about this and I believe that if we ever decide to delay
a release to include some CEPs, we should make the plan and status of those
CEPs public. This should include publishing a branch, creating tickets for
the remaining work required for feature completion in Jira, and notifying
the mailing list.

By doing this, we can make an informed decision about whether delivering a
CEP in a release x.y planned for some time z is feasible. This approach
would also be beneficial for improving collaboration, as we will all be
aware of what is left to be done and can adjust our focus accordingly to
participate in the remaining work.

Thanks,
- - -- --- -  -
Jacek Lewandowski


Fri, 27 Oct 2023 at 10:26 Benjamin Lerer  wrote:

> I would be interested in testing Maxim's approach. We need more visibility
> on big features and their progress to improve our coordination. Hopefully
> it will also open the door to more collaboration on those big projects.
>
> On Thu, 26 Oct 2023 at 21:35, German Eichberger via dev <
> dev@cassandra.apache.org> wrote:
>
>> +1 to Maxim's idea
>>
>> Like Stefan my assumption was that we would get some version of TCM +
>> ACCORD in 5.0 but it wouldn't be ready for production use. My own testing
>> and conversations at Community over Code in Halifax confirmed this.
>>
>> From this perspective as disappointing as TCM+ACCORD slipping is moving
>> it to 5.1 makes sense and I am supporting of this - but I am worried if 5.1
>> is basically 5.0 + TCM/ACCORD and this slips again we draw ourselves into a
>> corner where we can't release 5.2 before 5.1 or something. I would like
>> some more elaboration on that.
>>
>> I am also very worried about ANN vector search being in jeopardy for 5.0
>> which is an important feature for me to win some internal company bet 
>>
>> My 2 cents,
>> German
>>
>> --
>> *From:* Miklosovic, Stefan via dev 
>> *Sent:* Thursday, October 26, 2023 4:23 AM
>> *To:* dev@cassandra.apache.org 
>> *Cc:* Miklosovic, Stefan 
>> *Subject:* [EXTERNAL] Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1
>> (and cut an immediate 5.1-alpha1)
>>
>> What Maxim proposes in the last paragraph would be definitely helpful.
>> Not for the project only but for a broader audience, companies etc., too.
>>
>> Until this thread was started, my assumption was that "there will be 5.0
>> on summit with TCM and Accord and it somehow just happens". More
>> transparent communication where we are at with high-profile CEPs like these
>> and knowing if deadlines are going to be met would be welcome.
>>
>> I don't want to be that guy and don't take me wrong here, but really,
>> these CEPs are being developed, basically, by devs from two companies,
>> which have developers who do not have any real need to explain themselves
>> like what they do, regularly, to outsiders. (or maybe you do, you just
>> don't have time?) I get that. But on the other hand, you can not
>> realistically expect that other folks will have any visibility into what is
>> going on there and that there is a delay on the horizon and so on.
>>
>> 
>> From: Maxim Muzafarov 
>> Sent: Thursday, October 26, 2023 12:21
>> To: dev@cassandra.apache.org
>> Subject: Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an
>> immediate 5.1-alpha1)
>>
>> NetApp Security WARNING: This is an external email. Do not click links or
>> open attachments unless you recognize the sender and know the content is
>> safe.
>>
>>
>>
>>
>> Personally, I think frequent releases (2-3 per year) are better than
>> infrequent big releases. I can understand all the concerns from a
>> marketing perspective, as smaller major releases may not shine as
>> brightly as a single "game changer" release. However, smaller
>> releases, especially if they don't have backwards compatibility
>> issues, are better for the engineering and SRE teams because if a
>> long-awaited feature is delayed for any reason, there should be no
>> worry about getting it in right into the next release.
>>
>> An analogy here might be that if you miss your train (small release)
>> due to circumstances, you can wait right here for the next one, but if
>> you miss a flight (big release), you will go back home :-) This is why
>> I think that the 5.0, 5.1, 5.2, etc. are better and I support Mick's
>> plan with the caveat that we should release 5.1 when we think we are
>> ready to do so. Here is an example o

Re: [VOTE] Release Apache Cassandra 5.0-alpha1 (take3)

2023-09-08 Thread Jacek Lewandowski
Ok, +1

Thu, 7 Sep 2023, 21:33 Mick Semb Wever  wrote:

>
>
> On Thu, 7 Sept 2023 at 13:53, Jacek Lewandowski <
> lewandowski.ja...@gmail.com> wrote:
>
>> Mick, is the documentation / website ok?
>>
>
>
> Can you elaborate on what is ok ?
>
> https://cassandra.apache.org/doc/5.0/index.html
>  is in an ok state.
> There's lots still to add for 5.0 and I know Lorina is very busy on this
> front.  I don't think any of that is a release blocker though.
>
> And, I suggest we repeat what we did with previous major releases and not
> list the alpha releases on the downloads page (but do list the first
> beta).  Announcement emails will contain all the information to links and
> there'll probably be blog posts too.
>
>
>


Re: [VOTE] Release Apache Cassandra 5.0-alpha1 (take3)

2023-09-07 Thread Jacek Lewandowski
Mick, is the documentation / website ok?

If so, +1

Best Regards,
- - -- --- -  -
Jacek Lewandowski


Thu, 7 Sep 2023 at 12:58 Brandon Williams  wrote:

> +1
>
> Kind Regards,
> Brandon
>
> On Mon, Sep 4, 2023 at 3:26 PM Mick Semb Wever  wrote:
> >
> >
> > Proposing the test build of Cassandra 5.0-alpha1 for release.
> >
> > DISCLAIMER, this alpha release does not contain the expected 5.0
> > features: Vector Search (CEP-30), Transactional Cluster Metadata
> > (CEP-21) and Accord Transactions (CEP-15).  These features will land
> > in a later alpha release.
> >
> > Please also note that this is an alpha release and what that means,
> further info at
> https://cwiki.apache.org/confluence/display/CASSANDRA/Release+Lifecycle
> >
> > sha1: bc5e3741d475e2e99fd7a10450681fd708431a89
> > Git: https://github.com/apache/cassandra/tree/5.0-alpha1-tentative
> > Maven Artifacts:
> https://repository.apache.org/content/repositories/orgapachecassandra-1316/org/apache/cassandra/cassandra-all/5.0-alpha1/
> >
> > The Source and Build Artifacts, and the Debian and RPM packages and
> repositories, are available here:
> https://dist.apache.org/repos/dist/dev/cassandra/5.0-alpha1/
> >
> > The vote will be open for 72 hours (longer if needed). Everyone who has
> tested the build is invited to vote. Votes by PMC members are considered
> binding. A vote passes if there are at least three binding +1s and no -1's.
> >
> > [1]: CHANGES.txt:
> https://github.com/apache/cassandra/blob/5.0-alpha1-tentative/CHANGES.txt
> > [2]: NEWS.txt:
> https://github.com/apache/cassandra/blob/5.0-alpha1-tentative/NEWS.txt
> >
>


Re: [DISCUSSION] Shall we remove ant javadoc task?

2023-08-02 Thread Jacek Lewandowski
Whether or not we output Javadoc to HTML, there are some errors which we
should probably fix. We want to keep the documentation, but there can be
syntax errors which may prevent the IDE from generating a proper preview. So,
the question is - should we validate the Javadoc comments as a pre-commit
task? Can it be done without actually generating HTML output?
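
One hedged possibility: javac's standard -Xdoclint option validates doc
comments during compilation and never writes HTML, so it could back such a
pre-commit check. A sketch (target and property names are illustrative, not
our actual build.xml):

```xml
<!-- Sketch only: -Xdoclint makes javac check Javadoc syntax and references
     while compiling; no doclet runs and no HTML is generated. "-missing" is
     excluded so undocumented members are not flagged. -->
<target name="check-javadoc">
  <javac srcdir="src/java" destdir="${build.classes.dir}" includeantruntime="false">
    <compilerarg value="-Xdoclint:all,-missing"/>
  </javac>
</target>
```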

Thanks,
Jacek

Wed, 2 Aug 2023, 22:24 Derek Chen-Becker  wrote:

> Oh, whoops, I guess I'm the only one that thinks Javadoc is just the tool
> and/or it's output (not the markup itself) :P If anything, the codebase
> could use a little more package/class/method markup in some places, so I'm
> definitely only in favor of getting rid of the ant task. I should amend my
> statement to be "...I suspect most people are not *opening their browsers
> and *looking at Javadoc..." :)
>
> Cheers,
>
> Derek
>
>
>
> On Wed, Aug 2, 2023, 1:30 PM Josh McKenzie  wrote:
>
>> most people are not looking at Javadoc when working on the codebase.
>>
>> I definitely use it extensively *inside the IDE*. But never as a
>> compiled set of external docs.
>>
>> Which is to say, I'm +1 on removing the target and I'd ask everyone to
>> keep javadoccing your classes and methods where things are non-obvious or
>> there's a logical coupling with something else in the system. :)
>>
>> On Wed, Aug 2, 2023, at 2:08 PM, Derek Chen-Becker wrote:
>>
>> +1. If a need comes up for Javadoc we can fix it at that point, but I
>> suspect most people are not looking at Javadoc when working on the codebase.
>>
>> Cheers,
>>
>> Derek
>>
>> On Wed, Aug 2, 2023 at 11:11 AM Brandon Williams 
>> wrote:
>>
>> I don't think even if it works anyone is going to use the output, so
>> I'm good with removal.
>>
>> Kind Regards,
>> Brandon
>>
>> On Wed, Aug 2, 2023 at 11:50 AM Ekaterina Dimitrova
>>  wrote:
>> >
>> > Hi everyone,
>> > We were looking into a user report around our ant javadoc task recently.
>> > That made us realize it is not run in CI; it finishes successfully even
>> if there are hundreds of errors, some potentially breaking doc pages.
>> >
>> > There was a ticket discussion where a few community members mentioned
>> that this task was probably unnecessary. Can we remove it, or shall we fix
>> it?
>> >
>> > Best regards,
>> > Ekaterina
>>
>>
>>
>> --
>> +---+
>> | Derek Chen-Becker |
>> | GPG Key available at https://keybase.io/dchenbecker and   |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---+
>>
>>
>>


Re: Fwd: [DISCUSS] Formalizing requirements for pre-commit patches on new CI

2023-07-12 Thread Jacek Lewandowski
I believe some tools can use code coverage analysis to determine which tests
make sense to multiplex, given the exact lines of code that were changed.
After the initial run, we should have data from the coverage analysis, which
would tell us which test classes are tainted - that is, which ones cover the
modified code fragments.

Using a similar approach, we could detect the coverage differences when
running, say w/wo compression, and discover the tests which cover those
parts of the code.

That way, we can be smart and save time by precisely pointing to where it
makes sense to test more thoroughly.
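
The idea above can be sketched in a few lines - everything here is
hypothetical; real per-test coverage data would come from a coverage tool
such as JaCoCo:

```python
# Hypothetical sketch: pick the "tainted" tests whose recorded coverage
# intersects the lines changed by a patch. The coverage data below is made up.
def select_tainted_tests(coverage, changed_lines):
    """coverage: {test name: set of (file, line)}; changed_lines: set of (file, line)."""
    return sorted(test for test, lines in coverage.items() if lines & changed_lines)

coverage = {
    "CommitLogTest":  {("CommitLog.java", 118), ("CommitLog.java", 245)},
    "CompactionTest": {("CompactionManager.java", 77)},
}
changed = {("CommitLog.java", 245)}
print(select_tainted_tests(coverage, changed))  # -> ['CommitLogTest']
```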


Wed, 12 Jul 2023 at 14:52 Jacek Lewandowski  wrote:

> Would it be re-opening the ticket or creating a new ticket with "revert of
> fix" ?
>
>
>
Wed, 12 Jul 2023 at 14:51 Ekaterina Dimitrova  wrote:
>
>> jenkins_jira_integration
>> <https://github.com/apache/cassandra-builds/blob/trunk/jenkins-jira-integration/jenkins_jira_integration.py>
>>  script
>> updating the JIRA ticket with test results if you cause a regression + us
>> building a muscle around reverting your commit if they break tests.“
>>
>> I am not sure people finding the time to fix their breakages will be
>> solved but at least they will be pinged automatically. Hopefully many
>> follow Jira updates.
>>
>> “  I don't take the past as strongly indicative of the future here since
>> we've been allowing circle to validate pre-commit and haven't been
>> multiplexing.”
>> I am interested to compare how many tickets for flaky tests we will have
>> pre-5.0 now compared to pre-4.1.
>>
>>
>> On Wed, 12 Jul 2023 at 8:41, Josh McKenzie  wrote:
>>
>>> (This response ended up being a bit longer than intended; sorry about
>>> that)
>>>
>>> What is more common though is packaging errors,
>>> cdc/compression/system_ks_directory targeted fixes, CI w/wo
>>> upgrade tests, being less responsive post-commit as you already
>>> moved on
>>>
>>> *Two that **should **be resolved in the new regime:*
>>> * Packaging errors should be caught pre as we're making the artifact
>>> builds part of pre-commit.
>>> * I'm hoping to merge the commit log segment allocation so CDC allocator
>>> is the only one for 5.0 (and just bypasses the cdc-related work on
>>> allocation if it's disabled thus not impacting perf); the existing targeted
>>> testing of cdc specific functionality should be sufficient to confirm its
>>> correctness as it doesn't vary from the primary allocation path when it
>>> comes to mutation space in the buffer
>>> * Upgrade tests are going to be part of the pre-commit suite
>>>
>>> *Outstanding issues:*
>>> * compression. If we just run with defaults we won't test all cases so
>>> errors could pop up here
>>> * system_ks_directory related things: is this still ongoing or did we
>>> have a transient burst of these types of issues? And would we expect these
>>> to vary based on different JDK's, non-default configurations, etc?
>>> * Being less responsive post-commit: My only ideas here are a
>>> combination of the jenkins_jira_integration
>>> <https://github.com/apache/cassandra-builds/blob/trunk/jenkins-jira-integration/jenkins_jira_integration.py>
>>> script updating the JIRA ticket with test results if you cause a regression
>>> + us building a muscle around reverting your commit if they break tests.
>>>
>>> To quote Jacek:
>>>
>>> why don't we run dtests w/wo sstable compression x w/wo internode
>>> encryption x w/wo vnodes,
>>> w/wo off-heap buffers x j8/j11/j17 x w/wo CDC x RedHat/Debian/SUSE, etc.
>>> I think this is a matter of cost vs result.
>>>
>>>
>>> I think we've organically made these decisions and tradeoffs in the past
>>> without being methodical about it. If we can:
>>> 1. Multiplex changed or new tests
>>> 2. Tighten the feedback loop of "tests were green, now they're
>>> *consistently* not, you're the only one who changed something", and
>>> 3. Instill a culture of "if you can't fix it immediately revert your
>>> commit"
>>>
>>> Then I think we'll only be vulnerable to flaky failures introduced
>>> across different non-default configurations as side effects in tests that
>>> aren't touched, which *intuitively* feels like a lot less than we're
>>> facing today. We could even get clever as a day 2 effort and define
>>> packages in the primary codebase where changes take place and multiplex (on
>>> a smalle

Re: Fwd: [DISCUSS] Formalizing requirements for pre-commit patches on new CI

2023-07-12 Thread Jacek Lewandowski
Would it be re-opening the ticket or creating a new ticket with "revert of
fix" ?



Wed, 12 Jul 2023 at 14:51 Ekaterina Dimitrova  wrote:

> jenkins_jira_integration
> <https://github.com/apache/cassandra-builds/blob/trunk/jenkins-jira-integration/jenkins_jira_integration.py>
>  script
> updating the JIRA ticket with test results if you cause a regression + us
> building a muscle around reverting your commit if they break tests.“
>
> I am not sure people finding the time to fix their breakages will be
> solved but at least they will be pinged automatically. Hopefully many
> follow Jira updates.
>
> “  I don't take the past as strongly indicative of the future here since
> we've been allowing circle to validate pre-commit and haven't been
> multiplexing.”
> I am interested to compare how many tickets for flaky tests we will have
> pre-5.0 now compared to pre-4.1.
>
>
> On Wed, 12 Jul 2023 at 8:41, Josh McKenzie  wrote:
>
>> (This response ended up being a bit longer than intended; sorry about
>> that)
>>
>> What is more common though is packaging errors,
>> cdc/compression/system_ks_directory targeted fixes, CI w/wo
>> upgrade tests, being less responsive post-commit as you already
>> moved on
>>
>> *Two that **should **be resolved in the new regime:*
>> * Packaging errors should be caught pre as we're making the artifact
>> builds part of pre-commit.
>> * I'm hoping to merge the commit log segment allocation so CDC allocator
>> is the only one for 5.0 (and just bypasses the cdc-related work on
>> allocation if it's disabled thus not impacting perf); the existing targeted
>> testing of cdc specific functionality should be sufficient to confirm its
>> correctness as it doesn't vary from the primary allocation path when it
>> comes to mutation space in the buffer
>> * Upgrade tests are going to be part of the pre-commit suite
>>
>> *Outstanding issues:*
>> * compression. If we just run with defaults we won't test all cases so
>> errors could pop up here
>> * system_ks_directory related things: is this still ongoing or did we
>> have a transient burst of these types of issues? And would we expect these
>> to vary based on different JDK's, non-default configurations, etc?
>> * Being less responsive post-commit: My only ideas here are a combination
>> of the jenkins_jira_integration
>> <https://github.com/apache/cassandra-builds/blob/trunk/jenkins-jira-integration/jenkins_jira_integration.py>
>> script updating the JIRA ticket with test results if you cause a regression
>> + us building a muscle around reverting your commit if they break tests.
>>
>> To quote Jacek:
>>
>> why don't we run dtests w/wo sstable compression x w/wo internode encryption
>> x w/wo vnodes,
>> w/wo off-heap buffers x j8/j11/j17 x w/wo CDC x RedHat/Debian/SUSE, etc.
>> I think this is a matter of cost vs result.
>>
>>
>> I think we've organically made these decisions and tradeoffs in the past
>> without being methodical about it. If we can:
>> 1. Multiplex changed or new tests
>> 2. Tighten the feedback loop of "tests were green, now they're
>> *consistently* not, you're the only one who changed something", and
>> 3. Instill a culture of "if you can't fix it immediately revert your
>> commit"
>>
>> Then I think we'll only be vulnerable to flaky failures introduced across
>> different non-default configurations as side effects in tests that aren't
>> touched, which *intuitively* feels like a lot less than we're facing
>> today. We could even get clever as a day 2 effort and define packages in
>> the primary codebase where changes take place and multiplex (on a smaller
>> scale) their respective packages of unit tests in the future if we see
>> problems in this area.
>>
>> Flakey tests are a giant pain in the ass and a huge drain on
>> productivity, don't get me wrong. *And* we have to balance how much cost
>> we're paying before each commit with the benefit we expect to gain from
>> that.
>>
>> Does the above make sense? Are there things you've seen in the trenches
>> that challenge or invalidate any of those perspectives?
>>
>> On Wed, Jul 12, 2023, at 7:28 AM, Jacek Lewandowski wrote:
>>
>> Isn't novnodes a special case of vnodes with n=1 ?
>>
>> We should rather select a subset of tests for which it makes sense to run
>> with different configurations.
>>
>> The set of configurations against which we run the tests currently is
>> still only the subset of all possible cases.
>> I could ask - why don't ru

Re: Fwd: [DISCUSS] Formalizing requirements for pre-commit patches on new CI

2023-07-12 Thread Jacek Lewandowski
Isn't novnodes a special case of vnodes with n=1 ?

We should rather select a subset of tests for which it makes sense to run
with different configurations.

The set of configurations against which we run the tests currently is still
only the subset of all possible cases.
I could ask - why don't we run dtests w/wo sstable compression x w/wo
internode encryption x w/wo vnodes,
w/wo off-heap buffers x j8/j11/j17 x w/wo CDC x RedHat/Debian/SUSE, etc. I
think this is a matter of cost vs result.
This equation contains the likelihood of failure in configuration X, given
there was no failure in the default
configuration, the cost of running those tests, the time we delay merging,
the likelihood that we wait for
the test results so long that our branch diverges and we will have to rerun
them or accept the fact that we merge
code which was tested on an outdated base. And eventually, the overall
experience of new contributors - whether they
want to participate in the future.
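
That informal equation could be written down as a toy expected-value check -
purely illustrative, with made-up names and numbers:

```python
# Toy sketch of the cost-vs-result trade-off: run an extra test configuration
# only if the expected cost of an escaped bug outweighs the cost of the run.
# All names and numbers here are hypothetical.
def extra_config_payoff(p_fail_given_default_pass, run_cost, escaped_bug_cost):
    return p_fail_given_default_pass * escaped_bug_cost - run_cost

# A config that almost never fails independently is not worth a costly run:
print(extra_config_payoff(0.01, 5.0, 100.0) > 0)  # -> False
```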



Wed, 12 Jul 2023 at 07:24 Berenguer Blasi  wrote:

> On our 4.0 release I remember a number of such failures but not recently.
> What is more common though is packaging errors,
> cdc/compression/system_ks_directory targeted fixes, CI w/wo upgrade tests,
> being less responsive post-commit as you already moved on,... Either the
> smoke pre-commit has approval steps for everything or we should give imo a
> devBranch alike job to the dev pre-commit. I find it terribly useful. My
> 2cts.
> On 11/7/23 18:26, Josh McKenzie wrote:
>
> 2: Pre-commit 'devBranch' full suite for high risk/disruptive merges: at
> reviewer's discretion
>
> In general, maybe offering a dev the option of choosing either "pre-commit
> smoke" or "post-commit full" at their discretion for any work would be the
> right play.
>
> A follow-on thought: even with something as significant as Accord, TCM,
> Trie data structures, etc - I'd be a bit surprised to see tests fail on
> JDK17 that didn't on 11, or with vs. without vnodes, in ways that weren't
> immediately clear the patch stumbled across something surprising and was
> immediately trivially attributable if not fixable. *In theory* the things
> we're talking about excluding from the pre-commit smoke test suite are all
> things that are supposed to be identical across environments and thus
> opaque / interchangeable by default (JDK version outside checking build
> which we will, vnodes vs. non, etc).
>
> Has that not proven to be the case in your experience?
>
> On Tue, Jul 11, 2023, at 10:15 AM, Derek Chen-Becker wrote:
>
> A strong +1 to getting to a single CI system. CircleCI definitely has some
> niceties and I understand why it's currently used, but right now we get 2
> CI systems for twice the price. +1 on the proposed subsets.
>
> Derek
>
> On Mon, Jul 10, 2023 at 9:37 AM Josh McKenzie 
> wrote:
>
>
> I'm personally not thinking about CircleCI at all; I'm envisioning a world
> where all of us have 1 CI *software* system (i.e. reproducible on any
> env) that we use for pre-commit validation, and then post-commit happens on
> reference ASF hardware.
>
> So:
> 1: Pre-commit subset of tests (suites + matrices + env) runs. On green,
> merge.
> 2: Post-commit tests (all suites, matrices, env) runs. If failure, link
> back to the JIRA where the commit took place
>
> Circle would need to remain in lockstep with the requirements for point 1
> here.
>
> On Mon, Jul 10, 2023, at 1:04 AM, Berenguer Blasi wrote:
>
> +1 to Josh which is exactly my line of thought as well. But that is only
> valid if we have a solid Jenkins that will eventually run all test configs.
> So I think I lost track a bit here. Are you proposing:
>
> 1- CircleCI: Run pre-commit a single (the most common/meaningful, TBD)
> config of tests
>
> 2- Jenkins: Runs post-commit _all_ test configs and emails/notifies you in
> case of problems?
>
> Or sthg different like having 1 also in Jenkins?
> On 7/7/23 17:55, Andrés de la Peña wrote:
>
> I think 500 runs combining all configs could be reasonable, since it's
> unlikely to have config-specific flaky tests. As in five configs with 100
> repetitions each.
>
> On Fri, 7 Jul 2023 at 16:14, Josh McKenzie  wrote:
>
> Maybe. Kind of depends on how long we write our tests to run doesn't it? :)
>
> But point taken. Any non-trivial test would start to be something of a
> beast under this approach.
>
> On Fri, Jul 7, 2023, at 11:12 AM, Brandon Williams wrote:
>
> On Fri, Jul 7, 2023 at 10:09 AM Josh McKenzie 
> wrote:
> > 3. Multiplexed tests (changed, added) run against all JDK's and a
> broader range of configs (no-vnode, vnode default, compression, etc)
>
> I think this is going to be too heavy...we're taking 500 iterations
> and multiplying that by like 4 or 5?
>
>
>
>
>
> --
> +---+
> | Derek Chen-Becker |
> | GPG Key available at https://keybase.io/dchenbecker and   |
> | 

Re: [DISCUSS] When to run CheckStyle and other verificiations

2023-07-11 Thread Jacek Lewandowski
Thanks,

I will follow that path then.
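
For reference, the agreed proposal amounts to roughly this shape in build.xml
(dependency names are illustrative and may not match the real targets):

```xml
<!-- Sketch only: a single aggregate "check" target, decoupled from "jar" and
     "test", that runs everything CI expects to pass. -->
<target name="check" depends="checkstyle,rat-check,eclipse-warnings"
        description="Run all static checks expected to pass in CI"/>
```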



Mon, 10 Jul 2023 at 19:03 Jon Meredith  wrote:

> +1 from me too. I would support removing all of the optional checks from
> jar/test as I also hit issues with rat from time to time while iterating,
> as long as the CI system runs them and makes it very clear for any
> committer there are failures.
>
> On Mon, Jul 10, 2023 at 9:40 AM Josh McKenzie 
> wrote:
>
>>
>>- Remove the checkstyle dependency from "jar" and "test"
>>- Create a single "check" target that includes all the checks we
>>expect to pass in the CI (currently Checkstyle, RAT, and 
>> Eclipse-Warnings),
>>making this task the default.
>>
>> +1 here.
>>
>> (of note: haven't forgotten the request from this thread to share local
>> env; just gotten sidetracked by things and also realized how little I've
>> actually modified locally since I just run most of the linting against
>> delta'ed files only to keep my changed work in compliance. Still a very
>> noisy mess when SpotBugs is run against the entire codebase proper)
>>
>> On Mon, Jul 10, 2023, at 7:13 AM, Brandon Williams wrote:
>>
>> On Mon, Jul 10, 2023 at 6:07 AM Jacek Lewandowski
>>  wrote:
>> > Remove the checkstyle dependency from "jar" and "test"
>> > Create a single "check" target that includes all the checks we expect
>> to pass in the CI (currently Checkstyle, RAT, and Eclipse-Warnings), making
>> this task the default.
>>
>> I support this.  Having checkstyle run when building is clearly
>> constant friction for many, even though you can disable it.
>>
>>
>>


Re: [DISCUSS] When to run CheckStyle and other verifications

2023-07-10 Thread Jacek Lewandowski
>>> ant test -Dno-build
>>> ant publish -Dno-tests -Dno-checks
>>>
>>>
>>> I'm not saying what you've proposed is bad, in fact, we're not
>>> currently doing the pipeline I'm talking about, but adding an
>>> additional endpoint is something we should consider very carefully as
>>> it may create some difficulties for Maven/Gradle migration if it ever
>>> happens.
>>>
>>> So, if I'm not mistaken the following you're trying to add a new
>>> endpoint to the way how we might build the project:
>>>
>>> - "ant [check]" = build + all checks (first endpoint)
>>> - "ant jar" = build + make jars + no checks (second endpoint)
>>>
>>> And I would suggest running `ant jar -Dno-checks` instead to achieve
>>> the same result assuming the `jar` is still transitively dependent on
>>> `checks`.
>>>
>>> On Thu, 6 Jul 2023 at 14:02, Jacek Lewandowski
>>>  wrote:
>>> >
>>> > Great discussion, but I feel we still have no conclusion.
>>> >
>>> >
>>> > I fully support automatically setting up IDE(A) to run the necessary
>>> stuff automatically in a developer-friendly environment, but let it be
>>> continued in a separate thread.
>>> >
>>> >
>>> > I wouldn't say I like flags, especially if they have to be used on a
>>> daily basis. The build script help message does not list them when "ant -p"
>>> is run.
>>> >
>>> >
>>> > I'm going to make these changes unless it is vetoed:
>>> >
>>> > "ant [check]" = build + all checks, build everything, and run all the
>>> checks; also, this would become the default target if no target is specified
>>> > "ant jar" = build + make jars: build all the jars and tests, no checks
>>> > All "test" commands = build + make jars + run the tests: build all the
>>> jars and tests, run the tests, no checks
>>> >
>>> >
>>> > Therefore, a user who wants to validate their branch before running CI
>>> would need to run just "ant" without any args. This way, a newcomer who
>>> does not know our build targets will likely run the checks.
>>> >
>>> >
>>> > We still need some flags for skipping specific tasks to optimize for
>>> CI, but in general, they would not be required for local development.
>>> >
>>> >
>>> > Flags will also be needed to customize some tasks, but they should be
>>> optional for newcomers. In addition, a "help" target could display a list
>>> of selected tasks and properties with descriptions.
>>> >
>>> >
>>> > I'd be more than happy if we could conclude the discussion somehow and
>>> move forward :)
>>> >
>>> >
>>> > thanks,
>>> >
>>> > Jacek
>>> >
>>> >
>>> >
>>> > On Thu, 29 Jun 2023 at 23:34, Ekaterina Dimitrova
>>> wrote:
>>> >>
>>> >> There is a separate thread started and respective ticket for
>>> generate-idea-files.
>>> >> https://lists.apache.org/thread/o2fdkyv2skvf9ngy9jhpnhvo92qvr17m
>>> >> CASSANDRA-18467
>>> >>
>>> >>
>>> >> On Thu, 29 Jun 2023 at 16:54, Jeremiah Jordan <
>>> jeremiah.jor...@gmail.com> wrote:
>>> >>>
>>> >>> +100 I support making generate-idea-files auto setup everything in
>>> IntelliJ for you.  If you post a diff, I will test it.
>>> >>>
>>> >>> On this proposal, I don’t really have an opinion one way or the
>>> other about what the default is for local "ant jar”, if its slow I will
>>> figure out how to turn it off, if its fast I will leave it on.
>>> >>> I do care that CI runs checks, and complains loudly if something is
>>> wrong such that it is very easy to tell during review.
>>> >>>
>>> >>> -Jeremiah
>>> >>>
>>> >>> On Jun 29, 2023 at 1:44:09 PM, Josh McKenzie 
>>> wrote:
>>> >>>>
>>> >>>> In accord I added an opt-out for each hook, and will require such
>>> here as well
>>> >>>>
>>> >>>> On for main branches, off for feature branches seems like it might
>>> >>>> blanket satisfy this concern? Doesn't fix the

Re: [DISCUSS] Limiting query results by size (CASSANDRA-11745)

2023-07-10 Thread Jacek Lewandowski
Given what was said, I propose rephrasing this functionality to limit the
memory used to execute a query. We will not expose the page size measured
in bytes to the client. Instead, an upper limit will be a guardrail so that
we won't fetch more data.

Aggregation query with grouping is a special case in which we would count
only those columns marked as queried in a ColumnFilter for a grouped result
(maximum sizes of those columns in a group).

This way, we can still achieve the goal of making the server more stable
under heavy load. Letting the user specify a page size in bytes is indeed a
separate story, as the result set size needs to be measured on a higher
level, where the selectors are applied.

thanks,
Jacek

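The accounting Jacek describes above — close a page once a byte budget is reached, and for GROUP BY charge each group by the size of its largest member row — can be sketched as follows. This is an illustrative sketch only: `BytePageLimiter`, `tryAdd`, and `groupFootprint` are hypothetical names, not Cassandra's actual `DataLimits` machinery.

```java
import java.util.List;

// Sketch of byte-budgeted paging: rows are admitted to the current page until
// the accumulated serialized size would exceed the budget (the guardrail idea
// from the thread). The first row is always admitted so a page can never be
// empty, even if one row alone is larger than the budget.
final class BytePageLimiter {
    private final long maxBytes;
    private long accumulated;

    BytePageLimiter(long maxBytes) { this.maxBytes = maxBytes; }

    /** Returns true if the row was admitted; false means the page is full. */
    boolean tryAdd(long rowSizeInBytes) {
        if (accumulated > 0 && accumulated + rowSizeInBytes > maxBytes)
            return false; // caller closes the page and sets Has_more_pages
        accumulated += rowSizeInBytes;
        return true;
    }

    /** GROUP BY approximation from the thread: charge a group as its largest row. */
    static long groupFootprint(List<Long> rowSizesInGroup) {
        long max = 0L;
        for (long size : rowSizesInGroup)
            max = Math.max(max, size);
        return max;
    }
}
```

Note the deliberate asymmetry: a page may end up slightly *under* the budget (a short page), which is safe precisely because the protocol tells clients to trust only the more-pages flag, not the page size.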

On Tue, 13 Jun 2023 at 10:42, Benjamin Lerer wrote:

> So my other question - for aggregation with the "group by" clause, we
>> return an aggregated row which is computed from a group of rows - with my
>> current implementation, it is approximated by counting the size of the
>> largest row in that group - I think it is the safest and simplest
>> approximation - wdyt?
>
>
> I feel that there are something that was not discussed here. The storage
> engine can return some rows that are much larger than the actual row
> returned to the user depending on the projections being used. Therefore
> there will only be a reliable matching between the size of the page loaded
> internally and the size of the page returned to the user when the full row
> is queried without transformation. For all the other case the difference
> can be really significant. For a group by queries doing a count(*), the
> approach suggested will return a page size that is totally off with what
> was requested.
>
> On Tue, 13 Jun 2023 at 07:00, Jacek Lewandowski <
> lewandowski.ja...@gmail.com> wrote:
>
>> Josh, that answers my question exactly; thank you.
>>
>> I will not implement limiting the result set in CQL (that is, by LIMIT
>> clause) and stay with just paging. Whether the page size is defined in
>> bytes or rows can be determined by a flag - there are many unused bits for
>> that.
>>
>> So my other question - for aggregation with the "group by" clause, we
>> return an aggregated row which is computed from a group of rows - with my
>> current implementation, it is approximated by counting the size of the
>> largest row in that group - I think it is the safest and simplest
>> approximation - wdyt?
>>
>>
>> On Mon, 12 Jun 2023 at 22:55, Josh McKenzie
>> wrote:
>>
>>> As long as it is valid in the paging protocol to return a short page,
>>> but still say “there are more pages”, I think that is fine to do that.
>>>
>>> Thankfully the v3-v5 spec all make it clear that clients need to respect
>>> what the server has to say about there being more pages:
>>> https://github.com/apache/cassandra/blob/trunk/doc/native_protocol_v5.spec#L1247-L1253
>>>
>>>   - Clients should not rely on the actual size of the result set
>>> returned to
>>> decide if there are more results to fetch or not. Instead, they
>>> should always
>>> check the Has_more_pages flag (unless they did not enable paging for
>>> the query
>>> obviously). Clients should also not assert that no result will have
>>> more than
>>>  results. While the current implementation always
>>> respects
>>> the exact value of <result_page_size>, we reserve the right to return
>>> slightly smaller or bigger pages in the future for performance
>>> reasons.
>>>
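A minimal client-side sketch of the rule quoted above: treat the `Has_more_pages` flag as the only end-of-results signal, and accept pages that are shorter than requested. The `Page` and `PageFetcher` types below are hypothetical stand-ins for illustration, not a real driver API.

```java
import java.util.ArrayList;
import java.util.List;

// Paging loop that follows the v5 spec guidance: a short page does NOT mean
// the result set is exhausted; only hasMorePages == false ends iteration.
final class PagingClient {
    record Page(List<String> rows, boolean hasMorePages) {}

    interface PageFetcher { Page fetch(int pageNumber); }

    static List<String> fetchAll(PageFetcher fetcher) {
        List<String> all = new ArrayList<>();
        int pageNumber = 0;
        Page page;
        do {
            page = fetcher.fetch(pageNumber++);
            all.addAll(page.rows());     // a short page is still a valid page
        } while (page.hasMorePages());   // only this flag ends the loop
        return all;
    }
}
```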
>>>
>>> On Mon, Jun 12, 2023, at 3:19 PM, Jeremiah Jordan wrote:
>>>
>>> As long as it is valid in the paging protocol to return a short page,
>>> but still say “there are more pages”, I think that is fine to do that.  For
>>> an actual LIMIT that is part of the user query, I think the server must
>>> always have returned all data that fits into the LIMIT when all pages have
>>> been returned.
>>>
>>> -Jeremiah
>>>
>>> On Jun 12, 2023 at 12:56:14 PM, Josh McKenzie 
>>> wrote:
>>>
>>>
>>> Yeah, my bad. I have paging on the brain. Seriously.
>>>
>>> I can't think of a use-case in which a LIMIT based on # bytes makes
>>> sense from a user perspective.
>>>
>>> On Mon, Jun 12, 2023, at 1:35 PM, Jeff Jirsa wrote:
>>>
>>>
>>>
>>> On Mon, Jun 12, 2023 at 9:50 AM Benjamin Lerer 
>>> wrote:
>>>
>>> If you have rows that vary significantly in their size, your latencies
>>> could end u

Re: [DISCUSS] When to run CheckStyle and other verifications

2023-07-06 Thread Jacek Lewandowski
>
>>> Trading one for one with Josh :-)
>>>
>>> Best regards,
>>> Ekaterina
>>>
>>> On Thu, 29 Jun 2023 at 10:52, Josh McKenzie 
>>> wrote:
>>>
>>>
>>> I really prefer separate tasks than flags. Flags are not listed in the
>>> help message like "ant -p" and are not auto-completed in the terminal. That
>>> makes them almost undiscoverable for newcomers.
>>>
>>> Please, no more flags. We are *more* than flaggy enough right now.
>>>
>>> Having to dig through build.xml to determine how to change things or do
>>> things is painful; the more we can avoid this (for oldtimers and newcomers
>>> alike!) the better.
>>>
>>> On Thu, Jun 29, 2023, at 8:34 AM, Mick Semb Wever wrote:
>>>
>>>
>>>
>>> On Thu, 29 Jun 2023 at 13:30, Jacek Lewandowski <
>>> lewandowski.ja...@gmail.com> wrote:
>>>
>>> There is another target called "build", which retrieves dependencies,
>>> and then calls "build-project".
>>>
>>>
>>>
>>> Is it intended to be called by a user ?
>>>
>>> If not, please follow the ant style prefixing the target name with an
>>> underscore (so that it does not appear in the `ant -projecthelp` list).
>>>
>>> If possible, I agree with Brandon, `build` is the better name to expose
>>> to the user.
>>>
>>>
>>>
>>>


Re: [DISCUSS] Formalizing requirements for pre-commit patches on new CI

2023-07-05 Thread Jacek Lewandowski
Perhaps pre-commit checks should include mostly the typical configuration
of Cassandra rather than some subset of possible combinations. Like it was
said somewhere above - test with the default number of vnodes, test with
the default compression settings, and test with the default heap/off-heap
buffers.

A longer-term goal could be to isolate what depends on particular
configuration options. Instead of blindly running everything with, say,
vnodes enabled and disabled, isolate those tests that need to be run with
those two configurations and run the rest with the default one.

... the rule of multiplexing new or changed tests might go a long way to
> mitigating that ...


I wonder if there is some commonality in the flaky tests reported so far,
like the presence of certain statements? Also, there could be a tool that
inspects coverage analysis reports and chooses the proper tests to
run/multiplex because, in the end, we want to verify the changed production
code in addition to the modified test files.

thanks,
Jacek

On Wed, 5 Jul 2023 at 06:28, Berenguer Blasi
wrote:

> Currently a dtest is being ran in j8 w/wo vnodes , j8/j11 w/wo vnodes and
> j11 w/wo vnodes. That is 6 times total. I wonder about that ROI.
>
> On dtest cluster reusage yes, I stopped that as at the time we had lots of
> CI changes, an upcoming release and priorities. But when the CI starts
> flexing it's muscles that'd be easy to pick up again as dtests code
> shouldn't have changed much.
> On 4/7/23 17:11, Derek Chen-Becker wrote:
>
> Ultimately I think we have to invest in two directions: first, choose a
> consistent, representative subset of stable tests that we feel give us a
> reasonable level of confidence in return for a reasonable amount of
> runtime. Second, we need to invest in figuring out why certain tests fail.
> I strongly dislike the term "flaky" because it suggests that it's some
> inconsequential issue causing problems. The truth is that a test that fails
> is either a bug in the service code or a bug in the test. I've come to
> realize that the CI and build framework is way too complex for me to be
> able to help with much, but I would love to start chipping away at failing
> test bugs. I'm getting settled into my new job and I should be able to
> commit some regular time each week to triage and fixing starting in August,
> and if there are any other folks who are interested let me know.
>
> Cheers,
>
> Derek
>
> On Mon, Jul 3, 2023, 12:30 PM Josh McKenzie  wrote:
>
>> Instead of running all the tests through available CI agents every time
>> we can have presets of tests:
>>
>> Back when I joined the project in 2014, unit tests took ~ 5 minutes to
>> run on a local machine. We had pre-commit and post-commit tests as a
>> distinction as well, but also had flakes in the pre and post batch. I'd
>> love to see us get back to a unit test regime like that.
>>
>> The challenge we've always had is flaky tests showing up in either the
>> pre-commit or post-commit groups and difficulty in attribution on a flaky
>> failure to where it was introduced (not to lay blame but to educate and
>> learn and prevent recurrence). While historically further reduced smoke
>> testing suites would just mean more flakes showing up downstream, the rule
>> of multiplexing new or changed tests might go a long way to mitigating that.
>>
>> Should we mention in this concept how we will build the sub-projects
>> (e.g. Accord) alongside Cassandra?
>>
>> I think it's an interesting question, but I also think there's no real
>> dependency of process between primary mainline branches and feature
>> branches. My intuition is that having the same bar (green CI, multiplex,
>> don't introduce flakes, smart smoke suite tiering) would be a good idea on
>> feature branches so there's not a death march right before merge, squashing
>> flakes when you have to multiplex hundreds of tests before merge to
>> mainline (since presumably a feature branch would impact a lot of tests).
>>
>> Now that I write that all out it does sound Painful. =/
>>
>> On Mon, Jul 3, 2023, at 10:38 AM, Maxim Muzafarov wrote:
>>
>> For me, the biggest benefit of keeping the build scripts and CI
>> configurations as well in the same project is that these files are
>> versioned in the same way as the main sources do. This ensures that we
>> can build past releases without having any annoying errors in the
>> scripts, so I would say that this is a pretty necessary change.
>>
>> I'd like to mention the approach that could work for the projects with
>> a huge amount of tests. Instead of running all the tests through
>> available CI agents every time we can have presets of tests:
>> - base tests (to make sure that your design basically works, the set
>> will not run longer than 30 min);
>> - pre-commit tests (the number of tests to make sure that we can
>> safely commit new changes and fit the run into the 1-2 hour build
>> timeframe);
>> - nightly builds (scheduled task to build everything we 

Re: [DISCUSS] When to run CheckStyle and other verifications

2023-06-29 Thread Jacek Lewandowski
There is another target called "build", which retrieves dependencies, and
then calls "build-project".


On Thu, 29 Jun 2023 at 12:33, Brandon Williams wrote:

> This sounds good to me.  Can we shorten 'build-project' to just 'build'?
>
> Kind Regards,
> Brandon
>
> On Thu, Jun 29, 2023 at 3:22 AM Jacek Lewandowski
>  wrote:
> >
> > So given all the feedback, I'm going to do the following:
> >
> > "jar" will depend on "check" target
> > "build-project", "build-test" and "test" targets will not depend on
> "check" target
> > "check" target will include checkstyle, rat and eclipse-warnings
> >
> > There is an additional flag "no-check" to disable checks in the "jar"
> target.
> >
> > Will not introduce any Git hook.
> >
> > On Tue, 27 Jun 2023 at 18:35, Jacek Lewandowski
> wrote:
> >>
> >> With git you can always opt-out by adding --no-verify flag to either
> push or commit
> >>
> >> I really prefer separate tasks than flags. Flags are not listed in the
> help message like "ant -p" and are not auto-completed in the terminal. That
> makes them almost undiscoverable for newcomers.
> >>
> >> Want to have jar include checks? Ok, but let's don't run checks
> automatically with "build" or "test"
> >>
> >>
> >>
> >> On Tue, 27 Jun 2023 at 18:26, David Capwell wrote:
> >>>
> >>> nobody referred to running checks in a pre-push (or pre-commit) hook
> >>>
> >>>
> >>>
> >>> In accord I added an opt-out for each hook, and will require such here
> as well… as long as you can opt-out, its fine by me… I know I will likely
> opt-out, but wouldn’t block such an effort
> >>>
> >>> Your point that pre-push hook might not be the best one is valid, and
> we should rather think about pre-commit
> >>>
> >>>
> >>> Pre-push is far better than pre-commit, with pre-commit you are
> forcing a style on people…. I for one have many many commits in a single
> PR, where I use commits to not loose track of progress (even if the code
> doesn’t compile), so forcing the build to work would be a -1 from me….
> Pre-push at least means “you want the world to see this” so makes more
> sense there…
> >>>
> >>> Again, must have an opt-out
> >>>
> >>> proposed:
> >>> ant jar (just build)
> >>> git commit (run some checks)
> >>>
> >>>
> >>> I am against this, jar should also check and ask users to opt-out if
> they don’t want the checks….
> >>>
> >>> If we go with opt-out i'd like to see one flag that can disable all
> checks: `-Dchecks.skip`
> >>>
> >>>
> >>> Works for me, you can also do the following to disable and not worry
> about
> >>>
> >>> $ cat <<EOF > build.properties
> >>> checks.skip: true
> >>> EOF
> >>>
> >>> On Jun 27, 2023, at 3:14 AM, Mick Semb Wever  wrote:
> >>>
> >>> The context is that we currently have 3 checks in the build:
> >>>
> >>> - Checkstyle,
> >>> - Eclipse-Warnings,
> >>> - RAT
> >>>
> >>>
> >>>
> >>> And dependency-check (owasp).
> >>>
> >>>
> >>>
> >>> I want to discuss whether you are ok with extracting all checks to
> their distinct target and not running it automatically with the targets
> which devs usually run locally. In particular:
> >>>
> >>>
> >>> "build", "jar", and all "test" targets would not trigger CheckStyle,
> RAT or Eclipse-Warnings
> >>> A new target "check" would trigger all CheckStyle, RAT, and
> Eclipse-Warnings
> >>> The new "check" target would be run along with the "artifacts" target
> on Jenkins-CI, and it as a separate build step in CircleCI
> >>>
> >>>
> >>>
> >>> +0 I prefer this opt-in over an opt-out approach.
> >>>
> >>> It should be separated from `artifacts` too.
> >>> We would need to encourage engineers to run `ant check` before
> >>> starting CI and/or requesting review.
> >>>
> >>> I'm in favour of the opt-in approach because it keeps it visible.
> >>> Folks configure flags and it "disappears" forever.  Also it's a
> >>> headache in all the other ant targets where we actually don't want it,
> >>> e.g. tests.
> >>>
> >>> If we go with opt-out i'd like to see one flag that can disable all
> >>> checks: `-Dchecks.skip`
> >>>
> >>>
> >>> That could be fixed by running checks in a pre-push Git hook. There
> are some benefits of this compared to the current behavior:
> >>>
> >>>
> >>>
> >>> -1
> >>> committing and pushing to a personal branch is often done to save work
> >>> or for cross-machine or collaboration. We should not gate on checks or
> >>> compilation here.
> >>>
> >>> PRs should fail if checks fail, to give newcomers clear feedback (and
> >>> to take this nit-picking out of the review process).
> >>>
> >>>
>


Re: [DISCUSS] When to run CheckStyle and other verifications

2023-06-29 Thread Jacek Lewandowski
So given all the feedback, I'm going to do the following:

"jar" will depend on "check" target
"build-project", "build-test" and "test" targets will not depend on "check"
target
"check" target will include checkstyle, rat and eclipse-warnings

There is an additional flag "no-check" to disable checks in the "jar"
target.

Will not introduce any Git hook.

On Tue, 27 Jun 2023 at 18:35, Jacek Lewandowski
wrote:

> With git you can always opt-out by adding --no-verify flag to either push
> or commit
>
> I really prefer separate tasks than flags. Flags are not listed in the
> help message like "ant -p" and are not auto-completed in the terminal. That
> makes them almost undiscoverable for newcomers.
>
> Want to have jar include checks? Ok, but let's don't run checks
> automatically with "build" or "test"
>
>
>
> On Tue, 27 Jun 2023 at 18:26, David Capwell wrote:
>
>> nobody referred to running checks in a pre-push (or pre-commit) hook
>>
>>
>>
>> In accord I added an opt-out for each hook, and will require such here as
>> well… as long as you can opt-out, its fine by me… I know I will likely
>> opt-out, but wouldn’t block such an effort
>>
>> Your point that pre-push hook might not be the best one is valid, and we
>> should rather think about pre-commit
>>
>>
>> Pre-push is far better than pre-commit, with pre-commit you are forcing a
>> style on people…. I for one have many many commits in a single PR, where I
>> use commits to not loose track of progress (even if the code doesn’t
>> compile), so forcing the build to work would be a -1 from me…. Pre-push at
>> least means “you want the world to see this” so makes more sense there…
>>
>> Again, must have an opt-out
>>
>> proposed:
>> ant jar (just build)
>> git commit (run some checks)
>>
>>
>> I am against this, jar should also check and ask users to opt-out if they
>> don’t want the checks….
>>
>> If we go with opt-out i'd like to see one flag that can disable all checks:
>> `-Dchecks.skip`
>>
>>
>> Works for me, you can also do the following to disable and not worry about
>>
>> $ cat <<EOF > build.properties
>> checks.skip: true
>> EOF
>>
>> On Jun 27, 2023, at 3:14 AM, Mick Semb Wever  wrote:
>>
>> The context is that we currently have 3 checks in the build:
>>
>> - Checkstyle,
>> - Eclipse-Warnings,
>> - RAT
>>
>>
>>
>> And dependency-check (owasp).
>>
>>
>>
>> I want to discuss whether you are ok with extracting all checks to their
>> distinct target and not running it automatically with the targets which
>> devs usually run locally. In particular:
>>
>>
>> "build", "jar", and all "test" targets would not trigger CheckStyle, RAT
>> or Eclipse-Warnings
>> A new target "check" would trigger all CheckStyle, RAT, and
>> Eclipse-Warnings
>> The new "check" target would be run along with the "artifacts" target on
>> Jenkins-CI, and it as a separate build step in CircleCI
>>
>>
>>
>> +0 I prefer this opt-in over an opt-out approach.
>>
>> It should be separated from `artifacts` too.
>> We would need to encourage engineers to run `ant check` before
>> starting CI and/or requesting review.
>>
>> I'm in favour of the opt-in approach because it keeps it visible.
>> Folks configure flags and it "disappears" forever.  Also it's a
>> headache in all the other ant targets where we actually don't want it,
>> e.g. tests.
>>
>> If we go with opt-out i'd like to see one flag that can disable all
>> checks: `-Dchecks.skip`
>>
>>
>> That could be fixed by running checks in a pre-push Git hook. There are
>> some benefits of this compared to the current behavior:
>>
>>
>>
>> -1
>> committing and pushing to a personal branch is often done to save work
>> or for cross-machine or collaboration. We should not gate on checks or
>> compilation here.
>>
>> PRs should fail if checks fail, to give newcomers clear feedback (and
>> to take this nit-picking out of the review process).
>>
>>
>>


Re: [DISCUSS] When to run CheckStyle and other verifications

2023-06-27 Thread Jacek Lewandowski
With git you can always opt-out by adding --no-verify flag to either push
or commit

I really prefer separate tasks than flags. Flags are not listed in the help
message like "ant -p" and are not auto-completed in the terminal. That
makes them almost undiscoverable for newcomers.

Want to have jar include checks? Ok, but let's don't run checks
automatically with "build" or "test"



On Tue, 27 Jun 2023 at 18:26, David Capwell wrote:

> nobody referred to running checks in a pre-push (or pre-commit) hook
>
>
>
> In accord I added an opt-out for each hook, and will require such here as
> well… as long as you can opt-out, its fine by me… I know I will likely
> opt-out, but wouldn’t block such an effort
>
> Your point that pre-push hook might not be the best one is valid, and we
> should rather think about pre-commit
>
>
> Pre-push is far better than pre-commit, with pre-commit you are forcing a
> style on people…. I for one have many many commits in a single PR, where I
> use commits to not loose track of progress (even if the code doesn’t
> compile), so forcing the build to work would be a -1 from me…. Pre-push at
> least means “you want the world to see this” so makes more sense there…
>
> Again, must have an opt-out
>
> proposed:
> ant jar (just build)
> git commit (run some checks)
>
>
> I am against this, jar should also check and ask users to opt-out if they
> don’t want the checks….
>
> If we go with opt-out i'd like to see one flag that can disable all checks:
> `-Dchecks.skip`
>
>
> Works for me, you can also do the following to disable and not worry about
>
> $ cat <<EOF > build.properties
> checks.skip: true
> EOF
>
> On Jun 27, 2023, at 3:14 AM, Mick Semb Wever  wrote:
>
> The context is that we currently have 3 checks in the build:
>
> - Checkstyle,
> - Eclipse-Warnings,
> - RAT
>
>
>
> And dependency-check (owasp).
>
>
>
> I want to discuss whether you are ok with extracting all checks to their
> distinct target and not running it automatically with the targets which
> devs usually run locally. In particular:
>
>
> "build", "jar", and all "test" targets would not trigger CheckStyle, RAT
> or Eclipse-Warnings
> A new target "check" would trigger all CheckStyle, RAT, and
> Eclipse-Warnings
> The new "check" target would be run along with the "artifacts" target on
> Jenkins-CI, and it as a separate build step in CircleCI
>
>
>
> +0 I prefer this opt-in over an opt-out approach.
>
> It should be separated from `artifacts` too.
> We would need to encourage engineers to run `ant check` before
> starting CI and/or requesting review.
>
> I'm in favour of the opt-in approach because it keeps it visible.
> Folks configure flags and it "disappears" forever.  Also it's a
> headache in all the other ant targets where we actually don't want it,
> e.g. tests.
>
> If we go with opt-out i'd like to see one flag that can disable all
> checks: `-Dchecks.skip`
>
>
> That could be fixed by running checks in a pre-push Git hook. There are
> some benefits of this compared to the current behavior:
>
>
>
> -1
> committing and pushing to a personal branch is often done to save work
> or for cross-machine or collaboration. We should not gate on checks or
> compilation here.
>
> PRs should fail if checks fail, to give newcomers clear feedback (and
> to take this nit-picking out of the review process).
>
>
>


Re: [DISCUSS] When to run CheckStyle and other verifications

2023-06-27 Thread Jacek Lewandowski
Stefan, if you build using command line and then commit / push also in the
terminal, nothing would change for you:

now:
ant jar (automatically runs some checks)
git commit
git push

proposed:
ant jar (just build)
git commit (run some checks)
git push

Your point that pre-push hook might not be the best one is valid, and we
should rather think about pre-commit



- - -- --- -  -
Jacek Lewandowski


On Tue, 27 Jun 2023 at 09:25, Miklosovic, Stefan
wrote:

> I am doing all git-related operations in the console / bash (IDEA
> terminal, alt+f12 in IDEA). I know IDEA can do git stuff as well but I
> never tried it and I just do not care. I just do not "believe it" (yeah,
> call me old-fashioned if you want) so for me how it looks like in IDEA
> around some checkboxes I have to turn off is irrelevant.
>
> I do not like the idea of git hooks. Maybe it is a matter of a strong
> habit but I am executing all these checks before I push anyway so for me
> the git hooks are not important and I would have to unlearn building it if
> git hook is going to do that for me instead.
>
> If I am going to push 5 branches like this:
>
> git push upstream cassandra-3.0 cassandra-3.11 cassandra-4.0 cassandra-4.1
> trunk --atomic
>
> This means that git hooks would start to build 5 branches again? What if
> somebody pushes as I am building it? Building 5 branches from scratch would
> take like 10 minutes, probably ...
>
> 
> From: Jacek Lewandowski 
> Sent: Tuesday, June 27, 2023 9:08
> To: dev@cassandra.apache.org
> Subject: Re: [DISCUSS] When to run CheckStyle and other verifications
>
> NetApp Security WARNING: This is an external email. Do not click links or
> open attachments unless you recognize the sender and know the content is
> safe.
>
>
>
> So far, nobody referred to running checks in a pre-push (or pre-commit)
> hook. The use of Git hooks is going to be introduced along with Accord, so
> we could use them to force running checks once before sending changes to
> the repo.
> It would still be an opt-out approach because one would have to add the
> "--no-verify" flag or uncheck a box in the commit dialog to skip running
> the checks.
>
> thanks,
> Jacek
>
>
> On Tue, 27 Jun 2023 at 01:55, Ekaterina Dimitrova <e.dimitr...@gmail.com> wrote:
> Thank you, Jacek, for starting the thread; those things are essential for
> developer productivity.
>
> I support the idea of opting out vs opting into checks. In my experience,
> it also makes things easier and faster during review time.
>
> If people have to opt-in - it is one more thing for new people to
> discover, and it will probably happen only during review time if they do
> not have access to Jenkins/paid CircleCI, etc.
>
> I also support consolidating all types of checks/analyses and running them
> together.
>
> Maxim’s suggestion about rat replacement sounds like a good improvement
> that can be explored (not part of what Jacek does here, though). Maxim, do
> you mind creating a ticket, please?
>
> Best regards,
> Ekaterina
>
> On Mon, 26 Jun 2023 at 17:04, Miklosovic, Stefan <
> stefan.mikloso...@netapp.com> wrote:
> Yes, in this case, opting-out is better than opting-in. I feel like the
> build process is quite versatile and one just picks what is necessary. I
> never build docs, there is a flag for that. I turned off checkstyle because
> I was fed up with that until Berenguer cached it and now I get ant jar with
> checkstyle like under 10 seconds so I leave it on, which is great.
>
> Even though I feel like it is already flexible enough, grouping all
> checkstyles and rats etc under one target seems like a good idea. From my
> perspective, it is "all or nothing" so turning it all off until I am going
> to push it so I want it all on is a good idea. I barely want to "just
> checkstyle" in the middle of the development.
>
> I do not think that having a lot of flags is bad. I like that I have bash
> aliases almost for everything and I bet folks have their tricks to get the
> mundane stuff done.
>
> It would be pretty interesting to know the workflow of other people. I
> think there would be a lot of insights how other people have it on a daily
> basis when it comes to Cassandra development.
>
> 
> From: David Capwell mailto:dcapw...@apple.com>>
> Sent: Monday, June 26, 2023 19:57
> To: dev
> Subject: Re: [DISCUSS] When to run CheckStyle and other verificiations
>
> NetApp Security WARNING: This is an external email. Do not click links or
> open attachments unless you recognize

Re: [DISCUSS] When to run CheckStyle and other verifications

2023-06-27 Thread Jacek Lewandowski
So far, nobody referred to running checks in a pre-push (or pre-commit)
hook. The use of Git hooks is going to be introduced along with Accord, so
we could use them to force running checks once before sending changes to
the repo.
It would still be an opt-out approach because one would have to add the
"--no-verify" flag or uncheck a box in the commit dialog to skip running
the checks.
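A Git hook can be any executable, so the pre-push hook described above could be a small script like the following sketch (the "ant check" target name and the messages are illustrative assumptions, not the project's actual setup):

```python
#!/usr/bin/env python3
"""Sketch of a .git/hooks/pre-push hook that runs the consolidated
checks once before every push.  "git push --no-verify" skips the hook
entirely, which is the opt-out described above."""
import subprocess
import sys

def run_checks(cmd=("ant", "check")):
    """Run the consolidated check target once; return its exit status."""
    return subprocess.call(list(cmd))

def main():
    status = run_checks()
    if status != 0:
        print("Checks failed; fix them or push with --no-verify.",
              file=sys.stderr)
    sys.exit(status)

# In the installed hook file this would simply end with: main()
```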

thanks,
Jacek


Tue, 27 Jun 2023 at 01:55 Ekaterina Dimitrova wrote:

> Thank you, Jacek, for starting the thread; those things are essential for
> developer productivity.
>
> I support the idea of opting out vs opting into checks. In my experience,
> it also makes things easier and faster during review time.
>
> If people have to opt-in - it is one more thing for new people to
> discover, and it will probably happen only during review time if they do
> not have access to Jenkins/paid CircleCI, etc.
>
> I also support consolidating all types of checks/analyses and running them
> together.
>
> Maxim’s suggestion about rat replacement sounds like a good improvement
> that can be explored (not part of what Jacek does here, though). Maxim, do
> you mind creating a ticket, please?
>
> Best regards,
> Ekaterina
>
> On Mon, 26 Jun 2023 at 17:04, Miklosovic, Stefan <
> stefan.mikloso...@netapp.com> wrote:
>
>> Yes, in this case, opting-out is better than opting-in. I feel like the
>> build process is quite versatile and one just picks what is necessary. I
>> never build docs, there is a flag for that. I turned off checkstyle because
>> I was fed up with that until Berenguer cached it and now I get ant jar with
>> checkstyle like under 10 seconds so I leave it on, which is great.
>>
>> Even though I feel like it is already flexible enough, grouping all
>> checkstyles and rats etc under one target seems like a good idea. From my
>> perspective, it is "all or nothing" so turning it all off until I am going
>> to push it so I want it all on is a good idea. I barely want to "just
>> checkstyle" in the middle of the development.
>>
>> I do not think that having a lot of flags is bad. I like that I have bash
>> aliases almost for everything and I bet folks have their tricks to get the
>> mundane stuff done.
>>
>> It would be pretty interesting to know the workflow of other people. I
>> think there would be a lot of insights how other people have it on a daily
>> basis when it comes to Cassandra development.
>>
>> 
>> From: David Capwell 
>> Sent: Monday, June 26, 2023 19:57
>> To: dev
>> Subject: Re: [DISCUSS] When to run CheckStyle and other verifications
>>
>> NetApp Security WARNING: This is an external email. Do not click links or
>> open attachments unless you recognize the sender and know the content is
>> safe.
>>
>>
>>
>> not running it automatically with the targets which devs usually run
>> locally.
>>
>> The checks tend to have an opt-out, such as -Dno-checkstyle=true… so it's
>> really easy to set up your local environment to opt out of what you do not
>> care about… I feel we should force people to opt-out rather than opt-in…
>>
>>
>>
>> On Jun 26, 2023, at 7:47 AM, Jacek Lewandowski <
>> lewandowski.ja...@gmail.com> wrote:
>>
>> That would work as well Brandon, basically what is proposed in
>> CASSANDRA-18618, that is "check" target, actually needs to build the
>> project to perform some verifications - I suppose running "ant check"
>> should be sufficient.
>>
>> - - -- --- -  -
>> Jacek Lewandowski
>>
>>
>> Mon, 26 Jun 2023 at 16:01 Brandon Williams <dri...@gmail.com> wrote:
>> The "artifacts" task is not quite the same since it builds other things
>> like docs, which significantly contributes to longer build time.  I don't
>> see why we couldn't add a new task that preserves the old behavior though,
>> "fulljar" or something like that.
>>
>> Kind Regards,
>> Brandon
>>
>>
>> On Mon, Jun 26, 2023 at 6:12 AM Jacek Lewandowski <
>> lewandowski.ja...@gmail.com<mailto:lewandowski.ja...@gmail.com>> wrote:
>> Yes, I've mentioned that there is a property we can set to skip
>> checkstyle.
>>
>> Currently such a goal is "artifacts" which basically validates everything.
>>
>>
>> - - -- --- -  -
>> Jacek Lewandowski
>>
>>
>> Mon, 26 Jun 2023 at 13:09 Mike Adamson <madam...@datastax.com> wrote:
>

Re: [DISCUSS] When to run CheckStyle and other verifications

2023-06-26 Thread Jacek Lewandowski
That would work as well Brandon, basically what is proposed in
CASSANDRA-18618, that is "check" target, actually needs to build the
project to perform some verifications - I suppose running "ant check"
should be sufficient.

- - -- --- - ---- -----
Jacek Lewandowski


Mon, 26 Jun 2023 at 16:01 Brandon Williams wrote:

> The "artifacts" task is not quite the same since it builds other things
> like docs, which significantly contributes to longer build time.  I don't
> see why we couldn't add a new task that preserves the old behavior though,
> "fulljar" or something like that.
>
> Kind Regards,
> Brandon
>
>
> On Mon, Jun 26, 2023 at 6:12 AM Jacek Lewandowski <
> lewandowski.ja...@gmail.com> wrote:
>
>> Yes, I've mentioned that there is a property we can set to skip
>> checkstyle.
>>
>> Currently such a goal is "artifacts" which basically validates everything.
>>
>>
>> - - -- --- -  -
>> Jacek Lewandowski
>>
>>
>> Mon, 26 Jun 2023 at 13:09 Mike Adamson wrote:
>>
>>> While I like the idea of this because of added time these checks take, I
>>> was under the impression that checkstyle (at least) can be disabled with a
>>> flag.
>>>
>>> If we did do this, would it make sense to have a "release"  or "commit"
>>> target (or some other name) that ran a full build with all checks that can
>>> be used prior to pushing changes?
>>>
>>> On Mon, 26 Jun 2023 at 08:35, Berenguer Blasi 
>>> wrote:
>>>
>>>> I would prefer sthg that is totally transparent to me and not add one
>>>> more step I have to remember. Just to push/run CI to find out I missed it
>>>> and rinse and repeat... With the recent fix to checkstyle I am happy as
>>>> things stand atm. My 2cts
>>>> On 26/6/23 8:43, Jacek Lewandowski wrote:
>>>>
>>>> Hi,
>>>>
>>>>
>>>> The context is that we currently have 3 checks in the build:
>>>>
>>>> - Checkstyle,
>>>>
>>>> - Eclipse-Warnings,
>>>>
>>>> - RAT
>>>>
>>>>
>>>> CheckStyle and RAT are executed with almost every target we run: build,
>>>> jar, test, test-some, testclasslist, etc.; on the other hand,
>>>> Eclipse-Warnings is executed automatically only with the artifacts target.
>>>>
>>>>
>>>> Checkstyle currently uses some caching, so subsequent reruns without
>>>> cleaning the project validate only the modified files.
>>>>
>>>>
>>>> Both CIs - Jenkins and Circle - force running all checks.
>>>>
>>>>
>>>> I want to discuss whether you are ok with extracting all checks to
>>>> their distinct target and not running it automatically with the targets
>>>> which devs usually run locally. In particular:
>>>>
>>>>
>>>>
>>>>- "build", "jar", and all "test" targets would not trigger
>>>>CheckStyle, RAT or Eclipse-Warnings
>>>>- A new target "check" would trigger all CheckStyle, RAT, and
>>>>Eclipse-Warnings
>>>>- The new "check" target would be run along with the "artifacts"
>>>>target on Jenkins-CI, and as a separate build step in CircleCI
>>>>
>>>>
>>>> The rationale for that change is:
>>>>
>>>>- Running all the checks together would be more consistent, but
>>>>running all of them automatically with build and test targets could 
>>>> waste
>>>>time when we develop something locally, frequently rebuilding and 
>>>> running
>>>>tests.
>>>>- On the other hand, it would be more consistent if the build did
>>>>what we want - as a dev, when prototyping, I don't want to be forced to 
>>>> run
>>>>analysis (and potentially fix issues) whenever I want to build a 
>>>> project or
>>>>just run a single test.
>>>>- There are ways to avoid running checks automatically by
>>>>specifying some build properties. Though, the discussion is about the
>>>>default behavior - on the flip side, if one wants to run the checks along
>>>>with the specified target, they could add the "check" target to the
>>>>command line.

Re: [DISCUSS] When to run CheckStyle and other verifications

2023-06-26 Thread Jacek Lewandowski
Berenguer, as I said, I started this discussion because it is confusing
that we do implicit and unexpected tasks.
It is inconsistent that we run checkstyle but skip static code analysis
like Eclipse-Warnings, which undermines the advantages of running checks
automatically.
More robust static code analysis will take even more time than
Eclipse-Warnings. Eventually, nobody is guaranteed to run any Ant task
before pushing if the whole development is done in IDE.

You basically want those tasks to be guaranteed to run locally before
pushing, which could be addressed more consistently by adding a Git hook.
Do you think we can encounter some particular problems with this approach?

Maxim, this is a great idea, and I fully support it - but this does not
address the issues I've raised there.
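For reference, Maxim's HeaderCheck idea boils down to a checkstyle configuration fragment along these lines (a sketch; the header file name is a hypothetical placeholder for a file holding the ASF license header):

```xml
<!-- Sketch: checkstyle HeaderCheck module that could replace RAT.
     "license_header.txt" is a hypothetical file with the ASF header. -->
<module name="Checker">
    <module name="Header">
        <property name="headerFile" value="license_header.txt"/>
        <property name="fileExtensions" value="java"/>
    </module>
</module>
```

Because checkstyle already caches results, only modified files would be re-checked, unlike a full RAT pass.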
- - -- --- -  -
Jacek Lewandowski


Mon, 26 Jun 2023 at 14:05 Maxim Muzafarov wrote:

> Hello everyone,
>
> We can replace RAT with the appropriate checkstyle rule - the HeaderCheck,
> I think. This will reduce the number of tools we now use and reduce the
> build time as only modified files will be checked, and this, in turn, will
> remove some of the concerns mentioned in the first message.
>
> https://checkstyle.org/apidocs/com/puppycrawl/tools/checkstyle/checks/header/HeaderCheck.html
>
>
>
> On Mon, 26 Jun 2023 at 13:48, Berenguer Blasi 
> wrote:
>
>> Just for awareness if you rebase thanks to CASSANDRA-18588 checkstyle
>> shouldn't be a problem anymore. If it is still let me know and I can look
>> into it.
>> On 26/6/23 13:11, Jacek Lewandowski wrote:
>>
>> Yes, I've mentioned that there is a property we can set to skip
>> checkstyle.
>>
>> Currently such a goal is "artifacts" which basically validates everything.
>>
>>
>> - - -- --- -  -
>> Jacek Lewandowski
>>
>>
>> Mon, 26 Jun 2023 at 13:09 Mike Adamson wrote:
>>
>>> While I like the idea of this because of added time these checks take, I
>>> was under the impression that checkstyle (at least) can be disabled with a
>>> flag.
>>>
>>> If we did do this, would it make sense to have a "release"  or "commit"
>>> target (or some other name) that ran a full build with all checks that can
>>> be used prior to pushing changes?
>>>
>>> On Mon, 26 Jun 2023 at 08:35, Berenguer Blasi 
>>> wrote:
>>>
>>>> I would prefer sthg that is totally transparent to me and not add one
>>>> more step I have to remember. Just to push/run CI to find out I missed it
>>>> and rinse and repeat... With the recent fix to checkstyle I am happy as
>>>> things stand atm. My 2cts
>>>> On 26/6/23 8:43, Jacek Lewandowski wrote:
>>>>
>>>> Hi,
>>>>
>>>>
>>>> The context is that we currently have 3 checks in the build:
>>>>
>>>> - Checkstyle,
>>>>
>>>> - Eclipse-Warnings,
>>>>
>>>> - RAT
>>>>
>>>>
>>>> CheckStyle and RAT are executed with almost every target we run: build,
>>>> jar, test, test-some, testclasslist, etc.; on the other hand,
>>>> Eclipse-Warnings is executed automatically only with the artifacts target.
>>>>
>>>>
>>>> Checkstyle currently uses some caching, so subsequent reruns without
>>>> cleaning the project validate only the modified files.
>>>>
>>>>
>>>> Both CIs - Jenkins and Circle - force running all checks.
>>>>
>>>>
>>>> I want to discuss whether you are ok with extracting all checks to
>>>> their distinct target and not running it automatically with the targets
>>>> which devs usually run locally. In particular:
>>>>
>>>>
>>>>
>>>>- "build", "jar", and all "test" targets would not trigger
>>>>CheckStyle, RAT or Eclipse-Warnings
>>>>- A new target "check" would trigger all CheckStyle, RAT, and
>>>>Eclipse-Warnings
>>>>- The new "check" target would be run along with the "artifacts"
>>>>target on Jenkins-CI, and as a separate build step in CircleCI
>>>>
>>>>
>>>> The rationale for that change is:
>>>>
>>>>- Running all the checks together would be more consistent, but
>>>>running all of them automatically with build and test targets could waste
>>>>time when we develop something locally, frequently rebuilding and running
>>>>tests.

Re: [DISCUSS] When to run CheckStyle and other verifications

2023-06-26 Thread Jacek Lewandowski
Yes, I've mentioned that there is a property we can set to skip checkstyle.

Currently such a goal is "artifacts" which basically validates everything.


- - -- --- -  -----
Jacek Lewandowski


Mon, 26 Jun 2023 at 13:09 Mike Adamson wrote:

> While I like the idea of this because of added time these checks take, I
> was under the impression that checkstyle (at least) can be disabled with a
> flag.
>
> If we did do this, would it make sense to have a "release"  or "commit"
> target (or some other name) that ran a full build with all checks that can
> be used prior to pushing changes?
>
> On Mon, 26 Jun 2023 at 08:35, Berenguer Blasi 
> wrote:
>
>> I would prefer sthg that is totally transparent to me and not add one
>> more step I have to remember. Just to push/run CI to find out I missed it
>> and rinse and repeat... With the recent fix to checkstyle I am happy as
>> things stand atm. My 2cts
>> On 26/6/23 8:43, Jacek Lewandowski wrote:
>>
>> Hi,
>>
>>
>> The context is that we currently have 3 checks in the build:
>>
>> - Checkstyle,
>>
>> - Eclipse-Warnings,
>>
>> - RAT
>>
>>
>> CheckStyle and RAT are executed with almost every target we run: build,
>> jar, test, test-some, testclasslist, etc.; on the other hand,
>> Eclipse-Warnings is executed automatically only with the artifacts target.
>>
>>
>> Checkstyle currently uses some caching, so subsequent reruns without
>> cleaning the project validate only the modified files.
>>
>>
>> Both CIs - Jenkins and Circle - force running all checks.
>>
>>
>> I want to discuss whether you are ok with extracting all checks to their
>> distinct target and not running it automatically with the targets which
>> devs usually run locally. In particular:
>>
>>
>>
>>- "build", "jar", and all "test" targets would not trigger
>>CheckStyle, RAT or Eclipse-Warnings
>>- A new target "check" would trigger all CheckStyle, RAT, and
>>Eclipse-Warnings
>>- The new "check" target would be run along with the "artifacts"
>>target on Jenkins-CI, and as a separate build step in CircleCI
>>
>>
>> The rationale for that change is:
>>
>>- Running all the checks together would be more consistent, but
>>running all of them automatically with build and test targets could waste
>>time when we develop something locally, frequently rebuilding and running
>>tests.
>>- On the other hand, it would be more consistent if the build did
>>what we want - as a dev, when prototyping, I don't want to be forced to 
>> run
>>analysis (and potentially fix issues) whenever I want to build a project 
>> or
>>just run a single test.
>>- There are ways to avoid running checks automatically by specifying
>>some build properties. Though, the discussion is about the default 
>> behavior
>>- on the flip side, if one wants to run the checks along with the 
>> specified
>>target, they could add the "check" target to the command line.
>>
>>
>> The rationale for keeping the checks running automatically with every
>> target is to reduce the likelihood of not running the checks locally before
>> pushing the branch and being surprised by failing CI soon after starting
>> the build.
>>
>>
>> That could be fixed by running checks in a pre-push Git hook. There are
>> some benefits of this compared to the current behavior:
>>
>>- the checks would be run automatically only once
>>- they would be triggered even for those devs who do everything in
>>IDE and do not even touch Ant commands directly
>>
>>
>> Checks can take time; to optimize that, they could be enforced locally to
>> verify only the modified files in the same way as we currently determine
>> the tests to be repeated for CircleCI.
>>
>> Thanks
>> - - -- --- -  -
>> Jacek Lewandowski
>>
>>
>
> --
> *Mike Adamson*
> Engineering
>
> +1 650 389 6000 | datastax.com <https://www.datastax.com/>
>
>


[DISCUSS] When to run CheckStyle and other verifications

2023-06-26 Thread Jacek Lewandowski
Hi,


The context is that we currently have 3 checks in the build:
- Checkstyle,
- Eclipse-Warnings,
- RAT


CheckStyle and RAT are executed with almost every target we run: build,
jar, test, test-some, testclasslist, etc.; on the other hand,
Eclipse-Warnings is executed automatically only with the artifacts target.


Checkstyle currently uses some caching, so subsequent reruns without
cleaning the project validate only the modified files.


Both CIs - Jenkins and Circle - force running all checks.


I want to discuss whether you are ok with extracting all checks to their
distinct target and not running it automatically with the targets which
devs usually run locally. In particular:



   - "build", "jar", and all "test" targets would not trigger CheckStyle,
   RAT or Eclipse-Warnings
   - A new target "check" would trigger all CheckStyle, RAT, and
   Eclipse-Warnings
   - The new "check" target would be run along with the "artifacts" target
   on Jenkins-CI, and as a separate build step in CircleCI


The rationale for that change is:

   - Running all the checks together would be more consistent, but running
   all of them automatically with build and test targets could waste time when
   we develop something locally, frequently rebuilding and running tests.
   - On the other hand, it would be more consistent if the build did what
   we want - as a dev, when prototyping, I don't want to be forced to run
   analysis (and potentially fix issues) whenever I want to build a project or
   just run a single test.
   - There are ways to avoid running checks automatically by specifying
   some build properties. Though, the discussion is about the default behavior
   - on the flip side, if one wants to run the checks along with the specified
   target, they could add the "check" target to the command line.
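The "check" target described above could be a thin aggregate over the existing check targets - a sketch of the build.xml fragment, with illustrative dependency names that are not necessarily Cassandra's actual ones:

```xml
<!-- Sketch: one aggregate target so "ant check" runs every analysis,
     while build/jar/test no longer depend on any of them.
     The dependent target names are illustrative. -->
<target name="check"
        depends="checkstyle,rat-check,eclipse-warnings"
        description="Run CheckStyle, RAT and Eclipse-Warnings together"/>
```

With this, "ant jar check" restores the old behavior for anyone who wants checks with every build.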


The rationale for keeping the checks running automatically with every
target is to reduce the likelihood of not running the checks locally before
pushing the branch and being surprised by failing CI soon after starting
the build.


That could be fixed by running checks in a pre-push Git hook. There are
some benefits of this compared to the current behavior:

   - the checks would be run automatically only once
   - they would be triggered even for those devs who do everything in IDE
   and do not even touch Ant commands directly


Checks can take time; to optimize that, they could be enforced locally to
verify only the modified files in the same way as we currently determine
the tests to be repeated for CircleCI.
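Selecting only the modified files could be as simple as diffing against a base ref - a sketch, where the base ref and the restriction to .java files are assumptions:

```python
import subprocess

def parse_name_only(diff_output):
    """Turn `git diff --name-only` output into a clean file list."""
    return [line for line in diff_output.splitlines() if line.strip()]

def changed_java_files(base="origin/trunk"):
    """List .java files modified relative to the given base ref, so a
    local check run can be limited to just those files."""
    out = subprocess.run(
        ["git", "diff", "--name-only", base, "--", "*.java"],
        capture_output=True, text=True, check=True).stdout
    return parse_name_only(out)
```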


Thanks
- - -- --- -  -
Jacek Lewandowski


Re: [DISCUSSIONS] Replace ant eclipse-warnings with CheckerFramework

2023-06-16 Thread Jacek Lewandowski
Later, we may enable more checks than just leaked resources to improve the
code gradually.

- - -- --- -  -
Jacek Lewandowski


Fri, 16 Jun 2023 at 13:48 Ekaterina Dimitrova wrote:

> Got so excited that I forgot to say which of the two options exactly  I
> meant - running the analysis only on changed files after the initial full
> pass is done sounds like a good improvement to me
>
> On Fri, 16 Jun 2023 at 7:43, Ekaterina Dimitrova 
> wrote:
>
>> I think this is a great idea and it will probably reduce the time to run
>> it. Thank you!
>>
>> On Fri, 16 Jun 2023 at 7:40, Jacek Lewandowski <
>> lewandowski.ja...@gmail.com> wrote:
>>
>>> Additional question is whether we want to run the checks against the
>>> whole project or just against the file changes between the feature branch
>>> and the target release branch?
>>>
>>>
>>> - - -- --- -  -
>>> Jacek Lewandowski
>>>
>>>
>>> Fri, 16 Jun 2023 at 13:09 Aleksey Yeshchenko wrote:
>>>
>>>> Sounds like a clear improvement to me. Only once this check flagged a
>>>> legitimate issue I missed, if I’m remembering correctly. All other
>>>> instances have just been annoyances, forcing to add a redundant suppressed
>>>> annotation.
>>>>
>>>> On 15 Jun 2023, at 19:01, Ekaterina Dimitrova 
>>>> wrote:
>>>>
>>>> Hi everyone,
>>>> Happy Thursday!
>>>> Some time ago, Jacek raised the point that ant eclipse-warnings is 
>>>> generating too many false positives and not really working as expected. 
>>>> (CASSANDRA-18239)
>>>>
>>>> Reminder: ant eclipse-warnings is a task we run with the goal to check 
>>>> Cassandra code - static analysis to warn on unsafe use of Autocloseable 
>>>> instances; checks against two related particular compiler options
>>>>
>>>> While trying to upgrade ECJ compiler that we use for this task 
>>>> (CASSANDRA-18190) so we can switch the task from running it with JDK8 to 
>>>> JDK11 in preparation for dropping JDK8, I hit the following issues:
>>>> - the latest version of ECJ is throwing more than 300 Potential Resource 
>>>> Leak warnings. I looked at 10-15, and they were all false positives.
>>>> - Even if we file a bug report to the Eclipse community, JDK11 is about to 
>>>> be removed with the next version of the compiler
>>>>
>>>> So I shared this information with Jacek. He came up with a different 
>>>> solution:
>>>> It seems we already pull through Guava CheckerFramework with an MIT 
>>>> license, which appears to be acceptable according to this link -  
>>>> https://www.apache.org/legal/resolved.html#category-a
>>>> He already has an initial integration with Cassandra which shows the 
>>>> following:
>>>> - CheckerFramework does not understand the @SuppressWarnings("resource") 
>>>> (there is a different one to be used), so it is immediately visible how it 
>>>> does not report all those false positives that eclipse-warnings does. On 
>>>> the flip side, I got the feedback that what it has witnessed so far is 
>>>> something we should investigate.
>>>> - Also, there are additional annotations like @Owning that let you fix 
>>>> many problems at once because the tool understands that the ownership of 
>>>> the resources was passed to another entity; It also enables you to do 
>>>> something impossible with eclipse-warnings - you can tell the tool that 
>>>> there is another method that needs to be called to release the resources, 
>>>> like release, free, disconnect, etc.
>>>> - the tool works with JDK8, JDK11, JDK17, and JDK20, so we can backport it 
>>>> even to older branches (while at the same time keeping eclipse-warnings 
>>>> there)
>>>> - though it runs 8 minutes so, we should not run it with every test, some 
>>>> reorganization around ant tasks will be covered as even for 
>>>> eclipse-warnings it was weird to call it on every single test run locally 
>>>> by default
>>>>
>>>>
>>>> If there are no concerns, we will continue replacing ant eclipse-warnings 
>>>> with the CheckerFramework as part of CASSANDRA-18239 and CASSANDRA-18190 
>>>> in trunk.
>>>>
>>>> Best regards,
>>>>
>>>> Ekaterina
>>>>
>>>>
>>>>


Re: [DISCUSSIONS] Replace ant eclipse-warnings with CheckerFramework

2023-06-16 Thread Jacek Lewandowski
Additional question is whether we want to run the checks against the whole
project or just against the file changes between the feature branch and the
target release branch?


- - -- --- -  -
Jacek Lewandowski


Fri, 16 Jun 2023 at 13:09 Aleksey Yeshchenko wrote:

> Sounds like a clear improvement to me. Only once this check flagged a
> legitimate issue I missed, if I’m remembering correctly. All other
> instances have just been annoyances, forcing to add a redundant suppressed
> annotation.
>
> On 15 Jun 2023, at 19:01, Ekaterina Dimitrova 
> wrote:
>
> Hi everyone,
> Happy Thursday!
> Some time ago, Jacek raised the point that ant eclipse-warnings is generating 
> too many false positives and not really working as expected. (CASSANDRA-18239)
>
> Reminder: ant eclipse-warnings is a task we run with the goal to check 
> Cassandra code - static analysis to warn on unsafe use of Autocloseable 
> instances; checks against two related particular compiler options
>
> While trying to upgrade ECJ compiler that we use for this task 
> (CASSANDRA-18190) so we can switch the task from running it with JDK8 to 
> JDK11 in preparation for dropping JDK8, I hit the following issues:
> - the latest version of ECJ is throwing more than 300 Potential Resource Leak 
> warnings. I looked at 10-15, and they were all false positives.
> - Even if we file a bug report to the Eclipse community, JDK11 is about to be 
> removed with the next version of the compiler
>
> So I shared this information with Jacek. He came up with a different solution:
> It seems we already pull through Guava CheckerFramework with an MIT license, 
> which appears to be acceptable according to this link -  
> https://www.apache.org/legal/resolved.html#category-a
> He already has an initial integration with Cassandra which shows the 
> following:
> - CheckerFramework does not understand the @SuppressWarnings("resource") 
> (there is a different one to be used), so it is immediately visible how it 
> does not report all those false positives that eclipse-warnings does. On the 
> flip side, I got the feedback that what it has witnessed so far is something 
> we should investigate.
> - Also, there are additional annotations like @Owning that let you fix many 
> problems at once because the tool understands that the ownership of the 
> resources was passed to another entity; It also enables you to do something 
> impossible with eclipse-warnings - you can tell the tool that there is 
> another method that needs to be called to release the resources, like 
> release, free, disconnect, etc.
> - the tool works with JDK8, JDK11, JDK17, and JDK20, so we can backport it 
> even to older branches (while at the same time keeping eclipse-warnings there)
> - though it runs 8 minutes so, we should not run it with every test, some 
> reorganization around ant tasks will be covered as even for eclipse-warnings 
> it was weird to call it on every single test run locally by default
>
>
> If there are no concerns, we will continue replacing ant eclipse-warnings 
> with the CheckerFramework as part of CASSANDRA-18239 and CASSANDRA-18190 in 
> trunk.
>
> Best regards,
>
> Ekaterina
>
>
>


Re: [DISCUSS] Limiting query results by size (CASSANDRA-11745)

2023-06-12 Thread Jacek Lewandowski
Josh, that answers my question exactly; thank you.

I will not implement limiting the result set in CQL (that is, by LIMIT
clause) and stay with just paging. Whether the page size is defined in
bytes or rows can be determined by a flag - there are many unused bits for
that.

So my other question - for aggregation with the "group by" clause, we
return an aggregated row which is computed from a group of rows - with my
current implementation, it is approximated by counting the size of the
largest row in that group - I think it is the safest and simplest
approximation - wdyt?
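The byte-based page cut under discussion can be sketched as follows (names invented; row_size is a stand-in for the real serialized row size):

```python
def cut_page(rows, page_size_bytes, row_size):
    """Greedily fill a page until adding the next row would exceed the
    byte budget; always include at least one row so paging progresses.
    Returns (page, has_more).  A short page with has_more=True is legal,
    since clients must check the Has_more_pages flag rather than count
    rows."""
    page, used = [], 0
    for row in rows:
        size = row_size(row)
        if page and used + size > page_size_bytes:
            return page, True
        page.append(row)
        used += size
    return page, False
```

For "group by" queries the same budget could be charged once per group, approximated by the size of the largest row in the group.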


Mon, 12 Jun 2023 at 22:55 Josh McKenzie wrote:

> As long as it is valid in the paging protocol to return a short page, but
> still say “there are more pages”, I think that is fine to do that.
>
> Thankfully the v3-v5 spec all make it clear that clients need to respect
> what the server has to say about there being more pages:
> https://github.com/apache/cassandra/blob/trunk/doc/native_protocol_v5.spec#L1247-L1253
>
>   - Clients should not rely on the actual size of the result set returned
> to
> decide if there are more results to fetch or not. Instead, they should
> always
> check the Has_more_pages flag (unless they did not enable paging for
> the query
> obviously). Clients should also not assert that no result will have
> more than
>  results. While the current implementation always
> respects
> the exact value of , we reserve the right to return
> slightly smaller or bigger pages in the future for performance reasons.
>
>
> On Mon, Jun 12, 2023, at 3:19 PM, Jeremiah Jordan wrote:
>
> As long as it is valid in the paging protocol to return a short page, but
> still say “there are more pages”, I think that is fine to do that.  For an
> actual LIMIT that is part of the user query, I think the server must always
> have returned all data that fits into the LIMIT when all pages have been
> returned.
>
> -Jeremiah
>
> On Jun 12, 2023 at 12:56:14 PM, Josh McKenzie 
> wrote:
>
>
> Yeah, my bad. I have paging on the brain. Seriously.
>
> I can't think of a use-case in which a LIMIT based on # bytes makes sense
> from a user perspective.
>
> On Mon, Jun 12, 2023, at 1:35 PM, Jeff Jirsa wrote:
>
>
>
> On Mon, Jun 12, 2023 at 9:50 AM Benjamin Lerer  wrote:
>
> If you have rows that vary significantly in their size, your latencies
> could end up being pretty unpredictable using a LIMIT BY . Being
> able to specify a limit by bytes at the driver / API level would allow app
> devs to get more deterministic results out of their interaction w/the DB if
> they're looking to respond back to a client within a certain time frame and
> / or determine next steps in the app (continue paging, stop, etc) based on
> how long it took to get results back.
>
>
> Are you talking about the page size or the LIMIT. Once the LIMIT is
> reached there is no "continue paging". LIMIT is also at the CQL level not
> at the driver level.
> I can totally understand the need for a page size in bytes not for a LIMIT.
>
>
> Would only ever EXPECT to see a page size in bytes, never a LIMIT
> specifying bytes.
>
> I know the C-11745 ticket says LIMIT, too, but that feels very odd to me.
>
>
>
>


Re: [DISCUSS] Limiting query results by size (CASSANDRA-11745)

2023-06-12 Thread Jacek Lewandowski
Yes, LIMIT BY  provided by the user in CQL does not make much sense
to me either


Mon, 12 Jun 2023 at 11:20 Benedict wrote:

> I agree that this is more suitable as a paging option, and not as a CQL
> LIMIT option.
>
> If it were to be a CQL LIMIT option though, then it should be accurate
> regarding result set IMO; there shouldn’t be any further results that could
> have been returned within the LIMIT.
>
> On 12 Jun 2023, at 10:16, Benjamin Lerer  wrote:
>
> 
> Thanks Jacek for raising that discussion.
>
> I do not have in mind a scenario where it could be useful to specify a
> LIMIT in bytes. The LIMIT clause is usually used when you know how many
> rows you wish to display or use. Unless somebody has a useful scenario in
> mind I do not think that there is a need for that feature.
>
> Paging in bytes makes sense to me as the paging mechanism is transparent
> for the user in most drivers. It is simply a way to optimize your memory
> usage from end to end.
>
> I do not like the approach of using both of them simultaneously because if
> you request a page with a certain number of rows and do not get it, then it
> is really confusing and can be a problem for some use cases. We have users
> keeping their session open and the page information to display pages of data.
>
> On Mon, 12 Jun 2023 at 09:08, Jacek Lewandowski <
> lewandowski.ja...@gmail.com> wrote:
>
>> Hi,
>>
>> I was working on limiting query results by their size expressed in bytes,
>> and some questions arose that I'd like to bring to the mailing list.
>>
>> The semantics of queries (without aggregation) are that data limits are
>> applied to the raw data returned from replicas. While this works fine for
>> row number limits, as the number of rows is not likely to change after
>> post-processing, it is not that accurate for size-based limits, as the
>> cell sizes may be different after post-processing (for example due to
>> applying some transformation function, projection, or whatever).
>>
>> We can truncate the results after post-processing to stay within the
>> user-provided limit in bytes, but if the result is smaller than the limit -
>> we will not fetch more. In that case, the meaning of "limit" being an
>> actual limit is valid though it would be misleading for the page size
>> because we will not fetch the maximum amount of data that does not exceed
>> the page size.
>>
>> Such a problem is much more visible for "group by" queries with
>> aggregation. The paging and limiting mechanism is applied to the rows
>> rather than groups, as it has no information about how much memory a single
>> group uses. For now, I've approximated a group size as the size of the
>> largest participating row.
>>
>> The problem concerns the allowed interpretation of the size limit
>> expressed in bytes. Whether we want to use this mechanism to let the users
>> precisely control the size of the resultset, or we instead want to use this
>> mechanism to limit the amount of memory used internally for the data and
>> prevent problems (assuming restricting size and rows number can be used
>> simultaneously in a way that we stop when we reach any of the specified
>> limits).
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-11745
>>
>> thanks,
>> - - -- --- -  -
>> Jacek Lewandowski
>>
>


Re: [DISCUSS] Limiting query results by size (CASSANDRA-11745)

2023-06-12 Thread Jacek Lewandowski
Limiting the amount of returned data in bytes in addition to the row limit
could be helpful when applied transparently by the server as a kind of
guardrail. The server could fail the query if it exceeds an administratively
imposed limit set at the configuration level. WDYT?
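A minimal sketch of that guardrail idea (plain Python, illustrative only; `collect_rows`, `QueryResultSizeExceeded`, and the `max_result_bytes` setting are hypothetical names, not Cassandra APIs):

```python
# Hypothetical sketch of the guardrail idea (not actual Cassandra code):
# the server aborts result assembly once an administratively configured
# byte threshold is exceeded, instead of exposing a byte LIMIT to users.

class QueryResultSizeExceeded(Exception):
    """Raised when a result set exceeds the configured guardrail."""

def collect_rows(rows, max_result_bytes):
    """Accumulate rows, failing fast once the configured limit is crossed.

    `rows` is an iterable of (row, size_in_bytes) pairs; `max_result_bytes`
    stands in for a config-level setting an operator would control.
    """
    result, total = [], 0
    for row, size in rows:
        total += size
        if total > max_result_bytes:
            raise QueryResultSizeExceeded(
                "result exceeds guardrail of %d bytes" % max_result_bytes)
        result.append(row)
    return result
```

The point of failing rather than truncating is that the client gets an explicit error instead of silently incomplete results.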



Mon, 12 Jun 2023 at 11:16, Benjamin Lerer  wrote:

> Thanks Jacek for raising that discussion.
>
> I do not have in mind a scenario where it could be useful to specify a
> LIMIT in bytes. The LIMIT clause is usually used when you know how many
> rows you wish to display or use. Unless somebody has a useful scenario in
> mind I do not think that there is a need for that feature.
>
> Paging in bytes makes sense to me as the paging mechanism is transparent
> for the user in most drivers. It is simply a way to optimize your memory
> usage from end to end.
>
> I do not like the approach of using both of them simultaneously because if
> you request a page with a certain number of rows and do not get it, then it
> is really confusing and can be a problem for some use cases. We have users
> keeping their session open and the page information to display pages of data.
>
> On Mon, 12 Jun 2023 at 09:08, Jacek Lewandowski <
> lewandowski.ja...@gmail.com> wrote:
>
>> Hi,
>>
>> I was working on limiting query results by their size expressed in bytes,
>> and some questions arose that I'd like to bring to the mailing list.
>>
>> The semantics of queries (without aggregation) are that data limits are
>> applied to the raw data returned from replicas. While this works fine for
>> row number limits, as the number of rows is not likely to change after
>> post-processing, it is not that accurate for size-based limits, as the
>> cell sizes may be different after post-processing (for example due to
>> applying some transformation function, projection, or whatever).
>>
>> We can truncate the results after post-processing to stay within the
>> user-provided limit in bytes, but if the result is smaller than the limit -
>> we will not fetch more. In that case, the meaning of "limit" being an
>> actual limit is valid though it would be misleading for the page size
>> because we will not fetch the maximum amount of data that does not exceed
>> the page size.
>>
>> Such a problem is much more visible for "group by" queries with
>> aggregation. The paging and limiting mechanism is applied to the rows
>> rather than groups, as it has no information about how much memory a single
>> group uses. For now, I've approximated a group size as the size of the
>> largest participating row.
>>
>> The problem concerns the allowed interpretation of the size limit
>> expressed in bytes. Whether we want to use this mechanism to let the users
>> precisely control the size of the resultset, or we instead want to use this
>> mechanism to limit the amount of memory used internally for the data and
>> prevent problems (assuming restricting size and rows number can be used
>> simultaneously in a way that we stop when we reach any of the specified
>> limits).
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-11745
>>
>> thanks,
>> - - -- --- -  -
>> Jacek Lewandowski
>>
>


[DISCUSS] Limiting query results by size (CASSANDRA-11745)

2023-06-12 Thread Jacek Lewandowski
Hi,

I was working on limiting query results by their size expressed in bytes,
and some questions arose that I'd like to bring to the mailing list.

The semantics of queries (without aggregation) are that data limits are
applied to the raw data returned from replicas. While this works fine for
row number limits, as the number of rows is not likely to change after
post-processing, it is not that accurate for size-based limits, as the
cell sizes may be different after post-processing (for example due to
applying some transformation function, projection, or whatever).

We can truncate the results after post-processing to stay within the
user-provided limit in bytes, but if the result is smaller than the limit -
we will not fetch more. In that case, the meaning of "limit" being an
actual limit is valid though it would be misleading for the page size
because we will not fetch the maximum amount of data that does not exceed
the page size.
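The two-step truncation described above can be sketched like this (plain Python, no Cassandra APIs; row "sizes" are just string lengths, and `page_rows`/`transform` are illustrative names):

```python
# Illustration only: apply the byte bound to raw replica data,
# post-process, then truncate again so the returned page never exceeds
# the bound. Sizes here are simply string lengths.

def page_rows(raw_rows, transform, limit_bytes):
    # 1. Cut on raw sizes, as would happen with data coming from replicas.
    raw_page, used = [], 0
    for row in raw_rows:
        if used + len(row) > limit_bytes:
            break
        raw_page.append(row)
        used += len(row)
    # 2. Post-process; sizes may change (projection, functions, ...).
    processed = [transform(r) for r in raw_page]
    # 3. Truncate again so the final page honours the bound; note that we
    #    never go back to fetch more, so the page may be under-filled.
    out, used = [], 0
    for row in processed:
        if used + len(row) > limit_bytes:
            break
        out.append(row)
        used += len(row)
    return out
```

With an identity transform the page fills up to the bound; with a transform that grows rows, the final truncation drops rows that were already fetched, which is exactly the limit-versus-page-size mismatch described above.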

Such a problem is much more visible for "group by" queries with
aggregation. The paging and limiting mechanism is applied to the rows
rather than groups, as it has no information about how much memory a single
group uses. For now, I've approximated a group size as the size of the
largest participating row.

The problem concerns the allowed interpretation of the size limit expressed
in bytes. Whether we want to use this mechanism to let the users precisely
control the size of the resultset, or we instead want to use this mechanism
to limit the amount of memory used internally for the data and prevent
problems (assuming restricting size and rows number can be used
simultaneously in a way that we stop when we reach any of the specified
limits).
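The "stop at whichever limit is reached first" interpretation mentioned above can be sketched as (illustrative Python, hypothetical names):

```python
# Sketch of combining a row limit with a byte limit: iteration stops as
# soon as either bound would be exceeded. `rows` yields (row, size) pairs.

def read_until_limits(rows, row_limit, byte_limit):
    out, used = [], 0
    for row, size in rows:
        if len(out) >= row_limit or used + size > byte_limit:
            break
        out.append(row)
        used += size
    return out
```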

https://issues.apache.org/jira/browse/CASSANDRA-11745

thanks,
- - -- --- - ---- -----
Jacek Lewandowski


[DISCUSS] CEP-31 negotiated authentication

2023-05-26 Thread Jacek Lewandowski
Hi,

I'd like to start a discussion on negotiated authentication and
improvements to authentication, authorization, and role management in
general. A draft of proposed changes is included in CEP-31.

https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-31+%28DRAFT%29+Negotiated+authentication+and+authorization

thanks,
- - -- --- -  -
Jacek Lewandowski


Re: [CASSANDRA-11471] Authentication mechanism negotiation (OPTIONS/SUPPORTED)

2023-05-26 Thread Jacek Lewandowski
Hi,

I'm happy that this discussion has started because we wanted to propose
something we have been working on. I've created
https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-31+%28DRAFT%29+Negotiated+authentication+and+authorization
with some details. It is a draft and we are open to suggestions and
cooperation on that.

Basically what we have been using in DataStax for years already is based on
embedding negotiations in SASL messages. In particular, the authentication
mechanism chosen by the client is sent in a preamble of the first
authentication message. It does not require a new protocol - everything we
need is internal to authentication message exchange. It is also backward
compatible as if the server does not receive the mechanism preamble, it
will just continue assuming the default authentication mechanism.
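A sketch of the preamble embedding described above (the marker and encoding below are invented for illustration and are not the actual DataStax wire format):

```python
# The client prepends the chosen mechanism to its first SASL response;
# a legacy client sends a plain response and the server falls back to
# the default mechanism, preserving backward compatibility.

PREAMBLE_PREFIX = b"MECH:"  # hypothetical marker; a real scheme must avoid ambiguity

def encode_first_response(mechanism, sasl_bytes):
    return PREAMBLE_PREFIX + mechanism.encode() + b"\x00" + sasl_bytes

def decode_first_response(payload, default_mechanism="PLAIN"):
    """Return (mechanism, sasl_bytes); legacy payloads get the default."""
    if payload.startswith(PREAMBLE_PREFIX):
        rest = payload[len(PREAMBLE_PREFIX):]
        mech, _, sasl = rest.partition(b"\x00")
        return mech.decode(), sasl
    return default_mechanism, payload
```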

Please have a look and let me know what you think. I'll start a separate
thread on that topic soon.

thanks
- - -- --- -  -
Jacek Lewandowski


Fri, 26 May 2023 at 04:08, Dinesh Joshi  wrote:

> Leaving the naming aside (the hardest part of any software), I am
> generally positive about your idea. A protocol version bump may be
> avoidable like you suggested. Perhaps a prototype of this idea is in order
> to help shape the idea? Would you like to take it on?
>
> On May 21, 2023, at 4:21 AM, Derek Chen-Becker 
> wrote:
>
> We had a recent discussion in Slack about how to potentially use the
> OPTIONS and SUPPORTED messages in the existing CQL protocol to allow the
> server to advertise more than one authentication method and allow the
> client to then choose which authenticator to use. The primary use case here
> is to allow seamless migration to a new authenticator without having to
> have all parties involved agree on a single class (and avoid a disruptive
> change). There's already a ticket open that was focused on making a change
> to the binary protocol (
> https://issues.apache.org/jira/browse/CASSANDRA-11471) but I think that
> we can accomplish this in a backwards compatible way that avoids a change
> to the protocol itself.
>
> What I propose is to allow a server configured for this graceful auth
> change to send an additional value in the [string multimap] body of the
> SUPPORTED message that indicates which authenticators are supported, in
> descending priority order. For example, if I wanted to migrate my server to
> support both PlainTextAuthProvider and some new MyAwesomeAuthProvider, I
> would configure my client to query options and the server would respond with
>
> 'AUTHENTICATORS': ['MyAwesomeAuthProvider', 'PlainTextAuthProvider']
>
> The client can then choose from its own supported providers and send it as
> part of the STARTUP message [string map] body:
>
> 'AUTHENTICATOR': 'MyAwesomeAuthenticator'
>
> I'm not good with naming so feel free to propose a different key for
> either of these map entries. In any case, the server then validates that
> the client-chosen authenticator is actually supported and would then
> proceed with the AUTHENTICATE message. In the case where the client sends
> an invalid/unsupported authenticator choice, the server can simply respond
> with an AUTHENTICATE using the most-preferred configured authenticator.
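The proposed exchange can be sketched as follows (illustrative Python; the `AUTHENTICATORS`/`AUTHENTICATOR` map keys are the ones proposed in the email, while the functions and server configuration are hypothetical):

```python
# Server advertises its authenticators in descending priority order in
# the SUPPORTED [string multimap] body; the client picks one and sends
# it back in the STARTUP [string map] body; the server validates it.

SERVER_AUTHENTICATORS = ["MyAwesomeAuthProvider", "PlainTextAuthProvider"]

def supported_message():
    # Entry added to the SUPPORTED [string multimap] body.
    return {"AUTHENTICATORS": SERVER_AUTHENTICATORS}

def choose_authenticator(client_supported, server_supported):
    """Client side: pick the server's most-preferred mutually supported one."""
    for auth in server_supported:
        if auth in client_supported:
            return auth
    return None

def resolve_startup(startup_options):
    """Server side: validate the client's choice from the STARTUP body,
    falling back to the most-preferred configured authenticator."""
    choice = startup_options.get("AUTHENTICATOR")
    return choice if choice in SERVER_AUTHENTICATORS else SERVER_AUTHENTICATORS[0]
```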
>
> I think this is a better approach than changing the binary protocol
> because the mechanism already exists for negotiating options and this seems
> like a natural use case that avoids having to create an entirely new
> version of the protocol. It does not appear to conflict with the existing
> protocol definition but I'm not 100% certain. Section 4.1.1 discusses
> "Possible options"  for the STARTUP message (
> https://github.com/apache/cassandra/blob/trunk/doc/native_protocol_v4.spec#L296),
> but that's an unfortunate use of English that's ambiguous as to whether it
> means "The only ones supported" or "Supported but not exclusively".
>
> I've taken a look at the Java and Python driver source so far and I can't
> find anything that would seem to cause a problem by returning a SUPPORTED
> multimap entry that the client isn't aware of (in both they would appear to
> ignore it), but I'll also admit that this is the first time I've looked at
> this part of the client code and I could be missing something. Is anyone
> aware of possible problems that would be caused by using this approach? In
> particular, if there are clients that strictly validate all entries in the
> SUPPORTED map then this could cause a problem.
>
> Worst case, we may still need a protocol version bump if the enumeration
> of STARTUP options is intended to be strict, but at least this would not
> require a new message type and would fit into the existing framework for
> negotiation between client and server.
>
>

Re: [EXTERNAL] Re: (CVE only) support for 3,11 beyond published EOL

2023-04-14 Thread Jacek Lewandowski
To me, as this is an open source project, we, the community, do not have to
do anything, we can, but we are not obliged to, and we usually do that
because we want to :-)

To me, EOL means that we move focus to newer releases, not that we are
forbidden to do anything in the older ones. One formal point though is the
machinery - as long as we have the machinery to test and release, that's
all we need. However, in the face of coming changes in testing, I suppose
some extra effort will be needed to support older versions. Finding people
who want to help out with that could be a kind of validation of whether
that effort is justified.

btw. We have recently agreed to keep support for M sstables format (3.0 -
3.11).

thanks,
- - -- --- -  -
Jacek Lewandowski


Thu, 13 Apr 2023 at 21:59, Mick Semb Wever  wrote:

> Yes, this would be great. Right now users are confused what EOL means and
>> what they can expect.
>>
>>
>
> I think the project would need to land on an agreed position.  I tried to
> find any reference to my earlier statement around CVEs on the latest
> unmaintained branch but could not find it (I'm sure it was mentioned
> somewhere) :(
>
> How many past branches?  All CVEs?  What if CVEs are in dependencies?
> And is this a slippery slope, will such a formalised and documented
> commitment lead to more users on EOL versions? (see below)
> How do other committers feel about this?
>
>
> I am also asking specifically for 3.11 since this release has been around
>> so long that it might warrant longer support than what we would offer for
>> 4.0.
>>
>>
>
> This logic can also be the other way around :-)
>
> We should be sending a clear signal that OSS users are expected to perform
> a major upgrade every ~two years.  Vendors can, and are welcome to solve
> this, but the project itself does not support any user's production system,
> it only maintains code branches and performs releases off them, with our
> focus on quality solely on those maintained branches.
>
>


Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-06 Thread Jacek Lewandowski
Haha... we have opinions against each name :)

According to what Caleb said, I don't think all new users start learning
Cassandra by understanding replication.
There are probably many small projects where Cassandra is used on a single
node, or bigger projects where people
try different things to build some PoC. Understanding the internals and
architecture of Cassandra is not crucial - they
want to start writing queries as soon as possible, and the less prior
knowledge is required to do that, the better.

That being said, we should maybe even go further and assume some default
replication config, like SimpleStrategy with RF 1, so that
creating a namespace boils down to simply CREATE
KEYSPACE|SCHEMA|DATABASE|NAMESPACE <name>;

thanks,
- - -- --- -  -
Jacek Lewandowski


Thu, 6 Apr 2023 at 04:09, guo Maxwell  wrote:

> either KEYSPACE or DATABASE or SCHEMA is better than NAMESPACE.
> NAMESPACE is always used in HBase, which is a table store in my mind.
> For existing users, NAMESPACE may take some time to be accepted. For HBase
> and Cassandra users, it may be necessary to mix the corresponding terms.
> From the terminology standard of databases, DATABASE or SCHEMA may be
> better; for the terminology standard of a NoSQL database (Cassandra),
> KEYSPACE is better.
>
>
> Caleb Rackliffe  wrote on Thu, 6 Apr 2023 at 07:09:
>
>> KEYSPACE isn’t a terrible name for a namespace that also configures how
>> keys are replicated. NAMESPACE is accurate but not comprehensive. DATABASE
>> doesn’t seem to have the advantages of either.
>>
>> I’m neutral on NAMESPACE and slightly -1 on DATABASE. It’s hard for me to
>> believe KEYSPACE is really a stumbling block for new users, especially when
>> it connotes something those users should understand about them (the
>> replication configuration).
>>
>> On Apr 5, 2023, at 4:16 AM, Aleksey Yeshchenko  wrote:
>>
>> FYI we support SCHEMA as an alias to KEYSPACE today (have since always).
>> Can use CREATE SCHEMA in place of CREATE KEYSPACE, etc.
>>
>> On 4 Apr 2023, at 19:23, Henrik Ingo  wrote:
>>
>> I find the Postgres terminology overly complex. Where most SQL databases
>> will have several *databases*, each containing several *tables*, in
>> Postgres we have namespaces, databases, schemas and tables...
>>
>> Oracle seems to also use the words database, schema and tables. I don't
>> know if it has namespaces.
>>
>> Ah, ok, so SQL Server actually is like Oracle too!
>>
>>
>> So in MySQL, referring unambiguously (aka full path) to a table would be:
>>
>> SELECT * FROM mydb.mytable;
>>
>> Whereas in Postgresql and Oracle and SQL Server you'd have to:
>>
>> SELECT * FROM mydb.myschema.mytable;   /* And I don't even know what
>> to do with the namespace! */
>>
>>
>> https://www.postgresql.org/docs/current/catalog-pg-namespace.html
>> https://www.postgresql.org/docs/current/ddl-schemas.html
>>
>> https://docs.oracle.com/database/121/ADMQS/GUID-6E0CE8C9-7DC4-450C-BAE0-2E1CDD882993.htm#ADMQS0821
>>
>> https://docs.oracle.com/database/121/ADMQS/GUID-8AC1A325-3542-48A0-9B0E-180D633A5BD1.htm#ADMQS081
>>
>> https://learn.microsoft.com/en-us/sql/t-sql/statements/create-schema-transact-sql?view=sql-server-ver16
>>
>> https://learn.microsoft.com/en-us/sql/t-sql/statements/create-database-transact-sql?view=sql-server-ver16=sqlpool
>>
>> The Microsoft docs perhaps best explain the role of each: The Database
>> contains the configuration of physical things like where on disk is the
>> database stored. The Schema on the other hand contains "logical" objects
>> like tables, views and procedures.
>>
>> MongoDB has databases and collections. As an easter egg / inside joke, it
>> also supports the command `SHOW TABLES` as a synonym for collections.
>>
>> A TABLESPACE btw is something else completely:
>> https://docs.oracle.com/database/121/ADMQS/GUID-F05EE514-FFC6-4E86-A592-802BA5A49254.htm#ADMQS12053
>>
>>
>>
>> Personally I would be in favor of introducing `DATABASE` as a synonym for
>> KEYSPACE. The latter could remain the "official" usage.
>>
>> henrik
>>
>>
>> On Tue, Apr 4, 2023 at 8:32 PM Jacek Lewandowski <
>> lewandowski.ja...@gmail.com> wrote:
>>
>>> While for someone who already knows Cassandra, a keyspace is something
>>> natural, for newcomers it is yet another concept to understand.
>>>
>>> If namespace is used in PostgreSQL, it sounds even better to me.
>>>
>>> Thanks,
>>> - - -- --- -  -
>>> Jacek Lewandowski
>>>
>

Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-04 Thread Jacek Lewandowski
While for someone who already knows Cassandra, a keyspace is something
natural, for newcomers it is yet another concept to understand.

If namespace is used in PostgreSQL, it sounds even better to me.

Thanks,
- - -- --- -  -
Jacek Lewandowski


Tue, 4 Apr 2023 at 19:07, Rahul Xavier Singh 
wrote:

> My 2 cents:
>
> Keeping it keyspace works for me, namespace could be cool also since we
> decide where that namespace exists in relation to Datacenters, etc.  In our
> case, a Keyspace is more similar to a namespace than it is to a database
> since we expect all the UDTs,/UDFs, indexes to refer to only the tables in
> that keyspace/namespace.
>
> Alternatively interesting to observe and throw some fuel into the
> discussion , looking at the Postgres (only because there are many
> distributed databases that are now PG compliant) :
> From the interwebs:
> *In PostgreSQL, a schema is a namespace that contains named database
> objects such as tables, views, indexes, data types, functions, stored
> procedures and operators. A database can contain one or multiple schemas
> and each schema belongs to only one database.*
> I used to gripe about this but as a platform gets more complex it is
> useful to organize PG DBs into schemas. In C* world, I found myself doing
> similar things by having a prefix : e.g. appprefix_system1
> appprefix_system2 ...
>
>
> Rahul Singh
>
> Chief Executive Officer | Business Platform Architect m: 202.905.2818 e:
> rahul.si...@anant.us li: http://linkedin.com/in/xingh ca:
> http://calendly.com/xingh
>
> *We create, support, and manage real-time global data & analytics
> platforms for the modern enterprise.*
>
> *Anant | https://anant.us <https://anant.us/>*
>
> 3 Washington Circle, Suite 301
>
> Washington, D.C. 20037
>
> *http://Cassandra.Link <http://cassandra.link/>* : The best resources for
> Apache Cassandra
>
>
> On Tue, Apr 4, 2023 at 12:52 PM Jeff Jirsa  wrote:
>
>> KEYSPACE at least makes sense in the context that it is the unit that
>> defines how those partitions keys are going to be treated/replicated
>>
>> DATABASE may be ambiguous, but it's an ambiguity shared across the industry.
>>
>> Creating a new name like TABLESPACE or TABLEGROUP sounds horrible because
>> it'll be unique to us in the world, and therefore unintuitive for new users.
>>
>>
>>
>> On Tue, Apr 4, 2023 at 9:36 AM Josh McKenzie 
>> wrote:
>>
>>> I think there's competing dynamics here.
>>>
>>> 1) KEYSPACE isn't that great of a name; it's not a space in which keys
>>> are necessarily unique, and you can't address things just by key w/out
>>> their respective tables
>>> 2) DATABASE isn't that great of a name either due to the aforementioned
>>> ambiguity.
>>>
>>> Something like "TABLESPACE" or "TABLEGROUP" would *theoretically*
>>> better satisfy point 1 and 2 above but subjectively I kind of recoil at
>>> both equally. So there's that.
>>>
>>> On Tue, Apr 4, 2023, at 12:30 PM, Abe Ratnofsky wrote:
>>>
>>> I agree with Bowen - I find Keyspace easier to communicate with. There
>>> are plenty of situations where the use of "database" is ambiguous (like
>>> "Could you help me connect to database x?"), but Keyspace refers to a
>>> single thing. I think more software is moving towards calling these things
>>> "namespaces" (like Kubernetes), and while "Keyspaces" is not a term used in
>>> this way elsewhere, I still find it leads to clearer communication.
>>>
>>> --
>>> Abe
>>>
>>>
>>> On Apr 4, 2023, at 9:24 AM, Andrés de la Peña 
>>> wrote:
>>>
>>> I think supporting DATABASE is a great idea.
>>>
>>> It's better aligned with SQL databases, and can save new users one of
>>> the first troubles they find.
>>>
>>> Probably anyone starting to use Cassandra for the first time is going to
>>> face the "what is a keyspace?" question in the first minutes. Sparing
>>> users that with a more common name would be a victory for usability IMO.
>>>
>>> On Tue, 4 Apr 2023 at 16:48, Mike Adamson  wrote:
>>>
>>> Hi,
>>>
>>> I'd like to propose that we add DATABASE to the CQL grammar as an
>>> alternative to KEYSPACE.
>>>
>>> Background: While TABLE was introduced as an alternative for
>>> COLUMNFAMILY in the grammar we have kept KEYSPACE for the container name
>>> for a group of tables. Nearly all traditional SQL 

Re: [VOTE] CEP-26: Unified Compaction Strategy

2023-04-04 Thread Jacek Lewandowski
+1

- - -- --- -  -
Jacek Lewandowski


Tue, 4 Apr 2023 at 15:01, Berenguer Blasi 
wrote:

> +1
> On 4/4/23 14:36, J. D. Jordan wrote:
>
> +1
>
> On Apr 4, 2023, at 7:29 AM, Brandon Williams 
>  wrote:
>
> 
> +1
>
> On Tue, Apr 4, 2023, 7:24 AM Branimir Lambov  wrote:
>
>> Hi everyone,
>>
>> I would like to put CEP-26 to a vote.
>>
>> Proposal:
>>
>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-26%3A+Unified+Compaction+Strategy
>>
>> JIRA and draft implementation:
>> https://issues.apache.org/jira/browse/CASSANDRA-18397
>>
>> Up-to-date documentation:
>>
>> https://github.com/blambov/cassandra/blob/CASSANDRA-18397/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md
>>
>> Discussion:
>> https://lists.apache.org/thread/8xf5245tclf1mb18055px47b982rdg4b
>>
>> The vote will be open for 72 hours.
>> A vote passes if there are at least three binding +1s and no binding
>> vetoes.
>>
>> Thanks,
>> Branimir
>>
>


Re: [DISCUSS] Moving system property names to the CassandraRelevantProperties

2023-03-30 Thread Jacek Lewandowski
I'll do

- - -- --- -  -
Jacek Lewandowski


Thu, 30 Mar 2023 at 22:09, Miklosovic, Stefan 
wrote:

> Hi list,
>
> we are looking for one more committer to take a look at this patch (1, 2).
>
> It looks like there is a lot to go through because of the number of files
> modified (around 200) but changes are really just about moving everything
> to CassandraRelevantProperties. I do not think that it should take more
> than 1 hour of dedicated effort and we are done!
>
> Thanks in advance to whoever reviews this.
>
> I want to especially thank Maxim for his perseverance in this matter and I
> hope we will eventually deliver this work to trunk.
>
> (1) https://github.com/apache/cassandra/pull/2046
> (2) https://issues.apache.org/jira/browse/CASSANDRA-17797
>
> Regards
>
> 
> From: Miklosovic, Stefan 
> Sent: Wednesday, March 22, 2023 14:34
> To: dev@cassandra.apache.org
> Subject: Re: [DISCUSS] Moving system property names to the
> CassandraRelevantProperties
>
> NetApp Security WARNING: This is an external email. Do not click links or
> open attachments unless you recognize the sender and know the content is
> safe.
>
>
>
>
> Hi Maxim,
>
> thanks for letting us know.
>
> I reviewed it couple months ago but I can revisit it to double check. We
> need the second reviewer. Until we find somebody, we can not merge this.
>
> If anybody wants to take a look, it would be awesome. It seems like a lot
> of changes / files touched but it is just about centralizing all properties
> scattered around the code base into one place.
>
> Regards
>
> 
> From: Maxim Muzafarov 
> Sent: Tuesday, March 21, 2023 22:59
> To: dev@cassandra.apache.org
> Subject: Re: [DISCUSS] Moving system property names to the
> CassandraRelevantProperties
>
>
>
> Hello everyone,
>
> This a friendly reminder that some help is still needed with the review :-)
> I have resolved all the conflicts that have arisen in the last month or
> two.
>
> If you'd like to invest some time in code clarity, please take a look:
> https://github.com/apache/cassandra/pull/2046/files
>
> On Wed, 8 Feb 2023 at 19:48, Maxim Muzafarov  wrote:
> >
> > Hello everyone,
> >
> >
> > We are trying to clean up the source code around the direct use of
> > system properties and make this use more manageable and transparent.
> > To achieve this, I have prepared a patch that moves all system
> > property names to the CassandraRelevantProperties, which in turn makes
> > some of the properties visible to a user through the
> > SystemPropertiesTable virtual table.
> >
> > The patch has passed a few rounds of review, but we still need another
> > pair of eyes to make sure we are not missing anything valuable.
> > Please, take a look at the patch.
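The patch itself centralizes the names in a Java enum (CassandraRelevantProperties); this language-agnostic Python sketch only illustrates the pattern it follows - one registry with defaults and typed accessors instead of raw property lookups scattered through the code base:

```python
import os

# Illustrative registry: every recognized property is declared in one
# place, with a default value and typed accessors, so tooling (like a
# virtual table) can enumerate them.

class Property:
    def __init__(self, key, default=None):
        self.key = key
        self.default = default

    def get_string(self, props=os.environ):
        return props.get(self.key, self.default)

    def get_boolean(self, props=os.environ):
        return str(self.get_string(props)).lower() == "true"

# One central place to discover every recognized property.
LOG4J2_DISABLE_JMX = Property("log4j2.disable.jmx", "false")
LOGBACK_CONFIG = Property("logback.configurationFile")
```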
> >
> > You can find all the changes here:
> > https://issues.apache.org/jira/browse/CASSANDRA-17797
> >
> >
> > I'd also like to share the names of the properties that will appear in
> > the SystemPropertiesTable, the appearance of which is related to the
> > public API changes we agreed to discuss on the dev list.
> >
> >
> > The public API changes
> >
> > Newly production system properties added:
> >
> > io.netty.eventLoopThreads
> > io.netty.transport.estimateSizeOnSubmit
> > java.security.auth.login.config
> > javax.rmi.ssl.client.enabledCipherSuites
> > javax.rmi.ssl.client.enabledProtocols
> > ssl.enable
> > log4j2.disable.jmx
> > log4j2.shutdownHookEnabled
> > logback.configurationFile
> >
> > Newly added and used for tests only:
> >
> > invalid-legacy-sstable-root
> > legacy-sstable-root
> > org.apache.cassandra.tools.UtilALLOW_TOOL_REINIT_FOR_TEST
> > org.caffinitas.ohc.segmentCount
> > suitename
> > sun.stderr.encoding
> > sun.stdout.encoding
> > test.bbfailhelper.enabled
> > write_survey
>


Re: Welcome our next PMC Chair Josh McKenzie

2023-03-23 Thread Jacek Lewandowski
Congrats Josh!

- - -- --- -  -
Jacek Lewandowski


Thu, 23 Mar 2023 at 09:32, Berenguer Blasi 
wrote:

> Congrats Josh!
> On 23/3/23 9:22, Mick Semb Wever wrote:
>
> It is time to pass the baton on, and on behalf of the Apache Cassandra
> Project Management Committee (PMC) I would like to welcome and congratulate
> our next PMC Chair Josh McKenzie (jmckenzie).
>
> Most of you already know Josh, especially through his regular and valuable
> project oversight and status emails, always presenting a balance and
> understanding to the various views and concerns incoming.
>
> Repeating Paulo's words from last year: The chair is an administrative
> position that interfaces with the Apache Software Foundation Board, by
> submitting regular reports about project status and health. Read more about
> the PMC chair role on Apache projects:
> - https://www.apache.org/foundation/how-it-works.html#pmc
> - https://www.apache.org/foundation/how-it-works.html#pmc-chair
> - https://www.apache.org/foundation/faq.html#why-are-PMC-chairs-officers
>
> The PMC as a whole is the entity that oversees and leads the project and
> any PMC member can be approached as a representative of the committee. A
> list of Apache Cassandra PMC members can be found on:
> https://cassandra.apache.org/_/community.html
>
>


Re: [DISCUSS] Drop support for sstable formats m* (in trunk)

2023-03-14 Thread Jacek Lewandowski
Hi,

Do we consider this an occasional exception to the rule, or will we define
a rule which explicitly says how many versions the user can expect to be
supported?

I lean slightly towards keeping the support for Mx versions just because the
time gap between 3.11 and 4.0 was very long. I suppose many people are
still on 3.11, and we should not make their lives harder when they consider
upgrading directly to 5.0. Clear rules for that would help both the users
and us, though.

thanks,


- - -- --- -  -
Jacek Lewandowski


Tue, 14 Mar 2023 at 15:36, C. Scott Andreas  wrote:

> I agree with Aleksey's view here.
>
> To expand on the final point he makes re: requiring SSTables be fully
> rewritten prior to rev'ing from 4.x to 5.x (if the cluster previously ran
> 3.x) –
>
> This would also invalidate incremental backups. Operators would either be
> required to perform a full snapshot backup of each cluster to object
> storage prior to upgrading from 4.x to 5.x; or to enumerate the contents of
> all snapshots from an incremental backup series to ensure that no m*-series
> SSTables were present prior to upgrading.
>
> If one failed to take on the work to do so, incremental backup snapshots
> would not be restorable to a 5.x cluster if an m*-series SSTable were
> present.
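A hypothetical helper for the operational check Scott describes: scanning a backup file listing for 3.x-era ("m*"-format) SSTables. The filename convention assumed here (version-generation-big-Data.db, e.g. "mc-5-big-Data.db") follows current Cassandra naming, but treat this as a sketch, not a vetted tool:

```python
# Return the Data.db components from a snapshot/backup file listing whose
# sstable format version starts with "m" (i.e. written by a 3.x node and
# unreadable if m* support is dropped).

def legacy_sstables(filenames):
    return [name for name in filenames
            if name.endswith("-big-Data.db")
            and name.split("-")[0].startswith("m")]
```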
>
> – Scott
>
> On Mar 14, 2023, at 4:38 AM, Aleksey Yeshchenko  wrote:
>
>
> Raising messaging service minimum, I have a less strong opinion on, but on
> dropping m* sstable code I’m strongly -1.
>
> 1. This is code on a rarely touched path
> 2. It’s very stable and battle tested at this point
> 3. Removing it doesn’t reduce much complexity at all, just a few branches
> are affected
> 4. Removing code comes with risk
> 5. There are third-party tools that I know of which benefit from a single
> C* jar that can read all relevant stable versions, and relevant here
> includes 3.0 ones
>
> Removing a little of battle-tested reliable code and a tinier amount of
> complexity is not, to me, a benefit enough to justify intentionally
> breaking perfectly good and useful functionality.
>
> Oh, to add to that - if an operator wishes to upgrade from 3.0 to 5.0, and
> we don’t support it directly, I think most of us are fine with the
> requirement to go through a 4.X release first. But it’s one thing to
> require two rolling restarts (3.0 to 4.0, 4.0 to 5.0), it’s another to
> require the operator to upgrade every single m* sstable to n*. Especially
> when we have perfectly working code to read those. That’s incredibly
> wasteful.
>
> AY
>
> On 13 Mar 2023, at 22:54, Mick Semb Wever  wrote:
>
> If we do not recommend and do not test direct upgrades from 3.x to
> 5.x, we can clean up a fair bit by removing code related to sstable
> formats m*, as Cassandra versions 4.x and 5.0 are all on sstable
> formats n*.
>
> We don't allow mixed-version streaming, so it's not possible today to
> stream any such older sstable format between nodes. This
> compatibility-break impacts only node-local and/or offline.
>
> Some arguments raised to keep m* sstable formats are:
> - offline cluster upgrade, e.g. direct from 3.x to 5.0,
> - single-invocation sstableupgrade usage
> - third-party tools based on the above
>
> Personally I am not in favour of keeping, or recommending users use,
> code we don't test.
>
> An _example_ of the code that can be cleaned up is in the patch
> attached to the ticket:
> CASSANDRA-18312 – Drop support for sstable formats before `na`
>
> What do you think?
>
>
>
>
>
>
>


Re: Should we cut some new releases?

2023-03-13 Thread Jacek Lewandowski
+1

Mon, 13 Mar 2023, 20:36 Miklosovic, Stefan <
stefan.mikloso...@netapp.com> wrote:

> Yes, I was waiting for CASSANDRA-18125 to be in.
>
> I can release 4.1.1 to staging tomorrow morning CET if nobody objects to that.
>
> Not sure about 4.0.9. We released 4.0.8 just a few weeks ago. I would do
> 4.1.1 first.
>
> 
> From: Ekaterina Dimitrova 
> Sent: Monday, March 13, 2023 18:12
> To: dev@cassandra.apache.org
> Subject: Re: Should we cut some new releases?
>
> NetApp Security WARNING: This is an external email. Do not click links or
> open attachments unless you recognize the sender and know the content is
> safe.
>
>
>
> +1
>
> On Mon, 13 Mar 2023 at 12:23, Benjamin Lerer  wrote:
> Hi everybody,
>
> Benedict and Jon recently committed the patch for CASSANDRA-18125<
> https://issues.apache.org/jira/browse/CASSANDRA-18125> which fixes some
> serious problems at the memtable/flush level. Should we consider cutting
> some releases that contain this fix?
>


Re: Role of Hadoop code in Cassandra 5.0

2023-03-10 Thread Jacek Lewandowski
I've experimentally added
https://issues.apache.org/jira/browse/CASSANDRA-16984 to
https://issues.apache.org/jira/browse/CASSANDRA-18306 (post 4.0 cleanup)

- - -- --- -  -
Jacek Lewandowski


Fri, 10 Mar 2023 at 09:56 Berenguer Blasi 
wrote:

> +1 deprecate + removal
> On 10/3/23 1:41, Jeremy Hanna wrote:
>
> It was mainly to integrate with Hadoop - I used it from 0.6 to 1.2 in
> production prior to starting at DataStax and at that time I was stitching
> together Cloudera's distribution of Hadoop with Cassandra.  Back then there
> were others that used it as well.  As far as I know, usage dropped off when
> the Spark Cassandra Connector got pretty mature.  It enabled people to take
> an off the shelf Hadoop distribution and run the Hadoop processes on the
> same nodes or external to the Cassandra cluster and get topology
> information to do things like Hadoop splits and things like that through
> the Hadoop interfaces.  I think the version lag is an indication that it
> hasn't been used recently.  Also, like others have said, the Spark
> Cassandra Connector is really what people should be using at this point
> imo.  That or depending on the use case, Apple's bulk reader:
> https://github.com/jberragan/spark-cassandra-bulkreader that is mentioned
> on https://issues.apache.org/jira/browse/CASSANDRA-16222.
>
> On Mar 9, 2023, at 12:00 PM, Rahul Xavier Singh
>   wrote:
>
> What is the hadoop code for? For interacting from Hadoop via CQL, or
> Thrift if it's that old, or directly looking at SSTables? Been using C*
> since 2 and have never used it.
>
> Agree to deprecate in next possible 4.1.x version and remove in 5.0
>
> Rahul Singh
> Chief Executive Officer | Business Platform Architect m: 202.905.2818 e:
> rahul.si...@anant.us li: http://linkedin.com/in/xingh ca:
> http://calendly.com/xingh
>
> *We create, support, and manage real-time global data & analytics
> platforms for the modern enterprise.*
>
> * Anant | https://anant.us <https://anant.us/>*
> 3 Washington Circle, Suite 301
> Washington, D.C. 20037
>
> *http://Cassandra.Link <http://cassandra.link/>* : The best resources for
> Apache Cassandra
>
>
> On Thu, Mar 9, 2023 at 12:53 PM Brandon Williams  wrote:
>
>> I think if we reach consensus here that decides it. I too vote to
>> deprecate in 4.1.x.  This means we would remove it in 5.0.
>>
>> Kind Regards,
>> Brandon
>>
>> On Thu, Mar 9, 2023 at 11:32 AM Ekaterina Dimitrova
>>  wrote:
>> >
>> > Deprecation sounds good to me, but I am not completely sure in which
>> version we can do it. If it is possible to add a deprecation warning in the
>> 4.x series or at least 4.1.x - I vote for that.
>> >
>> > On Thu, 9 Mar 2023 at 12:14, Jacek Lewandowski <
>> lewandowski.ja...@gmail.com> wrote:
>> >>
>> >> Is it possible to deprecate it in the 4.1.x patch release? :)
>> >>
>> >>
>> >> - - -- --- -  -
>> >> Jacek Lewandowski
>> >>
>> >>
>> >> czw., 9 mar 2023 o 18:11 Brandon Williams 
>> napisał(a):
>> >>>
>> >>> This is my feeling too, but I think we should accomplish this by
>> >>> deprecating it first.  I don't expect anything will change after the
>> >>> deprecation period.
>> >>>
>> >>> Kind Regards,
>> >>> Brandon
>> >>>
>> >>> On Thu, Mar 9, 2023 at 11:09 AM Jacek Lewandowski
>> >>>  wrote:
>> >>> >
>> >>> > I vote for removing it entirely.
>> >>> >
>> >>> > thanks
>> >>> > - - -- --- -  -
>> >>> > Jacek Lewandowski
>> >>> >
>> >>> >
>> >>> > czw., 9 mar 2023 o 18:07 Miklosovic, Stefan <
>> stefan.mikloso...@netapp.com> napisał(a):
>> >>> >>
>> >>> >> Derek,
>> >>> >>
>> >>> >> I have a couple more points ... I do not think that extracting it to
>> a separate repository is "win". That code is on Hadoop 1.0.3. We would be
>> spending a lot of work on extracting it just to extract 10 years old code
>> with occasional updates (in my humble opinion just to make it compilable
>> again if the code around changes). What good is in that? We would have one
>> more place to take care of ... Now we at least have it all in one place.
>> >>> >>
>> >>> >> I believe we have four options:
>> 

Re: Role of Hadoop code in Cassandra 5.0

2023-03-09 Thread Jacek Lewandowski
Is there a ticket for that?

- - -- --- -  -
Jacek Lewandowski


Thu, 9 Mar 2023 at 20:27 Mick Semb Wever  wrote:

>
>
> On Thu, 9 Mar 2023 at 18:54, Brandon Williams  wrote:
>
>> I think if we reach consensus here that decides it. I too vote to
>> deprecate in 4.1.x.  This means we would remove it in 5.0.
>>
>
>
> +1
>
>
>


Re: Role of Hadoop code in Cassandra 5.0

2023-03-09 Thread Jacek Lewandowski
Is it possible to deprecate it in the 4.1.x patch release? :)


- - -- --- -  -
Jacek Lewandowski


Thu, 9 Mar 2023 at 18:11 Brandon Williams  wrote:

> This is my feeling too, but I think we should accomplish this by
> deprecating it first.  I don't expect anything will change after the
> deprecation period.
>
> Kind Regards,
> Brandon
>
> On Thu, Mar 9, 2023 at 11:09 AM Jacek Lewandowski
>  wrote:
> >
> > I vote for removing it entirely.
> >
> > thanks
> > - - -- --- -  -
> > Jacek Lewandowski
> >
> >
> > czw., 9 mar 2023 o 18:07 Miklosovic, Stefan <
> stefan.mikloso...@netapp.com> napisał(a):
> >>
> >> Derek,
> >>
> >> I have a couple more points ... I do not think that extracting it to a
> separate repository is "win". That code is on Hadoop 1.0.3. We would be
> spending a lot of work on extracting it just to extract 10 years old code
> with occasional updates (in my humble opinion just to make it compilable
> again if the code around changes). What good is in that? We would have one
> more place to take care of ... Now we at least have it all in one place.
> >>
> >> I believe we have four options:
> >>
> >> 1) leave it there, as it is, for the coming years with
> questionable and diminishing usage
> >> 2) update it to Hadoop 3.3 (I wonder who is going to do that)
> >> 3) 2) and extract it to a separate repository but if we do 2) we can
> just leave it there
> >> 4) remove it
> >>
> >> 
> >> From: Derek Chen-Becker 
> >> Sent: Thursday, March 9, 2023 15:55
> >> To: dev@cassandra.apache.org
> >> Subject: Re: Role of Hadoop code in Cassandra 5.0
> >>
> >> NetApp Security WARNING: This is an external email. Do not click links
> or open attachments unless you recognize the sender and know the content is
> safe.
> >>
> >>
> >>
> >> I think the question isn't "Who ... is still using that?" but more "are
> we actually going to support it?" If we're on a version that old it would
> appear that we've basically abandoned it, although there do appear to have
> been refactoring (for other things) commits in the last couple of years. I
> would be in favor of removal from 5.0, but at the very least, could it be
> moved into a separate repo/package so that it's not pulling a relatively
> large dependency subtree from Hadoop into our main codebase?
> >>
> >> Cheers,
> >>
> >> Derek
> >>
> >> On Thu, Mar 9, 2023 at 6:44 AM Miklosovic, Stefan <
> stefan.mikloso...@netapp.com> wrote:
> >> Hi list,
> >>
> >> I stumbled upon Hadoop package again. I think there was some discussion
> about the relevancy of Hadoop code some time ago but I would like to ask
> this again.
> >>
> >> Do you think Hadoop code (1) is still relevant in 5.0? Who in the
> industry is still using that?
> >>
> >> We might drop a lot of code and some Hadoop dependencies too (3) (even
> their scope is "provided"). The version of Hadoop we build upon is 1.0.3
> which was released 10 years ago. This code does not have any tests nor
> documentation on the website.
> >>
> >> There seems to be issues like this (2) and it seems like the solution
> is to, basically, use Spark Cassandra connector instead which I would say
> is quite reasonable.
> >>
> >> Regards
> >>
> >> (1)
> https://github.com/apache/cassandra/tree/trunk/src/java/org/apache/cassandra/hadoop
> >> (2) https://lists.apache.org/thread/jdy5hdc2l7l29h04dqol5ylroqos1y2p
> >> (3)
> https://github.com/apache/cassandra/blob/trunk/.build/parent-pom-template.xml#L507-L589
> >>
> >>
> >> --
> >> +---+
> >> | Derek Chen-Becker |
> >> | GPG Key available at https://keybase.io/dchenbecker and   |
> >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> >> +---+
> >>
>


Re: Role of Hadoop code in Cassandra 5.0

2023-03-09 Thread Jacek Lewandowski
... because - why Hadoop? This is something to be made a separate
project if there is a need for it, just like the Spark Cassandra
Connector. Why do we need to include Hadoop-specific classes and nothing
specific for other frameworks?

- - -- --- -  -
Jacek Lewandowski


Thu, 9 Mar 2023 at 18:08 Jacek Lewandowski 
wrote:

> I vote for removing it entirely.
>
> thanks
> - - -- --- -  -----
> Jacek Lewandowski
>
>
> czw., 9 mar 2023 o 18:07 Miklosovic, Stefan 
> napisał(a):
>
>> Derek,
>>
>> I have a couple more points ... I do not think that extracting it to a
>> separate repository is "win". That code is on Hadoop 1.0.3. We would be
>> spending a lot of work on extracting it just to extract 10 years old code
>> with occasional updates (in my humble opinion just to make it compilable
>> again if the code around changes). What good is in that? We would have one
>> more place to take care of ... Now we at least have it all in one place.
>>
>> I believe we have four options:
>>
>> 1) leave it there, as it is, for the coming years with
>> questionable and diminishing usage
>> 2) update it to Hadoop 3.3 (I wonder who is going to do that)
>> 3) 2) and extract it to a separate repository but if we do 2) we can just
>> leave it there
>> 4) remove it
>>
>> 
>> From: Derek Chen-Becker 
>> Sent: Thursday, March 9, 2023 15:55
>> To: dev@cassandra.apache.org
>> Subject: Re: Role of Hadoop code in Cassandra 5.0
>>
>> NetApp Security WARNING: This is an external email. Do not click links or
>> open attachments unless you recognize the sender and know the content is
>> safe.
>>
>>
>>
>> I think the question isn't "Who ... is still using that?" but more "are
>> we actually going to support it?" If we're on a version that old it would
>> appear that we've basically abandoned it, although there do appear to have
>> been refactoring (for other things) commits in the last couple of years. I
>> would be in favor of removal from 5.0, but at the very least, could it be
>> moved into a separate repo/package so that it's not pulling a relatively
>> large dependency subtree from Hadoop into our main codebase?
>>
>> Cheers,
>>
>> Derek
>>
>> On Thu, Mar 9, 2023 at 6:44 AM Miklosovic, Stefan <
>> stefan.mikloso...@netapp.com> wrote:
>> Hi list,
>>
>> I stumbled upon Hadoop package again. I think there was some discussion
>> about the relevancy of Hadoop code some time ago but I would like to ask
>> this again.
>>
>> Do you think Hadoop code (1) is still relevant in 5.0? Who in the
>> industry is still using that?
>>
>> We might drop a lot of code and some Hadoop dependencies too (3) (even
>> their scope is "provided"). The version of Hadoop we build upon is 1.0.3
>> which was released 10 years ago. This code does not have any tests nor
>> documentation on the website.
>>
>> There seems to be issues like this (2) and it seems like the solution is
>> to, basically, use Spark Cassandra connector instead which I would say is
>> quite reasonable.
>>
>> Regards
>>
>> (1)
>> https://github.com/apache/cassandra/tree/trunk/src/java/org/apache/cassandra/hadoop
>> (2) https://lists.apache.org/thread/jdy5hdc2l7l29h04dqol5ylroqos1y2p
>> (3)
>> https://github.com/apache/cassandra/blob/trunk/.build/parent-pom-template.xml#L507-L589
>>
>>
>> --
>> +---+
>> | Derek Chen-Becker |
>> | GPG Key available at https://keybase.io/dchenbecker and   |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---+
>>
>>


Re: Role of Hadoop code in Cassandra 5.0

2023-03-09 Thread Jacek Lewandowski
I vote for removing it entirely.

thanks
- - -- --- -  -
Jacek Lewandowski


Thu, 9 Mar 2023 at 18:07 Miklosovic, Stefan 
wrote:

> Derek,
>
> I have couple more points ... I do not think that extracting it to a
> separate repository is "win". That code is on Hadoop 1.0.3. We would be
> spending a lot of work on extracting it just to extract 10 years old code
> with occasional updates (in my humble opinion just to make it compilable
> again if the code around changes). What good is in that? We would have one
> more place to take care of ... Now we at least have it all in one place.
>
> I believe we have four options:
>
> 1) leave it there, as it is, for the coming years with
> questionable and diminishing usage
> 2) update it to Hadoop 3.3 (I wonder who is going to do that)
> 3) 2) and extract it to a separate repository but if we do 2) we can just
> leave it there
> 4) remove it
>
> 
> From: Derek Chen-Becker 
> Sent: Thursday, March 9, 2023 15:55
> To: dev@cassandra.apache.org
> Subject: Re: Role of Hadoop code in Cassandra 5.0
>
> NetApp Security WARNING: This is an external email. Do not click links or
> open attachments unless you recognize the sender and know the content is
> safe.
>
>
>
> I think the question isn't "Who ... is still using that?" but more "are we
> actually going to support it?" If we're on a version that old it would
> appear that we've basically abandoned it, although there do appear to have
> been refactoring (for other things) commits in the last couple of years. I
> would be in favor of removal from 5.0, but at the very least, could it be
> moved into a separate repo/package so that it's not pulling a relatively
> large dependency subtree from Hadoop into our main codebase?
>
> Cheers,
>
> Derek
>
> On Thu, Mar 9, 2023 at 6:44 AM Miklosovic, Stefan <
> stefan.mikloso...@netapp.com> wrote:
> Hi list,
>
> I stumbled upon Hadoop package again. I think there was some discussion
> about the relevancy of Hadoop code some time ago but I would like to ask
> this again.
>
> Do you think Hadoop code (1) is still relevant in 5.0? Who in the industry
> is still using that?
>
> We might drop a lot of code and some Hadoop dependencies too (3) (even
> their scope is "provided"). The version of Hadoop we build upon is 1.0.3
> which was released 10 years ago. This code does not have any tests nor
> documentation on the website.
>
> There seems to be issues like this (2) and it seems like the solution is
> to, basically, use Spark Cassandra connector instead which I would say is
> quite reasonable.
>
> Regards
>
> (1)
> https://github.com/apache/cassandra/tree/trunk/src/java/org/apache/cassandra/hadoop
> (2) https://lists.apache.org/thread/jdy5hdc2l7l29h04dqol5ylroqos1y2p
> (3)
> https://github.com/apache/cassandra/blob/trunk/.build/parent-pom-template.xml#L507-L589
>
>
> --
> +---+
> | Derek Chen-Becker |
> | GPG Key available at https://keybase.io/dchenbecker and   |
> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> +---+
>
>


Re: Downgradability

2023-03-06 Thread Jacek Lewandowski
A bit of organization - I've created the
https://issues.apache.org/jira/browse/CASSANDRA-18300 epic to track tickets
related to downgradability. Please add any tickets you are aware of.

thanks
- - -- --- -  -
Jacek Lewandowski


Thu, 23 Feb 2023 at 17:47 Benedict  wrote:

> Either way, it feels like this has become much more of a big deal than it
> needed to.
>
> I would prefer the pending patches to avoid breaking compatibility, as I
> think they can do it easily. But, if we agree to block release until we can
> double back to fix it with versioned writing (which I agree with Jacek are
> LHF - I think we literally just need a method that chooses the descriptor
> version) then let’s not further agonise over this.
>
> Alternatively I’d be happy to work with the authors to get this done
> alongside their work, as I don’t think it would hold anything up. We just
> need something to pick a descriptor besides latest on write, everything
> else is basically there for these patches.
>
>
> On 23 Feb 2023, at 16:37, Henrik Ingo  wrote:
>
> 
> Right. So I'm speculating everyone else who worked on a patch that breaks
> compatibility has been working under the mindset "I'll just put this behind
> the same switch". Or something more vague / even less correct, such as
> assuming that tries would become the default immediately.
>
> At least in my mind when we talk about the "switch to enable tries" I do
> also consider things like "don't break streaming". So I guess whether one
> considers "a switch" to exist already or not, might be subjective in this
> case, because people have different assumptions on the definition of done
> of such a switch.
>
> henrik
>
> On Thu, Feb 23, 2023 at 2:53 PM Benedict  wrote:
>
>> I don’t think there’s anything about a new format that requires a version
>> bump, but I could be missing something.
>>
>> We have to have a switch to enable tries already don’t we? I’m pretty
>> sure we haven’t talked about switching the default format?
>>
>> On 23 Feb 2023, at 12:12, Henrik Ingo  wrote:
>>
>> 
>> On Thu, Feb 23, 2023 at 11:57 AM Benedict  wrote:
>>
>>> Can somebody explain to me why this is being fought tooth and nail, when
>>> the work involved is absolutely minimal?
>>>
>>>
>> I don't know how each individual has been thinking about this, but it
>> seems to me just looking at all the tasks that at least the introduction of
>> tries is a major format change anyway - since it's the whole point - and
>> therefore people working on other tasks may have assumed the format is
>> changing anyway and therefore something like a switch (is this what is
>> referred to as the C-8110 solution?) will take care of it for everyone.
>>
>> I'm not sure there's consensus that such a switch is a sufficient
>> resolution to this discussion, but if there were such a consensus, the next
>> question would be whether the patches that are otherwise ready now can
>> merge, or whether they will all be blocked waiting for the compatibility
>> solution first. And possibly better testing, etc. Letting them merge would
>> be justified by the desire to have more frequent and smaller increments of
>> work merged into trunk... well, I'm not going to repeat everything from
>> that discussion but the same pro's and con's apply.
>>
>> henrik
>> --
>>
>> Henrik Ingo
>>
>> c. +358 40 569 7354
>>
>> w. www.datastax.com
>>
>>
>> <https://www.facebook.com/datastax>
>> <https://twitter.com/datastax>
>> <https://www.linkedin.com/company/datastax/>
>> <https://github.com/datastax/>
>>
>>
>
> --
>
> Henrik Ingo
>
> c. +358 40 569 7354
>
> w. www.datastax.com
>
> <https://www.facebook.com/datastax>  <https://twitter.com/datastax>
> <https://www.linkedin.com/company/datastax/>
> <https://github.com/datastax/>
>
>


Merging CEP-17

2023-02-27 Thread Jacek Lewandowski
Hi,

As mentioned in Slack over a week ago, we are going to merge CEP-17 (
https://issues.apache.org/jira/browse/CASSANDRA-17056). The PR has been
reviewed thoroughly and passes the tests. It is mostly a refactoring
introducing the ability to implement a custom sstable format, and it makes
space for CEP-25 (big table with trie indexes), which is a significant
performance improvement; it will also contribute to the ability to
downgrade (which seems to be becoming more and more important).

If somebody has questions or doubts, please speak up. Unless there is strong
controversy, I'm going to merge it this week.

I'm also open to doing some follow-up work if something is missing or
something more is needed.

Thanks,
- - -- --- -  -
Jacek Lewandowski


Re: Downgradability

2023-02-23 Thread Jacek Lewandowski
Running the upgrade tests backwards is a great idea which does not require
extra work.

For stats metadata, writing in a previous serialization version is already
supported.

We need a small fix in compression metadata and that's it.

A flag with the write format version is probably LHF.

Maybe let's try; we still have time to fix it before 5.0.


Thu, 23 Feb 2023, 10:57 Benedict  wrote:

> Forget downgradeability for a moment: we should not be breaking format
> compatibility without good reason. Bumping a major version isn’t enough of
> a reason.
>
> Can somebody explain to me why this is being fought tooth and nail, when
> the work involved is absolutely minimal?
>
> Regarding tests: what more do you want, than running our upgrade suite
> backwards?
>
>
> On 23 Feb 2023, at 09:45, Benjamin Lerer  wrote:
>
> 
>
>> Can somebody explain to me what is so burdensome, that we seem to be
>> spending longer debating it than it would take to implement the necessary
>> changes?
>
>
> I believe that we all agree on the outcome. Everybody wants
> downgradability. The issue is on the path to get there.
>
> As far as I am concerned, I would like to see a proper solution and as
> Jeff suggested, the equivalent of the upgrade tests as gatekeepers. Having
> everybody try to enforce it in their own way will only lead to a poor
> result in my opinion, with some ad hoc code that might not really guarantee
> real downgradability in the end.
> We have rushed in the past to get features out and paid the price for it. I
> simply prefer that we take the time to do things right.
>
> Thanks to Scott and you, downgradability got a much better visibility so
> no matter what approach we pick, I am convinced that we will get there.
>
> On Thu, 23 Feb 2023 at 09:49, Claude Warren, Jr via dev <
> dev@cassandra.apache.org> wrote:
>
>> Broken downgrading can be fixed (I think) by modifying the
>> SerializationHeader.toHeader() method where it currently throws an
>> UnknownColumnException.  If we can, instead of throwing the exception,
>> create a dropped column for the unexpected column then I think the code
>> will work.
>>
>> I realise that to do this in the wild is not possible as it is a
>> change to released code, but we could handle it going forward.
>>
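The suggestion quoted above can be sketched as follows. This is a minimal illustrative model, not the actual SerializationHeader code; the class and member names here are invented, and the real patch would have to deal with type serialization and timestamps for the synthesized dropped column:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model: when a header mentions a column the local schema does not
// know, record it as a synthetic "dropped" column (keeping the on-disk type so
// its value bytes can still be skipped) instead of failing the read.
class HeaderSketch
{
    private final Map<String, String> schema;                 // column name -> CQL type
    final Set<String> syntheticDropped = new HashSet<>();

    HeaderSketch(Map<String, String> schema)
    {
        this.schema = new HashMap<>(schema);
    }

    /** Resolve a column seen in an sstable header against the local schema. */
    String resolve(String column, String typeFromHeader)
    {
        String type = schema.get(column);
        if (type != null)
            return type;
        // previously the equivalent spot would throw UnknownColumnException;
        // here we tolerate the column and remember it as dropped
        syntheticDropped.add(column);
        return typeFromHeader;
    }
}
```

With this shape, a 3.1-era reader handed a header containing an unknown `broadcast_port` column would skip its values rather than abort.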
>> On Wed, Feb 22, 2023 at 11:21 PM Henrik Ingo 
>> wrote:
>>
>>> ... ok apparently shift+enter  sends messages now?
>>>
>>> I was just saying if at least the file format AND system/tables -
>>> anything written to disk - can be protected with a switch, then it allows
>>> for quick downgrade by shutting down the entire cluster and restarting with
>>> the downgraded binary. It's a start.
>>>
>>> To be able to do that live in a distributed system needs to consider
>>> much more: gossip, streaming, drivers, and ultimately all features, because
>>> we don't' want an application developer to use a shiny new thing that a)
>>> may not be available on all nodes, or b) may disappear if the cluster has
>>> to be downgraded later.
>>>
>>> henrik
>>>
>>> On Thu, Feb 23, 2023 at 1:14 AM Henrik Ingo 
>>> wrote:
>>>
 Just this once I'm going to be really brief :-)

 Just wanted to share for reference how Mongodb implemented
 downgradeability around their 4.4 version:
 https://www.mongodb.com/docs/manual/release-notes/6.0-downgrade-sharded-cluster/

 Jeff you're right. Ultimately this is about more than file formats.
 However, ideally if at least the

 On Mon, Feb 20, 2023 at 10:02 PM Jeff Jirsa  wrote:

> I'm not even convinced even 8110 addresses this - just writing
> sstables in old versions won't help if we ever add things like new types 
> or
> new types of collections without other control abilities. Claude's other
> email in another thread a few hours ago talks about some of these 
> surprises
> - "Specifically during the 3.1 -> 4.0 changes a column broadcast_port was
> added to system/local.  This means that 3.1 system can not read the table
> as it has no definition for it.  I tried marking the column for deletion 
> in
> the metadata and in the serialization header.  The later got past the
> column not found problem, but I suspect that it just means that data
> columns after broadcast_port shifted and so incorrectly read." - this is a
> harder problem to solve than just versioning sstables and network
> protocols.
>
> Stepping back a bit, we have downgrade ability listed as a goal, but
> it's not (as far as I can tell) universally enforced, nor is it clear at
> which point we will be able to concretely say "this release can be
> downgraded to X".   Until we actually define and agree that this is a
> real goal with a concrete version where downgrade-ability becomes real, it
> feels like things are somewhat arbitrarily enforced, which is probably 
> very
> frustrating for people trying to commit work/tickets.
>
> - Jeff
>
>
>

Re: [DISCUSSION] Cassandra's code style and source code analysis

2023-02-22 Thread Jacek Lewandowski
I suppose it can be easy for the existing feature branches if they have a
single commit. Don't we need to adjust each commit for multi-commit feature
branches?

Wed, 22 Feb 2023, 19:48 Maxim Muzafarov 
wrote:

> Hello everyone,
>
> I have created an issue CASSANDRA-18277 that may help us move forward
> with code style changes. It only affects the way we store the IntelliJ
> code style configuration and has no effect on any current (or any)
> releases, so it should be safe to merge. So, once the issue is
> resolved, every developer that checkouts a release branch will use the
> same code style stored in that branch. This in turn makes rebasing a
> big change like the import order [1] a really straightforward matter
> (by pressing Ctrl + Opt + O in their local branch to organize
> imports).
>
> See:
>
> Move the IntelliJ Idea code style and inspections configuration to the
> project's root .idea directory
> https://issues.apache.org/jira/browse/CASSANDRA-18277
>
>
>
> [1] https://issues.apache.org/jira/browse/CASSANDRA-17925
>
> On Wed, 25 Jan 2023 at 13:05, Miklosovic, Stefan
>  wrote:
> >
> > Thank you Maxim for doing this.
> >
> > It is nice to see this effort materialized in a PR.
> >
> > I would wait until bigger chunks of work are committed to trunk (like
> CEP-15) to not collide too much. I would say we can postpone doing this
> until the actual 5.0 release, in the last weeks before it, so we would not clash
> with any work people would like to include in 5.0. This can go in anytime,
> basically.
> >
> > Are people on the same page?
> >
> > Regards
> >
> > 
> > From: Maxim Muzafarov 
> > Sent: Monday, January 23, 2023 19:46
> > To: dev@cassandra.apache.org
> > Subject: Re: [DISCUSSION] Cassandra's code style and source code analysis
> >
> > NetApp Security WARNING: This is an external email. Do not click links
> or open attachments unless you recognize the sender and know the content is
> safe.
> >
> >
> >
> >
> > Hello everyone,
> >
> > You can find the changes here:
> > https://issues.apache.org/jira/browse/CASSANDRA-17925
> >
> > While preparing the code style configuration for the Eclipse IDE, I
> > discovered that there was no easy way to have complex grouping options
> > for the set of packages. So we need to add extra blank lines between
> > each group of packages so that all the configurations for Eclipse,
> > NetBeans, IntelliJ IDEA and checkstyle are aligned. I should have
> > checked this earlier for sure, but I only did it for static imports
> > and some groups, my bad. The resultant configuration looks like this:
> >
> > java.*
> > [blank line]
> > javax.*
> > [blank line]
> > com.*
> > [blank line]
> > net.*
> > [blank line]
> > org.*
> > [blank line]
> > org.apache.cassandra.*
> > [blank line]
> > all other imports
> > [blank line]
> > static all other imports
> >
> > The pull request is here:
> > https://github.com/apache/cassandra/pull/2108
> >
> > The configuration-related changes are placed in a dedicated commit, so
> > it should be easy to make a review:
> >
> https://github.com/apache/cassandra/pull/2108/commits/84e292ddc9671a0be76ceb9304b2b9a051c2d52a
> >
> > 
> >
> > Another important thing to mention is that the total amount of changes
> > for organising imports is really big (more than 2000 files!), so we
> > need to decide the right time to merge this PR. Although rebasing or
> > merging changes to development branches should become much easier
> > ("Accept local" + "Organize imports"), we still need to pay extra
> > attention here to minimise the impact on major patches for the next
> > release.
> >
> > On Mon, 16 Jan 2023 at 13:16, Maxim Muzafarov  wrote:
> > >
> > > Stefan,
> > >
> > > Thank you for bringing this topic up. I'll prepare the PR shortly with
> > > option 4, so everyone can take a look at the amount of changes. This
> > > does not force us to go exactly this path, but it may shed light on
> > > changes in general.
> > >
> > > What exactly we're planning to do in the PR:
> > >
> > > 1. Checkstyle AvoidStarImport rule, so no star imports will be allowed.
> > > 2. Checkstyle ImportOrder rule, for controlling the order.
> > > 3. The IDE code style configuration for Intellij IDEA, NetBeans, and
> > > Eclipse (it doesn't exist for Eclipse yet).
> > > 4. The import order according to option 4:
> > >
> > > ```
> > > java.*
> > > javax.*
> > > [blank line]
> > > com.*
> > > net.*
> > > org.*
> > > [blank line]
> > > org.apache.cassandra.*
> > > [blank line]
> > > all other imports
> > > [blank line]
> > > static all other imports
> > > ```
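
As an illustration, the "option 4" grouping applied to a minimal Java file might look like the sketch below. Only JDK imports are used so it compiles stand-alone; the third-party (com.*/net.*/org.*) and org.apache.cassandra.* groups are shown as comments because those dependencies are illustrative assumptions, not part of this sketch.

```java
// File layout under the proposed "option 4" import ordering:
// java.* and javax.* share a group, then third-party com.*/net.*/org.*,
// then org.apache.cassandra.*, then other imports, then static imports.
import java.util.Arrays;
import java.util.List;
import javax.net.ssl.SSLContext;

// import com.google.common.collect.ImmutableList;  // third-party group (illustrative)

// import org.apache.cassandra.io.sstable.SSTable;  // project group (illustrative)

import static java.util.Collections.unmodifiableList;

public class ImportOrderExample
{
    // Returns the five blank-line-separated groups of "option 4".
    public static List<String> groups()
    {
        return unmodifiableList(Arrays.asList(
            "java.* + javax.*",
            "com.* + net.* + org.*",
            "org.apache.cassandra.*",
            "other imports",
            "static imports"));
    }

    public static void main(String[] args)
    {
        System.out.println(groups());
    }
}
```

Note that the later revision described above splits every group with a blank line; under that scheme the javax.* import would also be separated from java.* by a blank line.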
> > >
> > >
> > >
> > > On Mon, 16 Jan 2023 at 12:39, Miklosovic, Stefan
> > >  wrote:
> > > >
> > > > Based on the voting we should go with option 4?
> > > >
> > > > Two weeks passed without anybody joining so I guess folks are all
> happy with that or this just went unnoticed?
> > > >
> > > > Let's give it time until the end of this week (Friday 12:00 

Re: Downgradability

2023-02-20 Thread Jacek Lewandowski
I'd like to mention CASSANDRA-17056 (CEP-17) here as it aims to introduce
multiple sstable formats support. It allows for providing an implementation
of SSTableFormat along with SSTableReader and SSTableWriter. That could be
extended easily to support different implementations for certain version
ranges, like one impl for ma-nz, other for oa+, etc. without having a
confusing implementation with a lot of conditional blocks. Old formats in
such case could be maintained separately from the main code and easily
switched any time.

thanks
- - -- --- -  -
Jacek Lewandowski


On Tue, 21 Feb 2023 at 01:46, Yuki Morishita  wrote:

> Hi,
>
> What I wanted to address in my comment in CASSANDRA-8110(
> https://issues.apache.org/jira/browse/CASSANDRA-8110?focusedCommentId=17641705=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17641705)
> is to focus on better upgrade experience.
>
> Upgrading the cluster can be painful for some orgs with mission critical
> Cassandra cluster, where they cannot tolerate less availability because of
> the inability to replace the downed node.
> They also need to plan rolling back to the previous state when something
> happens along the way.
> The change I proposed in CASSANDRA-8110 is to achieve the goal of at least
> enabling SSTable streaming during the upgrade by not upgrading the SSTable
> version. This makes it easy to roll the cluster back to the previous
> version.
> Downgrading SSTable is not the primary focus (though Cassandra needs to
> implement the way to write SSTable in older versions, so it is somewhat
> related.)
>
> I'm preparing the design doc for the change.
> Also, if I should create a separate ticket from CASSANDRA-8110 for the
> clarity of the goal of the change, please let me know.
>
>
> On Tue, Feb 21, 2023 at 5:31 AM Benedict  wrote:
>
>> FWIW I think 8110 is the right approach, even if it isn’t a panacea. We
>> will have to eventually also tackle system schema changes (probably not
>> hard), and may have to think a little carefully about other things, eg with
>> TTLs the format change is only the contract about what values can be
>> present, so we have to make sure the data validity checks are consistent
>> with the format we write. It isn’t as simple as writing an earlier version
>> in this case (unless we permit truncating the TTL, perhaps)
>>
>> On 20 Feb 2023, at 20:24, Benedict  wrote:
>>
>>
>> 
>> In a self-organising community, things that aren’t self-policed naturally
>> end up policed in an adhoc manner, and with difficulty. I’m not sure that’s
>> the same as arbitrary enforcement. It seems to me the real issue is nobody
>> noticed this was agreed and/or forgot and didn’t think about it much.
>>
>> But, even without any prior agreement, it’s perfectly reasonable to
>> request that things do not break compatibility if they do not need to, as
>> part of the normal patch integration process.
>>
>> Issues with 3.1->4.0 aren’t particularly relevant as they predate any
>> agreement to do this. But we can and should address the problem of new
>> columns in schema tables, as this happens often between versions. I’m not
>> sure it has in 4.1 though?
>>
>> Regarding downgrade versions, surely this should simply be the same as
>> upgrade versions we support?
>>
>>
>> On 20 Feb 2023, at 20:02, Jeff Jirsa  wrote:
>>
>> 
>> I'm not even convinced even 8110 addresses this - just writing sstables
>> in old versions won't help if we ever add things like new types or new
>> types of collections without other control abilities. Claude's other email
>> in another thread a few hours ago talks about some of these surprises -
>> "Specifically during the 3.1 -> 4.0 changes a column broadcast_port was
>> added to system/local.  This means that 3.1 system can not read the table
>> as it has no definition for it.  I tried marking the column for deletion in
>> the metadata and in the serialization header.  The later got past the
>> column not found problem, but I suspect that it just means that data
>> columns after broadcast_port shifted and so incorrectly read." - this is a
>> harder problem to solve than just versioning sstables and network
>> protocols.
>>
>> Stepping back a bit, we have downgrade ability listed as a goal, but it's
>> not (as far as I can tell) universally enforced, nor is it clear at which
>> point we will be able to concretely say "this release can be downgraded to
>> X".   Until we actually define and agree that this is a real goal with a
>> concrete version where downgrade-ability becomes rea

Re: [VOTE] CEP-21 Transactional Cluster Metadata

2023-02-07 Thread Jacek Lewandowski
+1

- - -- --- -  -
Jacek Lewandowski


On Tue, 7 Feb 2023 at 10:12, Benjamin Lerer  wrote:

> +1
>
> On Tue, 7 Feb 2023 at 06:21,  wrote:
>
>> +1 nb
>>
>> On Feb 6, 2023, at 9:05 PM, Ariel Weisberg  wrote:
>>
>> +1
>>
>> On Mon, Feb 6, 2023, at 11:15 AM, Sam Tunnicliffe wrote:
>>
>> Hi everyone,
>>
>> I would like to start a vote on this CEP.
>>
>> Proposal:
>>
>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-21%3A+Transactional+Cluster+Metadata
>>
>> Discussion:
>> https://lists.apache.org/thread/h25skwkbdztz9hj2pxtgh39rnjfzckk7
>>
>> The vote will be open for 72 hours.
>> A vote passes if there are at least three binding +1s and no binding
>> vetoes.
>>
>> Thanks,
>> Sam
>>
>>
>>


Re: Welcome Patrick McFadin as Cassandra Committer

2023-02-02 Thread Jacek Lewandowski
Congrats !!!

- - -- --- -  -
Jacek Lewandowski


On Thu, 2 Feb 2023 at 19:10, Melissa Logan  wrote:

> Congrats, Patrick!
>
> On Thu, Feb 2, 2023 at 9:58 AM Benjamin Lerer  wrote:
>
>> The PMC members are pleased to announce that Patrick McFadin has accepted
>> the invitation to become committer today.
>>
>> Thanks a lot, Patrick, for everything you have done for this project and
>> its community through the years.
>>
>> Congratulations and welcome!
>>
>> The Apache Cassandra PMC members
>>
>
>
> --
> Melissa Logan (she/her)
> CEO & Founder, Constantia.io
> LinkedIn <https://www.linkedin.com/in/mklogan/> | Twitter
> <https://twitter.com/Melissa_B2B>
>
>
>


Re: Merging CEP-15 to trunk

2023-01-16 Thread Jacek Lewandowski
Hi,

It would be great if some documentation got added to the code you want to
merge. To me, it would be enough to briefly characterize at the class level
what each class is for and what the expectations are. This is especially
important for the Accord API classes, because right now it is hard to review
whether the implementation in Cassandra conforms to the API requirements.

Given that others will be able to try Accord before the release, it would be
good to create some CQL syntax documentation, something like a chapter in
https://cassandra.apache.org/doc/latest/cassandra/cql/index.html but for the
unreleased Cassandra version, or a blog post, so that the syntax is known to
users and they can quickly get up to speed, hopefully reporting any problems
early.

- - -- --- -  -
Jacek Lewandowski


On Mon, 16 Jan 2023 at 17:52, Benedict  wrote:

> That’s fair, though for long term contributors probably the risk is
> relatively low on that front. I guess that’s something we can perhaps raise
> as part of each CEP if we envisage it taking several months of development?
>
> > Did we document this or is it in an email thread somewhere?
>
> It’s probably buried in one of the many threads we’ve had about related
> topics on releases and development. We’ve definitely discussed feature
> branches before, and I recall discussing a goal of merging ~quarterly. But
> perhaps like most sub topics it didn’t get enough visibility, in which case
> this thread I suppose can serve as a dedicated rehash and we can formalise
> whatever falls out.
>
> In theory as Jeremiah says there’s only the normal merge criteria. But
> that includes nobody saying no to a piece of work or raising concerns, and
> advertising the opportunity to say no is important for that IMO.
>
> On 16 Jan 2023, at 16:36, J. D. Jordan  wrote:
>
> 
> My only concern to merging (given all normal requirements are met) would
> be if there was a possibility that the feature would never be finished.
> Given all of the excitement and activity around accord, I do not think that
> is a concern here. So I see no reason not to merge incremental progress
> behind a feature flag.
>
> -Jeremiah
>
> On Jan 16, 2023, at 10:30 AM, Josh McKenzie  wrote:
>
> 
> Did we document this or is it in an email thread somewhere?
>
> I don't see it on the confluence wiki nor does a cursory search of
> ponymail turn it up.
>
> What was it for something flagged experimental?
> 1. Same tests pass on the branch as to the root it's merging back to
> 2. 2 committers eyes on (author + reviewer or 2 reviewers, etc)
> 3. Disabled by default w/flag to enable
>
> So really only the 3rd thing is different right? Probably ought to add an
> informal step 4 which Benedict is doing here which is "hit the dev ML w/a
> DISCUSS thread about the upcoming merge so it's on people's radar and they
> can coordinate".
>
> On Mon, Jan 16, 2023, at 11:08 AM, Benedict wrote:
>
> My goal isn’t to ask if others believe we have the right to merge, only to
> invite feedback if there are any specific concerns. Large pieces of work
> like this cause headaches and concerns for other contributors, and so it’s
> only polite to provide notice of our intention, since probably many haven’t
> even noticed the feature branch developing.
>
> The relevant standard for merging a feature branch, if we want to rehash
> that, is that it is feature- and bug-neutral by default, ie that a release
> could be cut afterwards while maintaining our usual quality standards, and
> that the feature is disabled by default, yes. It is not, however,
> feature-complete or production-ready as a feature; requiring that would
> prevent any incremental merging of feature development.
>
> > On 16 Jan 2023, at 15:57, J. D. Jordan 
> wrote:
> >
> > I haven’t been following the progress of the feature branch, but I
> would think the requirements for merging it into master would be the same
> as any other merge.
> >
> > A subset of those requirements being:
> > Is the code to be merged in releasable quality? Is it disabled by a
> feature flag by default if not?
> > Do all the tests pass?
> > Has there been review and +1 by two committer?
> >
> > If the code in the feature branch meets all of the merging criteria of
> the project then I see no reason to keep it in a feature branch for ever.
> >
> > -Jeremiah
> >
> >
> >> On Jan 16, 2023, at 3:21 AM, Benedict  wrote:
> >>
> >> Hi Everyone, I hope you all had a lovely holiday period.
> >>
> >> Those who have been following along will have seen a steady drip of
> progress into the cep-15-accord feature branch over the past year. We
> originally discusse

[DISCUSS] Clear rules about sstable versioning and downgrade support

2023-01-13 Thread Jacek Lewandowski
Hi,

I'd like to bring that topic to your attention. I think that we should
think about allowing users to downgrade under certain conditions. For
example, always allow for downgrading to any previous minor release.

Clear rules should make users feel safer when upgrading and perhaps
encourage trying Cassandra at all.

One of the things related to that is the sstable format version. It consists
of major and minor components and is incremented independently from Cassandra
releases. One rule here is that a Cassandra release producing sstables at
version XY should be able to read any sstable with version (X-1)* and X*
(that is, all minor versions of the previous and current major format,
including future minor versions). Perhaps we could make some commitment to
change the major sstable format only with a new major release?
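
The proposed read-compatibility rule can be sketched as a tiny predicate. This is a hedged illustration, assuming single-letter major components (as in "ma", "nb") and that compatibility spans exactly the current and previous major; the class and method names are invented for the example.

```java
// Sketch of the rule: a release writing sstable format version "XY"
// (major letter X, minor letter Y) can read any sstable whose major
// letter is X or the letter immediately before it, regardless of minor.
public class FormatCompat
{
    static boolean canRead(char currentMajor, char sstableMajor)
    {
        return sstableMajor == currentMajor || sstableMajor == currentMajor - 1;
    }

    public static void main(String[] args)
    {
        System.out.println(canRead('n', 'm')); // "nb" can read "ma".."mz"
        System.out.println(canRead('n', 'n')); // same major: readable
        System.out.println(canRead('n', 'l')); // two majors back: not readable
    }
}
```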

What do you think?

Thanks
- - -- --- -  -
Jacek Lewandowski


Re: Introducing mockito-inline library among test dependencies

2023-01-12 Thread Jacek Lewandowski
Will it work with Java17?

On Thu, 12 Jan 2023 at 12:56, Brandon Williams  wrote:

> +1
>
> Kind Regards,
> Brandon
>
> On Wed, Jan 11, 2023 at 2:02 PM Miklosovic, Stefan
>  wrote:
> >
> > Hi list,
> >
> > the test for (1) is using mockito-inline dependency for mocking static
> methods as mockito-core is not able to do that on its own. mockito-inline
> was not part of our test dependencies prior this work. I want to ask if we
> are all OK with being able to mock static methods from now on with the help
> of this library.
> >
> > Please tell me if we are mocking static methods already by some other
> (to me yet unknown) mean so we do not include this unnecessarily.
> >
> > G:A:V is org.mockito:mockito-inline:4.7.0
> >
> > (1) https://issues.apache.org/jira/browse/CASSANDRA-14361
> >
> > Thanks
>


Re: upgrade sstable selection

2023-01-11 Thread Jacek Lewandowski
Hi,

It does not look like returning an unsorted map is a bug. Whenever we want
to sort sstables according to their generation id, we do that explicitly,
for example in leveled compactor.

Though, as you pointed out, the order should be retained when upgrading
sstables, because otherwise things like the mentioned compactor would get the
wrong order.

I suggest creating a ticket and fixing the upgrader code by explicitly
ordering sstables by their id. The comparator for SSTableId is available in
SSTableIdFactory.COMPARATOR.
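
The suggested fix amounts to sorting the files by generation before processing them, instead of relying on HashMap iteration order. The sketch below illustrates the idea with plain filename parsing; the real code would compare Descriptors via SSTableIdFactory.COMPARATOR rather than a regex, which is a simplification made here for a self-contained example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GenerationOrder
{
    // Extracts the numeric generation from names like "nb-4-big-Data.db".
    private static final Pattern GEN = Pattern.compile("-(\\d+)-big-Data\\.db$");

    static int generation(String name)
    {
        Matcher m = GEN.matcher(name);
        if (!m.find())
            throw new IllegalArgumentException("unexpected sstable name: " + name);
        return Integer.parseInt(m.group(1));
    }

    // Sort ascending by generation before upgrading/downgrading.
    public static List<String> sortedByGeneration(Collection<String> names)
    {
        List<String> out = new ArrayList<>(names);
        out.sort(Comparator.comparingInt(GenerationOrder::generation));
        return out;
    }

    public static void main(String[] args)
    {
        // The out-of-order listing from the standalonedowngrader run above:
        List<String> files = Arrays.asList(
            "nb-6-big-Data.db", "nb-4-big-Data.db", "nb-5-big-Data.db");
        System.out.println(sortedByGeneration(files));
    }
}
```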

Thanks,
- - -- --- -  -
Jacek Lewandowski


On Tue, Jan 10, 2023 at 2:26 PM Claude Warren, Jr via dev <
dev@cassandra.apache.org> wrote:

> Actually, the Directories.SSTableLister stores the Components in a
> HashMap indexed by the Descriptor, and the upgrade/downgrade code
> retrieves the list in hash order, so there is no guarantee that the files
> will be in order.  I suspect that this is a bug.
>
> On Tue, Jan 10, 2023 at 12:34 PM Brandon Williams 
> wrote:
>
>> > I think this means that the Directories.SSTableLister on occasion
>> returns files in the incorrect order during a call to
>> lister.list().entrySet()
>>
>> This seems easy enough to verify by looping it and examining the results.
>>
>> Kind Regards,
>> Brandon
>>
>> On Tue, Jan 10, 2023 at 4:44 AM Claude Warren, Jr via dev
>>  wrote:
>> >
>> > Greetings,
>> >
>> > I am working on the downgradesstables code and seem to have a problem
>> with ordering of the downgrade or perhaps the Directories.SSTableLister
>> >
>> > I lifted the code from upgradesstables to select the files to
>> downgrade.  The only difference in the code that selects the files to
>> downgrade is the actual selection of the file.  There is no change to the
>> ordering of the files that are evaluated for inclusion.  Yet I think the
>> downgrade ordering is incorrect.
>> >
>> > My process is to start the 3.1 version to create the tables and then use
>> the 4.0 code base to run the standaloneupgrader and then the
>> standalonedowngrader
>> >
>> > When running the standaloneupgrader on system local I see the following
>> > {{noformat}}
>> > Found 3 sstables that need upgrading.
>> > Upgrading
>> BigTableReader(path='/var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/ma-1-big-Data.db')
>> > Upgrade of
>> BigTableReader(path='/var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/ma-1-big-Data.db')
>> complete.
>> > Upgrading
>> BigTableReader(path='/var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/ma-2-big-Data.db')
>> > Upgrade of
>> BigTableReader(path='/var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/ma-2-big-Data.db')
>> complete.
>> > Upgrading
>> BigTableReader(path='/var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/ma-3-big-Data.db')
>> > Upgrade of
>> BigTableReader(path='/var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/ma-3-big-Data.db')
>> complete.
>> > {{noformat}}
>> >
>> > when running the standalonedowngrader is see
>> > {{noformat}}
>> > Found 3 sstables that need downgrading.
>> > Downgrading
>> BigTableReader(path='/var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/nb-6-big-Data.db')
>> > Downgrade of
>> BigTableReader(path='/var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/nb-6-big-Data.db')
>> complete.
>> > Downgrading
>> BigTableReader(path='/var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/nb-4-big-Data.db')
>> > Downgrade of
>> BigTableReader(path='/var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/nb-4-big-Data.db')
>> complete.
>> > Downgrading
>> BigTableReader(path='/var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/nb-5-big-Data.db')
>> > Downgrade of
>> BigTableReader(path='/var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/nb-5-big-Data.db')
>> complete.
>> > {{noformat}}
>> >
>> > Note the order of the generations in the downgrader (I have seen
>> similar out of order issues with the upgrader, but infrequently)
>> >
>> > The difference between the upgrader and downgrader code in the
>> questionable section (
>> https://github.com/Claudenw/cassandra/blob/CASSANDRA-8928/src/java/org/apache/cassandra/tools/StandaloneDowngrader.java#:~:text=new%20ArrayList%3C%3E()%3B-,//%20Downgrade%20sstables,%7D,-int%20numSSTables%20%3D)
&

Re: [VOTE] CEP-25: Trie-indexed SSTable format

2022-12-19 Thread Jacek Lewandowski
+1

- - -- --- -  -
Jacek Lewandowski


On Mon, Dec 19, 2022 at 2:00 PM Branimir Lambov  wrote:

> Hi everyone,
>
> I'd like to propose CEP-25 for approval.
>
> Proposal:
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-25%3A+Trie-indexed+SSTable+format
> Discussion:
> https://lists.apache.org/thread/3dpdg6dgm3rqxj96cyhn58b50g415dyh
>
> The vote will be open for 72 hours.
> Votes by committers are considered binding.
> A vote passes if there are at least three binding +1s and no binding
> vetoes.
>
> Thank you,
> Branimir
>


Re: Review requested: Add downgradesstables

2022-12-09 Thread Jacek Lewandowski
Hi,

The feature looks useful to me

Could you add the pull request address to the ticket and summarize the
features it is going to provide?

Thank you,
- - -- --- -  -
Jacek Lewandowski


On Fri, Dec 9, 2022 at 10:13 AM Claude Warren, Jr via dev <
dev@cassandra.apache.org> wrote:

> https://github.com/apache/cassandra/pull/2045
>
> https://issues.apache.org/jira/browse/CASSANDRA-8928
>
> This is a work in progress and I am looking for some feedback.
>
> This fix appears to work correctly. But I think the placement of the v3
> directory is probably not the best and perhaps should be moved under the
> db/compaction directory where the Downgrader code is.
>
> Suggestions appreciated.
> The changes are:
>
> *added*
>
>- db/compaction/Downgrader
>- tools/StandAloneDowngrader
>- getNextsstableID to ColumnFamilyStore
>- DOWNGRADE_SSTABLES to OperationType
>- hasMaxCompressedLength to CompressionMetadata constructor and
>associated calls.
>- V3 to SSTableFormat.Type
>- added io/sstable/format/big/v3 directory containing BigFormatV3,
>BigTableReaderV3, BitTableScannerV3, BitTableWriterV3
>
> *modified*
>
>- CompressionMetadata to skip output of maxCompressedLength if not
>supported
>
> *notes*
>
>- io/sstable/format/big/v3 classes are the V3 classes modified as
>necessary to run within the V4 environment.
>
>

