[jira] [Commented] (CASSANDRA-17080) Fix test: dtest-upgrade.upgrade_tests.drop_compact_storage_upgrade_test.TestDropCompactStorage.test_drop_compact_storage_mixed_cluster

2021-12-01 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452191#comment-17452191
 ] 

Berenguer Blasi commented on CASSANDRA-17080:
-

Right. As the original link to the failure is on trunk and, afaik, CI is weird 
on 4.0, a CI link was worth it imo. Did you notice how 4.0 here has 10 failures 
instead of 40-ish? There is something going on in 4.0... Anyway, regarding this 
ticket, +1. Thanks for the work!

> Fix test: 
> dtest-upgrade.upgrade_tests.drop_compact_storage_upgrade_test.TestDropCompactStorage.test_drop_compact_storage_mixed_cluster
> --
>
> Key: CASSANDRA-17080
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17080
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Josh McKenzie
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.0.x, 4.x
>
>
> !https://ci-cassandra.apache.org/static/a177fe56/images/32x32/health-80plus.png!
>  Failed 28 times in the last 28 runs. Flakiness: 0%, Stability: 0%
>   
>  Example of failure: 
> [https://ci-cassandra.apache.org/job/Cassandra-trunk/801/testReport/junit/dtest-upgrade.upgrade_tests.drop_compact_storage_upgrade_test/TestDropCompactStorage/test_drop_compact_storage_mixed_cluster/]
>    
> {code:java}
> upgrade_tests/drop_compact_storage_upgrade_test.py:149: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> self = 
>  object at 0x7fa0e7f1ceb0>
> session = 
> assert_msg = 'Cannot DROP COMPACT STORAGE as some nodes in the cluster 
> ([/127.0.0.2:7000, /127.0.0.1:7000]) are not on 4.0+ yet. Please upgrade 
> those nodes and run `upgradesstables` before retrying.'
>     def drop_compact_storage(self, session, assert_msg):
>         try:
>             session.execute("ALTER TABLE drop_compact_storage_test.test DROP COMPACT STORAGE")
>             pytest.fail("No exception has been thrown")
>         except InvalidRequest as e:
> >           assert assert_msg in str(e)
> E   assert 'Cannot DROP COMPACT STORAGE as some nodes in the cluster 
> ([/127.0.0.2:7000, /127.0.0.1:7000]) are not on 4.0+ yet. Please upgrade 
> those nodes and run `upgradesstables` before retrying.' in 'Error from 
> server: code=2200 [Invalid query] message="Cannot DROP COMPACT STORAGE as 
> some nodes in the cluster ([/127.0.0.1:7000, /127.0.0.2:7000]) are not on 4.0+ 
> yet. Please upgrade those nodes and run `upgradesstables` before retrying."'
> E+  where 'Error from server: code=2200 [Invalid query] 
> message="Cannot DROP COMPACT STORAGE as some nodes in the cluster 
> ([/127.0.0.1:7000, /127.0.0.2:7000]) are not on 4.0+ yet. Please upgrade those 
> nodes and run `upgradesstables` before retrying."' = 
> str(InvalidRequest('Error from server: code=2200 [Invalid query] 
> message="Cannot DROP COMPACT STORAGE as some nodes in the...1:7000, 
> /127.0.0.2:7000]) are not on 4.0+ yet. Please upgrade those nodes and run 
> `upgradesstables` before retrying."'))
> upgrade_tests/drop_compact_storage_upgrade_test.py:45: AssertionError
> {code}
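
The traceback above shows the expected message and the server message listing the cluster nodes in a different order, which is enough to break a plain substring assertion. As a minimal sketch, assuming the failure really does come down to node ordering, a comparison that ignores the order could look like this. The helper names are hypothetical and this is not necessarily the fix that was applied for this ticket.

```python
import re

# Matches the "([node, node, ...])" part of the error message.
_NODE_LIST = re.compile(r"\(\[(.*?)\]\)")


def nodes_in_message(message):
    """Return the set of node addresses listed inside '([...])', or an empty set."""
    match = _NODE_LIST.search(message)
    if match is None:
        return set()
    return {node.strip() for node in match.group(1).split(",")}


def strip_node_list(message):
    """Replace the node list with a fixed placeholder so the rest can be compared."""
    return _NODE_LIST.sub("([<nodes>])", message)


def same_error_ignoring_node_order(expected, actual):
    """True if 'actual' contains 'expected' once the node list order is ignored."""
    return (strip_node_list(expected) in strip_node_list(actual)
            and nodes_in_message(expected) == nodes_in_message(actual))
```

In a test like the one in the traceback, `assert same_error_ignoring_node_order(assert_msg, str(e))` would replace the substring check.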



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-15234) Standardise config and JVM parameters

2021-12-01 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452143#comment-17452143
 ] 

Ekaterina Dimitrova edited comment on CASSANDRA-15234 at 12/2/21, 2:54 AM:
---

Quick update - I figured out it is purely a pip3 issue with how we install ccm. 
The one time things worked was when I had a bug in my code, but in general ccm 
is not changing to my new version when I change it in requirements.txt.

 

What I found in CircleCI logs:
{code:java}
Collecting ccm
  Cloning https://github.com/ekaterinadimitrova2/ccm.git (to revision 
CASSANDRA-15234) to 
/tmp/pip-install-dvfww32x/ccm_727b34808faa4db6904e104a715a22d2
  Running command git clone -q https://github.com/ekaterinadimitrova2/ccm.git 
/tmp/pip-install-dvfww32x/ccm_727b34808faa4db6904e104a715a22d2
  Running command git checkout -b CASSANDRA-15234 --track origin/CASSANDRA-15234
  Switched to a new branch 'CASSANDRA-15234'
  Branch 'CASSANDRA-15234' set up to track remote branch 'CASSANDRA-15234' from 
'origin'.
  Resolved https://github.com/ekaterinadimitrova2/ccm.git to commit 
ee52d120ea34d44500c64bfb3b9d3f517b0865f1
{code}
 But then
{code:java}
 pip3 freeze
{code}
output shows:
{code:java}
ccm @ 
git+https://github.com/riptano/ccm.git@ce612ea71587bf263ed513cb8f8d5dfcf7c8dadb
{code}

I will debug this tomorrow, but the problem is the way we update ccm, not CircleCI.
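
The mismatch described above (pip clones one fork, but `pip3 freeze` reports another) can be checked mechanically. The following is a minimal sketch that parses the `name @ git+<url>@<commit>` line format shown in the freeze output and compares the recorded URL with the one expected from requirements.txt. The function names are hypothetical, and it assumes the freeze entry is on a single line (it is wrapped in the mail above).

```python
import re

# Direct-reference line as printed by "pip freeze" for a VCS install:
#   name @ git+<url>@<commit>
_DIRECT_REF = re.compile(
    r"^(?P<name>[A-Za-z0-9._-]+)\s*@\s*git\+(?P<url>\S+?)@(?P<rev>[0-9a-f]{7,40})$"
)


def parse_freeze_line(line):
    """Return (name, url, commit) for a git direct reference, else None."""
    match = _DIRECT_REF.match(line.strip())
    if match is None:
        return None
    return match.group("name"), match.group("url"), match.group("rev")


def installed_from(freeze_output, package, expected_url):
    """True if 'package' appears in the freeze output as installed from 'expected_url'."""
    for line in freeze_output.splitlines():
        parsed = parse_freeze_line(line)
        if parsed is not None and parsed[0] == package:
            return parsed[1] == expected_url
    return False
```

Feeding it the output of `pip3 freeze` (for example captured with `subprocess.run`) would flag the situation in the logs above, where the installed ccm still points at the riptano repository.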


was (Author: e.dimitrova):
Quick update - I figured it is purely pip3 issue how we install ccm. The time 
things worked was when I had a bug in my code, but in general CCM is not 
changing to my new version when I change it in requirements.txt

 

What I found in CircleCI logs:

{code:java}
Collecting ccm
  Cloning https://github.com/ekaterinadimitrova2/ccm.git (to revision 
CASSANDRA-15234) to 
/tmp/pip-install-dvfww32x/ccm_727b34808faa4db6904e104a715a22d2
  Running command git clone -q https://github.com/ekaterinadimitrova2/ccm.git 
/tmp/pip-install-dvfww32x/ccm_727b34808faa4db6904e104a715a22d2
  Running command git checkout -b CASSANDRA-15234 --track origin/CASSANDRA-15234
  Switched to a new branch 'CASSANDRA-15234'
  Branch 'CASSANDRA-15234' set up to track remote branch 'CASSANDRA-15234' from 
'origin'.
  Resolved https://github.com/ekaterinadimitrova2/ccm.git to commit 
ee52d120ea34d44500c64bfb3b9d3f517b0865f1
{code}

 But then
{code:java}
 pip3 freeze
{code}
 output shows:

{code:java}
ccm @ 
git+https://github.com/riptano/ccm.git@ce612ea71587bf263ed513cb8f8d5dfcf7c8dadb
{code}


> Standardise config and JVM parameters
> -
>
> Key: CASSANDRA-15234
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15234
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Config
>Reporter: Benedict Elliott Smith
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 5.x
>
> Attachments: CASSANDRA-15234-3-DTests-JAVA8.txt
>
>
> We have a bunch of inconsistent names and config patterns in the codebase, 
> both from the yaml and JVM properties.  It would be nice to standardise the 
> naming (such as otc_ vs internode_) as well as the provision of values with 
> units - while maintaining perpetual backwards compatibility with the old 
> parameter names, of course.
> For temporal units, I would propose parsing strings with suffixes of:
> {{code}}
> u|micros(econds?)?
> ms|millis(econds?)?
> s(econds?)?
> m(inutes?)?
> h(ours?)?
> d(ays?)?
> mo(nths?)?
> {{code}}
> For rate units, I would propose parsing any of the standard {{B/s, KiB/s, 
> MiB/s, GiB/s, TiB/s}}.
> Perhaps for avoiding ambiguity we could not accept bauds {{bs, Mbps}} or 
> powers of 1000 such as {{KB/s}}, given these are regularly used for either 
> their old or new definition e.g. {{KiB/s}}, or we could support them and 
> simply log the value in bytes/s.
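
The suffix patterns in the description are ambiguous if matched by prefix (`m`, `ms` and `mo` all start the same way), so a parser has to match the whole suffix. The sketch below does that for the proposed temporal suffixes and for the binary rate units. The function names and canonical unit names are assumptions for illustration, and rejecting `KB/s` is only one of the two options the description leaves open.

```python
import re

# Suffix patterns copied from the ticket description; the canonical unit
# names on the left are an assumption made for this sketch.
DURATION_SUFFIXES = [
    ("microseconds", r"u|micros(econds?)?"),
    ("milliseconds", r"ms|millis(econds?)?"),
    ("seconds", r"s(econds?)?"),
    ("minutes", r"m(inutes?)?"),
    ("hours", r"h(ours?)?"),
    ("days", r"d(ays?)?"),
    ("months", r"mo(nths?)?"),
]

# Binary rate units from the description, expressed in bytes per second.
RATE_UNITS = {
    "B/s": 1,
    "KiB/s": 1024,
    "MiB/s": 1024 ** 2,
    "GiB/s": 1024 ** 3,
    "TiB/s": 1024 ** 4,
}

_VALUE = re.compile(r"^\s*(\d+)\s*(\S+)\s*$")


def _split(text):
    """Split '10ms' or '10 ms' into (10, 'ms'); raise ValueError otherwise."""
    match = _VALUE.match(text)
    if match is None:
        raise ValueError(f"expected <number><unit>, got {text!r}")
    return int(match.group(1)), match.group(2)


def parse_duration(text):
    """Return (amount, canonical unit name) for a value such as '10ms'."""
    amount, suffix = _split(text)
    for unit, pattern in DURATION_SUFFIXES:
        # fullmatch keeps 'ms', 'm' and 'mo' apart.
        if re.fullmatch(pattern, suffix):
            return amount, unit
    raise ValueError(f"unknown duration unit {suffix!r}")


def parse_rate(text):
    """Return the rate in bytes per second for a value such as '16MiB/s'."""
    amount, unit = _split(text)
    if unit not in RATE_UNITS:
        raise ValueError(f"unsupported rate unit {unit!r}")
    return amount * RATE_UNITS[unit]
```

Returning the canonical unit rather than converting everything to one base avoids having to pick a fixed length for a month.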






[jira] [Comment Edited] (CASSANDRA-15234) Standardise config and JVM parameters

2021-12-01 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452143#comment-17452143
 ] 

Ekaterina Dimitrova edited comment on CASSANDRA-15234 at 12/2/21, 2:54 AM:
---

Quick update - I figured out it is purely a pip3 issue with how we install ccm. 
The one time things worked was when I had a bug in my code, but in general ccm 
is not changing to my new version when I change it in requirements.txt.

 

What I found in CircleCI logs:
{code:java}
Collecting ccm
  Cloning https://github.com/ekaterinadimitrova2/ccm.git (to revision 
CASSANDRA-15234) to 
/tmp/pip-install-dvfww32x/ccm_727b34808faa4db6904e104a715a22d2
  Running command git clone -q https://github.com/ekaterinadimitrova2/ccm.git 
/tmp/pip-install-dvfww32x/ccm_727b34808faa4db6904e104a715a22d2
  Running command git checkout -b CASSANDRA-15234 --track origin/CASSANDRA-15234
  Switched to a new branch 'CASSANDRA-15234'
  Branch 'CASSANDRA-15234' set up to track remote branch 'CASSANDRA-15234' from 
'origin'.
  Resolved https://github.com/ekaterinadimitrova2/ccm.git to commit 
ee52d120ea34d44500c64bfb3b9d3f517b0865f1
{code}
 But then
{code:java}
 pip3 freeze
{code}
output shows:
{code:java}
ccm @ 
git+https://github.com/riptano/ccm.git@ce612ea71587bf263ed513cb8f8d5dfcf7c8dadb
{code}

I will debug this tomorrow, but the problem is the way we update ccm, not 
CircleCI caching or similar.


was (Author: e.dimitrova):
Quick update - I figured it is purely pip3 issue how we install ccm. The time 
things worked was when I had a bug in my code, but in general CCM is not 
changing to my new version when I change it in requirements.txt

 

What I found in CircleCI logs:
{code:java}
Collecting ccm
  Cloning https://github.com/ekaterinadimitrova2/ccm.git (to revision 
CASSANDRA-15234) to 
/tmp/pip-install-dvfww32x/ccm_727b34808faa4db6904e104a715a22d2
  Running command git clone -q https://github.com/ekaterinadimitrova2/ccm.git 
/tmp/pip-install-dvfww32x/ccm_727b34808faa4db6904e104a715a22d2
  Running command git checkout -b CASSANDRA-15234 --track origin/CASSANDRA-15234
  Switched to a new branch 'CASSANDRA-15234'
  Branch 'CASSANDRA-15234' set up to track remote branch 'CASSANDRA-15234' from 
'origin'.
  Resolved https://github.com/ekaterinadimitrova2/ccm.git to commit 
ee52d120ea34d44500c64bfb3b9d3f517b0865f1
{code}
 But then
{code:java}
 pip3 freeze
{code}
output shows:
{code:java}
ccm @ 
git+https://github.com/riptano/ccm.git@ce612ea71587bf263ed513cb8f8d5dfcf7c8dadb
{code}

I will debug this tomorrow but it is the way we update ccm not CircleCI







[jira] [Commented] (CASSANDRA-15234) Standardise config and JVM parameters

2021-12-01 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452143#comment-17452143
 ] 

Ekaterina Dimitrova commented on CASSANDRA-15234:
-

Quick update - I figured out it is purely a pip3 issue with how we install ccm. 
The one time things worked was when I had a bug in my code, but in general ccm 
is not changing to my new version when I change it in requirements.txt.

 

What I found in CircleCI logs:

{code:java}
Collecting ccm
  Cloning https://github.com/ekaterinadimitrova2/ccm.git (to revision 
CASSANDRA-15234) to 
/tmp/pip-install-dvfww32x/ccm_727b34808faa4db6904e104a715a22d2
  Running command git clone -q https://github.com/ekaterinadimitrova2/ccm.git 
/tmp/pip-install-dvfww32x/ccm_727b34808faa4db6904e104a715a22d2
  Running command git checkout -b CASSANDRA-15234 --track origin/CASSANDRA-15234
  Switched to a new branch 'CASSANDRA-15234'
  Branch 'CASSANDRA-15234' set up to track remote branch 'CASSANDRA-15234' from 
'origin'.
  Resolved https://github.com/ekaterinadimitrova2/ccm.git to commit 
ee52d120ea34d44500c64bfb3b9d3f517b0865f1
{code}

 But then
{code:java}
 pip3 freeze
{code}
 output shows:

{code:java}
ccm @ 
git+https://github.com/riptano/ccm.git@ce612ea71587bf263ed513cb8f8d5dfcf7c8dadb
{code}








[jira] [Commented] (CASSANDRA-17147) Guardrails prototype

2021-12-01 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452124#comment-17452124
 ] 

David Capwell commented on CASSANDRA-17147:
---

[~adelapena] I did another pass (closer this time) and think we are almost 
there; the PR comments are mostly small.

> Guardrails prototype
> 
>
> Key: CASSANDRA-17147
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17147
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/Guardrails
>Reporter: Andres de la Peña
>Assignee: Andres de la Peña
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 15h 10m
>  Remaining Estimate: 0h
>
> The purpose of this ticket is creating an initial implementation of the 
> guardrails framework, as well as adding a few simple guardrails using this 
> framework.
> To keep things easy, this initial implementation would only support 
> guardrails that are triggered on the coordinator, and they would be 
> dynamically updatable only through JMX.
> Once we have this initial framework ready in a feature branch we can have 
> multiple tickets addressing all the things that would have been left out of 
> the scope of this ticket, such as:
> * Dynamic updates through virtual tables
> * Being able to notify about guardrails triggered on replicas
> * Using custom exceptions other than {{InvalidRequestException}}.
> * Porting existing limits to use the new guardrails framework
> * Adding new guardrails beyond the initial ones
> The reason for having this simpler prototype is that it will give us a common 
> ground to parallelize work on the parts mentioned above.
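
To make the scope above concrete, here is a minimal sketch of a coordinator-side guardrail whose limits can be changed at runtime. Everything in it is hypothetical: the class, the warn/fail split, and the method names are assumptions for illustration and not the API of the actual patch. `set_thresholds` merely stands in for the JMX update path the ticket mentions.

```python
class InvalidRequestException(Exception):
    """Stand-in for the exception type named in the ticket."""


class ThresholdGuardrail:
    """Warns above one limit and rejects the request above a higher one."""

    def __init__(self, name, warn_threshold=None, fail_threshold=None):
        self.name = name
        self.warnings = []  # a real system would send these to the client or log
        self.set_thresholds(warn_threshold, fail_threshold)

    def set_thresholds(self, warn, fail):
        """Dynamic update; None disables the corresponding check."""
        if warn is not None and fail is not None and warn > fail:
            raise ValueError("warn threshold must not exceed fail threshold")
        self.warn_threshold = warn
        self.fail_threshold = fail

    def guard(self, value, what):
        """Check 'value' for the object described by 'what'."""
        if self.fail_threshold is not None and value > self.fail_threshold:
            raise InvalidRequestException(
                f"Guardrail {self.name} violated: {what} has {value}, "
                f"limit is {self.fail_threshold}"
            )
        if self.warn_threshold is not None and value > self.warn_threshold:
            self.warnings.append(
                f"Guardrail {self.name}: {what} has {value}, "
                f"warning threshold is {self.warn_threshold}"
            )
```

Keeping the check in one `guard` call is what makes it easy to invoke only on the coordinator, which matches the restriction in the initial scope.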






[jira] [Commented] (CASSANDRA-17031) Add support for PEM based key material for SSL

2021-12-01 Thread Maulin Vasavada (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452094#comment-17452094
 ] 

Maulin Vasavada commented on CASSANDRA-17031:
-

[~smiklosovic] Your idea of dumping details of the certificate in plain text 
makes sense. Currently we have 
[this|https://github.com/apache/cassandra/pull/1316/files#diff-6e9b4d54347a547d5bb5b002cad6afccd25826beb221bb79ebf57c65bc891e11R199]
 logging that prints the certificate's Issuer, Subject, Serial number and Expiry 
fields, which in my experience are the most useful for debugging TLS cert 
issues. I can change the above log to 'info' level to make it more easily 
accessible than debug. [~jonmeredith] Do you think it would be helpful to go 
beyond this and dump the full certificate details?

> Add support for PEM based key material for SSL
> --
>
> Key: CASSANDRA-17031
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17031
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Messaging/Internode
>Reporter: Maulin Vasavada
>Assignee: Maulin Vasavada
>Priority: Normal
> Fix For: 4.1
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> h1. Scope
> Currently Cassandra supports standard keystore types for SSL 
> keys/certificates. The scope of this enhancement is to add support for PEM 
> based key material (keys/certificates) given that PEM is a widely used common 
> format for the same. We intend to add support for Unencrypted and Password 
> Based Encrypted (PBE) PKCS#8 formatted Private Keys in PEM format with 
> standard algorithms (RSA, DSA and EC) along with the certificate chain for 
> the private key and PEM based X509 certificates. The work here is going to be 
> built on top of [CEP-9: Make SSLContext creation 
> pluggable|https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-9%3A+Make+SSLContext+creation+pluggable]
>  for which the code is merged for Apache Cassandra 4.1 release.
> We intend to support configuring the key material either as direct PEM values 
> or via files (configured with the keystore and truststore configurations 
> today). We are not going to model PEM as a valid 'store_type' given that 
> 'store_type' has a [specific 
> definition|https://docs.oracle.com/en/java/javase/11/security/java-cryptography-architecture-jca-reference-guide.html#GUID-AB51DEFD-5238-4F96-967F-082F6D34FBEA].
>  
> h1. Approach
> Create an implementation for 
> [ISslContextFactory|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/security/ISslContextFactory.java]
>  extending 
> [FileBasedSslContextFactory|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/security/FileBasedSslContextFactory.java]
>  implementation to add PEM formatted key/certificates.
> h1. Motivation
> PEM is a widely used format for encoding Private Keys and X.509 Certificates 
> and Apache Cassandra's current implementation lacks the support for 
> specifying the PEM formatted key material for SSL configurations. This means 
> operators have to re-create the key material to comply with the supported 
> formats (using key/trust store types - jks, pkcs12 etc) and deal with an 
> operational task for managing it. This is an operational overhead we can 
> avoid by supporting the PEM format, making Apache Cassandra even more customer 
> friendly and driving more adoption.
> h1. Proposed Changes
>  # A new implementation for ISslContextFactory - PEMBasedSslContextFactory 
> with the following supported configuration
> {panel:title=New configurations}
> {panel}
> |{{encryption_options:  }}
>  {{}}{{ssl_context_factory:}}
>  {{}}{{class_name: 
> org.apache.cassandra.security.PEMBasedSslContextFactory}}
>  {{}}{{parameters:}}
>  {{  }}{{private_key:  certificate chain>}}
>  {{  }}{{private_key_password:  }}{{private}} {{key }}{{if}} {{it is encrypted>}}
>  {{  }}{{trusted_certificates: }}|
> *NOTE:* We could reuse 'keystore_password' instead of the 
> 'private_key_password'. However PEM encoded private key is not a 'keystore' 
> in itself hence it would be inappropriate to piggyback on that other than 
> avoid duplicating similar fields.
>  # The PEMBasedSslContextFactory will also support file based key material 
> (and the corresponding HOT Reloading based on file timestamp updates) for the 
> PEM format via existing  'keystore' and 'truststore' encryption options. 
> However in that case the 'truststore_password' configuration won't be used 
> since generally PEM formatted certificates for truststore don't get encrypted 
> with a password.
>  # The PEMBasedSslContextFactory will internally create PKCS12 keystore for 
> private key and the trusted certificates. However, this doesn't impact the 
> user of the implementatio

[jira] [Commented] (CASSANDRA-17147) Guardrails prototype

2021-12-01 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452050#comment-17452050
 ] 

David Capwell commented on CASSANDRA-17147:
---

bq. But I would be happy to address this topic too when the time comes. 

Let's include that in follow-up work for CASSANDRA-15234; maybe include it as an 
umbrella for guardrails?

bq. if I strive for "all problems" we will never commit that work

Believe in yourself!

> Guardrails prototype
> 
>
> Key: CASSANDRA-17147
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17147
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/Guardrails
>Reporter: Andres de la Peña
>Assignee: Andres de la Peña
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 14h 50m
>  Remaining Estimate: 0h
>
> The purpose of this ticket is creating an initial implementation of the 
> guardrails framework, as well as adding a few simple guardrails using this 
> framework.
> To keep things easy, this initial implementation would only support 
> guardrails that are triggered on the coordinator, and they would be 
> dynamically updatable only through JMX.
> Once we have this initial framework ready in a feature branch we can have 
> multiple tickets addressing all the things that would have been left out of 
> the scope of this ticket, such as:
> * Dynamic updates through virtual tables
> * Being able to notify about guardrails triggered on replicas
> * Using custom exceptions other than {{InvalidRequestException}}.
> * Porting existing limits to use the new guardrails framework
> * Adding new guardrails beyond the initial ones
> The reason for having this simpler prototype is that it will give us a common 
> ground to parallelize work on the parts mentioned above.






[jira] [Updated] (CASSANDRA-16958) Prewarm role and credentials caches to avoid timeouts at startup

2021-12-01 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-16958:

Reviewers: Caleb Rackliffe

> Prewarm role and credentials caches to avoid timeouts at startup
> 
>
> Key: CASSANDRA-16958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16958
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Authorization
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: Normal
> Fix For: 4.x
>
>
> With our current auth querying setup, you can end up with a thundering herd 
> on reconnect due to delays on auth resolution.
> We should instead allow the ability to do a 'select' on all roles and 
> pre-load the cache before turning on native transport. Notably, this could 
> present some unacceptable delays on start with a very large count of users 
> (thousands), or someone who uses a third party auth system rather than the 
> built-in authorizer so this should be configurable and opt-in.
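
The idea in the description can be sketched as follows: when the option is enabled, load every role in one bulk query before clients are allowed to connect, so that reconnecting clients hit a warm cache instead of each triggering its own lookup. All names below are hypothetical and the loaders are passed in as plain callables; this is only an illustration of the opt-in ordering, not the implementation from the patch.

```python
class RoleCache:
    """Per-role cache with an optional bulk prewarm."""

    def __init__(self, load_one, load_all):
        self._load_one = load_one  # callable: role name -> role data
        self._load_all = load_all  # callable: () -> {role name: role data}
        self._entries = {}

    def prewarm(self):
        """Populate the cache with a single bulk load."""
        self._entries.update(self._load_all())

    def get(self, role):
        """Return cached data, falling back to a single-role load on a miss."""
        if role not in self._entries:
            self._entries[role] = self._load_one(role)
        return self._entries[role]


def start_node(cache, prewarm_enabled, enable_native_transport):
    """Prewarm (only if opted in) strictly before accepting client connections."""
    if prewarm_enabled:
        cache.prewarm()
    enable_native_transport()
```

Leaving `prewarm_enabled` off by default keeps startup time unchanged for clusters with thousands of users or a third party auth backend, which is the concern raised in the description.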






[jira] [Comment Edited] (CASSANDRA-15234) Standardise config and JVM parameters

2021-12-01 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451939#comment-17451939
 ] 

Ekaterina Dimitrova edited comment on CASSANDRA-15234 at 12/1/21, 9:07 PM:
---

Moving the ticket to "In Review" as [~dcapwell]  is already looking into it.

Also, for visibility and context, as it seems I missed this in my summary - 
there is a separate ticket linked to this one (CASSANDRA-9691), which mentioned 
liberating metrics and JMX from units, or at least moving to the lowest unit. I 
left this part to be handled there but forgot to mention it in my summary. 
Apologies for that.

My thought was to leave JMX and metrics alone for now and assume the old units 
are in place, but [~dcapwell] is right: we need at least to verify (and convert 
if needed) that in Config the old unit was used with the new custom types and 
that we didn't use something different, as this can become a mess if we leave 
it to people reading the docs. Something similar will also be needed for VTs, 
where we will have to check the Converter in the new @Replaces annotation. 
There was an idea to take the VT backward compatibility out until we figure 
out how we are going to handle the VTs, as there is an ongoing discussion 
around that; I will leave this aside for a moment until the discussion clears 
up a bit. CC [~blerer] as I know he is working on a patch for CASSANDRA-15254.
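
As an illustration of the verification step discussed above, the sketch below maps an old parameter whose name carries the unit to a new name whose value carries the unit, so the old unit is applied explicitly rather than left to the reader of the docs. The parameter names and the mapping table are made up for this example; it is not the @Replaces/Converter mechanism itself.

```python
# Hypothetical mapping: old name -> (new name, unit implied by the old name).
REPLACES = {
    "example_timeout_in_ms": ("example_timeout", "ms"),
    "example_interval_in_seconds": ("example_interval", "s"),
}


def migrate(config):
    """Return a copy of 'config' with old names replaced by unit-carrying values."""
    migrated = {}
    for key, value in config.items():
        if key in REPLACES:
            new_name, unit = REPLACES[key]
            if new_name in config:
                raise ValueError(f"both {key} and {new_name} are set")
            # Attach the unit the old name implied, e.g. 200 -> "200ms".
            migrated[new_name] = f"{value}{unit}"
        else:
            migrated[key] = value
    return migrated
```

Rejecting a config that sets both the old and the new name is a design choice made here to avoid silently picking one of them.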

Also, I will be looking again into the ccm issue in the afternoon. 

 


was (Author: e.dimitrova):
Moving the ticket to "In Review" as [~dcapwell]  is already looking into it.

Also, for visibility and context as it seems I missed in my summary - there is 
a separate ticket linked to this one(CASSANDRA-9691) which was mentioning 
liberating metrics and JMX from units or just moving to lowest unit at least. I 
left this part to be handled there but I forgot to mention it in my summary. 
Apologize for that

My thought was currently to leave the  JMX and metrics for now and assume the 
old units are in place but [~dcapwell] is right we need at least to verify and 
convert if needed that in Config the old unit was used with the new custom 
types and we didn't use something different as this can be a mess if we leave 
it to people and reading docs. Something similar will also be needed for VTs 
where we will have to check the Converter in the new @Replaces annotation. 
There was idea for getting the VT backward compatibility out until we figure 
out how we are going to handle the VT as there is ongoing discussion around 
that, I will leave this now aside for a moment until we clear a bit the 
discussion. CC [~blerer] as I know he is working on a patch for CASSANDRA-15254

Also, I will be looking again into the ccm issue in the afternoon. 

 







[jira] [Comment Edited] (CASSANDRA-17147) Guardrails prototype

2021-12-01 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452030#comment-17452030
 ] 

Ekaterina Dimitrova edited comment on CASSANDRA-17147 at 12/1/21, 9:01 PM:
---

{quote}[~e.dimitrova] will solve all problems!
{quote}
Only putting a general framework in place; if I strive for "all problems" we 
will never commit that work :D 

About "fail" and "abort", I haven't thought about or discussed this with 
anyone. CASSANDRA-15234 is more about liberating the parameters' names from 
unit suffixes and having a common noun_verb format, things like that :) But I 
would be happy to address this topic too when the time comes. 
{quote} [~e.dimitrova], if this ticket goes in before CASSANDRA-15234 don't 
worry about "fixing" the configs, we can pick that up in January after the fact.
{quote}
Thank you! My plan is not to rebase until we are done with the current review 
cycle, as otherwise it becomes a never-ending story, running in circles. 


was (Author: e.dimitrova):
{quote}[~e.dimitrova] will solve all problems!
{quote}
Only putting general framework, if I strive for "all problems" we will never 
commit that work :D 

About "fail" and "abort" I haven't thought about or discussed this. It was more 
of a liberations the parameters names from the suffixes and have a common 
format noun_verb, things like that :) But I would be happy to address this 
topic too when the time comes. 
{quote} [~e.dimitrova], if this ticket goes in before CASSANDRA-15234 don't 
worry about "fixing" the configs, we can pick that up in January after the fact.
{quote}
Thank you! My plan Is no rebase until we are done with the current review cycle 
as otherwise it becomes never ending story, running in circles. 

> Guardrails prototype
> 
>
> Key: CASSANDRA-17147
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17147
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/Guardrails
>Reporter: Andres de la Peña
>Assignee: Andres de la Peña
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 14h 50m
>  Remaining Estimate: 0h
>
> The purpose of this ticket is creating an initial implementation of the 
> guardrails framework, as well as adding a few simple guardrails using this 
> framework.
> To keep things easy, this initial implementation would only support 
> guardrails that are triggered on the coordinator, and they would be 
> dynamically updatable only through JMX.
> Once we have this initial framework ready in a feature branch we can have 
> multiple tickets addressing all the things that would have been left out of 
> the scope of this ticket, such as:
> * Dynamic updates through virtual tables
> * Being able to notify about guardrails triggered on replicas
> * Using custom exceptions other than {{InvalidRequestException}}.
> * Porting existing limits to use the new guardrails framework
> * Adding new guardrails beyond the initial ones
> The reason for having this simpler prototype is that it will give us a common 
> ground to parallelize work on the parts mentioned above.
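
A minimal sketch of the kind of threshold guardrail discussed in this thread,
assuming a soft "warn" limit and a hard "abort" limit checked on the
coordinator. The names Threshold, guard and GuardrailViolation are hypothetical
and are not taken from the Cassandra patch; GuardrailViolation only stands in
for the InvalidRequestException mentioned above.

```python
import warnings


class GuardrailViolation(Exception):
    """Stand-in for the InvalidRequestException mentioned in the ticket."""


class Threshold:
    """Hypothetical guardrail with a soft (warn) and a hard (abort) limit."""

    def __init__(self, name, warn_threshold, abort_threshold):
        self.name = name
        # Plain attributes, so both limits can be changed at runtime. This only
        # loosely mirrors the "dynamically updatable" requirement; no JMX here.
        self.warn_threshold = warn_threshold
        self.abort_threshold = abort_threshold

    def guard(self, value):
        """Check a value before the request is executed."""
        if self.abort_threshold is not None and value > self.abort_threshold:
            raise GuardrailViolation(
                f"{self.name}: {value} exceeds abort threshold "
                f"{self.abort_threshold}")
        if self.warn_threshold is not None and value > self.warn_threshold:
            warnings.warn(
                f"{self.name}: {value} exceeds warn threshold "
                f"{self.warn_threshold}")
            return "warn"
        return "ok"
```

A caller would invoke guard() with the measured value (for example a count or
a size) and let the exception reject the request; raising either attribute at
runtime relaxes the limit without recreating the object.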



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17147) Guardrails prototype

2021-12-01 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452030#comment-17452030
 ] 

Ekaterina Dimitrova commented on CASSANDRA-17147:
-

{quote}[~e.dimitrova] will solve all problems!
{quote}
I am only putting a general framework in place; if I strive for "all problems" 
we will never commit that work :D 

About "fail" and "abort": I haven't thought about or discussed this. It was more 
about liberating the parameter names from the suffixes and having a common 
format noun_verb, things like that :) But I would be happy to address this 
topic too when the time comes. 
{quote} [~e.dimitrova], if this ticket goes in before CASSANDRA-15234 don't 
worry about "fixing" the configs, we can pick that up in January after the fact.
{quote}
Thank you! My plan is not to rebase until we are done with the current review 
cycle, as otherwise it becomes a never-ending story, running in circles. 







[jira] [Commented] (CASSANDRA-17080) Fix test: dtest-upgrade.upgrade_tests.drop_compact_storage_upgrade_test.TestDropCompactStorage.test_drop_compact_storage_mixed_cluster

2021-12-01 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452018#comment-17452018
 ] 

Ekaterina Dimitrova commented on CASSANDRA-17080:
-

Well, I mentioned that it doesn't fail in Circle, but I can reproduce it 
locally, so I was testing locally. I was thinking we could probably skip the 
full CI run given the local proof, but considering you asked and our CI is 
funky, I pushed only the upgrade tests to Jenkins for 3.0, 4.0 and trunk, as I 
am also using the updated method with the test for 3.0. Everything looks good 
to me:
[3.0|https://jenkins-cm4.apache.org/job/Cassandra-devbranch-dtest-upgrade/735/#showFailuresLink],
 
[4.0|https://jenkins-cm4.apache.org/job/Cassandra-devbranch-dtest-upgrade/734/#showFailuresLink],
 
[trunk|https://jenkins-cm4.apache.org/job/Cassandra-devbranch-dtest-upgrade/733/#showFailuresLink]

 

Side note: I don't see timeouts and these are rebased. 

> Fix test: 
> dtest-upgrade.upgrade_tests.drop_compact_storage_upgrade_test.TestDropCompactStorage.test_drop_compact_storage_mixed_cluster
> --
>
> Key: CASSANDRA-17080
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17080
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Josh McKenzie
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.0.x, 4.x
>
>
> !https://ci-cassandra.apache.org/static/a177fe56/images/32x32/health-80plus.png!
>  Failed 28 times in the last 28 runs. Flakiness: 0%, Stability: 0%
>   
>  Example of failure: 
> [https://ci-cassandra.apache.org/job/Cassandra-trunk/801/testReport/junit/dtest-upgrade.upgrade_tests.drop_compact_storage_upgrade_test/TestDropCompactStorage/test_drop_compact_storage_mixed_cluster/]
>    
> {code:java}
> upgrade_tests/drop_compact_storage_upgrade_test.py:149: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> self = 
>  object at 0x7fa0e7f1ceb0>
> session = 
> assert_msg = 'Cannot DROP COMPACT STORAGE as some nodes in the cluster 
> ([/127.0.0.2:7000, /127.0.0.1:7000]) are not on 4.0+ yet. Please upgrade 
> those nodes and run `upgradesstables` before retrying.'
> def drop_compact_storage(self, session, assert_msg):
>     try:
>         session.execute("ALTER TABLE drop_compact_storage_test.test DROP COMPACT STORAGE")
>         pytest.fail("No exception has been thrown")
>     except InvalidRequest as e:
> >       assert assert_msg in str(e)
> E   assert 'Cannot DROP COMPACT STORAGE as some nodes in the cluster 
> ([/127.0.0.2:7000, /127.0.0.1:7000]) are not on 4.0+ yet. Please upgrade 
> those nodes and run `upgradesstables` before retrying.' in 'Error from 
> server: code=2200 [Invalid query] message="Cannot DROP COMPACT STORAGE as 
> some nodes in the cluster ([/1271:7000, /127.0.0.2:7000]) are not on 4.0+ 
> yet. Please upgrade those nodes and run `upgradesstables` before retrying."'
> E+  where 'Error from server: code=2200 [Invalid query] 
> message="Cannot DROP COMPACT STORAGE as some nodes in the cluster 
> ([/1271:7000, /127.0.0.2:7000]) are not on 4.0+ yet. Please upgrade those 
> nodes and run `upgradesstables` before retrying."' = 
> str(InvalidRequest('Error from server: code=2200 [Invalid query] 
> message="Cannot DROP COMPACT STORAGE as some nodes in the...1:7000, 
> /127.0.0.2:7000]) are not on 4.0+ yet. Please upgrade those nodes and run 
> `upgradesstables` before retrying."'))
> upgrade_tests/drop_compact_storage_upgrade_test.py:45: AssertionError
> {code}
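
The assertion above compares the full message text, so it is sensitive to the
order in which the server lists the nodes. A minimal sketch of an
order-insensitive comparison follows; the helper names are hypothetical, and
this is not necessarily how the dtest was actually fixed.

```python
import re

# Matches the node list, e.g. "([/127.0.0.2:7000, /127.0.0.1:7000])".
NODES_RE = re.compile(r"\(\[(.*?)\]\)")


def nodes_in_message(message):
    """Return the set of node addresses listed between '([' and '])'."""
    match = NODES_RE.search(message)
    if match is None:
        return set()
    return {node.strip() for node in match.group(1).split(",")}


def same_error_ignoring_node_order(expected, actual):
    """True if `actual` contains `expected` up to the order of the node list."""

    def without_nodes(text):
        # Blank out the node list so only the surrounding wording is compared.
        return NODES_RE.sub("([])", text)

    return (without_nodes(expected) in without_nodes(actual)
            and nodes_in_message(expected) == nodes_in_message(actual))
```

With this helper the test could assert
same_error_ignoring_node_order(assert_msg, str(e)) instead of a plain
substring check, so a differently ordered node list would not fail the test.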






[jira] [Commented] (CASSANDRA-17147) Guardrails prototype

2021-12-01 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452008#comment-17452008
 ] 

David Capwell commented on CASSANDRA-17147:
---

oh, said this to her in slack... just to be public..

[~e.dimitrova], if this ticket goes in before CASSANDRA-15234 don't worry about 
"fixing" the configs, we can pick that up in January after the fact.







[jira] [Comment Edited] (CASSANDRA-17147) Guardrails prototype

2021-12-01 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452005#comment-17452005
 ] 

David Capwell edited comment on CASSANDRA-17147 at 12/1/21, 8:04 PM:
-

cool, wrapping things up for something else, so can take a look later today.

bq. renaming "fail" thresholds and related stuff to "abort"

for this ticket sounds good, but will leave the final desired naming to 
CASSANDRA-15234; I know Benedict had issues as this is not consistent in the 
code.  [~e.dimitrova] will solve all problems!


was (Author: dcapwell):
cool, wrapping things up for something else, so can take a look later today.

bq. renaming "fail" thresholds and related stuff to "abort"

for this ticket sounds good, but will leave the final desired naming to 
CASSANDRA-15234; I know Benedict had issues as this is not consistent in the 
code.  [~e.dimitrova] will solve all problems!  ^_^







[jira] [Commented] (CASSANDRA-17147) Guardrails prototype

2021-12-01 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452005#comment-17452005
 ] 

David Capwell commented on CASSANDRA-17147:
---

cool, wrapping things up for something else, so can take a look later today.

bq. renaming "fail" thresholds and related stuff to "abort"

for this ticket sounds good, but will leave the final desired naming to 
CASSANDRA-15234; I know Benedict had issues as this is not consistent in the 
code.  [~e.dimitrova] will solve all problems!  ^_^







[jira] [Commented] (CASSANDRA-17180) Implement heartbeat service to know last time Cassandra node was up

2021-12-01 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451972#comment-17451972
 ] 

Stefan Miklosovic commented on CASSANDRA-17180:
---

I don't think it is actually a good match. [~adelapena] thoughts?

> Implement heartbeat service to know last time Cassandra node was up
> ---
>
> Key: CASSANDRA-17180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17180
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Observability
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As already discussed on the ML, it would be nice to have a service which would 
> periodically write a timestamp to a file, signalling that the node is up and 
> running. Then, on startup, we would read this file and determine whether there 
> is any table whose gc grace period has elapsed since that time; if so, we would 
> fail the start, to prevent zombie data from being spread around the cluster.
> https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw
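
The idea described above can be sketched roughly as follows. This is only an
illustration under stated assumptions: the function names, the plain-text file
format and the per-table gc grace mapping are all hypothetical, and a real
service would call write_heartbeat() on a timer.

```python
import time
from pathlib import Path


def write_heartbeat(path, now=None):
    """Record the current time (seconds since the epoch) in the heartbeat file."""
    timestamp = time.time() if now is None else now
    Path(path).write_text(str(timestamp))


def tables_past_gc_grace(path, gc_grace_seconds_by_table, now=None):
    """Return the tables whose gc grace period elapsed while the node was down."""
    heartbeat = Path(path)
    if not heartbeat.exists():
        return []  # no previous heartbeat, nothing to compare against
    last_seen = float(heartbeat.read_text())
    downtime = (time.time() if now is None else now) - last_seen
    return sorted(table
                  for table, gc_grace in gc_grace_seconds_by_table.items()
                  if downtime > gc_grace)
```

On startup, a non-empty result from tables_past_gc_grace() would be the signal
to refuse to start, since data on this node may include rows whose tombstones
have already been purged elsewhere.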






[jira] [Commented] (CASSANDRA-17147) Guardrails prototype

2021-12-01 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451954#comment-17451954
 ] 

Andres de la Peña commented on CASSANDRA-17147:
---

So it seems we are in agreement regarding config, we'll go back over this once 
we fix the config layer.

I have pushed one last commit renaming "fail" thresholds and related stuff to 
"abort". I think the PR is ready for another round of review, I hope I'm not 
missing anything from the previous comments.







[jira] [Commented] (CASSANDRA-17001) Optionally prune CDC segments if consumer fails to consume them fast enough

2021-12-01 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451951#comment-17451951
 ] 

Josh McKenzie commented on CASSANDRA-17001:
---

Had one other nit / question / thought, but otherwise LGTM. +1

> Optionally prune CDC segments if consumer fails to consume them fast enough
> ---
>
> Key: CASSANDRA-17001
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17001
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Commit Log
>Reporter: Dinesh Joshi
>Assignee: Yifan Cai
>Priority: Normal
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The current CDC implementation blocks writes if the CDC segments have filled 
> up. This makes sense for some use-cases. In other cases it would be beneficial 
> for C* to prune the CDC segments if they haven't been consumed. With this 
> change we will introduce a flag to prune CDC segments much like a circular 
> buffer, which will prevent writes from being blocked.
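
The difference between the blocking behaviour and the circular-buffer style
pruning can be sketched as below. This is a toy model, not the commit log
code: the class name, the byte-based capacity and the prune_when_full flag are
assumptions made for illustration.

```python
from collections import deque


class CdcSegmentQueue:
    """Hypothetical model of CDC segments waiting for a consumer."""

    def __init__(self, capacity_bytes, prune_when_full=False):
        self.capacity_bytes = capacity_bytes
        self.prune_when_full = prune_when_full  # the "flag" from the description
        self.segments = deque()  # (name, size_bytes), oldest first
        self.used_bytes = 0

    def add(self, name, size_bytes):
        """Return True if the segment was accepted, False if the write is rejected."""
        if size_bytes > self.capacity_bytes:
            return False  # can never fit, so do not prune anything for it
        while self.used_bytes + size_bytes > self.capacity_bytes:
            if not self.prune_when_full:
                return False  # blocking mode: wait for the consumer to catch up
            _, freed = self.segments.popleft()  # drop the oldest unconsumed segment
            self.used_bytes -= freed
        self.segments.append((name, size_bytes))
        self.used_bytes += size_bytes
        return True
```

With prune_when_full left at False the queue models the blocking behaviour;
setting it to True trades the oldest unconsumed data for write availability.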






[jira] [Commented] (CASSANDRA-17001) Optionally prune CDC segments if consumer fails to consume them fast enough

2021-12-01 Thread Yifan Cai (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451942#comment-17451942
 ] 

Yifan Cai commented on CASSANDRA-17001:
---

Thanks [~jmckenzie] for the review. I pushed new commits to address the 
comments. 

> Optionally prune CDC segments if consumer fails to consume them fast enough
> ---
>
> Key: CASSANDRA-17001
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17001
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Commit Log
>Reporter: Dinesh Joshi
>Assignee: Yifan Cai
>Priority: Normal
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The current CDC implementation blocks writes if the CDC segments have filled 
> up. This makes sense for some use-cases. In other cases it would be beneficial 
> for C* to prune the CDC segments if they haven't been consumed. With this 
> change we will introduce a flag to prune CDC segments much like a circular 
> buffer, which will prevent writes from being blocked.






[jira] [Comment Edited] (CASSANDRA-15234) Standardise config and JVM parameters

2021-12-01 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451939#comment-17451939
 ] 

Ekaterina Dimitrova edited comment on CASSANDRA-15234 at 12/1/21, 5:43 PM:
---

Moving the ticket to "In Review" as [~dcapwell] is already looking into it.

Also, for visibility and context, as it seems I missed this in my summary: 
there is a separate ticket linked to this one (CASSANDRA-9691) which mentions 
liberating metrics and JMX from units, or at least moving them to the lowest 
unit. I left this part to be handled there but forgot to mention it in my 
summary. Apologies for that.

My thought was to leave JMX and metrics alone for now and assume the old units 
are in place, but [~dcapwell] is right: we need at least to verify (and convert 
if needed) that in Config the old unit is used with the new custom types and 
that we didn't use something different, as this can be a mess if we leave it to 
people reading the docs. Something similar will also be needed for VTs, where 
we will have to check the Converter in the new @Replaces annotation. There was 
an idea to leave VT backward compatibility out until we figure out how we are 
going to handle the VTs, as there is an ongoing discussion around that; I will 
leave this aside for a moment until the discussion clears up a bit. CC 
[~blerer], as I know he is working on a patch for CASSANDRA-15254.

Also, I will be looking again into the ccm issue in the afternoon. 

 


was (Author: e.dimitrova):
Moving the ticket to "In Review" as [~dcapwell]  is already looking into it.

Also, for visibility and context as it seems I missed in my summary - there is 
a separate ticket linked to this one(CASSANDRA-9691) which was mentioning 
liberating metrics and JMX from units or just moving to lowest unit at least. I 
left this part to be handled there but I forgot to mention it in my summary. 
Apologize for that

My thought was currently to leave the  JMX and metrics for now and assume the 
old units are in place but [~dcapwell] is right we need at least to verify and 
convert if needed that in Config the old unit was used with the new custom 
types and we didn't use something different as this can be a mess if we live it 
to people and reading docs. Something similar will also be needed for VTs where 
we will have to check the Converter in the new @Replaces annotation. There was 
idea for getting the VT backward compatibility out until we figure out how we 
are going to handle the VT as there is ongoing discussion around that, I will 
leave this now aside for a moment until we clear a bit the discussion. CC 
[~blerer] as I know he is working on a patch for CASSANDRA-15254

Also, I will be looking again into the ccm issue in the afternoon. 

 

> Standardise config and JVM parameters
> -
>
> Key: CASSANDRA-15234
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15234
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Config
>Reporter: Benedict Elliott Smith
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 5.x
>
> Attachments: CASSANDRA-15234-3-DTests-JAVA8.txt
>
>
> We have a bunch of inconsistent names and config patterns in the codebase, 
> both from the yaml and JVM properties.  It would be nice to standardise the 
> naming (such as otc_ vs internode_) as well as the provision of values with 
> units - while maintaining perpetual backwards compatibility with the old 
> parameter names, of course.
> For temporal units, I would propose parsing strings with suffixes of:
> {{code}}
> u|micros(econds?)?
> ms|millis(econds?)?
> s(econds?)?
> m(inutes?)?
> h(ours?)?
> d(ays?)?
> mo(nths?)?
> {{code}}
> For rate units, I would propose parsing any of the standard {{B/s, KiB/s, 
> MiB/s, GiB/s, TiB/s}}.
> Perhaps for avoiding ambiguity we could not accept bauds {{bs, Mbps}} or 
> powers of 1000 such as {{KB/s}}, given these are regularly used for either 
> their old or new definition e.g. {{KiB/s}}, or we could support them and 
> simply log the value in bytes/s.
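
As an illustration of the temporal suffixes proposed above, here is a minimal
parsing sketch. It is not the Cassandra implementation: the function name is
hypothetical, results are returned in microseconds for simplicity, and since
the ticket does not define the length of a month, 30 days is assumed.

```python
import re

# (suffix pattern from the ticket, microseconds per unit)
_UNITS = [
    (r"u|micros(econds?)?", 1),
    (r"ms|millis(econds?)?", 1_000),
    (r"s(econds?)?", 1_000_000),
    (r"m(inutes?)?", 60 * 1_000_000),
    (r"h(ours?)?", 3_600 * 1_000_000),
    (r"d(ays?)?", 86_400 * 1_000_000),
    # Assumption: a month is treated as 30 days; the ticket leaves this open.
    (r"mo(nths?)?", 30 * 86_400 * 1_000_000),
]
_VALUE = re.compile(r"\s*(\d+)\s*([a-z]+)\s*")


def parse_duration_micros(text):
    """Parse strings such as '10ms' or '2 hours' into microseconds."""
    match = _VALUE.fullmatch(text.lower())
    if match is None:
        raise ValueError(f"not a duration: {text!r}")
    amount, suffix = int(match.group(1)), match.group(2)
    for pattern, micros in _UNITS:
        # fullmatch keeps "m" (minutes), "ms" and "mo" from shadowing each other.
        if re.fullmatch(pattern, suffix):
            return amount * micros
    raise ValueError(f"unknown unit: {suffix!r}")
```

Matching the whole suffix against each pattern, rather than a prefix, is what
keeps the overlapping short forms apart. Rate units such as MiB/s could be
handled the same way with a second table.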






[jira] [Comment Edited] (CASSANDRA-15234) Standardise config and JVM parameters

2021-12-01 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451939#comment-17451939
 ] 

Ekaterina Dimitrova edited comment on CASSANDRA-15234 at 12/1/21, 5:42 PM:
---

Moving the ticket to "In Review" as [~dcapwell] is already looking into it.

Also, for visibility and context, as it seems I missed this in my summary: 
there is a separate ticket linked to this one (CASSANDRA-9691) which mentions 
liberating metrics and JMX from units, or at least moving them to the lowest 
unit. I left this part to be handled there but forgot to mention it in my 
summary. Apologies for that.

My thought was to leave JMX and metrics alone for now and assume the old units 
are in place, but [~dcapwell] is right: we need at least to verify (and convert 
if needed) that in Config the old unit is used with the new custom types and 
that we didn't use something different, as this can be a mess if we leave it to 
people reading the docs. Something similar will also be needed for VTs, where 
we will have to check the Converter in the new @Replaces annotation. There was 
an idea to leave VT backward compatibility out until we figure out how we are 
going to handle the VTs, as there is an ongoing discussion around that; I will 
leave this aside for a moment until the discussion clears up a bit. CC 
[~blerer], as I know he is working on a patch for CASSANDRA-15254.

Also, I will be looking again into the ccm issue in the afternoon. 

 


was (Author: e.dimitrova):
Moving the ticket to "In Review" as [~dcapwell]  is already looking into it.

Also, for visibility and context as it seems I missed in my summary - there is 
a separate ticket linked to this one(CASSANDRA-9691) which was mentioning 
liberating metrics and JMX from units or just moving to lowest unit at least. I 
left this part to be handled there but I forgot to mention it in my summary. 
Apologize for that

My thought was currently to leave the  JMX for now and assume the old units are 
in place but [~dcapwell] is right we need at least to verify and convert if 
needed that in Config the old unit was used with the new custom types and we 
didn't use something different as this can be a mess if we live it to people 
and reading docs. Something similar will also be needed for VTs where we will 
have to check the Converter in the new @Replaces annotation. There was idea for 
getting the VT backward compatibility out until we figure out how we are going 
to handle the VT as there is ongoing discussion around that, I will leave this 
now aside for a moment until we clear a bit the discussion. CC [~blerer] as I 
know he is working on a patch for CASSANDRA-15254

Also, I will be looking again into the ccm issue in the afternoon. 

 







[jira] [Commented] (CASSANDRA-15234) Standardise config and JVM parameters

2021-12-01 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451939#comment-17451939
 ] 

Ekaterina Dimitrova commented on CASSANDRA-15234:
-

Moving the ticket to "In Review" as [~dcapwell] is already looking into it.

Also, for visibility and context, as it seems I missed this in my summary: 
there is a separate ticket linked to this one (CASSANDRA-9691) which mentions 
liberating metrics and JMX from units, or at least moving them to the lowest 
unit. I left this part to be handled there but forgot to mention it in my 
summary. Apologies for that.

My thought was to leave JMX alone for now and assume the old units are in 
place, but [~dcapwell] is right: we need at least to verify (and convert if 
needed) that in Config the old unit is used with the new custom types and that 
we didn't use something different, as this can be a mess if we leave it to 
people reading the docs. Something similar will also be needed for VTs, where 
we will have to check the Converter in the new @Replaces annotation. There was 
an idea to leave VT backward compatibility out until we figure out how we are 
going to handle the VTs, as there is an ongoing discussion around that; I will 
leave this aside for a moment until the discussion clears up a bit. CC 
[~blerer], as I know he is working on a patch for CASSANDRA-15254.

Also, I will be looking again into the ccm issue in the afternoon. 

 




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17082) Make nodes more resilient to local unrelated files during startup

2021-12-01 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451938#comment-17451938
 ] 

Josh McKenzie commented on CASSANDRA-17082:
---

Tidied that up on commit. Thanks!

> Make nodes more resilient to local unrelated files during startup
> -
>
> Key: CASSANDRA-17082
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17082
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Startup and Shutdown
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: Normal
> Fix For: 4.1
>
>
> There's a few things we can protect against being in the data dir on startup 
> that might be around from older activity, tool usage, exports, etc on a 3.x 
> -> 4.x update:
>  a) file ending with *-old.json
>  b) file ending with *.json or *idx.json
> A trivial update to the filter on SSTableHeaderFix.java should protect 
> against hitting these types of files on startup and throwing.
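
The filter change described above can be sketched in isolation as follows. This is only an illustration of the "treat unparseable names as not-an-SSTable" idea: componentOf is a hypothetical stand-in for Descriptor.fromFilenameWithComponent, and the file-name convention it checks is simplified for the example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: skip files whose names cannot be parsed instead of throwing.
public final class TolerantFilterSketch
{
    // Stand-in for a real descriptor parser: returns the component suffix of an
    // SSTable-like file name and throws for anything it does not recognise.
    static String componentOf(String fileName)
    {
        int dash = fileName.lastIndexOf('-');
        if (dash < 0 || !fileName.endsWith(".db"))
            throw new IllegalArgumentException("Not an SSTable component: " + fileName);
        return fileName.substring(dash + 1);
    }

    // True only for data components; unrelated files are ignored rather than fatal.
    static boolean isDataComponent(String fileName)
    {
        try
        {
            return componentOf(fileName).equals("Data.db");
        }
        catch (IllegalArgumentException e)
        {
            return false;
        }
    }

    public static List<String> dataFiles(Stream<String> fileNames)
    {
        return fileNames.filter(TolerantFilterSketch::isDataComponent)
                        .collect(Collectors.toList());
    }
}
```

With this shape, a stray snapshot-old.json or export.json in the data directory is simply dropped from the stream.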






[jira] [Updated] (CASSANDRA-17174) Harden resource management on SSTable components to prevent further leaks

2021-12-01 Thread Josh McKenzie (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-17174:
--
  Fix Version/s: 4.1
 (was: 4.x)
Source Control Link: 
https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=8e28dc0ebac3d80db43acfe76cfb45c0cb17a5c8
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Harden resource management on SSTable components to prevent further leaks
> -
>
> Key: CASSANDRA-17174
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17174
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/SSTable
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: Normal
> Fix For: 4.1
>
>
> We've seen resource leaks pop up w/histogram overflows repeatedly; the code 
> in {{BigTableWriter.openEarly()}} and {{BigTableWriter.openFinal()}} doesn't 
> appropriately catch and handle any exceptions during creation before things 
> are registered with a {{LifecycleTransaction}} so any errors there will leak.
> We should clean that up.
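
To make the leak pattern concrete: when several resources are opened one after another and a later step throws, everything opened so far has to be released before the exception propagates. A minimal sketch of that idea follows, assuming a hypothetical Opener interface and openAll helper; it is not the code from the actual patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: release already-opened resources if a later open fails.
public final class OpenOrCleanUpSketch
{
    // Hypothetical factory for one resource; may fail while opening.
    public interface Opener
    {
        AutoCloseable open() throws Exception;
    }

    public static List<AutoCloseable> openAll(List<Opener> openers) throws Exception
    {
        List<AutoCloseable> opened = new ArrayList<>();
        try
        {
            for (Opener opener : openers)
                opened.add(opener.open());
            return opened;
        }
        catch (Throwable t)
        {
            // Close in reverse order; keep the original failure as the primary error.
            for (int i = opened.size() - 1; i >= 0; i--)
            {
                try
                {
                    opened.get(i).close();
                }
                catch (Throwable suppressed)
                {
                    t.addSuppressed(suppressed);
                }
            }
            throw t;
        }
    }
}
```

Catching Throwable rather than Exception is deliberate here, since failures such as assertion errors would otherwise still leak the handles.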






[jira] [Commented] (CASSANDRA-17174) Harden resource management on SSTable components to prevent further leaks

2021-12-01 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451937#comment-17451937
 ] 

Josh McKenzie commented on CASSANDRA-17174:
---

Revised, added JVMStabilityInspector check against it, and committed. Thanks!

> Harden resource management on SSTable components to prevent further leaks
> -
>
> Key: CASSANDRA-17174
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17174
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/SSTable
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: Normal
> Fix For: 4.1
>
>
> We've seen resource leaks pop up w/histogram overflows repeatedly; the code 
> in {{BigTableWriter.openEarly()}} and {{BigTableWriter.openFinal()}} doesn't 
> appropriately catch and handle any exceptions during creation before things 
> are registered with a {{LifecycleTransaction}} so any errors there will leak.
> We should clean that up.






[cassandra] branch trunk updated: Harden resource management on SSTable components to prevent future leaks

2021-12-01 Thread jmckenzie
This is an automated email from the ASF dual-hosted git repository.

jmckenzie pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
 new 8e28dc0  Harden resource management on SSTable components to prevent 
future leaks
8e28dc0 is described below

commit 8e28dc0ebac3d80db43acfe76cfb45c0cb17a5c8
Author: Josh McKenzie 
AuthorDate: Mon Nov 29 11:27:17 2021 -0500

Harden resource management on SSTable components to prevent future leaks

Patch by Josh McKenzie; reviewed by Caleb Rackliffe and Marcus Erikkson for 
CASSANDRA-17174
---
 CHANGES.txt|   1 +
 .../io/sstable/format/big/BigTableWriter.java  | 140 +
 2 files changed, 90 insertions(+), 51 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index 5203504..c8cc544 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.1
+ * Harden resource management on SSTable components to prevent future leaks 
(CASSANDRA-17174)
  * Make nodes more resilient to local unrelated files during startup 
(CASSANDRA-17082)
  * repair prepare message would produce a wrong error message if network 
timeout happened rather than reply wait timeout (CASSANDRA-16992)
  * Log queries that fail on timeout or unavailable errors up to once per 
minute by default (CASSANDRA-17159)
diff --git 
a/src/java/org/apache/cassandra/io/sstable/format/big/BigTableWriter.java 
b/src/java/org/apache/cassandra/io/sstable/format/big/BigTableWriter.java
index 889547d..dc43380 100644
--- a/src/java/org/apache/cassandra/io/sstable/format/big/BigTableWriter.java
+++ b/src/java/org/apache/cassandra/io/sstable/format/big/BigTableWriter.java
@@ -22,6 +22,7 @@ import java.io.IOException;
 import java.nio.BufferOverflowException;
 import java.nio.ByteBuffer;
 import java.util.*;
+import java.util.stream.Stream;
 
 import org.apache.cassandra.db.compaction.OperationType;
 import org.apache.cassandra.db.lifecycle.LifecycleNewTracker;
@@ -50,6 +51,7 @@ import org.apache.cassandra.io.util.*;
 import org.apache.cassandra.schema.CompressionParams;
 import org.apache.cassandra.schema.TableMetadataRef;
 import org.apache.cassandra.utils.*;
+import org.apache.cassandra.utils.concurrent.SharedCloseableImpl;
 import org.apache.cassandra.utils.concurrent.Transactional;
 
 import static org.apache.cassandra.utils.Clock.Global.currentTimeMillis;
@@ -338,32 +340,50 @@ public class BigTableWriter extends SSTableWriter
 if (boundary == null)
 return null;
 
-StatsMetadata stats = statsMetadata();
-assert boundary.indexLength > 0 && boundary.dataLength > 0;
-// open the reader early
-IndexSummary indexSummary = 
iwriter.summary.build(metadata().partitioner, boundary);
-long indexFileLength = new 
File(descriptor.filenameFor(Component.PRIMARY_INDEX)).length();
-int indexBufferSize = optimizationStrategy.bufferSize(indexFileLength 
/ indexSummary.size());
-FileHandle ifile = 
iwriter.builder.bufferSize(indexBufferSize).complete(boundary.indexLength);
-if (compression)
-dbuilder.withCompressionMetadata(((CompressedSequentialWriter) 
dataFile).open(boundary.dataLength));
-int dataBufferSize = 
optimizationStrategy.bufferSize(stats.estimatedPartitionSize.percentile(DatabaseDescriptor.getDiskOptimizationEstimatePercentile()));
-FileHandle dfile = 
dbuilder.bufferSize(dataBufferSize).complete(boundary.dataLength);
-invalidateCacheAtBoundary(dfile);
-SSTableReader sstable = SSTableReader.internalOpen(descriptor,
-   components, 
metadata,
-   ifile, dfile,
-   indexSummary,
-   
iwriter.bf.sharedCopy(), 
-   maxDataAge, 
-   stats, 
-   
SSTableReader.OpenReason.EARLY, 
-   header);
-
-// now it's open, find the ACTUAL last readable key (i.e. for which 
the data file has also been flushed)
-sstable.first = getMinimalKey(first);
-sstable.last = getMinimalKey(boundary.lastKey);
-return sstable;
+IndexSummary indexSummary = null;
+FileHandle ifile = null;
+FileHandle dfile = null;
+SSTableReader sstable = null;
+
+try
+{
+StatsMetadata stats = statsMetadata();
+assert boundary.indexLength > 0 && boundary.dataLength > 0;
+// open the reader early
+indexSummary = iwriter.summary.build(metadata().partitioner, 
boundary);
+long indexFileLength

[jira] [Updated] (CASSANDRA-17174) Harden resource management on SSTable components to prevent further leaks

2021-12-01 Thread Josh McKenzie (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-17174:
--
Status: Ready to Commit  (was: Review In Progress)

> Harden resource management on SSTable components to prevent further leaks
> -
>
> Key: CASSANDRA-17174
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17174
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/SSTable
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: Normal
> Fix For: 4.x
>
>
> We've seen resource leaks pop up w/histogram overflows repeatedly; the code 
> in {{BigTableWriter.openEarly()}} and {{BigTableWriter.openFinal()}} doesn't 
> appropriately catch and handle any exceptions during creation before things 
> are registered with a {{LifecycleTransaction}} so any errors there will leak.
> We should clean that up.






[jira] [Updated] (CASSANDRA-17082) Make nodes more resilient to local unrelated files during startup

2021-12-01 Thread Josh McKenzie (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-17082:
--
Status: Ready to Commit  (was: Review In Progress)

> Make nodes more resilient to local unrelated files during startup
> -
>
> Key: CASSANDRA-17082
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17082
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Startup and Shutdown
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: Normal
> Fix For: 4.x
>
>
> There's a few things we can protect against being in the data dir on startup 
> that might be around from older activity, tool usage, exports, etc on a 3.x 
> -> 4.x update:
>  a) file ending with *-old.json
>  b) file ending with *.json or *idx.json
> A trivial update to the filter on SSTableHeaderFix.java should protect 
> against hitting these types of files on startup and throwing.






[jira] [Updated] (CASSANDRA-17082) Make nodes more resilient to local unrelated files during startup

2021-12-01 Thread Josh McKenzie (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-17082:
--
Status: Patch Available  (was: In Progress)

> Make nodes more resilient to local unrelated files during startup
> -
>
> Key: CASSANDRA-17082
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17082
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Startup and Shutdown
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: Normal
> Fix For: 4.x
>
>
> There's a few things we can protect against being in the data dir on startup 
> that might be around from older activity, tool usage, exports, etc on a 3.x 
> -> 4.x update:
>  a) file ending with *-old.json
>  b) file ending with *.json or *idx.json
> A trivial update to the filter on SSTableHeaderFix.java should protect 
> against hitting these types of files on startup and throwing.






[jira] [Updated] (CASSANDRA-17082) Make nodes more resilient to local unrelated files during startup

2021-12-01 Thread Josh McKenzie (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-17082:
--
Status: Review In Progress  (was: Patch Available)

> Make nodes more resilient to local unrelated files during startup
> -
>
> Key: CASSANDRA-17082
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17082
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Startup and Shutdown
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: Normal
> Fix For: 4.x
>
>
> There's a few things we can protect against being in the data dir on startup 
> that might be around from older activity, tool usage, exports, etc on a 3.x 
> -> 4.x update:
>  a) file ending with *-old.json
>  b) file ending with *.json or *idx.json
> A trivial update to the filter on SSTableHeaderFix.java should protect 
> against hitting these types of files on startup and throwing.






[jira] [Updated] (CASSANDRA-17082) Make nodes more resilient to local unrelated files during startup

2021-12-01 Thread Josh McKenzie (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-17082:
--
  Fix Version/s: 4.1
 (was: 4.x)
Source Control Link: 
https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=92dc415902654c0e69de47205af62b9bb4532809
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Make nodes more resilient to local unrelated files during startup
> -
>
> Key: CASSANDRA-17082
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17082
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Startup and Shutdown
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: Normal
> Fix For: 4.1
>
>
> There's a few things we can protect against being in the data dir on startup 
> that might be around from older activity, tool usage, exports, etc on a 3.x 
> -> 4.x update:
>  a) file ending with *-old.json
>  b) file ending with *.json or *idx.json
> A trivial update to the filter on SSTableHeaderFix.java should protect 
> against hitting these types of files on startup and throwing.






[jira] [Updated] (CASSANDRA-17082) Make nodes more resilient to local unrelated files during startup

2021-12-01 Thread Josh McKenzie (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-17082:
--
Status: In Progress  (was: Patch Available)

> Make nodes more resilient to local unrelated files during startup
> -
>
> Key: CASSANDRA-17082
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17082
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Startup and Shutdown
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: Normal
> Fix For: 4.x
>
>
> There's a few things we can protect against being in the data dir on startup 
> that might be around from older activity, tool usage, exports, etc on a 3.x 
> -> 4.x update:
>  a) file ending with *-old.json
>  b) file ending with *.json or *idx.json
> A trivial update to the filter on SSTableHeaderFix.java should protect 
> against hitting these types of files on startup and throwing.






[cassandra] branch trunk updated: Tolerate local files in data dir during startup

2021-12-01 Thread jmckenzie
This is an automated email from the ASF dual-hosted git repository.

jmckenzie pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
 new 92dc415  Tolerate local files in data dir during startup
92dc415 is described below

commit 92dc415902654c0e69de47205af62b9bb4532809
Author: Alex Petrov 
AuthorDate: Thu Oct 28 15:35:57 2021 -0400

Tolerate local files in data dir during startup

Patch by Alex Petrov; reviewed by Aleksey Yeschenko, Jon Meredith, and 
Caleb Rackliffe for CASSANDRA-17082

Co-authored-by: Alex Petrov 
Co-authored-by: Josh McKenzie 
---
 CHANGES.txt   |  1 +
 .../org/apache/cassandra/io/sstable/SSTableHeaderFix.java | 12 +++-
 .../apache/cassandra/io/sstable/SSTableHeaderFixTest.java | 15 +++
 3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index c399c0d..5203504 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.1
+ * Make nodes more resilient to local unrelated files during startup 
(CASSANDRA-17082)
  * repair prepare message would produce a wrong error message if network 
timeout happened rather than reply wait timeout (CASSANDRA-16992)
  * Log queries that fail on timeout or unavailable errors up to once per 
minute by default (CASSANDRA-17159)
  * Refactor normal/preview/IR repair to standardize repair cleanup and error 
handling of failed RepairJobs (CASSANDRA-17069)
diff --git a/src/java/org/apache/cassandra/io/sstable/SSTableHeaderFix.java 
b/src/java/org/apache/cassandra/io/sstable/SSTableHeaderFix.java
index ad0a722..643a9de 100644
--- a/src/java/org/apache/cassandra/io/sstable/SSTableHeaderFix.java
+++ b/src/java/org/apache/cassandra/io/sstable/SSTableHeaderFix.java
@@ -298,7 +298,17 @@ public abstract class SSTableHeaderFix
 {
 Stream.of(path)
   .flatMap(SSTableHeaderFix::maybeExpandDirectory)
-  .filter(p -> Descriptor.fromFilenameWithComponent(new 
File(p)).right.type == Component.Type.DATA)
+  .filter(p -> {
+  try
+  {
+  return Descriptor.fromFilenameWithComponent(new 
File(p.toFile())).right.type == Component.Type.DATA;
+  }
+  catch (IllegalArgumentException t)
+  {
+  logger.info("Couldn't parse filename {}, ignoring", p);
+  return false;
+  }
+  })
   .map(Path::toString)
   .map(Descriptor::fromFilename)
   .forEach(descriptors::add);
diff --git 
a/test/unit/org/apache/cassandra/io/sstable/SSTableHeaderFixTest.java 
b/test/unit/org/apache/cassandra/io/sstable/SSTableHeaderFixTest.java
index f578f77..5b28c0e 100644
--- a/test/unit/org/apache/cassandra/io/sstable/SSTableHeaderFixTest.java
+++ b/test/unit/org/apache/cassandra/io/sstable/SSTableHeaderFixTest.java
@@ -33,6 +33,7 @@ import java.util.stream.IntStream;
 import com.google.common.collect.Sets;
 import org.apache.cassandra.io.util.File;
 import org.junit.After;
+import org.junit.Assert;
 import org.junit.Before;
 import org.junit.Test;
 
@@ -735,6 +736,20 @@ public class SSTableHeaderFixTest
 }
 }
 
+@Test
+public void ignoresStaleFilesTest() throws Exception
+{
+File dir = temporaryFolder;
+IntStream.range(1, 2).forEach(g -> generateFakeSSTable(dir, g));
+
+File newFile = new File(dir.toAbsolute(), 
"something_something-something.something");
+Assert.assertTrue(newFile.createFileIfNotExists());
+
+SSTableHeaderFix headerFix = builder().withPath(dir.toPath())
+  .build();
+headerFix.execute();
+}
+
 private static final Pattern p = Pattern.compile(".* Column '([^']+)' 
needs to be updated from type .*");
 
 private SSTableHeaderFix.Builder builder()




[jira] [Commented] (CASSANDRA-16446) Parent repair sessions leak may lead to node long pauses

2021-12-01 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451934#comment-17451934
 ] 

David Capwell commented on CASSANDRA-16446:
---

Thanks, saw it while refactoring and wasn't sure if there was a good reason.  
What I see now is that FINALIZE_COMMIT and CLEANUP touch different maps, so I 
don't see a clear conflict with IR; it makes sense to me to clean up on 
failure.

> Parent repair sessions leak may lead to node long pauses
> 
>
> Key: CASSANDRA-16446
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16446
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc1, 4.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> {{ActiveRepairService}} keeps a map {{parentRepairSessions}}. If these 
> sessions leak, that map can grow so large that, when a node restarts, 
> {{ActiveRepairService.onRestart()}} triggers a cleanup of sessions that can 
> pause nodes in a cluster for a long time.
> The proposed solution is for repairs to clean up these sessions on all nodes 
> on completion by sending a CLEANUP message to the involved nodes. Tests rely 
> on a new {{parentRepairSessionsCount()}} method on the parent repair sessions 
> MBean to keep track of these.






[jira] [Updated] (CASSANDRA-17082) Make nodes more resilient to local unrelated files during startup

2021-12-01 Thread Josh McKenzie (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-17082:
--
Summary: Make nodes more resilient to local unrelated files during startup  
(was: Make nodes more resilient to stale JSON files during startup)

> Make nodes more resilient to local unrelated files during startup
> -
>
> Key: CASSANDRA-17082
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17082
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Startup and Shutdown
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: Normal
> Fix For: 4.x
>
>
> There's a few things we can protect against being in the data dir on startup 
> that might be around from older activity, tool usage, exports, etc on a 3.x 
> -> 4.x update:
>  a) file ending with *-old.json
>  b) file ending with *.json or *idx.json
> A trivial update to the filter on SSTableHeaderFix.java should protect 
> against hitting these types of files on startup and throwing.






[jira] [Commented] (CASSANDRA-17147) Guardrails prototype

2021-12-01 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451933#comment-17451933
 ] 

David Capwell commented on CASSANDRA-17147:
---

bq.  able to use Jackson annotations and have private attributes.

The patch allows plugging in SnakeYAML and Jackson for object mapping, but 
currently defaults to SnakeYAML. I did add support for @JsonIgnore, but other 
annotations such as @JsonProperty are not supported (so you can't rename).

bq. I guess that those annotations would also make things easier for the 
settings virtual table.

ATM harder!  That is another issue we need to solve in Cassandra: the settings 
vtable makes assumptions which are not valid today (well, they would not be if 
it supported nested properties), so I am in favor of centralizing the logic of 
"what is a property" so that YAML and the vtable both use the same code path 
(it is so much easier to enhance our logic down the line...); but again, not in 
this ticket.

bq.  I guess we could leave the camel case getters/setters and public snake 
case attributes and wait for a better solution

Yeah, I am in favor of that as well.  I see this as a bug in our config layer, 
so fixing the config layer is best IMO.  It's present in more than guardrails, 
so patching every access is... yeah, let's just fix the config layer...

bq. We could also use snake casing in the getters/setters

I wouldn't; again, I see this as a bug in the config layer, so let's fix the 
config layer.

> Guardrails prototype
> 
>
> Key: CASSANDRA-17147
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17147
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/Guardrails
>Reporter: Andres de la Peña
>Assignee: Andres de la Peña
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 14h 50m
>  Remaining Estimate: 0h
>
> The purpose of this ticket is creating an initial implementation of the 
> guardrails framework, as well as adding a few simple guardrails using this 
> framework.
> To keep things easy, this initial implementation would only support 
> guardrails that are triggered on the coordinator, and they would be 
> dynamically updatable only through JMX.
> Once we have this initial framework ready in a feature branch we can have 
> multiple tickets addressing all the things that would have been left out of 
> the scope of this ticket, such as:
> * Dynamic updates through virtual tables
> * Being able to notify about guardrails triggered on replicas
> * Using custom exceptions other than {{InvalidRequestException}}.
> * Porting existing limits to use the new guardrails framework
> * Adding new guardrails beyond the initial ones
> The reason for having this simpler prototype is that it will give us a common 
> ground to parallelize work on the parts mentioned above.






[jira] [Updated] (CASSANDRA-15234) Standardise config and JVM parameters

2021-12-01 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-15234:

Reviewers: Benjamin Lerer, David Capwell, Ekaterina Dimitrova  (was: 
Benjamin Lerer, David Capwell)
   Status: Review In Progress  (was: Patch Available)

> Standardise config and JVM parameters
> -
>
> Key: CASSANDRA-15234
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15234
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Config
>Reporter: Benedict Elliott Smith
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 5.x
>
> Attachments: CASSANDRA-15234-3-DTests-JAVA8.txt
>
>
> We have a bunch of inconsistent names and config patterns in the codebase, 
> both from the yaml and JVM properties.  It would be nice to standardise the 
> naming (such as otc_ vs internode_) as well as the provision of values with 
> units - while maintaining perpetual backwards compatibility with the old 
> parameter names, of course.
> For temporal units, I would propose parsing strings with suffixes of:
> {{code}}
> u|micros(econds?)?
> ms|millis(econds?)?
> s(econds?)?
> m(inutes?)?
> h(ours?)?
> d(ays?)?
> mo(nths?)?
> {{code}}
> For rate units, I would propose parsing any of the standard {{B/s, KiB/s, 
> MiB/s, GiB/s, TiB/s}}.
> Perhaps for avoiding ambiguity we could not accept bauds {{bs, Mbps}} or 
> powers of 1000 such as {{KB/s}}, given these are regularly used for either 
> their old or new definition e.g. {{KiB/s}}, or we could support them and 
> simply log the value in bytes/s.






[jira] [Commented] (CASSANDRA-17180) Implement heartbeat service to know last time Cassandra node was up

2021-12-01 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451897#comment-17451897
 ] 

Brandon Williams commented on CASSANDRA-17180:
--

bq. If set to true, there will be a startup check reading the content of 
heartbeat_file and node will fail to start if there are tables for which gc 
grace period is older

Seems like this would fit nicely as a guardrail.

> Implement heartbeat service to know last time Cassandra node was up
> ---
>
> Key: CASSANDRA-17180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17180
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Observability
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As already discussed on the ML, it would be nice to have a service which 
> would periodically write a timestamp to a file, signalling that the node is 
> up and running.
> Then, on startup, we would read this file and determine whether there is any 
> table whose gc grace period has already elapsed since that time; if so, we 
> would fail the start, so that zombie data is not likely to be spread around 
> the cluster.
> https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw
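
A minimal sketch of the two halves described above: writing the heartbeat timestamp, and checking it against a gc grace period at startup. The class and method names (HeartbeatSketch, writeHeartbeat, readHeartbeat, exceedsGcGrace) and the plain-text epoch-millis file format are assumptions for illustration only; the ticket does not specify them.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch of a heartbeat file plus a startup check against gc grace.
public final class HeartbeatSketch
{
    // Would be called periodically while the node is up; overwrites the previous timestamp.
    public static void writeHeartbeat(Path file, Instant now) throws IOException
    {
        Files.write(file, Long.toString(now.toEpochMilli()).getBytes(StandardCharsets.UTF_8));
    }

    // Would be called once at startup to learn when the node was last known to be running.
    public static Instant readHeartbeat(Path file) throws IOException
    {
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
        return Instant.ofEpochMilli(Long.parseLong(text));
    }

    // True when the node was down for longer than the given gc grace period,
    // i.e. the case in which the proposal would refuse to start.
    public static boolean exceedsGcGrace(Instant lastHeartbeat, Instant now, Duration gcGrace)
    {
        return Duration.between(lastHeartbeat, now).compareTo(gcGrace) > 0;
    }
}
```

A real implementation would run exceedsGcGrace once per table, since gc grace is a per-table setting.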






[jira] [Commented] (CASSANDRA-15510) BTree: Improve Building, Inserting and Transforming

2021-12-01 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451892#comment-17451892
 ] 

Michael Semb Wever commented on CASSANDRA-15510:


The discussion raised on dev@ ML to include this also in 4.0.x can be read 
[here|https://lists.apache.org/thread/f3dl7rfc2kv9f5r9pxzyz6zojsss81b9].

> BTree: Improve Building, Inserting and Transforming
> ---
>
> Key: CASSANDRA-15510
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15510
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Other
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.0.x, 4.x
>
>
> This work was originally undertaken as a follow-up to CASSANDRA-15367 to 
> ensure performance is strictly improved, but it may no longer be needed for 
> that purpose.  It’s still hugely impactful, however.  It remains to be 
> decided where this should land.
> The current {{BTree}} implementation is suboptimal in a number of ways, with 
> very little focus having been given to its performance besides its 
> memory-occupancy.  This patch aims to address that, specifically improving 
> the performance and allocations involved in: building, transforming and 
> inserting into a tree.
> To facilitate this work, the {{BTree}} definition is modified slightly, so 
> that we can perform some simple arithmetic on tree sizes.  Specifically, 
> trees of depth n are defined to have a maximum capacity of {{branchFactor^n - 
> 1}}, which translates into capping the number of leaf children at 
> {{branchFactor-1}}, as opposed to {{branchFactor}}.  Since {{branchFactor}} 
> is a power of 2, this permits fast tree size arithmetic, enabling some of 
> these changes.
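
The size arithmetic in the paragraph above can be checked with a small sketch. Assuming branchFactor = 1 << branchShift (the parameter name branchShift and both method names are made up for this example), the maximum capacity of a tree of depth n is a single shift and a subtraction:

```java
// Hypothetical sketch of the "branchFactor^n - 1" capacity arithmetic.
public final class BTreeCapacitySketch
{
    // Maximum number of keys in a tree of the given depth when branchFactor = 1 << branchShift:
    // branchFactor^depth - 1 == (1 << (branchShift * depth)) - 1.
    public static long maxCapacity(int branchShift, int depth)
    {
        if (branchShift < 1 || depth < 1 || (long) branchShift * depth > 62)
            throw new IllegalArgumentException("shift * depth must be between 1 and 62");
        return (1L << (branchShift * depth)) - 1;
    }

    // Smallest depth whose maximum capacity can hold the given number of keys.
    public static int minDepthFor(long size, int branchShift)
    {
        int depth = 1;
        while (maxCapacity(branchShift, depth) < size)
            depth++;
        return depth;
    }
}
```

For a branch factor of 32 (branchShift of 5), a single leaf holds at most 31 keys, and a depth-2 tree holds 31 root keys plus 32 leaves of 31 keys each, which is 1023 = 32^2 - 1.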
> h2. Building
> The static build method has been modified to utilise dedicated 
> {{buildPerfect}} methods that build either perfectly dense or perfectly 
> sparse sub-trees.  These perfect trees all share their {{sizeMap}} with each 
> other, and can be built more efficiently than trees of arbitrary size.  The 
> specifics are described in detail in the comments, but this building block 
> can be used to construct trees of any size, using at most one child at each 
> level that is not either perfectly sparse or perfectly dense.  Bulk methods 
> are used where possible.
> For large trees this can produce up to 30x throughput improvement and 30% 
> allocation reduction vs 3.0 (TBC, and to be tested vs 4.0).
> {{FastBuilder}} is introduced for building a tree in-order (or in reverse) 
> without duplicate elements to resolve, without necessarily knowing the size 
> upfront.  This meets the needs of most use cases.  Data is built directly 
> into nodes, with up to one already-constructed node, and one partially 
> constructed node, on each level, being mutated to share their contents in the 
> event of insufficient data to populate the tree.  These builders are 
> thread-locally shared.  This leads to minimal copying, the same sharing of 
> {{sizeMap}} as above, zero wasted allocations, and results in minimal 
> difference in performance between utilising the less-ergonomic static build 
> and builder approach.
> For large trees this leads to ~4.5x throughput improvement, and 70% reduction 
> in allocations vs a normal Builder.  For small trees performance is 
> comparable, but allocations similarly reduced.
> h2. Inserting
> It turns out that we only ever insert another tree into a tree, so we exploit 
> this to implement an efficient union of two trees, operating on them directly 
> via stacks in the transformer, instead of via a collection interface.  A 
> builder-like object is introduced that shares functionality with 
> {{FastBuilder}}, and permits us to build the result of the union directly 
> into the final nodes, reusing as much of the original trees as possible.  
> Bulk methods are used where possible.
> The result is not _uniformly_ faster, but is _significantly_ faster on 
> average: median _improvement_ of 1.4x (that is, 2.4x total throughput), mean 
> improvement of 10x.  Worst reduction is 30%, and it may be that we can 
> isolate and alleviate that.  Allocations are also reduced significantly, with 
> a median of 30% and mean of 42% for the tested workloads.  As the trees get 
> larger the improvement drops, but allocations remain uniformly lower.
> h2. Transforming
> The garbage overhead of transformations is minimal, i.e. the main allocations are 
> those necessary to represent the new tree.  It is significantly faster and 
> particularly more efficient when removing elements, utilising the shared 
> functionality of the builder and transformer objects to define an efficient 
> builder that reuses as much of the original tree as possible. 
> We also introduc

[jira] [Commented] (CASSANDRA-15511) Utilising BTree Improvements

2021-12-01 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451891#comment-17451891
 ] 

Michael Semb Wever commented on CASSANDRA-15511:


The discussion raised on dev@ ML to include this also in 4.0.x can be read 
[here|https://lists.apache.org/thread/f3dl7rfc2kv9f5r9pxzyz6zojsss81b9].

> Utilising BTree Improvements
> 
>
> Key: CASSANDRA-15511
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15511
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Other
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.0.x, 4.x
>
> Attachments: atomicbtreepartition.ods, atomicbtreepartition.xlsx.zip, 
> perfsh.tar.gz
>
>
> This patch utilises CASSANDRA-15510 to improve throughput and reduce garbage 
> produced by a number of common operations, by employing 
> {{transformAndFilter}}, {{transform}} and {{FastBuilder}}
> * {{Row}}, {{Cell}} and {{ComplexColumnData}} cloning are implemented with 
> {{BTree.transform}}, so no special builders are necessary; 
> ** {{Rows.copy}} removed
> * {{Rows.merge}} implemented using {{BTree.update}} and a {{ColumnData}} 
> reconciler
> ** Zero-allocations if result of merge is same as {{existing}}
> ** Fewer comparisons
> * {{ColumnData}} reconciler implemented in same manner
> ** {{Cells.reconcileComplex}} is retired
> ** {{ComplexColumnData}} reconciliation now
> *** Garbage-free if the merge has no effect
> *** Always fewer allocations
> *** Fewer comparisons
> * {{FastBuilder}} employed widely:
> ** {{ClusteringIndexNamesFilter}} deserialization
> ** {{Columns}} deserialization
> ** {{PartitionUpdate}} deserialization
> ** {{AbstractBTreePartition}} construction
> ** Misc others
> The upshot of this work when combined with the proposed patch for 
> CASSANDRA-15367 has a dramatic impact on operations over collection types - 
> under contention, as much as 100x improved throughput, and hundreds of 
> megabytes of reduced allocations.  For all operations, allocations under 
> contention and no contention are significantly reduced and throughput 
> improved.
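
The "zero allocations if the result is the same as {{existing}}" property listed above can be illustrated with a small sketch. This is not the {{BTree.transform}} implementation; the function name and the use of Python tuples are assumptions made purely for illustration. The idea shown is only that a transform can defer copying until the first element actually changes, and hand back the original object otherwise.

```python
def transform(items: tuple, fn) -> tuple:
    """Apply fn to every element; return `items` itself if nothing changed.

    Hypothetical helper for illustration only, not Cassandra's API.
    """
    result = None  # allocated lazily, on the first element that actually changes
    for i, item in enumerate(items):
        new_item = fn(item)
        if result is None:
            if new_item is item:
                continue  # still identical to the input, nothing to copy yet
            result = list(items[:i])  # copy the unchanged prefix once
        result.append(new_item)
    return items if result is None else tuple(result)
```

A caller can then detect a no-op with an identity check ({{transform(t, f) is t}}) instead of comparing contents.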






[jira] [Updated] (CASSANDRA-15511) Utilising BTree Improvements

2021-12-01 Thread Michael Semb Wever (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-15511:
---
Fix Version/s: 4.0.x

> Utilising BTree Improvements
> 
>
> Key: CASSANDRA-15511
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15511
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Other
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.0.x, 4.x
>
> Attachments: atomicbtreepartition.ods, atomicbtreepartition.xlsx.zip, 
> perfsh.tar.gz
>
>
> This patch utilises CASSANDRA-15510 to improve throughput and reduce garbage 
> produced by a number of common operations, by employing 
> {{transformAndFilter}}, {{transform}} and {{FastBuilder}}
> * {{Row}}, {{Cell}} and {{ComplexColumnData}} cloning are implemented with 
> {{BTree.transform}}, so no special builders are necessary; 
> ** {{Rows.copy}} removed
> * {{Rows.merge}} implemented using {{BTree.update}} and a {{ColumnData}} 
> reconciler
> ** Zero-allocations if result of merge is same as {{existing}}
> ** Fewer comparisons
> * {{ColumnData}} reconciler implemented in same manner
> ** {{Cells.reconcileComplex}} is retired
> ** {{ComplexColumnData}} reconciliation now
> *** Garbage-free if the merge has no effect
> *** Always fewer allocations
> *** Fewer comparisons
> * {{FastBuilder}} employed widely:
> ** {{ClusteringIndexNamesFilter}} deserialization
> ** {{Columns}} deserialization
> ** {{PartitionUpdate}} deserialization
> ** {{AbstractBTreePartition}} construction
> ** Misc others
> The upshot of this work when combined with the proposed patch for 
> CASSANDRA-15367 has a dramatic impact on operations over collection types - 
> under contention, as much as 100x improved throughput, and hundreds of 
> megabytes of reduced allocations.  For all operations, allocations under 
> contention and no contention are significantly reduced and throughput 
> improved.






[jira] [Updated] (CASSANDRA-15510) BTree: Improve Building, Inserting and Transforming

2021-12-01 Thread Michael Semb Wever (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-15510:
---
Fix Version/s: 4.0.x

> BTree: Improve Building, Inserting and Transforming
> ---
>
> Key: CASSANDRA-15510
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15510
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Other
>Reporter: Benedict Elliott Smith
>Assignee: Benedict Elliott Smith
>Priority: Normal
> Fix For: 4.0.x, 4.x
>
>
> This work was originally undertaken as a follow-up to CASSANDRA-15367 to 
> ensure performance is strictly improved, but it may no longer be needed for 
> that purpose.  It’s still hugely impactful, however.  It remains to be 
> decided where this should land.
> The current {{BTree}} implementation is suboptimal in a number of ways, with 
> very little focus having been given to its performance besides its 
> memory-occupancy.  This patch aims to address that, specifically improving 
> the performance and allocations involved in: building, transforming and 
> inserting into a tree.
> To facilitate this work, the {{BTree}} definition is modified slightly, so 
> that we can perform some simple arithmetic on tree sizes.  Specifically, 
> trees of depth n are defined to have a maximum capacity of {{branchFactor^n - 
> 1}}, which translates into capping the number of leaf children at 
> {{branchFactor-1}}, as opposed to {{branchFactor}}.  Since {{branchFactor}} 
> is a power of 2, this permits fast tree size arithmetic, enabling some of 
> these changes.
> h2. Building
> The static build method has been modified to utilise dedicated 
> {{buildPerfect}} methods that build either perfectly dense or perfectly 
> sparse sub-trees.  These perfect trees all share their {{sizeMap}} with each 
> other, and can be built more efficiently than trees of arbitrary size.  The 
> specifics are described in detail in the comments, but this building block 
> can be used to construct trees of any size, using at most one child at each 
> level that is not either perfectly sparse or perfectly dense.  Bulk methods 
> are used where possible.
> For large trees this can produce up to 30x throughput improvement and 30% 
> allocation reduction vs 3.0 (TBC, and to be tested vs 4.0).
> {{FastBuilder}} is introduced for building a tree in-order (or in reverse) 
> without duplicate elements to resolve, without necessarily knowing the size 
> upfront.  This meets the needs of most use cases.  Data is built directly 
> into nodes, with up to one already-constructed node, and one partially 
> constructed node, on each level, being mutated to share their contents in the 
> event of insufficient data to populate the tree.  These builders are 
> thread-locally shared.  This leads to minimal copying, the same sharing of 
> {{sizeMap}} as above, zero wasted allocations, and results in minimal 
> difference in performance between utilising the less-ergonomic static build 
> and builder approach.
> For large trees this leads to ~4.5x throughput improvement, and 70% reduction 
> in allocations vs a normal Builder.  For small trees performance is 
> comparable, but allocations similarly reduced.
> h2. Inserting
> It turns out that we only ever insert another tree into a tree, so we exploit 
> this to implement an efficient union of two trees, operating on them directly 
> via stacks in the transformer, instead of via a collection interface.  A 
> builder-like object is introduced that shares functionality with 
> {{FastBuilder}}, and permits us to build the result of the union directly 
> into the final nodes, reusing as much of the original trees as possible.  
> Bulk methods are used where possible.
> The result is not _uniformly_ faster, but is _significantly_ faster on 
> average: median _improvement_ of 1.4x (that is, 2.4x total throughput), mean 
> improvement of 10x.  Worst reduction is 30%, and it may be that we can 
> isolate and alleviate that.  Allocations are also reduced significantly, with 
> a median of 30% and mean of 42% for the tested workloads.  As the trees get 
> larger the improvement drops, but allocations remain uniformly lower.
> h2. Transforming
> The garbage overhead of transformations is minimal, i.e. the main allocations are 
> those necessary to represent the new tree.  It is significantly faster and 
> particularly more efficient when removing elements, utilising the shared 
> functionality of the builder and transformer objects to define an efficient 
> builder that reuses as much of the original tree as possible. 
> We also introduce dedicated {{transform}} methods (that forbid returning 
> {{null}}), and {{BiFunction}} transformations to permit efficient follow-ups.
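
The tree-size arithmetic described in the ticket can be checked with a short sketch. The closed form {{branchFactor^n - 1}} comes from the description; the recursive layout (a branch holding {{branchFactor - 1}} keys plus {{branchFactor}} children) is an assumption about a conventional B-tree shape, and the function names and the example shift of 5 (branch factor 32) are illustrative only.

```python
def max_capacity_recursive(depth: int, branch_factor: int) -> int:
    """Capacity derived from the assumed node layout.

    A leaf holds branch_factor - 1 keys; a branch holds branch_factor - 1 keys
    plus branch_factor children, each a full tree of depth - 1.
    """
    if depth == 1:
        return branch_factor - 1
    return (branch_factor - 1) + branch_factor * max_capacity_recursive(depth - 1, branch_factor)


def max_capacity_shift(depth: int, branch_shift: int) -> int:
    """Closed form branchFactor^depth - 1, where branchFactor == 1 << branch_shift."""
    return (1 << (branch_shift * depth)) - 1


def min_depth_for(size: int, branch_shift: int) -> int:
    """Smallest depth whose maximum capacity can hold `size` keys."""
    depth = 1
    while max_capacity_shift(depth, branch_shift) < size:
        depth += 1
    return depth
```

Because the branch factor is a power of two, the capacity of any depth reduces to a single shift and subtraction, which is what makes sizing sub-trees cheap.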




[jira] [Commented] (CASSANDRA-16894) Java 11 support - remove the experimental flag

2021-12-01 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451867#comment-17451867
 ] 

Ekaterina Dimitrova commented on CASSANDRA-16894:
-

The proposed change looks good to me, thank you

> Java 11 support - remove the experimental flag
> --
>
> Key: CASSANDRA-16894
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16894
> Project: Cassandra
>  Issue Type: Task
>  Components: Build
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.0.x
>
>
> As per this 
> [thread|https://lists.apache.org/thread.html/r06a4768bae215dc19469491a1faec64dd90e1c3d4d10ed34b85ba248%40%3Cdev.cassandra.apache.org%3E],
> the goal of this ticket is to remove the experimental flag for Java 11 support.






[jira] [Updated] (CASSANDRA-17169) Flaky RecomputingSupplierTest

2021-12-01 Thread Jira


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andres de la Peña updated CASSANDRA-17169:
--
Reviewers: Aleksandr Sorokoumov, Andres de la Peña  (was: Andres de la Peña)

> Flaky RecomputingSupplierTest
> -
>
> Key: CASSANDRA-17169
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17169
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0.x, 4.x
>
>
> See 
> https://ci-cassandra.apache.org/job/Cassandra-4.0/293/testReport/junit/org.apache.cassandra.utils/RecomputingSupplierTest/recomputingSupplierTest/
> {noformat}
> java.util.concurrent.TimeoutException
>   at 
> java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
>   at 
> java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
>   at 
> org.apache.cassandra.utils.RecomputingSupplier.get(RecomputingSupplier.java:110)
>   at 
> org.apache.cassandra.utils.RecomputingSupplierTest.recomputingSupplierTest(RecomputingSupplierTest.java:120)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {noformat}






[jira] [Commented] (CASSANDRA-16894) Java 11 support - remove the experimental flag

2021-12-01 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451833#comment-17451833
 ] 

Michael Semb Wever commented on CASSANDRA-16894:


bq. I agree, the patch was created before 4.0.1 but I thought we are waiting on 
the ASCII patch to deal with these tickets?

My concern/suggestion was only limited to taking [this 
change|https://github.com/apache/cassandra/commit/8da687241c70cf59bb05afe4e8d923c5d806ee06]
 instead.

> Java 11 support - remove the experimental flag
> --
>
> Key: CASSANDRA-16894
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16894
> Project: Cassandra
>  Issue Type: Task
>  Components: Build
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.0.x
>
>
> As per this 
> [thread|https://lists.apache.org/thread.html/r06a4768bae215dc19469491a1faec64dd90e1c3d4d10ed34b85ba248%40%3Cdev.cassandra.apache.org%3E],
> the goal of this ticket is to remove the experimental flag for Java 11 support.






[jira] [Updated] (CASSANDRA-17180) Implement heartbeat service to know last time Cassandra node was up

2021-12-01 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-17180:
--
Test and Documentation Plan: unit test
 Status: Patch Available  (was: Open)

https://github.com/apache/cassandra/pull/1351/files

> Implement heartbeat service to know last time Cassandra node was up
> ---
>
> Key: CASSANDRA-17180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17180
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Observability
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As already discussed on the ML, it would be nice to have a service that 
> periodically writes a timestamp to a file, signalling that the node is up and 
> running.
> On startup, we would read this file and determine whether any table's gc 
> grace period has elapsed since that timestamp. If so, we would fail the 
> start, preventing zombie data from being spread around the cluster.
> https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw
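
The proposal above can be sketched roughly as follows. This is not the code from the linked pull request; every name here ({{write_heartbeat}}, {{read_heartbeat}}, {{tables_past_gc_grace}}, the file layout, and the per-table mapping of gc grace seconds) is a hypothetical stand-in chosen to show the shape of the check.

```python
from pathlib import Path


def write_heartbeat(path: Path, now: float) -> None:
    """Record `now` (seconds since the epoch) as the last time the node was up."""
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(repr(now))
    # Rename over the old file so a reader never sees a partially written value.
    tmp.replace(path)


def read_heartbeat(path: Path):
    """Return the last recorded timestamp, or None if no heartbeat exists yet."""
    try:
        return float(path.read_text())
    except FileNotFoundError:
        return None


def tables_past_gc_grace(last_heartbeat, now, gc_grace_seconds_by_table):
    """Names of tables whose gc grace period elapsed while the node was down."""
    if last_heartbeat is None:
        return []  # first start: nothing to compare against
    downtime = now - last_heartbeat
    return sorted(
        table for table, gc_grace in gc_grace_seconds_by_table.items()
        if downtime > gc_grace
    )
```

A startup check would call {{read_heartbeat}} once and refuse to start if {{tables_past_gc_grace}} returns a non-empty list, while a periodic task calls {{write_heartbeat}} during normal operation.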






[jira] [Updated] (CASSANDRA-17180) Implement heartbeat service to know last time Cassandra node was up

2021-12-01 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-17180:
--
Status: In Progress  (was: Patch Available)

> Implement heartbeat service to know last time Cassandra node was up
> ---
>
> Key: CASSANDRA-17180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17180
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Observability
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As already discussed on the ML, it would be nice to have a service that 
> periodically writes a timestamp to a file, signalling that the node is up and 
> running.
> On startup, we would read this file and determine whether any table's gc 
> grace period has elapsed since that timestamp. If so, we would fail the 
> start, preventing zombie data from being spread around the cluster.
> https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw






[jira] [Updated] (CASSANDRA-17180) Implement heartbeat service to know last time Cassandra node was up

2021-12-01 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-17180:
--
Description: 
As already discussed on the ML, it would be nice to have a service that 
periodically writes a timestamp to a file, signalling that the node is up and 
running.

On startup, we would read this file and determine whether any table's gc grace 
period has elapsed since that timestamp. If so, we would fail the start, 
preventing zombie data from being spread around the cluster.

https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw

  was:
As already discussed on the ML, it would be nice to have a service that 
periodically writes a timestamp to a file, signalling that the node is up and 
running.

On startup, we would read this file and determine whether any table's gc grace 
period has elapsed since that timestamp. If so, we would fail the start, 
preventing zombie data from being spread around the cluster.


> Implement heartbeat service to know last time Cassandra node was up
> ---
>
> Key: CASSANDRA-17180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17180
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Observability
>Reporter: Stefan Miklosovic
>Priority: Normal
>
> As already discussed on the ML, it would be nice to have a service that 
> periodically writes a timestamp to a file, signalling that the node is up and 
> running.
> On startup, we would read this file and determine whether any table's gc 
> grace period has elapsed since that timestamp. If so, we would fail the 
> start, preventing zombie data from being spread around the cluster.
> https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw






[jira] [Updated] (CASSANDRA-17180) Implement heartbeat service to know last time Cassandra node was up

2021-12-01 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-17180:
--
Change Category: Operability
 Complexity: Normal
 Status: Open  (was: Triage Needed)

> Implement heartbeat service to know last time Cassandra node was up
> ---
>
> Key: CASSANDRA-17180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17180
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Observability
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
>
> As already discussed on the ML, it would be nice to have a service that 
> periodically writes a timestamp to a file, signalling that the node is up and 
> running.
> On startup, we would read this file and determine whether any table's gc 
> grace period has elapsed since that timestamp. If so, we would fail the 
> start, preventing zombie data from being spread around the cluster.
> https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw






[jira] [Assigned] (CASSANDRA-17180) Implement heartbeat service to know last time Cassandra node was up

2021-12-01 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic reassigned CASSANDRA-17180:
-

Assignee: Stefan Miklosovic

> Implement heartbeat service to know last time Cassandra node was up
> ---
>
> Key: CASSANDRA-17180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17180
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Observability
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
>
> As already discussed on the ML, it would be nice to have a service that 
> periodically writes a timestamp to a file, signalling that the node is up and 
> running.
> On startup, we would read this file and determine whether any table's gc 
> grace period has elapsed since that timestamp. If so, we would fail the 
> start, preventing zombie data from being spread around the cluster.
> https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw






[jira] [Created] (CASSANDRA-17180) Implement heartbeat service to know last time Cassandra node was up

2021-12-01 Thread Stefan Miklosovic (Jira)
Stefan Miklosovic created CASSANDRA-17180:
-

 Summary: Implement heartbeat service to know last time Cassandra 
node was up
 Key: CASSANDRA-17180
 URL: https://issues.apache.org/jira/browse/CASSANDRA-17180
 Project: Cassandra
  Issue Type: New Feature
  Components: Legacy/Observability
Reporter: Stefan Miklosovic


As already discussed on the ML, it would be nice to have a service that 
periodically writes a timestamp to a file, signalling that the node is up and 
running.

On startup, we would read this file and determine whether any table's gc grace 
period has elapsed since that timestamp. If so, we would fail the start, 
preventing zombie data from being spread around the cluster.






[jira] [Commented] (CASSANDRA-17147) Guardrails prototype

2021-12-01 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451796#comment-17451796
 ] 

Andres de la Peña commented on CASSANDRA-17147:
---

bq. Once you merge into the PR can view whole patch together; think we are 
close!

I have added the nested config to the PR, rebased and run CI for 
[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1184/workflows/2c0a4c55-f740-41db-8356-9b222bb38815]
 and 
[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1184/workflows/ecab9e8a-f7a5-430b-bf13-884a7c54852f].

bq.  though would drop "_values" in table_properties

Makes sense, done.

bq. then CASSANDRA-17166 can fix your concerns for getter/setters; I know that 
the yaml will be interesting with your patch, but prefer CASSANDRA-17166 fixes 
those issues rather than you dealing with it in this patch (your getter/setters 
are currently exposed as camel case in yaml; found the same bug in 
TrackWarnings).

It would be great if we were able to use Jackson annotations and have private 
attributes. I guess that those annotations would also make things easier for 
the settings virtual table. In the meantime, I guess we could leave the camel 
case getters/setters and public snake case attributes and wait for a better 
solution in CASSANDRA-17166. We could also use snake casing in the 
getters/setters, but that would mean extending the snake casing workaround up 
to the getters in the {{Threshold.Config}}/{{Values.Config}} interfaces, which 
is not ideal IMO.

> Guardrails prototype
> 
>
> Key: CASSANDRA-17147
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17147
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/Guardrails
>Reporter: Andres de la Peña
>Assignee: Andres de la Peña
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 14h 50m
>  Remaining Estimate: 0h
>
> The purpose of this ticket is creating an initial implementation of the 
> guardrails framework, as well as adding a few simple guardrails using this 
> framework.
> To keep things easy, this initial implementation would only support 
> guardrails that are triggered on the coordinator, and they would be 
> dynamically updatable only through JMX.
> Once we have this initial framework ready in a feature branch we can have 
> multiple tickets addressing all the things that would have been left out of 
> the scope of this ticket, such as:
> * Dynamic updates through virtual tables
> * Being able to notify about guardrails triggered on replicas
> * Using custom exceptions other than {{InvalidRequestException}}.
> * Porting existing limits to use the new guardrails framework
> * Adding new guardrails beyond the initial ones
> The reason for having this simpler prototype is that it will give us a common 
> ground to parallelize work on the parts mentioned above.






[jira] [Commented] (CASSANDRA-17031) Add support for PEM based key material for SSL

2021-12-01 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451675#comment-17451675
 ] 

Stefan Miklosovic commented on CASSANDRA-17031:
---

I do not have a strong opinion about your first point. [~jonmeredith] what do 
you think?

I was thinking about having that in "plain text", i.e. showing what that cert 
looks like in plain text. I think right now it is pretty much just "rubbish"; it 
would be nice to have a textual representation of it in the form of some text 
dump, but if you find my idea silly, feel free to ignore it.

Thanks for taking care of this ticket. Believe it or not, I was thinking about 
pinging you these days to ask what's up. I hope we will manage to deliver this 
in the foreseeable future.

> Add support for PEM based key material for SSL
> --
>
> Key: CASSANDRA-17031
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17031
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Messaging/Internode
>Reporter: Maulin Vasavada
>Assignee: Maulin Vasavada
>Priority: Normal
> Fix For: 4.1
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> h1. Scope
> Currently Cassandra supports standard keystore types for SSL 
> keys/certificates. The scope of this enhancement is to add support for PEM 
> based key material (keys/certificate), given that PEM is a widely used common 
> format for the same. We intend to add support for Unencrypted and Password 
> Based Encrypted (PBE) PKCS#8 formatted Private Keys in PEM format with 
> standard algorithms (RSA, DSA and EC) along with the certificate chain for 
> the private key and PEM based X509 certificates. The work here is going to be 
> built on top of [CEP-9: Make SSLContext creation 
> pluggable|https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-9%3A+Make+SSLContext+creation+pluggable]
>  for which the code is merged for Apache Cassandra 4.1 release.
> We intend to support the key material being configured either as direct PEM 
> values OR via a file (configured with keystore and truststore configurations 
> today). We are not going to model PEM as a valid 'store_type' given that 
> 'store_type' has a [specific 
> definition|https://docs.oracle.com/en/java/javase/11/security/java-cryptography-architecture-jca-reference-guide.html#GUID-AB51DEFD-5238-4F96-967F-082F6D34FBEA].
>  
> h1. Approach
> Create an implementation for 
> [ISslContextFactory|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/security/ISslContextFactory.java]
>  extending 
> [FileBasedSslContextFactory|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/security/FileBasedSslContextFactory.java]
>  implementation to add PEM formatted key/certificates.
> h1. Motivation
> PEM is a widely used format for encoding Private Keys and X.509 Certificates 
> and Apache Cassandra's current implementation lacks the support for 
> specifying the PEM formatted key material for SSL configurations. This means 
> operators have to re-create the key material to comply with the supported 
> formats (using key/trust store types - jks, pkcs12 etc) and deal with an 
> operational task for managing it. This is an operational overhead we can 
> avoid by supporting the PEM format, making Apache Cassandra even more customer 
> friendly and driving more adoption.
> h1. Proposed Changes
>  # A new implementation of ISslContextFactory, PEMBasedSslContextFactory, 
> with the following supported configuration:
> {panel:title=New configurations}
> {panel}
> |{{encryption_options:}}
>  {{  ssl_context_factory:}}
>  {{    class_name: org.apache.cassandra.security.PEMBasedSslContextFactory}}
>  {{    parameters:}}
>  {{      private_key:  certificate chain>}}
>  {{      private_key_password:  private key if it is encrypted>}}
>  {{      trusted_certificates: }}|
> *NOTE:* We could reuse 'keystore_password' instead of adding 
> 'private_key_password'. However, a PEM-encoded private key is not a 
> 'keystore' in itself, so piggybacking on that option would be inappropriate; 
> its only benefit would be avoiding a duplicate, similar field.
>  # The PEMBasedSslContextFactory will also support file-based key material 
> (and the corresponding hot reloading based on file timestamp updates) for the 
> PEM format via the existing 'keystore' and 'truststore' encryption options. 
> However, in that case the 'truststore_password' configuration won't be used, 
> since PEM-formatted certificates for a truststore generally don't get 
> encrypted with a password.
>  # The PEMBasedSslContextFactory will internally create a PKCS12 keystore for 
> the private key and the trusted certificates. However, this doesn't impact 
> the user of the implementation in any way and it is mentioned for clar
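
The 'private_key' parameter described above appears to carry a PKCS#8 private key followed by its certificate chain in a single PEM value. As a rough illustration of that layout, here is a minimal Python sketch that splits such a combined value into its parts. It is not the Java implementation proposed in the ticket: the function names (split_pem_bundle, classify_key_material, to_pem) are hypothetical, and the block labels are assumed to be the standard PEM textual labels for PKCS#8 keys and X.509 certificates.

```python
import base64
import re

# Matches one PEM block: a BEGIN line, a base64 body, and the matching END line.
PEM_BLOCK = re.compile(
    r"-----BEGIN (?P<label>[A-Z0-9 ]+)-----\s*(?P<body>.*?)\s*-----END (?P=label)-----",
    re.DOTALL,
)

# Assumed labels: unencrypted and password-based-encrypted PKCS#8 private keys.
PRIVATE_KEY_LABELS = ("PRIVATE KEY", "ENCRYPTED PRIVATE KEY")


def split_pem_bundle(bundle):
    """Return (label, der_bytes) pairs in the order they appear in the text."""
    blocks = []
    for match in PEM_BLOCK.finditer(bundle):
        # Drop line breaks and padding whitespace before decoding the body.
        body = "".join(match.group("body").split())
        blocks.append((match.group("label"), base64.b64decode(body)))
    return blocks


def classify_key_material(bundle):
    """Separate private keys from certificates, preserving chain order."""
    keys, certs = [], []
    for label, der in split_pem_bundle(bundle):
        if label in PRIVATE_KEY_LABELS:
            keys.append((label, der))
        elif label == "CERTIFICATE":
            certs.append(der)
    return keys, certs


def to_pem(label, der):
    """Encode raw bytes as a PEM block with 64-character base64 lines."""
    b64 = base64.b64encode(der).decode("ascii")
    lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    body = "\n".join(lines)
    return f"-----BEGIN {label}-----\n{body}\n-----END {label}-----\n"
```

The sketch only separates and decodes the blocks; decrypting a password-protected key or validating the certificates is out of its scope. The 'ENCRYPTED PRIVATE KEY' label is what would signal that a password such as 'private_key_password' is needed.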

[jira] [Commented] (CASSANDRA-16310) Track top partitions (by size and tombstone count) and display in nodetool tablestats

2021-12-01 Thread Marcus Eriksson (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451618#comment-17451618
 ] 

Marcus Eriksson commented on CASSANDRA-16310:
-

bq. switching from blob to text to represent the partition keys in the schema.
Yes, this was done to let operators run {{select * from system.top_partitions where 
...}} instead of going through nodetool/JMX.

> Track top partitions (by size and tombstone count) and display in nodetool 
> tablestats
> -
>
> Key: CASSANDRA-16310
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16310
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Other
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 4.x
>
>
> We should track the top partitions by size and tombstone count and display in 
> {{nodetool tablestats}}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-16994) WEBSITE - September 2021 website edits

2021-12-01 Thread Erick Ramirez (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Ramirez updated CASSANDRA-16994:
--
Status: Changes Suggested  (was: Review In Progress)

> WEBSITE - September 2021 website edits
> --
>
> Key: CASSANDRA-16994
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16994
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation/Website
>Reporter: Diogenese Topper
>Assignee: Erick Ramirez
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0.x
>
>
> Edits and fixes for formatting and typos across the website.
> Can be closed once the changes are merged.






[jira] [Comment Edited] (CASSANDRA-16994) WEBSITE - September 2021 website edits

2021-12-01 Thread Erick Ramirez (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451589#comment-17451589
 ] 

Erick Ramirez edited comment on CASSANDRA-16994 at 12/1/21, 8:01 AM:
-

[~diotopper] I've reviewed the PR and several items need fixing:
 # Date for ApacheCon post on blog index has already been fixed by 
CASSANDRA-16980 (PR [#73|https://github.com/apache/cassandra-website/pull/73]).
 # Copy of {{blog.adoc}} is out-of-date and needs to be re-based from 
{{trunk}}.
 # Update to Changelog no. 9 has already been fixed by CASSANDRA-16980 (PR 
[#73|https://github.com/apache/cassandra-website/pull/73]).


was (Author: flightc):
[~diotopper] I've reviewed the PR and several items need fixing:
 # Date for ApacheCon post on blog index has already been fixed by 
CASSANDRA-16980 (PR [#73|https://github.com/apache/cassandra-website/pull/73]).
 # Copy of `blog.adoc` is out-of-date and needs to be re-based from `trunk`.
 # Update to Changelog no. 9 has already been fixed by CASSANDRA-16980 (PR 
[#73|https://github.com/apache/cassandra-website/pull/73]).







[jira] [Commented] (CASSANDRA-16994) WEBSITE - September 2021 website edits

2021-12-01 Thread Erick Ramirez (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451589#comment-17451589
 ] 

Erick Ramirez commented on CASSANDRA-16994:
---

[~diotopper] I've reviewed the PR and several items need fixing:
 # Date for ApacheCon post on blog index has already been fixed by 
CASSANDRA-16980 (PR [#73|https://github.com/apache/cassandra-website/pull/73]).
 # Copy of `blog.adoc` is out-of-date and needs to be re-based from `trunk`.
 # Update to Changelog no. 9 has already been fixed by CASSANDRA-16980 (PR 
[#73|https://github.com/apache/cassandra-website/pull/73]).



