Re: [PR] Minor fix to unit test [cassandra-java-driver]

2024-04-22 Thread via GitHub


absurdfarce merged PR #1930:
URL: https://github.com/apache/cassandra-java-driver/pull/1930





(cassandra-java-driver) branch 4.x updated: Initial fix to unit tests

2024-04-22 Thread absurdfarce
This is an automated email from the ASF dual-hosted git repository.

absurdfarce pushed a commit to branch 4.x
in repository https://gitbox.apache.org/repos/asf/cassandra-java-driver.git


The following commit(s) were added to refs/heads/4.x by this push:
 new 07265b4a6 Initial fix to unit tests
07265b4a6 is described below

commit 07265b4a6830a47752bf31eb4f631b9917863da2
Author: absurdfarce 
AuthorDate: Tue Apr 23 00:38:48 2024 -0500

Initial fix to unit tests

patch by Bret McGuire; reviewed by Bret McGuire for PR 1930
---
 .../oss/driver/internal/core/session/DefaultSession.java | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/core/src/main/java/com/datastax/oss/driver/internal/core/session/DefaultSession.java b/core/src/main/java/com/datastax/oss/driver/internal/core/session/DefaultSession.java
index cb1271c9c..6f063ae9a 100644
--- a/core/src/main/java/com/datastax/oss/driver/internal/core/session/DefaultSession.java
+++ b/core/src/main/java/com/datastax/oss/driver/internal/core/session/DefaultSession.java
@@ -39,6 +39,7 @@ import com.datastax.oss.driver.internal.core.metadata.MetadataManager;
 import com.datastax.oss.driver.internal.core.metadata.MetadataManager.RefreshSchemaResult;
 import com.datastax.oss.driver.internal.core.metadata.NodeStateEvent;
 import com.datastax.oss.driver.internal.core.metadata.NodeStateManager;
+import com.datastax.oss.driver.internal.core.metrics.NodeMetricUpdater;
 import com.datastax.oss.driver.internal.core.metrics.SessionMetricUpdater;
 import com.datastax.oss.driver.internal.core.pool.ChannelPool;
 import com.datastax.oss.driver.internal.core.util.Loggers;
@@ -549,10 +550,11 @@ public class DefaultSession implements CqlSession {
 
       // clear metrics to prevent memory leak
       for (Node n : metadataManager.getMetadata().getNodes().values()) {
-        ((DefaultNode) n).getMetricUpdater().clearMetrics();
+        NodeMetricUpdater updater = ((DefaultNode) n).getMetricUpdater();
+        if (updater != null) updater.clearMetrics();
       }
 
-      DefaultSession.this.metricUpdater.clearMetrics();
+      if (metricUpdater != null) metricUpdater.clearMetrics();
 
       List<CompletionStage<Void>> childrenCloseStages = new ArrayList<>();
       for (AsyncAutoCloseable closeable : internalComponentsToClose()) {
@@ -575,10 +577,11 @@ public class DefaultSession implements CqlSession {
 
       // clear metrics to prevent memory leak
       for (Node n : metadataManager.getMetadata().getNodes().values()) {
-        ((DefaultNode) n).getMetricUpdater().clearMetrics();
+        NodeMetricUpdater updater = ((DefaultNode) n).getMetricUpdater();
+        if (updater != null) updater.clearMetrics();
       }
 
-      DefaultSession.this.metricUpdater.clearMetrics();
+      if (metricUpdater != null) metricUpdater.clearMetrics();
 
       if (closeWasCalled) {
         // onChildrenClosed has already been scheduled

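The change above simply makes metric cleanup tolerant of updaters that were never initialized, presumably what the affected unit test tripped over. A minimal standalone sketch of the pattern, using hypothetical stand-in types rather than the driver's real internals:

{code:java}
import java.util.Arrays;
import java.util.List;

public class MetricCleanupSketch {

    /** Hypothetical stand-in for the driver's NodeMetricUpdater. */
    interface MetricUpdater {
        void clearMetrics();
    }

    /** Hypothetical stand-in for a node whose updater may never have been set (e.g. in a test). */
    static class Node {
        private final MetricUpdater updater; // may be null
        Node(MetricUpdater updater) { this.updater = updater; }
        MetricUpdater getMetricUpdater() { return updater; }
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(
                new Node(() -> System.out.println("metrics cleared")),
                new Node(null)); // without the guard, this entry triggers a NullPointerException

        for (Node n : nodes) {
            MetricUpdater updater = n.getMetricUpdater();
            if (updater != null) updater.clearMetrics(); // the guard introduced by the patch
        }
    }
}
{code}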




[PR] Minor fix to unit test [cassandra-java-driver]

2024-04-22 Thread via GitHub


absurdfarce opened a new pull request, #1930:
URL: https://github.com/apache/cassandra-java-driver/pull/1930

   The recent metrics changes to prevent session leakage ([this PR](https://github.com/apache/cassandra-java-driver/pull/1916)) introduced a small issue in one of the unit tests. This PR addresses that issue.
   
   A combo branch containing this fix + [the fix for CASSANDRA-19292](https://github.com/apache/cassandra-java-driver/pull/1924) passed all unit and integration tests in a local run using Cassandra 4.1.





(cassandra-website) branch asf-staging updated (690a3a5d -> 916d1569)

2024-04-22 Thread git-site-role
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


omit 690a3a5d generate docs for cc1c7113
 new 916d1569 generate docs for cc1c7113

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (690a3a5d)
            \
             N -- N -- N   refs/heads/asf-staging (916d1569)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 site-ui/build/ui-bundle.zip | Bin 4883646 -> 4883646 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)





[jira] [Commented] (CASSANDRA-19580) Unable to contact any seeds with node in hibernate status

2024-04-22 Thread Brandon Williams (Jira)


[ https://issues.apache.org/jira/browse/CASSANDRA-19580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839924#comment-17839924 ]

Brandon Williams commented on CASSANDRA-19580:
--

The purpose of hibernate is to have other nodes ignore the dead state; otherwise they will see the old node as alive and just mark it back UP.

If you have internode_compression=dc then replacement with the same IP will not work; you need to use a different IP, because the compression has already been negotiated on the other nodes.



> Unable to contact any seeds with node in hibernate status
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-19580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19580
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Cameron Zemek
>            Priority: Normal
>
> We have a customer running into the error 'Unable to contact any seeds!'. I have been able to reproduce this issue if I kill Cassandra as it's joining, which will put the node into hibernate status. Once a node is in hibernate it will no longer receive any SYN messages from other nodes during startup, and as it sends only itself as a digest in outbound SYN messages it never receives any states in any of the ACK replies. So once it gets to the `seenAnySeed` check it fails, as the endpointStateMap is empty.
>  
> A workaround is copying the system.peers table from another node, but this is less than ideal. I tested modifying maybeGossipToSeed as follows:
> {code:java}
>     /* Possibly gossip to a seed for facilitating partition healing */
>     private void maybeGossipToSeed(MessageOut<GossipDigestSyn> prod)
>     {
>         int size = seeds.size();
>         if (size > 0)
>         {
>             if (size == 1 && seeds.contains(FBUtilities.getBroadcastAddress()))
>             {
>                 return;
>             }
>             if (liveEndpoints.size() == 0)
>             {
>                 List<GossipDigest> gDigests = prod.payload.gDigests;
>                 if (gDigests.size() == 1 && gDigests.get(0).endpoint.equals(FBUtilities.getBroadcastAddress()))
>                 {
>                     gDigests = new ArrayList<GossipDigest>();
>                     GossipDigestSyn digestSynMessage = new GossipDigestSyn(DatabaseDescriptor.getClusterName(),
>                                                                            DatabaseDescriptor.getPartitionerName(),
>                                                                            gDigests);
>                     MessageOut<GossipDigestSyn> message = new MessageOut<GossipDigestSyn>(MessagingService.Verb.GOSSIP_DIGEST_SYN,
>                                                                                           digestSynMessage,
>                                                                                           GossipDigestSyn.serializer);
>                     sendGossip(message, seeds);
>                 }
>                 else
>                 {
>                     sendGossip(prod, seeds);
>                 }
>             }
>             else
>             {
>                 /* Gossip with the seed with some probability. */
>                 double probability = seeds.size() / (double) (liveEndpoints.size() + unreachableEndpoints.size());
>                 double randDbl = random.nextDouble();
>                 if (randDbl <= probability)
>                     sendGossip(prod, seeds);
>             }
>         }
>     }
> {code}
> The only problem is this is the same as the SYN from a shadow round. It does resolve the issue, however, as the node then receives an ACK with all the states.






[jira] [Comment Edited] (CASSANDRA-19580) Unable to contact any seeds with node in hibernate status

2024-04-22 Thread Cameron Zemek (Jira)


[ https://issues.apache.org/jira/browse/CASSANDRA-19580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839901#comment-17839901 ]

Cameron Zemek edited comment on CASSANDRA-19580 at 4/23/24 1:03 AM:


[~brandon.williams] do you know why it needs to use Hibernate for replacement for the same address? CASSANDRA-8523 added the BOOT_REPLACE status. I am not sure what I am breaking by doing this:
{code:java}
    public void prepareToJoin() throws ConfigurationException
    {
        // omitted for brevity
                else if (isReplacingSameAddress())
                {
                    // only go into hibernate state if replacing the same address (CASSANDRA-8523)
                    logger.warn("Writes will not be forwarded to this node during replacement because it has the same address as " +
                                "the node to be replaced ({}). If the previous node has been down for longer than max_hint_window_in_ms, " +
                                "repair must be run after the replacement process in order to make this node consistent.",
                                DatabaseDescriptor.getReplaceAddress());
                    appStates.put(ApplicationState.STATUS, valueFactory.bootReplacing(DatabaseDescriptor.getReplaceAddress()));
                }{code}
This stops the issue, as the node is no longer put into hibernate during replacement. So if the replacement fails, it is not left in a dead state.




[jira] [Commented] (CASSANDRA-19580) Unable to contact any seeds with node in hibernate status

2024-04-22 Thread Cameron Zemek (Jira)


[ https://issues.apache.org/jira/browse/CASSANDRA-19580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839901#comment-17839901 ]

Cameron Zemek commented on CASSANDRA-19580:
---

[~brandon.williams] do you know why it needs to use Hibernate for replacement for the same address? CASSANDRA-8523 added the BOOT_REPLACE status. I am not sure what I am breaking by doing this:
{code:java}
    public void prepareToJoin() throws ConfigurationException
    {
        // omitted for brevity
                else if (isReplacingSameAddress())
                {
                    // only go into hibernate state if replacing the same address (CASSANDRA-8523)
                    logger.warn("Writes will not be forwarded to this node during replacement because it has the same address as " +
                                "the node to be replaced ({}). If the previous node has been down for longer than max_hint_window_in_ms, " +
                                "repair must be run after the replacement process in order to make this node consistent.",
                                DatabaseDescriptor.getReplaceAddress());
                    appStates.put(ApplicationState.STATUS, valueFactory.bootReplacing(DatabaseDescriptor.getReplaceAddress()));
                }{code}
This stops the issue, as the node is no longer put into hibernate during replacement. So if the replacement fails, it is not left in a dead state.




[jira] [Updated] (CASSANDRA-19566) JSON encoded timestamp value does not always match non-JSON encoded value

2024-04-22 Thread Stefan Miklosovic (Jira)


 [ https://issues.apache.org/jira/browse/CASSANDRA-19566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Miklosovic updated CASSANDRA-19566:
------------------------------------------
          Fix Version/s: 4.0.13
                         4.1.5
                         5.0-beta2
                         5.1-alpha1
                             (was: 5.x)
                             (was: 4.0.x)
                             (was: 4.1.x)
                             (was: 5.0.x)
          Since Version: 4.0
    Source Control Link: https://github.com/apache/cassandra/commit/fba4a85b971a00e982361282cf6ea46f8ccf0cd1
             Resolution: Fixed
                 Status: Resolved  (was: Ready to Commit)

> JSON encoded timestamp value does not always match non-JSON encoded value
> --------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19566
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19566
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Core, Legacy/CQL
>            Reporter: Bowen Song
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 4.0.13, 4.1.5, 5.0-beta2, 5.1-alpha1
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Description:
> "SELECT JSON ..." and "toJson(...)" on Cassandra 4.1.4 produce a different date than "SELECT ..." for some timestamp type values.
>  
> Steps to reproduce:
> {code:java}
> $ sudo docker pull cassandra:4.1.4
> $ sudo docker create --name cass cassandra:4.1.4
> $ sudo docker start cass
> $ # wait until the Cassandra instance becomes ready
> $ sudo docker exec -ti cass cqlsh
> Connected to Test Cluster at 127.0.0.1:9042
> [cqlsh 6.1.0 | Cassandra 4.1.4 | CQL spec 3.4.6 | Native protocol v5]
> Use HELP for help.
> cqlsh> create keyspace test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
> cqlsh> use test;
> cqlsh:test> create table tbl (id int, ts timestamp, primary key (id));
> cqlsh:test> insert into tbl (id, ts) values (1, -1376701920);
> cqlsh:test> select tounixtimestamp(ts), ts, tojson(ts) from tbl where id=1;
>  system.tounixtimestamp(ts) | ts                              | system.tojson(ts)
> ----------------------------+---------------------------------+----------------------------
>                 -1376701920 | 1533-09-28 12:00:00.000000+0000 | "1533-09-18 12:00:00.000Z"
> (1 rows)
> cqlsh:test> select json * from tbl where id=1;
>  [json]
> ---------------------------------------------
>  {"id": 1, "ts": "1533-09-18 12:00:00.000Z"}
> (1 rows)
> {code}
>  
> Expected behaviour:
> The "select ts", "select tojson(ts)" and "select json *" should all produce the same date.
>  
> Actual behaviour:
> The "select ts" produced the "1533-09-28" date but the "select tojson(ts)" and "select json *" produced the "1533-09-18" date.

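A plausible mechanism for the ten-day gap (an inference, not stated in the ticket): java.util's GregorianCalendar switches to the Julian calendar for instants before the October 1582 cutover, while java.time uses the proleptic Gregorian calendar throughout, so the same instant can format to two different dates. A minimal sketch:

{code:java}
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Date;
import java.util.TimeZone;

public class CalendarCutoverDemo {
    public static void main(String[] args) {
        // Hypothetical millisecond value landing in the year 1533 (not the ticket's exact value)
        long millis = -13767019200000L;

        // Legacy path: GregorianCalendar/SimpleDateFormat render pre-1582 instants on the Julian calendar
        SimpleDateFormat legacy = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        legacy.setTimeZone(TimeZone.getTimeZone("UTC"));
        System.out.println(legacy.format(new Date(millis)));              // Julian rendering

        // Modern path: java.time is proleptic Gregorian all the way back
        DateTimeFormatter modern =
                DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC);
        System.out.println(modern.format(Instant.ofEpochMilli(millis)));  // ten days later
    }
}
{code}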





(cassandra) branch cassandra-4.1 updated (445ae1a4b1 -> e471a57dc2)

2024-04-22 Thread smiklosovic
This is an automated email from the ASF dual-hosted git repository.

smiklosovic pushed a change to branch cassandra-4.1
in repository https://gitbox.apache.org/repos/asf/cassandra.git


from 445ae1a4b1 Merge branch 'cassandra-4.0' into cassandra-4.1
 add fba4a85b97 Fix CQL tojson timestamp output on negative timestamp values before Gregorian calendar reform in 1582
 add e471a57dc2 Merge branch 'cassandra-4.0' into cassandra-4.1

No new revisions were added by this update.

Summary of changes:
 CHANGES.txt|  1 +
 .../org/apache/cassandra/db/marshal/DateType.java  |  2 +-
 .../apache/cassandra/db/marshal/TimestampType.java |  2 +-
 .../cassandra/serializers/TimestampSerializer.java | 35 --
 4 files changed, 23 insertions(+), 17 deletions(-)





(cassandra) branch cassandra-5.0 updated (82297b490e -> 93692a4b17)

2024-04-22 Thread smiklosovic
This is an automated email from the ASF dual-hosted git repository.

smiklosovic pushed a change to branch cassandra-5.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git


from 82297b490e Merge branch 'cassandra-4.1' into cassandra-5.0
 add fba4a85b97 Fix CQL tojson timestamp output on negative timestamp values before Gregorian calendar reform in 1582
 add e471a57dc2 Merge branch 'cassandra-4.0' into cassandra-4.1
 add 93692a4b17 Merge branch 'cassandra-4.1' into cassandra-5.0

No new revisions were added by this update.

Summary of changes:
 CHANGES.txt|  1 +
 .../org/apache/cassandra/db/marshal/DateType.java  |  2 +-
 .../apache/cassandra/db/marshal/TimestampType.java |  2 +-
 .../cassandra/serializers/TimestampSerializer.java | 35 --
 4 files changed, 23 insertions(+), 17 deletions(-)





(cassandra) branch trunk updated (c5c4cd4e57 -> 607d6d0361)

2024-04-22 Thread smiklosovic
This is an automated email from the ASF dual-hosted git repository.

smiklosovic pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


from c5c4cd4e57 Update use of transition plan in PrepareReplace
 add fba4a85b97 Fix CQL tojson timestamp output on negative timestamp values before Gregorian calendar reform in 1582
 add e471a57dc2 Merge branch 'cassandra-4.0' into cassandra-4.1
 add 93692a4b17 Merge branch 'cassandra-4.1' into cassandra-5.0
 new 607d6d0361 Merge branch 'cassandra-5.0' into trunk

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGES.txt|  1 +
 .../org/apache/cassandra/db/marshal/DateType.java  |  2 +-
 .../apache/cassandra/db/marshal/TimestampType.java |  2 +-
 .../cassandra/serializers/TimestampSerializer.java | 35 --
 4 files changed, 23 insertions(+), 17 deletions(-)





(cassandra) 01/01: Merge branch 'cassandra-5.0' into trunk

2024-04-22 Thread smiklosovic
This is an automated email from the ASF dual-hosted git repository.

smiklosovic pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit 607d6d0361184e478fcc1cd00be89c864a3a7455
Merge: c5c4cd4e57 93692a4b17
Author: Stefan Miklosovic 
AuthorDate: Tue Apr 23 01:14:31 2024 +0200

Merge branch 'cassandra-5.0' into trunk

 CHANGES.txt|  1 +
 .../org/apache/cassandra/db/marshal/DateType.java  |  2 +-
 .../apache/cassandra/db/marshal/TimestampType.java |  2 +-
 .../cassandra/serializers/TimestampSerializer.java | 35 --
 4 files changed, 23 insertions(+), 17 deletions(-)

diff --cc CHANGES.txt
index 3e2fa40c1e,1d51d2d27d..cf55f7b91c
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -72,9 -36,10 +72,10 @@@ Merged from 4.1
   * Fix hints delivery for a node going down repeatedly (CASSANDRA-19495)
   * Do not go to disk for reading hints file sizes (CASSANDRA-19477)
   * Fix system_views.settings to handle array types (CASSANDRA-19475)
  - * Memoize Cassandra verion and add a backoff interval for failed schema pulls (CASSANDRA-18902)
   * Fix StackOverflowError on ALTER after many previous schema changes (CASSANDRA-19166)
  + * Memoize Cassandra verion (CASSANDRA-18902)
  Merged from 4.0:
+  * Fix CQL tojson timestamp output on negative timestamp values before Gregorian calendar reform in 1582 (CASSANDRA-19566)
   * Fix few types issues and implement types compatibility tests (CASSANDRA-19479)
   * Change logging to TRACE when failing to get peer certificate (CASSANDRA-19508)
   * Push LocalSessions info logs to debug (CASSANDRA-18335)





(cassandra) branch cassandra-4.0 updated (f92998190c -> fba4a85b97)

2024-04-22 Thread smiklosovic
This is an automated email from the ASF dual-hosted git repository.

smiklosovic pushed a change to branch cassandra-4.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git


from f92998190c Fix few types issues and implement types compatibility tests
 add fba4a85b97 Fix CQL tojson timestamp output on negative timestamp values before Gregorian calendar reform in 1582

No new revisions were added by this update.

Summary of changes:
 CHANGES.txt|  1 +
 .../org/apache/cassandra/db/marshal/DateType.java  |  2 +-
 .../apache/cassandra/db/marshal/TimestampType.java |  2 +-
 .../cassandra/serializers/TimestampSerializer.java | 35 --
 4 files changed, 23 insertions(+), 17 deletions(-)





[jira] [Commented] (CASSANDRA-19580) Unable to contact any seeds with node in hibernate status

2024-04-22 Thread Cameron Zemek (Jira)


[ https://issues.apache.org/jira/browse/CASSANDRA-19580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839887#comment-17839887 ]

Cameron Zemek commented on CASSANDRA-19580:
---

PS: the customer is not doing step 2. That's just my reliable way to reproduce the issue. I have seen this 'Unable to contact seeds!' error in the past but never had enough information to go on. It seems to happen on larger clusters.




[jira] [Commented] (CASSANDRA-19580) Unable to contact any seeds with node in hibernate status

2024-04-22 Thread Cameron Zemek (Jira)


[ https://issues.apache.org/jira/browse/CASSANDRA-19580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839886#comment-17839886 ]

Cameron Zemek commented on CASSANDRA-19580:
---

The node trying to replace. So in my reproduction steps:
 # replace a node using '-Dcassandra.replace_address=44.239.237.152'
 # while it's replacing, kill off Cassandra
 # wipe the Cassandra folders
 # start Cassandra again, still using the replace address flag

After step 2, if I check 'nodetool gossipinfo' the node being replaced (44.239.237.152 in this example) has a status of hibernate.

During step 4 the other nodes will say 'Not marking /44.239.237.152 alive due to dead state'.

I did a whole bunch of testing of this yesterday and this is the key issue as far as I can tell. Because the replacing node is in hibernate, the other nodes won't send it a SYN (see maybeGossipToUnreachableMember, which filters out endpoints in a dead state). And without the SYN message the replacing node never gets the gossip state of the cluster, as its own SYN messages carry only itself as a digest, so ACK replies to those don't include other nodes.

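To make the digest mechanics concrete, here is a toy model (simplified types, not Cassandra's real classes) of the responder-side behaviour described above: only endpoints named in the incoming SYN's digest list are examined, except that an empty digest list, as sent in a shadow round, gets the full state back.

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class GossipDigestToy {

    /**
     * Toy responder rule: an empty digest list (shadow-round style) receives every
     * known state; otherwise only the named endpoints are examined, and states the
     * responder does not know are requested rather than sent.
     */
    static Map<String, Integer> ackStatesFor(Set<String> synDigests, Map<String, Integer> known) {
        if (synDigests.isEmpty()) {
            return new HashMap<>(known); // fresh start: send everything
        }
        Map<String, Integer> ack = new HashMap<>();
        for (String endpoint : synDigests) {
            if (known.containsKey(endpoint)) {
                ack.put(endpoint, known.get(endpoint));
            }
        }
        return ack;
    }

    public static void main(String[] args) {
        Map<String, Integer> known = new HashMap<>();
        known.put("10.0.0.1", 42);
        known.put("10.0.0.2", 17);
        known.put("10.0.0.3", 23);

        // The replacing node's SYN names only itself, which the responder does not know:
        System.out.println(ackStatesFor(Set.of("10.0.0.9"), known)); // {} -> "0 states"

        // An empty digest list, as sent in a shadow round, draws out all states:
        System.out.println(ackStatesFor(Set.of(), known));
    }
}
{code}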



[jira] [Commented] (CASSANDRA-19580) Unable to contact any seeds with node in hibernate status

2024-04-22 Thread Brandon Williams (Jira)


[ https://issues.apache.org/jira/browse/CASSANDRA-19580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839885#comment-17839885 ]

Brandon Williams commented on CASSANDRA-19580:
--

If internode compression is enabled, replacing with the same address won't work 
because the negotiated compression is cached. This is a limitation that we need 
to document.




[jira] [Comment Edited] (CASSANDRA-19580) Unable to contact any seeds with node in hibernate status

2024-04-22 Thread Brandon Williams (Jira)


[ https://issues.apache.org/jira/browse/CASSANDRA-19580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839884#comment-17839884 ]

Brandon Williams edited comment on CASSANDRA-19580 at 4/22/24 11:05 PM:


Thanks, that was answered better than I asked. If you killed the replacement, 
which node is complaining about seeds?


was (Author: brandon.williams):
Thanks, that was answered better than I asked. If you killed the replacement, 
which nice is complaining about seeds?




[jira] [Commented] (CASSANDRA-19580) Unable to contact any seeds with node in hibernate status

2024-04-22 Thread Brandon Williams (Jira)


[ https://issues.apache.org/jira/browse/CASSANDRA-19580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839884#comment-17839884 ]

Brandon Williams commented on CASSANDRA-19580:
--

Thanks, that was answered better than I asked. If you killed the replacement, which node is complaining about seeds?




[jira] [Commented] (CASSANDRA-19580) Unable to contact any seeds with node in hibernate status

2024-04-22 Thread Cameron Zemek (Jira)


[ https://issues.apache.org/jira/browse/CASSANDRA-19580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839882#comment-17839882 ]

Cameron Zemek commented on CASSANDRA-19580:
---

Customer cluster has:

commitlog_compression=LZ4Compressor

hints_compression=null

internode_compression=dc

So it happens both with and without compression.




[jira] [Commented] (CASSANDRA-19580) Unable to contact any seeds with node in hibernate status

2024-04-22 Thread Cameron Zemek (Jira)


[ https://issues.apache.org/jira/browse/CASSANDRA-19580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839881#comment-17839881 ]

Cameron Zemek commented on CASSANDRA-19580:
---

[~brandon.williams] 

> Is compression enabled on this cluster?

Not sure which setting you're referring to. I just replicated the issue on a test cluster where I have:

commitlog_compression=null

internode_compression=none 

hint_compression=null




[jira] [Commented] (CASSANDRA-19580) Unable to contact any seeds with node in hibernate status

2024-04-22 Thread Cameron Zemek (Jira)


[ https://issues.apache.org/jira/browse/CASSANDRA-19580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839879#comment-17839879 ]

Cameron Zemek commented on CASSANDRA-19580:
---

Here is an extract of logs showing the issue:
{noformat}
INFO  [main] 2024-04-17 17:57:45,766 MessagingService.java:750 - Starting Messaging Service on /10.120.156.42:7000 (eth0)
INFO  [main] 2024-04-17 17:57:45,775 StorageService.java:681 - Gathering node replacement information for /10.120.156.42
TRACE [main] 2024-04-17 17:57:45,781 Gossiper.java:1613 - Sending shadow round GOSSIP DIGEST SYN to seeds [/10.120.156.17, /10.120.156.21, /10.120.156.9]
INFO  [main] 2024-04-17 17:57:45,788 OutboundTcpConnection.java:108 - OutboundTcpConnection using coalescing strategy DISABLED
INFO  [HANDSHAKE-/10.120.156.9] 2024-04-17 17:57:45,802 OutboundTcpConnection.java:561 - Handshaking version with /10.120.156.9
INFO  [HANDSHAKE-/10.120.156.17] 2024-04-17 17:57:45,803 OutboundTcpConnection.java:561 - Handshaking version with /10.120.156.17
INFO  [HANDSHAKE-/10.120.156.21] 2024-04-17 17:57:45,803 OutboundTcpConnection.java:561 - Handshaking version with /10.120.156.21
TRACE [GossipStage:1] 2024-04-17 17:57:45,875 GossipDigestAckVerbHandler.java:41 - Received a GossipDigestAckMessage from /10.120.156.9
TRACE [GossipStage:1] 2024-04-17 17:57:45,875 GossipDigestAckVerbHandler.java:52 - Received ack with 0 digests and 48 states
DEBUG [GossipStage:1] 2024-04-17 17:57:45,876 GossipDigestAckVerbHandler.java:57 - Received an ack from /10.120.156.9, which may trigger exit from shadow round
DEBUG [GossipStage:1] 2024-04-17 17:57:45,876 Gossiper.java:1802 - Received a regular ack from /10.120.156.9, can now exit shadow round
TRACE [GossipStage:1] 2024-04-17 17:57:45,876 GossipDigestAckVerbHandler.java:41 - Received a GossipDigestAckMessage from /10.120.156.21
TRACE [GossipStage:1] 2024-04-17 17:57:45,876 GossipDigestAckVerbHandler.java:45 - Ignoring GossipDigestAckMessage because gossip is disabled
TRACE [GossipStage:1] 2024-04-17 17:57:45,876 GossipDigestAckVerbHandler.java:41 - Received a GossipDigestAckMessage from /10.120.156.17
TRACE [GossipStage:1] 2024-04-17 17:57:45,876 GossipDigestAckVerbHandler.java:45 - Ignoring GossipDigestAckMessage because gossip is disabled
WARN  [main] 2024-04-17 17:57:46,825 StorageService.java:970 - Writes will not be forwarded to this node during replacement because it has the same address as the node to be replaced (/10.120.156.42). If the previous node has been down for longer than max_hint_window_in_ms, repair must be run after the replacement process in order to make this node consistent.
INFO  [main] 2024-04-17 17:57:46,827 StorageService.java:877 - Loading persisted ring state
INFO  [main] 2024-04-17 17:57:46,829 StorageService.java:1008 - Starting up server gossip
TRACE [main] 2024-04-17 17:57:46,854 Gossiper.java:1550 - gossip started with generation 171337
WARN  [main] 2024-04-17 17:57:46,883 StorageService.java:1099 - Detected previous bootstrap failure; retrying
INFO  [main] 2024-04-17 17:57:46,883 StorageService.java:1679 - JOINING: waiting for ring information
TRACE [GossipTasks:1] 2024-04-17 17:57:47,855 Gossiper.java:215 - My heartbeat is now 16
TRACE [GossipTasks:1] 2024-04-17 17:57:47,856 Gossiper.java:633 - Gossip Digests are : /10.120.156.42:171337:16
TRACE [GossipTasks:1] 2024-04-17 17:57:47,857 Gossiper.java:782 - Sending a GossipDigestSyn to /10.120.156.17 ...
TRACE [GossipTasks:1] 2024-04-17 17:57:47,857 Gossiper.java:911 - Performing status check ...
TRACE [GossipStage:1] 2024-04-17 17:57:47,858 GossipDigestAckVerbHandler.java:41 - Received a GossipDigestAckMessage from /10.120.156.17
TRACE [GossipStage:1] 2024-04-17 17:57:47,858 GossipDigestAckVerbHandler.java:52 - Received ack with 1 digests and 0 states
TRACE [GossipStage:1] 2024-04-17 17:57:47,858 Gossiper.java:1048 - local heartbeat version 16 greater than 0 for /10.120.156.42
TRACE [GossipStage:1] 2024-04-17 17:57:47,858 Gossiper.java:1063 - Adding state STATUS: hibernate,true
TRACE [GossipStage:1] 2024-04-17 17:57:47,858 Gossiper.java:1063 - Adding state SCHEMA: 59adb24e-f3cd-3e02-97f0-5b395827453f
TRACE [GossipStage:1] 2024-04-17 17:57:47,858 Gossiper.java:1063 - Adding state DC: us-west2
TRACE [GossipStage:1] 2024-04-17 17:57:47,858 Gossiper.java:1063 - Adding state RACK: c
TRACE [GossipStage:1] 2024-04-17 17:57:47,859 Gossiper.java:1063 - Adding state RELEASE_VERSION: 3.11.16
TRACE [GossipStage:1] 2024-04-17 17:57:47,859 Gossiper.java:1063 - Adding state 
INTERNAL_IP: 10.120.156.42
TRACE [GossipStage:1] 2024-04-17 17:57:47,859 Gossiper.java:1063 - Adding state 
RPC_ADDRESS: 10.120.156.42
TRACE [GossipStage:1] 2024-04-17 17:57:47,859 Gossiper.java:1063 - Adding state 
NET_VERSION: 11
TRACE [GossipStage:1] 2024-04-17 17:57:47,859 Gossiper.java:1063 - Adding state 
HOST_ID: 4477-a899-4cc1-a9f9-2

[jira] [Updated] (CASSANDRA-19563) [Analytics] Support bulk write via S3

2024-04-22 Thread Yifan Cai (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRA-19563:
--
  Fix Version/s: NA
Source Control Link: 
https://github.com/apache/cassandra-analytics/commit/aea798dc7e517af520a403d4d86f3bc6bed65092
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

Committed into trunk as 
[aea798|https://github.com/apache/cassandra-analytics/commit/aea798dc7e517af520a403d4d86f3bc6bed65092]

> [Analytics] Support bulk write via S3
> -
>
> Key: CASSANDRA-19563
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19563
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Analytics Library
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: NA
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> I would like to propose a new write option in Cassandra Analytics to bulk 
> write SSTables via S3, in addition to the previously-implemented "direct 
> upload to all sidecars" (now known as the "Direct" transport). 
> The new write option, now being implemented, is the "S3_COMPAT" transport, 
> which allows the job to upload the generated SSTables to an S3-compatible 
> storage system, and then inform the Cassandra Sidecar that those files are 
> available for download & commit.
> Additionally, a plug-in system was added to allow communications between 
> custom transport hooks and the job, so the custom hook can provide updated 
> credentials and out-of-band status updates on S3-related issues.
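
As a rough illustration of the plug-in idea, the hook's contract could look like the following; the interface and names here are hypothetical, not the actual cassandra-analytics API:

{code:java}
import java.util.Map;

// Hypothetical sketch only; the real cassandra-analytics hook API may differ.
public interface StorageTransportHook
{
    /** Supply refreshed credentials when the current ones are near expiry. */
    Map<String, String> refreshCredentials();

    /** Out-of-band status update for an object uploaded to the S3-compatible store. */
    void onTransportStatus(String bucket, String key, String status);
}
{code}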



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19563: Support bulk write via S3 [cassandra-analytics]

2024-04-22 Thread via GitHub


yifan-c merged PR #53:
URL: https://github.com/apache/cassandra-analytics/pull/53


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19563) [Analytics] Support bulk write via S3

2024-04-22 Thread Yifan Cai (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRA-19563:
--
Status: Ready to Commit  (was: Review In Progress)

> [Analytics] Support bulk write via S3
> -
>
> Key: CASSANDRA-19563
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19563
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Analytics Library
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> I would like to propose a new write option in Cassandra Analytics to bulk 
> write SSTables via S3, in addition to the previously-implemented "direct 
> upload to all sidecars" (now known as the "Direct" transport). 
> The new write option, now being implemented, is the "S3_COMPAT" transport, 
> which allows the job to upload the generated SSTables to an S3-compatible 
> storage system, and then inform the Cassandra Sidecar that those files are 
> available for download & commit.
> Additionally, a plug-in system was added to allow communications between 
> custom transport hooks and the job, so the custom hook can provide updated 
> credentials and out-of-band status updates on S3-related issues.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19580) Unable to contact any seeds with node in hibernate status

2024-04-22 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839876#comment-17839876
 ] 

Brandon Williams commented on CASSANDRA-19580:
--

Is compression enabled on this cluster?

> Unable to contact any seeds with node in hibernate status
> -
>
> Key: CASSANDRA-19580
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19580
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Cameron Zemek
>Priority: Normal
>
> We have a customer running into the error 'Unable to contact any seeds!'. I 
> have been able to reproduce this issue by killing Cassandra as it is joining, 
> which puts the node into hibernate status. Once a node is in hibernate it 
> will no longer receive any SYN messages from other nodes during startup, and 
> as it sends only itself as a digest in outbound SYN messages, it never 
> receives any states in any of the ACK replies. So once it gets to the 
> `seenAnySeed` check, it fails as the endpointStateMap is empty.
>  
> A workaround is copying the system.peers table from another node, but this is 
> less than ideal. I tested modifying maybeGossipToSeed as follows:
> {code:java}
>     /* Possibly gossip to a seed for facilitating partition healing */
>     private void maybeGossipToSeed(MessageOut<GossipDigestSyn> prod)
>     {
>         int size = seeds.size();
>         if (size > 0)
>         {
>             if (size == 1 && seeds.contains(FBUtilities.getBroadcastAddress()))
>             {
>                 return;
>             }
>             if (liveEndpoints.size() == 0)
>             {
>                 List<GossipDigest> gDigests = prod.payload.gDigests;
>                 // If the only digest we would send is our own, send an empty
>                 // digest list instead (like a shadow-round SYN) so the seed
>                 // replies with all the states it knows about.
>                 if (gDigests.size() == 1 && gDigests.get(0).endpoint.equals(FBUtilities.getBroadcastAddress()))
>                 {
>                     gDigests = new ArrayList<GossipDigest>();
>                     GossipDigestSyn digestSynMessage = new GossipDigestSyn(DatabaseDescriptor.getClusterName(),
>                                                                            DatabaseDescriptor.getPartitionerName(),
>                                                                            gDigests);
>                     MessageOut<GossipDigestSyn> message = new MessageOut<GossipDigestSyn>(MessagingService.Verb.GOSSIP_DIGEST_SYN,
>                                                                                           digestSynMessage,
>                                                                                           GossipDigestSyn.serializer);
>                     sendGossip(message, seeds);
>                 }
>                 else
>                 {
>                     sendGossip(prod, seeds);
>                 }
>             }
>             else
>             {
>                 /* Gossip with the seed with some probability. */
>                 double probability = seeds.size() / (double) (liveEndpoints.size() + unreachableEndpoints.size());
>                 double randDbl = random.nextDouble();
>                 if (randDbl <= probability)
>                     sendGossip(prod, seeds);
>             }
>         }
>     }
>  {code}
> The only problem is that this is the same as the SYN from a shadow round. It 
> does resolve the issue, however, as the node then receives an ACK with all the states.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19580) Unable to contact any seeds with node in hibernate status

2024-04-22 Thread Cameron Zemek (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839870#comment-17839870
 ] 

Cameron Zemek commented on CASSANDRA-19580:
---

[~brandon.williams] sorry, I did not clarify exactly what we are doing: node 
replacements, in particular for the same IP address. If I kill off the node 
during a node replacement, the other nodes in the cluster will have that 
replacing node in hibernate status. At that point you will always get 'Unable 
to contact any seeds!', as SYNs are not sent by the other nodes to the 
replacing node when they have it in HIBERNATE status, since that is a dead 
state.

In a working replacement the other nodes have it in SHUTDOWN state. Then, as 
part of bootstrap, the node gets marked as alive and one of the nodes ends up 
sending a SYN.

That is, if there is some failure during a node replacement, you end up in an 
unrecoverable state.
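
For reference, HIBERNATE is one of the gossiper's dead states, which is why a node in that status never becomes a live SYN target; a paraphrased sketch of the 3.11-era classification (not verbatim Cassandra code):

{code:java}
// Paraphrased sketch of 3.11-era Gossiper internals (not verbatim): a node
// whose STATUS is a dead state is marked down, never enters liveEndpoints,
// and is therefore never chosen for a regular gossip SYN.
static final List<String> DEAD_STATES = Arrays.asList(
    VersionedValue.REMOVING_TOKEN, VersionedValue.REMOVED_TOKEN,
    VersionedValue.STATUS_LEFT, VersionedValue.HIBERNATE);

boolean isDeadState(EndpointState epState)
{
    String status = getGossipStatus(epState); // e.g. "hibernate"
    return !status.isEmpty() && DEAD_STATES.contains(status);
}
{code}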

> Unable to contact any seeds with node in hibernate status
> -
>
> Key: CASSANDRA-19580
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19580
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Cameron Zemek
>Priority: Normal
>
> We have a customer running into the error 'Unable to contact any seeds!'. I 
> have been able to reproduce this issue by killing Cassandra as it is joining, 
> which puts the node into hibernate status. Once a node is in hibernate it 
> will no longer receive any SYN messages from other nodes during startup, and 
> as it sends only itself as a digest in outbound SYN messages, it never 
> receives any states in any of the ACK replies. So once it gets to the 
> `seenAnySeed` check, it fails as the endpointStateMap is empty.
>  
> A workaround is copying the system.peers table from another node, but this is 
> less than ideal. I tested modifying maybeGossipToSeed as follows:
> {code:java}
>     /* Possibly gossip to a seed for facilitating partition healing */
>     private void maybeGossipToSeed(MessageOut<GossipDigestSyn> prod)
>     {
>         int size = seeds.size();
>         if (size > 0)
>         {
>             if (size == 1 && seeds.contains(FBUtilities.getBroadcastAddress()))
>             {
>                 return;
>             }
>             if (liveEndpoints.size() == 0)
>             {
>                 List<GossipDigest> gDigests = prod.payload.gDigests;
>                 // If the only digest we would send is our own, send an empty
>                 // digest list instead (like a shadow-round SYN) so the seed
>                 // replies with all the states it knows about.
>                 if (gDigests.size() == 1 && gDigests.get(0).endpoint.equals(FBUtilities.getBroadcastAddress()))
>                 {
>                     gDigests = new ArrayList<GossipDigest>();
>                     GossipDigestSyn digestSynMessage = new GossipDigestSyn(DatabaseDescriptor.getClusterName(),
>                                                                            DatabaseDescriptor.getPartitionerName(),
>                                                                            gDigests);
>                     MessageOut<GossipDigestSyn> message = new MessageOut<GossipDigestSyn>(MessagingService.Verb.GOSSIP_DIGEST_SYN,
>                                                                                           digestSynMessage,
>                                                                                           GossipDigestSyn.serializer);
>                     sendGossip(message, seeds);
>                 }
>                 else
>                 {
>                     sendGossip(prod, seeds);
>                 }
>             }
>             else
>             {
>                 /* Gossip with the seed with some probability. */
>                 double probability = seeds.size() / (double) (liveEndpoints.size() + unreachableEndpoints.size());
>                 double randDbl = random.nextDouble();
>                 if (randDbl <= probability)
>                     sendGossip(prod, seeds);
>             }
>         }
>     }
>  {code}
> The only problem is that this is the same as the SYN from a shadow round. It 
> does resolve the issue, however, as the node then receives an ACK with all the states.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-15439) Token metadata for bootstrapping nodes is lost under temporary failures

2024-04-22 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839866#comment-17839866
 ] 

Brandon Williams edited comment on CASSANDRA-15439 at 4/22/24 9:54 PM:
---

Determining if the node is bootstrapping is only part of the problem; we still 
have to evict a bootstrapping node that never comes back at some point.  What 
that point is for fat clients has always been equivalent to RING_DELAY, which 
is fine.  For bootstrapping nodes we can choose a new limit with a new 
parameter and allow an override with -D to accommodate those who need it 
longer, without the other drawbacks of increasing RING_DELAY.


was (Author: brandon.williams):
Determining if the node is bootstrapping is only part of the problem, we still 
have to evict a bootstrapping node that never comes back at some point.  What 
point that is for fat clients has always been equivalent to RING_DELAY, which 
is fine.  For bootstrapping nodes we can choose a new limit with a new 
parameter and allow an override with -D to accommodate those who need it longer.

> Token metadata for bootstrapping nodes is lost under temporary failures
> ---
>
> Key: CASSANDRA-15439
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15439
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Membership
>Reporter: Josh Snyder
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> In CASSANDRA-8838, [~pauloricardomg] asked "hints will not be stored to the 
> bootstrapping node after RING_DELAY, since it will evicted from the TMD 
> pending ranges. Should we create a ticket to address this?"
> CASSANDRA-15264 relates to the most likely cause of such situations, where 
> the Cassandra daemon on the bootstrapping node completely crashes. Based on 
> testing with {{kill -STOP}} on a bootstrapping Cassandra JVM, I believe it 
> also is possible to remove token metadata (and thus pending ranges, and thus 
> hints) for a bootstrapping node, simply by affecting its status in the 
> failure detector. 
> A node in the cluster sees the bootstrapping node this way:
> {noformat}
> INFO  [GossipStage:1] 2019-11-27 20:41:41,101 Gossiper.java: - Node 
> /PUBLIC-IP is now part of the cluster
> INFO  [GossipStage:1] 2019-11-27 20:41:41,199 Gossiper.java:1073 - 
> InetAddress /PUBLIC-IP is now UP
> INFO  [HANDSHAKE-/PRIVATE-IP] 2019-11-27 20:41:41,412 
> OutboundTcpConnection.java:565 - Handshaking version with /PRIVATE-IP
> INFO  [STREAM-INIT-/PRIVATE-IP:21233] 2019-11-27 20:42:10,019 
> StreamResultFuture.java:112 - [Stream #6219a950-1156-11ea-b45d-4d30364576c4 
> ID#0] Creating new streaming plan for Bootstrap
> INFO  [STREAM-INIT-/PRIVATE-IP:21233] 2019-11-27 20:42:10,020 
> StreamResultFuture.java:119 - [Stream #6219a950-1156-11ea-b45d-4d30364576c4, 
> ID#0] Received streaming plan for Bootstrap
> INFO  [STREAM-INIT-/PRIVATE-IP:56003] 2019-11-27 20:42:10,112 
> StreamResultFuture.java:119 - [Stream #6219a950-1156-11ea-b45d-4d30364576c4, 
> ID#0] Received streaming plan for Bootstrap
> INFO  [STREAM-IN-/PUBLIC-IP] 2019-11-27 20:42:10,179 
> StreamResultFuture.java:169 - [Stream #6219a950-1156-11ea-b45d-4d30364576c4 
> ID#0] Prepare completed. Receiving 0 files(0 bytes), sending 833 
> files(139744616815 bytes)
> INFO  [GossipStage:1] 2019-11-27 20:54:47,547 Gossiper.java:1089 - 
> InetAddress /PUBLIC-IP is now DOWN
> INFO  [GossipTasks:1] 2019-11-27 20:54:57,551 Gossiper.java:849 - FatClient 
> /PUBLIC-IP has been silent for 3ms, removing from gossip
> {noformat}
> Since the bootstrapping node has no tokens, it is treated like a fat client, 
> and it is removed from the ring. For correctness purposes, I believe we must 
> keep storing hints for the downed bootstrapping node until it is either 
> assassinated or until a replacement attempts to bootstrap for the same token.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15439) Token metadata for bootstrapping nodes is lost under temporary failures

2024-04-22 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839866#comment-17839866
 ] 

Brandon Williams commented on CASSANDRA-15439:
--

Determining if the node is bootstrapping is only part of the problem; we still 
have to evict a bootstrapping node that never comes back at some point.  What 
that point is for fat clients has always been equivalent to RING_DELAY, which 
is fine.  For bootstrapping nodes we can choose a new limit with a new 
parameter and allow an override with -D to accommodate those who need it longer.
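
A minimal sketch of what that could look like; the property name and the bootstrap check are hypothetical, not an actual Cassandra flag:

{code:java}
// Hypothetical sketch: a dedicated silence limit for bootstrapping nodes,
// defaulting to RING_DELAY but overridable with
// -Dcassandra.bootstrap_evict_timeout_ms=...
static final long BOOTSTRAP_EVICT_TIMEOUT_MS =
    Long.getLong("cassandra.bootstrap_evict_timeout_ms", StorageService.RING_DELAY);

private boolean shouldEvictSilentEndpoint(InetAddress endpoint, long silentMillis)
{
    long limit = isBootstrapping(endpoint) // e.g. an isJoining-style check
               ? BOOTSTRAP_EVICT_TIMEOUT_MS
               : StorageService.RING_DELAY;
    return silentMillis > limit;
}
{code}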

> Token metadata for bootstrapping nodes is lost under temporary failures
> ---
>
> Key: CASSANDRA-15439
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15439
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Membership
>Reporter: Josh Snyder
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> In CASSANDRA-8838, [~pauloricardomg] asked "hints will not be stored to the 
> bootstrapping node after RING_DELAY, since it will evicted from the TMD 
> pending ranges. Should we create a ticket to address this?"
> CASSANDRA-15264 relates to the most likely cause of such situations, where 
> the Cassandra daemon on the bootstrapping node completely crashes. Based on 
> testing with {{kill -STOP}} on a bootstrapping Cassandra JVM, I believe it 
> also is possible to remove token metadata (and thus pending ranges, and thus 
> hints) for a bootstrapping node, simply by affecting its status in the 
> failure detector. 
> A node in the cluster sees the bootstrapping node this way:
> {noformat}
> INFO  [GossipStage:1] 2019-11-27 20:41:41,101 Gossiper.java: - Node 
> /PUBLIC-IP is now part of the cluster
> INFO  [GossipStage:1] 2019-11-27 20:41:41,199 Gossiper.java:1073 - 
> InetAddress /PUBLIC-IP is now UP
> INFO  [HANDSHAKE-/PRIVATE-IP] 2019-11-27 20:41:41,412 
> OutboundTcpConnection.java:565 - Handshaking version with /PRIVATE-IP
> INFO  [STREAM-INIT-/PRIVATE-IP:21233] 2019-11-27 20:42:10,019 
> StreamResultFuture.java:112 - [Stream #6219a950-1156-11ea-b45d-4d30364576c4 
> ID#0] Creating new streaming plan for Bootstrap
> INFO  [STREAM-INIT-/PRIVATE-IP:21233] 2019-11-27 20:42:10,020 
> StreamResultFuture.java:119 - [Stream #6219a950-1156-11ea-b45d-4d30364576c4, 
> ID#0] Received streaming plan for Bootstrap
> INFO  [STREAM-INIT-/PRIVATE-IP:56003] 2019-11-27 20:42:10,112 
> StreamResultFuture.java:119 - [Stream #6219a950-1156-11ea-b45d-4d30364576c4, 
> ID#0] Received streaming plan for Bootstrap
> INFO  [STREAM-IN-/PUBLIC-IP] 2019-11-27 20:42:10,179 
> StreamResultFuture.java:169 - [Stream #6219a950-1156-11ea-b45d-4d30364576c4 
> ID#0] Prepare completed. Receiving 0 files(0 bytes), sending 833 
> files(139744616815 bytes)
> INFO  [GossipStage:1] 2019-11-27 20:54:47,547 Gossiper.java:1089 - 
> InetAddress /PUBLIC-IP is now DOWN
> INFO  [GossipTasks:1] 2019-11-27 20:54:57,551 Gossiper.java:849 - FatClient 
> /PUBLIC-IP has been silent for 3ms, removing from gossip
> {noformat}
> Since the bootstrapping node has no tokens, it is treated like a fat client, 
> and it is removed from the ring. For correctness purposes, I believe we must 
> keep storing hints for the downed bootstrapping node until it is either 
> assassinated or until a replacement attempts to bootstrap for the same token.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19429) Remove lock contention generated by getCapacity function in SSTableReader

2024-04-22 Thread Dipietro Salvatore (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839865#comment-17839865
 ] 

Dipietro Salvatore commented on CASSANDRA-19429:


[~smiklosovic] We have also tested the patch on smaller instances (4xl and 
larger) using the same instructions I shared above. 
Here are the results:
 
|*Instance type*|*4.1.3 (on JDK11)*|*4.1.3 with patch (on JDK11)*|
|r7i.4xlarge|91k/ops 4.9ms@P99|129k/ops 2.4ms@P99 (1.4x)|
|r7i.8xlarge|139k/ops 2.5ms@P99|201k/ops 1.3ms@P99 (1.44x)|
|r7i.16xlarge|152k/ops 2.2ms@P99|277k/ops 0.8ms@P99 (1.8x)|

Starting from 4xl instances we saw a 40% increase in performance. 
Can you please try to test it on your side? 

Otherwise can you let us know how we can help to move forward with it (provide 
instances to test this, run some specific benchmarks, more validation tests, 
etc.)? Thanks

> Remove lock contention generated by getCapacity function in SSTableReader
> -
>
> Key: CASSANDRA-19429
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19429
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/SSTable
>Reporter: Dipietro Salvatore
>Assignee: Dipietro Salvatore
>Priority: Normal
> Fix For: 4.0.x, 4.1.x
>
> Attachments: Screenshot 2024-02-26 at 10.27.10.png, Screenshot 
> 2024-02-27 at 11.29.41.png, Screenshot 2024-03-19 at 15.22.50.png, 
> asprof_cass4.1.3__lock_20240216052912lock.html, 
> image-2024-03-08-15-51-30-439.png, image-2024-03-08-15-52-07-902.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Profiling Cassandra 4.1.3 on large AWS instances, a high number of lock 
> acquires is measured in the `getCapacity` function from 
> `org/apache/cassandra/cache/InstrumentingCache` (1.9M lock acquires per 60 
> seconds). Based on our tests on r8g.24xlarge instances (using Ubuntu 22.04), 
> this limits the CPU utilization of the system to under 50% when testing at 
> full load and therefore limits the achieved throughput.
> Removing the lock contention from the SSTableReader.java file by replacing 
> the call to `getCapacity` with `size` achieves up to a 2.95x increase in 
> throughput on r8g.24xlarge and 2x on r7i.24xlarge:
> |Instance type|Cass 4.1.3|Cass 4.1.3 patched|
> |r8g.24xlarge|168k ops|496k ops (2.95x)|
> |r7i.24xlarge|153k ops|304k ops (1.98x)|
>  
> Instructions to reproduce:
> {code:java}
> ## Requirements for Ubuntu 22.04
> sudo apt install -y ant git openjdk-11-jdk
> ## Build and run
> CASSANDRA_USE_JDK11=true ant realclean && CASSANDRA_USE_JDK11=true ant jar && CASSANDRA_USE_JDK11=true ant stress-build && rm -rf data && bin/cassandra -f -R
> # Run
> bin/cqlsh -e 'drop table if exists keyspace1.standard1;' && \
> bin/cqlsh -e 'drop keyspace if exists keyspace1;' && \
> bin/nodetool clearsnapshot --all && tools/bin/cassandra-stress write n=1000 cl=ONE -rate threads=384 -node 127.0.0.1 -log file=cload.log -graph file=cload.html && \
> bin/nodetool compact keyspace1 && sleep 30s && \
> tools/bin/cassandra-stress mixed ratio\(write=10,read=90\) duration=10m cl=ONE -rate threads=406 -node localhost -log file=result.log -graph file=graph.html
> {code}
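
To illustrate the general class of problem (a generic sketch, not the actual InstrumentingCache code): a getter that takes a shared lock on every read serializes the hot path, while a lock-free read of a volatile field does not.

{code:java}
// Generic illustration only; not the Cassandra implementation.
class MeteredCache
{
    private final Object lock = new Object();
    private volatile long entries;
    private long capacity;

    long getCapacity()
    {
        synchronized (lock) { return capacity; } // every reader contends here
    }

    long size()
    {
        return entries; // volatile read: no lock acquisition on the hot path
    }
}
{code}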



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19566) JSON encoded timestamp value does not always match non-JSON encoded value

2024-04-22 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839859#comment-17839859
 ] 

Brandon Williams commented on CASSANDRA-19566:
--

If you dig into the 5.0 j11 failure it's a red herring; it's clean.  There 
seems to be an unrelated problem in test_parallel_upgrade on the upgrade tests, 
but the rest have passed. I am +1 and will open a ticket for 
test_change_durable_writes (again).

> JSON encoded timestamp value does not always match non-JSON encoded value
> -
>
> Key: CASSANDRA-19566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19566
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core, Legacy/CQL
>Reporter: Bowen Song
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Description:
> "SELECT JSON ..." and "toJson(...)" on Cassandra 4.1.4 produces different 
> date than "SELECT ..."  for some timestamp type values.
>  
> Steps to reproduce:
> {code:java}
> $ sudo docker pull cassandra:4.1.4
> $ sudo docker create --name cass cassandra:4.1.4
> $ sudo docker start cass
> $ # wait for the Cassandra instance to become ready
> $ sudo docker exec -ti cass cqlsh
> Connected to Test Cluster at 127.0.0.1:9042
> [cqlsh 6.1.0 | Cassandra 4.1.4 | CQL spec 3.4.6 | Native protocol v5]
> Use HELP for help.
> cqlsh> create keyspace test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> use test;
> cqlsh:test> create table tbl (id int, ts timestamp, primary key (id));
> cqlsh:test> insert into tbl (id, ts) values (1, -13767019200000);
> cqlsh:test> select tounixtimestamp(ts), ts, tojson(ts) from tbl where id=1;
>  system.tounixtimestamp(ts) | ts                              | system.tojson(ts)
> -----------------------------+---------------------------------+----------------------------
>              -13767019200000 | 1533-09-28 12:00:00.000000+0000 | "1533-09-18 12:00:00.000Z"
> (1 rows)
> cqlsh:test> select json * from tbl where id=1;
>  [json]
> -
>  {"id": 1, "ts": "1533-09-18 12:00:00.000Z"}
> (1 rows)
> {code}
>  
> Expected behaviour:
> The "select ts", "select tojson(ts)" and "select json *" should all produce 
> the same date.
>  
> Actual behaviour:
> The "select ts" produced the "1533-09-28" date but the "select tojson(ts)" 
> and "select json *" produced the "1533-09-18" date.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19566) JSON encoded timestamp value does not always match non-JSON encoded value

2024-04-22 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-19566:
-
Status: Ready to Commit  (was: Review In Progress)

> JSON encoded timestamp value does not always match non-JSON encoded value
> -
>
> Key: CASSANDRA-19566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19566
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core, Legacy/CQL
>Reporter: Bowen Song
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Description:
> "SELECT JSON ..." and "toJson(...)" on Cassandra 4.1.4 produces different 
> date than "SELECT ..."  for some timestamp type values.
>  
> Steps to reproduce:
> {code:java}
> $ sudo docker pull cassandra:4.1.4
> $ sudo docker create --name cass cassandra:4.1.4
> $ sudo docker start cass
> $ # wait for the Cassandra instance to become ready
> $ sudo docker exec -ti cass cqlsh
> Connected to Test Cluster at 127.0.0.1:9042
> [cqlsh 6.1.0 | Cassandra 4.1.4 | CQL spec 3.4.6 | Native protocol v5]
> Use HELP for help.
> cqlsh> create keyspace test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> use test;
> cqlsh:test> create table tbl (id int, ts timestamp, primary key (id));
> cqlsh:test> insert into tbl (id, ts) values (1, -13767019200000);
> cqlsh:test> select tounixtimestamp(ts), ts, tojson(ts) from tbl where id=1;
>  system.tounixtimestamp(ts) | ts                              | system.tojson(ts)
> -----------------------------+---------------------------------+----------------------------
>              -13767019200000 | 1533-09-28 12:00:00.000000+0000 | "1533-09-18 12:00:00.000Z"
> (1 rows)
> cqlsh:test> select json * from tbl where id=1;
>  [json]
> -
>  {"id": 1, "ts": "1533-09-18 12:00:00.000Z"}
> (1 rows)
> {code}
>  
> Expected behaviour:
> The "select ts", "select tojson(ts)" and "select json *" should all produce 
> the same date.
>  
> Actual behaviour:
> The "select ts" produced the "1533-09-28" date but the "select tojson(ts)" 
> and "select json *" produced the "1533-09-18" date.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15439) Token metadata for bootstrapping nodes is lost under temporary failures

2024-04-22 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-15439:
-
 Bug Category: Parent values: Correctness(12982)Level 1 values: Recoverable 
Corruption / Loss(12986)
   Complexity: Normal
  Component/s: Cluster/Membership
Discovered By: User Report
Fix Version/s: 3.0.x
   3.11.x
   4.0.x
   4.1.x
   5.0.x
   5.x
 Severity: Normal
   Status: Open  (was: Triage Needed)

> Token metadata for bootstrapping nodes is lost under temporary failures
> ---
>
> Key: CASSANDRA-15439
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15439
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Membership
>Reporter: Josh Snyder
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> In CASSANDRA-8838, [~pauloricardomg] asked "hints will not be stored to the 
> bootstrapping node after RING_DELAY, since it will evicted from the TMD 
> pending ranges. Should we create a ticket to address this?"
> CASSANDRA-15264 relates to the most likely cause of such situations, where 
> the Cassandra daemon on the bootstrapping node completely crashes. Based on 
> testing with {{kill -STOP}} on a bootstrapping Cassandra JVM, I believe it 
> also is possible to remove token metadata (and thus pending ranges, and thus 
> hints) for a bootstrapping node, simply by affecting its status in the 
> failure detector. 
> A node in the cluster sees the bootstrapping node this way:
> {noformat}
> INFO  [GossipStage:1] 2019-11-27 20:41:41,101 Gossiper.java: - Node 
> /PUBLIC-IP is now part of the cluster
> INFO  [GossipStage:1] 2019-11-27 20:41:41,199 Gossiper.java:1073 - 
> InetAddress /PUBLIC-IP is now UP
> INFO  [HANDSHAKE-/PRIVATE-IP] 2019-11-27 20:41:41,412 
> OutboundTcpConnection.java:565 - Handshaking version with /PRIVATE-IP
> INFO  [STREAM-INIT-/PRIVATE-IP:21233] 2019-11-27 20:42:10,019 
> StreamResultFuture.java:112 - [Stream #6219a950-1156-11ea-b45d-4d30364576c4 
> ID#0] Creating new streaming plan for Bootstrap
> INFO  [STREAM-INIT-/PRIVATE-IP:21233] 2019-11-27 20:42:10,020 
> StreamResultFuture.java:119 - [Stream #6219a950-1156-11ea-b45d-4d30364576c4, 
> ID#0] Received streaming plan for Bootstrap
> INFO  [STREAM-INIT-/PRIVATE-IP:56003] 2019-11-27 20:42:10,112 
> StreamResultFuture.java:119 - [Stream #6219a950-1156-11ea-b45d-4d30364576c4, 
> ID#0] Received streaming plan for Bootstrap
> INFO  [STREAM-IN-/PUBLIC-IP] 2019-11-27 20:42:10,179 
> StreamResultFuture.java:169 - [Stream #6219a950-1156-11ea-b45d-4d30364576c4 
> ID#0] Prepare completed. Receiving 0 files(0 bytes), sending 833 
> files(139744616815 bytes)
> INFO  [GossipStage:1] 2019-11-27 20:54:47,547 Gossiper.java:1089 - 
> InetAddress /PUBLIC-IP is now DOWN
> INFO  [GossipTasks:1] 2019-11-27 20:54:57,551 Gossiper.java:849 - FatClient 
> /PUBLIC-IP has been silent for 3ms, removing from gossip
> {noformat}
> Since the bootstrapping node has no tokens, it is treated like a fat client, 
> and it is removed from the ring. For correctness purposes, I believe we must 
> keep storing hints for the downed bootstrapping node until it is either 
> assassinated or until a replacement attempts to bootstrap for the same token.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19572) Test failure: org.apache.cassandra.db.ImportTest flakiness

2024-04-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839858#comment-17839858
 ] 

Stefan Miklosovic commented on CASSANDRA-19572:
---

[~marcuse] any ideas how to fix this? You wrote that test.

> Test failure: org.apache.cassandra.db.ImportTest flakiness
> --
>
> Key: CASSANDRA-19572
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19572
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/bulk load
>Reporter: Brandon Williams
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> As discovered on CASSANDRA-19401, the tests in this class are flaky, at least 
> the following:
>  * testImportCorruptWithoutValidationWithCopying
>  * testImportInvalidateCache
>  * testImportCorruptWithCopying
>  * testImportCacheEnabledWithoutSrcDir
>  * testImportInvalidateCache
> [https://app.circleci.com/pipelines/github/instaclustr/cassandra/4199/workflows/a70b41d8-f848-4114-9349-9a01ac082281/jobs/223621/tests]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-19572) Test failure: org.apache.cassandra.db.ImportTest flakiness

2024-04-22 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic reassigned CASSANDRA-19572:
-

Assignee: (was: Stefan Miklosovic)

> Test failure: org.apache.cassandra.db.ImportTest flakiness
> --
>
> Key: CASSANDRA-19572
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19572
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/bulk load
>Reporter: Brandon Williams
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> As discovered on CASSANDRA-19401, the tests in this class are flaky, at least 
> the following:
>  * testImportCorruptWithoutValidationWithCopying
>  * testImportInvalidateCache
>  * testImportCorruptWithCopying
>  * testImportCacheEnabledWithoutSrcDir
>  * testImportInvalidateCache
> [https://app.circleci.com/pipelines/github/instaclustr/cassandra/4199/workflows/a70b41d8-f848-4114-9349-9a01ac082281/jobs/223621/tests]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19572) Test failure: org.apache.cassandra.db.ImportTest flakiness

2024-04-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839857#comment-17839857
 ] 

Stefan Miklosovic commented on CASSANDRA-19572:
---

I tried to fix this just for one test, testImportInvalidateCache, by calling 
resetTidying() before each load, but it still fails:

{code}
ERROR [main] 2024-04-22 21:02:10,706 Failed importing sstables in directory 
/tmp/importtest2511606450027946118/cql_test_keyspace/table_00
java.lang.AssertionError: null
at 
org.apache.cassandra.utils.concurrent.Ref$State.assertNotReleased(Ref.java:196)
at org.apache.cassandra.utils.concurrent.Ref.ref(Ref.java:152)
at 
org.apache.cassandra.io.sstable.format.SSTableReader$GlobalTidy.get(SSTableReader.java:2196)
at 
org.apache.cassandra.io.sstable.format.SSTableReader$InstanceTidier.setup(SSTableReader.java:2028)
at 
org.apache.cassandra.io.sstable.format.SSTableReader.setup(SSTableReader.java:1971)
at 
org.apache.cassandra.io.sstable.format.SSTableReaderBuilder$ForRead.build(SSTableReaderBuilder.java:370)
at 
org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:501)
at 
org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:372)
at 
org.apache.cassandra.db.SSTableImporter.getTargetDirectory(SSTableImporter.java:213)
at 
org.apache.cassandra.db.SSTableImporter.importNewSSTables(SSTableImporter.java:135)
at org.apache.cassandra.db.ImportTest.load(ImportTest.java:484)
at 
org.apache.cassandra.db.ImportTest.testImportInvalidateCache(ImportTest.java:543)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
{code}



> Test failure: org.apache.cassandra.db.ImportTest flakiness
> --
>
> Key: CASSANDRA-19572
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19572
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/bulk load
>Reporter: Brandon Williams
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> As discovered on CASSANDRA-19401, the tests in this class are flaky, at least 
> the following:
>  * testImportCorruptWithoutValidationWithCopying
>  * testImportInvalidateCache
>  * testImportCorruptWithCopying
>  * testImportCacheEnabledWithoutSrcDir
>  * testImportInvalidateCache
> [https://app.circleci.com/pipelines/github/instaclustr/cassandra/4199/workflows/a70b41d8-f848-4114-9349-9a01ac082281/jobs/223621/tests]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-17932) create a multiplexer job for Jenkins

2024-04-22 Thread Michael Semb Wever (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-17932:
---
Resolution: Duplicate
Status: Resolved  (was: Open)

> create a multiplexer job for Jenkins
> 
>
> Key: CASSANDRA-17932
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17932
> Project: Cassandra
>  Issue Type: Task
>  Components: CI
>Reporter: Brandon Williams
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1.x, 5.x
>
>
> It would be nice to have more parity with Circle and have the ability to 
> multiplex a test or group of tests in Jenkins.  This would be especially 
> helpful when working on intermittent issues that only appear in Jenkins.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-18942) Repeatable java test runs on jenkins

2024-04-22 Thread Michael Semb Wever (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-18942:
---
Resolution: (was: Later)
Status: Open  (was: Resolved)

dependencies are all done, re-opening.

> Repeatable java test runs on jenkins
> 
>
> Key: CASSANDRA-18942
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18942
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Build, CI
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 5.0, 5.0.x
>
> Attachments: jenkins_job.xml, testJava.txt, testJavaDocker.txt, 
> testJavaSplits.txt
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> It is our policy to loop newly introduced tests to avoid introducing flakies. 
> We also want to add the possibility to repeat a test N times to test 
> robustness, debug flakies, etc.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15439) Token metadata for bootstrapping nodes is lost under temporary failures

2024-04-22 Thread Raymond Huffman (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839851#comment-17839851
 ] 

Raymond Huffman commented on CASSANDRA-15439:
-

Just bumped this ticket in Slack because I believe this issue still exists in 
4.1.

As an alternative to the patch linked here, could we instead check if a node is 
bootstrapping with something like this?

{code}
public boolean isJoining(InetAddress endpoint)
{
    assert endpoint != null;

    publicLock.readLock().lock();
    lock.readLock().lock();
    try
    {
        // bootstrapTokens maps token -> endpoint, so the inverse view is
        // keyed by endpoint; a node with an entry here is still bootstrapping
        return bootstrapTokens.inverse().containsKey(endpoint);
    }
    finally
    {
        lock.readLock().unlock();
        publicLock.readLock().unlock();
    }
}
{code}

> Token metadata for bootstrapping nodes is lost under temporary failures
> ---
>
> Key: CASSANDRA-15439
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15439
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Josh Snyder
>Priority: Normal
>
> In CASSANDRA-8838, [~pauloricardomg] asked "hints will not be stored to the 
> bootstrapping node after RING_DELAY, since it will evicted from the TMD 
> pending ranges. Should we create a ticket to address this?"
> CASSANDRA-15264 relates to the most likely cause of such situations, where 
> the Cassandra daemon on the bootstrapping node completely crashes. Based on 
> testing with {{kill -STOP}} on a bootstrapping Cassandra JVM, I believe it 
> also is possible to remove token metadata (and thus pending ranges, and thus 
> hints) for a bootstrapping node, simply by affecting its status in the 
> failure detector. 
> A node in the cluster sees the bootstrapping node this way:
> {noformat}
> INFO  [GossipStage:1] 2019-11-27 20:41:41,101 Gossiper.java: - Node 
> /PUBLIC-IP is now part of the cluster
> INFO  [GossipStage:1] 2019-11-27 20:41:41,199 Gossiper.java:1073 - 
> InetAddress /PUBLIC-IP is now UP
> INFO  [HANDSHAKE-/PRIVATE-IP] 2019-11-27 20:41:41,412 
> OutboundTcpConnection.java:565 - Handshaking version with /PRIVATE-IP
> INFO  [STREAM-INIT-/PRIVATE-IP:21233] 2019-11-27 20:42:10,019 
> StreamResultFuture.java:112 - [Stream #6219a950-1156-11ea-b45d-4d30364576c4 
> ID#0] Creating new streaming plan for Bootstrap
> INFO  [STREAM-INIT-/PRIVATE-IP:21233] 2019-11-27 20:42:10,020 
> StreamResultFuture.java:119 - [Stream #6219a950-1156-11ea-b45d-4d30364576c4, 
> ID#0] Received streaming plan for Bootstrap
> INFO  [STREAM-INIT-/PRIVATE-IP:56003] 2019-11-27 20:42:10,112 
> StreamResultFuture.java:119 - [Stream #6219a950-1156-11ea-b45d-4d30364576c4, 
> ID#0] Received streaming plan for Bootstrap
> INFO  [STREAM-IN-/PUBLIC-IP] 2019-11-27 20:42:10,179 
> StreamResultFuture.java:169 - [Stream #6219a950-1156-11ea-b45d-4d30364576c4 
> ID#0] Prepare completed. Receiving 0 files(0 bytes), sending 833 
> files(139744616815 bytes)
> INFO  [GossipStage:1] 2019-11-27 20:54:47,547 Gossiper.java:1089 - 
> InetAddress /PUBLIC-IP is now DOWN
> INFO  [GossipTasks:1] 2019-11-27 20:54:57,551 Gossiper.java:849 - FatClient 
> /PUBLIC-IP has been silent for 3ms, removing from gossip
> {noformat}
> Since the bootstrapping node has no tokens, it is treated like a fat client, 
> and it is removed from the ring. For correctness purposes, I believe we must 
> keep storing hints for the downed bootstrapping node until it is either 
> assassinated or until a replacement attempts to bootstrap for the same token.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19577) Queries are not visible to the "system_views.queries" virtual table at the coordinator level

2024-04-22 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-19577:

Test and Documentation Plan: expansion of QueriesTableTest to cover 
coordinator-level queries in flight
 Status: Patch Available  (was: In Progress)

The [4.1 patch|https://github.com/apache/cassandra/pull/3268] is up. CI results 
will be posted soon...

> Queries are not visible to the "system_views.queries" virtual table at the 
> coordinator level
> 
>
> Key: CASSANDRA-19577
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19577
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Virtual Tables
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.1.x, 5.0.x, 5.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There appears to be a hole in the implementation of CASSANDRA-15241 where 
> {{DebuggableTasks}} at the coordinator are not preserved through the creation 
> of {{FutureTasks}} in {{TaskFactory}}. This means that {{QueriesTable}} can't 
> see them when is asks {{SharedExecutorPool}} for running tasks. It should be 
> possible to fix this in {{TaskFactory}} by making sure to propagate any 
> {{RunnableDebuggableTask}} we encounter. We already do this in 
> {{toExecute()}}, but it also needs to happen in the relevant {{toSubmit()}} 
> method(s).
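
A simplified sketch of the shape such a propagation fix can take; the types below are reduced to essentials and this is not the actual TaskFactory patch:

{code:java}
import java.util.concurrent.FutureTask;

// Simplified sketch; not the actual Cassandra types or patch.
interface DebuggableTask
{
    String description();
}

final class TaskFactorySketch
{
    // The wrapper itself implements the debuggable interface and delegates,
    // so a registry of in-flight tasks can still discover the wrapped task.
    static class DebuggableFutureTask<T> extends FutureTask<T> implements DebuggableTask
    {
        private final DebuggableTask delegate;

        DebuggableFutureTask(Runnable task, T result)
        {
            super(task, result);
            this.delegate = (DebuggableTask) task;
        }

        @Override
        public String description()
        {
            return delegate.description();
        }
    }

    static <T> FutureTask<T> toSubmit(Runnable task, T result)
    {
        return task instanceof DebuggableTask
             ? new DebuggableFutureTask<>(task, result)
             : new FutureTask<>(task, result);
    }
}
{code}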



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19566) JSON encoded timestamp value does not always match non-JSON encoded value

2024-04-22 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839813#comment-17839813
 ] 

Brandon Williams commented on CASSANDRA-19566:
--

Running CI for 5.0, I typoed the branch name but it was already running so I've 
left it:

||Branch||CI||
|[5.0|https://github.com/driftx/cassandra/tree/CASSANDRA-19556-5.0]|[j11|https://app.circleci.com/pipelines/github/driftx/cassandra/1593/workflows/cc50210b-cb42-4529-be00-d017b41d9328],
 
[j17|https://app.circleci.com/pipelines/github/driftx/cassandra/1593/workflows/2ac03a31-b5e8-4cd0-8209-b1b89779fc4e]|

[Here|https://app.circleci.com/pipelines/github/driftx/cassandra/1594/workflows/a1f7f945-a491-48b6-b166-4fd0917b2e96]
 are upgrade tests for 4.0

> JSON encoded timestamp value does not always match non-JSON encoded value
> -
>
> Key: CASSANDRA-19566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19566
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core, Legacy/CQL
>Reporter: Bowen Song
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Description:
> "SELECT JSON ..." and "toJson(...)" on Cassandra 4.1.4 produces different 
> date than "SELECT ..."  for some timestamp type values.
>  
> Steps to reproduce:
> {code:java}
> $ sudo docker pull cassandra:4.1.4
> $ sudo docker create --name cass cassandra:4.1.4
> $ sudo docker start cass
> $ # wait for the Cassandra instance to become ready
> $ sudo docker exec -ti cass cqlsh
> Connected to Test Cluster at 127.0.0.1:9042
> [cqlsh 6.1.0 | Cassandra 4.1.4 | CQL spec 3.4.6 | Native protocol v5]
> Use HELP for help.
> cqlsh> create keyspace test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> use test;
> cqlsh:test> create table tbl (id int, ts timestamp, primary key (id));
> cqlsh:test> insert into tbl (id, ts) values (1, -13767019200000);
> cqlsh:test> select tounixtimestamp(ts), ts, tojson(ts) from tbl where id=1;
>  system.tounixtimestamp(ts) | ts                              | system.tojson(ts)
> -----------------------------+---------------------------------+----------------------------
>              -13767019200000 | 1533-09-28 12:00:00.000000+0000 | "1533-09-18 12:00:00.000Z"
> (1 rows)
> cqlsh:test> select json * from tbl where id=1;
>  [json]
> -
>  {"id": 1, "ts": "1533-09-18 12:00:00.000Z"}
> (1 rows)
> {code}
>  
> Expected behaviour:
> The "select ts", "select tojson(ts)" and "select json *" should all produce 
> the same date.
>  
> Actual behaviour:
> The "select ts" produced the "1533-09-28" date but the "select tojson(ts)" 
> and "select json *" produced the "1533-09-18" date.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19572) Test failure: org.apache.cassandra.db.ImportTest flakiness

2024-04-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839812#comment-17839812
 ] 

Stefan Miklosovic commented on CASSANDRA-19572:
---

I still see 

{code}
testImportCacheEnabledWithoutSrcDir
junit.framework.AssertionFailedError: expected:<1> but was:<2>
at 
org.apache.cassandra.db.ImportTest.testImportCacheEnabledWithoutSrcDir(ImportTest.java:562)
{code}
and 
{code}
junit.framework.AssertionFailedError: expected:<10> but was:<20>
at 
org.apache.cassandra.db.ImportTest.testImportInvalidateCache(ImportTest.java:537)
{code}

Nevertheless, I think this is already progress. 

> Test failure: org.apache.cassandra.db.ImportTest flakiness
> --
>
> Key: CASSANDRA-19572
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19572
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/bulk load
>Reporter: Brandon Williams
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> As discovered on CASSANDRA-19401, the tests in this class are flaky, at least 
> the following:
>  * testImportCorruptWithoutValidationWithCopying
>  * testImportInvalidateCache
>  * testImportCorruptWithCopying
>  * testImportCacheEnabledWithoutSrcDir
>  * testImportInvalidateCache
> [https://app.circleci.com/pipelines/github/instaclustr/cassandra/4199/workflows/a70b41d8-f848-4114-9349-9a01ac082281/jobs/223621/tests]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19566) JSON encoded timestamp value does not always match non-JSON encoded value

2024-04-22 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-19566:
-
Reviewers: Brandon Williams
   Status: Review In Progress  (was: Needs Committer)

> JSON encoded timestamp value does not always match non-JSON encoded value
> -
>
> Key: CASSANDRA-19566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19566
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core, Legacy/CQL
>Reporter: Bowen Song
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Description:
> "SELECT JSON ..." and "toJson(...)" on Cassandra 4.1.4 produces different 
> date than "SELECT ..."  for some timestamp type values.
>  
> Steps to reproduce:
> {code:java}
> $ sudo docker pull cassandra:4.1.4
> $ sudo docker create --name cass cassandra:4.1.4
> $ sudo docker start cass
> $ # wait for the Cassandra instance to become ready
> $ sudo docker exec -ti cass cqlsh
> Connected to Test Cluster at 127.0.0.1:9042
> [cqlsh 6.1.0 | Cassandra 4.1.4 | CQL spec 3.4.6 | Native protocol v5]
> Use HELP for help.
> cqlsh> create keyspace test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> use test;
> cqlsh:test> create table tbl (id int, ts timestamp, primary key (id));
> cqlsh:test> insert into tbl (id, ts) values (1, -13767019200000);
> cqlsh:test> select tounixtimestamp(ts), ts, tojson(ts) from tbl where id=1;
>  system.tounixtimestamp(ts) | ts                              | system.tojson(ts)
> -----------------------------+---------------------------------+----------------------------
>              -13767019200000 | 1533-09-28 12:00:00.000000+0000 | "1533-09-18 12:00:00.000Z"
> (1 rows)
> cqlsh:test> select json * from tbl where id=1;
>  [json]
> -
>  {"id": 1, "ts": "1533-09-18 12:00:00.000Z"}
> (1 rows)
> {code}
>  
> Expected behaviour:
> The "select ts", "select tojson(ts)" and "select json *" should all produce 
> the same date.
>  
> Actual behaviour:
> The "select ts" produced the "1533-09-28" date but the "select tojson(ts)" 
> and "select json *" produced the "1533-09-18" date.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-19572) Test failure: org.apache.cassandra.db.ImportTest flakiness

2024-04-22 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic reassigned CASSANDRA-19572:
-

Assignee: Stefan Miklosovic

> Test failure: org.apache.cassandra.db.ImportTest flakiness
> --
>
> Key: CASSANDRA-19572
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19572
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/bulk load
>Reporter: Brandon Williams
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> As discovered on CASSANDRA-19401, the tests in this class are flaky, at least 
> the following:
>  * testImportCorruptWithoutValidationWithCopying
>  * testImportInvalidateCache
>  * testImportCorruptWithCopying
>  * testImportCacheEnabledWithoutSrcDir
> [https://app.circleci.com/pipelines/github/instaclustr/cassandra/4199/workflows/a70b41d8-f848-4114-9349-9a01ac082281/jobs/223621/tests]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19572) Test failure: org.apache.cassandra.db.ImportTest flakiness

2024-04-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839797#comment-17839797
 ] 

Stefan Miklosovic commented on CASSANDRA-19572:
---

I think I am onto something. I checked the logs from the build for plain 4.0 
(without the patch in 19401) and the first one is very interesting 
(testImportCorruptWithCopying) (1).

It fails on:
{code:java}
junit.framework.AssertionFailedError: 
expected:<[/tmp/importtest7641524017208283450/cql_test_keyspace/table_15-005af720fd9511ee865eef8364010360]>
 but 
was:<[/tmp/importtest7641524017208283450/cql_test_keyspace/table_15-005af720fd9511ee865eef8364010360,
 
/tmp/importtest916153905487802965/cql_test_keyspace/table_15-005af720fd9511ee865eef8364010360]>
at 
org.apache.cassandra.db.ImportTest.testCorruptHelper(ImportTest.java:341)
at 
org.apache.cassandra.db.ImportTest.testImportCorruptWithCopying(ImportTest.java:384)
 {code}
That test expects the import of one directory of sstables to fail and the 
other to load just fine, but here we clearly see that it failed to import 
{_}both{_}. I was checking the raw logs and was quite lucky to find it; it is 
in this one (2). Grep it for exactly this timestamp:
{code}
ERROR [main] 2024-04-18 15:04:49,454 SSTableImporter.java:102
{code}
There you can see that it failed to import the directory which it is not 
supposed to; that is the first stacktrace, but below it there is another one:
{code}
[junit-timeout] ERROR [main] 2024-04-18 15:04:49,469 SSTableImporter.java:147 - 
Failed importing sstables in directory 
/tmp/importtest916153905487802965/cql_test_keyspace/table_15-005af720fd9511ee865eef8364010360
[junit-timeout] java.lang.AssertionError: null
[junit-timeout] at 
org.apache.cassandra.utils.concurrent.Ref$State.assertNotReleased(Ref.java:196)
[junit-timeout] at 
org.apache.cassandra.utils.concurrent.Ref.ref(Ref.java:152)
[junit-timeout] at 
org.apache.cassandra.io.sstable.format.SSTableReader$GlobalTidy.get(SSTableReader.java:2196)
[junit-timeout] at 
org.apache.cassandra.io.sstable.format.SSTableReader$InstanceTidier.setup(SSTableReader.java:2028)
[junit-timeout] at 
org.apache.cassandra.io.sstable.format.SSTableReader.setup(SSTableReader.java:1971)
[junit-timeout] at 
org.apache.cassandra.io.sstable.format.SSTableReaderBuilder$ForRead.build(SSTableReaderBuilder.java:370)
[junit-timeout] at 
org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:501)
[junit-timeout] at 
org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:372)
[junit-timeout] at 
org.apache.cassandra.db.SSTableImporter.getTargetDirectory(SSTableImporter.java:211)
[junit-timeout] at 
org.apache.cassandra.db.SSTableImporter.importNewSSTables(SSTableImporter.java:135)
[junit-timeout] at 
org.apache.cassandra.db.ImportTest.testCorruptHelper(ImportTest.java:340)
[junit-timeout] at 
org.apache.cassandra.db.ImportTest.testImportCorruptWithCopying(ImportTest.java:384)
[junit-timeout] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
[junit-timeout] at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
{code}

Here we see that it was asserting that the sstable reader is not released, but 
the assertion failed because it seems it is. That release happens here (3).

I ran the multiplexer on this test (4) for 3000 iterations (5) and they all 
passed. I think we should just call "SSTableReader.resetTidying();". That 
method is actually annotated with @VisibleForTesting. I think resetting the 
tidying will clear the underlying map of references so it will not complain 
afterwards. It is probably some concurrency issue or similar ... 
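
A minimal sketch of what that could look like in the test class (assuming a 
JUnit setup hook; the exact placement is hypothetical):
{code:java}
// Hypothetical sketch: reset the global tidy state before each test so a
// reader released by a previous test does not trip Ref$State.assertNotReleased().
@Before
public void resetReaderTidying()
{
    // SSTableReader.resetTidying() is annotated @VisibleForTesting and clears
    // the GlobalTidy lookup of sstable references.
    SSTableReader.resetTidying();
}
{code}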

(1) 
[https://app.circleci.com/pipelines/github/instaclustr/cassandra/4199/workflows/a70b41d8-f848-4114-9349-9a01ac082281/jobs/223621/tests]
(2) 
[https://circleci.com/api/v1.1/project/github/instaclustr/cassandra/223621/output/103/11?file=true&allocation-id=662134c47c6ecf4bb1db4681-11-build%2FABCDEFGH]
(3) 
[https://github.com/apache/cassandra/blob/cassandra-4.0/test/unit/org/apache/cassandra/db/ImportTest.java#L235]
(4) 
https://github.com/apache/cassandra/pull/3264/commits/d934e1c0f40353a12cd7588fc8a15a23d35d6a30
(5) 
https://app.circleci.com/pipelines/github/instaclustr/cassandra/4210/workflows/eea52e61-b670-4dc9-86b6-b07bf1030b09/jobs/224285

> Test failure: org.apache.cassandra.db.ImportTest flakiness
> --
>
> Key: CASSANDRA-19572
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19572
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/bulk load
>Reporter: Brandon Williams
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x

[jira] [Commented] (CASSANDRA-17667) Text value containing "/*" interpreted as multiline comment in cqlsh

2024-04-22 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839796#comment-17839796
 ] 

Brandon Williams commented on CASSANDRA-17667:
--

Kind ping in case this has been forgotten.

> Text value containing "/*" interpreted as multiline comment in cqlsh
> 
>
> Key: CASSANDRA-17667
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17667
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL/Interpreter
>Reporter: ANOOP THOMAS
>Assignee: Brad Schoening
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> I use the CQLSH command line utility to load some DDLs. The version of the 
> utility I use is this:
> {noformat}
> [cqlsh 6.0.0 | Cassandra 4.0.0.47 | CQL spec 3.4.5 | Native protocol 
> v5]{noformat}
> Command that loads DDL.cql:
> {noformat}
> cqlsh -u username -p password cassandra.example.com 65503 --ssl -f DDL.cql
> {noformat}
> I have a line in the CQL script that breaks the syntax.
> {noformat}
> INSERT into tablename (key,columnname1,columnname2) VALUES 
> ('keyName','value1','/value2/*/value3');{noformat}
> {{/*}} here is interpreted as the start of a multi-line comment. It used to 
> work on older versions of cqlsh. The error I see looks like this:
> {noformat}
> SyntaxException: line 4:2 mismatched input 'Update' expecting ')' 
> (...,'value1','/value2INSERT into tablename(INSERT into tablename 
> (key,columnname1,columnname2)) VALUES ('[Update]-...) SyntaxException: line 
> 1:0 no viable alternative at input '(' ([(]...)
> {noformat}
> Same behavior while running in interactive mode too. {{/*}} inside a CQL 
> statement should not be interpreted as the start of a multi-line comment.
> With schema:
> {code:java}
> CREATE TABLE tablename ( key text primary key, columnname1 text, columnname2 
> text);{code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19448) CommitlogArchiver only has granularity to seconds for restore_point_in_time

2024-04-22 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-19448:
-
Status: Open  (was: Patch Available)

> CommitlogArchiver only has granularity to seconds for restore_point_in_time
> ---
>
> Key: CASSANDRA-19448
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19448
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Jeremy Hanna
>Assignee: Maxwell Guo
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Commitlog archiver allows users to back up commitlog files for the purpose of 
> doing point-in-time restores.  The [configuration 
> file|https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties]
>  gives an example down to seconds granularity, but then asks whether the 
> timestamps are microseconds or milliseconds - defaulting to microseconds.  
> Because the [CommitLogArchiver uses a second-based date 
> format|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java#L52],
>  if a user specifies a restore point at a finer granularity, like 
> milliseconds or microseconds, it will truncate everything after the second 
> and restore to that second.  So say you specify a restore_point_in_time like 
> this:
> restore_point_in_time=2024:01:18 17:01:01.623392
> it will silently truncate everything after the 01 seconds.  So effectively, 
> to the user, updates between 01 and 01.623392 are missing.
> This appears to be a bug in the intent.  We should allow users to specify 
> down to the millisecond or even microsecond level. If we allow them to 
> specify down to microseconds for the restore point in time, then it may 
> internally need to change from a long.
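> 
> A minimal standalone sketch of that truncation, assuming a second-granularity 
> parse format like the archiver's (the class name is illustrative):
> {code:java}
> import java.text.SimpleDateFormat;
> 
> public class RestorePointTruncation
> {
>     public static void main(String[] args) throws Exception
>     {
>         // Second-based pattern, as in the commitlog_archiving.properties examples.
>         SimpleDateFormat format = new SimpleDateFormat("yyyy:MM:dd HH:mm:ss");
>         // parse() stops at the end of the pattern, silently dropping ".623392".
>         long withMicros = format.parse("2024:01:18 17:01:01.623392").getTime();
>         long wholeSecond = format.parse("2024:01:18 17:01:01").getTime();
>         System.out.println(withMicros == wholeSecond); // true: the sub-second part is lost
>     }
> }
> {code}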



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19448) CommitlogArchiver only has granularity to seconds for restore_point_in_time

2024-04-22 Thread Tiago L. Alves (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tiago L. Alves updated CASSANDRA-19448:
---
Status: Patch Available  (was: Open)

> CommitlogArchiver only has granularity to seconds for restore_point_in_time
> ---
>
> Key: CASSANDRA-19448
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19448
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Jeremy Hanna
>Assignee: Maxwell Guo
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Commitlog archiver allows users to back up commitlog files for the purpose of 
> doing point-in-time restores.  The [configuration 
> file|https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties]
>  gives an example down to seconds granularity, but then asks whether the 
> timestamps are microseconds or milliseconds - defaulting to microseconds.  
> Because the [CommitLogArchiver uses a second-based date 
> format|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java#L52],
>  if a user specifies a restore point at a finer granularity, like 
> milliseconds or microseconds, it will truncate everything after the second 
> and restore to that second.  So say you specify a restore_point_in_time like 
> this:
> restore_point_in_time=2024:01:18 17:01:01.623392
> it will silently truncate everything after the 01 seconds.  So effectively, 
> to the user, updates between 01 and 01.623392 are missing.
> This appears to be a bug in the intent.  We should allow users to specify 
> down to the millisecond or even microsecond level. If we allow them to 
> specify down to microseconds for the restore point in time, then it may 
> internally need to change from a long.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19572) Test failure: org.apache.cassandra.db.ImportTest flakiness

2024-04-22 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839672#comment-17839672
 ] 

Brandon Williams edited comment on CASSANDRA-19572 at 4/22/24 4:26 PM:
---

Yes, it is.  Those comments are to aid whoever takes this on; if the situation 
changes with regard to blocking something, I will explicitly say so.


was (Author: brandon.williams):
Yes, it is.

> Test failure: org.apache.cassandra.db.ImportTest flakiness
> --
>
> Key: CASSANDRA-19572
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19572
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/bulk load
>Reporter: Brandon Williams
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> As discovered on CASSANDRA-19401, the tests in this class are flaky, at least 
> the following:
>  * testImportCorruptWithoutValidationWithCopying
>  * testImportInvalidateCache
>  * testImportCorruptWithCopying
>  * testImportCacheEnabledWithoutSrcDir
> [https://app.circleci.com/pipelines/github/instaclustr/cassandra/4199/workflows/a70b41d8-f848-4114-9349-9a01ac082281/jobs/223621/tests]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19580) Unable to contact any seeds with node in hibernate status

2024-04-22 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839772#comment-17839772
 ] 

Brandon Williams commented on CASSANDRA-19580:
--

Hibernate was added for node replacement, but it is also used if the node is 
told not to join the ring at startup.

bq. I have been able to reproduce this issue if I kill Cassandra as it is 
joining, which will put the node into hibernate status.

Can you expound upon this since it doesn't seem to meet either condition?

> Unable to contact any seeds with node in hibernate status
> -
>
> Key: CASSANDRA-19580
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19580
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Cameron Zemek
>Priority: Normal
>
> We have a customer running into the error 'Unable to contact any seeds!'. I 
> have been able to reproduce this issue if I kill Cassandra as it is joining, 
> which will put the node into hibernate status. Once a node is in hibernate it 
> will no longer receive any SYN messages from other nodes during startup, and 
> as it sends only itself as the digest in outbound SYN messages, it never 
> receives any states in any of the ACK replies. So once it gets to the 
> `seenAnySeed` check, it fails as the endpointStateMap is empty.
>  
> A workaround is copying the system.peers table from another node, but this is 
> less than ideal. I tested modifying maybeGossipToSeed as follows:
> {code:java}
>     /* Possibly gossip to a seed for facilitating partition healing */
>     private void maybeGossipToSeed(MessageOut<GossipDigestSyn> prod)
>     {
>         int size = seeds.size();
>         if (size > 0)
>         {
>             if (size == 1 && seeds.contains(FBUtilities.getBroadcastAddress()))
>             {
>                 return;
>             }
>             if (liveEndpoints.size() == 0)
>             {
>                 List<GossipDigest> gDigests = prod.payload.gDigests;
>                 if (gDigests.size() == 1 && gDigests.get(0).endpoint.equals(FBUtilities.getBroadcastAddress()))
>                 {
>                     // Send an empty digest list so the seeds reply with their
>                     // full state, as in the shadow round.
>                     gDigests = new ArrayList<GossipDigest>();
>                     GossipDigestSyn digestSynMessage = new GossipDigestSyn(DatabaseDescriptor.getClusterName(),
>                                                                            DatabaseDescriptor.getPartitionerName(),
>                                                                            gDigests);
>                     MessageOut<GossipDigestSyn> message = new MessageOut<GossipDigestSyn>(MessagingService.Verb.GOSSIP_DIGEST_SYN,
>                                                                                           digestSynMessage,
>                                                                                           GossipDigestSyn.serializer);
>                     sendGossip(message, seeds);
>                 }
>                 else
>                 {
>                     sendGossip(prod, seeds);
>                 }
>             }
>             else
>             {
>                 /* Gossip with the seed with some probability. */
>                 double probability = seeds.size() / (double) (liveEndpoints.size() + unreachableEndpoints.size());
>                 double randDbl = random.nextDouble();
>                 if (randDbl <= probability)
>                     sendGossip(prod, seeds);
>             }
>         }
>     }
> {code}
> The only problem is that this is the same as a SYN from the shadow round. It 
> does resolve the issue, however, as the node then receives an ACK with all 
> the states.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra-website) branch asf-site updated (25cd5019 -> 690a3a5d)

2024-04-22 Thread brandonwilliams
This is an automated email from the ASF dual-hosted git repository.

brandonwilliams pushed a change to branch asf-site
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


 discard 25cd5019 generate docs for 1d4732f3
 add cc1c7113 fix typo in date
 add 690a3a5d generate docs for cc1c7113

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (25cd5019)
\
 N -- N -- N   refs/heads/asf-site (690a3a5d)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

No new revisions were added by this update.

Summary of changes:
 content/_/download.html|   2 +-
 .../managing/configuration/cass_yaml_file.html |  75 +
 .../managing/configuration/cass_yaml_file.html |  75 +
 .../managing/configuration/cass_yaml_file.html |  75 +
 .../managing/configuration/cass_yaml_file.html |  75 +
 content/search-index.js|   2 +-
 .../source/modules/ROOT/pages/download.adoc|   2 +-
 site-ui/build/ui-bundle.zip| Bin 4883646 -> 4883646 
bytes
 8 files changed, 303 insertions(+), 3 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra-website) branch asf-staging updated (66cf8ef3 -> 690a3a5d)

2024-04-22 Thread git-site-role
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


 discard 66cf8ef3 generate docs for 1d4732f3
 add cc1c7113 fix typo in date
 new 690a3a5d generate docs for cc1c7113

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (66cf8ef3)
\
 N -- N -- N   refs/heads/asf-staging (690a3a5d)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 content/_/download.html|   2 +-
 content/search-index.js|   2 +-
 .../source/modules/ROOT/pages/download.adoc|   2 +-
 site-ui/build/ui-bundle.zip| Bin 4883646 -> 4883646 
bytes
 4 files changed, 3 insertions(+), 3 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19578) Concurrent equivalent schema updates lead to unresolved disagreement

2024-04-22 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839767#comment-17839767
 ] 

Brandon Williams commented on CASSANDRA-19578:
--

Unsurprisingly, testTransKsMigration also failed in the CI run, but that is the 
only one that needs to be addressed there.

> Concurrent equivalent schema updates lead to unresolved disagreement
> 
>
> Key: CASSANDRA-19578
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19578
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Schema
>Reporter: Chris Lohfink
>Priority: Normal
> Fix For: 4.1.5, 5.0-beta2
>
>
> As part of CASSANDRA-17819 a check for empty schema changes was added to 
> updateSchema. This only looks at the _logical_ difference of the schemas, but 
> the changes made to the system_schema keyspace are the ones that are actually 
> involved in the digest.
> If two nodes issue the same CREATE statement, the difference from the 
> keyspace.diff would be empty but the timestamps on the mutations would be 
> different, leading to a pseudo schema disagreement which will never resolve 
> until resetlocalschema is run or nodes are bounced.
> This only impacts 4.1.
> test and fix : 
> https://github.com/clohfink/cassandra/commit/ba915f839089006ac6d08494ef19dc010bcd6411
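> 
> A minimal standalone sketch of why equal logical schemas can still disagree, 
> assuming the digest covers the raw system_schema mutations including cell 
> timestamps (the hashing below is illustrative, not Cassandra's actual digest 
> code):
> {code:java}
> import java.nio.ByteBuffer;
> import java.security.MessageDigest;
> 
> public class SchemaDigestSketch
> {
>     // Illustrative: hash the schema bytes together with the mutation timestamp.
>     static byte[] digest(byte[] schemaBytes, long mutationTimestamp) throws Exception
>     {
>         MessageDigest md = MessageDigest.getInstance("MD5");
>         md.update(schemaBytes);
>         md.update(ByteBuffer.allocate(8).putLong(mutationTimestamp).array());
>         return md.digest();
>     }
> 
>     public static void main(String[] args) throws Exception
>     {
>         byte[] schema = "CREATE TABLE ks.t (id int PRIMARY KEY)".getBytes("UTF-8");
>         // Same logical schema, different write timestamps -> different digests.
>         System.out.println(MessageDigest.isEqual(digest(schema, 1000L),
>                                                  digest(schema, 2000L))); // false
>     }
> }
> {code}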



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra-website) branch trunk updated: fix typo in date

2024-04-22 Thread brandonwilliams
This is an automated email from the ASF dual-hosted git repository.

brandonwilliams pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


The following commit(s) were added to refs/heads/trunk by this push:
 new cc1c7113 fix typo in date
cc1c7113 is described below

commit cc1c7113b8488a981b7eb580b6b48641c3581870
Author: Brandon Williams 
AuthorDate: Mon Apr 22 10:49:24 2024 -0500

fix typo in date
---
 site-content/source/modules/ROOT/pages/download.adoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/site-content/source/modules/ROOT/pages/download.adoc 
b/site-content/source/modules/ROOT/pages/download.adoc
index ac225060..613d3f27 100644
--- a/site-content/source/modules/ROOT/pages/download.adoc
+++ b/site-content/source/modules/ROOT/pages/download.adoc
@@ -81,7 +81,7 @@ 
https://www.apache.org/dyn/closer.lua/cassandra/4.0.12/apache-cassandra-4.0.12-b
 [discrete]
  Apache Cassandra 3.11
 [discrete]
- Latest release on 2023-04-16
+ Latest release on 2024-04-16
 [discrete]
  Maintained until 5.0.0 release (Nov-Dec 2023)
 


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19191) Optimisations to PlacementForRange, improve lookup on r/w path

2024-04-22 Thread Sam Tunnicliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-19191:

Status: Needs Committer  (was: Patch Available)

> Optimisations to PlacementForRange, improve lookup on r/w path
> --
>
> Key: CASSANDRA-19191
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19191
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 5.1-alpha1
>
> Attachments: ci_summary-1.html, ci_summary.html, result_details.tar.gz
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The lookup used when selecting the appropriate replica group for a range or 
> token while performing reads and writes is extremely simplistic and 
> inefficient. There is plenty of scope to improve {{PlacementsForRange}} by 
> replacing the current naive iteration with a more efficient lookup.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19191) Optimisations to PlacementForRange, improve lookup on r/w path

2024-04-22 Thread Sam Tunnicliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-19191:

Status: Review In Progress  (was: Needs Committer)

> Optimisations to PlacementForRange, improve lookup on r/w path
> --
>
> Key: CASSANDRA-19191
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19191
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 5.1-alpha1
>
> Attachments: ci_summary-1.html, ci_summary.html, result_details.tar.gz
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The lookup used when selecting the appropriate replica group for a range or 
> token while performing reads and writes is extremely simplistic and 
> inefficient. There is plenty of scope to improve {{PlacementsForRange}} by 
> replacing the current naive iteration with a more efficient lookup.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19191) Optimisations to PlacementForRange, improve lookup on r/w path

2024-04-22 Thread Sam Tunnicliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-19191:

Status: Ready to Commit  (was: Review In Progress)

+1 

> Optimisations to PlacementForRange, improve lookup on r/w path
> --
>
> Key: CASSANDRA-19191
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19191
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 5.1-alpha1
>
> Attachments: ci_summary-1.html, ci_summary.html, result_details.tar.gz
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The lookup used when selecting the appropriate replica group for a range or 
> token while performing reads and writes is extremely simplistic and 
> inefficient. There is plenty of scope to improve {{PlacementsForRange}} by 
> replacing the current naive iteration with a more efficient lookup.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19190) ForceSnapshot transformations should not be persisted in the local log table

2024-04-22 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-19190:

Attachment: ci_summary-1.html

> ForceSnapshot transformations should not be persisted in the local log table
> 
>
> Key: CASSANDRA-19190
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19190
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Marcus Eriksson
>Assignee: Sam Tunnicliffe
>Priority: Normal
> Fix For: 5.1-alpha1
>
> Attachments: ci_summary-1.html, ci_summary.html
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Per its inline comments, ForceSnapshot is a synthetic transformation whose 
> purpose is to enable the local log to jump missing epochs. A common use for 
> this is when replaying persisted events from the metadata log at startup. The 
> log is initialised with {{Epoch.EMPTY}}, but rather than replaying every 
> single entry since the beginning of history, we select the most recent 
> snapshot held locally and start the replay from that point. Likewise, when 
> catching up from a peer, a node may receive a snapshot plus subsequent log 
> entries. In order to bring local metadata to the same state as the snapshot, 
> a {{ForceSnapshot}} with the same epoch as the snapshot is inserted into the 
> {{LocalLog}} and enacted like any other transformation. These synthetic 
> transformations should not be persisted in {{system.local_metadata_log}}, 
> as they do not exist in the distributed metadata log. We _should_ persist the 
> snapshot itself in {{system.metadata_snapshots}} so that we can avoid having 
> to re-fetch remote snapshots (i.e. if a node were to restart shortly after 
> receiving a catchup from a peer).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19581) Add nodetool command to unregister LEFT nodes

2024-04-22 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-19581:

Change Category: Operability
 Complexity: Low Hanging Fruit
  Reviewers: Alex Petrov, Sam Tunnicliffe
 Status: Open  (was: Triage Needed)

> Add nodetool command to unregister LEFT nodes
> -
>
> Key: CASSANDRA-19581
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19581
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Attachments: ci_summary.html
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When decommissioning a node it still remains in ClusterMetadata with state = 
> LEFT. We should provide a nodetool command to unregister such nodes 
> completely.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19581) Add nodetool command to unregister LEFT nodes

2024-04-22 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-19581:

Attachment: ci_summary.html

> Add nodetool command to unregister LEFT nodes
> -
>
> Key: CASSANDRA-19581
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19581
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Attachments: ci_summary.html
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When decommissioning a node it still remains in ClusterMetadata with state = 
> LEFT. We should provide a nodetool command to unregister such nodes 
> completely.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19110) apt-key deprecation, replace with gpg --dearmor in the docs.

2024-04-22 Thread Michael Semb Wever (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-19110:
---
Fix Version/s: 5.0
   5.1

> apt-key deprecation, replace with gpg --dearmor in the docs.
> 
>
> Key: CASSANDRA-19110
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19110
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation/Website
>Reporter: Simon K
>Assignee: Tibor Repasi
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 3.0.30, 3.11.17, 4.0.13, 4.1.5, 5.0-beta2, 5.0, 5.1
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> The command `apt-key` is deprecated and soon to be removed, especially on 
> Ubuntu.
> The directory `/usr/share/keyrings` for shared keys is also being removed.
> I suggest converting the docs from
> {code:java}
> curl https://downloads.apache.org/cassandra/KEYS | sudo apt-key add - {code}
> to a simpler command:
> {code:java}
> curl https://downloads.apache.org/cassandra/KEYS | sudo gpg --dearmor -o 
> /etc/apt/keyrings/cassandra-archive-keyring.gpg {code}
> The path `/etc/apt/keyrings` doesn't exist by default on Ubuntu 20.04, but it 
> does on 22.04.
> I also suggest changing the sources.list.d text from 
> {code:java}
> $ echo "deb https://debian.cassandra.apache.org 42x main" | sudo tee -a 
> /etc/apt/sources.list.d/cassandra.sources.list
> deb https://debian.cassandra.apache.org 42x main{code}
> to 
> {code:java}
> $ echo "deb [signed-by=/etc/apt/keyrings/cassandra-archive-keyring.gpg] 
> https://debian.cassandra.apache.org 42x main" | sudo tee -a 
> /etc/apt/sources.list.d/cassandra.sources.list
> deb [signed-by=/etc/apt/keyrings/cassandra-archive-keyring.gpg] 
> https://debian.cassandra.apache.org 42x main {code}
> I have made a [PR|https://github.com/apache/cassandra/pull/2936]
> I have tested the gpg --dearmor on a VM with Ubuntu 22.04 myself recently and 
> it works just fine.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19572) Test failure: org.apache.cassandra.db.ImportTest flakiness

2024-04-22 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839672#comment-17839672
 ] 

Brandon Williams commented on CASSANDRA-19572:
--

Yes, it is.

> Test failure: org.apache.cassandra.db.ImportTest flakiness
> --
>
> Key: CASSANDRA-19572
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19572
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/bulk load
>Reporter: Brandon Williams
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> As discovered on CASSANDRA-19401, the tests in this class are flaky, at least 
> the following:
>  * testImportCorruptWithoutValidationWithCopying
>  * testImportInvalidateCache
>  * testImportCorruptWithCopying
>  * testImportCacheEnabledWithoutSrcDir
> [https://app.circleci.com/pipelines/github/instaclustr/cassandra/4199/workflows/a70b41d8-f848-4114-9349-9a01ac082281/jobs/223621/tests]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19572) Test failure: org.apache.cassandra.db.ImportTest flakiness

2024-04-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839671#comment-17839671
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19572 at 4/22/24 2:07 PM:


[~brandon.williams] do you still consider this to be a blocker for 
CASSANDRA-19401 ? I do not know what to make of it after your last comment in 
this ticket.


was (Author: smiklosovic):
[~brandon.williams] do you still consider this to be a blocker for 
CASSANDRA-19401. I do not know what to make of it after your last comment in 
this ticket.

> Test failure: org.apache.cassandra.db.ImportTest flakiness
> --
>
> Key: CASSANDRA-19572
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19572
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/bulk load
>Reporter: Brandon Williams
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> As discovered on CASSANDRA-19401, the tests in this class are flaky, at least 
> the following:
>  * testImportCorruptWithoutValidationWithCopying
>  * testImportInvalidateCache
>  * testImportCorruptWithCopying
>  * testImportCacheEnabledWithoutSrcDir
> [https://app.circleci.com/pipelines/github/instaclustr/cassandra/4199/workflows/a70b41d8-f848-4114-9349-9a01ac082281/jobs/223621/tests]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19572) Test failure: org.apache.cassandra.db.ImportTest flakiness

2024-04-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839671#comment-17839671
 ] 

Stefan Miklosovic commented on CASSANDRA-19572:
---

[~brandon.williams] do you still consider this to be a blocker for 
CASSANDRA-19401. I do not know what to make of it after your last comment in 
this ticket.

> Test failure: org.apache.cassandra.db.ImportTest flakiness
> --
>
> Key: CASSANDRA-19572
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19572
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/bulk load
>Reporter: Brandon Williams
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> As discovered on CASSANDRA-19401, the tests in this class are flaky, at least 
> the following:
>  * testImportCorruptWithoutValidationWithCopying
>  * testImportInvalidateCache
>  * testImportCorruptWithCopying
>  * testImportCacheEnabledWithoutSrcDir
> [https://app.circleci.com/pipelines/github/instaclustr/cassandra/4199/workflows/a70b41d8-f848-4114-9349-9a01ac082281/jobs/223621/tests]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19566) JSON encoded timestamp value does not always match non-JSON encoded value

2024-04-22 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-19566:
--
Status: Needs Committer  (was: Patch Available)

> JSON encoded timestamp value does not always match non-JSON encoded value
> -
>
> Key: CASSANDRA-19566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19566
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core, Legacy/CQL
>Reporter: Bowen Song
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Description:
> "SELECT JSON ..." and "toJson(...)" on Cassandra 4.1.4 produces different 
> date than "SELECT ..."  for some timestamp type values.
>  
> Steps to reproduce:
> {code:java}
> $ sudo docker pull cassandra:4.1.4
> $ sudo docker create --name cass cassandra:4.1.4
> $ sudo docker start cass
> $ # wait for the Cassandra instance becomes ready
> $ sudo docker exec -ti cass cqlsh
> Connected to Test Cluster at 127.0.0.1:9042
> [cqlsh 6.1.0 | Cassandra 4.1.4 | CQL spec 3.4.6 | Native protocol v5]
> Use HELP for help.
> cqlsh> create keyspace test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> use test;
> cqlsh:test> create table tbl (id int, ts timestamp, primary key (id));
> cqlsh:test> insert into tbl (id, ts) values (1, -1376701920);
> cqlsh:test> select tounixtimestamp(ts), ts, tojson(ts) from tbl where id=1;
>  system.tounixtimestamp(ts) | ts                              | system.tojson(ts)
> ----------------------------+---------------------------------+----------------------------
>                 -1376701920 | 1533-09-28 12:00:00.000000+0000 | "1533-09-18 12:00:00.000Z"
> (1 rows)
> cqlsh:test> select json * from tbl where id=1;
>  [json]
> ---------------------------------------------
>  {"id": 1, "ts": "1533-09-18 12:00:00.000Z"}
> (1 rows)
> {code}
>  
> Expected behaviour:
> The "select ts", "select tojson(ts)" and "select json *" should all produce 
> the same date.
>  
> Actual behaviour:
> The "select ts" produced the "1533-09-28" date but the "select tojson(ts)" 
> and "select json *" produced the "1533-09-18" date.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19566) JSON encoded timestamp value does not always match non-JSON encoded value

2024-04-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839670#comment-17839670
 ] 

Stefan Miklosovic commented on CASSANDRA-19566:
---

configuration_test.TestConfiguration test_change_durable_writes is 
CASSANDRA-19465, which I have seen returning recently; it has nothing to do 
with this patch though.

write_failures_test.TestWriteFailures test_mutation_v5 passes locally.

[CASSANDRA-19566-trunk|https://github.com/instaclustr/cassandra/tree/CASSANDRA-19566-trunk]
{noformat}
java17_pre-commit_tests 
  ✓ j17_build4m 43s
  ✓ j17_cqlsh_dtests_py311   6m 54s
  ✓ j17_cqlsh_dtests_py311_vnode 7m 36s
  ✓ j17_cqlsh_dtests_py387m 26s
  ✓ j17_cqlsh_dtests_py38_vnode  7m 30s
  ✓ j17_cqlshlib_cython_tests7m 43s
  ✓ j17_cqlshlib_tests   8m 45s
  ✓ j17_dtests  36m 55s
  ✓ j17_unit_tests  13m 17s
  ✓ j17_utests_latest   13m 54s
  ✕ j17_dtests_latest   36m 42s
  configuration_test.TestConfiguration test_change_durable_writes
  ✕ j17_dtests_vnode36m 23s
  write_failures_test.TestWriteFailures test_mutation_v5
  ✕ j17_jvm_dtests  30m 39s
  
org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest 
testOptionalMtlsModeDoNotAllowNonSSLConnections TIMEOUTED
  org.apache.cassandra.distributed.test.tcm.SplitBrainTest 
testSplitBrainStartup TIMEOUTED
  ✕ j17_jvm_dtests_latest_vnode 28m 39s
  junit.framework.TestSuite 
org.apache.cassandra.fuzz.harry.integration.model.InJVMTokenAwareExecutorTest 
TIMEOUTED
  org.apache.cassandra.distributed.test.tcm.SplitBrainTest 
testSplitBrainStartup TIMEOUTED
  ✕ j17_utests_oa   13m 38s
  org.apache.cassandra.tcm.DiscoverySimulationTest discoveryTest
{noformat}

[java17_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/4209/workflows/c5bb9ae4-b124-412e-90ac-8bb2657d1a2c]


> JSON encoded timestamp value does not always match non-JSON encoded value
> -
>
> Key: CASSANDRA-19566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19566
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core, Legacy/CQL
>Reporter: Bowen Song
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Description:
> "SELECT JSON ..." and "toJson(...)" on Cassandra 4.1.4 produces different 
> date than "SELECT ..."  for some timestamp type values.
>  
> Steps to reproduce:
> {code:java}
> $ sudo docker pull cassandra:4.1.4
> $ sudo docker create --name cass cassandra:4.1.4
> $ sudo docker start cass
> $ # wait for the Cassandra instance becomes ready
> $ sudo docker exec -ti cass cqlsh
> Connected to Test Cluster at 127.0.0.1:9042
> [cqlsh 6.1.0 | Cassandra 4.1.4 | CQL spec 3.4.6 | Native protocol v5]
> Use HELP for help.
> cqlsh> create keyspace test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> use test;
> cqlsh:test> create table tbl (id int, ts timestamp, primary key (id));
> cqlsh:test> insert into tbl (id, ts) values (1, -1376701920);
> cqlsh:test> select tounixtimestamp(ts), ts, tojson(ts) from tbl where id=1;
>  system.tounixtimestamp(ts) | ts                              | system.tojson(ts)
> ----------------------------+---------------------------------+----------------------------
>                 -1376701920 | 1533-09-28 12:00:00.000000+0000 | "1533-09-18 12:00:00.000Z"
> (1 rows)
> cqlsh:test> select json * from tbl where id=1;
>  [json]
> ---------------------------------------------
>  {"id": 1, "ts": "1533-09-18 12:00:00.000Z"}
> (1 rows)
> {code}
>  
> Expected behaviour:
> The "select ts", "select tojson(ts)" and "select json *" should all produce 
> the same date.
>  
> Actual behaviour:
> The "select ts" produced the "1533-09-28" date but the "select tojson(ts)" 
> and "select json *" produced the "1533-09-18" date.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19517) Raise priority of TCM internode messages during critical operations

2024-04-22 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-19517:

Reviewers: Marcus Eriksson

> Raise priority of TCM internode messages during critical operations
> ---
>
> Key: CASSANDRA-19517
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19517
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>Priority: Normal
> Attachments: ci_summary.html, result_details.tar.gz
>
>
> In a busy cluster, TCM messages may not get propagated throughout the 
> cluster, since they will be ordered together with other P1 messages (for 
> {{TCM_}}-prefixed verbs), and with P2 for all Paxos operations.
> To avoid this, and to make sure cluster metadata changes can continue, all 
> {{TCM_}}-prefixed verbs should have {{P0}} priority, just like Gossip 
> messages used to have. All Paxos messages that involve the distributed 
> metadata keyspace should now get an {{URGENT}} flag, which will instruct 
> internode messaging to schedule them on the {{URGENT_MESSAGES}} connection.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-19581) Add nodetool command to unregister LEFT nodes

2024-04-22 Thread Marcus Eriksson (Jira)
Marcus Eriksson created CASSANDRA-19581:
---

 Summary: Add nodetool command to unregister LEFT nodes
 Key: CASSANDRA-19581
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19581
 Project: Cassandra
  Issue Type: Improvement
  Components: Transactional Cluster Metadata
Reporter: Marcus Eriksson
Assignee: Marcus Eriksson


When decommissioning a node it still remains in ClusterMetadata with state = 
LEFT. We should provide a nodetool command to unregister such nodes completely.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra-builds) branch trunk updated: Change DSL for 5.0 and trunk to use the standalone jenkinsfile

2024-04-22 Thread mck
This is an automated email from the ASF dual-hosted git repository.

mck pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra-builds.git


The following commit(s) were added to refs/heads/trunk by this push:
 new 5a9ba1a  Change DSL for 5.0 and trunk to use the standalone jenkinsfile
5a9ba1a is described below

commit 5a9ba1a1962794a338cecaa7d8e7f23cd0ea09fd
Author: Mick Semb Wever 
AuthorDate: Wed Apr 3 21:53:11 2024 +0200

Change DSL for 5.0 and trunk to use the standalone jenkinsfile

 patch by Mick Semb Wever; reviewed by Brandon Williams for CASSANDRA-18594
---
 jenkins-dsl/cassandra_job_dsl_seed.groovy | 662 --
 jenkins-dsl/cassandra_pipeline.groovy |   4 +-
 2 files changed, 74 insertions(+), 592 deletions(-)

diff --git a/jenkins-dsl/cassandra_job_dsl_seed.groovy 
b/jenkins-dsl/cassandra_job_dsl_seed.groovy
index 9251db9..66084ec 100755
--- a/jenkins-dsl/cassandra_job_dsl_seed.groovy
+++ b/jenkins-dsl/cassandra_job_dsl_seed.groovy
@@ -24,62 +24,31 @@ arm64_test_label_enabled = false
 def use_arm64_test_label() { return arm64_enabled && arm64_test_label_enabled }
 
 def slaveLabel = 'cassandra'
-slaveDtestLabel = 'cassandra-dtest'
-slaveDtestLargeLabel = 'cassandra-dtest-large'
-slaveArm64Label = 'cassandra-arm64'
-slaveArm64DtestLabel = 'cassandra-arm64-dtest'
-slaveArm64DtestLargeLabel = 'cassandra-arm64-dtest-large'
+def slaveDtestLabel = 'cassandra-dtest'
+def slaveDtestLargeLabel = 'cassandra-dtest-large'
+def slaveArm64Label = 'cassandra-arm64'
+def slaveArm64DtestLabel = 'cassandra-arm64-dtest'
+def slaveArm64DtestLargeLabel = 'cassandra-arm64-dtest-large'
 def mainRepo = "https://github.com/apache/cassandra";
-if(binding.hasVariable("CASSANDRA_GIT_URL")) {
-mainRepo = "${CASSANDRA_GIT_URL}"
-}
 def buildsRepo = "https://github.com/apache/cassandra-builds";
-if(binding.hasVariable("CASSANDRA_BUILDS_GIT_URL")) {
-buildsRepo = "${CASSANDRA_BUILDS_GIT_URL}"
-}
 def buildsBranch = "trunk"
-if(binding.hasVariable("CASSANDRA_BUILDS_BRANCH")) {
-buildsBranch = "${CASSANDRA_BUILDS_BRANCH}"
-}
 def dtestRepo = "https://github.com/apache/cassandra-dtest";
-if(binding.hasVariable("CASSANDRA_DTEST_GIT_URL")) {
-dtestRepo = "${CASSANDRA_DTEST_GIT_URL}"
-}
 def dtestBranch = "trunk"
-if(binding.hasVariable("CASSANDRA_DTEST_GIT_BRANCH")) {
-dtestRepo = "${CASSANDRA_DTEST_GIT_BRANCH}"
-}
 def buildDescStr = 'REF = ${GIT_BRANCH}  COMMIT = ${GIT_COMMIT}'
-// Cassandra active branches
-def cassandraBranches = ['cassandra-2.2', 'cassandra-3.0', 'cassandra-3.11', 
'cassandra-4.0', 'cassandra-4.1']
-if(binding.hasVariable("CASSANDRA_BRANCHES")) {
-cassandraBranches = "${CASSANDRA_BRANCHES}".split(",")
-}
+// From Cassandra 5.0 everything is defined in the in-tree Jenkinsfiles
+def cassandraBranches = ['cassandra-5.0', 'trunk']
+// Cassandra legacy branches (still using with external stages defined by dsl 
in this file)
+def legacyCassandraBranches = ['cassandra-2.2', 'cassandra-3.0', 
'cassandra-3.11', 'cassandra-4.0', 'cassandra-4.1']
 // Ant test targets
 def testTargets = ['test', 'test-burn', 'test-cdc', 'test-compression', 
'stress-test', 'fqltool-test', 'long-test', 'jvm-dtest', 'jvm-dtest-upgrade', 
'microbench']
-if(binding.hasVariable("CASSANDRA_ANT_TEST_TARGETS")) {
-testTargets = "${CASSANDRA_ANT_TEST_TARGETS}".split(",")
-}
-
 def testDockerImage = 
'apache/cassandra-testing-ubuntu2004-java11-w-dependencies'
 
 // Dtest test targets
 def dtestTargets = ['dtest', 'dtest-novnode', 'dtest-offheap', 'dtest-large', 
'dtest-large-novnode', 'dtest-upgrade']
-if(binding.hasVariable("CASSANDRA_DTEST_TEST_TARGETS")) {
-dtestTargets = "${CASSANDRA_DTEST_TEST_TARGETS}".split(",")
-}
 def dtestDockerImage = 'apache/cassandra-testing-ubuntu2004-java11'
 
-// tmp for CASSANDRA-18665
-def cassandraBranchesInTreeScript = ['cassandra-5.0', 'trunk']
-def testTargetsInTreeScript = ['test', 'test-burn', 'test-cdc', 
'test-compression', 'test-oa', 'test-system-keyspace-directory', 'test-trie', 
'stress-test', 'fqltool-test', 'long-test', 'jvm-dtest', 'jvm-dtest-upgrade', 
'jvm-dtest-novnode', 'jvm-dtest-upgrade-novnode', 'microbench', 
'simulator-dtest']
-def dtestTargetsInTreeScript = ['dtest', 'dtest-novnode', 'dtest-offheap', 
'dtest-large', 'dtest-large-novnode', 'dtest-upgrade', 'dtest-upgrade-novnode', 
'dtest-upgrade-large', 'dtest-upgrade-novnode-large']
-
 // expected longest job runtime
 def maxJobHours = 12
-if(binding.hasVariable("MAX_JOB_HOURS")) {
-maxJobHours = ${MAX_JOB_HOURS}
-}
 
 // how many splits are dtest jobs matrixed into
 def testSplits = 8
@@ -89,8 +58,7 @@ def dtestLargeSplits = 8
 def exists(branchName, targetName) {
 switch (targetName) {
 case 'artifact':
-// migrated to in-pipeline stages
-return true; // TODO CASSANDRA-18133 // return !(branchName = 
'trunk' || branchName ==~ /cassandra-5.\d+/)
+return true;
 case 't

[jira] [Updated] (CASSANDRA-19191) Optimisations to PlacementForRange, improve lookup on r/w path

2024-04-22 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-19191:

Attachment: ci_summary-1.html

> Optimisations to PlacementForRange, improve lookup on r/w path
> --
>
> Key: CASSANDRA-19191
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19191
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 5.1-alpha1
>
> Attachments: ci_summary-1.html, ci_summary.html, result_details.tar.gz
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The lookup used when selecting the appropriate replica group for a range or 
> token while performing reads and writes is extremely simplistic and 
> inefficient. There is plenty of scope to improve {{PlacementsForRange}} by 
> replacing the current naive iteration with a more efficient lookup.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19580) Unable to contact any seeds with node in hibernate status

2024-04-22 Thread Cameron Zemek (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cameron Zemek updated CASSANDRA-19580:
--
Description: 
We have a customer running into the error 'Unable to contact any seeds!'. I 
have been able to reproduce this issue if I kill Cassandra as it is joining, 
which will put the node into hibernate status. Once a node is in hibernate it 
will no longer receive any SYN messages from other nodes during startup, and as 
it sends only itself as the digest in outbound SYN messages, it never receives 
any states in any of the ACK replies. So once it gets to the `seenAnySeed` 
check, it fails as the endpointStateMap is empty.

A workaround is copying the system.peers table from another node, but this is 
less than ideal. I tested modifying maybeGossipToSeed as follows:
{code:java}
    /* Possibly gossip to a seed for facilitating partition healing */
    private void maybeGossipToSeed(MessageOut<GossipDigestSyn> prod)
    {
        int size = seeds.size();
        if (size > 0)
        {
            if (size == 1 && seeds.contains(FBUtilities.getBroadcastAddress()))
            {
                return;
            }
            if (liveEndpoints.size() == 0)
            {
                List<GossipDigest> gDigests = prod.payload.gDigests;
                if (gDigests.size() == 1 && gDigests.get(0).endpoint.equals(FBUtilities.getBroadcastAddress()))
                {
                    // Send an empty digest list so the seeds reply with their
                    // full state, as in the shadow round.
                    gDigests = new ArrayList<GossipDigest>();
                    GossipDigestSyn digestSynMessage = new GossipDigestSyn(DatabaseDescriptor.getClusterName(),
                                                                           DatabaseDescriptor.getPartitionerName(),
                                                                           gDigests);
                    MessageOut<GossipDigestSyn> message = new MessageOut<GossipDigestSyn>(MessagingService.Verb.GOSSIP_DIGEST_SYN,
                                                                                          digestSynMessage,
                                                                                          GossipDigestSyn.serializer);
                    sendGossip(message, seeds);
                }
                else
                {
                    sendGossip(prod, seeds);
                }
            }
            else
            {
                /* Gossip with the seed with some probability. */
                double probability = seeds.size() / (double) (liveEndpoints.size() + unreachableEndpoints.size());
                double randDbl = random.nextDouble();
                if (randDbl <= probability)
                    sendGossip(prod, seeds);
            }
        }
    }
{code}
The only problem is that this is the same as the SYN from a shadow round. It does 
resolve the issue, however, as the node then receives an ACK with all the states.
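
For intuition on the probabilistic branch shown above: with, say, 3 seeds, 10 live 
endpoints and 2 unreachable ones, the gossip-to-seed probability is 
3 / (10 + 2) = 0.25, i.e. roughly every fourth gossip round reaches a seed. The 
new liveEndpoints.size() == 0 branch instead guarantees that a SYN with an empty 
digest list reaches a seed when no other endpoint is reachable.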


[jira] [Created] (CASSANDRA-19580) Unable to contact any seeds with node in hibernate status

2024-04-22 Thread Cameron Zemek (Jira)
Cameron Zemek created CASSANDRA-19580:
-

 Summary: Unable to contact any seeds with node in hibernate status
 Key: CASSANDRA-19580
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19580
 Project: Cassandra
  Issue Type: Bug
Reporter: Cameron Zemek


We have a customer running into the error 'Unable to contact any seeds!'. I have 
been able to reproduce this issue if I kill Cassandra as it is joining, which 
puts the node into hibernate status. Once a node is in hibernate it will no 
longer receive any SYN messages from other nodes during startup, and as it sends 
only itself as a digest in outbound SYN messages, it never receives any states in 
any of the ACK replies. So once it reaches the `seenAnySeed` check, it fails 
because the endpointStateMap is empty.

 

A workaround is copying the system.peers table from another node, but this is 
less than ideal. I tested modifying maybeGossipToSeed as follows:
{code:java}
    /* Possibly gossip to a seed for facilitating partition healing */
    private void maybeGossipToSeed(MessageOut<GossipDigestSyn> prod)
    {
        int size = seeds.size();
        if (size > 0)
        {
            if (size == 1 && seeds.contains(FBUtilities.getBroadcastAddress()))
            {
                return;
            }
            if (liveEndpoints.size() == 0)
            {
                List<GossipDigest> gDigests = prod.payload.gDigests;
                if (gDigests.size() == 1 && gDigests.get(0).endpoint.equals(FBUtilities.getBroadcastAddress()))
                {
                    gDigests = new ArrayList<GossipDigest>();
                    GossipDigestSyn digestSynMessage = new GossipDigestSyn(DatabaseDescriptor.getClusterName(),
                                                                           DatabaseDescriptor.getPartitionerName(),
                                                                           gDigests);
                    MessageOut<GossipDigestSyn> message = new MessageOut<GossipDigestSyn>(MessagingService.Verb.GOSSIP_DIGEST_SYN,
                                                                                          digestSynMessage,
                                                                                          GossipDigestSyn.serializer);
                    sendGossip(message, seeds);
                }
                else
                {
                    sendGossip(prod, seeds);
                }
            }
            else
            {
                /* Gossip with the seed with some probability. */
                double probability = seeds.size() / (double) (liveEndpoints.size() + unreachableEndpoints.size());
                double randDbl = random.nextDouble();
                if (randDbl <= probability)
                    sendGossip(prod, seeds);
            }
        }
    }
{code}
The only problem is that this is the same as the SYN from a shadow round. It does 
resolve the issue, however, as the node then receives an ACK with all the states.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19579) threads lingering after driver shutdown: session close starts thread and doesn't await its stop

2024-04-22 Thread Thomas Klambauer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Klambauer updated CASSANDRA-19579:
-
Description: 
We are checking remaining/lingering threads during shutdown.

We noticed some with the naming pattern/thread factory "globalEventExecutor-1-2" 
Id=146 TIMED_WAITING.

This one seems to be created during shutdown / session close and is not 
awaited/shut down:

{noformat}
addTask:156, GlobalEventExecutor (io.netty.util.concurrent)
execute0:225, GlobalEventExecutor (io.netty.util.concurrent)
execute:221, GlobalEventExecutor (io.netty.util.concurrent)
onClose:188, DefaultNettyOptions (com.datastax.oss.driver.internal.core.context)
onChildrenClosed:589, DefaultSession$SingleThreaded 
(com.datastax.oss.driver.internal.core.session)
lambda$close$9:552, DefaultSession$SingleThreaded 
(com.datastax.oss.driver.internal.core.session)
run:-1, 860270832 
(com.datastax.oss.driver.internal.core.session.DefaultSession$SingleThreaded$$Lambda$9508)
tryFire$$$capture:783, CompletableFuture$UniRun (java.util.concurrent)
tryFire:-1, CompletableFuture$UniRun (java.util.concurrent)
 - Async stack trace
addTask:-1, SingleThreadEventExecutor (io.netty.util.concurrent)
execute:836, SingleThreadEventExecutor (io.netty.util.concurrent)
execute0:827, SingleThreadEventExecutor (io.netty.util.concurrent)
execute:817, SingleThreadEventExecutor (io.netty.util.concurrent)
claim:568, CompletableFuture$UniCompletion (java.util.concurrent)
tryFire$$$capture:780, CompletableFuture$UniRun (java.util.concurrent)
tryFire:-1, CompletableFuture$UniRun (java.util.concurrent)
 - Async stack trace
:767, CompletableFuture$UniRun (java.util.concurrent)
uniRunStage:801, CompletableFuture (java.util.concurrent)
thenRunAsync:2136, CompletableFuture (java.util.concurrent)
thenRunAsync:143, CompletableFuture (java.util.concurrent)
whenAllDone:75, CompletableFutures 
(com.datastax.oss.driver.internal.core.util.concurrent)
close:551, DefaultSession$SingleThreaded 
(com.datastax.oss.driver.internal.core.session)
access$1000:300, DefaultSession$SingleThreaded 
(com.datastax.oss.driver.internal.core.session)
lambda$closeAsync$1:272, DefaultSession 
(com.datastax.oss.driver.internal.core.session)
runTask:98, PromiseTask (io.netty.util.concurrent)
run:106, PromiseTask (io.netty.util.concurrent)
runTask$$$capture:174, AbstractEventExecutor (io.netty.util.concurrent)
runTask:-1, AbstractEventExecutor (io.netty.util.concurrent)
 - Async stack trace
addTask:-1, SingleThreadEventExecutor (io.netty.util.concurrent)
execute:836, SingleThreadEventExecutor (io.netty.util.concurrent)
execute0:827, SingleThreadEventExecutor (io.netty.util.concurrent)
execute:817, SingleThreadEventExecutor (io.netty.util.concurrent)
submit:118, AbstractExecutorService (java.util.concurrent)
submit:118, AbstractEventExecutor (io.netty.util.concurrent)
on:57, RunOrSchedule (com.datastax.oss.driver.internal.core.util.concurrent)
closeSafely:286, DefaultSession (com.datastax.oss.driver.internal.core.session)
closeAsync:272, DefaultSession (com.datastax.oss.driver.internal.core.session)
close:76, AsyncAutoCloseable (com.datastax.oss.driver.api.core)
shutdown:172, DataStaxBackend 
(com.dynatrace.apm.server.core.persistence.dcc.impl.datastax.backend)
shutdown:389, DccServiceImpl 
(com.dynatrace.apm.server.core.persistence.dcc.impl.common)
shutdown:78, CassandraManagement (com.compuware.apm.server.core.lifecycle)
shutdownWithoutExit:3811, ServerLifecycle (com.compuware.apm.server.core.api)
shutdown:3648, ServerLifecycle (com.compuware.apm.server.core.api)
run:121, ServerShutdownRunner (com.compuware.apm.server.core)
run:829, Thread (java.lang)
{noformat}

The initial close here is called on com.datastax.oss.driver.api.core.CqlSession.

The Netty framework suggests calling
io.netty.util.concurrent.GlobalEventExecutor#awaitInactivity
during shutdown to await the event thread's stopping.

(Slightly related issue in Netty: https://github.com/netty/netty/issues/2084 )

Suggestion: maybe add GlobalEventExecutor.INSTANCE.awaitInactivity with some 
timeout during close, around here:
https://github.com/apache/cassandra-java-driver/blob/4.x/core/src/main/java/com/datastax/oss/driver/internal/core/context/DefaultNettyOptions.java#L199

Note that this might slow down closing by up to 2 seconds if the Netty issue 
comment is correct.
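
A minimal sketch of the suggested behaviour from the application side (assuming 
driver 4.x and Netty's public GlobalEventExecutor API; this is not the driver's 
actual shutdown code, and the 5-second timeout is an arbitrary choice):

{code:java}
import java.util.concurrent.TimeUnit;

import com.datastax.oss.driver.api.core.CqlSession;

import io.netty.util.concurrent.GlobalEventExecutor;

public class DriverShutdown
{
    public static void main(String[] args) throws InterruptedException
    {
        CqlSession session = CqlSession.builder().build();
        try
        {
            // ... run queries ...
        }
        finally
        {
            // closing the session schedules a task on Netty's global executor
            session.close();
            // wait for the global event executor's worker thread to go quiet,
            // so no "globalEventExecutor-1-*" thread lingers past shutdown
            GlobalEventExecutor.INSTANCE.awaitInactivity(5, TimeUnit.SECONDS);
        }
    }
}
{code}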


[jira] [Created] (CASSANDRA-19579) threads lingering after driver shutdown: session close starts thread and doesn't await its stop

2024-04-22 Thread Thomas Klambauer (Jira)
Thomas Klambauer created CASSANDRA-19579:


 Summary: threads lingering after driver shutdown: session close 
starts thread and doesn't await its stop
 Key: CASSANDRA-19579
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19579
 Project: Cassandra
  Issue Type: Bug
  Components: Client/java-driver
Reporter: Thomas Klambauer
Assignee: Henry Hughes


We are checking remaining/lingering threads during shutdown.

We noticed some with the naming pattern/thread factory "globalEventExecutor-1-2" 
Id=146 TIMED_WAITING.

This one seems to be created during shutdown / session close and is not 
awaited/shut down:

{noformat}
addTask:156, GlobalEventExecutor (io.netty.util.concurrent)
execute0:225, GlobalEventExecutor (io.netty.util.concurrent)
execute:221, GlobalEventExecutor (io.netty.util.concurrent)
onClose:188, DefaultNettyOptions (com.datastax.oss.driver.internal.core.context)
onChildrenClosed:589, DefaultSession$SingleThreaded (com.datastax.oss.driver.internal.core.session)
lambda$close$9:552, DefaultSession$SingleThreaded (com.datastax.oss.driver.internal.core.session)
run:-1, 860270832 (com.datastax.oss.driver.internal.core.session.DefaultSession$SingleThreaded$$Lambda$9508)
tryFire$$$capture:783, CompletableFuture$UniRun (java.util.concurrent)
tryFire:-1, CompletableFuture$UniRun (java.util.concurrent)
 - Async stack trace
addTask:-1, SingleThreadEventExecutor (io.netty.util.concurrent)
execute:836, SingleThreadEventExecutor (io.netty.util.concurrent)
execute0:827, SingleThreadEventExecutor (io.netty.util.concurrent)
execute:817, SingleThreadEventExecutor (io.netty.util.concurrent)
claim:568, CompletableFuture$UniCompletion (java.util.concurrent)
tryFire$$$capture:780, CompletableFuture$UniRun (java.util.concurrent)
tryFire:-1, CompletableFuture$UniRun (java.util.concurrent)
 - Async stack trace
:767, CompletableFuture$UniRun (java.util.concurrent)
uniRunStage:801, CompletableFuture (java.util.concurrent)
thenRunAsync:2136, CompletableFuture (java.util.concurrent)
thenRunAsync:143, CompletableFuture (java.util.concurrent)
whenAllDone:75, CompletableFutures (com.datastax.oss.driver.internal.core.util.concurrent)
close:551, DefaultSession$SingleThreaded (com.datastax.oss.driver.internal.core.session)
access$1000:300, DefaultSession$SingleThreaded (com.datastax.oss.driver.internal.core.session)
lambda$closeAsync$1:272, DefaultSession (com.datastax.oss.driver.internal.core.session)
runTask:98, PromiseTask (io.netty.util.concurrent)
run:106, PromiseTask (io.netty.util.concurrent)
runTask$$$capture:174, AbstractEventExecutor (io.netty.util.concurrent)
runTask:-1, AbstractEventExecutor (io.netty.util.concurrent)
 - Async stack trace
addTask:-1, SingleThreadEventExecutor (io.netty.util.concurrent)
execute:836, SingleThreadEventExecutor (io.netty.util.concurrent)
execute0:827, SingleThreadEventExecutor (io.netty.util.concurrent)
execute:817, SingleThreadEventExecutor (io.netty.util.concurrent)
submit:118, AbstractExecutorService (java.util.concurrent)
submit:118, AbstractEventExecutor (io.netty.util.concurrent)
on:57, RunOrSchedule (com.datastax.oss.driver.internal.core.util.concurrent)
closeSafely:286, DefaultSession (com.datastax.oss.driver.internal.core.session)
closeAsync:272, DefaultSession (com.datastax.oss.driver.internal.core.session)
close:76, AsyncAutoCloseable (com.datastax.oss.driver.api.core)
{noformat}

The initial close here is called on com.datastax.oss.driver.api.core.CqlSession.

The Netty framework suggests calling
io.netty.util.concurrent.GlobalEventExecutor#awaitInactivity
during shutdown to await the event thread's stopping.

(Slightly related issue in Netty: https://github.com/netty/netty/issues/2084 )

Suggestion: maybe add GlobalEventExecutor.INSTANCE.awaitInactivity with some 
timeout during close, around here:
https://github.com/apache/cassandra-java-driver/blob/4.x/core/src/main/java/com/datastax/oss/driver/internal/core/context/DefaultNettyOptions.java#L199

Note that this might slow down closing by up to 2 seconds if the Netty issue 
comment is correct.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19579) threads lingering after driver shutdown: session close starts thread and doesn't await its stop

2024-04-22 Thread Thomas Klambauer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Klambauer updated CASSANDRA-19579:
-
Description: 
We are checking remaining/lingering threads during shutdown.

We noticed some with the naming pattern/thread factory "globalEventExecutor-1-2" 
Id=146 TIMED_WAITING.

This one seems to be created during shutdown / session close and is not 
awaited/shut down:
{noformat}
addTask:156, GlobalEventExecutor (io.netty.util.concurrent)
execute0:225, GlobalEventExecutor (io.netty.util.concurrent)
execute:221, GlobalEventExecutor (io.netty.util.concurrent)
onClose:188, DefaultNettyOptions (com.datastax.oss.driver.internal.core.context)
onChildrenClosed:589, DefaultSession$SingleThreaded 
(com.datastax.oss.driver.internal.core.session)
lambda$close$9:552, DefaultSession$SingleThreaded 
(com.datastax.oss.driver.internal.core.session)
run:-1, 860270832 
(com.datastax.oss.driver.internal.core.session.DefaultSession$SingleThreaded$$Lambda$9508)
tryFire$$$capture:783, CompletableFuture$UniRun (java.util.concurrent)
tryFire:-1, CompletableFuture$UniRun (java.util.concurrent)
 - Async stack trace
addTask:-1, SingleThreadEventExecutor (io.netty.util.concurrent)
execute:836, SingleThreadEventExecutor (io.netty.util.concurrent)
execute0:827, SingleThreadEventExecutor (io.netty.util.concurrent)
execute:817, SingleThreadEventExecutor (io.netty.util.concurrent)
claim:568, CompletableFuture$UniCompletion (java.util.concurrent)
tryFire$$$capture:780, CompletableFuture$UniRun (java.util.concurrent)
tryFire:-1, CompletableFuture$UniRun (java.util.concurrent)
 - Async stack trace
:767, CompletableFuture$UniRun (java.util.concurrent)
uniRunStage:801, CompletableFuture (java.util.concurrent)
thenRunAsync:2136, CompletableFuture (java.util.concurrent)
thenRunAsync:143, CompletableFuture (java.util.concurrent)
whenAllDone:75, CompletableFutures 
(com.datastax.oss.driver.internal.core.util.concurrent)
close:551, DefaultSession$SingleThreaded 
(com.datastax.oss.driver.internal.core.session)
access$1000:300, DefaultSession$SingleThreaded 
(com.datastax.oss.driver.internal.core.session)
lambda$closeAsync$1:272, DefaultSession 
(com.datastax.oss.driver.internal.core.session)
runTask:98, PromiseTask (io.netty.util.concurrent)
run:106, PromiseTask (io.netty.util.concurrent)
runTask$$$capture:174, AbstractEventExecutor (io.netty.util.concurrent)
runTask:-1, AbstractEventExecutor (io.netty.util.concurrent)
 - Async stack trace
addTask:-1, SingleThreadEventExecutor (io.netty.util.concurrent)
execute:836, SingleThreadEventExecutor (io.netty.util.concurrent)
execute0:827, SingleThreadEventExecutor (io.netty.util.concurrent)
execute:817, SingleThreadEventExecutor (io.netty.util.concurrent)
submit:118, AbstractExecutorService (java.util.concurrent)
submit:118, AbstractEventExecutor (io.netty.util.concurrent)
on:57, RunOrSchedule (com.datastax.oss.driver.internal.core.util.concurrent)
closeSafely:286, DefaultSession (com.datastax.oss.driver.internal.core.session)
closeAsync:272, DefaultSession (com.datastax.oss.driver.internal.core.session)
close:76, AsyncAutoCloseable (com.datastax.oss.driver.api.core)
shutdown:172, DataStaxBackend 
(com.dynatrace.apm.server.core.persistence.dcc.impl.datastax.backend)
shutdown:389, DccServiceImpl 
(com.dynatrace.apm.server.core.persistence.dcc.impl.common)
shutdown:78, CassandraManagement (com.compuware.apm.server.core.lifecycle)
shutdownWithoutExit:3811, ServerLifecycle (com.compuware.apm.server.core.api)
shutdown:3648, ServerLifecycle (com.compuware.apm.server.core.api)
run:121, ServerShutdownRunner (com.compuware.apm.server.core)
run:829, Thread (java.lang)
{noformat}
The initial close here is called on com.datastax.oss.driver.api.core.CqlSession.

The Netty framework suggests calling
io.netty.util.concurrent.GlobalEventExecutor#awaitInactivity
during shutdown to await the event thread's stopping.

(Slightly related issue in Netty: [https://github.com/netty/netty/issues/2084] )

Suggestion: maybe add GlobalEventExecutor.INSTANCE.awaitInactivity with some 
timeout during close, around here:
[https://github.com/apache/cassandra-java-driver/blob/4.x/core/src/main/java/com/datastax/oss/driver/internal/core/context/DefaultNettyOptions.java#L199]

Note that this might slow down closing by up to 2 seconds if the Netty issue 
comment is correct.

This is on the latest DataStax Java driver version: 4.17.


[jira] [Commented] (CASSANDRA-19566) JSON encoded timestamp value does not always match non-JSON encoded value

2024-04-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839570#comment-17839570
 ] 

Stefan Miklosovic commented on CASSANDRA-19566:
---

As I am writing this, I am running trunk's pre-commit workflow in CircleCI. I 
do not have my own instance of Jenkins, and, looking into the history, I do not 
believe that Cassandra-5-devbranch here (1) will actually test anything in a 
reasonable time without failures to provide upgrade tests. E.g. the last build 
has been stuck on its summary for 15 hours.

(1) [https://ci-cassandra.apache.org/view/patches/job/Cassandra-5-devbranch/]

I would be delighted if the community participated in the builds and provided 
upgrade tests as requested, most ideally via CircleCI.

> JSON encoded timestamp value does not always match non-JSON encoded value
> ---------------------------------------------------------------------------
>
> Key: CASSANDRA-19566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19566
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core, Legacy/CQL
>Reporter: Bowen Song
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Description:
> "SELECT JSON ..." and "toJson(...)" on Cassandra 4.1.4 produces different 
> date than "SELECT ..."  for some timestamp type values.
>  
> Steps to reproduce:
> {code:java}
> $ sudo docker pull cassandra:4.1.4
> $ sudo docker create --name cass cassandra:4.1.4
> $ sudo docker start cass
> $ # wait for the Cassandra instance to become ready
> $ sudo docker exec -ti cass cqlsh
> Connected to Test Cluster at 127.0.0.1:9042
> [cqlsh 6.1.0 | Cassandra 4.1.4 | CQL spec 3.4.6 | Native protocol v5]
> Use HELP for help.
> cqlsh> create keyspace test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> use test;
> cqlsh:test> create table tbl (id int, ts timestamp, primary key (id));
> cqlsh:test> insert into tbl (id, ts) values (1, -1376701920);
> cqlsh:test> select tounixtimestamp(ts), ts, tojson(ts) from tbl where id=1;
>  system.tounixtimestamp(ts) | ts                              | system.tojson(ts)
> ----------------------------+---------------------------------+----------------------------
>                 -1376701920 | 1533-09-28 12:00:00.000000+0000 | "1533-09-18 12:00:00.000Z"
> (1 rows)
> cqlsh:test> select json * from tbl where id=1;
>  [json]
> ---------------------------------------------
>  {"id": 1, "ts": "1533-09-18 12:00:00.000Z"}
> (1 rows)
> {code}
>  
> Expected behaviour:
> The "select ts", "select tojson(ts)" and "select json *" should all produce 
> the same date.
>  
> Actual behaviour:
> The "select ts" produced the "1533-09-28" date but the "select tojson(ts)" 
> and "select json *" produced the "1533-09-18" date.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19566) JSON encoded timestamp value does not always match non-JSON encoded value

2024-04-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839561#comment-17839561
 ] 

Stefan Miklosovic commented on CASSANDRA-19566:
---

[CASSANDRA-19566-4.1|https://github.com/instaclustr/cassandra/tree/CASSANDRA-19566-4.1]
{noformat}
java11_pre-commit_tests
  ✓ j11_build                     2m 20s
  ✓ j11_cqlsh_dtests_py3          5m 35s
  ✓ j11_cqlsh_dtests_py311        6m 11s
  ✓ j11_cqlsh_dtests_py311_vnode  6m 25s
  ✓ j11_cqlsh_dtests_py38         5m 50s
  ✓ j11_cqlsh_dtests_py38_vnode   5m 55s
  ✓ j11_cqlsh_dtests_py3_vnode    5m 45s
  ✓ j11_cqlshlib_cython_tests     7m 41s
  ✓ j11_cqlshlib_tests            7m 1s
  ✓ j11_dtests                   34m 43s
  ✓ j11_dtests_vnode             36m 19s
  ✓ j11_jvm_dtests               19m 12s
  ✓ j11_jvm_dtests_vnode         12m 7s
  ✕ j11_unit_tests                8m 48s
      org.apache.cassandra.cql3.MemtableSizeTest testSize[skiplist]
{noformat}

[java11_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/4206/workflows/5fb136cf-1503-40e0-a2a7-eeb93edf747b]


> JSON encoded timestamp value does not always match non-JSON encoded value
> ---------------------------------------------------------------------------
>
> Key: CASSANDRA-19566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19566
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core, Legacy/CQL
>Reporter: Bowen Song
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Description:
> "SELECT JSON ..." and "toJson(...)" on Cassandra 4.1.4 produces different 
> date than "SELECT ..."  for some timestamp type values.
>  
> Steps to reproduce:
> {code:java}
> $ sudo docker pull cassandra:4.1.4
> $ sudo docker create --name cass cassandra:4.1.4
> $ sudo docker start cass
> $ # wait for the Cassandra instance to become ready
> $ sudo docker exec -ti cass cqlsh
> Connected to Test Cluster at 127.0.0.1:9042
> [cqlsh 6.1.0 | Cassandra 4.1.4 | CQL spec 3.4.6 | Native protocol v5]
> Use HELP for help.
> cqlsh> create keyspace test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> use test;
> cqlsh:test> create table tbl (id int, ts timestamp, primary key (id));
> cqlsh:test> insert into tbl (id, ts) values (1, -1376701920);
> cqlsh:test> select tounixtimestamp(ts), ts, tojson(ts) from tbl where id=1;
>  system.tounixtimestamp(ts) | ts                              | system.tojson(ts)
> ----------------------------+---------------------------------+----------------------------
>                 -1376701920 | 1533-09-28 12:00:00.000000+0000 | "1533-09-18 12:00:00.000Z"
> (1 rows)
> cqlsh:test> select json * from tbl where id=1;
>  [json]
> ---------------------------------------------
>  {"id": 1, "ts": "1533-09-18 12:00:00.000Z"}
> (1 rows)
> {code}
>  
> Expected behaviour:
> The "select ts", "select tojson(ts)" and "select json *" should all produce 
> the same date.
>  
> Actual behaviour:
> The "select ts" produced the "1533-09-28" date but the "select tojson(ts)" 
> and "select json *" produced the "1533-09-18" date.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19566) JSON encoded timestamp value does not always match non-JSON encoded value

2024-04-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839560#comment-17839560
 ] 

Stefan Miklosovic commented on CASSANDRA-19566:
---

[CASSANDRA-19566-4.0|https://github.com/instaclustr/cassandra/tree/CASSANDRA-19566-4.0]
{noformat}
java8_pre-commit_tests
  ✓ j8_build                          5m 9s
  ✓ j8_cqlsh-dtests-py2-no-vnodes     6m 30s
  ✓ j8_cqlsh-dtests-py2-with-vnodes   8m 15s
  ✓ j8_cqlsh_dtests_py3               7m 52s
  ✓ j8_cqlsh_dtests_py311             7m 39s
  ✓ j8_cqlsh_dtests_py311_vnode       9m 13s
  ✓ j8_cqlsh_dtests_py38              7m 35s
  ✓ j8_cqlsh_dtests_py38_vnode       10m 6s
  ✓ j8_cqlsh_dtests_py3_vnode         8m 31s
  ✓ j8_cqlshlib_tests                 8m 55s
  ✓ j8_dtests                        32m 10s
  ✓ j8_dtests_vnode                  36m 28s
  ✓ j8_jvm_dtests                    16m 6s
  ✓ j11_dtests_vnode                 36m 12s
  ✓ j11_dtests                       34m 0s
  ✓ j11_cqlsh_dtests_py3_vnode        5m 46s
  ✓ j11_cqlsh_dtests_py38_vnode       5m 39s
  ✓ j11_cqlsh_dtests_py38             5m 58s
  ✓ j11_cqlsh_dtests_py311_vnode      5m 52s
  ✓ j11_cqlsh_dtests_py311            5m 24s
  ✓ j11_cqlsh_dtests_py3              5m 24s
  ✓ j11_cqlsh-dtests-py2-with-vnodes  5m 38s
  ✓ j11_cqlsh-dtests-py2-no-vnodes    5m 40s
  ✕ j8_unit_tests                    10m 17s
      org.apache.cassandra.cql3.MemtableSizeTest testTruncationReleasesLogSpace
  ✕ j8_utests_system_keyspace_directory  8m 9s
      org.apache.cassandra.cql3.MemtableSizeTest testTruncationReleasesLogSpace
  ✕ j11_unit_tests                    7m 58s
      org.apache.cassandra.cql3.MemtableSizeTest testTruncationReleasesLogSpace
      org.apache.cassandra.cql3.ViewFilteringClustering2Test testClusteringKeyFilteringRestrictions[0]
{noformat}

[java8_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/4207/workflows/1b8ba460-20c2-42ad-8d89-704a09c4d211]


> JSON encoded timestamp value does not always match non-JSON encoded value
> ---------------------------------------------------------------------------
>
> Key: CASSANDRA-19566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19566
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core, Legacy/CQL
>Reporter: Bowen Song
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Description:
> "SELECT JSON ..." and "toJson(...)" on Cassandra 4.1.4 produces different 
> date than "SELECT ..."  for some timestamp type values.
>  
> Steps to reproduce:
> {code:java}
> $ sudo docker pull cassandra:4.1.4
> $ sudo docker create --name cass cassandra:4.1.4
> $ sudo docker start cass
> $ # wait for the Cassandra instance to become ready
> $ sudo docker exec -ti cass cqlsh
> Connected to Test Cluster at 127.0.0.1:9042
> [cqlsh 6.1.0 | Cassandra 4.1.4 | CQL spec 3.4.6 | Native protocol v5]
> Use HELP for help.
> cqlsh> create keyspace test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> use test;
> cqlsh:test> create table tbl (id int, ts timestamp, primary key (id));
> cqlsh:test> insert into tbl (id, ts) values (1, -1376701920);
> cqlsh:test> select tounixtimestamp(ts), ts, tojson(ts) from tbl where id=1;
>  system.tounixtimestamp(ts) | ts                              | system.tojson(ts)
> ----------------------------+---------------------------------+----------------------------
>                 -1376701920 | 1533-09-28 12:00:00.000000+0000 | "1533-09-18 12:00:00.000Z"
> (1 rows)
> cqlsh:test> select json * from tbl where id=1;
>  [json]
> ---------------------------------------------
>  {"id": 1, "ts": "1533-09-18 12:00:00.000Z"}
> (1 rows)
> {code}
>  
> Expected behaviour:
> The "select ts", "select tojson(ts)" and "select json *" should all produce 
> the same date.
>  
> Actual behaviour:
> The "select ts" produced the "1533-09-28" date but the "select tojson(ts)" 
> and "select json *" produced the "1533-09-18" date.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org