[jira] [Commented] (CASSANDRA-19189) Revisit use of sealed period lookup tables

2024-02-05 Thread Marcus Eriksson (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814653#comment-17814653
 ] 

Marcus Eriksson commented on CASSANDRA-19189:
-

and https://github.com/apache/cassandra-dtest/pull/251 - stop trying to 
snapshot the removed tables

> Revisit use of sealed period lookup tables
> --
>
> Key: CASSANDRA-19189
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19189
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 5.1-alpha1
>
> Attachments: ci_summary.html, result_details.tar.gz
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Metadata snapshots are stored locally in the {{system.metadata_snapshots}} 
> table, which is keyed by epoch. Snapshots are retrieved from this table for 
> three purposes:
> * to replay locally during startup
> * to provide log state for a peer requesting catchup
> * to create point-in-time ClusterMetadata, for disaster recovery
> In the majority of cases we want to replay from the most recent snapshot, so 
> we can usually select the appropriate one by simply scanning the snapshots 
> table in reverse, which considerably simplifies the lookup. We will continue 
> to persist historical snapshots, at least for now, so that we can still 
> select arbitrary snapshots should we want to reconstruct metadata state for 
> arbitrary epochs.
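The reverse-scan lookup described above can be sketched with a plain NavigableMap keyed by epoch. This is an illustrative stand-in only; `SnapshotLookup` and its methods are hypothetical and not Cassandra's actual storage API:

```java
import java.util.NavigableMap;
import java.util.TreeMap;

public class SnapshotLookup {
    // Hypothetical in-memory stand-in for system.metadata_snapshots, keyed by epoch.
    private final NavigableMap<Long, byte[]> snapshots = new TreeMap<>();

    public void store(long epoch, byte[] serialized) {
        snapshots.put(epoch, serialized);
    }

    // Common case: replay from the most recent snapshot ("scan in reverse").
    public Long latestEpoch() {
        return snapshots.isEmpty() ? null : snapshots.lastKey();
    }

    // Disaster-recovery case: most recent snapshot at or before an arbitrary epoch.
    public Long latestEpochAtOrBefore(long epoch) {
        return snapshots.floorKey(epoch);
    }
}
```

Because historical snapshots are still persisted, the arbitrary-epoch lookup remains a cheap `floorKey`-style query rather than a replay from scratch.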



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19189) Revisit use of sealed period lookup tables

2024-02-05 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-19189:

Attachment: ci_summary.html
result_details.tar.gz







[jira] [Updated] (CASSANDRA-19189) Revisit use of sealed period lookup tables

2024-02-05 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-19189:

Test and Documentation Plan: ci run
 Status: Patch Available  (was: Open)

https://github.com/apache/cassandra/pull/3088







[jira] [Updated] (CASSANDRA-19189) Revisit use of sealed period lookup tables

2024-02-05 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-19189:

Change Category: Code Clarity
 Complexity: Normal
  Reviewers: Alex Petrov, Sam Tunnicliffe
 Status: Open  (was: Triage Needed)







[jira] [Commented] (CASSANDRA-19085) In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE

2024-02-05 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814613#comment-17814613
 ] 

Berenguer Blasi commented on CASSANDRA-19085:
-

Thanks for the review [~brandon.williams]. The SCM setting is only there so that 
CI fully exercises that configuration; only the gossiper fix is needed. Given 
that Jenkins is back I'll just merge this to prevent any failures arising from 
it and to avoid duplicating efforts.

> In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE
> ---
>
> Key: CASSANDRA-19085
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19085
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Branimir Lambov
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> More precisely, when the {{MessagingService}} version is set to {{VERSION_50}}, 
> the test fails with an exception that appears to be a genuine problem:
> {code:java}
> junit.framework.AssertionFailedError: Exception found expected null, but 
> was:   at 
> org.apache.cassandra.service.ActiveRepairService.lambda$prepareForRepair$2(ActiveRepairService.java:678)
>   at 
> org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at 
> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.base/java.lang.Thread.run(Thread.java:833)
> >
>   at 
> org.apache.cassandra.distributed.test.DistributedRepairUtils.lambda$assertParentRepairSuccess$4(DistributedRepairUtils.java:129)
>   at 
> org.apache.cassandra.distributed.test.DistributedRepairUtils.validateExistingParentRepair(DistributedRepairUtils.java:164)
>   at 
> org.apache.cassandra.distributed.test.DistributedRepairUtils.assertParentRepairSuccess(DistributedRepairUtils.java:124)
>   at 
> org.apache.cassandra.distributed.test.RepairTest.testForcedNormalRepairWithOneNodeDown(RepairTest.java:211)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> org.apache.cassandra.distributed.shared.ShutdownException: Uncaught 
> exceptions were thrown during test
>   at 
> org.apache.cassandra.distributed.impl.AbstractCluster.checkAndResetUncaughtExceptions(AbstractCluster.java:1117)
>   at 
> org.apache.cassandra.distributed.impl.AbstractCluster.close(AbstractCluster.java:1103)
>   at 
> org.apache.cassandra.distributed.test.RepairTest.closeCluster(RepairTest.java:160)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   Suppressed: java.lang.IllegalStateException: complete already: 
> (failure: java.lang.RuntimeException: Did not get replies from all endpoints.)
>   at 
> org.apache.cassandra.utils.concurrent.AsyncPromise.setSuccess(AsyncPromise.java:106)
>   at 
> org.apache.cassandra.service.ActiveRepairService$2.ack(ActiveRepairService.java:721)
>   at 
> org.apache.cassandra.service.ActiveRepairService$2.onResponse(ActiveRepairService.java:697)
>   at 
> org.apache.cassandra.repair.messages.RepairMessage$2.onResponse(RepairMessage.java:187)
>   at 
> org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:58)
>   at 
> org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78)
>   at 
> org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:64)
>   at 
> org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:50)
>   at 
> org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97)
>   at 
> org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45)
>   at 
> org.apache.cassandra.net.In

[jira] [Comment Edited] (CASSANDRA-19018) An SAI-specific mechanism to ensure consistency isn't violated for multi-column (i.e. AND) queries at CL > ONE

2024-02-05 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814563#comment-17814563
 ] 

Caleb Rackliffe edited comment on CASSANDRA-19018 at 2/6/24 5:45 AM:
-

I think I may have found the missing link. While RFP is still broken around 
short reads, SAI at the local level might be hiding range tombstones. I've 
managed to get the multi-node Harry test passing in extended runs 
[here|https://github.com/maedhroz/cassandra/pull/15/commits]. Paging and 
read-repair are disabled here to avoid the potential RFP problems, and statics 
are disabled, but I should now be able to add back statics and read-repair and 
get clean runs. More on that shortly...

UPDATE: I've been able to add static indexing back without failure. At this 
point, only read repair and paging are disabled, so attacking the RFP issues is 
probably next.


was (Author: maedhroz):
I think I may have found the missing link. While RFP is still broken around 
short reads, SAI at the local level might be hiding range tombstones. I've 
managed to get the multi-node Harry test passing in extended runs 
[here|https://github.com/maedhroz/cassandra/pull/15/commits]. Paging and 
read-repair are disabled here to avoid the potential RFP problems, and statics 
are disabled, but I should now be able to add back statics and read-repair and 
get clean runs. More on that shortly...

> An SAI-specific mechanism to ensure consistency isn't violated for 
> multi-column (i.e. AND) queries at CL > ONE
> --
>
> Key: CASSANDRA-19018
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19018
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination, Feature/SAI
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
> Attachments: ci_summary-1.html, ci_summary.html, 
> result_details.tar-1.gz, result_details.tar.gz
>
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> CASSANDRA-19007 is going to be where we add a guardrail around 
> filtering/index queries that use intersection/AND over partially updated 
> non-key columns. (ex. Restricting one clustering column and one normal column 
> does not cause a consistency problem, as primary keys cannot be partially 
> updated.) This issue exists to attempt to fix this specifically for SAI in 
> 5.0.x, as Accord will (last I checked) not be available until the 5.1 release.
> The SAI-specific version of the originally reported issue is this:
> {noformat}
> try (Cluster cluster = init(Cluster.build(2).withConfig(config -> 
> config.with(GOSSIP).with(NETWORK)).start()))
> {
> cluster.schemaChange(withKeyspace("CREATE TABLE %s.t (k int 
> PRIMARY KEY, a int, b int)"));
> cluster.schemaChange(withKeyspace("CREATE INDEX ON %s.t(a) USING 
> 'sai'"));
> cluster.schemaChange(withKeyspace("CREATE INDEX ON %s.t(b) USING 
> 'sai'"));
> // insert a split row
> cluster.get(1).executeInternal(withKeyspace("INSERT INTO %s.t(k, 
> a) VALUES (0, 1)"));
> cluster.get(2).executeInternal(withKeyspace("INSERT INTO %s.t(k, 
> b) VALUES (0, 2)"));
> // Uncomment this line and test succeeds w/ partial writes 
> completed...
> //cluster.get(1).nodetoolResult("repair", 
> KEYSPACE).asserts().success();
> String select = withKeyspace("SELECT * FROM %s.t WHERE a = 1 AND 
> b = 2");
> Object[][] initialRows = cluster.coordinator(1).execute(select, 
> ConsistencyLevel.ALL);
> assertRows(initialRows, row(0, 1, 2)); // not found!!
> }
> {noformat}
> To make a long story short, the local SAI indexes are hiding local partial 
> matches from the coordinator that would combine there to form full matches. 
> Simple non-index filtering queries also suffer from this problem, but they 
> hide the partial matches in a different way. I'll outline a possible solution 
> for this in the comments that takes advantage of replica filtering protection 
> and the repaired/unrepaired datasets...and attempts to minimize the amount of 
> extra row data sent to the coordinator.
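The failure mode in the quoted description can be modeled in a few lines of plain Java: each replica holds a partial row that fails the local AND filter, while the reconciled row would match. This is a toy illustration only, not Cassandra code, and the names are made up:

```java
import java.util.HashMap;
import java.util.Map;

public class SplitRowDemo {
    // Hypothetical per-replica views of the same partition key (column -> value):
    // node 1 saw INSERT (k, a) VALUES (0, 1); node 2 saw INSERT (k, b) VALUES (0, 2).
    static final Map<String, Integer> REPLICA1 = Map.of("a", 1);
    static final Map<String, Integer> REPLICA2 = Map.of("b", 2);

    // A local index/filter pass that requires BOTH a = 1 AND b = 2 on the same replica.
    static boolean matchesLocally(Map<String, Integer> row) {
        return Integer.valueOf(1).equals(row.get("a"))
            && Integer.valueOf(2).equals(row.get("b"));
    }

    // What the coordinator would see if replicas shipped their partial rows
    // for reconciliation instead of filtering them out locally.
    static Map<String, Integer> reconcile(Map<String, Integer> r1, Map<String, Integer> r2) {
        Map<String, Integer> merged = new HashMap<>(r1);
        merged.putAll(r2);
        return merged;
    }
}
```

Neither replica matches on its own, so neither sends the row to the coordinator, yet the merged row satisfies `a = 1 AND b = 2`. That is exactly the missed result the test above demonstrates.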






[jira] [Commented] (CASSANDRA-19335) Default nodetool tablestats to Human-Readable Output

2024-02-05 Thread Leo Toff (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814609#comment-17814609
 ] 

Leo Toff commented on CASSANDRA-19335:
--

Your comment has been addressed: I've added a "-r" short flag for 
"--no-human-readable" ("-r" stands for raw). Let me know what my next steps 
should be here.

I wanted to do some refactoring:
 * Convert "out.printf(indent + ...)" to "out.printf(%s ..., indent ...)" in 
TableStatsPrinter where printf format specifiers are used (see [Stefan's 
comment in 
PR#2977|https://github.com/apache/cassandra/pull/2977#discussion_r1430323676])
 * Move formatting from the Holder class to the Printer class (from 
TableStatsHolder to TableStatsPrinter)
 * Consider renaming "formatMemory" (and other mentions of "memory") to 
"formatDataSize" across TableStatsPrinter, TableStatsHolder, and FBUtilities
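For the first bullet, the motivation is that concatenating `indent` into the format string lets any '%' it happens to contain be parsed as a format specifier; passing it as its own `%s` argument avoids that. A minimal sketch, with hypothetical method names rather than the actual TableStatsPrinter code:

```java
import java.io.PrintStream;

public class IndentPrintf {
    // Before (hypothetical): the indent is concatenated into the format string,
    // so a '%' inside it would be misparsed as a format specifier.
    static void before(PrintStream out, String indent, String name, long bytes) {
        out.printf(indent + "%s: %d\n", name, bytes);
    }

    // After: indent is passed as an ordinary argument via its own %s.
    static void after(PrintStream out, String indent, String name, long bytes) {
        out.printf("%s%s: %d\n", indent, name, bytes);
    }
}
```

With a benign indent both forms print the same thing, but `before` throws an `IllegalFormatException` as soon as the indent contains a '%' sequence.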

> Default nodetool tablestats to Human-Readable Output
> 
>
> Key: CASSANDRA-19335
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19335
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/nodetool
>Reporter: Leo Toff
>Assignee: Leo Toff
>Priority: Low
> Fix For: 5.x
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> *Current Behavior*
> The current implementation of nodetool tablestats in Apache Cassandra outputs 
> statistics in a format that is not immediately human-readable. This output 
> primarily includes raw byte counts, which require additional calculation or 
> conversion to be easily understood by users. This can be inefficient and 
> time-consuming, especially for users who frequently monitor these statistics 
> for performance tuning or maintenance purposes.
> *Proposed Change*
> We propose that nodetool tablestats should, by default, provide its output in 
> a human-readable format. This change would involve converting byte counts 
> into more understandable units (KiB, MiB, GiB). The tool could still retain 
> the option to display raw data for those who need it, perhaps through a flag 
> such as --no-human-readable or --raw.
> *Considerations*
> The change should maintain backward compatibility, ensuring that scripts or 
> tools relying on the current output format can continue to function correctly.
> We should provide adequate documentation and examples of both the new default 
> output and how to access the raw data format, if needed.
> *Alignment*
> Discussion in the dev mailing list: 
> [https://lists.apache.org/thread/mlp715kxho5b6f1ql9omlzmmnh4qfby9] 
> *Related work*
> Previous work in the series:
>  # https://issues.apache.org/jira/browse/CASSANDRA-19015 
>  # https://issues.apache.org/jira/browse/CASSANDRA-19104
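The KiB/MiB/GiB conversion the proposal describes amounts to repeated division by 1024. A sketch of such a helper follows; the class and method names are hypothetical, and since the project already has related formatting in FBUtilities, the real change would likely reuse that instead:

```java
import java.util.Locale;

public class HumanReadable {
    private static final String[] UNITS = {"KiB", "MiB", "GiB", "TiB"};

    // Divide by 1024 until the value fits, then print with two decimals.
    static String format(long bytes) {
        if (bytes < 1024) {
            return bytes + " bytes"; // small counts stay exact
        }
        double value = bytes;
        int unit = -1;
        while (value >= 1024 && unit < UNITS.length - 1) {
            value /= 1024;
            unit++;
        }
        // Locale.ROOT keeps the decimal point stable across locales,
        // which matters for scripts parsing the output.
        return String.format(Locale.ROOT, "%.2f %s", value, UNITS[unit]);
    }
}
```

A raw mode (the proposed --no-human-readable / -r flag) would simply bypass this helper and print the underlying byte count.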






[jira] [Comment Edited] (CASSANDRA-19335) Default nodetool tablestats to Human-Readable Output

2024-02-05 Thread Leo Toff (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814609#comment-17814609
 ] 

Leo Toff edited comment on CASSANDRA-19335 at 2/6/24 5:25 AM:
--

Your comment has been addressed: I've added a `-r` short flag for 
`--no-human-readable` (`-r` stands for raw). Let me know what my next steps 
should be here.

I wanted to do some refactoring:
 * Convert "out.printf(indent + ...)" to "out.printf(%s ..., indent ...)" in 
TableStatsPrinter where printf format specifiers are used (see [Stefan's 
comment in 
PR#2977|https://github.com/apache/cassandra/pull/2977#discussion_r1430323676])
 * Move formatting from the Holder class to the Printer class (from 
TableStatsHolder to TableStatsPrinter)
 * Consider renaming "formatMemory" (and other mentions of "memory") to 
"formatDataSize" across TableStatsPrinter, TableStatsHolder, and FBUtilities


was (Author: JIRAUSER303078):
Your comment has been addressed, I've added "-r" short flag for 
"--no-human-readable". "-r" stands for Raw. Let me know what my next steps 
should be here.

I wanted to do some refactoring:
 * Convert "out.printf(indent + ...)" to "out.printf(%s ..., indent ...)" in 
TableStatsPrinter where printf format specifiers are used (see [Stefan's 
comment in 
PR#2977|https://github.com/apache/cassandra/pull/2977#discussion_r1430323676])
 * Move formatting from the Holder class to the Printer class (from 
TableStatsHolder to TableStatsPrinter)
 * Consider renaming "formatMemory" (and other mentions of "memory") to 
"formatDataSize" across TableStatsPrinter, TableStatsHolder, and FBUtilities







Re: [PR] JAVA-3051: Memory leak [cassandra-java-driver]

2024-02-05 Thread via GitHub


aratno commented on code in PR #1743:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1743#discussion_r1479160472


##
core/src/main/java/com/datastax/oss/driver/internal/core/loadbalancing/DefaultLoadBalancingPolicy.java:
##
@@ -96,14 +99,38 @@ public class DefaultLoadBalancingPolicy extends BasicLoadBalancingPolicy impleme
   private static final int MAX_IN_FLIGHT_THRESHOLD = 10;
   private static final long RESPONSE_COUNT_RESET_INTERVAL_NANOS = MILLISECONDS.toNanos(200);
 
-  protected final Map<Node, AtomicLongArray> responseTimes = new ConcurrentHashMap<>();
+  protected final LoadingCache<Node, AtomicLongArray> responseTimes;
   protected final Map<Node, Long> upTimes = new ConcurrentHashMap<>();
   private final boolean avoidSlowReplicas;
 
   public DefaultLoadBalancingPolicy(@NonNull DriverContext context, @NonNull String profileName) {
     super(context, profileName);
     this.avoidSlowReplicas =
         profile.getBoolean(DefaultDriverOption.LOAD_BALANCING_POLICY_SLOW_AVOIDANCE, true);
+    CacheLoader<Node, AtomicLongArray> cacheLoader =
+        new CacheLoader<Node, AtomicLongArray>() {
+          @Override
+          public AtomicLongArray load(Node key) throws Exception {
+            // The array stores at most two timestamps, since we don't need more;
+            // the first one is always the least recent one, and hence the one to inspect.
+            long now = nanoTime();
+            AtomicLongArray array = responseTimes.getIfPresent(key);
+            if (array == null) {
+              array = new AtomicLongArray(1);
+              array.set(0, now);
+            } else if (array.length() == 1) {
+              long previous = array.get(0);
+              array = new AtomicLongArray(2);
+              array.set(0, previous);
+              array.set(1, now);
+            } else {
+              array.set(0, array.get(1));
+              array.set(1, now);
+            }
+            return array;
+          }
+        };
+    this.responseTimes = CacheBuilder.newBuilder().weakKeys().build(cacheLoader);

Review Comment:
   I think we should add a 
[RemovalListener](https://guava.dev/releases/21.0/api/docs/com/google/common/cache/RemovalListener.html)
 here.
   
   If a GC happens and response times for a Node are purged, then we'll end up 
treating that as "insufficient responses" in `isResponseRateInsufficient`, 
which can lead us to mark a node as unhealthy. I recognize that this is a bit 
of a pathological example, but this behavior does depend on GC timing and would 
be a pain to track down, so adding logging could make someone's life easier 
down the line.
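A sketch of wiring in such a listener is below. It uses plain Guava imports for illustration (the driver actually ships a shaded Guava copy), and `ResponseTimesCache` with String keys is a stand-in for the real Node-keyed cache; the point is only that every eviction, including a GC-driven `RemovalCause.COLLECTED`, leaves a trace:

```java
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import com.google.common.cache.RemovalListener;

public class ResponseTimesCache {
    // Build a weak-keyed cache that records every eviction with its cause, so a
    // GC purge that later reads as "insufficient response rate" can be traced.
    static LoadingCache<String, Long> build(StringBuilder log) {
        RemovalListener<String, Long> listener = notification ->
            log.append("removed ")
               .append(notification.getKey())
               .append(" cause=")
               .append(notification.getCause())
               .append('\n');
        return CacheBuilder.newBuilder()
            .weakKeys()
            .removalListener(listener)
            .build(CacheLoader.from(key -> System.nanoTime()));
    }
}
```

In the real policy the `log.append` would be a logger call; the listener fires for explicit invalidation as well as for entries whose weak keys were collected.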



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19018) An SAI-specific mechanism to ensure consistency isn't violated for multi-column (i.e. AND) queries at CL > ONE

2024-02-05 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814563#comment-17814563
 ] 

Caleb Rackliffe commented on CASSANDRA-19018:
-

I think I may have found the missing link. While RFP is still broken around 
short reads, SAI at the local level might be hiding range tombstones. I've 
managed to get the multi-node Harry test passing in extended runs 
[here|https://github.com/maedhroz/cassandra/pull/15/commits]. Paging and 
read-repair are disabled here to avoid the potential RFP problems, and statics 
are disabled, but I should now be able to add back statics and read-repair and 
get clean runs. More on that shortly...







[jira] [Updated] (CASSANDRA-18230) Write docs for CEP-20

2024-02-05 Thread Lorina Poland (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lorina Poland updated CASSANDRA-18230:
--
Change Category: Code Clarity
 Complexity: Normal
   Priority: Normal  (was: High)
 Status: Open  (was: Triage Needed)

> Write docs for CEP-20
> -
>
> Key: CASSANDRA-18230
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18230
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Documentation
>Reporter: Lorina Poland
>Assignee: Lorina Poland
>Priority: Normal
> Fix For: 5.x
>
>







[jira] [Assigned] (CASSANDRA-18230) Write docs for CEP-20

2024-02-05 Thread Lorina Poland (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lorina Poland reassigned CASSANDRA-18230:
-

Assignee: Lorina Poland








Re: [PR] JAVA-3051: Memory leak [cassandra-java-driver]

2024-02-05 Thread via GitHub


absurdfarce commented on PR #1743:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1743#issuecomment-1928483385

Very much agreed that the underlying problem here appears to be an issue with 
AWS Keyspaces, @aratno; that's being addressed in a different ticket. The scope 
of this change is preventing the (potentially indefinite) caching of Node 
instances within an LBP.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org





Re: [PR] JAVA-3051: Memory leak [cassandra-java-driver]

2024-02-05 Thread via GitHub


aratno commented on code in PR #1743:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1743#discussion_r1479025356


##
core/src/main/java/com/datastax/oss/driver/internal/core/loadbalancing/DefaultLoadBalancingPolicy.java:
##
@@ -276,38 +303,23 @@ protected boolean isBusy(@NonNull Node node, @NonNull 
Session session) {
   protected boolean isResponseRateInsufficient(@NonNull Node node, long now) {
 // response rate is considered insufficient when less than 2 responses 
were obtained in
 // the past interval delimited by RESPONSE_COUNT_RESET_INTERVAL_NANOS.
-if (responseTimes.containsKey(node)) {
-  AtomicLongArray array = responseTimes.get(node);
-  if (array.length() == 2) {
-long threshold = now - RESPONSE_COUNT_RESET_INTERVAL_NANOS;
-long leastRecent = array.get(0);
-return leastRecent - threshold < 0;
-  }
-}
-return true;
+AtomicLongArray array = responseTimes.getIfPresent(node);
+if (array != null && array.length() == 2) {
+  long threshold = now - RESPONSE_COUNT_RESET_INTERVAL_NANOS;
+  long leastRecent = array.get(0);
+  return leastRecent - threshold < 0;
+} else return true;

Review Comment:
   Style nit: Invert the condition and use an early-return if response rate is 
insufficient, so you don't have `else return true`
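For illustration, a standalone sketch of the early-return shape the reviewer suggests (class and constant names here are hypothetical, not code from the PR): the no-sample case bails out first, so the method no longer ends in `else return true`.

```java
import java.util.concurrent.atomic.AtomicLongArray;

public class ResponseRateCheck {
    // 200 ms window, mirroring RESPONSE_COUNT_RESET_INTERVAL_NANOS in the diff
    static final long RESPONSE_COUNT_RESET_INTERVAL_NANOS = 200_000_000L;

    // Early-return variant: return "insufficient" as soon as there is no
    // valid two-element sample, then fall through to the threshold check.
    static boolean isResponseRateInsufficient(AtomicLongArray array, long now) {
        if (array == null || array.length() != 2) {
            return true; // no (or malformed) sample: treat the rate as insufficient
        }
        long threshold = now - RESPONSE_COUNT_RESET_INTERVAL_NANOS;
        long leastRecent = array.get(0);
        return leastRecent - threshold < 0;
    }

    public static void main(String[] args) {
        long now = System.nanoTime();
        AtomicLongArray twoRecent = new AtomicLongArray(new long[] { now - 1, now });
        System.out.println(isResponseRateInsufficient(twoRecent, now)); // false
        System.out.println(isResponseRateInsufficient(null, now));      // true
    }
}
```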



##
core/src/main/java/com/datastax/oss/driver/internal/core/metrics/AbstractMetricUpdater.java:
##
@@ -173,9 +173,8 @@ protected Timeout newTimeout() {
 .getTimer()
 .newTimeout(
 t -> {
-  if (t.isExpired()) {
-clearMetrics();
-  }
+  clearMetrics();
+  cancelMetricsExpirationTimeout();

Review Comment:
   What's the reasoning for this change?



##
core/src/main/java/com/datastax/oss/driver/internal/core/util/concurrent/ReplayingEventFilter.java:
##
@@ -82,6 +82,7 @@ public void markReady() {
 consumer.accept(event);
   }
 } finally {
+  recordedEvents.clear();

Review Comment:
   What's the reasoning for this change?



##
core/src/main/java/com/datastax/oss/driver/internal/core/loadbalancing/DefaultLoadBalancingPolicy.java:
##
@@ -96,14 +99,38 @@ public class DefaultLoadBalancingPolicy extends 
BasicLoadBalancingPolicy impleme
   private static final int MAX_IN_FLIGHT_THRESHOLD = 10;
   private static final long RESPONSE_COUNT_RESET_INTERVAL_NANOS = 
MILLISECONDS.toNanos(200);
 
-  protected final Map<Node, AtomicLongArray> responseTimes = new 
ConcurrentHashMap<>();
+  protected final LoadingCache<Node, AtomicLongArray> responseTimes;
   protected final Map<Node, Long> upTimes = new ConcurrentHashMap<>();
   private final boolean avoidSlowReplicas;
 
   public DefaultLoadBalancingPolicy(@NonNull DriverContext context, @NonNull 
String profileName) {
 super(context, profileName);
 this.avoidSlowReplicas =
 
profile.getBoolean(DefaultDriverOption.LOAD_BALANCING_POLICY_SLOW_AVOIDANCE, 
true);
+CacheLoader<Node, AtomicLongArray> cacheLoader =

Review Comment:
   Style nit: use a separate class for the cache value here, rather than using 
AtomicLongArray as a generic container. Seems like it can be something like 
`NodeResponseRateSample`, with methods like `boolean hasSufficientResponses`. I 
see this was present in the previous implementation, so not a required change 
for this PR, just something I noticed.
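A minimal sketch of what such a value class might look like. `NodeResponseRateSample`, its method names, and the two-sample layout are assumptions taken from the review comment, not code from the PR:

```java
import java.util.concurrent.atomic.AtomicLongArray;

public class NodeResponseRateSample {
    private static final long RESPONSE_COUNT_RESET_INTERVAL_NANOS = 200_000_000L;

    // Timestamps (nanos) of the two most recent responses, oldest at index 0.
    private final AtomicLongArray responseTimestamps = new AtomicLongArray(2);

    // Sketch only: the shift is two separate writes, so not fully thread-safe.
    void recordResponse(long now) {
        responseTimestamps.set(0, responseTimestamps.get(1));
        responseTimestamps.set(1, now);
    }

    // Sufficient when the older of the two samples still falls in the window.
    boolean hasSufficientResponses(long now) {
        long threshold = now - RESPONSE_COUNT_RESET_INTERVAL_NANOS;
        return responseTimestamps.get(0) - threshold >= 0;
    }

    public static void main(String[] args) {
        NodeResponseRateSample sample = new NodeResponseRateSample();
        long now = System.nanoTime();
        System.out.println(sample.hasSufficientResponses(now)); // false: no responses yet
        sample.recordResponse(now - 1_000_000);
        sample.recordResponse(now);
        System.out.println(sample.hasSufficientResponses(now)); // true: two recent responses
    }
}
```

Encapsulating the pair like this lets the policy ask `hasSufficientResponses(now)` directly instead of inspecting a bare `AtomicLongArray`.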








[jira] [Commented] (CASSANDRA-19372) WEBSITE - Adding blog post

2024-02-05 Thread Paul Au (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814527#comment-17814527
 ] 

Paul Au commented on CASSANDRA-19372:
-

Preview Available:

https://raw.githack.com/Paul-TT/cassandra-website/CASSANDRA-19372_generated/content/_/blog.html

https://raw.githack.com/Paul-TT/cassandra-website/CASSANDRA-19372_generated/content/_/blog/Apache-Cassandra-5.0-Features-Mathematical-CQL-Functions.html

> WEBSITE - Adding blog post
> --
>
> Key: CASSANDRA-19372
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19372
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation/Website
>Reporter: Paul Au
>Priority: Normal
>
> Adding blog post to website.
> Apache Cassandra 5.0 Features: Mathematical CQL Functions






[jira] [Created] (CASSANDRA-19372) WEBSITE - Adding blog post

2024-02-05 Thread Paul Au (Jira)
Paul Au created CASSANDRA-19372:
---

 Summary: WEBSITE - Adding blog post
 Key: CASSANDRA-19372
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19372
 Project: Cassandra
  Issue Type: Task
  Components: Documentation/Website
Reporter: Paul Au


Adding blog post to website.

Apache Cassandra 5.0 Features: Mathematical CQL Functions






[jira] [Commented] (CASSANDRA-19370) Intermittent test failures in SchemaIT

2024-02-05 Thread Bret McGuire (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814519#comment-17814519
 ] 

Bret McGuire commented on CASSANDRA-19370:
--

Also worth noting that there are several older tickets in the DataStax Jira 
which address similar issues:

 

https://datastax-oss.atlassian.net/browse/JAVA-2579

[https://datastax-oss.atlassian.net/browse/JAVA-1690]

 

So this one has been around for a while.

> Intermittent test failures in SchemaIT
> --
>
> Key: CASSANDRA-19370
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19370
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Bret McGuire
>Priority: Normal
>
> Noted on a few DataStax Jenkins runs of the Java driver test suite, 
> specifically a test run for a recent PR for CASSANDRA-19290.  Seems to be 
> very intermittent.
>  
> {code:java}
> Per-Commit / Matrix - SERVER_VERSION = 'dse-6.8.30', JABBA_VERSION = 
> 'openjdk@1.11' / Execute-Tests / 
> com.datastax.oss.driver.core.metadata.SchemaIT.should_disable_schema_programmatically_when_enabled_in_config{code}
>  
> {noformat}
> Error MessageExpecting:
>   
> {foo=com.datastax.dse.driver.internal.core.metadata.schema.DefaultDseTableMetadata@a8599027}
> not to contain key:
>   fooStacktracejava.lang.AssertionError: 
> Expecting:
>   
> {foo=com.datastax.dse.driver.internal.core.metadata.schema.DefaultDseTableMetadata@a8599027}
> not to contain key:
>   foo
>   at 
> com.datastax.oss.driver.core.metadata.SchemaIT.should_disable_schema_programmatically_when_enabled_in_config(SchemaIT.java:158)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:568)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at 
> org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:345)
>   at 
> org.apache.maven.surefire.junitcore.pc.InvokerStrategy.schedule(InvokerStrategy.java:47)
>   at 
> org.apache.maven.surefire.junitcore.pc.Scheduler.schedule(Scheduler.java:316)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:27)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at 
> org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:345)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>   at java.base/java.lang.Thread.run(Thread.java:833){noformat}
>  






[jira] [Created] (CASSANDRA-19371) Intermittent test failures in ChannelPoolResizeTest

2024-02-05 Thread Bret McGuire (Jira)
Bret McGuire created CASSANDRA-19371:


 Summary: Intermittent test failures in ChannelPoolResizeTest
 Key: CASSANDRA-19371
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19371
 Project: Cassandra
  Issue Type: Bug
Reporter: Bret McGuire


Noted on a recent DataStax Jenkins run against a PR for CASSANDRA-19290.  
Failure seems to be intermittent.

 
{noformat}
Per-Commit / Matrix - SERVER_VERSION = 'dse-6.8.30', JABBA_VERSION = 
'openjdk@1.11' / Execute-Tests / 
com.datastax.oss.driver.internal.core.pool.ChannelPoolResizeTest.should_resize_during_reconnection_if_config_changes{noformat}
{noformat}
Error MessageExpecting:
  [1]
to contain exactly (and in same order):
  [0]
but some elements were not found:
  [0]
and others were not expected:
  [1]
Stacktracejava.lang.AssertionError: 

Expecting:
  [1]
to contain exactly (and in same order):
  [0]
but some elements were not found:
  [0]
and others were not expected:
  [1]

at 
com.datastax.oss.driver.internal.core.channel.MockChannelFactoryHelper.verifyNoMoreCalls(MockChannelFactoryHelper.java:114)
at 
com.datastax.oss.driver.internal.core.pool.ChannelPoolResizeTest.should_resize_during_reconnection_if_config_changes(ChannelPoolResizeTest.java:379)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.junit.runners.Suite.runChild(Suite.java:128)
at org.junit.runners.Suite.runChild(Suite.java:27)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:49)
at 
org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:120)
at 
org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:95)
at 
org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:75)
at 
org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:69)
at 
org.apache.maven.surefire.junitcore.JUnitCoreProvider.invoke(JUnitCoreProvider.java:146)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:385)
at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:162)
at 
org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:507)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:495){noformat}




[jira] [Updated] (CASSANDRA-19370) Intermittent test failures in SchemaIT

2024-02-05 Thread Bret McGuire (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bret McGuire updated CASSANDRA-19370:
-
Description: 
Noted on a few DataStax Jenkins runs of the Java driver test suite, 
specifically a test run for a recent PR for CASSANDRA-19290.  Seems to be very 
intermittent.

 
{code:java}
Per-Commit / Matrix - SERVER_VERSION = 'dse-6.8.30', JABBA_VERSION = 
'openjdk@1.11' / Execute-Tests / 
com.datastax.oss.driver.core.metadata.SchemaIT.should_disable_schema_programmatically_when_enabled_in_config{code}
 
{noformat}
Error MessageExpecting:
  
{foo=com.datastax.dse.driver.internal.core.metadata.schema.DefaultDseTableMetadata@a8599027}
not to contain key:
  fooStacktracejava.lang.AssertionError: 

Expecting:
  
{foo=com.datastax.dse.driver.internal.core.metadata.schema.DefaultDseTableMetadata@a8599027}
not to contain key:
  foo
at 
com.datastax.oss.driver.core.metadata.SchemaIT.should_disable_schema_programmatically_when_enabled_in_config(SchemaIT.java:158)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at 
org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:345)
at 
org.apache.maven.surefire.junitcore.pc.InvokerStrategy.schedule(InvokerStrategy.java:47)
at 
org.apache.maven.surefire.junitcore.pc.Scheduler.schedule(Scheduler.java:316)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.junit.runners.Suite.runChild(Suite.java:128)
at org.junit.runners.Suite.runChild(Suite.java:27)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at 
org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:345)
at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833){noformat}
 

  was:
Noted on a few DataStax Jenkins runs of the Java driver test suite.  Seems to 
be very intermittent.

 
{code:java}
Per-Commit / Matrix - SERVER_VERSION = 'dse-6.8.30', JABBA_VERSION = 
'openjdk@1.11' / Execute-Tests / 
com.datastax.oss.driver.core.metadata.SchemaIT.should_disable_schema_programmatically_when_enabled_in_config{code}
 
{noformat}
Error MessageExpecting:
  
{foo=com.datastax.dse.driver.internal.core.metadata.schema.DefaultDseTableMetadata@a8599027}
not to contain key:
  fooStacktracejava.lang.AssertionError: 

Expecting:
  
{foo=com.datastax.dse.driver.internal.core.metadata.schema.DefaultDseTableMetadata@a8599027}
not to contain key:
  foo
at 
com.datastax.oss.driver.core.metadata.SchemaIT.should_disable_schema_programmatically_when_enabled_in_config(SchemaIT.java:158)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at 
java.b

[jira] [Created] (CASSANDRA-19370) Intermittent test failures in SchemaIT

2024-02-05 Thread Bret McGuire (Jira)
Bret McGuire created CASSANDRA-19370:


 Summary: Intermittent test failures in SchemaIT
 Key: CASSANDRA-19370
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19370
 Project: Cassandra
  Issue Type: Bug
Reporter: Bret McGuire


Noted on a few DataStax Jenkins runs of the Java driver test suite.  Seems to 
be very intermittent.

 
{code:java}
Per-Commit / Matrix - SERVER_VERSION = 'dse-6.8.30', JABBA_VERSION = 
'openjdk@1.11' / Execute-Tests / 
com.datastax.oss.driver.core.metadata.SchemaIT.should_disable_schema_programmatically_when_enabled_in_config{code}
 
{noformat}
Error MessageExpecting:
  
{foo=com.datastax.dse.driver.internal.core.metadata.schema.DefaultDseTableMetadata@a8599027}
not to contain key:
  fooStacktracejava.lang.AssertionError: 

Expecting:
  
{foo=com.datastax.dse.driver.internal.core.metadata.schema.DefaultDseTableMetadata@a8599027}
not to contain key:
  foo
at 
com.datastax.oss.driver.core.metadata.SchemaIT.should_disable_schema_programmatically_when_enabled_in_config(SchemaIT.java:158)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at 
org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:345)
at 
org.apache.maven.surefire.junitcore.pc.InvokerStrategy.schedule(InvokerStrategy.java:47)
at 
org.apache.maven.surefire.junitcore.pc.Scheduler.schedule(Scheduler.java:316)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.junit.runners.Suite.runChild(Suite.java:128)
at org.junit.runners.Suite.runChild(Suite.java:27)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at 
org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:345)
at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833){noformat}
 






[jira] [Commented] (CASSANDRA-19370) Intermittent test failures in SchemaIT

2024-02-05 Thread Bret McGuire (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814516#comment-17814516
 ] 

Bret McGuire commented on CASSANDRA-19370:
--

Not immediately clear if there's anything DSE-specific about this failure or 
not.  The two cases I could find do involve runs against DSE but it's quite 
possible the issue in this test is more general.

> Intermittent test failures in SchemaIT
> --
>
> Key: CASSANDRA-19370
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19370
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Bret McGuire
>Priority: Normal
>
> Noted on a few DataStax Jenkins runs of the Java driver test suite.  Seems to 
> be very intermittent.
>  
> {code:java}
> Per-Commit / Matrix - SERVER_VERSION = 'dse-6.8.30', JABBA_VERSION = 
> 'openjdk@1.11' / Execute-Tests / 
> com.datastax.oss.driver.core.metadata.SchemaIT.should_disable_schema_programmatically_when_enabled_in_config{code}
>  
> {noformat}
> Error MessageExpecting:
>   
> {foo=com.datastax.dse.driver.internal.core.metadata.schema.DefaultDseTableMetadata@a8599027}
> not to contain key:
>   fooStacktracejava.lang.AssertionError: 
> Expecting:
>   
> {foo=com.datastax.dse.driver.internal.core.metadata.schema.DefaultDseTableMetadata@a8599027}
> not to contain key:
>   foo
>   at 
> com.datastax.oss.driver.core.metadata.SchemaIT.should_disable_schema_programmatically_when_enabled_in_config(SchemaIT.java:158)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:568)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at 
> org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:345)
>   at 
> org.apache.maven.surefire.junitcore.pc.InvokerStrategy.schedule(InvokerStrategy.java:47)
>   at 
> org.apache.maven.surefire.junitcore.pc.Scheduler.schedule(Scheduler.java:316)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:27)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at 
> org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:345)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>   at java.base/java.lang.Thread.run(Thread.java:833){noformat}
>  






[jira] [Comment Edited] (CASSANDRA-19369) [Analytics] Use XXHash32 for digest calculation of SSTables

2024-02-05 Thread Francisco Guerrero (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814508#comment-17814508
 ] 

Francisco Guerrero edited comment on CASSANDRA-19369 at 2/5/24 9:16 PM:


[~smiklosovic] Cassandra Analytics creates an SSTable during bulk writes. For 
each SSTable component generated, we calculate a digest of the file (this 
includes the crc32 file), which is then uploaded. The purpose of this checksum 
is to protect the integrity of each of the SSTable component files during 
transmission from the Spark executor to the Cassandra Sidecar service, rather 
than the integrity of the data file.

For data integrity, the bulk writer does the following:
- Checksums each generated file
- Re-reads the generated SSTable file and ensures that what was written is the 
same as what was read
- Transfers the file with a checksum header
- (On Sidecar) Validates that the checksum matches the uploaded file


was (Author: frankgh):
[~smiklosovic] Cassandra Analytics creates an SSTable during bulk writes. For 
each SSTable component generated, we calculate a digest of the file (this 
includes the crc32 file), which is then uploaded. The purpose of this checksum 
is to protect the integrity of each of the SSTable component files, rather 
than the integrity of the data file.

For data integrity, the bulk writer does the following:
- Checksums each generated file
- Re-reads the generated SSTable file and ensures that what was written is the 
same as what was read
- Transfers the file with a checksum header
- (On Sidecar) Validates that the checksum matches the uploaded file

> [Analytics] Use XXHash32 for digest calculation of SSTables
> ---
>
> Key: CASSANDRA-19369
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19369
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Analytics Library
>Reporter: Francisco Guerrero
>Assignee: Francisco Guerrero
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> During bulk writes, Cassandra Analytics calculates the MD5 checksum of every 
> SSTable it produces. During SSTable upload to Cassandra Sidecar, Cassandra 
> Analytics includes the {{content-md5}} header as part of the upload request. 
> This information is used by Cassandra Sidecar to validate the integrity of 
> the uploaded SSTable and prevent issues with bit flips and corrupted SSTables.
> Recently, Cassandra Sidecar introduced [support for additional checksum 
> validations|https://issues.apache.org/jira/browse/CASSANDRASC-97] during 
> SSTable upload. Notably, XXHash32 digest support was added, which offers 
> more performant checksum calculations. This allows Cassandra 
> Analytics to use a more efficient digest algorithm that is friendlier to the 
> CPU usage of Sidecar and Spark resources.






[jira] [Commented] (CASSANDRA-19369) [Analytics] Use XXHash32 for digest calculation of SSTables

2024-02-05 Thread Francisco Guerrero (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814508#comment-17814508
 ] 

Francisco Guerrero commented on CASSANDRA-19369:


[~smiklosovic] Cassandra Analytics creates an SSTable during bulk writes. For 
each SSTable component generated we calculate the digest of each file (this 
includes the crc32 file), which is then uploaded. The purpose of this checksum 
is to verify the integrity of each of the SSTable component files, rather than 
just the integrity of the data file.

For data integrity, bulk writer does the following:
- Checksum each generated file
- Re-read the generated SSTable file and ensure that what was written is the 
same as what we read.
- Transfer the file with a checksum header
- (On Sidecar) Validate that the checksum matches the uploaded file
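The transfer-and-validate steps above can be sketched as follows. This uses java.util.zip.CRC32 as a stand-in checksum, since XXHash32 requires a third-party library such as lz4-java; the class and method names are illustrative only, not the Analytics or Sidecar API.

```java
import java.util.zip.CRC32;

public class UploadDigestSketch {
    // Compute a hex digest over the file bytes; a real implementation would
    // stream the file in chunks rather than hold it all in memory.
    static String digestHex(byte[] fileBytes) {
        CRC32 crc = new CRC32();
        crc.update(fileBytes, 0, fileBytes.length);
        return Long.toHexString(crc.getValue());
    }

    // Receiver-side check: recompute the digest of the uploaded bytes and
    // compare it with the digest header sent by the client.
    static boolean validateUpload(byte[] uploadedBytes, String digestHeader) {
        return digestHex(uploadedBytes).equals(digestHeader);
    }

    public static void main(String[] args) {
        byte[] component = "fake sstable component bytes".getBytes();
        String header = digestHex(component);            // sent with the upload
        System.out.println(validateUpload(component, header)); // prints true
        byte[] corrupted = component.clone();
        corrupted[0] ^= 0x40;                            // simulate a bit flip
        System.out.println(validateUpload(corrupted, header)); // prints false
    }
}
```

A single flipped bit is guaranteed to change a CRC32 value, which is why this style of digest-header comparison catches the bit-flip corruption described in the ticket.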







[jira] [Commented] (CASSANDRA-19369) [Analytics] Use XXHash32 for digest calculation of SSTables

2024-02-05 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814502#comment-17814502
 ] 

Stefan Miklosovic commented on CASSANDRA-19369:
---

Why is this actually needed at all? If you write an SSTable, there is a DIGEST 
component which contains the crc32 of the data file. Does Analytics not 
support this too? Would it not make more sense to introduce a way to use 
checksum algorithms other than crc32 for data file integrity validation, and 
then reuse that from Analytics?







Re: [PR] CASSANDRA-16969 4.7.x license check [cassandra-java-driver]

2024-02-05 Thread via GitHub


michaelsembwever closed pull request #1786: CASSANDRA-16969 4.7.x license check
URL: https://github.com/apache/cassandra-java-driver/pull/1786


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19369) [Analytics] Use XXHash32 for digest calculation of SSTables

2024-02-05 Thread Francisco Guerrero (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francisco Guerrero updated CASSANDRA-19369:
---
Test and Documentation Plan: Added unit tests. Integration tests pending
 Status: Patch Available  (was: In Progress)

PR: https://github.com/apache/cassandra-analytics/pull/38







[jira] [Updated] (CASSANDRA-19369) [Analytics] Use XXHash32 for digest calculation of SSTables

2024-02-05 Thread Francisco Guerrero (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francisco Guerrero updated CASSANDRA-19369:
---
Change Category: Performance
 Complexity: Low Hanging Fruit
Component/s: Analytics Library
 Status: Open  (was: Triage Needed)







[PR] CASSANDRA-19369 Use XXHash32 for digest calculation of SSTables [cassandra-analytics]

2024-02-05 Thread via GitHub


frankgh opened a new pull request, #38:
URL: https://github.com/apache/cassandra-analytics/pull/38

   This commit adds the ability to use the XXHash32 digest algorithm newly 
supported in Cassandra Sidecar. The commit retains backwards compatibility 
with MD5 checksumming, but now defaults to XXHash32.
   
   A new Writer option is added:
   
   ```
   .option(WriterOptions.DIGEST_TYPE.name(), "XXHASH32") // or
   .option(WriterOptions.DIGEST_TYPE.name(), "MD5")
   ```
   
   This option defaults to XXHash32 when not provided, but it can be 
configured to use the legacy MD5 algorithm.
   
   Patch by Francisco Guerrero; Reviewed by TBD for CASSANDRA-19369
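The defaulting behavior described above can be sketched like this; the DigestType enum and the parsing helper are hypothetical names for illustration, not the actual Analytics writer internals.

```java
import java.util.Locale;

public class DigestOptionSketch {
    // Hypothetical mirror of the writer's digest choices.
    enum DigestType { XXHASH32, MD5 }

    // Resolve the DIGEST_TYPE writer option: when absent, fall back to the
    // new XXHash32 default; otherwise parse case-insensitively, rejecting
    // unknown algorithm names up front.
    static DigestType fromOption(String value) {
        if (value == null)
            return DigestType.XXHASH32;              // new default
        switch (value.toUpperCase(Locale.ROOT)) {
            case "XXHASH32": return DigestType.XXHASH32;
            case "MD5":      return DigestType.MD5;  // legacy fallback
            default: throw new IllegalArgumentException("Unsupported digest type: " + value);
        }
    }
}
```

Failing fast on an unknown option value keeps a misspelled algorithm name from silently falling back to either digest.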





[jira] [Created] (CASSANDRA-19369) [Analytics] Use XXHash32 for digest calculation of SSTables

2024-02-05 Thread Francisco Guerrero (Jira)
Francisco Guerrero created CASSANDRA-19369:
--

 Summary: [Analytics] Use XXHash32 for digest calculation of 
SSTables
 Key: CASSANDRA-19369
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19369
 Project: Cassandra
  Issue Type: Improvement
Reporter: Francisco Guerrero
Assignee: Francisco Guerrero


During bulk writes, Cassandra Analytics calculates the MD5 checksum of every 
SSTable it produces. During SSTable upload to Cassandra Sidecar, Cassandra 
Analytics includes the {{content-md5}} header as part of the upload request. 
This information is used by Cassandra Sidecar to validate the integrity of the 
uploaded SSTable and prevent issues with bit flips and corrupted SSTables.

Recently, Cassandra Sidecar introduced [support for additional checksum 
validations|https://issues.apache.org/jira/browse/CASSANDRASC-97] during 
SSTable upload. Notably, XXHash32 digest support was added, which offers more 
performant checksum calculations. This support now allows Cassandra Analytics 
to use a more efficient digest algorithm that is easier on the CPU of both 
Sidecar and Spark.






Re: [PR] CASSANDRA-19180: Support reloading keystore in cassandra-java-driver [cassandra-java-driver]

2024-02-05 Thread via GitHub


absurdfarce commented on code in PR #1907:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1907#discussion_r1478784272


##
core/src/main/java/com/datastax/oss/driver/internal/core/ssl/ReloadingKeyManagerFactory.java:
##
@@ -0,0 +1,253 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package com.datastax.oss.driver.internal.core.ssl;
+
+import com.datastax.oss.driver.shaded.guava.common.annotations.VisibleForTesting;
+import java.io.ByteArrayInputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.Socket;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.security.KeyStore;
+import java.security.KeyStoreException;
+import java.security.MessageDigest;
+import java.security.NoSuchAlgorithmException;
+import java.security.Principal;
+import java.security.PrivateKey;
+import java.security.Provider;
+import java.security.UnrecoverableKeyException;
+import java.security.cert.CertificateException;
+import java.security.cert.X509Certificate;
+import java.time.Duration;
+import java.util.Arrays;
+import java.util.concurrent.Executors;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicReference;
+import javax.net.ssl.KeyManager;
+import javax.net.ssl.KeyManagerFactory;
+import javax.net.ssl.KeyManagerFactorySpi;
+import javax.net.ssl.ManagerFactoryParameters;
+import javax.net.ssl.SSLEngine;
+import javax.net.ssl.X509ExtendedKeyManager;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class ReloadingKeyManagerFactory extends KeyManagerFactory implements AutoCloseable {
+  private static final Logger logger = LoggerFactory.getLogger(ReloadingKeyManagerFactory.class);
+  private static final String KEYSTORE_TYPE = "JKS";
+  private Path keystorePath;
+  private String keystorePassword;
+  private ScheduledExecutorService executor;
+  private final Spi spi;
+
+  // We're using a single thread executor so this shouldn't need to be volatile, since all updates
+  // to lastDigest should come from the same thread
+  private volatile byte[] lastDigest;
+
+  /**
+   * Create a new {@link ReloadingKeyManagerFactory} with the given keystore file and password,
+   * reloading from the file's content at the given interval. This function will do an initial
+   * reload before returning, to confirm that the file exists and is readable.
+   *
+   * @param keystorePath the keystore file to reload
+   * @param keystorePassword the keystore password
+   * @param reloadInterval the duration between reload attempts. Set to {@link
+   *     java.time.Duration#ZERO} to disable scheduled reloading.
+   * @return
+   */
+  public static ReloadingKeyManagerFactory create(
+      Path keystorePath, String keystorePassword, Duration reloadInterval)
+      throws UnrecoverableKeyException, KeyStoreException, NoSuchAlgorithmException,
+          CertificateException, IOException {
+    KeyManagerFactory kmf = KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
+
+    KeyStore ks;
+    try (InputStream ksf = Files.newInputStream(keystorePath)) {
+      ks = KeyStore.getInstance(KEYSTORE_TYPE);
+      ks.load(ksf, keystorePassword.toCharArray());
+    }
+    kmf.init(ks, keystorePassword.toCharArray());
+
+    ReloadingKeyManagerFactory reloadingKeyManagerFactory = new ReloadingKeyManagerFactory(kmf);
+    reloadingKeyManagerFactory.start(keystorePath, keystorePassword, reloadInterval);
+    return reloadingKeyManagerFactory;
+  }
+
+  @VisibleForTesting
+  protected ReloadingKeyManagerFactory(KeyManagerFactory initial) {
+    this(
+        new Spi((X509ExtendedKeyManager) initial.getKeyManagers()[0]),
+        initial.getProvider(),
+        initial.getAlgorithm());
+  }
+
+  private ReloadingKeyManagerFactory(Spi spi, Provider provider, String algorithm) {
+    super(spi, provider, algorithm);
+    this.spi = spi;
+  }
+
+  private void start(Path keystorePath, String keystorePassword, Duration reloadInterval) {
+    this.keystorePath = keystorePath;
+    this.keystorePassword = keystorePassword;
+
+    // Ensure that reload is called once synchronously, to make sure t
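The reload path in the snippet above keeps a lastDigest of the keystore bytes so the key managers are only rebuilt when the file content actually changed. A minimal sketch of that change-detection step, using SHA-256 via the JDK's MessageDigest; the class and method names here are illustrative, not the driver's internals.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class KeystoreChangeSketch {
    private byte[] lastDigest; // digest of the keystore content at the last (re)load

    static byte[] digest(byte[] keystoreBytes) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(keystoreBytes);
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError("SHA-256 is mandated for every JDK", e);
        }
    }

    // Returns true when the keystore content changed since the last call,
    // i.e. when an expensive KeyManagerFactory rebuild is warranted.
    boolean needsReload(byte[] keystoreBytes) {
        byte[] current = digest(keystoreBytes);
        if (lastDigest != null && MessageDigest.isEqual(lastDigest, current))
            return false;      // unchanged: skip the rebuild
        lastDigest = current;  // remember what we loaded
        return true;
    }
}
```

MessageDigest.isEqual is used for the comparison because it is the standard array-equality helper for digests; on a scheduled executor this check makes the periodic reload cheap when the file has not been rotated.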

[jira] [Commented] (CASSANDRA-19367) Refactor SAI so the selection of the index type is not scattered to multiple places

2024-02-05 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814491#comment-17814491
 ] 

David Capwell commented on CASSANDRA-19367:
---

Here are the places isVector is called:

* org.apache.cassandra.index.sai.memory.MemtableIndexManager#update - I don't 
need "update", but not sure why we even special-case vector here? Removing 
"index" in favor of "update" seems fine to me. I also see we don't update 
"memtableIndexWriteLatency" in the vector case... so cleaning that up would 
fix this metric?
* 
org.apache.cassandra.index.sai.plan.StorageAttachedIndexQueryPlan#StorageAttachedIndexQueryPlan
 - this is checking if any index is a vector; if so we are "top-k"... That's 
super specific, but mostly ignored in my POC as I query SAI at a lower level 
than CQL, so I avoid post-filtering and loading the partition/row
* 
org.apache.cassandra.index.sai.disk.v1.V1OnDiskFormat#perColumnIndexComponents 
- Just saw that my POC adds Accord here, but I missed refactoring this to 
Strat... still need to do that for this patch
* org.apache.cassandra.index.sai.disk.v1.IndexWriterConfig#fromOptions - would 
be nice to leverage Strat here, but I don't need it for my use case as it's an 
internal table with an internal index... I validate whenever you try to 
construct the index
* Several cases in StorageAttachedIndex and IndexTermType

I need to fix the v1 format as that does impact my POC, but I'm open to other 
places depending on feedback

> Refactor SAI so the selection of the index type is not scattered to multiple 
> places
> ---
>
> Key: CASSANDRA-19367
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19367
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/2i Index
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For Accord we want to write an internal index, and plugging into SAI is a 
> bit more challenging than it could be… we need to find multiple places 
> where the SAI code “infers” the index type so it can delegate… this logic 
> should be done once and made pluggable so custom SAI indexes can be defined






Re: [PR] CASSANDRA-19180: Support reloading keystore in cassandra-java-driver [cassandra-java-driver]

2024-02-05 Thread via GitHub


absurdfarce commented on code in PR #1907:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1907#discussion_r1478776793


##
upgrade_guide/README.md:
##
@@ -19,6 +19,17 @@ under the License.
 
 ## Upgrade guide
 
+### NEW VERSION PLACEHOLDER

Review Comment:
   Your suggestion seems like a pretty reasonable approach to me @aratno .  In 
the past we'd usually just set the placeholder to whatever we thought the next 
version would be (knowing full well it might be changed as things moved along) 
but I have no objection to just leaving a placeholder in the doc.  Part of the 
release checklist could then become "update the placeholder to the correct 
version string".






[jira] [Commented] (CASSANDRA-19367) Refactor SAI so the selection of the index type is not scattered to multiple places

2024-02-05 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814488#comment-17814488
 ] 

David Capwell commented on CASSANDRA-19367:
---

What we do in Accord at the moment is

{code}
public class RoutingKeyIndex extends StorageAttachedIndex
{
    public RoutingKeyIndex(ColumnFamilyStore baseCfs, IndexMetadata indexMetadata)
    {
        super(baseCfs, indexMetadata);
    }

    @Override
    protected Strategy createStrategy(ColumnFamilyStore baseCfs,
                                      IndexMetadata indexMetadata,
                                      IndexTermType indexTermType,
                                      IndexIdentifier indexIdentifier)
    {
        if (!baseCfs.getKeyspaceName().equals(SchemaConstants.ACCORD_KEYSPACE_NAME))
            throw new IllegalArgumentException("Attempted to use an internal index on the wrong table: " + baseCfs.metadata());
        return new AbstractStrategy(this)
        {
            @Override
            public MemoryIndex createMemoryIndex()
            {
                return new RoutingKeyMemoryIndex(index);
            }

            @Override
            public Flusher flusher()
            {
                return (memtable, indexDescriptor, rowMapping) -> {
                    RoutingKeyMemoryIndex index = memtable.getBacking();
                    SegmentMetadata.ComponentMetadataMap metadataMap = index.writeDirect(indexDescriptor, indexIdentifier, rowMapping::get);

                    return new SegmentMetadata(0,
                                               rowMapping.size(),
                                               0,
                                               rowMapping.maxSSTableRowId,
                                               rowMapping.minKey,
                                               rowMapping.maxKey,
                                               index.getMinTerm(),
                                               index.getMaxTerm(),
                                               metadataMap);
                };
            }

            @Override
            public SegmentBuilder createSegmentBuilder(NamedMemoryLimiter limiter)
            {
                return new AccordRangeSegmentBuilder(index, limiter);
            }

            @Override
            public IndexSegmentSearcher createSearcher(PrimaryKeyMap.Factory primaryKeyMapFactory, PerColumnIndexFiles indexFiles, SegmentMetadata segmentMetadata) throws IOException
            {
                return new RoutingKeyDiskIndexSegmentSearcher(primaryKeyMapFactory, indexFiles, segmentMetadata, index);
            }
        };
    }
}
{code}

{code}
public static final TableMetadata Commands =
    parse(COMMANDS,
          "accord commands",
          "CREATE TABLE %s ("
          + "store_id int,"
          + "domain int," // this is stored as part of txn_id, used currently for cheaper scans of the table
          + format("txn_id %s,", TIMESTAMP_TUPLE)
        ...
          + "route blob,"
        ...
          + "PRIMARY KEY((store_id, domain, txn_id))"
          + ')')
    .partitioner(new LocalPartitioner(CompositeType.getInstance(Int32Type.instance, Int32Type.instance, TIMESTAMP_TYPE)))
    .indexes(Indexes.builder()
             .add(IndexMetadata.fromSchemaMetadata("route", IndexMetadata.Kind.CUSTOM, ImmutableMap.of("class_name", RoutingKeyIndex.class.getCanonicalName(), "target", "route")))
             .build())
    .build();
{code}







[jira] [Updated] (CASSANDRA-19368) Add way for SAI to disable row to token index so internal tables may leverage SAI

2024-02-05 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-19368:
--
Change Category: Semantic
 Complexity: Normal
  Fix Version/s: 5.x
 Status: Open  (was: Triage Needed)

> Add way for SAI to disable row to token index so internal tables may leverage 
> SAI
> -
>
> Key: CASSANDRA-19368
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19368
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/2i Index
>Reporter: David Capwell
>Priority: Normal
> Fix For: 5.x
>
>
> Internal tables tend to use LocalPartitioner and so may not actually have 
> murmur tokens, but rather LocalPartitioner tokens, which are variable-length 
> byte tokens! For internal use cases we don’t always care about paging, so we 
> don’t really need this index to function.
> The use case motivating this work is Accord: we wish to add a custom SAI 
> index on the system_accord.commands#routes column. Since this logic is 
> purely internal we don’t care about paging, but we cannot leverage SAI at 
> the moment as it hard-codes murmur tokens and fails during memtable flush






[jira] [Created] (CASSANDRA-19368) Add way for SAI to disable row to token index so internal tables may leverage SAI

2024-02-05 Thread David Capwell (Jira)
David Capwell created CASSANDRA-19368:
-

 Summary: Add way for SAI to disable row to token index so internal 
tables may leverage SAI
 Key: CASSANDRA-19368
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19368
 Project: Cassandra
  Issue Type: Improvement
  Components: Feature/2i Index
Reporter: David Capwell


Internal tables tend to use LocalPartitioner and so may not actually have 
murmur tokens, but rather LocalPartitioner tokens, which are variable-length 
byte tokens! For internal use cases we don’t always care about paging, so we 
don’t really need this index to function.

The use case motivating this work is Accord: we wish to add a custom SAI index 
on the system_accord.commands#routes column. Since this logic is purely 
internal we don’t care about paging, but we cannot leverage SAI at the moment 
as it hard-codes murmur tokens and fails during memtable flush






[jira] [Commented] (CASSANDRA-19367) Refactor SAI so the selection of the index type is not scattered to multiple places

2024-02-05 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814487#comment-17814487
 ] 

David Capwell commented on CASSANDRA-19367:
---

[~mike_tr_adamson] just sent out a patch. I didn't do IndexTermType as that's 
kinda annoying for Accord... we are a "blob", but really we want to have 
custom/internal logic and try to hide the fact it's a blob from SAI







[jira] [Updated] (CASSANDRA-19367) Refactor SAI so the selection of the index type is not scattered to multiple places

2024-02-05 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-19367:
--
Test and Documentation Plan: existing tests
 Status: Patch Available  (was: Open)







[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics

2024-02-05 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-19366:
--
Fix Version/s: 5.x
   (was: 5.1)

> Expose mode of authentication in system_views.clients, nodetool clientstats, 
> and ClientMetrics
> --
>
> Key: CASSANDRA-19366
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19366
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Encryption, Messaging/Client, Observability/JMX, 
> Observability/Metrics, Tool/nodetool
>Reporter: Andy Tolbert
>Assignee: Andy Tolbert
>Priority: Normal
> Fix For: 5.x
>
> Attachments: CASSANDRA-19366-trunk-1_test_results.tgz, 
> CASSANDRA-19366-trunk-1_test_results_summary.html
>
>
> CASSANDRA-18554 added support for mTLS-authenticated clients. Part of this 
> contribution introduced {{{}MutualTlsWithPasswordFallbackAuthenticator{}}}, 
> which enables Cassandra to support either password- or mTLS-authenticated 
> connections.
> As an operator, it would be useful to know which connections are mTLS 
> authenticated and which are password authenticated, as a possible mode of 
> operation is migrating users from one form of authentication to another. It 
> would also be useful to know, when authentication attempts are failing, 
> which mode of authentication is unsuccessful.
> Proposing to add the following:
>  * Add a {{mode: string}} and {{metadata: map}} to 
> {{{}AuthenticatedUser{}}}. Update existing {{IAuthenticator}} implementations 
> to pass {{mode}} (e.g. {{password}} , {{{}mtls{}}}), and optionally pass a 
> {{metadata}} map (e.g. this can include the extracted {{identity}} from a 
> client certificate for {{mtls}} authentication).
>  * Update nodetool clientstats to add a new option flag {{{}--metadata{}}}, 
> which when passed exposes these new fields on {{{}AuthenticatedUser{}}}. (Not 
> added to existing output to maintain compatibility, much like 
> {{--client-options}} did.)
>  * Update {{system_views.clients}} to include columns for these new fields.
>  * Add new metrics to {{{}ClientMetrics{}}}:
>  ** Track authentication success and failures by mode. (Note: the metrics 
> present for each authentication-mode scope are contextual, based on the 
> authenticator used; e.g. only {{scope=Password}} will be present for 
> {{{}PasswordAuthenticator{}}})
> {noformat}
> Existing:
> org.apache.cassandra.metrics:name=AuthSuccess,type=Client
> org.apache.cassandra.metrics:name=AuthFailure,type=Client
> New:
> org.apache.cassandra.metrics:name=AuthSuccess,scope=Mtls,type=Client
> org.apache.cassandra.metrics:name=AuthSuccess,scope=Password,type=Client
> org.apache.cassandra.metrics:name=AuthFailure,scope=Mtls,type=Client
> org.apache.cassandra.metrics:name=AuthFailure,scope=Password,type=Client
> {noformat}
>  * 
>  ** Track connection counts by mode:
> {noformat}
> Existing:
> org.apache.cassandra.metrics:name=ConnectedNativeClients,type=Client
> org.apache.cassandra.metrics:name=connectedNativeClients,type=Client 
> (previously deprecated but still maintained)
> New:
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Mtls,type=Client
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Password,type=Client
> {noformat}
>  * 
>  ** A metric to track encrypted vs. non-encrypted connections:
> {noformat}
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Encrypted,type=Client
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Unencrypted,type=Client
> {noformat}
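As a rough illustration of the per-mode counters proposed above, here is a stdlib-only sketch: the class name, methods, and aggregation scheme are assumptions for illustration, not Cassandra's actual {{ClientMetrics}} API (which uses the Dropwizard metrics registry).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch: each auth event bumps both the aggregate metric and
// the per-mode scoped metric, keyed by JMX-style object names.
public class AuthMetricsSketch {
    private static final Map<String, LongAdder> COUNTERS = new ConcurrentHashMap<>();

    // Mirrors the proposed naming; scope is omitted for the aggregate metric.
    static String name(String metric, String scope) {
        return scope == null
             ? "org.apache.cassandra.metrics:name=" + metric + ",type=Client"
             : "org.apache.cassandra.metrics:name=" + metric + ",scope=" + scope + ",type=Client";
    }

    public static void markAuthSuccess(String mode) {
        COUNTERS.computeIfAbsent(name("AuthSuccess", null), k -> new LongAdder()).increment();
        COUNTERS.computeIfAbsent(name("AuthSuccess", mode), k -> new LongAdder()).increment();
    }

    public static long count(String metric, String scope) {
        LongAdder a = COUNTERS.get(name(metric, scope));
        return a == null ? 0 : a.sum();
    }

    public static void main(String[] args) {
        markAuthSuccess("Password");
        markAuthSuccess("Mtls");
        markAuthSuccess("Mtls");
        System.out.println(count("AuthSuccess", null));   // 3
        System.out.println(count("AuthSuccess", "Mtls")); // 2
    }
}
```

The point of the double increment is that existing dashboards keyed on the aggregate {{AuthSuccess}} metric keep working while new per-mode scoped metrics appear alongside it.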



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19362) An "include" is broken on the Storage Engine documentation page

2024-02-05 Thread Lorina Poland (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814486#comment-17814486
 ] 

Lorina Poland commented on CASSANDRA-19362:
---

The link is broken in versions 4.1, 4.0, and 3.11 (but not 5.0) because the 
include directive is not correct.

 

The correct include is: 
{code:java}
include::cassandra:example$BASH/find_sstables.sh[]{code}

> An "include" is broken on the Storage Engine documentation page
> ---
>
> Key: CASSANDRA-19362
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19362
> Project: Cassandra
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Jeremy Hanna
>Assignee: Lorina Poland
>Priority: Normal
>
> The example code at the bottom of the "Storage Engine" page doesn't appear to 
> be including the code properly.  See 
> https://cassandra.apache.org/doc/stable/cassandra/architecture/storage_engine.html#example-code



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19367) Refactor SAI so the selection of the index type is not scattered to multiple places

2024-02-05 Thread Mike Adamson (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814485#comment-17814485
 ] 

Mike Adamson commented on CASSANDRA-19367:
--

The obvious place for this would be in {{IndexTermType}} where we could replace 
{{isLiteral}} and {{isVector}} with a {{getStrategy}} (or some such). The 
{{Strategy}} would then need to handle all the conditionals where the above 
methods are used.

Apart from anything else, this would tidy up a lot of the current code paths 
where we are constantly checking the index type.

> Refactor SAI so the selection of the index type is not scattered to multiple 
> places
> ---
>
> Key: CASSANDRA-19367
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19367
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/2i Index
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 5.x
>
>
> For Accord we want to write an internal index, and we find that plugging into 
> SAI is a bit more challenging than it could be… we need to find the multiple 
> places where the SAI code “infers” the index type so it can delegate… this 
> logic should be done once and made pluggable so custom SAI indexes can be 
> defined.
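A minimal sketch of the single-lookup idea: fold the scattered index-type conditionals into one pluggable strategy selection. The names ({{IndexStrategySketch}}, {{forTerm}}, {{TermKind}}) are illustrative assumptions, not Cassandra's actual SAI API.

```java
// Hypothetical sketch: one place decides the index strategy, so call sites
// delegate instead of repeating isLiteral()/isVector()-style checks.
public class IndexStrategySketch {
    enum TermKind { LITERAL, VECTOR, NUMERIC }

    interface IndexStrategy {
        String indexType();
    }

    static IndexStrategy forTerm(TermKind kind) {
        // A single switch replaces conditionals scattered across the codebase;
        // a custom SAI index (e.g. for Accord) would plug in its own strategy here.
        switch (kind) {
            case LITERAL: return () -> "literal";
            case VECTOR:  return () -> "vector";
            default:      return () -> "numeric";
        }
    }

    public static void main(String[] args) {
        System.out.println(forTerm(TermKind.VECTOR).indexType()); // vector
    }
}
```

Call sites then ask the strategy what to do rather than re-inferring the index type, which is the pluggability the ticket asks for.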



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics

2024-02-05 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814482#comment-17814482
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19366 at 2/5/24 6:49 PM:
---

I did the first pass of the PR (minus tests)


was (Author: smiklosovic):
I did the first pass of the PR.

> Expose mode of authentication in system_views.clients, nodetool clientstats, 
> and ClientMetrics
> --
>
> Key: CASSANDRA-19366
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19366
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Encryption, Messaging/Client, Observability/JMX, 
> Observability/Metrics, Tool/nodetool
>Reporter: Andy Tolbert
>Assignee: Andy Tolbert
>Priority: Normal
> Fix For: 5.1
>
> Attachments: CASSANDRA-19366-trunk-1_test_results.tgz, 
> CASSANDRA-19366-trunk-1_test_results_summary.html
>
>
> CASSANDRA-18554 added support for mTLS-authenticated clients. Part of this 
> contribution introduced {{{}MutualTlsWithPasswordFallbackAuthenticator{}}}, 
> which enables Cassandra to support either password- or mTLS-authenticated 
> connections.
> As an operator, it would be useful to know which connections are mTLS 
> authenticated, and which are password authenticated, as a possible mode of 
> operation is migrating users from one form of authentication to another. It 
> would also be useful to know, when authentication attempts are failing, 
> which mode of authentication is unsuccessful.
> Proposing to add the following:
>  * Add a {{mode: string}} and {{metadata: map}} to 
> {{{}AuthenticatedUser{}}}. Update existing {{IAuthenticator}} implementations 
> to pass {{mode}} (e.g. {{password}} , {{{}mtls{}}}), and optionally pass a 
> {{metadata}} map (e.g. this can include the extracted {{identity}} from a 
> client certificate for {{mtls}} authentication).
>  * Update nodetool clientstats to add a new option flag {{{}--metadata{}}}, 
> which when passed exposes these new fields on {{{}AuthenticatedUser{}}}. (Not 
> added to existing output to maintain compatibility, much like 
> {{--client-options}} did.)
>  * Update {{system_views.clients}} to include columns for these new fields.
>  * Add new metrics to {{{}ClientMetrics{}}}:
>  ** Track authentication success and failures by mode. (Note: which per-mode 
> scoped metrics are present depends on the Authenticator in use; e.g. only 
> {{scope=Password}} will be present for {{{}PasswordAuthenticator{}}}.)
> {noformat}
> Existing:
> org.apache.cassandra.metrics:name=AuthSuccess,type=Client
> org.apache.cassandra.metrics:name=AuthFailure,type=Client
> New:
> org.apache.cassandra.metrics:name=AuthSuccess,scope=Mtls,type=Client
> org.apache.cassandra.metrics:name=AuthSuccess,scope=Password,type=Client
> org.apache.cassandra.metrics:name=AuthFailure,scope=Mtls,type=Client
> org.apache.cassandra.metrics:name=AuthFailure,scope=Password,type=Client
> {noformat}
>  * 
>  ** Track connection counts by mode:
> {noformat}
> Existing:
> org.apache.cassandra.metrics:name=ConnectedNativeClients,type=Client
> org.apache.cassandra.metrics:name=connectedNativeClients,type=Client 
> (previously deprecated but still maintained)
> New:
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Mtls,type=Client
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Password,type=Client
> {noformat}
>  * 
>  ** A metric to track encrypted vs. non-encrypted connections:
> {noformat}
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Encrypted,type=Client
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Unencrypted,type=Client
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics

2024-02-05 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-19366:
--
Reviewers: Stefan Miklosovic
   Status: Review In Progress  (was: Patch Available)

> Expose mode of authentication in system_views.clients, nodetool clientstats, 
> and ClientMetrics
> --
>
> Key: CASSANDRA-19366
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19366
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Encryption, Messaging/Client, Observability/JMX, 
> Observability/Metrics, Tool/nodetool
>Reporter: Andy Tolbert
>Assignee: Andy Tolbert
>Priority: Normal
> Fix For: 5.1
>
> Attachments: CASSANDRA-19366-trunk-1_test_results.tgz, 
> CASSANDRA-19366-trunk-1_test_results_summary.html
>
>
> CASSANDRA-18554 added support for mTLS-authenticated clients. Part of this 
> contribution introduced {{{}MutualTlsWithPasswordFallbackAuthenticator{}}}, 
> which enables Cassandra to support either password- or mTLS-authenticated 
> connections.
> As an operator, it would be useful to know which connections are mTLS 
> authenticated, and which are password authenticated, as a possible mode of 
> operation is migrating users from one form of authentication to another. It 
> would also be useful to know, when authentication attempts are failing, 
> which mode of authentication is unsuccessful.
> Proposing to add the following:
>  * Add a {{mode: string}} and {{metadata: map}} to 
> {{{}AuthenticatedUser{}}}. Update existing {{IAuthenticator}} implementations 
> to pass {{mode}} (e.g. {{password}} , {{{}mtls{}}}), and optionally pass a 
> {{metadata}} map (e.g. this can include the extracted {{identity}} from a 
> client certificate for {{mtls}} authentication).
>  * Update nodetool clientstats to add a new option flag {{{}--metadata{}}}, 
> which when passed exposes these new fields on {{{}AuthenticatedUser{}}}. (Not 
> added to existing output to maintain compatibility, much like 
> {{--client-options}} did.)
>  * Update {{system_views.clients}} to include columns for these new fields.
>  * Add new metrics to {{{}ClientMetrics{}}}:
>  ** Track authentication success and failures by mode. (Note: which per-mode 
> scoped metrics are present depends on the Authenticator in use; e.g. only 
> {{scope=Password}} will be present for {{{}PasswordAuthenticator{}}}.)
> {noformat}
> Existing:
> org.apache.cassandra.metrics:name=AuthSuccess,type=Client
> org.apache.cassandra.metrics:name=AuthFailure,type=Client
> New:
> org.apache.cassandra.metrics:name=AuthSuccess,scope=Mtls,type=Client
> org.apache.cassandra.metrics:name=AuthSuccess,scope=Password,type=Client
> org.apache.cassandra.metrics:name=AuthFailure,scope=Mtls,type=Client
> org.apache.cassandra.metrics:name=AuthFailure,scope=Password,type=Client
> {noformat}
>  * 
>  ** Track connection counts by mode:
> {noformat}
> Existing:
> org.apache.cassandra.metrics:name=ConnectedNativeClients,type=Client
> org.apache.cassandra.metrics:name=connectedNativeClients,type=Client 
> (previously deprecated but still maintained)
> New:
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Mtls,type=Client
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Password,type=Client
> {noformat}
>  * 
>  ** A metric to track encrypted vs. non-encrypted connections:
> {noformat}
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Encrypted,type=Client
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Unencrypted,type=Client
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics

2024-02-05 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-19366:
--
Status: Changes Suggested  (was: Review In Progress)

I did the first pass of the PR.

> Expose mode of authentication in system_views.clients, nodetool clientstats, 
> and ClientMetrics
> --
>
> Key: CASSANDRA-19366
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19366
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Encryption, Messaging/Client, Observability/JMX, 
> Observability/Metrics, Tool/nodetool
>Reporter: Andy Tolbert
>Assignee: Andy Tolbert
>Priority: Normal
> Fix For: 5.1
>
> Attachments: CASSANDRA-19366-trunk-1_test_results.tgz, 
> CASSANDRA-19366-trunk-1_test_results_summary.html
>
>
> CASSANDRA-18554 added support for mTLS-authenticated clients. Part of this 
> contribution introduced {{{}MutualTlsWithPasswordFallbackAuthenticator{}}}, 
> which enables Cassandra to support either password- or mTLS-authenticated 
> connections.
> As an operator, it would be useful to know which connections are mTLS 
> authenticated, and which are password authenticated, as a possible mode of 
> operation is migrating users from one form of authentication to another. It 
> would also be useful to know, when authentication attempts are failing, 
> which mode of authentication is unsuccessful.
> Proposing to add the following:
>  * Add a {{mode: string}} and {{metadata: map}} to 
> {{{}AuthenticatedUser{}}}. Update existing {{IAuthenticator}} implementations 
> to pass {{mode}} (e.g. {{password}} , {{{}mtls{}}}), and optionally pass a 
> {{metadata}} map (e.g. this can include the extracted {{identity}} from a 
> client certificate for {{mtls}} authentication).
>  * Update nodetool clientstats to add a new option flag {{{}--metadata{}}}, 
> which when passed exposes these new fields on {{{}AuthenticatedUser{}}}. (Not 
> added to existing output to maintain compatibility, much like 
> {{--client-options}} did.)
>  * Update {{system_views.clients}} to include columns for these new fields.
>  * Add new metrics to {{{}ClientMetrics{}}}:
>  ** Track authentication success and failures by mode. (Note: which per-mode 
> scoped metrics are present depends on the Authenticator in use; e.g. only 
> {{scope=Password}} will be present for {{{}PasswordAuthenticator{}}}.)
> {noformat}
> Existing:
> org.apache.cassandra.metrics:name=AuthSuccess,type=Client
> org.apache.cassandra.metrics:name=AuthFailure,type=Client
> New:
> org.apache.cassandra.metrics:name=AuthSuccess,scope=Mtls,type=Client
> org.apache.cassandra.metrics:name=AuthSuccess,scope=Password,type=Client
> org.apache.cassandra.metrics:name=AuthFailure,scope=Mtls,type=Client
> org.apache.cassandra.metrics:name=AuthFailure,scope=Password,type=Client
> {noformat}
>  * 
>  ** Track connection counts by mode:
> {noformat}
> Existing:
> org.apache.cassandra.metrics:name=ConnectedNativeClients,type=Client
> org.apache.cassandra.metrics:name=connectedNativeClients,type=Client 
> (previously deprecated but still maintained)
> New:
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Mtls,type=Client
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Password,type=Client
> {noformat}
>  * 
>  ** A metric to track encrypted vs. non-encrypted connections:
> {noformat}
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Encrypted,type=Client
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Unencrypted,type=Client
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19254) "comments" keyword on docs page should be "comment"

2024-02-05 Thread Lorina Poland (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814479#comment-17814479
 ] 

Lorina Poland commented on CASSANDRA-19254:
---

Rolling into CASSANDRA-19249, since it is a minor issue.

> "comments" keyword on docs page should be "comment"
> ---
>
> Key: CASSANDRA-19254
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19254
> Project: Cassandra
>  Issue Type: Bug
>  Components: Documentation/Website
>Reporter: Stefano Lottini
>Assignee: Lorina Poland
>Priority: Low
>
> Low-priority nitpick: the CREATE TABLE [docs 
> page|https://cassandra.apache.org/doc/latest/cassandra/reference/cql-commands/create-table.html#table_options]
>  has
> {{comments = 'some text that describes the table'}}
> with plural `comments`, while the correct keyword to use is `comment` 
> (singular).
> Using the plural form would result in the following error when running the 
> DDL statement: _SyntaxException: Unknown property 'comments'_
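For reference, a sketch of the difference (keyspace, table, and column names here are made up for illustration; the {{comment}} table option itself is standard CQL):

```sql
-- Correct: singular 'comment' table option
CREATE TABLE ks.events (
    id uuid PRIMARY KEY,
    body text
) WITH comment = 'some text that describes the table';

-- Incorrect: plural 'comments' fails with
-- SyntaxException: Unknown property 'comments'
```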



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19367) Refactor SAI so the selection of the index type is not scattered to multiple places

2024-02-05 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-19367:
--
Change Category: Code Clarity
 Complexity: Low Hanging Fruit
  Fix Version/s: 5.x
 Status: Open  (was: Triage Needed)

> Refactor SAI so the selection of the index type is not scattered to multiple 
> places
> ---
>
> Key: CASSANDRA-19367
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19367
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/2i Index
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 5.x
>
>
> For Accord we want to write an internal index, and we find that plugging into 
> SAI is a bit more challenging than it could be… we need to find the multiple 
> places where the SAI code “infers” the index type so it can delegate… this 
> logic should be done once and made pluggable so custom SAI indexes can be 
> defined.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-19367) Refactor SAI so the selection of the index type is not scattered to multiple places

2024-02-05 Thread David Capwell (Jira)
David Capwell created CASSANDRA-19367:
-

 Summary: Refactor SAI so the selection of the index type is not 
scattered to multiple places
 Key: CASSANDRA-19367
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19367
 Project: Cassandra
  Issue Type: Improvement
  Components: Feature/2i Index
Reporter: David Capwell
Assignee: David Capwell


For Accord we want to write an internal index, and we find that plugging into 
SAI is a bit more challenging than it could be… we need to find the multiple 
places where the SAI code “infers” the index type so it can delegate… this 
logic should be done once and made pluggable so custom SAI indexes can be 
defined.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics

2024-02-05 Thread Andy Tolbert (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814475#comment-17814475
 ] 

Andy Tolbert commented on CASSANDRA-19366:
--

Awesome, thank you [~smiklosovic]!  I see you had some feedback already; 
appreciate you taking a look. I'll go through your comments and make changes 
this afternoon.

> Expose mode of authentication in system_views.clients, nodetool clientstats, 
> and ClientMetrics
> --
>
> Key: CASSANDRA-19366
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19366
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Encryption, Messaging/Client, Observability/JMX, 
> Observability/Metrics, Tool/nodetool
>Reporter: Andy Tolbert
>Assignee: Andy Tolbert
>Priority: Normal
> Fix For: 5.1
>
> Attachments: CASSANDRA-19366-trunk-1_test_results.tgz, 
> CASSANDRA-19366-trunk-1_test_results_summary.html
>
>
> CASSANDRA-18554 added support for mTLS-authenticated clients. Part of this 
> contribution introduced {{{}MutualTlsWithPasswordFallbackAuthenticator{}}}, 
> which enables Cassandra to support either password- or mTLS-authenticated 
> connections.
> As an operator, it would be useful to know which connections are mTLS 
> authenticated, and which are password authenticated, as a possible mode of 
> operation is migrating users from one form of authentication to another. It 
> would also be useful to know, when authentication attempts are failing, 
> which mode of authentication is unsuccessful.
> Proposing to add the following:
>  * Add a {{mode: string}} and {{metadata: map}} to 
> {{{}AuthenticatedUser{}}}. Update existing {{IAuthenticator}} implementations 
> to pass {{mode}} (e.g. {{password}} , {{{}mtls{}}}), and optionally pass a 
> {{metadata}} map (e.g. this can include the extracted {{identity}} from a 
> client certificate for {{mtls}} authentication).
>  * Update nodetool clientstats to add a new option flag {{{}--metadata{}}}, 
> which when passed exposes these new fields on {{{}AuthenticatedUser{}}}. (Not 
> added to existing output to maintain compatibility, much like 
> {{--client-options}} did.)
>  * Update {{system_views.clients}} to include columns for these new fields.
>  * Add new metrics to {{{}ClientMetrics{}}}:
>  ** Track authentication success and failures by mode. (Note: which per-mode 
> scoped metrics are present depends on the Authenticator in use; e.g. only 
> {{scope=Password}} will be present for {{{}PasswordAuthenticator{}}}.)
> {noformat}
> Existing:
> org.apache.cassandra.metrics:name=AuthSuccess,type=Client
> org.apache.cassandra.metrics:name=AuthFailure,type=Client
> New:
> org.apache.cassandra.metrics:name=AuthSuccess,scope=Mtls,type=Client
> org.apache.cassandra.metrics:name=AuthSuccess,scope=Password,type=Client
> org.apache.cassandra.metrics:name=AuthFailure,scope=Mtls,type=Client
> org.apache.cassandra.metrics:name=AuthFailure,scope=Password,type=Client
> {noformat}
>  * 
>  ** Track connection counts by mode:
> {noformat}
> Existing:
> org.apache.cassandra.metrics:name=ConnectedNativeClients,type=Client
> org.apache.cassandra.metrics:name=connectedNativeClients,type=Client 
> (previously deprecated but still maintained)
> New:
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Mtls,type=Client
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Password,type=Client
> {noformat}
>  * 
>  ** A metric to track encrypted vs. non-encrypted connections:
> {noformat}
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Encrypted,type=Client
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Unencrypted,type=Client
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19182) IR may leak SSTables with pending repair when coming from streaming

2024-02-05 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814476#comment-17814476
 ] 

David Capwell commented on CASSANDRA-19182:
---

[~e.dimitrova] issues with CI... since this goes back to 4.0 we need our CI 
working correctly, and it has been having issues with 4.0... so merging this got 
put on hold until that's fixed =(

> IR may leak SSTables with pending repair when coming from streaming
> ---
>
> Key: CASSANDRA-19182
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19182
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
> Attachments: 
> ci_summary-trunk-a1010f4101bf259de3f31077540e4f987d5df9c5.html
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> There is a race condition where SSTables from streaming may race with pending 
> repair cleanup in compaction causing us to cleanup the pending repair state 
> in compaction while the SSTables are being added to it; this leads to IR 
> failing in the future when those files get selected for repair.
> This problem was hard to track down as the in-memory state was wiped, so we 
> don’t have any details. To better aid these types of investigation, we should 
> make sure the repair vtables get updated when IR session failures are 
> submitted.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra-sidecar) branch trunk updated: ninja fix: update CHANGES for ee454741

2024-02-05 Thread ycai
This is an automated email from the ASF dual-hosted git repository.

ycai pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra-sidecar.git


The following commit(s) were added to refs/heads/trunk by this push:
 new c72f217  ninja fix: update CHANGES for ee454741
c72f217 is described below

commit c72f2179143e7e031f247d3e8385a29c5e64c1c3
Author: Yifan Cai <52585731+yifa...@users.noreply.github.com>
AuthorDate: Mon Feb 5 10:33:27 2024 -0800

ninja fix: update CHANGES for ee454741
---
 CHANGES.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGES.txt b/CHANGES.txt
index 1c3a6f8..85dc0d5 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,5 +1,6 @@
 1.0.0
 -
+ * Break restore job into stage and import phases and persist restore slice 
status on phase completion (CASSANDRASC-99)
  * Improve logging for traffic shaping / rate limiting configuration 
(CASSANDRASC-98)
  * Startup Validation Failures when Checking Sidecar Connectivity 
(CASSANDRASC-86)
  * Add support for additional digest validation during SSTable upload 
(CASSANDRASC-97)


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRASC-99) Break restore job into stage and import phases and persist restore slice status on phase completion

2024-02-05 Thread Yifan Cai (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRASC-99?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRASC-99:
-
  Fix Version/s: 1.0
Source Control Link: 
https://github.com/apache/cassandra-sidecar/commit/ee454741363f3f693726af242c5ec37ad1480d60
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Break restore job into stage and import phases and persist restore slice 
> status on phase completion
> ---
>
> Key: CASSANDRASC-99
> URL: https://issues.apache.org/jira/browse/CASSANDRASC-99
> Project: Sidecar for Apache Cassandra
>  Issue Type: Improvement
>  Components: Rest API
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 1.0
>
>
> To improve the resilience of the restore-SSTables-from-S3 tasks, we want to 
> break the task into multiple phases and persist the status of each slice.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRASC-99) Break restore job into stage and import phases and persist restore slice status on phase completion

2024-02-05 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRASC-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814473#comment-17814473
 ] 

ASF subversion and git services commented on CASSANDRASC-99:


Commit ee454741363f3f693726af242c5ec37ad1480d60 in cassandra-sidecar's branch 
refs/heads/trunk from Yifan Cai
[ https://gitbox.apache.org/repos/asf?p=cassandra-sidecar.git;h=ee45474 ]

CASSANDRASC-99 Break restore job into stage and import phases and persist 
restore slice status on phase completion

patch by Yifan Cai; reviewed by Doug Rohrer, Francisco Guerrero for 
CASSANDRASC-99


> Break restore job into stage and import phases and persist restore slice 
> status on phase completion
> ---
>
> Key: CASSANDRASC-99
> URL: https://issues.apache.org/jira/browse/CASSANDRASC-99
> Project: Sidecar for Apache Cassandra
>  Issue Type: Improvement
>  Components: Rest API
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>  Labels: pull-request-available
>
> To improve the resilience of the restore-SSTables-from-S3 tasks, we want to 
> break the task into multiple phases and persist the status of each slice.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra-sidecar) branch trunk updated: CASSANDRASC-99 Break restore job into stage and import phases and persist restore slice status on phase completion

2024-02-05 Thread ycai
This is an automated email from the ASF dual-hosted git repository.

ycai pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra-sidecar.git


The following commit(s) were added to refs/heads/trunk by this push:
 new ee45474  CASSANDRASC-99 Break restore job into stage and import phases 
and persist restore slice status on phase completion
ee45474 is described below

commit ee454741363f3f693726af242c5ec37ad1480d60
Author: Yifan Cai 
AuthorDate: Mon Jan 29 16:09:25 2024 -0800

CASSANDRASC-99 Break restore job into stage and import phases and persist 
restore slice status on phase completion

patch by Yifan Cai; reviewed by Doug Rohrer, Francisco Guerrero for 
CASSANDRASC-99
---
 .../data/CreateRestoreJobRequestPayload.java   |  28 ++-
 .../sidecar/common/data/RestoreJobConstants.java   |   1 +
 .../sidecar/common/data/RestoreJobStatus.java  |   1 +
 .../sidecar/common/data/RestoreSliceStatus.java|  37 ++-
 .../data/CreateRestoreJobRequestPayloadTest.java   |   6 +-
 .../common/data/RestoreSliceStatusTest.java|  83 +++
 spotbugs-exclude.xml   |   1 +
 .../config/yaml/RestoreJobConfigurationImpl.java   |  16 +-
 .../apache/cassandra/sidecar/db/RestoreJob.java|  85 ---
 .../sidecar/db/RestoreJobDatabaseAccessor.java |  21 +-
 .../apache/cassandra/sidecar/db/RestoreSlice.java  |  97 ++--
 .../sidecar/db/RestoreSliceDatabaseAccessor.java   |  47 ++--
 .../sidecar/db/schema/RestoreJobsSchema.java   |   5 +-
 .../sidecar/db/schema/RestoreSlicesSchema.java |   2 +-
 .../sidecar/locator/CachedLocalTokenRanges.java| 276 +
 .../sidecar/locator/LocalTokenRangesProvider.java  |  41 +++
 .../sidecar/restore/RestoreJobDiscoverer.java  |  55 +++-
 .../cassandra/sidecar/restore/RestoreJobUtil.java  |   2 +-
 .../sidecar/restore/RestoreProcessor.java  |  36 ++-
 .../sidecar/restore/RestoreSliceTask.java  | 118 +++--
 .../cassandra/sidecar/restore/StorageClient.java   |   2 +-
 .../routes/restore/AbortRestoreJobHandler.java |   6 +-
 .../routes/restore/CreateRestoreJobHandler.java|   2 +-
 .../routes/restore/CreateRestoreSliceHandler.java  |   2 +-
 .../routes/restore/UpdateRestoreJobHandler.java|  17 +-
 .../db/RestoreJobsDatabaseAccessorIntTest.java |  12 +-
 .../testing/ConfigurableCassandraTestContext.java  |  43 +++-
 .../cassandra/sidecar/db/RestoreJobTest.java   |  16 ++
 .../cassandra/sidecar/db/SidecarSchemaTest.java|  53 +++-
 .../sidecar/restore/RestoreJobDiscovererTest.java  |  84 ---
 .../sidecar/restore/RestoreJobManagerTest.java |   7 +-
 .../sidecar/restore/RestoreProcessorTest.java  |   3 +-
 .../sidecar/restore/RestoreSliceTaskTest.java  | 113 +++--
 .../sidecar/restore/RestoreSliceTest.java  |   2 +-
 .../routes/restore/BaseRestoreJobTests.java|   1 -
 .../restore/RestoreJobSummaryHandlerTest.java  |  29 ++-
 .../restore/UpdateRestoreJobHandlerTest.java   |  10 +-
 .../sidecar/utils/AsyncFileSystemUtilsTest.java| 111 +
 38 files changed, 1255 insertions(+), 216 deletions(-)

diff --git 
a/common/src/main/java/org/apache/cassandra/sidecar/common/data/CreateRestoreJobRequestPayload.java
 
b/common/src/main/java/org/apache/cassandra/sidecar/common/data/CreateRestoreJobRequestPayload.java
index 12858d8..0e5a9a0 100644
--- 
a/common/src/main/java/org/apache/cassandra/sidecar/common/data/CreateRestoreJobRequestPayload.java
+++ 
b/common/src/main/java/org/apache/cassandra/sidecar/common/data/CreateRestoreJobRequestPayload.java
@@ -26,8 +26,10 @@ import java.util.function.Consumer;
 import com.fasterxml.jackson.annotation.JsonCreator;
 import com.fasterxml.jackson.annotation.JsonProperty;
 import org.apache.cassandra.sidecar.common.utils.Preconditions;
+import org.jetbrains.annotations.Nullable;
 
 import static 
org.apache.cassandra.sidecar.common.data.RestoreJobConstants.JOB_AGENT;
+import static 
org.apache.cassandra.sidecar.common.data.RestoreJobConstants.JOB_CONSISTENCY_LEVEL;
 import static 
org.apache.cassandra.sidecar.common.data.RestoreJobConstants.JOB_EXPIRE_AT;
 import static 
org.apache.cassandra.sidecar.common.data.RestoreJobConstants.JOB_ID;
 import static 
org.apache.cassandra.sidecar.common.data.RestoreJobConstants.JOB_IMPORT_OPTIONS;
@@ -43,6 +45,8 @@ public class CreateRestoreJobRequestPayload
 private final RestoreJobSecrets secrets;
 private final SSTableImportOptions importOptions;
 private final long expireAtInMillis;
+@Nullable
+private final String consistencyLevel; // optional field
 
 /**
  * Builder to build a CreateRestoreJobRequest
@@ -65,13 +69,15 @@ public class CreateRestoreJobRequestPayload
  * @param secrets  secrets to be used by restore job to download 
data
  * @param importOptionsthe configured options for SSTable import
  * @param expireAtInMillis a timestamp in the future 

[jira] [Updated] (CASSANDRASC-99) Break restore job into stage and import phases and persist restore slice status on phase completion

2024-02-05 Thread Yifan Cai (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRASC-99?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRASC-99:
-
Status: Ready to Commit  (was: Review In Progress)

> Break restore job into stage and import phases and persist restore slice 
> status on phase completion
> ---
>
> Key: CASSANDRASC-99
> URL: https://issues.apache.org/jira/browse/CASSANDRASC-99
> Project: Sidecar for Apache Cassandra
>  Issue Type: Improvement
>  Components: Rest API
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>  Labels: pull-request-available
>
> In order to improve the resilience of the restore-SSTables-from-S3 tasks, we 
> want to break the task into multiple phases and persist the status of each slice.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRASC-99) Break restore job into stage and import phases and persist restore slice status on phase completion

2024-02-05 Thread Yifan Cai (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRASC-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814470#comment-17814470
 ] 

Yifan Cai commented on CASSANDRASC-99:
--

CI is green 
https://app.circleci.com/pipelines/github/yifan-c/cassandra-sidecar/45/workflows/a642b270-088d-4442-9355-2f392365f44c

> Break restore job into stage and import phases and persist restore slice 
> status on phase completion
> ---
>
> Key: CASSANDRASC-99
> URL: https://issues.apache.org/jira/browse/CASSANDRASC-99
> Project: Sidecar for Apache Cassandra
>  Issue Type: Improvement
>  Components: Rest API
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>  Labels: pull-request-available



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19336) Repair causes out of memory

2024-02-05 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814468#comment-17814468
 ] 

David Capwell commented on CASSANDRA-19336:
---

Overall LGTM.  I am +1, but I have some comments that I'll leave to you to 
handle if you wish:

1) The scheduler should return a Future rather than pushing this to the caller; 
it cleans up the calling code a bit.
2) Given you have a single task per session, you can use a simpler data 
structure to track/limit; the current one works best when we have multiple 
tasks and can compare across sessions.

> Repair causes out of memory
> ---
>
> Key: CASSANDRA-19336
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19336
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Andres de la Peña
>Assignee: Andres de la Peña
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> CASSANDRA-14096 introduced {{repair_session_space}} as a limit for the memory 
> usage for Merkle tree calculations during repairs. This limit is applied to 
> the set of Merkle trees built for a received validation request 
> ({{{}VALIDATION_REQ{}}}), divided by the replication factor so as not to 
> overwhelm the repair coordinator, who will have requested RF sets of Merkle 
> trees. That way the repair coordinator should only use 
> {{repair_session_space}} for the RF Merkle trees.
> However, a repair session without {{{}-pr-{}}}/{{{}-partitioner-range{}}} 
> will send RF*RF validation requests, because the repair coordinator node has 
> RF-1 replicas and is also the replica of RF-1 nodes. Since all the requests 
> are sent at the same time, at some point the repair coordinator can have up 
> to RF*{{{}repair_session_space{}}} worth of Merkle trees if none of the 
> validation responses is fully processed before the last response arrives.
> Even worse, if the cluster uses virtual nodes, many nodes can be replicas of 
> the repair coordinator, and some nodes can be replicas of multiple token 
> ranges. It would mean that the repair coordinator can send more than RF or 
> RF*RF simultaneous validation requests.
> For example, in an 11-node cluster with RF=3 and 256 tokens, we have seen a 
> repair session involving 44 groups of ranges to be repaired. This produces 
> 44*3=132 validation requests contacting all the nodes in the cluster. When 
> the responses for all these requests start to arrive at the coordinator, each 
> containing up to {{repair_session_space}}/3 of Merkle trees, they accumulate 
> quicker than they are consumed, greatly exceeding {{repair_session_space}} 
> and OOMing the node.
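To make the arithmetic in the example concrete: each validation response carries up to {{repair_session_space}}/RF of Merkle trees, and a session with G groups of ranges produces G×RF responses, so an unconsumed backlog is bounded by roughly G×{{repair_session_space}}. The helper below is a back-of-the-envelope sketch, not Cassandra code.

```java
// Illustrative worst-case bound on Merkle-tree memory accumulated on the
// repair coordinator when no validation response is consumed before the
// last one arrives. Names and structure are assumptions for this sketch.
public class RepairMemoryEstimate
{
    static long worstCaseBytes(long repairSessionSpaceBytes, int rf, int rangeGroups)
    {
        // Each response carries up to repair_session_space / RF of trees;
        // each group of ranges triggers RF validation responses.
        long perResponse = repairSessionSpaceBytes / rf;
        return perResponse * rf * rangeGroups;
    }

    public static void main(String[] args)
    {
        long space = 256L * 1024 * 1024; // e.g. repair_session_space = 256 MiB
        // The 11-node, RF=3, 256-vnode example: 44 groups of ranges,
        // i.e. roughly 44 x repair_session_space in the worst case.
        System.out.println(worstCaseBytes(space, 3, 44) / (1024 * 1024) + " MiB");
    }
}
```

With the ticket's numbers, the bound is ~44× the configured limit, which is consistent with the OOM described above.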



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics

2024-02-05 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814454#comment-17814454
 ] 

Stefan Miklosovic commented on CASSANDRA-19366:
---

I feel confident I could help to review this patch.

> Expose mode of authentication in system_views.clients, nodetool clientstats, 
> and ClientMetrics
> --
>
> Key: CASSANDRA-19366
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19366
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Encryption, Messaging/Client, Observability/JMX, 
> Observability/Metrics, Tool/nodetool
>Reporter: Andy Tolbert
>Assignee: Andy Tolbert
>Priority: Normal
> Fix For: 5.1
>
> Attachments: CASSANDRA-19366-trunk-1_test_results.tgz, 
> CASSANDRA-19366-trunk-1_test_results_summary.html
>
>
> CASSANDRA-18554 added support for mTLS-authenticated clients. Part of this 
> contribution introduced {{{}MutualTlsWithPasswordFallbackAuthenticator{}}}, 
> which enables Cassandra to support either password- or mTLS-authenticated 
> connections.
> As an operator, it would be useful to know which connections are mTLS 
> authenticated and which are password authenticated, as a possible mode of 
> operation is migrating users from one form of authentication to another. It 
> would also be useful to know, when authentication attempts are failing, which 
> mode of authentication is unsuccessful.
> Proposing to add the following:
>  * Add a {{mode: string}} and {{metadata: map}} to 
> {{{}AuthenticatedUser{}}}. Update existing {{IAuthenticator}} implementations 
> to pass {{mode}} (e.g. {{password}} , {{{}mtls{}}}), and optionally pass a 
> {{metadata}} map (e.g. this can include the extracted {{identity}} from a 
> client certificate for {{mtls}} authentication).
>  * Update nodetool clientstats to add a new option flag {{{}--metadata{}}}, 
> which, when passed, exposes these new fields on {{{}AuthenticatedUser{}}}. (Not 
> added to existing output to maintain compatibility, much like 
> {{--client-options}} did.)
>  * Update {{system_views.clients}} to include columns for these new fields.
>  * Add new metrics to {{{}ClientMetrics{}}}:
>  ** Track authentication success and failures by mode. (Note: the metrics 
> present for each authentication-mode scope are contextual, based on the 
> Authenticator used; e.g. only {{scope=Password}} will be present for 
> {{{}PasswordAuthenticator{}}}.)
> {noformat}
> Existing:
> org.apache.cassandra.metrics:name=AuthSuccess,type=Client
> org.apache.cassandra.metrics:name=AuthFailure,type=Client
> New:
> org.apache.cassandra.metrics:name=AuthSuccess,scope=Mtls,type=Client
> org.apache.cassandra.metrics:name=AuthSuccess,scope=Password,type=Client
> org.apache.cassandra.metrics:name=AuthFailure,scope=Mtls,type=Client
> org.apache.cassandra.metrics:name=AuthFailure,scope=Password,type=Client
> {noformat}
>  ** Track connection counts by mode:
> {noformat}
> Existing:
> org.apache.cassandra.metrics:name=ConnectedNativeClients,type=Client
> org.apache.cassandra.metrics:name=connectedNativeClients,type=Client 
> (previously deprecated but still maintained)
> New:
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Mtls,type=Client
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Password,type=Client
> {noformat}
>  ** A metric to track encrypted vs. non-encrypted connections:
> {noformat}
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Encrypted,type=Client
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Unencrypted,type=Client
> {noformat}
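A plain-JDK stand-in for the proposed per-mode counters is sketched below. Cassandra's real {{ClientMetrics}} are Dropwizard meters registered under the JMX names listed above; everything in this sketch — the class, the key format — is illustrative, not the patch's implementation.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Illustrative per-mode auth counters mirroring the proposed metric scheme
// (AuthSuccess/AuthFailure split by a scope such as Mtls or Password).
class AuthModeMetrics
{
    private final ConcurrentHashMap<String, LongAdder> counters = new ConcurrentHashMap<>();

    private LongAdder counter(String name, String mode)
    {
        // Key shape loosely follows the JMX naming: name=...,scope=...
        return counters.computeIfAbsent(name + ",scope=" + mode, k -> new LongAdder());
    }

    void recordSuccess(String mode) { counter("AuthSuccess", mode).increment(); }
    void recordFailure(String mode) { counter("AuthFailure", mode).increment(); }

    long count(String name, String mode) { return counter(name, mode).sum(); }
}
```

The point of the per-scope split is exactly what the sketch shows: a fallback authenticator can record each attempt under the mode that actually handled it, so a migration from password to mTLS is observable per mode.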



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19085) In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE

2024-02-05 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814448#comment-17814448
 ] 

Brandon Williams commented on CASSANDRA-19085:
--

The gossiper fix looks good to me, +1.  I'll let you guys decide how to handle 
the SCM setting.

> In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE
> ---
>
> Key: CASSANDRA-19085
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19085
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Branimir Lambov
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> More precisely, when the {{MessagingService}} version is set to 
> {{{}VERSION_50{}}}, the test fails with an exception that appears to be a 
> genuine problem:
> {code:java}
> junit.framework.AssertionFailedError: Exception found expected null, but 
> was:   at 
> org.apache.cassandra.service.ActiveRepairService.lambda$prepareForRepair$2(ActiveRepairService.java:678)
>   at 
> org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at 
> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.base/java.lang.Thread.run(Thread.java:833)
> >
>   at 
> org.apache.cassandra.distributed.test.DistributedRepairUtils.lambda$assertParentRepairSuccess$4(DistributedRepairUtils.java:129)
>   at 
> org.apache.cassandra.distributed.test.DistributedRepairUtils.validateExistingParentRepair(DistributedRepairUtils.java:164)
>   at 
> org.apache.cassandra.distributed.test.DistributedRepairUtils.assertParentRepairSuccess(DistributedRepairUtils.java:124)
>   at 
> org.apache.cassandra.distributed.test.RepairTest.testForcedNormalRepairWithOneNodeDown(RepairTest.java:211)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> org.apache.cassandra.distributed.shared.ShutdownException: Uncaught 
> exceptions were thrown during test
>   at 
> org.apache.cassandra.distributed.impl.AbstractCluster.checkAndResetUncaughtExceptions(AbstractCluster.java:1117)
>   at 
> org.apache.cassandra.distributed.impl.AbstractCluster.close(AbstractCluster.java:1103)
>   at 
> org.apache.cassandra.distributed.test.RepairTest.closeCluster(RepairTest.java:160)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   Suppressed: java.lang.IllegalStateException: complete already: 
> (failure: java.lang.RuntimeException: Did not get replies from all endpoints.)
>   at 
> org.apache.cassandra.utils.concurrent.AsyncPromise.setSuccess(AsyncPromise.java:106)
>   at 
> org.apache.cassandra.service.ActiveRepairService$2.ack(ActiveRepairService.java:721)
>   at 
> org.apache.cassandra.service.ActiveRepairService$2.onResponse(ActiveRepairService.java:697)
>   at 
> org.apache.cassandra.repair.messages.RepairMessage$2.onResponse(RepairMessage.java:187)
>   at 
> org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:58)
>   at 
> org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78)
>   at 
> org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:64)
>   at 
> org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:50)
>   at 
> org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97)
>   at 
> org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45)
>   at 
> org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:430)
>   at 
> org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionF

[jira] [Commented] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics

2024-02-05 Thread Andy Tolbert (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814446#comment-17814446
 ] 

Andy Tolbert commented on CASSANDRA-19366:
--

Thanks [~frankgh] !

I've included my Pull Request which I will move out of Draft shortly.

Attached are the test results 
[^CASSANDRA-19366-trunk-1_test_results_summary.html] / 
[^CASSANDRA-19366-trunk-1_test_results.tgz]

The tests that failed were:

org.apache.cassandra.index.sai.cql.VectorUpdateDeleteTest#updateTest-_jdk11 
(possibly related to CASSANDRA-19168, which was recently fixed; investigating 
why it still fails)
org.apache.cassandra.db.compaction.CompactionStrategyManagerTest 
testAutomaticUpgradeConcurrency-_jdk11 (likely unrelated, also investigating)

> Expose mode of authentication in system_views.clients, nodetool clientstats, 
> and ClientMetrics
> --
>
> Key: CASSANDRA-19366
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19366
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Encryption, Messaging/Client, Observability/JMX, 
> Observability/Metrics, Tool/nodetool
>Reporter: Andy Tolbert
>Assignee: Andy Tolbert
>Priority: Normal
> Fix For: 5.1
>
> Attachments: CASSANDRA-19366-trunk-1_test_results.tgz, 
> CASSANDRA-19366-trunk-1_test_results_summary.html
>
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics

2024-02-05 Thread Andy Tolbert (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Tolbert updated CASSANDRA-19366:
-
Attachment: (was: CASSANDRA-19366-trunk-1_test_results.tgz)

> Expose mode of authentication in system_views.clients, nodetool clientstats, 
> and ClientMetrics
> --
>
> Key: CASSANDRA-19366
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19366
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Encryption, Messaging/Client, Observability/JMX, 
> Observability/Metrics, Tool/nodetool
>Reporter: Andy Tolbert
>Assignee: Andy Tolbert
>Priority: Normal
> Fix For: 5.1
>
> Attachments: CASSANDRA-19366-trunk-1_test_results.tgz, 
> CASSANDRA-19366-trunk-1_test_results_summary.html
>
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics

2024-02-05 Thread Andy Tolbert (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Tolbert updated CASSANDRA-19366:
-
Attachment: CASSANDRA-19366-trunk-1_test_results.tgz

> Expose mode of authentication in system_views.clients, nodetool clientstats, 
> and ClientMetrics
> --
>
> Key: CASSANDRA-19366
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19366
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Encryption, Messaging/Client, Observability/JMX, 
> Observability/Metrics, Tool/nodetool
>Reporter: Andy Tolbert
>Assignee: Andy Tolbert
>Priority: Normal
> Fix For: 5.1
>
> Attachments: CASSANDRA-19366-trunk-1_test_results.tgz, 
> CASSANDRA-19366-trunk-1_test_results_summary.html
>
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics

2024-02-05 Thread Andy Tolbert (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Tolbert updated CASSANDRA-19366:
-
Attachment: CASSANDRA-19366-trunk-1_test_results_summary.html

> Expose mode of authentication in system_views.clients, nodetool clientstats, 
> and ClientMetrics
> --
>
> Key: CASSANDRA-19366
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19366
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Encryption, Messaging/Client, Observability/JMX, 
> Observability/Metrics, Tool/nodetool
>Reporter: Andy Tolbert
>Assignee: Andy Tolbert
>Priority: Normal
> Fix For: 5.1
>
> Attachments: CASSANDRA-19366-trunk-1_test_results.tgz, 
> CASSANDRA-19366-trunk-1_test_results_summary.html
>
>
> CASSANDRA-18554 added support for mTLS-authenticated clients. Part of this 
> contribution introduced {{{}MutualTlsWithPasswordFallbackAuthenticator{}}}, 
> which enables Cassandra to support either password and mTLS-authenticated 
> connections.
> As an operator, it would be useful to know which connections are mTLS 
> authenticated, and which are password authenticated, as a possible mode of 
> operation is migrating users from one from of authentication to another. It 
> would also be useful to know if that if authentication attempts are failing 
> which mode of authentication is unsuccessful.
> Proposing to add the following:
>  * Add a {{mode: string}} and {{metadata: map}} to 
> {{{}AuthenticatedUser{}}}. Update existing {{IAuthenticator}} implementations 
> to pass {{mode}} (e.g. {{password}} , {{{}mtls{}}}), and optionally pass a 
> {{metadata}} map (e.g. this can include the extracted {{identity}} from a 
> client certificate for {{mtls}} authentication).
>  * Update nodetool clientstats to add a new option flag {{{}--metadata{}}}, 
> which when passed exposes these new fields on {{{}AuthenticatedUser{}}}. (Not 
> added to existing output to maintain compatibility, much like 
> {{-client-options}} did.
>  * Update {{system_views.clients}} to include columns for these new fields.
>  * Add new metrics to {{{}ClientMetrics{}}}:
>  ** Track authentication success and failures by mode. (Note: The metrics 
> present by authentication mode scope are contextual based on the 
> Authenticator used (e.g. only {{scope=Password}} will be present for 
> {{{}PasswordAuthenticator{}}})
> {noformat}
> Existing:
> org.apache.cassandra.metrics:name=AuthSuccess,type=Client
> org.apache.cassandra.metrics:name=AuthFailure,type=Client
> New:
> org.apache.cassandra.metrics:name=AuthSuccess,scope=Mtls,type=Client
> org.apache.cassandra.metrics:name=AuthSuccess,scope=Password,type=Client
> org.apache.cassandra.metrics:name=AuthFailure,scope=Mtls,type=Client
> org.apache.cassandra.metrics:name=AuthFailure,scope=Password,type=Client
> {noformat}
>  * 
>  ** Track connection counts by mode:
> {noformat}
> Existing:
> org.apache.cassandra.metrics:name=ConnectedNativeClients,type=Client
> org.apache.cassandra.metrics:name=connectedNativeClients,type=Client 
> (previously deprecated but still maintained)
> New:
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Mtls,type=Client
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Password,type=Client
> {noformat}
>  * 
>  ** A metric to track encrypted vs. non-encrypted connections:
> {noformat}
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Encrypted,type=Client
> org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Unencrypted,type=Client
> {noformat}
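The {{AuthenticatedUser}} change proposed above could look roughly as follows. This is a minimal, hypothetical sketch — the class and accessor names are illustrative and are not the actual Cassandra implementation:

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical sketch of the proposal: carry an authentication mode and an
// optional metadata map on the authenticated user. Not the real
// org.apache.cassandra.auth.AuthenticatedUser class.
public class AuthenticatedUserSketch {
    private final String name;
    private final String mode;                  // e.g. "password" or "mtls"
    private final Map<String, String> metadata; // e.g. extracted identity for mtls

    public AuthenticatedUserSketch(String name, String mode, Map<String, String> metadata) {
        this.name = name;
        this.mode = mode;
        // Treat metadata as optional: absent for authenticators that have none.
        this.metadata = metadata == null ? Collections.emptyMap() : Map.copyOf(metadata);
    }

    public String name() { return name; }
    public String mode() { return mode; }
    public Map<String, String> metadata() { return metadata; }

    public static void main(String[] args) {
        AuthenticatedUserSketch u = new AuthenticatedUserSketch(
                "readonly_user", "mtls", Map.of("identity", "spiffe://example/reader"));
        System.out.println(u.mode() + " " + u.metadata().get("identity"));
    }
}
```

An mTLS authenticator would populate {{metadata}} with the certificate-derived identity, while a password authenticator would simply pass {{null}}.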



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics

2024-02-05 Thread Andy Tolbert (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Tolbert updated CASSANDRA-19366:
-
Attachment: (was: ci_summary.html)




[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics

2024-02-05 Thread Andy Tolbert (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Tolbert updated CASSANDRA-19366:
-
Attachment: ci_summary.html




[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics

2024-02-05 Thread Andy Tolbert (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Tolbert updated CASSANDRA-19366:
-
Attachment: CASSANDRA-19366-trunk-1_test_results.tgz




[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics

2024-02-05 Thread Andy Tolbert (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Tolbert updated CASSANDRA-19366:
-
Attachment: (was: CASSANDRA-19366-trunk-1_test_results.tgz)




[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics

2024-02-05 Thread Andy Tolbert (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Tolbert updated CASSANDRA-19366:
-
Attachment: CASSANDRA-19366-trunk-1_test_results.tgz




[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics

2024-02-05 Thread Andy Tolbert (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Tolbert updated CASSANDRA-19366:
-
Impacts: Docs  (was: None)
Test and Documentation Plan: 
Updated existing tests around nodetool clientstats and added a 
{{ClientMetricsTest}} that tests the existing metrics for 
ConnectedClients, AuthSuccess, and AuthFailure, plus the new metrics I added.

I ran utests and dtests against this branch and it came back clean, with the 
exception of two likely unrelated tests, which I'll capture in comments.

 
 Status: Patch Available  (was: Open)

Pull Request available at: [https://github.com/apache/cassandra/pull/3085]

I've marked this as Docs impacting as I've added new metrics.  I have updated 
the metrics.adoc file to include the new metrics in addition to existing ones 
that weren't documented.




[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics

2024-02-05 Thread Andy Tolbert (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Tolbert updated CASSANDRA-19366:
-
Change Category: Operability
 Complexity: Normal
  Fix Version/s: 5.1
   Assignee: Andy Tolbert
 Status: Open  (was: Triage Needed)




Re: [PR] Unit test framework and 3 unit tests for PartitionAwarePolicy [cassandra-java-driver]

2024-02-05 Thread via GitHub


aravind-nallan-yb closed pull request #1912: Unit test framework and 3 unit 
tests for PartitionAwarePolicy
URL: https://github.com/apache/cassandra-java-driver/pull/1912


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics

2024-02-05 Thread Andy Tolbert (Jira)
Andy Tolbert created CASSANDRA-19366:


 Summary: Expose mode of authentication in system_views.clients, 
nodetool clientstats, and ClientMetrics
 Key: CASSANDRA-19366
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19366
 Project: Cassandra
  Issue Type: Improvement
  Components: Feature/Encryption, Messaging/Client, Observability/JMX, 
Observability/Metrics, Tool/nodetool
Reporter: Andy Tolbert


CASSANDRA-18554 added support for mTLS-authenticated clients. Part of this 
contribution introduced {{{}MutualTlsWithPasswordFallbackAuthenticator{}}}, 
which enables Cassandra to support either password- or mTLS-authenticated 
connections.

As an operator, it would be useful to know which connections are mTLS 
authenticated and which are password authenticated, since a likely mode of 
operation is migrating users from one form of authentication to the other. 
It would also be useful to know, when authentication attempts are failing, 
which mode of authentication is unsuccessful.

Proposing to add the following:
 * Add a {{mode: string}} and {{metadata: map}} to 
{{{}AuthenticatedUser{}}}. Update existing {{IAuthenticator}} implementations 
to pass {{mode}} (e.g. {{password}} , {{{}mtls{}}}), and optionally pass a 
{{metadata}} map (e.g. this can include the extracted {{identity}} from a 
client certificate for {{mtls}} authentication).
 * Update nodetool clientstats to add a new option flag {{{}--metadata{}}}, 
which, when passed, exposes these new fields on {{{}AuthenticatedUser{}}}. (Not 
added to existing output, to maintain compatibility, much like 
{{-client-options}} did.)
 * Update {{system_views.clients}} to include columns for these new fields.
 * Add new metrics to {{{}ClientMetrics{}}}:
 ** Track authentication successes and failures by mode. (Note: the metrics 
present for each authentication-mode scope depend on the Authenticator 
used; e.g. only {{scope=Password}} will be present for 
{{{}PasswordAuthenticator{}}}.)

{noformat}
Existing:

org.apache.cassandra.metrics:name=AuthSuccess,type=Client
org.apache.cassandra.metrics:name=AuthFailure,type=Client

New:

org.apache.cassandra.metrics:name=AuthSuccess,scope=Mtls,type=Client
org.apache.cassandra.metrics:name=AuthSuccess,scope=Password,type=Client

org.apache.cassandra.metrics:name=AuthFailure,scope=Mtls,type=Client
org.apache.cassandra.metrics:name=AuthFailure,scope=Password,type=Client
{noformat}
 * 
 ** Track connection counts by mode:

{noformat}
Existing:
org.apache.cassandra.metrics:name=ConnectedNativeClients,type=Client
org.apache.cassandra.metrics:name=connectedNativeClients,type=Client 
(previously deprecated but still maintained)

New:
org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Mtls,type=Client
org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Password,type=Client
{noformat}
 * 
 ** A metric to track encrypted vs. non-encrypted connections:

{noformat}
org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Encrypted,type=Client
org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Unencrypted,type=Client
{noformat}
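As a rough illustration of the mode-scoped counters proposed above, a registry could key each counter by metric name plus an optional scope, bumping the unscoped counter alongside the scoped one so existing dashboards keep working. The class and method names below ({{ModeScopedCounters}}, {{increment}}, {{count}}) are hypothetical and not part of Cassandra's {{ClientMetrics}}:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch: counters keyed by metric name plus an optional
// authentication-mode scope, mirroring the proposed JMX naming
// name=AuthSuccess,scope=Password,type=Client.
final class ModeScopedCounters {
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    private static String key(String name, String scope) {
        return scope == null ? name : name + ",scope=" + scope;
    }

    // Bump both the scoped counter and the pre-existing unscoped one,
    // so existing consumers of the unscoped metric keep working.
    void increment(String name, String scope) {
        counters.computeIfAbsent(key(name, null), k -> new LongAdder()).increment();
        if (scope != null)
            counters.computeIfAbsent(key(name, scope), k -> new LongAdder()).increment();
    }

    long count(String name, String scope) {
        LongAdder a = counters.get(key(name, scope));
        return a == null ? 0 : a.sum();
    }
}
```

A real implementation would register these through the existing metrics machinery; the sketch only shows the scoping scheme.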



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-19336) Repair causes out of memory

2024-02-05 Thread Jira


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andres de la Peña reassigned CASSANDRA-19336:
-

Assignee: Andres de la Peña

> Repair causes out of memory
> ---
>
> Key: CASSANDRA-19336
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19336
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Andres de la Peña
>Assignee: Andres de la Peña
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> CASSANDRA-14096 introduced {{repair_session_space}} as a limit for the memory 
> usage for Merkle tree calculations during repairs. This limit is applied to 
> the set of Merkle trees built for a received validation request 
> ({{{}VALIDATION_REQ{}}}), divided by the replication factor so as not to 
> overwhelm the repair coordinator, who will have requested RF sets of Merkle 
> trees. That way the repair coordinator should only use 
> {{repair_session_space}} for the RF Merkle trees.
> However, a repair session without {{{}-pr-{}}}/{{{}-partitioner-range{}}} 
> will send RF*RF validation requests, because the repair coordinator node has 
> RF-1 replicas and is also the replica of RF-1 nodes. Since all the requests 
> are sent at the same time, at some point the repair coordinator can have up 
> to RF*{{{}repair_session_space{}}} worth of Merkle trees if none of the 
> validation responses is fully processed before the last response arrives.
> Even worse, if the cluster uses virtual nodes, many nodes can be replicas of 
> the repair coordinator, and some nodes can be replicas of multiple token 
> ranges. It would mean that the repair coordinator can send more than RF or 
> RF*RF simultaneous validation requests.
> For example, in an 11-node cluster with RF=3 and 256 tokens, we have seen a 
> repair session involving 44 groups of ranges to be repaired. This produces 
> 44*3=132 validation requests contacting all the nodes in the cluster. When 
> the responses for all these requests start to arrive to the coordinator, each 
> containing up to {{repair_session_space}}/3 of Merkle trees, they accumulate 
> quicker than they are consumed, greatly exceeding {{repair_session_space}} 
> and OOMing the node.
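The worst case described above can be sketched numerically; this is only a restatement of the arithmetic from the description (44 range groups times RF=3 gives 132 requests, each response carrying up to {{repair_session_space}}/RF of trees), not Cassandra code:

```java
// Worst-case Merkle-tree memory held by a repair coordinator if no
// validation response is fully consumed before the last one arrives.
final class RepairMemoryEstimate {
    static long worstCaseBytes(int rangeGroups, int rf, long repairSessionSpace) {
        long requests = (long) rangeGroups * rf;     // e.g. 44 * 3 = 132 validation requests
        long perResponse = repairSessionSpace / rf;  // each response: up to space/RF of trees
        return requests * perResponse;               // = rangeGroups * repairSessionSpace
    }
}
```

With the numbers from the description this accumulates to 44 times {{repair_session_space}}, which explains the OOM.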






[jira] [Commented] (CASSANDRA-19336) Repair causes out of memory

2024-02-05 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814404#comment-17814404
 ] 

Andres de la Peña commented on CASSANDRA-19336:
---

I have added a new {{concurrent_merkle_tree_requests}} config property to the 
PR. This property controls the parallelism of the scheduler. It defaults to 
unbounded parallelism, so it keeps the previous behaviour. I think the 
recommended value should be one with vnodes. Without vnodes it could be one 
too, or something higher if combined with a smaller {{repair_session_space}}.
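A minimal sketch of how bounding in-flight validation requests could work, assuming a semaphore-based scheduler; the {{BoundedRequestScheduler}} class below is hypothetical, not the actual patch:

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Semaphore;

// Hypothetical sketch: allow at most `maxInFlight` validation requests at
// once; each fully processed response frees a slot so a queued request can
// be sent. maxInFlight <= 0 means unbounded (the previous behaviour).
final class BoundedRequestScheduler {
    private final Semaphore permits;
    private final Queue<Runnable> pending = new ArrayDeque<>();

    BoundedRequestScheduler(int maxInFlight) {
        this.permits = maxInFlight > 0 ? new Semaphore(maxInFlight) : null;
    }

    synchronized void submit(Runnable sendRequest) {
        if (permits == null || permits.tryAcquire())
            sendRequest.run();        // capacity available: send immediately
        else
            pending.add(sendRequest); // otherwise queue until a response completes
    }

    synchronized void onResponseProcessed() {
        if (permits == null)
            return;
        Runnable next = pending.poll();
        if (next != null)
            next.run();               // hand the freed slot to the next request
        else
            permits.release();
    }
}
```

This caps the coordinator's simultaneous Merkle-tree responses at roughly maxInFlight times {{repair_session_space}}/RF.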

CI looks good; the only failure is CASSANDRA-19168:
||PR||CI||
|[5.0|https://github.com/apache/cassandra/pull/3073]|[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/3411/workflows/bff8cab0-6dff-423c-af95-c7f70fb9a887]
 
[j17|https://app.circleci.com/pipelines/github/adelapena/cassandra/3411/workflows/1c6b2337-0db7-4de2-81e3-4a6eccb70204]|

> Repair causes out of memory
> ---
>
> Key: CASSANDRA-19336
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19336
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Andres de la Peña
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> CASSANDRA-14096 introduced {{repair_session_space}} as a limit for the memory 
> usage for Merkle tree calculations during repairs. This limit is applied to 
> the set of Merkle trees built for a received validation request 
> ({{{}VALIDATION_REQ{}}}), divided by the replication factor so as not to 
> overwhelm the repair coordinator, who will have requested RF sets of Merkle 
> trees. That way the repair coordinator should only use 
> {{repair_session_space}} for the RF Merkle trees.
> However, a repair session without {{{}-pr-{}}}/{{{}-partitioner-range{}}} 
> will send RF*RF validation requests, because the repair coordinator node has 
> RF-1 replicas and is also the replica of RF-1 nodes. Since all the requests 
> are sent at the same time, at some point the repair coordinator can have up 
> to RF*{{{}repair_session_space{}}} worth of Merkle trees if none of the 
> validation responses is fully processed before the last response arrives.
> Even worse, if the cluster uses virtual nodes, many nodes can be replicas of 
> the repair coordinator, and some nodes can be replicas of multiple token 
> ranges. It would mean that the repair coordinator can send more than RF or 
> RF*RF simultaneous validation requests.
> For example, in an 11-node cluster with RF=3 and 256 tokens, we have seen a 
> repair session involving 44 groups of ranges to be repaired. This produces 
> 44*3=132 validation requests contacting all the nodes in the cluster. When 
> the responses for all these requests start to arrive to the coordinator, each 
> containing up to {{repair_session_space}}/3 of Merkle trees, they accumulate 
> quicker than they are consumed, greatly exceeding {{repair_session_space}} 
> and OOMing the node.






[PR] Unit test framework and 3 unit tests for PartitionAwarePolicy [cassandra-java-driver]

2024-02-05 Thread via GitHub


aravind-nallan-yb opened a new pull request, #1912:
URL: https://github.com/apache/cassandra-java-driver/pull/1912

   Extend the upstream LB policy unit test framework to PartitionAwarePolicy 
and add 3 unit tests as samples.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org





[jira] [Updated] (CASSANDRA-19189) Revisit use of sealed period lookup tables

2024-02-05 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-19189:

Description: 
Metadata snapshots are stored locally in the {{system.metadata_snapshots}} 
table, which is keyed by epoch. Snapshots are retrieved from this table for 
three purposes:
* to replay locally during startup
* to provide log state for a peer requesting catchup
* to create point-in-time ClusterMetadata, for disaster recovery

In the majority of cases, we always want to replay from the most recent 
snapshot so we can usually select the appropriate snapshot by simply scanning 
the snapshots table in reverse, which allows us to considerably simplify the 
process of looking up the desired snapshot. We will continue to persist 
historical snapshots, at least for now, so that we are able to select arbitrary 
snapshots should we want to reconstruct metadata state for arbitrary epochs.

  was:
Metadata snapshots are stored locally in the {{system.metadata_snapshots}} 
table, which is keyed by epoch. Snapshots are retrieved from this table for two 
purposes:
* to replay locally during startup
* to provide log state for a peer requesting catchup
* to create point-in-time ClusterMetadata, for disaster recovery

In the majority of cases, we always want to replay from the most recent 
snapshot so we can usually select the appropriate snapshot by simply scanning 
the snapshots table in reverse, which allows us to considerably simplify the 
process of looking up the desired snapshot. We will continue to persist 
historical snapshots, at least for now, so that we are able to select arbitrary 
snapshots should we want to reconstruct metadata state for arbitrary epochs.


> Revisit use of sealed period lookup tables
> --
>
> Key: CASSANDRA-19189
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19189
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 5.1-alpha1
>
>
> Metadata snapshots are stored locally in the {{system.metadata_snapshots}} 
> table, which is keyed by epoch. Snapshots are retrieved from this table for 
> three purposes:
> * to replay locally during startup
> * to provide log state for a peer requesting catchup
> * to create point-in-time ClusterMetadata, for disaster recovery
> In the majority of cases, we always want to replay from the most recent 
> snapshot so we can usually select the appropriate snapshot by simply scanning 
> the snapshots table in reverse, which allows us to considerably simplify the 
> process of looking up the desired snapshot. We will continue to persist 
> historical snapshots, at least for now, so that we are able to select 
> arbitrary snapshots should we want to reconstruct metadata state for 
> arbitrary epochs.






[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica

2024-02-05 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814339#comment-17814339
 ] 

Brandon Williams commented on CASSANDRA-18824:
--

I think that is a good plan and I am +1 on it, and +1 on this ticket also.

> Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused 
> missing replica
> ---
>
> Key: CASSANDRA-18824
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18824
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Bootstrap and Decommission
>Reporter: Szymon Miezal
>Assignee: Szymon Miezal
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Node decommission triggers data transfer to other nodes. While this transfer 
> is in progress,
> receiving nodes temporarily hold token ranges in a pending state. However, 
> the cleanup process currently doesn't consider these pending ranges when 
> calculating token ownership.
> As a consequence, data that is already stored in sstables gets inadvertently 
> cleaned up.
> STR:
>  * Create two node cluster
>  * Create keyspace with RF=1
>  * Insert sample data (assert data is available when querying both nodes)
>  * Start decommission process of node 1
>  * Start running cleanup in a loop on node 2 until decommission on node 1 
> finishes
>  * Verify all rows are in the cluster - it will fail, as the previous step 
> removed some of the rows
> It seems that the cleanup process does not take into account the pending 
> ranges, it uses only the local ranges - 
> [https://github.com/apache/cassandra/blob/caad2f24f95b494d05c6b5d86a8d25fbee58d7c2/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L466].
> There are two solutions to the problem.
> One would be to change the cleanup process so that it starts taking 
> pending ranges into account. Even though it might sound tempting at first, it 
> would require involved changes and a lot of testing effort.
> Alternatively, we could interrupt/prevent the cleanup process from running 
> when any pending range on a node is detected. That sounds like a reasonable 
> solution to the problem and something that is relatively easy to implement.
> The bug has been already fixed in 4.x with CASSANDRA-16418, the goal of this 
> ticket is to backport it to 3.x.
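The second, simpler option could look roughly like the following guard; the {{CleanupGuard}} class and its string representation of pending ranges are hypothetical, for illustration only:

```java
import java.util.Set;

// Hypothetical guard, in the spirit of the CASSANDRA-16418 fix: refuse to
// run cleanup while the node has pending token ranges, because cleanup only
// considers locally owned ranges and would drop rows still being streamed in.
final class CleanupGuard {
    // Returns true if cleanup ran, false if it was skipped as unsafe.
    static boolean runCleanupIfSafe(Set<String> pendingRanges, Runnable cleanup) {
        if (!pendingRanges.isEmpty())
            return false; // a pending range means a range movement is in progress
        cleanup.run();
        return true;
    }
}
```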






[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica

2024-02-05 Thread Jacek Lewandowski (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814322#comment-17814322
 ] 

Jacek Lewandowski commented on CASSANDRA-18824:
---

I've created https://issues.apache.org/jira/browse/CASSANDRA-19363 and 
https://issues.apache.org/jira/browse/CASSANDRA-19364 as a result of 
investigating the flakiness. The fact that it didn't fail in 5k runs, assuming 
all of those runs were executed under very similar cluster conditions, can be 
misleading. Adding a slight delay in the async code of the pending ranges 
calculator leads to consistent test failures even on 4.0. This is not related 
to this issue, though; it is only that the test added here can accidentally 
detect the problem. Since those separate tickets are now created, I think we 
can merge this ticket. However, those who asked for this fix should be notified 
about those possible issues.

 

> Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused 
> missing replica
> ---
>
> Key: CASSANDRA-18824
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18824
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Bootstrap and Decommission
>Reporter: Szymon Miezal
>Assignee: Szymon Miezal
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Node decommission triggers data transfer to other nodes. While this transfer 
> is in progress,
> receiving nodes temporarily hold token ranges in a pending state. However, 
> the cleanup process currently doesn't consider these pending ranges when 
> calculating token ownership.
> As a consequence, data that is already stored in sstables gets inadvertently 
> cleaned up.
> STR:
>  * Create two node cluster
>  * Create keyspace with RF=1
>  * Insert sample data (assert data is available when querying both nodes)
>  * Start decommission process of node 1
>  * Start running cleanup in a loop on node 2 until decommission on node 1 
> finishes
>  * Verify all rows are in the cluster - it will fail, as the previous step 
> removed some of the rows
> It seems that the cleanup process does not take into account the pending 
> ranges, it uses only the local ranges - 
> [https://github.com/apache/cassandra/blob/caad2f24f95b494d05c6b5d86a8d25fbee58d7c2/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L466].
> There are two solutions to the problem.
> One would be to change the cleanup process so that it starts taking 
> pending ranges into account. Even though it might sound tempting at first, it 
> would require involved changes and a lot of testing effort.
> Alternatively, we could interrupt/prevent the cleanup process from running 
> when any pending range on a node is detected. That sounds like a reasonable 
> solution to the problem and something that is relatively easy to implement.
> The bug has been already fixed in 4.x with CASSANDRA-16418, the goal of this 
> ticket is to backport it to 3.x.






[jira] [Updated] (CASSANDRA-19363) Weird data loss in 3.11 flakiness during decommission

2024-02-05 Thread Jacek Lewandowski (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacek Lewandowski updated CASSANDRA-19363:
--
Description: 
While testing CASSANDRA-18824 on 3.11, we noticed one flaky result of the newly 
added decommission test. It looked innocent; however, when digging into the 
logs, it turned out that, for some reason, the data that were being pumped into 
the cluster went to the decommissioned node instead of going to the working 
node.

That is, the data were inserted into a 2-node cluster (RF=1) while, say, node1 
got decommissioned. The expected behavior would be that the data land in node2 
after that. However, for some reason, in this 1/1000 flaky test, the situation 
was the opposite, and the data went to the decommissioned node, resulting in a 
total loss.

I haven't found the reason. I don't know if it is a test failure or a 
production code problem. I cannot prove that it is only a 3.11 problem. I'm 
creating this ticket because if this is a real issue and exists on newer 
branches, it is serious.
 
The logs artifact is lost in CircleCI, so I'm attaching the one I downloaded 
earlier; unfortunately, it is cleaned up a bit. The relevant part is:
{noformat}
DEBUG [node1_isolatedExecutor:3] node1 ColumnFamilyStore.java:949 - Enqueuing 
flush of tbl: 38.965KiB (0%) on-heap, 0.000KiB (0%) off-heap
DEBUG [node1_PerDiskMemtableFlushWriter_1:1] node1 Memtable.java:477 - Writing 
Memtable-tbl(5.176KiB serialized bytes, 100 ops, 0%/0% of on/off-heap limit), 
flushed range = (max(-3074457345618258603), max(3074457345618258602)]
DEBUG [node1_PerDiskMemtableFlushWriter_2:1] node1 Memtable.java:477 - Writing 
Memtable-tbl(5.176KiB serialized bytes, 100 ops, 0%/0% of on/off-heap limit), 
flushed range = (max(3074457345618258602), max(9223372036854775807)]
DEBUG [node1_PerDiskMemtableFlushWriter_0:1] node1 Memtable.java:477 - Writing 
Memtable-tbl(5.176KiB serialized bytes, 100 ops, 0%/0% of on/off-heap limit), 
flushed range = (min(-9223372036854775808), max(-3074457345618258603)]
DEBUG [node1_PerDiskMemtableFlushWriter_2:1] node1 Memtable.java:506 - 
Completed flushing 
/node1/data2/distributed_test_keyspace/tbl-7fb7aa20ab3a11eeac381f661fe8b82f/me-3-big-Data.db
 (1.059KiB) for commitlog position CommitLogPosition(segmentId=1704397819937, 
position=47614)
DEBUG [node1_PerDiskMemtableFlushWriter_1:1] node1 Memtable.java:506 - 
Completed flushing 
/node1/data1/distributed_test_keyspace/tbl-7fb7aa20ab3a11eeac381f661fe8b82f/me-2-big-Data.db
 (1.091KiB) for commitlog position CommitLogPosition(segmentId=1704397819937, 
position=47614)
DEBUG [node1_PerDiskMemtableFlushWriter_0:1] node1 Memtable.java:506 - 
Completed flushing 
/node1/data0/distributed_test_keyspace/tbl-7fb7aa20ab3a11eeac381f661fe8b82f/me-1-big-Data.db
 (1.260KiB) for commitlog position CommitLogPosition(segmentId=1704397819937, 
position=47614)
DEBUG [node1_MemtableFlushWriter:1] node1 ColumnFamilyStore.java:1267 - Flushed 
to 
[BigTableReader(path='/node1/data0/distributed_test_keyspace/tbl-7fb7aa20ab3a11eeac381f661fe8b82f/me-1-big-Data.db'),
 
BigTableReader(path='/node1/data1/distributed_test_keyspace/tbl-7fb7aa20ab3a11eeac381f661fe8b82f/me-2-big-Data.db'),
 
BigTableReader(path='/node1/data2/distributed_test_keyspace/tbl-7fb7aa20ab3a11eeac381f661fe8b82f/me-3-big-Data.db')]
 (3 sstables, 17.521KiB), biggest 5.947KiB, smallest 5.773KiB
DEBUG [node2_isolatedExecutor:1] node2 ColumnFamilyStore.java:949 - Enqueuing 
flush of tbl: 38.379KiB (0%) on-heap, 0.000KiB (0%) off-heap
DEBUG [node2_PerDiskMemtableFlushWriter_0:1] node2 Memtable.java:477 - Writing 
Memtable-tbl(5.176KiB serialized bytes, 100 ops, 0%/0% of on/off-heap limit), 
flushed range = (null, null]
DEBUG [node2_PerDiskMemtableFlushWriter_0:1] node2 Memtable.java:506 - 
Completed flushing 
/node2/data2/distributed_test_keyspace/tbl-7fb7aa20ab3a11eeac381f661fe8b82f/me-1-big-Data.db
 (3.409KiB) for commitlog position CommitLogPosition(segmentId=1704397821653, 
position=54585)
DEBUG [node2_MemtableFlushWriter:1] node2 ColumnFamilyStore.java:1267 - Flushed 
to 
[BigTableReader(path='/node2/data2/distributed_test_keyspace/tbl-7fb7aa20ab3a11eeac381f661fe8b82f/me-1-big-Data.db')]
 (1 sstables, 7.731KiB), biggest 7.731KiB, 
{noformat}

As one can see, node1 flushed 3 sstables of {{tbl}} although it is already 
decommissioned. Node 2 did not flush much. This is opposite to the passing run 
of the test.

The test code is as follows:
{code:java}
try (Cluster cluster = init(builder().withNodes(2)
                                     .withTokenSupplier(evenlyDistributedTokens(2))
                                     .withNodeIdTopology(NetworkTopology.singleDcNetworkTopology(2, "dc0", "rack0"))
                                     .withConfig(config -> config.with(NETWORK, GOSSIP))
                                     .start(), 1))
{
 

[jira] [Updated] (CASSANDRA-19363) Weird data loss in 3.11 flakiness during decommission

2024-02-05 Thread Jacek Lewandowski (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacek Lewandowski updated CASSANDRA-19363:
--
Attachment: bad.txt

> Weird data loss in 3.11 flakiness during decommission
> -
>
> Key: CASSANDRA-19363
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19363
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Bootstrap and Decommission
>Reporter: Jacek Lewandowski
>Priority: Normal
> Fix For: 3.11.x
>
> Attachments: bad.txt
>
>
> While testing CASSANDRA-18824 on 3.11, we noticed one flaky result of the 
> newly added decommission test. It looked innocent; however, when digging into 
> the logs, it turned out that, for some reason, the data that were being 
> pumped into the cluster went to the decommissioned node instead of going to 
> the working node.
> That is, the data were inserted into a 2-node cluster (RF=1) while, say, 
> node2 got decommissioned. The expected behavior would be that the data land 
> in node1 after that. However, for some reason, in this 1/1000 flaky test, the 
> situation was the opposite, and the data went to the decommissioned node, 
> resulting in a total loss.
> I haven't found the reason. I don't know if it is a test failure or a 
> production code problem. I cannot prove that it is only a 3.11 problem. I'm 
> creating this ticket because if this is a real issue and exists on newer 
> branches, it is serious.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19364) Data loss during decommission possible due to a delayed and unsynced pending ranges calculation

2024-02-05 Thread Jacek Lewandowski (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacek Lewandowski updated CASSANDRA-19364:
--
Description: 
This possible issue was discovered while inspecting flaky tests from 
CASSANDRA-18824. The pending ranges calculation is executed asynchronously when 
a node is decommissioned. If data is inserted during decommissioning and the 
pending ranges calculation is delayed for some reason (which can happen, as it 
is not synchronous), we may end up with partial data loss. This may just be a 
faulty test; thus, I perceive this ticket more as a memo for further 
investigation or discussion.

Note that this has obviously been fixed by TCM.

The test in question was:

{code:java}
try (Cluster cluster = init(builder().withNodes(2)
                                     .withTokenSupplier(evenlyDistributedTokens(2))
                                     .withNodeIdTopology(NetworkTopology.singleDcNetworkTopology(2, "dc0", "rack0"))
                                     .withConfig(config -> config.with(NETWORK, GOSSIP))
                                     .start(), 1))
{
    IInvokableInstance nodeToDecommission = cluster.get(1);
    IInvokableInstance nodeToRemainInCluster = cluster.get(2);

    // Start decommission on nodeToDecommission
    cluster.forEach(statusToDecommission(nodeToDecommission));
    logger.info("Decommissioning node {}", nodeToDecommission.broadcastAddress());

    // Add data to cluster while node is decommissioning
    int numRows = 100;
    cluster.schemaChange("CREATE TABLE IF NOT EXISTS " + KEYSPACE + ".tbl (pk int, ck int, v int, PRIMARY KEY (pk, ck))");
    insertData(cluster, 1, numRows, ConsistencyLevel.ONE); // <--- HERE - when PRC is delayed, we get only ~50% of inserted rows

    // Check data before cleanup on nodeToRemainInCluster
    assertEquals(100, nodeToRemainInCluster.executeInternal("SELECT * FROM " + KEYSPACE + ".tbl").length);
}
{code}


  was:
This possible issue has been discovered while inspecting flaky tests of 
CASSANDRA-18824. Pending ranges calculation is executed asynchronously when the 
node is decommissioned. If the data is inserted during decommissioning, and 
pending ranges calculation is delayed for some reason (it can be as it is not 
synchronous), we may end up with partial data loss. That can be just a wrong 
test. Thus, I perceive this ticket more like a memo for further investigation 
or discussion. 

Note that this has obviously been fixed by TCM.


> Data loss during decommission possible due to a delayed and unsynced pending 
> ranges calculation
> ---
>
> Key: CASSANDRA-19364
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19364
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Bootstrap and Decommission
>Reporter: Jacek Lewandowski
>Priority: Normal
>
> This possible issue was discovered while inspecting flaky tests from 
> CASSANDRA-18824. The pending ranges calculation is executed asynchronously 
> when a node is decommissioned. If data is inserted during decommissioning and 
> the pending ranges calculation is delayed for some reason (which can happen, 
> as it is not synchronous), we may end up with partial data loss. This may 
> just be a faulty test; thus, I perceive this ticket more as a memo for 
> further investigation or discussion. 
> Note that this has obviously been fixed by TCM.
> The test in question was:
> {code:java}
> try (Cluster cluster = init(builder().withNodes(2)
>                                      .withTokenSupplier(evenlyDistributedTokens(2))
>                                      .withNodeIdTopology(NetworkTopology.singleDcNetworkTopology(2, "dc0", "rack0"))
>                                      .withConfig(config -> config.with(NETWORK, GOSSIP))
>                                      .start(), 1))
> {
>     IInvokableInstance nodeToDecommission = cluster.get(1);
>     IInvokableInstance nodeToRemainInCluster = cluster.get(2);
>
>     // Start decommission on nodeToDecommission
>     cluster.forEach(statusToDecommission(nodeToDecommission));
>     logger.info("Decommissioning node {}", nodeToDecommission.broadcastAddress());
>
>     // Add data to cluster while node is decommissioning
>     int numRows = 100;
>     cluster.schemaChange("CREATE TABLE IF NOT EXISTS " + KEYSPACE + ".tbl (pk int, ck int, v int, PRIMARY KEY (pk, ck))");
>     insertData(cluster, 1, numRows, ConsistencyLevel.ONE); // <---

[jira] [Created] (CASSANDRA-19364) Data loss during decommission possible due to a delayed and unsynced pending ranges calculation

2024-02-05 Thread Jacek Lewandowski (Jira)
Jacek Lewandowski created CASSANDRA-19364:
-

 Summary: Data loss during decommission possible due to a delayed 
and unsynced pending ranges calculation
 Key: CASSANDRA-19364
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19364
 Project: Cassandra
  Issue Type: Bug
  Components: Consistency/Bootstrap and Decommission
Reporter: Jacek Lewandowski


This possible issue was discovered while inspecting flaky tests from 
CASSANDRA-18824. The pending ranges calculation is executed asynchronously when 
a node is decommissioned. If data is inserted during decommissioning and the 
pending ranges calculation is delayed for some reason (which can happen, as it 
is not synchronous), we may end up with partial data loss. This may just be a 
faulty test; thus, I perceive this ticket more as a memo for further 
investigation or discussion.

Note that this has obviously been fixed by TCM.






[jira] [Updated] (CASSANDRA-19363) Weird data loss in 3.11 flakiness during decommission

2024-02-05 Thread Jacek Lewandowski (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacek Lewandowski updated CASSANDRA-19363:
--
Fix Version/s: 3.11.x

> Weird data loss in 3.11 flakiness during decommission
> -
>
> Key: CASSANDRA-19363
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19363
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Bootstrap and Decommission
>Reporter: Jacek Lewandowski
>Priority: Normal
> Fix For: 3.11.x
>
>
> While testing CASSANDRA-18824 on 3.11, we noticed one flaky result of the 
> newly added decommission test. It looked innocent; however, when digging into 
> the logs, it turned out that, for some reason, the data that were being 
> pumped into the cluster went to the decommissioned node instead of going to 
> the working node.
> That is, the data were inserted into a 2-node cluster (RF=1) while, say, 
> node2 got decommissioned. The expected behavior would be that the data land 
> in node1 after that. However, for some reason, in this 1/1000 flaky test, the 
> situation was the opposite, and the data went to the decommissioned node, 
> resulting in a total loss.
> I haven't found the reason. I don't know if it is a test failure or a 
> production code problem. I cannot prove that it is only a 3.11 problem. I'm 
> creating this ticket because if this is a real issue and exists on newer 
> branches, it is serious.
>  






[jira] [Created] (CASSANDRA-19363) Weird data loss in 3.11 flakiness during decommission

2024-02-05 Thread Jacek Lewandowski (Jira)
Jacek Lewandowski created CASSANDRA-19363:
-

 Summary: Weird data loss in 3.11 flakiness during decommission
 Key: CASSANDRA-19363
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19363
 Project: Cassandra
  Issue Type: Bug
  Components: Consistency/Bootstrap and Decommission
Reporter: Jacek Lewandowski


While testing CASSANDRA-18824 on 3.11, we noticed one flaky result of the newly 
added decommission test. It looked innocent; however, when digging into the 
logs, it turned out that, for some reason, the data that were being pumped into 
the cluster went to the decommissioned node instead of going to the working 
node.

That is, the data were inserted into a 2-node cluster (RF=1) while, say, node2 
got decommissioned. The expected behavior would be that the data land in node1 
after that. However, for some reason, in this 1/1000 flaky test, the situation 
was the opposite, and the data went to the decommissioned node, resulting in a 
total loss.

I haven't found the reason. I don't know if it is a test failure or a 
production code problem. I cannot prove that it is only a 3.11 problem. I'm 
creating this ticket because if this is a real issue and exists on newer 
branches, it is serious.
 






[jira] [Commented] (CASSANDRA-19361) fix node info NPE when ClusterMetadata is null

2024-02-05 Thread Sam Tunnicliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814296#comment-17814296
 ] 

Sam Tunnicliffe commented on CASSANDRA-19361:
-

bq. After deleting data(losing all data), restart and everything became OK

{code}
--  Address    Load  Tokens  Owns (effective)  Host ID                       Rack
UN  127.0.0.2  ?     16      51.2%             6d194555-f6eb-41d0-c000-0002  rack1
DN  127.0.0.4  ?     16      48.8%             6d194555-f6eb-41d0-c000-0001  rack1
{code}

This is pretty odd for a couple of reasons: 
* node1 and node3 seem to have left or been removed from the cluster. 
* {{Host ID}} is based on the {{NodeId}} in cluster metadata, which in turn is 
based on an auto-incrementing integer. So according to this, {{127.0.0.4}} was 
actually the first node added to the cluster. 

Based on this and the stacktraces above, I would guess that something is going 
wrong with node4 discovering its peers when first joining, leading to it 
forming its own single-node cluster in isolation. I'm not sure exactly what is 
happening when you delete all data and restart; if you can attach full logs for 
all 4 nodes, that would be helpful.  

> fix node info NPE when ClusterMetadata is null
> --
>
> Key: CASSANDRA-19361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19361
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/nodetool, Transactional Cluster Metadata
>Reporter: Ling Mao
>Assignee: Ling Mao
>Priority: Normal
> Fix For: 5.0.x
>
> Attachments: CASSANDRA-19361-stack-error.txt
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> h3. How
>  
> I created an ensemble with 3 nodes (it works well), then added a fourth node 
> to join the party. 
> When executing nodetool info, I get the following exception:
> {code:java}
> ➜  bin ./nodetool info
> java.lang.NullPointerException at 
> org.apache.cassandra.service.StorageService.operationMode(StorageService.java:3744)
>  at 
> org.apache.cassandra.service.StorageService.isBootstrapFailed(StorageService.java:3810)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.base/java.lang.reflect.Method.invoke(Method.java:566) at 
> sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)   
> ➜  bin ./nodetool info 
> WARN  [InternalResponseStage:152] 2024-02-02 11:45:15,731 
> RemoteProcessor.java:213 - Got error from /127.0.0.4:7000: TIMEOUT when 
> sending TCM_COMMIT_REQ, retrying on 
> CandidateIterator{candidates=[/127.0.0.4:7000], checkLive=true} error: null 
> -- StackTrace -- java.lang.NullPointerException at 
> org.apache.cassandra.service.StorageService.getLocalHostId(StorageService.java:1904)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.base/java.lang.reflect.Method.invoke(Method.java:566) at 
> sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71) at 
> jdk.internal.reflect.GeneratedMethodAccessor1.invoke(Unknown Source) at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.base/java.lang.reflect.Method.invoke(Method.java:566) at 
> java.base/sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:260){code}
> Server 1 cannot execute nodetool info or cqlsh; servers 2 and 3 can. I tried 
> to query the system-prefixed tables and attach the stack error log for 
> further debugging. I cannot find a way to recover. After deleting the data 
> (losing all data) and restarting, everything became OK.
> {code:java}
> ➜  bin ./nodetool status
> Datacenter: datacenter1
> ===
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address    Load  Tokens  Owns (effective)  Host ID                       Rack
> UN  127.0.0.2  ?     16      51.2%             6d194555-f6eb-41d0-c000-0002  rack1
> DN  127.0.0.4  ?     16      48.8%             6d194555-f6eb-41d0-c000-0001  rack1{code}
> h3. When
>  
> It was introduced by the CEP-21 patch. In any case, a null check is needed to 
> keep the NPE from propagating further
> {code:java}
> Implementation of Transactional Cluster Metadata as described in CEP-21
> Hash: ae084237
>  
> code diff:
>  
>     public String getLocalHostId()
>      {
> -        UUID id = getLo

[jira] [Commented] (CASSANDRA-19361) fix node info NPE when ClusterMetadata is null

2024-02-05 Thread Sam Tunnicliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814292#comment-17814292
 ] 

Sam Tunnicliffe commented on CASSANDRA-19361:
-

From the info in the description and the attached text file, it looks as 
though the 4th node is not communicating with the existing nodes. Can you 
attach the full log from the fourth node?

I can't reproduce this with ccm, how are you configuring/running the instances?

Are the executions of {{nodetool info}} in the description being run against 
node4? Are they executed while the node is bootstrapping? 
{quote}server 1 cannot execute node info and cql shell, server 2 and 3 can do 
it. 
{quote}
Does this only start to happen _after_ node4 is started? Can you run {{nodetool 
info}} and cqlsh on node1 before adding node4?

> fix node info NPE when ClusterMetadata is null
> --
>
> Key: CASSANDRA-19361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19361
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/nodetool, Transactional Cluster Metadata
>Reporter: Ling Mao
>Assignee: Ling Mao
>Priority: Normal
> Fix For: 5.0.x
>
> Attachments: CASSANDRA-19361-stack-error.txt
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> h3. How
>  
> I created an ensemble with 3 nodes (it works well), then added a fourth node 
> to join the party. 
> When executing nodetool info, I get the following exception:
> {code:java}
> ➜  bin ./nodetool info
> java.lang.NullPointerException at 
> org.apache.cassandra.service.StorageService.operationMode(StorageService.java:3744)
>  at 
> org.apache.cassandra.service.StorageService.isBootstrapFailed(StorageService.java:3810)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.base/java.lang.reflect.Method.invoke(Method.java:566) at 
> sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)   
> ➜  bin ./nodetool info 
> WARN  [InternalResponseStage:152] 2024-02-02 11:45:15,731 
> RemoteProcessor.java:213 - Got error from /127.0.0.4:7000: TIMEOUT when 
> sending TCM_COMMIT_REQ, retrying on 
> CandidateIterator{candidates=[/127.0.0.4:7000], checkLive=true} error: null 
> -- StackTrace -- java.lang.NullPointerException at 
> org.apache.cassandra.service.StorageService.getLocalHostId(StorageService.java:1904)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.base/java.lang.reflect.Method.invoke(Method.java:566) at 
> sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71) at 
> jdk.internal.reflect.GeneratedMethodAccessor1.invoke(Unknown Source) at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.base/java.lang.reflect.Method.invoke(Method.java:566) at 
> java.base/sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:260){code}
> Server 1 cannot execute nodetool info or cqlsh; servers 2 and 3 can. I tried 
> to query the system-prefixed tables and attach the stack error log for 
> further debugging. I cannot find a way to recover. After deleting the data 
> (losing all data) and restarting, everything became OK.
> {code:java}
> ➜  bin ./nodetool status
> Datacenter: datacenter1
> ===
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address    Load  Tokens  Owns (effective)  Host ID                       Rack
> UN  127.0.0.2  ?     16      51.2%             6d194555-f6eb-41d0-c000-0002  rack1
> DN  127.0.0.4  ?     16      48.8%             6d194555-f6eb-41d0-c000-0001  rack1{code}
> h3. When
>  
> It was introduced by the CEP-21 patch. In any case, a null check is needed to 
> keep the NPE from propagating further
> {code:java}
> Implementation of Transactional Cluster Metadata as described in CEP-21
> Hash: ae084237
>  
> code diff:
>  
>     public String getLocalHostId()
>      {
> -        UUID id = getLocalHostUUID();
> -        return id != null ? id.toString() : null;
> +        return getLocalHostUUID().toString();
>      }
>  
>      public UUID getLocalHostUUID()
>      {
> -        UUID id = 
> getTokenMetadata().getHostId(FBUtilities.getBroadcastAddressAndPort());
> -        if (id != null)
> -            return id;
> -        // this condition is to prevent accessing the tables whe

[jira] [Updated] (CASSANDRA-18098) Test failure bootstrap_test.py::test_cleanup

2024-02-05 Thread Berenguer Blasi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Berenguer Blasi updated CASSANDRA-18098:

Fix Version/s: 5.0.x
   (was: 5.0-rc)

> Test failure bootstrap_test.py::test_cleanup
> 
>
> Key: CASSANDRA-18098
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18098
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Yifan Cai
>Assignee: Brandon Williams
>Priority: Normal
> Fix For: 4.0.x, 5.0.x, 5.x
>
>
> The test failed a few times in the recent CI runs. For example, this log 
> captures a recent failure. 
> {code:none}
> 20:02:01,364 ccm INFO node1: using Java 11 for the current invocation
> 20:02:02,679 bootstrap_test ERROR ---
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data0/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-1-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data0/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-4-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data0/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-7-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data0/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-10-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data0/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-13-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data1/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-2-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data1/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-5-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data1/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-8-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data1/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-11-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data1/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-14-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data2/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-3-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data2/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-6-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data2/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-9-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data2/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-12-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data2/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-15-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data2/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-16-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data2/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-17-big-Data.db
> 20:02:02,679 bootstrap_test ERROR Current count is 17, basecount was 15
> -- generated xml file: /tmp/results/dtests/pytest_result_j11_with_vnodes.xml 
> ---
> ===Flaky Test Report===
> test_materialized_views_auth passed 1 out of the required 1 times. Success!
> test_cleanup failed and was not selected for rerun.
>   
>   assert not True
>  +  where True =  0x7f071d43cba8>>()
>  +where  0x7f071d43cba8>> = .is_set
>   []
> {code}






[jira] [Commented] (CASSANDRA-18098) Test failure bootstrap_test.py::test_cleanup

2024-02-05 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814284#comment-17814284
 ] 

Berenguer Blasi commented on CASSANDRA-18098:
-

They're not exactly the same, just related. But if there's nothing to go on, I 
agree there's nothing we can do. Let's remove it from blocking the rc, and if 
it pops up again we'll have a thread to start pulling.

> Test failure bootstrap_test.py::test_cleanup
> 
>
> Key: CASSANDRA-18098
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18098
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Yifan Cai
>Assignee: Brandon Williams
>Priority: Normal
> Fix For: 4.0.x, 5.0-rc, 5.x
>
>
> The test failed a few times in the recent CI runs. For example, this log 
> captures a recent failure. 
> {code:none}
> 20:02:01,364 ccm INFO node1: using Java 11 for the current invocation
> 20:02:02,679 bootstrap_test ERROR ---
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data0/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-1-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data0/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-4-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data0/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-7-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data0/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-10-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data0/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-13-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data1/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-2-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data1/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-5-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data1/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-8-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data1/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-11-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data1/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-14-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data2/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-3-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data2/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-6-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data2/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-9-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data2/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-12-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data2/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-15-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data2/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-16-big-Data.db
> 20:02:02,679 bootstrap_test ERROR 
> /tmp/dtest-8kcle23s/test/node1/data2/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-17-big-Data.db
> 20:02:02,679 bootstrap_test ERROR Current count is 17, basecount was 15
> -- generated xml file: /tmp/results/dtests/pytest_result_j11_with_vnodes.xml 
> ---
> ===Flaky Test Report===
> test_materialized_views_auth passed 1 out of the required 1 times. Success!
> test_cleanup failed and was not selected for rerun.
>   
>   assert not True
>  +  where True =  0x7f071d43cba8>>()
>  +where  0x7f071d43cba8>> = .is_set
>   []
> {code}






[jira] [Updated] (CASSANDRA-19283) Update rpm and debian shell includes

2024-02-05 Thread Berenguer Blasi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Berenguer Blasi updated CASSANDRA-19283:

Reviewers: Berenguer Blasi
   Status: Review In Progress  (was: Patch Available)

> Update rpm and debian shell includes
> 
>
> Key: CASSANDRA-19283
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19283
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging
>Reporter: Ekaterina Dimitrova
>Assignee: Brandon Williams
>Priority: Normal
> Fix For: 5.0.x, 5.x
>
>
> While working on CASSANDRA-19001, it was identified that there are 
> differences between bin/cassandra.in.sh and redhat/cassandra.in.sh, and it 
> seems the debian diff on 5.0 has been updated only once (in 2020) since it 
> was created in 2019.
> CC [~brandon.williams]






[jira] [Updated] (CASSANDRA-19283) Update rpm and debian shell includes

2024-02-05 Thread Berenguer Blasi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Berenguer Blasi updated CASSANDRA-19283:

Status: Ready to Commit  (was: Review In Progress)

> Update rpm and debian shell includes
> 
>
> Key: CASSANDRA-19283
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19283
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging
>Reporter: Ekaterina Dimitrova
>Assignee: Brandon Williams
>Priority: Normal
> Fix For: 5.0.x, 5.x
>
>
> While working on CASSANDRA-19001, it was identified that there are 
> differences between bin/cassandra.in.sh and redhat/cassandra.in.sh, and it 
> seems the debian diff on 5.0 has been updated only once (in 2020) since it 
> was created in 2019.
> CC [~brandon.williams]






[jira] [Commented] (CASSANDRA-19283) Update rpm and debian shell includes

2024-02-05 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814282#comment-17814282
 ] 

Berenguer Blasi commented on CASSANDRA-19283:
-

Devbranch continues to be broken for 5.0+, so there's nothing we can do besides 
your local testing. Given that Jenkins 5.0 should be up again pretty soon, 
we'll get quick feedback if anything is not quite right. +1, LGTM.

> Update rpm and debian shell includes
> 
>
> Key: CASSANDRA-19283
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19283
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging
>Reporter: Ekaterina Dimitrova
>Assignee: Brandon Williams
>Priority: Normal
> Fix For: 5.0.x, 5.x
>
>
> While working on CASSANDRA-19001, it was identified that there are 
> differences between bin/cassandra.in.sh and redhat/cassandra.in.sh, and it 
> seems the debian diff on 5.0 has been updated only once (in 2020) since it 
> was created in 2019.
> CC [~brandon.williams]


