[jira] [Commented] (CASSANDRA-19189) Revisit use of sealed period lookup tables
[ https://issues.apache.org/jira/browse/CASSANDRA-19189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814653#comment-17814653 ]

Marcus Eriksson commented on CASSANDRA-19189:
---------------------------------------------

and https://github.com/apache/cassandra-dtest/pull/251 - stop trying to snapshot the removed tables

> Revisit use of sealed period lookup tables
> ------------------------------------------
>
>                 Key: CASSANDRA-19189
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19189
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Transactional Cluster Metadata
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>            Priority: Normal
>             Fix For: 5.1-alpha1
>
>         Attachments: ci_summary.html, result_details.tar.gz
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Metadata snapshots are stored locally in the {{system.metadata_snapshots}} table, which is keyed by epoch. Snapshots are retrieved from this table for three purposes:
> * to replay locally during startup
> * to provide log state for a peer requesting catchup
> * to create point-in-time ClusterMetadata, for disaster recovery
>
> In the majority of cases we want to replay from the most recent snapshot, so we can usually select the appropriate snapshot by simply scanning the snapshots table in reverse, which considerably simplifies the lookup. We will continue to persist historical snapshots, at least for now, so that we can still select arbitrary snapshots should we want to reconstruct metadata state for arbitrary epochs.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
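The "scan the snapshots table in reverse" lookup described in the ticket can be illustrated with a small, self-contained sketch. Here a TreeMap stands in for the epoch-keyed system.metadata_snapshots table; all names are hypothetical and this is not the actual Cassandra API.

```java
import java.util.TreeMap;

// Hypothetical sketch of the reverse-scan snapshot lookup: the common case
// picks the most recent snapshot, while point-in-time recovery picks the
// most recent snapshot at or before a target epoch.
class SnapshotLookup {
    // epoch -> serialized snapshot blob (stand-in for system.metadata_snapshots)
    static final TreeMap<Long, byte[]> snapshots = new TreeMap<>();

    // Most recent snapshot overall (e.g. replay on startup).
    static Long latestEpoch() {
        return snapshots.isEmpty() ? null : snapshots.lastKey();
    }

    // Most recent snapshot at or before the given epoch (disaster recovery).
    static Long epochAtOrBefore(long target) {
        return snapshots.floorKey(target);
    }

    public static void main(String[] args) {
        snapshots.put(10L, new byte[0]);
        snapshots.put(20L, new byte[0]);
        snapshots.put(30L, new byte[0]);
        System.out.println(latestEpoch());        // 30
        System.out.println(epochAtOrBefore(25L)); // 20
    }
}
```

Scanning the table in clustering order descending gives the same effect as `lastKey()`/`floorKey()` here: no separate sealed-period lookup table is needed to find the right snapshot.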
[jira] [Updated] (CASSANDRA-19189) Revisit use of sealed period lookup tables
[ https://issues.apache.org/jira/browse/CASSANDRA-19189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson updated CASSANDRA-19189:
----------------------------------------
    Attachment: ci_summary.html
                result_details.tar.gz
[jira] [Updated] (CASSANDRA-19189) Revisit use of sealed period lookup tables
[ https://issues.apache.org/jira/browse/CASSANDRA-19189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson updated CASSANDRA-19189:
----------------------------------------
    Test and Documentation Plan: ci run
                         Status: Patch Available  (was: Open)

https://github.com/apache/cassandra/pull/3088
[jira] [Updated] (CASSANDRA-19189) Revisit use of sealed period lookup tables
[ https://issues.apache.org/jira/browse/CASSANDRA-19189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson updated CASSANDRA-19189:
----------------------------------------
    Change Category: Code Clarity
         Complexity: Normal
          Reviewers: Alex Petrov, Sam Tunnicliffe
             Status: Open  (was: Triage Needed)
[jira] [Commented] (CASSANDRA-19085) In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE
[ https://issues.apache.org/jira/browse/CASSANDRA-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814613#comment-17814613 ]

Berenguer Blasi commented on CASSANDRA-19085:
---------------------------------------------

Thanks for the review [~brandon.williams]. The SCM setting is only there so CI fully exercises that configuration; only the gossiper fix is needed. Given that Jenkins is back, I'll just merge this to prevent any failures arising from it and to avoid duplicating efforts.

> In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-19085
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19085
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair
>            Reporter: Branimir Lambov
>            Assignee: Berenguer Blasi
>            Priority: Normal
>             Fix For: 5.0-rc, 5.x
>
> More precisely, when the {{MessagingService}} version is set to {{VERSION_50}}, the test fails with an exception that appears to be a genuine problem:
> {code:java}
> junit.framework.AssertionFailedError: Exception found expected null, but was:
>     at org.apache.cassandra.service.ActiveRepairService.lambda$prepareForRepair$2(ActiveRepairService.java:678)
>     at org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)
>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>     at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>     at java.base/java.lang.Thread.run(Thread.java:833)
>     at org.apache.cassandra.distributed.test.DistributedRepairUtils.lambda$assertParentRepairSuccess$4(DistributedRepairUtils.java:129)
>     at org.apache.cassandra.distributed.test.DistributedRepairUtils.validateExistingParentRepair(DistributedRepairUtils.java:164)
>     at org.apache.cassandra.distributed.test.DistributedRepairUtils.assertParentRepairSuccess(DistributedRepairUtils.java:124)
>     at org.apache.cassandra.distributed.test.RepairTest.testForcedNormalRepairWithOneNodeDown(RepairTest.java:211)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> org.apache.cassandra.distributed.shared.ShutdownException: Uncaught exceptions were thrown during test
>     at org.apache.cassandra.distributed.impl.AbstractCluster.checkAndResetUncaughtExceptions(AbstractCluster.java:1117)
>     at org.apache.cassandra.distributed.impl.AbstractCluster.close(AbstractCluster.java:1103)
>     at org.apache.cassandra.distributed.test.RepairTest.closeCluster(RepairTest.java:160)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     Suppressed: java.lang.IllegalStateException: complete already: (failure: java.lang.RuntimeException: Did not get replies from all endpoints.)
>         at org.apache.cassandra.utils.concurrent.AsyncPromise.setSuccess(AsyncPromise.java:106)
>         at org.apache.cassandra.service.ActiveRepairService$2.ack(ActiveRepairService.java:721)
>         at org.apache.cassandra.service.ActiveRepairService$2.onResponse(ActiveRepairService.java:697)
>         at org.apache.cassandra.repair.messages.RepairMessage$2.onResponse(RepairMessage.java:187)
>         at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:58)
>         at org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78)
>         at org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:64)
>         at org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:50)
>         at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97)
>         at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45)
>         at org.apache.cassandra.net.In
[jira] [Comment Edited] (CASSANDRA-19018) An SAI-specific mechanism to ensure consistency isn't violated for multi-column (i.e. AND) queries at CL > ONE
[ https://issues.apache.org/jira/browse/CASSANDRA-19018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814563#comment-17814563 ]

Caleb Rackliffe edited comment on CASSANDRA-19018 at 2/6/24 5:45 AM:
---------------------------------------------------------------------

I think I may have found the missing link. While RFP is still broken around short reads, SAI at the local level might be hiding range tombstones. I've managed to get the multi-node Harry test passing in extended runs [here|https://github.com/maedhroz/cassandra/pull/15/commits]. Paging and read-repair are disabled here to avoid the potential RFP problems, and statics are disabled, but I should now be able to add back statics and read-repair and get clean runs. More on that shortly...

UPDATE: I've been able to add static indexing back without failure. At this point, only read repair and paging are disabled, so attacking the RFP issues is probably next.

was (Author: maedhroz):
I think I may have found the missing link. While RFP is still broken around short reads, SAI at the local level might be hiding range tombstones. I've managed to get the multi-node Harry test passing in extended runs [here|https://github.com/maedhroz/cassandra/pull/15/commits]. Paging and read-repair are disabled here to avoid the potential RFP problems, and statics are disabled, but I should now be able to add back statics and read-repair and get clean runs. More on that shortly...

> An SAI-specific mechanism to ensure consistency isn't violated for multi-column (i.e. AND) queries at CL > ONE
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19018
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19018
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Coordination, Feature/SAI
>            Reporter: Caleb Rackliffe
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>             Fix For: 5.0-rc, 5.x
>
>         Attachments: ci_summary-1.html, ci_summary.html, result_details.tar-1.gz, result_details.tar.gz
>
>          Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> CASSANDRA-19007 is going to be where we add a guardrail around filtering/index queries that use intersection/AND over partially updated non-key columns. (ex. Restricting one clustering column and one normal column does not cause a consistency problem, as primary keys cannot be partially updated.) This issue exists to attempt to fix this specifically for SAI in 5.0.x, as Accord will (last I checked) not be available until the 5.1 release.
>
> The SAI-specific version of the originally reported issue is this:
> {noformat}
> try (Cluster cluster = init(Cluster.build(2).withConfig(config -> config.with(GOSSIP).with(NETWORK)).start()))
> {
>     cluster.schemaChange(withKeyspace("CREATE TABLE %s.t (k int PRIMARY KEY, a int, b int)"));
>     cluster.schemaChange(withKeyspace("CREATE INDEX ON %s.t(a) USING 'sai'"));
>     cluster.schemaChange(withKeyspace("CREATE INDEX ON %s.t(b) USING 'sai'"));
>
>     // insert a split row
>     cluster.get(1).executeInternal(withKeyspace("INSERT INTO %s.t(k, a) VALUES (0, 1)"));
>     cluster.get(2).executeInternal(withKeyspace("INSERT INTO %s.t(k, b) VALUES (0, 2)"));
>
>     // Uncomment this line and test succeeds w/ partial writes completed...
>     //cluster.get(1).nodetoolResult("repair", KEYSPACE).asserts().success();
>
>     String select = withKeyspace("SELECT * FROM %s.t WHERE a = 1 AND b = 2");
>     Object[][] initialRows = cluster.coordinator(1).execute(select, ConsistencyLevel.ALL);
>     assertRows(initialRows, row(0, 1, 2)); // not found!!
> }
> {noformat}
> To make a long story short, the local SAI indexes are hiding local partial matches from the coordinator that would combine there to form full matches. Simple non-index filtering queries also suffer from this problem, but they hide the partial matches in a different way. I'll outline a possible solution for this in the comments that takes advantage of replica filtering protection and the repaired/unrepaired datasets...and attempts to minimize the amount of extra row data sent to the coordinator.
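The split-row scenario in the ticket reduces to a toy model (this is not Cassandra code; all names are illustrative): each replica applies the `a = 1 AND b = 2` filter locally against its partial copy of row k=0, so neither reports a match, while coordinator-side reconciliation of the complete rows would.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model of the split row: replica 1 only saw the write to
// column a, replica 2 only saw the write to column b.
class SplitRowDemo {
    static final Map<String, Integer> replica1 = Map.of("a", 1); // k=0: only a
    static final Map<String, Integer> replica2 = Map.of("b", 2); // k=0: only b

    // Local filtering/index match for "a = 1 AND b = 2" on one replica's view.
    static boolean localMatch(Map<String, Integer> row) {
        return Integer.valueOf(1).equals(row.get("a"))
            && Integer.valueOf(2).equals(row.get("b"));
    }

    // Coordinator-side reconciliation: merge the column sets, then filter.
    static boolean mergedMatch() {
        Map<String, Integer> merged = new HashMap<>(replica1);
        merged.putAll(replica2);
        return localMatch(merged);
    }

    public static void main(String[] args) {
        System.out.println(localMatch(replica1)); // false
        System.out.println(localMatch(replica2)); // false
        System.out.println(mergedMatch());        // true
    }
}
```

Because each replica filters before the coordinator ever sees the partial rows, the intersection is evaluated against incomplete data, which is exactly why the quoted dtest reports "not found" until a repair merges the columns.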
[jira] [Commented] (CASSANDRA-19335) Default nodetool tablestats to Human-Readable Output
[ https://issues.apache.org/jira/browse/CASSANDRA-19335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814609#comment-17814609 ]

Leo Toff commented on CASSANDRA-19335:
--------------------------------------

Your comment has been addressed: I've added a "-r" short flag for "--no-human-readable" ("-r" stands for raw). Let me know what my next steps should be here.

I wanted to do some refactoring:
* Convert "out.printf(indent + ...)" to "out.printf(%s ..., indent ...)" in TableStatsPrinter where printf format specifiers are used (see [Stefan's comment in PR#2977|https://github.com/apache/cassandra/pull/2977#discussion_r1430323676])
* Move formatting from the Holder class to the Printer class (from TableStatsHolder to TableStatsPrinter)
* Consider renaming "formatMemory" (and other mentions of "memory") to "formatDataSize" across TableStatsPrinter, TableStatsHolder, and FBUtilities

> Default nodetool tablestats to Human-Readable Output
> ----------------------------------------------------
>
>                 Key: CASSANDRA-19335
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19335
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tool/nodetool
>            Reporter: Leo Toff
>            Assignee: Leo Toff
>            Priority: Low
>             Fix For: 5.x
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> *Current Behavior*
> The current implementation of nodetool tablestats in Apache Cassandra outputs statistics in a format that is not immediately human-readable. This output primarily includes raw byte counts, which require additional calculation or conversion to be easily understood by users. This can be inefficient and time-consuming, especially for users who frequently monitor these statistics for performance tuning or maintenance purposes.
>
> *Proposed Change*
> We propose that nodetool tablestats should, by default, provide its output in a human-readable format. This change would involve converting byte counts into more understandable units (KiB, MiB, GiB). The tool could still retain the option to display raw data for those who need it, perhaps through a flag such as --no-human-readable or --raw.
>
> *Considerations*
> The change should maintain backward compatibility, ensuring that scripts or tools relying on the current output format can continue to function correctly. We should provide adequate documentation and examples of both the new default output and how to access the raw data format, if needed.
>
> *Alignment*
> Discussion on the dev mailing list: [https://lists.apache.org/thread/mlp715kxho5b6f1ql9omlzmmnh4qfby9]
>
> *Related work*
> Previous work in the series:
> # https://issues.apache.org/jira/browse/CASSANDRA-19015
> # https://issues.apache.org/jira/browse/CASSANDRA-19104
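The proposed byte-count conversion into KiB/MiB/GiB can be sketched as follows. "formatDataSize" is the hypothetical name floated in the refactoring notes above, not an existing Cassandra method; this is an illustrative sketch, not the FBUtilities implementation.

```java
import java.util.Locale;

// Sketch of a human-readable data-size formatter: repeatedly divide by 1024
// and pick the matching binary unit. Locale.ROOT keeps the decimal point
// stable regardless of the default locale.
class DataSizeFormat {
    static String formatDataSize(long bytes) {
        final String[] units = {"bytes", "KiB", "MiB", "GiB", "TiB"};
        double value = bytes;
        int unit = 0;
        while (value >= 1024 && unit < units.length - 1) {
            value /= 1024;
            unit++;
        }
        return unit == 0
            ? bytes + " bytes"
            : String.format(Locale.ROOT, "%.2f %s", value, units[unit]);
    }

    public static void main(String[] args) {
        System.out.println(formatDataSize(512));         // 512 bytes
        System.out.println(formatDataSize(2048));        // 2.00 KiB
        System.out.println(formatDataSize(5368709120L)); // 5.00 GiB
    }
}
```

A "--no-human-readable"/"-r" path would simply bypass this formatter and print the raw `long`, which is what keeps script-facing output backward compatible.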
[jira] [Comment Edited] (CASSANDRA-19335) Default nodetool tablestats to Human-Readable Output
[ https://issues.apache.org/jira/browse/CASSANDRA-19335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814609#comment-17814609 ]

Leo Toff edited comment on CASSANDRA-19335 at 2/6/24 5:25 AM:
--------------------------------------------------------------

Your comment has been addressed: I've added a `-r` short flag for `--no-human-readable` (`-r` stands for raw). Let me know what my next steps should be here.

I wanted to do some refactoring:
* Convert "out.printf(indent + ...)" to "out.printf(%s ..., indent ...)" in TableStatsPrinter where printf format specifiers are used (see [Stefan's comment in PR#2977|https://github.com/apache/cassandra/pull/2977#discussion_r1430323676])
* Move formatting from the Holder class to the Printer class (from TableStatsHolder to TableStatsPrinter)
* Consider renaming "formatMemory" (and other mentions of "memory") to "formatDataSize" across TableStatsPrinter, TableStatsHolder, and FBUtilities

was (Author: JIRAUSER303078):
Your comment has been addressed, I've added "-r" short flag for "--no-human-readable". "-r" stands for Raw. Let me know what my next steps should be here.
I wanted to do some refactoring:
* Convert "out.printf(indent + ...)" to "out.printf(%s ..., indent ...)" in TableStatsPrinter where printf format specifiers are used (see [Stefan's comment in PR#2977|https://github.com/apache/cassandra/pull/2977#discussion_r1430323676])
* Move formatting from the Holder class to the Printer class (from TableStatsHolder to TableStatsPrinter)
* Consider renaming "formatMemory" (and other mentions of "memory") to "formatDataSize" across TableStatsPrinter, TableStatsHolder, and FBUtilities
Re: [PR] JAVA-3051: Memory leak [cassandra-java-driver]
aratno commented on code in PR #1743:
URL: https://github.com/apache/cassandra-java-driver/pull/1743#discussion_r1479160472

##########
core/src/main/java/com/datastax/oss/driver/internal/core/loadbalancing/DefaultLoadBalancingPolicy.java:
##########

@@ -96,14 +99,38 @@ public class DefaultLoadBalancingPolicy extends BasicLoadBalancingPolicy impleme
   private static final int MAX_IN_FLIGHT_THRESHOLD = 10;
   private static final long RESPONSE_COUNT_RESET_INTERVAL_NANOS = MILLISECONDS.toNanos(200);

-  protected final Map<Node, AtomicLongArray> responseTimes = new ConcurrentHashMap<>();
+  protected final LoadingCache<Node, AtomicLongArray> responseTimes;
   protected final Map<Node, Long> upTimes = new ConcurrentHashMap<>();
   private final boolean avoidSlowReplicas;

   public DefaultLoadBalancingPolicy(@NonNull DriverContext context, @NonNull String profileName) {
     super(context, profileName);
     this.avoidSlowReplicas =
         profile.getBoolean(DefaultDriverOption.LOAD_BALANCING_POLICY_SLOW_AVOIDANCE, true);
+    CacheLoader<Node, AtomicLongArray> cacheLoader =
+        new CacheLoader<Node, AtomicLongArray>() {
+          @Override
+          public AtomicLongArray load(Node key) throws Exception {
+            // The array stores at most two timestamps, since we don't need more;
+            // the first one is always the least recent one, and hence the one to inspect.
+            long now = nanoTime();
+            AtomicLongArray array = responseTimes.getIfPresent(key);
+            if (array == null) {
+              array = new AtomicLongArray(1);
+              array.set(0, now);
+            } else if (array.length() == 1) {
+              long previous = array.get(0);
+              array = new AtomicLongArray(2);
+              array.set(0, previous);
+              array.set(1, now);
+            } else {
+              array.set(0, array.get(1));
+              array.set(1, now);
+            }
+            return array;
+          }
+        };
+    this.responseTimes = CacheBuilder.newBuilder().weakKeys().build(cacheLoader);

Review Comment:
   I think we should add a [RemovalListener](https://guava.dev/releases/21.0/api/docs/com/google/common/cache/RemovalListener.html) here. If a GC happens and response times for a Node are purged, then we'll end up treating that as "insufficient responses" in `isResponseRateInsufficient`, which can lead us to mark a node as unhealthy. I recognize that this is a bit of a pathological example, but this behavior does depend on GC timing and would be a pain to track down, so adding logging could make someone's life easier down the line.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
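The two-timestamp bookkeeping in the quoted CacheLoader can be isolated as a pure-JDK sketch. The real patch wraps this logic in a Guava LoadingCache with weak keys; this standalone version is only illustrative and is not the driver's code.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Sketch of the diff's bookkeeping: the array holds at most the two most
// recent response timestamps for a node, oldest first, so slot 0 is always
// the one to inspect when judging the response rate.
class ResponseTimes {
    static AtomicLongArray record(AtomicLongArray array, long now) {
        if (array == null) {
            // First response ever seen for this node.
            array = new AtomicLongArray(1);
            array.set(0, now);
        } else if (array.length() == 1) {
            // Second response: grow to two slots, oldest first.
            long previous = array.get(0);
            array = new AtomicLongArray(2);
            array.set(0, previous);
            array.set(1, now);
        } else {
            // Steady state: drop the least recent timestamp.
            array.set(0, array.get(1));
            array.set(1, now);
        }
        return array;
    }

    public static void main(String[] args) {
        AtomicLongArray a = record(null, 100);
        a = record(a, 200);
        a = record(a, 300);
        System.out.println(a.get(0) + " " + a.get(1)); // 200 300
    }
}
```

With weak keys, a collected Node drops its whole array, which is why the review above worries that a GC can silently reset a node to the "fewer than two responses" state.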
[jira] [Commented] (CASSANDRA-19018) An SAI-specific mechanism to ensure consistency isn't violated for multi-column (i.e. AND) queries at CL > ONE
[ https://issues.apache.org/jira/browse/CASSANDRA-19018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814563#comment-17814563 ]

Caleb Rackliffe commented on CASSANDRA-19018:
---------------------------------------------

I think I may have found the missing link. While RFP is still broken around short reads, SAI at the local level might be hiding range tombstones. I've managed to get the multi-node Harry test passing in extended runs [here|https://github.com/maedhroz/cassandra/pull/15/commits]. Paging and read-repair are disabled here to avoid the potential RFP problems, and statics are disabled, but I should now be able to add back statics and read-repair and get clean runs. More on that shortly...
[jira] [Updated] (CASSANDRA-18230) Write docs for CEP-20
[ https://issues.apache.org/jira/browse/CASSANDRA-18230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lorina Poland updated CASSANDRA-18230:
--------------------------------------
    Change Category: Code Clarity
         Complexity: Normal
           Priority: Normal  (was: High)
             Status: Open  (was: Triage Needed)

> Write docs for CEP-20
> ---------------------
>
>                 Key: CASSANDRA-18230
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18230
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Documentation
>            Reporter: Lorina Poland
>            Assignee: Lorina Poland
>            Priority: Normal
>             Fix For: 5.x
[jira] [Assigned] (CASSANDRA-18230) Write docs for CEP-20
[ https://issues.apache.org/jira/browse/CASSANDRA-18230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lorina Poland reassigned CASSANDRA-18230:
-----------------------------------------
    Assignee: Lorina Poland

> Write docs for CEP-20
> ---------------------
>
>                 Key: CASSANDRA-18230
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18230
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Documentation
>            Reporter: Lorina Poland
>            Assignee: Lorina Poland
>            Priority: High
>             Fix For: 5.x
Re: [PR] JAVA-3051: Memory leak [cassandra-java-driver]
absurdfarce commented on PR #1743:
URL: https://github.com/apache/cassandra-java-driver/pull/1743#issuecomment-1928483385

Very much agreed that the underlying issue here appears to be an issue with AWS Keyspaces @aratno; that's being addressed in a different ticket. The scope of this change is around preventing the (potentially indefinite) caching of Node instances within an LBP.
Re: [PR] JAVA-3051: Memory leak [cassandra-java-driver]
aratno commented on code in PR #1743: URL: https://github.com/apache/cassandra-java-driver/pull/1743#discussion_r1479025356 ## core/src/main/java/com/datastax/oss/driver/internal/core/loadbalancing/DefaultLoadBalancingPolicy.java: ## @@ -276,38 +303,23 @@ protected boolean isBusy(@NonNull Node node, @NonNull Session session) { protected boolean isResponseRateInsufficient(@NonNull Node node, long now) { // response rate is considered insufficient when less than 2 responses were obtained in // the past interval delimited by RESPONSE_COUNT_RESET_INTERVAL_NANOS. -if (responseTimes.containsKey(node)) { - AtomicLongArray array = responseTimes.get(node); - if (array.length() == 2) { -long threshold = now - RESPONSE_COUNT_RESET_INTERVAL_NANOS; -long leastRecent = array.get(0); -return leastRecent - threshold < 0; - } -} -return true; +AtomicLongArray array = responseTimes.getIfPresent(node); +if (array != null && array.length() == 2) { + long threshold = now - RESPONSE_COUNT_RESET_INTERVAL_NANOS; + long leastRecent = array.get(0); + return leastRecent - threshold < 0; +} else return true; Review Comment: Style nit: Invert the condition and use an early-return if response rate is insufficient, so you don't have `else return true` ## core/src/main/java/com/datastax/oss/driver/internal/core/metrics/AbstractMetricUpdater.java: ## @@ -173,9 +173,8 @@ protected Timeout newTimeout() { .getTimer() .newTimeout( t -> { - if (t.isExpired()) { -clearMetrics(); - } + clearMetrics(); + cancelMetricsExpirationTimeout(); Review Comment: What's the reasoning for this change? ## core/src/main/java/com/datastax/oss/driver/internal/core/util/concurrent/ReplayingEventFilter.java: ## @@ -82,6 +82,7 @@ public void markReady() { consumer.accept(event); } } finally { + recordedEvents.clear(); Review Comment: What's the reasoning for this change? 
## core/src/main/java/com/datastax/oss/driver/internal/core/loadbalancing/DefaultLoadBalancingPolicy.java: ## @@ -96,14 +99,38 @@ public class DefaultLoadBalancingPolicy extends BasicLoadBalancingPolicy impleme private static final int MAX_IN_FLIGHT_THRESHOLD = 10; private static final long RESPONSE_COUNT_RESET_INTERVAL_NANOS = MILLISECONDS.toNanos(200); - protected final Map responseTimes = new ConcurrentHashMap<>(); + protected final LoadingCache responseTimes; protected final Map upTimes = new ConcurrentHashMap<>(); private final boolean avoidSlowReplicas; public DefaultLoadBalancingPolicy(@NonNull DriverContext context, @NonNull String profileName) { super(context, profileName); this.avoidSlowReplicas = profile.getBoolean(DefaultDriverOption.LOAD_BALANCING_POLICY_SLOW_AVOIDANCE, true); +CacheLoader cacheLoader = Review Comment: Style nit: use a separate class for the cache value here, rather than using AtomicLongArray as a generic container. Seems like it can be something like `NodeResponseRateSample`, with methods like `boolean hasSufficientResponses`. I see this was present in the previous implementation, so not a required change for this PR, just something I noticed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
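A standalone sketch of the `NodeResponseRateSample` idea floated in the last review comment. The class name and `hasSufficientResponses` method are the reviewer's proposal; the fields and interval below are assumptions made for illustration, not existing driver API.

```java
public class NodeResponseRateSample {
  static final long RESPONSE_COUNT_RESET_INTERVAL_NANOS = 200_000_000L; // 200 ms

  private final long oldestResponseNanos; // timestamp of the older of the 2 samples
  private final int responseCount;        // responses recorded so far (0..2)

  NodeResponseRateSample(long oldestResponseNanos, int responseCount) {
    this.oldestResponseNanos = oldestResponseNanos;
    this.responseCount = responseCount;
  }

  // Sufficient when at least 2 responses fell inside the last interval,
  // replacing the bare AtomicLongArray index checks with a named concept.
  boolean hasSufficientResponses(long now) {
    return responseCount >= 2
        && oldestResponseNanos >= now - RESPONSE_COUNT_RESET_INTERVAL_NANOS;
  }

  public static void main(String[] args) {
    long now = System.nanoTime();
    System.out.println(new NodeResponseRateSample(now, 2).hasSufficientResponses(now)); // true
    System.out.println(
        new NodeResponseRateSample(now - 500_000_000L, 2).hasSufficientResponses(now)); // false
  }
}
```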
[jira] [Commented] (CASSANDRA-19372) WEBSITE - Adding blog post
[ https://issues.apache.org/jira/browse/CASSANDRA-19372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814527#comment-17814527 ] Paul Au commented on CASSANDRA-19372: - Preview Available: https://raw.githack.com/Paul-TT/cassandra-website/CASSANDRA-19372_generated/content/_/blog.html https://raw.githack.com/Paul-TT/cassandra-website/CASSANDRA-19372_generated/content/_/blog/Apache-Cassandra-5.0-Features-Mathematical-CQL-Functions.html > WEBSITE - Adding blog post > -- > > Key: CASSANDRA-19372 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19372 > Project: Cassandra > Issue Type: Task > Components: Documentation/Website >Reporter: Paul Au >Priority: Normal > > Adding blog post to website. > Apache Cassandra 5.0 Features: Mathematical CQL Functions -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-19372) WEBSITE - Adding blog post
Paul Au created CASSANDRA-19372: --- Summary: WEBSITE - Adding blog post Key: CASSANDRA-19372 URL: https://issues.apache.org/jira/browse/CASSANDRA-19372 Project: Cassandra Issue Type: Task Components: Documentation/Website Reporter: Paul Au Adding blog post to website. Apache Cassandra 5.0 Features: Mathematical CQL Functions -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-19370) Intermittent test failures in SchemaIT
[ https://issues.apache.org/jira/browse/CASSANDRA-19370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814519#comment-17814519 ] Bret McGuire commented on CASSANDRA-19370: -- Also worth noting that there are several older tickets in the DataStax Jira which address similar issues: https://datastax-oss.atlassian.net/browse/JAVA-2579 [https://datastax-oss.atlassian.net/browse/JAVA-1690] So this one has been around for a while. > Intermittent test failures in SchemaIT > -- > > Key: CASSANDRA-19370 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19370 > Project: Cassandra > Issue Type: Bug >Reporter: Bret McGuire >Priority: Normal > > Noted on a few DataStax Jenkins runs of the Java driver test suite, > specifically a test run for a recent PR for CASSANDRA-19290. Seems to be > very intermittent. > > {code:java} > Per-Commit / Matrix - SERVER_VERSION = 'dse-6.8.30', JABBA_VERSION = > 'openjdk@1.11' / Execute-Tests / > com.datastax.oss.driver.core.metadata.SchemaIT.should_disable_schema_programmatically_when_enabled_in_config{code} > > {noformat} > Error MessageExpecting: > > {foo=com.datastax.dse.driver.internal.core.metadata.schema.DefaultDseTableMetadata@a8599027} > not to contain key: > fooStacktracejava.lang.AssertionError: > Expecting: > > {foo=com.datastax.dse.driver.internal.core.metadata.schema.DefaultDseTableMetadata@a8599027} > not to contain key: > foo > at > com.datastax.oss.driver.core.metadata.SchemaIT.should_disable_schema_programmatically_when_enabled_in_config(SchemaIT.java:158) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > at > org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:345) > at > org.apache.maven.surefire.junitcore.pc.InvokerStrategy.schedule(InvokerStrategy.java:47) > at > org.apache.maven.surefire.junitcore.pc.Scheduler.schedule(Scheduler.java:316) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at org.junit.runners.Suite.runChild(Suite.java:128) > at org.junit.runners.Suite.runChild(Suite.java:27) > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > at > org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:345) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) > at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) > at java.base/java.lang.Thread.run(Thread.java:833){noformat} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-19371) Intermittent test failures in ChannelPoolResizeTest
Bret McGuire created CASSANDRA-19371: Summary: Intermittent test failures in ChannelPoolResizeTest Key: CASSANDRA-19371 URL: https://issues.apache.org/jira/browse/CASSANDRA-19371 Project: Cassandra Issue Type: Bug Reporter: Bret McGuire Noted on a recent DataStax Jenkins run against a PR for CASSANDRA-19290. Failure seems to be intermittent. {noformat} Per-Commit / Matrix - SERVER_VERSION = 'dse-6.8.30', JABBA_VERSION = 'openjdk@1.11' / Execute-Tests / com.datastax.oss.driver.internal.core.pool.ChannelPoolResizeTest.should_resize_during_reconnection_if_config_changes{noformat} {noformat} Error MessageExpecting: [1] to contain exactly (and in same order): [0] but some elements were not found: [0] and others were not expected: [1] Stacktracejava.lang.AssertionError: Expecting: [1] to contain exactly (and in same order): [0] but some elements were not found: [0] and others were not expected: [1] at com.datastax.oss.driver.internal.core.channel.MockChannelFactoryHelper.verifyNoMoreCalls(MockChannelFactoryHelper.java:114) at com.datastax.oss.driver.internal.core.pool.ChannelPoolResizeTest.should_resize_during_reconnection_if_config_changes(ChannelPoolResizeTest.java:379) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.ParentRunner.run(ParentRunner.java:413) at org.junit.runners.Suite.runChild(Suite.java:128) at org.junit.runners.Suite.runChild(Suite.java:27) at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.ParentRunner.run(ParentRunner.java:413) at org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:49) at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:120) at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:95) at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:75) at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:69) at 
org.apache.maven.surefire.junitcore.JUnitCoreProvider.invoke(JUnitCoreProvider.java:146) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:385) at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:162) at org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:507) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:495){noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (CASSANDRA-19370) Intermittent test failures in SchemaIT
[ https://issues.apache.org/jira/browse/CASSANDRA-19370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bret McGuire updated CASSANDRA-19370: - Description: Noted on a few DataStax Jenkins runs of the Java driver test suite, specifically a test run for a recent PR for CASSANDRA-19290. Seems to be very intermittent. {code:java} Per-Commit / Matrix - SERVER_VERSION = 'dse-6.8.30', JABBA_VERSION = 'openjdk@1.11' / Execute-Tests / com.datastax.oss.driver.core.metadata.SchemaIT.should_disable_schema_programmatically_when_enabled_in_config{code} {noformat} Error MessageExpecting: {foo=com.datastax.dse.driver.internal.core.metadata.schema.DefaultDseTableMetadata@a8599027} not to contain key: fooStacktracejava.lang.AssertionError: Expecting: {foo=com.datastax.dse.driver.internal.core.metadata.schema.DefaultDseTableMetadata@a8599027} not to contain key: foo at com.datastax.oss.driver.core.metadata.SchemaIT.should_disable_schema_programmatically_when_enabled_in_config(SchemaIT.java:158) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:568) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) at org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:345) at org.apache.maven.surefire.junitcore.pc.InvokerStrategy.schedule(InvokerStrategy.java:47) at org.apache.maven.surefire.junitcore.pc.Scheduler.schedule(Scheduler.java:316) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.ParentRunner.run(ParentRunner.java:413) at org.junit.runners.Suite.runChild(Suite.java:128) at org.junit.runners.Suite.runChild(Suite.java:27) at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) at org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:345) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) at java.base/java.lang.Thread.run(Thread.java:833){noformat} was: Noted on a few DataStax Jenkins runs of the Java driver test suite. Seems to be very intermittent. 
{code:java} Per-Commit / Matrix - SERVER_VERSION = 'dse-6.8.30', JABBA_VERSION = 'openjdk@1.11' / Execute-Tests / com.datastax.oss.driver.core.metadata.SchemaIT.should_disable_schema_programmatically_when_enabled_in_config{code} {noformat} Error MessageExpecting: {foo=com.datastax.dse.driver.internal.core.metadata.schema.DefaultDseTableMetadata@a8599027} not to contain key: fooStacktracejava.lang.AssertionError: Expecting: {foo=com.datastax.dse.driver.internal.core.metadata.schema.DefaultDseTableMetadata@a8599027} not to contain key: foo at com.datastax.oss.driver.core.metadata.SchemaIT.should_disable_schema_programmatically_when_enabled_in_config(SchemaIT.java:158) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.b
[jira] [Created] (CASSANDRA-19370) Intermittent test failures in SchemaIT
Bret McGuire created CASSANDRA-19370: Summary: Intermittent test failures in SchemaIT Key: CASSANDRA-19370 URL: https://issues.apache.org/jira/browse/CASSANDRA-19370 Project: Cassandra Issue Type: Bug Reporter: Bret McGuire Noted on a few DataStax Jenkins runs of the Java driver test suite. Seems to be very intermittent. {code:java} Per-Commit / Matrix - SERVER_VERSION = 'dse-6.8.30', JABBA_VERSION = 'openjdk@1.11' / Execute-Tests / com.datastax.oss.driver.core.metadata.SchemaIT.should_disable_schema_programmatically_when_enabled_in_config{code} {noformat} Error MessageExpecting: {foo=com.datastax.dse.driver.internal.core.metadata.schema.DefaultDseTableMetadata@a8599027} not to contain key: fooStacktracejava.lang.AssertionError: Expecting: {foo=com.datastax.dse.driver.internal.core.metadata.schema.DefaultDseTableMetadata@a8599027} not to contain key: foo at com.datastax.oss.driver.core.metadata.SchemaIT.should_disable_schema_programmatically_when_enabled_in_config(SchemaIT.java:158) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:568) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) at org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:345) at org.apache.maven.surefire.junitcore.pc.InvokerStrategy.schedule(InvokerStrategy.java:47) at org.apache.maven.surefire.junitcore.pc.Scheduler.schedule(Scheduler.java:316) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.ParentRunner.run(ParentRunner.java:413) at org.junit.runners.Suite.runChild(Suite.java:128) at org.junit.runners.Suite.runChild(Suite.java:27) at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) at org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:345) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) at java.base/java.lang.Thread.run(Thread.java:833){noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-19370) Intermittent test failures in SchemaIT
[ https://issues.apache.org/jira/browse/CASSANDRA-19370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814516#comment-17814516 ] Bret McGuire commented on CASSANDRA-19370: -- Not immediately clear if there's anything DSE-specific about this failure or not. The two cases I could find do involve runs against DSE but it's quite possible the issue in this test is more general. > Intermittent test failures in SchemaIT > -- > > Key: CASSANDRA-19370 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19370 > Project: Cassandra > Issue Type: Bug >Reporter: Bret McGuire >Priority: Normal > > Noted on a few DataStax Jenkins runs of the Java driver test suite. Seems to > be very intermittent. > > {code:java} > Per-Commit / Matrix - SERVER_VERSION = 'dse-6.8.30', JABBA_VERSION = > 'openjdk@1.11' / Execute-Tests / > com.datastax.oss.driver.core.metadata.SchemaIT.should_disable_schema_programmatically_when_enabled_in_config{code} > > {noformat} > Error MessageExpecting: > > {foo=com.datastax.dse.driver.internal.core.metadata.schema.DefaultDseTableMetadata@a8599027} > not to contain key: > fooStacktracejava.lang.AssertionError: > Expecting: > > {foo=com.datastax.dse.driver.internal.core.metadata.schema.DefaultDseTableMetadata@a8599027} > not to contain key: > foo > at > com.datastax.oss.driver.core.metadata.SchemaIT.should_disable_schema_programmatically_when_enabled_in_config(SchemaIT.java:158) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:568) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 
at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > at > org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:345) > at > org.apache.maven.surefire.junitcore.pc.InvokerStrategy.schedule(InvokerStrategy.java:47) > at > org.apache.maven.surefire.junitcore.pc.Scheduler.schedule(Scheduler.java:316) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at org.junit.runners.Suite.runChild(Suite.java:128) > at org.junit.runners.Suite.runChild(Suite.java:27) > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > at > org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:345) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) > at > 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) > at java.base/java.lang.Thread.run(Thread.java:833){noformat} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-19369) [Analytics] Use XXHash32 for digest calculation of SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814508#comment-17814508 ] Francisco Guerrero edited comment on CASSANDRA-19369 at 2/5/24 9:16 PM: [~smiklosovic] Cassandra Analytics creates an SSTable during bulk writes. For each SSTable component generated we calculate the digest of each file (this includes the crc32 file), which is then uploaded. The purpose of this checksum is to protect the integrity of each of the SSTable component files during transmission from Spark executor to Cassandra Sidecar service, rather than the integrity of the data file. For data integrity, bulk writer does the following: - Checksums of each file generated - Re-read the generated SSTable file and ensure that what was written is the same as what we read. - Transfer the file with a checksum header - (On Sidecar) Validate that the checksum matches the uploaded file was (Author: frankgh): [~smiklosovic] Cassandra Analytics creates an SSTable during bulk writes. For each SSTable component generated we calculate the digest of each file (this includes the crc32 file), which is then uploaded. The purpose of this checksum is to protect the integrity of each of the SSTable component files, rather than the integrity of the data file. For data integrity, bulk writer does the following: - Checksums of each file generated - Re-read the generated SSTable file and ensure that what was written is the same as what we read. 
- Transfer the file with a checksum header - (On Sidecar) Validate that the checksum matches the uploaded file > [Analytics] Use XXHash32 for digest calculation of SSTables > --- > > Key: CASSANDRA-19369 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19369 > Project: Cassandra > Issue Type: Improvement > Components: Analytics Library >Reporter: Francisco Guerrero >Assignee: Francisco Guerrero >Priority: Normal > Time Spent: 10m > Remaining Estimate: 0h > > During bulk writes, Cassandra Analytics calculates the MD5 checksum of every > SSTable it produces. During SSTable upload to Cassandra Sidecar, Cassandra > Analytics includes the {{content-md5}} header as part of the upload request. > This information is used by Cassandra Sidecar to validate the integrity of > the uploaded SSTable and prevent issues with bit flips and corrupted SSTables. > Recently, Cassandra Sidecar introduced [support for additional checksum > validations|https://issues.apache.org/jira/browse/CASSANDRASC-97] during > SSTable upload. Notably the XXHash32 digest support was added which offers > more performant checksum calculations. This support now allows Cassandra > Analytics to use a more efficient digest algorithm that is friendlier on the > CPU usage of Sidecar and spark resources. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
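The writer/Sidecar handshake described in the comment (digest each component, send it as a header, recompute on receipt) can be sketched as below. MD5 matches the existing `content-md5` header; the XXHash32 variant the ticket proposes would need a library such as lz4-java and is omitted to keep the sketch dependency-free. `UploadDigestCheck` is an illustrative name, not Analytics or Sidecar code.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

public class UploadDigestCheck {
  // Base64(MD5(bytes)), the shape of a content-md5 header value.
  static String md5Base64(byte[] component) {
    try {
      MessageDigest md = MessageDigest.getInstance("MD5");
      return Base64.getEncoder().encodeToString(md.digest(component));
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e); // MD5 is always present in the JDK
    }
  }

  public static void main(String[] args) {
    byte[] component = "sstable-component-bytes".getBytes(StandardCharsets.UTF_8);

    // Writer side: digest computed before upload, sent as the header.
    String header = md5Base64(component);

    // Sidecar side: recompute over the received bytes and compare.
    System.out.println(md5Base64(component).equals(header)); // true: upload accepted

    // A bit flip in transit changes the digest and the upload is rejected.
    byte[] corrupted = component.clone();
    corrupted[0] ^= 1;
    System.out.println(md5Base64(corrupted).equals(header)); // false
  }
}
```

Swapping the algorithm changes only `md5Base64` and the header name; the compare-on-receipt flow stays the same, which is why moving to XXHash32 is mostly a CPU-cost question.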
[jira] [Commented] (CASSANDRA-19369) [Analytics] Use XXHash32 for digest calculation of SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814508#comment-17814508 ] Francisco Guerrero commented on CASSANDRA-19369: [~smiklosovic] Cassandra Analytics creates an SSTable during bulk writes. For each SSTable component generated we calculate the digest of each file (this includes the crc32 file), which is then uploaded. The purpose of this checksum is to protect the integrity of each of the SSTable component files, rather than the integrity of the data file. For data integrity, bulk writer does the following: - Checksums of each file generated - Re-read the generated SSTable file and ensure that what was written is the same as what we read. - Transfer the file with a checksum header - (On Sidecar) Validate that the checksum matches the uploaded file > [Analytics] Use XXHash32 for digest calculation of SSTables > --- > > Key: CASSANDRA-19369 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19369 > Project: Cassandra > Issue Type: Improvement > Components: Analytics Library >Reporter: Francisco Guerrero >Assignee: Francisco Guerrero >Priority: Normal > Time Spent: 10m > Remaining Estimate: 0h > > During bulk writes, Cassandra Analytics calculates the MD5 checksum of every > SSTable it produces. During SSTable upload to Cassandra Sidecar, Cassandra > Analytics includes the {{content-md5}} header as part of the upload request. > This information is used by Cassandra Sidecar to validate the integrity of > the uploaded SSTable and prevent issues with bit flips and corrupted SSTables. > Recently, Cassandra Sidecar introduced [support for additional checksum > validations|https://issues.apache.org/jira/browse/CASSANDRASC-97] during > SSTable upload. Notably the XXHash32 digest support was added which offers > more performant checksum calculations. 
This support now allows Cassandra > Analytics to use a more efficient digest algorithm that is friendlier on the > CPU usage of Sidecar and spark resources. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-19369) [Analytics] Use XXHash32 for digest calculation of SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814502#comment-17814502 ] Stefan Miklosovic commented on CASSANDRA-19369: --- Why is this actually needed at all? If you write an SSTable, there is a DIGEST component which computes the crc32 of the data file. Are analytics not supporting this too? Would it not make more sense to introduce a way to use checksum algorithms other than crc32 for data file integrity validation and then reuse it from analytics? > [Analytics] Use XXHash32 for digest calculation of SSTables > --- > > Key: CASSANDRA-19369 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19369 > Project: Cassandra > Issue Type: Improvement > Components: Analytics Library >Reporter: Francisco Guerrero >Assignee: Francisco Guerrero >Priority: Normal > Time Spent: 10m > Remaining Estimate: 0h > > During bulk writes, Cassandra Analytics calculates the MD5 checksum of every > SSTable it produces. During SSTable upload to Cassandra Sidecar, Cassandra > Analytics includes the {{content-md5}} header as part of the upload request. > This information is used by Cassandra Sidecar to validate the integrity of > the uploaded SSTable and prevent issues with bit flips and corrupted SSTables. > Recently, Cassandra Sidecar introduced [support for additional checksum > validations|https://issues.apache.org/jira/browse/CASSANDRASC-97] during > SSTable upload. Notably the XXHash32 digest support was added which offers > more performant checksum calculations. This support now allows Cassandra > Analytics to use a more efficient digest algorithm that is friendlier on the > CPU usage of Sidecar and spark resources. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
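The DIGEST-component check the comment refers to boils down to a checksum stored alongside the data file that readers recompute and compare. A minimal JDK-only illustration with `java.util.zip.CRC32` follows; the real on-disk Digest format is not reproduced here.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class DataFileCrc {
  // CRC32 over the full contents of a (here in-memory) data file.
  static long crc32(byte[] data) {
    CRC32 crc = new CRC32();
    crc.update(data);
    return crc.getValue();
  }

  public static void main(String[] args) {
    byte[] data = "data-file-contents".getBytes(StandardCharsets.UTF_8);
    long stored = crc32(data); // persisted next to the SSTable at write time
    System.out.println(crc32(data) == stored); // true: file unchanged since write
  }
}
```

The question raised in the comment is essentially whether this per-data-file mechanism should be generalized to other algorithms and reused, instead of Analytics maintaining its own per-component digests for upload.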
Re: [PR] CASSANDRA-16969 4.7.x license check [cassandra-java-driver]
michaelsembwever closed pull request #1786: CASSANDRA-16969 4.7.x license check URL: https://github.com/apache/cassandra-java-driver/pull/1786 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (CASSANDRA-19369) [Analytics] Use XXHash32 for digest calculation of SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francisco Guerrero updated CASSANDRA-19369: --- Test and Documentation Plan: Added unit tests. Integration tests pending Status: Patch Available (was: In Progress) PR: https://github.com/apache/cassandra-analytics/pull/38 > [Analytics] Use XXHash32 for digest calculation of SSTables > --- > > Key: CASSANDRA-19369 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19369 > Project: Cassandra > Issue Type: Improvement > Components: Analytics Library >Reporter: Francisco Guerrero >Assignee: Francisco Guerrero >Priority: Normal > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19369) [Analytics] Use XXHash32 for digest calculation of SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francisco Guerrero updated CASSANDRA-19369: --- Change Category: Performance Complexity: Low Hanging Fruit Component/s: Analytics Library Status: Open (was: Triage Needed) > [Analytics] Use XXHash32 for digest calculation of SSTables > --- > > Key: CASSANDRA-19369 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19369 > Project: Cassandra > Issue Type: Improvement > Components: Analytics Library >Reporter: Francisco Guerrero >Assignee: Francisco Guerrero >Priority: Normal > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[PR] CASSANDRA-19369 Use XXHash32 for digest calculation of SSTables [cassandra-analytics]
frankgh opened a new pull request, #38: URL: https://github.com/apache/cassandra-analytics/pull/38 This commit adds the ability to use the XXHash32 digest algorithm newly supported in Cassandra Sidecar. The commit retains backwards compatibility with MD5 checksumming, but it now defaults to XXHash32. A new Writer option is added:

```
.option(WriterOptions.DIGEST_TYPE.name(), "XXHASH32")
// or
.option(WriterOptions.DIGEST_TYPE.name(), "MD5")
```

This option defaults to XXHash32 when not provided, but it can be configured to use the legacy MD5 algorithm. Patch by Francisco Guerrero; Reviewed by TBD for CASSANDRA-19369 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
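The default-with-fallback behaviour the PR describes boils down to a validate-with-default option lookup. A minimal sketch, where only the option values `XXHASH32` and `MD5` come from the PR text and the helper name is an assumption:

```python
# Hedged sketch of resolving a writer option that defaults to XXHash32 but
# still accepts the legacy MD5 value for backwards compatibility.
VALID_DIGEST_TYPES = {"XXHASH32", "MD5"}

def resolve_digest_type(options: dict) -> str:
    # Defaults to XXHash32 when the option is not provided
    value = options.get("DIGEST_TYPE", "XXHASH32").upper()
    if value not in VALID_DIGEST_TYPES:
        raise ValueError(f"unsupported digest type: {value}")
    return value

print(resolve_digest_type({}))                       # default
print(resolve_digest_type({"DIGEST_TYPE": "MD5"}))   # legacy opt-in
```

Validating eagerly at option-resolution time (rather than at upload time) surfaces a bad configuration before any SSTables are produced.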
[jira] [Created] (CASSANDRA-19369) [Analytics] Use XXHash32 for digest calculation of SSTables
Francisco Guerrero created CASSANDRA-19369: -- Summary: [Analytics] Use XXHash32 for digest calculation of SSTables Key: CASSANDRA-19369 URL: https://issues.apache.org/jira/browse/CASSANDRA-19369 Project: Cassandra Issue Type: Improvement Reporter: Francisco Guerrero Assignee: Francisco Guerrero During bulk writes, Cassandra Analytics calculates the MD5 checksum of every SSTable it produces. During SSTable upload to Cassandra Sidecar, Cassandra Analytics includes the {{content-md5}} header as part of the upload request. This information is used by Cassandra Sidecar to validate the integrity of the uploaded SSTable and prevent issues with bit flips and corrupted SSTables. Recently, Cassandra Sidecar introduced [support for additional checksum validations|https://issues.apache.org/jira/browse/CASSANDRASC-97] during SSTable upload. Notably, XXHash32 digest support was added, which offers more performant checksum calculations. This support now allows Cassandra Analytics to use a more efficient digest algorithm that is friendlier on the CPU usage of Sidecar and Spark resources. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
Re: [PR] CASSANDRA-19180: Support reloading keystore in cassandra-java-driver [cassandra-java-driver]
absurdfarce commented on code in PR #1907: URL: https://github.com/apache/cassandra-java-driver/pull/1907#discussion_r1478784272 ## core/src/main/java/com/datastax/oss/driver/internal/core/ssl/ReloadingKeyManagerFactory.java: ## @@ -0,0 +1,253 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */
+package com.datastax.oss.driver.internal.core.ssl;
+
+import com.datastax.oss.driver.shaded.guava.common.annotations.VisibleForTesting;
+import java.io.ByteArrayInputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.Socket;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.security.KeyStore;
+import java.security.KeyStoreException;
+import java.security.MessageDigest;
+import java.security.NoSuchAlgorithmException;
+import java.security.Principal;
+import java.security.PrivateKey;
+import java.security.Provider;
+import java.security.UnrecoverableKeyException;
+import java.security.cert.CertificateException;
+import java.security.cert.X509Certificate;
+import java.time.Duration;
+import java.util.Arrays;
+import java.util.concurrent.Executors;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicReference;
+import javax.net.ssl.KeyManager;
+import javax.net.ssl.KeyManagerFactory;
+import javax.net.ssl.KeyManagerFactorySpi;
+import javax.net.ssl.ManagerFactoryParameters;
+import javax.net.ssl.SSLEngine;
+import javax.net.ssl.X509ExtendedKeyManager;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class ReloadingKeyManagerFactory extends KeyManagerFactory implements AutoCloseable {
+  private static final Logger logger = LoggerFactory.getLogger(ReloadingKeyManagerFactory.class);
+  private static final String KEYSTORE_TYPE = "JKS";
+  private Path keystorePath;
+  private String keystorePassword;
+  private ScheduledExecutorService executor;
+  private final Spi spi;
+
+  // We're using a single thread executor so this shouldn't need to be volatile, since all updates
+  // to lastDigest should come from the same thread
+  private volatile byte[] lastDigest;
+
+  /**
+   * Create a new {@link ReloadingKeyManagerFactory} with the given keystore file and password,
+   * reloading from the file's content at the given interval. This function will do an initial
+   * reload before returning, to confirm that the file exists and is readable.
+   *
+   * @param keystorePath the keystore file to reload
+   * @param keystorePassword the keystore password
+   * @param reloadInterval the duration between reload attempts. Set to {@link
+   *     java.time.Duration#ZERO} to disable scheduled reloading.
+   * @return
+   */
+  public static ReloadingKeyManagerFactory create(
+      Path keystorePath, String keystorePassword, Duration reloadInterval)
+      throws UnrecoverableKeyException, KeyStoreException, NoSuchAlgorithmException,
+          CertificateException, IOException {
+    KeyManagerFactory kmf = KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
+
+    KeyStore ks;
+    try (InputStream ksf = Files.newInputStream(keystorePath)) {
+      ks = KeyStore.getInstance(KEYSTORE_TYPE);
+      ks.load(ksf, keystorePassword.toCharArray());
+    }
+    kmf.init(ks, keystorePassword.toCharArray());
+
+    ReloadingKeyManagerFactory reloadingKeyManagerFactory = new ReloadingKeyManagerFactory(kmf);
+    reloadingKeyManagerFactory.start(keystorePath, keystorePassword, reloadInterval);
+    return reloadingKeyManagerFactory;
+  }
+
+  @VisibleForTesting
+  protected ReloadingKeyManagerFactory(KeyManagerFactory initial) {
+    this(
+        new Spi((X509ExtendedKeyManager) initial.getKeyManagers()[0]),
+        initial.getProvider(),
+        initial.getAlgorithm());
+  }
+
+  private ReloadingKeyManagerFactory(Spi spi, Provider provider, String algorithm) {
+    super(spi, provider, algorithm);
+    this.spi = spi;
+  }
+
+  private void start(Path keystorePath, String keystorePassword, Duration reloadInterval) {
+    this.keystorePath = keystorePath;
+    this.keystorePassword = keystorePassword;
+
+    // Ensure that reload is called once synchronously, to make sure t
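The reloading logic under review keeps a digest of the last-loaded keystore (`lastDigest`) and skips the reload when the file content is unchanged. A minimal Python sketch of that reload-on-digest-change pattern, with hypothetical names — the driver's actual implementation differs and rebuilds a real `KeyManager`:

```python
import hashlib
import tempfile
from pathlib import Path

class ReloadingKeystore:
    """Illustrative sketch: reload key material only when the file's digest changes."""

    def __init__(self, path: Path):
        self.path = path
        self.last_digest = None
        self.reloads = 0
        self.maybe_reload()  # initial load confirms the file exists and is readable

    def maybe_reload(self) -> bool:
        digest = hashlib.sha256(self.path.read_bytes()).digest()
        if digest == self.last_digest:
            return False  # unchanged content: skip the expensive reload
        self.last_digest = digest
        self.reloads += 1  # real code would rebuild the KeyManager here
        return True

with tempfile.TemporaryDirectory() as d:
    ks = Path(d) / "keystore.jks"
    ks.write_bytes(b"v1")
    store = ReloadingKeystore(ks)     # initial load counts as one reload
    store.maybe_reload()              # same content: no-op
    ks.write_bytes(b"v2")
    store.maybe_reload()              # content changed: reloads again
```

A scheduled executor would call `maybe_reload` at the configured interval; comparing digests rather than mtimes makes the check robust to files rewritten with identical content.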
[jira] [Commented] (CASSANDRA-19367) Refactor SAI so the selection of the index type is not scattered to multiple places
[ https://issues.apache.org/jira/browse/CASSANDRA-19367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814491#comment-17814491 ] David Capwell commented on CASSANDRA-19367: --- Here are the places isVector is called: * org.apache.cassandra.index.sai.memory.MemtableIndexManager#update - I don't need "update", but not sure why we even special case vector here? removing "index" in favor of "update" seems fine to me. I also see we don't update "memtableIndexWriteLatency" in the vector case... so cleaning that up would fix this metric? * org.apache.cassandra.index.sai.plan.StorageAttachedIndexQueryPlan#StorageAttachedIndexQueryPlan - this is checking if any index is a vector; if so we are "top-k"... That's super specific, but mostly ignored in my POC as I query SAI at a lower level than CQL, so I avoid post filtering and loading the partition/row * org.apache.cassandra.index.sai.disk.v1.V1OnDiskFormat#perColumnIndexComponents - Just saw that my POC adds accord here, but I missed refactoring this to Strat... still need to do that for this patch * org.apache.cassandra.index.sai.disk.v1.IndexWriterConfig#fromOptions - would be nice to leverage Strat here, but don't need it for my use case as it's an internal table with an internal index... I validate whatever you try to construct for the index * Several cases in StorageAttachedIndex and IndexTermType I need to fix the v1 format as that does impact my POC, but open to other places depending on feedback > Refactor SAI so the selection of the index type is not scattered to multiple > places > --- > > Key: CASSANDRA-19367 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19367 > Project: Cassandra > Issue Type: Improvement > Components: Feature/2i Index >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Fix For: 5.x > > Time Spent: 10m > Remaining Estimate: 0h > > For Accord we want to write an internal index, and we are finding that plugging into SAI > is a bit more challenging than it could be… we need to find multiple places > where the SAI code infers the index type so it can delegate… this logic > should be done once and made pluggable so custom SAI indexes can be defined -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
Re: [PR] CASSANDRA-19180: Support reloading keystore in cassandra-java-driver [cassandra-java-driver]
absurdfarce commented on code in PR #1907: URL: https://github.com/apache/cassandra-java-driver/pull/1907#discussion_r1478776793 ## upgrade_guide/README.md: ## @@ -19,6 +19,17 @@ under the License. ## Upgrade guide +### NEW VERSION PLACEHOLDER Review Comment: Your suggestion seems like a pretty reasonable approach to me @aratno . In the past we'd usually just set the placeholder to whatever we thought the next version would be (knowing full well it might be changed as things moved along) but I have no objection to just leaving a placeholder in the doc. Part of the release checklist could then become "update the placeholder to the correct version string". -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (CASSANDRA-19367) Refactor SAI so the selection of the index type is not scattered to multiple places
[ https://issues.apache.org/jira/browse/CASSANDRA-19367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814488#comment-17814488 ] David Capwell commented on CASSANDRA-19367: --- What we do in Accord at the moment is
{code}
public class RoutingKeyIndex extends StorageAttachedIndex
{
    public RoutingKeyIndex(ColumnFamilyStore baseCfs, IndexMetadata indexMetadata)
    {
        super(baseCfs, indexMetadata);
    }

    @Override
    protected Strategy createStrategy(ColumnFamilyStore baseCfs, IndexMetadata indexMetadata, IndexTermType indexTermType, IndexIdentifier indexIdentifier)
    {
        if (!baseCfs.getKeyspaceName().equals(SchemaConstants.ACCORD_KEYSPACE_NAME))
            throw new IllegalArgumentException("Attempted to use an internal index on the wrong table: " + baseCfs.metadata());
        return new AbstractStrategy(this)
        {
            @Override
            public MemoryIndex createMemoryIndex()
            {
                return new RoutingKeyMemoryIndex(index);
            }

            @Override
            public Flusher flusher()
            {
                return (memtable, indexDescriptor, rowMapping) -> {
                    RoutingKeyMemoryIndex index = memtable.getBacking();
                    SegmentMetadata.ComponentMetadataMap metadataMap = index.writeDirect(indexDescriptor, indexIdentifier, rowMapping::get);
                    return new SegmentMetadata(0, rowMapping.size(), 0, rowMapping.maxSSTableRowId, rowMapping.minKey, rowMapping.maxKey, index.getMinTerm(), index.getMaxTerm(), metadataMap);
                };
            }

            @Override
            public SegmentBuilder createSegmentBuilder(NamedMemoryLimiter limiter)
            {
                return new AccordRangeSegmentBuilder(index, limiter);
            }

            @Override
            public IndexSegmentSearcher createSearcher(PrimaryKeyMap.Factory primaryKeyMapFactory, PerColumnIndexFiles indexFiles, SegmentMetadata segmentMetadata) throws IOException
            {
                return new RoutingKeyDiskIndexSegmentSearcher(primaryKeyMapFactory, indexFiles, segmentMetadata, index);
            }
        };
    }
}
{code}
{code}
public static final TableMetadata Commands =
    parse(COMMANDS,
          "accord commands",
          "CREATE TABLE %s ("
          + "store_id int,"
          + "domain int," // this is stored as part of txn_id, used currently for cheaper scans of the table
          + format("txn_id %s,", TIMESTAMP_TUPLE)
          ...
          + "route blob,"
          ...
          + "PRIMARY KEY((store_id, domain, txn_id))"
          + ')')
    .partitioner(new LocalPartitioner(CompositeType.getInstance(Int32Type.instance, Int32Type.instance, TIMESTAMP_TYPE)))
    .indexes(Indexes.builder()
                    .add(IndexMetadata.fromSchemaMetadata("route", IndexMetadata.Kind.CUSTOM,
                                                          ImmutableMap.of("class_name", RoutingKeyIndex.class.getCanonicalName(),
                                                                          "target", "route")))
                    .build())
    .build();
{code}
> Refactor SAI so the selection of the index type is not scattered to multiple > places > --- > > Key: CASSANDRA-19367 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19367 > Project: Cassandra > Issue Type: Improvement > Components: Feature/2i Index >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Fix For: 5.x > > Time Spent: 10m > Remaining Estimate: 0h > > For Accord we want to write an internal index, and we are finding that plugging into SAI > is a bit more challenging than it could be… we need to find multiple places > where the SAI code infers the index type so it can delegate… this logic > should be done once and made pluggable so custom SAI indexes can be defined -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19368) Add way for SAI to disable row to token index so internal tables may leverage SAI
[ https://issues.apache.org/jira/browse/CASSANDRA-19368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-19368: -- Change Category: Semantic Complexity: Normal Fix Version/s: 5.x Status: Open (was: Triage Needed) > Add way for SAI to disable row to token index so internal tables may leverage > SAI > - > > Key: CASSANDRA-19368 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19368 > Project: Cassandra > Issue Type: Improvement > Components: Feature/2i Index >Reporter: David Capwell >Priority: Normal > Fix For: 5.x > > > Internal tables tend to use LocalPartitioner and may not actually have murmur > tokens but rather LocalPartitioner, which is variable length bytes tokens! > For internal use cases we don’t always care about paging so don’t really need > this index to function. > The use case motivating this work is for Accord, we wish to add a custom SAI > index on the system_accord.commands#routes column. Since this logic is > purely internal we don’t care about paging, but can not leverage SAI at this > moment as it hard codes murmur tokens, and fails during memtable flush -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-19368) Add way for SAI to disable row to token index so internal tables may leverage SAI
David Capwell created CASSANDRA-19368: - Summary: Add way for SAI to disable row to token index so internal tables may leverage SAI Key: CASSANDRA-19368 URL: https://issues.apache.org/jira/browse/CASSANDRA-19368 Project: Cassandra Issue Type: Improvement Components: Feature/2i Index Reporter: David Capwell Internal tables tend to use LocalPartitioner and may not actually have murmur tokens but rather LocalPartitioner tokens, which are variable-length byte tokens! For internal use cases we don't always care about paging, so we don't really need this index to function. The use case motivating this work is Accord: we wish to add a custom SAI index on the system_accord.commands#routes column. Since this logic is purely internal we don't care about paging, but cannot leverage SAI at this moment as it hard codes murmur tokens and fails during memtable flush -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-19367) Refactor SAI so the selection of the index type is not scattered to multiple places
[ https://issues.apache.org/jira/browse/CASSANDRA-19367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814487#comment-17814487 ] David Capwell commented on CASSANDRA-19367: --- [~mike_tr_adamson] just sent out a patch. I didn't do IndexTermType as that's kinda annoying for Accord... we are a "blob" but really we want to have custom/internal logic and try to hide the fact it's a blob from SAI > Refactor SAI so the selection of the index type is not scattered to multiple > places > --- > > Key: CASSANDRA-19367 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19367 > Project: Cassandra > Issue Type: Improvement > Components: Feature/2i Index >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Fix For: 5.x > > Time Spent: 10m > Remaining Estimate: 0h > > For Accord we want to write an internal index, and we are finding that plugging into SAI > is a bit more challenging than it could be… we need to find multiple places > where the SAI code infers the index type so it can delegate… this logic > should be done once and made pluggable so custom SAI indexes can be defined -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19367) Refactor SAI so the selection of the index type is not scattered to multiple places
[ https://issues.apache.org/jira/browse/CASSANDRA-19367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-19367: -- Test and Documentation Plan: existing tests Status: Patch Available (was: Open) > Refactor SAI so the selection of the index type is not scattered to multiple > places > --- > > Key: CASSANDRA-19367 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19367 > Project: Cassandra > Issue Type: Improvement > Components: Feature/2i Index >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Fix For: 5.x > > Time Spent: 10m > Remaining Estimate: 0h > > For Accord we want to write an internal index, and we are finding that plugging into SAI > is a bit more challenging than it could be… we need to find multiple places > where the SAI code infers the index type so it can delegate… this logic > should be done once and made pluggable so custom SAI indexes can be defined -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics
[ https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Miklosovic updated CASSANDRA-19366: -- Fix Version/s: 5.x (was: 5.1) > Expose mode of authentication in system_views.clients, nodetool clientstats, > and ClientMetrics > -- > > Key: CASSANDRA-19366 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19366 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Encryption, Messaging/Client, Observability/JMX, > Observability/Metrics, Tool/nodetool >Reporter: Andy Tolbert >Assignee: Andy Tolbert >Priority: Normal > Fix For: 5.x > > Attachments: CASSANDRA-19366-trunk-1_test_results.tgz, > CASSANDRA-19366-trunk-1_test_results_summary.html > > > CASSANDRA-18554 added support for mTLS-authenticated clients. Part of this > contribution introduced {{{}MutualTlsWithPasswordFallbackAuthenticator{}}}, > which enables Cassandra to support either password- or mTLS-authenticated > connections. > As an operator, it would be useful to know which connections are mTLS > authenticated, and which are password authenticated, as a possible mode of > operation is migrating users from one form of authentication to another. It > would also be useful to know, when authentication attempts are failing, > which mode of authentication is unsuccessful. > Proposing to add the following: > * Add a {{mode: string}} and {{metadata: map}} to > {{{}AuthenticatedUser{}}}. Update existing {{IAuthenticator}} implementations > to pass {{mode}} (e.g. {{password}} , {{{}mtls{}}}), and optionally pass a > {{metadata}} map (e.g. this can include the extracted {{identity}} from a > client certificate for {{mtls}} authentication). > * Update nodetool clientstats to add a new option flag {{{}--metadata{}}}, > which when passed exposes these new fields on {{{}AuthenticatedUser{}}}. (Not > added to existing output to maintain compatibility, much like > {{--client-options}} did.)
> * Update {{system_views.clients}} to include columns for these new fields. > * Add new metrics to {{{}ClientMetrics{}}}: > ** Track authentication success and failures by mode. (Note: The metrics > present by authentication mode scope are contextual based on the > Authenticator used (e.g. only {{scope=Password}} will be present for > {{{}PasswordAuthenticator{}}}) > {noformat} > Existing: > org.apache.cassandra.metrics:name=AuthSuccess,type=Client > org.apache.cassandra.metrics:name=AuthFailure,type=Client > New: > org.apache.cassandra.metrics:name=AuthSuccess,scope=Mtls,type=Client > org.apache.cassandra.metrics:name=AuthSuccess,scope=Password,type=Client > org.apache.cassandra.metrics:name=AuthFailure,scope=Mtls,type=Client > org.apache.cassandra.metrics:name=AuthFailure,scope=Password,type=Client > {noformat} > * > ** Track connection counts by mode: > {noformat} > Existing: > org.apache.cassandra.metrics:name=ConnectedNativeClients,type=Client > org.apache.cassandra.metrics:name=connectedNativeClients,type=Client > (previously deprecated but still maintained) > New: > org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Mtls,type=Client > org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Password,type=Client > {noformat} > * > ** A metric to track encrypted vs. non-encrypted connections: > {noformat} > org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Encrypted,type=Client > org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Unencrypted,type=Client > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
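The per-scope metric names proposed in this ticket follow a fixed JMX-style pattern. A small sketch of composing them — the name format mirrors the JMX names quoted above, but the helper itself is hypothetical, not Cassandra code:

```python
# Hedged sketch: build a Client metric name with an optional scope segment,
# matching the org.apache.cassandra.metrics object-name pattern quoted above.
def client_metric_name(name: str, scope=None) -> str:
    parts = [f"name={name}"]
    if scope:
        parts.append(f"scope={scope}")
    parts.append("type=Client")
    return "org.apache.cassandra.metrics:" + ",".join(parts)

# Existing unscoped metric and the proposed per-mode variants
print(client_metric_name("AuthSuccess"))
print(client_metric_name("AuthSuccess", "Mtls"))
print(client_metric_name("AuthFailure", "Password"))
```

Keeping the unscoped metrics alongside the scoped ones, as the ticket proposes, means existing dashboards keep working while per-mode breakdowns become available.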
[jira] [Commented] (CASSANDRA-19362) An "include" is broken on the Storage Engine documentation page
[ https://issues.apache.org/jira/browse/CASSANDRA-19362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814486#comment-17814486 ] Lorina Poland commented on CASSANDRA-19362: --- The link is broken in versions 4.1, 4.0, and 3.11 (but not 5.0) because the include is not correct. The correct include is: {code:java} include::cassandra:example$BASH/find_sstables.sh[]{code} > An "include" is broken on the Storage Engine documentation page > --- > > Key: CASSANDRA-19362 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19362 > Project: Cassandra > Issue Type: Bug > Components: Documentation >Reporter: Jeremy Hanna >Assignee: Lorina Poland >Priority: Normal > > The example code at the bottom of the "Storage Engine" page doesn't appear to > be including the code properly. See > https://cassandra.apache.org/doc/stable/cassandra/architecture/storage_engine.html#example-code -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-19367) Refactor SAI so the selection of the index type is not scattered to multiple places
[ https://issues.apache.org/jira/browse/CASSANDRA-19367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814485#comment-17814485 ] Mike Adamson commented on CASSANDRA-19367: -- The obvious place for this would be in {{IndexTermType}} where we could replace {{isLiteral}} and {{isVector}} with a {{getStrategy}} (or some such). The {{Strategy}} would then need to handle all the conditionals where the above methods are used. Apart from anything else, this would tidy a lot of the current code paths where we are constantly checking for the index type. > Refactor SAI so the selection of the index type is not scattered to multiple > places > --- > > Key: CASSANDRA-19367 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19367 > Project: Cassandra > Issue Type: Improvement > Components: Feature/2i Index >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Fix For: 5.x > > > For Accord we want to write an internal index, and we are finding that plugging into SAI > is a bit more challenging than it could be… we need to find multiple places > where the SAI code infers the index type so it can delegate… this logic > should be done once and made pluggable so custom SAI indexes can be defined -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
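The refactor Mike describes — replacing scattered `isLiteral()`/`isVector()` conditionals with a single strategy lookup — is the classic strategy pattern. A hedged sketch with illustrative class names (not the actual SAI API), showing the one place that decides the index type:

```python
# Hedged sketch of a getStrategy-style lookup: decide the index type once,
# then let call sites delegate to the strategy instead of re-checking flags.
class NumericStrategy:
    def describe(self): return "numeric"

class LiteralStrategy:
    def describe(self): return "literal"

class VectorStrategy:
    def describe(self): return "vector"

def strategy_for(term_type: str):
    # The single decision point; a custom index (e.g. Accord's) could
    # override this instead of SAI special-casing its term type everywhere.
    strategies = {
        "literal": LiteralStrategy,
        "vector": VectorStrategy,
    }
    return strategies.get(term_type, NumericStrategy)()

print(strategy_for("vector").describe())
print(strategy_for("int").describe())
```

Call sites then ask the strategy for behavior (memory index, flusher, searcher) rather than branching on `isVector`, which is exactly what the `createStrategy` override in the Accord example elsewhere on this ticket does.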
[jira] [Comment Edited] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics
[ https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814482#comment-17814482 ] Stefan Miklosovic edited comment on CASSANDRA-19366 at 2/5/24 6:49 PM: --- I did the first pass of the PR (minus tests) was (Author: smiklosovic): I did the first pass of the PR. > Expose mode of authentication in system_views.clients, nodetool clientstats, > and ClientMetrics > -- > > Key: CASSANDRA-19366 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19366 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Encryption, Messaging/Client, Observability/JMX, > Observability/Metrics, Tool/nodetool >Reporter: Andy Tolbert >Assignee: Andy Tolbert >Priority: Normal > Fix For: 5.1 > > Attachments: CASSANDRA-19366-trunk-1_test_results.tgz, > CASSANDRA-19366-trunk-1_test_results_summary.html > > > CASSANDRA-18554 added support for mTLS-authenticated clients. Part of this > contribution introduced {{{}MutualTlsWithPasswordFallbackAuthenticator{}}}, > which enables Cassandra to support either password and mTLS-authenticated > connections. > As an operator, it would be useful to know which connections are mTLS > authenticated, and which are password authenticated, as a possible mode of > operation is migrating users from one from of authentication to another. It > would also be useful to know if that if authentication attempts are failing > which mode of authentication is unsuccessful. > Proposing to add the following: > * Add a {{mode: string}} and {{metadata: map}} to > {{{}AuthenticatedUser{}}}. Update existing {{IAuthenticator}} implementations > to pass {{mode}} (e.g. {{password}} , {{{}mtls{}}}), and optionally pass a > {{metadata}} map (e.g. this can include the extracted {{identity}} from a > client certificate for {{mtls}} authentication). 
> * Update nodetool clientstats to add a new option flag {{{}--metadata{}}}, > which when passed exposes these new fields on {{{}AuthenticatedUser{}}}. (Not > added to existing output to maintain compatibility, much like > {{--client-options}} did.) > * Update {{system_views.clients}} to include columns for these new fields. > * Add new metrics to {{{}ClientMetrics{}}}: > ** Track authentication successes and failures by mode. (Note: the metrics > present by authentication mode scope are contextual, based on the > Authenticator used; e.g. only {{scope=Password}} will be present for > {{{}PasswordAuthenticator{}}}.) > {noformat} > Existing: > org.apache.cassandra.metrics:name=AuthSuccess,type=Client > org.apache.cassandra.metrics:name=AuthFailure,type=Client > New: > org.apache.cassandra.metrics:name=AuthSuccess,scope=Mtls,type=Client > org.apache.cassandra.metrics:name=AuthSuccess,scope=Password,type=Client > org.apache.cassandra.metrics:name=AuthFailure,scope=Mtls,type=Client > org.apache.cassandra.metrics:name=AuthFailure,scope=Password,type=Client > {noformat} > * > ** Track connection counts by mode: > {noformat} > Existing: > org.apache.cassandra.metrics:name=ConnectedNativeClients,type=Client > org.apache.cassandra.metrics:name=connectedNativeClients,type=Client > (previously deprecated but still maintained) > New: > org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Mtls,type=Client > org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Password,type=Client > {noformat} > * > ** A metric to track encrypted vs. non-encrypted connections: > {noformat} > org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Encrypted,type=Client > org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Unencrypted,type=Client > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
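A rough sketch of how the proposed per-scope client metric names compose. This is illustrative only: `ScopedClientMetricNames` and its helper are hypothetical, not actual Cassandra classes (in-tree, `CassandraMetricsRegistry` owns metric naming); the key order `name,scope,type` follows the ticket's examples.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper showing how the proposed scoped JMX names compose.
public class ScopedClientMetricNames
{
    static final String DOMAIN = "org.apache.cassandra.metrics";

    // e.g. clientMetric("AuthSuccess", "Mtls") ->
    //   org.apache.cassandra.metrics:name=AuthSuccess,scope=Mtls,type=Client
    static ObjectName clientMetric(String name, String scope)
    {
        StringBuilder sb = new StringBuilder(DOMAIN).append(":name=").append(name);
        if (scope != null)
            sb.append(",scope=").append(scope); // Mtls, Password, Encrypted, ...
        sb.append(",type=Client");
        try
        {
            return new ObjectName(sb.toString());
        }
        catch (MalformedObjectNameException e)
        {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args)
    {
        // Existing, unscoped metric:
        System.out.println(clientMetric("AuthSuccess", null));
        // New, per-mode metric:
        System.out.println(clientMetric("AuthSuccess", "Mtls"));
    }
}
```

Since `name`, `scope`, and `type` are already in alphabetical order, these names are identical to their JMX canonical form.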
[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics
[ https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Miklosovic updated CASSANDRA-19366: -- Reviewers: Stefan Miklosovic Status: Review In Progress (was: Patch Available) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics
[ https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Miklosovic updated CASSANDRA-19366: -- Status: Changes Suggested (was: Review In Progress) I did the first pass of the PR. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-19254) "comments" keyword on docs page should be "comment"
[ https://issues.apache.org/jira/browse/CASSANDRA-19254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814479#comment-17814479 ] Lorina Poland commented on CASSANDRA-19254: --- Rolling into CASSANDRA-19249, since it is a minor issue. > "comments" keyword on docs page should be "comment" > --- > > Key: CASSANDRA-19254 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19254 > Project: Cassandra > Issue Type: Bug > Components: Documentation/Website >Reporter: Stefano Lottini >Assignee: Lorina Poland >Priority: Low > > Low-priority nitpick: the CREATE TABLE [docs > page|https://cassandra.apache.org/doc/latest/cassandra/reference/cql-commands/create-table.html#table_options] > has > {{comments = 'some text that describes the table'}} > with plural `comments`, while the correct keyword to use is `comment` > (singular). > Using the plural form would result in the following error when running the > DDL statement: _SyntaxException: Unknown property 'comments'_ -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19367) Refactor SAI so the selection of the index type is not scattered to multiple places
[ https://issues.apache.org/jira/browse/CASSANDRA-19367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-19367: -- Change Category: Code Clarity Complexity: Low Hanging Fruit Fix Version/s: 5.x Status: Open (was: Triage Needed) > Refactor SAI so the selection of the index type is not scattered to multiple > places > --- > > Key: CASSANDRA-19367 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19367 > Project: Cassandra > Issue Type: Improvement > Components: Feature/2i Index >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Fix For: 5.x > > > For Accord we want to write an internal index, and plugging into SAI > is a bit more challenging than it could be… there are multiple places > where the SAI code “infers” the index type so it can delegate… this logic > should be done once and made pluggable so custom SAI indexes can be defined -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-19367) Refactor SAI so the selection of the index type is not scattered to multiple places
David Capwell created CASSANDRA-19367: - Summary: Refactor SAI so the selection of the index type is not scattered to multiple places Key: CASSANDRA-19367 URL: https://issues.apache.org/jira/browse/CASSANDRA-19367 Project: Cassandra Issue Type: Improvement Components: Feature/2i Index Reporter: David Capwell Assignee: David Capwell For Accord we want to write an internal index, and plugging into SAI is a bit more challenging than it could be… there are multiple places where the SAI code “infers” the index type so it can delegate… this logic should be done once and made pluggable so custom SAI indexes can be defined -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
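One possible shape for the "done once and made pluggable" selection described in the ticket. This is a sketch only: `IndexTypeSelector`, the column-kind strings, and the index-type names below are invented for illustration and are not actual SAI types.

```java
import java.util.List;

// Sketch: centralize the "infer the index type" decision behind a small
// pluggable interface instead of scattering it across the SAI code.
public class IndexTypeSelection
{
    interface IndexTypeSelector
    {
        boolean supports(String columnKind); // e.g. "vector", "literal" (illustrative)
        String indexType();                  // which index implementation to delegate to
    }

    // Hypothetical built-in selectors; a custom (e.g. Accord-internal) selector
    // could be appended to the list by a plugin.
    static final IndexTypeSelector VECTOR = new IndexTypeSelector()
    {
        public boolean supports(String kind) { return kind.equals("vector"); }
        public String indexType() { return "VectorIndex"; }
    };
    static final IndexTypeSelector LITERAL = new IndexTypeSelector()
    {
        public boolean supports(String kind) { return kind.equals("literal"); }
        public String indexType() { return "LiteralIndex"; }
    };

    // The single place that decides: first registered selector that claims
    // the column kind wins.
    static String select(List<IndexTypeSelector> selectors, String columnKind)
    {
        for (IndexTypeSelector s : selectors)
            if (s.supports(columnKind))
                return s.indexType();
        throw new IllegalArgumentException("no index type registered for " + columnKind);
    }

    public static void main(String[] args)
    {
        System.out.println(select(List.of(VECTOR, LITERAL), "vector"));
    }
}
```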
[jira] [Commented] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics
[ https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814475#comment-17814475 ] Andy Tolbert commented on CASSANDRA-19366: -- awesome, thank you [~smiklosovic] ! I see you had some feedback already, appreciate you taking a look, I'll take a look at your comments and make changes this afternoon. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-19182) IR may leak SSTables with pending repair when coming from streaming
[ https://issues.apache.org/jira/browse/CASSANDRA-19182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814476#comment-17814476 ] David Capwell commented on CASSANDRA-19182: --- [~e.dimitrova] issues with CI... since this goes back to 4.0 we need our CI working correctly and it has been having issues with 4.0... so merging this got put on hold until that's fixed =( > IR may leak SSTables with pending repair when coming from streaming > --- > > Key: CASSANDRA-19182 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19182 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x > > Attachments: > ci_summary-trunk-a1010f4101bf259de3f31077540e4f987d5df9c5.html > > Time Spent: 1h 40m > Remaining Estimate: 0h > > There is a race condition where SSTables from streaming may race with pending > repair cleanup in compaction, causing us to clean up the pending repair state > in compaction while the SSTables are being added to it; this leads to IR > failing in the future when those files get selected for repair. > This problem was hard to track down as the in-memory state was wiped, so we > don’t have any details. To better aid these types of investigation we should > make sure the repair vtables get updated when IR session failures are > submitted -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
(cassandra-sidecar) branch trunk updated: ninja fix: update CHANGES for ee454741
This is an automated email from the ASF dual-hosted git repository. ycai pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra-sidecar.git The following commit(s) were added to refs/heads/trunk by this push: new c72f217 ninja fix: update CHANGES for ee454741 c72f217 is described below commit c72f2179143e7e031f247d3e8385a29c5e64c1c3 Author: Yifan Cai <52585731+yifa...@users.noreply.github.com> AuthorDate: Mon Feb 5 10:33:27 2024 -0800 ninja fix: update CHANGES for ee454741 --- CHANGES.txt | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGES.txt b/CHANGES.txt index 1c3a6f8..85dc0d5 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,5 +1,6 @@ 1.0.0 - + * Break restore job into stage and import phases and persist restore slice status on phase completion (CASSANDRASC-99) * Improve logging for traffic shaping / rate limiting configuration (CASSANDRASC-98) * Startup Validation Failures when Checking Sidecar Connectivity (CASSANDRASC-86) * Add support for additional digest validation during SSTable upload (CASSANDRASC-97) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRASC-99) Break restore job into stage and import phases and persist restore slice status on phase completion
[ https://issues.apache.org/jira/browse/CASSANDRASC-99?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yifan Cai updated CASSANDRASC-99: - Fix Version/s: 1.0 Source Control Link: https://github.com/apache/cassandra-sidecar/commit/ee454741363f3f693726af242c5ec37ad1480d60 Resolution: Fixed Status: Resolved (was: Ready to Commit) > Break restore job into stage and import phases and persist restore slice > status on phase completion > --- > > Key: CASSANDRASC-99 > URL: https://issues.apache.org/jira/browse/CASSANDRASC-99 > Project: Sidecar for Apache Cassandra > Issue Type: Improvement > Components: Rest API >Reporter: Yifan Cai >Assignee: Yifan Cai >Priority: Normal > Labels: pull-request-available > Fix For: 1.0 > > > In order to improve resilience of the restore sstables from s3 tasks, we want > to break the task into multiple phases and persist the status of each slice. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRASC-99) Break restore job into stage and import phases and persist restore slice status on phase completion
[ https://issues.apache.org/jira/browse/CASSANDRASC-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814473#comment-17814473 ] ASF subversion and git services commented on CASSANDRASC-99: Commit ee454741363f3f693726af242c5ec37ad1480d60 in cassandra-sidecar's branch refs/heads/trunk from Yifan Cai [ https://gitbox.apache.org/repos/asf?p=cassandra-sidecar.git;h=ee45474 ] CASSANDRASC-99 Break restore job into stage and import phases and persist restore slice status on phase completion patch by Yifan Cai; reviewed by Doug Rohrer, Francisco Guerrero for CASSANDRASC-99 > Break restore job into stage and import phases and persist restore slice > status on phase completion > --- > > Key: CASSANDRASC-99 > URL: https://issues.apache.org/jira/browse/CASSANDRASC-99 > Project: Sidecar for Apache Cassandra > Issue Type: Improvement > Components: Rest API >Reporter: Yifan Cai >Assignee: Yifan Cai >Priority: Normal > Labels: pull-request-available > > In order to improve resilience of the restore sstables from s3 tasks, we want > to break the task into multiple phases and persist the status of each slice. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
(cassandra-sidecar) branch trunk updated: CASSANDRASC-99 Break restore job into stage and import phases and persist restore slice status on phase completion
This is an automated email from the ASF dual-hosted git repository. ycai pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra-sidecar.git The following commit(s) were added to refs/heads/trunk by this push: new ee45474 CASSANDRASC-99 Break restore job into stage and import phases and persist restore slice status on phase completion ee45474 is described below commit ee454741363f3f693726af242c5ec37ad1480d60 Author: Yifan Cai AuthorDate: Mon Jan 29 16:09:25 2024 -0800 CASSANDRASC-99 Break restore job into stage and import phases and persist restore slice status on phase completion patch by Yifan Cai; reviewed by Doug Rohrer, Francisco Guerrero for CASSANDRASC-99 --- .../data/CreateRestoreJobRequestPayload.java | 28 ++- .../sidecar/common/data/RestoreJobConstants.java | 1 + .../sidecar/common/data/RestoreJobStatus.java | 1 + .../sidecar/common/data/RestoreSliceStatus.java| 37 ++- .../data/CreateRestoreJobRequestPayloadTest.java | 6 +- .../common/data/RestoreSliceStatusTest.java| 83 +++ spotbugs-exclude.xml | 1 + .../config/yaml/RestoreJobConfigurationImpl.java | 16 +- .../apache/cassandra/sidecar/db/RestoreJob.java| 85 --- .../sidecar/db/RestoreJobDatabaseAccessor.java | 21 +- .../apache/cassandra/sidecar/db/RestoreSlice.java | 97 ++-- .../sidecar/db/RestoreSliceDatabaseAccessor.java | 47 ++-- .../sidecar/db/schema/RestoreJobsSchema.java | 5 +- .../sidecar/db/schema/RestoreSlicesSchema.java | 2 +- .../sidecar/locator/CachedLocalTokenRanges.java| 276 + .../sidecar/locator/LocalTokenRangesProvider.java | 41 +++ .../sidecar/restore/RestoreJobDiscoverer.java | 55 +++- .../cassandra/sidecar/restore/RestoreJobUtil.java | 2 +- .../sidecar/restore/RestoreProcessor.java | 36 ++- .../sidecar/restore/RestoreSliceTask.java | 118 +++-- .../cassandra/sidecar/restore/StorageClient.java | 2 +- .../routes/restore/AbortRestoreJobHandler.java | 6 +- .../routes/restore/CreateRestoreJobHandler.java| 2 +- 
.../routes/restore/CreateRestoreSliceHandler.java | 2 +- .../routes/restore/UpdateRestoreJobHandler.java| 17 +- .../db/RestoreJobsDatabaseAccessorIntTest.java | 12 +- .../testing/ConfigurableCassandraTestContext.java | 43 +++- .../cassandra/sidecar/db/RestoreJobTest.java | 16 ++ .../cassandra/sidecar/db/SidecarSchemaTest.java| 53 +++- .../sidecar/restore/RestoreJobDiscovererTest.java | 84 --- .../sidecar/restore/RestoreJobManagerTest.java | 7 +- .../sidecar/restore/RestoreProcessorTest.java | 3 +- .../sidecar/restore/RestoreSliceTaskTest.java | 113 +++-- .../sidecar/restore/RestoreSliceTest.java | 2 +- .../routes/restore/BaseRestoreJobTests.java| 1 - .../restore/RestoreJobSummaryHandlerTest.java | 29 ++- .../restore/UpdateRestoreJobHandlerTest.java | 10 +- .../sidecar/utils/AsyncFileSystemUtilsTest.java| 111 + 38 files changed, 1255 insertions(+), 216 deletions(-) diff --git a/common/src/main/java/org/apache/cassandra/sidecar/common/data/CreateRestoreJobRequestPayload.java b/common/src/main/java/org/apache/cassandra/sidecar/common/data/CreateRestoreJobRequestPayload.java index 12858d8..0e5a9a0 100644 --- a/common/src/main/java/org/apache/cassandra/sidecar/common/data/CreateRestoreJobRequestPayload.java +++ b/common/src/main/java/org/apache/cassandra/sidecar/common/data/CreateRestoreJobRequestPayload.java @@ -26,8 +26,10 @@ import java.util.function.Consumer; import com.fasterxml.jackson.annotation.JsonCreator; import com.fasterxml.jackson.annotation.JsonProperty; import org.apache.cassandra.sidecar.common.utils.Preconditions; +import org.jetbrains.annotations.Nullable; import static org.apache.cassandra.sidecar.common.data.RestoreJobConstants.JOB_AGENT; +import static org.apache.cassandra.sidecar.common.data.RestoreJobConstants.JOB_CONSISTENCY_LEVEL; import static org.apache.cassandra.sidecar.common.data.RestoreJobConstants.JOB_EXPIRE_AT; import static org.apache.cassandra.sidecar.common.data.RestoreJobConstants.JOB_ID; import static 
org.apache.cassandra.sidecar.common.data.RestoreJobConstants.JOB_IMPORT_OPTIONS; @@ -43,6 +45,8 @@ public class CreateRestoreJobRequestPayload private final RestoreJobSecrets secrets; private final SSTableImportOptions importOptions; private final long expireAtInMillis; +@Nullable +private final String consistencyLevel; // optional field /** * Builder to build a CreateRestoreJobRequest @@ -65,13 +69,15 @@ public class CreateRestoreJobRequestPayload * @param secrets secrets to be used by restore job to download data * @param importOptionsthe configured options for SSTable import * @param expireAtInMillis a timestamp in the future
[jira] [Updated] (CASSANDRASC-99) Break restore job into stage and import phases and persist restore slice status on phase completion
[ https://issues.apache.org/jira/browse/CASSANDRASC-99?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yifan Cai updated CASSANDRASC-99: - Status: Ready to Commit (was: Review In Progress) > Break restore job into stage and import phases and persist restore slice > status on phase completion > --- > > Key: CASSANDRASC-99 > URL: https://issues.apache.org/jira/browse/CASSANDRASC-99 > Project: Sidecar for Apache Cassandra > Issue Type: Improvement > Components: Rest API >Reporter: Yifan Cai >Assignee: Yifan Cai >Priority: Normal > Labels: pull-request-available > > In order to improve resilience of the restore sstables from s3 tasks, we want > to break the task into multiple phases and persist the status of each slice. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRASC-99) Break restore job into stage and import phases and persist restore slice status on phase completion
[ https://issues.apache.org/jira/browse/CASSANDRASC-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814470#comment-17814470 ] Yifan Cai commented on CASSANDRASC-99: -- CI is green https://app.circleci.com/pipelines/github/yifan-c/cassandra-sidecar/45/workflows/a642b270-088d-4442-9355-2f392365f44c > Break restore job into stage and import phases and persist restore slice > status on phase completion > --- > > Key: CASSANDRASC-99 > URL: https://issues.apache.org/jira/browse/CASSANDRASC-99 > Project: Sidecar for Apache Cassandra > Issue Type: Improvement > Components: Rest API >Reporter: Yifan Cai >Assignee: Yifan Cai >Priority: Normal > Labels: pull-request-available > > In order to improve resilience of the restore sstables from s3 tasks, we want > to break the task into multiple phases and persist the status of each slice. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-19336) Repair causes out of memory
[ https://issues.apache.org/jira/browse/CASSANDRA-19336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814468#comment-17814468 ] David Capwell commented on CASSANDRA-19336: --- Overall LGTM. I am +1 but have some comments that I'll leave to you if you wish to handle or not 1) scheduler should return a Future rather than pushing this to the caller... cleans up the calling code a bit 2) given you have a single task per session, can use a simpler data structure to track/limit... the current one works best when we have multiple tasks and can try to compare across sessions. > Repair causes out of memory > --- > > Key: CASSANDRA-19336 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19336 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair >Reporter: Andres de la Peña >Assignee: Andres de la Peña >Priority: Normal > Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x > > Time Spent: 20m > Remaining Estimate: 0h > > CASSANDRA-14096 introduced {{repair_session_space}} as a limit for the memory > usage for Merkle tree calculations during repairs. This limit is applied to > the set of Merkle trees built for a received validation request > ({{{}VALIDATION_REQ{}}}), divided by the replication factor so as not to > overwhelm the repair coordinator, which will have requested RF sets of Merkle > trees. That way the repair coordinator should only use > {{repair_session_space}} for the RF Merkle trees. > However, a repair session without {{{}-pr-{}}}/{{{}-partitioner-range{}}} > will send RF*RF validation requests, because the repair coordinator node has > RF-1 replicas and is also the replica of RF-1 nodes. Since all the requests > are sent at the same time, at some point the repair coordinator can have up > to RF*{{{}repair_session_space{}}} worth of Merkle trees if none of the > validation responses is fully processed before the last response arrives. 
> Even worse, if the cluster uses virtual nodes, many nodes can be replicas of > the repair coordinator, and some nodes can be replicas of multiple token > ranges. It would mean that the repair coordinator can send more than RF or > RF*RF simultaneous validation requests. > For example, in an 11-node cluster with RF=3 and 256 tokens, we have seen a > repair session involving 44 groups of ranges to be repaired. This produces > 44*3=132 validation requests contacting all the nodes in the cluster. When > the responses for all these requests start to arrive to the coordinator, each > containing up to {{repair_session_space}}/3 of Merkle trees, they accumulate > quicker than they are consumed, greatly exceeding {{repair_session_space}} > and OOMing the node. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
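The worst case described above (44 range groups, RF=3, 132 simultaneous validation requests) can be put as a back-of-the-envelope calculation. This is a sketch under the ticket's own assumption that no response is consumed before the last one arrives; the 256 MiB value for {{repair_session_space}} below is a hypothetical configuration, not taken from the report.

```java
// Back-of-the-envelope worst case for coordinator-side Merkle-tree memory:
// each validation response carries up to repair_session_space / RF of trees,
// and `rangeGroups` groups each solicit RF responses, so with nothing consumed
// the coordinator can hold roughly rangeGroups * repair_session_space in trees.
public class RepairMemoryWorstCase
{
    static long worstCaseBytes(long repairSessionSpaceBytes, int rf, int rangeGroups)
    {
        long perResponse = repairSessionSpaceBytes / rf; // trees in one response
        long responses = (long) rangeGroups * rf;        // 44 * 3 = 132 in the example
        return perResponse * responses;                  // ~ rangeGroups * repair_session_space
    }

    public static void main(String[] args)
    {
        long space = 256L << 20; // assume repair_session_space = 256 MiB (hypothetical)
        long worst = worstCaseBytes(space, 3, 44);
        // ~44x the configured limit, which matches the OOM behaviour described
        System.out.println((worst >> 20) + " MiB");
    }
}
```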
[jira] [Commented] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics
[ https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814454#comment-17814454 ] Stefan Miklosovic commented on CASSANDRA-19366: --- I feel confident I could help to review this patch. > Expose mode of authentication in system_views.clients, nodetool clientstats, > and ClientMetrics > -- > > Key: CASSANDRA-19366 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19366 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Encryption, Messaging/Client, Observability/JMX, > Observability/Metrics, Tool/nodetool >Reporter: Andy Tolbert >Assignee: Andy Tolbert >Priority: Normal > Fix For: 5.1 > > Attachments: CASSANDRA-19366-trunk-1_test_results.tgz, > CASSANDRA-19366-trunk-1_test_results_summary.html > > > CASSANDRA-18554 added support for mTLS-authenticated clients. Part of this > contribution introduced {{{}MutualTlsWithPasswordFallbackAuthenticator{}}}, > which enables Cassandra to support either password- or mTLS-authenticated > connections. > As an operator, it would be useful to know which connections are mTLS > authenticated and which are password authenticated, as a possible mode of > operation is migrating users from one form of authentication to another. It > would also be useful to know, when authentication attempts fail, which mode of authentication is unsuccessful. > Proposing to add the following: > * Add a {{mode: string}} and {{metadata: map}} to > {{{}AuthenticatedUser{}}}. Update existing {{IAuthenticator}} implementations > to pass {{mode}} (e.g. {{password}}, {{{}mtls{}}}), and optionally pass a > {{metadata}} map (e.g. this can include the extracted {{identity}} from a > client certificate for {{mtls}} authentication). > * Update nodetool clientstats to add a new option flag {{{}--metadata{}}}, > which when passed exposes these new fields on {{{}AuthenticatedUser{}}}. 
(Not > added to existing output to maintain compatibility, much like > {{--client-options}} did.) > * Update {{system_views.clients}} to include columns for these new fields. > * Add new metrics to {{{}ClientMetrics{}}}: > ** Track authentication successes and failures by mode. (Note: the metrics > present per authentication-mode scope depend on the > Authenticator used; e.g. only {{scope=Password}} will be present for > {{{}PasswordAuthenticator{}}}.) > {noformat} > Existing: > org.apache.cassandra.metrics:name=AuthSuccess,type=Client > org.apache.cassandra.metrics:name=AuthFailure,type=Client > New: > org.apache.cassandra.metrics:name=AuthSuccess,scope=Mtls,type=Client > org.apache.cassandra.metrics:name=AuthSuccess,scope=Password,type=Client > org.apache.cassandra.metrics:name=AuthFailure,scope=Mtls,type=Client > org.apache.cassandra.metrics:name=AuthFailure,scope=Password,type=Client > {noformat} > ** Track connection counts by mode: > {noformat} > Existing: > org.apache.cassandra.metrics:name=ConnectedNativeClients,type=Client > org.apache.cassandra.metrics:name=connectedNativeClients,type=Client > (previously deprecated but still maintained) > New: > org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Mtls,type=Client > org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Password,type=Client > {noformat} > ** A metric to track encrypted vs. non-encrypted connections: > {noformat} > org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Encrypted,type=Client > org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Unencrypted,type=Client > {noformat}
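The scoped metric names proposed above follow the standard JMX `domain:key=value,...` pattern, which can be sanity-checked with the JDK's own `javax.management.ObjectName`. A minimal sketch (the helper class and its parameters are illustrative, not part of the patch; only the name pattern comes from the proposal):

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Builds the JMX ObjectName a scoped Cassandra client metric would be
// registered under, per the naming proposed in CASSANDRA-19366. A null scope
// yields the existing un-scoped metric name.
public class ScopedClientMetricNames
{
    static ObjectName clientMetric(String name, String scope) throws MalformedObjectNameException
    {
        StringBuilder sb = new StringBuilder("org.apache.cassandra.metrics:name=").append(name);
        if (scope != null)
            sb.append(",scope=").append(scope);
        sb.append(",type=Client");
        return new ObjectName(sb.toString());
    }

    public static void main(String[] args) throws Exception
    {
        System.out.println(clientMetric("AuthSuccess", null));       // existing metric
        System.out.println(clientMetric("AuthSuccess", "Mtls"));     // proposed per-mode variant
        System.out.println(clientMetric("AuthFailure", "Password")); // proposed per-mode variant
    }
}
```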
[jira] [Commented] (CASSANDRA-19085) In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE
[ https://issues.apache.org/jira/browse/CASSANDRA-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814448#comment-17814448 ] Brandon Williams commented on CASSANDRA-19085: -- The gossiper fix looks good to me, +1. I'll let you guys decide how to handle the SCM setting. > In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE > --- > > Key: CASSANDRA-19085 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19085 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair >Reporter: Branimir Lambov >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 5.0-rc, 5.x > > > More precisely, when the {{MessagingService}} version is set to {{{}VERSION_50{}}}, > the test fails with an exception that appears to be a genuine problem: > {code:java} > junit.framework.AssertionFailedError: Exception found expected null, but > was: at > org.apache.cassandra.service.ActiveRepairService.lambda$prepareForRepair$2(ActiveRepairService.java:678) > at > org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.base/java.lang.Thread.run(Thread.java:833) > > > at > org.apache.cassandra.distributed.test.DistributedRepairUtils.lambda$assertParentRepairSuccess$4(DistributedRepairUtils.java:129) > at > org.apache.cassandra.distributed.test.DistributedRepairUtils.validateExistingParentRepair(DistributedRepairUtils.java:164) > at 
> org.apache.cassandra.distributed.test.DistributedRepairUtils.assertParentRepairSuccess(DistributedRepairUtils.java:124) > at > org.apache.cassandra.distributed.test.RepairTest.testForcedNormalRepairWithOneNodeDown(RepairTest.java:211) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > org.apache.cassandra.distributed.shared.ShutdownException: Uncaught > exceptions were thrown during test > at > org.apache.cassandra.distributed.impl.AbstractCluster.checkAndResetUncaughtExceptions(AbstractCluster.java:1117) > at > org.apache.cassandra.distributed.impl.AbstractCluster.close(AbstractCluster.java:1103) > at > org.apache.cassandra.distributed.test.RepairTest.closeCluster(RepairTest.java:160) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > Suppressed: java.lang.IllegalStateException: complete already: > (failure: java.lang.RuntimeException: Did not get replies from all endpoints.) 
> at > org.apache.cassandra.utils.concurrent.AsyncPromise.setSuccess(AsyncPromise.java:106) > at > org.apache.cassandra.service.ActiveRepairService$2.ack(ActiveRepairService.java:721) > at > org.apache.cassandra.service.ActiveRepairService$2.onResponse(ActiveRepairService.java:697) > at > org.apache.cassandra.repair.messages.RepairMessage$2.onResponse(RepairMessage.java:187) > at > org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:58) > at > org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78) > at > org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:64) > at > org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:50) > at > org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97) > at > org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45) > at > org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:430) > at > org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionF
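The suppressed {{IllegalStateException: complete already}} in the trace above is the classic race of completing a promise that has already been failed ("Did not get replies from all endpoints"): a straggling ack calls {{setSuccess}} on an already-completed {{AsyncPromise}}. As a sketch of the failure mode only (not the actual fix in the patch), the JDK's `CompletableFuture` exhibits the same race but signals it by returning `false` from the non-throwing `complete()` instead of throwing:

```java
import java.util.concurrent.CompletableFuture;

// Illustrates the race behind the suppressed "complete already" exception:
// a promise is failed first (timeout), then a late ack tries to mark it
// successful. CompletableFuture.complete() is a no-op on an already-completed
// future and simply reports false, which is one common way to tolerate this.
public class PromiseRaceSketch
{
    public static void main(String[] args)
    {
        CompletableFuture<String> prepare = new CompletableFuture<>();

        // The request times out first and the promise is failed...
        prepare.completeExceptionally(new RuntimeException("Did not get replies from all endpoints."));

        // ...then a straggling response tries to complete it successfully.
        boolean won = prepare.complete("ACK");
        System.out.println(won);                                // prints false
        System.out.println(prepare.isCompletedExceptionally()); // prints true
    }
}
```

A throwing setter like {{AsyncPromise.setSuccess}} surfaces the same race as the exception seen in the test output.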
[jira] [Commented] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics
[ https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814446#comment-17814446 ] Andy Tolbert commented on CASSANDRA-19366: -- Thanks [~frankgh]! I've included my Pull Request, which I will move out of Draft shortly. Attached are the test results [^CASSANDRA-19366-trunk-1_test_results_summary.html] / [^CASSANDRA-19366-trunk-1_test_results.tgz] The tests that failed were: org.apache.cassandra.index.sai.cql.VectorUpdateDeleteTest#updateTest-_jdk11 (maybe connected to CASSANDRA-19168, which was recently fixed; investigating why it still fails) org.apache.cassandra.db.compaction.CompactionStrategyManagerTest#testAutomaticUpgradeConcurrency-_jdk11 (likely unconnected, also investigating) > Expose mode of authentication in system_views.clients, nodetool clientstats, > and ClientMetrics > -- > > Key: CASSANDRA-19366 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19366 > Fix For: 5.1 > > Attachments: CASSANDRA-19366-trunk-1_test_results.tgz, > CASSANDRA-19366-trunk-1_test_results_summary.html
[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics
[ https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Tolbert updated CASSANDRA-19366: - Attachment: (was: CASSANDRA-19366-trunk-1_test_results.tgz)
[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics
[ https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Tolbert updated CASSANDRA-19366: - Attachment: CASSANDRA-19366-trunk-1_test_results.tgz
[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics
[ https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Tolbert updated CASSANDRA-19366: - Attachment: CASSANDRA-19366-trunk-1_test_results_summary.html
[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics
[ https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Tolbert updated CASSANDRA-19366: - Attachment: (was: ci_summary.html)
[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics
[ https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Tolbert updated CASSANDRA-19366: - Attachment: ci_summary.html
[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics
[ https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Tolbert updated CASSANDRA-19366: - Attachment: CASSANDRA-19366-trunk-1_test_results.tgz -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics
[ https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Tolbert updated CASSANDRA-19366: - Attachment: (was: CASSANDRA-19366-trunk-1_test_results.tgz) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics
[ https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Tolbert updated CASSANDRA-19366: - Attachment: CASSANDRA-19366-trunk-1_test_results.tgz -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics
[ https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Tolbert updated CASSANDRA-19366: - Impacts: Docs (was: None) Test and Documentation Plan: Updated existing tests around nodetool clientstats and added a {{ClientMetricsTest}} that tests the existing metrics for ConnectedClients, AuthSuccess, and AuthFailure as well as the new metrics I added. I ran utests and dtests against this branch and they came back clean, with the exception of two likely unrelated tests which I'll capture in comments. Status: Patch Available (was: Open) Pull Request available at: [https://github.com/apache/cassandra/pull/3085] I've marked this as Docs impacting as I've added new metrics. I have updated the metrics.adoc file to include the new metrics in addition to existing ones that weren't documented. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics
[ https://issues.apache.org/jira/browse/CASSANDRA-19366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Tolbert updated CASSANDRA-19366: - Change Category: Operability Complexity: Normal Fix Version/s: 5.1 Assignee: Andy Tolbert Status: Open (was: Triage Needed) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
Re: [PR] Unit test framework and 3 unit tests for PartitionAwarePolicy [cassandra-java-driver]
aravind-nallan-yb closed pull request #1912: Unit test framework and 3 unit tests for PartitionAwarePolicy URL: https://github.com/apache/cassandra-java-driver/pull/1912 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-19366) Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics
Andy Tolbert created CASSANDRA-19366: Summary: Expose mode of authentication in system_views.clients, nodetool clientstats, and ClientMetrics Key: CASSANDRA-19366 URL: https://issues.apache.org/jira/browse/CASSANDRA-19366 Project: Cassandra Issue Type: Improvement Components: Feature/Encryption, Messaging/Client, Observability/JMX, Observability/Metrics, Tool/nodetool Reporter: Andy Tolbert -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-19336) Repair causes out of memory
[ https://issues.apache.org/jira/browse/CASSANDRA-19336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andres de la Peña reassigned CASSANDRA-19336: - Assignee: Andres de la Peña > Repair causes out of memory > --- > > Key: CASSANDRA-19336 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19336 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair >Reporter: Andres de la Peña >Assignee: Andres de la Peña >Priority: Normal > Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x > > > CASSANDRA-14096 introduced {{repair_session_space}} as a limit for the memory > usage for Merkle tree calculations during repairs. This limit is applied to > the set of Merkle trees built for a received validation request > ({{{}VALIDATION_REQ{}}}), divided by the replication factor so as not to > overwhelm the repair coordinator, who will have requested RF sets of Merkle > trees. That way the repair coordinator should only use > {{repair_session_space}} for the RF Merkle trees. > However, a repair session without {{{}-pr-{}}}/{{{}-partitioner-range{}}} > will send RF*RF validation requests, because the repair coordinator node has > RF-1 replicas and is also the replica of RF-1 nodes. Since all the requests > are sent at the same time, at some point the repair coordinator can have up > to RF*{{{}repair_session_space{}}} worth of Merkle trees if none of the > validation responses is fully processed before the last response arrives. > Even worse, if the cluster uses virtual nodes, many nodes can be replicas of > the repair coordinator, and some nodes can be replicas of multiple token > ranges. It would mean that the repair coordinator can send more than RF or > RF*RF simultaneous validation requests. > For example, in an 11-node cluster with RF=3 and 256 tokens, we have seen a > repair session involving 44 groups of ranges to be repaired. This produces > 44*3=132 validation requests contacting all the nodes in the cluster. 
When > the responses for all these requests start to arrive at the coordinator, each > containing up to {{repair_session_space}}/3 of Merkle trees, they accumulate > faster than they are consumed, greatly exceeding {{repair_session_space}} > and OOMing the node. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
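The worst-case accumulation described above can be checked with simple arithmetic (a sketch; the function name is illustrative):

```python
def validation_requests_and_peak(range_groups, rf, repair_session_space):
    """Worst case: every validation response is in flight at once.

    A full (non -pr) repair issues range_groups * rf validation requests,
    and each response can hold up to repair_session_space / rf of Merkle trees.
    Integer division keeps the arithmetic exact for byte counts.
    """
    requests = range_groups * rf
    peak = requests * repair_session_space // rf
    return requests, peak


# The 11-node, RF=3, 256-token example from the ticket: 44 range groups,
# so 132 requests, and peak memory is 44x the configured repair_session_space.
requests, peak = validation_requests_and_peak(44, 3, 256 * 1024 * 1024)
```

With a default-sized {{repair_session_space}}, a 44x overshoot comfortably explains the observed OOM.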
[jira] [Commented] (CASSANDRA-19336) Repair causes out of memory
[ https://issues.apache.org/jira/browse/CASSANDRA-19336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814404#comment-17814404 ] Andres de la Peña commented on CASSANDRA-19336: --- I have added a new {{concurrent_merkle_tree_requests}} config property to the PR. This property controls the parallelism of the scheduler. It defaults to unbounded parallelism, so it keeps the previous behaviour. I think the recommended value should be one with vnodes. Without vnodes it could either be one too, or something higher if combined with a smaller {{repair_session_space}}. CI looks good; the only failure is CASSANDRA-19168: ||PR||CI|| |[5.0|https://github.com/apache/cassandra/pull/3073]|[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/3411/workflows/bff8cab0-6dff-423c-af95-c7f70fb9a887] [j17|https://app.circleci.com/pipelines/github/adelapena/cassandra/3411/workflows/1c6b2337-0db7-4de2-81e3-4a6eccb70204]| -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
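A bounded validation scheduler of the kind {{concurrent_merkle_tree_requests}} describes can be sketched with a counting semaphore (an illustrative model only; the class and field names are hypothetical, not Cassandra's implementation):

```python
import threading
import time


class ValidationScheduler:
    """Allows at most `limit` validation requests in flight; limit=0 means unbounded."""

    def __init__(self, limit=0):
        self.sem = threading.Semaphore(limit) if limit > 0 else None
        self.in_flight = 0
        self.max_in_flight = 0
        self.lock = threading.Lock()

    def run(self, request):
        # Block until a permit is free (unless unbounded), then run the request.
        if self.sem:
            self.sem.acquire()
        with self.lock:
            self.in_flight += 1
            self.max_in_flight = max(self.max_in_flight, self.in_flight)
        try:
            return request()
        finally:
            with self.lock:
                self.in_flight -= 1
            if self.sem:
                self.sem.release()


# Five concurrent "validation requests", but only one permitted in flight.
sched = ValidationScheduler(limit=1)
workers = [threading.Thread(target=sched.run, args=(lambda: time.sleep(0.01),))
           for _ in range(5)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

Capping in-flight requests bounds the Merkle trees held at once, at the cost of serializing validation work, which is why the comment above suggests a limit of one mainly for vnode clusters.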
[PR] Unit test framework and 3 unit tests for PartitionAwarePolicy [cassandra-java-driver]
aravind-nallan-yb opened a new pull request, #1912: URL: https://github.com/apache/cassandra-java-driver/pull/1912 Extend the upstream LB policy unit test framework to PartitionAwarePolicy and add 3 unit tests as samples. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19189) Revisit use of sealed period lookup tables
[ https://issues.apache.org/jira/browse/CASSANDRA-19189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-19189: Description: Metadata snapshots are stored locally in the {{system.metadata_snapshots}} table, which is keyed by epoch. Snapshots are retrieved from this table for three purposes: * to replay locally during startup * to provide log state for a peer requesting catchup * to create point-in-time ClusterMetadata, for disaster recovery In the majority of cases, we always want to replay from the most recent snapshot so we can usually select the appropriate snapshot by simply scanning the snapshots table in reverse, which allows us to considerably simplify the process of looking up the desired snapshot. We will continue to persist historical snapshots, at least for now, so that we are able to select arbitrary snapshots should we want to reconstruct metadata state for arbitrary epochs. was: Metadata snapshots are stored locally in the {{system.metadata_snapshots}} table, which is keyed by epoch. Snapshots are retrieved from this table for two purposes: * to replay locally during startup * to provide log state for a peer requesting catchup * to create point-in-time ClusterMetadata, for disaster recovery In the majority of cases, we always want to replay from the most recent snapshot so we can usually select the appropriate snapshot by simply scanning the snapshots table in reverse, which allows us to considerably simplify the process of looking up the desired snapshot. We will continue to persist historical snapshots, at least for now, so that we are able to select arbitrary snapshots should we want to reconstruct metadata state for arbitrary epochs. 
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
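The "scan the snapshots table in reverse" lookup that the CASSANDRA-19189 description simplifies to can be sketched as follows (Python for illustration; {{find_snapshot}} and the plain list stand in for the real {{system.metadata_snapshots}} table keyed by epoch):

```python
def find_snapshot(epochs, target=None):
    """Return the epoch of the snapshot to replay from.

    epochs: epochs that have a stored snapshot, sorted ascending.
    target=None selects the most recent snapshot (the common case);
    otherwise the newest snapshot at or below `target` is chosen, which
    supports reconstructing point-in-time metadata for disaster recovery.
    """
    for epoch in reversed(epochs):  # reverse scan: newest first
        if target is None or epoch <= target:
            return epoch
    return None  # no snapshot old enough (or table empty)
```

The common case terminates on the first row of the reverse scan, which is what makes the sealed-period lookup tables unnecessary for startup replay and catchup.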
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814339#comment-17814339 ] Brandon Williams commented on CASSANDRA-18824: -- I think that is a good plan and I am +1 on it, and +1 on this ticket also. > Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused > missing replica > --- > > Key: CASSANDRA-18824 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18824 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Bootstrap and Decommission >Reporter: Szymon Miezal >Assignee: Szymon Miezal >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1.x, 5.0.x, 5.x > > Time Spent: 1.5h > Remaining Estimate: 0h > > Node decommission triggers data transfer to other nodes. While this transfer > is in progress, > receiving nodes temporarily hold token ranges in a pending state. However, > the cleanup process currently doesn't consider these pending ranges when > calculating token ownership. > As a consequence, data that is already stored in sstables gets inadvertently > cleaned up. > STR: > * Create a two-node cluster > * Create keyspace with RF=1 > * Insert sample data (assert data is available when querying both nodes) > * Start decommission process of node 1 > * Start running cleanup in a loop on node 2 until decommission on node 1 > finishes > * Verify all rows are in the cluster - it will fail as the previous step > removed some of the rows > It seems that the cleanup process does not take into account the pending > ranges, it uses only the local ranges - > [https://github.com/apache/cassandra/blob/caad2f24f95b494d05c6b5d86a8d25fbee58d7c2/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L466]. > There are two solutions to the problem. > One would be to change the cleanup process in a way that it starts taking > pending ranges into account. 
Even though it might sound tempting at first, it > will require involved changes and a lot of testing effort. > Alternatively, we could interrupt/prevent the cleanup process from running > when any pending range on a node is detected. That sounds like a reasonable > alternative to the problem and something that is relatively easy to implement. > The bug has already been fixed in 4.x with CASSANDRA-16418; the goal of this > ticket is to backport it to 3.x. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
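The interrupt/prevent alternative from the ticket can be sketched as follows. This is a simplified illustration, not the actual patch: token ranges are modelled as plain strings rather than Cassandra's Range&lt;Token&gt; objects, and the class and method names are hypothetical.

```java
import java.util.Set;

// Sketch of the guard: abort cleanup when the node holds any pending range,
// since cleanup computed from local ranges alone would drop pending data.
public class CleanupGuard
{
    public static void runCleanup(Set<String> localRanges, Set<String> pendingRanges)
    {
        if (!pendingRanges.isEmpty())
            throw new IllegalStateException("Cannot run cleanup while pending ranges exist: " + pendingRanges);
        // ... proceed: scan sstables and discard keys outside localRanges ...
    }
}
```

The design choice mirrors the comment above: rejecting the operation outright is far cheaper and safer to backport than teaching ownership calculation about pending ranges.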
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814322#comment-17814322 ] Jacek Lewandowski commented on CASSANDRA-18824: --- I've created https://issues.apache.org/jira/browse/CASSANDRA-19363 and https://issues.apache.org/jira/browse/CASSANDRA-19364 as a result of investigating the flakiness. The fact that it didn't fail in 5k runs, assuming all of those runs were executed under very similar cluster conditions, can be misleading. Adding a slight delay in an async code of pending ranges calculator leads to consistent test failures even on 4.0. This is not related to this issue though - it is only the test added here which can accidentally detect the problem. Since those separate tickets are now created, I think we can merge this ticket. However, those who asked for this fix should be notified about those possible issues. > Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused > missing replica > --- > > Key: CASSANDRA-18824 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18824 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Bootstrap and Decommission >Reporter: Szymon Miezal >Assignee: Szymon Miezal >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1.x, 5.0.x, 5.x > > Time Spent: 1.5h > Remaining Estimate: 0h > > Node decommission triggers data transfer to other nodes. While this transfer > is in progress, > receiving nodes temporarily hold token ranges in a pending state. However, > the cleanup process currently doesn't consider these pending ranges when > calculating token ownership. > As a consequence, data that is already stored in sstables gets inadvertently > cleaned up. 
> STR: > * Create a two-node cluster > * Create keyspace with RF=1 > * Insert sample data (assert data is available when querying both nodes) > * Start decommission process of node 1 > * Start running cleanup in a loop on node 2 until decommission on node 1 > finishes > * Verify all rows are in the cluster - it will fail as the previous step > removed some of the rows > It seems that the cleanup process does not take into account the pending > ranges, it uses only the local ranges - > [https://github.com/apache/cassandra/blob/caad2f24f95b494d05c6b5d86a8d25fbee58d7c2/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L466]. > There are two solutions to the problem. > One would be to change the cleanup process in a way that it starts taking > pending ranges into account. Even though it might sound tempting at first, it > will require involved changes and a lot of testing effort. > Alternatively, we could interrupt/prevent the cleanup process from running > when any pending range on a node is detected. That sounds like a reasonable > alternative to the problem and something that is relatively easy to implement. > The bug has already been fixed in 4.x with CASSANDRA-16418; the goal of this > ticket is to backport it to 3.x. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19363) Weird data loss in 3.11 flakiness during decommission
[ https://issues.apache.org/jira/browse/CASSANDRA-19363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacek Lewandowski updated CASSANDRA-19363: -- Description: While testing CASSANDRA-18824 on 3.11, we noticed one flaky result of the newly added decommission test. It looked innocent; however, when digging into the logs, it turned out that, for some reason, the data that were being pumped into the cluster went to the decommissioned node instead of going to the working node. That is, the data were inserted into a 2-node cluster (RF=1) while, say, node1 got decommissioned. The expected behavior would be that the data land in node2 after that. However, for some reason, in this 1/1000 flaky test, the situation was the opposite, and the data went to the decommissioned node, resulting in a total loss. I haven't found the reason. I don't know if it is a test failure or a production code problem. I cannot prove that it is only a 3.11 problem. I'm creating this ticket because if this is a real issue and exists on newer branches, it is serious. The logs artifact is lost in CircleCI thus I'm attaching the one I've downloaded earlier, unfortunately it is cleaned up a bit. 
The relevant part is: {noformat} DEBUG [node1_isolatedExecutor:3] node1 ColumnFamilyStore.java:949 - Enqueuing flush of tbl: 38.965KiB (0%) on-heap, 0.000KiB (0%) off-heap DEBUG [node1_PerDiskMemtableFlushWriter_1:1] node1 Memtable.java:477 - Writing Memtable-tbl(5.176KiB serialized bytes, 100 ops, 0%/0% of on/off-heap limit), flushed range = (max(-3074457345618258603), max(3074457345618258602)] DEBUG [node1_PerDiskMemtableFlushWriter_2:1] node1 Memtable.java:477 - Writing Memtable-tbl(5.176KiB serialized bytes, 100 ops, 0%/0% of on/off-heap limit), flushed range = (max(3074457345618258602), max(9223372036854775807)] DEBUG [node1_PerDiskMemtableFlushWriter_0:1] node1 Memtable.java:477 - Writing Memtable-tbl(5.176KiB serialized bytes, 100 ops, 0%/0% of on/off-heap limit), flushed range = (min(-9223372036854775808), max(-3074457345618258603)] DEBUG [node1_PerDiskMemtableFlushWriter_2:1] node1 Memtable.java:506 - Completed flushing /node1/data2/distributed_test_keyspace/tbl-7fb7aa20ab3a11eeac381f661fe8b82f/me-3-big-Data.db (1.059KiB) for commitlog position CommitLogPosition(segmentId=1704397819937, position=47614) DEBUG [node1_PerDiskMemtableFlushWriter_1:1] node1 Memtable.java:506 - Completed flushing /node1/data1/distributed_test_keyspace/tbl-7fb7aa20ab3a11eeac381f661fe8b82f/me-2-big-Data.db (1.091KiB) for commitlog position CommitLogPosition(segmentId=1704397819937, position=47614) DEBUG [node1_PerDiskMemtableFlushWriter_0:1] node1 Memtable.java:506 - Completed flushing /node1/data0/distributed_test_keyspace/tbl-7fb7aa20ab3a11eeac381f661fe8b82f/me-1-big-Data.db (1.260KiB) for commitlog position CommitLogPosition(segmentId=1704397819937, position=47614) DEBUG [node1_MemtableFlushWriter:1] node1 ColumnFamilyStore.java:1267 - Flushed to [BigTableReader(path='/node1/data0/distributed_test_keyspace/tbl-7fb7aa20ab3a11eeac381f661fe8b82f/me-1-big-Data.db'), BigTableReader(path='/node1/data1/distributed_test_keyspace/tbl-7fb7aa20ab3a11eeac381f661fe8b82f/me-2-big-Data.db'), 
BigTableReader(path='/node1/data2/distributed_test_keyspace/tbl-7fb7aa20ab3a11eeac381f661fe8b82f/me-3-big-Data.db')] (3 sstables, 17.521KiB), biggest 5.947KiB, smallest 5.773KiB DEBUG [node2_isolatedExecutor:1] node2 ColumnFamilyStore.java:949 - Enqueuing flush of tbl: 38.379KiB (0%) on-heap, 0.000KiB (0%) off-heap DEBUG [node2_PerDiskMemtableFlushWriter_0:1] node2 Memtable.java:477 - Writing Memtable-tbl(5.176KiB serialized bytes, 100 ops, 0%/0% of on/off-heap limit), flushed range = (null, null] DEBUG [node2_PerDiskMemtableFlushWriter_0:1] node2 Memtable.java:506 - Completed flushing /node2/data2/distributed_test_keyspace/tbl-7fb7aa20ab3a11eeac381f661fe8b82f/me-1-big-Data.db (3.409KiB) for commitlog position CommitLogPosition(segmentId=1704397821653, position=54585) DEBUG [node2_MemtableFlushWriter:1] node2 ColumnFamilyStore.java:1267 - Flushed to [BigTableReader(path='/node2/data2/distributed_test_keyspace/tbl-7fb7aa20ab3a11eeac381f661fe8b82f/me-1-big-Data.db')] (1 sstables, 7.731KiB), biggest 7.731KiB, {noformat} As one can see, node1 flushed 3 sstables of {{tbl}} although it is already decommissioned. Node 2 did not flush much. This is opposite to the passing run of the test. The test code is as follows: {code:java} try (Cluster cluster = init(builder().withNodes(2) .withTokenSupplier(evenlyDistributedTokens(2)) .withNodeIdTopology(NetworkTopology.singleDcNetworkTopology(2, "dc0", "rack0")) .withConfig(config -> config.with(NETWORK, GOSSIP)) .start(), 1)) {
[jira] [Updated] (CASSANDRA-19363) Weird data loss in 3.11 flakiness during decommission
[ https://issues.apache.org/jira/browse/CASSANDRA-19363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacek Lewandowski updated CASSANDRA-19363: -- Attachment: bad.txt > Weird data loss in 3.11 flakiness during decommission > - > > Key: CASSANDRA-19363 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19363 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Bootstrap and Decommission >Reporter: Jacek Lewandowski >Priority: Normal > Fix For: 3.11.x > > Attachments: bad.txt > > > While testing CASSANDRA-18824 on 3.11, we noticed one flaky result of the > newly added decommission test. It looked innocent; however, when digging into > the logs, it turned out that, for some reason, the data that were being > pumped into the cluster went to the decommissioned node instead of going to > the working node. > That is, the data were inserted into a 2-node cluster (RF=1) while, say, > node2 got decommissioned. The expected behavior would be that the data land > in node1 after that. However, for some reason, in this 1/1000 flaky test, the > situation was the opposite, and the data went to the decommissioned node, > resulting in a total loss. > I haven't found the reason. I don't know if it is a test failure or a > production code problem. I cannot prove that it is only a 3.11 problem. I'm > creating this ticket because if this is a real issue and exists on newer > branches, it is serious. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19364) Data loss during decommission possible due to a delayed and unsynced pending ranges calculation
[ https://issues.apache.org/jira/browse/CASSANDRA-19364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacek Lewandowski updated CASSANDRA-19364: -- Description: This possible issue has been discovered while inspecting flaky tests of CASSANDRA-18824. Pending ranges calculation is executed asynchronously when the node is decommissioned. If the data is inserted during decommissioning, and pending ranges calculation is delayed for some reason (it can be, as it is not synchronous), we may end up with partial data loss. That can be just a wrong test. Thus, I perceive this ticket more like a memo for further investigation or discussion. Note that this has obviously been fixed by TCM. The test in question was: {code:java} try (Cluster cluster = init(builder().withNodes(2) .withTokenSupplier(evenlyDistributedTokens(2)) .withNodeIdTopology(NetworkTopology.singleDcNetworkTopology(2, "dc0", "rack0")) .withConfig(config -> config.with(NETWORK, GOSSIP)) .start(), 1)) { IInvokableInstance nodeToDecommission = cluster.get(1); IInvokableInstance nodeToRemainInCluster = cluster.get(2); // Start decommission on nodeToDecommission cluster.forEach(statusToDecommission(nodeToDecommission)); logger.info("Decommissioning node {}", nodeToDecommission.broadcastAddress()); // Add data to cluster while node is decommissioning int numRows = 100; cluster.schemaChange("CREATE TABLE IF NOT EXISTS " + KEYSPACE + ".tbl (pk int, ck int, v int, PRIMARY KEY (pk, ck))"); insertData(cluster, 1, numRows, ConsistencyLevel.ONE); // <--- HERE - when PRC is delayed, we get there only ~50% of inserted rows // Check data before cleanup on nodeToRemainInCluster assertEquals(100, nodeToRemainInCluster.executeInternal("SELECT * FROM " + KEYSPACE + ".tbl").length); } {code} was: This possible issue has been discovered while inspecting flaky tests of CASSANDRA-18824. Pending ranges calculation is executed asynchronously when the node is decommissioned. 
If the data is inserted during decommissioning, and pending ranges calculation is delayed for some reason (it can be as it is not synchronous), we may end up with partial data loss. That can be just a wrong test. Thus, I perceive this ticket more like a memo for further investigation or discussion. Note that this has obviously been fixed by TCM. > Data loss during decommission possible due to a delayed and unsynced pending > ranges calculation > --- > > Key: CASSANDRA-19364 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19364 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Bootstrap and Decommission >Reporter: Jacek Lewandowski >Priority: Normal > > This possible issue has been discovered while inspecting flaky tests of > CASSANDRA-18824. Pending ranges calculation is executed asynchronously when > the node is decommissioned. If the data is inserted during decommissioning, > and pending ranges calculation is delayed for some reason (it can be as it is > not synchronous), we may end up with partial data loss. That can be just a > wrong test. Thus, I perceive this ticket more like a memo for further > investigation or discussion. > Note that this has obviously been fixed by TCM. 
> The test in question was: > {code:java} > try (Cluster cluster = init(builder().withNodes(2) > > .withTokenSupplier(evenlyDistributedTokens(2)) > > .withNodeIdTopology(NetworkTopology.singleDcNetworkTopology(2, "dc0", > "rack0")) > .withConfig(config -> > config.with(NETWORK, GOSSIP)) > .start(), 1)) > { > IInvokableInstance nodeToDecommission = cluster.get(1); > IInvokableInstance nodeToRemainInCluster = cluster.get(2); > // Start decommission on nodeToDecommission > cluster.forEach(statusToDecommission(nodeToDecommission)); > logger.info("Decommissioning node {}", > nodeToDecommission.broadcastAddress()); > // Add data to cluster while node is decommissioning > int numRows = 100; > cluster.schemaChange("CREATE TABLE IF NOT EXISTS " + KEYSPACE + > ".tbl (pk int, ck int, v int, PRIMARY KEY (pk, ck))"); > insertData(cluster, 1, numRows, ConsistencyLevel.ONE); // > <---
[jira] [Created] (CASSANDRA-19364) Data loss during decommission possible due to a delayed and unsynced pending ranges calculation
Jacek Lewandowski created CASSANDRA-19364: - Summary: Data loss during decommission possible due to a delayed and unsynced pending ranges calculation Key: CASSANDRA-19364 URL: https://issues.apache.org/jira/browse/CASSANDRA-19364 Project: Cassandra Issue Type: Bug Components: Consistency/Bootstrap and Decommission Reporter: Jacek Lewandowski This possible issue has been discovered while inspecting flaky tests of CASSANDRA-18824. Pending ranges calculation is executed asynchronously when the node is decommissioned. If the data is inserted during decommissioning, and pending ranges calculation is delayed for some reason (it can be as it is not synchronous), we may end up with partial data loss. That can be just a wrong test. Thus, I perceive this ticket more like a memo for further investigation or discussion. Note that this has obviously been fixed by TCM. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
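The suspected race can be made concrete with a toy model (all names are illustrative, not Cassandra's API): writes are routed through an ownership view that is refreshed only by the asynchronous pending-ranges calculation, so when that calculation lags the decommission, writes keep landing on the leaving node.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the race: routing consults a stale ownership map until the
// asynchronous pending-ranges recalculation runs.
public class PendingRangeRace
{
    private final Map<Integer, String> tokenOwner = new HashMap<>();

    public PendingRangeRace()
    {
        tokenOwner.put(0, "node1"); // node1 owns token 0 before its decommission
    }

    public String route(int token)
    {
        return tokenOwner.get(token);
    }

    // In the real system this runs asynchronously; if it is delayed,
    // route() keeps returning the decommissioned node.
    public void recalculatePendingRanges()
    {
        tokenOwner.put(0, "node2");
    }
}
```

In the flaky test, the window between decommission and recalculation is exactly where the ~50% of lost rows were inserted.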
[jira] [Updated] (CASSANDRA-19363) Weird data loss in 3.11 flakiness during decommission
[ https://issues.apache.org/jira/browse/CASSANDRA-19363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacek Lewandowski updated CASSANDRA-19363: -- Fix Version/s: 3.11.x > Weird data loss in 3.11 flakiness during decommission > - > > Key: CASSANDRA-19363 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19363 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Bootstrap and Decommission >Reporter: Jacek Lewandowski >Priority: Normal > Fix For: 3.11.x > > > While testing CASSANDRA-18824 on 3.11, we noticed one flaky result of the > newly added decommission test. It looked innocent; however, when digging into > the logs, it turned out that, for some reason, the data that were being > pumped into the cluster went to the decommissioned node instead of going to > the working node. > That is, the data were inserted into a 2-node cluster (RF=1) while, say, > node2 got decommissioned. The expected behavior would be that the data land > in node1 after that. However, for some reason, in this 1/1000 flaky test, the > situation was the opposite, and the data went to the decommissioned node, > resulting in a total loss. > I haven't found the reason. I don't know if it is a test failure or a > production code problem. I cannot prove that it is only a 3.11 problem. I'm > creating this ticket because if this is a real issue and exists on newer > branches, it is serious. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-19363) Weird data loss in 3.11 flakiness during decommission
Jacek Lewandowski created CASSANDRA-19363: - Summary: Weird data loss in 3.11 flakiness during decommission Key: CASSANDRA-19363 URL: https://issues.apache.org/jira/browse/CASSANDRA-19363 Project: Cassandra Issue Type: Bug Components: Consistency/Bootstrap and Decommission Reporter: Jacek Lewandowski While testing CASSANDRA-18824 on 3.11, we noticed one flaky result of the newly added decommission test. It looked innocent; however, when digging into the logs, it turned out that, for some reason, the data that were being pumped into the cluster went to the decommissioned node instead of going to the working node. That is, the data were inserted into a 2-node cluster (RF=1) while, say, node2 got decommissioned. The expected behavior would be that the data land in node1 after that. However, for some reason, in this 1/1000 flaky test, the situation was the opposite, and the data went to the decommissioned node, resulting in a total loss. I haven't found the reason. I don't know if it is a test failure or a production code problem. I cannot prove that it is only a 3.11 problem. I'm creating this ticket because if this is a real issue and exists on newer branches, it is serious. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-19361) fix node info NPE when ClusterMetadata is null
[ https://issues.apache.org/jira/browse/CASSANDRA-19361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814296#comment-17814296 ] Sam Tunnicliffe commented on CASSANDRA-19361: - bq. After deleting data(losing all data), restart and everything became OK {code} -- AddressLoad Tokens Owns (effective) Host ID Rack UN 127.0.0.2 ? 16 51.2% 6d194555-f6eb-41d0-c000-0002 rack1 DN 127.0.0.4 ? 16 48.8% 6d194555-f6eb-41d0-c000-0001 rack1 {code} This is pretty odd for a couple of reasons: * node1 and node3 seem to have left or been removed from the cluster. * {{Host ID}} is based on the {{NodeId}} in cluster metadata, which in turn is based on an auto incrementing integer. So according to this, {{127.0.0.4}} was actually the first node added to the cluster. Based on this and the stacktraces above, I would guess that something is going wrong with node4 discovering its peers when first joining, leading to it forming its own single-node cluster in isolation. I'm not sure exactly what is happening when you delete all data and restart, if you can attach full logs for all 4 nodes, that would be helpful. > fix node info NPE when ClusterMetadata is null > -- > > Key: CASSANDRA-19361 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19361 > Project: Cassandra > Issue Type: Bug > Components: Tool/nodetool, Transactional Cluster Metadata >Reporter: Ling Mao >Assignee: Ling Mao >Priority: Normal > Fix For: 5.0.x > > Attachments: CASSANDRA-19361-stack-error.txt > > Time Spent: 10m > Remaining Estimate: 0h > > h3. How > > I create an ensemble with 3 nodes(It works well), then I add the fourth node > to join the party. 
> when executing nodetool info, get the following exception: > {code:java} > ➜ bin ./nodetool info > java.lang.NullPointerException at > org.apache.cassandra.service.StorageService.operationMode(StorageService.java:3744) > at > org.apache.cassandra.service.StorageService.isBootstrapFailed(StorageService.java:3810) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) at > sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71) > ➜ bin ./nodetool info > WARN [InternalResponseStage:152] 2024-02-02 11:45:15,731 > RemoteProcessor.java:213 - Got error from /127.0.0.4:7000: TIMEOUT when > sending TCM_COMMIT_REQ, retrying on > CandidateIterator{candidates=[/127.0.0.4:7000], checkLive=true} error: null > -- StackTrace -- java.lang.NullPointerException at > org.apache.cassandra.service.StorageService.getLocalHostId(StorageService.java:1904) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) at > sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71) at > jdk.internal.reflect.GeneratedMethodAccessor1.invoke(Unknown Source) at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) at > java.base/sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:260){code} > server 1 cannot execute node info and cql shell, server 2 and 3 can do it. 
> Try to query the system prefix tables, I attach stack error log for the > further debugging. Cannot find a way to recover. After deleting data(losing > all data), restart and everything became OK > {code:java} > ➜ bin ./nodetool status > Datacenter: datacenter1 > === > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- Address Load Tokens Owns (effective) Host ID > Rack > UN 127.0.0.2 ? 16 51.2% > 6d194555-f6eb-41d0-c000-0002 rack1 > DN 127.0.0.4 ? 16 48.8% > 6d194555-f6eb-41d0-c000-0001 rack1{code} > h3. When > > It was introduced by the Patch: CEP-21. Anyway, the NPE check is needed to > protect its propagation anywhere > {code:java} > Implementation of Transactional Cluster Metadata as described in CEP-21 > Hash: ae084237 > > code diff: > > public String getLocalHostId() > { > - UUID id = getLo
[jira] [Commented] (CASSANDRA-19361) fix node info NPE when ClusterMetadata is null
[ https://issues.apache.org/jira/browse/CASSANDRA-19361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814292#comment-17814292 ] Sam Tunnicliffe commented on CASSANDRA-19361: - >From the info in the description and the attached text file, it looks as >though the 4th node is not communicating with the existing nodes. Can you >attach the full log from the fourth node? I can't reproduce this with ccm, how are you configuring/running the instances? The executions of {{nodetool info}} in the description, are those are being run against node4? Are they being executed while the node is bootstrapping? {quote}server 1 cannot execute node info and cql shell, server 2 and 3 can do it. {quote} Does this only start to happen _after_ node4 is started? Can you run {{nodetool info}} and cqlsh on node1 before adding node4? > fix node info NPE when ClusterMetadata is null > -- > > Key: CASSANDRA-19361 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19361 > Project: Cassandra > Issue Type: Bug > Components: Tool/nodetool, Transactional Cluster Metadata >Reporter: Ling Mao >Assignee: Ling Mao >Priority: Normal > Fix For: 5.0.x > > Attachments: CASSANDRA-19361-stack-error.txt > > Time Spent: 10m > Remaining Estimate: 0h > > h3. How > > I create an ensemble with 3 nodes(It works well), then I add the fourth node > to join the party. 
> when executing nodetool info, get the following exception: > {code:java} > ➜ bin ./nodetool info > java.lang.NullPointerException at > org.apache.cassandra.service.StorageService.operationMode(StorageService.java:3744) > at > org.apache.cassandra.service.StorageService.isBootstrapFailed(StorageService.java:3810) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) at > sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71) > ➜ bin ./nodetool info > WARN [InternalResponseStage:152] 2024-02-02 11:45:15,731 > RemoteProcessor.java:213 - Got error from /127.0.0.4:7000: TIMEOUT when > sending TCM_COMMIT_REQ, retrying on > CandidateIterator{candidates=[/127.0.0.4:7000], checkLive=true} error: null > -- StackTrace -- java.lang.NullPointerException at > org.apache.cassandra.service.StorageService.getLocalHostId(StorageService.java:1904) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) at > sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71) at > jdk.internal.reflect.GeneratedMethodAccessor1.invoke(Unknown Source) at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) at > java.base/sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:260){code} > server 1 cannot execute node info and cql shell, server 2 and 3 can do it. 
> Try to query the system prefix tables, I attach stack error log for the > further debugging. Cannot find a way to recover. After deleting data(losing > all data), restart and everything became OK > {code:java} > ➜ bin ./nodetool status > Datacenter: datacenter1 > === > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- Address Load Tokens Owns (effective) Host ID > Rack > UN 127.0.0.2 ? 16 51.2% > 6d194555-f6eb-41d0-c000-0002 rack1 > DN 127.0.0.4 ? 16 48.8% > 6d194555-f6eb-41d0-c000-0001 rack1{code} > h3. When > > It was introduced by the Patch: CEP-21. Anyway, the NPE check is needed to > protect its propagation anywhere > {code:java} > Implementation of Transactional Cluster Metadata as described in CEP-21 > Hash: ae084237 > > code diff: > > public String getLocalHostId() > { > - UUID id = getLocalHostUUID(); > - return id != null ? id.toString() : null; > + return getLocalHostUUID().toString(); > } > > public UUID getLocalHostUUID() > { > - UUID id = > getTokenMetadata().getHostId(FBUtilities.getBroadcastAddressAndPort()); > - if (id != null) > - return id; > - // this condition is to prevent accessing the tables whe
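The quoted diff shows the CEP-21 change dropping the null guard in getLocalHostId, so a null host id (for example while ClusterMetadata is not yet available) now surfaces as an NPE. A minimal sketch of the defensive shape, with the metadata lookup reduced to a nullable UUID field purely for illustration:

```java
import java.util.UUID;

// Illustrative null-safe accessor, mirroring the pre-CEP-21 guard.
public class HostIdAccessor
{
    private final UUID localHostId; // null until cluster metadata is initialized

    public HostIdAccessor(UUID localHostId)
    {
        this.localHostId = localHostId;
    }

    public String getLocalHostId()
    {
        UUID id = localHostId;
        return id != null ? id.toString() : null; // guard kept, so no NPE propagates to nodetool
    }
}
```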
[jira] [Updated] (CASSANDRA-18098) Test failure bootstrap_test.py::test_cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-18098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Berenguer Blasi updated CASSANDRA-18098: Fix Version/s: 5.0.x (was: 5.0-rc) > Test failure bootstrap_test.py::test_cleanup > > > Key: CASSANDRA-18098 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18098 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Yifan Cai >Assignee: Brandon Williams >Priority: Normal > Fix For: 4.0.x, 5.0.x, 5.x > > > The test failed a few times in the recent CI runs. For example, this log > captures a recent failure. > {code:none} > 20:02:01,364 ccm INFO node1: using Java 11 for the current invocation > 20:02:02,679 bootstrap_test ERROR --- > 20:02:02,679 bootstrap_test ERROR > /tmp/dtest-8kcle23s/test/node1/data0/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-1-big-Data.db > 20:02:02,679 bootstrap_test ERROR > /tmp/dtest-8kcle23s/test/node1/data0/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-4-big-Data.db > 20:02:02,679 bootstrap_test ERROR > /tmp/dtest-8kcle23s/test/node1/data0/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-7-big-Data.db > 20:02:02,679 bootstrap_test ERROR > /tmp/dtest-8kcle23s/test/node1/data0/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-10-big-Data.db > 20:02:02,679 bootstrap_test ERROR > /tmp/dtest-8kcle23s/test/node1/data0/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-13-big-Data.db > 20:02:02,679 bootstrap_test ERROR > /tmp/dtest-8kcle23s/test/node1/data1/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-2-big-Data.db > 20:02:02,679 bootstrap_test ERROR > /tmp/dtest-8kcle23s/test/node1/data1/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-5-big-Data.db > 20:02:02,679 bootstrap_test ERROR > /tmp/dtest-8kcle23s/test/node1/data1/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-8-big-Data.db > 20:02:02,679 bootstrap_test ERROR > 
/tmp/dtest-8kcle23s/test/node1/data1/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-11-big-Data.db > 20:02:02,679 bootstrap_test ERROR > /tmp/dtest-8kcle23s/test/node1/data1/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-14-big-Data.db > 20:02:02,679 bootstrap_test ERROR > /tmp/dtest-8kcle23s/test/node1/data2/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-3-big-Data.db > 20:02:02,679 bootstrap_test ERROR > /tmp/dtest-8kcle23s/test/node1/data2/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-6-big-Data.db > 20:02:02,679 bootstrap_test ERROR > /tmp/dtest-8kcle23s/test/node1/data2/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-9-big-Data.db > 20:02:02,679 bootstrap_test ERROR > /tmp/dtest-8kcle23s/test/node1/data2/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-12-big-Data.db > 20:02:02,679 bootstrap_test ERROR > /tmp/dtest-8kcle23s/test/node1/data2/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-15-big-Data.db > 20:02:02,679 bootstrap_test ERROR > /tmp/dtest-8kcle23s/test/node1/data2/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-16-big-Data.db > 20:02:02,679 bootstrap_test ERROR > /tmp/dtest-8kcle23s/test/node1/data2/keyspace1/standard1-e36da130727b11edb08827a767e354f3/nb-17-big-Data.db > 20:02:02,679 bootstrap_test ERROR Current count is 17, basecount was 15 > -- generated xml file: /tmp/results/dtests/pytest_result_j11_with_vnodes.xml > --- > ===Flaky Test Report=== > test_materialized_views_auth passed 1 out of the required 1 times. Success! > test_cleanup failed and was not selected for rerun. > > assert not True > + where True = 0x7f071d43cba8>>() > +where 0x7f071d43cba8>> = .is_set > [] > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18098) Test failure bootstrap_test.py::test_cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-18098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814284#comment-17814284 ] Berenguer Blasi commented on CASSANDRA-18098: - They're not exactly the same, just related. But if there's nothing to go on, I agree there's nothing we can do. Let's remove it from blocking the rc and if it pops up again we'll have some thread to start pulling.
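The flake above comes down to the cleanup check in bootstrap_test.py: the test counts the live nb-*-big-Data.db SSTable files across the node's data directories (data0, data1, data2) and compares the result against a baseline, failing here with "Current count is 17, basecount was 15". A minimal sketch of that kind of count, written as a hypothetical standalone helper rather than the actual dtest code:

```python
import glob
import os

def count_sstable_data_files(data_dirs, keyspace, table_dir):
    """Count live SSTable Data.db components for one table across all
    configured data directories.

    Hypothetical helper mirroring the check implied by the log above;
    the real logic lives in cassandra-dtest's bootstrap_test.py.
    """
    count = 0
    for d in data_dirs:
        # Match the new-format ("nb") big SSTable data components,
        # e.g. nb-1-big-Data.db, under <dir>/<keyspace>/<table-uuid>/.
        pattern = os.path.join(d, keyspace, table_dir, "*-big-Data.db")
        count += len(glob.glob(pattern))
    return count
```

A failure of this shape typically means cleanup or compaction had not yet finished dropping obsolete SSTables when the count was taken, which is consistent with treating it as a timing-sensitive flake rather than a correctness bug.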
[jira] [Updated] (CASSANDRA-19283) Update rpm and debian shell includes
[ https://issues.apache.org/jira/browse/CASSANDRA-19283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Berenguer Blasi updated CASSANDRA-19283: Reviewers: Berenguer Blasi (was: Berenguer Blasi) Status: Review In Progress (was: Patch Available) > Update rpm and debian shell includes > > > Key: CASSANDRA-19283 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19283 > Project: Cassandra > Issue Type: Bug > Components: Packaging >Reporter: Ekaterina Dimitrova >Assignee: Brandon Williams >Priority: Normal > Fix For: 5.0.x, 5.x > > > While working on CASSANDRA-19001, it was identified that there are > differences between bin/cassandra.in.sh and redhat/cassandra.in.sh, and it > seems the debian diff on 5.0 has been updated only once, in 2020, since it was > created in 2019. > CC [~brandon.williams]
[jira] [Updated] (CASSANDRA-19283) Update rpm and debian shell includes
[ https://issues.apache.org/jira/browse/CASSANDRA-19283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Berenguer Blasi updated CASSANDRA-19283: Status: Ready to Commit (was: Review In Progress)
[jira] [Commented] (CASSANDRA-19283) Update rpm and debian shell includes
[ https://issues.apache.org/jira/browse/CASSANDRA-19283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814282#comment-17814282 ] Berenguer Blasi commented on CASSANDRA-19283: - Devbranch continues to be broken for 5.0+, so there's nothing we can do beyond your local testing. Given that Jenkins for 5.0 should be up again pretty soon, we'll get quick feedback if anything is not quite right. +1, lgtm.
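The underlying problem in CASSANDRA-19283 is drift between the copies of cassandra.in.sh carried for different packaging targets (the in-tree bin/ copy versus the redhat one). A hypothetical sketch of the kind of drift check involved, using made-up sample contents rather than the real files from the Cassandra tree:

```python
import difflib

# Hypothetical stand-ins for bin/cassandra.in.sh and redhat/cassandra.in.sh;
# the real files live in the Cassandra source tree and differ in more ways.
bin_include = "CASSANDRA_HOME=/usr/share/cassandra\n"
redhat_include = 'CASSANDRA_HOME=/usr/share/cassandra\nJVM_OPTS=""\n'

# Produce a unified diff; an empty result means the includes are in sync.
diff = list(difflib.unified_diff(
    bin_include.splitlines(keepends=True),
    redhat_include.splitlines(keepends=True),
    fromfile="bin/cassandra.in.sh",
    tofile="redhat/cassandra.in.sh",
))
if diff:
    print("shell includes have diverged:")
    print("".join(diff), end="")
```

Wiring a comparison like this into CI would catch future divergence between the packaged includes before a release, rather than discovering it incidentally as happened while working on CASSANDRA-19001.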