[jira] [Commented] (CASSANDRA-18871) JMH benchmark improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767416#comment-17767416 ] Jacek Lewandowski commented on CASSANDRA-18871: --- By the way, regarding testing: async profiler has been tested on Mac M1 by [~mmuzaf] and [~marianne-manaog] > JMH benchmark improvements > -- > > Key: CASSANDRA-18871 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18871 > Project: Cassandra > Issue Type: Improvement > Components: Build, Legacy/Tools >Reporter: Jacek Lewandowski >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 4.0.x, 4.1.x, 5.0.x, 5.1 > > > 1. CASSANDRA-12586 introduced the {{build-jmh}} task, which builds an uber jar for > JMH benchmarks that is then not used by the {{ant microbench}} task. It is > used, though, by the {{test/bin/jmh}} script. > In fact, I have no idea why we should use an uber jar if JMH can run perfectly well > with a regular classpath. Maybe that had something to do with the older JMH > version that was used at that time. Building uber jars takes time and is > annoying. Since it seems to be redundant anyway, I'm going to remove it and > fix {{test/bin/jmh}} to use a regular classpath. > 2. I'll add support for async profiler in benchmarks. That is, the > {{microbench}} target automatically fetches the async profiler binaries and > adds the necessary args for JMH ({{-prof async...}} in particular) whenever we > run the {{microbench-with-profiler}} task. If no additional properties are > provided, some default options will be applied (defined in the script, can be > negotiated). Otherwise, whatever is passed to the {{profiler.opts}} property > will be added as profiler options after the library path and target directory > definition. > 3. If someone wants to see any additional improvements, please comment on the > ticket.
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-18871) JMH benchmark improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacek Lewandowski updated CASSANDRA-18871: -- Reviewers: Maxim Muzafarov (was: Marianne Lyne Manaog, Maxim Muzafarov) > JMH benchmark improvements > -- > > Key: CASSANDRA-18871 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18871 > Project: Cassandra > Issue Type: Improvement > Components: Build, Legacy/Tools >Reporter: Jacek Lewandowski >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 4.0.x, 4.1.x, 5.0.x, 5.1
[jira] [Updated] (CASSANDRA-18786) Javadoc BigFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Miklosovic updated CASSANDRA-18786: -- Reviewers: Ling Mao, Stefan Miklosovic (was: Ling Mao) Status: Review In Progress (was: Needs Committer) > Javadoc BigFormat > - > > Key: CASSANDRA-18786 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18786 > Project: Cassandra > Issue Type: Improvement > Components: Documentation/Javadoc >Reporter: Berenguer Blasi >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 5.0.x > > Attachments: screenshot-1.png > > > This ticket intends to go through the current sstables code and javadoc the > format at high-level.
[jira] [Updated] (CASSANDRA-18786) Javadoc BigFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Miklosovic updated CASSANDRA-18786: -- Status: Ready to Commit (was: Review In Progress) > Javadoc BigFormat > - > > Key: CASSANDRA-18786 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18786 > Project: Cassandra > Issue Type: Improvement > Components: Documentation/Javadoc >Reporter: Berenguer Blasi >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 5.0.x > > Attachments: screenshot-1.png
[jira] [Commented] (CASSANDRA-18786) Javadoc BigFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767413#comment-17767413 ] Berenguer Blasi commented on CASSANDRA-18786: - Is that agreement to my statement or +1 to the ticket? > Javadoc BigFormat > - > > Key: CASSANDRA-18786 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18786 > Project: Cassandra > Issue Type: Improvement > Components: Documentation/Javadoc >Reporter: Berenguer Blasi >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 5.0.x > > Attachments: screenshot-1.png
[jira] [Commented] (CASSANDRA-18871) JMH benchmark improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767408#comment-17767408 ] Jacek Lewandowski commented on CASSANDRA-18871: --- yes, I've just also added a property to add extra jmh args, for example: {noformat} ant microbench -Dbenchmark.name=SampleBench -Djmh.args='-gc true' {noformat} Is that what you meant [~blambov]? > JMH benchmark improvements > -- > > Key: CASSANDRA-18871 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18871 > Project: Cassandra > Issue Type: Improvement > Components: Build, Legacy/Tools >Reporter: Jacek Lewandowski >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 4.0.x, 4.1.x, 5.0.x, 5.1
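To make the CASSANDRA-18871 comment above concrete, here is a minimal sketch of how a wrapper script might turn the {{benchmark.name}}, {{jmh.args}} and {{profiler.opts}} properties into a JMH argument list. The class {{JmhArgs}}, its method and its parameter names are hypothetical and are not taken from the patch. The {{async:libPath=...;dir=...}} layout follows the ticket's wording (profiler options appended after the library path and target directory) and assumes JMH's usual {{;}}-separated profiler option syntax.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the Cassandra build: assembles JMH arguments.
final class JmhArgs {
    /**
     * Benchmark name first, then the optional async profiler definition,
     * then any extra arguments split on whitespace.
     * Pass a null profilerLibPath to run without the profiler.
     */
    static List<String> build(String benchmarkName, String extraArgs,
                              String profilerLibPath, String profilerDir, String profilerOpts) {
        List<String> args = new ArrayList<>();
        args.add(benchmarkName);
        if (profilerLibPath != null) {
            // Library path and output directory come first, user options after them.
            StringBuilder prof = new StringBuilder("async:libPath=").append(profilerLibPath)
                                                                    .append(";dir=").append(profilerDir);
            if (profilerOpts != null && !profilerOpts.isEmpty())
                prof.append(';').append(profilerOpts);
            args.add("-prof");
            args.add(prof.toString());
        }
        if (extraArgs != null && !extraArgs.trim().isEmpty()) {
            for (String arg : extraArgs.trim().split("\\s+"))
                args.add(arg);
        }
        return args;
    }
}
```

With the values from the comment, {{JmhArgs.build("SampleBench", "-gc true", null, null, null)}} yields the three arguments {{SampleBench}}, {{-gc}} and {{true}}. Splitting on whitespace is a simplification: quoted arguments containing spaces would need a real tokenizer.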
[jira] [Commented] (CASSANDRA-18786) Javadoc BigFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767398#comment-17767398 ] Stefan Miklosovic commented on CASSANDRA-18786: --- +1 > Javadoc BigFormat > - > > Key: CASSANDRA-18786 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18786 > Project: Cassandra > Issue Type: Improvement > Components: Documentation/Javadoc >Reporter: Berenguer Blasi >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 5.0.x > > Attachments: screenshot-1.png
[jira] [Commented] (CASSANDRA-18533) Move format-specific sstable options into the format configuration
[ https://issues.apache.org/jira/browse/CASSANDRA-18533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767388#comment-17767388 ] Maxwell Guo commented on CASSANDRA-18533: - [~blambov] I have assigned this to myself, as it seems to be related to CASSANDRA-18534. Is that OK? > Move format-specific sstable options into the format configuration > -- > > Key: CASSANDRA-18533 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18533 > Project: Cassandra > Issue Type: Improvement >Reporter: Branimir Lambov >Priority: Normal > > This mainly concerns cassandra yaml settings: > - {{column_index_size}}, which should also be renamed to > {{row_index_granularity}} > - {{column_index_cache_size}} > - {{index_summary_capacity}} > - {{index_summary_resize_interval}} > and possibly > - {{key_cache_size}}, {{key_cache_keys_to_save}}, {{key_cache_save_period}}, > {{key_cache_migrate_during_compaction}} > - {{sstable_preemptive_open_interval}} > Existing settings should be deprecated but still picked up if defined. > At this point we will not consider table-level options that make better sense > as format parameters ({{min/max_index_interval}}, {{bloom_filter_fp_chance}}, > {{crc_check_chance}} and possibly {{compression}}), because we do not yet > support per-table format selection/configuration.
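The requirement in CASSANDRA-18533 that existing settings be "deprecated but still picked up if defined" can be pictured as a lookup with a fallback. The sketch below is only an illustration: the class {{FormatOptionResolver}}, the two plain maps standing in for the format configuration and the top-level yaml, and the warning text are all hypothetical and do not reflect how Cassandra's configuration classes work.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration of "new name wins, deprecated name still honoured".
final class FormatOptionResolver {
    /**
     * Looks up newName in the format configuration first. If it is absent,
     * falls back to deprecatedName in the legacy yaml settings and warns.
     */
    static Optional<String> resolve(Map<String, String> formatConfig,
                                    Map<String, String> legacyYaml,
                                    String newName, String deprecatedName) {
        String value = formatConfig.get(newName);
        if (value != null)
            return Optional.of(value);

        String legacy = legacyYaml.get(deprecatedName);
        if (legacy != null) {
            System.err.println("Setting '" + deprecatedName + "' is deprecated; use '"
                               + newName + "' in the format configuration instead");
            return Optional.of(legacy);
        }
        return Optional.empty(); // caller applies the built-in default
    }
}
```

For the rename mentioned in the ticket, a call would look like {{resolve(formatConfig, yaml, "row_index_granularity", "column_index_size")}}. Letting the new name win when both are set is a design assumption here; the ticket does not say how conflicts should be handled.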
[jira] [Commented] (CASSANDRA-18707) Test failure: junit.framework.TestSuite.org.apache.cassandra.distributed.test.CASMultiDCTest-.jdk11
[ https://issues.apache.org/jira/browse/CASSANDRA-18707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767386#comment-17767386 ] Berenguer Blasi commented on CASSANDRA-18707: - That is clear. But it doesn't repro bc imo you're trying to repro a jenkins env freeze just during schema agreement. No amount of added logging will make a difference bc we have no insight into that. So I am with Brandon I don't know what else to do here. I'd merge and if it happens again maybe sbdy else sees sthg... :shrug: > Test failure: > junit.framework.TestSuite.org.apache.cassandra.distributed.test.CASMultiDCTest-.jdk11 > > > > Key: CASSANDRA-18707 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18707 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/java >Reporter: Ekaterina Dimitrova >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x > > Attachments: TESTS-TestSuites.xml.xz > > > Seen here: > [https://ci-cassandra.apache.org/job/Cassandra-trunk/1650/testReport/junit.framework/TestSuite/org_apache_cassandra_distributed_test_CASMultiDCTest__jdk11/] > h3. > {code:java} > Error Message > Schema agreement not reached. Schema versions of the instances: > [ef1c8e05-a06d-388d-a46d-53cc22a94762, 6c386108-1805-3985-b48e-8016012a0207, > 6c386108-1805-3985-b48e-8016012a0207, ef1c8e05-a06d-388d-a46d-53cc22a94762] > Stacktrace > java.lang.IllegalStateException: Schema agreement not reached. 
Schema > versions of the instances: [ef1c8e05-a06d-388d-a46d-53cc22a94762, > 6c386108-1805-3985-b48e-8016012a0207, 6c386108-1805-3985-b48e-8016012a0207, > ef1c8e05-a06d-388d-a46d-53cc22a94762] at > org.apache.cassandra.distributed.impl.AbstractCluster$ChangeMonitor.waitForCompletion(AbstractCluster.java:907) > at > org.apache.cassandra.distributed.impl.AbstractCluster.lambda$schemaChange$8(AbstractCluster.java:836) > at org.apache.cassandra.concurrent.FutureTask$1.call(FutureTask.java:96) at > org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61) at > org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71) at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.base/java.lang.Thread.run(Thread.java:829) > {code}
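The error message in CASSANDRA-18707 lists four instances split evenly across two schema versions. As a rough illustration of what "agreement" means in that message, the sketch below checks whether all reported versions are identical and counts the instances per version. It is not the in-JVM dtest framework's {{ChangeMonitor}} logic; {{SchemaAgreement}} and its methods are hypothetical names.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical helper: agreement means every instance reports the same version.
final class SchemaAgreement {
    /** True when at most one distinct version is present (an empty input counts as agreed). */
    static boolean reached(Collection<UUID> versions) {
        return new HashSet<>(versions).size() <= 1;
    }

    /** Number of instances per schema version, in first-seen order. */
    static Map<UUID, Integer> countByVersion(Collection<UUID> versions) {
        Map<UUID, Integer> counts = new LinkedHashMap<>();
        for (UUID version : versions)
            counts.merge(version, 1, Integer::sum);
        return counts;
    }
}
```

Feeding in the four UUIDs from the stack trace gives {{reached(...) == false}} and a count of two instances for each version, which matches the even split reported in the failure.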
[jira] [Updated] (CASSANDRA-17674) Test failure: org.apache.cassandra.distributed.upgrade.DropCompactStorageTest.testDropCompactStorage
[ https://issues.apache.org/jira/browse/CASSANDRA-17674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Berenguer Blasi updated CASSANDRA-17674: Status: In Progress (was: Changes Suggested) > Test failure: > org.apache.cassandra.distributed.upgrade.DropCompactStorageTest.testDropCompactStorage > > > Key: CASSANDRA-17674 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17674 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/java >Reporter: Andres de la Peña >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0.x, 5.x > > > The Java upgrade dtest > {{org.apache.cassandra.distributed.upgrade.DropCompactStorageTest.testDropCompactStorage}} > is ~68% flaky on 4.0 and ~2% flaky on trunk, at least in CircleCI: > * 4.0: > [https://app.circleci.com/pipelines/github/adelapena/cassandra/1622/workflows/0086c3b1-a552-4c7a-8278-2f759cee5bdf/jobs/17288] > * trunk: > [https://app.circleci.com/pipelines/github/adelapena/cassandra/1624/workflows/c4ce2b95-998f-459b-830e-8e3fa6637e15/jobs/17293] > The error for 4.0 is: > {code:java} > java.util.concurrent.ExecutionException: java.lang.RuntimeException: > java.util.concurrent.ExecutionException: java.lang.RuntimeException: > java.util.concurrent.TimeoutException: > org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor@6fa17524[Shutting > down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = > 218] did not terminate on time > at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) > at > org.apache.cassandra.distributed.upgrade.UpgradeTestBase$TestCase.run(UpgradeTestBase.java:227) > at > org.apache.cassandra.distributed.upgrade.DropCompactStorageTest.testDropCompactStorage(DropCompactStorageTest.java:49) > Caused by: java.lang.RuntimeException: > java.util.concurrent.ExecutionException: java.lang.RuntimeException: > java.util.concurrent.TimeoutException: > 
org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor@6fa17524[Shutting > down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = > 218] did not terminate on time > at > org.apache.cassandra.distributed.impl.IsolatedExecutor$ThrowingRunnable.lambda$toRunnable$0(IsolatedExecutor.java:235) > at > java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:83) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: java.util.concurrent.TimeoutException: > org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor@6fa17524[Shutting > down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = > 218] did not terminate on time > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.cassandra.distributed.impl.IsolatedExecutor$ThrowingRunnable.lambda$toRunnable$0(IsolatedExecutor.java:231) > Caused by: java.lang.RuntimeException: java.util.concurrent.TimeoutException: > org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor@6fa17524[Shutting > down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = > 218] did not terminate on time > at org.apache.cassandra.utils.Throwables.maybeFail(Throwables.java:62) > at > org.apache.cassandra.distributed.impl.Instance.lambda$shutdown$28(Instance.java:810) > at > org.apache.cassandra.distributed.impl.IsolatedExecutor.lambda$null$8(IsolatedExecutor.java:114) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > Caused 
by: java.util.concurrent.TimeoutException: > org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor@6fa17524[Shutting > down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = > 218] did not terminate on time > at > org.apache.cassandra.utils.ExecutorUtils.awaitTerminationUntil(ExecutorUtils.java:107) > at > org.apache.cassandra.utils.ExecutorUtils.awaitTermination(ExecutorUtils.java:96) > at > org.apache.cassandra.utils.ExecutorUtils.shutdownNowAndWait(ExecutorUtils.java:139) > at > org.apache.cassandra.concurrent.StageMana
[jira] [Updated] (CASSANDRA-17674) Test failure: org.apache.cassandra.distributed.upgrade.DropCompactStorageTest.testDropCompactStorage
[ https://issues.apache.org/jira/browse/CASSANDRA-17674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Berenguer Blasi updated CASSANDRA-17674: Fix Version/s: 4.1.x (was: 5.x) > Test failure: > org.apache.cassandra.distributed.upgrade.DropCompactStorageTest.testDropCompactStorage > > > Key: CASSANDRA-17674 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17674 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/java >Reporter: Andres de la Peña >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0.x, 4.1.x
[jira] [Commented] (CASSANDRA-17674) Test failure: org.apache.cassandra.distributed.upgrade.DropCompactStorageTest.testDropCompactStorage
[ https://issues.apache.org/jira/browse/CASSANDRA-17674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767377#comment-17767377 ] Berenguer Blasi commented on CASSANDRA-17674: - Ok yes that's what I was getting at, this is no 5.0 blocker imo as well. > Test failure: > org.apache.cassandra.distributed.upgrade.DropCompactStorageTest.testDropCompactStorage > > > Key: CASSANDRA-17674 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17674 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/java >Reporter: Andres de la Peña >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0.x, 5.x
[jira] [Commented] (CASSANDRA-18786) Javadoc BigFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767375#comment-17767375 ] Berenguer Blasi commented on CASSANDRA-18786: - ^Yep I know. But all I am doing is explaining the code by referencing its internals. It's not like we're generating documentation for those private fields/methods :shrug: > Javadoc BigFormat > - > > Key: CASSANDRA-18786 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18786 > Project: Cassandra > Issue Type: Improvement > Components: Documentation/Javadoc >Reporter: Berenguer Blasi >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 5.0.x > > Attachments: screenshot-1.png
[jira] [Commented] (CASSANDRA-18870) Invalid ut check for CreateTableValidationTest
[ https://issues.apache.org/jira/browse/CASSANDRA-18870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767365#comment-17767365 ] Maxwell Guo commented on CASSANDRA-18870: - |pr|[trunk|https://github.com/apache/cassandra/pull/2711]| |j17|[pre-commit|https://app.circleci.com/pipelines/github/Maxwell-Guo/cassandra/491/workflows/3ba451c5-8abf-4648-8acd-edf79c6843c0]| |j11|[pre-commit|https://app.circleci.com/pipelines/github/Maxwell-Guo/cassandra/491/workflows/a4f215c0-f2c7-4a78-80f7-772a4ac94f3b]| [~brandon.williams] yes, you are right, the commits still exist, so I updated my repositories > Invalid ut check for CreateTableValidationTest > -- > > Key: CASSANDRA-18870 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18870 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Maxwell Guo >Assignee: Maxwell Guo >Priority: Normal > Fix For: 5.x > > Time Spent: 10m > Remaining Estimate: 0h > > https://github.com/apache/cassandra/blob/trunk/test/unit/org/apache/cassandra/schema/CreateTableValidationTest.java#L40 > This test case testInvalidBloomFilterFPRatio() makes no sense and cannot > achieve the desired purpose.
[jira] [Commented] (CASSANDRA-18845) Waiting for gossip to settle on live endpoints
[ https://issues.apache.org/jira/browse/CASSANDRA-18845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767361#comment-17767361 ] Cameron Zemek commented on CASSANDRA-18845: --- {noformat} Sep 21 03:01:42 ip-10-1-32-228 cassandra[52927]: INFO org.apache.cassandra.gms.Gossiper Waiting for gossip to settle... Sep 21 03:01:48 ip-10-1-32-228 cassandra[52927]: INFO org.apache.cassandra.gms.Gossiper Gossip looks settled. epSize=108 Sep 21 03:01:49 ip-10-1-32-228 cassandra[52927]: INFO org.apache.cassandra.gms.Gossiper Gossip looks settled. epSize=108 Sep 21 03:01:50 ip-10-1-32-228 cassandra[52927]: INFO org.apache.cassandra.gms.Gossiper Gossip looks settled. epSize=108 Sep 21 03:02:00 ip-10-1-32-228 cassandra[52927]: INFO o.a.c.gms.GossipDigestAckVerbHandler Received a GossipDigestAckMessage from /15.223.140.86 Sep 21 03:02:00 ip-10-1-32-228 cassandra[52927]: INFO org.apache.cassandra.gms.Gossiper Sending a EchoMessage to /44.229.153.229 ... Sep 21 03:03:40 ip-10-1-32-228 cassandra[52927]: INFO org.apache.cassandra.gms.Gossiper InetAddress /44.229.153.229 is now UP{noformat} Got a test run with 18 second delay. > Waiting for gossip to settle on live endpoints > -- > > Key: CASSANDRA-18845 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18845 > Project: Cassandra > Issue Type: Improvement >Reporter: Cameron Zemek >Priority: Normal > Attachments: 18845-seperate.patch, delay.log, example.log, > image-2023-09-14-11-16-23-020.png, stream.log, test1.log, test2.log, test3.log > > > This is a follow up to CASSANDRA-18543 > Although that ticket added ability to set cassandra.gossip_settle_min_wait_ms > this is tedious and error prone. On a node just observed a 79 second gap > between waiting for gossip and the first echo response to indicate a node is > UP. 
> The problem being that do not want to start Native Transport until gossip
> settles otherwise queries can fail consistency such as LOCAL_QUORUM as it
> thinks the replicas are still in DOWN state.
> Instead of having to set gossip_settle_min_wait_ms I am proposing that
> (outside single node cluster) wait for UP message from another node before
> considering gossip as settled. Eg.
> {code:java}
> if (currentSize == epSize && currentLive == liveSize && liveSize > 1)
> {
>     logger.debug("Gossip looks settled.");
>     numOkay++;
> } {code}
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
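The proposed condition can be tried out in isolation. The following is only a minimal sketch, not the Gossiper code: the class name, the `pollsUntilSettled` helper, the `{endpointCount, liveCount}` sample layout and the reset-on-mismatch behaviour are assumptions made for illustration. Only the condition itself comes from the ticket.

```java
// Hypothetical sketch of the settle check proposed in the ticket.
// Nothing here is taken from org.apache.cassandra.gms.Gossiper.
final class GossipSettleSketch
{
    // The condition from the ticket: endpoint and live counts are unchanged
    // since the previous poll, and more than one endpoint is live.
    static boolean looksSettled(int currentSize, int epSize, int currentLive, int liveSize)
    {
        return currentSize == epSize && currentLive == liveSize && liveSize > 1;
    }

    // Walks a series of polls, each given as {endpointCount, liveCount}, and
    // returns the index of the poll at which `required` consecutive settled
    // polls were reached, or -1 if that never happens.
    // Assumption: a poll that does not look settled resets the counter.
    static int pollsUntilSettled(int[][] samples, int required)
    {
        int numOkay = 0;
        int epSize = -1;
        int liveSize = -1;
        for (int i = 0; i < samples.length; i++)
        {
            int currentSize = samples[i][0];
            int currentLive = samples[i][1];
            if (looksSettled(currentSize, epSize, currentLive, liveSize))
                numOkay++;
            else
                numOkay = 0;
            epSize = currentSize;
            liveSize = currentLive;
            if (numOkay == required)
                return i;
        }
        return -1;
    }
}
```

With samples where the live count stays at 1 for a few polls and then rises, the counter only starts once the live count is stable above 1. A single node cluster never settles under this sketch, which is why the ticket excludes that case.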
[jira] [Comment Edited] (CASSANDRA-18845) Waiting for gossip to settle on live endpoints
[ https://issues.apache.org/jira/browse/CASSANDRA-18845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767358#comment-17767358 ]
Cameron Zemek edited comment on CASSANDRA-18845 at 9/21/23 2:59 AM:
{noformat}
Sep 19 08:09:45 ip-10-1-57-23 cassandra[131402]: INFO org.apache.cassandra.gms.Gossiper Waiting for gossip to settle...
Sep 19 08:10:56 ip-10-1-57-23 cassandra[131402]: DEBUG org.apache.cassandra.gms.Gossiper Sending a EchoMessage to /35.83.14.80
{noformat}
I am struggling to reproduce this ^ I have seen it twice, and after enabling more logging I haven't been able to reproduce it again. What I do sometimes see, though, is it taking over 30 seconds to get the first ECHO response.
Since there are dtests that rely on having CQL up while nodes are down, I have attached a patch [^18845-seperate.patch] (against 5.0 branch) that is opt-in. Having settle just check for currentLive == liveSize still allows NTR to start while nodes are marked down. Yes, you can increase cassandra.gossip_settle_poll_success_required (and/or the other properties) to mitigate it, but these increase the minimum startup time, whereas [^18845-seperate.patch] doesn't add to this when the cluster is healthy.
A more elaborate solution would be to specify the required consistency level and, for all token ranges owned by the node, check whether there are enough live endpoints to satisfy that consistency level.
was (Author: cam1982):
{noformat}
Sep 19 08:09:45 ip-10-1-57-23 cassandra[131402]: INFO org.apache.cassandra.gms.Gossiper Waiting for gossip to settle...
Sep 19 08:10:56 ip-10-1-57-23 cassandra[131402]: DEBUG org.apache.cassandra.gms.Gossiper Sending a EchoMessage to /35.83.14.80
{noformat}
I am struggling to reproduce this ^ I seen it twice, and after enabling more logging haven't been able to reproduce again. What I do sometimes see though it taking over 30 seconds to get the first ECHO response.
Since there are dtests that rely on having CQL up while nodes are down, I have attached a patch [^18845-seperate.patch] (against 5.0 branch) that is opt-in. Having settle just check for currentLive == liveSize is still allowing NTR to start while nodes are marked down. Yes you can increase cassandra.gossip_settle_poll_success_required (and/or the other properties) to mitigate it but these increase the minimum startup time. Whereas [^18845-seperate.patch] doesn't add to this when the cluster is healthy. A more elaborate solution would be to specify the required consistency level. And for all token ranges owned by the node you check if you have the needed live endpoints to satisfy the consistency level. > Waiting for gossip to settle on live endpoints > -- > > Key: CASSANDRA-18845 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18845 > Project: Cassandra > Issue Type: Improvement >Reporter: Cameron Zemek >Priority: Normal > Attachments: 18845-seperate.patch, delay.log, example.log, > image-2023-09-14-11-16-23-020.png, stream.log, test1.log, test2.log, test3.log > > > This is a follow up to CASSANDRA-18543 > Although that ticket added ability to set cassandra.gossip_settle_min_wait_ms > this is tedious and error prone. On a node just observed a 79 second gap > between waiting for gossip and the first echo response to indicate a node is > UP. > The problem being that do not want to start Native Transport until gossip > settles otherwise queries can fail consistency such as LOCAL_QUORUM as it > thinks the replicas are still in DOWN state. > Instead of having to set gossip_settle_min_wait_ms I am proposing that > (outside single node cluster) wait for UP message from another node before > considering gossip as settled. Eg. 
> {code:java} > if (currentSize == epSize && currentLive == liveSize && liveSize > > 1) > { > logger.debug("Gossip looks settled."); > numOkay++; > } {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
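The "more elaborate solution" mentioned in the comment (check, for every token range the node owns, whether enough replicas are live for a required consistency level) might look roughly like this. It is only a sketch under stated assumptions: ranges and endpoints are plain strings, the replica map is handed in by the caller, and `quorum` uses the usual RF / 2 + 1 rule. None of these names exist in Cassandra.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: decide whether the live endpoints seen so far are
// enough to serve every owned token range at a given consistency level.
final class ConsistencySettleSketch
{
    // Number of replicas a quorum needs for a given replication factor.
    static int quorum(int replicationFactor)
    {
        return replicationFactor / 2 + 1;
    }

    // replicasByRange: token range (as an opaque label) -> replica endpoints.
    // live: endpoints currently marked UP.
    // required: replicas that must be live per range, e.g. quorum(3).
    static boolean rangesSatisfied(Map<String, List<String>> replicasByRange,
                                   Set<String> live,
                                   int required)
    {
        for (List<String> replicas : replicasByRange.values())
        {
            long liveReplicas = replicas.stream().filter(live::contains).count();
            if (liveReplicas < required)
                return false; // this range cannot meet the consistency level yet
        }
        return true;
    }
}
```

A settle loop could then wait until `rangesSatisfied(...)` returns true instead of counting stable polls, at the cost of needing the replica placement before Native Transport starts.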
[jira] [Updated] (CASSANDRA-18845) Waiting for gossip to settle on live endpoints
[ https://issues.apache.org/jira/browse/CASSANDRA-18845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cameron Zemek updated CASSANDRA-18845: -- Attachment: (was: 18845-seperate.patch) > Waiting for gossip to settle on live endpoints > -- > > Key: CASSANDRA-18845 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18845 > Project: Cassandra > Issue Type: Improvement >Reporter: Cameron Zemek >Priority: Normal > Attachments: 18845-seperate.patch, delay.log, example.log, > image-2023-09-14-11-16-23-020.png, stream.log, test1.log, test2.log, test3.log > > > This is a follow up to CASSANDRA-18543 > Although that ticket added ability to set cassandra.gossip_settle_min_wait_ms > this is tedious and error prone. On a node just observed a 79 second gap > between waiting for gossip and the first echo response to indicate a node is > UP. > The problem being that do not want to start Native Transport until gossip > settles otherwise queries can fail consistency such as LOCAL_QUORUM as it > thinks the replicas are still in DOWN state. > Instead of having to set gossip_settle_min_wait_ms I am proposing that > (outside single node cluster) wait for UP message from another node before > considering gossip as settled. Eg. > {code:java} > if (currentSize == epSize && currentLive == liveSize && liveSize > > 1) > { > logger.debug("Gossip looks settled."); > numOkay++; > } {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-18845) Waiting for gossip to settle on live endpoints
[ https://issues.apache.org/jira/browse/CASSANDRA-18845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cameron Zemek updated CASSANDRA-18845: -- Attachment: 18845-seperate.patch > Waiting for gossip to settle on live endpoints > -- > > Key: CASSANDRA-18845 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18845 > Project: Cassandra > Issue Type: Improvement >Reporter: Cameron Zemek >Priority: Normal > Attachments: 18845-seperate.patch, delay.log, example.log, > image-2023-09-14-11-16-23-020.png, stream.log, test1.log, test2.log, test3.log > > > This is a follow up to CASSANDRA-18543 > Although that ticket added ability to set cassandra.gossip_settle_min_wait_ms > this is tedious and error prone. On a node just observed a 79 second gap > between waiting for gossip and the first echo response to indicate a node is > UP. > The problem being that do not want to start Native Transport until gossip > settles otherwise queries can fail consistency such as LOCAL_QUORUM as it > thinks the replicas are still in DOWN state. > Instead of having to set gossip_settle_min_wait_ms I am proposing that > (outside single node cluster) wait for UP message from another node before > considering gossip as settled. Eg. > {code:java} > if (currentSize == epSize && currentLive == liveSize && liveSize > > 1) > { > logger.debug("Gossip looks settled."); > numOkay++; > } {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-18681) Internode legacy SSL storage port certificate is not hot reloaded on update
[ https://issues.apache.org/jira/browse/CASSANDRA-18681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francisco Guerrero updated CASSANDRA-18681: --- Reviewers: Dinesh Joshi, Francisco Guerrero (was: Dinesh Joshi) > Internode legacy SSL storage port certificate is not hot reloaded on update > --- > > Key: CASSANDRA-18681 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18681 > Project: Cassandra > Issue Type: Bug > Components: Messaging/Internode >Reporter: Jon Meredith >Assignee: Jon Meredith >Priority: Normal > > In CASSANDRA-1 the SSLContext cache was changed to clear individual > {{EncryptionOptions}} from the SslContext cache if they needed reloading to > reduce resource consumption. Before the change if ANY cert needed hot > reloading, the SSLContext cache would be cleared for ALL certs. > If the legacy SSL storage port is configured, a new {{EncryptionOptions}} > object is created in {{org.apache.cassandra.net.InboundSockets#addBindings}} > just for binding the socket, but never gets cleared as the change in port > means it no longer matches the configuration retrieved from > {{DatabaseDescriptor}} in > {{org.apache.cassandra.net.MessagingServiceMBeanImpl#reloadSslCertificates}}. > This is unlikely to be an issue in practice as the legacy SSL internode > socket is only used in mixed version clusters with pre-4.0 nodes, so the cert > only needs to stay valid until all nodes upgrade to 4.x or above. > One way to avoid this class of failures is to just check the entries present > in the SSLContext cache. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[GitHub] [cassandra-analytics] 5 commented on a diff in pull request #15: [CASSANDRA-18810] Cassandra Analytics Start-Up Validation
5 commented on code in PR #15: URL: https://github.com/apache/cassandra-analytics/pull/15#discussion_r1332236438 ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/bulkwriter/CassandraContext.java: ## @@ -96,4 +100,14 @@ protected BulkSparkConf conf() { return conf; } + +// Startup Validation + +@Override +public void startupValidate() +{ +StartupValidator.instance().register(new SidecarValidation(sidecarClient)); +StartupValidator.instance().register(new CassandraValidation(sidecarClient)); +StartupValidator.instance().perform(); Review Comment: Also, very easy to switch to parallel execution of validations, if ever feel the need to do so in the future… -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[GitHub] [cassandra-analytics] 5 commented on a diff in pull request #15: [CASSANDRA-18810] Cassandra Analytics Start-Up Validation
5 commented on code in PR #15: URL: https://github.com/apache/cassandra-analytics/pull/15#discussion_r1332213687 ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/bulkwriter/CassandraContext.java: ## @@ -96,4 +100,14 @@ protected BulkSparkConf conf() { return conf; } + +// Startup Validation + +@Override +public void startupValidate() +{ +StartupValidator.instance().register(new SidecarValidation(sidecarClient)); +StartupValidator.instance().register(new CassandraValidation(sidecarClient)); +StartupValidator.instance().perform(); Review Comment: Current implementation serves several purposes: first, we get all validation logic grouped together and easy to discover (unlike having many separate validation calls in many different places); second, we register additional validations at other places as well (see SSL/KeyStore/TrustStore validations, for example); third, we enable library users to define and register their own validations (which we ourselves use in our internal version of the library); also, as @frankgh pointed out below, `StartupValidator` takes care of skipping validations when instructed to do so with an environment variable. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
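To make the design described in this comment concrete, here is a minimal sketch of a registry that collects validations, runs them in one place, and can be told to skip them. It is not the `StartupValidator` from the pull request: the class name, the `Validation` interface, the `fromEnvironment` factory and the `SKIP_STARTUP_VALIDATIONS` variable name are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a startup validation registry.
final class StartupValidatorSketch
{
    // A single check that throws if the environment is not usable.
    interface Validation
    {
        void perform();
    }

    private final List<Validation> validations = new ArrayList<>();
    private final boolean skip;

    StartupValidatorSketch(boolean skip)
    {
        this.skip = skip;
    }

    // SKIP_STARTUP_VALIDATIONS is an invented variable name, used only here.
    static StartupValidatorSketch fromEnvironment()
    {
        String value = System.getenv("SKIP_STARTUP_VALIDATIONS");
        return new StartupValidatorSketch("true".equalsIgnoreCase(value));
    }

    // Library code and library users register checks through the same method.
    void register(Validation validation)
    {
        validations.add(validation);
    }

    // Runs every registered check in registration order unless skipping was
    // requested; returns how many checks actually ran.
    int perform()
    {
        if (skip)
            return 0;
        int ran = 0;
        for (Validation validation : validations)
        {
            validation.perform();
            ran++;
        }
        return ran;
    }
}
```

Keeping registration and execution separate is what lets callers add their own checks, and it leaves room to run the loop in parallel later without touching the call sites.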
[GitHub] [cassandra-analytics] 5 commented on a diff in pull request #15: [CASSANDRA-18810] Cassandra Analytics Start-Up Validation
5 commented on code in PR #15: URL: https://github.com/apache/cassandra-analytics/pull/15#discussion_r1332210699 ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/CassandraDataLayer.java: ## @@ -639,6 +644,16 @@ public CassandraRing createCassandraRingFromRing(Partitioner partitioner, return new CassandraRing(partitioner, keyspace, replicationFactor, instances); } +// Startup Validation + +@Override +public void startupValidate() +{ +StartupValidator.instance().register(new SidecarValidation(sidecar)); +StartupValidator.instance().register(new CassandraValidation(sidecar)); +StartupValidator.instance().perform(); +} Review Comment: (See my reply to the previous comment.) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[GitHub] [cassandra-analytics] 5 commented on a diff in pull request #15: [CASSANDRA-18810] Cassandra Analytics Start-Up Validation
5 commented on code in PR #15: URL: https://github.com/apache/cassandra-analytics/pull/15#discussion_r1332210070 ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/validation/SidecarValidation.java: ## @@ -0,0 +1,58 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.cassandra.spark.validation; + +import java.util.concurrent.TimeUnit; + +import org.apache.cassandra.sidecar.client.SidecarClient; +import org.apache.cassandra.sidecar.common.data.HealthResponse; + +/** + * A startup validation that checks the connectivity and health of Sidecar + */ +public class SidecarValidation implements StartupValidation +{ +private static final int TIMEOUT_SECONDS = 30; Review Comment: I think 30 seconds is a reasonable value for a timeout, it is also used in a few other places without being configurable. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
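For readers unfamiliar with how such a fixed timeout is typically applied, a bounded wait on an asynchronous health check can be sketched as below. This is an assumption about the general shape only: the class and method names are invented, the health result is modelled as a plain `CompletableFuture<Boolean>`, and none of it reflects the Sidecar client API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: wait for a health check, but never longer than a
// fixed timeout such as the 30 seconds discussed in the review.
final class TimeoutCheckSketch
{
    static final int TIMEOUT_SECONDS = 30;

    // Returns true only if the future completes with TRUE within the timeout.
    static boolean isHealthy(CompletableFuture<Boolean> health, long timeout, TimeUnit unit)
    {
        try
        {
            return Boolean.TRUE.equals(health.get(timeout, unit));
        }
        catch (TimeoutException | ExecutionException e)
        {
            return false; // no answer in time, or the check itself failed
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }
}
```

A caller would pass `TimeoutCheckSketch.TIMEOUT_SECONDS` and `TimeUnit.SECONDS`; making the value configurable would only change where the number comes from.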
[GitHub] [cassandra-analytics] 5 commented on a diff in pull request #15: [CASSANDRA-18810] Cassandra Analytics Start-Up Validation
5 commented on code in PR #15: URL: https://github.com/apache/cassandra-analytics/pull/15#discussion_r1332209521 ## cassandra-analytics-core/src/test/java/org/apache/cassandra/spark/validation/TrustStoreValidationTests.java: ## @@ -0,0 +1,94 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +package org.apache.cassandra.spark.validation; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import org.apache.cassandra.secrets.SecretsProvider; +import org.apache.cassandra.secrets.TestSecretsProvider; + +/** + * Unit tests that cover startup validation of a TrustStore + */ +public class TrustStoreValidationTests +{ +@Test() +public void testNullSecrets() +{ +TrustStoreValidation validation = new TrustStoreValidation(null); + +Assertions.assertThrows(RuntimeException.class, validation::perform); +} + +@Test() +public void testUnconfiguredTrustStore() +{ +SecretsProvider secrets = TestSecretsProvider.unconfigured(); +TrustStoreValidation validation = new TrustStoreValidation(secrets); + +Assertions.assertDoesNotThrow(validation::perform); // TrustStore is optional +} + +@Test() +public void testMissingTrustStore() +{ +SecretsProvider secrets = TestSecretsProvider.forTrustStore("PKCS12", "keystore-missing.p12", "qwerty"); Review Comment: In this case I produced the test files manually using a free and open-source tool called Java KeyStore Explorer (https://keystore-explorer.org/), don't think they will ever need to be modified, so there's nothing to describe, probably. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[GitHub] [cassandra-analytics] 5 commented on a diff in pull request #15: [CASSANDRA-18810] Cassandra Analytics Start-Up Validation
5 commented on code in PR #15: URL: https://github.com/apache/cassandra-analytics/pull/15#discussion_r1332208068 ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/validation/StartupValidatable.java: ## @@ -0,0 +1,25 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.cassandra.spark.validation; + +public interface StartupValidatable Review Comment: Added JavaDocs for both this interface and its only method. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-17062) Expose Auth Caches metrics to virtual table
[ https://issues.apache.org/jira/browse/CASSANDRA-17062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Miklosovic updated CASSANDRA-17062: -- Summary: Expose Auth Caches metrics to virtual table (was: Expose Auth Caches metrics) > Expose Auth Caches metrics to virtual table > --- > > Key: CASSANDRA-17062 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17062 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Virtual Tables, Observability/Metrics, > Tool/nodetool >Reporter: Aleksei Zotov >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 5.x > > Time Spent: 10m > Remaining Estimate: 0h > > Unlike other caches (row, key, counter), Auth Caches lack some monitoring > capabilities. Here are a few particular changes to fix this gap: > # Add auth caches to _system_views.caches_ VT > # Expose auth caches metrics via JMX > # Add auth caches details to _nodetool info_ > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-17062) Expose Auth Caches metrics
[ https://issues.apache.org/jira/browse/CASSANDRA-17062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767275#comment-17767275 ] Stefan Miklosovic edited comment on CASSANDRA-17062 at 9/20/23 8:28 PM: I moved this forward here (1). We got new auth caches in the meantime, the CIDR auth cache and the Identity cache from the Mutual TLS authenticator (both only if enabled), so I exposed them in the CQL virtual table too, if configured. The original patch was also propagating these metrics to nodetool info, because nodetool info currently lists caches like key / row / counter / network cache, but I do not like that we are exposing them in this nodetool command. It just does not feel natural. I think it is better to have them only in cqlsh, so I can remove them if we agree on that. It would be great to have some review! ([~samt] I know you are busy ...) [~bereng] I see you among the watchers, maybe you would give it a shot? [~paulo] likewise. [~yifanc] [~jonmeredith] maybe you guys would find this handy to have in trunk? I see you were reviewing the Mutual TLS stuff. 
{code} $ ./bin/nodetool info ID : c8be5408-d1b1-433f-b124-6b7a8c76b1bb Gossip active: true Native Transport active : true Load : 252.6 KiB Uncompressed load: 332.68 KiB Generation No: 1695240794 Uptime (seconds) : 42 Heap Memory (MB) : 243.66 / 989.88 Off Heap Memory (MB) : 0.00 Data Center : datacenter1 Rack : rack1 Exceptions : 0 Key Cache: entries 11, size 984 bytes, capacity 46 MiB, 104 hits, 118 requests, 0.881 recent hit rate, 14400 save period in seconds Row Cache: entries 0, size 0 bytes, capacity 0 bytes, 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds Counter Cache: entries 0, size 0 bytes, capacity 23 MiB, 0 hits, 0 requests, NaN recent hit rate, 7200 save period in seconds Network Cache: size 8 MiB, overflow size: 0 bytes, capacity 61 MiB Credentials Cache: entries 1, capacity 1000, 1 hits, 2 requests, 0.500 recent hit rate JMX Permissions Cache: entries 0, capacity 1000, 0 hits, 0 requests, NaN recent hit rate Network Permissions Cache: entries 1, capacity 1000, 30 hits, 32 requests, 0.938 recent hit rate Permissions Cache: entries 1, capacity 1000, 2 hits, 17 requests, 0.118 recent hit rate Roles Cache : entries 1, capacity 1000, 37 hits, 39 requests, 0.949 recent hit rate CIDR Permissions Cache : entries 1, capacity 1000, 30 hits, 32 requests, 0.938 recent hit rate Percent Repaired : 100.0% Token: (invoke with -T/--tokens to see all 16 tokens) Bootstrap state : COMPLETED Bootstrap failed : false Decommissioning : false Decommission failed : false {code} (1) https://github.com/apache/cassandra/pull/2710 was (Author: smiklosovic): I moved this forward here (1) We got new auth caches in the meanwhile - CIDR auth cache and Identity cache from Mutual TLS authenticator (both if enabled) so I exposed them too if configured. The original patch was also propagating these metrics to nodetool info because there are currently caches like key / row / counter / network cache but I do not like that we are exposing it in this nodetool command. 
It just does not feel natural. I think that it is just better to have it only in cqlsh so I can remove them if we agree on that. It would be great to have some review! ([~samt] I know you are busy ...) [~bereng] I see you among watchers, maybe you would give it a shot? [~paulo] likewise. [~yifanc] [~jonmeredith] maybe you guys would find this handy to have in trunk? I see you were reviewing Mutual TLS stuff. {code} $ ./bin/nodetool info ID : c8be5408-d1b1-433f-b124-6b7a8c76b1bb Gossip active: true Native Transport active : true Load : 252.6 KiB Uncompressed load: 332.68 KiB Generation No: 1695240794 Uptime (seconds) : 42 Heap Memory (MB) : 243.66 / 989.88 Off Heap Memory (MB) : 0.00 Data Center : datacenter1 Rack : rack1 Exceptions : 0 Key Cache: entries 11, size 984 bytes, capacity 46 MiB, 104 hits, 118 requests, 0.881 recent hit rate, 14400 save period in seconds Row Cache: entries 0, size 0 bytes, capacity 0 bytes, 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds Counter Cache: entries 0, size 0 bytes, capacity 23 MiB, 0 hits, 0 requests, NaN recent hit rate, 7200 save period in seconds Network Cache: size 8 MiB, overflow size: 0 bytes, capacity 61 MiB Credentials Cache: entries 1, capacity 1000, 1 hits, 2 requests, 0.500 recent hit rate JMX Permissions C
[jira] [Comment Edited] (CASSANDRA-17062) Expose Auth Caches metrics
[ https://issues.apache.org/jira/browse/CASSANDRA-17062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767275#comment-17767275 ] Stefan Miklosovic edited comment on CASSANDRA-17062 at 9/20/23 8:27 PM: I moved this forward here (1). We got new auth caches in the meantime, the CIDR auth cache and the Identity cache from the Mutual TLS authenticator (both only if enabled), so I exposed them too, if configured. The original patch was also propagating these metrics to nodetool info, because nodetool info currently lists caches like key / row / counter / network cache, but I do not like that we are exposing them in this nodetool command. It just does not feel natural. I think it is better to have them only in cqlsh, so I can remove them if we agree on that. It would be great to have some review! ([~samt] I know you are busy ...) [~bereng] I see you among the watchers, maybe you would give it a shot? [~paulo] likewise. [~yifanc] [~jonmeredith] maybe you guys would find this handy to have in trunk? I see you were reviewing the Mutual TLS stuff. 
{code} $ ./bin/nodetool info ID : c8be5408-d1b1-433f-b124-6b7a8c76b1bb Gossip active: true Native Transport active : true Load : 252.6 KiB Uncompressed load: 332.68 KiB Generation No: 1695240794 Uptime (seconds) : 42 Heap Memory (MB) : 243.66 / 989.88 Off Heap Memory (MB) : 0.00 Data Center : datacenter1 Rack : rack1 Exceptions : 0 Key Cache: entries 11, size 984 bytes, capacity 46 MiB, 104 hits, 118 requests, 0.881 recent hit rate, 14400 save period in seconds Row Cache: entries 0, size 0 bytes, capacity 0 bytes, 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds Counter Cache: entries 0, size 0 bytes, capacity 23 MiB, 0 hits, 0 requests, NaN recent hit rate, 7200 save period in seconds Network Cache: size 8 MiB, overflow size: 0 bytes, capacity 61 MiB Credentials Cache: entries 1, capacity 1000, 1 hits, 2 requests, 0.500 recent hit rate JMX Permissions Cache: entries 0, capacity 1000, 0 hits, 0 requests, NaN recent hit rate Network Permissions Cache: entries 1, capacity 1000, 30 hits, 32 requests, 0.938 recent hit rate Permissions Cache: entries 1, capacity 1000, 2 hits, 17 requests, 0.118 recent hit rate Roles Cache : entries 1, capacity 1000, 37 hits, 39 requests, 0.949 recent hit rate CIDR Permissions Cache : entries 1, capacity 1000, 30 hits, 32 requests, 0.938 recent hit rate Percent Repaired : 100.0% Token: (invoke with -T/--tokens to see all 16 tokens) Bootstrap state : COMPLETED Bootstrap failed : false Decommissioning : false Decommission failed : false {code} (1) https://github.com/apache/cassandra/pull/2710 was (Author: smiklosovic): I moved this forward here (1) We got new auth caches in the meanwhile - CIDR auth cache and Identity cache from Mutual TLS authenticator (both if enabled) so I exposed them too if configured. The original patch was also propagating these metrics to nodetool info because there are currently caches like key / row / counter / network cache but I do not like that we are exposing it in this nodetool command. 
It just does not feel natural. I think that it is just better to have it only in cqlsh so I can remove them if we agree on that. It would be great to have some review! ([~samt] I know you are busy ...) [~bereng] I see you among watchers, maybe you would give it a shot? [~paulo] likewise. {code} $ ./bin/nodetool info ID : c8be5408-d1b1-433f-b124-6b7a8c76b1bb Gossip active: true Native Transport active : true Load : 252.6 KiB Uncompressed load: 332.68 KiB Generation No: 1695240794 Uptime (seconds) : 42 Heap Memory (MB) : 243.66 / 989.88 Off Heap Memory (MB) : 0.00 Data Center : datacenter1 Rack : rack1 Exceptions : 0 Key Cache: entries 11, size 984 bytes, capacity 46 MiB, 104 hits, 118 requests, 0.881 recent hit rate, 14400 save period in seconds Row Cache: entries 0, size 0 bytes, capacity 0 bytes, 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds Counter Cache: entries 0, size 0 bytes, capacity 23 MiB, 0 hits, 0 requests, NaN recent hit rate, 7200 save period in seconds Network Cache: size 8 MiB, overflow size: 0 bytes, capacity 61 MiB Credentials Cache: entries 1, capacity 1000, 1 hits, 2 requests, 0.500 recent hit rate JMX Permissions Cache: entries 0, capacity 1000, 0 hits, 0 requests, NaN recent hit rate Network Permissions Cache: entries 1, capacity 1000, 30 hits, 32 requ
[jira] [Updated] (CASSANDRA-17062) Expose Auth Caches metrics
[ https://issues.apache.org/jira/browse/CASSANDRA-17062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Miklosovic updated CASSANDRA-17062: -- Complexity: Normal (was: Low Hanging Fruit) > Expose Auth Caches metrics > -- > > Key: CASSANDRA-17062 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17062 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Virtual Tables, Observability/Metrics, > Tool/nodetool > Reporter: Aleksei Zotov > Assignee: Stefan Miklosovic > Priority: Normal > Fix For: 5.x > > Time Spent: 10m > Remaining Estimate: 0h > > Unlike other caches (row, key, counter), Auth Caches lack some monitoring > capabilities. Here are a few specific changes to close this gap: > # Add auth caches to _system_views.caches_ VT > # Expose auth caches metrics via JMX > # Add auth caches details to _nodetool info_
[GitHub] [cassandra-analytics] 5 commented on a diff in pull request #15: [CASSANDRA-18810] Cassandra Analytics Start-Up Validation
5 commented on code in PR #15: URL: https://github.com/apache/cassandra-analytics/pull/15#discussion_r1332134672 ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/validation/KeyStoreValidation.java: ## @@ -0,0 +1,75 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.cassandra.spark.validation; + +import java.io.IOException; +import java.security.GeneralSecurityException; +import java.security.Key; +import java.security.KeyStore; +import java.security.PrivateKey; +import java.util.Enumeration; + +import org.apache.cassandra.secrets.SecretsProvider; + +/** + * A startup validation that checks the KeyStore + */ +public class KeyStoreValidation implements StartupValidation +{ +private final SecretsProvider secrets; + +public KeyStoreValidation(SecretsProvider secrets) +{ +this.secrets = secrets; +} + +@Override +public void validate() +{ +try +{ +if (!secrets.hasKeyStoreSecrets()) +{ +throw new RuntimeException("KeyStore is unconfigured"); Review Comment: Replaced here and in 3 other places. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
[jira] [Commented] (CASSANDRA-17062) Expose Auth Caches metrics
[ https://issues.apache.org/jira/browse/CASSANDRA-17062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767275#comment-17767275 ] Stefan Miklosovic commented on CASSANDRA-17062: --- I moved this forward here (1) We got new auth caches in the meanwhile - CIDR auth cache and Identity cache from Mutual TLS authenticator (both if enabled) so I exposed them too if configured. The original patch was also propagating these metrics to nodetool info because there are currently caches like key / row / counter / network cache but I do not like that we are exposing it in this nodetool command. It just does not feel natural. I think that it is just better to have it only in cqlsh so I can remove them if we agree on that. It would be great to have some review! ([~samt] I know you are busy ...) [~bereng] I see you among watchers, maybe you would give it a shot? [~paulo] likewise. {code} $ ./bin/nodetool info ID : c8be5408-d1b1-433f-b124-6b7a8c76b1bb Gossip active: true Native Transport active : true Load : 252.6 KiB Uncompressed load: 332.68 KiB Generation No: 1695240794 Uptime (seconds) : 42 Heap Memory (MB) : 243.66 / 989.88 Off Heap Memory (MB) : 0.00 Data Center : datacenter1 Rack : rack1 Exceptions : 0 Key Cache: entries 11, size 984 bytes, capacity 46 MiB, 104 hits, 118 requests, 0.881 recent hit rate, 14400 save period in seconds Row Cache: entries 0, size 0 bytes, capacity 0 bytes, 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds Counter Cache: entries 0, size 0 bytes, capacity 23 MiB, 0 hits, 0 requests, NaN recent hit rate, 7200 save period in seconds Network Cache: size 8 MiB, overflow size: 0 bytes, capacity 61 MiB Credentials Cache: entries 1, capacity 1000, 1 hits, 2 requests, 0.500 recent hit rate JMX Permissions Cache: entries 0, capacity 1000, 0 hits, 0 requests, NaN recent hit rate Network Permissions Cache: entries 1, capacity 1000, 30 hits, 32 requests, 0.938 recent hit rate Permissions Cache: entries 
1, capacity 1000, 2 hits, 17 requests, 0.118 recent hit rate Roles Cache : entries 1, capacity 1000, 37 hits, 39 requests, 0.949 recent hit rate CIDR Permissions Cache : entries 1, capacity 1000, 30 hits, 32 requests, 0.938 recent hit rate Percent Repaired : 100.0% Token: (invoke with -T/--tokens to see all 16 tokens) Bootstrap state : COMPLETED Bootstrap failed : false Decommissioning : false Decommission failed : false {code} (1) https://github.com/apache/cassandra/pull/2710 > Expose Auth Caches metrics > -- > > Key: CASSANDRA-17062 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17062 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Virtual Tables, Observability/Metrics, > Tool/nodetool >Reporter: Aleksei Zotov >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 5.x > > Time Spent: 10m > Remaining Estimate: 0h > > Unlike to other caches (row, key, counter), Auth Caches lack some monitoring > capabilities. Here are a few particular changes to get this inequity fixed: > # Add auth caches to _system_views.caches_ VT > # Expose auth caches metrics via JMX > # Add auth caches details to _nodetool info_ > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-17062) Expose Auth Caches metrics
[ https://issues.apache.org/jira/browse/CASSANDRA-17062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Miklosovic updated CASSANDRA-17062: -- Status: Patch Available (was: In Progress) > Expose Auth Caches metrics > -- > > Key: CASSANDRA-17062 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17062 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Virtual Tables, Observability/Metrics, > Tool/nodetool > Reporter: Aleksei Zotov > Assignee: Stefan Miklosovic > Priority: Normal > Fix For: 5.x > > Time Spent: 10m > Remaining Estimate: 0h > > Unlike other caches (row, key, counter), Auth Caches lack some monitoring > capabilities. Here are a few specific changes to close this gap: > # Add auth caches to _system_views.caches_ VT > # Expose auth caches metrics via JMX > # Add auth caches details to _nodetool info_
[GitHub] [cassandra-analytics] 5 commented on a diff in pull request #15: [CASSANDRA-18810] Cassandra Analytics Start-Up Validation
5 commented on code in PR #15: URL: https://github.com/apache/cassandra-analytics/pull/15#discussion_r1332132261 ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/bulkwriter/BulkSparkConf.java: ## @@ -203,7 +203,7 @@ else if (!batchSizeIsZero && sstableBatchSize != DEFAULT_BATCH_SIZE_IN_ROWS) /* * This method Will throw if the SSL configuration is incorrect (PATH provided w/o password, for example) */ -protected void validateSslConfiguration() +public void validateSslConfiguration() Review Comment: Added detailed description of possible failure scenarios and corresponding exception types. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[GitHub] [cassandra-analytics] 5 commented on a diff in pull request #15: [CASSANDRA-18810] Cassandra Analytics Start-Up Validation
5 commented on code in PR #15: URL: https://github.com/apache/cassandra-analytics/pull/15#discussion_r1332120129 ## cassandra-analytics-core/src/test/java/org/apache/cassandra/spark/validation/KeyStoreValidationTests.java: ## @@ -0,0 +1,94 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.cassandra.spark.validation; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import org.apache.cassandra.secrets.SecretsProvider; +import org.apache.cassandra.secrets.TestSecretsProvider; + +/** + * Unit tests that cover startup validation of a KeyStore + */ +public class KeyStoreValidationTests +{ +@Test() Review Comment: Removed here and in 13 other places. ## cassandra-analytics-core/src/test/java/org/apache/cassandra/spark/validation/KeyStoreValidationTests.java: ## @@ -0,0 +1,94 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. 
The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.cassandra.spark.validation; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import org.apache.cassandra.secrets.SecretsProvider; +import org.apache.cassandra.secrets.TestSecretsProvider; + +/** + * Unit tests that cover startup validation of a KeyStore + */ +public class KeyStoreValidationTests +{ +@Test() +public void testNullSecrets() +{ +KeyStoreValidation validation = new KeyStoreValidation(null); + +Assertions.assertThrows(RuntimeException.class, validation::perform); Review Comment: Added assertions on the message and the cause here and in 12 other places. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
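Editorial sketch for the review point above (asserting on the message and the cause, not only on the exception type). This is a minimal illustration in plain Java, not code from the PR: `FailingValidation`, `ThrowableChecks.expectThrows`, and both message strings are hypothetical. In the PR's tests, JUnit 5's `Assertions.assertThrows` returns the thrown exception, so the same checks could presumably be made on its return value.

```java
import java.io.IOException;

// Hypothetical stand-in for a startup validation that wraps a lower-level failure
final class FailingValidation
{
    void validate()
    {
        try
        {
            throw new IOException("keystore file not found");
        }
        catch (IOException exception)
        {
            // Wrap the checked exception so callers see both a message and a cause
            throw new RuntimeException("Failed to load KeyStore", exception);
        }
    }
}

final class ThrowableChecks
{
    // Minimal analogue of assertThrows: runs the action and returns the exception it threw
    static <T extends Throwable> T expectThrows(Class<T> type, Runnable action)
    {
        try
        {
            action.run();
        }
        catch (Throwable throwable)
        {
            if (type.isInstance(throwable))
            {
                return type.cast(throwable);
            }
            throw new AssertionError("Unexpected exception type: " + throwable.getClass(), throwable);
        }
        throw new AssertionError("Expected " + type.getSimpleName() + " but nothing was thrown");
    }
}
```

With the returned exception in hand, a test can compare `getMessage()` against the expected text and check `getCause()` with `instanceof`, which catches regressions that a type-only assertion would miss.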
[jira] [Comment Edited] (CASSANDRA-18830) Set io.netty.transport.noNative to false for in-jvm dtests
[ https://issues.apache.org/jira/browse/CASSANDRA-18830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17762981#comment-17762981 ] Stefan Miklosovic edited comment on CASSANDRA-18830 at 9/20/23 6:26 PM: I was repeating both just gossiper as well as "loop everything test", 100x each j11 jvm dtest repeat https://app.circleci.com/pipelines/github/instaclustr/cassandra/3123/workflows/d9d7d011-55b4-4e8f-aa8c-b163d0c9/jobs/115329 j11 jvm dtest vnode repeat https://app.circleci.com/pipelines/github/instaclustr/cassandra/3123/workflows/d9d7d011-55b4-4e8f-aa8c-b163d0c9/jobs/115328 j17 jvm dtest repeat https://app.circleci.com/pipelines/github/instaclustr/cassandra/3123/workflows/bb3ee9e7-ed9c-42da-9c1a-44a57883864e/jobs/115326 j17 jvm dtest vnode repeat https://app.circleci.com/pipelines/github/instaclustr/cassandra/3123/workflows/bb3ee9e7-ed9c-42da-9c1a-44a57883864e/jobs/115327 was (Author: smiklosovic): I was repeating both just gossiper as well as "loop everything test", 100x each j11 jvm dtest repeat https://app.circleci.com/pipelines/github/instaclustr/cassandra/3123/workflows/d9d7d011-55b4-4e8f-aa8c-b163d0c9/jobs/115329 j11 jvm dtest vnode repeat https://app.circleci.com/pipelines/github/instaclustr/cassandra/3123/workflows/d9d7d011-55b4-4e8f-aa8c-b163d0c9/jobs/115328 j17 jvm dtest repeat https://app.circleci.com/pipelines/github/instaclustr/cassandra/3123/workflows/bb3ee9e7-ed9c-42da-9c1a-44a57883864e/jobs/115326 j17 jvm dtest vnode repeat https://app.circleci.com/pipelines/github/instaclustr/cassandra/3123/workflows/bb3ee9e7-ed9c-42da-9c1a-44a57883864e/jobs/115327 j11 upgrade dtest part 1 https://app.circleci.com/pipelines/github/instaclustr/cassandra/3186/workflows/0f81bd76-9f46-400c-94bb-95a8c214d397/jobs/120900 > Set io.netty.transport.noNative to false for in-jvm dtests > -- > > Key: CASSANDRA-18830 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18830 > Project: Cassandra > Issue Type: Task > Components: CI 
>Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 5.x > > > This ticket was created as a reaction to (1). > (1) https://lists.apache.org/thread/p42yksvo6t0dy67wmnl2bzpnggo9gpp9
[GitHub] [cassandra-analytics] yifan-c commented on a diff in pull request #15: [CASSANDRA-18810] Cassandra Analytics Start-Up Validation
yifan-c commented on code in PR #15: URL: https://github.com/apache/cassandra-analytics/pull/15#discussion_r1332025542 ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/validation/SidecarValidation.java: ## @@ -0,0 +1,58 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.cassandra.spark.validation; + +import java.util.concurrent.TimeUnit; + +import org.apache.cassandra.sidecar.client.SidecarClient; +import org.apache.cassandra.sidecar.common.data.HealthResponse; + +/** + * A startup validation that checks the connectivity and health of Sidecar + */ +public class SidecarValidation implements StartupValidation +{ +private static final int TIMEOUT_SECONDS = 30; Review Comment: Agreed -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[GitHub] [cassandra-analytics] yifan-c commented on a diff in pull request #15: [CASSANDRA-18810] Cassandra Analytics Start-Up Validation
yifan-c commented on code in PR #15: URL: https://github.com/apache/cassandra-analytics/pull/15#discussion_r1332023979 ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/CassandraDataLayer.java: ## @@ -639,6 +644,16 @@ public CassandraRing createCassandraRingFromRing(Partitioner partitioner, return new CassandraRing(partitioner, keyspace, replicationFactor, instances); } +// Startup Validation + +@Override +public void startupValidate() +{ +StartupValidator.instance().register(new SidecarValidation(sidecar)); +StartupValidator.instance().register(new CassandraValidation(sidecar)); +StartupValidator.instance().perform(); +} Review Comment: Let me clarify. The register and perform design is unnecessary, as commented in https://github.com/apache/cassandra-analytics/pull/15#discussion_r1330931869 I agree that the validation should be performed eagerly. In the code, it looks like this. ```java { new SidecarValidation(sidecar).validate(); new CassandraValidation(sidecar).validate(); } ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
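Editorial sketch contrasting the two designs discussed in this review thread: register-then-perform versus calling each validation eagerly. The type names echo the PR diff, but every body here is a hypothetical stand-in. In particular, the `enabled` flag is an assumption based on the remark elsewhere in the thread that `perform` gates validation on whether it is enabled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the PR's validation interface; only validate() is assumed
interface StartupValidation
{
    void validate();
}

// Register-and-perform style: validations are queued first and run together later
final class RegisteringValidator
{
    private final List<StartupValidation> validations = new ArrayList<>();
    private final boolean enabled; // assumed switch, not taken from the PR

    RegisteringValidator(boolean enabled)
    {
        this.enabled = enabled;
    }

    void register(StartupValidation validation)
    {
        validations.add(validation);
    }

    // Runs every registered validation unless disabled; returns how many were run
    int perform()
    {
        if (!enabled)
        {
            return 0;
        }
        for (StartupValidation validation : validations)
        {
            validation.validate();
        }
        return validations.size();
    }
}

// Eager style: each validation runs immediately, with no registry in between
final class EagerValidation
{
    static void startupValidate(StartupValidation... validations)
    {
        for (StartupValidation validation : validations)
        {
            validation.validate();
        }
    }
}
```

The eager form is shorter and has no shared state; the registry form offers a single place to skip all validations at once. Which trade-off fits the PR is the open question in the thread, and this sketch does not settle it.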
[GitHub] [cassandra-analytics] frankgh commented on a diff in pull request #15: [CASSANDRA-18810] Cassandra Analytics Start-Up Validation
frankgh commented on code in PR #15: URL: https://github.com/apache/cassandra-analytics/pull/15#discussion_r1331992931 ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/validation/SidecarValidation.java: ## @@ -0,0 +1,58 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.cassandra.spark.validation; + +import java.util.concurrent.TimeUnit; + +import org.apache.cassandra.sidecar.client.SidecarClient; +import org.apache.cassandra.sidecar.common.data.HealthResponse; + +/** + * A startup validation that checks the connectivity and health of Sidecar + */ +public class SidecarValidation implements StartupValidation +{ +private static final int TIMEOUT_SECONDS = 30; Review Comment: @yifan-c any thoughts about making this configurable? 
## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/CassandraDataLayer.java: ## @@ -639,6 +644,16 @@ public CassandraRing createCassandraRingFromRing(Partitioner partitioner, return new CassandraRing(partitioner, keyspace, replicationFactor, instances); } +// Startup Validation + +@Override +public void startupValidate() +{ +StartupValidator.instance().register(new SidecarValidation(sidecar)); +StartupValidator.instance().register(new CassandraValidation(sidecar)); +StartupValidator.instance().perform(); +} Review Comment: hmm. Thinking a bit more about this, perform gates validation on whether they are enabled or not. So I'm still undecided ## cassandra-analytics-core/src/test/java/org/apache/cassandra/spark/validation/TrustStoreValidationTests.java: ## @@ -0,0 +1,94 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +package org.apache.cassandra.spark.validation; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import org.apache.cassandra.secrets.SecretsProvider; +import org.apache.cassandra.secrets.TestSecretsProvider; + +/** + * Unit tests that cover startup validation of a TrustStore + */ +public class TrustStoreValidationTests +{ +@Test() +public void testNullSecrets() +{ +TrustStoreValidation validation = new TrustStoreValidation(null); + +Assertions.assertThrows(RuntimeException.class, validation::perform); +} + +@Test() +public void testUnconfiguredTrustStore() +{ +SecretsProvider secrets = TestSecretsProvider.unconfigured(); +TrustStoreValidation validation = new TrustStoreValidation(secrets); + +Assertions.assertDoesNotThrow(validation::perform); // TrustStore is optional +} + +@Test() +public void testMissingTrustStore() +{ +SecretsProvider secrets = TestSecretsProvider.forTrustStore("PKCS12", "keystore-missing.p12", "qwerty"); Review Comment: some projects offer a README or script to regenerate the keystore files needed for testing. I think that's useful for new contributors. See https://github.com/eclipse-vertx/vert.x/blob/master/src/test/resources/tls/ssl.txt for reference ## cassandra-analytics-core/src/test/java/org/apache/cassandra/spark/validation/KeyStoreValidationTests.java: ## @@ -0,0 +1,94 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information +
[jira] [Commented] (CASSANDRA-18681) Internode legacy SSL storage port certificate is not hot reloaded on update
[ https://issues.apache.org/jira/browse/CASSANDRA-18681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767219#comment-17767219 ] Jon Meredith commented on CASSANDRA-18681: -- I've remembered why I did it this way. The legacy ssl storage port encryption options are not registered for hot reloading, so you have to match invalidate if the original encryption options shouldReload returned true. > Internode legacy SSL storage port certificate is not hot reloaded on update > --- > > Key: CASSANDRA-18681 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18681 > Project: Cassandra > Issue Type: Bug > Components: Messaging/Internode >Reporter: Jon Meredith >Assignee: Jon Meredith >Priority: Normal > > In CASSANDRA-1 the SSLContext cache was changed to clear individual > {{EncryptionOptions}} from the SslContext cache if they needed reloading to > reduce resource consumption. Before the change if ANY cert needed hot > reloading, the SSLContext cache would be cleared for ALL certs. > If the legacy SSL storage port is configured, a new {{EncryptionOptions}} > object is created in {{org.apache.cassandra.net.InboundSockets#addBindings}} > just for binding the socket, but never gets cleared as the change in port > means it no longer matches the configuration retrieved from > {{DatabaseDescriptor}} in > {{org.apache.cassandra.net.MessagingServiceMBeanImpl#reloadSslCertificates}}. > This is unlikely to be an issue in practice as the legacy SSL internode > socket is only used in mixed version clusters with pre-4.0 nodes, so the cert > only needs to stay valid until all nodes upgrade to 4.x or above. > One way to avoid this class of failures is to just check the entries present > in the SSLContext cache. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
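A minimal sketch of the idea in the last paragraph of the CASSANDRA-18681 description above: decide what to invalidate by walking the entries actually present in the cache, rather than only the options read back from configuration. Everything here is hypothetical and for illustration only. ContextCacheSketch, getOrCreate and invalidateStale are not Cassandra classes or methods, plain strings stand in for EncryptionOptions, and the predicate stands in for a shouldReload check.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical stand-in for an SSL context cache; not Cassandra's actual implementation.
public class ContextCacheSketch<K, V>
{
    private final Map<K, V> cache = new ConcurrentHashMap<>();

    // Build the context once per distinct options key and reuse it afterwards.
    public V getOrCreate(K options, Function<K, V> builder)
    {
        return cache.computeIfAbsent(options, builder);
    }

    // Walk every key that is really in the cache, including keys that were
    // created ad hoc (e.g. for a legacy port) and never appear in configuration.
    public int invalidateStale(Predicate<K> shouldReload)
    {
        int removed = 0;
        for (K key : cache.keySet())
        {
            if (shouldReload.test(key) && cache.remove(key) != null)
                removed++;
        }
        return removed;
    }

    public boolean contains(K options)
    {
        return cache.containsKey(options);
    }

    public static void main(String[] args)
    {
        ContextCacheSketch<String, String> cache = new ContextCacheSketch<>();
        cache.getOrCreate("storage:7000", k -> "ctx-" + k);
        cache.getOrCreate("legacy-ssl:7001", k -> "ctx-" + k);

        // Pretend only the legacy entry's certificate changed on disk.
        int removed = cache.invalidateStale(k -> k.startsWith("legacy-ssl"));
        System.out.println(removed + " entry invalidated");
    }
}
```

Because the loop is driven by the cache contents, an entry whose key no longer matches anything in configuration is still considered for invalidation, which is the class of failure the ticket describes.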
[jira] [Updated] (CASSANDRASC-73) Update token-ranges endpoint to return additional instance metadata
[ https://issues.apache.org/jira/browse/CASSANDRASC-73?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated CASSANDRASC-73: -- Labels: pull-request-available (was: ) > Update token-ranges endpoint to return additional instance metadata > --- > > Key: CASSANDRASC-73 > URL: https://issues.apache.org/jira/browse/CASSANDRASC-73 > Project: Sidecar for Apache Cassandra > Issue Type: Improvement >Reporter: Arjun Ashok >Assignee: Arjun Ashok >Priority: Normal > Labels: pull-request-available > > Sidecar `token-range-replicas` endpoint to return the following additional > metadata for each instance: > * State. eg Normal, Joining, Leaving > * Status. eg. Up, Down > * Address: Replica host and port > * Name: Node name resolved from the above IP address -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18836) Replace CRC32 w/ CRC32C in IndexFileUtils.ChecksummingWriter
[ https://issues.apache.org/jira/browse/CASSANDRA-18836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767193#comment-17767193 ] Caleb Rackliffe commented on CASSANDRA-18836: - I think we have to change both. Take {{META}} files as an example. Those are written w/ what we get from {{IndexDescriptor#openPerIndexOutput()}} and read w/ what we get from {{IndexDescriptor#openPerIndexInput()}}. (See {{MetadataSource.load()}}) They have to use the same CRC algorithm, right? > Replace CRC32 w/ CRC32C in IndexFileUtils.ChecksummingWriter > > > Key: CASSANDRA-18836 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18836 > Project: Cassandra > Issue Type: Improvement > Components: Feature/2i Index, Feature/SAI >Reporter: Caleb Rackliffe >Assignee: Maxim Muzafarov >Priority: Normal > Fix For: 5.0-alpha2, 5.x > > Time Spent: 10m > Remaining Estimate: 0h > > It seems that now we're on Java 11 for 5.0, there isn't much reason not to > use CRC32C as a drop-in replacement for CRC32. SAI isn't even released, so > has no binary compatibility entanglements, and this should be pretty > straightforward. > See https://github.com/apache/bookkeeper/pull/3309 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
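To illustrate the point in the CASSANDRA-18836 comment above, that the writer and the reader must use the same CRC algorithm, here is a small sketch using only the JDK's java.util.zip.CRC32 and java.util.zip.CRC32C (available since Java 9), both of which implement java.util.zip.Checksum. The class ChecksumPairing and its methods are hypothetical helpers and are not part of IndexFileUtils or IndexDescriptor.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;
import java.util.zip.CRC32;
import java.util.zip.CRC32C;
import java.util.zip.Checksum;

// Hypothetical helper; shows why both sides of a checksummed file must agree.
public class ChecksumPairing
{
    // Compute a checksum over the payload with whichever algorithm the factory supplies.
    public static long checksum(Supplier<Checksum> factory, byte[] payload)
    {
        Checksum crc = factory.get();
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    // A reader "verifies" a file when its checksum equals the one the writer stored.
    public static boolean verifies(Supplier<Checksum> writer, Supplier<Checksum> reader, byte[] payload)
    {
        return checksum(writer, payload) == checksum(reader, payload);
    }

    public static void main(String[] args)
    {
        byte[] payload = "123456789".getBytes(StandardCharsets.US_ASCII);

        // Same algorithm on both sides: the stored value matches.
        System.out.println("CRC32C/CRC32C: " + verifies(CRC32C::new, CRC32C::new, payload));

        // Writer switched to CRC32C but reader left on CRC32: verification fails.
        System.out.println("CRC32C/CRC32:  " + verifies(CRC32C::new, CRC32::new, payload));
    }
}
```

Since both classes sit behind the Checksum interface, swapping one for the other is a one-line change at each call site, but it has to be made on the output and input paths together.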
[jira] [Updated] (CASSANDRA-18681) Internode legacy SSL storage port certificate is not hot reloaded on update
[ https://issues.apache.org/jira/browse/CASSANDRA-18681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jon Meredith updated CASSANDRA-18681: - Status: Changes Suggested (was: Ready to Commit) Going to rework a little, I don't like the different check between shouldReload and clearSslContext. It's ok for the default implementation, but may not be good for a custom SSLContextFactoryInstance.
[jira] [Updated] (CASSANDRA-18355) CEP-15: Transaction Result Serialization Efficiency
[ https://issues.apache.org/jira/browse/CASSANDRA-18355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Caleb Rackliffe updated CASSANDRA-18355: Reviewers: Ariel Weisberg, Benedict Elliott Smith, David Capwell (was: Blake Eggleston, David Capwell) > CEP-15: Transaction Result Serialization Efficiency > --- > > Key: CASSANDRA-18355 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18355 > Project: Cassandra > Issue Type: Improvement > Components: Accord >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Fix For: 5.x > > Time Spent: 2h > Remaining Estimate: 0h > > There are two things we probably don’t need to serialize and write to the > Accord state tables: > > 1.) Internal/external read responses > 2.) The full result of the transaction -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-18830) Set io.netty.transport.noNative to false for in-jvm dtests
[ https://issues.apache.org/jira/browse/CASSANDRA-18830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17762981#comment-17762981 ] Stefan Miklosovic edited comment on CASSANDRA-18830 at 9/20/23 4:07 PM: I was repeating both just gossiper as well as "loop everything test", 100x each j11 jvm dtest repeat https://app.circleci.com/pipelines/github/instaclustr/cassandra/3123/workflows/d9d7d011-55b4-4e8f-aa8c-b163d0c9/jobs/115329 j11 jvm dtest vnode repeat https://app.circleci.com/pipelines/github/instaclustr/cassandra/3123/workflows/d9d7d011-55b4-4e8f-aa8c-b163d0c9/jobs/115328 j17 jvm dtest repeat https://app.circleci.com/pipelines/github/instaclustr/cassandra/3123/workflows/bb3ee9e7-ed9c-42da-9c1a-44a57883864e/jobs/115326 j17 jvm dtest vnode repeat https://app.circleci.com/pipelines/github/instaclustr/cassandra/3123/workflows/bb3ee9e7-ed9c-42da-9c1a-44a57883864e/jobs/115327 j11 upgrade dtest part 1 https://app.circleci.com/pipelines/github/instaclustr/cassandra/3186/workflows/0f81bd76-9f46-400c-94bb-95a8c214d397/jobs/120900 was (Author: smiklosovic): I was repeating both just gossiper as well as "loop everything test", 100x each j11 jvm dtest repeat https://app.circleci.com/pipelines/github/instaclustr/cassandra/3123/workflows/d9d7d011-55b4-4e8f-aa8c-b163d0c9/jobs/115329 j11 jvm dtest vnode repeat https://app.circleci.com/pipelines/github/instaclustr/cassandra/3123/workflows/d9d7d011-55b4-4e8f-aa8c-b163d0c9/jobs/115328 j17 pre-commit https://app.circleci.com/pipelines/github/instaclustr/cassandra/3174/workflows/fe44e70a-51d5-459f-b9eb-6133a8841ce3 j11 pre-commit https://app.circleci.com/pipelines/github/instaclustr/cassandra/3174/workflows/5f3e4833-eea3-4c6e-b3b1-5ed6e3faf64d j17 jvm dtest repeat https://app.circleci.com/pipelines/github/instaclustr/cassandra/3123/workflows/bb3ee9e7-ed9c-42da-9c1a-44a57883864e/jobs/115326 j17 jvm dtest vnode repeat 
https://app.circleci.com/pipelines/github/instaclustr/cassandra/3123/workflows/bb3ee9e7-ed9c-42da-9c1a-44a57883864e/jobs/115327 > Set io.netty.transport.noNative to false for in-jvm dtests > -- > > Key: CASSANDRA-18830 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18830 > Project: Cassandra > Issue Type: Task > Components: CI >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 5.x > > > This ticket was created as a reaction to (1). > (1) https://lists.apache.org/thread/p42yksvo6t0dy67wmnl2bzpnggo9gpp9 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-18681) Internode legacy SSL storage port certificate is not hot reloaded on update
[ https://issues.apache.org/jira/browse/CASSANDRA-18681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jon Meredith updated CASSANDRA-18681: - Status: Ready to Commit (was: Review In Progress)
[jira] [Comment Edited] (CASSANDRA-18830) Set io.netty.transport.noNative to false for in-jvm dtests
[ https://issues.apache.org/jira/browse/CASSANDRA-18830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17762981#comment-17762981 ] Stefan Miklosovic edited comment on CASSANDRA-18830 at 9/20/23 3:52 PM: I was repeating both just gossiper as well as "loop everything test", 100x each j11 jvm dtest repeat https://app.circleci.com/pipelines/github/instaclustr/cassandra/3123/workflows/d9d7d011-55b4-4e8f-aa8c-b163d0c9/jobs/115329 j11 jvm dtest vnode repeat https://app.circleci.com/pipelines/github/instaclustr/cassandra/3123/workflows/d9d7d011-55b4-4e8f-aa8c-b163d0c9/jobs/115328 j17 pre-commit https://app.circleci.com/pipelines/github/instaclustr/cassandra/3174/workflows/fe44e70a-51d5-459f-b9eb-6133a8841ce3 j11 pre-commit https://app.circleci.com/pipelines/github/instaclustr/cassandra/3174/workflows/5f3e4833-eea3-4c6e-b3b1-5ed6e3faf64d j17 jvm dtest repeat https://app.circleci.com/pipelines/github/instaclustr/cassandra/3123/workflows/bb3ee9e7-ed9c-42da-9c1a-44a57883864e/jobs/115326 j17 jvm dtest vnode repeat https://app.circleci.com/pipelines/github/instaclustr/cassandra/3123/workflows/bb3ee9e7-ed9c-42da-9c1a-44a57883864e/jobs/115327 was (Author: smiklosovic): I was repeating both just gossiper as well as "loop everything test", 100x each j11 jvm dtest repeat https://app.circleci.com/pipelines/github/instaclustr/cassandra/3123/workflows/d9d7d011-55b4-4e8f-aa8c-b163d0c9/jobs/115329 j11 jvm dtest vnode repeat https://app.circleci.com/pipelines/github/instaclustr/cassandra/3123/workflows/d9d7d011-55b4-4e8f-aa8c-b163d0c9/jobs/115328 j17 jvm dtest repeat https://app.circleci.com/pipelines/github/instaclustr/cassandra/3123/workflows/bb3ee9e7-ed9c-42da-9c1a-44a57883864e/jobs/115326 j17 jvm dtest vnode repeat https://app.circleci.com/pipelines/github/instaclustr/cassandra/3123/workflows/bb3ee9e7-ed9c-42da-9c1a-44a57883864e/jobs/115327 > Set io.netty.transport.noNative to false for in-jvm dtests > -- > > Key: CASSANDRA-18830 > URL: 
https://issues.apache.org/jira/browse/CASSANDRA-18830 > Project: Cassandra > Issue Type: Task > Components: CI >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 5.x > > > This ticket was created as a reaction to (1). > (1) https://lists.apache.org/thread/p42yksvo6t0dy67wmnl2bzpnggo9gpp9 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra-accord] branch trunk updated: - add shareable APPLIED and INVALIDATED implementations of Result - API changes to support splicing in complete update fragments from PartialTxn as mutations a
This is an automated email from the ASF dual-hosted git repository. maedhroz pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra-accord.git The following commit(s) were added to refs/heads/trunk by this push: new df492dfd - add shareable APPLIED and INVALIDATED implementations of Result - API changes to support splicing in complete update fragments from PartialTxn as mutations are finally being applied df492dfd is described below commit df492dfd2ffe993c33761d0531ac5b979b80f080 Author: Caleb Rackliffe AuthorDate: Thu Mar 23 14:47:54 2023 -0500 - add shareable APPLIED and INVALIDATED implementations of Result - API changes to support splicing in complete update fragments from PartialTxn as mutations are finally being applied patch by Caleb Rackliffe; reviewed by David Capwell, Benedict Elliot Smith, and Ariel Weisberg for CASSANDRA-18355 --- accord-core/src/main/java/accord/api/Result.java | 10 -- accord-core/src/main/java/accord/api/Write.java| 3 ++- accord-core/src/main/java/accord/local/Command.java| 2 +- accord-core/src/main/java/accord/local/Commands.java | 2 +- accord-core/src/main/java/accord/primitives/Writes.java| 4 ++-- accord-core/src/test/java/accord/impl/list/ListWrite.java | 3 ++- accord-core/src/test/java/accord/impl/mock/MockStore.java | 2 +- accord-core/src/test/java/accord/messages/ReadDataTest.java| 2 +- .../src/main/java/accord/maelstrom/MaelstromWrite.java | 3 ++- 9 files changed, 20 insertions(+), 11 deletions(-) diff --git a/accord-core/src/main/java/accord/api/Result.java b/accord-core/src/main/java/accord/api/Result.java index d8b8fd6d..0509c476 100644 --- a/accord-core/src/main/java/accord/api/Result.java +++ b/accord-core/src/main/java/accord/api/Result.java @@ -23,11 +23,17 @@ import accord.primitives.ProgressToken; /** * A result to be returned to a client, or be stored in a node's command state. 
- * - * TODO (expected, efficiency): support minimizing the result for storage in a node's command state (e.g. to only retain success/failure) */ public interface Result extends Outcome { +Result APPLIED = new Result() { }; + +Result INVALIDATED = new Result() +{ +@Override +public ProgressToken asProgressToken() { return ProgressToken.INVALIDATED; } +}; + @Override default ProgressToken asProgressToken() { return ProgressToken.APPLIED; } } diff --git a/accord-core/src/main/java/accord/api/Write.java b/accord-core/src/main/java/accord/api/Write.java index ebe25903..62635379 100644 --- a/accord-core/src/main/java/accord/api/Write.java +++ b/accord-core/src/main/java/accord/api/Write.java @@ -19,6 +19,7 @@ package accord.api; import accord.local.SafeCommandStore; +import accord.primitives.PartialTxn; import accord.primitives.Seekable; import accord.primitives.Timestamp; import accord.utils.async.AsyncChain; @@ -30,5 +31,5 @@ import accord.utils.async.AsyncChain; */ public interface Write { -AsyncChain apply(Seekable key, SafeCommandStore safeStore, Timestamp executeAt, DataStore store); +AsyncChain apply(Seekable key, SafeCommandStore safeStore, Timestamp executeAt, DataStore store, PartialTxn txn); } diff --git a/accord-core/src/main/java/accord/local/Command.java b/accord-core/src/main/java/accord/local/Command.java index 47f9caba..194fe2da 100644 --- a/accord-core/src/main/java/accord/local/Command.java +++ b/accord-core/src/main/java/accord/local/Command.java @@ -650,7 +650,7 @@ public abstract class Command implements CommonAttributes public static Truncated invalidated(TxnId txnId, Listeners.Immutable durableListeners) { -return new Truncated(txnId, SaveStatus.Invalidated, DurableOrInvalidated, null, Timestamp.NONE, durableListeners, null, null); +return new Truncated(txnId, SaveStatus.Invalidated, DurableOrInvalidated, null, Timestamp.NONE, durableListeners, null, Result.INVALIDATED); } @Override diff --git 
a/accord-core/src/main/java/accord/local/Commands.java b/accord-core/src/main/java/accord/local/Commands.java index 38d3f18b..52c37a5f 100644 --- a/accord-core/src/main/java/accord/local/Commands.java +++ b/accord-core/src/main/java/accord/local/Commands.java @@ -582,7 +582,7 @@ public class Commands // that was pre-bootstrap for some range (so redundant and we may have gone ahead of), but had to be executed locally // for another range CommandStore unsafeStore = safeStore.commandStore(); -return command.writes().apply(safeStore, applyRanges(safeStore, command.executeAt())) +return command.writes().apply(safeStore, applyRanges(safeStore, command.executeAt()), command.partialTxn()) .flatMap(unused -> unsafeStore.submit(context, ss -> { postApply(ss, txnId);
[jira] [Commented] (CASSANDRA-17674) Test failure: org.apache.cassandra.distributed.upgrade.DropCompactStorageTest.testDropCompactStorage
[ https://issues.apache.org/jira/browse/CASSANDRA-17674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767134#comment-17767134 ] Ekaterina Dimitrova commented on CASSANDRA-17674: - {quote}Can you see if I did sthg silly [~e.dimitrova] [here|https://app.circleci.com/pipelines/github/bereng/cassandra/1070/workflows/edfa498e-938b-48cd-9e1f-9a795753c838/jobs/28988]? {quote} I don't think you did anything wrong. I see "{*}22 tests failed{*} out of 48," which is unsurprising because [~adelapena] reported it "~68% flaky on 4.0 and ~2% flaky on trunk." However, we should also check 4.1, as the ticket was reported for trunk when we were still on 4.1-trunk. But on forking, a script moved all tickets to point to the next version - in this case, 5.0. Unfortunately, the CircleCI links in the ticket description have expired, so I cannot tell how many times the test was run to get the stated results. Let's mark the ticket 4.0+, start fixing it at the lowest branch, where it is almost always flaky, and then see how things change in newer branches. 
> Test failure: > org.apache.cassandra.distributed.upgrade.DropCompactStorageTest.testDropCompactStorage > > > Key: CASSANDRA-17674 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17674 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/java >Reporter: Andres de la Peña >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0.x, 5.x > > > The Java upgrade dtest > {{org.apache.cassandra.distributed.upgrade.DropCompactStorageTest.testDropCompactStorage}} > is ~68% flaky on 4.0 and ~2% flaky on trunk, at least in CircleCI: > * 4.0: > [https://app.circleci.com/pipelines/github/adelapena/cassandra/1622/workflows/0086c3b1-a552-4c7a-8278-2f759cee5bdf/jobs/17288] > * trunk: > [https://app.circleci.com/pipelines/github/adelapena/cassandra/1624/workflows/c4ce2b95-998f-459b-830e-8e3fa6637e15/jobs/17293] > The error for 4.0 is: > {code:java} > java.util.concurrent.ExecutionException: java.lang.RuntimeException: > java.util.concurrent.ExecutionException: java.lang.RuntimeException: > java.util.concurrent.TimeoutException: > org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor@6fa17524[Shutting > down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = > 218] did not terminate on time > at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) > at > org.apache.cassandra.distributed.upgrade.UpgradeTestBase$TestCase.run(UpgradeTestBase.java:227) > at > org.apache.cassandra.distributed.upgrade.DropCompactStorageTest.testDropCompactStorage(DropCompactStorageTest.java:49) > Caused by: java.lang.RuntimeException: > java.util.concurrent.ExecutionException: java.lang.RuntimeException: > java.util.concurrent.TimeoutException: > org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor@6fa17524[Shutting > down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = > 218] did not terminate on time > at > 
org.apache.cassandra.distributed.impl.IsolatedExecutor$ThrowingRunnable.lambda$toRunnable$0(IsolatedExecutor.java:235) > at > java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:83) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: java.util.concurrent.TimeoutException: > org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor@6fa17524[Shutting > down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = > 218] did not terminate on time > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.cassandra.distributed.impl.IsolatedExecutor$ThrowingRunnable.lambda$toRunnable$0(IsolatedExecutor.java:231) > Caused by: java.lang.RuntimeException: java.util.concurrent.TimeoutException: > org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor@6fa17524[Shutting > down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = > 218] did not terminate on time > at org.apache.cassandra.utils.Throwables.maybeFail(Throwables.java:62) > at > org.apache.cassandra.distributed.impl.Instance.la
[jira] [Updated] (CASSANDRA-18681) Internode legacy SSL storage port certificate is not hot reloaded on update
[ https://issues.apache.org/jira/browse/CASSANDRA-18681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jon Meredith updated CASSANDRA-18681: - Reviewers: Dinesh Joshi, Jon Meredith Status: Review In Progress (was: Patch Available)
[jira] [Updated] (CASSANDRA-18681) Internode legacy SSL storage port certificate is not hot reloaded on update
[ https://issues.apache.org/jira/browse/CASSANDRA-18681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jon Meredith updated CASSANDRA-18681: - Reviewers: Dinesh Joshi (was: Dinesh Joshi, Jon Meredith)
[jira] [Commented] (CASSANDRA-18707) Test failure: junit.framework.TestSuite.org.apache.cassandra.distributed.test.CASMultiDCTest-.jdk11
[ https://issues.apache.org/jira/browse/CASSANDRA-18707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767135#comment-17767135 ] Ekaterina Dimitrova commented on CASSANDRA-18707: - {quote}If we reduce the timeout we'll just make it fail more times, stare at an empty log for that timeout period and then there's nothing we can do about it right? I'm probably missing sthg but if it failed again what do you do without logs? {quote} I haven't played with this particular test, but usually, we add additional logging and/or enable tracing during debugging Python DTests. Staring at an empty log is not the way, I agree. > Test failure: > junit.framework.TestSuite.org.apache.cassandra.distributed.test.CASMultiDCTest-.jdk11 > > > > Key: CASSANDRA-18707 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18707 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/java >Reporter: Ekaterina Dimitrova >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x > > Attachments: TESTS-TestSuites.xml.xz > > > Seen here: > [https://ci-cassandra.apache.org/job/Cassandra-trunk/1650/testReport/junit.framework/TestSuite/org_apache_cassandra_distributed_test_CASMultiDCTest__jdk11/] > h3. > {code:java} > Error Message > Schema agreement not reached. Schema versions of the instances: > [ef1c8e05-a06d-388d-a46d-53cc22a94762, 6c386108-1805-3985-b48e-8016012a0207, > 6c386108-1805-3985-b48e-8016012a0207, ef1c8e05-a06d-388d-a46d-53cc22a94762] > Stacktrace > java.lang.IllegalStateException: Schema agreement not reached. 
Schema > versions of the instances: [ef1c8e05-a06d-388d-a46d-53cc22a94762, > 6c386108-1805-3985-b48e-8016012a0207, 6c386108-1805-3985-b48e-8016012a0207, > ef1c8e05-a06d-388d-a46d-53cc22a94762] at > org.apache.cassandra.distributed.impl.AbstractCluster$ChangeMonitor.waitForCompletion(AbstractCluster.java:907) > at > org.apache.cassandra.distributed.impl.AbstractCluster.lambda$schemaChange$8(AbstractCluster.java:836) > at org.apache.cassandra.concurrent.FutureTask$1.call(FutureTask.java:96) at > org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61) at > org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71) at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.base/java.lang.Thread.run(Thread.java:829) > {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18845) Waiting for gossip to settle on live endpoints
[ https://issues.apache.org/jira/browse/CASSANDRA-18845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767052#comment-17767052 ] Cameron Zemek commented on CASSANDRA-18845: --- With this removed {code:java} (epSize == liveSize || liveSize > 1){code} the j11_dtests just passed. [j11_dtests (120384) - instaclustr/cassandra (circleci.com)|https://app.circleci.com/pipelines/github/instaclustr/cassandra/3180/workflows/2f7e6199-d865-4eee-a3b1-9511a4c88a45/jobs/120384] > Waiting for gossip to settle on live endpoints > -- > > Key: CASSANDRA-18845 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18845 > Project: Cassandra > Issue Type: Improvement > Reporter: Cameron Zemek > Priority: Normal > Attachments: delay.log, example.log, image-2023-09-14-11-16-23-020.png, stream.log, test1.log, test2.log, test3.log > > > This is a follow-up to CASSANDRA-18543. > Although that ticket added the ability to set cassandra.gossip_settle_min_wait_ms, this is tedious and error-prone. On one node I just observed a 79-second gap between waiting for gossip and the first echo response indicating that a node is UP. > The problem is that we do not want to start Native Transport until gossip settles; otherwise queries can fail at consistency levels such as LOCAL_QUORUM, because the node thinks the replicas are still in the DOWN state. > Instead of having to set gossip_settle_min_wait_ms, I am proposing that (outside a single-node cluster) we wait for an UP message from another node before considering gossip settled. E.g.: > {code:java} > if (currentSize == epSize && currentLive == liveSize && liveSize > 1) > { > logger.debug("Gossip looks settled."); > numOkay++; > } {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
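The settle condition proposed in the CASSANDRA-18845 description above can be sketched in isolation as follows. This is only an illustration under stated assumptions: GossipSettleSketch and its methods are hypothetical, the polling samples are made-up (endpoint count, live count) pairs, and resetting the counter on any change is assumed here rather than taken from the ticket. The requireLivePeer flag switches between the plain "sizes stopped changing" check and the proposed extra "liveSize > 1" requirement.

```java
// Hypothetical model of a gossip-settle poll loop; not Cassandra's Gossiper code.
public class GossipSettleSketch
{
    // One poll looks settled when nothing changed since the previous poll and,
    // if requested, at least one other node is known to be live.
    public static boolean looksSettled(int epSize, int liveSize, int currentSize, int currentLive, boolean requireLivePeer)
    {
        boolean stable = currentSize == epSize && currentLive == liveSize;
        return stable && (!requireLivePeer || liveSize > 1);
    }

    // samples[i] = { endpoint count, live endpoint count } observed at poll i.
    // Returns the index of the poll at which gossip is considered settled, or -1.
    public static int pollsUntilSettled(int[][] samples, int requiredOkay, boolean requireLivePeer)
    {
        int epSize = samples[0][0];
        int liveSize = samples[0][1];
        int numOkay = 0;
        for (int i = 1; i < samples.length; i++)
        {
            int currentSize = samples[i][0];
            int currentLive = samples[i][1];
            if (looksSettled(epSize, liveSize, currentSize, currentLive, requireLivePeer))
                numOkay++;
            else
                numOkay = 0; // assumption: any change restarts the count
            epSize = currentSize;
            liveSize = currentLive;
            if (numOkay == requiredOkay)
                return i;
        }
        return -1;
    }

    public static void main(String[] args)
    {
        // Three endpoints known, but only this node is live for the whole window.
        int[][] onlySelfLive = { { 3, 1 }, { 3, 1 }, { 3, 1 }, { 3, 1 } };
        System.out.println(pollsUntilSettled(onlySelfLive, 3, false));
        System.out.println(pollsUntilSettled(onlySelfLive, 3, true));
    }
}
```

With the extra requirement, a window in which only the local node is live never counts as settled, which is the behaviour the proposal is after; it also shows why a single-node cluster needs to be excluded from the check.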
[jira] [Comment Edited] (CASSANDRA-18870) Invalid ut check for CreateTableValidationTest
[ https://issues.apache.org/jira/browse/CASSANDRA-18870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766993#comment-17766993 ] Maxwell Guo edited comment on CASSANDRA-18870 at 9/20/23 6:17 AM: -- [~brandon.williams] [~smiklosovic] can you help to take a look at this little patch ? |pr for trunk |[trunk|https://github.com/apache/cassandra/pull/2707]| |java17 pre-commit|[java17|https://app.circleci.com/pipelines/github/Maxwell-Guo/cassandra/482/workflows/e5c42fef-bf21-4880-b76d-19ffeff62970]| |java11|[java11|https://app.circleci.com/pipelines/github/Maxwell-Guo/cassandra/482/workflows/93ea3717-9542-4777-9cad-666a6a7c40fa]| was (Author: maxwellguo): [~brandon.williams][~smiklosovic] can you help to take a look at this little patch ? |pr for trunk |[trunk|https://github.com/apache/cassandra/pull/2707]| |java17 pre-commit|[java17|https://app.circleci.com/pipelines/github/Maxwell-Guo/cassandra/482/workflows/e5c42fef-bf21-4880-b76d-19ffeff62970]| |java11|[java11|https://app.circleci.com/pipelines/github/Maxwell-Guo/cassandra/482/workflows/93ea3717-9542-4777-9cad-666a6a7c40fa]| > Invalid ut check for CreateTableValidationTest > -- > > Key: CASSANDRA-18870 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18870 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Maxwell Guo >Assignee: Maxwell Guo >Priority: Normal > Fix For: 5.x > > > https://github.com/apache/cassandra/blob/trunk/test/unit/org/apache/cassandra/schema/CreateTableValidationTest.java#L40 > This test case testInvalidBloomFilterFPRatio() makes no sense and cannot > achieve the desired purpose. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-18870) Invalid ut check for CreateTableValidationTest
[ https://issues.apache.org/jira/browse/CASSANDRA-18870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766993#comment-17766993 ] Maxwell Guo edited comment on CASSANDRA-18870 at 9/20/23 6:17 AM: -- hello, [~brandon.williams] [~smiklosovic] can you help to take a look at this little patch ? |pr for trunk |[trunk|https://github.com/apache/cassandra/pull/2707]| |java17 pre-commit|[java17|https://app.circleci.com/pipelines/github/Maxwell-Guo/cassandra/482/workflows/e5c42fef-bf21-4880-b76d-19ffeff62970]| |java11|[java11|https://app.circleci.com/pipelines/github/Maxwell-Guo/cassandra/482/workflows/93ea3717-9542-4777-9cad-666a6a7c40fa]| was (Author: maxwellguo): [~brandon.williams] [~smiklosovic] can you help to take a look at this little patch ? |pr for trunk |[trunk|https://github.com/apache/cassandra/pull/2707]| |java17 pre-commit|[java17|https://app.circleci.com/pipelines/github/Maxwell-Guo/cassandra/482/workflows/e5c42fef-bf21-4880-b76d-19ffeff62970]| |java11|[java11|https://app.circleci.com/pipelines/github/Maxwell-Guo/cassandra/482/workflows/93ea3717-9542-4777-9cad-666a6a7c40fa]| > Invalid ut check for CreateTableValidationTest > -- > > Key: CASSANDRA-18870 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18870 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Maxwell Guo >Assignee: Maxwell Guo >Priority: Normal > Fix For: 5.x > > > https://github.com/apache/cassandra/blob/trunk/test/unit/org/apache/cassandra/schema/CreateTableValidationTest.java#L40 > This test case testInvalidBloomFilterFPRatio() makes no sense and cannot > achieve the desired purpose. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-18870) Invalid ut check for CreateTableValidationTest
[ https://issues.apache.org/jira/browse/CASSANDRA-18870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766993#comment-17766993 ] Maxwell Guo edited comment on CASSANDRA-18870 at 9/20/23 2:46 PM: -- hello, [~brandon.williams] [~smiklosovic] can you help to take a look at this little patch ? |pr for trunk |[trunk|https://github.com/apache/cassandra/pull/2709]| |java17 pre-commit|[java17|https://app.circleci.com/pipelines/github/Maxwell-Guo/cassandra/482/workflows/e5c42fef-bf21-4880-b76d-19ffeff62970]| |java11|[java11|https://app.circleci.com/pipelines/github/Maxwell-Guo/cassandra/482/workflows/93ea3717-9542-4777-9cad-666a6a7c40fa]| was (Author: maxwellguo): hello, [~brandon.williams] [~smiklosovic] can you help to take a look at this little patch ? |pr for trunk |[trunk|https://github.com/apache/cassandra/pull/2707]| |java17 pre-commit|[java17|https://app.circleci.com/pipelines/github/Maxwell-Guo/cassandra/482/workflows/e5c42fef-bf21-4880-b76d-19ffeff62970]| |java11|[java11|https://app.circleci.com/pipelines/github/Maxwell-Guo/cassandra/482/workflows/93ea3717-9542-4777-9cad-666a6a7c40fa]| > Invalid ut check for CreateTableValidationTest > -- > > Key: CASSANDRA-18870 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18870 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Maxwell Guo >Assignee: Maxwell Guo >Priority: Normal > Fix For: 5.x > > > https://github.com/apache/cassandra/blob/trunk/test/unit/org/apache/cassandra/schema/CreateTableValidationTest.java#L40 > This test case testInvalidBloomFilterFPRatio() makes no sense and cannot > achieve the desired purpose. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-18830) Set io.netty.transport.noNative to false for in-jvm dtests
[ https://issues.apache.org/jira/browse/CASSANDRA-18830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Miklosovic updated CASSANDRA-18830: -- Change Category: Quality Assurance Complexity: Low Hanging Fruit Component/s: CI Fix Version/s: 5.x Status: Open (was: Triage Needed) > Set io.netty.transport.noNative to false for in-jvm dtests > -- > > Key: CASSANDRA-18830 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18830 > Project: Cassandra > Issue Type: Task > Components: CI >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 5.x > > > This ticket was created as a reaction to (1). > (1) https://lists.apache.org/thread/p42yksvo6t0dy67wmnl2bzpnggo9gpp9 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra-website] branch asf-staging updated (bc17ed95a -> 7101de912)
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git

 discard bc17ed95a generate docs for bc8bfc13
     new 7101de912 generate docs for bc8bfc13

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version. This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (bc17ed95a)
            \
             N -- N -- N   refs/heads/asf-staging (7101de912)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them. Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 site-ui/build/ui-bundle.zip | Bin 4881412 -> 4881412 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)
[jira] [Commented] (CASSANDRA-18845) Waiting for gossip to settle on live endpoints
[ https://issues.apache.org/jira/browse/CASSANDRA-18845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767024#comment-17767024 ] Stefan Miklosovic commented on CASSANDRA-18845: --- I am retracting my note about loopback addresses. It does call waitToSettle. In {code} if (!FBUtilities.getBroadcastAddressAndPort().equals(InetAddressAndPort.getLoopbackAddress())) Gossiper.waitToSettle(); {code} the call "FBUtilities.getBroadcastAddressAndPort().equals(InetAddressAndPort.getLoopbackAddress())" evaluates to false (empirically tested), because broadcast is 127.0.0.2 and loopback is 127.0.0.1. The negated condition is therefore true, so it does call waitToSettle. > Waiting for gossip to settle on live endpoints > -- > > Key: CASSANDRA-18845 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18845 > Project: Cassandra > Issue Type: Improvement >Reporter: Cameron Zemek >Priority: Normal > Attachments: delay.log, example.log, > image-2023-09-14-11-16-23-020.png, stream.log, test1.log, test2.log, test3.log > > > This is a follow up to CASSANDRA-18543 > Although that ticket added ability to set cassandra.gossip_settle_min_wait_ms > this is tedious and error prone. On a node just observed a 79 second gap > between waiting for gossip and the first echo response to indicate a node is > UP. > The problem being that do not want to start Native Transport until gossip > settles otherwise queries can fail consistency such as LOCAL_QUORUM as it > thinks the replicas are still in DOWN state. > Instead of having to set gossip_settle_min_wait_ms I am proposing that > (outside single node cluster) wait for UP message from another node before > considering gossip as settled. Eg. 
> {code:java}
> if (currentSize == epSize && currentLive == liveSize && liveSize > 1)
> {
>     logger.debug("Gossip looks settled.");
>     numOkay++;
> }
> {code}
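Editor's sketch, not taken from the ticket: the point of the comment above is that an exact equality check against 127.0.0.1 does not match 127.0.0.2, even though both sit in the loopback range. The example below uses plain java.net.InetAddress as a stand-in for Cassandra's InetAddressAndPort, which is an assumption made to keep it self-contained; the class and method names are hypothetical.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Illustrative sketch only: plain java.net.InetAddress stands in for
// Cassandra's InetAddressAndPort. All names here are hypothetical.
final class LoopbackCheckSketch
{
    // Builds an IPv4 address from four octets without any DNS lookup.
    static InetAddress ipv4(int a, int b, int c, int d)
    {
        try
        {
            return InetAddress.getByAddress(new byte[] { (byte) a, (byte) b, (byte) c, (byte) d });
        }
        catch (UnknownHostException e)
        {
            // Only thrown for an illegal array length, which cannot happen here.
            throw new IllegalStateException(e);
        }
    }

    // Mirrors the shape of the quoted condition: wait unless the broadcast
    // address is exactly equal to the given loopback address.
    static boolean wouldWaitToSettle(InetAddress broadcast, InetAddress loopback)
    {
        return !broadcast.equals(loopback);
    }

    public static void main(String[] args)
    {
        InetAddress loopback = ipv4(127, 0, 0, 1);
        InetAddress broadcast = ipv4(127, 0, 0, 2);
        System.out.println(wouldWaitToSettle(broadcast, loopback)); // prints true
        System.out.println(broadcast.isLoopbackAddress());          // prints true
    }
}
```

The second line of output shows the distinction: 127.0.0.2 is a loopback-range address according to isLoopbackAddress(), yet it is not equal to 127.0.0.1, so an equality-based guard still lets the wait happen.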
[jira] [Commented] (CASSANDRA-18707) Test failure: junit.framework.TestSuite.org.apache.cassandra.distributed.test.CASMultiDCTest-.jdk11
[ https://issues.apache.org/jira/browse/CASSANDRA-18707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767010#comment-17767010 ] Berenguer Blasi commented on CASSANDRA-18707: - bq. I guess you waited for feedback before propagating to all other branches Yep let's agree first and then it goes into all 4.0+ branches. bq. Also, what do we do with the other failure Andres de la Peña mentioned? I opened CASSANDRA-18851 to handle that one bq. I'd suggest that I should have made it a configurable value back then Seems like an easy change to fold into this ticket. Added to the PR. bq. Berenguer Blasi, did you happen to try that already? Nope. And I am not following I am afraid. 60s pauses on jenkins are common going from my experience fixing previous flakies. I have seen it before. The logs are just empty for that period, there's absolutely nothing to go about. So as Brandon said it is just a timeout and I agree it's bc of the env imo. If we reduce the timeout we'll just make it fail more times, stare at an empty log for that timeout period and then there's nothing we can do about it right? I'm probably missing sthg but if it failed again what do you do without logs? > Test failure: > junit.framework.TestSuite.org.apache.cassandra.distributed.test.CASMultiDCTest-.jdk11 > > > > Key: CASSANDRA-18707 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18707 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/java >Reporter: Ekaterina Dimitrova >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x > > Attachments: TESTS-TestSuites.xml.xz > > > Seen here: > [https://ci-cassandra.apache.org/job/Cassandra-trunk/1650/testReport/junit.framework/TestSuite/org_apache_cassandra_distributed_test_CASMultiDCTest__jdk11/] > h3. > {code:java} > Error Message > Schema agreement not reached. 
Schema versions of the instances: > [ef1c8e05-a06d-388d-a46d-53cc22a94762, 6c386108-1805-3985-b48e-8016012a0207, > 6c386108-1805-3985-b48e-8016012a0207, ef1c8e05-a06d-388d-a46d-53cc22a94762] > Stacktrace > java.lang.IllegalStateException: Schema agreement not reached. Schema > versions of the instances: [ef1c8e05-a06d-388d-a46d-53cc22a94762, > 6c386108-1805-3985-b48e-8016012a0207, 6c386108-1805-3985-b48e-8016012a0207, > ef1c8e05-a06d-388d-a46d-53cc22a94762] at > org.apache.cassandra.distributed.impl.AbstractCluster$ChangeMonitor.waitForCompletion(AbstractCluster.java:907) > at > org.apache.cassandra.distributed.impl.AbstractCluster.lambda$schemaChange$8(AbstractCluster.java:836) > at org.apache.cassandra.concurrent.FutureTask$1.call(FutureTask.java:96) at > org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61) at > org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71) at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.base/java.lang.Thread.run(Thread.java:829) > {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
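Editor's sketch, not taken from the ticket or the PR: one way to read "a configurable value" in the comment above is a wait loop whose timeout comes from a system property with a fallback default. The property name, the default value and every identifier below are hypothetical; the in-jvm dtest framework's actual setting is not shown in this thread.

```java
import java.util.function.BooleanSupplier;

// Illustrative sketch only: a generic poll-until-true helper with a timeout
// read from a system property. Names and defaults are hypothetical.
final class ConfigurableWaitSketch
{
    static final String TIMEOUT_PROPERTY = "sketch.agreement_timeout_ms"; // hypothetical property name
    static final long DEFAULT_TIMEOUT_MS = 10_000L;                       // arbitrary default

    // Reads the timeout from -Dsketch.agreement_timeout_ms=..., falling back to the default.
    static long timeoutMillis()
    {
        return Long.getLong(TIMEOUT_PROPERTY, DEFAULT_TIMEOUT_MS);
    }

    // Polls the condition until it holds or the timeout elapses.
    // Returns true if the condition was observed to hold, false otherwise.
    static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs)
    {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true)
        {
            if (condition.getAsBoolean())
                return true;
            if (System.nanoTime() - deadline >= 0)
                return false;
            try
            {
                Thread.sleep(pollMs);
            }
            catch (InterruptedException e)
            {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }

    public static void main(String[] args)
    {
        long start = System.currentTimeMillis();
        boolean agreed = waitFor(() -> System.currentTimeMillis() - start > 100, timeoutMillis(), 10);
        System.out.println("condition reached: " + agreed);
    }
}
```

As the comment notes, making the value configurable does not help diagnose a stalled environment; it only lets a CI setup choose how long to wait before giving up.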
[jira] [Updated] (CASSANDRA-18871) JMH benchmark improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacek Lewandowski updated CASSANDRA-18871: -- Fix Version/s: 4.0.x 4.1.x 5.0.x 5.1 > JMH benchmark improvements > -- > > Key: CASSANDRA-18871 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18871 > Project: Cassandra > Issue Type: Improvement > Components: Build, Legacy/Tools >Reporter: Jacek Lewandowski >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 4.0.x, 4.1.x, 5.0.x, 5.1 > > > 1. CASSANDRA-12586 introduced {{build-jmh}} task which builds uber jar for > JMH benchmarks which is then not used with {{ant microbench}} task. It is > used though by the {{test/bin/jmh}} script. > In fact, I have no idea why we should use uber jar if JMH can perfectly run > with a regular classpath. Maybe that had something to do with older JMH > version which was used that time. Building uber jars takes time and is > annoying. Since it seems to be redundant anyway, I'm going to remove it and > fix {{test/bin/jmh}} to use a regular classpath. > 2. I'll add support for async profiler in benchmarks. That is, the > {{microbench}} target automatically fetches the async profiler binaries and > adds the necessary args for JMH ({{-prof async...}} in particular) whenever > {{-Dtest.profiler=...}} is specified. If that property is fed with the empty > value, some default options will be applied (defined in the script, can be > negotiated). Otherwise, whatever is passed to the {{test.profiler}} property > will be added as profiler options. > 3. If someone wants to see any additional improvements, please comment on the > ticket.
[jira] [Commented] (CASSANDRA-18871) JMH benchmark improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767063#comment-17767063 ] Berenguer Blasi commented on CASSANDRA-18871: - Nice > JMH benchmark improvements > -- > > Key: CASSANDRA-18871 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18871 > Project: Cassandra > Issue Type: Improvement > Components: Build, Legacy/Tools >Reporter: Jacek Lewandowski >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 4.0.x, 4.1.x, 5.0.x, 5.1 > > > 1. CASSANDRA-12586 introduced {{build-jmh}} task which builds uber jar for > JMH benchmarks which is then not used with {{ant microbench}} task. It is > used though by the {{test/bin/jmh}} script. > In fact, I have no idea why we should use uber jar if JMH can perfectly run > with a regular classpath. Maybe that had something to do with older JMH > version which was used that time. Building uber jars takes time and is > annoying. Since it seems to be redundant anyway, I'm going to remove it and > fix {{test/bin/jmh}} to use a regular classpath. > 2. I'll add support for async profiler in benchmarks. That is, the > {{microbench}} target automatically fetches the async profiler binaries and > adds the necessary args for JMH ({{-prof async...}} in particular) whenever > {{-Dtest.profiler=...}} is specified. If that property is fed with the empty > value, some default options will be applied (defined in the script, can be > negotiated). Otherwise, whatever is passed to the {{test.profiler}} property > will be added as profiler options. > 3. If someone wants to see any additional improvements, please comment on the > ticket.
[jira] [Updated] (CASSANDRA-17296) Test Failure: dtest-upgrade.upgrade_tests.upgrade_through_versions_test.TestProtoV4Upgrade_AllVersions_RandomPartitioner_EndsAt_Trunk_HEAD.test_rolling_upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-17296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-17296: - Fix Version/s: 5.1 (was: 5.x) > Test Failure: > dtest-upgrade.upgrade_tests.upgrade_through_versions_test.TestProtoV4Upgrade_AllVersions_RandomPartitioner_EndsAt_Trunk_HEAD.test_rolling_upgrade > --- > > Key: CASSANDRA-17296 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17296 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Josh McKenzie >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 3.0.30, 4.0.12, 4.1.4, 5.0-alpha2, 5.1 > > > 2 failures in 30, looks flaky on timing / subprocess termination. > https://ci-cassandra.apache.org/job/Cassandra-trunk/920/testReport/dtest-upgrade.upgrade_tests.upgrade_through_versions_test/TestProtoV4Upgrade_AllVersions_RandomPartitioner_EndsAt_Trunk_HEAD/test_rolling_upgrade/ > Failed 2 times in the last 30 runs. Flakiness: 10%, Stability: 93% > Error Message > RuntimeError: A subprocess has terminated early. Subprocess statuses: > Process-1 (is_alive: True), Process-2 (is_alive: False), attempting to > terminate remaining subprocesses now. > Stacktrace > self = > object at 0x7f22685cebb0> > @pytest.mark.timeout(3000) > def test_rolling_upgrade(self): > """ > Test rolling upgrade of the cluster, so we have mixed versions > part way through. > """ > > self.upgrade_scenario(rolling=True) > upgrade_tests/upgrade_through_versions_test.py:320: > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > upgrade_tests/upgrade_through_versions_test.py:398: in upgrade_scenario > self._check_on_subprocs(self.fixture_dtest_setup.subprocs) > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > self = > object at 0x7f22685cebb0> > subprocs = [ exitcode=-SIGKILL daemon>, stopped exitcode=1 daemon>] > def _check_on_subprocs(self, subprocs): > """ > Check on given subprocesses. 
> > If any are not alive, we'll go ahead and terminate any remaining > alive subprocesses since this test is going to fail. > """ > subproc_statuses = [s.is_alive() for s in subprocs] > if not all(subproc_statuses): > message = "A subprocess has terminated early. Subprocess > statuses: " > for s in subprocs: > message += "{name} (is_alive: {aliveness}), > ".format(name=s.name, aliveness=s.is_alive()) > message += "attempting to terminate remaining subprocesses now." > self._terminate_subprocs() > > raise RuntimeError(message) > E RuntimeError: A subprocess has terminated early. Subprocess > statuses: Process-1 (is_alive: True), Process-2 (is_alive: False), attempting > to terminate remaining subprocesses now. > upgrade_tests/upgrade_through_versions_test.py:456: RuntimeError -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-18786) Javadoc BigFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767043#comment-17767043 ] Stefan Miklosovic edited comment on CASSANDRA-18786 at 9/20/23 8:23 AM: Some links are marked as errors because they are referencing private methods (I attached a screenshot). I am not sure we want to do anything about it. was (Author: smiklosovic): Some links are marked as errors because they are referencing private methods (I attached a screenshot). I am not sure we what to do something with it. > Javadoc BigFormat > - > > Key: CASSANDRA-18786 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18786 > Project: Cassandra > Issue Type: Improvement > Components: Documentation/Javadoc >Reporter: Berenguer Blasi >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 5.0.x > > Attachments: screenshot-1.png > > > This ticket intends to go through the current sstables code and javadoc the > format at high-level.
[jira] [Commented] (CASSANDRA-18871) JMH benchmark improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767123#comment-17767123 ] Maxim Muzafarov commented on CASSANDRA-18871: - > Can one specify a specific benchmark to run? Would it be too hard to also add > other parameters? we already can use -Dbenchmark.name=ChecksumBenchmark to run a specific class, or did you mean something else? > JMH benchmark improvements > -- > > Key: CASSANDRA-18871 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18871 > Project: Cassandra > Issue Type: Improvement > Components: Build, Legacy/Tools >Reporter: Jacek Lewandowski >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 4.0.x, 4.1.x, 5.0.x, 5.1 > > > 1. CASSANDRA-12586 introduced {{build-jmh}} task which builds uber jar for > JMH benchmarks which is then not used with {{ant microbench}} task. It is > used though by the {{test/bin/jmh}} script. > In fact, I have no idea why we should use uber jar if JMH can perfectly run > with a regular classpath. Maybe that had something to do with older JMH > version which was used that time. Building uber jars takes time and is > annoying. Since it seems to be redundant anyway, I'm going to remove it and > fix {{test/bin/jmh}} to use a regular classpath. > 2. I'll add support for async profiler in benchmarks. That is, the > {{microbench}} target automatically fetches the async profiler binaries and > adds the necessary args for JMH ({{-prof async...}} in particular) whenever we > run {{microbench-with-profiler}} task. If no additional properties are > provided some default options will be applied (defined in the script, can be > negotiated). Otherwise, whatever is passed to the {{profiler.opts}} property > will be added as profiler options after library path and target directory > definition. > 3. If someone wants to see any additional improvements, please comment on the > ticket. 
[jira] [Commented] (CASSANDRA-18786) Javadoc BigFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767043#comment-17767043 ] Stefan Miklosovic commented on CASSANDRA-18786: --- Some links are marked as errors because they are referencing private methods > Javadoc BigFormat > - > > Key: CASSANDRA-18786 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18786 > Project: Cassandra > Issue Type: Improvement > Components: Documentation/Javadoc >Reporter: Berenguer Blasi >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 5.0.x > > Attachments: screenshot-1.png > > > This ticket intends to go through the current sstables code and javadoc the > format at high-level. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18845) Waiting for gossip to settle on live endpoints
[ https://issues.apache.org/jira/browse/CASSANDRA-18845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767007#comment-17767007 ] Cameron Zemek commented on CASSANDRA-18845: --- [^stream.log] Without this patch, nodes get stuck and are unable to join a large test cluster: {noformat} Sep 20 01:18:51 ip-10-7-20-120 cassandra[5521]: INFO o.a.cassandra.service.StorageService JOINING: Starting to bootstrap... Sep 20 01:18:51 ip-10-7-20-120 cassandra[5521]: Exception (java.lang.RuntimeException) encountered during startup: A node required to move the data consistently is down (/13.237.60.255). If you wish to move the data from a potentially inconsistent replica, restart the node with -Dcassandra.consistent.rangemovement=false Sep 20 01:18:51 ip-10-7-20-120 cassandra[5521]: java.lang.RuntimeException: A node required to move the data consistently is down (/13.237.60.255). If you wish to move the data from a potentially inconsistent replica, restart the node with -Dcassandra.consistent.rangemovement=false Sep 20 01:18:51 ip-10-7-20-120 cassandra[5521]: at org.apache.cassandra.dht.RangeStreamer.getAllRangesWithStrictSourcesFor(RangeStreamer.java:294){noformat} The node is in an endless restart cycle (since our service keeps retrying), and it reports a different IP each time. > Waiting for gossip to settle on live endpoints > -- > > Key: CASSANDRA-18845 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18845 > Project: Cassandra > Issue Type: Improvement >Reporter: Cameron Zemek >Priority: Normal > Attachments: delay.log, example.log, > image-2023-09-14-11-16-23-020.png, stream.log, test1.log, test2.log, test3.log > > > This is a follow up to CASSANDRA-18543 > Although that ticket added ability to set cassandra.gossip_settle_min_wait_ms > this is tedious and error prone. On a node just observed a 79 second gap > between waiting for gossip and the first echo response to indicate a node is > UP. 
> The problem being that do not want to start Native Transport until gossip
> settles otherwise queries can fail consistency such as LOCAL_QUORUM as it
> thinks the replicas are still in DOWN state.
> Instead of having to set gossip_settle_min_wait_ms I am proposing that
> (outside single node cluster) wait for UP message from another node before
> considering gossip as settled. Eg.
> {code:java}
> if (currentSize == epSize && currentLive == liveSize && liveSize > 1)
> {
>     logger.debug("Gossip looks settled.");
>     numOkay++;
> }
> {code}
[jira] [Updated] (CASSANDRA-18786) Javadoc BigFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Miklosovic updated CASSANDRA-18786: -- Attachment: screenshot-1.png > Javadoc BigFormat > - > > Key: CASSANDRA-18786 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18786 > Project: Cassandra > Issue Type: Improvement > Components: Documentation/Javadoc >Reporter: Berenguer Blasi >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 5.0.x > > Attachments: screenshot-1.png > > > This ticket intends to go through the current sstables code and javadoc the > format at high-level. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18836) Replace CRC32 w/ CRC32C in IndexFileUtils.ChecksummingWriter
[ https://issues.apache.org/jira/browse/CASSANDRA-18836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767085#comment-17767085 ] Maxim Muzafarov commented on CASSANDRA-18836: - [~maedhroz], [~jjordan] do you think we need to do something with BufferedChecksumIndexInput, or is fixing the ChecksummingWriter enough for this issue? > Replace CRC32 w/ CRC32C in IndexFileUtils.ChecksummingWriter > > > Key: CASSANDRA-18836 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18836 > Project: Cassandra > Issue Type: Improvement > Components: Feature/2i Index, Feature/SAI >Reporter: Caleb Rackliffe >Assignee: Maxim Muzafarov >Priority: Normal > Fix For: 5.0-alpha2, 5.x > > Time Spent: 10m > Remaining Estimate: 0h > > It seems that now we're on Java 11 for 5.0, there isn't much reason not to > use CRC32C as a drop-in replacement for CRC32. SAI isn't even released, so > has no binary compatibility entanglements, and this should be pretty > straightforward. > See https://github.com/apache/bookkeeper/pull/3309
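Editor's sketch, not taken from the ticket: the "drop-in replacement" idea rests on java.util.zip.CRC32 and java.util.zip.CRC32C (available since Java 9) both implementing java.util.zip.Checksum, so code written against the interface can switch algorithms by changing one constructor call. The class and method names below are hypothetical, and the example says nothing about how IndexFileUtils.ChecksummingWriter is actually structured.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.CRC32C;
import java.util.zip.Checksum;

// Illustrative sketch only: shows that CRC32 and CRC32C share the Checksum
// interface. All names here are hypothetical.
final class ChecksumSwapSketch
{
    // Feeds the whole array to the given algorithm and returns its value.
    static long checksumOf(Checksum algorithm, byte[] data)
    {
        algorithm.update(data, 0, data.length);
        return algorithm.getValue();
    }

    static long crc32(byte[] data)
    {
        return checksumOf(new CRC32(), data);
    }

    static long crc32c(byte[] data)
    {
        return checksumOf(new CRC32C(), data); // the only line that changes
    }

    public static void main(String[] args)
    {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        System.out.println(Long.toHexString(crc32(data)));  // prints cbf43926
        System.out.println(Long.toHexString(crc32c(data))); // prints e3069283
    }
}
```

The two algorithms produce different values for the same bytes, so the swap is only safe where no previously written checksums need to be read back, which matches the ticket's remark that SAI has not been released yet.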
[jira] [Comment Edited] (CASSANDRA-17674) Test failure: org.apache.cassandra.distributed.upgrade.DropCompactStorageTest.testDropCompactStorage
[ https://issues.apache.org/jira/browse/CASSANDRA-17674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766998#comment-17766998 ] Berenguer Blasi edited comment on CASSANDRA-17674 at 9/20/23 6:34 AM: -- I must be doing sthg wrong bc on 4.0 it doesn't seem to pass at all. Can you see if I did sthg silly [~e.dimitrova] [here|https://app.circleci.com/pipelines/github/bereng/cassandra/1070/workflows/edfa498e-938b-48cd-9e1f-9a795753c838/jobs/28988]? On 5.0 everything seems [fine|https://app.circleci.com/pipelines/github/bereng/cassandra/1071/workflows/8a93b6aa-6da6-4374-a3e9-c734510248fd/jobs/29085]. So I'd say this is a 4.0 only issue. Wdyt? was (Author: bereng): I must be doing sthg wrong bc on 4.0 it doesn't seem to pass at all. Can you see if I did sthg silly [~e.dimitrova] [here|https://app.circleci.com/pipelines/github/bereng/cassandra/1070/workflows/edfa498e-938b-48cd-9e1f-9a795753c838/jobs/28988]? On 5.0 everything seems fine. So I'd say this is a 4.0 only issue. Wdyt? 
> Test failure: > org.apache.cassandra.distributed.upgrade.DropCompactStorageTest.testDropCompactStorage > > > Key: CASSANDRA-17674 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17674 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/java >Reporter: Andres de la Peña >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0.x, 5.x > > > The Java upgrade dtest > {{org.apache.cassandra.distributed.upgrade.DropCompactStorageTest.testDropCompactStorage}} > is ~68% flaky on 4.0 and ~2% flaky on trunk, at least in CircleCI: > * 4.0: > [https://app.circleci.com/pipelines/github/adelapena/cassandra/1622/workflows/0086c3b1-a552-4c7a-8278-2f759cee5bdf/jobs/17288] > * trunk: > [https://app.circleci.com/pipelines/github/adelapena/cassandra/1624/workflows/c4ce2b95-998f-459b-830e-8e3fa6637e15/jobs/17293] > The error for 4.0 is: > {code:java} > java.util.concurrent.ExecutionException: java.lang.RuntimeException: > java.util.concurrent.ExecutionException: java.lang.RuntimeException: > java.util.concurrent.TimeoutException: > org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor@6fa17524[Shutting > down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = > 218] did not terminate on time > at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) > at > org.apache.cassandra.distributed.upgrade.UpgradeTestBase$TestCase.run(UpgradeTestBase.java:227) > at > org.apache.cassandra.distributed.upgrade.DropCompactStorageTest.testDropCompactStorage(DropCompactStorageTest.java:49) > Caused by: java.lang.RuntimeException: > java.util.concurrent.ExecutionException: java.lang.RuntimeException: > java.util.concurrent.TimeoutException: > org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor@6fa17524[Shutting > down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = > 218] did not terminate on time > at > 
org.apache.cassandra.distributed.impl.IsolatedExecutor$ThrowingRunnable.lambda$toRunnable$0(IsolatedExecutor.java:235) > at > java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:83) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: java.util.concurrent.TimeoutException: > org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor@6fa17524[Shutting > down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = > 218] did not terminate on time > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.cassandra.distributed.impl.IsolatedExecutor$ThrowingRunnable.lambda$toRunnable$0(IsolatedExecutor.java:231) > Caused by: java.lang.RuntimeException: java.util.concurrent.TimeoutException: > org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor@6fa17524[Shutting > down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = > 218] did not terminate on time > at org.apache.cassandra.utils.Throwables.maybeFail(Throwables.java:62) > at > org.apache.cassandra.distributed.impl.Instance.lambda$shutdown$28(Instance.java:810) > at
[jira] [Comment Edited] (CASSANDRA-17674) Test failure: org.apache.cassandra.distributed.upgrade.DropCompactStorageTest.testDropCompactStorage
[ https://issues.apache.org/jira/browse/CASSANDRA-17674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766998#comment-17766998 ] Berenguer Blasi edited comment on CASSANDRA-17674 at 9/20/23 6:34 AM: -- I must be doing sthg wrong bc on 4.0 it doesn't seem to pass at all. Can you see if I did sthg silly [~e.dimitrova] [here|https://app.circleci.com/pipelines/github/bereng/cassandra/1070/workflows/edfa498e-938b-48cd-9e1f-9a795753c838/jobs/28988]? On 5.0 everything seems fine. So I'd say this is a 4.0 only issue. Wdyt? was (Author: bereng): I must be doing sthg wrong bc on 4.0 it doesn't seem to pass at all. Can you see if I did sthg silly [~e.dimitrova] [here|https://app.circleci.com/pipelines/github/bereng/cassandra/1070/workflows/edfa498e-938b-48cd-9e1f-9a795753c838/jobs/28988]? > Test failure: > org.apache.cassandra.distributed.upgrade.DropCompactStorageTest.testDropCompactStorage > > > Key: CASSANDRA-17674 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17674 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/java >Reporter: Andres de la Peña >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0.x, 5.x > > > The Java upgrade dtest > {{org.apache.cassandra.distributed.upgrade.DropCompactStorageTest.testDropCompactStorage}} > is ~68% flaky on 4.0 and ~2% flaky on trunk, at least in CircleCI: > * 4.0: > [https://app.circleci.com/pipelines/github/adelapena/cassandra/1622/workflows/0086c3b1-a552-4c7a-8278-2f759cee5bdf/jobs/17288] > * trunk: > [https://app.circleci.com/pipelines/github/adelapena/cassandra/1624/workflows/c4ce2b95-998f-459b-830e-8e3fa6637e15/jobs/17293] > The error for 4.0 is: > {code:java} > java.util.concurrent.ExecutionException: java.lang.RuntimeException: > java.util.concurrent.ExecutionException: java.lang.RuntimeException: > java.util.concurrent.TimeoutException: > org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor@6fa17524[Shutting > down, pool size = 1, active threads = 
1, queued tasks = 0, completed tasks = > 218] did not terminate on time > at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) > at > org.apache.cassandra.distributed.upgrade.UpgradeTestBase$TestCase.run(UpgradeTestBase.java:227) > at > org.apache.cassandra.distributed.upgrade.DropCompactStorageTest.testDropCompactStorage(DropCompactStorageTest.java:49) > Caused by: java.lang.RuntimeException: > java.util.concurrent.ExecutionException: java.lang.RuntimeException: > java.util.concurrent.TimeoutException: > org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor@6fa17524[Shutting > down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = > 218] did not terminate on time > at > org.apache.cassandra.distributed.impl.IsolatedExecutor$ThrowingRunnable.lambda$toRunnable$0(IsolatedExecutor.java:235) > at > java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:83) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: java.util.concurrent.TimeoutException: > org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor@6fa17524[Shutting > down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = > 218] did not terminate on time > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.cassandra.distributed.impl.IsolatedExecutor$ThrowingRunnable.lambda$toRunnable$0(IsolatedExecutor.java:231) > Caused by: java.lang.RuntimeException: 
java.util.concurrent.TimeoutException: > org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor@6fa17524[Shutting > down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = > 218] did not terminate on time > at org.apache.cassandra.utils.Throwables.maybeFail(Throwables.java:62) > at > org.apache.cassandra.distributed.impl.Instance.lambda$shutdown$28(Instance.java:810) > at > org.apache.cassandra.distributed.impl.IsolatedExecutor.lambda$null$8(IsolatedExecutor.java:114) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at jav
[jira] [Created] (CASSANDRA-18871) JMH benchmark improvements
Jacek Lewandowski created CASSANDRA-18871: - Summary: JMH benchmark improvements Key: CASSANDRA-18871 URL: https://issues.apache.org/jira/browse/CASSANDRA-18871 Project: Cassandra Issue Type: Improvement Components: Build, Legacy/Tools Reporter: Jacek Lewandowski
1. CASSANDRA-12586 introduced the {{build-jmh}} task, which builds an uber jar for JMH benchmarks that is then not used by the {{ant microbench}} task. It is used, though, by the {{test/bin/jmh}} script. In fact, I have no idea why we should use an uber jar if JMH can run perfectly well with a regular classpath. Maybe that had something to do with the older JMH version that was used at that time. Building uber jars takes time and is annoying. Since it seems to be redundant anyway, I'm going to remove it and fix {{test/bin/jmh}} to use a regular classpath.
2. I'll add support for async profiler in benchmarks. That is, the {{microbench}} target automatically fetches the async profiler binaries and adds the necessary args for JMH ({{-prof async...}} in particular) whenever {{-Dtest.profiler=...}} is specified. If that property is given an empty value, some default options will be applied (defined in the script, can be negotiated). Otherwise, whatever is passed to the {{test.profiler}} property will be added as profiler options.
3. If someone wants to see any additional improvements, please comment on the ticket.
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18870) Invalid ut check for CreateTableValidationTest
[ https://issues.apache.org/jira/browse/CASSANDRA-18870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767118#comment-17767118 ] Brandon Williams commented on CASSANDRA-18870: -- bq. The test case's catch body do not doing the exception message validation .So no matter the cql is illegal or not , the test will still pass. I see, good catch. Unfortunately, if you rebased, I don't think it helped; it still has commits from CASSANDRA-17698 > Invalid ut check for CreateTableValidationTest > -- > > Key: CASSANDRA-18870 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18870 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Maxwell Guo >Assignee: Maxwell Guo >Priority: Normal > Fix For: 5.x > > > https://github.com/apache/cassandra/blob/trunk/test/unit/org/apache/cassandra/schema/CreateTableValidationTest.java#L40 > This test case testInvalidBloomFilterFPRatio() makes no sense and cannot > achieve the desired purpose.
[jira] [Comment Edited] (CASSANDRA-18870) Invalid ut check for CreateTableValidationTest
[ https://issues.apache.org/jira/browse/CASSANDRA-18870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767114#comment-17767114 ] Maxwell Guo edited comment on CASSANDRA-18870 at 9/20/23 1:18 PM: -- Sorry, I may not have noticed; I have rebased again. As for this code in the test case:
{code:java}
try
{
    createTableMayThrow("CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001");
    fail("Expected an fp chance of 0.001 to be rejected");
}
catch (ConfigurationException exc)
{
}
{code}
The test wants to see if the CQL:
{code:java}
"CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001 "
{code}
with bloom_filter_fp_chance = 0.001 is meaningful, see [bf validation|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/schema/TableParams.java#L156]. If the bf chance is less than minBloomFilterFpChanceValue or bigger than 1,
{code:java}
fail("%s must be larger than %s and less than or equal to 1.0 (got %s)", BLOOM_FILTER_FP_CHANCE, minBloomFilterFpChanceValue, bloomFilterFpChance);
{code}
should be thrown. The test case's catch body does not validate the exception message. Actually, the right code should be:
{code:java}
try
{
    createTableMayThrow("CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001");
    fail("Expected an fp chance of 0.001 to be rejected");
}
catch (ConfigurationException exc)
{
    // Check the message too, so that an unrelated ConfigurationException does not make the test pass.
    assertTrue(exc.getMessage().contains("bloom_filter_fp_chance must be larger than " + BloomCalculations.minSupportedBloomFilterFpChance() + " and less than or equal to 1.0 (got 1.0E-7)"));
}
{code}
was (Author: maxwellguo): Sorry, I may not have noticed, rebase again.
as for this code in the test case : {code:java} try { createTableMayThrow("CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001"); fail("Expected an fp chance of 0.001 to be rejected"); } catch (ConfigurationException exc) { } {code} The test wants to see if the cql : "CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001 " with bloom_filter_fp_chance = 0.001 is meaningful, see [fb validation |https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/schema/TableParams.java#L156]. if bf chance is less than minBloomFilterFpChanceValue or bigger than 1 , {code:java} fail("%s must be larger than %s and less than or equal to 1.0 (got %s)", BLOOM_FILTER_FP_CHANCE, minBloomFilterFpChanceValue, bloomFilterFpChance); {code} should be throw. The test case's catch body do not doing the exception message validation . actually the right code should be : {code:java} try { createTableMayThrow("CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001"); fail("Expected an fp chance of 0.001 to be rejected"); } catch (ConfigurationException exc) { assertTrue(exc.getMessage().contains("bloom_filter_fp_chance must be larger than " + BloomCalculations.minSupportedBloomFilterFpChance() + " and less than or equal to 1.0 (got 1.0E-7)")) } {code} > Invalid ut check for CreateTableValidationTest > -- > > Key: CASSANDRA-18870 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18870 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Maxwell Guo >Assignee: Maxwell Guo >Priority: Normal > Fix For: 5.x > > > https://github.com/apache/cassandra/blob/trunk/test/unit/org/apache/cassandra/schema/CreateTableValidationTest.java#L40 > This test case testInvalidBloomFilterFPRatio() makes no sense and cannot > achieve the desired purpose. 
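The point made in the comment above (an empty catch body lets the test pass for any ConfigurationException) can be illustrated outside Cassandra with a small, self-contained sketch. Everything in it is hypothetical: {{SketchConfigurationException}}, {{validateFpChance}} and the {{MIN_FP_CHANCE}} bound of 0.01 are stand-ins invented for the illustration and are not the real Cassandra classes or limits. Only the message format is taken from the ticket.

```java
import java.util.Locale;

class FpChanceValidationSketch
{
    // Hypothetical stand-in for Cassandra's ConfigurationException (not the real class).
    static class SketchConfigurationException extends RuntimeException
    {
        SketchConfigurationException(String message) { super(message); }
    }

    // Assumed lower bound, chosen only for this illustration.
    static final double MIN_FP_CHANCE = 0.01;

    // Rejects values outside (MIN_FP_CHANCE, 1.0], using the message format quoted in the ticket.
    static void validateFpChance(double fpChance)
    {
        if (fpChance <= MIN_FP_CHANCE || fpChance > 1.0)
            throw new SketchConfigurationException(String.format(Locale.ROOT,
                "%s must be larger than %s and less than or equal to 1.0 (got %s)",
                "bloom_filter_fp_chance", MIN_FP_CHANCE, fpChance));
    }

    // True only if validation fails AND the message contains the expected fragment.
    static boolean rejectsWithMessage(double fpChance, String expectedFragment)
    {
        try
        {
            validateFpChance(fpChance);
            return false; // no exception: the value was accepted
        }
        catch (SketchConfigurationException exc)
        {
            // Inspecting the message is what the empty catch body in the ticket leaves out.
            return exc.getMessage().contains(expectedFragment);
        }
    }

    public static void main(String[] args)
    {
        System.out.println(rejectsWithMessage(0.001, "(got 0.001)")); // rejected, message matches
        System.out.println(rejectsWithMessage(0.5, "(got 0.5)"));     // accepted, so not rejected
    }
}
```

Asserting on the message fragment ties the test to the specific validation rule rather than to the exception type alone.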
[jira] [Assigned] (CASSANDRA-18871) JMH benchmark improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacek Lewandowski reassigned CASSANDRA-18871: - Assignee: Jacek Lewandowski > JMH benchmark improvements > -- > > Key: CASSANDRA-18871 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18871 > Project: Cassandra > Issue Type: Improvement > Components: Build, Legacy/Tools >Reporter: Jacek Lewandowski >Assignee: Jacek Lewandowski >Priority: Normal > > 1. CASSANDRA-12586 introduced {{build-jmh}} task which builds uber jar for > JMH benchmarks which is then not used with {{ant microbench}} task. It is > used though by the {{test/bin/jmh}} script. > In fact, I have no idea why we should use uber jar if JMH can perfectly run > with a regular classpath. Maybe that had something to do with older JMH > version which was used that time. Building uber jars takes time and is > annoying. Since it seems to be redundant anyway, I'm going to remove it and > fix {{test/bin/jmh}} to use a regular classpath. > 2. I'll add support for async profiler in benchmarks. That is, the > {{microbench}} target automatically fetches the async profiler binaries and > adds the necessary args for JMH ({{-prof async...}} in particular) whenever > {{-Dtest.profiler=...}} is specified. If that property is fed with the empty > value, some default options will be applied (defined in the script, can be > negotiated). Otherwise, whatever is passed to the {{test.profiler}} property > will be added as profiler options. > 3. If someone wants to see any additional improvements, please comment on the > ticket.
[jira] [Commented] (CASSANDRA-18871) JMH benchmark improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767113#comment-17767113 ] Branimir Lambov commented on CASSANDRA-18871: - Can one specify a specific benchmark to run? Would it be too hard to also add other parameters? > JMH benchmark improvements > -- > > Key: CASSANDRA-18871 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18871 > Project: Cassandra > Issue Type: Improvement > Components: Build, Legacy/Tools >Reporter: Jacek Lewandowski >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 4.0.x, 4.1.x, 5.0.x, 5.1 > > > 1. CASSANDRA-12586 introduced {{build-jmh}} task which builds uber jar for > JMH benchmarks which is then not used with {{ant microbench}} task. It is > used though by the {{test/bin/jmh}} script. > In fact, I have no idea why we should use uber jar if JMH can perfectly run > with a regular classpath. Maybe that had something to do with older JMH > version which was used that time. Building uber jars takes time and is > annoying. Since it seems to be redundant anyway, I'm going to remove it and > fix {{test/bin/jmh}} to use a regular classpath. > 2. I'll add support for async profiler in benchmarks. That is, the > {{microbench}} target automatically fetches the async profiler binaries and > adds the necessary args for JMH ({{-prof async...}} in particular) whenever we > run {{microbench-with-profiler}} task. If no additional properties are > provided some default options will be applied (defined in the script, can be > negotiated). Otherwise, whatever is passed to the {{profiler.opts}} property > will be added as profiler options after library path and target directory > definition. > 3. If someone wants to see any additional improvements, please comment on the > ticket.
[jira] [Commented] (CASSANDRA-18816) Add support for repair coordinator to retry messages that timeout
[ https://issues.apache.org/jira/browse/CASSANDRA-18816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767077#comment-17767077 ] Marcus Eriksson commented on CASSANDRA-18816: - Second round posted. The new tests all seem to fail, so I'll get back to them once they are fixed. There is also a bunch of TODOs in the patch; should those be fixed or removed? > Add support for repair coordinator to retry messages that timeout > - > > Key: CASSANDRA-18816 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18816 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Repair >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Fix For: 5.x > > Time Spent: 9h > Remaining Estimate: 0h > > Now that CASSANDRA-15399 is in, most of the repair messages have a state that > they can check against to make message delivery idempotent, allowing the > coordinator to retry such messages; a few of the most critical messages to > retry are: PREPARE_MSG, VALIDATION_REQ, VALIDATION_RSP, SYNC_REQ, and > SYNC_RSP. > With this I propose making the coordinator able to retry these key messages > to try and make repair more resilient to ephemeral issues.
[jira] [Assigned] (CASSANDRA-18747) Test failure: Fix assertion error AssertionError: Unknown keyspace system_auth\n\tat org.apache.cassandra.db.Keyspace.(Keyspace.java:324)\n\tat org.apache.cas
[ https://issues.apache.org/jira/browse/CASSANDRA-18747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacek Lewandowski reassigned CASSANDRA-18747: - Assignee: Jacek Lewandowski > Test failure: Fix assertion error AssertionError: Unknown keyspace > system_auth\n\tat > org.apache.cassandra.db.Keyspace.(Keyspace.java:324)\n\tat > org.apache.cassandra.db.Keyspace.lambda$open$0(Keyspace.java:162) > --- > > Key: CASSANDRA-18747 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18747 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Ekaterina Dimitrova >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 5.0.x, 5.x > > > I've been seeing this assertion error in different tests lately. > Full error message: > {code:java} > failed on teardown with "Unexpected error found in node logs (see stdout for > full details). Errors: [[node2] 'ERROR [PendingRangeCalculator:1] 2023-08-11 > 16:35:14,445 JVMStabilityInspector.java:70 - Exception in thread > Thread[PendingRangeCalculator:1,5,PendingRangeCalculator]\njava.lang.AssertionError: > Unknown keyspace system_auth\n\tat > org.apache.cassandra.db.Keyspace.(Keyspace.java:324)\n\tat > org.apache.cassandra.db.Keyspace.lambda$open$0(Keyspace.java:162)\n\tat > org.apache.cassandra.utils.concurrent.LoadingMap.blockingLoadIfAbsent(LoadingMap.java:105)\n\tat > > org.apache.cassandra.schema.Schema.maybeAddKeyspaceInstance(Schema.java:251)\n\tat > org.apache.cassandra.db.Keyspace.open(Keyspace.java:162)\n\tat > org.apache.cassandra.db.Keyspace.open(Keyspace.java:151)\n\tat > org.apache.cassandra.service.PendingRangeCalculatorService.lambda$new$1(PendingRangeCalculatorService.java:58)\n\tat > > org.apache.cassandra.concurrent.SingleThreadExecutorPlus$AtLeastOnce.run(SingleThreadExecutorPlus.java:60)\n\tat > > org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)\n\tat > > 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\n\tat > > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n\tat > > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)\n\tat > java.base/java.lang.Thread.run(Thread.java:829)']" Unexpected error found in > node logs (see stdout for full details). Errors: [[node2] 'ERROR > [PendingRangeCalculator:1] 2023-08-11 16:35:14,445 > JVMStabilityInspector.java:70 - Exception in thread > Thread[PendingRangeCalculator:1,5,PendingRangeCalculator]\njava.lang.AssertionError: > Unknown keyspace system_auth\n\tat > org.apache.cassandra.db.Keyspace.(Keyspace.java:324)\n\tat > org.apache.cassandra.db.Keyspace.lambda$open$0(Keyspace.java:162)\n\tat > org.apache.cassandra.utils.concurrent.LoadingMap.blockingLoadIfAbsent(LoadingMap.java:105)\n\tat > > org.apache.cassandra.schema.Schema.maybeAddKeyspaceInstance(Schema.java:251)\n\tat > org.apache.cassandra.db.Keyspace.open(Keyspace.java:162)\n\tat > org.apache.cassandra.db.Keyspace.open(Keyspace.java:151)\n\tat > org.apache.cassandra.service.PendingRangeCalculatorService.lambda$new$1(PendingRangeCalculatorService.java:58)\n\tat > > org.apache.cassandra.concurrent.SingleThreadExecutorPlus$AtLeastOnce.run(SingleThreadExecutorPlus.java:60)\n\tat > > org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)\n\tat > > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\n\tat > > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n\tat > > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)\n\tat > java.base/java.lang.Thread.run(Thread.java:829)']{code} > Example failures: > test_failed_snitch_update_property_file_snitch - > 
[https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/2475/workflows/2086619e-0f21-464b-a866-84aca516b5e5/jobs/36716/tests] > test_gcgs_validation - > [https://ci-cassandra.apache.org/job/Cassandra-trunk/1666/testReport/junit/dtest.materialized_views_test/TestMaterializedViews/test_gcgs_validation/] -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18845) Waiting for gossip to settle on live endpoints
[ https://issues.apache.org/jira/browse/CASSANDRA-18845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767125#comment-17767125 ] Brandon Williams commented on CASSANDRA-18845: -- I don't have time to look at this fully, but one thing you may want to do is something I did on CASSANDRA-18792 to find the issue, which is to add more debugging around the echoes and push it up to debug so I didn't have to cloud everything with TRACE. https://github.com/driftx/cassandra/commit/e1e6b1a0fb0dacc067ddc5910659e1fe6da2cd52 > Waiting for gossip to settle on live endpoints > -- > > Key: CASSANDRA-18845 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18845 > Project: Cassandra > Issue Type: Improvement >Reporter: Cameron Zemek >Priority: Normal > Attachments: delay.log, example.log, > image-2023-09-14-11-16-23-020.png, stream.log, test1.log, test2.log, test3.log > > > This is a follow up to CASSANDRA-18543 > Although that ticket added the ability to set cassandra.gossip_settle_min_wait_ms, > this is tedious and error prone. On one node I just observed a 79 second gap > between waiting for gossip and the first echo response to indicate a node is > UP. > The problem is that we do not want to start Native Transport until gossip > settles, otherwise queries can fail consistency levels such as LOCAL_QUORUM because the node > thinks the replicas are still in DOWN state. > Instead of having to set gossip_settle_min_wait_ms, I am proposing that > (outside a single node cluster) we wait for an UP message from another node before > considering gossip as settled. E.g.
> {code:java}
> if (currentSize == epSize && currentLive == liveSize && liveSize > 1)
> {
>     logger.debug("Gossip looks settled.");
>     numOkay++;
> }
> {code}
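To make the effect of the extra {{liveSize > 1}} condition concrete, here is a minimal, self-contained sketch of a settle loop built around it. It is not the Gossiper code: {{pollsUntilSettled}}, {{replay}}, the supplier-based inputs and the reset-to-zero behaviour on a changed reading are assumptions made for the illustration, and the sleep between polls that a real implementation would need is left out.

```java
import java.util.function.IntSupplier;

class GossipSettleSketch
{
    // Returns the poll number at which gossip is considered settled, or -1 if it never settles.
    // "Settled" here means requiredOkay consecutive polls with unchanged sizes and more than one live endpoint.
    static int pollsUntilSettled(IntSupplier endpointCount, IntSupplier liveCount, int requiredOkay, int maxPolls)
    {
        int epSize = endpointCount.getAsInt();
        int liveSize = liveCount.getAsInt();
        int numOkay = 0;
        for (int poll = 1; poll <= maxPolls; poll++)
        {
            int currentSize = endpointCount.getAsInt();
            int currentLive = liveCount.getAsInt();
            // Condition from the ticket: stable sizes AND at least one other node seen as UP.
            if (currentSize == epSize && currentLive == liveSize && liveSize > 1)
                numOkay++;
            else
                numOkay = 0; // assumption: any change or a lone live node restarts the count
            epSize = currentSize;
            liveSize = currentLive;
            if (numOkay == requiredOkay)
                return poll;
        }
        return -1;
    }

    // Hypothetical helper: replays the given readings, then keeps returning the last one.
    static IntSupplier replay(int... values)
    {
        int[] index = { 0 };
        return () -> values[Math.min(index[0]++, values.length - 1)];
    }

    public static void main(String[] args)
    {
        // Three endpoints known; only this node is live for a while, then all three are.
        System.out.println(pollsUntilSettled(replay(3), replay(1, 1, 1, 3, 3, 3, 3), 3, 10));
        // No other node ever comes UP: the loop gives up instead of reporting "settled".
        System.out.println(pollsUntilSettled(replay(3), replay(1), 3, 10));
    }
}
```

With the condition in place, a stable but lonely view (live count stuck at 1) never counts towards settling, which is the behaviour the proposal is after for multi-node clusters.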
[jira] [Comment Edited] (CASSANDRA-18786) Javadoc BigFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767043#comment-17767043 ] Stefan Miklosovic edited comment on CASSANDRA-18786 at 9/20/23 8:20 AM: Some links are marked as errors because they are referencing private methods (I attached a screenshot). I am not sure whether we want to do something about it. was (Author: smiklosovic): Some links are marked as errors because they are referencing private methods > Javadoc BigFormat > - > > Key: CASSANDRA-18786 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18786 > Project: Cassandra > Issue Type: Improvement > Components: Documentation/Javadoc >Reporter: Berenguer Blasi >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 5.0.x > > Attachments: screenshot-1.png > > > This ticket intends to go through the current sstables code and javadoc the > format at high-level.
[jira] [Comment Edited] (CASSANDRA-18866) Node sends multiple inflight echos
[ https://issues.apache.org/jira/browse/CASSANDRA-18866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767036#comment-17767036 ] Cameron Zemek edited comment on CASSANDRA-18866 at 9/20/23 7:37 AM: going to run overnight the broken dtest that was flagged by the ECHO changes. But with potential fix: {noformat} @Override public void onFailure(InetAddressAndPort from, RequestFailureReason failureReason) { MessagingService.instance().sendWithCallback(echoMessage, addr, this); }{noformat} will report back in the morning. was (Author: cam1982): !18866-regression.patch|width=7,height=7,align=absmiddle! going to run overnight the broken dtest that was flagged by the ECHO changes. But with potential fix: {noformat} @Override public void onFailure(InetAddressAndPort from, RequestFailureReason failureReason) { MessagingService.instance().sendWithCallback(echoMessage, addr, this); }{noformat} will report back in the morning. > Node sends multiple inflight echos > -- > > Key: CASSANDRA-18866 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18866 > Project: Cassandra > Issue Type: Improvement >Reporter: Cameron Zemek >Priority: Normal > Attachments: 18866-regression.patch, duplicates.log, echo.log > > > CASSANDRA-18854 rolled back the changes from CASSANDRA-18845. In particular, > 18845 had change to only allow 1 inflight ECHO request at a time. As per > 18854 some tests have an error rate due to this change. Creating this ticket > to discuss this further. As the current state also does not have retry logic, > it just allowing multiple ECHO requests inflight at the same time so less > likely that all ECHO will timeout or get lost. > With the change from 18845 adding in some extra logging to track what is > going on, I do see it retrying ECHOs. Likewise, I patched a node to drop ECHO > requests from a node and also see it retrying ECHOs when it doesn't get a > reply. 
> Therefore, I think the problem is more specific than the dropping of one ECHO > request. Yes, there is no retry logic for failed ECHO requests, but this is the > case both before and after 18845. ECHO requests are only sent via gossip > verb handlers calling applyStateLocally. In these failed tests I am therefore > assuming there are cases where it won't call markAlive when other nodes consider > the node UP but it is marked DOWN by a node.
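The potential fix quoted in the comment resends the ECHO from {{onFailure}} using the same callback. A minimal sketch of that resend-on-failure pattern follows, with one deliberate difference that is not in the ticket: the retries are bounded by {{maxAttempts}}. {{Callback}}, {{Transport}}, {{sendWithRetry}} and {{failingFirst}} are hypothetical names for the illustration and do not correspond to MessagingService, and the transport is synchronous only to keep the example short.

```java
class EchoRetrySketch
{
    // Hypothetical stand-ins for a request callback and a message transport.
    interface Callback
    {
        void onResponse();
        void onFailure();
    }

    interface Transport
    {
        void send(Callback callback);
    }

    // Sends once and resends from onFailure, up to maxAttempts sends in total.
    // Returns the number of sends used on success, or -1 if every attempt failed.
    static int sendWithRetry(Transport transport, int maxAttempts)
    {
        int[] attempts = { 0 };
        boolean[] alive = { false };
        Callback callback = new Callback()
        {
            public void onResponse()
            {
                alive[0] = true; // in the ticket's setting this is where the node would be marked alive
            }

            public void onFailure()
            {
                // Assumption: bound the retries; the fix quoted in the ticket resends unconditionally.
                if (attempts[0] < maxAttempts)
                {
                    attempts[0]++;
                    transport.send(this); // reuse the same callback, as in the quoted fix
                }
            }
        };
        attempts[0] = 1;
        transport.send(callback);
        return alive[0] ? attempts[0] : -1;
    }

    // Hypothetical transport that fails the first `failures` sends and then responds.
    static Transport failingFirst(int failures)
    {
        int[] remaining = { failures };
        return callback ->
        {
            if (remaining[0]-- > 0)
                callback.onFailure();
            else
                callback.onResponse();
        };
    }

    public static void main(String[] args)
    {
        System.out.println(sendWithRetry(failingFirst(2), 5));   // two lost echoes, third one answered
        System.out.println(sendWithRetry(failingFirst(100), 3)); // never answered within the bound
    }
}
```

Whether a bound (or a delay between resends) is appropriate for the real ECHO path is a design question for the ticket; the sketch only shows the shape of the callback.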
[jira] [Updated] (CASSANDRA-18871) JMH benchmark improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacek Lewandowski updated CASSANDRA-18871: -- Test and Documentation Plan: manual testing on linux and mac Status: Patch Available (was: In Progress) https://github.com/apache/cassandra/pull/2708 > JMH benchmark improvements > -- > > Key: CASSANDRA-18871 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18871 > Project: Cassandra > Issue Type: Improvement > Components: Build, Legacy/Tools >Reporter: Jacek Lewandowski >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 4.0.x, 4.1.x, 5.0.x, 5.1 > > > 1. CASSANDRA-12586 introduced {{build-jmh}} task which builds uber jar for > JMH benchmarks which is then not used with {{ant microbench}} task. It is > used though by the {{test/bin/jmh}} script. > In fact, I have no idea why we should use uber jar if JMH can perfectly run > with a regular classpath. Maybe that had something to do with older JMH > version which was used that time. Building uber jars takes time and is > annoying. Since it seems to be redundant anyway, I'm going to remove it and > fix {{test/bin/jmh}} to use a regular classpath. > 2. I'll add support for async profiler in benchmarks. That is, the > {{microbench}} target automatically fetches the async profiler binaries and > adds the necessary args for JMH ({{-prof async...}} in particular) whenever > {{-Dtest.profiler=...}} is specified. If that property is fed with the empty > value, some default options will be applied (defined in the script, can be > negotiated). Otherwise, whatever is passed to the {{test.profiler}} property > will be added as profiler options. > 3. If someone wants to see any additional improvements, please comment on the > ticket.
[jira] [Comment Edited] (CASSANDRA-18870) Invalid ut check for CreateTableValidationTest
[ https://issues.apache.org/jira/browse/CASSANDRA-18870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767114#comment-17767114 ] Maxwell Guo edited comment on CASSANDRA-18870 at 9/20/23 1:18 PM: -- Sorry, I may not have noticed, rebase again. as for this code in the test case : {code:java} try { createTableMayThrow("CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001"); fail("Expected an fp chance of 0.001 to be rejected"); } catch (ConfigurationException exc) { } {code} The test wants to see if the cql : {code:java} "CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001 " {code} with bloom_filter_fp_chance = 0.001 is meaningful, see [bf validation |https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/schema/TableParams.java#L156]. if bf chance is less than minBloomFilterFpChanceValue or bigger than 1 , {code:java} fail("%s must be larger than %s and less than or equal to 1.0 (got %s)", BLOOM_FILTER_FP_CHANCE, minBloomFilterFpChanceValue, bloomFilterFpChance); {code} should be throw. The test case's catch body do not doing the exception message validation . actually the right code should be : {code:java} try { createTableMayThrow("CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001"); fail("Expected an fp chance of 0.001 to be rejected"); } catch (ConfigurationException exc) { assertTrue(exc.getMessage().contains("bloom_filter_fp_chance must be larger than " + BloomCalculations.minSupportedBloomFilterFpChance() + " and less than or equal to 1.0 (got 1.0E-7)")) } {code} was (Author: maxwellguo): Sorry, I may not have noticed, rebase again. 
as for this code in the test case : {code:java} try { createTableMayThrow("CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001"); fail("Expected an fp chance of 0.001 to be rejected"); } catch (ConfigurationException exc) { } {code} The test wants to see if the cql : {code:java} "CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001 " {code} with bloom_filter_fp_chance = 0.001 is meaningful, see [fb validation |https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/schema/TableParams.java#L156]. if bf chance is less than minBloomFilterFpChanceValue or bigger than 1 , {code:java} fail("%s must be larger than %s and less than or equal to 1.0 (got %s)", BLOOM_FILTER_FP_CHANCE, minBloomFilterFpChanceValue, bloomFilterFpChance); {code} should be throw. The test case's catch body do not doing the exception message validation . actually the right code should be : {code:java} try { createTableMayThrow("CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001"); fail("Expected an fp chance of 0.001 to be rejected"); } catch (ConfigurationException exc) { assertTrue(exc.getMessage().contains("bloom_filter_fp_chance must be larger than " + BloomCalculations.minSupportedBloomFilterFpChance() + " and less than or equal to 1.0 (got 1.0E-7)")) } {code} > Invalid ut check for CreateTableValidationTest > -- > > Key: CASSANDRA-18870 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18870 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Maxwell Guo >Assignee: Maxwell Guo >Priority: Normal > Fix For: 5.x > > > https://github.com/apache/cassandra/blob/trunk/test/unit/org/apache/cassandra/schema/CreateTableValidationTest.java#L40 > This test case testInvalidBloomFilterFPRatio() makes no sense and cannot > achieve the desired purpose. 
[cassandra-website] branch asf-staging updated (3f339a315 -> bc17ed95a)
This is an automated email from the ASF dual-hosted git repository. git-site-role pushed a change to branch asf-staging in repository https://gitbox.apache.org/repos/asf/cassandra-website.git discard 3f339a315 generate docs for bc8bfc13 new bc17ed95a generate docs for bc8bfc13 This update added new revisions after undoing existing revisions. That is to say, some revisions that were in the old version of the branch are not in the new version. This situation occurs when a user --force pushes a change and generates a repository containing something like this: * -- * -- B -- O -- O -- O (3f339a315) \ N -- N -- N refs/heads/asf-staging (bc17ed95a) You should already have received notification emails for all of the O revisions, and so the following emails describe only the N revisions from the common base, B. Any revisions marked "omit" are not gone; other references still refer to them. Any revisions marked "discard" are gone forever. The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: content/search-index.js | 2 +- site-ui/build/ui-bundle.zip | Bin 4881412 -> 4881412 bytes 2 files changed, 1 insertion(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18870) Invalid ut check for CreateTableValidationTest
[ https://issues.apache.org/jira/browse/CASSANDRA-18870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767114#comment-17767114 ]

Maxwell Guo commented on CASSANDRA-18870:
-
Sorry, I may not have noticed; rebased again.

As for this code in the test case:
{code:java}
try
{
    createTableMayThrow("CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001");
    fail("Expected an fp chance of 0.001 to be rejected");
}
catch (ConfigurationException exc)
{
}
{code}
The test wants to check whether the CQL
"CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001 "
with bloom_filter_fp_chance = 0.001 is meaningful, see [bf validation |https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/schema/TableParams.java#L156].
If the bf chance is less than minBloomFilterFpChanceValue or bigger than 1,
{code:java}
fail("%s must be larger than %s and less than or equal to 1.0 (got %s)",
     BLOOM_FILTER_FP_CHANCE,
     minBloomFilterFpChanceValue,
     bloomFilterFpChance);
{code}
should be thrown. The test case's catch body does not validate the exception message.

Actually, the right code should be:
{code:java}
try
{
    createTableMayThrow("CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001");
    fail("Expected an fp chance of 0.001 to be rejected");
}
catch (ConfigurationException exc)
{
    // Check the reason for the rejection, not just the exception type.
    assertTrue(exc.getMessage().contains("bloom_filter_fp_chance must be larger than " + BloomCalculations.minSupportedBloomFilterFpChance() + " and less than or equal to 1.0 (got 1.0E-7)"));
}
{code}

> Invalid ut check for CreateTableValidationTest
> --
>
> Key: CASSANDRA-18870
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18870
> Project: Cassandra
> Issue Type: Bug
> Components: Test/unit
> Reporter: Maxwell Guo
> Assignee: Maxwell Guo
> Priority: Normal
> Fix For: 5.x
>
>
> https://github.com/apache/cassandra/blob/trunk/test/unit/org/apache/cassandra/schema/CreateTableValidationTest.java#L40
> This test case testInvalidBloomFilterFPRatio() makes no sense and cannot
> achieve the desired purpose.
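The point of the comment above, that an empty catch body accepts any ConfigurationException whatever its cause, can be illustrated with a small stand-alone sketch. Everything in it is hypothetical and only mimics the shape of the quoted check: the class name, the nested ConfigurationException, the validate helper and the MIN_FP_CHANCE value are assumptions for illustration, not Cassandra's actual code (the real bound comes from BloomCalculations.minSupportedBloomFilterFpChance()).

```java
import java.util.Locale;

final class FpChanceValidationSketch
{
    // Hypothetical lower bound, chosen only for this sketch.
    static final double MIN_FP_CHANCE = 1.0E-6;

    /** Hypothetical stand-in for Cassandra's ConfigurationException. */
    static final class ConfigurationException extends RuntimeException
    {
        ConfigurationException(String message)
        {
            super(message);
        }
    }

    /** Assumed shape of the quoted check: reject values that are too small or above 1.0. */
    static void validate(double fpChance)
    {
        if (fpChance <= MIN_FP_CHANCE || fpChance > 1.0)
            throw new ConfigurationException(String.format(Locale.ROOT,
                "bloom_filter_fp_chance must be larger than %s and less than or equal to 1.0 (got %s)",
                MIN_FP_CHANCE, fpChance));
    }

    /** Returns true only when the value is rejected AND the message names the expected reason. */
    static boolean rejectedWithExpectedMessage(double fpChance)
    {
        try
        {
            validate(fpChance);
            return false; // value was accepted; a real test would call fail() here
        }
        catch (ConfigurationException e)
        {
            // Checking the message ties the failure to the fp chance rule itself.
            return e.getMessage().contains("bloom_filter_fp_chance must be larger than");
        }
    }
}
```

With these assumed bounds, 1.0E-7 and 1.5 are rejected with the expected message, while 0.001 is accepted, so a test that only catches the exception type would not tell these cases apart by reason.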
[jira] [Comment Edited] (CASSANDRA-18870) Invalid ut check for CreateTableValidationTest
[ https://issues.apache.org/jira/browse/CASSANDRA-18870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767114#comment-17767114 ]

Maxwell Guo edited comment on CASSANDRA-18870 at 9/20/23 1:21 PM:
--
Sorry, I may not have noticed; rebased again.

As for this code in the test case:
{code:java}
try
{
    createTableMayThrow("CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001");
    fail("Expected an fp chance of 0.001 to be rejected");
}
catch (ConfigurationException exc)
{
}
{code}
The test wants to check whether the CQL
{code:java}
"CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001 "
{code}
with bloom_filter_fp_chance = 0.001 is meaningful, see [bf validation |https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/schema/TableParams.java#L156].
If the bf chance is less than minBloomFilterFpChanceValue or bigger than 1,
{code:java}
fail("%s must be larger than %s and less than or equal to 1.0 (got %s)",
     BLOOM_FILTER_FP_CHANCE,
     minBloomFilterFpChanceValue,
     bloomFilterFpChance);
{code}
should be thrown. The test case's catch body does not validate the exception message. So whether or not the CQL is illegal, the test will still pass.

Actually, the right code should be:
{code:java}
try
{
    createTableMayThrow("CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001");
    fail("Expected an fp chance of 0.001 to be rejected");
}
catch (ConfigurationException exc)
{
    assertTrue(exc.getMessage().contains("bloom_filter_fp_chance must be larger than " + BloomCalculations.minSupportedBloomFilterFpChance() + " and less than or equal to 1.0 (got 1.0E-7)"));
}
{code}

was (Author: maxwellguo):
Sorry, I may not have noticed, rebase again.

as for this code in the test case :
{code:java}
try
{
    createTableMayThrow("CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001");
    fail("Expected an fp chance of 0.001 to be rejected");
}
catch (ConfigurationException exc)
{
}
{code}
The test wants to see if the cql :
{code:java}
"CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001 "
{code}
with bloom_filter_fp_chance = 0.001 is meaningful, see [bf validation |https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/schema/TableParams.java#L156].
if bf chance is less than minBloomFilterFpChanceValue or bigger than 1 ,
{code:java}
fail("%s must be larger than %s and less than or equal to 1.0 (got %s)",
     BLOOM_FILTER_FP_CHANCE,
     minBloomFilterFpChanceValue,
     bloomFilterFpChance);
{code}
should be thrown. The test case's catch body do not doing the exception message validation .

Actually the right code should be :
{code:java}
try
{
    createTableMayThrow("CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001");
    fail("Expected an fp chance of 0.001 to be rejected");
}
catch (ConfigurationException exc)
{
    assertTrue(exc.getMessage().contains("bloom_filter_fp_chance must be larger than " + BloomCalculations.minSupportedBloomFilterFpChance() + " and less than or equal to 1.0 (got 1.0E-7)"))
}
{code}

> Invalid ut check for CreateTableValidationTest
> --
>
> Key: CASSANDRA-18870
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18870
> Project: Cassandra
> Issue Type: Bug
> Components: Test/unit
> Reporter: Maxwell Guo
> Assignee: Maxwell Guo
> Priority: Normal
> Fix For: 5.x
>
>
> https://github.com/apache/cassandra/blob/trunk/test/unit/org/apache/cassandra/schema/CreateTableValidationTest.java#L40
> This test case testInvalidBloomFilterFPRatio() makes no sense and cannot
> achieve the desired purpose.
[jira] [Commented] (CASSANDRA-18871) JMH benchmark improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767083#comment-17767083 ]

Maxim Muzafarov commented on CASSANDRA-18871:
-
+1 for point 1.

In addition, I would like to say:
- We must be able to run each benchmark from your IDE using the main method and/or a JUnit runner; there is no way to do that now. To do this, all the benchmarks must have an abstract ancestor that encapsulates all this machinery. For example, Netty has one:
https://github.com/netty/netty/blob/4.1/microbench/src/main/java/io/netty/microbench/util/AbstractMicrobenchmarkBase.java#L44
- We should enforce the names of benchmarks so that they can be run with our in-tree scripts. There are many name variations, e.g. *Bench, *Benchmarks, *BenchTest, etc. For me, this is a problem because it is difficult to filter all these names with scripts. For example, we enforce the "Test" postfix for other tests, but not for benchmarks. As a result, all similar filter patterns look like this:
https://github.com/apache/cassandra/blob/trunk/build.xml#L1782C134-L1782C152

For me, we should align all benchmarks with the "*BenchmarkTest" pattern.

> JMH benchmark improvements
> --
>
> Key: CASSANDRA-18871
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18871
> Project: Cassandra
> Issue Type: Improvement
> Components: Build, Legacy/Tools
> Reporter: Jacek Lewandowski
> Assignee: Jacek Lewandowski
> Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.1
>
>
> 1. CASSANDRA-12586 introduced {{build-jmh}} task which builds uber jar for
> JMH benchmarks which is then not used with {{ant microbench}} task. It is
> used though by the {{test/bin/jmh}} script.
> In fact, I have no idea why we should use uber jar if JMH can perfectly run
> with a regular classpath. Maybe that had something to do with older JMH
> version which was used that time. Building uber jars takes time and is
> annoying. Since it seems to be redundant anyway, I'm going to remove it and
> fix {{test/bin/jmh}} to use a regular classpath.
> 2. I'll add support for async profiler in benchmarks. That is, the
> {{microbench}} target automatically fetches the async profiler binaries and
> adds the necessary args for JMH ({{-prof async...}} in particular) whenever
> {{-Dtest.profiler=...}} is specified. If that property is fed with the empty
> value, some default options will be applied (defined in the script, can be
> negotiated). Otherwise, whatever is passed to the {{test.profiler}} property
> will be added as profiler options.
> 3. If someone wants to see any additional improvements, please comment on the
> ticket.
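The naming point in the comment above can be sketched with plain java.util.regex. This is only an illustration of why a single suffix is easier to filter on; the class name, the two patterns and the benchmark names used with them are hypothetical and are not taken from build.xml or the Cassandra tree.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class BenchmarkNameFilterSketch
{
    // The single suffix proposed in the comment: every benchmark ends in "BenchmarkTest".
    static final Pattern UNIFIED = Pattern.compile(".*BenchmarkTest");

    // What a filter has to enumerate while *Bench, *Benchmarks and *BenchTest coexist.
    static final Pattern MIXED = Pattern.compile(".*(Bench|Benchmarks|BenchTest)");

    /** Returns the class names that fully match the given pattern, in input order. */
    static List<String> select(Pattern pattern, List<String> classNames)
    {
        return classNames.stream()
                         .filter(name -> pattern.matcher(name).matches())
                         .collect(Collectors.toList());
    }
}
```

Given hypothetical names such as ReadBench, WriteBenchmarks, CacheBenchTest and SchemaBenchmarkTest, the MIXED pattern selects the first three and misses the fourth, so every new suffix means another alternative in every script; with one enforced suffix, UNIFIED is the only pattern any script needs.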
[jira] [Updated] (CASSANDRA-18871) JMH benchmark improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacek Lewandowski updated CASSANDRA-18871:
--
    Change Category: Performance
    Complexity: Low Hanging Fruit
    Status: Open (was: Triage Needed)

> JMH benchmark improvements
> --
>
> Key: CASSANDRA-18871
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18871
> Project: Cassandra
> Issue Type: Improvement
> Components: Build, Legacy/Tools
> Reporter: Jacek Lewandowski
> Assignee: Jacek Lewandowski
> Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.1
>
>
> 1. CASSANDRA-12586 introduced {{build-jmh}} task which builds uber jar for
> JMH benchmarks which is then not used with {{ant microbench}} task. It is
> used though by the {{test/bin/jmh}} script.
> In fact, I have no idea why we should use uber jar if JMH can perfectly run
> with a regular classpath. Maybe that had something to do with older JMH
> version which was used that time. Building uber jars takes time and is
> annoying. Since it seems to be redundant anyway, I'm going to remove it and
> fix {{test/bin/jmh}} to use a regular classpath.
> 2. I'll add support for async profiler in benchmarks. That is, the
> {{microbench}} target automatically fetches the async profiler binaries and
> adds the necessary args for JMH ({{-prof async...}} in particular) whenever
> {{-Dtest.profiler=...}} is specified. If that property is fed with the empty
> value, some default options will be applied (defined in the script, can be
> negotiated). Otherwise, whatever is passed to the {{test.profiler}} property
> will be added as profiler options.
> 3. If someone wants to see any additional improvements, please comment on the
> ticket.
[jira] [Comment Edited] (CASSANDRA-18870) Invalid ut check for CreateTableValidationTest
[ https://issues.apache.org/jira/browse/CASSANDRA-18870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767114#comment-17767114 ] Maxwell Guo edited comment on CASSANDRA-18870 at 9/20/23 1:19 PM: -- Sorry, I may not have noticed, rebase again. as for this code in the test case : {code:java} try { createTableMayThrow("CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001"); fail("Expected an fp chance of 0.001 to be rejected"); } catch (ConfigurationException exc) { } {code} The test wants to see if the cql : {code:java} "CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001 " {code} with bloom_filter_fp_chance = 0.001 is meaningful, see [bf validation |https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/schema/TableParams.java#L156]. if bf chance is less than minBloomFilterFpChanceValue or bigger than 1 , {code:java} fail("%s must be larger than %s and less than or equal to 1.0 (got %s)", BLOOM_FILTER_FP_CHANCE, minBloomFilterFpChanceValue, bloomFilterFpChance); {code} should be thrown. The test case's catch body do not doing the exception message validation . Actually the right code should be : {code:java} try { createTableMayThrow("CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001"); fail("Expected an fp chance of 0.001 to be rejected"); } catch (ConfigurationException exc) { assertTrue(exc.getMessage().contains("bloom_filter_fp_chance must be larger than " + BloomCalculations.minSupportedBloomFilterFpChance() + " and less than or equal to 1.0 (got 1.0E-7)")) } {code} was (Author: maxwellguo): Sorry, I may not have noticed, rebase again. 
as for this code in the test case : {code:java} try { createTableMayThrow("CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001"); fail("Expected an fp chance of 0.001 to be rejected"); } catch (ConfigurationException exc) { } {code} The test wants to see if the cql : {code:java} "CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001 " {code} with bloom_filter_fp_chance = 0.001 is meaningful, see [bf validation |https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/schema/TableParams.java#L156]. if bf chance is less than minBloomFilterFpChanceValue or bigger than 1 , {code:java} fail("%s must be larger than %s and less than or equal to 1.0 (got %s)", BLOOM_FILTER_FP_CHANCE, minBloomFilterFpChanceValue, bloomFilterFpChance); {code} should be throw. The test case's catch body do not doing the exception message validation . actually the right code should be : {code:java} try { createTableMayThrow("CREATE TABLE %s (a int PRIMARY KEY, b int) WITH bloom_filter_fp_chance = 0.001"); fail("Expected an fp chance of 0.001 to be rejected"); } catch (ConfigurationException exc) { assertTrue(exc.getMessage().contains("bloom_filter_fp_chance must be larger than " + BloomCalculations.minSupportedBloomFilterFpChance() + " and less than or equal to 1.0 (got 1.0E-7)")) } {code} > Invalid ut check for CreateTableValidationTest > -- > > Key: CASSANDRA-18870 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18870 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Maxwell Guo >Assignee: Maxwell Guo >Priority: Normal > Fix For: 5.x > > > https://github.com/apache/cassandra/blob/trunk/test/unit/org/apache/cassandra/schema/CreateTableValidationTest.java#L40 > This test case testInvalidBloomFilterFPRatio() makes no sense and cannot > achieve the desired purpose. 
[cassandra-website] branch asf-staging updated (3c8c100ae -> 95e761fa2)
This is an automated email from the ASF dual-hosted git repository. git-site-role pushed a change to branch asf-staging in repository https://gitbox.apache.org/repos/asf/cassandra-website.git discard 3c8c100ae generate docs for bc8bfc13 new 95e761fa2 generate docs for bc8bfc13 This update added new revisions after undoing existing revisions. That is to say, some revisions that were in the old version of the branch are not in the new version. This situation occurs when a user --force pushes a change and generates a repository containing something like this: * -- * -- B -- O -- O -- O (3c8c100ae) \ N -- N -- N refs/heads/asf-staging (95e761fa2) You should already have received notification emails for all of the O revisions, and so the following emails describe only the N revisions from the common base, B. Any revisions marked "omit" are not gone; other references still refer to them. Any revisions marked "discard" are gone forever. The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: content/search-index.js | 2 +- site-ui/build/ui-bundle.zip | Bin 4881412 -> 4881412 bytes 2 files changed, 1 insertion(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra-website] branch asf-staging updated (95e761fa2 -> 3f339a315)
This is an automated email from the ASF dual-hosted git repository. git-site-role pushed a change to branch asf-staging in repository https://gitbox.apache.org/repos/asf/cassandra-website.git discard 95e761fa2 generate docs for bc8bfc13 new 3f339a315 generate docs for bc8bfc13 This update added new revisions after undoing existing revisions. That is to say, some revisions that were in the old version of the branch are not in the new version. This situation occurs when a user --force pushes a change and generates a repository containing something like this: * -- * -- B -- O -- O -- O (95e761fa2) \ N -- N -- N refs/heads/asf-staging (3f339a315) You should already have received notification emails for all of the O revisions, and so the following emails describe only the N revisions from the common base, B. Any revisions marked "omit" are not gone; other references still refer to them. Any revisions marked "discard" are gone forever. The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: site-ui/build/ui-bundle.zip | Bin 4881412 -> 4881412 bytes 1 file changed, 0 insertions(+), 0 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18845) Waiting for gossip to settle on live endpoints
[ https://issues.apache.org/jira/browse/CASSANDRA-18845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767037#comment-17767037 ]

Cameron Zemek commented on CASSANDRA-18845:
---
The
{noformat}
(epSize == liveSize || liveSize > 1){noformat}
part breaks dtests. For example:
{noformat}
pytest --force-resource-intensive-tests --cassandra-dir=/home/grom/dev/cassandra materialized_views_test.py::TestMaterializedViews::test_throttled_partition_update{noformat}
This test fails because it shuts down a 5 node cluster and then starts/stops each node one at a time, so liveSize > 1 is never true.

Possible paths forward:
# Make the check that waits for other nodes off by default, so that it requires setting a system property.
# Figure out why there is this large delay between the waitToSettle call and getting ECHO responses.
# Have the tests override cassandra.skip_wait_for_gossip_to_settle
# ?? Some other option I haven't thought of yet.

> Waiting for gossip to settle on live endpoints
> --
>
> Key: CASSANDRA-18845
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18845
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Cameron Zemek
> Priority: Normal
> Attachments: delay.log, example.log,
> image-2023-09-14-11-16-23-020.png, stream.log, test1.log, test2.log, test3.log
>
>
> This is a follow up to CASSANDRA-18543
> Although that ticket added ability to set cassandra.gossip_settle_min_wait_ms
> this is tedious and error prone. On a node just observed a 79 second gap
> between waiting for gossip and the first echo response to indicate a node is
> UP.
> The problem being that do not want to start Native Transport until gossip
> settles otherwise queries can fail consistency such as LOCAL_QUORUM as it
> thinks the replicas are still in DOWN state.
> Instead of having to set gossip_settle_min_wait_ms I am proposing that
> (outside single node cluster) wait for UP message from another node before
> considering gossip as settled. Eg.
> {code:java}
> if (currentSize == epSize && currentLive == liveSize && liveSize > 1)
> {
>     logger.debug("Gossip looks settled.");
>     numOkay++;
> }
> {code}
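Why the extra liveSize > 1 condition never settles when nodes are restarted one at a time can be seen in a small simulation. This is a simplified sketch, not the actual waitToSettle loop: the class and method names, the representation of each poll as an {endpointCount, liveCount} pair, and the reset of the counter on an unsettled poll are all assumptions made for illustration.

```java
final class GossipSettleSketch
{
    /**
     * Replays a sequence of polls, each given as {endpointCount, liveCount}, and reports
     * whether `required` consecutive polls looked settled. With requireLivePeer set, a poll
     * only counts when at least one other node is seen alive (liveSize > 1), mirroring the
     * condition proposed in the ticket description.
     */
    static boolean settles(int[][] polls, int required, boolean requireLivePeer)
    {
        int numOkay = 0;
        int epSize = -1;   // endpoint count seen on the previous poll
        int liveSize = -1; // live count seen on the previous poll
        for (int[] poll : polls)
        {
            int currentSize = poll[0];
            int currentLive = poll[1];
            boolean unchanged = currentSize == epSize && currentLive == liveSize;
            if (unchanged && (!requireLivePeer || liveSize > 1))
                numOkay++;
            else
                numOkay = 0; // assumption: any unsettled poll restarts the count
            epSize = currentSize;
            liveSize = currentLive;
            if (numOkay >= required)
                return true;
        }
        return false;
    }
}
```

A node that is the only live member of a 5 node cluster keeps reporting {5, 1}; without the extra condition those polls settle, while with it the counter never advances, which matches the dtest failure described in the comment.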
[jira] [Updated] (CASSANDRA-17876) remove the unused imports in the source code and fail builds when they are present
[ https://issues.apache.org/jira/browse/CASSANDRA-17876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Semb Wever updated CASSANDRA-17876: --- Fix Version/s: 5.1 > remove the unused imports in the source code and fail builds when they are > present > -- > > Key: CASSANDRA-17876 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17876 > Project: Cassandra > Issue Type: Improvement > Components: Build >Reporter: Ling Mao >Assignee: Ling Mao >Priority: Normal > Labels: pull-request-available > Fix For: 4.1-beta1, 5.0, 5.0-alpha1, 5.1 > > Time Spent: 50m > Remaining Estimate: 0h > > remove the unused imports in the source code > upd. A RedundantImport check was also added to the checkstyle rules. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-18871) JMH benchmark improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacek Lewandowski updated CASSANDRA-18871:
--
    Reviewers: Marianne Lyne Manaog, Maxim Muzafarov (was: Marianne Lyne Manaog)

> JMH benchmark improvements
> --
>
> Key: CASSANDRA-18871
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18871
> Project: Cassandra
> Issue Type: Improvement
> Components: Build, Legacy/Tools
> Reporter: Jacek Lewandowski
> Assignee: Jacek Lewandowski
> Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.1
>
>
> 1. CASSANDRA-12586 introduced {{build-jmh}} task which builds uber jar for
> JMH benchmarks which is then not used with {{ant microbench}} task. It is
> used though by the {{test/bin/jmh}} script.
> In fact, I have no idea why we should use uber jar if JMH can perfectly run
> with a regular classpath. Maybe that had something to do with older JMH
> version which was used that time. Building uber jars takes time and is
> annoying. Since it seems to be redundant anyway, I'm going to remove it and
> fix {{test/bin/jmh}} to use a regular classpath.
> 2. I'll add support for async profiler in benchmarks. That is, the
> {{microbench}} target automatically fetches the async profiler binaries and
> adds the necessary args for JMH ({{-prof async...}} in particular) whenever we
> run {{microbench-with-profiler}} task. If no additional properties are
> provided some default options will be applied (defined in the script, can be
> negotiated). Otherwise, whatever is passed to the {{profiler.opts}} property
> will be added as profiler options after library path and target directory
> definition.
> 3. If someone wants to see any additional improvements, please comment on the
> ticket.
[jira] [Commented] (CASSANDRA-18866) Node sends multiple inflight echos
[ https://issues.apache.org/jira/browse/CASSANDRA-18866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767036#comment-17767036 ]

Cameron Zemek commented on CASSANDRA-18866:
---
18866-regression.patch

I am going to run the broken dtest that was flagged by the ECHO changes overnight, but with a potential fix:
{noformat}
@Override
public void onFailure(InetAddressAndPort from, RequestFailureReason failureReason)
{
    MessagingService.instance().sendWithCallback(echoMessage, addr, this);
}{noformat}
Will report back in the morning.

> Node sends multiple inflight echos
> --
>
> Key: CASSANDRA-18866
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18866
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Cameron Zemek
> Priority: Normal
> Attachments: 18866-regression.patch, duplicates.log, echo.log
>
>
> CASSANDRA-18854 rolled back the changes from CASSANDRA-18845. In particular,
> 18845 had change to only allow 1 inflight ECHO request at a time. As per
> 18854 some tests have an error rate due to this change. Creating this ticket
> to discuss this further. As the current state also does not have retry logic,
> it just allowing multiple ECHO requests inflight at the same time so less
> likely that all ECHO will timeout or get lost.
> With the change from 18845 adding in some extra logging to track what is
> going on, I do see it retrying ECHOs. Likewise, I patched a node to drop ECHO
> requests from a node and also see it retrying ECHOs when it doesn't get a
> reply.
> Therefore, I think the problem is more specific than the dropping of one ECHO
> request. Yes there no retry logic for failed ECHO requests, but this is the
> case even both before and after 18845. ECHO requests are only sent via gossip
> verb handlers calling applyStateLocally. In these failed tests I therefore
> assuming their cases where it won't call markAlive when other nodes consider
> the node UP but its marked DOWN by a node.
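The retry-from-onFailure idea in the comment above can be tried out in isolation with a stand-alone sketch. Nothing here uses Cassandra's MessagingService: the Callback interface, the FlakyTransport class and the maxAttempts bound are hypothetical. In particular, the quoted fix re-sends unconditionally; the bound is an assumption added here so the sketch always terminates.

```java
final class EchoRetrySketch
{
    /** Hypothetical callback with the same two outcomes as an ECHO request. */
    interface Callback
    {
        void onResponse();
        void onFailure();
    }

    /** Hypothetical transport that fails the first `failures` sends and then succeeds. */
    static final class FlakyTransport
    {
        private int remainingFailures;
        int sends = 0;

        FlakyTransport(int failures)
        {
            this.remainingFailures = failures;
        }

        void sendWithCallback(Callback callback)
        {
            sends++;
            if (remainingFailures > 0)
            {
                remainingFailures--;
                callback.onFailure();
            }
            else
            {
                callback.onResponse();
            }
        }
    }

    /** Sends an echo and re-sends it from onFailure, up to maxAttempts sends in total. */
    static boolean echoWithRetry(FlakyTransport transport, int maxAttempts)
    {
        boolean[] alive = { false };
        Callback callback = new Callback()
        {
            @Override
            public void onResponse()
            {
                alive[0] = true; // stands in for marking the endpoint alive
            }

            @Override
            public void onFailure()
            {
                // Re-send with the same callback, as in the quoted fix, but bounded.
                if (transport.sends < maxAttempts)
                    transport.sendWithCallback(this);
            }
        };
        transport.sendWithCallback(callback);
        return alive[0];
    }
}
```

If the first two sends fail and up to five attempts are allowed, the third send succeeds and the endpoint is marked alive; if every send fails, the sketch gives up after maxAttempts instead of retrying forever, which is one design question the unconditional re-send leaves open.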
[jira] [Updated] (CASSANDRA-18866) Node sends multiple inflight echos
[ https://issues.apache.org/jira/browse/CASSANDRA-18866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cameron Zemek updated CASSANDRA-18866: -- Attachment: 18866-regression.patch > Node sends multiple inflight echos > -- > > Key: CASSANDRA-18866 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18866 > Project: Cassandra > Issue Type: Improvement >Reporter: Cameron Zemek >Priority: Normal > Attachments: 18866-regression.patch, duplicates.log, echo.log > > > CASSANDRA-18854 rolled back the changes from CASSANDRA-18845. In particular, > 18845 had change to only allow 1 inflight ECHO request at a time. As per > 18854 some tests have an error rate due to this change. Creating this ticket > to discuss this further. As the current state also does not have retry logic, > it just allowing multiple ECHO requests inflight at the same time so less > likely that all ECHO will timeout or get lost. > With the change from 18845 adding in some extra logging to track what is > going on, I do see it retrying ECHOs. Likewise, I patched a node to drop ECHO > requests from a node and also see it retrying ECHOs when it doesn't get a > reply. > Therefore, I think the problem is more specific than the dropping of one ECHO > request. Yes there no retry logic for failed ECHO requests, but this is the > case even both before and after 18845. ECHO requests are only sent via gossip > verb handlers calling applyStateLocally. In these failed tests I therefore > assuming their cases where it won't call markAlive when other nodes consider > the node UP but its marked DOWN by a node. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org