[jira] [Commented] (CASSANDRA-15448) Throttle the speed of merkletree row hash
[ https://issues.apache.org/jira/browse/CASSANDRA-15448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16995282#comment-16995282 ] maxwellguo commented on CASSANDRA-15448: I don't mean that repair is inefficient; on some low-profile servers, running repair can have a noticeable impact, and that issue remains unresolved. Changing the hash function is one way to alleviate this problem, but I think it may be dangerous, even though SHA-256 has a very low probability of collision. > Throttle the speed of merkletree row hash > -- > > Key: CASSANDRA-15448 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15448 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Repair, Tool/nodetool > Reporter: maxwellguo > Assignee: maxwellguo > Priority: Normal > > In our environment we have some low-profile servers, e.g. 4 cores and 8 GB of > memory, so the Merkle tree calculation during repair can consume a lot of > CPU. Repair can already take a long time, so throttling its speed may make > repair take longer, but it keeps the server more stable. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
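The throttle proposed in this ticket could, in principle, be a simple rate limiter wrapped around the per-row hashing loop. The sketch below is plain stdlib Java and purely illustrative; it is not Cassandra's repair code, and the class name and rate are made up for the example:

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: cap Merkle-tree row hashing at a fixed rows/second rate. */
public class RowHashThrottle {
    private final long nanosPerRow;
    private long nextFreeAt = System.nanoTime();

    public RowHashThrottle(int rowsPerSecond) {
        this.nanosPerRow = TimeUnit.SECONDS.toNanos(1) / rowsPerSecond;
    }

    /** Block until the next row may be hashed. */
    public synchronized void acquire() throws InterruptedException {
        long now = System.nanoTime();
        if (nextFreeAt > now)
            TimeUnit.NANOSECONDS.sleep(nextFreeAt - now);
        // Schedule the slot for the row after this one.
        nextFreeAt = Math.max(now, nextFreeAt) + nanosPerRow;
    }

    public static void main(String[] args) throws InterruptedException {
        RowHashThrottle throttle = new RowHashThrottle(1000); // 1000 rows/s
        long start = System.nanoTime();
        for (int i = 0; i < 100; i++) {
            throttle.acquire();
            // hashRow(row) would go here
        }
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println("hashed 100 rows in ~" + elapsedMs + " ms");
    }
}
```

The trade-off is exactly the one the reporter describes: the loop slows down (100 rows at 1000 rows/s take roughly 100 ms here), in exchange for bounding CPU pressure on small machines.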
[jira] [Commented] (CASSANDRA-15429) Support NodeTool for in-jvm dtest
[ https://issues.apache.org/jira/browse/CASSANDRA-15429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16995177#comment-16995177 ] Dinesh Joshi commented on CASSANDRA-15429: -- Hi [~yifanc], thanks for the PR. I have left a few review comments. > Support NodeTool for in-jvm dtest > - > > Key: CASSANDRA-15429 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15429 > Project: Cassandra > Issue Type: New Feature > Components: Test/dtest > Reporter: Yifan Cai > Assignee: Yifan Cai > Priority: Normal > Labels: pull-request-available > Fix For: 4.0 > > Time Spent: 5h 10m > Remaining Estimate: 0h > > The in-JVM dtest framework does not currently support nodetool. This > functionality is wanted in some tests, e.g. for constructing an end-to-end > test scenario that uses nodetool.
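To illustrate what in-process nodetool support buys an end-to-end test: instead of shelling out to a separate `nodetool` process, the test can dispatch a command name to code running in the same JVM and assert on the exit code directly. This is a toy sketch in plain Java; the class and command wiring are hypothetical and not the actual in-jvm dtest API from the PR:

```java
import java.util.Map;
import java.util.function.Supplier;

/** Toy sketch of in-process command dispatch; NOT the real dtest API. */
public class InProcessNodetool {
    // Each "command" runs inside this JVM and yields an exit code,
    // standing in for a real nodetool command implementation.
    private final Map<String, Supplier<Integer>> commands = Map.of(
        "status", () -> 0,
        "repair", () -> 0,
        "failing-demo", () -> 1);

    public int execute(String command) {
        Supplier<Integer> cmd = commands.get(command);
        if (cmd == null)
            throw new IllegalArgumentException("no such command: " + command);
        return cmd.get();
    }

    public static void main(String[] args) {
        InProcessNodetool nodetool = new InProcessNodetool();
        // A test can assert on the result without spawning a child process.
        System.out.println("status exit code = " + nodetool.execute("status"));
    }
}
```

The point of doing this in-process, as the ticket asks, is that the test shares a JVM (and class loaders) with the node under test, so no external binary or process management is needed.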
[jira] [Updated] (CASSANDRA-15429) Support NodeTool for in-jvm dtest
[ https://issues.apache.org/jira/browse/CASSANDRA-15429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Joshi updated CASSANDRA-15429: - Status: Changes Suggested (was: Review In Progress)
[jira] [Updated] (CASSANDRA-15450) in-jvm dtest cluster uncaughtExceptions propagation of exception goes to the wrong instance, it uses cluster generation when it should be using the instance id
[ https://issues.apache.org/jira/browse/CASSANDRA-15450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-15450: -- Description: In AbstractCluster.uncaughtExceptions, we attempt to get the instance from the class loader using the “generation”. This value is actually the cluster id, which causes tests to fail when multiple tests share the same JVM; it should use the “id” field, which represents the instance id relative to the cluster. (was: In \{code}AbstractCluster.uncaughtExceptions\{code}, we attempt to get the instance from the class loader and used the “generation”. This value is actually the cluster id, so causes tests to fail when multiple tests share the same JVM; it should be using the “id” field which represents the instance id relative to the cluster.) > in-jvm dtest cluster uncaughtExceptions propagation of exception goes to the > wrong instance, it uses cluster generation when it should be using the > instance id > --- > > Key: CASSANDRA-15450 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15450 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest > Reporter: David Capwell > Assignee: David Capwell > Priority: Normal > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > In AbstractCluster.uncaughtExceptions, we attempt to get the instance from > the class loader using the “generation”. This value is actually the cluster > id, which causes tests to fail when multiple tests share the same JVM; it > should use the “id” field, which represents the instance id relative to the > cluster.
[jira] [Updated] (CASSANDRA-15450) in-jvm dtest cluster uncaughtExceptions propagation of exception goes to the wrong instance, it uses cluster generation when it should be using the instance id
[ https://issues.apache.org/jira/browse/CASSANDRA-15450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-15450: -- Description: In \{code}AbstractCluster.uncaughtExceptions\{code}, we attempt to get the instance from the class loader using the “generation”. This value is actually the cluster id, which causes tests to fail when multiple tests share the same JVM; it should use the “id” field, which represents the instance id relative to the cluster. (was: In AbstractCluster.uncaughtExceptions, we attempt to get the instance from the class loader and used the “generation”. This value is actually the cluster id, so causes tests to fail when multiple tests share the same JVM; it should be using the “id” field which represents the instance id relative to the cluster.)
[jira] [Commented] (CASSANDRA-15450) in-jvm dtest cluster uncaughtExceptions propagation of exception goes to the wrong instance, it uses cluster generation when it should be using the instance id
[ https://issues.apache.org/jira/browse/CASSANDRA-15450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16995117#comment-16995117 ] David Capwell commented on CASSANDRA-15450: --- [~drohrer], could you review, since you were the one who reported the issue? [~ifesdjeen], could you also review?
[jira] [Updated] (CASSANDRA-15450) in-jvm dtest cluster uncaughtExceptions propagation of exception goes to the wrong instance, it uses cluster generation when it should be using the instance id
[ https://issues.apache.org/jira/browse/CASSANDRA-15450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-15450: -- Test and Documentation Plan: Patch located here: https://github.com/apache/cassandra/pull/397 Replicated the issue in IntelliJ by selecting {code}GossipSettlesTest{code} and {code}FailingRepairTest{code} and running them together. Before this patch, FailingRepairTest hangs until timeout because it never sees the JVM kill attempt (it went to the wrong host); after this patch, the correct host is killed. Status: Patch Available (was: Open)
[jira] [Updated] (CASSANDRA-15450) in-jvm dtest cluster uncaughtExceptions propagation of exception goes to the wrong instance, it uses cluster generation when it should be using the instance id
[ https://issues.apache.org/jira/browse/CASSANDRA-15450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated CASSANDRA-15450: --- Labels: pull-request-available (was: )
[jira] [Updated] (CASSANDRA-15450) in-jvm dtest cluster uncaughtExceptions propagation of exception goes to the wrong instance, it uses cluster generation when it should be using the instance id
[ https://issues.apache.org/jira/browse/CASSANDRA-15450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-15450: -- Bug Category: Parent values: Code(13163)Level 1 values: Bug - Unclear Impact(13164) Complexity: Low Hanging Fruit Discovered By: Unit Test Severity: Normal Status: Open (was: Triage Needed)
[jira] [Created] (CASSANDRA-15450) in-jvm dtest cluster uncaughtExceptions propagation of exception goes to the wrong instance, it uses cluster generation when it should be using the instance id
David Capwell created CASSANDRA-15450: - Summary: in-jvm dtest cluster uncaughtExceptions propagation of exception goes to the wrong instance, it uses cluster generation when it should be using the instance id Key: CASSANDRA-15450 URL: https://issues.apache.org/jira/browse/CASSANDRA-15450 Project: Cassandra Issue Type: Bug Components: Test/dtest Reporter: David Capwell Assignee: David Capwell In AbstractCluster.uncaughtExceptions, we attempt to get the instance from the class loader using the “generation”. This value is actually the cluster id, which causes tests to fail when multiple tests share the same JVM; it should use the “id” field, which represents the instance id relative to the cluster.
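The bug reported above boils down to using a cluster-wide generation number where a per-cluster instance id is needed. A toy illustration in plain stdlib Java (the names and map are hypothetical, not the actual dtest classes): looking an instance up by the generation alone routes the uncaught exception to the wrong node, while keying on both the generation and the instance id finds the right one.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy illustration of CASSANDRA-15450; names are hypothetical, not the real dtest code.
 *  An instance is identified by BOTH its cluster generation and its id within the cluster. */
public class ExceptionRouting {
    static String key(int generation, int instanceId) { return generation + "/" + instanceId; }

    public static void main(String[] args) {
        Map<String, String> instances = new HashMap<>();
        instances.put(key(1, 1), "node1");
        instances.put(key(1, 2), "node2");

        int generation = 1, instanceId = 2; // the uncaught exception came from node2
        // Buggy lookup: reuses the cluster-wide generation where the instance id belongs.
        String buggy = instances.get(key(generation, generation));
        // Fixed lookup: uses the instance id relative to the cluster.
        String fixed = instances.get(key(generation, instanceId));
        System.out.println("buggy -> " + buggy + ", fixed -> " + fixed);
        // prints: buggy -> node1, fixed -> node2
    }
}
```

This also shows why the symptom in the test plan is a hang rather than an error: the kill attempt is delivered to a node that was never expecting it, and the intended target keeps running.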
[jira] [Updated] (CASSANDRA-15449) Credentials out of sync after replacing the nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-15449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jai Bheemsen Rao Dhanwada updated CASSANDRA-15449: -- Impacts: Clients (was: None) > Credentials out of sync after replacing the nodes > - > > Key: CASSANDRA-15449 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15449 > Project: Cassandra > Issue Type: Bug > Reporter: Jai Bheemsen Rao Dhanwada > Priority: Normal > Attachments: Screen Shot 2019-12-12 at 11.13.52 AM.png > > > Hello, > We are seeing a strange issue: after replacing multiple C* nodes in the > cluster, a few nodes intermittently end up without any credentials and > client queries fail. > Here is the sequence of steps: > 1. On a multi-DC C* cluster (12 nodes in each DC), we replaced all the nodes > in one DC. > 2. To replace a node, we kill it, launch a new node with > {{-Dcassandra.replace_address=}}, and proceed to the next node once the new > node has bootstrapped and CQL is enabled. > 3. This process works fine, but all of a sudden our application started > failing with the below errors in the logs: > {quote}com.datastax.driver.core.exceptions.UnauthorizedException: User abc > has no SELECT permission on or any of its parents at > com.datastax.driver.core.exceptions.UnauthorizedException.copy(UnauthorizedException.java:59) > at > com.datastax.driver.core.exceptions.UnauthorizedException.copy(UnauthorizedException.java:25) > at > {quote} > 4. At this stage, 3 nodes in the cluster take zero traffic, while the rest > of the nodes are serving ~100 requests. (attached the metrics) > 5. We suspected a credentials sync issue, manually synced the credentials, > and restarted the nodes with 0 requests, which fixed the problem. > Also, on a few C* nodes we see the below exception immediately after > bootstrap completes, and the process dies. Is this contributing to the > credentials issue? > NOTE: The C* nodes with zero traffic and the nodes with the below exception > are not the same. > {quote}ERROR [main] 2019-12-12 05:34:40,412 CassandraDaemon.java:583 - > Exception encountered during startup > java.lang.AssertionError: > org.apache.cassandra.exceptions.InvalidRequestException: Undefined name > salted_hash in selection clause > at > org.apache.cassandra.auth.PasswordAuthenticator.setup(PasswordAuthenticator.java:202) > ~[apache-cassandra-2.1.16.jar:2.1.16] > at org.apache.cassandra.auth.Auth.setup(Auth.java:144) > ~[apache-cassandra-2.1.16.jar:2.1.16] > at > org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:996) > ~[apache-cassandra-2.1.16.jar:2.1.16] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:740) > ~[apache-cassandra-2.1.16.jar:2.1.16] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:617) > ~[apache-cassandra-2.1.16.jar:2.1.16] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:391) > [apache-cassandra-2.1.16.jar:2.1.16] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:566) > [apache-cassandra-2.1.16.jar:2.1.16] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:655) > [apache-cassandra-2.1.16.jar:2.1.16] > Caused by: org.apache.cassandra.exceptions.InvalidRequestException: > Undefined name salted_hash in selection clause > at > org.apache.cassandra.cql3.statements.Selection.fromSelectors(Selection.java:292) > ~[apache-cassandra-2.1.16.jar:2.1.16] > at > org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepare(SelectStatement.java:1592) > ~[apache-cassandra-2.1.16.jar:2.1.16] > at > org.apache.cassandra.auth.PasswordAuthenticator.setup(PasswordAuthenticator.java:198) > ~[apache-cassandra-2.1.16.jar:2.1.16] > ... 7 common frames omitted > {quote} > Not sure why this is happening; is this a potential bug, or are there any > other pointers to fix the problem? > C* Version: 2.1.16 > Client: DataStax Java Driver > system_auth RF: 3, dc-1:3 and dc-2:3
[jira] [Updated] (CASSANDRA-15449) Credentials out of sync after replacing the nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-15449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jai Bheemsen Rao Dhanwada updated CASSANDRA-15449: -- Description: Hello, We are seeing a strange issue: after replacing multiple C* nodes in the cluster, a few nodes intermittently end up without any credentials and client queries fail. Here is the sequence of steps: 1. On a multi-DC C* cluster (12 nodes in each DC), we replaced all the nodes in one DC. 2. To replace a node, we kill it, launch a new node with {{-Dcassandra.replace_address=}}, and proceed to the next node once the new node has bootstrapped and CQL is enabled. 3. This process works fine, but all of a sudden our application started failing with the below errors in the logs: {quote}com.datastax.driver.core.exceptions.UnauthorizedException: User abc has no SELECT permission on or any of its parents at com.datastax.driver.core.exceptions.UnauthorizedException.copy(UnauthorizedException.java:59) at com.datastax.driver.core.exceptions.UnauthorizedException.copy(UnauthorizedException.java:25) at {quote} 4. At this stage, 3 nodes in the cluster take zero traffic, while the rest of the nodes are serving ~100 requests. (attached the metrics) 5. We suspected a credentials sync issue, manually synced the credentials, and restarted the nodes with 0 requests, which fixed the problem. Also, on a few C* nodes we see the below exception immediately after bootstrap completes, and the process dies. Is this contributing to the credentials issue? NOTE: The C* nodes with zero traffic and the nodes with the below exception are not the same. 
{quote}ERROR [main] 2019-12-12 05:34:40,412 CassandraDaemon.java:583 - Exception encountered during startup java.lang.AssertionError: org.apache.cassandra.exceptions.InvalidRequestException: Undefined name salted_hash in selection clause at org.apache.cassandra.auth.PasswordAuthenticator.setup(PasswordAuthenticator.java:202) ~[apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.auth.Auth.setup(Auth.java:144) ~[apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:996) ~[apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.service.StorageService.initServer(StorageService.java:740) ~[apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.service.StorageService.initServer(StorageService.java:617) ~[apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:391) [apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:566) [apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:655) [apache-cassandra-2.1.16.jar:2.1.16] Caused by: org.apache.cassandra.exceptions.InvalidRequestException: Undefined name salted_hash in selection clause at org.apache.cassandra.cql3.statements.Selection.fromSelectors(Selection.java:292) ~[apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepare(SelectStatement.java:1592) ~[apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.auth.PasswordAuthenticator.setup(PasswordAuthenticator.java:198) ~[apache-cassandra-2.1.16.jar:2.1.16] ... 7 common frames omitted {quote} Not sure why this is happening, is this a potential bug or any other pointers to fix the problem. C* Version: 2.1.16 Client: Datastax Java Driver. 
system_auth RF: 3, dc-1:3 and dc-2:3 was: Hello, We are seeing a strange issue where, after replacing multiple C* nodes from the clusters intermittently we see an issue where few nodes doesn't have any credentials and the client queries fail. Here are the sequence of steps 1. on a Multi DC C* cluster(12 nodes in each DC), we replaced all the nodes in one DC. 2. The approach we took to replace the nodes is kill one node and launch a new node with {{-Dcassandra.replace_address=}} and proceed with next node once the node is bootstrapped and CQL is enabled. 3. This process works fine and all of a sudden, we started seeing our application started failing with the below errors in the logs {quote}com.datastax.driver.core.exceptions.UnauthorizedException: User abc has no SELECT permission on or any of its parents at com.datastax.driver.core.exceptions.UnauthorizedException.copy(UnauthorizedException.java:59) at com.datastax.driver.core.exceptions.UnauthorizedException.copy(UnauthorizedException.java:25) at {quote} 4. At this stage we see that 3 nodes in the cluster takes zero traffic, while rest of the nodes are serving ~100 requests. (attached the metrics) !Screen Shot 2019-12-12 at 11.13.52 AM.png! 5. We suspect some credentials sync issue and
[jira] [Updated] (CASSANDRA-15449) Credentials out of sync after replacing the nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-15449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jai Bheemsen Rao Dhanwada updated CASSANDRA-15449: -- Attachment: Screen Shot 2019-12-12 at 11.13.52 AM.png
[jira] [Updated] (CASSANDRA-15449) Credentials out of sync after replacing the nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-15449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jai Bheemsen Rao Dhanwada updated CASSANDRA-15449: -- Description: Hello, We are seeing a strange issue: after replacing multiple C* nodes in the cluster, a few nodes intermittently end up without any credentials and client queries fail. Here is the sequence of steps: 1. On a multi-DC C* cluster (12 nodes in each DC), we replaced all the nodes in one DC. 2. To replace a node, we kill it, launch a new node with {{-Dcassandra.replace_address=}}, and proceed to the next node once the new node has bootstrapped and CQL is enabled. 3. This process works fine, but all of a sudden our application started failing with the below errors in the logs: {quote}com.datastax.driver.core.exceptions.UnauthorizedException: User abc has no SELECT permission on or any of its parents at com.datastax.driver.core.exceptions.UnauthorizedException.copy(UnauthorizedException.java:59) at com.datastax.driver.core.exceptions.UnauthorizedException.copy(UnauthorizedException.java:25) at {quote} 4. At this stage, 3 nodes in the cluster take zero traffic, while the rest of the nodes are serving ~100 requests. (attached the metrics) !Screen Shot 2019-12-12 at 11.13.52 AM.png! 5. We suspected a credentials sync issue, manually synced the credentials, and restarted the nodes with 0 requests, which fixed the problem. Also, on a few C* nodes we see the below exception immediately after bootstrap completes, and the process dies. Is this contributing to the credentials issue? NOTE: The C* nodes with zero traffic and the nodes with the below exception are not the same. 
{quote}ERROR [main] 2019-12-12 05:34:40,412 CassandraDaemon.java:583 - Exception encountered during startup java.lang.AssertionError: org.apache.cassandra.exceptions.InvalidRequestException: Undefined name salted_hash in selection clause at org.apache.cassandra.auth.PasswordAuthenticator.setup(PasswordAuthenticator.java:202) ~[apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.auth.Auth.setup(Auth.java:144) ~[apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:996) ~[apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.service.StorageService.initServer(StorageService.java:740) ~[apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.service.StorageService.initServer(StorageService.java:617) ~[apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:391) [apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:566) [apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:655) [apache-cassandra-2.1.16.jar:2.1.16] Caused by: org.apache.cassandra.exceptions.InvalidRequestException: Undefined name salted_hash in selection clause at org.apache.cassandra.cql3.statements.Selection.fromSelectors(Selection.java:292) ~[apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepare(SelectStatement.java:1592) ~[apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.auth.PasswordAuthenticator.setup(PasswordAuthenticator.java:198) ~[apache-cassandra-2.1.16.jar:2.1.16] ... 7 common frames omitted {quote} Not sure why this is happening, is this a potential bug or any other pointers to fix the problem. C* Version: 2.1.16 Client: Datastax Java Driver. 
system_auth RF: 3, dc-1:3 and dc-2:3 was: Hello, We are seeing a strange issue where, after replacing multiple C* nodes from the clusters intermittently we see an issue where few nodes doesn't have any credentials and the client queries fail. Here are the sequence of steps 1. on a Multi DC C* cluster(12 nodes in each DC), we replaced all the nodes in one DC. 2. The approach we took to replace the nodes is kill one node and launch a new node with {{-Dcassandra.replace_address=}} and proceed with next node once the node is bootstrapped and CQL is enabled. 3. This process works fine and all of a sudden, we started seeing our application started failing with the below errors in the logs {quote}com.datastax.driver.core.exceptions.UnauthorizedException: User abc has no SELECT permission on or any of its parents at com.datastax.driver.core.exceptions.UnauthorizedException.copy(UnauthorizedException.java:59) at com.datastax.driver.core.exceptions.UnauthorizedException.copy(UnauthorizedException.java:25) at {quote} 4. At this stage we see that 3 nodes in the cluster takes zero traffic, while rest of the nodes are serving ~100 requests. (attached the metrics) !Screen Shot 2019-12-12 at 11.13.52 AM.png! 5.
[jira] [Created] (CASSANDRA-15449) Credentials out of sync after replacing the nodes
Jai Bheemsen Rao Dhanwada created CASSANDRA-15449: - Summary: Credentials out of sync after replacing the nodes Key: CASSANDRA-15449 URL: https://issues.apache.org/jira/browse/CASSANDRA-15449 Project: Cassandra Issue Type: Bug Reporter: Jai Bheemsen Rao Dhanwada Attachments: Screen Shot 2019-12-12 at 11.13.52 AM.png Hello, We are seeing a strange issue where, after replacing multiple C* nodes in the cluster, we intermittently find that a few nodes don't have any credentials and client queries fail. Here is the sequence of steps: 1. On a multi-DC C* cluster (12 nodes in each DC), we replaced all the nodes in one DC. 2. The approach we took to replace the nodes was to kill one node, launch a new node with {{-Dcassandra.replace_address=}}, and proceed to the next node once the new node was bootstrapped and CQL was enabled. 3. This process worked fine until, all of a sudden, our application started failing with the below errors in the logs {quote}com.datastax.driver.core.exceptions.UnauthorizedException: User abc has no SELECT permission on or any of its parents at com.datastax.driver.core.exceptions.UnauthorizedException.copy(UnauthorizedException.java:59) at com.datastax.driver.core.exceptions.UnauthorizedException.copy(UnauthorizedException.java:25) at {quote} 4. At this stage we see that 3 nodes in the cluster take zero traffic, while the rest of the nodes are serving ~100 requests (metrics attached). !Screen Shot 2019-12-12 at 11.13.52 AM.png! 5. We suspected a credentials sync issue, so we manually synced the credentials and restarted the nodes with 0 requests, which fixed the problem. Also, on a few C* nodes we see the below exception immediately after the bootstrap completes, and then the process dies. Is this contributing to the credentials issue? NOTE: The C* nodes with zero traffic and the nodes with the below exception are not the same. 
{quote}ERROR [main] 2019-12-12 05:34:40,412 CassandraDaemon.java:583 - Exception encountered during startup java.lang.AssertionError: org.apache.cassandra.exceptions.InvalidRequestException: Undefined name salted_hash in selection clause at org.apache.cassandra.auth.PasswordAuthenticator.setup(PasswordAuthenticator.java:202) ~[apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.auth.Auth.setup(Auth.java:144) ~[apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:996) ~[apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.service.StorageService.initServer(StorageService.java:740) ~[apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.service.StorageService.initServer(StorageService.java:617) ~[apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:391) [apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:566) [apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:655) [apache-cassandra-2.1.16.jar:2.1.16] Caused by: org.apache.cassandra.exceptions.InvalidRequestException: Undefined name salted_hash in selection clause at org.apache.cassandra.cql3.statements.Selection.fromSelectors(Selection.java:292) ~[apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepare(SelectStatement.java:1592) ~[apache-cassandra-2.1.16.jar:2.1.16] at org.apache.cassandra.auth.PasswordAuthenticator.setup(PasswordAuthenticator.java:198) ~[apache-cassandra-2.1.16.jar:2.1.16] ... 7 common frames omitted {quote} Not sure why this is happening, is this a potential bug or any other pointers to fix the problem. C* Version: 2.1.16 Client: Datastax Java Driver. 
system_auth RF: 3, dc-1:3 and dc-2:3 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15210) Streaming with CDC does not honor cdc_enabled
[ https://issues.apache.org/jira/browse/CASSANDRA-15210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16994963#comment-16994963 ] Jeremiah Jordan commented on CASSANDRA-15210: - [~aprudhomme] if your patch is ready for review, then hit the "submit patch" button above to let people know that. > Streaming with CDC does not honor cdc_enabled > - > > Key: CASSANDRA-15210 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15210 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Streaming, Feature/Change Data Capture >Reporter: Andrew Prudhomme >Assignee: Andrew Prudhomme >Priority: Normal > > When SSTables are streamed for a CDC enabled table, the updates are processed > through the write path to ensure they are made available through the commit > log. However, currently only the CDC state of the table is checked. Since CDC > is enabled at both the node and table level, a node with CDC disabled (with > cdc_enabled: false) will unnecessarily send updates through the write path if > CDC is enabled on the table. This seems like an oversight. > I'd imagine the fix would be something like > > {code:java} > - hasCDC = cfs.metadata.params.cdc; > + hasCDC = cfs.metadata.params.cdc && > DatabaseDescriptor.isCDCEnabled();{code} > in > org.apache.cassandra.db.streaming.CassandraStreamReceiver (4) > org.apache.cassandra.streaming.StreamReceiveTask (3.11) > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15210) Streaming with CDC does not honor cdc_enabled
[ https://issues.apache.org/jira/browse/CASSANDRA-15210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremiah Jordan updated CASSANDRA-15210: Bug Category: Parent values: Correctness(12982)Level 1 values: API / Semantic Implementation(12988) Complexity: Normal Discovered By: User Report Severity: Normal Status: Open (was: Triage Needed) > Streaming with CDC does not honor cdc_enabled > - > > Key: CASSANDRA-15210 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15210 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Streaming, Feature/Change Data Capture >Reporter: Andrew Prudhomme >Assignee: Andrew Prudhomme >Priority: Normal > > When SSTables are streamed for a CDC enabled table, the updates are processed > through the write path to ensure they are made available through the commit > log. However, currently only the CDC state of the table is checked. Since CDC > is enabled at both the node and table level, a node with CDC disabled (with > cdc_enabled: false) will unnecessarily send updates through the write path if > CDC is enabled on the table. This seems like an oversight. > I'd imagine the fix would be something like > > {code:java} > - hasCDC = cfs.metadata.params.cdc; > + hasCDC = cfs.metadata.params.cdc && > DatabaseDescriptor.isCDCEnabled();{code} > in > org.apache.cassandra.db.streaming.CassandraStreamReceiver (4) > org.apache.cassandra.streaming.StreamReceiveTask (3.11) > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
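The one-line fix quoted in the ticket is just a conjunction of the two flags: streamed mutations should only go through the write path when CDC is enabled at both the table level and the node level. A minimal, self-contained sketch of that guard (class and field names here are illustrative, not Cassandra's actual API):

```java
// Illustrative sketch of the guard proposed in CASSANDRA-15210.
// CdcGuard and its fields are hypothetical names, not Cassandra classes.
class CdcGuard {
    final boolean tableCdcEnabled;  // stands in for cfs.metadata.params.cdc
    final boolean nodeCdcEnabled;   // stands in for cdc_enabled in cassandra.yaml

    CdcGuard(boolean tableCdcEnabled, boolean nodeCdcEnabled) {
        this.tableCdcEnabled = tableCdcEnabled;
        this.nodeCdcEnabled = nodeCdcEnabled;
    }

    /** Streamed SSTables take the (expensive) write path only when both flags agree. */
    boolean shouldWriteThroughCommitLog() {
        return tableCdcEnabled && nodeCdcEnabled;
    }

    public static void main(String[] args) {
        // Table has CDC on, but the node has cdc_enabled: false -> skip the write path.
        System.out.println(new CdcGuard(true, false).shouldWriteThroughCommitLog());  // false
    }
}
```

Before the fix, only the table-level flag was consulted, so a node with {{cdc_enabled: false}} still paid the write-path cost during streaming.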
[jira] [Updated] (CASSANDRA-15295) Running into deadlock when do CommitLog initialization
[ https://issues.apache.org/jira/browse/CASSANDRA-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Joshi updated CASSANDRA-15295: - Authors: Dinesh Joshi, Zephyr Guo (was: Zephyr Guo) Fix Version/s: 4.0 Since Version: 4.0 Source Control Link: https://github.com/apache/cassandra/commit/3a8300e0b86c4acfb7b7702197d36cc39ebe94bc Resolution: Fixed Status: Resolved (was: Ready to Commit) Thanks for the patch [~gzh1992n] & review [~jrwest]! > Running into deadlock when do CommitLog initialization > -- > > Key: CASSANDRA-15295 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15295 > Project: Cassandra > Issue Type: Bug > Components: Local/Commit Log >Reporter: Zephyr Guo >Assignee: Zephyr Guo >Priority: Normal > Fix For: 4.0 > > Attachments: image.png, jstack.log, pstack.log, screenshot-1.png, > screenshot-2.png, screenshot-3.png > > > Recently, I found a Cassandra (3.11.4) node stuck in STARTING status for a > long time. > I used jstack to see what happened. The main thread was stuck in > *AbstractCommitLogSegmentManager.awaitAvailableSegment* > !screenshot-1.png! > The strange thing is that the COMMIT-LOG-ALLOCATOR thread's state was RUNNABLE, but it > was not actually running. > !screenshot-2.png! > I then used pstack to troubleshoot and found COMMIT-LOG-ALLOCATOR blocked on > Java class initialization. > !screenshot-3.png! > This is obviously a deadlock. CommitLog waits for a CommitLogSegment when > initializing. At that moment the CommitLog class is not yet initialized, and the > main thread holds the class lock. Then COMMIT-LOG-ALLOCATOR hits an exception while creating a > CommitLogSegment and calls *CommitLog.handleCommitError* (a static > method). COMMIT-LOG-ALLOCATOR blocks on this line because the CommitLog > class is still initializing. > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
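The deadlock described above hinges on the JVM's per-class initialization lock: while one thread is running a class's static initializer, any other thread that touches that class blocks until initialization completes. The sketch below (illustrative names, not Cassandra code) demonstrates that blocking behavior; in CASSANDRA-15295 the cycle closed because CommitLog's initializer was itself waiting on the blocked COMMIT-LOG-ALLOCATOR thread.

```java
// Demonstrates the JVM class-initialization lock that underlies this deadlock.
// All names are illustrative; this is not Cassandra code.
class InitLockDemo {
    static class Slow {
        static {
            // Simulate a slow static initializer holding the class-init lock.
            try { Thread.sleep(300); } catch (InterruptedException e) { }
        }
        static void touch() { }
    }

    /** Elapsed millis until both threads have made it past Slow's initialization. */
    public static long blockedMillis() {
        long start = System.nanoTime();
        Thread second = new Thread(Slow::touch);
        second.start();
        Slow.touch();  // whichever thread wins runs the initializer; the other blocks
        try { second.join(); } catch (InterruptedException e) { }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Neither thread proceeds until the ~300 ms initializer has finished.
        System.out.println(blockedMillis() >= 250);  // true
    }
}
```

If the initializer additionally waited on the blocked thread (as CommitLog's did, via awaitAvailableSegment), neither side could ever make progress, which is exactly the hang seen in the jstack/pstack output.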
[jira] [Commented] (CASSANDRA-15447) in-jvm dtest support for subnets doesn't change seed provider subnet
[ https://issues.apache.org/jira/browse/CASSANDRA-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16994933#comment-16994933 ] David Capwell commented on CASSANDRA-15447: --- I modified org.apache.cassandra.distributed.test.DistributedReadWritePathTest#pagingTests (it uses a subnet) without your patch, and it failed in the way the Jira description says; I then partially applied your patch and it worked. The PR on trunk LGTM; +1 > in-jvm dtest support for subnets doesn't change seed provider subnet > > > Key: CASSANDRA-15447 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15447 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Doug Rohrer >Assignee: Doug Rohrer >Priority: Normal > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > When using the `withSubnet` function on AbstractCluster.Builder, the > newly-selected subnet is never used when setting up the SeedProvider in the > constructor of InstanceConfig, which is hard-coded to 127.0.0.1. Because of > this, clusters with any subnet other than 0, and gossip enabled, cannot start > up as they have no seed provider in their subnet and what should be the seed > (instance 1) doesn't think it is the seed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
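The fix amounts to deriving the seed address from the selected subnet instead of hard-coding 127.0.0.1. A minimal sketch, assuming the in-jvm dtest loopback scheme of 127.0.&lt;subnet&gt;.&lt;instance&gt; addresses (the class and method names here are hypothetical, not the actual InstanceConfig API):

```java
// Sketch of deriving the seed address from the chosen subnet.
// SeedAddress is a hypothetical helper, not part of the in-jvm dtest API;
// the 127.0.<subnet>.<instance> scheme is an assumption about the address layout.
class SeedAddress {
    /** Instance 1 of the given subnet acts as the seed. */
    static String seedFor(int subnet) {
        return "127.0." + subnet + ".1";
    }

    public static void main(String[] args) {
        System.out.println(seedFor(0));  // 127.0.0.1
        System.out.println(seedFor(3));  // 127.0.3.1
    }
}
```

With the hard-coded value, a cluster built on subnet 3 would list 127.0.0.1 as its seed, so no instance in the cluster (including instance 1 at 127.0.3.1) would believe it is the seed, matching the startup failure described above.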
[cassandra] branch trunk updated: Avoid deadlock during CommitLog initialization
This is an automated email from the ASF dual-hosted git repository. djoshi pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git The following commit(s) were added to refs/heads/trunk by this push: new 3a8300e Avoid deadlock during CommitLog initialization 3a8300e is described below commit 3a8300e0b86c4acfb7b7702197d36cc39ebe94bc Author: Zephyr Guo AuthorDate: Fri Oct 18 17:15:20 2019 -0700 Avoid deadlock during CommitLog initialization patch by Zephyr Guo, Dinesh Joshi; reviewed by Jordan West and Dinesh Joshi for CASSANDRA-15295 Co-Authored-By: Zephyr Guo Co-Authored-By: Dinesh Joshi --- .../cassandra/config/DatabaseDescriptor.java | 18 .../commitlog/AbstractCommitLogSegmentManager.java | 10 +- .../db/commitlog/AbstractCommitLogService.java | 7 +- .../apache/cassandra/db/commitlog/CommitLog.java | 56 --- .../apache/cassandra/service/CassandraDaemon.java | 2 + .../cassandra/utils/JVMStabilityInspector.java | 20 +++- .../cassandra/distributed/impl/Instance.java | 1 + .../CassandraIsolatedJunit4ClassRunner.java| 107 .../config/DatabaseDescriptorRefTest.java | 7 ++ test/unit/org/apache/cassandra/cql3/CQLTester.java | 2 + test/unit/org/apache/cassandra/db/ColumnsTest.java | 2 + .../apache/cassandra/db/SystemKeyspaceTest.java| 2 + .../commitlog/CommitLogInitWithExceptionTest.java | 110 + .../cassandra/db/context/CounterContextTest.java | 2 + .../apache/cassandra/db/lifecycle/HelpersTest.java | 2 + .../apache/cassandra/db/lifecycle/TrackerTest.java | 1 + .../apache/cassandra/db/lifecycle/ViewTest.java| 2 + .../apache/cassandra/dht/PartitionerTestCase.java | 2 + .../apache/cassandra/dht/StreamStateStoreTest.java | 2 + .../apache/cassandra/gms/FailureDetectorTest.java | 2 + .../org/apache/cassandra/gms/GossiperTest.java | 2 + .../org/apache/cassandra/gms/ShadowRoundTest.java | 2 + .../sstable/format/SSTableFlushObserverTest.java | 2 + .../cassandra/locator/AlibabaCloudSnitchTest.java | 2 + 
.../cassandra/locator/CloudstackSnitchTest.java| 2 + .../apache/cassandra/locator/EC2SnitchTest.java| 2 + .../cassandra/locator/GoogleCloudSnitchTest.java | 2 + .../metrics/HintedHandOffMetricsTest.java | 2 + .../org/apache/cassandra/net/ConnectionTest.java | 2 + .../org/apache/cassandra/net/HandshakeTest.java| 2 + .../apache/cassandra/net/MessagingServiceTest.java | 2 + .../net/OutboundConnectionSettingsTest.java| 2 + .../cassandra/net/OutboundConnectionsTest.java | 2 + .../org/apache/cassandra/service/RemoveTest.java | 2 + .../service/StorageServiceServerTest.java | 2 + .../cassandra/transport/IdleDisconnectTest.java| 4 +- .../concurrent/AbstractTransactionalTest.java | 2 + .../apache/cassandra/stress/CompactionStress.java | 2 + 38 files changed, 372 insertions(+), 23 deletions(-) diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java index 02f5a70..3c184bd 100644 --- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java +++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java @@ -51,6 +51,10 @@ import org.apache.cassandra.auth.IRoleManager; import org.apache.cassandra.config.Config.CommitLogSync; import org.apache.cassandra.config.EncryptionOptions.ServerEncryptionOptions.InternodeEncryption; import org.apache.cassandra.db.ConsistencyLevel; +import org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager; +import org.apache.cassandra.db.commitlog.CommitLog; +import org.apache.cassandra.db.commitlog.CommitLogSegmentManagerCDC; +import org.apache.cassandra.db.commitlog.CommitLogSegmentManagerStandard; import org.apache.cassandra.dht.IPartitioner; import org.apache.cassandra.exceptions.ConfigurationException; import org.apache.cassandra.io.FSWriteError; @@ -147,6 +151,10 @@ public class DatabaseDescriptor // turns some warnings into exceptions for testing private static final boolean strictRuntimeChecks = 
Boolean.getBoolean("cassandra.strict.runtime.checks"); +private static Function commitLogSegmentMgrProvider = c -> DatabaseDescriptor.isCDCEnabled() + ? new CommitLogSegmentManagerCDC(c, DatabaseDescriptor.getCommitLogLocation()) + : new CommitLogSegmentManagerStandard(c, DatabaseDescriptor.getCommitLogLocation()); + public static void daemonInitialization() throws ConfigurationException { daemonInitialization(DatabaseDescriptor::loadConfig); @@ -2968,4 +2976,14 @@ public class DatabaseDescriptor logger.info("Setting use_offheap_merkle_trees to {}", value); conf.use_offheap_merkle_trees = value; } + +public static
[jira] [Updated] (CASSANDRA-15447) in-jvm dtest support for subnets doesn't change seed provider subnet
[ https://issues.apache.org/jira/browse/CASSANDRA-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-15447: -- Reviewers: David Capwell, Yifan Cai (was: Yifan Cai) > in-jvm dtest support for subnets doesn't change seed provider subnet > > > Key: CASSANDRA-15447 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15447 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Doug Rohrer >Assignee: Doug Rohrer >Priority: Normal > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > When using the `withSubnet` function on AbstractCluster.Builder, the > newly-selected subnet is never used when setting up the SeedProvider in the > constructor of InstanceConfig, which is hard-coded to 127.0.0.1. Because of > this, clusters with any subnet other than 0, and gossip enabled, cannot start > up as they have no seed provider in their subnet and what should be the seed > (instance 1) doesn't think it is the seed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15447) in-jvm dtest support for subnets doesn't change seed provider subnet
[ https://issues.apache.org/jira/browse/CASSANDRA-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yifan Cai updated CASSANDRA-15447: -- Reviewers: Yifan Cai, Yifan Cai (was: Yifan Cai) Yifan Cai, Yifan Cai Status: Review In Progress (was: Patch Available) > in-jvm dtest support for subnets doesn't change seed provider subnet > > > Key: CASSANDRA-15447 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15447 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Doug Rohrer >Assignee: Doug Rohrer >Priority: Normal > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > When using the `withSubnet` function on AbstractCluster.Builder, the > newly-selected subnet is never used when setting up the SeedProvider in the > constructor of InstanceConfig, which is hard-coded to 127.0.0.1. Because of > this, clusters with any subnet other than 0, and gossip enabled, cannot start > up as they have no seed provider in their subnet and what should be the seed > (instance 1) doesn't think it is the seed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15447) in-jvm dtest support for subnets doesn't change seed provider subnet
[ https://issues.apache.org/jira/browse/CASSANDRA-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16994914#comment-16994914 ] Yifan Cai commented on CASSANDRA-15447: --- Thanks [~drohrer]. The PRs LGTM. One nit: in the PR to trunk, a multi-line statement has 2 parameters on the same line. According to the [code style|http://cassandra.apache.org/doc/latest/development/code_style.html], it should be one per line. > in-jvm dtest support for subnets doesn't change seed provider subnet > > > Key: CASSANDRA-15447 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15447 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Doug Rohrer >Priority: Normal > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > When using the `withSubnet` function on AbstractCluster.Builder, the > newly-selected subnet is never used when setting up the SeedProvider in the > constructor of InstanceConfig, which is hard-coded to 127.0.0.1. Because of > this, clusters with any subnet other than 0, and gossip enabled, cannot start > up as they have no seed provider in their subnet and what should be the seed > (instance 1) doesn't think it is the seed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-15447) in-jvm dtest support for subnets doesn't change seed provider subnet
[ https://issues.apache.org/jira/browse/CASSANDRA-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yifan Cai reassigned CASSANDRA-15447: - Assignee: Doug Rohrer > in-jvm dtest support for subnets doesn't change seed provider subnet > > > Key: CASSANDRA-15447 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15447 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Doug Rohrer >Assignee: Doug Rohrer >Priority: Normal > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > When using the `withSubnet` function on AbstractCluster.Builder, the > newly-selected subnet is never used when setting up the SeedProvider in the > constructor of InstanceConfig, which is hard-coded to 127.0.0.1. Because of > this, clusters with any subnet other than 0, and gossip enabled, cannot start > up as they have no seed provider in their subnet and what should be the seed > (instance 1) doesn't think it is the seed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15446) Per-thread stack size is too small on aarch64 CentOS
[ https://issues.apache.org/jira/browse/CASSANDRA-15446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-15446: -- Platform: Java8,Java11,OpenJDK,Linux,ARM (was: Java8,Java11,OpenJDK,Linux) > Per-thread stack size is too small on aarch64 CentOS > > > Key: CASSANDRA-15446 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15446 > Project: Cassandra > Issue Type: Bug > Components: Local/Config, Local/Startup and Shutdown >Reporter: Heming Fu >Assignee: Heming Fu >Priority: Normal > Fix For: 3.11.5, 2.1.x, 2.2.x, 3.0.x > > > Hi all, > I found an issue when I tried to start Cassandra on my aarch64 CentOS 7.6; > there were no errors on Ubuntu. Of course I could increase -Xss in jvm.options > to fix it, but this issue also means Cassandra's Docker images from Docker > Hub cannot run containers on this OS. > The information about my current environment and the root cause of this issue is > shown below. > *Error* > The stack size specified is too small, Specify at least 328k > Error: Could not create the Java Virtual Machine. > Error: A fatal exception has occurred. Program will exit. > *Version* > Cassandra 2.1.21 2.2.15 3.0.19 3.11.5 > *Environment* > $ lscpu > Architecture: aarch64 > Byte Order: Little Endian > $ uname -m > aarch64 > $ java -version > openjdk version "1.8.0_181" > OpenJDK Runtime Environment (build 1.8.0_181-b13) > OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode) > $ cat /etc/os-release > NAME="CentOS Linux" > VERSION="7 (AltArch)" > ID="centos" > ID_LIKE="rhel fedora" > VERSION_ID="7" > PRETTY_NAME="CentOS Linux 7 (AltArch)" > ANSI_COLOR="0;31" > CPE_NAME="cpe:/o:centos:centos:7" > HOME_URL="https://www.centos.org/" > BUG_REPORT_URL="https://bugs.centos.org/" > *Root Cause* > I checked the openjdk-1.8.0 source code; the min stack size is calculated from > StackYellowPages, StackRedPages, StackShadowPages, and the OS page size. 
Among those > parameters, *the default OS page size on aarch64 CentOS 7.6 is 64K, while on > aarch64 Ubuntu 18.04 and x86 CentOS it is 4K.* > This difference means the JVM on aarch64 Ubuntu 18.04 needs a 164K per-thread > stack size, but 328K is required on aarch64 CentOS 7.6. > The formula is > os::Linux::min_stack_allowed = MAX2(os::Linux::min_stack_allowed, > (size_t)(StackYellowPages+StackRedPages+StackShadowPages) * > Linux::page_size() + > (2*BytesPerWord COMPILER2_PRESENT(+1)) * Linux::vm_default_page_size()); > *Parameters on aarch64 CentOS 7.6* > intx StackRedPages = 1 > intx StackShadowPages = 1 > intx StackYellowPages = 1 > pageSize 64K > BytesPerWord 8 > vm_default_page_size 8K > As a result, we have min_stack_allowed = (1 + 1 + 1) * 64K + (2 * 8 + 1) * 8K > = 328K > > I could see some similar issues filed for specific architectures, but with no root > cause analyzed. I hope this helps you decide a proper stack size for all > common OSes. > If you have any suggestions, please let me know. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
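The reporter's formula can be checked directly. A small sketch reproducing the arithmetic with the parameter values quoted above (3 guard pages total, 64K OS pages, 8 bytes per word, 8K VM default page size):

```java
// Reproduces the HotSpot min-stack arithmetic quoted in the ticket for
// aarch64 CentOS 7.6. The method is a plain transcription of the formula,
// not JVM code.
class MinStack {
    static final int K = 1024;

    /** min_stack_allowed per the quoted formula, in bytes. */
    static int minStackBytes(int guardPages, int pageSize,
                             int bytesPerWord, int vmDefaultPageSize) {
        return guardPages * pageSize
             + (2 * bytesPerWord + 1) * vmDefaultPageSize;
    }

    public static void main(String[] args) {
        // 3 guard pages * 64K + (2*8 + 1) * 8K = 192K + 136K = 328K
        System.out.println(minStackBytes(3, 64 * K, 8, 8 * K) / K);  // 328
    }
}
```

The 64K page size dominates the result, which is why the same JVM defaults that fit in a 180K -Xss on 4K-page systems demand 328K here.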
[jira] [Commented] (CASSANDRA-15448) Throttle the speed of merkletree row hash
[ https://issues.apache.org/jira/browse/CASSANDRA-15448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16994898#comment-16994898 ] David Capwell commented on CASSANDRA-15448: --- Do you have any profiles/GC logs to show where most of the time is being spent? There is some work going on to make repair more efficient: work to remove allocations (less GC), work to speed up interval trees (see CASSANDRA-15397), a change to the hash implementation (see CASSANDRA-15294), etc. With all of those applied, I wonder how it would impact your systems. > Throttle the speed of merkletree row hash > -- > > Key: CASSANDRA-15448 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15448 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Repair, Tool/nodetool >Reporter: maxwellguo >Assignee: maxwellguo >Priority: Normal > > In our environment we may have some low-profile servers, e.g. 4-core/8G > memory, so when calculating Merkle trees during repair the CPU cost can be > very high. We think repair can take a long time anyway, so throttling the speed may > increase repair time but make the server more stable. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
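As a sketch of what the proposed throttle could look like, the hash rate can be bounded with a token bucket consulted once per row. This is a generic illustration under assumed names, not Cassandra's implementation (Cassandra uses Guava's RateLimiter for the analogous compaction-throughput throttle):

```java
// Generic token-bucket throttle, consulted once per hashed row.
// RowHashThrottle is a hypothetical class sketching the ticket's idea,
// not Cassandra code.
class RowHashThrottle {
    private final double rowsPerSecond;
    private double allowance;   // available tokens, capped at one second's worth
    private long lastNanos;

    RowHashThrottle(double rowsPerSecond) {
        this.rowsPerSecond = rowsPerSecond;
        this.allowance = rowsPerSecond;  // allow an initial burst
        this.lastNanos = System.nanoTime();
    }

    /** Blocks until hashing one more row fits within the configured rate. */
    synchronized void acquire() {
        long now = System.nanoTime();
        allowance = Math.min(rowsPerSecond,
                             allowance + (now - lastNanos) / 1e9 * rowsPerSecond);
        lastNanos = now;
        if (allowance < 1.0) {
            long sleepMillis = (long) ((1.0 - allowance) / rowsPerSecond * 1000.0);
            try { Thread.sleep(sleepMillis); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            allowance = 0.0;   // the token that accrued while sleeping is consumed
            lastNanos = System.nanoTime();
        } else {
            allowance -= 1.0;
        }
    }

    public static void main(String[] args) {
        RowHashThrottle throttle = new RowHashThrottle(100.0);  // ~100 rows/sec
        for (int row = 0; row < 250; row++) {
            throttle.acquire();
            // hash the row here
        }
    }
}
```

Spending a few milliseconds sleeping per row trades repair duration for a lower, steadier CPU load, which is exactly the trade-off the ticket asks for on small 4-core/8G nodes.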
[jira] [Updated] (CASSANDRA-15446) Per-thread stack size is too small on aarch64 CentOS
[ https://issues.apache.org/jira/browse/CASSANDRA-15446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-15446: -- Bug Category: Parent values: Code(13163)Level 1 values: Bug - Unclear Impact(13164) Complexity: Low Hanging Fruit Component/s: Local/Startup and Shutdown Discovered By: User Report Platform: Java8,Java11,OpenJDK,Linux (was: OpenJDK,Linux) Severity: Normal Status: Open (was: Triage Needed) > Per-thread stack size is too small on aarch64 CentOS > > > Key: CASSANDRA-15446 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15446 > Project: Cassandra > Issue Type: Bug > Components: Local/Config, Local/Startup and Shutdown >Reporter: Heming Fu >Assignee: Heming Fu >Priority: Normal > Fix For: 2.1.x, 2.2.x, 3.0.x, 3.11.5 > > > Hi all, > I found an issue when I tried to start Cassandra on my aarch64 CentOS 7.6; > there were no errors on Ubuntu. Of course I could increase -Xss in jvm.options > to fix it, but this issue also means Cassandra's Docker images from Docker > Hub cannot run containers on this OS. > The information about my current environment and the root cause of this issue is > shown below. > *Error* > The stack size specified is too small, Specify at least 328k > Error: Could not create the Java Virtual Machine. > Error: A fatal exception has occurred. Program will exit. 
> *Version* > Cassandra 2.1.21 2.2.15 3.0.19 3.11.5 > *Environment* > $ lscpu > Architecture: aarch64 > Byte Order: Little Endian > $ uname -m > aarch64 > $ java -version > openjdk version "1.8.0_181" > OpenJDK Runtime Environment (build 1.8.0_181-b13) > OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode) > $ cat /etc/os-release > NAME="CentOS Linux" > VERSION="7 (AltArch)" > ID="centos" > ID_LIKE="rhel fedora" > VERSION_ID="7" > PRETTY_NAME="CentOS Linux 7 (AltArch)" > ANSI_COLOR="0;31" > CPE_NAME="cpe:/o:centos:centos:7" > HOME_URL="https://www.centos.org/" > BUG_REPORT_URL="https://bugs.centos.org/" > *Root Cause* > I checked the openjdk-1.8.0 source code; the min stack size is calculated from > StackYellowPages, StackRedPages, StackShadowPages, and the OS page size. Among those > parameters, *the default OS page size on aarch64 CentOS 7.6 is 64K, while on > aarch64 Ubuntu 18.04 and x86 CentOS it is 4K.* > This difference means the JVM on aarch64 Ubuntu 18.04 needs a 164K per-thread > stack size, but 328K is required on aarch64 CentOS 7.6. > The formula is > os::Linux::min_stack_allowed = MAX2(os::Linux::min_stack_allowed, > (size_t)(StackYellowPages+StackRedPages+StackShadowPages) * > Linux::page_size() + > (2*BytesPerWord COMPILER2_PRESENT(+1)) * Linux::vm_default_page_size()); > *Parameters on aarch64 CentOS 7.6* > intx StackRedPages = 1 > intx StackShadowPages = 1 > intx StackYellowPages = 1 > pageSize 64K > BytesPerWord 8 > vm_default_page_size 8K > As a result, we have min_stack_allowed = (1 + 1 + 1) * 64K + (2 * 8 + 1) * 8K > = 328K > > I could see some similar issues filed for specific architectures, but with no root > cause analyzed. I hope this helps you decide a proper stack size for all > common OSes. > If you have any suggestions, please let me know. 
> -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15447) in-jvm dtest support for subnets doesn't change seed provider subnet
[ https://issues.apache.org/jira/browse/CASSANDRA-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16994788#comment-16994788 ] Doug Rohrer commented on CASSANDRA-15447: - PRs for all four active branches are now available. The patch is essentially identical for all four and requires very few changes. Separately, it appears there's something causing FailingRepairTest to fail when run with other dtests - this is true on trunk w/o my changes, but I'll take a look and see if I can figure out what's bleeding over from some previous test that's causing it. > in-jvm dtest support for subnets doesn't change seed provider subnet > > > Key: CASSANDRA-15447 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15447 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Doug Rohrer >Priority: Normal > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > When using the `withSubnet` function on AbstractCluster.Builder, the > newly-selected subnet is never used when setting up the SeedProvider in the > constructor of InstanceConfig, which is hard-coded to 127.0.0.1. Because of > this, clusters with any subnet other than 0, and gossip enabled, cannot start > up as they have no seed provider in their subnet and what should be the seed > (instance 1) doesn't think it is the seed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-13938) Default repair is broken, crashes other nodes participating in repair (in trunk)
[ https://issues.apache.org/jira/browse/CASSANDRA-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aleksey Yeschenko reassigned CASSANDRA-13938:
Assignee: Aleksey Yeschenko (was: Alex Petrov)
> Default repair is broken, crashes other nodes participating in repair (in trunk)
> Key: CASSANDRA-13938
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13938
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Repair
> Reporter: Nate McCall
> Assignee: Aleksey Yeschenko
> Priority: Urgent
> Fix For: 4.0-alpha
> Attachments: 13938.yaml, test.sh
>
> Running through a simple scenario to test some of the new repair features, I was not able to make a repair command work. Further, the exception seemed to trigger a nasty failure state that basically shuts down the netty connections for messaging *and* CQL on the nodes transferring data back to the node being repaired. The following steps reproduce this issue consistently.
> Cassandra stress profile (probably not necessary, but this one provides a really simple schema and consistent data shape):
> {noformat}
> keyspace: standard_long
> keyspace_definition: |
>   CREATE KEYSPACE standard_long WITH replication = {'class':'SimpleStrategy', 'replication_factor':3};
> table: test_data
> table_definition: |
>   CREATE TABLE test_data (
>     key text,
>     ts bigint,
>     val text,
>     PRIMARY KEY (key, ts)
>   ) WITH COMPACT STORAGE AND
>     CLUSTERING ORDER BY (ts DESC) AND
>     bloom_filter_fp_chance=0.01 AND
>     caching={'keys':'ALL', 'rows_per_partition':'NONE'} AND
>     comment='' AND
>     dclocal_read_repair_chance=0.00 AND
>     gc_grace_seconds=864000 AND
>     read_repair_chance=0.00 AND
>     compaction={'class': 'SizeTieredCompactionStrategy'} AND
>     compression={'sstable_compression': 'LZ4Compressor'};
> columnspec:
>   - name: key
>     population: uniform(1..5000) # 50 million records available
>   - name: ts
>     cluster: gaussian(1..50) # Up to 50 inserts per record
>   - name: val
>     population: gaussian(128..1024) # varying size of value data
> insert:
>   partitions: fixed(1) # only one insert per batch for individual partitions
>   select: fixed(1)/1 # each insert comes in one at a time
>   batchtype: UNLOGGED
> queries:
>   single:
>     cql: select * from test_data where key = ? and ts = ? limit 1;
>   series:
>     cql: select key,ts,val from test_data where key = ? limit 10;
> {noformat}
> The commands to build and run:
> {noformat}
> ccm create 4_0_test -v git:trunk -n 3 -s
> ccm stress user profile=./histo-test-schema.yml ops\(insert=20,single=1,series=1\) duration=15s -rate threads=4
> # flush the memtable just to get everything on disk
> ccm node1 nodetool flush
> ccm node2 nodetool flush
> ccm node3 nodetool flush
> # disable hints for nodes 2 and 3
> ccm node2 nodetool disablehandoff
> ccm node3 nodetool disablehandoff
> # stop node1
> ccm node1 stop
> ccm stress user profile=./histo-test-schema.yml ops\(insert=20,single=1,series=1\) duration=45s -rate threads=4
> # wait 10 seconds
> ccm node1 start
> # Note that we are local to ccm's nodetool install 'cause repair preview is not reported yet
> node1/bin/nodetool repair --preview
> node1/bin/nodetool repair standard_long test_data
> {noformat}
> The error outputs from the last repair command follow. First, this is stdout from node1:
> {noformat}
> $ node1/bin/nodetool repair standard_long test_data
> objc[47876]: Class JavaLaunchHelper is implemented in both /Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/bin/java (0x10274d4c0) and /Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/libinstrument.dylib (0x1047b64e0). One of the two will be used. Which one is undefined.
> [2017-10-05 14:31:52,425] Starting repair command #4 (7e1a9150-a98e-11e7-ad86-cbd2801b8de2), repairing keyspace standard_long with repair options (parallelism: parallel, primary range: false, incremental: true, job threads: 1, ColumnFamilies: [test_data], dataCenters: [], hosts: [], previewKind: NONE, # of ranges: 3, pull repair: false, force repair: false)
> [2017-10-05 14:32:07,045] Repair session 7e2e8e80-a98e-11e7-ad86-cbd2801b8de2 for range [(3074457345618258602,-9223372036854775808], (-9223372036854775808,-3074457345618258603], (-3074457345618258603,3074457345618258602]] failed with error Stream failed
> [2017-10-05 14:32:07,048] null
> [2017-10-05 14:32:07,050] Repair command #4 finished in 14 seconds
> error: Repair job has failed with the error message: [2017-10-05 14:32:07,048] null
> -- StackTrace --
> java.lang.RuntimeException: Repair job has failed with the error message: [2017-10-05