[jira] [Commented] (CASSANDRA-12704) snapshot build never be able to publish to mvn artifactory
[ https://issues.apache.org/jira/browse/CASSANDRA-12704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629803#comment-16629803 ] Stephen Connolly commented on CASSANDRA-12704: -- IIRC it was hard enough getting interest to publish releases... but that was back in the mists of time > snapshot build never be able to publish to mvn artifactory > -- > > Key: CASSANDRA-12704 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12704 > Project: Cassandra > Issue Type: Bug > Components: Packaging >Reporter: Jay Zhuang >Assignee: Jay Zhuang >Priority: Minor > Attachments: 12704-trunk.txt > > > {code} > $ ant publish > {code} > works fine when property "release" is set, which publishes the binaries to > release Artifactory. > But for daily snapshot build, if "release" is set, it won't be snapshot build: > https://github.com/apache/cassandra/blob/cassandra-3.7/build.xml#L74 > if "release" is not set, it doesn't publish to snapshot Artifactory: > https://github.com/apache/cassandra/blob/cassandra-3.7/build.xml#L1888 > I would suggest just removing the "if check" for target "publish". -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies
[ https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629689#comment-16629689 ] sankalp kohli commented on CASSANDRA-12126: --- Why is the end result not correct? Did the second and third operations not succeed because the first one did not finish? Can you combine the example with the earlier comment, please? > CAS Reads Inconsistencies > -- > > Key: CASSANDRA-12126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12126 > Project: Cassandra > Issue Type: Bug > Components: Coordination >Reporter: sankalp kohli >Priority: Major > Labels: LWT > > While looking at the CAS code in Cassandra, I found a potential issue with > CAS Reads. Here is how it can happen with RF=3 > 1) You issue a CAS Write and it fails in the propose phase. A machine replies > true to a propose and saves the commit in accepted filed. The other two > machines B and C does not get to the accept phase. > Current state is that machine A has this commit in paxos table as accepted > but not committed and B and C does not. > 2) Issue a CAS Read and it goes to only B and C. You wont be able to read the > value written in step 1. This step is as if nothing is inflight. > 3) Issue another CAS Read and it goes to A and B. Now we will discover that > there is something inflight from A and will propose and commit it with the > current ballot. Now we can read the value written in step 1 as part of this > CAS read. > If we skip step 3 and instead run step 4, we will never learn about value > written in step 1. > 4. Issue a CAS Write and it involves only B and C. This will succeed and > commit a different value than step 1. Step 1 value will never be seen again > and was never seen before. > If you read the Lamport “paxos made simple” paper and read section 2.3. It > talks about this issue which is how learners can find out if majority of the > acceptors have accepted the proposal. > In step 3, it is correct that we propose the value again since we dont know > if it was accepted by majority of acceptors. When we ask majority of > acceptors, and more than one acceptors but not majority has something in > flight, we have no way of knowing if it is accepted by majority of acceptors. > So this behavior is correct. > However we need to fix step 2, since it caused reads to not be linearizable > with respect to writes and other reads. In this case, we know that majority > of acceptors have no inflight commit which means we have majority that > nothing was accepted by majority. I think we should run a propose step here > with empty commit and that will cause write written in step 1 to not be > visible ever after. > With this fix, we will either see data written in step 1 on next serial read > or will never see it which is what we want. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies
[ https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629685#comment-16629685 ] Jeffrey F. Lukman commented on CASSANDRA-12126: --- To complete our scenario, here is the setup for our Cassandra: We run the scenario with Cassandra-v2.0.15. Here is the schema that we use: * CREATE KEYSPACE test WITH REPLICATION = \{'class': 'SimpleStrategy', 'replication_factor': 3}; * CREATE TABLE tests ( name text PRIMARY KEY, owner text, value_1 text, value_2 text, value_3 text); Here are the queries that we submit: * client request to node X (1st): UPDATE test.tests SET value_1 = 'A' WHERE name = 'testing' IF owner = 'user_1'; * client request to node Y (2nd): UPDATE test.tests SET value_2 = 'B' WHERE name = 'testing' IF value_1 = 'A'; * client request to node Z (3rd): UPDATE test.tests SET value_3 = 'C' WHERE name = 'testing' IF value_1 = 'A'; To confirm, when the bug is manifested, the end result will be: value_1='A', value_2=null, value_3=null [~jjirsa], regarding our tool, at this point, it is not open to the public. > CAS Reads Inconsistencies > -- > > Key: CASSANDRA-12126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12126 > Project: Cassandra > Issue Type: Bug > Components: Coordination >Reporter: sankalp kohli >Priority: Major > Labels: LWT > > While looking at the CAS code in Cassandra, I found a potential issue with > CAS Reads. Here is how it can happen with RF=3 > 1) You issue a CAS Write and it fails in the propose phase. A machine replies > true to a propose and saves the commit in accepted filed. The other two > machines B and C does not get to the accept phase. > Current state is that machine A has this commit in paxos table as accepted > but not committed and B and C does not. > 2) Issue a CAS Read and it goes to only B and C. You wont be able to read the > value written in step 1. This step is as if nothing is inflight. > 3) Issue another CAS Read and it goes to A and B. Now we will discover that > there is something inflight from A and will propose and commit it with the > current ballot. Now we can read the value written in step 1 as part of this > CAS read. > If we skip step 3 and instead run step 4, we will never learn about value > written in step 1. > 4. Issue a CAS Write and it involves only B and C. This will succeed and > commit a different value than step 1. Step 1 value will never be seen again > and was never seen before. > If you read the Lamport “paxos made simple” paper and read section 2.3. It > talks about this issue which is how learners can find out if majority of the > acceptors have accepted the proposal. > In step 3, it is correct that we propose the value again since we dont know > if it was accepted by majority of acceptors. When we ask majority of > acceptors, and more than one acceptors but not majority has something in > flight, we have no way of knowing if it is accepted by majority of acceptors. > So this behavior is correct. > However we need to fix step 2, since it caused reads to not be linearizable > with respect to writes and other reads. In this case, we know that majority > of acceptors have no inflight commit which means we have majority that > nothing was accepted by majority. I think we should run a propose step here > with empty commit and that will cause write written in step 1 to not be > visible ever after. > With this fix, we will either see data written in step 1 on next serial read > or will never see it which is what we want. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
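For concreteness, the three conditional updates above can be driven from the DataStax Java driver (3.x, matching the driver classes quoted elsewhere in this thread) roughly as sketched below. The keyspace, table, and statement text are taken from the comment; the contact point, the initial row owned by 'user_1', and the class name are illustrative assumptions, and in the actual reproduction each request is sent through a different coordinator (node X, Y, Z).

{code:java}
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public class Cassandra12126Repro {
    public static void main(String[] args) {
        // Contact point is an assumption; in the reproduction each request goes to a
        // different coordinator (node X, Y, Z) rather than through one connection.
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {

            // Assumed setup step: the scenario presumes a row named 'testing' owned by 'user_1'.
            session.execute("INSERT INTO test.tests (name, owner) VALUES ('testing', 'user_1')");

            // The three LWT updates quoted in the comment.
            String[] updates = {
                "UPDATE test.tests SET value_1 = 'A' WHERE name = 'testing' IF owner = 'user_1'",
                "UPDATE test.tests SET value_2 = 'B' WHERE name = 'testing' IF value_1 = 'A'",
                "UPDATE test.tests SET value_3 = 'C' WHERE name = 'testing' IF value_1 = 'A'"
            };
            for (String cql : updates) {
                SimpleStatement stmt = new SimpleStatement(cql);
                stmt.setSerialConsistencyLevel(ConsistencyLevel.SERIAL); // Paxos-phase consistency
                ResultSet rs = session.execute(stmt);
                // wasApplied() reports whether the conditional update took effect.
                System.out.println(cql + " -> applied=" + rs.wasApplied());
            }

            // A serial (CAS) read of the partition shows which values are actually visible;
            // in the reported interleaving this returns value_1='A', value_2=null, value_3=null.
            SimpleStatement read = new SimpleStatement("SELECT * FROM test.tests WHERE name = 'testing'");
            read.setConsistencyLevel(ConsistencyLevel.SERIAL);
            System.out.println(session.execute(read).one());
        }
    }
}
{code}

A single-client run like this will not hit the race by itself; the value of the model checker described in the comments is in forcing the prepare/propose message orderings that make the interleaving happen.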
[jira] [Commented] (CASSANDRA-12704) snapshot build never be able to publish to mvn artifactory
[ https://issues.apache.org/jira/browse/CASSANDRA-12704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629667#comment-16629667 ] mck commented on CASSANDRA-12704: - +1 to the patch. Seemed odd when I saw it last week considering https://github.com/apache/cassandra/blame/trunk/build.xml#L100-L101. But maybe [~urandom] or [~stephenc] have input? > snapshot build never be able to publish to mvn artifactory > -- > > Key: CASSANDRA-12704 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12704 > Project: Cassandra > Issue Type: Bug > Components: Packaging >Reporter: Jay Zhuang >Assignee: Jay Zhuang >Priority: Minor > Attachments: 12704-trunk.txt > > > {code} > $ ant publish > {code} > works fine when property "release" is set, which publishes the binaries to > release Artifactory. > But for daily snapshot build, if "release" is set, it won't be snapshot build: > https://github.com/apache/cassandra/blob/cassandra-3.7/build.xml#L74 > if "release" is not set, it doesn't publish to snapshot Artifactory: > https://github.com/apache/cassandra/blob/cassandra-3.7/build.xml#L1888 > I would suggest just removing the "if check" for target "publish". -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-12438) Data inconsistencies with lightweight transactions, serial reads, and rejoining node
[ https://issues.apache.org/jira/browse/CASSANDRA-12438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629632#comment-16629632 ] Jeffrey F. Lukman commented on CASSANDRA-12438: --- Hi [~benedict], Following the bug description, we integrated our model checker with Cassandra-v3.7. We grabbed the code from the github repository. Regarding the schema, here is the initial schema that we prepared before injecting any queries in the model checker path execution: * CREATE KEYSPACE test WITH REPLICATION = \{'class': 'SimpleStrategy', 'replication_factor': 3}; * CREATE TABLE tests (name text PRIMARY KEY, owner text, value_1 text, value_2 text, value_3 text, value_4 text, value_5 text, value_6 text, value_7 text); Regarding the operations/queries, here are the details: * Client Request 1: INSERT INTO test.tests (name, owner, value_1, value_2, value_3, value_4, value_5, value_6, value_7) VALUES ('cass-12438', 'user_1', 'A1', 'B1', 'C1', 'D1', 'E1', 'F1', 'G1') IF NOT EXISTS; * Client Request 2: UPDATE test.tests SET value_1 = 'A2', owner = 'user_2' WHERE name = 'cass-12438' IF owner = 'user_1'; * Client Request 3: UPDATE test.tests SET value_1 = 'A3', owner = 'user_3' WHERE name = 'cass-12438' IF owner = 'user_2'; The messages generated by these queries are the ones that the model checker controls and reorders, so that we end up reproducing this bug. > Data inconsistencies with lightweight transactions, serial reads, and > rejoining node > > > Key: CASSANDRA-12438 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12438 > Project: Cassandra > Issue Type: Bug >Reporter: Steven Schaefer >Priority: Major > > I've run into some issues with data inconsistency in a situation where a > single node is rejoining a 3-node cluster with RF=3. I'm running 3.7. > I have a client system which inserts data into a table with around 7 columns, > named let's say A-F,id, and version. LWTs are used to make the inserts and > updates. > Typically what happens is there's an insert of values id, V_a1, V_b1, ... , > version=1, then another process will pick up rows with for example A=V_a1 and > subsequently update A to V_a2 and version=2. Yet another process will watch > for A=V_a2 to then make a second update to the same column, and set version > to 3, with end result being There's a > secondary index on this A column (there's only a few possible values for A so > not worried about the cardinality issue), though I've reproed with the new > SASI index too. > If one of the nodes is down, there's still 2 alive for quorum so inserts can > still happen. When I bring up the downed node, sometimes I get really weird > state back which ultimately crashes the client system that's talking to > Cassandra. > When reading I always select all the columns, but there is a conditional > where clause that A=V_a2 (e.g. SELECT * FROM table WHERE A=V_a2). This read > is for processing any rows with V_a2, and ultimately updating to V_a3 when > complete. While periodically polling for A=V_a2 it is of course possible for > the poller to to observe the old V_a2 value while the other parts of the > client system process and make the update to V_a3, and that's generally ok > because of the LWTs used for updates, an occassionaly wasted reprocessing run > ins't a big deal, but when reading at serial I always expect to get the > original values for columns that were never updated too. If a paxos update is > in progress then I expect that completed before its value(s) returned. 
But > instead, the read seems to be seeing the partial commit of the LWT, returning > the old V_2a value for the changed column, but no values whatsoever for the > other columns. From the example above, instead of getting , version=3>, or even the older (either of > which I expect and are ok), I get only , so the rest of > the columns end up null, which I never expect. However this isn't persistent, > Cassandra does end up consistent, which I see via sstabledump and cqlsh after > the fact. > In my client system logs I record the insert / updates, and this > inconsistency happens around the same time as the update from V_a2 to V_a3, > hence my comment about Cassandra seeing a partial commit. So that leads me to > suspect that perhaps due to the where clause in my read query for A=V_a2, > perhaps one of the original good nodes already has the new V_a3 value, so it > doesn't return this row for the select query, but the other good node and the > one that was down still have the old value V_a2, so those 2 nodes return what > they have. The one that was down doesn't yet have the original insert, just > the update from V_a1 -> V_a2 (again I suspect, it's not been easy to verify), >
[jira] [Comment Edited] (CASSANDRA-12438) Data inconsistencies with lightweight transactions, serial reads, and rejoining node
[ https://issues.apache.org/jira/browse/CASSANDRA-12438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629434#comment-16629434 ] Jeffrey F. Lukman edited comment on CASSANDRA-12438 at 9/27/18 1:38 AM: Hi all, Our team from UCARE University of Chicago, have been able to reproduce this bug consistently with our model checker. Here are the workload and scenario of the bug: Workload: 3 node-cluster (let's call them node X, Y, and Z), 1 Crash, 1 Reboot events, with 3 client requests (where node X will be the coordinator node for all client requests. Scenario: # Start the 3 nodes and setup the CONSISTENCY = ONE. # Inject client request 1 as described in this bug description: Insert (along with many others) # But before any PREPARE messages have been sent by the node X, node Z has crashed. # Client request 1 is successfully committed in node X and Y. # Reboot node Z. # Inject client request 2 & 3 as described in this bug description: Update (along with others for which A=V_a1) Update (along with many others for which A=V_a2) (**Although Update 3 can also be ignored if we want to simplify the bug scenario) # If only client request-2 that finished, then we expect to see: If the client request-2 and then client request-3 are committed, then we expect to see: The very least possibility is if both client request-2 & -3 failed and they reached timeout, then we expect to see: # But our model checker shows that, if we do a read request to node Z, then we will see: // some fields are null But if we do a read request to node X or Y, then we will get a complete result. (or as expected in step 7) Which means we end up in an inconsistency view among the nodes (X and Y are different from Z). If we run this scenario with CONSISTENCY.ALL we will not see this bug to happen. We are happy to assist you guys to debug this issue. was (Author: jeffreyflukman): Hi all, Our team from UCARE University of Chicago, have been able to reproduce this bug consistently with our model checker. Here are the workload and scenario of the bug: Workload: 3 node-cluster (let's call them node X, Y, and Z), 1 Crash, 1 Reboot events, with 3 client requests (where node X will be the coordinator node for all client requests. Scenario: # Start the 3 nodes and setup the CONSISTENCY = ONE. # Inject client request 1 as described in this bug description: Insert (along with many others) # But before any PREPARE messages have been sent by the node X, node Z has crashed. # Client request 1 is successfully committed in node X and Y. # Reboot node Z. # Inject client request 2 & 3 as described in this bug description: Update (along with others for which A=V_a1) Update (along with many others for which A=V_a2) (**Although Update 3 can also be ignored if we want to simplify the bug scenario) # If client request-2 finished first without being interfered by client request-3, then we expect to see: If the client request-3 interfere client request-2 or is executed before client request-2 for any reason, then we expect to see: # But our model checker shows that, if we do a read request to node Z, then we will see: // some fields are null But if we do a read request to node X or Y, then we will get a complete result. Which means we end up in an inconsistency view among the nodes. If we run this scenario with CONSISTENCY.ALL we will not see this bug to happen. We are happy to assist you guys to debug this issue. 
> Data inconsistencies with lightweight transactions, serial reads, and > rejoining node > > > Key: CASSANDRA-12438 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12438 > Project: Cassandra > Issue Type: Bug >Reporter: Steven Schaefer >Priority: Major > > I've run into some issues with data inconsistency in a situation where a > single node is rejoining a 3-node cluster with RF=3. I'm running 3.7. > I have a client system which inserts data into a table with around 7 columns, > named let's say A-F,id, and version. LWTs are used to make the inserts and > updates. > Typically what happens is there's an insert of values id, V_a1, V_b1, ... , > version=1, then another process will pick up rows with for example A=V_a1 and > subsequently update A to V_a2 and version=2. Yet another process will watch > for A=V_a2 to then make a second update to the same column, and set version > to 3, with end result being There's a > secondary index on this A column (there's only a few possible values for A so > not worried about the cardinality issue), though I've reproed with the new > SASI index too. > If one of the nodes is down, there's still 2 alive for quorum so inserts can > still happen. When I bring up the downed
[jira] [Commented] (CASSANDRA-14702) Cassandra Write failed even when the required nodes to Ack(consistency) are up.
[ https://issues.apache.org/jira/browse/CASSANDRA-14702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629626#comment-16629626 ] Rohit Singh commented on CASSANDRA-14702: - No, only an upgrade was being performed, on one node after the other. > Cassandra Write failed even when the required nodes to Ack(consistency) are > up. > --- > > Key: CASSANDRA-14702 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14702 > Project: Cassandra > Issue Type: Bug >Reporter: Rohit Singh >Priority: Blocker > > Hi, > We have following configuration in our project for cassandra. > Total nodes in Cluster-5 > Replication Factor- 3 > Consistency- LOCAL_QUORUM > We get the writetimeout exception from cassandra even when 2 nodes are up and > why does stack trace says that 3 replica were required when consistency is 2? > Below is the exception we got:- > com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout > during write query at consistency LOCAL_QUORUM (3 replica were required but > only 2 acknowledged the write) > at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:59) > at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:37) > at com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:289) > at com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:269) > at > io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
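The exception in the description surfaces at the client as a WriteTimeoutException, which carries the required/received replica counts quoted in the report. Below is a minimal sketch of issuing a LOCAL_QUORUM write and inspecting those counts with the DataStax Java driver 3.x; the contact point, keyspace, table, and statement are assumptions, since the ticket does not include the actual statements. (One situation in which the required count exceeds the plain quorum of 2 is a pending replica, e.g. during a bootstrap or host replacement, hence the question elsewhere in this thread about adding/removing/replacing hosts.)

{code:java}
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.exceptions.WriteTimeoutException;

public class LocalQuorumWriteExample {
    public static void main(String[] args) {
        // Contact point and schema are assumptions; with RF=3, LOCAL_QUORUM normally needs 2 acks.
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            SimpleStatement write = new SimpleStatement("INSERT INTO ks.tbl (id, val) VALUES (1, 'x')");
            write.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
            try {
                session.execute(write);
            } catch (WriteTimeoutException e) {
                // These getters expose the "3 replica were required but only 2 acknowledged"
                // numbers seen in the stack trace above.
                System.err.printf("required=%d received=%d type=%s%n",
                        e.getRequiredAcknowledgements(),
                        e.getReceivedAcknowledgements(),
                        e.getWriteType());
            }
        }
    }
}
{code}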
[jira] [Comment Edited] (CASSANDRA-12126) CAS Reads Inconsistencies
[ https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629542#comment-16629542 ] Jeff Jirsa edited comment on CASSANDRA-12126 at 9/26/18 11:47 PM: -- [~jeffreyflukman] thanks for this report. Suspect that most of the folks who are interested in this are already cc'd and received an email notification of your response, but explicitly tagging [~benedict] [~iamaleksey] and [~bdeggleston] as people who aren't yet watching it but may be interested. Also, very much interested in the model you mentioned - is that available publicly at this point? was (Author: jjirsa): [~jeffreyflukman] thanks for this report. Suspect that most of the folks who are interested in this are already cc'd and received an email notification of your response, but explicitly tagging [~benedict] [~iamaleksey] and [~bdeggleston] as people who aren't yet watching it but may be interested. > CAS Reads Inconsistencies > -- > > Key: CASSANDRA-12126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12126 > Project: Cassandra > Issue Type: Bug > Components: Coordination >Reporter: sankalp kohli >Priority: Major > Labels: LWT > > While looking at the CAS code in Cassandra, I found a potential issue with > CAS Reads. Here is how it can happen with RF=3 > 1) You issue a CAS Write and it fails in the propose phase. A machine replies > true to a propose and saves the commit in accepted filed. The other two > machines B and C does not get to the accept phase. > Current state is that machine A has this commit in paxos table as accepted > but not committed and B and C does not. > 2) Issue a CAS Read and it goes to only B and C. You wont be able to read the > value written in step 1. This step is as if nothing is inflight. > 3) Issue another CAS Read and it goes to A and B. Now we will discover that > there is something inflight from A and will propose and commit it with the > current ballot. Now we can read the value written in step 1 as part of this > CAS read. > If we skip step 3 and instead run step 4, we will never learn about value > written in step 1. > 4. Issue a CAS Write and it involves only B and C. This will succeed and > commit a different value than step 1. Step 1 value will never be seen again > and was never seen before. > If you read the Lamport “paxos made simple” paper and read section 2.3. It > talks about this issue which is how learners can find out if majority of the > acceptors have accepted the proposal. > In step 3, it is correct that we propose the value again since we dont know > if it was accepted by majority of acceptors. When we ask majority of > acceptors, and more than one acceptors but not majority has something in > flight, we have no way of knowing if it is accepted by majority of acceptors. > So this behavior is correct. > However we need to fix step 2, since it caused reads to not be linearizable > with respect to writes and other reads. In this case, we know that majority > of acceptors have no inflight commit which means we have majority that > nothing was accepted by majority. I think we should run a propose step here > with empty commit and that will cause write written in step 1 to not be > visible ever after. > With this fix, we will either see data written in step 1 on next serial read > or will never see it which is what we want. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies
[ https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629542#comment-16629542 ] Jeff Jirsa commented on CASSANDRA-12126: [~jeffreyflukman] thanks for this report. Suspect that most of the folks who are interested in this are already cc'd and received an email notification of your response, but explicitly tagging [~benedict] [~iamaleksey] and [~bdeggleston] as people who aren't yet watching it but may be interested. > CAS Reads Inconsistencies > -- > > Key: CASSANDRA-12126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12126 > Project: Cassandra > Issue Type: Bug > Components: Coordination >Reporter: sankalp kohli >Priority: Major > Labels: LWT > > While looking at the CAS code in Cassandra, I found a potential issue with > CAS Reads. Here is how it can happen with RF=3 > 1) You issue a CAS Write and it fails in the propose phase. A machine replies > true to a propose and saves the commit in accepted filed. The other two > machines B and C does not get to the accept phase. > Current state is that machine A has this commit in paxos table as accepted > but not committed and B and C does not. > 2) Issue a CAS Read and it goes to only B and C. You wont be able to read the > value written in step 1. This step is as if nothing is inflight. > 3) Issue another CAS Read and it goes to A and B. Now we will discover that > there is something inflight from A and will propose and commit it with the > current ballot. Now we can read the value written in step 1 as part of this > CAS read. > If we skip step 3 and instead run step 4, we will never learn about value > written in step 1. > 4. Issue a CAS Write and it involves only B and C. This will succeed and > commit a different value than step 1. Step 1 value will never be seen again > and was never seen before. > If you read the Lamport “paxos made simple” paper and read section 2.3. It > talks about this issue which is how learners can find out if majority of the > acceptors have accepted the proposal. > In step 3, it is correct that we propose the value again since we dont know > if it was accepted by majority of acceptors. When we ask majority of > acceptors, and more than one acceptors but not majority has something in > flight, we have no way of knowing if it is accepted by majority of acceptors. > So this behavior is correct. > However we need to fix step 2, since it caused reads to not be linearizable > with respect to writes and other reads. In this case, we know that majority > of acceptors have no inflight commit which means we have majority that > nothing was accepted by majority. I think we should run a propose step here > with empty commit and that will cause write written in step 1 to not be > visible ever after. > With this fix, we will either see data written in step 1 on next serial read > or will never see it which is what we want. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14792) skip TestRepair.test_dead_coordinator dtest in 4.0
[ https://issues.apache.org/jira/browse/CASSANDRA-14792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-14792: Resolution: Fixed Status: Resolved (was: Patch Available) committed as f4888c8976c2012e9de3b92dedb0ae1a3c984a4b, thanks > skip TestRepair.test_dead_coordinator dtest in 4.0 > -- > > Key: CASSANDRA-14792 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14792 > Project: Cassandra > Issue Type: Bug >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Minor > Fix For: 4.0 > > > CASSANDRA-14763 changed the coordinator behavior to not cleanup old repair > sessions, so this test doesn't really make sense anymore. We should just skip > it in 4.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
cassandra-dtest git commit: skip TestRepair.test_dead_coordinator dtest in 4.0
Repository: cassandra-dtest
Updated Branches:
  refs/heads/master 96f90eee2 -> f4888c897

skip TestRepair.test_dead_coordinator dtest in 4.0

Project: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/commit/f4888c89
Tree: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/tree/f4888c89
Diff: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/diff/f4888c89

Branch: refs/heads/master
Commit: f4888c8976c2012e9de3b92dedb0ae1a3c984a4b
Parents: 96f90ee
Author: Blake Eggleston
Authored: Tue Sep 25 15:58:14 2018 -0700
Committer: Blake Eggleston
Committed: Wed Sep 26 16:45:39 2018 -0700

--
 repair_tests/repair_test.py | 33 +
 1 file changed, 1 insertion(+), 32 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/f4888c89/repair_tests/repair_test.py
--
diff --git a/repair_tests/repair_test.py b/repair_tests/repair_test.py
index d4a59a8..3264671 100644
--- a/repair_tests/repair_test.py
+++ b/repair_tests/repair_test.py
@@ -1087,35 +1087,7 @@ class TestRepair(BaseRepairTest):
         node2.start(wait_for_binary_proto=True)
         node2.repair()
 
-    def _cancel_open_ir_sessions(self, nodes):
-        cancelled = 0;
-        for node in nodes:
-            stdout = node.nodetool('repair_admin').stdout.strip()
-            if stdout.strip() == 'no sessions':
-                continue
-
-            for line in stdout.split('\n')[1:]:
-                columns = [c.strip() for c in line.split('|')]
-                session = columns[0]
-                coordinator = columns[3]
-                if coordinator == node.address_and_port():
-                    node.nodetool('repair_admin --cancel {}'.format(session))
-                    cancelled += 1
-
-        time.sleep(1)
-
-        # force all of the sstables out of pending repair
-        for node in nodes:
-            node.nodetool('compact')
-
-        for node in nodes:
-            stdout = node.nodetool('repair_admin').stdout.strip()
-            assert stdout.strip() == 'no sessions'
-
-        return cancelled
-
-
-
+    @since('2.1', max_version='4')
     def test_dead_coordinator(self):
         """
         @jira_ticket CASSANDRA-11824
@@ -1151,9 +1123,6 @@ class TestRepair(BaseRepairTest):
             node1.start(wait_for_binary_proto=True, wait_other_notice=True)
             logger.debug("running second repair")
             if cluster.version() >= "2.2":
-                # 4.0+ actually requires manual intervention here (CASSANDRA-14763)
-                if cluster.version() >= '4.0':
-                    assert self._cancel_open_ir_sessions(cluster.nodelist()) == 1
                 node1.repair()
             else:
                 node1.nodetool('repair keyspace1 standard1 -inc -par')

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-12438) Data inconsistencies with lightweight transactions, serial reads, and rejoining node
[ https://issues.apache.org/jira/browse/CASSANDRA-12438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629454#comment-16629454 ] Benedict edited comment on CASSANDRA-12438 at 9/26/18 9:56 PM: --- There's a lot of information here, that I haven't fully parsed, partially because of the pseudo-code (it's helpful to post actual schemas and operations/queries). However, if you are performing a QUORUM read of *just* {{V_a2/3}}, by itself (to any node; X, Y or Z), before querying node Z directly at ONE then it's probable you are encountering CASSANDRA-14593. The best workaround for this would be to always query all of the columns/rows you want to see updated atomically. Never select a subset. You could also patch your Cassandra instance to not persist the results of read-repair. The upcoming 4.0 release will have the ability to disable it for exactly this reason, but this probably won't be released for several months. was (Author: benedict): There's a lot of information here, that I haven't fully parsed, partially because of the pseudo-code (it's helpful to post actual schemas and operations/queries). However, if you are performing a QUORUM read of *just* {{V_a2/3}}, by itself (to any node; X, Y or Z), before querying node Z directly at ONE then it's probable you are encountering CASSANDRA-14593. The best workaround for this would be to always query all of the columns/rows you want to see updated atomically. Never select a subset. You could also patch your Cassandra instance to not persist the results of read-repair. The upcoming 4.0 release will have the ability to disable it for exactly this reason, but this probably won't be released for several months. > Data inconsistencies with lightweight transactions, serial reads, and > rejoining node > > > Key: CASSANDRA-12438 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12438 > Project: Cassandra > Issue Type: Bug >Reporter: Steven Schaefer >Priority: Major > > I've run into some issues with data inconsistency in a situation where a > single node is rejoining a 3-node cluster with RF=3. I'm running 3.7. > I have a client system which inserts data into a table with around 7 columns, > named let's say A-F,id, and version. LWTs are used to make the inserts and > updates. > Typically what happens is there's an insert of values id, V_a1, V_b1, ... , > version=1, then another process will pick up rows with for example A=V_a1 and > subsequently update A to V_a2 and version=2. Yet another process will watch > for A=V_a2 to then make a second update to the same column, and set version > to 3, with end result being There's a > secondary index on this A column (there's only a few possible values for A so > not worried about the cardinality issue), though I've reproed with the new > SASI index too. > If one of the nodes is down, there's still 2 alive for quorum so inserts can > still happen. When I bring up the downed node, sometimes I get really weird > state back which ultimately crashes the client system that's talking to > Cassandra. > When reading I always select all the columns, but there is a conditional > where clause that A=V_a2 (e.g. SELECT * FROM table WHERE A=V_a2). This read > is for processing any rows with V_a2, and ultimately updating to V_a3 when > complete. 
While periodically polling for A=V_a2 it is of course possible for > the poller to to observe the old V_a2 value while the other parts of the > client system process and make the update to V_a3, and that's generally ok > because of the LWTs used for updates, an occassionaly wasted reprocessing run > ins't a big deal, but when reading at serial I always expect to get the > original values for columns that were never updated too. If a paxos update is > in progress then I expect that completed before its value(s) returned. But > instead, the read seems to be seeing the partial commit of the LWT, returning > the old V_2a value for the changed column, but no values whatsoever for the > other columns. From the example above, instead of getting , version=3>, or even the older (either of > which I expect and are ok), I get only , so the rest of > the columns end up null, which I never expect. However this isn't persistent, > Cassandra does end up consistent, which I see via sstabledump and cqlsh after > the fact. > In my client system logs I record the insert / updates, and this > inconsistency happens around the same time as the update from V_a2 to V_a3, > hence my comment about Cassandra seeing a partial commit. So that leads me to > suspect that perhaps due to the where clause in my read query for A=V_a2, > perhaps one of the original good nodes already has the new V
[jira] [Commented] (CASSANDRA-12438) Data inconsistencies with lightweight transactions, serial reads, and rejoining node
[ https://issues.apache.org/jira/browse/CASSANDRA-12438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629454#comment-16629454 ] Benedict commented on CASSANDRA-12438: -- There's a lot of information here, that I haven't fully parsed, partially because of the pseudo-code (it's helpful to post actual schemas and operations/queries). However, if you are performing a QUORUM read of *just* {{V_a2/3}}, by itself (to any node; X, Y or Z), before querying node Z directly at ONE then it's probable you are encountering CASSANDRA-14593. The best workaround for this would be to always query all of the columns/rows you want to see updated atomically. Never select a subset. You could also patch your Cassandra instance to not persist the results of read-repair. The upcoming 4.0 release will have the ability to disable it for exactly this reason, but this probably won't be released for several months. > Data inconsistencies with lightweight transactions, serial reads, and > rejoining node > > > Key: CASSANDRA-12438 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12438 > Project: Cassandra > Issue Type: Bug >Reporter: Steven Schaefer >Priority: Major > > I've run into some issues with data inconsistency in a situation where a > single node is rejoining a 3-node cluster with RF=3. I'm running 3.7. > I have a client system which inserts data into a table with around 7 columns, > named let's say A-F,id, and version. LWTs are used to make the inserts and > updates. > Typically what happens is there's an insert of values id, V_a1, V_b1, ... , > version=1, then another process will pick up rows with for example A=V_a1 and > subsequently update A to V_a2 and version=2. Yet another process will watch > for A=V_a2 to then make a second update to the same column, and set version > to 3, with end result being There's a > secondary index on this A column (there's only a few possible values for A so > not worried about the cardinality issue), though I've reproed with the new > SASI index too. > If one of the nodes is down, there's still 2 alive for quorum so inserts can > still happen. When I bring up the downed node, sometimes I get really weird > state back which ultimately crashes the client system that's talking to > Cassandra. > When reading I always select all the columns, but there is a conditional > where clause that A=V_a2 (e.g. SELECT * FROM table WHERE A=V_a2). This read > is for processing any rows with V_a2, and ultimately updating to V_a3 when > complete. While periodically polling for A=V_a2 it is of course possible for > the poller to to observe the old V_a2 value while the other parts of the > client system process and make the update to V_a3, and that's generally ok > because of the LWTs used for updates, an occassionaly wasted reprocessing run > ins't a big deal, but when reading at serial I always expect to get the > original values for columns that were never updated too. If a paxos update is > in progress then I expect that completed before its value(s) returned. But > instead, the read seems to be seeing the partial commit of the LWT, returning > the old V_2a value for the changed column, but no values whatsoever for the > other columns. From the example above, instead of getting , version=3>, or even the older (either of > which I expect and are ok), I get only , so the rest of > the columns end up null, which I never expect. However this isn't persistent, > Cassandra does end up consistent, which I see via sstabledump and cqlsh after > the fact. 
> In my client system logs I record the insert / updates, and this > inconsistency happens around the same time as the update from V_a2 to V_a3, > hence my comment about Cassandra seeing a partial commit. So that leads me to > suspect that perhaps due to the where clause in my read query for A=V_a2, > perhaps one of the original good nodes already has the new V_a3 value, so it > doesn't return this row for the select query, but the other good node and the > one that was down still have the old value V_a2, so those 2 nodes return what > they have. The one that was down doesn't yet have the original insert, just > the update from V_a1 -> V_a2 (again I suspect, it's not been easy to verify), > which would explain where comes from, that's all it > knows about. However since it's a serial quorum read, I'd expect some sort of > exception as neither of the remaining 2 nodes with A=V_a2 would be able to > come to a quorum on the values for all the columns, as I'd expect the other > good node to return > I know at some point nodetool repair should be run on this node, but I'm > concerned about a window of time between when the node comes back up and > repair
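In code, the suggested workaround amounts to doing single-partition serial reads that select every column the LWTs update together, rather than a column subset behind a secondary-index predicate. A sketch with the DataStax Java driver 3.x follows; the contact point, keyspace/table name, column names (following the A-F/id/version naming of the report), and the key value are assumptions.

{code:java}
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public class SerialFullRowRead {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            // Select all columns that the LWTs update atomically, never a subset,
            // so the client cannot observe (and read-repair cannot persist) a sliced row.
            SimpleStatement read = new SimpleStatement(
                    "SELECT id, a, b, c, d, e, f, version FROM ks.tbl WHERE id = ?", 42L);
            // SERIAL is a single-partition read that completes any in-flight Paxos round first.
            read.setConsistencyLevel(ConsistencyLevel.SERIAL);
            Row row = session.execute(read).one();
            System.out.println(row);
        }
    }
}
{code}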
[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies
[ https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629455#comment-16629455 ] Jeffrey F. Lukman commented on CASSANDRA-12126: --- Hi all, Our team from UCARE, University of Chicago, has been able to reproduce a similar manifestation of this bug consistently with our model checker. (Our scenario is different from what [~kohlisankalp] proposed.) Here are the workload and scenario of the bug: Workload: 3-node cluster, 3 client requests (but no crash event) Scenario: # Start the 3-node cluster and inject the 3 client requests to 3 different nodes (nodes X, Y, Z). # Node X sends its prepare messages (ballot number=1) to all nodes and all nodes accept it. # Node X sends its propose message to itself, causing its inProgress value to be "X". # Node Y sends its prepare messages (ballot number=2) to all nodes. This also invalidates the rest of node X's propose messages, because their ballot number is smaller than that of node Y's prepare messages. # In our scenario, the prepare response messages from nodes Y and Z arrive before the prepare response message from node X, so node Y does not learn that node X has already accepted value "X" (step 3). # But since the query of client request 2 has a condition, IF value_1='X', node Y does not continue on to sending propose messages to all nodes. Up to this point, none of the queries have been committed to the server. # Node Z now sends its prepare messages to all nodes and all nodes accept it. # In our scenario, node X now returns its response first, which also lets node Z know about its inProgress value "X". From here, node Z will propose and commit client request-1 (with value "X") instead of client request-3. # Therefore, we end up with client request-1 stored on the server, although client request-3 was the one reported as successful. We are ready to assist if any further information is needed. > CAS Reads Inconsistencies > -- > > Key: CASSANDRA-12126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12126 > Project: Cassandra > Issue Type: Bug > Components: Coordination >Reporter: sankalp kohli >Priority: Major > Labels: LWT > > While looking at the CAS code in Cassandra, I found a potential issue with > CAS Reads. Here is how it can happen with RF=3 > 1) You issue a CAS Write and it fails in the propose phase. A machine replies > true to a propose and saves the commit in accepted filed. The other two > machines B and C does not get to the accept phase. > Current state is that machine A has this commit in paxos table as accepted > but not committed and B and C does not. > 2) Issue a CAS Read and it goes to only B and C. You wont be able to read the > value written in step 1. This step is as if nothing is inflight. > 3) Issue another CAS Read and it goes to A and B. Now we will discover that > there is something inflight from A and will propose and commit it with the > current ballot. Now we can read the value written in step 1 as part of this > CAS read. > If we skip step 3 and instead run step 4, we will never learn about value > written in step 1. > 4. Issue a CAS Write and it involves only B and C. This will succeed and > commit a different value than step 1. Step 1 value will never be seen again > and was never seen before. > If you read the Lamport “paxos made simple” paper and read section 2.3. It > talks about this issue which is how learners can find out if majority of the > acceptors have accepted the proposal. 
> In step 3, it is correct that we propose the value again since we dont know > if it was accepted by majority of acceptors. When we ask majority of > acceptors, and more than one acceptors but not majority has something in > flight, we have no way of knowing if it is accepted by majority of acceptors. > So this behavior is correct. > However we need to fix step 2, since it caused reads to not be linearizable > with respect to writes and other reads. In this case, we know that majority > of acceptors have no inflight commit which means we have majority that > nothing was accepted by majority. I think we should run a propose step here > with empty commit and that will cause write written in step 1 to not be > visible ever after. > With this fix, we will either see data written in step 1 on next serial read > or will never see it which is what we want. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14702) Cassandra Write failed even when the required nodes to Ack(consistency) are up.
[ https://issues.apache.org/jira/browse/CASSANDRA-14702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629438#comment-16629438 ] Jeff Jirsa commented on CASSANDRA-14702: Were you adding/removing/replacing hosts at this time? > Cassandra Write failed even when the required nodes to Ack(consistency) are > up. > --- > > Key: CASSANDRA-14702 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14702 > Project: Cassandra > Issue Type: Bug >Reporter: Rohit Singh >Priority: Blocker > > Hi, > We have following configuration in our project for cassandra. > Total nodes in Cluster-5 > Replication Factor- 3 > Consistency- LOCAL_QUORUM > We get the writetimeout exception from cassandra even when 2 nodes are up and > why does stack trace says that 3 replica were required when consistency is 2? > Below is the exception we got:- > com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout > during write query at consistency LOCAL_QUORUM (3 replica were required but > only 2 acknowledged the write) > at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:59) > at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:37) > at com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:289) > at com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:269) > at > io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14444) Got NPE when querying Cassandra 3.11.2
[ https://issues.apache.org/jira/browse/CASSANDRA-1?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-1: --- Fix Version/s: 3.11.x > Got NPE when querying Cassandra 3.11.2 > -- > > Key: CASSANDRA-1 > URL: https://issues.apache.org/jira/browse/CASSANDRA-1 > Project: Cassandra > Issue Type: Bug > Components: CQL > Environment: Ubuntu 14.04, JDK 1.8.0_171. > Cassandra 3.11.2 >Reporter: Xiaodong Xie >Assignee: Xiaodong Xie >Priority: Blocker > Labels: pull-request-available > Fix For: 3.11.x > > Time Spent: 10m > Remaining Estimate: 0h > > We just upgraded our Cassandra cluster from 2.2.6 to 3.11.2 > After upgrading, we immediately got exceptions in Cassandra like this one: > > {code} > ERROR [Native-Transport-Requests-1] 2018-05-11 17:10:21,994 > QueryMessage.java:129 - Unexpected error during query > java.lang.NullPointerException: null > at > org.apache.cassandra.dht.RandomPartitioner.getToken(RandomPartitioner.java:248) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.dht.RandomPartitioner.decorateKey(RandomPartitioner.java:92) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at org.apache.cassandra.config.CFMetaData.decorateKey(CFMetaData.java:666) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.service.pager.PartitionRangeQueryPager.(PartitionRangeQueryPager.java:44) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.db.PartitionRangeReadCommand.getPager(PartitionRangeReadCommand.java:268) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.cql3.statements.SelectStatement.getPager(SelectStatement.java:475) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:288) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:118) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:224) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:255) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:240) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:517) > [apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410) > [apache-cassandra-3.11.2.jar:3.11.2] > at > io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) > [netty-all-4.0.44.Final.jar:4.0.44.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) > [netty-all-4.0.44.Final.jar:4.0.44.Final] > at > io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) > [netty-all-4.0.44.Final.jar:4.0.44.Final] > at > io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348) > [netty-all-4.0.44.Final.jar:4.0.44.Final] > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > [na:1.8.0_171] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) > 
[apache-cassandra-3.11.2.jar:3.11.2] > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) > [apache-cassandra-3.11.2.jar:3.11.2] > at java.lang.Thread.run(Thread.java:748) [na:1.8.0_171] > {code} > > The table schema is like: > {code} > CREATE TABLE example.example_table ( > id bigint, > hash text, > json text, > PRIMARY KEY (id, hash) > ) WITH COMPACT STORAGE > {code} > > The query is something like: > {code} > "select * from example.example_table;" // (We do know this is bad practise, > and we are trying to fix that right now) > {code} > with fetch-size as 200, using DataStax Java driver. > This table contains about 20k rows. > > Actually, the fix is quite simple, > > {code} > --- a/src/java/org/apache/cassandra/service/pager/PagingState.java > +++ b/src/java/org/apache/cassandra/service/pager/PagingState.java > @@ -46,7 +46,7 @@ public class PagingState > public PagingState(ByteBuffer partitionKey, RowMark rowMark, int remaining, > int remainingInPartition) > { > - this.partitionKey = partitionKey; > + this.partitionKey = partitionKey == null ? ByteBufferUtil.EMPTY_BYTE_BUFFER > : partitionKey; > this.rowMark = rowMark; > this.r
[jira] [Assigned] (CASSANDRA-14444) Got NPE when querying Cassandra 3.11.2
[ https://issues.apache.org/jira/browse/CASSANDRA-1?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa reassigned CASSANDRA-1: -- Assignee: Xiaodong Xie > Got NPE when querying Cassandra 3.11.2 > -- > > Key: CASSANDRA-1 > URL: https://issues.apache.org/jira/browse/CASSANDRA-1 > Project: Cassandra > Issue Type: Bug > Components: CQL > Environment: Ubuntu 14.04, JDK 1.8.0_171. > Cassandra 3.11.2 >Reporter: Xiaodong Xie >Assignee: Xiaodong Xie >Priority: Blocker > Labels: pull-request-available > Fix For: 3.11.x > > Time Spent: 10m > Remaining Estimate: 0h > > We just upgraded our Cassandra cluster from 2.2.6 to 3.11.2 > After upgrading, we immediately got exceptions in Cassandra like this one: > > {code} > ERROR [Native-Transport-Requests-1] 2018-05-11 17:10:21,994 > QueryMessage.java:129 - Unexpected error during query > java.lang.NullPointerException: null > at > org.apache.cassandra.dht.RandomPartitioner.getToken(RandomPartitioner.java:248) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.dht.RandomPartitioner.decorateKey(RandomPartitioner.java:92) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at org.apache.cassandra.config.CFMetaData.decorateKey(CFMetaData.java:666) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.service.pager.PartitionRangeQueryPager.(PartitionRangeQueryPager.java:44) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.db.PartitionRangeReadCommand.getPager(PartitionRangeReadCommand.java:268) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.cql3.statements.SelectStatement.getPager(SelectStatement.java:475) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:288) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:118) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:224) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:255) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:240) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:517) > [apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410) > [apache-cassandra-3.11.2.jar:3.11.2] > at > io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) > [netty-all-4.0.44.Final.jar:4.0.44.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) > [netty-all-4.0.44.Final.jar:4.0.44.Final] > at > io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) > [netty-all-4.0.44.Final.jar:4.0.44.Final] > at > io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348) > [netty-all-4.0.44.Final.jar:4.0.44.Final] > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > [na:1.8.0_171] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) > 
[apache-cassandra-3.11.2.jar:3.11.2] > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) > [apache-cassandra-3.11.2.jar:3.11.2] > at java.lang.Thread.run(Thread.java:748) [na:1.8.0_171] > {code} > > The table schema is like: > {code} > CREATE TABLE example.example_table ( > id bigint, > hash text, > json text, > PRIMARY KEY (id, hash) > ) WITH COMPACT STORAGE > {code} > > The query is something like: > {code} > "select * from example.example_table;" // (We do know this is bad practise, > and we are trying to fix that right now) > {code} > with fetch-size as 200, using DataStax Java driver. > This table contains about 20k rows. > > Actually, the fix is quite simple, > > {code} > --- a/src/java/org/apache/cassandra/service/pager/PagingState.java > +++ b/src/java/org/apache/cassandra/service/pager/PagingState.java > @@ -46,7 +46,7 @@ public class PagingState > public PagingState(ByteBuffer partitionKey, RowMark rowMark, int remaining, > int remainingInPartition) > { > - this.partitionKey = partitionKey; > + this.partitionKey = partitionKey == null ? ByteBufferUtil.EMPTY_BYTE_BUFFER > : partitionKey; > this.rowMark = rowMark; >
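The failing query in the description is a paged full-table scan issued through the DataStax Java driver with a fetch size of 200; it is the paging-state round trip on the server side (the PagingState and PartitionRangeQueryPager constructors in the stack trace) that hits the NPE. A minimal sketch of such a scan is below; the contact point is an assumption, while the table name and fetch size come from the report.

{code:java}
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public class PagedFullScan {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            SimpleStatement scan = new SimpleStatement("SELECT * FROM example.example_table");
            // The driver pages through the ~20k rows 200 at a time, sending the
            // server-provided paging state back with each subsequent page request.
            scan.setFetchSize(200);
            for (Row row : session.execute(scan)) {
                // Iterating the ResultSet fetches further pages transparently.
                System.out.println(row.getLong("id") + " " + row.getString("hash"));
            }
        }
    }
}
{code}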
[jira] [Commented] (CASSANDRA-12438) Data inconsistencies with lightweight transactions, serial reads, and rejoining node
[ https://issues.apache.org/jira/browse/CASSANDRA-12438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629434#comment-16629434 ] Jeffrey F. Lukman commented on CASSANDRA-12438: --- Hi all, Our team from UCARE, University of Chicago, has been able to reproduce this bug consistently with our model checker. Here are the workload and scenario of the bug: Workload: 3-node cluster (let's call the nodes X, Y, and Z), 1 crash and 1 reboot event, with 3 client requests (node X is the coordinator for all client requests). Scenario: # Start the 3 nodes and set CONSISTENCY = ONE. # Inject client request 1 as described in this bug description: Insert (along with many others) # But before node X has sent any PREPARE messages, node Z crashes. # Client request 1 is successfully committed on nodes X and Y. # Reboot node Z. # Inject client requests 2 & 3 as described in this bug description: Update (along with others for which A=V_a1) Update (along with many others for which A=V_a2) (Update 3 can also be ignored if we want to simplify the bug scenario.) # If client request-2 finishes first without interference from client request-3, then we expect to see: If client request-3 interferes with client request-2 or is executed before client request-2 for any reason, then we expect to see: # But our model checker shows that, if we do a read request to node Z, we will see: // some fields are null But if we do a read request to node X or Y, we will get a complete result. This means we end up with an inconsistent view among the nodes. If we run this scenario with CONSISTENCY.ALL we do not see this bug happen. We are happy to assist you in debugging this issue. > Data inconsistencies with lightweight transactions, serial reads, and > rejoining node > > > Key: CASSANDRA-12438 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12438 > Project: Cassandra > Issue Type: Bug >Reporter: Steven Schaefer >Priority: Major > > I've run into some issues with data inconsistency in a situation where a > single node is rejoining a 3-node cluster with RF=3. I'm running 3.7. > I have a client system which inserts data into a table with around 7 columns, > named let's say A-F, id, and version. LWTs are used to make the inserts and > updates. > Typically what happens is there's an insert of values id, V_a1, V_b1, ... , > version=1, then another process will pick up rows with for example A=V_a1 and > subsequently update A to V_a2 and version=2. Yet another process will watch > for A=V_a2 to then make a second update to the same column, and set version > to 3, with end result being There's a > secondary index on this A column (there are only a few possible values for A so > I'm not worried about the cardinality issue), though I've reproed with the new > SASI index too. > If one of the nodes is down, there's still 2 alive for quorum so inserts can > still happen. When I bring up the downed node, sometimes I get really weird > state back which ultimately crashes the client system that's talking to > Cassandra. > When reading I always select all the columns, but there is a conditional > where clause that A=V_a2 (e.g. SELECT * FROM table WHERE A=V_a2). This read > is for processing any rows with V_a2, and ultimately updating to V_a3 when > complete. 
While periodically polling for A=V_a2 it is of course possible for > the poller to observe the old V_a2 value while the other parts of the > client system process and make the update to V_a3, and that's generally ok > because of the LWTs used for updates; an occasionally wasted reprocessing run > isn't a big deal, but when reading at serial I always expect to get the > original values for columns that were never updated too. If a paxos update is > in progress then I expect it to complete before its value(s) are returned. But > instead, the read seems to be seeing the partial commit of the LWT, returning > the old V_a2 value for the changed column, but no values whatsoever for the > other columns. From the example above, instead of getting , version=3>, or even the older (either of > which I expect and are ok), I get only , so the rest of > the columns end up null, which I never expect. However this isn't persistent; > Cassandra does end up consistent, which I see via sstabledump and cqlsh after > the fact. > In my client system logs I record the inserts / updates, and this > inconsistency happens around the same time as the update from V_a2 to V_a3, > hence my comment about Cassandra seeing a partial commit. So that leads me to > suspect that perhaps due to the where clause in my read query for A=V_a2, > perhaps one of the original good
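For reference, here is a rough sketch of the insert/update/poll cycle described in this report, written against the DataStax Java driver the reporter mentions; the keyspace, table and column names are invented for illustration, and consistency/serial-consistency settings are omitted:

{code:java}
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

// Hypothetical reconstruction of the reported client workflow.
public class LwtWorkflowSketch
{
    public static void main(String[] args)
    {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("ks"))
        {
            // initial insert, version=1
            session.execute("INSERT INTO t (id, a, version) VALUES (1, 'V_a1', 1) IF NOT EXISTS");
            // a poller advances matching rows with an LWT, version=2
            session.execute("UPDATE t SET a = 'V_a2', version = 2 WHERE id = 1 IF a = 'V_a1'");
            // the read at issue: poll for rows to process next (uses the secondary index on a)
            session.execute("SELECT * FROM t WHERE a = 'V_a2'");
        }
    }
}
{code}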
[jira] [Commented] (CASSANDRA-14727) Transient Replication: EACH_QUORUM not implemented
[ https://issues.apache.org/jira/browse/CASSANDRA-14727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629192#comment-16629192 ] Alex Petrov commented on CASSANDRA-14727: - +1, thank you for expanding comments, too! I've left a couple of minor notes here: https://github.com/apache/cassandra/pull/275/files > Transient Replication: EACH_QUORUM not implemented > -- > > Key: CASSANDRA-14727 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14727 > Project: Cassandra > Issue Type: Improvement >Reporter: Benedict >Assignee: Benedict >Priority: Major > Fix For: 4.0 > > > Transient replication cannot presently handle EACH_QUORUM consistency; reads > and writes should currently fail, though without good error messages. Not > clear if this is acceptable for GA, since we cannot impose this limitation at > Keyspace declaration time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-14792) skip TestRepair.test_dead_coordinator dtest in 4.0
[ https://issues.apache.org/jira/browse/CASSANDRA-14792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629176#comment-16629176 ] Alex Petrov edited comment on CASSANDRA-14792 at 9/26/18 5:34 PM: -- +1, thank you for fixing this one! For completeness, the test was failing with {code} java.lang.RuntimeException: java.lang.IllegalArgumentException: Invalid state transition FINALIZED -> FAILED at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:214) at org.apache.cassandra.net.MessageDeliveryTask.process(MessageDeliveryTask.java:92) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:54) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.lang.Thread.run(Thread.java:748) {code} This happens only once in a while, since a completed task races with the cancel. We could improve the failure message on cancellation if we had a distinction between the failed and cancelled states, but that might not be worth it. was (Author: ifesdjeen): +1, thank you for fixing this one! > skip TestRepair.test_dead_coordinator dtest in 4.0 > -- > > Key: CASSANDRA-14792 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14792 > Project: Cassandra > Issue Type: Bug >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Minor > Fix For: 4.0 > > > CASSANDRA-14763 changed the coordinator behavior to not cleanup old repair > sessions, so this test doesn't really make sense anymore. We should just skip > it in 4.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14792) skip TestRepair.test_dead_coordinator dtest in 4.0
[ https://issues.apache.org/jira/browse/CASSANDRA-14792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629176#comment-16629176 ] Alex Petrov commented on CASSANDRA-14792: - +1, thank you for fixing this one! > skip TestRepair.test_dead_coordinator dtest in 4.0 > -- > > Key: CASSANDRA-14792 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14792 > Project: Cassandra > Issue Type: Bug >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Minor > Fix For: 4.0 > > > CASSANDRA-14763 changed the coordinator behavior to not cleanup old repair > sessions, so this test doesn't really make sense anymore. We should just skip > it in 4.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14767) Embedded cassandra not working after jdk10 upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-14767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-14767: --- Priority: Minor (was: Blocker) > Embedded cassandra not working after jdk10 upgrade > -- > > Key: CASSANDRA-14767 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14767 > Project: Cassandra > Issue Type: Bug >Reporter: parthiban >Priority: Minor > > Embedded cassandra not working after jdk10 upgrade. Could some one help me on > this. > Cassandra config: > {{try \{ EmbeddedCassandraServerHelper.startEmbeddedCassandra(); }catch > (Exception e) \{ LOGGER.error(" CommonConfig ", " cluster()::Exception while > creating cluster ", e); System.setProperty("cassandra.config", > "cassandra.yaml"); DatabaseDescriptor.daemonInitialization(); > EmbeddedCassandraServerHelper.startEmbeddedCassandra(); } Cluster cluster = > Cluster.builder() > .addContactPoints(environment.getProperty(TextToClipConstants.CASSANDRA_CONTACT_POINTS)).withPort(Integer.parseInt(environment.getProperty(TextToClipConstants.CASSANDRA_PORT))).build(); > Session session = cluster.connect(); > session.execute(KEYSPACE_CREATION_QUERY); > session.execute(KEYSPACE_ACTIVATE_QUERY); }} > > {{build.gradle}} > {{buildscript \{ ext { springBootVersion = '2.0.1.RELEASE' } repositories \{ > mavenCentral() mavenLocal() } dependencies \{ > classpath("org.springframework.boot:spring-boot-gradle-plugin:${springBootVersion}") > classpath ("com.bmuschko:gradle-docker-plugin:3.2.1") classpath > ("org.sonarsource.scanner.gradle:sonarqube-gradle-plugin:2.5") > classpath("au.com.dius:pact-jvm-provider-gradle_2.12:3.5.13") classpath > ("com.moowork.gradle:gradle-node-plugin:1.2.0") } } plugins \{ //id > "au.com.dius.pact" version "3.5.7" id "com.gorylenko.gradle-git-properties" > version "1.4.17" id "de.undercouch.download" version "3.4.2" } apply plugin: > 'java' apply plugin: 'eclipse' apply plugin: 'org.springframework.boot' apply > plugin: 'io.spring.dependency-management' apply plugin: > 'com.bmuschko.docker-remote-api' apply plugin: 'jacoco' apply plugin: > 'maven-publish' apply plugin: 'org.sonarqube' apply plugin: > 'au.com.dius.pact' apply plugin: 'scala' sourceCompatibility = 1.8 > repositories \{ mavenCentral() maven { url "https://repo.spring.io/milestone"; > } mavenLocal() } ext \{ springCloudVersion = 'Finchley.RELEASE' } pact \{ > serviceProviders { rxorder { publish { pactDirectory = > '/Users/sv/Documents/wag-doc-text2clip/target/pacts' // defaults to > $buildDir/pacts pactBrokerUrl = 'http://localhost:80' version=2.0 } } } } > //start of integration tests changes sourceSets \{ integrationTest { java { > compileClasspath += main.output + test.output runtimeClasspath += main.output > + test.output srcDir file('test/functional-api/java') } resources.srcDir > file('test/functional-api/resources') } } configurations \{ > integrationTestCompile.extendsFrom testCompile > integrationTestRuntime.extendsFrom testRuntime } //end of integration tests > changes dependencies \{ //web (Tomcat, Logging, Rest) compile group: > 'org.springframework.boot', name: 'spring-boot-starter-web' // Redis > //compile group: 'org.springframework.boot', name: > 'spring-boot-starter-data-redis' //Mongo Starter compile group: > 'org.springframework.boot', name:'spring-boot-starter-data-mongodb' // > Configuration processor - To Generate MetaData Files. 
The files are designed > to let developers offer “code completion� as users are working with > application.properties compile group: 'org.springframework.boot', name: > 'spring-boot-configuration-processor' // Actuator - Monitoring compile group: > 'org.springframework.boot', name: 'spring-boot-starter-actuator' //Sleuth - > Tracing compile group: 'org.springframework.cloud', name: > 'spring-cloud-starter-sleuth' //Hystrix - Circuit Breaker compile group: > 'org.springframework.cloud', name: 'spring-cloud-starter-netflix-hystrix' // > Hystrix - Dashboard compile group: 'org.springframework.cloud', name: > 'spring-cloud-starter-netflix-hystrix-dashboard' // Thymeleaf compile group: > 'org.springframework.boot', name: 'spring-boot-starter-thymeleaf' //Voltage > // Device Detection compile group: 'org.springframework.boot', name: > 'spring-boot-starter-data-cassandra', version:'2.0.4.RELEASE' compile group: > 'com.google.guava', name: 'guava', version: '23.2-jre' > compile('com.google.code.gson:gson:2.8.0') compile('org.json:json:20170516') > //Swagger compile group: 'io.springfox', name: 'springfox-swagger2', > version:'2.8.0' compile group: 'io.springfox', name: 'springfox-swagger-ui', > version:'2.8.0' //jkd10 fixes compile group: 'javax.xml.bind',name: > 'jaxb-api', version:'2.3.0' compile group: 'javax.xml.soap', name: > 'javax.xml.s
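Since the escaped JIRA markup above is hard to read, the startup sequence the reporter describes is roughly the following (a sketch only: the cassandra-unit helper package, contact point and port are assumptions, and the keyspace statements are elided as in the report):

{code:java}
import org.apache.cassandra.config.DatabaseDescriptor;
import org.cassandraunit.utils.EmbeddedCassandraServerHelper;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

// Cleaned-up restatement of the reporter's embedded-Cassandra bootstrap.
public class EmbeddedCassandraConfig
{
    public void start() throws Exception
    {
        try
        {
            EmbeddedCassandraServerHelper.startEmbeddedCassandra();
        }
        catch (Exception e)
        {
            // fall back to explicit daemon initialization, as in the report
            System.setProperty("cassandra.config", "cassandra.yaml");
            DatabaseDescriptor.daemonInitialization();
            EmbeddedCassandraServerHelper.startEmbeddedCassandra();
        }

        try (Cluster cluster = Cluster.builder()
                                      .addContactPoint("127.0.0.1") // placeholder for the configured contact points
                                      .withPort(9142)               // placeholder for the configured port
                                      .build();
             Session session = cluster.connect())
        {
            // session.execute(KEYSPACE_CREATION_QUERY);
            // session.execute(KEYSPACE_ACTIVATE_QUERY);
        }
    }
}
{code}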
[jira] [Updated] (CASSANDRA-14786) Attempted to delete non-existing file CommitLog
[ https://issues.apache.org/jira/browse/CASSANDRA-14786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-14786: --- Priority: Critical (was: Blocker) > Attempted to delete non-existing file CommitLog > --- > > Key: CASSANDRA-14786 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14786 > Project: Cassandra > Issue Type: Bug > Environment: RedHat 6.8 x64 > Apache Cassandra 2.2.9 >Reporter: Riccardo Paoli >Priority: Critical > Attachments: Cassandra_2209.zip > > > Hi all, > we are writing here for the first time, so forgive us if we forget something. > At one of our customers we have installed some Genesys applications that use > a Cassandra cluster with 3 nodes. > For several weeks, at regular intervals, the Cassandra processes have been running > into problems and shutting down. In particular, the error reported in the > logs is as follows: > NODE 2 > > {code:java} > ERROR [COMMIT-LOG-ALLOCATOR] 2018-09-21 23:05:48,718 CommitLog.java:488 - > Failed managing commit log segments. Commit disk failure policy is stop; > terminating thread > java.lang.AssertionError: attempted to delete non-existing file > CommitLog-5-1537387998650.log > {code} > > NODE 1 > > {code:java} > ERROR [COMMIT-LOG-ALLOCATOR] 2018-09-22 01:04:53,488 CommitLog.java:488 - > Failed managing commit log segments. Commit disk failure policy is stop; > terminating thread > java.lang.AssertionError: attempted to delete non-existing file > CommitLog-5-1537387930979.log > {code} > > NODE 3 > > {code:java} > ERROR [COMMIT-LOG-ALLOCATOR] 2018-09-22 04:31:56,176 CommitLog.java:488 - > Failed managing commit log segments. Commit disk failure policy is stop; > terminating thread > java.lang.AssertionError: attempted to delete non-existing file > CommitLog-5-1537388059095.log > {code} > > Is it possible to understand the cause? > Attached are the logs of the 22/09 error and the cluster configuration. > Thanks for your availability. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14297) Optional startup delay for peers should wait for count rather than percentage
[ https://issues.apache.org/jira/browse/CASSANDRA-14297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629160#comment-16629160 ] Joseph Lynch commented on CASSANDRA-14297: -- I'm changing this to a bug since I don't think the current interface can be configured correctly by users, and I hope we don't ship 4.0 with the percentage option instead of a count. If someone thinks there are plausible settings of the existing configuration options that users can use, we can change this back to an improvement. > Optional startup delay for peers should wait for count rather than percentage > - > > Key: CASSANDRA-14297 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14297 > Project: Cassandra > Issue Type: Bug > Components: Lifecycle >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Minor > Labels: 4.0-feature-freeze-review-requested, PatchAvailable > > As I commented in CASSANDRA-13993, the current wait for functionality is a > great step in the right direction, but I don't think that the current setting > (70% of nodes in the cluster) is the right configuration option. First I > think this because 70% will not protect against errors as if you wait for 70% > of the cluster you could still very easily have {{UnavailableException}} or > {{ReadTimeoutException}} exceptions. This is because if you have even two > nodes down in different racks in a Cassandra cluster these exceptions are > possible (or with the default {{num_tokens}} setting of 256 it is basically > guaranteed). Second I think this option is not easy for operators to set, the > only setting I could think of that would "just work" is 100%. > I proposed in that ticket instead of having `block_for_peers_percentage` > defaulting to 70%, we instead have `block_for_peers` as a count of nodes that > are allowed to be down before the starting node makes itself available as a > coordinator. Of course, we would still have the timeout to limit startup time > and deal with really extreme situations (whole datacenters down etc). > I started working on a patch for this change [on > github|https://github.com/jasobrown/cassandra/compare/13993...jolynch:13993], > and am happy to finish it up with unit tests and such if someone can > review/commit it (maybe [~aweisberg]?). > I think the short version of my proposal is we replace: > {noformat} > block_for_peers_percentage: > {noformat} > with either > {noformat} > block_for_peers: > {noformat} > or, if we want to do even better imo and enable advanced operators to finely > tune this behavior (while still having good defaults that work for almost > everyone): > {noformat} > block_for_peers_local_dc: > block_for_peers_each_dc: > block_for_peers_all_dcs: > {noformat} > For example if an operator knows that they must be available at > {{LOCAL_QUORUM}} they would set {{block_for_peers_local_dc=1}}, if they use > {{EACH_QUOURM}} they would set {{block_for_peers_local_dc=1}}, if they use > {{QUORUM}} (RF=3, dcs=2) they would set {{block_for_peers_all_dcs=2}}. > Naturally everything would of course have a timeout to prevent startup taking > too long. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
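To make the proposal concrete, the example in the last paragraph of the description would translate into a cassandra.yaml fragment along these lines (the option names are the ones proposed in this ticket, not options that exist today; values assume RF=3):

{noformat}
# stay available at LOCAL_QUORUM: tolerate one down peer in the local DC before serving traffic
block_for_peers_local_dc: 1
# QUORUM across two DCs with RF=3: tolerate two down peers in total
block_for_peers_all_dcs: 2
{noformat}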
[jira] [Updated] (CASSANDRA-14297) Optional startup delay for peers should wait for count rather than percentage
[ https://issues.apache.org/jira/browse/CASSANDRA-14297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-14297: - Issue Type: Bug (was: Improvement) > Optional startup delay for peers should wait for count rather than percentage > - > > Key: CASSANDRA-14297 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14297 > Project: Cassandra > Issue Type: Bug > Components: Lifecycle >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Minor > Labels: 4.0-feature-freeze-review-requested, PatchAvailable > > As I commented in CASSANDRA-13993, the current wait for functionality is a > great step in the right direction, but I don't think that the current setting > (70% of nodes in the cluster) is the right configuration option. First I > think this because 70% will not protect against errors as if you wait for 70% > of the cluster you could still very easily have {{UnavailableException}} or > {{ReadTimeoutException}} exceptions. This is because if you have even two > nodes down in different racks in a Cassandra cluster these exceptions are > possible (or with the default {{num_tokens}} setting of 256 it is basically > guaranteed). Second I think this option is not easy for operators to set, the > only setting I could think of that would "just work" is 100%. > I proposed in that ticket instead of having `block_for_peers_percentage` > defaulting to 70%, we instead have `block_for_peers` as a count of nodes that > are allowed to be down before the starting node makes itself available as a > coordinator. Of course, we would still have the timeout to limit startup time > and deal with really extreme situations (whole datacenters down etc). > I started working on a patch for this change [on > github|https://github.com/jasobrown/cassandra/compare/13993...jolynch:13993], > and am happy to finish it up with unit tests and such if someone can > review/commit it (maybe [~aweisberg]?). > I think the short version of my proposal is we replace: > {noformat} > block_for_peers_percentage: > {noformat} > with either > {noformat} > block_for_peers: > {noformat} > or, if we want to do even better imo and enable advanced operators to finely > tune this behavior (while still having good defaults that work for almost > everyone): > {noformat} > block_for_peers_local_dc: > block_for_peers_each_dc: > block_for_peers_all_dcs: > {noformat} > For example if an operator knows that they must be available at > {{LOCAL_QUORUM}} they would set {{block_for_peers_local_dc=1}}, if they use > {{EACH_QUOURM}} they would set {{block_for_peers_local_dc=1}}, if they use > {{QUORUM}} (RF=3, dcs=2) they would set {{block_for_peers_all_dcs=2}}. > Naturally everything would of course have a timeout to prevent startup taking > too long. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14762) Transient node receives full data requests in dtests
[ https://issues.apache.org/jira/browse/CASSANDRA-14762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629102#comment-16629102 ] Benedict commented on CASSANDRA-14762: -- bq. I'm not sure it is as valuable just because the race is much smaller since it's not O(gossip) it's O(time to switch threads). I was thinking of programmer error more than the race condition, but I agree it's much less impactful. I might rustle it up anyway, while we're here. > Transient node receives full data requests in dtests > > > Key: CASSANDRA-14762 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14762 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Ariel Weisberg >Assignee: Benedict >Priority: Major > Fix For: 4.0 > > > I saw this running them on my laptop with rapid write protection disabled. > Attached is a patch for disabling rapid write protection in the transient > dtests. > {noformat} > .Exception in thread Thread-19: > Traceback (most recent call last): > File > "/usr/local/Cellar/python/3.6.4_4/Frameworks/Python.framework/Versions/3.6/lib/python3.6/threading.py", > line 916, in _bootstrap_inner > self.run() > File > "/Users/aweisberg/repos/cassandra-dtest/venv/src/ccm/ccmlib/cluster.py", line > 180, in run > self.scan_and_report() > File > "/Users/aweisberg/repos/cassandra-dtest/venv/src/ccm/ccmlib/cluster.py", line > 173, in scan_and_report > on_error_call(errordata) > File "/Users/aweisberg/repos/cassandra-dtest/dtest_setup.py", line 137, in > _log_error_handler > pytest.fail("Error details: \n{message}".format(message=message)) > File > "/Users/aweisberg/repos/cassandra-dtest/venv/lib/python3.6/site-packages/_pytest/outcomes.py", > line 96, in fail > raise Failed(msg=msg, pytrace=pytrace) > Failed: Error details: > Errors seen in logs for: node3 > node3: ERROR [ReadStage-1] 2018-09-18 12:28:48,344 > AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread > Thread[ReadStage-1,5,main] > org.apache.cassandra.exceptions.InvalidRequestException: Attempted to serve > transient data request from full node in > org.apache.cassandra.db.ReadCommandVerbHandler@3c55e0ff > at > org.apache.cassandra.db.ReadCommandVerbHandler.validateTransientStatus(ReadCommandVerbHandler.java:104) > at > org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:53) > at > org.apache.cassandra.net.MessageDeliveryTask.process(MessageDeliveryTask.java:92) > at > org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:54) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134) > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:110) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.lang.Thread.run(Thread.java:748) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14791) [utest] tests unable to write system tmp directory
[ https://issues.apache.org/jira/browse/CASSANDRA-14791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629010#comment-16629010 ] Jay Zhuang commented on CASSANDRA-14791: Hi [~mshuler], [~spo...@gmail.com], any idea if there's a permission setting we could set for the Jenkins Job/Slave? > [utest] tests unable to write system tmp directory > -- > > Key: CASSANDRA-14791 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14791 > Project: Cassandra > Issue Type: Task > Components: Testing >Reporter: Jay Zhuang >Priority: Minor > > Some tests are failing from time to time because it cannot write to directory > {{/tmp/}}: > https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-test/lastCompletedBuild/testReport/org.apache.cassandra.streaming.compression/CompressedInputStreamTest/testException/ > {noformat} > java.lang.RuntimeException: java.nio.file.AccessDeniedException: > /tmp/na-1-big-Data.db > at > org.apache.cassandra.io.util.SequentialWriter.openChannel(SequentialWriter.java:119) > at > org.apache.cassandra.io.util.SequentialWriter.(SequentialWriter.java:152) > at > org.apache.cassandra.io.util.SequentialWriter.(SequentialWriter.java:141) > at > org.apache.cassandra.io.compress.CompressedSequentialWriter.(CompressedSequentialWriter.java:82) > at > org.apache.cassandra.streaming.compression.CompressedInputStreamTest.testCompressedReadWith(CompressedInputStreamTest.java:119) > at > org.apache.cassandra.streaming.compression.CompressedInputStreamTest.testException(CompressedInputStreamTest.java:78) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.nio.file.AccessDeniedException: /tmp/na-1-big-Data.db > at > sun.nio.fs.UnixException.translateToIOException(UnixException.java:84) > at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) > at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) > at > sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177) > at java.nio.channels.FileChannel.open(FileChannel.java:287) > at java.nio.channels.FileChannel.open(FileChannel.java:335) > at > org.apache.cassandra.io.util.SequentialWriter.openChannel(SequentialWriter.java:100) > {noformat} > I guess it's because some Jenkins slaves don't have proper permission set. > For slave {{cassandra16}}, the tests are fine: > https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-test/723/testReport/junit/org.apache.cassandra.streaming.compression/CompressedInputStreamTest/testException/history/ -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
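A quick way to check whether a particular build slave can actually create the file this test writes is a standalone probe mirroring the {{FileChannel.open}} call in the stack trace (a hypothetical helper, not part of the test suite):

{code:java}
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import static java.nio.file.StandardOpenOption.CREATE;
import static java.nio.file.StandardOpenOption.WRITE;

// Attempts the same create+write open that SequentialWriter.openChannel performs.
public class TmpWriteCheck
{
    public static void main(String[] args) throws Exception
    {
        Path p = Paths.get("/tmp", "na-1-big-Data.db");
        try (FileChannel ch = FileChannel.open(p, CREATE, WRITE))
        {
            System.out.println("writable: " + p);
        }
        finally
        {
            Files.deleteIfExists(p);
        }
    }
}
{code}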
[jira] [Created] (CASSANDRA-14793) Improve system table handling when losing a disk when using JBOD
Marcus Eriksson created CASSANDRA-14793: --- Summary: Improve system table handling when losing a disk when using JBOD Key: CASSANDRA-14793 URL: https://issues.apache.org/jira/browse/CASSANDRA-14793 Project: Cassandra Issue Type: Bug Reporter: Marcus Eriksson Fix For: 4.0 We should improve the way we handle disk failures when losing a disk in a JBOD setup. One way could be to pin the system tables to a special data directory. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-14762) Transient node receives full data requests in dtests
[ https://issues.apache.org/jira/browse/CASSANDRA-14762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628901#comment-16628901 ] Ariel Weisberg edited comment on CASSANDRA-14762 at 9/26/18 3:08 PM: - bq. Is this simply you reasoning out the rationale for issuing the requests, or have you spotted an issue with the patch that means we are not doing so? Sorry it's just socratic code review. I don't see any problems. +1 Checking locally is nice, but I'm not sure it is as valuable just because the race is much smaller since it's not O(gossip) it's O(time to switch threads). If you want to do it here or in another ticket it's still good to have. was (Author: aweisberg): bq. Is this simply you reasoning out the rationale for issuing the requests, or have you spotted an issue with the patch that means we are not doing so? Sorry it's just socratic code review. I don't see any problems. +1 Checking locally is nice, but I'm not sure it is as valuable just because the race is much smaller since it's not O(gossip) it's O(time to switch threads). > Transient node receives full data requests in dtests > > > Key: CASSANDRA-14762 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14762 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Ariel Weisberg >Assignee: Benedict >Priority: Major > Fix For: 4.0 > > > I saw this running them on my laptop with rapid write protection disabled. > Attached is a patch for disabling rapid write protection in the transient > dtests. > {noformat} > .Exception in thread Thread-19: > Traceback (most recent call last): > File > "/usr/local/Cellar/python/3.6.4_4/Frameworks/Python.framework/Versions/3.6/lib/python3.6/threading.py", > line 916, in _bootstrap_inner > self.run() > File > "/Users/aweisberg/repos/cassandra-dtest/venv/src/ccm/ccmlib/cluster.py", line > 180, in run > self.scan_and_report() > File > "/Users/aweisberg/repos/cassandra-dtest/venv/src/ccm/ccmlib/cluster.py", line > 173, in scan_and_report > on_error_call(errordata) > File "/Users/aweisberg/repos/cassandra-dtest/dtest_setup.py", line 137, in > _log_error_handler > pytest.fail("Error details: \n{message}".format(message=message)) > File > "/Users/aweisberg/repos/cassandra-dtest/venv/lib/python3.6/site-packages/_pytest/outcomes.py", > line 96, in fail > raise Failed(msg=msg, pytrace=pytrace) > Failed: Error details: > Errors seen in logs for: node3 > node3: ERROR [ReadStage-1] 2018-09-18 12:28:48,344 > AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread > Thread[ReadStage-1,5,main] > org.apache.cassandra.exceptions.InvalidRequestException: Attempted to serve > transient data request from full node in > org.apache.cassandra.db.ReadCommandVerbHandler@3c55e0ff > at > org.apache.cassandra.db.ReadCommandVerbHandler.validateTransientStatus(ReadCommandVerbHandler.java:104) > at > org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:53) > at > org.apache.cassandra.net.MessageDeliveryTask.process(MessageDeliveryTask.java:92) > at > org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:54) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134) > at 
org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:110) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.lang.Thread.run(Thread.java:748) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14762) Transient node receives full data requests in dtests
[ https://issues.apache.org/jira/browse/CASSANDRA-14762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628901#comment-16628901 ] Ariel Weisberg commented on CASSANDRA-14762: bq. Is this simply you reasoning out the rationale for issuing the requests, or have you spotted an issue with the patch that means we are not doing so? Sorry it's just socratic code review. I don't see any problems. +1 Checking locally is nice, but I'm not sure it is as valuable just because the race is much smaller since it's not O(gossip) it's O(time to switch threads). > Transient node receives full data requests in dtests > > > Key: CASSANDRA-14762 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14762 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Ariel Weisberg >Assignee: Benedict >Priority: Major > Fix For: 4.0 > > > I saw this running them on my laptop with rapid write protection disabled. > Attached is a patch for disabling rapid write protection in the transient > dtests. > {noformat} > .Exception in thread Thread-19: > Traceback (most recent call last): > File > "/usr/local/Cellar/python/3.6.4_4/Frameworks/Python.framework/Versions/3.6/lib/python3.6/threading.py", > line 916, in _bootstrap_inner > self.run() > File > "/Users/aweisberg/repos/cassandra-dtest/venv/src/ccm/ccmlib/cluster.py", line > 180, in run > self.scan_and_report() > File > "/Users/aweisberg/repos/cassandra-dtest/venv/src/ccm/ccmlib/cluster.py", line > 173, in scan_and_report > on_error_call(errordata) > File "/Users/aweisberg/repos/cassandra-dtest/dtest_setup.py", line 137, in > _log_error_handler > pytest.fail("Error details: \n{message}".format(message=message)) > File > "/Users/aweisberg/repos/cassandra-dtest/venv/lib/python3.6/site-packages/_pytest/outcomes.py", > line 96, in fail > raise Failed(msg=msg, pytrace=pytrace) > Failed: Error details: > Errors seen in logs for: node3 > node3: ERROR [ReadStage-1] 2018-09-18 12:28:48,344 > AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread > Thread[ReadStage-1,5,main] > org.apache.cassandra.exceptions.InvalidRequestException: Attempted to serve > transient data request from full node in > org.apache.cassandra.db.ReadCommandVerbHandler@3c55e0ff > at > org.apache.cassandra.db.ReadCommandVerbHandler.validateTransientStatus(ReadCommandVerbHandler.java:104) > at > org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:53) > at > org.apache.cassandra.net.MessageDeliveryTask.process(MessageDeliveryTask.java:92) > at > org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:54) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134) > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:110) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.lang.Thread.run(Thread.java:748) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-14742) Race Condition in batchlog replica collection
[ https://issues.apache.org/jira/browse/CASSANDRA-14742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628602#comment-16628602 ] Benedict edited comment on CASSANDRA-14742 at 9/26/18 11:20 AM: Patch looks good overall, just a few nits: # Right now, {{ReplicaPlans}} is organised into counter writes, regular writes, regular write utilities, reads, reads utilities; I think it would be cleanest to keep the batch write utilities similarly proximal to the batch writes themselves, for consistency # {{syncWriteBatchedMutations}} and {{forBatchlogWrite}} each accept a {{localDc}} parameter - this seems a bit weird, since it's a global variable, and only ever invoked with this (but also, we obtain it inconsistently, by asking the snitch instead of the cached {{localDc}}. Perhaps they should each just use the latter, without requiring it as a parameter? (I realise this is pre-existing) # Unused imports in {{ReplicaPlans}} was (Author: benedict): Patch looks good overall, just a few nits: # Right now, {{ReplicaPlans}} is organised into counter writes, regular writes, regular write utilities, reads, reads utilities; I think it would be cleanest to keep the batch write utilities similarly proximal to the batch writes themselves, for consistency # {{syncWriteBatchedMutations}} and {{forBatchlogWrite}} each accept a {{localDc}} parameter - this seems a bit weird, since it's a global variable, and only ever invoked with this (but also, we obtain it inconsistently, by asking the snitch instead of the cached {{localDc}}. Perhaps they should each just use the latter, without requiring it as a parameter? # Unused imports in {{ReplicaPlans}} > Race Condition in batchlog replica collection > - > > Key: CASSANDRA-14742 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14742 > Project: Cassandra > Issue Type: Bug >Reporter: Alex Petrov >Assignee: Alex Petrov >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > When we collect nodes for it in {{StorageProxy#getBatchlogReplicas}}, we > already filter out down replicas; subsequently they get picked up and taken > for liveAndDown. > There's a possible race condition due to picking tokens from token metadata > twice (once in {{StorageProxy#getBatchlogReplicas}} and second one in > {{ReplicaPlan#forBatchlogWrite}}) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14742) Race Condition in batchlog replica collection
[ https://issues.apache.org/jira/browse/CASSANDRA-14742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628602#comment-16628602 ] Benedict commented on CASSANDRA-14742: -- Patch looks good overall, just a few nits: # Right now, {{ReplicaPlans}} is organised into counter writes, regular writes, regular write utilities, reads, reads utilities; I think it would be cleanest to keep the batch write utilities similarly proximal to the batch writes themselves, for consistency # {{syncWriteBatchedMutations}} and {{forBatchlogWrite}} each accept a {{localDc}} parameter - this seems a bit weird, since it's a global variable, and only ever invoked with this (but also, we obtain it inconsistently, by asking the snitch instead of the cached {{localDc}}. Perhaps they should each just use the latter, without requiring it as a parameter? # Unused imports in {{ReplicaPlans}} > Race Condition in batchlog replica collection > - > > Key: CASSANDRA-14742 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14742 > Project: Cassandra > Issue Type: Bug >Reporter: Alex Petrov >Assignee: Alex Petrov >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > When we collect nodes for it in {{StorageProxy#getBatchlogReplicas}}, we > already filter out down replicas; subsequently they get picked up and taken > for liveAndDown. > There's a possible race condition due to picking tokens from token metadata > twice (once in {{StorageProxy#getBatchlogReplicas}} and second one in > {{ReplicaPlan#forBatchlogWrite}}) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14770) Introduce RangesAtEndpoint.unwrap to simplify StreamSession.addTransferRanges
[ https://issues.apache.org/jira/browse/CASSANDRA-14770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628519#comment-16628519 ] Benedict commented on CASSANDRA-14770: -- Thanks. Committed as [914c66685c5bebe1624d827a9b4562b73a08c297|https://github.com/apache/cassandra/commit/914c66685c5bebe1624d827a9b4562b73a08c297] > Introduce RangesAtEndpoint.unwrap to simplify StreamSession.addTransferRanges > - > > Key: CASSANDRA-14770 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14770 > Project: Cassandra > Issue Type: Improvement > Components: Streaming and Messaging >Reporter: Benedict >Assignee: Benedict >Priority: Trivial > Fix For: 4.0 > > > Arguably, since this is only performed in one place, we could leave it in > {{addTransferRanges}}, but it should be a helper method anyway, and given > {{unwrap()}} is a feature of {{Range}}, we should implement that in > {{RangesAtEndpoint}} IMO. I have introduced this method, which avoids > allocating a new collection unnecessarily, corroborates we have at most one > wrap-around range, and introduced unit tests for the method. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
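For anyone reading the ticket without the diff at hand, the new helper boils down to the fragment below (simplified; per the description, the committed version in the commit further down also avoids rebuilding the collection when no range actually wraps around):

{code:java}
// Simplified fragment of RangesAtEndpoint.unwrap(): split every wrap-around
// range into its unwrapped sub-ranges, keep each sub-range bound to the same
// replica, and rebuild the collection.
public RangesAtEndpoint unwrap()
{
    RangesAtEndpoint.Builder builder = RangesAtEndpoint.builder(endpoint(), size());
    for (Replica replica : this)
        for (Range<Token> range : replica.range().unwrap())
            builder.add(replica.decorateSubrange(range));
    return builder.build();
}
{code}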
[jira] [Updated] (CASSANDRA-14770) Introduce RangesAtEndpoint.unwrap to simplify StreamSession.addTransferRanges
[ https://issues.apache.org/jira/browse/CASSANDRA-14770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-14770: - Resolution: Fixed Status: Resolved (was: Patch Available) > Introduce RangesAtEndpoint.unwrap to simplify StreamSession.addTransferRanges > - > > Key: CASSANDRA-14770 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14770 > Project: Cassandra > Issue Type: Improvement > Components: Streaming and Messaging >Reporter: Benedict >Assignee: Benedict >Priority: Trivial > Fix For: 4.0 > > > Arguably, since this is only performed in one place, we could leave it in > {{addTransferRanges}}, but it should be a helper method anyway, and given > {{unwrap()}} is a feature of {{Range}}, we should implement that in > {{RangesAtEndpoint}} IMO. I have introduced this method, which avoids > allocating a new collection unnecessarily, corroborates we have at most one > wrap-around range, and introduced unit tests for the method. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
cassandra git commit: Introduce RangesAtEndpoint.unwrap; simplify StreamSession.addTransferRanges
Repository: cassandra Updated Branches: refs/heads/trunk 8554d6b35 -> 914c66685 Introduce RangesAtEndpoint.unwrap; simplify StreamSession.addTransferRanges Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/914c6668 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/914c6668 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/914c6668 Branch: refs/heads/trunk Commit: 914c66685c5bebe1624d827a9b4562b73a08c297 Parents: 8554d6b Author: Benedict Elliott Smith Authored: Tue Sep 18 13:17:15 2018 +0100 Committer: Benedict Elliott Smith Committed: Wed Sep 26 11:12:12 2018 +0100 -- CHANGES.txt | 1 + .../cassandra/locator/RangesAtEndpoint.java | 31 + .../cassandra/streaming/StreamSession.java | 11 + .../locator/ReplicaCollectionTest.java | 46 +++- 4 files changed, 69 insertions(+), 20 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/914c6668/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 9139822..e227c40 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 4.0 + * Introduce RangesAtEndpoint.unwrap to simplify StreamSession.addTransferRanges (CASSANDRA-14770) * LOCAL_QUORUM may speculate to non-local nodes, resulting in Timeout instead of Unavailable (CASSANDRA-14735) * Avoid creating empty compaction tasks after truncate (CASSANDRA-14780) * Fail incremental repair prepare phase if it encounters sstables from un-finalized sessions (CASSANDRA-14763) http://git-wip-us.apache.org/repos/asf/cassandra/blob/914c6668/src/java/org/apache/cassandra/locator/RangesAtEndpoint.java -- diff --git a/src/java/org/apache/cassandra/locator/RangesAtEndpoint.java b/src/java/org/apache/cassandra/locator/RangesAtEndpoint.java index f57c28e..8319d92 100644 --- a/src/java/org/apache/cassandra/locator/RangesAtEndpoint.java +++ b/src/java/org/apache/cassandra/locator/RangesAtEndpoint.java @@ -165,6 +165,37 @@ public class RangesAtEndpoint extends AbstractReplicaCollection range : replica.range().unwrap()) +builder.add(replica.decorateSubrange(range)); +} +return builder.build(); +} + public static Collector collector(InetAddressAndPort endpoint) { return collector(ImmutableSet.of(), () -> new Builder(endpoint)); http://git-wip-us.apache.org/repos/asf/cassandra/blob/914c6668/src/java/org/apache/cassandra/streaming/StreamSession.java -- diff --git a/src/java/org/apache/cassandra/streaming/StreamSession.java b/src/java/org/apache/cassandra/streaming/StreamSession.java index d7d0836..80fcebb 100644 --- a/src/java/org/apache/cassandra/streaming/StreamSession.java +++ b/src/java/org/apache/cassandra/streaming/StreamSession.java @@ -335,15 +335,8 @@ public class StreamSession implements IEndpointStateChangeSubscriber //Was it safe to remove this normalize, sorting seems not to matter, merging? Maybe we should have? //Do we need to unwrap here also or is that just making it worse? 
//Range and if it's transient -RangesAtEndpoint.Builder unwrappedRanges = RangesAtEndpoint.builder(replicas.endpoint(), replicas.size()); -for (Replica replica : replicas) -{ -for (Range unwrapped : replica.range().unwrap()) -{ -unwrappedRanges.add(new Replica(replica.endpoint(), unwrapped, replica.isFull())); -} -} -List streams = getOutgoingStreamsForRanges(unwrappedRanges.build(), stores, pendingRepair, previewKind); +RangesAtEndpoint unwrappedRanges = replicas.unwrap(); +List streams = getOutgoingStreamsForRanges(unwrappedRanges, stores, pendingRepair, previewKind); addTransferStreams(streams); Set> toBeUpdated = transferredRangesPerKeyspace.get(keyspace); if (toBeUpdated == null) http://git-wip-us.apache.org/repos/asf/cassandra/blob/914c6668/test/unit/org/apache/cassandra/locator/ReplicaCollectionTest.java -- diff --git a/test/unit/org/apache/cassandra/locator/ReplicaCollectionTest.java b/test/unit/org/apache/cassandra/locator/ReplicaCollectionTest.java index f937f96..c289d50 100644 --- a/test/unit/org/apache/cassandra/locator/ReplicaCollectionTest.java +++ b/test/unit/org/apache/cassandra/locator/ReplicaCollectionTest.java @@ -33,6 +33,7 @@ import org.junit.Assert; import org.junit.Test; import java.net.UnknownHostException; +import java.util.ArrayList; import java.util.Comparator; import ja
[jira] [Comment Edited] (CASSANDRA-14735) LOCAL_QUORUM may speculate to non-local nodes, resulting in Timeout instead of Unavailable
[ https://issues.apache.org/jira/browse/CASSANDRA-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628496#comment-16628496 ] Benedict edited comment on CASSANDRA-14735 at 9/26/18 9:57 AM: --- Thanks, committed as [8554d6b35dcc5eec46ed7edc809a36c1f7fa588f|https://github.com/apache/cassandra/commit/8554d6b35dcc5eec46ed7edc809a36c1f7fa588f] was (Author: benedict): Thanks, committed > LOCAL_QUORUM may speculate to non-local nodes, resulting in Timeout instead > of Unavailable > -- > > Key: CASSANDRA-14735 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14735 > Project: Cassandra > Issue Type: Bug >Reporter: Benedict >Assignee: Benedict >Priority: Minor > Fix For: 4.0 > > > This issue applies to all of: rapid read protection, read repair's rapid read > protection and read repair's rapid write protection. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14735) LOCAL_QUORUM may speculate to non-local nodes, resulting in Timeout instead of Unavailable
[ https://issues.apache.org/jira/browse/CASSANDRA-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-14735: - Resolution: Fixed Status: Resolved (was: Patch Available) Thanks, committed > LOCAL_QUORUM may speculate to non-local nodes, resulting in Timeout instead > of Unavailable > -- > > Key: CASSANDRA-14735 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14735 > Project: Cassandra > Issue Type: Bug >Reporter: Benedict >Assignee: Benedict >Priority: Minor > Fix For: 4.0 > > > This issue applies to all of: rapid read protection, read repair's rapid read > protection and read repair's rapid write protection. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
cassandra git commit: LOCAL_QUORUM may speculate to non-local nodes, resulting in Timeout instead of Unavailable
Repository: cassandra Updated Branches: refs/heads/trunk 0379201c7 -> 8554d6b35 LOCAL_QUORUM may speculate to non-local nodes, resulting in Timeout instead of Unavailable patch by Benedict; reviewed by Ariel Weisberg for CASSANDRA-14735 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8554d6b3 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8554d6b3 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8554d6b3 Branch: refs/heads/trunk Commit: 8554d6b35dcc5eec46ed7edc809a36c1f7fa588f Parents: 0379201 Author: Benedict Elliott Smith Authored: Thu Sep 20 08:54:55 2018 +0100 Committer: Benedict Elliott Smith Committed: Wed Sep 26 10:55:11 2018 +0100 -- CHANGES.txt | 1 + .../apache/cassandra/db/ConsistencyLevel.java | 233 ++- .../apache/cassandra/locator/InOurDcTester.java | 93 .../apache/cassandra/locator/ReplicaPlan.java | 3 - .../apache/cassandra/locator/ReplicaPlans.java | 193 +-- .../org/apache/cassandra/locator/Replicas.java | 65 +- .../service/DatacenterWriteResponseHandler.java | 7 +- .../apache/cassandra/service/StorageProxy.java | 6 +- .../reads/repair/BlockingPartitionRepair.java | 27 ++- .../reads/repair/BlockingReadRepairTest.java| 10 +- .../DiagEventsBlockingReadRepairTest.java | 23 +- .../service/reads/repair/ReadRepairTest.java| 9 +- 12 files changed, 373 insertions(+), 297 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/8554d6b3/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 9f7958c..9139822 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 4.0 + * LOCAL_QUORUM may speculate to non-local nodes, resulting in Timeout instead of Unavailable (CASSANDRA-14735) * Avoid creating empty compaction tasks after truncate (CASSANDRA-14780) * Fail incremental repair prepare phase if it encounters sstables from un-finalized sessions (CASSANDRA-14763) * Add a check for receiving digest response from transient node (CASSANDRA-14750) http://git-wip-us.apache.org/repos/asf/cassandra/blob/8554d6b3/src/java/org/apache/cassandra/db/ConsistencyLevel.java -- diff --git a/src/java/org/apache/cassandra/db/ConsistencyLevel.java b/src/java/org/apache/cassandra/db/ConsistencyLevel.java index 5a4baf7..9e884a7 100644 --- a/src/java/org/apache/cassandra/db/ConsistencyLevel.java +++ b/src/java/org/apache/cassandra/db/ConsistencyLevel.java @@ -17,26 +17,18 @@ */ package org.apache.cassandra.db; -import java.util.HashMap; -import java.util.Map; -import com.google.common.collect.Iterables; +import com.carrotsearch.hppc.ObjectIntOpenHashMap; import org.apache.cassandra.locator.Endpoints; -import org.apache.cassandra.locator.ReplicaCollection; -import org.apache.cassandra.locator.Replicas; -import org.slf4j.Logger; -import org.slf4j.LoggerFactory; - -import org.apache.cassandra.locator.InetAddressAndPort; -import org.apache.cassandra.locator.Replica; import org.apache.cassandra.schema.TableMetadata; import org.apache.cassandra.config.DatabaseDescriptor; import org.apache.cassandra.exceptions.InvalidRequestException; -import org.apache.cassandra.exceptions.UnavailableException; import org.apache.cassandra.locator.AbstractReplicationStrategy; import org.apache.cassandra.locator.NetworkTopologyStrategy; import org.apache.cassandra.transport.ProtocolException; +import static org.apache.cassandra.locator.Replicas.countInOurDc; + public enum ConsistencyLevel { ANY (0), @@ -52,8 +44,6 @@ public enum ConsistencyLevel LOCAL_ONE (10, true), NODE_LOCAL (11, true); -private static final 
Logger logger = LoggerFactory.getLogger(ConsistencyLevel.class); - // Used by the binary protocol public final int code; private final boolean isDCLocal; @@ -90,18 +80,27 @@ public enum ConsistencyLevel return codeIdx[code]; } -private int quorumFor(Keyspace keyspace) +public static int quorumFor(Keyspace keyspace) { return (keyspace.getReplicationStrategy().getReplicationFactor().allReplicas / 2) + 1; } -private int localQuorumFor(Keyspace keyspace, String dc) +public static int localQuorumFor(Keyspace keyspace, String dc) { return (keyspace.getReplicationStrategy() instanceof NetworkTopologyStrategy) ? (((NetworkTopologyStrategy) keyspace.getReplicationStrategy()).getReplicationFactor(dc).allReplicas / 2) + 1 : quorumFor(keyspace); } +public static ObjectIntOpenHashMap eachQuorumFor(Keyspace keyspace) +{ +NetworkTopology
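As a quick sanity check on the quorum helpers in the diff above, the arithmetic for a hypothetical keyspace using NetworkTopologyStrategy with three replicas in each of two datacenters works out as follows:

{code:java}
// quorumFor:      total replicas / 2 + 1  ->  6 / 2 + 1 = 4   (QUORUM)
// localQuorumFor: per-DC replicas / 2 + 1 ->  3 / 2 + 1 = 2   (LOCAL_QUORUM, and per-DC for EACH_QUORUM)
static int quorum(int replicas)
{
    return replicas / 2 + 1;
}
{code}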
[jira] [Updated] (CASSANDRA-14756) Transient Replication - range movement improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-14756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-14756: Resolution: Fixed Status: Resolved (was: Patch Available) > Transient Replication - range movement improvements > --- > > Key: CASSANDRA-14756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14756 > Project: Cassandra > Issue Type: Improvement >Reporter: Alex Petrov >Assignee: Alex Petrov >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > * Simplify iteration in calculateRangesToFetchWithPreferredEndpoints > * Minor changes to calculateRangesToFetchWithPreferredEndpoints to improve > readability: > * Simplify RangeRelocator code > * Fix range relocation > * Simplify calculateStreamAndFetchRanges > * Unify request/transfer ranges interface (Added benefit of this change is > that we have a check for non-intersecting ranges) > * Simplify iteration in calculateRangesToFetchWithPreferredEndpoints > * Improve error messages -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[1/2] cassandra git commit: Transient replication: range movement improvements
Repository: cassandra Updated Branches: refs/heads/trunk 210da3dc0 -> 0379201c7 http://git-wip-us.apache.org/repos/asf/cassandra/blob/0379201c/test/unit/org/apache/cassandra/dht/BootStrapperTest.java -- diff --git a/test/unit/org/apache/cassandra/dht/BootStrapperTest.java b/test/unit/org/apache/cassandra/dht/BootStrapperTest.java index 8ae6853..2f412ad 100644 --- a/test/unit/org/apache/cassandra/dht/BootStrapperTest.java +++ b/test/unit/org/apache/cassandra/dht/BootStrapperTest.java @@ -105,7 +105,6 @@ public class BootStrapperTest InetAddressAndPort myEndpoint = InetAddressAndPort.getByName("127.0.0.1"); assertEquals(numOldNodes, tmd.sortedTokens().size()); -RangeStreamer s = new RangeStreamer(tmd, null, myEndpoint, StreamOperation.BOOTSTRAP, true, DatabaseDescriptor.getEndpointSnitch(), new StreamStateStore(), false, 1); IFailureDetector mockFailureDetector = new IFailureDetector() { public boolean isAlive(InetAddressAndPort ep) @@ -120,26 +119,20 @@ public class BootStrapperTest public void remove(InetAddressAndPort ep) { throw new UnsupportedOperationException(); } public void forceConviction(InetAddressAndPort ep) { throw new UnsupportedOperationException(); } }; -s.addSourceFilter(new RangeStreamer.FailureDetectorSourceFilter(mockFailureDetector)); +RangeStreamer s = new RangeStreamer(tmd, null, myEndpoint, StreamOperation.BOOTSTRAP, true, DatabaseDescriptor.getEndpointSnitch(), new StreamStateStore(), mockFailureDetector, false, 1); assertNotNull(Keyspace.open(keyspaceName)); s.addRanges(keyspaceName, Keyspace.open(keyspaceName).getReplicationStrategy().getPendingAddressRanges(tmd, myToken, myEndpoint)); -Collection> toFetch = s.toFetch().get(keyspaceName); +Multimap toFetch = s.toFetch().get(keyspaceName); // Check we get get RF new ranges in total -long rangesCount = toFetch.stream() - .map(Multimap::values) - .flatMap(Collection::stream) - .map(f -> f.remote) - .map(Replica::range) - .count(); -assertEquals(replicationFactor, rangesCount); +assertEquals(replicationFactor, toFetch.size()); // there isn't any point in testing the size of these collections for any specific size. When a random partitioner // is used, they will vary. 
-assert toFetch.stream().map(Multimap::values).flatMap(Collection::stream).count() > 0; -assert toFetch.stream().map(Multimap::keySet).map(Collection::stream).noneMatch(myEndpoint::equals); +assert toFetch.values().size() > 0; +assert toFetch.keys().stream().noneMatch(myEndpoint::equals); return s; } http://git-wip-us.apache.org/repos/asf/cassandra/blob/0379201c/test/unit/org/apache/cassandra/dht/RangeFetchMapCalculatorTest.java -- diff --git a/test/unit/org/apache/cassandra/dht/RangeFetchMapCalculatorTest.java b/test/unit/org/apache/cassandra/dht/RangeFetchMapCalculatorTest.java index 07d6377..cee4bb9 100644 --- a/test/unit/org/apache/cassandra/dht/RangeFetchMapCalculatorTest.java +++ b/test/unit/org/apache/cassandra/dht/RangeFetchMapCalculatorTest.java @@ -195,18 +195,26 @@ public class RangeFetchMapCalculatorTest addNonTrivialRangeAndSources(rangesWithSources, 21, 30, "127.0.0.3"); //Return false for all except 127.0.0.5 -final Predicate filter = replica -> +final RangeStreamer.SourceFilter filter = new RangeStreamer.SourceFilter() { -try +public boolean apply(Replica replica) { -if (replica.endpoint().equals(InetAddressAndPort.getByName("127.0.0.5"))) -return false; -else +try +{ +if (replica.endpoint().equals(InetAddressAndPort.getByName("127.0.0.5"))) +return false; +else +return true; +} +catch (UnknownHostException e) +{ return true; +} } -catch (UnknownHostException e) + +public String message(Replica replica) { -return true; +return "Doesn't match 127.0.0.5"; } }; @@ -230,7 +238,18 @@ public class RangeFetchMapCalculatorTest addNonTrivialRangeAndSources(rangesWithSources, 11, 20, "127.0.0.2"); addNonTrivialRangeAndSources(rangesWithSources, 21, 30, "127.0.0.3"); -final Predicate allDeadFilter = replica -> false; +final RangeStreamer.SourceFilter allDeadFilter = new RangeStreamer.SourceFilter() +{ +
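The test changes above track an API change in {{RangeStreamer}}: plain {{Predicate}}-style source filters were replaced by a {{SourceFilter}} that can also explain why a source was rejected, which is what the improved error messages in this ticket rely on. The following is a condensed sketch of that shape; the interface and the use of plain endpoint strings are simplified stand-ins for the real {{RangeStreamer.SourceFilter}} and {{Replica}} types.

{code}
// Simplified stand-in for RangeStreamer.SourceFilter, showing the
// apply()/message() pairing used in the test diff above.
interface SourceFilter
{
    boolean apply(String endpoint);   // keep this endpoint as a streaming source?
    String message(String endpoint);  // human-readable reason when it is rejected
}

public class ExcludeEndpointFilter implements SourceFilter
{
    private final String excluded;

    public ExcludeEndpointFilter(String excluded)
    {
        this.excluded = excluded;
    }

    @Override
    public boolean apply(String endpoint)
    {
        return !endpoint.equals(excluded);
    }

    @Override
    public String message(String endpoint)
    {
        return "Rejected as source: matches excluded endpoint " + excluded;
    }

    public static void main(String[] args)
    {
        SourceFilter filter = new ExcludeEndpointFilter("127.0.0.5");
        System.out.println(filter.apply("127.0.0.1"));   // true
        System.out.println(filter.apply("127.0.0.5"));   // false
        System.out.println(filter.message("127.0.0.5"));
    }
}
{code}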
[jira] [Commented] (CASSANDRA-14756) Transient Replication - range movement improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-14756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628486#comment-16628486 ] Alex Petrov commented on CASSANDRA-14756: - Thank you for the review, committed to trunk as [0379201c7057f6bac4abf1e0f3d81a12d90abd08|https://github.com/apache/cassandra/commit/0379201c7057f6bac4abf1e0f3d81a12d90abd08] > Transient Replication - range movement improvements > --- > > Key: CASSANDRA-14756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14756 > Project: Cassandra > Issue Type: Improvement >Reporter: Alex Petrov >Assignee: Alex Petrov >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > * Simplify iteration in calculateRangesToFetchWithPreferredEndpoints > * Minor changes to calculateRangesToFetchWithPreferredEndpoints to improve > readability: > * Simplify RangeRelocator code > * Fix range relocation > * Simplify calculateStreamAndFetchRanges > * Unify request/transfer ranges interface (Added benefit of this change is > that we have a check for non-intersecting ranges) > * Simplify iteration in calculateRangesToFetchWithPreferredEndpoints > * Improve error messages -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[2/2] cassandra git commit: Transient replication: range movement improvements
Transient replication: range movement improvements Patch by Alex Petrov; reviewed by Ariel Weisberg and Benedict Elliott Smith for CASSANDRA-14756 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0379201c Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0379201c Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0379201c Branch: refs/heads/trunk Commit: 0379201c7057f6bac4abf1e0f3d81a12d90abd08 Parents: 210da3d Author: Alex Petrov Authored: Mon Sep 17 11:51:56 2018 +0200 Committer: Alex Petrov Committed: Wed Sep 26 11:42:46 2018 +0200 -- .../org/apache/cassandra/db/SystemKeyspace.java | 31 +- .../org/apache/cassandra/dht/BootStrapper.java | 3 - .../cassandra/dht/RangeFetchMapCalculator.java | 2 +- .../org/apache/cassandra/dht/RangeStreamer.java | 448 ++- .../apache/cassandra/dht/StreamStateStore.java | 12 +- .../cassandra/locator/RangesAtEndpoint.java | 6 + .../cassandra/service/RangeRelocator.java | 324 ++ .../cassandra/service/StorageService.java | 314 + .../apache/cassandra/streaming/StreamPlan.java | 17 +- .../cassandra/streaming/StreamSession.java | 8 +- .../apache/cassandra/dht/BootStrapperTest.java | 17 +- .../dht/RangeFetchMapCalculatorTest.java| 79 +++- .../locator/OldNetworkTopologyStrategyTest.java | 3 +- .../service/BootstrapTransientTest.java | 113 +++-- .../cassandra/service/MoveTransientTest.java| 321 +++-- .../cassandra/service/StorageServiceTest.java | 18 +- 16 files changed, 981 insertions(+), 735 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/0379201c/src/java/org/apache/cassandra/db/SystemKeyspace.java -- diff --git a/src/java/org/apache/cassandra/db/SystemKeyspace.java b/src/java/org/apache/cassandra/db/SystemKeyspace.java index ff070a3..0f904ce 100644 --- a/src/java/org/apache/cassandra/db/SystemKeyspace.java +++ b/src/java/org/apache/cassandra/db/SystemKeyspace.java @@ -32,12 +32,11 @@ import javax.management.openmbean.TabularData; import com.google.common.annotations.VisibleForTesting; import com.google.common.collect.HashMultimap; import com.google.common.collect.ImmutableMap; +import com.google.common.collect.ImmutableSet; import com.google.common.collect.SetMultimap; import com.google.common.collect.Sets; import com.google.common.io.ByteStreams; import com.google.common.util.concurrent.ListenableFuture; -import org.apache.cassandra.locator.RangesAtEndpoint; -import org.apache.cassandra.locator.Replica; import org.slf4j.Logger; import org.slf4j.LoggerFactory; @@ -1285,24 +1284,40 @@ public final class SystemKeyspace keyspace); } -public static synchronized RangesAtEndpoint getAvailableRanges(String keyspace, IPartitioner partitioner) +/** + * List of the streamed ranges, where transientness is encoded based on the source, where range was streamed from. 
+ */ +public static synchronized AvailableRanges getAvailableRanges(String keyspace, IPartitioner partitioner) { String query = "SELECT * FROM system.%s WHERE keyspace_name=?"; UntypedResultSet rs = executeInternal(format(query, AVAILABLE_RANGES_V2), keyspace); -InetAddressAndPort endpoint = InetAddressAndPort.getLocalHost(); -RangesAtEndpoint.Builder builder = RangesAtEndpoint.builder(endpoint); + +ImmutableSet.Builder> full = new ImmutableSet.Builder<>(); +ImmutableSet.Builder> trans = new ImmutableSet.Builder<>(); for (UntypedResultSet.Row row : rs) { Optional.ofNullable(row.getSet("full_ranges", BytesType.instance)) .ifPresent(full_ranges -> full_ranges.stream() .map(buf -> byteBufferToRange(buf, partitioner)) -.forEach(range -> builder.add(fullReplica(endpoint, range; +.forEach(full::add)); Optional.ofNullable(row.getSet("transient_ranges", BytesType.instance)) .ifPresent(transient_ranges -> transient_ranges.stream() .map(buf -> byteBufferToRange(buf, partitioner)) -.forEach(range -> builder.add(transientReplica(endpoint, range; +.forEach(trans::add)); +} +return new AvailableRanges(full.build(), trans.build()); +} + +public static class AvailableRanges +{ +public Set> full; +public Set> trans; + +private AvailableRanges(Set> full, Set> trans) +{ +this.full = full; +this.trans = trans; } -re
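The {{SystemKeyspace}} change above replaces the {{RangesAtEndpoint}} return value with a small holder that keeps fully-streamed and transiently-streamed ranges apart. Below is a stripped-down sketch of that holder and how a caller might branch on it; ranges are reduced to plain strings here, whereas the real code builds {{Set}}s of token ranges from the system table.

{code}
import java.util.Set;

// Stripped-down stand-in for SystemKeyspace.AvailableRanges: streamed ranges
// are split by whether they were received from full or transient sources.
public class AvailableRangesSketch
{
    final Set<String> full;   // ranges streamed as full data
    final Set<String> trans;  // ranges streamed as transient data

    AvailableRangesSketch(Set<String> full, Set<String> trans)
    {
        this.full = full;
        this.trans = trans;
    }

    public static void main(String[] args)
    {
        AvailableRangesSketch ranges =
            new AvailableRangesSketch(Set.of("(0,100]"), Set.of("(100,200]"));

        // A caller deciding whether a range still needs a full stream would
        // only count ranges in 'full' as fully available.
        String needed = "(100,200]";
        System.out.println(needed + " fully available: " + ranges.full.contains(needed));       // false
        System.out.println(needed + " transiently available: " + ranges.trans.contains(needed)); // true
    }
}
{code}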
[jira] [Updated] (CASSANDRA-14467) Add option to sanity check tombstones on reads/compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-14467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-14467: Resolution: Fixed Fix Version/s: (was: 4.x) 4.0 Status: Resolved (was: Patch Available) committed as \{{96f90eee28247cf9a8520e6962b0388f193c7ca8}}, thanks! new test result: https://circleci.com/workflow-run/ffc46ccd-42d8-41ce-b319-7d29db444e20 > Add option to sanity check tombstones on reads/compaction > - > > Key: CASSANDRA-14467 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14467 > Project: Cassandra > Issue Type: Improvement >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Minor > Labels: pull-request-available > Fix For: 4.0 > > > We should add an option to do a quick sanity check of tombstones on reads + > compaction. It should either log the error or throw an exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14467) Add option to sanity check tombstones on reads/compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-14467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated CASSANDRA-14467: --- Labels: pull-request-available (was: ) > Add option to sanity check tombstones on reads/compaction > - > > Key: CASSANDRA-14467 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14467 > Project: Cassandra > Issue Type: Improvement >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Minor > Labels: pull-request-available > Fix For: 4.x > > > We should add an option to do a quick sanity check of tombstones on reads + > compaction. It should either log the error or throw an exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14467) Add option to sanity check tombstones on reads/compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-14467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628462#comment-16628462 ] ASF GitHub Bot commented on CASSANDRA-14467: Github user asfgit closed the pull request at: https://github.com/apache/cassandra-dtest/pull/30 > Add option to sanity check tombstones on reads/compaction > - > > Key: CASSANDRA-14467 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14467 > Project: Cassandra > Issue Type: Improvement >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Minor > Labels: pull-request-available > Fix For: 4.x > > > We should add an option to do a quick sanity check of tombstones on reads + > compaction. It should either log the error or throw an exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
cassandra-dtest git commit: always enable tombstone validation exceptions during tests
Repository: cassandra-dtest Updated Branches: refs/heads/master 02c1cd774 -> 96f90eee2 always enable tombstone validation exceptions during tests Patch by marcuse; reviewed by Ariel Weisberg for CASSANDRA-14467 Closes #30 Project: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/commit/96f90eee Tree: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/tree/96f90eee Diff: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/diff/96f90eee Branch: refs/heads/master Commit: 96f90eee28247cf9a8520e6962b0388f193c7ca8 Parents: 02c1cd7 Author: Marcus Eriksson Authored: Thu May 31 08:41:11 2018 +0200 Committer: Marcus Eriksson Committed: Wed Sep 26 11:13:16 2018 +0200 -- dtest_setup.py | 2 ++ ttl_test.py| 2 ++ 2 files changed, 4 insertions(+) -- http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/96f90eee/dtest_setup.py -- diff --git a/dtest_setup.py b/dtest_setup.py index 295fa3f..9e3f330 100644 --- a/dtest_setup.py +++ b/dtest_setup.py @@ -417,6 +417,8 @@ class DTestSetup: # No more thrift in 4.0, and start_rpc doesn't exists anymore if self.cluster.version() >= '4' and 'start_rpc' in values: del values['start_rpc'] +if self.cluster.version() >= '4': +values['corrupted_tombstone_strategy'] = 'exception' self.cluster.set_configuration_options(values) logger.debug("Done setting configuration options:\n" + pprint.pformat(self.cluster._config_options, indent=4)) http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/96f90eee/ttl_test.py -- diff --git a/ttl_test.py b/ttl_test.py index 4a7ad06..d89ca6a 100644 --- a/ttl_test.py +++ b/ttl_test.py @@ -567,6 +567,8 @@ class TestRecoverNegativeExpirationDate(TestHelper): Check that row with negative overflowed ttl is recovered by offline scrub """ cluster = self.cluster +if self.cluster.version() >= '4': + cluster.set_configuration_options(values={'corrupted_tombstone_strategy': 'disabled'}) cluster.populate(1).start(wait_for_binary_proto=True) [node] = cluster.nodelist() - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14733) AbstractReadRepair sends unnecessary data read(s)
[ https://issues.apache.org/jira/browse/CASSANDRA-14733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628435#comment-16628435 ] Benedict commented on CASSANDRA-14733: -- Ah, thanks for that insight. Perhaps it's better to wait until we fix monotonic reads with transient replication, then, as at that point we may be requesting a separate repaired/unrepaired digest as a matter of course. At the moment, at least transient requests already implicitly 'track' this (or, cannot track it, however you want to view it), so we could at least not re-issue these requests. It looks like we're already special casing the receipt of transient responses because of this, so it would have no effect to reuse the existing responses. > AbstractReadRepair sends unnecessary data read(s) > - > > Key: CASSANDRA-14733 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14733 > Project: Cassandra > Issue Type: Bug >Reporter: Benedict >Priority: Minor > Labels: Availability, performance > > We already have one or more data responses (two in case of 'always' > speculation, and potentially more if transient replication is enabled, though > the two do not presently interact). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14733) AbstractReadRepair sends unnecessary data read(s)
[ https://issues.apache.org/jira/browse/CASSANDRA-14733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628422#comment-16628422 ] Sam Tunnicliffe commented on CASSANDRA-14733: - If repaired data tracking (CASSANDRA-14145) is enabled, we'll need to re-issue the original requests as those responses won't include the tracking info > AbstractReadRepair sends unnecessary data read(s) > - > > Key: CASSANDRA-14733 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14733 > Project: Cassandra > Issue Type: Bug >Reporter: Benedict >Priority: Minor > Labels: Availability, performance > > We already have one or more data responses (two in case of 'always' > speculation, and potentially more if transient replication is enabled, though > the two do not presently interact). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14762) Transient node receives full data requests in dtests
[ https://issues.apache.org/jira/browse/CASSANDRA-14762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628397#comment-16628397 ] Benedict commented on CASSANDRA-14762: -- Thanks for the review. bq. Is this just a clarification? Yes; it is a no-effect change. bq. It seems to me we use read repair to assemble the read after digest mismatch between full replicas and that is why we it might send messages to transient replicas? That is what it seems like to me. Even though we don't send repair mutations after read-repair from transient replicas, on digest mismatch we still perform reads to other replicas - including those that we may not have contacted initially, which might include new transient nodes. bq. A minor out of scope improvement would be to use the existing response and not repeat the read? Agreed, see CASSANDRA-14733. I considered doing that optimisation for this ticket, but given the above fact that we might issue new transient reads it seemed to unnecessarily complicate fixing this bug. bq. It seems to me also that we would read from transients as part of short read protection (they are just another member of the group), and they aren't special so we should issue them the query. Is this simply you reasoning out the rationale for issuing the requests, or have you spotted an issue with the patch that means we are not doing so? It does look to me like we should be perhaps switching to {{acceptsTransient}} for the local query also, and validating transient status in {{LocalReadRunnable}} - but, for now, we don't do this. Perhaps we could fix this also in this patch, or otherwise file a follow up. > Transient node receives full data requests in dtests > > > Key: CASSANDRA-14762 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14762 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Ariel Weisberg >Assignee: Benedict >Priority: Major > Fix For: 4.0 > > > I saw this running them on my laptop with rapid write protection disabled. > Attached is a patch for disabling rapid write protection in the transient > dtests. 
> {noformat} > .Exception in thread Thread-19: > Traceback (most recent call last): > File > "/usr/local/Cellar/python/3.6.4_4/Frameworks/Python.framework/Versions/3.6/lib/python3.6/threading.py", > line 916, in _bootstrap_inner > self.run() > File > "/Users/aweisberg/repos/cassandra-dtest/venv/src/ccm/ccmlib/cluster.py", line > 180, in run > self.scan_and_report() > File > "/Users/aweisberg/repos/cassandra-dtest/venv/src/ccm/ccmlib/cluster.py", line > 173, in scan_and_report > on_error_call(errordata) > File "/Users/aweisberg/repos/cassandra-dtest/dtest_setup.py", line 137, in > _log_error_handler > pytest.fail("Error details: \n{message}".format(message=message)) > File > "/Users/aweisberg/repos/cassandra-dtest/venv/lib/python3.6/site-packages/_pytest/outcomes.py", > line 96, in fail > raise Failed(msg=msg, pytrace=pytrace) > Failed: Error details: > Errors seen in logs for: node3 > node3: ERROR [ReadStage-1] 2018-09-18 12:28:48,344 > AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread > Thread[ReadStage-1,5,main] > org.apache.cassandra.exceptions.InvalidRequestException: Attempted to serve > transient data request from full node in > org.apache.cassandra.db.ReadCommandVerbHandler@3c55e0ff > at > org.apache.cassandra.db.ReadCommandVerbHandler.validateTransientStatus(ReadCommandVerbHandler.java:104) > at > org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:53) > at > org.apache.cassandra.net.MessageDeliveryTask.process(MessageDeliveryTask.java:92) > at > org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:54) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134) > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:110) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.lang.Thread.run(Thread.java:748) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
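For context on the failure quoted above, the check that fires is a guard in the read verb handler: the replica compares whether the incoming command expects transient data against whether it actually holds the range transiently, and rejects the mismatch. The following is a rough, self-contained sketch of that guard; the boolean parameters and exception type are simplified stand-ins, as the real logic in {{ReadCommandVerbHandler.validateTransientStatus}} consults the {{ReadCommand}} and the local {{Replica}}.

{code}
// Rough sketch of the transient-status guard described in the stack trace
// above. 'commandAcceptsTransient' and 'replicaIsTransient' stand in for the
// real lookups on the read command and the local replica.
public class TransientStatusGuard
{
    static void validateTransientStatus(boolean commandAcceptsTransient, boolean replicaIsTransient)
    {
        if (commandAcceptsTransient && !replicaIsTransient)
            throw new IllegalArgumentException("Attempted to serve transient data request from full node");
        if (!commandAcceptsTransient && replicaIsTransient)
            throw new IllegalArgumentException("Attempted to serve full data request from transient node");
    }

    public static void main(String[] args)
    {
        validateTransientStatus(true, true);       // ok: transient request on a transient replica
        try
        {
            validateTransientStatus(true, false);  // the mismatch reported in this ticket
        }
        catch (IllegalArgumentException e)
        {
            System.out.println(e.getMessage());
        }
    }
}
{code}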