[jira] [Commented] (CASSANDRA-14480) Digest mismatch requires all replicas to be responsive
[ https://issues.apache.org/jira/browse/CASSANDRA-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516182#comment-16516182 ]

Christian Spriegel commented on CASSANDRA-14480:
------------------------------------------------

The fact that it will be 4.0 only is indeed a hard pill to swallow. :(

> Digest mismatch requires all replicas to be responsive
> ------------------------------------------------------
>
>                 Key: CASSANDRA-14480
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14480
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Christian Spriegel
>            Priority: Major
>         Attachments: Reader.java, Writer.java, schema_14480.cql
>
> I ran across a scenario where a digest mismatch causes a read-repair that
> requires all up nodes to be able to respond. If one of these nodes is not
> responding, then the read-repair is reported to the client as a
> ReadTimeoutException.
>
> My expectation would be that CL=QUORUM always succeeds as long as 2 nodes
> are responding. But unfortunately the third node being "up" in the ring, yet
> unable to respond, leads to an RTE.
>
> I came up with a scenario that reproduces the issue:
> # set up a 3-node cluster using ccm
> # increase the phi_convict_threshold to 16, so that nodes are permanently
> reported as up
> # create the attached schema
> # run the attached reader (which only connects to node1 & node2). This
> should already produce digest mismatches
> # do a "ccm node3 pause"
> # The reader will report a read timeout with consistency QUORUM (2 responses
> were required but only 1 replica responded). Within the
> DigestMismatchException catch block it can be seen that the repairHandler is
> waiting for 3 responses, even though the exception says that 2 responses are
> required.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Resolved] (CASSANDRA-14480) Digest mismatch requires all replicas to be responsive
[ https://issues.apache.org/jira/browse/CASSANDRA-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Spriegel resolved CASSANDRA-14480.
--------------------------------------------
    Resolution: Duplicate
[jira] [Commented] (CASSANDRA-14480) Digest mismatch requires all replicas to be responsive
[ https://issues.apache.org/jira/browse/CASSANDRA-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516124#comment-16516124 ]

Christian Spriegel commented on CASSANDRA-14480:
------------------------------------------------

[~jjirsa]: It sounds like my ticket is a duplicate.
[jira] [Commented] (CASSANDRA-14480) Digest mismatch requires all replicas to be responsive
[ https://issues.apache.org/jira/browse/CASSANDRA-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16515889#comment-16515889 ]

Christian Spriegel commented on CASSANDRA-14480:
------------------------------------------------

I just saw this happening in a production system:
{noformat}
Caused by: com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency ALL (8 responses were required but only 7 replica responded)
{noformat}
Our queries use LOCAL_QUORUM, but we have RTEs happening due to read-repair. read_repair_chance = 0.1 is set, so it is going cross-DC. :(
[jira] [Comment Edited] (CASSANDRA-14480) Digest mismatch requires all replicas to be responsive
[ https://issues.apache.org/jira/browse/CASSANDRA-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16495254#comment-16495254 ]

Christian Spriegel edited comment on CASSANDRA-14480 at 5/30/18 2:55 PM:
-------------------------------------------------------------------------

I did some more testing and tried the following change in StorageProxy.SinglePartitionReadLifecycle.awaitResultsAndRetryOnDigestMismatch():
{code:java}
repairHandler = new ReadCallback(resolver,
                                 ConsistencyLevel.ALL,
                                 consistency.blockFor(keyspace), // was: executor.getContactedReplicas().size()
                                 command,
                                 keyspace,
                                 executor.handler.endpoints);
{code}
This fixed the issue in my test scenario. But it causes the read-repair to only repair 2 out of my 3 replicas, in cases where all 3 replicas would be available.

I could imagine an alternative solution where maybeAwaitFullDataRead() would wait for 3 replicas, but in case of an RTE it could check whether 2 responded and treat that as a successful read.

was (Author: christianmovi):
I did some more testing and tried the following change in StorageProxy.SinglePartitionReadLifecycle.awaitResultsAndRetryOnDigestMismatch():
{code:java}
repairHandler = new ReadCallback(resolver,
                                 ConsistencyLevel.ALL,
                                 consistency.blockFor(keyspace), // was: executor.getContactedReplicas().size()
                                 command,
                                 keyspace,
                                 executor.handler.endpoints);
{code}
This fixed the issue in my test scenario. But it causes the read-repair to only repair 2 out of my 3 replicas, in cases where all 3 replicas would be available.
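The alternative suggested above (block on all replicas, but accept a timeout as success when the CL-mandated count has responded) can be modeled outside Cassandra with a latch. This is an illustrative sketch only; `RepairWait` and its parameters are hypothetical names, not Cassandra's actual classes:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model of the proposed behavior: the repair handler waits for every
// contacted replica, but on timeout it still succeeds if at least blockFor
// (the consistency-level-mandated count) responses have arrived.
public class RepairWait {
    static boolean awaitRepair(int contacted, int blockFor, int arrived, long timeoutMs) {
        CountDownLatch latch = new CountDownLatch(contacted);
        AtomicInteger received = new AtomicInteger();
        for (int i = 0; i < arrived; i++) {   // simulate replica responses
            received.incrementAndGet();
            latch.countDown();
        }
        try {
            if (latch.await(timeoutMs, TimeUnit.MILLISECONDS))
                return true;                  // every contacted replica answered
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received.get() >= blockFor;    // degraded success at the query CL
    }

    public static void main(String[] args) {
        // RF=3, QUORUM blocks for 2; node3 is paused, so only 2 responses arrive.
        System.out.println(awaitRepair(3, 2, 2, 50));  // true: quorum met despite timeout
        System.out.println(awaitRepair(3, 2, 1, 50));  // false: still a timeout
    }
}
```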
[jira] [Commented] (CASSANDRA-14480) Digest mismatch requires all replicas to be responsive
[ https://issues.apache.org/jira/browse/CASSANDRA-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16495254#comment-16495254 ]

Christian Spriegel commented on CASSANDRA-14480:
------------------------------------------------

I did some more testing and tried the following change in StorageProxy.SinglePartitionReadLifecycle.awaitResultsAndRetryOnDigestMismatch():
{code:java}
repairHandler = new ReadCallback(resolver,
                                 ConsistencyLevel.ALL,
                                 consistency.blockFor(keyspace), // was: executor.getContactedReplicas().size()
                                 command,
                                 keyspace,
                                 executor.handler.endpoints);
{code}
This fixed the issue in my test scenario. But it causes the read-repair to only repair 2 out of my 3 replicas, in cases where all 3 replicas would be available.
[jira] [Updated] (CASSANDRA-14480) Digest mismatch requires all replicas to be responsive
[ https://issues.apache.org/jira/browse/CASSANDRA-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Spriegel updated CASSANDRA-14480:
-------------------------------------------
    Attachment: Reader.java
                Writer.java
[jira] [Updated] (CASSANDRA-14480) Digest mismatch requires all replicas to be responsive
[ https://issues.apache.org/jira/browse/CASSANDRA-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Spriegel updated CASSANDRA-14480:
-------------------------------------------
    Attachment: schema_14480.cql
[jira] [Created] (CASSANDRA-14480) Digest mismatch requires all replicas to be responsive
Christian Spriegel created CASSANDRA-14480:
----------------------------------------------

             Summary: Digest mismatch requires all replicas to be responsive
                 Key: CASSANDRA-14480
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14480
             Project: Cassandra
          Issue Type: Bug
          Components: Core
            Reporter: Christian Spriegel

I ran across a scenario where a digest mismatch causes a read-repair that requires all up nodes to be able to respond. If one of these nodes is not responding, then the read-repair is reported to the client as a ReadTimeoutException.

My expectation would be that CL=QUORUM always succeeds as long as 2 nodes are responding. But unfortunately the third node being "up" in the ring, yet unable to respond, leads to an RTE.

I came up with a scenario that reproduces the issue:
# set up a 3-node cluster using ccm
# increase the phi_convict_threshold to 16, so that nodes are permanently reported as up
# create the attached schema
# run the attached reader (which only connects to node1 & node2). This should already produce digest mismatches
# do a "ccm node3 pause"
# The reader will report a read timeout with consistency QUORUM (2 responses were required but only 1 replica responded). Within the DigestMismatchException catch block it can be seen that the repairHandler is waiting for 3 responses, even though the exception says that 2 responses are required.
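The "2 required vs. waiting for 3" discrepancy in the last step follows from plain consistency-level arithmetic: QUORUM blocks for floor(RF/2) + 1, while the digest-mismatch repair handler blocks for every replica it contacted. A standalone sketch of just that arithmetic (illustrative names, not Cassandra's classes):

```java
// QUORUM blocks for floor(RF/2) + 1 responses; the repair handler triggered
// by a digest mismatch instead awaits all contacted replicas. With RF=3 and
// all three nodes marked "up", those two numbers differ: 2 vs. 3.
public class BlockForDemo {
    static int quorumBlockFor(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    public static void main(String[] args) {
        int rf = 3;
        int contacted = 3;                       // every "up" replica was queried
        System.out.println(quorumBlockFor(rf));  // 2: what the exception reports
        System.out.println(contacted);           // 3: what the repair handler awaits
    }
}
```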
[jira] [Comment Edited] (CASSANDRA-13086) CAS resultset sometimes does not contain value column even though wasApplied is false
[ https://issues.apache.org/jira/browse/CASSANDRA-13086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811487#comment-15811487 ]

Christian Spriegel edited comment on CASSANDRA-13086 at 2/15/18 11:26 AM:
--------------------------------------------------------------------------

[~ifesdjeen]: Then this would mean the java-driver needs some kind of hasColumn() method in the Row, so that the application can properly check for the column. It would be a driver issue then.

edit: row.getColumnDefinitions().contains("columnname")

was (Author: christianmovi):
[~ifesdjeen]: Then this would mean the java-driver needs some kind of hasColumn() method in the Row, so that the application can properly check for the column. It would be a driver issue then.

> CAS resultset sometimes does not contain value column even though wasApplied is false
> -------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13086
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13086
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Christian Spriegel
>            Priority: Minor
>
> Every now and then I see a ResultSet for one of my CAS queries that contains
> wasApplied=false, but does not contain my value column.
> I just now found another occurrence, which causes the following exception in
> the driver:
> {code}
> ...
> Caused by: com.mycompany.MyDataaccessException: checkLock(ResultSet[exhausted: true, Columns[[applied](boolean)]])
>         at com.mycompany.MyDAO._checkLock(MyDAO.java:408)
>         at com.mycompany.MyDAO._releaseLock(MyDAO.java:314)
>         ... 16 more
> Caused by: java.lang.IllegalArgumentException: value is not a column defined in this metadata
>         at com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:266)
>         at com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:272)
>         at com.datastax.driver.core.ArrayBackedRow.getIndexOf(ArrayBackedRow.java:81)
>         at com.datastax.driver.core.AbstractGettableData.getBytes(AbstractGettableData.java:151)
>         at com.mycompany.MyDAO._checkLock(MyDAO.java:383)
>         ... 17 more
> {code}
> The query the application was doing:
> delete from "Lock" where lockname=:lockname and id=:id if value=:value;
> I did some debugging recently and was able to track these ResultSets to
> StorageProxy.cas() to the "CAS precondition does not match current values {}"
> return statement.
> I saw this happening with Cassandra 3.0.10 and earlier versions.
[jira] [Commented] (CASSANDRA-14092) Max ttl of 20 years will overflow localDeletionTime
[ https://issues.apache.org/jira/browse/CASSANDRA-14092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16337907#comment-16337907 ]

Christian Spriegel commented on CASSANDRA-14092:
------------------------------------------------

Reproduced in 3.0.15.

> Max ttl of 20 years will overflow localDeletionTime
> ---------------------------------------------------
>
>                 Key: CASSANDRA-14092
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14092
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Paulo Motta
>            Priority: Major
>
> CASSANDRA-4771 added a max value of 20 years for ttl to protect against the
> [year 2038 overflow bug|https://en.wikipedia.org/wiki/Year_2038_problem] for
> {{localDeletionTime}}.
> It turns out that next year the {{localDeletionTime}} will start overflowing
> with the maximum ttl of 20 years ({{System.currentTimeMillis() + ttl(20
> years) > Integer.MAX_VALUE}}), so we should remove this limitation.
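The overflow point can be computed directly: with localDeletionTime stored as epoch seconds in a signed 32-bit int, a 20-year TTL overflows once "now" passes Integer.MAX_VALUE minus 20 years of seconds. A self-contained sketch (the 20-year constant uses 365-day years as an approximation, which is an assumption about how the limit is counted):

```java
public class TtlOverflow {
    // Maximum TTL from CASSANDRA-4771: 20 years, in seconds (365-day years).
    static final int MAX_TTL = 20 * 365 * 24 * 60 * 60;   // 630,720,000

    public static void main(String[] args) {
        // localDeletionTime = nowInSeconds + ttl, stored in a signed 32-bit int,
        // so the last safe "now" is Integer.MAX_VALUE - MAX_TTL.
        long lastSafeNow = (long) Integer.MAX_VALUE - MAX_TTL;
        System.out.println(lastSafeNow);             // 1516763647: an epoch second in January 2018

        // Past that point, adding the maximum TTL wraps the int negative.
        long now = 1_517_000_000L;                   // early February 2018
        int localDeletionTime = (int) (now + MAX_TTL);
        System.out.println(localDeletionTime < 0);   // true: overflowed
    }
}
```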
[jira] [Comment Edited] (CASSANDRA-4771) Setting TTL to Integer.MAX causes columns to not be persisted.
[ https://issues.apache.org/jira/browse/CASSANDRA-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16337826#comment-16337826 ]

Christian Spriegel edited comment on CASSANDRA-4771 at 1/24/18 4:12 PM:
------------------------------------------------------------------------

[~rbfblk]: Nope, it seems it was not fixed with the refactoring. For now, changing the typecast to a signed int would already be a big improvement.

Edit: It happened in C* 3.0.15

was (Author: christianmovi):
[~rbfblk]: Nope, it seems it was not fixed with the refactoring. For now, changing the typecast to a signed int would already be a big improvement.

> Setting TTL to Integer.MAX causes columns to not be persisted.
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-4771
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4771
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.12
>            Reporter: Todd Nine
>            Assignee: Dave Brosius
>            Priority: Major
>             Fix For: 1.1.6
>
>         Attachments: 4771.txt, 4771_b.txt
>
> When inserting columns via batch mutation, we have an edge case where columns
> will be set to Integer.MAX. When setting the column expiration time to
> Integer.MAX, the columns do not appear to be persisted.
> Fails:
> Integer.MAX_VALUE
> Integer.MAX_VALUE/2
> Works:
> Integer.MAX_VALUE/3
[jira] [Commented] (CASSANDRA-4771) Setting TTL to Integer.MAX causes columns to not be persisted.
[ https://issues.apache.org/jira/browse/CASSANDRA-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16337826#comment-16337826 ]

Christian Spriegel commented on CASSANDRA-4771:
-----------------------------------------------

[~rbfblk]: Nope, it seems it was not fixed with the refactoring. For now, changing the typecast to a signed int would already be a big improvement.
[jira] [Comment Edited] (CASSANDRA-4771) Setting TTL to Integer.MAX causes columns to not be persisted.
[ https://issues.apache.org/jira/browse/CASSANDRA-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16337586#comment-16337586 ]

Christian Spriegel edited comment on CASSANDRA-4771 at 1/24/18 1:37 PM:
------------------------------------------------------------------------

Can't we get another 68 years if we cast to long instead of int?

Class SerializationHeader:
{code:java}
public int readLocalDeletionTime(DataInputPlus in) throws IOException
{
    return (int)in.readUnsignedVInt() + stats.minLocalDeletionTime;
}
{code}

was (Author: christianmovi):
Can't we get another 68 years if we cast to long instead of int?

Class SerializationHeader:
{code:java}
public int readLocalDeletionTime(DataInputPlus in) throws IOException
{
    return (int)in.readUnsignedVInt() + stats.minLocalDeletionTime;
}
{code}
[jira] [Commented] (CASSANDRA-4771) Setting TTL to Integer.MAX causes columns to not be persisted.
[ https://issues.apache.org/jira/browse/CASSANDRA-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16337586#comment-16337586 ]

Christian Spriegel commented on CASSANDRA-4771:
-----------------------------------------------

Can't we get another 68 years if we cast to long instead of int?

Class SerializationHeader:
{code:java}
public int readLocalDeletionTime(DataInputPlus in) throws IOException
{
    return (int)in.readUnsignedVInt() + stats.minLocalDeletionTime;
}
{code}
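The extra 68 years come from plain two's-complement arithmetic: if the stored delta is first truncated to a signed int, values past 2^31-1 wrap negative before the offset is added, whereas widening to long first preserves them. A standalone sketch of just that arithmetic (illustrative names only, not the real SerializationHeader):

```java
public class DeletionTimeCast {
    // delta: the unsigned varint read from disk; min: the per-SSTable minimum offset.
    static long asInt(long delta, int min)  { return (int) delta + min; } // current: truncate first
    static long asLong(long delta, int min) { return delta + min; }       // suggested: widen first

    public static void main(String[] args) {
        long delta = 2_500_000_000L;  // > Integer.MAX_VALUE, but legal as an unsigned value
        int min = 0;
        System.out.println(asInt(delta, min));   // -1794967296: wrapped negative
        System.out.println(asLong(delta, min));  // 2500000000: preserved
    }
}
```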
[jira] [Comment Edited] (CASSANDRA-4771) Setting TTL to Integer.MAX causes columns to not be persisted.
[ https://issues.apache.org/jira/browse/CASSANDRA-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16337548#comment-16337548 ]

Christian Spriegel edited comment on CASSANDRA-4771 at 1/24/18 12:57 PM:
-------------------------------------------------------------------------

[~rbfblk]: You are a smart guy. It indeed broke 4 years later :)

Edit: I will buy you a beer, should we ever meet ;)

was (Author: christianmovi):
[~rbfblk]: You are a smart guy. It indeed broke 4 years later :)
[jira] [Reopened] (CASSANDRA-4771) Setting TTL to Integer.MAX causes columns to not be persisted.
[ https://issues.apache.org/jira/browse/CASSANDRA-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Spriegel reopened CASSANDRA-4771:
-------------------------------------------
[jira] [Commented] (CASSANDRA-4771) Setting TTL to Integer.MAX causes columns to not be persisted.
[ https://issues.apache.org/jira/browse/CASSANDRA-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16337548#comment-16337548 ]

Christian Spriegel commented on CASSANDRA-4771:
-----------------------------------------------

[~rbfblk]: You are a smart guy. It indeed broke 4 years later :)
[jira] [Comment Edited] (CASSANDRA-7868) Sporadic CL switch from LOCAL_QUORUM to ALL
[ https://issues.apache.org/jira/browse/CASSANDRA-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107084#comment-16107084 ] Christian Spriegel edited comment on CASSANDRA-7868 at 8/3/17 2:48 PM: --- [~brandon.williams]: Sorry to warm up this old ticket, but we are having the same issue in C* 3.0.13. Are you sure this is so harmless? The ReadTimeoutException is being thrown on the client side. I would expect a failing read-repair not to throw an exception to the client. Is my expectation incorrect? Edit: It seems that StorageProxy.SinglePartitionReadLifecycle.awaitResultsAndRetryOnDigestMismatch() is the culprit. It does indeed perform a blocking repair. I assume the blocking is necessary, but does it really have to be at CL.ALL? Edit 2: I think I understand now why this is an issue: due to speculative retry, contactedReplicas may contain more nodes than the query's CL expects. A digest mismatch will then cause a CL.ALL query on all the contacted nodes (including the ones added by speculative retry). I think this read-repair code needs to be improved to honor the query CL when speculative retry was performed for the query.
> Sporadic CL switch from LOCAL_QUORUM to ALL > --- > > Key: CASSANDRA-7868 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7868 > Project: Cassandra > Issue Type: Bug > Environment: Client: cassandra-java-driver 2.0.4 > Server: 2.0.9 >Reporter: Dmitry Schitinin > > Hi! > We have a keyspace described as > {code} > CREATE KEYSPACE subscriptions WITH replication = { > 'class': 'NetworkTopologyStrategy', > 'FOL': '3', > 'SAS': '3', > 'AMS': '0', > 'IVA': '3', > 'UGR': '0' > } AND durable_writes = 'false'; > {code} > There is a simple table > {code} > CREATE TABLE processed_documents ( > id text, > PRIMARY KEY ((id)) > ) WITH > bloom_filter_fp_chance=0.01 AND > caching='KEYS_ONLY' AND > comment='' AND > dclocal_read_repair_chance=0.00 AND > gc_grace_seconds=864000 AND > index_interval=128 AND > read_repair_chance=0.10 AND > replicate_on_write='true' AND > populate_io_cache_on_flush='false' AND > default_time_to_live=0 AND > speculative_retry='99.0PERCENTILE' AND > memtable_flush_period_in_ms=0 AND > compaction={'class': 'SizeTieredCompactionStrategy'} AND > compression={'sstable_compression': 'LZ4Compressor'}; > {code} > in the keyspace. > On the client we execute the following prepared statement: > {code} > session.prepare( > "SELECT id FROM processed_documents WHERE id IN :ids" > ).setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM) > {code} > The Cassandra session used has these main properties: > * Load balancing policy - DCAwareRoundRobinPolicy(localDc, > usedHostPerRemoteDc = 3, allowRemoteDcForLocalConsistencyLevel = true) > * Retry policy - DefaultRetryPolicy > * Query options - QueryOptions with consistency level set to > ConsistencyLevel.LOCAL_QUORUM > Our problem is as follows.
> Since some moment there are next errors in the client application log: > {code} > com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout > during read query at consistency ALL (9 responses were required but only 8 > replica responded) > at > com.datastax.driver.core.exceptions.ReadTimeoutException.copy(ReadTimeoutException.java:69) > ~[cassandra-driver-core-2.0.2.jar:na] > at > com.datastax.driver.core.Responses$Error.asException(Responses.java:94) > ~[cassandra-driver-core-2.0.2.jar:na] > at > com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:108) > ~[cassandra-driver-core-2.0.2.jar:na] > at > com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:235) > ~[cassandra-driver-core-2.0.2.jar:na] > at > com.datastax.driver.core.RequestHandler.onSet(RequestHandler.java:379) > ~[cassandra-driver-core-2.0.2.jar:na] > at > com.datastax.driver.core.Connection$Dispatcher.messageReceived(Connection.java:571) > ~[cassandra-driver-core-2.0.2.jar:na] > at > org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) > ~[netty-3.9.0.Final.jar:na] > at >
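The "Edit 2" reasoning above reduces to simple replica accounting: a QUORUM read only needs a majority of replicas, but once speculative retry has added an extra replica to contactedReplicas, a digest mismatch makes the blocking read-repair wait on every contacted node. A toy model of that accounting; all names are invented and this is not the StorageProxy code:

```java
import java.util.Arrays;
import java.util.List;

// Toy model of the replica accounting behind the "CL switch to ALL" symptom.
// Names are invented for illustration; they do not match Cassandra internals.
public class ReadRepairBlockForSketch {
    /** Responses a QUORUM read must collect for a given replication factor. */
    static int quorumBlockFor(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /**
     * Responses the blocking read-repair waits for after a digest mismatch:
     * in the pre-4.0 code path, every replica that was contacted -- which,
     * with speculative retry, can be more than quorumBlockFor().
     */
    static int repairBlockFor(List<String> contactedReplicas) {
        return contactedReplicas.size();
    }

    public static void main(String[] args) {
        // node3 was only contacted because of speculative retry
        List<String> contacted = Arrays.asList("node1", "node2", "node3");
        System.out.println("QUORUM needs " + quorumBlockFor(3)
                + " responses, repair waits for " + repairBlockFor(contacted));
    }
}
```

This is why the client sees "9 responses were required but only 8 replica responded" on a LOCAL_QUORUM query: the repair handler's blockFor is the contacted-replica count, not the query's consistency level.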
[jira] [Comment Edited] (CASSANDRA-7868) Sporadic CL switch from LOCAL_QUORUM to ALL
[ https://issues.apache.org/jira/browse/CASSANDRA-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107084#comment-16107084 ] Christian Spriegel edited comment on CASSANDRA-7868 at 7/31/17 12:31 PM: - [~brandon.williams]: Sorry to warm up this old ticket, but we are having the same issue in C* 3.0.13. Are you sure this is so harmless? The ReadTimeoutException is being thrown on the client side. I would expect a failing read-repair not to throw an exception to the client. Is my expectation incorrect? Edit: It seems that StorageProxy.SinglePartitionReadLifecycle.awaitResultsAndRetryOnDigestMismatch() is the culprit. It does indeed perform a blocking repair. I assume the blocking is necessary, but does it really have to be at CL.ALL? > Sporadic CL switch from LOCAL_QUORUM to ALL > --- > > Key: CASSANDRA-7868 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7868 > Project: Cassandra > Issue Type: Bug > Environment: Client: cassandra-java-driver 2.0.4 > Server: 2.0.9 >Reporter: Dmitry Schitinin
[jira] [Commented] (CASSANDRA-7868) Sporadic CL switch from LOCAL_QUORUM to ALL
[ https://issues.apache.org/jira/browse/CASSANDRA-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107084#comment-16107084 ] Christian Spriegel commented on CASSANDRA-7868: --- [~brandon.williams]: Sorry to warm up this old ticket, but we are having the same issue in C* 3.0.13. Are you sure this is so harmless? The ReadTimeoutException is being thrown on the client side. I would expect a failing read-repair not to throw an exception to the client. Is my expectation incorrect? > Sporadic CL switch from LOCAL_QUORUM to ALL > --- > > Key: CASSANDRA-7868 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7868 > Project: Cassandra > Issue Type: Bug > Environment: Client: cassandra-java-driver 2.0.4 > Server: 2.0.9 >Reporter: Dmitry Schitinin
[jira] [Updated] (CASSANDRA-11887) Duplicate rows after a 2.2.5 to 3.0.4 migration
[ https://issues.apache.org/jira/browse/CASSANDRA-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Spriegel updated CASSANDRA-11887: --- Attachment: christianspriegel_schema.txt christianspriegel_query_trace.txt > Duplicate rows after a 2.2.5 to 3.0.4 migration > --- > > Key: CASSANDRA-11887 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11887 > Project: Cassandra > Issue Type: Bug >Reporter: Julien Anguenot >Priority: Blocker > Attachments: christianspriegel_query_trace.txt, > christianspriegel_schema.txt, > post_3.0.9_upgrade_sstabledump_showing_duplicate_row.txt > > > After migrating from 2.2.5 to 3.0.4, some tables seem to carry duplicate > primary keys. > Below an example. Note, repair / scrub of such table do not seem to fix nor > indicate any issues. > *Table definition*: > {code} > CREATE TABLE core.edge_ipsec_vpn_service ( > edge_uuid text PRIMARY KEY, > enabled boolean, > endpoints set, > tunnels set > ) WITH bloom_filter_fp_chance = 0.01 > AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'} > AND comment = '' > AND compaction = {'class': > 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', > 'max_threshold': '32', 'min_threshold': '4'} > AND compression = {'chunk_length_in_kb': '64', 'class': > 'org.apache.cassandra.io.compress.LZ4Compressor'} > AND crc_check_chance = 1.0 > AND dclocal_read_repair_chance = 0.1 > AND default_time_to_live = 0 > AND gc_grace_seconds = 864000 > AND max_index_interval = 2048 > AND memtable_flush_period_in_ms = 0 > AND min_index_interval = 128 > AND read_repair_chance = 0.0 > AND speculative_retry = '99PERCENTILE'; > {code} > *UDTs:* > {code} > CREATE TYPE core.edge_ipsec_vpn_endpoint ( > network text, > public_ip text > ); > CREATE TYPE core.edge_ipsec_vpn_tunnel ( > name text, > description text, > peer_ip_address text, > peer_id text, > local_ip_address text, > local_id text, > local_subnets frozen >, > peer_subnets frozen >, > shared_secret text, > 
shared_secret_encrypted boolean, > encryption_protocol text, > mtu int, > enabled boolean, > operational boolean, > error_details text, > vpn_peer frozen > ); > CREATE TYPE core.edge_ipsec_vpn_subnet ( > name text, > gateway text, > netmask text > ); > CREATE TYPE core.edge_ipsec_vpn_peer ( > type text, > id text, > name text, > vcd_url text, > vcd_org text, > vcd_username text > ); > {code} > sstabledump extract (IP addressees hidden as well as secrets) > {code} > [...] > { > "partition" : { > "key" : [ "84d567cc-0165-4e64-ab97-3a9d06370ba9" ], > "position" : 131146 > }, > "rows" : [ > { > "type" : "row", > "position" : 131236, > "liveness_info" : { "tstamp" : "2016-05-06T17:07:15.416003Z" }, > "cells" : [ > { "name" : "enabled", "value" : "true" }, > { "name" : "tunnels", "path" : [ > “XXX::1.2.3.4:1.2.3.4:1.2.3.4:1.2.3.4:XXX:XXX:false:AES256:1500:true:false::third > party\\:1.2.3.4\\:\\:\\:\\:” ], "value" : "" } > ] > }, > { > "type" : "row", > "position" : 131597, > "cells" : [ > { "name" : "endpoints", "path" : [ “XXX” ], "value" : "", "tstamp" > : "2016-03-29T08:05:38.297015Z" }, > { "name" : "tunnels", "path" : [ > “XXX::1.2.3.4:1.2.3.4:1.2.3.4:1.2.3.4:XXX:XXX:false:AES256:1500:true:true::third > party\\:1.2.3.4\\:\\:\\:\\:” ], "value" : "", "tstamp" : > "2016-03-29T08:05:38.297015Z" }, > { "name" : "tunnels", "path" : [ > “XXX::1.2.3.4:1.2.3.4:1.2.3.4:1.2.3.4:XXX:XXX:false:AES256:1500:true:false::third > party\\:1.2.3.4\\:\\:\\:\\:" ], "value" : "", "tstamp" : > "2016-03-14T18:05:07.262001Z" }, > { "name" : "tunnels", "path" : [ > “XXX::1.2.3.4:1.2.3.4:1.2.3.4:1.2.3.4XXX:XXX:false:AES256:1500:true:true::third > party\\:1.2.3.4\\:\\:\\:\\:" ], "value" : "", "tstamp" : > "2016-03-29T08:05:38.297015Z" } > ] > }, > { > "type" : "row", > "position" : 133644, > "cells" : [ > { "name" : "tunnels", "path" : [ > “XXX::1.2.3.4:1.2.3.4:1.2.3.4:1.2.3.4:XXX:XXX:false:AES256:1500:true:true::third > party\\:1.2.3.4\\:\\:\\:\\:" ], "value" : "", "tstamp" : > 
"2016-03-29T07:05:27.213013Z" }, > { "name" : "tunnels", "path" : [ > “XXX::1.2.3.4.7:1.2.3.4:1.2.3.4:1.2.3.4:XXX:XXX:false:AES256:1500:true:true::third > party\\:1.2.3.4\\:\\:\\:\\:" ], "value" : "", "tstamp" : > "2016-03-29T07:05:27.213013Z" } > ] > } > ] > }, > [...] > [...] > {code} -- This message was sent by Atlassian
[jira] [Commented] (CASSANDRA-11887) Duplicate rows after a 2.2.5 to 3.0.4 migration
[ https://issues.apache.org/jira/browse/CASSANDRA-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15821732#comment-15821732 ] Christian Spriegel commented on CASSANDRA-11887: We got the same issue when upgrading from 2.2.x to 3.0.10. I think what is special about this table is the primary key definition 'PRIMARY KEY ("id")'. Perhaps this issue is caused by not having a "column-name"? I attached a query trace and the schema as files. > Duplicate rows after a 2.2.5 to 3.0.4 migration > --- > > Key: CASSANDRA-11887 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11887 > Project: Cassandra > Issue Type: Bug >Reporter: Julien Anguenot >Priority: Blocker
[jira] [Commented] (CASSANDRA-13086) CAS resultset sometimes does not contain value column even though wasApplied is false
[ https://issues.apache.org/jira/browse/CASSANDRA-13086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15811487#comment-15811487 ] Christian Spriegel commented on CASSANDRA-13086: [~ifesdjeen]: Then this would mean the java-driver needs some kind of hasColumn() method in the Row, so that the application can properly check for the column. It would be a driver issue then. > CAS resultset sometimes does not contain value column even though wasApplied > is false > - > > Key: CASSANDRA-13086 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13086 > Project: Cassandra > Issue Type: Bug >Reporter: Christian Spriegel >Priority: Minor > > Every now and then I see a ResultSet for one of my CAS queries that contains > wasApplied=false but does not contain my value column. > I just now found another occurrence, which causes the following exception in > the driver: > {code} > ... > Caused by: com.mycompany.MyDataaccessException: checkLock(ResultSet[ > exhausted: true, Columns[[applied](boolean)]]) > at com.mycompany.MyDAO._checkLock(MyDAO.java:408) > at com.mycompany.MyDAO._releaseLock(MyDAO.java:314) > ... 16 more > Caused by: java.lang.IllegalArgumentException: value is not a column defined > in this metadata > at > com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:266) > at > com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:272) > at > com.datastax.driver.core.ArrayBackedRow.getIndexOf(ArrayBackedRow.java:81) > at > com.datastax.driver.core.AbstractGettableData.getBytes(AbstractGettableData.java:151) > at com.mycompany.MyDAO._checkLock(MyDAO.java:383) > ... 17 more > {code} > The query the application was doing: > delete from "Lock" where lockname=:lockname and id=:id if value=:value; > I did some debugging recently and was able to track these ResultSets to > StorageProxy.cas() to the "CAS precondition does not match current values {}" > return statement.
> I saw this happening with Cassandra 3.0.10 and earlier versions.
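A defensive pattern on the application side is to check for the column's presence before reading it. The sketch below models the check with a plain Map standing in for the driver's Row so it stays self-contained; with the DataStax Java driver the analogous check would be row.getColumnDefinitions().contains("value"), if that method is available in the driver version in use:

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Defensive read of a CAS result row that may omit the conditioned column.
// A Map stands in for the driver's Row; all names here are illustrative.
public class CasResultSketch {
    /**
     * Returns the prior "value" column if the server included it. An absent
     * key models the server omitting the column entirely (no live row matched
     * the condition), as opposed to including it with a null value.
     */
    static Optional<ByteBuffer> priorValue(Map<String, ByteBuffer> row) {
        if (!row.containsKey("value"))
            return Optional.empty(); // column omitted: nothing to compare against
        return Optional.ofNullable(row.get("value"));
    }

    public static void main(String[] args) {
        Map<String, ByteBuffer> row = new HashMap<>();
        System.out.println("column present: " + priorValue(row).isPresent());
    }
}
```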
[jira] [Commented] (CASSANDRA-13086) CAS resultset sometimes does not contain value column even though wasApplied is false
[ https://issues.apache.org/jira/browse/CASSANDRA-13086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15811213#comment-15811213 ] Christian Spriegel commented on CASSANDRA-13086: [~ifesdjeen]: I did not know that the value can be null. I think this is what is happening in my case: # My application tries to delete with a condition, which fails due to a WriteTimeoutException # My application retries the same delete operation # Cassandra returns wasApplied=false and no value column, because the previous delete was already successful I find that behaviour a bit strange, as the datastax-java-driver does not give me any method to check whether a column exists or not. It does give me a way to check for null values, though. My question now would be: Wouldn't it be better if Cassandra returned a null value instead of omitting the column? > CAS resultset sometimes does not contain value column even though wasApplied > is false > - > > Key: CASSANDRA-13086 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13086 > Project: Cassandra > Issue Type: Bug >Reporter: Christian Spriegel >Priority: Minor
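The three-step sequence above can be replayed as a toy state machine: the first conditional delete succeeds server-side but (in the reported scenario) times out client-side, so the retry finds no row and gets wasApplied=false with the value column omitted. All names are invented; this is not Cassandra's Paxos/CAS code:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Toy replay of the retried conditional delete. Invented names throughout.
public class CasRetrySketch {
    static final class CasResult {
        final boolean wasApplied;
        final Optional<String> value; // empty models the column being omitted
        CasResult(boolean wasApplied, Optional<String> value) {
            this.wasApplied = wasApplied;
            this.value = value;
        }
    }

    final Map<String, String> lockTable = new HashMap<>();

    /** Models DELETE ... IF value = :expected against a single-column lock row. */
    CasResult deleteIfValue(String key, String expected) {
        String current = lockTable.get(key);
        if (current == null)
            return new CasResult(false, Optional.empty());     // no row: applied=false, column omitted
        if (!current.equals(expected))
            return new CasResult(false, Optional.of(current)); // mismatch: column present
        lockTable.remove(key);
        return new CasResult(true, Optional.empty());          // condition matched, row deleted
    }

    public static void main(String[] args) {
        CasRetrySketch db = new CasRetrySketch();
        db.lockTable.put("lock1", "v1");
        db.deleteIfValue("lock1", "v1");   // succeeds server-side; imagine the ack timed out
        CasResult retry = db.deleteIfValue("lock1", "v1");
        System.out.println("retry applied=" + retry.wasApplied
                + " columnPresent=" + retry.value.isPresent());
    }
}
```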
[jira] [Updated] (CASSANDRA-13086) CAS resultset sometimes does not contain value column even though wasApplied is false
[ https://issues.apache.org/jira/browse/CASSANDRA-13086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Spriegel updated CASSANDRA-13086: --- Description: Every now and then I see a ResultSet for one of my CAS queries that contain wasApplied=false, but does not contain my value column. I just now found another occurrence, which causes the following exception in the driver: {code} ... Caused by: com.mycompany.MyDataaccessException: checkLock(ResultSet[ exhausted: true, Columns[[applied](boolean)]]) at com.mycompany.MyDAO._checkLock(MyDAO.java:408) at com.mycompany.MyDAO._releaseLock(MyDAO.java:314) ... 16 more Caused by: java.lang.IllegalArgumentException: value is not a column defined in this metadata at com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:266) at com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:272) at com.datastax.driver.core.ArrayBackedRow.getIndexOf(ArrayBackedRow.java:81) at com.datastax.driver.core.AbstractGettableData.getBytes(AbstractGettableData.java:151) at com.mycompany.MyDAO._checkLock(MyDAO.java:383) ... 17 more {code} The query the application was doing: delete from "Lock" where lockname=:lockname and id=:id if value=:value; I did some debugging recently and was able to track these ResultSets to StorageProxy.cas() to the "CAS precondition does not match current values {}" return statement. I saw this happening with Cassandra 3.0.10 and earlier versions. was: Every now and then I see a ResultSet for one of my CAS queries that contain wasApplied=false, but does not contain my value column. I just now found another occurrence, which causes the following exception in the driver: {code} ... Caused by: com.mycompany.MyDataaccessException: checkLock(ResultSet[ exhausted: true, Columns[[applied](boolean)]]) at com.mycompany.MyDAO._checkLock(MyDAO.java:408) at com.mycompany.MyDAO._releaseLock(MyDAO.java:314) ... 
16 more Caused by: java.lang.IllegalArgumentException: value is not a column defined in this metadata at com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:266) at com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:272) at com.datastax.driver.core.ArrayBackedRow.getIndexOf(ArrayBackedRow.java:81) at com.datastax.driver.core.AbstractGettableData.getBytes(AbstractGettableData.java:151) at com.mycompany.MyDAO._checkLock(MyDAO.java:383) ... 17 more {code} The query the application was doing: delete from "Lock" where lockname=:lockname and id=:id if value=:value; I did some debugging recently and was able to track these ResultSets to StorageProxy.cas() to the "CAS precondition does not match current values {}" return statement. I saw this happening with Cassandra 3.0.10. > CAS resultset sometimes does not contain value column even though wasApplied > is false > - > > Key: CASSANDRA-13086 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13086 > Project: Cassandra > Issue Type: Bug >Reporter: Christian Spriegel >Priority: Minor
[jira] [Updated] (CASSANDRA-13086) CAS resultset sometimes does not contain value column even though wasApplied is false
[ https://issues.apache.org/jira/browse/CASSANDRA-13086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Spriegel updated CASSANDRA-13086: --- Description: Every now and then I see a ResultSet for one of my CAS queries that contain wasApplied=false, but does not contain my value column. I just now found another occurrence, which causes the following exception in the driver: {code} ... Caused by: com.mycompany.MyDataaccessException: checkLock(ResultSet[ exhausted: true, Columns[[applied](boolean)]]) at com.mycompany.MyDAO._checkLock(MyDAO.java:408) at com.mycompany.MyDAO._releaseLock(MyDAO.java:314) ... 16 more Caused by: java.lang.IllegalArgumentException: value is not a column defined in this metadata at com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:266) at com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:272) at com.datastax.driver.core.ArrayBackedRow.getIndexOf(ArrayBackedRow.java:81) at com.datastax.driver.core.AbstractGettableData.getBytes(AbstractGettableData.java:151) at com.mycompany.MyDAO._checkLock(MyDAO.java:383) ... 17 more {code} The query the application was doing: delete from "Lock" where lockname=:lockname and id=:id if value=:value; I did some debugging recently and was able to track these ResultSets to StorageProxy.cas() to the "CAS precondition does not match current values {}" return statement. I saw this happening with Cassandra 3.0.10. was: Every now and then I see a ResultSet for one of my CAS queries that contain wasApplied=false, but does not contain my value column. I just now found another occurrence, which causes the following exception in the driver: {code} ... Caused by: com.mycompany.MyDataaccessException: checkLock(ResultSet[ exhausted: true, Columns[[applied](boolean)]]) at com.mycompany.MyDAO._checkLock(MyDAO.java:408) at com.mycompany.MyDAO._releaseLock(MyDAO.java:314) ... 
16 more Caused by: java.lang.IllegalArgumentException: value is not a column defined in this metadata at com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:266) at com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:272) at com.datastax.driver.core.ArrayBackedRow.getIndexOf(ArrayBackedRow.java:81) at com.datastax.driver.core.AbstractGettableData.getBytes(AbstractGettableData.java:151) at com.mycompany.MyDAO._checkLock(MyDAO.java:383) ... 17 more {code} The query the application was doing: delete from "Lock" where lockname=:lockname and id=:id if value=:value; I did some debugging recently and was able to track these ResultSets to StorageProxy.cas() to the "CAS precondition does not match current values {}" return statement. > CAS resultset sometimes does not contain value column even though wasApplied > is false > - > > Key: CASSANDRA-13086 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13086 > Project: Cassandra > Issue Type: Bug >Reporter: Christian Spriegel >Priority: Minor > > Every now and then I see a ResultSet for one of my CAS queries that contain > wasApplied=false, but does not contain my value column. > I just now found another occurrence, which causes the following exception in > the driver: > {code} > ... > Caused by: com.mycompany.MyDataaccessException: checkLock(ResultSet[ > exhausted: true, Columns[[applied](boolean)]]) > at com.mycompany.MyDAO._checkLock(MyDAO.java:408) > at com.mycompany.MyDAO._releaseLock(MyDAO.java:314) > ... 
16 more > Caused by: java.lang.IllegalArgumentException: value is not a column defined > in this metadata > at > com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:266) > at > com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:272) > at > com.datastax.driver.core.ArrayBackedRow.getIndexOf(ArrayBackedRow.java:81) > at > com.datastax.driver.core.AbstractGettableData.getBytes(AbstractGettableData.java:151) > at com.mycompany.MyDAO._checkLock(MyDAO.java:383) > ... 17 more > {code} > The query the application was doing: > delete from "Lock" where lockname=:lockname and id=:id if value=:value; > I did some debugging recently and was able to track these ResultSets to > StorageProxy.cas() to the "CAS precondition does not match current values {}" > return statement. > I saw this happening with Cassandra 3.0.10. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-13086) CAS resultset sometimes does not contain value column even though wasApplied is false
[ https://issues.apache.org/jira/browse/CASSANDRA-13086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Spriegel updated CASSANDRA-13086: --- Description: Every now and then I see a ResultSet for one of my CAS queries that contain wasApplied=false, but does not contain my value column. I just now found another occurrence, which causes the following exception in the driver: {code} ... Caused by: com.mycompany.MyDataaccessException: checkLock(ResultSet[ exhausted: true, Columns[[applied](boolean)]]) at com.mycompany.MyDAO._checkLock(MyDAO.java:408) at com.mycompany.MyDAO._releaseLock(MyDAO.java:314) ... 16 more Caused by: java.lang.IllegalArgumentException: value is not a column defined in this metadata at com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:266) at com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:272) at com.datastax.driver.core.ArrayBackedRow.getIndexOf(ArrayBackedRow.java:81) at com.datastax.driver.core.AbstractGettableData.getBytes(AbstractGettableData.java:151) at com.mycompany.MyDAO._checkLock(MyDAO.java:383) ... 17 more {code} The query the application was doing: delete from "Lock" where lockname=:lockname and id=:id if value=:value; I did some debugging recently and was able to track these ResultSets to StorageProxy.cas() to the "CAS precondition does not match current values {}" return statement. was: Every now and then I see a ResultSet for one of my CAS queries that say wasApplied=false, but does not contain my value column. I just now found another occurrence, which causes the following exception in the driver: {code} ... Caused by: com.mycompany.MyDataaccessException: checkLock(ResultSet[ exhausted: true, Columns[[applied](boolean)]]) at com.mycompany.MyDAO._checkLock(MyDAO.java:408) at com.mycompany.MyDAO._releaseLock(MyDAO.java:314) ... 
16 more Caused by: java.lang.IllegalArgumentException: value is not a column defined in this metadata at com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:266) at com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:272) at com.datastax.driver.core.ArrayBackedRow.getIndexOf(ArrayBackedRow.java:81) at com.datastax.driver.core.AbstractGettableData.getBytes(AbstractGettableData.java:151) at com.mycompany.MyDAO._checkLock(MyDAO.java:383) ... 17 more {code} The query the application was doing: delete from "Lock" where lockname=:lockname and id=:id if value=:value; I did some debugging recently and was able to track these ResultSets to StorageProxy.cas() to the "CAS precondition does not match current values {}" return statement. > CAS resultset sometimes does not contain value column even though wasApplied > is false > - > > Key: CASSANDRA-13086 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13086 > Project: Cassandra > Issue Type: Bug >Reporter: Christian Spriegel >Priority: Minor > > Every now and then I see a ResultSet for one of my CAS queries that contain > wasApplied=false, but does not contain my value column. > I just now found another occurrence, which causes the following exception in > the driver: > {code} > ... > Caused by: com.mycompany.MyDataaccessException: checkLock(ResultSet[ > exhausted: true, Columns[[applied](boolean)]]) > at com.mycompany.MyDAO._checkLock(MyDAO.java:408) > at com.mycompany.MyDAO._releaseLock(MyDAO.java:314) > ... 
16 more > Caused by: java.lang.IllegalArgumentException: value is not a column defined > in this metadata > at > com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:266) > at > com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:272) > at > com.datastax.driver.core.ArrayBackedRow.getIndexOf(ArrayBackedRow.java:81) > at > com.datastax.driver.core.AbstractGettableData.getBytes(AbstractGettableData.java:151) > at com.mycompany.MyDAO._checkLock(MyDAO.java:383) > ... 17 more > {code} > The query the application was doing: > delete from "Lock" where lockname=:lockname and id=:id if value=:value; > I did some debugging recently and was able to track these ResultSets to > StorageProxy.cas() to the "CAS precondition does not match current values {}" > return statement. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-13086) CAS resultset sometimes does not contain value column even though wasApplied is false
Christian Spriegel created CASSANDRA-13086: -- Summary: CAS resultset sometimes does not contain value column even though wasApplied is false Key: CASSANDRA-13086 URL: https://issues.apache.org/jira/browse/CASSANDRA-13086 Project: Cassandra Issue Type: Bug Reporter: Christian Spriegel Priority: Minor Every now and then I see a ResultSet for one of my CAS queries that say wasApplied=false, but does not contain my value column. I just now found another occurrence, which causes the following exception in the driver: {code} ... Caused by: com.mycompany.MyDataaccessException: checkLock(ResultSet[ exhausted: true, Columns[[applied](boolean)]]) at com.mycompany.MyDAO._checkLock(MyDAO.java:408) at com.mycompany.MyDAO._releaseLock(MyDAO.java:314) ... 16 more Caused by: java.lang.IllegalArgumentException: value is not a column defined in this metadata at com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:266) at com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:272) at com.datastax.driver.core.ArrayBackedRow.getIndexOf(ArrayBackedRow.java:81) at com.datastax.driver.core.AbstractGettableData.getBytes(AbstractGettableData.java:151) at com.mycompany.MyDAO._checkLock(MyDAO.java:383) ... 17 more {code} The query the application was doing: delete from "Lock" where lockname=:lockname and id=:id if value=:value; I did some debugging recently and was able to track these ResultSets to StorageProxy.cas() to the "CAS precondition does not match current values {}" return statement. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12143) NPE when trying to remove purgable tombstones from result
[ https://issues.apache.org/jira/browse/CASSANDRA-12143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365833#comment-15365833 ] Christian Spriegel commented on CASSANDRA-12143: Is the fixVersion correct? I can't find this ticket in the Changes.txt. > NPE when trying to remove purgable tombstones from result > - > > Key: CASSANDRA-12143 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12143 > Project: Cassandra > Issue Type: Bug >Reporter: mck >Assignee: mck > Fix For: 2.2.7 > > Attachments: 12143-2.2.txt > > > A cluster running 2.2.6 started throwing NPEs. > (500K exceptions on a node was seen.) > {noformat}WARN … AbstractLocalAwareExecutorService.java:169 - Uncaught > exception on thread Thread[SharedPool-Worker-5,5,main]: {} > java.lang.NullPointerException: null{noformat} > Bisecting this highlighted commit d3db33c008542c7044f3ed8c19f3a45679fcf52e as > the culprit, which was a fix for CASSANDRA-11427. > This commit added a line to "remove purgable tombstones from result" but > failed to null check the {{data}} variable first. This variable comes from > {{Row.cf}} which is permitted to be null where the CFS has no data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
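The missing null check described above is a common shape of bug: a value that is legitimately null in one code path gets dereferenced by a newly added line. The sketch below illustrates the pattern of the fix with simplified, illustrative types — `RowData` stands in for `Row.cf`, which Cassandra permits to be null when the column family store holds no data; it is not Cassandra's actual code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Illustrative sketch of the CASSANDRA-12143 fix: null-check the row's data
// before trying to remove purgeable tombstones from it.
final class PurgeSketch {

    static final class RowData {
        final List<Long> tombstoneDeletionTimes = new ArrayList<>();
    }

    // The buggy version dereferenced data unconditionally and threw NPE
    // whenever the row carried no column family data.
    static int purgeTombstones(RowData data, long gcBefore) {
        if (data == null) {
            return 0; // nothing to purge for an empty row
        }
        int purged = 0;
        for (Iterator<Long> it = data.tombstoneDeletionTimes.iterator(); it.hasNext(); ) {
            if (it.next() < gcBefore) {
                it.remove(); // tombstone is older than gcBefore: purgeable
                purged++;
            }
        }
        return purged;
    }
}
```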
[jira] [Commented] (CASSANDRA-11315) Upgrade from 2.2.6 to 3.0.5 Fails with AssertionError
[ https://issues.apache.org/jira/browse/CASSANDRA-11315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329614#comment-15329614 ] Christian Spriegel commented on CASSANDRA-11315: Hi [~iamaleksey], do you still have this ticket on your radar? :-) > Upgrade from 2.2.6 to 3.0.5 Fails with AssertionError > - > > Key: CASSANDRA-11315 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11315 > Project: Cassandra > Issue Type: Bug > Environment: Ubuntu 14.04, Oracle Java 8, Apache Cassandra 2.2.5 -> > 3.0.3, Apache Cassandra 2.2.6 -> 3.0.5 >Reporter: Dominik Keil >Assignee: Aleksey Yeschenko >Priority: Blocker > Fix For: 3.0.x, 3.x > > > Hi, > when trying to upgrade our development cluster from C* 2.2.5 to 3.0.3 > Cassandra fails during startup. > Here's the relevant log snippet: > {noformat} > [...] > INFO [main] 2016-03-08 11:42:01,291 ColumnFamilyStore.java:381 - > Initializing system.schema_triggers > INFO [main] 2016-03-08 11:42:01,302 ColumnFamilyStore.java:381 - > Initializing system.schema_usertypes > INFO [main] 2016-03-08 11:42:01,313 ColumnFamilyStore.java:381 - > Initializing system.schema_functions > INFO [main] 2016-03-08 11:42:01,324 ColumnFamilyStore.java:381 - > Initializing system.schema_aggregates > INFO [main] 2016-03-08 11:42:01,576 SystemKeyspace.java:1284 - Detected > version upgrade from 2.2.5 to 3.0.3, snapshotting system keyspace > WARN [main] 2016-03-08 11:42:01,911 CompressionParams.java:382 - The > sstable_compression option has been deprecated. You should use class instead > WARN [main] 2016-03-08 11:42:01,959 CompressionParams.java:333 - The > chunk_length_kb option has been deprecated. 
You should use chunk_length_in_kb > instead > ERROR [main] 2016-03-08 11:42:02,638 CassandraDaemon.java:692 - Exception > encountered during startup > java.lang.AssertionError: null > at > org.apache.cassandra.db.CompactTables.getCompactValueColumn(CompactTables.java:90) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.config.CFMetaData.rebuild(CFMetaData.java:315) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at org.apache.cassandra.config.CFMetaData.(CFMetaData.java:291) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at org.apache.cassandra.config.CFMetaData.create(CFMetaData.java:367) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableMetadata(LegacySchemaMigrator.java:337) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readTableMetadata(LegacySchemaMigrator.java:273) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readTable(LegacySchemaMigrator.java:244) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readTables$227(LegacySchemaMigrator.java:237) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_74] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readTables(LegacySchemaMigrator.java:237) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readKeyspace(LegacySchemaMigrator.java:186) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readSchema$224(LegacySchemaMigrator.java:177) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_74] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readSchema(LegacySchemaMigrator.java:177) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.migrate(LegacySchemaMigrator.java:77) > 
~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:223) > [apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:551) > [apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:679) > [apache-cassandra-3.0.3.jar:3.0.3] > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11315) Upgrade from 2.2.6 to 3.0.5 Fails with AssertionError
[ https://issues.apache.org/jira/browse/CASSANDRA-11315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Spriegel updated CASSANDRA-11315: --- Reproduced In: 3.0.7, 3.0.5, 3.0.4, 3.0.3 (was: 3.0.3, 3.0.4, 3.0.5) > Upgrade from 2.2.6 to 3.0.5 Fails with AssertionError > - > > Key: CASSANDRA-11315 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11315 > Project: Cassandra > Issue Type: Bug > Environment: Ubuntu 14.04, Oracle Java 8, Apache Cassandra 2.2.5 -> > 3.0.3, Apache Cassandra 2.2.6 -> 3.0.5 >Reporter: Dominik Keil >Assignee: Aleksey Yeschenko >Priority: Blocker > Fix For: 3.0.x, 3.x > > > Hi, > when trying to upgrade our development cluster from C* 2.2.5 to 3.0.3 > Cassandra fails during startup. > Here's the relevant log snippet: > {noformat} > [...] > INFO [main] 2016-03-08 11:42:01,291 ColumnFamilyStore.java:381 - > Initializing system.schema_triggers > INFO [main] 2016-03-08 11:42:01,302 ColumnFamilyStore.java:381 - > Initializing system.schema_usertypes > INFO [main] 2016-03-08 11:42:01,313 ColumnFamilyStore.java:381 - > Initializing system.schema_functions > INFO [main] 2016-03-08 11:42:01,324 ColumnFamilyStore.java:381 - > Initializing system.schema_aggregates > INFO [main] 2016-03-08 11:42:01,576 SystemKeyspace.java:1284 - Detected > version upgrade from 2.2.5 to 3.0.3, snapshotting system keyspace > WARN [main] 2016-03-08 11:42:01,911 CompressionParams.java:382 - The > sstable_compression option has been deprecated. You should use class instead > WARN [main] 2016-03-08 11:42:01,959 CompressionParams.java:333 - The > chunk_length_kb option has been deprecated. 
You should use chunk_length_in_kb > instead > ERROR [main] 2016-03-08 11:42:02,638 CassandraDaemon.java:692 - Exception > encountered during startup > java.lang.AssertionError: null > at > org.apache.cassandra.db.CompactTables.getCompactValueColumn(CompactTables.java:90) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.config.CFMetaData.rebuild(CFMetaData.java:315) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at org.apache.cassandra.config.CFMetaData.(CFMetaData.java:291) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at org.apache.cassandra.config.CFMetaData.create(CFMetaData.java:367) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableMetadata(LegacySchemaMigrator.java:337) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readTableMetadata(LegacySchemaMigrator.java:273) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readTable(LegacySchemaMigrator.java:244) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readTables$227(LegacySchemaMigrator.java:237) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_74] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readTables(LegacySchemaMigrator.java:237) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readKeyspace(LegacySchemaMigrator.java:186) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readSchema$224(LegacySchemaMigrator.java:177) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_74] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readSchema(LegacySchemaMigrator.java:177) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.migrate(LegacySchemaMigrator.java:77) > 
~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:223) > [apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:551) > [apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:679) > [apache-cassandra-3.0.3.jar:3.0.3] > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11315) Upgrade from 2.2.5 to 3.0.3 Fails with AssertionError
[ https://issues.apache.org/jira/browse/CASSANDRA-11315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209006#comment-15209006 ] Christian Spriegel commented on CASSANDRA-11315: [~iamaleksey]: How can we explicitly disable dense manually? I would simply rename the columns using CQLSH, to give them a CQL3 schema. btw: I assume MDS_0 has the same issue? > Upgrade from 2.2.5 to 3.0.3 Fails with AssertionError > - > > Key: CASSANDRA-11315 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11315 > Project: Cassandra > Issue Type: Bug > Environment: Ubuntu 14.04, Oracle Java 8, Apache Cassandra 2.2.5 -> > 3.0.3 >Reporter: Dominik Keil >Assignee: Aleksey Yeschenko >Priority: Blocker > Fix For: 3.0.x, 3.x > > > Hi, > when trying to upgrade our development cluster from C* 2.2.5 to 3.0.3 > Cassandra fails during startup. > Here's the relevant log snippet: > {noformat} > [...] > INFO [main] 2016-03-08 11:42:01,291 ColumnFamilyStore.java:381 - > Initializing system.schema_triggers > INFO [main] 2016-03-08 11:42:01,302 ColumnFamilyStore.java:381 - > Initializing system.schema_usertypes > INFO [main] 2016-03-08 11:42:01,313 ColumnFamilyStore.java:381 - > Initializing system.schema_functions > INFO [main] 2016-03-08 11:42:01,324 ColumnFamilyStore.java:381 - > Initializing system.schema_aggregates > INFO [main] 2016-03-08 11:42:01,576 SystemKeyspace.java:1284 - Detected > version upgrade from 2.2.5 to 3.0.3, snapshotting system keyspace > WARN [main] 2016-03-08 11:42:01,911 CompressionParams.java:382 - The > sstable_compression option has been deprecated. You should use class instead > WARN [main] 2016-03-08 11:42:01,959 CompressionParams.java:333 - The > chunk_length_kb option has been deprecated. 
You should use chunk_length_in_kb > instead > ERROR [main] 2016-03-08 11:42:02,638 CassandraDaemon.java:692 - Exception > encountered during startup > java.lang.AssertionError: null > at > org.apache.cassandra.db.CompactTables.getCompactValueColumn(CompactTables.java:90) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.config.CFMetaData.rebuild(CFMetaData.java:315) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at org.apache.cassandra.config.CFMetaData.(CFMetaData.java:291) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at org.apache.cassandra.config.CFMetaData.create(CFMetaData.java:367) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableMetadata(LegacySchemaMigrator.java:337) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readTableMetadata(LegacySchemaMigrator.java:273) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readTable(LegacySchemaMigrator.java:244) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readTables$227(LegacySchemaMigrator.java:237) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_74] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readTables(LegacySchemaMigrator.java:237) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readKeyspace(LegacySchemaMigrator.java:186) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readSchema$224(LegacySchemaMigrator.java:177) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_74] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readSchema(LegacySchemaMigrator.java:177) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.migrate(LegacySchemaMigrator.java:77) > 
~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:223) > [apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:551) > [apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:679) > [apache-cassandra-3.0.3.jar:3.0.3] > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11315) Upgrade from 2.2.5 to 3.0.3 Fails with AssertionError
[ https://issues.apache.org/jira/browse/CASSANDRA-11315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208973#comment-15208973 ] Christian Spriegel commented on CASSANDRA-11315: [~iamaleksey]: The Auth_0 keyspace was basically created by CLI. It is one of the keyspaces created by our automated testsuite (it seems someone connected his IDE to the D system, so the schema was created there). One thing that is special: after creation, the tests modify the keyspace via the Thrift API:
{code}
// update gc grace seconds to 0
final Cluster cluster = keyspace.getCluster();
final String keyspaceName = keyspace.getKeyspace().getKeyspaceName();
final KeyspaceDefinition keyspaceDefinition = cluster.describeKeyspace(keyspaceName);
final List<ColumnFamilyDefinition> cfDefs = keyspaceDefinition.getCfDefs();
for (final ColumnFamilyDefinition cfDef : cfDefs)
{
    cfDef.setGcGraceSeconds(0);
    cfDef.setMemtableFlushAfterMins(Integer.MAX_VALUE);
    cfDef.setReadRepairChance(0.0);
    cfDef.setKeyCacheSavePeriodInSeconds(Integer.MAX_VALUE);
    cluster.updateColumnFamily(cfDef);
}
{code}
> Upgrade from 2.2.5 to 3.0.3 Fails with AssertionError > - > > Key: CASSANDRA-11315 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11315 > Project: Cassandra > Issue Type: Bug > Environment: Ubuntu 14.04, Oracle Java 8, Apache Cassandra 2.2.5 -> > 3.0.3 >Reporter: Dominik Keil >Assignee: Aleksey Yeschenko >Priority: Blocker > Fix For: 3.0.x, 3.x > > > Hi, > when trying to upgrade our development cluster from C* 2.2.5 to 3.0.3 > Cassandra fails during startup. > Here's the relevant log snippet: > {noformat} > [...]
> INFO [main] 2016-03-08 11:42:01,291 ColumnFamilyStore.java:381 - > Initializing system.schema_triggers > INFO [main] 2016-03-08 11:42:01,302 ColumnFamilyStore.java:381 - > Initializing system.schema_usertypes > INFO [main] 2016-03-08 11:42:01,313 ColumnFamilyStore.java:381 - > Initializing system.schema_functions > INFO [main] 2016-03-08 11:42:01,324 ColumnFamilyStore.java:381 - > Initializing system.schema_aggregates > INFO [main] 2016-03-08 11:42:01,576 SystemKeyspace.java:1284 - Detected > version upgrade from 2.2.5 to 3.0.3, snapshotting system keyspace > WARN [main] 2016-03-08 11:42:01,911 CompressionParams.java:382 - The > sstable_compression option has been deprecated. You should use class instead > WARN [main] 2016-03-08 11:42:01,959 CompressionParams.java:333 - The > chunk_length_kb option has been deprecated. You should use chunk_length_in_kb > instead > ERROR [main] 2016-03-08 11:42:02,638 CassandraDaemon.java:692 - Exception > encountered during startup > java.lang.AssertionError: null > at > org.apache.cassandra.db.CompactTables.getCompactValueColumn(CompactTables.java:90) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.config.CFMetaData.rebuild(CFMetaData.java:315) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at org.apache.cassandra.config.CFMetaData.(CFMetaData.java:291) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at org.apache.cassandra.config.CFMetaData.create(CFMetaData.java:367) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableMetadata(LegacySchemaMigrator.java:337) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readTableMetadata(LegacySchemaMigrator.java:273) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readTable(LegacySchemaMigrator.java:244) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > 
org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readTables$227(LegacySchemaMigrator.java:237) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_74] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readTables(LegacySchemaMigrator.java:237) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readKeyspace(LegacySchemaMigrator.java:186) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readSchema$224(LegacySchemaMigrator.java:177) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_74] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readSchema(LegacySchemaMigrator.java:177) > ~[apache-cassandra-3.0.3.jar:3.0.3] > at > org.apache.cassandra.schema.LegacySchemaMigrator.migrate(LegacySchemaMigrator.java:77) >
[jira] [Commented] (CASSANDRA-11301) Non-obsoleting compaction operations over compressed files can impose rate limit on normal reads
[ https://issues.apache.org/jira/browse/CASSANDRA-11301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15186860#comment-15186860 ] Christian Spriegel commented on CASSANDRA-11301: I don't know if it helps: we saw these throttled reads when we had repair running with -pr on. We turned primary-range off, and now this issue is gone. btw: With the -pr flag enabled, we easily had 50-100 validations running concurrently for some reason (without -pr it's back to a single validation). Maybe the issue requires such a high number of concurrent compactions/validations? > Non-obsoleting compaction operations over compressed files can impose rate > limit on normal reads > > > Key: CASSANDRA-11301 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11301 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Stefania > Fix For: 2.2.6 > > > Broken by CASSANDRA-9240; the rate limiting reader passes the ICompressedFile > interface to its parent, which uses this to attach an "owner" - which means > the reader gets recycled on close, i.e. pooled, for normal use. If the > compaction were to replace the sstable there would be no problem, which is > presumably why this hasn't been encountered frequently. However validation > compactions on long lived sstables would permit these rate limited readers to > accumulate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
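The mechanism described in the quoted ticket — readers created for a throttled compaction being recycled into the shared pool with their rate limiter still attached — can be illustrated with a toy model. All names below are illustrative; this is not Cassandra's actual reader pool, just a sketch of why accumulating recycled readers throttles normal reads.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of the CASSANDRA-11301 bug: a reader keeps its rate-limited
// state when recycled, so later normal reads borrow a throttled reader.
final class ReaderPoolSketch {

    static final class Reader {
        final boolean rateLimited;
        Reader(boolean rateLimited) { this.rateLimited = rateLimited; }
    }

    static final class Pool {
        private final Deque<Reader> free = new ArrayDeque<>();

        // Bug: the limiter is not cleared before the reader is pooled.
        void recycle(Reader r) { free.push(r); }

        Reader borrow() {
            Reader r = free.poll();
            return r != null ? r : new Reader(false); // fresh readers are unthrottled
        }
    }
}
```

Long-running validation compactions never replace the sstable, so each one leaves another throttled reader in the pool — which matches the reporter's observation that the problem appeared alongside many concurrent validations.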
[jira] [Commented] (CASSANDRA-11301) Non-obsoleting compaction operations over compressed files can impose rate limit on normal reads
[ https://issues.apache.org/jira/browse/CASSANDRA-11301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178546#comment-15178546 ] Christian Spriegel commented on CASSANDRA-11301: Benedict: CASSANDRA-9240 does not specify 2.1 as fixVersion. Am I correct to assume that 2.1 should not be affected by this? [~luxifer]: Didn't you say that our 2.1 installation is also affected? Did you test with setcompactionthroughput? > Non-obsoleting compaction operations over compressed files can impose rate > limit on normal reads > > > Key: CASSANDRA-11301 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11301 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict > Fix For: 2.2.6 > > > Broken by CASSANDRA-9240; the rate limiting reader passes the ICompressedFile > interface to its parent, which uses this to attach an "owner" - which means > the reader gets recycled on close, i.e. pooled, for normal use. If the > compaction were to replace the sstable there would be no problem, which is > presumably why this hasn't been encountered frequently. However validation > compactions on long lived sstables would permit these rate limited readers to > accumulate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8013) AssertionError on RangeTombstoneList.diff
[ https://issues.apache.org/jira/browse/CASSANDRA-8013?page=com.atlassian.jira.plugin.system.issuetabpanelfocusedCommentId=14699475#comment-14699475 ] Christian Spriegel commented on CASSANDRA-8013: --- It seems 2.0 is also affected. We are getting this regularly with 2.0.15: {code}
ERROR 2015-08-17 12:11:09,502 [Thrift:1058] CassandraDaemon.java (line 258) Exception in thread Thread[Thrift:1058,5,main]
java.lang.AssertionError
at org.apache.cassandra.db.RangeTombstoneList.diff(RangeTombstoneList.java:346)
at org.apache.cassandra.db.DeletionInfo.diff(DeletionInfo.java:180)
at org.apache.cassandra.db.ColumnFamily.diff(ColumnFamily.java:324)
at org.apache.cassandra.db.ColumnFamily.diff(ColumnFamily.java:404)
at org.apache.cassandra.service.RowDataResolver.scheduleRepairs(RowDataResolver.java:114)
at org.apache.cassandra.service.RangeSliceResponseResolver$Reducer.getReduced(RangeSliceResponseResolver.java:162)
at org.apache.cassandra.service.RangeSliceResponseResolver$Reducer.getReduced(RangeSliceResponseResolver.java:130)
at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:115)
at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:98)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
at org.apache.cassandra.service.RangeSliceResponseResolver.resolve(RangeSliceResponseResolver.java:90)
at org.apache.cassandra.service.RangeSliceResponseResolver.resolve(RangeSliceResponseResolver.java:40)
at org.apache.cassandra.service.ReadCallback.get(ReadCallback.java:107)
at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:1603)
at org.apache.cassandra.thrift.CassandraServer.get_range_slices(CassandraServer.java:1164)
at org.apache.cassandra.thrift.Cassandra$Processor$get_range_slices.getResult(Cassandra.java:3698)
at org.apache.cassandra.thrift.Cassandra$Processor$get_range_slices.getResult(Cassandra.java:3682)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:204)
{code} AssertionError on RangeTombstoneList.diff - Key: CASSANDRA-8013 URL: https://issues.apache.org/jira/browse/CASSANDRA-8013 Project: Cassandra Issue Type: Bug Components: Core Reporter: Phil Yang Assignee: Phil Yang Fix For: 2.1.1 Attachments: 8013-v2.patch, 8013-v3.txt, 8013.patch after upgrading to 2.1.0, I found there are many exceptions in system.log. It appears in nodes upgraded from 2.0 as well as in nodes newly added at 2.1.0 {noformat}
ERROR [SharedPool-Worker-8] 2014-09-27 16:44:50,188 ErrorMessage.java:218 - Unexpected exception during request
java.lang.AssertionError: null
at org.apache.cassandra.db.RangeTombstoneList.diff(RangeTombstoneList.java:424) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.db.DeletionInfo.diff(DeletionInfo.java:189) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.db.ColumnFamily.diff(ColumnFamily.java:311) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.db.ColumnFamily.diff(ColumnFamily.java:394) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.service.RowDataResolver.scheduleRepairs(RowDataResolver.java:114) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.service.RowDataResolver.resolve(RowDataResolver.java:91) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.service.RowDataResolver.resolve(RowDataResolver.java:37) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.service.ReadCallback.get(ReadCallback.java:110) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1300) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1153) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.service.pager.SliceQueryPager.queryNextPage(SliceQueryPager.java:83) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.service.pager.AbstractQueryPager.fetchPage(AbstractQueryPager.java:88) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.service.pager.SliceQueryPager.fetchPage(SliceQueryPager.java:36) ~[apache-cassandra-2.1.0.jar:2.1.0]
at
[jira] [Commented] (CASSANDRA-8574) Gracefully degrade SELECT when there are lots of tombstones
[ https://issues.apache.org/jira/browse/CASSANDRA-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14534110#comment-14534110 ] Christian Spriegel commented on CASSANDRA-8574: --- [~ztyx] TOE = TombstoneOverwhelmingException Gracefully degrade SELECT when there are lots of tombstones --- Key: CASSANDRA-8574 URL: https://issues.apache.org/jira/browse/CASSANDRA-8574 Project: Cassandra Issue Type: Improvement Reporter: Jens Rantil Fix For: 3.x *Background:* There's lots of tooling out there to do BigData analysis on Cassandra clusters. Examples are Spark and Hadoop, which is offered by DSE. The problem with both of these so far, is that a single partition key with too many tombstones can make the query job fail hard. The described scenario happens despite the user setting a rather small FetchSize. I assume this is a common scenario if you have larger rows. *Proposal:* To allow a CQL SELECT to gracefully degrade to only return a smaller batch of results if there are too many tombstones. The tombstones are ordered according to clustering key and one should be able to page through them. Potentially: SELECT * FROM mytable LIMIT 1000 TOMBSTONES; would page through maximum 1000 tombstones, _or_ 1000 (CQL) rows. I understand that this obviously would degrade performance, but it would at least yield a result. *Additional comment:* I haven't dug into Cassandra code, but conceptually I guess this would be doable. Let me know what you think. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
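The proposed `LIMIT 1000 TOMBSTONES` semantics could degrade roughly as in the following sketch (illustrative only, not Cassandra code): scan cells in clustering order and close the page early once the tombstone budget is spent, returning the live rows collected so far instead of failing. A real implementation would also hand back the position at which to resume the next page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TombstoneBoundedPage {
    static class Cell {
        final String name;
        final boolean tombstone;
        Cell(String name, boolean tombstone) { this.name = name; this.tombstone = tombstone; }
    }

    // Collect live cells for one page, stopping early once either limit is hit.
    static List<Cell> page(List<Cell> partition, int start, int liveLimit, int tombLimit) {
        List<Cell> live = new ArrayList<>();
        int tombstones = 0;
        for (int i = start; i < partition.size(); i++) {
            Cell c = partition.get(i);
            if (c.tombstone) {
                if (++tombstones >= tombLimit)
                    break; // degrade gracefully: return a partial page instead of failing
            } else {
                live.add(c);
                if (live.size() >= liveLimit)
                    break;
            }
        }
        return live;
    }

    public static void main(String[] args) {
        List<Cell> partition = Arrays.asList(
                new Cell("a", false),
                new Cell("t1", true), new Cell("t2", true), new Cell("t3", true),
                new Cell("b", false));
        // A tombstone budget of 3 ends the page after "a", even though "b" follows.
        for (Cell c : page(partition, 0, 10, 3))
            System.out.println(c.name);
    }
}
```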
[jira] [Commented] (CASSANDRA-8574) Gracefully degrade SELECT when there are lots of tombstones
[ https://issues.apache.org/jira/browse/CASSANDRA-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526524#comment-14526524 ] Christian Spriegel commented on CASSANDRA-8574: --- Another use-case: I think being able to select tombstones could be very useful to examine TOEs. The user could simply do a query in CQLSH and see where the tombstones come from. Crazy thought: Perhaps there could be different tombstone modes: - One that selects all tombstones: good for debugging. - One that only returns the last tombstone: good for iterating. Gracefully degrade SELECT when there are lots of tombstones --- Key: CASSANDRA-8574 URL: https://issues.apache.org/jira/browse/CASSANDRA-8574 Project: Cassandra Issue Type: Improvement Reporter: Jens Rantil Fix For: 3.x *Background:* There's lots of tooling out there to do BigData analysis on Cassandra clusters. Examples are Spark and Hadoop, which is offered by DSE. The problem with both of these so far, is that a single partition key with too many tombstones can make the query job fail hard. The described scenario happens despite the user setting a rather small FetchSize. I assume this is a common scenario if you have larger rows. *Proposal:* To allow a CQL SELECT to gracefully degrade to only return a smaller batch of results if there are too many tombstones. The tombstones are ordered according to clustering key and one should be able to page through them. Potentially: SELECT * FROM mytable LIMIT 1000 TOMBSTONES; would page through maximum 1000 tombstones, _or_ 1000 (CQL) rows. I understand that this obviously would degrade performance, but it would at least yield a result. *Additional comment:* I haven't dug into Cassandra code, but conceptually I guess this would be doable. Let me know what you think. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8561) Tombstone log warning does not log partition key
[ https://issues.apache.org/jira/browse/CASSANDRA-8561?page=com.atlassian.jira.plugin.system.issuetabpanelfocusedCommentId=14267750#comment-14267750 ] Christian Spriegel commented on CASSANDRA-8561: --- Linking CASSANDRA-7886: There are also changes made to the TOE logging in this ticket. Tombstone log warning does not log partition key Key: CASSANDRA-8561 URL: https://issues.apache.org/jira/browse/CASSANDRA-8561 Project: Cassandra Issue Type: Improvement Components: Core Environment: Datastax DSE 4.5 Reporter: Jens Rantil Labels: logging Fix For: 2.0.12, 2.1.3 AFAIK, the tombstone warning in system.log does not contain the primary key. See: https://gist.github.com/JensRantil/44204676f4dbea79ea3a Including it would help a lot in diagnosing why the (CQL) row has so many tombstones. Let me know if I have misunderstood something. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-7886) Coordinator should not wait for read timeouts when replicas hit Exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanelfocusedCommentId=14262943#comment-14262943 ] Christian Spriegel edited comment on CASSANDRA-7886 at 1/2/15 3:22 PM: --- Hi [~thobbs], uploaded new patch: V6 Here is what I did: - Fixed logging of TOEs... -- ... in StorageProxy for local reads -- ... in MessageDeliveryTask for remote reads - Added partitionKey (as DecoratedKey) and lastCellName logging to TOE. - Changed SliceQueryFilter not to throw TOEs for the system keyspace. Cassandra does not seem to like TOEs in system queries. These TOEs will always be logged as warnings instead. This is how TOEs look in system.log: {code}
ERROR [SharedPool-Worker-1] 2015-01-02 15:07:24,878 MessageDeliveryTask.java:81 - Scanned over 201 tombstones in test.test; 100 columns were requested; query aborted (see tombstone_failure_threshold); partitionKey=DecoratedKey(78703492656118554854272571946195123045, 31); lastCell=188; delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}; slices=[-]
{code} kind regards, Christian was (Author: christianmovi): Hi [~thobbs], uploaded new patch: V6 Here is what I did: - Fixed logging of TOEs... -- ... in StorageProxy for local reads -- ... in MessageDeliveryTask for remote reads - Added partitionKey (as DecoratedKey) and lastCellName logging to TOE. - Changed SliceQueryFilter not to throw TOEs for the system keyspace. Cassandra does not seem to like TOEs in system queries. These TOEs will always be logged as warnings instead. This is how TOEs look in system.log: {quote}
ERROR [SharedPool-Worker-1] 2015-01-02 15:07:24,878 MessageDeliveryTask.java:81 - Scanned over 201 tombstones in test.test; 100 columns were requested; query aborted (see tombstone_failure_threshold); partitionKey=DecoratedKey(78703492656118554854272571946195123045, 31); lastCell=188; delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}; slices=[-]
{quote} kind regards, Christian Coordinator should not wait for read timeouts when replicas hit Exceptions -- Key: CASSANDRA-7886 URL: https://issues.apache.org/jira/browse/CASSANDRA-7886 Project: Cassandra Issue Type: Improvement Components: Core Environment: Tested with Cassandra 2.0.8 Reporter: Christian Spriegel Assignee: Christian Spriegel Priority: Minor Labels: protocolv4 Fix For: 3.0 Attachments: 7886_v1.txt, 7886_v2_trunk.txt, 7886_v3_trunk.txt, 7886_v4_trunk.txt, 7886_v5_trunk.txt, 7886_v6_trunk.txt *Issue* When you have TombstoneOverwhelmingExceptions occuring in queries, this will cause the query to be simply dropped on every data-node, but no response is sent back to the coordinator. Instead the coordinator waits for the specified read_request_timeout_in_ms. On the application side this can cause memory issues, since the application is waiting for the timeout interval for every request.Therefore, if our application runs into TombstoneOverwhelmingExceptions, then (sooner or later) our entire application cluster goes down :-( *Proposed solution* I think the data nodes should send a error message to the coordinator when they run into a TombstoneOverwhelmingException. Then the coordinator does not have to wait for the timeout-interval. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
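The V6 logging change above boils down to making the exception itself carry the diagnostic fields, so a single log line tells the operator where the tombstones live. A minimal sketch of that idea (illustrative class and field names, not the actual patch):

```java
public class TombstoneOverflowSketch {
    // Illustrative stand-in for TombstoneOverwhelmingException: the message
    // embeds the partition key and the last cell seen, making the log line actionable.
    static class TombstoneOverwhelming extends RuntimeException {
        TombstoneOverwhelming(int scanned, int requested, String partitionKey, String lastCell) {
            super(String.format(
                "Scanned over %d tombstones; %d columns were requested; query aborted "
                + "(see tombstone_failure_threshold); partitionKey=%s; lastCell=%s",
                scanned, requested, partitionKey, lastCell));
        }
    }

    public static void main(String[] args) {
        try {
            throw new TombstoneOverwhelming(201, 100, "key1", "188");
        } catch (TombstoneOverwhelming e) {
            System.out.println(e.getMessage()); // one self-describing log line
        }
    }
}
```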
[jira] [Commented] (CASSANDRA-7886) Coordinator should not wait for read timeouts when replicas hit Exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanelfocusedCommentId=14262943#comment-14262943 ] Christian Spriegel commented on CASSANDRA-7886: --- Hi [~thobbs], uploaded new patch: V6 Here is what I did: - Fixed logging of TOEs... -- ... in StorageProxy for local reads -- ... in MessageDeliveryTask for remote reads - Added partitionKey (as DecoratedKey) and lastCellName logging to TOE. - Changed SliceQueryFilter not to throw TOEs for the system keyspace. Cassandra does not seem to like TOEs in system queries. These TOEs will always be logged as warnings instead. This is how TOEs look in system.log: {quote}
ERROR [SharedPool-Worker-1] 2015-01-02 15:07:24,878 MessageDeliveryTask.java:81 - Scanned over 201 tombstones in test.test; 100 columns were requested; query aborted (see tombstone_failure_threshold); partitionKey=DecoratedKey(78703492656118554854272571946195123045, 31); lastCell=188; delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}; slices=[-]
{quote} kind regards, Christian Coordinator should not wait for read timeouts when replicas hit Exceptions -- Key: CASSANDRA-7886 URL: https://issues.apache.org/jira/browse/CASSANDRA-7886 Project: Cassandra Issue Type: Improvement Components: Core Environment: Tested with Cassandra 2.0.8 Reporter: Christian Spriegel Assignee: Christian Spriegel Priority: Minor Labels: protocolv4 Fix For: 3.0 Attachments: 7886_v1.txt, 7886_v2_trunk.txt, 7886_v3_trunk.txt, 7886_v4_trunk.txt, 7886_v5_trunk.txt, 7886_v6_trunk.txt *Issue* When you have TombstoneOverwhelmingExceptions occuring in queries, this will cause the query to be simply dropped on every data-node, but no response is sent back to the coordinator. Instead the coordinator waits for the specified read_request_timeout_in_ms.
On the application side this can cause memory issues, since the application is waiting for the timeout interval for every request.Therefore, if our application runs into TombstoneOverwhelmingExceptions, then (sooner or later) our entire application cluster goes down :-( *Proposed solution* I think the data nodes should send a error message to the coordinator when they run into a TombstoneOverwhelmingException. Then the coordinator does not have to wait for the timeout-interval. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7886) Coordinator should not wait for read timeouts when replicas hit Exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Spriegel updated CASSANDRA-7886: -- Attachment: 7886_v6_trunk.txt Coordinator should not wait for read timeouts when replicas hit Exceptions -- Key: CASSANDRA-7886 URL: https://issues.apache.org/jira/browse/CASSANDRA-7886 Project: Cassandra Issue Type: Improvement Components: Core Environment: Tested with Cassandra 2.0.8 Reporter: Christian Spriegel Assignee: Christian Spriegel Priority: Minor Labels: protocolv4 Fix For: 3.0 Attachments: 7886_v1.txt, 7886_v2_trunk.txt, 7886_v3_trunk.txt, 7886_v4_trunk.txt, 7886_v5_trunk.txt, 7886_v6_trunk.txt *Issue* When you have TombstoneOverwhelmingExceptions occuring in queries, this will cause the query to be simply dropped on every data-node, but no response is sent back to the coordinator. Instead the coordinator waits for the specified read_request_timeout_in_ms. On the application side this can cause memory issues, since the application is waiting for the timeout interval for every request.Therefore, if our application runs into TombstoneOverwhelmingExceptions, then (sooner or later) our entire application cluster goes down :-( *Proposed solution* I think the data nodes should send a error message to the coordinator when they run into a TombstoneOverwhelmingException. Then the coordinator does not have to wait for the timeout-interval. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7886) Coordinator should not wait for read timeouts when replicas hit Exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Spriegel updated CASSANDRA-7886: -- Attachment: 7886_v5_trunk.txt Coordinator should not wait for read timeouts when replicas hit Exceptions -- Key: CASSANDRA-7886 URL: https://issues.apache.org/jira/browse/CASSANDRA-7886 Project: Cassandra Issue Type: Improvement Components: Core Environment: Tested with Cassandra 2.0.8 Reporter: Christian Spriegel Assignee: Christian Spriegel Priority: Minor Labels: protocolv4 Fix For: 3.0 Attachments: 7886_v1.txt, 7886_v2_trunk.txt, 7886_v3_trunk.txt, 7886_v4_trunk.txt, 7886_v5_trunk.txt *Issue* When you have TombstoneOverwhelmingExceptions occuring in queries, this will cause the query to be simply dropped on every data-node, but no response is sent back to the coordinator. Instead the coordinator waits for the specified read_request_timeout_in_ms. On the application side this can cause memory issues, since the application is waiting for the timeout interval for every request.Therefore, if our application runs into TombstoneOverwhelmingExceptions, then (sooner or later) our entire application cluster goes down :-( *Proposed solution* I think the data nodes should send a error message to the coordinator when they run into a TombstoneOverwhelmingException. Then the coordinator does not have to wait for the timeout-interval. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7886) Coordinator should not wait for read timeouts when replicas hit Exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanelfocusedCommentId=14256948#comment-14256948 ] Christian Spriegel commented on CASSANDRA-7886: --- Hi @thobbs! I have a Christmas present for you, in the form of a patch file ;-) I attached a v5 patch that contains the fixes. Regarding TOE: Currently I throw TOEs as exceptions and they get logged just like any other exception. I am not sure if this is desirable and would like to hear your feedback. I think we have the following options: - Leave it as it is in v5, meaning TOEs get logged with stacktraces. - Add catch blocks where necessary and log it in a user-friendly way. But it might be in many places. Also in this case I would prefer making TOE a checked exception. Imho TOE should not be unchecked. - Add TOE logging to C* default exception handler. (I did not investigate yet, but I assume there is an exception handler) - Leave it as it was before Here are a few examples of how TOEs now look to the user: TOE using a 3.0 CQLSH (still on CQL-protocol 3): {code}
cqlsh:test> select * from test;
code=1200 [Coordinator node timed out waiting for replica nodes' responses] message=Operation timed out - received only 0 responses. info={'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
cqlsh:test>
{code} TOE using a 2.0 CQLSH: {code}
cqlsh:test> select * from test;
Request did not complete within rpc_timeout.
{code} TOE with cassandra-cli: {code}
[default@unknown] use test;
Authenticated to keyspace: test
[default@test] list test;
Using default limit of 100
Using default cell limit of 100
null
TimedOutException()
at org.apache.cassandra.thrift.Cassandra$get_range_slices_result$get_range_slices_resultStandardScheme.read(Cassandra.java:17448)
at org.apache.cassandra.thrift.Cassandra$get_range_slices_result$get_range_slices_resultStandardScheme.read(Cassandra.java:17397)
at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:17323)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:802)
at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:786)
at org.apache.cassandra.cli.CliClient.executeList(CliClient.java:1520)
at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:285)
at org.apache.cassandra.cli.CliMain.processStatementInteractive(CliMain.java:201)
at org.apache.cassandra.cli.CliMain.main(CliMain.java:331)
[default@test]
{code} Coordinator should not wait for read timeouts when replicas hit Exceptions -- Key: CASSANDRA-7886 URL: https://issues.apache.org/jira/browse/CASSANDRA-7886 Project: Cassandra Issue Type: Improvement Components: Core Environment: Tested with Cassandra 2.0.8 Reporter: Christian Spriegel Assignee: Christian Spriegel Priority: Minor Labels: protocolv4 Fix For: 3.0 Attachments: 7886_v1.txt, 7886_v2_trunk.txt, 7886_v3_trunk.txt, 7886_v4_trunk.txt, 7886_v5_trunk.txt *Issue* When you have TombstoneOverwhelmingExceptions occuring in queries, this will cause the query to be simply dropped on every data-node, but no response is sent back to the coordinator. Instead the coordinator waits for the specified read_request_timeout_in_ms.
On the application side this can cause memory issues, since the application is waiting for the timeout interval for every request.Therefore, if our application runs into TombstoneOverwhelmingExceptions, then (sooner or later) our entire application cluster goes down :-( *Proposed solution* I think the data nodes should send a error message to the coordinator when they run into a TombstoneOverwhelmingException. Then the coordinator does not have to wait for the timeout-interval. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
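The proposed fix amounts to letting a replica-reported failure release the coordinator's read callback immediately, instead of letting it sit out the full read_request_timeout_in_ms. A simplified sketch of that callback behaviour (hypothetical, using a plain CountDownLatch rather than Cassandra's internals):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class ReadCallbackSketch {
    final CountDownLatch responses;
    final AtomicBoolean failed = new AtomicBoolean(false);

    ReadCallbackSketch(int blockFor) { responses = new CountDownLatch(blockFor); }

    void onResponse() { responses.countDown(); }

    // A replica hit e.g. a TombstoneOverwhelmingException: instead of staying
    // silent (forcing a timeout), it reports the failure and wakes the waiter now.
    void onFailure() {
        failed.set(true);
        while (responses.getCount() > 0) responses.countDown();
    }

    // Returns true on success, false on a reported read failure;
    // throws only on a genuine timeout (no response of either kind).
    boolean await(long timeoutMs) {
        try {
            if (!responses.await(timeoutMs, TimeUnit.MILLISECONDS))
                throw new RuntimeException("read timeout after " + timeoutMs + " ms");
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return !failed.get();
    }

    public static void main(String[] args) {
        ReadCallbackSketch cb = new ReadCallbackSketch(2);
        cb.onResponse();
        cb.onFailure(); // replica reported a failure: no need to wait out the timeout
        System.out.println("read succeeded: " + cb.await(10));
    }
}
```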
[jira] [Commented] (CASSANDRA-7886) Coordinator should not wait for read timeouts when replicas hit Exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanelfocusedCommentId=14246592#comment-14246592 ] Christian Spriegel commented on CASSANDRA-7886: --- Hi [~thobbs], sorry I kept you waiting for so long. {quote}Instead of using Unavailable when the protocol version is less than 4, use ReadTimeout. Unavailable signals that some of the replicas are considered to be down, which is not the case here. Plus, ReadTimeout is the error that is currently returned in these circumstances.{quote} Makes sense. I changed Unavailable to ReadTimeout for CQL3 and Thrift. {quote}In ErrorMessage.encodedSize(), there's some commented out code for READ_FAILURE handling.{quote} The commented code was meant as a preparation for WriteFailureExceptions. Does it perhaps make sense to fully add WriteFailureException? As a follow-up ticket, we could implement it then for the different writes. Or do you want me to get rid of it? {quote}Instead of catching and ignoring TombstoneOverwhelmingException in multiple places, I suggest you move the logged error message into the TOE message and let it propagate (and be logged) like any other exception.{quote} Just to make sure that we don't touch anything new here: TOEs are logged inside SliceQueryFilter.collectReducedColumns already. I simply took this catch block from the ReadVerbHandler/RangeSliceVerbHandler and put it into StorageProxy/MessageDeliveryTask. I don't like that either, but I did not want to touch it. Do you still want me to change it? {quote}Can you update docs/native_protocol_v4.spec with these changes? You can look at the previous specs to see examples of the changes from the previous version section{quote} Ok. Should we also add WriteFailures? {quote}In StorageProxy, the unavailables counter should not be incremented for read failures. I suggest creating a new, separate failure counter.{quote} Done.
{quote}Also in StorageProxy, there's now quite a bit of code duplication around building error messages for ReadTimeoutExceptions and ReadFailureExceptions. Can you condense those somewhat?{quote} I merged ReadTimeoutException|ReadFailureException into a single catch block. I also added the last cell-name to the TOE, so that an administrator can get an estimate of where to look for the tombstones. This doesn't really match the ticket's new name, but is related to my original issue :-) Overall, one question remains from my side: Should I also prepare WriteFailureExceptions? I could (as a follow-up ticket) add these to the write-codepath. Coordinator should not wait for read timeouts when replicas hit Exceptions -- Key: CASSANDRA-7886 URL: https://issues.apache.org/jira/browse/CASSANDRA-7886 Project: Cassandra Issue Type: Improvement Components: Core Environment: Tested with Cassandra 2.0.8 Reporter: Christian Spriegel Assignee: Christian Spriegel Priority: Minor Labels: protocolv4 Fix For: 3.0 Attachments: 7886_v1.txt, 7886_v2_trunk.txt, 7886_v3_trunk.txt *Issue* When you have TombstoneOverwhelmingExceptions occuring in queries, this will cause the query to be simply dropped on every data-node, but no response is sent back to the coordinator. Instead the coordinator waits for the specified read_request_timeout_in_ms. On the application side this can cause memory issues, since the application is waiting for the timeout interval for every request.Therefore, if our application runs into TombstoneOverwhelmingExceptions, then (sooner or later) our entire application cluster goes down :-( *Proposed solution* I think the data nodes should send a error message to the coordinator when they run into a TombstoneOverwhelmingException. Then the coordinator does not have to wait for the timeout-interval. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
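The protocol-version gating discussed in this thread condenses to one mapping: clients speaking native protocol v4 or later receive the new READ_FAILURE error code, while v3 and Thrift clients fall back to the pre-existing ReadTimeout code they can already decode. A sketch (the 0x1200/0x1300 values follow the native protocol specs; treat the rest as illustrative):

```java
public class ErrorCodeGate {
    // Error codes as in the native protocol spec: READ_TIMEOUT exists in all
    // versions, READ_FAILURE is new in protocol v4.
    static final int READ_TIMEOUT = 0x1200;
    static final int READ_FAILURE = 0x1300;

    // Pick the code a failed read should be reported with, given the client's
    // negotiated protocol version (Thrift is treated like v3 here).
    static int codeForFailedRead(int protocolVersion) {
        return protocolVersion >= 4 ? READ_FAILURE : READ_TIMEOUT;
    }

    public static void main(String[] args) {
        System.out.printf("v3 client -> 0x%04X%n", codeForFailedRead(3)); // downgraded to ReadTimeout
        System.out.printf("v4 client -> 0x%04X%n", codeForFailedRead(4)); // new ReadFailure code
    }
}
```

The downgrade keeps old clients working at the cost of precision: a v3 client cannot distinguish a replica that failed fast from one that genuinely timed out.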
[jira] [Updated] (CASSANDRA-7886) Coordinator should not wait for read timeouts when replicas hit Exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Spriegel updated CASSANDRA-7886: -- Attachment: 7886_v4_trunk.txt Coordinator should not wait for read timeouts when replicas hit Exceptions -- Key: CASSANDRA-7886 URL: https://issues.apache.org/jira/browse/CASSANDRA-7886 Project: Cassandra Issue Type: Improvement Components: Core Environment: Tested with Cassandra 2.0.8 Reporter: Christian Spriegel Assignee: Christian Spriegel Priority: Minor Labels: protocolv4 Fix For: 3.0 Attachments: 7886_v1.txt, 7886_v2_trunk.txt, 7886_v3_trunk.txt, 7886_v4_trunk.txt *Issue* When you have TombstoneOverwhelmingExceptions occuring in queries, this will cause the query to be simply dropped on every data-node, but no response is sent back to the coordinator. Instead the coordinator waits for the specified read_request_timeout_in_ms. On the application side this can cause memory issues, since the application is waiting for the timeout interval for every request.Therefore, if our application runs into TombstoneOverwhelmingExceptions, then (sooner or later) our entire application cluster goes down :-( *Proposed solution* I think the data nodes should send a error message to the coordinator when they run into a TombstoneOverwhelmingException. Then the coordinator does not have to wait for the timeout-interval. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7886) TombstoneOverwhelmingException should not wait for timeout
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Spriegel updated CASSANDRA-7886: -- Attachment: 7886_v3_trunk.txt TombstoneOverwhelmingException should not wait for timeout -- Key: CASSANDRA-7886 URL: https://issues.apache.org/jira/browse/CASSANDRA-7886 Project: Cassandra Issue Type: Improvement Components: Core Environment: Tested with Cassandra 2.0.8 Reporter: Christian Spriegel Assignee: Christian Spriegel Priority: Minor Fix For: 3.0 Attachments: 7886_v1.txt, 7886_v2_trunk.txt, 7886_v3_trunk.txt *Issue* When you have TombstoneOverwhelmingExceptions occuring in queries, this will cause the query to be simply dropped on every data-node, but no response is sent back to the coordinator. Instead the coordinator waits for the specified read_request_timeout_in_ms. On the application side this can cause memory issues, since the application is waiting for the timeout interval for every request.Therefore, if our application runs into TombstoneOverwhelmingExceptions, then (sooner or later) our entire application cluster goes down :-( *Proposed solution* I think the data nodes should send a error message to the coordinator when they run into a TombstoneOverwhelmingException. Then the coordinator does not have to wait for the timeout-interval. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7886) TombstoneOverwhelmingException should not wait for timeout
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Spriegel updated CASSANDRA-7886: -- Attachment: 7886_v2_trunk.txt TombstoneOverwhelmingException should not wait for timeout -- Key: CASSANDRA-7886 URL: https://issues.apache.org/jira/browse/CASSANDRA-7886 Project: Cassandra Issue Type: Improvement Components: Core Environment: Tested with Cassandra 2.0.8 Reporter: Christian Spriegel Assignee: Christian Spriegel Priority: Minor Fix For: 3.0 Attachments: 7886_v1.txt, 7886_v2_trunk.txt *Issue* When you have TombstoneOverwhelmingExceptions occuring in queries, this will cause the query to be simply dropped on every data-node, but no response is sent back to the coordinator. Instead the coordinator waits for the specified read_request_timeout_in_ms. On the application side this can cause memory issues, since the application is waiting for the timeout interval for every request.Therefore, if our application runs into TombstoneOverwhelmingExceptions, then (sooner or later) our entire application cluster goes down :-( *Proposed solution* I think the data nodes should send a error message to the coordinator when they run into a TombstoneOverwhelmingException. Then the coordinator does not have to wait for the timeout-interval. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7886) TombstoneOverwhelmingException should not wait for timeout
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanelfocusedCommentId=14222955#comment-14222955 ] Christian Spriegel commented on CASSANDRA-7886: --- Hi [~slebresne], I finally had the time to port my patch to trunk and add error handling to the ErrorMessage class. Thrift and CQL protocol 3 will get an Unavailable error instead of my new READ_FAILURE. CQL protocol >= 4 will get the new READ_FAILURE. It seems there is no CQL protocol 4 yet, so my code always returns Unavailable at the moment. Let me know if you want me to improve anything. TombstoneOverwhelmingException should not wait for timeout -- Key: CASSANDRA-7886 URL: https://issues.apache.org/jira/browse/CASSANDRA-7886 Project: Cassandra Issue Type: Improvement Components: Core Environment: Tested with Cassandra 2.0.8 Reporter: Christian Spriegel Assignee: Christian Spriegel Priority: Minor Fix For: 3.0 Attachments: 7886_v1.txt, 7886_v2_trunk.txt *Issue* When you have TombstoneOverwhelmingExceptions occuring in queries, this will cause the query to be simply dropped on every data-node, but no response is sent back to the coordinator. Instead the coordinator waits for the specified read_request_timeout_in_ms. On the application side this can cause memory issues, since the application is waiting for the timeout interval for every request.Therefore, if our application runs into TombstoneOverwhelmingExceptions, then (sooner or later) our entire application cluster goes down :-( *Proposed solution* I think the data nodes should send a error message to the coordinator when they run into a TombstoneOverwhelmingException. Then the coordinator does not have to wait for the timeout-interval. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7886) TombstoneOverwhelmingException should not wait for timeout
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178282#comment-14178282 ]

Christian Spriegel commented on CASSANDRA-7886:
-----------------------------------------------

[~slebresne]: Does it make sense that I prepare a patch on trunk that includes the error handling? Also I would do some (manual) testing on trunk.
[jira] [Commented] (CASSANDRA-7886) TombstoneOverwhelmingException should not wait for timeout
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176915#comment-14176915 ]

Christian Spriegel commented on CASSANDRA-7886:
-----------------------------------------------

[~slebresne]: Sorry, I meant CQLSH and not CQL. With the standard CQL client I meant the CQLSH client that was installed with the debian packages.

Regarding the ErrorMessage class: A new error code READ_FAILURE was introduced with my patch, but no new fields were added to the ErrorMessage. I assume you worry about clients not being able to handle the new code. In my opinion any client code that does not have a default case should be punished, so I would not hesitate to add it ;-)

I assume with CQL protocol 4 (CASSANDRA-8043) clean handling and additional fields can be implemented for read failures?

{code}
public void encode(ErrorMessage msg, ByteBuf dest, int version)
{
    dest.writeInt(msg.error.code().value); // TODO: make sure READ_FAILURE is only sent for CQL >= 4
    CBUtil.writeString(msg.error.getMessage(), dest);

    switch (msg.error.code())
    {
        //case READ_FAILURE: // read failure case not implemented so far!
        //    if (version >= x) // with the next version this could be implemented
        //    {
        //        RequestFailureException rfe = (RequestFailureException) msg.error;
        //        dest.writeInt(rfe.received);
        //        dest.writeInt(rfe.blockFor);
        //        dest.writeInt(rfe.failures);
        //    }
        //    break;
{code}
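The point in the comment above about clients needing a default case can be illustrated with a small sketch: a driver that switches on server error codes without a default would silently misbehave when the server introduces a new code such as READ_FAILURE. `ClientErrorHandling` and `describe` are hypothetical names, not any real driver's API.

```java
// Sketch of the client-side concern discussed above: a driver switching on
// server error codes must keep a default case, otherwise a newly introduced
// code (like READ_FAILURE, 0x1300) falls through silently.
// Names are illustrative, not any real driver's code.
public class ClientErrorHandling {
    static String describe(int errorCode) {
        switch (errorCode) {
            case 0x1000: return "unavailable";
            case 0x1200: return "read timeout";
            default:     return "unknown server error 0x" + Integer.toHexString(errorCode);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(0x1000));
        System.out.println(describe(0x1300)); // new code: caught by the default case
    }
}
```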
[jira] [Commented] (CASSANDRA-7886) TombstoneOverwhelmingException should not wait for timeout
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14175093#comment-14175093 ]

Christian Spriegel commented on CASSANDRA-7886:
-----------------------------------------------

[~slebresne]: I am not sure if we are talking about the same thing :-) I am pretty sure that I was using the standard CQL client in my test. It showed me the new error code I added.

My new exceptions extend RequestExecutionException, which I assume the CQL server side is able to handle.
[jira] [Commented] (CASSANDRA-7990) CompoundDenseCellNameType AssertionError and BoundedComposite to CellName ClasCastException
[ https://issues.apache.org/jira/browse/CASSANDRA-7990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159275#comment-14159275 ]

Christian Spriegel commented on CASSANDRA-7990:
-----------------------------------------------

[~thobbs]: Ok, works fine now. No exceptions. Thanks!

> CompoundDenseCellNameType AssertionError and BoundedComposite to CellName ClasCastException
> -------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7990
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7990
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Ubuntu, Java 1.7.0_67, Cassandra 2.1.0, cassandra-driver-core:jar:2.0.6
>            Reporter: Christian Spriegel
>            Assignee: Tyler Hobbs
>            Priority: Minor
>             Fix For: 2.1.1
>
>         Attachments: 7990-partial-fix.txt, 7990.txt
>
> I just updated my laptop to Cassandra 2.1 and created a fresh data folder. When trying to run my automated tests I get a lot of these exceptions in the Cassandra log:
> {code}
> ERROR [SharedPool-Worker-1] 2014-09-23 12:59:17,812 ErrorMessage.java:218 - Unexpected exception during request
> java.lang.AssertionError: null
> 	at org.apache.cassandra.db.composites.CompoundDenseCellNameType.create(CompoundDenseCellNameType.java:57) ~[apache-cassandra-2.1.0.jar:2.1.0]
> 	at org.apache.cassandra.cql3.Constants$Setter.execute(Constants.java:313) ~[apache-cassandra-2.1.0.jar:2.1.0]
> 	at org.apache.cassandra.cql3.statements.UpdateStatement.addUpdateForKey(UpdateStatement.java:91) ~[apache-cassandra-2.1.0.jar:2.1.0]
> 	at org.apache.cassandra.cql3.statements.BatchStatement.addStatementMutations(BatchStatement.java:235) ~[apache-cassandra-2.1.0.jar:2.1.0]
> 	at org.apache.cassandra.cql3.statements.BatchStatement.getMutations(BatchStatement.java:181) ~[apache-cassandra-2.1.0.jar:2.1.0]
> 	at org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:283) ~[apache-cassandra-2.1.0.jar:2.1.0]
> 	at org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:269) ~[apache-cassandra-2.1.0.jar:2.1.0]
> 	at org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:264) ~[apache-cassandra-2.1.0.jar:2.1.0]
> 	at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:187) ~[apache-cassandra-2.1.0.jar:2.1.0]
> 	at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:206) ~[apache-cassandra-2.1.0.jar:2.1.0]
> 	at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:118) ~[apache-cassandra-2.1.0.jar:2.1.0]
> 	at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:422) [apache-cassandra-2.1.0.jar:2.1.0]
> 	at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:318) [apache-cassandra-2.1.0.jar:2.1.0]
> 	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:103) [netty-all-4.0.20.Final.jar:4.0.20.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:332) [netty-all-4.0.20.Final.jar:4.0.20.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:31) [netty-all-4.0.20.Final.jar:4.0.20.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:323) [netty-all-4.0.20.Final.jar:4.0.20.Final]
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_67]
> 	at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:163) [apache-cassandra-2.1.0.jar:2.1.0]
> 	at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:103) [apache-cassandra-2.1.0.jar:2.1.0]
> 	at java.lang.Thread.run(Thread.java:745) [na:1.7.0_67]
> ERROR [Thrift:9] 2014-09-23 12:59:17,823 CustomTThreadPoolServer.java:219 - Error occurred during processing of message.
> java.lang.ClassCastException: org.apache.cassandra.db.composites.BoundedComposite cannot be cast to org.apache.cassandra.db.composites.CellName
> 	at org.apache.cassandra.db.composites.AbstractCellNameType.cellFromByteBuffer(AbstractCellNameType.java:170) ~[apache-cassandra-2.1.0.jar:2.1.0]
> 	at org.apache.cassandra.thrift.CassandraServer.deleteColumnOrSuperColumn(CassandraServer.java:936) ~[apache-cassandra-2.1.0.jar:2.1.0]
> 	at org.apache.cassandra.thrift.CassandraServer.createMutationList(CassandraServer.java:860) ~[apache-cassandra-2.1.0.jar:2.1.0]
> 	at org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:971) ~[apache-cassandra-2.1.0.jar:2.1.0]
> 	at
> {code}
[jira] [Commented] (CASSANDRA-7990) CompoundDenseCellNameType AssertionError and BoundedComposite to CellName ClasCastException
[ https://issues.apache.org/jira/browse/CASSANDRA-7990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158345#comment-14158345 ]

Christian Spriegel commented on CASSANDRA-7990:
-----------------------------------------------

[~thobbs]: Cool! I will test your second patch over the weekend and will give you feedback. But I assume it's fixed :-)
[jira] [Commented] (CASSANDRA-7886) TombstoneOverwhelmingException should not wait for timeout
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14156254#comment-14156254 ]

Christian Spriegel commented on CASSANDRA-7886:
-----------------------------------------------

[~slebresne]: I would be fine with 3.0. I am glad if it gets implemented at all :-)

The exceptions I added were internal ones.
- CQL: CQL returns the error code and the message upon failure.
- Thrift: I did not add new Thrift exceptions. I simply reused UnavailableException in Thrift.
[jira] [Commented] (CASSANDRA-7990) CompoundDenseCellNameType AssertionError and BoundedComposite to CellName ClasCastException
[ https://issues.apache.org/jira/browse/CASSANDRA-7990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157451#comment-14157451 ]

Christian Spriegel commented on CASSANDRA-7990:
-----------------------------------------------

[~thobbs]: Good news, the Thrift part seems to be fixed. The only exceptions that are left are the CQL ones. Here is what I gathered so far:

Your assertion message is showing me:
{code}
java.lang.AssertionError: Expected composite prefix of size 6, but got size 5
	at org.apache.cassandra.db.composites.CompoundDenseCellNameType.create(CompoundDenseCellNameType.java:57) ~[main/:na]
	at org.apache.cassandra.cql3.Constants$Setter.execute(Constants.java:313) ~[main/:na]
	at org.apache.cassandra.cql3.statements.UpdateStatement.addUpdateForKey(UpdateStatement.java:91) ~[main/:na]
	at org.apache.cassandra.cql3.statements.BatchStatement.addStatementMutations(BatchStatement.java:232) ~[main/:na]
	at org.apache.cassandra.cql3.statements.BatchStatement.getMutations(BatchStatement.java:178) ~[main/:na]
	at org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:280) ~[main/:na]
	at org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:266) ~[main/:na]
	at org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:261) ~[main/:na]
	at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:227) ~[main/:na]
	at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:249) ~[main/:na]
	at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:118) ~[main/:na]
	at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:438) [main/:na]
	at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:1) [main/:na]
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:103) [netty-all-4.0.20.Final.jar:4.0.20.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:332) [netty-all-4.0.20.Final.jar:4.0.20.Final]
	at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:31) [netty-all-4.0.20.Final.jar:4.0.20.Final]
	at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:323) [netty-all-4.0.20.Final.jar:4.0.20.Final]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_65]
	at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:163) [main/:na]
	at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:103) [main/:na]
	at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
{code}

The prefix is of type: org.apache.cassandra.db.composites.CompoundComposite

The column is: ColumnDefinition{name=Message, type=org.apache.cassandra.db.marshal.UTF8Type, kind=COMPACT_VALUE, componentIndex=null, indexName=null, indexType=null}

The table causing that error has the following schema:
{code}
CREATE TABLE RefIndex (
    BucketID int,
    RefID bigint,
    CreationTime timestamp,
    Severity int,
    SystemID bigint,
    EventType int,
    Sender int,
    Filter varchar,
    MessageID uuid,
    Message text,
    PRIMARY KEY ((BucketID, SystemID, RefID), CreationTime, Severity, EventType, Sender, MessageID, Filter)
) WITH COMPACT STORAGE
  AND CLUSTERING ORDER BY (CreationTime DESC)
  AND COMPRESSION = { 'sstable_compression' : 'DeflateCompressor', 'chunk_length_kb' : '128' };
{code}
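As an aside on the "Expected composite prefix of size 6, but got size 5" assertion above: the table has six clustering columns (CreationTime, Severity, EventType, Sender, MessageID, Filter), and a dense compact-storage cell name carries one component per clustering column, so a five-component prefix trips the assertion. A minimal sketch of that counting rule; `PrefixCheck` and `prefixIsFullSize` are illustrative names, not Cassandra's composite code.

```java
// Hedged sketch: in a dense COMPACT STORAGE table the cell name needs one
// component per clustering column, so a shorter prefix is rejected.
// Names are illustrative, not Cassandra's actual composite classes.
public class PrefixCheck {
    // The clustering columns of the RefIndex table quoted above.
    static final String[] CLUSTERING = {
        "CreationTime", "Severity", "EventType", "Sender", "MessageID", "Filter"
    };

    /** A dense cell-name prefix is valid only at full clustering size. */
    static boolean prefixIsFullSize(int componentCount) {
        return componentCount == CLUSTERING.length;
    }

    public static void main(String[] args) {
        System.out.println(prefixIsFullSize(6)); // full prefix: accepted
        System.out.println(prefixIsFullSize(5)); // short prefix: the AssertionError case
    }
}
```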
[jira] [Commented] (CASSANDRA-7990) CompoundDenseCellNameType AssertionError and BoundedComposite to CellName ClasCastException
[ https://issues.apache.org/jira/browse/CASSANDRA-7990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14154697#comment-14154697 ]

Christian Spriegel commented on CASSANDRA-7990:
-----------------------------------------------

[~thobbs]: Sorry I kept you waiting. I did not have the time so far, but I promise I will give you feedback. I am confident that I'll find some time on Friday.
[jira] [Commented] (CASSANDRA-7990) CompoundDenseCellNameType AssertionError and BoundedComposite to CellName ClasCastException
[ https://issues.apache.org/jira/browse/CASSANDRA-7990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14146114#comment-14146114 ]

Christian Spriegel commented on CASSANDRA-7990:
-----------------------------------------------

[~thobbs]: Sure, I will have a look on the weekend.
[jira] [Created] (CASSANDRA-7990) CompoundDenseCellNameType AssertionError and BoundedComposite to CellName ClasCastException
Christian Spriegel created CASSANDRA-7990:
---------------------------------------------

             Summary: CompoundDenseCellNameType AssertionError and BoundedComposite to CellName ClasCastException
                 Key: CASSANDRA-7990
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7990
             Project: Cassandra
          Issue Type: Bug
            Reporter: Christian Spriegel
            Priority: Minor
[jira] [Updated] (CASSANDRA-7990) CompoundDenseCellNameType AssertionError and BoundedComposite to CellName ClasCastException
[ https://issues.apache.org/jira/browse/CASSANDRA-7990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Spriegel updated CASSANDRA-7990:
------------------------------------------
    Environment: Ubuntu, Java 1.7.0_67, Cassandra 2.1.0
[jira] [Updated] (CASSANDRA-7990) CompoundDenseCellNameType AssertionError and BoundedComposite to CellName ClasCastException
[ https://issues.apache.org/jira/browse/CASSANDRA-7990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Spriegel updated CASSANDRA-7990:
------------------------------------------
    Environment: Ubuntu, Java 1.7.0_67, Cassandra 2.1.0, cassandra-driver-core:jar:2.0.6  (was: Ubuntu, Java 1.7.0_67, Cassandra 2.1.0)
[jira] [Updated] (CASSANDRA-7886) TombstoneOverwhelmingException should not wait for timeout
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Spriegel updated CASSANDRA-7886:
------------------------------------------
    Attachment: 7886_v1.txt

TombstoneOverwhelmingException should not wait for timeout
----------------------------------------------------------

                 Key: CASSANDRA-7886
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7886
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
         Environment: Tested with Cassandra 2.0.8
            Reporter: Christian Spriegel
            Priority: Minor
             Fix For: 3.0
         Attachments: 7886_v1.txt

*Issue*
When TombstoneOverwhelmingExceptions occur during queries, the query is simply dropped on every data node, but no response is sent back to the coordinator. Instead, the coordinator waits for the configured read_request_timeout_in_ms.

On the application side this can cause memory issues, since the application waits for the full timeout interval on every request. Therefore, if our application runs into TombstoneOverwhelmingExceptions, then (sooner or later) our entire application cluster goes down :-(

*Proposed solution*
I think the data nodes should send an error message to the coordinator when they run into a TombstoneOverwhelmingException. Then the coordinator does not have to wait for the timeout interval.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
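The difference the proposal makes on the coordinator side can be illustrated with a minimal sketch. This is not Cassandra's actual code; the class and method names here are hypothetical, and a `CompletableFuture` stands in for the coordinator's response handler. The point is only that an explicit failure message lets the coordinator return immediately, whereas a silently dropped request forces it to wait out the full read timeout:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Simplified model (not Cassandra's real classes) of a coordinator
// waiting for a replica response to a read request.
public class FailureResponseSketch {
    static String coordinatorRead(CompletableFuture<String> replicaResponse,
                                  long timeoutMs) throws InterruptedException {
        try {
            return replicaResponse.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Current behavior when the replica drops the request on a TOE:
            // the coordinator learns nothing until the timeout expires.
            return "ReadTimeoutException";
        } catch (ExecutionException e) {
            // Proposed behavior: the replica reported its failure, so the
            // coordinator can surface the error to the client right away.
            return "ReadFailure: " + e.getCause().getMessage();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // The replica hits a TombstoneOverwhelmingException and, per the
        // proposal, sends the error back instead of dropping the request.
        CompletableFuture<String> response = new CompletableFuture<>();
        response.completeExceptionally(
            new RuntimeException("TombstoneOverwhelmingException"));

        long start = System.nanoTime();
        String result = coordinatorRead(response, 5000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println(result);
        System.out.println("returnedBeforeTimeout=" + (elapsedMs < 1000));
    }
}
```

With the explicit failure the call returns in well under the 5-second timeout; without it, every such request would hold a client thread for the full read_request_timeout_in_ms, which is exactly the pile-up described above.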
[jira] [Commented] (CASSANDRA-7886) TombstoneOverwhelmingException should not wait for timeout
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14143247#comment-14143247 ]

Christian Spriegel commented on CASSANDRA-7886:
-----------------------------------------------

[~kohlisankalp]: Thanks for your feedback.

[~slebresne], [~kohlisankalp]: I attached a patch for Cassandra 2.1 in which I implemented remote failure handling for reads and range reads. Using a ccm three-node cluster, I tested remote and local read failures. Both CLI and CQLSH return instantly instead of waiting for timeouts.

Any feedback? Could this be merged into 2.1? Please let me know if the patch needs improvement. I guess the next steps would be to implement callbacks for writes, truncates, etc.
[jira] [Commented] (CASSANDRA-7886) TombstoneOverwhelmingException should not wait for timeout
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14143623#comment-14143623 ]

Christian Spriegel commented on CASSANDRA-7886:
-----------------------------------------------

[~jbellis]: Don't get me wrong: some client-side limiting is definitely necessary in the application. But it is really not a nice situation that all queries just sit there waiting.

Just to clarify: the patch is not only about TOEs. It will report back any exception.

Another reason I'd like this functionality is that it makes TOEs easier to understand. Think of a developer running his query in CQLSH: with this patch the user will get a clear message that something is wrong, instead of a timeout. I know I found this confusing in the beginning, and I probably still do. We could even show the IP address of the host causing the error in the message. Then the user could see which host is responsible for the failure.

Is there anything about the patch itself you don't like? IMHO it's not adding much complexity. Most of the patch is the new exception classes and logging; the actual code handling the failure is just a few lines.
[jira] [Commented] (CASSANDRA-7886) TombstoneOverwhelmingException should not wait for timeout
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14133959#comment-14133959 ]

Christian Spriegel commented on CASSANDRA-7886:
-----------------------------------------------

[~kohlisankalp]: Thanks for the reference to CASSANDRA-6747. This seems to be exactly what I am talking about. Doesn't the patch from CASSANDRA-6747 handle TOEs already? (Sorry, I haven't studied the patch yet.)
[jira] [Commented] (CASSANDRA-7886) TombstoneOverwhelmingException should not wait for timeout
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14133974#comment-14133974 ]

Christian Spriegel commented on CASSANDRA-7886:
-----------------------------------------------

[~kohlisankalp]: Hi again! Sorry for all the mails, but I just had a look at your 2.1 patch: I think removing the try-catch in ReadVerbHandler should do the trick, right? Then TOEs would be handled by your code in the MessageDeliveryTask?

ReadVerbHandler:
{code}
     Row row;
-    try
-    {
         row = command.getRow(keyspace);
-    }
-    catch (TombstoneOverwhelmingException e)
-    {
-        // error already logged. Drop the request
-        return;
-    }
{code}
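The idea behind removing that try-catch can be sketched in isolation. This is a hypothetical, simplified model, not the actual MessageDeliveryTask or verb-handler API: the exception propagates out of the handler to the dispatch layer, which replies to the coordinator with a failure message instead of silently dropping the request:

```java
import java.util.function.Consumer;

// Hypothetical sketch of letting handler exceptions reach the dispatch
// layer, which then reports a failure instead of dropping the request.
public class DeliverySketch {
    interface VerbHandler {
        String doVerb(String request) throws Exception;
    }

    // Dispatcher: on success reply with the row, on any handler exception
    // reply with a failure message (rather than `return;` with no reply).
    static void deliver(VerbHandler handler, String request,
                        Consumer<String> replyTo) {
        try {
            replyTo.accept("ROW:" + handler.doVerb(request));
        } catch (Exception e) {
            replyTo.accept("FAILURE:" + e.getMessage());
        }
    }

    public static void main(String[] args) {
        // A handler that hits the tombstone threshold and throws.
        VerbHandler overwhelmed = req -> {
            throw new Exception("TombstoneOverwhelmingException");
        };
        deliver(overwhelmed, "read key=42", System.out::println);
    }
}
```

Under this model the coordinator always receives *some* reply, so the read_request_timeout_in_ms path is only hit for genuinely unresponsive nodes.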
[jira] [Comment Edited] (CASSANDRA-7886) TombstoneOverwhelmingException should not wait for timeout
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14133974#comment-14133974 ]

Christian Spriegel edited comment on CASSANDRA-7886 at 9/15/14 3:07 PM:
------------------------------------------------------------------------

[~kohlisankalp]: Hi again! Sorry for all the mails, but I just had a look at your 2.1 patch: I think removing the try-catch in ReadVerbHandler should do the trick, right? Then TOEs would be handled by your code in the MessageDeliveryTask?

ReadVerbHandler:
{code}
     Row row;
-    try
-    {
         row = command.getRow(keyspace);
-    }
-    catch (TombstoneOverwhelmingException e)
-    {
-        // error already logged. Drop the request
-        return;
-    }
{code}

Edit: Looking a bit closer, I think it's missing a few more pieces. But in my naive mind it does not look like a big protocol change. I would like to hear your opinion.
[jira] [Created] (CASSANDRA-7886) TombstoneOverwhelmingException should not wait for timeout
Christian Spriegel created CASSANDRA-7886:
---------------------------------------------

             Summary: TombstoneOverwhelmingException should not wait for timeout
                 Key: CASSANDRA-7886
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7886
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Christian Spriegel
            Priority: Minor

*Issue*
When TombstoneOverwhelmingExceptions occur during queries, the query is simply dropped on every data node, but no response is sent back to the coordinator. Instead, the coordinator waits for the configured read_request_timeout_in_ms.

On the application side this can cause memory issues, since the application waits for the full timeout interval on every request. Therefore, if our application runs into TombstoneOverwhelmingExceptions, then our entire application cluster goes down :-(

*Proposed solution*
I think the data nodes should send an error message to the coordinator when they run into a TombstoneOverwhelmingException. Then the coordinator does not have to wait for the timeout interval.
[jira] [Updated] (CASSANDRA-7886) TombstoneOverwhelmingException should not wait for timeout
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Spriegel updated CASSANDRA-7886:
------------------------------------------
    Environment: Tested with Cassandra 2.0.8
[jira] [Updated] (CASSANDRA-7886) TombstoneOverwhelmingException should not wait for timeout
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Spriegel updated CASSANDRA-7886:
------------------------------------------
    Description:
*Issue*
When TombstoneOverwhelmingExceptions occur during queries, the query is simply dropped on every data node, but no response is sent back to the coordinator. Instead, the coordinator waits for the configured read_request_timeout_in_ms.

On the application side this can cause memory issues, since the application waits for the full timeout interval on every request. Therefore, if our application runs into TombstoneOverwhelmingExceptions, then (sooner or later) our entire application cluster goes down :-(

*Proposed solution*
I think the data nodes should send an error message to the coordinator when they run into a TombstoneOverwhelmingException. Then the coordinator does not have to wait for the timeout interval.
[jira] [Commented] (CASSANDRA-7886) TombstoneOverwhelmingException should not wait for timeout
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14122756#comment-14122756 ]

Christian Spriegel commented on CASSANDRA-7886:
-----------------------------------------------

[~slebresne]: Customers keep sending in requests. So if Cassandra suddenly decides to make every request wait for 15 sec (we increased the config), then we run out of heap because requests pile up :-(

As a workaround we can probably decrease the timeout setting, but the behaviour should be changed IMHO.

Can we set fixversion to 3.0 already so that this ticket won't be forgotten?
[jira] [Comment Edited] (CASSANDRA-7886) TombstoneOverwhelmingException should not wait for timeout
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14122756#comment-14122756 ]

Christian Spriegel edited comment on CASSANDRA-7886 at 9/5/14 10:09 AM:
------------------------------------------------------------------------

[~slebresne]: Customers keep sending in requests. So if Cassandra suddenly decides to make every request wait for 15 sec (we increased the config), then we run out of heap because requests pile up :-(

As a workaround we can probably decrease the timeout setting, but the behaviour should be changed IMHO.

Can we set fixversion to 3.0 already so that this ticket won't be forgotten?

edit: Thanks for the fast response :-)
[jira] [Commented] (CASSANDRA-7886) TombstoneOverwhelmingException should not wait for timeout
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14122830#comment-14122830 ] Christian Spriegel commented on CASSANDRA-7886:
---
[~slebresne]:
{quote}I meant that if every request hits TombstoneOverwhelmingException{quote}
My story is a bit longer. In normal operation this does not happen, not even close. But sometimes our customers mess up in their backend systems: sometimes their backend will send the same request in an endless loop, deleting and recreating a column in a row. This creates many tombstones very quickly. Currently this single customer brings down our entire landscape, due to his requests piling up in our Tomcat, which also affects other customers. If Cassandra were to fail instantly, then his requests would run into an error (which they should, because he is using it wrong) and therefore would not pile up.
{quote}Sure, but setting a fixversion is never a promise.{quote}
Thanks! I know. But at least somebody will have it on his radar. (I hope) :-)

TombstoneOverwhelmingException should not wait for timeout
----------------------------------------------------------
Key: CASSANDRA-7886
URL: https://issues.apache.org/jira/browse/CASSANDRA-7886
Project: Cassandra
Issue Type: Improvement
Components: Core
Environment: Tested with Cassandra 2.0.8
Reporter: Christian Spriegel
Priority: Minor
Fix For: 3.0

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7511) Always flush on TRUNCATE
[ https://issues.apache.org/jira/browse/CASSANDRA-7511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14090675#comment-14090675 ] Christian Spriegel commented on CASSANDRA-7511:
---
The renewTable logic was added to allow fast truncates for automated testing (for application developers). Originally this was added for 1.1: CASSANDRA-4153. And with 1.2 there was another tweak on this topic: CASSANDRA-5704. Please reconsider this patch, as it will slow down truncates and will probably make automated testing impossible again (truncate becomes so slow that it is unusable).

Always flush on TRUNCATE
------------------------
Key: CASSANDRA-7511
URL: https://issues.apache.org/jira/browse/CASSANDRA-7511
Project: Cassandra
Issue Type: Bug
Environment: CentOS 6.5, Oracle Java 7u60, C* 2.0.6, 2.0.9, including earlier 1.0.* versions.
Reporter: Viktor Jevdokimov
Assignee: Jeremiah Jordan
Priority: Minor
Labels: commitlog
Fix For: 2.0.10, 2.1 rc5
Attachments: 7511-2.0-v2.txt, 7511-v3-remove-renewMemtable.txt, 7511-v3-test.txt, 7511-v3.txt, 7511.txt

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (CASSANDRA-7511) Always flush on TRUNCATE
[ https://issues.apache.org/jira/browse/CASSANDRA-7511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14090675#comment-14090675 ] Christian Spriegel edited comment on CASSANDRA-7511 at 8/8/14 12:02 PM:
---
The renewTable logic was added to allow fast truncates for automated testing (for application developers). Originally this was added for 1.1: CASSANDRA-4153. And with 1.2 there was another tweak on this topic: CASSANDRA-5704. Please reconsider this patch, as it will slow down truncates and will probably make automated testing impossible again (truncate becomes so slow that it is unusable).

Edit: could the patch be changed so that renew is only used when durable_writes is off? Shouldn't that solve the growing-commitlog issue?

Always flush on TRUNCATE
------------------------
Key: CASSANDRA-7511
URL: https://issues.apache.org/jira/browse/CASSANDRA-7511
Project: Cassandra
Issue Type: Bug
Environment: CentOS 6.5, Oracle Java 7u60, C* 2.0.6, 2.0.9, including earlier 1.0.* versions.
Reporter: Viktor Jevdokimov
Assignee: Jeremiah Jordan
Priority: Minor
Labels: commitlog
Fix For: 2.0.10, 2.1 rc5
Attachments: 7511-2.0-v2.txt, 7511-v3-remove-renewMemtable.txt, 7511-v3-test.txt, 7511-v3.txt, 7511.txt

Commit log grows infinitely after a CF truncate operation via cassandra-cli, regardless of whether the CF receives writes thereafter. CFs could be non-CQL Standard and Super column type. Creation of snapshots after truncate is turned off. The commit log may start growing promptly or later, on a few nodes only or on all nodes at once.
Nothing special in the system log. No idea how to reproduce. After a rolling restart the commit logs are cleared and back to normal. Just annoying to do a rolling restart after each truncate.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7401) Memtable.maybeUpdateLiveRatio goes into an endless loop when currentOperations is zero
[ https://issues.apache.org/jira/browse/CASSANDRA-7401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14035104#comment-14035104 ] Christian Spriegel commented on CASSANDRA-7401:
---
[~iamaleksey]: Thanks!
{quote}Apparently yes, it is - when the CF has no cells in it, no range tombstones, and isn't top-level marked for deletion.{quote}
That is what I got from the code too :-)
{quote}I have NO idea where that CF originated from, and don't even know where to start looking. If you find out - let us know.{quote}
I was also not able to figure this out. If I do, I will let you know.

Memtable.maybeUpdateLiveRatio goes into an endless loop when currentOperations is zero
--------------------------------------------------------------------------------------
Key: CASSANDRA-7401
URL: https://issues.apache.org/jira/browse/CASSANDRA-7401
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Christian Spriegel
Assignee: Christian Spriegel
Fix For: 2.0.9
Attachments: MemtableFixV1.patch

Hi,
I was describing an error the other day on the mailing list, where the MemoryMeter would go into an endless loop. This happened multiple times last week; unfortunately I cannot reproduce it at the moment. The whole Cassandra server became unresponsive and logged about 7000k messages per second into the log:
{quote}
...
INFO [MemoryMeter:1] 2014-06-14 19:24:09,488 Memtable.java (line 481) CFS(Keyspace='MDS', ColumnFamily='ResponsePortal') liveRatio is 64.0 (just-counted was 64.0). calculation took 0ms for 0 cells
...
{quote}
The cause seems to be Memtable.maybeUpdateLiveRatio(), which cannot handle currentOperations (and liveRatioComputedAt) being zero. The loop iterates endlessly:
{code}
...
if (operations < 2 * last) // never breaks when both are zero: 0 < 0 is not true
    break;
...
{code}
One thing I cannot explain: how can the operation count be zero when maybeUpdateLiveRatio() gets called? Is it possible that addAndGet in resolve() increases by 0 in some cases?
{code}
currentOperations.addAndGet(cf.getColumnCount() + (cf.isMarkedForDelete() ? 1 : 0) + cf.deletionInfo().rangeCount()); // can this be zero?
{code}
Nevertheless, the attached patch fixes the endless loop. Feel free to reassign this ticket or create a follow-up ticket if currentOperations should not be zero.
kind regards, Christian

-- This message was sent by Atlassian JIRA (v6.2#6252)
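The reported spin can be reproduced with a tiny standalone simulation (hypothetical and heavily simplified; this is not the actual Memtable code, and the "fixed" guard below only illustrates the idea of handling the zero case, not the contents of the attached patch): the guard breaks only once operations has failed to double since the last computation, and with both counters at zero the condition 0 < 0 never holds.

```java
// Simplified, hypothetical model of the maybeUpdateLiveRatio guard.
public class LiveRatioSpin {
    // Returns how many times the liveRatio would be recomputed (capped),
    // given the current operation count and the count at the last computation.
    static int recomputations(long operations, long computedAt, int cap) {
        int computes = 0;
        while (computes < cap) {
            long last = computedAt;
            if (operations < 2 * last)   // bug: 0 < 0 is false, so zero never breaks
                break;
            computedAt = operations;     // stand-in for the CAS update
            computes++;                  // one more "liveRatio is ..." log line
        }
        return computes;
    }

    // Same guard with the zero case handled explicitly (illustrative only).
    static int recomputationsFixed(long operations, long computedAt, int cap) {
        int computes = 0;
        while (computes < cap) {
            long last = computedAt;
            if (operations == 0 || operations < 2 * last)
                break;
            computedAt = operations;
            computes++;
        }
        return computes;
    }

    public static void main(String[] args) {
        System.out.println(recomputations(100, 10, 1000));   // 1: recomputes once, then breaks
        System.out.println(recomputations(0, 0, 1000));      // 1000: spins until the cap
        System.out.println(recomputationsFixed(0, 0, 1000)); // 0: breaks immediately
    }
}
```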
[jira] [Commented] (CASSANDRA-7401) Memtable.maybeUpdateLiveRatio goes into an endless loop when currentOperations is zero
[ https://issues.apache.org/jira/browse/CASSANDRA-7401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033894#comment-14033894 ] Christian Spriegel commented on CASSANDRA-7401:
---
Tonight, one of our servers was hanging again. I have now deployed a custom Cassandra build with the patch on our systems. I will report if it still happens...

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (CASSANDRA-7401) Memtable.maybeUpdateLiveRatio goes into an endless loop when currentOperations is zero
Christian Spriegel created CASSANDRA-7401:
---
Summary: Memtable.maybeUpdateLiveRatio goes into an endless loop when currentOperations is zero
Key: CASSANDRA-7401
URL: https://issues.apache.org/jira/browse/CASSANDRA-7401
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Christian Spriegel
Assignee: Christian Spriegel

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-7401) Memtable.maybeUpdateLiveRatio goes into an endless loop when currentOperations is zero
[ https://issues.apache.org/jira/browse/CASSANDRA-7401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Spriegel updated CASSANDRA-7401:
---
Attachment: MemtableFixV1.patch

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7059) Range query with strict bound on clustering column can return less results than required for compact tables
[ https://issues.apache.org/jira/browse/CASSANDRA-7059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983089#comment-13983089 ] Christian Spriegel commented on CASSANDRA-7059:
---
Is it possible that allow filtering is generally not allowed for compact storage tables (due to this ticket)?

Range query with strict bound on clustering column can return less results than required for compact tables
-----------------------------------------------------------------------------------------------------------
Key: CASSANDRA-7059
URL: https://issues.apache.org/jira/browse/CASSANDRA-7059
Project: Cassandra
Issue Type: Bug
Reporter: Sylvain Lebresne

What's wrong:
{noformat}
CREATE TABLE test (
    k int,
    v int,
    PRIMARY KEY (k, v)
) WITH COMPACT STORAGE;

INSERT INTO test(k, v) VALUES (0, 0);
INSERT INTO test(k, v) VALUES (0, 1);
INSERT INTO test(k, v) VALUES (1, 0);
INSERT INTO test(k, v) VALUES (1, 1);
INSERT INTO test(k, v) VALUES (2, 0);
INSERT INTO test(k, v) VALUES (2, 1);

SELECT * FROM test WHERE v > 0 LIMIT 3 ALLOW FILTERING;

 k | v
---+---
 1 | 1
 0 | 1
{noformat}
That last query should return 3 results. The problem lies in how we deal with 'strict greater than' ({{>}}) for wide compact storage tables. Namely, for those tables we internally only support inclusive bounds (for CQL3 tables this is not a problem, as we deal with this using the 'end-of-component' of the CompositeType encoding). So we compensate by asking for one more result than requested by the user, and we trim afterwards if that was unnecessary. This works fine for per-partition queries, but not for range queries, since we would potentially have to ask for {{X}} more results, where {{X}} is the number of partitions fetched, and we don't know {{X}} beforehand. I'll note that:
* this has always been there
* this only (potentially) affects compact tables
* this only affects range queries that have a strict bound on the clustering column (which means only {{ALLOW FILTERING}} queries in particular)
* this only matters if a {{LIMIT}} is set on the query.
As for fixes, it's not entirely trivial. The right fix would probably be to start supporting non-inclusive bounds internally, but that's far from a small fix and is at best a 2.1 fix (since we'll have to make a messaging protocol change to ship some additional info for SliceQueryFilter). Also, this might be a lot of work for something that only affects some {{ALLOW FILTERING}} queries on compact tables. Another (somewhat simpler) solution might be to detect when we have this kind of query and use a pager with no limit. We would then query a first page using the user limit (plus some smudge factor to avoid being inefficient too often) and would continue paging until either we've exhausted all results or we can prove that, post-processing, we do have enough results to satisfy the user limit. This does mean in some cases we might do 2 or more internal queries, but in practice we can probably make that case very rare, and since the query is an {{ALLOW FILTERING}} one, the user is somewhat warned that the query may not be terribly efficient. Lastly, we could always start by disallowing the kind of query that is potentially problematic (until we have a proper fix), knowing that users can work around that by either using non-strict bounds or removing the {{LIMIT}}, whichever makes the most sense in their case. In 1.2 in particular, we don't have the query pagers, so the previous solution I describe would be a bit of a mess to implement.

-- This message was sent by Atlassian JIRA (v6.2#6252)
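The over-fetch-and-trim failure mode described above can be sketched with a small standalone simulation (hypothetical code, not Cassandra's internals; the fixed array order stands in for an arbitrary token order): executing {{v > 0}} as the inclusive {{v >= 0}} with a single extra row fetched loses results as soon as more than one partition contributes a {{v = 0}} row.

```java
import java.util.*;
import java.util.stream.*;

public class StrictBoundTrim {
    // Rows from the ticket's example, in a hypothetical token order: {k, v}.
    static final int[][] ROWS = {{1,0},{1,1},{0,0},{0,1},{2,0},{2,1}};

    // The internal engine supports only inclusive bounds, so "v > 0" is run
    // as "v >= 0" with limit+1 rows fetched, then trimmed afterwards.
    static List<int[]> strictQuery(int limit) {
        List<int[]> fetched = Arrays.stream(ROWS)
                .filter(r -> r[1] >= 0)              // inclusive stand-in for v > 0
                .limit(limit + 1)                    // one extra row to compensate
                .collect(Collectors.toList());
        return fetched.stream()
                .filter(r -> r[1] > 0)               // trim the v == 0 rows
                .limit(limit)
                .collect(Collectors.toList());
    }

    // Ground truth: how many rows actually satisfy v > 0.
    static long trueCount() {
        return Arrays.stream(ROWS).filter(r -> r[1] > 0).count();
    }

    public static void main(String[] args) {
        // Three rows satisfy v > 0, yet LIMIT 3 returns only two: the +1
        // compensation covers one excluded row, but each partition in the
        // range can contribute its own v == 0 row.
        System.out.println(trueCount());             // 3
        System.out.println(strictQuery(3).size());   // 2
    }
}
```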
[jira] [Commented] (CASSANDRA-6892) Cassandra 2.0.x validates Thrift columns incorrectly and causes InvalidRequestException
[ https://issues.apache.org/jira/browse/CASSANDRA-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955927#comment-13955927 ] Christian Spriegel commented on CASSANDRA-6892:
---
I ran my tests again with the second patch. Again no errors.

Cassandra 2.0.x validates Thrift columns incorrectly and causes InvalidRequestException
---------------------------------------------------------------------------------------
Key: CASSANDRA-6892
URL: https://issues.apache.org/jira/browse/CASSANDRA-6892
Project: Cassandra
Issue Type: Bug
Components: API
Reporter: Christian Spriegel
Assignee: Tyler Hobbs
Priority: Minor
Fix For: 2.0.7
Attachments: 6892-2.0-v2.txt, 6892-2.0.txt, CASSANDRA-6892_V1.patch

I just upgraded my local dev machine to Cassandra 2.0, which causes one of my automated tests to fail now. With the latest 1.2.x it was working fine. The exception I get on my client (using Hector) is:
{code}
me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:(Expected 8 or 0 byte long (21)) [MDS_0][MasterdataIndex][key2] failed validation)
    at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:52)
    at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:265)
    at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:113)
    at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
    at me.prettyprint.cassandra.service.template.AbstractColumnFamilyTemplate.executeBatch(AbstractColumnFamilyTemplate.java:115)
    at me.prettyprint.cassandra.service.template.AbstractColumnFamilyTemplate.executeIfNotBatched(AbstractColumnFamilyTemplate.java:163)
    at me.prettyprint.cassandra.service.template.ColumnFamilyTemplate.update(ColumnFamilyTemplate.java:69)
    at com.mycompany.spring3utils.dataaccess.cassandra.AbstractCassandraDAO.doUpdate(AbstractCassandraDAO.java:482)
Caused by: InvalidRequestException(why:(Expected 8 or 0 byte long (21)) [MDS_0][MasterdataIndex][key2] failed validation)
    at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20833)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:964)
    at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:950)
    at me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:246)
    at me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:1)
    at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:104)
    at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:258)
    ... 46 more
{code}
The schema of my column family is:
{code}
create column family MasterdataIndex
    with compression_options = {sstable_compression:SnappyCompressor, chunk_length_kb:64}
    and comparator = UTF8Type
    and key_validation_class = 'CompositeType(UTF8Type,LongType)'
    and default_validation_class = BytesType;
{code}
From the error message it looks like Cassandra is trying to validate the value with the key validator! (My value in this case is 21 bytes long.) I studied the Cassandra 2.0 code and found something wrong. It seems CFMetaData.addDefaultKeyAliases passes the keyValidator into ColumnDefinition.partitionKeyDef. Inside ColumnDefinition the validator is expected to be the value validator! In CFMetaData:
{code}
private List<ColumnDefinition> addDefaultKeyAliases(List<ColumnDefinition> pkCols)
{
    for (int i = 0; i < pkCols.size(); i++)
    {
        if (pkCols.get(i) == null)
        {
            Integer idx = null;
            AbstractType<?> type = keyValidator;
            if (keyValidator instanceof CompositeType)
            {
                idx = i;
                type = ((CompositeType)keyValidator).types.get(i);
            }
            // For compatibility sake, we call the first alias 'key' rather than 'key1'. This
            // is inconsistent with column alias, but it's probably not worth risking breaking compatibility now.
            ByteBuffer name = ByteBufferUtil.bytes(i == 0 ? DEFAULT_KEY_ALIAS : DEFAULT_KEY_ALIAS + (i + 1));
            ColumnDefinition newDef = ColumnDefinition.partitionKeyDef(name, type, idx); // type is LongType in my case, as it uses keyValidator !!!
            column_metadata.put(newDef.name, newDef);
            pkCols.set(i,
{code}
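The reported symptom follows directly from mixing up the two validators, which a minimal hypothetical sketch can show (invented names; only the validation rules mirror LongType and BytesType): a 21-byte value passes the configured BytesType check but fails when it is mistakenly checked against the 8-byte LongType key component, producing exactly an "Expected 8 or 0 byte long (21)" style rejection.

```java
import java.nio.ByteBuffer;

public class ValidatorMixup {
    // Stand-in for LongType.validate: a serialized long must be 8 bytes (or empty).
    static boolean validLong(ByteBuffer b) {
        return b.remaining() == 8 || b.remaining() == 0;
    }

    // Stand-in for BytesType.validate: any byte buffer is acceptable.
    static boolean validBytes(ByteBuffer b) {
        return true;
    }

    public static void main(String[] args) {
        ByteBuffer value = ByteBuffer.allocate(21); // the 21-byte column value
        // default_validation_class = BytesType: the value is fine
        System.out.println(validBytes(value)); // true
        // ...but validated against the key's LongType component it is rejected,
        // matching "Expected 8 or 0 byte long (21)"
        System.out.println(validLong(value));  // false
    }
}
```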
[jira] [Commented] (CASSANDRA-6892) Cassandra 2.0.x validates Thrift columns incorrectly and causes InvalidRequestException
[ https://issues.apache.org/jira/browse/CASSANDRA-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13954862#comment-13954862 ] Christian Spriegel commented on CASSANDRA-6892:
---
[~thobbs]: I ran my tests with your patch. Works fine for me.
[jira] [Commented] (CASSANDRA-6892) Cassandra 2.0.x validates Thrift columns incorrectly and causes InvalidRequestException
[ https://issues.apache.org/jira/browse/CASSANDRA-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13942946#comment-13942946 ] Christian Spriegel commented on CASSANDRA-6892:
---
I tried both:
- First I upgraded from C* 1.2.15 to 2.0.6.
- Then I deleted the entire data folder and started with a fresh installation.
I will try to provide more information on the weekend...
[jira] [Commented] (CASSANDRA-6892) Cassandra 2.0.x validates Thrift columns incorrectly and causes InvalidRequestException
[ https://issues.apache.org/jira/browse/CASSANDRA-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943457#comment-13943457 ]

Christian Spriegel commented on CASSANDRA-6892:
---

I think this must be some special case I am stumbling upon. This error only happens for a single test case (out of 1100), and there are actually some values in that column family.

My schema creation is a bit more complicated: first I create the schema by calling cassandra-cli from inside my unit test. Right after that I modify the schema through Hector/Thrift and disable gc-grace (among other settings) for my test schema:

{code}
final Cluster cluster = keyspace.getCluster();
final String keyspaceName = keyspace.getKeyspace().getKeyspaceName();
final KeyspaceDefinition keyspaceDefinition = cluster.describeKeyspace(keyspaceName);
final List<ColumnFamilyDefinition> cfDefs = keyspaceDefinition.getCfDefs();
for (final ColumnFamilyDefinition cfDef : cfDefs)
{
    cfDef.setGcGraceSeconds(0);
    cfDef.setMemtableFlushAfterMins(Integer.MAX_VALUE);
    cfDef.setReadRepairChance(0.0);
    cfDef.setKeyCacheSavePeriodInSeconds(Integer.MAX_VALUE);
    cluster.updateColumnFamily(cfDef);
}
{code}

I could imagine that modifying the schema through Thrift breaks/broke the schema for 2.0.

Cassandra 2.0.x validates Thrift columns incorrectly and causes InvalidRequestException
---
Key: CASSANDRA-6892
URL: https://issues.apache.org/jira/browse/CASSANDRA-6892
Project: Cassandra
Issue Type: Bug
Components: API
Reporter: Christian Spriegel
Assignee: Tyler Hobbs
Priority: Minor
Fix For: 2.0.7
Attachments: CASSANDRA-6892_V1.patch
[jira] [Commented] (CASSANDRA-6892) Cassandra 2.0.x validates Thrift columns incorrectly and causes InvalidRequestException
[ https://issues.apache.org/jira/browse/CASSANDRA-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943555#comment-13943555 ]

Christian Spriegel commented on CASSANDRA-6892:
---

Tyler can reproduce the issue now, but I am posting this anyway :-)

The Thrift schema:
{code}
create column family MasterdataIndex
  with compression_options = {sstable_compression:SnappyCompressor, chunk_length_kb:64}
  and comparator = UTF8Type
  and key_validation_class = 'CompositeType(UTF8Type,LongType)'
  and default_validation_class = BytesType;
{code}

With the following data:
{code}
[default@MDS_0] list MasterdataIndex;
Using default limit of 100
Using default cell limit of 100
-------------------
RowKey: G:1
=> (name=GOOD, value=474f4f44, timestamp=1395434320342000)
-------------------
RowKey: K:1
=> (name=key0, value=160218046b6579301804474f4f4416c29a0c16, timestamp=1395434320347001)
=> (name=key1, value=160218046b6579311804474f4f4416c49a0c16, timestamp=1395434320351001)

2 Rows Returned.
Elapsed time: 30 msec(s).
[default@MDS_0]
{code}

(and a key2 which failed to insert) results in the following CFMetaData.toString():
{code}
org.apache.cassandra.config.CFMetaData@54196399[
  cfId=1d46d5a5-726e-3610-b08e-ebeca28b6325,
  ksName=MDS_0,
  cfName=MasterdataIndex,
  cfType=Standard,
  comparator=org.apache.cassandra.db.marshal.UTF8Type,
  comment=,
  readRepairChance=0.1,
  dclocalReadRepairChance=0.0,
  replicateOnWrite=true,
  gcGraceSeconds=864000,
  defaultValidator=org.apache.cassandra.db.marshal.BytesType,
  keyValidator=org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.LongType),
  minCompactionThreshold=4,
  maxCompactionThreshold=32,
  column_metadata={
    java.nio.HeapByteBuffer[pos=0 lim=4 cap=4]=ColumnDefinition{name=6b657932, validator=org.apache.cassandra.db.marshal.LongType, type=PARTITION_KEY, componentIndex=1, indexName=null, indexType=null},
    java.nio.HeapByteBuffer[pos=0 lim=3 cap=3]=ColumnDefinition{name=6b6579, validator=org.apache.cassandra.db.marshal.UTF8Type, type=PARTITION_KEY, componentIndex=0, indexName=null, indexType=null}
  },
  compactionStrategyClass=class org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy,
  compactionStrategyOptions={},
  compressionOptions={sstable_compression=org.apache.cassandra.io.compress.SnappyCompressor, chunk_length_kb=64},
  bloomFilterFpChance=null,
  memtable_flush_period_in_ms=0,
  caching=KEYS_ONLY,
  defaultTimeToLive=0,
  speculative_retry=NONE,
  indexInterval=128,
  populateIoCacheOnFlush=false,
  droppedColumns={},
  triggers={}]
{code}
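As a quick sanity check on the CFMetaData dump above: the two ColumnDefinition names are hex-encoded ASCII, and decoding them shows that column_metadata holds the key aliases themselves. A small standalone helper (not Cassandra code) makes this visible:

```java
// Small standalone helper (not Cassandra code) to decode the hex column
// names from the CFMetaData dump above.
public class HexNames
{
    static String decodeHex(String hex)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < hex.length(); i += 2)
            sb.append((char) Integer.parseInt(hex.substring(i, i + 2), 16));
        return sb.toString();
    }

    public static void main(String[] args)
    {
        System.out.println(decodeHex("6b6579"));   // the UTF8Type definition: "key"
        System.out.println(decodeHex("6b657932")); // the LongType definition: "key2"
    }
}
```

So the PARTITION_KEY definitions are registered under the names "key" and "key2" — exactly the names a Thrift client might also use for regular cells.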
[jira] [Commented] (CASSANDRA-6892) Cassandra 2.0.x validates Thrift columns incorrectly and causes InvalidRequestException
[ https://issues.apache.org/jira/browse/CASSANDRA-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943580#comment-13943580 ]

Christian Spriegel commented on CASSANDRA-6892:
---

OK, it's easily reproducible with the CLI:
{code}
[default@MDS_0] list MasterdataIndex;
Using default limit of 100
Using default cell limit of 100
-------------------
RowKey: G:1
=> (name=GOOD, value=474f4f44, timestamp=1395434320342000)
-------------------
RowKey: K:1
=> (name=key0, value=160218046b6579301804474f4f4416c29a0c16, timestamp=1395434320347001)
=> (name=key1, value=160218046b6579311804474f4f4416c49a0c16, timestamp=1395434320351001)

2 Rows Returned.
Elapsed time: 2.6 msec(s).
[default@MDS_0] set MasterdataIndex['K:1'][key2] = 1122112211221122112211221122AAFF11AAFF;
(Expected 8 or 0 byte long (19)) [MDS_0][MasterdataIndex][key2] failed validation
InvalidRequestException(why:(Expected 8 or 0 byte long (19)) [MDS_0][MasterdataIndex][key2] failed validation)
	at org.apache.cassandra.thrift.Cassandra$insert_result.read(Cassandra.java:16640)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
	at org.apache.cassandra.thrift.Cassandra$Client.recv_insert(Cassandra.java:848)
	at org.apache.cassandra.thrift.Cassandra$Client.insert(Cassandra.java:832)
	at org.apache.cassandra.cli.CliClient.executeSet(CliClient.java:982)
	at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:225)
	at org.apache.cassandra.cli.CliMain.processStatementInteractive(CliMain.java:213)
	at org.apache.cassandra.cli.CliMain.main(CliMain.java:343)
[default@MDS_0]
{code}
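The error text in the session above comes from a length check on long values: a serialized long must be exactly 8 bytes, or empty. A minimal re-implementation of that check (paraphrased from the error message, not the actual org.apache.cassandra.db.marshal.LongType source) reproduces the message for the 19-byte payload:

```java
import java.nio.ByteBuffer;

// Paraphrased length check mirroring the error text in the CLI session
// above; an illustrative re-implementation, not the actual LongType source.
public class LongLengthCheck
{
    static void validateAsLong(ByteBuffer bytes)
    {
        if (bytes.remaining() != 8 && bytes.remaining() != 0)
            throw new IllegalArgumentException(
                "Expected 8 or 0 byte long (" + bytes.remaining() + ")");
    }

    public static void main(String[] args)
    {
        // 1122112211221122112211221122AAFF11AAFF is 38 hex digits = 19 bytes
        try
        {
            validateAsLong(ByteBuffer.wrap(new byte[19]));
        }
        catch (IllegalArgumentException e)
        {
            System.out.println(e.getMessage()); // Expected 8 or 0 byte long (19)
        }
    }
}
```

This is why the cell is rejected only once the wrong validator is chosen: under the default BytesType validator a 19-byte value is perfectly legal.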
[jira] [Comment Edited] (CASSANDRA-6892) Cassandra 2.0.x validates Thrift columns incorrectly and causes InvalidRequestException
[ https://issues.apache.org/jira/browse/CASSANDRA-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943580#comment-13943580 ]

Christian Spriegel edited comment on CASSANDRA-6892 at 3/21/14 9:30 PM:
---

OK, it's easily reproducible with the CLI:
{code}
[default@MDS_0] set MasterdataIndex['K:1'][key0] = 1122112211221122112211221122AAFF11AAFF;
Value inserted.
Elapsed time: 1.08 msec(s).
[default@MDS_0] set MasterdataIndex['K:1'][key1] = 1122112211221122112211221122AAFF11AAFF;
Value inserted.
Elapsed time: 1.08 msec(s).
[default@MDS_0] set MasterdataIndex['K:1'][key] = 1122112211221122112211221122AAFF11AAFF;
(String didn't validate.) [MDS_0][MasterdataIndex][key] failed validation
InvalidRequestException(why:(String didn't validate.) [MDS_0][MasterdataIndex][key] failed validation)
	at org.apache.cassandra.thrift.Cassandra$insert_result.read(Cassandra.java:16640)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
	at org.apache.cassandra.thrift.Cassandra$Client.recv_insert(Cassandra.java:848)
	at org.apache.cassandra.thrift.Cassandra$Client.insert(Cassandra.java:832)
	at org.apache.cassandra.cli.CliClient.executeSet(CliClient.java:982)
	at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:225)
	at org.apache.cassandra.cli.CliMain.processStatementInteractive(CliMain.java:213)
	at org.apache.cassandra.cli.CliMain.main(CliMain.java:343)
[default@MDS_0] set MasterdataIndex['K:1'][key2] = 1122112211221122112211221122AAFF11AAFF;
(Expected 8 or 0 byte long (19)) [MDS_0][MasterdataIndex][key2] failed validation
InvalidRequestException(why:(Expected 8 or 0 byte long (19)) [MDS_0][MasterdataIndex][key2] failed validation)
	at org.apache.cassandra.thrift.Cassandra$insert_result.read(Cassandra.java:16640)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
	at org.apache.cassandra.thrift.Cassandra$Client.recv_insert(Cassandra.java:848)
	at org.apache.cassandra.thrift.Cassandra$Client.insert(Cassandra.java:832)
	at org.apache.cassandra.cli.CliClient.executeSet(CliClient.java:982)
	at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:225)
	at org.apache.cassandra.cli.CliMain.processStatementInteractive(CliMain.java:213)
	at org.apache.cassandra.cli.CliMain.main(CliMain.java:343)
[default@MDS_0] list MasterdataIndex;
Using default limit of 100
Using default cell limit of 100
-------------------
RowKey: G:1
=> (name=GOOD, value=474f4f44, timestamp=1395434320342000)
-------------------
RowKey: K:1
=> (name=key0, value=1122112211221122112211221122aaff11aaff, timestamp=1395437337904000)
=> (name=key1, value=1122112211221122112211221122aaff11aaff, timestamp=1395437341326000)

2 Rows Returned.
Elapsed time: 2.35 msec(s).
[default@MDS_0]
{code}
[jira] [Commented] (CASSANDRA-6892) Cassandra 2.0.x validates Thrift columns incorrectly and causes InvalidRequestException
[ https://issues.apache.org/jira/browse/CASSANDRA-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943642#comment-13943642 ]

Christian Spriegel commented on CASSANDRA-6892:
---

With CQLSH the insert works fine:
{code}
cqlsh:MDS> select * from MasterdataIndex;

 key | key2 | column1 | value
-----+------+---------+------------------------------------------
   K |    1 |    key1 | 0x1122112211221122112211221122aaff11aaff

cqlsh:MDS> insert into MasterdataIndex (key, key2, column1, value) VALUES ('K',1,'key2',0x1122112211221122112211221122aaff11aaff);
cqlsh:MDS> select * from MasterdataIndex;

 key | key2 | column1 | value
-----+------+---------+------------------------------------------
   K |    1 |    key1 | 0x1122112211221122112211221122aaff11aaff
   K |    1 |    key2 | 0x1122112211221122112211221122aaff11aaff

cqlsh:MDS>
{code}

I can even list the value afterwards using the CLI:
{code}
[default@MDS] list MasterdataIndex;
Using default limit of 100
Using default cell limit of 100
-------------------
RowKey: K:1
=> (name=key1, value=1122112211221122112211221122aaff11aaff, timestamp=139543981152)
=> (name=key2, value=1122112211221122112211221122aaff11aaff, timestamp=1395439922582000)

1 Row Returned.
Elapsed time: 2.02 msec(s).
[default@MDS]
{code}
[jira] [Commented] (CASSANDRA-6892) Cassandra 2.0.x validates Thrift columns incorrectly and causes InvalidRequestException
[ https://issues.apache.org/jira/browse/CASSANDRA-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943723#comment-13943723 ]

Christian Spriegel commented on CASSANDRA-6892:
---

IMHO, using the Thrift column name to look up column_metadata must be wrong. I think ThriftValidation.validateColumnData() should not work on CFMetaData.column_metadata, but on regularColumns or regularAndStaticColumns() instead.

[~jbellis], [~thobbs]: do you guys think I am on the right track here?
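The direction proposed in this comment can be sketched as follows: restrict the validator lookup to regular-column definitions, so a partition-key alias that shares a cell's name no longer shadows the default validator. Names here are illustrative, not the committed patch:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the proposed direction (not the committed patch):
// consult only REGULAR definitions when validating a Thrift cell value.
public class RegularOnlyLookupSketch
{
    enum Kind { PARTITION_KEY, REGULAR }

    static final Map<String, Kind> columnMetadata = new HashMap<>();
    static
    {
        // The key aliases land in column_metadata as PARTITION_KEY entries:
        columnMetadata.put("key", Kind.PARTITION_KEY);
        columnMetadata.put("key2", Kind.PARTITION_KEY);
    }

    // Before the fix, any hit in column_metadata supplied the validator, so a
    // cell named "key2" resolved to the key component's LongType. Skipping
    // PARTITION_KEY entries lets it fall back to the default validator.
    static String validatorFor(String cellName)
    {
        Kind kind = columnMetadata.get(cellName);
        if (kind == null || kind == Kind.PARTITION_KEY)
            return "BytesType"; // default_validation_class of the CF above
        return "per-column validator";
    }

    public static void main(String[] args)
    {
        System.out.println("key2 -> " + validatorFor("key2")); // BytesType
    }
}
```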
[jira] [Created] (CASSANDRA-6892) Cassandra 2.0.x validates Thrift columns incorrectly and causes InvalidRequestException
Christian Spriegel created CASSANDRA-6892: - Summary: Cassandra 2.0.x validates Thrift columns incorrectly and causes InvalidRequestException Key: CASSANDRA-6892 URL: https://issues.apache.org/jira/browse/CASSANDRA-6892 Project: Cassandra Issue Type: Bug Components: Core Reporter: Christian Spriegel

I just upgraded my local dev machine to Cassandra 2.0, which causes one of my automated tests to fail now. With the latest 1.2.x it was working fine.

The exception I get on my client (using Hector) is:
{code}
me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:(Expected 8 or 0 byte long (21)) [MDS_0][MasterdataIndex][key2] failed validation)
	at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:52)
	at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:265)
	at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:113)
	at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
	at me.prettyprint.cassandra.service.template.AbstractColumnFamilyTemplate.executeBatch(AbstractColumnFamilyTemplate.java:115)
	at me.prettyprint.cassandra.service.template.AbstractColumnFamilyTemplate.executeIfNotBatched(AbstractColumnFamilyTemplate.java:163)
	at me.prettyprint.cassandra.service.template.ColumnFamilyTemplate.update(ColumnFamilyTemplate.java:69)
	at com.mycompany.spring3utils.dataaccess.cassandra.AbstractCassandraDAO.doUpdate(AbstractCassandraDAO.java:482)
Caused by: InvalidRequestException(why:(Expected 8 or 0 byte long (21)) [MDS_0][MasterdataIndex][key2] failed validation)
	at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20833)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
	at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:964)
	at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:950)
	at me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:246)
	at me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:1)
	at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:104)
	at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:258)
	... 46 more
{code}

The schema of my column family is:
{code}
create column family MasterdataIndex
  with compression_options = {sstable_compression:SnappyCompressor, chunk_length_kb:64}
  and comparator = UTF8Type
  and key_validation_class = 'CompositeType(UTF8Type,LongType)'
  and default_validation_class = BytesType;
{code}

From the error message it looks like Cassandra is trying to validate the value with the key validator! (My value in this case is 21 bytes long.)

I studied the Cassandra 2.0 code and found something wrong. It seems CFMetaData.addDefaultKeyAliases passes the keyValidator into ColumnDefinition.partitionKeyDef, but inside ColumnDefinition the validator is expected to be the value validator.

In CFMetaData:
{code}
private List<ColumnDefinition> addDefaultKeyAliases(List<ColumnDefinition> pkCols)
{
    for (int i = 0; i < pkCols.size(); i++)
    {
        if (pkCols.get(i) == null)
        {
            Integer idx = null;
            AbstractType<?> type = keyValidator;
            if (keyValidator instanceof CompositeType)
            {
                idx = i;
                type = ((CompositeType)keyValidator).types.get(i);
            }
            // For compatibility sake, we call the first alias 'key' rather than 'key1'. This
            // is inconsistent with column alias, but it's probably not worth risking breaking compatibility now.
            ByteBuffer name = ByteBufferUtil.bytes(i == 0 ? DEFAULT_KEY_ALIAS : DEFAULT_KEY_ALIAS + (i + 1));
            ColumnDefinition newDef = ColumnDefinition.partitionKeyDef(name, type, idx); // type is LongType in my case, as it uses keyValidator!
            column_metadata.put(newDef.name, newDef);
            pkCols.set(i, newDef);
        }
    }
    return pkCols;
}
...
public AbstractType<?> getValidator() // in ThriftValidation this is expected to be the value validator!
{
    return validator;
}
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
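The "Expected 8 or 0 byte long (21)" message comes from a LongType-style length check. The following is a minimal sketch of that check, a hypothetical simplification rather than Cassandra's actual LongType code: a serialized long must be exactly 8 bytes (or 0 bytes for an empty value), so a 21-byte blob validated against a long type fails with exactly this kind of message.

```java
import java.nio.ByteBuffer;

public class LongValidation {
    // Hypothetical simplification of a LongType-style length check:
    // a serialized long must be exactly 8 bytes, or 0 bytes for "empty".
    public static boolean isValidLong(ByteBuffer value) {
        int len = value.remaining();
        return len == 8 || len == 0;
    }

    public static void main(String[] args) {
        ByteBuffer longValue = ByteBuffer.allocate(8);  // a proper 8-byte long
        ByteBuffer blobValue = ByteBuffer.allocate(21); // a 21-byte value, as in the report

        System.out.println(isValidLong(longValue)); // true
        // A 21-byte value checked against a long component fails, which would
        // surface as "Expected 8 or 0 byte long (21)".
        System.out.println(isValidLong(blobValue)); // false
    }
}
```

This is why the report points at the key validator: the 21-byte value should have been checked against default_validation_class (BytesType, which accepts anything), not against the LongType component of the key.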
[jira] [Updated] (CASSANDRA-6892) Cassandra 2.0.x validates Thrift columns incorrectly and causes InvalidRequestException
[ https://issues.apache.org/jira/browse/CASSANDRA-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Spriegel updated CASSANDRA-6892: -- Reproduced In: 2.0.6
[jira] [Commented] (CASSANDRA-6892) Cassandra 2.0.x validates Thrift columns incorrectly and causes InvalidRequestException
[ https://issues.apache.org/jira/browse/CASSANDRA-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941557#comment-13941557 ] Christian Spriegel commented on CASSANDRA-6892: ---

I think addDefaultKeyAliases() should use the defaultValidator instead of the keyValidator to instantiate the ColumnDefinition. Looks like a copy-paste mistake :-)
{code}
private List<ColumnDefinition> addDefaultKeyAliases(List<ColumnDefinition> pkCols)
{
    for (int i = 0; i < pkCols.size(); i++)
    {
        if (pkCols.get(i) == null)
        {
            Integer idx = null;
            AbstractType<?> type = keyValidator;
            if (keyValidator instanceof CompositeType)
            {
                idx = i;
                type = ((CompositeType)keyValidator).types.get(i);
            }
            // For compatibility sake, we call the first alias 'key' rather than 'key1'. This
            // is inconsistent with column alias, but it's probably not worth risking breaking compatibility now.
            ByteBuffer name = ByteBufferUtil.bytes(i == 0 ? DEFAULT_KEY_ALIAS : DEFAULT_KEY_ALIAS + (i + 1));
            ColumnDefinition newDef = ColumnDefinition.partitionKeyDef(name, defaultValidator, idx); // SHOULD USE defaultValidator HERE!
            column_metadata.put(newDef.name, newDef);
            pkCols.set(i, newDef);
        }
    }
    return pkCols;
}
{code}
[jira] [Updated] (CASSANDRA-6892) Cassandra 2.0.x validates Thrift columns incorrectly and causes InvalidRequestException
[ https://issues.apache.org/jira/browse/CASSANDRA-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Spriegel updated CASSANDRA-6892: -- Attachment: CASSANDRA-6892_V1.patch

Attached patch file.

Assignee: Christian Spriegel Fix For: 2.0.7 Attachments: CASSANDRA-6892_V1.patch
[jira] [Commented] (CASSANDRA-6892) Cassandra 2.0.x validates Thrift columns incorrectly and causes InvalidRequestException
[ https://issues.apache.org/jira/browse/CASSANDRA-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941567#comment-13941567 ] Christian Spriegel commented on CASSANDRA-6892: ---

Two questions remain:
- Should it even call addDefaultKeyAliases() for my legacy Thrift CF?
- Why did it work for some columns and fail for just some? (My test was able to insert two columns and failed with the third column it was inserting into this row.)
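One detail worth noting from the quoted code, sketched below under the assumption that alias-name collisions are relevant to the second question: addDefaultKeyAliases names the synthetic partition-key columns "key", "key2", "key3", ..., and the failing column in the error above is [key2], which is exactly the alias that maps to the second (LongType) component of CompositeType(UTF8Type,LongType). The class and method names here are illustrative, not Cassandra source; only the naming rule is taken from the quoted loop.

```java
import java.util.ArrayList;
import java.util.List;

public class AliasNaming {
    static final String DEFAULT_KEY_ALIAS = "key";

    // Mirrors only the naming rule from the quoted addDefaultKeyAliases loop:
    // component 0 is called "key", component i (for i > 0) is "key" + (i + 1).
    public static List<String> defaultKeyAliases(int componentCount) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < componentCount; i++)
            names.add(i == 0 ? DEFAULT_KEY_ALIAS : DEFAULT_KEY_ALIAS + (i + 1));
        return names;
    }

    public static void main(String[] args) {
        // CompositeType(UTF8Type,LongType) has two components, so the
        // synthetic aliases are "key" (UTF8Type) and "key2" (LongType).
        System.out.println(defaultKeyAliases(2)); // [key, key2]
    }
}
```

So a regular Thrift column that happens to be named "key2" would collide with the synthetic alias carrying the LongType key component, which would explain why only some columns failed validation.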
[jira] [Updated] (CASSANDRA-6892) Cassandra 2.0.x validates Thrift columns incorrectly and causes InvalidRequestException
[ https://issues.apache.org/jira/browse/CASSANDRA-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Spriegel updated CASSANDRA-6892: -- Assignee: (was: Christian Spriegel)
[jira] [Comment Edited] (CASSANDRA-6892) Cassandra 2.0.x validates Thrift columns incorrectly and causes InvalidRequestException
[ https://issues.apache.org/jira/browse/CASSANDRA-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941557#comment-13941557 ] Christian Spriegel edited comment on CASSANDRA-6892 at 3/20/14 10:27 AM: -

Edit: The following is not true. It probably makes sense for index columns.

I think addDefaultKeyAliases() should use the defaultValidator instead of the keyValidator to instantiate the ColumnDefinition. Looks like a copy-paste mistake :-)
{code}
private List<ColumnDefinition> addDefaultKeyAliases(List<ColumnDefinition> pkCols)
{
    for (int i = 0; i < pkCols.size(); i++)
    {
        if (pkCols.get(i) == null)
        {
            Integer idx = null;
            AbstractType<?> type = keyValidator;
            if (keyValidator instanceof CompositeType)
            {
                idx = i;
                type = ((CompositeType)keyValidator).types.get(i);
            }
            // For compatibility sake, we call the first alias 'key' rather than 'key1'. This
            // is inconsistent with column alias, but it's probably not worth risking breaking compatibility now.
            ByteBuffer name = ByteBufferUtil.bytes(i == 0 ? DEFAULT_KEY_ALIAS : DEFAULT_KEY_ALIAS + (i + 1));
            ColumnDefinition newDef = ColumnDefinition.partitionKeyDef(name, defaultValidator, idx); // SHOULD USE defaultValidator HERE!
            column_metadata.put(newDef.name, newDef);
            pkCols.set(i, newDef);
        }
    }
    return pkCols;
}
{code}

was (Author: christianmovi): the same comment and code block, without the leading "Edit:" note.
[jira] [Commented] (CASSANDRA-4206) AssertionError: originally calculated column size of 629444349 but now it is 588008950
[ https://issues.apache.org/jira/browse/CASSANDRA-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855152#comment-13855152 ] Christian Spriegel commented on CASSANDRA-4206: ---

I can confirm that this error also occurs with multithreaded_compaction=false.

AssertionError: originally calculated column size of 629444349 but now it is 588008950 -- Key: CASSANDRA-4206 URL: https://issues.apache.org/jira/browse/CASSANDRA-4206 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.0.9 Environment: Debian Squeeze Linux, kernel 2.6.32, sun-java6-bin 6.26-0squeeze1 Reporter: Patrik Modesto

I have a 4-node cluster of Cassandra 1.0.9. There is a rfTest3 keyspace with RF=3 and one CF with two secondary indexes. I'm importing data into this CF using a Hadoop MapReduce job; each row has fewer than 10 columns. From JMX:
MaxRowSize: 1597
MeanRowSize: 369
And there are some tens of millions of rows. It's write-heavy usage and there is big pressure on each node; there are quite some dropped mutations on each node.

After ~12 hours of inserting I see these assertion exceptions on 3 out of 4 nodes:
{noformat}
ERROR 06:25:40,124 Fatal exception in thread Thread[HintedHandoff:1,1,main]
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.AssertionError: originally calculated column size of 629444349 but now it is 588008950
	at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpointInternal(HintedHandOffManager.java:388)
	at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:256)
	at org.apache.cassandra.db.HintedHandOffManager.access$300(HintedHandOffManager.java:84)
	at org.apache.cassandra.db.HintedHandOffManager$3.runMayThrow(HintedHandOffManager.java:437)
	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Caused by: java.util.concurrent.ExecutionException: java.lang.AssertionError: originally calculated column size of 629444349 but now it is 588008950
	at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
	at java.util.concurrent.FutureTask.get(FutureTask.java:83)
	at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpointInternal(HintedHandOffManager.java:384)
	... 7 more
Caused by: java.lang.AssertionError: originally calculated column size of 629444349 but now it is 588008950
	at org.apache.cassandra.db.compaction.LazilyCompactedRow.write(LazilyCompactedRow.java:124)
	at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:160)
	at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:161)
	at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:380)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	... 3 more
{noformat}
A few lines regarding hints from the output.log:
{noformat}
INFO 06:21:26,202 Compacting large row system/HintsColumnFamily:7000 (1712834057 bytes) incrementally
INFO 06:22:52,610 Compacting large row system/HintsColumnFamily:1000 (2616073981 bytes) incrementally
INFO 06:22:59,111 flushing high-traffic column family CFS(Keyspace='system', ColumnFamily='HintsColumnFamily') (estimated 305147360 bytes)
INFO 06:22:59,813 Enqueuing flush of Memtable-HintsColumnFamily@833933926(3814342/305147360 serialized/live bytes, 7452 ops)
INFO 06:22:59,814 Writing Memtable-HintsColumnFamily@833933926(3814342/305147360 serialized/live bytes, 7452 ops)
{noformat}
I think the problem may be somehow connected to an IntegerType secondary index. I had a different problem with a CF with two secondary indexes, the first UTF8Type, the second IntegerType. After a few hours of inserting data in the afternoon and a midnight repair+compact, the next day I couldn't find any row using the IntegerType secondary index. The output was like this:
{noformat}
[default@rfTest3] get IndexTest where col1 = '3230727:http://zaskolak.cz/download.php';
---
RowKey: 3230727:8383582:http://zaskolak.cz/download.php
=> (column=col1, value=3230727:http://zaskolak.cz/download.php, timestamp=1335348630332000)
=> (column=col2, value=8383582, timestamp=1335348630332000)
{noformat}
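The assertion message "originally calculated column size of X but now it is Y" implies the row's serialized size is computed twice, once up front and once while writing. The sketch below is a speculative illustration of that failure mode, not Cassandra's LazilyCompactedRow code: if the underlying data changes between the two passes, the second measurement disagrees with the first.

```java
import java.util.ArrayList;
import java.util.List;

public class TwoPassSize {
    // Measure the total serialized size of a row's columns (here, just the
    // sum of the byte lengths, as a stand-in for real serialization).
    public static int measure(List<byte[]> columns) {
        int total = 0;
        for (byte[] col : columns)
            total += col.length;
        return total;
    }

    public static void main(String[] args) {
        List<byte[]> row = new ArrayList<>();
        row.add(new byte[8]);
        row.add(new byte[13]);

        int originallyCalculated = measure(row); // first pass: 21 bytes

        // Something mutates the data between the two passes...
        row.remove(1);

        int nowItIs = measure(row); // second pass: 8 bytes

        // In Cassandra, a mismatch like this surfaces as:
        // AssertionError: originally calculated column size of 21 but now it is 8
        System.out.println(originallyCalculated == nowItIs); // false
    }
}
```

Under this reading, the bug hunt is about finding what changes (or is measured inconsistently) between the estimate pass and the write pass during compaction or hint delivery.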
[jira] [Commented] (CASSANDRA-4206) AssertionError: originally calculated column size of 629444349 but now it is 588008950
[ https://issues.apache.org/jira/browse/CASSANDRA-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841590#comment-13841590 ]

Christian Spriegel commented on CASSANDRA-4206:
---

Seen with 1.2.11:
{code}
java.lang.AssertionError: originally calculated column size of 44470356 but now it is 44470410
        at org.apache.cassandra.db.compaction.LazilyCompactedRow.write(LazilyCompactedRow.java:135)
        at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:160)
        at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:162)
        at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
        at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
        at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
        at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:208)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
{code}

AssertionError: originally calculated column size of 629444349 but now it is 588008950
--
Key: CASSANDRA-4206
URL: https://issues.apache.org/jira/browse/CASSANDRA-4206
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.0.9
Environment: Debian Squeeze Linux, kernel 2.6.32, sun-java6-bin 6.26-0squeeze1
Reporter: Patrik Modesto

I have a 4-node cluster of Cassandra 1.0.9. There is a rfTest3 keyspace with RF=3 and one CF with two secondary indexes.
I'm importing data into this CF using a Hadoop MapReduce job; each row has fewer than 10 columns. From JMX: MaxRowSize: 1597, MeanRowSize: 369. And there are some tens of millions of rows. It's write-heavy usage, there is a lot of pressure on each node, and there are quite a few dropped mutations on each node. After ~12 hours of inserting I see the assertion exceptions quoted in full above (the HintedHandoff RuntimeException caused by the AssertionError in LazilyCompactedRow.write) on three out of four nodes.
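In the reported trace the recomputed size is smaller than the original (629444349 vs. 588008950), i.e. the row shrank between the two compaction passes. One hypothetical way a row can shrink mid-compaction (an illustration only, not a confirmed root cause for this ticket) is columns ceasing to be live, for example via TTL expiry, between the size pass and the write pass:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of size drift from columns expiring
// between two passes over the same row. Not actual Cassandra code.
public class ExpiringSizeDrift {
    static class Column {
        final int size;
        final long expiresAtMillis;

        Column(int size, long expiresAtMillis) {
            this.size = size;
            this.expiresAtMillis = expiresAtMillis;
        }

        boolean isLive(long nowMillis) {
            return nowMillis < expiresAtMillis;
        }
    }

    // Serialized size of all columns still live at the given time.
    static long liveSize(List<Column> cols, long nowMillis) {
        long size = 0;
        for (Column c : cols) {
            if (c.isLive(nowMillis)) {
                size += c.size;
            }
        }
        return size;
    }

    public static void main(String[] args) {
        List<Column> cols = new ArrayList<>();
        cols.add(new Column(100, Long.MAX_VALUE)); // never expires
        cols.add(new Column(50, 1_000L));          // expires at t=1000

        long pass1 = liveSize(cols, 500L);   // both columns live
        long pass2 = liveSize(cols, 2_000L); // second column expired
        if (pass1 != pass2) {
            System.out.println("originally calculated column size of "
                    + pass1 + " but now it is " + pass2);
        }
    }
}
```

If time-dependent liveness is evaluated independently in each pass, the two passes can legitimately disagree, which is one reason this class of assertion tends to surface under long-running, write-heavy compactions.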