[jira] [Updated] (CASSANDRA-5922) Delete doesn't delete data.
[ https://issues.apache.org/jira/browse/CASSANDRA-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Eineke updated CASSANDRA-5922:

Description:
In a nutshell, I'm running several test cases against my Cassandra JPA implementation (astyanax-jpa, see https://github.com/ceineke/astyanax-jpa) and sometimes (!) batched deletes seem not to delete all rows specified in the batch. Here's the sequence of prepared CQL3 statements that causes the issue to appear:

TRUNCATE compositeentity;
(delete all records so we have a clean slate)

INSERT INTO compositeentity (compositekeypartone, compositekeyparttwo, compositekeypartthree, astring, auuid) VALUES (?, ?, ?, ?, ?);
INSERT INTO compositeentity (compositekeypartone, compositekeyparttwo, compositekeypartthree, astring, auuid) VALUES (?, ?, ?, ?, ?);
(insert two unique rows into the table)

SELECT compositekeypartone, compositekeyparttwo, compositekeypartthree FROM compositeentity WHERE compositekeypartone = ? AND compositekeyparttwo = ? AND compositekeypartthree = ?;
SELECT compositekeypartone, compositekeyparttwo, compositekeypartthree FROM compositeentity WHERE compositekeypartone = ? AND compositekeyparttwo = ? AND compositekeypartthree = ?;
(load both rows from the table to validate their existence)

SELECT COUNT(1) FROM compositeentity;
(count the rows to validate the number of records in the table)

BEGIN BATCH
DELETE FROM compositeentity WHERE compositekeypartone = ? AND compositekeyparttwo = ? AND compositekeypartthree = ?;
DELETE FROM compositeentity WHERE compositekeypartone = ? AND compositekeyparttwo = ? AND compositekeypartthree = ?;
APPLY BATCH;
(uses a logged batch to delete the two rows from the table)

SELECT compositekeypartone, compositekeyparttwo, compositekeypartthree FROM compositeentity WHERE compositekeypartone = ? AND compositekeyparttwo = ? AND compositekeypartthree = ?;
SELECT compositekeypartone, compositekeyparttwo, compositekeypartthree FROM compositeentity WHERE compositekeypartone = ? AND compositekeyparttwo = ? AND compositekeypartthree = ?;
(tries to load the rows from the table to check that they don't exist anymore)

After the delete, Cassandra has deleted only the first row, so the second SELECT here actually returns data. So far, this behaviour occurs randomly. It happens even if there's a long sleep (1s, 10s) between the batch delete and the selects. It is always the second row that isn't deleted, never the first. Thinking it might be a timing issue (based on http://ria101.wordpress.com/2011/02/08/cassandra-the-importance-of-system-clocks-avoiding-oom-and-how-to-escape-oom-meltdown/), I've set up NTP to keep the clocks synchronized across all nodes (one node acts as the master, which syncs to time.nrc.ca and {0,1,2,3}.ca.pool.ntp.org, whereas the remaining ones sync against the master). This hasn't reduced the number of times this behaviour crops up. (I am executing all statements with QUORUM-level consistency.) I'm open to suggestions as to why this occurs and how I can fix it, if this can be fixed.
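For reference, here is a concrete, non-parameterized sketch of the sequence above. The CREATE TABLE statement and the literal values are assumptions made for illustration only (the report does not include the actual schema); just the statement shapes come from the description.

-- Hypothetical schema, inferred from the column names used in the prepared statements
CREATE TABLE compositeentity (
    compositekeypartone text,
    compositekeyparttwo text,
    compositekeypartthree text,
    astring text,
    auuid uuid,
    PRIMARY KEY (compositekeypartone, compositekeyparttwo, compositekeypartthree)
);

-- Insert two rows, then delete both in a single logged batch
INSERT INTO compositeentity (compositekeypartone, compositekeyparttwo, compositekeypartthree, astring, auuid)
    VALUES ('a1', 'b1', 'c1', 'first', 6ab09bec-e68e-48d9-a5f8-97e6fb4c9b47);
INSERT INTO compositeentity (compositekeypartone, compositekeyparttwo, compositekeypartthree, astring, auuid)
    VALUES ('a2', 'b2', 'c2', 'second', 9c0f7f8e-2f1b-4c3d-8a3e-1f2e3d4c5b6a);

BEGIN BATCH
    DELETE FROM compositeentity WHERE compositekeypartone = 'a1' AND compositekeyparttwo = 'b1' AND compositekeypartthree = 'c1';
    DELETE FROM compositeentity WHERE compositekeypartone = 'a2' AND compositekeyparttwo = 'b2' AND compositekeypartthree = 'c2';
APPLY BATCH;

-- Expected: the SELECT below returns no rows. Observed in the report: the row
-- matching the second DELETE intermittently still comes back.
SELECT compositekeypartone, compositekeyparttwo, compositekeypartthree FROM compositeentity
    WHERE compositekeypartone = 'a2' AND compositekeyparttwo = 'b2' AND compositekeypartthree = 'c2';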
[jira] [Created] (CASSANDRA-5922) Delete doesn't delete data.
Chris Eineke created CASSANDRA-5922:
---

Summary: Delete doesn't delete data.
Key: CASSANDRA-5922
URL: https://issues.apache.org/jira/browse/CASSANDRA-5922
Project: Cassandra
Issue Type: Bug
Environment: 4-node cluster w/ Cassandra v1.2.8, Oracle JDK 1.6.0_45, Netflix Astyanax 1.56.42, QUORUM read and write consistency level
Reporter: Chris Eineke

In a nutshell, I'm running several test cases against my Cassandra JPA implementation (astyanax-jpa, see https://github.com/ceineke/astyanax-jpa) and sometimes (!) batched deletes seem not to delete all data. Here's the sequence of prepared CQL3 statements that causes the issue to appear:

TRUNCATE compositeentity;
(delete all records so we have a clean slate)

INSERT INTO compositeentity (compositekeypartone, compositekeyparttwo, compositekeypartthree, astring, auuid) VALUES (?, ?, ?, ?, ?);
INSERT INTO compositeentity (compositekeypartone, compositekeyparttwo, compositekeypartthree, astring, auuid) VALUES (?, ?, ?, ?, ?);
(insert two unique rows into the table)

SELECT compositekeypartone, compositekeyparttwo, compositekeypartthree FROM compositeentity WHERE compositekeypartone = ? AND compositekeyparttwo = ? AND compositekeypartthree = ?;
SELECT compositekeypartone, compositekeyparttwo, compositekeypartthree FROM compositeentity WHERE compositekeypartone = ? AND compositekeyparttwo = ? AND compositekeypartthree = ?;
(load both rows from the table to validate their existence)

SELECT COUNT(1) FROM compositeentity;
(count the rows to validate the number of records in the table)

BEGIN BATCH
DELETE FROM compositeentity WHERE compositekeypartone = ? AND compositekeyparttwo = ? AND compositekeypartthree = ?;
DELETE FROM compositeentity WHERE compositekeypartone = ? AND compositekeyparttwo = ? AND compositekeypartthree = ?;
APPLY BATCH;
(uses a logged batch to delete the two rows from the table)

SELECT compositekeypartone, compositekeyparttwo, compositekeypartthree FROM compositeentity WHERE compositekeypartone = ? AND compositekeyparttwo = ? AND compositekeypartthree = ?;
SELECT compositekeypartone, compositekeyparttwo, compositekeypartthree FROM compositeentity WHERE compositekeypartone = ? AND compositekeyparttwo = ? AND compositekeypartthree = ?;
(tries to load the rows from the table to check that they don't exist anymore)

After the delete, Cassandra has deleted only the first row, so the second SELECT here actually returns data. So far, this behaviour occurs randomly. It happens even if there's a long sleep (1s, 10s) between the batch delete and the selects. It is always the second row that isn't deleted, never the first. Thinking it might be a timing issue (based on http://ria101.wordpress.com/2011/02/08/cassandra-the-importance-of-system-clocks-avoiding-oom-and-how-to-escape-oom-meltdown/), I've set up NTP to keep the clocks synchronized across all nodes (one node acts as the master, which syncs to time.nrc.ca and {0,1,2,3}.ca.pool.ntp.org, whereas the remaining ones sync against the master). This hasn't reduced the number of times this behaviour crops up. (I am executing all statements with QUORUM-level consistency.) I'm open to suggestions as to why this occurs and how I can fix it, if this can be fixed.
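Since the description suspects a timing issue, a small CQL3 sketch of how write timestamps can be pinned explicitly: a logged batch accepts a single client-supplied timestamp (in microseconds since the epoch) that applies to every statement in it, which takes the coordinating node's clock out of the picture for the two deletes. This only illustrates the CQL3 mechanism and is not a fix confirmed against the behaviour reported here; the literal key values are the same hypothetical ones as in the sketch above.

-- Illustration only: both tombstones share one explicit client-side timestamp
BEGIN BATCH USING TIMESTAMP 1377000000000000
    DELETE FROM compositeentity WHERE compositekeypartone = 'a1' AND compositekeyparttwo = 'b1' AND compositekeypartthree = 'c1';
    DELETE FROM compositeentity WHERE compositekeypartone = 'a2' AND compositekeyparttwo = 'b2' AND compositekeypartthree = 'c2';
APPLY BATCH;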
[jira] [Commented] (CASSANDRA-5748) When flushing, nodes spent almost 100% in AbstractCompositeType.compare
[ https://issues.apache.org/jira/browse/CASSANDRA-5748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708444#comment-13708444 ]

Chris Eineke commented on CASSANDRA-5748:

Sylvain, thank you for your response. That's great to hear! Is there an anticipated release date yet?

> When flushing, nodes spent almost 100% in AbstractCompositeType.compare
> ---
>
> Key: CASSANDRA-5748
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5748
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 1.2.5, 1.2.6
> Environment: Apache Cassandra v1.2.6
> 4-node cluster, mostly the same hardware
> # java -version
> java version "1.6.0_37"
> Java(TM) SE Runtime Environment (build 1.6.0_37-b06)
> Java HotSpot(TM) 64-Bit Server VM (build 20.12-b01, mixed mode)
> Reporter: Chris Eineke
> Priority: Critical
> Attachments: thread_dump
>
> We're pretty heavy users of CQL3 and CQL3 collection types. Occasionally, some nodes of the cluster will become extremely sluggish and the cluster as a whole starts to become unresponsive, reads will time out, and nodes will drop mutation messages. This happens when nodes flush Memtables to disk (based on my tail of the system.log on each node).
> I'm a curious guy, so I attached jvisualvm (v1.3.3) to the JVMs that were having this problem. These nodes are spending up to 98% of CPU in org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:78). I will attach a thread dump.
> This is causing us quite a headache, because we're unable to figure out what would be causing this. We tried tuning several configuration settings (column cache size, row key cache size), but the cluster exhibits the same issues even with the default configuration (except for a modified num_tokens and listen_address).
[jira] [Updated] (CASSANDRA-5748) When flushing, nodes spent almost 100% in AbstractCompositeType.compare
[ https://issues.apache.org/jira/browse/CASSANDRA-5748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Eineke updated CASSANDRA-5748:

Attachment: thread_dump

> When flushing, nodes spent almost 100% in AbstractCompositeType.compare
> ---
>
> Key: CASSANDRA-5748
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5748
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 1.2.5, 1.2.6
> Environment: Apache Cassandra v1.2.6
> 4-node cluster, mostly the same hardware
> # java -version
> java version "1.6.0_37"
> Java(TM) SE Runtime Environment (build 1.6.0_37-b06)
> Java HotSpot(TM) 64-Bit Server VM (build 20.12-b01, mixed mode)
> Reporter: Chris Eineke
> Priority: Critical
> Attachments: thread_dump
>
> We're pretty heavy users of CQL3 and CQL3 collection types. Occasionally, some nodes of the cluster will become extremely sluggish and the cluster as a whole starts to become unresponsive, reads will time out, and nodes will drop mutation messages. This happens when nodes flush Memtables to disk (based on my tail of the system.log on each node).
> I'm a curious guy, so I attached jvisualvm (v1.3.3) to the JVMs that were having this problem. These nodes are spending up to 98% of CPU in org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:78). I will attach a thread dump.
> This is causing us quite a headache, because we're unable to figure out what would be causing this. We tried tuning several configuration settings (column cache size, row key cache size), but the cluster exhibits the same issues even with the default configuration (except for a modified num_tokens and listen_address).
[jira] [Created] (CASSANDRA-5747) When flushing, nodes spent almost 100% in AbstractCompositeType.compare
Chris Eineke created CASSANDRA-5747:
---

Summary: When flushing, nodes spent almost 100% in AbstractCompositeType.compare
Key: CASSANDRA-5747
URL: https://issues.apache.org/jira/browse/CASSANDRA-5747
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.2.6, 1.2.5
Environment: Apache Cassandra v1.2.6
4-node cluster, mostly the same hardware
# java -version
java version "1.6.0_37"
Java(TM) SE Runtime Environment (build 1.6.0_37-b06)
Java HotSpot(TM) 64-Bit Server VM (build 20.12-b01, mixed mode)
Reporter: Chris Eineke
Priority: Critical

We're pretty heavy users of CQL3 and CQL3 collection types. Occasionally, some nodes of the cluster will become extremely sluggish and the cluster as a whole starts to become unresponsive, reads will time out, and nodes will drop mutation messages. This happens when nodes flush Memtables to disk (based on my tail of the system.log on each node).

I'm a curious guy, so I attached jvisualvm (v1.3.3) to the JVMs that were having this problem. These nodes are spending up to 98% of CPU in org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:78). I will attach a thread dump.

This is causing us quite a headache, because we're unable to figure out what would be causing this. We tried tuning several configuration settings (column cache size, row key cache size), but the cluster exhibits the same issues even with the default configuration (except for a modified num_tokens and listen_address).
[jira] [Created] (CASSANDRA-5748) When flushing, nodes spent almost 100% in AbstractCompositeType.compare
Chris Eineke created CASSANDRA-5748:
---

Summary: When flushing, nodes spent almost 100% in AbstractCompositeType.compare
Key: CASSANDRA-5748
URL: https://issues.apache.org/jira/browse/CASSANDRA-5748
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.2.6, 1.2.5
Environment: Apache Cassandra v1.2.6
4-node cluster, mostly the same hardware
# java -version
java version "1.6.0_37"
Java(TM) SE Runtime Environment (build 1.6.0_37-b06)
Java HotSpot(TM) 64-Bit Server VM (build 20.12-b01, mixed mode)
Reporter: Chris Eineke
Priority: Critical

We're pretty heavy users of CQL3 and CQL3 collection types. Occasionally, some nodes of the cluster will become extremely sluggish and the cluster as a whole starts to become unresponsive, reads will time out, and nodes will drop mutation messages. This happens when nodes flush Memtables to disk (based on my tail of the system.log on each node).

I'm a curious guy, so I attached jvisualvm (v1.3.3) to the JVMs that were having this problem. These nodes are spending up to 98% of CPU in org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:78). I will attach a thread dump.

This is causing us quite a headache, because we're unable to figure out what would be causing this. We tried tuning several configuration settings (column cache size, row key cache size), but the cluster exhibits the same issues even with the default configuration (except for a modified num_tokens and listen_address).
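For context, a hedged illustration of the kind of table the report describes: any CQL3 table with clustering columns (and any collection column) stores its cells under composite column names, and sorting those cells during a memtable flush goes through AbstractCompositeType.compare. The schema below is an assumption for illustration only; the reporter's actual tables are not shown in the issue.

-- Hypothetical CQL3 table: the clustering columns (bucket, event_time) and the
-- set<text> collection mean every stored cell name is a composite, so a flush
-- compares cells via AbstractCompositeType.compare.
CREATE TABLE events (
    device_id uuid,
    bucket text,
    event_time timestamp,
    payload text,
    tags set<text>,
    PRIMARY KEY (device_id, bucket, event_time)
);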