[jira] [Comment Edited] (CASSANDRA-2103) expiring counter columns
[ https://issues.apache.org/jira/browse/CASSANDRA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302044#comment-14302044 ]

Marco Palladino edited comment on CASSANDRA-2103 at 2/2/15 10:19 PM:
-

I do also agree with Nikolay and Amol. My use case is about storing analytics information and then deleting data when it gets too old and the counters are not incremented/used by the application anymore. I am no expert, but maybe another option would be complying with the TTL set when creating the table using {{default_time_to_live}} (as opposed to setting the TTL when increasing the counter for the first time, which is also a nice option to have). The application itself could then control the rotation of data by storing/duplicating counters in hot or frozen tables. This would require some more planning when creating the data model, so it would be totally fine to only allow the TTL when creating the table the first time, and to prevent the TTL from being set when altering the table.

was (Author: thefosk):
I do also agree with Nikolay and Amol. My use case is about storing analytics information and then deleting data when it gets too old and it is not incremented/used by the application anymore. I am no expert, but maybe another option would be complying with the TTL set when creating the table using {{default_time_to_live}} (as opposed to setting the TTL when increasing the counter for the first time, which is also a nice option to have). The application itself could then control the rotation of data by storing/duplicating counters in hot or frozen tables. This would require some more planning when creating the data model, so it would be totally fine to only allow the TTL when creating the table the first time, and to prevent the TTL from being set when altering the table.
expiring counter columns Key: CASSANDRA-2103 URL: https://issues.apache.org/jira/browse/CASSANDRA-2103 Project: Cassandra Issue Type: New Feature Components: Core Affects Versions: 0.8 beta 1 Reporter: Kelvin Kakugawa Attachments: 0001-CASSANDRA-2103-expiring-counters-logic-tests.patch add ttl functionality to counter columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-2103) expiring counter columns
[ https://issues.apache.org/jira/browse/CASSANDRA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302044#comment-14302044 ]

Marco Palladino commented on CASSANDRA-2103:

I do also agree with Nikolay and Amol. My use case is about storing analytics information and then deleting data when it gets too old and it is not incremented/used by the application anymore. I am no expert, but maybe another option would be complying with the TTL set when creating the table using {{default_time_to_live}} (as opposed to setting the TTL when increasing the counter for the first time, which is also a nice option to have). The application itself could then control the rotation of data by storing/duplicating counters in hot or frozen tables. This would require some more planning when creating the data model, so it would be totally fine to only allow the TTL when creating the table the first time, and to prevent the TTL from being set when altering the table.

expiring counter columns Key: CASSANDRA-2103 URL: https://issues.apache.org/jira/browse/CASSANDRA-2103 Project: Cassandra Issue Type: New Feature Components: Core Affects Versions: 0.8 beta 1 Reporter: Kelvin Kakugawa Attachments: 0001-CASSANDRA-2103-expiring-counters-logic-tests.patch add ttl functionality to counter columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
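The table-level TTL idea in the comment above can be sketched as a toy model: a counter whose expiration clock is driven by the table's {{default_time_to_live}} and refreshed on every increment, so it expires only once the application stops touching it. This is an illustrative sketch of the proposal, not Cassandra's counter implementation; the class and field names are hypothetical.

```java
// Hypothetical model of a counter governed by a table-level default_time_to_live.
// Each increment refreshes the expiration clock, so the counter expires only
// after the application stops incrementing/using it.
class ExpiringCounter
{
    private final long tableDefaultTtlMillis; // from default_time_to_live, fixed at CREATE TABLE time
    private long value;
    private long lastWriteMillis;

    ExpiringCounter(long defaultTtlSeconds, long nowMillis)
    {
        this.tableDefaultTtlMillis = defaultTtlSeconds * 1000L;
        this.lastWriteMillis = nowMillis;
    }

    void increment(long delta, long nowMillis)
    {
        value += delta;
        lastWriteMillis = nowMillis; // writes keep the counter "hot"
    }

    boolean isExpired(long nowMillis)
    {
        return nowMillis - lastWriteMillis >= tableDefaultTtlMillis;
    }

    long value() { return value; }
}
```

The application-driven rotation described in the comment would then just be a matter of copying still-live counters into a "hot" table before they expire.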
[jira] [Commented] (CASSANDRA-8679) Successful LWT INSERT should return any server generated values
[ https://issues.apache.org/jira/browse/CASSANDRA-8679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301879#comment-14301879 ]

Aleksey Yeschenko commented on CASSANDRA-8679:
--

All right. I see how this would be useful to you, although I do think you should either remodel or have ntp on your clients. That said, it's a very niche request. We don't want to return all the applied results - that's not what most users would want on successful execution - and adding logic to select which values are server generated, and which aren't, at the level of Paxos would require re-engineering our LWT implementation. It's not a simple piece of code, but it's very stable now, and I'd prefer not to disrupt it and risk breakage for a niche feature request. Sorry.

Successful LWT INSERT should return any server generated values --- Key: CASSANDRA-8679 URL: https://issues.apache.org/jira/browse/CASSANDRA-8679 Project: Cassandra Issue Type: Wish Reporter: Nils Kilden-Pedersen A failed LWT INSERT returns the row that prevented insertion, along with the {{[applied]}} boolean value. A successful LWT INSERT only returns {{[applied]}}. It would be helpful to also return any other server generated values, e.g. {{NOW()}} as {{[now]}} (or whatever). There is currently no way to know what exactly was inserted without re-querying the row, which is horrible for write-throughput. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-8721) expose shadowed columns in tracing output
Robert Coli created CASSANDRA-8721: -- Summary: expose shadowed columns in tracing output Key: CASSANDRA-8721 URL: https://issues.apache.org/jira/browse/CASSANDRA-8721 Project: Cassandra Issue Type: Improvement Reporter: Robert Coli Priority: Trivial Current tracing messages expose how many tombstones are read in order to read live columns, but they do not expose shadowed columns. Shadowed columns are columns where the timestamp for a given column is lower than the highest timestamp for that column. It would be useful for users who are tracing queries to understand how many shadowed columns are being read-then-ignored. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
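The shadowed-column definition above can be made concrete with a small sketch: for each column name, every cell whose write timestamp is below the highest timestamp seen for that name is shadowed (read, then discarded by reconciliation). This is an illustrative helper, not Cassandra tracing code; the class name and input shape are assumptions.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Illustrative sketch: count cells that are "shadowed" per the description
// above, i.e. cells whose timestamp is lower than the highest timestamp
// observed for the same column during a read.
class ShadowedColumns
{
    // cells: column name -> write timestamps of all versions read for it
    static int countShadowed(Map<String, List<Long>> cells)
    {
        int shadowed = 0;
        for (List<Long> timestamps : cells.values())
        {
            long max = Collections.max(timestamps);
            for (long ts : timestamps)
                if (ts < max)
                    shadowed++; // lost to a newer write: read, then ignored
        }
        return shadowed;
    }
}
```

A tracing message could then report this count alongside the existing tombstone count.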
[jira] [Created] (CASSANDRA-8720) Provide tools for finding wide row/partition keys
J.B. Langston created CASSANDRA-8720: Summary: Provide tools for finding wide row/partition keys Key: CASSANDRA-8720 URL: https://issues.apache.org/jira/browse/CASSANDRA-8720 Project: Cassandra Issue Type: Improvement Reporter: J.B. Langston Multiple users have requested some sort of tool to help identify wide row keys. They get into a situation where they know a wide row/partition has been inserted and it's causing problems for them but they have no idea what the row key is in order to remove it. Maintaining the widest row key currently encountered and displaying it in cfstats would be one possible approach. Another would be an offline tool (possibly an enhancement to sstablekeys) to show the number of columns/bytes per key in each sstable. If a tool to aggregate the information at a CF-level could be provided that would be a bonus, but it shouldn't be too hard to write a script wrapper to aggregate them if not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
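The first approach suggested above - maintain the widest row/partition key encountered and surface it in cfstats - amounts to a tiny running-maximum tracker. A minimal sketch, with hypothetical names (this helper does not exist in Cassandra):

```java
// Hypothetical per-column-family tracker for the widest partition seen,
// sketching the "maintain the widest row key and display it in cfstats" idea.
class WidestPartitionTracker
{
    private String widestKey;
    private long widestBytes;

    // Called with each partition's key and serialized size as it is written/compacted.
    synchronized void record(String partitionKey, long serializedBytes)
    {
        if (serializedBytes > widestBytes)
        {
            widestBytes = serializedBytes;
            widestKey = partitionKey;
        }
    }

    synchronized String widestKey() { return widestKey; }
    synchronized long widestBytes() { return widestBytes; }
}
```

The offline sstablekeys-style tool would instead emit one (key, bytes) line per partition, which a script could aggregate to the CF level.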
[jira] [Commented] (CASSANDRA-8086) Cassandra should have ability to limit the number of native connections
[ https://issues.apache.org/jira/browse/CASSANDRA-8086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302022#comment-14302022 ] Vishy Kasar commented on CASSANDRA-8086: We need this patch applied to 2.0 branch build as well. Thanks! Cassandra should have ability to limit the number of native connections --- Key: CASSANDRA-8086 URL: https://issues.apache.org/jira/browse/CASSANDRA-8086 Project: Cassandra Issue Type: Bug Reporter: Vishy Kasar Assignee: Norman Maurer Fix For: 2.1.3 Attachments: 0001-CASSANDRA-8086-Allow-to-limit-the-number-of-native-c-2.0.patch, 0001-CASSANDRA-8086-Allow-to-limit-the-number-of-native-c.patch, 0001-CASSANDRA-8086-Allow-to-limit-the-number-of-native-c.txt We have a production cluster with 72 instances spread across 2 DCs. We have a large number ( ~ 40,000 ) of clients hitting this cluster. Client normally connects to 4 cassandra instances. Some event (we think it is a schema change on server side) triggered the client to establish connections to all cassandra instances of local DC. This brought the server to its knees. The client connections failed and client attempted re-connections. Cassandra should protect itself from such attack from client. Do we have any knobs to control the number of max connections? If not, we need to add that knob. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
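The knob requested above boils down to an admission check on new native-protocol connections. A minimal sketch of such a limiter, under the assumption that it would be consulted when a client connects and released on disconnect (the actual patch wires this into the Netty transport; this class and its names are illustrative):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative connection limiter: admit a new native connection only while
// the number of open connections is below a configured maximum.
class ConnectionLimiter
{
    private final int maxConnections;
    private final AtomicInteger open = new AtomicInteger();

    ConnectionLimiter(int maxConnections)
    {
        this.maxConnections = maxConnections;
    }

    // Returns true if the connection is admitted, false if it must be refused.
    boolean tryAccept()
    {
        while (true)
        {
            int current = open.get();
            if (current >= maxConnections)
                return false;
            if (open.compareAndSet(current, current + 1))
                return true;
        }
    }

    // Called when an admitted connection closes.
    void release()
    {
        open.decrementAndGet();
    }
}
```

The compare-and-set loop keeps the check-then-increment atomic, so a reconnect storm like the one described above cannot race past the limit.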
[jira] [Updated] (CASSANDRA-8086) Cassandra should have ability to limit the number of native connections
[ https://issues.apache.org/jira/browse/CASSANDRA-8086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-8086: Fix Version/s: 2.0.13 Cassandra should have ability to limit the number of native connections --- Key: CASSANDRA-8086 URL: https://issues.apache.org/jira/browse/CASSANDRA-8086 Project: Cassandra Issue Type: Bug Reporter: Vishy Kasar Assignee: Norman Maurer Fix For: 2.1.3, 2.0.13 Attachments: 0001-CASSANDRA-8086-Allow-to-limit-the-number-of-native-c-2.0.patch, 0001-CASSANDRA-8086-Allow-to-limit-the-number-of-native-c.patch, 0001-CASSANDRA-8086-Allow-to-limit-the-number-of-native-c.txt We have a production cluster with 72 instances spread across 2 DCs. We have a large number ( ~ 40,000 ) of clients hitting this cluster. Client normally connects to 4 cassandra instances. Some event (we think it is a schema change on server side) triggered the client to establish connections to all cassandra instances of local DC. This brought the server to its knees. The client connections failed and client attempted re-connections. Cassandra should protect itself from such attack from client. Do we have any knobs to control the number of max connections? If not, we need to add that knob. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8086) Cassandra should have ability to limit the number of native connections
[ https://issues.apache.org/jira/browse/CASSANDRA-8086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-8086: - Reviewer: Joshua McKenzie (was: Sylvain Lebresne) Cassandra should have ability to limit the number of native connections --- Key: CASSANDRA-8086 URL: https://issues.apache.org/jira/browse/CASSANDRA-8086 Project: Cassandra Issue Type: Bug Reporter: Vishy Kasar Assignee: Norman Maurer Fix For: 2.1.3 Attachments: 0001-CASSANDRA-8086-Allow-to-limit-the-number-of-native-c-2.0.patch, 0001-CASSANDRA-8086-Allow-to-limit-the-number-of-native-c.patch, 0001-CASSANDRA-8086-Allow-to-limit-the-number-of-native-c.txt We have a production cluster with 72 instances spread across 2 DCs. We have a large number ( ~ 40,000 ) of clients hitting this cluster. Client normally connects to 4 cassandra instances. Some event (we think it is a schema change on server side) triggered the client to establish connections to all cassandra instances of local DC. This brought the server to its knees. The client connections failed and client attempted re-connections. Cassandra should protect itself from such attack from client. Do we have any knobs to control the number of max connections? If not, we need to add that knob. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-8679) Successful LWT INSERT should return any server generated values
[ https://issues.apache.org/jira/browse/CASSANDRA-8679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko resolved CASSANDRA-8679. -- Resolution: Won't Fix Successful LWT INSERT should return any server generated values --- Key: CASSANDRA-8679 URL: https://issues.apache.org/jira/browse/CASSANDRA-8679 Project: Cassandra Issue Type: Wish Reporter: Nils Kilden-Pedersen A failed LWT INSERT returns the row that prevented insertion, along with {{[applied]}} boolean value. A successful LWT INSERT only returns {{[applied]}}. It would be helpful to also return any other server generated values, e.g. {{NOW()}} as {{[now]}} (or whatever). There is currently no way to know what exactly was inserted without re-querying the row, which is horrible for write-throughput. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-8336) Quarantine nodes after receiving the gossip shutdown message
[ https://issues.apache.org/jira/browse/CASSANDRA-8336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288251#comment-14288251 ]

Brandon Williams edited comment on CASSANDRA-8336 at 2/2/15 10:28 PM:
--

This patch helps, but the problem with this approach is that the node can still flap, given a disjoint enough (gossip state-wise) cluster.

There are a few ways we can solve this:
* quarantine after shutdown. This has the consequence of not being able to restart a node until the quarantine expires.
* Sleep for ring_delay or some interval after setting the shutdown state before sending the rpc shutdown. I'm not 100% sure this would prevent the flapping, and sleeping that long on shutdown sucks just as much as not being able to reboot until the quarantine expires.
* I suggest a third way, which I'll discuss below.

The method suggests that when node X receives a shutdown event from Y, it will update its local state for Y to version Integer.MAX_VALUE, and thus no updates for the same generation will be accepted since they will always have a lower version. When Y restarts it will have a new generation and everything will work normally. There is one consequence to this method: disabling/enabling gossip now has to generate a new generation, which triggers the "has restarted, now UP" message on other nodes, but this seems like a fairly minor thing.

On the surface, it may seem easier to have Y just send with a version of MAX_VALUE, but that will only apply to nodes that receive it via gossip, not the ones that receive it via rpc, which is likely the bulk of them, and it wouldn't be an optimization anyway since we only sleep for one gossip round, and the node(s) we gossip to will set the version anyway before propagating it to the rest of the cluster. v2 does this.

was (Author: brandon.williams):
This patch helps, but the problem with this approach is that the node can still flap, given a disjoint enough (gossip state-wise) cluster.

There are a few ways we can solve this:
* quarantine after shutdown. This has the consequence of not being able to restart a node until the quarantine expires.
* Sleep for ring_delay or some interval after setting the shutdown state before sending the rpc shutdown. I'm not 100% sure this would prevent the flapping, and sleeping that long on shutdown sucks just as much as not being able to reboot until the quarantine expires.
* Offline, Richard suggested to me a third way, which I'll discuss below.

The method suggests that when node X receives a shutdown event from Y, it will update its local state for Y to version Integer.MAX_VALUE, and thus no updates for the same generation will be accepted since they will always have a lower version. When Y restarts it will have a new generation and everything will work normally. There is one consequence to this method: disabling/enabling gossip now has to generate a new generation, which triggers the "has restarted, now UP" message on other nodes, but this seems like a fairly minor thing.

On the surface, it may seem easier to have Y just send with a version of MAX_VALUE, but that will only apply to nodes that receive it via gossip, not the ones that receive it via rpc, which is likely the bulk of them, and it wouldn't be an optimization anyway since we only sleep for one gossip round, and the node(s) we gossip to will set the version anyway before propagating it to the rest of the cluster. v2 does this.

Quarantine nodes after receiving the gossip shutdown message Key: CASSANDRA-8336 URL: https://issues.apache.org/jira/browse/CASSANDRA-8336 Project: Cassandra Issue Type: Bug Components: Core Reporter: Brandon Williams Assignee: Brandon Williams Fix For: 2.0.13 Attachments: 8336-v2.txt, 8336.txt In CASSANDRA-3936 we added a gossip shutdown announcement. The problem here is that this isn't sufficient; you can still get TOEs and have to wait on the FD to figure things out.
This happens due to gossip propagation time and variance; if node X shuts down and sends the message to Y, but Z has a greater gossip version than Y for X and has not yet received the message, it can initiate gossip with Y and thus mark X alive again. I propose quarantining to solve this, however I feel it should be a -D parameter you have to specify, so as not to destroy current dev and test practices, since this will mean a node that shuts down will not be able to restart until the quarantine expires. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
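The third option discussed in the comment above - pin the departed node's local gossip version to Integer.MAX_VALUE so no later update with the same generation can win - can be sketched with a simplified (generation, version) state model. This is an illustrative model of the ordering rule only, not Cassandra's Gossiper; the class and method names are hypothetical.

```java
// Simplified model of per-endpoint gossip state: an incoming update wins only
// if its (generation, version) pair is strictly newer. Marking shutdown pins
// the version at Integer.MAX_VALUE, so nothing in the same generation can win;
// a restarted node brings a higher generation and is accepted normally.
class GossipShutdownState
{
    private int generation;
    private int version;

    GossipShutdownState(int generation, int version)
    {
        this.generation = generation;
        this.version = version;
    }

    // Applied locally when a shutdown announcement for this endpoint arrives.
    void markShutdown()
    {
        this.version = Integer.MAX_VALUE;
    }

    // Returns true and applies the update only if it is strictly newer.
    boolean tryUpdate(int gen, int ver)
    {
        if (gen > generation || (gen == generation && ver > version))
        {
            generation = gen;
            version = ver;
            return true;
        }
        return false;
    }
}
```

This shows why the flapping scenario above is closed: even a node Z with a higher stale version for X cannot resurrect X, since MAX_VALUE beats any same-generation version.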
[jira] [Commented] (CASSANDRA-8561) Tombstone log warning does not log partition key
[ https://issues.apache.org/jira/browse/CASSANDRA-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301994#comment-14301994 ] Robert Coli commented on CASSANDRA-8561: Yep, see CASSANDRA-8721. Tombstone log warning does not log partition key Key: CASSANDRA-8561 URL: https://issues.apache.org/jira/browse/CASSANDRA-8561 Project: Cassandra Issue Type: Improvement Components: Core Environment: Datastax DSE 4.5 Reporter: Jens Rantil Labels: logging Fix For: 2.1.3, 2.0.13 AFAIK, the tombstone warning in system.log does not contain the primary key. See: https://gist.github.com/JensRantil/44204676f4dbea79ea3a Including it would help a lot in diagnosing why the (CQL) row has so many tombstones. Let me know if I have misunderstood something. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-2103) expiring counter columns
[ https://issues.apache.org/jira/browse/CASSANDRA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302044#comment-14302044 ]

Marco Palladino edited comment on CASSANDRA-2103 at 2/2/15 10:19 PM:
-

I do also agree with Nikolay and Amol. My use case is about storing analytics information and then deleting data when it gets too old and the counters are not incremented/used by the application anymore. I am no expert, but maybe another option would be complying with the TTL set when creating the table using {{default_time_to_live}} (as opposed to setting the TTL when increasing the counter for the first time, which is also a nice option to have). The application itself could then control the rotation of data by storing/duplicating counters in hot or frozen tables. This would require some more planning when creating the data model, so it could be acceptable to only allow the TTL when creating the table the first time, and to prevent the TTL from being set when altering the table.

was (Author: thefosk):
I do also agree with Nikolay and Amol. My use case is about storing analytics information and then deleting data when it gets too old and the counters are not incremented/used by the application anymore. I am no expert, but maybe another option would be complying with the TTL set when creating the table using {{default_time_to_live}} (as opposed to setting the TTL when increasing the counter for the first time, which is also a nice option to have). The application itself could then control the rotation of data by storing/duplicating counters in hot or frozen tables. This would require some more planning when creating the data model, so it would be totally fine to only allow the TTL when creating the table the first time, and to prevent the TTL from being set when altering the table.
expiring counter columns Key: CASSANDRA-2103 URL: https://issues.apache.org/jira/browse/CASSANDRA-2103 Project: Cassandra Issue Type: New Feature Components: Core Affects Versions: 0.8 beta 1 Reporter: Kelvin Kakugawa Attachments: 0001-CASSANDRA-2103-expiring-counters-logic-tests.patch add ttl functionality to counter columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8336) Quarantine nodes after receiving the gossip shutdown message
[ https://issues.apache.org/jira/browse/CASSANDRA-8336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302098#comment-14302098 ]

Brandon Williams commented on CASSANDRA-8336:
-

v2 has a few problems:
* If a node joins while another is shutdown, it won't see the down node's tokens because it's a dead state. This is somewhat tricky to fix since now we need a third class of states aside from live/dead.
* when we decom and the user kills the node, it goes into shutdown state instead of LEFT
* we should probably do both active and passive announcement, then sleep before killing the gossiper

Quarantine nodes after receiving the gossip shutdown message Key: CASSANDRA-8336 URL: https://issues.apache.org/jira/browse/CASSANDRA-8336 Project: Cassandra Issue Type: Bug Components: Core Reporter: Brandon Williams Assignee: Brandon Williams Fix For: 2.0.13 Attachments: 8336-v2.txt, 8336.txt In CASSANDRA-3936 we added a gossip shutdown announcement. The problem here is that this isn't sufficient; you can still get TOEs and have to wait on the FD to figure things out. This happens due to gossip propagation time and variance; if node X shuts down and sends the message to Y, but Z has a greater gossip version than Y for X and has not yet received the message, it can initiate gossip with Y and thus mark X alive again. I propose quarantining to solve this, however I feel it should be a -D parameter you have to specify, so as not to destroy current dev and test practices, since this will mean a node that shuts down will not be able to restart until the quarantine expires. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8720) Provide tools for finding wide row/partition keys
[ https://issues.apache.org/jira/browse/CASSANDRA-8720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302072#comment-14302072 ]

Jonathan Ellis commented on CASSANDRA-8720:
---

In >= 2.0, we log every row larger than the in_memory_compaction limit. Isn't that enough? (For 2.1 we should probably add back similar functionality... /cc [~krummas])

Provide tools for finding wide row/partition keys - Key: CASSANDRA-8720 URL: https://issues.apache.org/jira/browse/CASSANDRA-8720 Project: Cassandra Issue Type: Improvement Reporter: J.B. Langston Multiple users have requested some sort of tool to help identify wide row keys. They get into a situation where they know a wide row/partition has been inserted and it's causing problems for them but they have no idea what the row key is in order to remove it. Maintaining the widest row key currently encountered and displaying it in cfstats would be one possible approach. Another would be an offline tool (possibly an enhancement to sstablekeys) to show the number of columns/bytes per key in each sstable. If a tool to aggregate the information at a CF-level could be provided that would be a bonus, but it shouldn't be too hard to write a script wrapper to aggregate them if not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[2/3] cassandra git commit: Fix hang when repairing empty keyspace
Fix hang when repairing empty keyspace

patch by Jeff Jirsa; reviewed by yukim for CASSANDRA-8694

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8c003a2a
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8c003a2a
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8c003a2a
Branch: refs/heads/trunk
Commit: 8c003a2a5ab261db06fbda784b5353d38982f488
Parents: 2a283e1
Author: Jeff Jirsa <j...@jeffjirsa.net>
Authored: Mon Feb 2 16:52:03 2015 -0600
Committer: Yuki Morishita <yu...@apache.org>
Committed: Mon Feb 2 16:52:03 2015 -0600

--
 CHANGES.txt                                                    | 1 +
 src/java/org/apache/cassandra/service/ActiveRepairService.java | 2 ++
 2 files changed, 3 insertions(+)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/8c003a2a/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 40d844c..b95fd3a 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -83,6 +83,7 @@
  * Log failed host when preparing incremental repair (CASSANDRA-8228)
  * Force config client mode in CQLSSTableWriter (CASSANDRA-8281)
  * Fix sstableupgrade throws exception (CASSANDRA-8688)
+ * Fix hang when repairing empty keyspace (CASSANDRA-8694)
 Merged from 2.0:
  * Prevent non-zero default_time_to_live on tables with counters (CASSANDRA-8678)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/8c003a2a/src/java/org/apache/cassandra/service/ActiveRepairService.java
--
diff --git a/src/java/org/apache/cassandra/service/ActiveRepairService.java b/src/java/org/apache/cassandra/service/ActiveRepairService.java
index 1c5138b..15e7641 100644
--- a/src/java/org/apache/cassandra/service/ActiveRepairService.java
+++ b/src/java/org/apache/cassandra/service/ActiveRepairService.java
@@ -122,6 +122,8 @@ public class ActiveRepairService
      */
     public RepairFuture submitRepairSession(UUID parentRepairSession, Range<Token> range, String keyspace, RepairParallelism parallelismDegree, Set<InetAddress> endpoints, String... cfnames)
     {
+        if (cfnames.length == 0)
+            return null;
         RepairSession session = new RepairSession(parentRepairSession, range, keyspace, parallelismDegree, endpoints, cfnames);
         if (session.endpoints.isEmpty())
             return null;
[1/3] cassandra git commit: Fix hang when repairing empty keyspace
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 2a283e10f -> 8c003a2a5
  refs/heads/trunk 6e9aec312 -> 0cfeab60a

Fix hang when repairing empty keyspace

patch by Jeff Jirsa; reviewed by yukim for CASSANDRA-8694

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8c003a2a
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8c003a2a
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8c003a2a
Branch: refs/heads/cassandra-2.1
Commit: 8c003a2a5ab261db06fbda784b5353d38982f488
Parents: 2a283e1
Author: Jeff Jirsa <j...@jeffjirsa.net>
Authored: Mon Feb 2 16:52:03 2015 -0600
Committer: Yuki Morishita <yu...@apache.org>
Committed: Mon Feb 2 16:52:03 2015 -0600

--
 CHANGES.txt                                                    | 1 +
 src/java/org/apache/cassandra/service/ActiveRepairService.java | 2 ++
 2 files changed, 3 insertions(+)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/8c003a2a/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 40d844c..b95fd3a 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -83,6 +83,7 @@
  * Log failed host when preparing incremental repair (CASSANDRA-8228)
  * Force config client mode in CQLSSTableWriter (CASSANDRA-8281)
  * Fix sstableupgrade throws exception (CASSANDRA-8688)
+ * Fix hang when repairing empty keyspace (CASSANDRA-8694)
 Merged from 2.0:
  * Prevent non-zero default_time_to_live on tables with counters (CASSANDRA-8678)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/8c003a2a/src/java/org/apache/cassandra/service/ActiveRepairService.java
--
diff --git a/src/java/org/apache/cassandra/service/ActiveRepairService.java b/src/java/org/apache/cassandra/service/ActiveRepairService.java
index 1c5138b..15e7641 100644
--- a/src/java/org/apache/cassandra/service/ActiveRepairService.java
+++ b/src/java/org/apache/cassandra/service/ActiveRepairService.java
@@ -122,6 +122,8 @@ public class ActiveRepairService
      */
     public RepairFuture submitRepairSession(UUID parentRepairSession, Range<Token> range, String keyspace, RepairParallelism parallelismDegree, Set<InetAddress> endpoints, String... cfnames)
     {
+        if (cfnames.length == 0)
+            return null;
         RepairSession session = new RepairSession(parentRepairSession, range, keyspace, parallelismDegree, endpoints, cfnames);
         if (session.endpoints.isEmpty())
             return null;
[3/3] cassandra git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk

Conflicts:
	src/java/org/apache/cassandra/service/ActiveRepairService.java

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0cfeab60
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0cfeab60
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0cfeab60
Branch: refs/heads/trunk
Commit: 0cfeab60a44bf80cdd60a7887012f33db3fc57ab
Parents: 6e9aec3 8c003a2
Author: Yuki Morishita <yu...@apache.org>
Authored: Mon Feb 2 16:56:46 2015 -0600
Committer: Yuki Morishita <yu...@apache.org>
Committed: Mon Feb 2 16:56:46 2015 -0600

--
 CHANGES.txt                                                    | 1 +
 src/java/org/apache/cassandra/service/ActiveRepairService.java | 3 +++
 2 files changed, 4 insertions(+)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/0cfeab60/CHANGES.txt
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/0cfeab60/src/java/org/apache/cassandra/service/ActiveRepairService.java
--
diff --cc src/java/org/apache/cassandra/service/ActiveRepairService.java
index fa9be8a,15e7641..1882a7b
--- a/src/java/org/apache/cassandra/service/ActiveRepairService.java
+++ b/src/java/org/apache/cassandra/service/ActiveRepairService.java
@@@ -109,40 -120,29 +109,43 @@@ public class ActiveRepairServic
       *
       * @return Future for asynchronous call or null if there is no need to repair
       */
 -    public RepairFuture submitRepairSession(UUID parentRepairSession, Range<Token> range, String keyspace, RepairParallelism parallelismDegree, Set<InetAddress> endpoints, String... cfnames)
 +    public RepairSession submitRepairSession(UUID parentRepairSession,
 +                                             Range<Token> range,
 +                                             String keyspace,
 +                                             RepairParallelism parallelismDegree,
 +                                             Set<InetAddress> endpoints,
 +                                             long repairedAt,
 +                                             ListeningExecutorService executor,
 +                                             String... cfnames)
      {
 -        if (cfnames.length == 0)
 +        if (endpoints.isEmpty())
              return null;
 -        RepairSession session = new RepairSession(parentRepairSession, range, keyspace, parallelismDegree, endpoints, cfnames);
 -        if (session.endpoints.isEmpty())
 +
++        if (cfnames.length == 0)
              return null;
 -        RepairFuture futureTask = new RepairFuture(session);
 -        executor.execute(futureTask);
 -        return futureTask;
 -    }
 +
 -    public void addToActiveSessions(RepairSession session)
 -    {
 +        final RepairSession session = new RepairSession(parentRepairSession, UUIDGen.getTimeUUID(), range, keyspace, parallelismDegree, endpoints, repairedAt, cfnames);
          sessions.put(session.getId(), session);
 -        Gossiper.instance.register(session);
 -        FailureDetector.instance.registerFailureDetectionEventListener(session);
 -    }
 +        // register listeners
 +        gossiper.register(session);
 +        failureDetector.registerFailureDetectionEventListener(session);

 -    public void removeFromActiveSessions(RepairSession session)
 -    {
 -        Gossiper.instance.unregister(session);
 -        sessions.remove(session.getId());
 +        // unregister listeners at completion
 +        session.addListener(new Runnable()
 +        {
 +            /**
 +             * When repair finished, do clean up
 +             */
 +            public void run()
 +            {
 +                failureDetector.unregisterFailureDetectionEventListener(session);
 +                gossiper.unregister(session);
 +                sessions.remove(session.getId());
 +            }
 +        }, MoreExecutors.sameThreadExecutor());
 +        session.start(executor);
 +        return session;
      }

      public synchronized void terminateSessions()
[jira] [Commented] (CASSANDRA-8518) Impose In-Flight Data Limit
[ https://issues.apache.org/jira/browse/CASSANDRA-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302320#comment-14302320 ] Benedict commented on CASSANDRA-8518: - Have a look at how we solve this in the Cell implementations; it is possible to cache the results for each object type, so that minimal overheads are incurred at run time. Impose In-Flight Data Limit --- Key: CASSANDRA-8518 URL: https://issues.apache.org/jira/browse/CASSANDRA-8518 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Cheng Ren Labels: performance We have been suffering from cassandra node crashes due to out-of-memory errors for a long time. The heap dump from the most recent crash shows 22 native transport request threads, each of which consumes 3.3% of the heap, taking more than 70% in total. Heap dump: !https://dl-web.dropbox.com/get/attach1.png?_subject_uid=303980955w=AAAVOoncBoZ5aOPbDg2TpRkUss7B-2wlrnhUAv19b27OUA|height=400,width=600! Expanded view of one thread: !https://dl-web.dropbox.com/get/Screen%20Shot%202014-12-18%20at%204.06.29%20PM.png?_subject_uid=303980955w=AACUO4wrbxheRUxv8fwQ9P52T6gBOm5_g9zeIe8odu3V3w|height=400,width=600! The Cassandra version we are using now (2.0.4) uses MemoryAwareThreadPoolExecutor as the request executor and provides a default request size estimator that constantly returns 1, meaning it limits only the number of requests pushed to the pool. To have more fine-grained control over handling requests and better protect our nodes from OOM issues, we propose implementing a more precise estimator. Here are our two cents: For update/delete/insert requests: size could be estimated by adding the sizes of all class members together. For scan queries, the major part of the request is the response, which can be estimated from historical data.
For example, if we receive a scan query on a column family for a certain token range, we keep track of its response size and use it as the estimated response size for later scan queries on the same cf. For future requests on the same cf, the response size could be calculated as (token range * recorded size) / recorded token range. The request size should then be estimated as (query size + estimated response size). We believe what we're proposing here can be useful for other people in the Cassandra community as well. Would you mind providing us feedback? Please let us know if you have any concerns or suggestions regarding this proposal. Thanks, Cheng -- This message was sent by Atlassian JIRA (v6.3.4#6332)
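The per-cf history scheme described above can be sketched in a few lines. This is a minimal illustration of the arithmetic, not the actual patch; the class and method names (ScanSizeEstimator, recordResponse, estimate) are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the proposed per-CF scan response estimator.
// None of these names exist in the Cassandra codebase.
public class ScanSizeEstimator
{
    private static final class Sample
    {
        final long responseBytes;
        final double tokenRange;
        Sample(long responseBytes, double tokenRange)
        {
            this.responseBytes = responseBytes;
            this.tokenRange = tokenRange;
        }
    }

    private final Map<String, Sample> samples = new ConcurrentHashMap<>();

    // Remember the last observed response size for a scan on this CF.
    public void recordResponse(String cf, long responseBytes, double tokenRange)
    {
        samples.put(cf, new Sample(responseBytes, tokenRange));
    }

    // request size = query size + estimated response size, where the
    // response is scaled linearly by token range from the recorded sample.
    public long estimate(String cf, long querySizeBytes, double tokenRange)
    {
        Sample s = samples.get(cf);
        if (s == null || s.tokenRange == 0)
            return querySizeBytes; // no history yet: fall back to the query size alone
        long estimatedResponse = (long) (tokenRange * s.responseBytes / s.tokenRange);
        return querySizeBytes + estimatedResponse;
    }
}
```

A request covering a quarter of the ring, after observing a 1000-byte response for half the ring, would thus be estimated at roughly half that recorded response plus the query size.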
[jira] [Updated] (CASSANDRA-8723) Cassandra 2.1.2 Memory issue - java process memory continuously increase until killed by OOM killer
[ https://issues.apache.org/jira/browse/CASSANDRA-8723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Liu updated CASSANDRA-8723: Summary: Cassandra 2.1.2 Memory issue - java process memory continuously increase until killed by OOM killer (was: Cassandra 2.1.2 Memory issue - Continuously process memory increase until killed by OOM killer) Cassandra 2.1.2 Memory issue - java process memory continuously increase until killed by OOM killer --- Key: CASSANDRA-8723 URL: https://issues.apache.org/jira/browse/CASSANDRA-8723 Project: Cassandra Issue Type: Bug Reporter: Jeff Liu Issue: We have an on-going issue with cassandra nodes running with continuously increasing memory until killed by OOM. {noformat} Jan 29 10:15:41 cass-chisel19 kernel: [24533109.783481] Out of memory: Kill process 13919 (java) score 911 or sacrifice child Jan 29 10:15:41 cass-chisel19 kernel: [24533109.783557] Killed process 13919 (java) total-vm:18366340kB, anon-rss:6461472kB, file-rss:6684kB {noformat} System Profile: cassandra version 2.1.2 system: aws c1.xlarge instance with 8 cores, 7.1G memory. 
cassandra jvm: -Xms1792M -Xmx1792M -Xmn400M -Xss256k {noformat} java -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.8.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1792M -Xmx1792M -Xmn400M -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=103 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseTLAB -XX:+CMSClassUnloadingEnabled -XX:+UseCondCardMark -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -XX:+PrintPromotionFailure -Xloggc:/var/log/cassandra/gc-1421511249.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=48M -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=7199 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -javaagent:/usr/share/java/graphite-reporter-agent-1.0-SNAPSHOT.jar=graphiteServer=metrics-a.hq.nest.com;graphitePort=2003;graphitePollInt=60 -Dlogback.configurationFile=logback.xml -Dcassandra.logdir=/var/log/cassandra -Dcassandra.storagedir= -Dcassandra-pidfile=/var/run/cassandra/cassandra.pid -cp 
/etc/cassandra:/usr/share/cassandra/lib/airline-0.6.jar:/usr/share/cassandra/lib/antlr-runtime-3.5.2.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang3-3.1.jar:/usr/share/cassandra/lib/commons-math3-3.2.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.4.jar:/usr/share/cassandra/lib/disruptor-3.0.1.jar:/usr/share/cassandra/lib/guava-16.0.jar:/usr/share/cassandra/lib/high-scale-lib-1.0.6.jar:/usr/share/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/cassandra/lib/jamm-0.2.8.jar:/usr/share/cassandra/lib/javax.inject.jar:/usr/share/cassandra/lib/jbcrypt-0.3m.jar:/usr/share/cassandra/lib/jline-1.0.jar:/usr/share/cassandra/lib/jna-4.0.0.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.9.1.jar:/usr/share/cassandra/lib/logback-classic-1.1.2.jar:/usr/share/cassandra/lib/logback-core-1.1.2.jar:/usr/share/cassandra/lib/lz4-1.2.0.jar:/usr/share/cassandra/lib/metrics-core-2.2.0.jar:/usr/share/cassandra/lib/metrics-graphite-2.2.0.jar:/usr/share/cassandra/lib/mx4j-tools.jar:/usr/share/cassandra/lib/netty-all-4.0.23.Final.jar:/usr/share/cassandra/lib/reporter-config-2.1.0.jar:/usr/share/cassandra/lib/slf4j-api-1.7.2.jar:/usr/share/cassandra/lib/snakeyaml-1.11.jar:/usr/share/cassandra/lib/snappy-java-1.0.5.2.jar:/usr/share/cassandra/lib/stream-2.5.2.jar:/usr/share/cassandra/lib/stringtemplate-4.0.2.jar:/usr/share/cassandra/lib/super-csv-2.1.0.jar:/usr/share/cassandra/lib/thrift-server-0.3.7.jar:/usr/share/cassandra/apache-cassandra-2.1.2.jar:/usr/share/cassandra/apache-cassandra-thrift-2.1.2.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/cassandra/cassandra-driver-core-2.0.5.jar:/usr/share/cassandra/netty-3.9.0.Final.jar:/usr/share/cassandra/stress.jar: -XX:HeapDumpPath=/var/lib/cassandra/java_1421511248.hprof 
-XX:ErrorFile=/var/lib/cassandra/hs_err_1421511248.log org.apache.cassandra.service.CassandraDaemon {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-8723) Cassandra 2.1.2 Memory issue - Continuously process memory increase until killed by OOM killer
Jeff Liu created CASSANDRA-8723: --- Summary: Cassandra 2.1.2 Memory issue - Continuously process memory increase until killed by OOM killer Key: CASSANDRA-8723 URL: https://issues.apache.org/jira/browse/CASSANDRA-8723 Project: Cassandra Issue Type: Bug Reporter: Jeff Liu Issue: We have an on-going issue with cassandra nodes running with continuously increasing memory until killed by OOM. {noformat} Jan 29 10:15:41 cass-chisel19 kernel: [24533109.783481] Out of memory: Kill process 13919 (java) score 911 or sacrifice child Jan 29 10:15:41 cass-chisel19 kernel: [24533109.783557] Killed process 13919 (java) total-vm:18366340kB, anon-rss:6461472kB, file-rss:6684kB {noformat} System Profile: cassandra version 2.1.2 system: aws c1.xlarge instance with 8 cores, 7.1G memory. cassandra jvm: -Xms1792M -Xmx1792M -Xmn400M -Xss256k {noformat} java -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.8.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1792M -Xmx1792M -Xmn400M -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=103 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseTLAB -XX:+CMSClassUnloadingEnabled -XX:+UseCondCardMark -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -XX:+PrintPromotionFailure -Xloggc:/var/log/cassandra/gc-1421511249.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=48M -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=7199 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -javaagent:/usr/share/java/graphite-reporter-agent-1.0-SNAPSHOT.jar=graphiteServer=metrics-a.hq.nest.com;graphitePort=2003;graphitePollInt=60 -Dlogback.configurationFile=logback.xml -Dcassandra.logdir=/var/log/cassandra -Dcassandra.storagedir= 
-Dcassandra-pidfile=/var/run/cassandra/cassandra.pid -cp /etc/cassandra:/usr/share/cassandra/lib/airline-0.6.jar:/usr/share/cassandra/lib/antlr-runtime-3.5.2.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang3-3.1.jar:/usr/share/cassandra/lib/commons-math3-3.2.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.4.jar:/usr/share/cassandra/lib/disruptor-3.0.1.jar:/usr/share/cassandra/lib/guava-16.0.jar:/usr/share/cassandra/lib/high-scale-lib-1.0.6.jar:/usr/share/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/cassandra/lib/jamm-0.2.8.jar:/usr/share/cassandra/lib/javax.inject.jar:/usr/share/cassandra/lib/jbcrypt-0.3m.jar:/usr/share/cassandra/lib/jline-1.0.jar:/usr/share/cassandra/lib/jna-4.0.0.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.9.1.jar:/usr/share/cassandra/lib/logback-classic-1.1.2.jar:/usr/share/cassandra/lib/logback-core-1.1.2.jar:/usr/share/cassandra/lib/lz4-1.2.0.jar:/usr/share/cassandra/lib/metrics-core-2.2.0.jar:/usr/share/cassandra/lib/metrics-graphite-2.2.0.jar:/usr/share/cassandra/lib/mx4j-tools.jar:/usr/share/cassandra/lib/netty-all-4.0.23.Final.jar:/usr/share/cassandra/lib/reporter-config-2.1.0.jar:/usr/share/cassandra/lib/slf4j-api-1.7.2.jar:/usr/share/cassandra/lib/snakeyaml-1.11.jar:/usr/share/cassandra/lib/snappy-java-1.0.5.2.jar:/usr/share/cassandra/lib/stream-2.5.2.jar:/usr/share/cassandra/lib/stringtemplate-4.0.2.jar:/usr/share/cassandra/lib/super-csv-2.1.0.jar:/usr/share/cassandra/lib/thrift-server-0.3.7.jar:/usr/share/cassandra/apache-cassandra-2.1.2.jar:/usr/share/cassandra/apache-cassandra-thrift-2.1.2.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/cassandra/cassandra-driver-core-2.0.5.jar:/usr/share/cassandra/netty-3.9.0.Final.jar:/usr/share/cassandra/stress.jar: 
-XX:HeapDumpPath=/var/lib/cassandra/java_1421511248.hprof -XX:ErrorFile=/var/lib/cassandra/hs_err_1421511248.log org.apache.cassandra.service.CassandraDaemon {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8720) Provide tools for finding wide row/partition keys
[ https://issues.apache.org/jira/browse/CASSANDRA-8720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302211#comment-14302211 ] J.B. Langston commented on CASSANDRA-8720: -- Better than nothing, but logs can get rotated, deleted, etc., and it would be good to have a way to get this information on demand without having to wait for a compaction to occur. Provide tools for finding wide row/partition keys - Key: CASSANDRA-8720 URL: https://issues.apache.org/jira/browse/CASSANDRA-8720 Project: Cassandra Issue Type: Improvement Reporter: J.B. Langston Multiple users have requested some sort of tool to help identify wide row keys. They get into a situation where they know a wide row/partition has been inserted and it's causing problems for them, but they have no idea what the row key is in order to remove it. Maintaining the widest row key currently encountered and displaying it in cfstats would be one possible approach. Another would be an offline tool (possibly an enhancement to sstablekeys) to show the number of columns/bytes per key in each sstable. If a tool to aggregate the information at a CF level could be provided, that would be a bonus, but it shouldn't be too hard to write a script wrapper to aggregate them if not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
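The cfstats approach suggested above amounts to a small tracker updated once per partition as compaction writes it out. The sketch below is purely illustrative; the class and method names are hypothetical, not existing Cassandra code.

```java
// Hypothetical sketch: remember the widest partition observed so far,
// so it could be surfaced in cfstats. Not actual Cassandra code.
public class WidestPartitionTracker
{
    private String widestKey;
    private long widestBytes;

    // Call once per partition as it is written during compaction.
    public synchronized void update(String partitionKey, long partitionBytes)
    {
        if (partitionBytes > widestBytes)
        {
            widestBytes = partitionBytes;
            widestKey = partitionKey;
        }
    }

    public synchronized String widestKey() { return widestKey; }
    public synchronized long widestBytes() { return widestBytes; }
}
```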
[jira] [Commented] (CASSANDRA-8518) Impose In-Flight Data Limit
[ https://issues.apache.org/jira/browse/CASSANDRA-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302318#comment-14302318 ] Cheng Ren commented on CASSANDRA-8518: -- ObjectSizes is basically a wrapper around MemoryMeter. Is it possible that it will add latency overhead to query processing? Impose In-Flight Data Limit --- Key: CASSANDRA-8518 URL: https://issues.apache.org/jira/browse/CASSANDRA-8518 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Cheng Ren Labels: performance We have been suffering from cassandra node crashes due to out-of-memory errors for a long time. The heap dump from the most recent crash shows 22 native transport request threads, each of which consumes 3.3% of the heap, taking more than 70% in total. Heap dump: !https://dl-web.dropbox.com/get/attach1.png?_subject_uid=303980955w=AAAVOoncBoZ5aOPbDg2TpRkUss7B-2wlrnhUAv19b27OUA|height=400,width=600! Expanded view of one thread: !https://dl-web.dropbox.com/get/Screen%20Shot%202014-12-18%20at%204.06.29%20PM.png?_subject_uid=303980955w=AACUO4wrbxheRUxv8fwQ9P52T6gBOm5_g9zeIe8odu3V3w|height=400,width=600! The Cassandra version we are using now (2.0.4) uses MemoryAwareThreadPoolExecutor as the request executor and provides a default request size estimator that constantly returns 1, meaning it limits only the number of requests pushed to the pool. To have more fine-grained control over handling requests and better protect our nodes from OOM issues, we propose implementing a more precise estimator. Here are our two cents: For update/delete/insert requests: size could be estimated by adding the sizes of all class members together. For scan queries, the major part of the request is the response, which can be estimated from historical data. For example, if we receive a scan query on a column family for a certain token range, we keep track of its response size and use it as the estimated response size for later scan queries on the same cf.
For future requests on the same cf, the response size could be calculated as (token range * recorded size) / recorded token range. The request size should then be estimated as (query size + estimated response size). We believe what we're proposing here can be useful for other people in the Cassandra community as well. Would you mind providing us feedback? Please let us know if you have any concerns or suggestions regarding this proposal. Thanks, Cheng -- This message was sent by Atlassian JIRA (v6.3.4#6332)
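Benedict's earlier suggestion in this thread, caching the measured size per object type so that measurement happens at most once per class, could look roughly like this. The expensive jamm MemoryMeter call is replaced by a placeholder so the sketch stays self-contained; all names here are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of per-class size caching: the slow measurement
// runs once per class, after which sizing is a cheap map lookup.
public class CachedSizeOf
{
    private final Map<Class<?>, Long> cache = new ConcurrentHashMap<>();
    public int measurements = 0; // counts how often the slow path runs

    // Stand-in for an expensive MemoryMeter-style measurement.
    private long measure(Class<?> type)
    {
        measurements++;
        return 48L; // placeholder shallow size, for illustration only
    }

    public long sizeOf(Object o)
    {
        return cache.computeIfAbsent(o.getClass(), this::measure);
    }
}
```

With this shape, the latency concern raised above is confined to the first request per object type; every subsequent request pays only a hash lookup.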
[jira] [Created] (CASSANDRA-8722) Auth MBean needs to be registered
Brandon Williams created CASSANDRA-8722: --- Summary: Auth MBean needs to be registered Key: CASSANDRA-8722 URL: https://issues.apache.org/jira/browse/CASSANDRA-8722 Project: Cassandra Issue Type: Bug Components: Core Reporter: Brandon Williams Assignee: Brandon Williams Priority: Minor Fix For: 2.1.3, 2.0.13 In CASSANDRA-7968 we created this bean but forgot to register it :( This also makes CASSANDRA-7977 unusable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8721) expose shadowed columns in tracing output
[ https://issues.apache.org/jira/browse/CASSANDRA-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302073#comment-14302073 ] Jonathan Ellis commented on CASSANDRA-8721: --- Tracing already tells you how many sstables it needs to open. I'm not sure that adding cell-level detail gives me useful enough information to push tracing into the cell merging logic. expose shadowed columns in tracing output --- Key: CASSANDRA-8721 URL: https://issues.apache.org/jira/browse/CASSANDRA-8721 Project: Cassandra Issue Type: Improvement Reporter: Robert Coli Priority: Trivial Current tracing messages expose how many tombstones are read in order to read live columns, but they do not expose shadowed columns. Shadowed columns are column versions whose timestamp is lower than the highest timestamp observed for that column. It would be useful for users who are tracing queries to understand how many shadowed columns are being read and then ignored. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
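As a toy illustration of what the requested counter would measure during a merge: a cell is shadowed when another version of the same column carries a strictly higher timestamp. Cell here is a stand-in class, not Cassandra's Cell type.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy sketch: count cells that are read but discarded during a merge
// because a newer version of the same column exists.
public class ShadowedCounter
{
    // Stand-in for a cell: a column name plus a write timestamp.
    public static final class Cell
    {
        final String name;
        final long timestamp;
        public Cell(String name, long timestamp) { this.name = name; this.timestamp = timestamp; }
    }

    public static int countShadowed(List<Cell> cells)
    {
        // First pass: highest timestamp seen per column name.
        Map<String, Long> maxTimestamp = new HashMap<>();
        for (Cell c : cells)
            maxTimestamp.merge(c.name, c.timestamp, Math::max);

        // Second pass: anything below that maximum was read, then ignored.
        int shadowed = 0;
        for (Cell c : cells)
            if (c.timestamp < maxTimestamp.get(c.name))
                shadowed++;
        return shadowed;
    }
}
```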
[jira] [Updated] (CASSANDRA-8722) Auth MBean needs to be registered
[ https://issues.apache.org/jira/browse/CASSANDRA-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-8722: Reviewer: Aleksey Yeschenko Reproduced In: 2.0.11 Auth MBean needs to be registered - Key: CASSANDRA-8722 URL: https://issues.apache.org/jira/browse/CASSANDRA-8722 Project: Cassandra Issue Type: Bug Components: Core Reporter: Brandon Williams Assignee: Brandon Williams Priority: Minor Fix For: 2.1.3, 2.0.13 In CASSANDRA-7968 we created this bean but forgot to register it :( This also makes CASSANDRA-7977 unusable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8211) Overlapping sstables in L1+
[ https://issues.apache.org/jira/browse/CASSANDRA-8211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301143#comment-14301143 ] Imri Zvik commented on CASSANDRA-8211: -- [~krummas] Does the same go for what I'm seeing in CASSANDRA-8210? Is that also unrelated? Overlapping sstables in L1+ --- Key: CASSANDRA-8211 URL: https://issues.apache.org/jira/browse/CASSANDRA-8211 Project: Cassandra Issue Type: Bug Reporter: Marcus Eriksson Assignee: Marcus Eriksson Fix For: 2.0.12, 2.1.3 Attachments: 0001-Avoid-overlaps-in-L1-v2.patch, 0001-Avoid-overlaps-in-L1.patch Seems we have a bug that can create overlapping sstables in L1: {code} WARN [main] 2014-10-28 04:09:42,295 LeveledManifest.java (line 164) At level 2, SSTableReader(path='sstable') [DecoratedKey(2838397575996053472, 00 10066059b210066059b210400100), DecoratedKey(5516674013223138308, 001000ff2d161000ff2d160 00010400100)] overlaps SSTableReader(path='sstable') [DecoratedKey(2839992722300822584, 0010 00229ad21000229ad210400100), DecoratedKey(5532836928694021724, 0010034b05a610034b05a6100 000400100)]. This could be caused by a bug in Cassandra 1.1.0 .. 1.1.3 or due to the fact that you have dropped sstables from another node into the data directory. Sending back to L0.
If you didn't drop in sstables, and have not yet run scrub, you should do so since you may also have rows out-of-order within an sstable {code} Which might manifest itself during compaction with this exception: {code} ERROR [CompactionExecutor:3152] 2014-10-28 00:24:06,134 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:3152,1,main] java.lang.RuntimeException: Last written key DecoratedKey(5516674013223138308, 001000ff2d161000ff2d1610400100) = current key DecoratedKey(2839992722300822584, 001000229ad21000229ad210400100) writing into sstable {code} since we use LeveledScanner when compacting (the backing sstable scanner might go beyond the start of the next sstable scanner) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8210) java.lang.AssertionError: Memory was freed exception in CompactionExecutor
[ https://issues.apache.org/jira/browse/CASSANDRA-8210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301151#comment-14301151 ] Imri Zvik commented on CASSANDRA-8210: -- Sure: INFO [CompactionExecutor:513] 2015-02-01 09:54:21,803 CompactionManager.java (line 563) Cleaning up SSTableReader(path='/var/lib/cassandra/data/accounts/account_store_data/accounts-account_store_data-jb-6-Data.db') ERROR [CompactionExecutor:513] 2015-02-01 09:54:21,815 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:513,1,RMI Runtime] java.lang.AssertionError: Memory was freed at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:259) at org.apache.cassandra.io.util.Memory.getInt(Memory.java:211) at org.apache.cassandra.io.sstable.IndexSummary.getIndex(IndexSummary.java:79) at org.apache.cassandra.io.sstable.IndexSummary.getKey(IndexSummary.java:84) at org.apache.cassandra.io.sstable.IndexSummary.binarySearch(IndexSummary.java:58) at org.apache.cassandra.io.sstable.SSTableReader.getIndexScanPosition(SSTableReader.java:602) at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:947) at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:910) at org.apache.cassandra.io.sstable.SSTableReader.getPositionsForRanges(SSTableReader.java:819) at org.apache.cassandra.db.ColumnFamilyStore.getExpectedCompactedFileSize(ColumnFamilyStore.java:1088) at org.apache.cassandra.db.compaction.CompactionManager.doCleanupCompaction(CompactionManager.java:564) at org.apache.cassandra.db.compaction.CompactionManager.access$400(CompactionManager.java:63) at org.apache.cassandra.db.compaction.CompactionManager$5.perform(CompactionManager.java:281) at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:225) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) java.lang.AssertionError: Memory was freed exception in CompactionExecutor Key: CASSANDRA-8210 URL: https://issues.apache.org/jira/browse/CASSANDRA-8210 Project: Cassandra Issue Type: Bug Components: Core Environment: DSE 4.5.2, Cassandra 2.0.10, OEL 6.5, kernel 3.8.13-44.el6uek.x86_64, 128Gb of RAM, swap disabled, JRE 1.7.0_67-b01 Reporter: Nikolai Grigoriev Priority: Minor Attachments: cassandra-env.sh, cassandra.yaml, occurence frequency.png, system.log.gz I have just got this problem on multiple nodes. Cassandra 2.0.10 (DSE 4.5.2). After looking through the history I have found that it was actually happening on all nodes since the start of large compaction process (I've loaded tons of data in the system and then turned off all load to let it compact the data). {code} ERROR [CompactionExecutor:1196] 2014-10-28 17:14:50,124 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:1196,1,main] java.lang.AssertionError: Memory was freed at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:259) at org.apache.cassandra.io.util.Memory.getInt(Memory.java:211) at org.apache.cassandra.io.sstable.IndexSummary.getIndex(IndexSummary.java:79) at org.apache.cassandra.io.sstable.IndexSummary.getKey(IndexSummary.java:84) at org.apache.cassandra.io.sstable.IndexSummary.binarySearch(IndexSummary.java:58) at org.apache.cassandra.io.sstable.SSTableReader.getSampleIndexesForRanges(SSTableReader.java:692) at org.apache.cassandra.io.sstable.SSTableReader.estimatedKeysForRanges(SSTableReader.java:663) at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.worthDroppingTombstones(AbstractCompactionStrategy.java:328) at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.findDroppableSSTable(LeveledCompactionStrategy.java:354) at 
org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getMaximalTask(LeveledCompactionStrategy.java:125) at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:113) at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:192) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at
[jira] [Updated] (CASSANDRA-8716) java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed when running cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-8716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imri Zvik updated CASSANDRA-8716: - Description: Error occurred during cleanup java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:188) at org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:234) at org.apache.cassandra.db.compaction.CompactionManager.performCleanup(CompactionManager.java:272) at org.apache.cassandra.db.ColumnFamilyStore.forceCleanup(ColumnFamilyStore.java:1115) at org.apache.cassandra.service.StorageService.forceKeyspaceCleanup(StorageService.java:2177) at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75) at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801) at 
javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487) at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420) at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848) at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322) at sun.rmi.transport.Transport$1.run(Transport.java:177) at sun.rmi.transport.Transport$1.run(Transport.java:174) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:173) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.AssertionError: Memory was freed at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:259) at org.apache.cassandra.io.util.Memory.getInt(Memory.java:211) at org.apache.cassandra.io.sstable.IndexSummary.getIndex(IndexSummary.java:79) at org.apache.cassandra.io.sstable.IndexSummary.getKey(IndexSummary.java:84) at org.apache.cassandra.io.sstable.IndexSummary.binarySearch(IndexSummary.java:58) at org.apache.cassandra.io.sstable.SSTableReader.getIndexScanPosition(SSTableReader.java:602) at 
org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:947) at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:910) at org.apache.cassandra.io.sstable.SSTableReader.getPositionsForRanges(SSTableReader.java:819) at org.apache.cassandra.db.ColumnFamilyStore.getExpectedCompactedFileSize(ColumnFamilyStore.java:1088) at org.apache.cassandra.db.compaction.CompactionManager.doCleanupCompaction(CompactionManager.java:564) at org.apache.cassandra.db.compaction.CompactionManager.access$400(CompactionManager.java:63) at
[jira] [Created] (CASSANDRA-8716) java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed when running cleanup
Imri Zvik created CASSANDRA-8716: Summary: java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed when running cleanup Key: CASSANDRA-8716 URL: https://issues.apache.org/jira/browse/CASSANDRA-8716 Project: Cassandra Issue Type: Bug Components: Core Environment: Centos 6.6, Cassandra 2.0.12, Oracle JDK 1.7.0_67 Reporter: Imri Zvik Priority: Minor Attachments: system.log.gz The error continue to happen even after scrub. If I retry a few time, it may run successfully. I am attaching the server log. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8086) Cassandra should have ability to limit the number of native connections
[ https://issues.apache.org/jira/browse/CASSANDRA-8086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Norman Maurer updated CASSANDRA-8086: - Attachment: 0001-CASSANDRA-8086-Allow-to-limit-the-number-of-native-c.patch Latest patch with the ability to limit per source IP or limit in general Cassandra should have ability to limit the number of native connections --- Key: CASSANDRA-8086 URL: https://issues.apache.org/jira/browse/CASSANDRA-8086 Project: Cassandra Issue Type: Bug Reporter: Vishy Kasar Assignee: Norman Maurer Fix For: 2.1.3 Attachments: 0001-CASSANDRA-8086-Allow-to-limit-the-number-of-native-c.patch, 0001-CASSANDRA-8086-Allow-to-limit-the-number-of-native-c.txt We have a production cluster with 72 instances spread across 2 DCs. We have a large number (~40,000) of clients hitting this cluster. Clients normally connect to 4 cassandra instances. Some event (we think it was a schema change on the server side) triggered the clients to establish connections to all cassandra instances of the local DC. This brought the servers to their knees. The client connections failed and clients attempted re-connections. Cassandra should protect itself from such attacks from clients. Do we have any knobs to control the maximum number of connections? If not, we need to add that knob. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-8717) Top-k queries with custom secondary indexes
Andrés de la Peña created CASSANDRA-8717: Summary: Top-k queries with custom secondary indexes Key: CASSANDRA-8717 URL: https://issues.apache.org/jira/browse/CASSANDRA-8717 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Andrés de la Peña Priority: Minor Fix For: 2.1.3 Attachments: 0001-Add-support-for-top-k-queries-in-2i.patch As presented in [Cassandra Summit Europe 2014|https://www.youtube.com/watch?v=Hg5s-hXy_-M], secondary indexes can be modified to support general top-k queries with minimal changes to the Cassandra codebase. This way, custom 2i implementations could provide relevance search, sorting by columns, etc. Top-k queries retrieve the k best results for a certain query. That implies querying the k best rows in each token range and then sorting them to obtain the k globally best rows. To do that, we propose two additional methods in class SecondaryIndexSearcher:
{code:java}
public boolean requiresFullScan(List<IndexExpression> clause)
{
    return false;
}

public List<Row> sort(List<IndexExpression> clause, List<Row> rows)
{
    return rows;
}
{code}
The first one indicates whether a query performed on the index requires querying all the nodes in the ring. It is necessary in top-k queries because we do not know which nodes hold the best results. The second method specifies how to sort all the partial per-node results according to the query. Then we add two similar methods to the class AbstractRangeCommand:
{code:java}
this.searcher = Keyspace.open(keyspace).getColumnFamilyStore(columnFamily).indexManager.searcher(rowFilter);

public boolean requiresFullScan()
{
    return searcher == null ? false : searcher.requiresFullScan(rowFilter);
}

public List<Row> combine(List<Row> rows)
{
    return searcher == null ? trim(rows) : trim(searcher.sort(rowFilter, rows));
}
{code}
Finally, we modify StorageProxy#getRangeSlice to use the previous methods, as shown in the attached patch.
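A custom 2i implementation could then override the two proposed hooks along these lines. This is a toy sketch: RelevanceSearcher, the inner Row type, and its score field are illustrative stand-ins, not classes from the actual patch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical relevance-ranking searcher illustrating the two proposed
// hooks. Toy types only; not Cassandra code.
public class RelevanceSearcher
{
    public static final class Row
    {
        public final String key;
        public final double score; // e.g. a full-text relevance score
        public Row(String key, double score) { this.key = key; this.score = score; }
    }

    // Top-k relevance queries cannot be answered from a subset of the ring:
    // any token range may hold one of the k globally best rows.
    public boolean requiresFullScan(List<String> clause)
    {
        return true;
    }

    // Merge the partial per-node results into a single relevance ordering;
    // the coordinator would then trim the sorted list to k.
    public List<Row> sort(List<String> clause, List<Row> rows)
    {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparingDouble((Row r) -> r.score).reversed());
        return sorted;
    }
}
```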
We think that the proposed approach provides very useful functionality with minimal impact on the current codebase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8390) The process cannot access the file because it is being used by another process
[ https://issues.apache.org/jira/browse/CASSANDRA-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301135#comment-14301135 ] Wei Yuan Cho commented on CASSANDRA-8390: - Hi there, I'm seeing an error similar to the one in this thread, and I'm running Windows Server 2008. I have set DiskAccessMode to standard and I'm still getting the following error: INFO [main] 2015-01-23 16:41:01,578 DatabaseDescriptor.java:211 - DiskAccessMode is standard, indexAccessMode is standard {quote} ERROR [NonPeriodicTasks:1] 2015-01-30 22:02:02,284 SSTableDeletingTask.java:89 - Unable to delete \var\lib\cassandra\data\CM\Alerts\CM-Alerts-ka-137342-Data.db (it will be removed on server restart; we'll also retry after GC) ERROR [NonPeriodicTasks:1] 2015-01-30 22:02:02,284 SSTableDeletingTask.java:89 - Unable to delete \var\lib\cassandra\data\CM\Alerts\CM-Alerts-ka-137341-Data.db (it will be removed on server restart; we'll also retry after GC) INFO [CompactionExecutor:3644] 2015-01-30 22:02:02,284 CompactionTask.java:252 - Compacted 4 sstables to [\var\lib\cassandra\data\CM\Alerts\CM-Alerts-ka-137344,]. 4,830,501 bytes to 2,453,934 (~50% of original) in 1,433ms = 1.633115MB/s. 13 total partitions merged to 4. Partition merge counts were {1:1, 4:3, } INFO [CompactionExecutor:3645] 2015-01-30 22:02:02,565 CompactionTask.java:252 - Compacted 4 sstables to [\var\lib\cassandra\data\system\compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b\system-compactions_in_progress-ka-5659,]. 424 bytes to 42 (~9% of original) in 285ms = 0.000141MB/s. 4 total partitions merged to 1. Partition merge counts were {2:2, } ERROR [NonPeriodicTasks:1] 2015-01-30 22:02:02,659 CassandraDaemon.java:153 - Exception in thread Thread[NonPeriodicTasks:1,5,main] org.apache.cassandra.io.FSWriteError: java.nio.file.FileSystemException: \var\lib\cassandra\data\CM\Alerts\CM-Alerts-ka-137339-Index.db: The process cannot access the file because it is being used by another process.
at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:135) ~[apache-cassandra-2.1.1.jar:2.1.1] at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:121) ~[apache-cassandra-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTable.delete(SSTable.java:113) ~[apache-cassandra-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTableDeletingTask.run(SSTableDeletingTask.java:94) ~[apache-cassandra-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTableReader$6.run(SSTableReader.java:664) ~[apache-cassandra-2.1.1.jar:2.1.1] at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[na:1.7.0_71] at java.util.concurrent.FutureTask.run(Unknown Source) ~[na:1.7.0_71] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source) ~[na:1.7.0_71] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) ~[na:1.7.0_71] at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [na:1.7.0_71] at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [na:1.7.0_71] at java.lang.Thread.run(Unknown Source) [na:1.7.0_71] Caused by: java.nio.file.FileSystemException: \var\lib\cassandra\data\CM\Alerts\CM-Alerts-ka-137339-Index.db: The process cannot access the file because it is being used by another process. at sun.nio.fs.WindowsException.translateToIOException(Unknown Source) ~[na:1.7.0_71] at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source) ~[na:1.7.0_71] at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source) ~[na:1.7.0_71] at sun.nio.fs.WindowsFileSystemProvider.implDelete(Unknown Source) ~[na:1.7.0_71] at sun.nio.fs.AbstractFileSystemProvider.delete(Unknown Source) ~[na:1.7.0_71] at java.nio.file.Files.delete(Unknown Source) ~[na:1.7.0_71] at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:131) ~[apache-cassandra-2.1.1.jar:2.1.1] ... 
11 common frames omitted ERROR [NonPeriodicTasks:1] 2015-01-30 22:02:02,925 StorageService.java:377 - Stopping gossiper WARN [NonPeriodicTasks:1] 2015-01-30 22:02:02,925 StorageService.java:291 - Stopping gossip by operator request INFO [NonPeriodicTasks:1] 2015-01-30 22:02:02,925 Gossiper.java:1317 - Announcing shutdown ERROR [NonPeriodicTasks:1] 2015-01-30 22:02:04,925 StorageService.java:382 - Stopping RPC server INFO [NonPeriodicTasks:1] 2015-01-30 22:02:04,925 ThriftServer.java:142 - Stop listening to thrift clients ERROR [NonPeriodicTasks:1] 2015-01-30 22:02:04,956 StorageService.java:387 - Stopping native transport INFO [NonPeriodicTasks:1] 2015-01-30 22:02:05,519 Server.java:213 - Stop listening for CQL clients {quote} I've read that 3.0 will fix the root
[jira] [Updated] (CASSANDRA-8390) The process cannot access the file because it is being used by another process
[ https://issues.apache.org/jira/browse/CASSANDRA-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yuan Cho updated CASSANDRA-8390: Attachment: CassandraDiedWithDiskAccessModeStandardLogs.7z The process cannot access the file because it is being used by another process -- Key: CASSANDRA-8390 URL: https://issues.apache.org/jira/browse/CASSANDRA-8390 Project: Cassandra Issue Type: Bug Reporter: Ilya Komolkin Assignee: Joshua McKenzie Fix For: 2.1.3 Attachments: CassandraDiedWithDiskAccessModeStandardLogs.7z, NoHostAvailableLogs.zip {code}21:46:27.810 [NonPeriodicTasks:1] ERROR o.a.c.service.CassandraDaemon - Exception in thread Thread[NonPeriodicTasks:1,5,main] org.apache.cassandra.io.FSWriteError: java.nio.file.FileSystemException: E:\Upsource_12391\data\cassandra\data\kernel\filechangehistory_t-a277b560764611e48c8e4915424c75fe\kernel-filechangehistory_t-ka-33-Index.db: The process cannot access the file because it is being used by another process. at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:135) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:121) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTable.delete(SSTable.java:113) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTableDeletingTask.run(SSTableDeletingTask.java:94) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTableReader$6.run(SSTableReader.java:664) ~[cassandra-all-2.1.1.jar:2.1.1] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_71] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_71] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178) ~[na:1.7.0_71] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292) ~[na:1.7.0_71] at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_71] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71] Caused by: java.nio.file.FileSystemException: E:\Upsource_12391\data\cassandra\data\kernel\filechangehistory_t-a277b560764611e48c8e4915424c75fe\kernel-filechangehistory_t-ka-33-Index.db: The process cannot access the file because it is being used by another process. at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86) ~[na:1.7.0_71] at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97) ~[na:1.7.0_71] at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102) ~[na:1.7.0_71] at sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269) ~[na:1.7.0_71] at sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103) ~[na:1.7.0_71] at java.nio.file.Files.delete(Files.java:1079) ~[na:1.7.0_71] at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:131) ~[cassandra-all-2.1.1.jar:2.1.1] ... 11 common frames omitted{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
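For reference, the DiskAccessMode logged above is driven by the {{disk_access_mode}} key in cassandra.yaml (it ships commented out, so its presence in any given config is an assumption); a minimal fragment forcing buffered I/O would look like:

```yaml
# cassandra.yaml -- use buffered (standard) I/O instead of memory-mapped
# access for data and index files; the default is 'auto'.
disk_access_mode: standard
```

The startup log line "DiskAccessMode is standard, indexAccessMode is standard" confirms the setting took effect.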
[jira] [Commented] (CASSANDRA-8696) nodetool repair on cassandra 2.1.2 keyspaces return java.lang.RuntimeException: Could not create snapshot
[ https://issues.apache.org/jira/browse/CASSANDRA-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301157#comment-14301157 ] Jan Karlsson commented on CASSANDRA-8696: - I was only able to reproduce this when the amount of data on disk was over 12G. From taking a quick glance at the code, this is caused by the snapshot process throwing a timeout. nodetool repair on cassandra 2.1.2 keyspaces return java.lang.RuntimeException: Could not create snapshot - Key: CASSANDRA-8696 URL: https://issues.apache.org/jira/browse/CASSANDRA-8696 Project: Cassandra Issue Type: Bug Reporter: Jeff Liu When trying to run nodetool repair -pr on a Cassandra node (2.1.2), Cassandra throws a Java exception: cannot create snapshot. The error log from system.log: {noformat} INFO [STREAM-IN-/10.97.9.110] 2015-01-28 02:07:28,815 StreamResultFuture.java:166 - [Stream #692c1450-a692-11e4-9973-070e938df227 ID#0] Prepare completed. Receiving 2 files(221187 bytes), sending 5 files(632105 bytes) INFO [STREAM-IN-/10.97.9.110] 2015-01-28 02:07:29,046 StreamResultFuture.java:180 - [Stream #692c1450-a692-11e4-9973-070e938df227] Session with /10.97.9.110 is complete INFO [STREAM-IN-/10.97.9.110] 2015-01-28 02:07:29,046 StreamResultFuture.java:212 - [Stream #692c1450-a692-11e4-9973-070e938df227] All sessions completed INFO [STREAM-IN-/10.97.9.110] 2015-01-28 02:07:29,047 StreamingRepairTask.java:96 - [repair #685e3d00-a692-11e4-9973-070e938df227] streaming task succeed, returning response to /10.98.194.68 INFO [RepairJobTask:1] 2015-01-28 02:07:29,065 StreamResultFuture.java:86 - [Stream #692c6270-a692-11e4-9973-070e938df227] Executing streaming plan for Repair INFO [StreamConnectionEstablisher:4] 2015-01-28 02:07:29,065 StreamSession.java:213 - [Stream #692c6270-a692-11e4-9973-070e938df227] Starting streaming to /10.66.187.201 INFO [StreamConnectionEstablisher:4] 2015-01-28 02:07:29,070 StreamCoordinator.java:209 - [Stream #692c6270-a692-11e4-9973-070e938df227, 
ID#0] Beginning stream session with /10.66.187.201 INFO [STREAM-IN-/10.66.187.201] 2015-01-28 02:07:29,465 StreamResultFuture.java:166 - [Stream #692c6270-a692-11e4-9973-070e938df227 ID#0] Prepare completed. Receiving 5 files(627994 bytes), sending 5 files(632105 bytes) INFO [StreamReceiveTask:22] 2015-01-28 02:07:31,971 StreamResultFuture.java:180 - [Stream #692c6270-a692-11e4-9973-070e938df227] Session with /10.66.187.201 is complete INFO [StreamReceiveTask:22] 2015-01-28 02:07:31,972 StreamResultFuture.java:212 - [Stream #692c6270-a692-11e4-9973-070e938df227] All sessions completed INFO [StreamReceiveTask:22] 2015-01-28 02:07:31,972 StreamingRepairTask.java:96 - [repair #685e3d00-a692-11e4-9973-070e938df227] streaming task succeed, returning response to /10.98.194.68 ERROR [RepairJobTask:1] 2015-01-28 02:07:39,444 RepairJob.java:127 - Error occurred during snapshot phase java.lang.RuntimeException: Could not create snapshot at /10.97.9.110 at org.apache.cassandra.repair.SnapshotTask$SnapshotCallback.onFailure(SnapshotTask.java:77) ~[apache-cassandra-2.1.2.jar:2.1.2] at org.apache.cassandra.net.MessagingService$5$1.run(MessagingService.java:347) ~[apache-cassandra-2.1.2.jar:2.1.2] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_45] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_45] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45] at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45] INFO [AntiEntropySessions:6] 2015-01-28 02:07:39,445 RepairSession.java:260 - [repair #6f85e740-a692-11e4-9973-070e938df227] new session: will sync /10.98.194.68, /10.66.187.201, /10.226.218.135 on range (128171798046680518737469720690862638799,12863540308359254031520865977436165] for events.[bigint0text, bigint0boolean, bigint0int, dataset_catalog, column_categories, 
bigint0double, bigint0bigint] ERROR [AntiEntropySessions:5] 2015-01-28 02:07:39,445 RepairSession.java:303 - [repair #685e3d00-a692-11e4-9973-070e938df227] session completed with the following error java.io.IOException: Failed during snapshot creation. at org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:344) ~[apache-cassandra-2.1.2.jar:2.1.2] at org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:128) ~[apache-cassandra-2.1.2.jar:2.1.2] at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172)
[jira] [Commented] (CASSANDRA-8210) java.lang.AssertionError: Memory was freed exception in CompactionExecutor
[ https://issues.apache.org/jira/browse/CASSANDRA-8210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301155#comment-14301155 ] Marcus Eriksson commented on CASSANDRA-8210: [~imriz] I think that merits a new ticket, could you create one? It would be great if you could include, say, 1 hour of logs before the exception. Also mention whether it is LCS or STCS. java.lang.AssertionError: Memory was freed exception in CompactionExecutor Key: CASSANDRA-8210 URL: https://issues.apache.org/jira/browse/CASSANDRA-8210 Project: Cassandra Issue Type: Bug Components: Core Environment: DSE 4.5.2, Cassandra 2.0.10, OEL 6.5, kernel 3.8.13-44.el6uek.x86_64, 128Gb of RAM, swap disabled, JRE 1.7.0_67-b01 Reporter: Nikolai Grigoriev Priority: Minor Attachments: cassandra-env.sh, cassandra.yaml, occurence frequency.png, system.log.gz I have just got this problem on multiple nodes. Cassandra 2.0.10 (DSE 4.5.2). After looking through the history I have found that it was actually happening on all nodes since the start of the large compaction process (I've loaded tons of data into the system and then turned off all load to let it compact the data). 
{code} ERROR [CompactionExecutor:1196] 2014-10-28 17:14:50,124 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:1196,1,main] java.lang.AssertionError: Memory was freed at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:259) at org.apache.cassandra.io.util.Memory.getInt(Memory.java:211) at org.apache.cassandra.io.sstable.IndexSummary.getIndex(IndexSummary.java:79) at org.apache.cassandra.io.sstable.IndexSummary.getKey(IndexSummary.java:84) at org.apache.cassandra.io.sstable.IndexSummary.binarySearch(IndexSummary.java:58) at org.apache.cassandra.io.sstable.SSTableReader.getSampleIndexesForRanges(SSTableReader.java:692) at org.apache.cassandra.io.sstable.SSTableReader.estimatedKeysForRanges(SSTableReader.java:663) at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.worthDroppingTombstones(AbstractCompactionStrategy.java:328) at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.findDroppableSSTable(LeveledCompactionStrategy.java:354) at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getMaximalTask(LeveledCompactionStrategy.java:125) at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:113) at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:192) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
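The assertion above is a check-then-use race: a compaction task still holds a reference to an sstable whose off-heap index summary has already been released. A minimal, illustrative Python sketch of that hazard (this is not Cassandra's actual Java code; the class and method names only mirror the frames in the trace):

```python
# Illustrative model of the "Memory was freed" race: a stale reader
# touches an off-heap buffer after a concurrent task has released it.
class Memory:
    def __init__(self, size):
        self.size = size
        self.freed = False

    def free(self):
        # Mirrors releasing the off-heap allocation during compaction.
        self.freed = True

    def get_int(self, pos):
        # Mirrors Memory.checkPosition(): accessing released memory
        # trips the assertion instead of reading garbage.
        if self.freed:
            raise AssertionError("Memory was freed")
        return 0

mem = Memory(1024)
mem.free()                # compaction replaces the sstable and frees memory
try:
    mem.get_int(0)        # a task holding a stale SSTableReader reads anyway
except AssertionError as e:
    print(e)              # -> Memory was freed
```

The fix direction in Cassandra was reference-counting the summary so it cannot be freed while still reachable; the sketch only shows why the assertion fires.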
[jira] [Commented] (CASSANDRA-8712) Out-of-sync secondary index
[ https://issues.apache.org/jira/browse/CASSANDRA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301096#comment-14301096 ] Sylvain Lebresne commented on CASSANDRA-8712: - I'm not familiar with django-cassandra-engine and I'm not sure other Cassandra devs are, so it would be much simpler to limit the layers used to reproduce (to reduce the possibility that the problem actually comes from one of those layers). Out-of-sync secondary index --- Key: CASSANDRA-8712 URL: https://issues.apache.org/jira/browse/CASSANDRA-8712 Project: Cassandra Issue Type: Bug Environment: 2.1.2 Reporter: mlowicki Fix For: 2.1.3 I have the following table with an index: {code} CREATE TABLE entity ( user_id text, data_type_id int, version bigint, id text, cache_guid text, client_defined_unique_tag text, ctime timestamp, deleted boolean, folder boolean, mtime timestamp, name text, originator_client_item_id text, parent_id text, position blob, server_defined_unique_tag text, specifics blob, PRIMARY KEY (user_id, data_type_id, version, id) ) WITH CLUSTERING ORDER BY (data_type_id ASC, version ASC, id ASC) AND bloom_filter_fp_chance = 0.01 AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}' AND comment = '' AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE'; CREATE INDEX index_entity_parent_id ON entity (parent_id); {code} It turned out that the index became out of sync: {code} Entity.objects.filter(user_id='255824802', parent_id=parent_id).consistency(6).count() 16 counter = 0 for e in Entity.objects.filter(user_id='255824802'): ... 
if e.parent_id and e.parent_id == parent_id: ... counter += 1 ... counter 10 {code} After a couple of hours it was fine (at night), but then, when the user probably started to interact with the DB, we got the same problem. As a temporary solution we'll try to rebuild the indexes from time to time, as suggested in http://dev.nuclearrooster.com/2013/01/20/using-nodetool-to-rebuild-secondary-indexes-in-cassandra/ I launched a simple script to check for this anomaly; before rebuilding the index, 10378 out of 4024856 folders had this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8211) Overlapping sstables in L1+
[ https://issues.apache.org/jira/browse/CASSANDRA-8211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301131#comment-14301131 ] Marcus Eriksson commented on CASSANDRA-8211: [~imriz] that is unrelated (compaction_history is not LCS). If you see this on more than one node, file a new ticket; otherwise I would suspect you have some data corruption and need to run scrub. Overlapping sstables in L1+ --- Key: CASSANDRA-8211 URL: https://issues.apache.org/jira/browse/CASSANDRA-8211 Project: Cassandra Issue Type: Bug Reporter: Marcus Eriksson Assignee: Marcus Eriksson Fix For: 2.0.12, 2.1.3 Attachments: 0001-Avoid-overlaps-in-L1-v2.patch, 0001-Avoid-overlaps-in-L1.patch Seems we have a bug that can create overlapping sstables in L1: {code} WARN [main] 2014-10-28 04:09:42,295 LeveledManifest.java (line 164) At level 2, SSTableReader(path='sstable') [DecoratedKey(2838397575996053472, 0010066059b210066059b210400100), DecoratedKey(5516674013223138308, 001000ff2d161000ff2d16000010400100)] overlaps SSTableReader(path='sstable') [DecoratedKey(2839992722300822584, 001000229ad21000229ad210400100), DecoratedKey(5532836928694021724, 0010034b05a610034b05a6100000400100)]. This could be caused by a bug in Cassandra 1.1.0 .. 1.1.3 or due to the fact that you have dropped sstables from another node into the data directory. Sending back to L0. 
If you didn't drop in sstables, and have not yet run scrub, you should do so since you may also have rows out-of-order within an sstable {code} Which might manifest itself during compaction with this exception: {code} ERROR [CompactionExecutor:3152] 2014-10-28 00:24:06,134 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:3152,1,main] java.lang.RuntimeException: Last written key DecoratedKey(5516674013223138308, 001000ff2d161000ff2d1610400100) >= current key DecoratedKey(2839992722300822584, 001000229ad21000229ad210400100) writing into sstable {code} since we use LeveledScanner when compacting (the backing sstable scanner might go beyond the start of the next sstable scanner) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-8210) java.lang.AssertionError: Memory was freed exception in CompactionExecutor
[ https://issues.apache.org/jira/browse/CASSANDRA-8210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301151#comment-14301151 ] Imri Zvik edited comment on CASSANDRA-8210 at 2/2/15 10:59 AM: --- Sure: INFO [CompactionExecutor:513] 2015-02-01 09:54:21,803 CompactionManager.java (line 563) Cleaning up SSTableReader(path='/var/lib/cassandra/data/accounts/account_store_data/accounts-account_store_data-jb-6-Data.db') ERROR [CompactionExecutor:513] 2015-02-01 09:54:21,815 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:513,1,RMI Runtime] java.lang.AssertionError: Memory was freed at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:259) at org.apache.cassandra.io.util.Memory.getInt(Memory.java:211) at org.apache.cassandra.io.sstable.IndexSummary.getIndex(IndexSummary.java:79) at org.apache.cassandra.io.sstable.IndexSummary.getKey(IndexSummary.java:84) at org.apache.cassandra.io.sstable.IndexSummary.binarySearch(IndexSummary.java:58) at org.apache.cassandra.io.sstable.SSTableReader.getIndexScanPosition(SSTableReader.java:602) at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:947) at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:910) at org.apache.cassandra.io.sstable.SSTableReader.getPositionsForRanges(SSTableReader.java:819) at org.apache.cassandra.db.ColumnFamilyStore.getExpectedCompactedFileSize(ColumnFamilyStore.java:1088) at org.apache.cassandra.db.compaction.CompactionManager.doCleanupCompaction(CompactionManager.java:564) at org.apache.cassandra.db.compaction.CompactionManager.access$400(CompactionManager.java:63) at org.apache.cassandra.db.compaction.CompactionManager$5.perform(CompactionManager.java:281) at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:225) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) (By the way, I've seen this on all of our nodes, not just one node) was (Author: imriz): Sure: INFO [CompactionExecutor:513] 2015-02-01 09:54:21,803 CompactionManager.java (line 563) Cleaning up SSTableReader(path='/var/lib/cassandra/data/accounts/account_store_data/accounts-account_store_data-jb-6-Data.db') ERROR [CompactionExecutor:513] 2015-02-01 09:54:21,815 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:513,1,RMI Runtime] java.lang.AssertionError: Memory was freed at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:259) at org.apache.cassandra.io.util.Memory.getInt(Memory.java:211) at org.apache.cassandra.io.sstable.IndexSummary.getIndex(IndexSummary.java:79) at org.apache.cassandra.io.sstable.IndexSummary.getKey(IndexSummary.java:84) at org.apache.cassandra.io.sstable.IndexSummary.binarySearch(IndexSummary.java:58) at org.apache.cassandra.io.sstable.SSTableReader.getIndexScanPosition(SSTableReader.java:602) at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:947) at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:910) at org.apache.cassandra.io.sstable.SSTableReader.getPositionsForRanges(SSTableReader.java:819) at org.apache.cassandra.db.ColumnFamilyStore.getExpectedCompactedFileSize(ColumnFamilyStore.java:1088) at org.apache.cassandra.db.compaction.CompactionManager.doCleanupCompaction(CompactionManager.java:564) at org.apache.cassandra.db.compaction.CompactionManager.access$400(CompactionManager.java:63) at org.apache.cassandra.db.compaction.CompactionManager$5.perform(CompactionManager.java:281) at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:225) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) java.lang.AssertionError: Memory was freed exception in CompactionExecutor Key: CASSANDRA-8210 URL: https://issues.apache.org/jira/browse/CASSANDRA-8210 Project: Cassandra Issue Type: Bug Components: Core Environment: DSE 4.5.2, Cassandra 2.0.10, OEL 6.5, kernel 3.8.13-44.el6uek.x86_64, 128Gb of RAM, swap
[jira] [Commented] (CASSANDRA-8712) Out-of-sync secondary index
[ https://issues.apache.org/jira/browse/CASSANDRA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301074#comment-14301074 ] mlowicki commented on CASSANDRA-8712: - 1. Drop the keyspace {code} cqlsh> use sync; cqlsh:sync> drop keyspace sync; cqlsh:sync> {code} 2. Create the keyspace from scratch (I'm using sync_cassandra from django-cassandra-engine) {code} ./bin/django sync_cassandra Creating keyspace sync.. Syncing sync.api.models.Entity Syncing sync.api.models.UserStore {code} 3. Populate the database using Django's shell {code} from sync.api.models import Entity, UserStore user = UserStore.objects.create(user_id='foo') root = Entity.objects.create(user_id='foo', data_type_id=0, version=0, id='-1') {code} 4. Run the {{check_parent_index_consistency}} script: {code} ./bin/django check_parent_index_consistency { folder: 1, user: 1 } {code} 5. Add entities to the root folder {code} for i in range(1): Entity.objects.create(user_id='foo', data_type_id=0, version=0, id='a' + str(i), parent_id='-1', folder=False) {code} 6. While inserting, run the {{check_parent_index_consistency}} script: {code} ./bin/django check_parent_index_consistency { folder: 1, inconsistent_folder: 1, user: 1 } {code} The number of entities returned directly from {{entity}} while running the insert was 8918, but I got only 372 from the index. It seems to be related to the number of entities I'm adding. If less than 1 I couldn't reproduce the issue. When running the {{check_parent_index_consistency}} script after a couple of minutes it was completely fine - no inconsistencies. Not sure if this is the same issue, as the number of inconsistencies is zero after some time, but maybe it'll help. 
{{check_parent_index_consistency}} is available on https://cpaste.org/p7zht9rli Out-of-sync secondary index --- Key: CASSANDRA-8712 URL: https://issues.apache.org/jira/browse/CASSANDRA-8712 Project: Cassandra Issue Type: Bug Environment: 2.1.2 Reporter: mlowicki Fix For: 2.1.3 I've such table with index: {code} CREATE TABLE entity ( user_id text, data_type_id int, version bigint, id text, cache_guid text, client_defined_unique_tag text, ctime timestamp, deleted boolean, folder boolean, mtime timestamp, name text, originator_client_item_id text, parent_id text, position blob, server_defined_unique_tag text, specifics blob, PRIMARY KEY (user_id, data_type_id, version, id) ) WITH CLUSTERING ORDER BY (data_type_id ASC, version ASC, id ASC) AND bloom_filter_fp_chance = 0.01 AND caching = '{keys:ALL, rows_per_partition:NONE}' AND comment = '' AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE'; CREATE INDEX index_entity_parent_id ON entity (parent_id); {code} It turned out that index became out of sync: {code} Entity.objects.filter(user_id='255824802', parent_id=parent_id).consistency(6).count() 16 counter = 0 for e in Entity.objects.filter(user_id='255824802'): ... if e.parent_id and e.parent_id == parent_id: ... counter += 1 ... counter 10 {code} After couple of hours it was fine (at night) but then when user probably started to interact with DB we got the same problem. 
As a temporary solution we'll try to rebuild the indexes from time to time, as suggested in http://dev.nuclearrooster.com/2013/01/20/using-nodetool-to-rebuild-secondary-indexes-in-cassandra/ I launched a simple script to check for this anomaly; before rebuilding the index, 10378 out of 4024856 folders had this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
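The kind of comparison the {{check_parent_index_consistency}} script performs can be sketched as follows: count the rows visible through the (possibly stale) secondary index versus a full scan of the same partition. This is an illustrative in-memory model, not the actual script or driver code; all names here are hypothetical:

```python
def index_lookup(index, parent_id):
    """Rows as seen through the (possibly stale) secondary index."""
    return index.get(parent_id, [])

def full_scan(rows, parent_id):
    """Ground truth: filter every row of the partition directly."""
    return [r for r in rows if r.get("parent_id") == parent_id]

def find_inconsistency(rows, index, parent_id):
    """Return (indexed_count, scanned_count) when they disagree, else None."""
    indexed = len(index_lookup(index, parent_id))
    scanned = len(full_scan(rows, parent_id))
    return None if indexed == scanned else (indexed, scanned)

# A stale index that missed one freshly inserted entity:
rows = [{"id": "a0", "parent_id": "-1"}, {"id": "a1", "parent_id": "-1"}]
stale_index = {"-1": [rows[0]]}
print(find_inconsistency(rows, stale_index, "-1"))  # -> (1, 2)
```

In the reported case the two counts converged after a few minutes, which is what makes the bug look like a delayed index update rather than permanent loss.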
[jira] [Commented] (CASSANDRA-8554) Node where gossip is disabled still shows as UP on that node; other nodes show it as DN
[ https://issues.apache.org/jira/browse/CASSANDRA-8554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301201#comment-14301201 ] Anuja Mandlecha commented on CASSANDRA-8554: I tried to reproduce the same bug using apache cassandra-2.0.8, which dse-4.5.1 uses, but couldn't reproduce it. The node is shown as down once it is drained, which is the expected behaviour. I also tried with apache cassandra-2.1.2 and the same happened with it too. Node where gossip is disabled still shows as UP on that node; other nodes show it as DN Key: CASSANDRA-8554 URL: https://issues.apache.org/jira/browse/CASSANDRA-8554 Project: Cassandra Issue Type: Bug Environment: Centos 6.5, DSE4.5.1 tarball install Reporter: Mark Curtis Priority: Minor When running nodetool drain, the drained node will still show the status of itself as UP in nodetool status even after the drain has finished. For example, using a 3 node cluster, on one of the nodes that is still operating and not drained we see this: {code} $ ./dse-4.5.1/bin/nodetool status Note: Ownership information does not include topology; for complete information, specify a keyspace Datacenter: Central === Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens Owns Host ID Rack UN 192.168.56.21 210.78 KB 256 32.1% 82eb2fca-4f57-467b-a972-93096ec5d69f RAC1 DN 192.168.56.23 2.22 GB 256 33.5% a11bfac1-fad0-440b-bd68-7562a89ce3c7 RAC1 UN 192.168.56.22 2.22 GB 256 34.4% 4250cb05-97be-4bac-887a-acc307d1bc0c RAC1 {code} While on the drained node we see this: {code} [datastax@DSE4 ~]$ ./dse-4.5.1/bin/nodetool drain [datastax@DSE4 ~]$ ./dse-4.5.1/bin/nodetool status Note: Ownership information does not include topology; for complete information, specify a keyspace Datacenter: Central === Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens Owns Host ID Rack UN 192.168.56.21 210.78 KB 256 32.1% 82eb2fca-4f57-467b-a972-93096ec5d69f RAC1 UN 192.168.56.23 2.22 GB 256 33.5% 
a11bfac1-fad0-440b-bd68-7562a89ce3c7 RAC1 UN 192.168.56.22 2.22 GB 256 34.4% 4250cb05-97be-4bac-887a-acc307d1bc0c RAC1 {code} Netstat shows outgoing connections from the drained node to the other nodes as still established on port 7000, but the node is no longer listening on port 7000, which I believe is expected. However, the output of nodetool status on the drained node could be interpreted as misleading. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8211) Overlapping sstables in L1+
[ https://issues.apache.org/jira/browse/CASSANDRA-8211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301133#comment-14301133 ] Imri Zvik commented on CASSANDRA-8211: -- Hi Marcus, thanks. I'll run the scrub and re-verify. Overlapping sstables in L1+ --- Key: CASSANDRA-8211 URL: https://issues.apache.org/jira/browse/CASSANDRA-8211 Project: Cassandra Issue Type: Bug Reporter: Marcus Eriksson Assignee: Marcus Eriksson Fix For: 2.0.12, 2.1.3 Attachments: 0001-Avoid-overlaps-in-L1-v2.patch, 0001-Avoid-overlaps-in-L1.patch Seems we have a bug that can create overlapping sstables in L1: {code} WARN [main] 2014-10-28 04:09:42,295 LeveledManifest.java (line 164) At level 2, SSTableReader(path='sstable') [DecoratedKey(2838397575996053472, 0010066059b210066059b210400100), DecoratedKey(5516674013223138308, 001000ff2d161000ff2d16000010400100)] overlaps SSTableReader(path='sstable') [DecoratedKey(2839992722300822584, 001000229ad21000229ad210400100), DecoratedKey(5532836928694021724, 0010034b05a610034b05a6100000400100)]. This could be caused by a bug in Cassandra 1.1.0 .. 1.1.3 or due to the fact that you have dropped sstables from another node into the data directory. Sending back to L0. 
If you didn't drop in sstables, and have not yet run scrub, you should do so since you may also have rows out-of-order within an sstable {code} Which might manifest itself during compaction with this exception: {code} ERROR [CompactionExecutor:3152] 2014-10-28 00:24:06,134 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:3152,1,main] java.lang.RuntimeException: Last written key DecoratedKey(5516674013223138308, 001000ff2d161000ff2d1610400100) >= current key DecoratedKey(2839992722300822584, 001000229ad21000229ad210400100) writing into sstable {code} since we use LeveledScanner when compacting (the backing sstable scanner might go beyond the start of the next sstable scanner) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-8390) The process cannot access the file because it is being used by another process
[ https://issues.apache.org/jira/browse/CASSANDRA-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301135#comment-14301135 ] Wei Yuan Cho edited comment on CASSANDRA-8390 at 2/2/15 10:48 AM: -- Hi there, I'm seeing a similar error to the one in this thread, and I'm running Windows Server 2008. I have set DiskAccessMode to standard and I'm still getting the following error: INFO [main] 2015-01-23 16:41:01,578 DatabaseDescriptor.java:211 - DiskAccessMode is standard, indexAccessMode is standard {quote} ERROR [NonPeriodicTasks:1] 2015-01-30 22:02:02,284 SSTableDeletingTask.java:89 - Unable to delete \var\lib\cassandra\data\CM\Alerts\CM-Alerts-ka-137342-Data.db (it will be removed on server restart; we'll also retry after GC) ERROR [NonPeriodicTasks:1] 2015-01-30 22:02:02,284 SSTableDeletingTask.java:89 - Unable to delete \var\lib\cassandra\data\CM\Alerts\CM-Alerts-ka-137341-Data.db (it will be removed on server restart; we'll also retry after GC) INFO [CompactionExecutor:3644] 2015-01-30 22:02:02,284 CompactionTask.java:252 - Compacted 4 sstables to [\var\lib\cassandra\data\CM\Alerts\CM-Alerts-ka-137344,]. 4,830,501 bytes to 2,453,934 (~50% of original) in 1,433ms = 1.633115MB/s. 13 total partitions merged to 4. Partition merge counts were {1:1, 4:3, } INFO [CompactionExecutor:3645] 2015-01-30 22:02:02,565 CompactionTask.java:252 - Compacted 4 sstables to [\var\lib\cassandra\data\system\compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b\system-compactions_in_progress-ka-5659,]. 424 bytes to 42 (~9% of original) in 285ms = 0.000141MB/s. 4 total partitions merged to 1. 
Partition merge counts were {2:2, } ERROR [NonPeriodicTasks:1] 2015-01-30 22:02:02,659 CassandraDaemon.java:153 - Exception in thread Thread[NonPeriodicTasks:1,5,main] org.apache.cassandra.io.FSWriteError: java.nio.file.FileSystemException: \var\lib\cassandra\data\CM\Alerts\CM-Alerts-ka-137339-Index.db: The process cannot access the file because it is being used by another process. at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:135) ~[apache-cassandra-2.1.1.jar:2.1.1] at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:121) ~[apache-cassandra-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTable.delete(SSTable.java:113) ~[apache-cassandra-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTableDeletingTask.run(SSTableDeletingTask.java:94) ~[apache-cassandra-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTableReader$6.run(SSTableReader.java:664) ~[apache-cassandra-2.1.1.jar:2.1.1] at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[na:1.7.0_71] at java.util.concurrent.FutureTask.run(Unknown Source) ~[na:1.7.0_71] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source) ~[na:1.7.0_71] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) ~[na:1.7.0_71] at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [na:1.7.0_71] at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [na:1.7.0_71] at java.lang.Thread.run(Unknown Source) [na:1.7.0_71] Caused by: java.nio.file.FileSystemException: \var\lib\cassandra\data\CM\Alerts\CM-Alerts-ka-137339-Index.db: The process cannot access the file because it is being used by another process. 
at sun.nio.fs.WindowsException.translateToIOException(Unknown Source) ~[na:1.7.0_71] at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source) ~[na:1.7.0_71] at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source) ~[na:1.7.0_71] at sun.nio.fs.WindowsFileSystemProvider.implDelete(Unknown Source) ~[na:1.7.0_71] at sun.nio.fs.AbstractFileSystemProvider.delete(Unknown Source) ~[na:1.7.0_71] at java.nio.file.Files.delete(Unknown Source) ~[na:1.7.0_71] at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:131) ~[apache-cassandra-2.1.1.jar:2.1.1] ... 11 common frames omitted ERROR [NonPeriodicTasks:1] 2015-01-30 22:02:02,925 StorageService.java:377 - Stopping gossiper WARN [NonPeriodicTasks:1] 2015-01-30 22:02:02,925 StorageService.java:291 - Stopping gossip by operator request INFO [NonPeriodicTasks:1] 2015-01-30 22:02:02,925 Gossiper.java:1317 - Announcing shutdown ERROR [NonPeriodicTasks:1] 2015-01-30 22:02:04,925 StorageService.java:382 - Stopping RPC server INFO [NonPeriodicTasks:1] 2015-01-30 22:02:04,925 ThriftServer.java:142 - Stop listening to thrift clients ERROR [NonPeriodicTasks:1] 2015-01-30 22:02:04,956 StorageService.java:387 - Stopping native transport INFO [NonPeriodicTasks:1] 2015-01-30 22:02:05,519 Server.java:213 - Stop listening for CQL
[jira] [Commented] (CASSANDRA-8210) java.lang.AssertionError: Memory was freed exception in CompactionExecutor
[ https://issues.apache.org/jira/browse/CASSANDRA-8210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301159#comment-14301159 ] Imri Zvik commented on CASSANDRA-8210: -- [~krummas] Sure - I am currently running scrub on all the nodes (although I already did that yesterday), just to make sure. I will try to reproduce it after the scrub is over, and open a new ticket if the issue persists. java.lang.AssertionError: Memory was freed exception in CompactionExecutor Key: CASSANDRA-8210 URL: https://issues.apache.org/jira/browse/CASSANDRA-8210 Project: Cassandra Issue Type: Bug Components: Core Environment: DSE 4.5.2, Cassandra 2.0.10, OEL 6.5, kernel 3.8.13-44.el6uek.x86_64, 128Gb of RAM, swap disabled, JRE 1.7.0_67-b01 Reporter: Nikolai Grigoriev Priority: Minor Attachments: cassandra-env.sh, cassandra.yaml, occurence frequency.png, system.log.gz I have just got this problem on multiple nodes. Cassandra 2.0.10 (DSE 4.5.2). After looking through the history I have found that it was actually happening on all nodes since the start of large compaction process (I've loaded tons of data in the system and then turned off all load to let it compact the data). 
{code} ERROR [CompactionExecutor:1196] 2014-10-28 17:14:50,124 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:1196,1,main] java.lang.AssertionError: Memory was freed at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:259) at org.apache.cassandra.io.util.Memory.getInt(Memory.java:211) at org.apache.cassandra.io.sstable.IndexSummary.getIndex(IndexSummary.java:79) at org.apache.cassandra.io.sstable.IndexSummary.getKey(IndexSummary.java:84) at org.apache.cassandra.io.sstable.IndexSummary.binarySearch(IndexSummary.java:58) at org.apache.cassandra.io.sstable.SSTableReader.getSampleIndexesForRanges(SSTableReader.java:692) at org.apache.cassandra.io.sstable.SSTableReader.estimatedKeysForRanges(SSTableReader.java:663) at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.worthDroppingTombstones(AbstractCompactionStrategy.java:328) at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.findDroppableSSTable(LeveledCompactionStrategy.java:354) at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getMaximalTask(LeveledCompactionStrategy.java:125) at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:113) at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:192) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8712) Out-of-sync secondary index
[ https://issues.apache.org/jira/browse/CASSANDRA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301161#comment-14301161 ] mlowicki commented on CASSANDRA-8712: - I'll try to provide something soon. We've checked and {{rebuild_index}} doesn't help at all. Out-of-sync secondary index --- Key: CASSANDRA-8712 URL: https://issues.apache.org/jira/browse/CASSANDRA-8712 Project: Cassandra Issue Type: Bug Environment: 2.1.2 Reporter: mlowicki Fix For: 2.1.3 I have the following table with an index: {code} CREATE TABLE entity ( user_id text, data_type_id int, version bigint, id text, cache_guid text, client_defined_unique_tag text, ctime timestamp, deleted boolean, folder boolean, mtime timestamp, name text, originator_client_item_id text, parent_id text, position blob, server_defined_unique_tag text, specifics blob, PRIMARY KEY (user_id, data_type_id, version, id) ) WITH CLUSTERING ORDER BY (data_type_id ASC, version ASC, id ASC) AND bloom_filter_fp_chance = 0.01 AND caching = '{keys:ALL, rows_per_partition:NONE}' AND comment = '' AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE'; CREATE INDEX index_entity_parent_id ON entity (parent_id); {code} It turned out that the index became out of sync: {code} Entity.objects.filter(user_id='255824802', parent_id=parent_id).consistency(6).count() 16 counter = 0 for e in Entity.objects.filter(user_id='255824802'): ... if e.parent_id and e.parent_id == parent_id: ... counter += 1 ... 
counter 10 {code} After a couple of hours it was fine (at night), but then, when the user presumably started to interact with the DB, we got the same problem. As a temporary solution we'll try to rebuild indexes from time to time, as suggested in http://dev.nuclearrooster.com/2013/01/20/using-nodetool-to-rebuild-secondary-indexes-in-cassandra/ I launched a simple script to check for this anomaly; before rebuilding the index, 10378 out of 4024856 folders had the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
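The consistency check the reporter ran (index count vs. a full scan of the partition, shown in the cqlengine session above) can be sketched offline. The sketch below uses plain Python structures standing in for the table and the secondary index; all names are hypothetical and no cluster is involved:

```python
# Offline sketch of the anomaly check from the report: compare the row
# count obtained through the (possibly stale) index with a full scan.
# Lists/dicts stand in for Cassandra; all names are hypothetical.

def full_scan_count(rows, parent_id):
    """Ground truth: scan every row and count matches (the for-loop above)."""
    return sum(1 for r in rows if r.get("parent_id") == parent_id)

def index_count(index, parent_id):
    """What a secondary-index query would return if the index is trusted."""
    return len(index.get(parent_id, set()))

def is_out_of_sync(rows, index, parent_id):
    return index_count(index, parent_id) != full_scan_count(rows, parent_id)

# 10 rows really belong to parent "p1"; the stale index also holds
# 6 leftover entries, mirroring the 16-vs-10 discrepancy in the report.
rows = [{"id": i, "parent_id": "p1" if i < 10 else "p2"} for i in range(16)]
stale_index = {"p1": set(range(10)) | {100 + i for i in range(6)}}

print(index_count(stale_index, "p1"), full_scan_count(rows, "p1"))  # 16 10
print(is_out_of_sync(rows, stale_index, "p1"))  # True
```

Running such a scan per partition is how one can quantify the drift (as the reporter did across 4024856 folders) before deciding whether a periodic {{rebuild_index}} is worth scheduling.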
[jira] [Updated] (CASSANDRA-8492) Support IF NOT EXISTS for ALTER TABLE ADD COLUMN
[ https://issues.apache.org/jira/browse/CASSANDRA-8492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sachin Janani updated CASSANDRA-8492: - Attachment: 8492.patch Patch for Cassandra #8492 Support IF NOT EXISTS for ALTER TABLE ADD COLUMN Key: CASSANDRA-8492 URL: https://issues.apache.org/jira/browse/CASSANDRA-8492 Project: Cassandra Issue Type: Improvement Reporter: Peter Mädel Priority: Minor Attachments: 8492.patch would enable creation of schema update scripts that can be repeatedly executed without having to worry about invalid query exceptions -- This message was sent by Atlassian JIRA (v6.3.4#6332)
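The motivation above — repeatable schema scripts — comes down to idempotency. A toy model (plain Python, not the Cassandra implementation or the attached patch) of why an unguarded ADD COLUMN breaks a rerun while an IF NOT EXISTS form does not:

```python
# Toy model of idempotent schema updates: a plain ADD raises on the
# second run, while the guarded form is a no-op, so a script using it
# can be re-executed safely. This is an illustration, not Cassandra code.

class InvalidQuery(Exception):
    pass

def add_column(schema, name, ctype, if_not_exists=False):
    """Simulates ALTER TABLE ... ADD [IF NOT EXISTS] name ctype."""
    if name in schema:
        if if_not_exists:
            return False          # silently skip; the script keeps going
        raise InvalidQuery(f"column {name} already exists")
    schema[name] = ctype
    return True

schema = {"user_id": "text"}
add_column(schema, "mtime", "timestamp")                       # first run: added
add_column(schema, "mtime", "timestamp", if_not_exists=True)   # rerun: skipped
try:
    add_column(schema, "mtime", "timestamp")                   # rerun, unguarded
except InvalidQuery as e:
    print("rerun failed:", e)
```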
[jira] [Commented] (CASSANDRA-8210) java.lang.AssertionError: Memory was freed exception in CompactionExecutor
[ https://issues.apache.org/jira/browse/CASSANDRA-8210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301146#comment-14301146 ] Marcus Eriksson commented on CASSANDRA-8210: could you post the server side log for this? java.lang.AssertionError: Memory was freed exception in CompactionExecutor Key: CASSANDRA-8210 URL: https://issues.apache.org/jira/browse/CASSANDRA-8210 Project: Cassandra Issue Type: Bug Components: Core Environment: DSE 4.5.2, Cassandra 2.0.10, OEL 6.5, kernel 3.8.13-44.el6uek.x86_64, 128Gb of RAM, swap disabled, JRE 1.7.0_67-b01 Reporter: Nikolai Grigoriev Priority: Minor Attachments: cassandra-env.sh, cassandra.yaml, occurence frequency.png, system.log.gz I have just got this problem on multiple nodes. Cassandra 2.0.10 (DSE 4.5.2). After looking through the history I have found that it was actually happening on all nodes since the start of large compaction process (I've loaded tons of data in the system and then turned off all load to let it compact the data). 
{code} ERROR [CompactionExecutor:1196] 2014-10-28 17:14:50,124 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:1196,1,main] java.lang.AssertionError: Memory was freed at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:259) at org.apache.cassandra.io.util.Memory.getInt(Memory.java:211) at org.apache.cassandra.io.sstable.IndexSummary.getIndex(IndexSummary.java:79) at org.apache.cassandra.io.sstable.IndexSummary.getKey(IndexSummary.java:84) at org.apache.cassandra.io.sstable.IndexSummary.binarySearch(IndexSummary.java:58) at org.apache.cassandra.io.sstable.SSTableReader.getSampleIndexesForRanges(SSTableReader.java:692) at org.apache.cassandra.io.sstable.SSTableReader.estimatedKeysForRanges(SSTableReader.java:663) at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.worthDroppingTombstones(AbstractCompactionStrategy.java:328) at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.findDroppableSSTable(LeveledCompactionStrategy.java:354) at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getMaximalTask(LeveledCompactionStrategy.java:125) at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:113) at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:192) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-8696) nodetool repair on cassandra 2.1.2 keyspaces return java.lang.RuntimeException: Could not create snapshot
[ https://issues.apache.org/jira/browse/CASSANDRA-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301157#comment-14301157 ] Jan Karlsson edited comment on CASSANDRA-8696 at 2/2/15 11:08 AM: -- We stumbled upon this issue as well. I was only able to reproduce this when the amount of data on disk was over 12G. was (Author: jan karlsson): I was only able to reproduce this when the amount of data on disk was over 12G. From taking a quick glance at the code, this is caused by the snapshot process throwing a timeout. nodetool repair on cassandra 2.1.2 keyspaces return java.lang.RuntimeException: Could not create snapshot - Key: CASSANDRA-8696 URL: https://issues.apache.org/jira/browse/CASSANDRA-8696 Project: Cassandra Issue Type: Bug Reporter: Jeff Liu When trying to run nodetool repair -pr on a cassandra node (2.1.2), cassandra throws java exceptions: cannot create snapshot. The error log from system.log: {noformat} INFO [STREAM-IN-/10.97.9.110] 2015-01-28 02:07:28,815 StreamResultFuture.java:166 - [Stream #692c1450-a692-11e4-9973-070e938df227 ID#0] Prepare completed. 
Receiving 2 files(221187 bytes), sending 5 files(632105 bytes) INFO [STREAM-IN-/10.97.9.110] 2015-01-28 02:07:29,046 StreamResultFuture.java:180 - [Stream #692c1450-a692-11e4-9973-070e938df227] Session with /10.97.9.110 is complete INFO [STREAM-IN-/10.97.9.110] 2015-01-28 02:07:29,046 StreamResultFuture.java:212 - [Stream #692c1450-a692-11e4-9973-070e938df227] All sessions completed INFO [STREAM-IN-/10.97.9.110] 2015-01-28 02:07:29,047 StreamingRepairTask.java:96 - [repair #685e3d00-a692-11e4-9973-070e938df227] streaming task succeed, returning response to /10.98.194.68 INFO [RepairJobTask:1] 2015-01-28 02:07:29,065 StreamResultFuture.java:86 - [Stream #692c6270-a692-11e4-9973-070e938df227] Executing streaming plan for Repair INFO [StreamConnectionEstablisher:4] 2015-01-28 02:07:29,065 StreamSession.java:213 - [Stream #692c6270-a692-11e4-9973-070e938df227] Starting streaming to /10.66.187.201 INFO [StreamConnectionEstablisher:4] 2015-01-28 02:07:29,070 StreamCoordinator.java:209 - [Stream #692c6270-a692-11e4-9973-070e938df227, ID#0] Beginning stream session with /10.66.187.201 INFO [STREAM-IN-/10.66.187.201] 2015-01-28 02:07:29,465 StreamResultFuture.java:166 - [Stream #692c6270-a692-11e4-9973-070e938df227 ID#0] Prepare completed. 
Receiving 5 files(627994 bytes), sending 5 files(632105 bytes) INFO [StreamReceiveTask:22] 2015-01-28 02:07:31,971 StreamResultFuture.java:180 - [Stream #692c6270-a692-11e4-9973-070e938df227] Session with /10.66.187.201 is complete INFO [StreamReceiveTask:22] 2015-01-28 02:07:31,972 StreamResultFuture.java:212 - [Stream #692c6270-a692-11e4-9973-070e938df227] All sessions completed INFO [StreamReceiveTask:22] 2015-01-28 02:07:31,972 StreamingRepairTask.java:96 - [repair #685e3d00-a692-11e4-9973-070e938df227] streaming task succeed, returning response to /10.98.194.68 ERROR [RepairJobTask:1] 2015-01-28 02:07:39,444 RepairJob.java:127 - Error occurred during snapshot phase java.lang.RuntimeException: Could not create snapshot at /10.97.9.110 at org.apache.cassandra.repair.SnapshotTask$SnapshotCallback.onFailure(SnapshotTask.java:77) ~[apache-cassandra-2.1.2.jar:2.1.2] at org.apache.cassandra.net.MessagingService$5$1.run(MessagingService.java:347) ~[apache-cassandra-2.1.2.jar:2.1.2] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_45] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_45] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45] at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45] INFO [AntiEntropySessions:6] 2015-01-28 02:07:39,445 RepairSession.java:260 - [repair #6f85e740-a692-11e4-9973-070e938df227] new session: will sync /10.98.194.68, /10.66.187.201, /10.226.218.135 on range (128171798046680518737469720690862638799,12863540308359254031520865977436165] for events.[bigint0text, bigint0boolean, bigint0int, dataset_catalog, column_categories, bigint0double, bigint0bigint] ERROR [AntiEntropySessions:5] 2015-01-28 02:07:39,445 RepairSession.java:303 - [repair #685e3d00-a692-11e4-9973-070e938df227] session completed with the following error java.io.IOException: 
Failed during snapshot creation. at org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:344) ~[apache-cassandra-2.1.2.jar:2.1.2] at
[jira] [Commented] (CASSANDRA-8718) nodetool cleanup causes segfault
[ https://issues.apache.org/jira/browse/CASSANDRA-8718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302721#comment-14302721 ] Philip Thompson commented on CASSANDRA-8718: You are using JDK 1.7.0_71-b14 correct? What operating system are you running on? nodetool cleanup causes segfault Key: CASSANDRA-8718 URL: https://issues.apache.org/jira/browse/CASSANDRA-8718 Project: Cassandra Issue Type: Bug Reporter: Maxim Ivanov Priority: Minor Fix For: 2.0.13 When doing cleanup on C* 2.0.12 following error crashes the java process: {code} INFO 17:59:02,800 Cleaning up SSTableReader(path='/data/sdd/cassandra_prod/vdna/analytics/vdna-analytics-jb-21670-Data.db') # # A fatal error has been detected by the Java Runtime Environment: # # SIGSEGV (0xb) at pc=0x7f750890268e, pid=28039, tid=140130222446336 # # JRE version: Java(TM) SE Runtime Environment (7.0_71-b14) (build 1.7.0_71-b14) # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.71-b01 mixed mode linux-amd64 compressed oops) # Problematic frame: # J 2655 C2 org.apache.cassandra.io.sstable.IndexSummary.binarySearch(Lorg/apache/cassandra/db/RowPosition;)I (88 bytes) @ 0x7f750890268e [0x7f7508902580+0x10e] # # Failed to write core dump. Core dumps have been disabled. 
To enable core dumping, try ulimit -c unlimited before starting Java again # # An error report file with more information is saved as: # /var/lib/cassandra_prod/hs_err_pid28039.log Compiled method (c2) 913167265 4849 org.apache.cassandra.dht.Token::maxKeyBound (24 bytes) total in heap [0x7f7508572450,0x7f7508573318] = 3784 relocation [0x7f7508572570,0x7f7508572618] = 168 main code [0x7f7508572620,0x7f7508572cc0] = 1696 stub code [0x7f7508572cc0,0x7f7508572cf8] = 56 oops [0x7f7508572cf8,0x7f7508572d90] = 152 scopes data[0x7f7508572d90,0x7f7508573118] = 904 scopes pcs [0x7f7508573118,0x7f7508573268] = 336 dependencies [0x7f7508573268,0x7f7508573280] = 24 handler table [0x7f7508573280,0x7f75085732e0] = 96 nul chk table [0x7f75085732e0,0x7f7508573318] = 56 # # If you would like to submit a bug report, please visit: # http://bugreport.sun.com/bugreport/crash.jsp # {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
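The hs_err hint in the crash log above ("try ulimit -c unlimited before starting Java") corresponds to raising the RLIMIT_CORE soft limit in the launching process. A minimal equivalent in Python (useful if Cassandra is started from a wrapper script; it can only raise the soft limit up to the hard limit set by the administrator):

```python
# Equivalent of "ulimit -c unlimited" (bounded by the hard limit):
# raise the core-file size soft limit before exec'ing the JVM, so any
# child process it spawns is allowed to write a core dump on SIGSEGV.
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))  # soft := hard
print(resource.getrlimit(resource.RLIMIT_CORE) == (hard, hard))  # True
```

The limit is inherited across fork/exec, which is why it must be set before Java is started rather than after the crash.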
[jira] [Updated] (CASSANDRA-8718) nodetool cleanup causes segfault
[ https://issues.apache.org/jira/browse/CASSANDRA-8718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8718: --- Reproduced In: 2.0.12 Fix Version/s: 2.0.13 nodetool cleanup causes segfault Key: CASSANDRA-8718 URL: https://issues.apache.org/jira/browse/CASSANDRA-8718 Project: Cassandra Issue Type: Bug Reporter: Maxim Ivanov Priority: Minor Fix For: 2.0.13 When doing cleanup on C* 2.0.12 following error crashes the java process: {code} INFO 17:59:02,800 Cleaning up SSTableReader(path='/data/sdd/cassandra_prod/vdna/analytics/vdna-analytics-jb-21670-Data.db') # # A fatal error has been detected by the Java Runtime Environment: # # SIGSEGV (0xb) at pc=0x7f750890268e, pid=28039, tid=140130222446336 # # JRE version: Java(TM) SE Runtime Environment (7.0_71-b14) (build 1.7.0_71-b14) # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.71-b01 mixed mode linux-amd64 compressed oops) # Problematic frame: # J 2655 C2 org.apache.cassandra.io.sstable.IndexSummary.binarySearch(Lorg/apache/cassandra/db/RowPosition;)I (88 bytes) @ 0x7f750890268e [0x7f7508902580+0x10e] # # Failed to write core dump. Core dumps have been disabled. 
To enable core dumping, try ulimit -c unlimited before starting Java again # # An error report file with more information is saved as: # /var/lib/cassandra_prod/hs_err_pid28039.log Compiled method (c2) 913167265 4849 org.apache.cassandra.dht.Token::maxKeyBound (24 bytes) total in heap [0x7f7508572450,0x7f7508573318] = 3784 relocation [0x7f7508572570,0x7f7508572618] = 168 main code [0x7f7508572620,0x7f7508572cc0] = 1696 stub code [0x7f7508572cc0,0x7f7508572cf8] = 56 oops [0x7f7508572cf8,0x7f7508572d90] = 152 scopes data[0x7f7508572d90,0x7f7508573118] = 904 scopes pcs [0x7f7508573118,0x7f7508573268] = 336 dependencies [0x7f7508573268,0x7f7508573280] = 24 handler table [0x7f7508573280,0x7f75085732e0] = 96 nul chk table [0x7f75085732e0,0x7f7508573318] = 56 # # If you would like to submit a bug report, please visit: # http://bugreport.sun.com/bugreport/crash.jsp # {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8716) java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed when running cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-8716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302728#comment-14302728 ] Philip Thompson commented on CASSANDRA-8716: I can reproduce this very easily with java 1.7.0_67. It's just a matter of creating a 2.0.12 node with ccm, writing with stress and flushing, bootstrapping a second node, then running cleanup. Could this be related to CASSANDRA-8718? java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed when running cleanup -- Key: CASSANDRA-8716 URL: https://issues.apache.org/jira/browse/CASSANDRA-8716 Project: Cassandra Issue Type: Bug Components: Core Environment: Centos 6.6, Cassandra 2.0.12, Oracle JDK 1.7.0_67 Reporter: Imri Zvik Priority: Minor Fix For: 2.0.13 Attachments: system.log.gz {code}Error occurred during cleanup java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:188) at org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:234) at org.apache.cassandra.db.compaction.CompactionManager.performCleanup(CompactionManager.java:272) at org.apache.cassandra.db.ColumnFamilyStore.forceCleanup(ColumnFamilyStore.java:1115) at org.apache.cassandra.service.StorageService.forceKeyspaceCleanup(StorageService.java:2177) at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75) at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279) at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801) at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487) at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420) at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848) at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322) at sun.rmi.transport.Transport$1.run(Transport.java:177) at sun.rmi.transport.Transport$1.run(Transport.java:174) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:173) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at 
java.lang.Thread.run(Thread.java:745) Caused by: java.lang.AssertionError: Memory was freed at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:259) at org.apache.cassandra.io.util.Memory.getInt(Memory.java:211) at
[jira] [Updated] (CASSANDRA-8716) java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed when running cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-8716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8716: --- Tester: Philip Thompson java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed when running cleanup -- Key: CASSANDRA-8716 URL: https://issues.apache.org/jira/browse/CASSANDRA-8716 Project: Cassandra Issue Type: Bug Components: Core Environment: Centos 6.6, Cassandra 2.0.12, Oracle JDK 1.7.0_67 Reporter: Imri Zvik Assignee: Benedict Priority: Minor Fix For: 2.0.13 Attachments: system.log.gz {code}Error occurred during cleanup java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:188) at org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:234) at org.apache.cassandra.db.compaction.CompactionManager.performCleanup(CompactionManager.java:272) at org.apache.cassandra.db.ColumnFamilyStore.forceCleanup(ColumnFamilyStore.java:1115) at org.apache.cassandra.service.StorageService.forceKeyspaceCleanup(StorageService.java:2177) at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75) at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) at 
com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801) at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487) at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420) at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848) at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322) at sun.rmi.transport.Transport$1.run(Transport.java:177) at sun.rmi.transport.Transport$1.run(Transport.java:174) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:173) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.AssertionError: Memory was freed at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:259) at 
org.apache.cassandra.io.util.Memory.getInt(Memory.java:211) at org.apache.cassandra.io.sstable.IndexSummary.getIndex(IndexSummary.java:79) at org.apache.cassandra.io.sstable.IndexSummary.getKey(IndexSummary.java:84) at org.apache.cassandra.io.sstable.IndexSummary.binarySearch(IndexSummary.java:58) at
[jira] [Commented] (CASSANDRA-8723) Cassandra 2.1.2 Memory issue - java process memory usage continuously increases until process is killed by OOM killer
[ https://issues.apache.org/jira/browse/CASSANDRA-8723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302744#comment-14302744 ] Ariel Weisberg commented on CASSANDRA-8723: --- Is it possible to get the cassandra.yaml and system.log? Cassandra 2.1.2 Memory issue - java process memory usage continuously increases until process is killed by OOM killer - Key: CASSANDRA-8723 URL: https://issues.apache.org/jira/browse/CASSANDRA-8723 Project: Cassandra Issue Type: Bug Reporter: Jeff Liu Fix For: 2.1.3 Issue: We have an on-going issue with cassandra nodes running with continuously increasing memory until killed by OOM. {noformat} Jan 29 10:15:41 cass-chisel19 kernel: [24533109.783481] Out of memory: Kill process 13919 (java) score 911 or sacrifice child Jan 29 10:15:41 cass-chisel19 kernel: [24533109.783557] Killed process 13919 (java) total-vm:18366340kB, anon-rss:6461472kB, file-rss:6684kB {noformat} System Profile: cassandra version 2.1.2 system: aws c1.xlarge instance with 8 cores, 7.1G memory. 
cassandra jvm: -Xms1792M -Xmx1792M -Xmn400M -Xss256k {noformat} java -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.8.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1792M -Xmx1792M -Xmn400M -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=103 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseTLAB -XX:+CMSClassUnloadingEnabled -XX:+UseCondCardMark -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -XX:+PrintPromotionFailure -Xloggc:/var/log/cassandra/gc-1421511249.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=48M -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=7199 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -javaagent:/usr/share/java/graphite-reporter-agent-1.0-SNAPSHOT.jar=graphiteServer=metrics-a.hq.nest.com;graphitePort=2003;graphitePollInt=60 -Dlogback.configurationFile=logback.xml -Dcassandra.logdir=/var/log/cassandra -Dcassandra.storagedir= -Dcassandra-pidfile=/var/run/cassandra/cassandra.pid -cp 
/etc/cassandra:/usr/share/cassandra/lib/airline-0.6.jar:/usr/share/cassandra/lib/antlr-runtime-3.5.2.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang3-3.1.jar:/usr/share/cassandra/lib/commons-math3-3.2.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.4.jar:/usr/share/cassandra/lib/disruptor-3.0.1.jar:/usr/share/cassandra/lib/guava-16.0.jar:/usr/share/cassandra/lib/high-scale-lib-1.0.6.jar:/usr/share/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/cassandra/lib/jamm-0.2.8.jar:/usr/share/cassandra/lib/javax.inject.jar:/usr/share/cassandra/lib/jbcrypt-0.3m.jar:/usr/share/cassandra/lib/jline-1.0.jar:/usr/share/cassandra/lib/jna-4.0.0.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.9.1.jar:/usr/share/cassandra/lib/logback-classic-1.1.2.jar:/usr/share/cassandra/lib/logback-core-1.1.2.jar:/usr/share/cassandra/lib/lz4-1.2.0.jar:/usr/share/cassandra/lib/metrics-core-2.2.0.jar:/usr/share/cassandra/lib/metrics-graphite-2.2.0.jar:/usr/share/cassandra/lib/mx4j-tools.jar:/usr/share/cassandra/lib/netty-all-4.0.23.Final.jar:/usr/share/cassandra/lib/reporter-config-2.1.0.jar:/usr/share/cassandra/lib/slf4j-api-1.7.2.jar:/usr/share/cassandra/lib/snakeyaml-1.11.jar:/usr/share/cassandra/lib/snappy-java-1.0.5.2.jar:/usr/share/cassandra/lib/stream-2.5.2.jar:/usr/share/cassandra/lib/stringtemplate-4.0.2.jar:/usr/share/cassandra/lib/super-csv-2.1.0.jar:/usr/share/cassandra/lib/thrift-server-0.3.7.jar:/usr/share/cassandra/apache-cassandra-2.1.2.jar:/usr/share/cassandra/apache-cassandra-thrift-2.1.2.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/cassandra/cassandra-driver-core-2.0.5.jar:/usr/share/cassandra/netty-3.9.0.Final.jar:/usr/share/cassandra/stress.jar: -XX:HeapDumpPath=/var/lib/cassandra/java_1421511248.hprof 
-XX:ErrorFile=/var/lib/cassandra/hs_err_1421511248.log org.apache.cassandra.service.CassandraDaemon {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8358) Bundled tools shouldn't be using Thrift API
[ https://issues.apache.org/jira/browse/CASSANDRA-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302767#comment-14302767 ] Brandon Williams commented on CASSANDRA-8358: - Yes, wait. Bundled tools shouldn't be using Thrift API --- Key: CASSANDRA-8358 URL: https://issues.apache.org/jira/browse/CASSANDRA-8358 Project: Cassandra Issue Type: Improvement Reporter: Aleksey Yeschenko Assignee: Philip Thompson Fix For: 3.0 In 2.1, we switched cqlsh to the python-driver. In 3.0, we got rid of cassandra-cli. Yet there is still code that's using legacy Thrift API. We want to convert it all to use the java-driver instead. 1. BulkLoader uses Thrift to query the schema tables. It should be using java-driver metadata APIs directly instead. 2. o.a.c.hadoop.cql3.CqlRecordWriter is using Thrift 3. o.a.c.hadoop.ColumnFamilyRecordReader is using Thrift 4. o.a.c.hadoop.AbstractCassandraStorage is using Thrift 5. o.a.c.hadoop.pig.CqlStorage is using Thrift Some of the things listed above use Thrift to get the list of partition key columns or clustering columns. Those should be converted to use the Metadata API of the java-driver. Somewhat related to that, we also have badly ported code from Thrift in o.a.c.hadoop.cql3.CqlRecordReader (see fetchKeys()) that manually fetches columns from schema tables instead of properly using the driver's Metadata API. We need all of it fixed. One exception, for now, is o.a.c.hadoop.AbstractColumnFamilyInputFormat - it's using Thrift for its describe_splits_ex() call that cannot be currently replaced by any java-driver call (?). Once this is done, we can stop starting Thrift RPC port by default in cassandra.yaml. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-8725) CassandraStorage erroring due to system keyspace schema changes
[ https://issues.apache.org/jira/browse/CASSANDRA-8725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302779#comment-14302779 ] Philip Thompson edited comment on CASSANDRA-8725 at 2/3/15 5:14 AM: [~iamaleksey], AbstractCassandraStorage is using key_aliases to determine if the table was created via cql or thrift. Is there still a way to do that without key_aliases? {code} String keyAliases = ByteBufferUtil.string(cqlRow.columns.get(5).value); if (FBUtilities.fromJsonList(keyAliases).size() > 0) cql3Table = true; {code} was (Author: philipthompson): [~iamaleksey], AbstractCassandraStorage is using key_aliases to determine if the table was created via cql or thrift. Is there still a way to do that without key_aliases? CassandraStorage erroring due to system keyspace schema changes --- Key: CASSANDRA-8725 URL: https://issues.apache.org/jira/browse/CASSANDRA-8725 Project: Cassandra Issue Type: Bug Components: Hadoop Reporter: Philip Thompson Assignee: Philip Thompson Fix For: 3.0 CassandraStorage will be deprecated in 3.0, but is currently not working because it is selecting {{key_aliases}} from the system.schema_columnfamilies table, and the column no longer exists. This is causing about half of the pig-tests to fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8725) CassandraStorage erroring due to system keyspace schema changes
[ https://issues.apache.org/jira/browse/CASSANDRA-8725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302779#comment-14302779 ] Philip Thompson commented on CASSANDRA-8725: [~iamaleksey], AbstractCassandraStorage is using key_aliases to determine if the table was created via cql or thrift. Is there still a way to do that without key_aliases? CassandraStorage erroring due to system keyspace schema changes --- Key: CASSANDRA-8725 URL: https://issues.apache.org/jira/browse/CASSANDRA-8725 Project: Cassandra Issue Type: Bug Components: Hadoop Reporter: Philip Thompson Assignee: Philip Thompson Fix For: 3.0 CassandraStorage will be deprecated in 3.0, but is currently not working because it is selecting {{key_aliases}} from the system.schema_columnfamilies table, and the column no longer exists. This is causing about half of the pig-tests to fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8725) CassandraStorage erroring due to system keyspace schema changes
[ https://issues.apache.org/jira/browse/CASSANDRA-8725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302805#comment-14302805 ] Philip Thompson commented on CASSANDRA-8725: If not, the only way to handle using CassandraStorage against cql tables after 8358 would be to push the java driver changes up to AbstractCassandraStorage. CassandraStorage erroring due to system keyspace schema changes --- Key: CASSANDRA-8725 URL: https://issues.apache.org/jira/browse/CASSANDRA-8725 Project: Cassandra Issue Type: Bug Components: Hadoop Reporter: Philip Thompson Assignee: Philip Thompson Fix For: 3.0 CassandraStorage will be deprecated in 3.0, but is currently not working because it is selecting {{key_aliases}} from the system.schema_columnfamilies table, and the column no longer exists. This is causing about half of the pig-tests to fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
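[Editor's note] The key_aliases check discussed in this thread boils down to one rule: a table is treated as CQL3-created when its list of partition-key aliases is non-empty. Below is a minimal, hypothetical Java sketch of that rule isolated as a pure predicate; the java-driver call shown in the comment is the kind of Metadata API lookup the thread proposes, not code from the patch, and the class and method names are invented for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class TableOriginCheck {

    // The decision rule from the snippet above, isolated:
    // non-empty key_aliases list => table was created via CQL3.
    static boolean isCql3Table(List<String> keyAliases) {
        return !keyAliases.isEmpty();
    }

    public static void main(String[] args) {
        // With the java-driver, the alias list would come from table
        // metadata rather than system.schema_columnfamilies, e.g.
        // (hypothetically) cluster.getMetadata().getKeyspace(ks)
        //                         .getTable(cf).getPartitionKey()
        System.out.println(isCql3Table(Arrays.asList("user_id")));        // CQL3 table
        System.out.println(isCql3Table(Collections.<String>emptyList())); // thrift table
    }
}
```

The point of isolating the predicate is that only the source of the alias list changes when moving off Thrift; the rule itself stays the same.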
[jira] [Updated] (CASSANDRA-8716) java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed when running cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-8716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8716: --- Assignee: Benedict java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed when running cleanup -- Key: CASSANDRA-8716 URL: https://issues.apache.org/jira/browse/CASSANDRA-8716 Project: Cassandra Issue Type: Bug Components: Core Environment: Centos 6.6, Cassandra 2.0.12, Oracle JDK 1.7.0_67 Reporter: Imri Zvik Assignee: Benedict Priority: Minor Fix For: 2.0.13 Attachments: system.log.gz {code}Error occurred during cleanup java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:188) at org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:234) at org.apache.cassandra.db.compaction.CompactionManager.performCleanup(CompactionManager.java:272) at org.apache.cassandra.db.ColumnFamilyStore.forceCleanup(ColumnFamilyStore.java:1115) at org.apache.cassandra.service.StorageService.forceKeyspaceCleanup(StorageService.java:2177) at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75) at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) at 
com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801) at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487) at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420) at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848) at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322) at sun.rmi.transport.Transport$1.run(Transport.java:177) at sun.rmi.transport.Transport$1.run(Transport.java:174) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:173) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.AssertionError: Memory was freed at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:259) at 
org.apache.cassandra.io.util.Memory.getInt(Memory.java:211) at org.apache.cassandra.io.sstable.IndexSummary.getIndex(IndexSummary.java:79) at org.apache.cassandra.io.sstable.IndexSummary.getKey(IndexSummary.java:84) at org.apache.cassandra.io.sstable.IndexSummary.binarySearch(IndexSummary.java:58) at
[jira] [Created] (CASSANDRA-8724) Move PigTestBase to Java Driver
Philip Thompson created CASSANDRA-8724: -- Summary: Move PigTestBase to Java Driver Key: CASSANDRA-8724 URL: https://issues.apache.org/jira/browse/CASSANDRA-8724 Project: Cassandra Issue Type: Test Components: Hadoop, Tests Reporter: Philip Thompson Assignee: Philip Thompson Fix For: 3.0 The initial CQL statements for the pig tests are sent via thrift in PigTestBase. This should be modified to use the java driver. See CASSANDRA-8358. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8723) Cassandra 2.1.2 Memory issue - java process memory usage continuously increases until process is killed by OOM killer
[ https://issues.apache.org/jira/browse/CASSANDRA-8723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8723: --- Reproduced In: 2.1.2 Fix Version/s: 2.1.3 Cassandra 2.1.2 Memory issue - java process memory usage continuously increases until process is killed by OOM killer - Key: CASSANDRA-8723 URL: https://issues.apache.org/jira/browse/CASSANDRA-8723 Project: Cassandra Issue Type: Bug Reporter: Jeff Liu Fix For: 2.1.3 Issue: We have an on-going issue with cassandra nodes running with continuously increasing memory until killed by OOM. {noformat} Jan 29 10:15:41 cass-chisel19 kernel: [24533109.783481] Out of memory: Kill process 13919 (java) score 911 or sacrifice child Jan 29 10:15:41 cass-chisel19 kernel: [24533109.783557] Killed process 13919 (java) total-vm:18366340kB, anon-rss:6461472kB, file-rss:6684kB {noformat} System Profile: cassandra version 2.1.2 system: aws c1.xlarge instance with 8 cores, 7.1G memory. cassandra jvm: -Xms1792M -Xmx1792M -Xmn400M -Xss256k {noformat} java -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.8.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1792M -Xmx1792M -Xmn400M -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=103 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseTLAB -XX:+CMSClassUnloadingEnabled -XX:+UseCondCardMark -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -XX:+PrintPromotionFailure -Xloggc:/var/log/cassandra/gc-1421511249.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=48M -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=7199 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false 
-javaagent:/usr/share/java/graphite-reporter-agent-1.0-SNAPSHOT.jar=graphiteServer=metrics-a.hq.nest.com;graphitePort=2003;graphitePollInt=60 -Dlogback.configurationFile=logback.xml -Dcassandra.logdir=/var/log/cassandra -Dcassandra.storagedir= -Dcassandra-pidfile=/var/run/cassandra/cassandra.pid -cp /etc/cassandra:/usr/share/cassandra/lib/airline-0.6.jar:/usr/share/cassandra/lib/antlr-runtime-3.5.2.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang3-3.1.jar:/usr/share/cassandra/lib/commons-math3-3.2.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.4.jar:/usr/share/cassandra/lib/disruptor-3.0.1.jar:/usr/share/cassandra/lib/guava-16.0.jar:/usr/share/cassandra/lib/high-scale-lib-1.0.6.jar:/usr/share/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/cassandra/lib/jamm-0.2.8.jar:/usr/share/cassandra/lib/javax.inject.jar:/usr/share/cassandra/lib/jbcrypt-0.3m.jar:/usr/share/cassandra/lib/jline-1.0.jar:/usr/share/cassandra/lib/jna-4.0.0.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.9.1.jar:/usr/share/cassandra/lib/logback-classic-1.1.2.jar:/usr/share/cassandra/lib/logback-core-1.1.2.jar:/usr/share/cassandra/lib/lz4-1.2.0.jar:/usr/share/cassandra/lib/metrics-core-2.2.0.jar:/usr/share/cassandra/lib/metrics-graphite-2.2.0.jar:/usr/share/cassandra/lib/mx4j-tools.jar:/usr/share/cassandra/lib/netty-all-4.0.23.Final.jar:/usr/share/cassandra/lib/reporter-config-2.1.0.jar:/usr/share/cassandra/lib/slf4j-api-1.7.2.jar:/usr/share/cassandra/lib/snakeyaml-1.11.jar:/usr/share/cassandra/lib/snappy-java-1.0.5.2.jar:/usr/share/cassandra/lib/stream-2.5.2.jar:/usr/share/cassandra/lib/stringtemplate-4.0.2.jar:/usr/share/cassandra/lib/super-csv-2.1.0.jar:/usr/share/cassandra/lib/thrift-server-0.3.7.jar:/usr/share/cassandra/apache-cassandra-2.1.2.jar:/usr/share/ca
ssandra/apache-cassandra-thrift-2.1.2.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/cassandra/cassandra-driver-core-2.0.5.jar:/usr/share/cassandra/netty-3.9.0.Final.jar:/usr/share/cassandra/stress.jar: -XX:HeapDumpPath=/var/lib/cassandra/java_1421511248.hprof -XX:ErrorFile=/var/lib/cassandra/hs_err_1421511248.log org.apache.cassandra.service.CassandraDaemon {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8358) Bundled tools shouldn't be using Thrift API
[ https://issues.apache.org/jira/browse/CASSANDRA-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302762#comment-14302762 ] Brandon Williams commented on CASSANDRA-8358: - bq. As stated earlier, there are probably additional changes needed to PigTestBase and AbstractCassandraStorage, but they belong in their own tickets. That's fine, but then this ticket should depend upon them, because pig-test is the best and easiest way we have to make sure this works, and I'd rather we make sure this works before committing rather than fix broken stuff again later if pig-test fails. Bundled tools shouldn't be using Thrift API --- Key: CASSANDRA-8358 URL: https://issues.apache.org/jira/browse/CASSANDRA-8358 Project: Cassandra Issue Type: Improvement Reporter: Aleksey Yeschenko Assignee: Philip Thompson Fix For: 3.0 In 2.1, we switched cqlsh to the python-driver. In 3.0, we got rid of cassandra-cli. Yet there is still code that's using legacy Thrift API. We want to convert it all to use the java-driver instead. 1. BulkLoader uses Thrift to query the schema tables. It should be using java-driver metadata APIs directly instead. 2. o.a.c.hadoop.cql3.CqlRecordWriter is using Thrift 3. o.a.c.hadoop.ColumnFamilyRecordReader is using Thrift 4. o.a.c.hadoop.AbstractCassandraStorage is using Thrift 5. o.a.c.hadoop.pig.CqlStorage is using Thrift Some of the things listed above use Thrift to get the list of partition key columns or clustering columns. Those should be converted to use the Metadata API of the java-driver. Somewhat related to that, we also have badly ported code from Thrift in o.a.c.hadoop.cql3.CqlRecordReader (see fetchKeys()) that manually fetches columns from schema tables instead of properly using the driver's Metadata API. We need all of it fixed. 
One exception, for now, is o.a.c.hadoop.AbstractColumnFamilyInputFormat - it's using Thrift for its describe_splits_ex() call that cannot be currently replaced by any java-driver call (?). Once this is done, we can stop starting Thrift RPC port by default in cassandra.yaml. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8716) java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed when running cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-8716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8716: --- Reproduced In: 2.0.12 Fix Version/s: 2.0.13 java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed when running cleanup -- Key: CASSANDRA-8716 URL: https://issues.apache.org/jira/browse/CASSANDRA-8716 Project: Cassandra Issue Type: Bug Components: Core Environment: Centos 6.6, Cassandra 2.0.12, Oracle JDK 1.7.0_67 Reporter: Imri Zvik Priority: Minor Fix For: 2.0.13 Attachments: system.log.gz {code}Error occurred during cleanup java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:188) at org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:234) at org.apache.cassandra.db.compaction.CompactionManager.performCleanup(CompactionManager.java:272) at org.apache.cassandra.db.ColumnFamilyStore.forceCleanup(ColumnFamilyStore.java:1115) at org.apache.cassandra.service.StorageService.forceKeyspaceCleanup(StorageService.java:2177) at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75) at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) at 
com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801) at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487) at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420) at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848) at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322) at sun.rmi.transport.Transport$1.run(Transport.java:177) at sun.rmi.transport.Transport$1.run(Transport.java:174) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:173) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.AssertionError: Memory was freed at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:259) at 
org.apache.cassandra.io.util.Memory.getInt(Memory.java:211) at org.apache.cassandra.io.sstable.IndexSummary.getIndex(IndexSummary.java:79) at org.apache.cassandra.io.sstable.IndexSummary.getKey(IndexSummary.java:84) at org.apache.cassandra.io.sstable.IndexSummary.binarySearch(IndexSummary.java:58) at
[jira] [Updated] (CASSANDRA-8715) Possible Deadlock in Cqlsh in a Kerberos-enabled environment when using COPY ... FROM ...
[ https://issues.apache.org/jira/browse/CASSANDRA-8715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8715: --- Fix Version/s: 2.1.3 Labels: cqlsh (was: ) Possible Deadlock in Cqlsh in a Kerberos-enabled environment when using COPY ... FROM ... --- Key: CASSANDRA-8715 URL: https://issues.apache.org/jira/browse/CASSANDRA-8715 Project: Cassandra Issue Type: Bug Environment: Cassandra 2.1.2.160, cqlsh 5.0.1, Native protocol v3 Reporter: Eduard Tudenhoefner Priority: Critical Labels: cqlsh Fix For: 2.1.3 When running a COPY ... FROM ... command in a Kerberos environment, I see the number of rows processed, but eventually cqlsh never returns. I can verify that all the data was copied, but the progress bar shows the last displayed info and cqlsh hangs there and never returns. Please note that this issue did *not* occur in the exact same environment with *Cassandra 2.0.12.156*. With the help of Tyler Hobbs, I investigated the problem a little further and added some debug statements at specific points. For example, in the CountdownLatch class at https://github.com/apache/cassandra/blob/a323a1a6d5f28ced1a51ba559055283f3eb356ff/pylib/cqlshlib/async_insert.py#L35-L36 I can see that the counter always stays above zero and the latch therefore never releases (even when the data to be copied has already been copied). I've also seen that when I type in one cqlsh command, two commands are actually executed. 
Let me give you an example: I added a debug statement just before https://github.com/apache/cassandra/blob/d76450c7986202141f3a917b3623a4c3138c1094/bin/cqlsh#L920 {code} cqlsh> use libdata ; 2015-01-30 18:54:56,113 [DEBUG] root: STATEMENT: [('K_USE', 'use', (0, 3)), ('identifier', 'libdata', (4, 11)), ('endtoken', ';', (12, 13))] 2015-01-30 18:54:56,113 [DEBUG] root: STATEMENT: [('K_USE', 'use', (0, 3)), ('identifier', 'libdata', (4, 11)), ('endtoken', ';', (12, 13))] {code} and saw that all commands I enter end up being executed twice (the same goes for the COPY command). If I can provide any other input for debugging purposes, please let me know. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
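[Editor's note] The hang reported above is consistent with a latch whose count can never reach zero: if statement duplication inflates the expected completion count (or a completion goes missing), await blocks forever. A minimal Java sketch of that failure mode follows, using java.util.concurrent.CountDownLatch as a stand-in for cqlshlib's Python CountdownLatch; the class, method, and numbers are invented for illustration and bounded by a timeout so the mismatch case returns instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LatchMismatchDemo {

    // Returns true iff `arrived` countDown() calls bring a latch expecting
    // `expected` completions to zero within timeoutMs.
    static boolean reachesZero(int expected, int arrived, long timeoutMs)
            throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(expected);
        for (int i = 0; i < arrived; i++) {
            latch.countDown();
        }
        return latch.await(timeoutMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        // Counts match: await() returns and the COPY would complete.
        System.out.println(reachesZero(2, 2, 50));
        // Mismatched counts: the counter stays above zero, so await()
        // blocks (here until the timeout) just as the COPY never returns.
        System.out.println(reachesZero(2, 1, 50));
    }
}
```

With a real latch there is no timeout, which is why the shell appears deadlocked rather than erroring out.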
[jira] [Updated] (CASSANDRA-8716) java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed when running cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-8716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8716: --- Description: {code}Error occurred during cleanup java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:188) at org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:234) at org.apache.cassandra.db.compaction.CompactionManager.performCleanup(CompactionManager.java:272) at org.apache.cassandra.db.ColumnFamilyStore.forceCleanup(ColumnFamilyStore.java:1115) at org.apache.cassandra.service.StorageService.forceKeyspaceCleanup(StorageService.java:2177) at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75) at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801) at 
javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487) at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420) at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848) at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322) at sun.rmi.transport.Transport$1.run(Transport.java:177) at sun.rmi.transport.Transport$1.run(Transport.java:174) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:173) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.AssertionError: Memory was freed at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:259) at org.apache.cassandra.io.util.Memory.getInt(Memory.java:211) at org.apache.cassandra.io.sstable.IndexSummary.getIndex(IndexSummary.java:79) at org.apache.cassandra.io.sstable.IndexSummary.getKey(IndexSummary.java:84) at org.apache.cassandra.io.sstable.IndexSummary.binarySearch(IndexSummary.java:58) at org.apache.cassandra.io.sstable.SSTableReader.getIndexScanPosition(SSTableReader.java:602) at 
org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:947) at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:910) at org.apache.cassandra.io.sstable.SSTableReader.getPositionsForRanges(SSTableReader.java:819) at org.apache.cassandra.db.ColumnFamilyStore.getExpectedCompactedFileSize(ColumnFamilyStore.java:1088) at org.apache.cassandra.db.compaction.CompactionManager.doCleanupCompaction(CompactionManager.java:564) at org.apache.cassandra.db.compaction.CompactionManager.access$400(CompactionManager.java:63) at
[jira] [Commented] (CASSANDRA-8390) The process cannot access the file because it is being used by another process
[ https://issues.apache.org/jira/browse/CASSANDRA-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14301367#comment-14301367 ] Joshua McKenzie commented on CASSANDRA-8390: I assume you didn't install the desktop experience and have Windows Defender on there? Alternatively, do you have another file system driver resident from an antivirus software (or other NIPS / IPS software) that could be causing this? Similarly, you guys didn't dig down into the roles and enable Windows Search, I assume? The "Unable to delete" messages you're seeing are business as usual and will clear up; however, the failure on deleteWithConfirm (as in the other cases) indicates your data file deleted without issue but something had a handle to that index file when it went to delete it. We have yet to have a confirmed instance of this occurring without some file system driver in the mix, so I'd look to that first. The process cannot access the file because it is being used by another process -- Key: CASSANDRA-8390 URL: https://issues.apache.org/jira/browse/CASSANDRA-8390 Project: Cassandra Issue Type: Bug Reporter: Ilya Komolkin Assignee: Joshua McKenzie Fix For: 2.1.3 Attachments: CassandraDiedWithDiskAccessModeStandardLogs.7z, NoHostAvailableLogs.zip {code}21:46:27.810 [NonPeriodicTasks:1] ERROR o.a.c.service.CassandraDaemon - Exception in thread Thread[NonPeriodicTasks:1,5,main] org.apache.cassandra.io.FSWriteError: java.nio.file.FileSystemException: E:\Upsource_12391\data\cassandra\data\kernel\filechangehistory_t-a277b560764611e48c8e4915424c75fe\kernel-filechangehistory_t-ka-33-Index.db: The process cannot access the file because it is being used by another process. 
at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:135) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:121) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTable.delete(SSTable.java:113) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTableDeletingTask.run(SSTableDeletingTask.java:94) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTableReader$6.run(SSTableReader.java:664) ~[cassandra-all-2.1.1.jar:2.1.1] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_71] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_71] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178) ~[na:1.7.0_71] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292) ~[na:1.7.0_71] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_71] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71] Caused by: java.nio.file.FileSystemException: E:\Upsource_12391\data\cassandra\data\kernel\filechangehistory_t-a277b560764611e48c8e4915424c75fe\kernel-filechangehistory_t-ka-33-Index.db: The process cannot access the file because it is being used by another process. 
at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86) ~[na:1.7.0_71] at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97) ~[na:1.7.0_71] at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102) ~[na:1.7.0_71] at sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269) ~[na:1.7.0_71] at sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103) ~[na:1.7.0_71] at java.nio.file.Files.delete(Files.java:1079) ~[na:1.7.0_71] at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:131) ~[cassandra-all-2.1.1.jar:2.1.1] ... 11 common frames omitted{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8715) Possible Deadlock in Cqlsh in a Kerberos-enabled environment when using COPY ... FROM ...
[ https://issues.apache.org/jira/browse/CASSANDRA-8715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8715: --- Assignee: Tyler Hobbs Possible Deadlock in Cqlsh in a Kerberos-enabled environment when using COPY ... FROM ... --- Key: CASSANDRA-8715 URL: https://issues.apache.org/jira/browse/CASSANDRA-8715 Project: Cassandra Issue Type: Bug Environment: Cassandra 2.1.2.160, cqlsh 5.0.1, Native protocol v3 Reporter: Eduard Tudenhoefner Assignee: Tyler Hobbs Priority: Critical Labels: cqlsh Fix For: 2.1.3 When running a COPY ... FROM ... command in a Kerberos environment, I see the number of rows processed, but eventually cqlsh never returns. I can verify that all the data was copied, but the progress bar freezes at the last shown info and cqlsh hangs there, never returning. Please note that this issue did *not* occur in the exact same environment with *Cassandra 2.0.12.156*. With the help of Tyler Hobbs, I investigated the problem a little bit further and added some debug statements at specific points. For example, in the CountdownLatch class at https://github.com/apache/cassandra/blob/a323a1a6d5f28ced1a51ba559055283f3eb356ff/pylib/cqlshlib/async_insert.py#L35-L36 I can see that the counter always stays above zero, so the wait never returns (even when the data to be copied has already been copied). I've also seen that somehow, when I type one cqlsh command, two commands are actually issued. 
Let me give you an example: I added a debug statement just before https://github.com/apache/cassandra/blob/d76450c7986202141f3a917b3623a4c3138c1094/bin/cqlsh#L920 {code} cqlsh use libdata ; 2015-01-30 18:54:56,113 [DEBUG] root: STATEMENT: [('K_USE', 'use', (0, 3)), ('identifier', 'libdata', (4, 11)), ('endtoken', ';', (12, 13))] 2015-01-30 18:54:56,113 [DEBUG] root: STATEMENT: [('K_USE', 'use', (0, 3)), ('identifier', 'libdata', (4, 11)), ('endtoken', ';', (12, 13))] {code} and saw that every command I enter ends up being executed twice (the same goes for the COPY command). If I can provide any other input for debugging purposes, please let me know. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
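The hang described above is the classic failure mode of a countdown latch whose counter is never fully drained. The cqlsh code in question is Python, but the mechanics are the same in any language; the following is a hypothetical, self-contained Java sketch (the class and method names are invented for illustration, not taken from cqlsh) showing why a latch that receives fewer countDown() calls than it expects blocks forever on a plain await(), and how a timed await at least fails visibly instead of hanging:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LatchHangSketch {
    // Illustrative only: the latch expects `expected` countDown() calls,
    // but only `actual` ever happen. A plain await() would then block
    // forever; await(timeout) returns false instead of hanging.
    public static boolean finishedInTime(int expected, int actual, long timeoutMs) {
        CountDownLatch latch = new CountDownLatch(expected);
        for (int i = 0; i < actual; i++)
            latch.countDown();
        try {
            // false means the counter never reached zero within the timeout
            return latch.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With expected == actual the call returns true promptly; with even one missing countDown() it times out, which mirrors a counter that "always stays above zero".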
[jira] [Commented] (CASSANDRA-6542) nodetool removenode hangs
[ https://issues.apache.org/jira/browse/CASSANDRA-6542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302871#comment-14302871 ] Erik Forsberg commented on CASSANDRA-6542: -- I observed this on my 1.2.18 production cluster. 6 out of 56 machines would not come back with any replication confirmation. Checking 'nodetool netstats' on those 6 machines showed no streaming sessions ongoing, yet their logs showed they had been streaming the relevant data. nodetool removenode hangs - Key: CASSANDRA-6542 URL: https://issues.apache.org/jira/browse/CASSANDRA-6542 Project: Cassandra Issue Type: Bug Components: Core Environment: Ubuntu 12, 1.2.11 DSE Reporter: Eric Lubow Assignee: Tyler Hobbs Running *nodetool removenode $host-id* doesn't actually remove the node from the ring. I've let it run anywhere from 5 minutes to 3 days and there are no messages in the log about it hanging or failing; the command just sits there running. So the regular response has been to run *nodetool removenode $host-id*, give it about 10-15 minutes and then run *nodetool removenode force*. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8492) Support IF NOT EXISTS for ALTER TABLE ADD COLUMN
[ https://issues.apache.org/jira/browse/CASSANDRA-8492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-8492: - Reviewer: Aleksey Yeschenko Support IF NOT EXISTS for ALTER TABLE ADD COLUMN Key: CASSANDRA-8492 URL: https://issues.apache.org/jira/browse/CASSANDRA-8492 Project: Cassandra Issue Type: Improvement Reporter: Peter Mädel Priority: Minor Attachments: 8492.patch would enable the creation of schema update scripts that can be repeatedly executed without having to worry about invalid query exceptions -- This message was sent by Atlassian JIRA (v6.3.4#6332)
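The motivation for the feature can be shown with a toy model: without IF NOT EXISTS semantics, re-running a migration script throws an invalid-query error on the second ALTER, while the guarded form is a silent no-op. This is a hypothetical, self-contained Java sketch (AddColumnSketch and its methods are invented for illustration, not Cassandra code) mimicking the two behaviors:

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class AddColumnSketch {
    // Hypothetical in-memory stand-in for a table's column set.
    private final Set<String> columns = new LinkedHashSet<>();

    // Mimics plain ALTER TABLE ... ADD: throws on a duplicate column,
    // which is what breaks re-runnable migration scripts.
    public void addColumn(String name) {
        if (!columns.add(name))
            throw new IllegalStateException("Invalid query: column " + name + " already exists");
    }

    // Mimics the proposed ALTER TABLE ... ADD ... IF NOT EXISTS:
    // a no-op on re-execution, so the same script can run any number of times.
    public boolean addColumnIfNotExists(String name) {
        return columns.add(name);
    }
}
```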
[jira] [Commented] (CASSANDRA-8390) The process cannot access the file because it is being used by another process
[ https://issues.apache.org/jira/browse/CASSANDRA-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301428#comment-14301428 ] Wei Yuan Cho commented on CASSANDRA-8390: - Hi Joshua, thank you for your reply. We have Windows Defender here, I'll exclude Cassandra folder, will that be good enough? We do not have Windows Search nor antivirus here. I have checked file system driver, nothing out of the ordinary as far as I can see. I do not mind posting a screenshot of it if required. Is it technically possible to retrieve what program has a handle on the index file so that it can be printed to the logs should that error occur? That'd be helpful for debugging issues like this I suppose The process cannot access the file because it is being used by another process -- Key: CASSANDRA-8390 URL: https://issues.apache.org/jira/browse/CASSANDRA-8390 Project: Cassandra Issue Type: Bug Reporter: Ilya Komolkin Assignee: Joshua McKenzie Fix For: 2.1.3 Attachments: CassandraDiedWithDiskAccessModeStandardLogs.7z, NoHostAvailableLogs.zip {code}21:46:27.810 [NonPeriodicTasks:1] ERROR o.a.c.service.CassandraDaemon - Exception in thread Thread[NonPeriodicTasks:1,5,main] org.apache.cassandra.io.FSWriteError: java.nio.file.FileSystemException: E:\Upsource_12391\data\cassandra\data\kernel\filechangehistory_t-a277b560764611e48c8e4915424c75fe\kernel-filechangehistory_t-ka-33-Index.db: The process cannot access the file because it is being used by another process. 
at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:135) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:121) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTable.delete(SSTable.java:113) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTableDeletingTask.run(SSTableDeletingTask.java:94) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTableReader$6.run(SSTableReader.java:664) ~[cassandra-all-2.1.1.jar:2.1.1] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_71] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_71] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178) ~[na:1.7.0_71] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292) ~[na:1.7.0_71] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_71] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71] Caused by: java.nio.file.FileSystemException: E:\Upsource_12391\data\cassandra\data\kernel\filechangehistory_t-a277b560764611e48c8e4915424c75fe\kernel-filechangehistory_t-ka-33-Index.db: The process cannot access the file because it is being used by another process. 
at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86) ~[na:1.7.0_71] at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97) ~[na:1.7.0_71] at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102) ~[na:1.7.0_71] at sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269) ~[na:1.7.0_71] at sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103) ~[na:1.7.0_71] at java.nio.file.Files.delete(Files.java:1079) ~[na:1.7.0_71] at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:131) ~[cassandra-all-2.1.1.jar:2.1.1] ... 11 common frames omitted{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8717) Top-k queries with custom secondary indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-8717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301431#comment-14301431 ] Aleksey Yeschenko commented on CASSANDRA-8717: -- It's unlikely to get into the 2.1 line, sorry. Maybe 3.0, if at all, and it should be done on top of CASSANDRA-8099. Top-k queries with custom secondary indexes --- Key: CASSANDRA-8717 URL: https://issues.apache.org/jira/browse/CASSANDRA-8717 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Andrés de la Peña Assignee: Andrés de la Peña Priority: Minor Labels: 2i, secondary_index, sort, sorting, top-k Fix For: 2.1.3 Attachments: 0001-Add-support-for-top-k-queries-in-2i.patch As presented in [Cassandra Summit Europe 2014|https://www.youtube.com/watch?v=Hg5s-hXy_-M], secondary indexes can be modified to support general top-k queries with minimal changes to the Cassandra codebase. This way, custom 2i implementations could provide relevance search, sorting by columns, etc. Top-k queries retrieve the k best results for a certain query. That implies querying the k best rows in each token range and then sorting them to obtain the k globally best rows. To do that, we propose two additional methods in class SecondaryIndexSearcher: {code:java} public boolean requiresFullScan(List<IndexExpression> clause) { return false; } public List<Row> sort(List<IndexExpression> clause, List<Row> rows) { return rows; } {code} The first one indicates whether a query performed against the index requires querying all the nodes in the ring; it is necessary in top-k queries because we do not know which nodes hold the best results. The second method specifies how to sort all the partial per-node results according to the query. 
Then we add two similar methods to the class AbstractRangeCommand: {code:java} this.searcher = Keyspace.open(keyspace).getColumnFamilyStore(columnFamily).indexManager.searcher(rowFilter); public boolean requiresFullScan() { return searcher == null ? false : searcher.requiresFullScan(rowFilter); } public List<Row> combine(List<Row> rows) { return searcher == null ? trim(rows) : trim(searcher.sort(rowFilter, rows)); } {code} Finally, we modify StorageProxy#getRangeSlice to use the previous methods, as shown in the attached patch. We think that the proposed approach provides very useful functionality with minimal impact on the current codebase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
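Independent of the proposed Cassandra hooks, the coordinator-side logic they enable — gather each token range's k best rows, re-sort the union with the same comparator the index uses, and trim to the global k — can be sketched in a few lines. This is an illustrative stand-alone Java sketch using plain scores in place of Cassandra Rows (all names here are invented, not from the patch):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class TopKMergeSketch {
    // Each token range contributes its own k best results, already sorted
    // best-first by that range's index searcher. The coordinator concatenates
    // the partial lists, re-sorts with the index's ordering, and trims to
    // the global k -- the role the proposed sort()/combine() methods play.
    public static List<Double> combine(List<List<Double>> perRangeResults, int k) {
        List<Double> all = new ArrayList<>();
        for (List<Double> partial : perRangeResults)
            all.addAll(partial);
        all.sort(Comparator.reverseOrder()); // best (highest score) first
        return all.subList(0, Math.min(k, all.size()));
    }
}
```

Note why requiresFullScan matters: since any range might hold one of the global top k, every range must be queried before the merge can be trusted.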
[jira] [Updated] (CASSANDRA-8717) Top-k queries with custom secondary indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-8717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8717: --- Fix Version/s: (was: 2.1.3) 3.0 Top-k queries with custom secondary indexes --- Key: CASSANDRA-8717 URL: https://issues.apache.org/jira/browse/CASSANDRA-8717 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Andrés de la Peña Assignee: Andrés de la Peña Priority: Minor Labels: 2i, secondary_index, sort, sorting, top-k Fix For: 3.0 Attachments: 0001-Add-support-for-top-k-queries-in-2i.patch As presented in [Cassandra Summit Europe 2014|https://www.youtube.com/watch?v=Hg5s-hXy_-M], secondary indexes can be modified to support general top-k queries with minimal changes to the Cassandra codebase. This way, custom 2i implementations could provide relevance search, sorting by columns, etc. Top-k queries retrieve the k best results for a certain query. That implies querying the k best rows in each token range and then sorting them to obtain the k globally best rows. To do that, we propose two additional methods in class SecondaryIndexSearcher: {code:java} public boolean requiresFullScan(List<IndexExpression> clause) { return false; } public List<Row> sort(List<IndexExpression> clause, List<Row> rows) { return rows; } {code} The first one indicates whether a query performed against the index requires querying all the nodes in the ring; it is necessary in top-k queries because we do not know which nodes hold the best results. The second method specifies how to sort all the partial per-node results according to the query. 
Then we add two similar methods to the class AbstractRangeCommand: {code:java} this.searcher = Keyspace.open(keyspace).getColumnFamilyStore(columnFamily).indexManager.searcher(rowFilter); public boolean requiresFullScan() { return searcher == null ? false : searcher.requiresFullScan(rowFilter); } public List<Row> combine(List<Row> rows) { return searcher == null ? trim(rows) : trim(searcher.sort(rowFilter, rows)); } {code} Finally, we modify StorageProxy#getRangeSlice to use the previous methods, as shown in the attached patch. We think that the proposed approach provides very useful functionality with minimal impact on the current codebase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8717) Top-k queries with custom secondary indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-8717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8717: --- Assignee: Andrés de la Peña Top-k queries with custom secondary indexes --- Key: CASSANDRA-8717 URL: https://issues.apache.org/jira/browse/CASSANDRA-8717 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Andrés de la Peña Assignee: Andrés de la Peña Priority: Minor Labels: 2i, secondary_index, sort, sorting, top-k Fix For: 2.1.3 Attachments: 0001-Add-support-for-top-k-queries-in-2i.patch As presented in [Cassandra Summit Europe 2014|https://www.youtube.com/watch?v=Hg5s-hXy_-M], secondary indexes can be modified to support general top-k queries with minimal changes to the Cassandra codebase. This way, custom 2i implementations could provide relevance search, sorting by columns, etc. Top-k queries retrieve the k best results for a certain query. That implies querying the k best rows in each token range and then sorting them to obtain the k globally best rows. To do that, we propose two additional methods in class SecondaryIndexSearcher: {code:java} public boolean requiresFullScan(List<IndexExpression> clause) { return false; } public List<Row> sort(List<IndexExpression> clause, List<Row> rows) { return rows; } {code} The first one indicates whether a query performed against the index requires querying all the nodes in the ring; it is necessary in top-k queries because we do not know which nodes hold the best results. The second method specifies how to sort all the partial per-node results according to the query. 
Then we add two similar methods to the class AbstractRangeCommand: {code:java} this.searcher = Keyspace.open(keyspace).getColumnFamilyStore(columnFamily).indexManager.searcher(rowFilter); public boolean requiresFullScan() { return searcher == null ? false : searcher.requiresFullScan(rowFilter); } public List<Row> combine(List<Row> rows) { return searcher == null ? trim(rows) : trim(searcher.sort(rowFilter, rows)); } {code} Finally, we modify StorageProxy#getRangeSlice to use the previous methods, as shown in the attached patch. We think that the proposed approach provides very useful functionality with minimal impact on the current codebase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8717) Top-k queries with custom secondary indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-8717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301452#comment-14301452 ] Robbie Strickland commented on CASSANDRA-8717: -- [~iamaleksey] Have you looked at the patch? There's barely anything to it, and yet it opens up the door for teams like Stratio to plug in more advanced index implementations without breaking anything (i.e. no need for their fork, which is a good thing). Plus who knows when 3.0 will go mainstream? I think you should reconsider, or at least get some other input. Top-k queries with custom secondary indexes --- Key: CASSANDRA-8717 URL: https://issues.apache.org/jira/browse/CASSANDRA-8717 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Andrés de la Peña Assignee: Andrés de la Peña Priority: Minor Labels: 2i, secondary_index, sort, sorting, top-k Fix For: 3.0 Attachments: 0001-Add-support-for-top-k-queries-in-2i.patch As presented in [Cassandra Summit Europe 2014|https://www.youtube.com/watch?v=Hg5s-hXy_-M], secondary indexes can be modified to support general top-k queries with minimal changes to the Cassandra codebase. This way, custom 2i implementations could provide relevance search, sorting by columns, etc. Top-k queries retrieve the k best results for a certain query. That implies querying the k best rows in each token range and then sorting them to obtain the k globally best rows. To do that, we propose two additional methods in class SecondaryIndexSearcher: {code:java} public boolean requiresFullScan(List<IndexExpression> clause) { return false; } public List<Row> sort(List<IndexExpression> clause, List<Row> rows) { return rows; } {code} The first one indicates whether a query performed against the index requires querying all the nodes in the ring; it is necessary in top-k queries because we do not know which nodes hold the best results. The second method specifies how to sort all the partial per-node results according to the query. 
Then we add two similar methods to the class AbstractRangeCommand: {code:java} this.searcher = Keyspace.open(keyspace).getColumnFamilyStore(columnFamily).indexManager.searcher(rowFilter); public boolean requiresFullScan() { return searcher == null ? false : searcher.requiresFullScan(rowFilter); } public List<Row> combine(List<Row> rows) { return searcher == null ? trim(rows) : trim(searcher.sort(rowFilter, rows)); } {code} Finally, we modify StorageProxy#getRangeSlice to use the previous methods, as shown in the attached patch. We think that the proposed approach provides very useful functionality with minimal impact on the current codebase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8390) The process cannot access the file because it is being used by another process
[ https://issues.apache.org/jira/browse/CASSANDRA-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301458#comment-14301458 ] Joshua McKenzie commented on CASSANDRA-8390: The best current method I have is to use msinfo32 to check for file system drivers and to deduce from there. I'll see if there's some way to query from user-space - [handle|https://technet.microsoft.com/en-us/sysinternals/bb896655.aspx] from sysinternals lets you run a standalone query for what has file handles open, but running that manually isn't realistic since you'll miss transient handles, and most of the sysinternals tooling is implemented as kernel drivers, so to get the same information ourselves we'll likely have to take a different approach. Excluding the Cassandra folder from Windows Defender is highly recommended, but I can't speak to the rest of your environment with regard to file system drivers. The process cannot access the file because it is being used by another process -- Key: CASSANDRA-8390 URL: https://issues.apache.org/jira/browse/CASSANDRA-8390 Project: Cassandra Issue Type: Bug Reporter: Ilya Komolkin Assignee: Joshua McKenzie Fix For: 2.1.3 Attachments: CassandraDiedWithDiskAccessModeStandardLogs.7z, NoHostAvailableLogs.zip {code}21:46:27.810 [NonPeriodicTasks:1] ERROR o.a.c.service.CassandraDaemon - Exception in thread Thread[NonPeriodicTasks:1,5,main] org.apache.cassandra.io.FSWriteError: java.nio.file.FileSystemException: E:\Upsource_12391\data\cassandra\data\kernel\filechangehistory_t-a277b560764611e48c8e4915424c75fe\kernel-filechangehistory_t-ka-33-Index.db: The process cannot access the file because it is being used by another process. 
at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:135) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:121) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTable.delete(SSTable.java:113) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTableDeletingTask.run(SSTableDeletingTask.java:94) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTableReader$6.run(SSTableReader.java:664) ~[cassandra-all-2.1.1.jar:2.1.1] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_71] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_71] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178) ~[na:1.7.0_71] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292) ~[na:1.7.0_71] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_71] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71] Caused by: java.nio.file.FileSystemException: E:\Upsource_12391\data\cassandra\data\kernel\filechangehistory_t-a277b560764611e48c8e4915424c75fe\kernel-filechangehistory_t-ka-33-Index.db: The process cannot access the file because it is being used by another process. 
at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86) ~[na:1.7.0_71] at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97) ~[na:1.7.0_71] at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102) ~[na:1.7.0_71] at sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269) ~[na:1.7.0_71] at sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103) ~[na:1.7.0_71] at java.nio.file.Files.delete(Files.java:1079) ~[na:1.7.0_71] at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:131) ~[cassandra-all-2.1.1.jar:2.1.1] ... 11 common frames omitted{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
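A common mitigation for transient handles held by scanners on Windows is to retry the delete with a short backoff before surfacing the failure. The following is a hypothetical stand-alone Java sketch of that pattern — it is not Cassandra's actual deleteWithConfirm code, and the names and parameters are invented for illustration:

```java
import java.io.IOException;
import java.nio.file.FileSystemException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DeleteRetrySketch {
    // Illustrative only: retry a delete that fails because something
    // (e.g. an antivirus filter driver) briefly holds a handle to the file.
    public static void deleteWithRetry(Path p, int attempts, long backoffMs)
            throws IOException {
        for (int i = 1; ; i++) {
            try {
                Files.deleteIfExists(p);
                return;
            } catch (FileSystemException e) {
                if (i >= attempts)
                    throw e; // give up and surface the original error
                try {
                    Thread.sleep(backoffMs); // wait for the transient handle to close
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }
}
```

On Windows, `Files.delete` on a file opened by another process without `FILE_SHARE_DELETE` throws a `FileSystemException` ("being used by another process"), which is what makes retrying worthwhile for short-lived scanner handles.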
[jira] [Commented] (CASSANDRA-8717) Top-k queries with custom secondary indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-8717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301395#comment-14301395 ] Robbie Strickland commented on CASSANDRA-8717: -- Prior to this patch being submitted, I went through this same exercise and patched 2.1 mainline with these changes. I couldn't see where it broke anything, and it allows users to drop in Stratio's (or their own) custom index implementation. This is a big win! Top-k queries with custom secondary indexes --- Key: CASSANDRA-8717 URL: https://issues.apache.org/jira/browse/CASSANDRA-8717 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Andrés de la Peña Priority: Minor Labels: 2i, secondary_index, sort, sorting, top-k Fix For: 2.1.3 Attachments: 0001-Add-support-for-top-k-queries-in-2i.patch As presented in [Cassandra Summit Europe 2014|https://www.youtube.com/watch?v=Hg5s-hXy_-M], secondary indexes can be modified to support general top-k queries with minimal changes to the Cassandra codebase. This way, custom 2i implementations could provide relevance search, sorting by columns, etc. Top-k queries retrieve the k best results for a certain query. That implies querying the k best rows in each token range and then sorting them to obtain the k globally best rows. To do that, we propose two additional methods in class SecondaryIndexSearcher: {code:java} public boolean requiresFullScan(List<IndexExpression> clause) { return false; } public List<Row> sort(List<IndexExpression> clause, List<Row> rows) { return rows; } {code} The first one indicates whether a query performed against the index requires querying all the nodes in the ring; it is necessary in top-k queries because we do not know which nodes hold the best results. The second method specifies how to sort all the partial per-node results according to the query. 
Then we add two similar methods to the class AbstractRangeCommand: {code:java} this.searcher = Keyspace.open(keyspace).getColumnFamilyStore(columnFamily).indexManager.searcher(rowFilter); public boolean requiresFullScan() { return searcher == null ? false : searcher.requiresFullScan(rowFilter); } public List<Row> combine(List<Row> rows) { return searcher == null ? trim(rows) : trim(searcher.sort(rowFilter, rows)); } {code} Finally, we modify StorageProxy#getRangeSlice to use the previous methods, as shown in the attached patch. We think that the proposed approach provides very useful functionality with minimal impact on the current codebase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-8718) nodetool cleanup causes segfault
Maxim Ivanov created CASSANDRA-8718: --- Summary: nodetool cleanup causes segfault Key: CASSANDRA-8718 URL: https://issues.apache.org/jira/browse/CASSANDRA-8718 Project: Cassandra Issue Type: Bug Reporter: Maxim Ivanov Priority: Minor When doing cleanup on C* 2.0.12 following error crashes the java process: {code} INFO 17:59:02,800 Cleaning up SSTableReader(path='/data/sdd/cassandra_prod/vdna/analytics/vdna-analytics-jb-21670-Data.db') # # A fatal error has been detected by the Java Runtime Environment: # # SIGSEGV (0xb) at pc=0x7f750890268e, pid=28039, tid=140130222446336 # # JRE version: Java(TM) SE Runtime Environment (7.0_71-b14) (build 1.7.0_71-b14) # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.71-b01 mixed mode linux-amd64 compressed oops) # Problematic frame: # J 2655 C2 org.apache.cassandra.io.sstable.IndexSummary.binarySearch(Lorg/apache/cassandra/db/RowPosition;)I (88 bytes) @ 0x7f750890268e [0x7f7508902580+0x10e] {code} # # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try ulimit -c unlimited before starting Java again # # An error report file with more information is saved as: # /var/lib/cassandra_prod/hs_err_pid28039.log Compiled method (c2) 913167265 4849 org.apache.cassandra.dht.Token::maxKeyBound (24 bytes) total in heap [0x7f7508572450,0x7f7508573318] = 3784 relocation [0x7f7508572570,0x7f7508572618] = 168 main code [0x7f7508572620,0x7f7508572cc0] = 1696 stub code [0x7f7508572cc0,0x7f7508572cf8] = 56 oops [0x7f7508572cf8,0x7f7508572d90] = 152 scopes data[0x7f7508572d90,0x7f7508573118] = 904 scopes pcs [0x7f7508573118,0x7f7508573268] = 336 dependencies [0x7f7508573268,0x7f7508573280] = 24 handler table [0x7f7508573280,0x7f75085732e0] = 96 nul chk table [0x7f75085732e0,0x7f7508573318] = 56 # # If you would like to submit a bug report, please visit: # http://bugreport.sun.com/bugreport/crash.jsp # -- This message was sent by Atlassian JIRA (v6.3.4#6332)
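The problematic frame in the crash above is IndexSummary.binarySearch, which conceptually performs a standard binary search over the sampled index entries, encoding a miss as a negative insertion point so the caller can fall back to the nearest preceding sample. A simplified, self-contained Java sketch of that kind of search (over plain strings rather than RowPositions, with invented names; the real method reads keys from off-heap memory, which is where a segfault can originate):

```java
import java.util.List;

public class SummarySearchSketch {
    // Find `key` among the sorted sampled keys. An exact hit returns its
    // index; a miss returns -(insertionPoint + 1), so the caller can still
    // locate the nearest preceding sample via -(result) - 2.
    public static int binarySearch(List<String> sampledKeys, String key) {
        int low = 0, high = sampledKeys.size() - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1; // unsigned shift avoids int overflow
            int cmp = sampledKeys.get(mid).compareTo(key);
            if (cmp < 0) low = mid + 1;
            else if (cmp > 0) high = mid - 1;
            else return mid;
        }
        return -(low + 1); // not sampled: negative insertion point
    }
}
```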
[jira] [Updated] (CASSANDRA-8718) nodetool cleanup causes segfault
[ https://issues.apache.org/jira/browse/CASSANDRA-8718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Ivanov updated CASSANDRA-8718: Description: When doing cleanup on C* 2.0.12 following error crashes the java process: {code} INFO 17:59:02,800 Cleaning up SSTableReader(path='/data/sdd/cassandra_prod/vdna/analytics/vdna-analytics-jb-21670-Data.db') # # A fatal error has been detected by the Java Runtime Environment: # # SIGSEGV (0xb) at pc=0x7f750890268e, pid=28039, tid=140130222446336 # # JRE version: Java(TM) SE Runtime Environment (7.0_71-b14) (build 1.7.0_71-b14) # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.71-b01 mixed mode linux-amd64 compressed oops) # Problematic frame: # J 2655 C2 org.apache.cassandra.io.sstable.IndexSummary.binarySearch(Lorg/apache/cassandra/db/RowPosition;)I (88 bytes) @ 0x7f750890268e [0x7f7508902580+0x10e] # # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try ulimit -c unlimited before starting Java again # # An error report file with more information is saved as: # /var/lib/cassandra_prod/hs_err_pid28039.log Compiled method (c2) 913167265 4849 org.apache.cassandra.dht.Token::maxKeyBound (24 bytes) total in heap [0x7f7508572450,0x7f7508573318] = 3784 relocation [0x7f7508572570,0x7f7508572618] = 168 main code [0x7f7508572620,0x7f7508572cc0] = 1696 stub code [0x7f7508572cc0,0x7f7508572cf8] = 56 oops [0x7f7508572cf8,0x7f7508572d90] = 152 scopes data[0x7f7508572d90,0x7f7508573118] = 904 scopes pcs [0x7f7508573118,0x7f7508573268] = 336 dependencies [0x7f7508573268,0x7f7508573280] = 24 handler table [0x7f7508573280,0x7f75085732e0] = 96 nul chk table [0x7f75085732e0,0x7f7508573318] = 56 # # If you would like to submit a bug report, please visit: # http://bugreport.sun.com/bugreport/crash.jsp # {code} was: When doing cleanup on C* 2.0.12 following error crashes the java process: {code} INFO 17:59:02,800 Cleaning up 
nodetool cleanup causes segfault Key: CASSANDRA-8718 URL: https://issues.apache.org/jira/browse/CASSANDRA-8718 Project: Cassandra Issue Type: Bug Reporter: Maxim Ivanov Priority: Minor When doing cleanup on C* 2.0.12, the following error crashes the java process:
{code}
INFO 17:59:02,800 Cleaning up SSTableReader(path='/data/sdd/cassandra_prod/vdna/analytics/vdna-analytics-jb-21670-Data.db')
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x7f750890268e, pid=28039, tid=140130222446336
#
# JRE version: Java(TM) SE Runtime Environment (7.0_71-b14) (build 1.7.0_71-b14)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.71-b01 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# J 2655 C2 org.apache.cassandra.io.sstable.IndexSummary.binarySearch(Lorg/apache/cassandra/db/RowPosition;)I (88 bytes) @ 0x7f750890268e [0x7f7508902580+0x10e]
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /var/lib/cassandra_prod/hs_err_pid28039.log
#
Compiled method (c2) 913167265 4849 org.apache.cassandra.dht.Token::maxKeyBound (24 bytes)
 total in heap  [0x7f7508572450,0x7f7508573318] = 3784
 relocation     [0x7f7508572570,0x7f7508572618] = 168
 main code      [0x7f7508572620,0x7f7508572cc0] = 1696
 stub code      [0x7f7508572cc0,0x7f7508572cf8] = 56
 oops           [0x7f7508572cf8,0x7f7508572d90] = 152
 scopes data    [0x7f7508572d90,0x7f7508573118] = 904
 scopes pcs     [0x7f7508573118,0x7f7508573268] = 336
 dependencies   [0x7f7508573268,0x7f7508573280] = 24
 handler table  [0x7f7508573280,0x7f75085732e0] = 96
 nul chk table  [0x7f75085732e0,0x7f7508573318] = 56
#
# If you would like to submit a bug report, please visit:
# http://bugreport.sun.com/bugreport/crash.jsp
#
{code}
[jira] [Commented] (CASSANDRA-8390) The process cannot access the file because it is being used by another process
[ https://issues.apache.org/jira/browse/CASSANDRA-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14301594#comment-14301594 ] Joshua McKenzie commented on CASSANDRA-8390: Nothing in that list jumps out as a problem. I take it you've had continued index file deletion failures after disabling Windows Defender? If you could reproduce that error and run [handle|https://technet.microsoft.com/en-us/sysinternals/bb896655.aspx] against it - if something other than java shows up, that might tell us something. If you're still having the problem after excluding cassandra folders from Windows Defender, I'd try stopping the Windows Defender service entirely from within PowerShell as an administrator: {noformat} Set-MpPreference -DisableRealtimeMonitoring $true {noformat} The process cannot access the file because it is being used by another process -- Key: CASSANDRA-8390 URL: https://issues.apache.org/jira/browse/CASSANDRA-8390 Project: Cassandra Issue Type: Bug Reporter: Ilya Komolkin Assignee: Joshua McKenzie Fix For: 2.1.3 Attachments: CassandraDiedWithDiskAccessModeStandardLogs.7z, FSD.PNG, NoHostAvailableLogs.zip
{code}
21:46:27.810 [NonPeriodicTasks:1] ERROR o.a.c.service.CassandraDaemon - Exception in thread Thread[NonPeriodicTasks:1,5,main]
org.apache.cassandra.io.FSWriteError: java.nio.file.FileSystemException: E:\Upsource_12391\data\cassandra\data\kernel\filechangehistory_t-a277b560764611e48c8e4915424c75fe\kernel-filechangehistory_t-ka-33-Index.db: The process cannot access the file because it is being used by another process.
	at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:135) ~[cassandra-all-2.1.1.jar:2.1.1]
	at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:121) ~[cassandra-all-2.1.1.jar:2.1.1]
	at org.apache.cassandra.io.sstable.SSTable.delete(SSTable.java:113) ~[cassandra-all-2.1.1.jar:2.1.1]
	at org.apache.cassandra.io.sstable.SSTableDeletingTask.run(SSTableDeletingTask.java:94) ~[cassandra-all-2.1.1.jar:2.1.1]
	at org.apache.cassandra.io.sstable.SSTableReader$6.run(SSTableReader.java:664) ~[cassandra-all-2.1.1.jar:2.1.1]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_71]
	at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_71]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178) ~[na:1.7.0_71]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292) ~[na:1.7.0_71]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_71]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71]
	at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
Caused by: java.nio.file.FileSystemException: E:\Upsource_12391\data\cassandra\data\kernel\filechangehistory_t-a277b560764611e48c8e4915424c75fe\kernel-filechangehistory_t-ka-33-Index.db: The process cannot access the file because it is being used by another process.
	at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86) ~[na:1.7.0_71]
	at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97) ~[na:1.7.0_71]
	at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102) ~[na:1.7.0_71]
	at sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269) ~[na:1.7.0_71]
	at sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103) ~[na:1.7.0_71]
	at java.nio.file.Files.delete(Files.java:1079) ~[na:1.7.0_71]
	at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:131) ~[cassandra-all-2.1.1.jar:2.1.1]
	... 11 common frames omitted
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8689) Assertion error in 2.1.2: ERROR [IndexSummaryManager:1]
[ https://issues.apache.org/jira/browse/CASSANDRA-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14301599#comment-14301599 ] Benedict commented on CASSANDRA-8689: - It is unlikely this will cause any degradation to your cluster. The problem should just prevent summaries from being redistributed at that moment in time, which is a harmless setback. Assertion error in 2.1.2: ERROR [IndexSummaryManager:1] --- Key: CASSANDRA-8689 URL: https://issues.apache.org/jira/browse/CASSANDRA-8689 Project: Cassandra Issue Type: Bug Reporter: Jeff Liu Fix For: 2.1.3 After upgrading a 6-node cassandra cluster from 2.1.0 to 2.1.2, we started getting the following assertion error.
{noformat}
ERROR [IndexSummaryManager:1] 2015-01-26 20:55:40,451 CassandraDaemon.java:153 - Exception in thread Thread[IndexSummaryManager:1,1,main]
java.lang.AssertionError: null
	at org.apache.cassandra.io.util.Memory.size(Memory.java:307) ~[apache-cassandra-2.1.2.jar:2.1.2]
	at org.apache.cassandra.io.sstable.IndexSummary.getOffHeapSize(IndexSummary.java:192) ~[apache-cassandra-2.1.2.jar:2.1.2]
	at org.apache.cassandra.io.sstable.SSTableReader.getIndexSummaryOffHeapSize(SSTableReader.java:1070) ~[apache-cassandra-2.1.2.jar:2.1.2]
	at org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries(IndexSummaryManager.java:292) ~[apache-cassandra-2.1.2.jar:2.1.2]
	at org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries(IndexSummaryManager.java:238) ~[apache-cassandra-2.1.2.jar:2.1.2]
	at org.apache.cassandra.io.sstable.IndexSummaryManager$1.runMayThrow(IndexSummaryManager.java:139) ~[apache-cassandra-2.1.2.jar:2.1.2]
	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.1.2.jar:2.1.2]
	at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:77) ~[apache-cassandra-2.1.2.jar:2.1.2]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_45]
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) [na:1.7.0_45]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) [na:1.7.0_45]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.7.0_45]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45]
	at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
{noformat}
The cassandra service is still running despite the issue. The node has 8G memory in total, with 2G allocated to the heap. We are basically running read queries to retrieve data out of cassandra.
[jira] [Commented] (CASSANDRA-8518) Impose In-Flight Data Limit
[ https://issues.apache.org/jira/browse/CASSANDRA-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14301620#comment-14301620 ] Cheng Ren commented on CASSANDRA-8518: -- Thanks for your reply. What are the necessary facilities here? Is MemoryMeter one of them? What else do we have? Impose In-Flight Data Limit --- Key: CASSANDRA-8518 URL: https://issues.apache.org/jira/browse/CASSANDRA-8518 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Cheng Ren Labels: performance We have been suffering from cassandra node crashes due to out of memory errors for a long time. The heap dump from the most recent crash shows 22 native transport request threads, each of which consumes 3.3% of the heap, taking more than 70% in total. Heap dump: !https://dl-web.dropbox.com/get/attach1.png?_subject_uid=303980955&w=AAAVOoncBoZ5aOPbDg2TpRkUss7B-2wlrnhUAv19b27OUA|height=400,width=600! Expanded view of one thread: !https://dl-web.dropbox.com/get/Screen%20Shot%202014-12-18%20at%204.06.29%20PM.png?_subject_uid=303980955&w=AACUO4wrbxheRUxv8fwQ9P52T6gBOm5_g9zeIe8odu3V3w|height=400,width=600! The cassandra version we are using now (2.0.4) utilizes MemoryAwareThreadPoolExecutor as the request executor and provides a default request size estimator which constantly returns 1, meaning it limits only the number of requests being pushed to the pool. To have more fine-grained control over handling requests and better protect our nodes from OOM issues, we propose implementing a more precise estimator. Here are our two cents: For update/delete/insert requests, the size could be estimated by adding the sizes of all class members together. For a scan query, the major part of the request is the response, which can be estimated from historical data. For example, if we receive a scan query on a column family for a certain token range, we keep track of its response size and use it as the estimated response size for later scan queries on the same cf. For future requests on the same cf, the response size could be calculated as queried token range * recorded size / recorded token range. The request size should then be estimated as (query size + estimated response size). We believe what we're proposing here can be useful for other people in the Cassandra community as well. Would you mind giving us feedback? Please let us know if you have any concerns or suggestions regarding this proposal. Thanks, Cheng
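The scaling rule in the proposal is plain arithmetic and can be sketched as follows. The class and method names below are hypothetical illustrations, not part of the Cassandra codebase:

```java
// Hypothetical sketch of the estimator proposed in this ticket; names are
// illustrative, not Cassandra APIs.
public final class RequestSizeEstimator
{
    private RequestSizeEstimator() {}

    // Scan responses: scale the response size recorded for an earlier query on
    // the same column family by the ratio of the queried to recorded token range.
    public static long estimateScanResponseBytes(long recordedResponseBytes,
                                                 long recordedTokenRange,
                                                 long queriedTokenRange)
    {
        if (recordedTokenRange <= 0)
            return recordedResponseBytes; // no usable history; fall back to the raw observation
        return queriedTokenRange * recordedResponseBytes / recordedTokenRange;
    }

    // Total in-flight cost of a request: query size plus estimated response size.
    public static long estimateRequestBytes(long queryBytes, long estimatedResponseBytes)
    {
        return queryBytes + estimatedResponseBytes;
    }
}
```

So a scan covering half the recorded token range would be charged roughly half the recorded response size, plus the size of the query itself.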
[jira] [Commented] (CASSANDRA-8689) Assertion error in 2.1.2: ERROR [IndexSummaryManager:1]
[ https://issues.apache.org/jira/browse/CASSANDRA-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14301596#comment-14301596 ] Jeff Liu commented on CASSANDRA-8689: - Surprised to know that there is so much information behind this blurry java assertion error. Thank you Benedict for the info. What is the impact of this issue without a fix? Are we in some kind of compaction danger? I guess I'm really asking about the criticality of this issue. Thanks. Assertion error in 2.1.2: ERROR [IndexSummaryManager:1] --- Key: CASSANDRA-8689 URL: https://issues.apache.org/jira/browse/CASSANDRA-8689 Project: Cassandra Issue Type: Bug Reporter: Jeff Liu Fix For: 2.1.3
[jira] [Commented] (CASSANDRA-8518) Impose In-Flight Data Limit
[ https://issues.apache.org/jira/browse/CASSANDRA-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14301627#comment-14301627 ] Benedict commented on CASSANDRA-8518: - Cell and all its implementations offer an unsharedHeapSizeExcludingData(), which could be expanded to provide just unsharedHeapSize() quite easily. These generally depend on ObjectSizes, which can tell you how much space is used by e.g. a ByteBuffer on heap. Impose In-Flight Data Limit --- Key: CASSANDRA-8518 URL: https://issues.apache.org/jira/browse/CASSANDRA-8518 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Cheng Ren Labels: performance
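The extension Benedict suggests could look roughly like the sketch below. SizedCell, dataSizeOnHeap() and the byte counts are stand-ins for illustration, not the real org.apache.cassandra.db.Cell interface or ObjectSizes values:

```java
// Hedged sketch: a stand-in interface showing how an existing "size excluding
// data" method could be extended to a full unshared heap size.
interface SizedCell
{
    long unsharedHeapSizeExcludingData(); // per-implementation fixed overhead
    long dataSizeOnHeap();                // e.g. what an ObjectSizes-style helper reports for the value buffer

    // The proposed addition: total unshared heap footprint of the cell.
    default long unsharedHeapSize()
    {
        return unsharedHeapSizeExcludingData() + dataSizeOnHeap();
    }
}

public final class CellSizeDemo implements SizedCell
{
    public long unsharedHeapSizeExcludingData() { return 48; } // made-up overhead
    public long dataSizeOnHeap()                { return 16; } // made-up payload size
}
```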
[jira] [Commented] (CASSANDRA-8717) Top-k queries with custom secondary indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-8717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14301634#comment-14301634 ] Aleksey Yeschenko commented on CASSANDRA-8717: -- See also - CASSANDRA-7017. @jhaliday might have something to add, too. It is a small patch, but it does touch the internals, subtly changing behavior that may or may not be taken into account by the rest of the C* codebase. My autopilot reaction is to say 'no' to any potentially breaking changes when it comes to minor C* releases. The instabilities we had with the 2.1 line so far (hopefully in the past) make me even more careful and more aggressive about pushing stuff to 'Later'. Top-k queries with custom secondary indexes --- Key: CASSANDRA-8717 URL: https://issues.apache.org/jira/browse/CASSANDRA-8717 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Andrés de la Peña Assignee: Andrés de la Peña Priority: Minor Labels: 2i, secondary_index, sort, sorting, top-k Fix For: 3.0 Attachments: 0001-Add-support-for-top-k-queries-in-2i.patch As presented in [Cassandra Summit Europe 2014|https://www.youtube.com/watch?v=Hg5s-hXy_-M], secondary indexes can be modified to support general top-k queries with minimum changes in the Cassandra codebase. This way, custom 2i implementations could provide relevance search, sorting by columns, etc. Top-k queries retrieve the k best results for a certain query. That implies querying the k best rows in each token range and then sorting them in order to obtain the k globally best rows. For doing that, we propose two additional methods in class SecondaryIndexSearcher:
{code:java}
public boolean requiresFullScan(List<IndexExpression> clause)
{
    return false;
}

public List<Row> sort(List<IndexExpression> clause, List<Row> rows)
{
    return rows;
}
{code}
The first one indicates if a query performed against the index requires querying all the nodes in the ring. It is necessary in top-k queries because we do not know which nodes have the best results. The second method specifies how to sort all the partial node results according to the query. Then we add two similar methods to the class AbstractRangeCommand:
{code:java}
this.searcher = Keyspace.open(keyspace).getColumnFamilyStore(columnFamily).indexManager.searcher(rowFilter);

public boolean requiresFullScan()
{
    return searcher == null ? false : searcher.requiresFullScan(rowFilter);
}

public List<Row> combine(List<Row> rows)
{
    return searcher == null ? trim(rows) : trim(searcher.sort(rowFilter, rows));
}
{code}
Finally, we modify StorageProxy#getRangeSlice to use the previous methods, as shown in the attached patch. We think that the proposed approach provides very useful functionality with minimum impact on the current codebase.
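The combine step described above amounts to: gather each token range's k best rows, sort the merged list by the index's ordering, and trim to k. A minimal self-contained sketch, where TopKCombiner and ScoredRow are illustrative stand-ins rather than classes from the patch:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch of the top-k combine step: each token range contributes
// its k best rows; merging, sorting by score and trimming yields the global top-k.
public final class TopKCombiner
{
    // Stand-in for a row plus the relevance score the custom index assigns it.
    public record ScoredRow(String key, double score) {}

    public static List<ScoredRow> combine(List<List<ScoredRow>> perRangeResults, int k)
    {
        List<ScoredRow> merged = new ArrayList<>();
        perRangeResults.forEach(merged::addAll);                              // gather partial results
        merged.sort(Comparator.comparingDouble(ScoredRow::score).reversed()); // best first
        return merged.subList(0, Math.min(k, merged.size()));                 // trim to k
    }
}
```

Note that correctness relies on each range having already returned its own top k; that is why the searcher must see all ranges (the requiresFullScan condition) before trimming globally.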
[jira] [Commented] (CASSANDRA-8717) Top-k queries with custom secondary indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-8717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14301643#comment-14301643 ] Aleksey Yeschenko commented on CASSANDRA-8717: -- That's [~jhalliday], sorry. Top-k queries with custom secondary indexes --- Key: CASSANDRA-8717 URL: https://issues.apache.org/jira/browse/CASSANDRA-8717 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Andrés de la Peña Assignee: Andrés de la Peña Priority: Minor Labels: 2i, secondary_index, sort, sorting, top-k Fix For: 3.0 Attachments: 0001-Add-support-for-top-k-queries-in-2i.patch
[jira] [Updated] (CASSANDRA-8715) Possible Deadlock in Cqlsh in a Kerberos-enabled environment when using COPY ... FROM ...
[ https://issues.apache.org/jira/browse/CASSANDRA-8715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8715: --- Priority: Major (was: Critical) Possible Deadlock in Cqlsh in a Kerberos-enabled environment when using COPY ... FROM ... --- Key: CASSANDRA-8715 URL: https://issues.apache.org/jira/browse/CASSANDRA-8715 Project: Cassandra Issue Type: Bug Environment: Cassandra 2.1.2.160, cqlsh 5.0.1, Native protocol v3 Reporter: Eduard Tudenhoefner Assignee: Tyler Hobbs Labels: cqlsh Fix For: 2.1.3 When running a COPY ... FROM ... command in a Kerberos environment, I see the number of rows processed, but cqlsh eventually never returns. I can verify that all the data was copied, but the progress bar sticks at the last shown info and cqlsh hangs there and never returns. Please note that this issue did *not* occur in the exact same environment with *Cassandra 2.0.12.156*. With the help of Tyler Hobbs, I investigated the problem a little further and added some debug statements at specific points. For example, in the CountdownLatch class at https://github.com/apache/cassandra/blob/a323a1a6d5f28ced1a51ba559055283f3eb356ff/pylib/cqlshlib/async_insert.py#L35-L36 I can see that the counter always stays above zero, and therefore the wait never returns (even when the data to be copied has already been copied). I've also seen that when I type in one cqlsh command, there will actually be two commands. Let me give you an example: I added a debug statement just before https://github.com/apache/cassandra/blob/d76450c7986202141f3a917b3623a4c3138c1094/bin/cqlsh#L920
{code}
cqlsh> use libdata ;
2015-01-30 18:54:56,113 [DEBUG] root: STATEMENT: [('K_USE', 'use', (0, 3)), ('identifier', 'libdata', (4, 11)), ('endtoken', ';', (12, 13))]
2015-01-30 18:54:56,113 [DEBUG] root: STATEMENT: [('K_USE', 'use', (0, 3)), ('identifier', 'libdata', (4, 11)), ('endtoken', ';', (12, 13))]
{code}
and saw that all the commands I enter end up being executed twice (the same goes for the COPY command). If I can provide any other input for debugging purposes, please let me know.
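As a toy model of the hang hypothesis (not cqlsh code): if each typed command registers two units of work on the completion latch, matching the duplicated STATEMENT debug lines, but completion is only signalled once per command, the count stays positive and the final wait never returns. The class and counting scheme below are purely illustrative assumptions:

```java
// Toy model of the suspected latch imbalance -- NOT code from cqlsh.
public final class LatchModel
{
    public static int remainingCount(int typedCommands)
    {
        int count = 0;
        for (int i = 0; i < typedCommands; i++)
            count += 2; // each command registered twice (the observed duplication)
        for (int i = 0; i < typedCommands; i++)
            count -= 1; // but completion is signalled only once per command
        return count;   // positive whenever typedCommands > 0, so an await would hang
    }
}
```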
[jira] [Updated] (CASSANDRA-8715) Possible Deadlock in Cqlsh in a Kerberos-enabled environment when using COPY ... FROM ...
[ https://issues.apache.org/jira/browse/CASSANDRA-8715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-8715: Priority: Minor (was: Critical) Possible Deadlock in Cqlsh in a Kerberos-enabled environment when using COPY ... FROM ... --- Key: CASSANDRA-8715 URL: https://issues.apache.org/jira/browse/CASSANDRA-8715 Project: Cassandra Issue Type: Bug Environment: Cassandra 2.1.2.160, cqlsh 5.0.1, Native protocol v3 Reporter: Eduard Tudenhoefner Assignee: Tyler Hobbs Priority: Minor Labels: cqlsh Fix For: 2.1.3
[jira] [Updated] (CASSANDRA-8715) Possible Deadlock in Cqlsh in a Kerberos-enabled environment when using COPY ... FROM ...
[ https://issues.apache.org/jira/browse/CASSANDRA-8715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-8715: Priority: Minor (was: Major) Possible Deadlock in Cqlsh in a Kerberos-enabled environment when using COPY ... FROM ... --- Key: CASSANDRA-8715 URL: https://issues.apache.org/jira/browse/CASSANDRA-8715 Project: Cassandra Issue Type: Bug Environment: Cassandra 2.1.2.160, cqlsh 5.0.1, Native protocol v3 Reporter: Eduard Tudenhoefner Assignee: Tyler Hobbs Priority: Minor Labels: cqlsh Fix For: 2.1.3
[jira] [Updated] (CASSANDRA-8390) The process cannot access the file because it is being used by another process
[ https://issues.apache.org/jira/browse/CASSANDRA-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yuan Cho updated CASSANDRA-8390: Attachment: FSD.PNG Attached my file system drivers. Is there anything that might affect Cassandra there? The process cannot access the file because it is being used by another process -- Key: CASSANDRA-8390 URL: https://issues.apache.org/jira/browse/CASSANDRA-8390 Project: Cassandra Issue Type: Bug Reporter: Ilya Komolkin Assignee: Joshua McKenzie Fix For: 2.1.3 Attachments: CassandraDiedWithDiskAccessModeStandardLogs.7z, FSD.PNG, NoHostAvailableLogs.zip
[jira] [Commented] (CASSANDRA-8390) The process cannot access the file because it is being used by another process
[ https://issues.apache.org/jira/browse/CASSANDRA-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301558#comment-14301558 ] Joshua McKenzie commented on CASSANDRA-8390: This piqued my interest - it seems we could write a stand-alone .dll and distribute it with Cassandra, making use of the [Restart Manager|https://msdn.microsoft.com/library/cc948910.aspx] to [grab a list of who has file handles open|http://blogs.msdn.com/b/oldnewthing/archive/2012/02/17/10268840.aspx]. Even then, I'm not sure that would address our case where file system drivers are causing transient failures as I can't guarantee fs hooks show up in the queried [RmGetList results|https://msdn.microsoft.com/en-us/library/windows/desktop/aa373661%28v=vs.85%29.aspx] in the first place and and we'd likely race with that query anyway - but it's worth a shot. I'll put this on my list of things to look into, likely around the 3.1 era. The process cannot access the file because it is being used by another process -- Key: CASSANDRA-8390 URL: https://issues.apache.org/jira/browse/CASSANDRA-8390 Project: Cassandra Issue Type: Bug Reporter: Ilya Komolkin Assignee: Joshua McKenzie Fix For: 2.1.3 Attachments: CassandraDiedWithDiskAccessModeStandardLogs.7z, FSD.PNG, NoHostAvailableLogs.zip {code}21:46:27.810 [NonPeriodicTasks:1] ERROR o.a.c.service.CassandraDaemon - Exception in thread Thread[NonPeriodicTasks:1,5,main] org.apache.cassandra.io.FSWriteError: java.nio.file.FileSystemException: E:\Upsource_12391\data\cassandra\data\kernel\filechangehistory_t-a277b560764611e48c8e4915424c75fe\kernel-filechangehistory_t-ka-33-Index.db: The process cannot access the file because it is being used by another process. 
at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:135) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:121) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTable.delete(SSTable.java:113) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTableDeletingTask.run(SSTableDeletingTask.java:94) ~[cassandra-all-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTableReader$6.run(SSTableReader.java:664) ~[cassandra-all-2.1.1.jar:2.1.1] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_71] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_71] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178) ~[na:1.7.0_71] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292) ~[na:1.7.0_71] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_71] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71] Caused by: java.nio.file.FileSystemException: E:\Upsource_12391\data\cassandra\data\kernel\filechangehistory_t-a277b560764611e48c8e4915424c75fe\kernel-filechangehistory_t-ka-33-Index.db: The process cannot access the file because it is being used by another process. 
at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86) ~[na:1.7.0_71] at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97) ~[na:1.7.0_71] at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102) ~[na:1.7.0_71] at sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269) ~[na:1.7.0_71] at sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103) ~[na:1.7.0_71] at java.nio.file.Files.delete(Files.java:1079) ~[na:1.7.0_71] at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:131) ~[cassandra-all-2.1.1.jar:2.1.1] ... 11 common frames omitted{code}
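As background for the transient sharing violations discussed above: the usual workaround (and roughly the approach later Cassandra releases took on Windows) is to retry the delete with a short backoff, since locks held by AV scanners or filesystem filter drivers typically clear within milliseconds. A minimal sketch, not Cassandra's actual code — the helper name and retry parameters are illustrative:

```java
import java.io.IOException;
import java.nio.file.FileSystemException;
import java.nio.file.Files;
import java.nio.file.Path;

public class RetryingDelete {
    // Retry a delete a few times with exponential backoff; transient sharing
    // violations often clear quickly, so a handful of attempts usually suffices.
    static boolean deleteWithRetry(Path p, int attempts, long backoffMs) throws IOException {
        for (int i = 0; i < attempts; i++) {
            try {
                Files.deleteIfExists(p);
                return true;
            } catch (FileSystemException e) {
                if (i == attempts - 1) throw e;  // give up: rethrow the last failure
                try {
                    Thread.sleep(backoffMs << i);  // 1x, 2x, 4x, ... backoff
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("sstable-", "-Index.db");
        System.out.println(deleteWithRetry(tmp, 5, 10));  // true: deleted on first attempt
        System.out.println(Files.exists(tmp));            // false
    }
}
```

Note this only mitigates transient holders; a handle held indefinitely (the Restart Manager case above) still needs identification rather than retries.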
[jira] [Updated] (CASSANDRA-8723) Cassandra 2.1.2 Memory issue - java process memory usage continuously increases until killed by OOM killer
[ https://issues.apache.org/jira/browse/CASSANDRA-8723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Liu updated CASSANDRA-8723: Summary: Cassandra 2.1.2 Memory issue - java process memory usage continuously increases until killed by OOM killer (was: Cassandra 2.1.2 Memory issue - java process memory usage continuously increase until killed by OOM killer) Cassandra 2.1.2 Memory issue - java process memory usage continuously increases until killed by OOM killer -- Key: CASSANDRA-8723 URL: https://issues.apache.org/jira/browse/CASSANDRA-8723 Project: Cassandra Issue Type: Bug Reporter: Jeff Liu Issue: We have an on-going issue with cassandra nodes running with continuously increasing memory until killed by OOM. {noformat} Jan 29 10:15:41 cass-chisel19 kernel: [24533109.783481] Out of memory: Kill process 13919 (java) score 911 or sacrifice child Jan 29 10:15:41 cass-chisel19 kernel: [24533109.783557] Killed process 13919 (java) total-vm:18366340kB, anon-rss:6461472kB, file-rss:6684kB {noformat} System Profile: cassandra version 2.1.2 system: aws c1.xlarge instance with 8 cores, 7.1G memory. 
cassandra jvm: -Xms1792M -Xmx1792M -Xmn400M -Xss256k {noformat} java -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.8.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1792M -Xmx1792M -Xmn400M -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=103 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseTLAB -XX:+CMSClassUnloadingEnabled -XX:+UseCondCardMark -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -XX:+PrintPromotionFailure -Xloggc:/var/log/cassandra/gc-1421511249.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=48M -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=7199 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -javaagent:/usr/share/java/graphite-reporter-agent-1.0-SNAPSHOT.jar=graphiteServer=metrics-a.hq.nest.com;graphitePort=2003;graphitePollInt=60 -Dlogback.configurationFile=logback.xml -Dcassandra.logdir=/var/log/cassandra -Dcassandra.storagedir= -Dcassandra-pidfile=/var/run/cassandra/cassandra.pid -cp 
/etc/cassandra:/usr/share/cassandra/lib/airline-0.6.jar:/usr/share/cassandra/lib/antlr-runtime-3.5.2.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang3-3.1.jar:/usr/share/cassandra/lib/commons-math3-3.2.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.4.jar:/usr/share/cassandra/lib/disruptor-3.0.1.jar:/usr/share/cassandra/lib/guava-16.0.jar:/usr/share/cassandra/lib/high-scale-lib-1.0.6.jar:/usr/share/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/cassandra/lib/jamm-0.2.8.jar:/usr/share/cassandra/lib/javax.inject.jar:/usr/share/cassandra/lib/jbcrypt-0.3m.jar:/usr/share/cassandra/lib/jline-1.0.jar:/usr/share/cassandra/lib/jna-4.0.0.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.9.1.jar:/usr/share/cassandra/lib/logback-classic-1.1.2.jar:/usr/share/cassandra/lib/logback-core-1.1.2.jar:/usr/share/cassandra/lib/lz4-1.2.0.jar:/usr/share/cassandra/lib/metrics-core-2.2.0.jar:/usr/share/cassandra/lib/metrics-graphite-2.2.0.jar:/usr/share/cassandra/lib/mx4j-tools.jar:/usr/share/cassandra/lib/netty-all-4.0.23.Final.jar:/usr/share/cassandra/lib/reporter-config-2.1.0.jar:/usr/share/cassandra/lib/slf4j-api-1.7.2.jar:/usr/share/cassandra/lib/snakeyaml-1.11.jar:/usr/share/cassandra/lib/snappy-java-1.0.5.2.jar:/usr/share/cassandra/lib/stream-2.5.2.jar:/usr/share/cassandra/lib/stringtemplate-4.0.2.jar:/usr/share/cassandra/lib/super-csv-2.1.0.jar:/usr/share/cassandra/lib/thrift-server-0.3.7.jar:/usr/share/cassandra/apache-cassandra-2.1.2.jar:/usr/share/cassandra/apache-cassandra-thrift-2.1.2.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/cassandra/cassandra-driver-core-2.0.5.jar:/usr/share/cassandra/netty-3.9.0.Final.jar:/usr/share/cassandra/stress.jar: -XX:HeapDumpPath=/var/lib/cassandra/java_1421511248.hprof 
-XX:ErrorFile=/var/lib/cassandra/hs_err_1421511248.log org.apache.cassandra.service.CassandraDaemon {noformat}
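Context for the numbers in the report above: {{-Xmx1792M}} caps only the Java heap, yet the OOM-killer line shows ~6.3 GB of anon-rss. The gap is native memory — direct NIO buffers, mmapped sstables, GC and thread-stack overhead — which the heap limit does not bound. A minimal illustration (not Cassandra code) of a direct allocation that lives outside the heap:

```java
import java.nio.ByteBuffer;

public class OffHeapDemo {
    public static void main(String[] args) {
        // -Xmx caps only the Java heap; direct buffers are allocated from
        // native memory and show up in the process RSS, not in heap usage.
        ByteBuffer direct = ByteBuffer.allocateDirect(16 * 1024 * 1024); // 16 MB off-heap
        System.out.println(direct.isDirect());   // true
        System.out.println(direct.capacity());   // 16777216
    }
}
```

Tracking RSS growth against heap occupancy (e.g. via the GC logs configured above) is one way to confirm whether the leak is on-heap or native.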
[jira] [Updated] (CASSANDRA-8723) Cassandra 2.1.2 Memory issue - java process memory usage continuously increases until process is killed by OOM killer
[ https://issues.apache.org/jira/browse/CASSANDRA-8723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Liu updated CASSANDRA-8723: Summary: Cassandra 2.1.2 Memory issue - java process memory usage continuously increases until process is killed by OOM killer (was: Cassandra 2.1.2 Memory issue - java process memory usage continuously increases until killed by OOM killer) Cassandra 2.1.2 Memory issue - java process memory usage continuously increases until process is killed by OOM killer - Key: CASSANDRA-8723 URL: https://issues.apache.org/jira/browse/CASSANDRA-8723 Project: Cassandra Issue Type: Bug Reporter: Jeff Liu Issue: We have an on-going issue with cassandra nodes running with continuously increasing memory until killed by OOM. {noformat} Jan 29 10:15:41 cass-chisel19 kernel: [24533109.783481] Out of memory: Kill process 13919 (java) score 911 or sacrifice child Jan 29 10:15:41 cass-chisel19 kernel: [24533109.783557] Killed process 13919 (java) total-vm:18366340kB, anon-rss:6461472kB, file-rss:6684kB {noformat} System Profile: cassandra version 2.1.2 system: aws c1.xlarge instance with 8 cores, 7.1G memory. 
[jira] [Updated] (CASSANDRA-8723) Cassandra 2.1.2 Memory issue - java process memory usage continuously increase until killed by OOM killer
[ https://issues.apache.org/jira/browse/CASSANDRA-8723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Liu updated CASSANDRA-8723: Summary: Cassandra 2.1.2 Memory issue - java process memory usage continuously increase until killed by OOM killer (was: Cassandra 2.1.2 Memory issue - java process memory continuously increase until killed by OOM killer) Cassandra 2.1.2 Memory issue - java process memory usage continuously increase until killed by OOM killer - Key: CASSANDRA-8723 URL: https://issues.apache.org/jira/browse/CASSANDRA-8723 Project: Cassandra Issue Type: Bug Reporter: Jeff Liu Issue: We have an on-going issue with cassandra nodes running with continuously increasing memory until killed by OOM. {noformat} Jan 29 10:15:41 cass-chisel19 kernel: [24533109.783481] Out of memory: Kill process 13919 (java) score 911 or sacrifice child Jan 29 10:15:41 cass-chisel19 kernel: [24533109.783557] Killed process 13919 (java) total-vm:18366340kB, anon-rss:6461472kB, file-rss:6684kB {noformat} System Profile: cassandra version 2.1.2 system: aws c1.xlarge instance with 8 cores, 7.1G memory. 
[jira] [Created] (CASSANDRA-8725) CassandraStorage erroring due to system keyspace schema changes
Philip Thompson created CASSANDRA-8725: -- Summary: CassandraStorage erroring due to system keyspace schema changes Key: CASSANDRA-8725 URL: https://issues.apache.org/jira/browse/CASSANDRA-8725 Project: Cassandra Issue Type: Bug Components: Hadoop Reporter: Philip Thompson Assignee: Philip Thompson Fix For: 3.0 CassandraStorage will be deprecated in 3.0, but is currently not working because it is selecting {{key_aliases}} from the system.schema_columnfamilies table, and the column no longer exists. This is causing about half of the pig-tests to fail.
[jira] [Commented] (CASSANDRA-8358) Bundled tools shouldn't be using Thrift API
[ https://issues.apache.org/jira/browse/CASSANDRA-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302766#comment-14302766 ] Philip Thompson commented on CASSANDRA-8358: What should be done about the fact that this depends upon version 2.1.5 of the java driver, which is not yet released? I assume the easiest thing to do is wait for that release, then bundle the appropriate jar. Currently mvn install needs to be run against a snapshot jar in order for this to build. Bundled tools shouldn't be using Thrift API --- Key: CASSANDRA-8358 URL: https://issues.apache.org/jira/browse/CASSANDRA-8358 Project: Cassandra Issue Type: Improvement Reporter: Aleksey Yeschenko Assignee: Philip Thompson Fix For: 3.0 In 2.1, we switched cqlsh to the python-driver. In 3.0, we got rid of cassandra-cli. Yet there is still code that's using legacy Thrift API. We want to convert it all to use the java-driver instead. 1. BulkLoader uses Thrift to query the schema tables. It should be using java-driver metadata APIs directly instead. 2. o.a.c.hadoop.cql3.CqlRecordWriter is using Thrift 3. o.a.c.hadoop.ColumnFamilyRecordReader is using Thrift 4. o.a.c.hadoop.AbstractCassandraStorage is using Thrift 5. o.a.c.hadoop.pig.CqlStorage is using Thrift Some of the things listed above use Thrift to get the list of partition key columns or clustering columns. Those should be converted to use the Metadata API of the java-driver. Somewhat related to that, we also have badly ported code from Thrift in o.a.c.hadoop.cql3.CqlRecordReader (see fetchKeys()) that manually fetches columns from schema tables instead of properly using the driver's Metadata API. We need all of it fixed. One exception, for now, is o.a.c.hadoop.AbstractColumnFamilyInputFormat - it's using Thrift for its describe_splits_ex() call that cannot be currently replaced by any java-driver call (?). Once this is done, we can stop starting Thrift RPC port by default in cassandra.yaml. 
[jira] [Updated] (CASSANDRA-8086) Cassandra should have ability to limit the number of native connections
[ https://issues.apache.org/jira/browse/CASSANDRA-8086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Norman Maurer updated CASSANDRA-8086: - Attachment: 0001-CASSANDRA-8086-Allow-to-limit-the-number-of-native-c-2.0.patch Patch which applies to the cassandra-2.0 branch Cassandra should have ability to limit the number of native connections --- Key: CASSANDRA-8086 URL: https://issues.apache.org/jira/browse/CASSANDRA-8086 Project: Cassandra Issue Type: Bug Reporter: Vishy Kasar Assignee: Norman Maurer Fix For: 2.1.3 Attachments: 0001-CASSANDRA-8086-Allow-to-limit-the-number-of-native-c-2.0.patch, 0001-CASSANDRA-8086-Allow-to-limit-the-number-of-native-c.patch, 0001-CASSANDRA-8086-Allow-to-limit-the-number-of-native-c.txt We have a production cluster with 72 instances spread across 2 DCs. We have a large number ( ~ 40,000 ) of clients hitting this cluster. Client normally connects to 4 cassandra instances. Some event (we think it is a schema change on server side) triggered the client to establish connections to all cassandra instances of local DC. This brought the server to its knees. The client connections failed and client attempted re-connections. Cassandra should protect itself from such attack from client. Do we have any knobs to control the number of max connections? If not, we need to add that knob.
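A connection cap of the kind requested above usually amounts to a simple admission counter consulted when a new channel opens: admit up to the limit, reject (close) after. A hypothetical sketch — the class and method names are illustrative, not the attached patch's API:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of a guard a native-protocol server could consult on channelActive:
// tryAcquire() before serving the connection, release() when it closes.
public class ConnectionLimiter {
    private final int max;
    private final AtomicInteger open = new AtomicInteger();

    public ConnectionLimiter(int max) { this.max = max; }

    public boolean tryAcquire() {
        for (;;) {
            int cur = open.get();
            if (cur >= max) return false;               // at the cap: caller closes the channel
            if (open.compareAndSet(cur, cur + 1)) return true;  // CAS avoids over-admitting under races
        }
    }

    public void release() { open.decrementAndGet(); }

    public static void main(String[] args) {
        ConnectionLimiter limiter = new ConnectionLimiter(2);
        System.out.println(limiter.tryAcquire());  // true
        System.out.println(limiter.tryAcquire());  // true
        System.out.println(limiter.tryAcquire());  // false, at the cap
        limiter.release();                         // one client disconnects
        System.out.println(limiter.tryAcquire());  // true again
    }
}
```

The CAS loop matters here: a plain check-then-increment could admit more than `max` connections when many clients reconnect simultaneously, which is exactly the stampede scenario described.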
[jira] [Updated] (CASSANDRA-8683) Ensure early reopening has no overlap with replaced files, and that SSTableReader.first/last are honoured universally
[ https://issues.apache.org/jira/browse/CASSANDRA-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan McGuire updated CASSANDRA-8683: Tester: Alan Boudreault Ensure early reopening has no overlap with replaced files, and that SSTableReader.first/last are honoured universally - Key: CASSANDRA-8683 URL: https://issues.apache.org/jira/browse/CASSANDRA-8683 Project: Cassandra Issue Type: Bug Reporter: Marcus Eriksson Assignee: Benedict Priority: Critical Fix For: 2.1.3 Attachments: 0001-avoid-NPE-in-getPositionsForRanges.patch Incremental repairs holds a set of the sstables it started the repair on (we need to know which sstables were actually validated to be able to anticompact them). This includes any tmplink files that existed when the compaction started (if we wouldn't include those, we would miss data since we move the start point of the existing non-tmplink files) With CASSANDRA-6916 we swap out those instances with new ones (SSTR.cloneWithNewStart / SSTW.openEarly), meaning that the underlying file can get deleted even though we hold a reference. This causes the unit test error: http://cassci.datastax.com/job/trunk_utest/1330/testReport/junit/org.apache.cassandra.db.compaction/LeveledCompactionStrategyTest/testValidationMultipleSSTablePerLevel/ (note that it only fails on trunk though, in 2.1 we don't hold references to the repairing files for non-incremental repairs, but the bug should exist in 2.1 as well)
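The hazard described above is a reference-counting invariant: the backing file may only be deleted once every holder has released its reference, and that breaks when a cloned replacement reader keeps an independent count, letting the original's count reach zero while the data is still in use. A simplified sketch of the pattern (names are illustrative, not SSTableReader's actual API):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Minimal shared-reference sketch: cleanup runs exactly once, only after the
// last holder releases. The bug class above arises when two wrappers over the
// same file each keep a count like this one instead of sharing it.
public class SharedRef {
    private final AtomicInteger refs = new AtomicInteger(1); // creator holds one ref
    private final Runnable onLastRelease;

    public SharedRef(Runnable onLastRelease) { this.onLastRelease = onLastRelease; }

    /** Take an extra reference; fails if the resource was already released. */
    public boolean tryRef() {
        for (;;) {
            int n = refs.get();
            if (n == 0) return false;                      // too late: already cleaned up
            if (refs.compareAndSet(n, n + 1)) return true;
        }
    }

    public void release() {
        if (refs.decrementAndGet() == 0) onLastRelease.run();  // e.g. delete the file
    }

    public static void main(String[] args) {
        boolean[] deleted = {false};
        SharedRef ref = new SharedRef(() -> deleted[0] = true);
        ref.tryRef();                       // second holder, e.g. an in-flight repair
        ref.release();                      // creator done
        System.out.println(deleted[0]);     // false: repair still holds a ref
        ref.release();                      // last holder done
        System.out.println(deleted[0]);     // true: cleanup ran
    }
}
```

With a shared count, "swap out those instances with new ones" must transfer or share the count; otherwise the old instance's final release deletes a file the new instance still reads.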
[jira] [Created] (CASSANDRA-8719) Using thrift HSHA with offheap_objects appears to corrupt data
Randy Fradin created CASSANDRA-8719: --- Summary: Using thrift HSHA with offheap_objects appears to corrupt data Key: CASSANDRA-8719 URL: https://issues.apache.org/jira/browse/CASSANDRA-8719 Project: Cassandra Issue Type: Bug Components: Core Reporter: Randy Fradin Copying my comment from CASSANDRA-6285 to a new issue since that issue is long closed and I'm not sure if they are related... I am getting this exception using Thrift HSHA in 2.1.0: {quote} INFO [CompactionExecutor:8] 2015-01-26 13:32:51,818 CompactionTask.java (line 138) Compacting [SSTableReader(path='/tmp/cass_test/cassandra/TestCassandra/data/test_ks/test_cf-1c45da40a58911e4826751fbbc77b187/test_ks-test_cf-ka-2-Data.db'), SSTableReader(path='/tmp/cass_test/cassandra/TestCassandra/data/test_ks/test_cf-1c45da40a58911e4826751fbbc77b187/test_ks-test_cf-ka-1-Data.db')] INFO [CompactionExecutor:8] 2015-01-26 13:32:51,890 ColumnFamilyStore.java (line 856) Enqueuing flush of compactions_in_progress: 212 (0%) on-heap, 20 (0%) off-heap INFO [MemtableFlushWriter:8] 2015-01-26 13:32:51,892 Memtable.java (line 326) Writing Memtable-compactions_in_progress@1155018639(0 serialized bytes, 1 ops, 0%/0% of on/off-heap limit) INFO [MemtableFlushWriter:8] 2015-01-26 13:32:51,896 Memtable.java (line 360) Completed flushing /tmp/cass_test/cassandra/TestCassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-2-Data.db (42 bytes) for commitlog position ReplayPosition(segmentId=1422296630707, position=430226) ERROR [CompactionExecutor:8] 2015-01-26 13:32:51,906 CassandraDaemon.java (line 166) Exception in thread Thread[CompactionExecutor:8,1,RMI Runtime] java.lang.RuntimeException: Last written key DecoratedKey(131206587314004820534098544948237170809, 80010001000c62617463685f6d757461746500) >= current key DecoratedKey(14775611966645399672119169777260659240, 726f776b65793030385f31343232323937313537353835) writing into 
/tmp/cass_test/cassandra/TestCassandra/data/test_ks/test_cf-1c45da40a58911e4826751fbbc77b187/test_ks-test_cf-tmp-ka-3-Data.db at org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:172) ~[apache-cassandra-2.1.0.jar:2.1.0] at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:196) ~[apache-cassandra-2.1.0.jar:2.1.0] at org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:110) ~[apache-cassandra-2.1.0.jar:2.1.0] at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:177) ~[apache-cassandra-2.1.0.jar:2.1.0] at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48) ~[apache-cassandra-2.1.0.jar:2.1.0] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.1.0.jar:2.1.0] at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:74) ~[apache-cassandra-2.1.0.jar:2.1.0] at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59) ~[apache-cassandra-2.1.0.jar:2.1.0] at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:235) ~[apache-cassandra-2.1.0.jar:2.1.0] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_40] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_40] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_40] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_40] at java.lang.Thread.run(Thread.java:724) [na:1.7.0_40] {quote} I don't think it's caused by CASSANDRA-8211, because it happens during the first compaction that takes place between the first 2 SSTables to get flushed from an initially empty column family. 
Also, I've only been able to reproduce it when using both *hsha* for the rpc server and *offheap_objects* for memtable allocation. If I switch either to sync or to offheap_buffers or heap_buffers then I cannot reproduce the problem. Also under the same circumstances I'm pretty sure I've seen incorrect data being returned to a client multiget_slice request before any SSTables had been flushed yet, so I presume this is corruption that happens before any flush/compaction takes place. nodetool scrub yielded these errors: {quote} INFO [CompactionExecutor:9] 2015-01-26 13:48:01,512 OutputHandler.java (line 42) Scrubbing SSTableReader(path='/tmp/cass_test/cassandra/TestCassandra/data/test_ks/test_cf-1c45da40a58911e4826751fbbc77b187/test_ks-test_cf-ka-2-Data.db') (168780 bytes) INFO [CompactionExecutor:10] 2015-01-26 13:48:01,512 OutputHandler.java (line
[jira] [Comment Edited] (CASSANDRA-6285) 2.0 HSHA server introduces corrupt data
[ https://issues.apache.org/jira/browse/CASSANDRA-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292451#comment-14292451 ] Randy Fradin edited comment on CASSANDRA-6285 at 2/2/15 7:26 PM: - I am getting this exception using Thrift HSHA in 2.1.0: {quote} INFO [CompactionExecutor:8] 2015-01-26 13:32:51,818 CompactionTask.java (line 138) Compacting [SSTableReader(path='/tmp/cass_test/cassandra/TestCassandra/data/test_ks/test_cf-1c45da40a58911e4826751fbbc77b187/test_ks-test_cf-ka-2-Data.db'), SSTableReader(path='/tmp/cass_test/cassandra/TestCassandra/data/test_ks/test_cf-1c45da40a58911e4826751fbbc77b187/test_ks-test_cf-ka-1-Data.db')] INFO [CompactionExecutor:8] 2015-01-26 13:32:51,890 ColumnFamilyStore.java (line 856) Enqueuing flush of compactions_in_progress: 212 (0%) on-heap, 20 (0%) off-heap INFO [MemtableFlushWriter:8] 2015-01-26 13:32:51,892 Memtable.java (line 326) Writing Memtable-compactions_in_progress@1155018639(0 serialized bytes, 1 ops, 0%/0% of on/off-heap limit) INFO [MemtableFlushWriter:8] 2015-01-26 13:32:51,896 Memtable.java (line 360) Completed flushing /tmp/cass_test/cassandra/TestCassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-2-Data.db (42 bytes) for commitlog position ReplayPosition(segmentId=1422296630707, position=430226) ERROR [CompactionExecutor:8] 2015-01-26 13:32:51,906 CassandraDaemon.java (line 166) Exception in thread Thread[CompactionExecutor:8,1,RMI Runtime] java.lang.RuntimeException: Last written key DecoratedKey(131206587314004820534098544948237170809, 80010001000c62617463685f6d757461746500) >= current key DecoratedKey(14775611966645399672119169777260659240, 726f776b65793030385f31343232323937313537353835) writing into /tmp/cass_test/cassandra/TestCassandra/data/test_ks/test_cf-1c45da40a58911e4826751fbbc77b187/test_ks-test_cf-tmp-ka-3-Data.db at 
org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:172) ~[apache-cassandra-2.1.0.jar:2.1.0] at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:196) ~[apache-cassandra-2.1.0.jar:2.1.0] at org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:110) ~[apache-cassandra-2.1.0.jar:2.1.0] at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:177) ~[apache-cassandra-2.1.0.jar:2.1.0] at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48) ~[apache-cassandra-2.1.0.jar:2.1.0] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.1.0.jar:2.1.0] at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:74) ~[apache-cassandra-2.1.0.jar:2.1.0] at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59) ~[apache-cassandra-2.1.0.jar:2.1.0] at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:235) ~[apache-cassandra-2.1.0.jar:2.1.0] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_40] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_40] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_40] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_40] at java.lang.Thread.run(Thread.java:724) [na:1.7.0_40] {quote} I don't think it's caused by CASSANDRA-8211, because it happens during the first compaction that takes place between the first 2 SSTables to get flushed from an initially empty column family. Also, I've only been able to reproduce it when using both *hsha* for the rpc server and *offheap_objects* for memtable allocation. If I switch either to sync or to offheap_buffers or heap_buffers then I cannot reproduce the problem. 
Also under the same circumstances I'm pretty sure I've seen incorrect data being returned to a client multiget_slice request before any SSTables had been flushed yet, so I presume this is corruption that happens before any flush/compaction takes place. nodetool scrub yielded these errors: {quote} INFO [CompactionExecutor:9] 2015-01-26 13:48:01,512 OutputHandler.java (line 42) Scrubbing SSTableReader(path='/tmp/cass_test/cassandra/TestCassandra/data/test_ks/test_cf-1c45da40a58911e4826751fbbc77b187/test_ks-test_cf-ka-2-Data.db') (168780 bytes) INFO [CompactionExecutor:10] 2015-01-26 13:48:01,512 OutputHandler.java (line 42) Scrubbing SSTableReader(path='/tmp/cass_test/cassandra/TestCassandra/data/test_ks/test_cf-1c45da40a58911e4826751fbbc77b187/test_ks-test_cf-ka-1-Data.db') (135024 bytes) WARN [CompactionExecutor:9] 2015-01-26
[jira] [Updated] (CASSANDRA-8719) Using thrift HSHA with offheap_objects appears to corrupt data
[ https://issues.apache.org/jira/browse/CASSANDRA-8719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philip Thompson updated CASSANDRA-8719:
---
Fix Version/s: 2.1.3

Using thrift HSHA with offheap_objects appears to corrupt data
--
Key: CASSANDRA-8719
URL: https://issues.apache.org/jira/browse/CASSANDRA-8719
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Randy Fradin
Fix For: 2.1.3

Copying my comment from CASSANDRA-6285 to a new issue since that issue is long closed and I'm not sure if they are related... I am getting this exception using Thrift HSHA in 2.1.0:
{quote}
INFO [CompactionExecutor:8] 2015-01-26 13:32:51,818 CompactionTask.java (line 138) Compacting [SSTableReader(path='/tmp/cass_test/cassandra/TestCassandra/data/test_ks/test_cf-1c45da40a58911e4826751fbbc77b187/test_ks-test_cf-ka-2-Data.db'), SSTableReader(path='/tmp/cass_test/cassandra/TestCassandra/data/test_ks/test_cf-1c45da40a58911e4826751fbbc77b187/test_ks-test_cf-ka-1-Data.db')]
INFO [CompactionExecutor:8] 2015-01-26 13:32:51,890 ColumnFamilyStore.java (line 856) Enqueuing flush of compactions_in_progress: 212 (0%) on-heap, 20 (0%) off-heap
INFO [MemtableFlushWriter:8] 2015-01-26 13:32:51,892 Memtable.java (line 326) Writing Memtable-compactions_in_progress@1155018639(0 serialized bytes, 1 ops, 0%/0% of on/off-heap limit)
INFO [MemtableFlushWriter:8] 2015-01-26 13:32:51,896 Memtable.java (line 360) Completed flushing /tmp/cass_test/cassandra/TestCassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-2-Data.db (42 bytes) for commitlog position ReplayPosition(segmentId=1422296630707, position=430226)
ERROR [CompactionExecutor:8] 2015-01-26 13:32:51,906 CassandraDaemon.java (line 166) Exception in thread Thread[CompactionExecutor:8,1,RMI Runtime]
java.lang.RuntimeException: Last written key DecoratedKey(131206587314004820534098544948237170809, 80010001000c62617463685f6d757461746500) >= current key DecoratedKey(14775611966645399672119169777260659240, 726f776b65793030385f31343232323937313537353835) writing into /tmp/cass_test/cassandra/TestCassandra/data/test_ks/test_cf-1c45da40a58911e4826751fbbc77b187/test_ks-test_cf-tmp-ka-3-Data.db
at org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:172) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:196) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:110) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:177) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:74) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59) ~[apache-cassandra-2.1.0.jar:2.1.0]
at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:235) ~[apache-cassandra-2.1.0.jar:2.1.0]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_40]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_40]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_40]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_40]
at java.lang.Thread.run(Thread.java:724) [na:1.7.0_40]
{quote}
I don't think it's caused by CASSANDRA-8211, because it happens during the first compaction that takes place between the first 2 SSTables to get flushed from an initially empty column family. Also, I've only been able to reproduce it when using both *hsha* for the rpc server and *offheap_objects* for memtable allocation. If I switch either to sync, or to offheap_buffers or heap_buffers, then I cannot reproduce the problem. Also, under the same circumstances I'm pretty sure I've seen incorrect data being returned to a client multiget_slice request before any SSTables had been flushed yet, so I presume this is corruption that happens before any flush/compaction takes place. nodetool scrub yielded these errors:
{quote}
INFO [CompactionExecutor:9]
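The RuntimeException above is SSTableWriter.beforeAppend enforcing the SSTable invariant that partition keys must be written in strictly increasing order, so the corruption surfaces as a key arriving out of order during compaction. A minimal sketch of that check (an illustrative Python model, not the Java implementation; the class and names are mine):

```python
class OutOfOrderKeyError(RuntimeError):
    """Raised when a partition key arrives out of order."""

class SSTableWriterSketch:
    """Toy model of SSTableWriter.beforeAppend: keys must be appended in
    strictly increasing order, or writing fails with the
    'Last written key ... >= current key ...' error seen in the log."""

    def __init__(self):
        self.last_key = None

    def append(self, key):
        if self.last_key is not None and key <= self.last_key:
            raise OutOfOrderKeyError(
                f"Last written key {self.last_key} >= current key {key}")
        self.last_key = key

writer = SSTableWriterSketch()
writer.append(100)
writer.append(200)
try:
    writer.append(150)  # out of order, as in the stack trace above
    violated = False
except OutOfOrderKeyError:
    violated = True
```

In the real system the keys are DecoratedKeys compared by token, but the ordering contract is the same; a memtable handing compaction an out-of-order key is what turns this invariant check into the exception above.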
[jira] [Updated] (CASSANDRA-8719) Using thrift HSHA with offheap_objects appears to corrupt data
[ https://issues.apache.org/jira/browse/CASSANDRA-8719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philip Thompson updated CASSANDRA-8719:
---
Assignee: Marcus Eriksson

Using thrift HSHA with offheap_objects appears to corrupt data
--
Key: CASSANDRA-8719
URL: https://issues.apache.org/jira/browse/CASSANDRA-8719
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Randy Fradin
Assignee: Marcus Eriksson
Fix For: 2.1.3
[jira] [Updated] (CASSANDRA-8712) Out-of-sync secondary index
[ https://issues.apache.org/jira/browse/CASSANDRA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-8712:
Assignee: (was: Sylvain Lebresne)

Out-of-sync secondary index
---
Key: CASSANDRA-8712
URL: https://issues.apache.org/jira/browse/CASSANDRA-8712
Project: Cassandra
Issue Type: Bug
Environment: 2.1.2
Reporter: mlowicki
Fix For: 2.1.3

I have a table with an index:
{code}
CREATE TABLE entity (
    user_id text,
    data_type_id int,
    version bigint,
    id text,
    cache_guid text,
    client_defined_unique_tag text,
    ctime timestamp,
    deleted boolean,
    folder boolean,
    mtime timestamp,
    name text,
    originator_client_item_id text,
    parent_id text,
    position blob,
    server_defined_unique_tag text,
    specifics blob,
    PRIMARY KEY (user_id, data_type_id, version, id)
) WITH CLUSTERING ORDER BY (data_type_id ASC, version ASC, id ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = '{keys:ALL, rows_per_partition:NONE}'
    AND comment = ''
    AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';

CREATE INDEX index_entity_parent_id ON entity (parent_id);
{code}
It turned out that the index became out of sync:
{code}
>>> Entity.objects.filter(user_id='255824802', parent_id=parent_id).consistency(6).count()
16
>>> counter = 0
>>> for e in Entity.objects.filter(user_id='255824802'):
...     if e.parent_id and e.parent_id == parent_id:
...         counter += 1
...
>>> counter
10
{code}
After a couple of hours it was fine (at night), but then, when the user presumably started to interact with the DB again, we got the same problem.
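The reporter's check compares two counts for the same parent_id: one answered through the secondary index and one from a manual scan of the partition (16 vs. 10 above); any difference means the index is stale. A self-contained toy model of that comparison (pure Python, no live cluster; the data and names are made up for illustration):

```python
# Toy model of the anomaly check: 'index' stands in for the secondary
# index and may hold stale entries, while 'rows' is the ground truth
# that a full partition scan would see.
rows = [
    {"id": "a", "parent_id": "p1"},
    {"id": "b", "parent_id": "p1"},
    {"id": "c", "parent_id": "p2"},
]
index = {"p1": {"a", "b", "stale"}, "p2": {"c"}}  # "stale" no longer exists

def index_count(parent_id):
    """Count via the (possibly stale) index, like the first query above."""
    return len(index.get(parent_id, set()))

def scan_count(parent_id):
    """Count via a full scan, like the manual loop above."""
    return sum(1 for r in rows if r["parent_id"] == parent_id)

out_of_sync = index_count("p1") != scan_count("p1")
```

When the counts diverge, the workaround the reporter settled on is {{nodetool rebuild_index}} with the keyspace, table, and index name (the exact index-name syntax varies between Cassandra versions; see {{nodetool help rebuild_index}}).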
As a temporary solution we'll try to rebuild the indexes from time to time, as suggested in http://dev.nuclearrooster.com/2013/01/20/using-nodetool-to-rebuild-secondary-indexes-in-cassandra/
I launched a simple script to check for this anomaly: before rebuilding the index, 10378 out of 4024856 folders had this issue.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)