[jira] [Updated] (CASSANDRA-8384) Change CREATE TABLE syntax for compression options
[ https://issues.apache.org/jira/browse/CASSANDRA-8384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-8384: - Assignee: Benjamin Lerer (was: Aleksey Yeschenko)

Change CREATE TABLE syntax for compression options -- Key: CASSANDRA-8384 URL: https://issues.apache.org/jira/browse/CASSANDRA-8384 Project: Cassandra Issue Type: Sub-task Reporter: Aleksey Yeschenko Assignee: Benjamin Lerer Fix For: 3.0 beta 1

Currently, `compression` table options are inconsistent with their analogues (table `compaction`, keyspace `replication`). I suggest we change this for 3.0, like we changed the `caching` syntax for 2.1 (while continuing to accept the old syntax for a release). I recommend the following changes:
1. rename `sstable_compression` to `class`, to make it consistent with `compaction` and `replication`
2. rename `chunk_length_kb` to `chunk_length_in_kb`, to match `memtable_flush_period_in_ms`, or, alternatively, to just `chunk_length`, with `memtable_flush_period_in_ms` renamed to `memtable_flush_period` - consistent with every other CQL option everywhere else
3. add a boolean `enabled` option, to match `compaction`. Currently, the official way to disable compression is an ugly, ugly hack (see CASSANDRA-8288)

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
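The renames in points 1 and 2 amount to a small mapping from old option names to new ones. As a rough illustration (a hypothetical helper, not Cassandra code - the names mirror the proposal above):

```python
# Hypothetical sketch of the proposed compression-option renames (not Cassandra code).
OLD_TO_NEW = {
    "sstable_compression": "class",
    "chunk_length_kb": "chunk_length_in_kb",
}

def normalize_compression_options(opts):
    """Map pre-3.0 compression option names to the proposed 3.0 names,
    leaving already-new names untouched (old syntax stays accepted for a release)."""
    return {OLD_TO_NEW.get(k, k): v for k, v in opts.items()}

old = {"sstable_compression": "LZ4Compressor", "chunk_length_kb": 64}
assert normalize_compression_options(old) == {"class": "LZ4Compressor", "chunk_length_in_kb": 64}
```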
[jira] [Commented] (CASSANDRA-8907) Raise GCInspector alerts to WARN
[ https://issues.apache.org/jira/browse/CASSANDRA-8907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585808#comment-14585808 ] Johnny Miller commented on CASSANDRA-8907: -- [~achowdhe] That sounds good to me.

Raise GCInspector alerts to WARN Key: CASSANDRA-8907 URL: https://issues.apache.org/jira/browse/CASSANDRA-8907 Project: Cassandra Issue Type: Improvement Reporter: Adam Hattrell

I'm fairly regularly running into folks wondering why their applications are reporting down nodes. Yet, they report, when they grep the logs they find no WARNs or ERRORs listed. Nine times out of ten, when I look through the logs we see a ton of ParNew or CMS GC pauses occurring, similar to the following:
INFO [ScheduledTasks:1] 2013-03-07 18:44:46,795 GCInspector.java (line 122) GC for ConcurrentMarkSweep: 1835 ms for 3 collections, 2606015656 used; max is 10611589120
INFO [ScheduledTasks:1] 2013-03-07 19:45:08,029 GCInspector.java (line 122) GC for ParNew: 9866 ms for 8 collections, 2910124308 used; max is 6358564864
To my mind these should be WARNs, as they have the potential to significantly impact the cluster's performance as a whole.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9590) Support for both encrypted and unencrypted native transport connections
[ https://issues.apache.org/jira/browse/CASSANDRA-9590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585813#comment-14585813 ] Mike Adamson commented on CASSANDRA-9590: - [~spod] TLS is a transport layer protocol, so it would be implemented by the native protocol handler. The encryption is negotiated after the socket connection is made, using a STARTTLS request from the client. If the client doesn't request TLS, it can carry on communicating without encryption.

Support for both encrypted and unencrypted native transport connections --- Key: CASSANDRA-9590 URL: https://issues.apache.org/jira/browse/CASSANDRA-9590 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stefan Podkowinski

Enabling encryption for the native transport currently turns SSL exclusively on or off for the opened socket. Migrating from plain to encrypted therefore requires migrating all native clients as well, and redeploying all of them at the same time after starting the SSL-enabled Cassandra nodes. This patch would allow starting Cassandra with both an unencrypted and an SSL-enabled native port. Clients can connect to either, based on whether they support SSL or not. This has been implemented by introducing a new {{native_transport_port_ssl}} config option. There would be three scenarios:
* client encryption disabled: native_transport_port unencrypted, port_ssl not used
* client encryption enabled, port_ssl not set: encrypted native_transport_port
* client encryption enabled and port_ssl set: native_transport_port unencrypted, port_ssl encrypted
This approach would keep configuration behavior fully backwards compatible.
Patch proposal (tests will be added later if people speak out in favor of the patch): [Diff trunk|https://github.com/apache/cassandra/compare/trunk...spodkowinski:feat/optionalnativessl], [Patch against trunk|https://github.com/apache/cassandra/compare/trunk...spodkowinski:feat/optionalnativessl.patch] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
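The three scenarios above can be summarized as a small decision table. A minimal sketch (illustrative Python, not the patch itself; only {{native_transport_port}} and {{native_transport_port_ssl}} are real option names from the proposal):

```python
# Illustrative model of the three native-port scenarios described above.
def native_ports(client_encryption_enabled, native_transport_port, native_transport_port_ssl=None):
    """Return a map of port -> 'plain' or 'ssl' for the native transport."""
    if not client_encryption_enabled:
        # encryption disabled: single unencrypted port, port_ssl not used
        return {native_transport_port: "plain"}
    if native_transport_port_ssl is None:
        # encryption enabled, no dedicated SSL port: the standard port is encrypted
        return {native_transport_port: "ssl"}
    # encryption enabled and port_ssl set: both sockets are offered
    return {native_transport_port: "plain", native_transport_port_ssl: "ssl"}

assert native_ports(False, 9042) == {9042: "plain"}
assert native_ports(True, 9042) == {9042: "ssl"}
assert native_ports(True, 9042, 9142) == {9042: "plain", 9142: "ssl"}
```

The last case is what makes the migration path backwards compatible: plain clients keep using the old port while SSL-capable clients move to the new one.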
[jira] [Commented] (CASSANDRA-9590) Support for both encrypted and unencrypted native transport connections
[ https://issues.apache.org/jira/browse/CASSANDRA-9590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585770#comment-14585770 ] Stefan Podkowinski commented on CASSANDRA-9590: --- [~mikea], I'm not sure how it would be possible to support both encrypted and unencrypted content over a TLS socket. TLS connections are initiated by a handshake protocol. Without TLS enabled, any native client won't be able to participate in the handshake. I'm not aware how a downgrade would work in this scenario, but I'd be grateful for further references on that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9591) Scrub (recover) sstables even when -Index.db is missing
[ https://issues.apache.org/jira/browse/CASSANDRA-9591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585740#comment-14585740 ] Stefania commented on CASSANDRA-9591: - The patch looks OK so far; I've pushed some minor changes in a separate commit on [this branch|https://github.com/stef1927/cassandra/commits/9591-2.0]. It's pretty straightforward stuff, but do let me know if you have any concerns. I have also started the integration to 2.1 on [this branch|https://github.com/stef1927/cassandra/commits/9591-2.1]. Unfortunately the code in SSTableReader has diverged a bit, so it's probably not working in 2.1 right now. These two branches will be picked up by our continuous integration server; the results will (eventually) be available [here|http://cassci.datastax.com/view/Dev/view/stef1927]. I will check tomorrow for any broken tests. I've added a very basic unit test. However, we need to add more tests, probably in _scrub_test.py_ of [dtests|https://github.com/riptano/cassandra-dtest]. Here we should test both standalone and nodetool scrub, albeit with an index in the latter case. I don't mind writing some more tests, as this area is a bit lacking (we do have some tests for scrubbing secondary indexes, but they are only active on >= 2.2). However, do let me know if you want to write them yourself. Once all the tests, new and existing, are passing, I will try to find a committer. Thanks for submitting the patch!
Scrub (recover) sstables even when -Index.db is missing --- Key: CASSANDRA-9591 URL: https://issues.apache.org/jira/browse/CASSANDRA-9591 Project: Cassandra Issue Type: Improvement Reporter: mck Assignee: mck Labels: sstablescrub Fix For: 2.0.x Attachments: 9591-2.0.txt

Today SSTableReader needs a minimum of 3 files to load an sstable:
- -Data.db
- -CompressionInfo.db
- -Index.db
But during the scrub process the -Index.db file isn't actually necessary, unless there's corruption in the -Data.db file and we want to be able to skip over corrupted rows. Given that there is still a fair chance that there's nothing wrong with the -Data.db file and we're just missing the -Index.db file, this patch addresses that situation: it makes it possible for the StandaloneScrubber (sstablescrub) to recover sstables despite missing -Index.db files. This can happen after a catastrophic incident where data directories have been lost and/or corrupted, or wiped and the backup not healthy. I'm aware that normally one depends on replicas or snapshots to avoid such situations, but such catastrophic incidents do occur in the wild. I have not tested this patch against normal C* operations and all the other (more critical) ways SSTableReader is used. I'll happily do that and add the needed unit tests if people see merit in accepting the patch. Otherwise the patch can live with the issue, in case anyone else needs it. There's also a cassandra distribution bundled with the patch [here|https://github.com/michaelsembwever/cassandra/releases/download/2.0.15-recover-sstables-without-indexdb/apache-cassandra-2.0.15-recover-sstables-without-indexdb.tar.gz] to make life a little easier for anyone finding themselves in such a bad situation.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8907) Raise GCInspector alerts to WARN
[ https://issues.apache.org/jira/browse/CASSANDRA-8907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585801#comment-14585801 ] Amit Singh Chowdhery commented on CASSANDRA-8907: - I am planning to pick up this issue and provide a patch. The approach would be to add a property for the GC warning threshold in the yaml. Whenever a GC pause is equal to or greater than the configured time, a WARN message will be logged in the Cassandra logs; otherwise it will be logged at DEBUG level. Please share your comments so that I can proceed with the fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
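The proposed behavior is a simple threshold check. A minimal sketch (illustrative Python, hypothetical names; in the actual patch the threshold would be a cassandra.yaml property and the logging would happen in GCInspector):

```python
# Sketch of the proposed GCInspector behavior: WARN at or above a configurable
# pause threshold, DEBUG below it. The threshold value here is an assumption.
GC_WARN_THRESHOLD_MS = 1000  # would come from cassandra.yaml in the real patch

def gc_log_level(pause_ms, threshold_ms=GC_WARN_THRESHOLD_MS):
    return "WARN" if pause_ms >= threshold_ms else "DEBUG"

assert gc_log_level(1835) == "WARN"   # the ConcurrentMarkSweep pause quoted above
assert gc_log_level(9866) == "WARN"   # the ParNew pause quoted above
assert gc_log_level(200) == "DEBUG"
```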
[jira] [Created] (CASSANDRA-9598) bad classpath for 'sstablerepairedset' in 'cassandra-tools' package
Clément Lardeur created CASSANDRA-9598: -- Summary: bad classpath for 'sstablerepairedset' in 'cassandra-tools' package Key: CASSANDRA-9598 URL: https://issues.apache.org/jira/browse/CASSANDRA-9598 Project: Cassandra Issue Type: Bug Components: Tools Environment: Debian 3.16.7, cassandra-tools 2.1.6, cassandra 2.1.6 Reporter: Clément Lardeur Priority: Minor

The script 'sstablerepairedset' is not ready out of the box for the debian distro, maybe due to the refactoring of CASSANDRA-7160 that moved tools out of the bin directory. Currently in 'sstablerepairedset' the classpath is calculated with:
{code}
if [ x$CLASSPATH = x ]; then
    # execute from the build dir.
    if [ -d `dirname $0`/../../build/classes ]; then
        for directory in `dirname $0`/../../build/classes/*; do
            CLASSPATH=$CLASSPATH:$directory
        done
    else
        if [ -f `dirname $0`/../lib/stress.jar ]; then
            CLASSPATH=`dirname $0`/../lib/stress.jar
        fi
    fi
    for jar in `dirname $0`/../../lib/*.jar; do
        CLASSPATH=$CLASSPATH:$jar
    done
fi
{code}
Whereas in the other scripts from 'bin/tools', the classpath is calculated with:
{code}
if [ x$CASSANDRA_INCLUDE = x ]; then
    for include in `dirname $0`/cassandra.in.sh \
                   $HOME/.cassandra.in.sh \
                   /usr/share/cassandra/cassandra.in.sh \
                   /usr/local/share/cassandra/cassandra.in.sh \
                   /opt/cassandra/cassandra.in.sh; do
        if [ -r $include ]; then
            . $include
            break
        fi
    done
elif [ -r $CASSANDRA_INCLUDE ]; then
    . $CASSANDRA_INCLUDE
fi
{code}
I think a little refactoring would be good, to extract the common part of these scripts, like the computation of the CLASSPATH and the setting of JAVA_HOME.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-9595) Compacting an empty table sometimes doesn't delete SSTables in 2.0 with LCS
[ https://issues.apache.org/jira/browse/CASSANDRA-9595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson resolved CASSANDRA-9595. Resolution: Not A Problem Looks like the dtest does a major compaction, which we don't support with LCS until 2.2.

Compacting an empty table sometimes doesn't delete SSTables in 2.0 with LCS --- Key: CASSANDRA-9595 URL: https://issues.apache.org/jira/browse/CASSANDRA-9595 Project: Cassandra Issue Type: Bug Reporter: Jim Witschey Fix For: 2.0.x

On 2.0, when compaction is run on a table with all rows deleted and configured with LCS, sometimes SSTables remain on disk afterwards. This causes one of our dtests to fail periodically, for instance [here|http://cassci.datastax.com/view/cassandra-2.0/job/cassandra-2.0_dtest/68/testReport/compaction_test/TestCompaction_with_LeveledCompactionStrategy/sstable_deletion_test/]. This can be reproduced in dtests with
{code}
CASSANDRA_VERSION=git:cassandra-2.0 nosetests ./compaction_test.py:TestCompaction_with_LeveledCompactionStrategy.sstable_deletion_test
{code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9402) Implement proper sandboxing for UDFs
[ https://issues.apache.org/jira/browse/CASSANDRA-9402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585843#comment-14585843 ] Robert Stupp commented on CASSANDRA-9402: - Pushed an update of sandboxing to my branch and wanted to provide some status of the work.
* Uses a global, custom implementation of a {{SecurityManager}} and {{Policy}}. The security manager only performs permission checks for user-defined functions (try-finally with a {{ThreadLocal}}) - this eliminates the previously mentioned 3% perf regression ([cstar|http://cstar.datastax.com/graph?stats=1d461628-12ba-11e5-918f-42010af0688f&metric=op_rate&operation=1_write&smoothing=1&show_aggregates=true&xmin=0&xmax=223.3&ymin=0&ymax=123458.5]).
* The effective {{Permissions}} set for UDFs is empty (Java and Javascript). No access to files, sockets, processes, etc.
* UDFs only have access to a very restricted set of classes.
** This works very nicely for Java UDFs. Direct use and {{Class.forName()}} approaches are prevented. Access to {{Runtime}}, {{Thread}} and other "evil" classes is prevented with a compiler error or {{ClassNotFoundException}}.
** Scripted UDFs need to take the {{SecurityManager.checkPackageAccess()}} approach (see below), which results in an {{AccessControlException}}.
** So Nashorn unfortunately still allows code like {{java.lang.Runtime.getRuntime().gc()}} - but Nashorn does a nice job of preventing bad access in general.
** Thread starting required some special casing in the security manager (the default permission handling in {{java.lang.SecurityManager}} is … well … it could be improved). The Java {{SecurityManager}} is kind of dangerous because it explicitly allows thread creation and modification for all threads that do not belong to the root thread group (source code cite: {{// just return}}).
Notes:
* I did not test this with other JSR223 implementations. Security heavily depends on the implementation of the scripting engine.
* All UDFs are executed within a {{doPrivileged()}} block, with the UDF class loader as the thread's context class loader.
* Also did not test against Rhino (Java 7) due to the tentative decision to require Java 8 for 3.0.
White/blacklisting of classes (class loading): I tried to find something better than white/blacklisting, but honestly could not find anything better. It works as follows:
# only whitelisted patterns ({{startsWith}}) are allowed - if no match is found, the class is not accessible
# if the individual pattern matching a whitelist entry is contained in the blacklist, the class is not accessible
# so, patterns matching the whitelist and not matching the blacklist are allowed
It is possible to consume a lot of CPU with an endless loop, or to force a lot of GCs using Nashorn ({{java.lang.Runtime.getRuntime().gc()}}). Will work on finding a solution for that (separate pooled threads, adapted thread priority, maximum execution time + UDF blacklisting) - maybe with a separate class loader hierarchy. I'm not completely sold on moving UDF execution to a separate process (fenced UDFs). TBH I think it adds too much complexity - it requires a new IPC/RPC protocol, state management, recovery scenarios, etc., plus child process (state) management + recovery. To make it really safe, each individual invocation would have to spawn its own process, which would have to be monitored. We can probably not prevent UDFs from excessive use of heap space (by accident or bad intention).
Other implementations:
* JEE containers rely on a (more or less) standard {{SecurityManager}} + {{Policy}} implementation for the whole VM. These usually rely on proper class loader separation.
* Some other projects run user-provided code - either these projects have no, or effectively no, security/sandboxing.
Implement proper sandboxing for UDFs Key: CASSANDRA-9402 URL: https://issues.apache.org/jira/browse/CASSANDRA-9402 Project: Cassandra Issue Type: Task Reporter: T Jake Luciani Assignee: Robert Stupp Priority: Critical Labels: docs-impacting, security Fix For: 3.0 beta 1 Attachments: 9402-warning.txt

We want to avoid a security exploit for our users. We need to make sure we ship 2.2 UDFs with good defaults, so that someone accidentally exposing them to the internet doesn't open themselves up to having arbitrary code run.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
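The white/blacklist rules from the status comment above can be modeled in a few lines. A sketch in Python (illustrative only; the real lists and checks live in Cassandra's UDF class loader and security manager, and the example prefixes here are assumptions, not the actual lists):

```python
# Rough model of the class-loading white/blacklist check described above:
# a class is accessible iff its name matches a whitelist prefix (startsWith)
# and does not match any blacklist prefix.
WHITELIST = ("java.lang.", "java.util.", "java.math.")        # assumed example prefixes
BLACKLIST = ("java.lang.Runtime", "java.lang.Thread", "java.lang.ProcessBuilder")

def class_accessible(name):
    # 1. must match some whitelist pattern at all
    if not any(name.startswith(w) for w in WHITELIST):
        return False
    # 2. a blacklist match overrides the whitelist
    if any(name.startswith(b) for b in BLACKLIST):
        return False
    # 3. whitelisted and not blacklisted -> accessible
    return True

assert class_accessible("java.util.ArrayList")
assert not class_accessible("java.lang.Runtime")   # blacklisted "evil" class
assert not class_accessible("java.io.File")        # not whitelisted at all
```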
[jira] [Commented] (CASSANDRA-9590) Support for both encrypted and unencrypted native transport connections
[ https://issues.apache.org/jira/browse/CASSANDRA-9590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585911#comment-14585911 ] Stefan Podkowinski commented on CASSANDRA-9590: --- [~mikea], you're right, [STARTTLS|https://en.wikipedia.org/wiki/STARTTLS] would be another option for how this could be implemented. Netty's [SslHandler|https://netty.io/4.0/api/io/netty/handler/ssl/SslHandler.html] also supports it, so it should be possible to enable STARTTLS for the server and the Java driver fairly easily. But we'd have to implement this for the other drivers as well. Not sure if we should go down that road, instead of just having two dedicated sockets. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9416) 3.x should refuse to start on JVM_VERSION < 1.8
[ https://issues.apache.org/jira/browse/CASSANDRA-9416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-9416: - Fix Version/s: (was: 3.x) 3.0 beta 1

3.x should refuse to start on JVM_VERSION < 1.8 --- Key: CASSANDRA-9416 URL: https://issues.apache.org/jira/browse/CASSANDRA-9416 Project: Cassandra Issue Type: Task Reporter: Michael Shuler Priority: Minor Labels: lhf Fix For: 3.0 beta 1 Attachments: trunk-9416.patch

When I was looking at CASSANDRA-9408, I noticed that {{conf/cassandra-env.sh}} and {{conf/cassandra-env.ps1}} do JVM version checking and should get updated for 3.x to refuse to start with JVM_VERSION < 1.8.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8384) Change CREATE TABLE syntax for compression options
[ https://issues.apache.org/jira/browse/CASSANDRA-8384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-8384: - Fix Version/s: (was: 3.0 beta 1) 3.x -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9576) Connection leak in CQLRecordWriter
[ https://issues.apache.org/jira/browse/CASSANDRA-9576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585907#comment-14585907 ] Sam Tunnicliffe commented on CASSANDRA-9576: [~philipthompson] I think the problem description is correct: the switching of the connect/execution blocks within {{RangeClient#run}} can cause a connection leak. Pre-8358, upon entering the inner loop, we would assume there to be an existing connection and attempt to use it (on the very first time round the loop, there would be no established connection, hence the comment about the harmless NPE). After executing (provided no errors were encountered), we'd poll for more work, then break out of the inner loop, leaving the established connection open and ready to be re-used on the next iteration of the outer loop. By contrast, the first thing we do now upon entering the inner loop is attempt to open a new connection, whether we need to or not. This would be OK (if inefficient) so long as we ensured that we closed it again after executing the prepared statement. However, we don't do that; we just break out of the inner loop without closing the connection. So the next time we iterate the outer loop, we'll open another connection (to the same host, as we use a new ListIterator) after entering the inner loop. Finally, {{closeInternal}} only closes the current connection, so any others opened during previous iterations will leak.

Connection leak in CQLRecordWriter -- Key: CASSANDRA-9576 URL: https://issues.apache.org/jira/browse/CASSANDRA-9576 Project: Cassandra Issue Type: Bug Components: Hadoop Reporter: T Meyarivan Assignee: Philip Thompson

Ran into connection leaks when using CQL: apache-cassandra-2.2.0-beta1-src + CQLOutputFormat (via CqlNativeStorage). It seems like the order of the blocks of code starting at https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/hadoop/cql3/CqlRecordWriter.java#L298 was reversed in 2.2, which leads to the connection leaks.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
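The leak pattern Sam describes is easy to see in a toy model (illustrative Python, not the actual CqlRecordWriter code; the class and function names are hypothetical):

```python
# Toy model of the leak described above: each outer-loop iteration opens a
# fresh connection, the inner loop breaks without closing it, and the final
# close only reaches the last connection opened.
class Conn:
    open_count = 0
    closed_count = 0
    def __init__(self):
        Conn.open_count += 1
    def close(self):
        Conn.closed_count += 1

def run_batches(n_batches):
    current = None
    for _ in range(n_batches):          # outer loop: one iteration per batch
        current = Conn()                # opened unconditionally on entering the inner loop
        # ... execute prepared statement, poll for more work ...
        # break out of the inner loop WITHOUT closing `current`
    if current is not None:
        current.close()                 # closeInternal: closes only the current connection

run_batches(3)
assert Conn.open_count == 3 and Conn.closed_count == 1  # two connections leaked
```

Either closing the connection before leaving the inner loop, or reverting to reusing the established connection across iterations, would remove the leak.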
[jira] [Commented] (CASSANDRA-9522) Specify unset column ratios in cassandra-stress write
[ https://issues.apache.org/jira/browse/CASSANDRA-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585915#comment-14585915 ] T Jake Luciani commented on CASSANDRA-9522: --- The magic ratio seems to be 50%. Would adding the ability to ignore certain columns defined in the schema be good enough? For instance, in the column spec of the yaml file you could add ignored: true, and stress would just not insert into that column. Then you could do one test with 1/2 of the columns set to ignored = true, and another with 1/2 - 1 set.

Specify unset column ratios in cassandra-stress write - Key: CASSANDRA-9522 URL: https://issues.apache.org/jira/browse/CASSANDRA-9522 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Jim Witschey Assignee: T Jake Luciani Fix For: 3.0 beta 1

I'd like to be able to use stress to generate workloads with different distributions of unset columns -- so, for instance, you could specify that rows will have 70% unset columns, and on average, a 100-column row would contain only 30 values. This would help us test the new row formats introduced in 8099. There are 2 different row formats, used depending on the ratio of set to unset columns, and this feature would let us generate workloads that would be stored in each of those formats. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
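Generating rows with a target unset-column ratio is straightforward to sketch. An illustration in Python (not cassandra-stress code; the row representation and function names are assumptions):

```python
# Sketch of generating rows where each column is independently left unset with
# a given probability, matching the "70% unset, ~30 of 100 values" example.
import random

def make_row(n_columns, unset_ratio, rng):
    """Return a row as a dict; columns left unset are simply absent."""
    return {f"c{i}": rng.random()
            for i in range(n_columns) if rng.random() >= unset_ratio}

rng = random.Random(42)  # fixed seed for reproducibility
rows = [make_row(100, 0.7, rng) for _ in range(1000)]
avg_set = sum(len(r) for r in rows) / len(rows)
# with a 70% unset ratio, about 30 of 100 columns carry values on average
assert 25 < avg_set < 35
```

The `ignored: true` idea in the comment is the deterministic variant of this: a fixed subset of columns is always unset, rather than each column being unset with some probability per row.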
[jira] [Commented] (CASSANDRA-8384) Change CREATE TABLE syntax for compression options
[ https://issues.apache.org/jira/browse/CASSANDRA-8384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585917#comment-14585917 ] Aleksey Yeschenko commented on CASSANDRA-8384: -- Actually, given that we are going to support the previous syntax anyway, this is not a blocker for 3.0.0. Moving to 3.x. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9592) Periodically attempt to submit background compaction tasks
[ https://issues.apache.org/jira/browse/CASSANDRA-9592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586138#comment-14586138 ] Jeremiah Jordan commented on CASSANDRA-9592: I have seen this happen in 2.0 as well. Can we put this hack fix there as well?

Periodically attempt to submit background compaction tasks --- Key: CASSANDRA-9592 URL: https://issues.apache.org/jira/browse/CASSANDRA-9592 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Trivial Fix For: 2.1.x

There are more race conditions affecting compaction task submission than CASSANDRA-7745, so to prevent some of these problems from stalling compactions, I propose simply submitting background compactions once every minute, if possible. This will typically be a no-op, but there's no harm in that, since it's very cheap to do.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9304) COPY TO improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-9304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] T Jake Luciani updated CASSANDRA-9304: -- Assignee: David Kua COPY TO improvements Key: CASSANDRA-9304 URL: https://issues.apache.org/jira/browse/CASSANDRA-9304 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jonathan Ellis Assignee: David Kua Priority: Minor Fix For: 2.1.x COPY FROM has gotten a lot of love. COPY TO not so much. One obvious improvement could be to parallelize reading and writing (write one page of data while fetching the next). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
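The suggested improvement is classic pipelining: write one page while fetching the next. A minimal sketch with a bounded queue between a fetcher thread and the writer (illustrative Python; `fetch_pages` and `write_page` are hypothetical stand-ins for the driver's paged reads and cqlsh's CSV output):

```python
# Sketch of overlapping page fetches with page writes via a one-slot queue.
import queue
import threading

def copy_to(fetch_pages, write_page):
    q = queue.Queue(maxsize=1)          # holds at most one prefetched page
    def fetcher():
        for page in fetch_pages():
            q.put(page)                 # blocks while the previous page is unconsumed
        q.put(None)                     # sentinel: no more pages
    t = threading.Thread(target=fetcher)
    t.start()
    while (page := q.get()) is not None:
        write_page(page)                # overlaps with the fetcher getting the next page
    t.join()

written = []
copy_to(lambda: iter([[1, 2], [3, 4], [5]]), written.append)
assert written == [[1, 2], [3, 4], [5]]
```

The `maxsize=1` bound keeps memory flat: at most one page is in flight beyond the one being written.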
[jira] [Updated] (CASSANDRA-9302) Optimize cqlsh COPY FROM, part 3
[ https://issues.apache.org/jira/browse/CASSANDRA-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] T Jake Luciani updated CASSANDRA-9302: -- Assignee: David Kua Optimize cqlsh COPY FROM, part 3 Key: CASSANDRA-9302 URL: https://issues.apache.org/jira/browse/CASSANDRA-9302 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Jonathan Ellis Assignee: David Kua Fix For: 2.1.x We've had some discussion about moving to Spark CSV import for bulk load in 3.x, but people need a good bulk load tool now. One option is to add a separate Java bulk load tool (CASSANDRA-9048), but if we can match that performance from cqlsh I would prefer to leave COPY FROM as the preferred option to which we point people, rather than adding more tools that need to be supported indefinitely. Previous work on COPY FROM optimization was done in CASSANDRA-7405 and CASSANDRA-8225. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9592) Periodically attempt to submit background compaction tasks
[ https://issues.apache.org/jira/browse/CASSANDRA-9592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586218#comment-14586218 ] Yuki Morishita commented on CASSANDRA-9592: --- The 9592-2.1 branch and the dtest on cassci look good to me. So the scheduling I mentioned before (https://github.com/belliottsmith/cassandra/blob/d1ddae1b61a9ca037b5edc137b5c9915e86dece6/src/java/org/apache/cassandra/service/CassandraDaemon.java#L371-L386) can be removed, right? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7145) FileNotFoundException during compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-7145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14586228#comment-14586228 ] Yuki Morishita commented on CASSANDRA-7145: --- The error says Too many open files. Maybe check your ulimit? FileNotFoundException during compaction --- Key: CASSANDRA-7145 URL: https://issues.apache.org/jira/browse/CASSANDRA-7145 Project: Cassandra Issue Type: Bug Environment: CentOS 6.3, Datastax Enterprise 4.0.1 (Cassandra 2.0.5), Java 1.7.0_55 Reporter: PJ Assignee: Marcus Eriksson Fix For: 1.2.19, 2.0.11, 2.1.0 Attachments: 0001-avoid-marking-compacted-sstables-as-compacting.patch, compaction - FileNotFoundException.txt, repair - RuntimeException.txt, startup - AssertionError.txt I can't finish any compaction because my nodes always throw a FileNotFoundException. I've already tried the following but nothing helped: 1. nodetool flush 2. nodetool repair (ends with RuntimeException; see attachment) 3. node restart (via dse cassandra-stop) Whenever I restart the nodes, another type of exception is logged (see attachment) somewhere near the end of startup process. This particular exception doesn't seem to be critical because the nodes still manage to finish the startup and become online. I don't have specific steps to reproduce the problem that I'm experiencing with compaction and repair. I'm in the middle of migrating 4.8 billion rows from MySQL via SSTableLoader. Some things that may or may not be relevant: 1. I didn't drop and recreate the keyspace (so probably not related to CASSANDRA-4857) 2. I do the bulk-loading in batches of 1 to 20 millions rows. When a batch reaches 100% total progress (i.e. starts to build secondary index), I kill the sstableloader process and cancel the index build 3. I restart the nodes occasionally. It's possible that there is an on-going compaction during one of those restarts. 
Related StackOverflow question (mine): http://stackoverflow.com/questions/23435847/filenotfoundexception-during-compaction -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9576) Connection leak in CQLRecordWriter
[ https://issues.apache.org/jira/browse/CASSANDRA-9576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9576: --- Priority: Blocker (was: Major) Fix Version/s: 2.2.0 rc2 Connection leak in CQLRecordWriter -- Key: CASSANDRA-9576 URL: https://issues.apache.org/jira/browse/CASSANDRA-9576 Project: Cassandra Issue Type: Bug Components: Hadoop Reporter: T Meyarivan Assignee: Philip Thompson Priority: Blocker Fix For: 2.2.0 rc2 Ran into connection leaks when using CQLCassandra apache-cassandra-2.2.0-beta1-src + CQLOutputFormat (via CqlNativeStorage). It seems like the order of the blocks of code starting at https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/hadoop/cql3/CqlRecordWriter.java#L298 was reversed in 2.2, which leads to the connection leaks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9576) Connection leak in CQLRecordWriter
[ https://issues.apache.org/jira/browse/CASSANDRA-9576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586246#comment-14586246 ] Philip Thompson commented on CASSANDRA-9576: The problem is that I missed that when we break, we reset the inner loop; the close only happens at the end of the outer loop. This is actually really serious. I've reordered those two blocks, and am running the patch on cassci. Connection leak in CQLRecordWriter -- Key: CASSANDRA-9576 URL: https://issues.apache.org/jira/browse/CASSANDRA-9576 Project: Cassandra Issue Type: Bug Components: Hadoop Reporter: T Meyarivan Assignee: Philip Thompson Fix For: 2.2.0 rc2 Ran into connection leaks when using CQLCassandra apache-cassandra-2.2.0-beta1-src + CQLOutputFormat (via CqlNativeStorage). It seems like the order of the blocks of code starting at https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/hadoop/cql3/CqlRecordWriter.java#L298 was reversed in 2.2, which leads to the connection leaks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
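The leak pattern described in the comment, a close that only runs at the end of the outer loop and is skipped whenever the inner loop is abandoned and restarted, can be modeled with a toy open/close counter. This is an illustrative sketch, not the actual CqlRecordWriter code:

```java
// Toy model of the CASSANDRA-9576 ordering bug. open()/close() stand in for
// acquiring and releasing a native-protocol connection.
public class ConnectionLeakSketch {
    static int opened = 0, closed = 0;

    static void open()  { opened++; }
    static void close() { closed++; }

    /** Buggy ordering: close() is skipped whenever an attempt is abandoned. */
    static void buggyWrite(int attempts) {
        for (int attempt = 0; attempt < attempts; attempt++) {
            open();
            boolean failed = attempt < attempts - 1; // fail until the last attempt
            if (failed) continue; // models the break that restarts the inner loop
            close();              // only reached on the final, successful pass
        }
    }

    /** Fixed ordering: every open() is paired with a close(), even on retry. */
    static void fixedWrite(int attempts) {
        for (int attempt = 0; attempt < attempts; attempt++) {
            open();
            try {
                boolean failed = attempt < attempts - 1;
                if (failed) continue;
            } finally {
                close(); // always runs, so retries cannot leak the connection
            }
        }
    }
}
```

With three attempts, the buggy ordering opens three connections and closes one; the fixed ordering closes all three, which is the invariant the reordering patch restores.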
[jira] [Updated] (CASSANDRA-9598) bad classpath for 'sstablerepairedset' in 'cassandra-tools' package
[ https://issues.apache.org/jira/browse/CASSANDRA-9598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9598: -- Assignee: Michael Shuler bad classpath for 'sstablerepairedset' in 'cassandra-tools' package --- Key: CASSANDRA-9598 URL: https://issues.apache.org/jira/browse/CASSANDRA-9598 Project: Cassandra Issue Type: Bug Components: Tools Environment: Debian 3.16.7, cassandra-tools 2.1.6, cassandra 2.1.6 Reporter: Clément Lardeur Assignee: Michael Shuler Priority: Minor The script 'sstablerepairedset' is not ready out of the box for the Debian distro, maybe due to the refactoring of CASSANDRA-7160 that moved tools out of the bin directory. Currently in 'sstablerepairedset' the classpath is calculated with:
{code}
if [ x$CLASSPATH = x ]; then
    # execute from the build dir.
    if [ -d `dirname $0`/../../build/classes ]; then
        for directory in `dirname $0`/../../build/classes/*; do
            CLASSPATH=$CLASSPATH:$directory
        done
    else
        if [ -f `dirname $0`/../lib/stress.jar ]; then
            CLASSPATH=`dirname $0`/../lib/stress.jar
        fi
    fi
    for jar in `dirname $0`/../../lib/*.jar; do
        CLASSPATH=$CLASSPATH:$jar
    done
fi
{code}
Whereas in other scripts from 'bin/tools', the classpath is calculated with:
{code}
if [ x$CASSANDRA_INCLUDE = x ]; then
    for include in `dirname $0`/cassandra.in.sh \
                   $HOME/.cassandra.in.sh \
                   /usr/share/cassandra/cassandra.in.sh \
                   /usr/local/share/cassandra/cassandra.in.sh \
                   /opt/cassandra/cassandra.in.sh; do
        if [ -r $include ]; then
            . $include
            break
        fi
    done
elif [ -r $CASSANDRA_INCLUDE ]; then
    . $CASSANDRA_INCLUDE
fi
{code}
I think a little refactoring would be good to extract the common parts of these scripts, like the computation of the CLASSPATH and the setting of JAVA_HOME. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9592) Periodically attempt to submit background compaction tasks
[ https://issues.apache.org/jira/browse/CASSANDRA-9592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586194#comment-14586194 ] Benedict commented on CASSANDRA-9592: - Sure. I've pushed a 2.0 version (which may or may not compile - I haven't written my script for quickly switching major versions yet, so I'm letting cassci do the hard work for me :)) Periodically attempt to submit background compaction tasks --- Key: CASSANDRA-9592 URL: https://issues.apache.org/jira/browse/CASSANDRA-9592 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Trivial Fix For: 2.1.x There are more race conditions affecting compaction task submission than CASSANDRA-7745, so to prevent some of these problems from stalling compactions, I propose simply submitting background compactions once every minute, if possible. This will typically be a no-op, but there's no harm in that, since it's very cheap to do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM
[ https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] T Jake Luciani updated CASSANDRA-9303: -- Assignee: David Kua Match cassandra-loader options in COPY FROM --- Key: CASSANDRA-9303 URL: https://issues.apache.org/jira/browse/CASSANDRA-9303 Project: Cassandra Issue Type: New Feature Components: Tools Reporter: Jonathan Ellis Assignee: David Kua Fix For: 2.1.x https://github.com/brianmhess/cassandra-loader added a bunch of options to handle real-world requirements; we should match those. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9499) Introduce writeVInt method to DataOutputStreamPlus
[ https://issues.apache.org/jira/browse/CASSANDRA-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587256#comment-14587256 ] Ariel Weisberg commented on CASSANDRA-9499: --- I did all the changes except for trying to make readVInt faster. It isn't so simple, which is why I didn't do it. I can't leave any padding (fake reads), and I can't over-consume (lost data). I would end up having to make things right after I have figured out how many bytes I have actually consumed, or alternatively I would have to make a copy (which could easily be faster). This is much simpler, but if you think it's necessary I'll do it. Introduce writeVInt method to DataOutputStreamPlus -- Key: CASSANDRA-9499 URL: https://issues.apache.org/jira/browse/CASSANDRA-9499 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Ariel Weisberg Priority: Minor Fix For: 3.0 beta 1 CASSANDRA-8099 really could do with a writeVInt method, both for fixing CASSANDRA-9498 and for efficiently encoding timestamp/deletion deltas. It should be possible to make an especially efficient implementation against BufferedDataOutputStreamPlus. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
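For readers unfamiliar with vints, here is what a writeVInt/readVInt pair looks like in the common LEB128 style, with zigzag mapping so small negative deltas also get short encodings. This is a generic sketch; the actual encoding adopted for CASSANDRA-9499 may differ:

```java
import java.io.ByteArrayOutputStream;

// Minimal variable-length integer codec: 7 value bits per byte, high bit set
// while more bytes follow, zigzag mapping for signed values.
public class VIntSketch {
    /** Map signed values so that small magnitudes encode in few bytes. */
    static long zigzag(long v)   { return (v << 1) ^ (v >> 63); }
    static long unzigzag(long v) { return (v >>> 1) ^ -(v & 1); }

    static byte[] writeVInt(long value) {
        long v = zigzag(value);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((v & ~0x7FL) != 0) {
            out.write((int) ((v & 0x7F) | 0x80)); // continuation bit set
            v >>>= 7;
        }
        out.write((int) v); // final byte, continuation bit clear
        return out.toByteArray();
    }

    static long readVInt(byte[] in) {
        long v = 0;
        int shift = 0;
        for (byte b : in) {
            v |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) break; // last byte of this vint
            shift += 7;
        }
        return unzigzag(v);
    }
}
```

The decode loop illustrates Ariel's point above: a fast reader cannot know the vint's length until it has inspected the continuation bits, so reading ahead in bulk risks over-consuming bytes that belong to the next field.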
[jira] [Issue Comment Deleted] (CASSANDRA-9597) DTCS should consider file SIZE in addition to time windowing
[ https://issues.apache.org/jira/browse/CASSANDRA-9597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-9597: -- Comment: was deleted (was: You can understand why this happens when you realize that the sstables are filtered by max timestamp: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/DateTieredCompactionStrategy.java#L178 And then the resulting list is sorted by min timestamp: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/DateTieredCompactionStrategy.java#L357-L367 The result is that for roughly evenly distributed time periods (file size proportional to sstable maxTimestamp - sstable minTimestamp, which is likely mostly true for most DTCS workloads), larger files will always be at the front of {{trimToThreshold}}, which virtually guarantees we'll re-compact a very large sstable over and over and over if any other sstables are in the window for compaction. ) DTCS should consider file SIZE in addition to time windowing Key: CASSANDRA-9597 URL: https://issues.apache.org/jira/browse/CASSANDRA-9597 Project: Cassandra Issue Type: Improvement Reporter: Jeff Jirsa Priority: Minor Labels: dtcs DTCS seems to work well for the typical use case - writing data in perfect time order, compacting recent files, and ignoring older files. However, there are normal operational actions where DTCS will fall behind and is unlikely to recover. An example of this is streaming operations (for example, bootstrap or loading data into a cluster using sstableloader), where lots (tens of thousands) of very small sstables can be created spanning multiple time buckets. 
In this case, even if max_sstable_age_days is extended to allow the older incoming files to be compacted, the selection logic is likely to re-compact large files with a few small files over and over, rather than prioritizing selection of the max_threshold smallest files to decrease the number of candidate sstables as quickly as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
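The re-compaction bias described above can be reproduced with a toy model. This is illustrative only (invented names, not the real DateTieredCompactionStrategy code): for time-ordered data, an sstable's size tracks its timestamp span, so sorting a window's candidates by min timestamp keeps putting the one large file at the front of the trimmed selection, while sorting by size would drain the many small streamed files first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model of DTCS candidate selection inside one time window.
public class DtcsOrderingSketch {
    static final class SSTable {
        final long minTs, maxTs;
        SSTable(long minTs, long maxTs) { this.minTs = minTs; this.maxTs = maxTs; }
        long size() { return maxTs - minTs; } // size proxies the timestamp span
    }

    /** Current behavior: take the first maxThreshold after sorting by min timestamp. */
    static List<SSTable> trimToThresholdByMinTs(List<SSTable> window, int maxThreshold) {
        List<SSTable> sorted = new ArrayList<>(window);
        sorted.sort(Comparator.comparingLong((SSTable s) -> s.minTs));
        return sorted.subList(0, Math.min(maxThreshold, sorted.size()));
    }

    /** Size-aware alternative: prefer the smallest files first. */
    static List<SSTable> trimToThresholdBySize(List<SSTable> window, int maxThreshold) {
        List<SSTable> sorted = new ArrayList<>(window);
        sorted.sort(Comparator.comparingLong(SSTable::size));
        return sorted.subList(0, Math.min(maxThreshold, sorted.size()));
    }
}
```

With one big old sstable and many tiny streamed ones in the same window, the min-timestamp ordering selects the big file on every round, which is the repeated large re-compaction the ticket describes.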
[jira] [Commented] (CASSANDRA-9560) Changing durable_writes on a keyspace is only applied after restart of node
[ https://issues.apache.org/jira/browse/CASSANDRA-9560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587505#comment-14587505 ] Fred commented on CASSANDRA-9560: - Any news about this one? Changing durable_writes on a keyspace is only applied after restart of node --- Key: CASSANDRA-9560 URL: https://issues.apache.org/jira/browse/CASSANDRA-9560 Project: Cassandra Issue Type: Bug Components: Core Environment: Single node Reporter: Fred Assignee: Carl Yeksigian Fix For: 2.1.x When mutations for a column family are about to be applied, the cached instance of the keyspace metadata is read, but the schema mutation for durable_writes hasn't been applied to this cached instance. I'm not too familiar with the codebase, but after some debugging (2.1.3) it's somehow related to:
{code:title=org.apache.cassandra.db.Mutation.java|borderStyle=solid}
public void apply()
{
    Keyspace ks = Keyspace.open(keyspaceName);
    ks.apply(this, ks.metadata.durableWrites);
}
{code}
Here a cached instance of the keyspace is opened, but its metadata hasn't been updated with the earlier applied durable_writes mutation; it seems that the cached keyspace instance is lazily built at startup but never updated after that. I'm also a little concerned that other values in the cached keyspace instance suffer from the same issue, e.g. replication_factor... I've seen the same issue in 2.1.5, and the only way to resolve it is to restart the node to let the keyspace instance cache reload from disk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
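The reported behavior, a lazily built keyspace instance whose metadata is never refreshed after a schema change, can be modeled with a toy cache. Names here are illustrative stand-ins, not the actual Keyspace/Schema classes:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the stale-metadata bug: the schema map plays the role of the
// persisted schema tables, the cache plays the role of the lazily built
// Keyspace instance cache.
public class StaleMetadataSketch {
    static class KeyspaceMetadata {
        final boolean durableWrites;
        KeyspaceMetadata(boolean durableWrites) { this.durableWrites = durableWrites; }
    }

    static class Keyspace {
        final KeyspaceMetadata metadata;
        Keyspace(KeyspaceMetadata m) { this.metadata = m; }
    }

    static final Map<String, KeyspaceMetadata> schema = new HashMap<>();
    static final Map<String, Keyspace> cache = new HashMap<>();

    static Keyspace open(String name) {
        // Bug: the cached instance keeps whatever metadata it was built with.
        return cache.computeIfAbsent(name, n -> new Keyspace(schema.get(n)));
    }

    static void alterDurableWrites(String name, boolean value) {
        schema.put(name, new KeyspaceMetadata(value)); // persisted...
        // ...but the cached Keyspace instance is NOT refreshed (the bug).
    }
}
```

Clearing the cache (the analogue of a node restart) is the only way this model picks up the new value, matching the reporter's observation; the fix would be to refresh or invalidate the cached instance when the schema mutation is applied.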
[jira] [Commented] (CASSANDRA-9445) Read timeout on the tables where we recreated previously dropped column with different type
[ https://issues.apache.org/jira/browse/CASSANDRA-9445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587269#comment-14587269 ] Cheng Ren commented on CASSANDRA-9445: -- Hi Philip, Thanks so much for your reply. Our schema of the table is:
CREATE TABLE br_product (
    m_pid text,
    b_paa blob,
    b_page blob,
    b_variant blob,
    b_view_membership blob,
    b_view_overrides blob,
    c_canon_key text,
    c_canon_url text,
    c_url_flags text,
    cat_crumbs map<text, text>,
    cat_goog text,
    cat_nodes map<text, text>,
    d_age_group text,
    d_brand text,
    d_color_group map<text, text>,
    d_colors map<text, text>,
    d_conditions map<text, text>,
    d_description text,
    d_flags map<text, text>,
    d_gender text,
    d_keywords map<text, text>,
    d_la_img_urls map<text, text>,
    d_material map<text, text>,
    d_mature boolean,
    d_model_name text,
    d_others map<text, text>,
    d_pattern map<text, text>,
    d_sizes map<text, text>,
    d_stores map<text, text>,
    d_sw_img_urls map<text, text>,
    d_th_img_urls map<text, text>,
    d_title text,
    d_urls map<text, text>,
    d_zo_img_urls map<text, text>,
    i_buyable boolean,
    i_have_avail boolean,
    i_have_local_avail map<text, text>,
    i_last_avail_ts timestamp,
    i_level int,
    i_price float,
    i_price_rh float,
    i_price_rl float,
    i_price_tags map<text, text>,
    i_sale_price float,
    i_sale_price_rh float,
    i_sale_price_rl float,
    i_sale_price_tags map<text, text>,
    i_status text,
    is_blacklist boolean,
    l_first_live_ts timestamp,
    l_is_live boolean,
    l_last_live_ts timestamp,
    l_launch_ts timestamp,
    pg_i_refs map<text, text>,
    pg_o_refs map<text, text>,
    pg_type text,
    s_deleted int,
    s_status_tags list<text>,
    PRIMARY KEY (m_pid)
) WITH bloom_filter_fp_chance=0.01 AND
    caching='KEYS_ONLY' AND
    comment='' AND
    dclocal_read_repair_chance=0.00 AND
    gc_grace_seconds=864000 AND
    index_interval=128 AND
    read_repair_chance=0.10 AND
    replicate_on_write='true' AND
    populate_io_cache_on_flush='false' AND
    default_time_to_live=0 AND
    speculative_retry='99.0PERCENTILE' AND
    memtable_flush_period_in_ms=0 AND
    compaction={'class': 'SizeTieredCompactionStrategy'} AND
    compression={'sstable_compression': 'LZ4Compressor'};
Then we added a column as map<text, text>, dropped it, and re-added it as blob. We have backend and frontend DCs. Backend DC: we have pipelines running 2-4 times/day doing scan/write to populate the table. Frontend DC: 20 lookups/second. Please let me know if you need any more information. Thanks Read timeout on the tables where we recreated previously dropped column with different type --- Key: CASSANDRA-9445 URL: https://issues.apache.org/jira/browse/CASSANDRA-9445 Project: Cassandra Issue Type: Bug Components: Core Reporter: Cheng Ren We had 10%~20% read request timeouts on one specific table in our Cassandra cluster. This happened since we added a column to that table with type map<text, text>, ran the pipeline against it adding data, and then dropped the column and re-added it as a blob with the same name. The pipeline run to populate the blob data happened as the problem began. The issue got fixed as soon as we dropped the column. Any clue why this is happening? Has any similar issue been reported with this kind of column change? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9584) Decommissioning a node on Windows sends the wrong schema change event
[ https://issues.apache.org/jira/browse/CASSANDRA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586265#comment-14586265 ] Joshua McKenzie commented on CASSANDRA-9584: I am unable to reproduce this locally on win8, on 2.2.0-rc1, 2.2-HEAD, or trunk, using driver 2.6.0c1:
{noformat}
C:\src\python-driver>git branch
* (detached from 2.6.0c1)

C:\src\decomTest>grep Event *.out
2.2-rc.out:Message pushed from server: EventMessage(event_type=u'TOPOLOGY_CHANGE', trace_id=None, event_args={'change_type': u'REMOVED_NODE', 'address': ('127.0.0.2', 9042)}, stream_id=-1)
2.2.out:Message pushed from server: EventMessage(event_type=u'TOPOLOGY_CHANGE', trace_id=None, event_args={'change_type': u'REMOVED_NODE', 'address': ('127.0.0.2', 9042)}, stream_id=-1)
trunk.out:Message pushed from server: EventMessage(event_type=u'TOPOLOGY_CHANGE', trace_id=None, event_args={'change_type': u'REMOVED_NODE', 'address': ('127.0.0.2', 9042)}, stream_id=-1)
{noformat}
The only outstanding possibility is that we somehow send the wrong message on server12 while sending the correct one on win8, but I'm *very* skeptical of that, both because a) the message generation logic here has, to my knowledge, nothing to do with the underlying OS, and b) server12 and win8 share a large portion of their libs and kernel logic. [~kishkaru]: is there a possibility there's something else off in your testing environment?
Decommissioning a node on Windows sends the wrong schema change event - Key: CASSANDRA-9584 URL: https://issues.apache.org/jira/browse/CASSANDRA-9584 Project: Cassandra Issue Type: Bug Environment: C* 2.2.0-rc1 | python-driver 2.6.0-rc1 | Windows Server 2012 R2 64-bit Reporter: Kishan Karunaratne Assignee: Joshua McKenzie Fix For: 2.2.x Decommissioning a node on Windows sends the wrong schema change event: {noformat} cassandra.connection: DEBUG: Message pushed from server: EventMessage(event_type=u'STATUS_CHANGE', trace_id=None, event _args={'change_type': u'DOWN', 'address': ('127.0.0.2', 9042)}, stream_id=-1) {noformat} On Linux I get the correct event: {noformat} cassandra.connection: DEBUG: Message pushed from server: EventMessage(event_type=u'TOPOLOGY_CHANGE', trace_id=None, event_args={'change_type': u'REMOVED_NODE', 'address': ('127.0.0.2', 9042)}, stream_id=-1) {noformat} We are using ccmlib node.py.decommission() which calls nodetool decommission: {noformat} def decommission(self): self.nodetool(decommission) self.status = Status.DECOMMISIONNED self._update_config() {noformat} Interestingly, it does seem to work (correctly?) on CCM CLI: {noformat} PS C:\Users\Administrator ccm status Cluster: '2.2' -- node1: UP node3: UP node2: UP PS C:\Users\Administrator ccm node1 ring Starting NodeTool Datacenter: datacenter1 == AddressRackStatus State LoadOwns Token 3074457345618258602 127.0.0.1 rack1 Up Normal 62.43 KB? -9223372036854775808 127.0.0.2 rack1 Up Normal 104.87 KB ? -3074457345618258603 127.0.0.3 rack1 Up Normal 83.67 KB? 
3074457345618258602 Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless PS C:\Users\Administrator ccm node2 decommission PS C:\Users\Administrator ccm status Cluster: '2.2' -- node1: UP node3: UP node2: DECOMMISIONNED PS C:\Users\Administrator ccm node1 ring Starting NodeTool Datacenter: datacenter1 == AddressRackStatus State LoadOwns Token 3074457345618258602 127.0.0.1 rack1 Up Normal 67.11 KB? -9223372036854775808 127.0.0.3 rack1 Up Normal 88.35 KB? 3074457345618258602 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9584) Decommissioning a node on Windows sends the wrong schema change event
[ https://issues.apache.org/jira/browse/CASSANDRA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14586373#comment-14586373 ] Joshua McKenzie commented on CASSANDRA-9584: Tested on Server 2012 and I'm seeing the correct event (note: I have gnuwin32 tools installed, hence grep above and here): {noformat} jmckenzie@WIN-PERF01 c:\src\decomTest grep Event 2.2.0-rc1.out Message pushed from server: EventMessage(event_type=u'TOPOLOGY_CHANGE', trace_id=None, event_args={'change_type': u'REMOVED_NODE', 'address': ('127.0.0.2', 9042)}, stream_id=-1) {noformat} Decommissioning a node on Windows sends the wrong schema change event - Key: CASSANDRA-9584 URL: https://issues.apache.org/jira/browse/CASSANDRA-9584 Project: Cassandra Issue Type: Bug Environment: C* 2.2.0-rc1 | python-driver 2.6.0-rc1 | Windows Server 2012 R2 64-bit Reporter: Kishan Karunaratne Assignee: Joshua McKenzie Fix For: 2.2.x Decommissioning a node on Windows sends the wrong schema change event: {noformat} cassandra.connection: DEBUG: Message pushed from server: EventMessage(event_type=u'STATUS_CHANGE', trace_id=None, event _args={'change_type': u'DOWN', 'address': ('127.0.0.2', 9042)}, stream_id=-1) {noformat} On Linux I get the correct event: {noformat} cassandra.connection: DEBUG: Message pushed from server: EventMessage(event_type=u'TOPOLOGY_CHANGE', trace_id=None, event_args={'change_type': u'REMOVED_NODE', 'address': ('127.0.0.2', 9042)}, stream_id=-1) {noformat} We are using ccmlib node.py.decommission() which calls nodetool decommission: {noformat} def decommission(self): self.nodetool(decommission) self.status = Status.DECOMMISIONNED self._update_config() {noformat} Interestingly, it does seem to work (correctly?) 
on CCM CLI: {noformat} PS C:\Users\Administrator ccm status Cluster: '2.2' -- node1: UP node3: UP node2: UP PS C:\Users\Administrator ccm node1 ring Starting NodeTool Datacenter: datacenter1 == AddressRackStatus State LoadOwns Token 3074457345618258602 127.0.0.1 rack1 Up Normal 62.43 KB? -9223372036854775808 127.0.0.2 rack1 Up Normal 104.87 KB ? -3074457345618258603 127.0.0.3 rack1 Up Normal 83.67 KB? 3074457345618258602 Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless PS C:\Users\Administrator ccm node2 decommission PS C:\Users\Administrator ccm status Cluster: '2.2' -- node1: UP node3: UP node2: DECOMMISIONNED PS C:\Users\Administrator ccm node1 ring Starting NodeTool Datacenter: datacenter1 == AddressRackStatus State LoadOwns Token 3074457345618258602 127.0.0.1 rack1 Up Normal 67.11 KB? -9223372036854775808 127.0.0.3 rack1 Up Normal 88.35 KB? 3074457345618258602 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8149) bump metrics-reporter-config dependency
[ https://issues.apache.org/jira/browse/CASSANDRA-8149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586406#comment-14586406 ] Chris Burroughs commented on CASSANDRA-8149: metrics-ganglia in v3 depends on gmetric4j, which depends on an LGPL project [1]. I don't think that affects *this* project directly, but if you are re-packaging Cassandra with the reporters included and come across this ticket, that may matter to you. [1] https://github.com/ganglia/gmetric4j/issues/9 bump metrics-reporter-config dependency Key: CASSANDRA-8149 URL: https://issues.apache.org/jira/browse/CASSANDRA-8149 Project: Cassandra Issue Type: Improvement Reporter: Pierre-Yves Ritschard Assignee: T Jake Luciani Fix For: 2.2.0 beta 1 It would be nice to be able to take advantage of the new reporters available in metrics-reporter-config 2.3.1, which is now available on Maven Central. If my understanding is correct, this only entails bumping the dependency in build.xml. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9549) Memory leak
[ https://issues.apache.org/jira/browse/CASSANDRA-9549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14586425#comment-14586425 ] Jeff Jirsa commented on CASSANDRA-9549: --- Throwing a me-too here, copying summary from IRC (on the topic of 2.1.6 showing weird memory behavior that feels like a leak). Other user was also using DTCS: 11:07 jeffj opened CASSANDRA-9597 last night. dtcs + streaming = lots of sstables that won't compact efficiently and eventually (days after load is stopped) nodes end up ooming or in gc hell. 11:08 jeffj in our case, the PROBLEM is that sstables build up over time due to the weird way dtcs is selecting candidates to compact, but the symptom is very very very long gc pauses and eventual ooms. 11:10 jeffj i would very much believe there's a leak somewhere in 2.1.6. in our case, we saw the same behavior in 2.1.5, so i dont think it's a single minor version regression Memory leak Key: CASSANDRA-9549 URL: https://issues.apache.org/jira/browse/CASSANDRA-9549 Project: Cassandra Issue Type: Bug Components: Core Environment: Cassandra 2.1.5. 
9 node cluster in EC2 (m1.large nodes, 2 cores 7.5G memory, 800G platter for cassandra data, root partition and commit log are on SSD EBS with sufficient IOPS), 3 nodes/availablity zone, 1 replica/zone JVM: /usr/java/jdk1.8.0_40/jre/bin/java JVM Flags besides CP: -ea -javaagent:/usr/share/cassandra/lib/jamm-0.3.0.jar -XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms2G -Xmx2G -Xmn200M -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=103 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseTLAB -XX:CompileCommandFile=/etc/cassandra/conf/hotspot_compiler -XX:CMSWaitDuration=1 -XX:+CMSParallelInitialMarkEnabled -XX:+CMSEdenChunksRecordAlways -XX:CMSWaitDuration=1 -XX:+UseCondCardMark -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=7199 -Dcom.sun.management.jmxremote.rmi.port=7199 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dlogback.configurationFile=logback.xml -Dcassandra.logdir=/var/log/cassandra -Dcassandra.storagedir= -Dcassandra-pidfile=/var/run/cassandra/cassandra.pid Kernel: Linux 2.6.32-504.16.2.el6.x86_64 #1 SMP x86_64 x86_64 x86_64 GNU/Linux Reporter: Ivar Thorson Priority: Critical Fix For: 2.1.x Attachments: c4_system.log, c7fromboot.zip, cassandra.yaml, cpu-load.png, memoryuse.png, ref-java-errors.jpeg, suspect.png, two-loads.png We have been experiencing a severe memory leak with Cassandra 2.1.5 that, over the period of a couple of days, eventually consumes all of the available JVM heap space, putting the JVM into GC hell where it keeps trying CMS collection but can't free up any heap space. This pattern happens for every node in our cluster and is requiring rolling cassandra restarts just to keep the cluster running. 
We have upgraded the cluster per Datastax docs from the 2.0 branch a couple of months ago and have been using the data from this cluster for more than a year without problem. As the heap fills up with non-GC-able objects, the CPU/OS load average grows along with it. Heap dumps reveal an increasing number of java.util.concurrent.ConcurrentLinkedQueue$Node objects. We took heap dumps over a 2 day period, and watched the number of Node objects go from 4M, to 19M, to 36M, and eventually about 65M objects before the node stops responding. The screen capture of our heap dump is from the 19M measurement. Load on the cluster is minimal. We can see this effect even with only a handful of writes per second. (See attachments for Opscenter snapshots during very light loads and heavier loads). Even with only 5 reads a sec we see this behavior. Log files show repeated errors in Ref.java:181 and Ref.java:279 and LEAK detected messages: {code} ERROR [CompactionExecutor:557] 2015-06-01 18:27:36,978 Ref.java:279 - Error when closing class org.apache.cassandra.io.sstable.SSTableReader$InstanceTidier@1302301946:/data1/data/ourtablegoeshere-ka-1150 java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@32680b31 rejected from org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor@573464d6[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1644] {code} {code} ERROR [Reference-Reaper:1] 2015-06-01 18:27:37,083 Ref.java:181 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@74b5df92) to class
[jira] [Commented] (CASSANDRA-9549) Memory leak
[ https://issues.apache.org/jira/browse/CASSANDRA-9549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14586460#comment-14586460 ] Robbie Strickland commented on CASSANDRA-9549: -- We also experience this issue on 2.1.5, and also running DTCS. Memory leak Key: CASSANDRA-9549 URL: https://issues.apache.org/jira/browse/CASSANDRA-9549 Project: Cassandra Issue Type: Bug Components: Core Environment: Cassandra 2.1.5. 9 node cluster in EC2 (m1.large nodes, 2 cores 7.5G memory, 800G platter for cassandra data, root partition and commit log are on SSD EBS with sufficient IOPS), 3 nodes/availablity zone, 1 replica/zone JVM: /usr/java/jdk1.8.0_40/jre/bin/java JVM Flags besides CP: -ea -javaagent:/usr/share/cassandra/lib/jamm-0.3.0.jar -XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms2G -Xmx2G -Xmn200M -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=103 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseTLAB -XX:CompileCommandFile=/etc/cassandra/conf/hotspot_compiler -XX:CMSWaitDuration=1 -XX:+CMSParallelInitialMarkEnabled -XX:+CMSEdenChunksRecordAlways -XX:CMSWaitDuration=1 -XX:+UseCondCardMark -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=7199 -Dcom.sun.management.jmxremote.rmi.port=7199 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dlogback.configurationFile=logback.xml -Dcassandra.logdir=/var/log/cassandra -Dcassandra.storagedir= -Dcassandra-pidfile=/var/run/cassandra/cassandra.pid Kernel: Linux 2.6.32-504.16.2.el6.x86_64 #1 SMP x86_64 x86_64 x86_64 GNU/Linux Reporter: Ivar Thorson Priority: Critical Fix For: 2.1.x Attachments: c4_system.log, c7fromboot.zip, cassandra.yaml, cpu-load.png, memoryuse.png, ref-java-errors.jpeg, suspect.png, two-loads.png We have been 
experiencing a severe memory leak with Cassandra 2.1.5 that, over the period of a couple of days, eventually consumes all of the available JVM heap space, putting the JVM into GC hell where it keeps trying CMS collection but can't free up any heap space. This pattern happens for every node in our cluster and is requiring rolling cassandra restarts just to keep the cluster running. We have upgraded the cluster per Datastax docs from the 2.0 branch a couple of months ago and have been using the data from this cluster for more than a year without problem. As the heap fills up with non-GC-able objects, the CPU/OS load average grows along with it. Heap dumps reveal an increasing number of java.util.concurrent.ConcurrentLinkedQueue$Node objects. We took heap dumps over a 2 day period, and watched the number of Node objects go from 4M, to 19M, to 36M, and eventually about 65M objects before the node stops responding. The screen capture of our heap dump is from the 19M measurement. Load on the cluster is minimal. We can see this effect even with only a handful of writes per second. (See attachments for Opscenter snapshots during very light loads and heavier loads). Even with only 5 reads a sec we see this behavior. 
Log files show repeated errors in Ref.java:181 and Ref.java:279 and LEAK detected messages: {code} ERROR [CompactionExecutor:557] 2015-06-01 18:27:36,978 Ref.java:279 - Error when closing class org.apache.cassandra.io.sstable.SSTableReader$InstanceTidier@1302301946:/data1/data/ourtablegoeshere-ka-1150 java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@32680b31 rejected from org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor@573464d6[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1644] {code} {code} ERROR [Reference-Reaper:1] 2015-06-01 18:27:37,083 Ref.java:181 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@74b5df92) to class org.apache.cassandra.io.sstable.SSTableReader$DescriptorTypeTidy@2054303604:/data2/data/ourtablegoeshere-ka-1151 was not released before the reference was garbage collected {code} This might be related to [CASSANDRA-8723]? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9231) Support Routing Key as part of Partition Key
[ https://issues.apache.org/jira/browse/CASSANDRA-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14586256#comment-14586256 ] Benjamin Coverston commented on CASSANDRA-9231: --- I'm also -1 on adding UDFs into the mix, just on the merits of losing the token-aware routing from the client. A simple designation of some of the partition key columns as routing keys would serve the use cases I'm aware of. Support Routing Key as part of Partition Key Key: CASSANDRA-9231 URL: https://issues.apache.org/jira/browse/CASSANDRA-9231 Project: Cassandra Issue Type: Wish Components: Core Reporter: Matthias Broecheler Fix For: 3.x Provide support for sub-dividing the partition key into a routing key and a non-routing key component. Currently, all columns that make up the partition key of the primary key are also routing keys, i.e. they determine which nodes store the data. This proposal would give the data modeler the ability to designate only a subset of the columns that comprise the partition key to be routing keys. The non-routing key columns of the partition key identify the partition but are not used to determine where to store the data. Consider the following example table definition: CREATE TABLE foo ( a int, b int, c int, d int, PRIMARY KEY (([a], b), c ) ); (a,b) is the partition key, c is the clustering key, and d is just a column. In addition, the square brackets identify the routing key as column a. This means that only the value of column a is used to determine the node for data placement (i.e. only the value of column a is murmur3 hashed to compute the token). In addition, column b is needed to identify the partition but does not influence the placement. This has the benefit that all rows with the same routing key (but potentially different non-routing key columns of the partition key) are stored on the same node and that knowledge of such co-locality can be exploited by applications built on top of Cassandra. 
Currently, the only way to achieve co-locality is within a partition. However, this approach has two limitations: a) there are theoretical and (more importantly) practical limits on the size of a partition, and b) rows within a partition are ordered and an index is built to exploit that ordering. For large partitions that overhead is significant if ordering isn't needed. In other words, routing keys afford a simple means to achieve scalable node-level co-locality without ordering, while clustering keys afford page-level co-locality with ordering. As such, they address different co-locality needs, giving the data modeler the flexibility to choose what is needed for their application. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
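The placement rule proposed above can be sketched with a toy model (this is not Cassandra's murmur3 partitioner; the node list and hash are hypothetical, purely to illustrate that placement uses only the routing-key columns while the full partition key still identifies the partition):

```python
# Toy placement model for CASSANDRA-9231: routing key [a], partition key (a, b).
NODES = ["node1", "node2", "node3"]

def node_for(routing_key):
    # Placement hashes only the routing-key columns (a toy hash, not murmur3).
    return NODES[hash(routing_key) % len(NODES)]

# Rows of foo as (a, b, c, d); the partition key is (a, b), routing key is (a).
rows = [(1, 10, 0, 0), (1, 20, 0, 0), (2, 10, 0, 0)]
placement = {(a, b): node_for(a) for (a, b, c, d) in rows}

# All partitions sharing a=1 land on the same node, regardless of b.
assert placement[(1, 10)] == placement[(1, 20)]
```

Under this scheme an application can rely on every partition with the same routing key being co-located, without paying the ordering/index overhead of a single giant partition.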
[jira] [Comment Edited] (CASSANDRA-9576) Connection leak in CQLRecordWriter
[ https://issues.apache.org/jira/browse/CASSANDRA-9576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14586251#comment-14586251 ] Philip Thompson edited comment on CASSANDRA-9576 at 6/15/15 4:12 PM: - Patch is at https://github.com/apache/cassandra/compare/cassandra-2.2...ptnapoleon:cassandra-9576 http://cassci.datastax.com/view/Dev/view/ptnapoleon/job/ptnapoleon-cassandra-9576-dtest/ http://cassci.datastax.com/view/Dev/view/ptnapoleon/job/ptnapoleon-cassandra-9576-testall/ was (Author: philipthompson): Patch is at https://github.com/apache/cassandra/compare/cassandra-2.2...ptnapoleon:cassandra-9576 Connection leak in CQLRecordWriter -- Key: CASSANDRA-9576 URL: https://issues.apache.org/jira/browse/CASSANDRA-9576 Project: Cassandra Issue Type: Bug Components: Hadoop Reporter: T Meyarivan Assignee: Philip Thompson Priority: Blocker Fix For: 2.2.0 rc2 Ran into connection leaks when using CQLCassandra apache-cassandra-2.2.0-beta1-src + CQLOutputFormat (via CqlNativeStorage). It seems like the order blocks of code starting at https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/hadoop/cql3/CqlRecordWriter.java#L298 were reversed in 2.2 which leads to the connection leaks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9576) Connection leak in CQLRecordWriter
[ https://issues.apache.org/jira/browse/CASSANDRA-9576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9576: --- Since Version: 2.2.0 beta 1 (was: 2.2.0 rc1) Connection leak in CQLRecordWriter -- Key: CASSANDRA-9576 URL: https://issues.apache.org/jira/browse/CASSANDRA-9576 Project: Cassandra Issue Type: Bug Components: Hadoop Reporter: T Meyarivan Assignee: Philip Thompson Priority: Critical Fix For: 2.2.0 rc2 Ran into connection leaks when using CQLCassandra apache-cassandra-2.2.0-beta1-src + CQLOutputFormat (via CqlNativeStorage). It seems like the order blocks of code starting at https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/hadoop/cql3/CqlRecordWriter.java#L298 were reversed in 2.2 which leads to the connection leaks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9549) Memory leak
[ https://issues.apache.org/jira/browse/CASSANDRA-9549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14586335#comment-14586335 ] Ivar Thorson commented on CASSANDRA-9549: - As another data point, we upgraded our servers to 2.1.6 and see the same issue. Memory leak Key: CASSANDRA-9549 URL: https://issues.apache.org/jira/browse/CASSANDRA-9549 Project: Cassandra Issue Type: Bug Components: Core Environment: Cassandra 2.1.5. 9 node cluster in EC2 (m1.large nodes, 2 cores 7.5G memory, 800G platter for cassandra data, root partition and commit log are on SSD EBS with sufficient IOPS), 3 nodes/availability zone, 1 replica/zone JVM: /usr/java/jdk1.8.0_40/jre/bin/java JVM Flags besides CP: -ea -javaagent:/usr/share/cassandra/lib/jamm-0.3.0.jar -XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms2G -Xmx2G -Xmn200M -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=103 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseTLAB -XX:CompileCommandFile=/etc/cassandra/conf/hotspot_compiler -XX:CMSWaitDuration=1 -XX:+CMSParallelInitialMarkEnabled -XX:+CMSEdenChunksRecordAlways -XX:CMSWaitDuration=1 -XX:+UseCondCardMark -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=7199 -Dcom.sun.management.jmxremote.rmi.port=7199 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dlogback.configurationFile=logback.xml -Dcassandra.logdir=/var/log/cassandra -Dcassandra.storagedir= -Dcassandra-pidfile=/var/run/cassandra/cassandra.pid Kernel: Linux 2.6.32-504.16.2.el6.x86_64 #1 SMP x86_64 x86_64 x86_64 GNU/Linux Reporter: Ivar Thorson Priority: Critical Fix For: 2.1.x Attachments: c4_system.log, c7fromboot.zip, cassandra.yaml, cpu-load.png, memoryuse.png, ref-java-errors.jpeg, suspect.png, two-loads.png We have 
been experiencing a severe memory leak with Cassandra 2.1.5 that, over the period of a couple of days, eventually consumes all of the available JVM heap space, putting the JVM into GC hell where it keeps trying CMS collection but can't free up any heap space. This pattern happens for every node in our cluster and is requiring rolling cassandra restarts just to keep the cluster running. We have upgraded the cluster per Datastax docs from the 2.0 branch a couple of months ago and have been using the data from this cluster for more than a year without problem. As the heap fills up with non-GC-able objects, the CPU/OS load average grows along with it. Heap dumps reveal an increasing number of java.util.concurrent.ConcurrentLinkedQueue$Node objects. We took heap dumps over a 2 day period, and watched the number of Node objects go from 4M, to 19M, to 36M, and eventually about 65M objects before the node stops responding. The screen capture of our heap dump is from the 19M measurement. Load on the cluster is minimal. We can see this effect even with only a handful of writes per second. (See attachments for Opscenter snapshots during very light loads and heavier loads). Even with only 5 reads a sec we see this behavior. 
Log files show repeated errors in Ref.java:181 and Ref.java:279 and LEAK detected messages: {code} ERROR [CompactionExecutor:557] 2015-06-01 18:27:36,978 Ref.java:279 - Error when closing class org.apache.cassandra.io.sstable.SSTableReader$InstanceTidier@1302301946:/data1/data/ourtablegoeshere-ka-1150 java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@32680b31 rejected from org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor@573464d6[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1644] {code} {code} ERROR [Reference-Reaper:1] 2015-06-01 18:27:37,083 Ref.java:181 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@74b5df92) to class org.apache.cassandra.io.sstable.SSTableReader$DescriptorTypeTidy@2054303604:/data2/data/ourtablegoeshere-ka-1151 was not released before the reference was garbage collected {code} This might be related to [CASSANDRA-8723]? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9576) Connection leak in CQLRecordWriter
[ https://issues.apache.org/jira/browse/CASSANDRA-9576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9576: --- Priority: Critical (was: Blocker) Connection leak in CQLRecordWriter -- Key: CASSANDRA-9576 URL: https://issues.apache.org/jira/browse/CASSANDRA-9576 Project: Cassandra Issue Type: Bug Components: Hadoop Reporter: T Meyarivan Assignee: Philip Thompson Priority: Critical Fix For: 2.2.0 rc2 Ran into connection leaks when using CQLCassandra apache-cassandra-2.2.0-beta1-src + CQLOutputFormat (via CqlNativeStorage). It seems like the order blocks of code starting at https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/hadoop/cql3/CqlRecordWriter.java#L298 were reversed in 2.2 which leads to the connection leaks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7281) SELECT on tuple relations are broken for mixed ASC/DESC clustering order
[ https://issues.apache.org/jira/browse/CASSANDRA-7281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14586621#comment-14586621 ] Marcin Szymaniuk commented on CASSANDRA-7281: - All right. I will merge it with the unit tests I already have (it might be that those from dtest are not necessary anymore). SELECT on tuple relations are broken for mixed ASC/DESC clustering order Key: CASSANDRA-7281 URL: https://issues.apache.org/jira/browse/CASSANDRA-7281 Project: Cassandra Issue Type: Bug Reporter: Sylvain Lebresne Assignee: Marcin Szymaniuk Fix For: 2.1.x Attachments: 0001-CASSANDRA-7281-SELECT-on-tuple-relations-are-broken-.patch, 0001-CASSANDRA-7281-SELECT-on-tuple-relations-are-broken-v2.patch, 0001-CASSANDRA-7281-SELECT-on-tuple-relations-are-broken-v3.patch, 0001-CASSANDRA-7281-SELECT-on-tuple-relations-are-broken-v4.patch, 7281_unit_tests.txt As noted on [CASSANDRA-6875|https://issues.apache.org/jira/browse/CASSANDRA-6875?focusedCommentId=13992153page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13992153], the tuple notation is broken when the clustering order mixes ASC and DESC directives, because the range of data it describes doesn't correspond to a single continuous slice internally. To copy the example from CASSANDRA-6875: {noformat} cqlsh:ks> CREATE TABLE foo (a int, b int, c int, PRIMARY KEY (a, b, c)) WITH CLUSTERING ORDER BY (b DESC, c ASC); cqlsh:ks> INSERT INTO foo (a, b, c) VALUES (0, 2, 0); cqlsh:ks> INSERT INTO foo (a, b, c) VALUES (0, 1, 0); cqlsh:ks> INSERT INTO foo (a, b, c) VALUES (0, 1, 1); cqlsh:ks> INSERT INTO foo (a, b, c) VALUES (0, 0, 0); cqlsh:ks> SELECT * FROM foo WHERE a=0; a | b | c ---+---+--- 0 | 2 | 0 0 | 1 | 0 0 | 1 | 1 0 | 0 | 0 (4 rows) cqlsh:ks> SELECT * FROM foo WHERE a=0 AND (b, c) > (1, 0); a | b | c ---+---+--- 0 | 2 | 0 (1 rows) {noformat} The last query should really return {{(0, 2, 0)}} and {{(0, 1, 1)}}. 
For that specific example we should generate 2 internal slices, but I believe that with more clustering columns we may have more slices. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
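The intended semantics of a tuple relation — lexicographic comparison of the clustering columns, independent of the per-column CLUSTERING ORDER directives — can be checked against the example data with a small sketch:

```python
# Rows from the cqlsh example above, as (a, b, c) tuples.
rows = [(0, 2, 0), (0, 1, 0), (0, 1, 1), (0, 0, 0)]

# Equivalent of: SELECT * FROM foo WHERE a=0 AND (b, c) > (1, 0)
# Python's tuple comparison is lexicographic, matching CQL's tuple relation.
result = [r for r in rows if r[0] == 0 and (r[1], r[2]) > (1, 0)]

# The two expected rows, per the ticket; Cassandra's bug returns only the first.
assert result == [(0, 2, 0), (0, 1, 1)]
```

With b DESC, c ASC these two matching rows are not contiguous in storage order, which is why a single internal slice cannot express the predicate.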
[jira] [Commented] (CASSANDRA-9528) Improve log output from unit tests
[ https://issues.apache.org/jira/browse/CASSANDRA-9528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14586655#comment-14586655 ] Ariel Weisberg commented on CASSANDRA-9528: --- This is ready for review. It doesn't depend on 9463 which we aren't going to do anything additional for. Improve log output from unit tests -- Key: CASSANDRA-9528 URL: https://issues.apache.org/jira/browse/CASSANDRA-9528 Project: Cassandra Issue Type: Test Reporter: Ariel Weisberg Assignee: Ariel Weisberg * Single log output file per suite * stdout/stderr to the same log file with proper interleaving * Don't interleave interactive output from unit tests run concurrently to the console. Print everything about the test once the test has completed. * Fetch and compress log files as part of artifacts collected by cassci -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-9600) DescriptorTypeTidy and GlobalTypeTidy do not benefit from being full fledged Ref instances
Benedict created CASSANDRA-9600: --- Summary: DescriptorTypeTidy and GlobalTypeTidy do not benefit from being full fledged Ref instances Key: CASSANDRA-9600 URL: https://issues.apache.org/jira/browse/CASSANDRA-9600 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Minor These inner SSTableReader tidying classes do not benefit from being a full-fledged Ref because they are managed in such a small scope. This increases our surface area for problems such as CASSANDRA-9549 (these were the affected instances, ftr). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9584) Decommissioning a node on Windows sends the wrong schema change event
[ https://issues.apache.org/jira/browse/CASSANDRA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14586540#comment-14586540 ] Joshua McKenzie commented on CASSANDRA-9584: Can further confirm that the failing long test that prompted this ticket also passes for me locally: {noformat} test_token_aware (tests.integration.long.test_loadbalancingpolicies.LoadBalancingPolicyTests) ... Started: node1 with pid: 6340 Started: node3 with pid: 3536 Started: node2 with pid: 5396 SUCCESS: The process with PID 5396 has been terminated. Creating session Started: node2 with pid: 6396 SUCCESS: The process with PID 6396 has been terminated. Started: node2 with pid: 4140 SUCCESS: The process with PID 4140 has been terminated. SUCCESS: The process with PID 6340 has been terminated. SUCCESS: The process with PID 3536 has been terminated. ok -- Ran 1 test in 211.360s OK {noformat} I did have to make some modifications to the test as it's hard-coded expecting the tokens to be assigned to node 2 and mine were going to node 3, but after inverting the node2/node3 checks the test passes. Will try it on the Server2012 perf machine as well to confirm. This is with ccm master and cassandra-2.2.0-rc1. If it passes on the server as well I'm going to close this as cannot reproduce as it looks like something's up with the testing environment. 
Decommissioning a node on Windows sends the wrong schema change event - Key: CASSANDRA-9584 URL: https://issues.apache.org/jira/browse/CASSANDRA-9584 Project: Cassandra Issue Type: Bug Environment: C* 2.2.0-rc1 | python-driver 2.6.0-rc1 | Windows Server 2012 R2 64-bit Reporter: Kishan Karunaratne Assignee: Joshua McKenzie Fix For: 2.2.x Decommissioning a node on Windows sends the wrong schema change event: {noformat} cassandra.connection: DEBUG: Message pushed from server: EventMessage(event_type=u'STATUS_CHANGE', trace_id=None, event_args={'change_type': u'DOWN', 'address': ('127.0.0.2', 9042)}, stream_id=-1) {noformat} On Linux I get the correct event: {noformat} cassandra.connection: DEBUG: Message pushed from server: EventMessage(event_type=u'TOPOLOGY_CHANGE', trace_id=None, event_args={'change_type': u'REMOVED_NODE', 'address': ('127.0.0.2', 9042)}, stream_id=-1) {noformat} We are using ccmlib node.py decommission(), which calls nodetool decommission: {noformat} def decommission(self): self.nodetool("decommission") self.status = Status.DECOMMISIONNED self._update_config() {noformat} Interestingly, it does seem to work (correctly?) on the CCM CLI: {noformat} PS C:\Users\Administrator> ccm status Cluster: '2.2' -- node1: UP node3: UP node2: UP PS C:\Users\Administrator> ccm node1 ring Starting NodeTool Datacenter: datacenter1 == Address Rack Status State Load Owns Token 3074457345618258602 127.0.0.1 rack1 Up Normal 62.43 KB ? -9223372036854775808 127.0.0.2 rack1 Up Normal 104.87 KB ? -3074457345618258603 127.0.0.3 rack1 Up Normal 83.67 KB ? 
3074457345618258602 Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless PS C:\Users\Administrator> ccm node2 decommission PS C:\Users\Administrator> ccm status Cluster: '2.2' -- node1: UP node3: UP node2: DECOMMISIONNED PS C:\Users\Administrator> ccm node1 ring Starting NodeTool Datacenter: datacenter1 == Address Rack Status State Load Owns Token 3074457345618258602 127.0.0.1 rack1 Up Normal 67.11 KB ? -9223372036854775808 127.0.0.3 rack1 Up Normal 88.35 KB ? 3074457345618258602 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9592) Periodically attempt to submit background compaction tasks
[ https://issues.apache.org/jira/browse/CASSANDRA-9592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14586597#comment-14586597 ] Benedict commented on CASSANDRA-9592: - Sorry, I should have mentioned in my last comment that on updating 2.0 I noticed I hadn't removed that from 2.1, and uploaded a change to do so. cassci has already corroborated it. Periodically attempt to submit background compaction tasks --- Key: CASSANDRA-9592 URL: https://issues.apache.org/jira/browse/CASSANDRA-9592 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Trivial Fix For: 2.1.x There are more race conditions affecting compaction task submission than CASSANDRA-7745, so to prevent some of these problems stalling compactions, I propose simply submitting background compactions once every minute, if possible. This will typically be a no-op, but there's no harm in that, since it's very cheap to do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
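The proposal above amounts to a watchdog tick. A minimal sketch, with hypothetical names (not Cassandra's CompactionManager API), showing why an idle tick is a harmless no-op:

```python
# Sketch of CASSANDRA-9592's idea: periodically re-attempt background
# compaction submission so race conditions can't stall compactions forever.
import threading

def maybe_submit_background(pending_tasks):
    """Submit whatever is pending; a cheap no-op when the list is empty."""
    submitted = list(pending_tasks)
    pending_tasks.clear()
    return submitted

def start_periodic_submission(pending_tasks, interval_s=60.0):
    # Re-arm a daemon timer each tick, roughly "once every minute".
    def tick():
        maybe_submit_background(pending_tasks)
        timer = threading.Timer(interval_s, tick)
        timer.daemon = True  # don't keep the process alive just for this
        timer.start()
        return timer
    return tick()
```

A tick that finds nothing pending simply clears an empty list, which is why repeating it every minute costs essentially nothing.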
[jira] [Created] (CASSANDRA-9599) Echo message in Gossip should use IAsyncCallbackWithFailure
sankalp kohli created CASSANDRA-9599: Summary: Echo message in Gossip should use IAsyncCallbackWithFailure Key: CASSANDRA-9599 URL: https://issues.apache.org/jira/browse/CASSANDRA-9599 Project: Cassandra Issue Type: Improvement Reporter: sankalp kohli Priority: Minor It will help a lot in debugging if we use IAsyncCallbackWithFailure for the Echo message. We can log an error if it times out. Also, why are Echo messages processed in the GOSSIP stage? Can this be moved to some other stage? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[1/3] cassandra git commit: Reorder operations in CqlRecordWriter main run loop
Repository: cassandra Updated Branches: refs/heads/cassandra-2.2 2e92cf899 -> 17d43fa55 refs/heads/trunk 7476d83b4 -> 81858ebcb Reorder operations in CqlRecordWriter main run loop Patch by Philip Thompson; reviewed by Sam Tunnicliffe for CASSANDRA-9576 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/17d43fa5 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/17d43fa5 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/17d43fa5 Branch: refs/heads/cassandra-2.2 Commit: 17d43fa55eca29be492a716f04d9ceff1989762d Parents: 2e92cf8 Author: Philip Thompson ptnapol...@gmail.com Authored: Mon Jun 15 11:55:04 2015 -0400 Committer: Sam Tunnicliffe s...@beobal.com Committed: Mon Jun 15 19:57:08 2015 +0100 -- CHANGES.txt | 1 + .../cassandra/hadoop/cql3/CqlRecordWriter.java | 61 ++-- 2 files changed, 30 insertions(+), 32 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/17d43fa5/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 020cb46..ba8ef12 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.2 + * Fix connection leak in CqlRecordWriter (CASSANDRA-9576) * Mlockall before opening system sstables & remove boot_without_jna option (CASSANDRA-9573) * Add functions to convert timeuuid to date or time, deprecate dateOf and unixTimestampOf (CASSANDRA-9229) * Make sure we cancel non-compacting sstables from LifecycleTransaction (CASSANDRA-9566) http://git-wip-us.apache.org/repos/asf/cassandra/blob/17d43fa5/src/java/org/apache/cassandra/hadoop/cql3/CqlRecordWriter.java -- diff --git a/src/java/org/apache/cassandra/hadoop/cql3/CqlRecordWriter.java b/src/java/org/apache/cassandra/hadoop/cql3/CqlRecordWriter.java index 78b0494..6e8ffd9 100644 --- a/src/java/org/apache/cassandra/hadoop/cql3/CqlRecordWriter.java +++ b/src/java/org/apache/cassandra/hadoop/cql3/CqlRecordWriter.java @@ -299,36 +299,6 @@ class CqlRecordWriter extends RecordWriter<Map<String, ByteBuffer>, 
List<ByteBuffer>> while (true) { // send the mutation to the last-used endpoint. first time through, this will NPE harmlessly. - -// attempt to connect to a different endpoint -try -{ -InetAddress address = iter.next(); -String host = address.getHostName(); -client = CqlConfigHelper.getOutputCluster(host, conf).connect(); -} -catch (Exception e) -{ -//If connection died due to Interrupt, just try connecting to the endpoint again. -//There are too many ways for the Thread.interrupted() state to be cleared, so -//we can't rely on that here. Until the java driver gives us a better way of knowing -//that this exception came from an InterruptedException, this is the best solution. -if (canRetryDriverConnection(e)) -{ -iter.previous(); -} -closeInternal(); - -// Most exceptions mean something unexpected went wrong to that endpoint, so -// we should try again to another. Other exceptions (auth or invalid request) are fatal. -if ((e instanceof AuthenticationException || e instanceof InvalidQueryException) || !iter.hasNext()) -{ -lastException = new IOException(e); -break outer; -} -continue; -} - try { int i = 0; @@ -342,7 +312,7 @@ class CqlRecordWriter extends RecordWriter<Map<String, ByteBuffer>, List<ByteBuffer>> } client.execute(boundStatement); i++; - + if (i >= batchThreshold) break; bindVariables = queue.poll(); @@ -359,6 +329,33 @@ class CqlRecordWriter extends RecordWriter<Map<String, ByteBuffer>, List<ByteBuffer>> } } +// attempt to connect to a different endpoint +try +{ +InetAddress address = iter.next(); +
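The effect of the reordering in this patch can be modeled with a toy simulation (hypothetical names, not the driver's actual API): the old loop opened a fresh connection at the top of every iteration, leaking the previous one, while the fixed loop writes against the existing connection and only reconnects after a failure.

```python
# Toy model of the CASSANDRA-9576 fix: count how many connections each
# loop ordering opens for the same amount of work.
class FakeCluster:
    def __init__(self):
        self.opened = 0
    def connect(self):
        self.opened += 1
        return self.opened  # stand-in for a client/session object

def write_batches_leaky(cluster, batches):
    # Old ordering: reconnect at the top of every iteration -> one leaked
    # connection per batch.
    for _ in batches:
        client = cluster.connect()
    return cluster.opened

def write_batches_fixed(cluster, batches):
    # New ordering: connect once, reuse the client while writes succeed;
    # reconnection logic would run only after a failure.
    client = cluster.connect()
    for _ in batches:
        pass  # client.execute(batch) would go here
    return cluster.opened

assert write_batches_leaky(FakeCluster(), range(3)) == 3
assert write_batches_fixed(FakeCluster(), range(3)) == 1
```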
[2/3] cassandra git commit: Reorder operations in CqlRecordWriter main run loop
Reorder operations in CqlRecordWriter main run loop Patch by Philip Thompson; reviewed by Sam Tunnicliffe for CASSANDRA-9576 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/17d43fa5 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/17d43fa5 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/17d43fa5 Branch: refs/heads/trunk Commit: 17d43fa55eca29be492a716f04d9ceff1989762d Parents: 2e92cf8 Author: Philip Thompson ptnapol...@gmail.com Authored: Mon Jun 15 11:55:04 2015 -0400 Committer: Sam Tunnicliffe s...@beobal.com Committed: Mon Jun 15 19:57:08 2015 +0100 -- CHANGES.txt | 1 + .../cassandra/hadoop/cql3/CqlRecordWriter.java | 61 ++-- 2 files changed, 30 insertions(+), 32 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/17d43fa5/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 020cb46..ba8ef12 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.2 + * Fix connection leak in CqlRecordWriter (CASSANDRA-9576) * Mlockall before opening system sstables & remove boot_without_jna option (CASSANDRA-9573) * Add functions to convert timeuuid to date or time, deprecate dateOf and unixTimestampOf (CASSANDRA-9229) * Make sure we cancel non-compacting sstables from LifecycleTransaction (CASSANDRA-9566) http://git-wip-us.apache.org/repos/asf/cassandra/blob/17d43fa5/src/java/org/apache/cassandra/hadoop/cql3/CqlRecordWriter.java -- diff --git a/src/java/org/apache/cassandra/hadoop/cql3/CqlRecordWriter.java b/src/java/org/apache/cassandra/hadoop/cql3/CqlRecordWriter.java index 78b0494..6e8ffd9 100644 --- a/src/java/org/apache/cassandra/hadoop/cql3/CqlRecordWriter.java +++ b/src/java/org/apache/cassandra/hadoop/cql3/CqlRecordWriter.java @@ -299,36 +299,6 @@ class CqlRecordWriter extends RecordWriter<Map<String, ByteBuffer>, List<ByteBuffer>> while (true) { // send the mutation to the last-used endpoint. first time through, this will NPE harmlessly. 
- -// attempt to connect to a different endpoint -try -{ -InetAddress address = iter.next(); -String host = address.getHostName(); -client = CqlConfigHelper.getOutputCluster(host, conf).connect(); -} -catch (Exception e) -{ -//If connection died due to Interrupt, just try connecting to the endpoint again. -//There are too many ways for the Thread.interrupted() state to be cleared, so -//we can't rely on that here. Until the java driver gives us a better way of knowing -//that this exception came from an InterruptedException, this is the best solution. -if (canRetryDriverConnection(e)) -{ -iter.previous(); -} -closeInternal(); - -// Most exceptions mean something unexpected went wrong to that endpoint, so -// we should try again to another. Other exceptions (auth or invalid request) are fatal. -if ((e instanceof AuthenticationException || e instanceof InvalidQueryException) || !iter.hasNext()) -{ -lastException = new IOException(e); -break outer; -} -continue; -} - try { int i = 0; @@ -342,7 +312,7 @@ class CqlRecordWriter extends RecordWriter<Map<String, ByteBuffer>, List<ByteBuffer>> } client.execute(boundStatement); i++; - + if (i >= batchThreshold) break; bindVariables = queue.poll(); @@ -359,6 +329,33 @@ class CqlRecordWriter extends RecordWriter<Map<String, ByteBuffer>, List<ByteBuffer>> } } +// attempt to connect to a different endpoint +try +{ +InetAddress address = iter.next(); +String host = address.getHostName(); +client = CqlConfigHelper.getOutputCluster(host, conf).connect(); +
[3/3] cassandra git commit: Merge branch 'cassandra-2.2' into trunk
Merge branch 'cassandra-2.2' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/81858ebc Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/81858ebc Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/81858ebc Branch: refs/heads/trunk Commit: 81858ebcbda2149d94e4c83e20965832a3f43734 Parents: 7476d83 17d43fa Author: Sam Tunnicliffe s...@beobal.com Authored: Mon Jun 15 20:01:23 2015 +0100 Committer: Sam Tunnicliffe s...@beobal.com Committed: Mon Jun 15 20:01:23 2015 +0100 -- CHANGES.txt | 1 + .../cassandra/hadoop/cql3/CqlRecordWriter.java | 61 ++-- 2 files changed, 30 insertions(+), 32 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/81858ebc/CHANGES.txt -- diff --cc CHANGES.txt index 35e02a2,ba8ef12..de85f61 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,14 -1,5 +1,15 @@@ +3.0: + * Make file buffer cache independent of paths being read (CASSANDRA-8897) + * Remove deprecated legacy Hadoop code (CASSANDRA-9353) + * Decommissioned nodes will not rejoin the cluster (CASSANDRA-8801) + * Change gossip stabilization to use endpoit size (CASSANDRA-9401) + * Change default garbage collector to G1 (CASSANDRA-7486) + * Populate TokenMetadata early during startup (CASSANDRA-9317) + * undeprecate cache recentHitRate (CASSANDRA-6591) + + 2.2 + * Fix connection leak in CqlRecordWriter (CASSANDRA-9576) * Mlockall before opening system sstables remove boot_without_jna option (CASSANDRA-9573) * Add functions to convert timeuuid to date or time, deprecate dateOf and unixTimestampOf (CASSANDRA-9229) * Make sure we cancel non-compacting sstables from LifecycleTransaction (CASSANDRA-9566) http://git-wip-us.apache.org/repos/asf/cassandra/blob/81858ebc/src/java/org/apache/cassandra/hadoop/cql3/CqlRecordWriter.java --
[jira] [Commented] (CASSANDRA-9549) Memory leak
[ https://issues.apache.org/jira/browse/CASSANDRA-9549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14586549#comment-14586549 ] Benedict commented on CASSANDRA-9549: - Sorry for the slow response. This one slipped off my work queue. I've pushed a fix [here|https://github.com/belliottsmith/cassandra/tree/9549]. The problem is that I made erroneous assumptions about the behaviour of CLQ on remove (I've read too many CLQ implementations to keep them all straight, I guess). The problem is that on remove, it does not unlink the node it has removed from, it only sets the item to null. This means we accumulate the CLQ nodes for the whole lifetime of the Ref (in this case an sstable). DTCS obviously exacerbates this by ensuring sstable lifetimes are infinite. This patch simply swaps that to a CLDeque. This has some undesirable properties, so we should probably hasten CASSANDRA-9379. This would have prevented this, and will generally improve our management of Ref instances. I've also filed a follow up ticket, CASSANDRA-9600, which would have mitigated this. Memory leak Key: CASSANDRA-9549 URL: https://issues.apache.org/jira/browse/CASSANDRA-9549 Project: Cassandra Issue Type: Bug Components: Core Environment: Cassandra 2.1.5. 
9 node cluster in EC2 (m1.large nodes, 2 cores 7.5G memory, 800G platter for cassandra data, root partition and commit log are on SSD EBS with sufficient IOPS), 3 nodes/availablity zone, 1 replica/zone JVM: /usr/java/jdk1.8.0_40/jre/bin/java JVM Flags besides CP: -ea -javaagent:/usr/share/cassandra/lib/jamm-0.3.0.jar -XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms2G -Xmx2G -Xmn200M -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=103 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseTLAB -XX:CompileCommandFile=/etc/cassandra/conf/hotspot_compiler -XX:CMSWaitDuration=1 -XX:+CMSParallelInitialMarkEnabled -XX:+CMSEdenChunksRecordAlways -XX:CMSWaitDuration=1 -XX:+UseCondCardMark -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=7199 -Dcom.sun.management.jmxremote.rmi.port=7199 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dlogback.configurationFile=logback.xml -Dcassandra.logdir=/var/log/cassandra -Dcassandra.storagedir= -Dcassandra-pidfile=/var/run/cassandra/cassandra.pid Kernel: Linux 2.6.32-504.16.2.el6.x86_64 #1 SMP x86_64 x86_64 x86_64 GNU/Linux Reporter: Ivar Thorson Priority: Critical Fix For: 2.1.x Attachments: c4_system.log, c7fromboot.zip, cassandra.yaml, cpu-load.png, memoryuse.png, ref-java-errors.jpeg, suspect.png, two-loads.png We have been experiencing a severe memory leak with Cassandra 2.1.5 that, over the period of a couple of days, eventually consumes all of the available JVM heap space, putting the JVM into GC hell where it keeps trying CMS collection but can't free up any heap space. This pattern happens for every node in our cluster and is requiring rolling cassandra restarts just to keep the cluster running. 
We have upgraded the cluster per Datastax docs from the 2.0 branch a couple of months ago and have been using the data from this cluster for more than a year without problem. As the heap fills up with non-GC-able objects, the CPU/OS load average grows along with it. Heap dumps reveal an increasing number of java.util.concurrent.ConcurrentLinkedQueue$Node objects. We took heap dumps over a 2 day period, and watched the number of Node objects go from 4M, to 19M, to 36M, and eventually about 65M objects before the node stops responding. The screen capture of our heap dump is from the 19M measurement. Load on the cluster is minimal. We can see this effect even with only a handful of writes per second. (See attachments for Opscenter snapshots during very light loads and heavier loads). Even with only 5 reads a sec we see this behavior. Log files show repeated errors in Ref.java:181 and Ref.java:279 and LEAK detected messages: {code} ERROR [CompactionExecutor:557] 2015-06-01 18:27:36,978 Ref.java:279 - Error when closing class org.apache.cassandra.io.sstable.SSTableReader$InstanceTidier@1302301946:/data1/data/ourtablegoeshere-ka-1150 java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@32680b31 rejected from org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor@573464d6[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1644] {code} {code} ERROR [Reference-Reaper:1] 2015-06-01 18:27:37,083 Ref.java:181 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@74b5df92) to class org.apache.cassandra.io.sstable.SSTableReader$DescriptorTypeTidy@2054303604:/data2/data/ourtablegoeshere-ka-1151 was not released before the reference was garbage collected {code} This might be related to [CASSANDRA-8723]? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9549) Memory leak
[ https://issues.apache.org/jira/browse/CASSANDRA-9549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586556#comment-14586556 ] Benedict commented on CASSANDRA-9549: - Actually, scratch that... it does look like CLQ should remove the node. And yet, it isn't doing so, if the heap dump is to be believed. I suspect the patched branch will fix the problem, but will see if I can puzzle out a plausible mechanism by which the nodes are accumulating.
[jira] [Comment Edited] (CASSANDRA-9549) Memory leak
[ https://issues.apache.org/jira/browse/CASSANDRA-9549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586335#comment-14586335 ] Ivar Thorson edited comment on CASSANDRA-9549 at 6/15/15 7:57 PM: -- As another data point, we upgraded our servers to 2.1.6 and see the same issue. was (Author: ivar.thorson): As another data point, we upgraded our servers to 5.1.6 and see the same issue.
[jira] [Assigned] (CASSANDRA-9549) Memory leak
[ https://issues.apache.org/jira/browse/CASSANDRA-9549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict reassigned CASSANDRA-9549: --- Assignee: Benedict
[jira] [Commented] (CASSANDRA-9549) Memory leak
[ https://issues.apache.org/jira/browse/CASSANDRA-9549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586587#comment-14586587 ] Benedict commented on CASSANDRA-9549: - Ahhh. So, there is a pathological case in CLQ.remove. If the item you delete was the last to be inserted, it will not expunge the node. However, it also does not expunge any deleted items en route to the end. So, if you retain the first to be inserted, and you always delete the last, you get an infinitely growing, but completely empty, middle of the CLQ. This is pretty easily avoided, so it might be worth an upstream patch to the JDK. However, for now the patch I uploaded should fix the problem (which I'm more confident of, now that there is an explanatory framework), and CASSANDRA-9379 remains the correct follow-up to ensure no pathological list behaviours (e.g. with lots of extant Ref instances).
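The failure mode described in the comment above — retain the first element, then repeatedly add and remove the most recently inserted one — can be illustrated with a deliberately simplified, single-threaded toy model of that remove behaviour. This is a hypothetical sketch (the names ToyQueue, ClqLeakSketch, etc. are illustrative; this is neither Cassandra nor JDK code): the matching node's item is nulled, but a node that is still the tail is never unlinked, and dead nodes passed en route are not expunged either.

```java
// Toy single-threaded model (NOT the real ConcurrentLinkedQueue) of the remove
// behaviour described above: remove() nulls the matching item, but never unlinks
// a node that is still the tail, and does not expunge dead nodes seen en route.
public class ClqLeakSketch {
    static final class Node<E> { E item; Node<E> next; Node(E item) { this.item = item; } }

    static final class ToyQueue<E> {
        private final Node<E> head = new Node<>(null); // sentinel
        private Node<E> tail = head;

        void add(E e) { Node<E> n = new Node<>(e); tail.next = n; tail = n; }

        boolean remove(E e) {
            for (Node<E> prev = head, cur = head.next; cur != null; prev = cur, cur = cur.next) {
                if (e.equals(cur.item)) {
                    cur.item = null;                       // logical removal only
                    if (cur != tail) prev.next = cur.next; // the tail node is never unlinked
                    return true;
                }
            }
            return false;
        }

        int nodeCount() { // physical chain length, dead nodes included
            int n = 0;
            for (Node<E> cur = head.next; cur != null; cur = cur.next) n++;
            return n;
        }
    }

    public static void main(String[] args) {
        ToyQueue<String> q = new ToyQueue<>();
        q.add("first");                // retained for the queue's lifetime, like a long-lived Ref
        for (int i = 0; i < 1000; i++) {
            q.add("ref-" + i);         // always becomes the tail...
            q.remove("ref-" + i);      // ...so its node is nulled but never unlinked
        }
        // one live node plus 1000 empty ones: the "growing, but completely empty, middle"
        System.out.println(q.nodeCount()); // prints 1001
    }
}
```

With one queue per sstable and sstable lifetimes effectively infinite under DTCS, this per-removal leak of one node matches the steadily climbing ConcurrentLinkedQueue$Node counts in the reporter's heap dumps.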
[jira] [Commented] (CASSANDRA-9379) Use a collection supporting more efficient removal in Ref.GlobalState
[ https://issues.apache.org/jira/browse/CASSANDRA-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14587250#comment-14587250 ] Benedict commented on CASSANDRA-9379: - For 3.0, I think we should introduce the data structure I linked above, for use in managing Ref states. While CASSANDRA-9549 will be fixed by CLDeque, this data structure still has O(N) modification time. Generally this is fine, but if we have buggy / pathological behaviour again, it would be comforting to know that no operation is slowed down by this. Since the code already exists, and has pretty thorough accompanying test coverage, the labour involved should be minimal. What would be the preferred way to do this: bring in tree, or publish my artefacts to maven? Use a collection supporting more efficient removal in Ref.GlobalState - Key: CASSANDRA-9379 URL: https://issues.apache.org/jira/browse/CASSANDRA-9379 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Priority: Minor Ref is intended only to be used in places where there are very few Ref instances against a given object extant at any moment, so this collection does not need to be performant. But to avoid performance regressions, such as accidentally introduced in CASSANDRA-8897 (but avoidable via the scaling back of Ref use, since no longer necessary), we could use a collection that supports more efficient removal. I would prefer, however, not to use either of CHM or NBHM, since both are heavyweight objects, wasting a lot of heap; the former is also blocking, and the latter could be problematic for this kind of workload, since it can leave references present in the map after a deletion. 
The most suitable structure is the one I blogged about [here|http://belliottsmith.com/eventual-consistency-concurrent-data-structures/] and have on github [here|https://github.com/belliottsmith/bes-utils/blob/master/src/bes/concurrent/collections/SimpleCollection.java], since it offers lock-free append and wait-free removal, and ensures space utilization is as low as possible. Thoughts/opinions? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
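The core idea of that collection — removal that never scans — can be sketched in a few lines. This is an illustrative, single-threaded sketch only (hypothetical names; the real SimpleCollection linked above adds the lock-free append and wait-free removal machinery): add() hands back a handle, so removal is an O(1) unlink rather than an O(N) search.

```java
// Single-threaded sketch of the "handle" idea behind efficient removal:
// add() returns a handle into a doubly-linked chain, so remove() unlinks in O(1)
// without scanning. (Illustrative only; the concurrent version is more involved.)
public class HandleListSketch {
    static final class Handle<E> {
        E item; Handle<E> prev, next;
        Handle(E item) { this.item = item; }
    }

    static final class HandleList<E> {
        private final Handle<E> head = new Handle<>(null); // sentinel

        Handle<E> add(E e) {               // O(1) push-front
            Handle<E> n = new Handle<>(e);
            n.next = head.next;
            n.prev = head;
            if (head.next != null) head.next.prev = n;
            head.next = n;
            return n;                      // caller keeps this for O(1) removal
        }

        void remove(Handle<E> h) {         // O(1): no scan of the collection
            h.prev.next = h.next;
            if (h.next != null) h.next.prev = h.prev;
            h.item = null;
        }

        int size() {
            int n = 0;
            for (Handle<E> cur = head.next; cur != null; cur = cur.next) n++;
            return n;
        }
    }

    public static void main(String[] args) {
        HandleList<String> refs = new HandleList<>();
        Handle<String> a = refs.add("ref-a");
        Handle<String> b = refs.add("ref-b");
        refs.add("ref-c");
        refs.remove(a);
        refs.remove(b);
        System.out.println(refs.size()); // prints 1
    }
}
```

Under this shape, even pathological Ref churn (as in CASSANDRA-9549) costs O(1) per release instead of degrading with the number of extant or dead entries.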
[jira] [Commented] (CASSANDRA-9599) Echo message in Gossip should use IAsyncCallbackWithFailure
[ https://issues.apache.org/jira/browse/CASSANDRA-9599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586704#comment-14586704 ] sankalp kohli commented on CASSANDRA-9599: -- cc [~brandon.williams] Echo message in Gossip should use IAsyncCallbackWithFailure --- Key: CASSANDRA-9599 URL: https://issues.apache.org/jira/browse/CASSANDRA-9599 Project: Cassandra Issue Type: Improvement Reporter: sankalp kohli Priority: Minor It will help a lot in debugging if we use IAsyncCallbackWithFailure for Echo messages. We can log an error if this times out. Also, why are Echo messages processed in the GOSSIP stage? Can this be moved to some other stage? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9549) Memory leak
[ https://issues.apache.org/jira/browse/CASSANDRA-9549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-9549: -- Reproduced In: 2.1.6, 2.1.5 (was: 2.1.5)
[jira] [Updated] (CASSANDRA-9423) Improve Leak Detection to cover strong reference leaks
[ https://issues.apache.org/jira/browse/CASSANDRA-9423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-9423: Fix Version/s: (was: 3.0 beta 1) 3.0.x Improve Leak Detection to cover strong reference leaks -- Key: CASSANDRA-9423 URL: https://issues.apache.org/jira/browse/CASSANDRA-9423 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Fix For: 3.0.x Currently we detect resources that become unreachable without having been cleaned up. We could also detect references that appear to have leaked without becoming unreachable, by periodically scanning the set of extant refs and checking whether they are reachable via their normal means (if any); if their lifetime is unexpectedly long, this likely indicates a problem, and we can log a warning/error. Assigning to myself so as not to forget it, since this may well help especially with [~tjake]'s concerns highlighted on 8099 for 3.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
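The proposal above — periodically scan the extant refs and flag any whose lifetime is unexpectedly long — can be sketched minimally. All names here are hypothetical (this is not Cassandra's Ref/leak-detection code), and the reachability check "via their normal means" is omitted; only the lifetime heuristic is shown:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the proposal (illustrative names, not Cassandra code):
// record when each ref is acquired, sweep the extant set periodically, and flag
// refs whose age exceeds an expected lifetime as probable strong-reference leaks.
public class RefLifetimeWatcher {
    private final Map<Object, Long> extant = new ConcurrentHashMap<>();
    private final long maxExpectedLifetimeMillis;

    public RefLifetimeWatcher(long maxExpectedLifetimeMillis) {
        this.maxExpectedLifetimeMillis = maxExpectedLifetimeMillis;
    }

    public void acquired(Object ref) { extant.put(ref, System.currentTimeMillis()); }

    public void released(Object ref) { extant.remove(ref); }

    // Run from a scheduled task; the caller logs a warning/error for each suspect.
    public List<Object> sweep() {
        long now = System.currentTimeMillis();
        List<Object> suspects = new ArrayList<>();
        for (Map.Entry<Object, Long> e : extant.entrySet())
            if (now - e.getValue() > maxExpectedLifetimeMillis)
                suspects.add(e.getKey());
        return suspects;
    }
}
```

Unlike the existing detection, which only fires once a leaked resource becomes unreachable, this kind of sweep can surface refs that are leaked while still strongly reachable.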
[jira] [Commented] (CASSANDRA-9423) Improve Leak Detection to cover strong reference leaks
[ https://issues.apache.org/jira/browse/CASSANDRA-9423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585629#comment-14585629 ] Benedict commented on CASSANDRA-9423: - Since CASSANDRA-8099 is modifying its use of OpOrder, its importance is much lower.
[jira] [Updated] (CASSANDRA-9423) Improve Leak Detection to cover strong reference leaks
[ https://issues.apache.org/jira/browse/CASSANDRA-9423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-9423: Priority: Minor (was: Major)
[jira] [Commented] (CASSANDRA-7145) FileNotFoundException during compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-7145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14585644#comment-14585644 ] ZhaoYang commented on CASSANDRA-7145: - encounter the same problem: nodetool compact failed. log: ERROR 08:38:42 Exception in thread Thread[CompactionExecutor:1,1,main] java.lang.RuntimeException: java.io.FileNotFoundException: /home/zhaoyang/program/cassandra-2.1.1/./bin/../data/data/ycsb/usertable-4fb3447310e211e5a5c2f18cd916802f/ycsb-usertable-ka-6440-Data.db (Too many open files) I was doing YCSB testing. Setup: C* 2.1.1, Ubuntu 14.04 LTS, 24GB Ram, 256 ssd. FileNotFoundException during compaction --- Key: CASSANDRA-7145 URL: https://issues.apache.org/jira/browse/CASSANDRA-7145 Project: Cassandra Issue Type: Bug Environment: CentOS 6.3, Datastax Enterprise 4.0.1 (Cassandra 2.0.5), Java 1.7.0_55 Reporter: PJ Assignee: Marcus Eriksson Fix For: 1.2.19, 2.0.11, 2.1.0 Attachments: 0001-avoid-marking-compacted-sstables-as-compacting.patch, compaction - FileNotFoundException.txt, repair - RuntimeException.txt, startup - AssertionError.txt I can't finish any compaction because my nodes always throw a FileNotFoundException. I've already tried the following but nothing helped: 1. nodetool flush 2. nodetool repair (ends with RuntimeException; see attachment) 3. node restart (via dse cassandra-stop) Whenever I restart the nodes, another type of exception is logged (see attachment) somewhere near the end of startup process. This particular exception doesn't seem to be critical because the nodes still manage to finish the startup and become online. I don't have specific steps to reproduce the problem that I'm experiencing with compaction and repair. I'm in the middle of migrating 4.8 billion rows from MySQL via SSTableLoader. Some things that may or may not be relevant: 1. I didn't drop and recreate the keyspace (so probably not related to CASSANDRA-4857) 2. 
I do the bulk-loading in batches of 1 to 20 million rows. When a batch reaches 100% total progress (i.e. starts to build the secondary index), I kill the sstableloader process and cancel the index build 3. I restart the nodes occasionally. It's possible that there is an ongoing compaction during one of those restarts. Related StackOverflow question (mine): http://stackoverflow.com/questions/23435847/filenotfoundexception-during-compaction -- This message was sent by Atlassian JIRA (v6.3.4#6332)
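The "Too many open files" error in the comment above means the Cassandra process hit its RLIMIT_NOFILE cap, which is easy to reach during heavy compaction or bulk loading. A minimal, OS-level sketch (not Cassandra-specific) of inspecting the limits from Python on a POSIX system:

```python
import resource

# "Too many open files" == the process exhausted RLIMIT_NOFILE.
# Inspect the soft (enforced) and hard (ceiling) limits for this process.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open-file limits: soft={soft} hard={hard}")

# The soft limit can be raised up to the hard limit without privileges.
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
new_soft, _ = resource.getrlimit(resource.RLIMIT_NOFILE)
assert new_soft == hard
```

The same check from a shell is `ulimit -n`; raising the hard limit for the cassandra user is typically done in /etc/security/limits.conf.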
[jira] [Commented] (CASSANDRA-7281) SELECT on tuple relations are broken for mixed ASC/DESC clustering order
[ https://issues.apache.org/jira/browse/CASSANDRA-7281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14587132#comment-14587132 ] Stefania commented on CASSANDRA-7281: - Thanks! SELECT on tuple relations are broken for mixed ASC/DESC clustering order Key: CASSANDRA-7281 URL: https://issues.apache.org/jira/browse/CASSANDRA-7281 Project: Cassandra Issue Type: Bug Reporter: Sylvain Lebresne Assignee: Marcin Szymaniuk Fix For: 2.1.x Attachments: 0001-CASSANDRA-7281-SELECT-on-tuple-relations-are-broken-.patch, 0001-CASSANDRA-7281-SELECT-on-tuple-relations-are-broken-v2.patch, 0001-CASSANDRA-7281-SELECT-on-tuple-relations-are-broken-v3.patch, 0001-CASSANDRA-7281-SELECT-on-tuple-relations-are-broken-v4.patch, 7281_unit_tests.txt As noted on [CASSANDRA-6875|https://issues.apache.org/jira/browse/CASSANDRA-6875?focusedCommentId=13992153page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13992153], the tuple notation is broken when the clustering order mixes ASC and DESC directives, because the range of data it describes doesn't correspond to a single contiguous slice internally. To copy the example from CASSANDRA-6875:
{noformat}
cqlsh:ks> create table foo (a int, b int, c int, PRIMARY KEY (a, b, c)) WITH CLUSTERING ORDER BY (b DESC, c ASC);
cqlsh:ks> INSERT INTO foo (a, b, c) VALUES (0, 2, 0);
cqlsh:ks> INSERT INTO foo (a, b, c) VALUES (0, 1, 0);
cqlsh:ks> INSERT INTO foo (a, b, c) VALUES (0, 1, 1);
cqlsh:ks> INSERT INTO foo (a, b, c) VALUES (0, 0, 0);
cqlsh:ks> SELECT * FROM foo WHERE a=0;

 a | b | c
---+---+---
 0 | 2 | 0
 0 | 1 | 0
 0 | 1 | 1
 0 | 0 | 0

(4 rows)

cqlsh:ks> SELECT * FROM foo WHERE a=0 AND (b, c) > (1, 0);

 a | b | c
---+---+---
 0 | 2 | 0

(1 rows)
{noformat}
The last query should really return {{(0, 2, 0)}} and {{(0, 1, 1)}}. For that specific example we should generate 2 internal slices, but I believe that with more clustering columns we may have more slices. 
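The slice-splitting argument can be checked with a small Python sketch (illustrative only, not Cassandra code): CQL compares tuples lexicographically, but with clustering order (b DESC, c ASC) the matching rows are not adjacent in storage order, so no single contiguous slice can cover them.

```python
# Rows from the ticket's example, listed in clustering order (b DESC, c ASC).
rows = [(0, 2, 0), (0, 1, 0), (0, 1, 1), (0, 0, 0)]

# CQL tuple semantics: (b, c) > (1, 0) is plain lexicographic comparison.
expected = [r for r in rows if (r[1], r[2]) > (1, 0)]
assert expected == [(0, 2, 0), (0, 1, 1)]

# In storage order the matches sit at indices 0 and 2, with the
# non-matching (0, 1, 0) between them; hence the predicate needs two
# internal slices, (b > 1) and (b == 1 AND c > 0), not one.
matches = [i for i, r in enumerate(rows) if (r[1], r[2]) > (1, 0)]
assert matches == [0, 2]
```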
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9577) Cassandra not performing GC on stale SStables after compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-9577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14585476#comment-14585476 ] Marcus Eriksson commented on CASSANDRA-9577: - Is this just happening on one host? And did you do a major compaction? Did that major compaction take almost 4 days? Can you attach full logs? Cassandra not performing GC on stale SStables after compaction -- Key: CASSANDRA-9577 URL: https://issues.apache.org/jira/browse/CASSANDRA-9577 Project: Cassandra Issue Type: Bug Components: Core Environment: 2.0.12.200 / DSE 4.6.1. Reporter: Jeff Ferland Assignee: Marcus Eriksson Space used (live), bytes: 878681716067 Space used (total), bytes: 2227857083852
jbf@ip-10-0-2-98:/ebs/cassandra/data/trends/trends$ sudo lsof *-Data.db
COMMAND PID  USER      FD   TYPE DEVICE     SIZE/OFF  NODE NAME
java    4473 cassandra 446r REG  0,26    17582559172 39241 trends-trends-jb-144864-Data.db
java    4473 cassandra 448r REG  0,26       62040962 37431 trends-trends-jb-144731-Data.db
java    4473 cassandra 449r REG  0,26   829935047545 21150 trends-trends-jb-143581-Data.db
java    4473 cassandra 452r REG  0,26        8980406 39503 trends-trends-jb-144882-Data.db
java    4473 cassandra 454r REG  0,26        8980406 39503 trends-trends-jb-144882-Data.db
java    4473 cassandra 462r REG  0,26        9487703 39542 trends-trends-jb-144883-Data.db
java    4473 cassandra 463r REG  0,26       36158226 39629 trends-trends-jb-144889-Data.db
java    4473 cassandra 468r REG  0,26      105693505 39447 trends-trends-jb-144881-Data.db
java    4473 cassandra 530r REG  0,26    17582559172 39241 trends-trends-jb-144864-Data.db
java    4473 cassandra 535r REG  0,26      105693505 39447 trends-trends-jb-144881-Data.db
java    4473 cassandra 542r REG  0,26        9487703 39542 trends-trends-jb-144883-Data.db
java    4473 cassandra 553u REG  0,26     6431729821 39556 trends-trends-tmp-jb-144884-Data.db
jbf@ip-10-0-2-98:/ebs/cassandra/data/trends/trends$ ls *-Data.db
trends-trends-jb-142631-Data.db trends-trends-jb-143562-Data.db trends-trends-jb-143581-Data.db trends-trends-jb-144731-Data.db trends-trends-jb-144883-Data.db
trends-trends-jb-142633-Data.db trends-trends-jb-143563-Data.db trends-trends-jb-144530-Data.db trends-trends-jb-144864-Data.db trends-trends-jb-144889-Data.db
trends-trends-jb-143026-Data.db trends-trends-jb-143564-Data.db trends-trends-jb-144551-Data.db trends-trends-jb-144881-Data.db trends-trends-tmp-jb-144884-Data.db
trends-trends-jb-143533-Data.db trends-trends-jb-143578-Data.db trends-trends-jb-144552-Data.db trends-trends-jb-144882-Data.db
jbf@ip-10-0-2-98:/ebs/cassandra/data/trends/trends$ cd -
/mnt/cassandra/data/trends/trends
jbf@ip-10-0-2-98:/mnt/cassandra/data/trends/trends$ sudo lsof *
jbf@ip-10-0-2-98:/mnt/cassandra/data/trends/trends$ ls *-Data.db
trends-trends-jb-124502-Data.db trends-trends-jb-141113-Data.db trends-trends-jb-141377-Data.db trends-trends-jb-141846-Data.db trends-trends-jb-144890-Data.db
trends-trends-jb-125457-Data.db trends-trends-jb-141123-Data.db trends-trends-jb-141391-Data.db trends-trends-jb-141871-Data.db trends-trends-jb-41121-Data.db
trends-trends-jb-130016-Data.db trends-trends-jb-141137-Data.db trends-trends-jb-141538-Data.db trends-trends-jb-141883-Data.db trends-trends.trends_date_idx-jb-2100-Data.db
trends-trends-jb-139563-Data.db trends-trends-jb-141358-Data.db trends-trends-jb-141806-Data.db trends-trends-jb-142033-Data.db
trends-trends-jb-141102-Data.db trends-trends-jb-141363-Data.db trends-trends-jb-141829-Data.db trends-trends-jb-144553-Data.db
Compaction started:
INFO [CompactionExecutor:6661] 2015-06-05 14:02:36,515 CompactionTask.java (line 120) Compacting [SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-124502-Data.db'), SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-141358-Data.db'), SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-141883-Data.db'), SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-141846-Data.db'), SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-141871-Data.db'), SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-141391-Data.db'), SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-139563-Data.db'), SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-125457-Data.db'), SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-141806-Data.db'),
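The gap between "Space used (live)" and "Space used (total)" in the report above is the classic symptom of compacted-away sstables whose files are deleted on disk but still held open by the process. The POSIX mechanism can be demonstrated in isolation (a generic sketch, not Cassandra code):

```python
import os
import tempfile

# On POSIX, an unlinked file keeps consuming disk space while any file
# descriptor still references it -- the same mechanism that keeps
# obsolete sstables alive until the process drops its references.
fd, path = tempfile.mkstemp()
os.write(fd, b"x" * 1024)

os.unlink(path)                       # name removed from the directory...
assert not os.path.exists(path)       # ...so the path is gone,
assert os.fstat(fd).st_size == 1024   # but the data is still allocated.

os.close(fd)                          # space is reclaimed only now
```

From a shell, `sudo lsof +L1` lists exactly these deleted-but-open files (link count 0), which is a quick way to confirm the diagnosis on a live node.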
[jira] [Commented] (CASSANDRA-9597) DTCS should consider file SIZE in addition to time windowing
[ https://issues.apache.org/jira/browse/CASSANDRA-9597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14585527#comment-14585527 ] Jeff Jirsa commented on CASSANDRA-9597: --- You can understand why this happens when you realize that the sstables are filtered by max timestamp: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/DateTieredCompactionStrategy.java#L178 And then the resulting list is sorted by min timestamp: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/DateTieredCompactionStrategy.java#L357-L367 The result is that for roughly evenly distributed time periods (file size proportional to sstable maxTimestamp - sstable minTimestamp, which is likely mostly true for most DTCS workloads), larger files will always be at the front of {{trimToThreshold}}, which virtually guarantees we'll re-compact a very large sstable over and over and over if any other sstables are in the window for compaction. DTCS should consider file SIZE in addition to time windowing Key: CASSANDRA-9597 URL: https://issues.apache.org/jira/browse/CASSANDRA-9597 Project: Cassandra Issue Type: Improvement Reporter: Jeff Jirsa Priority: Minor Labels: dtcs DTCS seems to work well for the typical use case - writing data in perfect time order, compacting recent files, and ignoring older files. However, there are normal operational actions where DTCS will fall behind and is unlikely to recover. An example of this is streaming operations (for example, bootstrap or loading data into a cluster using sstableloader), where lots (tens of thousands) of very small sstables can be created spanning multiple time buckets. 
In these cases, even if max_sstable_age_days is extended to allow the older incoming files to be compacted, the selection logic is likely to re-compact large files with a few small files over and over, rather than prioritizing selection of the max_threshold smallest files to decrease the number of candidate sstables as quickly as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
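The pathology Jeff describes can be reproduced with a toy model (a deliberate simplification of the DTCS selection path, not the actual Cassandra code: names like `trim_to_threshold` and the omission of window filtering are illustrative assumptions):

```python
# Toy model: candidates in a compaction window are sorted by min timestamp
# and cut to max_threshold. With file size roughly proportional to
# (max_ts - min_ts), the widest/largest sstable sorts first and is
# re-selected on every pass while small sstables keep arriving.
MAX_THRESHOLD = 32

def trim_to_threshold(sstables, max_threshold=MAX_THRESHOLD):
    # sstables: list of (min_ts, max_ts, size_bytes) tuples
    return sorted(sstables, key=lambda s: s[0])[:max_threshold]

big_old = (0, 1000, 10_000_000_000)    # wide time span => huge file
small = [(900 + i, 905 + i, 1_000_000) for i in range(50)]

picked = trim_to_threshold([big_old] + small)
assert picked[0] == big_old            # the giant file leads every selection
assert len(picked) == MAX_THRESHOLD    # and crowds the window it shares
```

A size-aware tie-break (e.g. preferring the max_threshold smallest candidates) would break this cycle, which is what the ticket proposes.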
[jira] [Commented] (CASSANDRA-9499) Introduce writeVInt method to DataOutputStreamPlus
[ https://issues.apache.org/jira/browse/CASSANDRA-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14587084#comment-14587084 ] Benedict commented on CASSANDRA-9499: - Thanks. I know Cap'n Proto and our existing code use a loop, but why not just use {{8 - Integer.numberOfLeadingZeros(v)/8}}? The cyclic dependency between {{EncodedDIS}} and AbstractDIS is a bit confusing to me. I'd rather we simply marked {{EncodedDIS @Deprecated}}, and moved all of the implementation details somewhere that's acyclic. The read method we can make quite a bit more efficient, with a special version of {{prepareReadPrimitive}}: we want the result to be that {{length}} bytes are in the buffer for consumption, but that we are also 8 bytes or more before the end of the buffer. This way we can just call {{buffer.getLong(buffer.position())}}, then advance its position by {{length}} and truncate the long with {{-1L >>> (64 - (length * 8))}} (where {{length}} here excludes the initial size byte). Introduce writeVInt method to DataOutputStreamPlus -- Key: CASSANDRA-9499 URL: https://issues.apache.org/jira/browse/CASSANDRA-9499 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Ariel Weisberg Priority: Minor Fix For: 3.0 beta 1 CASSANDRA-8099 really could do with a writeVInt method, for both fixing CASSANDRA-9498 but also efficiently encoding timestamp/deletion deltas. It should be possible to make an especially efficient implementation against BufferedDataOutputStreamPlus. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
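The two bit tricks in the comment above can be verified with Python analogues of the Java expressions (using the 64-bit form, `8 - Long.numberOfLeadingZeros(v)/8`, since the payload can be up to 8 bytes; `bit_length()` stands in for `numberOfLeadingZeros`):

```python
def payload_bytes(v: int) -> int:
    """Bytes needed for v: the 64-bit analogue of 8 - numberOfLeadingZeros(v)/8."""
    assert 0 < v < 1 << 64
    nlz = 64 - v.bit_length()          # Long.numberOfLeadingZeros(v)
    return 8 - nlz // 8

assert payload_bytes(1) == 1
assert payload_bytes(0xFF) == 1        # last 1-byte value
assert payload_bytes(0x100) == 2       # first 2-byte value
assert payload_bytes((1 << 64) - 1) == 8

def mask(length: int) -> int:
    """The read-side truncation mask, -1L >>> (64 - length * 8), as an unsigned value."""
    return (1 << (length * 8)) - 1

assert mask(1) == 0xFF
assert mask(2) == 0xFFFF
assert mask(8) == (1 << 64) - 1        # full long, no truncation
```

The mask confirms the decoding idea: speculatively `getLong` at the current position, then AND away everything above the low `length` bytes.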
[jira] [Commented] (CASSANDRA-9499) Introduce writeVInt method to DataOutputStreamPlus
[ https://issues.apache.org/jira/browse/CASSANDRA-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14587204#comment-14587204 ] Benedict commented on CASSANDRA-9499: - It's a real shame we can't modify the encoding, or at least, it's not worth the effort. The implementation of read, in particular, could have a much clearer and more efficient decoding of size if the negative value positions were inverted. While we're touching this code, it is worth cleaning up these methods a little: there's no point decoding the size of 1, and then deducting from it; we can just return immediately if {{firstByte >= MIN_BYTE_VALUE // == -112}}, and always decode the size to a value in the range [2..8]. {{vIntIsNegative}}: it has three conditional expressions, and only one of them is needed; the other two are always false (since we return those negative values up front; and we wouldn't want to invert them anyway, I would guess), and then {{vintDecodeSize}} can be made into just a ternary statement. Introduce writeVInt method to DataOutputStreamPlus -- Key: CASSANDRA-9499 URL: https://issues.apache.org/jira/browse/CASSANDRA-9499 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Ariel Weisberg Priority: Minor Fix For: 3.0 beta 1 CASSANDRA-8099 really could do with a writeVInt method, for both fixing CASSANDRA-9498 but also efficiently encoding timestamp/deletion deltas. It should be possible to make an especially efficient implementation against BufferedDataOutputStreamPlus. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
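For context on the -112 boundary: the framing under discussion appears to follow Hadoop's WritableUtils-style vint, where a first byte >= -112 carries the value inline and lower values encode sign plus payload length. Under that assumption (the function names mirror the comment, not the actual patch), the ternary form of the size decode looks like:

```python
# Assumed Hadoop-WritableUtils-style framing: firstByte >= -112 is the value
# itself; -113..-120 marks a positive payload of 1..8 bytes; -121..-128 a
# negative payload of 1..8 bytes.
def vint_is_negative(first_byte: int) -> bool:
    # single condition, as suggested: inline values were returned up front
    return first_byte < -120

def vint_decode_size(first_byte: int) -> int:
    # total encoded size in bytes, as one ternary chain
    return 1 if first_byte >= -112 else (
        -119 - first_byte if first_byte < -120 else -111 - first_byte)

assert vint_decode_size(-112) == 1   # inline value, no payload
assert vint_decode_size(-113) == 2   # smallest multi-byte positive
assert vint_decode_size(-120) == 9   # 8-byte positive payload
assert vint_decode_size(-121) == 2   # smallest multi-byte negative
assert vint_is_negative(-128) and not vint_is_negative(-120)
```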
[jira] [Comment Edited] (CASSANDRA-9499) Introduce writeVInt method to DataOutputStreamPlus
[ https://issues.apache.org/jira/browse/CASSANDRA-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14587084#comment-14587084 ] Benedict edited comment on CASSANDRA-9499 at 6/16/15 12:32 AM: --- Thanks. I know Cap'n Proto and our existing code use a loop, but why not just use {{8 - Integer.numberOfLeadingZeros(v)/8}}? The cyclic dependency between {{EncodedDIS}} and AbstractDIS is a bit confusing to me. I'd rather we simply marked {{EncodedDIS @Deprecated}}, and moved all of the implementation details somewhere that's acyclic. The read method we can make quite a bit more efficient, with a special version of {{prepareReadPrimitive}}: we want the result to be that {{length}} bytes are in the buffer for consumption, but that we are also 8 bytes or more before the end of the buffer. This way we can just call {{buffer.getLong(buffer.position())}}, then advance its position by {{length}} and truncate the long with {{-1L >>> (64 - (length * 8))}} (where {{length}} here excludes the initial size byte). It occurs to me there's nothing stopping us from using this approach for writing as well, simply calling putLong(), with an optional writeByte() to fill in the most-significant byte if it was non-empty. was (Author: benedict): Thanks. I know Cap'n Proto and our existing code use a loop, but why not just use {{8 - Integer.numberOfLeadingZeros(v)/8}}? The cyclic dependency between {{EncodedDIS}} and AbstractDIS is a bit confusing to me. I'd rather we simply marked {{EncodedDIS @Deprecated}}, and moved all of the implementation details somewhere that's acyclic. The read method we can make quite a bit more efficient, with a special version of {{prepareReadPrimitive}}: we want the result to be that {{length}} bytes are in the buffer for consumption, but that we are also 8 bytes or more before the end of the buffer. This way we can just call {{buffer.getLong(buffer.position())}}, then advance its position by {{length}} and truncate the long with {{-1L >>> (64 - (length * 8))}} (where {{length}} here excludes the initial size byte). Introduce writeVInt method to DataOutputStreamPlus -- Key: CASSANDRA-9499 URL: https://issues.apache.org/jira/browse/CASSANDRA-9499 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Ariel Weisberg Priority: Minor Fix For: 3.0 beta 1 CASSANDRA-8099 really could do with a writeVInt method, for both fixing CASSANDRA-9498 but also efficiently encoding timestamp/deletion deltas. It should be possible to make an especially efficient implementation against BufferedDataOutputStreamPlus. -- This message was sent by Atlassian JIRA (v6.3.4#6332)