[jira] [Created] (CASSANDRA-9214) Add Data Type Serialization information for date and time types to v4 protocol spec
Joshua McKenzie created CASSANDRA-9214: -- Summary: Add Data Type Serialization information for date and time types to v4 protocol spec Key: CASSANDRA-9214 URL: https://issues.apache.org/jira/browse/CASSANDRA-9214 Project: Cassandra Issue Type: Improvement Reporter: Joshua McKenzie Assignee: Joshua McKenzie Priority: Trivial Fix For: 3.0 Overlooked in CASSANDRA-7523. Need to add serialization information to doc\native_protocol_v4.spec for the new types. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9198) Deleting from an empty list produces an error
[ https://issues.apache.org/jira/browse/CASSANDRA-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Jirsa updated CASSANDRA-9198:
--
Attachment: 9198.txt

Attaching patch that was previously option #2 on CASSANDRA-9077. Addresses this without modification:
{noformat}
cqlsh> create keyspace IF NOT EXISTS test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
cqlsh> USE test;
cqlsh:test> create table foo (k int primary key, v list<int>);
cqlsh:test> insert into foo(k,v) values (1, [1,2]);
cqlsh:test> update foo set v = v - [1] where k = 1;
cqlsh:test> update foo set v = v - [1] where k = 1;
cqlsh:test> update foo set v = v - [2] where k = 1;
cqlsh:test> update foo set v = v - [2] where k = 1;
cqlsh:test>
cqlsh:test> select * from foo;
 k | v
---+--
 1 | null
(1 rows)
{noformat}

Deleting from an empty list produces an error
-
Key: CASSANDRA-9198
URL: https://issues.apache.org/jira/browse/CASSANDRA-9198
Project: Cassandra
Issue Type: Bug
Components: API
Reporter: Olivier Michallat
Assignee: Benjamin Lerer
Priority: Minor
Fix For: 3.0
Attachments: 9198.txt

While deleting an element from a list that does not contain it is a no-op, deleting it from an empty list causes an error. This edge case is a bit inconsistent, because it makes list deletion non idempotent:
{code}
cqlsh:test> create table foo (k int primary key, v list<int>);
cqlsh:test> insert into foo(k,v) values (1, [1,2]);
cqlsh:test> update foo set v = v - [1] where k = 1;
cqlsh:test> update foo set v = v - [1] where k = 1;
cqlsh:test> update foo set v = v - [2] where k = 1;
cqlsh:test> update foo set v = v - [2] where k = 1;
InvalidRequest: code=2200 [Invalid query] message=Attempted to delete an element from a list which is null
{code}
With speculative retries coming to the drivers, idempotency becomes more important because it determines which query we might retry or not. So it would be better if deleting from an empty list succeeded.
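The idempotency argument in CASSANDRA-9198 can be illustrated outside Cassandra. What follows is a minimal sketch, not the attached patch: `ListDiscard` and `discard` are hypothetical names, and the only behaviour taken from the ticket is that an emptied list reads back as null and that removing from it should be a no-op rather than an error.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of "v = v - [x]"; not Cassandra code.
final class ListDiscard {
    /**
     * Returns the list left after removing every element of toRemove.
     * A null input models the empty list (the ticket shows it reading back
     * as null) and is treated as a no-op instead of an error.
     */
    static List<Integer> discard(List<Integer> current, List<Integer> toRemove) {
        if (current == null) {
            return null; // nothing to remove, so the operation succeeds trivially
        }
        List<Integer> result = new ArrayList<>(current);
        result.removeAll(toRemove);
        // An emptied list is represented as null, matching the select output above.
        return result.isEmpty() ? null : result;
    }
}
```

With this shape, applying the same removal twice always gives the same result as applying it once, which is the property a driver needs before it can safely retry the statement.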
[jira] [Commented] (CASSANDRA-9193) Facility to write dynamic code to selectively trigger trace or log for queries
[ https://issues.apache.org/jira/browse/CASSANDRA-9193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503054#comment-14503054 ]

Matt Stump commented on CASSANDRA-9193:
---
I've got it so I can run JS when events are triggered, but the problem is that the partition key is serialized to a ByteBuffer fairly high in the stack, above StorageProxy. StorageProxy doesn't have the context of a CQL3 query, so if the query uses a composite partition key it's hard to check for equality when you want to filter by partition key. We have three options:
# Only allow JS introspection for CQL3 queries
# Expose functionality to compose composite partition keys in JS so that we can check for equality
# Require that the user know the fully serialized partition key beforehand.

I'm leaning towards options 1 and 3: offer multiple injection points, one at the CQL3 query processing level, and maybe later an injection point further down the stack which would require knowledge of the fully serialized partition key.

Facility to write dynamic code to selectively trigger trace or log for queries
--
Key: CASSANDRA-9193
URL: https://issues.apache.org/jira/browse/CASSANDRA-9193
Project: Cassandra
Issue Type: New Feature
Reporter: Matt Stump

I want the equivalent of dtrace for Cassandra. I want the ability to intercept a query with a dynamic script (assume JS) and, based on logic in that script, trigger the statement for trace or logging. Examples
- Trace only INSERT statements to a particular CF.
- Trace statements for a particular partition or consistency level.
- Log statements that fail to reach the desired consistency for read or write.
- Log if the request size for read or write exceeds some threshold.

At some point in the future it would be helpful to also do things such as log partitions greater than X bytes or Z cells when performing compaction. Essentially be able to inject custom code dynamically without a reboot to the different stages of C*.
The code should be executed synchronously as part of the monitored task, but we should provide the ability to log or execute CQL asynchronously from the provided API. Further down the line we could use this functionality to modify/rewrite requests or tasks dynamically. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-9213) Compaction errors observed during heavy write load: BAD RELEASE
Rocco Varela created CASSANDRA-9213: --- Summary: Compaction errors observed during heavy write load: BAD RELEASE Key: CASSANDRA-9213 URL: https://issues.apache.org/jira/browse/CASSANDRA-9213 Project: Cassandra Issue Type: Bug Environment: Cassandra 2.1.4.374 Ubuntu 14.04.2 java version 1.7.0_45 10-node cluster, RF = 3 Reporter: Rocco Varela Assignee: Benedict Fix For: 2.1.4 Attachments: COMPACTION-ERR.log During heavy write load testing we're seeing occasional compaction errors with the following error message: {code} ERROR [CompactionExecutor:40] 2015-04-16 17:01:16,936 Ref.java:170 - BAD RELEASE: attempted to release a reference (org.apache.cassandra.utils.concurrent.Ref$State@31d969bd) that has already been released ... ERROR [CompactionExecutor:40] 2015-04-16 17:01:22,190 CassandraDaemon.java:223 - Exception in thread Thread[CompactionExecutor:40,1,main] java.lang.AssertionError: null at org.apache.cassandra.io.sstable.SSTableReader.markObsolete(SSTableReader.java:1699) ~[cassandra-all-2.1.4.374.jar:2.1.4.374] at org.apache.cassandra.db.DataTracker.unmarkCompacting(DataTracker.java:240) ~[cassandra-all-2.1.4.374.jar:2.1.4.374] at org.apache.cassandra.io.sstable.SSTableRewriter.replaceWithFinishedReaders(SSTableRewriter.java:495) ~[cassandra-all-2.1.4.374.jar:2.1.4.374] at ... {code} I have turned on debugrefcount in bin/cassandra:launch_service() and I will repost another stack trace when it happens again. {code} cassandra_parms=$cassandra_parms -Dcassandra.debugrefcount=true {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9213) Compaction errors observed during heavy write load: BAD RELEASE
[ https://issues.apache.org/jira/browse/CASSANDRA-9213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rocco Varela updated CASSANDRA-9213: Assignee: (was: Benedict) Compaction errors observed during heavy write load: BAD RELEASE --- Key: CASSANDRA-9213 URL: https://issues.apache.org/jira/browse/CASSANDRA-9213 Project: Cassandra Issue Type: Bug Environment: Cassandra 2.1.4.374 Ubuntu 14.04.2 java version 1.7.0_45 10-node cluster, RF = 3 Reporter: Rocco Varela Fix For: 2.1.4 Attachments: COMPACTION-ERR.log During heavy write load testing we're seeing occasional compaction errors with the following error message: {code} ERROR [CompactionExecutor:40] 2015-04-16 17:01:16,936 Ref.java:170 - BAD RELEASE: attempted to release a reference (org.apache.cassandra.utils.concurrent.Ref$State@31d969bd) that has already been released ... ERROR [CompactionExecutor:40] 2015-04-16 17:01:22,190 CassandraDaemon.java:223 - Exception in thread Thread[CompactionExecutor:40,1,main] java.lang.AssertionError: null at org.apache.cassandra.io.sstable.SSTableReader.markObsolete(SSTableReader.java:1699) ~[cassandra-all-2.1.4.374.jar:2.1.4.374] at org.apache.cassandra.db.DataTracker.unmarkCompacting(DataTracker.java:240) ~[cassandra-all-2.1.4.374.jar:2.1.4.374] at org.apache.cassandra.io.sstable.SSTableRewriter.replaceWithFinishedReaders(SSTableRewriter.java:495) ~[cassandra-all-2.1.4.374.jar:2.1.4.374] at ... {code} I have turned on debugrefcount in bin/cassandra:launch_service() and I will repost another stack trace when it happens again. {code} cassandra_parms=$cassandra_parms -Dcassandra.debugrefcount=true {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9193) Facility to write dynamic code to selectively trigger trace or log for queries
[ https://issues.apache.org/jira/browse/CASSANDRA-9193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503098#comment-14503098 ]

Matt Stump commented on CASSANDRA-9193:
---
The above example demonstrates the serialized partition key dilemma pretty well. A {{ReadCommand}} has a ByteBuffer for its key, which can be a serialized composite. I suppose I could provide a function that, given a CFMetaData and a key byte buffer, returns an array of Java objects to JS, one object per key component.

Facility to write dynamic code to selectively trigger trace or log for queries
--
Key: CASSANDRA-9193
URL: https://issues.apache.org/jira/browse/CASSANDRA-9193
Project: Cassandra
Issue Type: New Feature
Reporter: Matt Stump

I want the equivalent of dtrace for Cassandra. I want the ability to intercept a query with a dynamic script (assume JS) and, based on logic in that script, trigger the statement for trace or logging. Examples
- Trace only INSERT statements to a particular CF.
- Trace statements for a particular partition or consistency level.
- Log statements that fail to reach the desired consistency for read or write.
- Log if the request size for read or write exceeds some threshold.

At some point in the future it would be helpful to also do things such as log partitions greater than X bytes or Z cells when performing compaction. Essentially be able to inject custom code dynamically without a reboot to the different stages of C*. The code should be executed synchronously as part of the monitored task, but we should provide the ability to log or execute CQL asynchronously from the provided API. Further down the line we could use this functionality to modify/rewrite requests or tasks dynamically.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
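The helper proposed in the comment above could start from something like the sketch below. It is an illustration only: `CompositeKeySplitter` is a hypothetical name, and the code assumes the composite layout in which each component is written as a 2-byte big-endian length, the value bytes, and a single end-of-component byte. It returns the raw bytes of each component; turning them into Java objects would additionally need the per-component types held by CFMetaData, which is not attempted here.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; assumes <2-byte length><value><1-byte end-of-component> per component.
final class CompositeKeySplitter {
    /** Splits a serialized composite key into its raw components without moving the caller's position. */
    static List<ByteBuffer> split(ByteBuffer key) {
        ByteBuffer in = key.duplicate();        // independent position and limit
        in.order(ByteOrder.BIG_ENDIAN);         // lengths are read as big-endian shorts
        List<ByteBuffer> components = new ArrayList<>();
        while (in.hasRemaining()) {
            int length = in.getShort() & 0xFFFF; // unsigned 16-bit length
            ByteBuffer component = in.slice();   // view starting at the value bytes
            component.limit(length);
            components.add(component);
            in.position(in.position() + length); // skip the value
            in.get();                            // skip the end-of-component byte
        }
        return components;
    }
}
```

A script comparing partition keys could then test each component for equality on its own instead of needing the fully serialized key up front.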
[jira] [Commented] (CASSANDRA-8729) Commitlog causes read before write when overwriting
[ https://issues.apache.org/jira/browse/CASSANDRA-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503110#comment-14503110 ] Jeremiah Jordan commented on CASSANDRA-8729: Do we have a fix for 2.1 users? Commitlog causes read before write when overwriting --- Key: CASSANDRA-8729 URL: https://issues.apache.org/jira/browse/CASSANDRA-8729 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Ariel Weisberg Labels: commitlog Fix For: 3.0 The memory mapped commit log implementation writes directly to the page cache. If a page is not in the cache the kernel will read it in even though we are going to overwrite. The way to avoid this is to write to private memory, and then pad the write with 0s at the end so it is page (4k) aligned before writing to a file. The commit log would benefit from being refactored into something that looks more like a pipeline with incoming requests receiving private memory to write in, completed buffers being submitted to a parallelized compression/checksum step, followed by submission to another thread for writing to a file that preserves the order. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
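The padding step described in the ticket amounts to rounding a write up to the next page boundary. A minimal sketch, assuming a 4096-byte page as the description states; `PagePadding`, `paddedLength` and `padToPage` are hypothetical names, and a heap buffer stands in for whatever private memory the commit log would really use.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of page-aligned padding; not commit log code.
final class PagePadding {
    static final int PAGE_SIZE = 4096; // page size assumed from the ticket ("4k")

    /** Rounds length up to the next multiple of PAGE_SIZE. */
    static int paddedLength(int length) {
        return (length + PAGE_SIZE - 1) / PAGE_SIZE * PAGE_SIZE;
    }

    /** Copies the remaining bytes of data into a fresh buffer padded with zeros to a page boundary. */
    static ByteBuffer padToPage(ByteBuffer data) {
        // allocate() returns a zero-filled buffer, so the tail is already padded with 0s.
        ByteBuffer out = ByteBuffer.allocate(paddedLength(data.remaining()));
        out.put(data.duplicate()); // duplicate() keeps the caller's position unchanged
        out.rewind();              // position 0, limit still at the padded capacity
        return out;
    }
}
```

Writing whole pages means the kernel never has to read an existing page in order to merge a partial write into it, which is the read-before-write the ticket wants to avoid.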
[jira] [Commented] (CASSANDRA-7410) Pig support for BulkOutputFormat as a parameter in url
[ https://issues.apache.org/jira/browse/CASSANDRA-7410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503144#comment-14503144 ] Piotr Kołaczkowski commented on CASSANDRA-7410: --- Patch doesn't apply to 2.0 branch: {noformat} pkolaczk@m4600 ~/Projekty/DataStax/cassandra $ git fetch pkolaczk@m4600 ~/Projekty/DataStax/cassandra $ git checkout cassandra-2.0 Already on 'cassandra-2.0' Your branch is up-to-date with 'origin/cassandra-2.0'. pkolaczk@m4600 ~/Projekty/DataStax/cassandra $ git apply 7410-v3-2.0-branch.txt 7410-v3-2.0-branch.txt:195: trailing whitespace. [columns=columns][where_clause=where_clause] + error: patch failed: src/java/org/apache/cassandra/io/sstable/CQLSSTableWriter.java:345 error: src/java/org/apache/cassandra/io/sstable/CQLSSTableWriter.java: patch does not apply {noformat} Pig support for BulkOutputFormat as a parameter in url -- Key: CASSANDRA-7410 URL: https://issues.apache.org/jira/browse/CASSANDRA-7410 Project: Cassandra Issue Type: Improvement Components: Hadoop Reporter: Alex Liu Assignee: Alex Liu Priority: Minor Fix For: 2.0.15 Attachments: 7410-2.0-branch.txt, 7410-2.1-branch.txt, 7410-v2-2.0-branch.txt, 7410-v3-2.0-branch.txt, CASSANDRA-7410-v2-2.1-branch.txt, CASSANDRA-7410-v3-2.1-branch.txt, CASSANDRA-7410-v4-2.0-branch.txt Add BulkOutputFormat support in Pig url -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-9215) Test with degraded or unreliable networks
Ariel Weisberg created CASSANDRA-9215: - Summary: Test with degraded or unreliable networks Key: CASSANDRA-9215 URL: https://issues.apache.org/jira/browse/CASSANDRA-9215 Project: Cassandra Issue Type: Test Reporter: Ariel Weisberg I have tried to test WAN replication using routing nodes with various configurations to simulate a bad network, and have not had good results realistically reproducing WAN performance. The fake WAN performed better than the real one. I think we need to do at least some of our testing over a link between data centers that are at least as distant as US East - US West. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8984) Introduce Transactional API for behaviours that can corrupt system state
[ https://issues.apache.org/jira/browse/CASSANDRA-8984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503434#comment-14503434 ]

Benedict commented on CASSANDRA-8984:
-
bq. BigTableWriter reaching into the guts of IndexWriter is error-prone
bq. There's enough usage of finish() and finishAndClose() floating around that it comes off as an undocumented extension

Agreed, these had both vaguely bugged me as well. The first I've fixed and uploaded (along with the missing nits from your past comment). The second I'm not sure how best to address: the problem is that it includes prepareToCommit in the semantics. So we have a few options: 1) some really horrendous generics; 2) moving prepareToCommit into the Transactional, making it no-args, and requiring any commit preparation arguments be provided in a separate method; 3) leaving as-is.

I think I'm leaning towards (2), though may change my mind once taken through to its conclusion. It isn't perfect, but it does allow us to clearly codify all correct behaviours, at the cost of needing a little use of only-temporary builder-like state inside of some Transactional objects, both for the prepareToCommit parameters, and also for any return values (like in SSTableRewriter, or SSTableWriter, where we return the list of readers, or the reader, respectively).

bq. We may want to convert the touched /io tests to take advantage of and exercise the various writers being Transactional

Yeah. The reader tests probably not, but we should perhaps introduce a special SequentialWriter test that can work on both kinds of implementation to test the behaviours are consistent with Transactional. We appear to not have any kind of SSTableWriter test, either. I think that should be a separate ticket, since its scope is much broader, but perhaps I can introduce a starter touching just this functionality and file a follow-up.
Introduce Transactional API for behaviours that can corrupt system state Key: CASSANDRA-8984 URL: https://issues.apache.org/jira/browse/CASSANDRA-8984 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Fix For: 2.1.5 Attachments: 8984_windows_timeout.txt As a penultimate (and probably final for 2.1, if we agree to introduce it there) round of changes to the internals managing sstable writing, I've introduced a new API called Transactional that I hope will make it much easier to write correct behaviour. As things stand we conflate a lot of behaviours into methods like close - the recent changes unpicked some of these, but didn't go far enough. My proposal here introduces an interface designed to support four actions (on top of their normal function): * prepareToCommit * commit * abort * cleanup In normal operation, once we have finished constructing a state change we call prepareToCommit; once all such state changes are prepared, we call commit. If at any point everything fails, abort is called. In _either_ case, cleanup is called at the very last. These transactional objects are all AutoCloseable, with the behaviour being to rollback any changes unless commit has completed successfully. The changes are actually less invasive than it might sound, since we did recently introduce abort in some places, as well as have commit like methods. This simply formalises the behaviour, and makes it consistent between all objects that interact in this way. Much of the code change is boilerplate, such as moving an object into a try-declaration, although the change is still non-trivial. What it _does_ do is eliminate a _lot_ of special casing that we have had since 2.1 was released. 
The data tracker API changes and compaction leftover cleanups should finish the job with making this much easier to reason about, but this change I think is worthwhile considering for 2.1, since we've just overhauled this entire area (and not released these changes), and this change is essentially just the finishing touches, so the risk is minimal and the potential gains reasonably significant. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
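The lifecycle described in the ticket (prepareToCommit, commit, abort, cleanup, with close rolling back unless commit completed) can be sketched as a small state machine. This is an illustration under stated assumptions, not the interface from the patch: `TransactionalSketch`, `RecordingTxn`, the `State` enum and the `do*` hook names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the described lifecycle; the real Transactional API may differ.
abstract class TransactionalSketch implements AutoCloseable {
    private enum State { IN_PROGRESS, PREPARED, COMMITTED, ABORTED }

    private State state = State.IN_PROGRESS;

    protected abstract void doPrepare();
    protected abstract void doCommit();
    protected abstract void doAbort();
    protected abstract void doCleanup();

    public final void prepareToCommit() {
        if (state != State.IN_PROGRESS)
            throw new IllegalStateException("cannot prepare from " + state);
        doPrepare();
        state = State.PREPARED;
    }

    public final void commit() {
        if (state != State.PREPARED)
            throw new IllegalStateException("cannot commit from " + state);
        doCommit();
        state = State.COMMITTED;
        doCleanup(); // cleanup runs last on the success path
    }

    public final void abort() {
        if (state == State.COMMITTED || state == State.ABORTED)
            return; // nothing left to roll back
        state = State.ABORTED;
        try { doAbort(); } finally { doCleanup(); } // cleanup runs last on the failure path
    }

    /** Rolls back unless commit() completed successfully. */
    @Override
    public final void close() { abort(); }
}

// Tiny implementation that only records which hooks ran, for demonstration.
final class RecordingTxn extends TransactionalSketch {
    final List<String> events = new ArrayList<>();
    @Override protected void doPrepare() { events.add("prepare"); }
    @Override protected void doCommit()  { events.add("commit"); }
    @Override protected void doAbort()   { events.add("abort"); }
    @Override protected void doCleanup() { events.add("cleanup"); }
}
```

Used in a try-with-resources block, a `RecordingTxn` that reaches `commit()` records prepare, commit, cleanup; one that leaves the block before committing records abort followed by cleanup instead. That is the "move the object into a try-declaration" boilerplate the description mentions.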
[jira] [Commented] (CASSANDRA-8984) Introduce Transactional API for behaviours that can corrupt system state
[ https://issues.apache.org/jira/browse/CASSANDRA-8984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503381#comment-14503381 ]

Joshua McKenzie commented on CASSANDRA-8984:

h6. Feedback:
There's enough usage of finish() and finishAndClose() floating around that it comes off as an undocumented extension to the Transactional interface. Given how clearly the flow of usage is documented in the header on Transactional, it's a bit surprising to see finish() and finishAndClose() calls with the frequency they have - there might be some value in codifying those methods / documenting their idiomatic usage.

BigTableWriter reaching into the guts of IndexWriter is error-prone, as future changes to resources in IndexWriter would need to be managed in the TxnProxy in BigTableWriter. Perhaps we could change IndexWriter to implement Transactional and keep the management of its resources more localized? For instance, rather than seeing:
{noformat}
protected Throwable doCleanup(Throwable accumulate)
{
    accumulate = iwriter.summary.close(accumulate);
    accumulate = iwriter.bf.close(accumulate);
    accumulate = iwriter.builder.close(accumulate);
    accumulate = dbuilder.close(accumulate);
    return accumulate;
}
{noformat}
I'd be much more comfortable seeing:
{noformat}
protected Throwable doCleanup(Throwable accumulate)
{
    accumulate = iwriter.close(accumulate);
    accumulate = dbuilder.close(accumulate);
    return accumulate;
}
{noformat}

h6. Nits:
We may want to convert the touched /io tests to take advantage of and exercise the various writers being Transactional
Formatting change in SSTableImport.java for JsonFactory factory decl/init is unnecessary

Tests look good on both linux and windows.
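The `close(accumulate)` idiom in the snippets above threads a Throwable through every cleanup step so that one failure does not prevent the remaining resources from being closed. A minimal sketch of how the localized version might look; `AccumulateSketch`, `merge`, and `IndexWriterSketch` are hypothetical stand-ins, not the classes in the patch.

```java
// Hypothetical illustration of the accumulate-and-continue cleanup idiom.
final class AccumulateSketch {
    /** Keeps the first failure as the primary one and attaches later failures as suppressed. */
    static Throwable merge(Throwable accumulate, Throwable t) {
        if (accumulate == null) return t;
        accumulate.addSuppressed(t);
        return accumulate;
    }

    /** Closes one resource, folding any failure into accumulate instead of throwing. */
    static Throwable close(Throwable accumulate, AutoCloseable resource) {
        try {
            resource.close();
        } catch (Throwable t) {
            accumulate = merge(accumulate, t);
        }
        return accumulate;
    }

    /** Stand-in for an index writer that owns its resources and closes them itself. */
    static final class IndexWriterSketch {
        private final AutoCloseable summary;
        private final AutoCloseable bf;
        private final AutoCloseable builder;

        IndexWriterSketch(AutoCloseable summary, AutoCloseable bf, AutoCloseable builder) {
            this.summary = summary;
            this.bf = bf;
            this.builder = builder;
        }

        // Callers only see this one method; adding a resource later changes nothing outside this class.
        Throwable close(Throwable accumulate) {
            accumulate = AccumulateSketch.close(accumulate, summary);
            accumulate = AccumulateSketch.close(accumulate, bf);
            accumulate = AccumulateSketch.close(accumulate, builder);
            return accumulate;
        }
    }
}
```

The design point is the one raised in the review: the owner of the resources is the only place that has to know how many there are, so the outer `doCleanup` shrinks to one call per collaborator.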
[jira] [Updated] (CASSANDRA-9216) NullPointerException (NPE) during Compaction Cache Serialization
[ https://issues.apache.org/jira/browse/CASSANDRA-9216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9216: --- Reproduced In: 2.1.4 Fix Version/s: (was: 2.1.4) 2.1.5 NullPointerException (NPE) during Compaction Cache Serialization Key: CASSANDRA-9216 URL: https://issues.apache.org/jira/browse/CASSANDRA-9216 Project: Cassandra Issue Type: Bug Environment: Debian linux wheezy patch current Reporter: Jason Kania Labels: Compaction, NPE Fix For: 2.1.5 In case this hasn't been reported (I looked but did not see it), a null pointer exception is occurring during compaction. The stack track is as follows: {code} ERROR [CompactionExecutor:50] 2015-04-20 13:42:43,827 CassandraDaemon.java:223 - Exception in thread Thread[CompactionExecutor:50,1,main] java.lang.NullPointerException: null at org.apache.cassandra.service.CacheService$KeyCacheSerializer.serialize(CacheService.java:475) ~[apache-cassandra-2.1.4.jar:2.1.4] at org.apache.cassandra.service.CacheService$KeyCacheSerializer.serialize(CacheService.java:463) ~[apache-cassandra-2.1.4.jar:2.1.4] at org.apache.cassandra.cache.AutoSavingCache$Writer.saveCache(AutoSavingCache.java:274) ~[apache-cassandra-2.1.4.jar:2.1.4] at org.apache.cassandra.db.compaction.CompactionManager$11.run(CompactionManager.java:1152) ~[apache-cassandra-2.1.4.jar:2.1.4] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_75] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_75] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_75] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9097) Repeated incremental nodetool repair results in failed repairs due to running anticompaction
[ https://issues.apache.org/jira/browse/CASSANDRA-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503513#comment-14503513 ] Jeremiah Jordan commented on CASSANDRA-9097: [~yukim] so do you think we can fix this in 2.1? Repeated incremental nodetool repair results in failed repairs due to running anticompaction Key: CASSANDRA-9097 URL: https://issues.apache.org/jira/browse/CASSANDRA-9097 Project: Cassandra Issue Type: Bug Reporter: Gustav Munkby Assignee: Yuki Morishita Priority: Minor Fix For: 2.1.5 I'm trying to synchronize incremental repairs over multiple nodes in a Cassandra cluster, and it does not seem to be easily achievable. In principle, the process iterates through the nodes of the cluster and performs `nodetool -h $NODE repair --incremental`, but that sometimes fails on subsequent nodes. The reason for failing seems to be that the repair returns as soon as the repair and the _local_ anticompaction have completed, but does not guarantee that remote anticompactions are complete. If I subsequently try to issue another repair command, it fails to start (and terminates with failure after about one minute). It usually isn't a problem, as the local anticompaction typically involves as much (or more) data as the remote ones, but sometimes not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8729) Commitlog causes read before write when overwriting
[ https://issues.apache.org/jira/browse/CASSANDRA-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503527#comment-14503527 ] Jeremiah Jordan commented on CASSANDRA-8729: His fix is actually to make the segment size = commit log size, aka 2 GB, which means you get one giant segment. That seems a little extreme, and makes commit log archiving much harder. If this really causes such a big performance degradation, can we just turn off segment recycle in 2.1? Seems like that isn't too invasive of a change, since we don't always recycle anyway? Commitlog causes read before write when overwriting --- Key: CASSANDRA-8729 URL: https://issues.apache.org/jira/browse/CASSANDRA-8729 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Ariel Weisberg Labels: commitlog Fix For: 3.0 The memory mapped commit log implementation writes directly to the page cache. If a page is not in the cache the kernel will read it in even though we are going to overwrite. The way to avoid this is to write to private memory, and then pad the write with 0s at the end so it is page (4k) aligned before writing to a file. The commit log would benefit from being refactored into something that looks more like a pipeline with incoming requests receiving private memory to write in, completed buffers being submitted to a parallelized compression/checksum step, followed by submission to another thread for writing to a file that preserves the order. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-9216) NullPointerException (NPE) during Compaction Cache Serialization
Jason Kania created CASSANDRA-9216:
--
Summary: NullPointerException (NPE) during Compaction Cache Serialization
Key: CASSANDRA-9216
URL: https://issues.apache.org/jira/browse/CASSANDRA-9216
Project: Cassandra
Issue Type: Bug
Environment: Debian linux wheezy patch current
Reporter: Jason Kania
Fix For: 2.1.4

In case this hasn't been reported (I looked but did not see it), a null pointer exception is occurring during compaction. The stack trace is as follows:
ERROR [CompactionExecutor:50] 2015-04-20 13:42:43,827 CassandraDaemon.java:223 - Exception in thread Thread[CompactionExecutor:50,1,main]
java.lang.NullPointerException: null
    at org.apache.cassandra.service.CacheService$KeyCacheSerializer.serialize(CacheService.java:475) ~[apache-cassandra-2.1.4.jar:2.1.4]
    at org.apache.cassandra.service.CacheService$KeyCacheSerializer.serialize(CacheService.java:463) ~[apache-cassandra-2.1.4.jar:2.1.4]
    at org.apache.cassandra.cache.AutoSavingCache$Writer.saveCache(AutoSavingCache.java:274) ~[apache-cassandra-2.1.4.jar:2.1.4]
    at org.apache.cassandra.db.compaction.CompactionManager$11.run(CompactionManager.java:1152) ~[apache-cassandra-2.1.4.jar:2.1.4]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_75]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_75]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_75]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9216) NullPointerException (NPE) during Compaction Cache Serialization
[ https://issues.apache.org/jira/browse/CASSANDRA-9216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503495#comment-14503495 ]

Philip Thompson commented on CASSANDRA-9216:

Could you possibly attach the entire system.log file? Do you know what table this was compacting when the exception occurred? Could you give us the schema for that table, especially the compaction strategy/options?

NullPointerException (NPE) during Compaction Cache Serialization
Key: CASSANDRA-9216
URL: https://issues.apache.org/jira/browse/CASSANDRA-9216
Project: Cassandra
Issue Type: Bug
Environment: Debian linux wheezy patch current
Reporter: Jason Kania
Labels: Compaction, NPE
Fix For: 2.1.5

In case this hasn't been reported (I looked but did not see it), a null pointer exception is occurring during compaction. The stack trace is as follows:
{code}
ERROR [CompactionExecutor:50] 2015-04-20 13:42:43,827 CassandraDaemon.java:223 - Exception in thread Thread[CompactionExecutor:50,1,main]
java.lang.NullPointerException: null
    at org.apache.cassandra.service.CacheService$KeyCacheSerializer.serialize(CacheService.java:475) ~[apache-cassandra-2.1.4.jar:2.1.4]
    at org.apache.cassandra.service.CacheService$KeyCacheSerializer.serialize(CacheService.java:463) ~[apache-cassandra-2.1.4.jar:2.1.4]
    at org.apache.cassandra.cache.AutoSavingCache$Writer.saveCache(AutoSavingCache.java:274) ~[apache-cassandra-2.1.4.jar:2.1.4]
    at org.apache.cassandra.db.compaction.CompactionManager$11.run(CompactionManager.java:1152) ~[apache-cassandra-2.1.4.jar:2.1.4]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_75]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_75]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_75]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9216) NullPointerException (NPE) during Compaction Cache Serialization
[ https://issues.apache.org/jira/browse/CASSANDRA-9216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503495#comment-14503495 ] Philip Thompson edited comment on CASSANDRA-9216 at 4/20/15 7:47 PM: - Could you possibly attach the entire system.log file? Do you know what table this was compacting when the exception occurred? Could you give us the schema for that table, especially the compaction strategy/options? was (Author: philipthompson): Could you possibly attack the entire system.log file? Do you know what table this was compacting when the exception occurred? Could you give us the schema for that table, especially the compaction strategy/options? NullPointerException (NPE) during Compaction Cache Serialization Key: CASSANDRA-9216 URL: https://issues.apache.org/jira/browse/CASSANDRA-9216 Project: Cassandra Issue Type: Bug Environment: Debian linux wheezy patch current Reporter: Jason Kania Labels: Compaction, NPE Fix For: 2.1.5 In case this hasn't been reported (I looked but did not see it), a null pointer exception is occurring during compaction. 
[jira] [Updated] (CASSANDRA-9216) NullPointerException (NPE) during Compaction Cache Serialization
[ https://issues.apache.org/jira/browse/CASSANDRA-9216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philip Thompson updated CASSANDRA-9216:

Description:
In case this hasn't been reported (I looked but did not see it), a null pointer exception is occurring during compaction. The stack trace is as follows:

{code}
ERROR [CompactionExecutor:50] 2015-04-20 13:42:43,827 CassandraDaemon.java:223 - Exception in thread Thread[CompactionExecutor:50,1,main]
java.lang.NullPointerException: null
    at org.apache.cassandra.service.CacheService$KeyCacheSerializer.serialize(CacheService.java:475) ~[apache-cassandra-2.1.4.jar:2.1.4]
    at org.apache.cassandra.service.CacheService$KeyCacheSerializer.serialize(CacheService.java:463) ~[apache-cassandra-2.1.4.jar:2.1.4]
    at org.apache.cassandra.cache.AutoSavingCache$Writer.saveCache(AutoSavingCache.java:274) ~[apache-cassandra-2.1.4.jar:2.1.4]
    at org.apache.cassandra.db.compaction.CompactionManager$11.run(CompactionManager.java:1152) ~[apache-cassandra-2.1.4.jar:2.1.4]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_75]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_75]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_75]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
{code}

was:
In case this hasn't been reported (I looked but did not see it), a null pointer exception is occurring during compaction. The stack trace is as follows:

ERROR [CompactionExecutor:50] 2015-04-20 13:42:43,827 CassandraDaemon.java:223 - Exception in thread Thread[CompactionExecutor:50,1,main]
java.lang.NullPointerException: null
    at org.apache.cassandra.service.CacheService$KeyCacheSerializer.serialize(CacheService.java:475) ~[apache-cassandra-2.1.4.jar:2.1.4]
    at org.apache.cassandra.service.CacheService$KeyCacheSerializer.serialize(CacheService.java:463) ~[apache-cassandra-2.1.4.jar:2.1.4]
    at org.apache.cassandra.cache.AutoSavingCache$Writer.saveCache(AutoSavingCache.java:274) ~[apache-cassandra-2.1.4.jar:2.1.4]
    at org.apache.cassandra.db.compaction.CompactionManager$11.run(CompactionManager.java:1152) ~[apache-cassandra-2.1.4.jar:2.1.4]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_75]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_75]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_75]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]

NullPointerException (NPE) during Compaction Cache Serialization

Key: CASSANDRA-9216
URL: https://issues.apache.org/jira/browse/CASSANDRA-9216
Project: Cassandra
Issue Type: Bug
Environment: Debian linux wheezy patch current
Reporter: Jason Kania
Labels: Compaction, NPE
Fix For: 2.1.4

In case this hasn't been reported (I looked but did not see it), a null pointer exception is occurring during compaction.
[jira] [Commented] (CASSANDRA-9216) NullPointerException (NPE) during Compaction Cache Serialization
[ https://issues.apache.org/jira/browse/CASSANDRA-9216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503519#comment-14503519 ]

Jason Kania commented on CASSANDRA-9216:

Unfortunately, the system log was deleted when the /var/log partition filled up after the database fell over. I only have this exception because another process monitors the logs for exceptions and sends them off board. I will be monitoring the system directly, so if the problem occurs again, I will capture it.

NullPointerException (NPE) during Compaction Cache Serialization

Key: CASSANDRA-9216
URL: https://issues.apache.org/jira/browse/CASSANDRA-9216
Project: Cassandra
Issue Type: Bug
Environment: Debian linux wheezy patch current
Reporter: Jason Kania
Labels: Compaction, NPE
Fix For: 2.1.5

In case this hasn't been reported (I looked but did not see it), a null pointer exception is occurring during compaction.
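The comment above mentions a separate process that watches the logs for exceptions and forwards them. As a rough illustration only, a watcher of that kind might look like the Python sketch below. It assumes the log layout of the ERROR entry quoted in this report (level, [thread], timestamp, source location, then the exception class and `at ...` frames on the following lines). The names `error_entries` and `ErrorEntry` are made up for this example; they are not part of Cassandra or of the reporter's tooling.

```python
import re
from typing import Iterable, Iterator, NamedTuple

# Assumed header layout, modelled on the entry quoted in this report:
# "ERROR [CompactionExecutor:50] 2015-04-20 13:42:43,827 CassandraDaemon.java:223 - ..."
ENTRY = re.compile(
    r"^(?P<level>[A-Z]+)\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<source>\S+)\s+-\s+(?P<msg>.*)$"
)
# A stack frame line: "at pkg.Class.method(File.java:123) ..."
FRAME = re.compile(r"^\s*at\s+(?P<method>[\w.$<>]+)\((?P<loc>[^)]*)\)")
# The exception class line: "java.lang.NullPointerException: null"
EXC = re.compile(r"^(?P<cls>[\w.$]+(?:Exception|Error))\b")


class ErrorEntry(NamedTuple):
    """Hypothetical record for one ERROR entry and its stack trace."""
    timestamp: str
    thread: str
    exception: str
    frames: list


def error_entries(lines: Iterable[str]) -> Iterator[ErrorEntry]:
    """Yield one ErrorEntry per ERROR log entry found in `lines`."""
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        header = ENTRY.match(line)
        if header:
            # A new log entry starts; flush the ERROR entry being collected.
            if current is not None:
                yield current
            current = None
            if header["level"] == "ERROR":
                current = ErrorEntry(header["ts"], header["thread"], "", [])
            continue
        if current is None:
            continue  # continuation line of a non-ERROR entry
        frame = FRAME.match(line)
        if frame:
            current.frames.append(f'{frame["method"]}({frame["loc"]})')
        elif not current.exception:
            exc = EXC.match(line.strip())
            if exc:
                current = current._replace(exception=exc["cls"])
    if current is not None:
        yield current
```

Feeding an open system.log file object to `error_entries` yields one record per ERROR entry; `frames[0]` is then the innermost frame, which for the trace in this report would be the `KeyCacheSerializer.serialize` call at CacheService.java:475. Forwarding the records elsewhere is left out on purpose, since the report does not say how that was done.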
[jira] [Commented] (CASSANDRA-6542) nodetool removenode hangs
[ https://issues.apache.org/jira/browse/CASSANDRA-6542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502397#comment-14502397 ]

Study Hsueh commented on CASSANDRA-6542:

Also observed in 2.1.13 on CentOS 6.6.

Node status log:

Host: 192.168.1.13

$ nodetool status
Datacenter: datacenter1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load      Tokens  Owns  Host ID                               Rack
UN  192.168.1.29  22.95 GB  256     ?     506910ae-a07b-4f74-8feb-a3f2b141dea5  rack1
UN  192.168.1.28  19.68 GB  256     ?     ed79b6ee-cae0-48f9-a420-338058e1f2c5  rack1
UN  192.168.1.13  25.72 GB  256     ?     595ea5ef-cecf-44c7-aa7f-424648791751  rack1
DN  192.168.1.27  ?         256     ?     2ca22f3d-f8d8-4bde-8cdc-de649056cf9c  rack1
UN  192.168.1.26  20.71 GB  256     ?     3c880801-8499-4b16-bce4-2bfbc79bed43  rack1

$ nodetool removenode force 2ca22f3d-f8d8-4bde-8cdc-de649056cf9c
# nodetool removenode hangs

$ nodetool removenode status
RemovalStatus: Removing token (-9132940871846770123). Waiting for replication confirmation from [/192.168.1.29,/192.168.1.28,/192.168.1.26].

Host: 192.168.1.28

$ nodetool status
Datacenter: datacenter1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load      Tokens  Owns  Host ID                               Rack
UN  192.168.1.29  22.96 GB  256     ?     506910ae-a07b-4f74-8feb-a3f2b141dea5  rack1
UN  192.168.1.28  19.69 GB  256     ?     ed79b6ee-cae0-48f9-a420-338058e1f2c5  rack1
UN  192.168.1.13  30.43 GB  256     ?     595ea5ef-cecf-44c7-aa7f-424648791751  rack1
UN  192.168.1.26  20.72 GB  256     ?     3c880801-8499-4b16-bce4-2bfbc79bed43  rack1

$ nodetool removenode status
RemovalStatus: No token removals in process.

Host: 192.168.1.29

$ nodetool status
Datacenter: datacenter1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load      Tokens  Owns  Host ID                               Rack
UN  192.168.1.29  22.96 GB  256     ?     506910ae-a07b-4f74-8feb-a3f2b141dea5  rack1
UN  192.168.1.28  19.69 GB  256     ?     ed79b6ee-cae0-48f9-a420-338058e1f2c5  rack1
UN  192.168.1.13  30.43 GB  256     ?     595ea5ef-cecf-44c7-aa7f-424648791751  rack1
UN  192.168.1.26  20.72 GB  256     ?     3c880801-8499-4b16-bce4-2bfbc79bed43  rack1

$ nodetool removenode status
RemovalStatus: No token removals in process.

Host: 192.168.1.26

$ nodetool status
Datacenter: datacenter1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load      Tokens  Owns  Host ID                               Rack
UN  192.168.1.29  22.96 GB  256     ?     506910ae-a07b-4f74-8feb-a3f2b141dea5  rack1
UN  192.168.1.28  19.69 GB  256     ?     ed79b6ee-cae0-48f9-a420-338058e1f2c5  rack1
UN  192.168.1.13  30.43 GB  256     ?     595ea5ef-cecf-44c7-aa7f-424648791751  rack1
UN  192.168.1.26  20.72 GB  256     ?     3c880801-8499-4b16-bce4-2bfbc79bed43  rack1

$ nodetool removenode status
RemovalStatus: No token removals in process.

nodetool removenode hangs

Key: CASSANDRA-6542
URL: https://issues.apache.org/jira/browse/CASSANDRA-6542
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: Ubuntu 12, 1.2.11 DSE
Reporter: Eric Lubow
Assignee: Yuki Morishita

Running *nodetool removenode $host-id* doesn't actually remove the node from the ring. I've let it run anywhere from 5 minutes to 3 days and there are no messages in the log about it hanging or failing; the command just sits there running. So the regular response has been to run *nodetool removenode