[jira] [Commented] (CASSANDRA-9282) Warn on unlogged batches
[ https://issues.apache.org/jira/browse/CASSANDRA-9282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526346#comment-14526346 ]

Sylvain Lebresne commented on CASSANDRA-9282:
---------------------------------------------

bq. the drivers should warn on these

On top of Jonathan's comment, it's worth noting that drivers can't do it in all cases. That is, if the batch is provided as a query string, drivers don't parse it and thus don't know what it is. Which is not to say that drivers can't warn when possible, just that there is sense in doing it server side if we want to make sure people do get the warning.

Warn on unlogged batches
------------------------

        Key: CASSANDRA-9282
        URL: https://issues.apache.org/jira/browse/CASSANDRA-9282
    Project: Cassandra
 Issue Type: Bug
 Components: API
   Reporter: Jonathan Ellis
   Assignee: T Jake Luciani
    Fix For: 2.1.x

At least until CASSANDRA-8303 is done and we can block them entirely, we should log a warning when unlogged batches across multiple partition keys are used. This could either be done by backporting NoSpamLogger and blindly logging every time, or we could add a threshold and warn when more than 10 keys are seen.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
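The threshold option described in the ticket can be sketched roughly as follows. This is an illustrative standalone class, not Cassandra's actual implementation; the class and method names are invented, and only the "warn past 10 distinct partition keys" rule comes from the ticket.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class UnloggedBatchWarner {
    // Illustrative threshold from the ticket: warn past 10 distinct keys.
    static final int WARN_THRESHOLD = 10;

    // Returns true when an unlogged batch spans enough distinct partition
    // keys that a server-side warning should be logged.
    static boolean shouldWarn(List<String> partitionKeys) {
        Set<String> distinct = new HashSet<>(partitionKeys);
        return distinct.size() > WARN_THRESHOLD;
    }

    public static void main(String[] args) {
        List<String> small = List.of("k1", "k2", "k1"); // 2 distinct keys
        List<String> large = new ArrayList<>();
        for (int i = 0; i < 11; i++) large.add("k" + i); // 11 distinct keys
        System.out.println(shouldWarn(small)); // false
        System.out.println(shouldWarn(large)); // true
    }
}
```

Counting distinct keys (rather than statements) matters because an unlogged batch to a single partition is applied atomically anyway; it is only the multi-partition case the ticket wants to flag.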
[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers
[ https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526389#comment-14526389 ]

Stefania commented on CASSANDRA-7066:
-------------------------------------

Thanks for your feedback, [~benedict], [~JoshuaMcKenzie], [~krummas]. I've removed temporary files and descriptor types entirely, with a small exception for rewriting metadata, which still uses a temporary file. I also plan on implementing the standalone tool suggested by Marcus.

[~benedict]: assuming you want to be the reviewer, you can start with a quick first round if you have some spare time (https://github.com/stef1927/cassandra/commits/7066-8984-alt). I am still testing (some dtests are broken) and reviewing it myself, but you may want to take a look at the transaction log class, which is called *OperationLog*. I am not entirely sure if the integration of this class with the SSTableRewriter and SSTableWriter is what you had in mind or if it should be more tightly integrated.

Simplify (and unify) cleanup of compaction leftovers
----------------------------------------------------

        Key: CASSANDRA-7066
        URL: https://issues.apache.org/jira/browse/CASSANDRA-7066
    Project: Cassandra
 Issue Type: Improvement
 Components: Core
   Reporter: Benedict
   Assignee: Stefania
   Priority: Minor
     Labels: compaction
    Fix For: 3.x

Currently we manage a list of in-progress compactions in a system table, which we use to clean up incomplete compactions when we're done. The problem with this is that 1) it's a bit clunky (and leaves us in positions where we can unnecessarily clean up completed files, or conversely not clean up files that have been superseded); and 2) it's only used for regular compaction - no other compaction types are guarded in the same way, so it can result in duplication if we fail before deleting the replacements. I'd like to see each sstable store its direct ancestors in its metadata, and on startup we simply delete any sstables that occur in the union of all ancestor sets. This way, as soon as we finish writing, we're capable of cleaning up any leftovers, so we never get duplication. It's also much easier to reason about.
[jira] [Commented] (CASSANDRA-8051) Add SERIAL and LOCAL_SERIAL consistency levels to cqlsh
[ https://issues.apache.org/jira/browse/CASSANDRA-8051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526391#comment-14526391 ]

Stefania commented on CASSANDRA-8051:
-------------------------------------

[~thobbs] review is complete; 8051-2.1-v3 and 8051-2.0 can be committed if you are OK with them.

Add SERIAL and LOCAL_SERIAL consistency levels to cqlsh
-------------------------------------------------------

        Key: CASSANDRA-8051
        URL: https://issues.apache.org/jira/browse/CASSANDRA-8051
    Project: Cassandra
 Issue Type: Bug
 Components: Tools
   Reporter: Nicolas Favre-Felix
   Assignee: Carl Yeksigian
   Priority: Minor
     Labels: cqlsh
    Fix For: 2.0.x
Attachments: 8051-2.0.txt, 8051-2.1-v2.txt, 8051-2.1-v3.txt, 8051-2.1.txt, log-statements-2.0.txt, log-statements-2.1.txt, python-driver-fix.txt

cqlsh does not support setting the serial consistency level. The default CL.SERIAL does not let users safely execute LWT alongside an app that runs at LOCAL_SERIAL, and can prevent any LWT from running when a DC is down (e.g. with 2 DCs, RF=3 in each).

Implementing this well is a bit tricky. A user setting the serial CL will probably not want all of their statements to have a serial CL attached, but only the conditional updates. At the same time it would be useful to support serial reads. WITH CONSISTENCY LEVEL used to provide this flexibility.

I believe that it is currently impossible to run a SELECT at SERIAL or LOCAL_SERIAL; the only workaround seems to be to run a conditional update with a predicate that always resolves to False, and to rely on the CAS response to read the data.
[jira] [Commented] (CASSANDRA-8897) Remove FileCacheService, instead pooling the buffers
[ https://issues.apache.org/jira/browse/CASSANDRA-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526406#comment-14526406 ]

Stefania commented on CASSANDRA-8897:
-------------------------------------

Thanks, I'll be sure to pick up both commits when I resume with this.

Remove FileCacheService, instead pooling the buffers
----------------------------------------------------

        Key: CASSANDRA-8897
        URL: https://issues.apache.org/jira/browse/CASSANDRA-8897
    Project: Cassandra
 Issue Type: Improvement
 Components: Core
   Reporter: Benedict
   Assignee: Stefania
    Fix For: 3.x

After CASSANDRA-8893, a RAR will be a very lightweight object and will not need caching, so we can eliminate this cache entirely. Instead we should have a pool of buffers that are page-aligned.
[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers
[ https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526477#comment-14526477 ]

Benedict commented on CASSANDRA-7066:
-------------------------------------

Sure, I'll review

Simplify (and unify) cleanup of compaction leftovers
----------------------------------------------------

        Key: CASSANDRA-7066
        URL: https://issues.apache.org/jira/browse/CASSANDRA-7066
[jira] [Updated] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers
[ https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benedict updated CASSANDRA-7066:
--------------------------------
    Reviewer: Benedict

Simplify (and unify) cleanup of compaction leftovers
----------------------------------------------------

        Key: CASSANDRA-7066
        URL: https://issues.apache.org/jira/browse/CASSANDRA-7066
[jira] [Commented] (CASSANDRA-8099) Refactor and modernize the storage engine
[ https://issues.apache.org/jira/browse/CASSANDRA-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526491#comment-14526491 ]

Benedict commented on CASSANDRA-8099:
-------------------------------------

FTR, I plan to leave review of the Partition hierarchy (in particular the memtable stuff) until after that refactor, but am attempting to tackle the rest of the code in the meantime. As I do this, I will arrange pull requests (as I have done just now). I will only post updates here if the pull request is in any way substantial, so that they can be discussed on the permanent record. I think this will be more efficient than my posting suggestions/criticisms, since you only have so much time on your hands, and this permits a degree of parallelization that may help us reach a better end state in the time available.

Refactor and modernize the storage engine
-----------------------------------------

        Key: CASSANDRA-8099
        URL: https://issues.apache.org/jira/browse/CASSANDRA-8099
    Project: Cassandra
 Issue Type: Improvement
   Reporter: Sylvain Lebresne
   Assignee: Sylvain Lebresne
    Fix For: 3.0
Attachments: 8099-nit

The current storage engine (which for this ticket I'll loosely define as the code implementing the read/write path) is suffering from old age. One of the main problems is that the only structure it deals with is the cell, which completely ignores the higher-level CQL structure that groups cells into (CQL) rows. This leads to many inefficiencies, like the fact that during a read we have to group cells multiple times (to count on the replica, then to count on the coordinator, then to produce the CQL resultset) because we forget about the grouping right away each time (so lots of useless cell name comparisons in particular). But beyond inefficiencies, having to manually recreate the CQL structure every time we need it for something is hindering new features and makes the code more complex than it should be.

Said storage engine also has tons of technical debt. To pick an example, the fact that during range queries we update {{SliceQueryFilter.count}} is pretty hacky and error prone. Or the overly complex lengths {{AbstractQueryPager}} has to go to simply to remove the last query result.

So I want to bite the bullet and modernize this storage engine. I propose to do 2 main things:
# Make the storage engine more aware of the CQL structure. In practice, instead of having partitions be a simple iterable map of cells, they should be an iterable list of rows (each being itself composed of per-column cells, though obviously not exactly the same kind of cell we have today).
# Make the engine more iterative. What I mean here is that in the read path, we end up reading all cells into memory (we put them in a ColumnFamily object), but there is really no reason to. If instead we were working with iterators all the way through, we could get to a point where we're basically transferring data from disk to the network, and we should be able to reduce GC substantially.

Please note that such a refactor should provide some performance improvements right off the bat, but that's not its primary goal. Its primary goal is to simplify the storage engine and add abstractions that are better suited to further optimizations.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
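The second proposal above, keeping the read path iterative instead of materializing everything into a ColumnFamily, can be illustrated with a toy sketch. The names here are invented for illustration, not the actual engine types; a Java Stream stands in for the row iterators the ticket describes.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class IterativeReadSketch {
    // Lazily drop deleted rows on the way through: no intermediate
    // collection is built, mirroring "disk -> network" streaming.
    // A null element stands in for a deleted (shadowed) row.
    static Stream<String> liveRows(Stream<String> fromDisk) {
        return fromDisk.filter(row -> row != null);
    }

    public static void main(String[] args) {
        List<String> sentToClient = liveRows(Stream.of("r1", null, "r3"))
                .collect(Collectors.toList());
        System.out.println(sentToClient); // [r1, r3]
    }
}
```

The point of the sketch is that the filter runs per-row as the terminal `collect` pulls elements through; with a network sink instead of a list, nothing beyond the current row would need to stay in memory.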
[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers
[ https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526513#comment-14526513 ]

Benedict commented on CASSANDRA-7066:
-------------------------------------

As far as the level of integration goes, this looks fine for a base on 8984. CASSANDRA-8568 will most likely merge this work with its new db.lifecycle package, either storing the OperationLog inside its Transaction object or merging the functionality directly, since the behaviours are quite interwoven. Since 8568 has not been reviewed yet, I will merge the functionality after this patch gets committed, in case it has to be revised significantly, although perhaps you can chime in on the review of that ticket as well.

A couple of things I've noticed (not doing a formal review just yet):
* {{OperationLog}} should probably implement Transactional
* {{OperationLog.cleanup()}} (or {{OperationLog.LogFile.delete()}}) should force a sync of the directory's file descriptor, to make sure there is a happens-before relationship between the two log-file deletions; otherwise, if the process crashes, we may be left with the wrong one deleted and corrupt the sstable state
* It looks to me like the SSTableDeleteNotification approach is currently broken: at the least, a set of sstables we care about needs to be maintained and checked against. However, I would probably consider adding a Runnable to each replaced {{SSTableReader}} using {{runOnClose}}, containing a {{Ref}} to be released which, once all are released, removes the log file. This way it's tracked, debuggable, and safe against miscounting.
* SSTableReader.DescriptorTypeTidy and SSTableReader.GlobalTidy should be merged
* getTemporaryFiles should probably return a Set, for efficient testing in the list methods

Simplify (and unify) cleanup of compaction leftovers
----------------------------------------------------

        Key: CASSANDRA-7066
        URL: https://issues.apache.org/jira/browse/CASSANDRA-7066
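The directory-sync point in the list above can be sketched in plain Java. This is a hypothetical helper, not code from the patch: it deletes a file and then fsyncs the parent directory, so the directory-entry removal is durable before any subsequent deletion starts (opening a directory with a read-only FileChannel and calling force() works on Linux, which is the assumption here).

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class DurableDelete {
    // Hypothetical helper: delete a file, then fsync its parent directory
    // so the directory-entry removal reaches disk before we return. This
    // gives the happens-before ordering between successive log deletions.
    static void deleteWithDirSync(Path file) throws IOException {
        Path parent = file.getParent();
        Files.delete(file);
        try (FileChannel dir = FileChannel.open(parent, StandardOpenOption.READ)) {
            dir.force(true); // fsync(2) on the directory itself
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("oplog");
        Path oldLog = Files.createFile(dir.resolve("op.log.old"));
        Path newLog = Files.createFile(dir.resolve("op.log"));
        deleteWithDirSync(oldLog); // durable before newLog is ever touched
        System.out.println(Files.exists(oldLog) + " " + Files.exists(newLog));
    }
}
```

Without the directory sync, a crash between the two deletions could leave the filesystem with either deletion persisted in either order, which is exactly the "wrong one deleted" corruption the comment warns about.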
[jira] [Commented] (CASSANDRA-8574) Gracefully degrade SELECT when there are lots of tombstones
[ https://issues.apache.org/jira/browse/CASSANDRA-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526524#comment-14526524 ]

Christian Spriegel commented on CASSANDRA-8574:
-----------------------------------------------

Another use-case: I think being able to select tombstones could be very useful to examine TOEs (TombstoneOverwhelmingExceptions). The user could simply do a query in cqlsh and see where the tombstones come from.

Crazy thought: perhaps there could be different tombstone modes:
- One that selects all tombstones: good for debugging.
- One that only returns the last tombstone: good for iterating.

Gracefully degrade SELECT when there are lots of tombstones
-----------------------------------------------------------

        Key: CASSANDRA-8574
        URL: https://issues.apache.org/jira/browse/CASSANDRA-8574
    Project: Cassandra
 Issue Type: Improvement
   Reporter: Jens Rantil
    Fix For: 3.x

*Background:* There's lots of tooling out there to do BigData analysis on Cassandra clusters. Examples are Spark and Hadoop, which are offered by DSE. The problem with both of these so far is that a single partition key with too many tombstones can make the query job fail hard. The described scenario happens despite the user setting a rather small FetchSize. I assume this is a common scenario if you have larger rows.

*Proposal:* Allow a CQL SELECT to gracefully degrade to returning a smaller batch of results if there are too many tombstones. The tombstones are ordered according to the clustering key and one should be able to page through them. Potentially: {{SELECT * FROM mytable LIMIT 1000 TOMBSTONES;}} would page through a maximum of 1000 tombstones, _or_ 1000 (CQL) rows. I understand that this would obviously degrade performance, but it would at least yield a result.

*Additional comment:* I haven't dug into the Cassandra code, but conceptually I guess this would be doable. Let me know what you think.
[jira] [Created] (CASSANDRA-9291) Too many tombstones in schema_columns from creating too many CFs
Luis Correia created CASSANDRA-9291:
---------------------------------------

             Summary: Too many tombstones in schema_columns from creating too many CFs
                 Key: CASSANDRA-9291
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9291
             Project: Cassandra
          Issue Type: Bug
          Components: Core
         Environment: Production Cluster with 2 DCs of 3 nodes each and 1 DC of 7 nodes, running on dedicated Xeon hexacore, 96GB RAM, RAID for data and SSD for commitlog, running Debian 7 (with Java 1.7.0_76-b13 64-Bit, 8GB and 16GB of heap tested).
Dev Cluster with 1 DC with 3 nodes and 1 DC with 1 node, running on virtualized env., Ubuntu 12.04.5 (with Java 1.7.0_72-b14 64-Bit, 1GB, 4GB heap)
            Reporter: Luis Correia
            Priority: Blocker
         Attachments: after_schema.txt, before_schema.txt, schemas500.cql

When creating lots of columnfamilies (about 200), system.schema_columns gets filled with tombstones and therefore prevents clients using the binary protocol from connecting. Clients already connected continue normal operation (reading and inserting).

Log messages are:

For the first tries (sorry for the lack of precision):

ERROR [main] 2015-04-22 00:01:38,527 SliceQueryFilter.java (line 200) Scanned over 10 tombstones in system.schema_columns; query aborted (see tombstone_failure_threshold)

For each client that tries to connect but fails with timeout:

WARN [ReadStage:35] 2015-04-27 15:40:10,600 SliceQueryFilter.java (line 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147283441 columns was requested, slices=[-]
WARN [ReadStage:40] 2015-04-27 15:40:10,609 SliceQueryFilter.java (line 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147283441 columns was requested, slices=[-]
WARN [ReadStage:61] 2015-04-27 15:40:10,670 SliceQueryFilter.java (line 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147283441 columns was requested, slices=[-]
WARN [ReadStage:51] 2015-04-27 15:40:10,670 SliceQueryFilter.java (line 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147283441 columns was requested, slices=[-]
WARN [ReadStage:55] 2015-04-27 15:40:10,675 SliceQueryFilter.java (line 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147283441 columns was requested, slices=[-]
WARN [ReadStage:35] 2015-04-27 15:40:10,707 SliceQueryFilter.java (line 231) Read 1146 live and 3534 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147282894 columns was requested, slices=[-]
WARN [ReadStage:40] 2015-04-27 15:40:10,708 SliceQueryFilter.java (line 231) Read 1146 live and 3534 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147282894 columns was requested, slices=[-]
WARN [ReadStage:43] 2015-04-27 15:40:10,715 SliceQueryFilter.java (line 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147283441 columns was requested, slices=[-]
WARN [ReadStage:51] 2015-04-27 15:40:10,736 SliceQueryFilter.java (line 231) Read 1146 live and 3534 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147282894 columns was requested, slices=[-]
WARN [ReadStage:61] 2015-04-27 15:40:10,736 SliceQueryFilter.java (line 231) Read 1146 live and 3534 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147282894 columns was requested, slices=[-]
WARN [ReadStage:35] 2015-04-27 15:40:10,750 SliceQueryFilter.java (line 231) Read 864 live and 2664 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147281748 columns was requested, slices=[-]
WARN [ReadStage:40] 2015-04-27 15:40:10,751 SliceQueryFilter.java (line 231) Read 864 live and 2664 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147281748 columns was requested, slices=[-]
WARN [ReadStage:55] 2015-04-27 15:40:10,759 SliceQueryFilter.java (line 231) Read 1146 live and 3534 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147282894 columns was requested, slices=[-]
WARN [ReadStage:51] 2015-04-27 15:40:10,821 SliceQueryFilter.java (line 231) Read 864 live and 2664 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147281748 columns was requested, slices=[-]
WARN [ReadStage:61] 2015-04-27 15:40:10,822 SliceQueryFilter.java (line 231) Read 864 live and 2664 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147281748 columns was requested, slices=[-]
WARN [ReadStage:43] 2015-04-27 15:40:10,827 SliceQueryFilter.java (line 231) Read 1146 live and 3534 tombstoned cells in system.schema_columns (see tombstone_warn_threshold).
[jira] [Updated] (CASSANDRA-9291) Too many tombstones in schema_columns from creating too many CFs
[ https://issues.apache.org/jira/browse/CASSANDRA-9291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luis Correia updated CASSANDRA-9291:
------------------------------------
    Description: (reformatted the log excerpt with bq. quoting; the reported messages are unchanged from the original report above)
[jira] [Updated] (CASSANDRA-9291) Too many tombstones in schema_columns from creating too many CFs
[ https://issues.apache.org/jira/browse/CASSANDRA-9291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luis Correia updated CASSANDRA-9291:
------------------------------------
    Description: (further bq. formatting of the log excerpt; the reported messages are unchanged from the original report above)
[jira] [Resolved] (CASSANDRA-9127) Out of memory failure: ~2Gb retained
[ https://issues.apache.org/jira/browse/CASSANDRA-9127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict resolved CASSANDRA-9127. - Resolution: Not A Problem Out of memory failure: ~2Gb retained Key: CASSANDRA-9127 URL: https://issues.apache.org/jira/browse/CASSANDRA-9127 Project: Cassandra Issue Type: Bug Reporter: Maxim Podkolzine Assignee: Benedict Fix For: 2.1.x Attachments: snapshot.png See the snapshot analysis. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8755) Replace trivial uses of String.replace/replaceAll/split with StringUtils methods
[ https://issues.apache.org/jira/browse/CASSANDRA-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526566#comment-14526566 ] Oded Peer commented on CASSANDRA-8755: -- Java's String.split() method has a fast-path for single character input, avoiding costly regexp creation. See http://stackoverflow.com/a/11002374/248656. Replace trivial uses of String.replace/replaceAll/split with StringUtils methods Key: CASSANDRA-8755 URL: https://issues.apache.org/jira/browse/CASSANDRA-8755 Project: Cassandra Issue Type: Improvement Reporter: Jaroslav Kamenik Priority: Trivial Labels: lhf Attachments: trunk-8755.patch There are places in the code where those regex based methods are used with plain, not regexp, strings, so StringUtils alternatives should be faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
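For context on the fast-path comment above: a StringUtils-style split is just a charAt/substring loop with no regex machinery involved. A minimal sketch of that idea (plainSplit is a hypothetical helper written for illustration, not the actual Commons Lang implementation):

```java
import java.util.ArrayList;
import java.util.List;

public class SplitDemo {
    // Hypothetical indexOf-style split on a single character, similar in
    // spirit to StringUtils.split: no Pattern compilation, just a scan.
    static List<String> plainSplit(String s, char sep) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == sep) {
                parts.add(s.substring(start, i));
                start = i + 1;
            }
        }
        parts.add(s.substring(start)); // trailing token
        return parts;
    }

    public static void main(String[] args) {
        // prints [ks, table, column]
        System.out.println(plainSplit("ks.table.column", '.'));
    }
}
```

String.split's single-character fast path (per the linked Stack Overflow answer) reduces to essentially this loop, which is why the two approaches end up close in cost for plain delimiters.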
[jira] [Commented] (CASSANDRA-7276) Include keyspace and table names in logs where possible
[ https://issues.apache.org/jira/browse/CASSANDRA-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526574#comment-14526574 ] Oded Peer commented on CASSANDRA-7276: -- I suggest adding context to the thread name; then it will be printed when exceptions occur, without needing to explicitly print it out. It's described in http://blog.takipi.com/supercharged-jstack-how-to-debug-your-servers-at-100mph/ Include keyspace and table names in logs where possible --- Key: CASSANDRA-7276 URL: https://issues.apache.org/jira/browse/CASSANDRA-7276 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Tyler Hobbs Assignee: Nitzan Volman Priority: Minor Labels: bootcamp, lhf Fix For: 2.1.x Attachments: 2.1-CASSANDRA-7276-v1.txt, cassandra-2.1-7276-compaction.txt, cassandra-2.1-7276.txt Most error messages and stacktraces give you no clue as to what keyspace or table was causing the problem. For example: {noformat} ERROR [MutationStage:61648] 2014-05-20 12:05:45,145 CassandraDaemon.java (line 198) Exception in thread Thread[MutationStage:61648,5,main] java.lang.IllegalArgumentException at java.nio.Buffer.limit(Unknown Source) at org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:63) at org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:72) at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:98) at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:35) at edu.stanford.ppl.concurrent.SnapTreeMap$1.compareTo(SnapTreeMap.java:538) at edu.stanford.ppl.concurrent.SnapTreeMap.attemptUpdate(SnapTreeMap.java:1108) at edu.stanford.ppl.concurrent.SnapTreeMap.updateUnderRoot(SnapTreeMap.java:1059) at edu.stanford.ppl.concurrent.SnapTreeMap.update(SnapTreeMap.java:1023) at edu.stanford.ppl.concurrent.SnapTreeMap.putIfAbsent(SnapTreeMap.java:985) 
at org.apache.cassandra.db.AtomicSortedColumns$Holder.addColumn(AtomicSortedColumns.java:328) at org.apache.cassandra.db.AtomicSortedColumns.addAllWithSizeDelta(AtomicSortedColumns.java:200) at org.apache.cassandra.db.Memtable.resolve(Memtable.java:226) at org.apache.cassandra.db.Memtable.put(Memtable.java:173) at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:893) at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:368) at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:333) at org.apache.cassandra.db.RowMutation.apply(RowMutation.java:206) at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:56) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:60) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) {noformat} We should try to include info on the keyspace and column family in the error messages or logs whenever possible. This includes reads, writes, compactions, flushes, repairs, and probably more. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
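Oded's thread-name suggestion above can be sketched roughly as follows (runWithContext and the MutationStage name here are illustrative stand-ins, not existing Cassandra code): append keyspace/table context to the worker thread's name before doing the work, and restore it in finally so pooled threads are not left with stale names.

```java
public class ThreadContext {
    static String runWithContext(String keyspace, String table, Runnable task) {
        Thread t = Thread.currentThread();
        String original = t.getName();
        t.setName(original + " [" + keyspace + "." + table + "]");
        try {
            task.run();
            return t.getName(); // the name any jstack dump or exception log would show
        } finally {
            t.setName(original); // always restore for the thread pool
        }
    }

    public static void main(String[] args) {
        Thread.currentThread().setName("MutationStage:1");
        String during = runWithContext("system", "schema_columns", () -> {});
        // prints MutationStage:1 [system.schema_columns]
        System.out.println(during);
        // prints MutationStage:1 (restored)
        System.out.println(Thread.currentThread().getName());
    }
}
```

With this in place, the stack trace in the description would at least carry the keyspace/table in the Thread[...] part of the "Exception in thread" line, without touching every log call site.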
[jira] [Commented] (CASSANDRA-9282) Warn on unlogged batches
[ https://issues.apache.org/jira/browse/CASSANDRA-9282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526576#comment-14526576 ] Aleksey Yeschenko commented on CASSANDRA-9282: -- I think what Jake meant was pushing warnings via CASSANDRA-8930 - but we don't have that (yet?). Warn on unlogged batches Key: CASSANDRA-9282 URL: https://issues.apache.org/jira/browse/CASSANDRA-9282 Project: Cassandra Issue Type: Bug Components: API Reporter: Jonathan Ellis Assignee: T Jake Luciani Fix For: 2.1.x At least until CASSANDRA-8303 is done and we can block them entirely, we should log a warning when unlogged batches across multiple partition keys are used. This could either be done by backporting NoSpamLogger and blindly logging every time, or we could add a threshold and warn when more than 10 keys are seen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
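The threshold option from the ticket description could look roughly like this on the server side (a minimal sketch: the constant name and the use of plain partition-key strings are stand-ins, not Cassandra's actual BatchStatement internals):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class UnloggedBatchCheck {
    // per the ticket description: warn when more than 10 keys are seen
    static final int UNLOGGED_BATCH_WARN_THRESHOLD = 10;

    // Count DISTINCT partition keys in the batch; an unlogged batch that
    // repeatedly hits the same partition is fine and should not warn.
    static boolean shouldWarn(List<String> partitionKeys) {
        Set<String> distinct = new HashSet<>(partitionKeys);
        return distinct.size() > UNLOGGED_BATCH_WARN_THRESHOLD;
    }

    public static void main(String[] args) {
        // prints false: two partitions is under the threshold
        System.out.println(shouldWarn(List.of("k1", "k2")));
    }
}
```

As Sylvain notes in the earlier comment, doing this server side catches the string-batch case that drivers cannot parse.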
[jira] [Resolved] (CASSANDRA-9291) Too many tombstones in schema_columns from creating too many CFs
[ https://issues.apache.org/jira/browse/CASSANDRA-9291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko resolved CASSANDRA-9291. -- Resolution: Duplicate Too many tombstones in schema_columns from creating too many CFs Key: CASSANDRA-9291 URL: https://issues.apache.org/jira/browse/CASSANDRA-9291 Project: Cassandra Issue Type: Bug Components: Core Environment: Production Cluster with 2 DCs of 3 nodes each and 1 DC of 7 nodes, running on dedicated Xeon hexacore, 96GB ram, RAID for Data and SSF for commitlog, running Debian 7 (with Java 1.7.0_76-b13 64-Bit, 8GB and 16GB of heap tested). Dev Cluster with 1 DC with 3 nodes and 1 DC with 1 node, running on virtualized env., Ubuntu 12.04.5 (with Java 1.7.0_72-b14 64-Bit 1GB, 4GB heap) Reporter: Luis Correia Priority: Blocker Attachments: after_schema.txt, before_schema.txt, schemas500.cql When creating lots of columnfamilies (about 200) the system.schema_columns gets filled with tombstones and therefore prevents clients using the binary protocol of connecting. Clients already connected continue normal operation (reading and inserting). Log messages are: For the first tries (sorry for the lack of precision): bq. ERROR [main] 2015-04-22 00:01:38,527 SliceQueryFilter.java (line 200) Scanned over 10 tombstones in system.schema_columns; query aborted (see tombstone_failure_threshold) For each client that tries to connect but fails with timeout: bq. WARN [ReadStage:35] 2015-04-27 15:40:10,600 SliceQueryFilter.java (line 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147283441 columns was requested, slices=[-] bq. WARN [ReadStage:40] 2015-04-27 15:40:10,609 SliceQueryFilter.java (line 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147283441 columns was requested, slices=[-] bq. 
WARN [ReadStage:61] 2015-04-27 15:40:10,670 SliceQueryFilter.java (line 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147283441 columns was requested, slices=[-] bq. WARN [ReadStage:51] 2015-04-27 15:40:10,670 SliceQueryFilter.java (line 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147283441 columns was requested, slices=[-] bq. WARN [ReadStage:55] 2015-04-27 15:40:10,675 SliceQueryFilter.java (line 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147283441 columns was requested, slices=[-] bq. WARN [ReadStage:35] 2015-04-27 15:40:10,707 SliceQueryFilter.java (line 231) Read 1146 live and 3534 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147282894 columns was requested, slices=[-] bq. WARN [ReadStage:40] 2015-04-27 15:40:10,708 SliceQueryFilter.java (line 231) Read 1146 live and 3534 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147282894 columns was requested, slices=[-] bq. WARN [ReadStage:43] 2015-04-27 15:40:10,715 SliceQueryFilter.java (line 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147283441 columns was requested, slices=[-] bq. WARN [ReadStage:51] 2015-04-27 15:40:10,736 SliceQueryFilter.java (line 231) Read 1146 live and 3534 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147282894 columns was requested, slices=[-] bq. WARN [ReadStage:61] 2015-04-27 15:40:10,736 SliceQueryFilter.java (line 231) Read 1146 live and 3534 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147282894 columns was requested, slices=[-] bq. WARN [ReadStage:35] 2015-04-27 15:40:10,750 SliceQueryFilter.java (line 231) Read 864 live and 2664 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 
2147281748 columns was requested, slices=[-] bq. WARN [ReadStage:40] 2015-04-27 15:40:10,751 SliceQueryFilter.java (line 231) Read 864 live and 2664 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147281748 columns was requested, slices=[-] bq. WARN [ReadStage:55] 2015-04-27 15:40:10,759 SliceQueryFilter.java (line 231) Read 1146 live and 3534 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147282894 columns was requested, slices=[-] bq. WARN [ReadStage:51] 2015-04-27 15:40:10,821 SliceQueryFilter.java (line 231) Read 864 live and 2664 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147281748 columns was requested, slices=[-] bq. WARN [ReadStage:61] 2015-04-27 15:40:10,822 SliceQueryFilter.java (line 231) Read 864
[jira] [Commented] (CASSANDRA-9291) Too many tombstones in schema_columns from creating too many CFs
[ https://issues.apache.org/jira/browse/CASSANDRA-9291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526593#comment-14526593 ] Aleksey Yeschenko commented on CASSANDRA-9291: -- Thanks for the report. It's a known issue - a duplicate of CASSANDRA-8189 - and will be resolved soon in 3.0. Too many tombstones in schema_columns from creating too many CFs Key: CASSANDRA-9291 URL: https://issues.apache.org/jira/browse/CASSANDRA-9291 Project: Cassandra Issue Type: Bug Components: Core Environment: Production Cluster with 2 DCs of 3 nodes each and 1 DC of 7 nodes, running on dedicated Xeon hexacore, 96GB ram, RAID for Data and SSF for commitlog, running Debian 7 (with Java 1.7.0_76-b13 64-Bit, 8GB and 16GB of heap tested). Dev Cluster with 1 DC with 3 nodes and 1 DC with 1 node, running on virtualized env., Ubuntu 12.04.5 (with Java 1.7.0_72-b14 64-Bit 1GB, 4GB heap) Reporter: Luis Correia Priority: Blocker Attachments: after_schema.txt, before_schema.txt, schemas500.cql When creating lots of columnfamilies (about 200) the system.schema_columns gets filled with tombstones and therefore prevents clients using the binary protocol of connecting. Clients already connected continue normal operation (reading and inserting). Log messages are: For the first tries (sorry for the lack of precision): bq. ERROR [main] 2015-04-22 00:01:38,527 SliceQueryFilter.java (line 200) Scanned over 10 tombstones in system.schema_columns; query aborted (see tombstone_failure_threshold) For each client that tries to connect but fails with timeout: bq. WARN [ReadStage:35] 2015-04-27 15:40:10,600 SliceQueryFilter.java (line 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 2147283441 columns was requested, slices=[-] bq. WARN [ReadStage:40] 2015-04-27 15:40:10,609 SliceQueryFilter.java (line 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see tombstone_warn_threshold). 
[jira] [Commented] (CASSANDRA-9282) Warn on unlogged batches
[ https://issues.apache.org/jira/browse/CASSANDRA-9282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526603#comment-14526603 ] T Jake Luciani commented on CASSANDRA-9282: --- [~iamaleksey] correct. Warn on unlogged batches Key: CASSANDRA-9282 URL: https://issues.apache.org/jira/browse/CASSANDRA-9282 Project: Cassandra Issue Type: Bug Components: API Reporter: Jonathan Ellis Assignee: T Jake Luciani Fix For: 2.1.x At least until CASSANDRA-8303 is done and we can block them entirely, we should log a warning when unlogged batches across multiple partition keys are used. This could either be done by backporting NoSpamLogger and blindly logging every time, or we could add a threshold and warn when more than 10 keys are seen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9240) Performance issue after a restart
[ https://issues.apache.org/jira/browse/CASSANDRA-9240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526647#comment-14526647 ] T Jake Luciani commented on CASSANDRA-9240: --- [~benedict] you have a bug in the CRAR constructor. allocateBuffer() is being called before segments are mapped so it is always creating an on heap buffer. Minor nit in your Cleanup method. For multiple levels of conditionals and loops please add {} Finally, I know CASSANDRA-8897 is going to remove pooling, but how will we avoid the cost of un-mapping the files if we don't pool? Performance issue after a restart - Key: CASSANDRA-9240 URL: https://issues.apache.org/jira/browse/CASSANDRA-9240 Project: Cassandra Issue Type: Bug Reporter: Alan Boudreault Assignee: Benedict Priority: Minor Fix For: 3.x Attachments: Cassandra.snapshots.zip, cassandra_2.1.4-clientrequest-read.log, cassandra_2.1.4.log, cassandra_2.1.5-clientrequest-read.log, cassandra_2.1.5.log, cassandra_trunk-clientrequest-read.log, cassandra_trunk.log, cassandra_trunk_no_restart-clientrequest-read.log, cassandra_trunk_no_restart.log, issue.yaml, run_issue.sh, runs.log, trace_query.cql I have noticed a performance issue while I was working on compaction perf tests for CASSANDRA-7409. The performance for my use case is very bad after a restart. It is mostly a read performance issue but not strictly. I have attached my use case (see run_issue.sh and issue.yaml) and all test logs for 2.1.4, 2.1.5 and trunk: * 2.1.* are OK (although 2.1.4 seems to be better than 2.1.5?): ~6-7k ops/second and ~2-2.5k of read latency. * trunk is NOT OK: ~1.5-2k ops/second and 25-30k of read latency. * trunk is OK without a restart: ~ same perf than 2.1.4 and 2.1.5. EDIT: branch cassandra-2.1 is OK. I can help to bisect and/or profile on Monday if needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
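The constructor-ordering bug Jake describes reduces to a small sketch (class and member names here are illustrative, not the actual CompressedRandomAccessReader code): if the buffer is allocated before the mmap segments are set up, the "is this file mmapped?" check always sees an empty segment list and falls back to an on-heap buffer.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class OrderingBug {
    final List<Object> segments = new ArrayList<>();
    final ByteBuffer buffer;

    OrderingBug(boolean fixed) {
        if (fixed)
            mapSegments();           // fixed path: map before allocating
        buffer = allocateBuffer();   // buggy path: segments still empty here
        if (!fixed)
            mapSegments();           // too late to influence the buffer choice
    }

    void mapSegments() { segments.add(new Object()); }

    ByteBuffer allocateBuffer() {
        // direct buffer only when we know the file is mmapped
        return segments.isEmpty() ? ByteBuffer.allocate(64)
                                  : ByteBuffer.allocateDirect(64);
    }

    public static void main(String[] args) {
        System.out.println(new OrderingBug(false).buffer.isDirect()); // prints false
        System.out.println(new OrderingBug(true).buffer.isDirect());  // prints true
    }
}
```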
[jira] [Commented] (CASSANDRA-6477) Global indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526653#comment-14526653 ] Carl Yeksigian commented on CASSANDRA-6477: --- # It is going to be the same mechanism, but we don't want to use the same consistency as what the insert is. This way, we can ensure that at least one node has seen all of the updates, and thus we can generate the correct tombstone based on the previous values # Each replica makes a GI update independently, based on the data that it has, which means that we might issue updates for an older update that hasn't made it to all of the replicas yet. To cut down on the amount of work that the indexes do, a pretty easy optimization is to just send the index mutation to the index replica that the data node will wait on instead of sending them to all of the index replicas # If we ever get into a situation where we have data loss in either the base table or the index table (both would likely go together), we would really need to run a rebuild, since there is no guarantee that extra data wouldn't be present in the index which isn't in the data table. Otherwise, we can repair the data and index tables independently, so that a repair issued on the data table should also repair all of the global index tables Global indexes -- Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.x Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-5460) Consistency issue with CL.TWO
[ https://issues.apache.org/jira/browse/CASSANDRA-5460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] T Jake Luciani resolved CASSANDRA-5460. --- Resolution: Cannot Reproduce Consistency issue with CL.TWO - Key: CASSANDRA-5460 URL: https://issues.apache.org/jira/browse/CASSANDRA-5460 Project: Cassandra Issue Type: Test Affects Versions: 1.2.0 Reporter: T Jake Luciani Assignee: Philip Thompson Priority: Minor We have a keyspace with RF=4. We write at QUORUM and read at TWO. This should provide consistency since writes are on 3 replicas and we read from 2, but in practice we don't have consistent reads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
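The expectation in the description rests on overlap arithmetic: QUORUM at RF=4 writes to floor(4/2)+1 = 3 replicas, reading at TWO touches 2, and since 3 + 2 > 4 every read set must intersect the write set. A quick check of that arithmetic:

```java
public class QuorumMath {
    static int quorum(int rf) { return rf / 2 + 1; }

    // Strong consistency requires the read and write replica sets to
    // overlap in at least one node: W + R > RF.
    static boolean overlapGuaranteed(int rf, int writeReplicas, int readReplicas) {
        return writeReplicas + readReplicas > rf;
    }

    public static void main(String[] args) {
        int rf = 4;
        System.out.println(quorum(rf));                           // prints 3
        System.out.println(overlapGuaranteed(rf, quorum(rf), 2)); // prints true
    }
}
```

So on paper QUORUM writes plus TWO reads at RF=4 should be consistent, which is consistent with the ticket being closed as Cannot Reproduce rather than as a design flaw.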
[jira] [Commented] (CASSANDRA-9291) Too many tombstones in schema_columns from creating too many CFs
[ https://issues.apache.org/jira/browse/CASSANDRA-9291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526656#comment-14526656 ] Luis Correia commented on CASSANDRA-9291: - I cannot see this as a Minor issue, as it can bring down a healthy Cluster just by creating a few more new CFs. Please mind that in a new Cluster the problem can be mitigated by remodeling your data. In an existing Cluster (mind Production!) you can _prevent clients from connecting just by creating new CFs_. And the solution will be to wait for the gc_grace_period, hoping the right amount of tombstones will be removed by compaction (that's at least 7 days). I've tried to import the sstables to a new cluster and shift the time (skip to the day compaction should clear the tombstones), various schema import/export gymnastics, etc. Nothing worked. I had to delete system.schema_columns and re-create only the right amount of CFs (fixing the token range for each node) in order to get my clients connecting again. Too many tombstones in schema_columns from creating too many CFs Key: CASSANDRA-9291 URL: https://issues.apache.org/jira/browse/CASSANDRA-9291 Project: Cassandra Issue Type: Bug Components: Core Environment: Production Cluster with 2 DCs of 3 nodes each and 1 DC of 7 nodes, running on dedicated Xeon hexacore, 96GB ram, RAID for Data and SSF for commitlog, running Debian 7 (with Java 1.7.0_76-b13 64-Bit, 8GB and 16GB of heap tested). Dev Cluster with 1 DC with 3 nodes and 1 DC with 1 node, running on virtualized env., Ubuntu 12.04.5 (with Java 1.7.0_72-b14 64-Bit 1GB, 4GB heap) Reporter: Luis Correia Priority: Blocker Attachments: after_schema.txt, before_schema.txt, schemas500.cql When creating lots of columnfamilies (about 200) the system.schema_columns gets filled with tombstones and therefore prevents clients using the binary protocol of connecting. Clients already connected continue normal operation (reading and inserting). 
[jira] [Commented] (CASSANDRA-9240) Performance issue after a restart
[ https://issues.apache.org/jira/browse/CASSANDRA-9240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526659#comment-14526659 ] Benedict commented on CASSANDRA-9240: - bq. Benedict you have a bug in the CRAR constructor. allocateBuffer() is being called before segments are mapped so it is always creating an on heap buffer. Good catch. I was planning on removing the constraint to only use direct when using mmapped files in a follow up commit. Any opposition to doing that here (since it also fixes that)? bq. Minor nit in your Cleanup method. For multiple levels of conditionals and loops please add {} I think we should clarify our code style policy on this, since I do this kind of thing a lot, certainly at the very least for multi-nested loops or conditionals, and for conditionals inside of loops, but a loop nested inside a conditional is its own unique case. Would be good to have an official policy. Could you and the other PMCs (or whoever decides this stuff) come up with an official position on when we can and cannot omit braces? Performance issue after a restart - Key: CASSANDRA-9240 URL: https://issues.apache.org/jira/browse/CASSANDRA-9240 Project: Cassandra Issue Type: Bug Reporter: Alan Boudreault Assignee: Benedict Priority: Minor Fix For: 3.x Attachments: Cassandra.snapshots.zip, cassandra_2.1.4-clientrequest-read.log, cassandra_2.1.4.log, cassandra_2.1.5-clientrequest-read.log, cassandra_2.1.5.log, cassandra_trunk-clientrequest-read.log, cassandra_trunk.log, cassandra_trunk_no_restart-clientrequest-read.log, cassandra_trunk_no_restart.log, issue.yaml, run_issue.sh, runs.log, trace_query.cql I have noticed a performance issue while I was working on compaction perf tests for CASSANDRA-7409. The performance for my use case is very bad after a restart. It is mostly a read performance issue but not strictly. 
I have attached my use case (see run_issue.sh and issue.yaml) and all test logs for 2.1.4, 2.1.5 and trunk: * 2.1.* are OK (although 2.1.4 seems to be better than 2.1.5?): ~6-7k ops/second and ~2-2.5k of read latency. * trunk is NOT OK: ~1.5-2k ops/second and 25-30k of read latency. * trunk is OK without a restart: ~ same perf than 2.1.4 and 2.1.5. EDIT: branch cassandra-2.1 is OK. I can help to bisect and/or profile on Monday if needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9240) Performance issue after a restart
[ https://issues.apache.org/jira/browse/CASSANDRA-9240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526659#comment-14526659 ] Benedict edited comment on CASSANDRA-9240 at 5/4/15 2:15 PM: - bq. Benedict you have a bug in the CRAR constructor. allocateBuffer() is being called before segments are mapped so it is always creating an on heap buffer. Good catch. I was planning on removing the constraint to only use direct when using mmapped files in a follow up commit. Any opposition to doing that here (since it also fixes that)? bq. Minor nit in your Cleanup method. For multiple levels of conditionals and loops please add {} I think we should clarify our code style policy on this, since I do this kind of thing a lot, certainly at the very least for multi-nested loops or conditionals, and for conditionals inside of loops, but a loop nested inside a conditional is its own unique case. Would be good to have an official policy. Could you and the other PMCs (or whoever decides this stuff) come up with an official position on when we can and cannot omit braces? bq. Finally, I know CASSANDRA-8897 is going to remove pooling, but how will we avoid the cost of un-mapping the files if we don't pool? This patch has already avoided the cost of un-mapping the files, whether or not pooling is used was (Author: benedict): bq. Benedict you have a bug in the CRAR constructor. allocateBuffer() is being called before segments are mapped so it is always creating an on heap buffer. Good catch. I was planning on removing the constraint to only use direct when using mmapped files in a follow up commit. Any opposition to doing that here (since it also fixes that)? bq. Minor nit in your Cleanup method. 
For multiple levels of conditionals and loops please add {} I think we should clarify our code style policy on this, since I do this kind of thing a lot, certainly at the very least for multi-nested loops or conditionals, and for conditionals inside of loops, but a loop nested inside a conditional is its own unique case. Would be good to have an official policy. Could you and the other PMCs (or whoever decides this stuff) come up with an official position on when we can and cannot omit braces? Performance issue after a restart - Key: CASSANDRA-9240 URL: https://issues.apache.org/jira/browse/CASSANDRA-9240 Project: Cassandra Issue Type: Bug Reporter: Alan Boudreault Assignee: Benedict Priority: Minor Fix For: 3.x Attachments: Cassandra.snapshots.zip, cassandra_2.1.4-clientrequest-read.log, cassandra_2.1.4.log, cassandra_2.1.5-clientrequest-read.log, cassandra_2.1.5.log, cassandra_trunk-clientrequest-read.log, cassandra_trunk.log, cassandra_trunk_no_restart-clientrequest-read.log, cassandra_trunk_no_restart.log, issue.yaml, run_issue.sh, runs.log, trace_query.cql I have noticed a performance issue while I was working on compaction perf tests for CASSANDRA-7409. The performance for my use case is very bad after a restart. It is mostly a read performance issue but not strictly. I have attached my use case (see run_issue.sh and issue.yaml) and all test logs for 2.1.4, 2.1.5 and trunk: * 2.1.* are OK (although 2.1.4 seems to be better than 2.1.5?): ~6-7k ops/second and ~2-2.5k of read latency. * trunk is NOT OK: ~1.5-2k ops/second and 25-30k of read latency. * trunk is OK without a restart: ~ same perf than 2.1.4 and 2.1.5. EDIT: branch cassandra-2.1 is OK. I can help to bisect and/or profile on Monday if needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-6477) Global indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526653#comment-14526653 ] Carl Yeksigian edited comment on CASSANDRA-6477 at 5/4/15 2:16 PM: --- # It is going to be the same mechanism, but we don't want to use the same consistency as what the insert is. This way, we can ensure that at least one node has seen all of the updates, and thus we can generate the correct tombstone based on the previous values; we are trying to make the dependency between the data table and the index table redundant, so we need to make sure a quorum is involved in the write # Each replica makes a GI update independently, based on the data that it has, which means that we might issue updates for an older update that hasn't made it to all of the replicas yet. To cut down on the amount of work that the indexes do, a pretty easy optimization is to just send the index mutation to the index replica that the data node will wait on instead of sending them to all of the index replicas # If we ever get into a situation where we have data loss in either the base table or the index table (both would likely go together), we would really need to run a rebuild, since there is no guarantee that extra data wouldn't be present in the index which isn't in the data table. Otherwise, we can repair the data and index tables independently, so that a repair issued on the data table should also repair all of the global index tables was (Author: carlyeks): # It is going to be the same mechanism, but we don't want to use the same consistency as what the insert is. This way, we can ensure that at least one node has seen all of the updates, and thus we can generate the correct tombstone based on the previous values # Each replica makes a GI update independently, based on the data that it has, which means that we might issue updates for an older update that hasn't made it to all of the replicas yet. 
To cut down on the amount of work that the indexes do, a pretty easy optimization is to just send the index mutation to the index replica that the data node will wait on instead of sending them to all of the index replicas # If we ever get into a situation where we have data loss in either the base table or the index table (both would likely go together), we would really need to run a rebuild, since there is no guarantee that extra data wouldn't be present in the index which isn't in the data table. Otherwise, we can repair the data and index tables independently, so that a repair issued on the data table should also repair all of the global index tables Global indexes -- Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.x Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
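The replica-pairing optimization described above (send the index mutation only to the index replica the data node will wait on) can be sketched as follows. The pairing rule used here, matching positions in the two replica lists, is purely an assumption for illustration, not necessarily the design that was committed:

```java
import java.util.List;

// Illustrative sketch: rather than sending an index mutation to every index
// replica, each data replica pairs itself with a single index replica and
// sends only to it. The positional pairing rule below is hypothetical.
class IndexReplicaPairing
{
    static String pairedIndexReplica(List<String> dataReplicas, List<String> indexReplicas, String self)
    {
        int i = dataReplicas.indexOf(self);
        // Wrap around if the index table has fewer replicas than the data table.
        return indexReplicas.get(i % indexReplicas.size());
    }
}
```

Under this sketch, each index replica still eventually receives every update (via a different data replica or repair), but the per-write fan-out is cut to one message per data replica.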
[jira] [Commented] (CASSANDRA-9240) Performance issue after a restart
[ https://issues.apache.org/jira/browse/CASSANDRA-9240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526663#comment-14526663 ] T Jake Luciani commented on CASSANDRA-9240: --- .bq I was planning on removing the constraint to only use direct when using mmapped files in a follow up commit. The original code did this if (useMmap && useDirect...) But the compressors need to specify which kind of buffer they accept, example DeflateCompressors can't use Direct... Performance issue after a restart - Key: CASSANDRA-9240 URL: https://issues.apache.org/jira/browse/CASSANDRA-9240 Project: Cassandra Issue Type: Bug Reporter: Alan Boudreault Assignee: Benedict Priority: Minor Fix For: 3.x Attachments: Cassandra.snapshots.zip, cassandra_2.1.4-clientrequest-read.log, cassandra_2.1.4.log, cassandra_2.1.5-clientrequest-read.log, cassandra_2.1.5.log, cassandra_trunk-clientrequest-read.log, cassandra_trunk.log, cassandra_trunk_no_restart-clientrequest-read.log, cassandra_trunk_no_restart.log, issue.yaml, run_issue.sh, runs.log, trace_query.cql I have noticed a performance issue while I was working on compaction perf tests for CASSANDRA-7409. The performance for my use case is very bad after a restart. It is mostly a read performance issue but not strictly. I have attached my use case (see run_issue.sh and issue.yaml) and all test logs for 2.1.4, 2.1.5 and trunk: * 2.1.* are OK (although 2.1.4 seems to be better than 2.1.5?): ~6-7k ops/second and ~2-2.5k of read latency. * trunk is NOT OK: ~1.5-2k ops/second and 25-30k of read latency. * trunk is OK without a restart: ~ same perf than 2.1.4 and 2.1.5. EDIT: branch cassandra-2.1 is OK. I can help to bisect and/or profile on Monday if needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9240) Performance issue after a restart
[ https://issues.apache.org/jira/browse/CASSANDRA-9240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526663#comment-14526663 ] T Jake Luciani edited comment on CASSANDRA-9240 at 5/4/15 2:20 PM: --- bq. I was planning on removing the constraint to only use direct when using mmapped files in a follow up commit. The original code did this if (useMmap && useDirect...) But the compressors need to specify which kind of buffer they accept, example DeflateCompressors can't use Direct... was (Author: tjake): .bq I was planning on removing the constraint to only use direct when using mmapped files in a follow up commit. The original code did this if (useMmap && useDirect...) But the compressors need to specify which kind of buffer they accept, example DeflateCompressors can't use Direct... Performance issue after a restart - Key: CASSANDRA-9240 URL: https://issues.apache.org/jira/browse/CASSANDRA-9240 Project: Cassandra Issue Type: Bug Reporter: Alan Boudreault Assignee: Benedict Priority: Minor Fix For: 3.x Attachments: Cassandra.snapshots.zip, cassandra_2.1.4-clientrequest-read.log, cassandra_2.1.4.log, cassandra_2.1.5-clientrequest-read.log, cassandra_2.1.5.log, cassandra_trunk-clientrequest-read.log, cassandra_trunk.log, cassandra_trunk_no_restart-clientrequest-read.log, cassandra_trunk_no_restart.log, issue.yaml, run_issue.sh, runs.log, trace_query.cql I have noticed a performance issue while I was working on compaction perf tests for CASSANDRA-7409. The performance for my use case is very bad after a restart. It is mostly a read performance issue but not strictly. I have attached my use case (see run_issue.sh and issue.yaml) and all test logs for 2.1.4, 2.1.5 and trunk: * 2.1.* are OK (although 2.1.4 seems to be better than 2.1.5?): ~6-7k ops/second and ~2-2.5k of read latency. * trunk is NOT OK: ~1.5-2k ops/second and 25-30k of read latency. * trunk is OK without a restart: ~ same perf than 2.1.4 and 2.1.5. 
EDIT: branch cassandra-2.1 is OK. I can help to bisect and/or profile on Monday if needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9240) Performance issue after a restart
[ https://issues.apache.org/jira/browse/CASSANDRA-9240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526669#comment-14526669 ] Benedict commented on CASSANDRA-9240: - bq. But the compressors need to specify which kind of buffer they accept, example DeflateCompressors can't use Direct... That's the {{useDirect}} condition - the {{useMmap}} (which has been replaced by {{chunkSegments != null}}) isn't necessary for that I don't think? Performance issue after a restart - Key: CASSANDRA-9240 URL: https://issues.apache.org/jira/browse/CASSANDRA-9240 Project: Cassandra Issue Type: Bug Reporter: Alan Boudreault Assignee: Benedict Priority: Minor Fix For: 3.x Attachments: Cassandra.snapshots.zip, cassandra_2.1.4-clientrequest-read.log, cassandra_2.1.4.log, cassandra_2.1.5-clientrequest-read.log, cassandra_2.1.5.log, cassandra_trunk-clientrequest-read.log, cassandra_trunk.log, cassandra_trunk_no_restart-clientrequest-read.log, cassandra_trunk_no_restart.log, issue.yaml, run_issue.sh, runs.log, trace_query.cql I have noticed a performance issue while I was working on compaction perf tests for CASSANDRA-7409. The performance for my use case is very bad after a restart. It is mostly a read performance issue but not strictly. I have attached my use case (see run_issue.sh and issue.yaml) and all test logs for 2.1.4, 2.1.5 and trunk: * 2.1.* are OK (although 2.1.4 seems to be better than 2.1.5?): ~6-7k ops/second and ~2-2.5k of read latency. * trunk is NOT OK: ~1.5-2k ops/second and 25-30k of read latency. * trunk is OK without a restart: ~ same perf than 2.1.4 and 2.1.5. EDIT: branch cassandra-2.1 is OK. I can help to bisect and/or profile on Monday if needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-9240) Performance issue after a restart
[ https://issues.apache.org/jira/browse/CASSANDRA-9240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] T Jake Luciani reassigned CASSANDRA-9240: - Assignee: T Jake Luciani (was: Benedict) Performance issue after a restart - Key: CASSANDRA-9240 URL: https://issues.apache.org/jira/browse/CASSANDRA-9240 Project: Cassandra Issue Type: Bug Reporter: Alan Boudreault Assignee: T Jake Luciani Priority: Minor Fix For: 3.x Attachments: Cassandra.snapshots.zip, cassandra_2.1.4-clientrequest-read.log, cassandra_2.1.4.log, cassandra_2.1.5-clientrequest-read.log, cassandra_2.1.5.log, cassandra_trunk-clientrequest-read.log, cassandra_trunk.log, cassandra_trunk_no_restart-clientrequest-read.log, cassandra_trunk_no_restart.log, issue.yaml, run_issue.sh, runs.log, trace_query.cql I have noticed a performance issue while I was working on compaction perf tests for CASSANDRA-7409. The performance for my use case is very bad after a restart. It is mostly a read performance issue but not strictly. I have attached my use case (see run_issue.sh and issue.yaml) and all test logs for 2.1.4, 2.1.5 and trunk: * 2.1.* are OK (although 2.1.4 seems to be better than 2.1.5?): ~6-7k ops/second and ~2-2.5k of read latency. * trunk is NOT OK: ~1.5-2k ops/second and 25-30k of read latency. * trunk is OK without a restart: ~ same perf than 2.1.4 and 2.1.5. EDIT: branch cassandra-2.1 is OK. I can help to bisect and/or profile on Monday if needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9240) Performance issue after a restart
[ https://issues.apache.org/jira/browse/CASSANDRA-9240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] T Jake Luciani updated CASSANDRA-9240: -- Assignee: Benedict (was: T Jake Luciani) Performance issue after a restart - Key: CASSANDRA-9240 URL: https://issues.apache.org/jira/browse/CASSANDRA-9240 Project: Cassandra Issue Type: Bug Reporter: Alan Boudreault Assignee: Benedict Priority: Minor Fix For: 3.x Attachments: Cassandra.snapshots.zip, cassandra_2.1.4-clientrequest-read.log, cassandra_2.1.4.log, cassandra_2.1.5-clientrequest-read.log, cassandra_2.1.5.log, cassandra_trunk-clientrequest-read.log, cassandra_trunk.log, cassandra_trunk_no_restart-clientrequest-read.log, cassandra_trunk_no_restart.log, issue.yaml, run_issue.sh, runs.log, trace_query.cql I have noticed a performance issue while I was working on compaction perf tests for CASSANDRA-7409. The performance for my use case is very bad after a restart. It is mostly a read performance issue but not strictly. I have attached my use case (see run_issue.sh and issue.yaml) and all test logs for 2.1.4, 2.1.5 and trunk: * 2.1.* are OK (although 2.1.4 seems to be better than 2.1.5?): ~6-7k ops/second and ~2-2.5k of read latency. * trunk is NOT OK: ~1.5-2k ops/second and 25-30k of read latency. * trunk is OK without a restart: ~ same perf than 2.1.4 and 2.1.5. EDIT: branch cassandra-2.1 is OK. I can help to bisect and/or profile on Monday if needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9240) Performance issue after a restart
[ https://issues.apache.org/jira/browse/CASSANDRA-9240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526681#comment-14526681 ] T Jake Luciani commented on CASSANDRA-9240: --- It is still needed because only the mmap codepath uses the ByteBuffer ICompressor API; the standard mode uses byte[], which won't work with direct buffers. Performance issue after a restart - Key: CASSANDRA-9240 URL: https://issues.apache.org/jira/browse/CASSANDRA-9240 Project: Cassandra Issue Type: Bug Reporter: Alan Boudreault Assignee: Benedict Priority: Minor Fix For: 3.x Attachments: Cassandra.snapshots.zip, cassandra_2.1.4-clientrequest-read.log, cassandra_2.1.4.log, cassandra_2.1.5-clientrequest-read.log, cassandra_2.1.5.log, cassandra_trunk-clientrequest-read.log, cassandra_trunk.log, cassandra_trunk_no_restart-clientrequest-read.log, cassandra_trunk_no_restart.log, issue.yaml, run_issue.sh, runs.log, trace_query.cql I have noticed a performance issue while I was working on compaction perf tests for CASSANDRA-7409. The performance for my use case is very bad after a restart. It is mostly a read performance issue but not strictly. I have attached my use case (see run_issue.sh and issue.yaml) and all test logs for 2.1.4, 2.1.5 and trunk: * 2.1.* are OK (although 2.1.4 seems to be better than 2.1.5?): ~6-7k ops/second and ~2-2.5k of read latency. * trunk is NOT OK: ~1.5-2k ops/second and 25-30k of read latency. * trunk is OK without a restart: ~ same perf than 2.1.4 and 2.1.5. EDIT: branch cassandra-2.1 is OK. I can help to bisect and/or profile on Monday if needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
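The constraint discussed in this thread, each compressor declaring which kind of ByteBuffer it accepts, might look roughly like the following. The interface and method names here are illustrative assumptions, not Cassandra's actual ICompressor API:

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: a compressor advertises the buffer kind it supports,
// and the allocation site chooses heap vs. direct accordingly.
interface BufferAwareCompressor
{
    enum BufferType { ON_HEAP, DIRECT }

    BufferType supportedBufferType();
}

class DeflateLikeCompressor implements BufferAwareCompressor
{
    // Deflate-style codecs work on byte[] internally, so they cannot
    // accept direct buffers (which have no backing array).
    public BufferType supportedBufferType() { return BufferType.ON_HEAP; }
}

class BufferAllocator
{
    static ByteBuffer allocate(BufferAwareCompressor c, int size)
    {
        return c.supportedBufferType() == BufferAwareCompressor.BufferType.DIRECT
             ? ByteBuffer.allocateDirect(size)
             : ByteBuffer.allocate(size);
    }
}
```

This separates the two conditions under discussion: whether the file is mmapped (a codepath question) and whether the codec tolerates direct buffers (a codec question).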
[jira] [Created] (CASSANDRA-9292) ParentRepairSession potentially block ANTI_ENTROPY stage
Yuki Morishita created CASSANDRA-9292: - Summary: ParentRepairSession potentially block ANTI_ENTROPY stage Key: CASSANDRA-9292 URL: https://issues.apache.org/jira/browse/CASSANDRA-9292 Project: Cassandra Issue Type: Bug Reporter: Yuki Morishita Priority: Minor Follow up from CASSANDRA-9151, {quote} potentially block this stage again since many methods are synchronized in ActiveRepairService. Methods prepareForRepair(could block for 1 hour for prepare message response) and finishParentSession(this one block for anticompaction to finish) are synchronized and could hold on to the lock for a long time. In RepairMessageVerbHandler.doVerb, if there is an exception for another repair, removeParentRepairSession(also synchronized) will block. {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
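The contention described in the ticket can be seen in miniature: if every method on the service shares one monitor via synchronized, a prepare that blocks for up to an hour also blocks unrelated removeParentRepairSession calls. A ConcurrentMap-based sketch (class and method names hypothetical, not the eventual fix) avoids the shared lock for removal:

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative alternative to coarse-grained synchronized methods: sessions
// live in a ConcurrentMap, so cleanup never waits on a long-running prepare.
class RepairSessions
{
    private final ConcurrentMap<UUID, String> parentSessions = new ConcurrentHashMap<>();

    void register(UUID id, String info)
    {
        parentSessions.put(id, info);
    }

    // No shared monitor held: callers proceed even while another thread
    // is blocked inside a slow prepare or anticompaction.
    String remove(UUID id)
    {
        return parentSessions.remove(id);
    }
}
```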
[Cassandra Wiki] Update of CodeStyle by JakeLuciani
Dear Wiki user, You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification. The CodeStyle page has been changed by JakeLuciani: https://wiki.apache.org/cassandra/CodeStyle?action=diffrev1=27rev2=28 * Prefer requiring initialization in the constructor to setters. * avoid redundant this references to member fields or methods * Do not extract interfaces (or abstract classes) unless you actually need multiple implementations of it + * Always include braces for nested levels of conditionals and loops. Only avoid braces for single level. == Multiline statements == * Try to keep lines under 120 characters, but use good judgement -- it's better to exceed 120 by a little, than split a line that has no natural splitting points.
cassandra git commit: Fix threadpool in RepairSession is not shutdown on failure
Repository: cassandra Updated Branches: refs/heads/trunk d9836e0ef -> 7223ec025 Fix threadpool in RepairSession is not shutdown on failure patch by Sankalp Kohli; reviewed by yukim for CASSANDRA-9260 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7223ec02 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7223ec02 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7223ec02 Branch: refs/heads/trunk Commit: 7223ec025b749b57d7610aef5e991f151d73b157 Parents: d9836e0 Author: Sankalp Kohli kohlisank...@gmail.com Authored: Mon May 4 09:58:41 2015 -0500 Committer: Yuki Morishita yu...@apache.org Committed: Mon May 4 09:58:41 2015 -0500 -- src/java/org/apache/cassandra/repair/RepairSession.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/7223ec02/src/java/org/apache/cassandra/repair/RepairSession.java -- diff --git a/src/java/org/apache/cassandra/repair/RepairSession.java b/src/java/org/apache/cassandra/repair/RepairSession.java index 9762159..70bfaa6 100644 --- a/src/java/org/apache/cassandra/repair/RepairSession.java +++ b/src/java/org/apache/cassandra/repair/RepairSession.java @@ -276,7 +276,7 @@ public class RepairSession extends AbstractFuture<RepairSessionResult> implements { logger.error(String.format("[repair #%s] Session completed with the following error", getId()), t); Tracing.traceRepair("Session completed with the following error: {}", t); -setException(t); +forceShutdown(t); } }); }
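The one-line change above swaps setException for forceShutdown on the failure path. A minimal sketch of why that matters follows; the class and fields here are hypothetical stand-ins, since the real fix lives in RepairSession:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: recording the exception alone leaves the session's executor (and
// its threads) running after the session has logically ended; a shutdown
// variant tears the pool down as well.
class SessionLike
{
    final ExecutorService taskExecutor = Executors.newSingleThreadExecutor();
    Throwable failure;

    // Before the patch: exception recorded, executor leaked.
    void setException(Throwable t) { failure = t; }

    // After the patch: executor is also torn down on failure.
    void forceShutdown(Throwable t)
    {
        setException(t);
        taskExecutor.shutdownNow();
    }
}
```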
[jira] [Commented] (CASSANDRA-5839) Save repair data to system table
[ https://issues.apache.org/jira/browse/CASSANDRA-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526735#comment-14526735 ] Marcus Eriksson commented on CASSANDRA-5839: Looks like it could be an issue with announcing the new keyspace system_distributed too soon after announcing system_traces (adding a 1s sleep before announcing the second ks fixes it). But why would it only break if we cluster.clear() and then restart? also, modifying the test to do: {code}node1.clear(clear_all=True) node2.clear(clear_all=True){code} instead of {{cluster.clear()}} makes the simple_test above pass (clear_all=True also removes log files) I'll keep digging... Save repair data to system table Key: CASSANDRA-5839 URL: https://issues.apache.org/jira/browse/CASSANDRA-5839 Project: Cassandra Issue Type: New Feature Components: Core, Tools Reporter: Jonathan Ellis Assignee: Marcus Eriksson Priority: Minor Fix For: 3.0 Attachments: 0001-5839.patch, 2.0.4-5839-draft.patch, 2.0.6-5839-v2.patch As noted in CASSANDRA-2405, it would be useful to store repair results, particularly with sub-range repair available (CASSANDRA-5280). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-9293) Unit tests should fail if any LEAK DETECTED errors are printed
Benedict created CASSANDRA-9293: --- Summary: Unit tests should fail if any LEAK DETECTED errors are printed Key: CASSANDRA-9293 URL: https://issues.apache.org/jira/browse/CASSANDRA-9293 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict We shouldn't depend on dtests to inform us of these problems (which have error log monitoring) - they should be caught by unit tests, which may also cover different failure conditions (besides being faster). There are a couple of ways we could do this, but probably the easiest is to add a static flag that is set to true if we ever see a leak (in Ref), and to just assert that this is false at the end of every test. [~enigmacurry] is this something TE can help with? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
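The proposal in the ticket reduces to a static flag plus an end-of-test assertion. A minimal sketch, with names assumed rather than taken from any eventual patch:

```java
// Sketch of the idea: Ref (or the leak detector) flips a static flag when a
// leak is seen, and an @After hook in every unit test asserts it is clean.
class LeakDetector
{
    static volatile boolean leakDetected = false;

    // Called from the reference-counting machinery when a leak is found.
    static void reportLeak()
    {
        leakDetected = true;
    }

    // Would run at the end of every test; fails the test run on any leak.
    static void assertNoLeaks()
    {
        if (leakDetected)
            throw new AssertionError("LEAK DETECTED during test run");
    }
}
```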
[jira] [Commented] (CASSANDRA-9282) Warn on unlogged batches
[ https://issues.apache.org/jira/browse/CASSANDRA-9282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526743#comment-14526743 ] T Jake Luciani commented on CASSANDRA-9282: --- [~aweisberg] how much of a change is it to backport the NoSpamLogger to 2.1? Warn on unlogged batches Key: CASSANDRA-9282 URL: https://issues.apache.org/jira/browse/CASSANDRA-9282 Project: Cassandra Issue Type: Bug Components: API Reporter: Jonathan Ellis Assignee: T Jake Luciani Fix For: 2.1.x At least until CASSANDRA-8303 is done and we can block them entirely, we should log a warning when unlogged batches across multiple partition keys are used. This could either be done by backporting NoSpamLogger and blindly logging every time, or we could add a threshold and warn when more than 10 keys are seen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8805) runWithCompactionsDisabled only cancels compactions, which is not the only source of markCompacted
[ https://issues.apache.org/jira/browse/CASSANDRA-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Yeksigian updated CASSANDRA-8805: -- Attachment: 8805-2.1.txt Registering the index summary redistribution with the compaction manager and checking isStopRequested. Added a test to make sure that the CompactionInterruptedException was thrown. runWithCompactionsDisabled only cancels compactions, which is not the only source of markCompacted -- Key: CASSANDRA-8805 URL: https://issues.apache.org/jira/browse/CASSANDRA-8805 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benedict Assignee: Carl Yeksigian Fix For: 2.1.x Attachments: 8805-2.1.txt Operations like repair that may operate over all sstables cancel compactions before beginning, and fail if there are any files marked compacting after doing so. Redistribution of index summaries is not a compaction, so is not cancelled by this action, but does mark sstables as compacting, so such an action will fail to initiate if there is an index summary redistribution in progress. It seems that IndexSummaryManager needs to register itself as interruptible along with compactions (AFAICT no other actions that may markCompacting are not themselves compactions). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
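The cooperative cancellation described above, long-running work polling isStopRequested and aborting, can be sketched like this. The names and the simulated exception are illustrative, not the patch's actual code:

```java
// Sketch: once index summary redistribution registers with the compaction
// manager, it must poll a stop flag so repair can interrupt it promptly.
class Redistribution
{
    private volatile boolean stopRequested = false;

    void stop() { stopRequested = true; }

    boolean isStopRequested() { return stopRequested; }

    int process(int items)
    {
        int done = 0;
        for (int i = 0; i < items; i++)
        {
            // Check between units of work, mirroring how compactions abort.
            if (isStopRequested())
                throw new RuntimeException("CompactionInterruptedException (simulated)");
            done++;
        }
        return done;
    }
}
```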
[jira] [Commented] (CASSANDRA-9282) Warn on unlogged batches
[ https://issues.apache.org/jira/browse/CASSANDRA-9282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526745#comment-14526745 ] Ariel Weisberg commented on CASSANDRA-9282: --- It has no external impact and doesn't start a thread or anything. Warn on unlogged batches Key: CASSANDRA-9282 URL: https://issues.apache.org/jira/browse/CASSANDRA-9282 Project: Cassandra Issue Type: Bug Components: API Reporter: Jonathan Ellis Assignee: T Jake Luciani Fix For: 2.1.x At least until CASSANDRA-8303 is done and we can block them entirely, we should log a warning when unlogged batches across multiple partition keys are used. This could either be done by backporting NoSpamLogger and blindly logging every time, or we could add a threshold and warn when more than 10 keys are seen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
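The threshold variant from the ticket description (warn when more than 10 distinct partition keys appear in an unlogged batch) could be as simple as the following sketch; the helper name and the exact threshold semantics are assumptions:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch: count distinct partition keys in an unlogged batch server-side
// and decide whether to emit a (rate-limited) warning.
class UnloggedBatchCheck
{
    static final int WARN_THRESHOLD = 10;

    static boolean shouldWarn(List<String> partitionKeys)
    {
        Set<String> distinct = new HashSet<>(partitionKeys);
        return distinct.size() > WARN_THRESHOLD;
    }
}
```

Doing this server-side also addresses Sylvain's point from earlier in the thread: the server always sees the parsed statements, even when a driver receives the batch as an opaque query string.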
[jira] [Commented] (CASSANDRA-9160) Migrate CQL dtests to unit tests
[ https://issues.apache.org/jira/browse/CASSANDRA-9160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526749#comment-14526749 ] Jonathan Ellis commented on CASSANDRA-9160: --- [~slebresne] can you comment on the progress so far? Migrate CQL dtests to unit tests Key: CASSANDRA-9160 URL: https://issues.apache.org/jira/browse/CASSANDRA-9160 Project: Cassandra Issue Type: Test Reporter: Sylvain Lebresne Assignee: Stefania We have CQL tests in 2 places: dtests and unit tests. The unit tests are actually somewhat better in the sense that they have the ability to test both prepared and unprepared statements at the flip of a switch. It's also better to have all those tests in the same place so we can improve the test framework in only one place (CASSANDRA-7959, CASSANDRA-9159, etc...). So we should move the CQL dtests to the unit tests (which will be a good occasion to organize them better). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9057) index validation fails for non-indexed column
[ https://issues.apache.org/jira/browse/CASSANDRA-9057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Yeksigian updated CASSANDRA-9057: -- Attachment: 9057-2.1-v3.txt Latest patch includes [~beobal]'s changes, and removes {{SecondaryIndexCellSizeTest}} and addresses nits. The changes to {{IVVT}} look good; they better capture the tests we want. I agree that we should get rid of {{SICST}}; the validation tests are much more complete. index validation fails for non-indexed column - Key: CASSANDRA-9057 URL: https://issues.apache.org/jira/browse/CASSANDRA-9057 Project: Cassandra Issue Type: Bug Reporter: Eric Evans Assignee: Carl Yeksigian Fix For: 2.1.x Attachments: 9057-2.1-v2.txt, 9057-2.1-v3.txt, 9057-2.1.txt On 2.1.3, updates are failing with an InvalidRequestException when an unindexed column is greater than the maximum allowed for indexed entries. {noformat} ResponseError: Can't index column value of size 1483409 for index null on local_group_default_T_parsoid_html.data {noformat} In this case, the update _does_ include a 1483409 byte column value, but it is for a column that is not indexed, (the single indexed column is 32 bytes), presumably this is why {{cfm.getColumnDefinition(cell.name()).getIndexName()}} returns {{null}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
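The underlying fix, applying the indexed-value size cap only when the column actually has an index, reduces to a check like the following sketch. The size constant and names are illustrative, not the committed code:

```java
// Sketch: the 64 KiB cap on indexed cell values must not reject large
// values in columns that have no index at all.
class IndexValidation
{
    static final int MAX_INDEXED_VALUE_SIZE = 64 * 1024;

    static boolean isValid(boolean columnIsIndexed, int valueSize)
    {
        // Unindexed columns are exempt from the indexed-value size cap.
        return !columnIsIndexed || valueSize <= MAX_INDEXED_VALUE_SIZE;
    }
}
```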
[jira] [Assigned] (CASSANDRA-9293) Unit tests should fail if any LEAK DETECTED errors are printed
[ https://issues.apache.org/jira/browse/CASSANDRA-9293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan McGuire reassigned CASSANDRA-9293: --- Assignee: Ryan McGuire Unit tests should fail if any LEAK DETECTED errors are printed -- Key: CASSANDRA-9293 URL: https://issues.apache.org/jira/browse/CASSANDRA-9293 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Ryan McGuire We shouldn't depend on dtests to inform us of these problems (which have error log monitoring) - they should be caught by unit tests, which may also cover different failure conditions (besides being faster). There are a couple of ways we could do this, but probably the easiest is to add a static flag that is set to true if we ever see a leak (in Ref), and to just assert that this is false at the end of every test. [~enigmacurry] is this something TE can help with? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9029) Add support for rate limiting log statements
[ https://issues.apache.org/jira/browse/CASSANDRA-9029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] T Jake Luciani updated CASSANDRA-9029: -- Fix Version/s: 2.1.6 3.0 Add support for rate limiting log statements Key: CASSANDRA-9029 URL: https://issues.apache.org/jira/browse/CASSANDRA-9029 Project: Cassandra Issue Type: Improvement Reporter: Ariel Weisberg Assignee: Ariel Weisberg Fix For: 3.0, 2.1.6 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[1/4] cassandra git commit: Introduce NoSpamLogger
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 739f3e37c -> e6f027979 Introduce NoSpamLogger patch by ariel; reviewed by benedict for CASSANDRA-9029 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5bffaf85 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5bffaf85 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5bffaf85 Branch: refs/heads/cassandra-2.1 Commit: 5bffaf850ca3e978baaa8664acc65612d7460d3f Parents: 739f3e3 Author: Ariel Weisberg ariel.wesib...@datastax.com Authored: Fri Apr 3 23:27:10 2015 +0100 Committer: T Jake Luciani j...@apache.org Committed: Mon May 4 12:21:59 2015 -0400 -- .../apache/cassandra/utils/NoSpamLogger.java| 238 +++ .../cassandra/utils/NoSpamLoggerTest.java | 174 ++ 2 files changed, 412 insertions(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/5bffaf85/src/java/org/apache/cassandra/utils/NoSpamLogger.java -- diff --git a/src/java/org/apache/cassandra/utils/NoSpamLogger.java b/src/java/org/apache/cassandra/utils/NoSpamLogger.java new file mode 100644 index 000..9f5d5ce --- /dev/null +++ b/src/java/org/apache/cassandra/utils/NoSpamLogger.java @@ -0,0 +1,238 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.cassandra.utils; + +import java.util.concurrent.TimeUnit; +import java.util.concurrent.atomic.AtomicLong; + +import org.cliffc.high_scale_lib.NonBlockingHashMap; +import org.slf4j.Logger; + +import com.google.common.annotations.VisibleForTesting; + +/** + * Logging that limits each log statement to firing based on time since the statement last fired. + * + * Every logger has a unique timer per statement. Minimum time between logging is set for each statement + * the first time it is used and a subsequent attempt to request that statement with a different minimum time will + * result in the original time being used. No warning is provided if there is a mismatch. + * + * If the statement is cached and used to log directly then only a volatile read will be required in the common case. + * If the Logger is cached then there is a single concurrent hash map lookup + the volatile read. + * If neither the logger nor the statement is cached then it is two concurrent hash map lookups + the volatile read. 
+ * + */ +public class NoSpamLogger +{ +/** + * Levels for programmatically specifying the severity of a log statement + */ +public enum Level +{ +INFO, WARN, ERROR; +} + +@VisibleForTesting +static interface Clock +{ +long nanoTime(); +} + +@VisibleForTesting +static Clock CLOCK = new Clock() +{ +public long nanoTime() +{ +return System.nanoTime(); +} +}; + +public class NoSpamLogStatement extends AtomicLong +{ +private static final long serialVersionUID = 1L; + +private final String statement; +private final long minIntervalNanos; + +public NoSpamLogStatement(String statement, long minIntervalNanos) +{ +this.statement = statement; +this.minIntervalNanos = minIntervalNanos; +} + +private boolean shouldLog(long nowNanos) +{ +long expected = get(); +return nowNanos - expected >= minIntervalNanos && compareAndSet(expected, nowNanos); +} + +public void log(Level l, long nowNanos, Object... objects) +{ +if (!shouldLog(nowNanos)) return; + +switch (l) +{ +case INFO: +wrapped.info(statement, objects); +break; +case WARN: +wrapped.warn(statement, objects); +break; +case ERROR: +wrapped.error(statement, objects); +break; +default: +throw new AssertionError(); +} +
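The heart of shouldLog above is a single compare-and-set on the last-fired timestamp: a statement fires only if the minimum interval has elapsed and this caller wins the CAS (losing means a concurrent caller logged instead). Condensed into a standalone sketch, with the clock value passed in explicitly for testability (this is not the committed class):

```java
import java.util.concurrent.atomic.AtomicLong;

// Standalone reduction of NoSpamLogStatement.shouldLog: the AtomicLong value
// is the nanoTime at which the statement last fired.
class RateGate extends AtomicLong
{
    private final long minIntervalNanos;

    RateGate(long minIntervalNanos)
    {
        this.minIntervalNanos = minIntervalNanos;
    }

    boolean shouldLog(long nowNanos)
    {
        long last = get();
        // Fire only if the interval elapsed AND we win the race to update.
        return nowNanos - last >= minIntervalNanos && compareAndSet(last, nowNanos);
    }
}
```

In the uncontended case this costs one volatile read plus, when the statement actually fires, one CAS, which matches the cost model described in the class javadoc above.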
[2/4] cassandra git commit: Ninja fix CASSANDRA-9029
Ninja fix CASSANDRA-9029 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a6549440 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a6549440 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a6549440 Branch: refs/heads/cassandra-2.1 Commit: a6549440f30997273f0b1a073b1493684715c43b Parents: 5bffaf8 Author: Ariel Weisberg ar...@weisberg.ws Authored: Mon Apr 6 23:00:00 2015 +0200 Committer: T Jake Luciani j...@apache.org Committed: Mon May 4 12:34:08 2015 -0400 -- .../apache/cassandra/utils/NoSpamLogger.java| 35 +++--- .../cassandra/utils/NoSpamLoggerTest.java | 115 +++ 2 files changed, 109 insertions(+), 41 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a6549440/src/java/org/apache/cassandra/utils/NoSpamLogger.java -- diff --git a/src/java/org/apache/cassandra/utils/NoSpamLogger.java b/src/java/org/apache/cassandra/utils/NoSpamLogger.java index 9f5d5ce..3cc8b5e 100644 --- a/src/java/org/apache/cassandra/utils/NoSpamLogger.java +++ b/src/java/org/apache/cassandra/utils/NoSpamLogger.java @@ -103,32 +103,32 @@ public class NoSpamLogger public void info(long nowNanos, Object... objects) { -log(Level.INFO, nowNanos, objects); +NoSpamLogStatement.this.log(Level.INFO, nowNanos, objects); } public void info(Object... objects) { -info(CLOCK.nanoTime(), objects); +NoSpamLogStatement.this.info(CLOCK.nanoTime(), objects); } public void warn(long nowNanos, Object... objects) { -log(Level.WARN, nowNanos, objects); +NoSpamLogStatement.this.log(Level.WARN, nowNanos, objects); } -public void warn(String s, Object... objects) +public void warn(Object... objects) { -warn(CLOCK.nanoTime(), s, objects); +NoSpamLogStatement.this.warn(CLOCK.nanoTime(), objects); } public void error(long nowNanos, Object... objects) { -log(Level.ERROR, nowNanos, objects); +NoSpamLogStatement.this.log(Level.ERROR, nowNanos, objects); } public void error(Object... 
objects) { -error(CLOCK.nanoTime(), objects); +NoSpamLogStatement.this.error(CLOCK.nanoTime(), objects); } } @@ -165,7 +165,8 @@ public class NoSpamLogger statement.log(level, nowNanos, objects); } -public static NoSpamLogStatement getStatement(Logger logger, String message, long minInterval, TimeUnit unit) { +public static NoSpamLogStatement getStatement(Logger logger, String message, long minInterval, TimeUnit unit) +{ NoSpamLogger wrapped = getLogger(logger, minInterval, unit); return wrapped.getStatement(message); } @@ -182,45 +183,45 @@ public class NoSpamLogger public void info(long nowNanos, String s, Object... objects) { -log( Level.INFO, s, nowNanos, objects); +NoSpamLogger.this.log( Level.INFO, s, nowNanos, objects); } public void info(String s, Object... objects) { -info(CLOCK.nanoTime(), s, objects); +NoSpamLogger.this.info(CLOCK.nanoTime(), s, objects); } public void warn(long nowNanos, String s, Object... objects) { -log( Level.WARN, s, nowNanos, objects); +NoSpamLogger.this.log( Level.WARN, s, nowNanos, objects); } public void warn(String s, Object... objects) { -warn(CLOCK.nanoTime(), s, objects); +NoSpamLogger.this.warn(CLOCK.nanoTime(), s, objects); } public void error(long nowNanos, String s, Object... objects) { -log( Level.ERROR, s, nowNanos, objects); +NoSpamLogger.this.log( Level.ERROR, s, nowNanos, objects); } public void error(String s, Object... objects) { -error(CLOCK.nanoTime(), s, objects); +NoSpamLogger.this.error(CLOCK.nanoTime(), s, objects); } public void log(Level l, String s, long nowNanos, Object... 
objects) { -getStatement(s, minIntervalNanos).log(l, nowNanos, objects); +NoSpamLogger.this.getStatement(s, minIntervalNanos).log(l, nowNanos, objects); } public NoSpamLogStatement getStatement(String s) { -return getStatement(s, minIntervalNanos); +return NoSpamLogger.this.getStatement(s, minIntervalNanos); } public NoSpamLogStatement getStatement(String s, long minInterval, TimeUnit unit) { -return getStatement(s, unit.toNanos(minInterval)); +return NoSpamLogger.this.getStatement(s, unit.toNanos(minInterval)); } public
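The ninja fix above rewrites unqualified calls such as `warn(CLOCK.nanoTime(), objects)` into explicitly qualified `NoSpamLogStatement.this.…` / `NoSpamLogger.this.…` forms. The reason is how Java resolves method names from inside an inner class: once the inner class declares *any* method with the same name, only that class's overloads are considered, so an unqualified call intended for the enclosing class silently binds to the inner varargs overload instead. A minimal standalone sketch of the pitfall (class and field names here are illustrative, not taken from the patch):

```java
import java.util.ArrayList;
import java.util.List;

// Shows why the patch qualifies delegating calls with Outer.this: once an
// inner class declares any method named warn, unqualified calls resolve only
// against the inner class's overloads, even when an enclosing-class overload
// matches the arguments more precisely.
public class InnerOverloadDemo {
    final List<String> calls = new ArrayList<>();

    // The enclosing-class overload the inner code "means" to call.
    void warn(String s, Object... objects) {
        calls.add("outer");
    }

    class Statement {
        // This varargs overload shadows the outer method name entirely.
        void warn(Object... objects) {
            calls.add("inner");
        }

        void fire() {
            warn("message", 1, 2);                        // binds to Statement.warn(Object...)
            InnerOverloadDemo.this.warn("message", 1, 2); // explicitly reaches the outer overload
        }
    }
}
```

Running `new InnerOverloadDemo().new Statement().fire()` records `inner` then `outer`, which is why the fix spells out the receiver on every delegating call.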
[4/4] cassandra git commit: backport 9029 to 2.1
backport 9029 to 2.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e6f02797 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e6f02797 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e6f02797 Branch: refs/heads/cassandra-2.1 Commit: e6f027979a3ec4221438bd2a21db8053cb3c1ad7 Parents: 8ec1da2 Author: T Jake Luciani j...@apache.org Authored: Mon May 4 12:42:10 2015 -0400 Committer: T Jake Luciani j...@apache.org Committed: Mon May 4 12:42:10 2015 -0400 -- CHANGES.txt | 1 + .../cassandra/utils/NoSpamLoggerTest.java | 141 ++- 2 files changed, 139 insertions(+), 3 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/e6f02797/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 0593e2b..e7689ab 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.1.6 + * Add support for rate limiting log messages (CASSANDRA-9029) * Log the partition key with tombstone warnings (CASSANDRA-8561) * Reduce runWithCompactionsDisabled poll interval to 1ms (CASSANDRA-9271) * Fix PITR commitlog replay (CASSANDRA-9195) http://git-wip-us.apache.org/repos/asf/cassandra/blob/e6f02797/test/unit/org/apache/cassandra/utils/NoSpamLoggerTest.java -- diff --git a/test/unit/org/apache/cassandra/utils/NoSpamLoggerTest.java b/test/unit/org/apache/cassandra/utils/NoSpamLoggerTest.java index 0a5a005..0d6c8b1 100644 --- a/test/unit/org/apache/cassandra/utils/NoSpamLoggerTest.java +++ b/test/unit/org/apache/cassandra/utils/NoSpamLoggerTest.java @@ -32,34 +32,169 @@ import org.junit.Before; import org.junit.BeforeClass; import org.junit.Test; import org.slf4j.Logger; -import org.slf4j.helpers.SubstituteLogger; +import org.slf4j.helpers.MarkerIgnoringBase; public class NoSpamLoggerTest { Map<Level, Queue<Pair<String, Object[]>>> logged = new HashMap<>(); - Logger mock = new SubstituteLogger(null) + Logger mock = new MarkerIgnoringBase() { + public boolean isTraceEnabled() + { + return false; 
+ } + + public void trace(String s) + { + + } + + public void trace(String s, Object o) + { + + } + + public void trace(String s, Object o, Object o1) + { + + } + + public void trace(String s, Object... objects) + { + + } + + public void trace(String s, Throwable throwable) + { + + } + + public boolean isDebugEnabled() + { + return false; + } + + public void debug(String s) + { + + } + + public void debug(String s, Object o) + { + + } + + public void debug(String s, Object o, Object o1) + { + + } + + public void debug(String s, Object... objects) + { + + } + + public void debug(String s, Throwable throwable) + { + + } + + public boolean isInfoEnabled() + { + return false; + } + + public void info(String s) + { + + } + + public void info(String s, Object o) + { + + } + + public void info(String s, Object o, Object o1) + { + + } + @Override public void info(String statement, Object... args) { logged.get(Level.INFO).offer(Pair.create(statement, args)); } + public void info(String s, Throwable throwable) + { + + } + + public boolean isWarnEnabled() + { + return false; + } + + public void warn(String s) + { + + } + + public void warn(String s, Object o) + { + + } + @Override public void warn(String statement, Object... args) { logged.get(Level.WARN).offer(Pair.create(statement, args)); } + public void warn(String s, Object o, Object o1) + { + + } + + public void warn(String s, Throwable throwable) + { + + } + + public boolean isErrorEnabled() + { + return false; + } + + public void error(String s) + { + + } + + public void error(String s, Object o) + { + + } + + public void error(String s, Object o, Object o1) + { + + } + @Override public void error(String statement, Object... args) { logged.get(Level.ERROR).offer(Pair.create(statement, args)); } + public void error(String s, Throwable throwable) + { + + } + @Override public int hashCode() { @@ -123,7 +258,7 @@ public class NoSpamLoggerTest
[3/4] cassandra git commit: Ninja fix CASSANDRA-9029
Ninja fix CASSANDRA-9029 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8ec1da21 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8ec1da21 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8ec1da21 Branch: refs/heads/cassandra-2.1 Commit: 8ec1da211830762ebf571f12d9cbd505d2a1fada Parents: a654944 Author: Ariel Weisberg ar...@weisberg.ws Authored: Tue Apr 7 01:01:16 2015 +0200 Committer: T Jake Luciani j...@apache.org Committed: Mon May 4 12:35:08 2015 -0400 -- test/unit/org/apache/cassandra/utils/NoSpamLoggerTest.java | 2 ++ 1 file changed, 2 insertions(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/8ec1da21/test/unit/org/apache/cassandra/utils/NoSpamLoggerTest.java -- diff --git a/test/unit/org/apache/cassandra/utils/NoSpamLoggerTest.java b/test/unit/org/apache/cassandra/utils/NoSpamLoggerTest.java index ca1d6d3..0a5a005 100644 --- a/test/unit/org/apache/cassandra/utils/NoSpamLoggerTest.java +++ b/test/unit/org/apache/cassandra/utils/NoSpamLoggerTest.java @@ -191,6 +191,8 @@ public class NoSpamLoggerTest @Test public void testLoggedResult() throws Exception { + now = 5; + NoSpamLogger.log( mock, Level.INFO, 5, TimeUnit.NANOSECONDS, statement, param); checkMock(Level.INFO);
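The added `now = 5;` matters because each statement's last-fired timestamp starts at 0: with a 5 ns minimum interval, the first logging attempt only passes the elapsed-time check once the mocked clock reads at least 5. The swappable-clock pattern the test depends on can be sketched as follows (names are illustrative, not the patch's exact code):

```java
import java.util.concurrent.atomic.AtomicLong;

// A time source read through an interface so that production code uses
// System.nanoTime() while a test can install a deterministic value.
public class ClockedLimiter {
    interface Clock {
        long nanoTime();
    }

    // Mutable for tests, mirroring the idea of a @VisibleForTesting CLOCK field.
    static Clock CLOCK = System::nanoTime;

    private final AtomicLong lastFired = new AtomicLong(); // starts at 0
    private final long minIntervalNanos;

    ClockedLimiter(long minIntervalNanos) {
        this.minIntervalNanos = minIntervalNanos;
    }

    // Fires only when the interval has elapsed and we win the race to record
    // the new firing time.
    boolean tryFire() {
        long now = CLOCK.nanoTime();
        long last = lastFired.get();
        return now - last >= minIntervalNanos && lastFired.compareAndSet(last, now);
    }
}
```

With `CLOCK = () -> 0L` and a 5 ns interval, `tryFire()` returns false (0 - 0 < 5); advancing the clock to 5 makes it return true, which is exactly what setting `now = 5;` achieves in the test.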
[1/5] cassandra git commit: Introduce NoSpamLogger
Repository: cassandra Updated Branches: refs/heads/trunk 7223ec025 - 47964b766 Introduce NoSpamLogger patch by ariel; reviewed by benedict for CASSANDRA-9029 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5bffaf85 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5bffaf85 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5bffaf85 Branch: refs/heads/trunk Commit: 5bffaf850ca3e978baaa8664acc65612d7460d3f Parents: 739f3e3 Author: Ariel Weisberg ariel.wesib...@datastax.com Authored: Fri Apr 3 23:27:10 2015 +0100 Committer: T Jake Luciani j...@apache.org Committed: Mon May 4 12:21:59 2015 -0400 -- .../apache/cassandra/utils/NoSpamLogger.java| 238 +++ .../cassandra/utils/NoSpamLoggerTest.java | 174 ++ 2 files changed, 412 insertions(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/5bffaf85/src/java/org/apache/cassandra/utils/NoSpamLogger.java -- diff --git a/src/java/org/apache/cassandra/utils/NoSpamLogger.java b/src/java/org/apache/cassandra/utils/NoSpamLogger.java new file mode 100644 index 000..9f5d5ce --- /dev/null +++ b/src/java/org/apache/cassandra/utils/NoSpamLogger.java @@ -0,0 +1,238 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * License); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an AS IS BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.cassandra.utils; + +import java.util.concurrent.TimeUnit; +import java.util.concurrent.atomic.AtomicLong; + +import org.cliffc.high_scale_lib.NonBlockingHashMap; +import org.slf4j.Logger; + +import com.google.common.annotations.VisibleForTesting; + +/** + * Logging that limits each log statement to firing based on time since the statement last fired. + * + * Every logger has a unique timer per statement. Minimum time between logging is set for each statement + * the first time it is used and a subsequent attempt to request that statement with a different minimum time will + * result in the original time being used. No warning is provided if there is a mismatch. + * + * If the statement is cached and used to log directly then only a volatile read will be required in the common case. + * If the Logger is cached then there is a single concurrent hash map lookup + the volatile read. + * If neither the logger nor the statement is cached then it is two concurrent hash map lookups + the volatile read. 
+ * + */ +public class NoSpamLogger +{ +/** + * Levels for programmatically specifying the severity of a log statement + */ +public enum Level +{ +INFO, WARN, ERROR; +} + +@VisibleForTesting +static interface Clock +{ +long nanoTime(); +} + +@VisibleForTesting +static Clock CLOCK = new Clock() +{ +public long nanoTime() +{ +return System.nanoTime(); +} +}; + +public class NoSpamLogStatement extends AtomicLong +{ +private static final long serialVersionUID = 1L; + +private final String statement; +private final long minIntervalNanos; + +public NoSpamLogStatement(String statement, long minIntervalNanos) +{ +this.statement = statement; +this.minIntervalNanos = minIntervalNanos; +} + +private boolean shouldLog(long nowNanos) +{ +long expected = get(); +return nowNanos - expected >= minIntervalNanos && compareAndSet(expected, nowNanos); +} + +public void log(Level l, long nowNanos, Object... objects) +{ +if (!shouldLog(nowNanos)) return; + +switch (l) +{ +case INFO: +wrapped.info(statement, objects); +break; +case WARN: +wrapped.warn(statement, objects); +break; +case ERROR: +wrapped.error(statement, objects); +break; +default: +throw new AssertionError(); +} +} + +
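The heart of the class is `shouldLog`: each statement extends `AtomicLong`, holding the time it last fired, and a single compare-and-swap decides which of several racing threads actually logs. That decision can be re-created standalone (a minimal sketch, not the full Cassandra class):

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal re-creation of the NoSpamLogger rate-limit decision: the last
// firing time lives in an AtomicLong, and a CAS ensures that when several
// threads race past the interval check, exactly one of them logs.
public class RateLimitedStatement extends AtomicLong {
    private final long minIntervalNanos;

    public RateLimitedStatement(long minIntervalNanos) {
        this.minIntervalNanos = minIntervalNanos;
    }

    public boolean shouldLog(long nowNanos) {
        long expected = get(); // last time this statement fired
        // Fire only if the interval has elapsed AND we win the race to
        // record the new firing time.
        return nowNanos - expected >= minIntervalNanos && compareAndSet(expected, nowNanos);
    }
}
```

With a 10 ns interval, `shouldLog(10)` fires, `shouldLog(15)` is suppressed, and `shouldLog(20)` fires again; a thread that loses the CAS simply skips logging rather than retrying, which is the desired behavior for a spam limiter.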
[5/5] cassandra git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Conflicts: test/unit/org/apache/cassandra/utils/NoSpamLoggerTest.java Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/47964b76 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/47964b76 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/47964b76 Branch: refs/heads/trunk Commit: 47964b766cdcfa2db3d2b9d1061c04cc850de5ca Parents: 7223ec0 e6f0279 Author: T Jake Luciani j...@apache.org Authored: Mon May 4 12:43:28 2015 -0400 Committer: T Jake Luciani j...@apache.org Committed: Mon May 4 12:43:28 2015 -0400 -- CHANGES.txt | 1 + 1 file changed, 1 insertion(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/47964b76/CHANGES.txt -- diff --cc CHANGES.txt index 49645b2,e7689ab..7170085 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,103 -1,5 +1,104 @@@ +3.0 + * Disable memory mapping of hsperfdata file for JVM statistics (CASSANDRA-9242) + * Add pre-startup checks to detect potential incompatibilities (CASSANDRA-8049) + * Distinguish between null and unset in protocol v4 (CASSANDRA-7304) + * Add user/role permissions for user-defined functions (CASSANDRA-7557) + * Allow cassandra config to be updated to restart daemon without unloading classes (CASSANDRA-9046) + * Don't initialize compaction writer before checking if iter is empty (CASSANDRA-9117) + * Don't execute any functions at prepare-time (CASSANDRA-9037) + * Share file handles between all instances of a SegmentedFile (CASSANDRA-8893) + * Make it possible to major compact LCS (CASSANDRA-7272) + * Make FunctionExecutionException extend RequestExecutionException + (CASSANDRA-9055) + * Add support for SELECT JSON, INSERT JSON syntax and new toJson(), fromJson() + functions (CASSANDRA-7970) + * Optimise max purgeable timestamp calculation in compaction (CASSANDRA-8920) + * Constrain internode message buffer sizes, and improve IO class hierarchy (CASSANDRA-8670) + * New tool added to 
validate all sstables in a node (CASSANDRA-5791) + * Push notification when tracing completes for an operation (CASSANDRA-7807) + * Delay node up and node added notifications until native protocol server is started (CASSANDRA-8236) + * Compressed Commit Log (CASSANDRA-6809) + * Optimise IntervalTree (CASSANDRA-8988) + * Add a key-value payload for third party usage (CASSANDRA-8553, 9212) + * Bump metrics-reporter-config dependency for metrics 3.0 (CASSANDRA-8149) + * Partition intra-cluster message streams by size, not type (CASSANDRA-8789) + * Add WriteFailureException to native protocol, notify coordinator of + write failures (CASSANDRA-8592) + * Convert SequentialWriter to nio (CASSANDRA-8709) + * Add role based access control (CASSANDRA-7653, 8650, 7216, 8760, 8849, 8761, 8850) + * Record client ip address in tracing sessions (CASSANDRA-8162) + * Indicate partition key columns in response metadata for prepared + statements (CASSANDRA-7660) + * Merge UUIDType and TimeUUIDType parse logic (CASSANDRA-8759) + * Avoid memory allocation when searching index summary (CASSANDRA-8793) + * Optimise (Time)?UUIDType Comparisons (CASSANDRA-8730) + * Make CRC32Ex into a separate maven dependency (CASSANDRA-8836) + * Use preloaded jemalloc w/ Unsafe (CASSANDRA-8714) + * Avoid accessing partitioner through StorageProxy (CASSANDRA-8244, 8268) + * Upgrade Metrics library and remove depricated metrics (CASSANDRA-5657) + * Serializing Row cache alternative, fully off heap (CASSANDRA-7438) + * Duplicate rows returned when in clause has repeated values (CASSANDRA-6707) + * Make CassandraException unchecked, extend RuntimeException (CASSANDRA-8560) + * Support direct buffer decompression for reads (CASSANDRA-8464) + * DirectByteBuffer compatible LZ4 methods (CASSANDRA-7039) + * Group sstables for anticompaction correctly (CASSANDRA-8578) + * Add ReadFailureException to native protocol, respond + immediately when replicas encounter errors while handling + a read request 
(CASSANDRA-7886) + * Switch CommitLogSegment from RandomAccessFile to nio (CASSANDRA-8308) + * Allow mixing token and partition key restrictions (CASSANDRA-7016) + * Support index key/value entries on map collections (CASSANDRA-8473) + * Modernize schema tables (CASSANDRA-8261) + * Support for user-defined aggregation functions (CASSANDRA-8053) + * Fix NPE in SelectStatement with empty IN values (CASSANDRA-8419) + * Refactor SelectStatement, return IN results in natural order instead + of IN value list order and ignore duplicate values in partition key IN restrictions (CASSANDRA-7981) + * Support UDTs, tuples, and collections in user-defined + functions (CASSANDRA-7563) + * Fix aggregate fn results
[jira] [Created] (CASSANDRA-9294) Streaming errors should log the root cause
Brandon Williams created CASSANDRA-9294: --- Summary: Streaming errors should log the root cause Key: CASSANDRA-9294 URL: https://issues.apache.org/jira/browse/CASSANDRA-9294 Project: Cassandra Issue Type: Bug Reporter: Brandon Williams Assignee: Yuki Morishita Fix For: 2.0.x Currently, when a streaming errors occurs all you get is something like: {noformat} java.util.concurrent.ExecutionException: org.apache.cassandra.streaming.StreamException: Stream failed {noformat} Instead, we should log the root cause. Was the connection reset by peer, did it timeout, etc? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9294) Streaming errors should log the root cause
[ https://issues.apache.org/jira/browse/CASSANDRA-9294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-9294: Description: Currently, when a streaming error occurs all you get is something like: {noformat} java.util.concurrent.ExecutionException: org.apache.cassandra.streaming.StreamException: Stream failed {noformat} Instead, we should log the root cause. Was the connection reset by peer, did it timeout, etc? was: Currently, when a streaming errors occurs all you get is something like: {noformat} java.util.concurrent.ExecutionException: org.apache.cassandra.streaming.StreamException: Stream failed {noformat} Instead, we should log the root cause. Was the connection reset by peer, did it timeout, etc? Streaming errors should log the root cause -- Key: CASSANDRA-9294 URL: https://issues.apache.org/jira/browse/CASSANDRA-9294 Project: Cassandra Issue Type: Bug Reporter: Brandon Williams Assignee: Yuki Morishita Fix For: 2.0.x Currently, when a streaming error occurs all you get is something like: {noformat} java.util.concurrent.ExecutionException: org.apache.cassandra.streaming.StreamException: Stream failed {noformat} Instead, we should log the root cause. Was the connection reset by peer, did it timeout, etc? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
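Surfacing the root cause is mostly a matter of walking the `getCause()` chain before formatting the error message. A hedged sketch of the idea (not the patch that eventually landed; Guava's `Throwables.getRootCause` performs the same walk):

```java
// Walks an exception's cause chain to the deepest cause, so a log line can
// say "Connection reset by peer" instead of only "Stream failed".
public class RootCauses {
    public static Throwable rootCause(Throwable t) {
        Throwable cur = t;
        // Stop on a self-referential cause to avoid looping forever.
        while (cur.getCause() != null && cur.getCause() != cur)
            cur = cur.getCause();
        return cur;
    }

    public static void main(String[] args) {
        Throwable root = new java.net.SocketException("Connection reset by peer");
        Throwable wrapped = new java.util.concurrent.ExecutionException(
                new RuntimeException("Stream failed", root));
        System.out.println(rootCause(wrapped).getMessage());
        // prints: Connection reset by peer
    }
}
```

Logging `rootCause(e)` alongside (or instead of) the outer wrapper answers exactly the questions the ticket raises: was the connection reset by the peer, did it time out, and so on.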
[jira] [Commented] (CASSANDRA-9164) Randomized correctness testing of CQLSSTableWriter (and friends)
[ https://issues.apache.org/jira/browse/CASSANDRA-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526843#comment-14526843 ] Benedict commented on CASSANDRA-9164: - The more I think about this, the more I think it should be tied directly into cassandra-stress, like a lot of other things. I'm leaning towards stress implementing an AtomIterator for its generated data (post 8099), and we could then pipe these into any kind of writer, and also directly read the resulting sstables. I think this is probably true of CASSANDRA-9162 as well, but this won't be true of every part of the over-arching CASSANDRA-9163 ticket I hope. Randomized correctness testing of CQLSSTableWriter (and friends) Key: CASSANDRA-9164 URL: https://issues.apache.org/jira/browse/CASSANDRA-9164 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Minor Labels: retrospective_generated -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[3/3] cassandra git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Conflicts: tools/stress/src/org/apache/cassandra/stress/generate/PartitionIterator.java tools/stress/src/org/apache/cassandra/stress/operations/SampledOpDistributionFactory.java Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ed084029 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ed084029 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ed084029 Branch: refs/heads/trunk Commit: ed08402999baade584aafd0ca47faa4df529f41d Parents: 47964b7 3bee990 Author: Benedict Elliott Smith bened...@apache.org Authored: Mon May 4 18:03:43 2015 +0100 Committer: Benedict Elliott Smith bened...@apache.org Committed: Mon May 4 18:03:43 2015 +0100 -- CHANGES.txt | 1 + .../org/apache/cassandra/stress/Operation.java | 13 +- .../apache/cassandra/stress/StressAction.java | 4 +- .../apache/cassandra/stress/StressProfile.java | 25 +- .../stress/generate/PartitionGenerator.java | 9 + .../stress/generate/PartitionIterator.java | 165 +++-- .../apache/cassandra/stress/generate/Row.java | 15 +- .../SampledOpDistributionFactory.java | 23 +- .../operations/userdefined/SchemaInsert.java| 14 +- .../operations/userdefined/SchemaQuery.java | 7 +- .../operations/userdefined/SchemaStatement.java | 35 +- .../userdefined/ValidatingSchemaQuery.java | 359 +++ .../SettingsCommandPreDefinedMixed.java | 9 +- .../stress/settings/SettingsCommandUser.java| 9 +- .../stress/settings/ValidationType.java | 29 -- 15 files changed, 581 insertions(+), 136 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/ed084029/CHANGES.txt -- diff --cc CHANGES.txt index 7170085,3a2daa7..ddfd174 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,103 -1,5 +1,104 @@@ +3.0 + * Disable memory mapping of hsperfdata file for JVM statistics (CASSANDRA-9242) + * Add pre-startup checks to detect potential incompatibilities (CASSANDRA-8049) + * Distinguish between null and unset in protocol v4 
(CASSANDRA-7304) + * Add user/role permissions for user-defined functions (CASSANDRA-7557) + * Allow cassandra config to be updated to restart daemon without unloading classes (CASSANDRA-9046) + * Don't initialize compaction writer before checking if iter is empty (CASSANDRA-9117) + * Don't execute any functions at prepare-time (CASSANDRA-9037) + * Share file handles between all instances of a SegmentedFile (CASSANDRA-8893) + * Make it possible to major compact LCS (CASSANDRA-7272) + * Make FunctionExecutionException extend RequestExecutionException + (CASSANDRA-9055) + * Add support for SELECT JSON, INSERT JSON syntax and new toJson(), fromJson() + functions (CASSANDRA-7970) + * Optimise max purgeable timestamp calculation in compaction (CASSANDRA-8920) + * Constrain internode message buffer sizes, and improve IO class hierarchy (CASSANDRA-8670) + * New tool added to validate all sstables in a node (CASSANDRA-5791) + * Push notification when tracing completes for an operation (CASSANDRA-7807) + * Delay node up and node added notifications until native protocol server is started (CASSANDRA-8236) + * Compressed Commit Log (CASSANDRA-6809) + * Optimise IntervalTree (CASSANDRA-8988) + * Add a key-value payload for third party usage (CASSANDRA-8553, 9212) + * Bump metrics-reporter-config dependency for metrics 3.0 (CASSANDRA-8149) + * Partition intra-cluster message streams by size, not type (CASSANDRA-8789) + * Add WriteFailureException to native protocol, notify coordinator of + write failures (CASSANDRA-8592) + * Convert SequentialWriter to nio (CASSANDRA-8709) + * Add role based access control (CASSANDRA-7653, 8650, 7216, 8760, 8849, 8761, 8850) + * Record client ip address in tracing sessions (CASSANDRA-8162) + * Indicate partition key columns in response metadata for prepared + statements (CASSANDRA-7660) + * Merge UUIDType and TimeUUIDType parse logic (CASSANDRA-8759) + * Avoid memory allocation when searching index summary (CASSANDRA-8793) + * Optimise 
(Time)?UUIDType Comparisons (CASSANDRA-8730) + * Make CRC32Ex into a separate maven dependency (CASSANDRA-8836) + * Use preloaded jemalloc w/ Unsafe (CASSANDRA-8714) + * Avoid accessing partitioner through StorageProxy (CASSANDRA-8244, 8268) + * Upgrade Metrics library and remove depricated metrics (CASSANDRA-5657) + * Serializing Row cache alternative, fully off heap (CASSANDRA-7438) + * Duplicate rows returned when in clause has repeated values (CASSANDRA-6707) + * Make CassandraException unchecked, extend RuntimeException (CASSANDRA-8560) + * Support direct
[jira] [Updated] (CASSANDRA-9097) Repeated incremental nodetool repair results in failed repairs due to running anticompaction
[ https://issues.apache.org/jira/browse/CASSANDRA-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuki Morishita updated CASSANDRA-9097: -- Attachment: 0001-Wait-for-anticompaction-to-finish.patch Attaching updated version. I changed the anticompaction response callback to handle failure. Above branch is also force updated. Repeated incremental nodetool repair results in failed repairs due to running anticompaction Key: CASSANDRA-9097 URL: https://issues.apache.org/jira/browse/CASSANDRA-9097 Project: Cassandra Issue Type: Bug Reporter: Gustav Munkby Assignee: Yuki Morishita Priority: Minor Fix For: 2.1.x Attachments: 0001-Wait-for-anticompaction-to-finish.patch I'm trying to synchronize incremental repairs over multiple nodes in a Cassandra cluster, and it does not seem to be easily achievable. In principle, the process iterates through the nodes of the cluster and performs `nodetool -h $NODE repair --incremental`, but that sometimes fails on subsequent nodes. The reason for failing seems to be that the repair returns as soon as the repair and the _local_ anticompaction have completed, but does not guarantee that remote anticompactions are complete. If I subsequently try to issue another repair command, it fails to start (and terminates with failure after about one minute). It usually isn't a problem, as the local anticompaction typically involves as much (or more) data as the remote ones, but sometimes not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
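The rolling-repair workflow described in this ticket can be sketched as follows. This is an illustrative sketch only, not a Cassandra tool, and the class and method names are invented: the per-node loop runs `nodetool -h $NODE repair --incremental` and retries when a session fails to start because the previous node's anticompaction is still running. The command runner is injected so the retry logic itself is testable.

```java
// Illustrative sketch of the rolling incremental-repair loop described above.
// The repair runner is injected (in production it would shell out to
// `nodetool -h <node> repair --incremental`), so the retry logic is testable.
import java.util.function.Function;

public class RollingRepair {
    /**
     * Repairs each node in turn. `repair` returns the nodetool exit code;
     * a non-zero code is retried up to maxAttempts times (a real script
     * would sleep between attempts). Returns the number of nodes repaired.
     */
    static int repairAll(String[] nodes, Function<String, Integer> repair, int maxAttempts) {
        int repaired = 0;
        for (String node : nodes) {
            int attempts = 0;
            // retry while the repair fails to start, e.g. because a remote
            // anticompaction from the previous session has not yet finished
            while (repair.apply(node) != 0 && ++attempts < maxAttempts)
                ; // a real script would back off here before retrying
            if (attempts < maxAttempts)
                repaired++;
        }
        return repaired;
    }

    public static void main(String[] args) {
        // With a real runner this would spawn a process, e.g.:
        // new ProcessBuilder("nodetool", "-h", node, "repair", "--incremental")
        Function<String, Integer> alwaysOk = node -> 0;
        System.out.println(repairAll(new String[]{"127.0.0.1"}, alwaysOk, 5));
    }
}
```

The attached patch takes the complementary server-side approach: rather than making clients retry, it has the repair coordinator wait for remote anticompaction responses before returning.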
[2/3] cassandra git commit: cassandra-stress supports validation operations over user profiles
cassandra-stress supports validation operations over user profiles patch by benedict; reviewed by snazy for CASSANDRA-8773 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/3bee990c Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/3bee990c Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/3bee990c Branch: refs/heads/trunk Commit: 3bee990ca2e46bf0fd5742c56b5d00cc0566950b Parents: e6f0279 Author: Benedict Elliott Smith bened...@apache.org Authored: Mon May 4 18:01:38 2015 +0100 Committer: Benedict Elliott Smith bened...@apache.org Committed: Mon May 4 18:01:38 2015 +0100 -- CHANGES.txt | 1 + .../org/apache/cassandra/stress/Operation.java | 13 +- .../apache/cassandra/stress/StressAction.java | 4 +- .../apache/cassandra/stress/StressProfile.java | 25 +- .../stress/generate/PartitionGenerator.java | 9 + .../stress/generate/PartitionIterator.java | 166 +++-- .../apache/cassandra/stress/generate/Row.java | 15 +- .../SampledOpDistributionFactory.java | 26 +- .../operations/userdefined/SchemaInsert.java| 14 +- .../operations/userdefined/SchemaQuery.java | 7 +- .../operations/userdefined/SchemaStatement.java | 35 +- .../userdefined/ValidatingSchemaQuery.java | 359 +++ .../SettingsCommandPreDefinedMixed.java | 9 +- .../stress/settings/SettingsCommandUser.java| 9 +- .../stress/settings/ValidationType.java | 29 -- 15 files changed, 581 insertions(+), 140 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/3bee990c/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index e7689ab..3a2daa7 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.1.6 + * cassandra-stress supports validation operations over user profiles (CASSANDRA-8773) * Add support for rate limiting log messages (CASSANDRA-9029) * Log the partition key with tombstone warnings (CASSANDRA-8561) * Reduce runWithCompactionsDisabled poll interval to 1ms (CASSANDRA-9271) 
http://git-wip-us.apache.org/repos/asf/cassandra/blob/3bee990c/tools/stress/src/org/apache/cassandra/stress/Operation.java -- diff --git a/tools/stress/src/org/apache/cassandra/stress/Operation.java b/tools/stress/src/org/apache/cassandra/stress/Operation.java index f4ac5ee..1179f71 100644 --- a/tools/stress/src/org/apache/cassandra/stress/Operation.java +++ b/tools/stress/src/org/apache/cassandra/stress/Operation.java @@ -104,10 +104,7 @@ public abstract class Operation if (seed == null) break; -if (spec.useRatio == null) -success = partitionCache.get(i).reset(seed, spec.targetCount, isWrite()); -else -success = partitionCache.get(i).reset(seed, spec.useRatio.next(), isWrite()); +success = reset(seed, partitionCache.get(i)); } } partitionCount = i; @@ -119,6 +116,14 @@ public abstract class Operation return !partitions.isEmpty(); } +protected boolean reset(Seed seed, PartitionIterator iterator) +{ +if (spec.useRatio == null) +return iterator.reset(seed, spec.targetCount, isWrite()); +else +return iterator.reset(seed, spec.useRatio.next(), isWrite()); +} + public boolean isWrite() { return false; http://git-wip-us.apache.org/repos/asf/cassandra/blob/3bee990c/tools/stress/src/org/apache/cassandra/stress/StressAction.java -- diff --git a/tools/stress/src/org/apache/cassandra/stress/StressAction.java b/tools/stress/src/org/apache/cassandra/stress/StressAction.java index f906a55..158a278 100644 --- a/tools/stress/src/org/apache/cassandra/stress/StressAction.java +++ b/tools/stress/src/org/apache/cassandra/stress/StressAction.java @@ -88,7 +88,9 @@ public class StressAction implements Runnable // warmup - do 50k iterations; by default hotspot compiles methods after 10k invocations PrintStream warmupOutput = new PrintStream(new OutputStream() { @Override public void write(int b) throws IOException { } } ); int iterations = 5 * settings.node.nodes.size(); -int threads = 20; +int threads = 100; +if (iterations > settings.command.count && settings.command.count > 0) +return; if (settings.rate.maxThreads > 0) threads = Math.min(threads, settings.rate.maxThreads);
[1/3] cassandra git commit: cassandra-stress supports validation operations over user profiles
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 e6f027979 - 3bee990ca refs/heads/trunk 47964b766 - ed0840299 cassandra-stress supports validation operations over user profiles patch by benedict; reviewed by snazy for CASSANDRA-8773 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/3bee990c Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/3bee990c Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/3bee990c Branch: refs/heads/cassandra-2.1 Commit: 3bee990ca2e46bf0fd5742c56b5d00cc0566950b Parents: e6f0279 Author: Benedict Elliott Smith bened...@apache.org Authored: Mon May 4 18:01:38 2015 +0100 Committer: Benedict Elliott Smith bened...@apache.org Committed: Mon May 4 18:01:38 2015 +0100 -- CHANGES.txt | 1 + .../org/apache/cassandra/stress/Operation.java | 13 +- .../apache/cassandra/stress/StressAction.java | 4 +- .../apache/cassandra/stress/StressProfile.java | 25 +- .../stress/generate/PartitionGenerator.java | 9 + .../stress/generate/PartitionIterator.java | 166 +++-- .../apache/cassandra/stress/generate/Row.java | 15 +- .../SampledOpDistributionFactory.java | 26 +- .../operations/userdefined/SchemaInsert.java| 14 +- .../operations/userdefined/SchemaQuery.java | 7 +- .../operations/userdefined/SchemaStatement.java | 35 +- .../userdefined/ValidatingSchemaQuery.java | 359 +++ .../SettingsCommandPreDefinedMixed.java | 9 +- .../stress/settings/SettingsCommandUser.java| 9 +- .../stress/settings/ValidationType.java | 29 -- 15 files changed, 581 insertions(+), 140 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/3bee990c/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index e7689ab..3a2daa7 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.1.6 + * cassandra-stress supports validation operations over user profiles (CASSANDRA-8773) * Add support for rate limiting log messages (CASSANDRA-9029) * Log the partition key with 
tombstone warnings (CASSANDRA-8561) * Reduce runWithCompactionsDisabled poll interval to 1ms (CASSANDRA-9271) http://git-wip-us.apache.org/repos/asf/cassandra/blob/3bee990c/tools/stress/src/org/apache/cassandra/stress/Operation.java -- diff --git a/tools/stress/src/org/apache/cassandra/stress/Operation.java b/tools/stress/src/org/apache/cassandra/stress/Operation.java index f4ac5ee..1179f71 100644 --- a/tools/stress/src/org/apache/cassandra/stress/Operation.java +++ b/tools/stress/src/org/apache/cassandra/stress/Operation.java @@ -104,10 +104,7 @@ public abstract class Operation if (seed == null) break; -if (spec.useRatio == null) -success = partitionCache.get(i).reset(seed, spec.targetCount, isWrite()); -else -success = partitionCache.get(i).reset(seed, spec.useRatio.next(), isWrite()); +success = reset(seed, partitionCache.get(i)); } } partitionCount = i; @@ -119,6 +116,14 @@ public abstract class Operation return !partitions.isEmpty(); } +protected boolean reset(Seed seed, PartitionIterator iterator) +{ +if (spec.useRatio == null) +return iterator.reset(seed, spec.targetCount, isWrite()); +else +return iterator.reset(seed, spec.useRatio.next(), isWrite()); +} + public boolean isWrite() { return false; http://git-wip-us.apache.org/repos/asf/cassandra/blob/3bee990c/tools/stress/src/org/apache/cassandra/stress/StressAction.java -- diff --git a/tools/stress/src/org/apache/cassandra/stress/StressAction.java b/tools/stress/src/org/apache/cassandra/stress/StressAction.java index f906a55..158a278 100644 --- a/tools/stress/src/org/apache/cassandra/stress/StressAction.java +++ b/tools/stress/src/org/apache/cassandra/stress/StressAction.java @@ -88,7 +88,9 @@ public class StressAction implements Runnable // warmup - do 50k iterations; by default hotspot compiles methods after 10k invocations PrintStream warmupOutput = new PrintStream(new OutputStream() { @Override public void write(int b) throws IOException { } } ); int iterations = 5 * settings.node.nodes.size(); -int 
threads = 20; +int threads = 100; +if (iterations > settings.command.count && settings.command.count > 0) +return; if (settings.rate.maxThreads > 0) threads
[jira] [Commented] (CASSANDRA-9240) Performance issue after a restart
[ https://issues.apache.org/jira/browse/CASSANDRA-9240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526859#comment-14526859 ] Benedict commented on CASSANDRA-9240: - Well, that's easily fixed :) I've pushed an updated version with that changed Performance issue after a restart - Key: CASSANDRA-9240 URL: https://issues.apache.org/jira/browse/CASSANDRA-9240 Project: Cassandra Issue Type: Bug Reporter: Alan Boudreault Assignee: Benedict Priority: Minor Fix For: 3.x Attachments: Cassandra.snapshots.zip, cassandra_2.1.4-clientrequest-read.log, cassandra_2.1.4.log, cassandra_2.1.5-clientrequest-read.log, cassandra_2.1.5.log, cassandra_trunk-clientrequest-read.log, cassandra_trunk.log, cassandra_trunk_no_restart-clientrequest-read.log, cassandra_trunk_no_restart.log, issue.yaml, run_issue.sh, runs.log, trace_query.cql I have noticed a performance issue while I was working on compaction perf tests for CASSANDRA-7409. The performance for my use case is very bad after a restart. It is mostly a read performance issue but not strictly. I have attached my use case (see run_issue.sh and issue.yaml) and all test logs for 2.1.4, 2.1.5 and trunk: * 2.1.* are OK (although 2.1.4 seems to be better than 2.1.5?): ~6-7k ops/second and ~2-2.5k of read latency. * trunk is NOT OK: ~1.5-2k ops/second and 25-30k of read latency. * trunk is OK without a restart: ~ same perf than 2.1.4 and 2.1.5. EDIT: branch cassandra-2.1 is OK. I can help to bisect and/or profile on Monday if needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9167) Improve bloom-filter false-positive-ratio
[ https://issues.apache.org/jira/browse/CASSANDRA-9167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-9167: Reviewer: Benedict Improve bloom-filter false-positive-ratio - Key: CASSANDRA-9167 URL: https://issues.apache.org/jira/browse/CASSANDRA-9167 Project: Cassandra Issue Type: Improvement Reporter: Robert Stupp Assignee: Robert Stupp Priority: Minor Labels: perfomance {{org.apache.cassandra.utils.BloomCalculations}} performs some table lookups to calculate the bloom filter specification (size, # of hashes). Using the exact maths for that computation brings a better false-positive-ratio (the maths usually returns higher numbers for hash-counts). TL;DR increasing the number of hash-rounds brings a nice improvement. Finally it's a trade-off between CPU and I/O.
||false-positive-chance||elements||capacity||hash count new||false-positive-ratio new||hash count current||false-positive-ratio current||improvement
|0.1|1|50048|3|0.0848|3|0.0848|0
|0.1|10|500032|3|0.09203|3|0.09203|0
|0.1|100|564|3|0.0919|3|0.0919|0
|0.1|1000|5064|3|0.09182|3|0.09182|0
|0.1|1|50064|3|0.091874|3|0.091874|0
|0.01|1|100032|7|0.0092|5|0.0107|0.1630434783
|0.01|10|164|7|0.00818|5|0.00931|0.1381418093
|0.01|100|1064|7|0.008072|5|0.009405|0.1651387512
|0.01|1000|10064|7|0.008174|5|0.009375|0.146929288
|0.01|1|100064|7|0.008197|5|0.009428|0.150176894
|0.001|1|150080|10|0.0008|7|0.001|0.25
|0.001|10|1500032|10|0.0006|7|0.00094|0.57
|0.001|100|1564|10|0.000717|7|0.000991|0.3821478382
|0.001|1000|15064|10|0.000743|7|0.000992|0.33512786
|0.001|1|150064|10|0.000741|7|0.001002|0.3522267206
|0.0001|1|200064|13|0|10|0.0002|#DIV/0!
|0.0001|10|264|13|0.4|10|0.0001|1.5
|0.0001|100|2064|13|0.75|10|0.91|0.21
|0.0001|1000|20064|13|0.69|10|0.87|0.2608695652
|0.0001|1|200064|13|0.68|10|0.9|0.3235294118
If we decide to allow more hash-rounds, it could be nicely back-ported even to 2.0 without affecting existing sstables.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
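Assuming the "exact maths" above refers to standard bloom filter theory, the closed form is: for n elements in m bits, the optimal hash count is k = (m/n) ln 2, and the expected false-positive ratio is (1 - e^(-kn/m))^k. A minimal sketch of that computation (this is not Cassandra's actual BloomCalculations code):

```java
// Sketch of the closed-form bloom filter maths the ticket alludes to; the
// textbook formula, not Cassandra's actual BloomCalculations implementation.
// For n elements in m bits: optimal k = (m/n) * ln 2, and the expected
// false-positive ratio is (1 - e^(-k*n/m))^k.
public class BloomMath {
    /** Optimal number of hash rounds for m bits holding n elements. */
    static int optimalHashCount(long bits, long elements) {
        return Math.max(1, (int) Math.round((double) bits / elements * Math.log(2)));
    }

    /** Expected false-positive ratio for k hash rounds, m bits, n elements. */
    static double falsePositiveRatio(int k, long bits, long elements) {
        return Math.pow(1.0 - Math.exp(-(double) k * elements / bits), k);
    }

    public static void main(String[] args) {
        // e.g. 10000 elements in 100064 bits: the exact maths gives k = 7
        // with fp ~ 0.0082, versus ~ 0.0094 at the table-lookup's k = 5
        long m = 100064, n = 10000;
        int k = optimalHashCount(m, n);
        System.out.printf("k=%d fp=%.6f%n", k, falsePositiveRatio(k, m, n));
    }
}
```

This matches the pattern in the table: the exact computation returns a higher hash count than the lookup table, trading extra hashing CPU for fewer wasted disk reads.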
[jira] [Commented] (CASSANDRA-9167) Improve bloom-filter false-positive-ratio
[ https://issues.apache.org/jira/browse/CASSANDRA-9167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526864#comment-14526864 ] Benedict commented on CASSANDRA-9167: - [~snazy]: I will get to this and your other bloom filter ticket, at least before 3.0, just a bit swamped with other (bigger) reviews at the moment that are more directly blocking people. Improve bloom-filter false-positive-ratio - Key: CASSANDRA-9167 URL: https://issues.apache.org/jira/browse/CASSANDRA-9167 Project: Cassandra Issue Type: Improvement Reporter: Robert Stupp Assignee: Robert Stupp Priority: Minor Labels: perfomance {{org.apache.cassandra.utils.BloomCalculations}} performs some table lookups to calculate the bloom filter specification (size, # of hashes). Using the exact maths for that computation brings a better false-positive-ratio (the maths usually returns higher numbers for hash-counts). TL;DR increasing the number of hash-rounds brings a nice improvement. Finally it's a trade-off between CPU and I/O.
||false-positive-chance||elements||capacity||hash count new||false-positive-ratio new||hash count current||false-positive-ratio current||improvement
|0.1|1|50048|3|0.0848|3|0.0848|0
|0.1|10|500032|3|0.09203|3|0.09203|0
|0.1|100|564|3|0.0919|3|0.0919|0
|0.1|1000|5064|3|0.09182|3|0.09182|0
|0.1|1|50064|3|0.091874|3|0.091874|0
|0.01|1|100032|7|0.0092|5|0.0107|0.1630434783
|0.01|10|164|7|0.00818|5|0.00931|0.1381418093
|0.01|100|1064|7|0.008072|5|0.009405|0.1651387512
|0.01|1000|10064|7|0.008174|5|0.009375|0.146929288
|0.01|1|100064|7|0.008197|5|0.009428|0.150176894
|0.001|1|150080|10|0.0008|7|0.001|0.25
|0.001|10|1500032|10|0.0006|7|0.00094|0.57
|0.001|100|1564|10|0.000717|7|0.000991|0.3821478382
|0.001|1000|15064|10|0.000743|7|0.000992|0.33512786
|0.001|1|150064|10|0.000741|7|0.001002|0.3522267206
|0.0001|1|200064|13|0|10|0.0002|#DIV/0!
|0.0001|10|264|13|0.4|10|0.0001|1.5
|0.0001|100|2064|13|0.75|10|0.91|0.21
|0.0001|1000|20064|13|0.69|10|0.87|0.2608695652
|0.0001|1|200064|13|0.68|10|0.9|0.3235294118
If we decide to allow more hash-rounds, it could be nicely back-ported even to 2.0 without affecting existing sstables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[2/3] cassandra git commit: Fix error when dropping table during compaction
Fix error when dropping table during compaction patch by benedict; reviewed by tjake CASSANDRA-9251 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/369966a2 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/369966a2 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/369966a2 Branch: refs/heads/trunk Commit: 369966a2af65aa1d8e8248307ebd187fccacbd8e Parents: 3bee990 Author: Benedict Elliott Smith bened...@apache.org Authored: Mon May 4 18:31:14 2015 +0100 Committer: Benedict Elliott Smith bened...@apache.org Committed: Mon May 4 18:31:14 2015 +0100 -- CHANGES.txt | 1 + .../apache/cassandra/db/ColumnFamilyStore.java | 24 ++--- .../org/apache/cassandra/db/DataTracker.java| 15 ++-- src/java/org/apache/cassandra/db/Keyspace.java | 4 +++ .../db/compaction/CompactionManager.java| 22 .../cassandra/cql3/CrcCheckChanceTest.java | 36 6 files changed, 69 insertions(+), 33 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/369966a2/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 3a2daa7..64d0760 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.1.6 + * Fix error when dropping table during compaction (CASSANDRA-9251) * cassandra-stress supports validation operations over user profiles (CASSANDRA-8773) * Add support for rate limiting log messages (CASSANDRA-9029) * Log the partition key with tombstone warnings (CASSANDRA-8561) http://git-wip-us.apache.org/repos/asf/cassandra/blob/369966a2/src/java/org/apache/cassandra/db/ColumnFamilyStore.java -- diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java index bb23332..4438afd 100644 --- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java +++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java @@ -383,6 +383,7 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean /** call when dropping or renaming a 
CF. Performs mbean housekeeping and invalidates CFS to other operations */ public void invalidate() { +// disable and cancel in-progress compactions before invalidating valid = false; try @@ -397,7 +398,6 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean } latencyCalculator.cancel(false); -compactionStrategyWrapper.shutdown(); SystemKeyspace.removeTruncationRecord(metadata.cfId); data.unreferenceSSTables(); indexManager.invalidate(); @@ -2566,26 +2566,8 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean try { // interrupt in-progress compactions -Function<ColumnFamilyStore, CFMetaData> f = new Function<ColumnFamilyStore, CFMetaData>() -{ -public CFMetaData apply(ColumnFamilyStore cfs) -{ -return cfs.metadata; -} -}; -Iterable<CFMetaData> allMetadata = Iterables.transform(selfWithIndexes, f); -CompactionManager.instance.interruptCompactionFor(allMetadata, interruptValidation); - -// wait for the interruption to be recognized -long start = System.nanoTime(); -long delay = TimeUnit.MINUTES.toNanos(1); -while (System.nanoTime() - start < delay) -{ -if (CompactionManager.instance.isCompacting(selfWithIndexes)) -Uninterruptibles.sleepUninterruptibly(1, TimeUnit.MILLISECONDS); -else -break; -} + CompactionManager.instance.interruptCompactionForCFs(selfWithIndexes, interruptValidation); +CompactionManager.instance.waitForCessation(selfWithIndexes); // doublecheck that we finished, instead of timing out for (ColumnFamilyStore cfs : selfWithIndexes) http://git-wip-us.apache.org/repos/asf/cassandra/blob/369966a2/src/java/org/apache/cassandra/db/DataTracker.java -- diff --git a/src/java/org/apache/cassandra/db/DataTracker.java b/src/java/org/apache/cassandra/db/DataTracker.java index 757b48a..a520dcd 100644 --- a/src/java/org/apache/cassandra/db/DataTracker.java +++ b/src/java/org/apache/cassandra/db/DataTracker.java @@ -229,17 +229,6 @@ public class DataTracker */ public void
[1/3] cassandra git commit: Fix error when dropping table during compaction
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 3bee990ca - 369966a2a refs/heads/trunk ed0840299 - 54f4984f5 Fix error when dropping table during compaction patch by benedict; reviewed by tjake CASSANDRA-9251 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/369966a2 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/369966a2 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/369966a2 Branch: refs/heads/cassandra-2.1 Commit: 369966a2af65aa1d8e8248307ebd187fccacbd8e Parents: 3bee990 Author: Benedict Elliott Smith bened...@apache.org Authored: Mon May 4 18:31:14 2015 +0100 Committer: Benedict Elliott Smith bened...@apache.org Committed: Mon May 4 18:31:14 2015 +0100 -- CHANGES.txt | 1 + .../apache/cassandra/db/ColumnFamilyStore.java | 24 ++--- .../org/apache/cassandra/db/DataTracker.java| 15 ++-- src/java/org/apache/cassandra/db/Keyspace.java | 4 +++ .../db/compaction/CompactionManager.java| 22 .../cassandra/cql3/CrcCheckChanceTest.java | 36 6 files changed, 69 insertions(+), 33 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/369966a2/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 3a2daa7..64d0760 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.1.6 + * Fix error when dropping table during compaction (CASSANDRA-9251) * cassandra-stress supports validation operations over user profiles (CASSANDRA-8773) * Add support for rate limiting log messages (CASSANDRA-9029) * Log the partition key with tombstone warnings (CASSANDRA-8561) http://git-wip-us.apache.org/repos/asf/cassandra/blob/369966a2/src/java/org/apache/cassandra/db/ColumnFamilyStore.java -- diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java index bb23332..4438afd 100644 --- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java +++ 
b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java @@ -383,6 +383,7 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean /** call when dropping or renaming a CF. Performs mbean housekeeping and invalidates CFS to other operations */ public void invalidate() { +// disable and cancel in-progress compactions before invalidating valid = false; try @@ -397,7 +398,6 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean } latencyCalculator.cancel(false); -compactionStrategyWrapper.shutdown(); SystemKeyspace.removeTruncationRecord(metadata.cfId); data.unreferenceSSTables(); indexManager.invalidate(); @@ -2566,26 +2566,8 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean try { // interrupt in-progress compactions -Function<ColumnFamilyStore, CFMetaData> f = new Function<ColumnFamilyStore, CFMetaData>() -{ -public CFMetaData apply(ColumnFamilyStore cfs) -{ -return cfs.metadata; -} -}; -Iterable<CFMetaData> allMetadata = Iterables.transform(selfWithIndexes, f); -CompactionManager.instance.interruptCompactionFor(allMetadata, interruptValidation); - -// wait for the interruption to be recognized -long start = System.nanoTime(); -long delay = TimeUnit.MINUTES.toNanos(1); -while (System.nanoTime() - start < delay) -{ -if (CompactionManager.instance.isCompacting(selfWithIndexes)) -Uninterruptibles.sleepUninterruptibly(1, TimeUnit.MILLISECONDS); -else -break; -} + CompactionManager.instance.interruptCompactionForCFs(selfWithIndexes, interruptValidation); +CompactionManager.instance.waitForCessation(selfWithIndexes); // doublecheck that we finished, instead of timing out for (ColumnFamilyStore cfs : selfWithIndexes) http://git-wip-us.apache.org/repos/asf/cassandra/blob/369966a2/src/java/org/apache/cassandra/db/DataTracker.java -- diff --git a/src/java/org/apache/cassandra/db/DataTracker.java b/src/java/org/apache/cassandra/db/DataTracker.java index 757b48a..a520dcd 100644 --- a/src/java/org/apache/cassandra/db/DataTracker.java +++
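The inline wait loop this patch removes from ColumnFamilyStore (now factored into CompactionManager.waitForCessation) is an instance of the common poll-with-deadline pattern. A standalone sketch of that pattern, with invented names rather than Cassandra's actual API:

```java
// Standalone sketch of the poll-with-deadline pattern used by the wait loop
// this patch factors out into CompactionManager.waitForCessation. The names
// here are invented for illustration; they are not Cassandra's API.
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;
import java.util.function.BooleanSupplier;

public class Cessation {
    /**
     * Polls `stillBusy` until it reports false or `deadlineNanos` elapses.
     * Returns true if the condition ceased within the deadline.
     */
    static boolean waitForCessation(BooleanSupplier stillBusy, long deadlineNanos, long pollNanos) {
        long start = System.nanoTime();
        while (System.nanoTime() - start < deadlineNanos) {
            if (!stillBusy.getAsBoolean())
                return true;                       // condition ceased in time
            LockSupport.parkNanos(pollNanos);      // back off, then re-check
        }
        return !stillBusy.getAsBoolean();          // double-check rather than assume timeout
    }

    public static void main(String[] args) {
        boolean ceased = waitForCessation(() -> false,
                TimeUnit.SECONDS.toNanos(1), TimeUnit.MILLISECONDS.toNanos(1));
        System.out.println("ceased=" + ceased);
    }
}
```

Note the final double-check after the deadline, mirroring the "doublecheck that we finished, instead of timing out" comment in the original code: the condition may have ceased between the last poll and the deadline expiring.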
[3/3] cassandra git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/54f4984f Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/54f4984f Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/54f4984f Branch: refs/heads/trunk Commit: 54f4984f5966bba66bbacf0739695c6c266870a0 Parents: ed08402 369966a Author: Benedict Elliott Smith bened...@apache.org Authored: Mon May 4 18:31:31 2015 +0100 Committer: Benedict Elliott Smith bened...@apache.org Committed: Mon May 4 18:31:31 2015 +0100 -- CHANGES.txt | 1 + .../apache/cassandra/db/ColumnFamilyStore.java | 24 ++--- .../org/apache/cassandra/db/DataTracker.java| 15 ++-- src/java/org/apache/cassandra/db/Keyspace.java | 4 +++ .../db/compaction/CompactionManager.java| 22 .../cassandra/cql3/CrcCheckChanceTest.java | 36 6 files changed, 69 insertions(+), 33 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/54f4984f/CHANGES.txt -- diff --cc CHANGES.txt index ddfd174,64d0760..c0c209d --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,103 -1,5 +1,104 @@@ +3.0 + * Disable memory mapping of hsperfdata file for JVM statistics (CASSANDRA-9242) + * Add pre-startup checks to detect potential incompatibilities (CASSANDRA-8049) + * Distinguish between null and unset in protocol v4 (CASSANDRA-7304) + * Add user/role permissions for user-defined functions (CASSANDRA-7557) + * Allow cassandra config to be updated to restart daemon without unloading classes (CASSANDRA-9046) + * Don't initialize compaction writer before checking if iter is empty (CASSANDRA-9117) + * Don't execute any functions at prepare-time (CASSANDRA-9037) + * Share file handles between all instances of a SegmentedFile (CASSANDRA-8893) + * Make it possible to major compact LCS (CASSANDRA-7272) + * Make FunctionExecutionException extend RequestExecutionException + (CASSANDRA-9055) + * Add support for SELECT JSON, INSERT JSON syntax and new toJson(), 
fromJson() + functions (CASSANDRA-7970) + * Optimise max purgeable timestamp calculation in compaction (CASSANDRA-8920) + * Constrain internode message buffer sizes, and improve IO class hierarchy (CASSANDRA-8670) + * New tool added to validate all sstables in a node (CASSANDRA-5791) + * Push notification when tracing completes for an operation (CASSANDRA-7807) + * Delay node up and node added notifications until native protocol server is started (CASSANDRA-8236) + * Compressed Commit Log (CASSANDRA-6809) + * Optimise IntervalTree (CASSANDRA-8988) + * Add a key-value payload for third party usage (CASSANDRA-8553, 9212) + * Bump metrics-reporter-config dependency for metrics 3.0 (CASSANDRA-8149) + * Partition intra-cluster message streams by size, not type (CASSANDRA-8789) + * Add WriteFailureException to native protocol, notify coordinator of + write failures (CASSANDRA-8592) + * Convert SequentialWriter to nio (CASSANDRA-8709) + * Add role based access control (CASSANDRA-7653, 8650, 7216, 8760, 8849, 8761, 8850) + * Record client ip address in tracing sessions (CASSANDRA-8162) + * Indicate partition key columns in response metadata for prepared + statements (CASSANDRA-7660) + * Merge UUIDType and TimeUUIDType parse logic (CASSANDRA-8759) + * Avoid memory allocation when searching index summary (CASSANDRA-8793) + * Optimise (Time)?UUIDType Comparisons (CASSANDRA-8730) + * Make CRC32Ex into a separate maven dependency (CASSANDRA-8836) + * Use preloaded jemalloc w/ Unsafe (CASSANDRA-8714) + * Avoid accessing partitioner through StorageProxy (CASSANDRA-8244, 8268) + * Upgrade Metrics library and remove depricated metrics (CASSANDRA-5657) + * Serializing Row cache alternative, fully off heap (CASSANDRA-7438) + * Duplicate rows returned when in clause has repeated values (CASSANDRA-6707) + * Make CassandraException unchecked, extend RuntimeException (CASSANDRA-8560) + * Support direct buffer decompression for reads (CASSANDRA-8464) + * DirectByteBuffer compatible LZ4 
methods (CASSANDRA-7039) + * Group sstables for anticompaction correctly (CASSANDRA-8578) + * Add ReadFailureException to native protocol, respond + immediately when replicas encounter errors while handling + a read request (CASSANDRA-7886) + * Switch CommitLogSegment from RandomAccessFile to nio (CASSANDRA-8308) + * Allow mixing token and partition key restrictions (CASSANDRA-7016) + * Support index key/value entries on map collections (CASSANDRA-8473) + * Modernize schema tables (CASSANDRA-8261) + * Support for user-defined aggregation functions (CASSANDRA-8053) + * Fix NPE in SelectStatement with
[jira] [Assigned] (CASSANDRA-9293) Unit tests should fail if any LEAK DETECTED errors are printed
[ https://issues.apache.org/jira/browse/CASSANDRA-9293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan McGuire reassigned CASSANDRA-9293: --- Assignee: Philip Thompson (was: Ryan McGuire) Unit tests should fail if any LEAK DETECTED errors are printed -- Key: CASSANDRA-9293 URL: https://issues.apache.org/jira/browse/CASSANDRA-9293 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Philip Thompson We shouldn't depend on dtests (which have error log monitoring) to inform us of these problems - they should be caught by unit tests, which may also cover different failure conditions (besides being faster). There are a couple of ways we could do this, but probably the easiest is to add a static flag that is set to true if we ever see a leak (in Ref), and to just assert that this is false at the end of every test. [~enigmacurry] is this something TE can help with? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
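The proposal above (a static flag set whenever a leak is detected, asserted clean at the end of every test) can be sketched as follows. The names are hypothetical, not Cassandra's actual Ref internals:

```java
// Hypothetical sketch of the approach proposed in CASSANDRA-9293: a static
// flag that is set whenever a leak is detected, and asserted clean at the end
// of every unit test. Names are illustrative; the real hook would live in Ref.
import java.util.concurrent.atomic.AtomicBoolean;

public class LeakDetector {
    private static final AtomicBoolean LEAK_SEEN = new AtomicBoolean(false);

    /** Called by the reference-tracking machinery on a detected leak. */
    static void reportLeak(String resource) {
        System.err.println("LEAK DETECTED: " + resource);
        LEAK_SEEN.set(true);
    }

    /** Call from a test teardown hook; fails the test if any leak occurred. */
    static void assertNoLeaks() {
        if (LEAK_SEEN.getAndSet(false)) // reset so one leak fails only one test
            throw new AssertionError("LEAK DETECTED during test run");
    }

    public static void main(String[] args) {
        assertNoLeaks(); // clean run: no error
        System.out.println("no leaks");
    }
}
```

In a JUnit suite, assertNoLeaks() would be invoked from an @After (or @AfterClass) hook shared by all tests, so any "LEAK DETECTED" occurrence fails the test that produced it rather than surfacing only in dtest log monitoring.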
[jira] [Commented] (CASSANDRA-8150) Revaluate Default JVM tuning parameters
[ https://issues.apache.org/jira/browse/CASSANDRA-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526919#comment-14526919 ] Anuj commented on CASSANDRA-8150: - We have write heavy workload and used to face promotion failures/long gc pauses with Cassandra 2.0.x. I am not into code yet but I think that memtable and compaction related objects have mid-life and write heavy workload is not suitable for generation collection by default. So, we tuned JVM to make sure that minimum objects are promoted to Old Gen and achieved great success in that: MAX_HEAP_SIZE=12G HEAP_NEWSIZE=3G -XX:SurvivorRatio=2 -XX:MaxTenuringThreshold=20 -XX:CMSInitiatingOccupancyFraction=70 JVM_OPTS=$JVM_OPTS -XX:ConcGCThreads=20 JVM_OPTS=$JVM_OPTS -XX:+UnlockDiagnosticVMOptions JVM_OPTS=$JVM_OPTS -XX:+UseGCTaskAffinity JVM_OPTS=$JVM_OPTS -XX:+BindGCTaskThreadsToCPUs JVM_OPTS=$JVM_OPTS -XX:ParGCCardsPerStrideChunk=32768 JVM_OPTS=$JVM_OPTS -XX:+CMSScavengeBeforeRemark JVM_OPTS=$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=3 JVM_OPTS=$JVM_OPTS -XX:CMSWaitDuration=2000 JVM_OPTS=$JVM_OPTS -XX:+CMSEdenChunksRecordAlways JVM_OPTS=$JVM_OPTS -XX:+CMSParallelInitialMarkEnabled JVM_OPTS=$JVM_OPTS -XX:-UseBiasedLocking We also think that default total_space_in_mb=1/4 heap is too much for write heavy loads. By default, young gen is also 1/4 heap.We reduced it to 1000mb in order to make sure that memtable related objects dont stay in memory for too long. Combining this with SurvivorRatio=2 and MaxTenuringThreshold=20 did the job well. GC was very consistent. No Full GC observed. Environment: 3 node cluster with each node having 24cores,64G RAM and SSDs in RAID5. We are making around 12k writes/sec in 5 cf (one with 4 sec index) and 2300 reads/sec on each node of 3 node cluster. 
2 CFs have wide rows with max data of around 100mb per row Revaluate Default JVM tuning parameters --- Key: CASSANDRA-8150 URL: https://issues.apache.org/jira/browse/CASSANDRA-8150 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Matt Stump Assignee: Ryan McGuire Attachments: upload.png It's been found that the old twitter recommendations of 100m per core up to 800m is harmful and should no longer be used. Instead the formula used should be 1/3 or 1/4 max heap with a max of 2G. 1/3 or 1/4 is debatable and I'm open to suggestions. If I were to hazard a guess 1/3 is probably better for releases greater than 2.1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-8150) Revaluate Default JVM tuning parameters
[ https://issues.apache.org/jira/browse/CASSANDRA-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526919#comment-14526919 ] Anuj edited comment on CASSANDRA-8150 at 5/4/15 5:48 PM: - We have a write-heavy workload and used to face promotion failures/long GC pauses with Cassandra 2.0.x. I am not into the code yet, but I think that memtable and compaction related objects have mid-life, and a write-heavy workload is not suitable for generational collection by default. So we tuned the JVM to make sure that a minimum of objects are promoted to Old Gen, and achieved great success with that:
MAX_HEAP_SIZE=12G
HEAP_NEWSIZE=3G
-XX:SurvivorRatio=2
-XX:MaxTenuringThreshold=20
-XX:CMSInitiatingOccupancyFraction=70
JVM_OPTS=$JVM_OPTS -XX:ConcGCThreads=20
JVM_OPTS=$JVM_OPTS -XX:+UnlockDiagnosticVMOptions
JVM_OPTS=$JVM_OPTS -XX:+UseGCTaskAffinity
JVM_OPTS=$JVM_OPTS -XX:+BindGCTaskThreadsToCPUs
JVM_OPTS=$JVM_OPTS -XX:ParGCCardsPerStrideChunk=32768
JVM_OPTS=$JVM_OPTS -XX:+CMSScavengeBeforeRemark
JVM_OPTS=$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=3
JVM_OPTS=$JVM_OPTS -XX:CMSWaitDuration=2000
JVM_OPTS=$JVM_OPTS -XX:+CMSEdenChunksRecordAlways
JVM_OPTS=$JVM_OPTS -XX:+CMSParallelInitialMarkEnabled
JVM_OPTS=$JVM_OPTS -XX:-UseBiasedLocking
We also think that the default total_memtable_space_in_mb = 1/4 heap is too much for write-heavy loads. By default, young gen is also 1/4 heap. We reduced it to 1000mb in order to make sure that memtable related objects don't stay in memory for too long. Combining this with SurvivorRatio=2 and MaxTenuringThreshold=20 did the job well. GC was very consistent. No Full GC observed. Environment: 3 node cluster with each node having 24 cores, 64G RAM and SSDs in RAID5. We are making around 12k writes/sec into 5 CFs (one with 4 sec index) and 2300 reads/sec on each node of the 3 node cluster. 
2 CFs have wide rows with max data of around 100mb per row was (Author: eanujwa): We have write heavy workload and used to face promotion failures/long gc pauses with Cassandra 2.0.x. I am not into code yet but I think that memtable and compaction related objects have mid-life and write heavy workload is not suitable for generation collection by default. So, we tuned JVM to make sure that minimum objects are promoted to Old Gen and achieved great success in that: MAX_HEAP_SIZE=12G HEAP_NEWSIZE=3G -XX:SurvivorRatio=2 -XX:MaxTenuringThreshold=20 -XX:CMSInitiatingOccupancyFraction=70 JVM_OPTS=$JVM_OPTS -XX:ConcGCThreads=20 JVM_OPTS=$JVM_OPTS -XX:+UnlockDiagnosticVMOptions JVM_OPTS=$JVM_OPTS -XX:+UseGCTaskAffinity JVM_OPTS=$JVM_OPTS -XX:+BindGCTaskThreadsToCPUs JVM_OPTS=$JVM_OPTS -XX:ParGCCardsPerStrideChunk=32768 JVM_OPTS=$JVM_OPTS -XX:+CMSScavengeBeforeRemark JVM_OPTS=$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=3 JVM_OPTS=$JVM_OPTS -XX:CMSWaitDuration=2000 JVM_OPTS=$JVM_OPTS -XX:+CMSEdenChunksRecordAlways JVM_OPTS=$JVM_OPTS -XX:+CMSParallelInitialMarkEnabled JVM_OPTS=$JVM_OPTS -XX:-UseBiasedLocking We also think that default total_space_in_mb=1/4 heap is too much for write heavy loads. By default, young gen is also 1/4 heap.We reduced it to 1000mb in order to make sure that memtable related objects dont stay in memory for too long. Combining this with SurvivorRatio=2 and MaxTenuringThreshold=20 did the job well. GC was very consistent. No Full GC observed. Environment: 3 node cluster with each node having 24cores,64G RAM and SSDs in RAID5. We are making around 12k writes/sec in 5 cf (one with 4 sec index) and 2300 reads/sec on each node of 3 node cluster. 
2 CFs have wide rows with max data of around 100mb per row
[jira] [Commented] (CASSANDRA-8150) Revaluate Default JVM tuning parameters
[ https://issues.apache.org/jira/browse/CASSANDRA-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526923#comment-14526923 ] Hans van der Linde commented on CASSANDRA-8150: --- Dear sender, I am on vacation and will be back in the office on Tuesday 12-05-2015. During this period I have no access to my email and your email will not be forwarded. For urgent matters regarding GRTC / RTPE contact Peter v/d Koolwijk (peter.van.de.koolw...@ing.nl / 06-54660211) or alternatively my manager Coos v/d Berg (coos.van.den.b...@ing.nl / 06-22018780). Best regards, Hans van der Linde
[jira] [Issue Comment Deleted] (CASSANDRA-8150) Revaluate Default JVM tuning parameters
[ https://issues.apache.org/jira/browse/CASSANDRA-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan McGuire updated CASSANDRA-8150: Comment: was deleted (was: Dear sender, I am on vacation and will be back in the office on Tuesday 12-05-2015. During this period I have no access to my email and your email will not be forwarded. For urgent matters regarding GRTC / RTPE contact Peter v/d Koolwijk (peter.van.de.koolw...@ing.nl / 06-54660211) or alternatively my manager Coos v/d Berg (coos.van.den.b...@ing.nl / 06-22018780). Best regards, Hans van der Linde)
[jira] [Commented] (CASSANDRA-9294) Streaming errors should log the root cause
[ https://issues.apache.org/jira/browse/CASSANDRA-9294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526938#comment-14526938 ] Jose Martinez Poblete commented on CASSANDRA-9294: -- It would be very useful to have this, instead of having to enable TCP logging after the fact Streaming errors should log the root cause -- Key: CASSANDRA-9294 URL: https://issues.apache.org/jira/browse/CASSANDRA-9294 Project: Cassandra Issue Type: Bug Reporter: Brandon Williams Assignee: Yuki Morishita Fix For: 2.0.x Currently, when a streaming error occurs all you get is something like: {noformat} java.util.concurrent.ExecutionException: org.apache.cassandra.streaming.StreamException: Stream failed {noformat} Instead, we should log the root cause. Was the connection reset by peer, did it time out, etc? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
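The fix being requested here amounts to unwrapping the exception chain before logging. A self-contained sketch of that idea (illustrative only, not the committed patch):

```java
// Illustrative sketch (not the committed patch): walk the cause chain so that
// "Stream failed" is reported together with the underlying root cause.
public class RootCauseLogging
{
    // Returns the deepest non-null cause, guarding against self-referential chains.
    public static Throwable rootCause(Throwable t)
    {
        Throwable cur = t;
        while (cur.getCause() != null && cur.getCause() != cur)
            cur = cur.getCause();
        return cur;
    }

    public static void main(String[] args)
    {
        Throwable streamFailure = new RuntimeException("Stream failed",
                new java.io.IOException("Connection reset by peer"));
        // Instead of logging only streamFailure, also surface the root cause:
        System.out.println(streamFailure.getMessage() + " (root cause: "
                + rootCause(streamFailure).getMessage() + ")");
    }
}
```

With this, the log line answers the ticket's question directly: reset by peer, timeout, and so on, without TCP-level capture.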
[jira] [Created] (CASSANDRA-9295) Streaming not holding on to ref's long enough.
Jeremiah Jordan created CASSANDRA-9295: -- Summary: Streaming not holding on to ref's long enough. Key: CASSANDRA-9295 URL: https://issues.apache.org/jira/browse/CASSANDRA-9295 Project: Cassandra Issue Type: Bug Reporter: Jeremiah Jordan Fix For: 2.1.x While doing some testing around adding/removing nodes under load with cassandra-2.1 head as of a few days ago (after was 2.1.5 tagged) I am seeing stream out errors with file not found exceptions. The file in question just finished being compacted into a new file a few lines earlier in the log. Seems that streaming isn't holding onto Ref's correctly for the stuff in the stream plans. I also see a corrupt sstable exception for the file the missing file was compacted to. Trimmed logs with just the compaction/streaming related stuff: You can see the stream plan is initiated in between the compaction starting, and the compaction finishing. {noformat} INFO [MemtableFlushWriter:3] 2015-05-04 16:08:21,239 Memtable.java:380 - Completed flushing /mnt/cass_data_disks/data1/keyspace1/standard1-49f17b30f27711e4a438775021e2cd7f/keyspace1-standard1-ka-4-Data.db (60666088 bytes) for commitlog position ReplayPosition(segmentId=1430755416941, position=32294797) INFO [CompactionExecutor:4] 2015-05-04 16:08:40,856 CompactionTask.java:140 - Compacting [SSTableReader(path='/mnt/cass_data_disks/data1/keyspace1/standard1-49f17b30f27711e4a438775021e2cd7f/keyspace1-standard1-ka-4-Data.db'), SSTableReader(path='/mnt/cass_data_disks/data1/keyspace1/standard1-49f17b30f27711e4a438775021e2cd7f/keyspace1-standard1-ka-3-Data.db')] INFO [STREAM-INIT-/10.240.213.56:53190] 2015-05-04 16:09:31,047 StreamResultFuture.java:109 - [Stream #f261c040-f277-11e4-a070-d126f0416bc9 ID#0] Creating new streaming plan for Rebuild INFO [STREAM-INIT-/10.240.213.56:53190] 2015-05-04 16:09:31,238 StreamResultFuture.java:116 - [Stream #f261c040-f277-11e4-a070-d126f0416bc9, ID#0] Received streaming plan for Rebuild INFO [STREAM-INIT-/10.240.213.56:53192] 2015-05-04 
16:09:31,249 StreamResultFuture.java:116 - [Stream #f261c040-f277-11e4-a070-d126f0416bc9, ID#0] Received streaming plan for Rebuild INFO [STREAM-IN-/10.240.213.56] 2015-05-04 16:09:31,353 ColumnFamilyStore.java:882 - Enqueuing flush of standard1: 91768068 (19%) on-heap, 0 (0%) off-heap INFO [STREAM-IN-/10.240.213.56] 2015-05-04 16:09:37,425 ColumnFamilyStore.java:882 - Enqueuing flush of solr: 10012689 (2%) on-heap, 0 (0%) off-heap INFO [STREAM-IN-/10.240.213.56] 2015-05-04 16:09:38,073 StreamResultFuture.java:166 - [Stream #f261c040-f277-11e4-a070-d126f0416bc9 ID#0] Prepare completed. Receiving 0 files(0 bytes), sending 6 files(284288285 bytes) INFO [CompactionExecutor:4] 2015-05-04 16:10:11,047 CompactionTask.java:270 - Compacted 2 sstables to [/mnt/cass_data_disks/data1/keyspace1/standard1-49f17b30f27711e4a438775021e2cd7f/keyspace1-standard1-ka-5,/mnt/cass_data_disks/data1/keyspace1/standard1-49f17b30f27711e4a438775021e2cd7f/keyspace1-standard1-ka-8,]. 182,162,816 bytes to 182,162,816 (~100% of original) in 90,188ms = 1.926243MB/s. 339,856 total partitions merged to 339,856. 
Partition merge counts were {1:339856, } ERROR [STREAM-OUT-/10.240.213.56] 2015-05-04 16:10:25,169 StreamSession.java:477 - [Stream #f261c040-f277-11e4-a070-d126f0416bc9] Streaming error occurred java.io.IOException: Corrupted SSTable : /mnt/cass_data_disks/data1/keyspace1/standard1-49f17b30f27711e4a438775021e2cd7f/keyspace1-standard1-ka-5-Data.db at org.apache.cassandra.io.util.DataIntegrityMetadata$ChecksumValidator.validate(DataIntegrityMetadata.java:79) ~[cassandra-all-2.1.5.426.jar:2.1.5.426] at org.apache.cassandra.streaming.StreamWriter.write(StreamWriter.java:149) ~[cassandra-all-2.1.5.426.jar:2.1.5.426] at org.apache.cassandra.streaming.StreamWriter.write(StreamWriter.java:102) ~[cassandra-all-2.1.5.426.jar:2.1.5.426] at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:58) ~[cassandra-all-2.1.5.426.jar:2.1.5.426] at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:42) ~[cassandra-all-2.1.5.426.jar:2.1.5.426] at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45) ~[cassandra-all-2.1.5.426.jar:2.1.5.426] at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:346) [cassandra-all-2.1.5.426.jar:2.1.5.426] at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:318) [cassandra-all-2.1.5.426.jar:2.1.5.426] at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40] INFO [STREAM-OUT-/10.240.213.56] 2015-05-04 16:10:25,232 StreamResultFuture.java:180 - [Stream
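The report's diagnosis is that streaming is not retaining references for the sstables in its stream plan, so compaction can delete a file mid-stream. A minimal sketch of the reference-counting invariant at stake (illustration only; Cassandra's actual Ref/SSTableReader machinery is more involved):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch of the invariant the report says is being violated: a file
// may only be deleted once every acquired reference, including the stream
// plan's, has been released. Illustration only, not Cassandra's Ref class.
public class RefGuard
{
    private final AtomicInteger refs = new AtomicInteger(1); // 1 = the owner's reference
    private volatile boolean deleted = false;

    // A stream plan must successfully tryRef() before the file can be sent.
    public boolean tryRef()
    {
        while (true)
        {
            int n = refs.get();
            if (n == 0)
                return false; // all refs released; the file may already be gone
            if (refs.compareAndSet(n, n + 1))
                return true;
        }
    }

    // Each release matches one acquire; the last release permits deletion.
    public void release()
    {
        if (refs.decrementAndGet() == 0)
            deleted = true; // only now is it safe to delete the sstable
    }

    public boolean isDeleted()
    {
        return deleted;
    }
}
```

If the stream plan skips the tryRef()/release() pair, compaction's own release drops the count to zero while the file is still being sent, producing exactly the FileNotFound/corrupt-sstable symptoms in the log above.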
[jira] [Commented] (CASSANDRA-9282) Warn on unlogged batches
[ https://issues.apache.org/jira/browse/CASSANDRA-9282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526971#comment-14526971 ] T Jake Luciani commented on CASSANDRA-9282: --- Backported NoSpamLogger to 2.1 and put this change here https://github.com/tjake/cassandra/tree/9282-unloggedlog Warn on unlogged batches Key: CASSANDRA-9282 URL: https://issues.apache.org/jira/browse/CASSANDRA-9282 Project: Cassandra Issue Type: Bug Components: API Reporter: Jonathan Ellis Assignee: T Jake Luciani Fix For: 2.1.x At least until CASSANDRA-8303 is done and we can block them entirely, we should log a warning when unlogged batches across multiple partition keys are used. This could either be done by backporting NoSpamLogger and blindly logging every time, or we could add a threshold and warn when more than 10 keys are seen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
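The ticket proposes warning once an unlogged batch touches more than 10 partition keys. A sketch of that threshold check (class and method names are assumptions; the linked branch may implement it differently):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of the thresholded warning discussed in the ticket (names and the
// exact threshold are assumptions; the linked branch may differ): warn only
// when an unlogged batch spans more than 10 distinct partition keys.
public class UnloggedBatchCheck
{
    static final int WARN_THRESHOLD = 10;

    // partitionKeys holds one serialized partition key per batch statement.
    public static boolean shouldWarn(List<String> partitionKeys)
    {
        Set<String> distinct = new HashSet<>(partitionKeys);
        return distinct.size() > WARN_THRESHOLD;
    }
}
```

Counting distinct keys rather than statements keeps the legitimate single-partition case (many statements, one key) warning-free.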
[jira] [Updated] (CASSANDRA-9295) Streaming not holding on to ref's long enough.
[ https://issues.apache.org/jira/browse/CASSANDRA-9295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-9295: Attachment: 9295.debug.txt This might help debug if the problem is with early release
[jira] [Commented] (CASSANDRA-9283) Deprecate unlogged batches
[ https://issues.apache.org/jira/browse/CASSANDRA-9283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527003#comment-14527003 ] T Jake Luciani commented on CASSANDRA-9283: --- [~jbellis] does Deprecate == Remove UNLOGGED completely? Deprecate unlogged batches -- Key: CASSANDRA-9283 URL: https://issues.apache.org/jira/browse/CASSANDRA-9283 Project: Cassandra Issue Type: Bug Reporter: Jonathan Ellis Assignee: T Jake Luciani Fix For: 3.0 Officially mark unlogged batches deprecated. Note that the main good use case for unlogged batches, of multiple updates in a single partition key, is free when done as a logged batch. So really unlogged batches mainly serve as a honeypot to trick new users into abusing them in misguided bulk loading attempts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[3/4] cassandra git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/26249326 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/26249326 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/26249326 Branch: refs/heads/trunk Commit: 26249326688817b0c831a8a66d55ace6c9e1a10b Parents: 54f4984 1a0262f Author: Benedict Elliott Smith bened...@apache.org Authored: Mon May 4 19:36:31 2015 +0100 Committer: Benedict Elliott Smith bened...@apache.org Committed: Mon May 4 19:36:31 2015 +0100 -- .../apache/cassandra/cql3/CrcCheckChanceTest.java| 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) --
[1/4] cassandra git commit: fixup 9251 unit test
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 369966a2a -> 1a0262f24
  refs/heads/trunk 54f4984f5 -> 1dea6b020

fixup 9251 unit test

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1a0262f2 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1a0262f2 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1a0262f2 Branch: refs/heads/cassandra-2.1 Commit: 1a0262f2472450821178a4b61c0c1ecb0193f888 Parents: 369966a Author: Benedict Elliott Smith <bened...@apache.org> Authored: Mon May 4 19:36:25 2015 +0100 Committer: Benedict Elliott Smith <bened...@apache.org> Committed: Mon May 4 19:36:25 2015 +0100
--
 .../apache/cassandra/cql3/CrcCheckChanceTest.java | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1a0262f2/test/unit/org/apache/cassandra/cql3/CrcCheckChanceTest.java
--
diff --git a/test/unit/org/apache/cassandra/cql3/CrcCheckChanceTest.java b/test/unit/org/apache/cassandra/cql3/CrcCheckChanceTest.java
index cc803fb..f218c9d 100644
--- a/test/unit/org/apache/cassandra/cql3/CrcCheckChanceTest.java
+++ b/test/unit/org/apache/cassandra/cql3/CrcCheckChanceTest.java
@@ -18,12 +18,14 @@ package org.apache.cassandra.cql3;
 import java.util.List;
+import java.util.concurrent.ExecutionException;
 import java.util.concurrent.Future;
 import junit.framework.Assert;
 import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.db.ColumnFamilyStore;
 import org.apache.cassandra.db.Keyspace;
+import org.apache.cassandra.db.compaction.CompactionInterruptedException;
 import org.apache.cassandra.db.compaction.CompactionManager;
 import org.apache.cassandra.utils.FBUtilities;
@@ -138,11 +140,18 @@ public class CrcCheckChanceTest extends CQLTester
         }
         DatabaseDescriptor.setCompactionThroughputMbPerSec(1);
-        List<Future<?>> futures = CompactionManager.instance.submitMaximal(cfs, CompactionManager.GC_ALL);
+        List<Future<?>> futures = CompactionManager.instance.submitMaximal(cfs, CompactionManager.GC_ALL);
         execute("DROP TABLE %s");
-        FBUtilities.waitOnFutures(futures);
-
+        try
+        {
+            FBUtilities.waitOnFutures(futures);
+        }
+        catch (Throwable t)
+        {
+            if (!(t.getCause() instanceof ExecutionException) || !(t.getCause().getCause() instanceof CompactionInterruptedException))
+                throw t;
+        }
     }
 }
[4/4] cassandra git commit: fixup 9251 unit test
fixup 9251 unit test

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1dea6b02 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1dea6b02 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1dea6b02 Branch: refs/heads/trunk Commit: 1dea6b0201545c16d4b29af51ce1bb5e77fc02d2 Parents: 2624932 Author: Benedict Elliott Smith <bened...@apache.org> Authored: Mon May 4 19:38:53 2015 +0100 Committer: Benedict Elliott Smith <bened...@apache.org> Committed: Mon May 4 19:38:53 2015 +0100
--
 src/java/org/apache/cassandra/db/compaction/CompactionManager.java | 2 +-
 test/unit/org/apache/cassandra/cql3/CrcCheckChanceTest.java | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1dea6b02/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index 911926a..35e288d 100644
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@ -1308,7 +1308,7 @@ public class CompactionManager implements CompactionManagerMBean
         return executor.submit(runnable);
     }
-    static int getDefaultGcBefore(ColumnFamilyStore cfs)
+    public static int getDefaultGcBefore(ColumnFamilyStore cfs)
     {
         // 2ndary indexes have ExpiringColumns too, so we need to purge tombstones deleted before now. We do not need to
         // add any GcGrace however since 2ndary indexes are local to a node.

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1dea6b02/test/unit/org/apache/cassandra/cql3/CrcCheckChanceTest.java
--
diff --git a/test/unit/org/apache/cassandra/cql3/CrcCheckChanceTest.java b/test/unit/org/apache/cassandra/cql3/CrcCheckChanceTest.java
index f218c9d..3bcccf0 100644
--- a/test/unit/org/apache/cassandra/cql3/CrcCheckChanceTest.java
+++ b/test/unit/org/apache/cassandra/cql3/CrcCheckChanceTest.java
@@ -140,7 +140,7 @@ public class CrcCheckChanceTest extends CQLTester
         }
         DatabaseDescriptor.setCompactionThroughputMbPerSec(1);
-        List<Future<?>> futures = CompactionManager.instance.submitMaximal(cfs, CompactionManager.GC_ALL);
+        List<Future<?>> futures = CompactionManager.instance.submitMaximal(cfs, CompactionManager.getDefaultGcBefore(cfs), false);
         execute("DROP TABLE %s");
         try
[2/4] cassandra git commit: fixup 9251 unit test
fixup 9251 unit test

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1a0262f2 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1a0262f2 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1a0262f2 Branch: refs/heads/trunk Commit: 1a0262f2472450821178a4b61c0c1ecb0193f888 Parents: 369966a Author: Benedict Elliott Smith <bened...@apache.org> Authored: Mon May 4 19:36:25 2015 +0100 Committer: Benedict Elliott Smith <bened...@apache.org> Committed: Mon May 4 19:36:25 2015 +0100
--
 .../apache/cassandra/cql3/CrcCheckChanceTest.java | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)
--
[jira] [Updated] (CASSANDRA-9295) Streaming not holding on to ref's long enough.
[ https://issues.apache.org/jira/browse/CASSANDRA-9295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-9295: Attachment: (was: 9295.debug.txt)
[jira] [Updated] (CASSANDRA-9295) Streaming not holding on to ref's long enough.
[ https://issues.apache.org/jira/browse/CASSANDRA-9295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-9295: Attachment: 9295.debug.txt Rebased against 2.1 Streaming not holding on to ref's long enough. -- Key: CASSANDRA-9295 URL: https://issues.apache.org/jira/browse/CASSANDRA-9295 Project: Cassandra Issue Type: Bug Reporter: Jeremiah Jordan Fix For: 2.1.x Attachments: 9295.debug.txt While doing some testing around adding/removing nodes under load with cassandra-2.1 head as of a few days ago (shortly after 2.1.5 was tagged), I am seeing stream-out errors with file-not-found exceptions. The file in question had just finished being compacted into a new file a few lines earlier in the log, so it seems that streaming isn't holding onto Refs correctly for the sstables in the stream plans. I also see a corrupt sstable exception for the file the missing file was compacted to. Trimmed logs with just the compaction/streaming related entries follow; you can see the stream plan is initiated between the compaction starting and the compaction finishing.
{noformat}
INFO [MemtableFlushWriter:3] 2015-05-04 16:08:21,239 Memtable.java:380 - Completed flushing /mnt/cass_data_disks/data1/keyspace1/standard1-49f17b30f27711e4a438775021e2cd7f/keyspace1-standard1-ka-4-Data.db (60666088 bytes) for commitlog position ReplayPosition(segmentId=1430755416941, position=32294797)
INFO [CompactionExecutor:4] 2015-05-04 16:08:40,856 CompactionTask.java:140 - Compacting [SSTableReader(path='/mnt/cass_data_disks/data1/keyspace1/standard1-49f17b30f27711e4a438775021e2cd7f/keyspace1-standard1-ka-4-Data.db'), SSTableReader(path='/mnt/cass_data_disks/data1/keyspace1/standard1-49f17b30f27711e4a438775021e2cd7f/keyspace1-standard1-ka-3-Data.db')]
INFO [STREAM-INIT-/10.240.213.56:53190] 2015-05-04 16:09:31,047 StreamResultFuture.java:109 - [Stream #f261c040-f277-11e4-a070-d126f0416bc9 ID#0] Creating new streaming plan for Rebuild
INFO [STREAM-INIT-/10.240.213.56:53190] 2015-05-04 16:09:31,238 StreamResultFuture.java:116 - [Stream #f261c040-f277-11e4-a070-d126f0416bc9, ID#0] Received streaming plan for Rebuild
INFO [STREAM-INIT-/10.240.213.56:53192] 2015-05-04 16:09:31,249 StreamResultFuture.java:116 - [Stream #f261c040-f277-11e4-a070-d126f0416bc9, ID#0] Received streaming plan for Rebuild
INFO [STREAM-IN-/10.240.213.56] 2015-05-04 16:09:31,353 ColumnFamilyStore.java:882 - Enqueuing flush of standard1: 91768068 (19%) on-heap, 0 (0%) off-heap
INFO [STREAM-IN-/10.240.213.56] 2015-05-04 16:09:37,425 ColumnFamilyStore.java:882 - Enqueuing flush of solr: 10012689 (2%) on-heap, 0 (0%) off-heap
INFO [STREAM-IN-/10.240.213.56] 2015-05-04 16:09:38,073 StreamResultFuture.java:166 - [Stream #f261c040-f277-11e4-a070-d126f0416bc9 ID#0] Prepare completed. Receiving 0 files(0 bytes), sending 6 files(284288285 bytes)
INFO [CompactionExecutor:4] 2015-05-04 16:10:11,047 CompactionTask.java:270 - Compacted 2 sstables to [/mnt/cass_data_disks/data1/keyspace1/standard1-49f17b30f27711e4a438775021e2cd7f/keyspace1-standard1-ka-5,/mnt/cass_data_disks/data1/keyspace1/standard1-49f17b30f27711e4a438775021e2cd7f/keyspace1-standard1-ka-8,]. 182,162,816 bytes to 182,162,816 (~100% of original) in 90,188ms = 1.926243MB/s. 339,856 total partitions merged to 339,856. Partition merge counts were {1:339856, }
ERROR [STREAM-OUT-/10.240.213.56] 2015-05-04 16:10:25,169 StreamSession.java:477 - [Stream #f261c040-f277-11e4-a070-d126f0416bc9] Streaming error occurred
java.io.IOException: Corrupted SSTable : /mnt/cass_data_disks/data1/keyspace1/standard1-49f17b30f27711e4a438775021e2cd7f/keyspace1-standard1-ka-5-Data.db
at org.apache.cassandra.io.util.DataIntegrityMetadata$ChecksumValidator.validate(DataIntegrityMetadata.java:79) ~[cassandra-all-2.1.5.426.jar:2.1.5.426]
at org.apache.cassandra.streaming.StreamWriter.write(StreamWriter.java:149) ~[cassandra-all-2.1.5.426.jar:2.1.5.426]
at org.apache.cassandra.streaming.StreamWriter.write(StreamWriter.java:102) ~[cassandra-all-2.1.5.426.jar:2.1.5.426]
at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:58) ~[cassandra-all-2.1.5.426.jar:2.1.5.426]
at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:42) ~[cassandra-all-2.1.5.426.jar:2.1.5.426]
at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45) ~[cassandra-all-2.1.5.426.jar:2.1.5.426]
at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:346) [cassandra-all-2.1.5.426.jar:2.1.5.426]
at
[jira] [Updated] (CASSANDRA-8930) Add a warn notification for clients
[ https://issues.apache.org/jira/browse/CASSANDRA-8930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Yeksigian updated CASSANDRA-8930: -- Attachment: 8930-trunk.txt I've attached a patch which adds a {{ClientWarn.warn()}} that buffers the messages and sends them with the response to the client. It is active for v4 clients. Added client warnings for the batch log size and tombstone-overwhelming warnings, as well as a deprecation warning for using unlogged batches (which was the motivation for looking at this). Add a warn notification for clients --- Key: CASSANDRA-8930 URL: https://issues.apache.org/jira/browse/CASSANDRA-8930 Project: Cassandra Issue Type: Sub-task Reporter: Carl Yeksigian Labels: client-impacting, protocolv4 Fix For: 3.x Attachments: 8930-trunk.txt Currently, if a query generates a warning, it is only logged server side. If the person writing the query is not the admin, that warning isn't going to have an impact on the query, and we're just going to fill up the server logs. We should push these warnings back to the client so that driver users can make the necessary changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
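The per-request buffering scheme can be sketched in Python as a toy model of the patch's {{ClientWarn.warn()}} idea (the real implementation is Java and tied to the native protocol; class and method names here just mirror the ticket for illustration):

```python
import threading

class ClientWarn:
    """Toy per-request warning buffer: warnings raised while a query is
    executing are collected thread-locally and shipped back with that
    query's response instead of only landing in the server log."""
    _local = threading.local()

    @classmethod
    def capture(cls):
        cls._local.warnings = []          # start a fresh buffer for this request

    @classmethod
    def warn(cls, message):
        buf = getattr(cls._local, "warnings", None)
        if buf is not None:               # only buffer while a request is active
            buf.append(message)

    @classmethod
    def flush(cls):
        buf = getattr(cls._local, "warnings", None) or []
        cls._local.warnings = None        # detach; later warns are dropped
        return buf

# Request lifecycle: capture -> execute (may warn) -> flush into the response.
ClientWarn.capture()
ClientWarn.warn("Unlogged batch covering 12 partitions detected")
response_warnings = ClientWarn.flush()
```

Because the buffer is thread-local, concurrent requests on different worker threads each see only their own warnings, which is what makes attaching them to the right response safe.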
[jira] [Updated] (CASSANDRA-9286) Add Keyspace/Table details to CollectionType.java error message
[ https://issues.apache.org/jira/browse/CASSANDRA-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Yeksigian updated CASSANDRA-9286: -- Attachment: 9286-2.1.txt 9286-2.0.txt Attaching patches which log the ks/cf name for oversized collections. Add Keyspace/Table details to CollectionType.java error message --- Key: CASSANDRA-9286 URL: https://issues.apache.org/jira/browse/CASSANDRA-9286 Project: Cassandra Issue Type: Improvement Components: Core Reporter: sequoyha pelletier Assignee: Carl Yeksigian Priority: Minor Attachments: 9286-2.0.txt, 9286-2.1.txt The error message for too many elements in a collection does not give keyspace or column family information, which makes it painful to determine which table is the offending one. Example error message: {noformat} ERROR [Native-Transport-Requests:809453] 2015-04-23 22:48:21,189 CollectionType.java (line 116) Detected collection with 136234 elements, more than the 65535 limit. Only the first 65535 elements will be returned to the client. Please see http://cassandra.apache.org/doc/cql3/CQL.html#collections for more details. {noformat} Currently, to pinpoint the table in question, we need to trace all requests and then match the timestamps in the CQL tracing session against the log timestamps. If prepared statements are used, this is a dead end because the logged tracing information is missing the query; in that case we have to turn to third-party methods for capturing the queries and matching them up. This is extremely tedious when many tables have collections and a high number of ops against them. Requesting that the error message contain the keyspace.table name.
[jira] [Assigned] (CASSANDRA-9131) Defining correct behavior during leap second insertion
[ https://issues.apache.org/jira/browse/CASSANDRA-9131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan McGuire reassigned CASSANDRA-9131: --- Assignee: Sylvain Lebresne (was: Jim Witschey) Defining correct behavior during leap second insertion -- Key: CASSANDRA-9131 URL: https://issues.apache.org/jira/browse/CASSANDRA-9131 Project: Cassandra Issue Type: Bug Environment: Linux ip-172-31-0-5 3.2.0-57-virtual #87-Ubuntu SMP Tue Nov 12 21:53:49 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux Reporter: Jim Witschey Assignee: Sylvain Lebresne On Linux platforms, the insertion of a leap second breaks the monotonicity of timestamps. This can make values appear to have been inserted into Cassandra in a different order than they were. I want to know what behavior is expected and desirable for inserts over this discontinuity. From a timestamp perspective, an inserted leap second looks like a repeat of the previous second:
{code}
$ while true ; do echo `date +%s%N` `date -u` ; sleep .5 ; done
1435708798171327029 Tue Jun 30 23:59:58 UTC 2015
1435708798679392477 Tue Jun 30 23:59:58 UTC 2015
1435708799187550335 Tue Jun 30 23:59:59 UTC 2015
1435708799695670453 Tue Jun 30 23:59:59 UTC 2015
1435708799203902068 Tue Jun 30 23:59:59 UTC 2015
1435708799712168566 Tue Jun 30 23:59:59 UTC 2015
1435708800220473932 Wed Jul 1 00:00:00 UTC 2015
1435708800728908190 Wed Jul 1 00:00:00 UTC 2015
1435708801237611983 Wed Jul 1 00:00:01 UTC 2015
1435708801746251996 Wed Jul 1 00:00:01 UTC 2015
{code}
Note that 23:59:59 repeats itself, and that the timestamps increase during the first time through, then step back down to the beginning of the second and increase again. As a result, the timestamps on values inserted during these seconds will be out of order. I set up a 4-node cluster running under Ubuntu 12.04.3 and synced the nodes shortly before the leap second would be inserted.
During the insertion of the leap second, I ran a test with logic something like:
{code}
simple_insert = session.prepare(
    'INSERT INTO test (foo, bar) VALUES (?, ?);')
for i in itertools.count():
    # stop after midnight
    now = datetime.utcnow()
    last_midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    seconds_since_midnight = (now - last_midnight).total_seconds()
    if 5 <= seconds_since_midnight <= 15:
        break
    session.execute(simple_insert, [i, i])
result = session.execute('SELECT bar, WRITETIME(bar) FROM test;')
{code}
EDIT: This behavior occurs with server-generated timestamps; in this particular test, I set {{use_client_timestamp}} to {{False}}. Under normal circumstances, the values and writetimes would increase together, but when inserted over the leap second, they don't. These {{value, writetime}} pairs are sorted by writetime:
{code}
(582, 1435708799285000)
(579, 1435708799339000)
(583, 1435708799593000)
(580, 1435708799643000)
(584, 1435708799897000)
(581, 1435708799958000)
{code}
The values were inserted in increasing order, but their writetimes are in a different order because of the repeated second. During the first instance of 23:59:59, the values 579, 580, and 581 were inserted at the beginning, middle, and end of the second. During the leap second, which is also 23:59:59, 582, 583, and 584 were inserted, also at the beginning, middle, and end of the second. However, since the two seconds are the same second, they appear interleaved with respect to timestamps, as shown above. So, should I consider this behavior correct? If not, how should Cassandra correctly handle the discontinuity introduced by the insertion of a leap second?
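When client timestamps are in use (unlike this test, which used server timestamps), the usual driver-side mitigation is a monotonic timestamp generator: never hand out a timestamp less than or equal to the previous one, so a backwards clock step cannot reorder writes. A minimal sketch of that idea (illustrative only, not actual driver code):

```python
import time

class MonotonicTimestampGenerator:
    """Returns microsecond timestamps that never go backwards, even if
    the system clock steps back (e.g. across an inserted leap second)."""
    def __init__(self, clock=time.time):
        self.clock = clock
        self.last = 0

    def next(self):
        now = int(self.clock() * 1_000_000)
        # If the clock stepped backwards, keep ticking forward by 1 us
        # instead of reusing an earlier timestamp.
        self.last = max(now, self.last + 1)
        return self.last

# Simulate a leap second: the wall clock repeats part of a second.
steps = iter([100.0, 100.5, 101.0, 100.2, 100.7])  # note the step back to 100.2
gen = MonotonicTimestampGenerator(clock=lambda: next(steps))
stamps = [gen.next() for _ in range(5)]
assert stamps == sorted(stamps) and len(set(stamps)) == 5
```

The trade-off is that during the repeated second the generator emits artificial timestamps slightly ahead of the wall clock, but insertion order is preserved, which is what the writetime ordering above violates.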
[jira] [Commented] (CASSANDRA-9131) Defining correct behavior during leap second insertion
[ https://issues.apache.org/jira/browse/CASSANDRA-9131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527114#comment-14527114 ] Ryan McGuire commented on CASSANDRA-9131: - Assigned to [~slebresne] to provide protocol spec.
[jira] [Commented] (CASSANDRA-9193) Facility to write dynamic code to selectively trigger trace or log for queries
[ https://issues.apache.org/jira/browse/CASSANDRA-9193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527111#comment-14527111 ] Matt Stump commented on CASSANDRA-9193: --- I spent some time last night exploring hanging a JS REPL off of C* to see what I could do in terms of dynamically rewriting the functionality of a running StorageProxy. I can interrogate or modify variables and call methods, but I can't dynamically rewrite code; I think this may be a JVM limitation. I can extend a Java class overriding a method, but I can't modify the method of an already instantiated object unless it was already being resolved via invokedynamic (I think). We could still have a JS REPL, but its usefulness is limited. The vision of having the power of something like a Lisp or Erlang REPL hanging off of the server process is mostly dead. So that leaves us with three possibilities: the known injection points as originally mentioned, the UDF route as outlined by [~slebresne], or a combination of the two. Facility to write dynamic code to selectively trigger trace or log for queries -- Key: CASSANDRA-9193 URL: https://issues.apache.org/jira/browse/CASSANDRA-9193 Project: Cassandra Issue Type: New Feature Reporter: Matt Stump I want the equivalent of dtrace for Cassandra: the ability to intercept a query with a dynamic script (assume JS) and, based on logic in that script, flag the statement for tracing or logging. Examples:
- Trace only INSERT statements to a particular CF.
- Trace statements for a particular partition or consistency level.
- Log statements that fail to reach the desired consistency for read or write.
- Log if the request size for read or write exceeds some threshold.
At some point in the future it would also be helpful to do things such as log partitions greater than X bytes or Z cells when performing compaction. Essentially, be able to inject custom code dynamically, without a reboot, into the different stages of C*.
The code should be executed synchronously as part of the monitored task, but we should provide the ability to log or execute CQL asynchronously from the provided API. Further down the line we could use this functionality to modify/rewrite requests or tasks dynamically. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
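The known-injection-points route could look something like a registry of cheap, user-supplied predicates evaluated synchronously at named stages. Everything in this sketch (stage names, the dict shape of a query) is hypothetical, chosen only to illustrate the shape of the API:

```python
# Hypothetical sketch of the "known injection points" approach: operators
# register small predicates at named stages; when any predicate matches,
# the statement is flagged for tracing or logging.
hooks = {}   # stage name -> list of predicate callables

def register(stage, predicate):
    """Install a predicate at a named injection point."""
    hooks.setdefault(stage, []).append(predicate)

def should_trace(stage, query):
    # Evaluated synchronously inside the monitored task, so predicates
    # must be cheap; the actual trace/log emission would be asynchronous.
    return any(p(query) for p in hooks.get(stage, []))

# Example: trace only INSERTs against a particular column family.
register("coordinator_write",
         lambda q: q["type"] == "INSERT" and q["cf"] == "users")

assert should_trace("coordinator_write", {"type": "INSERT", "cf": "users"})
assert not should_trace("coordinator_write", {"type": "SELECT", "cf": "users"})
```

Keeping the predicates data-driven like this is what allows them to be swapped at runtime without a reboot, which is the core requirement of the ticket.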
[jira] [Commented] (CASSANDRA-9295) Streaming not holding on to ref's long enough.
[ https://issues.apache.org/jira/browse/CASSANDRA-9295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527812#comment-14527812 ] Yuki Morishita commented on CASSANDRA-9295: --- I think there are two problems here: file corruption, and the SSTable ref release process in error handling. I'm not sure about the cause of the former. Does it only happen on uncompressed SSTables? For the latter, when an error occurs, the streaming session is still holding refs to SSTables and trying to use them even after release: StreamSession releases the SSTable refs, but the outgoing message thread still has queued messages waiting to be flushed that hold the same refs. I will post a patch for it.
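The release-ordering problem described above can be sketched with toy reference counting (illustrative only; the real code uses Cassandra's Ref/RefCounted machinery): each queued outgoing message must hold its own ref, so the session releasing its refs during error handling cannot invalidate a file that is still queued for flushing.

```python
class SSTableRef:
    """Toy reference count guarding an on-disk file."""
    def __init__(self, name):
        self.name, self.count, self.deleted = name, 0, False

    def ref(self):
        assert not self.deleted, "ref acquired after file was released"
        self.count += 1
        return self

    def release(self):
        self.count -= 1
        if self.count == 0:
            self.deleted = True   # file may now be removed from disk

sstable = SSTableRef("keyspace1-standard1-ka-5-Data.db")

# Buggy pattern: the session holds the only ref and queued messages merely
# borrow it, so the session's release on the error path deletes the file
# out from under them. Fixed pattern sketched here: one ref per message.
session_ref = sstable.ref()
queue = [sstable.ref() for _ in range(3)]   # one ref per queued file message
session_ref.release()                        # error path: session lets go
assert not sstable.deleted                   # queued messages still pin the file
for msg_ref in queue:
    msg_ref.release()                        # released as each message is sent
assert sstable.deleted                       # only now is deletion safe
```
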
[jira] [Commented] (CASSANDRA-9295) Streaming not holding on to ref's long enough.
[ https://issues.apache.org/jira/browse/CASSANDRA-9295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527821#comment-14527821 ] Jeremiah Jordan commented on CASSANDRA-9295: I don't know. It takes a while to reproduce, and I haven't reproduced the missing file issue with the debug logging on yet, so that is still to be figured out.
[jira] [Commented] (CASSANDRA-9295) Streaming not holding on to ref's long enough.
[ https://issues.apache.org/jira/browse/CASSANDRA-9295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527834#comment-14527834 ] Yuki Morishita commented on CASSANDRA-9295: --- The missing file is caused by accessing an already-released file in the error handling procedure. It is the same as the latter problem I mentioned in my previous comment.
[jira] [Commented] (CASSANDRA-4938) CREATE INDEX can block for creation now that schema changes may be concurrent
[ https://issues.apache.org/jira/browse/CASSANDRA-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527867#comment-14527867 ] Jonathan Ellis commented on CASSANDRA-4938: --- We push schema events to clients, right? Can we push an "index build finished" event? I'd rather do that than block. Related: cqlsh should show a message to users when schema events are pushed to it. CREATE INDEX can block for creation now that schema changes may be concurrent - Key: CASSANDRA-4938 URL: https://issues.apache.org/jira/browse/CASSANDRA-4938 Project: Cassandra Issue Type: Wish Reporter: Krzysztof Cieslinski Cognitum Assignee: Kirk True Priority: Minor Labels: lhf Fix For: 3.x The response to the CREATE INDEX command comes back faster than the creation of the secondary index. So the code below:
{code:xml}
CREATE INDEX ON tab(name);
SELECT * FROM tab WHERE name = 'Chris';
{code}
doesn't return any rows (even though, in column family tab, there are records with name = 'Chris') and no errors; I would expect something like ??Bad Request: No indexed columns present in by-columns clause with Equal operator??. Inserting a delay between the two commands resolves the problem, so:
{code:xml}
CREATE INDEX ON tab(name);
Sleep(timeout); // for a column family with 2000 rows the timeout had to be ~1 second
SELECT * FROM tab WHERE name = 'Chris';
{code}
will return all rows as expected. I'm using a single-node cluster.
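Until a pushed "index build finished" event exists, a client-side workaround is to poll for completion rather than sleep a fixed amount. A sketch under the assumption that some probe for index build status is available; the `is_index_built` callable here is a hypothetical stand-in (e.g. a query against whatever system tables track index builds):

```python
import time

def wait_for_index(is_index_built, timeout=30.0, interval=0.1):
    """Poll until the secondary index reports built, instead of a blind
    Sleep(timeout). is_index_built is a caller-supplied probe (hypothetical);
    returns True once the probe succeeds, False if the deadline passes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if is_index_built():
            return True
        time.sleep(interval)
    return False

# Simulated probe: the index reports built on the third poll.
polls = iter([False, False, True])
assert wait_for_index(lambda: next(polls), timeout=5.0, interval=0.0)
```

Polling bounds the wait by actual build progress instead of a guessed sleep, which matters because the required delay grows with table size (the reporter needed ~1 second for only 2000 rows).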
[jira] [Commented] (CASSANDRA-9295) Streaming not holding on to refs long enough.
[ https://issues.apache.org/jira/browse/CASSANDRA-9295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527668#comment-14527668 ] Jeremiah Jordan commented on CASSANDRA-9295:
---

Here are some logs from that patch for the corrupt sstable issue. It isn't showing the missing file issue, but I think it's still showing an issue in ref management.
{noformat}
INFO [CompactionExecutor:4] 2015-05-04 22:44:10,104 CompactionTask.java:270 - Compacted 2 sstables to [/mnt/cass_data_disks/data1/keyspace1/standard1-31b786e0f2ae11e4b357c3b5cec66e30/keyspace1-standard1-ka-11,]. 160,518,600 bytes to 160,518,600 (~100% of original) in 37,587ms = 4.072750MB/s. 299,475 total partitions merged to 299,475. Partition merge counts were {1:299475, }
INFO [CompactionExecutor:3] 2015-05-04 22:44:10,146 CompactionTask.java:140 - Compacting [SSTableReader(path='/mnt/cass_data_disks/data1/keyspace1/standard1-31b786e0f2ae11e4b357c3b5cec66e30/keyspace1-standard1-ka-11-Data.db'), SSTableReader(path='/mnt/cass_data_disks/data1/keyspace1/standard1-31b786e0f2ae11e4b357c3b5cec66e30/keyspace1-standard1-ka-5-Data.db'), SSTableReader(path='/mnt/cass_data_disks/data1/keyspace1/standard1-31b786e0f2ae11e4b357c3b5cec66e30/keyspace1-standard1-ka-6-Data.db'), SSTableReader(path='/mnt/cass_data_disks/data1/keyspace1/standard1-31b786e0f2ae11e4b357c3b5cec66e30/keyspace1-standard1-ka-12-Data.db')]
ERROR [STREAM-OUT-/10.240.140.97] 2015-05-04 22:44:36,851 StreamSession.java:475 - [Stream #05469d70-f2af-11e4-a23d-e75fecbd7a98] Streaming error occurred
java.io.IOException: Corrupted SSTable : /mnt/cass_data_disks/data1/keyspace1/standard1-31b786e0f2ae11e4b357c3b5cec66e30/keyspace1-standard1-ka-11-Data.db
	at org.apache.cassandra.io.util.DataIntegrityMetadata$ChecksumValidator.validate(DataIntegrityMetadata.java:79) ~[cassandra-all-2.1.5.5182.jar:2.1.5.5182]
	at org.apache.cassandra.streaming.StreamWriter.write(StreamWriter.java:149) ~[cassandra-all-2.1.5.5182.jar:2.1.5.5182]
	at org.apache.cassandra.streaming.StreamWriter.write(StreamWriter.java:102) ~[cassandra-all-2.1.5.5182.jar:2.1.5.5182]
	at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:58) ~[cassandra-all-2.1.5.5182.jar:2.1.5.5182]
	at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:42) ~[cassandra-all-2.1.5.5182.jar:2.1.5.5182]
	at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45) ~[cassandra-all-2.1.5.5182.jar:2.1.5.5182]
	at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:346) [cassandra-all-2.1.5.5182.jar:2.1.5.5182]
	at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:318) [cassandra-all-2.1.5.5182.jar:2.1.5.5182]
	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
INFO [STREAM-OUT-/10.240.140.97] 2015-05-04 22:44:36,875 StreamResultFuture.java:180 - [Stream #05469d70-f2af-11e4-a23d-e75fecbd7a98] Session with /10.240.140.97 is complete
WARN [STREAM-OUT-/10.240.140.97] 2015-05-04 22:44:36,876 StreamResultFuture.java:207 - [Stream #05469d70-f2af-11e4-a23d-e75fecbd7a98] Stream failed
ERROR [STREAM-OUT-/10.240.140.97] 2015-05-04 22:44:36,881 Ref.java:210 - Allocate trace org.apache.cassandra.utils.concurrent.Ref$State@44977e69: Thread[STREAM-IN-/10.240.140.97,5,main]
	at java.lang.Thread.getStackTrace(Thread.java:1552)
	at org.apache.cassandra.utils.concurrent.Ref$Debug.init(Ref.java:200)
	at org.apache.cassandra.utils.concurrent.Ref$State.init(Ref.java:133)
	at org.apache.cassandra.utils.concurrent.Ref.init(Ref.java:66)
	at org.apache.cassandra.utils.concurrent.Ref.tryRef(Ref.java:98)
	at org.apache.cassandra.io.sstable.SSTableReader.tryRef(SSTableReader.java:2008)
	at org.apache.cassandra.utils.concurrent.Refs.tryRef(Refs.java:186)
	at org.apache.cassandra.db.ColumnFamilyStore.selectAndReference(ColumnFamilyStore.java:1830)
	at org.apache.cassandra.streaming.StreamSession.getSSTableSectionsForRanges(StreamSession.java:304)
	at org.apache.cassandra.streaming.StreamSession.addTransferRanges(StreamSession.java:266)
	at org.apache.cassandra.streaming.StreamSession.prepare(StreamSession.java:491)
	at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:423)
	at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:251)
	at java.lang.Thread.run(Thread.java:745)
ERROR [STREAM-OUT-/10.240.140.97] 2015-05-04 22:44:36,881 Ref.java:212 - Deallocate trace org.apache.cassandra.utils.concurrent.Ref$State@44977e69: Thread[STREAM-OUT-/10.240.140.97,5,main]
	at
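The Allocate/Deallocate traces above come from Cassandra's Ref debug machinery, which records where each reference was acquired and released. The contract being debugged is that tryRef must fail once the last reference has been released, so a streaming session cannot keep using an sstable that has already been freed. That contract can be sketched generically as follows; this is an illustration of the semantics, not Cassandra's actual Ref implementation:

```python
import threading

class Ref:
    """A simplified shared-reference counter with tryRef semantics."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 1  # the creating owner holds the first reference

    def try_ref(self):
        """Acquire an additional reference; fails if already fully released."""
        with self._lock:
            if self._count == 0:
                return False  # too late: the resource may already be freed
            self._count += 1
            return True

    def release(self):
        """Drop one reference; returns True when the resource should be freed."""
        with self._lock:
            assert self._count > 0, "double release"
            self._count -= 1
            return self._count == 0

ref = Ref()
assert ref.try_ref()      # a second holder (e.g. a stream session) joins
assert not ref.release()  # one reference still outstanding: keep the file
assert ref.release()      # last reference gone: safe to delete the sstable
assert not ref.try_ref()  # resurrection is refused, as in SSTableReader.tryRef
```

The bug class the logs point at is a holder releasing its reference (or never taking one via try_ref) while the streamed file is still being read, which is exactly the window in which a concurrent compaction can delete or replace the sstable underneath the stream.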
[jira] [Updated] (CASSANDRA-9278) LeveledCompactionStrategyTest.testGrouperLevels fails with test-compression
[ https://issues.apache.org/jira/browse/CASSANDRA-9278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Yeksigian updated CASSANDRA-9278:
---

    Reviewer: Carl Yeksigian

LeveledCompactionStrategyTest.testGrouperLevels fails with test-compression
---------------------------------------------------------------------------

                 Key: CASSANDRA-9278
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9278
             Project: Cassandra
          Issue Type: Test
            Reporter: Ariel Weisberg
            Assignee: Ariel Weisberg
             Fix For: 3.x

Compression causes fewer sstables to be emitted, so the test fails when it goes looking for the tables. The solution is to use high-entropy data so it doesn't compress.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9278) LeveledCompactionStrategyTest.testGrouperLevels fails with test-compression
[ https://issues.apache.org/jira/browse/CASSANDRA-9278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527360#comment-14527360 ] Carl Yeksigian commented on CASSANDRA-9278:
---

+1

LeveledCompactionStrategyTest.testGrouperLevels fails with test-compression
---------------------------------------------------------------------------

                 Key: CASSANDRA-9278
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9278
             Project: Cassandra
          Issue Type: Test
            Reporter: Ariel Weisberg
            Assignee: Ariel Weisberg
             Fix For: 3.x

Compression causes fewer sstables to be emitted, so the test fails when it goes looking for the tables. The solution is to use high-entropy data so it doesn't compress.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
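The fix relies on a basic property of compressors: repetitive payloads shrink dramatically, while high-entropy (random) bytes are essentially incompressible, so sstables written from random data keep their expected sizes and counts. A quick illustration using Python's zlib as a stand-in compressor (Cassandra's sstable compressors such as LZ4 behave the same way on random input):

```python
import os
import zlib

def compressed_size(data: bytes) -> int:
    """Return the deflate-compressed size of a payload in bytes."""
    return len(zlib.compress(data, 6))

repetitive = b"key-0001" * 8192       # 64 KiB of repeating structure
random_bytes = os.urandom(64 * 1024)  # 64 KiB of entropy

# Repetitive data compresses to a tiny fraction of its original size...
assert compressed_size(repetitive) < len(repetitive) // 100

# ...while random data stays roughly the same size (often slightly larger,
# because of the compressor's framing overhead).
assert compressed_size(random_bytes) > len(random_bytes) * 95 // 100
```

Writing test rows from something like os.urandom therefore makes on-disk sizes independent of whether the test runs with compression enabled, which is what testGrouperLevels needs.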
[jira] [Commented] (CASSANDRA-9124) GCInspector logs very different times after CASSANDRA-7638
[ https://issues.apache.org/jira/browse/CASSANDRA-9124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527524#comment-14527524 ] Ariel Weisberg commented on CASSANDRA-9124:
---

Zing GC names are "GPGC Old" and "GPGC New", but the GCInspector receives zero notifications under Zing, so it is a moot point until that is fixed.

GCInspector logs very different times after CASSANDRA-7638
----------------------------------------------------------

                 Key: CASSANDRA-9124
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9124
             Project: Cassandra
          Issue Type: Bug
          Components: Core
            Reporter: Jeremiah Jordan
            Assignee: Ariel Weisberg
            Priority: Minor
             Fix For: 3.0, 2.1.6

After the GCInspector rewrite in CASSANDRA-7638, the times reported for CMS are the full collection time (including all the concurrent phases), not just the stop-the-world pause time. In previous versions we reported just the stop-the-world pause time. This change is disconcerting for someone used to the old logs, and it is also less useful: you can no longer tell from the log message how long things were actually stopped. For example, this is a CMS that got logged in C* 2.1:
{noformat}
INFO [Service Thread] 2015-04-03 23:58:37,583 GCInspector.java:142 - ConcurrentMarkSweep GC in 12926ms. CMS Old Gen: 5305346280 - 1106799064; Par Eden Space: 223080 - 158423560; Par Survivor Space: 42081744 - 51339584
{noformat}
And here is the corresponding information for that CMS from the gc log.
{noformat}
2015-04-03T23:58:24.656+: 8064.780: [GC [1 CMS-initial-mark: 5181002K(6901760K)] 5222315K(7639040K), 0.0316710 secs] [Times: user=0.03 sys=0.00, real=0.03 secs]
2015-04-03T23:58:24.688+: 8064.812: Total time for which application threads were stopped: 0.0324490 seconds
2015-04-03T23:58:24.688+: 8064.812: [CMS-concurrent-mark-start]
2015-04-03T23:58:26.939+: 8067.062: [CMS-concurrent-mark: 2.176/2.250 secs] [Times: user=12.94 sys=1.73, real=2.25 secs]
2015-04-03T23:58:26.939+: 8067.063: [CMS-concurrent-preclean-start]
2015-04-03T23:58:27.209+: 8067.333: [CMS-concurrent-preclean: 0.187/0.270 secs] [Times: user=1.53 sys=0.15, real=0.28 secs]
2015-04-03T23:58:27.210+: 8067.333: [CMS-concurrent-abortable-preclean-start]
2015-04-03T23:58:27.988+: 8068.112: [CMS-concurrent-abortable-preclean: 0.759/0.779 secs] [Times: user=4.07 sys=0.74, real=0.77 secs]
2015-04-03T23:58:27.989+: 8068.113: [GC[YG occupancy: 488441 K (737280 K)]2015-04-03T23:58:27.989+: 8068.113: [Rescan (parallel) , 0.3688960 secs]2015-04-03T23:58:28.358+: 8068.482: [weak refs processing, 0.0009620 secs]2015-04-03T23:58:28.359+: 8068.483: [class unloading, 0.0060870 secs]2015-04-03T23:58:28.365+: 8068.489: [scrub symbol table, 0.0146010 secs]2015-04-03T23:58:28.380+: 8068.504: [scrub string table, 0.0031270 secs] [1 CMS-remark: 5231445K(6901760K)] 5719886K(7639040K), 0.3953770 secs] [Times: user=2.96 sys=0.00, real=0.39 secs]
2015-04-03T23:58:28.385+: 8068.508: Total time for which application threads were stopped: 0.3962470 seconds
2015-04-03T23:58:28.385+: 8068.509: [CMS-concurrent-sweep-start]
2015-04-03T23:58:37.582+: 8077.706: [CMS-concurrent-sweep: 8.661/9.197 secs] [Times: user=44.80 sys=9.58, real=9.20 secs]
2015-04-03T23:58:37.589+: 8077.713: [CMS-concurrent-reset-start]
2015-04-03T23:58:37.633+: 8077.757: [CMS-concurrent-reset: 0.044/0.044 secs] [Times: user=0.19 sys=0.10, real=0.04 secs]
{noformat}
The entire CMS took the 12 seconds reported in the GCInspector log message.
Previously we would have reported only the 0.39 seconds that were spent in STW pauses. At the least, we need to change the log message so that people don't think we are still reporting only STW time. But it would be more helpful if we could recover the STW time and put that into the log message as we did previously.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
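Until the GCInspector message is fixed, the stop-the-world figure can be recovered from the gc log itself by summing the "Total time for which application threads were stopped" safepoint lines that fall within the collection. A small sketch (the sample lines are taken from the excerpt above; a real tool would additionally bracket the sum by collection start/end markers):

```python
import re

STW_RE = re.compile(
    r"Total time for which application threads were stopped: "
    r"([0-9.]+) seconds")

def total_stw_seconds(gc_log_lines):
    """Sum the safepoint (stop-the-world) durations in a gc log excerpt."""
    return sum(float(m.group(1))
               for line in gc_log_lines
               for m in STW_RE.finditer(line))

lines = [
    "2015-04-03T23:58:24.688+: 8064.812: Total time for which application threads were stopped: 0.0324490 seconds",
    "2015-04-03T23:58:24.688+: 8064.812: [CMS-concurrent-mark-start]",
    "2015-04-03T23:58:28.385+: 8068.508: Total time for which application threads were stopped: 0.3962470 seconds",
]

# The two pauses in the excerpt add up to ~0.43s of actual stopped time,
# versus the 12926ms the GCInspector message reports for the whole cycle.
assert abs(total_stw_seconds(lines) - 0.428696) < 1e-6
```

This makes the gap concrete: the application threads were stopped for well under half a second during a CMS cycle whose total duration was logged as nearly 13 seconds.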
[jira] [Created] (CASSANDRA-9296) OHC fails to load in Zing 1.7 VM
Ariel Weisberg created CASSANDRA-9296:
---

             Summary: OHC fails to load in Zing 1.7 VM
                 Key: CASSANDRA-9296
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9296
             Project: Cassandra
          Issue Type: Bug
            Reporter: Ariel Weisberg
            Assignee: Robert Stupp
             Fix For: 3.x

Error output is a disaster but I am including it here, will clean up later. I had to change code to get this error to appear properly with the invocation target exception unwrapped.

1.7 VM version:
{noformat}
java version 1.7.0-zing_15.02.1.0
Zing Runtime Environment for Java Applications (build 1.7.0-zing_15.02.1.0-b2)
Zing 64-Bit Tiered VM (build 1.7.0-zing_15.02.1.0-b2-product-azlinuxM-X86_64, mixed mode)
{noformat}
This does work with the 1.8 Zing VM.
{noformat}
[junit] java.lang.ExceptionInInitializerError
[junit] 	at org.apache.cassandra.db.ColumnFamilyStore.init(ColumnFamilyStore.java:330)
[junit] 	at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:468)
[junit] 	at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:444)
[junit] 	at org.apache.cassandra.db.Keyspace.initCf(Keyspace.java:324)
[junit] 	at org.apache.cassandra.db.Keyspace.init(Keyspace.java:275)
[junit] 	at org.apache.cassandra.db.Keyspace.open(Keyspace.java:117)
[junit] 	at org.apache.cassandra.db.Keyspace.open(Keyspace.java:94)
[junit] 	at org.apache.cassandra.db.Keyspace$1.apply(Keyspace.java:81)
[junit] 	at org.apache.cassandra.db.Keyspace$1.apply(Keyspace.java:78)
[junit] 	at com.google.common.collect.Iterators$8.transform(Iterators.java:794)
[junit] 	at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48)
[junit] 	at org.apache.cassandra.db.ColumnFamilyStore.all(ColumnFamilyStore.java:2387)
[junit] 	at org.apache.cassandra.config.CFMetaData.existingIndexNames(CFMetaData.java:1078)
[junit] 	at org.apache.cassandra.config.CFMetaData.validate(CFMetaData.java:1035)
[junit] 	at org.apache.cassandra.config.KSMetaData.validate(KSMetaData.java:180)
[junit] 	at org.apache.cassandra.service.MigrationManager.announceNewKeyspace(MigrationManager.java:264)
[junit] 	at org.apache.cassandra.service.MigrationManager.announceNewKeyspace(MigrationManager.java:259)
[junit] 	at org.apache.cassandra.SchemaLoader.createKeyspace(SchemaLoader.java:340)
[junit] 	at org.apache.cassandra.SchemaLoader.createKeyspace(SchemaLoader.java:328)
[junit] 	at org.apache.cassandra.cache.AutoSavingCacheTest.defineSchema(AutoSavingCacheTest.java:49)
[junit] 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit] 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
[junit] 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[junit] 	at java.lang.reflect.Method.invoke(Method.java:606)
[junit] 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
[junit] 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
[junit] 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
[junit] 	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:27)
[junit] 	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
[junit] 	at org.junit.runners.ParentRunner.run(ParentRunner.java:220)
[junit] 	at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
[junit] 	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:518)
[junit] 	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1052)
[junit] 	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:906)
[junit] Caused by: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
[junit] 	at org.caffinitas.ohc.OHCacheBuilder.build(OHCacheBuilder.java:221)
[junit] 	at org.apache.cassandra.cache.OHCProvider.create(OHCProvider.java:49)
[junit] 	at org.apache.cassandra.service.CacheService.initRowCache(CacheService.java:151)
[junit] 	at org.apache.cassandra.service.CacheService.init(CacheService.java:103)
[junit] 	at org.apache.cassandra.service.CacheService.clinit(CacheService.java:83)
[junit] 	... 34 more
[junit] Caused by: java.lang.reflect.InvocationTargetException
[junit] 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
[junit] 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
[jira] [Commented] (CASSANDRA-9296) OHC fails to load in Zing 1.7 VM
[ https://issues.apache.org/jira/browse/CASSANDRA-9296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527555#comment-14527555 ] Ariel Weisberg commented on CASSANDRA-9296: --- Issue created on OHC github https://github.com/snazy/ohc/issues/10 OHC fails to load in Zing 1.7 VM Key: CASSANDRA-9296 URL: https://issues.apache.org/jira/browse/CASSANDRA-9296 Project: Cassandra Issue Type: Bug Reporter: Ariel Weisberg Assignee: Robert Stupp Fix For: 3.x Error output is a disaster but I am including it here, will clean up later. I had to change code to get this error to appear properly with the invocation target exception unwrapped. 1.7 VM version {noformat} java version 1.7.0-zing_15.02.1.0 Zing Runtime Environment for Java Applications (build 1.7.0-zing_15.02.1.0-b2) Zing 64-Bit Tiered VM (build 1.7.0-zing_15.02.1.0-b2-product-azlinuxM-X86_64, mixed mode) {noformat} This does work with the 1.8 Zing VM. {noformat} [junit] java.lang.ExceptionInInitializerError [junit] at org.apache.cassandra.db.ColumnFamilyStore.init(ColumnFamilyStore.java:330) [junit] at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:468) [junit] at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:444) [junit] at org.apache.cassandra.db.Keyspace.initCf(Keyspace.java:324) [junit] at org.apache.cassandra.db.Keyspace.init(Keyspace.java:275) [junit] at org.apache.cassandra.db.Keyspace.open(Keyspace.java:117) [junit] at org.apache.cassandra.db.Keyspace.open(Keyspace.java:94) [junit] at org.apache.cassandra.db.Keyspace$1.apply(Keyspace.java:81) [junit] at org.apache.cassandra.db.Keyspace$1.apply(Keyspace.java:78) [junit] at com.google.common.collect.Iterators$8.transform(Iterators.java:794) [junit] at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48) [junit] at org.apache.cassandra.db.ColumnFamilyStore.all(ColumnFamilyStore.java:2387) [junit] at 
org.apache.cassandra.config.CFMetaData.existingIndexNames(CFMetaData.java:1078) [junit] at org.apache.cassandra.config.CFMetaData.validate(CFMetaData.java:1035) [junit] at org.apache.cassandra.config.KSMetaData.validate(KSMetaData.java:180) [junit] at org.apache.cassandra.service.MigrationManager.announceNewKeyspace(MigrationManager.java:264) [junit] at org.apache.cassandra.service.MigrationManager.announceNewKeyspace(MigrationManager.java:259) [junit] at org.apache.cassandra.SchemaLoader.createKeyspace(SchemaLoader.java:340) [junit] at org.apache.cassandra.SchemaLoader.createKeyspace(SchemaLoader.java:328) [junit] at org.apache.cassandra.cache.AutoSavingCacheTest.defineSchema(AutoSavingCacheTest.java:49) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [junit] at java.lang.reflect.Method.invoke(Method.java:606) [junit] at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44) [junit] at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) [junit] at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41) [junit] at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:27) [junit] at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31) [junit] at org.junit.runners.ParentRunner.run(ParentRunner.java:220) [junit] at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39) [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:518) [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1052) [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:906) [junit] Caused by: java.lang.RuntimeException: 
java.lang.reflect.InvocationTargetException [junit] at org.caffinitas.ohc.OHCacheBuilder.build(OHCacheBuilder.java:221) [junit] at org.apache.cassandra.cache.OHCProvider.create(OHCProvider.java:49) [junit] at org.apache.cassandra.service.CacheService.initRowCache(CacheService.java:151) [junit] at org.apache.cassandra.service.CacheService.init(CacheService.java:103) [junit] at org.apache.cassandra.service.CacheService.clinit(CacheService.java:83) [junit] ... 34 more
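The reporter notes having to unwrap the InvocationTargetException by hand before the real failure became visible. The underlying technique, walking the cause chain down to its root before reporting, is the same in any language; here is a generic sketch in Python using exception `__cause__` chaining (in Java the equivalent loop walks Throwable.getCause()):

```python
def root_cause(exc: BaseException) -> BaseException:
    """Follow an exception's cause chain to its deepest cause."""
    while exc.__cause__ is not None:
        exc = exc.__cause__
    return exc

# Build a three-level chain shaped like the trace above:
# ExceptionInInitializerError -> RuntimeException -> InvocationTargetException.
try:
    try:
        try:
            raise ValueError("real failure in the native constructor")
        except ValueError as inner:
            raise RuntimeError("reflective wrapper") from inner
    except RuntimeError as mid:
        raise OSError("initializer error") from mid
except OSError as outer:
    assert isinstance(root_cause(outer), ValueError)
```

Reporting root_cause(e) instead of the outermost wrapper is what turns an opaque ExceptionInInitializerError into the actual Zing-specific failure.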
[jira] [Commented] (CASSANDRA-9279) Gossip (and mutations) lock up on startup
[ https://issues.apache.org/jira/browse/CASSANDRA-9279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527578#comment-14527578 ] Benedict commented on CASSANDRA-9279:
---

bq. We should instead exit when the CLA dies.

The behaviours on CLA error were well defined by CASSANDRA-6364. The problem here is that we have a race condition: the services are shut down before they have actually started, and the start/stop transitions aren't unidirectional (you can start, then stop, then start again; or, as in this case, stop and then start without having actually stopped). There are lots of ways to try fixing this. It would be nice to make the transitions unidirectional unless explicitly invoked by user interaction. But perhaps we should just have a variable that indicates the system has correctly started up, and if it hasn't, die completely on any FS error (either CL or otherwise).

Gossip (and mutations) lock up on startup
-----------------------------------------

                 Key: CASSANDRA-9279
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9279
             Project: Cassandra
          Issue Type: Bug
            Reporter: Sebastian Estevez
            Assignee: Benedict
             Fix For: 2.0.x
         Attachments: Screen Shot 2015-04-30 at 4.41.57 PM.png

Cluster running 2.0.14.352 on EC2 (c3.4xl instances). 2 nodes out of 8 exhibited the following behavior. When starting up the node we noticed it was gray in OpsCenter; other monitoring tools showed it as up. It turned out gossip tasks were piling up, and we could see the following in the system.log:
{code}
WARN [GossipTasks:1] 2015-04-30 20:22:29,512 Gossiper.java (line 671) Gossip stage has 4270 pending tasks; skipping status check (no nodes will be marked down)
WARN [GossipTasks:1] 2015-04-30 20:22:30,612 Gossiper.java (line 671) Gossip stage has 4272 pending tasks; skipping status check (no nodes will be marked down)
WARN [GossipTasks:1] 2015-04-30 20:22:31,713 Gossiper.java (line 671) Gossip stage has 4273 pending tasks; skipping status check (no nodes will be marked down)
...
{code}
and tpstats shows blocked tasks for gossip and mutations:
{code}
GossipStage 1 3904 29384 0 0
{code}
The CPUs are inactive (see attachment), and dstat output:
{code}
You did not select any stats, using -cdngy by default.
total-cpu-usage -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read writ| recv send| in out | int csw
2 0 97 0 0 0|1324k 1381k| 0 0 | 0 0 |6252 5548
0 0 100 0 0 0| 064k| 42k 1017k| 0 0 |3075 2537
0 0 99 0 0 0| 0 8192B| 39k 794k| 0 0 |6999 7039
0 0 100 0 0 0| 0 0 | 39k 759k| 0 0 |3067 2726
0 0 99 0 0 0| 0 184k| 48k 1086k| 0 0 |4829 4178
0 0 99 0 0 0| 0 8192B| 34k 802k| 0 0 |1671 1240
0 0 100 0 0 0| 0 8192B| 48k 1067k| 0 0 |1878 1193
{code}
I managed to grab a thread dump: https://gist.githubusercontent.com/anonymous/3b7b4698c32032603493/raw/read.md and dmesg: https://gist.githubusercontent.com/anonymous/5982b15337c9afbd5d49/raw/f3c2e4411b9d59e90f4615d93c7c1ad25922e170/read.md
Restarting the node solved the issue (it came up normally). We don't know what is causing it, but apparently (per the thread dump) gossip threads are blocked writing the system keyspace, and those writes are waiting on the commitlog.
Gossip:
{code}
GossipStage:1 daemon prio=10 tid=0x7ffa23471800 nid=0xa13fa waiting on condition [0x7ff9cbe26000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for 0x0005d3f50960 (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:964)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
	at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731)
	at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:351)
	at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:336)
	at org.apache.cassandra.db.RowMutation.apply(RowMutation.java:211)
	at org.apache.cassandra.cql3.statements.ModificationStatement.executeInternal(ModificationStatement.java:709)
	at org.apache.cassandra.cql3.QueryProcessor.processInternal(QueryProcessor.java:208)
	at
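Benedict's suggestion, a flag recording whether startup has fully completed, with any filesystem error before that point treated as fatal, can be sketched as a small state model. This is an illustration of the proposed policy, not Cassandra's actual error-handling code; "die" stands in for killing the JVM:

```python
class FsErrorPolicy:
    """Model of 'die on any FS error until startup has fully completed'."""

    def __init__(self, disk_failure_policy="stop"):
        self.disk_failure_policy = disk_failure_policy
        self.started = False  # flipped exactly once, at the end of startup

    def mark_started(self):
        """Called only after every service has come up successfully."""
        self.started = True

    def handle_fs_error(self):
        """Return the action to take for a filesystem error."""
        if not self.started:
            # Before startup completes, any FS error (commit log or
            # otherwise) is unrecoverable: partially started services
            # must not keep running and adversely affect the cluster.
            return "die"
        # After startup, honor the configured disk failure policy.
        return self.disk_failure_policy

policy = FsErrorPolicy(disk_failure_policy="stop")
assert policy.handle_fs_error() == "die"   # CLA death during startup: exit
policy.mark_started()
assert policy.handle_fs_error() == "stop"  # post-startup: policy applies
```

Because the flag is set exactly once and only after startup succeeds, the stop-then-start race described above cannot leave the node half-alive: any error in that window exits the process instead of applying the "stop" policy to services that were never properly running.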
[jira] [Commented] (CASSANDRA-4938) CREATE INDEX can block for creation now that schema changes may be concurrent
[ https://issues.apache.org/jira/browse/CASSANDRA-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527583#comment-14527583 ] Kirk True commented on CASSANDRA-4938:
---

Any updates on this?

CREATE INDEX can block for creation now that schema changes may be concurrent
-----------------------------------------------------------------------------

                 Key: CASSANDRA-4938
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4938
             Project: Cassandra
          Issue Type: Wish
            Reporter: Krzysztof Cieslinski Cognitum
            Assignee: Kirk True
            Priority: Minor
              Labels: lhf
             Fix For: 3.x

The response from the CREATE INDEX command arrives before the secondary index has actually been built, so the code below:
{code:xml}
CREATE INDEX ON tab(name);
SELECT * FROM tab WHERE name = 'Chris';
{code}
doesn't return any rows (even though column family tab does contain records with name = 'Chris') or any errors (I would expect something like ??Bad Request: No indexed columns present in by-columns clause with Equal operator??). Inserting a delay between those two commands resolves the problem, so:
{code:xml}
CREATE INDEX ON tab(name);
Sleep(timeout); // for a column family with 2000 rows the timeout had to be set to ~1 second
SELECT * FROM tab WHERE name = 'Chris';
{code}
will return all rows with values as specified. I'm using a single-node cluster.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9279) Gossip (and mutations) lock up on startup
[ https://issues.apache.org/jira/browse/CASSANDRA-9279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527378#comment-14527378 ] Brandon Williams edited comment on CASSANDRA-9279 at 5/4/15 9:50 PM:
---

Here's another example:
{noformat}
ERROR [COMMIT-LOG-ALLOCATOR] 2015-05-02 04:38:54,102 CommitLog.java (line 420) Failed to allocate new commit log segments. Commit disk failure policy is stop; terminating thread
FSWriteError in /var/lib/cassandra/commitlog/CommitLog-3-1430566697635.log
	at org.apache.cassandra.db.commitlog.CommitLogSegment.init(CommitLogSegment.java:143)
	at org.apache.cassandra.db.commitlog.CommitLogAllocator$3.run(CommitLogAllocator.java:208)
	at org.apache.cassandra.db.commitlog.CommitLogAllocator$1.runMayThrow(CommitLogAllocator.java:99)
	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.FileNotFoundException: /var/lib/cassandra/commitlog/CommitLog-3-1430566697635.log (Permission denied)
	at java.io.RandomAccessFile.open(Native Method)
	at java.io.RandomAccessFile.init(RandomAccessFile.java:241)
	at org.apache.cassandra.db.commitlog.CommitLogSegment.init(CommitLogSegment.java:125)
	... 4 more
INFO [CompactionExecutor:4] 2015-05-02 04:38:54,130 CompactionTask.java (line 120) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/sstable_activity/system-sstable_activity-jb-500-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/sstable_activity/system-sstable_activity-jb-501-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/sstable_activity/system-sstable_activity-jb-499-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/sstable_activity/system-sstable_activity-jb-502-Data.db')]
INFO [CompactionExecutor:12] 2015-05-02 04:38:54,140 ColumnFamilyStore.java (line 795) Enqueuing flush of Memtable-compactions_in_progress@1153169288(419/4190 serialized/live bytes, 21 ops)
INFO [FlushWriter:1] 2015-05-02 04:38:54,141 Memtable.java (line 358) Writing Memtable-compactions_in_progress@1153169288(419/4190 serialized/live bytes, 21 ops)
INFO [CompactionExecutor:25] 2015-05-02 04:38:54,144 ColumnFamilyStore.java (line 795) Enqueuing flush of Memtable-compactions_in_progress@1811681988(441/4410 serialized/live bytes, 20 ops)
INFO [FlushWriter:2] 2015-05-02 04:38:54,144 Memtable.java (line 358) Writing Memtable-compactions_in_progress@1811681988(441/4410 serialized/live bytes, 20 ops)
INFO [FlushWriter:1] 2015-05-02 04:38:54,152 Memtable.java (line 398) Completed flushing /var/lib/cassandra/data/system/compactions_in_progress/system-compactions_in_progress-jb-138175-Data.db (245 bytes) for commitlog position ReplayPosition(segmentId=1430566697633, position=1621)
INFO [main] 2015-05-02 04:38:54,889 StorageService.java (line 515) Cassandra version: 2.0.14.352
INFO [main] 2015-05-02 04:38:54,889 StorageService.java (line 516) Thrift API version: 19.39.0
INFO [main] 2015-05-02 04:38:54,893 StorageService.java (line 517) CQL supported versions: 2.0.0,3.1.7 (default: 3.1.7)
INFO [main] 2015-05-02 04:38:54,911 StorageService.java (line 540) Loading persisted ring state
INFO [main] 2015-05-02 04:38:55,404 StorageService.java (line
678) Starting up server gossip
{noformat}
As you can see, the CLA died, effectively making the machine useless/problematic, but everything else started up and adversely affected the cluster. We should instead exit when the CLA dies.