[jira] [Commented] (CASSANDRA-9563) Rename class for DATE type in Java driver
[ https://issues.apache.org/jira/browse/CASSANDRA-9563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583136#comment-14583136 ] Robert Stupp commented on CASSANDRA-9563: - Would be great to have this for C* 2.2 rc2. Rename class for DATE type in Java driver - Key: CASSANDRA-9563 URL: https://issues.apache.org/jira/browse/CASSANDRA-9563 Project: Cassandra Issue Type: Improvement Reporter: Olivier Michallat Priority: Minor Fix For: 2.2.x An early preview of the Java driver 2.2 was provided for inclusion in Cassandra 2.2.0-rc1. It uses a custom Java type to represent CQL type {{DATE}}. Currently that Java type is called {{DateWithoutTime}}. We'd like to rename it to {{LocalDate}}. This would be a breaking change for Cassandra, because that type is visible from UDF implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
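As a sketch of why the rename is user-visible: a Java UDF over a CQL {{DATE}} argument receives an instance of the driver's date class directly, so the function body compiles against that class. The function name and method call below are illustrative assumptions about the driver 2.2 date API, not taken from the ticket:

```sql
-- Hypothetical UDF: 'd' arrives as the driver's Java type for DATE, so
-- renaming DateWithoutTime to LocalDate changes what UDF bodies reference.
CREATE FUNCTION year_of (d date)
    RETURNS NULL ON NULL INPUT
    RETURNS int
    LANGUAGE java
    AS 'return Integer.valueOf(d.getYear());';
```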
[jira] [Commented] (CASSANDRA-9424) 3.X Schema Improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-9424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583148#comment-14583148 ] Sylvain Lebresne commented on CASSANDRA-9424: - bq. Is it possible to make official way(API) to load schema offline? That is, the ability to read schema from stored SSTables without waking up unnecessary server components. I agree we should get this ultimately. What I'd suggest is to serialize the schema as a sstable metadata component (only the table the sstable is of, of course). This would be useful for offline tools, but I've wanted that for debugging more than once too. So I went ahead and created CASSANDRA-9587. 3.X Schema Improvements --- Key: CASSANDRA-9424 URL: https://issues.apache.org/jira/browse/CASSANDRA-9424 Project: Cassandra Issue Type: Improvement Reporter: Aleksey Yeschenko Fix For: 3.x C* schema code is both more brittle and less efficient than I'd like it to be. This ticket will aggregate the improvement tickets to go into 3.X and 4.X to improve the situation.
[jira] [Created] (CASSANDRA-9587) Serialize table schema as a sstable component
Sylvain Lebresne created CASSANDRA-9587: --- Summary: Serialize table schema as a sstable component Key: CASSANDRA-9587 URL: https://issues.apache.org/jira/browse/CASSANDRA-9587 Project: Cassandra Issue Type: Sub-task Reporter: Sylvain Lebresne Having the schema with each sstable would be tremendously useful for offline tools and for debugging purposes.
[jira] [Commented] (CASSANDRA-9586) ant eclipse-warnings fails in trunk
[ https://issues.apache.org/jira/browse/CASSANDRA-9586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583155#comment-14583155 ] Stefania commented on CASSANDRA-9586: - It's a false positive, because the channel is released in close(). Making the wrapper auto-closeable does not fix it. Passing in a channel that is closed in a try, and taking a new reference in the constructor, does fix it however. ant eclipse-warnings fails in trunk --- Key: CASSANDRA-9586 URL: https://issues.apache.org/jira/browse/CASSANDRA-9586 Project: Cassandra Issue Type: Bug Reporter: Michael Shuler Assignee: Stefania Fix For: 3.x
{noformat}
eclipse-warnings:
    [mkdir] Created dir: /home/mshuler/git/cassandra/build/ecj
     [echo] Running Eclipse Code Analysis. Output logged to /home/mshuler/git/cassandra/build/ecj/eclipse_compiler_checks.txt
     [java] incorrect classpath: /home/mshuler/git/cassandra/build/cobertura/classes
     [java] ----------
     [java] 1. ERROR in /home/mshuler/git/cassandra/src/java/org/apache/cassandra/io/util/RandomAccessReader.java (at line 81)
     [java]     super(new ChannelProxy(file), DEFAULT_BUFFER_SIZE, -1L, BufferType.OFF_HEAP);
     [java]           ^^
     [java] Potential resource leak: 'unassigned Closeable value' may not be closed
     [java] ----------
     [java] 1 problem (1 error)

BUILD FAILED
{noformat}
(checked 2.2 and did not find this issue) git blame on line 81 shows commit 17dd4cc for CASSANDRA-8897
[jira] [Created] (CASSANDRA-9588) Make sstableofflinerelevel print stats before relevel
Jens Rantil created CASSANDRA-9588: -- Summary: Make sstableofflinerelevel print stats before relevel Key: CASSANDRA-9588 URL: https://issues.apache.org/jira/browse/CASSANDRA-9588 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Jens Rantil Priority: Trivial The current version of sstableofflinerelevel prints the new level hierarchy. While nodetool cfstats ... will show the current hierarchy, it would be nice to have sstableofflinerelevel also output the current level histograms, for easy comparison of the changes that will be made. This is especially relevant since sstableofflinerelevel must run while the node isn't running, and nodetool cfstats ... doesn't work in that case.
[jira] [Commented] (CASSANDRA-9582) MarshalException after upgrading to 2.1.6
[ https://issues.apache.org/jira/browse/CASSANDRA-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583088#comment-14583088 ] Tom van den Berge commented on CASSANDRA-9582: --
{code}
 keyspace_name | columnfamily_name | column_name        | component_index | index_name | index_options | index_type | type           | validator
---------------+-------------------+--------------------+-----------------+------------+---------------+------------+----------------+---------------------------------------------
     drillster | InvoiceItem       | column1            |               0 |       null |          null |       null | clustering_key | org.apache.cassandra.db.marshal.UUIDType
     drillster | InvoiceItem       | currencyCode       |            null |       null |          null |       null | regular        | org.apache.cassandra.db.marshal.UTF8Type
     drillster | InvoiceItem       | description        |            null |       null |          null |       null | regular        | org.apache.cassandra.db.marshal.UTF8Type
     drillster | InvoiceItem       | key                |            null |       null |          null |       null | partition_key  | org.apache.cassandra.db.marshal.BytesType
     drillster | InvoiceItem       | priceGross         |            null |       null |          null |       null | regular        | org.apache.cassandra.db.marshal.LongType
     drillster | InvoiceItem       | priceNett          |            null |       null |          null |       null | regular        | org.apache.cassandra.db.marshal.LongType
     drillster | InvoiceItem       | quantity           |            null |       null |          null |       null | regular        | org.apache.cassandra.db.marshal.IntegerType
     drillster | InvoiceItem       | sku                |            null |       null |          null |       null | regular        | org.apache.cassandra.db.marshal.UTF8Type
     drillster | InvoiceItem       | unitPriceGross     |            null |       null |          null |       null | regular        | org.apache.cassandra.db.marshal.LongType
     drillster | InvoiceItem       | unitPriceNett      |            null |       null |          null |       null | regular        | org.apache.cassandra.db.marshal.LongType
     drillster | InvoiceItem       | vat                |            null |       null |          null |       null | regular        | org.apache.cassandra.db.marshal.LongType
     drillster | InvoiceItem       | vatRateBasisPoints |            null |       null |          null |       null | regular        | org.apache.cassandra.db.marshal.IntegerType
{code}
{code}
 keyspace_name               | drillster
 columnfamily_name           | InvoiceItem
 bloom_filter_fp_chance      | null
 caching                     | KEYS_ONLY
 column_aliases              | []
 comment                     |
 compaction_strategy_class   | org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
 compaction_strategy_options | {}
 comparator                  | org.apache.cassandra.db.marshal.UUIDType
 compression_parameters      | {}
 default_time_to_live        | 0
 default_validator           | org.apache.cassandra.db.marshal.BytesType
 dropped_columns             | null
 gc_grace_seconds            | 864000
 index_interval              | 128
 is_dense                    | False
 key_aliases                 | []
 key_validator               | org.apache.cassandra.db.marshal.BytesType
 local_read_repair_chance    | 0
 max_compaction_threshold    | 32
 memtable_flush_period_in_ms | 0
 min_compaction_threshold    | 4
 populate_io_cache_on_flush  | False
 read_repair_chance          | 1
 replicate_on_write          | True
 speculative_retry           | 99.0PERCENTILE
 subcomparator               | org.apache.cassandra.db.marshal.UTF8Type
 type                        | Super
 value_alias                 | null
{code}
MarshalException after upgrading to 2.1.6
[jira] [Commented] (CASSANDRA-9587) Serialize table schema as a sstable component
[ https://issues.apache.org/jira/browse/CASSANDRA-9587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583172#comment-14583172 ] Aleksey Yeschenko commented on CASSANDRA-9587: -- You don't mean the full schema, right? Only the schema for the table, and dependent user types? Serialize table schema as a sstable component - Key: CASSANDRA-9587 URL: https://issues.apache.org/jira/browse/CASSANDRA-9587 Project: Cassandra Issue Type: Sub-task Reporter: Sylvain Lebresne Fix For: 3.x Having the schema with each sstable would be tremendously useful for offline tools and for debugging purposes.
[jira] [Comment Edited] (CASSANDRA-9160) Migrate CQL dtests to unit tests
[ https://issues.apache.org/jira/browse/CASSANDRA-9160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583107#comment-14583107 ] Stefania edited comment on CASSANDRA-9160 at 6/12/15 7:43 AM: -- The dtests are ready for review, I've already created a pull request: https://github.com/riptano/cassandra-dtest/pull/321. The problem with the cas unit tests failure was because of the CAS ballot time uuid, which I had incorrectly set to request.now in ModificationStatement.casInternal(). I fixed it so that it should always be bigger than the timestamp returned by QueryState. SP.beginAndRepairPaxos() does something similar, but it doesn't look 100% correct to me. It might still fail under heavy load, what do you think? A tentative rearrangement, pending CI: I had to move all {{CQLTester}} based tests into a separate folder, _validation_, to distinguish the CQL tests from the following: - tests based on {{SchemaLoader}}, occupying file names that we needed, such as BatchTests or DeleteTest - unit tests for Java classes (e.g. cql3/statements/SelectStatementTest in 2.1) Inside this new folder I created these sub-folders: - _operations_, for statements - _entities_, for collections, secondary index, various types - _util_, to host CQLTester - _miscellaneous_, for everything else. I am not too happy with the _validation_ folder so if you can think of something else do tell, we could perhaps move them somewhere else entirely. was (Author: stefania): The dtests are ready for review, I've already created a pull request: https://github.com/riptano/cassandra-dtest/pull/321. The problem with the cas unit tests failure was because of the CAS ballot time uuid, which I had incorrectly set to request.now in ModificationStatement.casInternal(). I fixed it so that it should always be bigger than the timestamp returned by QueryState..SP.beginAndRepairPaxos() does something similar, but it doesn't look 100% correct to me.
It might still fail under heavy load, what do you think? A tentative rearrangement, pending CI: I had to move all {{CQLTester}} based tests into a separate folder, _validation_, to distinguish the CQL tests from the following: - tests based on {{SchemaLoader}}, occupying file names that we needed, such as BatchTests or DeleteTest - unit tests for Java classes (e.g. cql3/statements/SelectStatementTest in 2.1) Inside this new folder I created these sub-folders: - _operations_, for statements - _entities_, for collections, secondary index, various types - _util_, to host CQLTester - _miscellaneous_, for everything else. I am not too happy with the _validation_ folder so if you can think of something else do tell, we could perhaps move them somewhere else entirely. Migrate CQL dtests to unit tests Key: CASSANDRA-9160 URL: https://issues.apache.org/jira/browse/CASSANDRA-9160 Project: Cassandra Issue Type: Test Reporter: Sylvain Lebresne Assignee: Stefania We have CQL tests in 2 places: dtests and unit tests. The unit tests are actually somewhat better in the sense that they have the ability to test both prepared and unprepared statements at the flip of a switch. It's also better to have all those tests in the same place so we can improve the test framework in only one place (CASSANDRA-7959, CASSANDRA-9159, etc...). So we should move the CQL dtests to the unit tests (which will be a good occasion to organize them better).
[jira] [Created] (CASSANDRA-9589) Unclear difference between Improvement and Wish in JIRA
Jens Rantil created CASSANDRA-9589: -- Summary: Unclear difference between Improvement and Wish in JIRA Key: CASSANDRA-9589 URL: https://issues.apache.org/jira/browse/CASSANDRA-9589 Project: Cassandra Issue Type: Bug Components: Documentation website, Tools Reporter: Jens Rantil Priority: Trivial The JIRA issue types Wish and Improvement sound the same to me; every time, I have no idea which of them I should choose. Filing this bug to 1) get clarity, 2) propose that one of them be merged into the other, or 3) rename them to make it clear why they differ.
cassandra git commit: ninja suppresswarnings
Repository: cassandra
Updated Branches:
  refs/heads/trunk 887bbc141 -> b1abcd048

ninja suppresswarnings

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b1abcd04
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b1abcd04
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b1abcd04

Branch: refs/heads/trunk
Commit: b1abcd048e3780f11256e455e6024dcf05887f71
Parents: 887bbc1
Author: Benedict Elliott Smith bened...@apache.org
Authored: Fri Jun 12 11:58:42 2015 +0100
Committer: Benedict Elliott Smith bened...@apache.org
Committed: Fri Jun 12 11:58:42 2015 +0100

----------------------------------------------------------------------
 src/java/org/apache/cassandra/io/util/RandomAccessReader.java | 1 +
 1 file changed, 1 insertion(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b1abcd04/src/java/org/apache/cassandra/io/util/RandomAccessReader.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/io/util/RandomAccessReader.java b/src/java/org/apache/cassandra/io/util/RandomAccessReader.java
index fef206e..c4be8e9 100644
--- a/src/java/org/apache/cassandra/io/util/RandomAccessReader.java
+++ b/src/java/org/apache/cassandra/io/util/RandomAccessReader.java
@@ -76,6 +76,7 @@ public class RandomAccessReader extends AbstractDataInput implements FileDataInp
     // not have a shared channel.
     private static class RandomAccessReaderWithChannel extends RandomAccessReader
     {
+        @SuppressWarnings("resource")
         RandomAccessReaderWithChannel(File file)
         {
             super(new ChannelProxy(file), DEFAULT_BUFFER_SIZE, -1L, BufferType.OFF_HEAP);
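The warning being suppressed here follows a common false-positive pattern: a resource is created in a this()/super() call and handed to the wrapper, which releases it in its own close(). A minimal self-contained sketch of that pattern (class and method names are hypothetical, not Cassandra's):

```java
import java.io.Closeable;

// DummyChannel stands in for ChannelProxy: a Closeable the wrapper owns.
class DummyChannel implements Closeable
{
    boolean closed = false;

    @Override
    public void close() { closed = true; }
}

// ChannelReader stands in for RandomAccessReaderWithChannel: it creates the
// resource in a this() call, so static analysis flags a potential leak even
// though the resource is reliably released in close().
class ChannelReader implements Closeable
{
    private final DummyChannel channel;

    @SuppressWarnings("resource") // false positive: the channel is closed in close()
    ChannelReader() { this(new DummyChannel()); }

    ChannelReader(DummyChannel channel) { this.channel = channel; }

    DummyChannel channel() { return channel; }

    @Override
    public void close() { channel.close(); } // the owned resource is released here
}

public class ResourceWarningSketch
{
    public static void main(String[] args)
    {
        ChannelReader reader = new ChannelReader();
        reader.close();
        if (!reader.channel().closed)
            throw new AssertionError("reader must close the channel it owns");
        System.out.println("ok");
    }
}
```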
[jira] [Commented] (CASSANDRA-8099) Refactor and modernize the storage engine
[ https://issues.apache.org/jira/browse/CASSANDRA-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583262#comment-14583262 ] Benedict commented on CASSANDRA-8099: - I've pushed a small semantic-changing suggestion for serialization and merging of RTs [here|https://github.com/belliottsmith/cassandra/tree/8099-RTMarker] I'm happy to split this (and further changes) out into a separate ticket, but while this does cross the threshold for discussion/mention, it's actually a pretty small/contained change. Basically, on a RT boundary, instead of issuing a close _and_ open marker, we just issue the new open marker - both during merge and serialization. On read, encountering an open marker when we _already_ have one open for that iterator is treated as a close/open pair. This not only reduces storage on disk, especially for large records (where RT markers are both more frequent and, obviously, larger), but also gets rid of the UnfilteredRowIterators.MergedUnfiltered ugliness. Refactor and modernize the storage engine - Key: CASSANDRA-8099 URL: https://issues.apache.org/jira/browse/CASSANDRA-8099 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Fix For: 3.0 beta 1 Attachments: 8099-nit The current storage engine (which for this ticket I'll loosely define as the code implementing the read/write path) is suffering from old age. One of the main problems is that the only structure it deals with is the cell, which completely ignores the higher-level CQL structure that groups cells into (CQL) rows. This leads to many inefficiencies, like the fact that during a read we have to group cells multiple times (to count on the replica, then to count on the coordinator, then to produce the CQL resultset) because we forget about the grouping right away each time (so lots of useless cell name comparisons in particular).
But beyond inefficiencies, having to manually recreate the CQL structure every time we need it for something is hindering new features and makes the code more complex than it should be. Said storage engine also has tons of technical debt. To pick an example, the fact that during range queries we update {{SliceQueryFilter.count}} is pretty hacky and error prone. Or the overly complex lengths {{AbstractQueryPager}} has to go to simply to remove the last query result. So I want to bite the bullet and modernize this storage engine. I propose to do 2 main things: # Make the storage engine more aware of the CQL structure. In practice, instead of having partitions be a simple iterable map of cells, it should be an iterable list of rows (each being itself composed of per-column cells, though obviously not exactly the same kind of cell we have today). # Make the engine more iterative. What I mean here is that in the read path, we end up reading all cells in memory (we put them in a ColumnFamily object), but there is really no reason to. If instead we were working with iterators all the way through, we could get to a point where we're basically transferring data from disk to the network, and we should be able to reduce GC substantially. Please note that such a refactor should provide some performance improvements right off the bat, but that's not its primary goal either. Its primary goal is to simplify the storage engine and add abstractions that are better suited to further optimizations.
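The range-tombstone marker change in the comment above can be sketched as a small decoder: a serialized stream drops the close marker at each boundary, and the reader treats "open while already open" as an implicit close/open pair. All names here are illustrative, not Cassandra's actual classes:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RTMarkerSketch
{
    static final int OPEN = 0, CLOSE = 1;

    // A serialized range-tombstone marker: kind (open/close) and its position.
    static class Marker
    {
        final int kind;
        final int position;
        Marker(int kind, int position) { this.kind = kind; this.position = position; }
    }

    // Decode a marker stream back into [start, end) ranges. An OPEN seen while
    // a range is already open closes the previous range at the same position.
    static List<int[]> decode(List<Marker> stream)
    {
        List<int[]> ranges = new ArrayList<>();
        int start = 0;
        boolean isOpen = false;
        for (Marker m : stream)
        {
            if (m.kind == OPEN)
            {
                if (isOpen) // implicit close/open pair at the boundary
                    ranges.add(new int[]{ start, m.position });
                start = m.position;
                isOpen = true;
            }
            else // explicit close ends the current range
            {
                ranges.add(new int[]{ start, m.position });
                isOpen = false;
            }
        }
        return ranges;
    }

    public static void main(String[] args)
    {
        // Two adjacent ranges [1,5) and [5,9) need three markers instead of four.
        List<int[]> rs = decode(Arrays.asList(new Marker(OPEN, 1),
                                              new Marker(OPEN, 5),
                                              new Marker(CLOSE, 9)));
        if (rs.size() != 2 || rs.get(0)[1] != 5 || rs.get(1)[0] != 5)
            throw new AssertionError("unexpected decode result");
        System.out.println("ok");
    }
}
```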
[jira] [Commented] (CASSANDRA-9587) Serialize table schema as a sstable component
[ https://issues.apache.org/jira/browse/CASSANDRA-9587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583278#comment-14583278 ] Sylvain Lebresne commented on CASSANDRA-9587: - Yes, I mean only the information relevant to the sstable. Serialize table schema as a sstable component - Key: CASSANDRA-9587 URL: https://issues.apache.org/jira/browse/CASSANDRA-9587 Project: Cassandra Issue Type: Sub-task Reporter: Sylvain Lebresne Fix For: 3.x Having the schema with each sstable would be tremendously useful for offline tools and for debugging purposes.
[jira] [Commented] (CASSANDRA-9581) pig-tests spend time waiting on /dev/random for SecureRandom
[ https://issues.apache.org/jira/browse/CASSANDRA-9581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583327#comment-14583327 ] Joshua McKenzie commented on CASSANDRA-9581: Welp - that's totally on me for not checking those 2 links. Sorry about that and thanks for confirming. :) pig-tests spend time waiting on /dev/random for SecureRandom Key: CASSANDRA-9581 URL: https://issues.apache.org/jira/browse/CASSANDRA-9581 Project: Cassandra Issue Type: Test Reporter: Ariel Weisberg Assignee: Ariel Weisberg We don't need secure random numbers (for unit tests), so waiting for entropy doesn't make much sense. Luckily, Java makes it easy to point to /dev/urandom for entropy. It also transparently handles it correctly on Windows.
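The usual way to point the JVM's SecureRandom at /dev/urandom is the `java.security.egd` system property. A hedged sketch (the ant invocation in the comment is illustrative; the ticket does not say exactly where the flag was added):

```shell
# The standard JVM property redirecting SecureRandom seeding to the
# non-blocking /dev/urandom. The extra "/./" works around older JDKs that
# special-cased the literal value "file:/dev/urandom".
EGD_FLAG='-Djava.security.egd=file:/dev/./urandom'

# It would then be passed to the test JVMs, e.g. (invocation illustrative):
#   JAVA_TOOL_OPTIONS="$EGD_FLAG" ant pig-test
echo "$EGD_FLAG"
```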
[jira] [Created] (CASSANDRA-9590) Support for both encrypted and unencrypted native transport connections
Stefan Podkowinski created CASSANDRA-9590: - Summary: Support for both encrypted and unencrypted native transport connections Key: CASSANDRA-9590 URL: https://issues.apache.org/jira/browse/CASSANDRA-9590 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stefan Podkowinski Enabling encryption for native transport currently turns SSL exclusively on or off for the opened socket. Migrating from plain to encrypted requires migrating all native clients as well, and redeploying all of them at the same time after starting the SSL-enabled Cassandra nodes. This patch would allow starting Cassandra with both an unencrypted and an SSL-enabled native port. Clients can connect to either, based on whether they support SSL. This has been implemented by introducing a new {{native_transport_port_ssl}} config option. There would be three scenarios: * client encryption disabled: native_transport_port unencrypted, port_ssl not used * client encryption enabled, port_ssl not set: encrypted native_transport_port * client encryption enabled and port_ssl set: native_transport_port unencrypted, port_ssl encrypted This approach would keep configuration behavior fully backwards compatible. Patch proposal (tests will be added later in case people speak out in favor of the patch): [Diff trunk|https://github.com/apache/cassandra/compare/trunk...spodkowinski:feat/optionalnativessl], [Patch against trunk|https://github.com/apache/cassandra/compare/trunk...spodkowinski:feat/optionalnativessl.patch]
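The third scenario could look roughly as follows in cassandra.yaml. The {{native_transport_port_ssl}} option name comes from the ticket; the surrounding keys, port number, and keystore values are illustrative assumptions:

```yaml
# Scenario 3: both ports served, so clients can migrate to SSL at their own pace.
native_transport_port: 9042        # remains unencrypted
native_transport_port_ssl: 9142    # proposed: SSL-only port; omit for old behavior
client_encryption_options:
    enabled: true
    keystore: conf/.keystore       # illustrative path
    keystore_password: cassandra   # illustrative credential
```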
[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583297#comment-14583297 ] Marcus Eriksson commented on CASSANDRA-9045: I've been looking at this again today, but I have to say I have no idea what is going on; I'm not able to reproduce. Could you post your current schema (describe table bounces;) and logs between 2015-06-04T11:31:38 and 2015-06-08T08:27:36 for the nodes involved in your last example? Could you also run tools/bin/sstablemetadata over the sstables on one of those nodes, just to check that the timestamps look ok? Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Marcus Eriksson Priority: Critical Fix For: 2.0.x Attachments: 9045-debug-tracing.txt, another.txt, apache-cassandra-2.0.13-SNAPSHOT.jar, cqlsh.txt, debug.txt, inconsistency.txt Hey guys, After almost a week of researching the issue and trying out multiple things with (almost) no luck, I was advised (on the user@cass list) to file a report here. h5. Setup Cassandra 2.0.13 (we had the issue with 2.0.10 as well and upgraded to see if it goes away) Multi-datacenter 12+6 node cluster. h5.
Schema
{code}
cqlsh> describe keyspace blackbook;

CREATE KEYSPACE blackbook WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'IAD': '3',
  'ORD': '3'
};

USE blackbook;

CREATE TABLE bounces (
  domainid text,
  address text,
  message text,
  timestamp bigint,
  PRIMARY KEY (domainid, address)
) WITH
  bloom_filter_fp_chance=0.10 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.10 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=0.00 AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='99.0PERCENTILE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};
{code}
h5. Use case
Each row (defined by a domainid) can have many, many columns (bounce entries), so rows can get pretty wide. In practice, most of the rows are not that big but some of them contain hundreds of thousands and even millions of columns. Columns are not TTL'ed but can be deleted using the following CQL3 statement:
{code}
delete from bounces where domainid = 'domain.com' and address = 'al...@example.com';
{code}
All queries are performed using LOCAL_QUORUM CL.
h5. Problem
We weren't very diligent about running repairs on the cluster initially, but shortly after we started doing it we noticed that some of the previously deleted columns (bounce entries) are there again, as if tombstones have disappeared. I have run this test multiple times via cqlsh, on the row of the customer who originally reported the issue:
* delete an entry
* verify it's not returned even with CL=ALL
* run repair on nodes that own this row's key
* the columns reappear and are returned even with CL=ALL
I tried the same test on another row with much less data and everything was correctly deleted and didn't reappear after repair.
h5. Other steps I've taken so far
Made sure NTP is running on all servers and clocks are synchronized.
Increased gc_grace_seconds to 100 days, ran full repair (on the affected keyspace) on all nodes, then changed it back to the default 10 days again. Didn't help. Performed one more test. Updated one of the resurrected columns, then deleted it and ran repair again. This time the updated version of the column reappeared. Finally, I noticed these log entries for the row in question:
{code}
INFO [ValidationExecutor:77] 2015-03-25 20:27:43,936 CompactionController.java (line 192) Compacting large row blackbook/bounces:4ed558feba8a483733001d6a (279067683 bytes) incrementally
{code}
Figuring it may be related, I bumped in_memory_compaction_limit_in_mb to 512MB so the row fits into it, deleted the entry and ran repair once again. The log entry for this row was gone and the columns didn't reappear. We have a lot of rows much larger than 512MB, so we can't increase this parameter forever, if that is indeed the issue. Please let me know if you need more information on the case or if I can run more experiments. Thanks! Roman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7918) Provide graphing tool along with cassandra-stress
[ https://issues.apache.org/jira/browse/CASSANDRA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584019#comment-14584019 ] Joshua McKenzie commented on CASSANDRA-7918: Fair point. My initial thought was 1 arg to name the output file (i.e. name for the test you're doing) and the rest passed through to stress, but as you said it's not a pressing question. Provide graphing tool along with cassandra-stress - Key: CASSANDRA-7918 URL: https://issues.apache.org/jira/browse/CASSANDRA-7918 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Benedict Assignee: Ryan McGuire Priority: Minor Attachments: 7918.patch, reads.svg Whilst cstar makes some pretty graphs, they're a little limited and also require you to run your tests through it. It would be useful to be able to graph results from any stress run easily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9567) Windows does not handle ipv6 addresses
[ https://issues.apache.org/jira/browse/CASSANDRA-9567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9567: --- Reviewer: Joshua McKenzie Attachment: 9567.txt Windows does not handle ipv6 addresses -- Key: CASSANDRA-9567 URL: https://issues.apache.org/jira/browse/CASSANDRA-9567 Project: Cassandra Issue Type: Bug Reporter: Philip Thompson Assignee: Philip Thompson Fix For: 3.x, 2.1.x, 2.2.x Attachments: 9567.txt In cassandra.ps1, we are pulling the listen and rpc addresses from the yaml by splitting on {{:}}, then selecting [1] from the resulting split, to separate the yaml key from the value. Unfortunately, because ipv6 addresses contain {{:}} characters, this means we are not grabbing the whole address, causing problems when starting the node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
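The failure mode described can be illustrated outside PowerShell; the snippet below is a hedged sketch (with a made-up `listen_address` value) showing why splitting on every {{:}} truncates an ipv6 address, while splitting once on the key/value separator {{: }} keeps it intact:

```python
# A yaml line with an ipv6 value: the address itself contains ":" characters.
line = "listen_address: fe80::202:b3ff:fe1e:8329"

# Buggy approach: split on every ":" and take element [1] -> only "fe80" survives.
broken = line.split(":")[1].strip()

# Fixed approach: yaml block mappings require a space after the key's ":",
# so splitting once on ": " recovers the full address.
key, value = line.split(": ", 1)

assert broken == "fe80"
assert value == "fe80::202:b3ff:fe1e:8329"
```

The same one-split-on-`": "` idea is what the attached patch relies on, per the comment on this ticket about yaml requiring a space after the colon.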
[jira] [Updated] (CASSANDRA-9591) Scrub (recover) sstables even when -Index.db is missing
[ https://issues.apache.org/jira/browse/CASSANDRA-9591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-9591: - Fix Version/s: (was: 2.0.15) 2.0.x Scrub (recover) sstables even when -Index.db is missing --- Key: CASSANDRA-9591 URL: https://issues.apache.org/jira/browse/CASSANDRA-9591 Project: Cassandra Issue Type: Improvement Reporter: mck Assignee: mck Labels: sstablescrub Fix For: 2.0.x Attachments: 9591-2.0.txt Today SSTableReader needs at minimum 3 files to load an sstable:
- -Data.db
- -CompressionInfo.db
- -Index.db
But during the scrub process the -Index.db file isn't actually necessary, unless there's corruption in the -Data.db file and we want to be able to skip over corrupted rows. Given that there is still a fair chance that there's nothing wrong with the -Data.db file and we're just missing the -Index.db file, this patch addresses that situation. The following patch makes it possible for the StandaloneScrubber (sstablescrub) to recover sstables despite missing -Index.db files. This can happen after a catastrophic incident where data directories have been lost and/or corrupted, or wiped and the backup not healthy. I'm aware that normally one depends on replicas or snapshots to avoid such situations, but such catastrophic incidents do occur in the wild. I have not tested this patch against normal c* operations and all the other (more critical) ways SSTableReader is used. I'll happily do that and add the needed unit tests if people see merit in accepting the patch. Otherwise the patch can live with the issue, in case anyone else needs it. There's also a cassandra distribution bundled with the patch [here|https://github.com/michaelsembwever/cassandra/releases/download/2.0.15-recover-sstables-without-indexdb/apache-cassandra-2.0.15-recover-sstables-without-indexdb.tar.gz] to make life a little easier for anyone finding themselves in such a bad situation.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-9596) Tombstone timestamps aren't used to skip SSTables while they are still in the memtable
Richard Low created CASSANDRA-9596: -- Summary: Tombstone timestamps aren't used to skip SSTables while they are still in the memtable Key: CASSANDRA-9596 URL: https://issues.apache.org/jira/browse/CASSANDRA-9596 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Richard Low Fix For: 2.0.x If you have one SSTable containing a partition-level tombstone at timestamp t, and all other SSTables only have cells with timestamps older than t, Cassandra will skip all the other SSTables and return nothing quickly. However, if the partition tombstone is still in the memtable it doesn't skip any SSTables. It should use the same timestamp logic to skip all SSTables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
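A minimal sketch of the timestamp-skipping logic in question, with illustrative names (not Cassandra's actual read path): an sstable whose newest cell is older than the partition tombstone cannot contribute live data, so it can be skipped entirely.

```python
# Hypothetical model: given the partition tombstone's timestamp and each
# sstable's maximum cell timestamp, return the indices of sstables that
# still need to be read (everything else is fully shadowed by the tombstone).
def sstables_to_read(tombstone_timestamp, sstable_max_timestamps):
    return [i for i, max_ts in enumerate(sstable_max_timestamps)
            if max_ts > tombstone_timestamp]

# A tombstone at t=100 shadows sstables whose newest data is at or below 100.
assert sstables_to_read(100, [50, 90, 100]) == []
assert sstables_to_read(100, [50, 150]) == [1]
```

The ticket's point is that this check is applied when the tombstone has been flushed to an sstable, but not while it still sits in the memtable, even though the same comparison would be valid there.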
[jira] [Assigned] (CASSANDRA-9567) Windows does not handle ipv6 addresses
[ https://issues.apache.org/jira/browse/CASSANDRA-9567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson reassigned CASSANDRA-9567: -- Assignee: Philip Thompson (was: Joshua McKenzie) Windows does not handle ipv6 addresses -- Key: CASSANDRA-9567 URL: https://issues.apache.org/jira/browse/CASSANDRA-9567 Project: Cassandra Issue Type: Bug Reporter: Philip Thompson Assignee: Philip Thompson Fix For: 3.x, 2.1.x, 2.2.x In cassandra.ps1, we are pulling the listen and rpc addresses from the yaml by splitting on {{:}}, then selecting [1] from the resulting split, to separate the yaml key from the value. Unfortunately, because ipv6 addresses contain {{:}} characters, this means we are not grabbing the whole address, causing problems when starting the node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8061) tmplink files are not removed
[ https://issues.apache.org/jira/browse/CASSANDRA-8061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583998#comment-14583998 ] Benedict commented on CASSANDRA-8061: - [~rstrickland]: could you confirm you're definitely seeing these files stick around indefinitely? That would be an independent problem to the cfstats issue, which happens without this issue. CASSANDRA-9580 has already been posted for that, and a fix is available (although we may tweak the fix before release) tmplink files are not removed - Key: CASSANDRA-8061 URL: https://issues.apache.org/jira/browse/CASSANDRA-8061 Project: Cassandra Issue Type: Bug Components: Core Environment: Linux Reporter: Gianluca Borello Assignee: Joshua McKenzie Fix For: 2.1.x Attachments: 8061_v1.txt, 8248-thread_dump.txt After installing 2.1.0, I'm experiencing a bunch of tmplink files that are filling my disk. I found https://issues.apache.org/jira/browse/CASSANDRA-7803 and that is very similar, and I confirm it happens both on 2.1.0 as well as from the latest commit on the cassandra-2.1 branch (https://github.com/apache/cassandra/commit/aca80da38c3d86a40cc63d9a122f7d45258e4685 from the cassandra-2.1) Even starting with a clean keyspace, after a few hours I get: {noformat} $ sudo find /raid0 | grep tmplink | xargs du -hs 2.7G /raid0/cassandra/data/draios/protobuf1-ccc6dce04beb11e4abf997b38fbf920b/draios-protobuf1-tmplink-ka-4515-Data.db 13M /raid0/cassandra/data/draios/protobuf1-ccc6dce04beb11e4abf997b38fbf920b/draios-protobuf1-tmplink-ka-4515-Index.db 1.8G /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-1788-Data.db 12M /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-1788-Index.db 5.2M /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-2678-Index.db 822M 
/raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-2678-Data.db 7.3M /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-3283-Index.db 1.2G /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-3283-Data.db 6.7M /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-3951-Index.db 1.1G /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-3951-Data.db 11M /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-4799-Index.db 1.7G /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-4799-Data.db 812K /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-234-Index.db 122M /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-208-Data.db 744K /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-739-Index.db 660K /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-193-Index.db 796K /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-230-Index.db 137M /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-230-Data.db 161M /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-269-Data.db 139M 
/raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-234-Data.db 940K /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-786-Index.db 936K /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-269-Index.db 161M /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-786-Data.db 672K /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-197-Index.db 113M
[jira] [Commented] (CASSANDRA-8061) tmplink files are not removed
[ https://issues.apache.org/jira/browse/CASSANDRA-8061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584007#comment-14584007 ] Robbie Strickland commented on CASSANDRA-8061: -- [~benedict] I can confirm that I made an invalid assumption that the issue was related; in fact the files are transient and the issue is CASSANDRA-9580. tmplink files are not removed - Key: CASSANDRA-8061 URL: https://issues.apache.org/jira/browse/CASSANDRA-8061 Project: Cassandra Issue Type: Bug Components: Core Environment: Linux Reporter: Gianluca Borello Assignee: Joshua McKenzie Fix For: 2.1.x Attachments: 8061_v1.txt, 8248-thread_dump.txt
[jira] [Updated] (CASSANDRA-8831) Create a system table to expose prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-8831: -- Labels: client-impacting docs-impacting (was: client-impacting doc-impacting) Create a system table to expose prepared statements --- Key: CASSANDRA-8831 URL: https://issues.apache.org/jira/browse/CASSANDRA-8831 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Robert Stupp Labels: client-impacting, docs-impacting Fix For: 3.x Attachments: 8831-3.0-v1.txt, 8831-v1.txt, 8831-v2.txt Because drivers abstract from users the handling of up/down nodes, they have to deal with the fact that when a node is restarted (or joins), it won't know any prepared statements. Drivers could somewhat ignore that problem and wait for a query to return an error (that the statement is unknown by the node) to re-prepare the query on that node, but it's relatively inefficient because every time a node comes back up, you'll get bad latency spikes due to some queries first failing, then being re-prepared and only then being executed. So instead, drivers (at least the java driver, but I believe others do as well) pro-actively re-prepare statements when a node comes up. It solves the latency problem, but currently every driver instance blindly re-prepares all statements, meaning that in a large cluster with many clients there is a lot of duplication of work (it would be enough for a single client to prepare the statements) and a bigger than necessary load on the node that started. An idea to solve this is to have a (cheap) way for clients to check if some statements are prepared on the node. There are different options to provide that, but what I'd suggest is to add a system table to expose the (cached) prepared statements because: # it's reasonably straightforward to implement: we just add a line to the table when a statement is prepared and remove it when it's evicted (we already have eviction listeners). 
We'd also truncate the table on startup, but that's easy enough. We can even switch it to a virtual table if/when CASSANDRA-7622 lands, but it's trivial to do with a normal table in the meantime. # it doesn't require a change to the protocol or anything like that. It could even be done in 2.1 if we wish to. # exposing prepared statements feels like genuinely useful information to have (outside of the problem exposed here, that is), if only for debugging/educational purposes. The exposed table could look something like:
{noformat}
CREATE TABLE system.prepared_statements (
  keyspace_name text,
  table_name text,
  prepared_id blob,
  query_string text,
  PRIMARY KEY (keyspace_name, table_name, prepared_id)
)
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
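The add-on-prepare / remove-on-evict bookkeeping from point 1 can be sketched as follows; the registry class and method names are hypothetical, and a plain dict stands in for the proposed system table:

```python
# Hypothetical sketch of mirroring the prepared-statement cache into a table:
# add a row on prepare, drop it on eviction, truncate on startup.
class PreparedStatementRegistry:
    def __init__(self):
        self.table = {}  # prepared_id -> (keyspace, table, query_string)

    def on_prepare(self, prepared_id, keyspace, table, query):
        self.table[prepared_id] = (keyspace, table, query)

    def on_evict(self, prepared_id):
        # Eviction listeners (which already exist per the ticket) would call this.
        self.table.pop(prepared_id, None)

    def on_startup(self):
        self.table.clear()  # truncate on startup, as proposed

reg = PreparedStatementRegistry()
reg.on_prepare(b"\x01", "blackbook", "bounces",
               "SELECT * FROM bounces WHERE domainid = ?")
reg.on_evict(b"\x01")
```

A driver could then issue a cheap read against such a table on node-up and re-prepare only the statements that are missing, instead of blindly re-preparing everything.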
[jira] [Updated] (CASSANDRA-9229) Add functions to convert timeuuid to date or time
[ https://issues.apache.org/jira/browse/CASSANDRA-9229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9229: -- Labels: cql docs-impacting (was: cql doc-impacting) Add functions to convert timeuuid to date or time - Key: CASSANDRA-9229 URL: https://issues.apache.org/jira/browse/CASSANDRA-9229 Project: Cassandra Issue Type: New Feature Reporter: Michaël Figuière Assignee: Benjamin Lerer Labels: cql, docs-impacting Fix For: 2.2.0 rc2 Attachments: 9229.txt, CASSANDRA-9229-V2.txt, CASSANDRA-9229.txt As CASSANDRA-7523 brings the {{date}} and {{time}} native types to Cassandra, it would be useful to add builtin functions to convert {{timeuuid}} to these two new types, just like {{dateOf()}} does for timestamps. {{timeOf()}} would extract the time component from a {{timeuuid}}. An example use case could be at insert time, for instance {{timeOf(now())}}, as well as at read time to compare the time component of a {{timeuuid}} column in a {{WHERE}} clause. The use cases would be similar for {{date}}, but the solution is slightly less obvious: in a perfect world we would want {{dateOf()}} to convert to {{date}} and {{timestampOf()}} to convert to {{timestamp}}; unfortunately {{dateOf()}} already exists and converts to a {{timestamp}}, not a {{date}}. Making this change would break many existing CQL queries, which is not acceptable. Therefore we could use a different naming convention such as {{toDate}} or {{dateFrom}}. We could then also consider using this new naming convention for the 3 date-related types and just have {{dateOf}} become a deprecated alias. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
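For illustration, the conversion such builtins perform can be sketched in Python: a version-1 (time-based) UUID encodes 100 ns ticks since 1582-10-15, and the function names below ({{timestamp_of}}, {{date_of}}) are hypothetical stand-ins for the proposed CQL functions, not actual builtins.

```python
import datetime
import uuid

# A v1 UUID's time field counts 100-nanosecond intervals since the start of
# the Gregorian calendar, 1582-10-15 00:00:00 UTC.
GREGORIAN_EPOCH = datetime.datetime(1582, 10, 15)

def timestamp_of(timeuuid):
    """Extract the full timestamp, analogous to what dateOf() does today."""
    return GREGORIAN_EPOCH + datetime.timedelta(microseconds=timeuuid.time // 10)

def date_of(timeuuid):
    """Keep only the date component, analogous to the proposed toDate()."""
    return timestamp_of(timeuuid).date()

u = uuid.uuid1()
print(timestamp_of(u), date_of(u))
```

This also makes the naming problem concrete: the first function returns a full timestamp, the second a bare date, so calling the first one `dateOf()` (as CQL historically did) is the misnomer the ticket wants to retire via a `toDate`-style convention.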
[jira] [Updated] (CASSANDRA-9402) Implement proper sandboxing for UDFs
[ https://issues.apache.org/jira/browse/CASSANDRA-9402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9402: -- Labels: docs-impacting security (was: doc-impacting security) Implement proper sandboxing for UDFs Key: CASSANDRA-9402 URL: https://issues.apache.org/jira/browse/CASSANDRA-9402 Project: Cassandra Issue Type: Task Reporter: T Jake Luciani Assignee: Robert Stupp Priority: Critical Labels: docs-impacting, security Fix For: 3.0 beta 1 Attachments: 9402-warning.txt We want to avoid a security exploit for our users. We need to make sure we ship 2.2 UDFs with good defaults, so that someone accidentally exposing them to the internet doesn't open themselves up to having arbitrary code run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9567) Windows does not handle ipv6 addresses
[ https://issues.apache.org/jira/browse/CASSANDRA-9567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583989#comment-14583989 ] Philip Thompson commented on CASSANDRA-9567: [~JoshuaMcKenzie], this fixes the issue I was running into. Windows does not handle ipv6 addresses -- Key: CASSANDRA-9567 URL: https://issues.apache.org/jira/browse/CASSANDRA-9567 Project: Cassandra Issue Type: Bug Reporter: Philip Thompson Assignee: Philip Thompson Fix For: 3.x, 2.1.x, 2.2.x Attachments: 9567.txt In cassandra.ps1, we are pulling the listen and rpc addresses from the yaml by splitting on {{:}}, then selecting [1] from the resulting split, to separate the yaml key from the value. Unfortunately, because ipv6 addresses contain {{:}} characters, this means we are not grabbing the whole address, causing problems when starting the node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9567) Windows does not handle ipv6 addresses
[ https://issues.apache.org/jira/browse/CASSANDRA-9567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583989#comment-14583989 ] Philip Thompson edited comment on CASSANDRA-9567 at 6/12/15 8:08 PM: - [~JoshuaMcKenzie], this fixes the issue I was running into. It relies on the (valid) assumption that a yaml requires a space after the {{:}} between the key and value. was (Author: philipthompson): [~JoshuaMcKenzie], this fixes the issue I was running into. It relies on the (valid) assumption that a yaml requires {{: }} between the key and value, not just a {{:}} Windows does not handle ipv6 addresses -- Key: CASSANDRA-9567 URL: https://issues.apache.org/jira/browse/CASSANDRA-9567 Project: Cassandra Issue Type: Bug Reporter: Philip Thompson Assignee: Philip Thompson Fix For: 3.x, 2.1.x, 2.2.x Attachments: 9567.txt In cassandra.ps1, we are pulling the listen and rpc addresses from the yaml by splitting on {{:}}, then selecting [1] from the resulting split, to separate the yaml key from the value. Unfortunately, because ipv6 addresses contain {{:}} characters, this means we are not grabbing the whole address, causing problems when starting the node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9567) Windows does not handle ipv6 addresses
[ https://issues.apache.org/jira/browse/CASSANDRA-9567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583989#comment-14583989 ] Philip Thompson edited comment on CASSANDRA-9567 at 6/12/15 8:08 PM: - [~JoshuaMcKenzie], this fixes the issue I was running into. It relies on the (valid) assumption that a yaml requires {{: }} between the key and value, not just a {{:}} was (Author: philipthompson): [~JoshuaMcKenzie], this fixes the issue I was running into. Windows does not handle ipv6 addresses -- Key: CASSANDRA-9567 URL: https://issues.apache.org/jira/browse/CASSANDRA-9567 Project: Cassandra Issue Type: Bug Reporter: Philip Thompson Assignee: Philip Thompson Fix For: 3.x, 2.1.x, 2.2.x Attachments: 9567.txt In cassandra.ps1, we are pulling the listen and rpc addresses from the yaml by splitting on {{:}}, then selecting [1] from the resulting split, to separate the yaml key from the value. Unfortunately, because ipv6 addresses contain {{:}} characters, this means we are not grabbing the whole address, causing problems when starting the node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9526) Provide a JMX hook to monitor phi values in the FailureDetector
[ https://issues.apache.org/jira/browse/CASSANDRA-9526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ron Kuris updated CASSANDRA-9526: - Attachment: (was: Monitor-Phi-JMX.patch.txt) Provide a JMX hook to monitor phi values in the FailureDetector --- Key: CASSANDRA-9526 URL: https://issues.apache.org/jira/browse/CASSANDRA-9526 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Ron Kuris Fix For: 2.0.x phi_convict_threshold can be tuned, but there's currently no way to monitor the phi values to see if you're getting close. The attached patch adds the ability to get these values via JMX. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9592) Periodically attempt to submit background compaction tasks
[ https://issues.apache.org/jira/browse/CASSANDRA-9592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuki Morishita updated CASSANDRA-9592: -- Reviewer: Yuki Morishita Periodically attempt to submit background compaction tasks --- Key: CASSANDRA-9592 URL: https://issues.apache.org/jira/browse/CASSANDRA-9592 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Trivial Fix For: 2.1.x There are more race conditions affecting compaction task submission than CASSANDRA-7745, so to prevent some of these problems stalling compactions, I propose simply submitting background compactions once every minute, if possible. This will typically be a no-op, but there's no harm in that, since it's very cheap to do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9592) Periodically attempt to submit background compaction tasks
[ https://issues.apache.org/jira/browse/CASSANDRA-9592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584158#comment-14584158 ] Yuki Morishita commented on CASSANDRA-9592: ---
* {{CompactionManager#submitBackground}} will not return null, so we need an {{isEmpty}} check.
* I think this change makes the compaction kick scheduled here (https://github.com/belliottsmith/cassandra/blob/d1ddae1b61a9ca037b5edc137b5c9915e86dece6/src/java/org/apache/cassandra/service/CassandraDaemon.java#L371-L386) obsolete, so we can delete it.
* nit: compile error because of a missing {{;}}
Periodically attempt to submit background compaction tasks --- Key: CASSANDRA-9592 URL: https://issues.apache.org/jira/browse/CASSANDRA-9592 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Trivial Fix For: 2.1.x There are more race conditions affecting compaction task submission than CASSANDRA-7745, so to prevent some of these problems stalling compactions, I propose simply submitting background compactions once every minute, if possible. This will typically be a no-op, but there's no harm in that, since it's very cheap to do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
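A hedged sketch of the proposal and of the {{isEmpty}} point above, with made-up names (not Cassandra's actual {{CompactionManager}} API): the submitter returns an empty collection rather than null when there is nothing to do, so a periodic tick is a harmless no-op.

```python
# Illustrative model of once-a-minute background submission.  The real code
# would run periodic_tick on a scheduled executor every 60 seconds.
class FakeCompactionManager:
    def __init__(self):
        self.pending = []

    def submit_background(self):
        """Never returns None: an empty list means nothing to submit."""
        submitted, self.pending = self.pending, []
        return submitted

def periodic_tick(manager):
    tasks = manager.submit_background()
    if not tasks:     # isEmpty check, since None is never returned
        return 0      # typical case: cheap no-op
    return len(tasks)

mgr = FakeCompactionManager()
mgr.pending = ["compact-L0"]
assert periodic_tick(mgr) == 1
assert periodic_tick(mgr) == 0  # subsequent ticks are no-ops
```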
[jira] [Commented] (CASSANDRA-8061) tmplink files are not removed
[ https://issues.apache.org/jira/browse/CASSANDRA-8061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583927#comment-14583927 ] Robbie Strickland commented on CASSANDRA-8061: -- I can also verify I am seeing this after upgrading to 2.1.6. It breaks nodetool cfstats with an AssertionError:
{noformat}
error: /var/lib/cassandra/xvdb/data/prod_analytics_events/locationupdateevents-52f73af0fd5111e489f75b9deb90b453/prod_analytics_events-locationupdateevents-tmplink-ka-1460-Data.db
-- StackTrace --
java.lang.AssertionError: /var/lib/cassandra/xvdb/data/prod_analytics_events/locationupdateevents-52f73af0fd5111e489f75b9deb90b453/prod_analytics_events-locationupdateevents-tmplink-ka-1460-Data.db
    at org.apache.cassandra.io.sstable.SSTableReader.getApproximateKeyCount(SSTableReader.java:270)
    at org.apache.cassandra.metrics.ColumnFamilyMetrics$9.value(ColumnFamilyMetrics.java:296)
    at org.apache.cassandra.metrics.ColumnFamilyMetrics$9.value(ColumnFamilyMetrics.java:290)
    at com.yammer.metrics.reporting.JmxReporter$Gauge.getValue(JmxReporter.java:63)
    at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
    at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
    at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
    at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:83)
    at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678)
    at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1443)
    at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
    at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1307)
    at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1399)
    at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:637)
    at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:323)
    at sun.rmi.transport.Transport$1.run(Transport.java:200)
    at sun.rmi.transport.Transport$1.run(Transport.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$241(TCPTransport.java:683)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$$Lambda$1/602091790.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{noformat}
tmplink files are not removed - Key:
CASSANDRA-8061 URL: https://issues.apache.org/jira/browse/CASSANDRA-8061 Project: Cassandra Issue Type: Bug Components: Core Environment: Linux Reporter: Gianluca Borello Assignee: Joshua McKenzie Fix For: 2.1.x Attachments: 8061_v1.txt, 8248-thread_dump.txt After installing 2.1.0, I'm experiencing a bunch of tmplink files that are filling my disk. I found https://issues.apache.org/jira/browse/CASSANDRA-7803 and that is very similar, and I confirm it
[jira] [Created] (CASSANDRA-9594) metrics reporter doesn't start until after a bootstrap
Eric Evans created CASSANDRA-9594: - Summary: metrics reporter doesn't start until after a bootstrap Key: CASSANDRA-9594 URL: https://issues.apache.org/jira/browse/CASSANDRA-9594 Project: Cassandra Issue Type: Bug Components: Core Reporter: Eric Evans Priority: Minor In {{o.a.c.service.CassandraDaemon#setup}}, the metrics reporter is started immediately after the invocation of {{o.a.c.service.StorageService#initServer}}, which for a bootstrapping node may block for a considerable period of time. If the metrics reporter is your only source of visibility, then you are blind until the bootstrap completes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
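The fix the ticket implies is an ordering change: bring the metrics reporter up before the long-blocking init/bootstrap step rather than after it. A minimal sketch of that ordering (the names and the event list are illustrative, not Cassandra's actual startup code):

```python
import threading
import time

def start_node(events):
    """Start a metrics reporter thread *before* the blocking
    bootstrap step, so metrics remain visible during bootstrap."""
    reporter = threading.Thread(target=lambda: events.append("metrics-up"),
                                daemon=True)
    reporter.start()
    reporter.join()  # reporter is confirmed up before we block
    events.append("bootstrap-start")
    time.sleep(0.01)  # stand-in for a long initServer()/bootstrap
    events.append("bootstrap-done")
```

Run against an empty event list, `start_node` records `metrics-up` strictly before the bootstrap events.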
[jira] [Updated] (CASSANDRA-9526) Provide a JMX hook to monitor phi values in the FailureDetector
[ https://issues.apache.org/jira/browse/CASSANDRA-9526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ron Kuris updated CASSANDRA-9526: - Attachment: Monitor-Phi-JMX.patch Tiny-Race-Condition.patch Phi-Log-Debug-When-Close.patch Fixed some minor problems found while running this code at high volume for a while, so uploaded a revised patchset. Also corrected header reorganization. Provide a JMX hook to monitor phi values in the FailureDetector --- Key: CASSANDRA-9526 URL: https://issues.apache.org/jira/browse/CASSANDRA-9526 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Ron Kuris Fix For: 2.0.x Attachments: Monitor-Phi-JMX.patch, Phi-Log-Debug-When-Close.patch, Tiny-Race-Condition.patch phi_convict_threshold can be tuned, but there's currently no way to monitor the phi values to see if you're getting close. The attached patch adds the ability to get these values via JMX. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
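For context on what these patches expose: phi in the FailureDetector is the suspicion level of an accrual failure detector, which grows as heartbeats go missing. A minimal sketch under the exponential heartbeat-interval model commonly used for such detectors (the function name and units here are illustrative, not Cassandra's actual JMX attribute):

```python
import math

def phi(time_since_last_heartbeat, mean_interval):
    """Accrual-failure-detector suspicion level:
    phi = -log10(P(interval > t)). Assuming exponentially
    distributed heartbeat intervals, P(interval > t) = e^(-t/mean),
    so phi = (t / mean) * log10(e) and grows linearly with t."""
    return (time_since_last_heartbeat / mean_interval) * math.log10(math.e)
```

Under this model, a phi_convict_threshold of 8 is crossed after roughly 18 mean heartbeat intervals of silence, which is why watching phi over JMX tells you how close a node is to being convicted.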
svn commit: r1685145 - /cassandra/site/publish/doc/cql3/CQL-2.2.html
Author: tylerhobbs Date: Fri Jun 12 18:42:50 2015 New Revision: 1685145 URL: http://svn.apache.org/r1685145 Log: Add collections, tuple, UDT to JSON types documentation Modified: cassandra/site/publish/doc/cql3/CQL-2.2.html URL: http://svn.apache.org/viewvc/cassandra/site/publish/doc/cql3/CQL-2.2.html?rev=1685145r1=1685144r2=1685145view=diff
[diff omitted: tag-stripped HTML of the CQL-2.2.html (CQL v3.3.0) table of contents; the change adds the JSON Support entries (SELECT JSON, INSERT JSON, JSON Encoding of Cassandra Data Types) to the documentation]
[jira] [Updated] (CASSANDRA-9591) Scrub (recover) sstables even when -Index.db is missing
[ https://issues.apache.org/jira/browse/CASSANDRA-9591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-9591: --- Reviewer: Stefania Scrub (recover) sstables even when -Index.db is missing --- Key: CASSANDRA-9591 URL: https://issues.apache.org/jira/browse/CASSANDRA-9591 Project: Cassandra Issue Type: Improvement Reporter: mck Assignee: mck Labels: sstablescrub Fix For: 2.0.15 Attachments: 9591-2.0.txt Today SSTableReader needs at minimum 3 files to load an sstable: - -Data.db - -CompressionInfo.db - -Index.db But during the scrub process the -Index.db file isn't actually necessary, unless there's corruption in the -Data.db and we want to be able to skip over corrupted rows. Given that there is still a fair chance that there's nothing wrong with the -Data.db file and we're just missing the -Index.db file, this patch addresses that situation. So the following patch makes it possible for the StandaloneScrubber (sstablescrub) to recover sstables despite missing -Index.db files. This can happen from a catastrophic incident where data directories have been lost and/or corrupted, or wiped and the backup not healthy. I'm aware that normally one depends on replicas or snapshots to avoid such situations, but such catastrophic incidents do occur in the wild. I have not tested this patch against normal c* operations and all the other (more critical) ways SSTableReader is used. I'll happily do that and add the needed unit tests if people see merit in accepting the patch. Otherwise the patch can live with the issue, in case anyone else needs it. There's also a cassandra distribution bundled with the patch [here|https://github.com/michaelsembwever/cassandra/releases/download/2.0.15-recover-sstables-without-indexdb/apache-cassandra-2.0.15-recover-sstables-without-indexdb.tar.gz] to make life a little easier for anyone finding themselves in such a bad situation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
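The ticket's premise can be sketched as a component check: for a compressed sstable, -Data.db and -CompressionInfo.db are required, while -Index.db is only needed to skip over corrupted rows. A hypothetical helper (not the patch's actual code) illustrating the decision:

```python
def can_recovery_scrub(components):
    """Return True if a recovery scrub could be attempted on a
    compressed sstable. Per the ticket, -Data.db and
    -CompressionInfo.db are required; -Index.db is optional
    (only needed to skip past corruption in -Data.db)."""
    required = {"Data.db", "CompressionInfo.db"}
    return required.issubset(set(components))
```

So a directory containing only `-Data.db` and `-CompressionInfo.db` would still be a candidate for the patched sstablescrub.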
[jira] [Commented] (CASSANDRA-7918) Provide graphing tool along with cassandra-stress
[ https://issues.apache.org/jira/browse/CASSANDRA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583891#comment-14583891 ] Joshua McKenzie commented on CASSANDRA-7918: I think the general concern is that maintaining a code-base with gnuplot in it isn't something your fellow contributors are thrilled about, not the potential difficulty of a user interacting with it. How about something like a verbose_stress.sh that dumps current commit sha, yaml settings, and stress args to a file, passes all args through to cassandra-stress.* and appends the stress output to that file, then compresses the final results to an archive named with a datetime stamp? Some simple section delimiters and our graph generator could parse that trivially. Avoids the coupling with stress, keeps the collection of metadata and test output as a separate logical entity, and we get our canonical source of truth. Provide graphing tool along with cassandra-stress - Key: CASSANDRA-7918 URL: https://issues.apache.org/jira/browse/CASSANDRA-7918 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Benedict Assignee: Ryan McGuire Priority: Minor Attachments: 7918.patch, reads.svg Whilst cstar makes some pretty graphs, they're a little limited and also require you to run your tests through it. It would be useful to be able to graph results from any stress run easily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7918) Provide graphing tool along with cassandra-stress
[ https://issues.apache.org/jira/browse/CASSANDRA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583898#comment-14583898 ] Benedict commented on CASSANDRA-7918: - bq. How about something like a verbose_stress.sh SGTM. Although I'm not such a fan of datetime naming - they need to be prohibitively long to get uniqueness, and are really ugly to parse (mentally). Might prefer a mix of date + short ascii hash. Not exactly a pressing question though. Provide graphing tool along with cassandra-stress - Key: CASSANDRA-7918 URL: https://issues.apache.org/jira/browse/CASSANDRA-7918 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Benedict Assignee: Ryan McGuire Priority: Minor Attachments: 7918.patch, reads.svg Whilst cstar makes some pretty graphs, they're a little limited and also require you to run your tests through it. It would be useful to be able to graph results from any stress run easily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
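Benedict's naming suggestion — a date plus a short ascii hash instead of a full datetime stamp — could look like the following sketch (the `run_id` helper and `stress-` prefix are hypothetical, not part of any proposed patch):

```python
import hashlib
from datetime import date

def run_id(stress_args, run_date=None):
    """Archive name built from the run date plus a short hash of the
    stress invocation: unique per distinct invocation, but far easier
    to read (and mentally parse) than a full datetime stamp."""
    d = run_date or date.today()
    digest = hashlib.sha1(" ".join(stress_args).encode()).hexdigest()[:8]
    return "stress-%s-%s" % (d.isoformat(), digest)
```

Identical stress invocations on the same day map to the same id, and the 8-hex-digit suffix keeps collisions between different invocations unlikely.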
[jira] [Updated] (CASSANDRA-9526) Provide a JMX hook to monitor phi values in the FailureDetector
[ https://issues.apache.org/jira/browse/CASSANDRA-9526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ron Kuris updated CASSANDRA-9526: - Attachment: (was: PHI-Log-Debug-When-Close.patch.txt) Provide a JMX hook to monitor phi values in the FailureDetector --- Key: CASSANDRA-9526 URL: https://issues.apache.org/jira/browse/CASSANDRA-9526 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Ron Kuris Fix For: 2.0.x phi_convict_threshold can be tuned, but there's currently no way to monitor the phi values to see if you're getting close. The attached patch adds the ability to get these values via JMX. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9526) Provide a JMX hook to monitor phi values in the FailureDetector
[ https://issues.apache.org/jira/browse/CASSANDRA-9526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ron Kuris updated CASSANDRA-9526: - Attachment: (was: PHI-Race-Condition.patch.txt) Provide a JMX hook to monitor phi values in the FailureDetector --- Key: CASSANDRA-9526 URL: https://issues.apache.org/jira/browse/CASSANDRA-9526 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Ron Kuris Fix For: 2.0.x phi_convict_threshold can be tuned, but there's currently no way to monitor the phi values to see if you're getting close. The attached patch adds the ability to get these values via JMX. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8460) Make it possible to move non-compacting sstables to slow/big storage in DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-8460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583935#comment-14583935 ] Marcus Eriksson commented on CASSANDRA-8460: bq. So my initial approach was to define a second config item, separate from data_file_directories yeah lets keep it simple for now, add a new config variable like you suggest Make it possible to move non-compacting sstables to slow/big storage in DTCS Key: CASSANDRA-8460 URL: https://issues.apache.org/jira/browse/CASSANDRA-8460 Project: Cassandra Issue Type: Improvement Reporter: Marcus Eriksson Labels: dtcs It would be nice if we could configure DTCS to have a set of extra data directories where we move the sstables once they are older than max_sstable_age_days. This would enable users to have a quick, small SSD for hot, new data, and big spinning disks for data that is rarely read and never compacted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
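The proposed config variable amounts to an age-based directory choice: once an sstable's newest data is older than max_sstable_age_days (the point after which DTCS stops compacting it), it can live on slow/big storage. A sketch, with hypothetical directory names standing in for the new config item:

```python
import time

def target_directory(sstable_max_timestamp_s, max_sstable_age_days,
                     fast_dir="/mnt/ssd/data", slow_dir="/mnt/hdd/data",
                     now_s=None):
    """Pick a data directory for an sstable: past max_sstable_age_days
    it is never compacted again under DTCS, so it can be moved to the
    big spinning disks; newer sstables stay on the fast SSD."""
    now_s = now_s if now_s is not None else time.time()
    age_days = (now_s - sstable_max_timestamp_s) / 86400.0
    return slow_dir if age_days > max_sstable_age_days else fast_dir
```

The age is measured from the sstable's maximum timestamp, so an sstable only migrates once all of its data has passed the cutoff.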
[jira] [Commented] (CASSANDRA-9567) Windows does not handle ipv6 addresses
[ https://issues.apache.org/jira/browse/CASSANDRA-9567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583984#comment-14583984 ] Philip Thompson commented on CASSANDRA-9567: [~kishkaru], can you test with this patch as well? Windows does not handle ipv6 addresses -- Key: CASSANDRA-9567 URL: https://issues.apache.org/jira/browse/CASSANDRA-9567 Project: Cassandra Issue Type: Bug Reporter: Philip Thompson Assignee: Philip Thompson Fix For: 3.x, 2.1.x, 2.2.x Attachments: 9567.txt In cassandra.ps1, we are pulling the listen and rpc addresses from the yaml by splitting on {{:}}, then selecting [1] from the resulting split, to separate the yaml key from the value. Unfortunately, because ipv6 addresses contain {{:}} characters, this means we are not grabbing the whole address, causing problems when starting the node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
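The bug boils down to splitting on every colon instead of only the first one. In Python terms (the startup script itself is PowerShell; this `yaml_scalar` helper is just an illustration of the fix's logic):

```python
def yaml_scalar(line):
    """Split a simple 'key: value' yaml line on the FIRST colon only,
    so ipv6 values such as 'listen_address: ::1' keep their colons
    intact instead of being truncated at the second colon."""
    key, _, value = line.partition(":")
    return key.strip(), value.strip()
```

Splitting on every `:` and taking element [1] would return an empty string for `listen_address: ::1`; partitioning on the first colon preserves the whole address.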
[jira] [Created] (CASSANDRA-9595) Compacting an empty table sometimes doesn't delete SSTables in 2.0 with LCS
Jim Witschey created CASSANDRA-9595: --- Summary: Compacting an empty table sometimes doesn't delete SSTables in 2.0 with LCS Key: CASSANDRA-9595 URL: https://issues.apache.org/jira/browse/CASSANDRA-9595 Project: Cassandra Issue Type: Bug Reporter: Jim Witschey Fix For: 2.0.x On 2.0, when compaction is run on a table with all rows deleted and configured with LCS, sometimes SSTables remain on disk afterwards. This causes one of our dtests to fail periodically, for instance [here|http://cassci.datastax.com/view/cassandra-2.0/job/cassandra-2.0_dtest/68/testReport/compaction_test/TestCompaction_with_LeveledCompactionStrategy/sstable_deletion_test/]. This can be reproduced in dtests with {code} CASSANDRA_VERSION=git:cassandra-2.0 nosetests ./compaction_test.py:TestCompaction_with_LeveledCompactionStrategy.sstable_deletion_test {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[1/2] cassandra git commit: Fix ipv6 parsing on Windows startup
Repository: cassandra Updated Branches: refs/heads/cassandra-2.2 a5be8f199 - 2e92cf899 Fix ipv6 parsing on Windows startup Patch by Philip Thompson; reviewed by jmckenzie for CASSANDRA-9567 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c1702b0b Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c1702b0b Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c1702b0b Branch: refs/heads/cassandra-2.2 Commit: c1702b0b3c24040ed8b402e684a8d0ffd4e4359f Parents: 69b7dd3 Author: Philip Thompson ptnapol...@gmail.com Authored: Fri Jun 12 18:17:13 2015 -0400 Committer: Josh McKenzie josh.mcken...@datastax.com Committed: Fri Jun 12 18:17:13 2015 -0400 -- bin/cassandra.ps1 | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/c1702b0b/bin/cassandra.ps1 -- diff --git a/bin/cassandra.ps1 b/bin/cassandra.ps1 index 80049ee..41ea7c1 100644 --- a/bin/cassandra.ps1 +++ b/bin/cassandra.ps1 @@ -299,12 +299,12 @@ Function VerifyPortsAreAvailable { if ($line -match ^listen_address:) { -$args = $line -Split : +$args = $line -Split : $listenAddress = $args[1] -replace , } if ($line -match ^rpc_address:) { -$args = $line -Split : +$args = $line -Split : $rpcAddress = $args[1] -replace , } }
[jira] [Commented] (CASSANDRA-9592) Periodically attempt to submit background compaction tasks
[ https://issues.apache.org/jira/browse/CASSANDRA-9592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584182#comment-14584182 ] Jeremiah Jordan commented on CASSANDRA-9592: bq. I think this change makes compaction kick scheduled here obsolete so we can delete it. We should make this new task run every 5/10 minutes then so that we don't start compactions early. The 5 minute window with no compactions is nice to have, giving an operator time to disable compaction over JMX or other such things. So we shouldn't lower it down to only 1 minute. Periodically attempt to submit background compaction tasks --- Key: CASSANDRA-9592 URL: https://issues.apache.org/jira/browse/CASSANDRA-9592 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Trivial Fix For: 2.1.x There are more race conditions affecting compaction task submission than CASSANDRA-7745, so to prevent some of these problems stalling compactions, I propose simply submitting background compactions once every minute, if possible. This will typically be a no-op, but there's no harm in that, since it's very cheap to do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[2/2] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2
Merge branch 'cassandra-2.1' into cassandra-2.2 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2e92cf89 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2e92cf89 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2e92cf89 Branch: refs/heads/cassandra-2.2 Commit: 2e92cf8996f42dc40fade41b73001affdfbd6f7d Parents: a5be8f1 c1702b0 Author: Josh McKenzie josh.mcken...@datastax.com Authored: Fri Jun 12 18:18:45 2015 -0400 Committer: Josh McKenzie josh.mcken...@datastax.com Committed: Fri Jun 12 18:18:45 2015 -0400 -- bin/cassandra.ps1 | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --
cassandra git commit: Fix ipv6 parsing on Windows startup
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 69b7dd327 - c1702b0b3 Fix ipv6 parsing on Windows startup Patch by Philip Thompson; reviewed by jmckenzie for CASSANDRA-9567 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c1702b0b Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c1702b0b Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c1702b0b Branch: refs/heads/cassandra-2.1 Commit: c1702b0b3c24040ed8b402e684a8d0ffd4e4359f Parents: 69b7dd3 Author: Philip Thompson ptnapol...@gmail.com Authored: Fri Jun 12 18:17:13 2015 -0400 Committer: Josh McKenzie josh.mcken...@datastax.com Committed: Fri Jun 12 18:17:13 2015 -0400 -- bin/cassandra.ps1 | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/c1702b0b/bin/cassandra.ps1 -- diff --git a/bin/cassandra.ps1 b/bin/cassandra.ps1 index 80049ee..41ea7c1 100644 --- a/bin/cassandra.ps1 +++ b/bin/cassandra.ps1 @@ -299,12 +299,12 @@ Function VerifyPortsAreAvailable { if ($line -match ^listen_address:) { -$args = $line -Split : +$args = $line -Split : $listenAddress = $args[1] -replace , } if ($line -match ^rpc_address:) { -$args = $line -Split : +$args = $line -Split : $rpcAddress = $args[1] -replace , } }
[2/3] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2
Merge branch 'cassandra-2.1' into cassandra-2.2 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2e92cf89 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2e92cf89 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2e92cf89 Branch: refs/heads/trunk Commit: 2e92cf8996f42dc40fade41b73001affdfbd6f7d Parents: a5be8f1 c1702b0 Author: Josh McKenzie josh.mcken...@datastax.com Authored: Fri Jun 12 18:18:45 2015 -0400 Committer: Josh McKenzie josh.mcken...@datastax.com Committed: Fri Jun 12 18:18:45 2015 -0400 -- bin/cassandra.ps1 | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --
[1/3] cassandra git commit: Fix ipv6 parsing on Windows startup
Repository: cassandra Updated Branches: refs/heads/trunk 40c3e8922 - 7476d83b4 Fix ipv6 parsing on Windows startup Patch by Philip Thompson; reviewed by jmckenzie for CASSANDRA-9567 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c1702b0b Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c1702b0b Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c1702b0b Branch: refs/heads/trunk Commit: c1702b0b3c24040ed8b402e684a8d0ffd4e4359f Parents: 69b7dd3 Author: Philip Thompson ptnapol...@gmail.com Authored: Fri Jun 12 18:17:13 2015 -0400 Committer: Josh McKenzie josh.mcken...@datastax.com Committed: Fri Jun 12 18:17:13 2015 -0400 -- bin/cassandra.ps1 | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/c1702b0b/bin/cassandra.ps1 -- diff --git a/bin/cassandra.ps1 b/bin/cassandra.ps1 index 80049ee..41ea7c1 100644 --- a/bin/cassandra.ps1 +++ b/bin/cassandra.ps1 @@ -299,12 +299,12 @@ Function VerifyPortsAreAvailable { if ($line -match ^listen_address:) { -$args = $line -Split : +$args = $line -Split : $listenAddress = $args[1] -replace , } if ($line -match ^rpc_address:) { -$args = $line -Split : +$args = $line -Split : $rpcAddress = $args[1] -replace , } }
[3/3] cassandra git commit: Merge branch 'cassandra-2.2' into trunk
Merge branch 'cassandra-2.2' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7476d83b Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7476d83b Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7476d83b Branch: refs/heads/trunk Commit: 7476d83b44fa58a354fc0c7330b74b8e7ed7a3a3 Parents: 40c3e89 2e92cf8 Author: Josh McKenzie josh.mcken...@datastax.com Authored: Fri Jun 12 18:19:06 2015 -0400 Committer: Josh McKenzie josh.mcken...@datastax.com Committed: Fri Jun 12 18:19:06 2015 -0400 -- bin/cassandra.ps1 | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --
[jira] [Comment Edited] (CASSANDRA-9596) Tombstone timestamps aren't used to skip SSTables while they are still in the memtable
[ https://issues.apache.org/jira/browse/CASSANDRA-9596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584246#comment-14584246 ] Aleksey Yeschenko edited comment on CASSANDRA-9596 at 6/12/15 11:23 PM: We do for {{collectTimeOrderedData()}} since CASSANDRA-7394 (2.0.9). I did miss {{collectAllData()}} there, though. Benedict fixed it in CASSANDRA-9298 (2.1.6). was (Author: iamaleksey): We do for {{collectTimeOrderedData()}} since CASSANDRA-7394 (2.0.9). I did miss {{collectAllData()}} there, though. Benedict fixed it in CASSANDRA-9228 (2.1.6). Tombstone timestamps aren't used to skip SSTables while they are still in the memtable -- Key: CASSANDRA-9596 URL: https://issues.apache.org/jira/browse/CASSANDRA-9596 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Richard Low Fix For: 2.0.x If you have one SSTable containing a partition level tombstone at timestamp t and all other SSTables have cells with timestamp t, Cassandra will skip all the other SSTables and return nothing quickly. However, if the partition tombstone is still in the memtable it doesn’t skip any SSTables. It should use the same timestamp logic to skip all SSTables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9576) Connection leak in CQLRecordWriter
[ https://issues.apache.org/jira/browse/CASSANDRA-9576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584285#comment-14584285 ] Philip Thompson commented on CASSANDRA-9576: [~beobal], could you check this out next week? I'm not seeing where we are leaking connections. Connection leak in CQLRecordWriter -- Key: CASSANDRA-9576 URL: https://issues.apache.org/jira/browse/CASSANDRA-9576 Project: Cassandra Issue Type: Bug Components: Hadoop Reporter: T Meyarivan Assignee: Philip Thompson Ran into connection leaks when using CQLCassandra apache-cassandra-2.2.0-beta1-src + CQLOutputFormat (via CqlNativeStorage). It seems like the order blocks of code starting at https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/hadoop/cql3/CqlRecordWriter.java#L298 were reversed in 2.2 which leads to the connection leaks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
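The suspected cause is ordering of cleanup blocks in CqlRecordWriter. Whatever the actual Java fix turns out to be, the defensive pattern is to guarantee the session close runs even when the preceding step fails. A generic sketch (names are illustrative, not Cassandra's CqlRecordWriter API):

```python
class RecordWriter:
    """Sketch of a writer whose close() guarantees the underlying
    session is released even if flushing pending writes fails."""
    def __init__(self, session):
        self.session = session

    def flush_pending(self):
        # Stand-in for draining queued mutations; may raise.
        pass

    def close(self):
        try:
            self.flush_pending()
        finally:
            # Runs regardless of whether flush raised, so the
            # connection is never leaked on an error path.
            self.session.close()
```

If the two steps are simply reversed, or the close sits outside a finally, any exception during flushing leaks the connection, which matches the symptom reported against 2.2.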
[jira] [Commented] (CASSANDRA-9596) Tombstone timestamps aren't used to skip SSTables while they are still in the memtable
[ https://issues.apache.org/jira/browse/CASSANDRA-9596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584246#comment-14584246 ] Aleksey Yeschenko commented on CASSANDRA-9596: -- We do for {{collectTimeOrderedData()}} since CASSANDRA-7394 (2.0.9). I did miss {{collectAllData()}} there, though. Benedict fixed it in CASSANDRA-9228 (2.1.6). Tombstone timestamps aren't used to skip SSTables while they are still in the memtable -- Key: CASSANDRA-9596 URL: https://issues.apache.org/jira/browse/CASSANDRA-9596 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Richard Low Fix For: 2.0.x If you have one SSTable containing a partition level tombstone at timestamp t and all other SSTables have cells with timestamp t, Cassandra will skip all the other SSTables and return nothing quickly. However, if the partition tombstone is still in the memtable it doesn’t skip any SSTables. It should use the same timestamp logic to skip all SSTables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
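The skipping logic the ticket wants applied to the memtable case is simple timestamp arithmetic: any sstable whose newest cell is not newer than the partition tombstone contributes nothing to the result. A sketch (a simplification; the real read path also considers per-cell tombstones and memtable state):

```python
def sstables_to_read(partition_tombstone_ts, sstable_max_timestamps):
    """Given a partition-level tombstone at timestamp t, an sstable
    whose maximum cell timestamp is <= t can be skipped entirely:
    everything in it is shadowed by the tombstone. Returns the max
    timestamps of the sstables that must still be read."""
    return [max_ts for max_ts in sstable_max_timestamps
            if max_ts > partition_tombstone_ts]
```

The reported gap is that this pruning fires when the tombstone has been flushed to an sstable, but not while it still sits in the memtable.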
[jira] [Commented] (CASSANDRA-9592) Periodically attempt to submit background compaction tasks
[ https://issues.apache.org/jira/browse/CASSANDRA-9592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584250#comment-14584250 ] Aleksey Yeschenko commented on CASSANDRA-9592: -- bq. We should make this new task every 5/10 minutes then so that we don't start compactions early. Make it every 1 minute, just delay it by 5? Periodically attempt to submit background compaction tasks --- Key: CASSANDRA-9592 URL: https://issues.apache.org/jira/browse/CASSANDRA-9592 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Trivial Fix For: 2.1.x There are more race conditions affecting compaction task submission than CASSANDRA-7745, so to prevent some of these problems stalling compactions, I propose simply submitting background compactions once every minute, if possible. This will typically be a no-op, but there's no harm in that, since it's very cheap to do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
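Aleksey's "every 1 minute, delayed by 5" compromise is the standard initial-delay-plus-fixed-period schedule. A tiny sketch of the resulting submission times (illustrative only; the real code would use a scheduled executor):

```python
def fire_times(initial_delay_s, period_s, count):
    """Submission times for a periodic task with an initial delay:
    e.g. wait 5 minutes (preserving the operator's window to disable
    compaction over JMX), then attempt a submission every minute."""
    return [initial_delay_s + i * period_s for i in range(count)]
```

With a 300 s delay and 60 s period, the first submission happens at the 5-minute mark and every minute thereafter, satisfying both Jeremiah's operator window and the original once-a-minute proposal.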
[jira] [Commented] (CASSANDRA-9487) CommitLogTest hangs intermittently in 2.0
[ https://issues.apache.org/jira/browse/CASSANDRA-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583664#comment-14583664 ] Ariel Weisberg commented on CASSANDRA-9487: --- Was this merged? Can it be closed? CommitLogTest hangs intermittently in 2.0 - Key: CASSANDRA-9487 URL: https://issues.apache.org/jira/browse/CASSANDRA-9487 Project: Cassandra Issue Type: Bug Components: Tests Reporter: Michael Shuler Assignee: Branimir Lambov Fix For: 2.0.x Attachments: system.log Possibly related to CASSANDRA-8992 ? 2.0 unit tests are hanging periodically in the same way (I have not gone through all the branches, so can't say we're in the clear everywhere - marking for just 2.x at the moment). CommitLogTest hung system.log attached from local reproduction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9572) DateTieredCompactionStrategy fails to combine SSTables correctly when TTL is used.
[ https://issues.apache.org/jira/browse/CASSANDRA-9572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583465#comment-14583465 ] Sylvain Lebresne commented on CASSANDRA-9572: - patch lgtm DateTieredCompactionStrategy fails to combine SSTables correctly when TTL is used. -- Key: CASSANDRA-9572 URL: https://issues.apache.org/jira/browse/CASSANDRA-9572 Project: Cassandra Issue Type: Bug Components: Core Reporter: Antti Nissinen Assignee: Marcus Eriksson Labels: dtcs Fix For: 3.x, 2.1.x, 2.0.x, 2.2.x Attachments: cassandra_sstable_metadata_reader.py, cassandra_sstable_timespan_graph.py, compaction_stage_test01_jira.log, compaction_stage_test02_jira.log, datagen.py, explanation_jira.txt, first_results_after_patch.txt, motivation_jira.txt, src_2.1.5_with_debug.zip DateTieredCompaction works correctly when data is dumped for a certain time period into short SSTables in a timely manner and then compacted together. However, if a TTL is applied to the data columns, DTCS fails to compact files correctly in a timely manner. In our opinion the problem is caused by two issues: A) During the DateTieredCompaction process, getFullyExpiredSSTables is called twice: first from the DateTieredCompactionStrategy class and a second time from the CompactionTask class. On the first call the goal is to find fully expired SSTables that do not overlap with any non-fully-expired SSTables. That works correctly. When getFullyExpiredSSTables is called the second time, from the CompactionTask class, the selection of fully expired SSTables differs from the first selection. B) The minimum timestamp of the new SSTables created by combining a fully expired SSTable with files from the most interesting bucket is not correct. Together, these two issues cause problems for the DTCS process when it combines SSTables that overlap in time and have a TTL on the columns.
This is demonstrated by first generating test data without compactions and showing the time distribution of files. When compaction is enabled, DTCS combines files together, but the end result is not what would be expected. This is demonstrated in the file motivation_jira.txt. The attachments contain the following material: - motivation_jira.txt: practical examples of how DTCS behaves with TTL - explanation_jira.txt: gives more details, explains the test cases, and demonstrates the problems in the compaction process - Log file for the compactions in the first test case (compaction_stage_test01_jira.log) - Log file for the compactions in the second test case (compaction_stage_test02_jira.log) - Source code zip file for version 2.1.5 with additional comment statements (src_2.1.5_with_debug.zip) - Python script to generate test data (datagen.py) - Python script to read metadata from SSTables (cassandra_sstable_metadata_reader.py) - Python script to generate a timeline representation of SSTables (cassandra_sstable_timespan_graph.py) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
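The "fully expired" selection described in issue A can be modeled in a few lines. This is an illustrative sketch, not Cassandra's actual API: the `SSTable` class and `fullyExpired` method here are hypothetical simplifications of the rule that an SSTable is droppable wholesale only if every cell in it has passed `gcBefore` and no live (non-expired) SSTable holds older data that the expired table's newer writes or tombstones might still shadow.

```java
import java.util.*;

public class FullyExpiredCheck {
    static final class SSTable {
        final long minTimestamp, maxTimestamp; // write timestamps of oldest/newest cells
        final int maxLocalDeletionTime;        // seconds; when the last cell's TTL expires
        SSTable(long minTs, long maxTs, int maxDeletion) {
            minTimestamp = minTs; maxTimestamp = maxTs; maxLocalDeletionTime = maxDeletion;
        }
    }

    // An SSTable may be dropped without reading it only if all of its cells
    // are past gcBefore AND nothing in it is newer than the oldest data in any
    // live SSTable (otherwise its tombstones may still be needed for shadowing).
    static Set<SSTable> fullyExpired(Collection<SSTable> sstables, int gcBefore) {
        List<SSTable> live = new ArrayList<>();
        List<SSTable> expired = new ArrayList<>();
        for (SSTable s : sstables)
            (s.maxLocalDeletionTime < gcBefore ? expired : live).add(s);

        long minLiveTimestamp = Long.MAX_VALUE;
        for (SSTable s : live)
            minLiveTimestamp = Math.min(minLiveTimestamp, s.minTimestamp);

        Set<SSTable> droppable = new HashSet<>();
        for (SSTable s : expired)
            if (s.maxTimestamp < minLiveTimestamp) // nothing newer than any live data
                droppable.add(s);
        return droppable;
    }

    public static void main(String[] args) {
        SSTable expiredOld = new SSTable(0, 100, 50);                  // TTL long expired
        SSTable liveRecent = new SSTable(200, 300, Integer.MAX_VALUE); // still live
        Set<SSTable> droppable = fullyExpired(Arrays.asList(expiredOld, liveRecent), 1000);
        System.out.println("droppable count = " + droppable.size());
    }
}
```

The bug report's point is that this selection must be computed consistently: if the second call (from CompactionTask) sees a different candidate set than the first (from the strategy), the two disagree about which tables are safely droppable.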
[jira] [Comment Edited] (CASSANDRA-9572) DateTieredCompactionStrategy fails to combine SSTables correctly when TTL is used.
[ https://issues.apache.org/jira/browse/CASSANDRA-9572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583313#comment-14583313 ] Antti Nissinen edited comment on CASSANDRA-9572 at 6/12/15 12:01 PM: - I ran the test again and looked at the log file in detail. Now it works as expected. Thank you very much for all involved! was (Author: anissinen): I ran the tests again and looked at the log file in detail. Now it works as expected. Thank you very much for all involved! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9572) DateTieredCompactionStrategy fails to combine SSTables correctly when TTL is used.
[ https://issues.apache.org/jira/browse/CASSANDRA-9572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583313#comment-14583313 ] Antti Nissinen commented on CASSANDRA-9572: --- I ran the tests again and looked at the log file in detail. Now it works as expected. Thank you very much for all involved! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8099) Refactor and modernize the storage engine
[ https://issues.apache.org/jira/browse/CASSANDRA-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583533#comment-14583533 ] Sylvain Lebresne commented on CASSANDRA-8099: - bq. I've pushed a small semantic-changing suggestion for serialization and merging of RT Thanks. I hesitated doing this initially and don't remember why I didn't. But this does clean up things a bit, so I'll look at integrating it on Monday unless I remember a good reason not to (which there probably isn't). Refactor and modernize the storage engine - Key: CASSANDRA-8099 URL: https://issues.apache.org/jira/browse/CASSANDRA-8099 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Fix For: 3.0 beta 1 Attachments: 8099-nit The current storage engine (which for this ticket I'll loosely define as the code implementing the read/write path) is suffering from old age. One of the main problems is that the only structure it deals with is the cell, which completely ignores the higher-level CQL structure that groups cells into (CQL) rows. This leads to many inefficiencies, like the fact that during a read we have to group cells multiple times (to count on the replica, then to count on the coordinator, then to produce the CQL result set) because we forget about the grouping right away each time (so lots of useless cell name comparisons in particular). But beyond inefficiencies, having to manually recreate the CQL structure every time we need it for something is hindering new features and makes the code more complex than it should be. Said storage engine also has tons of technical debt. To pick an example, the fact that during range queries we update {{SliceQueryFilter.count}} is pretty hacky and error prone. So are the overly complex lengths {{AbstractQueryPager}} has to go to simply to remove the last query result. So I want to bite the bullet and modernize this storage engine.
I propose to do 2 main things: # Make the storage engine more aware of the CQL structure. In practice, instead of having partitions be a simple iterable map of cells, they should be an iterable list of rows (each itself composed of per-column cells, though obviously not exactly the same kind of cell we have today). # Make the engine more iterative. What I mean here is that on the read path we end up reading all cells into memory (we put them in a ColumnFamily object), but there is really no reason to. If instead we were working with iterators all the way through, we could get to a point where we're basically transferring data from disk to the network, and we should be able to reduce GC substantially. Please note that such a refactor should provide some performance improvements right off the bat, but that's not its primary goal either. Its primary goal is to simplify the storage engine and add abstractions that are better suited to further optimizations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
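The second point above, the iterative read path, boils down to composing lazy iterators instead of materializing a container. A minimal sketch under assumed names (`LazyReadPath`, the "serialized:" transform, and the row strings are all illustrative, not Cassandra's real types):

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

public class LazyReadPath {
    // Wraps a source iterator in a transforming one: each element is read and
    // converted only when the consumer asks for it, so nothing is buffered.
    static <A, B> Iterator<B> transform(Iterator<A> source, Function<A, B> fn) {
        return new Iterator<B>() {
            public boolean hasNext() { return source.hasNext(); }
            public B next() { return fn.apply(source.next()); } // lazy conversion
        };
    }

    public static void main(String[] args) {
        // Stand-in for rows streamed off disk.
        List<String> rowsOnDisk = Arrays.asList("row1", "row2", "row3");
        // Stand-in for serializing each row to the wire as it flows through.
        Iterator<String> wire = transform(rowsOnDisk.iterator(), r -> "serialized:" + r);
        while (wire.hasNext())
            System.out.println(wire.next()); // one row in flight at a time
    }
}
```

Chaining several such wrappers (merge, filter tombstones, serialize) is how "disk to network" streaming avoids holding a whole partition in memory, which is where the GC reduction the description mentions would come from.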
[jira] [Commented] (CASSANDRA-9577) Cassandra not performing GC on stale SStables after compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-9577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583376#comment-14583376 ] Marcus Eriksson commented on CASSANDRA-9577: So, the lsof output and data directory contents are after the compaction? How long after? The sstables are not deleted immediately, the deletion is done in the background. Cassandra not performing GC on stale SStables after compaction -- Key: CASSANDRA-9577 URL: https://issues.apache.org/jira/browse/CASSANDRA-9577 Project: Cassandra Issue Type: Bug Components: Core Environment: 2.0.12.200 / DSE 4.6.1. Reporter: Jeff Ferland Assignee: Marcus Eriksson Space used (live), bytes: 878681716067 Space used (total), bytes: 2227857083852 jbf@ip-10-0-2-98:/ebs/cassandra/data/trends/trends$ sudo lsof *-Data.db COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME java4473 cassandra 446r REG 0,26 17582559172 39241 trends-trends-jb-144864-Data.db java4473 cassandra 448r REG 0,26 62040962 37431 trends-trends-jb-144731-Data.db java4473 cassandra 449r REG 0,26 829935047545 21150 trends-trends-jb-143581-Data.db java4473 cassandra 452r REG 0,26 8980406 39503 trends-trends-jb-144882-Data.db java4473 cassandra 454r REG 0,26 8980406 39503 trends-trends-jb-144882-Data.db java4473 cassandra 462r REG 0,26 9487703 39542 trends-trends-jb-144883-Data.db java4473 cassandra 463r REG 0,26 36158226 39629 trends-trends-jb-144889-Data.db java4473 cassandra 468r REG 0,26105693505 39447 trends-trends-jb-144881-Data.db java4473 cassandra 530r REG 0,26 17582559172 39241 trends-trends-jb-144864-Data.db java4473 cassandra 535r REG 0,26105693505 39447 trends-trends-jb-144881-Data.db java4473 cassandra 542r REG 0,26 9487703 39542 trends-trends-jb-144883-Data.db java4473 cassandra 553u REG 0,26 6431729821 39556 trends-trends-tmp-jb-144884-Data.db jbf@ip-10-0-2-98:/ebs/cassandra/data/trends/trends$ ls *-Data.db trends-trends-jb-142631-Data.db trends-trends-jb-143562-Data.db 
trends-trends-jb-143581-Data.db trends-trends-jb-144731-Data.db trends-trends-jb-144883-Data.db trends-trends-jb-142633-Data.db trends-trends-jb-143563-Data.db trends-trends-jb-144530-Data.db trends-trends-jb-144864-Data.db trends-trends-jb-144889-Data.db trends-trends-jb-143026-Data.db trends-trends-jb-143564-Data.db trends-trends-jb-144551-Data.db trends-trends-jb-144881-Data.db trends-trends-tmp-jb-144884-Data.db trends-trends-jb-143533-Data.db trends-trends-jb-143578-Data.db trends-trends-jb-144552-Data.db trends-trends-jb-144882-Data.db jbf@ip-10-0-2-98:/ebs/cassandra/data/trends/trends$ cd - /mnt/cassandra/data/trends/trends jbf@ip-10-0-2-98:/mnt/cassandra/data/trends/trends$ sudo lsof * jbf@ip-10-0-2-98:/mnt/cassandra/data/trends/trends$ ls *-Data.db trends-trends-jb-124502-Data.db trends-trends-jb-141113-Data.db trends-trends-jb-141377-Data.db trends-trends-jb-141846-Data.db trends-trends-jb-144890-Data.db trends-trends-jb-125457-Data.db trends-trends-jb-141123-Data.db trends-trends-jb-141391-Data.db trends-trends-jb-141871-Data.db trends-trends-jb-41121-Data.db trends-trends-jb-130016-Data.db trends-trends-jb-141137-Data.db trends-trends-jb-141538-Data.db trends-trends-jb-141883-Data.db trends-trends.trends_date_idx-jb-2100-Data.db trends-trends-jb-139563-Data.db trends-trends-jb-141358-Data.db trends-trends-jb-141806-Data.db trends-trends-jb-142033-Data.db trends-trends-jb-141102-Data.db trends-trends-jb-141363-Data.db trends-trends-jb-141829-Data.db trends-trends-jb-144553-Data.db Compaction started INFO [CompactionExecutor:6661] 2015-06-05 14:02:36,515 CompactionTask.java (line 120) Compacting [SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-124502-Data.db'), SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-141358-Data.db'), SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-141883-Data.db'), SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-141846-Data.db'), 
SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-141871-Data.db'), SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-141391-Data.db'), SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-139563-Data.db'), SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-125457-Data.db'), SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-141806-Data.db'),
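Marcus's remark that "the sstables are not deleted immediately, the deletion is done in the background" can be illustrated with a reference-counting sketch. This is a hypothetical model, not Cassandra's actual API: the file is only removed once the last reader releases its reference, so compacted-away files legitimately linger while reads are still in flight.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RefCountedSSTable {
    private final AtomicInteger refs = new AtomicInteger(1); // owner's (compaction's) ref
    private volatile boolean deleted = false;

    // A reader must acquire a reference before touching the file; fails if the
    // table has already been fully released.
    boolean acquire() {
        int r;
        do {
            r = refs.get();
            if (r == 0) return false; // already released; file may be gone
        } while (!refs.compareAndSet(r, r + 1));
        return true;
    }

    // Last release triggers deletion; in the real system this would schedule
    // the actual file removal on a background thread.
    void release() {
        if (refs.decrementAndGet() == 0)
            deleted = true;
    }

    boolean isDeleted() { return deleted; }

    public static void main(String[] args) {
        RefCountedSSTable table = new RefCountedSSTable();
        table.acquire();  // an in-flight read grabs a reference
        table.release();  // reader finishes
        System.out.println("deleted after reader: " + table.isDeleted());
        table.release();  // owner releases after compaction obsoletes the file
        System.out.println("deleted after owner release: " + table.isDeleted());
    }
}
```

Under this model, the lsof output in the report (old `-Data.db` files still open) is expected for a while after compaction; the bug is only real if the references are never released.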
[jira] [Commented] (CASSANDRA-9577) Cassandra not performing GC on stale SStables after compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-9577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583407#comment-14583407 ] Jeff Ferland commented on CASSANDRA-9577: - Per timestamps above, this was the case more than 24 hours after completion. Restarting the host did clear the files. On the same host just after a restart I'm encountering the same pattern. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-6710) Support union types
[ https://issues.apache.org/jira/browse/CASSANDRA-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-6710: - Fix Version/s: 3.x Support union types --- Key: CASSANDRA-6710 URL: https://issues.apache.org/jira/browse/CASSANDRA-6710 Project: Cassandra Issue Type: Improvement Components: API, Core Reporter: Tupshin Harper Priority: Minor Labels: ponies Fix For: 3.x I sometimes find myself wanting to abuse Cassandra datatypes when I want to interleave two different types in the same column. An example is in CASSANDRA-6167 where an approach is to tag what would normally be a numeric field with text indicating that it is special in some ways. A more elegant approach would be to be able to explicitly define disjoint unions in the style of Haskell's and Scala's Either types. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8099) Refactor and modernize the storage engine
[ https://issues.apache.org/jira/browse/CASSANDRA-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583530#comment-14583530 ] Sylvain Lebresne commented on CASSANDRA-8099: - Some update on this. I've pushed a rebased (and squashed, because that made it a *lot* easier to rebase) version in [my 8099 branch|https://github.com/pcmanus/cassandra/tree/8099]. It's still missing wire backward compatibility ([~thobbs] is finishing this so it should hopefully be ready soon). Regarding tests: * unit tests are almost green: mostly some failures remain in the Hadoop tests. I could actually use the experience of someone who knows these tests and the code involved, as it's not immediately clear to me what this is even doing. * dtests still have a fair amount of failures, but I've only looked at them recently and the count is going down quickly. h2. OpOrder I think the main problem was that a local read (done through {{SP.LocalReadRunnable}}) was potentially keeping a group open while waiting on other nodes. I also realized this path meant local reads (the actual read of sstables) were done outside of the {{StorageProxy}} methods, and so 1) not on the thread they were supposed to be on and 2) outside of the timeout check. I changed this so that a local response actually materializes everything upfront (similarly to what we do today), solving the problems above. This is not perfect and I'm sure we'll improve on this in the future, but that feels like a good enough option initially. Regarding moving {{OpOrder}} out of {{close}}, the only way to do that I can see is to move it up the stack, into {{ReadCommandVerbHandler}} and {{SP.LocalReadRunnable}} (as suggested by Branimir above). I'm working on that (I just started and might not have the time to finish today, but it'll be done early Monday for sure). h2. Branimir's review remarks I've integrated fixes for most of the remarks. I discuss the rest below. bq.
[CompactionIterable 125|https://github.com/pcmanus/cassandra/blob/75b98620e30b5df31431618cc21e090481f33967/src/java/org/apache/cassandra/db/compaction/CompactionIterable.java#L125]: I doubt index update belongs here, as side effect of iteration. Ideally index should be collected, not updated. Though I don't disagree on principle, this is not different from how it's done currently (it's done the same in {{LazilyCompactedRow}}, but it just happens that the old {{LazilyCompactedRow}} has been merged into {{CompactionIterable}} (now {{CompactionIterator}}) because simplifications of the patch made it unnecessary to have separate classes). Happy to look at cleaning this in a separate ticket however (it probably belongs to cleaning the 2ndary index API in practice). bq. [CompactionIterable 237|https://github.com/pcmanus/cassandra/blob/75b98620e30b5df31431618cc21e090481f33967/src/java/org/apache/cassandra/db/compaction/CompactionIterable.java#L237]: Another side effect of iteration that preferably should be handled by writer. Maybe, but it's not that simple. Merging (which is done directly by the {{CompactionIterator}}) gets rid of empty partitions, and more generally we get rid of them as soon as possible. I think that's the right thing to do as it's easier for the rest of the code, but it means we have to do invalidation in {{CompactionIterator}}. Of course, we could special-case {{CompactionIterator}} to not remove empty partitions and do cache invalidation externally, but I'm not sure it would be cleaner overall (it would be somewhat more error-prone). Besides, I could argue that cache invalidation is very much a product of compaction and having it in {{CompactionIterator}} is not that bad. bq. Validation compaction now uses CompactionIterable and thus has side effects (index cache removal).
I've fixed that, but I'll note for posterity that as far as I can tell, index removal is done for validation compaction on trunk (and all previous versions) due to the use of {{LazilyCompactedRow}}. I've still disabled it (for anything that isn't a true compaction) because I think that's the right thing to do, but that's a difference introduced by this ticket. bq. add that there is never content between two corresponding tombstone markers on any iterator. That's mentioned in Dealing with tombstones and shadowed cells. More precisely, that's what the part of the contract of an AtomIterator saying it must not shadow its own data means. But I need to clean up/update the guide, so I'll make sure to clarify further while at it. bq. These objects should be Iterable instead. Having that would give clear separation between the iteration process and the entity-level data Yes, it would be cleaner from that standpoint. And the use of iterators in the first place is indeed largely carried over from the existing code; I just hadn't really thought of the alternative tbh. I'll try to check next week how easily such
[jira] [Commented] (CASSANDRA-9487) CommitLogTest hangs intermittently in 2.0
[ https://issues.apache.org/jira/browse/CASSANDRA-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583542#comment-14583542 ] Branimir Lambov commented on CASSANDRA-9487: Ported the patch and uploaded it [here|https://github.com/apache/cassandra/compare/trunk...blambov:9487-2.0-cl-test-hang]. I tested that it normally succeeds, but I could not verify that it stops the hangs, because I could not reproduce the hanging in the first place. The log looks like it's the same problem, though. CommitLogTest hangs intermittently in 2.0 - Key: CASSANDRA-9487 URL: https://issues.apache.org/jira/browse/CASSANDRA-9487 Project: Cassandra Issue Type: Bug Components: Tests Reporter: Michael Shuler Assignee: Branimir Lambov Fix For: 2.0.x Attachments: system.log Possibly related to CASSANDRA-8992 ? 2.0 unit tests are hanging periodically in the same way (I have not gone through all the branches, so can't say we're in the clear everywhere - marking for just 2.x at the moment). CommitLogTest hung system.log attached from local reproduction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
cassandra git commit: undeprecate cache recentHitRate metrics
Repository: cassandra Updated Branches: refs/heads/trunk b1abcd048 - 8c19fd638 undeprecate cache recentHitRate metrics patch by Chris Burroughs; reviewed by benedict for CASSANDRA-6591 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8c19fd63 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8c19fd63 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8c19fd63 Branch: refs/heads/trunk Commit: 8c19fd638da7d5525e85d0cce41aa86e02798108 Parents: b1abcd0 Author: Chris Burroughs chris.burroughs+apa...@gmail.com Authored: Fri Jun 12 12:22:30 2015 +0100 Committer: Benedict Elliott Smith bened...@apache.org Committed: Fri Jun 12 12:22:30 2015 +0100 -- CHANGES.txt | 1 + .../apache/cassandra/metrics/CacheMetrics.java | 29 - .../org/apache/cassandra/utils/DynamicList.java | 2 +- .../apache/cassandra/utils/FasterRandom.java| 116 --- .../cassandra/stress/generate/FasterRandom.java | 116 +++ .../cassandra/stress/generate/values/Bytes.java | 2 +- .../stress/generate/values/Strings.java | 2 +- 7 files changed, 148 insertions(+), 120 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/8c19fd63/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index b80f272..35e02a2 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -5,6 +5,7 @@ * Change gossip stabilization to use endpoit size (CASSANDRA-9401) * Change default garbage collector to G1 (CASSANDRA-7486) * Populate TokenMetadata early during startup (CASSANDRA-9317) + * undeprecate cache recentHitRate (CASSANDRA-6591) 2.2 http://git-wip-us.apache.org/repos/asf/cassandra/blob/8c19fd63/src/java/org/apache/cassandra/metrics/CacheMetrics.java -- diff --git a/src/java/org/apache/cassandra/metrics/CacheMetrics.java b/src/java/org/apache/cassandra/metrics/CacheMetrics.java index 8b00e1c..151268b 100644 --- a/src/java/org/apache/cassandra/metrics/CacheMetrics.java +++ b/src/java/org/apache/cassandra/metrics/CacheMetrics.java @@ -37,8 
+37,14 @@ public class CacheMetrics public final Meter hits; /** Total number of cache requests */ public final Meter requests; -/** cache hit rate */ +/** all time cache hit rate */ public final GaugeDouble hitRate; +/** 1m hit rate */ +public final GaugeDouble oneMinuteHitRate; +/** 5m hit rate */ +public final GaugeDouble fiveMinuteHitRate; +/** 15m hit rate */ +public final GaugeDouble fifteenMinuteHitRate; /** Total size of cache, in bytes */ public final GaugeLong size; /** Total number of cache entries */ @@ -71,6 +77,27 @@ public class CacheMetrics return Ratio.of(hits.getCount(), requests.getCount()); } }); +oneMinuteHitRate = Metrics.register(factory.createMetricName(OneMinuteHitRate), new RatioGauge() +{ +protected Ratio getRatio() +{ +return Ratio.of(hits.getOneMinuteRate(), requests.getOneMinuteRate()); +} +}); +fiveMinuteHitRate = Metrics.register(factory.createMetricName(FiveMinuteHitRate), new RatioGauge() +{ +protected Ratio getRatio() +{ +return Ratio.of(hits.getFiveMinuteRate(), requests.getFiveMinuteRate()); +} +}); +fifteenMinuteHitRate = Metrics.register(factory.createMetricName(FifteenMinuteHitRate), new RatioGauge() +{ +protected Ratio getRatio() +{ +return Ratio.of(hits.getFifteenMinuteRate(), requests.getFifteenMinuteRate()); +} +}); size = Metrics.register(factory.createMetricName(Size), new GaugeLong() { public Long getValue() http://git-wip-us.apache.org/repos/asf/cassandra/blob/8c19fd63/src/java/org/apache/cassandra/utils/DynamicList.java -- diff --git a/src/java/org/apache/cassandra/utils/DynamicList.java b/src/java/org/apache/cassandra/utils/DynamicList.java index fc3d523..30f5160 100644 --- a/src/java/org/apache/cassandra/utils/DynamicList.java +++ b/src/java/org/apache/cassandra/utils/DynamicList.java @@ -238,7 +238,7 @@ public class DynamicListE canon.add(c); c++; } -FasterRandom rand = new FasterRandom(); +ThreadLocalRandom rand = ThreadLocalRandom.current(); assert list.isWellFormed(); for (int loop = 0 ; loop 100 ; loop++) {
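The gauges this commit adds all compute the same kind of ratio over a meter's counts or windowed rates. A stand-alone model of that calculation (mirroring RatioGauge's behavior in the Metrics library, which yields NaN for a zero denominator):

```java
public class HitRateModel {
    // The all-time gauge divides hit count by request count; the 1m/5m/15m
    // gauges divide the corresponding exponentially-weighted meter rates.
    // A zero denominator yields NaN rather than throwing, matching RatioGauge.
    static double ratio(double hits, double requests) {
        return requests == 0 ? Double.NaN : hits / requests;
    }

    public static void main(String[] args) {
        System.out.println(ratio(75, 100)); // e.g. all-time hit rate
        System.out.println(ratio(0, 0));    // no requests yet: NaN, not an error
    }
}
```

This is why undeprecating recentHitRate-style metrics is cheap: each extra gauge is just another ratio over rates the Meter already tracks.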
[jira] [Commented] (CASSANDRA-8099) Refactor and modernize the storage engine
[ https://issues.apache.org/jira/browse/CASSANDRA-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583553#comment-14583553 ] Benedict commented on CASSANDRA-8099: - bq. Yes, it would be cleaner from that standpoint. And the use of iterators in the first place is indeed largely carried from the existing code, I just hadn't really thought of the alternative tbh. I'll try to check next week how easy such a change would be. That said, I'm not sure the use of iterators directly is that confusing either, so if it turns hairy, I don't think it's worth blocking on this (that is, we can very well change that later). It does change the semantics quite a bit, since the state needed for iterating must be constructed again each time, and is likely constructed in the caller of .iterator(). This has both advantages and disadvantages. One advantage of an Iterator, though, is that you cannot (easily) iterate over its contents twice. I'm personally not so upset at the use of Iterator, since it's a continuation of the existing approach, and Java 8 makes working with iterators a little easier. We can, for instance, make use of the forEachRemaining() method, or otherwise transform the iterator. I don't think there's any increased ugliness inherent in exposing the higher-level information in the Iterator, though. I believe [~iamaleksey] is working on a way to integrate the Java Streams API at some point in the future, which may lead to other benefits that Iterable cannot deliver. Either way, I think getting this ticket in sooner rather than later is better, and we can focus on how we might make the Iterator abstraction a little nicer in a follow-up. 
Refactor and modernize the storage engine - Key: CASSANDRA-8099 URL: https://issues.apache.org/jira/browse/CASSANDRA-8099 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Fix For: 3.0 beta 1 Attachments: 8099-nit The current storage engine (which for this ticket I'll loosely define as the code implementing the read/write path) is suffering from old age. One of the main problems is that the only structure it deals with is the cell, which completely ignores the higher-level CQL structure that groups cells into (CQL) rows. This leads to many inefficiencies, like the fact that during a read we have to group cells multiple times (to count on the replica, then to count on the coordinator, then to produce the CQL resultset) because we forget about the grouping right away each time (so lots of useless cell name comparisons in particular). But beyond inefficiencies, having to manually recreate the CQL structure every time we need it for something is hindering new features and makes the code more complex than it should be. Said storage engine also has tons of technical debt. To pick an example, the fact that during range queries we update {{SliceQueryFilter.count}} is pretty hacky and error prone. Or the overly complex lengths {{AbstractQueryPager}} has to go to just to remove the last query result. So I want to bite the bullet and modernize this storage engine. I propose to do 2 main things: # Make the storage engine more aware of the CQL structure. In practice, instead of having partitions be a simple iterable map of cells, they should be an iterable list of rows (each being itself composed of per-column cells, though obviously not exactly the same kind of cell we have today). # Make the engine more iterative. What I mean here is that in the read path, we end up reading all cells in memory (we put them in a ColumnFamily object), but there is really no reason to. 
If instead we were working with iterators all the way through, we could get to a point where we're basically transferring data from disk to the network, and we should be able to reduce GC substantially. Please note that such a refactor should provide some performance improvements right off the bat, but that's not its primary goal. Its primary goal is to simplify the storage engine and add abstractions that are better suited to further optimizations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
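As a small illustration of the Java 8 point in the discussion above — draining an iterator with `forEachRemaining` so data streams through once instead of being collected up front — a sketch (using `String` as a stand-in for the engine's real row type):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class IteratorDemo {
    // Drain an iterator exactly once with Java 8's forEachRemaining, the way
    // an iterator-based read path streams rows without first collecting them
    // into an in-memory container (e.g. a ColumnFamily object).
    static List<String> drain(Iterator<String> rows) {
        List<String> out = new ArrayList<>();
        rows.forEachRemaining(r -> out.add(r.toUpperCase()));
        return out;
    }

    public static void main(String[] args) {
        Iterator<String> it = Arrays.asList("a", "b").iterator();
        System.out.println(drain(it)); // [A, B]
        // The one-shot property mentioned in the comment:
        // the iterator is now exhausted, so no accidental second pass.
        System.out.println(it.hasNext()); // false
    }
}
```

The exhausted-after-one-pass behavior is exactly the "advantage of an Iterator" Benedict notes: callers cannot silently re-traverse data that was meant to be consumed once.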
[3/4] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2
Merge branch 'cassandra-2.1' into cassandra-2.2 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/271c9e4a Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/271c9e4a Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/271c9e4a Branch: refs/heads/trunk Commit: 271c9e4ac7a71252e4f4f1984fd4f8f16058bcde Parents: b61da9b 69b7dd3 Author: Marcus Eriksson marc...@apache.org Authored: Fri Jun 12 18:51:48 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Jun 12 18:51:48 2015 +0200 -- .../apache/cassandra/db/compaction/CompactionController.java | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/271c9e4a/src/java/org/apache/cassandra/db/compaction/CompactionController.java --
[jira] [Commented] (CASSANDRA-9590) Support for both encrypted and unencrypted native transport connections
[ https://issues.apache.org/jira/browse/CASSANDRA-9590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583700#comment-14583700 ] Vishy Kasar commented on CASSANDRA-9590: Thanks, this is similar to one of the features we requested: https://issues.apache.org/jira/browse/CASSANDRA-8803 : Implement transitional mode in C* that will accept both encrypted and non-encrypted client traffic Support for both encrypted and unencrypted native transport connections --- Key: CASSANDRA-9590 URL: https://issues.apache.org/jira/browse/CASSANDRA-9590 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stefan Podkowinski Enabling encryption for native transport currently turns SSL exclusively on or off for the opened socket. Migrating from plain to encrypted currently requires migrating all native clients as well and redeploying all of them at the same time after starting the SSL-enabled Cassandra nodes. This patch would allow starting Cassandra with both an unencrypted and an SSL-enabled native port. Clients can connect to either, based on whether they support SSL or not. This has been implemented by introducing a new {{native_transport_port_ssl}} config option. There would be three scenarios: * client encryption disabled: native_transport_port unencrypted, port_ssl not used * client encryption enabled, port_ssl not set: encrypted native_transport_port * client encryption enabled and port_ssl set: native_transport_port unencrypted, port_ssl encrypted This approach would keep configuration behavior fully backwards compatible. Patch proposal (tests will be added later if people speak out in favor of the patch): [Diff trunk|https://github.com/apache/cassandra/compare/trunk...spodkowinski:feat/optionalnativessl], [Patch against trunk|https://github.com/apache/cassandra/compare/trunk...spodkowinski:feat/optionalnativessl.patch] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
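The three configuration scenarios in the description reduce to a small decision table. A sketch of that resolution logic (class and method names are hypothetical, not the actual Config code; 9042/9142 are illustrative port numbers):

```java
public class NativePortResolver {
    // Resolve which port speaks plaintext and which speaks SSL, following the
    // three scenarios in CASSANDRA-9590's description. A null portSsl means
    // native_transport_port_ssl is unset.
    static String resolve(boolean encryptionEnabled, int port, Integer portSsl) {
        if (!encryptionEnabled)
            return "plain=" + port + " ssl=none";     // ssl port not used
        if (portSsl == null)
            return "plain=none ssl=" + port;          // single, encrypted port
        return "plain=" + port + " ssl=" + portSsl;   // both ports, side by side
    }

    public static void main(String[] args) {
        System.out.println(resolve(false, 9042, 9142)); // plain=9042 ssl=none
        System.out.println(resolve(true, 9042, null));  // plain=none ssl=9042
        System.out.println(resolve(true, 9042, 9142));  // plain=9042 ssl=9142
    }
}
```

Because unset `port_ssl` falls back to the pre-patch behavior in both the enabled and disabled cases, existing configurations resolve exactly as before — the backwards-compatibility claim in the ticket.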
[jira] [Created] (CASSANDRA-9591) Scrub (recover) sstables even when -Index.db is missing
mck created CASSANDRA-9591: -- Summary: Scrub (recover) sstables even when -Index.db is missing Key: CASSANDRA-9591 URL: https://issues.apache.org/jira/browse/CASSANDRA-9591 Project: Cassandra Issue Type: Improvement Reporter: mck Assignee: mck Today SSTableReader needs at minimum 3 files to load an sstable: - -Data.db - -CompressionInfo.db - -Index.db But during the scrub process the -Index.db file isn't actually necessary, unless there's corruption in the -Data.db and we want to be able to skip over corrupted rows. Given that there is still a fair chance that there's nothing wrong with the -Data.db file and we're just missing the -Index.db file, this patch addresses that situation. So the following patch makes it possible for the StandaloneScrubber (sstablescrub) to recover sstables despite missing -Index.db files. This can happen after a catastrophic incident where data directories have been lost and/or corrupted, or wiped and the backup not healthy. I'm aware that normally one depends on replicas or snapshots to avoid such situations, but such catastrophic incidents do occur in the wild. I have not tested this patch against normal c* operations and all the other (more critical) ways SSTableReader is used. I'll happily do that and add the needed unit tests if people see merit in accepting the patch. Otherwise the patch can live with the issue, in case anyone else needs it. I've uploaded a cassandra distribution bundled with the patch as well to make life a little easier for anyone finding themselves in such a bad situation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[1/3] cassandra git commit: Ignore fully expired sstables when finding min timestamp
Repository: cassandra Updated Branches: refs/heads/cassandra-2.2 b61da9b56 -> 271c9e4ac Ignore fully expired sstables when finding min timestamp Patch by marcuse; reviewed by slebresne for CASSANDRA-9572 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9e60611f Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9e60611f Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9e60611f Branch: refs/heads/cassandra-2.2 Commit: 9e60611fb807ad1bd03a13ef1fe55bf905100064 Parents: 3ddd17b Author: Marcus Eriksson marc...@apache.org Authored: Thu Jun 11 08:33:54 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Jun 12 18:50:01 2015 +0200 -- .../apache/cassandra/db/compaction/CompactionController.java | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/9e60611f/src/java/org/apache/cassandra/db/compaction/CompactionController.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionController.java b/src/java/org/apache/cassandra/db/compaction/CompactionController.java index 7a4b7d9..59453cc 100644 --- a/src/java/org/apache/cassandra/db/compaction/CompactionController.java +++ b/src/java/org/apache/cassandra/db/compaction/CompactionController.java @@ -102,7 +102,12 @@ public class CompactionController long minTimestamp = Long.MAX_VALUE; for (SSTableReader sstable : overlapping) -minTimestamp = Math.min(minTimestamp, sstable.getMinTimestamp()); +{ +// Overlapping might include fully expired sstables. What we care about here is +// the min timestamp of the overlapping sstables that actually contain live data. +if (sstable.getSSTableMetadata().maxLocalDeletionTime >= gcBefore) +minTimestamp = Math.min(minTimestamp, sstable.getMinTimestamp()); +} for (SSTableReader candidate : compacting) {
[1/2] cassandra git commit: Ignore fully expired sstables when finding min timestamp
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 16665ee19 -> 69b7dd327 Ignore fully expired sstables when finding min timestamp Patch by marcuse; reviewed by slebresne for CASSANDRA-9572 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9e60611f Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9e60611f Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9e60611f Branch: refs/heads/cassandra-2.1 Commit: 9e60611fb807ad1bd03a13ef1fe55bf905100064 Parents: 3ddd17b Author: Marcus Eriksson marc...@apache.org Authored: Thu Jun 11 08:33:54 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Jun 12 18:50:01 2015 +0200 -- .../apache/cassandra/db/compaction/CompactionController.java | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/9e60611f/src/java/org/apache/cassandra/db/compaction/CompactionController.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionController.java b/src/java/org/apache/cassandra/db/compaction/CompactionController.java index 7a4b7d9..59453cc 100644 --- a/src/java/org/apache/cassandra/db/compaction/CompactionController.java +++ b/src/java/org/apache/cassandra/db/compaction/CompactionController.java @@ -102,7 +102,12 @@ public class CompactionController long minTimestamp = Long.MAX_VALUE; for (SSTableReader sstable : overlapping) -minTimestamp = Math.min(minTimestamp, sstable.getMinTimestamp()); +{ +// Overlapping might include fully expired sstables. What we care about here is +// the min timestamp of the overlapping sstables that actually contain live data. +if (sstable.getSSTableMetadata().maxLocalDeletionTime >= gcBefore) +minTimestamp = Math.min(minTimestamp, sstable.getMinTimestamp()); +} for (SSTableReader candidate : compacting) {
[2/3] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/69b7dd32 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/69b7dd32 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/69b7dd32 Branch: refs/heads/cassandra-2.2 Commit: 69b7dd327443239b70a104dfe960bd0aa2ccf0a5 Parents: 16665ee 9e60611 Author: Marcus Eriksson marc...@apache.org Authored: Fri Jun 12 18:51:39 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Jun 12 18:51:39 2015 +0200 -- .../apache/cassandra/db/compaction/CompactionController.java | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/69b7dd32/src/java/org/apache/cassandra/db/compaction/CompactionController.java --
cassandra git commit: Ignore fully expired sstables when finding min timestamp
Repository: cassandra Updated Branches: refs/heads/cassandra-2.0 3ddd17b77 -> 9e60611fb Ignore fully expired sstables when finding min timestamp Patch by marcuse; reviewed by slebresne for CASSANDRA-9572 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9e60611f Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9e60611f Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9e60611f Branch: refs/heads/cassandra-2.0 Commit: 9e60611fb807ad1bd03a13ef1fe55bf905100064 Parents: 3ddd17b Author: Marcus Eriksson marc...@apache.org Authored: Thu Jun 11 08:33:54 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Jun 12 18:50:01 2015 +0200 -- .../apache/cassandra/db/compaction/CompactionController.java | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/9e60611f/src/java/org/apache/cassandra/db/compaction/CompactionController.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionController.java b/src/java/org/apache/cassandra/db/compaction/CompactionController.java index 7a4b7d9..59453cc 100644 --- a/src/java/org/apache/cassandra/db/compaction/CompactionController.java +++ b/src/java/org/apache/cassandra/db/compaction/CompactionController.java @@ -102,7 +102,12 @@ public class CompactionController long minTimestamp = Long.MAX_VALUE; for (SSTableReader sstable : overlapping) -minTimestamp = Math.min(minTimestamp, sstable.getMinTimestamp()); +{ +// Overlapping might include fully expired sstables. What we care about here is +// the min timestamp of the overlapping sstables that actually contain live data. +if (sstable.getSSTableMetadata().maxLocalDeletionTime >= gcBefore) +minTimestamp = Math.min(minTimestamp, sstable.getMinTimestamp()); +} for (SSTableReader candidate : compacting) {
[2/2] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/69b7dd32 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/69b7dd32 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/69b7dd32 Branch: refs/heads/cassandra-2.1 Commit: 69b7dd327443239b70a104dfe960bd0aa2ccf0a5 Parents: 16665ee 9e60611 Author: Marcus Eriksson marc...@apache.org Authored: Fri Jun 12 18:51:39 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Jun 12 18:51:39 2015 +0200 -- .../apache/cassandra/db/compaction/CompactionController.java | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/69b7dd32/src/java/org/apache/cassandra/db/compaction/CompactionController.java --
[3/3] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2
Merge branch 'cassandra-2.1' into cassandra-2.2 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/271c9e4a Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/271c9e4a Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/271c9e4a Branch: refs/heads/cassandra-2.2 Commit: 271c9e4ac7a71252e4f4f1984fd4f8f16058bcde Parents: b61da9b 69b7dd3 Author: Marcus Eriksson marc...@apache.org Authored: Fri Jun 12 18:51:48 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Jun 12 18:51:48 2015 +0200 -- .../apache/cassandra/db/compaction/CompactionController.java | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/271c9e4a/src/java/org/apache/cassandra/db/compaction/CompactionController.java --
[4/4] cassandra git commit: Merge branch 'cassandra-2.2' into trunk
Merge branch 'cassandra-2.2' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c2e54ddc Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c2e54ddc Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c2e54ddc Branch: refs/heads/trunk Commit: c2e54ddc3f47912814323dcc4fca45300db2c518 Parents: 8c19fd6 271c9e4 Author: Marcus Eriksson marc...@apache.org Authored: Fri Jun 12 18:52:05 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Jun 12 18:52:05 2015 +0200 -- .../apache/cassandra/db/compaction/CompactionController.java | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) --
[1/4] cassandra git commit: Ignore fully expired sstables when finding min timestamp
Repository: cassandra Updated Branches: refs/heads/trunk 8c19fd638 -> c2e54ddc3 Ignore fully expired sstables when finding min timestamp Patch by marcuse; reviewed by slebresne for CASSANDRA-9572 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9e60611f Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9e60611f Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9e60611f Branch: refs/heads/trunk Commit: 9e60611fb807ad1bd03a13ef1fe55bf905100064 Parents: 3ddd17b Author: Marcus Eriksson marc...@apache.org Authored: Thu Jun 11 08:33:54 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Jun 12 18:50:01 2015 +0200 -- .../apache/cassandra/db/compaction/CompactionController.java | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/9e60611f/src/java/org/apache/cassandra/db/compaction/CompactionController.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionController.java b/src/java/org/apache/cassandra/db/compaction/CompactionController.java index 7a4b7d9..59453cc 100644 --- a/src/java/org/apache/cassandra/db/compaction/CompactionController.java +++ b/src/java/org/apache/cassandra/db/compaction/CompactionController.java @@ -102,7 +102,12 @@ public class CompactionController long minTimestamp = Long.MAX_VALUE; for (SSTableReader sstable : overlapping) -minTimestamp = Math.min(minTimestamp, sstable.getMinTimestamp()); +{ +// Overlapping might include fully expired sstables. What we care about here is +// the min timestamp of the overlapping sstables that actually contain live data. +if (sstable.getSSTableMetadata().maxLocalDeletionTime >= gcBefore) +minTimestamp = Math.min(minTimestamp, sstable.getMinTimestamp()); +} for (SSTableReader candidate : compacting) {
[2/4] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/69b7dd32 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/69b7dd32 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/69b7dd32 Branch: refs/heads/trunk Commit: 69b7dd327443239b70a104dfe960bd0aa2ccf0a5 Parents: 16665ee 9e60611 Author: Marcus Eriksson marc...@apache.org Authored: Fri Jun 12 18:51:39 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Jun 12 18:51:39 2015 +0200 -- .../apache/cassandra/db/compaction/CompactionController.java | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/69b7dd32/src/java/org/apache/cassandra/db/compaction/CompactionController.java --
[jira] [Updated] (CASSANDRA-9591) Scrub (recover) sstables even when -Index.db is missing
[ https://issues.apache.org/jira/browse/CASSANDRA-9591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mck updated CASSANDRA-9591: --- Fix Version/s: 2.0.15 Scrub (recover) sstables even when -Index.db is missing --- Key: CASSANDRA-9591 URL: https://issues.apache.org/jira/browse/CASSANDRA-9591 Project: Cassandra Issue Type: Improvement Reporter: mck Assignee: mck Fix For: 2.0.15 Today SSTableReader needs at minimum 3 files to load an sstable: - -Data.db - -CompressionInfo.db - -Index.db But during the scrub process the -Index.db file isn't actually necessary, unless there's corruption in the -Data.db and we want to be able to skip over corrupted rows. Given that there is still a fair chance that there's nothing wrong with the -Data.db file and we're just missing the -Index.db file this patch addresses that situation. So the following patch makes it possible for the StandaloneScrubber (sstablescrub) to recover sstables despite missing -Index.db files. This can happen from a catastrophic incident where data directories have been lost and/or corrupted, or wiped and the backup not healthy. I'm aware that normally one depends on replicas or snapshots to avoid such situations, but such catastrophic incidents do occur in the wild. I have not tested this patch against normal c* operations and all the other (more critical) ways SSTableReader is used. i'll happily do that and add the needed units tests if people see merit in accepting the patch. Otherwise the patch can live with the issue, in-case anyone else needs it. I've uploaded a cassandra distribution bundled with the patch as well to make life a little easier for anyone finding themselves in such a bad situation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9591) Scrub (recover) sstables even when -Index.db is missing
[ https://issues.apache.org/jira/browse/CASSANDRA-9591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mck updated CASSANDRA-9591: --- Attachment: 9591-2.0.txt Scrub (recover) sstables even when -Index.db is missing --- Key: CASSANDRA-9591 URL: https://issues.apache.org/jira/browse/CASSANDRA-9591 Project: Cassandra Issue Type: Improvement Reporter: mck Assignee: mck Labels: sstablescrub Fix For: 2.0.15 Attachments: 9591-2.0.txt Today SSTableReader needs at minimum 3 files to load an sstable: - -Data.db - -CompressionInfo.db - -Index.db But during the scrub process the -Index.db file isn't actually necessary, unless there's corruption in the -Data.db and we want to be able to skip over corrupted rows. Given that there is still a fair chance that there's nothing wrong with the -Data.db file and we're just missing the -Index.db file this patch addresses that situation. So the following patch makes it possible for the StandaloneScrubber (sstablescrub) to recover sstables despite missing -Index.db files. This can happen from a catastrophic incident where data directories have been lost and/or corrupted, or wiped and the backup not healthy. I'm aware that normally one depends on replicas or snapshots to avoid such situations, but such catastrophic incidents do occur in the wild. I have not tested this patch against normal c* operations and all the other (more critical) ways SSTableReader is used. i'll happily do that and add the needed units tests if people see merit in accepting the patch. Otherwise the patch can live with the issue, in-case anyone else needs it. I've uploaded a cassandra distribution bundled with the patch as well to make life a little easier for anyone finding themselves in such a bad situation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-9592) Periodically attempt to submit background compaction tasks
Benedict created CASSANDRA-9592: --- Summary: Periodically attempt to submit background compaction tasks Key: CASSANDRA-9592 URL: https://issues.apache.org/jira/browse/CASSANDRA-9592 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Trivial Fix For: 2.1.x There are more race conditions affecting compaction task submission than just CASSANDRA-7745, so to prevent some of these problems stalling compactions, I propose simply submitting background compactions once every minute, if possible. This will typically be a no-op, but there's no harm in that, since it's very cheap to do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
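The periodic "submit if possible" idea above can be sketched with a `ScheduledExecutorService`. This is an illustrative stand-in, not the ticket's actual patch: the real task would be the compaction manager's background-submission call (a no-op when there is nothing to do) on a one-minute period, shortened here so the demo runs quickly.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PeriodicSubmitter {
    // Fire `runs` submission attempts at a fixed period; returns true once
    // they have all fired within the timeout. The countDown() stands in for
    // the cheap, usually-no-op background compaction submission.
    static boolean fires(int runs, long periodMs, long timeoutSec) throws InterruptedException {
        CountDownLatch ran = new CountDownLatch(runs);
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(ran::countDown, 0, periodMs, TimeUnit.MILLISECONDS);
        try {
            return ran.await(timeoutSec, TimeUnit.SECONDS);
        } finally {
            timer.shutdownNow(); // stop the timer once the demo is done
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(fires(3, 20, 5)); // true: the task keeps firing
    }
}
```

The point of the design is that the timer needs no knowledge of which race was lost: as long as the periodic attempt runs, any missed submission is repaired within one period.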
[jira] [Commented] (CASSANDRA-9590) Support for both encrypted and unencrypted native transport connections
[ https://issues.apache.org/jira/browse/CASSANDRA-9590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583429#comment-14583429 ] Mike Adamson commented on CASSANDRA-9590: - Have you considered doing this with TLS instead SSL? That would allow encrypted and unencrypted connections over the same port. Support for both encrypted and unencrypted native transport connections --- Key: CASSANDRA-9590 URL: https://issues.apache.org/jira/browse/CASSANDRA-9590 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stefan Podkowinski Enabling encryption for native transport currently turns SSL exclusively on or off for the opened socket. Migrating from plain to encrypted requires to migrate all native clients as well and redeploy all of them at the same time after starting the SSL enabled Cassandra nodes. This patch would allow to start Cassandra with both an unencrypted and ssl enabled native port. Clients can connect to either, based whether they support ssl or not. This has been implemented by introducing a new {{native_transport_port_ssl}} config option. There would be three scenarios: * client encryption disabled: native_transport_port unencrypted, port_ssl not used * client encryption enabled, port_ssl not set: encrypted native_transport_port * client encryption enabled and port_ssl set: native_transport_port unencrypted, port_ssl encrypted This approach would keep configuration behavior fully backwards compatible. Patch proposal (tests will be added later in case people will speak out in favor for the patch): [Diff trunk|https://github.com/apache/cassandra/compare/trunk...spodkowinski:feat/optionalnativessl], [Patch against trunk|https://github.com/apache/cassandra/compare/trunk...spodkowinski:feat/optionalnativessl.patch] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7918) Provide graphing tool along with cassandra-stress
[ https://issues.apache.org/jira/browse/CASSANDRA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-7918: --- Reviewer: Joshua McKenzie Provide graphing tool along with cassandra-stress - Key: CASSANDRA-7918 URL: https://issues.apache.org/jira/browse/CASSANDRA-7918 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Benedict Assignee: Ryan McGuire Priority: Minor Attachments: 7918.patch Whilst cstar makes some pretty graphs, they're a little limited and also require you to run your tests through it. It would be useful to be able to graph results from any stress run easily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-8460) Make it possible to move non-compacting sstables to slow/big storage in DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-8460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583809#comment-14583809 ] Jeff Jirsa edited comment on CASSANDRA-8460 at 6/12/15 6:07 PM: {quote}yes, I've been thinking maybe adding priorities or tags to the data directories, but that is probably not needed now. Adding a flag to each data_directory that states whether it is for archival storage or not is probably enough for now.{quote} Asking for clarification to make sure I don't go too far into pony land: So my initial approach was to define a second config item, separate from {{data_file_directories}} entirely, so that no other code needed to be aware of it except for classes explicitly wanting to use `archive` tier storage ( {{dd.getAllDataFileLocations()}} would not return the archive tier, but rather add a {{dd.getArchiveDataFileLocations()}} specifically for the slow class of storage). It sounds from your description you're envisioning changing the list of data_file_locations to a map {noformat} tag1:location1,tag1:location2,tag3:location3 {noformat} or {noformat} tag1:[location1,location2],tag3:[location3] {noformat} In this case, we'd also need to maintain backwards compatibility, which seems fairly straight forward to do (check to see if the provided {{data_files_directory}} is an old-format list rather than map and apply some default tag?) The first approach is clean and isolated, unlikely to introduce surprises, but potentially limits us from being able to do more interesting work with tagged data file directories later (ie: only store data for KS W in data directories tagged X, and KS Y in data directories tagged Z). Can you clarify which best fits your expectations? was (Author: jjirsa): {quote}yes, I've been thinking maybe adding priorities or tags to the data directories, but that is probably not needed now. 
Adding a flag to each data_directory that states whether it is for archival storage or not is probably enough for now.{quote} Asking for clarification to make sure I don't go too far into pony land: So my initial approach was to define a second config item, separate from {{data_file_directories}} entirely, so that no other code needed to be aware of it except for classes explicitly wanting to use `archive` tier storage ( {{dd.getAllDataFileLocations()}} would not return the archive tier, but rather add a {{dd.getArchiveDataFileLocations()}} specifically for the slow class of storage). It sounds from your description you're envisioning changing the list of data_file_locations to a map {noformat} [tag1:location1,tag1:location2,tag3:location3] {noformat} or {noformat} tag1:[location1,location2],tag3:[location3] {noformat} In this case, we'd also need to maintain backwards compatibility, which seems fairly straight forward to do (check to see if the provided {{data_files_directory}} is an old-format list rather than map and apply some default tag?) The first approach is clean and isolated, unlikely to introduce surprises, but potentially limits us from being able to do more interesting work with tagged data file directories later (ie: only store data for KS W in data directories tagged X, and KS Y in data directories tagged Z). Can you clarify which best fits your expectations? Make it possible to move non-compacting sstables to slow/big storage in DTCS Key: CASSANDRA-8460 URL: https://issues.apache.org/jira/browse/CASSANDRA-8460 Project: Cassandra Issue Type: Improvement Reporter: Marcus Eriksson Labels: dtcs It would be nice if we could configure DTCS to have a set of extra data directories where we move the sstables once they are older than max_sstable_age_days. This would enable users to have a quick, small SSD for hot, new data, and big spinning disks for data that is rarely read and never compacted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
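The backwards-compatibility check Jeff describes — accept the old flat list of directories, or the proposed tagged map, applying a default tag to untagged entries — can be sketched as follows. The format and names here are hypothetical, mirroring the comment's `tag:location` idea, not an agreed-upon config syntax:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DataDirConfig {
    // Parse data directories from either the legacy flat list
    // "/data1,/data2" or the proposed tagged form "hot:/ssd1,archive:/hdd1".
    // Returns location -> tag; untagged entries get the "default" tag.
    static Map<String, String> parse(String raw) {
        Map<String, String> dirs = new LinkedHashMap<>();
        for (String entry : raw.split(",")) {
            int colon = entry.indexOf(':');
            if (colon < 0)
                dirs.put(entry.trim(), "default");            // legacy entry
            else
                dirs.put(entry.substring(colon + 1).trim(),
                         entry.substring(0, colon).trim());   // tagged entry
        }
        return dirs;
    }

    public static void main(String[] args) {
        System.out.println(parse("/data1,/data2"));
        System.out.println(parse("hot:/ssd1,archive:/hdd1"));
    }
}
```

Because untagged entries map to a default tag, callers that only know about the default tier see exactly the old behavior, while tier-aware code (e.g. DTCS archiving) can ask for a specific tag.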
[jira] [Created] (CASSANDRA-9593) Compaction may stall due to race condition
Benedict created CASSANDRA-9593: --- Summary: Compaction may stall due to race condition Key: CASSANDRA-9593 URL: https://issues.apache.org/jira/browse/CASSANDRA-9593 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benedict Priority: Minor Fix For: 2.2.x If the maximum number of compactions are running, and they all terminate simultaneously, they can fail to submit any further compaction tasks. Further, since each only submits one on completion, we only need two of these to race with each other to reduce the number of active compactions below the configured concurrency level. There are a couple of ways to get around this. The simplest is to submit a task to another thread pool to perform the submitBackgroundTask(), but this may be unnecessarily delayed. Another is to maintain a separate count of active compaction tasks, that is decremented while the thread is still serving the request. A partial solution is to just discount the calling thread from the count of active tasks, so at least one of any competitors will win. The problem is mitigated considerably by CASSANDRA-9592, so there's no urgency to this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
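The second idea in the ticket above, a separate count of active compaction tasks that is decremented while the finishing thread is still serving the request, can be sketched as below. This is a minimal illustration of the counting scheme, not Cassandra's CompactionManager; the class and method names are invented.

```python
# Sketch: a slot counter where a finishing task decrements the active
# count *before* checking whether another task may be submitted, so the
# finishing thread does not count itself against the concurrency limit.
import threading

class CompactionSlots:
    def __init__(self, concurrency):
        self.concurrency = concurrency
        self.active = 0
        self.lock = threading.Lock()

    def try_acquire(self):
        """Claim a slot if the pool is below its concurrency level."""
        with self.lock:
            if self.active < self.concurrency:
                self.active += 1
                return True
            return False

    def release_and_check(self):
        """Decrement while 'still serving the request', then report
        whether a follow-up task could be submitted immediately."""
        with self.lock:
            self.active -= 1
            return self.active < self.concurrency

slots = CompactionSlots(concurrency=2)
assert slots.try_acquire() and slots.try_acquire()
assert not slots.try_acquire()     # pool saturated
assert slots.release_and_check()   # finishing thread sees free capacity
```

If the decrement instead happened only after the submission check, two simultaneously finishing tasks could each see a "full" pool and submit nothing, which is the stall described in the ticket.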
[jira] [Updated] (CASSANDRA-9591) Scrub (recover) sstables even when -Index.db is missing
[ https://issues.apache.org/jira/browse/CASSANDRA-9591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mck updated CASSANDRA-9591: --- Description: Today SSTableReader needs at minimum 3 files to load an sstable: - -Data.db - -CompressionInfo.db - -Index.db But during the scrub process the -Index.db file isn't actually necessary, unless there's corruption in the -Data.db and we want to be able to skip over corrupted rows. Given that there is still a fair chance that there's nothing wrong with the -Data.db file and we're just missing the -Index.db file, this patch addresses that situation. So the following patch makes it possible for the StandaloneScrubber (sstablescrub) to recover sstables despite missing -Index.db files. This can happen from a catastrophic incident where data directories have been lost and/or corrupted, or wiped and the backup not healthy. I'm aware that normally one depends on replicas or snapshots to avoid such situations, but such catastrophic incidents do occur in the wild. I have not tested this patch against normal c* operations and all the other (more critical) ways SSTableReader is used. I'll happily do that and add the needed unit tests if people see merit in accepting the patch. Otherwise the patch can live with the issue, in case anyone else needs it. There's also a cassandra distribution bundled with the patch [here|https://github.com/michaelsembwever/cassandra/releases/download/2.0.15-recover-sstables-without-indexdb/apache-cassandra-2.0.15-recover-sstables-without-indexdb.tar.gz] to make life a little easier for anyone finding themselves in such a bad situation. was: Today SSTableReader needs at minimum 3 files to load an sstable: - -Data.db - -CompressionInfo.db - -Index.db But during the scrub process the -Index.db file isn't actually necessary, unless there's corruption in the -Data.db and we want to be able to skip over corrupted rows. 
Given that there is still a fair chance that there's nothing wrong with the -Data.db file and we're just missing the -Index.db file, this patch addresses that situation. So the following patch makes it possible for the StandaloneScrubber (sstablescrub) to recover sstables despite missing -Index.db files. This can happen from a catastrophic incident where data directories have been lost and/or corrupted, or wiped and the backup not healthy. I'm aware that normally one depends on replicas or snapshots to avoid such situations, but such catastrophic incidents do occur in the wild. I have not tested this patch against normal c* operations and all the other (more critical) ways SSTableReader is used. I'll happily do that and add the needed unit tests if people see merit in accepting the patch. Otherwise the patch can live with the issue, in case anyone else needs it. I've uploaded a cassandra distribution bundled with the patch as well to make life a little easier for anyone finding themselves in such a bad situation. Scrub (recover) sstables even when -Index.db is missing --- Key: CASSANDRA-9591 URL: https://issues.apache.org/jira/browse/CASSANDRA-9591 Project: Cassandra Issue Type: Improvement Reporter: mck Assignee: mck Labels: sstablescrub Fix For: 2.0.15 Attachments: 9591-2.0.txt Today SSTableReader needs at minimum 3 files to load an sstable: - -Data.db - -CompressionInfo.db - -Index.db But during the scrub process the -Index.db file isn't actually necessary, unless there's corruption in the -Data.db and we want to be able to skip over corrupted rows. Given that there is still a fair chance that there's nothing wrong with the -Data.db file and we're just missing the -Index.db file, this patch addresses that situation. So the following patch makes it possible for the StandaloneScrubber (sstablescrub) to recover sstables despite missing -Index.db files. 
This can happen from a catastrophic incident where data directories have been lost and/or corrupted, or wiped and the backup not healthy. I'm aware that normally one depends on replicas or snapshots to avoid such situations, but such catastrophic incidents do occur in the wild. I have not tested this patch against normal c* operations and all the other (more critical) ways SSTableReader is used. I'll happily do that and add the needed unit tests if people see merit in accepting the patch. Otherwise the patch can live with the issue, in case anyone else needs it. There's also a cassandra distribution bundled with the patch [here|https://github.com/michaelsembwever/cassandra/releases/download/2.0.15-recover-sstables-without-indexdb/apache-cassandra-2.0.15-recover-sstables-without-indexdb.tar.gz] to make life a little easier for anyone finding themselves in such a bad situation.
[jira] [Updated] (CASSANDRA-7918) Provide graphing tool along with cassandra-stress
[ https://issues.apache.org/jira/browse/CASSANDRA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-7918: Attachment: reads.svg Provide graphing tool along with cassandra-stress - Key: CASSANDRA-7918 URL: https://issues.apache.org/jira/browse/CASSANDRA-7918 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Benedict Assignee: Ryan McGuire Priority: Minor Attachments: 7918.patch, reads.svg Whilst cstar makes some pretty graphs, they're a little limited and also require you to run your tests through it. It would be useful to be able to graph results from any stress run easily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-7918) Provide graphing tool along with cassandra-stress
[ https://issues.apache.org/jira/browse/CASSANDRA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583829#comment-14583829 ] Benedict edited comment on CASSANDRA-7918 at 6/12/15 6:19 PM: -- FTR, gnuplot _does_ (apparently) work on Windows :) edit: to avoid hunting around inside CASSANDRA-7282, I've uploaded the read comparison graph to this ticket was (Author: benedict): FTR, gnuplot _does_ (apparently) work on Windows :) Provide graphing tool along with cassandra-stress - Key: CASSANDRA-7918 URL: https://issues.apache.org/jira/browse/CASSANDRA-7918 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Benedict Assignee: Ryan McGuire Priority: Minor Attachments: 7918.patch, reads.svg Whilst cstar makes some pretty graphs, they're a little limited and also require you to run your tests through it. It would be useful to be able to graph results from any stress run easily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-5322) Make dtest logging more granular
[ https://issues.apache.org/jira/browse/CASSANDRA-5322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Wang resolved CASSANDRA-5322. --- Resolution: Fixed Fix Version/s: (was: 3.x) 2.2.x 2.1.x Reviewer: Philip Thompson Reproduced In: 2.1.5 Modified ccmlib/cluster.py, ccmlib/common.py, and cassandra-dtest/dtest.py. I modified the dtest environment variables DEBUG and TRACE so that they could not only accept true/yes and false/no, but also names of C* classes (can add multiple by separating them with a colon). I did this using three functions: var_debug, var_trace, and modify_log. The first two change the log_level of the cluster for a specific class (if that is the case), and modify_log calls all the potential changes to the log levels all at once. In cluster.py, I modified the add and set_log_level functions, and also added two global arrays, _debug and _trace. The two global variables serve to keep track of what classes have the respective log levels. In the set_log_level function, we check if there is a class_name provided, and if there is, we make sure it's not already being called. We then append the class to the respective global array, and then change the log_level on the node level. In the add function, I added a feature so that whenever a node is added, it'll automatically take in the settings already set forth for class logging levels. Finally, in common.py, I modified the replaces_or_add_into_file_tail function. Before, all additional modifications would be written on the very last line of the file after the closing tag, which means it wasn't being read. This includes modifications to the log level. I changed it so that it would be added before the closing tag. 
Make dtest logging more granular - Key: CASSANDRA-5322 URL: https://issues.apache.org/jira/browse/CASSANDRA-5322 Project: Cassandra Issue Type: Test Reporter: Ryan McGuire Assignee: Steve Wang Fix For: 2.1.x, 2.2.x From Brandon: We need a way (might need to go in ccm, I haven't looked) to just set one class to DEBUG or TRACE, like we'd do in conf/log4j-server.properties but with an env var preferably, so I can control it via buildbot, since it's better at reproducing some issues than I am sometimes, but I don't want to run the full hammer debug all the time. Also, a way to set Tester.allow_log_errors to false via an env var, since sometimes there's an error there that takes a while to fix but is cosmetic, and in the meantime I want to catch new failures so we don't fall behind. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
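The env-var convention described in the resolution comment above (DEBUG and TRACE accept true/yes, false/no, or a colon-separated list of C* class names) can be sketched like this. parse_log_env is a hypothetical helper for illustration, not the actual cassandra-dtest code.

```python
# Sketch of parsing a DEBUG/TRACE env var per the convention above:
# true/yes enables the level globally; false/no (or unset) disables it;
# anything else is treated as colon-separated class names.
def parse_log_env(value):
    """Return (enabled_globally, [class_names]) for a DEBUG/TRACE value."""
    if value is None:
        return (False, [])
    lowered = value.strip().lower()
    if lowered in ("true", "yes"):
        return (True, [])
    if lowered in ("false", "no", ""):
        return (False, [])
    # Otherwise the value names specific classes, separated by colons.
    return (False, [c for c in value.split(":") if c])

assert parse_log_env("yes") == (True, [])
assert parse_log_env(
    "org.apache.cassandra.db.Memtable:org.apache.cassandra.gms.Gossiper"
) == (False, ["org.apache.cassandra.db.Memtable",
              "org.apache.cassandra.gms.Gossiper"])
```

A caller would then set the log level per returned class name on each node, which matches the per-class set_log_level behavior the comment describes.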
[jira] [Commented] (CASSANDRA-7918) Provide graphing tool along with cassandra-stress
[ https://issues.apache.org/jira/browse/CASSANDRA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583853#comment-14583853 ] Benedict commented on CASSANDRA-7918: - It's worth pointing out that the user doesn't have to ever touch gnuplot; it compiles scripts for gnuplot, and shells out itself. I don't have any specific attachment to it, though, and if we can get the same info via some other means I'm thrilled. My _ideal_ world would be one with graphs akin to those I produced with gnuplot, but in javascript, with interactive buttons _most especially_ for turning on/off certain aspects of the graph, so that they can more easily be viewed. For instance, adding/removing specific branches, or latency bands. I think stress should output all of the settings it receives if {{-log level=verbose}} is provided. However I'm not sure we want to tightly couple stress to the cassandra.yaml or the SHA. The approach I took was to parse a stress output, so if we standardise our performance tests to always run stress in verbose mode, the output file can become the canonical source of truth, and the graph generated on the fly. Perhaps we can SHA the output file, and store it in its entirety somewhere, inside a zip containing the cassandra.yaml, so that the graph can just contain this hash of the output file to route us to the permanent record? Provide graphing tool along with cassandra-stress - Key: CASSANDRA-7918 URL: https://issues.apache.org/jira/browse/CASSANDRA-7918 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Benedict Assignee: Ryan McGuire Priority: Minor Attachments: 7918.patch, reads.svg Whilst cstar makes some pretty graphs, they're a little limited and also require you to run your tests through it. It would be useful to be able to graph results from any stress run easily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
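The record-keeping idea Benedict floats above, hash the stress output file and store it with the cassandra.yaml in an archive so a graph need only carry the hash, can be sketched as below. The file names and function are invented for illustration; the ticket does not specify a concrete layout.

```python
# Sketch: hash a stress output file, bundle it with cassandra.yaml in a
# zip named after the hash, and return the hash for the graph to embed.
import hashlib
import os
import tempfile
import zipfile

def archive_stress_run(stress_output: bytes, yaml_text: bytes, out_dir: str) -> str:
    sha = hashlib.sha256(stress_output).hexdigest()
    path = os.path.join(out_dir, f"stress-{sha[:12]}.zip")
    with zipfile.ZipFile(path, "w") as z:
        z.writestr("stress-output.log", stress_output)
        z.writestr("cassandra.yaml", yaml_text)
    return sha

with tempfile.TemporaryDirectory() as d:
    sha = archive_stress_run(b"op rate: 1000/s\n", b"concurrent_reads: 32\n", d)
    assert len(sha) == 64  # hex SHA-256 digest
```

With the verbose stress output as the canonical source of truth, any graph rendered from it can cite the hash to route back to the permanent record.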
cassandra git commit: Add collections, tuple, and UDT to JSON type documentation
Repository: cassandra Updated Branches: refs/heads/cassandra-2.2 271c9e4ac - a5be8f199 Add collections, tuple, and UDT to JSON type documentation These were accidentally omitted when the docs were first written for CASSANDRA-7970. Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a5be8f19 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a5be8f19 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a5be8f19 Branch: refs/heads/cassandra-2.2 Commit: a5be8f199150c5f7a7ad9df3babad1bf950dd4b3 Parents: 271c9e4 Author: Tyler Hobbs tylerlho...@gmail.com Authored: Fri Jun 12 13:39:44 2015 -0500 Committer: Tyler Hobbs tylerlho...@gmail.com Committed: Fri Jun 12 13:39:44 2015 -0500 -- doc/cql3/CQL.textile | 5 + 1 file changed, 5 insertions(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a5be8f19/doc/cql3/CQL.textile -- diff --git a/doc/cql3/CQL.textile b/doc/cql3/CQL.textile index 3755a2d..69c6032 100644 --- a/doc/cql3/CQL.textile +++ b/doc/cql3/CQL.textile @@ -1972,10 +1972,15 @@ Where possible, Cassandra will represent and accept data types in their native @ |@float@|integer, float, string|float|String must be valid integer or float| |@inet@ |string|string |IPv4 or IPv6 address| |@int@ |integer, string |integer |String must be valid 32 bit integer| +|@list@ |list, string |list |Uses JSON's native list representation| +|@map@ |map, string |map |Uses JSON's native map representation| +|@set@ |list, string |list |Uses JSON's native list representation| |@text@ |string|string |Uses JSON's @\u@ character escape| |@time@ |string|string |Time of day in format @HH-MM-SS[.f]@| |@timestamp@|integer, string |string |A timestamp. Strings constant are allow to input timestamps as dates, see Working with dates:#usingdates below for more information. Datestamps with format @-MM-DD HH:MM:SS.SSS@ are returned.| |@timeuuid@ |string|string |Type 1 UUID. 
See Constants:#constants for the UUID format| +|@tuple@|list, string |list |Uses JSON's native list representation| +|@UDT@ |map, string |map |Uses JSON's native map representation with field names as keys| |@uuid@ |string|string |See Constants:#constants for the UUID format| |@varchar@ |string|string |Uses JSON's @\u@ character escape| |@varint@ |integer, string |integer |Variable length; may overflow 32 or 64 bit integers in client-side decoder|
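The rows added in the commit above can be illustrated with plain JSON: lists, sets, and tuples use JSON's native list representation, while maps and UDTs use JSON's native map representation (UDT field names as keys). The column names and values below are invented examples, not taken from the commit.

```python
# Example JSON payload for a hypothetical row, following the type
# mappings documented above (list/set/tuple -> JSON list, map/UDT ->
# JSON map with field names as keys).
import json

row = {
    "tags": ["cql", "json"],                  # list<text>   -> JSON list
    "emails": ["a@x.org", "b@x.org"],         # set<text>    -> JSON list too
    "scores": {"alice": 10, "bob": 7},        # map<text,int> -> JSON map
    "coords": [12.5, -3.25],                  # tuple<double,double> -> JSON list
    "address": {"street": "Main St", "zip": "78701"},  # UDT -> JSON map
}
payload = json.dumps(row, sort_keys=True)
assert json.loads(payload) == row  # round-trips cleanly
```

Note that because sets are rendered as JSON lists, a client cannot distinguish a set<text> column from a list<text> column by the JSON alone; the table schema supplies that distinction.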
[1/2] cassandra git commit: Add collections, tuple, and UDT to JSON type documentation
Repository: cassandra Updated Branches: refs/heads/trunk c2e54ddc3 - 40c3e8922 Add collections, tuple, and UDT to JSON type documentation These were accidentally omitted when the docs were first written for CASSANDRA-7970. Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a5be8f19 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a5be8f19 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a5be8f19 Branch: refs/heads/trunk Commit: a5be8f199150c5f7a7ad9df3babad1bf950dd4b3 Parents: 271c9e4 Author: Tyler Hobbs tylerlho...@gmail.com Authored: Fri Jun 12 13:39:44 2015 -0500 Committer: Tyler Hobbs tylerlho...@gmail.com Committed: Fri Jun 12 13:39:44 2015 -0500 -- doc/cql3/CQL.textile | 5 + 1 file changed, 5 insertions(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a5be8f19/doc/cql3/CQL.textile -- diff --git a/doc/cql3/CQL.textile b/doc/cql3/CQL.textile index 3755a2d..69c6032 100644 --- a/doc/cql3/CQL.textile +++ b/doc/cql3/CQL.textile @@ -1972,10 +1972,15 @@ Where possible, Cassandra will represent and accept data types in their native @ |@float@|integer, float, string|float|String must be valid integer or float| |@inet@ |string|string |IPv4 or IPv6 address| |@int@ |integer, string |integer |String must be valid 32 bit integer| +|@list@ |list, string |list |Uses JSON's native list representation| +|@map@ |map, string |map |Uses JSON's native map representation| +|@set@ |list, string |list |Uses JSON's native list representation| |@text@ |string|string |Uses JSON's @\u@ character escape| |@time@ |string|string |Time of day in format @HH-MM-SS[.f]@| |@timestamp@|integer, string |string |A timestamp. Strings constant are allow to input timestamps as dates, see Working with dates:#usingdates below for more information. Datestamps with format @-MM-DD HH:MM:SS.SSS@ are returned.| |@timeuuid@ |string|string |Type 1 UUID. 
See Constants:#constants for the UUID format| +|@tuple@|list, string |list |Uses JSON's native list representation| +|@UDT@ |map, string |map |Uses JSON's native map representation with field names as keys| |@uuid@ |string|string |See Constants:#constants for the UUID format| |@varchar@ |string|string |Uses JSON's @\u@ character escape| |@varint@ |integer, string |integer |Variable length; may overflow 32 or 64 bit integers in client-side decoder|
[2/2] cassandra git commit: Merge branch 'cassandra-2.2' into trunk
Merge branch 'cassandra-2.2' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/40c3e892 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/40c3e892 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/40c3e892 Branch: refs/heads/trunk Commit: 40c3e892291315dd1531159f5dc5f51e74bb1ac2 Parents: c2e54dd a5be8f1 Author: Tyler Hobbs tylerlho...@gmail.com Authored: Fri Jun 12 13:40:56 2015 -0500 Committer: Tyler Hobbs tylerlho...@gmail.com Committed: Fri Jun 12 13:40:56 2015 -0500 -- doc/cql3/CQL.textile | 5 + 1 file changed, 5 insertions(+) --
[jira] [Commented] (CASSANDRA-8460) Make it possible to move non-compacting sstables to slow/big storage in DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-8460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583809#comment-14583809 ] Jeff Jirsa commented on CASSANDRA-8460: --- {quote}yes, I've been thinking maybe adding priorities or tags to the data directories, but that is probably not needed now. Adding a flag to each data_directory that states whether it is for archival storage or not is probably enough for now.{quote} Asking for clarification to make sure I don't go too far into pony land: So my initial approach was to define a second config item, separate from {{data_file_directories}} entirely, so that no other code needed to be aware of it except for classes explicitly wanting to use `archive` tier storage ( {{dd.getAllDataFileLocations()}} would not return the archive tier, but rather add a {{dd.getArchiveDataFileLocations()}} specifically for the slow class of storage). It sounds from your description like you're envisioning changing the list of data_file_locations to a list of maps {noformat} [tag1:location1,tag1:location2,tag3:location3] {noformat} or {noformat} tag1:[location1,location2],tag3:[location3] {noformat} In this case, we'd also need to maintain backwards compatibility, which seems fairly straightforward to do (check to see if the provided {{data_files_directory}} is an old-format list rather than a map and apply some default tag?) The first approach is clean and isolated, unlikely to introduce surprises, but potentially limits us from being able to do more interesting work with tagged data file directories later (i.e., only store data for KS W in data directories tagged X, and KS Y in data directories tagged Z). Can you clarify which best fits your expectations? 
Make it possible to move non-compacting sstables to slow/big storage in DTCS Key: CASSANDRA-8460 URL: https://issues.apache.org/jira/browse/CASSANDRA-8460 Project: Cassandra Issue Type: Improvement Reporter: Marcus Eriksson Labels: dtcs It would be nice if we could configure DTCS to have a set of extra data directories where we move the sstables once they are older than max_sstable_age_days. This would enable users to have a quick, small SSD for hot, new data, and big spinning disks for data that is rarely read and never compacted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-8460) Make it possible to move non-compacting sstables to slow/big storage in DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-8460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583809#comment-14583809 ] Jeff Jirsa edited comment on CASSANDRA-8460 at 6/12/15 6:07 PM: {quote}yes, I've been thinking maybe adding priorities or tags to the data directories, but that is probably not needed now. Adding a flag to each data_directory that states whether it is for archival storage or not is probably enough for now.{quote} Asking for clarification to make sure I don't go too far into pony land: So my initial approach was to define a second config item, separate from {{data_file_directories}} entirely, so that no other code needed to be aware of it except for classes explicitly wanting to use `archive` tier storage ( {{dd.getAllDataFileLocations()}} would not return the archive tier, but rather add a {{dd.getArchiveDataFileLocations()}} specifically for the slow class of storage). It sounds from your description like you're envisioning changing the list of data_file_locations to a map {noformat} [tag1:location1,tag1:location2,tag3:location3] {noformat} or {noformat} tag1:[location1,location2],tag3:[location3] {noformat} In this case, we'd also need to maintain backwards compatibility, which seems fairly straightforward to do (check to see if the provided {{data_files_directory}} is an old-format list rather than a map and apply some default tag?) The first approach is clean and isolated, unlikely to introduce surprises, but potentially limits us from being able to do more interesting work with tagged data file directories later (i.e., only store data for KS W in data directories tagged X, and KS Y in data directories tagged Z). Can you clarify which best fits your expectations? was (Author: jjirsa): {quote}yes, I've been thinking maybe adding priorities or tags to the data directories, but that is probably not needed now. 
Adding a flag to each data_directory that states whether it is for archival storage or not is probably enough for now.{quote} Asking for clarification to make sure I don't go too far into pony land: So my initial approach was to define a second config item, separate from {{data_file_directories}} entirely, so that no other code needed to be aware of it except for classes explicitly wanting to use `archive` tier storage ( {{dd.getAllDataFileLocations()}} would not return the archive tier, but rather add a {{dd.getArchiveDataFileLocations()}} specifically for the slow class of storage). It sounds from your description like you're envisioning changing the list of data_file_locations to a list of maps {noformat} [tag1:location1,tag1:location2,tag3:location3] {noformat} or {noformat} tag1:[location1,location2],tag3:[location3] {noformat} In this case, we'd also need to maintain backwards compatibility, which seems fairly straightforward to do (check to see if the provided {{data_files_directory}} is an old-format list rather than a map and apply some default tag?) The first approach is clean and isolated, unlikely to introduce surprises, but potentially limits us from being able to do more interesting work with tagged data file directories later (i.e., only store data for KS W in data directories tagged X, and KS Y in data directories tagged Z). Can you clarify which best fits your expectations? Make it possible to move non-compacting sstables to slow/big storage in DTCS Key: CASSANDRA-8460 URL: https://issues.apache.org/jira/browse/CASSANDRA-8460 Project: Cassandra Issue Type: Improvement Reporter: Marcus Eriksson Labels: dtcs It would be nice if we could configure DTCS to have a set of extra data directories where we move the sstables once they are older than max_sstable_age_days. This would enable users to have a quick, small SSD for hot, new data, and big spinning disks for data that is rarely read and never compacted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7918) Provide graphing tool along with cassandra-stress
[ https://issues.apache.org/jira/browse/CASSANDRA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583829#comment-14583829 ] Benedict commented on CASSANDRA-7918: - FTR, gnuplot _does_ (apparently) work on Windows :) Provide graphing tool along with cassandra-stress - Key: CASSANDRA-7918 URL: https://issues.apache.org/jira/browse/CASSANDRA-7918 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Benedict Assignee: Ryan McGuire Priority: Minor Attachments: 7918.patch Whilst cstar makes some pretty graphs, they're a little limited and also require you to run your tests through it. It would be useful to be able to graph results from any stress run easily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7918) Provide graphing tool along with cassandra-stress
[ https://issues.apache.org/jira/browse/CASSANDRA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583840#comment-14583840 ] Joshua McKenzie commented on CASSANDRA-7918: Given our recent regressions and the upcoming effort for a performance testing harness, we need to move on this. Right now we have 1) Benedict's option that has more information but that's written using gnuplot which people feel strongly against and 2) Ryan's option that has less information available but is perhaps more immediately / intuitively digestible and doesn't use gnuplot. [~enigmacurry]: what are the chances you could integrate the throughput/latency/gc and tri-graphing approach benedict took into the existing cstar framework, giving us the best of both worlds? I wouldn't mind seeing the current format of the #'s from your solution below the graphs. One other thing - we need to scrape the cassandra.yaml file and dump out the relevant settings used for the test (or perhaps just all of them at the outset) as well as snapshotting the specific cassandra-stress command used to generate the test results for reproduction. A SHA for commit used on the test would also help, and I think that would give us a solid initial framework to start testing with and have reproducible tests. We can pursue future additions onto this later (capturing system info, /proc/cpuinfo, etc) but there's no point in holding it up to get it to be perfect for our 1st revision. Provide graphing tool along with cassandra-stress - Key: CASSANDRA-7918 URL: https://issues.apache.org/jira/browse/CASSANDRA-7918 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Benedict Assignee: Ryan McGuire Priority: Minor Attachments: 7918.patch, reads.svg Whilst cstar makes some pretty graphs, they're a little limited and also require you to run your tests through it. It would be useful to be able to graph results from any stress run easily. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)