[jira] [Commented] (CASSANDRA-8671) Give compaction strategy more control over where sstables are created, including for flushing and streaming.
[ https://issues.apache.org/jira/browse/CASSANDRA-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705268#comment-14705268 ] Blake Eggleston commented on CASSANDRA-8671: CQLSSTableWriter change looks good. setInitialDirectories is a hook to add additional directories to initialDirectories. I've reworked the method a bit to ensure any change results in a superset, and added comments https://github.com/bdeggleston/cassandra/tree/8671-3 https://github.com/bdeggleston/cassandra/tree/8671-3-squashed Give compaction strategy more control over where sstables are created, including for flushing and streaming. Key: CASSANDRA-8671 URL: https://issues.apache.org/jira/browse/CASSANDRA-8671 Project: Cassandra Issue Type: Improvement Reporter: Blake Eggleston Assignee: Blake Eggleston Fix For: 3.x Attachments: 0001-C8671-creating-sstable-writers-for-flush-and-stream-.patch, 8671-giving-compaction-strategies-more-control-over.txt This would enable routing different partitions to different disks based on some user-defined parameters. My initial take on how to do this would be to make an interface from SSTableWriter, and have a table's compaction strategy do all SSTableWriter instantiation. Compaction strategies could then implement their own SSTableWriter implementations (which basically wrap one or more normal SSTableWriters) for compaction, flushing, and streaming. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
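The direction the ticket describes, with the compaction strategy acting as the factory for writers so it can route partitions to different disks, could be sketched roughly like this. All names here (SSTableWriterFactory, WriterHandle, SplitDiskFactory, the directory paths) are hypothetical illustrations, not Cassandra's actual API:

```java
import java.io.File;

// Hypothetical factory hook: the strategy, not the table, picks the
// destination directory for every new writer, whether it is created for
// compaction, flushing, or streaming.
interface SSTableWriterFactory
{
    WriterHandle createWriter(String operation);
}

// Minimal stand-in for a real SSTableWriter, so the sketch is self-contained.
class WriterHandle
{
    final File directory;
    WriterHandle(File directory) { this.directory = directory; }
}

// Example strategy that routes flushes and compactions to different disks.
class SplitDiskFactory implements SSTableWriterFactory
{
    public WriterHandle createWriter(String operation)
    {
        File dir = operation.equals("flush") ? new File("/disk1/data")
                                             : new File("/disk2/data");
        return new WriterHandle(dir);
    }
}
```

A streaming or flush path would then ask the table's strategy for its writer instead of constructing one directly, which is what gives user-defined routing a place to hook in.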
[jira] [Commented] (CASSANDRA-10140) Enable GC logging by default
[ https://issues.apache.org/jira/browse/CASSANDRA-10140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705287#comment-14705287 ] Ariel Weisberg commented on CASSANDRA-10140: Does anyone remember if GC logging writes directly to disk and impacts time spent in safe points? I can't remember if that is an issue or not. I plumbed this code a couple of times, and came to a different conclusion every time. Enable GC logging by default Key: CASSANDRA-10140 URL: https://issues.apache.org/jira/browse/CASSANDRA-10140 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Chris Lohfink Assignee: Chris Lohfink Priority: Minor Attachments: CASSANDRA-10140.txt Overhead for the gc logging is very small (with cycling logs in 7+) and it provides a ton of useful information. This will open up more for C* diagnostic tools to provide feedback as well without requiring restarts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
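For context, these are the stock HotSpot GC logging options of the kind the ticket proposes enabling; the rotation flags ("cycling logs") are available from 7u2 onward. The path, file count, and sizes below are only examples, not the values proposed in the attached patch:

```
-Xloggc:/var/log/cassandra/gc.log
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintGCApplicationStoppedTime
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=10
-XX:GCLogFileSize=10M
```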
[jira] [Commented] (CASSANDRA-9459) SecondaryIndex API redesign
[ https://issues.apache.org/jira/browse/CASSANDRA-9459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705288#comment-14705288 ] Sylvain Lebresne commented on CASSANDRA-9459: - I haven't finished reviewing all of the patch, but I'm going to give a first batch of remarks/suggestions/review points. I'll note that while there are quite a few, they are mostly relatively small stuff: the bulk of the patch looks pretty good, so good job [~beobal]. * In {{SecondaryIndexManager}}: ** {{flushIndexesBlocking}} holds a lock on {{baseCfs.getTracker()}} for the whole duration of the flush. I don't think that's what we want. I think what we want is to hold the lock while we _submit_ the flush, but not for the whole time of the flushing. ** {{getBestIndexFor}} seems to now favor indexes that handle more of the expressions. I'm not convinced by that heuristic. It's totally possible for an index to handle fewer expressions but be a lot more selective. So I think we should stick to only considering {{estimateResultRows}} (as we're currently doing, unless I'm missing something). If anything, I'd rather not do that kind of change in this refactoring ticket. ** In {{WriteTimeTransaction.onUpdated}}, I don't think we should ignore the {{onPrimaryKeyLivenessInfo}} call: we could be updating a TTL on only the clustering columns and that should be carried over to the index. ** {{IndexGCTransaction.onRowMerge}} seems to do more work than it should. I believe all we want to do during compaction is remove cells that have been shadowed by some deletion (since we don't handle those at write time). But the code seems to also add any update (I'm saying imo the condition should be {{if (original != null && merged == null)}}). ** In {{indexPartition}}, the static case should be inside the {{try()}}: no reason to filter normal rows but not the static one. ** Why do we need IndexAccessor, since that's created from a {{ReadCommand}} in the first place? 
Can't we just return an {{Index}}, and have the rest of the methods of IndexAccessor be methods of {{Index}} taking a {{ReadCommand}} (which they mostly already are anyway)? (This would make the {{ReadCommand.getIndex()}} method actually return an {{Index}}, which is a little more consistent.) ** We should probably add an {{if (!hasIndexes())}} test on top of {{newUpdateTransaction}}: that's a very common case on a very hot path, and currently, even with no index, I think we'll still do a bunch of work (including allocating an empty array). ** {{CleanupTransaction}} should be split in two, since the Cleanup and Compaction uses of it don't overlap in what they use, and that's a bit confusing. I'd create 3 interfaces: {{UpdateIndexTransaction}}, {{CleanupIndexTransaction}} and {{CompactionIndexTransaction}}. I'd also make those top-level interfaces to avoid the long {{SecondaryIndexManager}} prefix everywhere (the concrete implementations can stay where they are). We could also have an {{IndexTransaction}} (that they all extend, with just start() and commit()) to put the {{TransactionType}} inside (just because {{IndexTransaction.Type}} looks better than {{SecondaryIndexManager.TransactionType}} :)). ** Is there a reason for using the whole {{IndexMetadata}} as the map key in the {{indexes}} map? It feels like using the index name should be enough (since we guarantee it's unique and fixed), would make lookups a tad faster since there is less to hash/compare, and might avoid building fake {{IndexMetadata}} just for lookup. Certainly feels cleaner to me in principle. ** I'd rename {{getAllIndexStorageTables}} to {{getAllIndexColumnFamilyStore}}: not sure it's worth adding the new verbiage StorageTables (note that I hope we'll soon rename {{ColumnFamilyStore}} to {{TableStore}} and rename that method accordingly, but it's better to rename consistently for now and deal with that later imo). 
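The three-interface split suggested in the review could look roughly like this. This is only a sketch of the reviewer's proposal: the interface names come from the comment, but the callback methods and their `Object` parameter types are invented placeholders, not the real signatures:

```java
// Common lifecycle shared by all index transactions; nesting the Type enum
// here gives the shorter IndexTransaction.Type spelling the review asks for.
interface IndexTransaction
{
    enum Type { UPDATE, CLEANUP, COMPACTION }

    void start();
    void commit();
}

// Write-path transaction: sees the existing and updated versions of a row.
interface UpdateIndexTransaction extends IndexTransaction
{
    void onUpdated(Object existing, Object updated); // row types simplified
}

// Cleanup-path transaction: rows leaving the node are removed wholesale.
interface CleanupIndexTransaction extends IndexTransaction
{
    void onPartitionDeletion(Object deletionTime); // simplified
}

// Compaction-path transaction: only purges cells shadowed by deletions.
interface CompactionIndexTransaction extends IndexTransaction
{
    void onRowMerge(Object merged, Object... versions); // simplified
}
```

Splitting by call site keeps each implementation honest about which callbacks it can actually receive, which is the confusion the shared {{CleanupTransaction}} created.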
* In {{Index}}: ** The initialization of an {{Index}} bothers me a bit: the fact that there are basically 3 calls ({{init()}}, {{setIndexMetadata()}} and then {{register()}}) makes it hard to understand what initialization actually does. It also means nothing can be final in the implementations, even if it kind of should be (at least for {{baseCfs}} in {{CassandraIndex}}). I haven't tested it so I might be missing some detail, but what I would suggest is to pass the base table CFS and the initial {{IndexMetadata}} to the ctor (so for custom indexes, we'd specify they should have a ctor expecting those) and we'd then just have an {{initializationTask()}} that returns what needs to be done initially. As for registration, the index can do it directly in the ctor by calling {{baseCfs.indexManager.register()}} (we can also specify they have to do it). Now, it's true that {{setIndexMetadata}} is also called during CFS reload, but that leads me to another point: it's a bit misleading imo that the index can't
[jira] [Commented] (CASSANDRA-10114) Allow count(*) and count(1) to be used as normal aggregation
[ https://issues.apache.org/jira/browse/CASSANDRA-10114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704611#comment-14704611 ] Stefania commented on CASSANDRA-10114: -- LGTM, but I am not really familiar with this part of the code. One nit: - imports in AbstractFunctionSelector are out of order. Tests look good; a couple of failing compaction utests in 3.0, but they pass locally and are unrelated. _counter_tests.TestCounters.upgrade_test_ in 3.0 dtests is also unrelated (CASSANDRA-10109). Allow count(*) and count(1) to be used as normal aggregation --- Key: CASSANDRA-10114 URL: https://issues.apache.org/jira/browse/CASSANDRA-10114 Project: Cassandra Issue Type: Improvement Reporter: Benjamin Lerer Assignee: Benjamin Lerer Priority: Minor Fix For: 2.2.x Attachments: 10114-2.2.txt For the following query: {code} SELECT count(*), max(timestamp), min(timestamp) FROM myData WHERE id = ? {code} Cassandra will throw an {{InvalidSyntaxException}}. We should allow count(*) and count(1) to be queried with other aggregations or columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10127) Make naming for secondary indexes consistent
[ https://issues.apache.org/jira/browse/CASSANDRA-10127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-10127: - Assignee: Benjamin Lerer (was: Blake Eggleston) Make naming for secondary indexes consistent Key: CASSANDRA-10127 URL: https://issues.apache.org/jira/browse/CASSANDRA-10127 Project: Cassandra Issue Type: Bug Reporter: Sam Tunnicliffe Assignee: Benjamin Lerer Fix For: 3.0 beta 2 We have a longstanding mismatch between the name of an index as defined in schema and what gets returned from {{SecondaryIndex#getIndexName()}}, which for the builtin index impls is the name of the underlying index CFS, of the form {{base_table_name.index_name}}. This mismatch causes a number of UI inconsistencies: {code}nodetool rebuild_index ks tbl idx{code} {{idx}} must be qualified, i.e. include the redundant table name, as without it the rebuild silently fails. {{system.IndexInfo}} (which is also exposed over JMX) uses the form {{tbl.idx}}. {code}cqlsh describe index [ks.]idx{code} Here, qualifying {{idx}} with the base table name is an error. Generally, anything CQL related uses the index name directly, whereas anything concerned with building or rebuilding requires the version based on the underlying backing table name. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-9623) Added column does not sort as the last column
[ https://issues.apache.org/jira/browse/CASSANDRA-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson resolved CASSANDRA-9623. Resolution: Duplicate Looks like the exceptions are not during cleanup; they are happening with regular compaction. We have fixed a few issues regarding overlap with LCS since 2.0.13, so I would recommend upgrading to the latest 2.0 to make sure this is actually a new bug. If it keeps happening, please reopen this ticket. Added column does not sort as the last column - Key: CASSANDRA-9623 URL: https://issues.apache.org/jira/browse/CASSANDRA-9623 Project: Cassandra Issue Type: Bug Reporter: Marcin Pietraszek Assignee: Marcus Eriksson Fix For: 2.0.x Attachments: cassandra_log.txt After adding new machines to an existing cluster and running cleanup, one of the tables ends with: {noformat}
ERROR [CompactionExecutor:1015] 2015-06-19 11:24:05,038 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:1015,1,main]
java.lang.AssertionError: Added column does not sort as the last column
    at org.apache.cassandra.db.ArrayBackedSortedColumns.addColumn(ArrayBackedSortedColumns.java:116)
    at org.apache.cassandra.db.ColumnFamily.addColumn(ColumnFamily.java:121)
    at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:155)
    at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:186)
    at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:98)
    at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:85)
    at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:196)
    at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:74)
    at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:55)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:115)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:98)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
    at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:161)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
    at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
    at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745) {noformat} We're using patched 2.0.13-190ef4f -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9623) Added column does not sort as the last column
[ https://issues.apache.org/jira/browse/CASSANDRA-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-9623: --- Fix Version/s: (was: 2.0.x) Added column does not sort as the last column - Key: CASSANDRA-9623 URL: https://issues.apache.org/jira/browse/CASSANDRA-9623 Project: Cassandra Issue Type: Bug Reporter: Marcin Pietraszek Assignee: Marcus Eriksson Attachments: cassandra_log.txt After adding new machines to an existing cluster and running cleanup, one of the tables ends with: {noformat}
ERROR [CompactionExecutor:1015] 2015-06-19 11:24:05,038 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:1015,1,main]
java.lang.AssertionError: Added column does not sort as the last column
    at org.apache.cassandra.db.ArrayBackedSortedColumns.addColumn(ArrayBackedSortedColumns.java:116)
    at org.apache.cassandra.db.ColumnFamily.addColumn(ColumnFamily.java:121)
    at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:155)
    at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:186)
    at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:98)
    at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:85)
    at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:196)
    at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:74)
    at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:55)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:115)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:98)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
    at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:161)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
    at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
    at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745) {noformat} We're using patched 2.0.13-190ef4f -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6717) Modernize schema tables
[ https://issues.apache.org/jira/browse/CASSANDRA-6717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704509#comment-14704509 ] Robert Stupp commented on CASSANDRA-6717: - Ninja-fix for the UFPureScriptTest caused by the driver update is here: https://gist.github.com/snazy/4c05fbd32307da68e1f6 Modernize schema tables --- Key: CASSANDRA-6717 URL: https://issues.apache.org/jira/browse/CASSANDRA-6717 Project: Cassandra Issue Type: Sub-task Reporter: Sylvain Lebresne Assignee: Aleksey Yeschenko Labels: client-impacting, doc-impacting Fix For: 3.0 beta 2 There are a few problems/improvements that can be done with the way we store schema: # CASSANDRA-4988: as explained on the ticket, storing the comparator is now redundant (or almost: we'd also need to store whether the table is COMPACT or not, which we don't currently, but that is easy and probably a good idea anyway); it can be entirely reconstructed from the info in schema_columns (the same is true of key_validator and subcomparator, and replacing default_validator by a COMPACT_VALUE column in all cases is relatively simple). And storing the comparator as an opaque string broke concurrent updates of sub-parts of said comparator (concurrent collection addition, or altering 2 separate clustering columns, typically), so it's really worth removing it. # CASSANDRA-4603: it's time to get rid of those ugly json maps. I'll note that schema_keyspaces is a problem due to its use of COMPACT STORAGE, but I think we should fix it once and for all nonetheless (see below). # For CASSANDRA-6382, and to allow indexing both map keys and values at the same time, we'd need to be able to have more than one index definition for a given column. # There are a few mismatches in table options between the ones stored in the schema and the ones used when declaring/altering a table which would be nice to fix. 
The compaction, compression and replication maps are ones already mentioned in CASSANDRA-4603, but also, for some reason, 'dclocal_read_repair_chance' in CQL is called just 'local_read_repair_chance' in the schema table, and 'min/max_compaction_threshold' are column family options in the schema but just compaction options in CQL (which makes more sense). None of those issues are major, and we could probably deal with them independently, but it might be simpler to just fix them all in one shot, so I wanted to sum them all up here. In particular, the fact that 'schema_keyspaces' uses COMPACT STORAGE is annoying (for the replication map, but it may limit future stuff too), which suggests we should migrate it to a new, non-COMPACT table. And while that's arguably a detail, it wouldn't hurt to rename schema_columnfamilies to schema_tables for the years to come, since that's the preferred vernacular for CQL. Overall, what I would suggest is to move all schema tables to a new keyspace, named 'schema' for instance (or 'system_schema', but I prefer the shorter version), and fix all the issues above at once. Since we currently don't exchange schema between nodes of different versions, all we'd need for that is a one-shot startup migration, and overall, I think it could be simpler for clients to deal with one clear migration than to have to handle minor individual changes all over the place. I also think it's somewhat cleaner conceptually to have schema tables in their own keyspace since they are replicated through a different mechanism than other system tables. 
If we do that, we could, for instance, migrate to the following schema tables (details up for discussion of course): {noformat}
CREATE TYPE user_type (
    name text,
    column_names list<text>,
    column_types list<text>
)

CREATE TABLE keyspaces (
    name text PRIMARY KEY,
    durable_writes boolean,
    replication map<string, string>,
    user_types map<string, user_type>
)

CREATE TYPE trigger_definition (
    name text,
    options map<text, text>
)

CREATE TABLE tables (
    keyspace text,
    name text,
    id uuid,
    table_type text, // COMPACT, CQL or SUPER
    dropped_columns map<text, bigint>,
    triggers map<text, trigger_definition>,
    // options
    comment text,
    compaction map<text, text>,
    compression map<text, text>,
    read_repair_chance double,
    dclocal_read_repair_chance double,
    gc_grace_seconds int,
    caching text,
    rows_per_partition_to_cache text,
    default_time_to_live int,
    min_index_interval int,
    max_index_interval int,
    speculative_retry text,
    populate_io_cache_on_flush boolean,
    bloom_filter_fp_chance double,
    memtable_flush_period_in_ms int,
    PRIMARY KEY (keyspace, name)
)

CREATE TYPE index_definition (
    name text,
    index_type text,
    options map<text, text>
)

CREATE TABLE columns (
    keyspace text,
    table text,
[jira] [Assigned] (CASSANDRA-4386) Allow cql to use the IN syntax on secondary index values
[ https://issues.apache.org/jira/browse/CASSANDRA-4386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer reassigned CASSANDRA-4386: - Assignee: Benjamin Lerer Allow cql to use the IN syntax on secondary index values Key: CASSANDRA-4386 URL: https://issues.apache.org/jira/browse/CASSANDRA-4386 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jeremy Hanna Assignee: Benjamin Lerer Priority: Minor Labels: cql Currently CQL has a syntax for using IN to get a set of rows with a set of keys. This would also be very helpful for use with columns with secondary indexes on them. Such as: {code} select * from users where first_name in ('françois','frank'); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-9836) Allow comments on CQL table columns
[ https://issues.apache.org/jira/browse/CASSANDRA-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer reassigned CASSANDRA-9836: - Assignee: Aleksey Yeschenko Allow comments on CQL table columns --- Key: CASSANDRA-9836 URL: https://issues.apache.org/jira/browse/CASSANDRA-9836 Project: Cassandra Issue Type: Improvement Reporter: Robert Stupp Assignee: Aleksey Yeschenko Priority: Minor Fix For: 3.x We have a _comment_ option for tables. Having such a comment on individual columns (as many other databases have) would be nice especially when executing a {{DESCRIBE TABLE foo}}. Further, we could add comments in the same way for UDTs and UDT fields. Also for UDFs, UDAs and MVs (maybe not on the MV columns individually). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6717) Modernize schema tables
[ https://issues.apache.org/jira/browse/CASSANDRA-6717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704593#comment-14704593 ] Aleksey Yeschenko commented on CASSANDRA-6717: -- [~snazy] go ahead Modernize schema tables --- Key: CASSANDRA-6717 URL: https://issues.apache.org/jira/browse/CASSANDRA-6717 Project: Cassandra Issue Type: Sub-task Reporter: Sylvain Lebresne Assignee: Aleksey Yeschenko Labels: client-impacting, doc-impacting Fix For: 3.0 beta 2 There are a few problems/improvements that can be done with the way we store schema: # CASSANDRA-4988: as explained on the ticket, storing the comparator is now redundant (or almost: we'd also need to store whether the table is COMPACT or not, which we don't currently, but that is easy and probably a good idea anyway); it can be entirely reconstructed from the info in schema_columns (the same is true of key_validator and subcomparator, and replacing default_validator by a COMPACT_VALUE column in all cases is relatively simple). And storing the comparator as an opaque string broke concurrent updates of sub-parts of said comparator (concurrent collection addition, or altering 2 separate clustering columns, typically), so it's really worth removing it. # CASSANDRA-4603: it's time to get rid of those ugly json maps. I'll note that schema_keyspaces is a problem due to its use of COMPACT STORAGE, but I think we should fix it once and for all nonetheless (see below). # For CASSANDRA-6382, and to allow indexing both map keys and values at the same time, we'd need to be able to have more than one index definition for a given column. # There are a few mismatches in table options between the ones stored in the schema and the ones used when declaring/altering a table which would be nice to fix. 
The compaction, compression and replication maps are ones already mentioned in CASSANDRA-4603, but also, for some reason, 'dclocal_read_repair_chance' in CQL is called just 'local_read_repair_chance' in the schema table, and 'min/max_compaction_threshold' are column family options in the schema but just compaction options in CQL (which makes more sense). None of those issues are major, and we could probably deal with them independently, but it might be simpler to just fix them all in one shot, so I wanted to sum them all up here. In particular, the fact that 'schema_keyspaces' uses COMPACT STORAGE is annoying (for the replication map, but it may limit future stuff too), which suggests we should migrate it to a new, non-COMPACT table. And while that's arguably a detail, it wouldn't hurt to rename schema_columnfamilies to schema_tables for the years to come, since that's the preferred vernacular for CQL. Overall, what I would suggest is to move all schema tables to a new keyspace, named 'schema' for instance (or 'system_schema', but I prefer the shorter version), and fix all the issues above at once. Since we currently don't exchange schema between nodes of different versions, all we'd need for that is a one-shot startup migration, and overall, I think it could be simpler for clients to deal with one clear migration than to have to handle minor individual changes all over the place. I also think it's somewhat cleaner conceptually to have schema tables in their own keyspace since they are replicated through a different mechanism than other system tables. 
If we do that, we could, for instance, migrate to the following schema tables (details up for discussion of course): {noformat}
CREATE TYPE user_type (
    name text,
    column_names list<text>,
    column_types list<text>
)

CREATE TABLE keyspaces (
    name text PRIMARY KEY,
    durable_writes boolean,
    replication map<string, string>,
    user_types map<string, user_type>
)

CREATE TYPE trigger_definition (
    name text,
    options map<text, text>
)

CREATE TABLE tables (
    keyspace text,
    name text,
    id uuid,
    table_type text, // COMPACT, CQL or SUPER
    dropped_columns map<text, bigint>,
    triggers map<text, trigger_definition>,
    // options
    comment text,
    compaction map<text, text>,
    compression map<text, text>,
    read_repair_chance double,
    dclocal_read_repair_chance double,
    gc_grace_seconds int,
    caching text,
    rows_per_partition_to_cache text,
    default_time_to_live int,
    min_index_interval int,
    max_index_interval int,
    speculative_retry text,
    populate_io_cache_on_flush boolean,
    bloom_filter_fp_chance double,
    memtable_flush_period_in_ms int,
    PRIMARY KEY (keyspace, name)
)

CREATE TYPE index_definition (
    name text,
    index_type text,
    options map<text, text>
)

CREATE TABLE columns (
    keyspace text,
    table text,
    name text,
    kind text, // PARTITION_KEY, CLUSTERING_COLUMN, REGULAR or COMPACT_VALUE
[jira] [Commented] (CASSANDRA-8630) Faster sequential IO (on compaction, streaming, etc)
[ https://issues.apache.org/jira/browse/CASSANDRA-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704551#comment-14704551 ] Stefania commented on CASSANDRA-8630: - I've rebased to 3.0 and renamed the branch to [8630-3.0|https://github.com/stef1927/cassandra/tree/8630-3.0]. I had conflicts with CASSANDRA-6230, since it introduced {{ChecksummedDataInput}}, a new specialization of {{AbstractDataInput}}. I integrated it somehow, but I feel there is some duplication between this class and {{ChecksummedRandomAccessReader}}. However, the two aren't identical either, so for now they are separate. I made good progress with moving the segments to the builders and replacing the map with an array, but this is still not complete. If you want to take a quick look, the class doing the bulk of the work is called [MmappedRegions|https://github.com/stef1927/cassandra/commit/d1418ab889f60812cc866f12bf94b2360b3bb2d3#diff-88342f36d0687d3a0559fede5d158d83R33]. Your feedback is welcome, but it is far from complete; specifically, I still need to make it into a ref-counted object, since the builders don't necessarily survive the files, and address thread safety issues. Also, I merely extend segments at the moment; I make no effort to compact them. I did not go for the two-arrays approach since the code is more readable with a single array of a well-defined class, but if you really think this makes a big difference I can change it. The second batch of code review comments is also still WIP. I will post more details once it is complete. 
Faster sequential IO (on compaction, streaming, etc) Key: CASSANDRA-8630 URL: https://issues.apache.org/jira/browse/CASSANDRA-8630 Project: Cassandra Issue Type: Improvement Components: Core, Tools Reporter: Oleg Anastasyev Assignee: Stefania Labels: compaction, performance Fix For: 3.x Attachments: 8630-FasterSequencialReadsAndWrites.txt, cpu_load.png, flight_recorder_001_files.tar.gz, flight_recorder_002_files.tar.gz, mmaped_uncomp_hotspot.png When a node is doing a lot of sequential IO (streaming, compacting, etc.) a lot of CPU is lost in calls to RAF's int read() and DataOutputStream's write(int). This is because the default implementations of readShort, readLong, etc., as well as their matching write* methods, are implemented with numerous byte-by-byte read and write calls. This makes a lot of syscalls as well. A quick microbench shows that just reimplementing these methods either way gives an 8x speed increase. The attached patch implements the RandomAccessReader.read<Type> and SequentialWriter.write<Type> methods in a more efficient way. I also eliminated some extra byte copies in CompositeType.split and ColumnNameHelper.maxComponents, which were on my profiler's hotspot method list during tests. Stress tests on my laptop show that this patch makes compaction 25-30% faster on uncompressed sstables and 15% faster for compressed ones. A deployment to production shows much less CPU load for compaction. (I attached a CPU load graph from one of our production clusters; orange is niced CPU load, i.e. compaction; yellow is user, i.e. tasks not related to compaction.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
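The byte-by-byte cost described above comes from the fact that java.io.RandomAccessFile builds its multi-byte primitives (readInt, and readLong on top of it) out of single-byte read() calls. A standalone sketch of the bulk-assembly alternative (this is an illustration of the technique, not the attached patch):

```java
// Assemble a big-endian long from 8 bytes already sitting in a buffer:
// one bulk read fills the buffer, then the value is built with plain
// shifts instead of eight separate read() calls through the stream.
final class BulkRead
{
    static long readLong(byte[] buf, int off)
    {
        long v = 0;
        for (int i = 0; i < 8; i++)
            v = (v << 8) | (buf[off + i] & 0xFFL);
        return v;
    }
}
```

Because the reader already maintains an internal buffer for sequential access, the per-primitive cost drops to a few arithmetic operations, which is where the reported 8x microbenchmark gain comes from.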
[jira] [Commented] (CASSANDRA-10039) Make UDF script sandbox more robust against Nashorn internal changes
[ https://issues.apache.org/jira/browse/CASSANDRA-10039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704599#comment-14704599 ] Robert Stupp commented on CASSANDRA-10039: -- Notes from offline discussion: * Reason why we need that amount of allowed stuff for javascript value serialization: even if serialization is decoupled, scripts require access to UDTValue and TupleValue and DataType, Codec and so on. * Play with Nashorn's new {{ClassFilter}} thingy - it might be the solution. * Try to decouple serialization and deserialization from (sandboxed) script execution. * Require _create untrusted_ for languages other than javascript. Make UDF script sandbox more robust against Nashorn internal changes Key: CASSANDRA-10039 URL: https://issues.apache.org/jira/browse/CASSANDRA-10039 Project: Cassandra Issue Type: Improvement Reporter: Robert Stupp Assignee: Robert Stupp Fix For: 3.x {{UFPureScriptTest}} doesn't work against Java 1.8.0_25 but does with recent versions (1.8.0_51, for example). Need to find a way to make this more robust against future Nashorn changes. /cc [~aweisberg] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9749) CommitLogReplayer continues startup after encountering errors
[ https://issues.apache.org/jira/browse/CASSANDRA-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704365#comment-14704365 ] Branimir Lambov commented on CASSANDRA-9749: Test fix for 2.2 pushed [here|https://github.com/blambov/cassandra/tree/9749-2.2-testfix]. [testall|http://cassci.datastax.com/job/blambov-9749-2.2-testfix-testall/] [dtest|http://cassci.datastax.com/job/blambov-9749-2.2-testfix-dtest/] CommitLogReplayer continues startup after encountering errors - Key: CASSANDRA-9749 URL: https://issues.apache.org/jira/browse/CASSANDRA-9749 Project: Cassandra Issue Type: Bug Reporter: Blake Eggleston Assignee: Branimir Lambov Fix For: 2.2.1, 3.0 beta 1 Attachments: 9749-coverage.tgz There are a few places where the commit log recovery method either skips sections or just returns when it encounters errors. Specifically if it can't read the header here: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L298 Or if there are compressor problems here: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L314 and here: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L366 Whether these are user-fixable or not, I think we should require more direct user intervention (ie: fix what's wrong, or remove the bad file and restart) since we're basically losing data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-7461) operator functionality in CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-7461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer reassigned CASSANDRA-7461: - Assignee: Benjamin Lerer operator functionality in CQL - Key: CASSANDRA-7461 URL: https://issues.apache.org/jira/browse/CASSANDRA-7461 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Robert Stupp Assignee: Benjamin Lerer Labels: cql Intention: Allow operators in CQL. Operators could be decimal arithmetic {{+ - * /}}, boolean arithmetic {{| !}}, or string 'arithmetic' {{+}}: {{SELECT tab.label + ' = ' + tab.value FROM foo.tab}} {{SELECT * FROM tab WHERE tab.label + ' = ' + tab.value = 'foo = bar'}} as well as {{CREATE INDEX idx ON tab ( tab.label + '=' + tab.value )}} or {{CREATE INDEX idx ON tab (label) WHERE contains(tab.label, 'very-important-key')}} Operators could be mapped to UDFs like this: {{+}} mapped to UDF {{cstarstd::oper_plus(...)}}, {{-}} mapped to UDF {{cstarstd::oper_minus(...)}}, or handled directly via {{Cql.g}} in 'special' code -- This message was sent by Atlassian JIRA (v6.3.4#6332)
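The operator-to-UDF mapping sketched in the ticket can be illustrated with a small lookup table. This is purely a sketch of the idea; only {{oper_plus}} and {{oper_minus}} appear in the ticket, and the {{oper_mult}}/{{oper_div}} names are extrapolated from that naming scheme.

```java
import java.util.Map;

public class OperatorMapping
{
    // Hypothetical operator-token -> UDF-name table; in a real implementation
    // this resolution would happen in the CQL grammar (Cql.g) or the function
    // resolver, not in a static map.
    static final Map<String, String> OPERATOR_UDFS = Map.of(
        "+", "cstarstd::oper_plus",
        "-", "cstarstd::oper_minus",
        "*", "cstarstd::oper_mult",   // extrapolated name
        "/", "cstarstd::oper_div");   // extrapolated name

    static String resolve(String op)
    {
        String fn = OPERATOR_UDFS.get(op);
        if (fn == null)
            throw new IllegalArgumentException("no UDF mapping for operator " + op);
        return fn;
    }

    public static void main(String[] args)
    {
        System.out.println(resolve("+")); // cstarstd::oper_plus
    }
}
```

Routing through named UDFs keeps the grammar small; the alternative the ticket mentions is handling each operator as a special case directly in the parser.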
[jira] [Assigned] (CASSANDRA-5505) Please add support for basic arithmetic operations in CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer reassigned CASSANDRA-5505: - Assignee: Benjamin Lerer Please add support for basic arithmetic operations in CQL - Key: CASSANDRA-5505 URL: https://issues.apache.org/jira/browse/CASSANDRA-5505 Project: Cassandra Issue Type: Wish Reporter: Arthur Zubarev Assignee: Benjamin Lerer Labels: cql Please add support for basic arithmetic operations in CQL, such as -, +, /, *. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[2/2] cassandra git commit: Merge branch 'cassandra-3.0' into trunk
Merge branch 'cassandra-3.0' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b10c00b2 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b10c00b2 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b10c00b2 Branch: refs/heads/trunk Commit: b10c00b283f9b917dc148e92db845d1d6beb7ab6 Parents: 110e803 c997c08 Author: Robert Stupp sn...@snazy.de Authored: Thu Aug 20 11:20:53 2015 +0200 Committer: Robert Stupp sn...@snazy.de Committed: Thu Aug 20 11:20:53 2015 +0200 -- .../org/apache/cassandra/cql3/functions/ScriptBasedUDFunction.java | 2 ++ 1 file changed, 2 insertions(+) --
[jira] [Commented] (CASSANDRA-10109) Windows dtest 3.0: ttl_test.py failures
[ https://issues.apache.org/jira/browse/CASSANDRA-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704603#comment-14704603 ] Stefania commented on CASSANDRA-10109: -- I think there is also another issue here: http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-10114-3.0-dtest/lastCompletedBuild/testReport/counter_tests/TestCounters/upgrade_test/ Windows dtest 3.0: ttl_test.py failures --- Key: CASSANDRA-10109 URL: https://issues.apache.org/jira/browse/CASSANDRA-10109 Project: Cassandra Issue Type: Sub-task Reporter: Joshua McKenzie Assignee: Stefania Labels: Windows Fix For: 3.0.x ttl_test.py:TestTTL.update_column_ttl_with_default_ttl_test2 ttl_test.py:TestTTL.update_multiple_columns_ttl_test ttl_test.py:TestTTL.update_single_column_ttl_test Errors locally are different than CI from yesterday. Yesterday on CI we have timeouts and general node hangs. Today on all 3 tests when run locally I see: {noformat} Traceback (most recent call last): File c:\src\cassandra-dtest\dtest.py, line 532, in tearDown raise AssertionError('Unexpected error in %s node log: %s' % (node.name, errors)) AssertionError: Unexpected error in node1 node log: ['ERROR [main] 2015-08-17 16:53:43,120 NoSpamLogger.java:97 - This platform does not support atomic directory streams (SecureDirectoryStream); race conditions when loading sstable files could occurr'] {noformat} This traces back to the commit for CASSANDRA-7066 today by [~Stefania] and [~benedict]. Stefania - care to take this ticket and also look further into whether or not we're going to have issues with 7066 on Windows? That error message certainly *sounds* like it's not a good thing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6717) Modernize schema tables
[ https://issues.apache.org/jira/browse/CASSANDRA-6717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704602#comment-14704602 ] Robert Stupp commented on CASSANDRA-6717: - fix committed as c997c08c47b049f4278a08b2d0ad9329fdf4a455 Modernize schema tables --- Key: CASSANDRA-6717 URL: https://issues.apache.org/jira/browse/CASSANDRA-6717 Project: Cassandra Issue Type: Sub-task Reporter: Sylvain Lebresne Assignee: Aleksey Yeschenko Labels: client-impacting, doc-impacting Fix For: 3.0 beta 2 There are a few problems/improvements that can be addressed in the way we store schema: # CASSANDRA-4988: as explained on the ticket, storing the comparator is now redundant (or almost: we'd also need to store whether the table is COMPACT or not, which we don't currently, but that is easy and probably a good idea anyway); it can be entirely reconstructed from the info in schema_columns (the same is true of key_validator and subcomparator, and replacing default_validator by a COMPACT_VALUE column in all cases is relatively simple). And storing the comparator as an opaque string broke concurrent updates of sub-parts of said comparator (concurrent collection addition, or altering 2 separate clustering columns, typically), so it's really worth removing it. # CASSANDRA-4603: it's time to get rid of those ugly json maps. I'll note that schema_keyspaces is a problem due to its use of COMPACT STORAGE, but I think we should fix it once and for all nonetheless (see below). # For CASSANDRA-6382, and to allow indexing both map keys and values at the same time, we'd need to be able to have more than one index definition for a given column. # There are a few mismatches in table options between the ones stored in the schema and the ones used when declaring/altering a table, which would be nice to fix. 
The compaction, compression and replication maps are ones already mentioned in CASSANDRA-4603, but also, for some reason, 'dclocal_read_repair_chance' in CQL is called just 'local_read_repair_chance' in the schema table, and 'min/max_compaction_threshold' are column-family options in the schema but just compaction options in CQL (which makes more sense). None of those issues are major, and we could probably deal with them independently, but it might be simpler to just fix them all in one shot, so I wanted to sum them all up here. In particular, the fact that 'schema_keyspaces' uses COMPACT STORAGE is annoying (for the replication map, but it may limit future stuff too), which suggests we should migrate it to a new, non-COMPACT table. And while that's arguably a detail, it wouldn't hurt to rename schema_columnfamilies to schema_tables for the years to come, since that's the preferred vernacular for CQL. Overall, what I would suggest is to move all schema tables to a new keyspace, named 'schema' for instance (or 'system_schema', but I prefer the shorter version), and fix all the issues above at once. Since we currently don't exchange schema between nodes of different versions, all we'd need to do that is a one-shot startup migration, and overall, I think it could be simpler for clients to deal with one clear migration than to have to handle minor individual changes all over the place. I also think it's somewhat cleaner conceptually to have schema tables in their own keyspace, since they are replicated through a different mechanism than other system tables. 
If we do that, we could, for instance, migrate to the following schema tables (details up for discussion of course): {noformat}
CREATE TYPE user_type (
  name text,
  column_names list<text>,
  column_types list<text>
)

CREATE TABLE keyspaces (
  name text PRIMARY KEY,
  durable_writes boolean,
  replication map<string, string>,
  user_types map<string, user_type>
)

CREATE TYPE trigger_definition (
  name text,
  options map<text, text>
)

CREATE TABLE tables (
  keyspace text,
  name text,
  id uuid,
  table_type text, // COMPACT, CQL or SUPER
  dropped_columns map<text, bigint>,
  triggers map<text, trigger_definition>,
  // options
  comment text,
  compaction map<text, text>,
  compression map<text, text>,
  read_repair_chance double,
  dclocal_read_repair_chance double,
  gc_grace_seconds int,
  caching text,
  rows_per_partition_to_cache text,
  default_time_to_live int,
  min_index_interval int,
  max_index_interval int,
  speculative_retry text,
  populate_io_cache_on_flush boolean,
  bloom_filter_fp_chance double,
  memtable_flush_period_in_ms int,
  PRIMARY KEY (keyspace, name)
)

CREATE TYPE index_definition (
  name text,
  index_type text,
  options map<text, text>
)

CREATE TABLE columns (
  keyspace text,
  table text,
  name text,
  kind text, // PARTITION_KEY, CLUSTERING_COLUMN,
cassandra git commit: Ninja-fix UFPureScriptTest (post-6717 driver update)
Repository: cassandra Updated Branches: refs/heads/cassandra-3.0 13172bd99 - c997c08c4 Ninja-fix UFPureScriptTest (post-6717 driver update) Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c997c08c Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c997c08c Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c997c08c Branch: refs/heads/cassandra-3.0 Commit: c997c08c47b049f4278a08b2d0ad9329fdf4a455 Parents: 13172bd Author: Robert Stupp sn...@snazy.de Authored: Thu Aug 20 11:20:08 2015 +0200 Committer: Robert Stupp sn...@snazy.de Committed: Thu Aug 20 11:20:08 2015 +0200 -- .../org/apache/cassandra/cql3/functions/ScriptBasedUDFunction.java | 2 ++ 1 file changed, 2 insertions(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/c997c08c/src/java/org/apache/cassandra/cql3/functions/ScriptBasedUDFunction.java -- diff --git a/src/java/org/apache/cassandra/cql3/functions/ScriptBasedUDFunction.java b/src/java/org/apache/cassandra/cql3/functions/ScriptBasedUDFunction.java index 19ef769..ce4ea5e 100644 --- a/src/java/org/apache/cassandra/cql3/functions/ScriptBasedUDFunction.java +++ b/src/java/org/apache/cassandra/cql3/functions/ScriptBasedUDFunction.java @@ -70,8 +70,10 @@ final class ScriptBasedUDFunction extends UDFunction jdk.nashorn.internal.runtime.linker, // following required by Java Driver java.math, +java.nio, java.text, com.google.common.base, +com.google.common.collect, com.google.common.reflect, // following required by UDF com.datastax.driver.core,
[1/2] cassandra git commit: Ninja-fix UFPureScriptTest (post-6717 driver update)
Repository: cassandra Updated Branches: refs/heads/trunk 110e803ed - b10c00b28 Ninja-fix UFPureScriptTest (post-6717 driver update) Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c997c08c Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c997c08c Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c997c08c Branch: refs/heads/trunk Commit: c997c08c47b049f4278a08b2d0ad9329fdf4a455 Parents: 13172bd Author: Robert Stupp sn...@snazy.de Authored: Thu Aug 20 11:20:08 2015 +0200 Committer: Robert Stupp sn...@snazy.de Committed: Thu Aug 20 11:20:08 2015 +0200 -- .../org/apache/cassandra/cql3/functions/ScriptBasedUDFunction.java | 2 ++ 1 file changed, 2 insertions(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/c997c08c/src/java/org/apache/cassandra/cql3/functions/ScriptBasedUDFunction.java -- diff --git a/src/java/org/apache/cassandra/cql3/functions/ScriptBasedUDFunction.java b/src/java/org/apache/cassandra/cql3/functions/ScriptBasedUDFunction.java index 19ef769..ce4ea5e 100644 --- a/src/java/org/apache/cassandra/cql3/functions/ScriptBasedUDFunction.java +++ b/src/java/org/apache/cassandra/cql3/functions/ScriptBasedUDFunction.java @@ -70,8 +70,10 @@ final class ScriptBasedUDFunction extends UDFunction jdk.nashorn.internal.runtime.linker, // following required by Java Driver java.math, +java.nio, java.text, com.google.common.base, +com.google.common.collect, com.google.common.reflect, // following required by UDF com.datastax.driver.core,
[jira] [Commented] (CASSANDRA-8630) Faster sequential IO (on compaction, streaming, etc)
[ https://issues.apache.org/jira/browse/CASSANDRA-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704570#comment-14704570 ] Benedict commented on CASSANDRA-8630: - If you're going with an array of {{Region}} objects, you may as well just use a {{NavigableMap}}. The main benefit is only realised with two arrays. (I don't mind terribly which you do). When it comes to thread safety, I would suggest backing it by {{SharedCloseableImpl}}, with the {{Tidy}} instance being retained by the builder, and being mutated as it's being built (completely thread safe as it holds a reference, so it will never be tidied before it's done). Whenever we build a {{SegmentedFile}} we copy a snapshot of the current state into a new instance that obtains a reference. Faster sequential IO (on compaction, streaming, etc) Key: CASSANDRA-8630 URL: https://issues.apache.org/jira/browse/CASSANDRA-8630 Project: Cassandra Issue Type: Improvement Components: Core, Tools Reporter: Oleg Anastasyev Assignee: Stefania Labels: compaction, performance Fix For: 3.x Attachments: 8630-FasterSequencialReadsAndWrites.txt, cpu_load.png, flight_recorder_001_files.tar.gz, flight_recorder_002_files.tar.gz, mmaped_uncomp_hotspot.png
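Benedict's two-arrays-vs-{{NavigableMap}} point above is about how segment boundaries are looked up. A minimal sketch of the two lookup styles (a generic illustration with hypothetical names, not Cassandra's actual {{SegmentedFile}} code): a sorted {{long[]}} of segment start offsets searched with binary search gives the same answer as {{TreeMap.floorEntry}}, but without a boxed entry object per segment.

```java
import java.util.Arrays;
import java.util.TreeMap;

public class SegmentLookupDemo
{
    // Parallel-array style: offsets is sorted ascending and segment i covers
    // [offsets[i], offsets[i+1]). Returns the index of the segment containing
    // position.
    static int findSegment(long[] offsets, long position)
    {
        int idx = Arrays.binarySearch(offsets, position);
        return idx >= 0 ? idx : -idx - 2; // insertion point minus one
    }

    public static void main(String[] args)
    {
        long[] offsets = {0L, 4096L, 8192L};

        // NavigableMap style, for comparison: one boxed entry per segment.
        TreeMap<Long, Integer> byOffset = new TreeMap<>();
        for (int i = 0; i < offsets.length; i++)
            byOffset.put(offsets[i], i);

        long pos = 5000L;
        System.out.println(findSegment(offsets, pos) == byOffset.floorEntry(pos).getValue());
    }
}
```

The "main benefit is only realised with two arrays" remark refers to keeping both the boundaries and the per-segment data in primitive arrays, avoiding the map's node allocations entirely.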
[jira] [Commented] (CASSANDRA-10053) Connection error: ('Unable to connect to any servers', {'127.0.0.1': ProtocolError(cql_version '3.2.0' is not supported by remote (w/ native protocol). Supported
[ https://issues.apache.org/jira/browse/CASSANDRA-10053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704560#comment-14704560 ] Benjamin Lerer commented on CASSANDRA-10053: [~chaitanya_mr] Do you still have this problem? If yes, could you provide us more information on the cassandra and client version that you are using? Connection error: ('Unable to connect to any servers', {'127.0.0.1': ProtocolError(cql_version '3.2.0' is not supported by remote (w/ native protocol). Supported versions: [u'3.1.1'],)}) Key: CASSANDRA-10053 URL: https://issues.apache.org/jira/browse/CASSANDRA-10053 Project: Cassandra Issue Type: Bug Reporter: chaitanya Connection error: ('Unable to connect to any servers', {'127.0.0.1': ProtocolError(cql_version '3.2.0' is not supported by remote (w/ native protocol). Supported versions: [u'3.1.1'],)}) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10150) Cassandra read latency potentially caused by memory leak
[ https://issues.apache.org/jira/browse/CASSANDRA-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Ren updated CASSANDRA-10150: -- Description: We are currently migrating to a new cassandra cluster which is multi-region on ec2. Our previous cluster was also on ec2 but only in the east region. In addition we have upgraded to cassandra 2.0.12 from 2.0.4 and from ubuntu 12 to 14. We are investigating a cassandra latency problem on our new cluster. The symptom is that over a long period of time (12-16 hours) the TP90-95 read latency degrades to the point of being well above our SLA's. During normal operation our TP95 for a 50key lookup is 75ms, when fully degraded, we are facing 300ms TP95 latencies. Doing a rolling restart resolves the problem. We are noticing a high correlation between the Old Gen heap usage (and how much is freed up) and the high latencies. We are running with a max heap size of 12GB and a max new-gen size of 2GB. Below is a chart of the heap usage over a 24 hour period. Right below it is a chart of TP95 latencies (was a mixed workload of 50 and single key lookups), the third image is a look at CMS Old Gen memory usage: Overall heap usage over 24 hrs: !https://photos-6.dropbox.com/t/2/AAAYKgpmWp3gByCU6-uDcmE06at3ptVKue_E9CxQG3rpWA/12/303980955/png/32x32/1/_/1/2/1.png/CJvD-ZABIAEgAiADIAQgBSAGIAcoAigH/_Gu9l48fZBTA2-e9tWN2Hhy_c6oGdsQVIL07btYk7II?size=1280x960size_mode=2|height=300,width=500! TP95 latencies over 24 hours: !https://mail.google.com/mail/u/0/?ui=2ik=0c69b03890view=fimgth=14f4d1d4381a0760attid=0.1disp=embrealattid=ii_14f42580ee666154attbid=ANGjdJ8e959Qch4PmY57AAg-qi3cPMTX_p-33H4Snd1igoxQQ5N0owSRHKEBT-M2gzKKzfMmx0WwUnImJDDMkZcWqeiHieLrGgHJX4i3-Ust8tPrgMDQxe6C_2c3N40sz=w908-h372ats=1440110174664rm=14f4d1d4381a0760zwatsh=1|height=300,width=500! 
OldGen memory usage over 24 hours: !https://mail.google.com/mail/u/0/?ui=2ik=0c69b03890view=fimgth=14f4d1d4381a0760attid=0.4disp=embrealattid=ii_14f4258cc47c9d36attbid=ANGjdJ9LgcECnife3mdKz1JlhDWur7KjiVtbEYYCFyxh0xoF9yEC4Q_90PS56PhU1hOraDiYCDQ1ro0dcOtQhqEU70Pwoc--wsdXbpbWmhJ5hF7QC2FDRS8zpuX_KC0sz=w908-h390ats=1440110174664rm=14f4d1d4381a0760zwatsh=1|height=300,width=500! You can see from this that the old gen section of our heap is what is using up the majority of the heap space. We cannot figure out why the memory is not being collected during a full GC. For reference, in our old cassandra cluster, the behavior is that the full GC will clear up the majority of the heap space. See image below from an old production node operating normally: !https://mail.google.com/mail/u/0/?ui=2ik=0c69b03890view=fimgth=14f4d1d4381a0760attid=0.3disp=embrealattid=ii_14f4262f2c3781bbattbid=ANGjdJ_G3oT4ITmlQMJe16jsYpYINHC1j6dqxvZ5RKfjMp5YUj1VA71_VfWTqUP47wsuRqb6GkeAk_1BllaL6D5bjn0QvScXBPIsr5L4uFMBEMpGZAvRzKaC9Q3xXrssz=w908-h390ats=1440110174664rm=14f4d1d4381a0760zwatsh=1|height=300,width=500! From heap dump file we found that most memory is consumed by unreachable objects. With further analysis we were able to see those objects are RMIConnectionImpl$CombinedClassLoader$ClassLoaderWrapper (holding 4GB of memory) and java.security.ProtectionDomain (holding 2GB) . The only place we know Cassandra is using RMI is in JMX, but does anyone has any clue on where else those objects are used? And Why do they take so much memory? Or It would be great if someone could offer any further debugging tips on the latency or GC issue. was: We are currently migrating to a new cassandra cluster which is multi-region on ec2. Our previous cluster was also on ec2 but only in the east region. In addition we have upgraded to cassandra 2.0.12 from 2.0.4 and from ubuntu 12 to 14. We are investigating a cassandra latency problem on our new cluster. 
The symptom is that over a long period of time (12-16 hours) the TP90-95 read latency degrades to the point of being well above our SLA's. During normal operation our TP95 for a 50key lookup is 75ms, when fully degraded, we are facing 300ms TP95 latencies. Doing a rolling restart resolves the problem. We are noticing a high correlation between the Old Gen heap usage (and how much is freed up) and the high latencies. We are running with a max heap size of 12GB and a max new-gen size of 2GB. Below is a chart of the heap usage over a 24 hour period. Right below it is a chart of TP95 latencies (was a mixed workload of 50 and single key lookups), the third image is a look at CMS Old Gen memory usage: Overall heap usage over 24 hrs: !https://mail.google.com/mail/u/0/?ui=2ik=0c69b03890view=fimgth=14f4d1d4381a0760attid=0.2disp=embrealattid=ii_14f4256a57b697abattbid=ANGjdJ8836xDhsdopJteTGvid1FXOcMruq1Pz9fCkoasJ1Zsf2cQCXpbQ3CUB8DOupdYHstLw4n5xg9oXpWmSmp6FAvg3CnO9q7BlDNZ-EmMIy4tIg1yprl8ipDtgzwsz=w908-h372ats=1440110174663rm=14f4d1d4381a0760zwatsh=1|height=300,width=500! TP95 latencies over 24 hours:
[jira] [Updated] (CASSANDRA-10150) Cassandra read latency potentially caused by memory leak
[ https://issues.apache.org/jira/browse/CASSANDRA-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Ren updated CASSANDRA-10150: -- Description: We are currently migrating to a new cassandra cluster which is multi-region on ec2. Our previous cluster was also on ec2 but only in the east region. In addition we have upgraded to cassandra 2.0.12 from 2.0.4 and from ubuntu 12 to 14. We are investigating a cassandra latency problem on our new cluster. The symptom is that over a long period of time (12-16 hours) the TP90-95 read latency degrades to the point of being well above our SLA's. During normal operation our TP95 for a 50key lookup is 75ms, when fully degraded, we are facing 300ms TP95 latencies. Doing a rolling restart resolves the problem. We are noticing a high correlation between the Old Gen heap usage (and how much is freed up) and the high latencies. We are running with a max heap size of 12GB and a max new-gen size of 2GB. Below is a chart of the heap usage over a 24 hour period. Right below it is a chart of TP95 latencies (was a mixed workload of 50 and single key lookups), the third image is a look at CMS Old Gen memory usage: Overall heap usage over 24 hrs: !https://dl.dropboxusercontent.com/u/303980955/1.png|height=300,width=500! TP95 latencies over 24 hours: !https://dl.dropboxusercontent.com/u/303980955/2.png|height=300,width=500! OldGen memory usage over 24 hours: !https://dl.dropboxusercontent.com/u/303980955/3.png|height=300,width=500! You can see from this that the old gen section of our heap is what is using up the majority of the heap space. We cannot figure out why the memory is not being collected during a full GC. For reference, in our old cassandra cluster, the behavior is that the full GC will clear up the majority of the heap space. See image below from an old production node operating normally: !https://dl.dropboxusercontent.com/u/303980955/4.png|height=300,width=500! 
From heap dump file we found that most memory is consumed by unreachable objects. With further analysis we were able to see those objects are RMIConnectionImpl$CombinedClassLoader$ClassLoaderWrapper (holding 4GB of memory) and java.security.ProtectionDomain (holding 2GB) . The only place we know Cassandra is using RMI is in JMX, but does anyone has any clue on where else those objects are used? And Why do they take so much memory? Or It would be great if someone could offer any further debugging tips on the latency or GC issue. was: We are currently migrating to a new cassandra cluster which is multi-region on ec2. Our previous cluster was also on ec2 but only in the east region. In addition we have upgraded to cassandra 2.0.12 from 2.0.4 and from ubuntu 12 to 14. We are investigating a cassandra latency problem on our new cluster. The symptom is that over a long period of time (12-16 hours) the TP90-95 read latency degrades to the point of being well above our SLA's. During normal operation our TP95 for a 50key lookup is 75ms, when fully degraded, we are facing 300ms TP95 latencies. Doing a rolling restart resolves the problem. We are noticing a high correlation between the Old Gen heap usage (and how much is freed up) and the high latencies. We are running with a max heap size of 12GB and a max new-gen size of 2GB. Below is a chart of the heap usage over a 24 hour period. Right below it is a chart of TP95 latencies (was a mixed workload of 50 and single key lookups), the third image is a look at CMS Old Gen memory usage: Overall heap usage over 24 hrs: !https://dl.dropboxusercontent.com/u/303980955/1.png|height=300,width=500! 
TP95 latencies over 24 hours: !https://mail.google.com/mail/u/0/?ui=2ik=0c69b03890view=fimgth=14f4d1d4381a0760attid=0.1disp=embrealattid=ii_14f42580ee666154attbid=ANGjdJ8e959Qch4PmY57AAg-qi3cPMTX_p-33H4Snd1igoxQQ5N0owSRHKEBT-M2gzKKzfMmx0WwUnImJDDMkZcWqeiHieLrGgHJX4i3-Ust8tPrgMDQxe6C_2c3N40sz=w908-h372ats=1440110174664rm=14f4d1d4381a0760zwatsh=1|height=300,width=500! OldGen memory usage over 24 hours: !https://mail.google.com/mail/u/0/?ui=2ik=0c69b03890view=fimgth=14f4d1d4381a0760attid=0.4disp=embrealattid=ii_14f4258cc47c9d36attbid=ANGjdJ9LgcECnife3mdKz1JlhDWur7KjiVtbEYYCFyxh0xoF9yEC4Q_90PS56PhU1hOraDiYCDQ1ro0dcOtQhqEU70Pwoc--wsdXbpbWmhJ5hF7QC2FDRS8zpuX_KC0sz=w908-h390ats=1440110174664rm=14f4d1d4381a0760zwatsh=1|height=300,width=500! You can see from this that the old gen section of our heap is what is using up the majority of the heap space. We cannot figure out why the memory is not being collected during a full GC. For reference, in our old cassandra cluster, the behavior is that the full GC will clear up the majority of the heap space. See image below from an old production node operating normally:
[jira] [Updated] (CASSANDRA-10150) Cassandra read latency potentially caused by memory leak
[ https://issues.apache.org/jira/browse/CASSANDRA-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Ren updated CASSANDRA-10150: -- Description: We are currently migrating to a new cassandra cluster which is multi-region on ec2. Our previous cluster was also on ec2 but only in the east region. In addition we have upgraded to cassandra 2.0.12 from 2.0.4 and from ubuntu 12 to 14. We are investigating a cassandra latency problem on our new cluster. The symptom is that over a long period of time (12-16 hours) the TP90-95 read latency degrades to the point of being well above our SLA's. During normal operation our TP95 for a 50key lookup is 75ms, when fully degraded, we are facing 300ms TP95 latencies. Doing a rolling restart resolves the problem. We are noticing a high correlation between the Old Gen heap usage (and how much is freed up) and the high latencies. We are running with a max heap size of 12GB and a max new-gen size of 2GB. Below is a chart of the heap usage over a 24 hour period. Right below it is a chart of TP95 latencies (was a mixed workload of 50 and single key lookups), the third image is a look at CMS Old Gen memory usage: Overall heap usage over 24 hrs: !https://dl.dropboxusercontent.com/u/303980955/1.png|height=300,width=500! TP95 latencies over 24 hours: !https://mail.google.com/mail/u/0/?ui=2ik=0c69b03890view=fimgth=14f4d1d4381a0760attid=0.1disp=embrealattid=ii_14f42580ee666154attbid=ANGjdJ8e959Qch4PmY57AAg-qi3cPMTX_p-33H4Snd1igoxQQ5N0owSRHKEBT-M2gzKKzfMmx0WwUnImJDDMkZcWqeiHieLrGgHJX4i3-Ust8tPrgMDQxe6C_2c3N40sz=w908-h372ats=1440110174664rm=14f4d1d4381a0760zwatsh=1|height=300,width=500! 
OldGen memory usage over 24 hours: !https://mail.google.com/mail/u/0/?ui=2ik=0c69b03890view=fimgth=14f4d1d4381a0760attid=0.4disp=embrealattid=ii_14f4258cc47c9d36attbid=ANGjdJ9LgcECnife3mdKz1JlhDWur7KjiVtbEYYCFyxh0xoF9yEC4Q_90PS56PhU1hOraDiYCDQ1ro0dcOtQhqEU70Pwoc--wsdXbpbWmhJ5hF7QC2FDRS8zpuX_KC0sz=w908-h390ats=1440110174664rm=14f4d1d4381a0760zwatsh=1|height=300,width=500! You can see from this that the old gen section of our heap is what is using up the majority of the heap space. We cannot figure out why the memory is not being collected during a full GC. For reference, in our old cassandra cluster, the behavior is that the full GC will clear up the majority of the heap space. See image below from an old production node operating normally: !https://mail.google.com/mail/u/0/?ui=2ik=0c69b03890view=fimgth=14f4d1d4381a0760attid=0.3disp=embrealattid=ii_14f4262f2c3781bbattbid=ANGjdJ_G3oT4ITmlQMJe16jsYpYINHC1j6dqxvZ5RKfjMp5YUj1VA71_VfWTqUP47wsuRqb6GkeAk_1BllaL6D5bjn0QvScXBPIsr5L4uFMBEMpGZAvRzKaC9Q3xXrssz=w908-h390ats=1440110174664rm=14f4d1d4381a0760zwatsh=1|height=300,width=500! From heap dump file we found that most memory is consumed by unreachable objects. With further analysis we were able to see those objects are RMIConnectionImpl$CombinedClassLoader$ClassLoaderWrapper (holding 4GB of memory) and java.security.ProtectionDomain (holding 2GB) . The only place we know Cassandra is using RMI is in JMX, but does anyone has any clue on where else those objects are used? And Why do they take so much memory? Or It would be great if someone could offer any further debugging tips on the latency or GC issue. was: We are currently migrating to a new cassandra cluster which is multi-region on ec2. Our previous cluster was also on ec2 but only in the east region. In addition we have upgraded to cassandra 2.0.12 from 2.0.4 and from ubuntu 12 to 14. We are investigating a cassandra latency problem on our new cluster. 
The symptom is that over a long period of time (12-16 hours) the TP90-95 read latency degrades to the point of being well above our SLA's. During normal operation our TP95 for a 50key lookup is 75ms, when fully degraded, we are facing 300ms TP95 latencies. Doing a rolling restart resolves the problem. We are noticing a high correlation between the Old Gen heap usage (and how much is freed up) and the high latencies. We are running with a max heap size of 12GB and a max new-gen size of 2GB. Below is a chart of the heap usage over a 24 hour period. Right below it is a chart of TP95 latencies (was a mixed workload of 50 and single key lookups), the third image is a look at CMS Old Gen memory usage: Overall heap usage over 24 hrs: !https://drive.google.com/open?id=0ByfQBqj1GO9KYWxXRk8xN1NRTGlUQUxFTnJaQnBoSWN0c0F3|height=300,width=500! TP95 latencies over 24 hours: !https://mail.google.com/mail/u/0/?ui=2ik=0c69b03890view=fimgth=14f4d1d4381a0760attid=0.1disp=embrealattid=ii_14f42580ee666154attbid=ANGjdJ8e959Qch4PmY57AAg-qi3cPMTX_p-33H4Snd1igoxQQ5N0owSRHKEBT-M2gzKKzfMmx0WwUnImJDDMkZcWqeiHieLrGgHJX4i3-Ust8tPrgMDQxe6C_2c3N40sz=w908-h372ats=1440110174664rm=14f4d1d4381a0760zwatsh=1|height=300,width=500! OldGen memory usage over 24 hours:
[jira] [Updated] (CASSANDRA-9666) Provide an alternative to DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-9666: -- Description: DTCS is great for time series data, but it comes with caveats that make it difficult to use in production (typical operator behaviors such as bootstrap, removenode, and repair have MAJOR caveats as they relate to max_sstable_age_days, and hints/read repair break the selection algorithm). I'm proposing an alternative, TimeWindowCompactionStrategy, that sacrifices the tiered nature of DTCS in order to address some of DTCS' operational shortcomings. I believe it is necessary to propose an alternative rather than simply adjusting DTCS, because it fundamentally removes the tiered nature in order to remove the parameter max_sstable_age_days - the result is very very different, even if it is heavily inspired by DTCS. Specifically, rather than creating a number of windows of ever increasing sizes, this strategy allows an operator to choose the window size, compact with STCS within the first window of that size, and aggressively compact down to a single sstable once that window is no longer current. The window size is a combination of unit (minutes, hours, days) and size (1, etc), such that an operator can expect all data written within a block of that size to be compacted together (that is, if your unit is hours, and size is 6, you will create roughly 4 sstables per day, each one containing roughly 6 hours of data). The result addresses a number of the problems with DateTieredCompactionStrategy: - At the present time, DTCS’s first window is compacted using an unusual selection criterion, which prefers files with earlier timestamps, but ignores sizes. In TimeWindowCompactionStrategy, the first window's data will be compacted with the well tested, fast, reliable STCS. All STCS options can be passed to TimeWindowCompactionStrategy to configure the first window’s compaction behavior.
- HintedHandoff may put old data in new sstables, but it will have little impact other than slightly reduced efficiency (sstables will cover a wider range, but the old timestamps will not impact sstable selection criteria during compaction) - ReadRepair may put old data in new sstables, but it will have little impact other than slightly reduced efficiency (sstables will cover a wider range, but the old timestamps will not impact sstable selection criteria during compaction) - Small, old sstables resulting from streams of any kind will be swiftly and aggressively compacted with the other sstables matching their similar maxTimestamp, without causing sstables in neighboring windows to grow in size. - The configuration options are explicit and straightforward - the tuning parameters leave little room for error. The window is set in common, easily understandable terms such as “12 hours”, “1 Day”, “30 days”. The minute/hour/day options are granular enough for users keeping data for hours, and users keeping data for years. - There is no explicitly configurable max sstable age, though sstables will naturally stop compacting once new data is written in that window. - Streaming operations can create sstables with old timestamps, and they'll naturally be joined together with sstables in the same time bucket. This is true for bootstrap/repair/sstableloader/removenode. - It remains true that if old data and new data is written into the memtable at the same time, the resulting sstables will be treated as if they were new sstables, however, that no longer negatively impacts the compaction strategy’s selection criteria for older windows. 
Patch provided for: - 2.1: https://github.com/jeffjirsa/cassandra/commits/twcs-2.1 - 2.2: https://github.com/jeffjirsa/cassandra/commits/twcs-2.2 - trunk (post-8099): https://github.com/jeffjirsa/cassandra/commits/twcs Rebased, force-pushed July 18, with bug fixes for estimated pending compactions and potential starvation if more than min_threshold tables existed in the current window but STCS did not consider them viable candidates. Rebased, force-pushed Aug 20 to bring in relevant logic from CASSANDRA-9882.
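The (unit, size) windowing described above is easy to state precisely. Below is a minimal, hypothetical sketch (not the actual TWCS code; `lowerWindowBound` is an invented helper) of how a fixed window maps write timestamps into buckets:

```java
import java.util.concurrent.TimeUnit;

public class WindowBounds {
    // Lower bound of the fixed time window containing timestampMillis,
    // for a window of `size` units. All sstables whose max timestamp falls
    // in the same window would be candidates to compact together.
    static long lowerWindowBound(long timestampMillis, TimeUnit unit, int size) {
        long windowMillis = unit.toMillis(size);
        return (timestampMillis / windowMillis) * windowMillis;
    }

    public static void main(String[] args) {
        // With unit=HOURS, size=6 there are exactly 4 windows per day.
        long t1 = 1_440_000_000_000L;              // 2015-08-19T16:00:00Z
        long t2 = t1 + TimeUnit.HOURS.toMillis(1); // one hour later, same window
        System.out.println(lowerWindowBound(t1, TimeUnit.HOURS, 6)); // 1439985600000
        System.out.println(lowerWindowBound(t2, TimeUnit.HOURS, 6)); // 1439985600000
    }
}
```

With a 6-hour window, writes at 16:00 and 17:00 UTC both land in the 12:00-18:00 bucket, so their sstables are aggressively compacted together once that window is no longer current.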
[jira] [Updated] (CASSANDRA-10048) cassandra-stress - Decimal is a BigInt not a Double
[ https://issues.apache.org/jira/browse/CASSANDRA-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-10048: -- Fix Version/s: (was: 2.1.8) 2.1.x cassandra-stress - Decimal is a BigInt not a Double --- Key: CASSANDRA-10048 URL: https://issues.apache.org/jira/browse/CASSANDRA-10048 Project: Cassandra Issue Type: Bug Reporter: Sebastian Estevez Fix For: 2.1.x Attachments: CASSANDRA-10048.patch Similar to CASSANDRA-8882 I'll provide a patch. {code} com.datastax.driver.core.exceptions.InvalidTypeException: Invalid type for value 26 of CQL type decimal, expecting class java.math.BigDecimal but class java.lang.Double provided (the same exception repeats for every operation until the run is interrupted) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
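For context on the error: cassandra-stress was generating a java.lang.Double for CQL `decimal` columns, while the DataStax Java driver will only serialize `decimal` from java.math.BigDecimal. A minimal sketch of the conversion involved (the helper name is hypothetical; the actual fix is in the attached patch):

```java
import java.math.BigDecimal;

public class DecimalFix {
    // A Double bound to a CQL `decimal` column is rejected by the driver;
    // the generated value has to be converted to BigDecimal first.
    static BigDecimal toCqlDecimal(double generated) {
        // BigDecimal.valueOf goes through Double.toString, yielding the short
        // decimal form rather than the full binary expansion that
        // `new BigDecimal(0.1)` would produce.
        return BigDecimal.valueOf(generated);
    }

    public static void main(String[] args) {
        System.out.println(toCqlDecimal(0.1));  // 0.1
        System.out.println(toCqlDecimal(26.0)); // 26.0
    }
}
```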
[jira] [Updated] (CASSANDRA-8879) Alter table on compact storage broken
[ https://issues.apache.org/jira/browse/CASSANDRA-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-8879: - Fix Version/s: (was: 2.0.x) 2.1.x Alter table on compact storage broken - Key: CASSANDRA-8879 URL: https://issues.apache.org/jira/browse/CASSANDRA-8879 Project: Cassandra Issue Type: Bug Reporter: Nick Bailey Assignee: Aleksey Yeschenko Fix For: 2.1.x Attachments: 8879-2.0.txt In 2.0 HEAD, alter table on compact storage tables seems to be broken. With the following table definition, altering the column breaks cqlsh and generates a stack trace in the log. {noformat} CREATE TABLE settings ( key blob, column1 blob, value blob, PRIMARY KEY ((key), column1) ) WITH COMPACT STORAGE {noformat} {noformat} cqlsh:OpsCenter alter table settings ALTER column1 TYPE ascii ; TSocket read 0 bytes cqlsh:OpsCenter DESC TABLE settings; {noformat} {noformat} ERROR [Thrift:7] 2015-02-26 17:20:24,640 CassandraDaemon.java (line 199) Exception in thread Thread[Thrift:7,5,main] java.lang.AssertionError ...at org.apache.cassandra.cql3.statements.AlterTableStatement.announceMigration(AlterTableStatement.java:198) ...at org.apache.cassandra.cql3.statements.SchemaAlteringStatement.execute(SchemaAlteringStatement.java:79) ...at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:158) ...at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:175) ...at org.apache.cassandra.thrift.CassandraServer.execute_cql3_query(CassandraServer.java:1958) ...at org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4486) ...at org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4470) ...at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) ...at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ...at 
org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:204) ...at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ...at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ...at java.lang.Thread.run(Thread.java:724) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10084) Very slow performance streaming a large query from a single CF
[ https://issues.apache.org/jira/browse/CASSANDRA-10084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705971#comment-14705971 ] Brent Haines commented on CASSANDRA-10084: -- I did a lot of tuning with prefetching, threads per client, and added multithreading to our query collator. Performance has improved a lot, but it doesn't come close to what we had before we added the collection to the table. Right now, I have discovered a query for a specific index value that is particularly slow, 3 minutes for 10,000 records. First, it returned only about 1% of the data without error. I did a repair on one of the nodes for that partition key and it seems to be working, but is very slow now. I have attached stack dumps for every node, though I am not certain which one is working at any given time. Stupid question - is there a quick way to see what nodes own the key for a specific query? I turn trace on and run the query a bunch of times to get all three. Please see the attached profiles for the 3 nodes. We run an incremental repair nightly. They usually finish, but sometimes nodes report *much* more storage than they actually own. They all own about 60 to 90GB, but after repair some nodes will say they own 2+ TB! Restarting reveals that they are way behind on compaction and take about 2 hours to clear that up. Very slow performance streaming a large query from a single CF -- Key: CASSANDRA-10084 URL: https://issues.apache.org/jira/browse/CASSANDRA-10084 Project: Cassandra Issue Type: Bug Environment: Cassandra 2.1.8 12GB EC2 instance 12 node cluster 32 concurrent reads 32 concurrent writes 6GB heap space Reporter: Brent Haines Attachments: cassandra.yaml We have a relatively simple column family that we use to track event data from different providers. We have been utilizing it for some time.
Here is what it looks like: {code} CREATE TABLE data.stories_by_text ( ref_id timeuuid, second_type text, second_value text, object_type text, field_name text, value text, story_id timeuuid, data map<text, text>, PRIMARY KEY ((ref_id, second_type, second_value, object_type, field_name), value, story_id) ) WITH CLUSTERING ORDER BY (value ASC, story_id ASC) AND bloom_filter_fp_chance = 0.01 AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}' AND comment = 'Searchable fields and actions in a story are indexed by ref id which corresponds to a brand, app, app instance, or user.' AND compaction = {'min_threshold': '4', 'cold_reads_to_omit': '0.0', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE'; {code} We will, on a daily basis, pull a query of the complete data for a given index; it will look like this: {code} select * from stories_by_text where ref_id = f0124740-2f5a-11e5-a113-03cdf3f3c6dc and second_type = 'Day' and second_value = '20150812' and object_type = 'booshaka:user' and field_name = 'hashedEmail'; {code} In the past, we have been able to pull millions of records out of the CF in a few seconds. We recently added the data column so that we could filter on event data and provide more detailed analysis of activity for our reports. The data map, declared with 'data map<text, text>', is very small; only 2 or 3 name/value pairs. Since we have added this column, our streaming query performance has gone straight to hell. I just ran the above query and it took 46 minutes to read 86K rows and then it timed out.
I am uncertain what other data you need to see in order to diagnose this. We are using STCS and are considering a change to Leveled Compaction. The table is repaired nightly and the updates, which are at a very fast clip will only impact the partition key for today, while the queries are for previous days only. To my knowledge these queries no longer finish ever. They time out, even though I put a 60 second timeout on the read for the cluster. I can watch it pause for 30 to 50 seconds many times during the stream. Again, this only started happening when we added the data column. Please let me know what else you need for this. It is having a very big impact on our system. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-10084) Very slow performance streaming a large query from a single CF
[ https://issues.apache.org/jira/browse/CASSANDRA-10084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705971#comment-14705971 ] Brent Haines edited comment on CASSANDRA-10084 at 8/21/15 12:47 AM: I did a lot of tuning with prefetching, threads per client, and added multithreading to our query collator. Performance has improved a lot, but it doesn't come close to what we had before we added the collection to the table. Right now, I have discovered a query for a specific index value that is particularly slow, 3 minutes for 10,000 records (there are 650,000 records). When I discovered this index, it was halting after providing only about 1% of the data and did not produce any kind of error or exception. I did a repair on one of the nodes for that partition key and it seems to be working, but is very slow now. I have attached stack dumps for every node involved in the query while the query was running, though I am not certain which one is doing work at any given time. Stupid question - is there a quick way to see what nodes own the key for a specific query? I turn trace on and run the query a bunch of times to get all three. Please see the attached profiles for the 3 nodes. Also FYI - We run an incremental repair nightly. They usually finish, but sometimes, in the morning, nodes report *much* more storage than they actually own. They all own about 60 to 90GB, but after repair some nodes will say they own 2+ TB! Restarting reveals that they are way behind on compaction and take about 2 hours to clear that up. If I try a nodetool compactionstats before restarting, it will hang until timeout. Final question, is upgrading to 2.2 a safe bet for some of these issues? Specifically the halting of compaction during repair? was (Author: thebrenthaines): I did a lot of tuning with prefetching, threads per client, and added multithreading to our query collator.
Performance has improved a lot, but it doesn't come close to what we had before we added the collection to the table. Right now, I have discovered a query for a specific index value that is particularly slow, 3 minutes for 10,000 records. First, it stopped after providing only about 1% of the data, but did not produce any kind of error or exception. I did a repair on one of the nodes for that partition key and it seems to be working, but is very slow now. I have attached stack dumps for every node involved in the query, though I am not certain which one is doing work at any given time. Stupid question - is there a quick way to see what nodes own the key for a specific query? I turn trace on and run the query a bunch of times to get all three. Please see the attached profiles for the 3 nodes. Also FYI - We run an incremental repair nightly. They usually finish, but sometimes, in the morning, nodes report *much* more storage than they actually own. They all own about 60 to 90GB, but after repair some nodes will say they own 2+ TB! Restarting reveals that they are way behind on compaction and take about 2 hours to clear that up. If I try a nodetool compactionstats before restarting, it will hang until timeout. Final question, is upgrading to 2.2 a safe bet for some of these issues? Specifically the halting of compaction during repair? Very slow performance streaming a large query from a single CF -- Key: CASSANDRA-10084 URL: https://issues.apache.org/jira/browse/CASSANDRA-10084 Project: Cassandra Issue Type: Bug Environment: Cassandra 2.1.8 12GB EC2 instance 12 node cluster 32 concurrent reads 32 concurrent writes 6GB heap space Reporter: Brent Haines Attachments: cassandra.yaml, node1.txt, node2.txt, node3.txt We have a relatively simple column family that we use to track event data from different providers. We have been utilizing it for some time.
Here is what it looks like: {code} CREATE TABLE data.stories_by_text ( ref_id timeuuid, second_type text, second_value text, object_type text, field_name text, value text, story_id timeuuid, data map<text, text>, PRIMARY KEY ((ref_id, second_type, second_value, object_type, field_name), value, story_id) ) WITH CLUSTERING ORDER BY (value ASC, story_id ASC) AND bloom_filter_fp_chance = 0.01 AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}' AND comment = 'Searchable fields and actions in a story are indexed by ref id which corresponds to a brand, app, app instance, or user.' AND compaction = {'min_threshold': '4', 'cold_reads_to_omit': '0.0', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression':
[jira] [Updated] (CASSANDRA-8199) CQL Spec needs to be updated with DateTieredCompactionStrategy
[ https://issues.apache.org/jira/browse/CASSANDRA-8199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-8199: --- Reviewer: Yuki Morishita (was: Michaël Figuière) CQL Spec needs to be updated with DateTieredCompactionStrategy -- Key: CASSANDRA-8199 URL: https://issues.apache.org/jira/browse/CASSANDRA-8199 Project: Cassandra Issue Type: Task Reporter: Michaël Figuière Assignee: Marcus Eriksson Priority: Minor Labels: dtcs Attachments: 0001-update-docs.patch The {{CREATE TABLE}} section of the CQL Specification isn't up to date for the latest {{DateTieredCompactionStrategy}} that has been added in 2.0.11 and 2.1.1. We need to cover all its options just like it's done for the other strategies. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-10137) Consistency problem
Sergey created CASSANDRA-10137: -- Summary: Consistency problem Key: CASSANDRA-10137 URL: https://issues.apache.org/jira/browse/CASSANDRA-10137 Project: Cassandra Issue Type: Bug Reporter: Sergey I have 2 DCs and 3 nodes: dc1: 2 nodes; dc2: 1 node. The keyspace is: CREATE KEYSPACE itm_dhcp_test WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '2', 'DC2': '1'} AND durable_writes = true; and the CF: CREATE TABLE itm_dhcp_test.lock ( name text PRIMARY KEY, reason text, time timestamp, who text ) WITH bloom_filter_fp_chance = 0.01 AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}' AND comment = '' AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE'; Periodically there is a problem with deleting records. For example, execute the queries: INSERT INTO lock (name, reason, time, who) values ('unitTest4', 'CassandraClusterLockTest', dateof(now()), 'I') IF NOT EXISTS USING TTL 60; SELECT * FROM lock WHERE name='unitTest4'; DELETE FROM lock WHERE name='unitTest4'; SELECT * FROM lock WHERE name='unitTest4'; In 20% - 30% of cases the last SELECT returns a non-empty record, most often when the coordinator is node1-dc2.
In trace I see the message: Parsing DELETE FROM lock WHERE name='unitTest4' | node1.dc2 | 45 | SharedPool-Worker-3 | Preparing statement | node1.dc2 |151 | SharedPool-Worker-3 | Executing single-partition query on users | node1.dc2 |588 | SharedPool-Worker-1 | Acquiring sstable references | node1.dc2 |601 | SharedPool-Worker-1 | Merging memtable tombstones | node1.dc2 |634 | SharedPool-Worker-1 | Key cache hit for sstable 2 | node1.dc2|668 | SharedPool-Worker-1 | Seeking to partition beginning in data file | node1.dc2 |674 | SharedPool-Worker-1 | Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones | node1.dc2 |737 | SharedPool-Worker-1 |Merging data from memtables and 1 sstables | node1.dc2 |743 | SharedPool-Worker-1 |Read 1 live and 0 tombstoned cells | node1.dc2 |795 | SharedPool-Worker-1 | Executing single-partition query on permissions | node1.dc2 | 1653 | SharedPool-Worker-1 | Acquiring sstable references | node1.dc2 | 1662 | SharedPool-Worker-1 | Merging memtable tombstones | node1.dc2 | 1690 | SharedPool-Worker-1 | Key cache hit for sstable 5 | node1.dc2| 1737 | SharedPool-Worker-1 | Seeking to partition indexed section in data file | node1.dc2 | 1742 | SharedPool-Worker-1 | Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones | node1.dc2 | 1797 | SharedPool-Worker-1 |Merging data from memtables and 1 sstables | node1.dc2 | 1805 | SharedPool-Worker-1 |Read 0 live and 0 tombstoned cells | node1.dc2 | 1819 | SharedPool-Worker-1 | Executing single-partition query on users | node1.dc2 | 2798 | SharedPool-Worker-4 | Acquiring sstable references | node1.dc2 | 2808 | SharedPool-Worker-4 | Merging memtable tombstones | node1.dc2 | 2851 | SharedPool-Worker-4 | Key cache hit for sstable 2 | node1.dc2| 2896 | SharedPool-Worker-4 | Seeking to partition beginning in data file | node1.dc2 | 2903 | SharedPool-Worker-4 | Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones | node1.dc2 | 2948 | 
SharedPool-Worker-4 |Merging data from memtables and 1 sstables | node1.dc2 | 2954 | SharedPool-Worker-4 |Read 1 live and 0
[jira] [Resolved] (CASSANDRA-5505) Please add support for basic arithmetic operations in CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer resolved CASSANDRA-5505. --- Resolution: Duplicate Please add support for basic arithmetic operations in CQL - Key: CASSANDRA-5505 URL: https://issues.apache.org/jira/browse/CASSANDRA-5505 Project: Cassandra Issue Type: Wish Reporter: Arthur Zubarev Assignee: Benjamin Lerer Labels: cql Please add support for basic arithmetic operations in CQL as -, +, /, *. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5505) Please add support for basic arithmetic operations in CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704709#comment-14704709 ] Benjamin Lerer commented on CASSANDRA-5505: --- {quote}And I think this is not very useful because in pretty much any place we could easily support this currently, you can pretty much do your basic arithmetic client side{quote} If you take the case of somebody looking at some data through {{cqlsh}}, it is simply not possible for him to do some arithmetics. In the past, I have used arithmetic operations a lot in MySQL and it would have been really annoying for me if they had not been available. It is for sure not a ticket with a high priority, but it is in my opinion a nice to have. Please add support for basic arithmetic operations in CQL - Key: CASSANDRA-5505 URL: https://issues.apache.org/jira/browse/CASSANDRA-5505 Project: Cassandra Issue Type: Wish Reporter: Arthur Zubarev Assignee: Benjamin Lerer Labels: cql Please add support for basic arithmetic operations in CQL as -, +, /, *. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8671) Give compaction strategy more control over where sstables are created, including for flushing and streaming.
[ https://issues.apache.org/jira/browse/CASSANDRA-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704729#comment-14704729 ] Marcus Eriksson commented on CASSANDRA-8671: This is basically ready to commit now, just 2 comments * A bit worried about the CQLSSTableWriter change to create the cfs if it does not exist - could we instead create a default SSTableMultiWriter for the SSTableTxnWriter and avoid creating the CFS instance? (something like this: https://github.com/krummas/cassandra/commit/acb133e99d464aba73f14a405e9ca7115fd24500)? * setInitialDirectories in ColumnFamilyStore is unused - is it needed? Give compaction strategy more control over where sstables are created, including for flushing and streaming. Key: CASSANDRA-8671 URL: https://issues.apache.org/jira/browse/CASSANDRA-8671 Project: Cassandra Issue Type: Improvement Reporter: Blake Eggleston Assignee: Blake Eggleston Fix For: 3.x Attachments: 0001-C8671-creating-sstable-writers-for-flush-and-stream-.patch, 8671-giving-compaction-strategies-more-control-over.txt This would enable routing different partitions to different disks based on some user defined parameters. My initial take on how to do this would be to make an interface from SSTableWriter, and have a table's compaction strategy do all SSTableWriter instantiation. Compaction strategies could then implement their own SSTableWriter implementations (which basically wrap one or more normal sstablewriters) for compaction, flushing, and streaming. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704767#comment-14704767 ] Marcus Eriksson commented on CASSANDRA-9045: Ping [~r0mant] - any updates? Is this still happening? Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Marcus Eriksson Priority: Critical Fix For: 2.0.x Attachments: 9045-debug-tracing.txt, another.txt, apache-cassandra-2.0.13-SNAPSHOT.jar, cqlsh.txt, debug.txt, inconsistency.txt Hey guys, After almost a week of researching the issue and trying out multiple things with (almost) no luck I was suggested (on the user@cass list) to file a report here. h5. Setup Cassandra 2.0.13 (we had the issue with 2.0.10 as well and upgraded to see if it goes away) Multi datacenter 12+6 nodes cluster. h5. Schema {code} cqlsh describe keyspace blackbook; CREATE KEYSPACE blackbook WITH replication = { 'class': 'NetworkTopologyStrategy', 'IAD': '3', 'ORD': '3' }; USE blackbook; CREATE TABLE bounces ( domainid text, address text, message text, timestamp bigint, PRIMARY KEY (domainid, address) ) WITH bloom_filter_fp_chance=0.10 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.10 AND gc_grace_seconds=864000 AND index_interval=128 AND read_repair_chance=0.00 AND populate_io_cache_on_flush='false' AND default_time_to_live=0 AND speculative_retry='99.0PERCENTILE' AND memtable_flush_period_in_ms=0 AND compaction={'class': 'LeveledCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'}; {code} h5. Use case Each row (defined by a domainid) can have many many columns (bounce entries) so rows can get pretty wide. In practice, most of the rows are not that big but some of them contain hundreds of thousands and even millions of columns. 
Columns are not TTL'ed but can be deleted using the following CQL3 statement: {code} delete from bounces where domainid = 'domain.com' and address = 'al...@example.com'; {code} All queries are performed using LOCAL_QUORUM CL. h5. Problem We weren't very diligent about running repairs on the cluster initially, but shortly after we started doing it we noticed that some of the previously deleted columns (bounce entries) are there again, as if tombstones have disappeared. I have run this test multiple times via cqlsh, on the row of the customer who originally reported the issue: * delete an entry * verify it's not returned even with CL=ALL * run repair on nodes that own this row's key * the columns reappear and are returned even with CL=ALL I tried the same test on another row with much less data and everything was correctly deleted and didn't reappear after repair. h5. Other steps I've taken so far Made sure NTP is running on all servers and clocks are synchronized. Increased gc_grace_seconds to 100 days, ran full repair (on the affected keyspace) on all nodes, then changed it back to the default 10 days again. Didn't help. Performed one more test. Updated one of the resurrected columns, then deleted it and ran repair again. This time the updated version of the column reappeared. Finally, I noticed these log entries for the row in question: {code} INFO [ValidationExecutor:77] 2015-03-25 20:27:43,936 CompactionController.java (line 192) Compacting large row blackbook/bounces:4ed558feba8a483733001d6a (279067683 bytes) incrementally {code} Figuring it may be related, I bumped in_memory_compaction_limit_in_mb to 512MB so the row fits into it, deleted the entry and ran repair once again. The log entry for this row was gone and the columns didn't reappear. We have a lot of rows much larger than 512MB so we can't increase this parameter forever, if that is the issue. Please let me know if you need more information on the case or if I can run more experiments. Thanks!
Roman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
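For reference, the tombstone purge rule that makes gc_grace_seconds relevant to resurrection bugs can be sketched as below. This is a hypothetical helper, not Cassandra's actual compaction code, and note that in this ticket the incremental compaction of large rows appears to be implicated rather than an expired grace period:

```java
public class TombstoneGc {
    // A tombstone becomes purgeable only after gc_grace_seconds has elapsed
    // since the deletion; until then, repair can still propagate it to
    // replicas that missed the delete.
    static boolean purgeable(long deletionTimeSec, long gcGraceSec, long nowSec) {
        return deletionTimeSec + gcGraceSec <= nowSec;
    }

    public static void main(String[] args) {
        long gcGrace = 864_000L; // the table's 10-day default
        System.out.println(purgeable(1_000_000L, gcGrace, 1_500_000L)); // false: still within grace
        System.out.println(purgeable(1_000_000L, gcGrace, 1_900_000L)); // true: grace elapsed
    }
}
```

If a tombstone is purged on all up-to-date replicas while some replica never received the delete, a later repair streams the shadowed data back, which is the classic resurrection scenario this rule guards against.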
[1/3] cassandra git commit: Check column names in IN restrictions
Repository: cassandra Updated Branches: refs/heads/trunk b10c00b28 -> 0edf54777

Check column names in IN restrictions

patch by Benjamin Lerer; reviewed by Robert Stupp for CASSANDRA-10043

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/62fc314c Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/62fc314c Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/62fc314c Branch: refs/heads/trunk Commit: 62fc314c582e9f218987e96c79db6c1aa0ba6c1e Parents: de2e0a6 Author: blerer benjamin.le...@datastax.com Authored: Thu Aug 20 13:45:54 2015 +0200 Committer: blerer benjamin.le...@datastax.com Committed: Thu Aug 20 13:45:54 2015 +0200

.../cassandra/cql3/SingleColumnRelation.java | 2 +-
.../org/apache/cassandra/cql3/TokenRelation.java | 2 +-
.../operations/SelectMultiColumnRelationTest.java | 12 
.../operations/SelectOrderedPartitionerTest.java | 10 ++
.../operations/SelectSingleColumnRelationTest.java | 17 +
5 files changed, 41 insertions(+), 2 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/62fc314c/src/java/org/apache/cassandra/cql3/SingleColumnRelation.java
diff --git a/src/java/org/apache/cassandra/cql3/SingleColumnRelation.java b/src/java/org/apache/cassandra/cql3/SingleColumnRelation.java index c4c48aa..b206631 100644
--- a/src/java/org/apache/cassandra/cql3/SingleColumnRelation.java
+++ b/src/java/org/apache/cassandra/cql3/SingleColumnRelation.java
@@ -153,7 +153,7 @@ public final class SingleColumnRelation extends Relation
 protected Restriction newINRestriction(CFMetaData cfm, VariableSpecifications boundNames) throws InvalidRequestException
 {
-        ColumnDefinition columnDef = cfm.getColumnDefinition(getEntity().prepare(cfm));
+        ColumnDefinition columnDef = toColumnDefinition(cfm, entity);
         List<? extends ColumnSpecification> receivers = toReceivers(columnDef);
         List<Term> terms = toTerms(receivers, inValues, cfm.ksName, boundNames);
         if (terms == null)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/62fc314c/src/java/org/apache/cassandra/cql3/TokenRelation.java
diff --git a/src/java/org/apache/cassandra/cql3/TokenRelation.java b/src/java/org/apache/cassandra/cql3/TokenRelation.java index 5896fae..46a812c 100644
--- a/src/java/org/apache/cassandra/cql3/TokenRelation.java
+++ b/src/java/org/apache/cassandra/cql3/TokenRelation.java
@@ -109,7 +109,7 @@ public final class TokenRelation extends Relation
 @Override public String toString()
 {
-        return String.format("token(%s) %s %s", Tuples.tupleToString(entities), relationType, value);
+        return String.format("token%s %s %s", Tuples.tupleToString(entities), relationType, value);
 }
 /**

http://git-wip-us.apache.org/repos/asf/cassandra/blob/62fc314c/test/unit/org/apache/cassandra/cql3/validation/operations/SelectMultiColumnRelationTest.java
diff --git a/test/unit/org/apache/cassandra/cql3/validation/operations/SelectMultiColumnRelationTest.java b/test/unit/org/apache/cassandra/cql3/validation/operations/SelectMultiColumnRelationTest.java index 84343a7..b3232d5 100644
--- a/test/unit/org/apache/cassandra/cql3/validation/operations/SelectMultiColumnRelationTest.java
+++ b/test/unit/org/apache/cassandra/cql3/validation/operations/SelectMultiColumnRelationTest.java
@@ -1011,4 +1011,16 @@ public class SelectMultiColumnRelationTest extends CQLTester
         row(0, 0, 2, 2, 2),
         row(0, 0, 3, 3, 3));
     }
+
+    @Test
+    public void testInvalidColumnNames() throws Throwable
+    {
+        createTable("CREATE TABLE %s (a int, b int, c int, d int, PRIMARY KEY (a, b, c))");
+        assertInvalidMessage("Undefined name e in where clause ('(b, e) = (0, 0)')", "SELECT * FROM %s WHERE (b, e) = (0, 0)");
+        assertInvalidMessage("Undefined name e in where clause ('(b, e) IN ((0, 1), (2, 4))')", "SELECT * FROM %s WHERE (b, e) IN ((0, 1), (2, 4))");
+        assertInvalidMessage("Undefined name e in where clause ('(b, e) > (0, 1)')", "SELECT * FROM %s WHERE (b, e) > (0, 1) and b = 2");
+        assertInvalidMessage("Aliases aren't allowed in the where clause ('(b, e) = (0, 0)')", "SELECT c AS e FROM %s WHERE (b, e) = (0, 0)");
+        assertInvalidMessage("Aliases aren't allowed in the where clause ('(b, e) IN ((0, 1), (2, 4))')", "SELECT c AS e FROM %s WHERE (b, e) IN ((0, 1), (2, 4))");
+        assertInvalidMessage("Aliases aren't allowed in the where clause ('(b, e) > (0, 1)')", "SELECT c AS e
[2/3] cassandra git commit: Merge cassandra-2.2 into cassandra-3.0
Merge cassandra-2.2 into cassandra-3.0

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/be2c26f1 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/be2c26f1 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/be2c26f1 Branch: refs/heads/trunk Commit: be2c26f183b6ef9c606fab83342fff026054f5a3 Parents: c997c08 62fc314 Author: blerer benjamin.le...@datastax.com Authored: Thu Aug 20 13:52:27 2015 +0200 Committer: blerer benjamin.le...@datastax.com Committed: Thu Aug 20 13:53:30 2015 +0200

.../cassandra/cql3/SingleColumnRelation.java | 2 +-
.../org/apache/cassandra/cql3/TokenRelation.java | 2 +-
.../operations/SelectMultiColumnRelationTest.java | 12 
.../operations/SelectOrderedPartitionerTest.java | 10 ++
.../operations/SelectSingleColumnRelationTest.java | 17 +
5 files changed, 41 insertions(+), 2 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/be2c26f1/src/java/org/apache/cassandra/cql3/SingleColumnRelation.java
diff --cc src/java/org/apache/cassandra/cql3/SingleColumnRelation.java index 885a2e2,b206631..c848b9e
--- a/src/java/org/apache/cassandra/cql3/SingleColumnRelation.java
+++ b/src/java/org/apache/cassandra/cql3/SingleColumnRelation.java
@@@ -153,8 -153,8 +153,8 @@@ public final class SingleColumnRelatio
 protected Restriction newINRestriction(CFMetaData cfm, VariableSpecifications boundNames) throws InvalidRequestException
 {
-         ColumnDefinition columnDef = cfm.getColumnDefinition(getEntity().prepare(cfm));
+         ColumnDefinition columnDef = toColumnDefinition(cfm, entity);
-        List<? extends ColumnSpecification> receivers = toReceivers(columnDef);
+        List<? extends ColumnSpecification> receivers = toReceivers(columnDef, cfm.isDense());
         List<Term> terms = toTerms(receivers, inValues, cfm.ksName, boundNames);
         if (terms == null)
         {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/be2c26f1/src/java/org/apache/cassandra/cql3/TokenRelation.java
http://git-wip-us.apache.org/repos/asf/cassandra/blob/be2c26f1/test/unit/org/apache/cassandra/cql3/validation/operations/SelectOrderedPartitionerTest.java
http://git-wip-us.apache.org/repos/asf/cassandra/blob/be2c26f1/test/unit/org/apache/cassandra/cql3/validation/operations/SelectSingleColumnRelationTest.java
cassandra git commit: Allow count(*) and count(1) to be use as normal aggregation
Repository: cassandra Updated Branches: refs/heads/cassandra-2.2 62fc314c5 -> 4fc58513d

Allow count(*) and count(1) to be use as normal aggregation

patch by Benjamin Lerer; reviewed by Stefania Alborghetti for CASSANDRA-10114

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4fc58513 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4fc58513 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4fc58513 Branch: refs/heads/cassandra-2.2 Commit: 4fc58513dce5ee6acb83ba07d9f31c26812075f9 Parents: 62fc314 Author: blerer benjamin.le...@datastax.com Authored: Thu Aug 20 14:01:37 2015 +0200 Committer: blerer benjamin.le...@datastax.com Committed: Thu Aug 20 14:01:37 2015 +0200

NEWS.txt | 1 +
src/java/org/apache/cassandra/cql3/Cql.g | 9 +
.../cassandra/cql3/functions/AggregateFcts.java | 11 ++
.../selection/AbstractFunctionSelector.java | 6 +++
.../cassandra/cql3/selection/Selector.java | 1 -
.../validation/operations/AggregationTest.java | 39 
6 files changed, 59 insertions(+), 8 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4fc58513/NEWS.txt
diff --git a/NEWS.txt b/NEWS.txt index 37a1b9e..a9cf70d 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -100,6 +100,7 @@ New features
 - The toTimestamp(date) and toUnixTimestamp(date) functions have been added to allow to convert from date into timestamp type and bigint raw value.
 - SizeTieredCompactionStrategy parameter cold_reads_to_omit has been removed.
+ - COUNT(*) and COUNT(1) can be selected with other columns or functions
 2.1.9

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4fc58513/src/java/org/apache/cassandra/cql3/Cql.g
diff --git a/src/java/org/apache/cassandra/cql3/Cql.g b/src/java/org/apache/cassandra/cql3/Cql.g index 0db09b8..3d2aba5 100644
--- a/src/java/org/apache/cassandra/cql3/Cql.g
+++ b/src/java/org/apache/cassandra/cql3/Cql.g
@@ -295,8 +295,7 @@ selectStatement returns [SelectStatement.RawStatement expr]
 }
 : K_SELECT ( K_JSON { isJson = true; } )?
-      ( ( K_DISTINCT { isDistinct = true; } )? sclause=selectClause
-      | sclause=selectCountClause )
+      ( ( K_DISTINCT { isDistinct = true; } )? sclause=selectClause )
 K_FROM cf=columnFamilyName
 ( K_WHERE wclause=whereClause )?
 ( K_ORDER K_BY orderByClause[orderings] ( ',' orderByClause[orderings] )* )?
@@ -324,6 +323,7 @@ selector returns [RawSelector s]
 unaliasedSelector returns [Selectable.Raw s]
 @init { Selectable.Raw tmp = null; }
 : ( c=cident { tmp = c; }
+  | K_COUNT '(' countArgument ')' { tmp = new Selectable.WithFunction.Raw(FunctionName.nativeFunction("countRows"), Collections.<Selectable.Raw>emptyList());}
   | K_WRITETIME '(' c=cident ')' { tmp = new Selectable.WritetimeOrTTL.Raw(c, true); }
   | K_TTL '(' c=cident ')' { tmp = new Selectable.WritetimeOrTTL.Raw(c, false); }
   | f=functionName args=selectionFunctionArgs { tmp = new Selectable.WithFunction.Raw(f, args); }
@@ -337,11 +337,6 @@ selectionFunctionArgs returns [List<Selectable.Raw> a]
 ')' { $a = args; }
 ;
-selectCountClause returns [List<RawSelector> expr]
-@init{ ColumnIdentifier alias = new ColumnIdentifier("count", false); }
-: K_COUNT '(' countArgument ')' (K_AS c=ident { alias = c; })?
-  { $expr = new ArrayList<RawSelector>(); $expr.add(new RawSelector(new Selectable.WithFunction.Raw(FunctionName.nativeFunction("countRows"), Collections.<Selectable.Raw>emptyList()), alias));}
-;
 countArgument
 : '\*'
 | i=INTEGER { if (!i.getText().equals("1")) addRecognitionError("Only COUNT(1) is supported, got COUNT(" + i.getText() + ")");}

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4fc58513/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java
diff --git a/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java b/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java index 1b22da6..41e43c0 100644
--- a/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java
+++ b/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java
@@ -38,6 +38,17 @@ import org.apache.cassandra.db.marshal.ShortType;
 public abstract class AggregateFcts
 {
 /**
+ * Checks if the specified function is the count rows (e.g. COUNT(*) or COUNT(1)) function.
+ *
+ * @param function the function to check
+ * @return <code>true</code> if the
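The commit above removes the special-cased selectCountClause and makes COUNT(*)/COUNT(1) an ordinary selector, so it can be combined with other aggregates in one query (e.g. `SELECT count(*), max(v) FROM t`). A hedged sketch of the evaluation model, in plain Java with illustrative names (the `Aggregate` interface here is not Cassandra's actual `AggregateFunction` API):

```java
import java.util.Arrays;
import java.util.List;

public class CountAsAggregate {
    // Each aggregate consumes one row value at a time and produces a result.
    interface Aggregate { void addInput(long v); long compute(); }

    // COUNT(*) simply counts rows, ignoring the value.
    static Aggregate countRows() {
        return new Aggregate() {
            long n = 0;
            public void addInput(long v) { n++; }
            public long compute() { return n; }
        };
    }

    static Aggregate max() {
        return new Aggregate() {
            long m = Long.MIN_VALUE;
            public void addInput(long v) { m = Math.max(m, v); }
            public long compute() { return m; }
        };
    }

    // Fold every row through every aggregate in one pass, the way a
    // selection like "count(*), max(v)" is evaluated.
    static long[] run(List<Long> rows, Aggregate... aggs) {
        for (long v : rows)
            for (Aggregate a : aggs)
                a.addInput(v);
        long[] out = new long[aggs.length];
        for (int i = 0; i < aggs.length; i++)
            out[i] = aggs[i].compute();
        return out;
    }

    public static void main(String[] args) {
        long[] res = run(Arrays.asList(3L, 7L, 5L), countRows(), max());
        System.out.println("count=" + res[0] + " max=" + res[1]); // count=3 max=7
    }
}
```

Treating count as just another aggregate is what lets the grammar drop the dedicated selectCountClause production.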
[jira] [Commented] (CASSANDRA-9142) DC Local repair or -hosts should only be allowed with -full repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704766#comment-14704766 ] Marcus Eriksson commented on CASSANDRA-9142: ping on this [~kohlisankalp] - do you have time to review or should I find someone else? DC Local repair or -hosts should only be allowed with -full repair -- Key: CASSANDRA-9142 URL: https://issues.apache.org/jira/browse/CASSANDRA-9142 Project: Cassandra Issue Type: Bug Components: Core Reporter: sankalp kohli Assignee: Marcus Eriksson Priority: Minor Fix For: 2.2.x Attachments: trunk_9142.txt We should not let users mix incremental repair with dc-local repair, -host, or any repair which does not include all replicas. This will currently cause sstables on some replicas to be marked as repaired. The next incremental repair will not work on the same set of data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5505) Please add support for basic arithmetic operations in CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704628#comment-14704628 ] Sylvain Lebresne commented on CASSANDRA-5505: - I'll note that imo this is not very useful in the current state of CQL and hence, imo, not worth adding the complexity. And I think this is not very useful because in pretty much any place we could easily support this currently, you can pretty much do your basic arithmetic client side. _If_ we were to support things like {noformat} SELECT * FROM t WHERE c1 = c2 + 1; {noformat} then yes, this would be a nice-to-have, but we don't support that kind of condition between 2 columns, and while we will likely add it in the future, it is out of scope for this issue. So I'm personally in favor of closing this as "Later". Please add support for basic arithmetic operations in CQL - Key: CASSANDRA-5505 URL: https://issues.apache.org/jira/browse/CASSANDRA-5505 Project: Cassandra Issue Type: Wish Reporter: Arthur Zubarev Assignee: Benjamin Lerer Labels: cql Please add support for basic arithmetic operations in CQL as -, +, /, *. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
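The comment above argues that, absent server-side support, arithmetic like `WHERE c1 = c2 + 1` can be done on the client: fetch both columns with a plain SELECT and filter locally. A minimal sketch in plain Java (no driver; the `Row` type is illustrative, not a real driver class):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ClientSideArithmetic {
    // A fetched row holding the two columns of interest.
    static final class Row {
        final long c1, c2;
        Row(long c1, long c2) { this.c1 = c1; this.c2 = c2; }
    }

    // Client-side equivalent of the unsupported
    // "SELECT * FROM t WHERE c1 = c2 + 1",
    // applied after an ordinary "SELECT c1, c2 FROM t".
    static List<Row> whereC1EqualsC2Plus1(List<Row> fetched) {
        return fetched.stream()
                      .filter(r -> r.c1 == r.c2 + 1)
                      .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(new Row(2, 1), new Row(5, 5), new Row(10, 9));
        System.out.println(whereC1EqualsC2Plus1(rows).size()); // 2
    }
}
```

The obvious trade-off is that the whole candidate set must be transferred to the client before filtering, which is why server-side column-to-column conditions would still be a nice-to-have.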
[jira] [Commented] (CASSANDRA-9669) If sstable flushes complete out of order, on restart we can fail to replay necessary commit log records
[ https://issues.apache.org/jira/browse/CASSANDRA-9669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704693#comment-14704693 ] Branimir Lambov commented on CASSANDRA-9669: The code looks good now, +1. bq. the ranges are inclusive-start, exclusive-end, the inverse of what you expected Yes, they are the inverse of what I expected, but that's actually start-exclusive, end-inclusive; the [shouldReplay comment|https://github.com/apache/cassandra/commit/459b96333c84837fe757d706c2a6be91b8b27f2e#diff-2b76f3efa4aaa38339ab8f4869c9b7bfR88] is correct. I'd also add "as a mutation's replay position is assigned after it is added to the log" to it. Deleting the files at [the start of the test|https://github.com/apache/cassandra/commit/459b96333c84837fe757d706c2a6be91b8b27f2e#diff-b3802331f8dcba05356ad47ee54fe6dfR321] may cause problems on Windows, but I don't know how to do this safely in 2.0. If sstable flushes complete out of order, on restart we can fail to replay necessary commit log records --- Key: CASSANDRA-9669 URL: https://issues.apache.org/jira/browse/CASSANDRA-9669 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benedict Assignee: Benedict Priority: Critical Labels: correctness Fix For: 3.x, 2.1.x, 2.2.x, 3.0.x While {{postFlushExecutor}} ensures it never expires CL entries out-of-order, on restart we simply take the maximum replay position of any sstable on disk, and ignore anything prior. It is quite possible for there to be two flushes triggered for a given table, and for the second to finish first by virtue of containing a much smaller quantity of live data (or perhaps the disk is just under less pressure). If we crash before the first sstable has been written, then on restart the data it would have represented will disappear, since we will not replay the CL records. This looks to be a bug present since time immemorial, and also seems pretty serious. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
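The start-exclusive, end-inclusive convention discussed above can be sketched as an interval test: a commit log position p is covered by a flushed range (start, end] iff start < p <= end, and a record must be replayed only if no flushed range covers it. This is an illustrative model, not the actual Cassandra code:

```java
public class ReplayIntervals {
    // (start, end] containment: start is excluded, end is included.
    static boolean covered(long p, long start, long end) {
        return start < p && p <= end;
    }

    // A commit log record needs replay iff no flushed interval covers it.
    static boolean shouldReplay(long p, long[][] flushed) {
        for (long[] iv : flushed)
            if (covered(p, iv[0], iv[1]))
                return false;
        return true;
    }

    public static void main(String[] args) {
        long[][] flushed = { {0, 10}, {20, 30} };
        System.out.println(shouldReplay(10, flushed)); // false: 10 is inside (0, 10]
        System.out.println(shouldReplay(0, flushed));  // true: start is exclusive
        System.out.println(shouldReplay(15, flushed)); // true: (10, 20] was never flushed
    }
}
```

Tracking per-sstable intervals rather than a single maximum replay position is what prevents the out-of-order-flush data loss described in the ticket: the gap (10, 20] above stays replayable even though a later flush reached position 30.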
[3/3] cassandra git commit: Merge remote-tracking branch 'asf/cassandra-3.0' into trunk
Merge remote-tracking branch 'asf/cassandra-3.0' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0edf5477 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0edf5477 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0edf5477 Branch: refs/heads/trunk Commit: 0edf54777ad18e3729c95552f5276016e00583a4 Parents: b10c00b be2c26f Author: blerer benjamin.le...@datastax.com Authored: Thu Aug 20 13:55:04 2015 +0200 Committer: blerer benjamin.le...@datastax.com Committed: Thu Aug 20 13:55:04 2015 +0200 -- .../cassandra/cql3/SingleColumnRelation.java | 2 +- .../org/apache/cassandra/cql3/TokenRelation.java | 2 +- .../operations/SelectMultiColumnRelationTest.java | 12 .../operations/SelectOrderedPartitionerTest.java | 10 ++ .../operations/SelectSingleColumnRelationTest.java | 17 + 5 files changed, 41 insertions(+), 2 deletions(-) --
[2/2] cassandra git commit: Merge cassandra-2.2 into cassandra-3.0
Merge cassandra-2.2 into cassandra-3.0

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/be2c26f1 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/be2c26f1 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/be2c26f1 Branch: refs/heads/cassandra-3.0 Commit: be2c26f183b6ef9c606fab83342fff026054f5a3 Parents: c997c08 62fc314 Author: blerer benjamin.le...@datastax.com Authored: Thu Aug 20 13:52:27 2015 +0200 Committer: blerer benjamin.le...@datastax.com Committed: Thu Aug 20 13:53:30 2015 +0200

.../cassandra/cql3/SingleColumnRelation.java | 2 +-
.../org/apache/cassandra/cql3/TokenRelation.java | 2 +-
.../operations/SelectMultiColumnRelationTest.java | 12 
.../operations/SelectOrderedPartitionerTest.java | 10 ++
.../operations/SelectSingleColumnRelationTest.java | 17 +
5 files changed, 41 insertions(+), 2 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/be2c26f1/src/java/org/apache/cassandra/cql3/SingleColumnRelation.java
diff --cc src/java/org/apache/cassandra/cql3/SingleColumnRelation.java index 885a2e2,b206631..c848b9e
--- a/src/java/org/apache/cassandra/cql3/SingleColumnRelation.java
+++ b/src/java/org/apache/cassandra/cql3/SingleColumnRelation.java
@@@ -153,8 -153,8 +153,8 @@@ public final class SingleColumnRelatio
 protected Restriction newINRestriction(CFMetaData cfm, VariableSpecifications boundNames) throws InvalidRequestException
 {
-         ColumnDefinition columnDef = cfm.getColumnDefinition(getEntity().prepare(cfm));
+         ColumnDefinition columnDef = toColumnDefinition(cfm, entity);
-        List<? extends ColumnSpecification> receivers = toReceivers(columnDef);
+        List<? extends ColumnSpecification> receivers = toReceivers(columnDef, cfm.isDense());
         List<Term> terms = toTerms(receivers, inValues, cfm.ksName, boundNames);
         if (terms == null)
         {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/be2c26f1/src/java/org/apache/cassandra/cql3/TokenRelation.java
http://git-wip-us.apache.org/repos/asf/cassandra/blob/be2c26f1/test/unit/org/apache/cassandra/cql3/validation/operations/SelectOrderedPartitionerTest.java
http://git-wip-us.apache.org/repos/asf/cassandra/blob/be2c26f1/test/unit/org/apache/cassandra/cql3/validation/operations/SelectSingleColumnRelationTest.java
[1/2] cassandra git commit: Check column names in IN restrictions
Repository: cassandra Updated Branches: refs/heads/cassandra-3.0 c997c08c4 -> be2c26f18

Check column names in IN restrictions

patch by Benjamin Lerer; reviewed by Robert Stupp for CASSANDRA-10043

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/62fc314c Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/62fc314c Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/62fc314c Branch: refs/heads/cassandra-3.0 Commit: 62fc314c582e9f218987e96c79db6c1aa0ba6c1e Parents: de2e0a6 Author: blerer benjamin.le...@datastax.com Authored: Thu Aug 20 13:45:54 2015 +0200 Committer: blerer benjamin.le...@datastax.com Committed: Thu Aug 20 13:45:54 2015 +0200

.../cassandra/cql3/SingleColumnRelation.java | 2 +-
.../org/apache/cassandra/cql3/TokenRelation.java | 2 +-
.../operations/SelectMultiColumnRelationTest.java | 12 
.../operations/SelectOrderedPartitionerTest.java | 10 ++
.../operations/SelectSingleColumnRelationTest.java | 17 +
5 files changed, 41 insertions(+), 2 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/62fc314c/src/java/org/apache/cassandra/cql3/SingleColumnRelation.java
diff --git a/src/java/org/apache/cassandra/cql3/SingleColumnRelation.java b/src/java/org/apache/cassandra/cql3/SingleColumnRelation.java index c4c48aa..b206631 100644
--- a/src/java/org/apache/cassandra/cql3/SingleColumnRelation.java
+++ b/src/java/org/apache/cassandra/cql3/SingleColumnRelation.java
@@ -153,7 +153,7 @@ public final class SingleColumnRelation extends Relation
 protected Restriction newINRestriction(CFMetaData cfm, VariableSpecifications boundNames) throws InvalidRequestException
 {
-        ColumnDefinition columnDef = cfm.getColumnDefinition(getEntity().prepare(cfm));
+        ColumnDefinition columnDef = toColumnDefinition(cfm, entity);
         List<? extends ColumnSpecification> receivers = toReceivers(columnDef);
         List<Term> terms = toTerms(receivers, inValues, cfm.ksName, boundNames);
         if (terms == null)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/62fc314c/src/java/org/apache/cassandra/cql3/TokenRelation.java
diff --git a/src/java/org/apache/cassandra/cql3/TokenRelation.java b/src/java/org/apache/cassandra/cql3/TokenRelation.java index 5896fae..46a812c 100644
--- a/src/java/org/apache/cassandra/cql3/TokenRelation.java
+++ b/src/java/org/apache/cassandra/cql3/TokenRelation.java
@@ -109,7 +109,7 @@ public final class TokenRelation extends Relation
 @Override public String toString()
 {
-        return String.format("token(%s) %s %s", Tuples.tupleToString(entities), relationType, value);
+        return String.format("token%s %s %s", Tuples.tupleToString(entities), relationType, value);
 }
 /**

http://git-wip-us.apache.org/repos/asf/cassandra/blob/62fc314c/test/unit/org/apache/cassandra/cql3/validation/operations/SelectMultiColumnRelationTest.java
diff --git a/test/unit/org/apache/cassandra/cql3/validation/operations/SelectMultiColumnRelationTest.java b/test/unit/org/apache/cassandra/cql3/validation/operations/SelectMultiColumnRelationTest.java index 84343a7..b3232d5 100644
--- a/test/unit/org/apache/cassandra/cql3/validation/operations/SelectMultiColumnRelationTest.java
+++ b/test/unit/org/apache/cassandra/cql3/validation/operations/SelectMultiColumnRelationTest.java
@@ -1011,4 +1011,16 @@ public class SelectMultiColumnRelationTest extends CQLTester
         row(0, 0, 2, 2, 2),
         row(0, 0, 3, 3, 3));
     }
+
+    @Test
+    public void testInvalidColumnNames() throws Throwable
+    {
+        createTable("CREATE TABLE %s (a int, b int, c int, d int, PRIMARY KEY (a, b, c))");
+        assertInvalidMessage("Undefined name e in where clause ('(b, e) = (0, 0)')", "SELECT * FROM %s WHERE (b, e) = (0, 0)");
+        assertInvalidMessage("Undefined name e in where clause ('(b, e) IN ((0, 1), (2, 4))')", "SELECT * FROM %s WHERE (b, e) IN ((0, 1), (2, 4))");
+        assertInvalidMessage("Undefined name e in where clause ('(b, e) > (0, 1)')", "SELECT * FROM %s WHERE (b, e) > (0, 1) and b = 2");
+        assertInvalidMessage("Aliases aren't allowed in the where clause ('(b, e) = (0, 0)')", "SELECT c AS e FROM %s WHERE (b, e) = (0, 0)");
+        assertInvalidMessage("Aliases aren't allowed in the where clause ('(b, e) IN ((0, 1), (2, 4))')", "SELECT c AS e FROM %s WHERE (b, e) IN ((0, 1), (2, 4))");
+        assertInvalidMessage("Aliases aren't allowed in the where clause ('(b, e) > (0, 1)')",
[3/3] cassandra git commit: Merge branch cassandra-3.0 into trunk
Merge branch cassandra-3.0 into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/86bca3a0 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/86bca3a0 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/86bca3a0 Branch: refs/heads/trunk Commit: 86bca3a0b90cc8a1e67f0ddac9e33af63898d569 Parents: 0edf547 1964a82 Author: blerer benjamin.le...@datastax.com Authored: Thu Aug 20 14:12:27 2015 +0200 Committer: blerer benjamin.le...@datastax.com Committed: Thu Aug 20 14:13:23 2015 +0200 -- NEWS.txt| 1 + src/java/org/apache/cassandra/cql3/Cql.g| 9 + .../cassandra/cql3/functions/AggregateFcts.java | 11 ++ .../selection/AbstractFunctionSelector.java | 6 +++ .../cassandra/cql3/selection/Selector.java | 1 - .../validation/operations/AggregationTest.java | 39 6 files changed, 59 insertions(+), 8 deletions(-) --
[1/3] cassandra git commit: Allow count(*) and count(1) to be use as normal aggregation
Repository: cassandra Updated Branches: refs/heads/trunk 0edf54777 -> 86bca3a0b

Allow count(*) and count(1) to be use as normal aggregation

patch by Benjamin Lerer; reviewed by Stefania Alborghetti for CASSANDRA-10114

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4fc58513 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4fc58513 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4fc58513 Branch: refs/heads/trunk Commit: 4fc58513dce5ee6acb83ba07d9f31c26812075f9 Parents: 62fc314 Author: blerer benjamin.le...@datastax.com Authored: Thu Aug 20 14:01:37 2015 +0200 Committer: blerer benjamin.le...@datastax.com Committed: Thu Aug 20 14:01:37 2015 +0200

NEWS.txt | 1 +
src/java/org/apache/cassandra/cql3/Cql.g | 9 +
.../cassandra/cql3/functions/AggregateFcts.java | 11 ++
.../selection/AbstractFunctionSelector.java | 6 +++
.../cassandra/cql3/selection/Selector.java | 1 -
.../validation/operations/AggregationTest.java | 39 
6 files changed, 59 insertions(+), 8 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4fc58513/NEWS.txt
diff --git a/NEWS.txt b/NEWS.txt index 37a1b9e..a9cf70d 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -100,6 +100,7 @@ New features
 - The toTimestamp(date) and toUnixTimestamp(date) functions have been added to allow to convert from date into timestamp type and bigint raw value.
 - SizeTieredCompactionStrategy parameter cold_reads_to_omit has been removed.
+ - COUNT(*) and COUNT(1) can be selected with other columns or functions
 2.1.9

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4fc58513/src/java/org/apache/cassandra/cql3/Cql.g
diff --git a/src/java/org/apache/cassandra/cql3/Cql.g b/src/java/org/apache/cassandra/cql3/Cql.g index 0db09b8..3d2aba5 100644
--- a/src/java/org/apache/cassandra/cql3/Cql.g
+++ b/src/java/org/apache/cassandra/cql3/Cql.g
@@ -295,8 +295,7 @@ selectStatement returns [SelectStatement.RawStatement expr]
 }
 : K_SELECT ( K_JSON { isJson = true; } )?
-      ( ( K_DISTINCT { isDistinct = true; } )? sclause=selectClause
-      | sclause=selectCountClause )
+      ( ( K_DISTINCT { isDistinct = true; } )? sclause=selectClause )
 K_FROM cf=columnFamilyName
 ( K_WHERE wclause=whereClause )?
 ( K_ORDER K_BY orderByClause[orderings] ( ',' orderByClause[orderings] )* )?
@@ -324,6 +323,7 @@ selector returns [RawSelector s]
 unaliasedSelector returns [Selectable.Raw s]
 @init { Selectable.Raw tmp = null; }
 : ( c=cident { tmp = c; }
+  | K_COUNT '(' countArgument ')' { tmp = new Selectable.WithFunction.Raw(FunctionName.nativeFunction("countRows"), Collections.<Selectable.Raw>emptyList());}
   | K_WRITETIME '(' c=cident ')' { tmp = new Selectable.WritetimeOrTTL.Raw(c, true); }
   | K_TTL '(' c=cident ')' { tmp = new Selectable.WritetimeOrTTL.Raw(c, false); }
   | f=functionName args=selectionFunctionArgs { tmp = new Selectable.WithFunction.Raw(f, args); }
@@ -337,11 +337,6 @@ selectionFunctionArgs returns [List<Selectable.Raw> a]
 ')' { $a = args; }
 ;
-selectCountClause returns [List<RawSelector> expr]
-@init{ ColumnIdentifier alias = new ColumnIdentifier("count", false); }
-: K_COUNT '(' countArgument ')' (K_AS c=ident { alias = c; })?
-  { $expr = new ArrayList<RawSelector>(); $expr.add(new RawSelector(new Selectable.WithFunction.Raw(FunctionName.nativeFunction("countRows"), Collections.<Selectable.Raw>emptyList()), alias));}
-;
 countArgument
 : '\*'
 | i=INTEGER { if (!i.getText().equals("1")) addRecognitionError("Only COUNT(1) is supported, got COUNT(" + i.getText() + ")");}

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4fc58513/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java
diff --git a/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java b/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java index 1b22da6..41e43c0 100644
--- a/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java
+++ b/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java
@@ -38,6 +38,17 @@ import org.apache.cassandra.db.marshal.ShortType;
 public abstract class AggregateFcts
 {
 /**
+ * Checks if the specified function is the count rows (e.g. COUNT(*) or COUNT(1)) function.
+ *
+ * @param function the function to check
+ * @return <code>true</code> if the specified function is
[2/3] cassandra git commit: Merge cassandra-2.2 into cassandra-3.0
Merge cassandra-2.2 into cassandra-3.0

Conflicts: src/java/org/apache/cassandra/cql3/Cql.g

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1964a82b Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1964a82b Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1964a82b Branch: refs/heads/trunk Commit: 1964a82bf0dea261ccb1f48c36cda0de7fa8d62e Parents: be2c26f 4fc5851 Author: blerer benjamin.le...@datastax.com Authored: Thu Aug 20 14:10:11 2015 +0200 Committer: blerer benjamin.le...@datastax.com Committed: Thu Aug 20 14:10:11 2015 +0200

NEWS.txt | 1 +
src/java/org/apache/cassandra/cql3/Cql.g | 9 +
.../cassandra/cql3/functions/AggregateFcts.java | 11 ++
.../selection/AbstractFunctionSelector.java | 6 +++
.../cassandra/cql3/selection/Selector.java | 1 -
.../validation/operations/AggregationTest.java | 39 
6 files changed, 59 insertions(+), 8 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1964a82b/NEWS.txt
http://git-wip-us.apache.org/repos/asf/cassandra/blob/1964a82b/src/java/org/apache/cassandra/cql3/Cql.g
http://git-wip-us.apache.org/repos/asf/cassandra/blob/1964a82b/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java
diff --cc src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java index 153e5eb,41e43c0..7b5bdb8
--- a/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java
+++ b/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java
@@@ -32,47 -37,18 +32,58 @@@ import org.apache.cassandra.db.marshal.
  */
 public abstract class AggregateFcts
 {
+    public static Collection<AggregateFunction> all()
+    {
+        Collection<AggregateFunction> functions = new ArrayList<>();
+
+        functions.add(countRowsFunction);
+
+        // sum for primitives
+        functions.add(sumFunctionForByte);
+        functions.add(sumFunctionForShort);
+        functions.add(sumFunctionForInt32);
+        functions.add(sumFunctionForLong);
+        functions.add(sumFunctionForFloat);
+        functions.add(sumFunctionForDouble);
+        functions.add(sumFunctionForDecimal);
+        functions.add(sumFunctionForVarint);
+
+        // avg for primitives
+        functions.add(avgFunctionForByte);
+        functions.add(avgFunctionForShort);
+        functions.add(avgFunctionForInt32);
+        functions.add(avgFunctionForLong);
+        functions.add(avgFunctionForFloat);
+        functions.add(avgFunctionForDouble);
+        functions.add(avgFunctionForDecimal);
+        functions.add(avgFunctionForVarint);
+
+        // count, max, and min for all standard types
+        for (CQL3Type type : CQL3Type.Native.values())
+        {
+            if (type != CQL3Type.Native.VARCHAR) // varchar and text both mapping to UTF8Type
+            {
+                functions.add(AggregateFcts.makeCountFunction(type.getType()));
+                functions.add(AggregateFcts.makeMaxFunction(type.getType()));
+                functions.add(AggregateFcts.makeMinFunction(type.getType()));
+            }
+        }
+
+        return functions;
+    }
+
     /**
+     * Checks if the specified function is the count rows (e.g. COUNT(*) or COUNT(1)) function.
+     *
+     * @param function the function to check
+     * @return <code>true</code> if the specified function is the count rows one, <code>false</code> otherwise.
+     */
+    public static boolean isCountRows(Function function)
+    {
+        return function == countRowsFunction;
+    }
+
+    /**
      * The function used to count the number of rows of a result set. This function is called when COUNT(*) or COUNT(1)
      * is specified.
      */

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1964a82b/src/java/org/apache/cassandra/cql3/selection/Selector.java
http://git-wip-us.apache.org/repos/asf/cassandra/blob/1964a82b/test/unit/org/apache/cassandra/cql3/validation/operations/AggregationTest.java
[jira] [Updated] (CASSANDRA-10137) Consistency problem
[ https://issues.apache.org/jira/browse/CASSANDRA-10137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey updated CASSANDRA-10137: --- Description: I have 2 DCs and 3 nodes: dc1: 2 nodes; dc2: 1 node. The keyspace is

CREATE KEYSPACE itm_dhcp_test WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '2', 'DC2': '1'} AND durable_writes = true;

and the CF:

CREATE TABLE itm_dhcp_test.lock ( name text PRIMARY KEY, reason text, time timestamp, who text ) WITH bloom_filter_fp_chance = 0.01 AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}' AND comment = '' AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE';

Periodically there is a problem with deleting records. For example, execute the queries:

INSERT INTO lock (name, reason, time, who) values ('unitTest4', 'CassandraClusterLockTest', dateof(now()), 'I') IF NOT EXISTS USING TTL 60
SELECT * FROM lock WHERE name='unitTest4'
DELETE FROM lock WHERE name='unitTest4'
SELECT * FROM lock WHERE name='unitTest4'

In 20%-30% of cases the last SELECT returns a non-empty record, most often when the coordinator is node1-dc2. 
In the trace I see these messages:
{noformat}
Parsing DELETE FROM lock WHERE name='unitTest4' | node1.dc2 |   45 | SharedPool-Worker-3
Preparing statement | node1.dc2 |  151 | SharedPool-Worker-3
Executing single-partition query on users | node1.dc2 |  588 | SharedPool-Worker-1
Acquiring sstable references | node1.dc2 |  601 | SharedPool-Worker-1
Merging memtable tombstones | node1.dc2 |  634 | SharedPool-Worker-1
Key cache hit for sstable 2 | node1.dc2 |  668 | SharedPool-Worker-1
Seeking to partition beginning in data file | node1.dc2 |  674 | SharedPool-Worker-1
Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones | node1.dc2 |  737 | SharedPool-Worker-1
Merging data from memtables and 1 sstables | node1.dc2 |  743 | SharedPool-Worker-1
Read 1 live and 0 tombstoned cells | node1.dc2 |  795 | SharedPool-Worker-1
Executing single-partition query on permissions | node1.dc2 | 1653 | SharedPool-Worker-1
Acquiring sstable references | node1.dc2 | 1662 | SharedPool-Worker-1
Merging memtable tombstones | node1.dc2 | 1690 | SharedPool-Worker-1
Key cache hit for sstable 5 | node1.dc2 | 1737 | SharedPool-Worker-1
Seeking to partition indexed section in data file | node1.dc2 | 1742 | SharedPool-Worker-1
Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones | node1.dc2 | 1797 | SharedPool-Worker-1
Merging data from memtables and 1 sstables | node1.dc2 | 1805 | SharedPool-Worker-1
Read 0 live and 0 tombstoned cells | node1.dc2 | 1819 | SharedPool-Worker-1
Executing single-partition query on users | node1.dc2 | 2798 | SharedPool-Worker-4
Acquiring sstable references | node1.dc2 | 2808 | SharedPool-Worker-4
Merging memtable tombstones | node1.dc2 | 2851 | SharedPool-Worker-4
Key cache hit for sstable 2 | node1.dc2 | 2896 | SharedPool-Worker-4
Seeking to partition beginning in data file | node1.dc2 | 2903 | SharedPool-Worker-4
Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones | node1.dc2 | 2948 | SharedPool-Worker-4
Merging data from memtables and 1 sstables | node1.dc2 | 2954 | SharedPool-Worker-4
Read 1 live and 0 tombstoned cells | node1.dc2 | 3004 | SharedPool-Worker-4
{noformat}
[jira] [Resolved] (CASSANDRA-9958) Add support for the IN operator in secondary index queries
[ https://issues.apache.org/jira/browse/CASSANDRA-9958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer resolved CASSANDRA-9958. --- Resolution: Duplicate Add support for the IN operator in secondary index queries -- Key: CASSANDRA-9958 URL: https://issues.apache.org/jira/browse/CASSANDRA-9958 Project: Cassandra Issue Type: Improvement Reporter: Benjamin Lerer Assignee: Benjamin Lerer Priority: Minor On a table like: {code} CREATE TABLE t (a int, b int, c text, PRIMARY KEY (a, b)); CREATE INDEX ON t (c); {code} A query like {code} SELECT * FROM t WHERE a IN (1, 2) AND b > 3 AND c = 'test' ALLOW FILTERING; {code} is not supported. If the user wants to paginate over the result of such a query, issuing 2 queries is not a good option. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
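The "issuing 2 queries" workaround the ticket wants to avoid can be sketched as follows. This is a hypothetical client-side helper, not Cassandra code: it expands the IN list into one equality query per partition key, leaving the client to merge (and paginate over) the results.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: expand "a IN (...)" into one query per value, since
// IN is not supported together with an indexed-column restriction.
public class InQuerySplitter {
    public static List<String> expand(String table, String restOfWhere, int... inValues) {
        List<String> queries = new ArrayList<>();
        for (int v : inValues)
            queries.add("SELECT * FROM " + table + " WHERE a = " + v
                        + " AND " + restOfWhere + " ALLOW FILTERING");
        return queries;
    }

    public static void main(String[] args) {
        // Mirrors the example in the ticket: a IN (1, 2) AND b > 3 AND c = 'test'
        for (String q : expand("t", "b > 3 AND c = 'test'", 1, 2))
            System.out.println(q);
    }
}
```

As the ticket notes, merging and paginating across the per-key result sets client-side is exactly the awkwardness that native IN support would remove.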
cassandra git commit: Check column names in IN restrictions
Repository: cassandra Updated Branches: refs/heads/cassandra-2.2 de2e0a6c9 -> 62fc314c5

Check column names in IN restrictions

patch by Benjamin Lerer; reviewed by Robert Stupp for CASSANDRA-10043

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/62fc314c Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/62fc314c Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/62fc314c Branch: refs/heads/cassandra-2.2 Commit: 62fc314c582e9f218987e96c79db6c1aa0ba6c1e Parents: de2e0a6 Author: blerer benjamin.le...@datastax.com Authored: Thu Aug 20 13:45:54 2015 +0200 Committer: blerer benjamin.le...@datastax.com Committed: Thu Aug 20 13:45:54 2015 +0200
--
 .../cassandra/cql3/SingleColumnRelation.java       |  2 +-
 .../org/apache/cassandra/cql3/TokenRelation.java   |  2 +-
 .../operations/SelectMultiColumnRelationTest.java  | 12
 .../operations/SelectOrderedPartitionerTest.java   | 10 ++
 .../operations/SelectSingleColumnRelationTest.java | 17 +
 5 files changed, 41 insertions(+), 2 deletions(-)
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/62fc314c/src/java/org/apache/cassandra/cql3/SingleColumnRelation.java
--
diff --git a/src/java/org/apache/cassandra/cql3/SingleColumnRelation.java b/src/java/org/apache/cassandra/cql3/SingleColumnRelation.java
index c4c48aa..b206631 100644
--- a/src/java/org/apache/cassandra/cql3/SingleColumnRelation.java
+++ b/src/java/org/apache/cassandra/cql3/SingleColumnRelation.java
@@ -153,7 +153,7 @@ public final class SingleColumnRelation extends Relation
     protected Restriction newINRestriction(CFMetaData cfm, VariableSpecifications boundNames) throws InvalidRequestException
     {
-        ColumnDefinition columnDef = cfm.getColumnDefinition(getEntity().prepare(cfm));
+        ColumnDefinition columnDef = toColumnDefinition(cfm, entity);
         List<? extends ColumnSpecification> receivers = toReceivers(columnDef);
         List<Term> terms = toTerms(receivers, inValues, cfm.ksName, boundNames);
         if (terms == null)
http://git-wip-us.apache.org/repos/asf/cassandra/blob/62fc314c/src/java/org/apache/cassandra/cql3/TokenRelation.java
--
diff --git a/src/java/org/apache/cassandra/cql3/TokenRelation.java b/src/java/org/apache/cassandra/cql3/TokenRelation.java
index 5896fae..46a812c 100644
--- a/src/java/org/apache/cassandra/cql3/TokenRelation.java
+++ b/src/java/org/apache/cassandra/cql3/TokenRelation.java
@@ -109,7 +109,7 @@ public final class TokenRelation extends Relation
     @Override
     public String toString()
     {
-        return String.format("token(%s) %s %s", Tuples.tupleToString(entities), relationType, value);
+        return String.format("token%s %s %s", Tuples.tupleToString(entities), relationType, value);
     }
 /**
http://git-wip-us.apache.org/repos/asf/cassandra/blob/62fc314c/test/unit/org/apache/cassandra/cql3/validation/operations/SelectMultiColumnRelationTest.java
--
diff --git a/test/unit/org/apache/cassandra/cql3/validation/operations/SelectMultiColumnRelationTest.java b/test/unit/org/apache/cassandra/cql3/validation/operations/SelectMultiColumnRelationTest.java
index 84343a7..b3232d5 100644
--- a/test/unit/org/apache/cassandra/cql3/validation/operations/SelectMultiColumnRelationTest.java
+++ b/test/unit/org/apache/cassandra/cql3/validation/operations/SelectMultiColumnRelationTest.java
@@ -1011,4 +1011,16 @@ public class SelectMultiColumnRelationTest extends CQLTester
                  row(0, 0, 2, 2, 2),
                  row(0, 0, 3, 3, 3));
     }
+
+    @Test
+    public void testInvalidColumnNames() throws Throwable
+    {
+        createTable("CREATE TABLE %s (a int, b int, c int, d int, PRIMARY KEY (a, b, c))");
+        assertInvalidMessage("Undefined name e in where clause ('(b, e) = (0, 0)')", "SELECT * FROM %s WHERE (b, e) = (0, 0)");
+        assertInvalidMessage("Undefined name e in where clause ('(b, e) IN ((0, 1), (2, 4))')", "SELECT * FROM %s WHERE (b, e) IN ((0, 1), (2, 4))");
+        assertInvalidMessage("Undefined name e in where clause ('(b, e) > (0, 1)')", "SELECT * FROM %s WHERE (b, e) > (0, 1) and b = 2");
+        assertInvalidMessage("Aliases aren't allowed in the where clause ('(b, e) = (0, 0)')", "SELECT c AS e FROM %s WHERE (b, e) = (0, 0)");
+        assertInvalidMessage("Aliases aren't allowed in the where clause ('(b, e) IN ((0, 1), (2, 4))')", "SELECT c AS e FROM %s WHERE (b, e) IN ((0, 1), (2, 4))");
+        assertInvalidMessage("Aliases aren't allowed in the where clause ('(b, e) > (0, 1)')",
[2/2] cassandra git commit: Merge cassandra-2.2 into cassandra-3.0
Merge cassandra-2.2 into cassandra-3.0 Conflicts: src/java/org/apache/cassandra/cql3/Cql.g Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1964a82b Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1964a82b Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1964a82b Branch: refs/heads/cassandra-3.0 Commit: 1964a82bf0dea261ccb1f48c36cda0de7fa8d62e Parents: be2c26f 4fc5851 Author: blerer benjamin.le...@datastax.com Authored: Thu Aug 20 14:10:11 2015 +0200 Committer: blerer benjamin.le...@datastax.com Committed: Thu Aug 20 14:10:11 2015 +0200 -- NEWS.txt| 1 + src/java/org/apache/cassandra/cql3/Cql.g| 9 + .../cassandra/cql3/functions/AggregateFcts.java | 11 ++ .../selection/AbstractFunctionSelector.java | 6 +++ .../cassandra/cql3/selection/Selector.java | 1 - .../validation/operations/AggregationTest.java | 39 6 files changed, 59 insertions(+), 8 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/1964a82b/NEWS.txt -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/1964a82b/src/java/org/apache/cassandra/cql3/Cql.g -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/1964a82b/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java -- diff --cc src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java index 153e5eb,41e43c0..7b5bdb8 --- a/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java +++ b/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java @@@ -32,47 -37,18 +32,58 @@@ import org.apache.cassandra.db.marshal. 
 */
 public abstract class AggregateFcts
 {
+    public static Collection<AggregateFunction> all()
+    {
+        Collection<AggregateFunction> functions = new ArrayList<>();
+
+        functions.add(countRowsFunction);
+
+        // sum for primitives
+        functions.add(sumFunctionForByte);
+        functions.add(sumFunctionForShort);
+        functions.add(sumFunctionForInt32);
+        functions.add(sumFunctionForLong);
+        functions.add(sumFunctionForFloat);
+        functions.add(sumFunctionForDouble);
+        functions.add(sumFunctionForDecimal);
+        functions.add(sumFunctionForVarint);
+
+        // avg for primitives
+        functions.add(avgFunctionForByte);
+        functions.add(avgFunctionForShort);
+        functions.add(avgFunctionForInt32);
+        functions.add(avgFunctionForLong);
+        functions.add(avgFunctionForFloat);
+        functions.add(avgFunctionForDouble);
+        functions.add(avgFunctionForDecimal);
+        functions.add(avgFunctionForVarint);
+
+        // count, max, and min for all standard types
+        for (CQL3Type type : CQL3Type.Native.values())
+        {
+            if (type != CQL3Type.Native.VARCHAR) // varchar and text both mapping to UTF8Type
+            {
+                functions.add(AggregateFcts.makeCountFunction(type.getType()));
+                functions.add(AggregateFcts.makeMaxFunction(type.getType()));
+                functions.add(AggregateFcts.makeMinFunction(type.getType()));
+            }
+        }
+
+        return functions;
+    }
+
+    /**
+     * Checks if the specified function is the count rows (e.g. COUNT(*) or COUNT(1)) function.
+     *
+     * @param function the function to check
+     * @return <code>true</code> if the specified function is the count rows one, <code>false</code> otherwise.
+     */
+    public static boolean isCountRows(Function function)
+    {
+        return function == countRowsFunction;
+    }
+
     /**
      * The function used to count the number of rows of a result set. This function is called when COUNT(*) or COUNT(1)
      * is specified.
      */
http://git-wip-us.apache.org/repos/asf/cassandra/blob/1964a82b/src/java/org/apache/cassandra/cql3/selection/Selector.java
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/1964a82b/test/unit/org/apache/cassandra/cql3/validation/operations/AggregationTest.java
--
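The isCountRows helper merged above compares by reference rather than equals(). A simplified sketch (not the real Cassandra classes; the interface and field names only mirror the diff) shows why identity comparison suffices when the count-rows function is a single shared instance:

```java
// Sketch: countRowsFunction is one shared instance, so "f == countRowsFunction"
// is an unambiguous and cheap way to recognize COUNT(*) / COUNT(1).
public class CountRowsCheck {
    public interface Function {}

    // one shared instance, analogous to AggregateFcts.countRowsFunction
    public static final Function countRowsFunction = new Function() {};

    public static boolean isCountRows(Function f) {
        return f == countRowsFunction; // identity, not equals()
    }

    public static void main(String[] args) {
        Function other = new Function() {};
        System.out.println(isCountRows(countRowsFunction)); // true
        System.out.println(isCountRows(other));             // false
    }
}
```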
[1/2] cassandra git commit: Allow count(*) and count(1) to be used as normal aggregation
Repository: cassandra Updated Branches: refs/heads/cassandra-3.0 be2c26f18 -> 1964a82bf

Allow count(*) and count(1) to be used as normal aggregation

patch by Benjamin Lerer; reviewed by Stefania Alborghetti for CASSANDRA-10114

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4fc58513 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4fc58513 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4fc58513 Branch: refs/heads/cassandra-3.0 Commit: 4fc58513dce5ee6acb83ba07d9f31c26812075f9 Parents: 62fc314 Author: blerer benjamin.le...@datastax.com Authored: Thu Aug 20 14:01:37 2015 +0200 Committer: blerer benjamin.le...@datastax.com Committed: Thu Aug 20 14:01:37 2015 +0200
--
 NEWS.txt                                        |  1 +
 src/java/org/apache/cassandra/cql3/Cql.g        |  9 +
 .../cassandra/cql3/functions/AggregateFcts.java | 11 ++
 .../selection/AbstractFunctionSelector.java     |  6 +++
 .../cassandra/cql3/selection/Selector.java      |  1 -
 .../validation/operations/AggregationTest.java  | 39
 6 files changed, 59 insertions(+), 8 deletions(-)
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/4fc58513/NEWS.txt
--
diff --git a/NEWS.txt b/NEWS.txt
index 37a1b9e..a9cf70d 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -100,6 +100,7 @@ New features
 - The toTimestamp(date) and toUnixTimestamp(date) functions have been added to allow to convert from date into timestamp type and bigint raw value.
 - SizeTieredCompactionStrategy parameter cold_reads_to_omit has been removed.
+ - COUNT(*) and COUNT(1) can be selected with other columns or functions
 2.1.9
http://git-wip-us.apache.org/repos/asf/cassandra/blob/4fc58513/src/java/org/apache/cassandra/cql3/Cql.g
--
diff --git a/src/java/org/apache/cassandra/cql3/Cql.g b/src/java/org/apache/cassandra/cql3/Cql.g
index 0db09b8..3d2aba5 100644
--- a/src/java/org/apache/cassandra/cql3/Cql.g
+++ b/src/java/org/apache/cassandra/cql3/Cql.g
@@ -295,8 +295,7 @@ selectStatement returns [SelectStatement.RawStatement expr]
     }
     : K_SELECT ( K_JSON { isJson = true; } )?
-      ( ( K_DISTINCT { isDistinct = true; } )? sclause=selectClause
-      | sclause=selectCountClause )
+      ( ( K_DISTINCT { isDistinct = true; } )? sclause=selectClause )
       K_FROM cf=columnFamilyName
       ( K_WHERE wclause=whereClause )?
       ( K_ORDER K_BY orderByClause[orderings] ( ',' orderByClause[orderings] )* )?
@@ -324,6 +323,7 @@ selector returns [RawSelector s]
 unaliasedSelector returns [Selectable.Raw s]
 @init { Selectable.Raw tmp = null; }
     : ( c=cident { tmp = c; }
+      | K_COUNT '(' countArgument ')' { tmp = new Selectable.WithFunction.Raw(FunctionName.nativeFunction("countRows"), Collections.<Selectable.Raw>emptyList()); }
       | K_WRITETIME '(' c=cident ')' { tmp = new Selectable.WritetimeOrTTL.Raw(c, true); }
       | K_TTL '(' c=cident ')' { tmp = new Selectable.WritetimeOrTTL.Raw(c, false); }
       | f=functionName args=selectionFunctionArgs { tmp = new Selectable.WithFunction.Raw(f, args); }
@@ -337,11 +337,6 @@ selectionFunctionArgs returns [List<Selectable.Raw> a]
       ')' { $a = args; }
     ;
-selectCountClause returns [List<RawSelector> expr]
-    @init{ ColumnIdentifier alias = new ColumnIdentifier("count", false); }
-    : K_COUNT '(' countArgument ')' (K_AS c=ident { alias = c; })?
-      { $expr = new ArrayList<RawSelector>(); $expr.add(new RawSelector(new Selectable.WithFunction.Raw(FunctionName.nativeFunction("countRows"), Collections.<Selectable.Raw>emptyList()), alias)); }
-    ;
 countArgument
     : '\*'
     | i=INTEGER { if (!i.getText().equals("1")) addRecognitionError("Only COUNT(1) is supported, got COUNT(" + i.getText() + ")"); }
http://git-wip-us.apache.org/repos/asf/cassandra/blob/4fc58513/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java
--
diff --git a/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java b/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java
index 1b22da6..41e43c0 100644
--- a/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java
+++ b/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java
@@ -38,6 +38,17 @@ import org.apache.cassandra.db.marshal.ShortType;
 public abstract class AggregateFcts
 {
     /**
+     * Checks if the specified function is the count rows (e.g. COUNT(*) or COUNT(1)) function.
+     *
+     * @param function the function to check
+     * @return <code>true</code> if the
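The countArgument grammar rule's semantic check is small enough to sketch on its own (illustrative Java, not the generated ANTLR code): only `*` and the literal `1` are accepted, and any other integer literal produces the recognition error seen in the diff.

```java
// Sketch of the countArgument check: COUNT(*) and COUNT(1) are valid,
// everything else yields "Only COUNT(1) is supported, got COUNT(n)".
public class CountArgument {
    public static boolean isValid(String arg) {
        return arg.equals("*") || arg.equals("1");
    }

    public static String errorFor(String arg) {
        return isValid(arg) ? null
             : "Only COUNT(1) is supported, got COUNT(" + arg + ")";
    }

    public static void main(String[] args) {
        System.out.println(errorFor("2")); // Only COUNT(1) is supported, got COUNT(2)
    }
}
```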
[jira] [Commented] (CASSANDRA-10139) Windows utest 2.2: NanoTimeToCurrentTimeMillisTest.testTimestampOrdering intermittent failure
[ https://issues.apache.org/jira/browse/CASSANDRA-10139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706028#comment-14706028 ] Paulo Motta commented on CASSANDRA-10139: - I ran the test multiple times on Windows, and it passes, so this problem happens mostly on CI (intermittently). I found a minor bug in the test, in the assignment of the {{nowNanos}} variable: {code} nowNanos = Math.max(now, System.nanoTime()); {code} But {{now}} is actually {{System.currentTimeMillis()}}, so it will always be smaller than {{System.nanoTime()}}. I changed the assignment to: {code} nowNanos = Math.max(nowNanos, System.nanoTime()); {code} This could be the cause of the failures if there is a clock synchronization service running on Windows, but I'm not sure of that. I think this might be caused by differences in clock resolution between Linux and Windows. In the latest test failures there were the following time differences: * convertedNow - lastConverted = [11ms|http://cassci.datastax.com/view/win32/job/cassandra-2.2_utest_win32/77/testReport/org.apache.cassandra.utils/NanoTimeToCurrentTimeMillisTest/testTimestampOrdering/], [6ms|http://cassci.datastax.com/view/win32/job/cassandra-2.2_utest_win32/77/testReport/org.apache.cassandra.utils/NanoTimeToCurrentTimeMillisTest/testTimestampOrdering/] * convertedNow - now = [23ms|http://cassci.datastax.com/view/win32/job/cassandra-2.2_utest_win32/16/testReport/org.apache.cassandra.utils/NanoTimeToCurrentTimeMillisTest/testTimestampOrdering/], [17ms|http://cassci.datastax.com/view/win32/job/cassandra-2.2_utest_win32/53/testReport/org.apache.cassandra.utils/NanoTimeToCurrentTimeMillisTest/testTimestampOrdering/], [19ms|http://cassci.datastax.com/view/win32/job/cassandra-2.2_utest_win32/37/testReport/org.apache.cassandra.utils/NanoTimeToCurrentTimeMillisTest/testTimestampOrdering/],
[5ms|http://cassci.datastax.com/view/win32/job/cassandra-2.2_utest_win32/91/testReport/org.apache.cassandra.utils/NanoTimeToCurrentTimeMillisTest/testTimestampOrdering/] Based on these numbers, I added a tolerance on Windows of 25ms for {{convertedNow - now}} and a tolerance of 15ms for {{convertedNow - lastConverted}}. These numbers are roughly consistent with the section *Clocks and Timers on Windows* of this [oracle blog post|https://blogs.oracle.com/dholmes/entry/inside_the_hotspot_vm_clocks]: bq. On Windows, System.currentTimeMillis() is updated at a constant rate regardless of how the timer interrupt has been programmed - depending on the platform this will either be 10ms or 15ms (this value seems tied to the default interrupt period). The patch is available [here|https://github.com/apache/cassandra/compare/cassandra-2.2...pauloricardomg:10139-2.2] for review. Unit tests are available below: * [2.2|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-10139-2.2-testall/lastCompletedBuild/testReport/] * [3.0|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-10139-3.0-testall/lastCompletedBuild/testReport/] * [trunk|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-10139-trunk-testall/lastCompletedBuild/testReport/] Adding [~aweisberg] as reviewer, since he is the original author of the test.
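The tolerance idea described above can be sketched as follows (illustrative names, not the actual test code): a millis value converted from {{nanoTime}} may trail {{currentTimeMillis}} by up to the platform clock resolution, so the ordering assertion is given a slack of about 25 ms on Windows and much less elsewhere.

```java
// Sketch: allow convertedNow to lag now by a platform-dependent tolerance
// (25 ms on Windows per the comment above; ~1 ms on Linux is an assumption).
public class TimestampTolerance {
    public static boolean orderingHolds(long convertedNowMillis, long nowMillis, long toleranceMillis) {
        return convertedNowMillis >= nowMillis - toleranceMillis;
    }

    public static void main(String[] args) {
        long toleranceMillis =
            System.getProperty("os.name", "").toLowerCase().contains("windows") ? 25 : 1;
        // Values from the failing CI run: convertedNow trailed now by 3 ms.
        System.out.println(orderingHolds(1440076083032L, 1440076083035L, toleranceMillis));
    }
}
```

With the 25 ms Windows tolerance the failing CI values pass; with a 1 ms tolerance the same 3 ms lag still fails, which is why the tolerance must track the coarser Windows clock.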
Windows utest 2.2: NanoTimeToCurrentTimeMillisTest.testTimestampOrdering intermittent failure - Key: CASSANDRA-10139 URL: https://issues.apache.org/jira/browse/CASSANDRA-10139 Project: Cassandra Issue Type: Bug Reporter: Joshua McKenzie Assignee: Paulo Motta Labels: Windows Fix For: 2.2.x {noformat} now = 1440076083035 convertedNow = 1440076083032 in iteration 8890001 junit.framework.AssertionFailedError: now = 1440076083035 convertedNow = 1440076083032 in iteration 8890001 at org.apache.cassandra.utils.NanoTimeToCurrentTimeMillisTest.testTimestampOrdering(NanoTimeToCurrentTimeMillisTest.java:48) {noformat} History: [intermittent|http://cassci.datastax.com/view/cassandra-2.2/job/cassandra-2.2_utest_win32/lastCompletedBuild/testReport/org.apache.cassandra.utils/NanoTimeToCurrentTimeMillisTest/testTimestampOrdering/history/] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10086) Add a CLEAR cqlsh command to clear the console
[ https://issues.apache.org/jira/browse/CASSANDRA-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul O'Fallon updated CASSANDRA-10086: -- Attachment: 10086v2.txt Add a CLEAR cqlsh command to clear the console Key: CASSANDRA-10086 URL: https://issues.apache.org/jira/browse/CASSANDRA-10086 Project: Cassandra Issue Type: Improvement Reporter: Paul O'Fallon Priority: Trivial Labels: cqlsh, doc-impacting Attachments: 10086.txt, 10086v2.txt It would be very helpful to have a CLEAR command to clear the cqlsh console. I learned (after researching a patch for this) that lowercase CTRL+L will clear the screen, but having a discrete command would make that more obvious. To match the expectations of Windows users, an alias to CLS would be nice as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
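The idea behind the proposed command can be sketched as follows. This is illustrative only (the actual cqlsh patch is Python): CLEAR, with CLS as the Windows-friendly alias the ticket suggests, emits the ANSI "erase display" plus "cursor home" sequence, which is effectively what Ctrl+L triggers in a readline-driven shell.

```java
// Sketch: map CLEAR/CLS to the ANSI clear-screen escape sequence.
public class ClearScreen {
    public static final String ANSI_CLEAR = "\u001b[2J\u001b[H";

    /** Returns the escape sequence for a clear command, or null otherwise. */
    public static String handle(String input) {
        String cmd = input.trim().toUpperCase();
        return (cmd.equals("CLEAR") || cmd.equals("CLS")) ? ANSI_CLEAR : null;
    }

    public static void main(String[] args) {
        System.out.print(handle("clear")); // clears an ANSI-capable terminal
    }
}
```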
[jira] [Commented] (CASSANDRA-9840) global_row_key_cache_test.py fails; loses mutations on cluster restart
[ https://issues.apache.org/jira/browse/CASSANDRA-9840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706133#comment-14706133 ] zhaohang commented on CASSANDRA-9840: - global_row_key_cache_test.py fails; loses mutations on cluster restart -- Key: CASSANDRA-9840 URL: https://issues.apache.org/jira/browse/CASSANDRA-9840 Project: Cassandra Issue Type: Sub-task Reporter: Shawn Kumar Priority: Blocker Fix For: 3.0.x Attachments: node1.log, node2.log, node3.log, noseout.txt This test is currently failing on trunk. I've attached the test output and logs. It seems that the failure of the test doesn't necessarily have anything to do with global row/key caches - as on the initial loop of the test [neither are used|https://github.com/riptano/cassandra-dtest/blob/master/global_row_key_cache_test.py#L15] and we still hit failure. The test itself fails when a second validation of values after a cluster restart fails to capture deletes issued prior to the restart and first successful validation. However, if I add flushes prior to restarting the cluster the test completes successfully, implying an issue with loss of in-memory mutations due to the cluster restart. Initially I had thought this might be due to CASSANDRA-9669, but as Benedict pointed out, the fact that this test has been succeeding consistently on both the 2.1 and 2.2 branches indicates there may be another issue at hand. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (CASSANDRA-9840) global_row_key_cache_test.py fails; loses mutations on cluster restart
[ https://issues.apache.org/jira/browse/CASSANDRA-9840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhaohang updated CASSANDRA-9840: Comment: was deleted (was: ) global_row_key_cache_test.py fails; loses mutations on cluster restart -- Key: CASSANDRA-9840 URL: https://issues.apache.org/jira/browse/CASSANDRA-9840 Project: Cassandra Issue Type: Sub-task Reporter: Shawn Kumar Priority: Blocker Fix For: 3.0.x Attachments: node1.log, node2.log, node3.log, noseout.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-10150) Cassandra read latency potentially caused by memory leak
Cheng Ren created CASSANDRA-10150: - Summary: Cassandra read latency potentially caused by memory leak Key: CASSANDRA-10150 URL: https://issues.apache.org/jira/browse/CASSANDRA-10150 Project: Cassandra Issue Type: Bug Components: Core Environment: cassandra 2.0.12 Reporter: Cheng Ren We are currently migrating to a new Cassandra cluster which is multi-region on EC2. Our previous cluster was also on EC2 but only in the east region. In addition we have upgraded to Cassandra 2.0.12 from 2.0.4 and from Ubuntu 12 to 14. We are investigating a Cassandra latency problem on our new cluster. The symptom is that over a long period of time (12-16 hours) the TP90-95 read latency degrades to the point of being well above our SLAs. During normal operation our TP95 for a 50-key lookup is 75ms; when fully degraded, we are facing 300ms TP95 latencies. Doing a rolling restart resolves the problem. We are noticing a high correlation between the Old Gen heap usage (and how much is freed up) and the high latencies. We are running with a max heap size of 12GB and a max new-gen size of 2GB. Below is a chart of the heap usage over a 24 hour period, right below it a chart of TP95 latencies (a mixed workload of 50-key and single-key lookups), and the third image is a look at CMS Old Gen memory usage. Overall heap usage over 24 hrs: (chart attachment not reproduced)
TP95 latencies over 24 hours: (chart attachment not reproduced) OldGen memory usage over 24 hours: (chart attachment not reproduced) You can see from this that the old gen section of our heap is what is using up the majority of the heap space. We cannot figure out why the memory is not being collected during a full GC. For reference, in our old Cassandra cluster, the behavior is that the full GC will clear up the majority of the heap space; see the image below from an old production node operating normally: (chart attachment not reproduced) From the heap dump file we found that most memory is consumed by unreachable objects. With further analysis we were able to see that those objects are RMIConnectionImpl$CombinedClassLoader$ClassLoaderWrapper (holding 4GB of memory) and java.security.ProtectionDomain (holding 2GB). The only place we know Cassandra uses RMI is in JMX, but does anyone have any clue about where else those objects are used, and why they take so much memory? It would also be great if someone could offer further debugging tips on the latency or GC issue.
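One restart-free way to track the Old Gen growth described above is to sample the tenured-generation pool through the standard java.lang.management MXBean API. This is a generic diagnostic sketch, not something from the ticket; pool names vary by collector ("CMS Old Gen", "PS Old Gen", "G1 Old Gen", "Tenured Gen"), hence the substring match.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

// Sample the tenured (old generation) pool's current usage in-process.
public class OldGenSampler {
    public static long oldGenUsedBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans())
            if (pool.getName().contains("Old Gen") || pool.getName().contains("Tenured"))
                return pool.getUsage().getUsed();
        return -1; // no tenured pool matched this JVM's naming
    }

    public static void main(String[] args) {
        System.out.println("old gen used: " + oldGenUsedBytes() + " bytes");
    }
}
```

Logging this periodically (or exposing it over JMX, as nodetool does for other metrics) would show whether full GCs reclaim the pool, which is exactly the difference between the old and new clusters in the charts.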
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10108) Windows dtest 3.0: sstablesplit_test.py:TestSSTableSplit.split_test fails
[ https://issues.apache.org/jira/browse/CASSANDRA-10108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706032#comment-14706032 ] Paulo Motta commented on CASSANDRA-10108: - couldn't reproduce locally. waiting for [cassandra-dtest PR|https://github.com/riptano/cassandra-dtest/pull/485] and [ccm PR|https://github.com/pcmanus/ccm/pull/363] to be reviewed, so more debugging information can be collected from cassci. Windows dtest 3.0: sstablesplit_test.py:TestSSTableSplit.split_test fails - Key: CASSANDRA-10108 URL: https://issues.apache.org/jira/browse/CASSANDRA-10108 Project: Cassandra Issue Type: Sub-task Reporter: Joshua McKenzie Assignee: Paulo Motta Labels: Windows Fix For: 3.0.x Locally: {noformat} -- ma-28-big-Data.db -
Exception in thread "main" java.lang.NoClassDefFoundError: org/supercsv/prefs/CsvPreference$Builder
    at org.apache.cassandra.config.Config.<clinit>(Config.java:240)
    at org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:105)
    at org.apache.cassandra.service.StorageService.getPartitioner(StorageService.java:220)
    at org.apache.cassandra.service.StorageService.<init>(StorageService.java:206)
    at org.apache.cassandra.service.StorageService.<clinit>(StorageService.java:211)
    at org.apache.cassandra.schema.LegacySchemaTables.getSchemaPartitionsForTable(LegacySchemaTables.java:295)
    at org.apache.cassandra.schema.LegacySchemaTables.readSchemaFromSystemTables(LegacySchemaTables.java:210)
    at org.apache.cassandra.config.Schema.loadFromDisk(Schema.java:108)
    at org.apache.cassandra.tools.StandaloneSplitter.main(StandaloneSplitter.java:58)
Caused by: java.lang.ClassNotFoundException: org.supercsv.prefs.CsvPreference$Builder
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 9 more
Number of sstables after split: 1. expected 21.0
{noformat} on CI: {noformat} 21.0 not less than or equal to 2
and [node1 ERROR] Exception calling CompareTo with 1 argument(s): Object must be of type String.
At D:\temp\dtest-i3xwjx\test\node1\conf\cassandra-env.ps1:336 char:9
+ if ($env:JVM_VERSION.CompareTo("1.8.0_40" -eq -1))
+ ~
+ CategoryInfo : NotSpecified: (:) [], MethodInvocationException
+ FullyQualifiedErrorId : ArgumentException
-- ma-28-big-Data.db -
{noformat} Failure history: [consistent|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest_win32/lastCompletedBuild/testReport/sstablesplit_test/TestSSTableSplit/split_test/history/] Env: both CI and local -- This message was sent by Atlassian JIRA (v6.3.4#6332)
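The PowerShell error above appears to stem from a misplaced parenthesis: the script evaluates `("1.8.0_40" -eq -1)` to a boolean before passing it to CompareTo, hence "Object must be of type String". The intended version check, sketched in Java (lexicographic string comparison, which is what such a check relies on for these fixed-width version strings; this is an illustration, not the shipped fix):

```java
// Sketch: compare JVM version strings lexicographically, as the env script intends.
public class JvmVersionCheck {
    public static boolean olderThan(String jvmVersion, String baseline) {
        return jvmVersion.compareTo(baseline) < 0;
    }

    public static void main(String[] args) {
        System.out.println(olderThan("1.8.0_31", "1.8.0_40")); // true
        System.out.println(olderThan("1.8.0_45", "1.8.0_40")); // false
    }
}
```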
[jira] [Commented] (CASSANDRA-10149) Make nodetool cfstats and cfhistograms consistent
[ https://issues.apache.org/jira/browse/CASSANDRA-10149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705926#comment-14705926 ] Joel Knighton commented on CASSANDRA-10149: --- Quick note: in the description, the description of the parameters is inverted. cfstats takes keyspace.table and cfhistograms takes keyspace table. For what it's worth, I'd bet (or at least it makes sense to me) that cfstats uses keyspace.table because it is possible to give just a keyspace to cfstats. Using a space means the current parser would have to distinguish between keyspace1 keyspace2 and keyspace1 table1 as arguments. In the event that keyspace2 and table1 have the same name, there's no way to distinguish intent. It may make sense to add keyspace.table as an optional format for cfhistograms. Make nodetool cfstats and cfhistograms consistent - Key: CASSANDRA-10149 URL: https://issues.apache.org/jira/browse/CASSANDRA-10149 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Jeremy Hanna Labels: lhf Currently nodetool cfstats and cfhistograms (for a keyspace and table) have different syntax: cfstats uses keyspace table, while cfhistograms uses keyspace.table. We should unify it one way or the other (or both). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
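The ambiguity Joel describes can be made concrete with a small sketch (illustrative only, not the nodetool implementation): with space-separated arguments, "ks1 ks2" (two keyspaces) and "ks1 t1" (keyspace plus table) cannot be told apart, while the dotted form is unambiguous, so a parser accepting "keyspace.table" for cfhistograms might look like this:

```java
// Sketch: parse "keyspace" or "keyspace.table" into its parts.
public class CfArgParser {
    /** Returns { keyspace, table }, with table null when absent. */
    public static String[] parse(String arg) {
        int dot = arg.indexOf('.');
        if (dot < 0)
            return new String[] { arg, null };        // keyspace only
        return new String[] { arg.substring(0, dot),  // keyspace.table
                              arg.substring(dot + 1) };
    }

    public static void main(String[] args) {
        String[] kt = parse("itm_dhcp_test.lock");
        System.out.println(kt[0] + " / " + kt[1]); // itm_dhcp_test / lock
    }
}
```

Accepting the dotted form alongside the existing space form would let both tools take either syntax without introducing the keyspace-vs-table ambiguity.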
[jira] [Comment Edited] (CASSANDRA-10084) Very slow performance streaming a large query from a single CF
[ https://issues.apache.org/jira/browse/CASSANDRA-10084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705971#comment-14705971 ] Brent Haines edited comment on CASSANDRA-10084 at 8/21/15 12:42 AM: I did a lot of tuning with prefetching, threads per client, and added multithreading to our query collator. Performance has improved a lot, but it doesn't come close to what we had before we added the collection to the table. Right now, I have discovered a query for a specific index value that is particularly slow: 3 minutes for 10,000 records. At first, it stopped after returning only about 1% of the data, but did not produce any kind of error or exception. I did a repair on one of the nodes for that partition key and it seems to be working, but is very slow now. I have attached stack dumps for every node involved in the query, though I am not certain which one is doing work at any given time. Stupid question - is there a quick way to see which nodes own the key for a specific query? I turn trace on and run the query a bunch of times to get all three. Please see the attached profiles for the 3 nodes. Also FYI - we run an incremental repair nightly. The repairs usually finish, but sometimes, in the morning, nodes report *much* more storage than they actually own. They all own about 60 to 90GB, but after repair some nodes will say they own 2+ TB! Restarting reveals that they are way behind on compaction, and it takes about 2 hours to clear that up. If I try a nodetool compactionstats before restarting, it will hang until timeout. Final question: is upgrading to 2.2 a safe bet for some of these issues? Specifically the halting of compaction during repair? was (Author: thebrenthaines): I did a lot of tuning with prefetching, threads per client, and added multithreading to our query collator. Performance has improved a lot, but it doesn't come close to what we had before we added the collection to the table. 
Right now, I have discovered a query for a specific index value that is particularly slow: 3 minutes for 10,000 records. At first, it returned only about 1% of the data without error. I did a repair on one of the nodes for that partition key and it seems to be working, but is very slow now. I have attached stack dumps for every node, though I am not certain which one is working at any given time. Stupid question - is there a quick way to see what nodes own the key for a specific query? I turn trace on and run the query a bunch of times to get all three. Please see the attached profiles for the 3 nodes. We run an incremental repair nightly. They usually finish, but sometimes nodes report *much* more storage than they actually own. They all own about 60 to 90GB, but after repair some nodes will say they own 2+ TB! Restarting reveals that they are way behind on compaction and take about 2 hours to clear that up. Very slow performance streaming a large query from a single CF -- Key: CASSANDRA-10084 URL: https://issues.apache.org/jira/browse/CASSANDRA-10084 Project: Cassandra Issue Type: Bug Environment: Cassandra 2.1.8 12GB EC2 instance 12 node cluster 32 concurrent reads 32 concurrent writes 6GB heap space Reporter: Brent Haines Attachments: cassandra.yaml, node1.txt, node2.txt, node3.txt We have a relatively simple column family that we use to track event data from different providers. We have been utilizing it for some time. 
Here is what it looks like: {code} CREATE TABLE data.stories_by_text ( ref_id timeuuid, second_type text, second_value text, object_type text, field_name text, value text, story_id timeuuid, data map<text, text>, PRIMARY KEY ((ref_id, second_type, second_value, object_type, field_name), value, story_id) ) WITH CLUSTERING ORDER BY (value ASC, story_id ASC) AND bloom_filter_fp_chance = 0.01 AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}' AND comment = 'Searchable fields and actions in a story are indexed by ref id which corresponds to a brand, app, app instance, or user.' AND compaction = {'min_threshold': '4', 'cold_reads_to_omit': '0.0', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE'; {code} We will, on a daily basis, pull a query of the complete data for a
[jira] [Commented] (CASSANDRA-10110) Windows dtest 3.0: udtencoding_test.py:TestUDTEncoding.udt_test
[ https://issues.apache.org/jira/browse/CASSANDRA-10110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706123#comment-14706123 ] Paulo Motta commented on CASSANDRA-10110: - Test is passing again, locally and on [ci|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest_win32/lastCompletedBuild/testReport/udtencoding_test/TestUDTEncoding/udt_test/]. Will resolve for now, and reopen if it becomes flakey. Windows dtest 3.0: udtencoding_test.py:TestUDTEncoding.udt_test --- Key: CASSANDRA-10110 URL: https://issues.apache.org/jira/browse/CASSANDRA-10110 Project: Cassandra Issue Type: Sub-task Reporter: Joshua McKenzie Labels: Windows Currently broken by CASSANDRA-7066 (thus depending on CASSANDRA-10109). Error message from CI yesterday was: {noformat} File D:\Python27\lib\unittest\case.py, line 329, in run testMethod() File D:\jenkins\workspace\cassandra-3.0_dtest_win32\cassandra-dtest\udtencoding_test.py, line 15, in udt_test cluster.populate(3).start() File build\bdist.win-amd64\egg\ccmlib\cluster.py, line 249, in start p = node.start(update_pid=False, jvm_args=jvm_args, profile_options=profile_options) File build\bdist.win-amd64\egg\ccmlib\node.py, line 447, in start common.check_socket_available(itf) File build\bdist.win-amd64\egg\ccmlib\common.py, line 343, in check_socket_available raise UnavailableSocketError(Inet address %s:%s is not available: %s % (addr, port, msg)) 'Inet address 127.0.0.1:7000 is not available: [Errno 10013] An attempt was made to access a socket in a way forbidden by its access permissions\n begin captured logging \ndtest: DEBUG: cluster ccm directory: d:\\temp\\dtest-dpsz3i\n- end captured logging -' {noformat} Failure history: [regression in build #17|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest_win32/lastCompletedBuild/testReport/udtencoding_test/TestUDTEncoding/udt_test/history/]. Doesn't look like there was any real change to explain that though. 
Env: Not sure if repro locally since CASSANDRA-7066 error is in the way. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-10110) Windows dtest 3.0: udtencoding_test.py:TestUDTEncoding.udt_test
[ https://issues.apache.org/jira/browse/CASSANDRA-10110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta resolved CASSANDRA-10110. - Resolution: Invalid Fix Version/s: (was: 3.0.x) Windows dtest 3.0: udtencoding_test.py:TestUDTEncoding.udt_test --- Key: CASSANDRA-10110 URL: https://issues.apache.org/jira/browse/CASSANDRA-10110 Project: Cassandra Issue Type: Sub-task Reporter: Joshua McKenzie Labels: Windows Currently broken by CASSANDRA-7066 (thus depending on CASSANDRA-10109). Error message from CI yesterday was: {noformat} File D:\Python27\lib\unittest\case.py, line 329, in run testMethod() File D:\jenkins\workspace\cassandra-3.0_dtest_win32\cassandra-dtest\udtencoding_test.py, line 15, in udt_test cluster.populate(3).start() File build\bdist.win-amd64\egg\ccmlib\cluster.py, line 249, in start p = node.start(update_pid=False, jvm_args=jvm_args, profile_options=profile_options) File build\bdist.win-amd64\egg\ccmlib\node.py, line 447, in start common.check_socket_available(itf) File build\bdist.win-amd64\egg\ccmlib\common.py, line 343, in check_socket_available raise UnavailableSocketError(Inet address %s:%s is not available: %s % (addr, port, msg)) 'Inet address 127.0.0.1:7000 is not available: [Errno 10013] An attempt was made to access a socket in a way forbidden by its access permissions\n begin captured logging \ndtest: DEBUG: cluster ccm directory: d:\\temp\\dtest-dpsz3i\n- end captured logging -' {noformat} Failure history: [regression in build #17|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest_win32/lastCompletedBuild/testReport/udtencoding_test/TestUDTEncoding/udt_test/history/]. Doesn't look like there was any real change to explain that though. Env: Not sure if repro locally since CASSANDRA-7066 error is in the way. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10084) Very slow performance streaming a large query from a single CF
[ https://issues.apache.org/jira/browse/CASSANDRA-10084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brent Haines updated CASSANDRA-10084: - Attachment: node3.txt node2.txt node1.txt stack dumps for 3 nodes who are processing the slow streaming query. Very slow performance streaming a large query from a single CF -- Key: CASSANDRA-10084 URL: https://issues.apache.org/jira/browse/CASSANDRA-10084 Project: Cassandra Issue Type: Bug Environment: Cassandra 2.1.8 12GB EC2 instance 12 node cluster 32 concurrent reads 32 concurrent writes 6GB heap space Reporter: Brent Haines Attachments: cassandra.yaml, node1.txt, node2.txt, node3.txt We have a relatively simple column family that we use to track event data from different providers. We have been utilizing it for some time. Here is what it looks like: {code} CREATE TABLE data.stories_by_text ( ref_id timeuuid, second_type text, second_value text, object_type text, field_name text, value text, story_id timeuuid, data map<text, text>, PRIMARY KEY ((ref_id, second_type, second_value, object_type, field_name), value, story_id) ) WITH CLUSTERING ORDER BY (value ASC, story_id ASC) AND bloom_filter_fp_chance = 0.01 AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}' AND comment = 'Searchable fields and actions in a story are indexed by ref id which corresponds to a brand, app, app instance, or user.' 
AND compaction = {'min_threshold': '4', 'cold_reads_to_omit': '0.0', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE'; {code} We will, on a daily basis, pull a query of the complete data for a given index; it will look like this: {code} select * from stories_by_text where ref_id = f0124740-2f5a-11e5-a113-03cdf3f3c6dc and second_type = 'Day' and second_value = '20150812' and object_type = 'booshaka:user' and field_name = 'hashedEmail'; {code} In the past, we have been able to pull millions of records out of the CF in a few seconds. We recently added the data column so that we could filter on event data and provide more detailed analysis of activity for our reports. The data map, declared with 'data map<text, text>', is very small; only 2 or 3 name/value pairs. Since we added this column, our streaming query performance has gone straight to hell. I just ran the above query and it took 46 minutes to read 86K rows and then it timed out. I am uncertain what other data you need to see in order to diagnose this. We are using STCS and are considering a change to Leveled Compaction. The table is repaired nightly, and the updates, which come at a very fast clip, will only impact the partition key for today, while the queries are for previous days only. To my knowledge, these queries never finish anymore. They time out, even though I put a 60 second timeout on the read for the cluster. I can watch it pause for 30 to 50 seconds many times during the stream. Again, this only started happening when we added the data column. Please let me know what else you need for this. 
It is having a very big impact on our system. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-10084) Very slow performance streaming a large query from a single CF
[ https://issues.apache.org/jira/browse/CASSANDRA-10084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705974#comment-14705974 ] Brent Haines edited comment on CASSANDRA-10084 at 8/20/15 11:41 PM: stack dumps for 3 nodes that are processing the slow streaming query. was (Author: thebrenthaines): stack dumps for 3 nodes who are processing the slow streaming query. Very slow performance streaming a large query from a single CF -- Key: CASSANDRA-10084 URL: https://issues.apache.org/jira/browse/CASSANDRA-10084 Project: Cassandra Issue Type: Bug Environment: Cassandra 2.1.8 12GB EC2 instance 12 node cluster 32 concurrent reads 32 concurrent writes 6GB heap space Reporter: Brent Haines Attachments: cassandra.yaml, node1.txt, node2.txt, node3.txt We have a relatively simple column family that we use to track event data from different providers. We have been utilizing it for some time. Here is what it looks like: {code} CREATE TABLE data.stories_by_text ( ref_id timeuuid, second_type text, second_value text, object_type text, field_name text, value text, story_id timeuuid, data map<text, text>, PRIMARY KEY ((ref_id, second_type, second_value, object_type, field_name), value, story_id) ) WITH CLUSTERING ORDER BY (value ASC, story_id ASC) AND bloom_filter_fp_chance = 0.01 AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}' AND comment = 'Searchable fields and actions in a story are indexed by ref id which corresponds to a brand, app, app instance, or user.' 
AND compaction = {'min_threshold': '4', 'cold_reads_to_omit': '0.0', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE'; {code} We will, on a daily basis, pull a query of the complete data for a given index; it will look like this: {code} select * from stories_by_text where ref_id = f0124740-2f5a-11e5-a113-03cdf3f3c6dc and second_type = 'Day' and second_value = '20150812' and object_type = 'booshaka:user' and field_name = 'hashedEmail'; {code} In the past, we have been able to pull millions of records out of the CF in a few seconds. We recently added the data column so that we could filter on event data and provide more detailed analysis of activity for our reports. The data map, declared with 'data map<text, text>', is very small; only 2 or 3 name/value pairs. Since we added this column, our streaming query performance has gone straight to hell. I just ran the above query and it took 46 minutes to read 86K rows and then it timed out. I am uncertain what other data you need to see in order to diagnose this. We are using STCS and are considering a change to Leveled Compaction. The table is repaired nightly, and the updates, which come at a very fast clip, will only impact the partition key for today, while the queries are for previous days only. To my knowledge, these queries never finish anymore. They time out, even though I put a 60 second timeout on the read for the cluster. I can watch it pause for 30 to 50 seconds many times during the stream. Again, this only started happening when we added the data column. Please let me know what else you need for this. 
It is having a very big impact on our system. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6542) nodetool removenode hangs
[ https://issues.apache.org/jira/browse/CASSANDRA-6542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705918#comment-14705918 ] Pete Ehlke commented on CASSANDRA-6542: --- *bump* Seeing this with a 6 node 2.1.6 cluster. Is anyone home? Anyone even looking at this? nodetool removenode hangs - Key: CASSANDRA-6542 URL: https://issues.apache.org/jira/browse/CASSANDRA-6542 Project: Cassandra Issue Type: Bug Components: Core Environment: Ubuntu 12, 1.2.11 DSE Reporter: Eric Lubow Assignee: Yuki Morishita Running *nodetool removenode $host-id* doesn't actually remove the node from the ring. I've let it run anywhere from 5 minutes to 3 days and there are no messages in the log about it hanging or failing, the command just sits there running. So the regular response has been to run *nodetool removenode $host-id*, give it about 10-15 minutes and then run *nodetool removenode force*. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10141) UFPureScriptTest fails with pre-3.0 java-driver
[ https://issues.apache.org/jira/browse/CASSANDRA-10141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp updated CASSANDRA-10141: - Summary: UFPureScriptTest fails with pre-3.0 java-driver (was: Windows utest 3.0: UFPureScriptTest fails) UFPureScriptTest fails with pre-3.0 java-driver --- Key: CASSANDRA-10141 URL: https://issues.apache.org/jira/browse/CASSANDRA-10141 Project: Cassandra Issue Type: Bug Reporter: Joshua McKenzie Assignee: Robert Stupp Labels: Windows Fix For: 3.0.x {noformat} [junit] - --- [junit] Testcase: testJavascriptTupleTypeCollection(org.apache.cassandra.cql3.validation.entities.UFPureScriptTest): Caused an ERROR [junit] execution of 'cql_test_keyspace_alt.function_3[tuple<double>, frozen<list<double>>, frozen<set<text>>, frozen<map<int, boolean>>]' failed: java.security.AccessControlException: access denied (java.lang.RuntimePermission accessDeclaredMembers) [junit] org.apache.cassandra.exceptions.FunctionExecutionException: execution of 'cql_test_keyspace_alt.function_3[tuple<double>, frozen<list<double>>, frozen<set<text>>, frozen<map<int, boolean>>]' failed: java.security.AccessControlException: access denied (java.lang.RuntimePermission accessDeclaredMembers) [junit] at org.apache.cassandra.exceptions.FunctionExecutionException.create(FunctionExecutionException.java:35) [junit] at org.apache.cassandra.cql3.functions.UDFunction.execute(UDFunction.java:287) [junit] at org.apache.cassandra.cql3.selection.ScalarFunctionSelector.getOutput(ScalarFunctionSelector.java:60) [junit] at org.apache.cassandra.cql3.selection.Selection$SelectionWithProcessing$1.getOutputRow(Selection.java:535) [junit] at org.apache.cassandra.cql3.selection.Selection$ResultSetBuilder.getOutputRow(Selection.java:363) [junit] at org.apache.cassandra.cql3.selection.Selection$ResultSetBuilder.build(Selection.java:351) [junit] at org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:599) [junit] at 
org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:363) [junit] at org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:379) [junit] at org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:72) [junit] at org.apache.cassandra.cql3.QueryProcessor.executeOnceInternal(QueryProcessor.java:337) [junit] at org.apache.cassandra.cql3.CQLTester.execute(CQLTester.java:654) [junit] at org.apache.cassandra.cql3.validation.entities.UFPureScriptTest.testJavascriptTupleTypeCollection(UFPureScriptTest.java:178) [junit] Caused by: java.security.AccessControlException: access denied (java.lang.RuntimePermission accessDeclaredMembers) [junit] at java.security.AccessControlContext.checkPermission(AccessControlContext.java:457) [junit] at java.security.AccessController.checkPermission(AccessController.java:884) [junit] at java.lang.SecurityManager.checkPermission(SecurityManager.java:549) [junit] at org.apache.cassandra.cql3.functions.ThreadAwareSecurityManager.checkPermission(ThreadAwareSecurityManager.java:164) [junit] at java.lang.Class.checkMemberAccess(Class.java:2348) [junit] at java.lang.Class.getEnclosingMethod(Class.java:1037) [junit] at java.lang.Class.getGenericSuperclass(Class.java:777) [junit] at com.google.common.reflect.TypeCapture.capture(TypeCapture.java:33) [junit] at com.google.common.reflect.TypeToken.init(TypeToken.java:113) [junit] at com.datastax.driver.core.CodecUtils$4.init(CodecUtils.java:44) [junit] at com.datastax.driver.core.CodecUtils.listOf(CodecUtils.java:44) [junit] at com.datastax.driver.core.AbstractGettableByIndexData.getList(AbstractGettableByIndexData.java:347) [junit] at com.datastax.driver.core.TupleValue.getList(TupleValue.java:21) [junit] at com.datastax.driver.core.AbstractGettableByIndexData.getList(AbstractGettableByIndexData.java:336) [junit] at com.datastax.driver.core.TupleValue.getList(TupleValue.java:21) [junit] at 
jdk.nashorn.internal.scripts.Script$2$\^eval\_.:program(eval:1) [junit] at jdk.nashorn.internal.runtime.ScriptFunctionData.invoke(ScriptFunctionData.java:636) [junit] at jdk.nashorn.internal.runtime.ScriptFunction.invoke(ScriptFunction.java:229) [junit] at jdk.nashorn.internal.runtime.ScriptRuntime.apply(ScriptRuntime.java:387) [junit] at
[jira] [Created] (CASSANDRA-10143) Apparent counter overcount during certain network partitions
Joel Knighton created CASSANDRA-10143: - Summary: Apparent counter overcount during certain network partitions Key: CASSANDRA-10143 URL: https://issues.apache.org/jira/browse/CASSANDRA-10143 Project: Cassandra Issue Type: Bug Reporter: Joel Knighton This issue is reproducible in this [Jepsen Test|https://github.com/riptano/jepsen/blob/f45f5320db608d48de2c02c871aecc4910f4d963/cassandra/test/cassandra/counter_test.clj#L16]. The test starts a five-node cluster and issues increments by one against a single counter. It then checks that the counter is in the range [OKed increments, OKed increments + Write Timeouts] at each read. Increments are issued at CL.ONE and reads at CL.ALL. Throughout the test, network failures are induced that create halved network partitions. A halved network partition splits the cluster into three connected nodes and two connected nodes, randomly. This test started failing; bisects showed that it was actually a test change that caused this failure. When the network partitions are induced in a cycle of 15s healthy/45s partitioned or 20s healthy/45s partitioned, the test fails. When network partitions are induced in a cycle of 15s healthy/60s partitioned, 20s healthy/45s partitioned, or 20s healthy/60s partitioned, the test passes. There is nothing unusual in the logs of the nodes for the failed tests. The results are very reproducible. One noticeable trend is that more reads seem to get serviced during the failed tests. Most testing has been done on 2.1.8; the same issue appears to be present in 2.2/3.0/trunk, but I haven't spent as much time reproducing there. Ideas? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
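The invariant the test enforces can be sketched as follows (hypothetical names; the real check lives in the linked Clojure test). Because every increment is +1 and a timed-out write at CL.ONE may or may not have been applied, timeouts widen the acceptable window rather than being ignored; any read above the upper bound is an overcount.

```java
// Hypothetical sketch of the counter invariant described above (the real
// check is in the linked Jepsen Clojure test). Acknowledged increments give
// the lower bound; increments that timed out may still have been applied,
// so they extend only the upper bound.
public class CounterInvariant {
    static boolean withinBounds(long read, long acked, long timedOut) {
        return read >= acked && read <= acked + timedOut;
    }

    public static void main(String[] args) {
        // 100 acked increments and 5 write timeouts: a read of 103 is legal,
        // because up to 5 timed-out increments may have landed anyway...
        System.out.println(withinBounds(103, 100, 5)); // prints "true"
        // ...but 106 exceeds acked + timeouts and can only be an overcount.
        System.out.println(withinBounds(106, 100, 5)); // prints "false"
    }
}
```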
[jira] [Commented] (CASSANDRA-9142) DC Local repair or -hosts should only be allowed with -full repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705496#comment-14705496 ] sankalp kohli commented on CASSANDRA-9142: -- Let me do this by early next week. DC Local repair or -hosts should only be allowed with -full repair -- Key: CASSANDRA-9142 URL: https://issues.apache.org/jira/browse/CASSANDRA-9142 Project: Cassandra Issue Type: Bug Components: Core Reporter: sankalp kohli Assignee: Marcus Eriksson Priority: Minor Fix For: 2.2.x Attachments: trunk_9142.txt We should not let users mix incremental repair with DC-local repair, -host, or any repair that does not include all replicas. This will currently cause sstables on some replicas to be marked as repaired. The next incremental repair will not work on the same set of data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
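A minimal sketch of the guard this ticket asks for (hypothetical names; the actual change is in the attached trunk_9142.txt, which is not reproduced here): reject incremental repair whenever the replica set is restricted.

```java
// Hypothetical sketch of the validation this ticket requests -- not the
// attached patch. Incremental repair marks sstables as repaired on the
// replicas it touches, so a repair that excludes replicas (-local / -hosts)
// must be forced to run as -full.
public class RepairOptionCheck {
    static void validate(boolean incremental, boolean dcLocal, boolean explicitHosts) {
        if (incremental && (dcLocal || explicitHosts))
            throw new IllegalArgumentException(
                "Incremental repair must include all replicas; use -full with -local or -hosts");
    }

    public static void main(String[] args) {
        validate(true, false, false);    // incremental against all replicas: OK
        validate(false, true, false);    // full + dc-local: OK
        try {
            validate(true, true, false); // incremental + dc-local: rejected
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```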
[1/2] cassandra git commit: Fix post-9749 test failures
Repository: cassandra Updated Branches: refs/heads/cassandra-3.0 df52cd64e - 0d866456a Fix post-9749 test failures patch by Branimir Lambov; reviewed by Ariel Weisberg for CASSANDRA-9749 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7a85c8b8 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7a85c8b8 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7a85c8b8 Branch: refs/heads/cassandra-3.0 Commit: 7a85c8b8fbf753858c4334c4249475e6bb1a24e4 Parents: a6dd2b8 Author: Branimir Lambov branimir.lam...@datastax.com Authored: Wed Aug 19 19:56:58 2015 +0300 Committer: Aleksey Yeschenko alek...@apache.org Committed: Thu Aug 20 21:25:33 2015 +0300 -- .../cassandra/db/commitlog/CommitLogTest.java | 102 ++- .../db/commitlog/CommitLogUpgradeTest.java | 38 --- 2 files changed, 78 insertions(+), 62 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/7a85c8b8/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java -- diff --git a/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java b/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java index da8058c..0ad880b 100644 --- a/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java +++ b/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java @@ -29,19 +29,18 @@ import java.nio.ByteBuffer; import java.util.HashMap; import java.util.Map; import java.util.UUID; -import java.util.concurrent.Callable; import java.util.concurrent.ExecutionException; import java.util.zip.CRC32; import java.util.zip.Checksum; import com.google.common.collect.ImmutableMap; + import org.junit.Assert; import org.junit.BeforeClass; import org.junit.Test; import org.apache.cassandra.SchemaLoader; import org.apache.cassandra.Util; -import org.apache.cassandra.config.Config.CommitFailurePolicy; import org.apache.cassandra.config.DatabaseDescriptor; import org.apache.cassandra.config.KSMetaData; import 
org.apache.cassandra.config.ParameterizedClass; @@ -63,8 +62,7 @@ import org.apache.cassandra.io.util.ByteBufferDataInput; import org.apache.cassandra.io.util.FileDataInput; import org.apache.cassandra.locator.SimpleStrategy; import org.apache.cassandra.net.MessagingService; -import org.apache.cassandra.utils.ByteBufferUtil; -import org.apache.cassandra.utils.FBUtilities; +import org.apache.cassandra.utils.*; import static org.apache.cassandra.utils.ByteBufferUtil.bytes; @@ -94,10 +92,15 @@ public class CommitLogTest CompactionManager.instance.disableAutoCompaction(); } -@Test(expected = CommitLogReplayException.class) +@Test public void testRecoveryWithEmptyLog() throws Exception { -CommitLog.instance.recover(new File[]{ tmpFile(CommitLogDescriptor.current_version) }); +runExpecting(new WrappedRunnable() { +public void runMayThrow() throws Exception +{ +CommitLog.instance.recover(new File[]{ tmpFile(CommitLogDescriptor.current_version) }); +} +}, CommitLogReplayException.class); } @Test @@ -119,10 +122,15 @@ public class CommitLogTest testRecoveryWithBadSizeArgument(100, 10); } -@Test(expected = CommitLogReplayException.class) +@Test public void testRecoveryWithShortSize() throws Exception { -testRecovery(new byte[2], CommitLogDescriptor.VERSION_20); +runExpecting(new WrappedRunnable() { +public void runMayThrow() throws Exception +{ +testRecovery(new byte[2], CommitLogDescriptor.VERSION_20); +} +}, CommitLogReplayException.class); } @Test @@ -146,10 +154,15 @@ public class CommitLogTest testRecovery(garbage, CommitLogDescriptor.current_version); } -@Test(expected = CommitLogReplayException.class) +@Test public void testRecoveryWithGarbageLog_fail() throws Exception { -testRecoveryWithGarbageLog(); +runExpecting(new WrappedRunnable() { +public void runMayThrow() throws Exception +{ +testRecoveryWithGarbageLog(); +} +}, CommitLogReplayException.class); } @Test @@ -164,18 +177,6 @@ public class CommitLogTest } @Test -public void 
testRecoveryWithGarbageLog_ignoredByPolicy() throws Exception -{ -CommitFailurePolicy existingPolicy = DatabaseDescriptor.getCommitFailurePolicy(); -try { - DatabaseDescriptor.setCommitFailurePolicy(CommitFailurePolicy.ignore); -testRecoveryWithGarbageLog(); -} finally { -
[2/2] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0
Merge branch 'cassandra-2.2' into cassandra-3.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0d866456 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0d866456 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0d866456 Branch: refs/heads/cassandra-3.0 Commit: 0d866456a144ea6b3f86f3677f0e8d90c7b1d2d5 Parents: df52cd6 7a85c8b Author: Aleksey Yeschenko alek...@apache.org Authored: Thu Aug 20 21:30:15 2015 +0300 Committer: Aleksey Yeschenko alek...@apache.org Committed: Thu Aug 20 21:30:15 2015 +0300 -- --
[2/3] cassandra git commit: Give compaction strategies more control over sstable creation
Give compaction strategies more control over sstable creation

Patch by Blake Eggleston; reviewed by marcuse for CASSANDRA-8671

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9ed27277
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9ed27277
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9ed27277

Branch: refs/heads/trunk
Commit: 9ed2727739c73d64086d09a86a407a77390f081a
Parents: 0d86645
Author: Blake Eggleston <bdeggles...@gmail.com>
Authored: Thu Aug 6 10:19:55 2015 -0700
Committer: Marcus Eriksson <marc...@apache.org>
Committed: Thu Aug 20 20:47:40 2015 +0200

----------------------------------------------------------------------
 .../apache/cassandra/db/ColumnFamilyStore.java  |  73 +---
 .../org/apache/cassandra/db/Directories.java    |  42 +-
 src/java/org/apache/cassandra/db/Keyspace.java  |   5 +
 src/java/org/apache/cassandra/db/Memtable.java  |  32 +++--
 .../compaction/AbstractCompactionStrategy.java  |  24 +++-
 .../db/compaction/AbstractCompactionTask.java   |   3 +-
 .../db/compaction/CompactionManager.java        |   6 +-
 .../compaction/CompactionStrategyManager.java   |  40 +--
 .../cassandra/db/compaction/CompactionTask.java |  22 ++--
 .../db/compaction/LeveledCompactionTask.java    |   6 +-
 .../db/compaction/SSTableSplitter.java          |   3 +-
 .../cassandra/db/compaction/Scrubber.java       |   3 +-
 .../SizeTieredCompactionStrategy.java           |   4 +-
 .../writers/CompactionAwareWriter.java          |  53 ++---
 .../writers/DefaultCompactionWriter.java        |  32 ++---
 .../writers/MajorLeveledCompactionWriter.java   |  46
 .../writers/MaxSSTableSizeWriter.java           |  45 ---
 .../SplittingSizeTieredCompactionWriter.java    |  52 -
 .../db/lifecycle/LifecycleTransaction.java      |   9 ++
 .../apache/cassandra/db/lifecycle/Tracker.java  |  34 +++--
 .../org/apache/cassandra/db/lifecycle/View.java |   4 +-
 .../io/sstable/AbstractSSTableSimpleWriter.java |  11 +-
 .../io/sstable/SSTableMultiWriter.java          |  54 +
 .../cassandra/io/sstable/SSTableTxnWriter.java  |  43 +--
 .../io/sstable/SimpleSSTableMultiWriter.java    | 116 +++
 .../notifications/SSTableAddedNotification.java |   4 +-
 .../cassandra/streaming/StreamReader.java       |  22 ++--
 .../cassandra/streaming/StreamReceiveTask.java  |  22 ++--
 .../compress/CompressedStreamReader.java        |   8 +-
 .../streaming/messages/IncomingFileMessage.java |   7 +-
 .../cassandra/tools/SSTableExpiredBlockers.java |   3 +-
 .../cassandra/tools/SSTableLevelResetter.java   |   2 +-
 .../cassandra/tools/SSTableOfflineRelevel.java  |   5 +-
 .../cassandra/tools/StandaloneScrubber.java     |   2 +-
 .../cassandra/tools/StandaloneUpgrader.java     |   2 +-
 .../cassandra/tools/StandaloneVerifier.java     |   7 +-
 .../db/compaction/LongCompactionsTest.java      |   6 +-
 test/unit/org/apache/cassandra/MockSchema.java  |   2 +-
 .../cassandra/db/ColumnFamilyStoreTest.java     |   4 +-
 .../unit/org/apache/cassandra/db/ScrubTest.java |  12 +-
 .../db/compaction/AntiCompactionTest.java       |  10 +-
 .../compaction/CompactionAwareWriterTest.java   |   8 +-
 .../LeveledCompactionStrategyTest.java          |   2 +-
 .../db/lifecycle/RealTransactionsTest.java      |   8 +-
 .../cassandra/db/lifecycle/TrackerTest.java     |  19 +--
 .../apache/cassandra/db/lifecycle/ViewTest.java |   2 +-
 .../io/sstable/BigTableWriterTest.java          |   4 +-
 .../io/sstable/CQLSSTableWriterClientTest.java  |   2 +
 .../io/sstable/SSTableRewriterTest.java         |  10 +-
 .../cassandra/io/sstable/SSTableUtils.java      |  25 ++--
 .../org/apache/cassandra/schema/DefsTest.java   |   6 +-
 51 files changed, 651 insertions(+), 315 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9ed27277/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index a12de0a..b199c77 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -58,6 +58,7 @@ import org.apache.cassandra.io.FSWriteError;
 import org.apache.cassandra.io.sstable.*;
 import org.apache.cassandra.io.sstable.Descriptor;
 import org.apache.cassandra.io.sstable.format.*;
+import org.apache.cassandra.io.sstable.metadata.MetadataCollector;
 import org.apache.cassandra.io.util.FileUtils;
 import org.apache.cassandra.metrics.TableMetrics.Sampler;
 import org.apache.cassandra.metrics.TableMetrics;
@@ -75,6 +76,33 @@ import static org.apache.cassandra.utils.Throwables.maybeFail;

 public class ColumnFamilyStore implements ColumnFamilyStoreMBean
 {
+    // the
[2/2] cassandra git commit: Give compaction strategies more control over sstable creation
Give compaction strategies more control over sstable creation

Patch by Blake Eggleston; reviewed by marcuse for CASSANDRA-8671

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9ed27277
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9ed27277
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9ed27277

Branch: refs/heads/cassandra-3.0
Commit: 9ed2727739c73d64086d09a86a407a77390f081a
Parents: 0d86645
Author: Blake Eggleston <bdeggles...@gmail.com>
Authored: Thu Aug 6 10:19:55 2015 -0700
Committer: Marcus Eriksson <marc...@apache.org>
Committed: Thu Aug 20 20:47:40 2015 +0200
[3/3] cassandra git commit: Merge branch 'cassandra-3.0' into trunk
Merge branch 'cassandra-3.0' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0fd857ba
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0fd857ba
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0fd857ba

Branch: refs/heads/trunk
Commit: 0fd857baf462b4e8c47366f345ba938ce6574657
Parents: a134925 9ed2727
Author: Marcus Eriksson <marc...@apache.org>
Authored: Thu Aug 20 20:50:20 2015 +0200
Committer: Marcus Eriksson <marc...@apache.org>
Committed: Thu Aug 20 20:50:20 2015 +0200

http://git-wip-us.apache.org/repos/asf/cassandra/blob/0fd857ba/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
----------------------------------------------------------------------
[1/2] cassandra git commit: Give compaction strategies more control over sstable creation
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 0d866456a -> 9ed272773

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9ed27277/src/java/org/apache/cassandra/io/sstable/SimpleSSTableMultiWriter.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/io/sstable/SimpleSSTableMultiWriter.java b/src/java/org/apache/cassandra/io/sstable/SimpleSSTableMultiWriter.java
new file mode 100644
index 000..2112656
--- /dev/null
+++ b/src/java/org/apache/cassandra/io/sstable/SimpleSSTableMultiWriter.java
@@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.io.sstable;
+
+import java.util.Collection;
+import java.util.Collections;
+import java.util.UUID;
+
+import org.apache.cassandra.config.CFMetaData;
+import org.apache.cassandra.db.RowIndexEntry;
+import org.apache.cassandra.db.SerializationHeader;
+import org.apache.cassandra.db.lifecycle.LifecycleTransaction;
+import org.apache.cassandra.db.rows.UnfilteredRowIterator;
+import org.apache.cassandra.io.sstable.format.SSTableReader;
+import org.apache.cassandra.io.sstable.format.SSTableWriter;
+import org.apache.cassandra.io.sstable.metadata.MetadataCollector;
+
+public class SimpleSSTableMultiWriter implements SSTableMultiWriter
+{
+    private final SSTableWriter writer;
+
+    private SimpleSSTableMultiWriter(SSTableWriter writer)
+    {
+        this.writer = writer;
+    }
+
+    public boolean append(UnfilteredRowIterator partition)
+    {
+        RowIndexEntry indexEntry = writer.append(partition);
+        return indexEntry != null;
+    }
+
+    public Collection<SSTableReader> finish(long repairedAt, long maxDataAge, boolean openResult)
+    {
+        return Collections.singleton(writer.finish(repairedAt, maxDataAge, openResult));
+    }
+
+    public Collection<SSTableReader> finish(boolean openResult)
+    {
+        return Collections.singleton(writer.finish(openResult));
+    }
+
+    public Collection<SSTableReader> finished()
+    {
+        return Collections.singleton(writer.finished());
+    }
+
+    public SSTableMultiWriter setOpenResult(boolean openResult)
+    {
+        writer.setOpenResult(openResult);
+        return this;
+    }
+
+    public String getFilename()
+    {
+        return writer.getFilename();
+    }
+
+    public long getFilePointer()
+    {
+        return writer.getFilePointer();
+    }
+
+    public UUID getCfId()
+    {
+        return writer.metadata.cfId;
+    }
+
+    public Throwable commit(Throwable accumulate)
+    {
+        return writer.commit(accumulate);
+    }
+
+    public Throwable abort(Throwable accumulate)
+    {
+        return writer.abort(accumulate);
+    }
+
+    public void prepareToCommit()
+    {
+        writer.prepareToCommit();
+    }
+
+    public void close() throws Exception
+    {
+        writer.close();
+    }
+
+    public static SSTableMultiWriter create(Descriptor descriptor,
+                                            long keyCount,
+                                            long repairedAt,
+                                            CFMetaData cfm,
+                                            MetadataCollector metadataCollector,
+                                            SerializationHeader header,
+                                            LifecycleTransaction txn)
+    {
+        SSTableWriter writer = SSTableWriter.create(descriptor, keyCount, repairedAt, cfm, metadataCollector, header, txn);
+        return new SimpleSSTableMultiWriter(writer);
+    }
+}

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9ed27277/src/java/org/apache/cassandra/notifications/SSTableAddedNotification.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/notifications/SSTableAddedNotification.java b/src/java/org/apache/cassandra/notifications/SSTableAddedNotification.java
index 15230ea..56d6130 100644
--- a/src/java/org/apache/cassandra/notifications/SSTableAddedNotification.java
+++ b/src/java/org/apache/cassandra/notifications/SSTableAddedNotification.java
@@ -21,8 +21,8 @@ import
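The `SimpleSSTableMultiWriter` above just delegates to a single `SSTableWriter`; the point of the `SSTableMultiWriter` interface is that a compaction strategy can instead supply an implementation wrapping several writers and route each partition to one of them — for example, to different disks — which is the goal of CASSANDRA-8671. A minimal, self-contained sketch of that routing pattern follows; the `Writer` interface, `ListWriter`, and `RoutingMultiWriter` below are hypothetical stand-ins for illustration, not Cassandra APIs:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical stand-in for SSTableWriter: appends partition keys, reports contents.
interface Writer {
    void append(String partitionKey);
    List<String> contents();
}

class ListWriter implements Writer {
    private final List<String> rows = new ArrayList<>();
    public void append(String partitionKey) { rows.add(partitionKey); }
    public List<String> contents() { return rows; }
}

// Multi-writer that routes each partition to one of several wrapped writers,
// mirroring how a compaction strategy could route partitions to disks.
class RoutingMultiWriter implements Writer {
    private final List<Writer> delegates;
    private final ToIntFunction<String> router; // partition key -> delegate index

    RoutingMultiWriter(List<Writer> delegates, ToIntFunction<String> router) {
        this.delegates = delegates;
        this.router = router;
    }

    public void append(String partitionKey) {
        delegates.get(router.applyAsInt(partitionKey)).append(partitionKey);
    }

    public List<String> contents() {
        List<String> all = new ArrayList<>();
        for (Writer w : delegates)
            all.addAll(w.contents());
        return all;
    }
}

public class MultiWriterSketch {
    public static void main(String[] args) {
        // Two "disks", routed by hash parity of the partition key.
        List<Writer> disks = List.of(new ListWriter(), new ListWriter());
        Writer multi = new RoutingMultiWriter(disks, key -> Math.abs(key.hashCode()) % 2);
        for (String key : new String[]{ "a", "b", "c", "d" })
            multi.append(key);
        System.out.println("total rows: " + multi.contents().size());
    }
}
```

A real implementation would route on a partition-key range or a user-defined parameter rather than hash parity, and would also fan the commit/abort lifecycle calls out to every wrapped writer.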
[1/3] cassandra git commit: Give compaction strategies more control over sstable creation
Repository: cassandra
Updated Branches:
  refs/heads/trunk a1349257d -> 0fd857baf
[jira] [Commented] (CASSANDRA-10126) Column subset serialization uses an unnecessary -1L for large subsets
[ https://issues.apache.org/jira/browse/CASSANDRA-10126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705415#comment-14705415 ]

Ariel Weisberg commented on CASSANDRA-10126:
--------------------------------------------

+1. Looks correct, Cassci is happy, and this is covered by ColumnsTest.

Column subset serialization uses an unnecessary -1L for large subsets
---------------------------------------------------------------------

                Key: CASSANDRA-10126
                URL: https://issues.apache.org/jira/browse/CASSANDRA-10126
            Project: Cassandra
         Issue Type: Improvement
         Components: Core
           Reporter: Benedict
           Assignee: Benedict
           Priority: Trivial
            Fix For: 3.0.0 rc1

Follow up to CASSANDRA-9894. It's possible to completely remove a 9 byte overhead for large columns, as we know upfront which kind of serialization we will have done.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7410) Pig support for BulkOutputFormat as a parameter in url
[ https://issues.apache.org/jira/browse/CASSANDRA-7410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705457#comment-14705457 ]

Jeremiah Jordan commented on CASSANDRA-7410:
--------------------------------------------

ping

Pig support for BulkOutputFormat as a parameter in url
------------------------------------------------------

                Key: CASSANDRA-7410
                URL: https://issues.apache.org/jira/browse/CASSANDRA-7410
            Project: Cassandra
         Issue Type: Improvement
         Components: Hadoop
           Reporter: Alex Liu
           Assignee: Alex Liu
           Priority: Minor
            Fix For: 2.0.x
        Attachments: 7410-2.0-branch.txt, 7410-2.1-branch.txt, 7410-v2-2.0-branch.txt, 7410-v3-2.0-branch.txt, CASSANDRA-7410-v2-2.1-branch.txt, CASSANDRA-7410-v3-2.1-branch.txt, CASSANDRA-7410-v4-2.0-branch.txt, CASSANDRA-7410-v4-2.1-branch.txt, CASSANDRA-7410-v5-2.0-branch.txt

Add BulkOutputFormat support in Pig url
[jira] [Updated] (CASSANDRA-10141) Windows utest 3.0: UFPureScriptTest fails
[ https://issues.apache.org/jira/browse/CASSANDRA-10141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Stupp updated CASSANDRA-10141:
-------------------------------------
    Issue Type: Bug  (was: Sub-task)
        Parent:      (was: CASSANDRA-10032)

Windows utest 3.0: UFPureScriptTest fails
-----------------------------------------

                Key: CASSANDRA-10141
                URL: https://issues.apache.org/jira/browse/CASSANDRA-10141
            Project: Cassandra
         Issue Type: Bug
           Reporter: Joshua McKenzie
           Assignee: Robert Stupp
             Labels: Windows
            Fix For: 3.0.x

{noformat}
    [junit] ------------------------------------------------------
    [junit] Testcase: testJavascriptTupleTypeCollection(org.apache.cassandra.cql3.validation.entities.UFPureScriptTest): Caused an ERROR
    [junit] execution of 'cql_test_keyspace_alt.function_3[tuple<double>, frozen<list<double>>, frozen<set<text>>, frozen<map<int, boolean>>]' failed: java.security.AccessControlException: access denied ("java.lang.RuntimePermission" "accessDeclaredMembers")
    [junit] org.apache.cassandra.exceptions.FunctionExecutionException: execution of 'cql_test_keyspace_alt.function_3[tuple<double>, frozen<list<double>>, frozen<set<text>>, frozen<map<int, boolean>>]' failed: java.security.AccessControlException: access denied ("java.lang.RuntimePermission" "accessDeclaredMembers")
    [junit] at org.apache.cassandra.exceptions.FunctionExecutionException.create(FunctionExecutionException.java:35)
    [junit] at org.apache.cassandra.cql3.functions.UDFunction.execute(UDFunction.java:287)
    [junit] at org.apache.cassandra.cql3.selection.ScalarFunctionSelector.getOutput(ScalarFunctionSelector.java:60)
    [junit] at org.apache.cassandra.cql3.selection.Selection$SelectionWithProcessing$1.getOutputRow(Selection.java:535)
    [junit] at org.apache.cassandra.cql3.selection.Selection$ResultSetBuilder.getOutputRow(Selection.java:363)
    [junit] at org.apache.cassandra.cql3.selection.Selection$ResultSetBuilder.build(Selection.java:351)
    [junit] at org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:599)
    [junit] at org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:363)
    [junit] at org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:379)
    [junit] at org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:72)
    [junit] at org.apache.cassandra.cql3.QueryProcessor.executeOnceInternal(QueryProcessor.java:337)
    [junit] at org.apache.cassandra.cql3.CQLTester.execute(CQLTester.java:654)
    [junit] at org.apache.cassandra.cql3.validation.entities.UFPureScriptTest.testJavascriptTupleTypeCollection(UFPureScriptTest.java:178)
    [junit] Caused by: java.security.AccessControlException: access denied ("java.lang.RuntimePermission" "accessDeclaredMembers")
    [junit] at java.security.AccessControlContext.checkPermission(AccessControlContext.java:457)
    [junit] at java.security.AccessController.checkPermission(AccessController.java:884)
    [junit] at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
    [junit] at org.apache.cassandra.cql3.functions.ThreadAwareSecurityManager.checkPermission(ThreadAwareSecurityManager.java:164)
    [junit] at java.lang.Class.checkMemberAccess(Class.java:2348)
    [junit] at java.lang.Class.getEnclosingMethod(Class.java:1037)
    [junit] at java.lang.Class.getGenericSuperclass(Class.java:777)
    [junit] at com.google.common.reflect.TypeCapture.capture(TypeCapture.java:33)
    [junit] at com.google.common.reflect.TypeToken.<init>(TypeToken.java:113)
    [junit] at com.datastax.driver.core.CodecUtils$4.<init>(CodecUtils.java:44)
    [junit] at com.datastax.driver.core.CodecUtils.listOf(CodecUtils.java:44)
    [junit] at com.datastax.driver.core.AbstractGettableByIndexData.getList(AbstractGettableByIndexData.java:347)
    [junit] at com.datastax.driver.core.TupleValue.getList(TupleValue.java:21)
    [junit] at com.datastax.driver.core.AbstractGettableByIndexData.getList(AbstractGettableByIndexData.java:336)
    [junit] at com.datastax.driver.core.TupleValue.getList(TupleValue.java:21)
    [junit] at jdk.nashorn.internal.scripts.Script$2$\^eval\_.:program(eval:1)
    [junit] at jdk.nashorn.internal.runtime.ScriptFunctionData.invoke(ScriptFunctionData.java:636)
    [junit] at jdk.nashorn.internal.runtime.ScriptFunction.invoke(ScriptFunction.java:229)
    [junit] at jdk.nashorn.internal.runtime.ScriptRuntime.apply(ScriptRuntime.java:387)
    [junit] at jdk.nashorn.api.scripting.NashornScriptEngine.evalImpl(NashornScriptEngine.java:414)
    [junit]
[jira] [Commented] (CASSANDRA-10141) UFPureScriptTest fails with pre-3.0 java-driver
[ https://issues.apache.org/jira/browse/CASSANDRA-10141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705460#comment-14705460 ]

Robert Stupp commented on CASSANDRA-10141:
------------------------------------------

Can be reproduced with just running {{UFPureScriptTest.testJavascriptTupleTypeCollection}}. So it's not specific to Windows.

UFPureScriptTest fails with pre-3.0 java-driver
-----------------------------------------------

                Key: CASSANDRA-10141
                URL: https://issues.apache.org/jira/browse/CASSANDRA-10141
            Project: Cassandra
         Issue Type: Bug
           Reporter: Joshua McKenzie
           Assignee: Robert Stupp
             Labels: Windows
            Fix For: 3.0.x
[1/3] cassandra git commit: Fix post-9749 test failures
Repository: cassandra
Updated Branches:
  refs/heads/trunk 171889c80 -> a1349257d

Fix post-9749 test failures

patch by Branimir Lambov; reviewed by Ariel Weisberg for CASSANDRA-9749

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7a85c8b8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7a85c8b8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7a85c8b8

Branch: refs/heads/trunk
Commit: 7a85c8b8fbf753858c4334c4249475e6bb1a24e4
Parents: a6dd2b8
Author: Branimir Lambov <branimir.lam...@datastax.com>
Authored: Wed Aug 19 19:56:58 2015 +0300
Committer: Aleksey Yeschenko <alek...@apache.org>
Committed: Thu Aug 20 21:25:33 2015 +0300

----------------------------------------------------------------------
 .../cassandra/db/commitlog/CommitLogTest.java   | 102 ++-
 .../db/commitlog/CommitLogUpgradeTest.java      |  38 ---
 2 files changed, 78 insertions(+), 62 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7a85c8b8/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java
----------------------------------------------------------------------
diff --git a/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java b/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java
index da8058c..0ad880b 100644
--- a/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java
+++ b/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java
@@ -29,19 +29,18 @@ import java.nio.ByteBuffer;
 import java.util.HashMap;
 import java.util.Map;
 import java.util.UUID;
-import java.util.concurrent.Callable;
 import java.util.concurrent.ExecutionException;
 import java.util.zip.CRC32;
 import java.util.zip.Checksum;

 import com.google.common.collect.ImmutableMap;
+
 import org.junit.Assert;
 import org.junit.BeforeClass;
 import org.junit.Test;

 import org.apache.cassandra.SchemaLoader;
 import org.apache.cassandra.Util;
-import org.apache.cassandra.config.Config.CommitFailurePolicy;
 import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.config.KSMetaData;
 import org.apache.cassandra.config.ParameterizedClass;
@@ -63,8 +62,7 @@ import org.apache.cassandra.io.util.ByteBufferDataInput;
 import org.apache.cassandra.io.util.FileDataInput;
 import org.apache.cassandra.locator.SimpleStrategy;
 import org.apache.cassandra.net.MessagingService;
-import org.apache.cassandra.utils.ByteBufferUtil;
-import org.apache.cassandra.utils.FBUtilities;
+import org.apache.cassandra.utils.*;

 import static org.apache.cassandra.utils.ByteBufferUtil.bytes;
@@ -94,10 +92,15 @@ public class CommitLogTest
         CompactionManager.instance.disableAutoCompaction();
     }

-    @Test(expected = CommitLogReplayException.class)
+    @Test
     public void testRecoveryWithEmptyLog() throws Exception
     {
-        CommitLog.instance.recover(new File[]{ tmpFile(CommitLogDescriptor.current_version) });
+        runExpecting(new WrappedRunnable() {
+            public void runMayThrow() throws Exception
+            {
+                CommitLog.instance.recover(new File[]{ tmpFile(CommitLogDescriptor.current_version) });
+            }
+        }, CommitLogReplayException.class);
     }

@@ -119,10 +122,15 @@ public class CommitLogTest
         testRecoveryWithBadSizeArgument(100, 10);
     }

-    @Test(expected = CommitLogReplayException.class)
+    @Test
     public void testRecoveryWithShortSize() throws Exception
     {
-        testRecovery(new byte[2], CommitLogDescriptor.VERSION_20);
+        runExpecting(new WrappedRunnable() {
+            public void runMayThrow() throws Exception
+            {
+                testRecovery(new byte[2], CommitLogDescriptor.VERSION_20);
+            }
+        }, CommitLogReplayException.class);
     }

@@ -146,10 +154,15 @@ public class CommitLogTest
         testRecovery(garbage, CommitLogDescriptor.current_version);
     }

-    @Test(expected = CommitLogReplayException.class)
+    @Test
     public void testRecoveryWithGarbageLog_fail() throws Exception
     {
-        testRecoveryWithGarbageLog();
+        runExpecting(new WrappedRunnable() {
+            public void runMayThrow() throws Exception
+            {
+                testRecoveryWithGarbageLog();
+            }
+        }, CommitLogReplayException.class);
     }

@@ -164,18 +177,6 @@ import
     }

     @Test
-    public void testRecoveryWithGarbageLog_ignoredByPolicy() throws Exception
-    {
-        CommitFailurePolicy existingPolicy = DatabaseDescriptor.getCommitFailurePolicy();
-        try {
-            DatabaseDescriptor.setCommitFailurePolicy(CommitFailurePolicy.ignore);
-            testRecoveryWithGarbageLog();
-        } finally {
-            DatabaseDescriptor.setCommitFailurePolicy(existingPolicy);
-
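The patch above replaces `@Test(expected = CommitLogReplayException.class)` annotations with calls to a `runExpecting(...)` helper, so the expected-exception check can be applied conditionally rather than for the whole test method. The helper's own body is not part of this diff fragment; a self-contained sketch of what such a helper plausibly does (the `ThrowingRunnable` interface and class name below are hypothetical, standing in for Cassandra's `WrappedRunnable`):

```java
// Minimal sketch of a runExpecting-style test helper: runs a task and verifies
// that it throws (or, when expected is null, does not throw) a given exception type.
public class RunExpectingSketch {
    interface ThrowingRunnable { void run() throws Throwable; }

    static void runExpecting(ThrowingRunnable task, Class<? extends Throwable> expected) {
        Throwable caught = null;
        try {
            task.run();
        } catch (Throwable t) {
            caught = t;
        }
        if (expected == null && caught != null)
            throw new AssertionError("unexpected exception", caught);
        if (expected != null && (caught == null || !expected.isInstance(caught)))
            throw new AssertionError("expected " + expected.getName() + " but got " + caught);
    }

    public static void main(String[] args) {
        // Passes: the task throws the expected type.
        runExpecting(() -> { throw new IllegalStateException("boom"); }, IllegalStateException.class);
        // Passes: no exception expected, none thrown.
        runExpecting(() -> {}, null);
        System.out.println("ok");
    }
}
```

Compared with the annotation, this style lets a single test body assert "throws under policy X, succeeds under policy Y" by calling the helper twice with different expectations.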
[2/3] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0
Merge branch 'cassandra-2.2' into cassandra-3.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0d866456 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0d866456 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0d866456 Branch: refs/heads/trunk Commit: 0d866456a144ea6b3f86f3677f0e8d90c7b1d2d5 Parents: df52cd6 7a85c8b Author: Aleksey Yeschenko alek...@apache.org Authored: Thu Aug 20 21:30:15 2015 +0300 Committer: Aleksey Yeschenko alek...@apache.org Committed: Thu Aug 20 21:30:15 2015 +0300 -- --
[3/3] cassandra git commit: Merge branch 'cassandra-3.0' into trunk
Merge branch 'cassandra-3.0' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a1349257 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a1349257 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a1349257 Branch: refs/heads/trunk Commit: a1349257d5dc9ce2af4c2172fd29089da99c3719 Parents: 171889c 0d86645 Author: Aleksey Yeschenko alek...@apache.org Authored: Thu Aug 20 21:31:05 2015 +0300 Committer: Aleksey Yeschenko alek...@apache.org Committed: Thu Aug 20 21:31:05 2015 +0300 -- --
[jira] [Assigned] (CASSANDRA-8611) give streaming_socket_timeout_in_ms a non-zero default
[ https://issues.apache.org/jira/browse/CASSANDRA-8611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer reassigned CASSANDRA-8611: - Assignee: Benjamin Lerer give streaming_socket_timeout_in_ms a non-zero default -- Key: CASSANDRA-8611 URL: https://issues.apache.org/jira/browse/CASSANDRA-8611 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jeremy Hanna Assignee: Benjamin Lerer Sometimes, as mentioned in CASSANDRA-8472, streams will hang. We have streaming_socket_timeout_in_ms, which can retry after a timeout. It would be good to give it a non-zero default value. We don't want to paper over problems, but streams sometimes hang, and you don't want long-running streaming operations such as repairs or bootstraps to just fail. streaming_socket_timeout_in_ms should be based on the TCP idle timeout, so it shouldn't be a problem to set it to a value on the order of minutes. Also, the socket should only be open during the actual streaming, not during operations such as merkle tree generation. We can set it to a conservative value and people can tune it more aggressively as needed. Disabling it by default is, in my opinion, too conservative. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
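For illustration, such a default amounts to a one-line cassandra.yaml change. The value below is a placeholder on the scale the ticket suggests (one hour), not necessarily the default the ticket eventually settles on:

```yaml
# 0 disables the streaming socket timeout entirely (the behaviour this
# ticket proposes to change); a non-zero value lets a hung stream fail
# fast enough to be retried instead of stalling a repair or bootstrap.
streaming_socket_timeout_in_ms: 3600000   # 1 hour, illustrative only
```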
[jira] [Commented] (CASSANDRA-9749) CommitLogReplayer continues startup after encountering errors
[ https://issues.apache.org/jira/browse/CASSANDRA-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705382#comment-14705382 ] Ariel Weisberg commented on CASSANDRA-9749: --- 3.0 is not affected because the expectations were already updated when rebasing the change for 3.0; we are now coming back and updating the expectations in 2.2. The changes look good. This just removes the tests that depended on the policy having an impact (which it no longer has), plus some test refactoring. Cassci seems OK with it. +1 CommitLogReplayer continues startup after encountering errors - Key: CASSANDRA-9749 URL: https://issues.apache.org/jira/browse/CASSANDRA-9749 Project: Cassandra Issue Type: Bug Reporter: Blake Eggleston Assignee: Branimir Lambov Fix For: 2.2.1, 3.0 beta 1 Attachments: 9749-coverage.tgz There are a few places where the commit log recovery method either skips sections or just returns when it encounters errors. Specifically, if it can't read the header here: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L298 Or if there are compressor problems here: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L314 and here: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L366 Whether these are user-fixable or not, I think we should require more direct user intervention (i.e. fix what's wrong, or remove the bad file and restart) since we're basically losing data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10140) Enable GC logging by default
[ https://issues.apache.org/jira/browse/CASSANDRA-10140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705430#comment-14705430 ] Ariel Weisberg commented on CASSANDRA-10140: [This is what I was thinking of.|http://blog.ragozin.info/2012/10/safepoints-in-hotspot-jvm.html?showComment=137138998#c5290312151653517505] {quote} Very informative article, thank you. We have seen this due to IO overload inside Linux. When this happens, GC log entries show user_time ~0, sys_time ~0, and real_time of at least 1 second. We are able to recreate this type of stall in the lab too. It turns out that deferred writes appending to a file can be blocked for a long time when the write is blocked by a journal commit, or when dirty_ratio is exceeded. We straced the Java process and could correlate some, but not all, of the stalls to GC threads writing to the gc.log file. If GC threads do not have to park the Java threads running in kernel mode, we are stumped about what else could have caused the stall (where user_time ~0, sys_time ~0). Any other data/traces you would recommend to help us understand the issue better? Many thanks. {quote} I agree we should be able to turn on GC logging all the time. But it should buffer sufficiently to private memory and have a dedicated victim thread to flush to files. Enable GC logging by default Key: CASSANDRA-10140 URL: https://issues.apache.org/jira/browse/CASSANDRA-10140 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Chris Lohfink Assignee: Chris Lohfink Priority: Minor Attachments: CASSANDRA-10140.txt Overhead for the gc logging is very small (with cycling logs in 7+) and it provides a ton of useful information. This will open up more for C* diagnostic tools to provide feedback as well, without requiring restarts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
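The "cycling logs in 7+" the ticket relies on maps to HotSpot's GC-log rotation flags. A hedged cassandra-env.sh sketch follows; these are standard JDK 7u2+ HotSpot options, not necessarily the exact set in the attached CASSANDRA-10140.txt:

```shell
# Append standard HotSpot GC-logging options to the JVM options string,
# with rotation so the logs stay bounded on long-running nodes.
JVM_OPTS="$JVM_OPTS -Xloggc:${CASSANDRA_HOME}/logs/gc.log"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"   # total time threads were stopped at safepoints
JVM_OPTS="$JVM_OPTS -XX:+UseGCLogFileRotation"
JVM_OPTS="$JVM_OPTS -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M"
```

PrintGCApplicationStoppedTime is the flag most relevant to the safepoint question above, since it records the full stop-the-world duration rather than just the GC work itself.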
cassandra git commit: Fix post-9749 test failures
Repository: cassandra Updated Branches: refs/heads/cassandra-2.2 a6dd2b893 - 7a85c8b8f Fix post-9749 test failures patch by Branimir Lambov; reviewed by Ariel Weisberg for CASSANDRA-9749 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7a85c8b8 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7a85c8b8 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7a85c8b8 Branch: refs/heads/cassandra-2.2 Commit: 7a85c8b8fbf753858c4334c4249475e6bb1a24e4 Parents: a6dd2b8 Author: Branimir Lambov branimir.lam...@datastax.com Authored: Wed Aug 19 19:56:58 2015 +0300 Committer: Aleksey Yeschenko alek...@apache.org Committed: Thu Aug 20 21:25:33 2015 +0300 -- .../cassandra/db/commitlog/CommitLogTest.java | 102 ++- .../db/commitlog/CommitLogUpgradeTest.java | 38 --- 2 files changed, 78 insertions(+), 62 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/7a85c8b8/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java -- diff --git a/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java b/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java index da8058c..0ad880b 100644 --- a/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java +++ b/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java @@ -29,19 +29,18 @@ import java.nio.ByteBuffer; import java.util.HashMap; import java.util.Map; import java.util.UUID; -import java.util.concurrent.Callable; import java.util.concurrent.ExecutionException; import java.util.zip.CRC32; import java.util.zip.Checksum; import com.google.common.collect.ImmutableMap; + import org.junit.Assert; import org.junit.BeforeClass; import org.junit.Test; import org.apache.cassandra.SchemaLoader; import org.apache.cassandra.Util; -import org.apache.cassandra.config.Config.CommitFailurePolicy; import org.apache.cassandra.config.DatabaseDescriptor; import org.apache.cassandra.config.KSMetaData; import 
org.apache.cassandra.config.ParameterizedClass; @@ -63,8 +62,7 @@ import org.apache.cassandra.io.util.ByteBufferDataInput; import org.apache.cassandra.io.util.FileDataInput; import org.apache.cassandra.locator.SimpleStrategy; import org.apache.cassandra.net.MessagingService; -import org.apache.cassandra.utils.ByteBufferUtil; -import org.apache.cassandra.utils.FBUtilities; +import org.apache.cassandra.utils.*; import static org.apache.cassandra.utils.ByteBufferUtil.bytes; @@ -94,10 +92,15 @@ public class CommitLogTest CompactionManager.instance.disableAutoCompaction(); } -@Test(expected = CommitLogReplayException.class) +@Test public void testRecoveryWithEmptyLog() throws Exception { -CommitLog.instance.recover(new File[]{ tmpFile(CommitLogDescriptor.current_version) }); +runExpecting(new WrappedRunnable() { +public void runMayThrow() throws Exception +{ +CommitLog.instance.recover(new File[]{ tmpFile(CommitLogDescriptor.current_version) }); +} +}, CommitLogReplayException.class); } @Test @@ -119,10 +122,15 @@ public class CommitLogTest testRecoveryWithBadSizeArgument(100, 10); } -@Test(expected = CommitLogReplayException.class) +@Test public void testRecoveryWithShortSize() throws Exception { -testRecovery(new byte[2], CommitLogDescriptor.VERSION_20); +runExpecting(new WrappedRunnable() { +public void runMayThrow() throws Exception +{ +testRecovery(new byte[2], CommitLogDescriptor.VERSION_20); +} +}, CommitLogReplayException.class); } @Test @@ -146,10 +154,15 @@ public class CommitLogTest testRecovery(garbage, CommitLogDescriptor.current_version); } -@Test(expected = CommitLogReplayException.class) +@Test public void testRecoveryWithGarbageLog_fail() throws Exception { -testRecoveryWithGarbageLog(); +runExpecting(new WrappedRunnable() { +public void runMayThrow() throws Exception +{ +testRecoveryWithGarbageLog(); +} +}, CommitLogReplayException.class); } @Test @@ -164,18 +177,6 @@ public class CommitLogTest } @Test -public void 
testRecoveryWithGarbageLog_ignoredByPolicy() throws Exception -{ -CommitFailurePolicy existingPolicy = DatabaseDescriptor.getCommitFailurePolicy(); -try { - DatabaseDescriptor.setCommitFailurePolicy(CommitFailurePolicy.ignore); -testRecoveryWithGarbageLog(); -} finally { -
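The diff above swaps JUnit's @Test(expected = ...) for a runExpecting(...) helper, so that only the statement that should fail is wrapped in the expectation. A minimal sketch of how such a helper can work (names and structure here are illustrative, not the actual CommitLogTest code):

```java
// Sketch of a runExpecting-style helper: run a task and assert it throws
// the expected exception type. Unlike @Test(expected = ...), the assertion
// point is explicit, so setup/teardown code in the same test method cannot
// accidentally satisfy the expected-exception check.
public class RunExpectingSketch
{
    interface ThrowingRunnable
    {
        void run() throws Exception;
    }

    static void runExpecting(ThrowingRunnable task, Class<? extends Exception> expected)
    {
        try
        {
            task.run();
        }
        catch (Exception e)
        {
            if (expected.isInstance(e))
                return; // the failure we wanted
            throw new AssertionError("Expected " + expected.getSimpleName() + " but got " + e, e);
        }
        throw new AssertionError("Expected " + expected.getSimpleName() + " but nothing was thrown");
    }

    public static void main(String[] args)
    {
        // Passes: the task throws the expected type.
        runExpecting(() -> { throw new IllegalStateException("bad commit log segment"); },
                     IllegalStateException.class);
        System.out.println("ok");
    }
}
```

The narrower scope also makes it possible to run the same task under different commit failure policies and assert different outcomes, which is what the reworked tests in this patch do.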
[jira] [Commented] (CASSANDRA-9749) CommitLogReplayer continues startup after encountering errors
[ https://issues.apache.org/jira/browse/CASSANDRA-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705514#comment-14705514 ] Aleksey Yeschenko commented on CASSANDRA-9749: -- Committed to cassandra-2.2 as [7a85c8b8fbf753858c4334c4249475e6bb1a24e4|https://github.com/apache/cassandra/commit/7a85c8b8fbf753858c4334c4249475e6bb1a24e4] and merged --ours with 3.0 and trunk. Thanks. CommitLogReplayer continues startup after encountering errors - Key: CASSANDRA-9749 URL: https://issues.apache.org/jira/browse/CASSANDRA-9749 Project: Cassandra Issue Type: Bug Reporter: Blake Eggleston Assignee: Branimir Lambov Fix For: 2.2.1, 3.0 beta 1 Attachments: 9749-coverage.tgz -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10140) Enable GC logging by default
[ https://issues.apache.org/jira/browse/CASSANDRA-10140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705561#comment-14705561 ] Chris Lohfink commented on CASSANDRA-10140: --- Can we use cstar perf or something to measure the impact, if any? I don't see anything, but my benchmarks on my laptop are inconsistent at best. Enable GC logging by default Key: CASSANDRA-10140 URL: https://issues.apache.org/jira/browse/CASSANDRA-10140 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Chris Lohfink Assignee: Chris Lohfink Priority: Minor Attachments: CASSANDRA-10140.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10135) Quoting changed for username in GRANT statement
[ https://issues.apache.org/jira/browse/CASSANDRA-10135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704936#comment-14704936 ] Bernhard K. Weisshuhn commented on CASSANDRA-10135: --- Thanks for the explanation and quick response, Sam. Much appreciated! Quoting changed for username in GRANT statement --- Key: CASSANDRA-10135 URL: https://issues.apache.org/jira/browse/CASSANDRA-10135 Project: Cassandra Issue Type: Bug Components: API Environment: cassandra 2.2.0 Reporter: Bernhard K. Weisshuhn Assignee: Sam Tunnicliffe Priority: Minor Fix For: 2.2.1, 3.0 beta 2 We may have uncovered an undocumented api change between cassandra 2.1.x and 2.2.0. When granting permissions to a username containing special characters, 2.1.x needed single quotes around the username and refused doubles. 2.2.0 needs doubles and refuses singles. Working example for 2.1.x: {code:sql} GRANT SELECT ON ALL KEYSPACES TO 'vault-readonly-root-79840dbb-917e-ed90-38e0-578226e6c1c6-1440017797'; {code} Enclosing the username in double quotes instead of singles fails with the following error message: {quote} cassandra@cqlsh GRANT SELECT ON ALL KEYSPACES TO "vault-readonly-root-79840dbb-917e-ed90-38e0-578226e6c1c6-1440017797"; SyntaxException: ErrorMessage code=2000 [Syntax error in CQL query] message=line 1:33 mismatched input 'vault-readonly-root-79840dbb-917e-ed90-38e0-578226e6c1c6-1440017797' expecting set null (...SELECT ON ALL KEYSPACES TO [vault-readonly-root-79840dbb-917e-ed90-38e0-578226e6c1c6-144001779]...) {quote} Singles fail in 2.2.0: {quote} cassandra@cqlsh GRANT SELECT ON ALL KEYSPACES TO 'vault-readonly-root-e04e7a84-a7ba-d84f-f3c0-1e50e7590179-1440019308'; SyntaxException: ErrorMessage code=2000 [Syntax error in CQL query] message=line 1:33 no viable alternative at input 'vault-readonly-root-e04e7a84-a7ba-d84f-f3c0-1e50e7590179-1440019308' (...SELECT ON ALL KEYSPACES TO ['vault-readonly-root-e04e7a84-a7ba-d84f-f3c0-1e50e7590179-144001930]...) 
{quote} ... whereas double quotes succeed: {code:sql} GRANT SELECT ON ALL KEYSPACES TO "vault-readonly-root-e04e7a84-a7ba-d84f-f3c0-1e50e7590179-1440019308"; {code} If this is a deliberate change, I don't think it is reflected in the documentation. I am tempted to consider this a bug introduced with the role additions. Motivation for this report: https://github.com/hashicorp/vault/pull/545#issuecomment-132634630 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-10138) Millions of compaction tasks on empty DB
A Markov created CASSANDRA-10138: Summary: Millions of compaction tasks on empty DB Key: CASSANDRA-10138 URL: https://issues.apache.org/jira/browse/CASSANDRA-10138 Project: Cassandra Issue Type: Bug Environment: CentOS 6.5 and Cassandra 2.1.8 Reporter: A Markov
A fresh installation of Cassandra 2.1.8 with no data in the database except system tables becomes unresponsive about 5-10 minutes after start. The problem was initially discovered on an empty cluster of 12 nodes because of a schema creation error - the script was exiting by timeout with an error. Analysis of the log files showed that nodes were constantly reported as DOWN and then, after some period of time, UP; that was reported for multiple nodes. The system.log file showed that nodes constantly perform GC, and while doing that all cores of the system were 100% busy, which caused node disconnects after some time. Further analysis with nodetool tpstats showed that just 10 minutes after a clean restart, a node had completed more than 47M compaction tasks and had more than 12M pending. Here is an example of the output:

nodetool tpstats
Pool Name                  Active   Pending   Completed   Blocked   All time blocked
CounterMutationStage            0         0           0         0                  0
ReadStage                       0         0           0         0                  0
RequestResponseStage            0         0           0         0                  0
MutationStage                   0         0         257         0                  0
ReadRepairStage                 0         0           0         0                  0
GossipStage                     0         0           0         0                  0
CacheCleanupExecutor            0         0           0         0                  0
MigrationStage                  0         0           0         0                  0
ValidationExecutor              0         0           0         0                  0
Sampler                         0         0           0         0                  0
MemtableReclaimMemory           0         0           8         0                  0
InternalResponseStage           0         0           0         0                  0
AntiEntropyStage                0         0           0         0                  0
MiscStage                       0         0           0         0                  0
CommitLogArchiver               0         0           0         0                  0
MemtableFlushWriter             0         0           8         0                  0
PendingRangeCalculator          0         0           1         0                  0
MemtablePostFlush               0         0          44         0                  0
CompactionExecutor              0  12996398    47578625         0                  0
AntiEntropySessions             0         0           0         0                  0
HintedHandoff                   0         1           2         0                  0

I am repeating myself, but that was on a TOTALLY EMPTY DB 10 minutes after cassandra was started. I was able to repeatedly reproduce the same issue and behaviour with a single cassandra instance. The issue persisted after I did a full cassandra wipe and reinstall from the repository. I discovered that the issue disappears if I execute nodetool disableautocompaction; in that case the system quickly (in 20-30 seconds) goes through all pending tasks and becomes idle. If I enable autocompaction again, in about 1 minute it jumps to millions of pending tasks again. I verified on the same server with Cassandra 2.1.6 and the issue was not present. The log files do not show any ERROR messages; there were only warnings about GC events taking too long. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10135) Quoting changed for username in GRANT statement
[ https://issues.apache.org/jira/browse/CASSANDRA-10135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704818#comment-14704818 ] Sam Tunnicliffe commented on CASSANDRA-10135: - This is mostly discussed in CASSANDRA-8850; tl;dr: the initial proposal was to make role names quoted strings, but in the end it was decided to stick with identifiers, as had previously been the case for user names. 2.1 and 2.2 define a user/role name as either an identifier - an unquoted, case-insensitive string which matches the rule {{LETTER (LETTER | DIGIT | '_')*}} - or as a string literal, which is single-quoted and not case-sensitive. From 2.2, string literals also support the extended postgres syntax (CASSANDRA-7769). There is a divergence between user and role names in the 2.2 CQL grammar, though; roles are defined as identifiers, unreserved keywords, or quoted names. Quoted names are much like the 2.1 string literals, except they are double-quoted and, in this usage, case-sensitive. They do not support the postgres {{$$}} syntax. I've pushed a branch [here|https://github.com/apache/cassandra/compare/cassandra-2.2...beobal:10135-2.2] which adds string literal as a supported production for role names, along with some additional tests which exercise the various syntax options. To preserve backwards compatibility, a quoted string role name (either singly quoted or using pg syntax) is *not* case-sensitive. Quoting changed for username in GRANT statement --- Key: CASSANDRA-10135 URL: https://issues.apache.org/jira/browse/CASSANDRA-10135 Project: Cassandra Issue Type: Bug Components: API Environment: cassandra 2.2.0 Reporter: Bernhard K. Weisshuhn Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332)
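The three role-name forms Sam describes can be illustrated side by side. This is a sketch against the 2.2 grammar as summarized above; the role names themselves are made up:

```sql
-- 1. Identifier: unquoted, case-insensitive; must match LETTER (LETTER | DIGIT | '_')*
GRANT SELECT ON ALL KEYSPACES TO vault_readonly;

-- 2. Quoted name: double-quoted, case-sensitive; allows '-' and other special characters
GRANT SELECT ON ALL KEYSPACES TO "vault-readonly-role";

-- 3. String literal (re-enabled for roles by the 10135 branch): single-quoted,
--    and for backwards compatibility NOT case-sensitive
GRANT SELECT ON ALL KEYSPACES TO 'vault-readonly-role';
```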
[jira] [Commented] (CASSANDRA-9623) Added column does not sort as the last column
[ https://issues.apache.org/jira/browse/CASSANDRA-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704816#comment-14704816 ] Erik Forsberg commented on CASSANDRA-9623: -- Well, in CASSANDRA-9450 we're seeing a similar traceback (it ends on the same two functions/source code lines) *during requests*. Are you saying that there could be fixes in 2.0.15/2.0.16 that fix CASSANDRA-9450 as well? Added column does not sort as the last column - Key: CASSANDRA-9623 URL: https://issues.apache.org/jira/browse/CASSANDRA-9623 Project: Cassandra Issue Type: Bug Reporter: Marcin Pietraszek Assignee: Marcus Eriksson Attachments: cassandra_log.txt After adding new machines to the existing cluster, running cleanup on one of the tables ends with:
{noformat}
ERROR [CompactionExecutor:1015] 2015-06-19 11:24:05,038 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:1015,1,main]
java.lang.AssertionError: Added column does not sort as the last column
    at org.apache.cassandra.db.ArrayBackedSortedColumns.addColumn(ArrayBackedSortedColumns.java:116)
    at org.apache.cassandra.db.ColumnFamily.addColumn(ColumnFamily.java:121)
    at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:155)
    at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:186)
    at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:98)
    at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:85)
    at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:196)
    at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:74)
    at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:55)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:115)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:98)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
    at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:161)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
    at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
    at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
{noformat}
We're using patched 2.0.13-190ef4f -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9623) Added column does not sort as the last column
[ https://issues.apache.org/jira/browse/CASSANDRA-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704821#comment-14704821 ] Marcus Eriksson commented on CASSANDRA-9623: probably not, so either this is a duplicate of some LCS ticket or CASSANDRA-9450 Added column does not sort as the last column - Key: CASSANDRA-9623 URL: https://issues.apache.org/jira/browse/CASSANDRA-9623 Project: Cassandra Issue Type: Bug Reporter: Marcin Pietraszek Assignee: Marcus Eriksson Attachments: cassandra_log.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-10137) Consistency problem
[ https://issues.apache.org/jira/browse/CASSANDRA-10137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-10137. Resolution: Invalid Consistency problem --- Key: CASSANDRA-10137 URL: https://issues.apache.org/jira/browse/CASSANDRA-10137 Project: Cassandra Issue Type: Bug Reporter: Sergey I have 2 DCs and 3 nodes: dc1: 2 nodes; dc2: 1 node. The keyspace exists:
CREATE KEYSPACE itm_dhcp_test WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '2', 'DC2': '1'} AND durable_writes = true;
and the CF:
CREATE TABLE itm_dhcp_test.lock ( name text PRIMARY KEY, reason text, time timestamp, who text ) WITH bloom_filter_fp_chance = 0.01 AND caching = '{keys:ALL, rows_per_partition:NONE}' AND comment = '' AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE';
Periodically there is a problem with deleting records. For example, execute the queries:
INSERT INTO lock (name, reason, time, who) values ('unitTest4', 'CassandraClusterLockTest', dateof(now()), 'I') IF NOT EXISTS USING TTL 60
SELECT * FROM lock WHERE name='unitTest4'
DELETE FROM lock WHERE name='unitTest4'
SELECT * FROM lock WHERE name='unitTest4'
In 20-30% of cases the last SELECT returns a non-empty record, most often when the coordinator is node1.dc2. 
In the trace I see:
| Parsing DELETE FROM lock WHERE name='unitTest4' | node1.dc2 | 45 | SharedPool-Worker-3 |
| Preparing statement | node1.dc2 | 151 | SharedPool-Worker-3 |
| Executing single-partition query on users | node1.dc2 | 588 | SharedPool-Worker-1 |
| Acquiring sstable references | node1.dc2 | 601 | SharedPool-Worker-1 |
| Merging memtable tombstones | node1.dc2 | 634 | SharedPool-Worker-1 |
| Key cache hit for sstable 2 | node1.dc2 | 668 | SharedPool-Worker-1 |
| Seeking to partition beginning in data file | node1.dc2 | 674 | SharedPool-Worker-1 |
| Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones | node1.dc2 | 737 | SharedPool-Worker-1 |
| Merging data from memtables and 1 sstables | node1.dc2 | 743 | SharedPool-Worker-1 |
| Read 1 live and 0 tombstoned cells | node1.dc2 | 795 | SharedPool-Worker-1 |
| Executing single-partition query on permissions | node1.dc2 | 1653 | SharedPool-Worker-1 |
| Acquiring sstable references | node1.dc2 | 1662 | SharedPool-Worker-1 |
| Merging memtable tombstones | node1.dc2 | 1690 | SharedPool-Worker-1 |
| Key cache hit for sstable 5 | node1.dc2 | 1737 | SharedPool-Worker-1 |
| Seeking to partition indexed section in data file | node1.dc2 | 1742 | SharedPool-Worker-1 |
| Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones | node1.dc2 | 1797 | SharedPool-Worker-1 |
| Merging data from memtables and 1 sstables | node1.dc2 | 1805 | SharedPool-Worker-1 |
| Read 0 live and 0 tombstoned cells | node1.dc2 | 1819 | SharedPool-Worker-1 |
| Executing single-partition query on users | node1.dc2 | 2798 | SharedPool-Worker-4 |
| Acquiring sstable references | node1.dc2 | 2808 | SharedPool-Worker-4 |
| Merging memtable tombstones | node1.dc2 | 2851 | SharedPool-Worker-4 |
| Key cache hit for sstable 2 | node1.dc2 | 2896 | SharedPool-Worker-4 |
| Seeking to partition beginning in data file | node1.dc2 | 2903 | SharedPool-Worker-4 |
| Skipped 0/1 non-slice-intersecting sstables, included 0 due
[jira] [Commented] (CASSANDRA-10138) Millions of compaction tasks on empty DB
[ https://issues.apache.org/jira/browse/CASSANDRA-10138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704808#comment-14704808 ] Marcus Eriksson commented on CASSANDRA-10138:
The many pending compaction tasks were fixed in CASSANDRA-9662. The other issues I cannot explain, though; could you attach logs and more details about your nodes?

Millions of compaction tasks on empty DB
---
Key: CASSANDRA-10138
URL: https://issues.apache.org/jira/browse/CASSANDRA-10138
Project: Cassandra
Issue Type: Bug
Environment: CentOS 6.5 and Cassandra 2.1.8
Reporter: A Markov

A fresh installation of Cassandra 2.1.8, with no data in the database except the system tables, becomes unresponsive about 5-10 minutes after start. The problem was initially discovered on an empty 12-node cluster through a schema-creation error: the script exited with a timeout. Analysis of the log files showed nodes constantly being reported as DOWN and then, after some period of time, UP, for multiple nodes. The system.log showed that the nodes were constantly performing GC, during which all cores were 100% busy, which eventually caused the node to disconnect. Further analysis with nodetool tpstats showed that just 10 minutes after a clean restart a node had completed more than 47M compaction tasks and had more than 12M pending.
Here is an example of the output:
{code}
nodetool tpstats
Pool Name               Active   Pending    Completed  Blocked  All time blocked
CounterMutationStage         0         0            0        0                 0
ReadStage                    0         0            0        0                 0
RequestResponseStage         0         0            0        0                 0
MutationStage                0         0          257        0                 0
ReadRepairStage              0         0            0        0                 0
GossipStage                  0         0            0        0                 0
CacheCleanupExecutor         0         0            0        0                 0
MigrationStage               0         0            0        0                 0
ValidationExecutor           0         0            0        0                 0
Sampler                      0         0            0        0                 0
MemtableReclaimMemory        0         0            8        0                 0
InternalResponseStage        0         0            0        0                 0
AntiEntropyStage             0         0            0        0                 0
MiscStage                    0         0            0        0                 0
CommitLogArchiver            0         0            0        0                 0
MemtableFlushWriter          0         0            8        0                 0
PendingRangeCalculator       0         0            1        0                 0
MemtablePostFlush            0         0           44        0                 0
CompactionExecutor           0  12996398     47578625        0                 0
AntiEntropySessions          0         0            0        0                 0
HintedHandoff                0         1            2        0                 0
{code}
I am repeating myself, but that was on a TOTALLY EMPTY DB 10 minutes after Cassandra was started. I was able to repeatedly reproduce the same issue and behaviour with a single Cassandra instance, and the issue persisted after I did a full Cassandra wipe and reinstall from the repository. I discovered that the issue disappears if I execute nodetool disableautocompaction; in that case the system quickly (in a matter of 20-30 seconds) goes through all pending tasks and becomes idle. If I enable autocompaction again, it jumps back to millions of pending tasks within about 1 minute. I verified Cassandra 2.1.6 on the same server and the issue was not present. The log files do not show any ERROR messages; there were only warnings about GC events taking too long.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
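The workaround described in the report can be expressed as the following nodetool invocations. This is a sketch of the reporter's procedure, not a fix; the host flag is an illustrative assumption, and disableautocompaction/enableautocompaction are the commands named in the report:

```shell
# Stop scheduling new compaction tasks; per the report, the existing
# pending queue then drains in roughly 20-30 seconds
nodetool -h 127.0.0.1 disableautocompaction

# Watch the CompactionExecutor pending count fall back to zero
nodetool -h 127.0.0.1 tpstats | grep CompactionExecutor

# Re-enabling reproduces the runaway pending tasks on 2.1.8
nodetool -h 127.0.0.1 enableautocompaction
```

On the fixed versions (see CASSANDRA-9662) re-enabling autocompaction should not cause the pending count to explode.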
[jira] [Assigned] (CASSANDRA-10135) Quoting changed for username in GRANT statement
[ https://issues.apache.org/jira/browse/CASSANDRA-10135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe reassigned CASSANDRA-10135:
---
Assignee: Sam Tunnicliffe

Quoting changed for username in GRANT statement
---
Key: CASSANDRA-10135
URL: https://issues.apache.org/jira/browse/CASSANDRA-10135
Project: Cassandra
Issue Type: Bug
Components: API
Environment: cassandra 2.2.0
Reporter: Bernhard K. Weisshuhn
Assignee: Sam Tunnicliffe
Priority: Minor
Fix For: 2.2.1, 3.0 beta 2

We may have uncovered an undocumented API change between Cassandra 2.1.x and 2.2.0. When granting permissions to a username containing special characters, 2.1.x required single quotes around the username and refused double quotes; 2.2.0 requires double quotes and refuses single quotes.

Working example for 2.1.x:
{code:sql}
GRANT SELECT ON ALL KEYSPACES TO 'vault-readonly-root-79840dbb-917e-ed90-38e0-578226e6c1c6-1440017797';
{code}
Enclosing the username in double quotes instead of singles fails with the following error message:
{quote}
cassandra@cqlsh GRANT SELECT ON ALL KEYSPACES TO "vault-readonly-root-79840dbb-917e-ed90-38e0-578226e6c1c6-1440017797";
SyntaxException: ErrorMessage code=2000 [Syntax error in CQL query] message=line 1:33 mismatched input 'vault-readonly-root-79840dbb-917e-ed90-38e0-578226e6c1c6-1440017797' expecting set null (...SELECT ON ALL KEYSPACES TO [vault-readonly-root-79840dbb-917e-ed90-38e0-578226e6c1c6-144001779]...)
{quote}
Single quotes fail in 2.2.0:
{quote}
cassandra@cqlsh GRANT SELECT ON ALL KEYSPACES TO 'vault-readonly-root-e04e7a84-a7ba-d84f-f3c0-1e50e7590179-1440019308';
SyntaxException: ErrorMessage code=2000 [Syntax error in CQL query] message=line 1:33 no viable alternative at input 'vault-readonly-root-e04e7a84-a7ba-d84f-f3c0-1e50e7590179-1440019308' (...SELECT ON ALL KEYSPACES TO ['vault-readonly-root-e04e7a84-a7ba-d84f-f3c0-1e50e7590179-144001930]...)
{quote}
...
whereas double quotes succeed:
{code:sql}
GRANT SELECT ON ALL KEYSPACES TO "vault-readonly-root-e04e7a84-a7ba-d84f-f3c0-1e50e7590179-1440019308";
{code}
If this is a deliberate change, I don't think it is reflected in the documentation. I am tempted to consider this a bug introduced with the role additions.
Motivation for this report: https://github.com/hashicorp/vault/pull/545#issuecomment-132634630
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10137) Consistency problem
[ https://issues.apache.org/jira/browse/CASSANDRA-10137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704827#comment-14704827 ] Sergey commented on CASSANDRA-10137:
So with frequent INSERTs and DELETEs, do I need to use lightweight transactions, i.e. the IF NOT EXISTS / IF EXISTS constructions?

Consistency problem
---
Key: CASSANDRA-10137
URL: https://issues.apache.org/jira/browse/CASSANDRA-10137
Project: Cassandra
Issue Type: Bug
Reporter: Sergey
[jira] [Commented] (CASSANDRA-10137) Consistency problem
[ https://issues.apache.org/jira/browse/CASSANDRA-10137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704833#comment-14704833 ] Benedict commented on CASSANDRA-10137:
You have used LWT by using IF NOT EXISTS with your insert. As a result, to be certain your DELETE will apply, you must use IF EXISTS (or some other condition). Any cell should only be updated by either LWT or regular updates; if you mix the two, you may see non-LWT updates disappear.

Consistency problem
---
Key: CASSANDRA-10137
URL: https://issues.apache.org/jira/browse/CASSANDRA-10137
Project: Cassandra
Issue Type: Bug
Reporter: Sergey
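Benedict's rule above implies the reporter's DELETE needs its own condition so it travels through the same Paxos path as the insert. A sketch of the adjusted lock sequence, reusing the table and values from the report:

{code:sql}
-- acquire: LWT insert, auto-expiring after 60 seconds
INSERT INTO itm_dhcp_test.lock (name, reason, time, who)
VALUES ('unitTest4', 'CassandraClusterLockTest', dateof(now()), 'I')
IF NOT EXISTS USING TTL 60;

-- release: also LWT, so the tombstone is serialized with the insert
-- rather than racing it on the normal write path
DELETE FROM itm_dhcp_test.lock WHERE name = 'unitTest4' IF EXISTS;
{code}

With both statements conditional, the sequence INSERT / SELECT / DELETE / SELECT should no longer leave a live record behind 20-30% of the time.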