[jira] [Updated] (CASSANDRA-8098) Allow CqlInputFormat to be restricted to more than one data-center
[ https://issues.apache.org/jira/browse/CASSANDRA-8098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mck updated CASSANDRA-8098:
---
Description:

Today, using CqlInputFormat, it's only possible to
- enforce data-locality to one specific data-center, or
- disable it by changing CL from LOCAL_ONE to ONE.

We need a way to enforce data-locality to specific *data-centers*, and would like to contribute a solution. Suggested ideas:
- CqlInputFormat (gently) calls describeLocalRing against all the listed connection addresses and merges the results into one masterRangeNodes list, or
- change the signature of describeLocalRing(..) to describeRings(String keyspace, String[] dc) and have the job specify which DCs it will be running within.

*Long description*

A lot has changed since CASSANDRA-2388 that has made integrating c* and hadoop a lot easier, for example: CqlInputFormat, CL.LOCAL_ONE, LimitedLocalNodeFirstLocalBalancingPolicy, vnodes, and describe_local_ring. When using CqlInputFormat, if you don't want to be stuck within datacenter-locality you can, for example, change the consistency level from LOCAL_ONE to ONE.

That's great, but describe_local_ring + CL.LOCAL_ONE in its current implementation isn't enough for us. We have multiple datacenters for offline and multiple for online, because we still want the availability advantages that come from aligning virtual datacenters to physical datacenters for the offline stuff too. That is, using hadoop for aggregation purposes on top of c* doesn't always imply one can settle for a CP solution.

Some of our jobs have their own InputFormat implementation that uses describe_ring, LOCAL_ONE, and data with replica only in the offline datacenters. Works very well, except the last point kinda sucks because we have online clients that want to read this data and then have to do so through nodes in the offline datacenters. Underlying performance improvements, e.g. cross_node_timeout and speculative requests, have helped, but there's still the need to separate online and offline. If we wanted to push replica out onto the online nodes, I think the best approach for us is to filter out those splits/locations in getRangeMap(..).

Back to this issue: we also have jobs using CqlInputFormat. Specifying multiple client input addresses doesn't help take advantage of the multiple offline datacenters because the Cassandra.Client only makes one call to describe_local_ring, and StorageService.describeLocalRing(..) only checks against its own address. It would work to have either
a) CqlInputFormat call describeLocalRing against all the listed connection addresses and merge the results into one masterRangeNodes list, or
b) something along the lines of changing the signature of describeLocalRing(..) to describeRings(String keyspace, String[] dc) and having the job specify which DCs it will be running within.
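Option (a) above can be sketched roughly as follows. This is illustrative only, not the actual CqlInputFormat code; the names RangeReplicas and MultiDcRingMerger are placeholders, and the ring descriptions are modeled simply as maps from a token range to its replica endpoints:

```java
import java.util.*;

public class MultiDcRingMerger
{
    // one token range and the replica endpoints that own it
    public static class RangeReplicas
    {
        final String range;
        final Set<String> endpoints = new LinkedHashSet<>();
        RangeReplicas(String range) { this.range = range; }
    }

    // merge per-DC ring descriptions (as returned by a describe_local_ring
    // call against each listed connection address) into one masterRangeNodes
    // map, unioning endpoints when the same range appears in several DCs
    public static Map<String, RangeReplicas> merge(List<Map<String, List<String>>> perDcRings)
    {
        Map<String, RangeReplicas> master = new LinkedHashMap<>();
        for (Map<String, List<String>> ring : perDcRings)
        {
            for (Map.Entry<String, List<String>> e : ring.entrySet())
            {
                master.computeIfAbsent(e.getKey(), RangeReplicas::new)
                      .endpoints.addAll(e.getValue());
            }
        }
        return master;
    }
}
```

A split for a range owned in two offline DCs would then carry replica locations from both, letting the scheduler place tasks in either.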
[jira] [Created] (CASSANDRA-8109) Avoid constant boxing in ColumnStats.{Min/Max}Tracker
Sylvain Lebresne created CASSANDRA-8109:
---
Summary: Avoid constant boxing in ColumnStats.{Min/Max}Tracker
Key: CASSANDRA-8109
URL: https://issues.apache.org/jira/browse/CASSANDRA-8109
Project: Cassandra
Issue Type: Improvement
Reporter: Sylvain Lebresne
Priority: Minor
Fix For: 3.0

We use the {{ColumnStats.MinTracker}} and {{ColumnStats.MaxTracker}} to track timestamps and deletion times in sstables. Those classes are generic, but we only ever use them for longs and integers. The consequence is that every call to their {{update}} method (called for every cell during sstable write) boxes its argument (since we don't store the cell timestamps and deletion times boxed). That feels like a waste that is easy to fix: we could, for instance, just make those work on longs only and convert back to int at the end when that's what we need.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
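The suggested fix can be sketched like this (not the actual Cassandra class; the name LongMinMaxTracker is illustrative): track min/max on primitive longs so the per-cell update() never boxes, and narrow to int only at the end when an int is what's stored.

```java
public class LongMinMaxTracker
{
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;

    // called for every cell during sstable write; primitive argument, no boxing
    public void update(long value)
    {
        if (value < min) min = value;
        if (value > max) max = value;
    }

    public long min() { return min; }
    public long max() { return max; }

    // convert back to int when that's what we need (e.g. local deletion times)
    public int maxAsInt() { return (int) max; }
}
```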
[jira] [Updated] (CASSANDRA-8109) Avoid constant boxing in ColumnStats.{Min/Max}Tracker
[ https://issues.apache.org/jira/browse/CASSANDRA-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-8109:
Labels: lhf (was: )
git commit: Keep sstable level when bootstrapping
Repository: cassandra
Updated Branches: refs/heads/trunk e473769fb -> 0de0b8c03

Keep sstable level when bootstrapping

Patch by marcuse; reviewed by iamaleksey for CASSANDRA-7460

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0de0b8c0
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0de0b8c0
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0de0b8c0

Branch: refs/heads/trunk
Commit: 0de0b8c0372e825e834b1ffd9685d3db87d21378
Parents: e473769
Author: Marcus Eriksson marc...@apache.org
Authored: Tue Oct 7 07:35:53 2014 +0200
Committer: Marcus Eriksson marc...@apache.org
Committed: Mon Oct 13 11:24:00 2014 +0200
--
 CHANGES.txt                                          |  1 +
 .../cassandra/db/compaction/LeveledManifest.java     | 14 ++
 .../org/apache/cassandra/dht/RangeStreamer.java      |  8 ++--
 .../apache/cassandra/io/sstable/SSTableLoader.java   |  2 +-
 .../apache/cassandra/io/sstable/SSTableWriter.java   | 10 --
 .../cassandra/net/IncomingStreamingConnection.java   |  2 +-
 .../org/apache/cassandra/repair/LocalSyncTask.java   |  2 +-
 .../cassandra/repair/StreamingRepairTask.java        |  2 +-
 .../cassandra/streaming/ConnectionHandler.java       |  3 ++-
 .../cassandra/streaming/StreamCoordinator.java       |  8 +---
 .../org/apache/cassandra/streaming/StreamPlan.java   | 11 ---
 .../apache/cassandra/streaming/StreamReader.java     |  7 ---
 .../cassandra/streaming/StreamResultFuture.java      |  9 +
 .../apache/cassandra/streaming/StreamSession.java    |  9 -
 .../cassandra/streaming/StreamTransferTask.java      |  2 +-
 .../streaming/messages/FileMessageHeader.java        | 11 +--
 .../streaming/messages/OutgoingFileMessage.java      |  5 +++--
 .../streaming/messages/StreamInitMessage.java        |  9 +++--
 .../cassandra/streaming/messages/StreamMessage.java  |  2 +-
 .../org/apache/cassandra/tools/SSTableImport.java    |  4 ++--
 .../apache/cassandra/io/sstable/SSTableUtils.java    |  2 +-
 .../cassandra/streaming/StreamTransferTaskTest.java  |  2 +-
 .../apache/cassandra/tools/SSTableExportTest.java    | 16
 23 files changed, 94 insertions(+), 47 deletions(-)
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/0de0b8c0/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index f602c0e..b6a3766 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0
+ * Keep sstable levels when bootstrapping (CASSANDRA-7460)
  * Add Sigar library and perform basic OS settings check on startup (CASSANDRA-7838)
  * Support for scripting languages in user-defined functions (CASSANDRA-7526)
  * Support for aggregation functions (CASSANDRA-4914)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/0de0b8c0/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java b/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java
index a0836a8..6d3bf69 100644
--- a/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java
+++ b/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java
@@ -38,6 +38,7 @@
 import org.apache.cassandra.dht.Bounds;
 import org.apache.cassandra.dht.Range;
 import org.apache.cassandra.dht.Token;
 import org.apache.cassandra.io.sstable.*;
+import org.apache.cassandra.service.StorageService;
 import org.apache.cassandra.utils.Pair;

 public class LeveledManifest
@@ -330,6 +331,19 @@
                 return new CompactionCandidate(unrepairedMostInterresting, 0, Long.MAX_VALUE);
             }
         }
+
+        // during bootstrap we only do size tiering in L0 to make sure
+        // the streamed files can be placed in their original levels
+        if (StorageService.instance.isBootstrapMode())
+        {
+            List<SSTableReader> mostInteresting = getSSTablesForSTCS(getLevel(0));
+            if (!mostInteresting.isEmpty())
+            {
+                logger.info("Bootstrapping - doing STCS in L0");
+                return new CompactionCandidate(mostInteresting, 0, Long.MAX_VALUE);
+            }
+            return null;
+        }

         // LevelDB gives each level a score of how much data it contains vs its ideal amount, and
         // compacts the level with the highest score. But this falls apart spectacularly once you
         // get behind. Consider this set of levels:

http://git-wip-us.apache.org/repos/asf/cassandra/blob/0de0b8c0/src/java/org/apache/cassandra/dht/RangeStreamer.java
--
diff --git
[jira] [Created] (CASSANDRA-8110) Make streaming backwards compatible
Marcus Eriksson created CASSANDRA-8110: -- Summary: Make streaming backwards compatible Key: CASSANDRA-8110 URL: https://issues.apache.org/jira/browse/CASSANDRA-8110 Project: Cassandra Issue Type: Improvement Reporter: Marcus Eriksson Fix For: 3.0 To be able to seamlessly upgrade clusters we need to make it possible to stream files between nodes with different StreamMessage.CURRENT_VERSION -- This message was sent by Atlassian JIRA (v6.3.4#6332)
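The compatibility being asked for can be sketched as a simple version negotiation (illustrative only, not the real streaming protocol; the class and method names are placeholders): two nodes with different StreamMessage.CURRENT_VERSION values agree to speak the highest version both support.

```java
public class StreamVersionNegotiation
{
    // each side advertises its CURRENT_VERSION; the session then serializes
    // all stream messages at the minimum of the two, so a newer node can
    // still stream to/from an older one during a rolling upgrade
    public static int negotiate(int localVersion, int peerVersion)
    {
        return Math.min(localVersion, peerVersion);
    }
}
```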
[jira] [Commented] (CASSANDRA-6602) Compaction improvements to optimize time series data
[ https://issues.apache.org/jira/browse/CASSANDRA-6602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169137#comment-14169137 ] Marcus Eriksson commented on CASSANDRA-6602:
[~Bj0rn] I think the resolution in time_unit is a bit too high, wdyt about making it in minutes instead? And perhaps renaming it to something along the lines of 'base_time_minutes'? (Or do you have a better suggestion?) A 'time unit' to me is usually second/minute/etc.

Compaction improvements to optimize time series data
---
Key: CASSANDRA-6602
URL: https://issues.apache.org/jira/browse/CASSANDRA-6602
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: Tupshin Harper
Assignee: Björn Hegerfors
Labels: compaction, performance
Fix For: 2.0.11
Attachments: 1 week.txt, 8 weeks.txt, STCS 16 hours.txt, TimestampViewer.java, cassandra-2.0-CASSANDRA-6602-DateTieredCompactionStrategy.txt, cassandra-2.0-CASSANDRA-6602-DateTieredCompactionStrategy_v2.txt, cassandra-2.0-CASSANDRA-6602-DateTieredCompactionStrategy_v3.txt

There are some unique characteristics of many/most time series use cases that both provide challenges and provide unique opportunities for optimizations. One of the major challenges is compaction. The existing compaction strategies will tend to re-compact data on disk at least a few times over the lifespan of each data point, greatly increasing the cpu and IO costs of that write.

Compaction exists to:
1) ensure that there aren't too many files on disk
2) ensure that data that should be contiguous (part of the same partition) is laid out contiguously
3) delete data due to ttls or tombstones

The special characteristics of time series data allow us to optimize away all three. Time series data:
1) tends to be delivered in time order, with relatively constrained exceptions
2) often has a pre-determined and fixed expiration date
3) never gets deleted prior to TTL
4) has relatively predictable ingestion rates

Note that I filed CASSANDRA-5561, and this ticket potentially replaces or lowers the need for it. In that ticket, jbellis reasonably asks how that compaction strategy is better than disabling compaction. Taking that to heart, here is a compaction-strategy-less approach that could be extremely efficient for time-series use cases that follow the above pattern. (For context, I'm thinking of an example use case involving lots of streams of time-series data with a 5GB per day ingestion rate and a 1000 day retention with TTL, resulting in an eventual steady state of 5TB per node.)

1) You have an extremely large memtable (preferably off heap, if/when doable) for the table, and that memtable is sized to be able to hold a lengthy window of time. A typical period might be one day. At the end of that period, you flush the contents of the memtable to an sstable and move to the next one. This is basically identical to current behaviour, but with thresholds adjusted so that you can ensure flushing at predictable intervals. (Open question: are predictable intervals actually necessary, or is just waiting until the huge memtable is nearly full sufficient?)
2) Combine the behaviour with CASSANDRA-5228 so that sstables will be efficiently dropped once all of the columns have expired. (Another side note: it might be valuable to have a modified version of CASSANDRA-3974 that doesn't bother storing per-column TTL, since it is required that all columns have the same TTL.)
3) Be able to mark column families as read/write only (no explicit deletes), so no tombstones.
4) Optionally add back an additional type of delete that would delete all data earlier than a particular timestamp, resulting in immediate dropping of obsoleted sstables.

The result is that for in-order delivered data, every cell will be laid out optimally on disk on the first pass, and over the course of 1000 days and 5TB of data there will only be 1000 5GB sstables, so the number of filehandles will be reasonable.

For exceptions (out-of-order delivery), most cases will be caught by the extended (24 hour+) memtable flush times and merged correctly automatically. For those that were slightly askew at flush time, or were delivered so far out of order that they go in the wrong sstable, there is relatively low overhead to reading from two sstables for a time slice instead of one, and that overhead would be incurred relatively rarely unless out-of-order delivery was the common case, in which case this strategy should not be used. Another possible optimization to address out-of-order delivery would be to maintain more than one time-centric memtable in memory at a time (e.g. two 12 hour ones), and then you always insert into whichever one of the two owns the appropriate range of time. By delaying flushing the ahead one until we are ready to roll
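The time-window grouping idea under discussion can be sketched as follows. This is illustrative only, not the attached DateTieredCompactionStrategy patch; it just shows how sstables could be bucketed by which window their minimum timestamp falls into, so that same-window sstables become compaction candidates together:

```java
import java.util.*;

public class TimeWindowBuckets
{
    // windowSize is in the same unit as the timestamps (the 'time unit' /
    // 'base_time_minutes' knob being debated in the comments above)
    public static Map<Long, List<Long>> bucketByWindow(List<Long> minTimestamps, long windowSize)
    {
        Map<Long, List<Long>> buckets = new TreeMap<>();
        for (long ts : minTimestamps)
        {
            long window = ts / windowSize; // floor to the window index
            buckets.computeIfAbsent(window, w -> new ArrayList<>()).add(ts);
        }
        return buckets;
    }
}
```

With in-order delivery each flush lands in the current window, so old windows stabilize and never need re-compacting.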
[jira] [Commented] (CASSANDRA-8021) Improve cqlsh autocomplete for alter keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-8021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169138#comment-14169138 ] Rajanarayanan Thottuvaikkatumana commented on CASSANDRA-8021:
---
Should I go ahead and generate the patch for the 2.0.11 branch as well? One question about the same changes for the trunk: is the trunk update going to be taken care of by the merges from the other release branches such as 2.1.1, or do we have to generate a patch like this for the trunk as well? Please clarify. Thanks

Improve cqlsh autocomplete for alter keyspace
---
Key: CASSANDRA-8021
URL: https://issues.apache.org/jira/browse/CASSANDRA-8021
Project: Cassandra
Issue Type: Improvement
Reporter: Philip Thompson
Assignee: Rajanarayanan Thottuvaikkatumana
Priority: Minor
Labels: cqlsh, lhf
Fix For: 2.0.11, 2.1.1
Attachments: cassandra-2.1.1-8021.txt

Cqlsh autocomplete stops giving suggestions for the statement {code}ALTER KEYSPACE k WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };{code} after the word WITH.
[jira] [Commented] (CASSANDRA-6602) Compaction improvements to optimize time series data
[ https://issues.apache.org/jira/browse/CASSANDRA-6602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169173#comment-14169173 ] Marcus Eriksson commented on CASSANDRA-6602:
bq. wdyt about making it in minutes instead?
or maybe not, since people can use non microsecond timestamps...

Compaction improvements to optimize time series data
---
Key: CASSANDRA-6602
URL: https://issues.apache.org/jira/browse/CASSANDRA-6602
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: Tupshin Harper
Assignee: Björn Hegerfors
[jira] [Resolved] (CASSANDRA-8089) Invalid tombstone warnings / exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-8089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew S resolved CASSANDRA-8089.
---
Resolution: Not a Problem

Thank you for the quick feedback! It was our own mistake: we were inserting data into Cassandra setting some columns to null, which is why we were seeing tombstone warnings. Closing ticket.

Invalid tombstone warnings / exceptions
---
Key: CASSANDRA-8089
URL: https://issues.apache.org/jira/browse/CASSANDRA-8089
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: Cassandra 2.1.0, Debian 7.6, 3.2.0-4-amd64 GNU/Linux, java version 1.7.0_51, Java(TM) SE Runtime Environment (build 1.7.0_51-b13), Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)
Reporter: Andrew S

Hey, we are having a strange issue with tombstone warnings which look like this:
{code}
WARN 12:28:42 Read 129 live and 4113 tombstoned cells in XXX.xxx (see tombstone_warn_threshold). 500 columns was requested, slices=[31660a4e-4f94-11e4-ac1d-53f244a29642-0a8073aa-4f9f-11e4-87c7-5b3e253389d8:!], delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}
{code}
What is strange is that the requested row should not contain any tombstones, as we never delete data from that row. (We do delete data from another row in the same column family.) To debug the issue we dumped the data for this row using sstable2json, and the result does not contain any tombstones. (We have done this on all nodes having the data and all sstables containing the key.)
{code}
./sstable2json /var/lib/cassandra/data/XXX/xxx/XXX-xxx-ka-81524-Data.db -k xxx
{code}
We are getting the warnings after issuing a simple query:
{code}
select count(*) from xxx where key = 'x' and aggregate='x';
{code}
There are only ~500 cells, but it issues a warning about scanning 1700 tombstones. We are very worried about this because for some of the queries we are hitting TombstoneOverwhelmingException for no obvious reason.

Here is the table definition:
{code}
CREATE TABLE Xxxx.xxx (
    key text,
    aggregate text,
    t timeuuid,
    . {date fields }
    PRIMARY KEY ((key, aggregate), t)
) WITH CLUSTERING ORDER BY (t ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = 'we love cassandra'
    AND compaction = {'min_threshold': '6', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.SnappyCompressor'}
    AND dclocal_read_repair_chance = 0.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 3600
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.1
    AND speculative_retry = '99.0PERCENTILE';
{code}
Do you have any ideas how we can debug this further? Thanks, Andrew
[jira] [Created] (CASSANDRA-8111) Create backup directories for commitlog archiving during startup
Jan Karlsson created CASSANDRA-8111:
---
Summary: Create backup directories for commitlog archiving during startup
Key: CASSANDRA-8111
URL: https://issues.apache.org/jira/browse/CASSANDRA-8111
Project: Cassandra
Issue Type: Improvement
Reporter: Jan Karlsson
Priority: Trivial
Fix For: 2.0.11

Cassandra currently crashes if the recovery directory in commitlog_archiving does not exist (or cannot be listed). I would like to propose that Cassandra create this directory if it does not exist. This would mimic the behavior of creating the data, commitlog, etc. directories during startup.
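The proposed behaviour can be sketched like this (illustrative only, not the attached patch; the class and method names are hypothetical): instead of failing when the recovery directory is missing, create it the way the data and commitlog directories are created at startup.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class RecoveryDirs
{
    // ensure the commitlog archive recovery directory exists before use,
    // creating any missing parent directories along the way
    public static Path ensureRecoveryDir(String dir) throws IOException
    {
        Path path = Paths.get(dir);
        if (!Files.isDirectory(path))
            Files.createDirectories(path); // no-op if it already exists
        return path;
    }
}
```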
[jira] [Updated] (CASSANDRA-8111) Create backup directories for commitlog archiving during startup
[ https://issues.apache.org/jira/browse/CASSANDRA-8111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Karlsson updated CASSANDRA-8111:
Reproduced In: 2.1.0, 2.0.11 (was: 2.0.11, 2.1.0)
Attachment: CASSANDRA-8111.patch
Labels: archive, commitlog
[jira] [Commented] (CASSANDRA-7443) SSTable Pluggability v2
[ https://issues.apache.org/jira/browse/CASSANDRA-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169229#comment-14169229 ] Marcus Eriksson commented on CASSANDRA-7443:
In general, this looks very good. A few comments/thoughts:

I think we might want to force format-implementors to provide 'legacy' sstable scanners that generate OnDiskAtomIterators that work the same way as the current format. I see two cases here: either we compact sstables with the same format together, or we mix formats, and these cases could be very different depending on how the formats look. I can, for example, imagine some formats being extremely efficient at merging Row Groups (columnar/parquet) but very slow at merging our legacy partitions, and we would need to merge partitions like that if we have one legacy sstable among the compacting sstables. I.e., what I think we need is something along the lines of an abstract 'SSTableReader#getLegacyScanner', and then having the logic within the compaction code use those scanners if we mix formats.

Small/nit code comments (note, didn't review the TestFormat as it's mostly a PoC):
* Making RowIndexEntry generic needs to be propagated throughout the code; we now get more unchecked assignment warnings because of this.
* keepExistingFormat in the compaction task should probably be handled some other way? Should be possible to generate data, switch format and validate what we get?
* BigTableReader.SizeComparator() should reference SSTableReader.SizeComparator() in SizeTieredCompactionStrategy
* Rename SSTableFormat#getCompactedRowWriter to just 'getCompactedRow'?
* SSTableFormat, Descriptor - wrong StringUtils import; should probably be org.apache.commons.lang3.StringUtils (that's what we use elsewhere anyway)

SSTable Pluggability v2
---
Key: CASSANDRA-7443
URL: https://issues.apache.org/jira/browse/CASSANDRA-7443
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: T Jake Luciani
Assignee: T Jake Luciani
Fix For: 3.0
Attachments: 7443-refactor-v1.txt, 7443-testformat-v1.txt

As part of a wider effort to improve the performance of our storage engine, we will need to support basic pluggability of the SSTable reader/writer. We primarily need this to support the current SSTable format and a new SSTable format in the same version. This will also let us encapsulate the changes in a single layer vs forcing the whole engine to change at once. We previously discussed how to accomplish this in CASSANDRA-3067
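The legacy-scanner suggestion in the comment above could look roughly like this. Names and types are placeholders (String stands in for OnDiskAtomIterator), not the real Cassandra API: each format exposes its preferred scanner plus a 'legacy' scanner that yields data in the current format's order, and compaction falls back to the legacy one when formats are mixed.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class LegacyScannerFallback
{
    interface FormatReader
    {
        String format();
        Iterator<String> getScanner();       // format-native order, fast path
        Iterator<String> getLegacyScanner(); // current-format order, for mixed compactions
    }

    // choose scanners for a compaction: legacy order only if formats are mixed
    public static List<Iterator<String>> scannersFor(List<FormatReader> readers)
    {
        boolean mixed = readers.stream().map(FormatReader::format).distinct().count() > 1;
        List<Iterator<String>> scanners = new ArrayList<>();
        for (FormatReader r : readers)
            scanners.add(mixed ? r.getLegacyScanner() : r.getScanner());
        return scanners;
    }
}
```

This keeps the fast native merge path for same-format compactions while still allowing one legacy sstable in the mix.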
[jira] [Commented] (CASSANDRA-7443) SSTable Pluggability v2
[ https://issues.apache.org/jira/browse/CASSANDRA-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169246#comment-14169246 ] T Jake Luciani commented on CASSANDRA-7443:
---
bq. force format-implementors to provide 'legacy' sstable scanners that generate OnDiskAtomIterators that work the same way as the current format.
Not sure I follow; the iterator must still return data in the same order regardless of the format. Out-of-comparator order will throw an exception in the ColumnFamily. We could add an explicit check lower down in the code, but then it's redundant IMO.
bq. keepExistingFormat in compaction task should probably be handled some other way? should be possible to generate data, switch format and validate what we get?
I only added this for a test case where you stream in data from a new format and compact it in the same format (without globally requiring all sstables to be written in the same format). Outside of tests it should not be allowed. I can add a @VisibleForTesting and a better comment.
I'll fix the other nits and rebase.
[jira] [Commented] (CASSANDRA-7443) SSTable Pluggability v2
[ https://issues.apache.org/jira/browse/CASSANDRA-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169247#comment-14169247 ] Sylvain Lebresne commented on CASSANDRA-7443:
---
At the risk of repeating myself, I'm fine with having better abstractions internally that make it easier to transition sstable formats, but I'm still very opposed to anything more (and I know I'm not the only one). Concretely, I'm fine with the patch in general, but I won't be happy unless anything user visible is removed, which means at least the yaml/config setting and the CQLSStableWriter.Builder method (and whatever else I've missed).
[jira] [Commented] (CASSANDRA-7443) SSTable Pluggability v2
[ https://issues.apache.org/jira/browse/CASSANDRA-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169253#comment-14169253 ] T Jake Luciani commented on CASSANDRA-7443: --- bq. I won't be happy unless anything user visible is removed I also agree these should be removed since, as you mentioned, that is not the intent of this patch. I can hide those settings, but internally we still need access to them as developers.
[jira] [Commented] (CASSANDRA-7443) SSTable Pluggability v2
[ https://issues.apache.org/jira/browse/CASSANDRA-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169256#comment-14169256 ] Marcus Eriksson commented on CASSANDRA-7443: bq. the iterator must still return data in the same order regardless of the format. right, but I'm imagining (perhaps prematurely) that some sstable formats could be compacted in a much more efficient way if we had a way of scanning sstables without that requirement during compaction.
[jira] [Updated] (CASSANDRA-7304) Ability to distinguish between NULL and UNSET values in Prepared Statements
[ https://issues.apache.org/jira/browse/CASSANDRA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oded Peer updated CASSANDRA-7304: - Attachment: 7304-03.patch 7304-03.patch contents: * Updated the v4 spec doc * Handled backward compatibility in {{CBUtil.readValue}} by inspecting the protocol version * Reverted modifications to CQLTester * Handled collections and counters * Since a UDT value is always written in its entirety, Cassandra can't preserve a pre-existing value by 'not setting' the new value, so unset values in a UDT are treated as null values * Added tests for collections, counters and UDTs Ability to distinguish between NULL and UNSET values in Prepared Statements --- Key: CASSANDRA-7304 URL: https://issues.apache.org/jira/browse/CASSANDRA-7304 Project: Cassandra Issue Type: Sub-task Reporter: Drew Kutcharian Labels: cql, protocolv4 Fix For: 3.0 Attachments: 7304-03.patch, 7304-2.patch, 7304.patch Currently Cassandra inserts tombstones when a value of a column is bound to NULL in a prepared statement. At higher insert rates managing all these tombstones becomes an unnecessary overhead. This limits the usefulness of prepared statements since developers have to either create multiple prepared statements (each with a different combination of column names, which at times is just unfeasible because of the sheer number of possible combinations) or fall back to using regular (non-prepared) statements. This JIRA is here to explore the possibility of either: A. Have a flag on prepared statements that, once set, tells Cassandra to ignore null columns or B. Have an UNSET value which makes Cassandra skip the null columns and not tombstone them Basically, in the context of a prepared statement, a null value means delete, but we don’t have anything that means ignore (besides creating a new prepared statement without the ignored column).
Please refer to the original conversation on the DataStax Java Driver mailing list for more background: https://groups.google.com/a/lists.datastax.com/d/topic/java-driver-user/cHE3OOSIXBU/discussion
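The NULL-versus-UNSET distinction proposed in option B above can be sketched as follows. This is a conceptual illustration only, under the assumption that a bound value is one of three states; the names ({{Bound}}, {{applyRow}}, {{demo}}) are hypothetical and are not Cassandra internals or the driver API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the proposed semantics: a bound value is either a concrete VALUE,
// NULL (write a tombstone / delete), or UNSET (leave the column untouched).
public class UnsetSketch {
    enum Kind { VALUE, NULL_TOMBSTONE, UNSET }

    static final class Bound {
        final Kind kind;
        final String value;
        private Bound(Kind kind, String value) { this.kind = kind; this.value = value; }
        static Bound of(String v) { return new Bound(Kind.VALUE, v); }
        static Bound nul()        { return new Bound(Kind.NULL_TOMBSTONE, null); }
        static Bound unset()      { return new Bound(Kind.UNSET, null); }
    }

    // VALUE overwrites, NULL deletes (a tombstone), UNSET is skipped entirely:
    // no write and, crucially, no tombstone.
    static Map<String, String> applyRow(Map<String, String> existing, Map<String, Bound> bound) {
        Map<String, String> result = new LinkedHashMap<>(existing);
        for (Map.Entry<String, Bound> e : bound.entrySet()) {
            switch (e.getValue().kind) {
                case VALUE:          result.put(e.getKey(), e.getValue().value); break;
                case NULL_TOMBSTONE: result.remove(e.getKey()); break;
                case UNSET:          break; // leave any pre-existing value alone
            }
        }
        return result;
    }

    static String demo() {
        Map<String, String> existing = new LinkedHashMap<>();
        existing.put("a", "1");
        existing.put("b", "2");
        Map<String, Bound> bound = new LinkedHashMap<>();
        bound.put("a", Bound.of("9"));   // overwrite
        bound.put("b", Bound.nul());     // delete via tombstone
        bound.put("c", Bound.unset());   // ignore: no write, no tombstone
        return applyRow(existing, bound).toString();
    }
}
```

Note the UDT caveat from the patch notes: because a UDT value is written in its entirety, an unset field inside a UDT cannot preserve the old value and degrades to null.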
[jira] [Commented] (CASSANDRA-7443) SSTable Pluggability v2
[ https://issues.apache.org/jira/browse/CASSANDRA-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169264#comment-14169264 ] T Jake Luciani commented on CASSANDRA-7443: --- Ah. I would prefer to deal with that once we have the basics covered in the new format.
[jira] [Commented] (CASSANDRA-7443) SSTable Pluggability v2
[ https://issues.apache.org/jira/browse/CASSANDRA-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169265#comment-14169265 ] Sylvain Lebresne commented on CASSANDRA-7443: - bq. can hide those settings but internally we still need access to them as developers Why? We've always transitioned sstable formats by continuing to read old sstables and only writing new ones, and I'm a fan of sticking to that. If we do so, the actual format used is hardcoded for writes (for a given C* version) and based on the sstable for reads. No reason to have a Config setting, even a hidden one. Note that if we ever change our mind on how we want to proceed, adding a hidden setting is pretty easy, but I'd rather not anticipate things we haven't agreed on yet. bq. I'm imagining (perhaps prematurely) that some sstable formats could be compacted in a much more efficient way if we have a way I do think that's premature. If we come up with a format that supports that, we can always find the proper abstraction at that time.
[jira] [Commented] (CASSANDRA-7396) Allow selecting Map key, List index
[ https://issues.apache.org/jira/browse/CASSANDRA-7396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169271#comment-14169271 ] Robert Stupp commented on CASSANDRA-7396: - Discussed offline with [~slebresne] - we'll wait until CASSANDRA-8099 has been implemented because it simplifies access patterns and makes access optimizations for this one much easier. Allow selecting Map key, List index --- Key: CASSANDRA-7396 URL: https://issues.apache.org/jira/browse/CASSANDRA-7396 Project: Cassandra Issue Type: New Feature Components: API Reporter: Jonathan Ellis Assignee: Robert Stupp Labels: cql Fix For: 3.0 Allow SELECT map['key'] and SELECT list[index]. (Selecting a UDT subfield is already supported.)
[jira] [Commented] (CASSANDRA-7443) SSTable Pluggability v2
[ https://issues.apache.org/jira/browse/CASSANDRA-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169281#comment-14169281 ] Marcus Eriksson commented on CASSANDRA-7443: sure!
[jira] [Commented] (CASSANDRA-8106) Schema changes raises comparators do not match or are not compatible ConfigurationException
[ https://issues.apache.org/jira/browse/CASSANDRA-8106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169288#comment-14169288 ] Tommaso Barbugli commented on CASSANDRA-8106: - turns out it was an issue with 2 collection columns created concurrently on the same table; I think it would be good to have some information in the logs instead of the exception alone. Finding the issue can result in hours spent looking at system tables. Schema changes raises comparators do not match or are not compatible ConfigurationException - Key: CASSANDRA-8106 URL: https://issues.apache.org/jira/browse/CASSANDRA-8106 Project: Cassandra Issue Type: Bug Reporter: Tommaso Barbugli I am running Cassandra 2.0.10 on 2 nodes; since the last few hours every schema migration issued via CQL (both CREATE and ALTER tables) raises a ConfigurationException comparators do not match or are not compatible exception. {code} ERROR [Native-Transport-Requests:5802] 2014-10-12 22:48:12,237 QueryMessage.java (line 131) Unexpected error during query java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.cassandra.exceptions.ConfigurationException: comparators do not match or are not compatible.
at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:413) at org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:285) at org.apache.cassandra.service.MigrationManager.announceNewColumnFamily(MigrationManager.java:223) at org.apache.cassandra.cql3.statements.CreateTableStatement.announceMigration(CreateTableStatement.java:121) at org.apache.cassandra.cql3.statements.SchemaAlteringStatement.execute(SchemaAlteringStatement.java:79) at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:158) at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:175) at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:119) at org.apache.cassandra.transport.Message$Dispatcher.messageReceived(Message.java:306) at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) at org.jboss.netty.handler.execution.ChannelUpstreamEventRunnable.doRun(ChannelUpstreamEventRunnable.java:43) at org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:67) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.cassandra.exceptions.ConfigurationException: comparators do not match or are not compatible. at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:188) at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:409) ... 
16 more Caused by: java.lang.RuntimeException: org.apache.cassandra.exceptions.ConfigurationException: comparators do not match or are not compatible. at org.apache.cassandra.config.CFMetaData.reload(CFMetaData.java:1052) at org.apache.cassandra.db.DefsTables.updateColumnFamily(DefsTables.java:377) at org.apache.cassandra.db.DefsTables.mergeColumnFamilies(DefsTables.java:318) at org.apache.cassandra.db.DefsTables.mergeSchema(DefsTables.java:183) at org.apache.cassandra.service.MigrationManager$2.runMayThrow(MigrationManager.java:303) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) ... 3 more Caused by: org.apache.cassandra.exceptions.ConfigurationException: comparators do not match or are not compatible. at org.apache.cassandra.config.CFMetaData.validateCompatility(CFMetaData.java:1142) at org.apache.cassandra.config.CFMetaData.apply(CFMetaData.java:1067) at org.apache.cassandra.config.CFMetaData.reload(CFMetaData.java:1048) ... 10 more {code}
[jira] [Commented] (CASSANDRA-8054) EXECUTE request with skipMetadata=false gets no metadata in response
[ https://issues.apache.org/jira/browse/CASSANDRA-8054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169297#comment-14169297 ] Sylvain Lebresne commented on CASSANDRA-8054: - I agree that not copying the flags is an oversight of the initial patch, though I'd really put the flag copying in the {{copy}} method. But aside from that, +1. bq. would prefer the initial Metadata have everything immutable/null in it, and only the Metadata returned from copy() to use mutable things That's why my initial idea in CASSANDRA-7120 was to make Metadata basically immutable and have copy do the changes, but it forces more copies than necessary and required a couple of other changes. I don't mind terribly tbh. EXECUTE request with skipMetadata=false gets no metadata in response Key: CASSANDRA-8054 URL: https://issues.apache.org/jira/browse/CASSANDRA-8054 Project: Cassandra Issue Type: Bug Components: Core Reporter: Olivier Michallat Assignee: Sylvain Lebresne Fix For: 2.0.11, 2.1.1 Attachments: 8054-2.1.txt, 8054-fix.txt, 8054-v2.txt This has been reported independently with the [Java|https://datastax-oss.atlassian.net/browse/JAVA-482] and [C++|https://datastax-oss.atlassian.net/browse/CPP-174] drivers. This happens under heavy load, where multiple client threads prepare and execute statements in parallel. One of them sends an EXECUTE request with skipMetadata=false, but the returned ROWS response has no metadata in it.
A patch of {{Message.Dispatcher.channelRead0}} confirmed that the flag was incorrectly set on the response: {code} logger.debug("Received: {}, v={}", request, connection.getVersion()); boolean skipMetadataOnRequest = false; if (request instanceof ExecuteMessage) { ExecuteMessage execute = (ExecuteMessage)request; skipMetadataOnRequest = execute.options.skipMetadata(); } response = request.execute(qstate); if (request instanceof ExecuteMessage) { Rows rows = (Rows)response; boolean skipMetadataOnResponse = rows.result.metadata.flags.contains(Flag.NO_METADATA); if (skipMetadataOnResponse != skipMetadataOnRequest) { logger.warn("Inconsistent skipMetadata on streamId {}, was {} in request but {} in response", request.getStreamId(), skipMetadataOnRequest, skipMetadataOnResponse); } } {code} We observed the warning with (false, true) during our tests.
[jira] [Commented] (CASSANDRA-7443) SSTable Pluggability v2
[ https://issues.apache.org/jira/browse/CASSANDRA-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169307#comment-14169307 ] T Jake Luciani commented on CASSANDRA-7443: --- bq. Why? I'm literally talking about the purposes of developing the new format (for example the test format attached to this ticket). We need a way to develop tests against only the new format and not the old, so we need access to writing a specific format. We can technically do it without that, but then we have a chicken-and-egg problem: we can't run the test harness without a fully functional sstable format. Once the format is working I'm happy to remove those options, but at the moment it's needed.
[jira] [Commented] (CASSANDRA-7777) Ability to clean up local sstable files after they've been loaded by the CqlBulkRecordWriter
[ https://issues.apache.org/jira/browse/CASSANDRA-7777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169309#comment-14169309 ] Piotr Kołaczkowski commented on CASSANDRA-7777: --- +1 Ability to clean up local sstable files after they've been loaded by the CqlBulkRecordWriter Key: CASSANDRA-7777 URL: https://issues.apache.org/jira/browse/CASSANDRA-7777 Project: Cassandra Issue Type: Improvement Components: Hadoop Reporter: Paul Pak Assignee: Paul Pak Priority: Minor Labels: cql3, hadoop Attachments: trunk-7777-v1.txt, trunk-7777-v2.txt, trunk-7777-v3.txt Deleting the source files should most likely be the default behavior, with the ability to disable it via config.
git commit: Remove UDF-as-class functionality
Repository: cassandra Updated Branches: refs/heads/trunk 0de0b8c03 - 9333f86cf Remove UDF-as-class functionality patch by snazy; reviewed by slebresne for CASSANDRA-8063 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9333f86c Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9333f86c Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9333f86c Branch: refs/heads/trunk Commit: 9333f86cf0acfdce75ebcd1ff38263e3d199efc7 Parents: 0de0b8c Author: Sylvain Lebresne sylv...@datastax.com Authored: Mon Oct 13 15:57:01 2014 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Mon Oct 13 15:57:01 2014 +0200 -- CHANGES.txt | 9 +- pylib/cqlshlib/cql3handling.py | 9 +- src/java/org/apache/cassandra/cql3/Cql.g| 19 +-- .../cql3/functions/ReflectionBasedUDF.java | 127 --- .../cassandra/cql3/functions/UDFunction.java| 1 - test/unit/org/apache/cassandra/cql3/UFTest.java | 98 +++--- 6 files changed, 23 insertions(+), 240 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/9333f86c/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index b6a3766..cd3950f 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,21 +1,16 @@ 3.0 * Keep sstable levels when bootstrapping (CASSANDRA-7460) * Add Sigar library and perform basic OS settings check on startup (CASSANDRA-7838) - * Support for scripting languages in user-defined functions (CASSANDRA-7526) * Support for aggregation functions (CASSANDRA-4914) - * Improve query to read paxos table on propose (CASSANDRA-7929) * Remove cassandra-cli (CASSANDRA-7920) - * Optimize java source-based UDF invocation (CASSANDRA-7924) * Accept dollar quoted strings in CQL (CASSANDRA-7769) * Make assassinate a first class command (CASSANDRA-7935) * Support IN clause on any clustering column (CASSANDRA-4762) * Improve compaction logging (CASSANDRA-7818) * Remove YamlFileNetworkTopologySnitch (CASSANDRA-7917) - * Support Java source code for 
user-defined functions (CASSANDRA-7562) - * Require arg types to disambiguate UDF drops (CASSANDRA-7812) * Do anticompaction in groups (CASSANDRA-6851) - * Verify that UDF class methods are static (CASSANDRA-7781) - * Support pure user-defined functions (CASSANDRA-7395, 7740) + * Support pure user-defined functions (CASSANDRA-7395, 7526, 7562, 7740, 7781, 7929, + 7924, 7812, 8063) * Permit configurable timestamps with cassandra-stress (CASSANDRA-7416) * Move sstable RandomAccessReader to nio2, which allows using the FILE_SHARE_DELETE flag on Windows (CASSANDRA-4050) http://git-wip-us.apache.org/repos/asf/cassandra/blob/9333f86c/pylib/cqlshlib/cql3handling.py -- diff --git a/pylib/cqlshlib/cql3handling.py b/pylib/cqlshlib/cql3handling.py index 69fc277..6980216 100644 --- a/pylib/cqlshlib/cql3handling.py +++ b/pylib/cqlshlib/cql3handling.py @@ -1004,14 +1004,7 @@ syntax_rules += r''' ( , [newcolname]=cident storageType )* )? ) )? RETURNS storageType -( - (LANGUAGE cident AS -( - stringLiteral -) - ) - | (USING stringLiteral) -) +LANGUAGE cident AS stringLiteral ; ''' http://git-wip-us.apache.org/repos/asf/cassandra/blob/9333f86c/src/java/org/apache/cassandra/cql3/Cql.g -- diff --git a/src/java/org/apache/cassandra/cql3/Cql.g b/src/java/org/apache/cassandra/cql3/Cql.g index 2ec9746..81f7d25 100644 --- a/src/java/org/apache/cassandra/cql3/Cql.g +++ b/src/java/org/apache/cassandra/cql3/Cql.g @@ -493,8 +493,6 @@ createFunctionStatement returns [CreateFunctionStatement expr] boolean ifNotExists = false; boolean deterministic = true; -String language = class; -String bodyOrClassName = null; ListColumnIdentifier argsNames = new ArrayList(); ListCQL3Type.Raw argsTypes = new ArrayList(); } @@ -509,19 +507,10 @@ createFunctionStatement returns [CreateFunctionStatement expr] ( ',' k=cident v=comparatorType { argsNames.add(k); argsTypes.add(v); } )* )? ')' - K_RETURNS - rt=comparatorType - ( - ( K_USING cls = STRING_LITERAL { bodyOrClassName = $cls.text; } ) -| ( K_LANGUAGE l
[jira] [Commented] (CASSANDRA-7563) UserType, TupleType and collections in UDFs
[ https://issues.apache.org/jira/browse/CASSANDRA-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169314#comment-14169314 ] Sylvain Lebresne commented on CASSANDRA-7563: - I've only scanned this very quickly, but why bother with special casing UDT, Tuple and collections in {{JavaTypesHelper.driverType()}}? Using {{parseOne()}} should work with any type (at which point I'm not sure having a separate {{JavaTypesHelper}} is really justified; I've committed CASSANDRA-8063 so {{javaType}} goes away too, I believe). This could all stay in UDFunction imo, but that's a detail. UserType, TupleType and collections in UDFs --- Key: CASSANDRA-7563 URL: https://issues.apache.org/jira/browse/CASSANDRA-7563 Project: Cassandra Issue Type: Bug Reporter: Robert Stupp Assignee: Robert Stupp Fix For: 3.0 Attachments: 7563-7740.txt, 7563.txt * is Java Driver as a dependency required ? * is it possible to extract parts of the Java Driver for UDT/TT/coll support ? * CQL {{DROP TYPE}} must check UDFs * must check keyspace access permissions (if those exist)
[jira] [Commented] (CASSANDRA-7443) SSTable Pluggability v2
[ https://issues.apache.org/jira/browse/CASSANDRA-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169319#comment-14169319 ] Sylvain Lebresne commented on CASSANDRA-7443: - All I'm saying is that for tests, having (static) getter and setter in {{DatabaseDescriptor}} ought to be enough, no need to have it in {{Config}}. If that's what you meant by hidden setting, we're in agreement then.
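The arrangement the thread converges on could be sketched roughly as below: the write format is hardcoded per version (reads pick the format per sstable), with only a static, test-only override and no Config/yaml entry. All names here are illustrative stand-ins, not the actual DatabaseDescriptor API.

```java
// Sketch: a DatabaseDescriptor-like holder with a hardcoded write format
// and a static, test-only setter instead of a user-visible config setting.
public class FormatDescriptorSketch {
    enum SSTableFormat { BIG, TEST }

    // Hardcoded default write format for this (hypothetical) version.
    private static volatile SSTableFormat writeFormat = SSTableFormat.BIG;

    static SSTableFormat getWriteFormat() { return writeFormat; }

    // @VisibleForTesting in spirit: only the test harness should call this,
    // e.g. to exercise a new format before it is the default.
    static void setWriteFormatForTesting(SSTableFormat format) { writeFormat = format; }
}
```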
[jira] [Commented] (CASSANDRA-7821) Add Optional Backoff on Retry to Cassandra Stress
[ https://issues.apache.org/jira/browse/CASSANDRA-7821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169416#comment-14169416 ] T Jake Luciani commented on CASSANDRA-7821: --- bq. Of course writing a new retry policy would be acceptable too, a BackingOffRetryPolicy.java or something alike? Yes, this is how I think it should be done. Add Optional Backoff on Retry to Cassandra Stress - Key: CASSANDRA-7821 URL: https://issues.apache.org/jira/browse/CASSANDRA-7821 Project: Cassandra Issue Type: Improvement Reporter: Russell Alexander Spitzer Assignee: Russell Alexander Spitzer Attachments: CASSANDRA-7821-2.1.patch Currently when stress is running against a cluster which occasionally has nodes marked as down, it will almost immediately stop. This occurs because the retry loop can execute extremely quickly if each execution terminates with a {{com.datastax.driver.core.exceptions.NoHostAvailableException}} or {{com.datastax.driver.core.exceptions.UnavailableException}}. In case of these exceptions it will most likely be unable to succeed if the retries are performed as fast as possible. To get around this, we could add an optional delay on retries, giving the cluster time to recover rather than terminating the stress run. We could make this configurable, with options such as: * Constant # Delays the same amount after each retry * Linear # Backoff a set amount * the trial number * Exponential # Backoff a set amount * 2 ^ trial number This may also require adjusting the 'thread is stuck' check to make sure that the max retry timeout will not cause the thread to be terminated early.
[jira] [Commented] (CASSANDRA-7821) Add Optional Backoff on Retry to Cassandra Stress
[ https://issues.apache.org/jira/browse/CASSANDRA-7821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169444#comment-14169444 ] Russell Alexander Spitzer commented on CASSANDRA-7821: -- Np, I'll work on that when I get some free time.
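The three backoff options listed in the ticket description (constant, linear, exponential) can be sketched as a simple delay function. This is an illustration of the arithmetic only; the class and method names are hypothetical, not the cassandra-stress or java-driver API.

```java
// Sketch of the proposed retry-backoff strategies.
public class BackoffSketch {
    enum Strategy { CONSTANT, LINEAR, EXPONENTIAL }

    // Delay before retry number `trial` (1-based), from a base delay in ms.
    static long delayMs(Strategy strategy, long baseMs, int trial) {
        switch (strategy) {
            case CONSTANT:    return baseMs;                 // same amount after each retry
            case LINEAR:      return baseMs * trial;         // set amount * the trial number
            case EXPONENTIAL: return baseMs * (1L << trial); // set amount * 2 ^ trial number
            default:          throw new AssertionError(strategy);
        }
    }
}
```

As the description notes, whichever strategy is chosen, the worst-case cumulative delay has to stay below the "thread is stuck" threshold or the stress thread gets killed mid-backoff.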
[jira] [Commented] (CASSANDRA-8084) GossipFilePropertySnitch and EC2MultiRegionSnitch when used in AWS/GCE clusters doesn't use the PRIVATE IPS for Intra-DC communications - When running nodetool repair
[ https://issues.apache.org/jira/browse/CASSANDRA-8084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169468#comment-14169468 ] Yuki Morishita commented on CASSANDRA-8084: --- bq. Is use of the broadcast address in netstats and the logs intentional? Each node identifies other nodes by broadcast address, even though messages are sent through private IPs. So currently logs and netstats show only the broadcast IP. I definitely can switch output for netstats to the private IP when it is used. Or show it alongside. For logs, with some work it can be done. Though, I don't know if it helps a lot. For example, repair log: {noformat} [repair #8cc731c0-52f4-11e4-916c-0800200c9a66] Endpoints /54.183.192.248 and /54.215.139.161 are consistent for Standard1 {noformat} is it better to show the private IP for both nodes since they are communicating through private IPs? GossipFilePropertySnitch and EC2MultiRegionSnitch when used in AWS/GCE clusters doesn't use the PRIVATE IPS for Intra-DC communications - When running nodetool repair - Key: CASSANDRA-8084 URL: https://issues.apache.org/jira/browse/CASSANDRA-8084 Project: Cassandra Issue Type: Bug Components: Config Environment: Tested this in GCE and AWS clusters. Created multi region and multi dc cluster once in GCE and once in AWS and ran into the same problem. DISTRIB_ID=Ubuntu DISTRIB_RELEASE=12.04 DISTRIB_CODENAME=precise DISTRIB_DESCRIPTION=Ubuntu 12.04.3 LTS NAME=Ubuntu VERSION=12.04.3 LTS, Precise Pangolin ID=ubuntu ID_LIKE=debian PRETTY_NAME=Ubuntu precise (12.04.3 LTS) VERSION_ID=12.04 Tried to install Apache Cassandra version ReleaseVersion: 2.0.10 and also latest DSE version which is 4.5 and which corresponds to 2.0.8.39.
Reporter: Jana Assignee: Yuki Morishita Labels: features Fix For: 2.0.11 Attachments: 8084-2.0.txt Neither of these snitches (GossipFilePropertySnitch and EC2MultiRegionSnitch) used the PRIVATE IPS for communication between INTRA-DC nodes in my multi-region multi-dc cluster in cloud (on both AWS and GCE) when I ran nodetool repair -local. It works fine during regular reads. Here are the various cluster flavors I tried and failed - AWS + Multi-REGION + Multi-DC + GossipPropertyFileSnitch + (Prefer_local=true) in rackdc-properties file. AWS + Multi-REGION + Multi-DC + EC2MultiRegionSnitch + (Prefer_local=true) in rackdc-properties file. GCE + Multi-REGION + Multi-DC + GossipPropertyFileSnitch + (Prefer_local=true) in rackdc-properties file. GCE + Multi-REGION + Multi-DC + EC2MultiRegionSnitch + (Prefer_local=true) in rackdc-properties file. I am expecting that with the above setup all of my nodes in a given DC communicate via private IPs, since the cloud providers don't charge us for using the private IPs but they do charge for using public IPs. But they can use PUBLIC IPs for INTER-DC communications, which is working as expected.
Here is a snippet from my log files when I ran nodetool repair -local - Node responding to 'node running repair':
INFO [AntiEntropyStage:1] 2014-10-08 14:47:51,628 Validator.java (line 254) [repair #1439f290-4efa-11e4-bf3a-df845ecf54f8] Sending completed merkle tree to /54.172.118.222 for system_traces/sessions
INFO [AntiEntropyStage:1] 2014-10-08 14:47:51,741 Validator.java (line 254) [repair #1439f290-4efa-11e4-bf3a-df845ecf54f8] Sending completed merkle tree to /54.172.118.222 for system_traces/events
Node running repair:
INFO [AntiEntropyStage:1] 2014-10-08 14:47:51,927 RepairSession.java (line 166) [repair #1439f290-4efa-11e4-bf3a-df845ecf54f8] Received merkle tree for events from /54.172.118.222
Note: the IPs it is communicating with are all PUBLIC IPs; it should have used the PRIVATE IPs starting with 172.x.x.x. YAML file values: The listen address is set to: PRIVATE IP The broadcast address is set to: PUBLIC IP The SEEDs address is set to: PUBLIC IPs from both DCs The SNITCHES tried: GPFS and EC2MultiRegionSnitch RACK-DC: Had prefer_local set to true. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[3/3] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/27fdf421 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/27fdf421 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/27fdf421 Branch: refs/heads/trunk Commit: 27fdf4211a3a8eb01367209d3d7121f40e044718 Parents: 9333f86 dee15a8 Author: Yuki Morishita yu...@apache.org Authored: Mon Oct 13 12:11:40 2014 -0500 Committer: Yuki Morishita yu...@apache.org Committed: Mon Oct 13 12:11:40 2014 -0500 -- CHANGES.txt | 1 + .../apache/cassandra/db/AtomicBTreeColumns.java | 138 --- src/java/org/apache/cassandra/db/Memtable.java | 10 +- .../cassandra/utils/concurrent/Locks.java | 37 + 4 files changed, 166 insertions(+), 20 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/27fdf421/CHANGES.txt -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/27fdf421/src/java/org/apache/cassandra/db/AtomicBTreeColumns.java -- diff --cc src/java/org/apache/cassandra/db/AtomicBTreeColumns.java index 0572c4a,7b5e8a8..7a98cb9 --- a/src/java/org/apache/cassandra/db/AtomicBTreeColumns.java +++ b/src/java/org/apache/cassandra/db/AtomicBTreeColumns.java @@@ -35,10 -36,9 +36,11 @@@ import org.apache.cassandra.db.composit import org.apache.cassandra.db.composites.Composite; import org.apache.cassandra.db.filter.ColumnSlice; import org.apache.cassandra.utils.ObjectSizes; +import org.apache.cassandra.utils.SearchIterator; import org.apache.cassandra.utils.btree.BTree; +import org.apache.cassandra.utils.btree.BTreeSearchIterator; import org.apache.cassandra.utils.btree.UpdateFunction; + import org.apache.cassandra.utils.concurrent.Locks; import org.apache.cassandra.utils.concurrent.OpOrder; import org.apache.cassandra.utils.memory.HeapAllocator; import org.apache.cassandra.utils.memory.MemtableAllocator; 
http://git-wip-us.apache.org/repos/asf/cassandra/blob/27fdf421/src/java/org/apache/cassandra/db/Memtable.java --
[jira] [Commented] (CASSANDRA-8084) GossipFilePropertySnitch and EC2MultiRegionSnitch when used in AWS/GCE clusters doesnt use the PRIVATE IPS for Intra-DC communications - When running nodetool repair
[ https://issues.apache.org/jira/browse/CASSANDRA-8084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169555#comment-14169555 ] J.B. Langston commented on CASSANDRA-8084: -- I think it is most important to show the private IP in netstats, and my vote would be to show both the public and private IPs in that case. For the logs, I can see that it would be more work to fix, and I don't necessarily think we need to show the private IP everywhere, but maybe on the messages that specifically concern streaming we could show both. GossipFilePropertySnitch and EC2MultiRegionSnitch when used in AWS/GCE clusters doesnt use the PRIVATE IPS for Intra-DC communications - When running nodetool repair - Key: CASSANDRA-8084 URL: https://issues.apache.org/jira/browse/CASSANDRA-8084 Project: Cassandra Issue Type: Bug Components: Config Reporter: Jana Assignee: Yuki Morishita Labels: features Fix For: 2.0.11 Attachments: 8084-2.0.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[2/3] git commit: Fix spin loop in AtomicSortedColumns
Fix spin loop in AtomicSortedColumns patch by graham sanderson and benedict; reviewed by yukim for CASSANDRA-7546 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/dee15a85 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/dee15a85 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/dee15a85 Branch: refs/heads/trunk Commit: dee15a85c8640e58e162798d46f026c47fdd432c Parents: 11e8dc1 Author: graham sanderson gra...@vast.com Authored: Mon Oct 13 11:52:03 2014 -0500 Committer: Yuki Morishita yu...@apache.org Committed: Mon Oct 13 11:57:54 2014 -0500 -- CHANGES.txt | 1 + .../apache/cassandra/db/AtomicBTreeColumns.java | 139 --- src/java/org/apache/cassandra/db/Memtable.java | 10 +- .../cassandra/utils/concurrent/Locks.java | 37 + 4 files changed, 166 insertions(+), 21 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/dee15a85/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 891848e..0b2dd0c 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -125,6 +125,7 @@ Merged from 2.0: * Fix wrong progress when streaming uncompressed (CASSANDRA-7878) * Fix possible infinite loop in creating repair range (CASSANDRA-7983) * Fix unit in nodetool for streaming throughput (CASSANDRA-7375) + * Fix spin loop in AtomicSortedColumns (CASSANDRA-7546) Merged from 1.2: * Don't index tombstones (CASSANDRA-7828) * Improve PasswordAuthenticator default super user setup (CASSANDRA-7788) http://git-wip-us.apache.org/repos/asf/cassandra/blob/dee15a85/src/java/org/apache/cassandra/db/AtomicBTreeColumns.java -- diff --git a/src/java/org/apache/cassandra/db/AtomicBTreeColumns.java b/src/java/org/apache/cassandra/db/AtomicBTreeColumns.java index 559e759..7b5e8a8 100644 --- a/src/java/org/apache/cassandra/db/AtomicBTreeColumns.java +++ b/src/java/org/apache/cassandra/db/AtomicBTreeColumns.java @@ -23,6 +23,7 @@ import java.util.Collection; import java.util.Comparator; import 
java.util.Iterator; import java.util.List; +import java.util.concurrent.atomic.AtomicIntegerFieldUpdater; import java.util.concurrent.atomic.AtomicReferenceFieldUpdater; import com.google.common.base.Function; @@ -37,10 +38,10 @@ import org.apache.cassandra.db.filter.ColumnSlice; import org.apache.cassandra.utils.ObjectSizes; import org.apache.cassandra.utils.btree.BTree; import org.apache.cassandra.utils.btree.UpdateFunction; +import org.apache.cassandra.utils.concurrent.Locks; import org.apache.cassandra.utils.concurrent.OpOrder; import org.apache.cassandra.utils.memory.HeapAllocator; import org.apache.cassandra.utils.memory.MemtableAllocator; -import org.apache.cassandra.utils.memory.NativeAllocator; import org.apache.cassandra.utils.memory.NativePool; import static org.apache.cassandra.db.index.SecondaryIndexManager.Updater; @@ -59,6 +60,31 @@ public class AtomicBTreeColumns extends ColumnFamily static final long EMPTY_SIZE = ObjectSizes.measure(new AtomicBTreeColumns(CFMetaData.IndexCf, null)) + ObjectSizes.measure(new Holder(null, null)); +// Reserved values for wasteTracker field. 
These values must not be consecutive (see avoidReservedValues) +private static final int TRACKER_NEVER_WASTED = 0; +private static final int TRACKER_PESSIMISTIC_LOCKING = Integer.MAX_VALUE; + +// The granularity with which we track wasted allocation/work; we round up +private static final int ALLOCATION_GRANULARITY_BYTES = 1024; +// The number of bytes we have to waste in excess of our acceptable realtime rate of waste (defined below) +private static final long EXCESS_WASTE_BYTES = 10 * 1024 * 1024L; +private static final int EXCESS_WASTE_OFFSET = (int) (EXCESS_WASTE_BYTES / ALLOCATION_GRANULARITY_BYTES); +// Note this is a shift, because dividing a long time and then picking the low 32 bits doesn't give correct rollover behavior +private static final int CLOCK_SHIFT = 17; +// CLOCK_GRANULARITY = 1^9ns >> CLOCK_SHIFT == 132us == (1/7.63)ms + +/** + * (clock + allocation) granularity are combined to give us an acceptable (waste) allocation rate that is defined by + * the passage of real time of ALLOCATION_GRANULARITY_BYTES/CLOCK_GRANULARITY, or in this case 7.63Kb/ms, or 7.45Mb/s + * + * in wasteTracker we maintain within EXCESS_WASTE_OFFSET before the current time; whenever we waste bytes + * we increment the current value if it is within this window, and set it to the min of the window plus our waste + * otherwise. + */ +private volatile int wasteTracker = TRACKER_NEVER_WASTED; + +private
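The rate quoted in that patch comment can be sanity-checked with a little arithmetic. The sketch below (constants copied from the diff; the class and method are mine, purely illustrative) reproduces the ~7.63 KiB/ms and ~7.45 MiB/s figures from a clock granularity of 2^17 ns:

```java
public class WasteRateCheck {
    // Copied from the AtomicBTreeColumns patch above
    static final int ALLOCATION_GRANULARITY_BYTES = 1024;
    static final int CLOCK_SHIFT = 17;

    public static void main(String[] args) {
        double clockGranularityNs = 1L << CLOCK_SHIFT;          // 131072 ns, ~132 us
        double bytesPerMs = ALLOCATION_GRANULARITY_BYTES / (clockGranularityNs / 1_000_000.0);
        System.out.printf("%.2f KiB/ms%n", bytesPerMs / 1024);               // ~7.63
        System.out.printf("%.2f MiB/s%n", bytesPerMs * 1000 / (1024 * 1024)); // ~7.45
    }
}
```

So the tracker tolerates roughly one ALLOCATION_GRANULARITY_BYTES chunk of waste per 132 us of wall-clock time before switching to pessimistic locking.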
[jira] [Commented] (CASSANDRA-7446) Batchlog should be streamed to a different node on decom
[ https://issues.apache.org/jira/browse/CASSANDRA-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169558#comment-14169558 ] Jason Brown commented on CASSANDRA-7446: Hmm, I'm kinda -1 on this, as the coordinator will have no idea who now has those batchlog entries (after they've been streamed off), so we're pretty much guaranteed that the new owner of the batchlog entries will replay them. If anything, maybe the decommissioned node should just go ahead and send them. That being said, in SS.decommission we already wait RING_DELAY before performing any unbootstrap() work (and we've already gossiped that we're leaving). And by the time we've streamed off the data (and the hints), it's most likely that any batchlog entries we had will have been either deleted or replayed, as streaming a non-trivial amount of data will take longer than the batchlog replay timeout. Batchlog should be streamed to a different node on decom Key: CASSANDRA-7446 URL: https://issues.apache.org/jira/browse/CASSANDRA-7446 Project: Cassandra Issue Type: Bug Reporter: Aleksey Yeschenko Assignee: Branimir Lambov Just like we stream hints on decom, we should also stream the contents of the batchlog - even though we do replicate the batch to at least two nodes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7446) Batchlog should be streamed to a different node on decom
[ https://issues.apache.org/jira/browse/CASSANDRA-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169565#comment-14169565 ] Aleksey Yeschenko commented on CASSANDRA-7446: -- You are probably right. I think what I came up with is the wrong fix for the problem - but the problem itself is real. Now I think what we should do instead is force batchlog replay on decommission (and do it before we stream away the hints). Batchlog should be streamed to a different node on decom Key: CASSANDRA-7446 URL: https://issues.apache.org/jira/browse/CASSANDRA-7446 Project: Cassandra Issue Type: Bug Reporter: Aleksey Yeschenko Assignee: Branimir Lambov -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[1/3] git commit: Fix spin loop in AtomicSortedColumns
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 11e8dc1c1 - dee15a85c refs/heads/trunk 9333f86cf - 27fdf4211 Fix spin loop in AtomicSortedColumns patch by graham sanderson and benedict; reviewed by yukim for CASSANDRA-7546 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/dee15a85 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/dee15a85 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/dee15a85 Branch: refs/heads/cassandra-2.1 Commit: dee15a85c8640e58e162798d46f026c47fdd432c Parents: 11e8dc1 Author: graham sanderson gra...@vast.com Authored: Mon Oct 13 11:52:03 2014 -0500 Committer: Yuki Morishita yu...@apache.org Committed: Mon Oct 13 11:57:54 2014 -0500 -- CHANGES.txt | 1 + .../apache/cassandra/db/AtomicBTreeColumns.java | 139 --- src/java/org/apache/cassandra/db/Memtable.java | 10 +- .../cassandra/utils/concurrent/Locks.java | 37 + 4 files changed, 166 insertions(+), 21 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/dee15a85/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 891848e..0b2dd0c 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -125,6 +125,7 @@ Merged from 2.0: * Fix wrong progress when streaming uncompressed (CASSANDRA-7878) * Fix possible infinite loop in creating repair range (CASSANDRA-7983) * Fix unit in nodetool for streaming throughput (CASSANDRA-7375) + * Fix spin loop in AtomicSortedColumns (CASSANDRA-7546) Merged from 1.2: * Don't index tombstones (CASSANDRA-7828) * Improve PasswordAuthenticator default super user setup (CASSANDRA-7788) http://git-wip-us.apache.org/repos/asf/cassandra/blob/dee15a85/src/java/org/apache/cassandra/db/AtomicBTreeColumns.java -- diff --git a/src/java/org/apache/cassandra/db/AtomicBTreeColumns.java b/src/java/org/apache/cassandra/db/AtomicBTreeColumns.java index 559e759..7b5e8a8 100644 --- a/src/java/org/apache/cassandra/db/AtomicBTreeColumns.java +++ 
b/src/java/org/apache/cassandra/db/AtomicBTreeColumns.java @@ -23,6 +23,7 @@ import java.util.Collection; import java.util.Comparator; import java.util.Iterator; import java.util.List; +import java.util.concurrent.atomic.AtomicIntegerFieldUpdater; import java.util.concurrent.atomic.AtomicReferenceFieldUpdater; import com.google.common.base.Function; @@ -37,10 +38,10 @@ import org.apache.cassandra.db.filter.ColumnSlice; import org.apache.cassandra.utils.ObjectSizes; import org.apache.cassandra.utils.btree.BTree; import org.apache.cassandra.utils.btree.UpdateFunction; +import org.apache.cassandra.utils.concurrent.Locks; import org.apache.cassandra.utils.concurrent.OpOrder; import org.apache.cassandra.utils.memory.HeapAllocator; import org.apache.cassandra.utils.memory.MemtableAllocator; -import org.apache.cassandra.utils.memory.NativeAllocator; import org.apache.cassandra.utils.memory.NativePool; import static org.apache.cassandra.db.index.SecondaryIndexManager.Updater; @@ -59,6 +60,31 @@ public class AtomicBTreeColumns extends ColumnFamily static final long EMPTY_SIZE = ObjectSizes.measure(new AtomicBTreeColumns(CFMetaData.IndexCf, null)) + ObjectSizes.measure(new Holder(null, null)); +// Reserved values for wasteTracker field. 
These values must not be consecutive (see avoidReservedValues) +private static final int TRACKER_NEVER_WASTED = 0; +private static final int TRACKER_PESSIMISTIC_LOCKING = Integer.MAX_VALUE; + +// The granularity with which we track wasted allocation/work; we round up +private static final int ALLOCATION_GRANULARITY_BYTES = 1024; +// The number of bytes we have to waste in excess of our acceptable realtime rate of waste (defined below) +private static final long EXCESS_WASTE_BYTES = 10 * 1024 * 1024L; +private static final int EXCESS_WASTE_OFFSET = (int) (EXCESS_WASTE_BYTES / ALLOCATION_GRANULARITY_BYTES); +// Note this is a shift, because dividing a long time and then picking the low 32 bits doesn't give correct rollover behavior +private static final int CLOCK_SHIFT = 17; +// CLOCK_GRANULARITY = 1^9ns >> CLOCK_SHIFT == 132us == (1/7.63)ms + +/** + * (clock + allocation) granularity are combined to give us an acceptable (waste) allocation rate that is defined by + * the passage of real time of ALLOCATION_GRANULARITY_BYTES/CLOCK_GRANULARITY, or in this case 7.63Kb/ms, or 7.45Mb/s + * + * in wasteTracker we maintain within EXCESS_WASTE_OFFSET before the current time; whenever we waste bytes + * we increment the current value if it is within this window, and set it to
[jira] [Commented] (CASSANDRA-7446) Batchlog should be streamed to a different node on decom
[ https://issues.apache.org/jira/browse/CASSANDRA-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169577#comment-14169577 ] Jason Brown commented on CASSANDRA-7446: Agreed about letting the decommissioning node replay those batchlogs it has, but why not give it enough time to do it naturally - that is, wait until after the data and hints have streamed? I suspect enough time will have elapsed in the non-trivial case that the batchlogs would have done what they normally do. Forcing the replay as the last (or one of the very last) steps makes sense, and gives the coordinator a few extra moments to get the right thing done. However, and because I'm lazy right now, we need to double-check that the coordinator still sends the batchlog delete command while the decom'ing node is LEAVING (LEFT, of course, is totally different). Batchlog should be streamed to a different node on decom Key: CASSANDRA-7446 URL: https://issues.apache.org/jira/browse/CASSANDRA-7446 Project: Cassandra Issue Type: Bug Reporter: Aleksey Yeschenko Assignee: Branimir Lambov -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8021) Improve cqlsh autocomplete for alter keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-8021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169578#comment-14169578 ] Mikhail Stepura commented on CASSANDRA-8021: [~rnamboodiri] go ahead and create a patch for 2.0.x. There is no need for a separate patch for the trunk Improve cqlsh autocomplete for alter keyspace - Key: CASSANDRA-8021 URL: https://issues.apache.org/jira/browse/CASSANDRA-8021 Project: Cassandra Issue Type: Improvement Reporter: Philip Thompson Assignee: Rajanarayanan Thottuvaikkatumana Priority: Minor Labels: cqlsh, lhf Fix For: 2.0.11, 2.1.1 Attachments: cassandra-2.1.1-8021.txt Cqlsh autocomplete stops giving suggestions for the statement {code}ALTER KEYSPACE k WITH REPLICATION { 'class' : 'SimpleStrategy', 'replication_factor' : 1'};{code} after the word WITH. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7899) SSL does not work in cassandra-cli
[ https://issues.apache.org/jira/browse/CASSANDRA-7899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169587#comment-14169587 ] Charles Cao commented on CASSANDRA-7899: What is the fix version? SSL does not work in cassandra-cli -- Key: CASSANDRA-7899 URL: https://issues.apache.org/jira/browse/CASSANDRA-7899 Project: Cassandra Issue Type: Bug Components: Tools Environment: Linux 2.6.32-431.20.3.el6.x86_64 #1 SMP Thu Jun 19 21:14:45 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux java version 1.7.0_17 Java(TM) SE Runtime Environment (build 1.7.0_17-b02) Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode) [cqlsh 4.1.1 | Cassandra 2.0.10 | CQL spec 3.1.1 | Thrift protocol 19.39.0] Reporter: Zdenek Ott Assignee: Jason Brown Attachments: 7899-v1.txt, 7899-v2.txt When specifying the transport factory parameter '-tf org.apache.cassandra.cli.transport.SSLTransportFactory', it throws the exception below, because SSLTransportFactory extends TTransportFactory, not ITransportFactory. Exception in thread main java.lang.IllegalArgumentException: Cannot create a transport factory 'org.apache.cassandra.cli.transport.SSLTransportFactory'. at org.apache.cassandra.cli.CliOptions.validateAndSetTransportFactory(CliOptions.java:288) at org.apache.cassandra.cli.CliOptions.processArgs(CliOptions.java:223) at org.apache.cassandra.cli.CliMain.main(CliMain.java:230) Caused by: java.lang.IllegalArgumentException: transport factory 'org.apache.cassandra.cli.transport.SSLTransportFactory' not derived from ITransportFactory at org.apache.cassandra.cli.CliOptions.validateAndSetTransportFactory(CliOptions.java:282) ... 2 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
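For context on why that exception fires: the CLI loads the user-supplied class reflectively and then type-checks it against ITransportFactory, while Thrift's SSLTransportFactory only extends TTransportFactory. A minimal self-contained sketch of that kind of check - the interface and classes here are stand-ins, not the real Cassandra/Thrift types:

```java
// Stand-in types; the real validation lives in CliOptions.validateAndSetTransportFactory.
interface ITransportFactory {}
class TTransportFactory {}                              // Thrift base class, unrelated to ITransportFactory
class SSLTransportFactory extends TTransportFactory {}

public class FactoryCheck {
    static void validateTransportFactory(Class<?> factory) {
        // The class named by -tf must implement ITransportFactory, or it is rejected
        if (!ITransportFactory.class.isAssignableFrom(factory))
            throw new IllegalArgumentException("transport factory '" + factory.getName()
                    + "' not derived from ITransportFactory");
    }

    public static void main(String[] args) {
        try {
            validateTransportFactory(SSLTransportFactory.class);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // mirrors the "not derived from" error in the report
        }
    }
}
```

The fix attached to the ticket presumably makes the SSL factory implement the expected interface so this check passes.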
[jira] [Commented] (CASSANDRA-7446) Batchlog should be streamed to a different node on decom
[ https://issues.apache.org/jira/browse/CASSANDRA-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169601#comment-14169601 ] Jason Brown commented on CASSANDRA-7446: bq. I prefer to not trust anything that's based purely on timeouts I think we're in agreement here, I'm just arguing to force the batchlogs last :) Batchlog should be streamed to a different node on decom Key: CASSANDRA-7446 URL: https://issues.apache.org/jira/browse/CASSANDRA-7446 Project: Cassandra Issue Type: Bug Reporter: Aleksey Yeschenko Assignee: Branimir Lambov -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7899) SSL does not work in cassandra-cli
[ https://issues.apache.org/jira/browse/CASSANDRA-7899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Brown updated CASSANDRA-7899: --- Fix Version/s: 2.0.11 SSL does not work in cassandra-cli -- Key: CASSANDRA-7899 URL: https://issues.apache.org/jira/browse/CASSANDRA-7899 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Zdenek Ott Assignee: Jason Brown Fix For: 2.0.11, 2.1.1 Attachments: 7899-v1.txt, 7899-v2.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7899) SSL does not work in cassandra-cli
[ https://issues.apache.org/jira/browse/CASSANDRA-7899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169605#comment-14169605 ] Jason Brown commented on CASSANDRA-7899: update the fixed version: 2.0.11 SSL does not work in cassandra-cli -- Key: CASSANDRA-7899 URL: https://issues.apache.org/jira/browse/CASSANDRA-7899 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Zdenek Ott Assignee: Jason Brown Fix For: 2.0.11, 2.1.1 Attachments: 7899-v1.txt, 7899-v2.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8021) Improve cqlsh autocomplete for alter keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-8021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajanarayanan Thottuvaikkatumana updated CASSANDRA-8021: Attachment: cassandra-2.0.11-8021.txt Patch for the 2.0 version Improve cqlsh autocomplete for alter keyspace - Key: CASSANDRA-8021 URL: https://issues.apache.org/jira/browse/CASSANDRA-8021 Project: Cassandra Issue Type: Improvement Reporter: Philip Thompson Assignee: Rajanarayanan Thottuvaikkatumana Priority: Minor Labels: cqlsh, lhf Fix For: 2.0.11, 2.1.1 Attachments: cassandra-2.0.11-8021.txt, cassandra-2.1.1-8021.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7918) Provide graphing tool along with cassandra-stress
[ https://issues.apache.org/jira/browse/CASSANDRA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169705#comment-14169705 ] Ryan McGuire commented on CASSANDRA-7918: - I got pretty far with this before I got sidetracked again. https://github.com/EnigmaCurry/cassandra/tree/7918-stress-graph This implements the command line interface, captures the metrics to a temporary file, and embeds the graphing javascript as a java resource. What's still left to do: dump the metrics into the html file, and handle the merge case where the html file already exists. Provide graphing tool along with cassandra-stress - Key: CASSANDRA-7918 URL: https://issues.apache.org/jira/browse/CASSANDRA-7918 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Benedict Assignee: Ryan McGuire Priority: Minor Whilst cstar makes some pretty graphs, they're a little limited and also require you to run your tests through it. It would be useful to be able to graph results from any stress run easily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7927) Kill daemon on any disk error
[ https://issues.apache.org/jira/browse/CASSANDRA-7927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169707#comment-14169707 ] Joshua McKenzie commented on CASSANDRA-7927: The previously linked branch actually had a couple of problems that I've resolved [here|https://github.com/josh-mckenzie/cassandra/compare/7927?expand=1]. Namely, when I combined the checking for FSError / CorruptSSTableException in inspectThrowable, I didn't check the commit log failure policy in the DatabaseDescriptor, and also wouldn't have been able to do so without augmenting the information passed in to indicate it originated in a CommitLog context. I think you were on the right track w/ having an independent entry point for inspection of CommitLog errors - that way we can kill the JVM on *any* commit log error without having to worry about the type of error thrown on the CommitLog operation. I did a few other things on this branch as well:
# added an entry in CHANGES.txt
# added an assertion to CommitLogTest to confirm the _die actually worked
# added a workaround for the fact that File.setWritable(false) on a directory fails on Windows (/sigh)
# merged the KillerForTests into the JVMStabilityInspector to help keep the code-base clean
# promoted the inspection of the Throwable in FileUtils and in CommitLog to the root of (handleFSError/handleCorruptSSTable/handleCommitError) so the inspector will immediately kill if appropriate
Kill daemon on any disk error - Key: CASSANDRA-7927 URL: https://issues.apache.org/jira/browse/CASSANDRA-7927 Project: Cassandra Issue Type: New Feature Components: Core Environment: aws, stock cassandra or dse Reporter: John Sumsion Assignee: John Sumsion Labels: bootcamp, lhf Fix For: 2.1.1 Attachments: 7927-v1-die.patch We got a disk read error on 1.2.13 that didn't trigger the disk failure policy, and I'm trying to hunt down why, but in doing so, I saw that there is no disk_failure_policy option for just killing the
daemon. If we ever get a corrupt sstable, we want to replace the node anyway, because some aws instance store disks just go bad. I want to use the JVMStabilityInspector from CASSANDRA-7507 to kill so that remains standard, so I will base my patch on CASSANDRA-7507. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
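The pattern discussed in this thread - route every serious Throwable through one inspector that can halt the JVM, with the killer swappable so tests don't actually die - can be sketched as follows. Everything here is illustrative except the JVMStabilityInspector/inspectThrowable naming; the real policy lookups are elided:

```java
// Hedged sketch of the inspect-and-kill pattern; not the actual Cassandra code.
public class StabilityInspectorSketch {
    /** Indirection so unit tests can substitute a recording killer (cf. KillerForTests). */
    interface Killer { void killCurrentJVM(Throwable t); }

    // Real deployments halt immediately; tests reassign this field.
    static Killer killer = t -> Runtime.getRuntime().halt(1);

    static boolean isDiskError(Throwable t) {
        // Stands in for Cassandra's FSError / CorruptSSTableException checks
        return t instanceof java.io.IOError;
    }

    static void inspectThrowable(Throwable t) {
        // With a "die" failure policy, any disk error (or OOM) kills the daemon
        if (isDiskError(t) || t instanceof OutOfMemoryError)
            killer.killCurrentJVM(t);
    }
}
```

Callers such as the error handlers for filesystem, sstable, and commit-log failures would invoke inspectThrowable first, so the kill decision is made in one place regardless of where the error surfaced.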
[jira] [Commented] (CASSANDRA-8066) High Heap Consumption due to high number of SSTableReader
[ https://issues.apache.org/jira/browse/CASSANDRA-8066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169722#comment-14169722 ] Benoit Lacelle commented on CASSANDRA-8066: --- I have a single node. The schema holds around 8 tables; a few of them are counter tables. I regularly run a nearly-full scan on a non-counter table. Otherwise, it is mainly an equivalent mix of reads and writes on 2 given tables. These tables have a few million rows. Do you need more details? High Heap Consumption due to high number of SSTableReader - Key: CASSANDRA-8066 URL: https://issues.apache.org/jira/browse/CASSANDRA-8066 Project: Cassandra Issue Type: Bug Components: Core Environment: Cassandra 2.1.0 Reporter: Benoit Lacelle Assignee: T Jake Luciani Fix For: 2.1.1 Given a workload with quite a lot of reads, I recently encountered high heap memory consumption. Given 2GB of heap, it appears I have 750.000+ tasks in SSTableReader.syncExecutor, consuming more than 1.2GB. These tasks have type SSTableReader$5, which I guess corresponds to: {code} readMeterSyncFuture = syncExecutor.scheduleAtFixedRate(new Runnable() { public void run() { if (!isCompacted.get()) { meterSyncThrottle.acquire(); SystemKeyspace.persistSSTableReadMeter(desc.ksname, desc.cfname, desc.generation, readMeter); } } }, 1, 5, TimeUnit.MINUTES); {code} I do not have access to the environment right now, but I could provide a thread dump later if necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
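To see how tasks like these pile up: each reader schedules its own fixed-rate task, and until the returned future is cancelled, the scheduled executor's queue retains one periodic task per reader indefinitely. A minimal standalone sketch of that mechanism (the count and names here are illustrative, not taken from the report):

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class SyncExecutorGrowth {
    public static void main(String[] args) {
        ScheduledThreadPoolExecutor syncExecutor = new ScheduledThreadPoolExecutor(1);
        // One periodic no-op task per "reader", mirroring the readMeterSyncFuture schedule above
        for (int i = 0; i < 10_000; i++)
            syncExecutor.scheduleAtFixedRate(() -> {}, 1, 5, TimeUnit.MINUTES);
        // Every task stays in the executor queue until its future is cancelled
        System.out.println(syncExecutor.getQueue().size()); // 10000
        syncExecutor.shutdownNow();
    }
}
```

Note that cancel() alone does not remove a task from the queue unless setRemoveOnCancelPolicy(true) is set (or purge() is called), so readers must be released properly for the heap to be reclaimed.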
[jira] [Commented] (CASSANDRA-8066) High Heap Consumption due to high number of SSTableReader
[ https://issues.apache.org/jira/browse/CASSANDRA-8066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169740#comment-14169740 ] T Jake Luciani commented on CASSANDRA-8066: --- bq. Do you need more details? Yes: the actual data size and the number of sstables in the largest keyspace. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8066) High Heap Consumption due to high number of SSTableReader
[ https://issues.apache.org/jira/browse/CASSANDRA-8066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169775#comment-14169775 ] Benoit Lacelle commented on CASSANDRA-8066: --- Here is the output for nodetool cfstats
{code}
Keyspace: prod_7
    Read Count: 1424130
    Read Latency: 3.122707294980093 ms.
    Write Count: 8808265
    Write Latency: 0.03234213866181365 ms.
    Pending Flushes: 0
        Table: alerts
        SSTable count: 2
        Space used (live), bytes: 10237
        Space used (total), bytes: 10237
        Space used by snapshots (total), bytes: 0
        SSTable Compression Ratio: 0.6354418105671432
        Memtable cell count: 0
        Memtable data size, bytes: 0
        Memtable switch count: 0
        Local read count: 0
        Local read latency: NaN ms
        Local write count: 0
        Local write latency: NaN ms
        Pending flushes: 0
        Bloom filter false positives: 0
        Bloom filter false ratio: 0.0
        Bloom filter space used, bytes: 32
        Compacted partition minimum bytes: 259
        Compacted partition maximum bytes: 372
        Compacted partition mean bytes: 341
        Average live cells per slice (last five minutes): 0.0
        Average tombstones per slice (last five minutes): 0.0
        Table: details
        SSTable count: 103
        Space used (live), bytes: 578266489
        Space used (total), bytes: 578266489
        Space used by snapshots (total), bytes: 0
        SSTable Compression Ratio: 0.724988344517149
        Memtable cell count: 67212
        Memtable data size, bytes: 18468770
        Memtable switch count: 23
        Local read count: 6
        Local read latency: 10.742 ms
        Local write count: 2036971
        Local write latency: 0.017 ms
        Pending flushes: 0
        Bloom filter false positives: 0
        Bloom filter false ratio: 0.0
        Bloom filter space used, bytes: 5136
        Compacted partition minimum bytes: 87
        Compacted partition maximum bytes: 129557750
        Compacted partition mean bytes: 2595076
        Average live cells per slice (last five minutes): 0.
        Average tombstones per slice (last five minutes): 0.0
        Table: domains
        SSTable count: 21
        Space used (live), bytes: 122407
        Space used (total), bytes: 122407
        Space used by snapshots (total), bytes: 0
        SSTable Compression Ratio: 0.5848437821906775
        Memtable cell count: 238238
        Memtable data size, bytes: 2793
        Memtable switch count: 1
        Local read count: 60281
        Local read latency: 0.162 ms
        Local write count: 402903
        Local write latency: 0.012 ms
        Pending flushes: 0
        Bloom filter false positives: 25
        Bloom filter false ratio: 0.08929
        Bloom filter space used, bytes: 664
        Compacted partition minimum bytes: 87
        Compacted partition maximum bytes: 372
        Compacted partition mean bytes: 171
        Average live cells per slice (last five minutes): 0.9985401459854014
        Average tombstones per slice (last five minutes): 0.0
        Table: domains_statistics
        SSTable count: 7
        Space used (live), bytes: 302413
        Space used (total), bytes: 302413
        Space used by snapshots (total), bytes: 0
        SSTable Compression Ratio: 0.6144160676068052
        Memtable cell count: 849893
        Memtable data size, bytes: 42569
        Memtable switch count: 1
        Local read count: 60511
        Local read latency: 0.141 ms
        Local write count: 1055892
        Local write latency: 0.013 ms
        Pending flushes: 0
        Bloom filter false positives: 0
        Bloom filter false ratio: 0.0
        Bloom filter space used, bytes: 416
        Compacted partition minimum bytes: 87
        Compacted partition maximum bytes: 24601
        Compacted partition mean bytes: 3795
        Average live cells per slice (last five minutes): 0.9894160583941606
        Average tombstones per slice (last five minutes): 0.0
        Table: latest_urls
        SSTable count: 3
        Space used (live), bytes: 10239518
[jira] [Commented] (CASSANDRA-8094) Heavy writes in RangeSlice read requests
[ https://issues.apache.org/jira/browse/CASSANDRA-8094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169774#comment-14169774 ] Minh Do commented on CASSANDRA-8094: @Jonathan, can we introduce another similar option like read_repair_chance per Column Family? Heavy writes in RangeSlice read requests -- Key: CASSANDRA-8094 URL: https://issues.apache.org/jira/browse/CASSANDRA-8094 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Minh Do Assignee: Minh Do Fix For: 2.0.11 RangeSlice requests always do a scheduled read repair when coordinators try to resolve replicas' responses, regardless of whether read_repair_chance is set. Because of this, in clusters with few writes and many reads, there is a very high volume of write requests between nodes. We should have an option to turn this off, and it can be separate from read_repair_chance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
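A per-table probabilistic gate of the kind proposed here, analogous to read_repair_chance, could be sketched as follows; the rangeRepairChance knob and method names are hypothetical, not an existing Cassandra option:

```java
import java.util.concurrent.ThreadLocalRandom;

public class RangeRepairChance {
    // Hypothetical per-table knob, analogous to read_repair_chance:
    // the fraction of range reads that schedule a background repair.
    static double rangeRepairChance = 0.1;

    // Coordinator-side check: instead of always repairing on response
    // resolution, repair only a configurable fraction of the time.
    static boolean shouldScheduleRepair() {
        return ThreadLocalRandom.current().nextDouble() < rangeRepairChance;
    }

    public static void main(String[] args) {
        int scheduled = 0;
        for (int i = 0; i < 100_000; i++)
            if (shouldScheduleRepair())
                scheduled++;
        // roughly 10% of reads trigger a background repair
        System.out.println(scheduled > 8_000 && scheduled < 12_000);
    }
}
```

Setting the chance to 0 would recover the "turn it off entirely" behavior the ticket asks for.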
[jira] [Commented] (CASSANDRA-7446) Batchlog should be streamed to a different node on decom
[ https://issues.apache.org/jira/browse/CASSANDRA-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169812#comment-14169812 ] Jason Brown commented on CASSANDRA-7446: bq. batchlog replay might write a few hints Ah, good call. Batchlog should be streamed to a different node on decom Key: CASSANDRA-7446 URL: https://issues.apache.org/jira/browse/CASSANDRA-7446 Project: Cassandra Issue Type: Bug Reporter: Aleksey Yeschenko Assignee: Branimir Lambov Just like we stream hints on decom, we should also stream the contents of the batchlog - even though we do replicate the batch to at least two nodes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8049) Explicitly examine current C* state on startup to detect incompatibilities before upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-8049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169821#comment-14169821 ] Jeremy Hanna commented on CASSANDRA-8049: - Would be great to see this, as CASSANDRA-6598 made upgrading sstables in 1.2 not include index sstables. As a result, if a user didn't upgrade sstables with a 1.2.14+ version of C* before going to 2.0, 2.0 would migrate some system data and then fail to start up. It was seen in a few different cases and was painful to roll back. It is documented, but something that fails fast would be very welcome. Explicitly examine current C* state on startup to detect incompatibilities before upgrade - Key: CASSANDRA-8049 URL: https://issues.apache.org/jira/browse/CASSANDRA-8049 Project: Cassandra Issue Type: Bug Reporter: Aleksey Yeschenko Fix For: 3.0 Unfortunately, we cannot rely on users reading, and following, NEWS.txt before upgrading. People don't read it, or ignore it, and sometimes have issues as a result (see CASSANDRA-8047, for example, and I know of several cases like that one). We should add an explicit compatibility check on startup, before we modify anything, or write out sstables with the new format. We should fail and complain loudly if we detect a skipped upgrade step. We should also snapshot the schema tables before attempting any conversions (since it's not uncommon to make schema modifications as part of the upgrade). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-8112) nodetool compactionhistory can allocate memory unbounded
T Jake Luciani created CASSANDRA-8112: - Summary: nodetool compactionhistory can allocate memory unbounded Key: CASSANDRA-8112 URL: https://issues.apache.org/jira/browse/CASSANDRA-8112 Project: Cassandra Issue Type: Bug Reporter: T Jake Luciani Priority: Minor Fix For: 2.1.1 nodetool compactionhistory keeps data for 1 week by default and creates a table from the result set in memory. For many systems a week can generate tens of thousands of compactions (especially with LCS), so we should guard against this command allocating too much memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8112) nodetool compactionhistory can allocate memory unbounded
[ https://issues.apache.org/jira/browse/CASSANDRA-8112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] T Jake Luciani updated CASSANDRA-8112: -- Description: nodetool compactionhistory keeps data for 1 week by default and creates a table from the result set in memory. For many systems a week can generate tens of thousands of compactions in this time (especially with LCS), so we should guard against this command allocating too much memory. was: nodetool compactionhistory keeps data for 1 week by default and creates a table from the result set in memory. For many systems a week can generate tens of thousands of compactions (especially with LCS), so we should guard against this command allocating too much memory. nodetool compactionhistory can allocate memory unbounded Key: CASSANDRA-8112 URL: https://issues.apache.org/jira/browse/CASSANDRA-8112 Project: Cassandra Issue Type: Bug Reporter: T Jake Luciani Priority: Minor Fix For: 2.1.1 nodetool compactionhistory keeps data for 1 week by default and creates a table from the result set in memory. For many systems a week can generate tens of thousands of compactions in this time (especially with LCS), so we should guard against this command allocating too much memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
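A hypothetical sketch of the kind of guard being proposed: cap how many history rows are materialized in memory instead of copying an arbitrarily large result set wholesale (the MAX_ROWS value and helper are illustrative, not the actual nodetool code):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class BoundedHistory {
    static final int MAX_ROWS = 10_000; // hypothetical cap on materialized rows

    // Materialize at most MAX_ROWS entries from a possibly huge result,
    // so memory use is bounded regardless of how many compactions ran.
    static List<String> boundedCopy(Iterator<String> rows) {
        List<String> out = new ArrayList<>();
        while (rows.hasNext() && out.size() < MAX_ROWS)
            out.add(rows.next());
        return out;
    }

    public static void main(String[] args) {
        // A week of heavy LCS activity: 50,000 compaction records.
        Iterator<String> huge = Collections.nCopies(50_000, "compaction").iterator();
        System.out.println(boundedCopy(huge).size()); // prints "10000"
    }
}
```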
[jira] [Created] (CASSANDRA-8113) Gossip should ignore generation numbers too far in the future
Richard Low created CASSANDRA-8113: -- Summary: Gossip should ignore generation numbers too far in the future Key: CASSANDRA-8113 URL: https://issues.apache.org/jira/browse/CASSANDRA-8113 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Richard Low If a node sends corrupted gossip, it could set the generation numbers for other nodes to arbitrarily large values. This is dangerous since one bad node (e.g. with bad memory) could in theory bring down the cluster. Nodes should refuse to accept generation numbers that are too far in the future. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8066) High Heap Consumption due to high number of SSTableReader
[ https://issues.apache.org/jira/browse/CASSANDRA-8066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] T Jake Luciani updated CASSANDRA-8066: -- Attachment: 8066.txt Patch to only add to the queue under normal open circumstances (not an openEarly or moveStarts call). This greatly reduces the number of entries in the queue, since previously moveStarts was called on every sstable being compacted, every 50MB. High Heap Consumption due to high number of SSTableReader - Key: CASSANDRA-8066 URL: https://issues.apache.org/jira/browse/CASSANDRA-8066 Project: Cassandra Issue Type: Bug Components: Core Environment: Cassandra 2.1.0 Reporter: Benoit Lacelle Assignee: T Jake Luciani Fix For: 2.1.1 Attachments: 8066.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8113) Gossip should ignore generation numbers too far in the future
[ https://issues.apache.org/jira/browse/CASSANDRA-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169922#comment-14169922 ] Jason Brown commented on CASSANDRA-8113: What happens when you have a network partition that lasts days (or a week or two)? The heartbeat is updated, more or less, once a second, so the version of a given node can increment by ~86400 per day (minus a few for GC pauses, thread scheduling, etc.). Depending on what you think a "too far in the future" value is, if you set that high water mark too low, you will doom the cluster to never converging as well. A high water mark on the difference of a couple million or so might be reasonable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7813) Decide how to deal with conflict between native and user-defined functions
[ https://issues.apache.org/jira/browse/CASSANDRA-7813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp updated CASSANDRA-7813: Attachment: 7813v2.txt Updated patch v2 (merge conflicts after removing UDF class functionality) Decide how to deal with conflict between native and user-defined functions -- Key: CASSANDRA-7813 URL: https://issues.apache.org/jira/browse/CASSANDRA-7813 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Robert Stupp Labels: cql Fix For: 3.0 Attachments: 7813.txt, 7813v2.txt We have a bunch of native/hardcoded functions (now(), dateOf(), ...) and in 3.0, users will be able to define new functions. Now, there is a very high chance that we will provide more native functions over time (to be clear, I'm not particularly for adding native functions for allthethings just because we can, but it's clear that we should ultimately provide more than what we have). Which begs the question: how do we want to deal with the problem of adding a native function potentially breaking a previously defined user-defined function? A priori I see the following options (maybe there are more?): # don't do anything specific, hoping that it won't happen often and consider it a user problem if it does. # reserve a big number of names that we're hoping will cover all future needs. # make native functions and user-defined functions syntactically distinct so it cannot happen. I'm not a huge fan of solution 1). Solution 2) is actually what we did for UDT, but I think it's somewhat less practical here: there are only so many types that it makes sense to provide natively, so it wasn't too hard to come up with a reasonably small list of type names to reserve just in case. This feels a lot harder for functions to me. Which leaves solution 3). Since we already have the concept of namespaces for functions, a simple idea would be to force user functions to have a namespace.
We could even allow that namespace to be empty as long as we force the namespace separator (so we'd allow {{bar::foo}} and {{::foo}} for user functions, but *not* {{foo}}, which would be reserved for native functions). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8113) Gossip should ignore generation numbers too far in the future
[ https://issues.apache.org/jira/browse/CASSANDRA-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169933#comment-14169933 ] Brandon Williams commented on CASSANDRA-8113: - bq. Depending on what you think a too far in the future value is I was thinking we could just take current time + one year. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8066) High Heap Consumption due to high number of SSTableReader
[ https://issues.apache.org/jira/browse/CASSANDRA-8066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169938#comment-14169938 ] Jason Brown commented on CASSANDRA-8066: +1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8113) Gossip should ignore generation numbers too far in the future
[ https://issues.apache.org/jira/browse/CASSANDRA-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169950#comment-14169950 ] Jason Brown commented on CASSANDRA-8113: What if your local clock is borked? Perhaps a remote node's last version + an approximation of the number of updates in a year (86400 * 365 = 31,536,000). wdyt? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8113) Gossip should ignore generation numbers too far in the future
[ https://issues.apache.org/jira/browse/CASSANDRA-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169956#comment-14169956 ] Brandon Williams commented on CASSANDRA-8113: - bq. What if your local clock is borked? Then you have bigger problems ;) bq. Perhaps, a remote node's last version + an approximation of the number of updates in a year (86400 * 365) = 31,536,000. Sounds reasonable to me. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
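The heuristic converged on in this thread - reject a proposed version more than roughly a year's worth of once-per-second heartbeat bumps beyond the last known value - can be sketched as follows (the class and method are illustrative, not the actual gossip code):

```java
public class GenerationGuard {
    // Heuristic from the discussion above: ~86400 heartbeat updates per day,
    // times 365 days, gives the maximum plausible jump past the last known value.
    static final long MAX_GENERATION_JUMP = 86_400L * 365; // 31,536,000

    // Accept a gossiped version only if it is within the plausible window;
    // anything further ahead is treated as corrupt and ignored.
    static boolean accept(long lastKnown, long proposed) {
        return proposed <= lastKnown + MAX_GENERATION_JUMP;
    }

    public static void main(String[] args) {
        long last = 1_413_000_000L;
        System.out.println(accept(last, last + 86_400));      // prints "true": one day ahead
        System.out.println(accept(last, Long.MAX_VALUE / 2)); // prints "false": corrupt value
    }
}
```

Anchoring the window to the remote node's last known value, rather than the local clock, sidesteps the "what if the local clock is borked" concern raised above.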
[jira] [Created] (CASSANDRA-8114) Better handle critical thread exits
sankalp kohli created CASSANDRA-8114: Summary: Better handle critical thread exits Key: CASSANDRA-8114 URL: https://issues.apache.org/jira/browse/CASSANDRA-8114 Project: Cassandra Issue Type: Improvement Reporter: sankalp kohli Priority: Minor We have seen two occasions where a critical thread exited due to some exception and C* was still running. I see two options to detect such thread deaths: 1) Write a wrapper around such Runnables which takes some action if the thread is about to exit. 2) Write something in the uncaught exception handler which identifies a thread by name and alerts if it is a critical thread. Once we can better detect such things, we can configure an action for it, like alerting or killing C*. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
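Option 2 above can be sketched with Java's default uncaught-exception handler; the critical-thread name and the alert action here are illustrative, not Cassandra's actual thread names:

```java
import java.util.Set;

public class CriticalThreadWatch {
    static volatile String lastDeadCritical = null; // recorded for observability/tests

    // Install a process-wide handler that checks the dying thread's name
    // against a set of known critical threads and records/alerts on a match.
    static void install(Set<String> criticalNames) {
        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> {
            if (criticalNames.contains(thread.getName())) {
                lastDeadCritical = thread.getName();
                // a real handler would page an operator or kill the process here
            }
        });
    }

    public static void main(String[] args) throws InterruptedException {
        install(Set.of("CompactionExecutor")); // hypothetical critical-thread name
        Thread t = new Thread(() -> { throw new RuntimeException("boom"); },
                              "CompactionExecutor");
        t.start();
        t.join();
        System.out.println(lastDeadCritical); // prints "CompactionExecutor"
    }
}
```

The name-based check is coarse (thread pools often number their threads), which is why the ticket's option 1, a wrapper around the critical Runnable itself, is the more precise alternative.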
git commit: Fix handling of EXECUTE with skip_metadata=false
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 dee15a85c - 63cb95e01 Fix handling of EXECUTE with skip_metadata=false patch by Aleksey Yeschenko and Jake Luciani; reviewed by Sylvain Lebresne Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/63cb95e0 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/63cb95e0 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/63cb95e0 Branch: refs/heads/cassandra-2.1 Commit: 63cb95e012ae3cc197b42b9aa881f905449437b1 Parents: dee15a8 Author: Aleksey Yeschenko alek...@apache.org Authored: Tue Oct 14 00:23:54 2014 +0300 Committer: Aleksey Yeschenko alek...@apache.org Committed: Tue Oct 14 00:24:04 2014 +0300 -- .../org/apache/cassandra/cql3/ResultSet.java| 2 +- .../cql3/PreparedStatementCleanupTest.java | 86 - .../cassandra/cql3/PreparedStatementsTest.java | 122 +++ 3 files changed, 123 insertions(+), 87 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/63cb95e0/src/java/org/apache/cassandra/cql3/ResultSet.java -- diff --git a/src/java/org/apache/cassandra/cql3/ResultSet.java b/src/java/org/apache/cassandra/cql3/ResultSet.java index 3928060..e463b29 100644 --- a/src/java/org/apache/cassandra/cql3/ResultSet.java +++ b/src/java/org/apache/cassandra/cql3/ResultSet.java @@ -266,7 +266,7 @@ public class ResultSet public Metadata copy() { -return new Metadata(flags, names, columnCount, pagingState); +return new Metadata(EnumSet.copyOf(flags), names, columnCount, pagingState); } // The maximum number of values that the ResultSet can hold. 
This can be bigger than columnCount due to CASSANDRA-4911 http://git-wip-us.apache.org/repos/asf/cassandra/blob/63cb95e0/test/unit/org/apache/cassandra/cql3/PreparedStatementCleanupTest.java -- diff --git a/test/unit/org/apache/cassandra/cql3/PreparedStatementCleanupTest.java b/test/unit/org/apache/cassandra/cql3/PreparedStatementCleanupTest.java deleted file mode 100644 index 3e725e9..000 --- a/test/unit/org/apache/cassandra/cql3/PreparedStatementCleanupTest.java +++ /dev/null @@ -1,86 +0,0 @@ -/* - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * License); you may not use this file except in compliance - * with the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an AS IS BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. 
- */ -package org.apache.cassandra.cql3; - -import com.datastax.driver.core.Cluster; -import com.datastax.driver.core.PreparedStatement; -import com.datastax.driver.core.Session; -import org.apache.cassandra.SchemaLoader; -import org.apache.cassandra.config.DatabaseDescriptor; -import org.apache.cassandra.config.Schema; -import org.apache.cassandra.service.EmbeddedCassandraService; -import org.junit.AfterClass; -import org.junit.BeforeClass; -import org.junit.Test; - -public class PreparedStatementCleanupTest extends SchemaLoader -{ -private static Cluster cluster; -private static Session session; - -private static final String KEYSPACE = "prepared_stmt_cleanup"; -private static final String createKsStatement = "CREATE KEYSPACE " + KEYSPACE + - " WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };"; -private static final String dropKsStatement = "DROP KEYSPACE IF EXISTS " + KEYSPACE; - -@BeforeClass -public static void setup() throws Exception -{ -Schema.instance.clear(); - -EmbeddedCassandraService cassandra = new EmbeddedCassandraService(); -cassandra.start(); - -// Currently the native server start method return before the server is fully binded to the socket, so we need -// to wait slightly before trying to connect to it. We should fix this but in the meantime using a sleep. -Thread.sleep(500); - - cluster = Cluster.builder().addContactPoint("127.0.0.1") - .withPort(DatabaseDescriptor.getNativeTransportPort()) -
[2/2] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a19c9128 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a19c9128 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a19c9128 Branch: refs/heads/trunk Commit: a19c9128010a2925b9654976b5123df87f4b35a7 Parents: 27fdf42 63cb95e Author: Aleksey Yeschenko alek...@apache.org Authored: Tue Oct 14 00:25:56 2014 +0300 Committer: Aleksey Yeschenko alek...@apache.org Committed: Tue Oct 14 00:25:56 2014 +0300 -- .../org/apache/cassandra/cql3/ResultSet.java| 2 +- .../cql3/PreparedStatementCleanupTest.java | 86 - .../cassandra/cql3/PreparedStatementsTest.java | 122 +++ 3 files changed, 123 insertions(+), 87 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a19c9128/src/java/org/apache/cassandra/cql3/ResultSet.java --
[1/2] git commit: Fix handling of EXECUTE with skip_metadata=false
Repository: cassandra Updated Branches: refs/heads/trunk 27fdf4211 - a19c91280 Fix handling of EXECUTE with skip_metadata=false patch by Aleksey Yeschenko and Jake Luciani; reviewed by Sylvain Lebresne Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/63cb95e0 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/63cb95e0 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/63cb95e0 Branch: refs/heads/trunk Commit: 63cb95e012ae3cc197b42b9aa881f905449437b1 Parents: dee15a8 Author: Aleksey Yeschenko alek...@apache.org Authored: Tue Oct 14 00:23:54 2014 +0300 Committer: Aleksey Yeschenko alek...@apache.org Committed: Tue Oct 14 00:24:04 2014 +0300 -- .../org/apache/cassandra/cql3/ResultSet.java| 2 +- .../cql3/PreparedStatementCleanupTest.java | 86 - .../cassandra/cql3/PreparedStatementsTest.java | 122 +++ 3 files changed, 123 insertions(+), 87 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/63cb95e0/src/java/org/apache/cassandra/cql3/ResultSet.java -- diff --git a/src/java/org/apache/cassandra/cql3/ResultSet.java b/src/java/org/apache/cassandra/cql3/ResultSet.java index 3928060..e463b29 100644 --- a/src/java/org/apache/cassandra/cql3/ResultSet.java +++ b/src/java/org/apache/cassandra/cql3/ResultSet.java @@ -266,7 +266,7 @@ public class ResultSet public Metadata copy() { -return new Metadata(flags, names, columnCount, pagingState); +return new Metadata(EnumSet.copyOf(flags), names, columnCount, pagingState); } // The maximum number of values that the ResultSet can hold. 
This can be bigger than columnCount due to CASSANDRA-4911 http://git-wip-us.apache.org/repos/asf/cassandra/blob/63cb95e0/test/unit/org/apache/cassandra/cql3/PreparedStatementCleanupTest.java -- diff --git a/test/unit/org/apache/cassandra/cql3/PreparedStatementCleanupTest.java b/test/unit/org/apache/cassandra/cql3/PreparedStatementCleanupTest.java deleted file mode 100644 index 3e725e9..000 --- a/test/unit/org/apache/cassandra/cql3/PreparedStatementCleanupTest.java +++ /dev/null @@ -1,86 +0,0 @@ -/* - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * License); you may not use this file except in compliance - * with the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an AS IS BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. 
- */ -package org.apache.cassandra.cql3; - -import com.datastax.driver.core.Cluster; -import com.datastax.driver.core.PreparedStatement; -import com.datastax.driver.core.Session; -import org.apache.cassandra.SchemaLoader; -import org.apache.cassandra.config.DatabaseDescriptor; -import org.apache.cassandra.config.Schema; -import org.apache.cassandra.service.EmbeddedCassandraService; -import org.junit.AfterClass; -import org.junit.BeforeClass; -import org.junit.Test; - -public class PreparedStatementCleanupTest extends SchemaLoader -{ -private static Cluster cluster; -private static Session session; - -private static final String KEYSPACE = "prepared_stmt_cleanup"; -private static final String createKsStatement = "CREATE KEYSPACE " + KEYSPACE + - " WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };"; -private static final String dropKsStatement = "DROP KEYSPACE IF EXISTS " + KEYSPACE; - -@BeforeClass -public static void setup() throws Exception -{ -Schema.instance.clear(); - -EmbeddedCassandraService cassandra = new EmbeddedCassandraService(); -cassandra.start(); - -// Currently the native server start method return before the server is fully binded to the socket, so we need -// to wait slightly before trying to connect to it. We should fix this but in the meantime using a sleep. -Thread.sleep(500); - - cluster = Cluster.builder().addContactPoint("127.0.0.1") - .withPort(DatabaseDescriptor.getNativeTransportPort()) -
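The one-line fix in this commit replaces a shared, mutable EnumSet with a defensive EnumSet.copyOf in Metadata.copy(). The hazard it removes can be shown in isolation; the Flag names mirror the result-set protocol flags, but the class itself is illustrative:

```java
import java.util.EnumSet;

public class CopySketch {
    enum Flag { GLOBAL_TABLES_SPEC, HAS_MORE_PAGES, NO_METADATA }

    // Buggy "copy" (pre-patch behaviour): returns the same mutable set,
    // so mutating the copy also mutates the original.
    static EnumSet<Flag> sharedCopy(EnumSet<Flag> flags) {
        return flags;
    }

    // Fixed copy, as in the patch: an independent defensive copy.
    static EnumSet<Flag> safeCopy(EnumSet<Flag> flags) {
        return EnumSet.copyOf(flags);
    }

    public static void main(String[] args) {
        EnumSet<Flag> original = EnumSet.of(Flag.GLOBAL_TABLES_SPEC);
        sharedCopy(original).add(Flag.NO_METADATA);
        System.out.println(original.contains(Flag.NO_METADATA)); // prints "true": original corrupted

        EnumSet<Flag> fresh = EnumSet.of(Flag.GLOBAL_TABLES_SPEC);
        safeCopy(fresh).add(Flag.NO_METADATA);
        System.out.println(fresh.contains(Flag.NO_METADATA)); // prints "false": original intact
    }
}
```

This is exactly how a NO_METADATA flag set on one response could leak into another request's metadata, producing the skip_metadata bug fixed here.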
[jira] [Commented] (CASSANDRA-8054) EXECUTE request with skipMetadata=false gets no metadata in response
[ https://issues.apache.org/jira/browse/CASSANDRA-8054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14170013#comment-14170013 ] Aleksey Yeschenko commented on CASSANDRA-8054: -- Pushed as https://github.com/apache/cassandra/commit/63cb95e012ae3cc197b42b9aa881f905449437b1 to 2.1 and trunk, because it started to become annoying for some stress users. Same issue still persists in 2.0, so not closing the ticket until that one is fixed as well (I'll work on a fix). EXECUTE request with skipMetadata=false gets no metadata in response Key: CASSANDRA-8054 URL: https://issues.apache.org/jira/browse/CASSANDRA-8054 Project: Cassandra Issue Type: Bug Components: Core Reporter: Olivier Michallat Assignee: Sylvain Lebresne Fix For: 2.0.11, 2.1.1 Attachments: 8054-2.1.txt, 8054-fix.txt, 8054-v2.txt This has been reported independently with the [Java|https://datastax-oss.atlassian.net/browse/JAVA-482] and [C++|https://datastax-oss.atlassian.net/browse/CPP-174] drivers. This happens under heavy load, where multiple client threads prepare and execute statements in parallel. One of them sends an EXECUTE request with skipMetadata=false, but the returned ROWS response has no metadata in it. 
A patch of {{Message.Dispatcher.channelRead0}} confirmed that the flag was incorrectly set on the response:
{code}
logger.debug("Received: {}, v={}", request, connection.getVersion());

boolean skipMetadataOnRequest = false;
if (request instanceof ExecuteMessage)
{
    ExecuteMessage execute = (ExecuteMessage) request;
    skipMetadataOnRequest = execute.options.skipMetadata();
}

response = request.execute(qstate);

if (request instanceof ExecuteMessage)
{
    Rows rows = (Rows) response;
    boolean skipMetadataOnResponse = rows.result.metadata.flags.contains(Flag.NO_METADATA);
    if (skipMetadataOnResponse != skipMetadataOnRequest)
    {
        logger.warn("Inconsistent skipMetadata on streamId {}, was {} in request but {} in response",
                    request.getStreamId(), skipMetadataOnRequest, skipMetadataOnResponse);
    }
}
{code}
We observed the warning with (false, true) during our tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-8106) Schema changes raise "comparators do not match or are not compatible" ConfigurationException
[ https://issues.apache.org/jira/browse/CASSANDRA-8106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko resolved CASSANDRA-8106. -- Resolution: Duplicate Marking as duplicate of CASSANDRA-4988. Meanwhile will look into providing a better exception here. Schema changes raise "comparators do not match or are not compatible" ConfigurationException - Key: CASSANDRA-8106 URL: https://issues.apache.org/jira/browse/CASSANDRA-8106 Project: Cassandra Issue Type: Bug Reporter: Tommaso Barbugli I am running Cassandra 2.0.10 on 2 nodes; for the last few hours, every schema migration issued via CQL (both CREATE and ALTER TABLE) has raised a "comparators do not match or are not compatible" ConfigurationException.
{code}
ERROR [Native-Transport-Requests:5802] 2014-10-12 22:48:12,237 QueryMessage.java (line 131) Unexpected error during query
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.cassandra.exceptions.ConfigurationException: comparators do not match or are not compatible.
    at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:413)
    at org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:285)
    at org.apache.cassandra.service.MigrationManager.announceNewColumnFamily(MigrationManager.java:223)
    at org.apache.cassandra.cql3.statements.CreateTableStatement.announceMigration(CreateTableStatement.java:121)
    at org.apache.cassandra.cql3.statements.SchemaAlteringStatement.execute(SchemaAlteringStatement.java:79)
    at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:158)
    at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:175)
    at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:119)
    at org.apache.cassandra.transport.Message$Dispatcher.messageReceived(Message.java:306)
    at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
    at org.jboss.netty.handler.execution.ChannelUpstreamEventRunnable.doRun(ChannelUpstreamEventRunnable.java:43)
    at org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:67)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.cassandra.exceptions.ConfigurationException: comparators do not match or are not compatible.
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:188)
    at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:409)
    ... 16 more
Caused by: java.lang.RuntimeException: org.apache.cassandra.exceptions.ConfigurationException: comparators do not match or are not compatible.
    at org.apache.cassandra.config.CFMetaData.reload(CFMetaData.java:1052)
    at org.apache.cassandra.db.DefsTables.updateColumnFamily(DefsTables.java:377)
    at org.apache.cassandra.db.DefsTables.mergeColumnFamilies(DefsTables.java:318)
    at org.apache.cassandra.db.DefsTables.mergeSchema(DefsTables.java:183)
    at org.apache.cassandra.service.MigrationManager$2.runMayThrow(MigrationManager.java:303)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    ... 3 more
Caused by: org.apache.cassandra.exceptions.ConfigurationException: comparators do not match or are not compatible.
    at org.apache.cassandra.config.CFMetaData.validateCompatility(CFMetaData.java:1142)
    at org.apache.cassandra.config.CFMetaData.apply(CFMetaData.java:1067)
    at org.apache.cassandra.config.CFMetaData.reload(CFMetaData.java:1048)
    ... 10 more
{code}
[jira] [Commented] (CASSANDRA-7927) Kill daemon on any disk error
[ https://issues.apache.org/jira/browse/CASSANDRA-7927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170210#comment-14170210 ] John Sumsion commented on CASSANDRA-7927: - LGTM Kill daemon on any disk error - Key: CASSANDRA-7927 URL: https://issues.apache.org/jira/browse/CASSANDRA-7927 Project: Cassandra Issue Type: New Feature Components: Core Environment: aws, stock cassandra or dse Reporter: John Sumsion Assignee: John Sumsion Labels: bootcamp, lhf Fix For: 2.1.1 Attachments: 7927-v1-die.patch We got a disk read error on 1.2.13 that didn't trigger the disk failure policy, and I'm trying to hunt down why; in doing so, I saw that there is no disk_failure_policy option for just killing the daemon. If we ever get a corrupt sstable, we want to replace the node anyway, because some aws instance store disks just go bad. I want to use the JVMStabilityInspector from CASSANDRA-7507 to do the kill so that the mechanism remains standard, so I will base my patch on CASSANDRA-7507.
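For illustration, the proposed "die" policy boils down to a check like the sketch below. The class and enum names here are hypothetical, not Cassandra's actual implementation (which routes the decision through JVMStabilityInspector and the configured disk_failure_policy):

```java
import java.io.IOError;
import java.io.IOException;

// Hypothetical sketch of a disk_failure_policy value that kills the daemon
// on any disk error, per the proposal in this ticket. Names are illustrative.
public class DiskFailureSketch
{
    enum DiskFailurePolicy { IGNORE, STOP, BEST_EFFORT, DIE }

    // Only disk-related errors are fatal, and only under the DIE policy.
    static boolean shouldDie(DiskFailurePolicy policy, Throwable t)
    {
        boolean diskError = t instanceof IOError || t instanceof IOException;
        return policy == DiskFailurePolicy.DIE && diskError;
    }

    static void inspectThrowable(DiskFailurePolicy policy, Throwable t)
    {
        if (shouldDie(policy, t))
        {
            System.err.println("Disk error under 'die' policy, killing daemon: " + t);
            // A real daemon would force-exit here, e.g. System.exit(100);
            // omitted in this sketch so it stays side-effect free.
        }
    }
}
```

The point of funneling every throwable through one inspector is that any code path that swallows disk errors today still triggers the kill, instead of each caller reimplementing the policy check.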
[jira] [Updated] (CASSANDRA-8084) GossipFilePropertySnitch and EC2MultiRegionSnitch when used in AWS/GCE clusters don't use the PRIVATE IPS for Intra-DC communications - When running nodetool repair
[ https://issues.apache.org/jira/browse/CASSANDRA-8084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuki Morishita updated CASSANDRA-8084: -- Attachment: 8084-2.0-v2.txt Attaching V2. In this version, I changed the code so that repair/bootstrap/etc pass the private IP when available to create the stream session. This way, we can show the private IP in streaming-related logs and nodetool netstats, but not both. (Also, I think it is better not to create a dependency on the system keyspace from streaming.) GossipFilePropertySnitch and EC2MultiRegionSnitch when used in AWS/GCE clusters don't use the PRIVATE IPS for Intra-DC communications - When running nodetool repair - Key: CASSANDRA-8084 URL: https://issues.apache.org/jira/browse/CASSANDRA-8084 Project: Cassandra Issue Type: Bug Components: Config Environment: Tested this in GCE and AWS clusters. Created a multi-region, multi-DC cluster once in GCE and once in AWS and ran into the same problem. DISTRIB_ID=Ubuntu DISTRIB_RELEASE=12.04 DISTRIB_CODENAME=precise DISTRIB_DESCRIPTION=Ubuntu 12.04.3 LTS NAME=Ubuntu VERSION=12.04.3 LTS, Precise Pangolin ID=ubuntu ID_LIKE=debian PRETTY_NAME=Ubuntu precise (12.04.3 LTS) VERSION_ID=12.04 Tried Apache Cassandra version ReleaseVersion: 2.0.10 and also the latest DSE version, which is 4.5 and corresponds to 2.0.8.39. Reporter: Jana Assignee: Yuki Morishita Labels: features Fix For: 2.0.11 Attachments: 8084-2.0-v2.txt, 8084-2.0.txt Neither of these snitches (GossipFilePropertySnitch and EC2MultiRegionSnitch) used the PRIVATE IPS for communication between INTRA-DC nodes in my multi-region multi-DC cluster in the cloud (on both AWS and GCE) when I ran nodetool repair -local. It works fine during regular reads. Here are the various cluster flavors I tried and that failed- AWS + Multi-REGION + Multi-DC + GossipPropertyFileSnitch + (Prefer_local=true) in rackdc-properties file. AWS + Multi-REGION + Multi-DC + EC2MultiRegionSnitch + (Prefer_local=true) in rackdc-properties file.
GCE + Multi-REGION + Multi-DC + GossipPropertyFileSnitch + (Prefer_local=true) in rackdc-properties file. GCE + Multi-REGION + Multi-DC + EC2MultiRegionSnitch + (Prefer_local=true) in rackdc-properties file. I am expecting that with the above setup, all of my nodes in a given DC communicate via private IPs, since the cloud providers don't charge us for using the private IPs but do charge for using public IPs. They can use PUBLIC IPs for INTER-DC communications, which is working as expected. Here is a snippet from my log files when I ran nodetool repair -local - Node responding to 'node running repair': INFO [AntiEntropyStage:1] 2014-10-08 14:47:51,628 Validator.java (line 254) [repair #1439f290-4efa-11e4-bf3a-df845ecf54f8] Sending completed merkle tree to /54.172.118.222 for system_traces/sessions INFO [AntiEntropyStage:1] 2014-10-08 14:47:51,741 Validator.java (line 254) [repair #1439f290-4efa-11e4-bf3a-df845ecf54f8] Sending completed merkle tree to /54.172.118.222 for system_traces/events Node running repair: INFO [AntiEntropyStage:1] 2014-10-08 14:47:51,927 RepairSession.java (line 166) [repair #1439f290-4efa-11e4-bf3a-df845ecf54f8] Received merkle tree for events from /54.172.118.222 Note: The IPs it is communicating with are all PUBLIC IPs; it should have used the PRIVATE IPs starting with 172.x.x.x. YAML file values: The listen address is set to: PRIVATE IP The broadcast address is set to: PUBLIC IP The SEEDs address is set to: PUBLIC IPs from both DCs The SNITCHES tried: GPFS and EC2MultiRegionSnitch RACK-DC: Had prefer_local set to true.
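For reference, the prefer_local setting the reporter describes lives in cassandra-rackdc.properties; a minimal example looks roughly like this (the dc/rack values are illustrative, not from the reporter's cluster):

```properties
# cassandra-rackdc.properties (dc/rack values illustrative)
dc=us-east
rack=rack1
# Ask the snitch to use the node's private (listen) address for
# intra-DC traffic instead of the public broadcast address.
prefer_local=true
```

With listen_address set to the private IP and broadcast_address to the public IP, prefer_local=true is what should make intra-DC peers switch to the private address after gossip exchange, which is exactly the behavior the reporter found missing during repair streaming.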
[jira] [Commented] (CASSANDRA-8016) Auth tables should use higher consistency level
[ https://issues.apache.org/jira/browse/CASSANDRA-8016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170277#comment-14170277 ] Vishy Kasar commented on CASSANDRA-8016: We need to fall back to CL.LOCAL_QUORUM also in the case of a CL.LOCAL_ONE failure (as opposed to a miss) Auth tables should use higher consistency level --- Key: CASSANDRA-8016 URL: https://issues.apache.org/jira/browse/CASSANDRA-8016 Project: Cassandra Issue Type: Bug Reporter: T Jake Luciani Assignee: Aleksey Yeschenko Fix For: 2.0.11 The Auth code in Cassandra uses CL.ONE or CL.LOCAL_ONE except in the case of the superuser. Since the Auth keyspace is created with RF=1, the default experience is fine. However if you change to RF > 1, suddenly the select statements are open to misses. We should change reads/writes in Auth, PasswordAuthenticator, and CassandraAuthorizer to always use LOCAL_QUORUM/QUORUM. For reads we could optimize the code to start with CL.ONE and on a miss increase to CL.QUORUM
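The read-path optimization proposed above (start at CL.ONE, escalate to CL.QUORUM on a miss) could look roughly like the sketch below. The class name and the functional read interface are hypothetical, not Cassandra's actual auth code:

```java
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch of the proposed auth read: try the cheap consistency
// level first, and only pay for QUORUM when the first read misses.
public class AuthReadFallback
{
    enum ConsistencyLevel { ONE, QUORUM }

    // readFn simulates a SELECT at a given consistency level; empty = miss.
    static Optional<String> selectUser(Function<ConsistencyLevel, Optional<String>> readFn)
    {
        Optional<String> row = readFn.apply(ConsistencyLevel.ONE);
        if (row.isPresent())
            return row;
        // A miss at ONE may just mean the single replica we hit has not yet
        // received the row, so escalate to QUORUM before concluding absence.
        return readFn.apply(ConsistencyLevel.QUORUM);
    }

    public static void main(String[] args)
    {
        // Simulated cluster where only a QUORUM read finds the row.
        Function<ConsistencyLevel, Optional<String>> readFn =
            cl -> cl == ConsistencyLevel.QUORUM ? Optional.of("jsmith") : Optional.empty();
        System.out.println(selectUser(readFn).orElse("<miss>"));
    }
}
```

Note Vishy's point still applies to this shape: a failure (e.g. timeout) at ONE, not just a miss, would also need to trigger the QUORUM retry.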