[jira] [Commented] (CASSANDRA-6434) Repair-aware gc grace period
[ https://issues.apache.org/jira/browse/CASSANDRA-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386284#comment-14386284 ]

Marcus Eriksson commented on CASSANDRA-6434:
--------------------------------------------

[~kohlisankalp] no updates, I'll do some more research into what we can actually do here

> Repair-aware gc grace period
> ----------------------------
>
>                 Key: CASSANDRA-6434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6434
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: sankalp kohli
>            Assignee: Marcus Eriksson
>             Fix For: 3.0
>
> Since the reason for gcgs is to ensure that we don't purge tombstones until
> every replica has been notified, it's redundant in a world where we're
> tracking repair times per sstable (and repairing frequently), i.e., a world
> where we default to incremental repair a la CASSANDRA-5351.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
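The purge rule the ticket description implies can be sketched as follows. This is a toy model with hypothetical names, not Cassandra's actual code: a tombstone is safe to drop either once the classic gc_grace_seconds window has elapsed, or as soon as a repair that completed after the deletion proves every replica has already seen the tombstone.

```python
import time

def tombstone_purgeable(deletion_time, repaired_at, gc_grace_seconds, now=None):
    """Toy repair-aware purge check (illustrative only).

    deletion_time / repaired_at / now are epoch seconds; repaired_at is
    None if the sstable holding the tombstone has never been repaired.
    """
    now = time.time() if now is None else now
    # Classic rule: wait out gc_grace so every replica hears about the delete.
    classic = deletion_time + gc_grace_seconds <= now
    # Repair-aware rule: a repair completed after the deletion means every
    # replica already has the tombstone, so it is safe to purge early.
    repair_aware = repaired_at is not None and repaired_at > deletion_time
    return classic or repair_aware
```

Under this model, frequent incremental repair makes the gc_grace wait largely redundant, which is the ticket's point.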
[jira] [Commented] (CASSANDRA-8238) NPE in SizeTieredCompactionStrategy.filterColdSSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-8238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386279#comment-14386279 ]

Marcus Eriksson commented on CASSANDRA-8238:
--------------------------------------------

dunno, my thinking was that if anyone has enabled this in 2.0, they probably know what they are doing, and removing this would change behavior a bit too much in 2.0

> NPE in SizeTieredCompactionStrategy.filterColdSSTables
> ------------------------------------------------------
>
>                 Key: CASSANDRA-8238
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8238
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Tyler Hobbs
>            Assignee: Marcus Eriksson
>             Fix For: 2.1.4
>
>         Attachments: 0001-assert-that-readMeter-is-not-null.patch,
>                      0001-dont-always-set-client-mode-for-sstable-loader.patch
>
> {noformat}
> ERROR [CompactionExecutor:15] 2014-10-31 15:28:32,318 CassandraDaemon.java:153 - Exception in thread Thread[CompactionExecutor:15,1,main]
> java.lang.NullPointerException: null
>     at org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.filterColdSSTables(SizeTieredCompactionStrategy.java:181) ~[apache-cassandra-2.1.1.jar:2.1.1]
>     at org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundSSTables(SizeTieredCompactionStrategy.java:83) ~[apache-cassandra-2.1.1.jar:2.1.1]
>     at org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundTask(SizeTieredCompactionStrategy.java:267) ~[apache-cassandra-2.1.1.jar:2.1.1]
>     at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:226) ~[apache-cassandra-2.1.1.jar:2.1.1]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_72]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_72]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_72]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_72]
>     at java.lang.Thread.run(Thread.java:745) [na:1.7.0_72]
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
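The stack trace points at dereferencing a read meter that was never loaded while ranking sstables by read heat. A toy sketch of the defensive shape of the fix (illustrative names and dict-based "sstables", not the real SizeTieredCompactionStrategy code): skip any sstable whose meter is still None instead of crashing the compaction executor.

```python
def filter_cold_sstables(sstables, cold_reads_to_omit=0.05):
    """Return the "hot" sstables, dropping the coldest ones whose combined
    read rate is at most cold_reads_to_omit of the total (toy model)."""
    # Guard: an sstable whose read meter has not been loaded yet (e.g. one
    # freshly streamed in) has read_meter None -- dereferencing it blindly
    # is the NPE reported above.
    rated = [s for s in sstables if s.get("read_meter") is not None]
    total = sum(s["read_meter"] for s in rated)
    if total == 0:
        return rated
    rated.sort(key=lambda s: s["read_meter"])  # coldest first
    cold_budget = total * cold_reads_to_omit
    omitted = 0.0
    hot = []
    for s in rated:
        if omitted + s["read_meter"] <= cold_budget:
            omitted += s["read_meter"]  # cold enough to leave out
        else:
            hot.append(s)
    return hot
```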
[jira] [Comment Edited] (CASSANDRA-6363) CAS not applied on rows containing an expired ttl column
[ https://issues.apache.org/jira/browse/CASSANDRA-6363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386114#comment-14386114 ]

Stefania edited comment on CASSANDRA-6363 at 3/30/15 6:38 AM:
--------------------------------------------------------------

[~thobbs]:
- tested manually and with the dtest below in cassandra-2.0 and cannot reproduce
- tested with the dtest below in cassandra-2.1 and cannot reproduce
- tested with the dtest below in trunk and cannot reproduce

The dtest: https://github.com/stef1927/cassandra-dtest/commit/eaf56385405db4702d869699e80ad4d00b41cec4

{code}
    def delete_with_ttl_expired_test(self):
        """ Updating a row with a ttl does not prevent deletion, test for CASSANDRA-6363 """
        self.cursor1.execute("DROP TABLE IF EXISTS session")
        self.cursor1.execute("CREATE TABLE session (id text, usr text, valid int, PRIMARY KEY (id))")
        self.cursor1.execute("insert into session (id, usr) values ('abc', 'abc')")
        self.cursor1.execute("update session using ttl 1 set valid = 1 where id = 'abc'")
        self.smart_sleep(time.time(), 1)
        self.cursor1.execute("delete from session where id = 'abc' if usr = 'abc'")
        assert_row_count(self.cursor1, 'session', 0)
{code}

Please confirm it's OK to close.

was (Author: stefania):

[~thobbs]:
- tested manually and with the dtest below in cassandra-2.0 and cannot reproduce
- tested with the dtest below in cassandra-2.1 and cannot reproduce
- tested with the dtest below in trunk and cannot reproduce

The dtest: https://github.com/stef1927/cassandra-dtest/commit/eaf56385405db4702d869699e80ad4d00b41cec4

{code}
    def delete_with_ttl_expired_test(self):
        """ Updating a row with a ttl does not prevent deletion, test for CASSANDRA-6363 """
        self.cursor1.execute("DROP TABLE IF EXISTS session")
        self.cursor1.execute("CREATE TABLE session (id text, usr text, valid int, PRIMARY KEY (id))")
        self.cursor1.execute("insert into session (id, usr) values ('abc', 'abc')")
        self.cursor1.execute("update session using ttl 1 set valid = 1 where id = 'abc'")
        self.smart_sleep(time.time(), 1)
        self.cursor1.execute("delete from session where id = 'abc' if usr = 'abc'")
        assert_row_count(self.cursor1, 'session', 0)
{code}

> CAS not applied on rows containing an expired ttl column
> --------------------------------------------------------
>
>                 Key: CASSANDRA-6363
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6363
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Linux/x64 2.0.2 4-node cluster
>            Reporter: Michał Ziemski
>            Assignee: Stefania
>
> CREATE TABLE session (
>   id text,
>   usr text,
>   valid int,
>   PRIMARY KEY (id)
> );
> insert into session (id, usr) values ('abc', 'abc');
> update session using ttl 1 set valid = 1 where id = 'abc';
> (wait 1 sec)
> And
> delete from session where id = 'DSYUCTCLSOEKVLAQWNWYLVQMEQGGXD' if usr = 'demo';
> Yields:
>  [applied] | usr
> -----------+-----
>      False | abc
> Rather than applying the delete.
> Executing:
> update session set valid = null where id = 'abc';
> and again
> delete from session where id = 'DSYUCTCLSOEKVLAQWNWYLVQMEQGGXD' if usr = 'demo';
> Positively deletes the row.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7976) Changes to index_interval table properties revert after subsequent modifications
[ https://issues.apache.org/jira/browse/CASSANDRA-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386275#comment-14386275 ]

Stefania commented on CASSANDRA-7976:
-------------------------------------

Verified {{index_interval}} on cassandra-2.0 and reproduced the problem; added a unit test. Verified {{min_index_interval}} and {{max_index_interval}} on trunk and cassandra-2.1 but could not reproduce.

The problem in 2.0 is that the index interval is not updated in {{CFMetadata.apply()}}. As a consequence, the index interval was never actually changed (this is clearly visible in the log file), despite what the cqlsh DESC command shows. The reason the initial DESC command shows an updated index interval is that the migration manager pushes a schema change that was not correctly applied. When the index interval was split into min and max on cassandra-2.1, {{CFMetadata.apply()}} was fixed (verified on trunk).

The patch for 2.0 is here: https://github.com/stef1927/cassandra/commits/7976

> Changes to index_interval table properties revert after subsequent
> modifications
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-7976
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7976
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Config
>         Environment: cqlsh 4.1.1, Cassandra 2.0.9-SNAPSHOT (built w/ `ccm` on
>                      Mac OS X 10.9.4 with Java 1.7.0_67 - more detail below)
>                      $ java -version
>                      java version "1.7.0_67"
>                      Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
>                      Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
>                      $ mvn --version
>                      Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 2014-08-11T13:58:10-07:00)
>                      Maven home: /usr/local/Cellar/maven/3.2.3/libexec
>                      Java version: 1.7.0_67, vendor: Oracle Corporation
>                      Java home: /Library/Java/JavaVirtualMachines/jdk1.7.0_67.jdk/Contents/Home/jre
>                      Default locale: en_US, platform encoding: UTF-8
>                      OS name: "mac os x", version: "10.9.4", arch: "x86_64", family: "mac"
>            Reporter: Andrew Lenards
>            Assignee: Stefania
>              Labels: cql3, metadata
>
> It appears that if you want to increase the sampling in *-Summary.db files,
> you would change the default for the {{index_interval}} table property from the
> {{128}} default value to {{256}} on a given CQL {{TABLE}}.
> However, if you {{ALTER TABLE}} after setting the value, {{index_interval}}
> returns to the default, {{128}}. This is unexpected behavior. I would expect
> the value for {{index_interval}} to not be affected by subsequent {{ALTER TABLE}}
> statements.
> As noted in Environment, this was seen with a 2.0.9-SNAPSHOT built w/ `ccm`.
> If I just use a table from one of the DataStax documentation tutorials (musicdb
> as mdb):
> {noformat}
> cqlsh:mdb> DESC TABLE songs;
> CREATE TABLE songs (
>   id uuid,
>   album text,
>   artist text,
>   data blob,
>   reviews list,
>   tags set,
>   title text,
>   venue map,
>   PRIMARY KEY ((id))
> ) WITH
>   bloom_filter_fp_chance=0.01 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.10 AND
>   gc_grace_seconds=864000 AND
>   index_interval=128 AND
>   read_repair_chance=0.00 AND
>   replicate_on_write='true' AND
>   populate_io_cache_on_flush='false' AND
>   default_time_to_live=0 AND
>   speculative_retry='99.0PERCENTILE' AND
>   memtable_flush_period_in_ms=0 AND
>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>   compression={'sstable_compression': 'LZ4Compressor'};
> {noformat}
> We've got {{128}} as expected.
> We alter it:
> {noformat}
> cqlsh:mdb> ALTER TABLE songs WITH index_interval = 256;
> {noformat}
> And the change appears:
> {noformat}
> cqlsh:mdb> DESC TABLE songs;
> CREATE TABLE songs (
>   id uuid,
>   album text,
>   artist text,
>   data blob,
>   reviews list,
>   tags set,
>   title text,
>   venue map,
>   PRIMARY KEY ((id))
> ) WITH
>   bloom_filter_fp_chance=0.01 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.10 AND
>   gc_grace_seconds=864000 AND
>   index_interval=256 AND
>   read_repair_chance=0.00 AND
>   replicate_on_write='true' AND
>   populate_io_cache_on_flush='false' AND
>   default_time_to_live=0 AND
>   speculative_retry='99.0PERCENTILE' AND
>   memtable_flush_period_in_ms=0 AND
>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>   compression={'sstable_compression': 'LZ4Compressor'};
> {noformat}
> But if I do another {{ALTER TABLE}}, say, change the caching or comment, the
> {{index_interval}} will revert back to {{128}}.
> {noformat}
> cqlsh:mdb> ALTER TABLE songs WITH caching = 'none';
> cqlsh:mdb> DESC TABLE songs;
> CREATE TABLE songs (
>   id uuid,
>   album text,
>   artist text,
>   data blob,
>   reviews list,
>   tags set,
>   title text,
>   venue map,
>   P
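The failure mode described in the comment (a schema-apply method that forgets to copy one field, so every later ALTER rebuilds the live metadata from a stale value) can be shown with a toy model. This is illustrative only, not the real CFMetadata class:

```python
class TableMeta:
    """Toy in-memory table metadata (hypothetical, not Cassandra's CFMetadata)."""

    def __init__(self):
        self.caching = "KEYS_ONLY"
        self.index_interval = 128

    def apply_buggy(self, changes):
        if "caching" in changes:
            self.caching = changes["caching"]
        # BUG (mirroring the 2.0 report): index_interval is never copied,
        # so the in-memory value silently stays at its old default.

    def apply_fixed(self, changes):
        # Copy every altered option into the live metadata.
        for key, value in changes.items():
            setattr(self, key, value)
```

With the buggy apply, setting index_interval appears to succeed (the pushed schema change is what DESC initially echoes back) but the in-memory value never moves, so the next unrelated ALTER "reverts" it.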
[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386271#comment-14386271 ]

Marcus Eriksson commented on CASSANDRA-9045:
--------------------------------------------

[~philipthompson] yes, do that, with this patch you can set it to 0 even: http://aep.appspot.com/display/wSaOmJhJ6IGh0NYSe8-gY0sM4Yg/

> Deleted columns are resurrected after repair in wide rows
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-9045
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Roman Tkachenko
>            Assignee: Marcus Eriksson
>            Priority: Critical
>             Fix For: 2.0.14
>
>         Attachments: cqlsh.txt
>
> Hey guys,
> After almost a week of researching the issue and trying out multiple things
> with (almost) no luck I was suggested (on the user@cass list) to file a
> report here.
> h5. Setup
> Cassandra 2.0.13 (we had the issue with 2.0.10 as well and upgraded to see if
> it goes away)
> Multi datacenter 12+6 nodes cluster.
> h5. Schema
> {code}
> cqlsh> describe keyspace blackbook;
> CREATE KEYSPACE blackbook WITH replication = {
>   'class': 'NetworkTopologyStrategy',
>   'IAD': '3',
>   'ORD': '3'
> };
> USE blackbook;
> CREATE TABLE bounces (
>   domainid text,
>   address text,
>   message text,
>   "timestamp" bigint,
>   PRIMARY KEY (domainid, address)
> ) WITH
>   bloom_filter_fp_chance=0.10 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.10 AND
>   gc_grace_seconds=864000 AND
>   index_interval=128 AND
>   read_repair_chance=0.00 AND
>   populate_io_cache_on_flush='false' AND
>   default_time_to_live=0 AND
>   speculative_retry='99.0PERCENTILE' AND
>   memtable_flush_period_in_ms=0 AND
>   compaction={'class': 'LeveledCompactionStrategy'} AND
>   compression={'sstable_compression': 'LZ4Compressor'};
> {code}
> h5. Use case
> Each row (defined by a domainid) can have many, many columns (bounce entries),
> so rows can get pretty wide. In practice, most of the rows are not that big,
> but some of them contain hundreds of thousands and even millions of columns.
> Columns are not TTL'ed but can be deleted using the following CQL3 statement:
> {code}
> delete from bounces where domainid = 'domain.com' and address = 'al...@example.com';
> {code}
> All queries are performed using LOCAL_QUORUM CL.
> h5. Problem
> We weren't very diligent about running repairs on the cluster initially, but
> shortly after we started doing it we noticed that some of the previously deleted
> columns (bounce entries) are there again, as if tombstones had disappeared.
> I have run this test multiple times via cqlsh, on the row of the customer who
> originally reported the issue:
> * delete an entry
> * verify it's not returned even with CL=ALL
> * run repair on nodes that own this row's key
> * the columns reappear and are returned even with CL=ALL
> I tried the same test on another row with much less data and everything was
> correctly deleted and didn't reappear after repair.
> h5. Other steps I've taken so far
> Made sure NTP is running on all servers and clocks are synchronized.
> Increased gc_grace_seconds to 100 days, ran full repair (on the affected
> keyspace) on all nodes, then changed it back to the default 10 days again.
> Didn't help.
> Performed one more test. Updated one of the resurrected columns, then deleted
> it and ran repair again. This time the updated version of the column
> reappeared.
> Finally, I noticed these log entries for the row in question:
> {code}
> INFO [ValidationExecutor:77] 2015-03-25 20:27:43,936 CompactionController.java (line 192) Compacting large row blackbook/bounces:4ed558feba8a483733001d6a (279067683 bytes) incrementally
> {code}
> Figuring it may be related, I bumped "in_memory_compaction_limit_in_mb" to
> 512MB so the row fits into it, deleted the entry and ran repair once again.
> The log entry for this row was gone and the columns didn't reappear.
> We have a lot of rows much larger than 512MB, so we can't increase this
> parameter forever, if that is the issue.
> Please let me know if you need more information on the case or if I can run
> more experiments.
> Thanks!
> Roman

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
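The general resurrection mechanism behind reports like this one can be modelled in a few lines. This is a deliberately naive sketch, not Cassandra's repair implementation: if one replica loses a tombstone (after gc_grace, or via a buggy compaction path) while another replica still holds the shadowed cell, anti-entropy repair sees a difference and streams the bare cell back, undeleting it.

```python
def repair(replica_a, replica_b):
    """Naive anti-entropy merge: union both replicas' cells, keeping the
    newest write (value or tombstone, here a (timestamp, value) pair with
    value None meaning tombstone) per key."""
    merged = dict(replica_a)
    for key, (ts, val) in replica_b.items():
        if key not in merged or merged[key][0] < ts:
            merged[key] = (ts, val)
    return merged

# Replica A wrote a tombstone at ts=2 and later purged it entirely;
# replica B never received the delete and still holds the old cell.
a = {}
b = {"addr1": (1, "bounce entry")}
# repair(a, b) now resurrects addr1, because nothing on A outranks B's cell.
# Had A kept the tombstone {"addr1": (2, None)}, the merge would keep it
# and the delete would win.
```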
[jira] [Updated] (CASSANDRA-9060) Anticompaction hangs on bloom filter bitset serialization
[ https://issues.apache.org/jira/browse/CASSANDRA-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson updated CASSANDRA-9060:
---------------------------------------
    Attachment: 0001-another-tweak-to-9060.patch

> Anticompaction hangs on bloom filter bitset serialization
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-9060
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9060
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Gustav Munkby
>            Assignee: Gustav Munkby
>            Priority: Minor
>             Fix For: 2.1.4
>
>         Attachments: 0001-another-tweak-to-9060.patch, 2.1-9060-simple.patch,
>                      trunk-9060.patch
>
> I tried running an incremental repair against a 15-node vnode-cluster with
> roughly 500GB data running on 2.1.3-SNAPSHOT, without performing the
> suggested migration steps. I manually chose a small range for the repair
> (using --start/end-token). The actual repair part took almost no time at all,
> but the anticompactions took a lot of time (not surprisingly).
> Obviously, this might not be the ideal way to run incremental repairs, but I
> wanted to look into what made the whole process so slow. The results were
> rather surprising. The majority of the time was spent serializing bloom
> filters.
> The reason seemed to be two-fold. First, the bloom filters generated were
> huge (probably because the original SSTables were large). With a proper
> migration to incremental repairs, I'm guessing this would not happen.
> Secondly, however, the bloom filters were being written to the output one
> byte at a time (with quite a few type conversions on the way) to transform
> the little-endian in-memory representation to the big-endian on-disk
> representation.
> I have implemented a solution where big-endian is used in-memory as well as
> on-disk, which obviously makes de-/serialization much, much faster. This
> introduces some slight overhead when checking the bloom filter, but I can't
> see how that would be problematic. An obvious alternative would be to still
> perform the serialization/deserialization using a byte array, but perform the
> byte-order swap there.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
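The two strategies the reporter contrasts (converting byte order one byte at a time versus keeping the in-memory layout in on-disk byte order and writing it out in bulk) can be sketched with a toy bitset of 64-bit words. Names are illustrative, not Cassandra's; both produce identical big-endian output, they only differ in how much per-byte work is done on the write path:

```python
import struct

def serialize_swapping(words):
    """Slow path: emit each word big-endian one byte at a time, converting
    from the native representation on the way (the behavior profiled above)."""
    out = bytearray()
    for w in words:
        for shift in range(56, -8, -8):  # most significant byte first
            out.append((w >> shift) & 0xFF)
    return bytes(out)

def serialize_bulk(words):
    """Fast path: if the in-memory layout already matches the on-disk
    big-endian format, serialization is a single bulk write."""
    return struct.pack(f">{len(words)}Q", *words)
```

The trade-off the patch accepts is a small cost on every bloom filter check (reads now go through the big-endian layout) in exchange for a much cheaper de-/serialization path.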
[jira] [Reopened] (CASSANDRA-9060) Anticompaction hangs on bloom filter bitset serialization
[ https://issues.apache.org/jira/browse/CASSANDRA-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson reopened CASSANDRA-9060:
----------------------------------------

attaching another small tweak to this

> Anticompaction hangs on bloom filter bitset serialization
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-9060
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9060
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Gustav Munkby
>            Assignee: Gustav Munkby
>            Priority: Minor
>             Fix For: 2.1.4
>
>         Attachments: 2.1-9060-simple.patch, trunk-9060.patch
>
> I tried running an incremental repair against a 15-node vnode-cluster with
> roughly 500GB data running on 2.1.3-SNAPSHOT, without performing the
> suggested migration steps. I manually chose a small range for the repair
> (using --start/end-token). The actual repair part took almost no time at all,
> but the anticompactions took a lot of time (not surprisingly).
> Obviously, this might not be the ideal way to run incremental repairs, but I
> wanted to look into what made the whole process so slow. The results were
> rather surprising. The majority of the time was spent serializing bloom
> filters.
> The reason seemed to be two-fold. First, the bloom filters generated were
> huge (probably because the original SSTables were large). With a proper
> migration to incremental repairs, I'm guessing this would not happen.
> Secondly, however, the bloom filters were being written to the output one
> byte at a time (with quite a few type conversions on the way) to transform
> the little-endian in-memory representation to the big-endian on-disk
> representation.
> I have implemented a solution where big-endian is used in-memory as well as
> on-disk, which obviously makes de-/serialization much, much faster. This
> introduces some slight overhead when checking the bloom filter, but I can't
> see how that would be problematic. An obvious alternative would be to still
> perform the serialization/deserialization using a byte array, but perform the
> byte-order swap there.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6363) CAS not applied on rows containing an expired ttl column
[ https://issues.apache.org/jira/browse/CASSANDRA-6363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386114#comment-14386114 ]

Stefania commented on CASSANDRA-6363:
-------------------------------------

[~thobbs]:
- tested manually and with the dtest below in cassandra-2.0 and cannot reproduce
- tested with the dtest below in cassandra-2.1 and cannot reproduce
- tested with the dtest below in trunk and cannot reproduce

The dtest: https://github.com/stef1927/cassandra-dtest/commit/eaf56385405db4702d869699e80ad4d00b41cec4

{code}
    def delete_with_ttl_expired_test(self):
        """ Updating a row with a ttl does not prevent deletion, test for CASSANDRA-6363 """
        self.cursor1.execute("DROP TABLE IF EXISTS session")
        self.cursor1.execute("CREATE TABLE session (id text, usr text, valid int, PRIMARY KEY (id))")
        self.cursor1.execute("insert into session (id, usr) values ('abc', 'abc')")
        self.cursor1.execute("update session using ttl 1 set valid = 1 where id = 'abc'")
        self.smart_sleep(time.time(), 1)
        self.cursor1.execute("delete from session where id = 'abc' if usr = 'abc'")
        assert_row_count(self.cursor1, 'session', 0)
{code}

> CAS not applied on rows containing an expired ttl column
> --------------------------------------------------------
>
>                 Key: CASSANDRA-6363
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6363
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Linux/x64 2.0.2 4-node cluster
>            Reporter: Michał Ziemski
>            Assignee: Stefania
>
> CREATE TABLE session (
>   id text,
>   usr text,
>   valid int,
>   PRIMARY KEY (id)
> );
> insert into session (id, usr) values ('abc', 'abc');
> update session using ttl 1 set valid = 1 where id = 'abc';
> (wait 1 sec)
> And
> delete from session where id = 'DSYUCTCLSOEKVLAQWNWYLVQMEQGGXD' if usr = 'demo';
> Yields:
>  [applied] | usr
> -----------+-----
>      False | abc
> Rather than applying the delete.
> Executing:
> update session set valid = null where id = 'abc';
> and again
> delete from session where id = 'DSYUCTCLSOEKVLAQWNWYLVQMEQGGXD' if usr = 'demo';
> Positively deletes the row.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9066) BloomFilter serialization is inefficient
[ https://issues.apache.org/jira/browse/CASSANDRA-9066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386104#comment-14386104 ]

Jonathan Ellis commented on CASSANDRA-9066:
-------------------------------------------

Thanks, Gustav and Benedict!

> BloomFilter serialization is inefficient
> ----------------------------------------
>
>                 Key: CASSANDRA-9066
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9066
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Benedict
>            Assignee: Gustav Munkby
>             Fix For: 2.1.4
>
>         Attachments: 2.1-9066.patch
>
> As pointed out by [~grddev] in CASSANDRA-9060, bloom filter serialization is
> very slow. In that ticket I proposed that 2.1 use buffered serialization, and
> 3.0 make the serialization format itself more efficient.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8236) Delay "node up" and "node added" notifications until native protocol server is started
[ https://issues.apache.org/jira/browse/CASSANDRA-8236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386089#comment-14386089 ]

Stefania commented on CASSANDRA-8236:
-------------------------------------

Rebased to pick up the fix for CASSANDRA-9034, which was impacting the dtests.

[~brandon.williams] the dead state protection commit is ready for review.

> Delay "node up" and "node added" notifications until native protocol server
> is started
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8236
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8236
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Tyler Hobbs
>            Assignee: Stefania
>             Fix For: 3.0
>
>         Attachments: 8236.txt
>
> As discussed in CASSANDRA-7510, there is still a gap between when a "node up"
> or "node added" notification may be sent to native protocol clients (in
> response to a gossip event) and when the native protocol server is ready to
> serve requests.
> Everything in between the call to {{StorageService.instance.initServer()}}
> and creation of the native server in {{CassandraDaemon.setup()}} contributes
> to this delay, but waiting for Gossip to settle introduces the biggest delay.
> We may need to introduce a "STARTING" gossip state for the period in between,
> which is why this is scheduled for 3.0. If there's a better option, though,
> it may make sense to put this in 2.1.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
cassandra git commit: simplify
Repository: cassandra
Updated Branches:
  refs/heads/trunk ce643ff99 -> 04389ad5e

simplify

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/04389ad5
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/04389ad5
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/04389ad5

Branch: refs/heads/trunk
Commit: 04389ad5ef879ae809db925bf46bad60b60fa454
Parents: ce643ff
Author: Dave Brosius
Authored: Sun Mar 29 21:35:20 2015 -0400
Committer: Dave Brosius
Committed: Sun Mar 29 21:35:20 2015 -0400

----------------------------------------------------------------------
 src/java/org/apache/cassandra/db/Directories.java | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/04389ad5/src/java/org/apache/cassandra/db/Directories.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/Directories.java b/src/java/org/apache/cassandra/db/Directories.java
index a5be956..76171f0 100644
--- a/src/java/org/apache/cassandra/db/Directories.java
+++ b/src/java/org/apache/cassandra/db/Directories.java
@@ -193,10 +193,11 @@ public class Directories
         this.dataPaths = new File[dataDirectories.length];
         // If upgraded from version less than 2.1, use existing directories
+        String oldSSTableRelativePath = join(metadata.ksName, metadata.cfName);
         for (int i = 0; i < dataDirectories.length; ++i)
         {
             // check if old SSTable directory exists
-            dataPaths[i] = new File(dataDirectories[i].location, join(metadata.ksName, metadata.cfName));
+            dataPaths[i] = new File(dataDirectories[i].location, oldSSTableRelativePath);
         }
         boolean olderDirectoryExists = Iterables.any(Arrays.asList(dataPaths), new Predicate()
         {
@@ -208,8 +209,10 @@ public class Directories
         if (!olderDirectoryExists)
         {
             // use 2.1-style path names
+            String newSSTableRelativePath = join(metadata.ksName, directoryName);
             for (int i = 0; i < dataDirectories.length; ++i)
-                dataPaths[i] = new File(dataDirectories[i].location, join(metadata.ksName, directoryName));
+                dataPaths[i] = new File(dataDirectories[i].location, newSSTableRelativePath);
         }
         for (File dir : dataPaths)
[jira] [Commented] (CASSANDRA-7807) Push notification when tracing completes for an operation
[ https://issues.apache.org/jira/browse/CASSANDRA-7807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386069#comment-14386069 ]

Stefania commented on CASSANDRA-7807:
-------------------------------------

{{ServerConnection.java}}: Great, thank you.

{{debug-cql.java}}: I imagine it used to work with sourcing the env file and something got broken? Are you too getting this exception when sourcing it?

{code}
stefania@mia:~/git/cstar/cassandra/bin$ ./debug-cql 127.0.0.1 9042
CompilerOracle: inline org/apache/cassandra/db/AbstractNativeCell.compareTo (Lorg/apache/cassandra/db/composites/Composite;)I
CompilerOracle: inline org/apache/cassandra/db/composites/AbstractSimpleCellNameType.compareUnsigned (Lorg/apache/cassandra/db/composites/Composite;Lorg/apache/cassandra/db/composites/Composite;)I
CompilerOracle: inline org/apache/cassandra/io/util/Memory.checkBounds (JJ)V
CompilerOracle: inline org/apache/cassandra/io/util/SafeMemory.checkBounds (JJ)V
CompilerOracle: inline org/apache/cassandra/utils/AsymmetricOrdering.selectBoundary (Lorg/apache/cassandra/utils/AsymmetricOrdering/Op;II)I
CompilerOracle: inline org/apache/cassandra/utils/AsymmetricOrdering.strictnessOfLessThan (Lorg/apache/cassandra/utils/AsymmetricOrdering/Op;)I
CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compare (Ljava/nio/ByteBuffer;[B)I
CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compare ([BLjava/nio/ByteBuffer;)I
CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compareUnsigned (Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I
CompilerOracle: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo (Ljava/lang/Object;JILjava/lang/Object;JI)I
CompilerOracle: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo (Ljava/lang/Object;JILjava/nio/ByteBuffer;)I
CompilerOracle: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo (Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I
Error: Exception thrown by the agent : java.lang.NullPointerException
{code}

I would open a ticket; there may be more to this exception than we understand at the moment. At least I feel that way.

{{SimpleClient.java / TransportException}}: I see the problem now. Can we then not just leave it alone as it was before:

{code}
if (msg instanceof ErrorMessage)
    throw new RuntimeException((Throwable)((ErrorMessage) msg).error);
{code}

and then do this in {{testTraceCompleteVersion3()}}:

{code}
catch (RuntimeException e)
{
    Assert.assertTrue(e.getCause() instanceof ProtocolException); // that's what we want
}
{code}

Or did you have any other reason to change it? I know it's test code, but it worries me that some day we'll have a {{TransportException}} that is not a {{RuntimeException}}. However, if you prefer to clean it up in another ticket, like CASSANDRA-8809, then I am happy to leave the cast for a limited amount of time.

Probabilistic tracing: It looks correct now, but why do we need an extra boolean at all? Was it not enough not to pass the connection in {{createTracingSession()}}, like so:

{code}
public void createTracingSession(Connection connection)
{
    UUID session = this.preparedTracingSession;
    if (session == null)
    {
        Tracing.instance.newSession(); // <-- no connection here
    }
    else
    {
        Tracing.instance.newSession(connection, session);
        this.preparedTracingSession = null;
    }
}
{code}

> Push notification when tracing completes for an operation
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-7807
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7807
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Core
>            Reporter: Tyler Hobbs
>            Assignee: Robert Stupp
>            Priority: Minor
>              Labels: client-impacting, protocolv4
>             Fix For: 3.0
>
>         Attachments: 7807-v2.txt, 7807-v3.txt, 7807.txt
>
> Tracing is an asynchronous operation, and drivers currently poll to determine
> when the trace is complete (in a loop with sleeps). Instead, the server
> could push a notification to the driver when the trace completes.
> I'm guessing that most of the work for this will be around pushing
> notifications to a single connection instead of all connections that have
> registered listeners for a particular event type.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
cassandra git commit: simplify
Repository: cassandra
Updated Branches:
  refs/heads/trunk 7ea642c89 -> ce643ff99

simplify

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ce643ff9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ce643ff9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ce643ff9

Branch: refs/heads/trunk
Commit: ce643ff99f57784577f2067f2ae4c95d3a02ecff
Parents: 7ea642c
Author: Dave Brosius
Authored: Sun Mar 29 20:46:04 2015 -0400
Committer: Dave Brosius
Committed: Sun Mar 29 20:46:04 2015 -0400

----------------------------------------------------------------------
 src/java/org/apache/cassandra/tools/NodeTool.java | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ce643ff9/src/java/org/apache/cassandra/tools/NodeTool.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/tools/NodeTool.java b/src/java/org/apache/cassandra/tools/NodeTool.java
index 9c804c0..4ef2469 100644
--- a/src/java/org/apache/cassandra/tools/NodeTool.java
+++ b/src/java/org/apache/cassandra/tools/NodeTool.java
@@ -2133,12 +2133,11 @@ public class NodeTool
         @Option(title = "resolve_ip", name = {"-r", "--resolve-ip"}, description = "Show node domain names instead of IPs")
         private boolean resolveIp = false;

-        private boolean hasEffectiveOwns = false;
         private boolean isTokenPerNode = true;
         private int maxAddressLength = 0;
         private String format = null;
         private Collection joiningNodes, leavingNodes, movingNodes, liveNodes, unreachableNodes;
-        private Map loadMap, hostIDMap, tokensToEndpoints;
+        private Map loadMap, hostIDMap;
         private EndpointSnitchInfoMBean epSnitchInfo;

         @Override
@@ -2148,7 +2147,7 @@ public class NodeTool
             leavingNodes = probe.getLeavingNodes();
             movingNodes = probe.getMovingNodes();
             loadMap = probe.getLoadMap();
-            tokensToEndpoints = probe.getTokenToEndpointMap();
+            Map tokensToEndpoints = probe.getTokenToEndpointMap();
             liveNodes = probe.getLiveNodes();
             unreachableNodes = probe.getUnreachableNodes();
             hostIDMap = probe.getHostIdMap();
@@ -2157,6 +2156,7 @@ public class NodeTool
             StringBuffer errors = new StringBuffer();

             Map ownerships = null;
+            boolean hasEffectiveOwns = false;
             try
             {
                 ownerships = probe.effectiveOwnership(keyspace);
cassandra git commit: remove dead code
Repository: cassandra
Updated Branches:
  refs/heads/trunk c0fc8d823 -> 7ea642c89

remove dead code

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7ea642c8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7ea642c8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7ea642c8

Branch: refs/heads/trunk
Commit: 7ea642c89c124f0515754ceb8a163370e8d74578
Parents: c0fc8d8
Author: Dave Brosius
Authored: Sun Mar 29 20:35:24 2015 -0400
Committer: Dave Brosius
Committed: Sun Mar 29 20:35:24 2015 -0400

--
 src/java/org/apache/cassandra/utils/IntervalTree.java | 9 -
 1 file changed, 9 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7ea642c8/src/java/org/apache/cassandra/utils/IntervalTree.java
--
diff --git a/src/java/org/apache/cassandra/utils/IntervalTree.java b/src/java/org/apache/cassandra/utils/IntervalTree.java
index 0c3c611..4522e27 100644
--- a/src/java/org/apache/cassandra/utils/IntervalTree.java
+++ b/src/java/org/apache/cassandra/utils/IntervalTree.java
@@ -47,19 +47,10 @@ public class IntervalTree<C extends Comparable<? super C>, D, I extends Interval<C, D>>

     protected IntervalTree(Collection<I> intervals)
     {
-        final IntervalTree it = this;
         this.head = intervals == null || intervals.isEmpty() ? null : new IntervalNode(intervals);
         this.count = intervals == null ? 0 : intervals.size();
     }

-    public static <C extends Comparable<? super C>, D, I extends Interval<C, D>> IntervalTree<C, D, I> build(Collection<I> intervals, Comparator<C> comparator)
-    {
-        if (intervals == null || intervals.isEmpty())
-            return emptyTree();
-
-        return new IntervalTree<C, D, I>(intervals);
-    }
-
     public static <C extends Comparable<? super C>, D, I extends Interval<C, D>> IntervalTree<C, D, I> build(Collection<I> intervals)
     {
         if (intervals == null || intervals.isEmpty())
[jira] [Commented] (CASSANDRA-7814) enable describe on indices
[ https://issues.apache.org/jira/browse/CASSANDRA-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386024#comment-14386024 ] Stefania commented on CASSANDRA-7814:
-

So sorry [~blerer], I tested the zip file without realizing I still had the python driver installed. Please try again. I changed the python driver {{setup.py}} to append the git hash to the root directory as well as the file name, and then I rebuilt the driver at 5f06ec5.

[~thobbs], please check commit 2671917 in https://github.com/stef1927/python-driver/commits/7814 and let us know if this seems reasonable. My doubt is that the official python driver will in all likelihood not have this commit, unless there is a way to tell python sdist to add the git hash only optionally, in which case I can open another python ticket to ask them to pick it up. At the moment, when we pick up an official release or build on the official master branch it will still work, just without the git hash. The alternative is to manually add the hash to the file name like I did the first time round, and then to enhance cqlsh to remove the hash when it calculates the root directory.

> enable describe on indices
> --
>
> Key: CASSANDRA-7814
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7814
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: radha
> Assignee: Stefania
> Priority: Minor
> Fix For: 2.1.4
>
> Describe index should be supported; right now, the only way is to export the schema and find what it really is before updating/dropping the index.
> Verified in:
> [cqlsh 3.1.8 | Cassandra 1.2.18.1 | CQL spec 3.0.0 | Thrift protocol 19.36.2]

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8670) Large columns + NIO memory pooling causes excessive direct memory usage
[ https://issues.apache.org/jira/browse/CASSANDRA-8670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386007#comment-14386007 ] Benedict commented on CASSANDRA-8670:
-

I've pushed some suggestions for further refactoring [here|https://github.com/belliottsmith/cassandra/tree/8670-suggestions]. I've only looked at the overall class hierarchy; I haven't focused yet on reviewing the method implementation changes.

Mostly these changes flatten the class hierarchy; it has gotten deep enough that I don't think there's a good reason to maintain the distinction between DataStreamOutputPlus and DataStreamOutputPlusAndChannel, especially since we often just mock up a Channel based off the OutputStream. I've also flattened NIODataOutputStream and DataOutputStreamByteBufferPlus into BufferedDataOutputStreamPlus, since we only write to the buffer if we don't exceed its size. At the same time, since we are now refactoring this whole hierarchy, I made DataOutputBuffer extend BufferedDataOutputStreamPlus, ensuring only that the buffer grows as necessary, and removed FastByteArrayOutputStream since we no longer need it. I've also stopped SequentialWriter implementing WritableByteChannel, and now pass in its internal Channel, since that's the only way the operations will benefit.

As a follow-up ticket, we should probably move SequentialWriter to utilising BufferedDataOutputStreamPlus directly, so that it can benefit from faster encoding of primitives.

Let me know what you think of the changes to the hierarchy, and once we've ironed that out we can move on to the home stretch and confirm the code changes. One other thing we could consider is dropping the "Plus" from everything except the interface, since it seems superfluous, and it's all fairly verbose. 
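A rough outline of the flattened hierarchy being described — one buffered base class that writes through a channel, plus an in-memory subclass that grows its buffer instead of flushing. The class names follow the comment; the method bodies are invented placeholders, not Cassandra's actual implementation:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

public class HierarchySketch
{
    static abstract class DataOutputStreamPlus extends OutputStream
    {
        // a single base type that can expose a channel removes the need for a
        // separate "...AndChannel" subtype
        abstract WritableByteChannel channel();
    }

    static class BufferedDataOutputStreamPlus extends DataOutputStreamPlus
    {
        protected ByteBuffer buffer;
        protected final WritableByteChannel channel;

        BufferedDataOutputStreamPlus(WritableByteChannel channel, int bufferSize)
        {
            this.channel = channel;
            this.buffer = ByteBuffer.allocate(bufferSize);
        }

        @Override
        public void write(int b) throws IOException
        {
            if (!buffer.hasRemaining())
                flushBuffer(); // we only leave the buffer when it is exhausted
            buffer.put((byte) b);
        }

        protected void flushBuffer() throws IOException
        {
            buffer.flip();
            channel.write(buffer);
            buffer.clear();
        }

        @Override
        WritableByteChannel channel()
        {
            return channel;
        }
    }

    // in-memory variant standing in for FastByteArrayOutputStream: instead of
    // flushing to a channel, it doubles the buffer and keeps writing
    static class DataOutputBuffer extends BufferedDataOutputStreamPlus
    {
        DataOutputBuffer(int initialSize)
        {
            super(null, initialSize);
        }

        @Override
        protected void flushBuffer()
        {
            ByteBuffer bigger = ByteBuffer.allocate(buffer.capacity() * 2);
            buffer.flip();
            bigger.put(buffer);
            buffer = bigger;
        }
    }

    public static void main(String[] args) throws IOException
    {
        DataOutputBuffer out = new DataOutputBuffer(4);
        for (int i = 0; i < 10; i++)
            out.write(i);
        System.out.println(out.buffer.position()); // prints "10"
    }
}
```

The point of the flattening is visible in the sketch: the buffered write path and the growable in-memory path share one implementation, differing only in what "flush" means.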
> Large columns + NIO memory pooling causes excessive direct memory usage
> ---
>
> Key: CASSANDRA-8670
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8670
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Ariel Weisberg
> Assignee: Ariel Weisberg
> Fix For: 3.0
>
> Attachments: largecolumn_test.py
>
> If you provide a large byte array to NIO and ask it to populate the byte array from a socket, it will allocate a thread-local byte buffer that is the size of the requested read, no matter how large it is. Old IO wraps new IO for sockets (but not files), so old IO is affected as well.
> Even if you are using Buffered{Input | Output}Stream you can end up passing a large byte array to NIO. The byte array read method will pass the array to NIO directly if it is larger than the internal buffer.
> Passing large cells between nodes as part of intra-cluster messaging can cause the NIO pooled buffers to quickly reach a high watermark and stay there. This ends up costing 2x the largest cell size because there is a buffer for input and one for output, since they are different threads. This is further multiplied by the number of nodes in the cluster - 1, since each has a dedicated thread pair with separate thread locals.
> Anecdotally it appears that the cost is doubled beyond that, although it isn't clear why. Possibly the control connections, or possibly there is some way in which multiple
> We need a workload in CI that tests the advertised limits of cells on a cluster. It would be reasonable to ratchet down the max direct memory for the test to trigger failures if a memory pooling issue is introduced. I don't think we need to test concurrently pulling in a lot of them, but it should at least work serially.
> The obvious fix to address this issue would be to read in smaller chunks when dealing with large values. I think "small" should still be relatively large (4 megabytes) so that code reading from a disk can amortize the cost of a seek. It can be hard to tell what the underlying thing being read from is going to be in some of the contexts where we might choose to implement switching to reading chunks.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
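A minimal sketch (not Cassandra code) of the chunked-read approach proposed above: by capping each read(), the NIO layer never sees a request larger than the chunk size, so its per-thread direct buffer stays bounded regardless of how large the target value is.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.Random;

public class ChunkedRead
{
    // 4 MB chunk, the size suggested in the ticket to keep disk seeks amortized
    static final int CHUNK_SIZE = 4 * 1024 * 1024;

    // fill target[] from the stream without ever issuing a read larger than
    // CHUNK_SIZE, so NIO's thread-local direct buffer never grows beyond it
    public static void readFully(InputStream in, byte[] target) throws IOException
    {
        int offset = 0;
        while (offset < target.length)
        {
            int length = Math.min(CHUNK_SIZE, target.length - offset);
            int read = in.read(target, offset, length);
            if (read < 0)
                throw new IOException("unexpected end of stream");
            offset += read;
        }
    }

    public static void main(String[] args) throws IOException
    {
        byte[] source = new byte[10 * 1024 * 1024];
        new Random(42).nextBytes(source);
        byte[] target = new byte[source.length];
        readFully(new ByteArrayInputStream(source), target);
        System.out.println(Arrays.equals(source, target)); // prints "true"
    }
}
```

The trade-off the description mentions is captured by CHUNK_SIZE: large enough to amortize a disk seek, small enough to bound the pooled direct buffers.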
[jira] [Updated] (CASSANDRA-8481) ghost node in gossip
[ https://issues.apache.org/jira/browse/CASSANDRA-8481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8481: --- Reproduced In: 2.0.11 Fix Version/s: 2.0.14 > ghost node in gossip > > > Key: CASSANDRA-8481 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8481 > Project: Cassandra > Issue Type: Bug >Reporter: Alexey Larkov >Priority: Minor > Fix For: 2.0.14 > > > After inaccurate removing nodes from cluster > nodetool gossipinfo and jmx > org.apache.cassandra.net.FailureDetector.AllEndpointsStates shows the node > status is LEFT. > Name Value TypeDisplay NameUpdate Interval Description > /192.168.58.75 > generation:3 > heartbeat:0 > REMOVAL_COORDINATOR:REMOVER,f9a28f8c-3244-42d1-986e-592aafe1406c > STATUS:LEFT,-3361705224534889554,141446785 > jmx org.apache.cassandra.net.FailureDetector.DownEndpointCount is 1 > node 58.75 is absent in nodetool status and system.peers table. > Before node got LEFT status it was in REMOVING state. > I've done unsafeassassinateendpoint and it's status became LEFT, but > DownEndpointCount is still 1. > And org.apache.cassandra.net.FailureDetector.SimpleStates is still DOWN. > How to remove this node from gossip? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8481) ghost node in gossip
[ https://issues.apache.org/jira/browse/CASSANDRA-8481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385862#comment-14385862 ] Alexey Larkov commented on CASSANDRA-8481: -- That was latest cassandra version by that date. 2.0.12 or 11 i guess. > ghost node in gossip > > > Key: CASSANDRA-8481 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8481 > Project: Cassandra > Issue Type: Bug >Reporter: Alexey Larkov >Priority: Minor > > After inaccurate removing nodes from cluster > nodetool gossipinfo and jmx > org.apache.cassandra.net.FailureDetector.AllEndpointsStates shows the node > status is LEFT. > Name Value TypeDisplay NameUpdate Interval Description > /192.168.58.75 > generation:3 > heartbeat:0 > REMOVAL_COORDINATOR:REMOVER,f9a28f8c-3244-42d1-986e-592aafe1406c > STATUS:LEFT,-3361705224534889554,141446785 > jmx org.apache.cassandra.net.FailureDetector.DownEndpointCount is 1 > node 58.75 is absent in nodetool status and system.peers table. > Before node got LEFT status it was in REMOVING state. > I've done unsafeassassinateendpoint and it's status became LEFT, but > DownEndpointCount is still 1. > And org.apache.cassandra.net.FailureDetector.SimpleStates is still DOWN. > How to remove this node from gossip? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-9066) BloomFilter serialization is inefficient
[ https://issues.apache.org/jira/browse/CASSANDRA-9066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict resolved CASSANDRA-9066. - Resolution: Fixed > BloomFilter serialization is inefficient > > > Key: CASSANDRA-9066 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9066 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Gustav Munkby > Fix For: 2.1.4 > > Attachments: 2.1-9066.patch > > > As pointed out by [~grddev] in CASSANDRA-9060, bloom filter serialization is > very slow. In that ticket I proposed that 2.1 use buffered serialization, and > 3.0 make the serialization format itself more efficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-9068) AntiCompaction should calculate a more accurate lower bound on bloom filter size for each target
Benedict created CASSANDRA-9068: --- Summary: AntiCompaction should calculate a more accurate lower bound on bloom filter size for each target Key: CASSANDRA-9068 URL: https://issues.apache.org/jira/browse/CASSANDRA-9068 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Fix For: 3.0 As a follow up to CASSANDRA-9060, the ratio of occupancy for each resultant file for an anticompaction group (or single sstable in 2.1) could be estimated with a tweaked version of estimatedKeysForRanges(), and a method for inverting a range collection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
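The "method for inverting a range collection" mentioned above is straightforward once the ranges are sorted and disjoint. A hypothetical sketch, using plain long token bounds rather than Cassandra's Range/Token types:

```java
import java.util.ArrayList;
import java.util.List;

public class RangeInvert
{
    // given disjoint, sorted [start, end) ranges inside [min, max),
    // return the complementary ranges
    static List<long[]> invert(List<long[]> ranges, long min, long max)
    {
        List<long[]> result = new ArrayList<>();
        long cursor = min;
        for (long[] range : ranges)
        {
            if (range[0] > cursor)
                result.add(new long[]{ cursor, range[0] }); // gap before this range
            cursor = Math.max(cursor, range[1]);
        }
        if (cursor < max)
            result.add(new long[]{ cursor, max }); // tail after the last range
        return result;
    }

    public static void main(String[] args)
    {
        List<long[]> owned = List.of(new long[]{ 10, 20 }, new long[]{ 30, 40 });
        for (long[] r : invert(owned, 0, 100))
            System.out.println(r[0] + ".." + r[1]); // prints 0..10, 20..30, 40..100
    }
}
```

With both a range collection and its inverse in hand, estimatedKeysForRanges() could be applied to each side to bound the bloom filter size of the repaired and unrepaired outputs separately.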
[jira] [Commented] (CASSANDRA-9066) BloomFilter serialization is inefficient
[ https://issues.apache.org/jira/browse/CASSANDRA-9066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385825#comment-14385825 ] Benedict commented on CASSANDRA-9066: - Committed, and filed CASSANDRA-9067 as a follow up > BloomFilter serialization is inefficient > > > Key: CASSANDRA-9066 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9066 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Gustav Munkby > Fix For: 2.1.4 > > Attachments: 2.1-9066.patch > > > As pointed out by [~grddev] in CASSANDRA-9060, bloom filter serialization is > very slow. In that ticket I proposed that 2.1 use buffered serialization, and > 3.0 make the serialization format itself more efficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9066) BloomFilter serialization is inefficient
[ https://issues.apache.org/jira/browse/CASSANDRA-9066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-9066: Assignee: Gustav Munkby > BloomFilter serialization is inefficient > > > Key: CASSANDRA-9066 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9066 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Gustav Munkby > Fix For: 2.1.4 > > Attachments: 2.1-9066.patch > > > As pointed out by [~grddev] in CASSANDRA-9060, bloom filter serialization is > very slow. In that ticket I proposed that 2.1 use buffered serialization, and > 3.0 make the serialization format itself more efficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-9067) BloomFilter serialization format should not change byte ordering
Benedict created CASSANDRA-9067: --- Summary: BloomFilter serialization format should not change byte ordering Key: CASSANDRA-9067 URL: https://issues.apache.org/jira/browse/CASSANDRA-9067 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Priority: Minor Fix For: 3.0 As a follow-up to CASSANDRA-9066 and CASSANDRA-9060, it appears we do some unnecessary byte swapping during the serialization of bloom filters, which makes the logic slower and harder to follow. We should either perform them more efficiently (using Long.reverseBytes) or, preferably, eliminate the conversion altogether since it does not appear to serve any purpose. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
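If the byte-order conversion must be kept at all, it can at least be done per word rather than per byte. A small self-contained demonstration (not Cassandra code) that Long.reverseBytes, a JIT intrinsic, matches a manual byte-at-a-time swap:

```java
public class SwapDemo
{
    // eight shift-and-mask steps, the kind of conversion the ticket wants gone
    static long manualSwap(long v)
    {
        long result = 0;
        for (int i = 0; i < 8; i++)
            result = (result << 8) | ((v >>> (8 * i)) & 0xFF); // move byte i to the far end
        return result;
    }

    public static void main(String[] args)
    {
        long v = 0x0123456789ABCDEFL;
        System.out.println(manualSwap(v) == Long.reverseBytes(v)); // prints "true"
    }
}
```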
[jira] [Commented] (CASSANDRA-9060) Anticompaction hangs on bloom filter bitset serialization
[ https://issues.apache.org/jira/browse/CASSANDRA-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385821#comment-14385821 ] Benedict commented on CASSANDRA-9060: - I've committed your patch to 2.1. 3.0 looks to already behave approximately equivalently to the behaviour introduced by this patch due to the use of HLL cardinality estimation, but both could do with estimating a better lower bound on the occupancy of each side of the result. > Anticompaction hangs on bloom filter bitset serialization > -- > > Key: CASSANDRA-9060 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9060 > Project: Cassandra > Issue Type: Bug >Reporter: Gustav Munkby >Assignee: Marcus Eriksson >Priority: Minor > Fix For: 3.0 > > Attachments: 2.1-9060-simple.patch, trunk-9060.patch > > > I tried running an incremental repair against a 15-node vnode-cluster with > roughly 500GB data running on 2.1.3-SNAPSHOT, without performing the > suggested migration steps. I manually chose a small range for the repair > (using --start/end-token). The actual repair part took almost no time at all, > but the anticompactions took a lot of time (not surprisingly). > Obviously, this might not be the ideal way to run incremental repairs, but I > wanted to look into what made the whole process so slow. The results were > rather surprising. The majority of the time was spent serializing bloom > filters. > The reason seemed to be two-fold. First, the bloom-filters generated were > huge (probably because the original SSTables were large). With a proper > migration to incremental repairs, I'm guessing this would not happen. > Secondly, however, the bloom filters were being written to the output one > byte at a time (with quite a few type-conversions on the way) to transform > the little-endian in-memory representation to the big-endian on-disk > representation. 
> I have implemented a solution where big-endian is used in-memory as well as > on-disk, which obviously makes de-/serialization much, much faster. This > introduces some slight overhead when checking the bloom filter, but I can't > see how that would be problematic. An obvious alternative would be to still > perform the serialization/deserialization using a byte array, but perform the > byte-order swap there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
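To illustrate the cost being described, here is a standalone sketch (not the actual Cassandra code) contrasting byte-at-a-time big-endian output of a bitset's long words against bulk serialization through a ByteBuffer. Both produce identical bytes, but the second avoids a method call and type conversion per byte:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Arrays;

public class BitsetSerialization
{
    // slow path: eight single-byte writes plus shifts per 64-bit word
    static byte[] byteAtATime(long[] words) throws IOException
    {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (long word : words)
            for (int shift = 56; shift >= 0; shift -= 8)
                out.write((int) (word >>> shift)); // one write() call per byte
        return out.toByteArray();
    }

    // fast path: whole-long writes via a ByteBuffer (big-endian by default)
    static byte[] bulk(long[] words)
    {
        ByteBuffer buffer = ByteBuffer.allocate(words.length * Long.BYTES);
        for (long word : words)
            buffer.putLong(word);
        return buffer.array();
    }

    public static void main(String[] args) throws IOException
    {
        long[] words = { 0x0123456789ABCDEFL, -1L, 0L };
        System.out.println(Arrays.equals(byteAtATime(words), bulk(words))); // prints "true"
    }
}
```

Storing the bitset big-endian in memory, as the reporter's solution does, removes even the per-word conversion at serialization time, at a small cost on the lookup path.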
[2/5] cassandra git commit: Fix anti-compaction target bloom filter size
Fix anti-compaction target bloom filter size

patch by Gustav Munkby; reviewed by benedict for CASSANDRA-9060

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b0de3270
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b0de3270
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b0de3270

Branch: refs/heads/cassandra-2.1
Commit: b0de327099c22dd4708b699dfa9e18496abd7429
Parents: 7b1331f
Author: Gustav Munkby
Authored: Sun Mar 29 16:13:23 2015 +0100
Committer: Benedict Elliott Smith
Committed: Sun Mar 29 16:19:54 2015 +0100

--
 CHANGES.txt | 1 +
 .../org/apache/cassandra/db/compaction/CompactionManager.java | 3 +--
 2 files changed, 2 insertions(+), 2 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b0de3270/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 8854261..c02af99 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.4
+ * Fix anti-compaction target bloom filter size (CASSANDRA-9060)
 * Make FROZEN and TUPLE unreserved keywords in CQL (CASSANDRA-9047)
 * Prevent AssertionError from SizeEstimatesRecorder (CASSANDRA-9034)
 * Avoid overwriting index summaries for sstables with an older format that

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b0de3270/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index 992378f..b9c4553 100644
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@ -1050,8 +1050,6 @@ public class CompactionManager implements CompactionManagerMBean
         List<SSTableReader> anticompactedSSTables = new ArrayList<>();
         int repairedKeyCount = 0;
         int unrepairedKeyCount = 0;
-        // TODO(5351): we can do better here:
-        int expectedBloomFilterSize = Math.max(cfs.metadata.getMinIndexInterval(), (int)(SSTableReader.getApproximateKeyCount(repairedSSTables)));
         logger.info("Performing anticompaction on {} sstables", repairedSSTables.size());
         // iterate over sstables to check if the repaired / unrepaired ranges intersect them.
         for (SSTableReader sstable : repairedSSTables)
@@ -1075,6 +1073,7 @@ public class CompactionManager implements CompactionManagerMBean
             try (AbstractCompactionStrategy.ScannerList scanners = cfs.getCompactionStrategy().getScanners(new HashSet<>(Collections.singleton(sstable)));
                  CompactionController controller = new CompactionController(cfs, sstableAsSet, CFMetaData.DEFAULT_GC_GRACE_SECONDS))
             {
+                int expectedBloomFilterSize = Math.max(cfs.metadata.getMinIndexInterval(), (int)sstable.estimatedKeys());
                 repairedSSTableWriter.switchWriter(CompactionManager.createWriter(cfs, destination, expectedBloomFilterSize, repairedAt, sstable));
                 unRepairedSSTableWriter.switchWriter(CompactionManager.createWriter(cfs, destination, expectedBloomFilterSize, ActiveRepairService.UNREPAIRED_SSTABLE, sstable));
[1/5] cassandra git commit: Buffer bloom filter serialization
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 7b1331fed -> d3258f615
  refs/heads/trunk 95d5d8b23 -> c0fc8d823

Buffer bloom filter serialization

patch by Gustav Munkby; reviewed by benedict for CASSANDRA-9066

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d3258f61
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d3258f61
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d3258f61

Branch: refs/heads/cassandra-2.1
Commit: d3258f6152eda3be4cba0a021ea34fcb34b7a569
Parents: b0de327
Author: Gustav Munkby
Authored: Sun Mar 29 16:17:56 2015 +0100
Committer: Benedict Elliott Smith
Committed: Sun Mar 29 16:19:54 2015 +0100

--
 CHANGES.txt | 1 +
 .../apache/cassandra/io/sstable/SSTableWriter.java | 14 +++---
 2 files changed, 4 insertions(+), 11 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d3258f61/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index c02af99..bd5e277 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.4
+ * Buffer bloom filter serialization (CASSANDRA-9066)
 * Fix anti-compaction target bloom filter size (CASSANDRA-9060)
 * Make FROZEN and TUPLE unreserved keywords in CQL (CASSANDRA-9047)
 * Prevent AssertionError from SizeEstimatesRecorder (CASSANDRA-9034)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d3258f61/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
--
diff --git a/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java b/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
index 440961f..a39c134 100644
--- a/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
+++ b/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
@@ -17,10 +17,7 @@
  */
 package org.apache.cassandra.io.sstable;

-import java.io.DataInput;
-import java.io.File;
-import java.io.FileOutputStream;
-import java.io.IOException;
+import java.io.*;
 import java.nio.ByteBuffer;
 import java.util.Arrays;
 import java.util.Collections;
@@ -55,12 +52,7 @@ import org.apache.cassandra.io.sstable.metadata.MetadataCollector;
 import org.apache.cassandra.io.sstable.metadata.MetadataComponent;
 import org.apache.cassandra.io.sstable.metadata.MetadataType;
 import org.apache.cassandra.io.sstable.metadata.StatsMetadata;
-import org.apache.cassandra.io.util.DataOutputPlus;
-import org.apache.cassandra.io.util.DataOutputStreamAndChannel;
-import org.apache.cassandra.io.util.FileMark;
-import org.apache.cassandra.io.util.FileUtils;
-import org.apache.cassandra.io.util.SegmentedFile;
-import org.apache.cassandra.io.util.SequentialWriter;
+import org.apache.cassandra.io.util.*;
 import org.apache.cassandra.service.StorageService;
 import org.apache.cassandra.utils.ByteBufferUtil;
 import org.apache.cassandra.utils.FBUtilities;
@@ -647,7 +639,7 @@ public class SSTableWriter extends SSTable
         {
             // bloom filter
             FileOutputStream fos = new FileOutputStream(path);
-            DataOutputStreamAndChannel stream = new DataOutputStreamAndChannel(fos);
+            DataOutputStreamPlus stream = new DataOutputStreamPlus(new BufferedOutputStream(fos));
             FilterFactory.serialize(bf, stream);
             stream.flush();
             fos.getFD().sync();
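The pattern the patch adopts — buffer the writes, flush the stream, then sync the file descriptor so the bloom filter is durable — can be sketched with plain JDK streams (DataOutputStream standing in here for Cassandra's DataOutputStreamPlus):

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

public class DurableWrite
{
    static void writeAndSync(File file, byte[] payload) throws IOException
    {
        try (FileOutputStream fos = new FileOutputStream(file))
        {
            DataOutputStream stream = new DataOutputStream(new BufferedOutputStream(fos));
            stream.write(payload);  // buffered: a few large write() syscalls instead of one per byte
            stream.flush();         // push buffered bytes down to the file
            fos.getFD().sync();     // force them to stable storage
        }
    }

    public static void main(String[] args) throws IOException
    {
        File tmp = File.createTempFile("bf", ".db");
        tmp.deleteOnExit();
        writeAndSync(tmp, new byte[]{ 1, 2, 3 });
        System.out.println(tmp.length()); // prints "3"
    }
}
```

Keeping a handle on the raw FileOutputStream is what makes the getFD().sync() call possible after the buffered wrapper has flushed.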
[5/5] cassandra git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk

Conflicts:
    src/java/org/apache/cassandra/db/compaction/CompactionManager.java
    src/java/org/apache/cassandra/io/sstable/format/big/BigTableWriter.java

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c0fc8d82
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c0fc8d82
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c0fc8d82

Branch: refs/heads/trunk
Commit: c0fc8d8236ee2e6e58929a7ab8c74a49fe2a6622
Parents: 95d5d8b d3258f6
Author: Benedict Elliott Smith
Authored: Sun Mar 29 16:20:53 2015 +0100
Committer: Benedict Elliott Smith
Committed: Sun Mar 29 16:20:53 2015 +0100

--
 CHANGES.txt | 2 ++
 .../cassandra/db/compaction/CompactionManager.java | 2 --
 .../io/sstable/format/big/BigTableWriter.java | 14 +++---
 3 files changed, 5 insertions(+), 13 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c0fc8d82/CHANGES.txt
--
diff --cc CHANGES.txt
index 739926e,bd5e277..e66b724
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,83 -1,6 +1,85 @@@
+3.0
+ * Compressed Commit Log (CASSANDRA-6809)
+ * Optimise IntervalTree (CASSANDRA-8988)
+ * Add a key-value payload for third party usage (CASSANDRA-8553)
+ * Bump metrics-reporter-config dependency for metrics 3.0 (CASSANDRA-8149)
+ * Partition intra-cluster message streams by size, not type (CASSANDRA-8789)
+ * Add WriteFailureException to native protocol, notify coordinator of
+   write failures (CASSANDRA-8592)
+ * Convert SequentialWriter to nio (CASSANDRA-8709)
+ * Add role based access control (CASSANDRA-7653, 8650, 7216, 8760, 8849, 8761, 8850)
+ * Record client ip address in tracing sessions (CASSANDRA-8162)
+ * Indicate partition key columns in response metadata for prepared
+   statements (CASSANDRA-7660)
+ * Merge UUIDType and TimeUUIDType parse logic (CASSANDRA-8759)
+ * Avoid memory allocation when searching index summary (CASSANDRA-8793)
+ * Optimise (Time)?UUIDType Comparisons (CASSANDRA-8730)
+ * Make CRC32Ex into a separate maven dependency (CASSANDRA-8836)
+ * Use preloaded jemalloc w/ Unsafe (CASSANDRA-8714)
+ * Avoid accessing partitioner through StorageProxy (CASSANDRA-8244, 8268)
+ * Upgrade Metrics library and remove depricated metrics (CASSANDRA-5657)
+ * Serializing Row cache alternative, fully off heap (CASSANDRA-7438)
+ * Duplicate rows returned when in clause has repeated values (CASSANDRA-6707)
+ * Make CassandraException unchecked, extend RuntimeException (CASSANDRA-8560)
+ * Support direct buffer decompression for reads (CASSANDRA-8464)
+ * DirectByteBuffer compatible LZ4 methods (CASSANDRA-7039)
+ * Group sstables for anticompaction correctly (CASSANDRA-8578)
+ * Add ReadFailureException to native protocol, respond
+   immediately when replicas encounter errors while handling
+   a read request (CASSANDRA-7886)
+ * Switch CommitLogSegment from RandomAccessFile to nio (CASSANDRA-8308)
+ * Allow mixing token and partition key restrictions (CASSANDRA-7016)
+ * Support index key/value entries on map collections (CASSANDRA-8473)
+ * Modernize schema tables (CASSANDRA-8261)
+ * Support for user-defined aggregation functions (CASSANDRA-8053)
+ * Fix NPE in SelectStatement with empty IN values (CASSANDRA-8419)
+ * Refactor SelectStatement, return IN results in natural order instead
+   of IN value list order and ignore duplicate values in partition key IN restrictions (CASSANDRA-7981)
+ * Support UDTs, tuples, and collections in user-defined
+   functions (CASSANDRA-7563)
+ * Fix aggregate fn results on empty selection, result column name,
+   and cqlsh parsing (CASSANDRA-8229)
+ * Mark sstables as repaired after full repair (CASSANDRA-7586)
+ * Extend Descriptor to include a format value and refactor reader/writer
+   APIs (CASSANDRA-7443)
+ * Integrate JMH for microbenchmarks (CASSANDRA-8151)
+ * Keep sstable levels when bootstrapping (CASSANDRA-7460)
+ * Add Sigar library and perform basic OS settings check on startup (CASSANDRA-7838)
+ * Support for aggregation functions (CASSANDRA-4914)
+ * Remove cassandra-cli (CASSANDRA-7920)
+ * Accept dollar quoted strings in CQL (CASSANDRA-7769)
+ * Make assassinate a first class command (CASSANDRA-7935)
+ * Support IN clause on any partition key column (CASSANDRA-7855)
+ * Support IN clause on any clustering column (CASSANDRA-4762)
+ * Improve compaction logging (CASSANDRA-7818)
+ * Remove YamlFileNetworkTopologySnitch (CASSANDRA-7917)
+ * Do anticompaction in groups (CASSANDRA-6851)
+ * Support user-defined functions (CASSANDRA-7395, 7526, 7562, 7740, 7781, 7929,
+   7924, 7812, 8063, 7813, 7708)
+ * Permit c
[3/5] cassandra git commit: Buffer bloom filter serialization
Buffer bloom filter serialization

patch by Gustav Munkby; reviewed by benedict for CASSANDRA-9066

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d3258f61
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d3258f61
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d3258f61

Branch: refs/heads/trunk
Commit: d3258f6152eda3be4cba0a021ea34fcb34b7a569
Parents: b0de327
Author: Gustav Munkby
Authored: Sun Mar 29 16:17:56 2015 +0100
Committer: Benedict Elliott Smith
Committed: Sun Mar 29 16:19:54 2015 +0100

 CHANGES.txt                                                |  1 +
 .../apache/cassandra/io/sstable/SSTableWriter.java         | 14 +++---
 2 files changed, 4 insertions(+), 11 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d3258f61/CHANGES.txt

diff --git a/CHANGES.txt b/CHANGES.txt
index c02af99..bd5e277 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.4
+ * Buffer bloom filter serialization (CASSANDRA-9066)
  * Fix anti-compaction target bloom filter size (CASSANDRA-9060)
  * Make FROZEN and TUPLE unreserved keywords in CQL (CASSANDRA-9047)
  * Prevent AssertionError from SizeEstimatesRecorder (CASSANDRA-9034)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d3258f61/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java

diff --git a/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java b/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
index 440961f..a39c134 100644
--- a/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
+++ b/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
@@ -17,10 +17,7 @@
  */
 package org.apache.cassandra.io.sstable;

-import java.io.DataInput;
-import java.io.File;
-import java.io.FileOutputStream;
-import java.io.IOException;
+import java.io.*;
 import java.nio.ByteBuffer;
 import java.util.Arrays;
 import java.util.Collections;
@@ -55,12 +52,7 @@
 import org.apache.cassandra.io.sstable.metadata.MetadataCollector;
 import org.apache.cassandra.io.sstable.metadata.MetadataComponent;
 import org.apache.cassandra.io.sstable.metadata.MetadataType;
 import org.apache.cassandra.io.sstable.metadata.StatsMetadata;
-import org.apache.cassandra.io.util.DataOutputPlus;
-import org.apache.cassandra.io.util.DataOutputStreamAndChannel;
-import org.apache.cassandra.io.util.FileMark;
-import org.apache.cassandra.io.util.FileUtils;
-import org.apache.cassandra.io.util.SegmentedFile;
-import org.apache.cassandra.io.util.SequentialWriter;
+import org.apache.cassandra.io.util.*;
 import org.apache.cassandra.service.StorageService;
 import org.apache.cassandra.utils.ByteBufferUtil;
 import org.apache.cassandra.utils.FBUtilities;
@@ -647,7 +639,7 @@ public class SSTableWriter extends SSTable
         // bloom filter
         FileOutputStream fos = new FileOutputStream(path);
-        DataOutputStreamAndChannel stream = new DataOutputStreamAndChannel(fos);
+        DataOutputStreamPlus stream = new DataOutputStreamPlus(new BufferedOutputStream(fos));
         FilterFactory.serialize(bf, stream);
         stream.flush();
         fos.getFD().sync();
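The substance of the patch is the one-line change at the bottom of the hunk: the FileOutputStream is wrapped in a BufferedOutputStream, so that FilterFactory.serialize no longer pushes the bloom filter to the file in many small writes. A rough, self-contained sketch of why that matters (illustrative names only, not Cassandra's actual FilterFactory):

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Illustrative sketch, not Cassandra code. serializeBitset mimics the shape
// of a bitset serializer: a 4-byte length header followed by one long per
// 64-bit word of the filter.
public class BufferedSerializationSketch
{
    static void serializeBitset(DataOutput out, long[] words) throws IOException
    {
        out.writeInt(words.length);   // 4-byte header
        for (long w : words)
            out.writeLong(w);         // 8 bytes per word
    }

    // Helper to check the serialized size: 4 + 8 * words.length bytes.
    static int serializedLength(long[] words)
    {
        try
        {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            serializeBitset(new DataOutputStream(baos), words);
            return baos.size();
        }
        catch (IOException e)
        {
            throw new AssertionError(e); // ByteArrayOutputStream cannot fail
        }
    }

    // The fix in the commit, in miniature: unbuffered writes go straight to
    // the underlying stream (one syscall each when it is a FileOutputStream);
    // buffering batches them into large writes before they hit the OS.
    static DataOutputStream buffered(OutputStream raw)
    {
        return new DataOutputStream(new BufferedOutputStream(raw, 64 * 1024));
    }
}
```

For a multi-megabyte filter, the difference between one write per long and one write per 64KB buffer is what turns "hangs on serialization" into a fast flush.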
[jira] [Commented] (CASSANDRA-9060) Anticompaction hangs on bloom filter bitset serialization
[ https://issues.apache.org/jira/browse/CASSANDRA-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385804#comment-14385804 ] Benedict commented on CASSANDRA-9060: - bq. I think the immediate problem is that they are created to allow room for all keys in all anticompacted tables, whereas anticompactions process one table at a time Thanks. You're right, and this is definitely something to fix in 2.1. In this instance we don't use HLL cardinality estimators, but the index summary, which isn't probabilistic. What it is, however, is only accurate to a certain granularity. As a first patch your approach reduces the problem to the one I initially assumed it was, i.e. a doubling of required space (instead of \*N), but with a small amount of TLC the estimatedKeysForRanges() method could be modified to give a lower bound for the size of both resultant tables (at the moment it can significantly overestimate in some scenarios, but also cannot easily estimate the cardinality of the negation of the range - so we would have to subtract the overestimation, giving an underestimate, which is much worse). Your patch looks to me to significantly improve the status quo, so I will commit it now, and we can address a slightly improved patch for perhaps 2.1.5. > Anticompaction hangs on bloom filter bitset serialization > -- > > Key: CASSANDRA-9060 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9060 > Project: Cassandra > Issue Type: Bug >Reporter: Gustav Munkby >Assignee: Marcus Eriksson >Priority: Minor > Fix For: 3.0 > > Attachments: 2.1-9060-simple.patch, trunk-9060.patch > > > I tried running an incremental repair against a 15-node vnode-cluster with > roughly 500GB data running on 2.1.3-SNAPSHOT, without performing the > suggested migration steps. I manually chose a small range for the repair > (using --start/end-token).
The actual repair part took almost no time at all, > but the anticompactions took a lot of time (not surprisingly). > Obviously, this might not be the ideal way to run incremental repairs, but I > wanted to look into what made the whole process so slow. The results were > rather surprising. The majority of the time was spent serializing bloom > filters. > The reason seemed to be two-fold. First, the bloom-filters generated were > huge (probably because the original SSTables were large). With a proper > migration to incremental repairs, I'm guessing this would not happen. > Secondly, however, the bloom filters were being written to the output one > byte at a time (with quite a few type-conversions on the way) to transform > the little-endian in-memory representation to the big-endian on-disk > representation. > I have implemented a solution where big-endian is used in-memory as well as > on-disk, which obviously makes de-/serialization much, much faster. This > introduces some slight overhead when checking the bloom filter, but I can't > see how that would be problematic. An obvious alternative would be to still > perform the serialization/deserialization using a byte array, but perform the > byte-order swap there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
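The byte-order alternative mentioned at the end of the description (swap endianness through a byte array rather than one byte at a time) could look roughly like this; the helper names are hypothetical, not the attached patch:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch of bulk endianness conversion for a bloom filter bitset: instead of
// emitting the little-endian in-memory long[] one byte at a time with shifts
// and casts, copy it through a big-endian ByteBuffer in one bulk transfer.
public class ByteOrderSwapSketch
{
    static byte[] toBigEndian(long[] words)
    {
        ByteBuffer bb = ByteBuffer.allocate(words.length * 8).order(ByteOrder.BIG_ENDIAN);
        bb.asLongBuffer().put(words); // one bulk conversion, no per-byte work
        return bb.array();
    }

    static long[] fromBigEndian(byte[] bytes)
    {
        long[] words = new long[bytes.length / 8];
        ByteBuffer.wrap(bytes).order(ByteOrder.BIG_ENDIAN).asLongBuffer().get(words);
        return words;
    }
}
```

The committed fix for 9060/9066 took a different route (buffering the output stream), but this shows the "swap in a byte array" idea the reporter contrasts against storing big-endian in memory.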
[jira] [Commented] (CASSANDRA-9060) Anticompaction hangs on bloom filter bitset serialization
[ https://issues.apache.org/jira/browse/CASSANDRA-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385767#comment-14385767 ] Gustav Munkby commented on CASSANDRA-9060: -- Regarding the size of the Bloom filters, I think the immediate problem is that they are created to allow room for all keys in all anticompacted tables, whereas anticompactions process one table at a time. I've added a patch which I believe fixes exactly that. Given that this change is fairly small, I targeted it at 2.1. As the keys are going to be distributed over the two resulting tables, in the ideal world we might want to have much smaller bloom filters on either side than what we initially thought. I'm guessing this is a general problem with compactions, but the HyperLogLog cardinality estimators should help in the normal case. For the general case of ensuring the Bloom filters are not too large, I can see basically two solutions: either introduce a scanning phase before the actual compaction, where the size of the bloom filter(s) is calculated, or reduce the size of the Bloom filter once compaction has completed. The obvious implementation of the latter would be to scan through the compacted index, possibly gated by a comparison of the index size and the bloom filter size. I guess scanning through the index could be avoided by making sure that the IndexWriter kept track of multiple Bloom filters of exponentially growing sizes. That way, once the index is complete, the most appropriate Bloom filter could be picked and written to disk, discarding the others.
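The "multiple Bloom filters of exponentially growing sizes" idea above can be made concrete with its selection step: keep filters sized for n, n/2, n/4, ... expected keys, and once the index is complete keep the smallest one whose capacity covers the actual key count. The capacities and floor here are hypothetical, purely illustrative:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the tiered-filter proposal (not a Cassandra API):
// given a worst-case key estimate, candidate filter capacities halve down
// to some floor; after writing, pick the smallest capacity that still
// covers the number of keys actually seen, and discard the larger filters.
public class TieredFilterSketch
{
    static long pickCapacity(long maxExpectedKeys, long actualKeys)
    {
        List<Long> capacities = new ArrayList<>();
        for (long c = maxExpectedKeys; c >= 1024; c /= 2) // 1024 = arbitrary floor
            capacities.add(c);

        long best = maxExpectedKeys; // fall back to the worst-case size
        for (long c : capacities)
            if (c >= actualKeys)
                best = Math.min(best, c);
        return best;
    }
}
```

The trade-off is memory during the write (several filters are maintained at once) against never having to rescan the index to rebuild a right-sized filter afterwards.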
[jira] [Updated] (CASSANDRA-9060) Anticompaction hangs on bloom filter bitset serialization
[ https://issues.apache.org/jira/browse/CASSANDRA-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gustav Munkby updated CASSANDRA-9060: - Attachment: 2.1-9060-simple.patch
[jira] [Updated] (CASSANDRA-9066) BloomFilter serialization is inefficient
[ https://issues.apache.org/jira/browse/CASSANDRA-9066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gustav Munkby updated CASSANDRA-9066: - Attachment: 2.1-9066.patch > BloomFilter serialization is inefficient > > > Key: CASSANDRA-9066 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9066 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict > Fix For: 2.1.4 > > Attachments: 2.1-9066.patch > > > As pointed out by [~grddev] in CASSANDRA-9060, bloom filter serialization is > very slow. In that ticket I proposed that 2.1 use buffered serialization, and > 3.0 make the serialization format itself more efficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7282) Faster Memtable map
[ https://issues.apache.org/jira/browse/CASSANDRA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385740#comment-14385740 ] Benedict commented on CASSANDRA-7282: - bq. +1 for massive understatement. Thanks. I spent a day working on just that, so glad it panned out :) I'm currently leaning towards postponing this ticket until 3.1, since some careful consideration is needed to ensure a uniform distribution of hash keys within the map, especially without vnodes on large clusters. It's possible we could only enable this optimisation on nodes that can predict their distribution will be fair. In either case I think it may be helpful to consider the ticket in relation to CASSANDRA-7032 and CASSANDRA-6696, by e.g. having a separate hash table for each vnode range. Depending on 3.0 release timeline, the incorporation of these tickets, and on the progression of my other commitments, I may still aim to deliver this in 3.0, but just alerting that at the moment my view is this is uncertain and on balance less than likely. > Faster Memtable map > --- > > Key: CASSANDRA-7282 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7282 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Benedict > Labels: performance > Fix For: 3.0 > > Attachments: jasobrown-sample-run.txt, profile.yaml, reads.svg, > run1.svg, writes.svg > > > Currently we maintain a ConcurrentSkipLastMap of DecoratedKey -> Partition in > our memtables. Maintaining this is an O(lg(n)) operation; since the vast > majority of users use a hash partitioner, it occurs to me we could maintain a > hybrid ordered list / hash map. The list would impose the normal order on the > collection, but a hash index would live alongside as part of the same data > structure, simply mapping into the list and permitting O(1) lookups and > inserts. 
> I've chosen to implement this initial version as a linked-list node per item, > but we can optimise this in future by storing fatter nodes that permit a > cache-line's worth of hashes to be checked at once, further reducing the > constant factor costs for lookups. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
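A toy version of the hybrid structure the ticket describes (a hash index mapping keys into an ordered linked list) might look like the sketch below. Note the simplification: this insert scans the list to find its position, whereas the proposal's point is that with a well-distributed hash partitioner the neighbour can also be located in O(1); lookups here are O(1) either way, with no skip-list descent.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative, single-threaded sketch only; the real memtable map must be
// concurrent. The list imposes key order, the hash index gives O(1) get().
public class HybridOrderedMapSketch<V>
{
    static final class Node<V>
    {
        final long key;
        V value;
        Node<V> next;
        Node(long key, V value) { this.key = key; this.value = value; }
    }

    private final Node<V> head = new Node<>(Long.MIN_VALUE, null); // sentinel
    private final Map<Long, Node<V>> index = new HashMap<>();

    public V get(long key)
    {
        Node<V> n = index.get(key); // O(1) via the hash index
        return n == null ? null : n.value;
    }

    public void put(long key, V value)
    {
        Node<V> existing = index.get(key);
        if (existing != null) { existing.value = value; return; }
        Node<V> prev = head; // simplified O(n) position search (see note above)
        while (prev.next != null && prev.next.key < key)
            prev = prev.next;
        Node<V> n = new Node<>(key, value);
        n.next = prev.next;
        prev.next = n;
        index.put(key, n);
    }

    public List<Long> keysInOrder()
    {
        List<Long> keys = new ArrayList<>();
        for (Node<V> n = head.next; n != null; n = n.next)
            keys.add(n.key);
        return keys;
    }
}
```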
[jira] [Updated] (CASSANDRA-9066) BloomFilter serialization is inefficient
[ https://issues.apache.org/jira/browse/CASSANDRA-9066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-9066: Reviewer: Benedict
[jira] [Commented] (CASSANDRA-9060) Anticompaction hangs on bloom filter bitset serialization
[ https://issues.apache.org/jira/browse/CASSANDRA-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385735#comment-14385735 ] Benedict commented on CASSANDRA-9060: - I've split the slow serialization problem out into CASSANDRA-9066, since the problem of anti-compaction mispredicting the number of rows can at worst halve performance, whereas the slow serialization could have an order of magnitude impact. [~grddev]: do you want to have a stab at that ticket?
[jira] [Created] (CASSANDRA-9066) BloomFilter serialization is inefficient
Benedict created CASSANDRA-9066: --- Summary: BloomFilter serialization is inefficient Key: CASSANDRA-9066 URL: https://issues.apache.org/jira/browse/CASSANDRA-9066 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benedict Fix For: 2.1.4
[jira] [Commented] (CASSANDRA-8984) Introduce Transactional API for behaviours that can corrupt system state
[ https://issues.apache.org/jira/browse/CASSANDRA-8984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385731#comment-14385731 ] Benedict commented on CASSANDRA-8984: - bq. the more interacting objects and abstractions around behavior we have, the more the complexity burden My model of complexity is: for any set of actions (units of execution) or abstractions you want to understand or modify, what is the transitive closure of _interactions_ with other actions/abstractions that need to be understood and considered in conjunction to ensure correctness. In parallel with this, I would suggest that fragility is the portion of this complexity that is implicit, or easily missed\*. To go back to your 5 20x complexity vs 100 1x complexity, this can be the difference between additive and multiplicative complexity. If all 20 points of complexity in each five classes can interact with any other point in any of the five classes, then the complexity burden is 3.2M, not 100. \* or if the complexity is too large to fit into your working memory My point is simply that if a new class reduces the number of interactions that need to be considered (i.e. isolation), then complexity is reduced. This is a bit of an abstract discussion, but I do love me some meta argumentation. (In my model of complexity, what I called acclimation is the number of high level abstractions a newcomer needs to have a vague understanding of to mentally map and model the overall functional unit they're addressing. I think this complexity is completely drowned out by the other once real work starts to happen. NB: I don't pretend this model of complexity is complete, but I think it serves for this discussion) bq. having to manage some state transitions manually leaks that portion of the Transactional abstraction Leakage at the precise clearly defined point cuts for interaction (i.e. 
the abstract methods requiring some boilerplate) isn't such a problem for complexity (by my definition), but it is _ugly_. I've uploaded an alternative approach [here|https://github.com/belliottsmith/cassandra/tree/8984-alt] that I do prefer, but it technically increases the number of classes and doesn't reduce the amount of boilerplate, so I initially avoided it (as inner classes can be even worse for acclimation IME); it does have the advantage of that boilerplate being better managed by the compiler and IDE. That is, solving the multiple inheritance problem through Java's only other mechanism besides code duplication: implementation proxies. bq. What's your confidence regarding the likelihood of 8690 delivering on that safety? safety or safely? The latter: high; the former: medium (I'm sure we can improve it, but doubt we'll get it to the same level) bq. I haven't had the time to sit down and really consider revisions to this design I'll leave both approaches I've concocted in your court for now, then. If you can come up with a third approach, I'm all ears :) > Introduce Transactional API for behaviours that can corrupt system state > > > Key: CASSANDRA-8984 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8984 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Benedict > Fix For: 2.1.4 > > Attachments: 8984_windows_timeout.txt > > > As a penultimate (and probably final for 2.1, if we agree to introduce it > there) round of changes to the internals managing sstable writing, I've > introduced a new API called "Transactional" that I hope will make it much > easier to write correct behaviour. As things stand we conflate a lot of > behaviours into methods like "close" - the recent changes unpicked some of > these, but didn't go far enough.
My proposal here introduces an interface > designed to support four actions (on top of their normal function): > * prepareToCommit > * commit > * abort > * cleanup > In normal operation, once we have finished constructing a state change we > call prepareToCommit; once all such state changes are prepared, we call > commit. If at any point everything fails, abort is called. In _either_ case, > cleanup is called at the very last. > These transactional objects are all AutoCloseable, with the behaviour being > to rollback any changes unless commit has completed successfully. > The changes are actually less invasive than it might sound, since we did > recently introduce abort in some places, as well as have commit like methods. > This simply formalises the behaviour, and makes it consistent between all > objects that interact in this way. Much of the code change is boilerplate, > such as moving an object into a try-declaration, although the change is still > non-trivial. What it
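The lifecycle described in the ticket can be sketched as follows. The interface shape mirrors the four actions listed (prepareToCommit, commit, abort, cleanup) and the AutoCloseable rollback rule, but this is a minimal illustration, not the committed Cassandra API:

```java
// Sketch of the Transactional lifecycle: commit marks success; close()
// rolls back (abort) unless commit completed, and cleanup always runs last.
public class TransactionalSketch
{
    interface Transactional extends AutoCloseable
    {
        void prepareToCommit();
        void commit();
        void abort();
        void cleanup();
    }

    static abstract class AbstractTransactional implements Transactional
    {
        private boolean committed;

        public final void commit()
        {
            doCommit();
            committed = true;
        }

        public final void close()
        {
            if (!committed)
                abort();   // roll back any partial state change
            cleanup();     // always runs, in either case
        }

        protected abstract void doCommit();
    }

    // Minimal concrete example that records the lifecycle calls it receives.
    static final class Recorder extends AbstractTransactional
    {
        final StringBuilder log = new StringBuilder();
        public void prepareToCommit() { log.append("P"); }
        protected void doCommit()     { log.append("C"); }
        public void abort()           { log.append("A"); }
        public void cleanup()         { log.append("X"); }
    }
}
```

Used in a try-with-resources block, the happy path produces prepare, commit, cleanup; any exception before commit produces abort, cleanup, which is exactly the "rollback unless commit completed" behaviour the ticket formalises.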
[jira] [Commented] (CASSANDRA-7807) Push notification when tracing completes for an operation
[ https://issues.apache.org/jira/browse/CASSANDRA-7807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385718#comment-14385718 ] Robert Stupp commented on CASSANDRA-7807: - {{ServerConnection.java}} done {{debug-cql}} no, did not forget. Otherwise {{debug-cql}} would not start if any other process (C*) uses port 7199 - i.e. {{debug-cql}} would not start if C* is running locally. It’s probably better handled in a separate ticket - or we decide to remove sourcing of {{cassandra-env.sh}} altogether for {{debug-cql}}. Not sure about either way. {{SimpleClient.java}} / {{TransportException}} Yea - it’s a bit awkward. But TE is only implemented by C* exceptions that extend RuntimeException. These are: {{ServerError}}, {{ProtocolException}} and {{CassandraException}}. Eventually it makes more sense to let {{ServerError}} and {{ProtocolException}} extend {{CassandraException}} and get rid of {{TransportException}}. We already have an ”exception cleanup ticket” for 3.0 (CASSANDRA-8809) - maybe it’s worth cleaning this up, too. {{TraceCompleteTest.java}} Worked in the increased timeout. Also changed the code to deal better with trace probability (via an explicit boolean whether the event should be sent or not). Maybe I counted too much on the ”correct” result of the test. Let me know what you think about the TE and debug-cql things and I'll provide a matching patch for that. The branch is updated. > Push notification when tracing completes for an operation > - > > Key: CASSANDRA-7807 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7807 > Project: Cassandra > Issue Type: Sub-task > Components: Core >Reporter: Tyler Hobbs >Assignee: Robert Stupp >Priority: Minor > Labels: client-impacting, protocolv4 > Fix For: 3.0 > > Attachments: 7807-v2.txt, 7807-v3.txt, 7807.txt > > > Tracing is an asynchronous operation, and drivers currently poll to determine > when the trace is complete (in a loop with sleeps).
Instead, the server > could push a notification to the driver when the trace completes. > I'm guessing that most of the work for this will be around pushing > notifications to a single connection instead of all connections that have > registered listeners for a particular event type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
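The per-connection delivery the ticket calls out (push to the single connection that registered, rather than broadcasting to all registered listeners of an event type) can be sketched minimally; all names here are hypothetical, not Cassandra's Server/ConnectionTracker classes:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy tracker: register(type, channel) records interest, and notifyOne
// delivers an event to exactly one channel, only if that channel registered
// for the event type, instead of fanning out to every registered channel.
public class SingleConnectionNotifySketch
{
    enum EventType { TRACE_COMPLETE, SCHEMA_CHANGE }

    static final class Tracker
    {
        private final Map<EventType, Set<String>> registered = new HashMap<>();
        final List<String> delivered = new ArrayList<>();

        void register(EventType type, String channel)
        {
            registered.computeIfAbsent(type, t -> new HashSet<>()).add(channel);
        }

        boolean isRegistered(EventType type, String channel)
        {
            return registered.getOrDefault(type, Collections.<String>emptySet()).contains(channel);
        }

        void notifyOne(EventType type, String channel)
        {
            if (isRegistered(type, channel))
                delivered.add(channel + ":" + type);
        }
    }
}
```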
[jira] [Updated] (CASSANDRA-7304) Ability to distinguish between NULL and UNSET values in Prepared Statements
[ https://issues.apache.org/jira/browse/CASSANDRA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oded Peer updated CASSANDRA-7304: - Attachment: 7304-05.patch Rebased to trunk > Ability to distinguish between NULL and UNSET values in Prepared Statements > --- > > Key: CASSANDRA-7304 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7304 > Project: Cassandra > Issue Type: Sub-task >Reporter: Drew Kutcharian >Assignee: Oded Peer > Labels: cql, protocolv4 > Fix For: 3.0 > > Attachments: 7304-03.patch, 7304-04.patch, 7304-05.patch, > 7304-2.patch, 7304.patch > > > Currently Cassandra inserts tombstones when a value of a column is bound to > NULL in a prepared statement. At higher insert rates managing all these > tombstones becomes an unnecessary overhead. This limits the usefulness of the > prepared statements since developers have to either create multiple prepared > statements (each with a different combination of column names, which at times > is just unfeasible because of the sheer number of possible combinations) or > fall back to using regular (non-prepared) statements. > This JIRA is here to explore the possibility of either: > A. Have a flag on prepared statements that once set, tells Cassandra to > ignore null columns > or > B. Have an "UNSET" value which makes Cassandra skip the null columns and not > tombstone them > Basically, in the context of a prepared statement, a null value means delete, > but we don’t have anything that means "ignore" (besides creating a new > prepared statement without the ignored column). > Please refer to the original conversation on DataStax Java Driver mailing > list for more background: > https://groups.google.com/a/lists.datastax.com/d/topic/java-driver-user/cHE3OOSIXBU/discussion > *EDIT 18/12/14 - [~odpeer] Implementation Notes:* > The motivation hasn't changed. > Protocol version 4 specifies that bind variables do not require having a > value when executing a statement. 
Bind variables without a value are called > 'unset'. The 'unset' bind variable is serialized as the int value '-2' > without following bytes. > \\ > \\ > * An unset bind variable in an EXECUTE or BATCH request > ** On a {{value}} does not modify the value and does not create a tombstone > ** On the {{ttl}} clause is treated as 'unlimited' > ** On the {{timestamp}} clause is treated as 'now' > ** On a map key or a list index throws {{InvalidRequestException}} > ** On a {{counter}} increment or decrement operation does not change the > counter value, e.g. {{UPDATE my_tab SET c = c - ? WHERE k = 1}} does not change > the value of counter {{c}} > ** On a tuple field or UDT field throws {{InvalidRequestException}} > * An unset bind variable in a QUERY request > ** On a partition column, clustering column or index column in the {{WHERE}} > clause throws {{InvalidRequestException}} > ** On the {{limit}} clause is treated as 'unlimited' -- This message was sent by Atlassian JIRA (v6.3.4#6332)
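The framing in the implementation notes (a [value] is a signed 32-bit length followed by that many bytes, with -1 encoding NULL and -2 encoding 'unset', neither followed by bytes) can be sketched as a hypothetical helper:

```java
import java.nio.ByteBuffer;

// Sketch of v4 [value] framing per the notes above (not driver code):
// length n, then n bytes; n == -1 means NULL, n == -2 means unset.
public class UnsetValueSketch
{
    static final int NULL_LENGTH = -1;
    static final int UNSET_LENGTH = -2;

    static byte[] writeValue(byte[] value, boolean unset)
    {
        if (unset)
            return ByteBuffer.allocate(4).putInt(UNSET_LENGTH).array(); // no body
        if (value == null)
            return ByteBuffer.allocate(4).putInt(NULL_LENGTH).array();  // no body
        return ByteBuffer.allocate(4 + value.length)
                         .putInt(value.length)
                         .put(value)
                         .array();
    }
}
```

This is why a v3 server cannot accept unset values: it has no interpretation for a -2 length, so the capability is gated on protocol v4.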
[jira] [Commented] (CASSANDRA-8696) nodetool repair on cassandra 2.1.2 keyspaces return java.lang.RuntimeException: Could not create snapshot
[ https://issues.apache.org/jira/browse/CASSANDRA-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385668#comment-14385668 ] Ran Rubinstein commented on CASSANDRA-8696: --- Sorry, we went back to 2.0 > nodetool repair on cassandra 2.1.2 keyspaces return > java.lang.RuntimeException: Could not create snapshot > - > > Key: CASSANDRA-8696 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8696 > Project: Cassandra > Issue Type: Bug >Reporter: Jeff Liu > Fix For: 2.1.4 > > > When trying to run nodetool repair -pr on cassandra node ( 2.1.2), cassandra > throw java exceptions: cannot create snapshot. > the error log from system.log: > {noformat} > INFO [STREAM-IN-/10.97.9.110] 2015-01-28 02:07:28,815 > StreamResultFuture.java:166 - [Stream #692c1450-a692-11e4-9973-070e938df227 > ID#0] Prepare completed. Receiving 2 files(221187 bytes), sending 5 > files(632105 bytes) > INFO [STREAM-IN-/10.97.9.110] 2015-01-28 02:07:29,046 > StreamResultFuture.java:180 - [Stream #692c1450-a692-11e4-9973-070e938df227] > Session with /10.97.9.110 is complete > INFO [STREAM-IN-/10.97.9.110] 2015-01-28 02:07:29,046 > StreamResultFuture.java:212 - [Stream #692c1450-a692-11e4-9973-070e938df227] > All sessions completed > INFO [STREAM-IN-/10.97.9.110] 2015-01-28 02:07:29,047 > StreamingRepairTask.java:96 - [repair #685e3d00-a692-11e4-9973-070e938df227] > streaming task succeed, returning response to /10.98.194.68 > INFO [RepairJobTask:1] 2015-01-28 02:07:29,065 StreamResultFuture.java:86 - > [Stream #692c6270-a692-11e4-9973-070e938df227] Executing streaming plan for > Repair > INFO [StreamConnectionEstablisher:4] 2015-01-28 02:07:29,065 > StreamSession.java:213 - [Stream #692c6270-a692-11e4-9973-070e938df227] > Starting streaming to /10.66.187.201 > INFO [StreamConnectionEstablisher:4] 2015-01-28 02:07:29,070 > StreamCoordinator.java:209 - [Stream #692c6270-a692-11e4-9973-070e938df227, > ID#0] Beginning stream session with /10.66.187.201 > INFO 
[STREAM-IN-/10.66.187.201] 2015-01-28 02:07:29,465 > StreamResultFuture.java:166 - [Stream #692c6270-a692-11e4-9973-070e938df227 > ID#0] Prepare completed. Receiving 5 files(627994 bytes), sending 5 > files(632105 bytes) > INFO [StreamReceiveTask:22] 2015-01-28 02:07:31,971 > StreamResultFuture.java:180 - [Stream #692c6270-a692-11e4-9973-070e938df227] > Session with /10.66.187.201 is complete > INFO [StreamReceiveTask:22] 2015-01-28 02:07:31,972 > StreamResultFuture.java:212 - [Stream #692c6270-a692-11e4-9973-070e938df227] > All sessions completed > INFO [StreamReceiveTask:22] 2015-01-28 02:07:31,972 > StreamingRepairTask.java:96 - [repair #685e3d00-a692-11e4-9973-070e938df227] > streaming task succeed, returning response to /10.98.194.68 > ERROR [RepairJobTask:1] 2015-01-28 02:07:39,444 RepairJob.java:127 - Error > occurred during snapshot phase > java.lang.RuntimeException: Could not create snapshot at /10.97.9.110 > at > org.apache.cassandra.repair.SnapshotTask$SnapshotCallback.onFailure(SnapshotTask.java:77) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.net.MessagingService$5$1.run(MessagingService.java:347) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > ~[na:1.7.0_45] > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > ~[na:1.7.0_45] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > [na:1.7.0_45] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > [na:1.7.0_45] > at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45] > INFO [AntiEntropySessions:6] 2015-01-28 02:07:39,445 RepairSession.java:260 > - [repair #6f85e740-a692-11e4-9973-070e938df227] new session: will sync > /10.98.194.68, /10.66.187.201, /10.226.218.135 on range > (12817179804668051873746972069086 > 2638799,12863540308359254031520865977436165] for events.[bigint0text, > bigint0boolean, bigint0int, dataset_catalog, 
column_categories, > bigint0double, bigint0bigint] > ERROR [AntiEntropySessions:5] 2015-01-28 02:07:39,445 RepairSession.java:303 > - [repair #685e3d00-a692-11e4-9973-070e938df227] session completed with the > following error > java.io.IOException: Failed during snapshot creation. > at > org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:344) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at > org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:128) > ~[apache-cassandra-2.1.2.jar:2.1.2] > at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172) > ~[guava-16.0.jar:na] > at
[jira] [Commented] (CASSANDRA-7807) Push notification when tracing completes for an operation
[ https://issues.apache.org/jira/browse/CASSANDRA-7807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385655#comment-14385655 ] Stefania commented on CASSANDRA-7807: - {{ServerConnection.java}}:
{code}
public boolean isRegistered(Event.Type traceComplete)
{
    return ((Server.ConnectionTracker)getTracker()).isRegistered(Event.Type.TRACE_COMPLETE, channel());
}
{code}
Surely you must have meant:
{code}
public boolean isRegistered(Event.Type eventType)
{
    return ((Server.ConnectionTracker)getTracker()).isRegistered(eventType, channel());
}
{code}
I would have added {{isRegistered()}} to the {{Tracker}} interface instead of a new method in {{Connection}}, but that's up to you. If you do keep {{isRegistered()}} in {{Connection}}, make sure to change the parameter name from {{traceComplete}} to {{eventType}} in {{Connection.isRegistered()}}.
\\ {{debug-cql}}: You forgot to uncomment lines 47-50.
\\ {{SimpleClient.java}}:
bq. Regarding TransportException - unfortunately it’s an interface - not an (unchecked) exception class.
{code}
 if (msg instanceof ErrorMessage)
-    throw new RuntimeException((Throwable)((ErrorMessage)msg).error);
+    throw (RuntimeException)((ErrorMessage)msg).error;
{code}
Then how can we be sure that {{msg.error}} is always going to be a {{RuntimeException}}?
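To illustrate the concern above: if {{msg.error}} ever holds a checked {{Throwable}}, the blind cast fails with a {{ClassCastException}} instead of surfacing the original error. A minimal defensive sketch (hypothetical helper, not code from the patch; the class and method names here are made up for illustration) would pass unchecked exceptions through and wrap everything else:

```java
// Hypothetical sketch, not part of the patch: rethrow an arbitrary Throwable
// as a RuntimeException without risking a ClassCastException.
public class RethrowSketch
{
    // Pass RuntimeExceptions through unchanged; wrap anything else.
    public static RuntimeException unchecked(Throwable error)
    {
        if (error instanceof RuntimeException)
            return (RuntimeException) error;
        return new RuntimeException(error);
    }

    public static void main(String[] args)
    {
        RuntimeException direct = unchecked(new IllegalStateException("boom"));
        RuntimeException wrapped = unchecked(new java.io.IOException("disk"));

        // The unchecked exception comes back untouched...
        System.out.println(direct instanceof IllegalStateException);           // true
        // ...while the checked one is wrapped, keeping it as the cause.
        System.out.println(wrapped.getCause() instanceof java.io.IOException); // true
    }
}
```

The same guard could be applied at the {{SimpleClient}} call site, so the caller still gets the original exception (directly or as the cause) either way.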
\\ {{TraceCompleteTest.java}}: The assertion at line 68 occasionally fails in ant, perhaps we need to increase the timeout a bit:
{code}
Event event = eventHandlerA.queue.poll(100, TimeUnit.MILLISECONDS);
Assert.assertNotNull(event);
{code}
{code}
[junit] Testcase: testTraceComplete(org.apache.cassandra.tracing.TraceCompleteTest): FAILED
[junit]
[junit] junit.framework.AssertionFailedError:
[junit] 	at org.apache.cassandra.tracing.TraceCompleteTest.testTraceComplete(TraceCompleteTest.java:68)
[junit]
[junit]
[junit] Test org.apache.cassandra.tracing.TraceCompleteTest FAILED
{code}
\\ Also, I could not convince myself how probabilistic tracing was not sending the event, so I increased the timeout to 2 seconds in {{testTraceCompleteWithProbability()}} and the test then fails:
{code}
stefania@mia:~/git/cstar/cassandra$ git diff
diff --git a/test/unit/org/apache/cassandra/tracing/TraceCompleteTest.java b/test/unit/org/apache/cassandra/tracing/TraceComp
index b64c852..4784ca2 100644
--- a/test/unit/org/apache/cassandra/tracing/TraceCompleteTest.java
+++ b/test/unit/org/apache/cassandra/tracing/TraceCompleteTest.java
@@ -185,10 +185,10 @@ public class TraceCompleteTest extends CQLTester
             QueryMessage query = new QueryMessage("SELECT * FROM " + KEYSPACE + '.' + currentTable(), QueryOptions.DEFAU
             clientA.execute(query);
-            Event event = eventHandlerA.queue.poll(100, TimeUnit.MILLISECONDS);
+            Event event = eventHandlerA.queue.poll(2000, TimeUnit.MILLISECONDS);
             Assert.assertNull(event);
-            Assert.assertNull(eventHandlerB.queue.poll(100, TimeUnit.MILLISECONDS));
+            Assert.assertNull(eventHandlerB.queue.poll(2000, TimeUnit.MILLISECONDS));
         }
         finally
{code}
{code}
[junit] - ---
[junit] Testcase: testTraceCompleteWithProbability(org.apache.cassandra.tracing.TraceCompleteTest): FAILED
[junit]
[junit] junit.framework.AssertionFailedError:
[junit] 	at org.apache.cassandra.tracing.TraceCompleteTest.testTraceCompleteWithProbability(TraceCompleteTest.java:189)
[junit]
[junit]
[junit] Test org.apache.cassandra.tracing.TraceCompleteTest FAILED
{code}
Apologies for not picking this up in the previous round; please double-check the probabilistic tracing flow.
> Push notification when tracing completes for an operation
> -
>
>                 Key: CASSANDRA-7807
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7807
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Core
>            Reporter: Tyler Hobbs
>            Assignee: Robert Stupp
>            Priority: Minor
>              Labels: client-impacting, protocolv4
>             Fix For: 3.0
>
>         Attachments: 7807-v2.txt, 7807-v3.txt, 7807.txt
>
>
> Tracing is an asynchronous operation, and drivers currently poll to determine when the trace is complete (in a loop with sleeps). Instead, the server could push a notification to the driver when the trace completes.
> I'm guessing that most of the work for this will be around pushing notifications to a single connection instead of all connections that have registered listeners for a particular event type.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)