[jira] [Commented] (CASSANDRA-6434) Repair-aware gc grace period

2015-03-29 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386284#comment-14386284
 ] 

Marcus Eriksson commented on CASSANDRA-6434:


[~kohlisankalp] no updates; I'll do some more research into what we can actually 
do here

> Repair-aware gc grace period 
> -
>
> Key: CASSANDRA-6434
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6434
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: sankalp kohli
>Assignee: Marcus Eriksson
> Fix For: 3.0
>
>
> Since the reason for gcgs is to ensure that we don't purge tombstones until 
> every replica has been notified, it's redundant in a world where we're 
> tracking repair times per sstable (and repairing frequently), i.e., a world 
> where we default to incremental repair a la CASSANDRA-5351.
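The intuition above can be sketched as follows. This is an illustrative model with made-up names, not Cassandra's actual purge logic: with per-sstable repair times, a tombstone is provably safe to purge as soon as a repair newer than the deletion has completed, instead of waiting out a fixed gc_grace_seconds.

```java
// Hypothetical sketch only; names are illustrative, not Cassandra's API.
public class RepairAwarePurge {
    /** Classic rule: purge only once the tombstone is older than gc_grace_seconds. */
    static boolean purgeableByGcGrace(long deletionTimeSec, long nowSec, long gcGraceSec) {
        return deletionTimeSec + gcGraceSec < nowSec;
    }

    /** Repair-aware rule: a repair completed after the deletion means every replica saw it. */
    static boolean purgeableByRepair(long deletionTimeSec, long lastRepairTimeSec) {
        return lastRepairTimeSec > deletionTimeSec;
    }

    public static void main(String[] args) {
        long deletedAt = 1_000L, repairedAt = 2_000L, now = 3_000L;
        // Repaired after the deletion: purgeable immediately, no 10-day wait.
        System.out.println(purgeableByRepair(deletedAt, repairedAt));     // true
        // The classic rule would still hold the tombstone for gcgs seconds.
        System.out.println(purgeableByGcGrace(deletedAt, now, 864_000L)); // false
    }
}
```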



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8238) NPE in SizeTieredCompactionStrategy.filterColdSSTables

2015-03-29 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386279#comment-14386279
 ] 

Marcus Eriksson commented on CASSANDRA-8238:


dunno, my thinking was that if anyone has enabled this in 2.0, they probably 
know what they are doing, and removing this would change behavior a bit too much 
in 2.0

> NPE in SizeTieredCompactionStrategy.filterColdSSTables
> --
>
> Key: CASSANDRA-8238
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8238
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Tyler Hobbs
>Assignee: Marcus Eriksson
> Fix For: 2.1.4
>
> Attachments: 0001-assert-that-readMeter-is-not-null.patch, 
> 0001-dont-always-set-client-mode-for-sstable-loader.patch
>
>
> {noformat}
> ERROR [CompactionExecutor:15] 2014-10-31 15:28:32,318 CassandraDaemon.java:153 - Exception in thread Thread[CompactionExecutor:15,1,main]
> java.lang.NullPointerException: null
> at org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.filterColdSSTables(SizeTieredCompactionStrategy.java:181) ~[apache-cassandra-2.1.1.jar:2.1.1]
> at org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundSSTables(SizeTieredCompactionStrategy.java:83) ~[apache-cassandra-2.1.1.jar:2.1.1]
> at org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundTask(SizeTieredCompactionStrategy.java:267) ~[apache-cassandra-2.1.1.jar:2.1.1]
> at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:226) ~[apache-cassandra-2.1.1.jar:2.1.1]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_72]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_72]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_72]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_72]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_72]
> {noformat}
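Judging by the attached patch name ("assert-that-readMeter-is-not-null"), the NPE occurs when an sstable's read meter is null, e.g. for sstables written in client mode. A hedged sketch of one way to guard against that, with made-up types rather than the real compaction code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; the names echo the stack trace, not the real code.
public class ColdFilterSketch {
    static class SSTable {
        final Double readsPerSec; // stand-in for readMeter; may be null
        SSTable(Double readsPerSec) { this.readsPerSec = readsPerSec; }
    }

    /** Skip sstables with no read meter instead of dereferencing null. */
    static List<SSTable> filterColdSSTables(List<SSTable> sstables, double coldReadsToOmit) {
        List<SSTable> kept = new ArrayList<>();
        for (SSTable s : sstables) {
            if (s.readsPerSec == null)
                continue; // previously dereferenced unconditionally -> NPE
            if (s.readsPerSec > coldReadsToOmit)
                kept.add(s);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<SSTable> in = Arrays.asList(new SSTable(null), new SSTable(5.0), new SSTable(0.1));
        System.out.println(filterColdSSTables(in, 1.0).size()); // 1
    }
}
```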





[jira] [Comment Edited] (CASSANDRA-6363) CAS not applied on rows containing an expired ttl column

2015-03-29 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386114#comment-14386114
 ] 

Stefania edited comment on CASSANDRA-6363 at 3/30/15 6:38 AM:
--

[~thobbs]:

- tested manually and with dtest below in cassandra-2.0 and cannot reproduce. 
- tested with dtest below in cassandra-2.1 and cannot reproduce
- tested with dtest below in trunk and cannot reproduce

The dtest:

https://github.com/stef1927/cassandra-dtest/commit/eaf56385405db4702d869699e80ad4d00b41cec4

{code}
def delete_with_ttl_expired_test(self):
    """
    Updating a row with a ttl does not prevent deletion, test for CASSANDRA-6363
    """
    self.cursor1.execute("DROP TABLE IF EXISTS session")
    self.cursor1.execute("CREATE TABLE session (id text, usr text, valid int, PRIMARY KEY (id))")

    self.cursor1.execute("insert into session (id, usr) values ('abc', 'abc')")
    self.cursor1.execute("update session using ttl 1 set valid = 1 where id = 'abc'")
    self.smart_sleep(time.time(), 1)

    self.cursor1.execute("delete from session where id = 'abc' if usr = 'abc'")
    assert_row_count(self.cursor1, 'session', 0)
{code}

Please confirm it's OK to close.



> CAS not applied on rows containing an expired ttl column
> 
>
> Key: CASSANDRA-6363
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6363
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Linux/x64 2.0.2 4-node cluster
>Reporter: Michał Ziemski
>Assignee: Stefania
>
> CREATE TABLE session (
>   id text,
>   usr text,
>   valid int,
>   PRIMARY KEY (id)
> );
> insert into session (id, usr) values ('abc', 'abc');
> update session using ttl 1 set valid = 1 where id = 'abc';
> (wait 1 sec)
> And 
> delete from session where id = 'DSYUCTCLSOEKVLAQWNWYLVQMEQGGXD' if usr 
> ='demo';
> Yields:
>  [applied] | usr
> ---+-
>  False | abc
> Rather than applying the delete.
> Executing:
> update session set valid = null where id = 'abc';
> and again
> delete from session where id = 'DSYUCTCLSOEKVLAQWNWYLVQMEQGGXD' if usr 
> ='demo';
> Positively deletes the row.





[jira] [Commented] (CASSANDRA-7976) Changes to index_interval table properties revert after subsequent modifications

2015-03-29 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386275#comment-14386275
 ] 

Stefania commented on CASSANDRA-7976:
-

Verified {{index_interval}} on cassandra-2.0, reproduced the problem, and added 
a unit test.

Verified {{min_index_interval}} and {{max_index_interval}} on trunk and 
cassandra-2.1 but could not reproduce.

The problem in 2.0 is that the index interval is not updated in 
{{CFMetadata.apply()}}. As a consequence, the index interval was never actually 
changed, which is clearly visible in the log file, despite what the cqlsh DESC 
command shows. The initial DESC command shows an updated index interval only 
because the migration manager pushes a schema change that was never correctly 
applied.

When the index interval was split into {{min_index_interval}} and 
{{max_index_interval}} in cassandra-2.1, {{CFMetadata.apply()}} was fixed 
(verified on trunk).

The patch for 2.0 is here: https://github.com/stef1927/cassandra/commits/7976.
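The failure mode described above can be illustrated with a toy model (hypothetical code, not the real {{CFMetadata}}): an apply()/merge method that copies each altered field but misses one silently drops that field whenever any schema change is applied.

```java
// Hypothetical illustration of the bug class; not Cassandra's actual code.
public class ApplySketch {
    static class TableMeta {
        int indexInterval = 128;
        String comment = "";

        /** Merge the result of an ALTER TABLE into the live metadata. */
        void apply(TableMeta updated) {
            this.comment = updated.comment;
            // BUG (pre-fix 2.0 behaviour): indexInterval is never copied,
            // so ALTER TABLE ... WITH index_interval = 256 has no effect.
            // FIX: this.indexInterval = updated.indexInterval;
        }
    }

    public static void main(String[] args) {
        TableMeta live = new TableMeta();
        TableMeta altered = new TableMeta();
        altered.indexInterval = 256;
        live.apply(altered);
        System.out.println(live.indexInterval); // still 128: the change was lost
    }
}
```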


> Changes to index_interval table properties revert after subsequent 
> modifications
> 
>
> Key: CASSANDRA-7976
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7976
> Project: Cassandra
>  Issue Type: Bug
>  Components: Config
> Environment: cqlsh 4.1.1, Cassandra 2.0.9-SNAPSHOT (built w/ `ccm` on 
> Mac OS X 10.9.4 with Java 1.7.0_67 - more detail below)
> $ java -version 
> java version "1.7.0_67"
> Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
> Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
> $ mvn --version 
> Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 
> 2014-08-11T13:58:10-07:00)
> Maven home: /usr/local/Cellar/maven/3.2.3/libexec
> Java version: 1.7.0_67, vendor: Oracle Corporation
> Java home: /Library/Java/JavaVirtualMachines/jdk1.7.0_67.jdk/Contents/Home/jre
> Default locale: en_US, platform encoding: UTF-8
> OS name: "mac os x", version: "10.9.4", arch: "x86_64", family: "mac"
>Reporter: Andrew Lenards
>Assignee: Stefania
>  Labels: cql3, metadata
>
> It appears that if you want to increase the sampling in *-Summary.db files, 
> you would change the default for {{index_interval}} table property from the 
> {{128}} default value to {{256}} on a given CQL {{TABLE}}.
> However, if you {{ALTER TABLE}} after setting the value, {{index_interval}} 
> returns to the default, {{128}}. This is unexpected behavior. I would expect 
> the value for {{index_interval}} to not be affected by subsequent {{ALTER 
> TABLE}} statements.
> As noted in Environment, this was seen with a 2.0.9-SNAPSHOT built w/ `ccm`. 
> If I just use a table from one of DataStax documentation tutorials (musicdb 
> as mdb):
> {noformat}
> cqlsh:mdb> DESC TABLE songs;
> CREATE TABLE songs (
>   id uuid,
>   album text,
>   artist text,
>   data blob,
>   reviews list,
>   tags set,
>   title text,
>   venue map,
>   PRIMARY KEY ((id))
> ) WITH
>   bloom_filter_fp_chance=0.01 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.10 AND
>   gc_grace_seconds=864000 AND
>   index_interval=128 AND
>   read_repair_chance=0.00 AND
>   replicate_on_write='true' AND
>   populate_io_cache_on_flush='false' AND
>   default_time_to_live=0 AND
>   speculative_retry='99.0PERCENTILE' AND
>   memtable_flush_period_in_ms=0 AND
>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>   compression={'sstable_compression': 'LZ4Compressor'};
> {noformat}
> We've got {{128}} as expected.
> We alter it:
> {noformat}
> cqlsh:mdb> ALTER TABLE songs WITH index_interval = 256; 
> {noformat}
> And the change appears: 
> {noformat}
> cqlsh:mdb> DESC TABLE songs;
> CREATE TABLE songs (
>   id uuid,
>   album text,
>   artist text,
>   data blob,
>   reviews list,
>   tags set,
>   title text,
>   venue map,
>   PRIMARY KEY ((id))
> ) WITH
>   bloom_filter_fp_chance=0.01 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.10 AND
>   gc_grace_seconds=864000 AND
>   index_interval=256 AND
>   read_repair_chance=0.00 AND
>   replicate_on_write='true' AND
>   populate_io_cache_on_flush='false' AND
>   default_time_to_live=0 AND
>   speculative_retry='99.0PERCENTILE' AND
>   memtable_flush_period_in_ms=0 AND
>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>   compression={'sstable_compression': 'LZ4Compressor'};
> {noformat}
> But if do another {{ALTER TABLE}}, say, change the caching or comment, the 
> {{index_interval}} will revert back to {{128}}.
> {noformat}
> cqlsh:mdb> ALTER TABLE songs WITH caching = 'none'; 
> cqlsh:mdb> DESC TABLE songs; 
> CREATE TABLE songs (
>   id uuid,
>   album text,
>   artist text,
>   data blob,
>   reviews list,
>   tags set,
>   title text,
>   venue map,
>   P

[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows

2015-03-29 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386271#comment-14386271
 ] 

Marcus Eriksson commented on CASSANDRA-9045:


[~philipthompson] yes, do that; with this patch you can even set it to 0:
http://aep.appspot.com/display/wSaOmJhJ6IGh0NYSe8-gY0sM4Yg/

> Deleted columns are resurrected after repair in wide rows
> -
>
> Key: CASSANDRA-9045
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9045
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Roman Tkachenko
>Assignee: Marcus Eriksson
>Priority: Critical
> Fix For: 2.0.14
>
> Attachments: cqlsh.txt
>
>
> Hey guys,
> After almost a week of researching the issue and trying out multiple things 
> with (almost) no luck, it was suggested (on the user@cass list) that I file a 
> report here.
> h5. Setup
> Cassandra 2.0.13 (we had the issue with 2.0.10 as well and upgraded to see if 
> it goes away)
> Multi datacenter 12+6 nodes cluster.
> h5. Schema
> {code}
> cqlsh> describe keyspace blackbook;
> CREATE KEYSPACE blackbook WITH replication = {
>   'class': 'NetworkTopologyStrategy',
>   'IAD': '3',
>   'ORD': '3'
> };
> USE blackbook;
> CREATE TABLE bounces (
>   domainid text,
>   address text,
>   message text,
>   "timestamp" bigint,
>   PRIMARY KEY (domainid, address)
> ) WITH
>   bloom_filter_fp_chance=0.10 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.10 AND
>   gc_grace_seconds=864000 AND
>   index_interval=128 AND
>   read_repair_chance=0.00 AND
>   populate_io_cache_on_flush='false' AND
>   default_time_to_live=0 AND
>   speculative_retry='99.0PERCENTILE' AND
>   memtable_flush_period_in_ms=0 AND
>   compaction={'class': 'LeveledCompactionStrategy'} AND
>   compression={'sstable_compression': 'LZ4Compressor'};
> {code}
> h5. Use case
> Each row (defined by a domainid) can have many many columns (bounce entries) 
> so rows can get pretty wide. In practice, most of the rows are not that big 
> but some of them contain hundreds of thousands and even millions of columns.
> Columns are not TTL'ed but can be deleted using the following CQL3 statement:
> {code}
> delete from bounces where domainid = 'domain.com' and address = 
> 'al...@example.com';
> {code}
> All queries are performed using LOCAL_QUORUM CL.
> h5. Problem
> We weren't very diligent about running repairs on the cluster initially, but 
> shorty after we started doing it we noticed that some of previously deleted 
> columns (bounce entries) are there again, as if tombstones have disappeared.
> I have run this test multiple times via cqlsh, on the row of the customer who 
> originally reported the issue:
> * delete an entry
> * verify it's not returned even with CL=ALL
> * run repair on nodes that own this row's key
> * the columns reappear and are returned even with CL=ALL
> I tried the same test on another row with much less data and everything was 
> correctly deleted and didn't reappear after repair.
> h5. Other steps I've taken so far
> Made sure NTP is running on all servers and clocks are synchronized.
> Increased gc_grace_seconds to 100 days, ran full repair (on the affected 
> keyspace) on all nodes, then changed it back to the default 10 days again. 
> Didn't help.
> Performed one more test. Updated one of the resurrected columns, then deleted 
> it and ran repair again. This time the updated version of the column 
> reappeared.
> Finally, I noticed these log entries for the row in question:
> {code}
> INFO [ValidationExecutor:77] 2015-03-25 20:27:43,936 
> CompactionController.java (line 192) Compacting large row 
> blackbook/bounces:4ed558feba8a483733001d6a (279067683 bytes) incrementally
> {code}
> Figuring it may be related, I bumped "in_memory_compaction_limit_in_mb" to 
> 512MB so the row fits into it, deleted the entry and ran repair once again. 
> The log entry for this row was gone and the columns didn't reappear.
> We have a lot of rows much larger than 512MB, so we can't increase this 
> parameter forever, if that is the issue.
> Please let me know if you need more information on the case or if I can run 
> more experiments.
> Thanks!
> Roman





[jira] [Updated] (CASSANDRA-9060) Anticompaction hangs on bloom filter bitset serialization

2015-03-29 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-9060:
---
Attachment: 0001-another-tweak-to-9060.patch

> Anticompaction hangs on bloom filter bitset serialization 
> --
>
> Key: CASSANDRA-9060
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9060
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Gustav Munkby
>Assignee: Gustav Munkby
>Priority: Minor
> Fix For: 2.1.4
>
> Attachments: 0001-another-tweak-to-9060.patch, 2.1-9060-simple.patch, 
> trunk-9060.patch
>
>
> I tried running an incremental repair against a 15-node vnode-cluster with 
> roughly 500GB data running on 2.1.3-SNAPSHOT, without performing the 
> suggested migration steps. I manually chose a small range for the repair 
> (using --start/end-token). The actual repair part took almost no time at all, 
> but the anticompactions took a lot of time (not surprisingly).
> Obviously, this might not be the ideal way to run incremental repairs, but I 
> wanted to look into what made the whole process so slow. The results were 
> rather surprising. The majority of the time was spent serializing bloom 
> filters.
> The reason seemed to be two-fold. First, the bloom-filters generated were 
> huge (probably because the original SSTables were large). With a proper 
> migration to incremental repairs, I'm guessing this would not happen. 
> Secondly, however, the bloom filters were being written to the output one 
> byte at a time (with quite a few type-conversions on the way) to transform 
> the little-endian in-memory representation to the big-endian on-disk 
> representation.
> I have implemented a solution where big-endian is used in-memory as well as 
> on-disk, which obviously makes de-/serialization much, much faster. This 
> introduces some slight overhead when checking the bloom filter, but I can't 
> see how that would be problematic. An obvious alternative would be to still 
> perform the serialization/deserialization using a byte array, but perform the 
> byte-order swap there.
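The two serialization paths described above can be sketched as follows (illustrative code, not the actual patch): both produce identical big-endian bytes, but the slow path does a shift and a cast per byte while the bulk path does one word-sized write per long.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Sketch of the cost being described; not the real bloom filter code.
public class BloomSerializeSketch {
    /** Slow path: byte-at-a-time, one shift and cast per byte. */
    static byte[] perByte(long[] words) {
        byte[] out = new byte[words.length * 8];
        int i = 0;
        for (long w : words)
            for (int shift = 56; shift >= 0; shift -= 8)
                out[i++] = (byte) (w >>> shift); // big-endian: MSB first
        return out;
    }

    /** Fast path: one word-sized write per long, same big-endian layout. */
    static byte[] bulk(long[] words) {
        ByteBuffer buf = ByteBuffer.allocate(words.length * 8).order(ByteOrder.BIG_ENDIAN);
        for (long w : words)
            buf.putLong(w);
        return buf.array();
    }

    public static void main(String[] args) {
        long[] bits = { 0x0123456789ABCDEFL, -1L, 0L };
        System.out.println(Arrays.equals(perByte(bits), bulk(bits))); // true
    }
}
```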





[jira] [Reopened] (CASSANDRA-9060) Anticompaction hangs on bloom filter bitset serialization

2015-03-29 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson reopened CASSANDRA-9060:


attaching another small tweak to this

> Anticompaction hangs on bloom filter bitset serialization 
> --
>
> Key: CASSANDRA-9060
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9060
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Gustav Munkby
>Assignee: Gustav Munkby
>Priority: Minor
> Fix For: 2.1.4
>
> Attachments: 2.1-9060-simple.patch, trunk-9060.patch
>
>
> I tried running an incremental repair against a 15-node vnode-cluster with 
> roughly 500GB data running on 2.1.3-SNAPSHOT, without performing the 
> suggested migration steps. I manually chose a small range for the repair 
> (using --start/end-token). The actual repair part took almost no time at all, 
> but the anticompactions took a lot of time (not surprisingly).
> Obviously, this might not be the ideal way to run incremental repairs, but I 
> wanted to look into what made the whole process so slow. The results were 
> rather surprising. The majority of the time was spent serializing bloom 
> filters.
> The reason seemed to be two-fold. First, the bloom-filters generated were 
> huge (probably because the original SSTables were large). With a proper 
> migration to incremental repairs, I'm guessing this would not happen. 
> Secondly, however, the bloom filters were being written to the output one 
> byte at a time (with quite a few type-conversions on the way) to transform 
> the little-endian in-memory representation to the big-endian on-disk 
> representation.
> I have implemented a solution where big-endian is used in-memory as well as 
> on-disk, which obviously makes de-/serialization much, much faster. This 
> introduces some slight overhead when checking the bloom filter, but I can't 
> see how that would be problematic. An obvious alternative would be to still 
> perform the serialization/deserialization using a byte array, but perform the 
> byte-order swap there.





[jira] [Commented] (CASSANDRA-6363) CAS not applied on rows containing an expired ttl column

2015-03-29 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386114#comment-14386114
 ] 

Stefania commented on CASSANDRA-6363:
-

[~thobbs]:

- tested manually and with dtest below in cassandra-2.0 and cannot reproduce. 
- tested with dtest below in cassandra-2.1 and cannot reproduce
- tested with dtest below in trunk and cannot reproduce

The dtest:

https://github.com/stef1927/cassandra-dtest/commit/eaf56385405db4702d869699e80ad4d00b41cec4

{code}
def delete_with_ttl_expired_test(self):
    """
    Updating a row with a ttl does not prevent deletion, test for CASSANDRA-6363
    """
    self.cursor1.execute("DROP TABLE IF EXISTS session")
    self.cursor1.execute("CREATE TABLE session (id text, usr text, valid int, PRIMARY KEY (id))")

    self.cursor1.execute("insert into session (id, usr) values ('abc', 'abc')")
    self.cursor1.execute("update session using ttl 1 set valid = 1 where id = 'abc'")
    self.smart_sleep(time.time(), 1)

    self.cursor1.execute("delete from session where id = 'abc' if usr = 'abc'")
    assert_row_count(self.cursor1, 'session', 0)
{code}

> CAS not applied on rows containing an expired ttl column
> 
>
> Key: CASSANDRA-6363
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6363
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Linux/x64 2.0.2 4-node cluster
>Reporter: Michał Ziemski
>Assignee: Stefania
>
> CREATE TABLE session (
>   id text,
>   usr text,
>   valid int,
>   PRIMARY KEY (id)
> );
> insert into session (id, usr) values ('abc', 'abc');
> update session using ttl 1 set valid = 1 where id = 'abc';
> (wait 1 sec)
> And 
> delete from session where id = 'DSYUCTCLSOEKVLAQWNWYLVQMEQGGXD' if usr 
> ='demo';
> Yields:
>  [applied] | usr
> ---+-
>  False | abc
> Rather than applying the delete.
> Executing:
> update session set valid = null where id = 'abc';
> and again
> delete from session where id = 'DSYUCTCLSOEKVLAQWNWYLVQMEQGGXD' if usr 
> ='demo';
> Positively deletes the row.





[jira] [Commented] (CASSANDRA-9066) BloomFilter serialization is inefficient

2015-03-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386104#comment-14386104
 ] 

Jonathan Ellis commented on CASSANDRA-9066:
---

Thanks, Gustav and Benedict!

> BloomFilter serialization is inefficient
> 
>
> Key: CASSANDRA-9066
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9066
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Benedict
>Assignee: Gustav Munkby
> Fix For: 2.1.4
>
> Attachments: 2.1-9066.patch
>
>
> As pointed out by [~grddev] in CASSANDRA-9060, bloom filter serialization is 
> very slow. In that ticket I proposed that 2.1 use buffered serialization, and 
> 3.0 make the serialization format itself more efficient.





[jira] [Commented] (CASSANDRA-8236) Delay "node up" and "node added" notifications until native protocol server is started

2015-03-29 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386089#comment-14386089
 ] 

Stefania commented on CASSANDRA-8236:
-

Rebased to pick up the fix for CASSANDRA-9034, which was impacting the dtests.

[~brandon.williams] dead state protection commit is ready for review.

> Delay "node up" and "node added" notifications until native protocol server 
> is started
> --
>
> Key: CASSANDRA-8236
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8236
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Tyler Hobbs
>Assignee: Stefania
> Fix For: 3.0
>
> Attachments: 8236.txt
>
>
> As discussed in CASSANDRA-7510, there is still a gap between when a "node up" 
> or "node added" notification may be sent to native protocol clients (in 
> response to a gossip event) and when the native protocol server is ready to 
> serve requests.
> Everything in between the call to {{StorageService.instance.initServer()}} 
> and creation of the native server in {{CassandraDaemon.setup()}} contributes 
> to this delay, but waiting for Gossip to settle introduces the biggest delay.
> We may need to introduce a "STARTING" gossip state for the period inbetween, 
> which is why this is scheduled for 3.0.  If there's a better option, though, 
> it may make sense to put this in 2.1.
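One possible shape of such a delay, as a hedged sketch (hypothetical names, not the actual implementation): buffer "node up"/"node added" events and flush them only once the native protocol server reports ready.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Illustrative model of gating notifications behind server startup.
public class NotificationGate {
    private final Queue<String> pending = new ArrayDeque<>();
    private final List<String> delivered = new ArrayList<>();
    private boolean serverUp = false;

    /** Gossip events arrive here; hold them until the server can serve requests. */
    synchronized void onEvent(String event) {
        if (serverUp)
            delivered.add(event);
        else
            pending.add(event);
    }

    /** Called once the native protocol server is up: flush buffered events. */
    synchronized void onServerStarted() {
        serverUp = true;
        while (!pending.isEmpty())
            delivered.add(pending.poll());
    }

    synchronized List<String> delivered() { return delivered; }

    public static void main(String[] args) {
        NotificationGate gate = new NotificationGate();
        gate.onEvent("node up");             // buffered, nothing sent yet
        gate.onServerStarted();              // now the client can be notified
        System.out.println(gate.delivered()); // [node up]
    }
}
```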





cassandra git commit: simplify

2015-03-29 Thread dbrosius
Repository: cassandra
Updated Branches:
  refs/heads/trunk ce643ff99 -> 04389ad5e


simplify


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/04389ad5
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/04389ad5
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/04389ad5

Branch: refs/heads/trunk
Commit: 04389ad5ef879ae809db925bf46bad60b60fa454
Parents: ce643ff
Author: Dave Brosius 
Authored: Sun Mar 29 21:35:20 2015 -0400
Committer: Dave Brosius 
Committed: Sun Mar 29 21:35:20 2015 -0400

--
 src/java/org/apache/cassandra/db/Directories.java | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/04389ad5/src/java/org/apache/cassandra/db/Directories.java
--
diff --git a/src/java/org/apache/cassandra/db/Directories.java b/src/java/org/apache/cassandra/db/Directories.java
index a5be956..76171f0 100644
--- a/src/java/org/apache/cassandra/db/Directories.java
+++ b/src/java/org/apache/cassandra/db/Directories.java
@@ -193,10 +193,11 @@ public class Directories
 
 this.dataPaths = new File[dataDirectories.length];
 // If upgraded from version less than 2.1, use existing directories
+String oldSSTableRelativePath = join(metadata.ksName, metadata.cfName);
 for (int i = 0; i < dataDirectories.length; ++i)
 {
 // check if old SSTable directory exists
-dataPaths[i] = new File(dataDirectories[i].location, join(metadata.ksName, metadata.cfName));
+dataPaths[i] = new File(dataDirectories[i].location, oldSSTableRelativePath);
 }
boolean olderDirectoryExists = Iterables.any(Arrays.asList(dataPaths), new Predicate()
 {
@@ -208,8 +209,10 @@ public class Directories
 if (!olderDirectoryExists)
 {
 // use 2.1-style path names
+
+String newSSTableRelativePath = join(metadata.ksName, directoryName);
 for (int i = 0; i < dataDirectories.length; ++i)
-dataPaths[i] = new File(dataDirectories[i].location, join(metadata.ksName, directoryName));
+dataPaths[i] = new File(dataDirectories[i].location, newSSTableRelativePath);
 }
 
 for (File dir : dataPaths)



[jira] [Commented] (CASSANDRA-7807) Push notification when tracing completes for an operation

2015-03-29 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386069#comment-14386069
 ] 

Stefania commented on CASSANDRA-7807:
-

{{ServerConnection.java}}:
Great, thank you.

\\
{{debug-cql.java}}:
I imagine it used to work when sourcing the env file and something got broken? 
Are you also getting this exception when sourcing it?
{code}
stefania@mia:~/git/cstar/cassandra/bin$ ./debug-cql 127.0.0.1 9042
CompilerOracle: inline org/apache/cassandra/db/AbstractNativeCell.compareTo 
(Lorg/apache/cassandra/db/composites/Composite;)I
CompilerOracle: inline 
org/apache/cassandra/db/composites/AbstractSimpleCellNameType.compareUnsigned 
(Lorg/apache/cassandra/db/composites/Composite;Lorg/apache/cassandra/db/composites/Composite;)I
CompilerOracle: inline org/apache/cassandra/io/util/Memory.checkBounds (JJ)V
CompilerOracle: inline org/apache/cassandra/io/util/SafeMemory.checkBounds (JJ)V
CompilerOracle: inline 
org/apache/cassandra/utils/AsymmetricOrdering.selectBoundary 
(Lorg/apache/cassandra/utils/AsymmetricOrdering/Op;II)I
CompilerOracle: inline 
org/apache/cassandra/utils/AsymmetricOrdering.strictnessOfLessThan 
(Lorg/apache/cassandra/utils/AsymmetricOrdering/Op;)I
CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compare 
(Ljava/nio/ByteBuffer;[B)I
CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compare 
([BLjava/nio/ByteBuffer;)I
CompilerOracle: inline 
org/apache/cassandra/utils/ByteBufferUtil.compareUnsigned 
(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I
CompilerOracle: inline 
org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo 
(Ljava/lang/Object;JILjava/lang/Object;JI)I
CompilerOracle: inline 
org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo 
(Ljava/lang/Object;JILjava/nio/ByteBuffer;)I
CompilerOracle: inline 
org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo 
(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I
Error: Exception thrown by the agent : java.lang.NullPointerException
{code}
I would open a ticket; there may be more to this exception than we understand 
at the moment. At least I feel that way.

\\
{{SimpleClient.java / TransportException}}:
I see the problem now. Can we then not just leave it alone as it was before:
{code}
if (msg instanceof ErrorMessage)
throw new RuntimeException((Throwable)((ErrorMessage) msg).error);
{code}

and then do this in {{testTraceCompleteVersion3()}}
{code}
catch (RuntimeException e)
{
Assert.assertTrue(e.getCause() instanceof ProtocolException); // that's 
what we want
}
{code}
Or did you have any other reason to change it? I know it's test code but it 
worries me that some day we'll have a {{TransportException}} that is not
a {{RuntimeException}}. However, if you prefer to clean it up in another 
ticket, like CASSANDRA-8809, then I am happy to leave the cast for a limited
amount of time.

\\
Probabilistic tracing:
It looks correct now, but why do we need an extra boolean at all? Was it not 
enough to simply not pass the connection in {{createTracingSession()}}, like so:
{code}
public void createTracingSession(Connection connection)
{
UUID session = this.preparedTracingSession;
if (session == null)
{
Tracing.instance.newSession(); //< no connection here
}
else
{
Tracing.instance.newSession(connection, session);
this.preparedTracingSession = null;
}
}
{code}

> Push notification when tracing completes for an operation
> -
>
> Key: CASSANDRA-7807
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7807
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Tyler Hobbs
>Assignee: Robert Stupp
>Priority: Minor
>  Labels: client-impacting, protocolv4
> Fix For: 3.0
>
> Attachments: 7807-v2.txt, 7807-v3.txt, 7807.txt
>
>
> Tracing is an asynchronous operation, and drivers currently poll to determine 
> when the trace is complete (in a loop with sleeps).  Instead, the server 
> could push a notification to the driver when the trace completes.
> I'm guessing that most of the work for this will be around pushing 
> notifications to a single connection instead of all connections that have 
> registered listeners for a particular event type.





cassandra git commit: simplify

2015-03-29 Thread dbrosius
Repository: cassandra
Updated Branches:
  refs/heads/trunk 7ea642c89 -> ce643ff99


simplify


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ce643ff9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ce643ff9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ce643ff9

Branch: refs/heads/trunk
Commit: ce643ff99f57784577f2067f2ae4c95d3a02ecff
Parents: 7ea642c
Author: Dave Brosius 
Authored: Sun Mar 29 20:46:04 2015 -0400
Committer: Dave Brosius 
Committed: Sun Mar 29 20:46:04 2015 -0400

--
 src/java/org/apache/cassandra/tools/NodeTool.java | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/ce643ff9/src/java/org/apache/cassandra/tools/NodeTool.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeTool.java 
b/src/java/org/apache/cassandra/tools/NodeTool.java
index 9c804c0..4ef2469 100644
--- a/src/java/org/apache/cassandra/tools/NodeTool.java
+++ b/src/java/org/apache/cassandra/tools/NodeTool.java
@@ -2133,12 +2133,11 @@ public class NodeTool
         @Option(title = "resolve_ip", name = {"-r", "--resolve-ip"}, description = "Show node domain names instead of IPs")
         private boolean resolveIp = false;
 
-        private boolean hasEffectiveOwns = false;
         private boolean isTokenPerNode = true;
         private int maxAddressLength = 0;
         private String format = null;
         private Collection<String> joiningNodes, leavingNodes, movingNodes, liveNodes, unreachableNodes;
-        private Map<String, String> loadMap, hostIDMap, tokensToEndpoints;
+        private Map<String, String> loadMap, hostIDMap;
         private EndpointSnitchInfoMBean epSnitchInfo;
 
         @Override
@@ -2148,7 +2147,7 @@ public class NodeTool
             leavingNodes = probe.getLeavingNodes();
             movingNodes = probe.getMovingNodes();
             loadMap = probe.getLoadMap();
-            tokensToEndpoints = probe.getTokenToEndpointMap();
+            Map<String, String> tokensToEndpoints = probe.getTokenToEndpointMap();
             liveNodes = probe.getLiveNodes();
             unreachableNodes = probe.getUnreachableNodes();
             hostIDMap = probe.getHostIdMap();
@@ -2157,6 +2156,7 @@ public class NodeTool
             StringBuffer errors = new StringBuffer();
 
             Map<InetAddress, Float> ownerships = null;
+            boolean hasEffectiveOwns = false;
             try
             {
                 ownerships = probe.effectiveOwnership(keyspace);



cassandra git commit: remove dead code

2015-03-29 Thread dbrosius
Repository: cassandra
Updated Branches:
  refs/heads/trunk c0fc8d823 -> 7ea642c89


remove dead code


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7ea642c8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7ea642c8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7ea642c8

Branch: refs/heads/trunk
Commit: 7ea642c89c124f0515754ceb8a163370e8d74578
Parents: c0fc8d8
Author: Dave Brosius 
Authored: Sun Mar 29 20:35:24 2015 -0400
Committer: Dave Brosius 
Committed: Sun Mar 29 20:35:24 2015 -0400

--
 src/java/org/apache/cassandra/utils/IntervalTree.java | 9 -
 1 file changed, 9 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/7ea642c8/src/java/org/apache/cassandra/utils/IntervalTree.java
--
diff --git a/src/java/org/apache/cassandra/utils/IntervalTree.java 
b/src/java/org/apache/cassandra/utils/IntervalTree.java
index 0c3c611..4522e27 100644
--- a/src/java/org/apache/cassandra/utils/IntervalTree.java
+++ b/src/java/org/apache/cassandra/utils/IntervalTree.java
@@ -47,19 +47,10 @@ public class IntervalTree<C extends Comparable<? super C>, D, I extends Interval
 
     protected IntervalTree(Collection<I> intervals)
     {
-        final IntervalTree it = this;
         this.head = intervals == null || intervals.isEmpty() ? null : new IntervalNode(intervals);
         this.count = intervals == null ? 0 : intervals.size();
     }
 
-    public static <C extends Comparable<? super C>, D, I extends Interval<C, D>> IntervalTree<C, D, I> build(Collection<I> intervals, Comparator<C> comparator)
-    {
-        if (intervals == null || intervals.isEmpty())
-            return emptyTree();
-
-        return new IntervalTree(intervals);
-    }
-
     public static <C extends Comparable<? super C>, D, I extends Interval<C, D>> IntervalTree<C, D, I> build(Collection<I> intervals)
     {
         if (intervals == null || intervals.isEmpty())



[jira] [Commented] (CASSANDRA-7814) enable describe on indices

2015-03-29 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386024#comment-14386024
 ] 

Stefania commented on CASSANDRA-7814:
-

So sorry [~blerer], I tested the zip file without realizing I still had the 
python driver installed. Please try again.

I changed the python driver {{setup.py}} to append the git hash to the root 
directory as well as to the file name, and then I rebuilt the driver at 5f06ec5. 
[~thobbs] please check commit 2671917 in 
https://github.com/stef1927/python-driver/commits/7814 and let us know if this 
seems reasonable. My doubt is that, in all likelihood, the official python 
driver won't have this commit, unless there is a way to tell python sdist to 
add the git hash only optionally, in which case I can open another python 
ticket to ask them to pick it up. At the moment, when we pick up an official 
release or build on the official master branch, it will still work, just without 
the git hash. The alternative is to manually add the hash to the file name, like 
I did the first time round, and then to enhance cqlsh to remove the hash when it 
calculates the root directory.

> enable describe on indices
> --
>
> Key: CASSANDRA-7814
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7814
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: radha
>Assignee: Stefania
>Priority: Minor
> Fix For: 2.1.4
>
>
> Describe index should be supported, right now, the only way is to export the 
> schema and find what it really is before updating/dropping the index.
> verified in 
> [cqlsh 3.1.8 | Cassandra 1.2.18.1 | CQL spec 3.0.0 | Thrift protocol 19.36.2]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8670) Large columns + NIO memory pooling causes excessive direct memory usage

2015-03-29 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386007#comment-14386007
 ] 

Benedict commented on CASSANDRA-8670:
-

I've pushed some suggestions for further refactoring 
[here|https://github.com/belliottsmith/cassandra/tree/8670-suggestions]. I've 
only looked at the overall class hierarchy, I haven't focused yet on reviewing 
the method implementation changes.

Mostly these changes flatten the class hierarchy; it has gotten deep enough that 
I don't think there's a good reason to maintain the distinction between 
DataStreamOutputPlus and DataStreamOutputPlusAndChannel, especially since we 
often just mock up a Channel based off the OutputStream. I've also flattened 
NIODataOutputStream and DataOutputStreamByteBufferPlus into 
BufferedDataOutputStreamPlus, since we only write to the buffer if we don't 
exceed its size. At the same time, since we are now refactoring this whole 
hierarchy, I made DataOutputBuffer extend BufferedDataOutputStreamPlus, which 
just ensures the buffer grows as necessary, and I removed 
FastByteArrayOutputStream since we no longer need it.

I've also stopped SequentialWriter implementing WritableByteChannel, and now 
pass in its internal Channel, since that's the only way the operations will 
benefit. As a follow-up ticket, we should probably move SequentialWriter to 
using BufferedDataOutputStreamPlus directly, so that it can benefit from 
faster encoding of primitives.

Let me know what you think of the changes to the hierarchy, and once we've 
ironed that out we can move on to the home stretch and confirm the code 
changes. One other thing we could consider is dropping the "Plus" from 
everything except the interface, since it seems superfluous, and it's all 
fairly verbose.
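As a rough illustration of the flattened design described above (a sketch only; the real BufferedDataOutputStreamPlus and DataOutputBuffer have different APIs and far more detail), the two behaviours can be captured by one buffered writer whose overflow policy is overridable: the channel-backed variant spills to the channel on overflow, while the growing variant reallocates a larger buffer:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

// Sketch (not the actual Cassandra classes): a buffered stream that only
// touches the channel when a write would not fit in the buffer.
class BufferedChannelOutput extends OutputStream {
    protected ByteBuffer buffer;
    private final WritableByteChannel channel;

    BufferedChannelOutput(WritableByteChannel channel, int bufferSize) {
        this.channel = channel;
        this.buffer = ByteBuffer.allocate(bufferSize);
    }

    @Override
    public void write(int b) throws IOException {
        if (!buffer.hasRemaining())
            ensureCapacity(1);
        buffer.put((byte) b);
    }

    // Default overflow policy: spill the buffered bytes to the channel.
    protected void ensureCapacity(int needed) throws IOException {
        flush();
    }

    @Override
    public void flush() throws IOException {
        buffer.flip();
        while (buffer.hasRemaining())
            channel.write(buffer);
        buffer.clear();
    }

    // Number of bytes currently buffered (for inspection in this sketch).
    int length() {
        return buffer.position();
    }
}

// Growing variant, analogous in spirit to DataOutputBuffer: on overflow it
// reallocates a larger buffer instead of spilling, so data stays in memory.
class GrowingOutput extends BufferedChannelOutput {
    GrowingOutput(int initialSize) {
        super(null, initialSize); // channel unused; everything stays buffered
    }

    @Override
    protected void ensureCapacity(int needed) {
        ByteBuffer bigger = ByteBuffer.allocate(Math.max(buffer.capacity() * 2, buffer.position() + needed));
        buffer.flip();
        bigger.put(buffer);
        buffer = bigger;
    }
}
```

The single override point is what makes the flattening attractive: the subclass changes only the overflow policy, not the write path.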


> Large columns + NIO memory pooling causes excessive direct memory usage
> ---
>
> Key: CASSANDRA-8670
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8670
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Ariel Weisberg
>Assignee: Ariel Weisberg
> Fix For: 3.0
>
> Attachments: largecolumn_test.py
>
>
> If you provide a large byte array to NIO and ask it to populate the byte 
> array from a socket, it will allocate a thread-local byte buffer that is the 
> size of the requested read, no matter how large it is. Old IO wraps new IO for 
> sockets (but not files), so old IO is affected as well.
> Even If you are using Buffered{Input | Output}Stream you can end up passing a 
> large byte array to NIO. The byte array read method will pass the array to 
> NIO directly if it is larger than the internal buffer.  
> Passing large cells between nodes as part of intra-cluster messaging can 
> cause the NIO pooled buffers to quickly reach a high watermark and stay 
> there. This ends up costing 2x the largest cell size because there is a 
> buffer for input and output since they are different threads. This is further 
> multiplied by the number of nodes in the cluster - 1 since each has a 
> dedicated thread pair with separate thread locals.
> Anecdotally it appears that the cost is doubled beyond that although it isn't 
> clear why. Possibly the control connections or possibly there is some way in 
> which multiple 
> Need a workload in CI that tests the advertised limits of cells on a cluster. 
> It would be reasonable to ratchet down the max direct memory for the test to 
> trigger failures if a memory pooling issue is introduced. I don't think we 
> need to test concurrently pulling in a lot of them, but it should at least 
> work serially.
> The obvious fix to address this issue would be to read in smaller chunks when 
> dealing with large values. I think small should still be relatively large (4 
> megabytes) so that code that is reading from a disk can amortize the cost of 
> a seek. It can be hard to tell what the underlying thing being read from is 
> going to be in some of the contexts where we might choose to implement 
> switching to reading chunks.
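A minimal sketch of the chunked-read idea proposed above (illustrative only; the {{ChunkedReader}} name and the 4 MB constant follow the ticket's suggestion, not the committed fix): by bounding each read, the thread-local direct buffer NIO allocates per call is capped at the chunk size rather than the full value size.

```java
import java.io.DataInputStream;
import java.io.IOException;

// Sketch of reading a large value in bounded chunks, so the NIO thread-local
// direct buffer allocated per read call is at most CHUNK bytes.
class ChunkedReader {
    static final int CHUNK = 4 << 20; // 4 MB, per the ticket's suggestion

    static void readFully(DataInputStream in, byte[] dest) throws IOException {
        int offset = 0;
        while (offset < dest.length) {
            int len = Math.min(CHUNK, dest.length - offset);
            in.readFully(dest, offset, len); // each call sees at most CHUNK bytes
            offset += len;
        }
    }
}
```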



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8481) ghost node in gossip

2015-03-29 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8481:
---
Reproduced In: 2.0.11
Fix Version/s: 2.0.14

> ghost node in gossip
> 
>
> Key: CASSANDRA-8481
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8481
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alexey Larkov
>Priority: Minor
> Fix For: 2.0.14
>
>
> After inaccurately removing nodes from the cluster, nodetool gossipinfo and 
> the JMX attribute 
> org.apache.cassandra.net.FailureDetector.AllEndpointsStates show the node 
> status as LEFT.
> Name  Value   TypeDisplay NameUpdate Interval Description
> /192.168.58.75
>   generation:3
>   heartbeat:0
>   REMOVAL_COORDINATOR:REMOVER,f9a28f8c-3244-42d1-986e-592aafe1406c
>   STATUS:LEFT,-3361705224534889554,141446785
> jmx org.apache.cassandra.net.FailureDetector.DownEndpointCount is 1
> node 58.75 is absent in nodetool status and system.peers table.
> Before node got LEFT status it was in REMOVING state.
> I've done unsafeassassinateendpoint and it's status became LEFT, but 
> DownEndpointCount is still 1.
> And org.apache.cassandra.net.FailureDetector.SimpleStates is still DOWN.
> How to remove this node from gossip?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8481) ghost node in gossip

2015-03-29 Thread Alexey Larkov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385862#comment-14385862
 ] 

Alexey Larkov commented on CASSANDRA-8481:
--

That was the latest Cassandra version at that date, 2.0.12 or 2.0.11 I guess.

> ghost node in gossip
> 
>
> Key: CASSANDRA-8481
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8481
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alexey Larkov
>Priority: Minor
>
> After inaccurately removing nodes from the cluster, nodetool gossipinfo and 
> the JMX attribute 
> org.apache.cassandra.net.FailureDetector.AllEndpointsStates show the node 
> status as LEFT.
> Name  Value   TypeDisplay NameUpdate Interval Description
> /192.168.58.75
>   generation:3
>   heartbeat:0
>   REMOVAL_COORDINATOR:REMOVER,f9a28f8c-3244-42d1-986e-592aafe1406c
>   STATUS:LEFT,-3361705224534889554,141446785
> jmx org.apache.cassandra.net.FailureDetector.DownEndpointCount is 1
> node 58.75 is absent in nodetool status and system.peers table.
> Before node got LEFT status it was in REMOVING state.
> I've done unsafeassassinateendpoint and it's status became LEFT, but 
> DownEndpointCount is still 1.
> And org.apache.cassandra.net.FailureDetector.SimpleStates is still DOWN.
> How to remove this node from gossip?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-9066) BloomFilter serialization is inefficient

2015-03-29 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict resolved CASSANDRA-9066.
-
Resolution: Fixed

> BloomFilter serialization is inefficient
> 
>
> Key: CASSANDRA-9066
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9066
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Benedict
>Assignee: Gustav Munkby
> Fix For: 2.1.4
>
> Attachments: 2.1-9066.patch
>
>
> As pointed out by [~grddev] in CASSANDRA-9060, bloom filter serialization is 
> very slow. In that ticket I proposed that 2.1 use buffered serialization, and 
> 3.0 make the serialization format itself more efficient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-9068) AntiCompaction should calculate a more accurate lower bound on bloom filter size for each target

2015-03-29 Thread Benedict (JIRA)
Benedict created CASSANDRA-9068:
---

 Summary: AntiCompaction should calculate a more accurate lower 
bound on bloom filter size for each target
 Key: CASSANDRA-9068
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9068
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
 Fix For: 3.0


As a follow up to CASSANDRA-9060, the ratio of occupancy for each resultant 
file for an anticompaction group (or single sstable in 2.1) could be estimated 
with a tweaked version of estimatedKeysForRanges(), and a method for inverting 
a range collection.
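The "method for inverting a range collection" might look roughly like the following sketch, assuming sorted, non-overlapping half-open ranges over a linear token span ({{RangeInverter}} and its types are hypothetical; Cassandra's actual Range/Token types and ring wrap-around handling differ):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of inverting a range collection: given sorted,
// non-overlapping [left, right) ranges within [min, max), produce the
// complementary ranges. A long-based token space stands in for real tokens.
class RangeInverter {
    record Range(long left, long right) {}

    static List<Range> invert(List<Range> sorted, long min, long max) {
        List<Range> result = new ArrayList<>();
        long cursor = min;
        for (Range r : sorted) {
            if (r.left > cursor)
                result.add(new Range(cursor, r.left)); // gap before this range
            cursor = Math.max(cursor, r.right);
        }
        if (cursor < max)
            result.add(new Range(cursor, max)); // trailing gap
        return result;
    }
}
```

With the inverted ranges in hand, a tweaked estimatedKeysForRanges() could then be applied to each side to bound the bloom filter sizes separately.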



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9066) BloomFilter serialization is inefficient

2015-03-29 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385825#comment-14385825
 ] 

Benedict commented on CASSANDRA-9066:
-

Committed, and filed CASSANDRA-9067 as a follow up

> BloomFilter serialization is inefficient
> 
>
> Key: CASSANDRA-9066
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9066
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Benedict
>Assignee: Gustav Munkby
> Fix For: 2.1.4
>
> Attachments: 2.1-9066.patch
>
>
> As pointed out by [~grddev] in CASSANDRA-9060, bloom filter serialization is 
> very slow. In that ticket I proposed that 2.1 use buffered serialization, and 
> 3.0 make the serialization format itself more efficient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9066) BloomFilter serialization is inefficient

2015-03-29 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-9066:

Assignee: Gustav Munkby

> BloomFilter serialization is inefficient
> 
>
> Key: CASSANDRA-9066
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9066
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Benedict
>Assignee: Gustav Munkby
> Fix For: 2.1.4
>
> Attachments: 2.1-9066.patch
>
>
> As pointed out by [~grddev] in CASSANDRA-9060, bloom filter serialization is 
> very slow. In that ticket I proposed that 2.1 use buffered serialization, and 
> 3.0 make the serialization format itself more efficient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-9067) BloomFilter serialization format should not change byte ordering

2015-03-29 Thread Benedict (JIRA)
Benedict created CASSANDRA-9067:
---

 Summary: BloomFilter serialization format should not change byte 
ordering
 Key: CASSANDRA-9067
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9067
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Priority: Minor
 Fix For: 3.0


As a follow-up to CASSANDRA-9066 and CASSANDRA-9060, it appears we do some 
unnecessary byte swapping during the serialization of bloom filters, which 
makes the logic slower and harder to follow. We should either perform the swaps 
more efficiently (using Long.reverseBytes) or, preferably, eliminate the 
conversion altogether, since it does not appear to serve any purpose.
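A small demonstration of the Long.reverseBytes option mentioned above, alongside the equivalent effect of declaring a buffer's byte order up front ({{EndianDemo}} is an illustrative standalone class, not Cassandra code):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustration of the two options in the ticket: swapping a word's byte order
// explicitly with Long.reverseBytes, versus letting ByteBuffer's declared
// byte order perform the same conversion implicitly.
class EndianDemo {
    public static void main(String[] args) {
        long word = 0x0102030405060708L;
        long swapped = Long.reverseBytes(word);
        if (swapped != 0x0807060504030201L)
            throw new AssertionError("unexpected swap result");

        // Writing little-endian then reading big-endian yields the same swap:
        ByteBuffer buf = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
        buf.putLong(word);
        buf.flip();
        if (buf.order(ByteOrder.BIG_ENDIAN).getLong() != swapped)
            throw new AssertionError("byte-order mismatch");
        System.out.println("ok");
    }
}
```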



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9060) Anticompaction hangs on bloom filter bitset serialization

2015-03-29 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385821#comment-14385821
 ] 

Benedict commented on CASSANDRA-9060:
-

I've committed your patch to 2.1. 3.0 already behaves approximately 
equivalently to the behaviour introduced by this patch, thanks to its use of 
HLL cardinality estimation, but both could do with estimating a better lower 
bound on the occupancy of each side of the result.

> Anticompaction hangs on bloom filter bitset serialization 
> --
>
> Key: CASSANDRA-9060
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9060
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Gustav Munkby
>Assignee: Marcus Eriksson
>Priority: Minor
> Fix For: 3.0
>
> Attachments: 2.1-9060-simple.patch, trunk-9060.patch
>
>
> I tried running an incremental repair against a 15-node vnode-cluster with 
> roughly 500GB data running on 2.1.3-SNAPSHOT, without performing the 
> suggested migration steps. I manually chose a small range for the repair 
> (using --start/end-token). The actual repair part took almost no time at all, 
> but the anticompactions took a lot of time (not surprisingly).
> Obviously, this might not be the ideal way to run incremental repairs, but I 
> wanted to look into what made the whole process so slow. The results were 
> rather surprising. The majority of the time was spent serializing bloom 
> filters.
> The reason seemed to be twofold. First, the bloom filters generated were 
> huge (probably because the original SSTables were large); with a proper 
> migration to incremental repairs, I'm guessing this would not happen. 
> Secondly, however, the bloom filters were being written to the output one 
> byte at a time (with quite a few type conversions on the way) to transform 
> the little-endian in-memory representation into the big-endian on-disk 
> representation.
> I have implemented a solution where big-endian is used in-memory as well as 
> on-disk, which obviously makes de-/serialization much, much faster. This 
> introduces some slight overhead when checking the bloom filter, but I can't 
> see how that would be problematic. An obvious alternative would be to still 
> perform the serialization/deserialization using a byte array, but perform the 
> byte-order swap there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[2/5] cassandra git commit: Fix anti-compaction target bloom filter size

2015-03-29 Thread benedict
Fix anti-compaction target bloom filter size

patch by Gustav Munkby; reviewed by benedict for CASSANDRA-9060


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b0de3270
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b0de3270
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b0de3270

Branch: refs/heads/cassandra-2.1
Commit: b0de327099c22dd4708b699dfa9e18496abd7429
Parents: 7b1331f
Author: Gustav Munkby 
Authored: Sun Mar 29 16:13:23 2015 +0100
Committer: Benedict Elliott Smith 
Committed: Sun Mar 29 16:19:54 2015 +0100

--
 CHANGES.txt   | 1 +
 .../org/apache/cassandra/db/compaction/CompactionManager.java | 3 +--
 2 files changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b0de3270/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 8854261..c02af99 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.4
+ * Fix anti-compaction target bloom filter size (CASSANDRA-9060)
  * Make FROZEN and TUPLE unreserved keywords in CQL (CASSANDRA-9047)
  * Prevent AssertionError from SizeEstimatesRecorder (CASSANDRA-9034)
  * Avoid overwriting index summaries for sstables with an older format that

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b0de3270/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java 
b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index 992378f..b9c4553 100644
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@ -1050,8 +1050,6 @@ public class CompactionManager implements CompactionManagerMBean
         List<SSTableReader> anticompactedSSTables = new ArrayList<>();
         int repairedKeyCount = 0;
         int unrepairedKeyCount = 0;
-        // TODO(5351): we can do better here:
-        int expectedBloomFilterSize = Math.max(cfs.metadata.getMinIndexInterval(), (int)(SSTableReader.getApproximateKeyCount(repairedSSTables)));
         logger.info("Performing anticompaction on {} sstables", repairedSSTables.size());
         // iterate over sstables to check if the repaired / unrepaired ranges intersect them.
         for (SSTableReader sstable : repairedSSTables)
@@ -1075,6 +1073,7 @@ public class CompactionManager implements CompactionManagerMBean
             try (AbstractCompactionStrategy.ScannerList scanners = cfs.getCompactionStrategy().getScanners(new HashSet<>(Collections.singleton(sstable)));
                  CompactionController controller = new CompactionController(cfs, sstableAsSet, CFMetaData.DEFAULT_GC_GRACE_SECONDS))
             {
+                int expectedBloomFilterSize = Math.max(cfs.metadata.getMinIndexInterval(), (int)sstable.estimatedKeys());
                 repairedSSTableWriter.switchWriter(CompactionManager.createWriter(cfs, destination, expectedBloomFilterSize, repairedAt, sstable));
                 unRepairedSSTableWriter.switchWriter(CompactionManager.createWriter(cfs, destination, expectedBloomFilterSize, ActiveRepairService.UNREPAIRED_SSTABLE, sstable));
 



[4/5] cassandra git commit: Fix anti-compaction target bloom filter size

2015-03-29 Thread benedict
Fix anti-compaction target bloom filter size

patch by Gustav Munkby; reviewed by benedict for CASSANDRA-9060


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b0de3270
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b0de3270
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b0de3270

Branch: refs/heads/trunk
Commit: b0de327099c22dd4708b699dfa9e18496abd7429
Parents: 7b1331f
Author: Gustav Munkby 
Authored: Sun Mar 29 16:13:23 2015 +0100
Committer: Benedict Elliott Smith 
Committed: Sun Mar 29 16:19:54 2015 +0100

--
 CHANGES.txt   | 1 +
 .../org/apache/cassandra/db/compaction/CompactionManager.java | 3 +--
 2 files changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b0de3270/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 8854261..c02af99 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.4
+ * Fix anti-compaction target bloom filter size (CASSANDRA-9060)
  * Make FROZEN and TUPLE unreserved keywords in CQL (CASSANDRA-9047)
  * Prevent AssertionError from SizeEstimatesRecorder (CASSANDRA-9034)
  * Avoid overwriting index summaries for sstables with an older format that

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b0de3270/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java 
b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index 992378f..b9c4553 100644
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@ -1050,8 +1050,6 @@ public class CompactionManager implements CompactionManagerMBean
         List<SSTableReader> anticompactedSSTables = new ArrayList<>();
         int repairedKeyCount = 0;
         int unrepairedKeyCount = 0;
-        // TODO(5351): we can do better here:
-        int expectedBloomFilterSize = Math.max(cfs.metadata.getMinIndexInterval(), (int)(SSTableReader.getApproximateKeyCount(repairedSSTables)));
         logger.info("Performing anticompaction on {} sstables", repairedSSTables.size());
         // iterate over sstables to check if the repaired / unrepaired ranges intersect them.
         for (SSTableReader sstable : repairedSSTables)
@@ -1075,6 +1073,7 @@ public class CompactionManager implements CompactionManagerMBean
             try (AbstractCompactionStrategy.ScannerList scanners = cfs.getCompactionStrategy().getScanners(new HashSet<>(Collections.singleton(sstable)));
                  CompactionController controller = new CompactionController(cfs, sstableAsSet, CFMetaData.DEFAULT_GC_GRACE_SECONDS))
             {
+                int expectedBloomFilterSize = Math.max(cfs.metadata.getMinIndexInterval(), (int)sstable.estimatedKeys());
                 repairedSSTableWriter.switchWriter(CompactionManager.createWriter(cfs, destination, expectedBloomFilterSize, repairedAt, sstable));
                 unRepairedSSTableWriter.switchWriter(CompactionManager.createWriter(cfs, destination, expectedBloomFilterSize, ActiveRepairService.UNREPAIRED_SSTABLE, sstable));
 



[1/5] cassandra git commit: Buffer bloom filter serialization

2015-03-29 Thread benedict
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 7b1331fed -> d3258f615
  refs/heads/trunk 95d5d8b23 -> c0fc8d823


Buffer bloom filter serialization

patch by Gustav Munkby; reviewed by benedict for CASSANDRA-9066


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d3258f61
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d3258f61
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d3258f61

Branch: refs/heads/cassandra-2.1
Commit: d3258f6152eda3be4cba0a021ea34fcb34b7a569
Parents: b0de327
Author: Gustav Munkby 
Authored: Sun Mar 29 16:17:56 2015 +0100
Committer: Benedict Elliott Smith 
Committed: Sun Mar 29 16:19:54 2015 +0100

--
 CHANGES.txt   |  1 +
 .../apache/cassandra/io/sstable/SSTableWriter.java| 14 +++---
 2 files changed, 4 insertions(+), 11 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/d3258f61/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index c02af99..bd5e277 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.4
+ * Buffer bloom filter serialization (CASSANDRA-9066)
  * Fix anti-compaction target bloom filter size (CASSANDRA-9060)
  * Make FROZEN and TUPLE unreserved keywords in CQL (CASSANDRA-9047)
  * Prevent AssertionError from SizeEstimatesRecorder (CASSANDRA-9034)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d3258f61/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
--
diff --git a/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java 
b/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
index 440961f..a39c134 100644
--- a/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
+++ b/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
@@ -17,10 +17,7 @@
  */
 package org.apache.cassandra.io.sstable;
 
-import java.io.DataInput;
-import java.io.File;
-import java.io.FileOutputStream;
-import java.io.IOException;
+import java.io.*;
 import java.nio.ByteBuffer;
 import java.util.Arrays;
 import java.util.Collections;
@@ -55,12 +52,7 @@ import org.apache.cassandra.io.sstable.metadata.MetadataCollector;
 import org.apache.cassandra.io.sstable.metadata.MetadataComponent;
 import org.apache.cassandra.io.sstable.metadata.MetadataType;
 import org.apache.cassandra.io.sstable.metadata.StatsMetadata;
-import org.apache.cassandra.io.util.DataOutputPlus;
-import org.apache.cassandra.io.util.DataOutputStreamAndChannel;
-import org.apache.cassandra.io.util.FileMark;
-import org.apache.cassandra.io.util.FileUtils;
-import org.apache.cassandra.io.util.SegmentedFile;
-import org.apache.cassandra.io.util.SequentialWriter;
+import org.apache.cassandra.io.util.*;
 import org.apache.cassandra.service.StorageService;
 import org.apache.cassandra.utils.ByteBufferUtil;
 import org.apache.cassandra.utils.FBUtilities;
@@ -647,7 +639,7 @@ public class SSTableWriter extends SSTable
     {
         // bloom filter
         FileOutputStream fos = new FileOutputStream(path);
-        DataOutputStreamAndChannel stream = new DataOutputStreamAndChannel(fos);
+        DataOutputStreamPlus stream = new DataOutputStreamPlus(new BufferedOutputStream(fos));
        FilterFactory.serialize(bf, stream);
         stream.flush();
         fos.getFD().sync();



[5/5] cassandra git commit: Merge branch 'cassandra-2.1' into trunk

2015-03-29 Thread benedict
Merge branch 'cassandra-2.1' into trunk

Conflicts:
src/java/org/apache/cassandra/db/compaction/CompactionManager.java
src/java/org/apache/cassandra/io/sstable/format/big/BigTableWriter.java


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c0fc8d82
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c0fc8d82
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c0fc8d82

Branch: refs/heads/trunk
Commit: c0fc8d8236ee2e6e58929a7ab8c74a49fe2a6622
Parents: 95d5d8b d3258f6
Author: Benedict Elliott Smith 
Authored: Sun Mar 29 16:20:53 2015 +0100
Committer: Benedict Elliott Smith 
Committed: Sun Mar 29 16:20:53 2015 +0100

--
 CHANGES.txt   |  2 ++
 .../cassandra/db/compaction/CompactionManager.java|  2 --
 .../io/sstable/format/big/BigTableWriter.java | 14 +++---
 3 files changed, 5 insertions(+), 13 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c0fc8d82/CHANGES.txt
--
diff --cc CHANGES.txt
index 739926e,bd5e277..e66b724
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,83 -1,6 +1,85 @@@
 +3.0
 + * Compressed Commit Log (CASSANDRA-6809)
 + * Optimise IntervalTree (CASSANDRA-8988)
 + * Add a key-value payload for third party usage (CASSANDRA-8553)
 + * Bump metrics-reporter-config dependency for metrics 3.0 (CASSANDRA-8149)
 + * Partition intra-cluster message streams by size, not type (CASSANDRA-8789)
 + * Add WriteFailureException to native protocol, notify coordinator of
 +   write failures (CASSANDRA-8592)
 + * Convert SequentialWriter to nio (CASSANDRA-8709)
 + * Add role based access control (CASSANDRA-7653, 8650, 7216, 8760, 8849, 
8761, 8850)
 + * Record client ip address in tracing sessions (CASSANDRA-8162)
 + * Indicate partition key columns in response metadata for prepared
 +   statements (CASSANDRA-7660)
 + * Merge UUIDType and TimeUUIDType parse logic (CASSANDRA-8759)
 + * Avoid memory allocation when searching index summary (CASSANDRA-8793)
 + * Optimise (Time)?UUIDType Comparisons (CASSANDRA-8730)
 + * Make CRC32Ex into a separate maven dependency (CASSANDRA-8836)
 + * Use preloaded jemalloc w/ Unsafe (CASSANDRA-8714)
 + * Avoid accessing partitioner through StorageProxy (CASSANDRA-8244, 8268)
 + * Upgrade Metrics library and remove depricated metrics (CASSANDRA-5657)
 + * Serializing Row cache alternative, fully off heap (CASSANDRA-7438)
 + * Duplicate rows returned when in clause has repeated values (CASSANDRA-6707)
 + * Make CassandraException unchecked, extend RuntimeException (CASSANDRA-8560)
 + * Support direct buffer decompression for reads (CASSANDRA-8464)
 + * DirectByteBuffer compatible LZ4 methods (CASSANDRA-7039)
 + * Group sstables for anticompaction correctly (CASSANDRA-8578)
 + * Add ReadFailureException to native protocol, respond
 +   immediately when replicas encounter errors while handling
 +   a read request (CASSANDRA-7886)
 + * Switch CommitLogSegment from RandomAccessFile to nio (CASSANDRA-8308)
 + * Allow mixing token and partition key restrictions (CASSANDRA-7016)
 + * Support index key/value entries on map collections (CASSANDRA-8473)
 + * Modernize schema tables (CASSANDRA-8261)
 + * Support for user-defined aggregation functions (CASSANDRA-8053)
 + * Fix NPE in SelectStatement with empty IN values (CASSANDRA-8419)
 + * Refactor SelectStatement, return IN results in natural order instead
 +   of IN value list order and ignore duplicate values in partition key IN 
restrictions (CASSANDRA-7981)
 + * Support UDTs, tuples, and collections in user-defined
 +   functions (CASSANDRA-7563)
 + * Fix aggregate fn results on empty selection, result column name,
 +   and cqlsh parsing (CASSANDRA-8229)
 + * Mark sstables as repaired after full repair (CASSANDRA-7586)
 + * Extend Descriptor to include a format value and refactor reader/writer
 +   APIs (CASSANDRA-7443)
 + * Integrate JMH for microbenchmarks (CASSANDRA-8151)
 + * Keep sstable levels when bootstrapping (CASSANDRA-7460)
 + * Add Sigar library and perform basic OS settings check on startup 
(CASSANDRA-7838)
 + * Support for aggregation functions (CASSANDRA-4914)
 + * Remove cassandra-cli (CASSANDRA-7920)
 + * Accept dollar quoted strings in CQL (CASSANDRA-7769)
 + * Make assassinate a first class command (CASSANDRA-7935)
 + * Support IN clause on any partition key column (CASSANDRA-7855)
 + * Support IN clause on any clustering column (CASSANDRA-4762)
 + * Improve compaction logging (CASSANDRA-7818)
 + * Remove YamlFileNetworkTopologySnitch (CASSANDRA-7917)
 + * Do anticompaction in groups (CASSANDRA-6851)
 + * Support user-defined functions (CASSANDRA-7395, 7526, 7562, 7740, 7781, 
7929,
 +   7924, 7812, 8063, 7813, 7708)
 + * Permit c

[3/5] cassandra git commit: Buffer bloom filter serialization

2015-03-29 Thread benedict
Buffer bloom filter serialization

patch by Gustav Munkby; reviewed by benedict for CASSANDRA-9066


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d3258f61
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d3258f61
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d3258f61

Branch: refs/heads/trunk
Commit: d3258f6152eda3be4cba0a021ea34fcb34b7a569
Parents: b0de327
Author: Gustav Munkby 
Authored: Sun Mar 29 16:17:56 2015 +0100
Committer: Benedict Elliott Smith 
Committed: Sun Mar 29 16:19:54 2015 +0100

--
 CHANGES.txt   |  1 +
 .../apache/cassandra/io/sstable/SSTableWriter.java| 14 +++---
 2 files changed, 4 insertions(+), 11 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/d3258f61/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index c02af99..bd5e277 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.4
+ * Buffer bloom filter serialization (CASSANDRA-9066)
  * Fix anti-compaction target bloom filter size (CASSANDRA-9060)
  * Make FROZEN and TUPLE unreserved keywords in CQL (CASSANDRA-9047)
  * Prevent AssertionError from SizeEstimatesRecorder (CASSANDRA-9034)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d3258f61/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
--
diff --git a/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java b/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
index 440961f..a39c134 100644
--- a/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
+++ b/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
@@ -17,10 +17,7 @@
  */
 package org.apache.cassandra.io.sstable;
 
-import java.io.DataInput;
-import java.io.File;
-import java.io.FileOutputStream;
-import java.io.IOException;
+import java.io.*;
 import java.nio.ByteBuffer;
 import java.util.Arrays;
 import java.util.Collections;
@@ -55,12 +52,7 @@ import org.apache.cassandra.io.sstable.metadata.MetadataCollector;
 import org.apache.cassandra.io.sstable.metadata.MetadataComponent;
 import org.apache.cassandra.io.sstable.metadata.MetadataType;
 import org.apache.cassandra.io.sstable.metadata.StatsMetadata;
-import org.apache.cassandra.io.util.DataOutputPlus;
-import org.apache.cassandra.io.util.DataOutputStreamAndChannel;
-import org.apache.cassandra.io.util.FileMark;
-import org.apache.cassandra.io.util.FileUtils;
-import org.apache.cassandra.io.util.SegmentedFile;
-import org.apache.cassandra.io.util.SequentialWriter;
+import org.apache.cassandra.io.util.*;
 import org.apache.cassandra.service.StorageService;
 import org.apache.cassandra.utils.ByteBufferUtil;
 import org.apache.cassandra.utils.FBUtilities;
@@ -647,7 +639,7 @@ public class SSTableWriter extends SSTable
 {
 // bloom filter
 FileOutputStream fos = new FileOutputStream(path);
-DataOutputStreamAndChannel stream = new DataOutputStreamAndChannel(fos);
+DataOutputStreamPlus stream = new DataOutputStreamPlus(new BufferedOutputStream(fos));
 FilterFactory.serialize(bf, stream);
 stream.flush();
 fos.getFD().sync();
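The one-line change above is the whole fix: the filter's words still reach disk big-endian, but through a buffer instead of one small write per call. A self-contained sketch of the same pattern; the `writeBitset`/`readBitset` helpers are illustrative stand-ins, not Cassandra's `FilterFactory` API:

```java
import java.io.*;

public class BufferedSerializationDemo {
    // Write each word of a bitset in big-endian order. With an unbuffered
    // FileOutputStream every writeLong reaches the OS separately; wrapping
    // the stream in BufferedOutputStream batches the small writes.
    static void writeBitset(long[] words, DataOutput out) throws IOException {
        out.writeInt(words.length);
        for (long w : words)
            out.writeLong(w);            // DataOutputStream writes big-endian
    }

    static long[] readBitset(DataInput in) throws IOException {
        long[] words = new long[in.readInt()];
        for (int i = 0; i < words.length; i++)
            words[i] = in.readLong();
        return words;
    }

    public static void main(String[] args) throws IOException {
        long[] bits = { 0x0123456789ABCDEFL, -1L, 42L };
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        // The buffered wrapper is the one-line change the patch makes.
        DataOutputStream out = new DataOutputStream(new BufferedOutputStream(bos));
        writeBitset(bits, out);
        out.flush();                     // flush the buffer before syncing
        long[] back = readBitset(new DataInputStream(
                new ByteArrayInputStream(bos.toByteArray())));
        if (!java.util.Arrays.equals(bits, back)) throw new AssertionError();
    }
}
```

The round-trip check at the end confirms the buffered path produces the same bytes; only the number of underlying write calls changes.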



[jira] [Commented] (CASSANDRA-9060) Anticompaction hangs on bloom filter bitset serialization

2015-03-29 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385804#comment-14385804
 ] 

Benedict commented on CASSANDRA-9060:
-

bq. I think the immediate problem is that they are created to allow room for 
all keys in all anticompacted tables, whereas anticompactions process one table 
at a time

Thanks. You're right, and this is definitely something to fix in 2.1.

In this instance we don't use HLL cardinality estimators, but the index 
summary, which isn't probabilistic. What it is, however, is only accurate to a 
certain granularity. As a first patch your approach reduces the problem to the 
one I initially assumed it was, i.e. a doubling of required space (instead of 
\*N), but with a small amount of TLC the estimatedKeysForRanges() method could 
be modified to give a lower bound for the size of both resultant tables (at the 
moment it can significantly overestimate in some scenarios, but also cannot 
easily estimate the cardinality of the negation of the range - so we would have 
to subtract the overestimation, giving an underestimate which is much worse).

Your patch looks to me to significantly improve the status quo, so I will 
commit it now, and we can address a slightly improved patch, perhaps for 2.1.5.

> Anticompaction hangs on bloom filter bitset serialization 
> --
>
> Key: CASSANDRA-9060
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9060
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Gustav Munkby
>Assignee: Marcus Eriksson
>Priority: Minor
> Fix For: 3.0
>
> Attachments: 2.1-9060-simple.patch, trunk-9060.patch
>
>
> I tried running an incremental repair against a 15-node vnode-cluster with 
> roughly 500GB data running on 2.1.3-SNAPSHOT, without performing the 
> suggested migration steps. I manually chose a small range for the repair 
> (using --start/end-token). The actual repair part took almost no time at all, 
> but the anticompactions took a lot of time (not surprisingly).
> Obviously, this might not be the ideal way to run incremental repairs, but I 
> wanted to look into what made the whole process so slow. The results were 
> rather surprising. The majority of the time was spent serializing bloom 
> filters.
> The reason seemed to be two-fold. First, the bloom-filters generated were 
> huge (probably because the original SSTables were large). With a proper 
> migration to incremental repairs, I'm guessing this would not happen. 
> Secondly, however, the bloom filters were being written to the output one 
> byte at a time (with quite a few type-conversions on the way) to transform 
> the little-endian in-memory representation to the big-endian on-disk 
> representation.
> I have implemented a solution where big-endian is used in-memory as well as 
> on-disk, which obviously makes de-/serialization much, much faster. This 
> introduces some slight overhead when checking the bloom filter, but I can't 
> see how that would be problematic. An obvious alternative would be to still 
> perform the serialization/deserialization using a byte array, but perform the 
> byte-order swap there.
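The byte-order-swap alternative mentioned at the end can be done a word at a time with the JDK's `Long.reverseBytes`, avoiding per-byte shifts and casts; a minimal sketch, not the patch's actual code:

```java
public class ByteOrderDemo {
    // Convert one little-endian in-memory word to the big-endian on-disk
    // representation in a single operation, instead of emitting it a byte
    // at a time with shifts and casts.
    static long toBigEndian(long littleEndianWord) {
        return Long.reverseBytes(littleEndianWord);
    }

    public static void main(String[] args) {
        long word = 0x0102030405060708L;
        long swapped = toBigEndian(word);
        if (swapped != 0x0807060504030201L) throw new AssertionError();
        // Swapping twice restores the original word.
        if (toBigEndian(swapped) != word) throw new AssertionError();
    }
}
```

Done over a `long[]` backing array (or in bulk via a `ByteBuffer` with an explicit byte order), this keeps the in-memory representation unchanged while still writing big-endian to disk.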



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9060) Anticompaction hangs on bloom filter bitset serialization

2015-03-29 Thread Gustav Munkby (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385767#comment-14385767
 ] 

Gustav Munkby commented on CASSANDRA-9060:
--

Regarding the size of the Bloom filters, I think the immediate problem is that 
they are created to allow room for all keys in all anticompacted tables, 
whereas anticompactions process one table at a time. I've added a patch, which 
I believe does exactly that. Given that this change is fairly small, I targeted 
it at 2.1.

As the keys are going to be distributed over the two resulting tables, in the 
ideal world we might want to have much smaller bloom filters on either side 
than what we initially thought. I'm guessing this is a general problem with 
compactions, but the HyperLogLog cardinality estimators should help in the 
normal case.

For the general case of ensuring the Bloom filters are not too large, I can see 
basically two solutions. Either introduce a scanning phase before the actual 
compaction, where the size of the bloom filter(s) are calculated. Or reduce the 
size of the Bloom filter once compaction has completed. The obvious 
implementation of the latter would be to scan through the compacted index, 
possibly gated by a comparison of the index size and the bloom filter size.

I guess scanning through the index could be avoided by making sure that the 
IndexWriter kept track of multiple Bloom-filters of exponentially growing 
sizes. That way, once the index is complete, the most appropriate Bloom-filter 
could be picked and written to disk, discarding the others.
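The last idea - tracking several candidate filters of exponentially growing sizes and keeping only the best fit - might be sketched like this; `ToyFilter` and `writeIndex` are hypothetical stand-ins that model only the sizing decision, not real bloom-filter hashing:

```java
import java.util.*;

public class TieredFilterDemo {
    // Minimal stand-in for a bloom filter sized for `capacity` keys.
    // A real filter would hash each key into a bitset sized from the
    // capacity and a target false-positive rate.
    static final class ToyFilter {
        final long capacity;
        long added;
        ToyFilter(long capacity) { this.capacity = capacity; }
        void add(long key) { added++; }   // would hash into a bitset
    }

    // Keep filters for exponentially growing capacities, feed every key
    // to all of them, then keep the smallest one that can hold the keys
    // actually written and discard the rest.
    static ToyFilter writeIndex(long[] keys, long maxEstimate) {
        List<ToyFilter> tiers = new ArrayList<>();
        for (long cap = 1024; cap / 2 < maxEstimate; cap *= 2)
            tiers.add(new ToyFilter(cap));
        for (long k : keys)
            for (ToyFilter f : tiers)
                f.add(k);
        for (ToyFilter f : tiers)
            if (f.capacity >= keys.length)
                return f;                 // smallest adequate tier
        return tiers.get(tiers.size() - 1);
    }

    public static void main(String[] args) {
        long[] keys = new long[3000];
        // Over-estimate of 1M keys; only 3000 were actually written,
        // so a small tier is picked instead of a huge filter.
        ToyFilter chosen = writeIndex(keys, 1_000_000);
        if (chosen.capacity != 4096) throw new AssertionError();
    }
}
```

The trade-off is paying extra memory and hashing during the write in exchange for never persisting an oversized filter.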

> Anticompaction hangs on bloom filter bitset serialization 
> --
>
> Key: CASSANDRA-9060
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9060
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Gustav Munkby
>Assignee: Marcus Eriksson
>Priority: Minor
> Fix For: 3.0
>
> Attachments: 2.1-9060-simple.patch, trunk-9060.patch
>





[jira] [Updated] (CASSANDRA-9060) Anticompaction hangs on bloom filter bitset serialization

2015-03-29 Thread Gustav Munkby (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gustav Munkby updated CASSANDRA-9060:
-
Attachment: 2.1-9060-simple.patch

> Anticompaction hangs on bloom filter bitset serialization 
> --
>
> Key: CASSANDRA-9060
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9060
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Gustav Munkby
>Assignee: Marcus Eriksson
>Priority: Minor
> Fix For: 3.0
>
> Attachments: 2.1-9060-simple.patch, trunk-9060.patch
>





[jira] [Updated] (CASSANDRA-9066) BloomFilter serialization is inefficient

2015-03-29 Thread Gustav Munkby (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gustav Munkby updated CASSANDRA-9066:
-
Attachment: 2.1-9066.patch

> BloomFilter serialization is inefficient
> 
>
> Key: CASSANDRA-9066
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9066
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Benedict
> Fix For: 2.1.4
>
> Attachments: 2.1-9066.patch
>
>
> As pointed out by [~grddev] in CASSANDRA-9060, bloom filter serialization is 
> very slow. In that ticket I proposed that 2.1 use buffered serialization, and 
> 3.0 make the serialization format itself more efficient.





[jira] [Commented] (CASSANDRA-7282) Faster Memtable map

2015-03-29 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385740#comment-14385740
 ] 

Benedict commented on CASSANDRA-7282:
-

bq. +1 for massive understatement.

Thanks. I spent a day working on just that, so glad it panned out :)

I'm currently leaning towards postponing this ticket until 3.1, since some 
careful consideration is needed to ensure a uniform distribution of hash keys 
within the map, especially without vnodes on large clusters. It's possible we 
could only enable this optimisation on nodes that can predict their 
distribution will be fair. In either case I think it may be helpful to consider 
the ticket in relation to CASSANDRA-7032 and CASSANDRA-6696, by e.g. having a 
separate hash table for each vnode range. Depending on 3.0 release timeline, 
the incorporation of these tickets, and on the progression of my other 
commitments, I may still aim to deliver this in 3.0, but just alerting that at 
the moment my view is this is uncertain and on balance less than likely.

> Faster Memtable map
> ---
>
> Key: CASSANDRA-7282
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7282
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>Assignee: Benedict
>  Labels: performance
> Fix For: 3.0
>
> Attachments: jasobrown-sample-run.txt, profile.yaml, reads.svg, 
> run1.svg, writes.svg
>
>
> Currently we maintain a ConcurrentSkipLastMap of DecoratedKey -> Partition in 
> our memtables. Maintaining this is an O(lg(n)) operation; since the vast 
> majority of users use a hash partitioner, it occurs to me we could maintain a 
> hybrid ordered list / hash map. The list would impose the normal order on the 
> collection, but a hash index would live alongside as part of the same data 
> structure, simply mapping into the list and permitting O(1) lookups and 
> inserts.
> I've chosen to implement this initial version as a linked-list node per item, 
> but we can optimise this in future by storing fatter nodes that permit a 
> cache-line's worth of hashes to be checked at once,  further reducing the 
> constant factor costs for lookups.
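The hybrid structure described above - an ordered list for scans plus a hash index into the same nodes for O(1) point lookups - can be sketched single-threaded as follows; this ignores the concurrency and hash-distribution concerns that make the real ticket hard, and all names are illustrative:

```java
import java.util.*;

public class HybridMapDemo {
    // A doubly linked list keeps entries in token order for range scans,
    // while a HashMap indexes the same nodes for O(1) point lookups.
    static final class Node {
        final long token; final String value; Node prev, next;
        Node(long token, String value) { this.token = token; this.value = value; }
    }

    final Map<Long, Node> index = new HashMap<>();
    Node head;

    void put(long token, String value) {
        Node n = new Node(token, value);
        index.put(token, n);                   // O(1) hash insert
        // Ordered insert is O(n) here; the ticket's trick is making this
        // cheap when keys already arrive roughly in hash order.
        if (head == null || token < head.token) {
            n.next = head;
            if (head != null) head.prev = n;
            head = n;
            return;
        }
        Node cur = head;
        while (cur.next != null && cur.next.token < token) cur = cur.next;
        n.next = cur.next; n.prev = cur;
        if (cur.next != null) cur.next.prev = n;
        cur.next = n;
    }

    String get(long token) {                   // O(1) point lookup
        Node n = index.get(token);
        return n == null ? null : n.value;
    }

    List<Long> orderedTokens() {               // ordered scan via the list
        List<Long> out = new ArrayList<>();
        for (Node n = head; n != null; n = n.next) out.add(n.token);
        return out;
    }

    public static void main(String[] args) {
        HybridMapDemo m = new HybridMapDemo();
        m.put(30, "c"); m.put(10, "a"); m.put(20, "b");
        if (!"b".equals(m.get(20))) throw new AssertionError();
        if (!m.orderedTokens().equals(Arrays.asList(10L, 20L, 30L)))
            throw new AssertionError();
    }
}
```

The skip-list it would replace gives O(lg n) for both operations; this shape trades list-maintenance cost on insert for O(1) lookups, which is why the key-distribution assumption matters.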





[jira] [Updated] (CASSANDRA-9066) BloomFilter serialization is inefficient

2015-03-29 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-9066:

Reviewer: Benedict

> BloomFilter serialization is inefficient
> 
>
> Key: CASSANDRA-9066
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9066
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Benedict
> Fix For: 2.1.4
>
>
> As pointed out by [~grddev] in CASSANDRA-9060, bloom filter serialization is 
> very slow. In that ticket I proposed that 2.1 use buffered serialization, and 
> 3.0 make the serialization format itself more efficient.





[jira] [Commented] (CASSANDRA-9060) Anticompaction hangs on bloom filter bitset serialization

2015-03-29 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385735#comment-14385735
 ] 

Benedict commented on CASSANDRA-9060:
-

I've split the slow serialization problem out into CASSANDRA-9066, since the 
problem of anti-compaction mispredicting the number of rows can at worst halve 
performance, whereas the slow serialization could have an order of magnitude 
impact. [~grddev]: do you want to have a stab at that ticket?

> Anticompaction hangs on bloom filter bitset serialization 
> --
>
> Key: CASSANDRA-9060
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9060
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Gustav Munkby
>Assignee: Marcus Eriksson
>Priority: Minor
> Fix For: 3.0
>
> Attachments: trunk-9060.patch
>





[jira] [Created] (CASSANDRA-9066) BloomFilter serialization is inefficient

2015-03-29 Thread Benedict (JIRA)
Benedict created CASSANDRA-9066:
---

 Summary: BloomFilter serialization is inefficient
 Key: CASSANDRA-9066
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9066
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
 Fix For: 2.1.4


As pointed out by [~grddev] in CASSANDRA-9060, bloom filter serialization is 
very slow. In that ticket I proposed that 2.1 use buffered serialization, and 
3.0 make the serialization format itself more efficient.





[jira] [Commented] (CASSANDRA-8984) Introduce Transactional API for behaviours that can corrupt system state

2015-03-29 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385731#comment-14385731
 ] 

Benedict commented on CASSANDRA-8984:
-

bq.  the more interacting objects and abstractions around behavior we have, the 
more the complexity burden

My model of complexity is: for any set of actions (units of execution) or 
abstractions you want to understand or modify, what is the transitive closure 
of _interactions_ with other actions/abstractions that need to be understood 
and considered in conjunction to ensure correctness. In parallel with this, I 
would suggest that fragility is the portion of this complexity that is 
implicit, or easily missed\*. To go back to your example of 5 classes at 20x 
complexity vs 100 classes at 1x, this can be the difference between additive 
and multiplicative complexity. If all 20 points of complexity in each of the 
five classes can interact with any other point in any of the five classes, then 
the complexity burden is 20^5 = 3.2M, not 100.

\* or if the complexity is too large to fit into your working memory

My point is simply that if a new class reduces the number of interactions that 
need to be considered (i.e. isolation), then complexity is reduced. This is a 
bit of an abstract discussion, but I do love me some meta argumentation.

(In my model of complexity, what I called acclimation is the number of high 
level abstractions a newcomer needs to have a vague understanding of to 
mentally map and model the overall functional unit they're addressing. I think 
this complexity is completely drowned out by the other once real work starts to 
happen. NB: I don't pretend this model of complexity is complete, but I think 
it serves for this discussion)

bq. having to manage some state transitions manually leaks that portion of the 
Transactional abstraction

Leakage at the precise, clearly defined point cuts for interaction (i.e. the 
abstract methods requiring some boilerplate) isn't such a problem for 
complexity (by my definition), but it is _ugly_. I've uploaded an 
alternative approach 
[here|https://github.com/belliottsmith/cassandra/tree/8984-alt] that I do 
prefer, but technically increases the number of classes and doesn't reduce the 
amount of boilerplate, so I initially avoided (as inner classes can be even 
worse for acclimation IME), but does have the advantage of that boilerplate 
being better managed by the compiler and IDE. That is, solving the multiple 
inheritance problem through Java's only other mechanism besides code 
duplication: implementation proxies.

bq. What's your confidence regarding the likelihood of 8690 delivering on that 
safety?

safety or safely? The latter: high; the former: medium (I'm sure we can improve 
it, but doubt we'll get it to the same level).

bq. I haven't had the time to sit down and really consider revisions to this 
design

I'll leave both approaches I've concocted in your court for now, then. If you 
can come up with a third approach, I'm all ears :)

> Introduce Transactional API for behaviours that can corrupt system state
> 
>
> Key: CASSANDRA-8984
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8984
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>Assignee: Benedict
> Fix For: 2.1.4
>
> Attachments: 8984_windows_timeout.txt
>
>
> As a penultimate (and probably final for 2.1, if we agree to introduce it 
> there) round of changes to the internals managing sstable writing, I've 
> introduced a new API called "Transactional" that I hope will make it much 
> easier to write correct behaviour. As things stand we conflate a lot of 
> behaviours into methods like "close" - the recent changes unpicked some of 
> these, but didn't go far enough. My proposal here introduces an interface 
> designed to support four actions (on top of their normal function):
> * prepareToCommit
> * commit
> * abort
> * cleanup
> In normal operation, once we have finished constructing a state change we 
> call prepareToCommit; once all such state changes are prepared, we call 
> commit. If at any point everything fails, abort is called. In _either_ case, 
> cleanup is called at the very last.
> These transactional objects are all AutoCloseable, with the behaviour being 
> to rollback any changes unless commit has completed successfully.
> The changes are actually less invasive than it might sound, since we did 
> recently introduce abort in some places, as well as have commit like methods. 
> This simply formalises the behaviour, and makes it consistent between all 
> objects that interact in this way. Much of the code change is boilerplate, 
> such as moving an object into a try-declaration, although the change is still 
> non-trivial. What it
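A hedged sketch of the four-step lifecycle described in this ticket; the method names mirror the description, but this is an illustration, not the actual Cassandra interface:

```java
public class TransactionalDemo {
    enum State { IN_PROGRESS, PREPARED, COMMITTED, ABORTED }

    // prepareToCommit -> commit on success; abort on failure;
    // cleanup runs in either case, via close().
    static abstract class Transactional implements AutoCloseable {
        State state = State.IN_PROGRESS;

        final void prepareToCommit() { doPrepare(); state = State.PREPARED; }
        final void commit()          { doCommit();  state = State.COMMITTED; }
        final void abort()           { doAbort();   state = State.ABORTED; }

        // AutoCloseable behaviour from the description: roll back any
        // changes unless commit completed successfully, then clean up.
        @Override public final void close() {
            if (state != State.COMMITTED && state != State.ABORTED)
                abort();
            cleanup();
        }

        protected abstract void doPrepare();
        protected abstract void doCommit();
        protected abstract void doAbort();
        protected abstract void cleanup();
    }

    // A writer that only keeps its result if the transaction commits.
    static class DemoWriter extends Transactional {
        final StringBuilder target = new StringBuilder();
        String pending = "";
        void write(String s) { pending += s; }
        protected void doPrepare() { /* e.g. flush and sync to disk */ }
        protected void doCommit()  { target.append(pending); }
        protected void doAbort()   { pending = ""; }
        protected void cleanup()   { /* release buffers, temp files */ }
    }

    public static void main(String[] args) {
        // Happy path: prepare the state change, then commit.
        DemoWriter ok = new DemoWriter();
        try (DemoWriter w = ok) {
            w.write("hello");
            w.prepareToCommit();
            w.commit();
        }
        if (!ok.target.toString().equals("hello")) throw new AssertionError();

        // Failure path: leaving the try block without commit rolls back.
        DemoWriter failed = new DemoWriter();
        try (DemoWriter w = failed) {
            w.write("oops");
            throw new RuntimeException("simulated failure");
        } catch (RuntimeException expected) { }
        if (failed.target.length() != 0) throw new AssertionError();
    }
}
```

The try-with-resources form is what makes the rollback implicit: any exit path that skips `commit()` ends in `abort()` plus `cleanup()`.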

[jira] [Commented] (CASSANDRA-7807) Push notification when tracing completes for an operation

2015-03-29 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385718#comment-14385718
 ] 

Robert Stupp commented on CASSANDRA-7807:
-

{{ServerConnection.java}} 
Done.

{{debug-cql}} 
No, I did not forget. Otherwise {{debug-cql}} would not start if any other 
process (C*) uses port 7199 (i.e. {{debug-cql}} would not start if C* is 
running locally). It's probably better handled in a separate ticket - or we 
decide to remove sourcing of {{cassandra-env.sh}} entirely for {{debug-cql}}. 
Not sure either way.

{{SimpleClient.java}} / {{TransportException}}
Yeah - it's a bit awkward. But TE is only implemented by C* exceptions that 
extend RuntimeException. These are: {{ServerError}}, {{ProtocolException}} and 
{{CassandraException}}. Eventually it makes more sense to let {{ServerError}} 
and {{ProtocolException}} extend {{CassandraException}} and get rid of 
{{TransportException}}. We already have an "exception cleanup ticket" for 3.0 
(CASSANDRA-8809) - maybe it's worth cleaning this up, too.

{{TraceCompleteTest.java}}
Worked in the increased timeout.

Also changed the code to deal better with trace probability (via an explicit 
boolean indicating whether the event should be sent or not). Maybe I counted 
too much on the "correct" result of the test.

Let me know what you think about the TE and debug-cql issues and I'll provide a 
matching patch. The branch is updated.

> Push notification when tracing completes for an operation
> -
>
> Key: CASSANDRA-7807
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7807
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Tyler Hobbs
>Assignee: Robert Stupp
>Priority: Minor
>  Labels: client-impacting, protocolv4
> Fix For: 3.0
>
> Attachments: 7807-v2.txt, 7807-v3.txt, 7807.txt
>
>
> Tracing is an asynchronous operation, and drivers currently poll to determine 
> when the trace is complete (in a loop with sleeps).  Instead, the server 
> could push a notification to the driver when the trace completes.
> I'm guessing that most of the work for this will be around pushing 
> notifications to a single connection instead of all connections that have 
> registered listeners for a particular event type.





[jira] [Updated] (CASSANDRA-7304) Ability to distinguish between NULL and UNSET values in Prepared Statements

2015-03-29 Thread Oded Peer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oded Peer updated CASSANDRA-7304:
-
Attachment: 7304-05.patch

Rebased to trunk

> Ability to distinguish between NULL and UNSET values in Prepared Statements
> ---
>
> Key: CASSANDRA-7304
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7304
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Drew Kutcharian
>Assignee: Oded Peer
>  Labels: cql, protocolv4
> Fix For: 3.0
>
> Attachments: 7304-03.patch, 7304-04.patch, 7304-05.patch, 
> 7304-2.patch, 7304.patch
>
>
> Currently Cassandra inserts tombstones when a value of a column is bound to 
> NULL in a prepared statement. At higher insert rates managing all these 
> tombstones becomes an unnecessary overhead. This limits the usefulness of the 
> prepared statements since developers have to either create multiple prepared 
> statements (each with a different combination of column names, which at times 
> is just unfeasible because of the sheer number of possible combinations) or 
> fall back to using regular (non-prepared) statements.
> This JIRA is here to explore the possibility of either:
> A. Have a flag on prepared statements that once set, tells Cassandra to 
> ignore null columns
> or
> B. Have an "UNSET" value which makes Cassandra skip the null columns and not 
> tombstone them
> Basically, in the context of a prepared statement, a null value means delete, 
> but we don’t have anything that means "ignore" (besides creating a new 
> prepared statement without the ignored column).
> Please refer to the original conversation on DataStax Java Driver mailing 
> list for more background:
> https://groups.google.com/a/lists.datastax.com/d/topic/java-driver-user/cHE3OOSIXBU/discussion
> *EDIT 18/12/14 - [~odpeer] Implementation Notes:*
> The motivation hasn't changed.
> Protocol version 4 specifies that bind variables do not require having a 
> value when executing a statement. Bind variables without a value are called 
> 'unset'. The 'unset' bind variable is serialized as the int value '-2' 
> without following bytes.
> \\
> \\
> * An unset bind variable in an EXECUTE or BATCH request
> ** On a {{value}} does not modify the value and does not create a tombstone
> ** On the {{ttl}} clause is treated as 'unlimited'
> ** On the {{timestamp}} clause is treated as 'now'
> ** On a map key or a list index throws {{InvalidRequestException}}
> ** On a {{counter}} increment or decrement operation does not change the 
> counter value, e.g. {{UPDATE my_tab SET c = c - ? WHERE k = 1}} does not 
> change the value of counter {{c}}
> ** On a tuple field or UDT field throws {{InvalidRequestException}}
> * An unset bind variable in a QUERY request
> ** On a partition column, clustering column or index column in the {{WHERE}} 
> clause throws {{InvalidRequestException}}
> ** On the {{limit}} clause is treated as 'unlimited'
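The wire-level rule above (the int '-2' with no trailing bytes) can be shown with a minimal sketch. This is not Cassandra's actual serializer, only an illustration of the protocol v4 [bytes] convention: a non-negative length prefixes a value, -1 means null (tombstone), -2 means unset.

```java
import java.nio.ByteBuffer;

public class UnsetValueSketch
{
    static final int NULL = -1;   // bound null: writes a tombstone
    static final int UNSET = -2;  // unset: column is left untouched

    // Write a bound value: int length, then that many bytes; null -> -1.
    static void writeValue(ByteBuffer out, byte[] value)
    {
        if (value == null)
        {
            out.putInt(NULL);
            return;
        }
        out.putInt(value.length);
        out.put(value);
    }

    // Write an unset bind variable: just the int -2, no bytes follow.
    static void writeUnset(ByteBuffer out)
    {
        out.putInt(UNSET);
    }

    public static void main(String[] args)
    {
        ByteBuffer buf = ByteBuffer.allocate(64);
        writeValue(buf, new byte[]{ 1, 2, 3 }); // a bound value
        writeValue(buf, null);                  // null -> tombstone
        writeUnset(buf);                        // unset -> no modification
        buf.flip();
        System.out.println(buf.getInt());       // 3 (value length)
        buf.position(buf.position() + 3);       // skip the 3 value bytes
        System.out.println(buf.getInt());       // -1 (null)
        System.out.println(buf.getInt());       // -2 (unset)
    }
}
```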



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8696) nodetool repair on cassandra 2.1.2 keyspaces return java.lang.RuntimeException: Could not create snapshot

2015-03-29 Thread Ran Rubinstein (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385668#comment-14385668
 ] 

Ran Rubinstein commented on CASSANDRA-8696:
---

Sorry, we went back to 2.0

> nodetool repair on cassandra 2.1.2 keyspaces return 
> java.lang.RuntimeException: Could not create snapshot
> -
>
> Key: CASSANDRA-8696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8696
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jeff Liu
> Fix For: 2.1.4
>
>
> When trying to run nodetool repair -pr on cassandra node ( 2.1.2), cassandra 
> throw java exceptions: cannot create snapshot. 
> the error log from system.log:
> {noformat}
> INFO  [STREAM-IN-/10.97.9.110] 2015-01-28 02:07:28,815 
> StreamResultFuture.java:166 - [Stream #692c1450-a692-11e4-9973-070e938df227 
> ID#0] Prepare completed. Receiving 2 files(221187 bytes), sending 5 
> files(632105 bytes)
> INFO  [STREAM-IN-/10.97.9.110] 2015-01-28 02:07:29,046 
> StreamResultFuture.java:180 - [Stream #692c1450-a692-11e4-9973-070e938df227] 
> Session with /10.97.9.110 is complete
> INFO  [STREAM-IN-/10.97.9.110] 2015-01-28 02:07:29,046 
> StreamResultFuture.java:212 - [Stream #692c1450-a692-11e4-9973-070e938df227] 
> All sessions completed
> INFO  [STREAM-IN-/10.97.9.110] 2015-01-28 02:07:29,047 
> StreamingRepairTask.java:96 - [repair #685e3d00-a692-11e4-9973-070e938df227] 
> streaming task succeed, returning response to /10.98.194.68
> INFO  [RepairJobTask:1] 2015-01-28 02:07:29,065 StreamResultFuture.java:86 - 
> [Stream #692c6270-a692-11e4-9973-070e938df227] Executing streaming plan for 
> Repair
> INFO  [StreamConnectionEstablisher:4] 2015-01-28 02:07:29,065 
> StreamSession.java:213 - [Stream #692c6270-a692-11e4-9973-070e938df227] 
> Starting streaming to /10.66.187.201
> INFO  [StreamConnectionEstablisher:4] 2015-01-28 02:07:29,070 
> StreamCoordinator.java:209 - [Stream #692c6270-a692-11e4-9973-070e938df227, 
> ID#0] Beginning stream session with /10.66.187.201
> INFO  [STREAM-IN-/10.66.187.201] 2015-01-28 02:07:29,465 
> StreamResultFuture.java:166 - [Stream #692c6270-a692-11e4-9973-070e938df227 
> ID#0] Prepare completed. Receiving 5 files(627994 bytes), sending 5 
> files(632105 bytes)
> INFO  [StreamReceiveTask:22] 2015-01-28 02:07:31,971 
> StreamResultFuture.java:180 - [Stream #692c6270-a692-11e4-9973-070e938df227] 
> Session with /10.66.187.201 is complete
> INFO  [StreamReceiveTask:22] 2015-01-28 02:07:31,972 
> StreamResultFuture.java:212 - [Stream #692c6270-a692-11e4-9973-070e938df227] 
> All sessions completed
> INFO  [StreamReceiveTask:22] 2015-01-28 02:07:31,972 
> StreamingRepairTask.java:96 - [repair #685e3d00-a692-11e4-9973-070e938df227] 
> streaming task succeed, returning response to /10.98.194.68
> ERROR [RepairJobTask:1] 2015-01-28 02:07:39,444 RepairJob.java:127 - Error 
> occurred during snapshot phase
> java.lang.RuntimeException: Could not create snapshot at /10.97.9.110
> at 
> org.apache.cassandra.repair.SnapshotTask$SnapshotCallback.onFailure(SnapshotTask.java:77)
>  ~[apache-cassandra-2.1.2.jar:2.1.2]
> at 
> org.apache.cassandra.net.MessagingService$5$1.run(MessagingService.java:347) 
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[na:1.7.0_45]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[na:1.7.0_45]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_45]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_45]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> INFO  [AntiEntropySessions:6] 2015-01-28 02:07:39,445 RepairSession.java:260 
> - [repair #6f85e740-a692-11e4-9973-070e938df227] new session: will sync 
> /10.98.194.68, /10.66.187.201, /10.226.218.135 on range 
> (128171798046680518737469720690862638799,12863540308359254031520865977436165] 
> for events.[bigint0text, bigint0boolean, bigint0int, dataset_catalog, 
> column_categories, bigint0double, bigint0bigint]
> ERROR [AntiEntropySessions:5] 2015-01-28 02:07:39,445 RepairSession.java:303 
> - [repair #685e3d00-a692-11e4-9973-070e938df227] session completed with the 
> following error
> java.io.IOException: Failed during snapshot creation.
> at 
> org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:344)
>  ~[apache-cassandra-2.1.2.jar:2.1.2]
> at 
> org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:128) 
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172) 
> ~[guava-16.0.jar:na]
> at 

[jira] [Commented] (CASSANDRA-7807) Push notification when tracing completes for an operation

2015-03-29 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385655#comment-14385655
 ] 

Stefania commented on CASSANDRA-7807:
-

{{ServerConnection.java}}:

{code}
public boolean isRegistered(Event.Type traceComplete)
{
    return ((Server.ConnectionTracker)getTracker()).isRegistered(Event.Type.TRACE_COMPLETE, channel());
}
{code}

Surely you must have meant:

{code}
public boolean isRegistered(Event.Type eventType)
{
    return ((Server.ConnectionTracker)getTracker()).isRegistered(eventType, channel());
}
{code}

I would have added {{isRegistered()}} to the {{Tracker}} interface instead of 
adding a new method to {{Connection}}, but that's up to you. If you do keep 
{{isRegistered()}} in {{Connection}}, make sure to rename the parameter from 
{{traceComplete}} to {{eventType}} in {{Connection.isRegistered()}}.
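For what it's worth, a self-contained sketch of that alternative (stub types stand in for the real {{Event.Type}} and the Netty channel; the signatures are my assumptions, not the actual Cassandra internals):

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Stand-ins for the real org.apache.cassandra.transport types.
enum EventType { TOPOLOGY_CHANGE, STATUS_CHANGE, SCHEMA_CHANGE, TRACE_COMPLETE }
final class Channel { }

// The suggestion: declare isRegistered() on the Tracker interface so that
// callers need no cast down to the concrete ConnectionTracker.
interface Tracker
{
    void register(EventType type, Channel ch);
    boolean isRegistered(EventType type, Channel ch);
}

final class ConnectionTracker implements Tracker
{
    private final Map<EventType, Set<Channel>> registered = new EnumMap<>(EventType.class);

    public void register(EventType type, Channel ch)
    {
        registered.computeIfAbsent(type, t -> new HashSet<>()).add(ch);
    }

    public boolean isRegistered(EventType type, Channel ch)
    {
        return registered.getOrDefault(type, Collections.emptySet()).contains(ch);
    }
}
```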

\\

{{debug-cql}}:
You forgot to uncomment lines 47-50

\\

{{SimpleClient.java}}:
bq. Regarding TransportException - unfortunately it’s an interface - not an 
(unchecked) exception class.
{code}
 if (msg instanceof ErrorMessage)
-    throw new RuntimeException((Throwable)((ErrorMessage)msg).error);
+    throw (RuntimeException)((ErrorMessage)msg).error;
{code}
Then how can we be sure that {{msg.error}} is always going to be a 
{{RuntimeException}}? 
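One way to sidestep that unchecked cast entirely (a sketch, not a claim about the actual error hierarchy in the patch): rethrow a {{RuntimeException}} as-is and wrap anything else, so the cast can never fail at runtime.

```java
public final class Errors
{
    // Rethrow helper: preserves a RuntimeException unchanged, wraps any
    // other Throwable, so the caller never performs an unsafe cast.
    static RuntimeException asUnchecked(Throwable error)
    {
        return (error instanceof RuntimeException)
             ? (RuntimeException) error
             : new RuntimeException(error);
    }
}
```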

\\

{{TraceCompleteTest.java}}:
The assertion at line 68 occasionally fails in ant, perhaps we need to increase 
the timeout a bit:
{code}
Event event = eventHandlerA.queue.poll(100, TimeUnit.MILLISECONDS);
Assert.assertNotNull(event);
{code}

{code}
[junit] Testcase: testTraceComplete(org.apache.cassandra.tracing.TraceCompleteTest): FAILED
[junit] 
[junit] junit.framework.AssertionFailedError: 
[junit]   at org.apache.cassandra.tracing.TraceCompleteTest.testTraceComplete(TraceCompleteTest.java:68)
[junit] 
[junit] 
[junit] Test org.apache.cassandra.tracing.TraceCompleteTest FAILED
{code}
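As a general pattern (not a change to the patch itself): assertions on events that *must* arrive are more robust with a generous poll timeout, since {{poll()}} returns as soon as the event lands and the large bound only matters on slow CI machines; a short fixed wait is only appropriate when asserting that an event does *not* arrive. A minimal sketch:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class EventPolling
{
    // Wait for an event that is expected to arrive. A generous timeout
    // costs nothing on the happy path (poll returns immediately once the
    // event is queued) but keeps the assertion stable under load.
    static <T> T awaitExpected(BlockingQueue<T> queue, long timeoutMs) throws InterruptedException
    {
        T event = queue.poll(timeoutMs, TimeUnit.MILLISECONDS);
        if (event == null)
            throw new AssertionError("expected event did not arrive within " + timeoutMs + "ms");
        return event;
    }

    public static void main(String[] args) throws InterruptedException
    {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.offer("TRACE_COMPLETE");
        // Returns immediately despite the 2-second bound.
        System.out.println(awaitExpected(queue, 2000));
    }
}
```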

\\

Also, I could not convince myself that probabilistic tracing was really not 
sending the event, so I increased the timeout to 2 seconds in 
{{testTraceCompleteWithProbability()}}, and the test then fails:
{code}
stefania@mia:~/git/cstar/cassandra$ git diff
diff --git a/test/unit/org/apache/cassandra/tracing/TraceCompleteTest.java b/test/unit/org/apache/cassandra/tracing/TraceCompleteTest.java
index b64c852..4784ca2 100644
--- a/test/unit/org/apache/cassandra/tracing/TraceCompleteTest.java
+++ b/test/unit/org/apache/cassandra/tracing/TraceCompleteTest.java
@@ -185,10 +185,10 @@ public class TraceCompleteTest extends CQLTester
             QueryMessage query = new QueryMessage("SELECT * FROM " + KEYSPACE + '.' + currentTable(), QueryOptions.DEFAULT);
             clientA.execute(query);
 
-            Event event = eventHandlerA.queue.poll(100, TimeUnit.MILLISECONDS);
+            Event event = eventHandlerA.queue.poll(2000, TimeUnit.MILLISECONDS);
             Assert.assertNull(event);
 
-            Assert.assertNull(eventHandlerB.queue.poll(100, TimeUnit.MILLISECONDS));
+            Assert.assertNull(eventHandlerB.queue.poll(2000, TimeUnit.MILLISECONDS));
         }
         finally
{code}

{code}
[junit] -  ---
[junit] Testcase: testTraceCompleteWithProbability(org.apache.cassandra.tracing.TraceCompleteTest): FAILED
[junit] 
[junit] junit.framework.AssertionFailedError: 
[junit]   at org.apache.cassandra.tracing.TraceCompleteTest.testTraceCompleteWithProbability(TraceCompleteTest.java:189)
[junit] 
[junit] 
[junit] Test org.apache.cassandra.tracing.TraceCompleteTest FAILED
{code}

Apologies for not picking this up in the previous round, please double check 
the probabilistic tracing flow.

> Push notification when tracing completes for an operation
> -
>
> Key: CASSANDRA-7807
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7807
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Tyler Hobbs
>Assignee: Robert Stupp
>Priority: Minor
>  Labels: client-impacting, protocolv4
> Fix For: 3.0
>
> Attachments: 7807-v2.txt, 7807-v3.txt, 7807.txt
>
>
> Tracing is an asynchronous operation, and drivers currently poll to determine 
> when the trace is complete (in a loop with sleeps).  Instead, the server 
> could push a notification to the driver when the trace completes.
> I'm guessing that most of the work for this will be around pushing 
> notifications to a single connection instead of all connections that have 
> registered listeners for a particular event type.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)