[jira] [Created] (CASSANDRA-6623) Null in a cell caused by expired TTL does not work with IF clause (in CQL3)

2014-01-27 Thread Csaba Seres (JIRA)
Csaba Seres created CASSANDRA-6623:
--

 Summary: Null in a cell caused by expired TTL does not work with 
IF clause (in CQL3)
 Key: CASSANDRA-6623
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6623
 Project: Cassandra
  Issue Type: Bug
  Components: Tests
 Environment: One cluster with two nodes on a Linux and a Windows 
system. cqlsh 4.1.0 | Cassandra 2.0.4 | CQL spec 3.1.1 | Thrift protocol 
19.39.0. CQL3 Column Family
Reporter: Csaba Seres
Priority: Minor
 Fix For: 2.0.4


The IF onecell=null clause does not work if onecell got its null value from 
an expired TTL. If onecell is explicitly set to null with an UPDATE, then IF 
onecell=null works fine.
This bug is not present when the table is created with the COMPACT STORAGE 
directive.
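A minimal sketch of the failing sequence (hypothetical table and column names, 
modeled on the description above):

{noformat}
CREATE TABLE t (k text PRIMARY KEY, onecell text);
INSERT INTO t (k, onecell) VALUES ('k1', 'v1');
UPDATE t USING TTL 10 SET onecell = 'v2' WHERE k = 'k1';
-- wait for the TTL to expire; SELECT now shows onecell as null
UPDATE t SET onecell = 'v3' WHERE k = 'k1' IF onecell = null;
-- expected: [applied] True; observed: [applied] False
{noformat}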






[jira] [Created] (CASSANDRA-6624) General protection fault

2014-01-27 Thread Mateusz Gajewski (JIRA)
Mateusz Gajewski created CASSANDRA-6624:
---

 Summary: General protection fault
 Key: CASSANDRA-6624
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6624
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Linux s41083 3.2.0-58-generic #88-Ubuntu SMP Tue Dec 3 
17:37:58 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

java version "1.7.0_51"
Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)
Reporter: Mateusz Gajewski
Priority: Critical
 Fix For: 2.0.3
 Attachments: system.log

Hi,

Yesterday I got a general protection fault in the Cassandra 2.0.3 process 
while stress-testing it.

Jan 26 23:19:43 s41083 kernel: [461545.017756] java[192074] general protection 
ip:7fea558c6ae7 sp:7fe959844bf0 error:0 in libc-2.15.so[7fea5588d000+1b5000]

It just died while doing compaction and restarted a couple of times several 
minutes after that.

System log attached





[jira] [Updated] (CASSANDRA-6624) General protection fault

2014-01-27 Thread Mateusz Gajewski (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mateusz Gajewski updated CASSANDRA-6624:


Attachment: system.log

> General protection fault
> 
>
> Key: CASSANDRA-6624
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6624
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Linux s41083 3.2.0-58-generic #88-Ubuntu SMP Tue Dec 3 
> 17:37:58 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
> java version "1.7.0_51"
> Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)
>Reporter: Mateusz Gajewski
>Priority: Critical
> Fix For: 2.0.3
>
> Attachments: system.log
>
>
> Hi,
> Yesterday I got a general protection fault in the Cassandra 2.0.3 process 
> while stress-testing it.
> Jan 26 23:19:43 s41083 kernel: [461545.017756] java[192074] general 
> protection ip:7fea558c6ae7 sp:7fe959844bf0 error:0 in 
> libc-2.15.so[7fea5588d000+1b5000]
> It just died while doing compaction and restarted a couple of times several 
> minutes after that.
> System log attached





[jira] [Updated] (CASSANDRA-6555) AbstractQueryPager.DiscardFirst is still broken

2014-01-27 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-6555:


Reviewer: Tyler Hobbs

> AbstractQueryPager.DiscardFirst is still broken
> ---
>
> Key: CASSANDRA-6555
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6555
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
> Fix For: 2.0.5
>
> Attachments: 6555.txt
>
>
> See https://datastax-oss.atlassian.net/browse/JAVA-243 for an example failure.
> This is my bad, I messed up while testing the fix for CASSANDRA-6447. 
> Attaching a fix for that. I've (correctly) tested that this fixes the issue but 
> also added a few specific unit tests for discardFirst/discardLast to make 
> sure they work correctly this time.





[jira] [Commented] (CASSANDRA-6561) Static columns in CQL3

2014-01-27 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882768#comment-13882768
 ] 

Sylvain Lebresne commented on CASSANDRA-6561:
-

Concerning CAS, the previous patches covered the ability to serialize 
updates to a given partition (through CASing a static column), but to be a 
full alternative to CASSANDRA-5633 we still need to allow batching with 
conditions. The good news is that we don't need extra syntax for that; we just 
need to allow IF in batches, which is relatively natural (of course, we need to 
limit batches with conditions to a single partition, since we don't support 
cross-partition CAS).

I've pushed a rebased version of the branch above with an additional patch to 
handle batching at https://github.com/pcmanus/cassandra/commits/6561-2. I've 
also updated the dtests, so examples of how it looks are at 
https://github.com/riptano/cassandra-dtest/commit/dee0d8f6bcf4816cc0690b001f875929fd69dfb4.
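For illustration, a sketch of what such a single-partition batch with 
conditions might look like, using the table from the proposal quoted below 
(illustrative syntax only; the linked dtests show the actual form):

{noformat}
BEGIN BATCH
  UPDATE t SET s = 'k0-lock' WHERE k = 'k0' IF s = null;
  INSERT INTO t (k, i, v) VALUES ('k0', 0, 'foo');
  INSERT INTO t (k, i, v) VALUES ('k0', 1, 'bar');
APPLY BATCH;
{noformat}

All statements touch the same partition 'k0', so the batch as a whole applies 
only if the condition on the static column holds.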

> Static columns in CQL3
> --
>
> Key: CASSANDRA-6561
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6561
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Sylvain Lebresne
>
> I'd like to suggest the following idea for adding "static" columns to CQL3.  
> I'll note that the basic idea has been suggested by jhalliday on irc but the 
> rest of the details are mine and I should be blamed for anything stupid in 
> what follows.
> Let me start with a rationale: there are 2 main families of CF that have been 
> historically used in Thrift: static ones and dynamic ones. CQL3 handles both 
> families through the presence or absence of clustering columns. There are 
> however some cases where mixing both behaviors has its use. I like to think 
> of those use cases as 3 broad categories:
> # to denormalize small amounts of not-entirely-static data in otherwise 
> static entities. That is, say, "tags" for a product or "custom properties" in 
> a user profile. This is why we've added CQL3 collections. Importantly, this is 
> the *only* use case for which collections are meant (which doesn't diminish 
> their usefulness imo, and I wouldn't disagree that we've maybe not 
> communicated this very well).
> # to optimize fetching both a static entity and related dynamic ones. Say you 
> have blog posts, and each post has associated comments (chronologically 
> ordered). *And* say that a very common query is "fetch a post and its 50 last 
> comments". In that case, it *might* be beneficial to store a blog post 
> (static entity) in the same underlying CF as its comments for performance 
> reasons, so that "fetch a post and its 50 last comments" is just one slice 
> internally.
> # you want to CAS rows of a dynamic partition based on some partition 
> condition. This is the same use case that CASSANDRA-5633 exists for.
> As said above, 1) is already covered by collections, but 2) and 3) are not 
> (and I strongly believe collections are not the right fit, API-wise, for 
> those).
> Also, note that I don't want to underestimate the usefulness of 2). In most 
> cases, using a separate table for the blog posts and the comments is The 
> Right Solution, and trying to do 2) is premature optimisation. Yet, when used 
> properly, that kind of optimisation can make a difference, so I think having 
> a relatively native solution for it in CQL3 could make sense.
> Regarding 3), though CASSANDRA-5633 would provide one solution for it, I have 
> the feeling that static columns actually are a more natural approach (in 
> terms of API). That's arguably more of a personal opinion/feeling though.
> So long story short, CQL3 lacks a way to mix both some "static" and "dynamic" 
> rows in the same partition of the same CQL3 table, and I think such a tool 
> could have its use.
> The proposal is thus to allow "static" columns. Static columns would only 
> make sense in tables with clustering columns (the "dynamic" ones). A static 
> column value would be static to the partition (all rows of the partition 
> would share the value for such a column). The syntax would just be:
> {noformat}
> CREATE TABLE t (
>   k text,
>   s text static,
>   i int,
>   v text,
>   PRIMARY KEY (k, i)
> )
> {noformat}
> then you'd get:
> {noformat}
> INSERT INTO t(k, s, i, v) VALUES ("k0", "I'm shared",       0, "foo");
> INSERT INTO t(k, s, i, v) VALUES ("k0", "I'm still shared", 1, "bar");
> SELECT * FROM t;
>  k  | s                  | i | v
> ----+--------------------+---+-------
>  k0 | "I'm still shared" | 0 | "foo"
>  k0 | "I'm still shared" | 1 | "bar"
> {noformat}
> There would be a few semantic details to decide on regarding deletions, ttl, 
> etc. but let's see if we agree it's a good idea first before ironing those 
> out.
> One last point is the implementation. Though I do think this idea has merits, 
> it's de

[jira] [Commented] (CASSANDRA-6623) Null in a cell caused by expired TTL does not work with IF clause (in CQL3)

2014-01-27 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882783#comment-13882783
 ] 

Sylvain Lebresne commented on CASSANDRA-6623:
-

Can you provide a simple reproduction test case? (I'm not contesting that there 
is a problem, I just want to make sure exactly what you are running into first.)

> Null in a cell caused by expired TTL does not work with IF clause (in CQL3)
> ---
>
> Key: CASSANDRA-6623
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6623
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tests
> Environment: One cluster with two nodes on a Linux and a Windows 
> system. cqlsh 4.1.0 | Cassandra 2.0.4 | CQL spec 3.1.1 | Thrift protocol 
> 19.39.0. CQL3 Column Family
>Reporter: Csaba Seres
>Priority: Minor
> Fix For: 2.0.4
>
>
> The IF onecell=null clause does not work if onecell got its null value from 
> an expired TTL. If onecell is explicitly set to null with an UPDATE, then IF 
> onecell=null works fine.
> This bug is not present when the table is created with the COMPACT STORAGE 
> directive.





[jira] [Commented] (CASSANDRA-6082) 1.1.12 --> 1.2.x upgrade may result inconsistent ring

2014-01-27 Thread Chris Burroughs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882786#comment-13882786
 ] 

Chris Burroughs commented on CASSANDRA-6082:


Duplicate of which ticket?

> 1.1.12 --> 1.2.x upgrade may result inconsistent ring
> -
>
> Key: CASSANDRA-6082
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6082
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 1.1.12 --> 1.2.9
>Reporter: Chris Burroughs
>Priority: Minor
> Attachments: c-gossipinfo, c-status
>
>
> This happened to me once, and since I don't have any more 1.1.x clusters I 
> won't be testing again.  I hope the attached files are enough for someone to 
> connect the dots.
> I did a rolling restart to upgrade from 1.1.12 --> 1.2.9.  About a week later 
> I discovered that one node was in an inconsistent state in the ring.  It was 
> either:
>  * up
>  * host-id=null
>  * missing
> Depending on which node I ran nodetool status from.  I *think* I just missed 
> this during the upgrade but cannot rule out the possibility that it "just 
> happened for no reason" some time after the upgrade.  It was detected when 
> running repair in such a ring caused all sorts of terrible data "duplication" 
> and performance tanked.  Restarting the seeds + "bad" node caused the ring to 
> be consistent again.
> Two possibly suspicious things are an ArrayIndexOutOfBoundsException on 
> startup:
> {noformat}
> ERROR [GossipStage:1] 2013-09-06 10:45:35,213 CassandraDaemon.java (line 194) 
> Exception in thread Thread[GossipStage:1,5,main]
> java.lang.ArrayIndexOutOfBoundsException: 2
> at 
> org.apache.cassandra.service.StorageService.extractExpireTime(StorageService.java:1660)
> at 
> org.apache.cassandra.service.StorageService.handleStateRemoving(StorageService.java:1607)
> at 
> org.apache.cassandra.service.StorageService.onChange(StorageService.java:1230)
> at 
> org.apache.cassandra.service.StorageService.onJoin(StorageService.java:1958)
> at 
> org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:841)
> at 
> org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:919)
> at 
> org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:50)
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> {noformat}
> and problems with hint delivery to multiple nodes:
> {noformat}
> ERROR [MutationStage:11] 2013-09-06 13:59:19,604 CassandraDaemon.java (line 
> 194) Exception in thread Thread[MutationStage:11,5,main]
> java.lang.AssertionError: Missing host ID for 10.20.2.45
> at 
> org.apache.cassandra.service.StorageProxy.writeHintForMutation(StorageProxy.java:583)
> at 
> org.apache.cassandra.service.StorageProxy$5.runMayThrow(StorageProxy.java:552)
> at 
> org.apache.cassandra.service.StorageProxy$HintRunnable.run(StorageProxy.java:1658)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> {noformat}
> Note however that while there were delivery problems to multiple nodes during 
> the rolling upgrade, only one node was in a funky state a week later.
> Attached are the results of running gossipinfo and status on every node.





[jira] [Updated] (CASSANDRA-5839) Save repair data to system table

2014-01-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-5839:
--

Reviewer: Marcus Eriksson  (was: Jason Brown)

> Save repair data to system table
> 
>
> Key: CASSANDRA-5839
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5839
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core, Tools
>Reporter: Jonathan Ellis
>Assignee: Jimmy Mårdell
>Priority: Minor
> Fix For: 2.0.5
>
> Attachments: 2.0.4-5839-draft.patch
>
>
> As noted in CASSANDRA-2405, it would be useful to store repair results, 
> particularly with sub-range repair available (CASSANDRA-5280).





[jira] [Commented] (CASSANDRA-6591) un-deprecate cache recentHitRate and expose in o.a.c.metrics

2014-01-27 Thread Chris Burroughs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882820#comment-13882820
 ] 

Chris Burroughs commented on CASSANDRA-6591:


Brendan Gregg's Systems Performance book has convinced me that cache misses per 
unit time is a valuable metric by itself.  Will suck it up and add that as well 
(which will incidentally allow 1/5/15, which seemed popular).

> un-deprecate cache recentHitRate and expose in o.a.c.metrics
> 
>
> Key: CASSANDRA-6591
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6591
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Chris Burroughs
>Assignee: Chris Burroughs
>Priority: Minor
> Attachments: j6591-1.2-v1.txt, j6591-1.2-v2.txt
>
>
> recentHitRate metrics were not added as part of CASSANDRA-4009 because there 
> is not an obvious way to do it with the Metrics library.  Instead hitRate was 
> added as an all-time measurement since node restart.
> This does not allow changes in cache hit rate (aka production performance 
> problems) to be detected.  Ideally there would be 1/5/15 moving averages for 
> the hit rate, but I'm not sure how to calculate that.  Instead I propose 
> updating recentHitRate on a fixed interval and exposing that as a Gauge.





[jira] [Commented] (CASSANDRA-6623) Null in a cell caused by expired TTL does not work with IF clause (in CQL3)

2014-01-27 Thread Csaba Seres (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882828#comment-13882828
 ] 

Csaba Seres commented on CASSANDRA-6623:


Dear Sylvain Lebresne,

I created a table without the COMPACT STORAGE directive:
cqlsh> CREATE TABLE astyanaxks.cf2 (name varchar PRIMARY KEY, lock varchar, 
something varchar);
Then inserted a row:
cqlsh> INSERT INTO astyanaxks.cf2 (name, lock , something ) VALUES ( 'name1', 
'lock1', 'som1');
Updated the lock column with a new TTL value:
cqlsh> UPDATE astyanaxks.cf2 USING TTL 10 SET lock='lock2' WHERE name='name1';
cqlsh> SELECT * FROM astyanaxks.cf2;

 name  | lock  | something
---+---+---
 name1 | lock2 |  som1

(1 rows)

After 10 seconds:

cqlsh> SELECT * FROM astyanaxks.cf2;

 name  | lock | something
---+--+---
 name1 | null |  som1

(1 rows)

Then I wanted to update the row if lock is null:
cqlsh> UPDATE astyanaxks.cf2 USING TTL 10 SET lock='lock2' WHERE name='name1' 
IF lock=null;

 [applied]
---
 False

It was unsuccessful.

cqlsh> SELECT * FROM astyanaxks.cf2;

 name  | lock | something
---+--+---
 name1 | null |  som1

(1 rows)

cqlsh> 

On the other hand, if the null value is set by an UPDATE, then the IF clause 
works. On the same Column Family:
cqlsh> UPDATE astyanaxks.cf2 SET lock=null WHERE name='name1';
cqlsh> UPDATE astyanaxks.cf2 USING TTL 10 SET lock='lock2' WHERE name='name1' 
IF lock=null;

 [applied]
---
  True

cqlsh> SELECT * FROM astyanaxks.cf2;

 name  | lock  | something
---+---+---
 name1 | lock2 |  som1

Now the lock column is set.

Best Regards,

Csaba Seres
 




> Null in a cell caused by expired TTL does not work with IF clause (in CQL3)
> ---
>
> Key: CASSANDRA-6623
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6623
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tests
> Environment: One cluster with two nodes on a Linux and a Windows 
> system. cqlsh 4.1.0 | Cassandra 2.0.4 | CQL spec 3.1.1 | Thrift protocol 
> 19.39.0. CQL3 Column Family
>Reporter: Csaba Seres
>Priority: Minor
> Fix For: 2.0.4
>
>
> The IF onecell=null clause does not work if onecell got its null value from 
> an expired TTL. If onecell is explicitly set to null with an UPDATE, then IF 
> onecell=null works fine.
> This bug is not present when the table is created with the COMPACT STORAGE 
> directive.





[jira] [Commented] (CASSANDRA-6623) Null in a cell caused by expired TTL does not work with IF clause (in CQL3)

2014-01-27 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882875#comment-13882875
 ] 

Sylvain Lebresne commented on CASSANDRA-6623:
-

Thanks Csaba.

So this is kind of due to the CASSANDRA-5619 patch, with a soupçon of the 
problem from CASSANDRA-5762 thrown in.

When we generate conditions on a row internally, we currently include the row 
marker in the 'expected' CF. The reason is that for CASSANDRA-5619, we wanted 
to distinguish CAS failure because the row doesn't exist from CAS failure 
because all columns on which we have conditions are null (but the row exists). 
Including the row marker in 'expected' makes us query the row marker, which in 
turn allows us to say whether the row exists or not on CAS failure.

But this also means that 'UPDATE IF' checks for the row's existence. Which 
doesn't really matter, unless you have only 'null' conditions. Namely, doing
{noformat}
CREATE TABLE test (k int PRIMARY KEY, v int)
UPDATE test SET v = 1 WHERE k = 0 IF v = null
{noformat}
the last update won't work, because the row doesn't exist prior to the update. 
And I think this is actually the first question to ask here: do we want that 
update not to work? I could see arguments for either side tbh. On the one side, 
since 'null' means the column doesn't exist, if the row doesn't exist then the 
column doesn't either, and in that sense the update should work. On the other 
side, making it not work is somewhat more expressive, since it allows checking 
for 'row exists but has a null value' separately from 'row doesn't exist' 
(which you can already check with an 'INSERT IF NOT EXISTS'). Also, this 
reinforces the notion that with conditions, UPDATE really works more like a SQL 
UPDATE.

Anyway, the problem Csaba is having here is a bit different. Namely, it's due 
to the fact that a TTL ends up removing the row marker, even though the TTL was 
only applied to one of the columns and is not a proper marker of row existence. 
The result is that the row marker expires in Csaba's example, but since the CAS 
expects it to be there, it fails (and this even though the row actually does 
still exist).

I think the solution here is basically the same as in CASSANDRA-5762: we should 
query the full CQL row as soon as we have a condition on it, to be able to 
reliably say whether the row exists or not. But while we're at it, it's worth 
deciding whether we want to preserve the current 'UPDATE IF always checks for 
row existence' behavior or not.
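
To illustrate the distinction with the test table above (outcomes under the 
current behavior described here):

{noformat}
-- applies only if no row with k = 0 exists at all
INSERT INTO test (k, v) VALUES (0, 1) IF NOT EXISTS;

-- under the current behavior, applies only if the row for k = 0 exists
-- (row marker present) *and* v is null
UPDATE test SET v = 1 WHERE k = 0 IF v = null;
{noformat}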


> Null in a cell caused by expired TTL does not work with IF clause (in CQL3)
> ---
>
> Key: CASSANDRA-6623
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6623
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tests
> Environment: One cluster with two nodes on a Linux and a Windows 
> system. cqlsh 4.1.0 | Cassandra 2.0.4 | CQL spec 3.1.1 | Thrift protocol 
> 19.39.0. CQL3 Column Family
>Reporter: Csaba Seres
>Priority: Minor
> Fix For: 2.0.4
>
>
> The IF onecell=null clause does not work if onecell got its null value from 
> an expired TTL. If onecell is explicitly set to null with an UPDATE, then IF 
> onecell=null works fine.
> This bug is not present when the table is created with the COMPACT STORAGE 
> directive.





[jira] [Commented] (CASSANDRA-6082) 1.1.12 --> 1.2.x upgrade may result inconsistent ring

2014-01-27 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882834#comment-13882834
 ] 

Brandon Williams commented on CASSANDRA-6082:
-

The one linked as 'duplicates' ;)  CASSANDRA-6564

> 1.1.12 --> 1.2.x upgrade may result inconsistent ring
> -
>
> Key: CASSANDRA-6082
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6082
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 1.1.12 --> 1.2.9
>Reporter: Chris Burroughs
>Priority: Minor
> Attachments: c-gossipinfo, c-status
>
>
> This happened to me once, and since I don't have any more 1.1.x clusters I 
> won't be testing again.  I hope the attached files are enough for someone to 
> connect the dots.
> I did a rolling restart to upgrade from 1.1.12 --> 1.2.9.  About a week later 
> I discovered that one node was in an inconsistent state in the ring.  It was 
> either:
>  * up
>  * host-id=null
>  * missing
> Depending on which node I ran nodetool status from.  I *think* I just missed 
> this during the upgrade but cannot rule out the possibility that it "just 
> happened for no reason" some time after the upgrade.  It was detected when 
> running repair in such a ring caused all sorts of terrible data "duplication" 
> and performance tanked.  Restarting the seeds + "bad" node caused the ring to 
> be consistent again.
> Two possibly suspicious things are an ArrayIndexOutOfBoundsException on 
> startup:
> {noformat}
> ERROR [GossipStage:1] 2013-09-06 10:45:35,213 CassandraDaemon.java (line 194) 
> Exception in thread Thread[GossipStage:1,5,main]
> java.lang.ArrayIndexOutOfBoundsException: 2
> at 
> org.apache.cassandra.service.StorageService.extractExpireTime(StorageService.java:1660)
> at 
> org.apache.cassandra.service.StorageService.handleStateRemoving(StorageService.java:1607)
> at 
> org.apache.cassandra.service.StorageService.onChange(StorageService.java:1230)
> at 
> org.apache.cassandra.service.StorageService.onJoin(StorageService.java:1958)
> at 
> org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:841)
> at 
> org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:919)
> at 
> org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:50)
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> {noformat}
> and problems with hint delivery to multiple nodes:
> {noformat}
> ERROR [MutationStage:11] 2013-09-06 13:59:19,604 CassandraDaemon.java (line 
> 194) Exception in thread Thread[MutationStage:11,5,main]
> java.lang.AssertionError: Missing host ID for 10.20.2.45
> at 
> org.apache.cassandra.service.StorageProxy.writeHintForMutation(StorageProxy.java:583)
> at 
> org.apache.cassandra.service.StorageProxy$5.runMayThrow(StorageProxy.java:552)
> at 
> org.apache.cassandra.service.StorageProxy$HintRunnable.run(StorageProxy.java:1658)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> {noformat}
> Note however that while there were delivery problems to multiple nodes during 
> the rolling upgrade, only one node was in a funky state a week later.
> Attached are the results of running gossipinfo and status on every node.





[jira] [Commented] (CASSANDRA-6622) Streaming session failures during node replace using replace_address

2014-01-27 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882833#comment-13882833
 ] 

Brandon Williams commented on CASSANDRA-6622:
-

If the node is already marked dead, the FailureDetector isn't going to 
convict() again just because it receives a dead state like hibernate.  It will 
call onDead again for subscribers, but StreamSession doesn't care about that.  
What it does care about, however, is onRestart being called since there was a 
generation change, and that will fail the session.

That said, certainly delaying the stream until gossip has propagated should 
solve the issue, though I'm not sure streaming should be failing based on 
gossip/FD events (/cc [~yukim]).  However, instead of sleeping for 
BROADCAST_INTERVAL we can save half the time and sleep for RING_DELAY, since if 
gossip hasn't propagated fully by then there are bigger problems.  WDYT 
[~thobbs]?

> Streaming session failures during node replace using replace_address
> 
>
> Key: CASSANDRA-6622
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6622
> Project: Cassandra
>  Issue Type: Bug
> Environment: RHEL6, cassandra-2.0.4
>Reporter: Ravi Prasad
> Attachments: 6622-2.0.txt
>
>
> When using replace_address, the Gossiper ApplicationState is set to hibernate, 
> which is a down state. We are seeing that the peer nodes receive the streaming 
> plan request even before the Gossiper on them marks the replacing node as 
> dead. As a result, streaming on the peer nodes convicts the replacing node by 
> closing the stream handler.
> I think making the StorageService thread on the replacing node sleep for 
> BROADCAST_INTERVAL before bootstrapping would avoid this scenario.
> Relevant logs from a peer node (note that the Gossiper on the peer node marks 
> the replacing node as down 2 secs after the streaming init request):
>  INFO [STREAM-INIT-/x.x.x.x:46436] 2014-01-26 20:42:24,388 
> StreamResultFuture.java (line 116) [Stream 
> #5c6cd940-86ca-11e3-90a0-411b913c0e88] Received streaming plan for Bootstrap
> 
>  INFO [GossipTasks:1] 2014-01-26 20:42:25,240 StreamResultFuture.java (line 
> 181) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Session with /x.x.x.x is 
> complete
>  WARN [GossipTasks:1] 2014-01-26 20:42:25,240 StreamResultFuture.java (line 
> 210) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Stream failed
>  INFO [GossipStage:1] 2014-01-26 20:42:25,242 Gossiper.java (line 850) 
> InetAddress /x.x.x.x is now DOWN
> ERROR [STREAM-IN-/x.x.x.x] 2014-01-26 20:42:25,766 StreamSession.java (line 
> 410) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Streaming error occurred
> java.lang.RuntimeException: Outgoing stream handler has been closed
> at 
> org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:175)
> at 
> org.apache.cassandra.streaming.StreamSession.prepare(StreamSession.java:436)
> at 
> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:358)
> at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:293)
> at java.lang.Thread.run(Thread.java:722)
>  INFO [STREAM-IN-/x.x.x.x] 2014-01-26 20:42:25,768 StreamResultFuture.java 
> (line 181) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Session with 
> /x.x.x.x is complete
>  WARN [STREAM-IN-/x.x.x.x] 2014-01-26 20:42:25,768 StreamResultFuture.java 
> (line 210) [Stream #5c6cd940-86ca-11e3-90a0-411b913c0e88] Stream failed





[jira] [Created] (CASSANDRA-6625) Batch containing delete and insert leads to inconsistent results

2014-01-27 Thread Ondřej Černoš (JIRA)
Ondřej Černoš created CASSANDRA-6625:


 Summary: Batch containing delete and insert leads to inconsistent 
results
 Key: CASSANDRA-6625
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6625
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: C* 1.2.11
Reporter: Ondřej Černoš
Priority: Critical


On a single-node cluster (i.e. ./bin/cassandra -f on localhost) we ran into the 
following. Let's consider an empty keyspace with the following table:

{noformat}
CREATE TABLE test (
a varchar,
b varchar,
PRIMARY KEY (a, b)
) WITH comment='List of a related to b - widerow';
{noformat}

The table is empty.

Now we issue the following batch:

{noformat}
BEGIN BATCH
DELETE FROM test WHERE a = 'a1' AND b = 'b1';
INSERT INTO test (a, b) VALUES ('a1', 'b1');
APPLY BATCH;
{noformat}

When the batch successfully finishes, the table is empty.

This is a consequence of the fact that the tombstone wins if timestamps are the 
same. And they are, because the operations are batched.

I consider this a bug. Batching operations shouldn't change the semantics of 
the operations.





[jira] [Updated] (CASSANDRA-6625) Batch containing delete and insert leads to inconsistent results

2014-01-27 Thread Ondřej Černoš (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ondřej Černoš updated CASSANDRA-6625:
-

Description: 
On a single-node cluster (i.e. ./bin/cassandra -f on localhost) we ran into the 
following. Let's consider an empty keyspace with the following table:

{noformat}
CREATE TABLE test (
a varchar,
b varchar,
PRIMARY KEY (a, b)
) WITH comment='List of a related to b - widerow';
{noformat}

The table is empty.

Now we issue the following batch:

{noformat}
BEGIN BATCH
DELETE FROM test WHERE a = 'a1' AND b = 'b1';
INSERT INTO test (a, b) VALUES ('a1', 'b1');
APPLY BATCH;
{noformat}

When the batch successfully finishes, the table is empty.

This is a consequence of the fact that the tombstone wins if timestamps are the 
same. And they are, because the operations are batched.

I consider this a bug. Batching operations shouldn't change the semantics of 
batched operations.

  was:
On a single-node cluster (i.e. ./bin/cassandra -f on localhost) we ran into the 
following. Let's consider an empty keyspace with the following table:

{noformat}
CREATE TABLE test (
a varchar,
b varchar,
PRIMARY KEY (a, b)
) WITH comment='List of a related to b - widerow';
{noformat}

The table is empty.

Now we issue the following batch:

{noformat}
BEGIN BATCH
DELETE FROM test WHERE a = 'a1' AND b = 'b1';
INSERT INTO test (a, b) VALUES ('a1', 'b1');
APPLY BATCH;
{noformat}

When the batch successfully finishes, the table is empty.

This is a consequence of the fact that the tombstone wins if timestamps are the 
same. And they are, because the operations are batched.

I consider this a bug. Batching operations shouldn't change the semantics of 
the operations.


> Batch containing delete and insert leads to inconsistent results
> 
>
> Key: CASSANDRA-6625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6625
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: C* 1.2.11
>Reporter: Ondřej Černoš
>Priority: Critical
>  Labels: cql3
>
> On a single-node cluster (i.e. ./bin/cassandra -f on localhost) we ran into 
> the following. Let's consider an empty keyspace with the following table:
> {noformat}
> CREATE TABLE test (
> a varchar,
> b varchar,
> PRIMARY KEY (a, b)
> ) WITH comment='List of a related to b - widerow';
> {noformat}
> The table is empty.
> Now we issue the following batch:
> {noformat}
> BEGIN BATCH
> DELETE FROM test WHERE a = 'a1' AND b = 'b1';
> INSERT INTO test (a, b) VALUES ('a1', 'b1');
> APPLY BATCH;
> {noformat}
> When the batch successfully finishes, the table is empty.
> This is a consequence of the fact that the tombstone wins if timestamps are 
> the same. And they are, because the operations are batched.
> I consider this a bug. Batching operations shouldn't change the semantics of 
> batched operations.





[jira] [Commented] (CASSANDRA-6421) Add bash completion to nodetool

2014-01-27 Thread Lyuben Todorov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882946#comment-13882946
 ] 

Lyuben Todorov commented on CASSANDRA-6421:
---

Just got the completion to work on OS X (Mavericks); looks good. I can check 
again after the rebase if necessary.

> Add bash completion to nodetool
> ---
>
> Key: CASSANDRA-6421
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6421
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Cyril Scetbon
>Assignee: Michael Shuler
>Priority: Trivial
> Fix For: 2.1
>
>
> You can find the bash-completion file at 
> https://raw.github.com/cscetbon/cassandra/nodetool-completion/etc/bash_completion.d/nodetool
> It uses cqlsh to get keyspaces and namespaces, and could use an environment 
> variable (not implemented) to tell it which cqlsh to use for access if 
> authentication is needed. But I think that's really a good start :)





[jira] [Commented] (CASSANDRA-6625) Batch containing delete and insert leads to inconsistent results

2014-01-27 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882953#comment-13882953
 ] 

Brandon Williams commented on CASSANDRA-6625:
-

bq. Batching operations shouldn't change the semantics of batched operations.

And it doesn't.

{noformat}
DELETE FROM test USING TIMESTAMP 123 WHERE a = 'a1' AND b = 'b1';
INSERT INTO test (a, b) VALUES ('a1', 'b1') USING TIMESTAMP 123;
{noformat}

does the same thing.
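
A sketch of the usual workaround, with hypothetical explicit timestamps chosen 
so that the insert strictly wins the tie-break:

{noformat}
BEGIN BATCH
DELETE FROM test USING TIMESTAMP 122 WHERE a = 'a1' AND b = 'b1';
INSERT INTO test (a, b) VALUES ('a1', 'b1') USING TIMESTAMP 123;
APPLY BATCH;
{noformat}

After this batch the row survives, because the insert carries the higher 
timestamp.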


> Batch containing delete and insert leads to inconsistent results
> 
>
> Key: CASSANDRA-6625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6625
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: C* 1.2.11
>Reporter: Ondřej Černoš
>Priority: Minor
>  Labels: cql3
>
> On a single-node cluster (i.e. ./bin/cassandra -f on localhost) we ran into 
> the following. Let's consider an empty keyspace with the following table:
> {noformat}
> CREATE TABLE test (
> a varchar,
> b varchar,
> PRIMARY KEY (a, b)
> ) WITH comment='List of a related to b - widerow';
> {noformat}
> The table is empty.
> Now we issue the following batch:
> {noformat}
> BEGIN BATCH
> DELETE FROM test WHERE a = 'a1' AND b = 'b1';
> INSERT INTO test (a, b) VALUES ('a1', 'b1');
> APPLY BATCH;
> {noformat}
> When the batch successfully finishes, the table is empty.
> This is a consequence of the fact that the tombstone wins if timestamps are 
> the same. And they are, because the operations are batched.
> I consider this a bug. Batching operations shouldn't change the semantics of 
> batched operations.





[jira] [Resolved] (CASSANDRA-6625) Batch containing delete and insert leads to inconsistent results

2014-01-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-6625.
---

Resolution: Not A Problem

> Batch containing delete and insert leads to inconsistent results
> 
>
> Key: CASSANDRA-6625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6625
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: C* 1.2.11
>Reporter: Ondřej Černoš
>Priority: Minor
>  Labels: cql3
>
> On a single-node cluster (i.e. ./bin/cassandra -f on localhost) we ran into 
> the following. Let's consider an empty keyspace with the following table:
> {noformat}
> CREATE TABLE test (
> a varchar,
> b varchar,
> PRIMARY KEY (a, b)
> ) WITH comment='List of a related to b - widerow';
> {noformat}
> The table is empty.
> Now we issue the following batch:
> {noformat}
> BEGIN BATCH
> DELETE FROM test WHERE a = 'a1' AND b = 'b1';
> INSERT INTO test (a, b) VALUES ('a1', 'b1');
> APPLY BATCH;
> {noformat}
> When the batch successfully finishes, the table is empty.
> This is a consequence of the fact that the tombstone wins if timestamps are 
> the same. And they are, because the operations are batched.
> I consider this a bug. Batching operations shouldn't change the semantics of 
> batched operations.





[jira] [Updated] (CASSANDRA-6625) Batch containing delete and insert leads to inconsistent results

2014-01-27 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-6625:


Priority: Minor  (was: Critical)

> Batch containing delete and insert leads to inconsistent results
> 
>
> Key: CASSANDRA-6625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6625
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: C* 1.2.11
>Reporter: Ondřej Černoš
>Priority: Minor
>  Labels: cql3
>
> On a single-node cluster (i.e. ./bin/cassandra -f on localhost) we ran into 
> the following. Let's consider an empty keyspace with the following table:
> {noformat}
> CREATE TABLE test (
> a varchar,
> b varchar,
> PRIMARY KEY (a, b)
> ) WITH comment='List of a related to b - widerow';
> {noformat}
> The table is empty.
> Now we issue the following batch:
> {noformat}
> BEGIN BATCH
> DELETE FROM test WHERE a = 'a1' AND b = 'b1';
> INSERT INTO test (a, b) VALUES ('a1', 'b1');
> APPLY BATCH;
> {noformat}
> When the batch successfully finishes, the table is empty.
> This is a consequence of the fact that the tombstone wins if timestamps are 
> the same. And they are, because the operations are batched.
> I consider this a bug. Batching operations shouldn't change the semantics of 
> batched operations.





[jira] [Commented] (CASSANDRA-6625) Batch containing delete and insert leads to inconsistent results

2014-01-27 Thread Ondřej Černoš (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882962#comment-13882962
 ] 

Ondřej Černoš commented on CASSANDRA-6625:
--

[~brandon.williams] The original operations don't set the timestamp explicitly. 
From the user's point of view the result of batching the operations is 
unexpected and confusing. If you consider this Not A Problem, so be it, you are 
the maintainer. But the documentation of batches should state this explicitly.

Mixing client-generated timestamps with implicit timestamps is fragile and 
error-prone. Batches therefore provide only limited value, and one has to be 
very careful about the nature of the batched operations; if these have the 
slightest chance of dealing with the same value, batches need to stay out of 
the picture.

Sidenote: from my understanding of Cassandra internals I know that fixing this 
batch misfeature would be very difficult. So please fix at least the 
documentation. By no means am I the last user who will hit this wall. Thank you.

> Batch containing delete and insert leads to inconsistent results
> 
>
> Key: CASSANDRA-6625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6625
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: C* 1.2.11
>Reporter: Ondřej Černoš
>Priority: Minor
>  Labels: cql3
>
> On a single-node cluster (i.e. ./bin/cassandra -f on localhost) we ran into 
> the following. Let's consider an empty keyspace with the following table:
> {noformat}
> CREATE TABLE test (
> a varchar,
> b varchar,
> PRIMARY KEY (a, b)
> ) WITH comment='List of a related to b - widerow';
> {noformat}
> The table is empty.
> Now we issue the following batch:
> {noformat}
> BEGIN BATCH
> DELETE FROM test WHERE a = 'a1' AND b = 'b1';
> INSERT INTO test (a, b) VALUES ('a1', 'b1');
> APPLY BATCH;
> {noformat}
> When the batch successfully finishes, the table is empty.
> This is a consequence of the fact that the tombstone wins if timestamps are 
> the same. And they are, because the operations are batched.
> I consider this a bug. Batching operations shouldn't change the semantics of 
> batched operations.





[jira] [Commented] (CASSANDRA-6421) Add bash completion to nodetool

2014-01-27 Thread Cyril Scetbon (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882977#comment-13882977
 ] 

Cyril Scetbon commented on CASSANDRA-6421:
--

If you can check after the rebase, that would help :) Otherwise I'll check it 
tomorrow.

> Add bash completion to nodetool
> ---
>
> Key: CASSANDRA-6421
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6421
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Cyril Scetbon
>Assignee: Michael Shuler
>Priority: Trivial
> Fix For: 2.1
>
>
> You can find the bash-completion file at 
> https://raw.github.com/cscetbon/cassandra/nodetool-completion/etc/bash_completion.d/nodetool
> It uses cqlsh to get keyspaces and namespaces, and could use an environment 
> variable (not implemented) to tell it which cqlsh to use for access if 
> authentication is needed. But I think that's really a good start :)





[jira] [Updated] (CASSANDRA-6446) Faster range tombstones on wide partitions

2014-01-27 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-6446:


Attachment: (was: 6446-Read-patch-v3.txt)

> Faster range tombstones on wide partitions
> --
>
> Key: CASSANDRA-6446
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6446
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Oleg Anastasyev
>Assignee: Oleg Anastasyev
> Fix For: 2.1
>
> Attachments: RangeTombstonesReadOptimization.diff, 
> RangeTombstonesWriteOptimization.diff
>
>
> Having wide CQL rows (~1M in a single partition), after deleting some of 
> them we found inefficiencies in the handling of range tombstones on both the 
> write and read paths.
> I attached 2 patches here, one for the write path 
> (RangeTombstonesWriteOptimization.diff) and another for the read path 
> (RangeTombstonesReadOptimization.diff).
> On the write path, when you have CQL row deletions by primary key, each 
> deletion is represented by a range tombstone. When such a tombstone is put 
> into the memtable, the original code takes all columns of the partition from 
> the memtable and checks DeletionInfo.isDeleted in a brute-force loop to 
> decide whether each column should stay in the memtable or was deleted by the 
> new tombstone. Needless to say, the more columns you have in a partition, 
> the slower deletions get, heating your CPU with brute-force range tombstone 
> checks.
> For partitions with more than 1 column, the 
> RangeTombstonesWriteOptimization.diff patch loops over the tombstones instead 
> and checks the existence of columns for each of them. It also copies the 
> whole memtable range tombstone list only if there are changes to be made 
> there (the original code copies the range tombstone list on every write).
> On the read path, the original code scans the whole range tombstone list of 
> a partition to match sstable columns to their range tombstones. The 
> RangeTombstonesReadOptimization.diff patch scans only the necessary range of 
> tombstones, according to the filter used for the read.





[jira] [Updated] (CASSANDRA-6446) Faster range tombstones on wide partitions

2014-01-27 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-6446:


Attachment: (was: 0002-6446-Read-patch-v2.txt)

> Faster range tombstones on wide partitions
> --
>
> Key: CASSANDRA-6446
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6446
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Oleg Anastasyev
>Assignee: Oleg Anastasyev
> Fix For: 2.1
>
> Attachments: RangeTombstonesReadOptimization.diff, 
> RangeTombstonesWriteOptimization.diff
>
>
> Having wide CQL rows (~1M in a single partition), after deleting some of 
> them we found inefficiencies in the handling of range tombstones on both the 
> write and read paths.
> I attached 2 patches here, one for the write path 
> (RangeTombstonesWriteOptimization.diff) and another for the read path 
> (RangeTombstonesReadOptimization.diff).
> On the write path, when you have CQL row deletions by primary key, each 
> deletion is represented by a range tombstone. When such a tombstone is put 
> into the memtable, the original code takes all columns of the partition from 
> the memtable and checks DeletionInfo.isDeleted in a brute-force loop to 
> decide whether each column should stay in the memtable or was deleted by the 
> new tombstone. Needless to say, the more columns you have in a partition, 
> the slower deletions get, heating your CPU with brute-force range tombstone 
> checks.
> For partitions with more than 1 column, the 
> RangeTombstonesWriteOptimization.diff patch loops over the tombstones instead 
> and checks the existence of columns for each of them. It also copies the 
> whole memtable range tombstone list only if there are changes to be made 
> there (the original code copies the range tombstone list on every write).
> On the read path, the original code scans the whole range tombstone list of 
> a partition to match sstable columns to their range tombstones. The 
> RangeTombstonesReadOptimization.diff patch scans only the necessary range of 
> tombstones, according to the filter used for the read.





[jira] [Updated] (CASSANDRA-6446) Faster range tombstones on wide partitions

2014-01-27 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-6446:


Attachment: 6446-write-path-v3.txt
6446-Read-patch-v3.txt

Thanks for the testing :). Re-attached rebased v3's with the 2 problems 
mentioned above fixed.
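
(For context, a minimal sketch of the kind of deletion the description refers 
to, with a hypothetical table; deleting a CQL row by its full primary key is 
what gets represented internally as a range tombstone:)

{noformat}
CREATE TABLE events (k int, c int, v text, PRIMARY KEY (k, c));
-- each per-row deletion like this becomes one range tombstone in partition k=0
DELETE FROM events WHERE k = 0 AND c = 42;
{noformat}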

> Faster range tombstones on wide partitions
> --
>
> Key: CASSANDRA-6446
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6446
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Oleg Anastasyev
>Assignee: Oleg Anastasyev
> Fix For: 2.1
>
> Attachments: 6446-Read-patch-v3.txt, 6446-write-path-v3.txt, 
> RangeTombstonesReadOptimization.diff, RangeTombstonesWriteOptimization.diff
>
>
> Having wide CQL rows (~1M in a single partition), after deleting some of 
> them we found inefficiencies in the handling of range tombstones on both the 
> write and read paths.
> I attached 2 patches here, one for the write path 
> (RangeTombstonesWriteOptimization.diff) and another for the read path 
> (RangeTombstonesReadOptimization.diff).
> On the write path, when you have CQL row deletions by primary key, each 
> deletion is represented by a range tombstone. When such a tombstone is put 
> into the memtable, the original code takes all columns of the partition from 
> the memtable and checks DeletionInfo.isDeleted in a brute-force loop to 
> decide whether each column should stay in the memtable or was deleted by the 
> new tombstone. Needless to say, the more columns you have in a partition, 
> the slower deletions get, heating your CPU with brute-force range tombstone 
> checks.
> For partitions with more than 1 column, the 
> RangeTombstonesWriteOptimization.diff patch loops over the tombstones instead 
> and checks the existence of columns for each of them. It also copies the 
> whole memtable range tombstone list only if there are changes to be made 
> there (the original code copies the range tombstone list on every write).
> On the read path, the original code scans the whole range tombstone list of 
> a partition to match sstable columns to their range tombstones. The 
> RangeTombstonesReadOptimization.diff patch scans only the necessary range of 
> tombstones, according to the filter used for the read.





[2/2] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0

2014-01-27 Thread slebresne
Merge branch 'cassandra-1.2' into cassandra-2.0

Conflicts:
src/java/org/apache/cassandra/cql3/statements/BatchStatement.java


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8bbb6eda
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8bbb6eda
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8bbb6eda

Branch: refs/heads/cassandra-2.0
Commit: 8bbb6eda66412bdf347e302d5677538b82d26948
Parents: 1b858be 09563ab
Author: Sylvain Lebresne 
Authored: Mon Jan 27 18:46:54 2014 +0100
Committer: Sylvain Lebresne 
Committed: Mon Jan 27 18:46:54 2014 +0100

--
 CHANGES.txt | 1 +
 1 file changed, 1 insertion(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/8bbb6eda/CHANGES.txt
--
diff --cc CHANGES.txt
index 6aa944d,c690150..6acbc87
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -19,29 -15,10 +19,30 @@@ Merged from 1.2
   * skip blocking on streaming during drain (CASSANDRA-6603)
   * Improve error message when schema doesn't match loaded sstable 
(CASSANDRA-6262)
   * Add properties to adjust FD initial value and max interval (CASSANDRA-4375)
+  * Fix preparing with batch and delete from collection (CASSANDRA-6607)
  
  
 -1.2.13
 +2.0.4
 + * Allow removing snapshots of no-longer-existing CFs (CASSANDRA-6418)
 + * add StorageService.stopDaemon() (CASSANDRA-4268)
 + * add IRE for invalid CF supplied to get_count (CASSANDRA-5701)
 + * add client encryption support to sstableloader (CASSANDRA-6378)
 + * Fix accept() loop for SSL sockets post-shutdown (CASSANDRA-6468)
 + * Fix size-tiered compaction in LCS L0 (CASSANDRA-6496)
 + * Fix assertion failure in filterColdSSTables (CASSANDRA-6483)
 + * Fix row tombstones in larger-than-memory compactions (CASSANDRA-6008)
 + * Fix cleanup ClassCastException (CASSANDRA-6462)
 + * Reduce gossip memory use by interning VersionedValue strings 
(CASSANDRA-6410)
 + * Allow specifying datacenters to participate in a repair (CASSANDRA-6218)
 + * Fix divide-by-zero in PCI (CASSANDRA-6403)
 + * Fix setting last compacted key in the wrong level for LCS (CASSANDRA-6284)
 + * Add millisecond precision formats to the timestamp parser (CASSANDRA-6395)
 + * Expose a total memtable size metric for a CF (CASSANDRA-6391)
 + * cqlsh: handle symlinks properly (CASSANDRA-6425)
 + * Fix potential infinite loop when paging query with IN (CASSANDRA-6464)
 + * Fix assertion error in AbstractQueryPager.discardFirst (CASSANDRA-6447)
 + * Fix streaming older SSTable yields unnecessary tombstones (CASSANDRA-6527)
 +Merged from 1.2:
   * Improved error message on bad properties in DDL queries (CASSANDRA-6453)
   * Randomize batchlog candidates selection (CASSANDRA-6481)
   * Fix thundering herd on endpoint cache invalidation (CASSANDRA-6345, 6485)



[1/2] git commit: Fix preparing statement with batch and delete from collection

2014-01-27 Thread slebresne
Updated Branches:
  refs/heads/cassandra-2.0 1b858be48 -> 8bbb6eda6


Fix preparing statement with batch and delete from collection

patch by Jan Chochol; reviewed by slebresne for CASSANDRA-6607


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/09563ab2
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/09563ab2
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/09563ab2

Branch: refs/heads/cassandra-2.0
Commit: 09563ab2cc12bc5968b281080478885cb8f7f352
Parents: c612a36
Author: Sylvain Lebresne 
Authored: Mon Jan 27 18:43:05 2014 +0100
Committer: Sylvain Lebresne 
Committed: Mon Jan 27 18:45:23 2014 +0100

--
 CHANGES.txt   | 1 +
 src/java/org/apache/cassandra/cql3/statements/BatchStatement.java | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/09563ab2/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 524ffb7..c690150 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -15,6 +15,7 @@
  * skip blocking on streaming during drain (CASSANDRA-6603)
  * Improve error message when schema doesn't match loaded sstable 
(CASSANDRA-6262)
  * Add properties to adjust FD initial value and max interval (CASSANDRA-4375)
+ * Fix preparing with batch and delete from collection (CASSANDRA-6607)
 
 
 1.2.13

http://git-wip-us.apache.org/repos/asf/cassandra/blob/09563ab2/src/java/org/apache/cassandra/cql3/statements/BatchStatement.java
--
diff --git a/src/java/org/apache/cassandra/cql3/statements/BatchStatement.java 
b/src/java/org/apache/cassandra/cql3/statements/BatchStatement.java
index 05dae48..f9b9a68 100644
--- a/src/java/org/apache/cassandra/cql3/statements/BatchStatement.java
+++ b/src/java/org/apache/cassandra/cql3/statements/BatchStatement.java
@@ -139,7 +139,7 @@ public class BatchStatement extends ModificationStatement
 
 public ParsedStatement.Prepared prepare() throws InvalidRequestException
 {
-CFDefinition.Name[] boundNames = new 
CFDefinition.Name[getBoundTerms()];
+ColumnSpecification[] boundNames = new 
ColumnSpecification[getBoundTerms()];
 return prepare(boundNames);
 }
 



[3/3] git commit: Merge branch 'cassandra-2.0' into trunk

2014-01-27 Thread slebresne
Merge branch 'cassandra-2.0' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/680f2bda
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/680f2bda
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/680f2bda

Branch: refs/heads/trunk
Commit: 680f2bda4d0d51e023bcad2d160883ab408cca8f
Parents: 4d13d09 8bbb6ed
Author: Sylvain Lebresne 
Authored: Mon Jan 27 18:47:21 2014 +0100
Committer: Sylvain Lebresne 
Committed: Mon Jan 27 18:47:21 2014 +0100

--
 CHANGES.txt | 1 +
 1 file changed, 1 insertion(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/680f2bda/CHANGES.txt
--



[2/3] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0

2014-01-27 Thread slebresne
Merge branch 'cassandra-1.2' into cassandra-2.0

Conflicts:
src/java/org/apache/cassandra/cql3/statements/BatchStatement.java


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8bbb6eda
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8bbb6eda
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8bbb6eda

Branch: refs/heads/trunk
Commit: 8bbb6eda66412bdf347e302d5677538b82d26948
Parents: 1b858be 09563ab
Author: Sylvain Lebresne 
Authored: Mon Jan 27 18:46:54 2014 +0100
Committer: Sylvain Lebresne 
Committed: Mon Jan 27 18:46:54 2014 +0100

--
 CHANGES.txt | 1 +
 1 file changed, 1 insertion(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/8bbb6eda/CHANGES.txt
--
diff --cc CHANGES.txt
index 6aa944d,c690150..6acbc87
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -19,29 -15,10 +19,30 @@@ Merged from 1.2
   * skip blocking on streaming during drain (CASSANDRA-6603)
   * Improve error message when schema doesn't match loaded sstable 
(CASSANDRA-6262)
   * Add properties to adjust FD initial value and max interval (CASSANDRA-4375)
+  * Fix preparing with batch and delete from collection (CASSANDRA-6607)
  
  
 -1.2.13
 +2.0.4
 + * Allow removing snapshots of no-longer-existing CFs (CASSANDRA-6418)
 + * add StorageService.stopDaemon() (CASSANDRA-4268)
 + * add IRE for invalid CF supplied to get_count (CASSANDRA-5701)
 + * add client encryption support to sstableloader (CASSANDRA-6378)
 + * Fix accept() loop for SSL sockets post-shutdown (CASSANDRA-6468)
 + * Fix size-tiered compaction in LCS L0 (CASSANDRA-6496)
 + * Fix assertion failure in filterColdSSTables (CASSANDRA-6483)
 + * Fix row tombstones in larger-than-memory compactions (CASSANDRA-6008)
 + * Fix cleanup ClassCastException (CASSANDRA-6462)
 + * Reduce gossip memory use by interning VersionedValue strings 
(CASSANDRA-6410)
 + * Allow specifying datacenters to participate in a repair (CASSANDRA-6218)
 + * Fix divide-by-zero in PCI (CASSANDRA-6403)
 + * Fix setting last compacted key in the wrong level for LCS (CASSANDRA-6284)
 + * Add millisecond precision formats to the timestamp parser (CASSANDRA-6395)
 + * Expose a total memtable size metric for a CF (CASSANDRA-6391)
 + * cqlsh: handle symlinks properly (CASSANDRA-6425)
 + * Fix potential infinite loop when paging query with IN (CASSANDRA-6464)
 + * Fix assertion error in AbstractQueryPager.discardFirst (CASSANDRA-6447)
 + * Fix streaming older SSTable yields unnecessary tombstones (CASSANDRA-6527)
 +Merged from 1.2:
   * Improved error message on bad properties in DDL queries (CASSANDRA-6453)
   * Randomize batchlog candidates selection (CASSANDRA-6481)
   * Fix thundering herd on endpoint cache invalidation (CASSANDRA-6345, 6485)



[1/3] git commit: Fix preparing statement with batch and delete from collection

2014-01-27 Thread slebresne
Updated Branches:
  refs/heads/trunk 4d13d0998 -> 680f2bda4


Fix preparing statement with batch and delete from collection

patch by Jan Chochol; reviewed by slebresne for CASSANDRA-6607


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/09563ab2
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/09563ab2
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/09563ab2

Branch: refs/heads/trunk
Commit: 09563ab2cc12bc5968b281080478885cb8f7f352
Parents: c612a36
Author: Sylvain Lebresne 
Authored: Mon Jan 27 18:43:05 2014 +0100
Committer: Sylvain Lebresne 
Committed: Mon Jan 27 18:45:23 2014 +0100

--
 CHANGES.txt   | 1 +
 src/java/org/apache/cassandra/cql3/statements/BatchStatement.java | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/09563ab2/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 524ffb7..c690150 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -15,6 +15,7 @@
  * skip blocking on streaming during drain (CASSANDRA-6603)
  * Improve error message when schema doesn't match loaded sstable 
(CASSANDRA-6262)
  * Add properties to adjust FD initial value and max interval (CASSANDRA-4375)
+ * Fix preparing with batch and delete from collection (CASSANDRA-6607)
 
 
 1.2.13

http://git-wip-us.apache.org/repos/asf/cassandra/blob/09563ab2/src/java/org/apache/cassandra/cql3/statements/BatchStatement.java
--
diff --git a/src/java/org/apache/cassandra/cql3/statements/BatchStatement.java 
b/src/java/org/apache/cassandra/cql3/statements/BatchStatement.java
index 05dae48..f9b9a68 100644
--- a/src/java/org/apache/cassandra/cql3/statements/BatchStatement.java
+++ b/src/java/org/apache/cassandra/cql3/statements/BatchStatement.java
@@ -139,7 +139,7 @@ public class BatchStatement extends ModificationStatement
 
 public ParsedStatement.Prepared prepare() throws InvalidRequestException
 {
-CFDefinition.Name[] boundNames = new 
CFDefinition.Name[getBoundTerms()];
+ColumnSpecification[] boundNames = new 
ColumnSpecification[getBoundTerms()];
 return prepare(boundNames);
 }
 



[jira] [Resolved] (CASSANDRA-6607) Unable to prepare statement with batch and delete from collection

2014-01-27 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne resolved CASSANDRA-6607.
-

   Resolution: Fixed
Fix Version/s: 1.2.14

Oh, right, that's definitely an oversight. Fix committed to 1.2. Thanks for 
the report.

> Unable to prepare statement with batch and delete from collection
> -
>
> Key: CASSANDRA-6607
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6607
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jan Chochol
>Assignee: Sylvain Lebresne
>Priority: Minor
> Fix For: 1.2.14
>
>
> It is not possible to prepare a statement with a batch containing a delete 
> operation on one item of a collection, e.g.:
> {noformat}
> BEGIN BATCH
> DELETE collection[?] FROM table WHERE key = ?;
> APPLY BATCH
> {noformat}
> Result of preparing such statement is:
> {noformat}
> java.lang.ArrayStoreException: org.apache.cassandra.cql3.ColumnSpecification
> {noformat}
> With stacktrace:
> {noformat}
> ERROR 16:26:36,816 Unexpected exception during request
> java.lang.ArrayStoreException: org.apache.cassandra.cql3.ColumnSpecification
>   at 
> org.apache.cassandra.cql3.AbstractMarker.collectMarkerSpecification(AbstractMarker.java:40)
>   at 
> org.apache.cassandra.cql3.Operation.collectMarkerSpecification(Operation.java:75)
>   at 
> org.apache.cassandra.cql3.statements.DeleteStatement.prepare(DeleteStatement.java:160)
>   at 
> org.apache.cassandra.cql3.statements.BatchStatement.prepare(BatchStatement.java:125)
>   at 
> org.apache.cassandra.cql3.statements.BatchStatement.prepare(BatchStatement.java:133)
>   at 
> org.apache.cassandra.cql3.QueryProcessor.getStatement(QueryProcessor.java:273)
>   at 
> org.apache.cassandra.cql3.QueryProcessor.prepare(QueryProcessor.java:201)
>   at 
> org.apache.cassandra.transport.messages.PrepareMessage.execute(PrepareMessage.java:77)
>   at 
> org.apache.cassandra.transport.Message$Dispatcher.messageReceived(Message.java:287)
>   at 
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
>   at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>   at 
> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>   at 
> org.jboss.netty.handler.execution.ChannelUpstreamEventRunnable.doRun(ChannelUpstreamEventRunnable.java:43)
>   at 
> org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:67)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:724)
> {noformat}
> This fix seems to help:
> {noformat}
> diff --git 
> a/src/java/org/apache/cassandra/cql3/statements/BatchStatement.java 
> b/src/java/org/apache/cassandra/cql3/statements/BatchStatement.java
> index f93eb63..74c0a45 100644
> --- a/src/java/org/apache/cassandra/cql3/statements/BatchStatement.java
> +++ b/src/java/org/apache/cassandra/cql3/statements/BatchStatement.java
> @@ -129,7 +129,7 @@ public class BatchStatement extends ModificationStatement
>  
>  public ParsedStatement.Prepared prepare() throws InvalidRequestException
>  {
> -CFDefinition.Name[] boundNames = new 
> CFDefinition.Name[getBoundsTerms()];
> +ColumnSpecification[] boundNames = new 
> ColumnSpecification[getBoundsTerms()];
>  return prepare(boundNames);
>  }
> {noformat}
> It is probably corrected in Cassandra 2.0 by commit 
> {{e431fb722f80d8957a0a7fd2ecf80333e9275c53}} (CASSANDRA-5443).
> We  are facing this issue with Cassandra version 1.2.11.
> Would it be possible to fix this issue in branch 1.2?



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
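
The ArrayStoreException above is a consequence of Java array covariance: an 
array created as CFDefinition.Name[] only accepts CFDefinition.Name instances 
at runtime, even when it is referenced through the supertype. A minimal, 
runnable sketch of the failure mode, using hypothetical stand-in classes 
rather than the Cassandra types:

{noformat}
// Stand-in classes for the sketch; not the real Cassandra types.
class Spec {}                 // plays the role of ColumnSpecification
class Name extends Spec {}    // plays the role of CFDefinition.Name

public class ArrayStoreDemo {
    public static void main(String[] args) {
        Spec[] bound = new Name[1]; // legal: Java arrays are covariant
        bound[0] = new Spec();      // compiles, but throws ArrayStoreException
    }
}
{noformat}

Declaring the array as the supertype up front, as the committed one-line fix 
does, removes the runtime element-type constraint.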


git commit: Fix preparing statement with batch and delete from collection

2014-01-27 Thread slebresne
Updated Branches:
  refs/heads/cassandra-1.2 c612a3649 -> 09563ab2c


Fix preparing statement with batch and delete from collection

patch by Jan Chochol; reviewed by slebresne for CASSANDRA-6607


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/09563ab2
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/09563ab2
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/09563ab2

Branch: refs/heads/cassandra-1.2
Commit: 09563ab2cc12bc5968b281080478885cb8f7f352
Parents: c612a36
Author: Sylvain Lebresne 
Authored: Mon Jan 27 18:43:05 2014 +0100
Committer: Sylvain Lebresne 
Committed: Mon Jan 27 18:45:23 2014 +0100

--
 CHANGES.txt   | 1 +
 src/java/org/apache/cassandra/cql3/statements/BatchStatement.java | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/09563ab2/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 524ffb7..c690150 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -15,6 +15,7 @@
  * skip blocking on streaming during drain (CASSANDRA-6603)
  * Improve error message when schema doesn't match loaded sstable 
(CASSANDRA-6262)
  * Add properties to adjust FD initial value and max interval (CASSANDRA-4375)
+ * Fix preparing with batch and delete from collection (CASSANDRA-6607)
 
 
 1.2.13

http://git-wip-us.apache.org/repos/asf/cassandra/blob/09563ab2/src/java/org/apache/cassandra/cql3/statements/BatchStatement.java
--
diff --git a/src/java/org/apache/cassandra/cql3/statements/BatchStatement.java 
b/src/java/org/apache/cassandra/cql3/statements/BatchStatement.java
index 05dae48..f9b9a68 100644
--- a/src/java/org/apache/cassandra/cql3/statements/BatchStatement.java
+++ b/src/java/org/apache/cassandra/cql3/statements/BatchStatement.java
@@ -139,7 +139,7 @@ public class BatchStatement extends ModificationStatement
 
 public ParsedStatement.Prepared prepare() throws InvalidRequestException
 {
-CFDefinition.Name[] boundNames = new 
CFDefinition.Name[getBoundTerms()];
+ColumnSpecification[] boundNames = new 
ColumnSpecification[getBoundTerms()];
 return prepare(boundNames);
 }
 



[jira] [Updated] (CASSANDRA-6446) Faster range tombstones on wide partitions

2014-01-27 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-6446:


Attachment: (was: 6446-write-path-v3.txt)

> Faster range tombstones on wide partitions
> --
>
> Key: CASSANDRA-6446
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6446
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Oleg Anastasyev
>Assignee: Oleg Anastasyev
> Fix For: 2.1
>
> Attachments: RangeTombstonesReadOptimization.diff, 
> RangeTombstonesWriteOptimization.diff
>
>
> Having wide CQL rows (~1M in a single partition), after deleting some of 
> them we found inefficiencies in the handling of range tombstones on both 
> the write and read paths.
> I attached 2 patches here, one for the write path 
> (RangeTombstonesWriteOptimization.diff) and another for the read path 
> (RangeTombstonesReadOptimization.diff).
> On the write path, when you delete some CQL rows by primary key, each 
> deletion is represented by a range tombstone. On putting this tombstone 
> into the memtable, the original code takes all columns of the partition 
> from the memtable and checks DeletionInfo.isDeleted in a brute-force loop 
> to decide whether each column should stay in the memtable or was deleted by 
> the new tombstone. Needless to say, the more columns you have in a 
> partition, the slower deletions become, heating your CPU with brute-force 
> range tombstone checks.
> For partitions with more than 1 column, the 
> RangeTombstonesWriteOptimization.diff patch loops over the tombstones 
> instead and checks the existence of columns for each of them. It also 
> copies the whole memtable range tombstone list only if there are changes to 
> be made there (the original code copies the range tombstone list on every 
> write).
> On the read path, the original code scans the whole range tombstone list of 
> a partition to match sstable columns to their range tombstones. The 
> RangeTombstonesReadOptimization.diff patch scans only the necessary range 
> of tombstones, according to the filter used for the read.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
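
The write-path optimization described above inverts the loop: rather than 
testing every column of the partition against the tombstone list, iterate the 
tombstones and visit only the columns each one covers. A rough sketch of that 
idea with simplified types (sorted column names mapped to write timestamps), 
not the attached patch:

{noformat}
import java.util.*;

class RangeTombstone {
    final String min, max;        // simplified: column names as strings
    final long markedForDeleteAt; // deletion timestamp
    RangeTombstone(String min, String max, long ts) {
        this.min = min; this.max = max; this.markedForDeleteAt = ts;
    }
}

class WritePathSketch {
    static void apply(NavigableMap<String, Long> columns, List<RangeTombstone> tombstones) {
        for (RangeTombstone rt : tombstones) {
            // subMap visits only the columns inside [min, max]: no full-partition scan
            Iterator<Map.Entry<String, Long>> it =
                columns.subMap(rt.min, true, rt.max, true).entrySet().iterator();
            while (it.hasNext())
                if (it.next().getValue() <= rt.markedForDeleteAt)
                    it.remove(); // column is shadowed by the tombstone
        }
    }
}
{noformat}

The cost then scales with the tombstones and the columns they actually cover, 
rather than with columns times tombstones.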


[jira] [Updated] (CASSANDRA-6446) Faster range tombstones on wide partitions

2014-01-27 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-6446:


Attachment: (was: 0001-6446-write-path-v2.txt)

> Faster range tombstones on wide partitions
> --
>
> Key: CASSANDRA-6446
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6446
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Oleg Anastasyev
>Assignee: Oleg Anastasyev
> Fix For: 2.1
>
> Attachments: RangeTombstonesReadOptimization.diff, 
> RangeTombstonesWriteOptimization.diff
>
>
> Having wide CQL rows (~1M in a single partition), after deleting some of 
> them we found inefficiencies in the handling of range tombstones on both 
> the write and read paths.
> I attached 2 patches here, one for the write path 
> (RangeTombstonesWriteOptimization.diff) and another for the read path 
> (RangeTombstonesReadOptimization.diff).
> On the write path, when you delete some CQL rows by primary key, each 
> deletion is represented by a range tombstone. On putting this tombstone 
> into the memtable, the original code takes all columns of the partition 
> from the memtable and checks DeletionInfo.isDeleted in a brute-force loop 
> to decide whether each column should stay in the memtable or was deleted by 
> the new tombstone. Needless to say, the more columns you have in a 
> partition, the slower deletions become, heating your CPU with brute-force 
> range tombstone checks.
> For partitions with more than 1 column, the 
> RangeTombstonesWriteOptimization.diff patch loops over the tombstones 
> instead and checks the existence of columns for each of them. It also 
> copies the whole memtable range tombstone list only if there are changes to 
> be made there (the original code copies the range tombstone list on every 
> write).
> On the read path, the original code scans the whole range tombstone list of 
> a partition to match sstable columns to their range tombstones. The 
> RangeTombstonesReadOptimization.diff patch scans only the necessary range 
> of tombstones, according to the filter used for the read.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6625) Batch containing delete and insert leads to inconsistent results

2014-01-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882980#comment-13882980
 ] 

Jonathan Ellis commented on CASSANDRA-6625:
---

If you're relying on the latency of non-batch operations to guarantee distinct 
timestamps, you're in for a nasty surprise at some point.

In other words, it's not batch design that's broken; you're simply doing it 
wrong.

> Batch containing delete and insert leads to inconsistent results
> 
>
> Key: CASSANDRA-6625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6625
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: C* 1.2.11
>Reporter: Ondřej Černoš
>Priority: Minor
>  Labels: cql3
>
> On a single-node cluster (i.e. ./bin/cassandra -f on localhost) we ran into 
> the following. Let's consider an empty keyspace with the following table:
> {noformat}
> CREATE TABLE test (
> a varchar,
> b varchar,
> PRIMARY KEY (a, b)
> ) WITH comment='List of a related to b - widerow';
> {noformat}
> The table is empty.
> Now we issue the following batch:
> {noformat}
> BEGIN BATCH
> DELETE FROM test WHERE a = 'a1' AND b = 'b1';
> INSERT INTO test (a, b) VALUES ('a1', 'b1');
> APPLY BATCH;
> {noformat}
> When the batch successfully finishes, the table is empty.
> This is a consequence of the fact that the tombstone wins if timestamps are 
> the same. And they are, because the operation is batched.
> I consider this a bug. Batching operations shouldn't change the semantics of 
> batched operations.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
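
The behaviour under discussion follows from cell reconciliation: the higher 
timestamp wins, and on a tie a tombstone shadows a live cell. A simplified 
sketch of that rule (illustrative types, not the actual Cassandra code):

{noformat}
class Cell {
    final long timestamp;
    final boolean tombstone;
    Cell(long timestamp, boolean tombstone) {
        this.timestamp = timestamp;
        this.tombstone = tombstone;
    }

    static Cell reconcile(Cell a, Cell b) {
        if (a.timestamp != b.timestamp)
            return a.timestamp > b.timestamp ? a : b; // newer write wins
        // Equal timestamps: the deletion wins. A DELETE and an INSERT in one
        // batch share the batch-assigned timestamp, so they resolve to the
        // tombstone and the row stays deleted.
        if (a.tombstone != b.tombstone)
            return a.tombstone ? a : b;
        return a;
    }
}
{noformat}

Supplying distinct client-side timestamps (e.g. via USING TIMESTAMP on each 
statement) avoids the tie.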


[Cassandra Wiki] Update of "MemtableSSTable" by JonathanEllis

2014-01-27 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "MemtableSSTable" page has been changed by JonathanEllis:
https://wiki.apache.org/cassandra/MemtableSSTable?action=diff&rev1=20&rev2=21

Comment:
update for 1.0+

  == Overview ==
- Cassandra writes are first written to the [[Durability|CommitLog]], and then 
to a per-!ColumnFamily structure called a Memtable.  When a Memtable is full, 
it is written to disk as an SSTable.
+ Cassandra writes are first written to the [[Durability|CommitLog]], and then 
to a per-!ColumnFamily structure called a Memtable.  When a Memtable is full, 
[[http://www.datastax.com/documentation/cassandra/2.0/webhelp/cassandra/dml/manage_dml_intro_c.html|it
 is written to disk as an SSTable]].
  
  A Memtable is basically a write-back cache of data rows that can be looked up 
by key -- that is, unlike a write-through cache, writes are batched up in the 
Memtable until it is full, when it is flushed.
  
  == Flushing ==
+ The process of turning a Memtable into a SSTable is called flushing.  You can 
manually trigger flush via jmx (e.g. with bin/nodetool), which you may want to 
do before restarting nodes since it will reduce !CommitLog replay time.  
Memtables are sorted by key and then written out sequentially.  Thus, writes 
are extremely fast, costing only a commitlog append and an amortized sequential 
write for the flush.
  
- The process of turning a Memtable into a SSTable is called flushing.  You can 
manually trigger flush via jmx (e.g. with bin/nodetool), which you may want to 
do before restarting nodes since it will reduce !CommitLog replay time.  
Memtables are sorted by key and then written out sequentially.  Thus, writes 
are extremely fast, costing only a commitlog append and an amortized sequential 
write for the flush!
- 
- Once flushed, SSTable files are immutable; no further writes may be done.  
So, on the read path, the server must (potentially, although it uses tricks 
like bloom filters to avoid doing so unnecessarily) combine row fragments from 
all the SSTables on disk, as well as any unflushed Memtables, to produce the 
requested data.
+ Once flushed, SSTable files are immutable; no further writes may be done.  
So, on the 
[[http://www.datastax.com/documentation/cassandra/2.0/webhelp/cassandra/dml/dml_about_reads_c.html|read
 path]], the server must (potentially, although it uses tricks like bloom 
filters to avoid doing so unnecessarily) combine row fragments from all the 
SSTables on disk, as well as any unflushed Memtables, to produce the requested 
data.
  
  == Compaction ==
- To bound the number of SSTable files that must be consulted on reads, and to 
reclaim [[DistributedDeletes|space taken by unused data]], Cassandra performs 
compactions: merging multiple old SSTable files into a single new one. 
Compactions are triggered when at least N SStables have been flushed to disk, 
where N is tunable and defaults to 4. Four similar-sized SSTables are merged 
into a single one. They start out being the same size as your memtable flush 
size, and then form a hierarchy with each one doubling in size. So you'll have 
up to N of the same size as your memtable, then up to N double that size, then 
up to N double that size, etc.
+ To bound the number of SSTable files that must be consulted on reads, and to 
reclaim [[DistributedDeletes|space taken by unused data]], Cassandra performs 
compactions: merging multiple old SSTable files into a single new one.  
Compaction strategies are pluggable; out of the box are provided 
SizeTieredCompactionStrategy, which combines sstables of similar sizes, and 
LeveledCompactionStrategy, which sorts sstables into a hierarchy of levels, 
each an order of magnitude larger than the previous.  As a rule of thumb, 
[[http://www.datastax.com/dev/blog/when-to-use-leveled-compaction|SizeTiered is 
better for write-intensive workloads, and Leveled better for read-intensive]].
  
- "Minor" only compactions merge sstables of similar size; "major" compactions 
merge all sstables in a given !ColumnFamily.  Prior to Cassandra 0.6.6/0.7.0, 
only major compactions can clean out obsolete [[DistributedDeletes|tombstones]].
+ (For those familiar with other LSM implementations, it's worth noting that 
[[https://issues.apache.org/jira/browse/CASSANDRA-1074|Cassandra can remove 
tombstones without a "major" compaction combining all sstables into a single 
file]].)
  
- Since the input SSTables are all sorted by key, merging can be done 
efficiently, still requiring no random i/o.  Once compaction is finished, the 
old SSTable files may be deleted: note that in the worst case (a workload 
consisting of no overwrites or deletes) this will temporarily require 2x your 
existing on-disk space used.  In today's world of multi-TB disks this is 
usually not a problem but it is good to keep in mind when you are setting alert 
thresholds.
+ Since the input SSTables are all sorted by key (te
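
The merge the updated page describes works because every input SSTable is 
already sorted by key, so compaction is a sequential k-way merge. A compact 
sketch of such a merge over sorted runs of keys (illustrative only, not the 
wiki's or Cassandra's code):

{noformat}
import java.util.*;

class MergeSketch {
    // k-way merge of already-sorted key runs: sequential reads only, no random i/o
    static List<String> merge(List<List<String>> runs) {
        // heap entries are {run index, position within that run}
        PriorityQueue<int[]> heap = new PriorityQueue<>(
            Comparator.comparing((int[] e) -> runs.get(e[0]).get(e[1])));
        for (int r = 0; r < runs.size(); r++)
            if (!runs.get(r).isEmpty())
                heap.add(new int[]{r, 0});
        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] e = heap.poll();
            out.add(runs.get(e[0]).get(e[1]));       // next smallest key overall
            if (e[1] + 1 < runs.get(e[0]).size())
                heap.add(new int[]{e[0], e[1] + 1}); // advance within that run
        }
        return out; // a real compaction would also reconcile duplicate keys here
    }
}
{noformat}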

[jira] [Commented] (CASSANDRA-6285) LCS compaction failing with Exception

2014-01-27 Thread Brandon Kearby (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883112#comment-13883112
 ] 

Brandon Kearby commented on CASSANDRA-6285:
---

After doing some more digging, looks like my issue is the same as 
https://issues.apache.org/jira/browse/CASSANDRA-4687


> LCS compaction failing with Exception
> -
>
> Key: CASSANDRA-6285
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6285
> Project: Cassandra
>  Issue Type: Bug
> Environment: 4 nodes, shortly updated from 1.2.11 to 2.0.2
>Reporter: David Sauer
>Assignee: Russ Hatch
>
> After altering everything to LCS, the table OpsCenter.rollups60 and one 
> other non-OpsCenter table got stuck with everything hanging around in L0.
> The compaction started and ran until the logs showed this:
> ERROR [CompactionExecutor:111] 2013-11-01 19:14:53,865 CassandraDaemon.java 
> (line 187) Exception in thread Thread[CompactionExecutor:111,1,RMI Runtime]
> java.lang.RuntimeException: Last written key 
> DecoratedKey(1326283851463420237, 
> 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574426c6f6f6d46696c746572537061636555736564)
>  >= current key DecoratedKey(954210699457429663, 
> 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574546f74616c4469736b5370616365557365640b0f)
>  writing into 
> /var/lib/cassandra/data/OpsCenter/rollups60/OpsCenter-rollups60-tmp-jb-58656-Data.db
>   at 
> org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:141)
>   at 
> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:164)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
>   at 
> org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:296)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:724)
> Moving back to STC worked to keep the compactions running.
> I would especially like to move my own table to LCS.
> After a major compaction with STC, the move to LCS fails with the same 
> exception.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
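
The RuntimeException in the log is the writer's ordering guard: SSTable 
contents are sorted by decorated key, so appends must arrive in strictly 
increasing key order. A simplified sketch of the invariant being enforced 
(not the actual SSTableWriter code):

{noformat}
class OrderedWriter<K extends Comparable<K>> {
    private K lastWrittenKey;

    void beforeAppend(K currentKey) {
        // sstable contents must be appended in strictly increasing key order
        if (lastWrittenKey != null && lastWrittenKey.compareTo(currentKey) >= 0)
            throw new RuntimeException("Last written key " + lastWrittenKey
                                       + " >= current key " + currentKey);
        lastWrittenKey = currentKey;
    }
}
{noformat}

Hitting it during compaction means the merged input presented keys to the 
writer out of token order.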


[jira] [Comment Edited] (CASSANDRA-5493) Confusing output of CommandDroppedTasks

2014-01-27 Thread Mikhail Stepura (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883128#comment-13883128
 ] 

Mikhail Stepura edited comment on CASSANDRA-5493 at 1/27/14 7:09 PM:
-

so we're still not good. There is still 1 excess IP address.


was (Author: mishail):
so we're still not good. There is still 1 excess 1 IP address.

> Confusing output of CommandDroppedTasks
> ---
>
> Key: CASSANDRA-5493
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5493
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.2.3
>Reporter: Ondřej Černoš
>Assignee: Mikhail Stepura
>Priority: Minor
>
> We have 2 DCs, 3 nodes in each, using EC2 support. We are debugging nodetool 
> repair problems (roughly 1 out of 2 attempts just freezes). We looked into 
> the MessagingServiceBean to see what is going on using jmxterm. See the 
> following:
> {noformat}
> #mbean = org.apache.cassandra.net:type=MessagingService:
> CommandDroppedTasks = { 
>  107.aaa.bbb.ccc = 0;
>  166.ddd.eee.fff = 124320;
>  10.ggg.hhh.iii = 0;
>  107.jjj.kkk.lll = 0;
>  166.mmm.nnn.ooo = 1336699;
>  166.ppp.qqq.rrr = 1329171;
>  10.sss.ttt.uuu = 0;
>  107.vvv.www.xxx = 0;
> };
> {noformat}
> The problem with this output is that it has 8 records. The node's neighbours (the 
> 107 and 10 nodes) are mentioned twice in the output, once with their public 
> IPs and once with their private IPs. The nodes in remote DC (the 166 ones) 
> are reported only once. I am pretty sure this is a bug - the node should be 
> reported only with one of its addresses in all outputs from Cassandra and it 
> should be consistent.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6285) LCS compaction failing with Exception

2014-01-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883129#comment-13883129
 ] 

Jonathan Ellis commented on CASSANDRA-6285:
---

[~rhatch] it would still be useful to try to repro w/ Brandon's instructions since 
we don't have a way to repro 4687 yet.

> LCS compaction failing with Exception
> -
>
> Key: CASSANDRA-6285
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6285
> Project: Cassandra
>  Issue Type: Bug
> Environment: 4 nodes, shortly updated from 1.2.11 to 2.0.2
>Reporter: David Sauer
>Assignee: Russ Hatch
>
> After altering everything to LCS, the table OpsCenter.rollups60 and one 
> other non-OpsCenter table got stuck with everything hanging around in L0.
> The compaction started and ran until the logs showed this:
> ERROR [CompactionExecutor:111] 2013-11-01 19:14:53,865 CassandraDaemon.java 
> (line 187) Exception in thread Thread[CompactionExecutor:111,1,RMI Runtime]
> java.lang.RuntimeException: Last written key 
> DecoratedKey(1326283851463420237, 
> 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574426c6f6f6d46696c746572537061636555736564)
>  >= current key DecoratedKey(954210699457429663, 
> 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574546f74616c4469736b5370616365557365640b0f)
>  writing into 
> /var/lib/cassandra/data/OpsCenter/rollups60/OpsCenter-rollups60-tmp-jb-58656-Data.db
>   at 
> org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:141)
>   at 
> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:164)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
>   at 
> org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:296)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:724)
> Moving back to STC worked to keep the compactions running.
> I would especially like to move my own table to LCS.
> After a major compaction with STC, the move to LCS fails with the same 
> exception.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-4851) CQL3: improve support for paginating over composites

2014-01-27 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-4851:


Fix Version/s: 2.0.5

> CQL3: improve support for paginating over composites
> 
>
> Key: CASSANDRA-4851
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4851
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Priority: Minor
> Fix For: 2.0.5
>
>
> Consider the following table:
> {noformat}
> CREATE TABLE test (
> k int,
> c1 int,
> c2 int,
> PRIMARY KEY (k, c1, c2)
> )
> {noformat}
> with the following data:
> {noformat}
> k | c1 | c2
> 
> 0 | 0  | 0
> 0 | 0  | 1
> 0 | 1  | 0
> 0 | 1  | 1
> {noformat}
> Currently, CQL3 allows to slice over either c1 or c2:
> {noformat}
> SELECT * FROM test WHERE k = 0 AND c1 > 0 AND c1 < 2
> SELECT * FROM test WHERE k = 0 AND c1 = 1 AND c2 > 0 AND c2 < 2
> {noformat}
> but you cannot express a query that returns the last 3 records. Indeed, for 
> that you would need to do a query like, say:
> {noformat}
> SELECT * FROM test WHERE k = 0 AND ((c1 = 0 AND c2 > 0) OR c1 > 0)
> {noformat}
> but we don't support that.
> This can make it hard to paginate over, say, all records for {{k = 0}} (I'm 
> saying "can" because if the number of possible values for c2 is not very 
> large, an easy workaround could be to paginate by entire values of c1, which 
> you can do).
> For the case where you only paginate to avoid OOMing on a query, 
> CASSANDRA-4415 will handle that and is probably the best solution. However, 
> there may be cases where the pagination is, say, user-triggered (as in, by 
> the user of your application).
> I note that one solution would be to add OR support at least in cases like 
> the one above. That's definitely doable, but on the other side we won't be 
> able to support full-blown OR, so it may not be very natural to support 
> seemingly random combinations of OR and not others.
> Another solution would be to allow the following syntax:
> {noformat}
> SELECT * FROM test WHERE k = 0 AND (c1, c2) > (0, 0)
> {noformat}
> which would literally mean that you want records where the values of c1 and 
> c2, taken as a tuple, are lexicographically greater than the tuple (0, 0). 
> This is less SQL-like (though maybe some SQL stores have that; it's a fairly 
> common thing to have imo), but would be much simpler to implement and 
> probably to use too.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
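
The proposed (c1, c2) > (0, 0) relation is plain lexicographic tuple 
comparison. A small sketch:

{noformat}
class TupleCompare {
    // lexicographic comparison: the first differing component decides
    static int compare(int[] a, int[] b) {
        for (int i = 0; i < Math.min(a.length, b.length); i++) {
            int c = Integer.compare(a[i], b[i]);
            if (c != 0)
                return c;
        }
        return Integer.compare(a.length, b.length); // a strict prefix sorts first
    }
}
{noformat}

Against the sample data, (0, 1), (1, 0) and (1, 1) all compare greater than 
(0, 0), which is exactly the last 3 records that the plain slice syntax 
cannot select.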


[jira] [Updated] (CASSANDRA-6609) Reduce Bloom Filter Garbage Allocation

2014-01-27 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-6609:


Attachment: tmp2.patch

I'm attaching a patch that resolves all garbage in this code path, and actually 
improves performance by around 10-15% as well, through the use of a ThreadLocal 
long[].

This is a suboptimal solution, in my book, as it's not clear that the 
performance boost will remain during normal system operation, since the 
ThreadLocal access will probably become more costly. But as things stand this 
is the only way to eliminate all garbage, and I think that is paramount for 
this method. It's unlikely the ThreadLocal lookup will become dramatically more 
costly, so I think until stack allocation can be improved this is the best bet.

> Reduce Bloom Filter Garbage Allocation
> --
>
> Key: CASSANDRA-6609
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6609
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
> Attachments: tmp.diff, tmp2.patch
>
>
> Just spotted that we allocate potentially large amounts of garbage on bloom 
> filter lookups, since we allocate a new long[] for each hash() and to store 
> the bucket indexes we visit, in a manner that guarantees they are allocated 
> on heap. With a lot of sstables and many requests, this could easily be 
> hundreds of megabytes of young gen churn per second.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
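
A rough sketch of the ThreadLocal long[] idea (the assumed shape of the 
approach, not the attached patch): reuse one per-thread scratch array for the 
bucket indexes so the hot path allocates nothing.

{noformat}
public class BucketScratch {
    // assumption for this sketch: an upper bound on hashes per lookup
    private static final int MAX_HASH_COUNT = 32;

    // one scratch array per thread, reused across lookups
    private static final ThreadLocal<long[]> SCRATCH = new ThreadLocal<long[]>() {
        @Override
        protected long[] initialValue() { return new long[MAX_HASH_COUNT]; }
    };

    // fill the reused buffer with bucket indexes via double hashing;
    // nothing is allocated on the hot path after the first call per thread
    static long[] indexes(long hash1, long hash2, int hashCount, long numBits) {
        long[] buf = SCRATCH.get();
        for (int i = 0; i < hashCount; i++) {
            long idx = (hash1 + i * hash2) % numBits;
            buf[i] = idx < 0 ? idx + numBits : idx;
        }
        return buf;
    }
}
{noformat}

As the comment notes, the win depends on ThreadLocal.get() staying cheaper 
than the allocation and young-gen churn it replaces.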


[jira] [Commented] (CASSANDRA-6609) Reduce Bloom Filter Garbage Allocation

2014-01-27 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883153#comment-13883153
 ] 

Benedict commented on CASSANDRA-6609:
-

Also, I think this should be a candidate for 1.2. Looks like it should apply 
cleanly.

> Reduce Bloom Filter Garbage Allocation
> --
>
> Key: CASSANDRA-6609
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6609
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
> Attachments: tmp.diff, tmp2.patch
>
>
> Just spotted that we allocate potentially large amounts of garbage on bloom 
> filter lookups, since we allocate a new long[] for each hash() and to store 
> the bucket indexes we visit, in a manner that guarantees they are allocated 
> on heap. With a lot of sstables and many requests, this could easily be 
> hundreds of megabytes of young gen churn per second.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6616) Nodetool repair is taking a long time.

2014-01-27 Thread Dharsan Logendran (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883168#comment-13883168
 ] 

Dharsan Logendran commented on CASSANDRA-6616:
--

Hi Brandon,

In our DB 99% of the tables are empty. Can Cassandra ignore the empty tables 
and run the repair only on the tables that contain data?

Thanks
Dharsan
 



> Nodetool repair is taking a long time. 
> ---
>
> Key: CASSANDRA-6616
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6616
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Cassandra Version 2.0.4 running on Redhat Version 6
>Reporter: Dharsan Logendran
>
> We have a two-node cluster with a replication factor of 2. The db has more 
> than 2500 column families (tables). The 'nodetool -pr -par' repair on an 
> almost-empty database (one or two tables have a little data) takes about 30 
> hours to complete.
>  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6609) Reduce Bloom Filter Garbage Allocation

2014-01-27 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-6609:


Attachment: tmp3.patch

Uh, that is, if it weren't for an awful fat-finger error. Fixed, and also 
reintroduced a deoptimised public getHashBuckets for use by the unit tests 
(deoptimised because it permits far more hashes than we ever do for realz, so I 
skip the ThreadLocal to avoid allocating an array that large).

> Reduce Bloom Filter Garbage Allocation
> --
>
> Key: CASSANDRA-6609
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6609
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
> Attachments: tmp.diff, tmp2.patch, tmp3.patch
>
>
> Just spotted that we allocate potentially large amounts of garbage on bloom 
> filter lookups, since we allocate a new long[] for each hash() and to store 
> the bucket indexes we visit, in a manner that guarantees they are allocated 
> on heap. With a lot of sstables and many requests, this could easily be 
> hundreds of megabytes of young gen churn per second.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6616) Nodetool repair is taking a long time.

2014-01-27 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883198#comment-13883198
 ] 

Brandon Williams commented on CASSANDRA-6616:
-

Pass the ones that have data explicitly to nodetool.

> Nodetool repair is taking a long time. 
> ---
>
> Key: CASSANDRA-6616
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6616
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Cassandra Version 2.0.4 running on Redhat Version 6
>Reporter: Dharsan Logendran
>
> We have a two-node cluster with a replication factor of 2. The db has more 
> than 2500 column families (tables). The 'nodetool -pr -par' repair on an 
> almost-empty database (one or two tables have a little data) takes about 30 
> hours to complete.
>  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
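
Concretely, nodetool repair accepts a keyspace followed by an explicit list of 
column families, so (with hypothetical keyspace and table names) a repair 
restricted to the tables that hold data looks like:

{noformat}
nodetool repair -pr -par my_keyspace table_with_data_1 table_with_data_2
{noformat}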


[jira] [Comment Edited] (CASSANDRA-6625) Batch containing delete and insert leads to inconsistent results

2014-01-27 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883223#comment-13883223
 ] 

Ondřej Černoš edited comment on CASSANDRA-6625 at 1/27/14 8:10 PM:
---

If operations on the same TCP connection are not strongly ordered, then you 
should definitely update your 
[documentation|http://cassandra.apache.org/doc/cql3/CQL.html#updateStmt]. It 
should rather read: do not use server-side generated timestamps unless your 
operations do not need to be ordered, even if you issue them synchronously in 
serial order. Googling ordering in Cassandra reveals nil; timestamp treatment 
in the CQL doc on both the Apache site and the Datastax site is very limited.


was (Author: ondrej.cernos):
If operations on the same tcp connection are not strongly ordered, then you 
should definitely update your 
[documentation|http://cassandra.apache.org/doc/cql3/CQL.html#updateStmt]. It 
should rather read: do not use server-side generated timestamps unless your 
operations do not need to be ordered, even if you issue them in synchronously 
in serial order. Googling ordering in Cassandra reveals nil, timestamp 
treatment in CQL doc on both Apache site and Datastax site is very limited.

> Batch containing delete and insert leads to inconsistent results
> 
>
> Key: CASSANDRA-6625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6625
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: C* 1.2.11
>Reporter: Ondřej Černoš
>Priority: Minor
>  Labels: cql3
>
> On a single-node cluster (i.e. ./bin/cassandra -f on localhost) we ran into 
> the following. Let's consider an empty keyspace with the following table:
> {noformat}
> CREATE TABLE test (
> a varchar,
> b varchar,
> PRIMARY KEY (a, b)
> ) WITH comment='List of a related to b - widerow';
> {noformat}
> The table is empty.
> Now we issue the following batch:
> {noformat}
> BEGIN BATCH
> DELETE FROM test WHERE a = 'a1' AND b = 'b1';
> INSERT INTO test (a, b) VALUES ('a1', 'b1');
> APPLY BATCH;
> {noformat}
> When the batch successfully finishes, the table is empty.
> This is a consequence of the fact that the tombstone wins if timestamps are 
> the same. And they are, because the operation is batched.
> I consider this a bug. Batching operations shouldn't change the semantics of 
> batched operations.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6625) Batch containing delete and insert leads to inconsistent results

2014-01-27 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883223#comment-13883223
 ] 

Ondřej Černoš commented on CASSANDRA-6625:
--

If operations on the same TCP connection are not strongly ordered, then you 
should definitely update your 
[documentation|http://cassandra.apache.org/doc/cql3/CQL.html#updateStmt]. It 
should rather read: do not use server-side generated timestamps unless your 
operations do not need to be ordered, even if you issue them synchronously in 
serial order. Googling ordering in Cassandra reveals nil; timestamp treatment 
in the CQL doc on both the Apache site and the Datastax site is very limited.

> Batch containing delete and insert leads to inconsistent results
> 
>
> Key: CASSANDRA-6625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6625
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: C* 1.2.11
>Reporter: Ondřej Černoš
>Priority: Minor
>  Labels: cql3
>
> On a single-node cluster (i.e. ./bin/cassandra -f on localhost) we ran into 
> the following. Let's consider an empty keyspace with the following table:
> {noformat}
> CREATE TABLE test (
> a varchar,
> b varchar,
> PRIMARY KEY (a, b)
> ) WITH comment='List of a related to b - widerow';
> {noformat}
> The table is empty.
> Now we issue the following batch:
> {noformat}
> BEGIN BATCH
> DELETE FROM test WHERE a = 'a1' AND b = 'b1';
> INSERT INTO test (a, b) VALUES ('a1', 'b1');
> APPLY BATCH;
> {noformat}
> When the batch successfully finishes, the table is empty.
> This is a consequence of the fact that the tombstone wins if timestamps are 
> the same. And they are, because the operation is batched.
> I consider this a bug. Batching operations shouldn't change the semantics of 
> batched operations.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6616) Nodetool repair is taking a long time.

2014-01-27 Thread Dharsan Logendran (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883243#comment-13883243
 ] 

Dharsan Logendran commented on CASSANDRA-6616:
--

Thanks Brandon,

Is there an easy way in C* to find out which tables are empty? I don't want to 
run a count on each table to find out which ones are empty.

Dharsan

 



> Nodetool repair is taking a long time. 
> ---
>
> Key: CASSANDRA-6616
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6616
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Cassandra Version 2.0.4 running on Redhat Version 6
>Reporter: Dharsan Logendran
>
> We have a two-node cluster with a replication factor of 2. The db has more 
> than 2500 column families (tables). The 'nodetool -pr -par' repair on an 
> almost-empty database (one or two tables have a little data) takes about 30 
> hours to complete.
>  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6285) LCS compaction failing with Exception

2014-01-27 Thread Russ Hatch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883253#comment-13883253
 ] 

Russ Hatch commented on CASSANDRA-6285:
---

[~jbellis] -- I was able to get the exception to occur by doing the following:

create a new cluster with ccm, and populate with 3 nodes
{noformat}
create keyspace SocialData with 
placement_strategy='org.apache.cassandra.locator.SimpleStrategy' and 
strategy_options = {replication_factor:3};
{noformat}
create the signal column family (I had to modify the schema above a little bit 
to make it work):
{noformat}
create column family signal
  with column_type = 'Standard'
  and comparator = 'UTF8Type'
  and default_validation_class = 'BytesType'
  and key_validation_class = 'UTF8Type'
  and read_repair_chance = 0.1
  and dclocal_read_repair_chance = 0.0
  and gc_grace = 432000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and compaction_strategy = 
'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'
  and caching = 'ALL'
  and compaction_strategy_options = {'sstable_size_in_mb' : '160'}
  and comment = 'A store of information about each individual signal.'
  and column_metadata = [
{column_name : 'type', validation_class : UTF8Type},
{column_name : 'foo_id', validation_class : LongType}]
  and compression_options = {'sstable_compression' : 
'org.apache.cassandra.io.compress.LZ4Compressor'}; 
{noformat}
stopped the nodes
copied all the files from the provided tar's /data/SocialData/ directory to one 
of my nodes
started the nodes up again
At this point I didn't find any data in the signal column family (using 'list 
signal;')
The exception appeared in the node's log
{noformat}
ERROR [CompactionExecutor:10] 2014-01-27 12:45:26,734 CassandraDaemon.java 
(line 187) Exception in thread Thread[CompactionExecutor:10,1,main]
java.lang.RuntimeException: Last written key DecoratedKey(4322717900587903123, 
706f737431353834373031323038270903ae0022076d9f) >= current key 
DecoratedKey(-7009815163526224622, 
545749545445523a33353334383632393032363439333437) writing into 
/home/rhatch/.ccm/test_cluster_1390845354/node1/data/SocialData/signal/SocialData-signal-tmp-jb-7-Data.db
at 
org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:141)
at 
org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:164)
at 
org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
at 
org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
at 
org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:197)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
{noformat}
I was curious if repair would have any bearing, so I ran repair on one node 
(after which I can see data in the signal table), then I stopped and started 
the nodes again -- a similar exception appears in the log for all 3 nodes 
('Last written key DecoratedKey ...').

I'm not 100% certain if my procedure for using the provided tar's test data was 
correct, so let me know if there's anything obvious I missed and I'll run 
through it again.

> LCS compaction failing with Exception
> -
>
> Key: CASSANDRA-6285
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6285
> Project: Cassandra
>  Issue Type: Bug
> Environment: 4 nodes, shortly updated from 1.2.11 to 2.0.2
>Reporter: David Sauer
>Assignee: Russ Hatch
>
> After altering everything to LCS, the table OpsCenter.rollups60 and one 
> other non-OpsCenter table got stuck with everything hanging around in L0.
> The compaction started and ran until the logs showed this:
> ERROR [CompactionExecutor:111] 2013-11-01 19:14:53,865 CassandraDaemon.java 
> (line 187) Exception in thread Thread[CompactionExecutor:111,1,RMI Runtime]
> java.lang.RuntimeException: Last written key 
> DecoratedKey(1326283851463420237, 
> 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574426c6f6f6d46696c746572537061636555736564)
>  >=

[jira] [Commented] (CASSANDRA-6285) LCS compaction failing with Exception

2014-01-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883265#comment-13883265
 ] 

Jonathan Ellis commented on CASSANDRA-6285:
---

Was that 2.0 HEAD or 2.0.4?

> LCS compaction failing with Exception
> -
>
> Key: CASSANDRA-6285
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6285
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 4 nodes, shortly updated from 1.2.11 to 2.0.2
>Reporter: David Sauer
>Assignee: Tyler Hobbs
> Fix For: 2.0.5
>
>
> After altering everything to LCS, the table OpsCenter.rollups60 and one 
> other non-OpsCenter table got stuck with everything hanging around in L0.
> The compaction started and ran until the logs showed this:
> ERROR [CompactionExecutor:111] 2013-11-01 19:14:53,865 CassandraDaemon.java 
> (line 187) Exception in thread Thread[CompactionExecutor:111,1,RMI Runtime]
> java.lang.RuntimeException: Last written key 
> DecoratedKey(1326283851463420237, 
> 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574426c6f6f6d46696c746572537061636555736564)
>  >= current key DecoratedKey(954210699457429663, 
> 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574546f74616c4469736b5370616365557365640b0f)
>  writing into 
> /var/lib/cassandra/data/OpsCenter/rollups60/OpsCenter-rollups60-tmp-jb-58656-Data.db
>   at 
> org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:141)
>   at 
> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:164)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
>   at 
> org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:296)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:724)
> Moving back to STC worked to keep the compactions running.
> I would especially like to move my own table to LCS.
> After a major compaction with STC, the move to LCS fails with the same 
> exception.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Assigned] (CASSANDRA-6285) LCS compaction failing with Exception

2014-01-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-6285:
-

Assignee: Tyler Hobbs  (was: Russ Hatch)

> LCS compaction failing with Exception
> -
>
> Key: CASSANDRA-6285
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6285
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 4 nodes, shortly updated from 1.2.11 to 2.0.2
>Reporter: David Sauer
>Assignee: Tyler Hobbs
> Fix For: 2.0.5
>
>
> After altering everything to LCS, the table OpsCenter.rollups60 and one 
> other non-OpsCenter table got stuck with everything hanging around in L0.
> The compaction started and ran until the logs showed this:
> ERROR [CompactionExecutor:111] 2013-11-01 19:14:53,865 CassandraDaemon.java 
> (line 187) Exception in thread Thread[CompactionExecutor:111,1,RMI Runtime]
> java.lang.RuntimeException: Last written key 
> DecoratedKey(1326283851463420237, 
> 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574426c6f6f6d46696c746572537061636555736564)
>  >= current key DecoratedKey(954210699457429663, 
> 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574546f74616c4469736b5370616365557365640b0f)
>  writing into 
> /var/lib/cassandra/data/OpsCenter/rollups60/OpsCenter-rollups60-tmp-jb-58656-Data.db
>   at 
> org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:141)
>   at 
> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:164)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
>   at 
> org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:296)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:724)
> Moving back to STC worked to keep the compactions running.
> I would especially like to move my own table to LCS.
> After a major compaction with STC, the move to LCS fails with the same 
> exception.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6285) LCS compaction failing with Exception

2014-01-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-6285:
--

  Component/s: Core
Fix Version/s: 2.0.5

> LCS compaction failing with Exception
> -
>
> Key: CASSANDRA-6285
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6285
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 4 nodes, shortly updated from 1.2.11 to 2.0.2
>Reporter: David Sauer
>Assignee: Tyler Hobbs
> Fix For: 2.0.5
>
>
> After altering everything to LCS, the table OpsCenter.rollups60 and one other
> non-OpsCenter table got stuck with everything hanging around in L0.
> The compaction started and ran until the logs showed this:
> ERROR [CompactionExecutor:111] 2013-11-01 19:14:53,865 CassandraDaemon.java 
> (line 187) Exception in thread Thread[CompactionExecutor:111,1,RMI Runtime]
> java.lang.RuntimeException: Last written key 
> DecoratedKey(1326283851463420237, 
> 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574426c6f6f6d46696c746572537061636555736564)
>  >= current key DecoratedKey(954210699457429663, 
> 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574546f74616c4469736b5370616365557365640b0f)
>  writing into 
> /var/lib/cassandra/data/OpsCenter/rollups60/OpsCenter-rollups60-tmp-jb-58656-Data.db
>   at 
> org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:141)
>   at 
> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:164)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
>   at 
> org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:296)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:724)
> Moving back to STCS kept the compactions running.
> I would especially like to move my own table to LCS.
> After a major compaction with STCS, the move to LCS fails with the same
> exception.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6285) LCS compaction failing with Exception

2014-01-27 Thread Russ Hatch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883267#comment-13883267
 ] 

Russ Hatch commented on CASSANDRA-6285:
---

oh sorry, forgot that detail. I reproduced from the cassandra-2.0.2 tag.

> LCS compaction failing with Exception
> -
>
> Key: CASSANDRA-6285
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6285
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 4 nodes, shortly updated from 1.2.11 to 2.0.2
>Reporter: David Sauer
>Assignee: Tyler Hobbs
> Fix For: 2.0.5
>
>
> After altering everything to LCS, the table OpsCenter.rollups60 and one other
> non-OpsCenter table got stuck with everything hanging around in L0.
> The compaction started and ran until the logs showed this:
> ERROR [CompactionExecutor:111] 2013-11-01 19:14:53,865 CassandraDaemon.java 
> (line 187) Exception in thread Thread[CompactionExecutor:111,1,RMI Runtime]
> java.lang.RuntimeException: Last written key 
> DecoratedKey(1326283851463420237, 
> 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574426c6f6f6d46696c746572537061636555736564)
>  >= current key DecoratedKey(954210699457429663, 
> 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574546f74616c4469736b5370616365557365640b0f)
>  writing into 
> /var/lib/cassandra/data/OpsCenter/rollups60/OpsCenter-rollups60-tmp-jb-58656-Data.db
>   at 
> org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:141)
>   at 
> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:164)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
>   at 
> org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:296)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:724)
> Moving back to STCS kept the compactions running.
> I would especially like to move my own table to LCS.
> After a major compaction with STCS, the move to LCS fails with the same
> exception.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6285) LCS compaction failing with Exception

2014-01-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883274#comment-13883274
 ] 

Jonathan Ellis commented on CASSANDRA-6285:
---

Can you try 2.0 HEAD as well just to be sure?

> LCS compaction failing with Exception
> -
>
> Key: CASSANDRA-6285
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6285
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 4 nodes, shortly updated from 1.2.11 to 2.0.2
>Reporter: David Sauer
>Assignee: Tyler Hobbs
> Fix For: 2.0.5
>
>
> After altering everything to LCS, the table OpsCenter.rollups60 and one other
> non-OpsCenter table got stuck with everything hanging around in L0.
> The compaction started and ran until the logs showed this:
> ERROR [CompactionExecutor:111] 2013-11-01 19:14:53,865 CassandraDaemon.java 
> (line 187) Exception in thread Thread[CompactionExecutor:111,1,RMI Runtime]
> java.lang.RuntimeException: Last written key 
> DecoratedKey(1326283851463420237, 
> 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574426c6f6f6d46696c746572537061636555736564)
>  >= current key DecoratedKey(954210699457429663, 
> 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574546f74616c4469736b5370616365557365640b0f)
>  writing into 
> /var/lib/cassandra/data/OpsCenter/rollups60/OpsCenter-rollups60-tmp-jb-58656-Data.db
>   at 
> org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:141)
>   at 
> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:164)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
>   at 
> org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:296)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:724)
> Moving back to STCS kept the compactions running.
> I would especially like to move my own table to LCS.
> After a major compaction with STCS, the move to LCS fails with the same
> exception.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6616) Nodetool repair is taking a long time.

2014-01-27 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883282#comment-13883282
 ] 

Brandon Williams commented on CASSANDRA-6616:
-

cfstats will tell you

> Nodetool repair is taking a long time. 
> ---
>
> Key: CASSANDRA-6616
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6616
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Cassandra Version 2.0.4 running on Redhat Version 6
>Reporter: Dharsan Logendran
>
> We have a two-node cluster with a replication factor of 2.  The db has
> more than 2500 column families (tables).  A 'nodetool repair -pr -par' on
> an almost empty database (one or two tables have a little data) takes about
> 30 hours to complete.
>  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6616) Nodetool repair is taking a long time.

2014-01-27 Thread Dharsan Logendran (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883310#comment-13883310
 ] 

Dharsan Logendran commented on CASSANDRA-6616:
--

Is the cfstats output accurate?

Thanks
Dharsan




> Nodetool repair is taking a long time. 
> ---
>
> Key: CASSANDRA-6616
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6616
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Cassandra Version 2.0.4 running on Redhat Version 6
>Reporter: Dharsan Logendran
>
> We have a two-node cluster with a replication factor of 2.  The db has
> more than 2500 column families (tables).  A 'nodetool repair -pr -par' on
> an almost empty database (one or two tables have a little data) takes about
> 30 hours to complete.
>  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6285) LCS compaction failing with Exception

2014-01-27 Thread Russ Hatch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883313#comment-13883313
 ] 

Russ Hatch commented on CASSANDRA-6285:
---

OK, appears we have the same issue on 2.0 HEAD as well (8bbb6e...) -- exception 
appears on startup using the procedure I included earlier.

> LCS compaction failing with Exception
> -
>
> Key: CASSANDRA-6285
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6285
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 4 nodes, shortly updated from 1.2.11 to 2.0.2
>Reporter: David Sauer
>Assignee: Tyler Hobbs
> Fix For: 2.0.5
>
>
> After altering everything to LCS, the table OpsCenter.rollups60 and one other
> non-OpsCenter table got stuck with everything hanging around in L0.
> The compaction started and ran until the logs showed this:
> ERROR [CompactionExecutor:111] 2013-11-01 19:14:53,865 CassandraDaemon.java 
> (line 187) Exception in thread Thread[CompactionExecutor:111,1,RMI Runtime]
> java.lang.RuntimeException: Last written key 
> DecoratedKey(1326283851463420237, 
> 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574426c6f6f6d46696c746572537061636555736564)
>  >= current key DecoratedKey(954210699457429663, 
> 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574546f74616c4469736b5370616365557365640b0f)
>  writing into 
> /var/lib/cassandra/data/OpsCenter/rollups60/OpsCenter-rollups60-tmp-jb-58656-Data.db
>   at 
> org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:141)
>   at 
> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:164)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
>   at 
> org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:296)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:724)
> Moving back to STCS kept the compactions running.
> I would especially like to move my own table to LCS.
> After a major compaction with STCS, the move to LCS fails with the same
> exception.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6285) LCS compaction failing with Exception

2014-01-27 Thread Russ Hatch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883319#comment-13883319
 ] 

Russ Hatch commented on CASSANDRA-6285:
---

I'm going to attempt to condense this down to a simple dtest as well.

> LCS compaction failing with Exception
> -
>
> Key: CASSANDRA-6285
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6285
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 4 nodes, shortly updated from 1.2.11 to 2.0.2
>Reporter: David Sauer
>Assignee: Tyler Hobbs
> Fix For: 2.0.5
>
>
> After altering everything to LCS, the table OpsCenter.rollups60 and one other
> non-OpsCenter table got stuck with everything hanging around in L0.
> The compaction started and ran until the logs showed this:
> ERROR [CompactionExecutor:111] 2013-11-01 19:14:53,865 CassandraDaemon.java 
> (line 187) Exception in thread Thread[CompactionExecutor:111,1,RMI Runtime]
> java.lang.RuntimeException: Last written key 
> DecoratedKey(1326283851463420237, 
> 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574426c6f6f6d46696c746572537061636555736564)
>  >= current key DecoratedKey(954210699457429663, 
> 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574546f74616c4469736b5370616365557365640b0f)
>  writing into 
> /var/lib/cassandra/data/OpsCenter/rollups60/OpsCenter-rollups60-tmp-jb-58656-Data.db
>   at 
> org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:141)
>   at 
> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:164)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
>   at 
> org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:296)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:724)
> Moving back to STCS kept the compactions running.
> I would especially like to move my own table to LCS.
> After a major compaction with STCS, the move to LCS fails with the same
> exception.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6616) Nodetool repair is taking a long time.

2014-01-27 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883361#comment-13883361
 ] 

Brandon Williams commented on CASSANDRA-6616:
-

Accurate within index_interval.  For zero it's perfectly accurate.
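
For example, a quick way to check the estimate yourself (a sketch; the exact
field label below is from 2.0-era cfstats output and may differ by build):

{noformat}
nodetool cfstats
# per column family, look for the estimate, accurate to within index_interval:
#   Number of Keys (estimate): ...
{noformat}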

> Nodetool repair is taking a long time. 
> ---
>
> Key: CASSANDRA-6616
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6616
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Cassandra Version 2.0.4 running on Redhat Version 6
>Reporter: Dharsan Logendran
>
> We have a two-node cluster with a replication factor of 2.  The db has
> more than 2500 column families (tables).  A 'nodetool repair -pr -par' on
> an almost empty database (one or two tables have a little data) takes about
> 30 hours to complete.
>  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6285) LCS compaction failing with Exception

2014-01-27 Thread Brandon Kearby (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883375#comment-13883375
 ] 

Brandon Kearby commented on CASSANDRA-6285:
---

Hi [~rhatch], 

Here's the full schema I'm using to test with:

create keyspace SocialData
  with placement_strategy = 'NetworkTopologyStrategy'
  and strategy_options = {DC-Analytics : 3}
  and durable_writes = true;

use SocialData;

create column family signal
  with column_type = 'Standard'
  and comparator = 'UTF8Type'
  and default_validation_class = 'BytesType'
  and key_validation_class = 'UTF8Type'
  and read_repair_chance = 0.1
  and dclocal_read_repair_chance = 0.0
  and gc_grace = 432000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and compaction_strategy = 
'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'
  and caching = 'NONE'
  and compaction_strategy_options = {'sstable_size_in_mb' : '160'}
  and comment = 'A store of information about each individual signal.'
  and column_metadata = [
{column_name : 'type',
validation_class : UTF8Type},
{column_name : 'department_id',
validation_class : LongType},
{column_name : 'ecosystem_account_id',
validation_class : UTF8Type},
{column_name : 'content_type',
validation_class : UTF8Type},
{column_name : 'rating_count',
validation_class : LongType},
{column_name : 'service_account_id',
validation_class : UTF8Type},
{column_name : 'time',
validation_class : LongType},
{column_name : 'organization_id',
validation_class : LongType},
{column_name : 'conversation_id',
validation_class : UTF8Type},
{column_name : 'favorites_count',
validation_class : LongType},
{column_name : 'dislike_count',
validation_class : LongType},
{column_name : 'url',
validation_class : UTF8Type},
{column_name : 'impressions',
validation_class : LongType},
{column_name : 'network_strength',
validation_class : LongType},
{column_name : 'parent_signal_id',
validation_class : UTF8Type},
{column_name : 'account_snapshot_id',
validation_class : UTF8Type},
{column_name : 'region_id',
validation_class : LongType},
{column_name : 'time_bucket',
validation_class : LongType},
{column_name : 'enriched_on',
validation_class : LongType},
{column_name : 'dachis_account_id',
validation_class : UTF8Type},
{column_name : 'text',
validation_class : UTF8Type},
{column_name : 'sentiment',
validation_class : LongType},
{column_name : 'like_count',
validation_class : LongType},
{column_name : 'industry_id',
validation_class : LongType},
{column_name : 'service',
validation_class : UTF8Type},
{column_name : 'cloned_from',
validation_class : UTF8Type},
{column_name : 'constituent_type',
validation_class : UTF8Type},
{column_name : 'listings_count',
validation_class : LongType},
{column_name : 'network_size',
validation_class : LongType},
{column_name : 'analyzed',
validation_class : Int32Type},
{column_name : 'username',
validation_class : UTF8Type},
{column_name : 'service_signal_id',
validation_class : UTF8Type},
{column_name : 'language',
validation_class : UTF8Type},
{column_name : 'brand_id',
validation_class : LongType},
{column_name : 'rating',
validation_class : LongType},
{column_name : 'relationship_id',
validation_class : UTF8Type}]
  and compression_options = {'sstable_compression' : 
'org.apache.cassandra.io.compress.LZ4Compressor'};
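
For reference, the strategy switch that triggers the reported failure can be
sketched in the same cassandra-cli syntax as the schema above (illustrative
only; the failing sequence in the report is roughly: run under STCS, force a
major compaction with 'nodetool compact SocialData signal', then apply this
update):

{noformat}
use SocialData;
update column family signal
  with compaction_strategy = 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'
  and compaction_strategy_options = {'sstable_size_in_mb' : '160'};
{noformat}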



> LCS compaction failing with Exception
> -
>
> Key: CASSANDRA-6285
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6285
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 4 nodes, shortly updated from 1.2.11 to 2.0.2
>Reporter: David Sauer
>Assignee: Tyler Hobbs
> Fix For: 2.0.5
>
>
> After altering everything to LCS, the table OpsCenter.rollups60 and one other
> non-OpsCenter table got stuck with everything hanging around in L0.
> The compaction started and ran until the logs showed this:
> ERROR [CompactionExecutor:111] 2013-11-01 19:14:53,865 CassandraDaemon.java 
> (line 187) Exception in thread Thread[CompactionExecutor:111,1,RMI Runtime]
> java.lang.RuntimeException: Last written key 
> DecoratedKey(1326283851463420237, 
> 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574426c6f6f6d46696c746572537061636555736564)
>  >= current key DecoratedKey(954210699457429663, 
> 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736

[jira] [Commented] (CASSANDRA-6619) Race condition issue during upgrading 1.1 to 1.2

2014-01-27 Thread Minh Do (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883419#comment-13883419
 ] 

Minh Do commented on CASSANDRA-6619:


As posted in other tickets, 1.1 and 1.2 have different message protocols.
Hence, it is important to set the right target version when making outbound
connections rather than depending on the inbound connections to set a version
value; this removes the race condition in setting the version values.

The attached patch makes sure the code does this when an outbound connection
is opened and the exchange of versioning information in the handshake fails.

As discussed with Jason Brown here at Netflix, we came up with a solution:
during the upgrade, the upgraded nodes set the environment variable
cassandra.prev_version = 5 (for 1.1.7, or 4 for 1.1) to help out the
handshakes in a mixed-version cluster.

Once a cluster is fully upgraded to 1.2, cassandra.prev_version is removed
from all nodes' environments and a rolling C* restart across the nodes is
required.  This step ensures that the new patch won't penalize a 1.2 cluster
where all outbound connections are from 1.2 to 1.2.
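
A minimal sketch of the mixed-version setup described above, assuming the
attached patch reads cassandra.prev_version as a JVM system property (the
property name and values come from the proposal; the exact mechanism is
defined by the patch):

{noformat}
# conf/cassandra-env.sh on each upgraded 1.2 node, while 1.1 nodes remain:
JVM_OPTS="$JVM_OPTS -Dcassandra.prev_version=5"   # 5 for 1.1.7+, 4 for 1.1
# once the whole cluster is on 1.2: remove the flag and roll-restart the nodes
{noformat}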

 



> Race condition issue during upgrading 1.1 to 1.2
> 
>
> Key: CASSANDRA-6619
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6619
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Minh Do
>Assignee: Minh Do
>Priority: Minor
> Fix For: 1.2.14
>
>
> There is a race condition when upgrading a C* 1.1.x cluster to C* 1.2.
> One issue is that an OutboundTCPConnection can't be established from a 1.2
> node to some 1.1.x nodes.  Because of this, a live cluster will suffer high
> read latency during the upgrade and be unable to fulfill some write requests.
> This is not a problem for a small cluster, but it is for a large cluster
> (100+ nodes), where the upgrade takes 10+ hours to a day or more to complete.
> We acknowledge CASSANDRA-5692; however, it does not fully fix this.  We
> already have a patch and will attach it shortly for feedback.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6619) Race condition issue during upgrading 1.1 to 1.2

2014-01-27 Thread Minh Do (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Minh Do updated CASSANDRA-6619:
---

Attachment: (was: diff)

> Race condition issue during upgrading 1.1 to 1.2
> 
>
> Key: CASSANDRA-6619
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6619
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Minh Do
>Assignee: Minh Do
>Priority: Minor
> Fix For: 1.2.14
>
>
> There is a race condition when upgrading a C* 1.1.x cluster to C* 1.2.
> One issue is that an OutboundTCPConnection can't be established from a 1.2
> node to some 1.1.x nodes.  Because of this, a live cluster will suffer high
> read latency during the upgrade and be unable to fulfill some write requests.
> This is not a problem for a small cluster, but it is for a large cluster
> (100+ nodes), where the upgrade takes 10+ hours to a day or more to complete.
> We acknowledge CASSANDRA-5692; however, it does not fully fix this.  We
> already have a patch and will attach it shortly for feedback.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6619) Race condition issue during upgrading 1.1 to 1.2

2014-01-27 Thread Minh Do (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Minh Do updated CASSANDRA-6619:
---

Attachment: diff

> Race condition issue during upgrading 1.1 to 1.2
> 
>
> Key: CASSANDRA-6619
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6619
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Minh Do
>Assignee: Minh Do
>Priority: Minor
> Fix For: 1.2.14
>
>
> There is a race condition when upgrading a C* 1.1.x cluster to C* 1.2.
> One issue is that an OutboundTCPConnection can't be established from a 1.2
> node to some 1.1.x nodes.  Because of this, a live cluster will suffer high
> read latency during the upgrade and be unable to fulfill some write requests.
> This is not a problem for a small cluster, but it is for a large cluster
> (100+ nodes), where the upgrade takes 10+ hours to a day or more to complete.
> We acknowledge CASSANDRA-5692; however, it does not fully fix this.  We
> already have a patch and will attach it shortly for feedback.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6619) Race condition issue during upgrading 1.1 to 1.2

2014-01-27 Thread Minh Do (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Minh Do updated CASSANDRA-6619:
---

Attachment: patch.txt

> Race condition issue during upgrading 1.1 to 1.2
> 
>
> Key: CASSANDRA-6619
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6619
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Minh Do
>Assignee: Minh Do
>Priority: Minor
> Fix For: 1.2.14
>
> Attachments: patch.txt
>
>
> There is a race condition when upgrading a C* 1.1.x cluster to C* 1.2.
> One issue is that an OutboundTCPConnection can't be established from a 1.2
> node to some 1.1.x nodes.  Because of this, a live cluster will suffer high
> read latency during the upgrade and be unable to fulfill some write requests.
> This is not a problem for a small cluster, but it is for a large cluster
> (100+ nodes), where the upgrade takes 10+ hours to a day or more to complete.
> We acknowledge CASSANDRA-5692; however, it does not fully fix this.  We
> already have a patch and will attach it shortly for feedback.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6619) Race condition issue during upgrading 1.1 to 1.2

2014-01-27 Thread Minh Do (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Minh Do updated CASSANDRA-6619:
---

Reviewer: Jason Brown

> Race condition issue during upgrading 1.1 to 1.2
> 
>
> Key: CASSANDRA-6619
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6619
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Minh Do
>Assignee: Minh Do
>Priority: Minor
> Fix For: 1.2.14
>
> Attachments: patch.txt
>
>
> There is a race condition when upgrading a C* 1.1.x cluster to C* 1.2.
> One issue is that an OutboundTCPConnection can't be established from a 1.2
> node to some 1.1.x nodes.  Because of this, a live cluster will suffer high
> read latency during the upgrade and be unable to fulfill some write requests.
> This is not a problem for a small cluster, but it is for a large cluster
> (100+ nodes), where the upgrade takes 10+ hours to a day or more to complete.
> We acknowledge CASSANDRA-5692; however, it does not fully fix this.  We
> already have a patch and will attach it shortly for feedback.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


git commit: clarify yaml comment about commitlog_periodic_queue_size

2014-01-27 Thread brandonwilliams
Updated Branches:
  refs/heads/trunk 680f2bda4 -> 1218bcacb


clarify yaml comment about commitlog_periodic_queue_size


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1218bcac
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1218bcac
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1218bcac

Branch: refs/heads/trunk
Commit: 1218bcacba7edefaf56cf8440d0aea5794c89a1e
Parents: 680f2bd
Author: Brandon Williams 
Authored: Mon Jan 27 16:22:30 2014 -0600
Committer: Brandon Williams 
Committed: Mon Jan 27 16:22:30 2014 -0600

--
 conf/cassandra.yaml | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/1218bcac/conf/cassandra.yaml
--
diff --git a/conf/cassandra.yaml b/conf/cassandra.yaml
index 885d28d..bdbb9ff 100644
--- a/conf/cassandra.yaml
+++ b/conf/cassandra.yaml
@@ -198,9 +198,9 @@ saved_caches_directory: /var/lib/cassandra/saved_caches
 #
 # the other option is "periodic" where writes may be acked immediately
 # and the CommitLog is simply synced every commitlog_sync_period_in_ms
-# milliseconds.  By default this allows 1024*(CPU cores) pending
-# entries on the commitlog queue.  If you are writing very large blobs,
-# you should reduce that; 16*cores works reasonably well for 1MB blobs.
+# milliseconds.  commitlog_periodic_queue_size allows 1024*(CPU cores) pending
+# entries on the commitlog queue by default.  If you are writing very large
+# blobs, you should reduce that; 16*cores works reasonably well for 1MB blobs.
 # It should be at least as large as the concurrent_writes setting.
 commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
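
To make the clarified guidance concrete, a hedged cassandra.yaml sketch for a
periodic-sync node writing ~1MB blobs on 8 cores (the value 128 is just the
16*cores arithmetic from the comment above, not a recommendation made by this
commit):

commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
# large blobs: shrink the queue from the default 1024*(CPU cores) to 16*cores
commitlog_periodic_queue_size: 128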



[1/3] New counters implementation

2014-01-27 Thread aleksey
Updated Branches:
  refs/heads/trunk 1218bcacb -> 714c42336


http://git-wip-us.apache.org/repos/asf/cassandra/blob/714c4233/test/unit/org/apache/cassandra/db/context/CounterContextTest.java
--
diff --git a/test/unit/org/apache/cassandra/db/context/CounterContextTest.java 
b/test/unit/org/apache/cassandra/db/context/CounterContextTest.java
index 5c88fd6..ea5dd3e 100644
--- a/test/unit/org/apache/cassandra/db/context/CounterContextTest.java
+++ b/test/unit/org/apache/cassandra/db/context/CounterContextTest.java
@@ -26,7 +26,8 @@ import java.nio.ByteBuffer;
 
 import org.junit.Test;
 
-import org.apache.cassandra.db.context.IContext.ContextRelationship;
+import org.apache.cassandra.db.ClockAndCount;
+import org.apache.cassandra.db.context.CounterContext.Relationship;
 import org.apache.cassandra.Util;
 import org.apache.cassandra.utils.*;
 
@@ -92,7 +93,7 @@ public class CounterContextTest
 left.writeRemote(CounterId.fromInt(9), 1L, 0L);
 right = ContextState.wrap(ByteBufferUtil.clone(left.context));
 
-assertEquals(ContextRelationship.EQUAL, cc.diff(left.context, 
right.context));
+assertEquals(Relationship.EQUAL, cc.diff(left.context, right.context));
 
 // greater than: left has superset of nodes (counts equal)
 left = ContextState.allocate(0, 0, 4, allocator);
@@ -106,7 +107,7 @@ public class CounterContextTest
 right.writeRemote(CounterId.fromInt(6), 2L, 0L);
 right.writeRemote(CounterId.fromInt(9), 1L, 0L);
 
-assertEquals(ContextRelationship.GREATER_THAN, cc.diff(left.context, 
right.context));
+assertEquals(Relationship.GREATER_THAN, cc.diff(left.context, 
right.context));
 
 // less than: left has subset of nodes (counts equal)
 left = ContextState.allocate(0, 0, 3, allocator);
@@ -120,7 +121,7 @@ public class CounterContextTest
 right.writeRemote(CounterId.fromInt(9),  1L, 0L);
 right.writeRemote(CounterId.fromInt(12), 0L, 0L);
 
-assertEquals(ContextRelationship.LESS_THAN, cc.diff(left.context, 
right.context));
+assertEquals(Relationship.LESS_THAN, cc.diff(left.context, 
right.context));
 
 // greater than: equal nodes, but left has higher counts
 left = ContextState.allocate(0, 0, 3, allocator);
@@ -133,7 +134,7 @@ public class CounterContextTest
 right.writeRemote(CounterId.fromInt(6), 2L, 0L);
 right.writeRemote(CounterId.fromInt(9), 1L, 0L);
 
-assertEquals(ContextRelationship.GREATER_THAN, cc.diff(left.context, 
right.context));
+assertEquals(Relationship.GREATER_THAN, cc.diff(left.context, 
right.context));
 
 // less than: equal nodes, but right has higher counts
 left = ContextState.allocate(0, 0, 3, allocator);
@@ -146,7 +147,7 @@ public class CounterContextTest
 right.writeRemote(CounterId.fromInt(6), 9L, 0L);
 right.writeRemote(CounterId.fromInt(9), 3L, 0L);
 
-assertEquals(ContextRelationship.LESS_THAN, cc.diff(left.context, 
right.context));
+assertEquals(Relationship.LESS_THAN, cc.diff(left.context, 
right.context));
 
 // disjoint: right and left have disjoint node sets
 left = ContextState.allocate(0, 0, 3, allocator);
@@ -159,7 +160,7 @@ public class CounterContextTest
 right.writeRemote(CounterId.fromInt(6), 1L, 0L);
 right.writeRemote(CounterId.fromInt(9), 1L, 0L);
 
-assertEquals(ContextRelationship.DISJOINT, cc.diff(left.context, 
right.context));
+assertEquals(Relationship.DISJOINT, cc.diff(left.context, 
right.context));
 
 left = ContextState.allocate(0, 0, 3, allocator);
 left.writeRemote(CounterId.fromInt(3), 1L, 0L);
@@ -171,7 +172,7 @@ public class CounterContextTest
 right.writeRemote(CounterId.fromInt(6),  1L, 0L);
 right.writeRemote(CounterId.fromInt(12), 1L, 0L);
 
-assertEquals(ContextRelationship.DISJOINT, cc.diff(left.context, 
right.context));
+assertEquals(Relationship.DISJOINT, cc.diff(left.context, 
right.context));
 
 // disjoint: equal nodes, but right and left have higher counts in 
differing nodes
 left = ContextState.allocate(0, 0, 3, allocator);
@@ -184,7 +185,7 @@ public class CounterContextTest
 right.writeRemote(CounterId.fromInt(6), 1L, 0L);
 right.writeRemote(CounterId.fromInt(9), 5L, 0L);
 
-assertEquals(ContextRelationship.DISJOINT, cc.diff(left.context, 
right.context));
+assertEquals(Relationship.DISJOINT, cc.diff(left.context, 
right.context));
 
 left = ContextState.allocate(0, 0, 3, allocator);
 left.writeRemote(CounterId.fromInt(3), 2L, 0L);
@@ -196,7 +197,7 @@ public class CounterContextTest
 right.writeRemote(CounterId.fromInt(6), 9L, 0L);
 right.writeRemote(CounterId.fromInt(9), 5L, 0L);
 
-assertEquals(ContextRelationship.DISJOINT, cc.diff(left.context, 
right.context)

[jira] [Created] (CASSANDRA-6626) Create 2.0->2.1 counter upgrade dtests

2014-01-27 Thread Aleksey Yeschenko (JIRA)
Aleksey Yeschenko created CASSANDRA-6626:


 Summary: Create 2.0->2.1 counter upgrade dtests
 Key: CASSANDRA-6626
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6626
 Project: Cassandra
  Issue Type: Test
Reporter: Aleksey Yeschenko
 Fix For: 2.1


Create 2.0->2.1 counter upgrade dtests. Something more extensive, yet more 
specific than 
https://github.com/riptano/cassandra-dtest/blob/master/upgrade_through_versions_test.py



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6626) Create 2.0->2.1 counter upgrade dtests

2014-01-27 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-6626:
-

Assignee: Ryan McGuire

> Create 2.0->2.1 counter upgrade dtests
> --
>
> Key: CASSANDRA-6626
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6626
> Project: Cassandra
>  Issue Type: Test
>Reporter: Aleksey Yeschenko
>Assignee: Ryan McGuire
> Fix For: 2.1
>
>
> Create 2.0->2.1 counter upgrade dtests. Something more extensive, yet more 
> specific than 
> https://github.com/riptano/cassandra-dtest/blob/master/upgrade_through_versions_test.py



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5872) Bundle JNA

2014-01-27 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883477#comment-13883477
 ] 

Michael Shuler commented on CASSANDRA-5872:
---

I also see downloads for all the other jars that reside under lib/ - this is 
the output of 'ant mvn-install ; mvn dependency:tree -f $POMFILE' on trunk 
without the patch: http://aep.appspot.com/display/rR2lAhZ2TpOecoobMTv566zSXvU/

You can search for any of the lib/*.jar files, e.g. jbcrypt, and see that they
are also downloaded, so I suppose this is expected(?).

dependency:tree output after patching (and wget'ing lib/jna-4.0.0.jar):
{noformat}
(trunk *)mshuler@hana:~/git/cassandra$ mvn dependency:tree -f 
/home/mshuler/.m2/repository/org/apache/cassandra/cassandra-all/2.1-SNAPSHOT/cassandra-all-2.1-SNAPSHOT.pom
[INFO] Scanning for projects...
[INFO] 
[INFO] 
[INFO] Building Apache Cassandra 2.1-SNAPSHOT
[INFO] 
Downloading: 
http://repo.maven.apache.org/maven2/net/java/dev/jna/jna/4.0.0/jna-4.0.0.pom
Downloaded: 
http://repo.maven.apache.org/maven2/net/java/dev/jna/jna/4.0.0/jna-4.0.0.pom (2 
KB at 2.9 KB/sec)
Downloading: 
http://repo.maven.apache.org/maven2/net/java/dev/jna/jna/4.0.0/jna-4.0.0.jar
Downloaded: 
http://repo.maven.apache.org/maven2/net/java/dev/jna/jna/4.0.0/jna-4.0.0.jar 
(894 KB at 546.6 KB/sec)
[INFO] 
[INFO] --- maven-dependency-plugin:2.1:tree (default-cli) @ cassandra-all ---
[INFO] org.apache.cassandra:cassandra-all:jar:2.1-SNAPSHOT
[INFO] +- org.xerial.snappy:snappy-java:jar:1.0.5:compile
[INFO] +- net.jpountz.lz4:lz4:jar:1.2.0:compile
[INFO] +- com.ning:compress-lzf:jar:0.8.4:compile
[INFO] +- com.google.guava:guava:jar:15.0:compile
[INFO] +- commons-cli:commons-cli:jar:1.1:compile
[INFO] +- commons-codec:commons-codec:jar:1.2:compile
[INFO] +- org.apache.commons:commons-lang3:jar:3.1:compile
[INFO] +- org.apache.commons:commons-math3:jar:3.2:compile
[INFO] +- 
com.googlecode.concurrentlinkedhashmap:concurrentlinkedhashmap-lru:jar:1.3:compile
[INFO] +- org.antlr:antlr:jar:3.2:compile
[INFO] |  \- org.antlr:antlr-runtime:jar:3.2:compile
[INFO] |     \- org.antlr:stringtemplate:jar:3.2:compile
[INFO] |        \- antlr:antlr:jar:2.7.7:compile
[INFO] +- org.slf4j:slf4j-api:jar:1.7.2:compile
[INFO] +- org.codehaus.jackson:jackson-core-asl:jar:1.9.2:compile
[INFO] +- org.codehaus.jackson:jackson-mapper-asl:jar:1.9.2:compile
[INFO] +- jline:jline:jar:1.0:compile
[INFO] +- net.java.dev.jna:jna:jar:4.0.0:compile
[INFO] +- com.googlecode.json-simple:json-simple:jar:1.1:compile
[INFO] +- com.github.stephenc.high-scale-lib:high-scale-lib:jar:1.1.2:compile
[INFO] +- org.yaml:snakeyaml:jar:1.11:compile
[INFO] +- edu.stanford.ppl:snaptree:jar:0.1:compile
[INFO] +- org.mindrot:jbcrypt:jar:0.3m:compile
[INFO] +- com.yammer.metrics:metrics-core:jar:2.2.0:compile
[INFO] +- com.addthis.metrics:reporter-config:jar:2.1.0:compile
[INFO] |  \- org.hibernate:hibernate-validator:jar:4.3.0.Final:compile
[INFO] | +- javax.validation:validation-api:jar:1.0.0.GA:compile
[INFO] | \- org.jboss.logging:jboss-logging:jar:3.1.0.CR2:compile
[INFO] +- com.thinkaurelius.thrift:thrift-server:jar:0.3.3:compile
[INFO] |  +- com.lmax:disruptor:jar:3.0.1:compile
[INFO] |  \- junit:junit:jar:4.6:compile (version managed from 4.8.1)
[INFO] +- com.clearspring.analytics:stream:jar:2.5.1:compile
[INFO] |  \- it.unimi.dsi:fastutil:jar:6.5.7:compile
[INFO] +- ch.qos.logback:logback-core:jar:1.0.13:compile
[INFO] +- ch.qos.logback:logback-classic:jar:1.0.13:compile
[INFO] +- org.apache.thrift:libthrift:jar:0.9.1:compile
[INFO] |  +- org.apache.httpcomponents:httpclient:jar:4.2.5:compile
[INFO] |  |  \- commons-logging:commons-logging:jar:1.1.1:compile
[INFO] |  \- org.apache.httpcomponents:httpcore:jar:4.2.4:compile
[INFO] +- org.apache.cassandra:cassandra-thrift:jar:2.1-SNAPSHOT:compile
[INFO] +- org.apache.hadoop:hadoop-core:jar:1.0.3:compile
[INFO] |  +- xmlenc:xmlenc:jar:0.52:compile
[INFO] |  +- commons-httpclient:commons-httpclient:jar:3.0.1:compile
[INFO] |  +- org.apache.commons:commons-math:jar:2.1:compile
[INFO] |  +- commons-configuration:commons-configuration:jar:1.6:compile
[INFO] |  |  +- commons-collections:commons-collections:jar:3.2.1:compile
[INFO] |  |  +- commons-lang:commons-lang:jar:2.4:compile
[INFO] |  |  +- commons-digester:commons-digester:jar:1.8:compile
[INFO] |  |  |  \- commons-beanutils:commons-beanutils:jar:1.7.0:compile
[INFO] |  |  \- commons-beanutils:commons-beanutils-core:jar:1.8.0:compile
[INFO] |  +- commons-net:commons-net:jar:1.4.1:compile
[INFO] |  +- org.mortbay.jetty:jetty:jar:6.1.26:compile
[INFO] |  +- org.mortbay.jetty:jetty-util:jar:6.1.26:compile
[INFO] |  +- tomcat:jasper-runtime:jar:5.5.12:compile
[IN

[jira] [Updated] (CASSANDRA-6553) Benchmark counter improvements (counters++)

2014-01-27 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-6553:
-

Description: 
Benchmark the difference in performance between CASSANDRA-6504 and trunk.

* Updating totally unrelated counters (different partitions)
* Updating the same counters a lot (same cells in the same partition)
* Different cells in the same few partitions (hot counter partition)

benchmark: 
https://github.com/apache/cassandra/tree/1218bcacba7edefaf56cf8440d0aea5794c89a1e
 (old counters)
compared to: https://github.com/iamaleksey/cassandra/commits/trunk

So far, the above changes should only affect the write path.

  was:
Benchmark the difference in performance between CASSANDRA-6504 and trunk.

* Updating totally unrelated counters (different partitions)
* Updating the same counters a lot (same cells in the same partition)
* Different cells in the same few partitions (hot counter partition)

benchmark: https://github.com/iamaleksey/cassandra/commits/6504
compared to: https://github.com/iamaleksey/cassandra/commits/trunk

So far, the above changes should only affect the write path.


> Benchmark counter improvements (counters++)
> ---
>
> Key: CASSANDRA-6553
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6553
> Project: Cassandra
>  Issue Type: Test
>Reporter: Ryan McGuire
>Assignee: Ryan McGuire
> Fix For: 2.1
>
>
> Benchmark the difference in performance between CASSANDRA-6504 and trunk.
> * Updating totally unrelated counters (different partitions)
> * Updating the same counters a lot (same cells in the same partition)
> * Different cells in the same few partitions (hot counter partition)
> benchmark: 
> https://github.com/apache/cassandra/tree/1218bcacba7edefaf56cf8440d0aea5794c89a1e
>  (old counters)
> compared to: https://github.com/iamaleksey/cassandra/commits/trunk
> So far, the above changes should only affect the write path.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6553) Benchmark counter improvements (counters++)

2014-01-27 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-6553:
-

Description: 
Benchmark the difference in performance between CASSANDRA-6504 and trunk.

* Updating totally unrelated counters (different partitions)
* Updating the same counters a lot (same cells in the same partition)
* Different cells in the same few partitions (hot counter partition)

benchmark: 
https://github.com/apache/cassandra/tree/1218bcacba7edefaf56cf8440d0aea5794c89a1e
 (old counters)
compared to: 
https://github.com/apache/cassandra/tree/714c423360c36da2a2b365efaf9c5c4f623ed133
 (new counters)

So far, the above changes should only affect the write path.

  was:
Benchmark the difference in performance between CASSANDRA-6504 and trunk.

* Updating totally unrelated counters (different partitions)
* Updating the same counters a lot (same cells in the same partition)
* Different cells in the same few partitions (hot counter partition)

benchmark: 
https://github.com/apache/cassandra/tree/1218bcacba7edefaf56cf8440d0aea5794c89a1e
 (old counters)
compared to: https://github.com/iamaleksey/cassandra/commits/trunk

So far, the above changes should only affect the write path.


> Benchmark counter improvements (counters++)
> ---
>
> Key: CASSANDRA-6553
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6553
> Project: Cassandra
>  Issue Type: Test
>Reporter: Ryan McGuire
>Assignee: Ryan McGuire
> Fix For: 2.1
>
>
> Benchmark the difference in performance between CASSANDRA-6504 and trunk.
> * Updating totally unrelated counters (different partitions)
> * Updating the same counters a lot (same cells in the same partition)
> * Different cells in the same few partitions (hot counter partition)
> benchmark: 
> https://github.com/apache/cassandra/tree/1218bcacba7edefaf56cf8440d0aea5794c89a1e
>  (old counters)
> compared to: 
> https://github.com/apache/cassandra/tree/714c423360c36da2a2b365efaf9c5c4f623ed133
>  (new counters)
> So far, the above changes should only affect the write path.
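
As a rough sketch, the three scenarios above could be driven with the pre-2.1
java stress tool; the flags below are an assumption from memory of that tool's
help output ('-F' capping the number of distinct keys), so verify with
'cassandra-stress --help' on the build under test:

{noformat}
# unrelated counters (different partitions)
tools/bin/cassandra-stress -o COUNTER_ADD -n 1000000 -t 100
# hot counters: the same few partitions hammered repeatedly
tools/bin/cassandra-stress -o COUNTER_ADD -n 1000000 -t 100 -F 10
{noformat}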



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[3/3] git commit: Merge branch 'cassandra-2.0' into trunk

2014-01-27 Thread jbellis
Merge branch 'cassandra-2.0' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/82735e09
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/82735e09
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/82735e09

Branch: refs/heads/trunk
Commit: 82735e096a49a1871d88353f06620300df55ebf6
Parents: 714c423 20c2adc
Author: Jonathan Ellis 
Authored: Mon Jan 27 17:00:24 2014 -0600
Committer: Jonathan Ellis 
Committed: Mon Jan 27 17:00:24 2014 -0600

--
 CHANGES.txt |  1 +
 .../cassandra/dht/Murmur3Partitioner.java   |  5 +-
 .../org/apache/cassandra/utils/BloomFilter.java | 79 ++--
 .../org/apache/cassandra/utils/FBUtilities.java |  6 ++
 .../cassandra/utils/Murmur3BloomFilter.java |  4 +-
 .../org/apache/cassandra/utils/MurmurHash.java  |  6 +-
 .../cassandra/utils/obs/OffHeapBitSet.java  |  2 +-
 7 files changed, 72 insertions(+), 31 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/82735e09/CHANGES.txt
--
diff --cc CHANGES.txt
index cc406c8,68727dc..23bb4f1
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,34 -1,5 +1,35 @@@
 +2.1
 + * add listsnapshots command to nodetool (CASSANDRA-5742)
 + * Introduce AtomicBTreeColumns (CASSANDRA-6271)
 + * Multithreaded commitlog (CASSANDRA-3578)
 + * allocate fixed index summary memory pool and resample cold index summaries 
 +   to use less memory (CASSANDRA-5519)
 + * Removed multithreaded compaction (CASSANDRA-6142)
 + * Parallelize fetching rows for low-cardinality indexes (CASSANDRA-1337)
 + * change logging from log4j to logback (CASSANDRA-5883)
 + * switch to LZ4 compression for internode communication (CASSANDRA-5887)
 + * Stop using Thrift-generated Index* classes internally (CASSANDRA-5971)
 + * Remove 1.2 network compatibility code (CASSANDRA-5960)
 + * Remove leveled json manifest migration code (CASSANDRA-5996)
 + * Remove CFDefinition (CASSANDRA-6253)
 + * Use AtomicIntegerFieldUpdater in RefCountedMemory (CASSANDRA-6278)
 + * User-defined types for CQL3 (CASSANDRA-5590)
 + * Use of o.a.c.metrics in nodetool (CASSANDRA-5871, 6406)
 + * Batch read from OTC's queue and cleanup (CASSANDRA-1632)
 + * Secondary index support for collections (CASSANDRA-4511, 6383)
 + * SSTable metadata(Stats.db) format change (CASSANDRA-6356)
 + * Push composites support in the storage engine
 +   (CASSANDRA-5417, CASSANDRA-6520)
 + * Add snapshot space used to cfstats (CASSANDRA-6231)
 + * Add cardinality estimator for key count estimation (CASSANDRA-5906)
 + * CF id is changed to be non-deterministic. Data dir/key cache are created
 +   uniquely for CF id (CASSANDRA-5202)
 + * Cassandra won't start by default without jna (CASSANDRA-6575)
 + * New counters implementation (CASSANDRA-6504)
 +
 +
  2.0.5
+  * Reduce garbage generated by bloom filter lookups (CASSANDRA-6609)
   * Add ks.cf names to tombstone logging (CASSANDRA-6597)
   * Use LOCAL_QUORUM for LWT operations at LOCAL_SERIAL (CASSANDRA-6495)
   * Wait for gossip to settle before accepting client connections 
(CASSANDRA-4288)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/82735e09/src/java/org/apache/cassandra/utils/FBUtilities.java
--



[1/3] git commit: Reduce garbage generated by bloom filter lookups patch by Benedict Elliott Smith; reviewed by jbellis for CASSANDRA-6609

2014-01-27 Thread jbellis
Updated Branches:
  refs/heads/cassandra-2.0 8bbb6eda6 -> 20c2adc87
  refs/heads/trunk 714c42336 -> 82735e096


Reduce garbage generated by bloom filter lookups
patch by Benedict Elliott Smith; reviewed by jbellis for CASSANDRA-6609


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/20c2adc8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/20c2adc8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/20c2adc8

Branch: refs/heads/cassandra-2.0
Commit: 20c2adc87102963836a59a5e9626005fd9ee08bc
Parents: 8bbb6ed
Author: Jonathan Ellis 
Authored: Mon Jan 27 17:00:08 2014 -0600
Committer: Jonathan Ellis 
Committed: Mon Jan 27 17:00:08 2014 -0600

--
 CHANGES.txt |  1 +
 .../cassandra/dht/Murmur3Partitioner.java   |  5 +-
 .../org/apache/cassandra/utils/BloomFilter.java | 79 ++--
 .../org/apache/cassandra/utils/FBUtilities.java |  6 ++
 .../cassandra/utils/Murmur3BloomFilter.java |  4 +-
 .../org/apache/cassandra/utils/MurmurHash.java  |  6 +-
 .../cassandra/utils/obs/OffHeapBitSet.java  |  2 +-
 7 files changed, 72 insertions(+), 31 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/20c2adc8/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 6acbc87..68727dc 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.5
+ * Reduce garbage generated by bloom filter lookups (CASSANDRA-6609)
  * Add ks.cf names to tombstone logging (CASSANDRA-6597)
  * Use LOCAL_QUORUM for LWT operations at LOCAL_SERIAL (CASSANDRA-6495)
  * Wait for gossip to settle before accepting client connections 
(CASSANDRA-4288)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/20c2adc8/src/java/org/apache/cassandra/dht/Murmur3Partitioner.java
--
diff --git a/src/java/org/apache/cassandra/dht/Murmur3Partitioner.java 
b/src/java/org/apache/cassandra/dht/Murmur3Partitioner.java
index 9ff635e..3a045d7 100644
--- a/src/java/org/apache/cassandra/dht/Murmur3Partitioner.java
+++ b/src/java/org/apache/cassandra/dht/Murmur3Partitioner.java
@@ -89,8 +89,9 @@ public class Murmur3Partitioner extends 
AbstractPartitioner<LongToken>
 if (key.remaining() == 0)
 return MINIMUM;
 
-long hash = MurmurHash.hash3_x64_128(key, key.position(), 
key.remaining(), 0)[0];
-return new LongToken(normalize(hash));
+long[] hash = new long[2];
+MurmurHash.hash3_x64_128(key, key.position(), key.remaining(), 0, 
hash);
+return new LongToken(normalize(hash[0]));
 }
 
 public LongToken getRandomToken()

http://git-wip-us.apache.org/repos/asf/cassandra/blob/20c2adc8/src/java/org/apache/cassandra/utils/BloomFilter.java
--
diff --git a/src/java/org/apache/cassandra/utils/BloomFilter.java 
b/src/java/org/apache/cassandra/utils/BloomFilter.java
index b134b3c..9fbb38e 100644
--- a/src/java/org/apache/cassandra/utils/BloomFilter.java
+++ b/src/java/org/apache/cassandra/utils/BloomFilter.java
@@ -20,10 +20,20 @@ package org.apache.cassandra.utils;
 import java.io.IOException;
 import java.nio.ByteBuffer;
 
+import com.google.common.annotations.VisibleForTesting;
+
 import org.apache.cassandra.utils.obs.IBitSet;
 
 public abstract class BloomFilter implements IFilter
 {
+private static final ThreadLocal<long[]> reusableIndexes = new 
+ThreadLocal<long[]>()
+{
+protected long[] initialValue()
+{
+return new long[21];
+}
+};
+
 public final IBitSet bitset;
 public final int hashCount;
 
@@ -33,47 +43,68 @@ public abstract class BloomFilter implements IFilter
 this.bitset = bitset;
 }
 
-private long[] getHashBuckets(ByteBuffer key)
-{
-return getHashBuckets(key, hashCount, bitset.capacity());
-}
-
-protected abstract long[] hash(ByteBuffer b, int position, int remaining, 
long seed);
-
 // Murmur is faster than an SHA-based approach and provides as-good 
collision
 // resistance.  The combinatorial generation approach described in
 // http://www.eecs.harvard.edu/~kirsch/pubs/bbbf/esa06.pdf
 // does prove to work in actual tests, and is obviously faster
 // than performing further iterations of murmur.
-long[] getHashBuckets(ByteBuffer b, int hashCount, long max)
+protected abstract void hash(ByteBuffer b, int position, int remaining, 
long seed, long[] result);
+
+// tests ask for ridiculous numbers of hashes so here is a special case 
for them
+// rather than using the threadLocal like we do in production
+@VisibleForTesting
+public long[] getHashBuckets(ByteBuffer key, int hashCount, l

[2/3] git commit: Reduce garbage generated by bloom filter lookups patch by Benedict Elliott Smith; reviewed by jbellis for CASSANDRA-6609

2014-01-27 Thread jbellis
Reduce garbage generated by bloom filter lookups
patch by Benedict Elliott Smith; reviewed by jbellis for CASSANDRA-6609


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/20c2adc8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/20c2adc8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/20c2adc8

Branch: refs/heads/trunk
Commit: 20c2adc87102963836a59a5e9626005fd9ee08bc
Parents: 8bbb6ed
Author: Jonathan Ellis 
Authored: Mon Jan 27 17:00:08 2014 -0600
Committer: Jonathan Ellis 
Committed: Mon Jan 27 17:00:08 2014 -0600

--
 CHANGES.txt |  1 +
 .../cassandra/dht/Murmur3Partitioner.java   |  5 +-
 .../org/apache/cassandra/utils/BloomFilter.java | 79 ++--
 .../org/apache/cassandra/utils/FBUtilities.java |  6 ++
 .../cassandra/utils/Murmur3BloomFilter.java |  4 +-
 .../org/apache/cassandra/utils/MurmurHash.java  |  6 +-
 .../cassandra/utils/obs/OffHeapBitSet.java  |  2 +-
 7 files changed, 72 insertions(+), 31 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/20c2adc8/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 6acbc87..68727dc 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.5
+ * Reduce garbage generated by bloom filter lookups (CASSANDRA-6609)
  * Add ks.cf names to tombstone logging (CASSANDRA-6597)
  * Use LOCAL_QUORUM for LWT operations at LOCAL_SERIAL (CASSANDRA-6495)
  * Wait for gossip to settle before accepting client connections 
(CASSANDRA-4288)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/20c2adc8/src/java/org/apache/cassandra/dht/Murmur3Partitioner.java
--
diff --git a/src/java/org/apache/cassandra/dht/Murmur3Partitioner.java b/src/java/org/apache/cassandra/dht/Murmur3Partitioner.java
index 9ff635e..3a045d7 100644
--- a/src/java/org/apache/cassandra/dht/Murmur3Partitioner.java
+++ b/src/java/org/apache/cassandra/dht/Murmur3Partitioner.java
@@ -89,8 +89,9 @@ public class Murmur3Partitioner extends AbstractPartitioner<LongToken>
         if (key.remaining() == 0)
             return MINIMUM;
 
-        long hash = MurmurHash.hash3_x64_128(key, key.position(), key.remaining(), 0)[0];
-        return new LongToken(normalize(hash));
+        long[] hash = new long[2];
+        MurmurHash.hash3_x64_128(key, key.position(), key.remaining(), 0, hash);
+        return new LongToken(normalize(hash[0]));
     }
 
     public LongToken getRandomToken()

http://git-wip-us.apache.org/repos/asf/cassandra/blob/20c2adc8/src/java/org/apache/cassandra/utils/BloomFilter.java
--
diff --git a/src/java/org/apache/cassandra/utils/BloomFilter.java b/src/java/org/apache/cassandra/utils/BloomFilter.java
index b134b3c..9fbb38e 100644
--- a/src/java/org/apache/cassandra/utils/BloomFilter.java
+++ b/src/java/org/apache/cassandra/utils/BloomFilter.java
@@ -20,10 +20,20 @@ package org.apache.cassandra.utils;
 import java.io.IOException;
 import java.nio.ByteBuffer;
 
+import com.google.common.annotations.VisibleForTesting;
+
 import org.apache.cassandra.utils.obs.IBitSet;
 
 public abstract class BloomFilter implements IFilter
 {
+    private static final ThreadLocal<long[]> reusableIndexes = new ThreadLocal<long[]>()
+    {
+        protected long[] initialValue()
+        {
+            return new long[21];
+        }
+    };
+
 public final IBitSet bitset;
 public final int hashCount;
 
@@ -33,47 +43,68 @@ public abstract class BloomFilter implements IFilter
 this.bitset = bitset;
 }
 
-    private long[] getHashBuckets(ByteBuffer key)
-    {
-        return getHashBuckets(key, hashCount, bitset.capacity());
-    }
-
-    protected abstract long[] hash(ByteBuffer b, int position, int remaining, long seed);
-
     // Murmur is faster than an SHA-based approach and provides as-good collision
     // resistance.  The combinatorial generation approach described in
     // http://www.eecs.harvard.edu/~kirsch/pubs/bbbf/esa06.pdf
     // does prove to work in actual tests, and is obviously faster
     // than performing further iterations of murmur.
-    long[] getHashBuckets(ByteBuffer b, int hashCount, long max)
+    protected abstract void hash(ByteBuffer b, int position, int remaining, long seed, long[] result);
+
+    // tests ask for ridiculous numbers of hashes so here is a special case for them
+    // rather than using the threadLocal like we do in production
+    @VisibleForTesting
+    public long[] getHashBuckets(ByteBuffer key, int hashCount, long max)
     {
-        long[] result = new long[hashCount];
-        long[] hash = this.hash(b, b.position(), b.remain
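
In outline, the patch replaces two per-call long[] allocations with a
per-thread scratch buffer plus an out-parameter hash function. A minimal
sketch of that pattern follows; the class name and the toy hash are
illustrative stand-ins (the real code uses MurmurHash.hash3_x64_128), not
Cassandra's actual implementation:

{code}
public final class ScratchHasher
{
    // one reusable buffer per thread, sized for the largest hashCount expected
    private static final ThreadLocal<long[]> SCRATCH = new ThreadLocal<long[]>()
    {
        @Override
        protected long[] initialValue()
        {
            return new long[21];
        }
    };

    // out-parameter style: result[0..1] receive the two base hashes, no allocation
    private static void hash(byte[] key, long seed, long[] result)
    {
        long h = seed + 0x9E3779B97F4A7C15L;   // toy mixer, standing in for murmur
        for (byte b : key)
            h = (h ^ b) * 0x100000001B3L;
        result[0] = h;
        result[1] = Long.rotateLeft(h, 31);
    }

    // combinatorial generation (Kirsch/Mitzenmacher): derive hashCount bucket
    // indexes from two base hashes, written into the shared scratch buffer
    public static long[] buckets(byte[] key, int hashCount, long max)
    {
        long[] indexes = SCRATCH.get();   // valid only until the next call on this thread
        hash(key, 0, indexes);            // borrow the first two slots for the base hashes
        long base = indexes[0], inc = indexes[1];
        for (int i = 0; i < hashCount; i++)
        {
            long idx = (base + i * inc) % max;
            indexes[i] = idx < 0 ? idx + max : idx;
        }
        return indexes;
    }
}
{code}

The trade-off is that the returned array is shared per thread, so a caller
must consume the indexes before triggering another lookup on the same thread.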

[jira] [Resolved] (CASSANDRA-6609) Reduce Bloom Filter Garbage Allocation

2014-01-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-6609.
---

   Resolution: Fixed
Fix Version/s: 2.0.5
 Reviewer: Jonathan Ellis
 Assignee: Benedict

Committed to 2.0.  (1.2 is attractive but it's unquestionably an optimization 
and not a fix, so I'm not comfortable touching 1.2.)

> Reduce Bloom Filter Garbage Allocation
> --
>
> Key: CASSANDRA-6609
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6609
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Assignee: Benedict
> Fix For: 2.0.5
>
> Attachments: tmp.diff, tmp2.patch, tmp3.patch
>
>
> Just spotted that we allocate potentially large amounts of garbage on bloom 
> filter lookups, since we allocate a new long[] for each hash() and another to 
> store the bucket indexes we visit, in a manner that guarantees they are 
> allocated on heap. With a lot of sstables and many requests, this could easily 
> be hundreds of megabytes of young gen churn per second.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6609) Reduce Bloom Filter Garbage Allocation

2014-01-27 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883496#comment-13883496
 ] 

Benedict commented on CASSANDRA-6609:
-

bq. so I'm not comfortable touching 1.2

Makes sense

> Reduce Bloom Filter Garbage Allocation
> --
>
> Key: CASSANDRA-6609
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6609
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Assignee: Benedict
> Fix For: 2.0.5
>
> Attachments: tmp.diff, tmp2.patch, tmp3.patch
>
>
> Just spotted that we allocate potentially large amounts of garbage on bloom 
> filter lookups, since we allocate a new long[] for each hash() and another to 
> store the bucket indexes we visit, in a manner that guarantees they are 
> allocated on heap. With a lot of sstables and many requests, this could easily 
> be hundreds of megabytes of young gen churn per second.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (CASSANDRA-6627) nodetool help for handoff and truncatehints nits

2014-01-27 Thread Kristine Hahn (JIRA)
Kristine Hahn created CASSANDRA-6627:


 Summary: nodetool help for handoff and truncatehints nits
 Key: CASSANDRA-6627
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6627
 Project: Cassandra
  Issue Type: Bug
  Components: Documentation & website
 Environment: 2.1 trunk
Reporter: Kristine Hahn
Priority: Trivial


The description for disablehandoff is missing. It has the same description as 
disablegossip.

At the end of the SYNOPSIS of truncatehints, the angle brackets are missing 
around endpoint.

nodetool [(-h <host> | --host <host>)] [(-p <port> | --port <port>)]
[(-pw <password> | --password <password>)]
[(-u <username> | --username <username>)] truncatehints [--] [endpoint]



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Resolved] (CASSANDRA-6627) nodetool help for handoff and truncatehints nits

2014-01-27 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams resolved CASSANDRA-6627.
-

Resolution: Not A Problem

Actually it's correct, since the endpoint is an argument to truncatehints, not 
an argument to a flag like the others.

> nodetool help for handoff and truncatehints nits
> 
>
> Key: CASSANDRA-6627
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6627
> Project: Cassandra
>  Issue Type: Bug
>  Components: Documentation & website
> Environment: 2.1 trunk
>Reporter: Kristine Hahn
>Priority: Trivial
>
> The description for disablehandoff is missing. It has the same description as 
> disablegossip.
> At the end of the SYNOPSIS of truncatehints, the angle brackets are missing 
> around endpoint.
> nodetool [(-h <host> | --host <host>)] [(-p <port> | --port <port>)]
> [(-pw <password> | --password <password>)]
> [(-u <username> | --username <username>)] truncatehints [--] [endpoint]



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


git commit: update nodetool help for truncatehints

2014-01-27 Thread brandonwilliams
Updated Branches:
  refs/heads/trunk 82735e096 -> b010cb791


update nodetool help for truncatehints


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b010cb79
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b010cb79
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b010cb79

Branch: refs/heads/trunk
Commit: b010cb791b70634bf77fcfb39210d85fa40918d4
Parents: 82735e0
Author: Brandon Williams 
Authored: Mon Jan 27 18:06:47 2014 -0600
Committer: Brandon Williams 
Committed: Mon Jan 27 18:07:03 2014 -0600

--
 src/java/org/apache/cassandra/tools/NodeTool.java | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b010cb79/src/java/org/apache/cassandra/tools/NodeTool.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeTool.java 
b/src/java/org/apache/cassandra/tools/NodeTool.java
index 8baabaa..7c49e23 100644
--- a/src/java/org/apache/cassandra/tools/NodeTool.java
+++ b/src/java/org/apache/cassandra/tools/NodeTool.java
@@ -2197,10 +2197,10 @@ public class NodeTool
 }
 }
 
-    @Command(name = "truncatehints", description = "Truncate all hints on the local node, or truncate hints for the endpoint specified.")
+    @Command(name = "truncatehints", description = "Truncate all hints on the local node, or truncate hints for the endpoint(s) specified.")
 public static class TruncateHints extends NodeToolCmd
 {
-        @Arguments(usage = "[endpoint]", description = "Endpoint address to delete hints for, either ip address (\"127.0.0.1\") or hostname")
+        @Arguments(usage = "[endpoint ... ]", description = "Endpoint address(es) to delete hints for, either ip address (\"127.0.0.1\") or hostname")
 private String endpoint = EMPTY;
 
 @Override
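
For reference, a usage sketch of the updated command (addresses here are
placeholders; as the revised help text implies, one or more endpoints may be
listed after the command):

{code}
# truncate all hints stored on the local node
nodetool -h 127.0.0.1 truncatehints

# truncate only the hints destined for the given endpoint(s)
nodetool -h 127.0.0.1 truncatehints 10.0.0.1 10.0.0.2
{code}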



[jira] [Commented] (CASSANDRA-6627) nodetool help for handoff and truncatehints nits

2014-01-27 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883544#comment-13883544
 ] 

Brandon Williams commented on CASSANDRA-6627:
-

I made it a little clearer that you can provide as many endpoints as you like 
in 1f2283c.

> nodetool help for handoff and truncatehints nits
> 
>
> Key: CASSANDRA-6627
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6627
> Project: Cassandra
>  Issue Type: Bug
>  Components: Documentation & website
> Environment: 2.1 trunk
>Reporter: Kristine Hahn
>Priority: Trivial
>
> The description for disablehandoff is missing. It has the same description as 
> disablegossip.
> At the end of the SYNOPSIS of truncatehints, the angle brackets are missing 
> around endpoint.
> nodetool [(-h <host> | --host <host>)] [(-p <port> | --port <port>)]
> [(-pw <password> | --password <password>)]
> [(-u <username> | --username <username>)] truncatehints [--] [endpoint]



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6474) Compaction strategy based on MinHash

2014-01-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883565#comment-13883565
 ] 

Jonathan Ellis commented on CASSANDRA-6474:
---

So I'm running the numbers and it's not 100% obvious to me that minhash 
actually does better here:

# HLL of p=15 is about 10KB and has an error rate of 0.5%
# According to http://en.wikipedia.org/wiki/MinHash the expected error for 
minhash is 1/sqrt(k) for k hashes, so to get to a 0.5% error rate we'd need 
40,000 hashes (160KB at 4 bytes per hash)

So, for a relatively low error rate HLL wins.  Does it just not "scale down" as 
well as minhash?  How well would HLL do with a 1600 byte estimate, which 
minhash would give us 5% error on?  What about 400 bytes?

How accurate do we need the estimate to be, to be useful to compaction?

Incidentally, I'm not sure how hash size affects the 1/sqrt(k) computation; 
surely an 8-byte hash would give more accurate results than a 2-byte hash?
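
To make the size/error trade concrete, here is a minimal MinHash sketch in the
spirit of the proposal (the names and the per-function rehash are illustrative,
not a proposed implementation): keep k running minimums per SSTable, and the
fraction of matching slots between two signatures estimates the Jaccard index,
with standard error about 1/sqrt(k) -- k=400 (~1600 bytes at 4 bytes/hash) for
~5% error, k=40,000 (~160KB) for ~0.5%.

{code}
import java.util.Arrays;
import java.util.Random;

public final class MinHashSignature
{
    private final long[] seeds;      // one seed per hash function
    private final long[] signature;  // running minimum per hash function

    // signatures to be compared must be built with the same k and seed
    public MinHashSignature(int k, long seed)
    {
        Random rng = new Random(seed);
        seeds = new long[k];
        for (int i = 0; i < k; i++)
            seeds[i] = rng.nextLong();
        signature = new long[k];
        Arrays.fill(signature, Long.MAX_VALUE);
    }

    // feed the 64-bit hash of each partition key while writing the sstable
    public void add(long keyHash)
    {
        for (int i = 0; i < seeds.length; i++)
        {
            long h = mix(keyHash ^ seeds[i]);   // cheap per-function rehash
            if (h < signature[i])
                signature[i] = h;
        }
    }

    // fraction of matching slots ~ Jaccard index of the two key sets
    public static double estimateJaccard(MinHashSignature a, MinHashSignature b)
    {
        int matches = 0;
        for (int i = 0; i < a.signature.length; i++)
            if (a.signature[i] == b.signature[i])
                matches++;
        return (double) matches / a.signature.length;
    }

    // 64-bit finalizer in the style of murmur3; any strong mixer works here
    private static long mix(long z)
    {
        z ^= z >>> 33;
        z *= 0xff51afd7ed558ccdL;
        z ^= z >>> 33;
        z *= 0xc4ceb9fe1a85ec53L;
        z ^= z >>> 33;
        return z;
    }
}
{code}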


> Compaction strategy based on MinHash
> 
>
> Key: CASSANDRA-6474
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6474
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Yuki Morishita
>Assignee: sankalp kohli
>  Labels: compaction
> Fix For: 3.0
>
>
> We can consider an SSTable as a set of partition keys, and 'compaction' as 
> de-duplication of those partition keys.
> We want to find compaction candidates from SSTables that have as many of the 
> same keys as possible. If we can group similar SSTables based on some 
> measurement, we can achieve more efficient compaction.
> One such measurement is [Jaccard 
> Distance|http://en.wikipedia.org/wiki/Jaccard_index],
> !http://upload.wikimedia.org/math/1/8/6/186c7f4e83da32e889d606140fae25a0.png!
> which we can estimate using a technique called 
> [MinHash|http://en.wikipedia.org/wiki/MinHash].
> In Cassandra, we can calculate and store a MinHash signature when writing an 
> SSTable. The new compaction strategy uses the signature to find groups of 
> similar SSTables as compaction candidates. We can always fall back to STCS 
> when no such candidates exist.
> This is just an idea floating around my head, but before I forget, I'll dump 
> it here. For an introduction to this technique, [Chapter 3 of 'Mining of 
> Massive Datasets'|http://infolab.stanford.edu/~ullman/mmds/ch3.pdf] is a good 
> start.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


git commit: Require 2.0.5 for rolling upgrades

2014-01-27 Thread aleksey
Updated Branches:
  refs/heads/trunk b010cb791 -> d939be46b


Require 2.0.5 for rolling upgrades


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d939be46
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d939be46
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d939be46

Branch: refs/heads/trunk
Commit: d939be46b7573979f006c4d103774e9450de65b6
Parents: b010cb7
Author: Aleksey Yeschenko 
Authored: Mon Jan 27 19:23:47 2014 -0600
Committer: Aleksey Yeschenko 
Committed: Mon Jan 27 19:23:47 2014 -0600

--
 NEWS.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/d939be46/NEWS.txt
--
diff --git a/NEWS.txt b/NEWS.txt
index 4e00faf..72b898e 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -28,7 +28,7 @@ New features
 
 Upgrading
 -
-   - Rolling upgrades from anything pre-2.0 is not supported.
+   - Rolling upgrades from anything pre-2.0.5 is not supported.
- For leveled compaction users, 2.0 must be atleast started before
  upgrading to 2.1 due to the fact that the old JSON leveled
  manifest is migrated into the sstable metadata files on startup



[jira] [Commented] (CASSANDRA-6474) Compaction strategy based on MinHash

2014-01-27 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883615#comment-13883615
 ] 

Yuki Morishita commented on CASSANDRA-6474:
---

I think we don't need that much accuracy for this. We just want to find the 
sets that contain SSTables with a resemblance of, say, more than 0.5.
There is a storage-efficient improvement called b-bit minwise hashing\[1\]\[2\], 
which stores only b bits (b=1, 2, 3, ...) per hash, that we can use in the case 
above.

1. http://research.microsoft.com/pubs/120078/wfc0398-lips.pdf
2. http://research.microsoft.com/pubs/152334/cacm_hashing.pdf
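
For a rough sense of the trick: keep only the low b bits of each min-hash, and
accept that unequal minimums now also match by accident roughly once in 2^b
comparisons. A first-order correction sketch (an approximation only; the exact
estimator in \[1\] also accounts for the set sizes):

{code}
// signatures are assumed already masked to b bits, e.g.
// sig[i] = (int) (minHash & ((1L << b) - 1))
public static double estimateJaccardBBit(int[] sigA, int[] sigB, int b)
{
    int matches = 0;
    for (int i = 0; i < sigA.length; i++)
        if (sigA[i] == sigB[i])
            matches++;
    double p = (double) matches / sigA.length;   // observed match rate
    double c = 1.0 / (1L << b);                  // chance collision rate
    return Math.max(0.0, (p - c) / (1.0 - c));   // de-bias the estimate
}
{code}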

> Compaction strategy based on MinHash
> 
>
> Key: CASSANDRA-6474
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6474
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Yuki Morishita
>Assignee: sankalp kohli
>  Labels: compaction
> Fix For: 3.0
>
>
> We can consider an SSTable as a set of partition keys, and 'compaction' as 
> de-duplication of those partition keys.
> We want to find compaction candidates from SSTables that have as many of the 
> same keys as possible. If we can group similar SSTables based on some 
> measurement, we can achieve more efficient compaction.
> One such measurement is [Jaccard 
> Distance|http://en.wikipedia.org/wiki/Jaccard_index],
> !http://upload.wikimedia.org/math/1/8/6/186c7f4e83da32e889d606140fae25a0.png!
> which we can estimate using a technique called 
> [MinHash|http://en.wikipedia.org/wiki/MinHash].
> In Cassandra, we can calculate and store a MinHash signature when writing an 
> SSTable. The new compaction strategy uses the signature to find groups of 
> similar SSTables as compaction candidates. We can always fall back to STCS 
> when no such candidates exist.
> This is just an idea floating around my head, but before I forget, I'll dump 
> it here. For an introduction to this technique, [Chapter 3 of 'Mining of 
> Massive Datasets'|http://infolab.stanford.edu/~ullman/mmds/ch3.pdf] is a good 
> start.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6474) Compaction strategy based on MinHash

2014-01-27 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883648#comment-13883648
 ] 

Benedict commented on CASSANDRA-6474:
-

[~yukim] It sounds like you're only aiming to select the best compaction 
candidates, rather than potentially avoiding compacting files that can be shown 
to be (mostly) non-intersecting. Wouldn't the latter be even more useful, 
especially for use cases where data is mostly appended to unique PartitionKeys?

Also, it seems for this we could supplement whatever data we retain for the 
initial calculation with a sampling of one of the files (which, given that 
Murmur3 gives a good spread of data, could probably be optimised to scanning a 
percentage of pages rather than random records) to give tighter bounds on the 
actual overlap, in the case where we decide not to compact two files for this 
reason.

> Compaction strategy based on MinHash
> 
>
> Key: CASSANDRA-6474
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6474
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Yuki Morishita
>Assignee: sankalp kohli
>  Labels: compaction
> Fix For: 3.0
>
>
> We can consider an SSTable as a set of partition keys, and 'compaction' as 
> de-duplication of those partition keys.
> We want to find compaction candidates from SSTables that have as many of the 
> same keys as possible. If we can group similar SSTables based on some 
> measurement, we can achieve more efficient compaction.
> One such measurement is [Jaccard 
> Distance|http://en.wikipedia.org/wiki/Jaccard_index],
> !http://upload.wikimedia.org/math/1/8/6/186c7f4e83da32e889d606140fae25a0.png!
> which we can estimate using a technique called 
> [MinHash|http://en.wikipedia.org/wiki/MinHash].
> In Cassandra, we can calculate and store a MinHash signature when writing an 
> SSTable. The new compaction strategy uses the signature to find groups of 
> similar SSTables as compaction candidates. We can always fall back to STCS 
> when no such candidates exist.
> This is just an idea floating around my head, but before I forget, I'll dump 
> it here. For an introduction to this technique, [Chapter 3 of 'Mining of 
> Massive Datasets'|http://infolab.stanford.edu/~ullman/mmds/ch3.pdf] is a good 
> start.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6210) Repair hangs when a new datacenter is added to a cluster

2014-01-27 Thread Russell Alexander Spitzer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883665#comment-13883665
 ] 

Russell Alexander Spitzer commented on CASSANDRA-6210:
--

Newest patch looks good to me. Passes extended repair tests.

> Repair hangs when a new datacenter is added to a cluster
> 
>
> Key: CASSANDRA-6210
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6210
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Amazon Ec2
> 2 M1.large nodes
>Reporter: Russell Alexander Spitzer
>Assignee: Yuki Morishita
> Fix For: 2.0.5
>
> Attachments: 6210-2.0.txt, RepairLogs.tar.gz, patch_1_logs.tar.gz
>
>
> Attempting to add a new datacenter to a cluster seems to cause repair 
> operations to break. I've been reproducing this with ~20-node clusters but 
> can get it to reliably occur on 2-node setups.
> {code}
> ##Basic Steps to reproduce
> #Node 1 is started using GossipingPropertyFileSnitch as dc1
> #Cassandra-stress is used to insert a minimal amount of data
> $CASSANDRA_STRESS -t 100 -R org.apache.cassandra.locator.NetworkTopologyStrategy --num-keys=1000 --columns=10 --consistency-level=LOCAL_QUORUM --average-size-values --compaction-strategy='LeveledCompactionStrategy' -O dc1:1 --operation=COUNTER_ADD
> #Alter "Keyspace1"
> ALTER KEYSPACE "Keyspace1" WITH replication = {'class': 
> 'NetworkTopologyStrategy', 'dc1': 1 , 'dc2': 1 };
> #Add node 2 using GossipingPropertyFileSnitch as dc2
> run repair on node 1
> run repair on node 2
> {code}
> The repair task on node 1 never completes and while there are no exceptions 
> in the logs of node1, netstat reports the following repair tasks
> {code}
> Mode: NORMAL
> Repair 4e71a250-36b4-11e3-bedc-1d1bb5c9abab
> Repair 6c64ded0-36b4-11e3-bedc-1d1bb5c9abab
> Read Repair Statistics:
> Attempted: 0
> Mismatch (Blocking): 0
> Mismatch (Background): 0
> Pool NameActive   Pending  Completed
> Commandsn/a 0  10239
> Responses   n/a 0   3839
> {code}
> Checking on node 2 we see the following exceptions
> {code}
> ERROR [STREAM-IN-/10.171.122.130] 2013-10-16 22:42:58,961 StreamSession.java 
> (line 410) [Stream #4e71a250-36b4-11e3-bedc-1d1bb5c9abab] Streaming error 
> occurred
> java.lang.NullPointerException
> at 
> org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:174)
> at 
> org.apache.cassandra.streaming.StreamSession.prepare(StreamSession.java:436)
> at 
> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:358)
> at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:293)
> at java.lang.Thread.run(Thread.java:724)
> ...
> ERROR [STREAM-IN-/10.171.122.130] 2013-10-16 22:43:49,214 StreamSession.java 
> (line 410) [Stream #6c64ded0-36b4-11e3-bedc-1d1bb5c9abab] Streaming error 
> occurred
> java.lang.NullPointerException
> at 
> org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:174)
> at 
> org.apache.cassandra.streaming.StreamSession.prepare(StreamSession.java:436)
> at 
> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:358)
> at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:293)
> at java.lang.Thread.run(Thread.java:724)
> {code}
> Netstats on node 2 reports
> {code}
> automaton@ip-10-171-15-234:~$ nodetool netstats
> Mode: NORMAL
> Repair 4e71a250-36b4-11e3-bedc-1d1bb5c9abab
> Read Repair Statistics:
> Attempted: 0
> Mismatch (Blocking): 0
> Mismatch (Background): 0
> Pool NameActive   Pending  Completed
> Commandsn/a 0   2562
> Responses   n/a 0   4284
> {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5872) Bundle JNA

2014-01-27 Thread Lyuben Todorov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883674#comment-13883674
 ] 

Lyuben Todorov commented on CASSANDRA-5872:
---

Removing {{}} from 
{{}} seems 
to get rid of the download. Build succeeds without build.xml containing the 
aforementioned dependency, so maybe we want to get rid of it? [branch 
here|https://github.com/lyubent/cassandra/commit/f3de73d49ad30860c32f14ab1cbc4e8767c7b241]
 so you can all have a look. 

Side note: adding JNA works on Linux (haven't tested Windows yet), but 
Cassandra still needs the {{-Dcassandra.boot_without_jna=true}} override to 
start on OS X.

> Bundle JNA
> --
>
> Key: CASSANDRA-5872
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5872
> Project: Cassandra
>  Issue Type: Task
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Lyuben Todorov
>Priority: Minor
> Fix For: 2.1
>
> Attachments: 5872-trunk.patch, 5872_debian.patch
>
>
> JNA 4.0 is reported to be dual-licensed LGPL/APL.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6474) Compaction strategy based on MinHash

2014-01-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883856#comment-13883856
 ] 

Jonathan Ellis commented on CASSANDRA-6474:
---

I see that https://github.com/aggregateknowledge/java-hll (thanks [~iconara]) 
claims that 1280 bytes is enough to count "billions" of values with "a few 
percent" error with HLL.

1K per sstable gets us to roughly 160TB of data in 1GB of HLL at the default 
160MB/sstable LCS size, which seems quite reasonable to me, especially if we 
can move it off heap later.  (Since few people are even approaching 5TB per 
node I think it's fine to leave it resident, on heap for now.)

If that gets us more information than the minhash approach, then that sounds 
like a win to me, both for better compaction and for simplicity of 
implementation.  If we have to fudge our BF allocations by 5% or so, I'm okay 
with that trade.

Another thought: cardinality estimation will primarily be useful for STCS, or 
whatever we call a new Similarity-based strategy.  So I'm also okay with 
saying, "we'll calculate HLL for LCS and use it to calculate BF size 
accurately, but for STCS we'll keep it resident and use it to calculate 
candidates."  For LCS it would only be useful for prioritization, since the set 
of potential candidates is determined by the level-sorting -- and given the 
order of magnitude difference in each level, size-based estimates of similarity 
a la hyperdex should be "almost" as good. 
(http://hackingdistributed.com/2013/06/17/hyperleveldb/)
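
For illustration, a sketch of how per-sstable HLLs could drive a similarity
score, assuming the java-hll API linked above (HLL(log2m, regwidth), addRaw,
union, and cardinality per its README; the wrapper class is hypothetical). The
only idea is inclusion-exclusion: |A n B| ~ |A| + |B| - |A u B|, and Jaccard is
overlap over union.

{code}
import net.agkn.hll.HLL;

public final class SSTableSimilarity
{
    // log2m=11 -> 2^11 registers at 5 bits each, ~1280 bytes per sketch
    public static HLL newSketch()
    {
        return new HLL(11 /* log2m */, 5 /* register width in bits */);
    }

    // feed the 64-bit murmur hash of each partition key while writing the sstable
    public static void addKey(HLL sketch, long keyHash)
    {
        sketch.addRaw(keyHash);
    }

    // Jaccard ~ |A n B| / |A u B|, with |A n B| ~ |A| + |B| - |A u B|
    public static double estimateJaccard(HLL a, HLL b)
    {
        HLL union = newSketch();
        union.union(a);              // union() folds the argument into the receiver
        union.union(b);
        long u = union.cardinality();
        long overlap = a.cardinality() + b.cardinality() - u;
        return u == 0 ? 0.0 : Math.max(0L, overlap) / (double) u;
    }
}
{code}

Since the intersection falls out of a subtraction, its relative error grows as
the true overlap shrinks, which seems acceptable here: we mostly need to
separate "mostly disjoint" from "substantially overlapping" sstables.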

> Compaction strategy based on MinHash
> 
>
> Key: CASSANDRA-6474
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6474
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Yuki Morishita
>Assignee: sankalp kohli
>  Labels: compaction
> Fix For: 3.0
>
>
> We can consider an SSTable as a set of partition keys, and 'compaction' as 
> de-duplication of those partition keys.
> We want to find compaction candidates from SSTables that have as many of the 
> same keys as possible. If we can group similar SSTables based on some 
> measurement, we can achieve more efficient compaction.
> One such measurement is [Jaccard 
> Distance|http://en.wikipedia.org/wiki/Jaccard_index],
> !http://upload.wikimedia.org/math/1/8/6/186c7f4e83da32e889d606140fae25a0.png!
> which we can estimate using a technique called 
> [MinHash|http://en.wikipedia.org/wiki/MinHash].
> In Cassandra, we can calculate and store a MinHash signature when writing an 
> SSTable. The new compaction strategy uses the signature to find groups of 
> similar SSTables as compaction candidates. We can always fall back to STCS 
> when no such candidates exist.
> This is just an idea floating around my head, but before I forget, I'll dump 
> it here. For an introduction to this technique, [Chapter 3 of 'Mining of 
> Massive Datasets'|http://infolab.stanford.edu/~ullman/mmds/ch3.pdf] is a good 
> start.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)