[jira] [Created] (CASSANDRA-6484) cassandra-shuffle not working with authentication

2013-12-13 Thread Gibheer (JIRA)
Gibheer created CASSANDRA-6484:
--

 Summary: cassandra-shuffle not working with authentication
 Key: CASSANDRA-6484
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6484
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: cassandra 2.0.3
Reporter: Gibheer


When enabling authentication for a cassandra cluster the tool cassandra-shuffle 
is unable to connect.

The reason is that cassandra-shuffle doesn't accept any parameters for username 
and password for the Thrift connection.

To solve this problem, parameters for username and password should be added. 
The tool should also be able to read authentication data from cqlshrc or a 
separate file.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (CASSANDRA-5899) Sends all interface in native protocol notification when rpc_address=0.0.0.0

2013-12-13 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847316#comment-13847316
 ] 

Sylvain Lebresne commented on CASSANDRA-5899:
-

bq. Why don't we add a broadcast_rpc_address config option?

Good idea. I mean, I think we need it anyway for the same reasons that we have 
a broadcast_listen_address, that is for when none of the interfaces on the node 
are actually public to the client because there is some router in between.

The fact that we don't know which address to send when rpc_address is 0.0.0.0 
is a slightly different problem, but we could make it mandatory (or strongly 
recommended, though I'd personally vote for mandatory) to set a 
broadcast_rpc_address when rpc_address=0.0.0.0 and kill two birds with one stone.


 Sends all interface in native protocol notification when rpc_address=0.0.0.0
 

 Key: CASSANDRA-5899
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5899
 Project: Cassandra
  Issue Type: Improvement
Reporter: Sylvain Lebresne
Priority: Minor
 Fix For: 2.1


 For the native protocol notifications, when we send a new node notification, 
 we send the rpc_address of that new node. For this to be actually useful, 
 that address sent should be publicly accessible by the driver it is destined 
 to. 
 The problem is when rpc_address=0.0.0.0. Currently, we send the 
 listen_address, which is correct in the sense that we are indeed bound to it, 
 but it might not be accessible by client nodes.
 In fact, one of the good reasons to use a 0.0.0.0 rpc_address would be if you 
 have a private network for internode communication and another for 
 client-server communications, but still want to be able to issue queries from 
 the private network for debugging. In that case, the current behavior of 
 sending the listen_address doesn't really help.
 So one suggestion would be to instead send all the addresses on which the 
 (native protocol) server is bound to (which would still leave to the driver 
 the task to pick the right one, but at least it has something to pick from).
 That's relatively trivial to do in practice, but it does require a minor 
 binary protocol break to return a list instead of just one IP, which is why 
 I'm tentatively marking this 2.0. Maybe we can shove that tiny change into the 
 final (in the protocol v2 only)? Provided we agree it's a good idea, of course.
 Now to be complete, for the same reasons, we would also need to store all the 
 addresses we are bound to in the peers table. That's also fairly simple and 
 the backward compatibility story is maybe a tad simpler: we could add a new 
 {{rpc_addresses}} column that would be a list and deprecate {{rpc_address}} 
 (to be removed in 2.1 for instance).





[jira] [Commented] (CASSANDRA-6480) Custom secondary index options in CQL3

2013-12-13 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847320#comment-13847320
 ] 

Sylvain Lebresne commented on CASSANDRA-6480:
-

Out of curiosity, do you actually have a use case for this? Asking because we 
left it out of CASSANDRA-5484 for lack of any real-life use case we were aware of.

 Custom secondary index options in CQL3
 --

 Key: CASSANDRA-6480
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6480
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Andrés de la Peña
  Labels: cql3, index

 The CQL3 create index statement syntax does not allow specifying the 
 options map internally used by custom indexes. 





[jira] [Commented] (CASSANDRA-6151) CqlPagingRecorderReader Used when Partition Key Is Explicitly Stated

2013-12-13 Thread devThoughts (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847341#comment-13847341
 ] 

devThoughts commented on CASSANDRA-6151:


@Shridhar thanks a ton, it's working. 

 CqlPagingRecorderReader Used when Partition Key Is Explicitly Stated
 

 Key: CASSANDRA-6151
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6151
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Reporter: Russell Alexander Spitzer
Assignee: Alex Liu
Priority: Minor
 Attachments: 6151-1.2-branch.txt, 6151-v2-1.2-branch.txt, 
 6151-v3-1.2-branch.txt, 6151-v4-1.2.10-branch.txt


 From 
 http://stackoverflow.com/questions/19189649/composite-key-in-cassandra-with-pig/19211546#19211546
 The user was attempting to load a single partition using a where clause in a 
 pig load statement. 
 CQL Table
 {code}
 CREATE table data (
   occurday  text,
   seqnumber int,
   occurtimems bigint,
   unique bigint,
   fields map<text, text>,
   primary key ((occurday, seqnumber), occurtimems, unique)
 )
 {code}
 Pig Load statement Query
 {code}
 data = LOAD 
 'cql://ks/data?where_clause=seqnumber%3D10%20AND%20occurday%3D%272013-10-01%27'
  USING CqlStorage();
 {code}
 This results in an exception when processed by the the CqlPagingRecordReader 
 which attempts to page this query even though it contains at most one 
 partition key. This leads to an invalid CQL statement. 
 CqlPagingRecordReader Query
 {code}
 SELECT * FROM data WHERE token(occurday,seqnumber) > ? AND
 token(occurday,seqnumber) <= ? AND occurday='A Great Day' 
 AND seqnumber=1 LIMIT 1000 ALLOW FILTERING
 {code}
 Exception
 {code}
  InvalidRequestException(why:occurday cannot be restricted by more than one 
 relation if it includes an Equal)
 {code}
 I'm not sure it is worth the special case, but a modification to not use the 
 paging record reader when the entire partition key is specified would solve 
 this issue. 
 h3. Solution
  If there are EQUAL clauses for all the partition keys, we use the query 
 {code}
   SELECT * FROM data 
   WHERE occurday='A Great Day' 
AND seqnumber=1 LIMIT 1000 ALLOW FILTERING
 {code}
 instead of 
 {code}
   SELECT * FROM data 
 WHERE token(occurday,seqnumber) > ? 
AND token(occurday,seqnumber) <= ? 
AND occurday='A Great Day' 
AND seqnumber=1 LIMIT 1000 ALLOW FILTERING
 {code}
 The baseline implementation retrieves the data of all rows around the 
 ring. This new feature retrieves the data of a single wide row, one level 
 below the baseline. It helps the use case where the user is only 
 interested in a specific wide row, so the job doesn't have to retrieve 
 all the rows around the ring.
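 The check described above can be sketched as follows; this is an illustrative
 model only (the names and the helper are hypothetical, not Cassandra's actual
 Hadoop code): skip the token-range paging query exactly when every
 partition-key column carries an equality restriction.

```python
# Hypothetical sketch of the decision above: use the plain per-partition
# query (no token() paging) only when every partition-key column has an
# EQ clause in the user's where_clause. Names are illustrative.
PARTITION_KEYS = ["occurday", "seqnumber"]

def partition_key_fully_restricted(eq_columns, partition_keys=PARTITION_KEYS):
    """True when all partition-key columns appear with an EQ clause."""
    return set(partition_keys) <= set(eq_columns)

# where_clause=seqnumber%3D10%20AND%20occurday%3D%272013-10-01%27
# restricts both partition-key columns, so paging can be skipped:
print(partition_key_fully_restricted(["seqnumber", "occurday"]))  # True
print(partition_key_fully_restricted(["occurday"]))               # False
```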





[jira] [Commented] (CASSANDRA-5839) Save repair data to system table

2013-12-13 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847367#comment-13847367
 ] 

Sylvain Lebresne commented on CASSANDRA-5839:
-

Haven't looked at the whole patch, but on the schema, I'd keep the grouping of 
the keyspace and cf names into the partition key (like in Jonathan's initial 
comment). It's common to have just one keyspace (or at least very few) and 
there's no point in putting all the burden of the repair data on just one node.

bq. users can use their force to construct their desired view

I just pictured a green Yuki with pointy ears answering a user question with 
"use the force, Luke". That was funny :). Joking aside, I agree there is no 
need to overthink it initially. That being said, I don't see a particular 
reason not to let users create 2ndary indexes on those tables if they care for 
it, and in fact, is there a reason this doesn't work out of the box? (If there 
is, let's worry about it later.)

bq. I don't think a hardcode factor based on gc_grace is the way to go

It certainly shouldn't be hardcoded. We have a per-table default_time_to_live, 
which is what we should use, and users can set it to whatever they want. The 
only question is what default to set it to (I'd personally go with something 
like 6 months, but I don't feel strongly about any particular number).

bq. I renamed the table to repair_jobs since each entry corresponds to a 
RepairJob rather than a whole session

I'd go with repair_history or even just repairs rather than sticking too 
closely to internal Java class names, unless we particularly enjoy feeling 
stupid when we rename said classes in a refactor, that is.

bq. What do you consider stats (and not status) in the current schema?

I believe Yuki really just meant total_ranges_out_of_sync in the current 
schema. And what I'd suggest is to leave this discussion to a follow-up ticket. 
 I think there could be value both in some per-job simple stats (for quick 
estimation of what the repair did) and in some per-job, per-replica repair log 
that provides more fine-grained information. But it warrants its own 
discussion and so should be a separate ticket imo.

Nits:
* not a fan of system_global as a name. Imo it suggests that it might be 
replicated to all nodes, which it isn't. Not pretending I have a much better 
proposal, though.
* coordinating_node and participant_nodes should be of type inet. I'd 
also shorten the names, say coordinator and participants.


 Save repair data to system table
 

 Key: CASSANDRA-5839
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5839
 Project: Cassandra
  Issue Type: New Feature
  Components: Core, Tools
Reporter: Jonathan Ellis
Assignee: Jimmy Mårdell
Priority: Minor
 Fix For: 2.0.4

 Attachments: 2.0.4-5839-draft.patch


 As noted in CASSANDRA-2405, it would be useful to store repair results, 
 particularly with sub-range repair available (CASSANDRA-5280).





[jira] [Issue Comment Deleted] (CASSANDRA-6480) Custom secondary index options in CQL3

2013-12-13 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrés de la Peña updated CASSANDRA-6480:
-

Comment: was deleted

(was: Hi Sylvain,
yes, we actually have a use case. We're developing a custom secondary index 
based on Lucene. It would be useful to have a syntax similar to what we have 
for creating a keyspace, using a map. For example, it would be nice to be able 
to write something like this:
)

 Custom secondary index options in CQL3
 --

 Key: CASSANDRA-6480
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6480
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Andrés de la Peña
  Labels: cql3, index

 The CQL3 create index statement syntax does not allow specifying the 
 options map internally used by custom indexes. 





[jira] [Commented] (CASSANDRA-6480) Custom secondary index options in CQL3

2013-12-13 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847373#comment-13847373
 ] 

Andrés de la Peña commented on CASSANDRA-6480:
--

Hi Sylvain,
yes, we actually have a use case. We're developing a custom secondary index 
based on Lucene. It would be useful to have a syntax similar to what we have 
for creating a keyspace, using a map. For example, it would be nice to be able 
to write something like this:


 Custom secondary index options in CQL3
 --

 Key: CASSANDRA-6480
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6480
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Andrés de la Peña
  Labels: cql3, index

 The CQL3 create index statement syntax does not allow to specify the 
 options map internally used by custom indexes. 





[jira] [Updated] (CASSANDRA-6484) cassandra-shuffle not working with authentication

2013-12-13 Thread Gibheer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gibheer updated CASSANDRA-6484:
---

Attachment: login.patch

This is an untested patch adding the login functionality. I couldn't get 
Cassandra to compile, so expect typos and the like.
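As an aside, the cqlshrc-reading part the ticket asks for could look roughly 
like the following sketch. This is purely illustrative (the section name 
[authentication] matches what cqlsh uses, but the parsing helper here is 
hypothetical, not the patch's actual code):

```python
# Illustrative sketch: pull username/password out of a cqlshrc-style INI
# file, as the ticket suggests cassandra-shuffle could do. The
# [authentication] section mirrors cqlsh's convention; read_credentials
# is a hypothetical helper, not Cassandra code.
import configparser

CQLSHRC_EXAMPLE = """
[authentication]
username = cassandra
password = cassandra
"""

def read_credentials(text):
    """Return (username, password) from cqlshrc-style text, or (None, None)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if parser.has_section("authentication"):
        section = parser["authentication"]
        return section.get("username"), section.get("password")
    return None, None

user, pw = read_credentials(CQLSHRC_EXAMPLE)
print(user, pw)  # cassandra cassandra
```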

 cassandra-shuffle not working with authentication
 -

 Key: CASSANDRA-6484
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6484
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: cassandra 2.0.3
Reporter: Gibheer
 Attachments: login.patch


 When enabling authentication for a cassandra cluster the tool 
 cassandra-shuffle is unable to connect.
 The reason is that cassandra-shuffle doesn't accept any parameters for 
 username and password for the Thrift connection.
 To solve this problem, parameters for username and password should be added. 
 The tool should also be able to read authentication data from cqlshrc or a 
 separate file.





[jira] [Commented] (CASSANDRA-6480) Custom secondary index options in CQL3

2013-12-13 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847379#comment-13847379
 ] 

Andrés de la Peña commented on CASSANDRA-6480:
--

Hi Sylvain,
yes, we have a use case. We're developing a custom secondary index based on 
Lucene, and it would be nice to have a syntax similar to the 'create 
keyspace' statement, in which we can pass the complete index options map (which 
already exists in the code). For example:

CREATE TABLE documents (
  key timeuuid,
  date timestamp,
  spanish_text varchar,
  english_text varchar,
  PRIMARY KEY (key));

CREATE CUSTOM INDEX ON demo.users (spanish_text) USING {'class' : 
'org.stratio.FullTextIndex', 'analyzer': 'SpanishAnalyzer', 
'storage':'/mnt/ssd/indexes/'};
CREATE CUSTOM INDEX ON demo.users (english_text) USING {'class' : 
'org.stratio.FullTextIndex', 'analyzer': 'EnglishAnalyzer', 
'storage':'/mnt/ssd/indexes/'};

 Custom secondary index options in CQL3
 --

 Key: CASSANDRA-6480
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6480
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Andrés de la Peña
  Labels: cql3, index

 The CQL3 create index statement syntax does not allow specifying the 
 options map internally used by custom indexes. 





[jira] [Updated] (CASSANDRA-6480) Custom secondary index options in CQL3

2013-12-13 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-6480:
-

  Priority: Minor  (was: Major)
Issue Type: Improvement  (was: Bug)

As Sylvain says, it's been left out b/c there was no use case for it. I was 
going to add WITH OPTIONS = {..} or something like it for custom 2i 
simultaneously with CASSANDRA-5962 for 2.1, but for 2i alone it can go into 2.0.

 Custom secondary index options in CQL3
 --

 Key: CASSANDRA-6480
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6480
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Andrés de la Peña
Priority: Minor
  Labels: cql3, index

 The CQL3 create index statement syntax does not allow specifying the 
 options map internally used by custom indexes. 





[jira] [Commented] (CASSANDRA-6318) IN predicates on non-primary-key columns (%s) is not yet supported

2013-12-13 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847413#comment-13847413
 ] 

Sylvain Lebresne commented on CASSANDRA-6318:
-

bq. How important is it deemed by the team ?

I can't answer for the team, but I can tell you that personally I don't 
consider CASSANDRA-4386 a big priority. It's almost only a performance 
issue: you can always replace an IN by multiple queries, and while 
CASSANDRA-4386 will probably yield some benefits over doing multiple queries, 
it's unlikely to be a huge difference. Besides, it is no secret that 2ndary 
indexes are not C*'s most performant feature and that when performance is 
your main concern, denormalization is your friend. So relying a lot on IN 
queries over a 2ndary index is likely not the best of ideas.
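The "replace an IN by multiple queries" workaround can be sketched like this. 
A minimal client-side model only: `run_query` is a stand-in for a real driver 
call and the sample data is invented for illustration.

```python
# Illustrative: emulate "WHERE blog IN (1, 2) AND author = 3" as one
# query per IN value, unioned client-side. run_query is a hypothetical
# stand-in for: SELECT * FROM post WHERE blog = ? AND author = ?
SAMPLE_ROWS = [
    {"post": "a", "blog": 1, "author": 3},
    {"post": "b", "blog": 2, "author": 3},
    {"post": "c", "blog": 3, "author": 3},
]

def run_query(blog, author):
    # Stand-in for a single-partition driver call.
    return [r for r in SAMPLE_ROWS if r["blog"] == blog and r["author"] == author]

def select_with_in(blogs, author):
    rows = []
    for blog in blogs:  # one query per IN value
        rows.extend(run_query(blog, author))
    return rows

print([r["post"] for r in select_with_in([1, 2], 3)])  # ['a', 'b']
```

The per-value queries can also be issued concurrently, which is why the gain 
from a native IN is expected to be modest.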

bq. What is the mental image of C* usage in the team ?

Hopefully you understand this question is pretty generic, has no simple answer 
and has no place in a comment of this JIRA ticket. If you have data modeling 
questions, the user mailing list is probably the right place.

bq. How hard is it to implement ?

My best estimate is that it's likely not terribly hard, though not 2 lines of 
code either.

bq. If i comment out the exception in 
cassandra.cql3.statements.SelectStatement.RawStatement#prepare, what and where 
will break ?

Because I can't really answer that precisely without some code dive, because 
it's relatively simple for you to try it out and see for yourself and because 
you'll likely learn something in the process, I'm gonna let you do it if you 
care enough about an answer.


 IN predicates on non-primary-key columns (%s) is not yet supported
 --

 Key: CASSANDRA-6318
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6318
 Project: Cassandra
  Issue Type: Bug
Reporter: Sergey Nagaytsev
  Labels: cql3
 Attachments: CASSANDRA_6318_test.cql


 Query:
 SELECT * FROM post WHERE blog IN (1,2) AND author=3 ALLOW FILTERING -- 
 contrived
 Error: IN predicates on non-primary-key columns (blog) is not yet supported
 Please either implement it, set a milestone, or say it will never be 
 implemented!
 P.S. I did search and seemingly found no issue/plan related to this. Maybe 
 CASSANDRA-6048?
 P.S.2 What is the recommended workaround for this? Manual index tables? What 
 are the design guidelines for them?





[jira] [Commented] (CASSANDRA-6480) Custom secondary index options in CQL3

2013-12-13 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847417#comment-13847417
 ] 

Aleksey Yeschenko commented on CASSANDRA-6480:
--

Oh, definitely not for non-custom 2i. That would be a *bad* idea.

Only talking about CREATE CUSTOM INDEX here.

 Custom secondary index options in CQL3
 --

 Key: CASSANDRA-6480
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6480
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Andrés de la Peña
Priority: Minor
  Labels: cql3, index

 The CQL3 create index statement syntax does not allow specifying the 
 options map internally used by custom indexes. 





[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination

2013-12-13 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847419#comment-13847419
 ] 

Jeremy Hanna commented on CASSANDRA-6311:
-

[~alexliu68] fwiw, the just-released version 1.0.5 of the Java driver includes 
support for LOCAL_ONE, in case that's still helpful here.  
https://groups.google.com/a/lists.datastax.com/forum/#!topic/java-driver-user/UvTLT5q-5o4

 Add CqlRecordReader to take advantage of native CQL pagination
 --

 Key: CASSANDRA-6311
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6311
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Alex Liu
Assignee: Alex Liu
 Fix For: 2.0.4

 Attachments: 6311-v3-2.0-branch.txt, 6311-v4.txt, 
 6331-2.0-branch.txt, 6331-v2-2.0-branch.txt


 Since the latest CQL pagination is done and should be more efficient, we 
 need to update CqlPagingRecordReader to use it instead of the custom Thrift 
 paging.





[jira] [Commented] (CASSANDRA-6480) Custom secondary index options in CQL3

2013-12-13 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847422#comment-13847422
 ] 

Sylvain Lebresne commented on CASSANDRA-6480:
-

Got it, I misunderstood "for 2i alone" as meaning non-custom 2i rather than 
"for 2i but not for triggers". We're in agreement, my bad :)

 Custom secondary index options in CQL3
 --

 Key: CASSANDRA-6480
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6480
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Andrés de la Peña
Priority: Minor
  Labels: cql3, index

 The CQL3 create index statement syntax does not allow specifying the 
 options map internally used by custom indexes. 





[jira] [Comment Edited] (CASSANDRA-6418) auto_snapshots are not removable via 'nodetool clearsnapshot'

2013-12-13 Thread Lyuben Todorov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13845983#comment-13845983
 ] 

Lyuben Todorov edited comment on CASSANDRA-6418 at 12/13/13 11:59 AM:
--

Changed getCFDirectory to return a list of directories and added a check to 
ensure that CFs exist in v3.


was (Author: lyubent):
Changed getCFDirectory to return a list of directories and added a check to 
ensure that CFs exist. 

 auto_snapshots are not removable via 'nodetool clearsnapshot'
 -

 Key: CASSANDRA-6418
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6418
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: auto_snapshot: true
Reporter: J. Ryan Earl
Assignee: Lyuben Todorov
Priority: Minor
 Fix For: 2.0.4

 Attachments: 6418_cassandra-2.0.patch, 6418_v2.patch, 
 6418_v3_cassandra-2.0.patch


 Snapshots of deleted CFs created via the auto_snapshot configuration 
 parameter appear to not be tracked.  The result is that 'nodetool 
 clearsnapshot' on a keyspace with deleted CFs does nothing, and short of 
 manually removing the files from the filesystem, deleted CFs remain 
 indefinitely, taking up space.
 I'm not sure if this is intended, but it seems pretty counter-intuitive.  I 
 haven't found any documentation indicating that auto_snapshots would be 
 ignored by 'nodetool clearsnapshot'.





[jira] [Reopened] (CASSANDRA-2238) Allow nodetool to print out hostnames given an option

2013-12-13 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko reopened CASSANDRA-2238:
--


Reopening b/c of a patch: 
https://github.com/yaitskov/cassandra/commit/c816af1b75a23c4559619d5f0cf81e2e728d9990

[~brandon.williams] Mind having a look?

 Allow nodetool to print out hostnames given an option
 -

 Key: CASSANDRA-2238
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2238
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Joaquin Casares
Priority: Trivial

 Give nodetool the option of either displaying IPs or hostnames for the nodes 
 in a ring.





[jira] [Commented] (CASSANDRA-5414) Unspecified initial_token with ByteOrderedPartitioner results in java.lang.NumberFormatException: Non-hex characters

2013-12-13 Thread Lyuben Todorov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847445#comment-13847445
 ] 

Lyuben Todorov commented on CASSANDRA-5414:
---

[~pasthelod] I think you've supplied invalid tokens. With the 
ByteOrderedPartitioner the tokens need to be supplied in hex. If you want to 
calculate hex tokens, take a look at the [BOP wiki 
page|http://wiki.apache.org/cassandra/ByteOrderedPartitioner]; there is a 
python script there that calculates tokens. 
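For a rough idea (this is a minimal sketch, not the wiki's script): with the 
ByteOrderedPartitioner a token is just raw key bytes written as hex, so a 
token can be derived from a sample key like this. Note the hex below matches 
the "73797374656d5f61757468" seen in the exception, which is "system_auth".

```python
# Minimal sketch (not the wiki script itself): derive a BOP-style hex
# token from a sample key by hex-encoding its UTF-8 bytes.
import binascii

def to_hex_token(key):
    return binascii.hexlify(key.encode("utf-8")).decode("ascii")

print(to_hex_token("system_auth"))  # 73797374656d5f61757468
```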

 Unspecified initial_token with ByteOrderedPartitioner results in 
 java.lang.NumberFormatException: Non-hex characters
 

 Key: CASSANDRA-5414
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5414
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.2.3
 Environment: Oracle JDK, java 1.7.0_07, amd64 (x86_64), on a Debian 
 Squeeze, no virtual nodes, nothing fancy.
Reporter: Pas
Assignee: Lyuben Todorov
Priority: Minor

 Using one seed, after adding the third node the fresh node chooses an illegal 
 token, so it can't bootstrap itself. So I had to specify initial_token on 
 each host manually. (Which, as far as I know, is recommended anyway for 
 non-random partitioners.)
 java.lang.NumberFormatException: Non-hex characters in 
 ^@^V73797374656d5f61757468
 at org.apache.cassandra.utils.Hex.hexToBytes(Hex.java:60)
 at 
 org.apache.cassandra.dht.AbstractByteOrderedPartitioner$1.fromString(AbstractByteOrderedPartitioner.java:167)
 at 
 org.apache.cassandra.dht.BootStrapper$BootstrapTokenCallback.response(BootStrapper.java:230)
 at 
 org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:45)
 at 
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:722)





[jira] [Commented] (CASSANDRA-6470) ArrayIndexOutOfBoundsException on range query from client

2013-12-13 Thread Dmitry (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847458#comment-13847458
 ] 

Dmitry commented on CASSANDRA-6470:
---

I get the same error. I'm using Cassandra 2.0.2 and Datastax Java Driver 
2.0.0-beta2

 ArrayIndexOutOfBoundsException on range query from client
 -

 Key: CASSANDRA-6470
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6470
 Project: Cassandra
  Issue Type: Bug
Reporter: Enrico Scalavino
Assignee: Ryan McGuire

 schema: 
 CREATE TABLE inboxkeyspace.inboxes(user_id bigint, message_id bigint, 
 thread_id bigint, network_id bigint, read boolean, PRIMARY KEY(user_id, 
 message_id)) WITH CLUSTERING ORDER BY (message_id DESC);
 CREATE INDEX ON inboxkeyspace.inboxes(read);
 query: 
 SELECT thread_id, message_id, network_id FROM inboxkeyspace.inboxes WHERE 
 user_id = ? AND message_id < ? AND read = ? LIMIT ? 
 The query works if run via cqlsh. However, when run through the datastax 
 client, on the client side we get a timeout exception and on the server side, 
 the Cassandra log shows this exception: 
 ERROR [ReadStage:4190] 2013-12-10 13:18:03,579 CassandraDaemon.java (line 
 187) Exception in thread Thread[ReadStage:4190,5,main]
 java.lang.RuntimeException: java.lang.ArrayIndexOutOfBoundsException: 0
 at 
 org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1940)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:722)
 Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
 at 
 org.apache.cassandra.db.filter.SliceQueryFilter.start(SliceQueryFilter.java:261)
 at 
 org.apache.cassandra.db.index.composites.CompositesSearcher.makePrefix(CompositesSearcher.java:66)
 at 
 org.apache.cassandra.db.index.composites.CompositesSearcher.getIndexedIterator(CompositesSearcher.java:101)
 at 
 org.apache.cassandra.db.index.composites.CompositesSearcher.search(CompositesSearcher.java:53)
 at 
 org.apache.cassandra.db.index.SecondaryIndexManager.search(SecondaryIndexManager.java:537)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.search(ColumnFamilyStore.java:1669)
 at 
 org.apache.cassandra.db.PagedRangeCommand.executeLocally(PagedRangeCommand.java:109)
 at 
 org.apache.cassandra.service.StorageProxy$LocalRangeSliceRunnable.runMayThrow(StorageProxy.java:1423)
 at 
 org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1936)
 ... 3 more





[jira] [Commented] (CASSANDRA-5899) Sends all interface in native protocol notification when rpc_address=0.0.0.0

2013-12-13 Thread Chris Burroughs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847477#comment-13847477
 ] 

Chris Burroughs commented on CASSANDRA-5899:


So I don't think this is terribly uncommon, but as an example all of our nodes 
have two interfaces.  Depending on where a client is in the network there are 3 
addresses (the interfaces, plus one that is NATed for inter-DC) that could 
potentially work, but only 1-2 that actually would.  Some might appear to work 
but would be very bad (going out the local net and back in, or something).  I 
am unsure how any single value for broadcast_rpc_address would be useful, nor 
how even a list could work, since which address to choose depends on client 
location.

 Sends all interface in native protocol notification when rpc_address=0.0.0.0
 

 Key: CASSANDRA-5899
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5899
 Project: Cassandra
  Issue Type: Improvement
Reporter: Sylvain Lebresne
Priority: Minor
 Fix For: 2.1


 For the native protocol notifications, when we send a new node notification, 
 we send the rpc_address of that new node. For this to be actually useful, 
 that address sent should be publicly accessible by the driver it is destined 
 to. 
 The problem is when rpc_address=0.0.0.0. Currently, we send the 
 listen_address, which is correct in the sense that we are indeed bound to it, 
 but it might not be accessible by client nodes.
 In fact, one of the good reasons to use a 0.0.0.0 rpc_address would be if you 
 have a private network for internode communication and another for 
 client-server communications, but still want to be able to issue queries from 
 the private network for debugging. In that case, the current behavior of 
 sending the listen_address doesn't really help.
 So one suggestion would be to instead send all the addresses on which the 
 (native protocol) server is bound to (which would still leave to the driver 
 the task to pick the right one, but at least it has something to pick from).
 That's relatively trivial to do in practice, but it does require a minor 
 binary protocol break to return a list instead of just one IP, which is why 
 I'm tentatively marking this 2.0. Maybe we can shove that tiny change into the 
 final (in the protocol v2 only)? Provided we agree it's a good idea, of course.
 Now to be complete, for the same reasons, we would also need to store all the 
 addresses we are bound to in the peers table. That's also fairly simple and 
 the backward compatibility story is maybe a tad simpler: we could add a new 
 {{rpc_addresses}} column that would be a list and deprecate {{rpc_address}} 
 (to be removed in 2.1 for instance).
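If the server advertised every address it is bound to, the driver would still need a heuristic to pick one. A minimal sketch of such client-side selection (a hypothetical helper, not part of any actual driver API), preferring a candidate on the client's own network:

```python
import ipaddress

def pick_rpc_address(candidates, client_network):
    """Prefer a candidate address on the client's own network; otherwise
    fall back to the first advertised address."""
    net = ipaddress.ip_network(client_network)
    for addr in candidates:
        if ipaddress.ip_address(addr) in net:
            return addr
    return candidates[0]

# A node bound to 0.0.0.0 might advertise both its interface addresses:
candidates = ["10.0.0.5", "192.168.1.5"]
print(pick_rpc_address(candidates, "192.168.1.0/24"))  # prints 192.168.1.5
print(pick_rpc_address(candidates, "172.16.0.0/12"))   # prints 10.0.0.5
```

This is exactly the "leave the task to the driver" part of the proposal: with a list in hand, even a crude heuristic beats receiving a single, possibly unreachable, listen_address.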



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (CASSANDRA-6480) Custom secondary index options in CQL3

2013-12-13 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847491#comment-13847491
 ] 

Andrés de la Peña commented on CASSANDRA-6480:
--

Perfect, then... what is our opinion about options in custom 2i?
Does everybody agree that CQL3 shouldn't hide this useful feature of custom 
2i? :)

 Custom secondary index options in CQL3
 --

 Key: CASSANDRA-6480
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6480
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Andrés de la Peña
Priority: Minor
  Labels: cql3, index

 The CQL3 create index statement syntax does not allow specifying the 
 options map used internally by custom indexes. 





[jira] [Commented] (CASSANDRA-6480) Custom secondary index options in CQL3

2013-12-13 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847493#comment-13847493
 ] 

Aleksey Yeschenko commented on CASSANDRA-6480:
--

It's not about hiding, it's about YAGNI. If you attach a patch, we'll review 
it. If you don't, this will have to wait until other, more important things 
are resolved.

 Custom secondary index options in CQL3
 --

 Key: CASSANDRA-6480
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6480
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Andrés de la Peña
Priority: Minor
  Labels: cql3, index

 The CQL3 create index statement syntax does not allow specifying the 
 options map used internally by custom indexes. 





[jira] [Resolved] (CASSANDRA-5414) Unspecified initial_token with ByteOrderedPartitioner results in java.lang.NumberFormatException: Non-hex characters

2013-12-13 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-5414.
---

Resolution: Won't Fix

Pas didn't supply the invalid token; the existing cluster did when it tried to 
bisect an existing token range on bootstrap.

Wontfixing since that code is already gone in 2.0 (CASSANDRA-5518).

 Unspecified initial_token with ByteOrderedPartitioner results in 
 java.lang.NumberFormatException: Non-hex characters
 

 Key: CASSANDRA-5414
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5414
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.2.3
 Environment: Oracle JDK, java 1.7.0_07, amd64 (x86_64), on a Debian 
 Squeeze, no virtual nodes, nothing fancy.
Reporter: Pas
Assignee: Lyuben Todorov
Priority: Minor

 Using one seed, after adding the third node, the fresh node chooses an 
 illegal token and can't bootstrap itself, so I had to specify initial_token 
 on each host manually. (Which, as far as I know, is recommended anyway for 
 non-random partitioners.)
 java.lang.NumberFormatException: Non-hex characters in 
 ^@^V73797374656d5f61757468
 at org.apache.cassandra.utils.Hex.hexToBytes(Hex.java:60)
 at 
 org.apache.cassandra.dht.AbstractByteOrderedPartitioner$1.fromString(AbstractByteOrderedPartitioner.java:167)
 at 
 org.apache.cassandra.dht.BootStrapper$BootstrapTokenCallback.response(BootStrapper.java:230)
 at 
 org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:45)
 at 
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:722)





[jira] [Commented] (CASSANDRA-5839) Save repair data to system table

2013-12-13 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847532#comment-13847532
 ] 

Jonathan Ellis commented on CASSANDRA-5839:
---

bq. not a fan of system_global as a name

{{system_distributed}}?

 Save repair data to system table
 

 Key: CASSANDRA-5839
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5839
 Project: Cassandra
  Issue Type: New Feature
  Components: Core, Tools
Reporter: Jonathan Ellis
Assignee: Jimmy Mårdell
Priority: Minor
 Fix For: 2.0.4

 Attachments: 2.0.4-5839-draft.patch


 As noted in CASSANDRA-2405, it would be useful to store repair results, 
 particularly with sub-range repair available (CASSANDRA-5280).





[jira] [Commented] (CASSANDRA-5899) Sends all interface in native protocol notification when rpc_address=0.0.0.0

2013-12-13 Thread Russell Bradberry (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847535#comment-13847535
 ] 

Russell Bradberry commented on CASSANDRA-5899:
--

{quote}
 I am unsure how any single value for broadcast_rpc_address would be useful
{quote}

A very common setup is to have a CNAME address that points to the internal IP 
address when within the DC and an external IP address when outside the DC.  
Setting the broadcast address to this common CNAME would allow clients both 
internal and external to the DC to connect in the same way.
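The setup described above could look roughly like this in cassandra.yaml, assuming the proposed broadcast_rpc_address option existed (the option name is the proposal under discussion, not a shipped setting at the time of writing; the hostname is illustrative):

```yaml
# Bind the rpc/native server on every interface...
rpc_address: 0.0.0.0
# ...but advertise a single stable name to clients; a split-horizon CNAME
# resolves to the internal IP inside the DC and the external IP outside it.
broadcast_rpc_address: node1.cassandra.example.com
```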

 Sends all interface in native protocol notification when rpc_address=0.0.0.0
 

 Key: CASSANDRA-5899
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5899
 Project: Cassandra
  Issue Type: Improvement
Reporter: Sylvain Lebresne
Priority: Minor
 Fix For: 2.1


 For the native protocol notifications, when we send a new node notification, 
 we send the rpc_address of that new node. For this to be actually useful, the 
 address sent should be publicly accessible by the driver it is destined for. 
 The problem is when rpc_address=0.0.0.0. Currently, we send the 
 listen_address, which is correct in the sense that we are indeed bound to it, 
 but it might not be accessible by client nodes.
 In fact, one of the good reasons to use a 0.0.0.0 rpc_address would be if you 
 have a private network for internode communication and another for 
 client-server communications, but still want to be able to issue queries from 
 the private network for debugging. In that case, the current behavior of 
 sending listen_address doesn't really help.
 So one suggestion would be to instead send all the addresses the (native 
 protocol) server is bound to (which would still leave to the driver the task 
 of picking the right one, but at least it has something to pick from).
 That's relatively trivial to do in practice, but it does require a minor 
 binary protocol break to return a list instead of just one IP, which is why 
 I'm tentatively marking this 2.0. Maybe we can shove that tiny change into 
 the final (in protocol v2 only)? Provided we agree it's a good idea, of 
 course.
 Now to be complete, for the same reasons, we would also need to store all the 
 addresses we are bound to in the peers table. That's also fairly simple, and 
 the backward-compatibility story is maybe a tad simpler: we could add a new 
 {{rpc_addresses}} column that would be a list and deprecate {{rpc_address}} 
 (to be removed in 2.1, for instance).





[jira] [Commented] (CASSANDRA-5839) Save repair data to system table

2013-12-13 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847619#comment-13847619
 ] 

Sylvain Lebresne commented on CASSANDRA-5839:
-

bq. system_distributed?

That would sound better to my ears, yes.

 Save repair data to system table
 

 Key: CASSANDRA-5839
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5839
 Project: Cassandra
  Issue Type: New Feature
  Components: Core, Tools
Reporter: Jonathan Ellis
Assignee: Jimmy Mårdell
Priority: Minor
 Fix For: 2.0.4

 Attachments: 2.0.4-5839-draft.patch


 As noted in CASSANDRA-2405, it would be useful to store repair results, 
 particularly with sub-range repair available (CASSANDRA-5280).





[jira] [Commented] (CASSANDRA-6373) describe_ring hangs with hsha thrift server

2013-12-13 Thread Nick Bailey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847650#comment-13847650
 ] 

Nick Bailey commented on CASSANDRA-6373:


[~xedin] I've verified it's *only* the describe_ring call. Other calls seem to 
work fine. Let me know if you want access to the machine I used to reproduce.

 describe_ring hangs with hsha thrift server
 ---

 Key: CASSANDRA-6373
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6373
 Project: Cassandra
  Issue Type: Bug
Reporter: Nick Bailey
Assignee: Pavel Yaskevich
 Fix For: 2.0.4

 Attachments: describe_ring_failure.patch


 There is a strange bug with the thrift hsha server in 2.0 (we switched to 
 lmax disruptor server).
 The bug is that the first call to describe_ring from a connection will hang 
 indefinitely when the client is not connecting from localhost (or at least 
 appears not to be on the same host). Additionally, the cluster must be using 
 vnodes. When connecting from localhost, the first call works as expected, 
 and in either case subsequent calls from the same connection work as 
 expected. According to git bisect, the bad commit is the switch to the 
 lmax disruptor server:
 https://github.com/apache/cassandra/commit/98eec0a223251ecd8fec7ecc9e46b05497d631c6
 I've attached the patch I used to reproduce the error in the unit tests. The 
 command to reproduce is: 
 {noformat}
 PYTHONPATH=test nosetests 
 --tests=system.test_thrift_server:TestMutations.test_describe_ring
 {noformat}
 I reproduced on ec2 and a single machine by having the server bind to the 
 private ip on ec2 and the client connect to the public ip (so it appears as 
 if the client is non local). I've also reproduced with two different vms 
 though.





[jira] [Updated] (CASSANDRA-2238) Allow nodetool to print out hostnames given an option

2013-12-13 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2238:


Reviewer: Brandon Williams

 Allow nodetool to print out hostnames given an option
 -

 Key: CASSANDRA-2238
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2238
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Joaquin Casares
Priority: Trivial

 Give nodetool the option of either displaying IPs or hostnames for the nodes 
 in a ring.





[jira] [Commented] (CASSANDRA-2238) Allow nodetool to print out hostnames given an option

2013-12-13 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847675#comment-13847675
 ] 

Brandon Williams commented on CASSANDRA-2238:
-

LGTM but I don't see any reason to only put this in trunk (2.1). [~yaitskov] 
mind rebasing for 1.2?

 Allow nodetool to print out hostnames given an option
 -

 Key: CASSANDRA-2238
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2238
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Joaquin Casares
Priority: Trivial

 Give nodetool the option of either displaying IPs or hostnames for the nodes 
 in a ring.





[jira] [Commented] (CASSANDRA-2238) Allow nodetool to print out hostnames given an option

2013-12-13 Thread Daneel S. Yaitskov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847688#comment-13847688
 ] 

Daneel S. Yaitskov commented on CASSANDRA-2238:
---

Okay. I'll rebase it for 1.2 too.

 Allow nodetool to print out hostnames given an option
 -

 Key: CASSANDRA-2238
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2238
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Joaquin Casares
Priority: Trivial

 Give nodetool the option of either displaying IPs or hostnames for the nodes 
 in a ring.





git commit: Pig: fix duplicate schema error Patch by Alex Liu, reviewed by brandonwilliams for CASSANDRA-6309

2013-12-13 Thread brandonwilliams
Updated Branches:
  refs/heads/trunk 611f328f3 -> 343a6472d


Pig: fix duplicate schema error
Patch by Alex Liu, reviewed by brandonwilliams for CASSANDRA-6309


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/343a6472
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/343a6472
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/343a6472

Branch: refs/heads/trunk
Commit: 343a6472d8e26bc846c575c03d0af7d8b66e6dfa
Parents: 611f328
Author: Brandon Williams brandonwilli...@apache.org
Authored: Fri Dec 13 11:40:20 2013 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Fri Dec 13 11:40:20 2013 -0600

--
 build.xml   |  27 ++-
 .../apache/cassandra/db/ConsistencyLevel.java   |   2 -
 .../hadoop/pig/AbstractCassandraStorage.java|  22 +--
 .../apache/cassandra/hadoop/pig/CqlStorage.java |   3 -
 .../cassandra/pig/CqlTableDataTypeTest.java |  35 +---
 .../org/apache/cassandra/pig/CqlTableTest.java  |   9 +-
 .../org/apache/cassandra/pig/PigTestBase.java   |   7 +-
 .../pig/ThriftColumnFamilyDataTypeTest.java |  21 ---
 .../cassandra/pig/ThriftColumnFamilyTest.java   | 163 ++-
 9 files changed, 105 insertions(+), 184 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/343a6472/build.xml
--
diff --git a/build.xml b/build.xml
index 606d2e2..6e579bf 100644
--- a/build.xml
+++ b/build.xml
@@ -413,7 +413,6 @@
 <dependency groupId="org.apache.hadoop" artifactId="hadoop-core"/>
 <dependency groupId="org.apache.hadoop" artifactId="hadoop-minicluster"/>
 <dependency groupId="org.apache.pig" artifactId="pig"/>
-
 <dependency groupId="net.java.dev.jna" artifactId="jna"/>
 <dependency groupId="com.google.code.findbugs" artifactId="jsr305"/>
   </artifact:pom>
@@ -431,6 +430,9 @@
 <parent groupId="org.apache.cassandra"
 artifactId="cassandra-parent"
 version="${version}"/>
+<dependency groupId="joda-time" artifactId="joda-time" version="2.3"/>
+<dependency groupId="org.slf4j" artifactId="slf4j-log4j12" version="1.7.2"/>
+<dependency groupId="log4j" artifactId="log4j" version="1.2.16"/>
   </artifact:pom>
 
   <!-- now the pom's for artifacts being deployed to Maven Central -->
@@ -563,6 +565,25 @@
   </copy>
 </target>
 
+<target name="maven-ant-tasks-retrieve-pig-test" depends="maven-ant-tasks-init">
+  <artifact:dependencies pomRefId="test-deps-pom"
+ filesetId="test-dependency-jars"
+ sourcesFilesetId="test-dependency-sources"
+ cacheDependencyRefs="true"
+ dependencyRefsBuildFile="${build.dir}/test-dependencies.xml">
+<remoteRepository refid="apache"/>
+<remoteRepository refid="central"/>
+<remoteRepository refid="java.net2"/>
+  </artifact:dependencies>
+  <copy todir="${build.dir.lib}/jars">
+<fileset refid="test-dependency-jars"/>
+<mapper type="flatten"/>
+  </copy>
+  <copy todir="${build.dir.lib}/sources">
+<fileset refid="test-dependency-sources"/>
+<mapper type="flatten"/>
+  </copy>
+</target>
 
 <!--
Generate thrift code.  We have targets to build java because
@@ -995,6 +1016,7 @@
   </classpath>
   <src path="${test.unit.src}"/>
   <src path="${test.long.src}"/>
+  <src path="${test.pig.src}"/>
 </javac>
 
 <!-- Non-java resources needed by the test suite -->
@@ -1132,7 +1154,7 @@
 </testmacro>
   </target>
 
-  <target name="pig-test" depends="build-test" description="Excute Pig tests">
+  <target name="pig-test" depends="build-test,maven-ant-tasks-retrieve-pig-test" description="Excute Pig tests">
 <testmacro suitename="pig" inputdir="${test.pig.src}" timeout="120">
 </testmacro>
@@ -1248,6 +1270,7 @@
   <classpathentry kind="src" path="interface/thrift/gen-java"/>
   <classpathentry kind="src" path="test/unit"/>
   <classpathentry kind="src" path="test/long"/>
+  <classpathentry kind="src" path="test/pig"/>
   <classpathentry kind="src" path="tools/stress/src"/>
   <classpathentry kind="con" path="org.eclipse.jdt.launching.JRE_CONTAINER"/>
   <classpathentry kind="output" path="build/classes/main"/>

http://git-wip-us.apache.org/repos/asf/cassandra/blob/343a6472/src/java/org/apache/cassandra/db/ConsistencyLevel.java
--
diff --git a/src/java/org/apache/cassandra/db/ConsistencyLevel.java 
b/src/java/org/apache/cassandra/db/ConsistencyLevel.java
index cbb4bb1..0f6aba7 100644
--- a/src/java/org/apache/cassandra/db/ConsistencyLevel.java
+++ b/src/java/org/apache/cassandra/db/ConsistencyLevel.java
@@ -285,9 +285,7 @@ public enum ConsistencyLevel
 {
 switch (this)
 

[jira] [Updated] (CASSANDRA-6309) Pig CqlStorage generates ERROR 1108: Duplicate schema alias

2013-12-13 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-6309:


Fix Version/s: 2.0.4
   1.2.13

 Pig CqlStorage generates  ERROR 1108: Duplicate schema alias
 

 Key: CASSANDRA-6309
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6309
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Reporter: Thunder Stumpges
Assignee: Alex Liu
 Fix For: 1.2.13, 2.0.4

 Attachments: 6309-2.0.txt, 6309-fix-pig-test-compiling.txt, 
 6309-trunk-branch.txt, 6309-v2-2.0-branch.txt, 6309-v3.txt, 
 LOCAL_ONE-write-for-all-strategies-v2.txt, 
 LOCAL_ONE-write-for-all-strategies.txt


 In Pig after loading a simple CQL3 table from Cassandra 2.0.1, and dumping 
 contents, I receive:
 Caused by: org.apache.pig.impl.plan.PlanValidationException: ERROR 1108: 
 Duplicate schema alias: author in cm
  cm = load 'cql://thunder_test/cassandra_messages' USING CqlStorage;
  dump cm
 ERROR org.apache.pig.tools.grunt.Grunt - 
 org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to 
 open iterator for alias cm
 ...
 Caused by: org.apache.pig.impl.plan.PlanValidationException: ERROR 1108: 
 Duplicate schema alias: author in cm
 at 
 org.apache.pig.newplan.logical.visitor.SchemaAliasVisitor.validate(SchemaAliasVisitor.java:75)
 running 'describe cm' gives:
 cm: {message_id: chararray,author: chararray,author: chararray,body: 
 chararray,message_id: chararray}
 The original table schema in Cassandra is:
 CREATE TABLE cassandra_messages (
   message_id text,
   author text,
   body text,
   PRIMARY KEY (message_id, author)
 ) WITH
   bloom_filter_fp_chance=0.01 AND
   caching='KEYS_ONLY' AND
   comment='null' AND
   dclocal_read_repair_chance=0.00 AND
   gc_grace_seconds=864000 AND
   index_interval=128 AND
   read_repair_chance=0.10 AND
   replicate_on_write='true' AND
   populate_io_cache_on_flush='false' AND
   default_time_to_live=0 AND
   speculative_retry='NONE' AND
   memtable_flush_period_in_ms=0 AND
   compaction={'class': 'SizeTieredCompactionStrategy'} AND
   compression={'sstable_compression': 'LZ4Compressor'};
 It appears that the code in CqlStorage.getColumnMetadata at ~line 478 takes 
 the key columns (in my case, message_id and author) and appends the 
 columns from getColumnMeta (which has all three columns). Thus the key 
 columns are duplicated.
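 The fix amounts to deduplicating the merged column list by name while keeping 
 the key columns first. A minimal sketch of that merge (hypothetical function 
 name, not CqlStorage's actual code):

```python
def merge_column_names(key_columns, all_columns):
    """Append non-key columns to the key columns, skipping any name
    already present so no alias appears twice in the schema."""
    seen = set(key_columns)
    merged = list(key_columns)
    for name in all_columns:
        if name not in seen:
            seen.add(name)
            merged.append(name)
    return merged

# Reproduces the report: keys (message_id, author) plus all three columns.
print(merge_column_names(["message_id", "author"],
                         ["message_id", "author", "body"]))
# prints ['message_id', 'author', 'body']
```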





[jira] [Commented] (CASSANDRA-6158) Nodetool command to purge hints

2013-12-13 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847706#comment-13847706
 ] 

Brandon Williams commented on CASSANDRA-6158:
-

Looks good, except we SHOULD block the caller, as with all other JMX/nodetool 
commands.

 Nodetool command to purge hints
 ---

 Key: CASSANDRA-6158
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6158
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: sankalp kohli
Assignee: sankalp kohli
Priority: Minor
 Attachments: trunk-6158.txt


 The only way to truncate all hints in Cassandra is to truncate the hints CF 
 in the system table. 
 It would be cleaner to have a nodetool command for it. The ability to 
 selectively remove hints by host or DC would also be nice, rather than 
 removing all of the hints. 





[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination

2013-12-13 Thread Alex Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847714#comment-13847714
 ] 

Alex Liu commented on CASSANDRA-6311:
-

I am updating the patch

 Add CqlRecordReader to take advantage of native CQL pagination
 --

 Key: CASSANDRA-6311
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6311
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Alex Liu
Assignee: Alex Liu
 Fix For: 2.0.4

 Attachments: 6311-v3-2.0-branch.txt, 6311-v4.txt, 
 6331-2.0-branch.txt, 6331-v2-2.0-branch.txt


 Since the latest CQL pagination is done and should be more efficient, we 
 need to update CqlPagingRecordReader to use it instead of the custom thrift 
 paging.





[6/6] git commit: Merge branch 'cassandra-2.0' into trunk

2013-12-13 Thread brandonwilliams
Merge branch 'cassandra-2.0' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6fdff70f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6fdff70f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6fdff70f

Branch: refs/heads/trunk
Commit: 6fdff70f48e3c1142200afc527359ac1b402fb21
Parents: 343a647 54c1ed3
Author: Brandon Williams brandonwilli...@apache.org
Authored: Fri Dec 13 12:12:50 2013 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Fri Dec 13 12:12:50 2013 -0600

--
 .../org/apache/cassandra/hadoop/pig/CassandraStorage.java| 8 
 1 file changed, 4 insertions(+), 4 deletions(-)
--




[2/6] git commit: Pig: don't assume all DataBags are DefaultDataBags Patch by Mike Spertus, reviewed by brandonwilliams for CASSANDRA-6420

2013-12-13 Thread brandonwilliams
Pig: don't assume all DataBags are DefaultDataBags
Patch by Mike Spertus, reviewed by brandonwilliams for CASSANDRA-6420


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/11455738
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/11455738
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/11455738

Branch: refs/heads/cassandra-2.0
Commit: 11455738fa61c6eb02895a5a8d3fbbe4d8cb24b4
Parents: f7c9144
Author: Brandon Williams brandonwilli...@apache.org
Authored: Fri Dec 13 12:10:47 2013 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Fri Dec 13 12:10:47 2013 -0600

--
 .../org/apache/cassandra/hadoop/pig/CassandraStorage.java| 8 
 1 file changed, 4 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/11455738/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
--
diff --git a/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java 
b/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
index 853a052..89ce7b4 100644
--- a/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
+++ b/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
@@ -517,7 +517,7 @@ public class CassandraStorage extends 
AbstractCassandraStorage
 {
 if (t.size() > 2)
 throw new IOException("No arguments allowed after bag");
-writeColumnsFromBag(key, (DefaultDataBag) t.get(1));
+writeColumnsFromBag(key, (DataBag) t.get(1));
 }
 else
 throw new IOException("Second argument in output must be a tuple or bag");
@@ -530,7 +530,7 @@ public class CassandraStorage extends 
AbstractCassandraStorage
 for (int i = offset; i < t.size(); i++)
 {
 if (t.getType(i) == DataType.BAG)
-writeColumnsFromBag(key, (DefaultDataBag) t.get(i));
+writeColumnsFromBag(key, (DataBag) t.get(i));
 else if (t.getType(i) == DataType.TUPLE)
 {
 Tuple inner = (Tuple) t.get(i);
@@ -576,7 +576,7 @@ public class CassandraStorage extends 
AbstractCassandraStorage
 }
 
 /** write bag data to Cassandra */
-private void writeColumnsFromBag(ByteBuffer key, DefaultDataBag bag) 
throws IOException
+private void writeColumnsFromBag(ByteBuffer key, DataBag bag) throws 
IOException
 {
 List<Mutation> mutationList = new ArrayList<Mutation>();
 for (Tuple pair : bag)
@@ -587,7 +587,7 @@ public class CassandraStorage extends 
AbstractCassandraStorage
 SuperColumn sc = new SuperColumn();
 sc.setName(objToBB(pair.get(0)));
 List<org.apache.cassandra.thrift.Column> columns = new ArrayList<org.apache.cassandra.thrift.Column>();
-for (Tuple subcol : (DefaultDataBag) pair.get(1))
+for (Tuple subcol : (DataBag) pair.get(1))
 {
 org.apache.cassandra.thrift.Column column = new 
org.apache.cassandra.thrift.Column();
 column.setName(objToBB(subcol.get(0)));
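The patch above replaces casts to the concrete DefaultDataBag with the DataBag interface, so any bag implementation Pig hands over is accepted. The same program-to-the-interface principle, sketched in Python with stand-in types (illustrative only, not Pig's API):

```python
from collections.abc import Iterable

def write_columns_from_bag(bag: Iterable) -> list:
    """Accept any iterable of (name, value) pairs rather than one concrete
    container class, so lists, tuples, and generators all work."""
    return [f"{name}={value}" for name, value in bag]

print(write_columns_from_bag([("author", "pas"), ("body", "hi")]))  # from a list
print(write_columns_from_bag(p for p in [("id", 1)]))               # from a generator
```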



[5/6] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0

2013-12-13 Thread brandonwilliams
Merge branch 'cassandra-1.2' into cassandra-2.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/54c1ed36
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/54c1ed36
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/54c1ed36

Branch: refs/heads/cassandra-2.0
Commit: 54c1ed360c0a8f31bbaa56838525896ecef44886
Parents: fb5808d 1145573
Author: Brandon Williams brandonwilli...@apache.org
Authored: Fri Dec 13 12:12:41 2013 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Fri Dec 13 12:12:41 2013 -0600

--
 .../org/apache/cassandra/hadoop/pig/CassandraStorage.java| 8 
 1 file changed, 4 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/54c1ed36/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
--



[1/6] git commit: Pig: don't assume all DataBags are DefaultDataBags Patch by Mike Spertus, reviewed by brandonwilliams for CASSANDRA-6420

2013-12-13 Thread brandonwilliams
Updated Branches:
  refs/heads/cassandra-1.2 f7c914485 -> 11455738f
  refs/heads/cassandra-2.0 fb5808d43 -> 54c1ed360
  refs/heads/trunk 343a6472d -> 6fdff70f4


Pig: don't assume all DataBags are DefaultDataBags
Patch by Mike Spertus, reviewed by brandonwilliams for CASSANDRA-6420


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/11455738
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/11455738
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/11455738

Branch: refs/heads/cassandra-1.2
Commit: 11455738fa61c6eb02895a5a8d3fbbe4d8cb24b4
Parents: f7c9144
Author: Brandon Williams brandonwilli...@apache.org
Authored: Fri Dec 13 12:10:47 2013 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Fri Dec 13 12:10:47 2013 -0600

--
 .../org/apache/cassandra/hadoop/pig/CassandraStorage.java| 8 
 1 file changed, 4 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/11455738/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
--
diff --git a/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java 
b/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
index 853a052..89ce7b4 100644
--- a/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
+++ b/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
@@ -517,7 +517,7 @@ public class CassandraStorage extends 
AbstractCassandraStorage
 {
 if (t.size() > 2)
 throw new IOException("No arguments allowed after bag");
-writeColumnsFromBag(key, (DefaultDataBag) t.get(1));
+writeColumnsFromBag(key, (DataBag) t.get(1));
 }
 else
 throw new IOException("Second argument in output must be a tuple or bag");
@@ -530,7 +530,7 @@ public class CassandraStorage extends 
AbstractCassandraStorage
 for (int i = offset; i < t.size(); i++)
 {
 if (t.getType(i) == DataType.BAG)
-writeColumnsFromBag(key, (DefaultDataBag) t.get(i));
+writeColumnsFromBag(key, (DataBag) t.get(i));
 else if (t.getType(i) == DataType.TUPLE)
 {
 Tuple inner = (Tuple) t.get(i);
@@ -576,7 +576,7 @@ public class CassandraStorage extends 
AbstractCassandraStorage
 }
 
 /** write bag data to Cassandra */
-private void writeColumnsFromBag(ByteBuffer key, DefaultDataBag bag) 
throws IOException
+private void writeColumnsFromBag(ByteBuffer key, DataBag bag) throws 
IOException
 {
 List<Mutation> mutationList = new ArrayList<Mutation>();
 for (Tuple pair : bag)
@@ -587,7 +587,7 @@ public class CassandraStorage extends 
AbstractCassandraStorage
 SuperColumn sc = new SuperColumn();
 sc.setName(objToBB(pair.get(0)));
 List<org.apache.cassandra.thrift.Column> columns = new ArrayList<org.apache.cassandra.thrift.Column>();
-for (Tuple subcol : (DefaultDataBag) pair.get(1))
+for (Tuple subcol : (DataBag) pair.get(1))
 {
 org.apache.cassandra.thrift.Column column = new 
org.apache.cassandra.thrift.Column();
 column.setName(objToBB(subcol.get(0)));



[3/6] git commit: Pig: don't assume all DataBags are DefaultDataBags Patch by Mike Spertus, reviewed by brandonwilliams for CASSANDRA-6420

2013-12-13 Thread brandonwilliams
Pig: don't assume all DataBags are DefaultDataBags
Patch by Mike Spertus, reviewed by brandonwilliams for CASSANDRA-6420


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/11455738
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/11455738
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/11455738

Branch: refs/heads/trunk
Commit: 11455738fa61c6eb02895a5a8d3fbbe4d8cb24b4
Parents: f7c9144
Author: Brandon Williams brandonwilli...@apache.org
Authored: Fri Dec 13 12:10:47 2013 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Fri Dec 13 12:10:47 2013 -0600

--
 .../org/apache/cassandra/hadoop/pig/CassandraStorage.java| 8 
 1 file changed, 4 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/11455738/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
--
diff --git a/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java 
b/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
index 853a052..89ce7b4 100644
--- a/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
+++ b/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
@@ -517,7 +517,7 @@ public class CassandraStorage extends 
AbstractCassandraStorage
 {
 if (t.size() > 2)
 throw new IOException("No arguments allowed after bag");
-writeColumnsFromBag(key, (DefaultDataBag) t.get(1));
+writeColumnsFromBag(key, (DataBag) t.get(1));
 }
 else
 throw new IOException("Second argument in output must be a tuple or bag");
@@ -530,7 +530,7 @@ public class CassandraStorage extends 
AbstractCassandraStorage
 for (int i = offset; i < t.size(); i++)
 {
 if (t.getType(i) == DataType.BAG)
-writeColumnsFromBag(key, (DefaultDataBag) t.get(i));
+writeColumnsFromBag(key, (DataBag) t.get(i));
 else if (t.getType(i) == DataType.TUPLE)
 {
 Tuple inner = (Tuple) t.get(i);
@@ -576,7 +576,7 @@ public class CassandraStorage extends 
AbstractCassandraStorage
 }
 
 /** write bag data to Cassandra */
-private void writeColumnsFromBag(ByteBuffer key, DefaultDataBag bag) 
throws IOException
+private void writeColumnsFromBag(ByteBuffer key, DataBag bag) throws 
IOException
 {
 List<Mutation> mutationList = new ArrayList<Mutation>();
 for (Tuple pair : bag)
@@ -587,7 +587,7 @@ public class CassandraStorage extends 
AbstractCassandraStorage
 SuperColumn sc = new SuperColumn();
 sc.setName(objToBB(pair.get(0)));
 List<org.apache.cassandra.thrift.Column> columns = new 
ArrayList<org.apache.cassandra.thrift.Column>();
-for (Tuple subcol : (DefaultDataBag) pair.get(1))
+for (Tuple subcol : (DataBag) pair.get(1))
 {
 org.apache.cassandra.thrift.Column column = new 
org.apache.cassandra.thrift.Column();
 column.setName(objToBB(subcol.get(0)));
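The patch above replaces casts to the concrete {{DefaultDataBag}} with the {{DataBag}} interface, so bags produced by other Pig implementations no longer throw ClassCastException. A minimal, self-contained sketch of the same program-to-the-interface principle (class and method names here are illustrative, not Pig's API):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class InterfaceCastSketch {
    // Analogue of Pig's DataBag interface: any iterable collection of tuples.
    interface Bag extends Iterable<String> {}

    // One concrete implementation, analogous to DefaultDataBag.
    static class DefaultBag implements Bag {
        private final List<String> items = new ArrayList<>();
        DefaultBag(String... xs) { Collections.addAll(items, xs); }
        public Iterator<String> iterator() { return items.iterator(); }
    }

    // A different implementation; a cast to DefaultBag would fail here.
    static class SingleItemBag implements Bag {
        private final String item;
        SingleItemBag(String item) { this.item = item; }
        public Iterator<String> iterator() { return Collections.singletonList(item).iterator(); }
    }

    // Like the patched writeColumnsFromBag: accept the interface, not one implementation.
    static int countItems(Bag bag) {
        int n = 0;
        for (String ignored : bag) n++;
        return n;
    }

    public static void main(String[] args) {
        System.out.println(countItems(new DefaultBag("a", "b")));  // 2
        System.out.println(countItems(new SingleItemBag("c")));    // 1
    }
}
```

Both implementations work through the same method, which is exactly what the cast to {{DataBag}} buys the storage handler.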



[jira] [Updated] (CASSANDRA-4288) prevent thrift server from starting before gossip has settled

2013-12-13 Thread Chris Burroughs (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Burroughs updated CASSANDRA-4288:
---

Attachment: j4288-1.2-v2-txt

v2 attached.

From our (super small) sample size: with 1.2.13, gossip w/vnodes settles in 
less than 5s on ~32 node clusters, but not on > 64 node clusters.

 prevent thrift server from starting before gossip has settled
 -

 Key: CASSANDRA-4288
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4288
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Peter Schuller
Assignee: Chris Burroughs
 Fix For: 2.0.4

 Attachments: CASSANDRA-4288-trunk.txt, j4288-1.2-v1-txt, 
 j4288-1.2-v2-txt


 A serious problem is that there is no co-ordination whatsoever between gossip 
 and the consumers of gossip. In particular, on a large cluster with hundreds 
 of nodes, it takes several seconds for gossip to settle because the gossip 
 stage is CPU bound. This leads to a node starting up and accessing thrift 
 traffic long before it has any clue of what's up and down. This leads to 
 client-visible timeouts (for nodes that are down but not identified as such) 
 and UnavailableException (for nodes that are up but not yet identified as 
 such). This is really bad in general, but in particular for clients doing 
 non-idempotent writes (counter increments).
 I was going to fix this as part of more significant re-writing in other 
 tickets having to do with gossip/topology/etc, but that's not going to 
 happen. So, the attached patch is roughly what we're running with in 
 production now to make restarts bearable. The minimum wait time is both for 
 ensuring that gossip has time to start becoming CPU bound if it will be, and 
 the reason it's large is to allow for down nodes to be identified as such in 
 most typical cases with a default phi conviction threshold (untested, we 
 actually ran with a smaller number of 5 seconds minimum, but from past 
 experience I believe 15 seconds is enough).
 The patch is tested on our 1.1 branch. It applies on trunk, and the diff is 
 against trunk, but I have not tested it against trunk.
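The approach described above — hold the thrift server back until gossip activity has quieted down, with a minimum wait and a give-up deadline — can be sketched as a polling loop. The gauge, interval, and thresholds below are illustrative assumptions, not the actual patch:

```java
import java.util.function.IntSupplier;

public class GossipSettleSketch {
    /**
     * Block until the gossip pending-task count has stayed at zero for
     * requiredStablePolls consecutive polls, or until maxWaitMs elapses.
     * pendingTasks is a stand-in for the real gossip-stage metric.
     * Returns true if gossip settled, false if we gave up waiting.
     */
    static boolean waitForGossipToSettle(IntSupplier pendingTasks,
                                         long pollIntervalMs,
                                         int requiredStablePolls,
                                         long maxWaitMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + maxWaitMs;
        int stablePolls = 0;
        while (System.currentTimeMillis() < deadline) {
            if (pendingTasks.getAsInt() == 0)
                stablePolls++;                    // another quiet poll
            else
                stablePolls = 0;                  // activity resumed; start counting over
            if (stablePolls >= requiredStablePolls)
                return true;                      // settled: safe to start thrift
            Thread.sleep(pollIntervalMs);
        }
        return false;                             // gave up; caller may start anyway and warn
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated gossip stage: busy for the first three polls, then idle.
        int[] calls = {5, 3, 1, 0, 0, 0, 0};
        final int[] i = {0};
        boolean settled = waitForGossipToSettle(
                () -> calls[Math.min(i[0]++, calls.length - 1)], 1, 3, 1000);
        System.out.println(settled);   // true
    }
}
```

Resetting the stable-poll counter whenever activity resumes is what makes the loop wait out the CPU-bound gossip burst rather than just sleeping a fixed time.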



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination

2013-12-13 Thread Alex Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Liu updated CASSANDRA-6311:


Attachment: 6311-v5-2.0-branch.txt

V5 patch is attached. It updates the C* Java driver to 2.0.0-rc2, which 
supports LOCAL_ONE.

 Add CqlRecordReader to take advantage of native CQL pagination
 --

 Key: CASSANDRA-6311
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6311
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Alex Liu
Assignee: Alex Liu
 Fix For: 2.0.4

 Attachments: 6311-v3-2.0-branch.txt, 6311-v4.txt, 
 6311-v5-2.0-branch.txt, 6331-2.0-branch.txt, 6331-v2-2.0-branch.txt


 Since the latest CQL pagination is done and should be more efficient, we 
 need to update CqlPagingRecordReader to use it instead of the custom thrift 
 paging.
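Native-protocol paging lets a record reader simply iterate rows while pages are fetched lazily underneath, instead of tracking tokens and column slices by hand as the thrift-based reader did. A hedged, self-contained analogue of that pattern (the fetch function stands in for the driver's page requests; none of these names are Cassandra APIs):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.BiFunction;

public class PagingIteratorSketch {
    /**
     * Iterator that pulls rows in fixed-size pages from a fetch function.
     * fetchPage(offset, limit) returns up to limit rows starting at offset;
     * a short page signals the end of the data.
     */
    static class PagingIterator<T> implements Iterator<T> {
        private final BiFunction<Integer, Integer, List<T>> fetchPage;
        private final int pageSize;
        private int offset = 0;
        private Iterator<T> current = Collections.emptyIterator();
        private boolean exhausted = false;

        PagingIterator(BiFunction<Integer, Integer, List<T>> fetchPage, int pageSize) {
            this.fetchPage = fetchPage;
            this.pageSize = pageSize;
        }

        public boolean hasNext() {
            while (!current.hasNext() && !exhausted) {
                List<T> page = fetchPage.apply(offset, pageSize);  // fetch next page lazily
                offset += page.size();
                exhausted = page.size() < pageSize;                // short page: no more data
                current = page.iterator();
            }
            return current.hasNext();
        }

        public T next() {
            if (!hasNext()) throw new NoSuchElementException();
            return current.next();
        }
    }

    public static void main(String[] args) {
        // Simulated table of 10 rows, fetched 4 at a time.
        List<Integer> table = new ArrayList<>();
        for (int i = 0; i < 10; i++) table.add(i);
        PagingIterator<Integer> it = new PagingIterator<>(
                (off, lim) -> table.subList(off, Math.min(off + lim, table.size())), 4);
        int count = 0;
        while (it.hasNext()) { it.next(); count++; }
        System.out.println(count);   // 10
    }
}
```

The caller never sees page boundaries, which is the property a CqlRecordReader built on native paging gets for free.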



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Created] (CASSANDRA-6485) NPE in calculateNaturalEndpoints

2013-12-13 Thread Russell Alexander Spitzer (JIRA)
Russell Alexander Spitzer created CASSANDRA-6485:


 Summary: NPE in calculateNaturalEndpoints
 Key: CASSANDRA-6485
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6485
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Russell Alexander Spitzer


I was running a test where I added a new data center to an existing cluster. 

Test outline:
Start 25 Node DC1
Keyspace Setup Replication 3
Begin insert against DC1 Using Stress
While the inserts are occurring
Start up 25 Node DC2
Alter Keyspace to include Replication in 2nd DC
Run rebuild on DC2
Wait for stress to finish
Run repair on Cluster
... Some other operations

Although there are no issues with smaller clusters or clusters without vnodes, 
larger setups with vnodes consistently see the following exception in 
the logs, as well as a write operation failing for each exception. 

The exceptions/failures are occurring when DC2 is brought online but *before* 
any alteration of the Keyspace. All of the exceptions are happening on DC1 
nodes. One of the exceptions occurred on a seed node, though this doesn't seem 
to be the case most of the time. 

While the test was running, nodetool was run every second to get cluster 
status. At no time did any nodes report themselves as down. 


{code}
ystem_logs-107.21.186.208/system.log-ERROR [Thrift:1] 2013-12-13 06:19:52,647 
CustomTThreadPoolServer.java (line 217) Error occurred during processing of 
message.
system_logs-107.21.186.208/system.log:java.lang.NullPointerException
system_logs-107.21.186.208/system.log-  at 
org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:128)
system_logs-107.21.186.208/system.log-  at 
org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:2624)
system_logs-107.21.186.208/system.log-  at 
org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:375)
system_logs-107.21.186.208/system.log-  at 
org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:190)
system_logs-107.21.186.208/system.log-  at 
org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:866)
system_logs-107.21.186.208/system.log-  at 
org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:849)
system_logs-107.21.186.208/system.log-  at 
org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:749)
system_logs-107.21.186.208/system.log-  at 
org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3690)
system_logs-107.21.186.208/system.log-  at 
org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3678)
system_logs-107.21.186.208/system.log-  at 
org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
system_logs-107.21.186.208/system.log-  at 
org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
system_logs-107.21.186.208/system.log-  at 
org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:199)
system_logs-107.21.186.208/system.log-  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
system_logs-107.21.186.208/system.log-  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
system_logs-107.21.186.208/system.log-  at java.lang.Thread.run(Thread.java:724)
{code}



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (CASSANDRA-6485) NPE in calculateNaturalEndpoints

2013-12-13 Thread Russell Alexander Spitzer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Russell Alexander Spitzer updated CASSANDRA-6485:
-

Description: 
I was running a test where I added a new data center to an existing cluster. 

Test outline:
Start 25 Node DC1
Keyspace Setup Replication 3
Begin insert against DC1 Using Stress
While the inserts are occurring
Start up 25 Node DC2
Alter Keyspace to include Replication in 2nd DC
Run rebuild on DC2
Wait for stress to finish
Run repair on Cluster
... Some other operations

Although there are no issues with smaller clusters or clusters without vnodes, 
larger setups with vnodes consistently see the following exception in 
the logs, as well as a write operation failing for each exception. Usually this 
happens between 1-8 times during an experiment. 

The exceptions/failures are occurring when DC2 is brought online but *before* 
any alteration of the Keyspace. All of the exceptions are happening on DC1 
nodes. One of the exceptions occurred on a seed node, though this doesn't seem 

While the test was running, nodetool was run every second to get cluster 
status. At no time did any nodes report themselves as down. 


{code}
ystem_logs-107.21.186.208/system.log-ERROR [Thrift:1] 2013-12-13 06:19:52,647 
CustomTThreadPoolServer.java (line 217) Error occurred during processing of 
message.
system_logs-107.21.186.208/system.log:java.lang.NullPointerException
system_logs-107.21.186.208/system.log-  at 
org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:128)
system_logs-107.21.186.208/system.log-  at 
org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:2624)
system_logs-107.21.186.208/system.log-  at 
org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:375)
system_logs-107.21.186.208/system.log-  at 
org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:190)
system_logs-107.21.186.208/system.log-  at 
org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:866)
system_logs-107.21.186.208/system.log-  at 
org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:849)
system_logs-107.21.186.208/system.log-  at 
org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:749)
system_logs-107.21.186.208/system.log-  at 
org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3690)
system_logs-107.21.186.208/system.log-  at 
org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3678)
system_logs-107.21.186.208/system.log-  at 
org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
system_logs-107.21.186.208/system.log-  at 
org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
system_logs-107.21.186.208/system.log-  at 
org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:199)
system_logs-107.21.186.208/system.log-  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
system_logs-107.21.186.208/system.log-  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
system_logs-107.21.186.208/system.log-  at java.lang.Thread.run(Thread.java:724)
{code}

  was:
I was running a test where I added a new data center to an existing cluster. 

Test outline:
Start 25 Node DC1
Keyspace Setup Replication 3
Begin insert against DC1 Using Stress
While the inserts are occurring
Start up 25 Node DC2
Alter Keyspace to include Replication in 2nd DC
Run rebuild on DC2
Wait for stress to finish
Run repair on Cluster
... Some other operations

Although there are no issues with smaller clusters or clusters without vnodes, 
larger setups with vnodes consistently see the following exception in 
the logs, as well as a write operation failing for each exception. 

The exceptions/failures are occurring when DC2 is brought online but *before* 
any alteration of the Keyspace. All of the exceptions are happening on DC1 
nodes. One of the exceptions occurred on a seed node, though this doesn't seem 
to be the case most of the time. 

While the test was running, nodetool was run every second to get cluster 
status. At no time did any nodes report themselves as down. 


{code}
ystem_logs-107.21.186.208/system.log-ERROR [Thrift:1] 2013-12-13 06:19:52,647 
CustomTThreadPoolServer.java (line 217) Error occurred during processing of 
message.
system_logs-107.21.186.208/system.log:java.lang.NullPointerException
system_logs-107.21.186.208/system.log-  at 
org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:128)
system_logs-107.21.186.208/system.log-  at 
org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:2624)

[jira] [Updated] (CASSANDRA-6485) NPE in calculateNaturalEndpoints

2013-12-13 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-6485:
--

Attachment: 6485.txt

 NPE in calculateNaturalEndpoints
 

 Key: CASSANDRA-6485
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6485
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Russell Alexander Spitzer
 Fix For: 1.2.13

 Attachments: 6485.txt


 I was running a test where I added a new data center to an existing cluster. 
 Test outline:
 Start 25 Node DC1
 Keyspace Setup Replication 3
 Begin insert against DC1 Using Stress
 While the inserts are occurring
 Start up 25 Node DC2
 Alter Keyspace to include Replication in 2nd DC
 Run rebuild on DC2
 Wait for stress to finish
 Run repair on Cluster
 ... Some other operations
 Although there are no issues with smaller clusters or clusters without 
 vnodes, larger setups with vnodes consistently see the following 
 exception in the logs, as well as a write operation failing for each 
 exception. Usually this happens between 1-8 times during an experiment. 
 The exceptions/failures are occurring when DC2 is brought online but *before* 
 any alteration of the Keyspace. All of the exceptions are happening on DC1 
 nodes. One of the exceptions occurred on a seed node, though this doesn't seem 
 to be the case most of the time. 
 While the test was running, nodetool was run every second to get cluster 
 status. At no time did any nodes report themselves as down. 
 {code}
 ystem_logs-107.21.186.208/system.log-ERROR [Thrift:1] 2013-12-13 06:19:52,647 
 CustomTThreadPoolServer.java (line 217) Error occurred during processing of 
 message.
 system_logs-107.21.186.208/system.log:java.lang.NullPointerException
 system_logs-107.21.186.208/system.log-at 
 org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:128)
 system_logs-107.21.186.208/system.log-at 
 org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:2624)
 system_logs-107.21.186.208/system.log-at 
 org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:375)
 system_logs-107.21.186.208/system.log-at 
 org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:190)
 system_logs-107.21.186.208/system.log-at 
 org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:866)
 system_logs-107.21.186.208/system.log-at 
 org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:849)
 system_logs-107.21.186.208/system.log-at 
 org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:749)
 system_logs-107.21.186.208/system.log-at 
 org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3690)
 system_logs-107.21.186.208/system.log-at 
 org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3678)
 system_logs-107.21.186.208/system.log-at 
 org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
 system_logs-107.21.186.208/system.log-at 
 org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
 system_logs-107.21.186.208/system.log-at 
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:199)
 system_logs-107.21.186.208/system.log-at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 system_logs-107.21.186.208/system.log-at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 system_logs-107.21.186.208/system.log-at 
 java.lang.Thread.run(Thread.java:724)
 {code}



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


git commit: rm dead code

2013-12-13 Thread brandonwilliams
Updated Branches:
  refs/heads/trunk 6fdff70f4 - 84d85ee70


rm dead code


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/84d85ee7
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/84d85ee7
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/84d85ee7

Branch: refs/heads/trunk
Commit: 84d85ee70bacdf7db92d52e5c738bf7722625d83
Parents: 6fdff70
Author: Brandon Williams brandonwilli...@apache.org
Authored: Fri Dec 13 13:59:20 2013 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Fri Dec 13 13:59:20 2013 -0600

--
 .../org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java   | 1 -
 1 file changed, 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/84d85ee7/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
--
diff --git 
a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java 
b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
index b5a4c67..b0b7fe9 100644
--- a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
+++ b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
@@ -618,7 +618,6 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 cfDef.default_validation_class = 
ByteBufferUtil.string(cqlRow.columns.get(3).value);
 cfDef.key_validation_class = 
ByteBufferUtil.string(cqlRow.columns.get(4).value);
 String keyAliases = 
ByteBufferUtil.string(cqlRow.columns.get(5).value);
-List<String> keys = FBUtilities.fromJsonList(keyAliases);
 if (FBUtilities.fromJsonList(keyAliases).size() > 0)
 cql3Table = true;
 }



[jira] [Updated] (CASSANDRA-6008) Getting 'This should never happen' error at startup due to sstables missing

2013-12-13 Thread Tyler Hobbs (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tyler Hobbs updated CASSANDRA-6008:
---

Attachment: 6008-2.0-part2.patch

6008-2.0-part2.patch (and 
[branch|https://github.com/thobbs/cassandra/tree/CASSANDRA-6008-2.0-part2]) 
should apply to the 2.0 branch.  This deletes the entries from 
{{compactions_in_progress}} before deleting the files, as suggested by John.

I'll make a trunk version of the patch after review.

 Getting 'This should never happen' error at startup due to sstables missing
 ---

 Key: CASSANDRA-6008
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6008
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: John Carrino
Assignee: Tyler Hobbs
 Fix For: 2.0.4

 Attachments: 6008-2.0-part2.patch, 6008-2.0-v1.patch, 
 6008-trunk-v1.patch


 Exception encountered during startup: Unfinished compactions reference 
 missing sstables. This should never happen since compactions are marked 
 finished before we start removing the old sstables
 This happens when sstables that have been compacted away are removed, but 
 they still have entries in the system.compactions_in_progress table.
 Normally this should not happen because the entries in 
 system.compactions_in_progress are deleted before the old sstables are 
 deleted.
 However at startup recovery time, old sstables are deleted (NOT BEFORE they 
 are removed from the compactions_in_progress table) and then after that is 
 done it does a truncate using SystemKeyspace.discardCompactionsInProgress
 We ran into a case where the disk filled up and the node died and was bounced 
 and then failed to truncate this table on startup, and then got stuck hitting 
 this exception in ColumnFamilyStore.removeUnfinishedCompactionLeftovers.
 Maybe on startup we can delete from this table incrementally as we clean 
 stuff up in the same way that compactions delete from this table before they 
 delete old sstables.
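The ordering invariant John describes — remove the compactions_in_progress entry first, then delete the sstable files, during startup cleanup just as in normal operation — can be sketched generically. The names and the set-based stand-ins below are illustrative, not Cassandra's actual types:

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class CompactionCleanupSketch {
    // Stand-ins for the system.compactions_in_progress table and the data directory.
    static Set<String> compactionsInProgress = new LinkedHashSet<>();
    static Set<String> sstableFiles = new LinkedHashSet<>();

    /**
     * Safe ordering: drop the in-progress marker first, then delete the files.
     * A crash between the two steps only leaks files, which later cleanup can
     * reclaim; the reverse order can leave a marker pointing at missing
     * sstables, tripping the "this should never happen" startup check.
     */
    static void finishCompaction(String taskId, Set<String> obsoleteFiles) {
        compactionsInProgress.remove(taskId);   // step 1: marker gone
        sstableFiles.removeAll(obsoleteFiles);  // step 2: files gone
    }

    public static void main(String[] args) {
        compactionsInProgress.add("task-1");
        sstableFiles.add("ks-cf-ka-1-Data.db");
        finishCompaction("task-1", Set.of("ks-cf-ka-1-Data.db"));
        System.out.println(compactionsInProgress.isEmpty() && sstableFiles.isEmpty());   // true
    }
}
```

Doing the marker deletion incrementally per task, rather than as a single truncate after all file deletions, is the point of the suggestion in the ticket.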



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (CASSANDRA-4288) prevent thrift server from starting before gossip has settled

2013-12-13 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847901#comment-13847901
 ] 

Tyler Hobbs commented on CASSANDRA-4288:


Sorry, I meant for the constants to be class-level, but that can be changed by 
the committer :)

Other than that, +1 from me.

 prevent thrift server from starting before gossip has settled
 -

 Key: CASSANDRA-4288
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4288
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Peter Schuller
Assignee: Chris Burroughs
 Fix For: 2.0.4

 Attachments: CASSANDRA-4288-trunk.txt, j4288-1.2-v1-txt, 
 j4288-1.2-v2-txt


 A serious problem is that there is no co-ordination whatsoever between gossip 
 and the consumers of gossip. In particular, on a large cluster with hundreds 
 of nodes, it takes several seconds for gossip to settle because the gossip 
 stage is CPU bound. This leads to a node starting up and accessing thrift 
 traffic long before it has any clue of what's up and down. This leads to 
 client-visible timeouts (for nodes that are down but not identified as such) 
 and UnavailableException (for nodes that are up but not yet identified as 
 such). This is really bad in general, but in particular for clients doing 
 non-idempotent writes (counter increments).
 I was going to fix this as part of more significant re-writing in other 
 tickets having to do with gossip/topology/etc, but that's not going to 
 happen. So, the attached patch is roughly what we're running with in 
 production now to make restarts bearable. The minimum wait time is both for 
 ensuring that gossip has time to start becoming CPU bound if it will be, and 
 the reason it's large is to allow for down nodes to be identified as such in 
 most typical cases with a default phi conviction threshold (untested, we 
 actually ran with a smaller number of 5 seconds minimum, but from past 
 experience I believe 15 seconds is enough).
 The patch is tested on our 1.1 branch. It applies on trunk, and the diff is 
 against trunk, but I have not tested it against trunk.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (CASSANDRA-6483) Possible Collections.sort assertion failure in STCS.filterColdSSTables

2013-12-13 Thread Tyler Hobbs (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tyler Hobbs updated CASSANDRA-6483:
---

Attachment: 6483-2.0-v1.patch

Attached patch 6483-2.0-v1.patch (and 
[branch|https://github.com/thobbs/cassandra/tree/CASSANDRA-6483]) builds a map 
of hotness values prior to the sort and uses that for comparisons.  I also made 
{{filterColdSSTables()}} skip unneeded work if we won't be able to filter 
anything anyway.
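The fix described above — snapshot each sstable's hotness into a map before sorting, so concurrent meter updates cannot change comparison results mid-sort — can be shown with a minimal analogue. Types are simplified stand-ins, not the actual SSTableReader API:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

public class StableSortSketch {
    // Stand-in for an sstable whose read meter is updated by other threads.
    static class Table {
        final String name;
        final AtomicLong reads = new AtomicLong();
        Table(String name, long reads) { this.name = name; this.reads.set(reads); }
    }

    /**
     * Snapshot the mutable metric once per element, then sort against the
     * snapshot: the comparator now sees a fixed value for each table, which
     * satisfies the total-order contract TimSort asserts in JDK7+. Sorting
     * directly on t.reads.get() risks "Comparison method violates its
     * general contract!" if another thread bumps a meter mid-sort.
     */
    static List<Table> sortByHotness(List<Table> tables) {
        Map<Table, Long> hotness = new HashMap<>();
        for (Table t : tables)
            hotness.put(t, t.reads.get());          // one read per element
        List<Table> sorted = new ArrayList<>(tables);
        sorted.sort(Comparator.comparing(hotness::get));
        return sorted;
    }

    public static void main(String[] args) {
        List<Table> tables = List.of(
                new Table("a", 30), new Table("b", 10), new Table("c", 20));
        System.out.println(sortByHotness(tables).get(0).name);   // b
    }
}
```

The snapshot also makes the sort cheaper, since each meter is read once instead of on every comparison.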

 Possible Collections.sort assertion failure in STCS.filterColdSSTables
 --

 Key: CASSANDRA-6483
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6483
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: graham sanderson
Assignee: Tyler Hobbs
  Labels: compaction
 Fix For: 2.0.4

 Attachments: 6483-2.0-v1.patch


 We have observed the following stack trace periodically:
 {code}
 java.lang.IllegalArgumentException: Comparison method violates its general 
 contract!
 at java.util.TimSort.mergeLo(TimSort.java:747)
 at java.util.TimSort.mergeAt(TimSort.java:483)
 at java.util.TimSort.mergeCollapse(TimSort.java:410)
 at java.util.TimSort.sort(TimSort.java:214)
 at java.util.TimSort.sort(TimSort.java:173)
 at java.util.Arrays.sort(Arrays.java:659)
 at java.util.Collections.sort(Collections.java:217)
 at 
 org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.filterColdSSTables(SizeTieredCompactionStrategy.java:94)
 at 
 org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundSSTables(SizeTieredCompactionStrategy.java:59)
 at 
 org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundTask(SizeTieredCompactionStrategy.java:229)
 at 
 org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:191)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:744)
 {code}
 The comparator at SizeTieredCompactionStrategy line 94 breaks the assertions 
 in the new JDK7 default sort algorithm, because (I think just) the hotness 
 value (based on meter) may be modified concurrently by another thread.
 This bug appears to have been introduced in CASSANDRA-6109.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (CASSANDRA-6483) Possible Collections.sort assertion failure in STCS.filterColdSSTables

2013-12-13 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847928#comment-13847928
 ] 

Tyler Hobbs commented on CASSANDRA-6483:


bq. Note that the CASSANDRA-6109 feature claims to be "off" by default, however 
it isn't immediately clear to me from that patch how "off" is implemented, and 
whether it is supposed to go down that code path even when "off".

I answered this on the dev ML, but I'll repeat it here for others who are 
interested.  The default max_cold_reads_ratio is 0.0, so 
{{filterColdSSTables()}} shouldn't filter any SSTables.  When writing this 
patch, I realized that even with that set to 0.0, SSTables that have no read 
activity at all would still be filtered out.  However, after this patch, that's 
no longer true, and a setting of 0.0 will prevent any filtering at all.

bq. I’m guessing there is no actual downside (other than ERROR level messages 
in the logs which cause alerts), since it just fails a subset of compactions?

That's correct, this shouldn't cause any other problems, only delay some 
compactions.

 Possible Collections.sort assertion failure in STCS.filterColdSSTables
 --

 Key: CASSANDRA-6483
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6483
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: graham sanderson
Assignee: Tyler Hobbs
  Labels: compaction
 Fix For: 2.0.4

 Attachments: 6483-2.0-v1.patch


 We have observed the following stack trace periodically:
 {code}
 java.lang.IllegalArgumentException: Comparison method violates its general 
 contract!
 at java.util.TimSort.mergeLo(TimSort.java:747)
 at java.util.TimSort.mergeAt(TimSort.java:483)
 at java.util.TimSort.mergeCollapse(TimSort.java:410)
 at java.util.TimSort.sort(TimSort.java:214)
 at java.util.TimSort.sort(TimSort.java:173)
 at java.util.Arrays.sort(Arrays.java:659)
 at java.util.Collections.sort(Collections.java:217)
 at 
 org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.filterColdSSTables(SizeTieredCompactionStrategy.java:94)
 at 
 org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundSSTables(SizeTieredCompactionStrategy.java:59)
 at 
 org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundTask(SizeTieredCompactionStrategy.java:229)
 at 
 org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:191)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:744)
 {code}
 The comparator at SizeTieredCompactionStrategy line 94 breaks the assertions 
 in the new JDK7 default sort algorithm, because (I think just) the hotness 
 value (based on meter) may be modified concurrently by another thread.
 This bug appears to have been introduced in CASSANDRA-6109.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[3/3] git commit: merge from 2.0

2013-12-13 Thread jbellis
merge from 2.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/fe58dffe
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/fe58dffe
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/fe58dffe

Branch: refs/heads/trunk
Commit: fe58dffef5d4f44255ff47623b7d4d50a2f4e56d
Parents: 74bf5aa c960975
Author: Jonathan Ellis jbel...@apache.org
Authored: Fri Dec 13 17:13:33 2013 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Fri Dec 13 17:13:33 2013 -0600

--
 CHANGES.txt |  2 ++
 .../SizeTieredCompactionStrategy.java   | 21 
 2 files changed, 19 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/fe58dffe/CHANGES.txt
--
diff --cc CHANGES.txt
index 4c74ea9,182bada..b673918
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,24 -1,6 +1,26 @@@
 +2.1
 + * Multithreaded commitlog (CASSANDRA-3578)
 + * allocate fixed index summary memory pool and resample cold index summaries 
 +   to use less memory (CASSANDRA-5519)
 + * Removed multithreaded compaction (CASSANDRA-6142)
 + * Parallelize fetching rows for low-cardinality indexes (CASSANDRA-1337)
 + * change logging from log4j to logback (CASSANDRA-5883)
 + * switch to LZ4 compression for internode communication (CASSANDRA-5887)
 + * Stop using Thrift-generated Index* classes internally (CASSANDRA-5971)
 + * Remove 1.2 network compatibility code (CASSANDRA-5960)
 + * Remove leveled json manifest migration code (CASSANDRA-5996)
 + * Remove CFDefinition (CASSANDRA-6253)
 + * Use AtomicIntegerFieldUpdater in RefCountedMemory (CASSANDRA-6278)
 + * User-defined types for CQL3 (CASSANDRA-5590)
 + * Use of o.a.c.metrics in nodetool (CASSANDRA-5871, 6406)
 + * Batch read from OTC's queue and cleanup (CASSANDRA-1632)
 + * Secondary index support for collections (CASSANDRA-4511)
 + * SSTable metadata(Stats.db) format change (CASSANDRA-6356)
 +
 +
  2.0.4
+  * Fix assertion failure in filterColdSSTables (CASSANDRA-6483)
+  * Fix row tombstones in larger-than-memory compactions (CASSANDRA-6008)
   * Fix cleanup ClassCastException (CASSANDRA-6462)
   * Reduce gossip memory use by interning VersionedValue strings 
(CASSANDRA-6410)
   * Allow specifying datacenters to participate in a repair (CASSANDRA-6218)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/fe58dffe/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
--



[2/3] git commit: Fix assertion failure in filterColdSSTables patch by Tyler Hobbs; reviewed by jbellis for CASSANDRA-6483

2013-12-13 Thread jbellis
Fix assertion failure in filterColdSSTables
patch by Tyler Hobbs; reviewed by jbellis for CASSANDRA-6483


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c9609759
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c9609759
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c9609759

Branch: refs/heads/trunk
Commit: c960975950560218cb5699e4961192081d119e45
Parents: 54c1ed3
Author: Jonathan Ellis jbel...@apache.org
Authored: Fri Dec 13 17:12:47 2013 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Fri Dec 13 17:12:47 2013 -0600

--
 CHANGES.txt |  1 +
 .../SizeTieredCompactionStrategy.java   | 21 +++++++++++++++++----
 2 files changed, 18 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c9609759/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index a4b34ca..182bada 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.4
+ * Fix assertion failure in filterColdSSTables (CASSANDRA-6483)
  * Fix row tombstones in larger-than-memory compactions (CASSANDRA-6008)
  * Fix cleanup ClassCastException (CASSANDRA-6462)
  * Reduce gossip memory use by interning VersionedValue strings 
(CASSANDRA-6410)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c9609759/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
--
diff --git 
a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java 
b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
index 09d4e8e..7ccc99d 100644
--- 
a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
+++ 
b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
@@ -90,12 +90,16 @@ public class SizeTieredCompactionStrategy extends 
AbstractCompactionStrategy
 @VisibleForTesting
 static List<SSTableReader> filterColdSSTables(List<SSTableReader> sstables, double coldReadsToOmit)
 {
-// sort the sstables by hotness (coldest-first)
+if (coldReadsToOmit == 0.0)
+return sstables;
+
+// Sort the sstables by hotness (coldest-first). We first build a map because the hotness may change during the sort.
+final Map<SSTableReader, Double> hotnessSnapshot = getHotnessMap(sstables);
 Collections.sort(sstables, new Comparator<SSTableReader>()
 {
 public int compare(SSTableReader o1, SSTableReader o2)
 {
-int comparison = Double.compare(hotness(o1), hotness(o2));
+int comparison = Double.compare(hotnessSnapshot.get(o1), hotnessSnapshot.get(o2));
 if (comparison != 0)
 return comparison;
 
@@ -190,12 +194,13 @@ public class SizeTieredCompactionStrategy extends 
AbstractCompactionStrategy
 @VisibleForTesting
 static Pair<List<SSTableReader>, Double> trimToThresholdWithHotness(List<SSTableReader> bucket, int maxThreshold)
 {
-// sort by sstable hotness (descending)
+// Sort by sstable hotness (descending). We first build a map because the hotness may change during the sort.
+final Map<SSTableReader, Double> hotnessSnapshot = getHotnessMap(bucket);
 Collections.sort(bucket, new Comparator<SSTableReader>()
 {
 public int compare(SSTableReader o1, SSTableReader o2)
 {
-return -1 * Double.compare(hotness(o1), hotness(o2));
+return -1 * Double.compare(hotnessSnapshot.get(o1), hotnessSnapshot.get(o2));
 }
 });
 
@@ -210,6 +215,14 @@ public class SizeTieredCompactionStrategy extends 
AbstractCompactionStrategy
 return Pair.create(prunedBucket, bucketHotness);
 }
 
+private static Map<SSTableReader, Double> getHotnessMap(Collection<SSTableReader> sstables)
+{
+Map<SSTableReader, Double> hotness = new HashMap<>();
+for (SSTableReader sstable : sstables)
+hotness.put(sstable, hotness(sstable));
+return hotness;
+}
+
 /**
  * Returns the reads per second per key for this sstable, or 0.0 if the 
sstable has no read meter
  */
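The patch above can be exercised in isolation. A minimal, runnable sketch of the snapshot-before-sort pattern, with `SSTableReader` replaced by a plain stand-in class (the class and field names here are illustrative, not Cassandra's):

```java
import java.util.*;

public class HotnessSnapshotDemo {
    // Stand-in for SSTableReader; in Cassandra the read rate comes from a
    // live read meter and can change while a sort is in progress.
    static class Table {
        final String name;
        volatile double readsPerSec;
        Table(String name, double readsPerSec) { this.name = name; this.readsPerSec = readsPerSec; }
    }

    // Mirrors the fix: snapshot each table's hotness once, then sort against
    // the immutable snapshot so the comparator cannot contradict itself
    // ("Comparison method violates its general contract!") mid-sort.
    static List<Table> sortColdestFirst(List<Table> tables) {
        final Map<Table, Double> snapshot = new HashMap<>();
        for (Table t : tables)
            snapshot.put(t, t.readsPerSec);
        List<Table> sorted = new ArrayList<>(tables);
        Collections.sort(sorted, new Comparator<Table>() {
            public int compare(Table a, Table b) {
                return Double.compare(snapshot.get(a), snapshot.get(b));
            }
        });
        return sorted;
    }

    public static void main(String[] args) {
        List<Table> tables = Arrays.asList(
                new Table("a", 5.0), new Table("b", 1.0), new Table("c", 3.0));
        StringBuilder order = new StringBuilder();
        for (Table t : sortColdestFirst(tables))
            order.append(t.name);
        System.out.println(order); // coldest first: bca
    }
}
```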



[1/3] git commit: Fix assertion failure in filterColdSSTables patch by Tyler Hobbs; reviewed by jbellis for CASSANDRA-6483

2013-12-13 Thread jbellis
Updated Branches:
  refs/heads/cassandra-2.0 54c1ed360 -> c96097595
  refs/heads/trunk 74bf5aa16 -> fe58dffef


Fix assertion failure in filterColdSSTables
patch by Tyler Hobbs; reviewed by jbellis for CASSANDRA-6483


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c9609759
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c9609759
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c9609759

Branch: refs/heads/cassandra-2.0
Commit: c960975950560218cb5699e4961192081d119e45
Parents: 54c1ed3
Author: Jonathan Ellis jbel...@apache.org
Authored: Fri Dec 13 17:12:47 2013 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Fri Dec 13 17:12:47 2013 -0600

--
 CHANGES.txt |  1 +
 .../SizeTieredCompactionStrategy.java   | 21 +++++++++++++++++----
 2 files changed, 18 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c9609759/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index a4b34ca..182bada 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.4
+ * Fix assertion failure in filterColdSSTables (CASSANDRA-6483)
  * Fix row tombstones in larger-than-memory compactions (CASSANDRA-6008)
  * Fix cleanup ClassCastException (CASSANDRA-6462)
  * Reduce gossip memory use by interning VersionedValue strings 
(CASSANDRA-6410)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c9609759/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
--
diff --git 
a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java 
b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
index 09d4e8e..7ccc99d 100644
--- 
a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
+++ 
b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
@@ -90,12 +90,16 @@ public class SizeTieredCompactionStrategy extends 
AbstractCompactionStrategy
 @VisibleForTesting
 static List<SSTableReader> filterColdSSTables(List<SSTableReader> sstables, double coldReadsToOmit)
 {
-// sort the sstables by hotness (coldest-first)
+if (coldReadsToOmit == 0.0)
+return sstables;
+
+// Sort the sstables by hotness (coldest-first). We first build a map because the hotness may change during the sort.
+final Map<SSTableReader, Double> hotnessSnapshot = getHotnessMap(sstables);
 Collections.sort(sstables, new Comparator<SSTableReader>()
 {
 public int compare(SSTableReader o1, SSTableReader o2)
 {
-int comparison = Double.compare(hotness(o1), hotness(o2));
+int comparison = Double.compare(hotnessSnapshot.get(o1), hotnessSnapshot.get(o2));
 if (comparison != 0)
 return comparison;
 
@@ -190,12 +194,13 @@ public class SizeTieredCompactionStrategy extends 
AbstractCompactionStrategy
 @VisibleForTesting
 static Pair<List<SSTableReader>, Double> trimToThresholdWithHotness(List<SSTableReader> bucket, int maxThreshold)
 {
-// sort by sstable hotness (descending)
+// Sort by sstable hotness (descending). We first build a map because the hotness may change during the sort.
+final Map<SSTableReader, Double> hotnessSnapshot = getHotnessMap(bucket);
 Collections.sort(bucket, new Comparator<SSTableReader>()
 {
 public int compare(SSTableReader o1, SSTableReader o2)
 {
-return -1 * Double.compare(hotness(o1), hotness(o2));
+return -1 * Double.compare(hotnessSnapshot.get(o1), hotnessSnapshot.get(o2));
 }
 });
 
@@ -210,6 +215,14 @@ public class SizeTieredCompactionStrategy extends 
AbstractCompactionStrategy
 return Pair.create(prunedBucket, bucketHotness);
 }
 
+private static Map<SSTableReader, Double> getHotnessMap(Collection<SSTableReader> sstables)
+{
+Map<SSTableReader, Double> hotness = new HashMap<>();
+for (SSTableReader sstable : sstables)
+hotness.put(sstable, hotness(sstable));
+return hotness;
+}
+
 /**
  * Returns the reads per second per key for this sstable, or 0.0 if the 
sstable has no read meter
  */



[jira] [Commented] (CASSANDRA-6008) Getting 'This should never happen' error at startup due to sstables missing

2013-12-13 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848078#comment-13848078
 ] 

Jonathan Ellis commented on CASSANDRA-6008:
---

This means that instead of throwing an error if we restart before 
removeUnfinishedCompactionLeftovers finishes, we'll leave both old and new 
sstables from unfinished compactions live, which defeats the purpose for 
counters.

For 2.1 that would be okay (since we're assuming CASSANDRA-4775 will be done 
before we release) but for 2.0 it isn't, unfortunately.

I think the alternatives are
# Switch back to delete-first, and add a debug line instead of 
IllegalStateException.  (Can delete from compaction_log incrementally too to 
reduce the window of inconsistency.)
# Do a dance of renaming back to .tmp instead of deleting, then removing 
compaction_log entry, then deleting.  .tmp will be included in the unfinished 
list, but if there is no corresponding compaction_log entry they can just be 
deleted

I'd lean towards saying the extra complexity of #2 isn't worth the security 
blanket of the ISE.
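Alternative #1 above hinges purely on ordering. A toy sketch of the idea, with in-memory sets standing in for the compactions_in_progress table and the filesystem (none of these names are Cassandra's; this is an illustration of the ordering, not the actual patch):

```java
import java.util.*;

public class CompactionCleanupOrder {
    static Set<String> compactionLog = new HashSet<>(); // stands in for compactions_in_progress
    static Set<String> files = new HashSet<>();         // stands in for on-disk sstables

    // Alternative #1: remove each old sstable's log entry before deleting the
    // file, so the log never references a missing sstable for longer than one
    // iteration of this loop.
    static void finishCompaction(List<String> oldSSTables) {
        for (String sstable : oldSSTables) {
            compactionLog.remove(sstable); // log entry goes first...
            files.remove(sstable);         // ...then the sstable itself
        }
    }

    public static void main(String[] args) {
        files.addAll(Arrays.asList("ks-cf-ja-1", "ks-cf-ja-2"));
        compactionLog.addAll(files);
        finishCompaction(new ArrayList<>(files));
        System.out.println(compactionLog.isEmpty() && files.isEmpty()); // true
    }
}
```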

 Getting 'This should never happen' error at startup due to sstables missing
 ---

 Key: CASSANDRA-6008
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6008
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: John Carrino
Assignee: Tyler Hobbs
 Fix For: 2.0.4

 Attachments: 6008-2.0-part2.patch, 6008-2.0-v1.patch, 
 6008-trunk-v1.patch


 Exception encountered during startup: Unfinished compactions reference 
 missing sstables. This should never happen since compactions are marked 
 finished before we start removing the old sstables
 This happens when sstables that have been compacted away are removed, but 
 they still have entries in the system.compactions_in_progress table.
 Normally this should not happen because the entries in 
 system.compactions_in_progress are deleted before the old sstables are 
 deleted.
 However at startup recovery time, old sstables are deleted (NOT BEFORE they 
 are removed from the compactions_in_progress table) and then after that is 
 done it does a truncate using SystemKeyspace.discardCompactionsInProgress
 We ran into a case where the disk filled up and the node died and was bounced 
 and then failed to truncate this table on startup, and then got stuck hitting 
 this exception in ColumnFamilyStore.removeUnfinishedCompactionLeftovers.
 Maybe on startup we can delete from this table incrementally as we clean 
 stuff up in the same way that compactions delete from this table before they 
 delete old sstables.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Created] (CASSANDRA-6486) Latency Measurement

2013-12-13 Thread Benedict (JIRA)
Benedict created CASSANDRA-6486:
---

 Summary: Latency Measurement
 Key: CASSANDRA-6486
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6486
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
Assignee: Benedict


Latency measurement in Cassandra is currently suboptimal. Exactly what the 
latency measurements tell you isn't intuitively clear because of their 
exponential decay: they amount to some view of the latency per (unweighted) 
operation over approximately the past 10-minute period, with greater weight 
given to more recent operations. This has some obvious flaws, the most notable 
being that, due to probabilistic sampling, large outlier events (e.g. GC) can 
easily be lost over a multi-minute time horizon; even when caught, they are 
unlikely to appear in even the 99.9th percentile because they account for a 
tiny fraction of events numerically.

I'm generally thinking about how we might improve on this, and want to dump my 
ideas here for discussion. I think the following things should be targeted:

1) Ability to see uniform latency measurements for different time horizons 
stretching back from the present, e.g. last 1s, 1m, 1hr and 1day
2) Ability to bound the error margin of statistics for all of these intervals
3) Protect against losing outlier measurements
4) Possibly offer the ability to weight statistics, so that longer latencies 
are not underplayed even if they are counted
5) Preferably non-blocking, memory efficient, and relatively garbage-free

(3) and (4) are the trickiest, as a theoretically sound and general approach 
isn't immediately obvious. There are a number of possibilities that spring to 
mind:
1) ensure that we have enough sample points that we are probabilistically 
guaranteed to not lose them, but over large time horizons this is problematic 
due to memory constraints, and it doesn't address (4);
2) count large events multiple times (or sub-slices of the events), based on 
e.g. average op-rate. I am not a fan of this idea because it makes possibly bad 
assumptions about behaviour and doesn't seem very theoretically sound;
3) weight the probability of retaining an event by its length. The problem with 
this approach is that it ties you into (4) without offering the current view of 
statistics (i.e. unweighted operations), and it also doesn't lend itself to 
efficient implementation.

I'm currently leaning towards a fourth approach, which attempts to hybridise 
uniform sampling and histogram behaviour by separating the sample space into 
ranges, each some multiple of the last (say 2x the size). Each range has a 
uniform sample of the events that occurred in that range, plus a count of total 
events. Ideally the size of the sample will vary with the number of events 
occurring in each range, but a lower bound will be calculated to ensure we do 
not lose events.

This approach lends itself to all 5 goals above:
1) by maintaining the same structure for each time horizon, and uniformly 
sampling from all of the directly lower order time horizons to maintain it;
2) by imposing minimum sample sizes for each range;
3) ditto (2);
4) by producing time/frequency-weighted statistics using the samples and counts 
from each range;
5) with thread-local array-based timers that are synchronised with the global 
timer once every minimum period, by the owning thread

This also extends reasonably nicely the timers I have already written for 
CASSANDRA-6199, so some of the work is already done.

Thoughts / discussion would be welcome, especially if you think I've missed 
another obvious approach.
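As a rough illustration of the range-partitioned sampling idea (this is a hypothetical sketch, not the CASSANDRA-6199 timers): bucket events by power-of-two latency range, and give each range its own total count plus a uniform reservoir, so a single large outlier lands in an otherwise-quiet range and cannot be crowded out by millions of small events.

```java
import java.util.*;

public class BucketedReservoir {
    static final int SAMPLE = 4;               // per-range reservoir size (tiny for demo)
    final long[] counts = new long[64];        // event count per power-of-two range
    final long[][] samples = new long[64][SAMPLE];
    final Random rng = new Random(42);

    // Range i holds latencies in [2^i, 2^(i+1)). Each range keeps its own
    // uniform reservoir plus a total count, bounding the error per range.
    void record(long latencyMicros) {
        int range = 63 - Long.numberOfLeadingZeros(Math.max(1, latencyMicros));
        long n = counts[range]++;
        if (n < SAMPLE)
            samples[range][(int) n] = latencyMicros;      // reservoir not full yet
        else {
            long j = (long) (rng.nextDouble() * (n + 1)); // standard reservoir step
            if (j < SAMPLE)
                samples[range][(int) j] = latencyMicros;
        }
    }

    public static void main(String[] args) {
        BucketedReservoir r = new BucketedReservoir();
        for (int i = 0; i < 1_000_000; i++)
            r.record(100 + i % 50);            // the bulk of operations: ~100us
        r.record(2_000_000);                   // one 2s GC-like outlier
        int outlierRange = 63 - Long.numberOfLeadingZeros(2_000_000);
        System.out.println(r.counts[outlierRange]); // the outlier survives: 1
    }
}
```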



--


[jira] [Commented] (CASSANDRA-5323) Revisit disabled dtests

2013-12-13 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848088#comment-13848088
 ] 

Michael Shuler commented on CASSANDRA-5323:
---

Current exclude lists, in order for dtest to complete without hanging:

trunk:
drop_cf_auth_test|drop_ks_auth_test|grant_revoke_cleanup_test|concurrent_schema_changes_test|short_read_reversed_test|short_read_test|keyspace_test|table_test|global_row_key_cache_test|sstableloader_compression_deflate_to_deflate_test|simple_repair_order_preserving_test|upgrade_test|sstable_generation_loading_test
- http://cassci.datastax.com/job/trunk_dtest/48/console
- http://cassci.datastax.com/job/trunk_dtest/48/testReport/

cassandra-2.0:
drop_cf_auth_test|drop_ks_auth_test|grant_revoke_cleanup_test|decommission_node_test|short_read_reversed_test|short_read_test|keyspace_test|table_test|global_row_key_cache_test|simple_repair_order_preserving_test|sstableloader_compression|upgrade_test
- http://cassci.datastax.com/job/cassandra-2.0_dtest/21/console
- http://cassci.datastax.com/job/cassandra-2.0_dtest/21/testReport/

cassandra-1.2:
decommission|sstable_gen|global_row|cql3_insert
- http://cassci.datastax.com/job/cassandra-1.2_dtest/29/console
- http://cassci.datastax.com/job/cassandra-1.2_dtest/29/testReport/

Some of the results may not be particularly pretty, but we have results!  :)

 Revisit disabled dtests
 ---

 Key: CASSANDRA-5323
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5323
 Project: Cassandra
  Issue Type: Test
Reporter: Ryan McGuire
Assignee: Michael Shuler

 The following dtests are disabled in buildbot, if they can be re-enabled 
 great, if they can't can they be fixed? 
 upgrade|decommission|sstable_gen|global_row|putget_2dc|cql3_insert



--


[jira] [Created] (CASSANDRA-6487) Log WARN on large batch sizes

2013-12-13 Thread Patrick McFadin (JIRA)
Patrick McFadin created CASSANDRA-6487:
--

 Summary: Log WARN on large batch sizes
 Key: CASSANDRA-6487
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6487
 Project: Cassandra
  Issue Type: Improvement
Reporter: Patrick McFadin
Priority: Minor


Large batches on a coordinator can cause a lot of node stress. I propose adding 
a WARN log entry if batch sizes go beyond a configurable size. This will give 
more visibility to operators on something that can happen on the developer 
side. 

New yaml setting with 5k default.

# Log WARN on any batch size exceeding this value. 5k by default.
# Caution should be taken on increasing the size of this threshold as it can 
lead to node instability.

batch_size_warn_threshold: 5k
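A sketch of the proposed check (the byte-unit reading of "5k" and the log wording are assumptions, not the eventual implementation):

```java
public class BatchSizeCheck {
    // Assumed semantics: "5k" = 5 * 1024 bytes of serialized mutations; the
    // ticket does not pin down units, so treat this as illustrative only.
    static final long WARN_THRESHOLD_BYTES = 5 * 1024;

    static String maybeWarn(String keyspace, String table, long batchBytes) {
        if (batchBytes <= WARN_THRESHOLD_BYTES)
            return null;
        // Per the comment thread, include keyspace/column family for context.
        return String.format("WARN: batch on %s.%s is %d bytes, exceeding batch_size_warn_threshold (%d)",
                             keyspace, table, batchBytes, WARN_THRESHOLD_BYTES);
    }

    public static void main(String[] args) {
        System.out.println(maybeWarn("ks", "cf", 8192));
    }
}
```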





--


[jira] [Commented] (CASSANDRA-6487) Log WARN on large batch sizes

2013-12-13 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848099#comment-13848099
 ] 

Albert P Tobey commented on CASSANDRA-6487:
---

If it's not out of the way, it would help to include the keyspace and column 
family and maybe the session ID/info.

 Log WARN on large batch sizes
 -

 Key: CASSANDRA-6487
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6487
 Project: Cassandra
  Issue Type: Improvement
Reporter: Patrick McFadin
Priority: Minor

 Large batches on a coordinator can cause a lot of node stress. I propose 
 adding a WARN log entry if batch sizes go beyond a configurable size. This 
 will give more visibility to operators on something that can happen on the 
 developer side. 
 New yaml setting with 5k default.
 # Log WARN on any batch size exceeding this value. 5k by default.
 # Caution should be taken on increasing the size of this threshold as it can 
 lead to node instability.
 batch_size_warn_threshold: 5k



--


[jira] [Updated] (CASSANDRA-6487) Log WARN on large batch sizes

2013-12-13 Thread Patrick McFadin (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick McFadin updated CASSANDRA-6487:
---

Description: 
Large batches on a coordinator can cause a lot of node stress. I propose adding 
a WARN log entry if batch sizes go beyond a configurable size. This will give 
more visibility to operators on something that can happen on the developer 
side. 

New yaml setting with 5k default.

{{# Log WARN on any batch size exceeding this value. 5k by default.
# Caution should be taken on increasing the size of this threshold as it can 
lead to node instability.

batch_size_warn_threshold: 5k
}}


  was:
Large batches on a coordinator can cause a lot of node stress. I propose adding 
a WARN log entry if batch sizes go beyond a configurable size. This will give 
more visibility to operators on something that can happen on the developer 
side. 

New yaml setting with 5k default.

# Log WARN on any batch size exceeding this value. 5k by default.
# Caution should be taken on increasing the size of this threshold as it can 
lead to node instability.

batch_size_warn_threshold: 5k




 Log WARN on large batch sizes
 -

 Key: CASSANDRA-6487
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6487
 Project: Cassandra
  Issue Type: Improvement
Reporter: Patrick McFadin
Priority: Minor

 Large batches on a coordinator can cause a lot of node stress. I propose 
 adding a WARN log entry if batch sizes go beyond a configurable size. This 
 will give more visibility to operators on something that can happen on the 
 developer side. 
 New yaml setting with 5k default.
 {{# Log WARN on any batch size exceeding this value. 5k by default.
 # Caution should be taken on increasing the size of this threshold as it can 
 lead to node instability.
 batch_size_warn_threshold: 5k
 }}



--


[jira] [Commented] (CASSANDRA-6487) Log WARN on large batch sizes

2013-12-13 Thread Patrick McFadin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848103#comment-13848103
 ] 

Patrick McFadin commented on CASSANDRA-6487:


Sure. Can't see any reason not to add more info if it's easy to add. 

 Log WARN on large batch sizes
 -

 Key: CASSANDRA-6487
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6487
 Project: Cassandra
  Issue Type: Improvement
Reporter: Patrick McFadin
Priority: Minor

 Large batches on a coordinator can cause a lot of node stress. I propose 
 adding a WARN log entry if batch sizes go beyond a configurable size. This 
 will give more visibility to operators on something that can happen on the 
 developer side. 
 New yaml setting with 5k default.
 {{# Log WARN on any batch size exceeding this value. 5k by default.}}
 {{# Caution should be taken on increasing the size of this threshold as it 
 can lead to node instability.}}
 {{batch_size_warn_threshold: 5k}}



--


[jira] [Updated] (CASSANDRA-6487) Log WARN on large batch sizes

2013-12-13 Thread Patrick McFadin (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick McFadin updated CASSANDRA-6487:
---

Description: 
Large batches on a coordinator can cause a lot of node stress. I propose adding 
a WARN log entry if batch sizes go beyond a configurable size. This will give 
more visibility to operators on something that can happen on the developer 
side. 

New yaml setting with 5k default.

{{# Log WARN on any batch size exceeding this value. 5k by default.}}
{{# Caution should be taken on increasing the size of this threshold as it can 
lead to node instability.}}

{{batch_size_warn_threshold: 5k}}



  was:
Large batches on a coordinator can cause a lot of node stress. I propose adding 
a WARN log entry if batch sizes go beyond a configurable size. This will give 
more visibility to operators on something that can happen on the developer 
side. 

New yaml setting with 5k default.

{{# Log WARN on any batch size exceeding this value. 5k by default.
# Caution should be taken on increasing the size of this threshold as it can 
lead to node instability.

batch_size_warn_threshold: 5k
}}



 Log WARN on large batch sizes
 -

 Key: CASSANDRA-6487
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6487
 Project: Cassandra
  Issue Type: Improvement
Reporter: Patrick McFadin
Priority: Minor

 Large batches on a coordinator can cause a lot of node stress. I propose 
 adding a WARN log entry if batch sizes go beyond a configurable size. This 
 will give more visibility to operators on something that can happen on the 
 developer side. 
 New yaml setting with 5k default.
 {{# Log WARN on any batch size exceeding this value. 5k by default.}}
 {{# Caution should be taken on increasing the size of this threshold as it 
 can lead to node instability.}}
 {{batch_size_warn_threshold: 5k}}



--


[jira] [Updated] (CASSANDRA-6488) Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters

2013-12-13 Thread Rick Branson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rick Branson updated CASSANDRA-6488:


Attachment: graph (21).png

CPU usage dropping on a production cluster after the attached patch is rolled 
out.

 Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters
 -

 Key: CASSANDRA-6488
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6488
 Project: Cassandra
  Issue Type: Bug
Reporter: Rick Branson
Assignee: Aleksey Yeschenko
 Attachments: 6488-rbranson-patch.txt, graph (21).png


 The cloneTokenOnlyMap call in StorageProxy.getBatchlogEndpoints causes 
 enormous amounts of CPU to be consumed on clusters with many vnodes. I 
 created a patch to cache this data as a workaround and deployed it to a 
 production cluster with 15,000 tokens. CPU consumption dropped to 1/5th. This 
 highlights the overall issues with cloneOnlyTokenMap() calls on vnodes 
 clusters. I'm including the maybe-not-the-best-quality workaround patch to 
 use as a reference, but cloneOnlyTokenMap is a systemic issue and every place 
 it's called should probably be investigated.



--


[jira] [Created] (CASSANDRA-6488) Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters

2013-12-13 Thread Rick Branson (JIRA)
Rick Branson created CASSANDRA-6488:
---

 Summary: Batchlog writes consume unnecessarily large amounts of 
CPU on vnodes clusters
 Key: CASSANDRA-6488
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6488
 Project: Cassandra
  Issue Type: Bug
Reporter: Rick Branson
Assignee: Aleksey Yeschenko
 Attachments: 6488-rbranson-patch.txt, graph (21).png

The cloneTokenOnlyMap call in StorageProxy.getBatchlogEndpoints causes enormous 
amounts of CPU to be consumed on clusters with many vnodes. I created a patch 
to cache this data as a workaround and deployed it to a production cluster with 
15,000 tokens. CPU consumption dropped to 1/5th. This highlights the overall 
issues with cloneOnlyTokenMap() calls on vnodes clusters. I'm including the 
maybe-not-the-best-quality workaround patch to use as a reference, but 
cloneOnlyTokenMap is a systemic issue and every place it's called should 
probably be investigated.
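The workaround amounts to memoising the expensive clone against a ring version. A self-contained sketch of that caching pattern (all names here are hypothetical; the real patch caches the result of TokenMetadata's cloneOnlyTokenMap):

```java
import java.util.*;

public class CachedTokenMap {
    static long ringVersion = 0;                // bumped on topology change
    static List<String> ring = new ArrayList<>(Arrays.asList("token-1", "token-2"));

    static int cloneCalls = 0;
    static long cachedAtVersion = -1;
    static List<String> cachedClone;

    // Reuse the expensive snapshot until the ring actually changes; this is
    // the shape of the workaround for getBatchlogEndpoints' per-write clone.
    static synchronized List<String> tokenMapSnapshot() {
        if (cachedAtVersion != ringVersion) {
            cloneCalls++;
            cachedClone = new ArrayList<>(ring); // stands in for the costly clone
            cachedAtVersion = ringVersion;
        }
        return cachedClone;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++)
            tokenMapSnapshot();                 // 1000 batchlog writes, 1 clone
        ringVersion++;                          // topology change invalidates
        tokenMapSnapshot();
        System.out.println(cloneCalls);         // 2
    }
}
```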



--


[jira] [Updated] (CASSANDRA-6488) Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters

2013-12-13 Thread Rick Branson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rick Branson updated CASSANDRA-6488:


Attachment: 6488-rbranson-patch.txt

 Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters
 -

 Key: CASSANDRA-6488
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6488
 Project: Cassandra
  Issue Type: Bug
Reporter: Rick Branson
Assignee: Aleksey Yeschenko
 Attachments: 6488-rbranson-patch.txt, graph (21).png


 The cloneTokenOnlyMap call in StorageProxy.getBatchlogEndpoints causes 
 enormous amounts of CPU to be consumed on clusters with many vnodes. I 
 created a patch to cache this data as a workaround and deployed it to a 
 production cluster with 15,000 tokens. CPU consumption dropped to 1/5th. This 
 highlights the overall issues with cloneOnlyTokenMap() calls on vnodes 
 clusters. I'm including the maybe-not-the-best-quality workaround patch to 
 use as a reference, but cloneOnlyTokenMap is a systemic issue and every place 
 it's called should probably be investigated.



--


[jira] [Commented] (CASSANDRA-6485) NPE in calculateNaturalEndpoints

2013-12-13 Thread Rick Branson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848118#comment-13848118
 ] 

Rick Branson commented on CASSANDRA-6485:
-

LGTM.

 NPE in calculateNaturalEndpoints
 

 Key: CASSANDRA-6485
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6485
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Russell Alexander Spitzer
Assignee: Jonathan Ellis
 Fix For: 1.2.13

 Attachments: 6485.txt


 I was running a test where I added a new data center to an existing cluster. 
 Test outline:
 Start 25 Node DC1
 Keyspace Setup Replication 3
 Begin insert against DC1 Using Stress
 While the inserts are occurring
 Start up 25 Node DC2
 Alter Keyspace to include Replication in 2nd DC
 Run rebuild on DC2
 Wait for stress to finish
 Run repair on Cluster
 ... Some other operations
 Although there are no issues with smaller clusters or clusters without 
 vnodes, larger setups with vnodes seem to consistently see the following 
 exception in the logs as well as a write operation failing for each 
 exception. Usually this happens between 1-8 times during an experiment. 
 The exceptions/failures are occurring when DC2 is brought online but *before* 
 any alteration of the Keyspace. All of the exceptions are happening on DC1 
 nodes. One of the exceptions occurred on a seed node though this doesn't seem 
 to be the case most of the time. 
 While the test was running, nodetool was run every second to get cluster 
 status. At no time did any nodes report themselves as down. 
 {code}
 system_logs-107.21.186.208/system.log-ERROR [Thrift:1] 2013-12-13 06:19:52,647 
 CustomTThreadPoolServer.java (line 217) Error occurred during processing of 
 message.
 system_logs-107.21.186.208/system.log:java.lang.NullPointerException
 system_logs-107.21.186.208/system.log-at 
 org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:128)
 system_logs-107.21.186.208/system.log-at 
 org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:2624)
 system_logs-107.21.186.208/system.log-at 
 org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:375)
 system_logs-107.21.186.208/system.log-at 
 org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:190)
 system_logs-107.21.186.208/system.log-at 
 org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:866)
 system_logs-107.21.186.208/system.log-at 
 org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:849)
 system_logs-107.21.186.208/system.log-at 
 org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:749)
 system_logs-107.21.186.208/system.log-at 
 org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3690)
 system_logs-107.21.186.208/system.log-at 
 org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3678)
 system_logs-107.21.186.208/system.log-at 
 org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
 system_logs-107.21.186.208/system.log-at 
 org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
 system_logs-107.21.186.208/system.log-at 
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:199)
 system_logs-107.21.186.208/system.log-at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 system_logs-107.21.186.208/system.log-at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 system_logs-107.21.186.208/system.log-at 
 java.lang.Thread.run(Thread.java:724)
 {code}
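The underlying race here is the classic double read of a shared cache field: check it for null, then have another thread null it before the dereference. A minimal sketch of the racy pattern and the single-read fix (names hypothetical; the actual fix is in AbstractReplicationStrategy's token metadata cache handling):

```java
import java.util.HashMap;
import java.util.Map;

public class VolatileCacheRead {
    static volatile Map<String, String> cache = new HashMap<>();

    // Racy shape: another thread may set cache = null between the null check
    // and the second read of the field, producing an NPE under load.
    static String racy(String key) {
        return cache != null ? cache.get(key) : null;
    }

    // Fixed shape: read the volatile field exactly once into a local,
    // then work only with the local reference.
    static String safe(String key) {
        Map<String, String> local = cache;
        return local == null ? null : local.get(key);
    }

    public static void main(String[] args) {
        cache.put("k", "v");
        System.out.println(safe("k")); // v
    }
}
```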



--


[6/6] git commit: Merge branch 'cassandra-2.0' into trunk

2013-12-13 Thread jbellis
Merge branch 'cassandra-2.0' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/14ebfbf7
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/14ebfbf7
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/14ebfbf7

Branch: refs/heads/trunk
Commit: 14ebfbf7f8b5091ba2d9a5f6cbb34a00cdb5dbc3
Parents: fe58dff a3796f5
Author: Jonathan Ellis jbel...@apache.org
Authored: Fri Dec 13 22:11:35 2013 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Fri Dec 13 22:11:35 2013 -0600

--
 CHANGES.txt  |  3 ++-
 .../cassandra/locator/AbstractReplicationStrategy.java   | 11 ++-
 2 files changed, 8 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/14ebfbf7/CHANGES.txt
--



[2/6] git commit: fix race referencing tokenMetadataCache patch by jbellis; reviewed by rbranson for CASSANDRA-6485

2013-12-13 Thread jbellis
fix race referencing tokenMetadataCache
patch by jbellis; reviewed by rbranson for CASSANDRA-6485


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a3d91dc9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a3d91dc9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a3d91dc9

Branch: refs/heads/cassandra-2.0
Commit: a3d91dc9d67572e16d9ad92f22b89eb969373899
Parents: 1145573
Author: Jonathan Ellis jbel...@apache.org
Authored: Fri Dec 13 22:10:13 2013 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Fri Dec 13 22:10:13 2013 -0600

--
 CHANGES.txt  |  7 ++-
 .../cassandra/locator/AbstractReplicationStrategy.java   | 11 ++-
 2 files changed, 8 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/a3d91dc9/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index b7bbe09..e586592 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,9 +1,6 @@
-1.2.14
- * Randomize batchlog candidates selection (CASSANDRA-6481)
-
-
 1.2.13
- * Fix thundering herd on endpoint cache invalidation (CASSANDRA-6345)
+ * Randomize batchlog candidates selection (CASSANDRA-6481)
+ * Fix thundering herd on endpoint cache invalidation (CASSANDRA-6345, 6485)
  * Optimize FD phi calculation (CASSANDRA-6386)
  * Improve initial FD phi estimate when starting up (CASSANDRA-6385)
  * Don't list CQL3 table in CLI describe even if named explicitely 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a3d91dc9/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java
--
diff --git a/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java b/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java
index 51c4119..c36fde4 100644
--- a/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java
+++ b/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java
@@ -116,19 +116,20 @@ public abstract class AbstractReplicationStrategy
         ArrayList<InetAddress> endpoints = getCachedEndpoints(keyToken);
         if (endpoints == null)
         {
-            if (tokenMetadataClone == null)
+            TokenMetadata tm; // local reference in case another thread nulls tMC out from under us
+            if ((tm = tokenMetadataClone) == null)
             {
                 // synchronize to prevent thundering herd post-invalidation
                 synchronized (this)
                 {
-                    if (tokenMetadataClone == null)
-                        tokenMetadataClone = tokenMetadata.cloneOnlyTokenMap();
+                    if ((tm = tokenMetadataClone) == null)
+                        tm = tokenMetadataClone = tokenMetadata.cloneOnlyTokenMap();
                 }
                 // if our clone got invalidated, it's possible there is a new token to account for too
-                keyToken = TokenMetadata.firstToken(tokenMetadataClone.sortedTokens(), searchToken);
+                keyToken = TokenMetadata.firstToken(tm.sortedTokens(), searchToken);
             }
 
-            endpoints = new ArrayList<InetAddress>(calculateNaturalEndpoints(searchToken, tokenMetadataClone));
+            endpoints = new ArrayList<InetAddress>(calculateNaturalEndpoints(searchToken, tm));
             cachedEndpoints.put(keyToken, endpoints);
         }
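The patch above applies the classic double-checked-locking idiom with one extra twist: the shared, invalidatable field is read once into a local (`tm`) and only the local is ever dereferenced, so a concurrent invalidation that nulls the field cannot cause an NPE between the null check and the use. A minimal, self-contained sketch of that pattern follows; the class and field names here (`CachedSnapshot`, `snapshot`, `rebuild`) are hypothetical stand-ins, not Cassandra's actual code.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the race fix: never dereference the shared nullable field
// directly; copy it into a local so concurrent invalidation is harmless.
public class CachedSnapshot {
    private final AtomicInteger rebuilds = new AtomicInteger();
    // volatile so readers promptly observe both invalidation and fresh snapshots
    private volatile int[] snapshot;

    // may be called at any time by another thread, like cache invalidation
    public void invalidate() {
        snapshot = null;
    }

    public int first() {
        int[] local; // local reference, playing the role of 'tm' in the patch
        if ((local = snapshot) == null) {
            // synchronize to prevent a thundering herd of rebuilds
            synchronized (this) {
                if ((local = snapshot) == null)
                    local = snapshot = rebuild();
            }
        }
        // safe: 'local' cannot be nulled out from under us
        return local[0];
    }

    private int[] rebuild() {
        rebuilds.incrementAndGet();
        return new int[] { 42 };
    }

    public int rebuildCount() {
        return rebuilds.get();
    }
}
```

Without the local copy, a thread could pass the `snapshot == null` check, then dereference `snapshot` again after `invalidate()` ran, which is exactly the NPE reported in CASSANDRA-6485.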
 



[3/6] git commit: fix race referencing tokenMetadataCache patch by jbellis; reviewed by rbranson for CASSANDRA-6485

2013-12-13 Thread jbellis
fix race referencing tokenMetadataCache
patch by jbellis; reviewed by rbranson for CASSANDRA-6485


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a3d91dc9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a3d91dc9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a3d91dc9

Branch: refs/heads/trunk
Commit: a3d91dc9d67572e16d9ad92f22b89eb969373899
Parents: 1145573
Author: Jonathan Ellis jbel...@apache.org
Authored: Fri Dec 13 22:10:13 2013 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Fri Dec 13 22:10:13 2013 -0600

--
 CHANGES.txt  |  7 ++-
 .../cassandra/locator/AbstractReplicationStrategy.java   | 11 ++-
 2 files changed, 8 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/a3d91dc9/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index b7bbe09..e586592 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,9 +1,6 @@
-1.2.14
- * Randomize batchlog candidates selection (CASSANDRA-6481)
-
-
 1.2.13
- * Fix thundering herd on endpoint cache invalidation (CASSANDRA-6345)
+ * Randomize batchlog candidates selection (CASSANDRA-6481)
+ * Fix thundering herd on endpoint cache invalidation (CASSANDRA-6345, 6485)
  * Optimize FD phi calculation (CASSANDRA-6386)
  * Improve initial FD phi estimate when starting up (CASSANDRA-6385)
  * Don't list CQL3 table in CLI describe even if named explicitely 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a3d91dc9/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java
--
diff --git a/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java b/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java
index 51c4119..c36fde4 100644
--- a/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java
+++ b/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java
@@ -116,19 +116,20 @@ public abstract class AbstractReplicationStrategy
         ArrayList<InetAddress> endpoints = getCachedEndpoints(keyToken);
         if (endpoints == null)
         {
-            if (tokenMetadataClone == null)
+            TokenMetadata tm; // local reference in case another thread nulls tMC out from under us
+            if ((tm = tokenMetadataClone) == null)
            {
                 // synchronize to prevent thundering herd post-invalidation
                 synchronized (this)
                 {
-                    if (tokenMetadataClone == null)
-                        tokenMetadataClone = tokenMetadata.cloneOnlyTokenMap();
+                    if ((tm = tokenMetadataClone) == null)
+                        tm = tokenMetadataClone = tokenMetadata.cloneOnlyTokenMap();
                 }
                 // if our clone got invalidated, it's possible there is a new token to account for too
-                keyToken = TokenMetadata.firstToken(tokenMetadataClone.sortedTokens(), searchToken);
+                keyToken = TokenMetadata.firstToken(tm.sortedTokens(), searchToken);
             }
 
-            endpoints = new ArrayList<InetAddress>(calculateNaturalEndpoints(searchToken, tokenMetadataClone));
+            endpoints = new ArrayList<InetAddress>(calculateNaturalEndpoints(searchToken, tm));
             cachedEndpoints.put(keyToken, endpoints);
         }
 



[5/6] git commit: merge from 1.2

2013-12-13 Thread jbellis
merge from 1.2


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a3796f5f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a3796f5f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a3796f5f

Branch: refs/heads/cassandra-2.0
Commit: a3796f5f7d9473452dd856aad3d99940eb307716
Parents: c960975 a3d91dc
Author: Jonathan Ellis jbel...@apache.org
Authored: Fri Dec 13 22:11:31 2013 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Fri Dec 13 22:11:31 2013 -0600

--
 CHANGES.txt  |  3 ++-
 .../cassandra/locator/AbstractReplicationStrategy.java   | 11 ++-
 2 files changed, 8 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/a3796f5f/CHANGES.txt
--
diff --cc CHANGES.txt
index 182bada,e586592..a54231e
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,17 -1,10 +1,18 @@@
 -1.2.13
 +2.0.4
 + * Fix assertion failure in filterColdSSTables (CASSANDRA-6483)
 + * Fix row tombstones in larger-than-memory compactions (CASSANDRA-6008)
 + * Fix cleanup ClassCastException (CASSANDRA-6462)
  + * Reduce gossip memory use by interning VersionedValue strings (CASSANDRA-6410)
 + * Allow specifying datacenters to participate in a repair (CASSANDRA-6218)
 + * Fix divide-by-zero in PCI (CASSANDRA-6403)
 + * Fix setting last compacted key in the wrong level for LCS (CASSANDRA-6284)
 + * Add sub-ms precision formats to the timestamp parser (CASSANDRA-6395)
 + * Expose a total memtable size metric for a CF (CASSANDRA-6391)
 + * cqlsh: handle symlinks properly (CASSANDRA-6425)
 + * Don't resubmit counter mutation runnables internally (CASSANDRA-6427)
 +Merged from 1.2:
-  * Fix thundering herd on endpoint cache invalidation (CASSANDRA-6345)
+  * Randomize batchlog candidates selection (CASSANDRA-6481)
+  * Fix thundering herd on endpoint cache invalidation (CASSANDRA-6345, 6485)
 - * Optimize FD phi calculation (CASSANDRA-6386)
 - * Improve initial FD phi estimate when starting up (CASSANDRA-6385)
 - * Don't list CQL3 table in CLI describe even if named explicitely 
 -   (CASSANDRA-5750)
   * cqlsh: quote single quotes in strings inside collections (CASSANDRA-6172)
   * Improve gossip performance for typical messages (CASSANDRA-6409)
   * Throw IRE if a prepared statement has more markers than supported 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a3796f5f/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java
--



[4/6] git commit: merge from 1.2

2013-12-13 Thread jbellis
merge from 1.2


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a3796f5f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a3796f5f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a3796f5f

Branch: refs/heads/trunk
Commit: a3796f5f7d9473452dd856aad3d99940eb307716
Parents: c960975 a3d91dc
Author: Jonathan Ellis jbel...@apache.org
Authored: Fri Dec 13 22:11:31 2013 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Fri Dec 13 22:11:31 2013 -0600

--
 CHANGES.txt  |  3 ++-
 .../cassandra/locator/AbstractReplicationStrategy.java   | 11 ++-
 2 files changed, 8 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/a3796f5f/CHANGES.txt
--
diff --cc CHANGES.txt
index 182bada,e586592..a54231e
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,17 -1,10 +1,18 @@@
 -1.2.13
 +2.0.4
 + * Fix assertion failure in filterColdSSTables (CASSANDRA-6483)
 + * Fix row tombstones in larger-than-memory compactions (CASSANDRA-6008)
 + * Fix cleanup ClassCastException (CASSANDRA-6462)
  + * Reduce gossip memory use by interning VersionedValue strings (CASSANDRA-6410)
 + * Allow specifying datacenters to participate in a repair (CASSANDRA-6218)
 + * Fix divide-by-zero in PCI (CASSANDRA-6403)
 + * Fix setting last compacted key in the wrong level for LCS (CASSANDRA-6284)
 + * Add sub-ms precision formats to the timestamp parser (CASSANDRA-6395)
 + * Expose a total memtable size metric for a CF (CASSANDRA-6391)
 + * cqlsh: handle symlinks properly (CASSANDRA-6425)
 + * Don't resubmit counter mutation runnables internally (CASSANDRA-6427)
 +Merged from 1.2:
-  * Fix thundering herd on endpoint cache invalidation (CASSANDRA-6345)
+  * Randomize batchlog candidates selection (CASSANDRA-6481)
+  * Fix thundering herd on endpoint cache invalidation (CASSANDRA-6345, 6485)
 - * Optimize FD phi calculation (CASSANDRA-6386)
 - * Improve initial FD phi estimate when starting up (CASSANDRA-6385)
 - * Don't list CQL3 table in CLI describe even if named explicitely 
 -   (CASSANDRA-5750)
   * cqlsh: quote single quotes in strings inside collections (CASSANDRA-6172)
   * Improve gossip performance for typical messages (CASSANDRA-6409)
   * Throw IRE if a prepared statement has more markers than supported 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a3796f5f/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java
--



[1/6] git commit: fix race referencing tokenMetadataCache patch by jbellis; reviewed by rbranson for CASSANDRA-6485

2013-12-13 Thread jbellis
Updated Branches:
  refs/heads/cassandra-1.2 11455738f -> a3d91dc9d
  refs/heads/cassandra-2.0 c96097595 -> a3796f5f7
  refs/heads/trunk fe58dffef -> 14ebfbf7f


fix race referencing tokenMetadataCache
patch by jbellis; reviewed by rbranson for CASSANDRA-6485


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a3d91dc9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a3d91dc9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a3d91dc9

Branch: refs/heads/cassandra-1.2
Commit: a3d91dc9d67572e16d9ad92f22b89eb969373899
Parents: 1145573
Author: Jonathan Ellis jbel...@apache.org
Authored: Fri Dec 13 22:10:13 2013 -0600
Committer: Jonathan Ellis jbel...@apache.org
Committed: Fri Dec 13 22:10:13 2013 -0600

--
 CHANGES.txt  |  7 ++-
 .../cassandra/locator/AbstractReplicationStrategy.java   | 11 ++-
 2 files changed, 8 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/a3d91dc9/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index b7bbe09..e586592 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,9 +1,6 @@
-1.2.14
- * Randomize batchlog candidates selection (CASSANDRA-6481)
-
-
 1.2.13
- * Fix thundering herd on endpoint cache invalidation (CASSANDRA-6345)
+ * Randomize batchlog candidates selection (CASSANDRA-6481)
+ * Fix thundering herd on endpoint cache invalidation (CASSANDRA-6345, 6485)
  * Optimize FD phi calculation (CASSANDRA-6386)
  * Improve initial FD phi estimate when starting up (CASSANDRA-6385)
  * Don't list CQL3 table in CLI describe even if named explicitely 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a3d91dc9/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java
--
diff --git a/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java b/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java
index 51c4119..c36fde4 100644
--- a/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java
+++ b/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java
@@ -116,19 +116,20 @@ public abstract class AbstractReplicationStrategy
         ArrayList<InetAddress> endpoints = getCachedEndpoints(keyToken);
         if (endpoints == null)
         {
-            if (tokenMetadataClone == null)
+            TokenMetadata tm; // local reference in case another thread nulls tMC out from under us
+            if ((tm = tokenMetadataClone) == null)
             {
                 // synchronize to prevent thundering herd post-invalidation
                 synchronized (this)
                 {
-                    if (tokenMetadataClone == null)
-                        tokenMetadataClone = tokenMetadata.cloneOnlyTokenMap();
+                    if ((tm = tokenMetadataClone) == null)
+                        tm = tokenMetadataClone = tokenMetadata.cloneOnlyTokenMap();
                 }
                 // if our clone got invalidated, it's possible there is a new token to account for too
-                keyToken = TokenMetadata.firstToken(tokenMetadataClone.sortedTokens(), searchToken);
+                keyToken = TokenMetadata.firstToken(tm.sortedTokens(), searchToken);
             }
 
-            endpoints = new ArrayList<InetAddress>(calculateNaturalEndpoints(searchToken, tokenMetadataClone));
+            endpoints = new ArrayList<InetAddress>(calculateNaturalEndpoints(searchToken, tm));
             cachedEndpoints.put(keyToken, endpoints);
         }
 



[jira] [Updated] (CASSANDRA-6485) NPE in calculateNaturalEndpoints

2013-12-13 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-6485:
--

Since Version: 1.2.13

 NPE in calculateNaturalEndpoints
 

 Key: CASSANDRA-6485
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6485
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Russell Alexander Spitzer
Assignee: Jonathan Ellis
 Fix For: 1.2.13, 2.0.4

 Attachments: 6485.txt


 I was running a test where I added a new data center to an existing cluster. 
 Test outline:
 Start 25 Node DC1
 Keyspace Setup Replication 3
 Begin insert against DC1 Using Stress
 While the inserts are occurring
 Start up 25 Node DC2
 Alter Keyspace to include Replication in 2nd DC
 Run rebuild on DC2
 Wait for stress to finish
 Run repair on Cluster
 ... Some other operations
 Although there are no issues with smaller clusters or clusters without 
 vnodes, larger setups with vnodes consistently see the following exception 
 in the logs, along with one failed write operation per exception. Usually 
 this happens between 1 and 8 times during an experiment. 
 The exceptions/failures occur when DC2 is brought online but *before* any 
 alteration of the keyspace. All of the exceptions happen on DC1 nodes. One 
 exception occurred on a seed node, though this doesn't seem to be the case 
 most of the time. 
 While the test was running, nodetool was run every second to get cluster 
 status. At no time did any node report itself as down. 
 {code}
 system_logs-107.21.186.208/system.log-ERROR [Thrift:1] 2013-12-13 06:19:52,647 CustomTThreadPoolServer.java (line 217) Error occurred during processing of message.
 system_logs-107.21.186.208/system.log:java.lang.NullPointerException
 system_logs-107.21.186.208/system.log-at org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:128)
 system_logs-107.21.186.208/system.log-at org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:2624)
 system_logs-107.21.186.208/system.log-at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:375)
 system_logs-107.21.186.208/system.log-at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:190)
 system_logs-107.21.186.208/system.log-at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:866)
 system_logs-107.21.186.208/system.log-at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:849)
 system_logs-107.21.186.208/system.log-at org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:749)
 system_logs-107.21.186.208/system.log-at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3690)
 system_logs-107.21.186.208/system.log-at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3678)
 system_logs-107.21.186.208/system.log-at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
 system_logs-107.21.186.208/system.log-at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
 system_logs-107.21.186.208/system.log-at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:199)
 system_logs-107.21.186.208/system.log-at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 system_logs-107.21.186.208/system.log-at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 system_logs-107.21.186.208/system.log-at java.lang.Thread.run(Thread.java:724)
 {code}





[jira] [Commented] (CASSANDRA-5633) CQL support for updating multiple rows in a partition using CAS

2013-12-13 Thread Sebastian Schmidt (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848238#comment-13848238
 ] 

Sebastian Schmidt commented on CASSANDRA-5633:
--

We are pretty much stuck with Hector right now and would like to move to CQL 
once this gets implemented. Our use case is not very specific: we are modeling 
data that contains relationships that need to be updated atomically. Because 
CAS through CQL is limited to a single row, atomic multi-row updates are 
impossible for us, which can easily break our constraints. We believe that any 
data model containing dependencies or relationships requires this 
functionality if CAS is to be used as an op-lock mechanism.

As a suggestion for syntax, we have the following to offer:

update cf set c='a', d='b' where foo='a' and bar='b', set c='x', d='y' where 
foo='a' and bar='c' if (bar='b' and c='d'), (bar='g' and c='h');

where the table is:

CREATE TABLE cf (
  foo text,
  bar text,
  c text,
  d text,
  PRIMARY KEY (foo, bar)
)

 CQL support for updating multiple rows in a partition using CAS
 ---

 Key: CASSANDRA-5633
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5633
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 2.0 beta 1
Reporter: sankalp kohli
Assignee: Sylvain Lebresne
Priority: Minor
  Labels: cql3
 Fix For: 2.0.4


 This is currently supported via Thrift but not via CQL. 


