[jira] [Commented] (CASSANDRA-5568) Invalid tracing info for execute_prepared_cql3_query

2013-06-20 Thread Pierre Chalamet (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688922#comment-13688922
 ] 

Pierre Chalamet commented on CASSANDRA-5568:


Oops. I didn't even think I could sink Cassandra with a simple insert and tracing 
enabled. Thanks for the investigation.

 Invalid tracing info for execute_prepared_cql3_query
 

 Key: CASSANDRA-5568
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5568
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.0
 Environment: Windows 8 x64
 Java HotSpot(TM) 64-Bit Server VM/1.7.0_11
Reporter: Pierre Chalamet
Assignee: Ryan McGuire
Priority: Minor
 Attachments: 5568-logs.tar.gz, 5568_test.py


 When using trace_next_query() and then execute_prepared_cql3_query(), it looks 
 like the tracing info is invalid (the number of sessions/events and the column 
 values are wrong).
 How to reproduce:
 {code}
 create keyspace Tests with replication = {'class': 'SimpleStrategy', 
 'replication_factor' : 1}
 create table Tests.stresstest (strid varchar,intid int, primary key (strid))
 {code}
 and then executing the following prepared query 50,000 times:
 {code}
 insert into Tests.stresstest (intid, strid) values (?, ?)
 {code}
 produces the following results:
 {code}
 localhost> select count(*) from Tests.stresstest
 +--------+
 | count  |
 +--------+
 | 5      |
 +--------+
 localhost> select count(*) from system_traces.events
 +--------+
 | count  |
 +--------+
 | 20832  |
 +--------+
 localhost> select count(*) from system_traces.sessions
 +--------+
 | count  |
 +--------+
 | 26717  |
 +--------+
 localhost> select * from system_traces.sessions limit 10
 | sessionid                            | coordinator | duration | parameters | request                     | startedat            |
 | 9aefc263-bcdb-11e2-8c60-fb495ee6a12c | 127.0.0.1   |          |            | execute_prepared_cql3_query | 5/14/2013 9:16:55 PM |
 | 9ce0bcf7-bcdb-11e2-8c60-fb495ee6a12c | 127.0.0.1   |          |            | execute_prepared_cql3_query | 5/14/2013 9:16:59 PM |
 | 9dbe4ba4-bcdb-11e2-8c60-fb495ee6a12c | 127.0.0.1   |          |            | execute_prepared_cql3_query | 5/14/2013 9:17:00 PM |
 | 9d4d3a54-bcdb-11e2-8c60-fb495ee6a12c |             | 44       |            |                             |                      |
 | 9a790bc1-bcdb-11e2-8c60-fb495ee6a12c | 127.0.0.1   |          |            | execute_prepared_cql3_query | 5/14/2013 9:16:55 PM |
 | 9c992c98-bcdb-11e2-8c60-fb495ee6a12c | 127.0.0.1   |          |            | execute_prepared_cql3_query | 5/14/2013 9:16:58 PM |
 | 9e27e2f6-bcdb-11e2-8c60-fb495ee6a12c | 127.0.0.1   | 53       |            | execute_prepared_cql3_query | 5/14/2013 9:17:01 PM |
 | 9b172074-bcdb-11e2-8c60-fb495ee6a12c | 127.0.0.1   |          |            | execute_prepared_cql3_query | 5/14/2013 9:16:56 PM |
 

[jira] [Created] (CASSANDRA-5673) NullPointerException on running instances

2013-06-20 Thread Sanjay (JIRA)
Sanjay created CASSANDRA-5673:
-

 Summary: NullPointerException on running instances
 Key: CASSANDRA-5673
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5673
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.4
Reporter: Sanjay


Hello,
We are seeing sporadic NullPointerExceptions on some of the Cassandra nodes in 
the cluster (see stack traces). 
We have two datacenters, each with 15 nodes and RF = 2; the OS is SLES with 
java-1_6_0-ibm-1.6.0_sr12.0-0.5.1. 
At present the only workaround is to stop the application running on the same 
node and run the repair tool on Cassandra. We have been unable to identify the 
cause of the error.

INFO|ScheduledTasks:1|org.apache.cassandra.service.GCInspector|GC for MarkSweepCompact: 347 ms for 1 collections, 138398568 used; max is 1051721728
2013-06-19T16:25:50:843|ERROR|ReplicateOnWriteStage:115|org.apache.cassandra.service.CassandraDaemon|Exception in thread Thread[ReplicateOnWriteStage:115,5,main]
java.lang.RuntimeException: java.lang.NullPointerException
at 
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1582)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:897)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
at java.lang.Thread.run(Thread.java:738)
Caused by: java.lang.NullPointerException
at java.util.TreeSet.iterator(TreeSet.java:230)
at 
org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:163)
at 
org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:64)
at 
org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:81)
at 
org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:68)
at 
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:274)
at 
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
at 
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
at org.apache.cassandra.db.Table.getRow(Table.java:347)
at 
org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:64)
at 
org.apache.cassandra.db.CounterMutation.makeReplicationMutation(CounterMutation.java:90)
at 
org.apache.cassandra.service.StorageProxy$7$1.runMayThrow(StorageProxy.java:796)
at 
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1578)
... 3 more
2013-06-19T16:26:01:001|ERROR|ReadStage:4833|org.apache.cassandra.service.CassandraDaemon|Exception
 in thread Thread[ReadStage:4833,5,main]
java.lang.RuntimeException: java.lang.NullPointerException
at 
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1582)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:897)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
at java.lang.Thread.run(Thread.java:738)
Caused by: java.lang.NullPointerException
at java.util.TreeSet.iterator(TreeSet.java:230)
at 
org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:163)
at org.

Best Regards
Sanjay

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5673) NullPointerException on running instances

2013-06-20 Thread Sanjay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay updated CASSANDRA-5673:
--

Description: 
Hello,
We are seeing sporadic NullPointerExceptions on some of the Cassandra nodes in 
the cluster (see stack traces). 
We have two datacenters, each with 15 nodes and RF = 2; the OS is SLES with 
java-1_6_0-ibm-1.6.0_sr12.0-0.5.1. 
At present the only workaround is to stop the application running on the same 
node and run the repair tool on Cassandra. We have been unable to identify the 
cause of the error.

1)
INFO|ScheduledTasks:1|org.apache.cassandra.service.GCInspector|GC for MarkSweepCompact: 347 ms for 1 collections, 138398568 used; max is 1051721728
2013-06-19T16:25:50:843|ERROR|ReplicateOnWriteStage:115|org.apache.cassandra.service.CassandraDaemon|Exception in thread Thread[ReplicateOnWriteStage:115,5,main]
java.lang.RuntimeException: java.lang.NullPointerException
at 
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1582)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:897)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
at java.lang.Thread.run(Thread.java:738)
Caused by: java.lang.NullPointerException
at java.util.TreeSet.iterator(TreeSet.java:230)
at 
org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:163)
at 
org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:64)
at 
org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:81)
at 
org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:68)
at 
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:274)
at 
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
at 
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
at org.apache.cassandra.db.Table.getRow(Table.java:347)
at 
org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:64)
at 
org.apache.cassandra.db.CounterMutation.makeReplicationMutation(CounterMutation.java:90)
at 
org.apache.cassandra.service.StorageProxy$7$1.runMayThrow(StorageProxy.java:796)
at 
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1578)
... 3 more
2013-06-19T16:26:01:001|ERROR|ReadStage:4833|org.apache.cassandra.service.CassandraDaemon|Exception
 in thread Thread[ReadStage:4833,5,main]
java.lang.RuntimeException: java.lang.NullPointerException
at 
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1582)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:897)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
at java.lang.Thread.run(Thread.java:738)
Caused by: java.lang.NullPointerException
at java.util.TreeSet.iterator(TreeSet.java:230)
at 
org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:163)
at org.

2)
2013-06-19T08:38:23:436| 
INFO|Thread-2447|org.apache.cassandra.service.StorageService|Starting repair 
command #2, repairing 1 ranges for keyspace system_auth
2013-06-19T08:58:25:685|ERROR|ReadStage:9270|org.apache.cassandra.service.CassandraDaemon|Exception
 in thread Thread[ReadStage:9270,5,main]
java.lang.NullPointerException
at java.util.TreeSet.iterator(TreeSet.java:230)
at 
org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:163)
at 
org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:64)
at 
org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:81)
at 
org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:68)
at 
org.apache.cassandra.db.CollationController.collectTimeOrderedData(CollationController.java:133)
at 
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
at 
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
at org.apache.cassandra.db.Table.getRow(Table.java:347)
at 

[jira] [Commented] (CASSANDRA-5668) NPE in net.OutputTcpConnection when tracing is enabled

2013-06-20 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689015#comment-13689015
 ] 

Sylvain Lebresne commented on CASSANDRA-5668:
-

For what it's worth, I think that for the 2nd problem, another option might be 
to make Tracing.initializeMessage behave slightly differently depending on the 
message type. So if the state doesn't exist but the message type is a 
REQUEST_RESPONSE, we could create the state and set it in the threadLocal, but 
not save it in the global state map.

It's a bit of a hack though, and it bothers me slightly to just leave this to 
expiration either, so ...
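
A rough sketch of that idea (hypothetical, simplified names, not the actual 
org.apache.cassandra.tracing.Tracing API):

{code}
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch: rebuild a transient TraceState for REQUEST_RESPONSE messages whose
// session is already gone, visible only to the current thread and never re-registered
// in the global session map, so it cannot leak or resurrect the closed session.
final class TracingSketch
{
    enum Verb { MUTATION, REQUEST_RESPONSE }

    static final class TraceState
    {
        final UUID sessionId;
        TraceState(UUID sessionId) { this.sessionId = sessionId; }
    }

    private final Map<UUID, TraceState> sessions = new ConcurrentHashMap<UUID, TraceState>();
    private final ThreadLocal<TraceState> state = new ThreadLocal<TraceState>();

    void initializeMessage(UUID sessionId, Verb verb)
    {
        TraceState ts = sessions.get(sessionId);
        if (ts == null && verb == Verb.REQUEST_RESPONSE)
        {
            // Session already closed/expired locally: create a throwaway state for this
            // response only, but do NOT put it back into 'sessions'.
            ts = new TraceState(sessionId);
        }
        state.set(ts); // may still be null for other verbs; callers must tolerate that
    }
}
{code}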

 NPE in net.OutputTcpConnection when tracing is enabled
 --

 Key: CASSANDRA-5668
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5668
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.2.6, 2.0 beta 1
Reporter: Ryan McGuire
 Attachments: 5668-assert-2.txt, 5668-assert.txt, 5668-logs.tar.gz, 
 5668_npe_ddl.cql, 5668_npe_insert.cql, system.log


 I get multiple NullPointerException when trying to trace INSERT statements.
 To reproduce:
 {code}
 $ ccm create -v git:trunk
 $ ccm populate -n 3
 $ ccm start
 $ ccm node1 cqlsh < 5668_npe_ddl.cql
 $ ccm node1 cqlsh < 5668_npe_insert.cql
 {code}
 And see many exceptions like this in the logs of node1:
 {code}
 ERROR [WRITE-/127.0.0.3] 2013-06-19 14:54:35,885 OutboundTcpConnection.java 
 (line 197) error writing to /127.0.0.3
 java.lang.NullPointerException
 at 
 org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:182)
 at 
 org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:144)
 {code}
 This is similar to CASSANDRA-5658 and is the reason that npe_ddl and 
 npe_insert are separate files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5342) ancestors are not cleared in SSTableMetadata after compactions are done and old SSTables are removed

2013-06-20 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-5342:
---

Attachment: 0001-CASSANDRA-5342-wip.patch

Removes ancestors from SSTableMetadata and instead makes deserialize(..) return 
a Pair<SSTM, Set<Integer>> so that the caller can decide whether the ancestors 
are needed. This allows us to keep SSTM (SSTableMetadata) as immutable as 
possible.

This forces us to re-deserialize the metadata when trying to figure out 
ancestors during compaction.

I opted not to mutate the ancestors on-disk since it makes my skin crawl.
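
For illustration, a hypothetical sketch of the shape of that change (simplified 
types, not the real metadata serializer):

{code}
import java.io.DataInput;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

final class MetadataSketch
{
    // Immutable stats holder; note there is no 'ancestors' field on it anymore.
    static final class SSTableMetadata { }

    static final class Pair<L, R>
    {
        final L left;
        final R right;
        private Pair(L left, R right) { this.left = left; this.right = right; }
        static <L, R> Pair<L, R> create(L left, R right) { return new Pair<L, R>(left, right); }
    }

    // Callers that actually need the ancestors (e.g. cleaning up compaction leftovers)
    // read pair.right; everyone else keeps pair.left and lets the Set be collected.
    static Pair<SSTableMetadata, Set<Integer>> deserialize(DataInput in) throws IOException
    {
        SSTableMetadata metadata = new SSTableMetadata();
        int count = in.readInt();
        Set<Integer> ancestors = new HashSet<Integer>();
        for (int i = 0; i < count; i++)
            ancestors.add(in.readInt());
        return Pair.create(metadata, ancestors);
    }
}
{code}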

 ancestors are not cleared in SSTableMetadata after compactions are done and 
 old SSTables are removed
 

 Key: CASSANDRA-5342
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5342
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.3
Reporter: Wei Zhu
Assignee: Marcus Eriksson
 Fix For: 1.2.7

 Attachments: 0001-CASSANDRA-5342-wip.patch, Screen Shot 2013-03-13 at 
 12.05.08 PM.png


 We are using LCS and have a total of 38,000 SSTables for one CF. During LCS, 
 there can be over a thousand SSTables involved. All those SSTable IDs are 
 stored in the ancestors field of SSTableMetadata for the new table. In our 
 case, those fields consume more than 1 GB of heap memory. To put it in 
 perspective, the ancestors consume 2-3 times more memory than the bloom filter 
 (fp = 0.1 by default) in LCS. 
 We should remove those ancestors from SSTableMetadata after the compaction is 
 finished and the old SSTables are removed. It might not be a big deal for 
 size-tiered compaction since there is a small number of SSTables involved, but 
 it consumes a lot of memory for LCS. 
 At least, we shouldn't load those ancestors into memory during startup if the 
 files have been removed. 
 I would love to contribute and provide a patch. Please let me know how to 
 start. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (CASSANDRA-5674) Inserting Zero Byte values via CQL for types other than Ascii / binary

2013-06-20 Thread Tobias Schlottke (JIRA)
Tobias Schlottke created CASSANDRA-5674:
---

 Summary: Inserting Zero Byte values via CQL for types other than 
Ascii / binary
 Key: CASSANDRA-5674
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5674
 Project: Cassandra
  Issue Type: Bug
Reporter: Tobias Schlottke


Hi there, 

we're currently upgrading from thrift to cql and are experiencing another 
problem with null values (similar to CASSANDRA-5648).
I respect the fact that null means delete and that I have to insert a zero byte 
value but what is the right zero byte value for types other than ascii/blob?

Usecase:
{code}
CREATE TABLE foo (
  key1 ascii,
  key2 timeuuid,
  key3 ascii,
  value ascii,
  PRIMARY KEY (key1, key2, key3)
) WITH COMPACT STORAGE;
{code}


I got a clustering key on three columns and want to insert an empty value for 
the Timeuuid in the middle (key2).
For data already inserted via thrift I see null for all relevant columns 
already in there, which would be my desired behaviour.

trying this:
{code}
insert into foo(key1,key2,key3) values('test', null, 'test');
{code}

returns 
{code}
Bad Request: Invalid null value for clustering key part key2
{code}

Which is okay if null implicitly means delete.
The question is: Am I able to insert a zero byte value for a type like timeuuid 
that will be compatible with my old dataset where null values where possible 
via thrift?

Best,

Tobias

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (CASSANDRA-5674) Inserting Zero Byte values via CQL for types other than Ascii / binary

2013-06-20 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne resolved CASSANDRA-5674.
-

Resolution: Invalid

 Inserting Zero Byte values via CQL for types other than Ascii / binary
 --

 Key: CASSANDRA-5674
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5674
 Project: Cassandra
  Issue Type: Bug
Reporter: Tobias Schlottke

 Hi there, 
 we're currently upgrading from thrift to cql and are experiencing another 
 problem with null values (similar to CASSANDRA-5648).
 I respect the fact that null means delete and that I have to insert a zero 
 byte value but what is the right zero byte value for types other than 
 ascii/blob?
 Usecase:
 {code}
 CREATE TABLE foo (
   key1 ascii,
   key2 timeuuid,
   key3 ascii,
   value ascii,
   PRIMARY KEY (key1, key2, key3)
 ) WITH COMPACT STORAGE;
 {code}
 I got a clustering key on three columns and want to insert an empty value 
 for the Timeuuid in the middle (key2).
 For data already inserted via thrift I see null for all relevant columns 
 already in there, which would be my desired behaviour.
 trying this:
 {code}
 insert into foo(key1,key2,key3) values('test', null, 'test');
 {code}
 returns 
 {code}
 Bad Request: Invalid null value for clustering key part key2
 {code}
 Which is okay if null implicitly means delete.
 The question is: Am I able to insert a zero byte value for a type like 
 timeuuid that will be compatible with my old dataset where null values 
 where possible via thrift?
 Best,
 Tobias

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5674) Inserting Zero Byte values via CQL for types other than Ascii / binary

2013-06-20 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689106#comment-13689106
 ] 

Sylvain Lebresne commented on CASSANDRA-5674:
-

bq. Which is okay if null implicitly means delete

For the sake of being precise, null pretty universally means the absence of a 
value. Thrift never really allowed this; it allowed an empty byte array as a 
valid value for any type. That's different from a null value: for instance, for 
strings, an empty byte array is the empty string, and no language/database that 
I know of considers that equivalent to a null string.

That's why using null to represent empty byte values is not a valid solution 
in the context of CASSANDRA-5648.

Now, imho, supporting empty values for a type like int is more of a bug of 
thrift than a feature. No language supports an empty value (which, again, is 
different from null) for an int, so working with such values will just be a 
pain in practice (and while thrift drivers might have gotten away with 
returning null for such empty values for types that don't really support them, 
CQL3 does support null values (with its normal meaning of the absence of a 
value), so that's not going to fly).

So I think people should avoid empty values for types for which they don't 
really make sense. But for thrift upgrades, the good news is that CQL3 *can* 
input such empty values:
{noformat}
INSERT INTO foo(key1, key2, key3,value) VALUES ('test', blobAsTimeuuid(0x), 
'test', '');
{noformat}
which has the advantage of making it clear what it is doing. Another solution 
is to use a prepared statement (where values are input as bytes directly, so 
passing an empty byte array for any type will be fine).

I'll note that if you do a {{SELECT}} after the insertion above in cqlsh, it 
will display null for the key2 value. But that's really a bug and I'll open a 
ticket to fix it.
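
For reference, a sketch of the prepared-statement route using the DataStax Java 
driver (the driver usage here is my own illustration, not part of the ticket); 
binding raw bytes lets you pass an empty value for any column type, including 
the timeuuid clustering column:

{code}
import java.nio.ByteBuffer;

import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class EmptyTimeuuidInsert
{
    public static void main(String[] args)
    {
        // 'ks' is a placeholder keyspace name for this example.
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("ks");

        PreparedStatement ps =
            session.prepare("INSERT INTO foo (key1, key2, key3, value) VALUES (?, ?, ?, ?)");
        BoundStatement bs = ps.bind();
        bs.setString(0, "test");
        bs.setBytesUnsafe(1, ByteBuffer.allocate(0)); // empty bytes for the timeuuid part
        bs.setString(2, "test");
        bs.setString(3, "");
        session.execute(bs);

        cluster.shutdown();
    }
}
{code}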


 Inserting Zero Byte values via CQL for types other than Ascii / binary
 --

 Key: CASSANDRA-5674
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5674
 Project: Cassandra
  Issue Type: Bug
Reporter: Tobias Schlottke

 Hi there, 
 we're currently upgrading from thrift to cql and are experiencing another 
 problem with null values (similar to CASSANDRA-5648).
 I respect the fact that null means delete and that I have to insert a zero 
 byte value but what is the right zero byte value for types other than 
 ascii/blob?
 Usecase:
 {code}
 CREATE TABLE foo (
   key1 ascii,
   key2 timeuuid,
   key3 ascii,
   value ascii,
   PRIMARY KEY (key1, key2, key3)
 ) WITH COMPACT STORAGE;
 {code}
 I got a clustering key on three columns and want to insert an empty value 
 for the Timeuuid in the middle (key2).
 For data already inserted via thrift I see null for all relevant columns 
 already in there, which would be my desired behaviour.
 trying this:
 {code}
 insert into foo(key1,key2,key3) values('test', null, 'test');
 {code}
 returns 
 {code}
 Bad Request: Invalid null value for clustering key part key2
 {code}
 Which is okay if null implicitly means delete.
 The question is: Am I able to insert a zero byte value for a type like 
 timeuuid that will be compatible with my old dataset where null values 
 where possible via thrift?
 Best,
 Tobias

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (CASSANDRA-5675) cqlsh shouldn't display null for empty values

2013-06-20 Thread Sylvain Lebresne (JIRA)
Sylvain Lebresne created CASSANDRA-5675:
---

 Summary: cqlsh shouldn't display null for empty values
 Key: CASSANDRA-5675
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5675
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Priority: Minor
 Fix For: 1.2.7


For historical reasons (and compatibility with thrift), all types support an 
empty value, even types like int for which it doesn't really make sense (see 
CASSANDRA-5674 on that subject too).

If you input such an empty value for a type like int, cqlsh will display it as 
null:
{noformat}
cqlsh:ks> CREATE TABLE test (k text PRIMARY KEY, v int);
cqlsh:ks> INSERT INTO test(k, v) VALUES ('someKey', blobAsInt(0x));
cqlsh:ks> SELECT * FROM test;

 k       | v
---------+------
 someKey | null

{noformat} 

But that's not correct: it suggests {{v}} has no value, but that's not true; it 
has a value, it's just an empty one.

Now, one may argue that supporting empty values for a type like int is broken, 
and I would agree with that. But thrift allows it, so we probably need to 
preserve that behavior for compatibility's sake. And I guess the need to use 
blobAsInt at least makes it clear that it's kind of a hack.

That being said, cqlsh should not display null as this is confusing. Instead 
I'd suggest either displaying nothing (that's how an empty string is displayed 
after all), or just going with some explicit syntax like, say, [empty 
value].
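
As a side note, a hypothetical way to see the difference from client code (the 
DataStax Java driver usage below is my own illustration, not something from 
this ticket): the raw bytes distinguish the two cases even though cqlsh 
currently prints both as null.

{code}
import java.nio.ByteBuffer;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class EmptyVsNull
{
    public static void main(String[] args)
    {
        // 'ks' is a placeholder keyspace name for this example.
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("ks");

        Row row = session.execute("SELECT v FROM test WHERE k = 'someKey'").one();
        ByteBuffer raw = row.getBytesUnsafe("v");
        if (raw == null)
            System.out.println("v is null (no value at all)");
        else if (!raw.hasRemaining())
            System.out.println("v is an empty value");   // the blobAsInt(0x) case above
        else
            System.out.println("v has " + raw.remaining() + " bytes");

        cluster.shutdown();
    }
}
{code}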

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5666) CQL3 should not allow ranges on the partition key without the token() method, even for byte ordered partitioner.

2013-06-20 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-5666:


Attachment: 5666.txt

Attached is a trivial patch for this (which includes an update of the documentation).

Just to illustrate the problem we currently have, consider (where 
ByteOrderedPartitioner is used):
{noformat}
cqlsh:ks> CREATE TABLE test ( k int PRIMARY KEY);
cqlsh:ks> INSERT INTO test(k) VALUES (1);
cqlsh:ks> INSERT INTO test(k) VALUES (0);
cqlsh:ks> INSERT INTO test(k) VALUES (-1);
cqlsh:ks> SELECT * FROM test;

 k
----
  0
  1
 -1

cqlsh:ks> SELECT * FROM test WHERE k >= -1 AND k < 1;
Bad Request: Start key must sort before (or equal to) finish key in your 
partitioner!

{noformat}
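
As a hypothetical illustration of the query shape the patch still allows (the 
DataStax Java driver usage here is my own example, not part of the patch), 
partition-key ranges have to go through token() on both sides:

{code}
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class TokenRangeQuery
{
    public static void main(String[] args)
    {
        // 'ks' is a placeholder keyspace name for this example.
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("ks");

        // A bare range on k is rejected; wrapping both sides in token() makes the
        // ordering explicit (token order, not the type's order).
        ResultSet rs = session.execute(
            "SELECT * FROM test WHERE token(k) >= token(-1) AND token(k) < token(1)");
        for (Row row : rs)
            System.out.println(row.getInt("k"));

        cluster.shutdown();
    }
}
{code}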

 CQL3 should not allow ranges on the partition key without the token() method, 
 even for byte ordered partitioner.
 

 Key: CASSANDRA-5666
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5666
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 1.2.6

 Attachments: 5666.txt


 When the partitioner is an ordered one, CQL3 currently allows non-equal 
 conditions on the partition key directly. I.e. we allow
 {noformat}
 CREATE TABLE t (k timeuuid PRIMARY KEY);
 SELECT * FROM t WHERE k > ... AND k < ...;
 {noformat}
 but this is a bug, because even ordered partitioners don't order following the 
 type of the partition key: they always order by bytes.
 So that type of query doesn't in general do what it is supposed to do, and we 
 should disallow it. Even for ordered partitioners, the token() function should 
 be used. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (CASSANDRA-5665) Gossiper.handleMajorStateChange can lose existing node ApplicationState

2013-06-20 Thread Jason Brown (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown reassigned CASSANDRA-5665:
--

Assignee: Jason Brown

 Gossiper.handleMajorStateChange can lose existing node ApplicationState
 ---

 Key: CASSANDRA-5665
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5665
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.5
Reporter: Jason Brown
Assignee: Jason Brown
Priority: Minor
  Labels: gossip, upgrade
 Fix For: 1.2.6, 2.0 beta 1

 Attachments: 5665-v1.diff


 Dovetailing on CASSANDRA-5660, I discovered that further along during an 
 upgrade, when more nodes are on the new major version, a node on the previous 
 version can get passed some incomplete gossip info about another, already 
 upgraded node, and the older node drops AppState info about that node.
 I think what happens is that a 1.1 node (older rev) gets gossip info from a 
 1.2 node (A), which includes incomplete (lacking some AppState data) gossip 
 info about another 1.2 node (B). The 1.1 node, which has incorrectly 
 kicked node B out of gossip due to the bug described in #5660, then takes 
 that incomplete node B info and wholesale replaces any previously known state 
 about node B in Gossiper.handleMajorStateChange. Thus, if we previously had 
 DC/RACK info, it'll get dropped as part of the 
 endpointStateMap.put(endpointstate). When the data being passed is incomplete, 
 1.1 will start referencing node B and gets into the NPE situation in #5498.
 Anecdotally, this bad state is short-lived, less than a few minutes, even as 
 short as ten seconds, until gossip catches up and properly propagates the 
 AppState data. Furthermore, when upgrading a two-datacenter, 48-node cluster, 
 it only occurred on two nodes for less than a minute each. Thus, the scope 
 seems limited but the problem can occur.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5342) ancestors are not cleared in SSTableMetadata after compactions are done and old SSTables are removed

2013-06-20 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-5342:
---

Attachment: 0001-CASSANDRA-5342-wip-v2.patch

use Pair.create...

 ancestors are not cleared in SSTableMetadata after compactions are done and 
 old SSTables are removed
 

 Key: CASSANDRA-5342
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5342
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.3
Reporter: Wei Zhu
Assignee: Marcus Eriksson
 Fix For: 1.2.7

 Attachments: 0001-CASSANDRA-5342-wip.patch, 
 0001-CASSANDRA-5342-wip-v2.patch, Screen Shot 2013-03-13 at 12.05.08 PM.png


 We are using LCS and have a total of 38,000 SSTables for one CF. During LCS, 
 there can be over a thousand SSTables involved. All those SSTable IDs are 
 stored in the ancestors field of SSTableMetadata for the new table. In our 
 case, those fields consume more than 1 GB of heap memory. To put it in 
 perspective, the ancestors consume 2-3 times more memory than the bloom filter 
 (fp = 0.1 by default) in LCS. 
 We should remove those ancestors from SSTableMetadata after the compaction is 
 finished and the old SSTables are removed. It might not be a big deal for 
 size-tiered compaction since there is a small number of SSTables involved, but 
 it consumes a lot of memory for LCS. 
 At least, we shouldn't load those ancestors into memory during startup if the 
 files have been removed. 
 I would love to contribute and provide a patch. Please let me know how to 
 start. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


git commit: Ninja-remove redundant text type from native proto spec (v2)

2013-06-20 Thread aleksey
Updated Branches:
  refs/heads/trunk af008a41a -> 40b6c5d9c


Ninja-remove redundant text type from native proto spec (v2)


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/40b6c5d9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/40b6c5d9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/40b6c5d9

Branch: refs/heads/trunk
Commit: 40b6c5d9c91e504e1d8ba8639e24d1cee781d10a
Parents: af008a4
Author: Aleksey Yeschenko alek...@apache.org
Authored: Thu Jun 20 17:41:06 2013 +0300
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Thu Jun 20 17:41:06 2013 +0300

--
 doc/native_protocol_v2.spec | 1 -
 1 file changed, 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/40b6c5d9/doc/native_protocol_v2.spec
--
diff --git a/doc/native_protocol_v2.spec b/doc/native_protocol_v2.spec
index 8d83d3b..3959a15 100644
--- a/doc/native_protocol_v2.spec
+++ b/doc/native_protocol_v2.spec
@@ -476,7 +476,6 @@ Table of Contents
 0x0007Double
 0x0008Float
 0x0009Int
-0x000AText
 0x000BTimestamp
 0x000CUuid
 0x000DVarchar



[jira] [Commented] (CASSANDRA-5666) CQL3 should not allow ranges on the partition key without the token() method, even for byte ordered partitioner.

2013-06-20 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689294#comment-13689294
 ] 

Jonathan Ellis commented on CASSANDRA-5666:
---

+1

 CQL3 should not allow ranges on the partition key without the token() method, 
 even for byte ordered partitioner.
 

 Key: CASSANDRA-5666
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5666
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 1.2.6

 Attachments: 5666.txt


 When the partitioner is an ordered one, CQL3 currently allows non-equal 
 conditions on the partition key directly. I.e. we allow
 {noformat}
 CREATE TABLE t (k timeuuid PRIMARY KEY);
 SELECT * FROM t WHERE k > ... AND k < ...;
 {noformat}
 but this is a bug, because even ordered partitioners don't order following the 
 type of the partition key: they always order by bytes.
 So that type of query doesn't in general do what it is supposed to do, and we 
 should disallow it. Even for ordered partitioners, the token() function should 
 be used. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (CASSANDRA-5676) Occasional timeouts from cassandra on secondary index queries: AssertionError: Illegal offset error observed in cassandra logs.

2013-06-20 Thread Rao (JIRA)
Rao created CASSANDRA-5676:
--

 Summary: Occasional timeouts from cassandra on secondary index 
queries: AssertionError: Illegal offset error observed in cassandra logs.
 Key: CASSANDRA-5676
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5676
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.2.5
Reporter: Rao


When executing queries based on a secondary index, we occasionally get an 
OperationTimeoutException from the Astyanax client, and at the same time we 
observed the following error in the Cassandra logs:

Query executed: select * from grd.route where 
serviceidentifier='com.att.aft.NagiosTestService'  LIMIT 3 ALLOW FILTERING;

serviceidentifier has a secondary index.

ERROR [ReadStage:6185] 2013-06-20 09:20:31,574 CassandraDaemon.java (line 175) 
Exception in thread Thread[ReadStage:6185,5,RMI Runtime]
java.lang.AssertionError: Illegal offset: 13956, size: 13955
at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:147)
at org.apache.cassandra.io.util.Memory.setBytes(Memory.java:103)
at 
org.apache.cassandra.io.util.MemoryOutputStream.write(MemoryOutputStream.java:45)
at 
org.apache.cassandra.utils.vint.EncodedDataOutputStream.write(EncodedDataOutputStream.java:50)
at 
org.apache.cassandra.utils.ByteBufferUtil.write(ByteBufferUtil.java:328)
at 
org.apache.cassandra.utils.ByteBufferUtil.writeWithLength(ByteBufferUtil.java:315)
at 
org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:55)
at 
org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:30)
at 
org.apache.cassandra.db.ColumnFamilySerializer.serialize(ColumnFamilySerializer.java:73)
at 
org.apache.cassandra.cache.SerializingCacheProvider$RowCacheSerializer.serialize(SerializingCacheProvider.java:47)
at 
org.apache.cassandra.cache.SerializingCacheProvider$RowCacheSerializer.serialize(SerializingCacheProvider.java:37)
at 
org.apache.cassandra.cache.SerializingCache.serialize(SerializingCache.java:118)
at 
org.apache.cassandra.cache.SerializingCache.replace(SerializingCache.java:206)
at 
org.apache.cassandra.cache.InstrumentingCache.replace(InstrumentingCache.java:54)
at 
org.apache.cassandra.db.ColumnFamilyStore.getThroughCache(ColumnFamilyStore.java:1174)
at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1203)
at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
at 
org.apache.cassandra.db.index.composites.CompositesSearcher$1.computeNext(CompositesSearcher.java:305)
at 
org.apache.cassandra.db.index.composites.CompositesSearcher$1.computeNext(CompositesSearcher.java:161)
at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
at 
org.apache.cassandra.db.ColumnFamilyStore.filter(ColumnFamilyStore.java:1466)
at 
org.apache.cassandra.db.index.composites.CompositesSearcher.search(CompositesSearcher.java:85)
at 
org.apache.cassandra.db.index.SecondaryIndexManager.search(SecondaryIndexManager.java:548)
at 
org.apache.cassandra.db.ColumnFamilyStore.search(ColumnFamilyStore.java:1454)
at 
org.apache.cassandra.service.RangeSliceVerbHandler.executeLocally(RangeSliceVerbHandler.java:44)
at 
org.apache.cassandra.service.StorageProxy$LocalRangeSliceRunnable.runMayThrow(StorageProxy.java:1076)
at 
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1578)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5676) Sporadic timeouts from cassandra on secondary index queries: AssertionError: Illegal offset error observed in cassandra logs.

2013-06-20 Thread Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rao updated CASSANDRA-5676:
---

Summary: Sporadic timeouts from cassandra on secondary index queries: 
AssertionError: Illegal offset error observed in cassandra logs.  (was: 
Occasional timeouts from cassandra on secondary index queries: AssertionError: 
Illegal offset error observed in cassandra logs.)

 Sporadic timeouts from cassandra on secondary index queries: AssertionError: 
 Illegal offset error observed in cassandra logs.
 -

 Key: CASSANDRA-5676
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5676
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.2.5
Reporter: Rao

 When executing queries based on a secondary index, we occasionally get an 
 OperationTimeoutException from the Astyanax client, and at the same time we 
 observed the following error in the Cassandra logs:
 Query executed: select * from grd.route where 
 serviceidentifier='com.att.aft.NagiosTestService'  LIMIT 3 ALLOW 
 FILTERING;
 serviceidentifier has a secondary index.
 ERROR [ReadStage:6185] 2013-06-20 09:20:31,574 CassandraDaemon.java (line 
 175) Exception in thread Thread[ReadStage:6185,5,RMI Runtime]
 java.lang.AssertionError: Illegal offset: 13956, size: 13955
 at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:147)
 at org.apache.cassandra.io.util.Memory.setBytes(Memory.java:103)
 at 
 org.apache.cassandra.io.util.MemoryOutputStream.write(MemoryOutputStream.java:45)
 at 
 org.apache.cassandra.utils.vint.EncodedDataOutputStream.write(EncodedDataOutputStream.java:50)
 at 
 org.apache.cassandra.utils.ByteBufferUtil.write(ByteBufferUtil.java:328)
 at 
 org.apache.cassandra.utils.ByteBufferUtil.writeWithLength(ByteBufferUtil.java:315)
 at 
 org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:55)
 at 
 org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:30)
 at 
 org.apache.cassandra.db.ColumnFamilySerializer.serialize(ColumnFamilySerializer.java:73)
 at 
 org.apache.cassandra.cache.SerializingCacheProvider$RowCacheSerializer.serialize(SerializingCacheProvider.java:47)
 at 
 org.apache.cassandra.cache.SerializingCacheProvider$RowCacheSerializer.serialize(SerializingCacheProvider.java:37)
 at 
 org.apache.cassandra.cache.SerializingCache.serialize(SerializingCache.java:118)
 at 
 org.apache.cassandra.cache.SerializingCache.replace(SerializingCache.java:206)
 at 
 org.apache.cassandra.cache.InstrumentingCache.replace(InstrumentingCache.java:54)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getThroughCache(ColumnFamilyStore.java:1174)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1203)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
 at 
 org.apache.cassandra.db.index.composites.CompositesSearcher$1.computeNext(CompositesSearcher.java:305)
 at 
 org.apache.cassandra.db.index.composites.CompositesSearcher$1.computeNext(CompositesSearcher.java:161)
 at 
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
 at 
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.filter(ColumnFamilyStore.java:1466)
 at 
 org.apache.cassandra.db.index.composites.CompositesSearcher.search(CompositesSearcher.java:85)
 at 
 org.apache.cassandra.db.index.SecondaryIndexManager.search(SecondaryIndexManager.java:548)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.search(ColumnFamilyStore.java:1454)
 at 
 org.apache.cassandra.service.RangeSliceVerbHandler.executeLocally(RangeSliceVerbHandler.java:44)
 at 
 org.apache.cassandra.service.StorageProxy$LocalRangeSliceRunnable.runMayThrow(StorageProxy.java:1076)
 at 
 org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1578)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:722)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (CASSANDRA-5675) cqlsh shouldn't display null for empty values

2013-06-20 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko reassigned CASSANDRA-5675:


Assignee: Aleksey Yeschenko

 cqlsh shouldn't display null for empty values
 ---

 Key: CASSANDRA-5675
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5675
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Aleksey Yeschenko
Priority: Minor
 Fix For: 1.2.7


 For historical reasons (and compatibility with thrift), all types support an 
 empty value, even types like int for which it doesn't really make sense (see 
 CASSANDRA-5674 on that subject too).
 If you input such an empty value for a type like int, cqlsh will display it 
 as null:
 {noformat}
 cqlsh:ks> CREATE TABLE test (k text PRIMARY KEY, v int);
 cqlsh:ks> INSERT INTO test(k, v) VALUES ('someKey', blobAsInt(0x));
 cqlsh:ks> SELECT * FROM test;
  k       | v
 ---------+------
  someKey | null
 {noformat} 
 But that's not correct: it suggests {{v}} has no value, but that's not true; 
 it has a value, it's just an empty one.
 Now, one may argue that supporting empty values for a type like int is broken, 
 and I would agree with that. But thrift allows it, so we probably need to 
 preserve that behavior for compatibility's sake. And I guess the need to use 
 blobAsInt at least makes it clear that it's kind of a hack.
 That being said, cqlsh should not display null as this is confusing. Instead 
 I'd suggest either displaying nothing (that's how an empty string is 
 displayed after all), or just going with some explicit syntax like, say, 
 [empty value].

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5668) NPE in net.OutputTcpConnection when tracing is enabled

2013-06-20 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-5668:
--

Attachment: 5668.txt

Good idea to check for REQUEST_RESPONSE, although it's not quite as easy as it 
sounds since we still need to be able to inject the TraceState into the 
executor stage.  Patch attached.

(Note that once the session is closed we won't know elapsed time anymore.  I 
don't see a good way around this.)

 NPE in net.OutputTcpConnection when tracing is enabled
 --

 Key: CASSANDRA-5668
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5668
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.2.6, 2.0 beta 1
Reporter: Ryan McGuire
 Attachments: 5668-assert-2.txt, 5668-assert.txt, 5668-logs.tar.gz, 
 5668_npe_ddl.cql, 5668_npe_insert.cql, 5668.txt, system.log


 I get multiple NullPointerException when trying to trace INSERT statements.
 To reproduce:
 {code}
 $ ccm create -v git:trunk
 $ ccm populate -n 3
 $ ccm start
 $ ccm node1 cqlsh < 5668_npe_ddl.cql
 $ ccm node1 cqlsh < 5668_npe_insert.cql
 {code}
 And see many exceptions like this in the logs of node1:
 {code}
 ERROR [WRITE-/127.0.0.3] 2013-06-19 14:54:35,885 OutboundTcpConnection.java 
 (line 197) error writing to /127.0.0.3
 java.lang.NullPointerException
 at 
 org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:182)
 at 
 org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:144)
 {code}
 This is similar to CASSANDRA-5658 and is the reason that npe_ddl and 
 npe_insert are separate files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-2737) CQL: support IF EXISTS extension for DROP commands (table, keyspace, index)

2013-06-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michał Michalski updated CASSANDRA-2737:


Attachment: (was: 2737-v2.txt)

 CQL: support IF EXISTS extension for DROP commands (table, keyspace, index)
 ---

 Key: CASSANDRA-2737
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2737
 Project: Cassandra
  Issue Type: New Feature
Affects Versions: 0.8.0
Reporter: Cathy Daw
Assignee: Michał Michalski
Priority: Trivial
  Labels: cql, cql3
 Fix For: 2.1

 Attachments: 2737-concept-v1.txt, 2737-poor-mans-testcase.cql




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-2737) CQL: support IF EXISTS extension for DROP commands (table, keyspace, index)

2013-06-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michał Michalski updated CASSANDRA-2737:


Attachment: 2737-v2.txt

Replacing old v2 with v2 rebased onto current trunk.

 CQL: support IF EXISTS extension for DROP commands (table, keyspace, index)
 ---

 Key: CASSANDRA-2737
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2737
 Project: Cassandra
  Issue Type: New Feature
Affects Versions: 0.8.0
Reporter: Cathy Daw
Assignee: Michał Michalski
Priority: Trivial
  Labels: cql, cql3
 Fix For: 2.1

 Attachments: 2737-concept-v1.txt, 2737-poor-mans-testcase.cql, 
 2737-v2.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (CASSANDRA-5677) Performance improvements of RangeTombstones/IntervalTree

2013-06-20 Thread Fabien Rousseau (JIRA)
Fabien Rousseau created CASSANDRA-5677:
--

 Summary: Performance improvements of RangeTombstones/IntervalTree
 Key: CASSANDRA-5677
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5677
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Fabien Rousseau
Priority: Minor


Using range tombstones massively leads to bad response times (i.e. 100-500 
range tombstones per row).

After investigation, it seems that the culprit is how DeletionInfo objects are 
merged. Each time a RangeTombstone is added to the DeletionInfo, the whole 
IntervalTree is rebuilt (thus, if you have 100 tombstones in one row, 100 
instances of IntervalTree are created: the first one with one interval, the 
second one with 2 intervals, ... the 100th one with 100 intervals).

It seems that once the IntervalTree is built, it is not possible to add a new 
Interval. The idea is to replace the IntervalTree implementation with another 
one which supports inserting intervals (a rough sketch of the idea is included 
at the end of this description).

Attached is a proposed patch which:
 - renames the IntervalTree implementation to IntervalTreeCentered (the 
renaming is inspired by http://en.wikipedia.org/wiki/Interval_tree)
 - adds a new implementation, IntervalTreeAvl (described here: 
http://en.wikipedia.org/wiki/Interval_tree#Augmented_tree and here: 
http://en.wikipedia.org/wiki/AVL_tree)
 - adds a new interface, IIntervalTree, to abstract the implementation
 - adds a new configuration option (interval_tree_provider) which allows 
choosing between the two implementations (defaults to the previous 
IntervalTreeCentered)
 - updates the IntervalTreeTest unit tests to test both implementations
 - adds a mini benchmark between the two implementations (tree creation, 
point lookup, interval lookup)
 - adds a mini benchmark between the two implementations when merging 
DeletionInfo (which shows a big performance improvement when using 500 
tombstones for a row)

This patch applies to the 1.2 branch...
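
A rough standalone sketch of the augmented-tree idea (my own illustration with 
simplified types, not the attached patch): a BST keyed by the interval's start, 
where each node also tracks the max end point of its subtree, so intervals can 
be inserted one at a time instead of rebuilding the whole tree. The patch adds 
AVL rebalancing on top of the same structure.

{code}
final class IntervalTreeSketch
{
    static final class Node
    {
        final long low, high; // the interval [low, high]
        long max;             // max 'high' over this node's subtree
        Node left, right;

        Node(long low, long high)
        {
            this.low = low;
            this.high = high;
            this.max = high;
        }
    }

    private Node root;

    void insert(long low, long high)
    {
        root = insert(root, low, high);
    }

    private Node insert(Node n, long low, long high)
    {
        if (n == null)
            return new Node(low, high);
        if (low < n.low)
            n.left = insert(n.left, low, high);
        else
            n.right = insert(n.right, low, high);
        n.max = Math.max(n.max, high);
        return n;
    }

    /** Returns true if some stored interval contains the given point (a stabbing query). */
    boolean contains(long point)
    {
        Node n = root;
        while (n != null)
        {
            if (n.low <= point && point <= n.high)
                return true;
            // If the left subtree's max end point is before the query point,
            // nothing on the left can match, so go right.
            n = (n.left != null && n.left.max >= point) ? n.left : n.right;
        }
        return false;
    }
}
{code}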



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5677) Performance improvements of RangeTombstones/IntervalTree

2013-06-20 Thread Fabien Rousseau (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabien Rousseau updated CASSANDRA-5677:
---

Attachment: 5677-new-IntervalTree-implementation.patch

 Performance improvements of RangeTombstones/IntervalTree
 

 Key: CASSANDRA-5677
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5677
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Fabien Rousseau
Priority: Minor
 Attachments: 5677-new-IntervalTree-implementation.patch


 Using range tombstones massively leads to bad response times (i.e. 100-500 
 range tombstones per row).
 After investigation, it seems that the culprit is how DeletionInfo objects are 
 merged. Each time a RangeTombstone is added to the DeletionInfo, the whole 
 IntervalTree is rebuilt (thus, if you have 100 tombstones in one row, 100 
 instances of IntervalTree are created: the first one with one interval, the 
 second one with 2 intervals, ... the 100th one with 100 intervals).
 It seems that once the IntervalTree is built, it is not possible to add a new 
 Interval. The idea is to replace the IntervalTree implementation with another 
 one which supports inserting intervals.
 Attached is a proposed patch which:
  - renames the IntervalTree implementation to IntervalTreeCentered (the 
 renaming is inspired by http://en.wikipedia.org/wiki/Interval_tree)
  - adds a new implementation, IntervalTreeAvl (described here: 
 http://en.wikipedia.org/wiki/Interval_tree#Augmented_tree and here: 
 http://en.wikipedia.org/wiki/AVL_tree)
  - adds a new interface, IIntervalTree, to abstract the implementation
  - adds a new configuration option (interval_tree_provider) which allows 
 choosing between the two implementations (defaults to the previous 
 IntervalTreeCentered)
  - updates the IntervalTreeTest unit tests to test both implementations
  - adds a mini benchmark between the two implementations (tree creation, 
 point lookup, interval lookup)
  - adds a mini benchmark between the two implementations when merging 
 DeletionInfo (which shows a big performance improvement when using 500 
 tombstones for a row)
 This patch applies to the 1.2 branch...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5669) ITC.close() resets peer msg version, causes connection thrashing in ec2 during upgrade

2013-06-20 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-5669:
--

Reviewer: jbellis  (was: brandon.williams)

 ITC.close() resets peer msg version, causes connection thrashing in ec2 
 during upgrade
 --

 Key: CASSANDRA-5669
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5669
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.5
Reporter: Jason Brown
Assignee: Jason Brown
Priority: Minor
  Labels: gossip
 Fix For: 1.2.6, 2.0 beta 1

 Attachments: 5669-v1.diff


 While debugging the upgrade scenario described in CASSANDRA-5660, I 
 discovered that ITC.close() will reset the message protocol version of a peer 
 node that disconnects. CASSANDRA-5660 has a full description of the upgrade 
 path, but basically the Ec2MultiRegionSnitch will close connections on the 
 publicIP addr to reconnect on the privateIP, and this causes ITC to drop the 
 message protocol version of previously known nodes. I think we want to hang 
 onto that version so that when the newer node (re-)connects to the lower 
 version node, it passes the correct protocol version rather than the current 
 version (too high for the older node), instead of the connection attempt 
 getting dropped and going through the dance again.
 To clarify, the 'thrashing' is at a rather low volume, from what I observed. 
 Anecdotally, perhaps one connection per second gets turned over.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5669) ITC.close() resets peer msg version, causes connection thrashing in ec2 during upgrade

2013-06-20 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689376#comment-13689376
 ] 

Jonathan Ellis commented on CASSANDRA-5669:
---

bq. It looks like we'll (re)set the version on any new connection from a given 
node, so I'm not sure we need to explicitly throw away the version on close()

Here's the scenario.  A is 1.2.  B is 1.1.

B is restarted for upgrade.  A reconnects to B before B connects to A -- maybe 
it had an undroppable command to retry, or maybe it's just luck of the draw 
that A gossips or sends a command to B.

If we don't reset the version on close, A will connect to B as 1.1, and then B 
will think, "Oh, A is a 1.1 node, I'd better connect to him that way too."

bq. The problem I'm trying to solve here is the upgraded node trying to contact 
the older node, and things getting wonky (data race) when the 
Ec2MultiRegionSnitch chooses to close the publicIP connection in favor of the 
localIP

So damned if you do, damned if you don't...

What if we add logic to EC2MRS to only reconnect if we're both on the current 
version?  1.1 -> 1.2 would reconnect then (because 1.2 drops down to 1.1 after 
initial negotiation) but that's okay since it would reconnect at 1.1 again.  
1.2 -> 1.1 would not reconnect, so you'd have extra public traffic until 
everyone upgrades.  Acceptable?
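
A hypothetical sketch of that guard (illustrative names only, not the actual 
Ec2MultiRegionSnitch or MessagingService code):

{code}
import java.net.InetAddress;

final class ReconnectGuardSketch
{
    private final int currentVersion;

    ReconnectGuardSketch(int currentVersion)
    {
        this.currentVersion = currentVersion;
    }

    /**
     * Only switch from the public to the private address when both ends already speak
     * the current messaging version; a mid-upgrade peer keeps its public connection
     * (and its negotiated version) until it is upgraded.
     */
    void maybeReconnect(InetAddress peer, int peerVersion, InetAddress privateAddress)
    {
        if (peerVersion != currentVersion)
            return;
        reconnect(peer, privateAddress);
    }

    private void reconnect(InetAddress peer, InetAddress privateAddress)
    {
        // Placeholder for the snitch's actual reconnect path.
        System.out.println("reconnecting to " + peer + " via " + privateAddress);
    }
}
{code}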

 ITC.close() resets peer msg version, causes connection thrashing in ec2 
 during upgrade
 --

 Key: CASSANDRA-5669
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5669
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.5
Reporter: Jason Brown
Assignee: Jason Brown
Priority: Minor
  Labels: gossip
 Fix For: 1.2.6, 2.0 beta 1

 Attachments: 5669-v1.diff


 While debugging the upgrading scenario described in CASSANDRA-5660, I 
 discovered that ITC.close() will reset the message protocol version of a peer 
 node that disconnects. CASSANDRA-5660 has a full description of the upgrade 
 path, but basically the Ec2MultiRegionSnitch will close connections on the 
 publicIP addr to reconnect on the privateIp, and this causes ITC to drop the 
 message protocol version of previously known nodes. I think we want to hang 
 onto that version so that when the newer node (re-)connects to the 
 lower-version node, it passes the correct protocol version rather than the 
 current version (too high for the older node), the connection attempt gets 
 dropped, and the whole dance starts again.
 To clarify, the 'thrashing' is at a rather low volume, from what I observed. 
 Anecdotally, perhaps one connection per second gets turned over.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5673) NullPointerException on running instances

2013-06-20 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689381#comment-13689381
 ] 

Jonathan Ellis commented on CASSANDRA-5673:
---

Please test 1.2.5; SSTableNamesIterator has changed substantially due to 
CASSANDRA-5492.

 NullPointerException on running instances
 -

 Key: CASSANDRA-5673
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5673
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.4
Reporter: Sanjay

 Hello,
 We are seeing sporadic NullPointerExceptions on some of the Cassandra nodes in 
 the cluster (see stacktrace). 
 We have two datacenters, each with 15 nodes and RF = 2; the OS is SLES with 
 java-1_6_0-ibm-1.6.0_sr12.0-0.5.1. 
 At present the only workaround is to stop the application running on the same 
 node and run the repair tool on Cassandra. We are unable to identify the cause 
 of the error.
 1)
 INFO|ScheduledTasks:1|org.apache.cassandra.service.GCInspector|GC for 
 MarkSweepCompact: 347 ms for 1 collections, 138398568 used; max is 1051721728
 2013-06-19T16:25:50:843|ERROR|ReplicateOnWriteStage:115|org.apache.cassandra.service.CassandraDaemon|Exception
  in thread Thread[ReplicateOnWriteStage:115,5,main]
 java.lang.RuntimeException: java.lang.NullPointerException
 at 
 org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1582)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:897)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
 at java.lang.Thread.run(Thread.java:738)
 Caused by: java.lang.NullPointerException
 at java.util.TreeSet.iterator(TreeSet.java:230)
 at 
 org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:163)
 at 
 org.apache.cassandra.db.columniterator.SSTableNamesIterator.init(SSTableNamesIterator.java:64)
 at 
 org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:81)
 at 
 org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:68)
 at 
 org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:274)
 at 
 org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
 at org.apache.cassandra.db.Table.getRow(Table.java:347)
 at 
 org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:64)
 at 
 org.apache.cassandra.db.CounterMutation.makeReplicationMutation(CounterMutation.java:90)
 at 
 org.apache.cassandra.service.StorageProxy$7$1.runMayThrow(StorageProxy.java:796)
 at 
 org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1578)
 ... 3 more
 2013-06-19T16:26:01:001|ERROR|ReadStage:4833|org.apache.cassandra.service.CassandraDaemon|Exception
  in thread Thread[ReadStage:4833,5,main]
 java.lang.RuntimeException: java.lang.NullPointerException
 at 
 org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1582)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:897)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
 at java.lang.Thread.run(Thread.java:738)
 Caused by: java.lang.NullPointerException
 at java.util.TreeSet.iterator(TreeSet.java:230)
 at 
 org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:163)
 at org.
 2)
 2013-06-19T08:38:23:436| 
 INFO|Thread-2447|org.apache.cassandra.service.StorageService|Starting repair 
 command #2, repairing 1 ranges for keyspace system_auth
 2013-06-19T08:58:25:685|ERROR|ReadStage:9270|org.apache.cassandra.service.CassandraDaemon|Exception
  in thread Thread[ReadStage:9270,5,main]
 java.lang.NullPointerException
 at java.util.TreeSet.iterator(TreeSet.java:230)
 at 
 org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:163)
 at 
 org.apache.cassandra.db.columniterator.SSTableNamesIterator.init(SSTableNamesIterator.java:64)
 at 
 org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:81)
 at 
 org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:68)
 at 
 

[jira] [Updated] (CASSANDRA-5665) Gossiper.handleMajorStateChange can lose existing node ApplicationState

2013-06-20 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-5665:
--

Reviewer: jbellis  (was: brandon.williams)

Currently the logic in applyStateLocally is (essentially)

{code}
if (remoteGeneration > localGeneration)
    handleMajorStateChange(ep, remoteState);
else if (remoteGeneration == localGeneration) // generation has not changed, apply new states
    applyNewStates(ep, localEpStatePtr, remoteState);

{code}

You're changing hMSTC to be more like applyNewStates; can we combine the two 
instead?
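
To make the suggestion concrete, here is a small, self-contained sketch (not Gossiper's actual code; the states and method names are simplified stand-ins) of merging only the ApplicationState entries the remote peer actually sent, so previously learned values such as DC/RACK survive an otherwise incomplete update:

{code}
import java.util.EnumMap;
import java.util.Map;

public final class StateMergeSketch
{
    enum ApplicationState { STATUS, DC, RACK }

    // Merge remote states into the local view instead of replacing it wholesale.
    static void applyNewStates(Map<ApplicationState, String> local,
                               Map<ApplicationState, String> remote)
    {
        // Only the keys the remote peer sent are overwritten; anything it
        // omitted keeps its previously known local value.
        local.putAll(remote);
    }

    public static void main(String[] args)
    {
        Map<ApplicationState, String> local = new EnumMap<>(ApplicationState.class);
        local.put(ApplicationState.DC, "us-east");
        local.put(ApplicationState.RACK, "1a");

        Map<ApplicationState, String> remote = new EnumMap<>(ApplicationState.class);
        remote.put(ApplicationState.STATUS, "NORMAL"); // incomplete update: no DC/RACK

        applyNewStates(local, remote);
        System.out.println(local); // DC and RACK are preserved
    }
}
{code}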

 Gossiper.handleMajorStateChange can lose existing node ApplicationState
 ---

 Key: CASSANDRA-5665
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5665
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.5
Reporter: Jason Brown
Assignee: Jason Brown
Priority: Minor
  Labels: gossip, upgrade
 Fix For: 1.2.6, 2.0 beta 1

 Attachments: 5665-v1.diff


 Dovetailing on CASSANDRA-5660, I discovered that further along during an 
 upgrade, when more nodes are on the new major version, a node on the previous 
 version can get passed some incomplete Gossip info about another, already 
 upgraded node, and the older node drops AppState info about that node.
 I think what happens is that a 1.1 node (older rev) gets gossip info from a 
 1.2 node (A), which includes incomplete (lacking some AppState data) gossip 
 info about another 1.2 node (B). The 1.1 node, which has incorrectly 
 kicked node B out of gossip due to the bug described in #5660, then takes 
 that incomplete node B info and wholesale replaces any previously known state 
 about node B in Gossiper.handleMajorStateChanged. Thus, if we previously had 
 DC/RACK info, it'll get dropped as part of the 
 endpointStateMap.put(endpointstate). When the data being passed is incomplete, 
 1.1 will start referencing node B and gets into the NPE situation in #5498.
 Anecdotally, this bad state is short-lived, less than a few minutes, even as 
 short as ten seconds, until gossip catches up and properly propagates the 
 AppState data. Furthermore, when upgrading a two datacenter, 48 node cluster, 
 it only occurred on two nodes for less than a minute each. Thus, the scope 
 seems limited but can occur.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[1/5] Streaming 2.0

2013-06-20 Thread slebresne
Updated Branches:
  refs/heads/trunk 40b6c5d9c -> 515116972


http://git-wip-us.apache.org/repos/asf/cassandra/blob/51511697/test/unit/org/apache/cassandra/streaming/SerializationsTest.java
--
diff --git a/test/unit/org/apache/cassandra/streaming/SerializationsTest.java 
b/test/unit/org/apache/cassandra/streaming/SerializationsTest.java
deleted file mode 100644
index 6db5b15..000
--- a/test/unit/org/apache/cassandra/streaming/SerializationsTest.java
+++ /dev/null
@@ -1,220 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *   http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing,
- * software distributed under the License is distributed on an
- * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- * KIND, either express or implied.  See the License for the
- * specific language governing permissions and limitations
- * under the License.
- */
-package org.apache.cassandra.streaming;
-
-import java.io.DataInputStream;
-import java.io.DataOutputStream;
-import java.io.File;
-import java.io.IOException;
-import java.util.*;
-
-import org.junit.Test;
-
-import org.apache.cassandra.AbstractSerializationsTester;
-import org.apache.cassandra.db.ColumnFamilyStore;
-import org.apache.cassandra.db.RowMutation;
-import org.apache.cassandra.db.Table;
-import org.apache.cassandra.dht.BytesToken;
-import org.apache.cassandra.dht.Range;
-import org.apache.cassandra.dht.Token;
-import org.apache.cassandra.io.sstable.Descriptor;
-import org.apache.cassandra.io.sstable.SSTable;
-import org.apache.cassandra.io.sstable.SSTableReader;
-import org.apache.cassandra.net.MessageIn;
-import org.apache.cassandra.utils.ByteBufferUtil;
-import org.apache.cassandra.utils.FBUtilities;
-import org.apache.cassandra.utils.Pair;
-import org.apache.cassandra.utils.UUIDGen;
-
-public class SerializationsTest extends AbstractSerializationsTester
-{
-private void testPendingFileWrite() throws IOException
-{
-// make sure to test serializing null and a pf with no sstable.
-PendingFile normal = makePendingFile(true, 100, 
OperationType.BOOTSTRAP);
-PendingFile noSections = makePendingFile(true, 0, OperationType.AES);
-PendingFile noSST = makePendingFile(false, 100, 
OperationType.RESTORE_REPLICA_COUNT);
-
-DataOutputStream out = getOutput("streaming.PendingFile.bin");
-PendingFile.serializer.serialize(normal, out, getVersion());
-PendingFile.serializer.serialize(noSections, out, getVersion());
-PendingFile.serializer.serialize(noSST, out, getVersion());
-PendingFile.serializer.serialize(null, out, getVersion());
-out.close();
-
-// test serializedSize
-testSerializedSize(normal, PendingFile.serializer);
-testSerializedSize(noSections, PendingFile.serializer);
-testSerializedSize(noSST, PendingFile.serializer);
-testSerializedSize(null, PendingFile.serializer);
-}
-
-@Test
-public void testPendingFileRead() throws IOException
-{
-if (EXECUTE_WRITES)
-testPendingFileWrite();
-
-DataInputStream in = getInput("streaming.PendingFile.bin");
-assert PendingFile.serializer.deserialize(in, getVersion()) != null;
-assert PendingFile.serializer.deserialize(in, getVersion()) != null;
-assert PendingFile.serializer.deserialize(in, getVersion()) != null;
-assert PendingFile.serializer.deserialize(in, getVersion()) == null;
-in.close();
-}
-
-private void testStreamHeaderWrite() throws IOException
-{
-UUID sessionId = UUIDGen.getTimeUUID();
-StreamHeader sh0 = new StreamHeader("Keyspace1", sessionId, 
makePendingFile(true, 100, OperationType.BOOTSTRAP));
-StreamHeader sh1 = new StreamHeader("Keyspace1", sessionId, 
makePendingFile(false, 100, OperationType.BOOTSTRAP));
-Collection<PendingFile> files = new ArrayList<PendingFile>();
-for (int i = 0; i < 50; i++)
-files.add(makePendingFile(i % 2 == 0, 100, 
OperationType.BOOTSTRAP));
-StreamHeader sh2 = new StreamHeader("Keyspace1", sessionId, 
makePendingFile(true, 100, OperationType.BOOTSTRAP), files);
-StreamHeader sh3 = new StreamHeader("Keyspace1", sessionId, null, 
files);
-StreamHeader sh4 = new StreamHeader("Keyspace1", sessionId, 
makePendingFile(true, 100, OperationType.BOOTSTRAP), new 
ArrayList<PendingFile>());
-
-DataOutputStream out = getOutput("streaming.StreamHeader.bin");
-

git commit: Never allow partition range queries in CQL3 without token()

2013-06-20 Thread slebresne
Updated Branches:
  refs/heads/cassandra-1.2 9ba0ff03e -> 41f418a09


Never allow partition range queries in CQL3 without token()

patch by slebresne; reviewed by jbellis for CASSANDRA-5666


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/41f418a0
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/41f418a0
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/41f418a0

Branch: refs/heads/cassandra-1.2
Commit: 41f418a09e03e36911b404ff01c96adefc75b988
Parents: 9ba0ff0
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Thu Jun 20 19:10:24 2013 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Thu Jun 20 19:10:24 2013 +0200

--
 CHANGES.txt| 1 +
 doc/cql3/CQL.textile   | 5 +++--
 .../org/apache/cassandra/cql3/statements/SelectStatement.java  | 6 +-
 3 files changed, 5 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/41f418a0/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index e1282aa..bd52eab 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -29,6 +29,7 @@
  * Suppress custom exceptions thru jmx (CASSANDRA-5652)
  * Update CREATE CUSTOM INDEX syntax (CASSANDRA-5639)
  * Fix PermissionDetails.equals() method (CASSANDRA-5655)
+ * Never allow partition key ranges in CQL3 without token() (CASSANDRA-5666)
 Merged from 1.1:
  * Remove buggy thrift max message length option (CASSANDRA-5529)
  * Fix NPE in Pig's widerow mode (CASSANDRA-5488)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/41f418a0/doc/cql3/CQL.textile
--
diff --git a/doc/cql3/CQL.textile b/doc/cql3/CQL.textile
index 5fa36ab..f7d2dda 100644
--- a/doc/cql3/CQL.textile
+++ b/doc/cql3/CQL.textile
@@ -626,7 +626,7 @@ h4(#selectWhere). @where-clause@
 
 The @where-clause@ specifies which rows must be queried. It is composed of 
relations on the columns that are part of the @PRIMARY KEY@ and/or have a 
"secondary index":#createIndexStmt defined on them.
 
-Not all relations are allowed in a query. For instance, non-equal relations 
(where @IN@ is considered as an equal relation) on a partition key is only 
supported if the partitioner for the keyspace is an ordered one. Moreover, for 
a given partition key, the clustering keys induce an ordering of rows and 
relations on them is restricted to the relations that allow to select a 
*contiguous* (for the ordering) set of rows. For instance, given
+Not all relations are allowed in a query. For instance, non-equal relations 
(where @IN@ is considered as an equal relation) on a partition key are not 
supported (but see the use of the @TOKEN@ method below to do non-equal queries 
on the partition key). Moreover, for a given partition key, the clustering keys 
induce an ordering of rows and relations on them is restricted to the relations 
that allow to select a *contiguous* (for the ordering) set of rows. For 
instance, given
 
 bc(sample). 
 CREATE TABLE posts (
@@ -650,7 +650,7 @@ bc(sample).
 // Needs a blog_title to be set to select ranges of posted_at
SELECT entry_title, content FROM posts WHERE userid='john doe' AND posted_at 
>= 2012-01-01 AND posted_at < 2012-01-31
 
-When specifying relations, the @TOKEN@ function can be used on the @PARTITION 
KEY@ column to query. In that case, rows will be selected based on the token of 
their @PARTITION_KEY@ rather than on the value (note that the token of a key 
depends on the partitioner in use, and that in particular the RandomPartitioner 
won't yeld a meaningful order). Example:
+When specifying relations, the @TOKEN@ function can be used on the @PARTITION 
KEY@ column to query. In that case, rows will be selected based on the token of 
their @PARTITION_KEY@ rather than on the value. Note that the token of a key 
depends on the partitioner in use, and that in particular the RandomPartitioner 
won't yield a meaningful order. Also note that ordering partitioners always 
order token values by bytes (so even if the partition key is of type int, 
@token(-1) > token(0)@ in particular). Example:
 
 bc(sample). 
SELECT * FROM posts WHERE token(userid) > token('tom') AND token(userid) < 
token('bob')
@@ -1051,6 +1051,7 @@ The following describes the addition/changes brought for 
each version of CQL.
 h3. 3.0.4
 
 * Updated the syntax for custom secondary indexes:#createIndexStmt.
+* Non-equal conditions on the partition key are now never supported, even for 
ordering partitioners, as this was not correct (the order was *not* the one of 
the type of the partition key). Instead, the @token@ method should always be 
used for range queries on the partition key 

[jira] [Commented] (CASSANDRA-5669) ITC.close() resets peer msg version, causes connection thrashing in ec2 during upgrade

2013-06-20 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689422#comment-13689422
 ] 

Jason Brown commented on CASSANDRA-5669:


Ahh, I see your point. Our upgrades are never that short, time-wise; when we 
bounce a node for upgrade, A would usually mark B as dead and drop any messages.

Yes, I think your proposal will be fine; a little extra public traffic is 
better than thrashing (all) connections. This will work now that we keep the 
version with the OTC rather than in each individual message (as we did pre-1.2).
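
As a toy illustration of that design (class and method names are invented for the example, not the actual OutboundTcpConnection API): the negotiated version is cached once per connection, and every message written on that connection is serialized for it, so nothing version-related travels with each individual message.

{code}
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;

public final class OutboundConnectionSketch
{
    // The peer's messaging version, learned once during the handshake.
    private final AtomicInteger targetVersion = new AtomicInteger(-1);

    void handshakeCompleted(int negotiatedVersion)
    {
        targetVersion.set(negotiatedVersion);
    }

    byte[] serialize(String payload)
    {
        // Every message reuses the cached per-connection version.
        int version = targetVersion.get();
        return (version + ":" + payload).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args)
    {
        OutboundConnectionSketch conn = new OutboundConnectionSketch();
        conn.handshakeCompleted(5); // e.g. a 1.1 peer
        System.out.println(new String(conn.serialize("MUTATION"), StandardCharsets.UTF_8));
    }
}
{code}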

 ITC.close() resets peer msg version, causes connection thrashing in ec2 
 during upgrade
 --

 Key: CASSANDRA-5669
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5669
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.5
Reporter: Jason Brown
Assignee: Jason Brown
Priority: Minor
  Labels: gossip
 Fix For: 1.2.6, 2.0 beta 1

 Attachments: 5669-v1.diff


 While debugging the upgrading scenario described in CASSANDRA-5660, I 
 discovered that ITC.close() will reset the message protocol version of a peer 
 node that disconnects. CASSANDRA-5660 has a full description of the upgrade 
 path, but basically the Ec2MultiRegionSnitch will close connections on the 
 publicIP addr to reconnect on the privateIp, and this causes ITC to drop the 
 message protocol version of previously known nodes. I think we want to hang 
 onto that version so that when the newer node (re-)connects to the 
 lower-version node, it passes the correct protocol version rather than the 
 current version (too high for the older node), the connection attempt gets 
 dropped, and the whole dance starts again.
 To clarify, the 'thrashing' is at a rather low volume, from what I observed. 
 Anecdotally, perhaps one connection per second gets turned over.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5524) Allow upgradesstables to be run against a specified directory

2013-06-20 Thread Nick Bailey (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Bailey updated CASSANDRA-5524:
---

Attachment: 0003-Update-NEWS.txt-and-debian-scripts.patch
0002-Rename-snapshotupgrade-to-sstableupgrade.patch

Renamed the tool to 'sstableupgrade' and added entries to NEWS.txt and debian 
scripts.

 Allow upgradesstables to be run against a specified directory
 -

 Key: CASSANDRA-5524
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5524
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Tyler Hobbs
Assignee: Nick Bailey
Priority: Minor
 Fix For: 1.2.6

 Attachments: 0001-Add-a-snapshot-upgrade-tool.patch, 
 0002-Rename-snapshotupgrade-to-sstableupgrade.patch, 
 0003-Update-NEWS.txt-and-debian-scripts.patch


 Currently, upgradesstables only modifies live SSTables.  Because 
 sstableloader cannot stream old SSTable formats, this makes it difficult to 
 restore data from a snapshot taken in a previous major version of Cassandra.
 Allowing the user to specify a directory for upgradesstables would resolve 
 this, but it may also be nice to upgrade SSTables in snapshot directories 
 automatically or with a separate flag.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


git commit: Add standalone sstableupgrade utility. Patch by Nick Bailey, reviewed by brandonwilliams for CASSANDRA-5524

2013-06-20 Thread brandonwilliams
Updated Branches:
  refs/heads/cassandra-1.2 41f418a09 -> 3814af808


Add standalone sstableupgrade utility.
Patch by Nick Bailey, reviewed by brandonwilliams for CASSANDRA-5524


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/3814af80
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/3814af80
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/3814af80

Branch: refs/heads/cassandra-1.2
Commit: 3814af8087c8b5541bea563344afcc344f5efa2a
Parents: 41f418a
Author: Brandon Williams brandonwilli...@apache.org
Authored: Thu Jun 20 13:15:54 2013 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Thu Jun 20 13:15:54 2013 -0500

--
 NEWS.txt|  67 +++---
 bin/sstableupgrade  |  55 +
 debian/cassandra.install|   1 +
 .../cassandra/db/compaction/Upgrader.java   | 167 ++
 .../cassandra/tools/StandaloneUpgrader.java | 223 +++
 5 files changed, 482 insertions(+), 31 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/3814af80/NEWS.txt
--
diff --git a/NEWS.txt b/NEWS.txt
index 5cb06da..dbc9aab 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -8,6 +8,11 @@ upgrade, just in case you need to roll back to the previous 
version.
 (Cassandra version X + 1 will always be able to read data files created
 by version X, but the inverse is not necessarily the case.)
 
+When upgrading major versions of Cassandra, you will be unable to
+restore snapshots created with the previous major version using the
+'sstableloader' tool. You can upgrade the file format of your snapshots
+using the provided 'sstableupgrade' tool.
+
 1.2.6
 =
 
@@ -217,7 +222,7 @@ Features
 - num_tokens can now be specified in cassandra.yaml. This defines the
   number of tokens assigned to the host on the ring (default: 1).
   Also specifying initial_token will override any num_tokens setting.
-- disk_failure_policy allows blacklisting failed disks in JBOD 
+- disk_failure_policy allows blacklisting failed disks in JBOD
   configuration instead of erroring out indefinitely
 - event tracing can be configured per-connection (trace_next_query)
   or globally/probabilistically (nodetool settraceprobability)
@@ -314,7 +319,7 @@ Upgrading
   throw an InvalidRequestException when used for reads.  (Previous
   versions would silently perform a ONE read for range queries;
   single-row and multiget reads already rejected ANY.)
-- The largest mutation batch accepted by the commitlog is now 128MB.  
+- The largest mutation batch accepted by the commitlog is now 128MB.
   (In practice, batches larger than ~10MB always caused poor
   performance due to load volatility and GC promotion failures.)
   Larger batches will continue to be accepted but will not be
@@ -514,7 +519,7 @@ Upgrading
 - Upgrading from version 0.7.1+ or 0.8.2+ can be done with a rolling
   restart, one node at a time.  (0.8.0 or 0.8.1 are NOT network-compatible
   with 1.0: upgrade to the most recent 0.8 release first.)
-  You do not need to bring down the whole cluster at once. 
+  You do not need to bring down the whole cluster at once.
 - After upgrading, run nodetool scrub against each node before running
   repair, moving nodes, or adding new ones.
 - CQL inserts/updates now generate microsecond resolution timestamps
@@ -695,7 +700,7 @@ Upgrading
 -
 - Upgrading from version 0.7.1 or later can be done with a rolling
   restart, one node at a time.  You do not need to bring down the
-  whole cluster at once. 
+  whole cluster at once.
 - After upgrading, run nodetool scrub against each node before running
   repair, moving nodes, or adding new ones.
 - Running nodetool drain before shutting down the 0.7 node is
@@ -706,8 +711,8 @@ Upgrading
   to use your 0.7 clients.
 - Avro record classes used in map/reduce and Hadoop streaming code have
   been removed. Map/reduce can be switched to Thrift by changing
-  org.apache.cassandra.avro in import statements to 
-  org.apache.cassandra.thrift (no class names change). Streaming support 
+  org.apache.cassandra.avro in import statements to
+  org.apache.cassandra.thrift (no class names change). Streaming support
   has been removed for the time being.
 - The loadbalance command has been removed from nodetool.  For similar
   behavior, decommission then rebootstrap with empty initial_token.
@@ -721,15 +726,15 @@ Features
 
 - added CQL client API and JDBC/DBAPI2-compliant drivers for Java and
   Python, respectively 

[jira] [Commented] (CASSANDRA-5524) Allow upgradesstables to be run against a specified directory

2013-06-20 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689475#comment-13689475
 ] 

Brandon Williams commented on CASSANDRA-5524:
-

Committed.

 Allow upgradesstables to be run against a specified directory
 -

 Key: CASSANDRA-5524
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5524
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Tyler Hobbs
Assignee: Nick Bailey
Priority: Minor
 Fix For: 1.2.6

 Attachments: 0001-Add-a-snapshot-upgrade-tool.patch, 
 0002-Rename-snapshotupgrade-to-sstableupgrade.patch, 
 0003-Update-NEWS.txt-and-debian-scripts.patch


 Currently, upgradesstables only modifies live SSTables.  Because 
 sstableloader cannot stream old SSTable formats, this makes it difficult to 
 restore data from a snapshot taken in a previous major version of Cassandra.
 Allowing the user to specify a directory for upgradesstables would resolve 
 this, but it may also be nice to upgrade SSTables in snapshot directories 
 automatically or with a separate flag.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5669) ITC.close() resets peer msg version, causes connection thrashing in ec2 during upgrade

2013-06-20 Thread Jason Brown (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-5669:
---

Attachment: 5669-v2.diff

v2 adds an additional check in Ec2MRS.reConnect() to make sure the peer node is at 
the same MS.current_version before closing the connection on the publicIP (and 
reconnecting on the privateIP).

 ITC.close() resets peer msg version, causes connection thrashing in ec2 
 during upgrade
 --

 Key: CASSANDRA-5669
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5669
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.5
Reporter: Jason Brown
Assignee: Jason Brown
Priority: Minor
  Labels: gossip
 Fix For: 1.2.6, 2.0 beta 1

 Attachments: 5669-v1.diff, 5669-v2.diff


 While debugging the upgrading scenario described in CASSANDRA-5660, I 
 discovered that ITC.close() will reset the message protocol version of a peer 
 node that disconnects. CASSANDRA-5660 has a full description of the upgrade 
 path, but basically the Ec2MultiRegionSnitch will close connections on the 
 publicIP addr to reconnect on the privateIp, and this causes ITC to drop the 
 message protocol version of previously known nodes. I think we want to hang 
 onto that version so that when the newer node (re-)connects to the 
 lower-version node, it passes the correct protocol version rather than the 
 current version (too high for the older node), the connection attempt gets 
 dropped, and the whole dance starts again.
 To clarify, the 'thrashing' is at a rather low volume, from what I observed. 
 Anecdotally, perhaps one connection per second gets turned over.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[2/5] git commit: Add standalone sstableupgrade utility. Patch by Nick Bailey, reviewed by brandonwilliams for CASSANDRA-5524

2013-06-20 Thread jbellis
Add standalone sstableupgrade utility.
Patch by Nick Bailey, reviewed by brandonwilliams for CASSANDRA-5524


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/3814af80
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/3814af80
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/3814af80

Branch: refs/heads/trunk
Commit: 3814af8087c8b5541bea563344afcc344f5efa2a
Parents: 41f418a
Author: Brandon Williams brandonwilli...@apache.org
Authored: Thu Jun 20 13:15:54 2013 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Thu Jun 20 13:15:54 2013 -0500

--
 NEWS.txt|  67 +++---
 bin/sstableupgrade  |  55 +
 debian/cassandra.install|   1 +
 .../cassandra/db/compaction/Upgrader.java   | 167 ++
 .../cassandra/tools/StandaloneUpgrader.java | 223 +++
 5 files changed, 482 insertions(+), 31 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/3814af80/NEWS.txt
--
diff --git a/NEWS.txt b/NEWS.txt
index 5cb06da..dbc9aab 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -8,6 +8,11 @@ upgrade, just in case you need to roll back to the previous 
version.
 (Cassandra version X + 1 will always be able to read data files created
 by version X, but the inverse is not necessarily the case.)
 
+When upgrading major versions of Cassandra, you will be unable to
+restore snapshots created with the previous major version using the
+'sstableloader' tool. You can upgrade the file format of your snapshots
+using the provided 'sstableupgrade' tool.
+
 1.2.6
 =
 
@@ -217,7 +222,7 @@ Features
 - num_tokens can now be specified in cassandra.yaml. This defines the
   number of tokens assigned to the host on the ring (default: 1).
   Also specifying initial_token will override any num_tokens setting.
-- disk_failure_policy allows blacklisting failed disks in JBOD 
+- disk_failure_policy allows blacklisting failed disks in JBOD
   configuration instead of erroring out indefinitely
 - event tracing can be configured per-connection (trace_next_query)
   or globally/probabilistically (nodetool settraceprobability)
@@ -314,7 +319,7 @@ Upgrading
   throw an InvalidRequestException when used for reads.  (Previous
   versions would silently perform a ONE read for range queries;
   single-row and multiget reads already rejected ANY.)
-- The largest mutation batch accepted by the commitlog is now 128MB.  
+- The largest mutation batch accepted by the commitlog is now 128MB.
   (In practice, batches larger than ~10MB always caused poor
   performance due to load volatility and GC promotion failures.)
   Larger batches will continue to be accepted but will not be
@@ -514,7 +519,7 @@ Upgrading
 - Upgrading from version 0.7.1+ or 0.8.2+ can be done with a rolling
   restart, one node at a time.  (0.8.0 or 0.8.1 are NOT network-compatible
   with 1.0: upgrade to the most recent 0.8 release first.)
-  You do not need to bring down the whole cluster at once. 
+  You do not need to bring down the whole cluster at once.
 - After upgrading, run nodetool scrub against each node before running
   repair, moving nodes, or adding new ones.
 - CQL inserts/updates now generate microsecond resolution timestamps
@@ -695,7 +700,7 @@ Upgrading
 -
 - Upgrading from version 0.7.1 or later can be done with a rolling
   restart, one node at a time.  You do not need to bring down the
-  whole cluster at once. 
+  whole cluster at once.
 - After upgrading, run nodetool scrub against each node before running
   repair, moving nodes, or adding new ones.
 - Running nodetool drain before shutting down the 0.7 node is
@@ -706,8 +711,8 @@ Upgrading
   to use your 0.7 clients.
 - Avro record classes used in map/reduce and Hadoop streaming code have
   been removed. Map/reduce can be switched to Thrift by changing
-  org.apache.cassandra.avro in import statements to 
-  org.apache.cassandra.thrift (no class names change). Streaming support 
+  org.apache.cassandra.avro in import statements to
+  org.apache.cassandra.thrift (no class names change). Streaming support
   has been removed for the time being.
 - The loadbalance command has been removed from nodetool.  For similar
   behavior, decommission then rebootstrap with empty initial_token.
@@ -721,15 +726,15 @@ Features
 
 - added CQL client API and JDBC/DBAPI2-compliant drivers for Java and
   Python, respectively (see: drivers/ subdirectory and doc/cql)
-- added distributed Counters 

[4/5] git commit: add license

2013-06-20 Thread jbellis
add license


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8d17ccb7
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8d17ccb7
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8d17ccb7

Branch: refs/heads/cassandra-1.2
Commit: 8d17ccb7b26d705e815863554d7e15f4eca46c89
Parents: 3814af8
Author: Jonathan Ellis jbel...@apache.org
Authored: Thu Jun 20 10:21:23 2013 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Thu Jun 20 13:53:59 2013 -0500

--
 .../cassandra/metrics/ReadRepairMetrics.java| 21 
 1 file changed, 21 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/8d17ccb7/src/java/org/apache/cassandra/metrics/ReadRepairMetrics.java
--
diff --git a/src/java/org/apache/cassandra/metrics/ReadRepairMetrics.java 
b/src/java/org/apache/cassandra/metrics/ReadRepairMetrics.java
index 3f48fee..5b61e42 100644
--- a/src/java/org/apache/cassandra/metrics/ReadRepairMetrics.java
+++ b/src/java/org/apache/cassandra/metrics/ReadRepairMetrics.java
@@ -1,4 +1,25 @@
 package org.apache.cassandra.metrics;
+/*
+ * 
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ * 
+ */
+
 
 import java.util.concurrent.TimeUnit;
 



[5/5] git commit: merge from 1.2

2013-06-20 Thread jbellis
merge from 1.2


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b9de5de2
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b9de5de2
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b9de5de2

Branch: refs/heads/trunk
Commit: b9de5de235267154f6a6fea5f2ca6710c5efefc5
Parents: 5151169 8d17ccb
Author: Jonathan Ellis jbel...@apache.org
Authored: Thu Jun 20 13:56:04 2013 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Thu Jun 20 13:56:04 2013 -0500

--
 CHANGES.txt |   1 +
 NEWS.txt|  71 +++---
 bin/sstableupgrade  |  55 +
 debian/cassandra.install|   1 +
 doc/cql3/CQL.textile|   5 +-
 .../cql3/statements/SelectStatement.java|  11 +-
 .../cassandra/db/compaction/Upgrader.java   | 167 ++
 .../cassandra/tools/StandaloneUpgrader.java | 223 +++
 8 files changed, 494 insertions(+), 40 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b9de5de2/CHANGES.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b9de5de2/NEWS.txt
--
diff --cc NEWS.txt
index 7e06aa7,dbc9aab..c838d48
--- a/NEWS.txt
+++ b/NEWS.txt
@@@ -8,64 -8,11 +8,73 @@@ upgrade, just in case you need to roll 
  (Cassandra version X + 1 will always be able to read data files created
  by version X, but the inverse is not necessarily the case.)
  
++ HEAD
 +2.0.0
 +=
 +
 +Upgrading
 +-
 +- CAS and new features in CQL such as DROP COLUMN assume that cell
 +  timestamps are microseconds-since-epoch.  Do not use these
 +  features if you are using client-specified timestamps with some
 +  other source.
 +- Upgrading is ONLY supported from Cassandra 1.2.5 or later.  This
 +  goes for sstable compatibility as well as network.  When
 +  upgrading from an earlier release, upgrade to 1.2.5 first and
 +  run upgradesstables before proceeding to 2.0.
 +- Replication and strategy options do not accept unknown options anymore.
 +  This was already the case for CQL3 in 1.2 but this is now the case for
 +  thrift too.
 +- auto_bootstrap of a single-token node with no initial_token will
 +  now pick a random token instead of bisecting an existing token
 +  range.  We recommend upgrading to vnodes; failing that, we
 +  recommend specifying initial_token.
 +- reduce_cache_sizes_at, reduce_cache_capacity_to, and
 +  flush_largest_memtables_at options have been removed from 
cassandra.yaml.
 +- CacheServiceMBean.reduceCacheSizes() has been removed.
 +  Use CacheServiceMBean.set{Key,Row}CacheCapacityInMB() instead.
 +- authority option in cassandra.yaml has been deprecated since 1.2.0,
 +  but it has been completely removed in 2.0. Please use 'authorizer' 
option.
 +- ASSUME command has been removed from cqlsh. Use CQL3 blobAsType() and
 +  typeAsBlob() conversion functions instead.
 +  See https://cassandra.apache.org/doc/cql3/CQL.html#blobFun for details.
 +- Inputing blobs as string constants is now fully deprecated in
 +  favor of blob constants. Make sure to update your applications to use
 +  the new syntax while you are still on 1.2 (which supports both string
 +  and blob constants for blob input) before upgrading to 2.0.
 +
 +Operations
 +--
 +- Major compactions, cleanup, scrub, and upgradesstables will interrupt 
 +  any in-progress compactions (but not repair validations) when invoked.
 +- Disabling autocompactions by setting min/max compaction threshold to 0
 +  has been deprecated, instead, use the nodetool commands 
'disableautocompaction'
 +  and 'enableautocompaction' or set the compaction strategy option 
enabled = false
 +- ALTER TABLE DROP has been reenabled for CQL3 tables and has new 
semantics now.
 +  See https://cassandra.apache.org/doc/cql3/CQL.html#alterTableStmt and
 +  https://issues.apache.org/jira/browse/CASSANDRA-3919 for details.
 +- CAS uses gc_grace_seconds to determine how long to keep unused paxos
 +  state around for, or a minimum of three hours.
 +
 +Features
 +
 +- Alias support has been added to CQL3 SELECT statement. Refer to
 +  CQL3 documentation (http://cassandra.apache.org/doc/cql3/CQL.html) for 
details.
 +- JEMalloc support (see memory_allocator in cassandra.yaml)
 +- Experimental triggers support.  See examples/ for how to use.  
Experimental
 +  means tied closely to internal data structures; we plan to decouple 
this in
 +

[3/5] git commit: add license

2013-06-20 Thread jbellis
add license


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8d17ccb7
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8d17ccb7
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8d17ccb7

Branch: refs/heads/trunk
Commit: 8d17ccb7b26d705e815863554d7e15f4eca46c89
Parents: 3814af8
Author: Jonathan Ellis jbel...@apache.org
Authored: Thu Jun 20 10:21:23 2013 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Thu Jun 20 13:53:59 2013 -0500

--
 .../cassandra/metrics/ReadRepairMetrics.java| 21 
 1 file changed, 21 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/8d17ccb7/src/java/org/apache/cassandra/metrics/ReadRepairMetrics.java
--
diff --git a/src/java/org/apache/cassandra/metrics/ReadRepairMetrics.java 
b/src/java/org/apache/cassandra/metrics/ReadRepairMetrics.java
index 3f48fee..5b61e42 100644
--- a/src/java/org/apache/cassandra/metrics/ReadRepairMetrics.java
+++ b/src/java/org/apache/cassandra/metrics/ReadRepairMetrics.java
@@ -1,4 +1,25 @@
 package org.apache.cassandra.metrics;
+/*
+ * 
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ * 
+ */
+
 
 import java.util.concurrent.TimeUnit;
 



[1/5] git commit: Never allow partition range queries in CQL3 without token()

2013-06-20 Thread jbellis
Updated Branches:
  refs/heads/cassandra-1.2 3814af808 -> 8d17ccb7b
  refs/heads/trunk 515116972 -> b9de5de23


Never allow partition range queries in CQL3 without token()

patch by slebresne; reviewed by jbellis for CASSANDRA-5666


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/41f418a0
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/41f418a0
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/41f418a0

Branch: refs/heads/trunk
Commit: 41f418a09e03e36911b404ff01c96adefc75b988
Parents: 9ba0ff0
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Thu Jun 20 19:10:24 2013 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Thu Jun 20 19:10:24 2013 +0200

--
 CHANGES.txt| 1 +
 doc/cql3/CQL.textile   | 5 +++--
 .../org/apache/cassandra/cql3/statements/SelectStatement.java  | 6 +-
 3 files changed, 5 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/41f418a0/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index e1282aa..bd52eab 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -29,6 +29,7 @@
  * Suppress custom exceptions thru jmx (CASSANDRA-5652)
  * Update CREATE CUSTOM INDEX syntax (CASSANDRA-5639)
  * Fix PermissionDetails.equals() method (CASSANDRA-5655)
+ * Never allow partition key ranges in CQL3 without token() (CASSANDRA-5666)
 Merged from 1.1:
  * Remove buggy thrift max message length option (CASSANDRA-5529)
  * Fix NPE in Pig's widerow mode (CASSANDRA-5488)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/41f418a0/doc/cql3/CQL.textile
--
diff --git a/doc/cql3/CQL.textile b/doc/cql3/CQL.textile
index 5fa36ab..f7d2dda 100644
--- a/doc/cql3/CQL.textile
+++ b/doc/cql3/CQL.textile
@@ -626,7 +626,7 @@ h4(#selectWhere). @where-clause@
 
 The @where-clause@ specifies which rows must be queried. It is composed of 
relations on the columns that are part of the @PRIMARY KEY@ and/or have a 
"secondary index":#createIndexStmt defined on them.
 
-Not all relations are allowed in a query. For instance, non-equal relations 
(where @IN@ is considered as an equal relation) on a partition key is only 
supported if the partitioner for the keyspace is an ordered one. Moreover, for 
a given partition key, the clustering keys induce an ordering of rows and 
relations on them is restricted to the relations that allow to select a 
*contiguous* (for the ordering) set of rows. For instance, given
+Not all relations are allowed in a query. For instance, non-equal relations 
(where @IN@ is considered as an equal relation) on a partition key are not 
supported (but see the use of the @TOKEN@ method below to do non-equal queries 
on the partition key). Moreover, for a given partition key, the clustering keys 
induce an ordering of rows and relations on them is restricted to the relations 
that allow to select a *contiguous* (for the ordering) set of rows. For 
instance, given
 
 bc(sample). 
 CREATE TABLE posts (
@@ -650,7 +650,7 @@ bc(sample).
 // Needs a blog_title to be set to select ranges of posted_at
SELECT entry_title, content FROM posts WHERE userid='john doe' AND posted_at 
>= 2012-01-01 AND posted_at < 2012-01-31
 
-When specifying relations, the @TOKEN@ function can be used on the @PARTITION 
KEY@ column to query. In that case, rows will be selected based on the token of 
their @PARTITION_KEY@ rather than on the value (note that the token of a key 
depends on the partitioner in use, and that in particular the RandomPartitioner 
won't yeld a meaningful order). Example:
+When specifying relations, the @TOKEN@ function can be used on the @PARTITION 
KEY@ column to query. In that case, rows will be selected based on the token of 
their @PARTITION_KEY@ rather than on the value. Note that the token of a key 
depends on the partitioner in use, and that in particular the RandomPartitioner 
won't yield a meaningful order. Also note that ordering partitioners always 
order token values by bytes (so even if the partition key is of type int, 
@token(-1) > token(0)@ in particular). Example:
 
 bc(sample). 
SELECT * FROM posts WHERE token(userid) > token('tom') AND token(userid) < 
token('bob')
@@ -1051,6 +1051,7 @@ The following describes the addition/changes brought for 
each version of CQL.
 h3. 3.0.4
 
 * Updated the syntax for custom secondary indexes:#createIndexStmt.
+* Non-equal conditions on the partition key are now never supported, even for 
ordering partitioners, as this was not correct (the order was *not* the one of 
the type of the partition key). Instead, the @token@ method should always be 
used for 

git commit: r/m unreachable code

2013-06-20 Thread jbellis
Updated Branches:
  refs/heads/trunk b9de5de23 -> 1f061f949


r/m unreachable code


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1f061f94
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1f061f94
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1f061f94

Branch: refs/heads/trunk
Commit: 1f061f94906884efca0213014516fbdeb82f8005
Parents: b9de5de
Author: Jonathan Ellis jbel...@apache.org
Authored: Thu Jun 20 13:59:19 2013 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Thu Jun 20 13:59:19 2013 -0500

--
 .../org/apache/cassandra/cql3/statements/SelectStatement.java| 4 
 1 file changed, 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/1f061f94/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
--
diff --git a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java 
b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
index 3815a9d..bdfc326 100644
--- a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
+++ b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
@@ -1025,10 +1025,6 @@ public class SelectStatement implements CQLStatement
 break;
 }
 throw new InvalidRequestException("Only EQ and IN relation 
are supported on the partition key for random partitioners (unless you use the 
token() function)");
-
-stmt.isKeyRange = true;
-lastRestrictedPartitionKey = i;
-shouldBeDone = true;
 }
 previous = cname;
 }



[jira] [Commented] (CASSANDRA-5669) ITC.close() resets peer msg version, causes connection thrashing in ec2 during upgrade

2013-06-20 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689528#comment-13689528
 ] 

Jonathan Ellis commented on CASSANDRA-5669:
---

+1, just fix your IDE alignment settings on the && clause :)

 ITC.close() resets peer msg version, causes connection thrashing in ec2 
 during upgrade
 --

 Key: CASSANDRA-5669
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5669
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.5
Reporter: Jason Brown
Assignee: Jason Brown
Priority: Minor
  Labels: gossip
 Fix For: 1.2.6, 2.0 beta 1

 Attachments: 5669-v1.diff, 5669-v2.diff


 While debugging the upgrading scenario described in CASSANDRA-5660, I 
 discovered that ITC.close() will reset the message protocol version of a peer 
 node that disconnects. CASSANDRA-5660 has a full description of the upgrade 
 path, but basically the Ec2MultiRegionSnitch will close connections on the 
 publicIP addr to reconnect on the privateIp, and this causes ITC to drop the 
 message protocol version of previously known nodes. I think we want to hang 
 onto that version so that when the newer node (re-)connects to the 
 lower-version node, it passes the correct protocol version rather than the 
 current version (too high for the older node), the connection attempt gets 
 dropped, and the whole dance starts again.
 To clarify, the 'thrashing' is at a rather low volume, from what I observed. 
 Anecdotally, perhaps one connection per second gets turned over.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


git commit: fix Upgrader for 2.0

2013-06-20 Thread jbellis
Updated Branches:
  refs/heads/trunk 1f061f949 -> 24da2bcc5


fix Upgrader for 2.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/24da2bcc
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/24da2bcc
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/24da2bcc

Branch: refs/heads/trunk
Commit: 24da2bcc5d74dfe02ad0148eafc9e22368be34f5
Parents: 1f061f9
Author: Jonathan Ellis jbel...@apache.org
Authored: Thu Jun 20 14:09:47 2013 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Thu Jun 20 14:09:47 2013 -0500

--
 src/java/org/apache/cassandra/db/compaction/Upgrader.java | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/24da2bcc/src/java/org/apache/cassandra/db/compaction/Upgrader.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/Upgrader.java 
b/src/java/org/apache/cassandra/db/compaction/Upgrader.java
index e7211ba..17b68ec 100644
--- a/src/java/org/apache/cassandra/db/compaction/Upgrader.java
+++ b/src/java/org/apache/cassandra/db/compaction/Upgrader.java
@@ -69,14 +69,14 @@ public class Upgrader
 this.controller = new UpgradeController(cfs);
 
 this.strategy = cfs.getCompactionStrategy();
-long estimatedTotalKeys = 
Math.max(DatabaseDescriptor.getIndexInterval(), 
SSTableReader.getApproximateKeyCount(toUpgrade));
+long estimatedTotalKeys = Math.max(cfs.metadata.getIndexInterval(), 
SSTableReader.getApproximateKeyCount(toUpgrade, cfs.metadata));
 long estimatedSSTables = Math.max(1, 
SSTable.getTotalBytes(this.toUpgrade) / strategy.getMaxSSTableSize());
 this.estimatedRows = (long) Math.ceil((double) estimatedTotalKeys / 
estimatedSSTables);
 }
 
 private SSTableWriter createCompactionWriter()
 {
-SSTableMetadata.Collector sstableMetadataCollector = 
SSTableMetadata.createCollector();
+SSTableMetadata.Collector sstableMetadataCollector = 
SSTableMetadata.createCollector(cfs.getComparator());
 
 // Get the max timestamp of the precompacted sstables
 // and adds generation of live ancestors
@@ -130,7 +130,7 @@ public class Upgrader
 // also remove already completed SSTables
 for (SSTableReader sstable : sstables)
 {
-sstable.markCompacted();
+sstable.markObsolete();
 sstable.releaseReference();
 }
 throw Throwables.propagate(t);



[jira] [Commented] (CASSANDRA-5669) Connection thrashing during multi-region ec2 during upgrade, due to messaging version

2013-06-20 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689542#comment-13689542
 ] 

Jason Brown commented on CASSANDRA-5669:


changed name of ticket to better reflect the problem (and the solution)

 Connection thrashing during multi-region ec2 during upgrade, due to messaging 
 version
 -

 Key: CASSANDRA-5669
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5669
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.5
Reporter: Jason Brown
Assignee: Jason Brown
Priority: Minor
  Labels: gossip
 Fix For: 1.2.6, 2.0 beta 1

 Attachments: 5669-v1.diff, 5669-v2.diff


 While debugging the upgrading scenario described in CASSANDRA-5660, I 
 discovered that ITC.close() will reset the message protocol version of a peer 
 node that disconnects. CASSANDRA-5660 has a full description of the upgrade 
 path, but basically the Ec2MultiRegionSnitch will close connections on the 
 publicIP addr to reconnect on the privateIp, and this causes ITC to drop the 
 message protocol version of previously known nodes. I think we want to hang 
 onto that version so that when the newer node (re-)connects to the 
 lower-version node, it passes the correct protocol version rather than the 
 current version (too high for the older node), the connection attempt gets 
 dropped, and the whole dance starts again.
 To clarify, the 'thrashing' is at a rather low volume, from what I observed. 
 Anecdotally, perhaps one connection per second gets turned over.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5669) Connection thrashing during multi-region ec2 during upgrade, due to messaging version

2013-06-20 Thread Jason Brown (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-5669:
---

Labels: ec2 ec2multiregionsnitch gossip  (was: gossip)

 Connection thrashing during multi-region ec2 during upgrade, due to messaging 
 version
 -

 Key: CASSANDRA-5669
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5669
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.5
Reporter: Jason Brown
Assignee: Jason Brown
Priority: Minor
  Labels: ec2, ec2multiregionsnitch, gossip
 Fix For: 1.2.6, 2.0 beta 1

 Attachments: 5669-v1.diff, 5669-v2.diff


 While debugging the upgrading scenario described in CASSANDRA-5660, I 
 discovered that ITC.close() will reset the message protocol version of a peer 
 node that disconnects. CASSANDRA-5660 has a full description of the upgrade 
 path, but basically the Ec2MultiRegionSnitch will close connections on the 
 publicIP addr to reconnect on the privateIp, and this causes ITC to drop the 
 message protocol version of previously known nodes. I think we want to hang 
 onto that version so that when the newer node (re-)connects to the 
 lower-version node, it passes the correct protocol version rather than the 
 current version (too high for the older node), the connection attempt gets 
 dropped, and the whole dance starts again.
 To clarify, the 'thrashing' is at a rather low volume, from what I observed. 
 Anecdotally, perhaps one connection per second gets turned over.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5669) Connection thrashing in multi-region ec2 during upgrade, due to messaging version

2013-06-20 Thread Jason Brown (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-5669:
---

Summary: Connection thrashing in multi-region ec2 during upgrade, due to 
messaging version  (was: Connection thrashing during multi-region ec2 during 
upgrade, due to messaging version)

 Connection thrashing in multi-region ec2 during upgrade, due to messaging 
 version
 -

 Key: CASSANDRA-5669
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5669
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.5
Reporter: Jason Brown
Assignee: Jason Brown
Priority: Minor
  Labels: ec2, ec2multiregionsnitch, gossip
 Fix For: 1.2.6, 2.0 beta 1

 Attachments: 5669-v1.diff, 5669-v2.diff


 While debugging the upgrading scenario described in CASSANDRA-5660, I 
 discovered that ITC.close() will reset the message protocol version of a peer 
 node that disconnects. CASSANDRA-5660 has a full description of the upgrade 
 path, but basically the Ec2MultiRegionSnitch will close connections on the 
 publicIP addr to reconnect on the privateIp, and this causes ITC to drop the 
 message protocol version of previously known nodes. I think we want to hang 
 onto that version so that when the newer node (re-)connects to the older node, 
 it passes the correct protocol version rather than the current version (too 
 high for the older node), instead of the connection attempt getting dropped 
 and going through the dance again.
 To clarify, the 'thrashing' is at a rather low volume, from what I observed. 
 Anecdotally, perhaps one connection per second gets turned over.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5669) Connection thrashing during multi-region ec2 during upgrade, due to messaging version

2013-06-20 Thread Jason Brown (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-5669:
---

Summary: Connection thrashing during multi-region ec2 during upgrade, due 
to messaging version  (was: ITC.close() resets peer msg version, causes 
connection thrashing in ec2 during upgrade)

 Connection thrashing during multi-region ec2 during upgrade, due to messaging 
 version
 -

 Key: CASSANDRA-5669
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5669
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.5
Reporter: Jason Brown
Assignee: Jason Brown
Priority: Minor
  Labels: gossip
 Fix For: 1.2.6, 2.0 beta 1

 Attachments: 5669-v1.diff, 5669-v2.diff


 While debugging the upgrading scenario described in CASSANDRA-5660, I 
 discovered that ITC.close() will reset the message protocol version of a peer 
 node that disconnects. CASSANDRA-5660 has a full description of the upgrade 
 path, but basically the Ec2MultiRegionSnitch will close connections on the 
 publicIP addr to reconnect on the privateIp, and this causes ITC to drop the 
 message protocol version of previously known nodes. I think we want to hang 
 onto that version so that when the newer node (re-)connects to the older node, 
 it passes the correct protocol version rather than the current version (too 
 high for the older node), instead of the connection attempt getting dropped 
 and going through the dance again.
 To clarify, the 'thrashing' is at a rather low volume, from what I observed. 
 Anecdotally, perhaps one connection per second gets turned over.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


git commit: StandaloneUpgrader fix

2013-06-20 Thread jbellis
Updated Branches:
  refs/heads/trunk 24da2bcc5 - 56a47b394


StandaloneUpgrader fix


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/56a47b39
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/56a47b39
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/56a47b39

Branch: refs/heads/trunk
Commit: 56a47b3948ef326070663aea465e23eb3fdf8eda
Parents: 24da2bc
Author: Jonathan Ellis jbel...@apache.org
Authored: Thu Jun 20 14:20:04 2013 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Thu Jun 20 14:20:04 2013 -0500

--
 src/java/org/apache/cassandra/tools/StandaloneUpgrader.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/56a47b39/src/java/org/apache/cassandra/tools/StandaloneUpgrader.java
--
diff --git a/src/java/org/apache/cassandra/tools/StandaloneUpgrader.java 
b/src/java/org/apache/cassandra/tools/StandaloneUpgrader.java
index 357e99c..9329b0f 100644
--- a/src/java/org/apache/cassandra/tools/StandaloneUpgrader.java
+++ b/src/java/org/apache/cassandra/tools/StandaloneUpgrader.java
@@ -107,7 +107,7 @@ public class StandaloneUpgrader
 Upgrader upgrader = new Upgrader(cfs, sstable, handler);
 upgrader.upgrade();
 
-sstable.markCompacted();
+sstable.markObsolete();
 sstable.releaseReference();
 }
 catch (Exception e)



[1/2] git commit: changes.txt

2013-06-20 Thread jasobrown
Updated Branches:
  refs/heads/cassandra-1.2 8d17ccb7b - b4dca4437


changes.txt


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b4dca443
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b4dca443
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b4dca443

Branch: refs/heads/cassandra-1.2
Commit: b4dca44375b023ad12ac812572c96bf75b7935db
Parents: 72b1a1b
Author: Jason Brown jasedbr...@gmail.com
Authored: Thu Jun 20 12:15:00 2013 -0700
Committer: Jason Brown jasedbr...@gmail.com
Committed: Thu Jun 20 12:15:26 2013 -0700

--
 CHANGES.txt | 2 ++
 1 file changed, 2 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b4dca443/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index bd52eab..6d9c910 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -30,6 +30,8 @@
  * Update CREATE CUSTOM INDEX syntax (CASSANDRA-5639)
  * Fix PermissionDetails.equals() method (CASSANDRA-5655)
  * Never allow partition key ranges in CQL3 without token() (CASSANDRA-5666)
+ * Gossiper incorrectly drops AppState for an upgrading node (CASSANDRA-5660)
+ * Connection thrashing during multi-region ec2 during upgrade, due to 
messaging version (CASSANDRA-5669)
 Merged from 1.1:
  * Remove buggy thrift max message length option (CASSANDRA-5529)
  * Fix NPE in Pig's widerow mode (CASSANDRA-5488)



[1/3] git commit: changes.txt

2013-06-20 Thread jasobrown
Updated Branches:
  refs/heads/trunk 56a47b394 - 7bb6f012b


changes.txt


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b4dca443
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b4dca443
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b4dca443

Branch: refs/heads/trunk
Commit: b4dca44375b023ad12ac812572c96bf75b7935db
Parents: 72b1a1b
Author: Jason Brown jasedbr...@gmail.com
Authored: Thu Jun 20 12:15:00 2013 -0700
Committer: Jason Brown jasedbr...@gmail.com
Committed: Thu Jun 20 12:15:26 2013 -0700

--
 CHANGES.txt | 2 ++
 1 file changed, 2 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b4dca443/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index bd52eab..6d9c910 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -30,6 +30,8 @@
  * Update CREATE CUSTOM INDEX syntax (CASSANDRA-5639)
  * Fix PermissionDetails.equals() method (CASSANDRA-5655)
  * Never allow partition key ranges in CQL3 without token() (CASSANDRA-5666)
+ * Gossiper incorrectly drops AppState for an upgrading node (CASSANDRA-5660)
+ * Connection thrashing during multi-region ec2 during upgrade, due to 
messaging version (CASSANDRA-5669)
 Merged from 1.1:
  * Remove buggy thrift max message length option (CASSANDRA-5529)
  * Fix NPE in Pig's widerow mode (CASSANDRA-5488)



[2/3] git commit: ITC.close() resets peer msg version, causes connection thrashing in ec2 during upgrade. Second pass, where we have Ec2MRS check that the peer node is on the same MS.current_version b

2013-06-20 Thread jasobrown
ITC.close() resets peer msg version, causes connection thrashing in ec2 during 
upgrade.
Second pass, where we have Ec2MRS check that the peer node is on the same 
MS.current_version before
closing connection on publicIP and reconnecting on privateIP


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/72b1a1b4
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/72b1a1b4
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/72b1a1b4

Branch: refs/heads/trunk
Commit: 72b1a1b4989212267dba9a8d389af21d24423533
Parents: 8d17ccb
Author: Jason Brown jasedbr...@gmail.com
Authored: Thu Jun 20 11:19:44 2013 -0700
Committer: Jason Brown jasedbr...@gmail.com
Committed: Thu Jun 20 12:15:26 2013 -0700

--
 .../cassandra/locator/Ec2MultiRegionSnitch.java | 22 +++-
 1 file changed, 12 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/72b1a1b4/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
--
diff --git a/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java 
b/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
index e29637f..12ebfbb 100644
--- a/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
+++ b/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
@@ -97,17 +97,19 @@ public class Ec2MultiRegionSnitch extends Ec2Snitch 
implements IEndpointStateCha
 
     private void reConnect(InetAddress endpoint, VersionedValue versionedValue)
     {
-        if (!getDatacenter(endpoint).equals(getDatacenter(public_ip)))
-            return; // do nothing return back...
-
-        try
-        {
-            InetAddress remoteIP = InetAddress.getByName(versionedValue.value);
-            MessagingService.instance().getConnectionPool(endpoint).reset(remoteIP);
-            logger.debug(String.format("Intiated reconnect to an Internal IP %s for the %s", remoteIP, endpoint));
-        } catch (UnknownHostException e)
+        if (getDatacenter(endpoint).equals(getDatacenter(public_ip))
+            && MessagingService.instance().getVersion(endpoint) == MessagingService.current_version)
         {
-            logger.error("Error in getting the IP address resolved: ", e);
+            try
+            {
+                InetAddress remoteIP = InetAddress.getByName(versionedValue.value);
+                MessagingService.instance().getConnectionPool(endpoint).reset(remoteIP);
+                logger.debug(String.format("Intiated reconnect to an Internal IP %s for the %s", remoteIP, endpoint));
+            }
+            catch (UnknownHostException e)
+            {
+                logger.error("Error in getting the IP address resolved: ", e);
+            }
         }
     }
 



[3/3] git commit: Merge branch 'cassandra-1.2' into trunk

2013-06-20 Thread jasobrown
Merge branch 'cassandra-1.2' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7bb6f012
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7bb6f012
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7bb6f012

Branch: refs/heads/trunk
Commit: 7bb6f012b48fa6c38f78467306893aa52dfd9b4a
Parents: 56a47b3 b4dca44
Author: Jason Brown jasedbr...@gmail.com
Authored: Thu Jun 20 12:21:40 2013 -0700
Committer: Jason Brown jasedbr...@gmail.com
Committed: Thu Jun 20 12:21:40 2013 -0700

--
 CHANGES.txt |  2 ++
 .../cassandra/locator/Ec2MultiRegionSnitch.java | 22 +++-
 2 files changed, 14 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/7bb6f012/CHANGES.txt
--



[jira] [Commented] (CASSANDRA-5669) Connection thrashing in multi-region ec2 during upgrade, due to messaging version

2013-06-20 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689546#comment-13689546
 ] 

Jason Brown commented on CASSANDRA-5669:


committed to 1.2 and trunk, with indentation alignment change. thanks!

 Connection thrashing in multi-region ec2 during upgrade, due to messaging 
 version
 -

 Key: CASSANDRA-5669
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5669
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.5
Reporter: Jason Brown
Assignee: Jason Brown
Priority: Minor
  Labels: ec2, ec2multiregionsnitch, gossip
 Fix For: 1.2.6, 2.0 beta 1

 Attachments: 5669-v1.diff, 5669-v2.diff


 While debugging the upgrading scenario described in CASSANDRA-5660, I 
 discovered that ITC.close() will reset the message protocol version of a peer 
 node that disconnects. CASSANDRA-5660 has a full description of the upgrade 
 path, but basically the Ec2MultiRegionSnitch will close connections on the 
 publicIP addr to reconnect on the privateIp, and this causes ITC to drop the 
 message protocol version of previously known nodes. I think we want to hang 
 onto that version so that when the newer node (re-)connects to the older node, 
 it passes the correct protocol version rather than the current version (too 
 high for the older node), instead of the connection attempt getting dropped 
 and going through the dance again.
 To clarify, the 'thrashing' is at a rather low volume, from what I observed. 
 Anecdotally, perhaps one connection per second gets turned over.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[2/2] git commit: ITC.close() resets peer msg version, causes connection thrashing in ec2 during upgrade. Second pass, where we have Ec2MRS check that the peer node is on the same MS.current_version b

2013-06-20 Thread jasobrown
ITC.close() resets peer msg version, causes connection thrashing in ec2 during 
upgrade.
Second pass, where we have Ec2MRS check that the peer node is on the same 
MS.current_version before
closing connection on publicIP and reconnecting on privateIP


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/72b1a1b4
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/72b1a1b4
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/72b1a1b4

Branch: refs/heads/cassandra-1.2
Commit: 72b1a1b4989212267dba9a8d389af21d24423533
Parents: 8d17ccb
Author: Jason Brown jasedbr...@gmail.com
Authored: Thu Jun 20 11:19:44 2013 -0700
Committer: Jason Brown jasedbr...@gmail.com
Committed: Thu Jun 20 12:15:26 2013 -0700

--
 .../cassandra/locator/Ec2MultiRegionSnitch.java | 22 +++-
 1 file changed, 12 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/72b1a1b4/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
--
diff --git a/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java 
b/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
index e29637f..12ebfbb 100644
--- a/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
+++ b/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
@@ -97,17 +97,19 @@ public class Ec2MultiRegionSnitch extends Ec2Snitch 
implements IEndpointStateCha
 
     private void reConnect(InetAddress endpoint, VersionedValue versionedValue)
     {
-        if (!getDatacenter(endpoint).equals(getDatacenter(public_ip)))
-            return; // do nothing return back...
-
-        try
-        {
-            InetAddress remoteIP = InetAddress.getByName(versionedValue.value);
-            MessagingService.instance().getConnectionPool(endpoint).reset(remoteIP);
-            logger.debug(String.format("Intiated reconnect to an Internal IP %s for the %s", remoteIP, endpoint));
-        } catch (UnknownHostException e)
+        if (getDatacenter(endpoint).equals(getDatacenter(public_ip))
+            && MessagingService.instance().getVersion(endpoint) == MessagingService.current_version)
         {
-            logger.error("Error in getting the IP address resolved: ", e);
+            try
+            {
+                InetAddress remoteIP = InetAddress.getByName(versionedValue.value);
+                MessagingService.instance().getConnectionPool(endpoint).reset(remoteIP);
+                logger.debug(String.format("Intiated reconnect to an Internal IP %s for the %s", remoteIP, endpoint));
+            }
+            catch (UnknownHostException e)
+            {
+                logger.error("Error in getting the IP address resolved: ", e);
+            }
         }
     }
 



[jira] [Commented] (CASSANDRA-5619) CAS UPDATE for a lost race: save round trip by returning column values

2013-06-20 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689559#comment-13689559
 ] 

Jonathan Ellis commented on CASSANDRA-5619:
---

LGTM, ship it!

 CAS UPDATE for a lost race: save round trip by returning column values
 --

 Key: CASSANDRA-5619
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5619
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 2.0 beta 1
Reporter: Blair Zajac
Assignee: Sylvain Lebresne
 Fix For: 2.0 beta 1

 Attachments: 5619.txt


 Looking at the new CAS CQL3 support examples [1]: if one loses a race for an 
 UPDATE, could the columns that were used in the IF clause also be returned to 
 the caller, to save the round trip needed to fetch the current values and 
 decide whether the work still needs to be done? Maybe the column values from 
 the SET part could also be returned.
 I don't know if this is generally useful though.
 In the case of creating a new user account with a given username as the 
 partition key, if one loses the race to another person creating an account 
 with the same username, it doesn't matter to the loser what the column values 
 are, just that they lost.
 I'm new to Cassandra, so maybe there are other use cases, such as doing an 
 incremental amount of work on a row. In pure Java projects I've written while 
 loops around AtomicReference.compareAndSet() until the work was done on the 
 referenced object, letting multiple threads each make forward progress in 
 updating the referenced object.
 [1] https://github.com/riptano/cassandra-dtest/blob/master/cql_tests.py#L3044
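
The compare-and-set retry loop mentioned above, in plain java.util.concurrent terms (a minimal sketch; the Counter type and the increment logic are invented for illustration, this is not Cassandra code):

{code}
import java.util.concurrent.atomic.AtomicReference;

public class CasRetryExample
{
    // Immutable value object; every update swaps in a whole new reference.
    static final class Counter
    {
        final long value;
        Counter(long value) { this.value = value; }
    }

    private static final AtomicReference<Counter> REF = new AtomicReference<Counter>(new Counter(0));

    // Classic CAS loop: re-read, recompute, retry until the swap wins.
    static long increment()
    {
        while (true)
        {
            Counter current = REF.get();
            Counter next = new Counter(current.value + 1);
            if (REF.compareAndSet(current, next))
                return next.value;
            // Lost the race: another thread won the swap first, so re-read and retry.
        }
    }

    public static void main(String[] args)
    {
        System.out.println(increment()); // prints 1
    }
}
{code}

The ticket asks for the analogous convenience at the CQL level: when the CAS loses, hand back the current values so the caller can decide whether to retry without an extra read.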

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (CASSANDRA-5678) Avoid over reconnecting in EC2MRS

2013-06-20 Thread Jason Brown (JIRA)
Jason Brown created CASSANDRA-5678:
--

 Summary: Avoid over reconnecting in EC2MRS
 Key: CASSANDRA-5678
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5678
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.2.5
Reporter: Jason Brown
Assignee: Jason Brown
Priority: Minor
 Fix For: 1.2.6, 2.0 beta 1
 Attachments: 5678-v1.diff

EC2MRS can reset the localIP connection to peers aggressively when calls to 
its IEndpointStateChangeSubscriber impls get invoked. We shouldn't need to 
reset (switch to the localIP) if we're already using the localIP in the OTCP 
(as that's all the reset will do).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5678) Avoid over reconnecting in EC2MRS

2013-06-20 Thread Jason Brown (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-5678:
---

Attachment: 5678-v1.diff

Attached patch modifies EC2MRS.reConnect to add an additional check if the OTCP 
is already using the localIP address. If so, don't bother to reset the 
connection.
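
Boiled down to a self-contained toy (not the actual snitch code; the addresses below are hypothetical), the added check is essentially:

{code}
import java.net.InetAddress;
import java.net.UnknownHostException;

// Toy illustration of the extra guard: skip the reset entirely when the connection
// pool is already pointed at the peer's private (region-local) IP.
public class ReconnectGuardExample
{
    static boolean shouldReset(InetAddress currentPoolEndpoint, InetAddress privateIp)
    {
        // Resetting to the address we are already using would only churn the connection.
        return !currentPoolEndpoint.equals(privateIp);
    }

    public static void main(String[] args) throws UnknownHostException
    {
        InetAddress privateIp = InetAddress.getByName("10.0.0.12");     // hypothetical addresses
        InetAddress alreadyPrivate = InetAddress.getByName("10.0.0.12");
        InetAddress stillPublic = InetAddress.getByName("203.0.113.7");

        System.out.println(shouldReset(alreadyPrivate, privateIp)); // false: nothing to do
        System.out.println(shouldReset(stillPublic, privateIp));    // true: switch to the private IP
    }
}
{code}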

 Avoid over reconnecting in EC2MRS
 -

 Key: CASSANDRA-5678
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5678
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.2.5
Reporter: Jason Brown
Assignee: Jason Brown
Priority: Minor
  Labels: ec2, ec2multiregionsnitch
 Fix For: 1.2.6, 2.0 beta 1

 Attachments: 5678-v1.diff


 EC2MRS can reset the localIP connection to peers aggressively when calls to 
 its IEndpointStateChangeSubscriber impls get invoked. We shouldn't need to 
 reset (switch to the localIP) if we're already using the localIP in the OTCP 
 (as that's all the reset will do).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Comment Edited] (CASSANDRA-5678) Avoid over reconnecting in EC2MRS

2013-06-20 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689565#comment-13689565
 ] 

Jason Brown edited comment on CASSANDRA-5678 at 6/20/13 7:53 PM:
-

Attached patch modifies EC2MRS.reConnect to add an additional check if the OTCP 
is already using the localIP address. If so, don't bother to reset the 
connection.

  was (Author: jasobrown):
Attached patch modifies EC2MRS.reConnect to add an additional check if the 
OTCP is already using the localIP address. If so, don;t bother to reset the 
connection.
  
 Avoid over reconnecting in EC2MRS
 -

 Key: CASSANDRA-5678
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5678
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.2.5
Reporter: Jason Brown
Assignee: Jason Brown
Priority: Minor
  Labels: ec2, ec2multiregionsnitch
 Fix For: 1.2.6, 2.0 beta 1

 Attachments: 5678-v1.diff


 EC2MRS can reset the localIP connection to peers aggressively when calls to 
 its IEndpointStateChangeSubscriber impls get invoked. We shouldn't need to 
 reset (switch to the localIP) if we're already using the localIP in the OTCP 
 (as that's all the reset will do).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5678) Avoid over reconnecting in EC2MRS

2013-06-20 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689574#comment-13689574
 ] 

Jonathan Ellis commented on CASSANDRA-5678:
---

+1

 Avoid over reconnecting in EC2MRS
 -

 Key: CASSANDRA-5678
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5678
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.2.5
Reporter: Jason Brown
Assignee: Jason Brown
Priority: Minor
  Labels: ec2, ec2multiregionsnitch
 Fix For: 1.2.6, 2.0 beta 1

 Attachments: 5678-v1.diff


 EC2MRS can reset the localIP connection to peers aggressively when calls to 
 its IEndpointStateChangeSubscriber impls get invoked. We shouldn't need to 
 reset (switch to the localIP) if we're already using the localIP in the OTCP 
 (as that's all the reset will do).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5510) Following sequence of operations delete, add, search by secondary index of operations doesnot return correct results all the time.

2013-06-20 Thread Ryan McGuire (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McGuire updated CASSANDRA-5510:


Affects Version/s: 1.2.5

 Following sequence of operations delete, add, search by secondary index of 
 operations doesnot return correct results all the time.
 --

 Key: CASSANDRA-5510
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5510
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.2, 1.2.5
 Environment: Test
Reporter: Rao
Assignee: Ryan McGuire
 Attachments: cassandra-analysis.zip


 The following sequence of operations (delete, add, search by secondary index) 
 does not return correct results all the time.
 Performance tests were performed on the following sequence of operations: each 
 thread deletes a set of rows, adds a set of rows, and then searches for a set 
 of rows by secondary index. On search, some of the rows were sometimes not 
 returned.
 configuration:
 replication_factor: 2 per dc
 nodes: 2 per dc
 consistency_level: local_quorum

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5510) Following sequence of operations delete, add, search by secondary index of operations doesnot return correct results all the time.

2013-06-20 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689618#comment-13689618
 ] 

Ryan McGuire commented on CASSANDRA-5510:
-

[~bv5301], I'm trying to setup your java package in a local testing environment 
using [ccm|https://github.com/pcmanus/ccm].

I do this to create a 2x2 multi-dc setup:

{code}
ccm create -v 1.2.5 5510-test
ccm populate -n 2:2
ccm start
{code}

Then I load the DDL statements found in your README and then try to get your 
java package to connect to that local cluster. I set the following at the 
bottom of bootstrap.properties:
{code}
##
# Cassandra configuration
##
GRM_CASSANDRA_CLUSTER_NAME=SCLD_CASS_INFRATEST
GRM_CASSANDRA_SEED_HOSTS=node1:7100,node3:7300
GRM_CASSANDRA_RPC_PORT=9160
GRM_LOCAL_DATACENTER=dc1
CASSANDRA_HOST_NAME=node1
{code}

However, I can't get it to run; I get this error when starting up:
{code}
java.lang.Exception: internal_error.cassandra.connect.detail
at 
com.att.scld.cassandraDefect.util.CassandraConnectUtil.getHostToPin(CassandraConnectUtil.java:221)
at 
com.att.scld.cassandraDefect.dao.RouteDAOImpl.deleteRoutes(RouteDAOImpl.java:165)
at com.att.scld.cassandraDefect.Launcher.callCassandra(Launcher.java:66)
at com.att.scld.cassandraDefect.Launcher.access$1(Launcher.java:62)
at com.att.scld.cassandraDefect.Launcher$1.call(Launcher.java:37)
at com.att.scld.cassandraDefect.Launcher$1.call(Launcher.java:1)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
{code}

I believe this is telling me it can't connect to the cluster at all. This leads 
me to believe there is some setting in bootstrap.properties that I have wrong. 
Can you see if you can get your test to work against a local ccm cluster and 
also provide me with a new bootstrap.properties?

As an alternative to running your test, I created my own test. See 
delete_insert_test.py. I've tried to copy the general gist of what you're doing 
- creating some data, deleting part of it, reinserting it, and querying on a 
secondary index. So far I'm not able to reproduce the error using my test. To 
use my test, it needs to be copied to a [dtest 
environment|https://github.com/riptano/cassandra-dtest/blob/master/INSTALL.md] 
and run like 'nosetests delete_insert_test.py'
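
For reference, the delete/insert/secondary-index sequence can be driven against such a ccm cluster over the raw Thrift interface along these lines (a sketch only: the keyspace, table and column names are placeholders, since the real DDL lives in the attached package's README):

{code}
import java.nio.ByteBuffer;
import java.nio.charset.Charset;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.Compression;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class DeleteInsertQueryExample
{
    private static ByteBuffer cql(String query)
    {
        return ByteBuffer.wrap(query.getBytes(Charset.forName("UTF-8")));
    }

    public static void main(String[] args) throws Exception
    {
        // node1 of the ccm cluster listens for Thrift on 127.0.0.1:9160 by default
        TTransport transport = new TFramedTransport(new TSocket("127.0.0.1", 9160));
        transport.open();
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));

        // Placeholder schema: ks.routes (id text PRIMARY KEY, region text) with an index on region
        client.execute_cql3_query(cql("DELETE FROM ks.routes WHERE id = 'r1'"),
                                  Compression.NONE, ConsistencyLevel.LOCAL_QUORUM);
        client.execute_cql3_query(cql("INSERT INTO ks.routes (id, region) VALUES ('r1', 'east')"),
                                  Compression.NONE, ConsistencyLevel.LOCAL_QUORUM);
        // The reported symptom: this secondary-index query sometimes misses freshly reinserted rows.
        client.execute_cql3_query(cql("SELECT * FROM ks.routes WHERE region = 'east'"),
                                  Compression.NONE, ConsistencyLevel.LOCAL_QUORUM);

        transport.close();
    }
}
{code}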

 Following sequence of operations delete, add, search by secondary index of 
 operations doesnot return correct results all the time.
 --

 Key: CASSANDRA-5510
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5510
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.2, 1.2.5
 Environment: Test
Reporter: Rao
Assignee: Ryan McGuire
 Attachments: cassandra-analysis.zip


 The following sequence of operations (delete, add, search by secondary index) 
 does not return correct results all the time.
 Performance tests were performed on the following sequence of operations: each 
 thread deletes a set of rows, adds a set of rows, and then searches for a set 
 of rows by secondary index. On search, some of the rows were sometimes not 
 returned.
 configuration:
 replication_factor: 2 per dc
 nodes: 2 per dc
 consistency_level: local_quorum

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5672) Add enumerated column to system_trace.events table to signify the type of event.

2013-06-20 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-5672:
--

Affects Version/s: (was: 2.0 beta 1)
   1.2.0
Fix Version/s: 2.1

 Add enumerated column to system_trace.events table to signify the type of 
 event.
 --

 Key: CASSANDRA-5672
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5672
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Ryan McGuire
Priority: Minor
 Fix For: 2.1


 Tracing of queries is useful for at least two different purposes:
 * Interactively diagnosing a problem, via cqlsh, by a human.
 * Programmatically recording and responding to how queries behave.
 This second purpose is not well suited to how the system_trace.events table 
 is currently organized, as the only identifying characteristic of each event 
 is a free-form string that can be (and has been) changed in later versions.
 [~jbellis] [mentioned the 
 possibility|http://www.datastax.com/dev/blog/advanced-request-tracing-in-cassandra-1-2]
 of adding an enumeration of event types that would be immutable.
 Reference [this 
 dtest|https://github.com/riptano/cassandra-dtest/pull/13/files], which parses 
 the strings in this table via regex. If these strings change, the test will 
 break.
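
As a purely hypothetical illustration of that enumeration (no such type exists in the codebase; the constants below are invented), the stable identifier could look like:

{code}
// Hypothetical sketch only: a stable event-type identifier that tools could match on
// instead of parsing the free-form activity strings in system_traces.events.
public enum TraceEventType
{
    SESSION_START,
    QUERY_PARSE,
    STATEMENT_PREPARE,
    REPLICA_DETERMINATION,
    MESSAGE_SENT,
    MESSAGE_RECEIVED,
    ROW_READ,
    ROW_WRITE,
    SESSION_END;

    // The enum name would be the value stored in a new, immutable column on the events table.
    public String asColumnValue()
    {
        return name();
    }
}
{code}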

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5510) Following sequence of operations delete, add, search by secondary index of operations doesnot return correct results all the time.

2013-06-20 Thread Ryan McGuire (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McGuire updated CASSANDRA-5510:


Attachment: delete_insert_test.py

 Following sequence of operations delete, add, search by secondary index of 
 operations doesnot return correct results all the time.
 --

 Key: CASSANDRA-5510
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5510
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.2, 1.2.5
 Environment: Test
Reporter: Rao
Assignee: Ryan McGuire
 Attachments: cassandra-analysis.zip, delete_insert_test.py


 The following sequence of operations (delete, add, search by secondary index) 
 does not return correct results all the time.
 Performance tests were performed on the following sequence of operations: each 
 thread deletes a set of rows, adds a set of rows, and then searches for a set 
 of rows by secondary index. On search, some of the rows were sometimes not 
 returned.
 configuration:
 replication_factor: 2 per dc
 nodes: 2 per dc
 consistency_level: local_quorum

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5510) Following sequence of operations delete, add, search by secondary index of operations doesnot return correct results all the time.

2013-06-20 Thread Ryan McGuire (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McGuire updated CASSANDRA-5510:


Attachment: (was: delete_insert_test.py)

 Following sequence of operations delete, add, search by secondary index of 
 operations doesnot return correct results all the time.
 --

 Key: CASSANDRA-5510
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5510
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.2, 1.2.5
 Environment: Test
Reporter: Rao
Assignee: Ryan McGuire
 Attachments: cassandra-analysis.zip, delete_insert_test.py


 The following sequence of operations (delete, add, search by secondary index) 
 does not return correct results all the time.
 Performance tests were performed on the following sequence of operations: each 
 thread deletes a set of rows, adds a set of rows, and then searches for a set 
 of rows by secondary index. On search, some of the rows were sometimes not 
 returned.
 configuration:
 replication_factor: 2 per dc
 nodes: 2 per dc
 consistency_level: local_quorum

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5510) Following sequence of operations delete, add, search by secondary index of operations doesnot return correct results all the time.

2013-06-20 Thread Ryan McGuire (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McGuire updated CASSANDRA-5510:


Attachment: delete_insert_test.py

 Following sequence of operations delete, add, search by secondary index of 
 operations doesnot return correct results all the time.
 --

 Key: CASSANDRA-5510
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5510
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.2, 1.2.5
 Environment: Test
Reporter: Rao
Assignee: Ryan McGuire
 Attachments: cassandra-analysis.zip, delete_insert_test.py


 The following sequence of operations (delete, add, search by secondary index) 
 does not return correct results all the time.
 Performance tests were performed on the following sequence of operations: each 
 thread deletes a set of rows, adds a set of rows, and then searches for a set 
 of rows by secondary index. On search, some of the rows were sometimes not 
 returned.
 configuration:
 replication_factor: 2 per dc
 nodes: 2 per dc
 consistency_level: local_quorum

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (CASSANDRA-5679) Wide Row calls map method once per column in Hadoop MapReduce

2013-06-20 Thread Tony Zhao (JIRA)
Tony Zhao created CASSANDRA-5679:


 Summary: Wide Row calls map method once per column in Hadoop 
MapReduce
 Key: CASSANDRA-5679
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5679
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.4
 Environment: CentOS 6.3, Hadoop 1.1.2, Cassandra 1.2.4
Reporter: Tony Zhao


When using Cassandra without wide row support in a Hadoop job, each call to the 
mapper's map method receives a batch of columns limited by the SlicePredicate; 
but when using wide row support, the map method is called once for every 
column. It seems like the limit in the SlicePredicate is ignored when wide row 
support is set to true.

This prevents in-mapper combining code from working (e.g. emitting a top ten 
from a mapper).
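
For context, the in-mapper reducing referred to here is the usual Hadoop pattern of aggregating across map() calls and emitting in cleanup(); a generic sketch of that pattern (types, names and the line-oriented input are illustrative, not the Cassandra Hadoop classes) looks like:

{code}
import java.io.IOException;
import java.util.Map;
import java.util.TreeMap;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical in-mapper top-N combiner: accumulate across map() calls and emit only
// the ten largest values in cleanup(), instead of emitting one record per call.
public class TopTenMapper extends Mapper<LongWritable, Text, Text, LongWritable>
{
    // score -> name, trimmed to the ten highest scores seen by this mapper
    private final TreeMap<Long, String> topTen = new TreeMap<Long, String>();

    @Override
    protected void map(LongWritable key, Text value, Context context)
    {
        // Assume each input line is "<name> <score>"; parsing details are illustrative.
        String[] parts = value.toString().split("\\s+");
        topTen.put(Long.parseLong(parts[1]), parts[0]);
        if (topTen.size() > 10)
            topTen.remove(topTen.firstKey()); // drop the currently smallest score
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException
    {
        // Emit once per mapper, after all map() calls have been made.
        for (Map.Entry<Long, String> entry : topTen.entrySet())
            context.write(new Text(entry.getValue()), new LongWritable(entry.getKey()));
    }
}
{code}

That pattern relies on each map() call seeing a bounded slice of a row; if wide-row mode hands the mapper a single column per call, the batching assumption described above no longer holds.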

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


git commit: Avoid over-reconnecting in EC2MRS patch by jasobrown; reviewed by jbellis for CASSANDRA-5678

2013-06-20 Thread jasobrown
Updated Branches:
  refs/heads/cassandra-1.2 b4dca4437 - 998fe9676


Avoid over-reconnecting in EC2MRS
patch by jasobrown; reviewed by jbellis for CASSANDRA-5678


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/998fe967
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/998fe967
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/998fe967

Branch: refs/heads/cassandra-1.2
Commit: 998fe96766a8c826ab0483e657885eb10a9293ae
Parents: b4dca44
Author: Jason Brown jasedbr...@gmail.com
Authored: Thu Jun 20 12:43:46 2013 -0700
Committer: Jason Brown jasedbr...@gmail.com
Committed: Thu Jun 20 13:40:49 2013 -0700

--
 CHANGES.txt |  1 +
 .../cassandra/locator/Ec2MultiRegionSnitch.java | 21 ++--
 .../net/OutboundTcpConnectionPool.java  |  2 +-
 3 files changed, 13 insertions(+), 11 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/998fe967/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 6d9c910..3847d6a 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -32,6 +32,7 @@
  * Never allow partition key ranges in CQL3 without token() (CASSANDRA-5666)
  * Gossiper incorrectly drops AppState for an upgrading node (CASSANDRA-5660)
  * Connection thrashing during multi-region ec2 during upgrade, due to 
messaging version (CASSANDRA-5669)
+ * Avoid over reconnecting in EC2MRS (CASSANDRA-5678)
 Merged from 1.1:
  * Remove buggy thrift max message length option (CASSANDRA-5529)
  * Fix NPE in Pig's widerow mode (CASSANDRA-5488)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/998fe967/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
--
diff --git a/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java 
b/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
index 12ebfbb..ea41bc0 100644
--- a/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
+++ b/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
@@ -97,20 +97,21 @@ public class Ec2MultiRegionSnitch extends Ec2Snitch 
implements IEndpointStateCha
 
     private void reConnect(InetAddress endpoint, VersionedValue versionedValue)
     {
-        if (getDatacenter(endpoint).equals(getDatacenter(public_ip))
-            && MessagingService.instance().getVersion(endpoint) == MessagingService.current_version)
+        try
         {
-            try
+            InetAddress localEc2IP = InetAddress.getByName(versionedValue.value);
+            if (getDatacenter(endpoint).equals(getDatacenter(public_ip))
+                && MessagingService.instance().getVersion(endpoint) == MessagingService.current_version
+                && !MessagingService.instance().getConnectionPool(endpoint).endPoint().equals(localEc2IP))
             {
-                InetAddress remoteIP = InetAddress.getByName(versionedValue.value);
-                MessagingService.instance().getConnectionPool(endpoint).reset(remoteIP);
-                logger.debug(String.format("Intiated reconnect to an Internal IP %s for the %s", remoteIP, endpoint));
-            }
-            catch (UnknownHostException e)
-            {
-                logger.error("Error in getting the IP address resolved: ", e);
+                MessagingService.instance().getConnectionPool(endpoint).reset(localEc2IP);
+                logger.debug(String.format("Intiated reconnect to an Internal IP %s for the %s", localEc2IP, endpoint));
             }
         }
+        catch (UnknownHostException e)
+        {
+            logger.error("Error in getting the IP address resolved: ", e);
+        }
     }
 
 @Override

http://git-wip-us.apache.org/repos/asf/cassandra/blob/998fe967/src/java/org/apache/cassandra/net/OutboundTcpConnectionPool.java
--
diff --git a/src/java/org/apache/cassandra/net/OutboundTcpConnectionPool.java 
b/src/java/org/apache/cassandra/net/OutboundTcpConnectionPool.java
index 1bc1893..86476b1 100644
--- a/src/java/org/apache/cassandra/net/OutboundTcpConnectionPool.java
+++ b/src/java/org/apache/cassandra/net/OutboundTcpConnectionPool.java
@@ -130,7 +130,7 @@ public class OutboundTcpConnectionPool
 }
 }
 
-InetAddress endPoint()
+public InetAddress endPoint()
 {
 if (id.equals(FBUtilities.getBroadcastAddress()))
 return FBUtilities.getLocalAddress();



[2/2] git commit: Merge branch 'cassandra-1.2' into trunk

2013-06-20 Thread jasobrown
Merge branch 'cassandra-1.2' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8df9d1f4
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8df9d1f4
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8df9d1f4

Branch: refs/heads/trunk
Commit: 8df9d1f4c84773250048f41a9143fa6a5457dfd0
Parents: 7bb6f01 998fe96
Author: Jason Brown jasedbr...@gmail.com
Authored: Thu Jun 20 13:42:28 2013 -0700
Committer: Jason Brown jasedbr...@gmail.com
Committed: Thu Jun 20 13:42:28 2013 -0700

--
 CHANGES.txt |  1 +
 .../cassandra/locator/Ec2MultiRegionSnitch.java | 21 ++--
 .../net/OutboundTcpConnectionPool.java  |  2 +-
 3 files changed, 13 insertions(+), 11 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/8df9d1f4/CHANGES.txt
--



[1/2] git commit: Avoid over-reconnecting in EC2MRS patch by jasobrown; reviewed by jbellis for CASSANDRA-5678

2013-06-20 Thread jasobrown
Updated Branches:
  refs/heads/trunk 7bb6f012b - 8df9d1f4c


Avoid over-reconnecting in EC2MRS
patch by jasobrown; reviewed by jbellis for CASSANDRA-5678


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/998fe967
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/998fe967
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/998fe967

Branch: refs/heads/trunk
Commit: 998fe96766a8c826ab0483e657885eb10a9293ae
Parents: b4dca44
Author: Jason Brown jasedbr...@gmail.com
Authored: Thu Jun 20 12:43:46 2013 -0700
Committer: Jason Brown jasedbr...@gmail.com
Committed: Thu Jun 20 13:40:49 2013 -0700

--
 CHANGES.txt |  1 +
 .../cassandra/locator/Ec2MultiRegionSnitch.java | 21 ++--
 .../net/OutboundTcpConnectionPool.java  |  2 +-
 3 files changed, 13 insertions(+), 11 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/998fe967/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 6d9c910..3847d6a 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -32,6 +32,7 @@
  * Never allow partition key ranges in CQL3 without token() (CASSANDRA-5666)
  * Gossiper incorrectly drops AppState for an upgrading node (CASSANDRA-5660)
  * Connection thrashing during multi-region ec2 during upgrade, due to 
messaging version (CASSANDRA-5669)
+ * Avoid over reconnecting in EC2MRS (CASSANDRA-5678)
 Merged from 1.1:
  * Remove buggy thrift max message length option (CASSANDRA-5529)
  * Fix NPE in Pig's widerow mode (CASSANDRA-5488)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/998fe967/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
--
diff --git a/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java 
b/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
index 12ebfbb..ea41bc0 100644
--- a/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
+++ b/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
@@ -97,20 +97,21 @@ public class Ec2MultiRegionSnitch extends Ec2Snitch 
implements IEndpointStateCha
 
     private void reConnect(InetAddress endpoint, VersionedValue versionedValue)
     {
-        if (getDatacenter(endpoint).equals(getDatacenter(public_ip))
-            && MessagingService.instance().getVersion(endpoint) == MessagingService.current_version)
+        try
         {
-            try
+            InetAddress localEc2IP = InetAddress.getByName(versionedValue.value);
+            if (getDatacenter(endpoint).equals(getDatacenter(public_ip))
+                && MessagingService.instance().getVersion(endpoint) == MessagingService.current_version
+                && !MessagingService.instance().getConnectionPool(endpoint).endPoint().equals(localEc2IP))
             {
-                InetAddress remoteIP = InetAddress.getByName(versionedValue.value);
-                MessagingService.instance().getConnectionPool(endpoint).reset(remoteIP);
-                logger.debug(String.format("Intiated reconnect to an Internal IP %s for the %s", remoteIP, endpoint));
-            }
-            catch (UnknownHostException e)
-            {
-                logger.error("Error in getting the IP address resolved: ", e);
+                MessagingService.instance().getConnectionPool(endpoint).reset(localEc2IP);
+                logger.debug(String.format("Intiated reconnect to an Internal IP %s for the %s", localEc2IP, endpoint));
             }
         }
+        catch (UnknownHostException e)
+        {
+            logger.error("Error in getting the IP address resolved: ", e);
+        }
     }
 
 @Override

http://git-wip-us.apache.org/repos/asf/cassandra/blob/998fe967/src/java/org/apache/cassandra/net/OutboundTcpConnectionPool.java
--
diff --git a/src/java/org/apache/cassandra/net/OutboundTcpConnectionPool.java 
b/src/java/org/apache/cassandra/net/OutboundTcpConnectionPool.java
index 1bc1893..86476b1 100644
--- a/src/java/org/apache/cassandra/net/OutboundTcpConnectionPool.java
+++ b/src/java/org/apache/cassandra/net/OutboundTcpConnectionPool.java
@@ -130,7 +130,7 @@ public class OutboundTcpConnectionPool
 }
 }
 
-InetAddress endPoint()
+public InetAddress endPoint()
 {
 if (id.equals(FBUtilities.getBroadcastAddress()))
 return FBUtilities.getLocalAddress();



[jira] [Commented] (CASSANDRA-5678) Avoid over reconnecting in EC2MRS

2013-06-20 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689635#comment-13689635
 ] 

Jason Brown commented on CASSANDRA-5678:


committed to 1.2 and trunk. thanks!

 Avoid over reconnecting in EC2MRS
 -

 Key: CASSANDRA-5678
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5678
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.2.5
Reporter: Jason Brown
Assignee: Jason Brown
Priority: Minor
  Labels: ec2, ec2multiregionsnitch
 Fix For: 1.2.6, 2.0 beta 1

 Attachments: 5678-v1.diff


 EC2MRS can reset the localIP connection to peers aggressively when calls to 
 its IEndpointStateChangeSubscriber impls get invoked. We shouldn't need to 
 reset (switch to the localIP) if we're already using the localIP in the OTCP 
 (as that's all the reset will do).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (CASSANDRA-5679) Wide Row calls map method once per column in Hadoop MapReduce

2013-06-20 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-5679.
---

Resolution: Duplicate

 Wide Row calls map method once per column in Hadoop MapReduce
 -

 Key: CASSANDRA-5679
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5679
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.4
 Environment: CentOS 6.3, Hadoop 1.1.2, Cassandra 1.2.4
Reporter: Tony Zhao

 When using Cassandra without wide row support in a Hadoop job, each call to 
 the mapper's map method receives a batch of columns limited by the 
 SlicePredicate; but when using wide row support, the map method is called once 
 for every column. It seems like the limit in the SlicePredicate is ignored 
 when wide row support is set to true. 
 This prevents in-mapper combining code from working (e.g. emitting a top ten 
 from a mapper).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[3/3] git commit: Merge branch 'cassandra-1.2' into trunk

2013-06-20 Thread jbellis
Merge branch 'cassandra-1.2' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/21deff6c
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/21deff6c
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/21deff6c

Branch: refs/heads/trunk
Commit: 21deff6c2a454771aa3df0170cc15dbc0ee6
Parents: 8df9d1f 7dc2eb9
Author: Jonathan Ellis jbel...@apache.org
Authored: Thu Jun 20 15:47:13 2013 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Thu Jun 20 15:47:13 2013 -0500

--
 .../cassandra/locator/Ec2MultiRegionSnitch.java | 42 +++-
 1 file changed, 23 insertions(+), 19 deletions(-)
--




[1/3] git commit: cleanup

2013-06-20 Thread jbellis
Updated Branches:
  refs/heads/cassandra-1.2 998fe9676 - 7dc2eb95c
  refs/heads/trunk 8df9d1f4c - 21deff6c9


cleanup


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7dc2eb95
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7dc2eb95
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7dc2eb95

Branch: refs/heads/cassandra-1.2
Commit: 7dc2eb95c1752eb661b93c72a831ceb783d42ce4
Parents: 998fe96
Author: Jonathan Ellis jbel...@apache.org
Authored: Thu Jun 20 15:47:06 2013 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Thu Jun 20 15:47:06 2013 -0500

--
 .../cassandra/locator/Ec2MultiRegionSnitch.java | 42 +++-
 1 file changed, 23 insertions(+), 19 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/7dc2eb95/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
--
diff --git a/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java 
b/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
index ea41bc0..9317941 100644
--- a/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
+++ b/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
@@ -49,35 +49,35 @@ public class Ec2MultiRegionSnitch extends Ec2Snitch 
implements IEndpointStateCha
 {
     private static final String PUBLIC_IP_QUERY_URL = "http://169.254.169.254/latest/meta-data/public-ipv4";
     private static final String PRIVATE_IP_QUERY_URL = "http://169.254.169.254/latest/meta-data/local-ipv4";
-    private final InetAddress public_ip;
-    private final String private_ip;
+    private final InetAddress localPublicAddress;
+    private final String localPrivateAddress;
 
     public Ec2MultiRegionSnitch() throws IOException, ConfigurationException
     {
         super();
-        public_ip = InetAddress.getByName(awsApiCall(PUBLIC_IP_QUERY_URL));
-        logger.info("EC2Snitch using publicIP as identifier: " + public_ip);
-        private_ip = awsApiCall(PRIVATE_IP_QUERY_URL);
+        localPublicAddress = InetAddress.getByName(awsApiCall(PUBLIC_IP_QUERY_URL));
+        logger.info("EC2Snitch using publicIP as identifier: " + localPublicAddress);
+        localPrivateAddress = awsApiCall(PRIVATE_IP_QUERY_URL);
         // use the Public IP to broadcast Address to other nodes.
-        DatabaseDescriptor.setBroadcastAddress(public_ip);
+        DatabaseDescriptor.setBroadcastAddress(localPublicAddress);
     }
 
     public void onJoin(InetAddress endpoint, EndpointState epState)
     {
         if (epState.getApplicationState(ApplicationState.INTERNAL_IP) != null)
-            reConnect(endpoint, epState.getApplicationState(ApplicationState.INTERNAL_IP));
+            reconnect(endpoint, epState.getApplicationState(ApplicationState.INTERNAL_IP));
     }
 
     public void onChange(InetAddress endpoint, ApplicationState state, VersionedValue value)
     {
         if (state == ApplicationState.INTERNAL_IP)
-            reConnect(endpoint, value);
+            reconnect(endpoint, value);
     }
 
     public void onAlive(InetAddress endpoint, EndpointState state)
     {
         if (state.getApplicationState(ApplicationState.INTERNAL_IP) != null)
-            reConnect(endpoint, state.getApplicationState(ApplicationState.INTERNAL_IP));
+            reconnect(endpoint, state.getApplicationState(ApplicationState.INTERNAL_IP));
     }
 
     public void onDead(InetAddress endpoint, EndpointState state)
@@ -95,18 +95,11 @@ public class Ec2MultiRegionSnitch extends Ec2Snitch implements IEndpointStateCha
         // do nothing.
     }
 
-    private void reConnect(InetAddress endpoint, VersionedValue versionedValue)
+    private void reconnect(InetAddress publicAddress, VersionedValue localAddressValue)
     {
         try
         {
-            InetAddress localEc2IP = InetAddress.getByName(versionedValue.value);
-            if (getDatacenter(endpoint).equals(getDatacenter(public_ip))
-                && MessagingService.instance().getVersion(endpoint) == MessagingService.current_version
-                && !MessagingService.instance().getConnectionPool(endpoint).endPoint().equals(localEc2IP))
-            {
-                MessagingService.instance().getConnectionPool(endpoint).reset(localEc2IP);
-                logger.debug(String.format("Intiated reconnect to an Internal IP %s for the %s", localEc2IP, endpoint));
-            }
+            reconnect(publicAddress, InetAddress.getByName(localAddressValue.value));
         }
         catch (UnknownHostException e)
         {
@@ -114,11 +107,22 @@ public class Ec2MultiRegionSnitch extends Ec2Snitch implements IEndpointStateCha
         }
     }
 
+    private void reconnect(InetAddress publicAddress, 

[2/3] git commit: cleanup

2013-06-20 Thread jbellis
cleanup


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7dc2eb95
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7dc2eb95
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7dc2eb95

Branch: refs/heads/trunk
Commit: 7dc2eb95c1752eb661b93c72a831ceb783d42ce4
Parents: 998fe96
Author: Jonathan Ellis jbel...@apache.org
Authored: Thu Jun 20 15:47:06 2013 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Thu Jun 20 15:47:06 2013 -0500

--
 .../cassandra/locator/Ec2MultiRegionSnitch.java | 42 +++-
 1 file changed, 23 insertions(+), 19 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/7dc2eb95/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
--
diff --git a/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java b/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
index ea41bc0..9317941 100644
--- a/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
+++ b/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java
@@ -49,35 +49,35 @@ public class Ec2MultiRegionSnitch extends Ec2Snitch implements IEndpointStateChangeSubscriber
 {
     private static final String PUBLIC_IP_QUERY_URL = "http://169.254.169.254/latest/meta-data/public-ipv4";
     private static final String PRIVATE_IP_QUERY_URL = "http://169.254.169.254/latest/meta-data/local-ipv4";
-    private final InetAddress public_ip;
-    private final String private_ip;
+    private final InetAddress localPublicAddress;
+    private final String localPrivateAddress;
 
     public Ec2MultiRegionSnitch() throws IOException, ConfigurationException
     {
         super();
-        public_ip = InetAddress.getByName(awsApiCall(PUBLIC_IP_QUERY_URL));
-        logger.info("EC2Snitch using publicIP as identifier: " + public_ip);
-        private_ip = awsApiCall(PRIVATE_IP_QUERY_URL);
+        localPublicAddress = InetAddress.getByName(awsApiCall(PUBLIC_IP_QUERY_URL));
+        logger.info("EC2Snitch using publicIP as identifier: " + localPublicAddress);
+        localPrivateAddress = awsApiCall(PRIVATE_IP_QUERY_URL);
         // use the Public IP to broadcast Address to other nodes.
-        DatabaseDescriptor.setBroadcastAddress(public_ip);
+        DatabaseDescriptor.setBroadcastAddress(localPublicAddress);
     }
 
     public void onJoin(InetAddress endpoint, EndpointState epState)
     {
         if (epState.getApplicationState(ApplicationState.INTERNAL_IP) != null)
-            reConnect(endpoint, epState.getApplicationState(ApplicationState.INTERNAL_IP));
+            reconnect(endpoint, epState.getApplicationState(ApplicationState.INTERNAL_IP));
     }
 
     public void onChange(InetAddress endpoint, ApplicationState state, VersionedValue value)
     {
         if (state == ApplicationState.INTERNAL_IP)
-            reConnect(endpoint, value);
+            reconnect(endpoint, value);
     }
 
     public void onAlive(InetAddress endpoint, EndpointState state)
     {
         if (state.getApplicationState(ApplicationState.INTERNAL_IP) != null)
-            reConnect(endpoint, state.getApplicationState(ApplicationState.INTERNAL_IP));
+            reconnect(endpoint, state.getApplicationState(ApplicationState.INTERNAL_IP));
     }
 
     public void onDead(InetAddress endpoint, EndpointState state)
@@ -95,18 +95,11 @@ public class Ec2MultiRegionSnitch extends Ec2Snitch implements IEndpointStateChangeSubscriber
         // do nothing.
     }
 
-    private void reConnect(InetAddress endpoint, VersionedValue versionedValue)
+    private void reconnect(InetAddress publicAddress, VersionedValue localAddressValue)
     {
         try
         {
-            InetAddress localEc2IP = InetAddress.getByName(versionedValue.value);
-            if (getDatacenter(endpoint).equals(getDatacenter(public_ip))
-                && MessagingService.instance().getVersion(endpoint) == MessagingService.current_version
-                && !MessagingService.instance().getConnectionPool(endpoint).endPoint().equals(localEc2IP))
-            {
-                MessagingService.instance().getConnectionPool(endpoint).reset(localEc2IP);
-                logger.debug(String.format("Intiated reconnect to an Internal IP %s for the %s", localEc2IP, endpoint));
-            }
+            reconnect(publicAddress, InetAddress.getByName(localAddressValue.value));
         }
         catch (UnknownHostException e)
         {
@@ -114,11 +107,22 @@ public class Ec2MultiRegionSnitch extends Ec2Snitch implements IEndpointStateChangeSubscriber
         }
     }
 
+    private void reconnect(InetAddress publicAddress, InetAddress localAddress)
+    {
+        if (getDatacenter(publicAddress).equals(getDatacenter(localPublicAddress))
+ 
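
The new overload is cut off above at the message boundary. As a rough guide only, here is a sketch of what the reconnect(publicAddress, localAddress) method plausibly contains, reconstructed from the reConnect logic removed in the previous hunk (an editorial illustration, not the literal commit content):

{code}
private void reconnect(InetAddress publicAddress, InetAddress localAddress)
{
    // Only reconnect when the peer is in our datacenter, speaks the current
    // messaging version, and is not already connected via its private address.
    if (getDatacenter(publicAddress).equals(getDatacenter(localPublicAddress))
        && MessagingService.instance().getVersion(publicAddress) == MessagingService.current_version
        && !MessagingService.instance().getConnectionPool(publicAddress).endPoint().equals(localAddress))
    {
        MessagingService.instance().getConnectionPool(publicAddress).reset(localAddress);
        logger.debug(String.format("Intiated reconnect to an Internal IP %s for the %s", localAddress, publicAddress));
    }
}
{code}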

[jira] [Created] (CASSANDRA-5680) keyspace (and CF?) argument to nodetool cfstats

2013-06-20 Thread Robert Coli (JIRA)
Robert Coli created CASSANDRA-5680:
--

 Summary: keyspace (and CF?) argument to nodetool cfstats
 Key: CASSANDRA-5680
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5680
 Project: Cassandra
  Issue Type: Improvement
Reporter: Robert Coli
Assignee: Edward Capriolo
Priority: Trivial


Operators frequently use cfstats to get a quick health check/row size 
check/etc. for their cluster. Unfortunately, the output of cfstats includes 
system keyspace columnfamilies and there is no easy way to exclude them from 
the output.

[~appodictic] and I were discussing this on #cassandra and he said that if I 
filed a JIRA to request that cfstats take a keyspace (and CF?) argument, he 
would try to hack out a quick patch.

Here's that JIRA! :D

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5679) Wide Row calls map method once per column in Hadoop MapReduce

2013-06-20 Thread Tony Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tony Zhao updated CASSANDRA-5679:
-

Labels: cassandra hadoop  (was: )

 Wide Row calls map method once per column in Hadoop MapReduce
 -

 Key: CASSANDRA-5679
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5679
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.4
 Environment: CentOS 6.3, Hadoop 1.1.2, Cassandra 1.2.4
Reporter: Tony Zhao
  Labels: cassandra, hadoop

 When using Cassandra without wide row support in a Hadoop job, each call to 
 the mapper's map method receives a batch of columns bounded by the 
 SlicePredicate; but with wide row support, the map method is called once for 
 every column. It seems the limit in the SlicePredicate is ignored when wide 
 row support is enabled. 
 This prevents in-mapper reducing code from working (e.g., emitting a top ten 
 from a mapper).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5679) Wide Row calls map method once per column in Hadoop MapReduce

2013-06-20 Thread Tony Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689678#comment-13689678
 ] 

Tony Zhao commented on CASSANDRA-5679:
--

They are not the same problem. CASSANDRA-4871 says that you can't bound what 
Cassandra returns with a SlicePredicate. I am saying that the result set I get 
back is bounded by my SlicePredicate, but it is not sliced correctly: without 
wide rows, each call to the map method receives a chunk of columns limited by 
the number specified in the SlicePredicate; with wide rows, each call receives 
only one column. So if the SlicePredicate returns 1000 columns with wide rows 
enabled, the map method gets called 1000 times when it should ideally be called 
just once with 1000 columns.
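
To make the failure mode concrete, here is a hedged sketch of the kind of per-row logic that breaks. The mapper signature follows the ByteBuffer/SortedMap style used with ColumnFamilyInputFormat in this era; the class name and output types are hypothetical, not part of the ticket:

{code}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.SortedMap;

import org.apache.cassandra.db.IColumn;
import org.apache.cassandra.utils.ByteBufferUtil;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical example mapper, not code from the ticket.
public class RowWidthMapper extends Mapper<ByteBuffer, SortedMap<ByteBuffer, IColumn>, Text, IntWritable>
{
    @Override
    protected void map(ByteBuffer key, SortedMap<ByteBuffer, IColumn> columns, Context context)
            throws IOException, InterruptedException
    {
        // Intended as "columns seen for this row": without wide rows this is up to the
        // SlicePredicate count per call; with wide rows (per this report) it is always 1,
        // so any per-row, in-mapper aggregation falls apart.
        context.write(new Text(ByteBufferUtil.string(key)), new IntWritable(columns.size()));
    }
}
{code}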

 Wide Row calls map method once per column in Hadoop MapReduce
 -

 Key: CASSANDRA-5679
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5679
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.4
 Environment: CentOS 6.3, Hadoop 1.1.2, Cassandra 1.2.4
Reporter: Tony Zhao

 When using Cassandra without wide row support in a Hadoop job, each call to 
 the mapper's map method receives a batch of columns bounded by the 
 SlicePredicate; but with wide row support, the map method is called once for 
 every column. It seems the limit in the SlicePredicate is ignored when wide 
 row support is enabled. 
 This prevents in-mapper reducing code from working (e.g., emitting a top ten 
 from a mapper).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Reopened] (CASSANDRA-5679) Wide Row calls map method once per column in Hadoop MapReduce

2013-06-20 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reopened CASSANDRA-5679:
---


 Wide Row calls map method once per column in Hadoop MapReduce
 -

 Key: CASSANDRA-5679
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5679
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.4
 Environment: CentOS 6.3, Hadoop 1.1.2, Cassandra 1.2.4
Reporter: Tony Zhao
  Labels: cassandra, hadoop

 When using Cassandra without wide row support in a Hadoop job, each call to 
 the mapper's map method receives a batch of columns bounded by the 
 SlicePredicate; but with wide row support, the map method is called once for 
 every column. It seems the limit in the SlicePredicate is ignored when wide 
 row support is enabled. 
 This prevents in-mapper reducing code from working (e.g., emitting a top ten 
 from a mapper).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (CASSANDRA-5679) Wide Row calls map method once per column in Hadoop MapReduce

2013-06-20 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-5679.
---

Resolution: Won't Fix

Very well, resolved as wontfix instead of duplicate.

 Wide Row calls map method once per column in Hadoop MapReduce
 -

 Key: CASSANDRA-5679
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5679
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.4
 Environment: CentOS 6.3, Hadoop 1.1.2, Cassandra 1.2.4
Reporter: Tony Zhao
  Labels: cassandra, hadoop

 When using Cassandra without wide row support in a Hadoop job, each call to 
 the mapper's map method receives a batch of columns bounded by the 
 SlicePredicate; but with wide row support, the map method is called once for 
 every column. It seems the limit in the SlicePredicate is ignored when wide 
 row support is enabled. 
 This prevents in-mapper reducing code from working (e.g., emitting a top ten 
 from a mapper).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5679) Wide Row calls map method once per column in Hadoop MapReduce

2013-06-20 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689706#comment-13689706
 ] 

Jonathan Ellis commented on CASSANDRA-5679:
---

(Use CqlPagedInputFormat instead for wide rows done right.)

 Wide Row calls map method once per column in Hadoop MapReduce
 -

 Key: CASSANDRA-5679
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5679
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.4
 Environment: CentOS 6.3, Hadoop 1.1.2, Cassandra 1.2.4
Reporter: Tony Zhao
  Labels: cassandra, hadoop

 When using Cassandra without wide row support in a Hadoop job, each call to 
 the mapper's map method receives a batch of columns bounded by the 
 SlicePredicate; but with wide row support, the map method is called once for 
 every column. It seems the limit in the SlicePredicate is ignored when wide 
 row support is enabled. 
 This prevents in-mapper reducing code from working (e.g., emitting a top ten 
 from a mapper).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5608) Primary range repair still isn't quite NTS-aware

2013-06-20 Thread Bill Hathaway (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689730#comment-13689730
 ] 

Bill Hathaway commented on CASSANDRA-5608:
--

With the standard token assignment strategy of offsetting each new datacenter 
by +1, running repair -pr on any datacenter other than the primary one (the one 
whose first node has token=0) repairs a range of size 1, versus the normal huge 
range on the primary datacenter.


Example on 1.1.10 from primary data center (huge range)
2013-05-26 09:07:07,728 [AntiEntropySessions:7] INFO AntiEntropyService [repair 
#a42df6e0-c5e3-11e2--dad4ff6f95da] new session: will sync /172.20.248.95 on 
range 
(106338239662793269832304564822427566081,127605887595351923798765477786913079296]
 for UnitTestKeyspace.[CF1,CF2,CF3]

Example on 1.1.10 from secondary data center (range of 1)
2013-06-20 19:11:39,238 [AntiEntropySessions:23] INFO AntiEntropyService 
[repair #3c031150-d9dd-11e2--b258590872ff] new session: will sync 
/172.20.156.220 on range 
(148873535527910577765226390751398592512,148873535527910577765226390751398592513]
 for UnitTestKeyspace.[CF1,CF2,CF3]
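
The arithmetic behind that range of 1 is easy to reproduce in isolation. A small, purely illustrative sketch (tokens and node names are made up; this is not Cassandra code): each node's naive primary range is (predecessor token, own token] over the full ring, and a +1-offset node's predecessor is always the corresponding primary-DC node at token t-1.

{code}
import java.math.BigInteger;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: naive per-node "primary range" over the full ring, ignoring DCs.
public class PrimaryRangeDemo
{
    public static void main(String[] args)
    {
        BigInteger ringSize = new BigInteger("2").pow(127);        // RandomPartitioner token space
        BigInteger half = ringSize.divide(BigInteger.valueOf(2));

        // Two DCs, two nodes each; DC2 tokens are DC1 tokens + 1 (the usual convention).
        TreeMap<BigInteger, String> ring = new TreeMap<BigInteger, String>();
        ring.put(BigInteger.ZERO, "DC1-a");
        ring.put(BigInteger.ONE, "DC2-a");
        ring.put(half, "DC1-b");
        ring.put(half.add(BigInteger.ONE), "DC2-b");

        for (Map.Entry<BigInteger, String> node : ring.entrySet())
        {
            BigInteger token = node.getKey();
            BigInteger predecessor = ring.lowerKey(token) != null ? ring.lowerKey(token) : ring.lastKey();
            BigInteger width = token.subtract(predecessor).mod(ringSize);
            System.out.printf("%s primary range (%s, %s], width %s%n",
                              node.getValue(), predecessor, token, width);
        }
        // The DC2 nodes end up "owning" a range of width 1, which is what repair -pr then repairs.
    }
}
{code}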


 Primary range repair still isn't quite NTS-aware
 --

 Key: CASSANDRA-5608
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5608
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Affects Versions: 1.2.5
Reporter: Jonathan Ellis

 Consider the case of a four node cluster, with nodes A and C in DC1, and 
 nodes B and D in DC2.  TokenMetadata will break this into ranges of (A-B], 
 (B-C], (C-D], (D-A].
 If we have a single copy of a keyspace stored in DC1 only (none in DC2), then 
 the current code correctly calculates that node A is responsible for ranges 
 (C-D], (D-A].
 But, if we add a copy in DC2, then we only calculate (D-A] as primary range.  
 This is a bug; we should not care what copies are in other datacenters, when 
 computing what to repair in the local one.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5608) Primary range repair still isn't quite NTS-aware

2013-06-20 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689740#comment-13689740
 ] 

Jonathan Ellis commented on CASSANDRA-5608:
---

That was fixed in 1.2.5 by CASSANDRA-5424.

 Primary range repair still isn't quite NTS-aware
 --

 Key: CASSANDRA-5608
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5608
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Affects Versions: 1.2.5
Reporter: Jonathan Ellis

 Consider the case of a four node cluster, with nodes A and C in DC1, and 
 nodes B and D in DC2.  TokenMetadata will break this into ranges of (A-B], 
 (B-C], (C-D], (D-A].
 If we have a single copy of a keyspace stored in DC1 only (none in DC2), then 
 the current code correctly calculates that node A is responsible for ranges 
 (C-D], (D-A].
 But, if we add a copy in DC2, then we only calculate (D-A] as primary range.  
 This is a bug; we should not care what copies are in other datacenters, when 
 computing what to repair in the local one.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5234) Table created through CQL3 are not accessble to Pig 0.10

2013-06-20 Thread Alex Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Liu updated CASSANDRA-5234:


Attachment: 5234-2-1.2branch.txt

5234-2-1.2branch.txt is attached; it switches CqlStorage to use cql:// instead 
of cassandra:// URLs.

 Table created through CQL3 are not accessble to Pig 0.10
 

 Key: CASSANDRA-5234
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5234
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.1
 Environment: Red hat linux 5
Reporter: Shamim Ahmed
Assignee: Alex Liu
 Fix For: 1.2.6

 Attachments: 5234-1-1.2-patch.txt, 5234-1.2-patch.txt, 
 5234-2-1.2branch.txt, 5234.tx


 Hi,
   I hit a bug when creating a table through CQL3 and then trying to load data 
 through Pig 0.10; it fails as follows:
 java.lang.RuntimeException: Column family 'abc' not found in keyspace 'XYZ'
   at org.apache.cassandra.hadoop.pig.CassandraStorage.initSchema(CassandraStorage.java:1112)
   at org.apache.cassandra.hadoop.pig.CassandraStorage.setLocation(CassandraStorage.java:615)
 This affects everything from simple tables to tables with compound keys. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (CASSANDRA-5681) Refactor IESCS in Snitches

2013-06-20 Thread Jason Brown (JIRA)
Jason Brown created CASSANDRA-5681:
--

 Summary: Refactor IESCS in Snitches
 Key: CASSANDRA-5681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5681
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.2.5
Reporter: Jason Brown
Assignee: Jason Brown
Priority: Minor
 Fix For: 1.2.6, 2.0 beta 1


Reduce/refactor duplicated IESCS implementations in Ec2MRS, GPFS, and YPNTS.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5681) Refactor IESCS in Snitches

2013-06-20 Thread Jason Brown (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-5681:
---

Attachment: 5681-v1.diff

The attached v1 patch extracts the IESCS work in Ec2MRS and GPFS into a new 
helper class, ReconnectableSnitchHelper (which implements IESCS). I chose to 
create a new 'sidekick/helper' class rather than a new parent class, as Ec2MRS 
already derives from Ec2Snitch, and it wouldn't make sense to have Ec2Snitch 
derive from the new 'reconnecting' snitch since it doesn't need the reconnect 
functionality.

Note: when applying to trunk, I will also refactor YamlFileNTS (as it didn't 
exist in 1.2).
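
For orientation only, here is a minimal sketch of the shape such a helper could take; it is an illustration based on the reconnect logic already in Ec2MultiRegionSnitch, not the contents of 5681-v1.diff:

{code}
import java.net.InetAddress;
import java.net.UnknownHostException;

import org.apache.cassandra.gms.ApplicationState;
import org.apache.cassandra.gms.EndpointState;
import org.apache.cassandra.gms.IEndpointStateChangeSubscriber;
import org.apache.cassandra.gms.VersionedValue;
import org.apache.cassandra.locator.IEndpointSnitch;
import org.apache.cassandra.net.MessagingService;

// Hypothetical sketch; the actual patch may differ in names and details.
public class ReconnectableSnitchHelper implements IEndpointStateChangeSubscriber
{
    private final IEndpointSnitch snitch;
    private final String localDatacenter;

    public ReconnectableSnitchHelper(IEndpointSnitch snitch, String localDatacenter)
    {
        this.snitch = snitch;
        this.localDatacenter = localDatacenter;
    }

    public void onJoin(InetAddress endpoint, EndpointState epState)
    {
        VersionedValue internalIp = epState.getApplicationState(ApplicationState.INTERNAL_IP);
        if (internalIp != null)
            reconnect(endpoint, internalIp);
    }

    public void onChange(InetAddress endpoint, ApplicationState state, VersionedValue value)
    {
        if (state == ApplicationState.INTERNAL_IP)
            reconnect(endpoint, value);
    }

    public void onAlive(InetAddress endpoint, EndpointState state)
    {
        onJoin(endpoint, state);
    }

    public void onDead(InetAddress endpoint, EndpointState state) {}
    public void onRemove(InetAddress endpoint) {}
    public void onRestart(InetAddress endpoint, EndpointState state) {}

    // The reconnect-to-private-IP behaviour that the snitches currently duplicate.
    private void reconnect(InetAddress publicAddress, VersionedValue localAddressValue)
    {
        try
        {
            InetAddress localAddress = InetAddress.getByName(localAddressValue.value);
            if (snitch.getDatacenter(publicAddress).equals(localDatacenter)
                && !MessagingService.instance().getConnectionPool(publicAddress).endPoint().equals(localAddress))
            {
                MessagingService.instance().getConnectionPool(publicAddress).reset(localAddress);
            }
        }
        catch (UnknownHostException e)
        {
            throw new RuntimeException(e);
        }
    }
}
{code}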

 Refactor IESCS in Snitches
 --

 Key: CASSANDRA-5681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5681
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.2.5
Reporter: Jason Brown
Assignee: Jason Brown
Priority: Minor
  Labels: snitch
 Fix For: 1.2.6, 2.0 beta 1

 Attachments: 5681-v1.diff


 Reduce/refactor duplicated IESCS implementations in Ec2MRS, GPFS, and YPNTS.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Comment Edited] (CASSANDRA-5681) Refactor IESCS in Snitches

2013-06-20 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689803#comment-13689803
 ] 

Jason Brown edited comment on CASSANDRA-5681 at 6/20/13 10:58 PM:
--

The attached v1 patch extracts the IESCS work in Ec2MRS and GPFS into a new 
helper class, ReconnectableSnitchHelper (which implements IESCS). I chose to 
create a new 'sidekick/helper' class rather than a new parent class, as Ec2MRS 
already derives from Ec2Snitch, and it wouldn't make sense to have Ec2Snitch 
derive from the new 'reconnecting' snitch since it doesn't need the reconnect 
functionality.

Note: when applying to trunk, I will also refactor YamlFileNTS to use 
ReconnectableSnitchHelper (as YamlFileNTS didn't exist in 1.2).

Note 2: I'm open to renaming ReconnectableSnitchHelper to something more 
interesting (I'm not thrilled with the 'Helper' suffix).

  was (Author: jasobrown):
The attached v1 patch extracts the IESCS work in Ec2MRS and GPFS into a new 
helper class, ReconnectableSnitchHelper (which implements IESCS). I chose to 
create a new 'sidekick/helper' class rather than a new parent class, as Ec2MRS 
already derives from Ec2Snitch, and it wouldn't make sense to have Ec2Snitch 
derive from the new 'reconnecting' snitch since it doesn't need the reconnect 
functionality.

Note: when applying to trunk, I will also refactor YamlFileNTS (as it didn't 
exist in 1.2).
  
 Refactor IESCS in Snitches
 --

 Key: CASSANDRA-5681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5681
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.2.5
Reporter: Jason Brown
Assignee: Jason Brown
Priority: Minor
  Labels: snitch
 Fix For: 1.2.6, 2.0 beta 1

 Attachments: 5681-v1.diff


 Reduce/refactor duplicated IESCS implementations in Ec2MRS, GPFS, and YPNTS.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess its efficiency (precision)

2013-06-20 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689804#comment-13689804
 ] 

Benedict commented on CASSANDRA-2698:
-

Hi Yuki,

{quote}
how many rows each merkle tree range accounts for (and the size that this 
represents)
{quote}

I take it that two histograms, one of row-count distribution and one of 
row-size distribution, are what you're looking for?

I've attached a new patch which is *not* complete, in that I have not tested it 
and may want to change a few of the final details (such as, possibly, where and 
what is logged with the histogram). Before I iron out those kinks, I wanted to 
run the main crux of the changes past you to see whether it's what you're 
looking for. Simply put, the merkle tree ranges now retain both a sizeOfRange 
(i.e. the size of the rows added) and a rowsInRange (i.e. the number of rows 
added). The merkle tree now exposes two histogramXXX() methods which use these, 
and which as of now are logged in Validator.complete(). As it stands, I 
serialise both new values over the wire with any merkle tree, to ensure no 
unexpected behaviour for future users of the class, and as such I also retained 
my TreeDifference changes to the merkle tree, which report the size and row 
count of each side of a difference. These latter two changes may be slightly 
controversial, so I want to run them past you, as well as confirm that the 
basic information I'm printing is what you're looking for.

Cheers!
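
To picture the bookkeeping being described, here is a minimal sketch of a per-range accumulator, under the assumption that each merkle tree range carries a pair of counters; the names below are illustrative, not the patch's actual fields:

{code}
// Illustrative only: per-range accumulation of row count and byte size while
// building a merkle tree, so range precision can be assessed afterwards.
public final class RangeStats
{
    private long rowsInRange;
    private long sizeOfRange;

    // Called once per row hashed into this merkle tree range.
    public void addRow(long serializedSizeInBytes)
    {
        rowsInRange++;
        sizeOfRange += serializedSizeInBytes;
    }

    public long rowsInRange() { return rowsInRange; }
    public long sizeOfRange() { return sizeOfRange; }
}
{code}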

 Instrument repair to be able to assess its efficiency (precision)
 --

 Key: CASSANDRA-2698
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698
 Project: Cassandra
  Issue Type: Improvement
Reporter: Sylvain Lebresne
Assignee: Benedict
Priority: Minor
  Labels: lhf
 Attachments: nodetool_repair_and_cfhistogram.tar.gz, 
 patch_2698_v1.txt, patch.diff, patch-rebased.diff


 Some reports indicate that repair sometimes transfers huge amounts of data. 
 One hypothesis is that the merkle tree precision may deteriorate too much at 
 some data size. To check this hypothesis, it would be reasonable to gather 
 statistics during the merkle tree build of how many rows each merkle tree 
 range accounts for (and the size that this represents). It is probably an 
 interesting statistic to have anyway.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-2698) Instrument repair to be able to assess its efficiency (precision)

2013-06-20 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-2698:


Attachment: patch.taketwo.alpha.diff

 Instrument repair to be able to assess its efficiency (precision)
 --

 Key: CASSANDRA-2698
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698
 Project: Cassandra
  Issue Type: Improvement
Reporter: Sylvain Lebresne
Assignee: Benedict
Priority: Minor
  Labels: lhf
 Attachments: nodetool_repair_and_cfhistogram.tar.gz, 
 patch_2698_v1.txt, patch.diff, patch-rebased.diff, patch.taketwo.alpha.diff


 Some reports indicate that repair sometimes transfers huge amounts of data. 
 One hypothesis is that the merkle tree precision may deteriorate too much at 
 some data size. To check this hypothesis, it would be reasonable to gather 
 statistics during the merkle tree build of how many rows each merkle tree 
 range accounts for (and the size that this represents). It is probably an 
 interesting statistic to have anyway.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess its efficiency (precision)

2013-06-20 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689854#comment-13689854
 ] 

Benedict commented on CASSANDRA-2698:
-

Just noticed that the patch doesn't include the HistogramBuilder class. After I 
posted, I realised I needed to pull in the latest remote changes, which 
unfortunately use Java 7 syntax, and since I'm on an old Eclipse my Problems 
window exploded. It's late here, so I don't want to faff around too much, but 
if you have trouble, let me know and I'll upload another patch after upgrading 
Eclipse. The uploaded patch should be easy to scan quickly against my 
description, to let me know whether I'm still barking up the wrong tree or 
whether there are changes you disagree with in principle.
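
Since the class itself is missing from the upload, here is a minimal sketch of what a HistogramBuilder along these lines could look like; it is purely an editorial guess at the idea, not the implementation in the patch:

{code}
import java.util.Arrays;

// Hypothetical sketch: collect long values (e.g. rows or bytes per merkle tree
// range) and report simple percentiles for logging.
public final class HistogramBuilder
{
    private long[] values = new long[64];
    private int count;

    public void add(long value)
    {
        if (count == values.length)
            values = Arrays.copyOf(values, count * 2);
        values[count++] = value;
    }

    /** Percentiles (fractions in 0.0-1.0) over the values added so far. */
    public long[] percentiles(double... fractions)
    {
        long[] sorted = Arrays.copyOf(values, count);
        Arrays.sort(sorted);
        long[] result = new long[fractions.length];
        for (int i = 0; i < fractions.length; i++)
        {
            if (count == 0)
                continue;
            int index = (int) Math.min(count - 1, Math.round(fractions[i] * (count - 1)));
            result[i] = sorted[index];
        }
        return result;
    }
}
{code}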

 Instrument repair to be able to assess its efficiency (precision)
 --

 Key: CASSANDRA-2698
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698
 Project: Cassandra
  Issue Type: Improvement
Reporter: Sylvain Lebresne
Assignee: Benedict
Priority: Minor
  Labels: lhf
 Attachments: nodetool_repair_and_cfhistogram.tar.gz, 
 patch_2698_v1.txt, patch.diff, patch-rebased.diff, patch.taketwo.alpha.diff


 Some reports indicate that repair sometimes transfers huge amounts of data. 
 One hypothesis is that the merkle tree precision may deteriorate too much at 
 some data size. To check this hypothesis, it would be reasonable to gather 
 statistics during the merkle tree build of how many rows each merkle tree 
 range accounts for (and the size that this represents). It is probably an 
 interesting statistic to have anyway.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (CASSANDRA-5682) When does Cassandra delete keys in the secondary index?

2013-06-20 Thread YounwooKim (JIRA)
YounwooKim created CASSANDRA-5682:
-

 Summary: When does Cassandra delete keys in the secondary index?
 Key: CASSANDRA-5682
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5682
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 2.0.1
 Environment: normal x86 PC (i3 CPU + 4GB ram) + Ubuntu 12.04
Reporter: YounwooKim
Priority: Minor


How can I reduce the size of a secondary index?

I have deleted many keys and tried flush, compact, cleanup, and rebuild_index 
using nodetool, but I cannot reduce the size of the secondary index. (Of 
course, the size of the base table is reduced.)

Looking for hints in the Cassandra source code, I infer the following about 
secondary index deletion:

1) When I request deletion of a key and the key is in an sstable (not in the 
memtable), Cassandra does not insert a tombstone into the secondary index 
sstable, unlike the base table.
( from the AbstractSimpleColumnSecondaryIndex.delete() function )

2) Only after the secondary index is scanned is the tombstone created in the 
secondary index.
( from the KeysSearcher.getIndexedIterator() function, which is called by the 
index scan verb )

3) The cleanup command in nodetool only deletes out-of-range keys; it does not 
care about deleted keys.
( from the CompactionManager.doCleanupCompaction() function )

After this, I scan the deleted keys using a 'WHERE' clause, and only then can I 
reduce the size of the secondary index. I think that is the only way to reduce 
the size of a secondary index.

Is this a correct conclusion? I can't find related articles or other methods. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5682) When does Cassandra delete keys in the secondary index?

2013-06-20 Thread YounwooKim (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YounwooKim updated CASSANDRA-5682:
--

Description: 
How can I reduce the size of a secondary index?

I have deleted many keys and tried flush, compact, cleanup, and rebuild_index 
using nodetool, but I cannot reduce the size of the secondary index. (Of 
course, the size of the base table is reduced.)

Looking for hints in the Cassandra source code, I infer the following about 
secondary index deletion:

1) When I request deletion of a key and the key is in an sstable (not in the 
memtable), Cassandra does not insert a tombstone into the secondary index 
sstable, unlike the base table.
( from the AbstractSimpleColumnSecondaryIndex.delete() function )

2) Only after the secondary index is scanned is the tombstone created in the 
secondary index.
( from the KeysSearcher.getIndexedIterator() function, which is called by the 
index scan verb )

3) The cleanup command in nodetool only deletes out-of-range keys; it does not 
care about deleted keys.
( from the CompactionManager.doCleanupCompaction() function )

After this, I scan the deleted keys using a 'WHERE' clause, and only then can I 
reduce the size of the secondary index. I think that is the only way to reduce 
the size of a secondary index.

Is this a correct conclusion? I can't find related articles or other methods. 

I think that Cassandra needs a compaction function for the secondary index.

  was:
How can I reduce the size of a secondary index?

I have deleted many keys and tried flush, compact, cleanup, and rebuild_index 
using nodetool, but I cannot reduce the size of the secondary index. (Of 
course, the size of the base table is reduced.)

Looking for hints in the Cassandra source code, I infer the following about 
secondary index deletion:

1) When I request deletion of a key and the key is in an sstable (not in the 
memtable), Cassandra does not insert a tombstone into the secondary index 
sstable, unlike the base table.
( from the AbstractSimpleColumnSecondaryIndex.delete() function )

2) Only after the secondary index is scanned is the tombstone created in the 
secondary index.
( from the KeysSearcher.getIndexedIterator() function, which is called by the 
index scan verb )

3) The cleanup command in nodetool only deletes out-of-range keys; it does not 
care about deleted keys.
( from the CompactionManager.doCleanupCompaction() function )

After this, I scan the deleted keys using a 'WHERE' clause, and only then can I 
reduce the size of the secondary index. I think that is the only way to reduce 
the size of a secondary index.

Is this a correct conclusion? I can't find related articles or other methods. 


 When does Cassandra delete keys in the secondary index?
 --

 Key: CASSANDRA-5682
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5682
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 2.0.1
 Environment: normal x86 PC (i3 CPU + 4GB ram) + Ubuntu 12.04
Reporter: YounwooKim
Priority: Minor

 How can I reduce the size of a secondary index?
 I have deleted many keys and tried flush, compact, cleanup, and rebuild_index 
 using nodetool, but I cannot reduce the size of the secondary index. (Of 
 course, the size of the base table is reduced.)
 Looking for hints in the Cassandra source code, I infer the following about 
 secondary index deletion:
 1) When I request deletion of a key and the key is in an sstable (not in the 
 memtable), Cassandra does not insert a tombstone into the secondary index 
 sstable, unlike the base table.
 ( from the AbstractSimpleColumnSecondaryIndex.delete() function )
 2) Only after the secondary index is scanned is the tombstone created in the 
 secondary index.
 ( from the KeysSearcher.getIndexedIterator() function, which is called by the 
 index scan verb )
 3) The cleanup command in nodetool only deletes out-of-range keys; it does not 
 care about deleted keys.
 ( from the CompactionManager.doCleanupCompaction() function )
 After this, I scan the deleted keys using a 'WHERE' clause, and only then can 
 I reduce the size of the secondary index. I think that is the only way to 
 reduce the size of a secondary index.
 Is this a correct conclusion? I can't find related articles or other methods. 
 I think that Cassandra needs a compaction function for the secondary index.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5286) Streaming 2.0

2013-06-20 Thread Dave Brosius (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Brosius updated CASSANDRA-5286:


Attachment: 5286_addendum.txt

currentThroughput is a double but is calculated with int math; just use double 
math.
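
The difference is easy to demonstrate in isolation. The sketch below assumes the throughput setting is an int of megabits per second, as in the expression the patch changes; the numbers in the comments are what the two forms actually evaluate to:

{code}
// Why the throughput expression needs double math: int math both truncates and overflows.
public class IntMathDemo
{
    public static void main(String[] args)
    {
        int megabitsPerSec = 200;
        double truncated = megabitsPerSec * 1024 * 1024 / 8 / 1000;           // 26214.0 (fraction lost)
        double exact = ((double) megabitsPerSec) * 1024 * 1024 / 8 / 1000;    // 26214.4

        int large = 3000;
        double overflowed = large * 1024 * 1024 / 8 / 1000;                   // -143654.0 (int overflow)
        double correct = ((double) large) * 1024 * 1024 / 8 / 1000;           // 393216.0

        System.out.println(truncated + " vs " + exact);
        System.out.println(overflowed + " vs " + correct);
    }
}
{code}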

 Streaming 2.0
 -

 Key: CASSANDRA-5286
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5286
 Project: Cassandra
  Issue Type: Improvement
Reporter: Yuki Morishita
Assignee: Yuki Morishita
  Labels: streaming
 Fix For: 2.0 beta 1

 Attachments: 5286_addendum.txt


 2.0 is a good time to redesign the streaming API, including the protocol, to 
 make streaming more performant and reliable.
 Design goals that come to mind:
 *Better performance*
   - Protocol optimization
   - Stream multiple files in parallel (CASSANDRA-4663)
   - Persistent connection (CASSANDRA-4660)
 *Better control*
   - Cleaner API for error handling
   - Integrate both IN/OUT streams into one session, so the components 
 (bootstrap, move, bulkload, repair...) that use streaming can manage them 
 easily.
 *Better reporting*
   - Better logging/tracing
   - More metrics
   - Progress reporting API for external clients

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Reopened] (CASSANDRA-5286) Streaming 2.0

2013-06-20 Thread Dave Brosius (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Brosius reopened CASSANDRA-5286:
-


patch attached

 Streaming 2.0
 -

 Key: CASSANDRA-5286
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5286
 Project: Cassandra
  Issue Type: Improvement
Reporter: Yuki Morishita
Assignee: Yuki Morishita
  Labels: streaming
 Fix For: 2.0 beta 1

 Attachments: 5286_addendum.txt


 2.0 is a good time to redesign the streaming API, including the protocol, to 
 make streaming more performant and reliable.
 Design goals that come to mind:
 *Better performance*
   - Protocol optimization
   - Stream multiple files in parallel (CASSANDRA-4663)
   - Persistent connection (CASSANDRA-4660)
 *Better control*
   - Cleaner API for error handling
   - Integrate both IN/OUT streams into one session, so the components 
 (bootstrap, move, bulkload, repair...) that use streaming can manage them 
 easily.
 *Better reporting*
   - Better logging/tracing
   - More metrics
   - Progress reporting API for external clients

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5286) Streaming 2.0

2013-06-20 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689934#comment-13689934
 ] 

Jonathan Ellis commented on CASSANDRA-5286:
---

+1

 Streaming 2.0
 -

 Key: CASSANDRA-5286
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5286
 Project: Cassandra
  Issue Type: Improvement
Reporter: Yuki Morishita
Assignee: Yuki Morishita
  Labels: streaming
 Fix For: 2.0 beta 1

 Attachments: 5286_addendum.txt


 2.0 is a good time to redesign the streaming API, including the protocol, to 
 make streaming more performant and reliable.
 Design goals that come to mind:
 *Better performance*
   - Protocol optimization
   - Stream multiple files in parallel (CASSANDRA-4663)
   - Persistent connection (CASSANDRA-4660)
 *Better control*
   - Cleaner API for error handling
   - Integrate both IN/OUT streams into one session, so the components 
 (bootstrap, move, bulkload, repair...) that use streaming can manage them 
 easily.
 *Better reporting*
   - Better logging/tracing
   - More metrics
   - Progress reporting API for external clients

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


git commit: use double math for currentThroughput patch by dbrosius reviewed by jbellis for cassandra 5286

2013-06-20 Thread dbrosius
Updated Branches:
  refs/heads/trunk 21deff6c9 - 56d2296ad


use double math for currentThroughput
patch by dbrosius reviewed by jbellis for cassandra 5286


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/56d2296a
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/56d2296a
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/56d2296a

Branch: refs/heads/trunk
Commit: 56d2296adf281b774aea6598bd8baaa9e77685f3
Parents: 21deff6
Author: Dave Brosius dbros...@apache.org
Authored: Thu Jun 20 21:21:28 2013 -0400
Committer: Dave Brosius dbros...@apache.org
Committed: Thu Jun 20 21:21:28 2013 -0400

--
 src/java/org/apache/cassandra/streaming/StreamManager.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/56d2296a/src/java/org/apache/cassandra/streaming/StreamManager.java
--
diff --git a/src/java/org/apache/cassandra/streaming/StreamManager.java b/src/java/org/apache/cassandra/streaming/StreamManager.java
index d64cc65..8ea9976 100644
--- a/src/java/org/apache/cassandra/streaming/StreamManager.java
+++ b/src/java/org/apache/cassandra/streaming/StreamManager.java
@@ -55,7 +55,7 @@ public class StreamManager implements StreamManagerMBean
      */
     public static RateLimiter getRateLimiter()
     {
-        double currentThroughput = DatabaseDescriptor.getStreamThroughputOutboundMegabitsPerSec() * 1024 * 1024 / 8 / 1000;
+        double currentThroughput = ((double) DatabaseDescriptor.getStreamThroughputOutboundMegabitsPerSec()) * 1024 * 1024 / 8 / 1000;
         // if throughput is set to 0, throttling is disabled
         if (currentThroughput == 0)
             currentThroughput = Double.MAX_VALUE;



[jira] [Resolved] (CASSANDRA-5286) Streaming 2.0

2013-06-20 Thread Dave Brosius (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Brosius resolved CASSANDRA-5286.
-

Resolution: Fixed

addendum applied to trunk as 56d2296adf281b774aea6598bd8baaa9e77685f3

 Streaming 2.0
 -

 Key: CASSANDRA-5286
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5286
 Project: Cassandra
  Issue Type: Improvement
Reporter: Yuki Morishita
Assignee: Yuki Morishita
  Labels: streaming
 Fix For: 2.0 beta 1

 Attachments: 5286_addendum.txt


 2.0 is a good time to redesign the streaming API, including the protocol, to 
 make streaming more performant and reliable.
 Design goals that come to mind:
 *Better performance*
   - Protocol optimization
   - Stream multiple files in parallel (CASSANDRA-4663)
   - Persistent connection (CASSANDRA-4660)
 *Better control*
   - Cleaner API for error handling
   - Integrate both IN/OUT streams into one session, so the components 
 (bootstrap, move, bulkload, repair...) that use streaming can manage them 
 easily.
 *Better reporting*
   - Better logging/tracing
   - More metrics
   - Progress reporting API for external clients

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira