[jira] [Commented] (CASSANDRA-6888) Store whether a counter sstable still uses some local/remote shards in the sstable metadata
[ https://issues.apache.org/jira/browse/CASSANDRA-6888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13959755#comment-13959755 ] Marcus Eriksson commented on CASSANDRA-6888: +1 Store whether a counter sstable still uses some local/remote shards in the sstable metadata -- Key: CASSANDRA-6888 URL: https://issues.apache.org/jira/browse/CASSANDRA-6888 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Aleksey Yeschenko Fix For: 2.1 beta2 CASSANDRA-6504 has made it so we don't distinguish different types of shards in counters. Yet, even though we no longer generate those local/remote types of shards, they won't disappear just by running upgradesstables; they need to be compacted away (and even then, they really only disappear if there has been a new update on the counter post-6504). But we want to get rid of those ultimately, since they make things like CASSANDRA-6506 less optimal. Now, even though the final step of that remains to be discussed, the first step is probably to keep track of whether such shards still exist in the system or not. That part is simple: we can just store a boolean in the SSTableMetadata to say whether or not said sstable still has at least one Cell using such an old shard type. -- This message was sent by Atlassian JIRA (v6.2#6252)
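The boolean described above is what the 57b18e60 commit further down this digest ends up adding; a condensed sketch of that collection step follows — the method shape is mine, but the loop body is adapted from the patch to ColumnFamily.getColumnStats():
{code}
import org.apache.cassandra.db.Cell;
import org.apache.cassandra.db.CounterCell;

// Condensed from the 57b18e60 patch later in this digest: while collecting
// per-sstable column stats, remember whether any counter cell still carries
// a pre-6504 local/remote shard.
static boolean hasLegacyCounterShards(Iterable<Cell> cells)
{
    boolean hasLegacyCounterShards = false;
    for (Cell cell : cells)
        if (cell instanceof CounterCell)
            hasLegacyCounterShards |= ((CounterCell) cell).hasLegacyShards();
    // The flag travels with ColumnStats into the sstable's StatsMetadata, so
    // anything reading the metadata can tell whether legacy shards remain.
    return hasLegacyCounterShards;
}
{code}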
[jira] [Commented] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.
[ https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13959757#comment-13959757 ] Marcus Eriksson commented on CASSANDRA-6696: [~benedict] do you mean having a background job move data around after upgrade? Or hanging on startup and rewriting everything? The current version would end up with data on the correct disks eventually through compactions, but I agree it would be nice to be able to just care about the disks when flushing and streaming. Manually copying sstables into the datadirs and calling 'nodetool refresh' would also need some care. Drive replacement in JBOD can cause data to reappear. -- Key: CASSANDRA-6696 URL: https://issues.apache.org/jira/browse/CASSANDRA-6696 Project: Cassandra Issue Type: Improvement Components: Core Reporter: sankalp kohli Assignee: Marcus Eriksson Fix For: 3.0 In JBOD, when someone gets a bad drive, the bad drive is replaced with a new empty one and repair is run. This can cause deleted data to come back in some cases. The same is true for corrupt sstables, where we delete the corrupt sstable and run repair. Here is an example: say we have 3 nodes A, B and C with RF=3 and GC grace=10 days. row=sankalp col=sankalp was written 20 days back and successfully went to all three nodes. Then a delete/tombstone was written successfully for the same row and column 15 days back. Since this tombstone is older than gc grace, it got compacted away together with the actual data in nodes A and B, so there is no trace of this row/column in nodes A and B. Now in node C, say the original data is in drive1 and the tombstone is in drive2. Compaction has not yet reclaimed the data and tombstone. Drive2 becomes corrupt and is replaced with a new empty drive. Due to the replacement, the tombstone is now gone and row=sankalp col=sankalp has come back to life. Now, after replacing the drive, we run repair. This data will be propagated to all nodes. Note: this is still a problem even if we run repair every gc grace. -- This message was sent by Atlassian JIRA (v6.2#6252)
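As a side note on the "care about the disks when flushing and streaming" idea above, here is a purely hypothetical sketch — none of these names are Cassandra API — of routing each partition to a fixed data directory by token span, so that a replaced disk corresponds to a well-defined slice of the ring rather than an arbitrary mix of data and tombstones:
{code}
import java.io.File;
import java.util.List;

// Hypothetical sketch only: give each data directory a contiguous token span,
// so flushes and streams always land a partition on the same disk.
final class DiskBoundary
{
    final File directory;
    final long upperToken; // partitions with token <= upperToken belong here

    DiskBoundary(File directory, long upperToken)
    {
        this.directory = directory;
        this.upperToken = upperToken;
    }
}

final class FlushRouting
{
    // boundaries must be sorted by upperToken, ascending, and cover the ring.
    static File pickDirectory(List<DiskBoundary> boundaries, long token)
    {
        for (DiskBoundary b : boundaries)
            if (token <= b.upperToken)
                return b.directory;
        // Past the last boundary: clamp to the last disk.
        return boundaries.get(boundaries.size() - 1).directory;
    }
}
{code}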
[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13959841#comment-13959841 ] Benedict commented on CASSANDRA-6694: - rebased and pushed -f Slightly More Off-Heap Memtables Key: CASSANDRA-6694 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Labels: performance Fix For: 2.1 beta2 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as the on-heap overhead is still very large. It should not be tremendously difficult to extend these changes so that we allocate entire Cells off-heap, instead of multiple BBs per Cell (with all their associated overhead). The goal (if possible) is to reach an overhead of 16 bytes per Cell (plus 4-6 bytes per cell on average for the btree overhead, for a total overhead of around 20-22 bytes). This translates to 8-byte object overhead, 4-byte address (we will do alignment tricks like the VM to allow us to address a reasonably large memory space, although this trick is unlikely to last us forever, at which point we will have to bite the bullet and accept a 24-byte per cell overhead), and 4-byte object reference for maintaining our internal list of allocations, which is unfortunately necessary since we cannot safely (and cheaply) walk the object graph we allocate otherwise, which is necessary for (allocation-) compaction and pointer rewriting. The ugliest thing here is going to be implementing the various CellName instances so that they may be backed by native memory OR heap memory. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (CASSANDRA-6982) start_column in get_paged_slice has odd behavior
[ https://issues.apache.org/jira/browse/CASSANDRA-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo reassigned CASSANDRA-6982: -- Assignee: Edward Capriolo start_column in get_paged_slice has odd behavior --- Key: CASSANDRA-6982 URL: https://issues.apache.org/jira/browse/CASSANDRA-6982 Project: Cassandra Issue Type: Bug Reporter: Edward Capriolo Assignee: Edward Capriolo Priority: Critical get_paged_slice is described as so: {code} /** returns a range of columns, wrapping to the next rows if necessary to collect max_results. */ list<KeySlice> get_paged_slice(1:required string column_family, 2:required KeyRange range, 3:required binary start_column, 4:required ConsistencyLevel consistency_level=ConsistencyLevel.ONE) throws (1:InvalidRequestException ire, 2:UnavailableException ue, 3:TimedOutException te), {code} The term max_results is not defined; I take it to mean key_range.count. The larger issue I have found is that start_column seems to be ignored in some cases. testNormal() produces this error: junit.framework.ComparisonFailure: null expected:<[c]> but was:<[a]> The problem seems to be KeyRanges that use tokens and not keys. {code} KeyRange kr = new KeyRange(); kr.setCount(3); kr.setStart_token(); kr.setEnd_token(); {code} A failing test is here: https://github.com/edwardcapriolo/cassandra/compare/pg?expand=1 Is this a bug? It feels like one, or is this just undefined behaviour? If it is a bug I would like to fix it. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6982) start_column in get_paged_slice has odd behavior
[ https://issues.apache.org/jira/browse/CASSANDRA-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13959995#comment-13959995 ] Edward Capriolo commented on CASSANDRA-6982: This is not so much a bug as Thrift allowing users to do something that does not work. We should reject that request. I will explain. This works: {code} KeyRange kr = new KeyRange(); kr.setCount(3); kr.setStart_key(ByteBufferUtil.bytes("aslice")); kr.setEnd_key(ByteBufferUtil.bytes("aslice")); List<KeySlice> t = server.get_paged_slice("Standard1", kr, ByteBufferUtil.bytes("c"), ConsistencyLevel.ONE); {code} When you specify an empty start key and a start column you get what you would expect. This is correct but may not be intuitive. {code} KeyRange kr = new KeyRange(); kr.setCount(3); kr.setStart_key(ByteBufferUtil.bytes("")); kr.setEnd_key(ByteBufferUtil.bytes("")); List<KeySlice> t = server.get_paged_slice("Standard1", kr, ByteBufferUtil.bytes("c"), ConsistencyLevel.ONE); {code} Your slice is starting before the row in question. You get back columns a,b,c not c,d,e. The problem comes when using tokens. With Murmur3 and Random Partitioner the relation between tokens and keys is not one to one. The pig unit tests fire up ByteOrderedPartitioner; the rest of our testing is Murmur3. Here are the things we should not allow (not sure if I am supposed to hex encode, so I tried both): {quote} KeyRange kr = new KeyRange(); kr.setCount(3); kr.setStart_token(ByteBufferUtil.bytesToHex(ByteBuffer.wrap(l.token.toString().getBytes()))); kr.setEnd_token(ByteBufferUtil.bytesToHex(ByteBuffer.wrap(l.token.toString().getBytes()))); List<KeySlice> t = server.get_paged_slice("Standard1", kr, ByteBufferUtil.bytes("c"), ConsistencyLevel.ONE); {quote} {quote} Murmur3Partitioner m = new Murmur3Partitioner(); LongToken l = m.getToken(ByteBufferUtil.bytes("aslice")); KeyRange kr = new KeyRange(); kr.setCount(3); kr.setStart_token(l.toString()); kr.setEnd_token(l.toString()); List<KeySlice> t = server.get_paged_slice("Standard1", kr, ByteBufferUtil.bytes("c"), ConsistencyLevel.ONE); {quote} Because the relationship of token to key is not 1 to 1, there is no way to start at a specific row. Since you cannot start at a specific row, the start_column is meaningless. I *think* we should reject a KeyRange using a start_token and a start_column. We should throw an InvalidRequestException. start_column in get_paged_slice has odd behavior --- Key: CASSANDRA-6982 URL: https://issues.apache.org/jira/browse/CASSANDRA-6982 Project: Cassandra Issue Type: Bug Reporter: Edward Capriolo Assignee: Edward Capriolo Priority: Critical get_paged_slice is described as so: {code} /** returns a range of columns, wrapping to the next rows if necessary to collect max_results. */ list<KeySlice> get_paged_slice(1:required string column_family, 2:required KeyRange range, 3:required binary start_column, 4:required ConsistencyLevel consistency_level=ConsistencyLevel.ONE) throws (1:InvalidRequestException ire, 2:UnavailableException ue, 3:TimedOutException te), {code} The term max_results is not defined; I take it to mean key_range.count. The larger issue I have found is that start_column seems to be ignored in some cases. testNormal() produces this error: junit.framework.ComparisonFailure: null expected:<[c]> but was:<[a]> The problem seems to be KeyRanges that use tokens and not keys. {code} KeyRange kr = new KeyRange(); kr.setCount(3); kr.setStart_token(); kr.setEnd_token(); {code} A failing test is here: https://github.com/edwardcapriolo/cassandra/compare/pg?expand=1 Is this a bug?
It feels like one, or is this just undefined behaviour? If it is a bug I would like to fix it. -- This message was sent by Atlassian JIRA (v6.2#6252)
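A minimal sketch of the rejection proposed above — the helper class and its placement are hypothetical, not existing Cassandra code; only isSetStart_token is the standard Thrift-generated accessor:
{code}
import java.nio.ByteBuffer;

import org.apache.cassandra.thrift.InvalidRequestException;
import org.apache.cassandra.thrift.KeyRange;

public final class PagedSliceValidation
{
    // Hypothetical helper: reject the combination that cannot work, since a
    // token does not identify a unique starting row under Murmur3Partitioner
    // or RandomPartitioner.
    public static void validate(KeyRange range, ByteBuffer startColumn)
        throws InvalidRequestException
    {
        if (range.isSetStart_token() && startColumn != null && startColumn.remaining() > 0)
            throw new InvalidRequestException("start_column cannot be combined with "
                                              + "start_token: tokens are not 1:1 with keys, "
                                              + "so paging cannot resume at a specific row");
    }
}
{code}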
[jira] [Comment Edited] (CASSANDRA-6982) start_column in get_paged_slice has odd behavior
[ https://issues.apache.org/jira/browse/CASSANDRA-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13959995#comment-13959995 ] Edward Capriolo edited comment on CASSANDRA-6982 at 4/4/14 2:29 PM: This is not so much a bug as Thrift allowing users to do something that does not work. We should reject that request. I will explain. This works: {code} KeyRange kr = new KeyRange(); kr.setCount(3); kr.setStart_key(ByteBufferUtil.bytes("aslice")); kr.setEnd_key(ByteBufferUtil.bytes("aslice")); List<KeySlice> t = server.get_paged_slice("Standard1", kr, ByteBufferUtil.bytes("c"), ConsistencyLevel.ONE); {code} When you specify an empty start key and a start column you get what you would expect. This is correct but may not be intuitive. {code} KeyRange kr = new KeyRange(); kr.setCount(3); kr.setStart_key(ByteBufferUtil.bytes("")); kr.setEnd_key(ByteBufferUtil.bytes("")); List<KeySlice> t = server.get_paged_slice("Standard1", kr, ByteBufferUtil.bytes("c"), ConsistencyLevel.ONE); {code} Your slice is starting before the row in question. You get back columns a,b,c not c,d,e. The problem comes when using tokens. With Murmur3 and Random Partitioner the relation between tokens and keys is not one to one. The pig unit tests fire up ByteOrderedPartitioner; the rest of our testing is Murmur3. Here are the things we should not allow (not sure if I am supposed to hex encode, so I tried both): {quote} KeyRange kr = new KeyRange(); kr.setCount(3); kr.setStart_token(ByteBufferUtil.bytesToHex(ByteBuffer.wrap(l.token.toString().getBytes()))); kr.setEnd_token(ByteBufferUtil.bytesToHex(ByteBuffer.wrap(l.token.toString().getBytes()))); List<KeySlice> t = server.get_paged_slice("Standard1", kr, ByteBufferUtil.bytes("c"), ConsistencyLevel.ONE); {quote} {quote} Murmur3Partitioner m = new Murmur3Partitioner(); LongToken l = m.getToken(ByteBufferUtil.bytes("aslice")); KeyRange kr = new KeyRange(); kr.setCount(3); kr.setStart_token(l.toString()); kr.setEnd_token(l.toString()); List<KeySlice> t = server.get_paged_slice("Standard1", kr, ByteBufferUtil.bytes("c"), ConsistencyLevel.ONE); {quote} Because the relationship of token to key is not 1 to 1, there is no way to start at a specific row. Since you cannot start at a specific row, the start_column is meaningless. I *think* we should reject a KeyRange using a start_token and a start_column when the partitioner does not provide 1 to 1 tokens. We should throw an InvalidRequestException. was (Author: appodictic): This is not so much a bug as Thrift allowing users to do something that does not work. We should reject that request. I will explain. This works: {code} KeyRange kr = new KeyRange(); kr.setCount(3); kr.setStart_key(ByteBufferUtil.bytes("aslice")); kr.setEnd_key(ByteBufferUtil.bytes("aslice")); List<KeySlice> t = server.get_paged_slice("Standard1", kr, ByteBufferUtil.bytes("c"), ConsistencyLevel.ONE); {code} When you specify an empty start key and a start column you get what you would expect. This is correct but may not be intuitive. {code} KeyRange kr = new KeyRange(); kr.setCount(3); kr.setStart_key(ByteBufferUtil.bytes("")); kr.setEnd_key(ByteBufferUtil.bytes("")); List<KeySlice> t = server.get_paged_slice("Standard1", kr, ByteBufferUtil.bytes("c"), ConsistencyLevel.ONE); {code} Your slice is starting before the row in question. You get back columns a,b,c not c,d,e. The problem comes when using tokens. With Murmur3 and Random Partitioner the relation between tokens and keys is not one to one. The pig unit tests fire up ByteOrderedPartitioner; the rest of our testing is Murmur3.
Here are the things we should not allow (not sure if I am supposed to hex encode, so I tried both): {quote} KeyRange kr = new KeyRange(); kr.setCount(3); kr.setStart_token(ByteBufferUtil.bytesToHex(ByteBuffer.wrap(l.token.toString().getBytes()))); kr.setEnd_token(ByteBufferUtil.bytesToHex(ByteBuffer.wrap(l.token.toString().getBytes()))); List<KeySlice> t = server.get_paged_slice("Standard1", kr, ByteBufferUtil.bytes("c"), ConsistencyLevel.ONE); {quote} {quote} Murmur3Partitioner m = new Murmur3Partitioner(); LongToken l = m.getToken(ByteBufferUtil.bytes("aslice")); KeyRange kr = new KeyRange(); kr.setCount(3); kr.setStart_token(l.toString()); kr.setEnd_token(l.toString()); List<KeySlice> t = server.get_paged_slice("Standard1", kr, ByteBufferUtil.bytes("c"), ConsistencyLevel.ONE); {quote} Because the relationship of token to key is not 1 to 1, there is no way to start at a specific row. Since you cannot start at a specific row, the start_column is meaningless. I *think* we should reject a KeyRange using a start_token and a start_column. We should throw an InvalidRequestException.
git commit: Track presence of legacy counter shards in sstables
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 6d901f90a -> 57b18e600 Track presence of legacy counter shards in sstables patch by Aleksey Yeschenko; reviewed by Marcus Eriksson for CASSANDRA-6888 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/57b18e60 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/57b18e60 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/57b18e60 Branch: refs/heads/cassandra-2.1 Commit: 57b18e600c6d79d19d29f3569b81cb946ef9ee57 Parents: 6d901f9 Author: Aleksey Yeschenko alek...@apache.org Authored: Fri Apr 4 17:36:15 2014 +0300 Committer: Aleksey Yeschenko alek...@apache.org Committed: Fri Apr 4 17:36:15 2014 +0300 -- CHANGES.txt | 1 + .../org/apache/cassandra/db/ColumnFamily.java | 12 ++- .../org/apache/cassandra/db/CounterCell.java | 5 ++ .../db/compaction/LazilyCompactedRow.java | 12 +-- .../cassandra/db/context/CounterContext.java | 18 + .../cassandra/io/sstable/ColumnStats.java | 12 ++- .../apache/cassandra/io/sstable/Descriptor.java | 3 + .../cassandra/io/sstable/SSTableWriter.java | 26 --- .../metadata/LegacyMetadataSerializer.java | 1 + .../io/sstable/metadata/MetadataCollector.java | 67 ++--- .../io/sstable/metadata/StatsMetadata.java | 14 .../io/sstable/SSTableMetadataTest.java | 77 +--- 12 files changed, 194 insertions(+), 54 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/57b18e60/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index ac2f624..4cfc957 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -40,6 +40,7 @@ * Optimize CounterColumn#reconcile() (CASSANDRA-6953) * Properly remove 1.2 sstable support in 2.1 (CASSANDRA-6869) * Lock counter cells, not partitions (CASSANDRA-6880) + * Track presence of legacy counter shards in sstables (CASSANDRA-6888) Merged from 2.0: * Allow compaction of system tables during startup (CASSANDRA-6913) * Restrict Windows to parallel repairs (CASSANDRA-6907) http://git-wip-us.apache.org/repos/asf/cassandra/blob/57b18e60/src/java/org/apache/cassandra/db/ColumnFamily.java -- diff --git a/src/java/org/apache/cassandra/db/ColumnFamily.java b/src/java/org/apache/cassandra/db/ColumnFamily.java index e7aab37..da404b0 100644 --- a/src/java/org/apache/cassandra/db/ColumnFamily.java +++ b/src/java/org/apache/cassandra/db/ColumnFamily.java @@ -402,6 +402,7 @@ public abstract class ColumnFamily implements Iterable<Cell>, IRowCacheEntry int maxLocalDeletionTime = Integer.MIN_VALUE; List<ByteBuffer> minColumnNamesSeen = Collections.emptyList(); List<ByteBuffer> maxColumnNamesSeen = Collections.emptyList(); +boolean hasLegacyCounterShards = false; for (Cell cell : this) { if (deletionInfo().getTopLevelDeletion().localDeletionTime < Integer.MAX_VALUE) @@ -420,8 +421,17 @@ public abstract class ColumnFamily implements Iterable<Cell>, IRowCacheEntry tombstones.update(deletionTime); minColumnNamesSeen = ColumnNameHelper.minComponents(minColumnNamesSeen, cell.name, metadata.comparator); maxColumnNamesSeen = ColumnNameHelper.maxComponents(maxColumnNamesSeen, cell.name, metadata.comparator); +if (cell instanceof CounterCell) +hasLegacyCounterShards = hasLegacyCounterShards || ((CounterCell) cell).hasLegacyShards(); } -return new ColumnStats(getColumnCount(), minTimestampSeen, maxTimestampSeen, maxLocalDeletionTime, tombstones, minColumnNamesSeen, maxColumnNamesSeen); +return new ColumnStats(getColumnCount(), + minTimestampSeen, + maxTimestampSeen, + maxLocalDeletionTime, + tombstones, + minColumnNamesSeen, +
maxColumnNamesSeen, + hasLegacyCounterShards); } public boolean isMarkedForDelete() http://git-wip-us.apache.org/repos/asf/cassandra/blob/57b18e60/src/java/org/apache/cassandra/db/CounterCell.java -- diff --git a/src/java/org/apache/cassandra/db/CounterCell.java b/src/java/org/apache/cassandra/db/CounterCell.java index 6b588ef..fc4ac3f 100644 --- a/src/java/org/apache/cassandra/db/CounterCell.java +++ b/src/java/org/apache/cassandra/db/CounterCell.java @@ -182,6 +182,11 @@ public class CounterCell extends Cell
[1/2] git commit: Track presence of legacy counter shards in sstables
Repository: cassandra Updated Branches: refs/heads/trunk f4e8fc3f6 -> 0015f37a3 Track presence of legacy counter shards in sstables patch by Aleksey Yeschenko; reviewed by Marcus Eriksson for CASSANDRA-6888 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/57b18e60 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/57b18e60 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/57b18e60 Branch: refs/heads/trunk Commit: 57b18e600c6d79d19d29f3569b81cb946ef9ee57 Parents: 6d901f9 Author: Aleksey Yeschenko alek...@apache.org Authored: Fri Apr 4 17:36:15 2014 +0300 Committer: Aleksey Yeschenko alek...@apache.org Committed: Fri Apr 4 17:36:15 2014 +0300 -- CHANGES.txt | 1 + .../org/apache/cassandra/db/ColumnFamily.java | 12 ++- .../org/apache/cassandra/db/CounterCell.java | 5 ++ .../db/compaction/LazilyCompactedRow.java | 12 +-- .../cassandra/db/context/CounterContext.java | 18 + .../cassandra/io/sstable/ColumnStats.java | 12 ++- .../apache/cassandra/io/sstable/Descriptor.java | 3 + .../cassandra/io/sstable/SSTableWriter.java | 26 --- .../metadata/LegacyMetadataSerializer.java | 1 + .../io/sstable/metadata/MetadataCollector.java | 67 ++--- .../io/sstable/metadata/StatsMetadata.java | 14 .../io/sstable/SSTableMetadataTest.java | 77 +--- 12 files changed, 194 insertions(+), 54 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/57b18e60/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index ac2f624..4cfc957 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -40,6 +40,7 @@ * Optimize CounterColumn#reconcile() (CASSANDRA-6953) * Properly remove 1.2 sstable support in 2.1 (CASSANDRA-6869) * Lock counter cells, not partitions (CASSANDRA-6880) + * Track presence of legacy counter shards in sstables (CASSANDRA-6888) Merged from 2.0: * Allow compaction of system tables during startup (CASSANDRA-6913) * Restrict Windows to parallel repairs (CASSANDRA-6907) http://git-wip-us.apache.org/repos/asf/cassandra/blob/57b18e60/src/java/org/apache/cassandra/db/ColumnFamily.java -- diff --git a/src/java/org/apache/cassandra/db/ColumnFamily.java b/src/java/org/apache/cassandra/db/ColumnFamily.java index e7aab37..da404b0 100644 --- a/src/java/org/apache/cassandra/db/ColumnFamily.java +++ b/src/java/org/apache/cassandra/db/ColumnFamily.java @@ -402,6 +402,7 @@ public abstract class ColumnFamily implements Iterable<Cell>, IRowCacheEntry int maxLocalDeletionTime = Integer.MIN_VALUE; List<ByteBuffer> minColumnNamesSeen = Collections.emptyList(); List<ByteBuffer> maxColumnNamesSeen = Collections.emptyList(); +boolean hasLegacyCounterShards = false; for (Cell cell : this) { if (deletionInfo().getTopLevelDeletion().localDeletionTime < Integer.MAX_VALUE) @@ -420,8 +421,17 @@ public abstract class ColumnFamily implements Iterable<Cell>, IRowCacheEntry tombstones.update(deletionTime); minColumnNamesSeen = ColumnNameHelper.minComponents(minColumnNamesSeen, cell.name, metadata.comparator); maxColumnNamesSeen = ColumnNameHelper.maxComponents(maxColumnNamesSeen, cell.name, metadata.comparator); +if (cell instanceof CounterCell) +hasLegacyCounterShards = hasLegacyCounterShards || ((CounterCell) cell).hasLegacyShards(); } -return new ColumnStats(getColumnCount(), minTimestampSeen, maxTimestampSeen, maxLocalDeletionTime, tombstones, minColumnNamesSeen, maxColumnNamesSeen); +return new ColumnStats(getColumnCount(), + minTimestampSeen, + maxTimestampSeen, + maxLocalDeletionTime, + tombstones, + minColumnNamesSeen, + maxColumnNamesSeen, +
hasLegacyCounterShards); } public boolean isMarkedForDelete() http://git-wip-us.apache.org/repos/asf/cassandra/blob/57b18e60/src/java/org/apache/cassandra/db/CounterCell.java -- diff --git a/src/java/org/apache/cassandra/db/CounterCell.java b/src/java/org/apache/cassandra/db/CounterCell.java index 6b588ef..fc4ac3f 100644 --- a/src/java/org/apache/cassandra/db/CounterCell.java +++ b/src/java/org/apache/cassandra/db/CounterCell.java @@ -182,6 +182,11 @@ public class CounterCell extends Cell
[2/2] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0015f37a Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0015f37a Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0015f37a Branch: refs/heads/trunk Commit: 0015f37a3fa6ff34a63566e253433dbc4d3cf384 Parents: f4e8fc3 57b18e6 Author: Aleksey Yeschenko alek...@apache.org Authored: Fri Apr 4 17:39:20 2014 +0300 Committer: Aleksey Yeschenko alek...@apache.org Committed: Fri Apr 4 17:39:20 2014 +0300 -- CHANGES.txt | 1 + .../org/apache/cassandra/db/ColumnFamily.java | 12 ++- .../org/apache/cassandra/db/CounterCell.java| 5 ++ .../db/compaction/LazilyCompactedRow.java | 12 +-- .../cassandra/db/context/CounterContext.java| 18 + .../cassandra/io/sstable/ColumnStats.java | 12 ++- .../apache/cassandra/io/sstable/Descriptor.java | 3 + .../cassandra/io/sstable/SSTableWriter.java | 26 --- .../metadata/LegacyMetadataSerializer.java | 1 + .../io/sstable/metadata/MetadataCollector.java | 67 ++--- .../io/sstable/metadata/StatsMetadata.java | 14 .../io/sstable/SSTableMetadataTest.java | 77 +--- 12 files changed, 194 insertions(+), 54 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/0015f37a/CHANGES.txt -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/0015f37a/src/java/org/apache/cassandra/db/ColumnFamily.java -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/0015f37a/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java --
[jira] [Commented] (CASSANDRA-6553) Benchmark counter improvements (counters++)
[ https://issues.apache.org/jira/browse/CASSANDRA-6553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960040#comment-13960040 ] Aleksey Yeschenko commented on CASSANDRA-6553: -- [~rhatch] One last time this week, could you run QUORUM only, counter cache ON only: https://github.com/iamaleksey/cassandra/tree/cassandra-2.0 vs. https://github.com/iamaleksey/cassandra/tree/6553 ? (for screenshots) Thanks. Benchmark counter improvements (counters++) --- Key: CASSANDRA-6553 URL: https://issues.apache.org/jira/browse/CASSANDRA-6553 Project: Cassandra Issue Type: Test Reporter: Ryan McGuire Assignee: Russ Hatch Fix For: 2.1 beta2 Attachments: 6553.txt, 6553.uber.quorum.bdplab.read.png, 6553.uber.quorum.bdplab.write.png, high_cl_one.png, high_cl_quorum.png, low_cl_one.png, low_cl_quorum.png, tracing.txt, uber_cl_one.png, uber_cl_quorum.png Benchmark the difference in performance between CASSANDRA-6504 and trunk. * Updating totally unrelated counters (different partitions) * Updating the same counters a lot (same cells in the same partition) * Different cells in the same few partitions (hot counter partition) benchmark: https://github.com/apache/cassandra/tree/1218bcacba7edefaf56cf8440d0aea5794c89a1e (old counters) compared to: https://github.com/apache/cassandra/tree/714c423360c36da2a2b365efaf9c5c4f623ed133 (new counters) So far, the above changes should only affect the write path. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960050#comment-13960050 ] Shridhar commented on CASSANDRA-6311: - [~alexliu68] We downloaded cassandra-2.0.6 and added the patch (6311-v11.txt) on top of it. We are still getting the same error as in CASSANDRA-6151. Add CqlRecordReader to take advantage of native CQL pagination -- Key: CASSANDRA-6311 URL: https://issues.apache.org/jira/browse/CASSANDRA-6311 Project: Cassandra Issue Type: New Feature Components: Hadoop Reporter: Alex Liu Assignee: Alex Liu Fix For: 2.0.7 Attachments: 6311-v10.txt, 6311-v11.txt, 6311-v3-2.0-branch.txt, 6311-v4.txt, 6311-v5-2.0-branch.txt, 6311-v6-2.0-branch.txt, 6311-v7.txt, 6311-v8.txt, 6311-v9.txt, 6331-2.0-branch.txt, 6331-v2-2.0-branch.txt Since the latest CQL pagination is done and should be more efficient, we need to update CqlPagingRecordReader to use it instead of the custom thrift paging. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-6747) MessagingService should handle failures on remote nodes.
[ https://issues.apache.org/jira/browse/CASSANDRA-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuki Morishita updated CASSANDRA-6747: -- Fix Version/s: 2.1 beta2 MessagingService should handle failures on remote nodes. Key: CASSANDRA-6747 URL: https://issues.apache.org/jira/browse/CASSANDRA-6747 Project: Cassandra Issue Type: Improvement Reporter: sankalp kohli Assignee: sankalp kohli Priority: Minor Labels: Core Fix For: 2.1 beta2 Attachments: CASSANDRA-6747.diff While going through the code of MessagingService, I discovered that we don't handle callbacks on failure very well. If a verb handler on the remote machine throws an exception, it goes right through the uncaught exception handler. The machine which triggered the message will keep waiting and will time out. On timeout, it will do some stuff hard-coded in the MS, like hints, and add to latency. There is no way in IAsyncCallback to specify what to do on timeouts and also on failures. Here are some examples which I found will help if we enhance this system to also propagate failures back. So IAsyncCallback will have methods like onFailure. 1) From ActiveRepairService.prepareForRepair: IAsyncCallback callback = new IAsyncCallback() { @Override public void response(MessageIn msg) { prepareLatch.countDown(); } @Override public boolean isLatencyForSnitch() { return false; } }; List<UUID> cfIds = new ArrayList<>(columnFamilyStores.size()); for (ColumnFamilyStore cfs : columnFamilyStores) cfIds.add(cfs.metadata.cfId); for (InetAddress neighbour : endpoints) { PrepareMessage message = new PrepareMessage(parentRepairSession, cfIds, ranges); MessageOut<RepairMessage> msg = message.createMessage(); MessagingService.instance().sendRR(msg, neighbour, callback); } try { prepareLatch.await(1, TimeUnit.HOURS); } catch (InterruptedException e) { parentRepairSessions.remove(parentRepairSession); throw new RuntimeException("Did not get replies from all endpoints.", e); } 2) During the snapshot phase in repair, if SnapshotVerbHandler throws an exception, we will wait forever. -- This message was sent by Atlassian JIRA (v6.2#6252)
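A sketch of what the proposed onFailure hook could look like — the interface name and signature below are illustrative of the description above, not quoted from the attached patch:
{code}
import java.net.InetAddress;

import org.apache.cassandra.net.IAsyncCallback;

// Illustrative shape of the proposal: callbacks gain a failure hook, so a
// verb handler throwing on the remote node is reported back instead of
// being discovered only via timeout.
interface IAsyncCallbackWithFailure<T> extends IAsyncCallback<T>
{
    void onFailure(InetAddress from);
}

// The prepareForRepair example above could then fail fast instead of
// waiting out the one-hour latch (prepareFailed is a hypothetical
// AtomicBoolean in the caller):
//
// IAsyncCallbackWithFailure<Object> callback = new IAsyncCallbackWithFailure<Object>()
// {
//     public void response(MessageIn<Object> msg) { prepareLatch.countDown(); }
//     public boolean isLatencyForSnitch()         { return false; }
//     public void onFailure(InetAddress from)
//     {
//         prepareFailed.set(true);
//         prepareLatch.countDown(); // release the latch immediately
//     }
// };
{code}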
[jira] [Updated] (CASSANDRA-6747) MessagingService should handle failures on remote nodes.
[ https://issues.apache.org/jira/browse/CASSANDRA-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuki Morishita updated CASSANDRA-6747: -- Reviewer: Yuki Morishita [~kohlisankalp] I like your approach. One thing you need to change: in SnapshotTask's callback#onFailure, you can't just throw RuntimeException; you have to call task.setException so repair knows there was an exception during snapshotting. MessagingService should handle failures on remote nodes. Key: CASSANDRA-6747 URL: https://issues.apache.org/jira/browse/CASSANDRA-6747 Project: Cassandra Issue Type: Improvement Reporter: sankalp kohli Assignee: sankalp kohli Priority: Minor Labels: Core Fix For: 2.1 beta2 Attachments: CASSANDRA-6747.diff While going through the code of MessagingService, I discovered that we don't handle callbacks on failure very well. If a verb handler on the remote machine throws an exception, it goes right through the uncaught exception handler. The machine which triggered the message will keep waiting and will time out. On timeout, it will do some stuff hard-coded in the MS, like hints, and add to latency. There is no way in IAsyncCallback to specify what to do on timeouts and also on failures. Here are some examples which I found will help if we enhance this system to also propagate failures back. So IAsyncCallback will have methods like onFailure. 1) From ActiveRepairService.prepareForRepair: IAsyncCallback callback = new IAsyncCallback() { @Override public void response(MessageIn msg) { prepareLatch.countDown(); } @Override public boolean isLatencyForSnitch() { return false; } }; List<UUID> cfIds = new ArrayList<>(columnFamilyStores.size()); for (ColumnFamilyStore cfs : columnFamilyStores) cfIds.add(cfs.metadata.cfId); for (InetAddress neighbour : endpoints) { PrepareMessage message = new PrepareMessage(parentRepairSession, cfIds, ranges); MessageOut<RepairMessage> msg = message.createMessage(); MessagingService.instance().sendRR(msg, neighbour, callback); } try { prepareLatch.await(1, TimeUnit.HOURS); } catch (InterruptedException e) { parentRepairSessions.remove(parentRepairSession); throw new RuntimeException("Did not get replies from all endpoints.", e); } 2) During the snapshot phase in repair, if SnapshotVerbHandler throws an exception, we will wait forever. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6747) MessagingService should handle failures on remote nodes.
[ https://issues.apache.org/jira/browse/CASSANDRA-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960085#comment-13960085 ] sankalp kohli commented on CASSANDRA-6747: -- Please review v2 with your suggestions. MessagingService should handle failures on remote nodes. Key: CASSANDRA-6747 URL: https://issues.apache.org/jira/browse/CASSANDRA-6747 Project: Cassandra Issue Type: Improvement Reporter: sankalp kohli Assignee: sankalp kohli Priority: Minor Labels: Core Fix For: 2.1 beta2 Attachments: CASSANDRA-6747-v2.diff, CASSANDRA-6747.diff While going through the code of MessagingService, I discovered that we don't handle callbacks on failure very well. If a verb handler on the remote machine throws an exception, it goes right through the uncaught exception handler. The machine which triggered the message will keep waiting and will time out. On timeout, it will do some stuff hard-coded in the MS, like hints, and add to latency. There is no way in IAsyncCallback to specify what to do on timeouts and also on failures. Here are some examples which I found will help if we enhance this system to also propagate failures back. So IAsyncCallback will have methods like onFailure. 1) From ActiveRepairService.prepareForRepair: IAsyncCallback callback = new IAsyncCallback() { @Override public void response(MessageIn msg) { prepareLatch.countDown(); } @Override public boolean isLatencyForSnitch() { return false; } }; List<UUID> cfIds = new ArrayList<>(columnFamilyStores.size()); for (ColumnFamilyStore cfs : columnFamilyStores) cfIds.add(cfs.metadata.cfId); for (InetAddress neighbour : endpoints) { PrepareMessage message = new PrepareMessage(parentRepairSession, cfIds, ranges); MessageOut<RepairMessage> msg = message.createMessage(); MessagingService.instance().sendRR(msg, neighbour, callback); } try { prepareLatch.await(1, TimeUnit.HOURS); } catch (InterruptedException e) { parentRepairSessions.remove(parentRepairSession); throw new RuntimeException("Did not get replies from all endpoints.", e); } 2) During the snapshot phase in repair, if SnapshotVerbHandler throws an exception, we will wait forever. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-6747) MessagingService should handle failures on remote nodes.
[ https://issues.apache.org/jira/browse/CASSANDRA-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sankalp kohli updated CASSANDRA-6747: - Attachment: CASSANDRA-6747-v2.diff MessagingService should handle failures on remote nodes. Key: CASSANDRA-6747 URL: https://issues.apache.org/jira/browse/CASSANDRA-6747 Project: Cassandra Issue Type: Improvement Reporter: sankalp kohli Assignee: sankalp kohli Priority: Minor Labels: Core Fix For: 2.1 beta2 Attachments: CASSANDRA-6747-v2.diff, CASSANDRA-6747.diff While going through the code of MessagingService, I discovered that we don't handle callbacks on failure very well. If a verb handler on the remote machine throws an exception, it goes right through the uncaught exception handler. The machine which triggered the message will keep waiting and will time out. On timeout, it will do some stuff hard-coded in the MS, like hints, and add to latency. There is no way in IAsyncCallback to specify what to do on timeouts and also on failures. Here are some examples which I found will help if we enhance this system to also propagate failures back. So IAsyncCallback will have methods like onFailure. 1) From ActiveRepairService.prepareForRepair: IAsyncCallback callback = new IAsyncCallback() { @Override public void response(MessageIn msg) { prepareLatch.countDown(); } @Override public boolean isLatencyForSnitch() { return false; } }; List<UUID> cfIds = new ArrayList<>(columnFamilyStores.size()); for (ColumnFamilyStore cfs : columnFamilyStores) cfIds.add(cfs.metadata.cfId); for (InetAddress neighbour : endpoints) { PrepareMessage message = new PrepareMessage(parentRepairSession, cfIds, ranges); MessageOut<RepairMessage> msg = message.createMessage(); MessagingService.instance().sendRR(msg, neighbour, callback); } try { prepareLatch.await(1, TimeUnit.HOURS); } catch (InterruptedException e) { parentRepairSessions.remove(parentRepairSession); throw new RuntimeException("Did not get replies from all endpoints.", e); } 2) During the snapshot phase in repair, if SnapshotVerbHandler throws an exception, we will wait forever. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-6553) Benchmark counter improvements (counters++)
[ https://issues.apache.org/jira/browse/CASSANDRA-6553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russ Hatch updated CASSANDRA-6553: -- Attachment: logs.tar.gz Adding logs in logs.tar.gz. These should be ordered the same as the graph links above (starting with the no counter cache tests). Like so: no counter cache/low contention/2.0/write/cl.one no counter cache/low contention/2.0/read/cl.one no counter cache/low contention/2.1/write/cl.one no counter cache/low contention/2.1/read/cl.one counter cache enabled/uber contention/aleksey's patched 2.1/read/cl.quorum Benchmark counter improvements (counters++) --- Key: CASSANDRA-6553 URL: https://issues.apache.org/jira/browse/CASSANDRA-6553 Project: Cassandra Issue Type: Test Reporter: Ryan McGuire Assignee: Russ Hatch Fix For: 2.1 beta2 Attachments: 6553.txt, 6553.uber.quorum.bdplab.read.png, 6553.uber.quorum.bdplab.write.png, high_cl_one.png, high_cl_quorum.png, logs.tar.gz, low_cl_one.png, low_cl_quorum.png, tracing.txt, uber_cl_one.png, uber_cl_quorum.png Benchmark the difference in performance between CASSANDRA-6504 and trunk. * Updating totally unrelated counters (different partitions) * Updating the same counters a lot (same cells in the same partition) * Different cells in the same few partitions (hot counter partition) benchmark: https://github.com/apache/cassandra/tree/1218bcacba7edefaf56cf8440d0aea5794c89a1e (old counters) compared to: https://github.com/apache/cassandra/tree/714c423360c36da2a2b365efaf9c5c4f623ed133 (new counters) So far, the above changes should only affect the write path. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960112#comment-13960112 ] Jonathan Ellis commented on CASSANDRA-6694: --- Is there a case to be made here that there's more abstraction than necessary? Because I'm still having trouble wrapping my head around it. Slightly More Off-Heap Memtables Key: CASSANDRA-6694 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Labels: performance Fix For: 2.1 beta2 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as the on-heap overhead is still very large. It should not be tremendously difficult to extend these changes so that we allocate entire Cells off-heap, instead of multiple BBs per Cell (with all their associated overhead). The goal (if possible) is to reach an overhead of 16 bytes per Cell (plus 4-6 bytes per cell on average for the btree overhead, for a total overhead of around 20-22 bytes). This translates to 8-byte object overhead, 4-byte address (we will do alignment tricks like the VM to allow us to address a reasonably large memory space, although this trick is unlikely to last us forever, at which point we will have to bite the bullet and accept a 24-byte per cell overhead), and 4-byte object reference for maintaining our internal list of allocations, which is unfortunately necessary since we cannot safely (and cheaply) walk the object graph we allocate otherwise, which is necessary for (allocation-) compaction and pointer rewriting. The ugliest thing here is going to be implementing the various CellName instances so that they may be backed by native memory OR heap memory. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960117#comment-13960117 ] Benedict commented on CASSANDRA-6694: - Well, it's probably indicative of something wrong, but I don't think it's the level of abstraction. Probably I can re-organise it to make it clearer, though. Slightly More Off-Heap Memtables Key: CASSANDRA-6694 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Labels: performance Fix For: 2.1 beta2 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as the on-heap overhead is still very large. It should not be tremendously difficult to extend these changes so that we allocate entire Cells off-heap, instead of multiple BBs per Cell (with all their associated overhead). The goal (if possible) is to reach an overhead of 16 bytes per Cell (plus 4-6 bytes per cell on average for the btree overhead, for a total overhead of around 20-22 bytes). This translates to 8-byte object overhead, 4-byte address (we will do alignment tricks like the VM to allow us to address a reasonably large memory space, although this trick is unlikely to last us forever, at which point we will have to bite the bullet and accept a 24-byte per cell overhead), and 4-byte object reference for maintaining our internal list of allocations, which is unfortunately necessary since we cannot safely (and cheaply) walk the object graph we allocate otherwise, which is necessary for (allocation-) compaction and pointer rewriting. The ugliest thing here is going to be implementing the various CellName instances so that they may be backed by native memory OR heap memory. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960187#comment-13960187 ] Benedict commented on CASSANDRA-6694: - Rebased, reorganised and pushed to [6694-reorg|https://github.com/belliottsmith/cassandra/tree/6694-reorg] Does that make it clearer what's going on? Slightly More Off-Heap Memtables Key: CASSANDRA-6694 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Labels: performance Fix For: 2.1 beta2 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as the on-heap overhead is still very large. It should not be tremendously difficult to extend these changes so that we allocate entire Cells off-heap, instead of multiple BBs per Cell (with all their associated overhead). The goal (if possible) is to reach an overhead of 16 bytes per Cell (plus 4-6 bytes per cell on average for the btree overhead, for a total overhead of around 20-22 bytes). This translates to 8-byte object overhead, 4-byte address (we will do alignment tricks like the VM to allow us to address a reasonably large memory space, although this trick is unlikely to last us forever, at which point we will have to bite the bullet and accept a 24-byte per cell overhead), and 4-byte object reference for maintaining our internal list of allocations, which is unfortunately necessary since we cannot safely (and cheaply) walk the object graph we allocate otherwise, which is necessary for (allocation-) compaction and pointer rewriting. The ugliest thing here is going to be implementing the various CellName instances so that they may be backed by native memory OR heap memory. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960192#comment-13960192 ] Benedict commented on CASSANDRA-6694: - We basically have: BBAllocator (and implementors); BBPool + BBPoolAllocator (and implementors); NativePool + NativeAllocator. BBPoolAllocator creates a BBAllocator per session, by wrapping the session's OpOrder.Group. BBAllocator is used to construct Buffer* implementations (necessary without further refactoring, as that's how CellName implementors work, and we don't want to rip those apart in this commit). DataAllocator wraps the above to create arbitrary implementations (i.e. Native* or Buffer*, atm). Slightly More Off-Heap Memtables Key: CASSANDRA-6694 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Labels: performance Fix For: 2.1 beta2 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as the on-heap overhead is still very large. It should not be tremendously difficult to extend these changes so that we allocate entire Cells off-heap, instead of multiple BBs per Cell (with all their associated overhead). The goal (if possible) is to reach an overhead of 16 bytes per Cell (plus 4-6 bytes per cell on average for the btree overhead, for a total overhead of around 20-22 bytes). This translates to 8-byte object overhead, 4-byte address (we will do alignment tricks like the VM to allow us to address a reasonably large memory space, although this trick is unlikely to last us forever, at which point we will have to bite the bullet and accept a 24-byte per cell overhead), and 4-byte object reference for maintaining our internal list of allocations, which is unfortunately necessary since we cannot safely (and cheaply) walk the object graph we allocate otherwise, which is necessary for (allocation-) compaction and pointer rewriting. The ugliest thing here is going to be implementing the various CellName instances so that they may be backed by native memory OR heap memory. -- This message was sent by Atlassian JIRA (v6.2#6252)
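Rendered as Java, the layering Benedict lists reads roughly like this — the type names come from his comment, but the method shapes are guesses, not the actual branch:
{code}
import java.nio.ByteBuffer;

import org.apache.cassandra.db.Cell;
import org.apache.cassandra.utils.concurrent.OpOrder;

// Guessed method shapes for the hierarchy named in the comment above --
// only the type names come from the branch description.
interface BBAllocator
{
    ByteBuffer allocate(int size); // hands out buffers for one write session
}

interface BBPool
{
    // One allocator per session, tied to the session's OpOrder.Group so the
    // pool knows when the underlying memory can be recycled.
    BBAllocator newAllocator(OpOrder.Group writeOp);
}

// Sits on top: decides whether a cell is materialised as a Buffer* (heap /
// ByteBuffer-backed) or Native* (off-heap) implementation.
interface DataAllocator
{
    Cell clone(Cell cell, BBAllocator allocator);
}
{code}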
[jira] [Commented] (CASSANDRA-6913) Compaction of system keyspaces during startup can cause early loading of non-system keyspaces
[ https://issues.apache.org/jira/browse/CASSANDRA-6913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960220#comment-13960220 ] Ravi Prasad commented on CASSANDRA-6913: we were noticing occasional FileNotFoundExceptions due to compaction leftovers at startup on restart, after upgrading to cassandra-2.0 (CASSANDRA-5151). I think this fixes that issue. Would it make sense to change CHANGES.txt to 'Avoid early loading of non-system keyspaces before compaction-leftovers cleanup at startup' instead of https://github.com/apache/cassandra/blob/56d84a7c028c0498158efb1a3cadea149ab7c1cd/CHANGES.txt#L2 ? Compaction of system keyspaces during startup can cause early loading of non-system keyspaces - Key: CASSANDRA-6913 URL: https://issues.apache.org/jira/browse/CASSANDRA-6913 Project: Cassandra Issue Type: Bug Reporter: Benedict Assignee: Benedict Priority: Minor Fix For: 2.0.7, 2.1 beta2 Attachments: 6913.txt This then can result in an inconsistent CFS state, as cleanup of e.g. compaction leftovers does not get reflected in DataTracker. It happens because StorageService.getLoad() iterates over and opens all CFS, and this is called by Compaction. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6913) Compaction of system keyspaces during startup can cause early loading of non-system keyspaces
[ https://issues.apache.org/jira/browse/CASSANDRA-6913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960234#comment-13960234 ] Benedict commented on CASSANDRA-6913: - Hi [~ravilr], yes, that's exactly the symptom you'd expect when hitting this issue. +1 to the CHANGES.txt suggestion, even if it is a bit of a mouthful. Compaction of system keyspaces during startup can cause early loading of non-system keyspaces - Key: CASSANDRA-6913 URL: https://issues.apache.org/jira/browse/CASSANDRA-6913 Project: Cassandra Issue Type: Bug Reporter: Benedict Assignee: Benedict Priority: Minor Fix For: 2.0.7, 2.1 beta2 Attachments: 6913.txt This then can result in an inconsistent CFS state, as cleanup of e.g. compaction leftovers does not get reflected in DataTracker. It happens because StorageService.getLoad() iterates over and opens all CFS, and this is called by Compaction. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6977) attempting to create 10K column families fails with 100 node cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-6977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960312#comment-13960312 ] Daniel Meyer commented on CASSANDRA-6977: - I am not sure if memory is the issue here. I monitored memory with visualvm and found the maximum used heap to be only 1GB. There were no OOM errors in the logs. Further, if memory were the issue I would expect the 5 node cluster to run into this too; however, with a 5 node cluster this issue does not occur and we are able to create the 10K cfs without a problem (albeit it takes a while). attempting to create 10K column families fails with 100 node cluster Key: CASSANDRA-6977 URL: https://issues.apache.org/jira/browse/CASSANDRA-6977 Project: Cassandra Issue Type: Bug Environment: 100 nodes, Ubuntu 12.04.3 LTS, AWS m1.large instances Reporter: Daniel Meyer Attachments: 100_nodes_all_data.png, all_data_5_nodes.png, keyspace_create.py, logs.tar, tpstats.txt, visualvm_tracer_data.csv During this test we are attempting to create a total of 1K keyspaces with 10 column families each to bring the total column families to 10K. With a 5 node cluster this operation can be completed; however, it fails with 100 nodes. Please see the two charts. For the 5 node case the time required to create each keyspace and subsequent 10 column families increases linearly until the number of keyspaces is 1K. For a 100 node cluster there is a sudden increase in latency between 450 keyspaces and 550 keyspaces. The test ends when the test script times out. After the test script times out it is impossible to reconnect to the cluster with the datastax python driver because it cannot connect to the host: cassandra.cluster.NoHostAvailable: ('Unable to connect to any servers', {'10.199.5.98': OperationTimedOut()}) It was found that running the following stress command does work from the same machine the test script runs on. cassandra-stress -d 10.199.5.98 -l 2 -e QUORUM -L3 -b -o INSERT It should be noted that this test was initially done with DSE 4.0 and c* version 2.0.5.24 and in that case it was not possible to run stress against the cluster even locally on a node due to not finding the host. Attached are system logs from one of the nodes, charts showing schema creation latency for 5 and 100 node clusters, visualvm tracer data for cpu, memory, num_threads and gc runs, tpstats output and the test script. The test script was on an m1.large aws instance outside of the cluster under test. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6971) nodes not catching up to creation of new keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-6971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960318#comment-13960318 ] Brandon Williams commented on CASSANDRA-6971: - Hmm, so the ks was created on node1, node2 saw it and applied it, but node3 never noticed any of this and never got it, so we must have a problem with the schema pull code. That's about all I can tell at INFO; if you can get DEBUG there may be more. nodes not catching up to creation of new keyspace - Key: CASSANDRA-6971 URL: https://issues.apache.org/jira/browse/CASSANDRA-6971 Project: Cassandra Issue Type: Bug Reporter: Russ Hatch Attachments: node1.log, node2.log, node3.log The dtest suite is running a test which creates a 3 node cluster, then adds a keyspace and column family. For some reason the 3 nodes are not agreeing on the schema version. The problem is intermittent -- either the nodes all agree on schema quickly, or they seem to stay stuck in limbo. The simplest way to reproduce is to run the dtest (simple_increment_test): https://github.com/riptano/cassandra-dtest/blob/master/counter_tests.py using nosetests: {noformat} nosetests -vs counter_tests.py:TestCounters.simple_increment_test {noformat} If the problem is reproduced nose will return this: ProgrammingError: Bad Request: Keyspace 'ks' does not exist I am not yet sure if the bug is reproducible outside of the dtest suite. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6971) nodes not catching up to creation of new keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-6971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960319#comment-13960319 ] Brandon Williams commented on CASSANDRA-6971: - gossipinfo from this state would help rule out a problem there. nodes not catching up to creation of new keyspace - Key: CASSANDRA-6971 URL: https://issues.apache.org/jira/browse/CASSANDRA-6971 Project: Cassandra Issue Type: Bug Reporter: Russ Hatch Attachments: node1.log, node2.log, node3.log The dtest suite is running a test which creates a 3 node cluster, then adds a keyspace and column family. For some reason the 3 nodes are not agreeing on the schema version. The problem is intermittent -- either the nodes all agree on schema quickly, or they seem to stay stuck in limbo. The simplest way to reproduce is to run the dtest (simple_increment_test): https://github.com/riptano/cassandra-dtest/blob/master/counter_tests.py using nosetests: {noformat} nosetests -vs counter_tests.py:TestCounters.simple_increment_test {noformat} If the problem is reproduced nose will return this: ProgrammingError: Bad Request: Keyspace 'ks' does not exist I am not yet sure if the bug is reproducible outside of the dtest suite. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6553) Benchmark counter improvements (counters++)
[ https://issues.apache.org/jira/browse/CASSANDRA-6553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960339#comment-13960339 ] Russ Hatch commented on CASSANDRA-6553: --- [~iamaleksey] -- here you go: http://riptano.github.io/cassandra_performance/graph/graph.html?stats=6553.low_contention_CL_quorum_4_4.json http://riptano.github.io/cassandra_performance/graph/graph.html?stats=6553.high_contention_CL_quorum_4_4.json http://riptano.github.io/cassandra_performance/graph/graph.html?stats=6553.user_contention_CL_quorum_4_4.json btw, if you want to change the top title or legend information for better presentation, you can update the json. Clicking a legend will disable/enable that dataset. If you want to make changes but don't want to push them to github, you can clone the cassandra_performance repo and run it locally from the 'graph' directory with python -m SimpleHTTPServer Benchmark counter improvements (counters++) --- Key: CASSANDRA-6553 URL: https://issues.apache.org/jira/browse/CASSANDRA-6553 Project: Cassandra Issue Type: Test Reporter: Ryan McGuire Assignee: Russ Hatch Fix For: 2.1 beta2 Attachments: 6553.txt, 6553.uber.quorum.bdplab.read.png, 6553.uber.quorum.bdplab.write.png, high_cl_one.png, high_cl_quorum.png, logs.tar.gz, low_cl_one.png, low_cl_quorum.png, tracing.txt, uber_cl_one.png, uber_cl_quorum.png Benchmark the difference in performance between CASSANDRA-6504 and trunk. * Updating totally unrelated counters (different partitions) * Updating the same counters a lot (same cells in the same partition) * Different cells in the same few partitions (hot counter partition) benchmark: https://github.com/apache/cassandra/tree/1218bcacba7edefaf56cf8440d0aea5794c89a1e (old counters) compared to: https://github.com/apache/cassandra/tree/714c423360c36da2a2b365efaf9c5c4f623ed133 (new counters) So far, the above changes should only affect the write path. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (CASSANDRA-6553) Benchmark counter improvements (counters++)
[ https://issues.apache.org/jira/browse/CASSANDRA-6553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960339#comment-13960339 ] Russ Hatch edited comment on CASSANDRA-6553 at 4/4/14 8:03 PM: --- [~iamaleksey] -- here you go: http://riptano.github.io/cassandra_performance/graph/graph.html?stats=6553.low_contention_CL_quorum_4_4.json http://riptano.github.io/cassandra_performance/graph/graph.html?stats=6553.high_contention_CL_quorum_4_4.json http://riptano.github.io/cassandra_performance/graph/graph.html?stats=6553.user_contention_CL_quorum_4_4.json btw, if you want to change the top title or legend information for better presentation, you can update the json. Clicking a color in the legend will disable/enable that dataset. If you want to make changes but don't want to push them to github, you can clone the cassandra_performance repo and run it locally from the 'graph' directory with python -m SimpleHTTPServer was (Author: rhatch): [~iamaleksey] -- here you go: http://riptano.github.io/cassandra_performance/graph/graph.html?stats=6553.low_contention_CL_quorum_4_4.json http://riptano.github.io/cassandra_performance/graph/graph.html?stats=6553.high_contention_CL_quorum_4_4.json http://riptano.github.io/cassandra_performance/graph/graph.html?stats=6553.user_contention_CL_quorum_4_4.json btw, if you want to change the top title or legend information for better presentation, you can update the json. Clicking a legend will disable/enable that dataset. If you want to make changes but don't want to push them to github, you can clone the cassandra_performance repo and run it locally from the 'graph' directory with python -m SimpleHTTPServer Benchmark counter improvements (counters++) --- Key: CASSANDRA-6553 URL: https://issues.apache.org/jira/browse/CASSANDRA-6553 Project: Cassandra Issue Type: Test Reporter: Ryan McGuire Assignee: Russ Hatch Fix For: 2.1 beta2 Attachments: 6553.txt, 6553.uber.quorum.bdplab.read.png, 6553.uber.quorum.bdplab.write.png, high_cl_one.png, high_cl_quorum.png, logs.tar.gz, low_cl_one.png, low_cl_quorum.png, tracing.txt, uber_cl_one.png, uber_cl_quorum.png Benchmark the difference in performance between CASSANDRA-6504 and trunk. * Updating totally unrelated counters (different partitions) * Updating the same counters a lot (same cells in the same partition) * Different cells in the same few partitions (hot counter partition) benchmark: https://github.com/apache/cassandra/tree/1218bcacba7edefaf56cf8440d0aea5794c89a1e (old counters) compared to: https://github.com/apache/cassandra/tree/714c423360c36da2a2b365efaf9c5c4f623ed133 (new counters) So far, the above changes should only affect the write path. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6957) testNewRepairedSSTable fails intermittently
[ https://issues.apache.org/jira/browse/CASSANDRA-6957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960351#comment-13960351 ] Jonathan Ellis commented on CASSANDRA-6957: --- Is there a way there could be a race and we end up wanting to put an sstable back in the same level it started? If so we'd want something like v3. testNewRepairedSSTable fails intermittently --- Key: CASSANDRA-6957 URL: https://issues.apache.org/jira/browse/CASSANDRA-6957 Project: Cassandra Issue Type: Bug Reporter: Jonathan Ellis Assignee: Marcus Eriksson Fix For: 2.1 beta2 Attachments: 0001-doh-clear-out-L0-as-well.patch, 6957-v2.txt, 6957-v3.txt, system.log.txt -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-6957) testNewRepairedSSTable fails intermittently
[ https://issues.apache.org/jira/browse/CASSANDRA-6957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-6957: -- Attachment: 6957-v3.txt testNewRepairedSSTable fails intermittently --- Key: CASSANDRA-6957 URL: https://issues.apache.org/jira/browse/CASSANDRA-6957 Project: Cassandra Issue Type: Bug Reporter: Jonathan Ellis Assignee: Marcus Eriksson Fix For: 2.1 beta2 Attachments: 0001-doh-clear-out-L0-as-well.patch, 6957-v2.txt, 6957-v3.txt, system.log.txt -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6831) Updates to COMPACT STORAGE tables via cli drop CQL information
[ https://issues.apache.org/jira/browse/CASSANDRA-6831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960368#comment-13960368 ] Mikhail Stepura commented on CASSANDRA-6831: Here's what I've seen on the 2.1 branch. The table was created in ``cqlsh``
{code:title=cqlsh}
[cqlsh 5.0.0 | Cassandra 2.1.0-beta1-SNAPSHOT | CQL spec 3.1.5 | Native protocol v2]
Use HELP for help.
cqlsh>
cqlsh> CREATE KEYSPACE test WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
cqlsh> use test;
cqlsh:test> CREATE TABLE foo (bar text, baz text, qux text, PRIMARY KEY(bar, baz) ) WITH COMPACT STORAGE;
cqlsh:test> DESCRIBE TABLE foo;

CREATE TABLE test.foo (
    bar text,
    baz text,
    qux text,
    PRIMARY KEY (bar, baz)
) WITH COMPACT STORAGE
    AND CLUSTERING ORDER BY (baz ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = '{keys:ALL, rows_per_partition:NONE}'
    AND comment = ''
    AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND populate_io_cache_on_flush = false
    AND read_repair_chance = 0.1
    AND speculative_retry = '99.0PERCENTILE'
{code}
Then I did the following in ``cassandra-cli``
{code:title=cassandra-cli}
mstepura-mac:cassandra mikhail$ bin/cassandra-cli
Connected to: Test Cluster on 127.0.0.1/9160
Welcome to Cassandra CLI version 2.1.0-beta1-SNAPSHOT

The CLI is deprecated and will be removed in Cassandra 3.0. Consider migrating to cqlsh.
CQL is fully backwards compatible with Thrift data; see http://www.datastax.com/dev/blog/thrift-to-cql3

Type 'help;' or '?' for help.
Type 'quit;' or 'exit;' to quit.

[default@unknown] use test;
Authenticated to keyspace: test
[default@test] UPDATE COLUMN FAMILY foo WITH comment='hey this is a comment';
org.apache.thrift.transport.TTransportException
{code}
Meanwhile in the logs
{code}
ERROR 20:14:17 Exception in thread Thread[MigrationStage:1,5,main]
java.lang.AssertionError: There shouldn't be more than one compact value defined: got ColumnDefinition{name=qux, type=org.apache.cassandra.db.marshal.UTF8Type, kind=COMPACT_VALUE, componentIndex=null, indexName=null, indexType=null} and ColumnDefinition{name=value, type=org.apache.cassandra.db.marshal.UTF8Type, kind=COMPACT_VALUE, componentIndex=null, indexName=null, indexType=null}
	at org.apache.cassandra.config.CFMetaData.rebuild(CFMetaData.java:1981) ~[main/:na]
	at org.apache.cassandra.config.CFMetaData.fromSchemaNoTriggers(CFMetaData.java:1751) ~[main/:na]
	at org.apache.cassandra.config.CFMetaData.fromSchema(CFMetaData.java:1791) ~[main/:na]
	at org.apache.cassandra.config.KSMetaData.deserializeColumnFamilies(KSMetaData.java:320) ~[main/:na]
	at org.apache.cassandra.db.DefsTables.mergeColumnFamilies(DefsTables.java:306) ~[main/:na]
	at org.apache.cassandra.db.DefsTables.mergeSchema(DefsTables.java:181) ~[main/:na]
	at org.apache.cassandra.service.MigrationManager$2.runMayThrow(MigrationManager.java:306) ~[main/:na]
	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_51]
	at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_51]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_51]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_51]
	at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
ERROR 20:14:17 Error occurred during processing of message.
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.AssertionError: There shouldn't be more than one compact value defined: got ColumnDefinition{name=qux, type=org.apache.cassandra.db.marshal.UTF8Type, kind=COMPACT_VALUE, componentIndex=null, indexName=null, indexType=null} and ColumnDefinition{name=value, type=org.apache.cassandra.db.marshal.UTF8Type, kind=COMPACT_VALUE, componentIndex=null, indexName=null, indexType=null}
	at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:411) ~[main/:na]
	at org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:288) ~[main/:na]
	at org.apache.cassandra.service.MigrationManager.announceColumnFamilyUpdate(MigrationManager.java:242) ~[main/:na]
	at org.apache.cassandra.thrift.CassandraServer.system_update_column_family(CassandraServer.java:1676) ~[main/:na]
	at
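The assertion in the log enforces a schema invariant: a COMPACT STORAGE table may define at most one COMPACT_VALUE column, while the cli update path here ends up producing two (the user-named {{qux}} plus a synthesized {{value}}). A rough stand-alone illustration of that invariant check, with a stand-in ColumnDefinition class rather than Cassandra's CFMetaData:
{code}
import java.util.Arrays;
import java.util.List;

public class CompactValueInvariant
{
    // stand-in for Cassandra's ColumnDefinition
    static final class ColumnDefinition
    {
        final String name;
        final String kind; // e.g. PARTITION_KEY, CLUSTERING_COLUMN, COMPACT_VALUE

        ColumnDefinition(String name, String kind) { this.name = name; this.kind = kind; }

        public String toString() { return "ColumnDefinition{name=" + name + ", kind=" + kind + "}"; }
    }

    // mirrors the invariant the AssertionError above enforces: at most one compact value
    static ColumnDefinition findCompactValue(List<ColumnDefinition> defs)
    {
        ColumnDefinition compactValue = null;
        for (ColumnDefinition def : defs)
        {
            if (!"COMPACT_VALUE".equals(def.kind))
                continue;
            assert compactValue == null
                : "There shouldn't be more than one compact value defined: got " + compactValue + " and " + def;
            compactValue = def;
        }
        return compactValue;
    }

    public static void main(String[] args)
    {
        // the broken state from the log: both 'qux' and the default 'value' survive
        findCompactValue(Arrays.asList(
                new ColumnDefinition("qux", "COMPACT_VALUE"),
                new ColumnDefinition("value", "COMPACT_VALUE"))); // fails when run with -ea
    }
}
{code}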
[jira] [Updated] (CASSANDRA-6959) Reusing Keyspace and CF names raises assertion errors
[ https://issues.apache.org/jira/browse/CASSANDRA-6959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-6959: Reproduced In: 2.1 beta1, 2.0.6 (was: 2.0.6, 2.1 beta1) Fix Version/s: 2.1 beta2 Assignee: Benedict Let's just get the CL part for 2.1 fixed then. Reusing Keyspace and CF names raises assertion errors - Key: CASSANDRA-6959 URL: https://issues.apache.org/jira/browse/CASSANDRA-6959 Project: Cassandra Issue Type: Bug Reporter: Ryan McGuire Assignee: Benedict Fix For: 2.1 beta2 The [dtest I introduced|https://github.com/riptano/cassandra-dtest/commit/36960090d219ab8dbc7f108faa91c3ea5cea2bec] to test CASSANDRA-6924 introduces some log errors which I think may be related to CASSANDRA-5202. On 2.1:
{code}
ERROR [MigrationStage:1] 2014-03-31 14:36:43,463 CommitLogSegmentManager.java:306 - Failed waiting for a forced recycle of in-use commit log segments
java.lang.AssertionError: null
	at org.apache.cassandra.db.commitlog.CommitLogSegmentManager.forceRecycleAll(CommitLogSegmentManager.java:301) ~[main/:na]
	at org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:160) [main/:na]
	at org.apache.cassandra.db.DefsTables.dropColumnFamily(DefsTables.java:497) [main/:na]
	at org.apache.cassandra.db.DefsTables.mergeColumnFamilies(DefsTables.java:296) [main/:na]
	at org.apache.cassandra.db.DefsTables.mergeSchema(DefsTables.java:181) [main/:na]
	at org.apache.cassandra.db.DefinitionsUpdateVerbHandler$1.runMayThrow(DefinitionsUpdateVerbHandler.java:49) [main/:na]
	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) [main/:na]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_51]
	at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_51]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_51]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_51]
	at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
{code}
On 2.0:
{code}
ERROR [ReadStage:3] 2014-03-31 13:28:11,014 CassandraDaemon.java (line 198) Exception in thread Thread[ReadStage:3,5,main]
java.lang.AssertionError
	at org.apache.cassandra.db.filter.ExtendedFilter$WithClauses.getExtraFilter(ExtendedFilter.java:258)
	at org.apache.cassandra.db.ColumnFamilyStore.filter(ColumnFamilyStore.java:1744)
	at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1699)
	at org.apache.cassandra.db.PagedRangeCommand.executeLocally(PagedRangeCommand.java:119)
	at org.apache.cassandra.service.RangeSliceVerbHandler.doVerb(RangeSliceVerbHandler.java:39)
	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:60)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)
{code}
To reproduce, you may need to comment out the assertion in that test, as it is not 100% reproducible on the first try. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6913) Compaction of system keyspaces during startup can cause early loading of non-system keyspaces
[ https://issues.apache.org/jira/browse/CASSANDRA-6913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960388#comment-13960388 ] Jonathan Ellis commented on CASSANDRA-6913: --- Done. Compaction of system keyspaces during startup can cause early loading of non-system keyspaces - Key: CASSANDRA-6913 URL: https://issues.apache.org/jira/browse/CASSANDRA-6913 Project: Cassandra Issue Type: Bug Reporter: Benedict Assignee: Benedict Priority: Minor Fix For: 2.0.7, 2.1 beta2 Attachments: 6913.txt This then can result in an inconsistent CFS state, as cleanup of e.g. compaction leftovers does not get reflected in DataTracker. It happens because StorageService.getLoad() iterates over and opens all CFS, and this is called by Compaction. -- This message was sent by Atlassian JIRA (v6.2#6252)
git commit: Ensure safe resource cleanup when replacing SSTables
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 57b18e600 -> 5ebadc11e

Ensure safe resource cleanup when replacing SSTables

Patch by belliotsmith; reviewed by Tyler Hobbs for CASSANDRA-6912

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5ebadc11
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5ebadc11
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5ebadc11

Branch: refs/heads/cassandra-2.1
Commit: 5ebadc11e36749e6479f9aba19406db3aacdaf41
Parents: 57b18e6
Author: belliottsmith git...@sub.laerad.com
Authored: Fri Apr 4 15:37:09 2014 -0500
Committer: Tyler Hobbs ty...@datastax.com
Committed: Fri Apr 4 15:37:09 2014 -0500
--
 CHANGES.txt | 1 +
 .../org/apache/cassandra/db/DataTracker.java | 28 +-
 .../cassandra/io/sstable/IndexSummary.java | 2 +-
 .../io/sstable/IndexSummaryManager.java | 22 +-
 .../cassandra/io/sstable/SSTableReader.java | 318 +--
 .../cassandra/utils/AlwaysPresentFilter.java | 3 +-
 .../org/apache/cassandra/utils/BloomFilter.java | 3 +-
 .../org/apache/cassandra/utils/IFilter.java | 2 +
 .../org/apache/cassandra/utils/obs/IBitSet.java | 2 +
 .../cassandra/utils/obs/OffHeapBitSet.java | 2 +-
 .../apache/cassandra/utils/obs/OpenBitSet.java | 2 +-
 .../io/sstable/IndexSummaryManagerTest.java | 2 +-
 .../cassandra/io/sstable/SSTableReaderTest.java | 28 +-
 13 files changed, 278 insertions(+), 137 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5ebadc11/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 4cfc957..0f1ae93 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -41,6 +41,7 @@
  * Properly remove 1.2 sstable support in 2.1 (CASSANDRA-6869)
  * Lock counter cells, not partitions (CASSANDRA-6880)
  * Track presence of legacy counter shards in sstables (CASSANDRA-6888)
+ * Ensure safe resource cleanup when replacing sstables (CASSANDRA-6912)
 Merged from 2.0:
  * Allow compaction of system tables during startup (CASSANDRA-6913)
  * Restrict Windows to parallel repairs (CASSANDRA-6907)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5ebadc11/src/java/org/apache/cassandra/db/DataTracker.java
--
diff --git a/src/java/org/apache/cassandra/db/DataTracker.java b/src/java/org/apache/cassandra/db/DataTracker.java
index c8fc699..9c8f9a0 100644
--- a/src/java/org/apache/cassandra/db/DataTracker.java
+++ b/src/java/org/apache/cassandra/db/DataTracker.java
@@ -192,14 +192,17 @@ public class DataTracker
     public boolean markCompacting(Iterable<SSTableReader> sstables)
     {
         assert sstables != null && !Iterables.isEmpty(sstables);
+        while (true)
+        {
+            View currentView = view.get();
+            Set<SSTableReader> inactive = Sets.difference(ImmutableSet.copyOf(sstables), currentView.compacting);
+            if (inactive.size() < Iterables.size(sstables))
+                return false;
-        View currentView = view.get();
-        Set<SSTableReader> inactive = Sets.difference(ImmutableSet.copyOf(sstables), currentView.compacting);
-        if (inactive.size() < Iterables.size(sstables))
-            return false;
-
-        View newView = currentView.markCompacting(inactive);
-        return view.compareAndSet(currentView, newView);
+            View newView = currentView.markCompacting(inactive);
+            if (view.compareAndSet(currentView, newView))
+                return true;
+        }
     }

 /**
@@ -333,14 +336,6 @@ public class DataTracker
  */
     public void replaceReaders(Collection<SSTableReader> oldSSTables, Collection<SSTableReader> newSSTables)
     {
-        // data component will be unchanged but the index summary will be a different size
-        // (since we save that to make restart fast)
-        long sizeIncrease = 0;
-        for (SSTableReader sstable : oldSSTables)
-            sizeIncrease -= sstable.bytesOnDisk();
-        for (SSTableReader sstable : newSSTables)
-            sizeIncrease += sstable.bytesOnDisk();
-
         View currentView, newView;
         do
         {
@@ -349,9 +344,6 @@
         }
         while (!view.compareAndSet(currentView, newView));

-        StorageMetrics.load.inc(sizeIncrease);
-        cfstore.metric.liveDiskSpaceUsed.inc(sizeIncrease);
-
         for (SSTableReader sstable : newSSTables)
             sstable.setTrackedBy(this);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5ebadc11/src/java/org/apache/cassandra/io/sstable/IndexSummary.java
--
diff --git
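The markCompacting change above swaps a single compare-and-set attempt for a retry loop, so a concurrent-but-disjoint update to the View no longer makes the caller spuriously fail. A minimal, self-contained sketch of that pattern with stand-in types (not Cassandra's DataTracker/View):
{code}
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

public class CompactingTracker
{
    // immutable snapshot, replaced wholesale on every update
    private final AtomicReference<Set<String>> compacting =
            new AtomicReference<Set<String>>(new HashSet<String>());

    public boolean markCompacting(Set<String> candidates)
    {
        while (true)
        {
            Set<String> current = compacting.get();
            // precondition: fail if any candidate is already being compacted
            for (String candidate : candidates)
                if (current.contains(candidate))
                    return false;

            Set<String> updated = new HashSet<String>(current);
            updated.addAll(candidates);
            // another thread may have swapped the reference; retry from the top
            if (compacting.compareAndSet(current, updated))
                return true;
        }
    }
}
{code}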
[1/2] git commit: Ensure safe resource cleanup when replacing SSTables
Repository: cassandra
Updated Branches:
  refs/heads/trunk 0015f37a3 -> 64bc45849

Ensure safe resource cleanup when replacing SSTables

Patch by belliotsmith; reviewed by Tyler Hobbs for CASSANDRA-6912

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5ebadc11
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5ebadc11
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5ebadc11

Branch: refs/heads/trunk
Commit: 5ebadc11e36749e6479f9aba19406db3aacdaf41
Parents: 57b18e6
Author: belliottsmith git...@sub.laerad.com
Authored: Fri Apr 4 15:37:09 2014 -0500
Committer: Tyler Hobbs ty...@datastax.com
Committed: Fri Apr 4 15:37:09 2014 -0500
--
 CHANGES.txt | 1 +
 .../org/apache/cassandra/db/DataTracker.java | 28 +-
 .../cassandra/io/sstable/IndexSummary.java | 2 +-
 .../io/sstable/IndexSummaryManager.java | 22 +-
 .../cassandra/io/sstable/SSTableReader.java | 318 +--
 .../cassandra/utils/AlwaysPresentFilter.java | 3 +-
 .../org/apache/cassandra/utils/BloomFilter.java | 3 +-
 .../org/apache/cassandra/utils/IFilter.java | 2 +
 .../org/apache/cassandra/utils/obs/IBitSet.java | 2 +
 .../cassandra/utils/obs/OffHeapBitSet.java | 2 +-
 .../apache/cassandra/utils/obs/OpenBitSet.java | 2 +-
 .../io/sstable/IndexSummaryManagerTest.java | 2 +-
 .../cassandra/io/sstable/SSTableReaderTest.java | 28 +-
 13 files changed, 278 insertions(+), 137 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5ebadc11/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 4cfc957..0f1ae93 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -41,6 +41,7 @@
  * Properly remove 1.2 sstable support in 2.1 (CASSANDRA-6869)
  * Lock counter cells, not partitions (CASSANDRA-6880)
  * Track presence of legacy counter shards in sstables (CASSANDRA-6888)
+ * Ensure safe resource cleanup when replacing sstables (CASSANDRA-6912)
 Merged from 2.0:
  * Allow compaction of system tables during startup (CASSANDRA-6913)
  * Restrict Windows to parallel repairs (CASSANDRA-6907)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5ebadc11/src/java/org/apache/cassandra/db/DataTracker.java
--
diff --git a/src/java/org/apache/cassandra/db/DataTracker.java b/src/java/org/apache/cassandra/db/DataTracker.java
index c8fc699..9c8f9a0 100644
--- a/src/java/org/apache/cassandra/db/DataTracker.java
+++ b/src/java/org/apache/cassandra/db/DataTracker.java
@@ -192,14 +192,17 @@ public class DataTracker
     public boolean markCompacting(Iterable<SSTableReader> sstables)
     {
         assert sstables != null && !Iterables.isEmpty(sstables);
+        while (true)
+        {
+            View currentView = view.get();
+            Set<SSTableReader> inactive = Sets.difference(ImmutableSet.copyOf(sstables), currentView.compacting);
+            if (inactive.size() < Iterables.size(sstables))
+                return false;
-        View currentView = view.get();
-        Set<SSTableReader> inactive = Sets.difference(ImmutableSet.copyOf(sstables), currentView.compacting);
-        if (inactive.size() < Iterables.size(sstables))
-            return false;
-
-        View newView = currentView.markCompacting(inactive);
-        return view.compareAndSet(currentView, newView);
+            View newView = currentView.markCompacting(inactive);
+            if (view.compareAndSet(currentView, newView))
+                return true;
+        }
     }

 /**
@@ -333,14 +336,6 @@ public class DataTracker
  */
     public void replaceReaders(Collection<SSTableReader> oldSSTables, Collection<SSTableReader> newSSTables)
     {
-        // data component will be unchanged but the index summary will be a different size
-        // (since we save that to make restart fast)
-        long sizeIncrease = 0;
-        for (SSTableReader sstable : oldSSTables)
-            sizeIncrease -= sstable.bytesOnDisk();
-        for (SSTableReader sstable : newSSTables)
-            sizeIncrease += sstable.bytesOnDisk();
-
         View currentView, newView;
         do
         {
@@ -349,9 +344,6 @@
         }
         while (!view.compareAndSet(currentView, newView));

-        StorageMetrics.load.inc(sizeIncrease);
-        cfstore.metric.liveDiskSpaceUsed.inc(sizeIncrease);
-
         for (SSTableReader sstable : newSSTables)
             sstable.setTrackedBy(this);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5ebadc11/src/java/org/apache/cassandra/io/sstable/IndexSummary.java
--
diff --git
[2/2] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/64bc4584
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/64bc4584
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/64bc4584

Branch: refs/heads/trunk
Commit: 64bc45849fd2a488b766ca9ddfe8456dae50a187
Parents: 0015f37 5ebadc1
Author: Tyler Hobbs ty...@datastax.com
Authored: Fri Apr 4 15:37:58 2014 -0500
Committer: Tyler Hobbs ty...@datastax.com
Committed: Fri Apr 4 15:37:58 2014 -0500
--
 CHANGES.txt | 1 +
 .../org/apache/cassandra/db/DataTracker.java | 28 +-
 .../cassandra/io/sstable/IndexSummary.java | 2 +-
 .../io/sstable/IndexSummaryManager.java | 22 +-
 .../cassandra/io/sstable/SSTableReader.java | 318 +--
 .../cassandra/utils/AlwaysPresentFilter.java | 3 +-
 .../org/apache/cassandra/utils/BloomFilter.java | 3 +-
 .../org/apache/cassandra/utils/IFilter.java | 2 +
 .../org/apache/cassandra/utils/obs/IBitSet.java | 2 +
 .../cassandra/utils/obs/OffHeapBitSet.java | 2 +-
 .../apache/cassandra/utils/obs/OpenBitSet.java | 2 +-
 .../io/sstable/IndexSummaryManagerTest.java | 2 +-
 .../cassandra/io/sstable/SSTableReaderTest.java | 28 +-
 13 files changed, 278 insertions(+), 137 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/64bc4584/CHANGES.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/64bc4584/test/unit/org/apache/cassandra/io/sstable/SSTableReaderTest.java
--
[jira] [Updated] (CASSANDRA-6912) SSTableReader.isReplaced does not allow for safe resource cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-6912: --- Since Version: 2.1 beta1 SSTableReader.isReplaced does not allow for safe resource cleanup - Key: CASSANDRA-6912 URL: https://issues.apache.org/jira/browse/CASSANDRA-6912 Project: Cassandra Issue Type: Bug Reporter: Benedict Assignee: Benedict Fix For: 2.1 beta2 There are a number of possible race conditions on resource cleanup from the use of cloneWithNewSummarySamplingLevel, because the replacement sstable can be itself replaced/obsoleted while the prior sstable is still referenced (this is actually quite easy with compaction, but can happen in other circumstances less commonly). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-3668) Parallel streaming for sstableloader
[ https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960435#comment-13960435 ] Joshua McKenzie commented on CASSANDRA-3668: A quick update on this - going the route of multiple StreamSessions per StreamPlan with the current architecture is going to require some restructuring. The current design assumes a single socket for streaming and multiple StreamSessions means multiple ConnectionHandlers, all of which assume ownership of polling the readChannel on a socket. To respect the single-socket-for-streaming paradigm we currently have, I'm working on promoting IncomingMessageHandler and OutgoingMessageHandler into higher-level abstractions that are responsible for polling the socket and dispatching to various StreamSessions based on deserialized session indices on the inbound or following the current PriorityQueue polling mechanism for the outbound rather than the current paradigm of being owned by a StreamSession. It doesn't look like we're at risk of a bottleneck on network resources even over a single socket as my prelim parallelized stream testing is peaking at ~ 55MB/s on 5 connections-per-host vs. 49MB/s on 4 connections - diminishing returns as we get higher. Compared to the 24MB/s I'm benchmarking on a single connection it's still a respectable increase. Parallel streaming for sstableloader Key: CASSANDRA-3668 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668 Project: Cassandra Issue Type: Improvement Components: API Reporter: Manish Zope Assignee: Joshua McKenzie Priority: Minor Labels: streaming Fix For: 2.1 beta2 Attachments: 3668-1.1-v2.txt, 3668-1.1.txt, 3688-reply_before_closing_writer.txt, sstable-loader performance.txt Original Estimate: 48h Remaining Estimate: 48h One of my colleagues reported a bug regarding degraded performance of the sstable generator and sstable loader. ISSUE: https://issues.apache.org/jira/browse/CASSANDRA-3589 As stated in the above issue, the generator performance has been rectified, but sstableloader performance is still an issue. 3589 is marked as a duplicate of 3618; both issues show resolved status, but the problem with sstableloader still exists, so this issue is being opened so that the sstableloader problem does not go unnoticed. FYI: We have tested the generator part with the patch given in 3589. It's working fine. Please let us know if you require further inputs from our side. -- This message was sent by Atlassian JIRA (v6.2#6252)
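Joshua's description of promoting the message handlers implies a dispatcher keyed by a per-message session index. A hypothetical sketch of that routing idea follows; StreamMessage and StreamSession here are stand-in interfaces, not Cassandra's actual streaming classes:
{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class StreamDispatcher
{
    // stand-ins for the real streaming types
    interface StreamMessage { int sessionIndex(); }
    interface StreamSession { void receive(StreamMessage message); }

    private final Map<Integer, StreamSession> sessions =
            new ConcurrentHashMap<Integer, StreamSession>();

    public void register(int index, StreamSession session)
    {
        sessions.put(index, session);
    }

    // called by the single socket-reading thread for each deserialized message
    public void dispatch(StreamMessage message)
    {
        StreamSession session = sessions.get(message.sessionIndex());
        if (session == null)
            throw new IllegalStateException("Unknown session index: " + message.sessionIndex());
        session.receive(message);
    }
}
{code}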
[jira] [Comment Edited] (CASSANDRA-3668) Parallel streaming for sstableloader
[ https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960435#comment-13960435 ] Joshua McKenzie edited comment on CASSANDRA-3668 at 4/4/14 9:19 PM: A quick update on this - going the route of multiple StreamSessions per StreamPlan is going to require some restructuring. The current design assumes a single socket for streaming and changing to multiple StreamSessions means multiple ConnectionHandlers, all of which assume ownership of polling the readChannel on a socket. To respect the single-socket-for-streaming paradigm we currently have, I'm working on promoting IncomingMessageHandler and OutgoingMessageHandler into higher-level abstractions that are responsible for polling the socket and dispatching to various StreamSessions based on deserialized session indices on the inbound or following the current PriorityQueue polling mechanism for the outbound rather than the current paradigm of being owned by a StreamSession. It doesn't look like we're at risk of a bottleneck on network resources even over a single socket as my prelim parallelized stream testing is peaking at ~ 55MB/s on 5 connections-per-host vs. 49MB/s on 4 connections - diminishing returns as we get higher. Compared to the 24MB/s I'm benchmarking on a single connection it's still a respectable increase. was (Author: joshuamckenzie): A quick update on this - going the route of multiple StreamSessions per StreamPlan with the current architecture is going to require some restructuring. The current design assumes a single socket for streaming and multiple StreamSessions means multiple ConnectionHandlers, all of which assume ownership of polling the readChannel on a socket. To respect the single-socket-for-streaming paradigm we currently have, I'm working on promoting IncomingMessageHandler and OutgoingMessageHandler into higher-level abstractions that are responsible for polling the socket and dispatching to various StreamSessions based on deserialized session indices on the inbound or following the current PriorityQueue polling mechanism for the outbound rather than the current paradigm of being owned by a StreamSession. It doesn't look like we're at risk of a bottleneck on network resources even over a single socket as my prelim parallelized stream testing is peaking at ~ 55MB/s on 5 connections-per-host vs. 49MB/s on 4 connections - diminishing returns as we get higher. Compared to the 24MB/s I'm benchmarking on a single connection it's still a respectable increase. Parallel streaming for sstableloader Key: CASSANDRA-3668 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668 Project: Cassandra Issue Type: Improvement Components: API Reporter: Manish Zope Assignee: Joshua McKenzie Priority: Minor Labels: streaming Fix For: 2.1 beta2 Attachments: 3668-1.1-v2.txt, 3668-1.1.txt, 3688-reply_before_closing_writer.txt, sstable-loader performance.txt Original Estimate: 48h Remaining Estimate: 48h One of my colleagues reported a bug regarding degraded performance of the sstable generator and sstable loader. ISSUE: https://issues.apache.org/jira/browse/CASSANDRA-3589 As stated in the above issue, the generator performance has been rectified, but sstableloader performance is still an issue. 3589 is marked as a duplicate of 3618; both issues show resolved status, but the problem with sstableloader still exists, so this issue is being opened so that the sstableloader problem does not go unnoticed. FYI: We have tested the generator part with the patch given in 3589. It's working fine. Please let us know if you require further inputs from our side. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (CASSANDRA-6983) DirectoriesTest fails when run as root
Brandon Williams created CASSANDRA-6983: --- Summary: DirectoriesTest fails when run as root Key: CASSANDRA-6983 URL: https://issues.apache.org/jira/browse/CASSANDRA-6983 Project: Cassandra Issue Type: Bug Components: Tests Reporter: Brandon Williams Assignee: Yuki Morishita Fix For: 2.0.7 When you run the DirectoriesTest as a normal user, it passes because it fails to create the 'bad' directory:
{noformat}
[junit] - Standard Error -
[junit] ERROR 16:16:18,111 Failed to create /tmp/cassandra4119802552776680052unittest/ks/bad directory
[junit] WARN 16:16:18,112 Blacklisting /tmp/cassandra4119802552776680052unittest/ks/bad for writes
[junit] - ---
{noformat}
But when you run the test as root, it succeeds in making the directory, causing an assertion failure that it's unwritable:
{noformat}
[junit] Testcase: testDiskFailurePolicy_best_effort(org.apache.cassandra.db.DirectoriesTest): FAILED
[junit]
[junit] junit.framework.AssertionFailedError:
[junit] at org.apache.cassandra.db.DirectoriesTest.testDiskFailurePolicy_best_effort(DirectoriesTest.java:199)
{noformat}
It seems to me that we shouldn't be relying on failing to make the directory. If we're just going to test a nonexistent dir, why try to make one at all? And if that is supposed to succeed, then we have a problem with either the test or blacklisting. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (CASSANDRA-3668) Parallel streaming for sstableloader
[ https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960435#comment-13960435 ] Joshua McKenzie edited comment on CASSANDRA-3668 at 4/4/14 9:25 PM: (edit) Scratch what I wrote previously - we're good with the multiple StreamSessions per peer, I just need to iron out a socket-connection race on startup of streams. Prelim parallelized stream testing is peaking at ~ 55MB/s on 5 connections-per-host vs. 49MB/s on 4 connections - diminishing returns as we get higher. Compared to the 24MB/s I'm benchmarking on a single connection it's still a respectable increase. was (Author: joshuamckenzie): A quick update on this - going the route of multiple StreamSessions per StreamPlan is going to require some restructuring. The current design assumes a single socket for streaming and changing to multiple StreamSessions means multiple ConnectionHandlers, all of which assume ownership of polling the readChannel on a socket. To respect the single-socket-for-streaming paradigm we currently have, I'm working on promoting IncomingMessageHandler and OutgoingMessageHandler into higher-level abstractions that are responsible for polling the socket and dispatching to various StreamSessions based on deserialized session indices on the inbound or following the current PriorityQueue polling mechanism for the outbound rather than the current paradigm of being owned by a StreamSession. It doesn't look like we're at risk of a bottleneck on network resources even over a single socket as my prelim parallelized stream testing is peaking at ~ 55MB/s on 5 connections-per-host vs. 49MB/s on 4 connections - diminishing returns as we get higher. Compared to the 24MB/s I'm benchmarking on a single connection it's still a respectable increase. Parallel streaming for sstableloader Key: CASSANDRA-3668 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668 Project: Cassandra Issue Type: Improvement Components: API Reporter: Manish Zope Assignee: Joshua McKenzie Priority: Minor Labels: streaming Fix For: 2.1 beta2 Attachments: 3668-1.1-v2.txt, 3668-1.1.txt, 3688-reply_before_closing_writer.txt, sstable-loader performance.txt Original Estimate: 48h Remaining Estimate: 48h One of my colleagues reported a bug regarding degraded performance of the sstable generator and sstable loader. ISSUE: https://issues.apache.org/jira/browse/CASSANDRA-3589 As stated in the above issue, the generator performance has been rectified, but sstableloader performance is still an issue. 3589 is marked as a duplicate of 3618; both issues show resolved status, but the problem with sstableloader still exists, so this issue is being opened so that the sstableloader problem does not go unnoticed. FYI: We have tested the generator part with the patch given in 3589. It's working fine. Please let us know if you require further inputs from our side. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (CASSANDRA-3668) Parallel streaming for sstableloader
[ https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960435#comment-13960435 ] Joshua McKenzie edited comment on CASSANDRA-3668 at 4/4/14 9:26 PM: (edit) We're good with the multiple StreamSessions per peer, I just need to iron out a socket-connection race on startup of streams. Prelim parallelized stream testing is peaking at ~ 55MB/s on 5 connections-per-host vs. 49MB/s on 4 connections - diminishing returns as we get higher. Compared to the 24MB/s I'm benchmarking on a single connection it's still a respectable increase. was (Author: joshuamckenzie): (edit) Scratch what I wrote previously - we're good with the multiple StreamSessions per peer, I just need to iron out a socket-connection race on startup of streams. Prelim parallelized stream testing is peaking at ~ 55MB/s on 5 connections-per-host vs. 49MB/s on 4 connections - diminishing returns as we get higher. Compared to the 24MB/s I'm benchmarking on a single connection it's still a respectable increase. Parallel streaming for sstableloader Key: CASSANDRA-3668 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668 Project: Cassandra Issue Type: Improvement Components: API Reporter: Manish Zope Assignee: Joshua McKenzie Priority: Minor Labels: streaming Fix For: 2.1 beta2 Attachments: 3668-1.1-v2.txt, 3668-1.1.txt, 3688-reply_before_closing_writer.txt, sstable-loader performance.txt Original Estimate: 48h Remaining Estimate: 48h One of my colleagues reported a bug regarding degraded performance of the sstable generator and sstable loader. ISSUE: https://issues.apache.org/jira/browse/CASSANDRA-3589 As stated in the above issue, the generator performance has been rectified, but sstableloader performance is still an issue. 3589 is marked as a duplicate of 3618; both issues show resolved status, but the problem with sstableloader still exists, so this issue is being opened so that the sstableloader problem does not go unnoticed. FYI: We have tested the generator part with the patch given in 3589. It's working fine. Please let us know if you require further inputs from our side. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (CASSANDRA-3668) Parallel streaming for sstableloader
[ https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960435#comment-13960435 ] Joshua McKenzie edited comment on CASSANDRA-3668 at 4/4/14 9:29 PM: We should be good with multiple StreamSessions per peer with some minimal code-changes to clean up and consolidate StreamSessions and ProgressInfo data. Prelim parallelized stream testing is peaking at ~ 55MB/s on 5 connections-per-host vs. 49MB/s on 4 connections - diminishing returns as we get higher. Compared to the 24MB/s I'm benchmarking on a single connection it's still a respectable increase. was (Author: joshuamckenzie): (edit) We're good with the multiple StreamSessions per peer, I just need to iron out a socket-connection race on startup of streams. Prelim parallelized stream testing is peaking at ~ 55MB/s on 5 connections-per-host vs. 49MB/s on 4 connections - diminishing returns as we get higher. Compared to the 24MB/s I'm benchmarking on a single connection it's still a respectable increase. Parallel streaming for sstableloader Key: CASSANDRA-3668 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668 Project: Cassandra Issue Type: Improvement Components: API Reporter: Manish Zope Assignee: Joshua McKenzie Priority: Minor Labels: streaming Fix For: 2.1 beta2 Attachments: 3668-1.1-v2.txt, 3668-1.1.txt, 3688-reply_before_closing_writer.txt, sstable-loader performance.txt Original Estimate: 48h Remaining Estimate: 48h One of my colleagues reported a bug regarding degraded performance of the sstable generator and sstable loader. ISSUE: https://issues.apache.org/jira/browse/CASSANDRA-3589 As stated in the above issue, the generator performance has been rectified, but sstableloader performance is still an issue. 3589 is marked as a duplicate of 3618; both issues show resolved status, but the problem with sstableloader still exists, so this issue is being opened so that the sstableloader problem does not go unnoticed. FYI: We have tested the generator part with the patch given in 3589. It's working fine. Please let us know if you require further inputs from our side. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6934) Optimise Byte + CellName comparisons
[ https://issues.apache.org/jira/browse/CASSANDRA-6934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960445#comment-13960445 ] Benedict commented on CASSANDRA-6934: - Initial patch available [here|https://github.com/belliottsmith/cassandra/tree/6934] Optimises to some extent the various compare() implementations for AbstractCType, and at the same time slightly optimises compare in BTree to avoid unwrapping the special +/-Inf values except when absolutely necessary, and to perform (potentially) one fewer comparison per update() when not updating identical values, and to not perform a wasteful start-of-range compare() when _not_ inserting. There are further performance improvements that can be made to AbstractCType.compare() and its inheritors, but they're a little more invasive, and since CASSANDRA-6694 will entail some optimisation work to make comparisons less expensive, I will wait until then to do anything more. I need to make some tweaks to stress so I can properly test the impact of this patch on CQL, as there's no easy way to perform inserts of random columns. As shown with CASSANDRA-6553, there is a marked improvement for simple composites, and some quick and dirty benchmarking on my local box for thrift columns with only the general purpose improvements showed a lesser but still marked impact. Optimise Byte + CellName comparisons Key: CASSANDRA-6934 URL: https://issues.apache.org/jira/browse/CASSANDRA-6934 Project: Cassandra Issue Type: Improvement Reporter: Benedict Assignee: Benedict Labels: performance Fix For: 2.1 AbstractCompositeType is called a lot, so deserves some heavy optimisation. SimpleCellNameType can be optimised easily, but should explore other potential optimisations. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6961) nodes should go into hibernate when join_ring is false
[ https://issues.apache.org/jira/browse/CASSANDRA-6961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960455#comment-13960455 ] Tyler Hobbs commented on CASSANDRA-6961: I'm seeing some issues with repair while one node is running with join_ring=false. Here's what I did: * Start a three node ccm cluster * Start a stress write with RF=3 * Stop node3 * Start node3 * Run a repair against node3 It looks like the repair finishes everything diffing and streaming, but the repair command hangs, and netstats shows continuously increasing completed Command/Response counts. nodes should go into hibernate when join_ring is false -- Key: CASSANDRA-6961 URL: https://issues.apache.org/jira/browse/CASSANDRA-6961 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Brandon Williams Assignee: Brandon Williams Fix For: 2.0.7 Attachments: 6961.txt The impetus here is this: a node that was down for some period and comes back can serve stale information. We know from CASSANDRA-768 that we can't just wait for hints, and know that tangentially related CASSANDRA-3569 prevents us from having the node in a down (from the FD's POV) state handle streaming. We can *almost* set join_ring to false, then repair, and then join the ring to narrow the window (actually, you can do this and everything succeeds because the node doesn't know it's a member yet, which is probably a bit of a bug.) If instead we modified this to put the node in hibernate, like replace_address does, it could work almost like replace, except you could run a repair (manually) while in the hibernate state, and then flip to normal when it's done. This won't prevent the staleness 100%, but it will greatly reduce the chance if the node has been down a significant amount of time. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (CASSANDRA-6961) nodes should go into hibernate when join_ring is false
[ https://issues.apache.org/jira/browse/CASSANDRA-6961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960455#comment-13960455 ] Tyler Hobbs edited comment on CASSANDRA-6961 at 4/4/14 9:49 PM: I'm seeing some issues with repair while one node is running with join_ring=false. Here's what I did: * Start a three node ccm cluster * Start a stress write with RF=3 * Stop node3 * Start node3 with join_ring=false * Run a repair against node3 It looks like the repair finishes everything diffing and streaming, but the repair command hangs, and netstats shows continuously increasing completed Command/Response counts. was (Author: thobbs): I'm seeing some issues with repair while one node is running with join_ring=false. Here's what I did: * Start a three node ccm cluster * Start a stress write with RF=3 * Stop node3 * Start node3 * Run a repair against node3 It looks like the repair finishes everything diffing and streaming, but the repair command hangs, and netstats shows continuously increasing completed Command/Response counts. nodes should go into hibernate when join_ring is false -- Key: CASSANDRA-6961 URL: https://issues.apache.org/jira/browse/CASSANDRA-6961 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Brandon Williams Assignee: Brandon Williams Fix For: 2.0.7 Attachments: 6961.txt The impetus here is this: a node that was down for some period and comes back can serve stale information. We know from CASSANDRA-768 that we can't just wait for hints, and know that tangentially related CASSANDRA-3569 prevents us from having the node in a down (from the FD's POV) state handle streaming. We can *almost* set join_ring to false, then repair, and then join the ring to narrow the window (actually, you can do this and everything succeeds because the node doesn't know it's a member yet, which is probably a bit of a bug.) If instead we modified this to put the node in hibernate, like replace_address does, it could work almost like replace, except you could run a repair (manually) while in the hibernate state, and then flip to normal when it's done. This won't prevent the staleness 100%, but it will greatly reduce the chance if the node has been down a significant amount of time. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-6863) Incorrect read repair of range tombstones
[ https://issues.apache.org/jira/browse/CASSANDRA-6863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-6863: -- Attachment: 6863-v2.txt Incorrect read repair of range tombstones -- Key: CASSANDRA-6863 URL: https://issues.apache.org/jira/browse/CASSANDRA-6863 Project: Cassandra Issue Type: Bug Environment: 2.0 Reporter: Oleg Anastasyev Attachments: 6863-v2.txt, 6863-v2.txt, ReadRepairRangeThombstoneDiff.txt, ReadRepairsDebugLogger.txt Rows with range tombstones are read repaired for every replica if RR is triggered (this is because CF.diff() returns non-null if !isEmpty(), which in turn returns false if the range tombstone list is not empty). Also, the full range tombstone list is sent to all nodes, which could be a problem if you have a wide partition. Fixed this by evaluating the diff on range tombstone lists as well as on the deletionInfo of the endpoint CF versions. Also return null from CF.diff if there is no diff in the RTL. A second patch (ReadRepairsDebugLogger.txt) adds some debug logging to look at read repairs. You may find it useful as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6961) nodes should go into hibernate when join_ring is false
[ https://issues.apache.org/jira/browse/CASSANDRA-6961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960468#comment-13960468 ] Brandon Williams commented on CASSANDRA-6961: - Hmm, I can't reproduce that, even wiping the node before starting it with join_ring=false: {noformat} [2014-04-04 21:51:30,888] Repair command #1 finished {noformat} and nodetool exits. nodes should go into hibernate when join_ring is false -- Key: CASSANDRA-6961 URL: https://issues.apache.org/jira/browse/CASSANDRA-6961 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Brandon Williams Assignee: Brandon Williams Fix For: 2.0.7 Attachments: 6961.txt The impetus here is this: a node that was down for some period and comes back can serve stale information. We know from CASSANDRA-768 that we can't just wait for hints, and know that tangentially related CASSANDRA-3569 prevents us from having the node in a down (from the FD's POV) state handle streaming. We can *almost* set join_ring to false, then repair, and then join the ring to narrow the window (actually, you can do this and everything succeeds because the node doesn't know it's a member yet, which is probably a bit of a bug.) If instead we modified this to put the node in hibernate, like replace_address does, it could work almost like replace, except you could run a repair (manually) while in the hibernate state, and then flip to normal when it's done. This won't prevent the staleness 100%, but it will greatly reduce the chance if the node has been down a significant amount of time. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-4050) Rewrite RandomAccessReader to use FileChannel / nio to address Windows file access violations
[ https://issues.apache.org/jira/browse/CASSANDRA-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-4050: --- Description: On Windows w/older java I/O libraries the files are not opened with FILE_SHARE_DELETE. This causes problems as hard-links cannot be deleted while the original file is opened - our snapshots are a big problem in particular. The nio library and FileChannels open with FILE_SHARE_DELETE which should help remedy this problem. Original text: I'm using Cassandra 1.0.8, on Windows 7. When I take a snapshot of the database, I find that I am unable to delete the snapshot directory (i.e., dir named {datadir}\{keyspacename}\snapshots\{snapshottag}) while Cassandra is running: The action can't be completed because the folder or a file in it is open in another program. Close the folder or file and try again [in Windows Explorer]. If I terminate Cassandra, then I can delete the directory with no problem. I expect to be able to move or delete the snapshotted files while Cassandra is running, as this should not affect the runtime operation of Cassandra. was: I'm using Cassandra 1.0.8, on Windows 7. When I take a snapshot of the database, I find that I am unable to delete the snapshot directory (i.e., dir named {datadir}\{keyspacename}\snapshots\{snapshottag}) while Cassandra is running: The action can't be completed because the folder or a file in it is open in another program. Close the folder or file and try again [in Windows Explorer]. If I terminate Cassandra, then I can delete the directory with no problem. I expect to be able to move or delete the snapshotted files while Cassandra is running, as this should not affect the runtime operation of Cassandra. Summary: Rewrite RandomAccessReader to use FileChannel / nio to address Windows file access violations (was: Unable to remove snapshot files on Windows while original sstables are live) Rewrite RandomAccessReader to use FileChannel / nio to address Windows file access violations - Key: CASSANDRA-4050 URL: https://issues.apache.org/jira/browse/CASSANDRA-4050 Project: Cassandra Issue Type: Bug Environment: Windows 7 Reporter: Jim Newsham Assignee: Joshua McKenzie Priority: Minor Attachments: CASSANDRA-4050_v1.patch On Windows w/older java I/O libraries the files are not opened with FILE_SHARE_DELETE. This causes problems as hard-links cannot be deleted while the original file is opened - our snapshots are a big problem in particular. The nio library and FileChannels open with FILE_SHARE_DELETE which should help remedy this problem. Original text: I'm using Cassandra 1.0.8, on Windows 7. When I take a snapshot of the database, I find that I am unable to delete the snapshot directory (i.e., dir named {datadir}\{keyspacename}\snapshots\{snapshottag}) while Cassandra is running: The action can't be completed because the folder or a file in it is open in another program. Close the folder or file and try again [in Windows Explorer]. If I terminate Cassandra, then I can delete the directory with no problem. I expect to be able to move or delete the snapshotted files while Cassandra is running, as this should not affect the runtime operation of Cassandra. -- This message was sent by Atlassian JIRA (v6.2#6252)
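The ticket's premise is that java.nio FileChannels open files with FILE_SHARE_DELETE on Windows, so hard links (such as snapshots) remain deletable while a reader is open. A minimal sketch of the reader style this rewrite moves toward; the path argument and buffer size are illustrative, not from the patch:
{code}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class NioReaderSketch
{
    public static void main(String[] args) throws IOException
    {
        // FileChannel.open (java.nio) requests FILE_SHARE_DELETE on Windows,
        // so a hard link to this file (e.g. a snapshot) can be deleted while
        // the channel is open; an old-style RandomAccessFile handle blocks that.
        try (FileChannel channel = FileChannel.open(Paths.get(args[0]), StandardOpenOption.READ))
        {
            ByteBuffer buffer = ByteBuffer.allocate(4096);
            channel.read(buffer, 0); // positional read; no shared file-pointer state
        }
    }
}
{code}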
[jira] [Commented] (CASSANDRA-6863) Incorrect read repair of range tombstones
[ https://issues.apache.org/jira/browse/CASSANDRA-6863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960467#comment-13960467 ] Jonathan Ellis commented on CASSANDRA-6863: --- The approach is sound, but I'm worried about the upgrade scenario. Consensus on irc was that we should make this a 2.1 feature, and enable it when we detect the entire cluster is on 2.1. v2 attached Incorrect read repair of range tombstones -- Key: CASSANDRA-6863 URL: https://issues.apache.org/jira/browse/CASSANDRA-6863 Project: Cassandra Issue Type: Bug Environment: 2.0 Reporter: Oleg Anastasyev Attachments: 6863-v2.txt, 6863-v2.txt, ReadRepairRangeThombstoneDiff.txt, ReadRepairsDebugLogger.txt Rows with range tombstones are read repaired for every replica if RR is triggered (this is because CF.diff() returns non-null if !isEmpty(), which in turn returns false if the range tombstone list is not empty). Also, the full range tombstone list is sent to all nodes, which could be a problem if you have a wide partition. Fixed this by evaluating the diff on range tombstone lists as well as on the deletionInfo of the endpoint CF versions. Also return null from CF.diff if there is no diff in the RTL. A second patch (ReadRepairsDebugLogger.txt) adds some debug logging to look at read repairs. You may find it useful as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-4050) Rewrite RandomAccessReader to use FileChannel / nio to address Windows file access violations
[ https://issues.apache.org/jira/browse/CASSANDRA-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960469#comment-13960469 ] Joshua McKenzie commented on CASSANDRA-4050: I'll rebase your branch against trunk and post a revised patch early next week. I know how much you love rebasing and I figure I owe you one for the house-cleaning on this patch. ;) Rewrite RandomAccessReader to use FileChannel / nio to address Windows file access violations - Key: CASSANDRA-4050 URL: https://issues.apache.org/jira/browse/CASSANDRA-4050 Project: Cassandra Issue Type: Bug Environment: Windows 7 Reporter: Jim Newsham Assignee: Joshua McKenzie Priority: Minor Attachments: CASSANDRA-4050_v1.patch On Windows w/older java I/O libraries the files are not opened with FILE_SHARE_DELETE. This causes problems as hard-links cannot be deleted while the original file is opened - our snapshots are a big problem in particular. The nio library and FileChannels open with FILE_SHARE_DELETE which should help remedy this problem. Original text: I'm using Cassandra 1.0.8, on Windows 7. When I take a snapshot of the database, I find that I am unable to delete the snapshot directory (i.e., dir named {datadir}\{keyspacename}\snapshots\{snapshottag}) while Cassandra is running: The action can't be completed because the folder or a file in it is open in another program. Close the folder or file and try again [in Windows Explorer]. If I terminate Cassandra, then I can delete the directory with no problem. I expect to be able to move or delete the snapshotted files while Cassandra is running, as this should not affect the runtime operation of Cassandra. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6283) Windows 7 data files kept open / can't be deleted after compaction.
[ https://issues.apache.org/jira/browse/CASSANDRA-6283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960479#comment-13960479 ] Joshua McKenzie commented on CASSANDRA-6283: On CASSANDRA-4050 we're converting our RandomAccessReader to use nio, which should fix the "can't delete a hard-link while the original file is open" problem for most use-cases. Unfortunately you cannot delete hard-linked files on Windows if you have a memory-mapped segment in the original file - I've done some benchmarking on CASSANDRA-6890 regarding removing memory-mapped I/O, and the performance cost / feature loss is high enough that we're going to keep it for now. I'll put together a patch for this ticket to create something similar to an SSTableDeletingTask for a snapshot folder - walk the files and try to delete them, re-scheduling a job to try and clear this folder again after a GC if there are any failures due to access violations. That combined with CASSANDRA-4050 should give us immediate and full clearing on compressed cfs, and partial / incrementally improving clearing on snapshots where there are memory-mapped readers into the original sstables. I don't like having partially cleared out snapshots floating around on the file-system though. I'd guess this will cause some confusion for people in the future. Windows 7 data files kept open / can't be deleted after compaction. Key: CASSANDRA-6283 URL: https://issues.apache.org/jira/browse/CASSANDRA-6283 Project: Cassandra Issue Type: Bug Components: Core Environment: Windows 7 (32) / Java 1.7.0.45 Reporter: Andreas Schnitzerling Assignee: Joshua McKenzie Labels: compaction Fix For: 2.0.7 Attachments: 6283_StreamWriter_patch.txt, leakdetect.patch, neighbor-log.zip, root-log.zip, screenshot-1.jpg, system.log Files cannot be deleted; the patch from CASSANDRA-5383 (Win7 deleting problem) doesn't help on Windows 7 on Cassandra 2.0.2. Even the 2.1 snapshot is not working. The cause: opened file handles seem to be lost and not closed properly. Windows 7 reports that another process is still using the file (but it's obviously Cassandra). Only a restart of the server lets the files be deleted. But after heavy use (changes) of tables, there are about 24K files in the data folder (instead of 35 after every restart) and Cassandra crashes. I experimented and found that a finalizer fixes the problem. So after GC the files will be deleted (not optimal, but working fine). It has now run 2 days continuously without problems. Possible fix/test: I wrote the following finalizer at the end of class org.apache.cassandra.io.util.RandomAccessReader:
{code:title=RandomAccessReader.java|borderStyle=solid}
@Override
protected void finalize() throws Throwable
{
    deallocate();
    super.finalize();
}
{code}
Can somebody test / develop / patch it? Thx. -- This message was sent by Atlassian JIRA (v6.2#6252)
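Joshua's proposed patch amounts to a best-effort delete task that reschedules itself when handles are still pinned. A hedged sketch of that shape; the class name, executor, and retry delay are all illustrative, not from the actual patch:
{code}
import java.io.File;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SnapshotDeletingTask implements Runnable
{
    private static final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();

    private final File snapshotDir;

    public SnapshotDeletingTask(File snapshotDir)
    {
        this.snapshotDir = snapshotDir;
    }

    @Override
    public void run()
    {
        boolean allDeleted = true;
        File[] files = snapshotDir.listFiles();
        if (files != null)
            for (File f : files)
                allDeleted &= f.delete(); // fails on Windows while a mapped reader pins the file

        if (allDeleted)
            snapshotDir.delete();
        else
            executor.schedule(this, 30, TimeUnit.SECONDS); // retry later, e.g. after a GC
    }
}
{code}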
[jira] [Commented] (CASSANDRA-6934) Optimise Byte + CellName comparisons
[ https://issues.apache.org/jira/browse/CASSANDRA-6934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960546#comment-13960546 ] Benedict commented on CASSANDRA-6934: - I decided to put in some of the extra optimisations now after all - whenever the clustering/ordering components of a type all support unsigned comparison, we now avoid almost all virtual method calls. This gives a 10-20% bump in some very quick and dirty tests on my box versus the prior optimisation, and probably has a larger effect on clustering columns (which still can't easily be benchmarked, but I will fix that next week). Optimise Byte + CellName comparisons Key: CASSANDRA-6934 URL: https://issues.apache.org/jira/browse/CASSANDRA-6934 Project: Cassandra Issue Type: Improvement Reporter: Benedict Assignee: Benedict Labels: performance Fix For: 2.1 AbstractCompositeType is called a lot, so it deserves some heavy optimisation. SimpleCellNameType can be optimised easily, but we should explore other potential optimisations. -- This message was sent by Atlassian JIRA (v6.2#6252)
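For context: a type "supports unsigned comparison" when ordering its serialized form as unsigned bytes agrees with the type's own comparator, which lets a whole composite be compared in one tight loop with no per-component virtual dispatch. A minimal standalone sketch of that core loop (the real code operates on ByteBuffers and is considerably more involved):
{code:title=UnsignedCompareSketch.java|borderStyle=solid}
public final class UnsignedCompareSketch
{
    // Lexicographic comparison treating each byte as unsigned (0..255).
    public static int compareUnsigned(byte[] a, byte[] b)
    {
        int minLength = Math.min(a.length, b.length);
        for (int i = 0; i < minLength; i++)
        {
            // Mask to compare as unsigned rather than signed -128..127.
            int cmp = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (cmp != 0)
                return cmp;
        }
        return a.length - b.length; // shorter prefix sorts first
    }
}
{code}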
[jira] [Created] (CASSANDRA-6984) NullPointerException in Streaming During Repair
Tyler Hobbs created CASSANDRA-6984: -- Summary: NullPointerException in Streaming During Repair Key: CASSANDRA-6984 URL: https://issues.apache.org/jira/browse/CASSANDRA-6984 Project: Cassandra Issue Type: Bug Components: Core Reporter: Tyler Hobbs Assignee: Yuki Morishita In cassandra-2.0, I can trigger a NullPointerException with a repair. These steps should reproduce the issue: * create a three node ccm cluster (with vnodes) * start a stress write (I'm using {{tools/bin/cassandra-stress --replication-factor=3 -n 1000 -k -t 1}}) * stop node3 while stress is running, then wait a minute * start node 3 * run ccm node3 repair In the logs for node1, I see this: {noformat} ERROR [STREAM-OUT-/127.0.0.3] 2014-04-04 17:40:08,547 CassandraDaemon.java (line 198) Exception in thread Thread[STREAM-OUT-/127.0.0.3,5,main] java.lang.NullPointerException at org.apache.cassandra.streaming.ConnectionHandler$MessageHandler.signalCloseDone(ConnectionHandler.java:249) at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:375) at java.lang.Thread.run(Thread.java:724) {noformat} After applying Yuki's suggested patch: {noformat} diff --git a/src/java/org/apache/cassandra/streaming/ConnectionHandler.java b/src/java/org/apache/cassandra/streaming/ConnectionHandler.java index 356138b..b06a818 100644 --- a/src/java/org/apache/cassandra/streaming/ConnectionHandler.java +++ b/src/java/org/apache/cassandra/streaming/ConnectionHandler.java @@ -366,7 +366,7 @@ public class ConnectionHandler { throw new AssertionError(e); } -catch (IOException e) +catch (Throwable e) { session.onError(e); } {noformat} I see a new NPE: {noformat} ERROR [STREAM-OUT-/127.0.0.3] 2014-04-04 18:12:35,912 StreamSession.java (line 420) [Stream #9b592af0-bc4e-11e3-a6f9-43eb3a328df9] Streaming error occurred java.lang.NullPointerException at org.apache.cassandra.streaming.StreamSession.fileSent(StreamSession.java:465) at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:60) at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:42) at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45) at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:383) at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:355) at java.lang.Thread.run(Thread.java:724) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
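The repro steps above, condensed into a shell session for convenience (assuming ccm is on the PATH and a cassandra-2.0 checkout; exact option spellings may vary across ccm versions):
{noformat}
ccm create repro -v git:cassandra-2.0 -n 3 --vnodes -s    # three-node cluster with vnodes
tools/bin/cassandra-stress --replication-factor=3 -n 1000 -k -t 1 &
ccm node3 stop && sleep 60                                # stop node3 mid-stress, wait a minute
ccm node3 start
ccm node3 repair                                          # node1's log should show the NPE
{noformat}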
[jira] [Updated] (CASSANDRA-6984) NullPointerException in Streaming During Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-6984: Fix Version/s: 2.0.7 NullPointerException in Streaming During Repair --- Key: CASSANDRA-6984 URL: https://issues.apache.org/jira/browse/CASSANDRA-6984 Project: Cassandra Issue Type: Bug Components: Core Reporter: Tyler Hobbs Assignee: Yuki Morishita Fix For: 2.0.7 In cassandra-2.0, I can trigger a NullPointerException with a repair. These steps should reproduce the issue: * create a three node ccm cluster (with vnodes) * start a stress write (I'm using {{tools/bin/cassandra-stress --replication-factor=3 -n 1000 -k -t 1}}) * stop node3 while stress is running, then wait a minute * start node 3 * run ccm node3 repair In the logs for node1, I see this: {noformat} ERROR [STREAM-OUT-/127.0.0.3] 2014-04-04 17:40:08,547 CassandraDaemon.java (line 198) Exception in thread Thread[STREAM-OUT-/127.0.0.3,5,main] java.lang.NullPointerException at org.apache.cassandra.streaming.ConnectionHandler$MessageHandler.signalCloseDone(ConnectionHandler.java:249) at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:375) at java.lang.Thread.run(Thread.java:724) {noformat} After applying Yuki's suggested patch: {noformat} diff --git a/src/java/org/apache/cassandra/streaming/ConnectionHandler.java b/src/java/org/apache/cassandra/streaming/ConnectionHandler.java index 356138b..b06a818 100644 --- a/src/java/org/apache/cassandra/streaming/ConnectionHandler.java +++ b/src/java/org/apache/cassandra/streaming/ConnectionHandler.java @@ -366,7 +366,7 @@ public class ConnectionHandler { throw new AssertionError(e); } -catch (IOException e) +catch (Throwable e) { session.onError(e); } {noformat} I see a new NPE: {noformat} ERROR [STREAM-OUT-/127.0.0.3] 2014-04-04 18:12:35,912 StreamSession.java (line 420) [Stream #9b592af0-bc4e-11e3-a6f9-43eb3a328df9] Streaming error occurred java.lang.NullPointerException at org.apache.cassandra.streaming.StreamSession.fileSent(StreamSession.java:465) at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:60) at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:42) at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45) at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:383) at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:355) at java.lang.Thread.run(Thread.java:724) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-6984) NullPointerException in Streaming During Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-6984: Priority: Blocker (was: Major) NullPointerException in Streaming During Repair --- Key: CASSANDRA-6984 URL: https://issues.apache.org/jira/browse/CASSANDRA-6984 Project: Cassandra Issue Type: Bug Components: Core Reporter: Tyler Hobbs Assignee: Yuki Morishita Priority: Blocker Fix For: 2.0.7 In cassandra-2.0, I can trigger a NullPointerException with a repair. These steps should reproduce the issue: * create a three node ccm cluster (with vnodes) * start a stress write (I'm using {{tools/bin/cassandra-stress --replication-factor=3 -n 1000 -k -t 1}}) * stop node3 while stress is running, then wait a minute * start node 3 * run ccm node3 repair In the logs for node1, I see this: {noformat} ERROR [STREAM-OUT-/127.0.0.3] 2014-04-04 17:40:08,547 CassandraDaemon.java (line 198) Exception in thread Thread[STREAM-OUT-/127.0.0.3,5,main] java.lang.NullPointerException at org.apache.cassandra.streaming.ConnectionHandler$MessageHandler.signalCloseDone(ConnectionHandler.java:249) at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:375) at java.lang.Thread.run(Thread.java:724) {noformat} After applying Yuki's suggested patch: {noformat} diff --git a/src/java/org/apache/cassandra/streaming/ConnectionHandler.java b/src/java/org/apache/cassandra/streaming/ConnectionHandler.java index 356138b..b06a818 100644 --- a/src/java/org/apache/cassandra/streaming/ConnectionHandler.java +++ b/src/java/org/apache/cassandra/streaming/ConnectionHandler.java @@ -366,7 +366,7 @@ public class ConnectionHandler { throw new AssertionError(e); } -catch (IOException e) +catch (Throwable e) { session.onError(e); } {noformat} I see a new NPE: {noformat} ERROR [STREAM-OUT-/127.0.0.3] 2014-04-04 18:12:35,912 StreamSession.java (line 420) [Stream #9b592af0-bc4e-11e3-a6f9-43eb3a328df9] Streaming error occurred java.lang.NullPointerException at org.apache.cassandra.streaming.StreamSession.fileSent(StreamSession.java:465) at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:60) at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:42) at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45) at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:383) at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:355) at java.lang.Thread.run(Thread.java:724) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6961) nodes should go into hibernate when join_ring is false
[ https://issues.apache.org/jira/browse/CASSANDRA-6961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960799#comment-13960799 ] Tyler Hobbs commented on CASSANDRA-6961: CASSANDRA-6984 was the cause of the hung repair. nodes should go into hibernate when join_ring is false -- Key: CASSANDRA-6961 URL: https://issues.apache.org/jira/browse/CASSANDRA-6961 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Brandon Williams Assignee: Brandon Williams Fix For: 2.0.7 Attachments: 6961.txt The impetus here is this: a node that was down for some period and comes back can serve stale information. We know from CASSANDRA-768 that we can't just wait for hints, and know that tangentially related CASSANDRA-3569 prevents us from having the node in a down (from the FD's POV) state handle streaming. We can *almost* set join_ring to false, then repair, and then join the ring to narrow the window (actually, you can do this and everything succeeds because the node doesn't know it's a member yet, which is probably a bit of a bug.) If instead we modified this to put the node in hibernate, like replace_address does, it could work almost like replace, except you could run a repair (manually) while in the hibernate state, and then flip to normal when it's done. This won't prevent the staleness 100%, but it will greatly reduce the chance if the node has been down a significant amount of time. -- This message was sent by Atlassian JIRA (v6.2#6252)
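The proposed operator flow, sketched as a shell session - the join_ring flag and nodetool join already exist; the hibernate state is what this ticket would add:
{noformat}
bin/cassandra -Dcassandra.join_ring=false   # start without joining (would hibernate, per this ticket)
bin/nodetool repair                         # repair manually while not yet a normal member
bin/nodetool join                           # flip to normal once the repair completes
{noformat}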
[jira] [Commented] (CASSANDRA-6971) nodes not catching up to creation of new keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-6971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960834#comment-13960834 ] Russ Hatch commented on CASSANDRA-6971: --- Here's the gossipinfo output. This is several minutes after the test failed. Schema uuid's still mismatch. {noformat} rhatch@whatup:/tmp/dtest-VZ3n7v/test/node1$ bin/nodetool -p 7100 gossipinfo /127.0.0.2 DC:datacenter1 HOST_ID:8929d0a0-5a4f-4a6f-85c3-665bb3aaf140 SCHEMA:19969d6d-daaa-328f-ade0-8640043e37b9 NET_VERSION:6 RPC_ADDRESS:127.0.0.2 RACK:rack1 LOAD:52150.0 STATUS:NORMAL,-3074457345618258603 RELEASE_VERSION:1.2.0-SNAPSHOT /127.0.0.3 DC:datacenter1 SCHEMA:59adb24e-f3cd-3e02-97f0-5b395827453f HOST_ID:864a00d4-f661-421e-81c3-90e7b8f90ef1 NET_VERSION:6 RPC_ADDRESS:127.0.0.3 RACK:rack1 LOAD:14361.0 STATUS:NORMAL,3074457345618258602 RELEASE_VERSION:1.2.0-SNAPSHOT /127.0.0.1 DC:datacenter1 HOST_ID:06bbda3d-5265-4134-8248-11e0a2ddf798 RPC_ADDRESS:127.0.0.1 NET_VERSION:6 SCHEMA:19969d6d-daaa-328f-ade0-8640043e37b9 RACK:rack1 LOAD:52153.0 STATUS:NORMAL,-9223372036854775808 RELEASE_VERSION:1.2.0-SNAPSHOT rhatch@whatup:/tmp/dtest-VZ3n7v/test/node1$ bin/nodetool -p 7200 gossipinfo /127.0.0.2 HOST_ID:8929d0a0-5a4f-4a6f-85c3-665bb3aaf140 SCHEMA:19969d6d-daaa-328f-ade0-8640043e37b9 NET_VERSION:6 LOAD:52150.0 RPC_ADDRESS:127.0.0.2 RACK:rack1 DC:datacenter1 RELEASE_VERSION:1.2.0-SNAPSHOT STATUS:NORMAL,-3074457345618258603 /127.0.0.3 HOST_ID:864a00d4-f661-421e-81c3-90e7b8f90ef1 SCHEMA:59adb24e-f3cd-3e02-97f0-5b395827453f NET_VERSION:6 LOAD:14361.0 RPC_ADDRESS:127.0.0.3 RACK:rack1 DC:datacenter1 RELEASE_VERSION:1.2.0-SNAPSHOT STATUS:NORMAL,3074457345618258602 /127.0.0.1 HOST_ID:06bbda3d-5265-4134-8248-11e0a2ddf798 SCHEMA:19969d6d-daaa-328f-ade0-8640043e37b9 NET_VERSION:6 LOAD:52153.0 RPC_ADDRESS:127.0.0.1 RACK:rack1 DC:datacenter1 RELEASE_VERSION:1.2.0-SNAPSHOT STATUS:NORMAL,-9223372036854775808 rhatch@whatup:/tmp/dtest-VZ3n7v/test/node1$ bin/nodetool -p 7300 gossipinfo /127.0.0.2 NET_VERSION:6 RELEASE_VERSION:1.2.0-SNAPSHOT DC:datacenter1 SCHEMA:19969d6d-daaa-328f-ade0-8640043e37b9 HOST_ID:8929d0a0-5a4f-4a6f-85c3-665bb3aaf140 LOAD:52150.0 STATUS:NORMAL,-3074457345618258603 RACK:rack1 RPC_ADDRESS:127.0.0.2 /127.0.0.3 NET_VERSION:6 RELEASE_VERSION:1.2.0-SNAPSHOT DC:datacenter1 SCHEMA:59adb24e-f3cd-3e02-97f0-5b395827453f HOST_ID:864a00d4-f661-421e-81c3-90e7b8f90ef1 LOAD:14361.0 STATUS:NORMAL,3074457345618258602 RACK:rack1 RPC_ADDRESS:127.0.0.3 /127.0.0.1 NET_VERSION:6 RELEASE_VERSION:1.2.0-SNAPSHOT DC:datacenter1 SCHEMA:19969d6d-daaa-328f-ade0-8640043e37b9 HOST_ID:06bbda3d-5265-4134-8248-11e0a2ddf798 LOAD:52153.0 STATUS:NORMAL,-9223372036854775808 RACK:rack1 RPC_ADDRESS:127.0.0.1 {noformat} nodes not catching up to creation of new keyspace - Key: CASSANDRA-6971 URL: https://issues.apache.org/jira/browse/CASSANDRA-6971 Project: Cassandra Issue Type: Bug Reporter: Russ Hatch Attachments: node1.log, node2.log, node3.log The dtest suite is running a test which creates a 3 node cluster, then adds a keyspace and column family. For some reason the 3 nodes are not agreeing on the schema version. The problem is intermittent -- either the nodes all agree on schema quickly, or they seem to stay stuck in limbo. 
The simplest way to reproduce is to run the dtest (simple_increment_test): https://github.com/riptano/cassandra-dtest/blob/master/counter_tests.py using nosetests: {noformat} nosetests -vs counter_tests.py:TestCounters.simple_increment_test {noformat} If the problem is reproduced nose will return this: ProgrammingError: Bad Request: Keyspace 'ks' does not exist I am not yet sure if the bug is reproducible outside of the dtest suite. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6971) nodes not catching up to creation of new keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-6971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960844#comment-13960844 ] Brandon Williams commented on CASSANDRA-6971: - Ok, so we know the problem is in the 'passive' detection. That narrows it down, thanks. nodes not catching up to creation of new keyspace - Key: CASSANDRA-6971 URL: https://issues.apache.org/jira/browse/CASSANDRA-6971 Project: Cassandra Issue Type: Bug Reporter: Russ Hatch Attachments: node1.log, node1.log, node2.log, node2.log, node3.log, node3.log The dtest suite is running a test which creates a 3 node cluster, then adds a keyspace and column family. For some reason the 3 nodes are not agreeing on the schema version. The problem is intermittent -- either the nodes all agree on schema quickly, or they seem to stay stuck in limbo. The simplest way to reproduce is to run the dtest (simple_increment_test): https://github.com/riptano/cassandra-dtest/blob/master/counter_tests.py using nosetests: {noformat} nosetests -vs counter_tests.py:TestCounters.simple_increment_test {noformat} If the problem is reproduced nose will return this: ProgrammingError: Bad Request: Keyspace 'ks' does not exist I am not yet sure if the bug is reproducible outside of the dtest suite. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-6971) nodes not catching up to creation of new keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-6971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russ Hatch updated CASSANDRA-6971: -- Attachment: node3.log node2.log node1.log attaching logs with debug output. nodes not catching up to creation of new keyspace - Key: CASSANDRA-6971 URL: https://issues.apache.org/jira/browse/CASSANDRA-6971 Project: Cassandra Issue Type: Bug Reporter: Russ Hatch Attachments: node1.log, node1.log, node2.log, node2.log, node3.log, node3.log The dtest suite is running a test which creates a 3 node cluster, then adds a keyspace and column family. For some reason the 3 nodes are not agreeing on the schema version. The problem is intermittent -- either the nodes all agree on schema quickly, or they seem to stay stuck in limbo. The simplest way to reproduce is to run the dtest (simple_increment_test): https://github.com/riptano/cassandra-dtest/blob/master/counter_tests.py using nosetests: {noformat} nosetests -vs counter_tests.py:TestCounters.simple_increment_test {noformat} If the problem is reproduced nose will return this: ProgrammingError: Bad Request: Keyspace 'ks' does not exist I am not yet sure if the bug is reproducible outside of the dtest suite. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-6971) nodes not catching up to creation of new keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-6971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-6971: Attachment: 6971-debugging.txt Can you get debug logs with this extra debugging patch applied? nodes not catching up to creation of new keyspace - Key: CASSANDRA-6971 URL: https://issues.apache.org/jira/browse/CASSANDRA-6971 Project: Cassandra Issue Type: Bug Reporter: Russ Hatch Attachments: 6971-debugging.txt, node1.log, node1.log, node2.log, node2.log, node3.log, node3.log The dtest suite is running a test which creates a 3 node cluster, then adds a keyspace and column family. For some reason the 3 nodes are not agreeing on the schema version. The problem is intermittent -- either the nodes all agree on schema quickly, or they seem to stay stuck in limbo. The simplest way to reproduce is to run the dtest (simple_increment_test): https://github.com/riptano/cassandra-dtest/blob/master/counter_tests.py using nosetests: {noformat} nosetests -vs counter_tests.py:TestCounters.simple_increment_test {noformat} If the problem is reproduced nose will return this: ProgrammingError: Bad Request: Keyspace 'ks' does not exist I am not yet sure if the bug is reproducible outside of the dtest suite. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960864#comment-13960864 ] Pavel Yaskevich commented on CASSANDRA-6694: Sorry guys, I've been busy with multiple things this week; I will try to take a look at this over the weekend. Slightly More Off-Heap Memtables Key: CASSANDRA-6694 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Labels: performance Fix For: 2.1 beta2 The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as the on-heap overhead is still very large. It should not be tremendously difficult to extend these changes so that we allocate entire Cells off-heap, instead of multiple BBs per Cell (with all their associated overhead). The goal (if possible) is to reach an overhead of 16 bytes per Cell (plus 4-6 bytes per cell on average for the btree overhead, for a total overhead of around 20-22 bytes). This translates to 8-byte object overhead, a 4-byte address (we will do alignment tricks like the VM to allow us to address a reasonably large memory space, although this trick is unlikely to last us forever, at which point we will have to bite the bullet and accept a 24-byte per cell overhead), and a 4-byte object reference for maintaining our internal list of allocations, which is unfortunately necessary since we cannot safely (and cheaply) walk the object graph we allocate otherwise, which is necessary for (allocation-) compaction and pointer rewriting. The ugliest thing here is going to be implementing the various CellName instances so that they may be backed by native memory OR heap memory. -- This message was sent by Atlassian JIRA (v6.2#6252)
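A toy illustration of the "4-byte address via alignment tricks" idea; the 16-byte alignment and everything else below are assumptions for exposition, not the ticket's implementation:
{code:title=CompressedAddressSketch.java|borderStyle=solid}
public final class CompressedAddressSketch
{
    // If every allocation is 16-byte aligned, the low 4 bits of an address are
    // always zero, so a 32-bit reference can span 2^32 * 16 = 64GB of memory.
    private static final int ALIGNMENT_SHIFT = 4;

    static int encode(long address)
    {
        assert (address & ((1L << ALIGNMENT_SHIFT) - 1)) == 0 : "unaligned address";
        return (int) (address >>> ALIGNMENT_SHIFT);
    }

    static long decode(int reference)
    {
        // Widen without sign extension, then restore the dropped low bits.
        return (reference & 0xFFFFFFFFL) << ALIGNMENT_SHIFT;
    }
}
{code}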
[jira] [Commented] (CASSANDRA-6971) nodes not catching up to creation of new keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-6971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960884#comment-13960884 ] Russ Hatch commented on CASSANDRA-6971: --- weird, with the logging patch I didn't see any output from MigrationManager. I did see this exception though (maybe it was occurring before and I just didn't notice): {noformat} DEBUG [Thrift:2] 2014-04-04 19:03:59,051 CustomTThreadPoolServer.java (line 209) Thrift transport error occurred during processing of message. org.apache.thrift.transport.TTransportException at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129) at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378) at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297) at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:22) at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:199) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) {noformat} nodes not catching up to creation of new keyspace - Key: CASSANDRA-6971 URL: https://issues.apache.org/jira/browse/CASSANDRA-6971 Project: Cassandra Issue Type: Bug Reporter: Russ Hatch Attachments: 6971-debugging.txt, node1.log, node1.log, node2.log, node2.log, node3.log, node3.log The dtest suite is running a test which creates a 3 node cluster, then adds a keyspace and column family. For some reason the 3 nodes are not agreeing on the schema version. The problem is intermittent -- either the nodes all agree on schema quickly, or they seem to stay stuck in limbo. The simplest way to reproduce is to run the dtest (simple_increment_test): https://github.com/riptano/cassandra-dtest/blob/master/counter_tests.py using nosetests: {noformat} nosetests -vs counter_tests.py:TestCounters.simple_increment_test {noformat} If the problem is reproduced nose will return this: ProgrammingError: Bad Request: Keyspace 'ks' does not exist I am not yet sure if the bug is reproducible outside of the dtest suite. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6984) NullPointerException in Streaming During Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960891#comment-13960891 ] Jack Krupansky commented on CASSANDRA-6984: --- Is there any suggested workaround before a patch becomes generally available? Such as some other repair or rebuild sequence or parameters? Here's a SO user who appears to be hitting this with DataStax Enterprise, which uses C* 2.0. http://stackoverflow.com/questions/22837895/restarting-a-failed-stalled-stream-during-bootstrap-of-new-node NullPointerException in Streaming During Repair --- Key: CASSANDRA-6984 URL: https://issues.apache.org/jira/browse/CASSANDRA-6984 Project: Cassandra Issue Type: Bug Components: Core Reporter: Tyler Hobbs Assignee: Yuki Morishita Priority: Blocker Fix For: 2.0.7 In cassandra-2.0, I can trigger a NullPointerException with a repair. These steps should reproduce the issue: * create a three node ccm cluster (with vnodes) * start a stress write (I'm using {{tools/bin/cassandra-stress --replication-factor=3 -n 1000 -k -t 1}}) * stop node3 while stress is running, then wait a minute * start node 3 * run ccm node3 repair In the logs for node1, I see this: {noformat} ERROR [STREAM-OUT-/127.0.0.3] 2014-04-04 17:40:08,547 CassandraDaemon.java (line 198) Exception in thread Thread[STREAM-OUT-/127.0.0.3,5,main] java.lang.NullPointerException at org.apache.cassandra.streaming.ConnectionHandler$MessageHandler.signalCloseDone(ConnectionHandler.java:249) at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:375) at java.lang.Thread.run(Thread.java:724) {noformat} After applying Yuki's suggested patch: {noformat} diff --git a/src/java/org/apache/cassandra/streaming/ConnectionHandler.java b/src/java/org/apache/cassandra/streaming/ConnectionHandler.java index 356138b..b06a818 100644 --- a/src/java/org/apache/cassandra/streaming/ConnectionHandler.java +++ b/src/java/org/apache/cassandra/streaming/ConnectionHandler.java @@ -366,7 +366,7 @@ public class ConnectionHandler { throw new AssertionError(e); } -catch (IOException e) +catch (Throwable e) { session.onError(e); } {noformat} I see a new NPE: {noformat} ERROR [STREAM-OUT-/127.0.0.3] 2014-04-04 18:12:35,912 StreamSession.java (line 420) [Stream #9b592af0-bc4e-11e3-a6f9-43eb3a328df9] Streaming error occurred java.lang.NullPointerException at org.apache.cassandra.streaming.StreamSession.fileSent(StreamSession.java:465) at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:60) at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:42) at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45) at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:383) at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:355) at java.lang.Thread.run(Thread.java:724) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6971) nodes not catching up to creation of new keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-6971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960892#comment-13960892 ] Brandon Williams commented on CASSANDRA-6971: - That's unrelated and it's at debug for a reason, it just means a client dropped the connection on us. nodes not catching up to creation of new keyspace - Key: CASSANDRA-6971 URL: https://issues.apache.org/jira/browse/CASSANDRA-6971 Project: Cassandra Issue Type: Bug Reporter: Russ Hatch Attachments: 6971-debugging.txt, node1.log, node1.log, node2.log, node2.log, node3.log, node3.log The dtest suite is running a test which creates a 3 node cluster, then adds a keyspace and column family. For some reason the 3 nodes are not agreeing on the schema version. The problem is intermittent -- either the nodes all agree on schema quickly, or they seem to stay stuck in limbo. The simplest way to reproduce is to run the dtest (simple_increment_test): https://github.com/riptano/cassandra-dtest/blob/master/counter_tests.py using nosetests: {noformat} nosetests -vs counter_tests.py:TestCounters.simple_increment_test {noformat} If the problem is reproduced nose will return this: ProgrammingError: Bad Request: Keyspace 'ks' does not exist I am not yet sure if the bug is reproducible outside of the dtest suite. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6984) NullPointerException in Streaming During Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960893#comment-13960893 ] Brandon Williams commented on CASSANDRA-6984: - Not really, it's a streaming problem. NullPointerException in Streaming During Repair --- Key: CASSANDRA-6984 URL: https://issues.apache.org/jira/browse/CASSANDRA-6984 Project: Cassandra Issue Type: Bug Components: Core Reporter: Tyler Hobbs Assignee: Yuki Morishita Priority: Blocker Fix For: 2.0.7 In cassandra-2.0, I can trigger a NullPointerException with a repair. These steps should reproduce the issue: * create a three node ccm cluster (with vnodes) * start a stress write (I'm using {{tools/bin/cassandra-stress --replication-factor=3 -n 1000 -k -t 1}}) * stop node3 while stress is running, then wait a minute * start node 3 * run ccm node3 repair In the logs for node1, I see this: {noformat} ERROR [STREAM-OUT-/127.0.0.3] 2014-04-04 17:40:08,547 CassandraDaemon.java (line 198) Exception in thread Thread[STREAM-OUT-/127.0.0.3,5,main] java.lang.NullPointerException at org.apache.cassandra.streaming.ConnectionHandler$MessageHandler.signalCloseDone(ConnectionHandler.java:249) at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:375) at java.lang.Thread.run(Thread.java:724) {noformat} After applying Yuki's suggested patch: {noformat} diff --git a/src/java/org/apache/cassandra/streaming/ConnectionHandler.java b/src/java/org/apache/cassandra/streaming/ConnectionHandler.java index 356138b..b06a818 100644 --- a/src/java/org/apache/cassandra/streaming/ConnectionHandler.java +++ b/src/java/org/apache/cassandra/streaming/ConnectionHandler.java @@ -366,7 +366,7 @@ public class ConnectionHandler { throw new AssertionError(e); } -catch (IOException e) +catch (Throwable e) { session.onError(e); } {noformat} I see a new NPE: {noformat} ERROR [STREAM-OUT-/127.0.0.3] 2014-04-04 18:12:35,912 StreamSession.java (line 420) [Stream #9b592af0-bc4e-11e3-a6f9-43eb3a328df9] Streaming error occurred java.lang.NullPointerException at org.apache.cassandra.streaming.StreamSession.fileSent(StreamSession.java:465) at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:60) at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:42) at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45) at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:383) at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:355) at java.lang.Thread.run(Thread.java:724) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6984) NullPointerException in Streaming During Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13960899#comment-13960899 ] Yuki Morishita commented on CASSANDRA-6984: --- From the stacktrace above, this is caused by CASSANDRA-6818, which has not been released yet. I cannot tell what the user on SO is hitting, since exceptions other than IOException are hidden by the NPE in ConnectionHandler. NullPointerException in Streaming During Repair --- Key: CASSANDRA-6984 URL: https://issues.apache.org/jira/browse/CASSANDRA-6984 Project: Cassandra Issue Type: Bug Components: Core Reporter: Tyler Hobbs Assignee: Yuki Morishita Priority: Blocker Fix For: 2.0.7 In cassandra-2.0, I can trigger a NullPointerException with a repair. These steps should reproduce the issue: * create a three node ccm cluster (with vnodes) * start a stress write (I'm using {{tools/bin/cassandra-stress --replication-factor=3 -n 1000 -k -t 1}}) * stop node3 while stress is running, then wait a minute * start node 3 * run ccm node3 repair In the logs for node1, I see this: {noformat} ERROR [STREAM-OUT-/127.0.0.3] 2014-04-04 17:40:08,547 CassandraDaemon.java (line 198) Exception in thread Thread[STREAM-OUT-/127.0.0.3,5,main] java.lang.NullPointerException at org.apache.cassandra.streaming.ConnectionHandler$MessageHandler.signalCloseDone(ConnectionHandler.java:249) at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:375) at java.lang.Thread.run(Thread.java:724) {noformat} After applying Yuki's suggested patch: {noformat} diff --git a/src/java/org/apache/cassandra/streaming/ConnectionHandler.java b/src/java/org/apache/cassandra/streaming/ConnectionHandler.java index 356138b..b06a818 100644 --- a/src/java/org/apache/cassandra/streaming/ConnectionHandler.java +++ b/src/java/org/apache/cassandra/streaming/ConnectionHandler.java @@ -366,7 +366,7 @@ public class ConnectionHandler { throw new AssertionError(e); } -catch (IOException e) +catch (Throwable e) { session.onError(e); } {noformat} I see a new NPE: {noformat} ERROR [STREAM-OUT-/127.0.0.3] 2014-04-04 18:12:35,912 StreamSession.java (line 420) [Stream #9b592af0-bc4e-11e3-a6f9-43eb3a328df9] Streaming error occurred java.lang.NullPointerException at org.apache.cassandra.streaming.StreamSession.fileSent(StreamSession.java:465) at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:60) at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:42) at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45) at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:383) at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:355) at java.lang.Thread.run(Thread.java:724) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-6971) nodes not catching up to creation of new keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-6971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russ Hatch updated CASSANDRA-6971: -- Attachment: debug3.log debug2.log debug1.log attaching logs with debug patch (from a test run when the problem happened of course). nodes not catching up to creation of new keyspace - Key: CASSANDRA-6971 URL: https://issues.apache.org/jira/browse/CASSANDRA-6971 Project: Cassandra Issue Type: Bug Reporter: Russ Hatch Attachments: 6971-debugging.txt, debug1.log, debug2.log, debug3.log, node1.log, node1.log, node2.log, node2.log, node3.log, node3.log The dtest suite is running a test which creates a 3 node cluster, then adds a keyspace and column family. For some reason the 3 nodes are not agreeing on the schema version. The problem is intermittent -- either the nodes all agree on schema quickly, or they seem to stay stuck in limbo. The simplest way to reproduce is to run the dtest (simple_increment_test): https://github.com/riptano/cassandra-dtest/blob/master/counter_tests.py using nosetests: {noformat} nosetests -vs counter_tests.py:TestCounters.simple_increment_test {noformat} If the problem is reproduced nose will return this: ProgrammingError: Bad Request: Keyspace 'ks' does not exist I am not yet sure if the bug is reproducible outside of the dtest suite. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-6971) nodes not catching up to creation of new keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-6971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-6971: Attachment: 6971.txt My guess is that passiveAnnounce hasn't yet been called when the onAlive/onRestart events fire close together, so we need to also check onChange. Patch to do so, plus the debugging. nodes not catching up to creation of new keyspace - Key: CASSANDRA-6971 URL: https://issues.apache.org/jira/browse/CASSANDRA-6971 Project: Cassandra Issue Type: Bug Reporter: Russ Hatch Attachments: 6971-debugging.txt, 6971.txt, debug1.log, debug2.log, debug3.log, node1.log, node1.log, node2.log, node2.log, node3.log, node3.log The dtest suite is running a test which creates a 3 node cluster, then adds a keyspace and column family. For some reason the 3 nodes are not agreeing on the schema version. The problem is intermittent -- either the nodes all agree on schema quickly, or they seem to stay stuck in limbo. The simplest way to reproduce is to run the dtest (simple_increment_test): https://github.com/riptano/cassandra-dtest/blob/master/counter_tests.py using nosetests: {noformat} nosetests -vs counter_tests.py:TestCounters.simple_increment_test {noformat} If the problem is reproduced nose will return this: ProgrammingError: Bad Request: Keyspace 'ks' does not exist I am not yet sure if the bug is reproducible outside of the dtest suite. -- This message was sent by Atlassian JIRA (v6.2#6252)
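A fragment sketching the shape of that guess - this is an assumption about what 6971.txt does, not the patch itself, and requestSchemaPull is a placeholder name:
{code}
public void onChange(InetAddress endpoint, ApplicationState state, VersionedValue value)
{
    if (state != ApplicationState.SCHEMA)
        return;
    // The passive announce may have raced onAlive/onRestart, so compare
    // schema versions here too and pull if the remote one differs.
    if (!Schema.instance.getVersion().toString().equals(value.value))
        requestSchemaPull(endpoint); // placeholder for submitting a migration task
}
{code}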
[jira] [Assigned] (CASSANDRA-6971) nodes not catching up to creation of new keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-6971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams reassigned CASSANDRA-6971: --- Assignee: Brandon Williams nodes not catching up to creation of new keyspace - Key: CASSANDRA-6971 URL: https://issues.apache.org/jira/browse/CASSANDRA-6971 Project: Cassandra Issue Type: Bug Reporter: Russ Hatch Assignee: Brandon Williams Attachments: 6971-debugging.txt, 6971.txt, debug1.log, debug2.log, debug3.log, node1.log, node1.log, node2.log, node2.log, node3.log, node3.log The dtest suite is running a test which creates a 3 node cluster, then adds a keyspace and column family. For some reason the 3 nodes are not agreeing on the schema version. The problem is intermittent -- either the nodes all agree on schema quickly, or they seem to stay stuck in limbo. The simplest way to reproduce is to run the dtest (simple_increment_test): https://github.com/riptano/cassandra-dtest/blob/master/counter_tests.py using nosetests: {noformat} nosetests -vs counter_tests.py:TestCounters.simple_increment_test {noformat} If the problem is reproduced nose will return this: ProgrammingError: Bad Request: Keyspace 'ks' does not exist I am not yet sure if the bug is reproducible outside of the dtest suite. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-6984) NullPointerException in Streaming During Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuki Morishita updated CASSANDRA-6984: -- Attachment: 6984-2.0.txt Turning on DEBUG for o.a.c.streaming shows that the node receives the ACK and removes the task on completion before executing fileSent. I think simply adding a null check will solve the problem. NullPointerException in Streaming During Repair --- Key: CASSANDRA-6984 URL: https://issues.apache.org/jira/browse/CASSANDRA-6984 Project: Cassandra Issue Type: Bug Components: Core Reporter: Tyler Hobbs Assignee: Yuki Morishita Priority: Blocker Fix For: 2.0.7 Attachments: 6984-2.0.txt In cassandra-2.0, I can trigger a NullPointerException with a repair. These steps should reproduce the issue: * create a three node ccm cluster (with vnodes) * start a stress write (I'm using {{tools/bin/cassandra-stress --replication-factor=3 -n 1000 -k -t 1}}) * stop node3 while stress is running, then wait a minute * start node 3 * run ccm node3 repair In the logs for node1, I see this: {noformat} ERROR [STREAM-OUT-/127.0.0.3] 2014-04-04 17:40:08,547 CassandraDaemon.java (line 198) Exception in thread Thread[STREAM-OUT-/127.0.0.3,5,main] java.lang.NullPointerException at org.apache.cassandra.streaming.ConnectionHandler$MessageHandler.signalCloseDone(ConnectionHandler.java:249) at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:375) at java.lang.Thread.run(Thread.java:724) {noformat} After applying Yuki's suggested patch: {noformat} diff --git a/src/java/org/apache/cassandra/streaming/ConnectionHandler.java b/src/java/org/apache/cassandra/streaming/ConnectionHandler.java index 356138b..b06a818 100644 --- a/src/java/org/apache/cassandra/streaming/ConnectionHandler.java +++ b/src/java/org/apache/cassandra/streaming/ConnectionHandler.java @@ -366,7 +366,7 @@ public class ConnectionHandler { throw new AssertionError(e); } -catch (IOException e) +catch (Throwable e) { session.onError(e); } {noformat} I see a new NPE: {noformat} ERROR [STREAM-OUT-/127.0.0.3] 2014-04-04 18:12:35,912 StreamSession.java (line 420) [Stream #9b592af0-bc4e-11e3-a6f9-43eb3a328df9] Streaming error occurred java.lang.NullPointerException at org.apache.cassandra.streaming.StreamSession.fileSent(StreamSession.java:465) at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:60) at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:42) at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45) at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:383) at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:355) at java.lang.Thread.run(Thread.java:724) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
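A fragment sketching the suggested null check; the names follow the 2.0 streaming code loosely and may not match 6984-2.0.txt exactly:
{code}
// In StreamSession.fileSent: the receiver's ACK can complete the transfer and
// remove the task concurrently, so the lookup may return null.
StreamTransferTask task = transfers.get(header.cfId);
if (task != null) // already completed and removed by the ACK path
    task.scheduleTimeout(header.sequenceNumber, 12, TimeUnit.HOURS);
{code}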