[jira] [Commented] (CASSANDRA-3943) Too many small size sstables after loading data using sstableloader or BulkOutputFormat increases compaction time.

2012-03-22 Thread Stu Hood (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235427#comment-13235427
 ] 

Stu Hood commented on CASSANDRA-3943:
-

I'd like to work on this, as it seems like it should be possible to make a 
BulkOutputFormat that writes at most one sstable from each reducer to each 
host, if you sort the data by token before it arrives at the reducer. 
Essentially, the OutputFormat would assert that it was receiving the data in 
sorted order, and write it straight to the socket as an sstable data file: this 
has been the 'lifelong dream' of MapReduce integration.
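
A minimal sketch of the ordering check described above, assuming tokens can be compared as BigInteger values. SortedTokenWriter and its methods are hypothetical names used only for illustration; they are not part of Cassandra or BulkOutputFormat, and a real implementation would write sstable bytes to a socket instead of collecting tokens in a list.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: accepts rows only in non-decreasing token order,
// so everything it receives could go into a single output stream (one sstable).
final class SortedTokenWriter
{
    private final List<BigInteger> written = new ArrayList<BigInteger>();
    private BigInteger lastToken = null;

    // Appends a row token; rejects a token smaller than the previous one.
    void append(BigInteger token)
    {
        if (lastToken != null && token.compareTo(lastToken) < 0)
            throw new IllegalStateException("token " + token + " arrived after " + lastToken);
        lastToken = token;
        written.add(token); // stand-in for writing the row to the stream
    }

    int rowsWritten()
    {
        return written.size();
    }

    public static void main(String[] args)
    {
        SortedTokenWriter writer = new SortedTokenWriter();
        writer.append(BigInteger.valueOf(10));
        writer.append(BigInteger.valueOf(20));
        System.out.println(writer.rowsWritten());
    }
}
```

Failing fast on out-of-order input is what would let the writer avoid buffering and re-sorting, at the cost of requiring the job to sort by token before the reduce step.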

> Too many small size sstables after loading data using sstableloader or 
> BulkOutputFormat increases compaction time.
> --
>
> Key: CASSANDRA-3943
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3943
> Project: Cassandra
>  Issue Type: Wish
>  Components: Hadoop, Tools
>Affects Versions: 0.8.2, 1.1.0
>Reporter: Samarth Gahire
>Priority: Minor
>  Labels: bulkloader, hadoop, ponies, sstableloader, streaming, 
> tools
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> When we create sstables using SimpleUnsortedWriter or BulkOutputFormat, the
> size of each sstable created is around the buffer size provided.
> But after loading, the sstables created on the cluster nodes are of size around
> {code}( (sstable_size_before_loading) * replication_factor ) / No_Of_Nodes_In_Cluster{code}
> As the number of nodes in the cluster increases, the size of each sstable loaded
> onto a Cassandra node decreases. Such small sstables take too much time to
> compact (minor compaction) compared to relatively large sstables.
> One solution that we have tried is to increase the buffer size while
> generating sstables. But as we increase the buffer size, the time taken to
> generate sstables increases. Is there any solution to this in existing
> versions, or are you fixing this in a future version?
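
To make the reporter's approximation concrete, here is a small worked example. The 64 MB buffer, replication factor of 3, and cluster sizes are hypothetical values chosen for illustration; LoadedSSTableSize is not a Cassandra class.

```java
// Worked example of the approximation quoted in the issue:
//   size per node ~= (sstable size before loading * replication factor) / number of nodes
final class LoadedSSTableSize
{
    static long approxLoadedSizeBytes(long sizeBeforeLoadingBytes, int replicationFactor, int nodeCount)
    {
        return sizeBeforeLoadingBytes * replicationFactor / nodeCount;
    }

    public static void main(String[] args)
    {
        long bufferBytes = 64L * 1024 * 1024; // hypothetical 64 MB buffer
        System.out.println(approxLoadedSizeBytes(bufferBytes, 3, 6));  // 6-node cluster
        System.out.println(approxLoadedSizeBytes(bufferBytes, 3, 48)); // 48-node cluster
    }
}
```

Under these assumed numbers, each node in the 6-node cluster receives roughly 32 MB per loaded sstable, while each node in the 48-node cluster receives roughly 4 MB, which is the shrinkage the issue describes.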

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3943) Too many small size sstables after loading data using sstableloader or BulkOutputFormat increases compaction time.

2012-03-22 Thread Peter Schuller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235437#comment-13235437
 ] 

Peter Schuller commented on CASSANDRA-3943:
---

We are working on generating single-range-per-reducer sstables so that there is 
no overlap, and each reducer can send to a single node (or at least one node 
per sstable generated). It doesn't address local storage, but does address this.

It also has the effect that, if we combine it with log(n) filtering of sstables
in the read path based on ranges, it would be feasible to bulk import, keep
thousands of sstables, and completely disable compaction.
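
One way such a log(n) lookup could work, sketched under the assumption that every sstable covers a distinct, non-overlapping token range and that tokens fit in a long. SSTableRangeIndex is a hypothetical name, not an existing Cassandra class, and the inclusive [start, end] ranges are a simplification.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical index from token ranges to sstables, assuming ranges never overlap.
final class SSTableRangeIndex
{
    private static final class Range
    {
        final long end;
        final String sstable;

        Range(long end, String sstable)
        {
            this.end = end;
            this.sstable = sstable;
        }
    }

    // Keyed by the first token of each range; TreeMap lookups are O(log n).
    private final TreeMap<Long, Range> byStart = new TreeMap<Long, Range>();

    void add(long start, long end, String sstable)
    {
        byStart.put(start, new Range(end, sstable));
    }

    // Returns the one sstable whose range contains the token, or null if none does.
    String lookup(long token)
    {
        Map.Entry<Long, Range> candidate = byStart.floorEntry(token);
        if (candidate == null || token > candidate.getValue().end)
            return null;
        return candidate.getValue().sstable;
    }

    public static void main(String[] args)
    {
        SSTableRangeIndex index = new SSTableRangeIndex();
        index.add(0, 99, "sstable-a");
        index.add(100, 199, "sstable-b");
        System.out.println(index.lookup(150));
    }
}
```

Because at most one sstable can contain a given token, a read touches a single file no matter how many sstables exist, which is why leaving them uncompacted becomes plausible.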







[jira] [Commented] (CASSANDRA-3943) Too many small size sstables after loading data using sstableloader or BulkOutputFormat increases compaction time.

2012-03-22 Thread Peter Schuller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235440#comment-13235440
 ] 

Peter Schuller commented on CASSANDRA-3943:
---

It also facilitates replacing a data set one sstable at a time (if one
generates sstables that correspond exactly in ranges), allowing complete
replacement of a dataset without a temporary disk space spike.

Without any of these fixes, the extra disk space needed is very significant:
regular compaction overhead in addition to loading two data sets onto the node.






[jira] [Commented] (CASSANDRA-4071) Topology changes can lead to bad counters (at RF=1)

2012-03-22 Thread Peter Schuller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235447#comment-13235447
 ] 

Peter Schuller commented on CASSANDRA-4071:
---

This is not entirely thought through, but: Suppose, upon streaming, the source 
node regenerates its node id. Further suppose the target node is able to 
determine, for a given nodeid, whether it is "frozen". It could now opt not to 
scrub the delta flag on incoming sstables for shards with that nodeid. As far 
as I can tell, shards for a node id are safe to interpret as deltas if it is 
known that they will never ever have to be updated. Given that the source node
has regenerated its node id, I think this is the case?

It feels like there are issues with this, but it's just a thought. Potential 
concerns I can think of:

* Will read repair generate new shards for an old nodeid? I don't think so.
* If old shards get removed on the new owner (stream destination) prior to the 
old owner no longer being responsible for the data, could that cause a problem?

I am not really suggesting this be done, it seems too complex/fragile to me. 
But worth mentioning.



> Topology changes can lead to bad counters (at RF=1)
> ---
>
> Key: CASSANDRA-4071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Sylvain Lebresne
>  Labels: counters
>
> A counter is broken into shards (partitions), each shard being 'owned' by a 
> given replica (meaning that only this replica will increment that shard).  
> For a given node A, the resolution of 2 shards (having the same owner) 
> follows the following rules:
> * if the shards are owned by A, then sum the values (in the original patch, 
> 'owned by A' was based on the machine IP address, in the current code, it's 
> based on the shard having a delta flag but the principle is the same)
> * otherwise, keep the maximum value (based on the shards' clocks)
> During topology changes (bootstrap/move/decommission), we transfer data from A
> to B, but the shards owned by A are not owned by B (and we cannot make them
> owned by B because during those operations (bootstrap, ...) a given shard
> would be owned by both A and B, which would break counters). But this means
> that B won't interpret the streamed shards correctly.
> Concretely, if A receives a number of counter increments that end up in
> different sstables (the shards should thus be summed) and then those
> increments are streamed to B as part of bootstrap, B will not sum the
> increments but use the clocks to keep the maximum value.
> I've pushed a test that shows the breakage at
> https://github.com/riptano/cassandra-dtest/commits/counters_test (the test
> needs CASSANDRA-4070 to work correctly).
> Note that in practice, replication will hide this (even though B will have
> the bad value after the bootstrap, read or read/repair from the other replica
> will repair it). This is a problem for RF=1 however.
> Another problem is that during repair, a node won't correctly repair other
> nodes on its own shards (unless everything is fully compacted).
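
The two resolution rules in the description can be sketched as follows. CounterShard is a hypothetical, heavily simplified model and not Cassandra's counter context code; in particular, summing the clocks of owned shards and treating a mixed owned/non-owned pair like the non-owned case are assumptions of this sketch.

```java
// Hypothetical model of one counter shard as seen by a single node.
final class CounterShard
{
    final long clock;
    final long value;
    final boolean delta; // true when the local node treats the shard as its own

    CounterShard(long clock, long value, boolean delta)
    {
        this.clock = clock;
        this.value = value;
        this.delta = delta;
    }

    // Owned shards are summed; otherwise the shard with the higher clock wins.
    static CounterShard resolve(CounterShard a, CounterShard b)
    {
        if (a.delta && b.delta)
            return new CounterShard(a.clock + b.clock, a.value + b.value, true);
        return a.clock >= b.clock ? a : b;
    }

    // Models what the receiving node sees once the delta flag has been scrubbed.
    CounterShard withoutDelta()
    {
        return new CounterShard(clock, value, false);
    }

    public static void main(String[] args)
    {
        CounterShard first = new CounterShard(1, 5, true);  // increment of 5 in one sstable
        CounterShard second = new CounterShard(2, 3, true); // increment of 3 in another
        System.out.println(resolve(first, second).value);                               // on A
        System.out.println(resolve(first.withoutDelta(), second.withoutDelta()).value); // on B
    }
}
```

With these example values, node A resolves the two increments to 8, while node B, which does not own the shards, keeps only the shard with the higher clock and ends up with 3.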





git commit: CFS.setMaxCompactionThreshold doesn't allow 0 unless min is also 0

2012-03-22 Thread slebresne
Updated Branches:
  refs/heads/cassandra-1.0 c573c46e3 -> 5a3d4c14b


CFS.setMaxCompactionThreshold doesn't allow 0 unless min is also 0

patch by slebresne; reviewed by jbellis for CASSANDRA-4070


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5a3d4c14
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5a3d4c14
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5a3d4c14

Branch: refs/heads/cassandra-1.0
Commit: 5a3d4c14b2f7e804c74c04c6e0b229c50649e7bd
Parents: c573c46
Author: Sylvain Lebresne 
Authored: Thu Mar 22 13:42:22 2012 +0100
Committer: Sylvain Lebresne 
Committed: Thu Mar 22 13:42:22 2012 +0100

--
 .../org/apache/cassandra/db/ColumnFamilyStore.java |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5a3d4c14/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index 60a3487..9c790ae 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -1739,7 +1739,7 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
 
     public void setMaximumCompactionThreshold(int maxCompactionThreshold)
     {
-        if (maxCompactionThreshold < this.minCompactionThreshold.value())
+        if (maxCompactionThreshold > 0 && maxCompactionThreshold < this.minCompactionThreshold.value())
         {
             throw new RuntimeException("The max_compaction_threshold cannot be smaller than the min.");
         }



[2/2] git commit: CFS.setMaxCompactionThreshold doesn't allow 0 unless min is also 0

2012-03-22 Thread slebresne
CFS.setMaxCompactionThreshold doesn't allow 0 unless min is also 0

patch by slebresne; reviewed by jbellis for CASSANDRA-4070


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5a3d4c14
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5a3d4c14
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5a3d4c14

Branch: refs/heads/cassandra-1.1.0
Commit: 5a3d4c14b2f7e804c74c04c6e0b229c50649e7bd
Parents: c573c46
Author: Sylvain Lebresne 
Authored: Thu Mar 22 13:42:22 2012 +0100
Committer: Sylvain Lebresne 
Committed: Thu Mar 22 13:42:22 2012 +0100

--
 .../org/apache/cassandra/db/ColumnFamilyStore.java |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5a3d4c14/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--



[jira] [Resolved] (CASSANDRA-4070) CFS.setMaxCompactionThreshold doesn't allow 0 unless min is also 0

2012-03-22 Thread Sylvain Lebresne (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne resolved CASSANDRA-4070.
-

Resolution: Fixed
  Reviewer: jbellis

Committed, thanks

I agree that we probably should have a better way to disable compaction.
Actually, given that leveled compaction pretty much ignores the max and min
thresholds, I think we should consider moving those to the compaction options.

> CFS.setMaxCompactionThreshold doesn't allow 0 unless min is also 0
> --
>
> Key: CASSANDRA-4070
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4070
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Trivial
> Fix For: 1.0.9
>
> Attachments: 4070.patch
>
>
> Thrift allows setting the max compaction threshold to 0 to disable compaction.
> However, CFS.setMaxCompactionThreshold throws an exception when min > max, even
> if max is 0.
> Note that even if someone sets 0 for both the min and max thresholds, we
> can still have a problem, because SizeTieredCompaction calls
> CFS.setMaxCompactionThreshold before calling CFS.setMinCompactionThreshold
> and thus will trigger the RuntimeException when it shouldn't.
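
The difference between the old and new checks can be sketched in isolation. The two conditions mirror the one-line change committed for this issue; the class name, the method names, and the example minimum of 4 are assumptions made for illustration.

```java
// Hypothetical stand-alone version of the validation in setMaximumCompactionThreshold.
final class CompactionThresholdCheck
{
    // Condition before the patch: any max below the current min is rejected.
    static boolean rejectedBeforePatch(int newMax, int currentMin)
    {
        return newMax < currentMin;
    }

    // Condition after the patch: a max of 0 (compaction disabled) is always accepted.
    static boolean rejectedAfterPatch(int newMax, int currentMin)
    {
        return newMax > 0 && newMax < currentMin;
    }

    public static void main(String[] args)
    {
        int currentMin = 4; // example value: min has not been lowered yet when max is set
        System.out.println(rejectedBeforePatch(0, currentMin));
        System.out.println(rejectedAfterPatch(0, currentMin));
        System.out.println(rejectedAfterPatch(2, currentMin));
    }
}
```

Because the maximum is set before the minimum, the old condition rejects a maximum of 0 while the minimum still holds its previous non-zero value; the new condition lets 0 through and still rejects a non-zero maximum below the minimum.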





[3/4] git commit: Merge branch 'cassandra-1.0' into cassandra-1.1.0

2012-03-22 Thread slebresne
Merge branch 'cassandra-1.0' into cassandra-1.1.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/3136c209
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/3136c209
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/3136c209

Branch: refs/heads/cassandra-1.1
Commit: 3136c2092cf36b79d09941178eebf83f44f0a7d6
Parents: 050e61a 5a3d4c1
Author: Sylvain Lebresne 
Authored: Thu Mar 22 13:45:44 2012 +0100
Committer: Sylvain Lebresne 
Committed: Thu Mar 22 13:45:44 2012 +0100

--
 .../org/apache/cassandra/db/ColumnFamilyStore.java |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/3136c209/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--



[4/4] git commit: CFS.setMaxCompactionThreshold doesn't allow 0 unless min is also 0

2012-03-22 Thread slebresne
CFS.setMaxCompactionThreshold doesn't allow 0 unless min is also 0

patch by slebresne; reviewed by jbellis for CASSANDRA-4070


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5a3d4c14
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5a3d4c14
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5a3d4c14

Branch: refs/heads/cassandra-1.1
Commit: 5a3d4c14b2f7e804c74c04c6e0b229c50649e7bd
Parents: c573c46
Author: Sylvain Lebresne 
Authored: Thu Mar 22 13:42:22 2012 +0100
Committer: Sylvain Lebresne 
Committed: Thu Mar 22 13:42:22 2012 +0100

--
 .../org/apache/cassandra/db/ColumnFamilyStore.java |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5a3d4c14/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--



[2/4] git commit: Merge branch 'cassandra-1.1.0' into cassandra-1.1

2012-03-22 Thread slebresne
Merge branch 'cassandra-1.1.0' into cassandra-1.1


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9f110584
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9f110584
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9f110584

Branch: refs/heads/cassandra-1.1
Commit: 9f1105848317bad15f1960e710826f6c8b0ee142
Parents: 5787bb8 3136c20
Author: Sylvain Lebresne 
Authored: Thu Mar 22 13:46:26 2012 +0100
Committer: Sylvain Lebresne 
Committed: Thu Mar 22 13:46:26 2012 +0100

--
 .../org/apache/cassandra/db/ColumnFamilyStore.java |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/9f110584/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--



[1/5] git commit: Merge branch 'cassandra-1.1' into trunk

2012-03-22 Thread slebresne
Updated Branches:
  refs/heads/trunk 908e0e2d7 -> ad4541e6b


Merge branch 'cassandra-1.1' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ad4541e6
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ad4541e6
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ad4541e6

Branch: refs/heads/trunk
Commit: ad4541e6bf3c178677be30a899e21fe63094a552
Parents: 908e0e2 86f5eaa
Author: Sylvain Lebresne 
Authored: Thu Mar 22 13:50:37 2012 +0100
Committer: Sylvain Lebresne 
Committed: Thu Mar 22 13:50:37 2012 +0100

--
 CHANGES.txt|1 +
 .../org/apache/cassandra/config/CFMetaData.java|  163 +++--
 .../apache/cassandra/config/ColumnDefinition.java  |   13 +-
 .../org/apache/cassandra/config/KSMetaData.java|   43 +++-
 .../apache/cassandra/cql/AlterTableStatement.java  |   64 +++---
 .../cassandra/cql/CreateColumnFamilyStatement.java |9 +-
 .../cassandra/cql/CreateKeyspaceStatement.java |   20 --
 .../org/apache/cassandra/cql/QueryProcessor.java   |   38 ++--
 .../cql3/statements/AlterTableStatement.java   |   49 ++--
 .../statements/CreateColumnFamilyStatement.java|7 +-
 .../cql3/statements/CreateIndexStatement.java  |   21 +-
 .../cql3/statements/CreateKeyspaceStatement.java   |6 +-
 .../cql3/statements/DropIndexStatement.java|   22 +-
 .../org/apache/cassandra/db/ColumnFamilyStore.java |2 +-
 .../apache/cassandra/thrift/CassandraServer.java   |   22 +-
 .../apache/cassandra/thrift/ThriftValidation.java  |  178 ---
 test/unit/org/apache/cassandra/SchemaLoader.java   |6 +-
 .../cassandra/thrift/ThriftValidationTest.java |   28 ++--
 18 files changed, 311 insertions(+), 381 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/ad4541e6/CHANGES.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ad4541e6/src/java/org/apache/cassandra/config/CFMetaData.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ad4541e6/src/java/org/apache/cassandra/config/ColumnDefinition.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ad4541e6/src/java/org/apache/cassandra/config/KSMetaData.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ad4541e6/src/java/org/apache/cassandra/cql/AlterTableStatement.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ad4541e6/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ad4541e6/src/java/org/apache/cassandra/cql/CreateKeyspaceStatement.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ad4541e6/src/java/org/apache/cassandra/cql/QueryProcessor.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ad4541e6/src/java/org/apache/cassandra/cql3/statements/AlterTableStatement.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ad4541e6/src/java/org/apache/cassandra/cql3/statements/CreateColumnFamilyStatement.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ad4541e6/src/java/org/apache/cassandra/cql3/statements/CreateIndexStatement.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ad4541e6/src/java/org/apache/cassandra/cql3/statements/CreateKeyspaceStatement.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ad4541e6/src/java/org/apache/cassandra/cql3/statements/DropIndexStatement.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ad4541e6/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ad4541e6/src/java/org/apache/cassandra/thrift/CassandraServer.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ad4541e6/src/java/org/apache/cassandra/thrift/Thrift

[5/5] git commit: CFS.setMaxCompactionThreshold doesn't allow 0 unless min is also 0

2012-03-22 Thread slebresne
CFS.setMaxCompactionThreshold doesn't allow 0 unless min is also 0

patch by slebresne; reviewed by jbellis for CASSANDRA-4070


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5a3d4c14
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5a3d4c14
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5a3d4c14

Branch: refs/heads/trunk
Commit: 5a3d4c14b2f7e804c74c04c6e0b229c50649e7bd
Parents: c573c46
Author: Sylvain Lebresne 
Authored: Thu Mar 22 13:42:22 2012 +0100
Committer: Sylvain Lebresne 
Committed: Thu Mar 22 13:42:22 2012 +0100

--
 .../org/apache/cassandra/db/ColumnFamilyStore.java |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5a3d4c14/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--



[3/5] git commit: Merge branch 'cassandra-1.1.0' into cassandra-1.1

2012-03-22 Thread slebresne
Merge branch 'cassandra-1.1.0' into cassandra-1.1


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9f110584
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9f110584
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9f110584

Branch: refs/heads/trunk
Commit: 9f1105848317bad15f1960e710826f6c8b0ee142
Parents: 5787bb8 3136c20
Author: Sylvain Lebresne 
Authored: Thu Mar 22 13:46:26 2012 +0100
Committer: Sylvain Lebresne 
Committed: Thu Mar 22 13:46:26 2012 +0100

--
 .../org/apache/cassandra/db/ColumnFamilyStore.java |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/9f110584/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--



[4/5] git commit: Merge branch 'cassandra-1.0' into cassandra-1.1.0

2012-03-22 Thread slebresne
Merge branch 'cassandra-1.0' into cassandra-1.1.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/3136c209
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/3136c209
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/3136c209

Branch: refs/heads/trunk
Commit: 3136c2092cf36b79d09941178eebf83f44f0a7d6
Parents: 050e61a 5a3d4c1
Author: Sylvain Lebresne 
Authored: Thu Mar 22 13:45:44 2012 +0100
Committer: Sylvain Lebresne 
Committed: Thu Mar 22 13:45:44 2012 +0100

--
 .../org/apache/cassandra/db/ColumnFamilyStore.java |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/3136c209/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--



[jira] [Resolved] (CASSANDRA-4037) Move CfDef and KsDef validation to CFMetaData and KSMetaData

2012-03-22 Thread Sylvain Lebresne (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne resolved CASSANDRA-4037.
-

Resolution: Fixed
  Reviewer: jbellis

Committed, thanks

> Move CfDef and KsDef validation to CFMetaData and KSMetaData
> 
>
> Key: CASSANDRA-4037
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4037
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Minor
> Fix For: 1.1.1
>
>
> Following CASSANDRA-3792, CQL no longer needs to use the thrift CfDef and KsDef.
> However, those are still used in order to reuse the ThriftValidation validation
> methods. We should move that validation to CFM and KSM and remove the use of
> those thrift structures by CQL.





git commit: ensure that directory is selected for compaction patch by Aaron Morton; reviewed by Pavel Yaskevich for CASSANDRA-3985

2012-03-22 Thread xedin
Updated Branches:
  refs/heads/cassandra-1.0 5a3d4c14b -> fbdf7b03c


ensure that directory is selected for compaction
patch by Aaron Morton; reviewed by Pavel Yaskevich for CASSANDRA-3985


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/fbdf7b03
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/fbdf7b03
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/fbdf7b03

Branch: refs/heads/cassandra-1.0
Commit: fbdf7b03c7a8138ae9621bf9bacaada906a2530d
Parents: 5a3d4c1
Author: Pavel Yaskevich 
Authored: Thu Mar 22 15:40:29 2012 +0300
Committer: Pavel Yaskevich 
Committed: Thu Mar 22 15:57:13 2012 +0300

--
 CHANGES.txt|1 +
 .../cassandra/config/DatabaseDescriptor.java   |   72 ---
 src/java/org/apache/cassandra/db/Table.java|9 ++-
 .../cassandra/db/compaction/CompactionTask.java|   27 +++--
 4 files changed, 64 insertions(+), 45 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/fbdf7b03/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 925a4a9..c1e1cfe 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -10,6 +10,7 @@
  * don't change manifest level for cleanup, scrub, and upgradesstables
    operations under LeveledCompactionStrategy (CASSANDRA-3989)
  * fix race leading to super columns assertion failure (CASSANDRA-3957)
+ * ensure that directory is selected for compaction (CASSANDRA-3985)
 
 
 1.0.8

http://git-wip-us.apache.org/repos/asf/cassandra/blob/fbdf7b03/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
--
diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
index 5aa59e4..f981adf 100644
--- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
+++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
@@ -63,8 +63,6 @@ public class DatabaseDescriptor
     private static InetAddress broadcastAddress;
     private static InetAddress rpcAddress;
     private static SeedProvider seedProvider;
-    /* Current index into the above list of directories */
-    private static int currentIndex = 0;
 
     /* Hashing strategy Random or OPHF */
     private static IPartitioner partitioner;
@@ -741,12 +739,6 @@ public class DatabaseDescriptor
         return tableLocations;
     }
 
-    public synchronized static String getNextAvailableDataLocation()
-    {
-        String dataFileDirectory = conf.data_file_directories[currentIndex];
-        currentIndex = (currentIndex + 1) % conf.data_file_directories.length;
-        return dataFileDirectory;
-    }
 
     public static String getCommitLogLocation()
     {
@@ -763,41 +755,57 @@ public class DatabaseDescriptor
         return Collections.unmodifiableSet(new HashSet(seedProvider.getSeeds()));
     }
 
+    public synchronized static String getDataFileLocationForTable(String table, long expectedCompactedFileSize)
+    {
+        return getDataFileLocationForTable(table, expectedCompactedFileSize, true);
+    }
+
     /*
      * Loop through all the disks to see which disk has the max free space
      * return the disk with max free space for compactions. If the size of the expected
      * compacted file is greater than the max disk space available return null, we cannot
      * do compaction in this case.
+     *
+     * @param table name of the table.
+     * @param expectedCompactedSize expected file size in bytes.
+     * @param ensureFreeSpace Flag if the function should ensure enough free space exists for the expected file size.
+     *                        If False and there is not enough free space a warning is logged, and the dir with the most space is returned.
      */
-    public static String getDataFileLocationForTable(String table, long expectedCompactedFileSize)
+    public synchronized static String getDataFileLocationForTable(String table, long expectedCompactedFileSize, boolean ensureFreeSpace)
     {
-      long maxFreeDisk = 0;
-      int maxDiskIndex = 0;
-      String dataFileDirectory = null;
-      String[] dataDirectoryForTable = getAllDataFileLocationsForTable(table);
+        long maxFreeDisk = 0;
+        int maxDiskIndex = 0;
+        String dataFileDirectory = null;
+        String[] dataDirectoryForTable = getAllDataFileLocationsForTable(table);
 
-      for ( int i = 0 ; i < dataDirectoryForTable.length ; i++ )
-      {
-        File f = new File(dataDirectoryForTable[i]);
-        if( maxFreeDisk < f.getUsableSpace())
+        for (int i = 0; i < dataDirectoryForTable.length; i++)
         {
-          maxFreeDisk = f.getUsableSpace();
-  
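
The diff is cut off at this point, so the body of the new method is not visible here. The Javadoc in the hunk above does state the intended rule: scan every data directory for the table, keep the one with the most usable space, return null if the expected compacted file does not fit, and, when ensureFreeSpace is false, log a warning and return the directory with the most space anyway. The following is only a minimal sketch of that rule, not the committed code. The class name `DataDirectoryChooser`, the method `choose`, the starting value of -1, and the use of `System.err` in place of Cassandra's logger are all assumptions made for illustration.

```java
import java.io.File;

final class DataDirectoryChooser
{
    /**
     * Hypothetical helper (not Cassandra's API): returns the directory with
     * the most usable space. Returns null when no directories are given, or
     * when ensureFreeSpace is true and expectedSize does not fit.
     */
    static String choose(String[] dirs, long expectedSize, boolean ensureFreeSpace)
    {
        long maxFree = -1; // assumption: start below zero so a full disk can still be picked
        String best = null;
        for (String dir : dirs)
        {
            // File.getUsableSpace() returns 0 if the path does not name a partition
            long free = new File(dir).getUsableSpace();
            if (free > maxFree)
            {
                maxFree = free;
                best = dir;
            }
        }
        if (best == null)
            return null;

        if (expectedSize > maxFree)
        {
            if (ensureFreeSpace)
                return null; // strict mode: the caller cannot compact here
            System.err.println("WARN: only " + maxFree + " bytes free in " + best
                               + ", expected file size is " + expectedSize);
        }
        return best;
    }

    public static void main(String[] args)
    {
        String[] dirs = { System.getProperty("java.io.tmpdir") };
        System.out.println(choose(dirs, 1024L, true));           // small file, strict mode
        System.out.println(choose(dirs, Long.MAX_VALUE, true));  // strict mode, cannot fit
        System.out.println(choose(dirs, Long.MAX_VALUE, false)); // lenient mode, warns and falls back
    }
}
```

In this sketch the lenient mode always returns some directory when at least one is configured, which matches the commit summary ("ensure that directory is selected for compaction") as far as the visible Javadoc allows one to tell.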

[jira] [Updated] (CASSANDRA-3943) Too many small size sstables after loading data using sstableloader or BulkOutputFormat increases compaction time.

2012-03-22 Thread Jonathan Ellis (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3943:
--

Affects Version/s: (was: 1.1.0)
Fix Version/s: 1.2
 Assignee: Stu Hood

> Too many small size sstables after loading data using sstableloader or 
> BulkOutputFormat increases compaction time.
> --
>
> Key: CASSANDRA-3943
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3943
> Project: Cassandra
>  Issue Type: Wish
>  Components: Hadoop, Tools
>Affects Versions: 0.8.2, 1.1.0
>Reporter: Samarth Gahire
>Assignee: Stu Hood
>Priority: Minor
>  Labels: bulkloader, hadoop, ponies, sstableloader, streaming, 
> tools
> Fix For: 1.2
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> When we create sstables using SimpleUnsortedWriter or BulkOutputFormat, the 
> size of each sstable created is around the buffer size provided.
> But after loading, the sstables created on the cluster nodes are of size around
> {code}( (sstable_size_before_loading) * replication_factor ) / 
> No_Of_Nodes_In_Cluster{code}
> As the number of nodes in the cluster increases, the size of each sstable loaded 
> to a Cassandra node decreases. Such small sstables take much longer to 
> compact (minor compaction) than relatively large sstables.
> One solution that we have tried is to increase the buffer size while 
> generating sstables. But as we increase the buffer size, the time taken to 
> generate sstables increases. Is there any solution to this in existing 
> versions, or are you fixing this in a future version?
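
To make the reporter's estimate concrete, here is a small worked example in Java. The figures (a 256 MB writer buffer, replication factor 3, clusters of 6, 12 and 48 nodes) are invented for illustration, and `estimatedLoadedSize` is a hypothetical helper, not part of Cassandra. It simply evaluates the formula quoted in the description, which implicitly assumes that data is spread evenly across the nodes.

```java
final class LoadedSSTableSize
{
    /**
     * Hypothetical helper: evaluates
     * (sstable_size_before_loading * replication_factor) / No_Of_Nodes_In_Cluster.
     */
    static long estimatedLoadedSize(long sizeBeforeLoading, int replicationFactor, int nodes)
    {
        if (nodes <= 0)
            throw new IllegalArgumentException("nodes must be positive");
        return sizeBeforeLoading * replicationFactor / nodes;
    }

    public static void main(String[] args)
    {
        final long MB = 1024L * 1024L;
        long bufferSize = 256 * MB; // assumed buffer size used when generating sstables
        int replicationFactor = 3;  // assumed replication factor

        for (int nodes : new int[] { 6, 12, 48 })
        {
            long perNode = estimatedLoadedSize(bufferSize, replicationFactor, nodes);
            System.out.println(nodes + " nodes -> " + (perNode / MB) + " MB per loaded sstable");
        }
        // 6 nodes -> 128 MB per loaded sstable
        // 12 nodes -> 64 MB per loaded sstable
        // 48 nodes -> 16 MB per loaded sstable
    }
}
```

Under these assumptions, growing the cluster from 6 to 48 nodes shrinks each loaded sstable by a factor of eight, which is the effect the report describes.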

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-3943) Too many small size sstables after loading data using sstableloader or BulkOutputFormat increases compaction time.

2012-03-22 Thread Jonathan Ellis (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3943:
--

Affects Version/s: 1.1.0






[2/2] git commit: ensure that directory is selected for compaction patch by Aaron Morton; reviewed by Pavel Yaskevich for CASSANDRA-3985

2012-03-22 Thread xedin
ensure that directory is selected for compaction
patch by Aaron Morton; reviewed by Pavel Yaskevich for CASSANDRA-3985


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/fbdf7b03
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/fbdf7b03
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/fbdf7b03

Branch: refs/heads/cassandra-1.1.0
Commit: fbdf7b03c7a8138ae9621bf9bacaada906a2530d
Parents: 5a3d4c1
Author: Pavel Yaskevich 
Authored: Thu Mar 22 15:40:29 2012 +0300
Committer: Pavel Yaskevich 
Committed: Thu Mar 22 15:57:13 2012 +0300

--
 CHANGES.txt|1 +
 .../cassandra/config/DatabaseDescriptor.java   |   72 ---
 src/java/org/apache/cassandra/db/Table.java|9 ++-
 .../cassandra/db/compaction/CompactionTask.java|   27 +++--
 4 files changed, 64 insertions(+), 45 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/fbdf7b03/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 925a4a9..c1e1cfe 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -10,6 +10,7 @@
  * don't change manifest level for cleanup, scrub, and upgradesstables
operations under LeveledCompactionStrategy (CASSANDRA-3989)
  * fix race leading to super columns assertion failure (CASSANDRA-3957)
+ * ensure that directory is selected for compaction (CASSANDRA-3985)
 
 
 1.0.8

http://git-wip-us.apache.org/repos/asf/cassandra/blob/fbdf7b03/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
--
diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
index 5aa59e4..f981adf 100644
--- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
+++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
@@ -63,8 +63,6 @@ public class DatabaseDescriptor
 private static InetAddress broadcastAddress;
 private static InetAddress rpcAddress;
 private static SeedProvider seedProvider;
-/* Current index into the above list of directories */
-private static int currentIndex = 0;
 
 /* Hashing strategy Random or OPHF */
 private static IPartitioner partitioner;
@@ -741,12 +739,6 @@ public class DatabaseDescriptor
 return tableLocations;
 }
 
-public synchronized static String getNextAvailableDataLocation()
-{
-String dataFileDirectory = conf.data_file_directories[currentIndex];
-currentIndex = (currentIndex + 1) % conf.data_file_directories.length;
-return dataFileDirectory;
-}
 
 public static String getCommitLogLocation()
 {
@@ -763,41 +755,57 @@ public class DatabaseDescriptor
 return Collections.unmodifiableSet(new HashSet(seedProvider.getSeeds()));
 }
 
+public synchronized static String getDataFileLocationForTable(String table, long expectedCompactedFileSize)
+{
+return getDataFileLocationForTable(table, expectedCompactedFileSize, true);
+}
+
 /*
  * Loop through all the disks to see which disk has the max free space
  * return the disk with max free space for compactions. If the size of the expected
  * compacted file is greater than the max disk space available return null, we cannot
  * do compaction in this case.
+ *
+ * @param table name of the table.
+ * @param expectedCompactedSize expected file size in bytes.
+ * @param ensureFreeSpace Flag if the function should ensure enough free space exists for the expected file size.
+ * If False and there is not enough free space a warning is logged, and the dir with the most space is returned.
  */
-public static String getDataFileLocationForTable(String table, long expectedCompactedFileSize)
+public synchronized static String getDataFileLocationForTable(String table, long expectedCompactedFileSize, boolean ensureFreeSpace)
 {
-  long maxFreeDisk = 0;
-  int maxDiskIndex = 0;
-  String dataFileDirectory = null;
-  String[] dataDirectoryForTable = getAllDataFileLocationsForTable(table);
+long maxFreeDisk = 0;
+int maxDiskIndex = 0;
+String dataFileDirectory = null;
+String[] dataDirectoryForTable = getAllDataFileLocationsForTable(table);
 
-  for ( int i = 0 ; i < dataDirectoryForTable.length ; i++ )
-  {
-File f = new File(dataDirectoryForTable[i]);
-if( maxFreeDisk < f.getUsableSpace())
+for (int i = 0; i < dataDirectoryForTable.length; i++)
 {
-  maxFreeDisk = f.getUsableSpace();
-  maxDiskIndex = i;
+File f = new File(dataDirectoryFo

[1/2] git commit: merge from 1.0

2012-03-22 Thread xedin
Updated Branches:
  refs/heads/cassandra-1.1.0 3136c2092 -> b12c34f30


merge from 1.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b12c34f3
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b12c34f3
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b12c34f3

Branch: refs/heads/cassandra-1.1.0
Commit: b12c34f309cba15fb0d4187461a7065121f38e7b
Parents: 3136c20 fbdf7b0
Author: Pavel Yaskevich 
Authored: Thu Mar 22 16:26:11 2012 +0300
Committer: Pavel Yaskevich 
Committed: Thu Mar 22 16:45:57 2012 +0300

--
 CHANGES.txt|1 +
 src/java/org/apache/cassandra/db/Directories.java  |   32 ---
 .../cassandra/db/compaction/CompactionTask.java|   32 ---
 3 files changed, 44 insertions(+), 21 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b12c34f3/CHANGES.txt
--
diff --cc CHANGES.txt
index 70db8e5,c1e1cfe..c770868
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -38,96 -10,9 +38,97 @@@ Merged from 1.0
   * don't change manifest level for cleanup, scrub, and upgradesstables
 operations under LeveledCompactionStrategy (CASSANDRA-3989)
   * fix race leading to super columns assertion failure (CASSANDRA-3957)
+  * ensure that directory is selected for compaction (CASSANDRA-3985)
  
  
 +1.1-beta1
 + * (cqlsh)
 +   + add SOURCE and CAPTURE commands, and --file option (CASSANDRA-3479)
 +   + add ALTER COLUMNFAMILY WITH (CASSANDRA-3523)
 +   + bundle Python dependencies with Cassandra (CASSANDRA-3507)
 +   + added to Debian package (CASSANDRA-3458)
 +   + display byte data instead of erroring out on decode failure 
 + (CASSANDRA-3874)
 + * add nodetool rebuild_index (CASSANDRA-3583)
 + * add nodetool rangekeysample (CASSANDRA-2917)
 + * Fix streaming too much data during move operations (CASSANDRA-3639)
 + * Nodetool and CLI connect to localhost by default (CASSANDRA-3568)
 + * Reduce memory used by primary index sample (CASSANDRA-3743)
 + * (Hadoop) separate input/output configurations (CASSANDRA-3197, 3765)
 + * avoid returning internal Cassandra classes over JMX (CASSANDRA-2805)
 + * add row-level isolation via SnapTree (CASSANDRA-2893)
 + * Optimize key count estimation when opening sstable on startup
 +   (CASSANDRA-2988)
 + * multi-dc replication optimization supporting CL > ONE (CASSANDRA-3577)
 + * add command to stop compactions (CASSANDRA-1740, 3566, 3582)
 + * multithreaded streaming (CASSANDRA-3494)
 + * removed in-tree redhat spec (CASSANDRA-3567)
 + * "defragment" rows for name-based queries under STCS, again (CASSANDRA-2503)
 + * Recycle commitlog segments for improved performance 
 +   (CASSANDRA-3411, 3543, 3557, 3615)
 + * update size-tiered compaction to prioritize small tiers (CASSANDRA-2407)
 + * add message expiration logic to OutboundTcpConnection (CASSANDRA-3005)
 + * off-heap cache to use sun.misc.Unsafe instead of JNA (CASSANDRA-3271)
 + * EACH_QUORUM is only supported for writes (CASSANDRA-3272)
 + * replace compactionlock use in schema migration by checking CFS.isValid
 +   (CASSANDRA-3116)
 + * recognize that "SELECT first ... *" isn't really "SELECT *" (CASSANDRA-3445)
 + * Use faster bytes comparison (CASSANDRA-3434)
 + * Bulk loader is no longer a fat client, (HADOOP) bulk load output format
 +   (CASSANDRA-3045)
 + * (Hadoop) add support for KeyRange.filter
 + * remove assumption that keys and token are in bijection
 +   (CASSANDRA-1034, 3574, 3604)
 + * always remove endpoints from delevery queue in HH (CASSANDRA-3546)
 + * fix race between cf flush and its 2ndary indexes flush (CASSANDRA-3547)
 + * fix potential race in AES when a repair fails (CASSANDRA-3548)
 + * Remove columns shadowed by a deleted container even when we cannot purge
 +   (CASSANDRA-3538)
 + * Improve memtable slice iteration performance (CASSANDRA-3545)
 + * more efficient allocation of small bloom filters (CASSANDRA-3618)
 + * Use separate writer thread in SSTableSimpleUnsortedWriter (CASSANDRA-3619)
 + * fsync the directory after new sstable or commitlog segment are created (CASSANDRA-3250)
 + * fix minor issues reported by FindBugs (CASSANDRA-3658)
 + * global key/row caches (CASSANDRA-3143, 3849)
 + * optimize memtable iteration during range scan (CASSANDRA-3638)
 + * introduce 'crc_check_chance' in CompressionParameters to support
 +   a checksum percentage checking chance similarly to read-repair (CASSANDRA-3611)
 + * a way to deactivate global key/row cache on per-CF basis (CASSANDRA-3667)
 + * fix LeveledCompactionStrategy broken because of generation pre-allocation
 +   in LeveledManifest (CASSANDRA-3691)
 + * finer-grained control over data directories (CASSANDRA-2749)
 + * Fix ClassCastException during hinted han

[1/3] git commit: merge from 1.1.0

2012-03-22 Thread xedin
Updated Branches:
  refs/heads/cassandra-1.1 86f5eaa9b -> df103258c


merge from 1.1.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/df103258
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/df103258
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/df103258

Branch: refs/heads/cassandra-1.1
Commit: df103258cdd8cd6f0b89a9733936a5ca2fa4cc8c
Parents: 86f5eaa b12c34f
Author: Pavel Yaskevich 
Authored: Thu Mar 22 17:06:31 2012 +0300
Committer: Pavel Yaskevich 
Committed: Thu Mar 22 17:06:31 2012 +0300

--
 CHANGES.txt|1 +
 src/java/org/apache/cassandra/db/Directories.java  |   32 ---
 .../cassandra/db/compaction/CompactionTask.java|   32 ---
 3 files changed, 44 insertions(+), 21 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/df103258/CHANGES.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/df103258/src/java/org/apache/cassandra/db/Directories.java
--



[3/3] git commit: ensure that directory is selected for compaction patch by Aaron Morton; reviewed by Pavel Yaskevich for CASSANDRA-3985

2012-03-22 Thread xedin
ensure that directory is selected for compaction
patch by Aaron Morton; reviewed by Pavel Yaskevich for CASSANDRA-3985


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/fbdf7b03
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/fbdf7b03
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/fbdf7b03

Branch: refs/heads/cassandra-1.1
Commit: fbdf7b03c7a8138ae9621bf9bacaada906a2530d
Parents: 5a3d4c1
Author: Pavel Yaskevich 
Authored: Thu Mar 22 15:40:29 2012 +0300
Committer: Pavel Yaskevich 
Committed: Thu Mar 22 15:57:13 2012 +0300

--
 CHANGES.txt|1 +
 .../cassandra/config/DatabaseDescriptor.java   |   72 ---
 src/java/org/apache/cassandra/db/Table.java|9 ++-
 .../cassandra/db/compaction/CompactionTask.java|   27 +++--
 4 files changed, 64 insertions(+), 45 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/fbdf7b03/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 925a4a9..c1e1cfe 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -10,6 +10,7 @@
  * don't change manifest level for cleanup, scrub, and upgradesstables
operations under LeveledCompactionStrategy (CASSANDRA-3989)
  * fix race leading to super columns assertion failure (CASSANDRA-3957)
+ * ensure that directory is selected for compaction (CASSANDRA-3985)
 
 
 1.0.8

http://git-wip-us.apache.org/repos/asf/cassandra/blob/fbdf7b03/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
--
diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
index 5aa59e4..f981adf 100644
--- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
+++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
@@ -63,8 +63,6 @@ public class DatabaseDescriptor
 private static InetAddress broadcastAddress;
 private static InetAddress rpcAddress;
 private static SeedProvider seedProvider;
-/* Current index into the above list of directories */
-private static int currentIndex = 0;
 
 /* Hashing strategy Random or OPHF */
 private static IPartitioner partitioner;
@@ -741,12 +739,6 @@ public class DatabaseDescriptor
 return tableLocations;
 }
 
-public synchronized static String getNextAvailableDataLocation()
-{
-String dataFileDirectory = conf.data_file_directories[currentIndex];
-currentIndex = (currentIndex + 1) % conf.data_file_directories.length;
-return dataFileDirectory;
-}
 
 public static String getCommitLogLocation()
 {
@@ -763,41 +755,57 @@ public class DatabaseDescriptor
 return Collections.unmodifiableSet(new HashSet(seedProvider.getSeeds()));
 }
 
+public synchronized static String getDataFileLocationForTable(String table, long expectedCompactedFileSize)
+{
+return getDataFileLocationForTable(table, expectedCompactedFileSize, true);
+}
+
 /*
  * Loop through all the disks to see which disk has the max free space
  * return the disk with max free space for compactions. If the size of the expected
  * compacted file is greater than the max disk space available return null, we cannot
  * do compaction in this case.
+ *
+ * @param table name of the table.
+ * @param expectedCompactedSize expected file size in bytes.
+ * @param ensureFreeSpace Flag if the function should ensure enough free space exists for the expected file size.
+ * If False and there is not enough free space a warning is logged, and the dir with the most space is returned.
  */
-public static String getDataFileLocationForTable(String table, long expectedCompactedFileSize)
+public synchronized static String getDataFileLocationForTable(String table, long expectedCompactedFileSize, boolean ensureFreeSpace)
 {
-  long maxFreeDisk = 0;
-  int maxDiskIndex = 0;
-  String dataFileDirectory = null;
-  String[] dataDirectoryForTable = getAllDataFileLocationsForTable(table);
+long maxFreeDisk = 0;
+int maxDiskIndex = 0;
+String dataFileDirectory = null;
+String[] dataDirectoryForTable = getAllDataFileLocationsForTable(table);
 
-  for ( int i = 0 ; i < dataDirectoryForTable.length ; i++ )
-  {
-File f = new File(dataDirectoryForTable[i]);
-if( maxFreeDisk < f.getUsableSpace())
+for (int i = 0; i < dataDirectoryForTable.length; i++)
 {
-  maxFreeDisk = f.getUsableSpace();
-  maxDiskIndex = i;
+File f = new File(dataDirectoryForT

[2/3] git commit: merge from 1.0

2012-03-22 Thread xedin
merge from 1.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b12c34f3
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b12c34f3
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b12c34f3

Branch: refs/heads/cassandra-1.1
Commit: b12c34f309cba15fb0d4187461a7065121f38e7b
Parents: 3136c20 fbdf7b0
Author: Pavel Yaskevich 
Authored: Thu Mar 22 16:26:11 2012 +0300
Committer: Pavel Yaskevich 
Committed: Thu Mar 22 16:45:57 2012 +0300

--
 CHANGES.txt|1 +
 src/java/org/apache/cassandra/db/Directories.java  |   32 ---
 .../cassandra/db/compaction/CompactionTask.java|   32 ---
 3 files changed, 44 insertions(+), 21 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b12c34f3/CHANGES.txt
--
diff --cc CHANGES.txt
index 70db8e5,c1e1cfe..c770868
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -38,96 -10,9 +38,97 @@@ Merged from 1.0
   * don't change manifest level for cleanup, scrub, and upgradesstables
 operations under LeveledCompactionStrategy (CASSANDRA-3989)
   * fix race leading to super columns assertion failure (CASSANDRA-3957)
+  * ensure that directory is selected for compaction (CASSANDRA-3985)
  
  
 +1.1-beta1
 + * (cqlsh)
 +   + add SOURCE and CAPTURE commands, and --file option (CASSANDRA-3479)
 +   + add ALTER COLUMNFAMILY WITH (CASSANDRA-3523)
 +   + bundle Python dependencies with Cassandra (CASSANDRA-3507)
 +   + added to Debian package (CASSANDRA-3458)
 +   + display byte data instead of erroring out on decode failure 
 + (CASSANDRA-3874)
 + * add nodetool rebuild_index (CASSANDRA-3583)
 + * add nodetool rangekeysample (CASSANDRA-2917)
 + * Fix streaming too much data during move operations (CASSANDRA-3639)
 + * Nodetool and CLI connect to localhost by default (CASSANDRA-3568)
 + * Reduce memory used by primary index sample (CASSANDRA-3743)
 + * (Hadoop) separate input/output configurations (CASSANDRA-3197, 3765)
 + * avoid returning internal Cassandra classes over JMX (CASSANDRA-2805)
 + * add row-level isolation via SnapTree (CASSANDRA-2893)
 + * Optimize key count estimation when opening sstable on startup
 +   (CASSANDRA-2988)
 + * multi-dc replication optimization supporting CL > ONE (CASSANDRA-3577)
 + * add command to stop compactions (CASSANDRA-1740, 3566, 3582)
 + * multithreaded streaming (CASSANDRA-3494)
 + * removed in-tree redhat spec (CASSANDRA-3567)
 + * "defragment" rows for name-based queries under STCS, again (CASSANDRA-2503)
 + * Recycle commitlog segments for improved performance 
 +   (CASSANDRA-3411, 3543, 3557, 3615)
 + * update size-tiered compaction to prioritize small tiers (CASSANDRA-2407)
 + * add message expiration logic to OutboundTcpConnection (CASSANDRA-3005)
 + * off-heap cache to use sun.misc.Unsafe instead of JNA (CASSANDRA-3271)
 + * EACH_QUORUM is only supported for writes (CASSANDRA-3272)
 + * replace compactionlock use in schema migration by checking CFS.isValid
 +   (CASSANDRA-3116)
 + * recognize that "SELECT first ... *" isn't really "SELECT *" (CASSANDRA-3445)
 + * Use faster bytes comparison (CASSANDRA-3434)
 + * Bulk loader is no longer a fat client, (HADOOP) bulk load output format
 +   (CASSANDRA-3045)
 + * (Hadoop) add support for KeyRange.filter
 + * remove assumption that keys and token are in bijection
 +   (CASSANDRA-1034, 3574, 3604)
 + * always remove endpoints from delevery queue in HH (CASSANDRA-3546)
 + * fix race between cf flush and its 2ndary indexes flush (CASSANDRA-3547)
 + * fix potential race in AES when a repair fails (CASSANDRA-3548)
 + * Remove columns shadowed by a deleted container even when we cannot purge
 +   (CASSANDRA-3538)
 + * Improve memtable slice iteration performance (CASSANDRA-3545)
 + * more efficient allocation of small bloom filters (CASSANDRA-3618)
 + * Use separate writer thread in SSTableSimpleUnsortedWriter (CASSANDRA-3619)
 + * fsync the directory after new sstable or commitlog segment are created (CASSANDRA-3250)
 + * fix minor issues reported by FindBugs (CASSANDRA-3658)
 + * global key/row caches (CASSANDRA-3143, 3849)
 + * optimize memtable iteration during range scan (CASSANDRA-3638)
 + * introduce 'crc_check_chance' in CompressionParameters to support
 +   a checksum percentage checking chance similarly to read-repair (CASSANDRA-3611)
 + * a way to deactivate global key/row cache on per-CF basis (CASSANDRA-3667)
 + * fix LeveledCompactionStrategy broken because of generation pre-allocation
 +   in LeveledManifest (CASSANDRA-3691)
 + * finer-grained control over data directories (CASSANDRA-2749)
 + * Fix ClassCastException during hinted handoff (CASSANDRA-3694)
 + * Upgrade Thrift to 0.7 (CASSANDRA-3213)
 + * Mak

[2/4] git commit: merge from 1.1.0

2012-03-22 Thread xedin
merge from 1.1.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/df103258
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/df103258
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/df103258

Branch: refs/heads/trunk
Commit: df103258cdd8cd6f0b89a9733936a5ca2fa4cc8c
Parents: 86f5eaa b12c34f
Author: Pavel Yaskevich 
Authored: Thu Mar 22 17:06:31 2012 +0300
Committer: Pavel Yaskevich 
Committed: Thu Mar 22 17:06:31 2012 +0300

--
 CHANGES.txt|1 +
 src/java/org/apache/cassandra/db/Directories.java  |   32 ---
 .../cassandra/db/compaction/CompactionTask.java|   32 ---
 3 files changed, 44 insertions(+), 21 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/df103258/CHANGES.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/df103258/src/java/org/apache/cassandra/db/Directories.java
--



[1/4] git commit: merge from 1.1

2012-03-22 Thread xedin
Updated Branches:
  refs/heads/trunk ad4541e6b -> fe507e305


merge from 1.1


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/fe507e30
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/fe507e30
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/fe507e30

Branch: refs/heads/trunk
Commit: fe507e3058511cc363e2a94e8370e7306efe45bb
Parents: ad4541e df10325
Author: Pavel Yaskevich 
Authored: Thu Mar 22 17:08:37 2012 +0300
Committer: Pavel Yaskevich 
Committed: Thu Mar 22 17:08:37 2012 +0300

--
 CHANGES.txt|1 +
 src/java/org/apache/cassandra/db/Directories.java  |   32 ---
 .../cassandra/db/compaction/CompactionTask.java|   32 ---
 3 files changed, 44 insertions(+), 21 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/fe507e30/CHANGES.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/fe507e30/src/java/org/apache/cassandra/db/Directories.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/fe507e30/src/java/org/apache/cassandra/db/compaction/CompactionTask.java
--



[3/4] git commit: merge from 1.0

2012-03-22 Thread xedin
merge from 1.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b12c34f3
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b12c34f3
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b12c34f3

Branch: refs/heads/trunk
Commit: b12c34f309cba15fb0d4187461a7065121f38e7b
Parents: 3136c20 fbdf7b0
Author: Pavel Yaskevich 
Authored: Thu Mar 22 16:26:11 2012 +0300
Committer: Pavel Yaskevich 
Committed: Thu Mar 22 16:45:57 2012 +0300

--
 CHANGES.txt|1 +
 src/java/org/apache/cassandra/db/Directories.java  |   32 ---
 .../cassandra/db/compaction/CompactionTask.java|   32 ---
 3 files changed, 44 insertions(+), 21 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b12c34f3/CHANGES.txt
--
diff --cc CHANGES.txt
index 70db8e5,c1e1cfe..c770868
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -38,96 -10,9 +38,97 @@@ Merged from 1.0
   * don't change manifest level for cleanup, scrub, and upgradesstables
 operations under LeveledCompactionStrategy (CASSANDRA-3989)
   * fix race leading to super columns assertion failure (CASSANDRA-3957)
+  * ensure that directory is selected for compaction (CASSANDRA-3985)
  
  
 +1.1-beta1
 + * (cqlsh)
 +   + add SOURCE and CAPTURE commands, and --file option (CASSANDRA-3479)
 +   + add ALTER COLUMNFAMILY WITH (CASSANDRA-3523)
 +   + bundle Python dependencies with Cassandra (CASSANDRA-3507)
 +   + added to Debian package (CASSANDRA-3458)
 +   + display byte data instead of erroring out on decode failure 
 + (CASSANDRA-3874)
 + * add nodetool rebuild_index (CASSANDRA-3583)
 + * add nodetool rangekeysample (CASSANDRA-2917)
 + * Fix streaming too much data during move operations (CASSANDRA-3639)
 + * Nodetool and CLI connect to localhost by default (CASSANDRA-3568)
 + * Reduce memory used by primary index sample (CASSANDRA-3743)
 + * (Hadoop) separate input/output configurations (CASSANDRA-3197, 3765)
 + * avoid returning internal Cassandra classes over JMX (CASSANDRA-2805)
 + * add row-level isolation via SnapTree (CASSANDRA-2893)
 + * Optimize key count estimation when opening sstable on startup
 +   (CASSANDRA-2988)
 + * multi-dc replication optimization supporting CL > ONE (CASSANDRA-3577)
 + * add command to stop compactions (CASSANDRA-1740, 3566, 3582)
 + * multithreaded streaming (CASSANDRA-3494)
 + * removed in-tree redhat spec (CASSANDRA-3567)
 + * "defragment" rows for name-based queries under STCS, again (CASSANDRA-2503)
 + * Recycle commitlog segments for improved performance 
 +   (CASSANDRA-3411, 3543, 3557, 3615)
 + * update size-tiered compaction to prioritize small tiers (CASSANDRA-2407)
 + * add message expiration logic to OutboundTcpConnection (CASSANDRA-3005)
 + * off-heap cache to use sun.misc.Unsafe instead of JNA (CASSANDRA-3271)
 + * EACH_QUORUM is only supported for writes (CASSANDRA-3272)
 + * replace compactionlock use in schema migration by checking CFS.isValid
 +   (CASSANDRA-3116)
 + * recognize that "SELECT first ... *" isn't really "SELECT *" (CASSANDRA-3445)
 + * Use faster bytes comparison (CASSANDRA-3434)
 + * Bulk loader is no longer a fat client, (HADOOP) bulk load output format
 +   (CASSANDRA-3045)
 + * (Hadoop) add support for KeyRange.filter
 + * remove assumption that keys and token are in bijection
 +   (CASSANDRA-1034, 3574, 3604)
 + * always remove endpoints from delevery queue in HH (CASSANDRA-3546)
 + * fix race between cf flush and its 2ndary indexes flush (CASSANDRA-3547)
 + * fix potential race in AES when a repair fails (CASSANDRA-3548)
 + * Remove columns shadowed by a deleted container even when we cannot purge
 +   (CASSANDRA-3538)
 + * Improve memtable slice iteration performance (CASSANDRA-3545)
 + * more efficient allocation of small bloom filters (CASSANDRA-3618)
 + * Use separate writer thread in SSTableSimpleUnsortedWriter (CASSANDRA-3619)
 + * fsync the directory after new sstable or commitlog segment are created (CASSANDRA-3250)
 + * fix minor issues reported by FindBugs (CASSANDRA-3658)
 + * global key/row caches (CASSANDRA-3143, 3849)
 + * optimize memtable iteration during range scan (CASSANDRA-3638)
 + * introduce 'crc_check_chance' in CompressionParameters to support
 +   a checksum percentage checking chance similarly to read-repair (CASSANDRA-3611)
 + * a way to deactivate global key/row cache on per-CF basis (CASSANDRA-3667)
 + * fix LeveledCompactionStrategy broken because of generation pre-allocation
 +   in LeveledManifest (CASSANDRA-3691)
 + * finer-grained control over data directories (CASSANDRA-2749)
 + * Fix ClassCastException during hinted handoff (CASSANDRA-3694)
 + * Upgrade Thrift to 0.7 (CASSANDRA-3213)
 + * Make stress

[4/4] git commit: ensure that directory is selected for compaction patch by Aaron Morton; reviewed by Pavel Yaskevich for CASSANDRA-3985

2012-03-22 Thread xedin
ensure that directory is selected for compaction
patch by Aaron Morton; reviewed by Pavel Yaskevich for CASSANDRA-3985


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/fbdf7b03
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/fbdf7b03
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/fbdf7b03

Branch: refs/heads/trunk
Commit: fbdf7b03c7a8138ae9621bf9bacaada906a2530d
Parents: 5a3d4c1
Author: Pavel Yaskevich 
Authored: Thu Mar 22 15:40:29 2012 +0300
Committer: Pavel Yaskevich 
Committed: Thu Mar 22 15:57:13 2012 +0300

--
 CHANGES.txt|1 +
 .../cassandra/config/DatabaseDescriptor.java   |   72 ---
 src/java/org/apache/cassandra/db/Table.java|9 ++-
 .../cassandra/db/compaction/CompactionTask.java|   27 +++--
 4 files changed, 64 insertions(+), 45 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/fbdf7b03/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 925a4a9..c1e1cfe 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -10,6 +10,7 @@
  * don't change manifest level for cleanup, scrub, and upgradesstables
operations under LeveledCompactionStrategy (CASSANDRA-3989)
  * fix race leading to super columns assertion failure (CASSANDRA-3957)
+ * ensure that directory is selected for compaction (CASSANDRA-3985)
 
 
 1.0.8

http://git-wip-us.apache.org/repos/asf/cassandra/blob/fbdf7b03/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
--
diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
index 5aa59e4..f981adf 100644
--- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
+++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
@@ -63,8 +63,6 @@ public class DatabaseDescriptor
 private static InetAddress broadcastAddress;
 private static InetAddress rpcAddress;
 private static SeedProvider seedProvider;
-/* Current index into the above list of directories */
-private static int currentIndex = 0;
 
 /* Hashing strategy Random or OPHF */
 private static IPartitioner partitioner;
@@ -741,12 +739,6 @@ public class DatabaseDescriptor
 return tableLocations;
 }
 
-public synchronized static String getNextAvailableDataLocation()
-{
-String dataFileDirectory = conf.data_file_directories[currentIndex];
-currentIndex = (currentIndex + 1) % conf.data_file_directories.length;
-return dataFileDirectory;
-}
 
 public static String getCommitLogLocation()
 {
@@ -763,41 +755,57 @@ public class DatabaseDescriptor
 return Collections.unmodifiableSet(new HashSet(seedProvider.getSeeds()));
 }
 
+public synchronized static String getDataFileLocationForTable(String table, long expectedCompactedFileSize)
+{
+return getDataFileLocationForTable(table, expectedCompactedFileSize, true);
+}
+
 /*
  * Loop through all the disks to see which disk has the max free space
  * return the disk with max free space for compactions. If the size of the expected
  * compacted file is greater than the max disk space available return null, we cannot
  * do compaction in this case.
+ *
+ * @param table name of the table.
+ * @param expectedCompactedSize expected file size in bytes.
+ * @param ensureFreeSpace Flag if the function should ensure enough free space exists for the expected file size.
+ * If False and there is not enough free space a warning is logged, and the dir with the most space is returned.
  */
-public static String getDataFileLocationForTable(String table, long expectedCompactedFileSize)
+public synchronized static String getDataFileLocationForTable(String table, long expectedCompactedFileSize, boolean ensureFreeSpace)
 {
-  long maxFreeDisk = 0;
-  int maxDiskIndex = 0;
-  String dataFileDirectory = null;
-  String[] dataDirectoryForTable = getAllDataFileLocationsForTable(table);
+long maxFreeDisk = 0;
+int maxDiskIndex = 0;
+String dataFileDirectory = null;
+String[] dataDirectoryForTable = getAllDataFileLocationsForTable(table);
 
-  for ( int i = 0 ; i < dataDirectoryForTable.length ; i++ )
-  {
-File f = new File(dataDirectoryForTable[i]);
-if( maxFreeDisk < f.getUsableSpace())
+for (int i = 0; i < dataDirectoryForTable.length; i++)
 {
-  maxFreeDisk = f.getUsableSpace();
-  maxDiskIndex = i;
+File f = new File(dataDirectoryForTable[i])
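
The hunk above is cut off in this archive. As a rough illustration of the rule its method comment describes (pick the data directory with the most usable space; with ensureFreeSpace set, give up when even that directory is too small), here is a minimal sketch. It is not the committed code: the class name, the method name, and the map of directory to usable bytes are assumptions made so the example runs without touching real disks, and the warning the comment mentions is only noted in a code comment.

```java
import java.util.Map;

// Hypothetical helper, not part of Cassandra: illustrates the selection rule only.
final class DataDirectoryChooser
{
    /**
     * Returns the directory with the most usable space, or null when
     * ensureFreeSpace is true and even that directory cannot hold expectedSize bytes.
     */
    static String choose(Map<String, Long> usableBytesByDir, long expectedSize, boolean ensureFreeSpace)
    {
        String best = null;
        long maxFree = -1;
        for (Map.Entry<String, Long> entry : usableBytesByDir.entrySet())
        {
            if (entry.getValue() > maxFree)
            {
                maxFree = entry.getValue();
                best = entry.getKey();
            }
        }
        if (best == null)
            return null; // no directories configured

        if (maxFree < expectedSize && ensureFreeSpace)
            return null; // caller asked for a guarantee that cannot be met

        // With ensureFreeSpace == false the roomiest directory is returned anyway;
        // the method comment says a warning is logged in that case (omitted here).
        return best;
    }
}
```

In a real setting the map values would presumably come from File.getUsableSpace() on each configured data directory.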

[jira] [Reopened] (CASSANDRA-4037) Move CfDef and KsDef validation to CFMetaData and KSMetaData

2012-03-22 Thread Sylvain Lebresne (Reopened) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne reopened CASSANDRA-4037:
-


> Move CfDef and KsDef validation to CFMetaData and KSMetaData
> 
>
> Key: CASSANDRA-4037
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4037
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Minor
> Fix For: 1.1.1
>
>
> Following CASSANDRA-3792, CQL doesn't need to use thrift CfDef and KsDef. 
> However, those are still used in order to reuse ThriftValidation validation 
> methods. We should move that validation to CFM and KSM and remove the use of 
> those thrift structures by CQL.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-4037) Move CfDef and KsDef validation to CFMetaData and KSMetaData

2012-03-22 Thread Sylvain Lebresne (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-4037:


Attachment: 0001-Fix-CFMetadata-copyOpts.txt

This patch actually broke the distributed tests because CFMetaData.copyOpts 
wasn't correctly cloning the column metadata. Fix attached (against 1.1.1; 
copyOpts is not used in previous versions).
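
To illustrate the kind of problem a non-cloning copy can cause, here is a minimal sketch of copying a map of mutable column definitions entry by entry, so that later changes to the copy do not leak into the original. The classes ColumnDef and MetadataCopy are hypothetical stand-ins, not Cassandra classes, and the message above does not say exactly how the shared metadata broke the tests.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a column definition with one mutable field.
final class ColumnDef
{
    final String name;
    String indexName; // mutable on purpose, to make aliasing visible

    ColumnDef(String name, String indexName)
    {
        this.name = name;
        this.indexName = indexName;
    }

    ColumnDef copy()
    {
        return new ColumnDef(name, indexName);
    }
}

final class MetadataCopy
{
    /** Builds a new map whose values are copies, so the two maps share no definitions. */
    static Map<String, ColumnDef> deepCopy(Map<String, ColumnDef> source)
    {
        Map<String, ColumnDef> cloned = new HashMap<String, ColumnDef>();
        for (ColumnDef cd : source.values())
        {
            ColumnDef c = cd.copy();
            cloned.put(c.name, c);
        }
        return cloned;
    }
}
```

Passing the original map (or a shallow copy of it) instead would leave both metadata objects pointing at the same definitions, so a change made through one would be visible through the other.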

> Move CfDef and KsDef validation to CFMetaData and KSMetaData
> 
>
> Key: CASSANDRA-4037
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4037
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Minor
> Fix For: 1.1.1
>
> Attachments: 0001-Fix-CFMetadata-copyOpts.txt
>
>
> Following CASSANDRA-3792, CQL doesn't need to use thrift CfDef and KsDef. 
> However, those are still used in order to reuse ThriftValidation validation 
> methods. We should move that validation to CFM and KSM and remove the use of 
> those thrift structures by CQL.





[jira] [Commented] (CASSANDRA-4037) Move CfDef and KsDef validation to CFMetaData and KSMetaData

2012-03-22 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235663#comment-13235663
 ] 

Jonathan Ellis commented on CASSANDRA-4037:
---

+1, tho i'd be ok w/ leaving the debug line in

> Move CfDef and KsDef validation to CFMetaData and KSMetaData
> 
>
> Key: CASSANDRA-4037
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4037
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Minor
> Fix For: 1.1.1
>
> Attachments: 0001-Fix-CFMetadata-copyOpts.txt
>
>
> Following CASSANDRA-3792, CQL doesn't need to use thrift CfDef and KsDef. 
> However, those are still used in order to reuse ThriftValidation validation 
> methods. We should move that validation to CFM and KSM and remove the use of 
> those thrift structures by CQL.





git commit: Fix #4037 commit

2012-03-22 Thread slebresne
Updated Branches:
  refs/heads/cassandra-1.1 df103258c -> 11bdcd6d7


Fix #4037 commit


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/11bdcd6d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/11bdcd6d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/11bdcd6d

Branch: refs/heads/cassandra-1.1
Commit: 11bdcd6d7f78709fdf069fbd03ffe1418c76980f
Parents: df10325
Author: Sylvain Lebresne 
Authored: Thu Mar 22 16:53:40 2012 +0100
Committer: Sylvain Lebresne 
Committed: Thu Mar 22 16:53:40 2012 +0100

--
 .../org/apache/cassandra/config/CFMetaData.java|   11 ++-
 .../apache/cassandra/config/ColumnDefinition.java  |5 +
 2 files changed, 15 insertions(+), 1 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/11bdcd6d/src/java/org/apache/cassandra/config/CFMetaData.java
--
diff --git a/src/java/org/apache/cassandra/config/CFMetaData.java b/src/java/org/apache/cassandra/config/CFMetaData.java
index 22b16d7..b3e3a8b 100644
--- a/src/java/org/apache/cassandra/config/CFMetaData.java
+++ b/src/java/org/apache/cassandra/config/CFMetaData.java
@@ -353,6 +353,12 @@ public final class CFMetaData
 
 static CFMetaData copyOpts(CFMetaData newCFMD, CFMetaData oldCFMD)
 {
+Map clonedColumns = new HashMap();
+for (ColumnDefinition cd : oldCFMD.column_metadata.values())
+{
+ColumnDefinition cloned = cd.clone();
+clonedColumns.put(cloned.name, cloned);
+}
 return newCFMD.comment(oldCFMD.comment)
   .readRepairChance(oldCFMD.readRepairChance)
   .dcLocalReadRepairChance(oldCFMD.dcLocalReadRepairChance)
@@ -362,7 +368,10 @@ public final class CFMetaData
   .keyValidator(oldCFMD.keyValidator)
   .minCompactionThreshold(oldCFMD.minCompactionThreshold)
   .maxCompactionThreshold(oldCFMD.maxCompactionThreshold)
-  .columnMetadata(oldCFMD.column_metadata)
+  .keyAlias(oldCFMD.keyAlias)
+  .columnAliases(new ArrayList(oldCFMD.columnAliases))
+  .valueAlias(oldCFMD.valueAlias)
+  .columnMetadata(clonedColumns)
   .compactionStrategyClass(oldCFMD.compactionStrategyClass)
   .compactionStrategyOptions(oldCFMD.compactionStrategyOptions)
   .compressionParameters(oldCFMD.compressionParameters)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/11bdcd6d/src/java/org/apache/cassandra/config/ColumnDefinition.java
--
diff --git a/src/java/org/apache/cassandra/config/ColumnDefinition.java b/src/java/org/apache/cassandra/config/ColumnDefinition.java
index f6d8209..795f1d2 100644
--- a/src/java/org/apache/cassandra/config/ColumnDefinition.java
+++ b/src/java/org/apache/cassandra/config/ColumnDefinition.java
@@ -83,6 +83,11 @@ public class ColumnDefinition
 return new ColumnDefinition(ByteBufferUtil.bytes(name), DoubleType.instance, null, null, null);
 }
 
+public ColumnDefinition clone()
+{
+return new ColumnDefinition(name, validator, index_type, index_options, index_name);
+}
+
 @Override
 public boolean equals(Object o)
 {



[1/2] git commit: Merge branch 'cassandra-1.1' into trunk

2012-03-22 Thread slebresne
Updated Branches:
  refs/heads/trunk fe507e305 -> e571ec2c6


Merge branch 'cassandra-1.1' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e571ec2c
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e571ec2c
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e571ec2c

Branch: refs/heads/trunk
Commit: e571ec2c6fe0a18d5c88671e2a560c199ef93eea
Parents: fe507e3 11bdcd6
Author: Sylvain Lebresne 
Authored: Thu Mar 22 16:54:53 2012 +0100
Committer: Sylvain Lebresne 
Committed: Thu Mar 22 16:54:53 2012 +0100

--
 .../org/apache/cassandra/config/CFMetaData.java|   11 ++-
 .../apache/cassandra/config/ColumnDefinition.java  |5 +
 2 files changed, 15 insertions(+), 1 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/e571ec2c/src/java/org/apache/cassandra/config/CFMetaData.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e571ec2c/src/java/org/apache/cassandra/config/ColumnDefinition.java
--



[2/2] git commit: Fix #4037 commit

2012-03-22 Thread slebresne
Fix #4037 commit


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/11bdcd6d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/11bdcd6d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/11bdcd6d

Branch: refs/heads/trunk
Commit: 11bdcd6d7f78709fdf069fbd03ffe1418c76980f
Parents: df10325
Author: Sylvain Lebresne 
Authored: Thu Mar 22 16:53:40 2012 +0100
Committer: Sylvain Lebresne 
Committed: Thu Mar 22 16:53:40 2012 +0100

--
 .../org/apache/cassandra/config/CFMetaData.java|   11 ++-
 .../apache/cassandra/config/ColumnDefinition.java  |5 +
 2 files changed, 15 insertions(+), 1 deletions(-)
--





[jira] [Commented] (CASSANDRA-4067) Report lifetime compaction throughput

2012-03-22 Thread Nick Bailey (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235671#comment-13235671
 ] 

Nick Bailey commented on CASSANDRA-4067:


I mentioned this offline, but is there any reason we shouldn't go ahead and 
expose total number of compactions completed as well?

> Report lifetime compaction throughput
> -
>
> Key: CASSANDRA-4067
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4067
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Brandon Williams
>Priority: Trivial
>  Labels: compaction
> Fix For: 1.1.0
>
> Attachments: 4067.txt
>
>
> Would be useful to be able to monitor total compaction throughput without 
> having to poll frequently enough to make sure we get every CompactionInfo 
> object.





[jira] [Updated] (CASSANDRA-4067) Report lifetime compaction throughput

2012-03-22 Thread Brandon Williams (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-4067:


Attachment: (was: 4067.txt)

> Report lifetime compaction throughput
> -
>
> Key: CASSANDRA-4067
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4067
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Brandon Williams
>Priority: Trivial
>  Labels: compaction
> Fix For: 1.1.0
>
>
> Would be useful to be able to monitor total compaction throughput without 
> having to poll frequently enough to make sure we get every CompactionInfo 
> object.





[jira] [Updated] (CASSANDRA-4067) Report lifetime compaction throughput

2012-03-22 Thread Brandon Williams (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-4067:


Attachment: 0002-Track-and-expose-total-compactions.txt
0001-Track-and-expose-lifetime-bytes-compacted.txt

Updated patchset to also expose total compactions.
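
The attached patches are not reproduced in this message. As a rough illustration of the idea in the issue description (lifetime totals that a monitoring tool can read at any interval and diff, instead of having to catch every CompactionInfo object), here is a minimal sketch using atomic counters. The class and method names are assumptions, not the names used in the patches.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical holder for lifetime compaction totals; names are illustrative only.
final class CompactionTotals
{
    private final AtomicLong totalBytesCompacted = new AtomicLong();
    private final AtomicLong totalCompactionsCompleted = new AtomicLong();

    /** Called once when a compaction finishes, with the number of bytes it processed. */
    void onCompactionFinished(long bytes)
    {
        totalBytesCompacted.addAndGet(bytes);
        totalCompactionsCompleted.incrementAndGet();
    }

    long getTotalBytesCompacted()
    {
        return totalBytesCompacted.get();
    }

    long getTotalCompactionsCompleted()
    {
        return totalCompactionsCompleted.get();
    }
}
```

Because the counters only ever grow, a poller can sample them as rarely as it likes and compute throughput from the difference between two samples.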

> Report lifetime compaction throughput
> -
>
> Key: CASSANDRA-4067
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4067
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Brandon Williams
>Priority: Trivial
>  Labels: compaction
> Fix For: 1.1.0
>
> Attachments: 0001-Track-and-expose-lifetime-bytes-compacted.txt, 
> 0002-Track-and-expose-total-compactions.txt
>
>
> Would be useful to be able to monitor total compaction throughput without 
> having to poll frequently enough to make sure we get every CompactionInfo 
> object.





[jira] [Created] (CASSANDRA-4073) cqlsh: flush CAPTURE output after each command

2012-03-22 Thread Tyler Patterson (Created) (JIRA)
cqlsh: flush CAPTURE output after each command
--

 Key: CASSANDRA-4073
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4073
 Project: Cassandra
  Issue Type: Improvement
 Environment: cassandra1.1
Reporter: Tyler Patterson
Assignee: paul cannon
Priority: Minor


In cqlsh, a user might want to enable capturing, run a command, and then go look 
at the file to see the output. This workflow could be useful, for instance, 
when the output is large. Internal buffering forces the user to turn capture 
off each time he or she wants to see the output.
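
The requested behaviour can be sketched independently of cqlsh, which is written in Python: write each command's output through a buffer, then flush before returning so the capture target is up to date. The sketch below is in Java and uses a hypothetical CaptureSink class; it does not reflect cqlsh's actual code.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical capture target wrapper; illustrates flush-per-command only.
final class CaptureSink
{
    private final BufferedWriter out;

    CaptureSink(Writer target)
    {
        this.out = new BufferedWriter(target);
    }

    /** Writes one command's output and flushes, so readers of the target see it at once. */
    void writeCommandOutput(String text)
    {
        try
        {
            out.write(text);
            out.newLine();
            out.flush(); // without this, text may sit in the buffer until close()
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }
}
```

The cost is one flush per command, which is small next to the cost of running the command itself.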





[jira] [Resolved] (CASSANDRA-4020) System time suddenly changed made gossip working abnormally

2012-03-22 Thread Brandon Williams (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams resolved CASSANDRA-4020.
-

Resolution: Duplicate

Looks like the same issue as CASSANDRA-4066.

> System time suddenly changed  made gossip working abnormally 
> -
>
> Key: CASSANDRA-4020
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4020
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.0.6
>Reporter: MaHaiyang
>
> I have four Cassandra nodes (A, B, C, D).
> I set node A's system time one hour ahead and changed it back to normal after 
> several seconds. Then I ran nodetool's ring command on node B, and node B 
> reported node A as "down". The same happened on nodes C and D. But node A 
> reported itself as "UP" in the ring command output.





[jira] [Commented] (CASSANDRA-1991) CFS.maybeSwitchMemtable() calls CommitLog.instance.getContext(), which may block, under flusher lock write lock

2012-03-22 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235831#comment-13235831
 ] 

Jonathan Ellis commented on CASSANDRA-1991:
---

What if we just had getContext cheat and inject its task at the front of the 
commitlog executor's queue, instead of at the end?  This would mean we might 
replay data unnecessarily after a crash (the tasks we cut in front of), but 
that's Not A Big Deal.
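
As a rough illustration of the suggestion above, here is a minimal sketch of a task queue where ordinary work goes to the tail and one kind of task is allowed to jump to the head. It uses java.util.concurrent.LinkedBlockingDeque and drains the queue on the calling thread to keep the ordering easy to see; the class FrontOfQueueDemo and its method names are assumptions and do not model the actual commitlog executor.

```java
import java.util.concurrent.LinkedBlockingDeque;

// Hypothetical queue wrapper; shows head insertion versus tail insertion only.
final class FrontOfQueueDemo
{
    private final LinkedBlockingDeque<Runnable> tasks = new LinkedBlockingDeque<Runnable>();

    /** Normal submission: the task waits behind everything already queued. */
    void submit(Runnable task)
    {
        tasks.addLast(task);
    }

    /** "Cheating" submission: the task runs before everything already queued. */
    void submitAtFront(Runnable task)
    {
        tasks.addFirst(task);
    }

    /** Runs queued tasks in queue order until the queue is empty. */
    void drain()
    {
        Runnable next;
        while ((next = tasks.pollFirst()) != null)
            next.run();
    }
}
```

The task that jumps the queue observes state from before the tasks it overtook have run, which is where the extra replay after a crash mentioned above would come from.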

> CFS.maybeSwitchMemtable() calls CommitLog.instance.getContext(), which may 
> block, under flusher lock write lock
> ---
>
> Key: CASSANDRA-1991
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1991
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Peter Schuller
>Assignee: Peter Schuller
> Attachments: 1991-checkpointing-flush.txt, 1991-logchanges.txt, 
> 1991-trunk-v2.txt, 1991-trunk.txt, 1991-v3.txt, 1991-v4.txt, 1991-v5.txt, 
> 1991-v6.txt, 1991-v7.txt, 1991-v8.txt, 1991-v9.txt, trigger.py
>
>
> While investigating CASSANDRA-1955 I realized I was seeing very poor latencies 
> for reasons that had nothing to do with flush_writers, even when using 
> periodic commit log mode (and flush writers set ridiculously high, 500).
> It turns out writes blocked were slow because Table.apply() was spending lots 
> of time (I can easily trigger seconds on moderate work-load) trying to 
> acquire a flusher lock read lock ("flush lock millis" log printout in the 
> logging patch I'll attach).
> That in turn is caused by CFS.maybeSwitchMemtable() which acquires the 
> flusher lock write lock.
> Bisecting further revealed that the offending line of code that blocked was:
> final CommitLogSegment.CommitLogContext ctx = 
> writeCommitLog ? CommitLog.instance.getContext() : null;
> Indeed, CommitLog.getContext() simply returns currentSegment().getContext(), 
> but does so by submitting a callable on the service executor. So 
> independently of flush writers, this can block all (global, for all cf:s) 
> writes very easily, and does.
> I'll attach a file that is an independent Python script that triggers it on 
> my macos laptop (with an intel SSD, which is why I was particularly 
> surprised) (it assumes CPython, out-of-the-box-or-almost Cassandra on 
> localhost that isn't in a cluster, and it will drop/recreate a keyspace 
> called '1955').
> I'm also attaching, just FYI, the patch with log entries that I used while 
> tracking it down.
> Finally, I'll attach a patch with a suggested solution of keeping track of 
> the latest commit log with an AtomicReference (as an alternative to 
> synchronizing all access to segments). With that patch applied, latencies are 
> not affected by my trigger case like they were before. There are some 
> sub-optimal > 100 ms cases on my test machine, but for other reasons. I'm no 
> longer able to trigger the extremes.
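
The AtomicReference approach mentioned in the description can be sketched as follows: the thread that rolls segments publishes the newest one through an atomic reference, and readers fetch it directly instead of queueing a task behind pending work. SegmentTracker and Segment are hypothetical names; the attached patches are not shown here and may differ.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical tracker for the latest commit log segment; illustrative only.
final class SegmentTracker
{
    static final class Segment
    {
        final long id;

        Segment(long id)
        {
            this.id = id;
        }
    }

    private final AtomicReference<Segment> current =
            new AtomicReference<Segment>(new Segment(0L));

    /** Lock-free read: never waits behind queued tasks. */
    Segment currentSegment()
    {
        return current.get();
    }

    /** Publishes a new segment. Assumes a single thread performs rolls. */
    Segment roll()
    {
        Segment next = new Segment(current.get().id + 1);
        current.set(next);
        return next;
    }
}
```

A reader may see a segment that is replaced a moment later, so this trades a possibly slightly stale answer for never blocking.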





[jira] [Commented] (CASSANDRA-1991) CFS.maybeSwitchMemtable() calls CommitLog.instance.getContext(), which may block, under flusher lock write lock

2012-03-22 Thread Sylvain Lebresne (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235851#comment-13235851
 ] 

Sylvain Lebresne commented on CASSANDRA-1991:
-

Haven't followed the whole conversation but

bq. we might replay data unnecessarily after a crash (the tasks we cut in front 
of), but that's Not A Big Deal

Not true for counters.

> CFS.maybeSwitchMemtable() calls CommitLog.instance.getContext(), which may 
> block, under flusher lock write lock
> ---
>
> Key: CASSANDRA-1991
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1991
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Peter Schuller
>Assignee: Peter Schuller
> Attachments: 1991-checkpointing-flush.txt, 1991-logchanges.txt, 
> 1991-trunk-v2.txt, 1991-trunk.txt, 1991-v3.txt, 1991-v4.txt, 1991-v5.txt, 
> 1991-v6.txt, 1991-v7.txt, 1991-v8.txt, 1991-v9.txt, trigger.py
>
>
> While investigating CASSANDRA-1955 I realized I was seeing very poor latencies 
> for reasons that had nothing to do with flush_writers, even when using 
> periodic commit log mode (and flush writers set ridiculously high, 500).
> It turns out writes blocked were slow because Table.apply() was spending lots 
> of time (I can easily trigger seconds on moderate work-load) trying to 
> acquire a flusher lock read lock ("flush lock millis" log printout in the 
> logging patch I'll attach).
> That in turn is caused by CFS.maybeSwitchMemtable() which acquires the 
> flusher lock write lock.
> Bisecting further revealed that the offending line of code that blocked was:
> final CommitLogSegment.CommitLogContext ctx = 
> writeCommitLog ? CommitLog.instance.getContext() : null;
> Indeed, CommitLog.getContext() simply returns currentSegment().getContext(), 
> but does so by submitting a callable on the service executor. So 
> independently of flush writers, this can block all (global, for all cf:s) 
> writes very easily, and does.
> I'll attach a file that is an independent Python script that triggers it on 
> my macos laptop (with an intel SSD, which is why I was particularly 
> surprised) (it assumes CPython, out-of-the-box-or-almost Cassandra on 
> localhost that isn't in a cluster, and it will drop/recreate a keyspace 
> called '1955').
> I'm also attaching, just FYI, the patch with log entries that I used while 
> tracking it down.
> Finally, I'll attach a patch with a suggested solution of keeping track of 
> the latest commit log with an AtomicReference (as an alternative to 
> synchronizing all access to segments). With that patch applied, latencies are 
> not affected by my trigger case like they were before. There are some 
> sub-optimal > 100 ms cases on my test machine, but for other reasons. I'm no 
> longer able to trigger the extremes.





[jira] [Commented] (CASSANDRA-4042) add "caching" to CQL CF options

2012-03-22 Thread paul cannon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235966#comment-13235966
 ] 

paul cannon commented on CASSANDRA-4042:


Partial dupe of CASSANDRA-3941. I could remove "caching" from that one, or we 
could add bloom_filter_fp_chance to this one. I prefer the second option, since 
it looks pretty easy to add the other option to this already-done work.

> add "caching" to CQL CF options
> ---
>
> Key: CASSANDRA-4042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4042
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: Pavel Yaskevich
>Assignee: Pavel Yaskevich
>Priority: Minor
> Fix For: 1.1.0
>
> Attachments: CASSANDRA-4042.patch
>
>
> "Caching" option is missing from CQL ColumnFamily options.





[jira] [Commented] (CASSANDRA-4042) add "caching" to CQL CF options

2012-03-22 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235976#comment-13235976
 ] 

Pavel Yaskevich commented on CASSANDRA-4042:


I'm fine if you just attach the "bloom_filter_fp_chance" addition here as a 
separate patch and we close CASSANDRA-3941 as Duplicate, or vice versa, 
whichever you prefer. :)

> add "caching" to CQL CF options
> ---
>
> Key: CASSANDRA-4042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4042
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: Pavel Yaskevich
>Assignee: Pavel Yaskevich
>Priority: Minor
> Fix For: 1.1.0
>
> Attachments: CASSANDRA-4042.patch
>
>
> "Caching" option is missing from CQL ColumnFamily options.





[Cassandra Wiki] Update of "DataModel" by TylerHobbs

2012-03-22 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "DataModel" page has been changed by TylerHobbs:
http://wiki.apache.org/cassandra/DataModel?action=diff&rev1=13&rev2=14

Comment:
Column family changes don't require a restart, schema.xml is no longer used

  }}}
  All values are supplied by the client, including the 'timestamp'.  This means 
that clocks on the clients should be synchronized (in the Cassandra server 
environment is useful also), as these timestamps are used for conflict 
resolution.  In many cases the 'timestamp' is not used in client applications, 
and it becomes convenient to think of a column as a name/value pair. For the 
remainder of this document, 'timestamps' will be elided for readability.  It is 
also worth noting the name and value are binary values, although in many 
applications they are UTF8 serialized strings.
  
- Timestamps can be anything you like, but microseconds since 1970 is a 
convention. Whatever you use, it must be consistent across the application 
otherwise earlier changes may overwrite newer ones.
+ Timestamps can be anything you like, but microseconds since 1970 is a 
convention. Whatever you use, it must be consistent across the application, 
otherwise earlier changes may overwrite newer ones.
  
  = Column Families =
- A column family is a container for columns, analogous to the table in a 
relational system.  You define column families in your storage-conf.xml file, 
and cannot modify them (or add new column families) without restarting your 
Cassandra process.  A column family holds an ordered list of columns, which you 
can reference by the column name.
+ A column family is a container for rows, analogous to the table in a 
relational system.  Each row in a column family can be referenced by its key.
  
- Column families have a configurable ordering applied to the columns within 
each row, which affects the behavior of the get_slice call in the thrift API.  
Out of the box ordering implementations include ASCII, UTF-8, Long, and UUID 
(lexical or time).
+ Column families have a configurable ordering applied to the columns within 
each row, which affects the behavior of the get_slice call in the thrift API.  
Out of the box ordering implementations include ASCII, UTF-8, Long, UUID 
(lexical or time), Date, combinations of these using CompositeType, and others.
  
  = Rows =
  In Cassandra, each column family is stored in a separate file, and the file 
is sorted in row (i.e. key) major order. Related columns, those that you'll 
access together, should be kept within the same column family.
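
As a small illustration of the timestamp convention described above (microseconds since 1970), the following Python sketch produces such a value on the client. The helper name `now_micros` is an assumption made for illustration; it is not part of any Cassandra client API.

```python
import time


def now_micros():
    """Return the current wall-clock time as microseconds since the Unix epoch."""
    return time.time_ns() // 1_000


# Two writes to the same column: the one carrying the larger timestamp wins
# conflict resolution, which is why every client has to use the same unit
# and a reasonably synchronized clock.
first_write = now_micros()
second_write = first_write + 1
assert second_write > first_write
```

Mixing units (for example, milliseconds in one client and microseconds in another) would make the microsecond writer win every conflict, regardless of which write actually happened last.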


[jira] [Created] (CASSANDRA-4074) cqlsh: Tab completion should not suggest consistency level ANY for select statements

2012-03-22 Thread Tyler Patterson (Created) (JIRA)
cqlsh: Tab completion should not suggest consistency level ANY for select 
statements


 Key: CASSANDRA-4074
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4074
 Project: Cassandra
  Issue Type: Bug
Reporter: Tyler Patterson
Assignee: paul cannon
Priority: Trivial


consistency level ANY should not be suggested in tab-completion for SELECT 
statements

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-4075) Dropped keyspaces and cfs do not get deleted

2012-03-22 Thread Joaquin Casares (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joaquin Casares updated CASSANDRA-4075:
---

Labels: datastax_qa  (was: )

> Dropped keyspaces and cfs do not get deleted
> 
>
> Key: CASSANDRA-4075
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4075
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 0.8.1
>Reporter: Joaquin Casares
>  Labels: datastax_qa
>
> Tested in 0.8.10, reported in 0.8.1.
> Dropped keyspaces and column families have their sstables marked as 
> Compacted, but will not disappear, even on restart. 
> Worked correctly in 1.0.8 where the sstables get deleted almost immediately 
> following the column family drop.





[jira] [Created] (CASSANDRA-4075) Dropped keyspaces and cfs do not get deleted

2012-03-22 Thread Joaquin Casares (Created) (JIRA)
Dropped keyspaces and cfs do not get deleted


 Key: CASSANDRA-4075
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4075
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.1
Reporter: Joaquin Casares


Tested in 0.8.10, reported in 0.8.1.

Dropped keyspaces and column families have their sstables marked as Compacted, 
but will not disappear, even on restart. 

Worked correctly in 1.0.8 where the sstables get deleted almost immediately 
following the column family drop.





[jira] [Assigned] (CASSANDRA-4075) Dropped keyspaces and cfs do not get deleted

2012-03-22 Thread Jonathan Ellis (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-4075:
-

Assignee: Yuki Morishita

> Dropped keyspaces and cfs do not get deleted
> 
>
> Key: CASSANDRA-4075
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4075
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 0.8.1
>Reporter: Joaquin Casares
>Assignee: Yuki Morishita
>  Labels: datastax_qa
>
> Tested in 0.8.10, reported in 0.8.1.
> Dropped keyspaces and column families have their sstables marked as 
> Compacted, but will not disappear, even on restart. 
> Worked correctly in 1.0.8 where the sstables get deleted almost immediately 
> following the column family drop.





[jira] [Commented] (CASSANDRA-4075) Dropped keyspaces and cfs do not get deleted

2012-03-22 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236290#comment-13236290
 ] 

Jonathan Ellis commented on CASSANDRA-4075:
---

Isn't this just normal pre-CASSANDRA-2521 "there hasn't been a full GC so I 
can't delete the sstables" behavior?

> Dropped keyspaces and cfs do not get deleted
> 
>
> Key: CASSANDRA-4075
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4075
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 0.8.1
>Reporter: Joaquin Casares
>Assignee: Yuki Morishita
>  Labels: datastax_qa
>
> Tested in 0.8.10, reported in 0.8.1.
> Dropped keyspaces and column families have their sstables marked as 
> Compacted, but will not disappear, even on restart. 
> Worked correctly in 1.0.8 where the sstables get deleted almost immediately 
> following the column family drop.





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-03-22 Thread Vijay (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236306#comment-13236306
 ] 

Vijay commented on CASSANDRA-3722:
--

I had almost completed the patch, but filtering based on the pending queue can 
be potentially dangerous.
On a multi-region cluster the pending commands on the local replicas will 
almost always be higher than on the remote ones, because the remote replicas 
don't receive reads. We might not want to filter based on pending.

We could do a hacky solution by padding the scores of the remote DCs to an 
artificially higher value than the local DC. What do you guys think?
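
To make the padding idea concrete, here is a minimal Python sketch. It assumes that lower scores are preferred and that a fixed constant is added to every remote-DC replica; the function names, the data layout, and the constant `REMOTE_DC_PADDING` are all hypothetical and are not taken from the patch.

```python
# Hypothetical sketch: the names and the padding value are illustrative only.
REMOTE_DC_PADDING = 1000.0  # assumed to be larger than any real score


def padded_scores(scores, datacenter_of, local_dc):
    """Return a copy of scores with remote-DC replicas pushed behind local ones.

    scores: dict of endpoint -> score (assumption: lower is better)
    datacenter_of: dict of endpoint -> datacenter name
    """
    return {
        endpoint: score if datacenter_of[endpoint] == local_dc
        else score + REMOTE_DC_PADDING
        for endpoint, score in scores.items()
    }


def sort_by_score(scores, datacenter_of, local_dc):
    """Order endpoints best-first using the padded scores."""
    padded = padded_scores(scores, datacenter_of, local_dc)
    return sorted(padded, key=padded.get)
```

With this approach a remote replica is only chosen ahead of a local one if every local score exceeds the padding, which is the "hacky" part: the constant has to be tuned against the real score range.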

> Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
> --
>
> Key: CASSANDRA-3722
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3722
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.1.0
>Reporter: Vijay
>Assignee: Vijay
>Priority: Minor
>
> Currently the dynamic snitch looks at latency to figure out which node 
> will serve requests best. This works great, but part of the traffic has to 
> be sent to collect this data... There is also a window in which the snitch 
> doesn't know about major events that are going to happen on the node 
> (the node that is going to receive the data request).
> It would be great if we could send some sort of hints to the snitch so it can 
> score based on known events causing higher latencies.





[jira] [Commented] (CASSANDRA-3772) Evaluate Murmur3-based partitioner

2012-03-22 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236329#comment-13236329
 ] 

Jonathan Ellis commented on CASSANDRA-3772:
---

(2975 did get committed.)

> Evaluate Murmur3-based partitioner
> --
>
> Key: CASSANDRA-3772
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3772
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Dave Brosius
> Fix For: 1.2
>
> Attachments: try_murmur3.diff
>
>
> MD5 is a relatively heavyweight hash to use when we don't need cryptographic 
> qualities, just a good output distribution.  Let's see how much overhead we 
> can save by using Murmur3 instead.





[jira] [Updated] (CASSANDRA-3912) support incremental repair controlled by external agent

2012-03-22 Thread Jonathan Ellis (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3912:
--

Reviewer: stuhood

Stu, could you review?

> support incremental repair controlled by external agent
> ---
>
> Key: CASSANDRA-3912
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3912
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Peter Schuller
>Assignee: Peter Schuller
> Fix For: 1.2
>
> Attachments: CASSANDRA-3912-trunk-v1.txt, 
> CASSANDRA-3912-v2-001-add-nodetool-commands.txt, 
> CASSANDRA-3912-v2-002-fix-antientropyservice.txt
>
>
> As a poor man's precursor to CASSANDRA-2699, exposing the ability to repair 
> small parts of a range is extremely useful because it allows (with external 
> scripting logic) slowly repairing a node's content over time. Other than 
> avoiding the bulkiness of complete repairs, it means that you can safely do 
> repairs even if you absolutely cannot afford e.g. disk space spikes (see 
> CASSANDRA-2699 for what the issues are).
> Attaching a patch that exposes a "repairincremental" command to nodetool, 
> where you specify a step and the number of total steps. Incrementally 
> performing a repair in 100 steps, for example, would be done by:
> {code}
> nodetool repairincremental 0 100
> nodetool repairincremental 1 100
> ...
> nodetool repairincremental 99 100
> {code}
> An external script can be used to keep track of what has been repaired and 
> when. This should (1) allow incremental repair to happen now/soon, and 
> (2) allow experimentation and evaluation for an implementation of 
> CASSANDRA-2699 which I still think is a good idea. This patch does nothing to 
> help the average deployment, but at least makes incremental repair possible 
> given sufficient effort spent on external scripting.
> The big "no-no" about the patch is that it is entirely specific to 
> RandomPartitioner and BigIntegerToken. If someone can suggest a way to 
> implement this command generically using the Range/Token abstractions, I'd be 
> happy to hear suggestions.
> An alternative would be to provide a nodetool command that allows you to 
> simply specify the specific token ranges on the command line. It makes using 
> it a bit more difficult, but would mean that it works for any partitioner and 
> token type.
> Unless someone can suggest a better way to do this, I think I'll provide a 
> patch that does this. I'm still leaning towards supporting the simple "step N 
> out of M" form though.
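
The "step N out of M" form described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the logic of the attached patch: it assumes integer tokens (as with RandomPartitioner) and a non-wrapping range, and the function name `sub_range` is hypothetical.

```python
def sub_range(left, right, step, total_steps):
    """Return the (start, end] token bounds of one step of a token range.

    Assumptions: tokens are plain integers and left < right, i.e. the range
    does not wrap around the ring.
    """
    if not 0 <= step < total_steps:
        raise ValueError("step must satisfy 0 <= step < total_steps")
    if left >= right:
        raise ValueError("wrapping ranges are not handled in this sketch")
    width = right - left
    start = left + width * step // total_steps
    end = left + width * (step + 1) // total_steps
    return start, end


# Repairing the range (0, 2**127] in 100 steps, mirroring
# "nodetool repairincremental 0 100" ... "nodetool repairincremental 99 100".
steps = [sub_range(0, 2 ** 127, n, 100) for n in range(100)]
```

Because each step's end is computed with the same integer formula as the next step's start, the sub-ranges are contiguous and together cover the whole range exactly once.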





[jira] [Commented] (CASSANDRA-3912) support incremental repair controlled by external agent

2012-03-22 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236331#comment-13236331
 ] 

Jonathan Ellis commented on CASSANDRA-3912:
---

bq. As a side note, I'll remark that every repair of a range triggers a flush, 
so one should probably be careful to not repair incrementally on too small a 
range.

Is it worth evaluating using the range scan code to compute the trees instead 
of an sstable-only scanner?  That would let us avoid the flush.

> support incremental repair controlled by external agent
> ---
>
> Key: CASSANDRA-3912
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3912
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Peter Schuller
>Assignee: Peter Schuller
> Fix For: 1.2
>
> Attachments: CASSANDRA-3912-trunk-v1.txt, 
> CASSANDRA-3912-v2-001-add-nodetool-commands.txt, 
> CASSANDRA-3912-v2-002-fix-antientropyservice.txt
>
>
> As a poor man's precursor to CASSANDRA-2699, exposing the ability to repair 
> small parts of a range is extremely useful because it allows (with external 
> scripting logic) slowly repairing a node's content over time. Other than 
> avoiding the bulkiness of complete repairs, it means that you can safely do 
> repairs even if you absolutely cannot afford e.g. disk space spikes (see 
> CASSANDRA-2699 for what the issues are).
> Attaching a patch that exposes a "repairincremental" command to nodetool, 
> where you specify a step and the number of total steps. Incrementally 
> performing a repair in 100 steps, for example, would be done by:
> {code}
> nodetool repairincremental 0 100
> nodetool repairincremental 1 100
> ...
> nodetool repairincremental 99 100
> {code}
> An external script can be used to keep track of what has been repaired and 
> when. This should (1) allow incremental repair to happen now/soon, and 
> (2) allow experimentation and evaluation for an implementation of 
> CASSANDRA-2699 which I still think is a good idea. This patch does nothing to 
> help the average deployment, but at least makes incremental repair possible 
> given sufficient effort spent on external scripting.
> The big "no-no" about the patch is that it is entirely specific to 
> RandomPartitioner and BigIntegerToken. If someone can suggest a way to 
> implement this command generically using the Range/Token abstractions, I'd be 
> happy to hear suggestions.
> An alternative would be to provide a nodetool command that allows you to 
> simply specify the specific token ranges on the command line. It makes using 
> it a bit more difficult, but would mean that it works for any partitioner and 
> token type.
> Unless someone can suggest a better way to do this, I think I'll provide a 
> patch that does this. I'm still leaning towards supporting the simple "step N 
> out of M" form though.





[jira] [Commented] (CASSANDRA-2319) Promote row index

2012-03-22 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236333#comment-13236333
 ] 

Jonathan Ellis commented on CASSANDRA-2319:
---

I see getIndexedReadBufferSizeInKB is still around, is that dead code now?

> Promote row index
> -
>
> Key: CASSANDRA-2319
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2319
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Stu Hood
>Assignee: Sylvain Lebresne
>  Labels: index, timeseries
> Fix For: 1.2
>
> Attachments: 2319-v1.tgz, 2319-v2.tgz, promotion.pdf, version-f.txt, 
> version-g-lzf.txt, version-g.txt
>
>
> The row index contains entries for configurably sized blocks of a wide row. 
> For a row of appreciable size, the row index ends up directing the third seek 
> (1. index, 2. row index, 3. content) to near the first column of a scan.
> Since the row index is always used for wide rows, and since it contains 
> information that tells us whether or not the 3rd seek is necessary (the 
> column range or name we are trying to slice may not exist in a given 
> sstable), promoting the row index into the sstable index would allow us to 
> drop the maximum number of seeks for wide rows back to 2, and, more 
> importantly, would allow sstables to be eliminated using only the index.
> An example usecase that benefits greatly from this change is time series data 
> in wide rows, where data is appended to the beginning or end of the row. Our 
> existing compaction strategy gets lucky and clusters the oldest data in the 
> oldest sstables: for queries to recently appended data, we would be able to 
> eliminate wide rows using only the sstable index, rather than needing to seek 
> into the data file to determine that it isn't interesting. For narrow rows, 
> this change would have no effect, as they will not reach the threshold for 
> indexing anyway.
> A first cut design for this change would look very similar to the file format 
> design proposed on #674: 
> http://wiki.apache.org/cassandra/FileFormatDesignDoc: row keys clustered, 
> column names clustered, and offsets clustered and delta encoded.
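
The sstable elimination described above can be sketched roughly as follows in Python. The data layout is an assumption made for illustration (one first/last column bound per row per sstable); the real promoted index would hold entries per block, and the function name is hypothetical.

```python
def sstables_to_read(index_entries, slice_start, slice_end):
    """Pick the sstables whose index says the requested slice may be present.

    index_entries: dict of sstable name -> (first_column, last_column) for the
        row, or None when the sstable does not contain the row at all.
    """
    selected = []
    for name, bounds in index_entries.items():
        if bounds is None:
            continue  # the row is absent from this sstable
        first, last = bounds
        if last < slice_start or first > slice_end:
            continue  # no overlap with the slice: skip the data-file seek
        selected.append(name)
    return selected


# Time series in a wide row: older sstables hold older column names.
entries = {"old": (100, 199), "mid": (200, 299), "new": (300, 399), "other": None}
recent = sstables_to_read(entries, 350, 500)
```

For a query on recently appended data, only the newest sstable survives the check, so the older ones are ruled out without touching their data files.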





[jira] [Updated] (CASSANDRA-4051) Stream sessions can only fail via the FailureDetector

2012-03-22 Thread Brandon Williams (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-4051:


Reviewer: yukim  (was: slebresne)

> Stream sessions can only fail via the FailureDetector
> -
>
> Key: CASSANDRA-4051
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4051
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Brandon Williams
>Assignee: Brandon Williams
>  Labels: streaming
> Fix For: 1.1.0
>
> Attachments: 4051.txt
>
>
> If for some reason, FileStreamTask itself fails more than the number of retry 
> attempts but gossip continues to work, the stream session will never be 
> closed.  This is unlikely to happen in practice since it requires blocking 
> the storage port from new connections but keeping the existing ones, however 
> for the bulk loader this is especially problematic since it doesn't have 
> access to a failure detector and thus no way of knowing if a session failed.





[jira] [Commented] (CASSANDRA-1123) Allow tracing query details

2012-03-22 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236339#comment-13236339
 ] 

Jonathan Ellis commented on CASSANDRA-1123:
---

Aaron, do you think you'll be getting back to this?  If not I'm happy to find 
someone to work on it.

> Allow tracing query details
> ---
>
> Key: CASSANDRA-1123
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1123
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Aaron Morton
> Fix For: 1.2
>
> Attachments: 1123-3.patch.gz
>
>
> In the spirit of CASSANDRA-511, it would be useful to enable tracing on queries to 
> see where latency is coming from: how long did row cache lookup take?  key 
> search in the index?  merging the data from the sstables?  etc.
> The main difference vs setting debug logging is that debug logging is too big 
> of a hammer; by turning on the flood of logging for everyone, you actually 
> distort the information you're looking for.  This would be something you 
> could set per-query (or more likely per connection).
> We don't need to be as sophisticated as the techniques discussed in the 
> following papers but they are interesting reading:
> http://research.google.com/pubs/pub36356.html
> http://www.usenix.org/events/osdi04/tech/full_papers/barham/barham_html/
> http://www.usenix.org/event/nsdi07/tech/fonseca.html





[jira] [Resolved] (CASSANDRA-3469) More fine-grained request statistics

2012-03-22 Thread Jonathan Ellis (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-3469.
---

   Resolution: Won't Fix
Fix Version/s: (was: 1.2)
 Assignee: (was: Yuki Morishita)

resolving as wontfix for now.  will revise if necessary depending on how 1123 
goes.

> More fine-grained request statistics
> 
>
> Key: CASSANDRA-3469
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3469
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Priority: Minor
>
> It would be useful to split the CFS stats up by query type.  slice vs named 
> vs range vs index, to start with (right now we don't track range scans at 
> all), but also at the "prepared statement" level as it were:
> {{SELECT x FROM foo WHERE key = ?}} would be one query no matter what the ? 
> is, but {{SELECT y FROM foo WHERE key = ?}} would be different.  {{SELECT 
> x..y FROM foo WHERE key = ?}} would be another, as would {{SELECT x FROM foo 
> WHERE key = ? AND bar= ?}}.  (But {{SELECT x FROM foo WHERE bar = ? AND key = 
> ?}} would be identical to the former, of course.)
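
One way to picture the "prepared statement" level grouping described above is a normalized key per query shape. The Python sketch below is only an illustration: it handles the simple `SELECT ... FROM ... WHERE col = ? AND ...` form used in the examples, and the function name `stats_key` and the regular expression are assumptions, not anything from Cassandra.

```python
import re

# Matches only the simple query shape used in the examples above.
_QUERY = re.compile(
    r"^\s*SELECT\s+(.+?)\s+FROM\s+(\w+)\s+WHERE\s+(.+?)\s*$", re.IGNORECASE
)


def stats_key(query):
    """Return a key that is equal for queries that should share statistics."""
    match = _QUERY.match(query)
    if match is None:
        raise ValueError("unsupported query shape: %r" % query)
    selection, table, where = match.groups()
    # The order of WHERE terms does not matter, so keep the column names as a set.
    where_columns = frozenset(
        term.split("=")[0].strip()
        for term in re.split(r"\s+AND\s+", where, flags=re.IGNORECASE)
    )
    return (selection.strip(), table, where_columns)
```

Under this sketch, `WHERE key = ? AND bar= ?` and `WHERE bar = ? AND key = ?` map to the same key, while selecting `x`, `y`, or `x..y` yields three different keys.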





[jira] [Commented] (CASSANDRA-4075) Dropped keyspaces and cfs do not get deleted

2012-03-22 Thread Joaquin Casares (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236347#comment-13236347
 ] 

Joaquin Casares commented on CASSANDRA-4075:


Also, on my 0.8.10 test, at first we thought it might have been CASSANDRA-2942, 
but this was tested on one node built out of source.

> Dropped keyspaces and cfs do not get deleted
> 
>
> Key: CASSANDRA-4075
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4075
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 0.8.1
>Reporter: Joaquin Casares
>Assignee: Yuki Morishita
>  Labels: datastax_qa
>
> Tested in 0.8.10, reported in 0.8.1.
> Dropped keyspaces and column families have their sstables marked as 
> Compacted, but will not disappear, even on restart. 
> Worked correctly in 1.0.8 where the sstables get deleted almost immediately 
> following the column family drop.





[jira] [Assigned] (CASSANDRA-3794) Change ColumnFamily identifiers to be UUIDs instead of sequential Integers.

2012-03-22 Thread Jonathan Ellis (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-3794:
-

Assignee: (was: Pavel Yaskevich)

I believe Pavel is working on other things now (he can correct me if I am 
wrong) so I am clearing the assignment in case someone else can tackle it.

> Change ColumnFamily identifiers to be UUIDs instead of sequential Integers.
> ---
>
> Key: CASSANDRA-3794
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3794
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Pavel Yaskevich
>Priority: Minor
> Fix For: 1.2
>
>
> Change ColumnFamily identifiers to be UUIDs instead of sequential Integers. 
> Would be useful in the situation when nodes are simultaneously trying to create 
> ColumnFamilies.





[jira] [Updated] (CASSANDRA-2942) Dropped columnfamilies can leave orphaned data files that do not get cleared on restart

2012-03-22 Thread Jonathan Ellis (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2942:
--

Description: 
* Bring up 3 node cluster
* From node1: Run Stress Tool
{code} stress --num-keys=10 --columns=10 --consistency-level=ALL 
--average-size-values --replication-factor=3 --nodes=node1,node2 {code}
* Shutdown node3
* From node1: drop the Standard1 CF in Keyspace1
* Shutdown node2 and node3
* Bring up node1 and node2. Check that the Standard1 files are gone.
{code}
ls -al /var/lib/cassandra/data/Keyspace1/
{code}
* Bring up node3. The log file shows the drop column family occurs
{code}
 INFO 00:51:25,742 Applying migration 9a76f880-b4c5-11e0--8901a7c5c9ce Drop 
column family: Keyspace1.Standard1
{code}
* Restart node3 to clear out dropped tables from the filesystem
{code}
root@cathy3:~/cass-0.8/bin# ls -al /var/lib/cassandra/data/Keyspace1/
total 36
drwxr-xr-x 3 root root 4096 Jul 23 00:51 .
drwxr-xr-x 6 root root 4096 Jul 23 00:48 ..
-rw-r--r-- 1 root root    0 Jul 23 00:51 Standard1-g-1-Compacted
-rw-r--r-- 2 root root 5770 Jul 23 00:51 Standard1-g-1-Data.db
-rw-r--r-- 2 root root   32 Jul 23 00:51 Standard1-g-1-Filter.db
-rw-r--r-- 2 root root  120 Jul 23 00:51 Standard1-g-1-Index.db
-rw-r--r-- 2 root root 4276 Jul 23 00:51 Standard1-g-1-Statistics.db
drwxr-xr-x 3 root root 4096 Jul 23 00:51 snapshots
{code}
*Bug:  The files for Standard1 are orphaned on node3*



  was:

* Bring up 3 node cluster
* From node1: Run Stress Tool
{code} stress --num-keys=10 --columns=10 --consistency-level=ALL 
--average-size-values --replication-factor=3 --nodes=node1,node2 {code}
* Shutdown node3
* From node1: drop the Standard1 CF in Keyspace1
* Shutdown node2 and node3
* Bring up node1 and node2. Check that the Standard1 files are gone.
{code}
ls -al /var/lib/cassandra/data/Keyspace1/
{code}
* Bring up node3. The log file shows the drop column family occurs
{code}
 INFO 00:51:25,742 Applying migration 9a76f880-b4c5-11e0--8901a7c5c9ce Drop 
column family: Keyspace1.Standard1
{code}
* Restart node3 to clear out dropped tables from the filesystem
{code}
root@cathy3:~/cass-0.8/bin# ls -al /var/lib/cassandra/data/Keyspace1/
total 36
drwxr-xr-x 3 root root 4096 Jul 23 00:51 .
drwxr-xr-x 6 root root 4096 Jul 23 00:48 ..
-rw-r--r-- 1 root root    0 Jul 23 00:51 Standard1-g-1-Compacted
-rw-r--r-- 2 root root 5770 Jul 23 00:51 Standard1-g-1-Data.db
-rw-r--r-- 2 root root   32 Jul 23 00:51 Standard1-g-1-Filter.db
-rw-r--r-- 2 root root  120 Jul 23 00:51 Standard1-g-1-Index.db
-rw-r--r-- 2 root root 4276 Jul 23 00:51 Standard1-g-1-Statistics.db
drwxr-xr-x 3 root root 4096 Jul 23 00:51 snapshots
{code}
*Bug:  The files for Standard1 are orphaned on node3*



Summary: Dropped columnfamilies can leave orphaned data files that do 
not get cleared on restart  (was: If you drop a CF when one node is down the 
files are orphaned on the downed node)

> Dropped columnfamilies can leave orphaned data files that do not get cleared 
> on restart
> ---
>
> Key: CASSANDRA-2942
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2942
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 0.7.0
>Reporter: Cathy Daw
>Assignee: Jonathan Ellis
>Priority: Minor
> Fix For: 1.0.0
>
> Attachments: 2942.txt
>
>
> * Bring up 3 node cluster
> * From node1: Run Stress Tool
> {code} stress --num-keys=10 --columns=10 --consistency-level=ALL 
> --average-size-values --replication-factor=3 --nodes=node1,node2 {code}
> * Shutdown node3
> * From node1: drop the Standard1 CF in Keyspace1
> * Shutdown node2 and node3
> * Bring up node1 and node2. Check that the Standard1 files are gone.
> {code}
> ls -al /var/lib/cassandra/data/Keyspace1/
> {code}
> * Bring up node3. The log file shows the drop column family occurs
> {code}
>  INFO 00:51:25,742 Applying migration 9a76f880-b4c5-11e0--8901a7c5c9ce 
> Drop column family: Keyspace1.Standard1
> {code}
> * Restart node3 to clear out dropped tables from the filesystem
> {code}
> root@cathy3:~/cass-0.8/bin# ls -al /var/lib/cassandra/data/Keyspace1/
> total 36
> drwxr-xr-x 3 root root 4096 Jul 23 00:51 .
> drwxr-xr-x 6 root root 4096 Jul 23 00:48 ..
> -rw-r--r-- 1 root root    0 Jul 23 00:51 Standard1-g-1-Compacted
> -rw-r--r-- 2 root root 5770 Jul 23 00:51 Standard1-g-1-Data.db
> -rw-r--r-- 2 root root   32 Jul 23 00:51 Standard1-g-1-Filter.db
> -rw-r--r-- 2 root root  120 Jul 23 00:51 Standard1-g-1-Index.db
> -rw-r--r-- 2 root root 4276 Jul 23 00:51 Standard1-g-1-Statistics.db
> drwxr-xr-x 3 root root 4096 Jul 23 00:51 snapshots
> {code}
> *Bug:  The files for Standard1 are orphaned on node3*


[jira] [Commented] (CASSANDRA-4075) Dropped keyspaces and cfs do not get deleted

2012-03-22 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236350#comment-13236350
 ] 

Jonathan Ellis commented on CASSANDRA-4075:
---

The title on 2942 is misleading. As I said in my first comment there, "You 
should be able to reproduce this even on a single node – just drop a CF, then 
restart. It only cleans out marked-for-delete files from known CFs."
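
The behaviour quoted above can be illustrated with a short Python sketch. It is not Cassandra's startup code: the function names are hypothetical, and it assumes file names of the form `<cf>-<version>-<generation>-<component>`, as in the directory listing on CASSANDRA-2942.

```python
def marked_for_delete(filenames):
    """Return the sstable component files whose generation has a -Compacted marker."""
    def prefix(name):
        # "Standard1-g-1-Data.db" -> "Standard1-g-1"
        return name.rsplit("-", 1)[0]

    compacted = {prefix(n) for n in filenames if n.endswith("-Compacted")}
    return sorted(n for n in filenames if prefix(n) in compacted)


def cleanup_candidates(filenames, known_cfs=None):
    """Files a startup cleanup would remove.

    With known_cfs given, only marked files belonging to those column families
    are returned (the behaviour Jonathan Ellis describes); with None, every
    marked file is returned.
    """
    marked = marked_for_delete(filenames)
    if known_cfs is None:
        return marked
    return [n for n in marked if n.split("-", 1)[0] in known_cfs]


listing = [
    "Standard1-g-1-Compacted",
    "Standard1-g-1-Data.db",
    "Standard1-g-1-Filter.db",
    "Standard1-g-1-Index.db",
    "Standard1-g-1-Statistics.db",
    "snapshots",
]
# After the drop, Standard1 is no longer a known CF, so nothing is cleaned up.
orphaned = cleanup_candidates(listing, known_cfs=set())
```

Once the column family has been dropped it is no longer in the known set, so its marked files are never visited and stay on disk across restarts.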

> Dropped keyspaces and cfs do not get deleted
> 
>
> Key: CASSANDRA-4075
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4075
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 0.8.1
>Reporter: Joaquin Casares
>Assignee: Yuki Morishita
>  Labels: datastax_qa
>
> Tested in 0.8.10, reported in 0.8.1.
> Dropped keyspaces and column families have their sstables marked as 
> Compacted, but will not disappear, even on restart. 
> Worked correctly in 1.0.8 where the sstables get deleted almost immediately 
> following the column family drop.





[jira] [Commented] (CASSANDRA-4075) Dropped keyspaces and cfs do not get deleted

2012-03-22 Thread Joaquin Casares (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236361#comment-13236361
 ] 

Joaquin Casares commented on CASSANDRA-4075:


Yes, it may actually be that ticket then. Sorry, I misread what you meant by 
single node. The patch doesn't apply cleanly on 0.8.10. I'll look at it in the 
morning to see if it's an easy backport. If not, can we get a backport of it to 
0.8.x, particularly 0.8.1?






[jira] [Commented] (CASSANDRA-4075) Dropped keyspaces and cfs do not get deleted

2012-03-22 Thread Joaquin Casares (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236401#comment-13236401
 ] 

Joaquin Casares commented on CASSANDRA-4075:


Also, I just tested forcing a full GC on both 0.8.10 and 0.8.1 after waiting 
rpc_timeout seconds, and the sstables that had been marked as compacted by the 
drop command did clear.

The GC was triggered from jconsole, using the gc() operation on the 
java.lang Memory MBean.
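
For anyone who wants to repeat this without jconsole, the same operation can be invoked over JMX from a few lines of Java. This is a rough sketch rather than anything shipped with Cassandra: the class name `ForceGc` is made up, and the `host:port` argument is assumed to point at the node's JMX endpoint, whose port depends on your configuration.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper, not part of Cassandra.
class ForceGc {
    // Standard platform MBean name: "java.lang:type=Memory".
    static final String MEMORY_MBEAN = ManagementFactory.MEMORY_MXBEAN_NAME;

    /** Invokes the no-argument gc() operation on the Memory MBean. */
    static void forceGc(MBeanServerConnection conn) throws Exception {
        conn.invoke(new ObjectName(MEMORY_MBEAN), "gc", null, null);
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            // No target given: run against the local JVM as a smoke test.
            forceGc(ManagementFactory.getPlatformMBeanServer());
            return;
        }
        // args[0] is assumed to be host:port of the node's JMX endpoint.
        JMXServiceURL url = new JMXServiceURL(
            "service:jmx:rmi:///jndi/rmi://" + args[0] + "/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            forceGc(connector.getMBeanServerConnection());
        }
    }
}
```

Note that gc() is only a request to the JVM, so this reproduces the jconsole step and nothing more. Whether the compacted sstables are then removed still depends on the behaviour discussed in this ticket.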

