[jira] [Commented] (CASSANDRA-6053) system.peers table not updated after decommissioning nodes in C* 2.0

2013-12-30 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858811#comment-13858811
 ] 

Brandon Williams commented on CASSANDRA-6053:
-

Can you provide debug logs from node2?

 system.peers table not updated after decommissioning nodes in C* 2.0
 

 Key: CASSANDRA-6053
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6053
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Datastax AMI running EC2 m1.xlarge instances
Reporter: Guyon Moree
Assignee: Tyler Hobbs
 Attachments: peers


 After decommissioning my cluster from 20 to 9 nodes using OpsCenter, I found 
 that all but one of the nodes had incorrect system.peers tables.
 This became a problem (afaik) when using the python-driver, since it queries 
 the peers table to set up its connection pool, resulting in very slow startup 
 times because of timeouts.
 The output of nodetool didn't seem to be affected. After removing the 
 incorrect entries from the peers tables, the connection issues seem to have 
 disappeared for us.
 I would like some feedback on whether this was the right way to handle the 
 issue or whether I'm still left with a broken cluster.
 Attached is the output of nodetool status, which shows the correct 9 nodes. 
 Below that is the output of the system.peers tables on the individual nodes.
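 The manual cleanup described above can be sketched as follows. This is a 
 hedged example, not an official procedure: the address 10.0.0.5 is 
 hypothetical, and you should confirm against nodetool status that an entry 
 really refers to a decommissioned node before deleting it.

```shell
# List the peers each node knows about (run on every node).
cqlsh -e "SELECT peer, host_id FROM system.peers;"

# Compare against the live ring membership.
nodetool status

# If a peers row refers to a decommissioned node, remove it on the node
# that still lists it. 10.0.0.5 is a hypothetical stale address.
cqlsh -e "DELETE FROM system.peers WHERE peer = '10.0.0.5';"
```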



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Assigned] (CASSANDRA-6501) Cannot run pig examples on current 2.0 branch

2013-12-30 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams reassigned CASSANDRA-6501:
---

Assignee: Alex Liu

 Cannot run pig examples on current 2.0 branch
 -

 Key: CASSANDRA-6501
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6501
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Reporter: Jeremy Hanna
Assignee: Alex Liu
  Labels: pig

 I checked out the cassandra-2.0 branch to try the pig examples because the 
 2.0.3 release has the CASSANDRA-6309 problem which is fixed on the branch.  I 
 tried to run both the cql and the CassandraStorage examples in local mode 
 with pig 0.10.1, 0.11.1, and 0.12.0 and all of them give the following error 
 and stack trace:
 {quote}
 ERROR 2998: Unhandled internal error. readLength_
 java.lang.NoSuchFieldError: readLength_
   at 
 org.apache.cassandra.thrift.TBinaryProtocol$Factory.getProtocol(TBinaryProtocol.java:57)
   at org.apache.thrift.TSerializer.<init>(TSerializer.java:66)
   at 
 org.apache.cassandra.hadoop.pig.AbstractCassandraStorage.cfdefToString(AbstractCassandraStorage.java:508)
   at 
 org.apache.cassandra.hadoop.pig.AbstractCassandraStorage.initSchema(AbstractCassandraStorage.java:470)
   at 
 org.apache.cassandra.hadoop.pig.CassandraStorage.setLocation(CassandraStorage.java:318)
   at 
 org.apache.cassandra.hadoop.pig.CassandraStorage.getSchema(CassandraStorage.java:357)
   at 
 org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:151)
   at 
 org.apache.pig.newplan.logical.relational.LOLoad.getSchema(LOLoad.java:110)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.alias_col_ref(LogicalPlanGenerator.java:15356)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.col_ref(LogicalPlanGenerator.java:15203)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.projectable_expr(LogicalPlanGenerator.java:8881)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.var_expr(LogicalPlanGenerator.java:8632)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.expr(LogicalPlanGenerator.java:7984)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.flatten_generated_item(LogicalPlanGenerator.java:5962)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.generate_clause(LogicalPlanGenerator.java:14101)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.foreach_plan(LogicalPlanGenerator.java:12493)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.foreach_clause(LogicalPlanGenerator.java:12360)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1577)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:789)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:507)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:382)
   at 
 org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:175)
   at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1589)
   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1540)
   at org.apache.pig.PigServer.registerQuery(PigServer.java:540)
   at 
 org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:970)
   at 
 org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
   at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189)
   at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
   at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
   at org.apache.pig.Main.run(Main.java:555)
   at org.apache.pig.Main.main(Main.java:111)
 
 {quote}
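 A {{java.lang.NoSuchFieldError}} at runtime generally means two binary-incompatible 
 versions of the same library (here, libthrift) ended up on the classpath. As a 
 first diagnostic step, one might look for competing thrift jars; this is a 
 hedged sketch, and $PIG_HOME and the checkout path are assumptions:

```shell
# Look for competing libthrift jars shipped by Pig and by the
# Cassandra checkout. Paths are hypothetical.
find "$PIG_HOME" ~/cassandra-2.0 -name 'libthrift*.jar' 2>/dev/null
```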





[3/3] git commit: Merge branch 'cassandra-2.0' into trunk

2013-12-30 Thread brandonwilliams
Merge branch 'cassandra-2.0' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7eaedbf9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7eaedbf9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7eaedbf9

Branch: refs/heads/trunk
Commit: 7eaedbf9587089e9b856c47959b76ee8a1093914
Parents: 35c2f92 604e31b
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Dec 30 08:18:39 2013 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Dec 30 08:18:39 2013 -0600

--
 .../cassandra/db/HintedHandOffManager.java  | 24 ++-
 .../cassandra/db/HintedHandOffManagerMBean.java |  6 
 .../org/apache/cassandra/tools/NodeCmd.java |  8 -
 .../org/apache/cassandra/tools/NodeProbe.java   | 21 +
 .../apache/cassandra/tools/NodeToolHelp.yaml|  3 ++
 .../apache/cassandra/db/HintedHandOffTest.java  | 32 
 6 files changed, 92 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/7eaedbf9/src/java/org/apache/cassandra/db/HintedHandOffManager.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7eaedbf9/src/java/org/apache/cassandra/tools/NodeCmd.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7eaedbf9/src/java/org/apache/cassandra/tools/NodeProbe.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7eaedbf9/test/unit/org/apache/cassandra/db/HintedHandOffTest.java
--



[1/3] git commit: Allow truncating hints from nodetool. Patch by Sankalp Kohli, reviewed by brandonwilliams for CASSANDRA-6158

2013-12-30 Thread brandonwilliams
Updated Branches:
  refs/heads/cassandra-2.0 bbc089125 -> 604e31b60
  refs/heads/trunk 35c2f9285 -> 7eaedbf95


Allow truncating hints from nodetool.
Patch by Sankalp Kohli, reviewed by brandonwilliams for CASSANDRA-6158
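
A usage sketch of the new command, per the NodeCmd changes below (the host
address is hypothetical; requires a build containing this patch):

```shell
# Truncate all stored hints on the node.
nodetool -h 127.0.0.1 truncatehints

# Truncate hints destined for a single endpoint (hypothetical address).
nodetool -h 127.0.0.1 truncatehints 10.0.0.5
```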


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/604e31b6
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/604e31b6
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/604e31b6

Branch: refs/heads/cassandra-2.0
Commit: 604e31b60998d5df349f8cc84d06bbae1391dc0e
Parents: bbc0891
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Dec 30 08:17:50 2013 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Dec 30 08:17:50 2013 -0600

--
 .../cassandra/db/HintedHandOffManager.java  | 24 ++-
 .../cassandra/db/HintedHandOffManagerMBean.java |  6 
 .../org/apache/cassandra/tools/NodeCmd.java |  8 -
 .../org/apache/cassandra/tools/NodeProbe.java   | 21 +
 .../apache/cassandra/tools/NodeToolHelp.yaml|  3 ++
 .../apache/cassandra/db/HintedHandOffTest.java  | 32 
 6 files changed, 92 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/604e31b6/src/java/org/apache/cassandra/db/HintedHandOffManager.java
--
diff --git a/src/java/org/apache/cassandra/db/HintedHandOffManager.java 
b/src/java/org/apache/cassandra/db/HintedHandOffManager.java
index 9508a86..40d5aaa 100644
--- a/src/java/org/apache/cassandra/db/HintedHandOffManager.java
+++ b/src/java/org/apache/cassandra/db/HintedHandOffManager.java
@@ -225,7 +225,29 @@ public class HintedHandOffManager implements 
HintedHandOffManagerMBean
 }
 }
 };
-StorageService.optionalTasks.execute(runnable);
+StorageService.optionalTasks.submit(runnable);
+}
+
+//foobar
+public void truncateAllHints() throws ExecutionException, 
InterruptedException
+{
+Runnable runnable = new Runnable()
+{
+public void run()
+{
+try
+{
+logger.info("Truncating all stored hints.");
+
Keyspace.open(Keyspace.SYSTEM_KS).getColumnFamilyStore(SystemKeyspace.HINTS_CF).truncateBlocking();
+}
+catch (Exception e)
+{
+logger.warn("Could not truncate all hints.", e);
+}
+}
+};
+StorageService.optionalTasks.submit(runnable).get();
+
 }
 
 @VisibleForTesting

http://git-wip-us.apache.org/repos/asf/cassandra/blob/604e31b6/src/java/org/apache/cassandra/db/HintedHandOffManagerMBean.java
--
diff --git a/src/java/org/apache/cassandra/db/HintedHandOffManagerMBean.java 
b/src/java/org/apache/cassandra/db/HintedHandOffManagerMBean.java
index 9a024e6..bbb2a14 100644
--- a/src/java/org/apache/cassandra/db/HintedHandOffManagerMBean.java
+++ b/src/java/org/apache/cassandra/db/HintedHandOffManagerMBean.java
@@ -19,6 +19,7 @@ package org.apache.cassandra.db;
 
 import java.net.UnknownHostException;
 import java.util.List;
+import java.util.concurrent.ExecutionException;
 
 public interface HintedHandOffManagerMBean
 {
@@ -29,6 +30,11 @@ public interface HintedHandOffManagerMBean
 public void deleteHintsForEndpoint(final String host);
 
 /**
+ *  Truncate all the hints
+ */
+public void truncateAllHints() throws ExecutionException, 
InterruptedException;
+
+/**
  * List all the endpoints that this node has hints for.
  * @return set of endpoints; as Strings
  */

http://git-wip-us.apache.org/repos/asf/cassandra/blob/604e31b6/src/java/org/apache/cassandra/tools/NodeCmd.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeCmd.java 
b/src/java/org/apache/cassandra/tools/NodeCmd.java
index c32539f..f1e7c73 100644
--- a/src/java/org/apache/cassandra/tools/NodeCmd.java
+++ b/src/java/org/apache/cassandra/tools/NodeCmd.java
@@ -164,6 +164,7 @@ public class NodeCmd
 STOP,
 STOPDAEMON,
 TPSTATS,
+TRUNCATEHINTS,
 UPGRADESSTABLES,
 VERSION,
 DESCRIBERING,
@@ -1116,7 +1117,12 @@ public class NodeCmd
 case RESETLOCALSCHEMA: probe.resetLocalSchema(); break;
 case ENABLEBACKUP: 
probe.setIncrementalBackupsEnabled(true); break;
 case DISABLEBACKUP   : 
probe.setIncrementalBackupsEnabled(false); break;
-
+
+case TRUNCATEHINTS:
+if (arguments.length > 1) badUse("Too many arguments.");

[2/3] git commit: Allow truncating hints from nodetool. Patch by Sankalp Kohli, reviewed by brandonwilliams for CASSANDRA-6158

2013-12-30 Thread brandonwilliams
Allow truncating hints from nodetool.
Patch by Sankalp Kohli, reviewed by brandonwilliams for CASSANDRA-6158


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/604e31b6
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/604e31b6
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/604e31b6

Branch: refs/heads/trunk
Commit: 604e31b60998d5df349f8cc84d06bbae1391dc0e
Parents: bbc0891
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Dec 30 08:17:50 2013 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Dec 30 08:17:50 2013 -0600

--
 .../cassandra/db/HintedHandOffManager.java  | 24 ++-
 .../cassandra/db/HintedHandOffManagerMBean.java |  6 
 .../org/apache/cassandra/tools/NodeCmd.java |  8 -
 .../org/apache/cassandra/tools/NodeProbe.java   | 21 +
 .../apache/cassandra/tools/NodeToolHelp.yaml|  3 ++
 .../apache/cassandra/db/HintedHandOffTest.java  | 32 
 6 files changed, 92 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/604e31b6/src/java/org/apache/cassandra/db/HintedHandOffManager.java
--
diff --git a/src/java/org/apache/cassandra/db/HintedHandOffManager.java 
b/src/java/org/apache/cassandra/db/HintedHandOffManager.java
index 9508a86..40d5aaa 100644
--- a/src/java/org/apache/cassandra/db/HintedHandOffManager.java
+++ b/src/java/org/apache/cassandra/db/HintedHandOffManager.java
@@ -225,7 +225,29 @@ public class HintedHandOffManager implements 
HintedHandOffManagerMBean
 }
 }
 };
-StorageService.optionalTasks.execute(runnable);
+StorageService.optionalTasks.submit(runnable);
+}
+
+//foobar
+public void truncateAllHints() throws ExecutionException, 
InterruptedException
+{
+Runnable runnable = new Runnable()
+{
+public void run()
+{
+try
+{
+logger.info("Truncating all stored hints.");
+
Keyspace.open(Keyspace.SYSTEM_KS).getColumnFamilyStore(SystemKeyspace.HINTS_CF).truncateBlocking();
+}
+catch (Exception e)
+{
+logger.warn("Could not truncate all hints.", e);
+}
+}
+};
+StorageService.optionalTasks.submit(runnable).get();
+
 }
 
 @VisibleForTesting

http://git-wip-us.apache.org/repos/asf/cassandra/blob/604e31b6/src/java/org/apache/cassandra/db/HintedHandOffManagerMBean.java
--
diff --git a/src/java/org/apache/cassandra/db/HintedHandOffManagerMBean.java 
b/src/java/org/apache/cassandra/db/HintedHandOffManagerMBean.java
index 9a024e6..bbb2a14 100644
--- a/src/java/org/apache/cassandra/db/HintedHandOffManagerMBean.java
+++ b/src/java/org/apache/cassandra/db/HintedHandOffManagerMBean.java
@@ -19,6 +19,7 @@ package org.apache.cassandra.db;
 
 import java.net.UnknownHostException;
 import java.util.List;
+import java.util.concurrent.ExecutionException;
 
 public interface HintedHandOffManagerMBean
 {
@@ -29,6 +30,11 @@ public interface HintedHandOffManagerMBean
 public void deleteHintsForEndpoint(final String host);
 
 /**
+ *  Truncate all the hints
+ */
+public void truncateAllHints() throws ExecutionException, 
InterruptedException;
+
+/**
  * List all the endpoints that this node has hints for.
  * @return set of endpoints; as Strings
  */

http://git-wip-us.apache.org/repos/asf/cassandra/blob/604e31b6/src/java/org/apache/cassandra/tools/NodeCmd.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeCmd.java 
b/src/java/org/apache/cassandra/tools/NodeCmd.java
index c32539f..f1e7c73 100644
--- a/src/java/org/apache/cassandra/tools/NodeCmd.java
+++ b/src/java/org/apache/cassandra/tools/NodeCmd.java
@@ -164,6 +164,7 @@ public class NodeCmd
 STOP,
 STOPDAEMON,
 TPSTATS,
+TRUNCATEHINTS,
 UPGRADESSTABLES,
 VERSION,
 DESCRIBERING,
@@ -1116,7 +1117,12 @@ public class NodeCmd
 case RESETLOCALSCHEMA: probe.resetLocalSchema(); break;
 case ENABLEBACKUP: 
probe.setIncrementalBackupsEnabled(true); break;
 case DISABLEBACKUP   : 
probe.setIncrementalBackupsEnabled(false); break;
-
+
+case TRUNCATEHINTS:
+if (arguments.length > 1) badUse("Too many arguments.");
+else if (arguments.length == 1) 
probe.truncateHints(arguments[0]);
+   

git commit: fix test build

2013-12-30 Thread brandonwilliams
Updated Branches:
  refs/heads/trunk 7eaedbf95 -> 23bc52386


fix test build


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/23bc5238
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/23bc5238
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/23bc5238

Branch: refs/heads/trunk
Commit: 23bc52386bd13fddf01037ac7e9cfb234f52ec43
Parents: 7eaedbf
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Dec 30 08:30:43 2013 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Dec 30 08:30:43 2013 -0600

--
 test/unit/org/apache/cassandra/db/HintedHandOffTest.java | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/23bc5238/test/unit/org/apache/cassandra/db/HintedHandOffTest.java
--
diff --git a/test/unit/org/apache/cassandra/db/HintedHandOffTest.java 
b/test/unit/org/apache/cassandra/db/HintedHandOffTest.java
index 262e518..c3b9367 100644
--- a/test/unit/org/apache/cassandra/db/HintedHandOffTest.java
+++ b/test/unit/org/apache/cassandra/db/HintedHandOffTest.java
@@ -102,8 +102,8 @@ public class HintedHandOffTest extends SchemaLoader
 hintStore.clearUnsafe();
 
 // insert 1 hint
-RowMutation rm = new RowMutation(KEYSPACE4, ByteBufferUtil.bytes(1));
-rm.add(STANDARD1_CF, ByteBufferUtil.bytes(String.valueOf(COLUMN1)), 
ByteBufferUtil.EMPTY_BYTE_BUFFER, System.currentTimeMillis());
+Mutation rm = new Mutation(KEYSPACE4, ByteBufferUtil.bytes(1));
+rm.add(STANDARD1_CF, Util.cellname(COLUMN1), 
ByteBufferUtil.EMPTY_BYTE_BUFFER, System.currentTimeMillis());
 
 HintedHandOffManager.instance.hintFor(rm, 
HintedHandOffManager.calculateHintTTL(rm), UUID.randomUUID()).apply();
 



[3/8] git commit: Allow nodetool to optionally resolve hostnames. Patch by Daneel S. Yaitskov, reviewed by brandonwilliams for CASSANDRA-2238

2013-12-30 Thread brandonwilliams
Allow nodetool to optionally resolve hostnames.
Patch by Daneel S. Yaitskov, reviewed by brandonwilliams for
CASSANDRA-2238
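
With this patch, nodetool can reverse-resolve addresses to hostnames via the
new -r/--resolve-ip flag added in the diff below. A quick sketch, assuming the
flag is wired into the status output as the ClusterStatus changes suggest:

```shell
# Default output shows IP addresses.
nodetool status

# With -r/--resolve-ip, addresses are shown as resolved hostnames.
nodetool status -r
```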


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2f63bbad
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2f63bbad
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2f63bbad

Branch: refs/heads/trunk
Commit: 2f63bbadfa2fad66e04bc17edc66ce8be7497157
Parents: 3eef540
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Dec 30 09:51:36 2013 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Dec 30 09:51:36 2013 -0600

--
 .../org/apache/cassandra/tools/NodeCmd.java | 118 +--
 1 file changed, 79 insertions(+), 39 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2f63bbad/src/java/org/apache/cassandra/tools/NodeCmd.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeCmd.java 
b/src/java/org/apache/cassandra/tools/NodeCmd.java
index 87b114c..e14ba1a 100644
--- a/src/java/org/apache/cassandra/tools/NodeCmd.java
+++ b/src/java/org/apache/cassandra/tools/NodeCmd.java
@@ -73,6 +73,7 @@ public class NodeCmd
 private static final Pair<String, String> UPGRADE_ALL_SSTABLE_OPT = 
Pair.create("a", "include-all-sstables");
 private static final Pair<String, String> NO_SNAPSHOT = Pair.create("ns", 
"no-snapshot");
 private static final Pair<String, String> CFSTATS_IGNORE_OPT = 
Pair.create("i", "ignore");
+private static final Pair<String, String> RESOLVE_IP = Pair.create("r", 
"resolve-ip");

 private static final String DEFAULT_HOST = "127.0.0.1";
 private static final int DEFAULT_PORT = 7199;
@@ -98,6 +99,7 @@ public class NodeCmd
 options.addOption(UPGRADE_ALL_SSTABLE_OPT, false, "includes sstables 
that are already on the most recent version during upgradesstables");
 options.addOption(NO_SNAPSHOT, false, "disables snapshot creation for 
scrub");
 options.addOption(CFSTATS_IGNORE_OPT, false, "ignore the supplied list 
of keyspace.columnfamiles in statistics");
+options.addOption(RESOLVE_IP, false, "show node domain names instead 
of IPs");
 }
 
 public NodeCmd(NodeProbe probe)
@@ -374,11 +376,13 @@ public class NodeCmd
 MapString, String loadMap, hostIDMap, tokensToEndpoints;
 EndpointSnitchInfoMBean epSnitchInfo;
 PrintStream outs;
+private final boolean resolveIp;
 
-ClusterStatus(PrintStream outs, String kSpace)
+ClusterStatus(PrintStream outs, String kSpace, boolean resolveIp)
 {
 this.kSpace = kSpace;
 this.outs = outs;
+this.resolveIp = resolveIp;
 joiningNodes = probe.getJoiningNodes();
 leavingNodes = probe.getLeavingNodes();
 movingNodes = probe.getMovingNodes();
@@ -396,18 +400,58 @@ public class NodeCmd
 outs.println(|/ State=Normal/Leaving/Joining/Moving);
 }
 
-private Map<String, Map<InetAddress, Float>> 
getOwnershipByDc(Map<InetAddress, Float> ownerships)
+class SetHostStat implements Iterable<HostStat> {
+final List<HostStat> hostStats = new ArrayList<HostStat>();
+
+public SetHostStat() {}
+
+public SetHostStat(Map<InetAddress, Float> ownerships) {
+for (Map.Entry<InetAddress, Float> entry : 
ownerships.entrySet()) {
+hostStats.add(new HostStat(entry));
+}
+}
+
+@Override
+public Iterator<HostStat> iterator() {
+return hostStats.iterator();
+}
+
+public void add(HostStat entry) {
+hostStats.add(entry);
+}
+}
+
+class HostStat {
+public final String ip;
+public final String dns;
+public final Float owns;
+
+public HostStat(Map.Entry<InetAddress, Float> ownership) {
+this.ip = ownership.getKey().getHostAddress();
+this.dns = ownership.getKey().getHostName();
+this.owns = ownership.getValue();
+}
+
+public String ipOrDns() {
+if (resolveIp) {
+return dns;
+}
+return ip;
+}
+}
+
+private Map<String, SetHostStat> getOwnershipByDc(SetHostStat 
ownerships)
 throws UnknownHostException
 {
-Map<String, Map<InetAddress, Float>> ownershipByDc = 
Maps.newLinkedHashMap();
+Map<String, SetHostStat> ownershipByDc = Maps.newLinkedHashMap();
 EndpointSnitchInfoMBean epSnitchInfo = 
probe.getEndpointSnitchInfoProxy();

-for (Map.Entry<InetAddress, Float> ownership : 

[4/8] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0

2013-12-30 Thread brandonwilliams
Merge branch 'cassandra-1.2' into cassandra-2.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/fc9709cd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/fc9709cd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/fc9709cd

Branch: refs/heads/cassandra-2.0
Commit: fc9709cdfe6076bf7b6653a1797299c197be8619
Parents: 604e31b 2f63bba
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Dec 30 09:52:31 2013 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Dec 30 09:52:31 2013 -0600

--

--




[6/8] git commit: Allow nodetool to optionally resolve hostnames. Patch by Daneel S. Yaitskov, reviewed by brandonwilliams for CASSANDRA-2238

2013-12-30 Thread brandonwilliams
Allow nodetool to optionally resolve hostnames.
Patch by Daneel S. Yaitskov, reviewed by brandonwilliams for
CASSANDRA-2238


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c5ca8de4
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c5ca8de4
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c5ca8de4

Branch: refs/heads/trunk
Commit: c5ca8de4dfd512e971f9bba100dfcc3709f70786
Parents: fc9709c
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Dec 30 09:53:13 2013 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Dec 30 09:53:13 2013 -0600

--
 .../org/apache/cassandra/tools/NodeCmd.java | 118 +--
 1 file changed, 79 insertions(+), 39 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c5ca8de4/src/java/org/apache/cassandra/tools/NodeCmd.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeCmd.java 
b/src/java/org/apache/cassandra/tools/NodeCmd.java
index f1e7c73..0cc7320 100644
--- a/src/java/org/apache/cassandra/tools/NodeCmd.java
+++ b/src/java/org/apache/cassandra/tools/NodeCmd.java
@@ -73,6 +73,7 @@ public class NodeCmd
 private static final Pair<String, String> UPGRADE_ALL_SSTABLE_OPT = 
Pair.create("a", "include-all-sstables");
 private static final Pair<String, String> NO_SNAPSHOT = Pair.create("ns", 
"no-snapshot");
 private static final Pair<String, String> CFSTATS_IGNORE_OPT = 
Pair.create("i", "ignore");
+private static final Pair<String, String> RESOLVE_IP = Pair.create("r", 
"resolve-ip");

 private static final String DEFAULT_HOST = "127.0.0.1";
 private static final int DEFAULT_PORT = 7199;
@@ -99,6 +100,7 @@ public class NodeCmd
 options.addOption(UPGRADE_ALL_SSTABLE_OPT, false, "includes sstables 
that are already on the most recent version during upgradesstables");
 options.addOption(NO_SNAPSHOT, false, "disables snapshot creation for 
scrub");
 options.addOption(CFSTATS_IGNORE_OPT, false, "ignore the supplied list 
of keyspace.columnfamiles in statistics");
+options.addOption(RESOLVE_IP, false, "show node domain names instead 
of IPs");
 }
 
 public NodeCmd(NodeProbe probe)
@@ -373,11 +375,13 @@ public class NodeCmd
 Map<String, String> loadMap, hostIDMap, tokensToEndpoints;
 EndpointSnitchInfoMBean epSnitchInfo;
 PrintStream outs;
+private final boolean resolveIp;
 
-ClusterStatus(PrintStream outs, String kSpace)
+ClusterStatus(PrintStream outs, String kSpace, boolean resolveIp)
 {
 this.kSpace = kSpace;
 this.outs = outs;
+this.resolveIp = resolveIp;
 joiningNodes = probe.getJoiningNodes();
 leavingNodes = probe.getLeavingNodes();
 movingNodes = probe.getMovingNodes();
@@ -395,18 +399,58 @@ public class NodeCmd
 outs.println(|/ State=Normal/Leaving/Joining/Moving);
 }
 
-private Map<String, Map<InetAddress, Float>> 
getOwnershipByDc(Map<InetAddress, Float> ownerships)
+class SetHostStat implements Iterable<HostStat> {
+final List<HostStat> hostStats = new ArrayList<HostStat>();
+
+public SetHostStat() {}
+
+public SetHostStat(Map<InetAddress, Float> ownerships) {
+for (Map.Entry<InetAddress, Float> entry : 
ownerships.entrySet()) {
+hostStats.add(new HostStat(entry));
+}
+}
+
+@Override
+public Iterator<HostStat> iterator() {
+return hostStats.iterator();
+}
+
+public void add(HostStat entry) {
+hostStats.add(entry);
+}
+}
+
+class HostStat {
+public final String ip;
+public final String dns;
+public final Float owns;
+
+public HostStat(Map.Entry<InetAddress, Float> ownership) {
+this.ip = ownership.getKey().getHostAddress();
+this.dns = ownership.getKey().getHostName();
+this.owns = ownership.getValue();
+}
+
+public String ipOrDns() {
+if (resolveIp) {
+return dns;
+}
+return ip;
+}
+}
+
+private Map<String, SetHostStat> getOwnershipByDc(SetHostStat 
ownerships)
 throws UnknownHostException
 {
-Map<String, Map<InetAddress, Float>> ownershipByDc = 
Maps.newLinkedHashMap();
+Map<String, SetHostStat> ownershipByDc = Maps.newLinkedHashMap();
 EndpointSnitchInfoMBean epSnitchInfo = 
probe.getEndpointSnitchInfoProxy();

-for (Map.Entry<InetAddress, Float> ownership : 

[5/8] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0

2013-12-30 Thread brandonwilliams
Merge branch 'cassandra-1.2' into cassandra-2.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/fc9709cd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/fc9709cd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/fc9709cd

Branch: refs/heads/trunk
Commit: fc9709cdfe6076bf7b6653a1797299c197be8619
Parents: 604e31b 2f63bba
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Dec 30 09:52:31 2013 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Dec 30 09:52:31 2013 -0600

--

--




[2/8] git commit: Allow nodetool to optionally resolve hostnames. Patch by Daneel S. Yaitskov, reviewed by brandonwilliams for CASSANDRA-2238

2013-12-30 Thread brandonwilliams
Allow nodetool to optionally resolve hostnames.
Patch by Daneel S. Yaitskov, reviewed by brandonwilliams for
CASSANDRA-2238


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2f63bbad
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2f63bbad
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2f63bbad

Branch: refs/heads/cassandra-2.0
Commit: 2f63bbadfa2fad66e04bc17edc66ce8be7497157
Parents: 3eef540
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Dec 30 09:51:36 2013 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Dec 30 09:51:36 2013 -0600

--
 .../org/apache/cassandra/tools/NodeCmd.java | 118 +--
 1 file changed, 79 insertions(+), 39 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2f63bbad/src/java/org/apache/cassandra/tools/NodeCmd.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeCmd.java 
b/src/java/org/apache/cassandra/tools/NodeCmd.java
index 87b114c..e14ba1a 100644
--- a/src/java/org/apache/cassandra/tools/NodeCmd.java
+++ b/src/java/org/apache/cassandra/tools/NodeCmd.java
@@ -73,6 +73,7 @@ public class NodeCmd
 private static final Pair<String, String> UPGRADE_ALL_SSTABLE_OPT = 
Pair.create("a", "include-all-sstables");
 private static final Pair<String, String> NO_SNAPSHOT = Pair.create("ns", 
"no-snapshot");
 private static final Pair<String, String> CFSTATS_IGNORE_OPT = 
Pair.create("i", "ignore");
+private static final Pair<String, String> RESOLVE_IP = Pair.create("r", 
"resolve-ip");

 private static final String DEFAULT_HOST = "127.0.0.1";
 private static final int DEFAULT_PORT = 7199;
@@ -98,6 +99,7 @@ public class NodeCmd
 options.addOption(UPGRADE_ALL_SSTABLE_OPT, false, "includes sstables 
that are already on the most recent version during upgradesstables");
 options.addOption(NO_SNAPSHOT, false, "disables snapshot creation for 
scrub");
 options.addOption(CFSTATS_IGNORE_OPT, false, "ignore the supplied list 
of keyspace.columnfamiles in statistics");
+options.addOption(RESOLVE_IP, false, "show node domain names instead 
of IPs");
 }
 
 public NodeCmd(NodeProbe probe)
@@ -374,11 +376,13 @@ public class NodeCmd
 Map<String, String> loadMap, hostIDMap, tokensToEndpoints;
 EndpointSnitchInfoMBean epSnitchInfo;
 PrintStream outs;
+private final boolean resolveIp;
 
-ClusterStatus(PrintStream outs, String kSpace)
+ClusterStatus(PrintStream outs, String kSpace, boolean resolveIp)
 {
 this.kSpace = kSpace;
 this.outs = outs;
+this.resolveIp = resolveIp;
 joiningNodes = probe.getJoiningNodes();
 leavingNodes = probe.getLeavingNodes();
 movingNodes = probe.getMovingNodes();
@@ -396,18 +400,58 @@ public class NodeCmd
 outs.println(|/ State=Normal/Leaving/Joining/Moving);
 }
 
-private Map<String, Map<InetAddress, Float>> 
getOwnershipByDc(Map<InetAddress, Float> ownerships)
+class SetHostStat implements Iterable<HostStat> {
+final List<HostStat> hostStats = new ArrayList<HostStat>();
+
+public SetHostStat() {}
+
+public SetHostStat(Map<InetAddress, Float> ownerships) {
+for (Map.Entry<InetAddress, Float> entry : 
ownerships.entrySet()) {
+hostStats.add(new HostStat(entry));
+}
+}
+
+@Override
+public Iterator<HostStat> iterator() {
+return hostStats.iterator();
+}
+
+public void add(HostStat entry) {
+hostStats.add(entry);
+}
+}
+
+class HostStat {
+public final String ip;
+public final String dns;
+public final Float owns;
+
+public HostStat(Map.Entry<InetAddress, Float> ownership) {
+this.ip = ownership.getKey().getHostAddress();
+this.dns = ownership.getKey().getHostName();
+this.owns = ownership.getValue();
+}
+
+public String ipOrDns() {
+if (resolveIp) {
+return dns;
+}
+return ip;
+}
+}
+
+private Map<String, SetHostStat> getOwnershipByDc(SetHostStat 
ownerships)
 throws UnknownHostException
 {
-Map<String, Map<InetAddress, Float>> ownershipByDc = 
Maps.newLinkedHashMap();
+Map<String, SetHostStat> ownershipByDc = Maps.newLinkedHashMap();
 EndpointSnitchInfoMBean epSnitchInfo = 
probe.getEndpointSnitchInfoProxy();

-for (Map.Entry<InetAddress, Float> 

[7/8] git commit: Allow nodetool to optionally resolve hostnames. Patch by Daneel S. Yaitskov, reviewed by brandonwilliams for CASSANDRA-2238

2013-12-30 Thread brandonwilliams
Allow nodetool to optionally resolve hostnames.
Patch by Daneel S. Yaitskov, reviewed by brandonwilliams for
CASSANDRA-2238


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c5ca8de4
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c5ca8de4
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c5ca8de4

Branch: refs/heads/cassandra-2.0
Commit: c5ca8de4dfd512e971f9bba100dfcc3709f70786
Parents: fc9709c
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Dec 30 09:53:13 2013 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Dec 30 09:53:13 2013 -0600

--
 .../org/apache/cassandra/tools/NodeCmd.java | 118 +--
 1 file changed, 79 insertions(+), 39 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c5ca8de4/src/java/org/apache/cassandra/tools/NodeCmd.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeCmd.java 
b/src/java/org/apache/cassandra/tools/NodeCmd.java
index f1e7c73..0cc7320 100644
--- a/src/java/org/apache/cassandra/tools/NodeCmd.java
+++ b/src/java/org/apache/cassandra/tools/NodeCmd.java
@@ -73,6 +73,7 @@ public class NodeCmd
private static final Pair<String, String> UPGRADE_ALL_SSTABLE_OPT = Pair.create("a", "include-all-sstables");
private static final Pair<String, String> NO_SNAPSHOT = Pair.create("ns", "no-snapshot");
private static final Pair<String, String> CFSTATS_IGNORE_OPT = Pair.create("i", "ignore");
+private static final Pair<String, String> RESOLVE_IP = Pair.create("r", "resolve-ip");
 
private static final String DEFAULT_HOST = "127.0.0.1";
 private static final int DEFAULT_PORT = 7199;
@@ -99,6 +100,7 @@ public class NodeCmd
options.addOption(UPGRADE_ALL_SSTABLE_OPT, false, "includes sstables that are already on the most recent version during upgradesstables");
options.addOption(NO_SNAPSHOT, false, "disables snapshot creation for scrub");
options.addOption(CFSTATS_IGNORE_OPT, false, "ignore the supplied list of keyspace.columnfamiles in statistics");
+options.addOption(RESOLVE_IP, false, "show node domain names instead of IPs");
 }
 
 public NodeCmd(NodeProbe probe)
@@ -373,11 +375,13 @@ public class NodeCmd
Map<String, String> loadMap, hostIDMap, tokensToEndpoints;
 EndpointSnitchInfoMBean epSnitchInfo;
 PrintStream outs;
+private final boolean resolveIp;
 
-ClusterStatus(PrintStream outs, String kSpace)
+ClusterStatus(PrintStream outs, String kSpace, boolean resolveIp)
 {
 this.kSpace = kSpace;
 this.outs = outs;
+this.resolveIp = resolveIp;
 joiningNodes = probe.getJoiningNodes();
 leavingNodes = probe.getLeavingNodes();
 movingNodes = probe.getMovingNodes();
@@ -395,18 +399,58 @@ public class NodeCmd
 outs.println("|/ State=Normal/Leaving/Joining/Moving");
 }
 
-private Map<String, Map<InetAddress, Float>> getOwnershipByDc(Map<InetAddress, Float> ownerships)
+class SetHostStat implements Iterable<HostStat> {
+final List<HostStat> hostStats = new ArrayList<HostStat>();
+
+public SetHostStat() {}
+
+public SetHostStat(Map<InetAddress, Float> ownerships) {
+for (Map.Entry<InetAddress, Float> entry : ownerships.entrySet()) {
+hostStats.add(new HostStat(entry));
+}
+}
+
+@Override
+public Iterator<HostStat> iterator() {
+return hostStats.iterator();
+}
+
+public void add(HostStat entry) {
+hostStats.add(entry);
+}
+}
+
+class HostStat {
+public final String ip;
+public final String dns;
+public final Float owns;
+
+public HostStat(Map.Entry<InetAddress, Float> ownership) {
+this.ip = ownership.getKey().getHostAddress();
+this.dns = ownership.getKey().getHostName();
+this.owns = ownership.getValue();
+}
+
+public String ipOrDns() {
+if (resolveIp) {
+return dns;
+}
+return ip;
+}
+}
+
+private Map<String, SetHostStat> getOwnershipByDc(SetHostStat ownerships)
throws UnknownHostException
{
-Map<String, Map<InetAddress, Float>> ownershipByDc = Maps.newLinkedHashMap();
+Map<String, SetHostStat> ownershipByDc = Maps.newLinkedHashMap();
EndpointSnitchInfoMBean epSnitchInfo = probe.getEndpointSnitchInfoProxy();

-for (Map.Entry<InetAddress, Float>

[1/8] git commit: Allow nodetool to optionally resolve hostnames. Patch by Daneel S. Yaitskov, reviewed by brandonwilliams for CASSANDRA-2238

2013-12-30 Thread brandonwilliams
Updated Branches:
  refs/heads/cassandra-1.2 3eef54097 - 2f63bbadf
  refs/heads/cassandra-2.0 604e31b60 - c5ca8de4d
  refs/heads/trunk 23bc52386 - 76ee9a155


Allow nodetool to optionally resolve hostnames.
Patch by Daneel S. Yaitskov, reviewed by brandonwilliams for
CASSANDRA-2238


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2f63bbad
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2f63bbad
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2f63bbad

Branch: refs/heads/cassandra-1.2
Commit: 2f63bbadfa2fad66e04bc17edc66ce8be7497157
Parents: 3eef540
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Dec 30 09:51:36 2013 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Dec 30 09:51:36 2013 -0600

--
 .../org/apache/cassandra/tools/NodeCmd.java | 118 +--
 1 file changed, 79 insertions(+), 39 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2f63bbad/src/java/org/apache/cassandra/tools/NodeCmd.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeCmd.java 
b/src/java/org/apache/cassandra/tools/NodeCmd.java
index 87b114c..e14ba1a 100644
--- a/src/java/org/apache/cassandra/tools/NodeCmd.java
+++ b/src/java/org/apache/cassandra/tools/NodeCmd.java
@@ -73,6 +73,7 @@ public class NodeCmd
private static final Pair<String, String> UPGRADE_ALL_SSTABLE_OPT = Pair.create("a", "include-all-sstables");
private static final Pair<String, String> NO_SNAPSHOT = Pair.create("ns", "no-snapshot");
private static final Pair<String, String> CFSTATS_IGNORE_OPT = Pair.create("i", "ignore");
+private static final Pair<String, String> RESOLVE_IP = Pair.create("r", "resolve-ip");
 
private static final String DEFAULT_HOST = "127.0.0.1";
 private static final int DEFAULT_PORT = 7199;
@@ -98,6 +99,7 @@ public class NodeCmd
options.addOption(UPGRADE_ALL_SSTABLE_OPT, false, "includes sstables that are already on the most recent version during upgradesstables");
options.addOption(NO_SNAPSHOT, false, "disables snapshot creation for scrub");
options.addOption(CFSTATS_IGNORE_OPT, false, "ignore the supplied list of keyspace.columnfamiles in statistics");
+options.addOption(RESOLVE_IP, false, "show node domain names instead of IPs");
 }
 
 public NodeCmd(NodeProbe probe)
@@ -374,11 +376,13 @@ public class NodeCmd
Map<String, String> loadMap, hostIDMap, tokensToEndpoints;
 EndpointSnitchInfoMBean epSnitchInfo;
 PrintStream outs;
+private final boolean resolveIp;
 
-ClusterStatus(PrintStream outs, String kSpace)
+ClusterStatus(PrintStream outs, String kSpace, boolean resolveIp)
 {
 this.kSpace = kSpace;
 this.outs = outs;
+this.resolveIp = resolveIp;
 joiningNodes = probe.getJoiningNodes();
 leavingNodes = probe.getLeavingNodes();
 movingNodes = probe.getMovingNodes();
@@ -396,18 +400,58 @@ public class NodeCmd
 outs.println("|/ State=Normal/Leaving/Joining/Moving");
 }
 
-private Map<String, Map<InetAddress, Float>> getOwnershipByDc(Map<InetAddress, Float> ownerships)
+class SetHostStat implements Iterable<HostStat> {
+final List<HostStat> hostStats = new ArrayList<HostStat>();
+
+public SetHostStat() {}
+
+public SetHostStat(Map<InetAddress, Float> ownerships) {
+for (Map.Entry<InetAddress, Float> entry : ownerships.entrySet()) {
+hostStats.add(new HostStat(entry));
+}
+}
+
+@Override
+public Iterator<HostStat> iterator() {
+return hostStats.iterator();
+}
+
+public void add(HostStat entry) {
+hostStats.add(entry);
+}
+}
+
+class HostStat {
+public final String ip;
+public final String dns;
+public final Float owns;
+
+public HostStat(Map.Entry<InetAddress, Float> ownership) {
+this.ip = ownership.getKey().getHostAddress();
+this.dns = ownership.getKey().getHostName();
+this.owns = ownership.getValue();
+}
+
+public String ipOrDns() {
+if (resolveIp) {
+return dns;
+}
+return ip;
+}
+}
+
+private Map<String, SetHostStat> getOwnershipByDc(SetHostStat ownerships)
throws UnknownHostException
{
-Map<String, Map<InetAddress, Float>> ownershipByDc = Maps.newLinkedHashMap();
+Map<String, SetHostStat> ownershipByDc =

[8/8] git commit: Merge branch 'cassandra-2.0' into trunk

2013-12-30 Thread brandonwilliams
Merge branch 'cassandra-2.0' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/76ee9a15
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/76ee9a15
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/76ee9a15

Branch: refs/heads/trunk
Commit: 76ee9a155ea14304595bc2e9755accbfded04e62
Parents: 23bc523 c5ca8de
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Dec 30 09:53:51 2013 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Dec 30 09:53:51 2013 -0600

--
 .../org/apache/cassandra/tools/NodeCmd.java | 118 +--
 1 file changed, 79 insertions(+), 39 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/76ee9a15/src/java/org/apache/cassandra/tools/NodeCmd.java
--



[jira] [Resolved] (CASSANDRA-2238) Allow nodetool to print out hostnames given an option

2013-12-30 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams resolved CASSANDRA-2238.
-

   Resolution: Fixed
Fix Version/s: 2.0.5
   1.2.14
 Assignee: Daneel S. Yaitskov

Committed, thanks!

 Allow nodetool to print out hostnames given an option
 -

 Key: CASSANDRA-2238
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2238
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Joaquin Casares
Assignee: Daneel S. Yaitskov
Priority: Trivial
 Fix For: 1.2.14, 2.0.5


 Give nodetool the option of either displaying IPs or hostnames for the nodes 
 in a ring.
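For illustration, the behavior the committed patch introduces in HostStat.ipOrDns() boils down to the following choice (a minimal standalone sketch; the class name ResolveDemo and the demo() helper are hypothetical, not part of the patch):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Sketch of the -r/--resolve-ip choice: with the option, nodetool shows
// the reverse-resolved host name; without it, the raw IP address.
public class ResolveDemo {
    static String ipOrDns(InetAddress addr, boolean resolveIp) {
        // getHostName() falls back to the literal address when no
        // reverse DNS entry exists, so this is always safe to call
        return resolveIp ? addr.getHostName() : addr.getHostAddress();
    }

    static String demo(boolean resolveIp) {
        try {
            return ipOrDns(InetAddress.getByName("127.0.0.1"), resolveIp);
        } catch (UnknownHostException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(false)); // 127.0.0.1
        System.out.println(demo(true));  // typically "localhost", depending on /etc/hosts
    }
}
```

Reverse resolution depends on local DNS/hosts configuration, which is one reason the patch keeps it behind an option instead of making it the default.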



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5396) Repair process is a joke leading to a downward spiralling and eventually unusable cluster

2013-12-30 Thread Donald Smith (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13858966#comment-13858966
 ] 

Donald Smith commented on CASSANDRA-5396:
-

We ran nodetool repair -pr on one node of a three node cluster running on 
production-quality hardware, each node with about 1TB of data. It was using 
cassandra version 2.0.3. After 5 days it was still running and had apparently 
frozen.  See https://issues.apache.org/jira/browse/CASSANDRA-5220 (Dec 23 
comment by Donald Smith) for more detail.  We tried running repair on our 
smallest column family (with 12G of data), and it took 31 hours to complete.
We're not yet in production but we plan on not running repair, since we do very 
few deletes or updates and since we don't trust it.

 Repair process is a joke leading to a downward spiralling and eventually 
 unusable cluster
 -

 Key: CASSANDRA-5396
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5396
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.3
 Environment: all
Reporter: David Berkman
Priority: Critical

 Let's review the repair process...
 1) It's mandatory to run repair.
 2) Repair has a high impact and can take hours.
 3) Repair provides no estimation of completion time and no progress indicator.
 4) Repair is extremely fragile, and can fail to complete, or become stuck 
 quite easily in real operating environments.
 5) When repair fails it provides no feedback whatsoever of the problem or 
 possible resolution.
 6) A failed repair operation saddles the affected nodes with a huge amount of 
 extra data (judging from node size).
 7) There is no way to rid the node of the extra data associated with a failed 
 repair short of completely rebuilding the node.
 8) The extra data from a failed repair makes any subsequent repair take 
 longer and increases the likelihood that it will simply become stuck or fail, 
 leading to yet more node corruption.
 9) Eventually no repair operation will complete successfully, and node 
 operations will eventually become impacted leading to a failing cluster.
 Who would design such a system for a service meant to operate as a fault 
 tolerant clustered data store operating on a lot of commodity hardware?
 Solution...
 1) Repair must be robust.
 2) Repair must *never* become 'stuck'.
 3) Failure to complete must result in reasonable feedback.
 4) Failure to complete must not result in a node whose state is worse than 
 before the operation began.
 5) Repair must provide some means of determining completion percentage.
 6) It would be nice if repair could estimate its run time, even if it could 
 do so only based upon previous runs.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Comment Edited] (CASSANDRA-5396) Repair process is a joke leading to a downward spiralling and eventually unusable cluster

2013-12-30 Thread Donald Smith (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13858966#comment-13858966
 ] 

Donald Smith edited comment on CASSANDRA-5396 at 12/30/13 6:14 PM:
---

We ran nodetool repair -pr on one node of a three node cluster running on 
production-quality hardware, each node with about 1TB of data. It was using 
cassandra version 2.0.3. After 5 days it was still running and had apparently 
frozen.  See https://issues.apache.org/jira/browse/CASSANDRA-5220 (Dec 23 
comment by Donald Smith) for more detail.  We tried running repair on our 
smallest column family (with 12G of data), and it took 31 hours to complete.
We're not yet in production but we plan on not running repair, since we do very 
few deletes or updates and since we don't trust it. Also, our data isn't 
critical.


was (Author: thinkerfeeler):
We ran nodetool repair -pr on one node of a three node cluster running on 
production-quality hardware, each node with about 1TB of data. It was using 
cassandra version 2.0.3. After 5 days it was still running and had apparently 
frozen.  See https://issues.apache.org/jira/browse/CASSANDRA-5220 (Dec 23 
comment by Donald Smith) for more detail.  We tried running repair on our 
smallest column family (with 12G of data), and it took 31 hours to complete.
We're not yet in production but we plan on not running repair, since we do very 
few deletes or updates and since we don't trust it.

 Repair process is a joke leading to a downward spiralling and eventually 
 unusable cluster
 -

 Key: CASSANDRA-5396
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5396
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.3
 Environment: all
Reporter: David Berkman
Priority: Critical




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6271) Replace SnapTree in AtomicSortedColumns

2013-12-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13858973#comment-13858973
 ] 

Jonathan Ellis commented on CASSANDRA-6271:
---

Why doesn't ML.update need to call ascendToRoot before toNode?

 Replace SnapTree in AtomicSortedColumns
 ---

 Key: CASSANDRA-6271
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6271
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Attachments: oprate.svg


 On the write path a huge percentage of time is spent in GC (50% in my tests, 
 if accounting for slow down due to parallel marking). SnapTrees are both GC 
 unfriendly due to their structure and also very expensive to keep around - 
 each column name in AtomicSortedColumns uses > 100 bytes on average 
 (excluding the actual ByteBuffer).
 I suggest using a sorted array; changes are supplied at-once, as opposed to 
 one at a time, and if < 10% of the keys in the array change (and data equal 
 to < 10% of the size of the key array) we simply overlay a new array of 
 changes only over the top. Otherwise we rewrite the array. This method should 
 ensure much less GC overhead, and also save approximately 80% of the current 
 memory overhead.
 TreeMap is a similarly difficult object for the GC, and a related task might be 
 to remove it where not strictly necessary, even though we don't keep them 
 hanging around for long. TreeMapBackedSortedColumns, for instance, seems to 
 be used in a lot of places where we could simply sort the columns.
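The batch-at-once strategy described above can be sketched as a plain sorted-array merge (a hypothetical illustration, not the attached patch; a real version would replace entries on equal keys rather than keep both, and would switch to an overlay when the batch is small):

```java
import java.util.Arrays;

// Sketch: apply a sorted batch of changes to a sorted base array in one
// pass, producing a fresh array ("rewrite" case of the proposal above).
public class SortedMerge {
    static int[] merge(int[] base, int[] changes) {
        int[] out = new int[base.length + changes.length];
        int i = 0, j = 0, k = 0;
        // standard two-pointer merge of two sorted inputs
        while (i < base.length && j < changes.length)
            out[k++] = base[i] <= changes[j] ? base[i++] : changes[j++];
        while (i < base.length) out[k++] = base[i++];
        while (j < changes.length) out[k++] = changes[j++];
        return Arrays.copyOf(out, k);
    }

    public static void main(String[] args) {
        int[] merged = merge(new int[]{1, 3, 5}, new int[]{2, 4});
        System.out.println(Arrays.toString(merged)); // [1, 2, 3, 4, 5]
    }
}
```

The GC win comes from allocating one contiguous array per batch instead of a node object (plus box) per column, which is where the per-name overhead quoted above originates.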



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6271) Replace SnapTree in AtomicSortedColumns

2013-12-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13858975#comment-13858975
 ] 

Jonathan Ellis commented on CASSANDRA-6271:
---

Also, what's going on here?

{code}
// finish copying any remaining keys
while (true)
{
ModifierLevel next = current.update(POSITIVE_INFINITY, comparator, replaceF);
if (next == null)
break;
current = next;
}
{code}

The comment doesn't make much sense, since we already looped over all the keys 
in the source Collection.

 Replace SnapTree in AtomicSortedColumns
 ---

 Key: CASSANDRA-6271
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6271
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Attachments: oprate.svg





--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6271) Replace SnapTree in AtomicSortedColumns

2013-12-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13858987#comment-13858987
 ] 

Jonathan Ellis commented on CASSANDRA-6271:
---

{code}
// true iff this node (or a child) should contain the key
boolean owns = true;
{code}

How do we get into a situation where update is called on a sub-tree that 
shouldn't contain the key?

 Replace SnapTree in AtomicSortedColumns
 ---

 Key: CASSANDRA-6271
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6271
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Attachments: oprate.svg





--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6271) Replace SnapTree in AtomicSortedColumns

2013-12-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13858989#comment-13858989
 ] 

Jonathan Ellis commented on CASSANDRA-6271:
---

{code}
// buffer for building new nodes
Object[] buildKeys = new Object[1 + (FAN_FACTOR << 1)];  // buffers keys for branches and leaves
Object[] buildChildren = new Object[2 + (FAN_FACTOR << 1)]; // buffers children for branches only
{code}

Where do these sizes come from?  I would expect keys to be FF*2 and children to 
be 1+FF*2.

 Replace SnapTree in AtomicSortedColumns
 ---

 Key: CASSANDRA-6271
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6271
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Attachments: oprate.svg





--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6530) Fix logback configuration in scripts and debian packaging for trunk/2.1

2013-12-30 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13858994#comment-13858994
 ] 

Brandon Williams commented on CASSANDRA-6530:
-

Looks like we need to suppress the HSHA disconnect warnings like we did in 
log4j:

{noformat}
WARN  19:01:51 Got an IOException in internalRead!
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.7.0_17]
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) 
~[na:1.7.0_17]
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:225) 
~[na:1.7.0_17]
at sun.nio.ch.IOUtil.read(IOUtil.java:198) ~[na:1.7.0_17]
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:359) 
~[na:1.7.0_17]
at 
org.apache.thrift.transport.TNonblockingSocket.read(TNonblockingSocket.java:141)
 ~[libthrift-0.9.1.jar:0.9.1]
at com.thinkaurelius.thrift.util.mem.Buffer.readFrom(Buffer.java:96) 
~[thrift-server-0.3.2.jar:na]
at com.thinkaurelius.thrift.Message.internalRead(Message.java:338) 
[thrift-server-0.3.2.jar:na]
at com.thinkaurelius.thrift.Message.read(Message.java:141) 
[thrift-server-0.3.2.jar:na]
at 
com.thinkaurelius.thrift.TDisruptorServer$SelectorThread.handleRead(TDisruptorServer.java:521)
 [thrift-server-0.3.2.jar:na]
at 
com.thinkaurelius.thrift.TDisruptorServer$SelectorThread.processKey(TDisruptorServer.java:500)
 [thrift-server-0.3.2.jar:na]
at 
com.thinkaurelius.thrift.TDisruptorServer$AbstractSelectorThread.select(TDisruptorServer.java:375)
 [thrift-server-0.3.2.jar:na]
at 
com.thinkaurelius.thrift.TDisruptorServer$AbstractSelectorThread.run(TDisruptorServer.java:339)
 [thrift-server-0.3.2.jar:na]
{noformat}

 Fix logback configuration in scripts and debian packaging for trunk/2.1
 ---

 Key: CASSANDRA-6530
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6530
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Michael Shuler
Assignee: Michael Shuler
Priority: Minor
 Fix For: 2.1

 Attachments: logback_configurations_final.patch






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6271) Replace SnapTree in AtomicSortedColumns

2013-12-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13859013#comment-13859013
 ] 

Jonathan Ellis commented on CASSANDRA-6271:
---

{noformat}
$ ant long-test -Dtest.name=BTreeTest

[junit] Testsuite: org.apache.cassandra.utils.BTreeTest
[junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
0.036 sec
[junit] 
[junit] Testcase: 
initializationError(org.apache.cassandra.utils.BTreeTest):Caused an 
ERROR
[junit] No runnable methods
[junit] java.lang.Exception: No runnable methods
[junit] at 
java.lang.reflect.Constructor.newInstance(Constructor.java:525)
{noformat}

 Replace SnapTree in AtomicSortedColumns
 ---

 Key: CASSANDRA-6271
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6271
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Attachments: oprate.svg





--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6271) Replace SnapTree in AtomicSortedColumns

2013-12-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13859020#comment-13859020
 ] 

Jonathan Ellis commented on CASSANDRA-6271:
---

{code}
int c = this.compareTo(findLast, forwards);
if (forwards ? c < 0 : c > 0)
{
endNode = stack[depth];
endIndex = index[depth];
}
else
{
endNode = findLast.stack[findLast.depth];
endIndex = findLast.index[findLast.depth];
}
{code}

This looks like we're second-guessing lower/upper bounds.  Would prefer to 
assert that lower is actually <= upper.

 Replace SnapTree in AtomicSortedColumns
 ---

 Key: CASSANDRA-6271
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6271
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Attachments: oprate.svg





--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6086) Node refuses to start with exception in ColumnFamilyStore.removeUnfinishedCompactionLeftovers when find that some to be removed files are already removed

2013-12-30 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-6086:
--

Fix Version/s: (was: 2.0.4)
   2.0.5

 Node refuses to start with exception in 
 ColumnFamilyStore.removeUnfinishedCompactionLeftovers when find that some to 
 be removed files are already removed
 -

 Key: CASSANDRA-6086
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6086
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Oleg Anastasyev
Assignee: Yuki Morishita
 Fix For: 2.0.5

 Attachments: 6086-2.0-v3.txt, 6086-v2.txt, 
 removeUnfinishedCompactionLeftovers.txt


 Node refuses to start with
 {code}
 Caused by: java.lang.IllegalStateException: Unfinished compactions reference 
 missing sstables. This should never happen since compactions are marked 
 finished before we start removing the old sstables.
   at 
 org.apache.cassandra.db.ColumnFamilyStore.removeUnfinishedCompactionLeftovers(ColumnFamilyStore.java:544)
   at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:262)
 {code}
 IMO, there is no reason to refuse to start upon discovering that files which 
 must be removed are already removed. It looks like purely diagnostic code and 
 means nothing to the operator (nor can they do anything about it).
 Replaced the thrown exception with a diagnostic warning and continued 
 startup.
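The change requested here (and committed later in this thread) amounts to downgrading the sanity check from a fatal exception to a log message. A minimal sketch of that control flow follows; the class, method, and variable names are simplified placeholders, not the real ColumnFamilyStore code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Sketch of the fix: instead of refusing to start when unfinished
// compactions reference sstables that are already gone, report the
// missing generations and let startup continue.
public class LeftoverCheck {
    // Returns the generations referenced by unfinished compactions whose
    // sstables no longer exist on disk; warns instead of throwing.
    static Set<Integer> missingGenerations(Set<Integer> allGenerations,
                                           Set<Integer> unfinishedGenerations) {
        Set<Integer> missing = new HashSet<>(unfinishedGenerations);
        missing.removeAll(allGenerations);
        if (!missing.isEmpty())
            System.out.println("WARN: unfinished compactions reference missing sstable generations " + missing);
        return missing; // startup continues either way
    }

    public static void main(String[] args) {
        Set<Integer> all = new HashSet<>(Arrays.asList(1, 2, 3));
        Set<Integer> unfinished = new HashSet<>(Arrays.asList(2, 5));
        Set<Integer> missing = missingGenerations(all, unfinished);
        if (!missing.equals(new HashSet<>(Arrays.asList(5)))) throw new AssertionError();
    }
}
```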



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-5633) CQL support for updating multiple rows in a partition using CAS

2013-12-30 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-5633:
--

Fix Version/s: (was: 2.0.4)
   2.0.5

 CQL support for updating multiple rows in a partition using CAS
 ---

 Key: CASSANDRA-5633
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5633
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 2.0 beta 1
Reporter: sankalp kohli
Assignee: Sylvain Lebresne
Priority: Minor
  Labels: cql3
 Fix For: 2.0.5


 This is currently supported via Thrift but not via CQL. 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6373) describe_ring hangs with hsha thrift server

2013-12-30 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-6373:
--

Fix Version/s: (was: 2.0.4)
   2.0.5

 describe_ring hangs with hsha thrift server
 ---

 Key: CASSANDRA-6373
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6373
 Project: Cassandra
  Issue Type: Bug
Reporter: Nick Bailey
Assignee: Pavel Yaskevich
 Fix For: 2.0.5

 Attachments: describe_ring_failure.patch, jstack.txt, jstack2.txt


 There is a strange bug with the thrift hsha server in 2.0 (we switched to 
 lmax disruptor server).
 The bug is that the first call to describe_ring from one connection will hang 
 indefinitely when the client is not connecting from localhost (or at least it 
 looks like the client is not on the same host). Additionally, the cluster must 
 be using vnodes. When connecting from localhost the first call will work as 
 expected. And in either case subsequent calls from the same connection will 
 work as expected. According to git bisect the bad commit is the switch to the 
 lmax disruptor server:
 https://github.com/apache/cassandra/commit/98eec0a223251ecd8fec7ecc9e46b05497d631c6
 I've attached the patch I used to reproduce the error in the unit tests. The 
 command to reproduce is: 
 {noformat}
 PYTHONPATH=test nosetests 
 --tests=system.test_thrift_server:TestMutations.test_describe_ring
 {noformat}
 I reproduced on ec2 and a single machine by having the server bind to the 
 private ip on ec2 and the client connect to the public ip (so it appears as 
 if the client is non local). I've also reproduced with two different vms 
 though.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6210) Repair hangs when a new datacenter is added to a cluster

2013-12-30 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-6210:
--

Fix Version/s: (was: 2.0.4)
   2.0.5

 Repair hangs when a new datacenter is added to a cluster
 

 Key: CASSANDRA-6210
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6210
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Amazon Ec2
 2 M1.large nodes
Reporter: Russell Alexander Spitzer
Assignee: Yuki Morishita
 Fix For: 2.0.5

 Attachments: RepairLogs.tar.gz


 Attempting to add a new datacenter to a cluster seems to cause repair 
 operations to break. I've been reproducing this with ~20-node clusters but 
 can get it to occur reliably on 2-node setups.
 {code}
 ##Basic Steps to reproduce
 #Node 1 is started using GossipingPropertyFileSnitch as dc1
 #Cassandra-stress is used to insert a minimal amount of data
 $CASSANDRA_STRESS -t 100 -R 
 org.apache.cassandra.locator.NetworkTopologyStrategy  --num-keys=1000 
 --columns=10 --consistency-level=LOCAL_QUORUM --average-size-values -
 -compaction-strategy='LeveledCompactionStrategy' -O dc1:1 
 --operation=COUNTER_ADD
 #Alter Keyspace1
 ALTER KEYSPACE Keyspace1 WITH replication = {'class': 
 'NetworkTopologyStrategy', 'dc1': 1 , 'dc2': 1 };
 #Add node 2 using GossipingPropertyFileSnitch as dc2
 run repair on node 1
 run repair on node 2
 {code}
 The repair task on node 1 never completes and while there are no exceptions 
 in the logs of node1, netstat reports the following repair tasks
 {code}
 Mode: NORMAL
 Repair 4e71a250-36b4-11e3-bedc-1d1bb5c9abab
 Repair 6c64ded0-36b4-11e3-bedc-1d1bb5c9abab
 Read Repair Statistics:
 Attempted: 0
 Mismatch (Blocking): 0
 Mismatch (Background): 0
 Pool NameActive   Pending  Completed
 Commandsn/a 0  10239
 Responses   n/a 0   3839
 {code}
 Checking on node 2 we see the following exceptions
 {code}
 ERROR [STREAM-IN-/10.171.122.130] 2013-10-16 22:42:58,961 StreamSession.java 
 (line 410) [Stream #4e71a250-36b4-11e3-bedc-1d1bb5c9abab] Streaming error 
 occurred
 java.lang.NullPointerException
 at 
 org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:174)
 at 
 org.apache.cassandra.streaming.StreamSession.prepare(StreamSession.java:436)
 at 
 org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:358)
 at 
 org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:293)
 at java.lang.Thread.run(Thread.java:724)
 ...
 ERROR [STREAM-IN-/10.171.122.130] 2013-10-16 22:43:49,214 StreamSession.java 
 (line 410) [Stream #6c64ded0-36b4-11e3-bedc-1d1bb5c9abab] Streaming error 
 occurred
 java.lang.NullPointerException
 at 
 org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:174)
 at 
 org.apache.cassandra.streaming.StreamSession.prepare(StreamSession.java:436)
 at 
 org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:358)
 at 
 org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:293)
 at java.lang.Thread.run(Thread.java:724)
 {code}
 Netstats on node 2 reports
 {code}
 automaton@ip-10-171-15-234:~$ nodetool netstats
 Mode: NORMAL
 Repair 4e71a250-36b4-11e3-bedc-1d1bb5c9abab
 Read Repair Statistics:
 Attempted: 0
 Mismatch (Blocking): 0
 Mismatch (Background): 0
 Pool NameActive   Pending  Completed
 Commandsn/a 0   2562
 Responses   n/a 0   4284
 {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6271) Replace SnapTree in AtomicSortedColumns

2013-12-30 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13859026#comment-13859026
 ] 

Benedict commented on CASSANDRA-6271:
-

bq.   How do we get into a situation where update is called on a sub-tree that 
shouldn't contain the key?

When we're in the middle of the tree and the next item is further up / right

bq.  The comment doesn't make much sense, since we already looped over all the 
keys in the source Collection.

From the tree we're copying from

bq. Why doesn't ML.update need to call ascendToRoot before toNode?
The update(+inf) results in ascending to the root

 Replace SnapTree in AtomicSortedColumns
 ---

 Key: CASSANDRA-6271
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6271
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Attachments: oprate.svg


 On the write path a huge percentage of time is spent in GC (50% in my tests, 
 if accounting for slow down due to parallel marking). SnapTrees are both GC 
 unfriendly due to their structure and also very expensive to keep around - 
 each column name in AtomicSortedColumns uses > 100 bytes on average 
 (excluding the actual ByteBuffer).
 I suggest using a sorted array; changes are supplied at-once, as opposed to 
 one at a time, and if < 10% of the keys in the array change (and data equal 
 to < 10% of the size of the key array) we simply overlay a new array of 
 changes only over the top. Otherwise we rewrite the array. This method should 
 ensure much less GC overhead, and also save approximately 80% of the current 
 memory overhead.
 TreeMap is a similarly difficult object for the GC, and a related task might be 
 to remove it where not strictly necessary, even though we don't keep them 
 hanging around for long. TreeMapBackedSortedColumns, for instance, seems to 
 be used in a lot of places where we could simply sort the columns.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination

2013-12-30 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-6311:
--

Fix Version/s: (was: 2.0.4)
   2.0.5

 Add CqlRecordReader to take advantage of native CQL pagination
 --

 Key: CASSANDRA-6311
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6311
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Alex Liu
Assignee: Alex Liu
 Fix For: 2.0.5

 Attachments: 6311-v3-2.0-branch.txt, 6311-v4.txt, 
 6311-v5-2.0-branch.txt, 6331-2.0-branch.txt, 6331-v2-2.0-branch.txt


 Since the latest CQL pagination is done and should be more efficient, we 
 need to update CqlPagingRecordReader to use it instead of the custom Thrift 
 paging.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6501) Cannot run pig examples on current 2.0 branch

2013-12-30 Thread Alex Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13859029#comment-13859029
 ] 

Alex Liu commented on CASSANDRA-6501:
-

I manually added CASSANDRA-6309 to the 2.0.3 release and tested it on Pig 
0.10.0; it worked fine. It looks like the cassandra-2.0 branch has some issues.

 Cannot run pig examples on current 2.0 branch
 -

 Key: CASSANDRA-6501
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6501
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Reporter: Jeremy Hanna
Assignee: Alex Liu
  Labels: pig

 I checked out the cassandra-2.0 branch to try the pig examples because the 
 2.0.3 release has the CASSANDRA-6309 problem which is fixed on the branch.  I 
 tried to run both the cql and the CassandraStorage examples in local mode 
 with pig 0.10.1, 0.11.1, and 0.12.0 and all of them give the following error 
 and stack trace:
 {quote}
 ERROR 2998: Unhandled internal error. readLength_
 java.lang.NoSuchFieldError: readLength_
   at 
 org.apache.cassandra.thrift.TBinaryProtocol$Factory.getProtocol(TBinaryProtocol.java:57)
   at org.apache.thrift.TSerializer.&lt;init&gt;(TSerializer.java:66)
   at 
 org.apache.cassandra.hadoop.pig.AbstractCassandraStorage.cfdefToString(AbstractCassandraStorage.java:508)
   at 
 org.apache.cassandra.hadoop.pig.AbstractCassandraStorage.initSchema(AbstractCassandraStorage.java:470)
   at 
 org.apache.cassandra.hadoop.pig.CassandraStorage.setLocation(CassandraStorage.java:318)
   at 
 org.apache.cassandra.hadoop.pig.CassandraStorage.getSchema(CassandraStorage.java:357)
   at 
 org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:151)
   at 
 org.apache.pig.newplan.logical.relational.LOLoad.getSchema(LOLoad.java:110)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.alias_col_ref(LogicalPlanGenerator.java:15356)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.col_ref(LogicalPlanGenerator.java:15203)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.projectable_expr(LogicalPlanGenerator.java:8881)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.var_expr(LogicalPlanGenerator.java:8632)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.expr(LogicalPlanGenerator.java:7984)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.flatten_generated_item(LogicalPlanGenerator.java:5962)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.generate_clause(LogicalPlanGenerator.java:14101)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.foreach_plan(LogicalPlanGenerator.java:12493)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.foreach_clause(LogicalPlanGenerator.java:12360)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1577)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:789)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:507)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:382)
   at 
 org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:175)
   at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1589)
   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1540)
   at org.apache.pig.PigServer.registerQuery(PigServer.java:540)
   at 
 org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:970)
   at 
 org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
   at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189)
   at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
   at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
   at org.apache.pig.Main.run(Main.java:555)
   at org.apache.pig.Main.main(Main.java:111)
 
 {quote}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Git Push Summary

2013-12-30 Thread eevans
Updated Tags:  refs/tags/2.0.4-tentative [deleted] d56f8f293


Git Push Summary

2013-12-30 Thread eevans
Updated Tags:  refs/tags/cassandra-2.0.4 [created] 68a768882


[2/3] git commit: Delete unfinished compaction sstables incrementally

2013-12-30 Thread yukim
Delete unfinished compaction sstables incrementally

patch by thobbs; reviewed by yukim for CASSANDRA-6086


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4ed22340
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4ed22340
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4ed22340

Branch: refs/heads/trunk
Commit: 4ed2234078c4d302c256332252a8ddd6ae345484
Parents: c5ca8de
Author: Yuki Morishita yu...@apache.org
Authored: Mon Dec 30 13:48:09 2013 -0600
Committer: Yuki Morishita yu...@apache.org
Committed: Mon Dec 30 13:48:09 2013 -0600

--
 CHANGES.txt |  1 +
 .../apache/cassandra/db/ColumnFamilyStore.java  | 39 +++
 .../org/apache/cassandra/db/SystemKeyspace.java | 23 --
 .../db/compaction/LeveledManifest.java  | 14 
 .../cassandra/service/CassandraDaemon.java  | 43 +---
 .../cassandra/db/ColumnFamilyStoreTest.java | 74 ++--
 .../db/compaction/CompactionsTest.java  |  4 +-
 7 files changed, 151 insertions(+), 47 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/4ed22340/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 0396006..958369a 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.5
+* Delete unfinished compaction incrementally (CASSANDRA-6086)
 Merged from 1.2:
  * fsync compression metadata (CASSANDRA-6531)
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4ed22340/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java 
b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index cbd9d2e..4d7d6f2 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -483,27 +483,30 @@ public class ColumnFamilyStore implements 
ColumnFamilyStoreMBean
  * compactions, we remove the new ones (since those may be incomplete -- 
under LCS, we may create multiple
  * sstables from any given ancestor).
  */
-public static void removeUnfinishedCompactionLeftovers(String keyspace, 
String columnfamily, Set<Integer> unfinishedGenerations)
+public static void removeUnfinishedCompactionLeftovers(String keyspace, 
String columnfamily, Map<Integer, UUID> unfinishedCompactions)
 {
 Directories directories = Directories.create(keyspace, columnfamily);
 
-// sanity-check unfinishedGenerations
-Set<Integer> allGenerations = new HashSet<Integer>();
+Set<Integer> allGenerations = new HashSet<>();
 for (Descriptor desc : directories.sstableLister().list().keySet())
 allGenerations.add(desc.generation);
+
+// sanity-check unfinishedCompactions
+Set<Integer> unfinishedGenerations = unfinishedCompactions.keySet();
 if (!allGenerations.containsAll(unfinishedGenerations))
 {
-throw new IllegalStateException("Unfinished compactions reference 
missing sstables."
-+ " This should never happen since 
compactions are marked finished before we start removing the old sstables.");
+HashSet<Integer> missingGenerations = new 
HashSet<>(unfinishedGenerations);
+missingGenerations.removeAll(allGenerations);
+logger.debug("Unfinished compactions of {}.{} reference missing 
sstables of generations {}",
+ keyspace, columnfamily, missingGenerations);
 }
 
 // remove new sstables from compactions that didn't complete, and 
compute
 // set of ancestors that shouldn't exist anymore
-Set<Integer> completedAncestors = new HashSet<Integer>();
+Set<Integer> completedAncestors = new HashSet<>();
 for (Map.Entry<Descriptor, Set<Component>> sstableFiles : 
directories.sstableLister().list().entrySet())
 {
 Descriptor desc = sstableFiles.getKey();
-Set<Component> components = sstableFiles.getValue();
 
 Set<Integer> ancestors;
 try
@@ -515,9 +518,16 @@ public class ColumnFamilyStore implements 
ColumnFamilyStoreMBean
 throw new FSReadError(e, desc.filenameFor(Component.STATS));
 }
 
-if (!ancestors.isEmpty() && 
unfinishedGenerations.containsAll(ancestors))
+if (!ancestors.isEmpty()
+ && unfinishedGenerations.containsAll(ancestors)
+ && allGenerations.containsAll(ancestors))
 {
-SSTable.delete(desc, components);
+// any of the ancestors would work, so we'll just lookup the 
compaction task ID with the 

[jira] [Commented] (CASSANDRA-6271) Replace SnapTree in AtomicSortedColumns

2013-12-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13859037#comment-13859037
 ] 

Jonathan Ellis commented on CASSANDRA-6271:
---

Having consumeNextLeaf sometimes consume one node per call and sometimes two is 
seriously confusing.  Should fix that, I doubt this pseudo-loop-unroll saves 
much.

 Replace SnapTree in AtomicSortedColumns
 ---

 Key: CASSANDRA-6271
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6271
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Attachments: oprate.svg


 On the write path a huge percentage of time is spent in GC (50% in my tests, 
 if accounting for slow down due to parallel marking). SnapTrees are both GC 
 unfriendly due to their structure and also very expensive to keep around - 
 each column name in AtomicSortedColumns uses > 100 bytes on average 
 (excluding the actual ByteBuffer).
 I suggest using a sorted array; changes are supplied at-once, as opposed to 
 one at a time, and if < 10% of the keys in the array change (and data equal 
 to < 10% of the size of the key array) we simply overlay a new array of 
 changes only over the top. Otherwise we rewrite the array. This method should 
 ensure much less GC overhead, and also save approximately 80% of the current 
 memory overhead.
 TreeMap is a similarly difficult object for the GC, and a related task might be 
 to remove it where not strictly necessary, even though we don't keep them 
 hanging around for long. TreeMapBackedSortedColumns, for instance, seems to 
 be used in a lot of places where we could simply sort the columns.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[3/3] git commit: Merge branch 'cassandra-2.0' into trunk

2013-12-30 Thread yukim
Merge branch 'cassandra-2.0' into trunk

Conflicts:
src/java/org/apache/cassandra/db/ColumnFamilyStore.java
src/java/org/apache/cassandra/service/CassandraDaemon.java
test/unit/org/apache/cassandra/db/ColumnFamilyStoreTest.java


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/49efc13c
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/49efc13c
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/49efc13c

Branch: refs/heads/trunk
Commit: 49efc13cd530735ad802769e7f5322f3c79085ef
Parents: 76ee9a1 4ed2234
Author: Yuki Morishita yu...@apache.org
Authored: Mon Dec 30 14:03:54 2013 -0600
Committer: Yuki Morishita yu...@apache.org
Committed: Mon Dec 30 14:03:54 2013 -0600

--
 CHANGES.txt |  1 +
 .../apache/cassandra/db/ColumnFamilyStore.java  | 35 ++---
 .../org/apache/cassandra/db/SystemKeyspace.java | 23 --
 .../cassandra/service/CassandraDaemon.java  | 16 ++---
 .../cassandra/db/ColumnFamilyStoreTest.java | 74 ++--
 .../db/compaction/CompactionsTest.java  |  4 +-
 6 files changed, 124 insertions(+), 29 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/49efc13c/CHANGES.txt
--
diff --cc CHANGES.txt
index 0974925,958369a..7af391e
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,27 -1,5 +1,28 @@@
 +2.1
 + * Multithreaded commitlog (CASSANDRA-3578)
 + * allocate fixed index summary memory pool and resample cold index summaries 
 +   to use less memory (CASSANDRA-5519)
 + * Removed multithreaded compaction (CASSANDRA-6142)
 + * Parallelize fetching rows for low-cardinality indexes (CASSANDRA-1337)
 + * change logging from log4j to logback (CASSANDRA-5883)
 + * switch to LZ4 compression for internode communication (CASSANDRA-5887)
 + * Stop using Thrift-generated Index* classes internally (CASSANDRA-5971)
 + * Remove 1.2 network compatibility code (CASSANDRA-5960)
 + * Remove leveled json manifest migration code (CASSANDRA-5996)
 + * Remove CFDefinition (CASSANDRA-6253)
 + * Use AtomicIntegerFieldUpdater in RefCountedMemory (CASSANDRA-6278)
 + * User-defined types for CQL3 (CASSANDRA-5590)
 + * Use of o.a.c.metrics in nodetool (CASSANDRA-5871, 6406)
 + * Batch read from OTC's queue and cleanup (CASSANDRA-1632)
 + * Secondary index support for collections (CASSANDRA-4511)
 + * SSTable metadata(Stats.db) format change (CASSANDRA-6356)
 + * Push composites support in the storage engine (CASSANDRA-5417)
 + * Add snapshot space used to cfstats (CASSANDRA-6231)
 + * Add cardinality estimator for key count estimation (CASSANDRA-5906)
 +
 +
  2.0.5
+ * Delete unfinished compaction incrementally (CASSANDRA-6086)
  Merged from 1.2:
   * fsync compression metadata (CASSANDRA-6531)
  

http://git-wip-us.apache.org/repos/asf/cassandra/blob/49efc13c/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/49efc13c/src/java/org/apache/cassandra/db/SystemKeyspace.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/49efc13c/src/java/org/apache/cassandra/service/CassandraDaemon.java
--
diff --cc src/java/org/apache/cassandra/service/CassandraDaemon.java
index 006,d36b0db..02731d8
--- a/src/java/org/apache/cassandra/service/CassandraDaemon.java
+++ b/src/java/org/apache/cassandra/service/CassandraDaemon.java
@@@ -21,7 -21,11 +21,9 @@@ import java.io.File
  import java.io.IOException;
  import java.lang.management.ManagementFactory;
  import java.net.InetAddress;
 -import java.net.MalformedURLException;
 -import java.net.URL;
  import java.util.Arrays;
+ import java.util.Map;
+ import java.util.UUID;
  import java.util.concurrent.TimeUnit;
  import javax.management.MBeanServer;
  import javax.management.ObjectName;
@@@ -30,7 -34,7 +32,6 @@@ import javax.management.StandardMBean
  import com.addthis.metrics.reporter.config.ReporterConfig;
  
  import com.google.common.collect.Iterables;
- import com.google.common.collect.SetMultimap;
 -import org.apache.log4j.PropertyConfigurator;
  import org.slf4j.Logger;
  import org.slf4j.LoggerFactory;
  
@@@ -199,6 -237,22 +200,12 @@@ public class CassandraDaemo
  // load keyspace descriptions.
  DatabaseDescriptor.loadSchemas();
  
 -try
 -{
 -LeveledManifest.maybeMigrateManifests();
 -}
 -catch(IOException e)
 -{
 -logger.error("Could not migrate old leveled manifest. Move away 
the .json file in the data directory", e);
 -

[1/3] git commit: Delete unfinished compaction sstables incrementally

2013-12-30 Thread yukim
Updated Branches:
  refs/heads/cassandra-2.0 c5ca8de4d - 4ed223407
  refs/heads/trunk 76ee9a155 - 49efc13cd


Delete unfinished compaction sstables incrementally

patch by thobbs; reviewed by yukim for CASSANDRA-6086


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4ed22340
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4ed22340
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4ed22340

Branch: refs/heads/cassandra-2.0
Commit: 4ed2234078c4d302c256332252a8ddd6ae345484
Parents: c5ca8de
Author: Yuki Morishita yu...@apache.org
Authored: Mon Dec 30 13:48:09 2013 -0600
Committer: Yuki Morishita yu...@apache.org
Committed: Mon Dec 30 13:48:09 2013 -0600

--
 CHANGES.txt |  1 +
 .../apache/cassandra/db/ColumnFamilyStore.java  | 39 +++
 .../org/apache/cassandra/db/SystemKeyspace.java | 23 --
 .../db/compaction/LeveledManifest.java  | 14 
 .../cassandra/service/CassandraDaemon.java  | 43 +---
 .../cassandra/db/ColumnFamilyStoreTest.java | 74 ++--
 .../db/compaction/CompactionsTest.java  |  4 +-
 7 files changed, 151 insertions(+), 47 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/4ed22340/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 0396006..958369a 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.5
+* Delete unfinished compaction incrementally (CASSANDRA-6086)
 Merged from 1.2:
  * fsync compression metadata (CASSANDRA-6531)
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4ed22340/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java 
b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index cbd9d2e..4d7d6f2 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -483,27 +483,30 @@ public class ColumnFamilyStore implements 
ColumnFamilyStoreMBean
  * compactions, we remove the new ones (since those may be incomplete -- 
under LCS, we may create multiple
  * sstables from any given ancestor).
  */
-public static void removeUnfinishedCompactionLeftovers(String keyspace, 
String columnfamily, Set<Integer> unfinishedGenerations)
+public static void removeUnfinishedCompactionLeftovers(String keyspace, 
String columnfamily, Map<Integer, UUID> unfinishedCompactions)
 {
 Directories directories = Directories.create(keyspace, columnfamily);
 
-// sanity-check unfinishedGenerations
-Set<Integer> allGenerations = new HashSet<Integer>();
+Set<Integer> allGenerations = new HashSet<>();
 for (Descriptor desc : directories.sstableLister().list().keySet())
 allGenerations.add(desc.generation);
+
+// sanity-check unfinishedCompactions
+Set<Integer> unfinishedGenerations = unfinishedCompactions.keySet();
 if (!allGenerations.containsAll(unfinishedGenerations))
 {
-throw new IllegalStateException("Unfinished compactions reference 
missing sstables."
-+ " This should never happen since 
compactions are marked finished before we start removing the old sstables.");
+HashSet<Integer> missingGenerations = new 
HashSet<>(unfinishedGenerations);
+missingGenerations.removeAll(allGenerations);
+logger.debug("Unfinished compactions of {}.{} reference missing 
sstables of generations {}",
+ keyspace, columnfamily, missingGenerations);
 }
 
 // remove new sstables from compactions that didn't complete, and 
compute
 // set of ancestors that shouldn't exist anymore
-Set<Integer> completedAncestors = new HashSet<Integer>();
+Set<Integer> completedAncestors = new HashSet<>();
 for (Map.Entry<Descriptor, Set<Component>> sstableFiles : 
directories.sstableLister().list().entrySet())
 {
 Descriptor desc = sstableFiles.getKey();
-Set<Component> components = sstableFiles.getValue();
 
 Set<Integer> ancestors;
 try
@@ -515,9 +518,16 @@ public class ColumnFamilyStore implements 
ColumnFamilyStoreMBean
 throw new FSReadError(e, desc.filenameFor(Component.STATS));
 }
 
-if (!ancestors.isEmpty() && 
unfinishedGenerations.containsAll(ancestors))
+if (!ancestors.isEmpty()
+ && unfinishedGenerations.containsAll(ancestors)
+ && allGenerations.containsAll(ancestors))
 {
-SSTable.delete(desc, 

[jira] [Comment Edited] (CASSANDRA-6271) Replace SnapTree in AtomicSortedColumns

2013-12-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13859037#comment-13859037
 ] 

Jonathan Ellis edited comment on CASSANDRA-6271 at 12/30/13 8:04 PM:
-

Having consumeNextLeaf sometimes consume one node per call and sometimes two is 
seriously confusing.  Should fix that, I doubt this pseudo-loop-unroll saves 
much.

Edit: maybe the first {{if}} was supposed to be a {{while}}?


was (Author: jbellis):
Having consumeNextLeaf sometimes consume one node per call and sometimes two is 
seriously confusing.  Should fix that, I doubt this pseudo-loop-unroll saves 
much.

 Replace SnapTree in AtomicSortedColumns
 ---

 Key: CASSANDRA-6271
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6271
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Attachments: oprate.svg


 On the write path a huge percentage of time is spent in GC (50% in my tests, 
 if accounting for slow down due to parallel marking). SnapTrees are both GC 
 unfriendly due to their structure and also very expensive to keep around - 
 each column name in AtomicSortedColumns uses > 100 bytes on average 
 (excluding the actual ByteBuffer).
 I suggest using a sorted array; changes are supplied at-once, as opposed to 
 one at a time, and if < 10% of the keys in the array change (and data equal 
 to < 10% of the size of the key array) we simply overlay a new array of 
 changes only over the top. Otherwise we rewrite the array. This method should 
 ensure much less GC overhead, and also save approximately 80% of the current 
 memory overhead.
 TreeMap is a similarly difficult object for the GC, and a related task might be 
 to remove it where not strictly necessary, even though we don't keep them 
 hanging around for long. TreeMapBackedSortedColumns, for instance, seems to 
 be used in a lot of places where we could simply sort the columns.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Resolved] (CASSANDRA-6086) Node refuses to start with exception in ColumnFamilyStore.removeUnfinishedCompactionLeftovers when find that some to be removed files are already removed

2013-12-30 Thread Yuki Morishita (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuki Morishita resolved CASSANDRA-6086.
---

Resolution: Fixed
  Reviewer: Yuki Morishita  (was: Tyler Hobbs)
  Assignee: Tyler Hobbs  (was: Yuki Morishita)

Thanks Tyler, committed.
(I removed LegacyLeveledManifest related change in trunk since it no longer 
exists.)

 Node refuses to start with exception in 
 ColumnFamilyStore.removeUnfinishedCompactionLeftovers when find that some to 
 be removed files are already removed
 -

 Key: CASSANDRA-6086
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6086
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Oleg Anastasyev
Assignee: Tyler Hobbs
 Fix For: 2.0.5

 Attachments: 6086-2.0-v3.txt, 6086-v2.txt, 
 removeUnfinishedCompactionLeftovers.txt


 Node refuses to start with
 {code}
 Caused by: java.lang.IllegalStateException: Unfinished compactions reference 
 missing sstables. This should never happen since compactions are marked 
 finished before we start removing the old sstables.
   at 
 org.apache.cassandra.db.ColumnFamilyStore.removeUnfinishedCompactionLeftovers(ColumnFamilyStore.java:544)
   at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:262)
 {code}
 IMO, there is no reason to refuse to start upon discovering that files which
 must be removed are already removed. It looks like pure bug-diagnostic code
 and means nothing to the operator (nor can the operator do anything about it).
 Replaced the throw of an exception with a diagnostic warning and continued
 startup.
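The shape of the fix described above (warn and continue instead of throwing) can be sketched like this. This is an illustrative Python analogue, not Cassandra's actual `removeUnfinishedCompactionLeftovers` code; the function name and signature are invented:

```python
import logging

log = logging.getLogger("compaction-leftovers")

def remove_unfinished_leftovers(referenced, existing):
    """Sketch: an unfinished compaction references some sstables; if any of
    them are already gone from disk, log a warning and continue startup
    rather than raising (the old behaviour was an IllegalStateException).

    referenced: set of sstable names the unfinished compaction refers to
    existing:   set of sstable names actually present on disk
    Returns the leftovers that can actually be removed.
    """
    missing = referenced - existing
    if missing:
        log.warning("Unfinished compactions reference missing sstables: %s; "
                    "continuing startup", sorted(missing))
    return referenced & existing
```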



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Comment Edited] (CASSANDRA-6271) Replace SnapTree in AtomicSortedColumns

2013-12-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13859037#comment-13859037
 ] 

Jonathan Ellis edited comment on CASSANDRA-6271 at 12/30/13 8:04 PM:
-

Having consumeNextLeaf sometimes consume one node per call and sometimes two is 
seriously confusing.  Should fix that, I doubt this pseudo-loop-unroll saves 
much.

Edit: maybe the first {{if}} was supposed to be a {{while}}?  But then the 
{{while}} in count() becomes redundant.


was (Author: jbellis):
Having consumeNextLeaf sometimes consume one node per call and sometimes two is 
seriously confusing.  Should fix that, I doubt this pseudo-loop-unroll saves 
much.

Edit: maybe the first {{if}} was supposed to be a {{while}}?

 Replace SnapTree in AtomicSortedColumns
 ---

 Key: CASSANDRA-6271
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6271
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Attachments: oprate.svg


 On the write path a huge percentage of time is spent in GC (> 50% in my tests,
 if accounting for slowdown due to parallel marking). SnapTrees are both GC
 unfriendly due to their structure and also very expensive to keep around -
 each column name in AtomicSortedColumns uses > 100 bytes on average
 (excluding the actual ByteBuffer).
 I suggest using a sorted array; changes are supplied at once, as opposed to
 one at a time, and if < 10% of the keys in the array change (and data equal
 to < 10% of the size of the key array) we simply overlay a new array of
 changes only over the top. Otherwise we rewrite the array. This method should
 ensure much less GC overhead, and also save approximately 80% of the current
 memory overhead.
 TreeMap is a similarly difficult object for the GC, and a related task might be
 to remove it where not strictly necessary, even though we don't keep them 
 hanging around for long. TreeMapBackedSortedColumns, for instance, seems to 
 be used in a lot of places where we could simply sort the columns.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6378) sstableloader does not support client encryption on Cassandra 2.0

2013-12-30 Thread David Laube (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13859078#comment-13859078
 ] 

David Laube commented on CASSANDRA-6378:


Thanks for all of the hard work and effort everyone put in to get this fixed!

 sstableloader does not support client encryption on Cassandra 2.0
 -

 Key: CASSANDRA-6378
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6378
 Project: Cassandra
  Issue Type: Bug
Reporter: David Laube
Assignee: Sam Tunnicliffe
  Labels: client, encryption, ssl, sstableloader
 Fix For: 2.0.4

 Attachments: 0001-CASSANDRA-6387-Add-SSL-support-to-BulkLoader.patch


 We have been testing backup/restore from one ring to another and we recently 
 stumbled upon an issue with sstableloader. When client_enc_enable: true, the 
 exception below is generated. However, when client_enc_enable is set to 
 false, sstableloader is able to get to the point where it discovers
 endpoints, connects to stream data, etc.
 ==BEGIN EXCEPTION==
 sstableloader --debug -d x.x.x.248,x.x.x.108,x.x.x.113 
 /tmp/import/keyspace_name/columnfamily_name
 Exception in thread "main" java.lang.RuntimeException: Could not retrieve
 endpoint ranges:
 at 
 org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:226)
 at 
 org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:149)
 at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:68)
 Caused by: org.apache.thrift.transport.TTransportException: Frame size 
 (352518400) larger than max length (16384000)!
 at 
 org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:137)
 at 
 org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
 at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
 at 
 org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:362)
 at 
 org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:284)
 at 
 org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:191)
 at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
 at 
 org.apache.cassandra.thrift.Cassandra$Client.recv_describe_partitioner(Cassandra.java:1292)
 at 
 org.apache.cassandra.thrift.Cassandra$Client.describe_partitioner(Cassandra.java:1280)
 at 
 org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:199)
 ... 2 more
 ==END EXCEPTION==
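One plausible reading of the odd frame size in the exception above (this is an editorial observation, not something stated in the ticket): a plaintext framed Thrift client that connects to an SSL-only port reads the first bytes of a TLS record and interprets them as a frame length. The first four bytes of a TLS 1.0 alert record, taken as a big-endian 32-bit integer, give exactly the reported value:

```python
import struct

# First four bytes of a TLS 1.0 alert record: content type 21 (alert),
# protocol version 3.1 (TLS 1.0), start of the 2-byte length field.
tls_header = bytes([0x15, 0x03, 0x01, 0x00])

# A framed Thrift transport reads the first 4 bytes as a big-endian frame size.
frame_size = struct.unpack(">i", tls_header)[0]
print(frame_size)  # -> 352518400, the size reported in the exception
```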



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6419) Setting max_hint_window_in_ms explicitly to null causes problems with JMX view

2013-12-30 Thread Robert Coli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13859102#comment-13859102
 ] 

Robert Coli commented on CASSANDRA-6419:


FWIW, this looks like a single case of the systemic issue described in 
CASSANDRA-4967.

 Setting max_hint_window_in_ms explicitly to null causes problems with JMX view
 --

 Key: CASSANDRA-6419
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6419
 Project: Cassandra
  Issue Type: Bug
  Components: Config
Reporter: Nate McCall
Assignee: Nate McCall
Priority: Minor
 Fix For: 1.2.14, 2.0.4

 Attachments: 6419-1.2.patch, 6419-2.0.patch


 Setting max_hint_window_in_ms to null in cassandra.yaml makes the
 StorageProxy mbean inaccessible.
 Stack trace when trying to view the bean through MX4J:
 {code}
 Exception during http request
 javax.management.RuntimeMBeanException: java.lang.NullPointerException
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.rethrow(DefaultMBeanServerInterceptor.java:839)
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.rethrowMaybeMBeanException(DefaultMBeanServerInterceptor.java:852)
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:651)
   at 
 com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678)
   at 
 mx4j.tools.adaptor.http.MBeanCommandProcessor.createMBeanElement(MBeanCommandProcessor.java:119)
   at 
 mx4j.tools.adaptor.http.MBeanCommandProcessor.executeRequest(MBeanCommandProcessor.java:56)
   at 
 mx4j.tools.adaptor.http.HttpAdaptor$HttpClient.run(HttpAdaptor.java:980)
 Caused by: java.lang.NullPointerException
   at 
 org.apache.cassandra.config.DatabaseDescriptor.getMaxHintWindow(DatabaseDescriptor.java:1161)
   at 
 org.apache.cassandra.service.StorageProxy.getMaxHintWindow(StorageProxy.java:1506)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
   at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
   at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
   at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
   at 
 com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
   at 
 com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:83)
   at 
 com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206)
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647)
   ... 4 more
 Exception during http request
 {code}
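The NullPointerException in `DatabaseDescriptor.getMaxHintWindow` is the classic Java pattern of auto-unboxing a null `Integer` when a YAML key is present but explicitly null. The defensive shape of a fix can be sketched as follows; the default value and function name here are illustrative, not Cassandra's actual config code:

```python
# Python analogue of the failure mode: a config value explicitly set to
# null reaches code that expects a plain int (in Java, unboxing a null
# Integer throws NullPointerException).

DEFAULT_MAX_HINT_WINDOW_MS = 3 * 3600 * 1000  # illustrative default (3 hours)

def get_max_hint_window(conf_value):
    """Defensive accessor: fall back to a default when the YAML key was
    present but explicitly null, instead of blowing up in the JMX getter."""
    return DEFAULT_MAX_HINT_WINDOW_MS if conf_value is None else conf_value
```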



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5682) When the Cassandra delete keys in secondary Index?

2013-12-30 Thread Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13859121#comment-13859121
 ] 

Rao commented on CASSANDRA-5682:


Question: we are seeing some performance issues with some queries (only with
specific inputs). The first query takes around 10 sec; the second query comes
back instantly. The column routeoffer has a secondary index. How can we check
whether we have an issue similar to the one described in this ticket, or
whether it could be something else? We tried to repair and rebuild the index,
but that did not fix the issue.

cqlsh:topology> SELECT count(*) FROM ManagedResource WHERE routeoffer='JMETER'
ALLOW FILTERING;
 count
-------
   137

cqlsh:topology> SELECT count(*) FROM ManagedResource WHERE routeoffer='DEFAULT'
ALLOW FILTERING;
 count
-------
   161

 When the Cassandra delete keys in secondary Index?
 --

 Key: CASSANDRA-5682
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5682
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 2.0.1
 Environment: normal x86 PC (i3 CPU + 4GB ram) + Ubuntu 12.04
Reporter: YounwooKim
Priority: Minor

 How can I reduce the size of a secondary index?
 Obviously, I deleted many keys, and tried flush, compact, cleanup, and
 rebuild_index using nodetool. However, I can't reduce the size of the
 secondary index. (Of course, the size of the table (primary key) is reduced.)
 Therefore, I looked for hints in the Cassandra source code, and I infer the
 following about secondary index deletion:
 1) When I request deletion of a key, and the key is in the sstable (not in the
 memtable), Cassandra doesn't insert a tombstone into the sstable for the
 secondary index (unlike the table).
 (from the AbstractSimpleColumnSecondaryIndex.delete() function)
 2) After scanning the secondary index, the tombstone is made in the secondary
 index.
 (from the KeysSearcher.getIndexedIterator() function; it is called by the
 index scan verb)
 3) The cleanup command in nodetool is used to delete out-of-range keys
 (cleanup doesn't care about deleted keys).
 (from the CompactionManager.doCleanupCompaction() function)
 After this, I scan deleted keys using a 'WHERE' clause, and I can reduce the
 size of the secondary index. I think that this is the only way to reduce the
 size of the secondary index.
 Is this a correct conclusion? I can't find related articles or other methods.
 I think that Cassandra needs a compaction function for secondary indexes.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6271) Replace SnapTree in AtomicSortedColumns

2013-12-30 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13859134#comment-13859134
 ] 

Benedict commented on CASSANDRA-6271:
-

bq. Having consumeNextLeaf sometimes consume one node per call and sometimes 
two is seriously confusing. Should fix that, I doubt this pseudo-loop-unroll 
saves much.
I don't see a scenario where it consumes more than one leaf? Could you give me 
an example sequence of events?

bq. Edit: maybe the first if was supposed to be a while? But then the while in 
count() becomes redundant.
If it isn't a leaf, a single call to successor() is enough to guarantee we are 
in a leaf, since there is a child either side of each branch key.



 Replace SnapTree in AtomicSortedColumns
 ---

 Key: CASSANDRA-6271
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6271
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Attachments: oprate.svg


 On the write path a huge percentage of time is spent in GC (> 50% in my tests,
 if accounting for slowdown due to parallel marking). SnapTrees are both GC
 unfriendly due to their structure and also very expensive to keep around -
 each column name in AtomicSortedColumns uses > 100 bytes on average
 (excluding the actual ByteBuffer).
 I suggest using a sorted array; changes are supplied at once, as opposed to
 one at a time, and if < 10% of the keys in the array change (and data equal
 to < 10% of the size of the key array) we simply overlay a new array of
 changes only over the top. Otherwise we rewrite the array. This method should
 ensure much less GC overhead, and also save approximately 80% of the current
 memory overhead.
 TreeMap is a similarly difficult object for the GC, and a related task might be
 to remove it where not strictly necessary, even though we don't keep them 
 hanging around for long. TreeMapBackedSortedColumns, for instance, seems to 
 be used in a lot of places where we could simply sort the columns.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6271) Replace SnapTree in AtomicSortedColumns

2013-12-30 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13859136#comment-13859136
 ] 

Benedict commented on CASSANDRA-6271:
-

bq. [junit] No runnable methods

Sorry, laziness on my part for not checking the ant build to see if we
actually had a long-test runner; I was just running this through the main()
method. I'll refactor it into an actual JUnit test and upload.

bq. This looks like we're second-guessing lower/upper bounds. Would prefer to 
assert that lower is actually = upper.

Actually, IIRC this is to deal with inclusive/exclusive bounds. Since we find 
these using floor/ceil/higher/lower, we can actually end up in a position where 
for empty result ranges we find the upper/lower bounds in the btree to be 
either side of each other, even though the provided ranges were legal. 

 Replace SnapTree in AtomicSortedColumns
 ---

 Key: CASSANDRA-6271
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6271
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Attachments: oprate.svg


 On the write path a huge percentage of time is spent in GC (> 50% in my tests,
 if accounting for slowdown due to parallel marking). SnapTrees are both GC
 unfriendly due to their structure and also very expensive to keep around -
 each column name in AtomicSortedColumns uses > 100 bytes on average
 (excluding the actual ByteBuffer).
 I suggest using a sorted array; changes are supplied at once, as opposed to
 one at a time, and if < 10% of the keys in the array change (and data equal
 to < 10% of the size of the key array) we simply overlay a new array of
 changes only over the top. Otherwise we rewrite the array. This method should
 ensure much less GC overhead, and also save approximately 80% of the current
 memory overhead.
 TreeMap is a similarly difficult object for the GC, and a related task might be
 to remove it where not strictly necessary, even though we don't keep them 
 hanging around for long. TreeMapBackedSortedColumns, for instance, seems to 
 be used in a lot of places where we could simply sort the columns.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6271) Replace SnapTree in AtomicSortedColumns

2013-12-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13859137#comment-13859137
 ] 

Jonathan Ellis commented on CASSANDRA-6271:
---

bq. If it isn't a leaf, a single call to successor() is enough to guarantee we 
are in a leaf, since there is a child either side of each branch key

Why can a branch not have another branch as a child?

 Replace SnapTree in AtomicSortedColumns
 ---

 Key: CASSANDRA-6271
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6271
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Attachments: oprate.svg


 On the write path a huge percentage of time is spent in GC (> 50% in my tests,
 if accounting for slowdown due to parallel marking). SnapTrees are both GC
 unfriendly due to their structure and also very expensive to keep around -
 each column name in AtomicSortedColumns uses > 100 bytes on average
 (excluding the actual ByteBuffer).
 I suggest using a sorted array; changes are supplied at once, as opposed to
 one at a time, and if < 10% of the keys in the array change (and data equal
 to < 10% of the size of the key array) we simply overlay a new array of
 changes only over the top. Otherwise we rewrite the array. This method should
 ensure much less GC overhead, and also save approximately 80% of the current
 memory overhead.
 TreeMap is a similarly difficult object for the GC, and a related task might be
 to remove it where not strictly necessary, even though we don't keep them 
 hanging around for long. TreeMapBackedSortedColumns, for instance, seems to 
 be used in a lot of places where we could simply sort the columns.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6501) Cannot run pig examples on current 2.0 branch

2013-12-30 Thread Alex Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13859147#comment-13859147
 ] 

Alex Liu commented on CASSANDRA-6501:
-

I tested the cassandra-2.0 branch on pig 0.10.1; it works fine.

 Cannot run pig examples on current 2.0 branch
 -

 Key: CASSANDRA-6501
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6501
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Reporter: Jeremy Hanna
Assignee: Alex Liu
  Labels: pig

 I checked out the cassandra-2.0 branch to try the pig examples because the 
 2.0.3 release has the CASSANDRA-6309 problem which is fixed on the branch.  I 
 tried to run both the cql and the CassandraStorage examples in local mode 
 with pig 0.10.1, 0.11.1, and 0.12.0 and all of them give the following error 
 and stack trace:
 {quote}
 ERROR 2998: Unhandled internal error. readLength_
 java.lang.NoSuchFieldError: readLength_
   at 
 org.apache.cassandra.thrift.TBinaryProtocol$Factory.getProtocol(TBinaryProtocol.java:57)
   at org.apache.thrift.TSerializer.<init>(TSerializer.java:66)
   at 
 org.apache.cassandra.hadoop.pig.AbstractCassandraStorage.cfdefToString(AbstractCassandraStorage.java:508)
   at 
 org.apache.cassandra.hadoop.pig.AbstractCassandraStorage.initSchema(AbstractCassandraStorage.java:470)
   at 
 org.apache.cassandra.hadoop.pig.CassandraStorage.setLocation(CassandraStorage.java:318)
   at 
 org.apache.cassandra.hadoop.pig.CassandraStorage.getSchema(CassandraStorage.java:357)
   at 
 org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:151)
   at 
 org.apache.pig.newplan.logical.relational.LOLoad.getSchema(LOLoad.java:110)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.alias_col_ref(LogicalPlanGenerator.java:15356)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.col_ref(LogicalPlanGenerator.java:15203)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.projectable_expr(LogicalPlanGenerator.java:8881)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.var_expr(LogicalPlanGenerator.java:8632)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.expr(LogicalPlanGenerator.java:7984)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.flatten_generated_item(LogicalPlanGenerator.java:5962)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.generate_clause(LogicalPlanGenerator.java:14101)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.foreach_plan(LogicalPlanGenerator.java:12493)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.foreach_clause(LogicalPlanGenerator.java:12360)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1577)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:789)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:507)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:382)
   at 
 org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:175)
   at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1589)
   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1540)
   at org.apache.pig.PigServer.registerQuery(PigServer.java:540)
   at 
 org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:970)
   at 
 org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
   at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189)
   at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
   at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
   at org.apache.pig.Main.run(Main.java:555)
   at org.apache.pig.Main.main(Main.java:111)
 
 {quote}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6271) Replace SnapTree in AtomicSortedColumns

2013-12-30 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13859181#comment-13859181
 ] 

Benedict commented on CASSANDRA-6271:
-

bq. I'll refactor into an actual JUnit test and upload.
The repository has been updated. I've also tweaked the fan factor to be
configurable with a system property, which the unit tests now set to a smaller
value to increase the number of complex operations done per test.

 Replace SnapTree in AtomicSortedColumns
 ---

 Key: CASSANDRA-6271
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6271
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Attachments: oprate.svg


 On the write path a huge percentage of time is spent in GC (> 50% in my tests,
 if accounting for slowdown due to parallel marking). SnapTrees are both GC
 unfriendly due to their structure and also very expensive to keep around -
 each column name in AtomicSortedColumns uses > 100 bytes on average
 (excluding the actual ByteBuffer).
 I suggest using a sorted array; changes are supplied at once, as opposed to
 one at a time, and if < 10% of the keys in the array change (and data equal
 to < 10% of the size of the key array) we simply overlay a new array of
 changes only over the top. Otherwise we rewrite the array. This method should
 ensure much less GC overhead, and also save approximately 80% of the current
 memory overhead.
 TreeMap is a similarly difficult object for the GC, and a related task might be
 to remove it where not strictly necessary, even though we don't keep them 
 hanging around for long. TreeMapBackedSortedColumns, for instance, seems to 
 be used in a lot of places where we could simply sort the columns.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6271) Replace SnapTree in AtomicSortedColumns

2013-12-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13859191#comment-13859191
 ] 

Jonathan Ellis commented on CASSANDRA-6271:
---

The way Stack abuses succ/prev mutation to both construct a path and navigate 
it confuses the hell out of me.  

Wouldn't it be a lot simpler to add a Previous node reference (e.g. the first
reference in the Object[] node)?  Then we wouldn't need to construct the
Object[][] in Stack at all, or its poorly defined depth semantics; we'd just
need a node and an index, and almost all the complexity goes away from
succ/prev.

 Replace SnapTree in AtomicSortedColumns
 ---

 Key: CASSANDRA-6271
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6271
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Attachments: oprate.svg


 On the write path a huge percentage of time is spent in GC (> 50% in my tests,
 if accounting for slowdown due to parallel marking). SnapTrees are both GC
 unfriendly due to their structure and also very expensive to keep around -
 each column name in AtomicSortedColumns uses > 100 bytes on average
 (excluding the actual ByteBuffer).
 I suggest using a sorted array; changes are supplied at once, as opposed to
 one at a time, and if < 10% of the keys in the array change (and data equal
 to < 10% of the size of the key array) we simply overlay a new array of
 changes only over the top. Otherwise we rewrite the array. This method should
 ensure much less GC overhead, and also save approximately 80% of the current
 memory overhead.
 TreeMap is a similarly difficult object for the GC, and a related task might be
 to remove it where not strictly necessary, even though we don't keep them 
 hanging around for long. TreeMapBackedSortedColumns, for instance, seems to 
 be used in a lot of places where we could simply sort the columns.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6271) Replace SnapTree in AtomicSortedColumns

2013-12-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13859198#comment-13859198
 ] 

Jonathan Ellis commented on CASSANDRA-6271:
---

bq. The repository has been updated

I've rebased my work today on top of this and pushed --force to my branch.

 Replace SnapTree in AtomicSortedColumns
 ---

 Key: CASSANDRA-6271
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6271
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Attachments: oprate.svg


 On the write path a huge percentage of time is spent in GC (> 50% in my tests,
 if accounting for slowdown due to parallel marking). SnapTrees are both GC
 unfriendly due to their structure and also very expensive to keep around -
 each column name in AtomicSortedColumns uses > 100 bytes on average
 (excluding the actual ByteBuffer).
 I suggest using a sorted array; changes are supplied at once, as opposed to
 one at a time, and if < 10% of the keys in the array change (and data equal
 to < 10% of the size of the key array) we simply overlay a new array of
 changes only over the top. Otherwise we rewrite the array. This method should
 ensure much less GC overhead, and also save approximately 80% of the current
 memory overhead.
 TreeMap is a similarly difficult object for the GC, and a related task might be
 to remove it where not strictly necessary, even though we don't keep them 
 hanging around for long. TreeMapBackedSortedColumns, for instance, seems to 
 be used in a lot of places where we could simply sort the columns.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6245) nodetool refresh design is unsafe

2013-12-30 Thread Robert Coli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13859230#comment-13859230
 ] 

Robert Coli commented on CASSANDRA-6245:


{quote}
don't copy sstables over existing in-use sequence numbers
{quote}
This can only be a meaningful caveat to the use of nodetool refresh if we
actually give operators this instruction anywhere. From what I can tell, the
only documentation for refresh is:

http://www.datastax.com/documentation/cassandra/2.0/webhelp/cassandra/tools/toolsNodetool_r.html

refresh <keyspace> <table>
Loads newly placed SSTables on to the system without restart.


Additionally, a quote from #cassandra showing how some operators clearly do
not understand the basics of SSTable naming vis-à-vis sequence numbers:

{noformat}
17:06 < rcoli> cornedbeefsand: look at the versions of your sstables, via find
17:06 < rcoli> it's a component of the name.
17:06 < cornedbeefsand> right, i did that, but which component?
17:07 < cornedbeefsand> e.g.,
jenkins_preview_data-timeseries_NormalizedTSDouble-hf-23-Data.db
17:07 < cornedbeefsand> v23?
17:07 < rcoli> -hf-
17:07 < rcoli> 23 is a sequence number
{noformat}
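The sstable filename from the IRC exchange encodes, among other components, the on-disk format version ("hf") and the per-column-family sequence (generation) number (23). A minimal sketch of pulling those two components out, using a simplified regex that only covers names shaped like this example (not the full historical naming grammar):

```python
import re

# Simplified pattern for names like "...-hf-23-Data.db":
# a two-letter format version, then the generation number, then the component.
SSTABLE_RE = re.compile(r"-(?P<version>[a-z]{2})-(?P<generation>\d+)-Data\.db$")

def parse_sstable_name(name):
    """Extract (version, generation) from a -Data.db sstable filename."""
    m = SSTABLE_RE.search(name)
    if m is None:
        raise ValueError("not a recognized -Data.db sstable name: " + name)
    return m.group("version"), int(m.group("generation"))

print(parse_sstable_name(
    "jenkins_preview_data-timeseries_NormalizedTSDouble-hf-23-Data.db"))
# -> ('hf', 23)
```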

Would you accept a patch which either :

1) moves refresh tasks so that they run from a subdirectory in the CF
directory named "refresh"
or
2) checks for file existence before creation, and inflates the sequence by one 
if the file exists

?

 nodetool refresh design is unsafe
 ---

 Key: CASSANDRA-6245
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6245
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Robert Coli
Priority: Minor

 CASSANDRA-2991 added a nodetool refresh feature by which Cassandra is able 
 to discover non-live SSTables in the datadir and make them live.
 It does this by :
 1) looking for SSTable files in the data dir
 2) renaming SSTables it finds into the current SSTable id sequence
 This implementation is exposed to a race with a chance of silent data loss.
 1) Node's SSTable id sequence is on sstable #2, the next table to flush will 
 get 2 as its numeric part
 2) Copy an SSTable with 2 as its numeric part into the data dir
 3) nodetool flush
 4) Notice that your #2 SSTable has been silently overwritten by a
 just-flushed #2 SSTable
 5) nodetool refresh would still succeed, but would now be a no-op
 A simple solution would be to create a subdirectory of the datadir called
 refresh/ to serve as the location to refresh from.
 Alternately/additionally, there is probably not really a compelling reason
 for Cassandra to completely ignore existing files at write time; a check for
 existing files at a given index, inflating the index to avoid overwriting
 them, seems trivial and inexpensive. I will gladly file a JIRA for this
 change in isolation if there is interest.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)