[jira] [Commented] (CASSANDRA-5959) CQL3 support for multi-column insert in a single operation (Batch Insert / Batch Mutate)

2013-09-02 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755973#comment-13755973
 ] 

Sylvain Lebresne commented on CASSANDRA-5959:
-

For what it's worth, I wouldn't be opposed to adding the multi-value INSERT 
extension from the description. It can be handy (as in, it minimizes the number 
of characters to type in cqlsh to insert multiple rows), and at least both MySQL 
and PostgreSQL support such a syntax extension.

Though, as hinted above, it wouldn't fix the performance problem described here, 
so it's a completely different motivation.  The reason such a big batch is slow 
is parsing (and possibly also the transport of the large query string, 
though that part can be solved by using compression at the transport level). If 
you want performance on such a big insert, you'll definitely need to use 
prepared statements (and batches of them), and that's where the absence of 
CASSANDRA-4693 hurts in 1.2.

I'll note, however, that while C* 1.2 doesn't have CASSANDRA-4693, it can still 
prepare batch statements. So a workaround could be to prepare a medium-sized 
batch of a fixed number of inserts, say 500 inserts (though some experimentation 
to find the best number is probably in order), and use that to insert the 50K 
columns in batches of 500. It won't be as efficient as what CASSANDRA-4693 
gives you and it's certainly a bit of a pain to implement client side, but 
performance-wise, this should (emphasis on should, since I haven't tested it) 
get you closer to the Thrift performance numbers.
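
The client-side chunking part of that workaround can be sketched as follows, in Python for brevity. This is only an illustration under assumptions: `chunked`, `BATCH_SIZE`, and the sample data are hypothetical names, no driver API is used, and a final group shorter than 500 would need its own smaller prepared statement.

```python
from itertools import islice

BATCH_SIZE = 500  # assumed starting point; tune by experiment


def chunked(pairs, size):
    """Yield successive lists of at most `size` items from `pairs`."""
    it = iter(pairs)
    while True:
        group = list(islice(it, size))
        if not group:
            return
        yield group


# Hypothetical data: 50,000 (index, value) pairs for one row.
columns = [(i, "text%d" % i) for i in range(50000)]
batches = list(chunked(columns, BATCH_SIZE))

# Each full group supplies the bind values for one execution of the
# prepared 500-insert batch (2 values per insert: index and value).
bind_values = [v for pair in batches[0] for v in pair]
print(len(batches), len(bind_values))
```

With 50,000 columns and groups of 500, the prepared batch would be executed 100 times, each with 1,000 bound values.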


 CQL3 support for multi-column insert in a single operation (Batch Insert / 
 Batch Mutate)
 

 Key: CASSANDRA-5959
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5959
 Project: Cassandra
  Issue Type: New Feature
  Components: Core, Drivers
Reporter: Les Hazlewood
  Labels: CQL

 h3. Impetus for this Request
 (from the original [question on 
 StackOverflow|http://stackoverflow.com/questions/18522191/using-cassandra-and-cql3-how-do-you-insert-an-entire-wide-row-in-a-single-reque]):
 I want to insert a single row with 50,000 columns into Cassandra 1.2.9. 
 Before inserting, I have all the data for the entire row ready to go (in 
 memory):
 {code}
 +---------+------+------+------+------+-------+
 |         |   0  |   1  |   2  | ...  | 49999 |
 | row_id  +------+------+------+------+-------+
 |         | text | text | text | ...  | text  |
 +---------+------+------+------+------+-------+
 {code}
 The column names are integers, allowing slicing for pagination. Each column 
 value is the value stored at that particular index.
 CQL3 table definition:
 {code}
 create table results (
 row_id text,
 index int,
 value text,
 primary key (row_id, index)
 ) 
 with compact storage;
 {code}
 As I already have the row_id and all 50,000 name/value pairs in memory, I 
 just want to insert a single row into Cassandra in a single request/operation 
 so it is as fast as possible.
 The only approach I can find is to execute the following statement 50,000 times:
 {code}
 INSERT INTO results (row_id, index, value) values (my_row_id, ?, ?);
 {code}
 where the first {{?}} is an index counter ({{i}}) and the second {{?}} is 
 the text value to store at location {{i}}.
 With the Datastax Java Driver client and C* server on the same development 
 machine, this took a full minute to execute.
 Oddly enough, the same 50,000 insert statements in a [Datastax Java Driver 
 Batch|http://www.datastax.com/drivers/java/apidocs/com/datastax/driver/core/querybuilder/QueryBuilder.html#batch(com.datastax.driver.core.Statement...)]
  on the same machine took 7.5 minutes.  I thought batches were supposed to be 
 _faster_ than individual inserts?
 We tried instead with a Thrift client (Astyanax) and the same insert via a 
 [MutationBatch|http://netflix.github.io/astyanax/javadoc/com/netflix/astyanax/MutationBatch.html].
   This took _235 milliseconds_.
 h3. Feature Request
 As a result of this performance testing, this issue is to request that CQL3 
 support batch mutation operations as a single operation (statement) to ensure 
 the same speed/performance benefits as existing Thrift clients.
 Example suggested syntax (based on the above example table/column family):
 {code}
 insert into results (row_id, (index,value)) values 
 ((0,text0), (1,text1), (2,text2), ..., (N,textN));
 {code}
 Each value in the {{values}} clause is a tuple.  The first tuple element is 
 the column name, the second tuple element is the column value.  This seems to 
 be the most simple/accurate representation of what happens during a batch 
 insert/mutate.
 Not having this CQL feature forced us to remove the Datastax Java Driver 
 (which we liked) in favor of Astyanax 

[jira] [Created] (CASSANDRA-5968) Nodetool info throws NPE when connected to a booting instance

2013-09-02 Thread Janne Jalkanen (JIRA)
Janne Jalkanen created CASSANDRA-5968:
-

 Summary: Nodetool info throws NPE when connected to a booting 
instance
 Key: CASSANDRA-5968
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5968
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 1.2.9, Ubuntu 12.04 LTS, Oracle JVM 7u25
Reporter: Janne Jalkanen
Priority: Minor


When an instance is newly added to the cluster and it's still streaming stuff, 
trying to call nodetool info on it throws NPE. Stack trace below.

To replicate: add a new node to the cluster, run nodetool info before bootstrap 
is complete.

Expected behaviour: nodetool handles this gracefully and just reports that the 
RPC server is not running.

{noformat}
$ nodetool info
Token: (invoke with -T/--tokens to see all 0 tokens)
ID   : cc7bcf48-4a54-48af-97f6-99c82bce76f2
Gossip active: true
Exception in thread "main" java.lang.NullPointerException
at 
org.apache.cassandra.service.StorageService.isRPCServerRunning(StorageService.java:330)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
at 
com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
at 
com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:83)
at 
com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206)
at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647)
at 
com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678)
at 
javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1464)
at 
javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
at 
javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
at 
javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
at 
javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:657)
at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
at sun.rmi.transport.Transport$1.run(Transport.java:177)
at sun.rmi.transport.Transport$1.run(Transport.java:174)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
at 
sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:553)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:808)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:667)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
{noformat}


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (CASSANDRA-5969) Allow JVM_OPTS to be passed to sstablescrub

2013-09-02 Thread Adam Hattrell (JIRA)
Adam Hattrell created CASSANDRA-5969:


 Summary: Allow JVM_OPTS to be passed to sstablescrub
 Key: CASSANDRA-5969
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5969
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Adam Hattrell


Can you add a feature request to pass JVM_OPTS to the sstablescrub script -- 
and other places where java is being called? (Among other things, this lets us 
run java stuff with -Djava.awt.headless=true on OS X so that Java processes 
don't pop up into the foreground -- i.e. we have a script that loops over all 
CFs and runs sstablescrub, and without that flag being passed in the OS X 
machine becomes pretty much unusable as it keeps switching focus to the java 
processes as they start.)
 
--- a/resources/cassandra/bin/sstablescrub
+++ b/resources/cassandra/bin/sstablescrub
@@ -70,7 +70,7 @@ if [ x$MAX_HEAP_SIZE = x ]; then
 MAX_HEAP_SIZE=256M
 fi
 
-$JAVA -ea -cp $CLASSPATH -Xmx$MAX_HEAP_SIZE \
+$JAVA $JVM_OPTS -ea -cp $CLASSPATH -Xmx$MAX_HEAP_SIZE \
 -Dlog4j.configuration=log4j-tools.properties \
 org.apache.cassandra.tools.StandaloneScrubber $@



[jira] [Assigned] (CASSANDRA-5969) Allow JVM_OPTS to be passed to sstablescrub

2013-09-02 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-5969:
-

Assignee: Brandon Williams

Hmm.  Quoting $JVM_OPTS is typically best practice but quoting it makes it fail 
when JVM_OPTS is undefined.
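
To illustrate the trade-off, here is a small Python model of the argument list the shell would hand to java. It is a sketch, not the script itself: `build_argv` is a hypothetical helper, and it assumes standard shell behaviour, where a quoted expansion always yields exactly one argument (an empty string when the variable is unset or empty) and an unquoted one is split on whitespace.

```python
def build_argv(jvm_opts, quoted):
    """Model the argv for: $JAVA $JVM_OPTS -ea ... (hypothetical helper)."""
    if quoted:
        # "$JVM_OPTS": always exactly one argument, even when empty.
        opts = [jvm_opts]
    else:
        # $JVM_OPTS: split on whitespace; an empty value yields no arguments.
        opts = jvm_opts.split()
    return ["java"] + opts + ["-ea"]


print(build_argv("", quoted=True))    # ['java', '', '-ea']
print(build_argv("", quoted=False))   # ['java', '-ea']
print(build_argv("-Djava.awt.headless=true -Xss256k", quoted=True))
```

The first call shows the stray empty argument that makes the quoted form fail when JVM_OPTS is undefined. The last call shows that quoting also merges several options into a single argument, which is another reason scripts tend to leave this particular variable unquoted.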

 Allow JVM_OPTS to be passed to sstablescrub
 ---

 Key: CASSANDRA-5969
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5969
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Adam Hattrell
Assignee: Brandon Williams




[jira] [Commented] (CASSANDRA-5957) Cannot drop keyspace Keyspace1 after running cassandra-stress

2013-09-02 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756097#comment-13756097
 ] 

Piotr Kołaczkowski commented on CASSANDRA-5957:
---

Today I got this on C* 1.2.6 (DSE 3.1.2). It is a slightly different exception, 
but it may be helpful:

{noformat}
ERROR 16:16:37,702 Error occurred during processing of message.
java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
java.lang.RuntimeException: Tried to hard link to file that does not exist 
/var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-ic-64-Statistics.db
at 
org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:378)
at 
org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:281)
at 
org.apache.cassandra.service.MigrationManager.announceKeyspaceDrop(MigrationManager.java:262)
at 
org.apache.cassandra.cql3.statements.DropKeyspaceStatement.announceMigration(DropKeyspaceStatement.java:60)
at 
org.apache.cassandra.cql3.statements.SchemaAlteringStatement.execute(SchemaAlteringStatement.java:73)
at 
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:145)
at 
org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:162)
at 
org.apache.cassandra.thrift.CassandraServer.execute_cql3_query(CassandraServer.java:1714)
at 
org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4074)
at 
org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4062)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
at 
org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:201)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
Tried to hard link to file that does not exist 
/var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-ic-64-Statistics.db
at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
at java.util.concurrent.FutureTask.get(FutureTask.java:83)
at 
org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:374)
... 15 more
{noformat}

 Cannot drop keyspace Keyspace1 after running cassandra-stress
 -

 Key: CASSANDRA-5957
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5957
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Cassandra 1.2.9 freshly built from cassandra-1.2 branch 
 (f5b224cf9aa0f319d51078ef4b78d55e36613963)
Reporter: Piotr Kołaczkowski
Assignee: Aleksey Yeschenko
Priority: Minor
 Fix For: 1.2.10

 Attachments: system.log


 Steps to reproduce:
 # Set MAX_HEAP=2G, HEAP_NEWSIZE=400M
 # Run ./cassandra-stress -n 5 -c 400 -S 256
 # The test should complete despite several warnings about low heap memory.
 # Try to drop keyspace:
 {noformat}
 cqlsh> drop keyspace Keyspace1;
 TSocket read 0 bytes
 {noformat}
 system.log:
 {noformat}
  INFO 15:10:46,516 Enqueuing flush of 
 Memtable-schema_columnfamilies@2127258371(0/0 serialized/live bytes, 1 ops)
  INFO 15:10:46,516 Writing Memtable-schema_columnfamilies@2127258371(0/0 
 serialized/live bytes, 1 ops)
  INFO 15:10:46,690 Completed flushing 
 /var/lib/cassandra/data/system/schema_columnfamilies/system-schema_columnfamilies-ic-6-Data.db
  (38 bytes) for commitlog position ReplayPosition(segmentId=1377867520699, 
 position=19794574)
  INFO 15:10:46,692 Enqueuing flush of Memtable-schema_columns@1997964959(0/0 
 serialized/live bytes, 1 ops)
  INFO 15:10:46,693 Writing Memtable-schema_columns@1997964959(0/0 
 serialized/live bytes, 1 ops)
  INFO 15:10:46,857 Completed flushing 
 /var/lib/cassandra/data/system/schema_columns/system-schema_columns-ic-6-Data.db
  (38 bytes) for commitlog position ReplayPosition(segmentId=1377867520699, 
 position=19794574)
  INFO 15:10:46,897 Enqueuing flush of Memtable-local@1366216652(98/98 
 serialized/live bytes, 3 ops)
  INFO 15:10:46,898 Writing Memtable-local@1366216652(98/98 serialized/live 
 bytes, 3 ops)
  INFO 15:10:47,064 Completed flushing 
 /var/lib/cassandra/data/system/local/system-local-ic-12-Data.db (139 bytes) 
 for commitlog position ReplayPosition(segmentId=1377867520699, 
 position=19794845)
  INFO 15:10:48,956 Enqueuing flush of Memtable-local@432522279(46/46 
 

git commit: Update netty dependency to 3.6.6

2013-09-02 Thread slebresne
Updated Branches:
  refs/heads/cassandra-1.2 3380fa7ba -> 196038b73


Update netty dependency to 3.6.6


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/196038b7
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/196038b7
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/196038b7

Branch: refs/heads/cassandra-1.2
Commit: 196038b73e5783a1e282ba73e1d7b87c713f2e85
Parents: 3380fa7
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Mon Sep 2 18:00:27 2013 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Mon Sep 2 18:00:27 2013 +0200

--
 build.xml |   2 +-
 lib/netty-3.5.9.Final.jar | Bin 1128961 -> 0 bytes
 lib/netty-3.6.6.Final.jar | Bin 0 -> 1206119 bytes
 3 files changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/196038b7/build.xml
--
diff --git a/build.xml b/build.xml
index ff25e16..9e1de8d 100644
--- a/build.xml
+++ b/build.xml
@@ -386,7 +386,7 @@
       <dependency groupId="com.yammer.metrics" artifactId="metrics-core" version="2.0.3" />
       <dependency groupId="edu.stanford.ppl" artifactId="snaptree" version="0.1" />
       <dependency groupId="org.mindrot" artifactId="jbcrypt" version="0.3m" />
-      <dependency groupId="io.netty" artifactId="netty" version="3.5.9.Final" />
+      <dependency groupId="io.netty" artifactId="netty" version="3.6.6.Final" />
     </dependencyManagement>
     <developer id="alakshman" name="Avinash Lakshman"/>
     <developer id="antelder" name="Anthony Elder"/>

http://git-wip-us.apache.org/repos/asf/cassandra/blob/196038b7/lib/netty-3.5.9.Final.jar
--
diff --git a/lib/netty-3.5.9.Final.jar b/lib/netty-3.5.9.Final.jar
deleted file mode 100644
index 7f41e0e..000
Binary files a/lib/netty-3.5.9.Final.jar and /dev/null differ

http://git-wip-us.apache.org/repos/asf/cassandra/blob/196038b7/lib/netty-3.6.6.Final.jar
--
diff --git a/lib/netty-3.6.6.Final.jar b/lib/netty-3.6.6.Final.jar
new file mode 100644
index 000..35cb073
Binary files /dev/null and b/lib/netty-3.6.6.Final.jar differ



[jira] [Resolved] (CASSANDRA-5955) The native protocol server can trigger a Netty bug

2013-09-02 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne resolved CASSANDRA-5955.
-

Resolution: Fixed
  Reviewer: jbellis

Alright, dependency updated.

 The native protocol server can trigger a Netty bug
 --

 Key: CASSANDRA-5955
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5955
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 1.2.10


 The patch from CASSANDRA-5926 did fix the original deadlock, but 
 unfortunately we can now run into a netty bug (with 
 MemoryAwareThreadPoolExecutor): https://github.com/netty/netty/issues/1310.
 That bug has been fixed in netty 3.6.6 but we're currently using an older 
 version (3.5.9). So we should just upgrade our dependency to 3.6.6. 



[jira] [Commented] (CASSANDRA-5969) Allow JVM_OPTS to be passed to sstablescrub

2013-09-02 Thread Jeff Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756161#comment-13756161
 ] 

Jeff Potter commented on CASSANDRA-5969:


Quoting makes sense -- although there are other scripts that already have 
JVM_OPTS without quotes (bin/cassandra, bin/sstableloader). Let me know if I 
should revise the patch.

 Allow JVM_OPTS to be passed to sstablescrub
 ---

 Key: CASSANDRA-5969
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5969
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Adam Hattrell
Assignee: Brandon Williams




[jira] [Commented] (CASSANDRA-5930) Offline scrubs can choke on broken files

2013-09-02 Thread Jeff Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756170#comment-13756170
 ] 

Jeff Potter commented on CASSANDRA-5930:


We're seeing this too -- slightly different stack trace, which I'll include 
here in case it's of use.


WARNING: Non-fatal error reading row (stacktrace follows)
Exception in thread "main" java.io.IOError: java.lang.IllegalArgumentException
    at org.apache.cassandra.db.compaction.Scrubber.scrub(Scrubber.java:244)
    at org.apache.cassandra.tools.StandaloneScrubber.main(StandaloneScrubber.java:125)
Caused by: java.lang.IllegalArgumentException
    at java.nio.Buffer.limit(Buffer.java:247)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:51)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:60)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:78)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
    at org.apache.cassandra.db.ArrayBackedSortedColumns.addColumn(ArrayBackedSortedColumns.java:128)
    at org.apache.cassandra.db.AbstractColumnContainer.addColumn(AbstractColumnContainer.java:114)
    at org.apache.cassandra.db.AbstractColumnContainer.addColumn(AbstractColumnContainer.java:109)
    at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:219)
    at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumnsFromSSTable(ColumnFamilySerializer.java:149)
    at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:234)
    at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:114)
    at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:98)
    at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:160)
    at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:166)
    at org.apache.cassandra.db.compaction.Scrubber.scrub(Scrubber.java:173)
    ... 1 more


 Offline scrubs can choke on broken files
 

 Key: CASSANDRA-5930
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5930
 Project: Cassandra
  Issue Type: Bug
Reporter: Jeremiah Jordan
Assignee: Jason Brown
Priority: Minor

 There are cases where offline scrub can hit an exception and die, like:
 {noformat}
 WARNING: Non-fatal error reading row (stacktrace follows)
 Exception in thread "main" java.io.IOError: java.io.IOError: 
 java.io.EOFException
   at org.apache.cassandra.db.compaction.Scrubber.scrub(Scrubber.java:242)
   at 
 org.apache.cassandra.tools.StandaloneScrubber.main(StandaloneScrubber.java:121)
 Caused by: java.io.IOError: java.io.EOFException
   at 
 org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:116)
   at 
 org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:99)
   at 
 org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:176)
   at 
 org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:182)
   at org.apache.cassandra.db.compaction.Scrubber.scrub(Scrubber.java:171)
   ... 1 more
 Caused by: java.io.EOFException
   at java.io.RandomAccessFile.readFully(RandomAccessFile.java:399)
   at java.io.RandomAccessFile.readFully(RandomAccessFile.java:377)
   at 
 org.apache.cassandra.utils.BytesReadTracker.readFully(BytesReadTracker.java:95)
   at 
 org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:401)
   at 
 org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:363)
   at 
 org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:120)
   at 
 org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:37)
   at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:144)
   at 
 org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:234)
   at 
 org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:112)
   ... 5 more
 {noformat}
 Since the purpose of offline scrub is to fix broken stuff, it should be more 
 resilient to broken stuff...



[jira] [Updated] (CASSANDRA-5958) Unable to find property errors from snakeyaml are confusing

2013-09-02 Thread Mikhail Stepura (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Stepura updated CASSANDRA-5958:
---

Attachment: trunk-5958-v2-print-all-invalid-properties.patch

A new patch: instead of skipping unrecognized properties, it prints all of 
them out and terminates.

 Unable to find property errors from snakeyaml are confusing
 -

 Key: CASSANDRA-5958
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5958
 Project: Cassandra
  Issue Type: Bug
Reporter: J.B. Langston
Priority: Minor
 Attachments: trunk-5958-skip-missing-properties.patch, 
 trunk-5958-v2-print-all-invalid-properties.patch


 When an unexpected property is present in cassandra.yaml (e.g. after 
 upgrading), snakeyaml outputs the following message:
 {code}Unable to find property 'some_property' on class: 
 org.apache.cassandra.config.Config{code}
 The error message is kind of counterintuitive because at first glance it 
 seems to suggest the property is missing from the yaml file, when in fact the 
 error is caused by the *presence* of an unrecognized property.  I know if you 
 read it carefully it says it can't find the property on the class, but this 
 has confused more than one user.
 I think we should catch this exception and wrap it in another exception that 
 says something like this:
 {code}Please remove 'some_property' from your cassandra.yaml. It is not 
 recognized by this version of Cassandra.{code}
 Also, it might make sense to make this a warning instead of a fatal error, 
 and just ignore the unwanted property.
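
The reporting behaviour suggested above can be sketched in Python. This is not the attached patch and does not use snakeyaml: `KNOWN_PROPERTIES` and `check_properties` are hypothetical names, and the config is modelled as a plain dict. It simply collects every unrecognized key and phrases the message from the user's point of view.

```python
KNOWN_PROPERTIES = {"cluster_name", "num_tokens"}  # hypothetical subset


def check_properties(config):
    """Return one message per key that this version does not recognize."""
    unknown = sorted(k for k in config if k not in KNOWN_PROPERTIES)
    return [
        "Please remove '%s' from your cassandra.yaml. "
        "It is not recognized by this version of Cassandra." % name
        for name in unknown
    ]


messages = check_properties(
    {"cluster_name": "Test", "some_property": 1, "other_property": 2}
)
for m in messages:
    print(m)
```

Whether the caller then terminates or only logs a warning is a separate policy decision; the sketch leaves that to the caller.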



[jira] [Created] (CASSANDRA-5970) FilteredRangeSlice command for regex searches against column names on known sets of keys

2013-09-02 Thread Nate McCall (JIRA)
Nate McCall created CASSANDRA-5970:
--

 Summary: FilteredRangeSlice command for regex searches against 
column names on known sets of keys
 Key: CASSANDRA-5970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5970
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Nate McCall


This is the ability to apply a regex against columns when the set of keys is 
known. In filtering the keys, we would like to allow for the following clauses: 
E, GTE, LTE, NE, inclusive list, exclusive list.

The end goal is to provide for efficient searching in the case where you have 
some knowledge of the keys. A specific use case would be, say, searching user 
agent strings in the given set of date buckets in the classic time-series web 
log use case. This is a sweet spot for Cassandra and providing a more direct 
method of access for such will help a lot of users.

Additionally, this will provide some level of feature parity with the RDBMS 
crowd, who've had this feature for some time.

Internally, this will include the introduction of a new Verb, SSTableScanner 
extension and an ExtendedFilter implementation which applies the regex as well 
as a new method on StorageProxy.

This issue does not cover exposing this new query method to thrift and CQL, but 
obviously that will be required for this to be of any practical use. Those 
should be covered by separate issues.
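
As a rough illustration of the intended semantics only (not the proposed Verb, SSTableScanner, or ExtendedFilter code), the Python sketch below applies a regex to column names across a known set of keys. `filtered_slice` and the sample data are hypothetical, and a plain dict stands in for storage.

```python
import re


def filtered_slice(rows, keys, pattern):
    """Return {key: {column: value}} for columns whose name matches `pattern`.

    `rows` is a plain dict standing in for storage; only `keys` are visited.
    """
    regex = re.compile(pattern)
    result = {}
    for key in keys:
        columns = rows.get(key, {})
        matched = {name: v for name, v in columns.items() if regex.search(name)}
        if matched:
            result[key] = matched
    return result


# Hypothetical web log: row key = date bucket, column name = user agent.
rows = {
    "2013-09-01": {"Mozilla/5.0 (X11; Linux)": 12, "curl/7.30.0": 3},
    "2013-09-02": {"Mozilla/5.0 (Macintosh)": 7},
    "2013-09-03": {"curl/7.30.0": 9},
}
print(filtered_slice(rows, ["2013-09-01", "2013-09-02"], r"^Mozilla/"))
```

Only the two requested date buckets are scanned, which is the point of knowing the keys up front: the regex never touches rows outside that set.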



[jira] [Commented] (CASSANDRA-5969) Allow JVM_OPTS to be passed to sstablescrub

2013-09-02 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756282#comment-13756282
 ] 

Jeremiah Jordan commented on CASSANDRA-5969:


[~jeffpotter] FYI for the OS X thing, install a newer JNA. See CASSANDRA-5611

 Allow JVM_OPTS to be passed to sstablescrub
 ---

 Key: CASSANDRA-5969
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5969
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Adam Hattrell
Assignee: Brandon Williams




[jira] [Commented] (CASSANDRA-5969) Allow JVM_OPTS to be passed to sstablescrub

2013-09-02 Thread Jeff Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756306#comment-13756306
 ] 

Jeff Potter commented on CASSANDRA-5969:


Alas, no luck: upgrading JNA from 3.5.1 to 3.5.2 doesn't resolve it -- the Dock 
still gets an app named "bin" when running cassandra with 
'-Djava.awt.headless=true' removed from JVM_OPTS.



[jira] [Comment Edited] (CASSANDRA-5969) Allow JVM_OPTS to be passed to sstablescrub

2013-09-02 Thread Jeff Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756306#comment-13756306
 ] 

Jeff Potter edited comment on CASSANDRA-5969 at 9/3/13 1:54 AM:


Alas, no luck: upgrading JNA from 3.5.1 to 3.5.2 doesn't resolve it -- the Dock 
still gets an app named "bin" when running cassandra with 
'-Djava.awt.headless=true' removed from JVM_OPTS.

Edit: additional info:

On OS X 10.8.4

java -version
java version 1.7.0_25
Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)



  was (Author: jeffpotter):
Alas, no luck: upgrading JNA from 3.5.1 to 3.5.2 doesn't resolve it -- the 
Dock gets an app named bin when running cassandra and removing 
'-Djava.awt.headless=true' from JVM_OPTS.
  


[jira] [Comment Edited] (CASSANDRA-5969) Allow JVM_OPTS to be passed to sstablescrub

2013-09-02 Thread Jeff Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756306#comment-13756306
 ] 

Jeff Potter edited comment on CASSANDRA-5969 at 9/3/13 1:55 AM:


Alas, no luck: upgrading JNA from 3.5.1 to 3.5.2 doesn't resolve it -- the Dock 
still gets an app named "bin" when running cassandra with 
'-Djava.awt.headless=true' removed from JVM_OPTS.

Edit: additional info:
- On OS X 10.8.4
- We're running with Java 1.6 to better match what we run in prod:
/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home/bin/java


  was (Author: jeffpotter):
Alas, no luck: upgrading JNA from 3.5.1 to 3.5.2 doesn't resolve it -- the 
Dock gets an app named bin when running cassandra and removing 
'-Djava.awt.headless=true' from JVM_OPTS.

Edit: additional info:

On OS X 10.8.4

java -version
java version 1.7.0_25
Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)


  


[jira] [Commented] (CASSANDRA-5933) 2.0 read performance is slower than 1.2

2013-09-02 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756323#comment-13756323
 ] 

Vijay commented on CASSANDRA-5933:
--

Ryan, do you mind testing the custom setting with 5 to 10 ms? 
I am thinking we might need enough samples for the percentiles to make sense 
(if confirmed, we might want to wait until the samples arrive, etc.).

 2.0 read performance is slower than 1.2
 ---

 Key: CASSANDRA-5933
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5933
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan McGuire
 Attachments: 1.2-faster-than-2.0.png, 1.2-faster-than-2.0-stats.png


 Over the course of several tests I have observed that 2.0 read performance is 
 noticeably slower than 1.2
 Example:
 Blue line is 1.2, the rest are various forms of 2.0 rc1 (I've also seen this 
 on rc2, just don't have a good graph handy)
 !1.2-faster-than-2.0.png!
 !1.2-faster-than-2.0-stats.png!
 [See test data 
 here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.node_killed.json&metric=interval_op_rate&operation=stress-read&smoothing=1]



[jira] [Commented] (CASSANDRA-5933) 2.0 read performance is slower than 1.2

2013-09-02 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756324#comment-13756324
 ] 

Ryan McGuire commented on CASSANDRA-5933:
-

Hi [~vijay2...@yahoo.com], I'm not sure what you meant by 'custom with 5 to 10 
ms'. Can you please clarify the test scenario you'd like me to run?




[jira] [Commented] (CASSANDRA-5933) 2.0 read performance is slower than 1.2

2013-09-02 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756329#comment-13756329
 ] 

Vijay commented on CASSANDRA-5933:
--

Hi Ryan, you can set a custom speculative execution threshold like this:
{code}
update column family Standard1 with speculative_retry=10ms;
{code}
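
The concern raised earlier in this thread, that percentiles only make sense once enough samples have arrived, can be sketched as follows. This is a hypothetical illustration, not Cassandra's speculative retry code: the class, the method names, and the minSamples parameter are all assumptions. It falls back to a fixed custom delay (such as the 10ms above) until enough latencies have been recorded, then uses a nearest-rank percentile.

```java
import java.util.Arrays;

/** Hypothetical sketch; not Cassandra's actual speculative retry code. */
final class RetryThreshold {
    /** Nearest-rank percentile (0 < p <= 100) of a non-empty latency array, in ms. */
    static long percentile(long[] latenciesMs, double p) {
        long[] sorted = latenciesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    /**
     * Returns the delay to wait before sending a speculative read.
     * Uses the fixed custom delay until at least minSamples latencies exist.
     */
    static long thresholdMs(long[] latenciesMs, double p, int minSamples, long customMs) {
        if (latenciesMs.length == 0 || latenciesMs.length < minSamples) {
            return customMs;
        }
        return percentile(latenciesMs, p);
    }

    public static void main(String[] args) {
        long[] few = {3, 4};
        long[] many = {1, 2, 3, 4, 5, 6, 7, 8, 9, 100};
        System.out.println("too few samples: " + thresholdMs(few, 99, 10, 10) + " ms");
        System.out.println("enough samples:  " + thresholdMs(many, 99, 10, 10) + " ms");
    }
}
```

With only a handful of samples a single slow read dominates the high percentiles, which is why the sketch holds the fixed delay until the sample count reaches the chosen minimum.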



[jira] [Comment Edited] (CASSANDRA-5933) 2.0 read performance is slower than 1.2

2013-09-02 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756331#comment-13756331
 ] 

Ryan McGuire edited comment on CASSANDRA-5933 at 9/3/13 3:21 AM:
-

Ah, OK, I can run that test, that also applies to CASSANDRA-5932. However, in 
this case, I don't believe speculative retry can account for all the 
difference. The red line has none enabled.

  was (Author: enigmacurry):
Ah, OK, I can run that test, that also applies to CASSANDRA-5332. However, 
in this case, I don't believe speculative retry can account for all the 
difference. The red line has none enabled.
  


[jira] [Commented] (CASSANDRA-5933) 2.0 read performance is slower than 1.2

2013-09-02 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756331#comment-13756331
 ] 

Ryan McGuire commented on CASSANDRA-5933:
-

Ah, OK, I can run that test, that also applies to CASSANDRA-5332. However, in 
this case, I don't believe speculative retry can account for all the 
difference. The red line has none enabled.



[jira] [Updated] (CASSANDRA-5906) Avoid allocating over-large bloom filters

2013-09-02 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-5906:
--

Priority: Major  (was: Minor)

 Avoid allocating over-large bloom filters
 -

 Key: CASSANDRA-5906
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5906
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Yuki Morishita
 Fix For: 2.0.1


 We conservatively estimate the number of partitions post-compaction to be the 
 total number of partitions pre-compaction.  That is, we assume the worst-case 
 scenario of no partition overlap at all.
 This can result in substantial memory wasted in sstables resulting from 
 highly overlapping compactions.
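
A small sketch of why the conservative estimate wastes memory, under stated assumptions: the class and method names are hypothetical, partitions are modelled as plain string keys, and the filter is sized with the standard formula m = -n ln p / (ln 2)^2 rather than Cassandra's own sizing code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch; not Cassandra's bloom filter sizing code. */
final class BloomSizing {
    /** Optimal bit count for n keys at false-positive rate p: m = -n ln p / (ln 2)^2. */
    static long bitsFor(long keys, double fpRate) {
        return (long) Math.ceil(-keys * Math.log(fpRate) / (Math.log(2) * Math.log(2)));
    }

    /** Conservative estimate: assume no partition appears in more than one input. */
    static long worstCaseKeys(List<Set<String>> inputs) {
        long total = 0;
        for (Set<String> sstable : inputs) {
            total += sstable.size();
        }
        return total;
    }

    /** Actual post-compaction count: overlapping partitions merge into one. */
    static long mergedKeys(List<Set<String>> inputs) {
        Set<String> union = new HashSet<>();
        for (Set<String> sstable : inputs) {
            union.addAll(sstable);
        }
        return union.size();
    }

    public static void main(String[] args) {
        // Two inputs holding exactly the same partitions: full overlap.
        Set<String> a = new HashSet<>(Arrays.asList("k1", "k2", "k3"));
        Set<String> b = new HashSet<>(Arrays.asList("k1", "k2", "k3"));
        List<Set<String>> inputs = Arrays.asList(a, b);
        System.out.println("bits (worst case): " + bitsFor(worstCaseKeys(inputs), 0.01));
        System.out.println("bits (merged):     " + bitsFor(mergedKeys(inputs), 0.01));
    }
}
```

Because the bit count grows linearly with the key estimate, compacting N fully overlapping sstables under the worst-case assumption allocates roughly N times the bits the result needs.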



[jira] [Created] (CASSANDRA-5971) Get rid of thrift-generated Index* classes usage in C* internals

2013-09-02 Thread Aleksey Yeschenko (JIRA)
Aleksey Yeschenko created CASSANDRA-5971:


 Summary: Get rid of thrift-generated Index* classes usage in C* 
internals
 Key: CASSANDRA-5971
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5971
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Aleksey Yeschenko
Priority: Trivial
 Fix For: 2.1


We've cleaned up most of it previously, but 
IndexExpression/IndexOperator/IndexType have somehow escaped the purge.



[jira] [Updated] (CASSANDRA-5971) Get rid of thrift-generated Index* classes usage in C* internals

2013-09-02 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-5971:
-

Attachment: 5971.txt



[jira] [Commented] (CASSANDRA-5970) FilteredRangeSlice command for regex searches against column names on known sets of keys

2013-09-02 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756339#comment-13756339
 ] 

Jonathan Ellis commented on CASSANDRA-5970:
---

It sounds like you have some code already, but ISTM it would be most 
straightforward to implement this as a predicate on the existing slice verb.  
Put another way, that would make the slice code a closer match to the range 
(sequential scan) code, which already has a concept of predicates being queried.

 FilteredRangeSlice command for regex searches against column names on known 
 sets of keys
 

 Key: CASSANDRA-5970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5970
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Nate McCall

 This is the ability to apply a regex against columns when the set of keys is 
 known. In filtering the keys, we would like to allow for the following 
 clauses: E, GTE, LTE, NE, inclusive list, exclusive list.
 The end goal is to provide for efficient searching in the case where you have 
 some knowledge of the keys. A specific use case would be, say, searching user 
 agent strings in a given set of date buckets in the classic time-series web 
 log use case. This is a sweet spot for Cassandra, and providing a more 
 direct method of access for it will help a lot of users.
 Additionally, this will provide some level of feature parity with the RDBMS 
 crowd, who have had this feature for some time.
 Internally, this will include the introduction of a new Verb, SSTableScanner 
 extension and an ExtendedFilter implementation which applies the regex as 
 well as a new method on StorageProxy.
 This issue does not cover exposing this new query method to thrift and CQL, 
 but obviously that will be required for this to be of any practical use. 
 Those should be covered by separate issues.
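
As a rough sketch of the idea described above (a regex applied to column names for a known set of keys), using plain in-memory maps rather than any Cassandra internals. The class and method names are hypothetical, and the key-filter clauses (E, GTE, LTE, NE, lists) are not modelled; the caller simply passes the keys it already knows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

/** Hypothetical in-memory sketch; not the proposed Verb or ExtendedFilter. */
final class FilteredSliceSketch {
    /**
     * For each requested key, returns the column names matching the regex.
     * rows maps row key -> (column name -> value); keys absent from rows are skipped.
     */
    static Map<String, List<String>> filter(Map<String, Map<String, String>> rows,
                                            Collection<String> keys,
                                            String columnRegex) {
        Pattern pattern = Pattern.compile(columnRegex);
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String key : keys) {
            Map<String, String> columns = rows.get(key);
            if (columns == null) {
                continue;
            }
            List<String> matches = new ArrayList<>();
            for (String name : columns.keySet()) {
                // find() matches anywhere in the name; anchor the regex to restrict it.
                if (pattern.matcher(name).find()) {
                    matches.add(name);
                }
            }
            result.put(key, matches);
        }
        return result;
    }

    public static void main(String[] args) {
        // Date buckets as row keys, user agent strings as column names.
        Map<String, Map<String, String>> rows = new HashMap<>();
        Map<String, String> day = new TreeMap<>();
        day.put("Mozilla/5.0 Firefox", "12");
        day.put("curl/7.30", "3");
        rows.put("2013-09-01", day);
        System.out.println(filter(rows, Arrays.asList("2013-09-01"), "^Mozilla"));
    }
}
```

Only the requested keys are touched, which is the efficiency argument made in the description: no scan over rows whose keys are already known to be irrelevant.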



[jira] [Updated] (CASSANDRA-5971) Get rid of thrift-generated Index* classes usage in C* internals

2013-09-02 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-5971:
--

Reviewer: dbrosius  (was: jbellis)



[jira] [Commented] (CASSANDRA-5958) Unable to find property errors from snakeyaml are confusing

2013-09-02 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756342#comment-13756342
 ] 

Jonathan Ellis commented on CASSANDRA-5958:
---

Does HashSet.toString actually give us a human-readable error?

 Unable to find property errors from snakeyaml are confusing
 -

 Key: CASSANDRA-5958
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5958
 Project: Cassandra
  Issue Type: Bug
Reporter: J.B. Langston
Priority: Minor
 Attachments: trunk-5958-skip-missing-properties.patch, 
 trunk-5958-v2-print-all-invalid-properties.patch


 When an unexpected property is present in cassandra.yaml (e.g. after 
 upgrading), snakeyaml outputs the following message:
 {code}Unable to find property 'some_property' on class: 
 org.apache.cassandra.config.Config{code}
 The error message is kind of counterintuitive because at first glance it 
 seems to suggest the property is missing from the yaml file, when in fact the 
 error is caused by the *presence* of an unrecognized property.  I know if you 
 read it carefully it says it can't find the property on the class, but this 
 has confused more than one user.
 I think we should catch this exception and wrap it in another exception that 
 says something like this:
 {code}Please remove 'some_property' from your cassandra.yaml. It is not 
 recognized by this version of Cassandra.{code}
 Also, it might make sense to make this a warning instead of a fatal error, 
 and just ignore the unwanted property.



[jira] [Commented] (CASSANDRA-5958) Unable to find property errors from snakeyaml are confusing

2013-09-02 Thread Mikhail Stepura (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756348#comment-13756348
 ] 

Mikhail Stepura commented on CASSANDRA-5958:


For example, I have the following in my cassandra.yaml:

{code:title=cassandra.yaml}
oh: my
bla: bla
{code}

Then the stacktrace will be
{code}
ERROR 04:06:57 Fatal configuration error
org.apache.cassandra.exceptions.ConfigurationException: Invalid yaml. Please remove properties [bla, oh] from your cassandra.yaml
    at org.apache.cassandra.config.YamlConfigurationLoader$MissingPropertiesChecker.check(YamlConfigurationLoader.java:131) ~[main/:na]
    at org.apache.cassandra.config.YamlConfigurationLoader.loadConfig(YamlConfigurationLoader.java:94) ~[main/:na]
    at org.apache.cassandra.config.DatabaseDescriptor.loadConfig(DatabaseDescriptor.java:128) ~[main/:na]
    at org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:104) ~[main/:na]
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:153) ~[main/:na]
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:391) ~[main/:na]
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:434) ~[main/:na]
Invalid yaml. Please remove properties [bla, oh] from your cassandra.yaml
Fatal configuration error; unable to start. See log for stacktrace.
{code}
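
A minimal sketch of how a message in this shape could be produced. It is not the attached patch: the class and method names are hypothetical, and the use of a TreeSet is an assumption. A Set's toString (inherited from AbstractCollection) prints its elements as "[a, b]"; a HashSet gives no ordering guarantee, so the sketch copies the names into a TreeSet to get a stable alphabetical listing.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch; not the code from the attached patches. */
final class UnknownPropertyCheck {
    /**
     * Returns null when every configured name is known; otherwise an error
     * message listing the unknown names in sorted order.
     */
    static String check(Set<String> configured, Set<String> known) {
        // TreeSet yields a deterministic, alphabetical "[a, b]" listing.
        Set<String> unknown = new TreeSet<>(configured);
        unknown.removeAll(known);
        if (unknown.isEmpty()) {
            return null;
        }
        return "Invalid yaml. Please remove properties " + unknown
                + " from your cassandra.yaml";
    }

    public static void main(String[] args) {
        Set<String> configured = new HashSet<>(Arrays.asList("oh", "bla", "cluster_name"));
        Set<String> known = new HashSet<>(Arrays.asList("cluster_name"));
        System.out.println(check(configured, known));
    }
}
```

Collecting all unknown names before failing lets the user fix the whole file in one pass instead of restarting once per bad property.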



[jira] [Commented] (CASSANDRA-5971) Get rid of thrift-generated Index* classes usage in C* internals

2013-09-02 Thread Dave Brosius (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756350#comment-13756350
 ] 

Dave Brosius commented on CASSANDRA-5971:
-

+1
