[jira] [Commented] (CASSANDRA-5959) CQL3 support for multi-column insert in a single operation (Batch Insert / Batch Mutate)
[ https://issues.apache.org/jira/browse/CASSANDRA-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755973#comment-13755973 ]

Sylvain Lebresne commented on CASSANDRA-5959:
---------------------------------------------

For what it's worth, I wouldn't be opposed to adding the multi-value INSERT extension from the description. It can be handy (as in, it minimizes the number of characters to type in cqlsh to insert multiple rows), and at least both MySQL and PostgreSQL support such a syntax extension. Though as hinted above, it wouldn't fix the performance problem described here, so it's a completely different motivation.

The reason such a big batch is slow is parsing (and possibly also the transport of the large query string, though that part can be solved by using compression at the transport level). If you want performance on such a big insert, you'll definitely need to use prepared statements (and batches of them), and that's what CASSANDRA-4693, which is missing in 1.2, provides.

I'll note however that while C* 1.2 doesn't have CASSANDRA-4693, it can still prepare batch statements. So a workaround could be to prepare a medium-sized batch of a fixed number of inserts, say 500 inserts (but some experimentation to find the best number is probably in order), and use that to insert the 50K columns in batches of 500. It won't be as efficient as what CASSANDRA-4693 gives you, and it's certainly a bit of a pain to implement client side, but performance-wise this should (emphasis on should, since I haven't tested it) get you closer to the Thrift performance numbers.

CQL3 support for multi-column insert in a single operation (Batch Insert / Batch Mutate)
-----------------------------------------------------------------------------------------

Key: CASSANDRA-5959
URL: https://issues.apache.org/jira/browse/CASSANDRA-5959
Project: Cassandra
Issue Type: New Feature
Components: Core, Drivers
Reporter: Les Hazlewood
Labels: CQL

h3. Impetus for this Request

(from the original [question on StackOverflow|http://stackoverflow.com/questions/18522191/using-cassandra-and-cql3-how-do-you-insert-an-entire-wide-row-in-a-single-reque]):

I want to insert a single row with 50,000 columns into Cassandra 1.2.9. Before inserting, I have all the data for the entire row ready to go (in memory):

{code}
+---------+------+------+------+-----+-------+
|         |   0  |   1  |   2  | ... | 49999 |
| row_id  +------+------+------+-----+-------+
|         | text | text | text | ... | text  |
+---------+------+------+------+-----+-------+
{code}

The column names are integers, allowing slicing for pagination. Each column value is the value at that particular index.

CQL3 table definition:

{code}
create table results (
    row_id text,
    index int,
    value text,
    primary key (row_id, index)
) with compact storage;
{code}

As I already have the row_id and all 50,000 name/value pairs in memory, I just want to insert a single row into Cassandra in a single request/operation so it is as fast as possible.

The only thing I can seem to find is to execute the following 50,000 times:

{code}
INSERT INTO results (row_id, index, value) values (my_row_id, ?, ?);
{code}

where the first {{?}} is an index counter ({{i}}) and the second {{?}} is the text value to store at location {{i}}.

With the Datastax Java Driver client and C* server on the same development machine, this took a full minute to execute.

Oddly enough, the same 50,000 insert statements in a [Datastax Java Driver Batch|http://www.datastax.com/drivers/java/apidocs/com/datastax/driver/core/querybuilder/QueryBuilder.html#batch(com.datastax.driver.core.Statement...)] on the same machine took 7.5 minutes. I thought batches were supposed to be _faster_ than individual inserts?

We tried instead with a Thrift client (Astyanax) and the same insert via a [MutationBatch|http://netflix.github.io/astyanax/javadoc/com/netflix/astyanax/MutationBatch.html]. This took _235 milliseconds_.

h3. Feature Request

As a result of this performance testing, this issue is to request that CQL3 support batch mutation operations as a single operation (statement) to ensure the same speed/performance benefits as existing Thrift clients.

Example suggested syntax (based on the above example table/column family):

{code}
insert into results (row_id, (index,value)) values ((0,text0), (1,text1), (2,text2), ..., (N,textN));
{code}

Each value in the {{values}} clause is a tuple. The first tuple element is the column name, the second tuple element is the column value. This seems to be the simplest and most accurate representation of what happens during a batch insert/mutate.

Not having this CQL feature forced us to remove the Datastax Java Driver (which we liked) in favor of Astyanax
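
The workaround suggested in the comment above (prepare one batch with a fixed number of inserts and reuse it for each chunk of the row) can be sketched roughly as follows. This is only an illustration of the chunking logic under stated assumptions: it builds the batch text and the flat list of bind values, and deliberately makes no driver calls. The function names {{build_batch_cql}} and {{plan_batches}} are hypothetical, and 500 is simply the starting figure from the comment, not a tuned value.

```python
from typing import Dict, List, Sequence, Tuple

Column = Tuple[int, str]  # (index, value) for one column of the wide row


def build_batch_cql(n_inserts: int) -> str:
    """Text of one batch holding n_inserts parameterised INSERTs (3 bind markers each)."""
    insert = "  INSERT INTO results (row_id, index, value) VALUES (?, ?, ?);"
    return "\n".join(["BEGIN BATCH"] + [insert] * n_inserts + ["APPLY BATCH;"])


def plan_batches(row_id: str, columns: Sequence[Column],
                 batch_size: int = 500) -> List[Tuple[str, List[object]]]:
    """Split the columns into chunks; pair each chunk with its batch text and bind values."""
    # Hypothetical cache: one statement text per distinct chunk size, so a client
    # would only need to prepare each size once (the full size, plus a shorter tail).
    cql_by_size: Dict[int, str] = {}
    plan: List[Tuple[str, List[object]]] = []
    for start in range(0, len(columns), batch_size):
        chunk = columns[start:start + batch_size]
        if len(chunk) not in cql_by_size:
            cql_by_size[len(chunk)] = build_batch_cql(len(chunk))
        values: List[object] = []
        for index, value in chunk:
            values.extend((row_id, index, value))  # same order as the bind markers
        plan.append((cql_by_size[len(chunk)], values))
    return plan


if __name__ == "__main__":
    columns = [(i, f"text{i}") for i in range(50_000)]
    plan = plan_batches("my_row_id", columns)
    print(len(plan), "batches of up to 500 inserts each")
```

With 50,000 columns and a batch size of 500 this yields 100 executions of the same statement text. When the column count is not a multiple of the batch size, the last chunk is shorter and needs its own, smaller statement.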
[jira] [Created] (CASSANDRA-5968) Nodetool info throws NPE when connected to a booting instance
Janne Jalkanen created CASSANDRA-5968:
--------------------------------------

Summary: Nodetool info throws NPE when connected to a booting instance
Key: CASSANDRA-5968
URL: https://issues.apache.org/jira/browse/CASSANDRA-5968
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: Cassandra 1.2.9, Ubuntu 12.04 LTS, Oracle JVM 7u25
Reporter: Janne Jalkanen
Priority: Minor

When an instance is newly added to the cluster and is still streaming data, calling nodetool info on it throws an NPE. Stack trace below.

To replicate: add a new node to the cluster and run nodetool info before bootstrap is complete.

Expected behaviour: nodetool info handles this gracefully and just says the RPC server is not running.

{noformat}
$ nodetool info
Token: (invoke with -T/--tokens to see all 0 tokens)
ID : cc7bcf48-4a54-48af-97f6-99c82bce76f2
Gossip active: true
Exception in thread "main" java.lang.NullPointerException
	at org.apache.cassandra.service.StorageService.isRPCServerRunning(StorageService.java:330)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
	at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
	at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
	at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
	at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
	at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:83)
	at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206)
	at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647)
	at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678)
	at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1464)
	at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
	at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
	at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
	at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:657)
	at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
	at sun.rmi.transport.Transport$1.run(Transport.java:177)
	at sun.rmi.transport.Transport$1.run(Transport.java:174)
	at java.security.AccessController.doPrivileged(Native Method)
	at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
	at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:553)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:808)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:667)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:724)
{noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CASSANDRA-5969) Allow JVM_OPTS to be passed to sstablescrub
Adam Hattrell created CASSANDRA-5969:
-------------------------------------

Summary: Allow JVM_OPTS to be passed to sstablescrub
Key: CASSANDRA-5969
URL: https://issues.apache.org/jira/browse/CASSANDRA-5969
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: Adam Hattrell

Can you add a feature request to pass JVM_OPTS to the sstablescrub script -- and other places where java is being called? (Among other things, this lets us run java stuff with -Djava.awt.headless=true on OS X so that Java processes don't pop up into the foreground -- i.e. we have a script that loops over all CFs and runs sstablescrub, and without that flag being passed in the OS X machine becomes pretty much unusable as it keeps switching focus to the java processes as they start.)

--- a/resources/cassandra/bin/sstablescrub
+++ b/resources/cassandra/bin/sstablescrub
@@ -70,7 +70,7 @@ if [ x$MAX_HEAP_SIZE = x ]; then
 MAX_HEAP_SIZE=256M
 fi
-$JAVA -ea -cp $CLASSPATH -Xmx$MAX_HEAP_SIZE \
+$JAVA $JVM_OPTS -ea -cp $CLASSPATH -Xmx$MAX_HEAP_SIZE \
 -Dlog4j.configuration=log4j-tools.properties \
 org.apache.cassandra.tools.StandaloneScrubber $@
[jira] [Assigned] (CASSANDRA-5969) Allow JVM_OPTS to be passed to sstablescrub
[ https://issues.apache.org/jira/browse/CASSANDRA-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis reassigned CASSANDRA-5969:
-----------------------------------------

Assignee: Brandon Williams

Hmm. Quoting $JVM_OPTS is typically best practice but quoting it makes it fail when JVM_OPTS is undefined.

Allow JVM_OPTS to be passed to sstablescrub
-------------------------------------------

Key: CASSANDRA-5969
URL: https://issues.apache.org/jira/browse/CASSANDRA-5969
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: Adam Hattrell
Assignee: Brandon Williams
[jira] [Commented] (CASSANDRA-5957) Cannot drop keyspace Keyspace1 after running cassandra-stress
[ https://issues.apache.org/jira/browse/CASSANDRA-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756097#comment-13756097 ]

Piotr Kołaczkowski commented on CASSANDRA-5957:
-----------------------------------------------

Today I got this on C* 1.2.6 dse 3.1.2. It's a slightly different exception, but maybe it will be helpful:

{noformat}
ERROR 16:16:37,702 Error occurred during processing of message.
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Tried to hard link to file that does not exist /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-ic-64-Statistics.db
	at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:378)
	at org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:281)
	at org.apache.cassandra.service.MigrationManager.announceKeyspaceDrop(MigrationManager.java:262)
	at org.apache.cassandra.cql3.statements.DropKeyspaceStatement.announceMigration(DropKeyspaceStatement.java:60)
	at org.apache.cassandra.cql3.statements.SchemaAlteringStatement.execute(SchemaAlteringStatement.java:73)
	at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:145)
	at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:162)
	at org.apache.cassandra.thrift.CassandraServer.execute_cql3_query(CassandraServer.java:1714)
	at org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4074)
	at org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4062)
	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
	at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:201)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
	at java.lang.Thread.run(Thread.java:662)
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Tried to hard link to file that does not exist /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-ic-64-Statistics.db
	at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
	at java.util.concurrent.FutureTask.get(FutureTask.java:83)
	at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:374)
	... 15 more
{noformat}

Cannot drop keyspace Keyspace1 after running cassandra-stress
-------------------------------------------------------------

Key: CASSANDRA-5957
URL: https://issues.apache.org/jira/browse/CASSANDRA-5957
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: Cassandra 1.2.9 freshly built from cassandra-1.2 branch (f5b224cf9aa0f319d51078ef4b78d55e36613963)
Reporter: Piotr Kołaczkowski
Assignee: Aleksey Yeschenko
Priority: Minor
Fix For: 1.2.10
Attachments: system.log

Steps to reproduce:
# Set MAX_HEAP=2G, HEAP_NEWSIZE=400M
# Run ./cassandra-stress -n 5 -c 400 -S 256
# The test should complete despite several warnings about low heap memory.
# Try to drop keyspace:

{noformat}
cqlsh> drop keyspace Keyspace1;
TSocket read 0 bytes
{noformat}

system.log:

{noformat}
INFO 15:10:46,516 Enqueuing flush of Memtable-schema_columnfamilies@2127258371(0/0 serialized/live bytes, 1 ops)
INFO 15:10:46,516 Writing Memtable-schema_columnfamilies@2127258371(0/0 serialized/live bytes, 1 ops)
INFO 15:10:46,690 Completed flushing /var/lib/cassandra/data/system/schema_columnfamilies/system-schema_columnfamilies-ic-6-Data.db (38 bytes) for commitlog position ReplayPosition(segmentId=1377867520699, position=19794574)
INFO 15:10:46,692 Enqueuing flush of Memtable-schema_columns@1997964959(0/0 serialized/live bytes, 1 ops)
INFO 15:10:46,693 Writing Memtable-schema_columns@1997964959(0/0 serialized/live bytes, 1 ops)
INFO 15:10:46,857 Completed flushing /var/lib/cassandra/data/system/schema_columns/system-schema_columns-ic-6-Data.db (38 bytes) for commitlog position ReplayPosition(segmentId=1377867520699, position=19794574)
INFO 15:10:46,897 Enqueuing flush of Memtable-local@1366216652(98/98 serialized/live bytes, 3 ops)
INFO 15:10:46,898 Writing Memtable-local@1366216652(98/98 serialized/live bytes, 3 ops)
INFO 15:10:47,064 Completed flushing /var/lib/cassandra/data/system/local/system-local-ic-12-Data.db (139 bytes) for commitlog position ReplayPosition(segmentId=1377867520699, position=19794845)
INFO 15:10:48,956 Enqueuing flush of Memtable-local@432522279(46/46
git commit: Update netty dependency to 3.6.6
Updated Branches:
  refs/heads/cassandra-1.2 3380fa7ba -> 196038b73


Update netty dependency to 3.6.6


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/196038b7
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/196038b7
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/196038b7

Branch: refs/heads/cassandra-1.2
Commit: 196038b73e5783a1e282ba73e1d7b87c713f2e85
Parents: 3380fa7
Author: Sylvain Lebresne <sylv...@datastax.com>
Authored: Mon Sep 2 18:00:27 2013 +0200
Committer: Sylvain Lebresne <sylv...@datastax.com>
Committed: Mon Sep 2 18:00:27 2013 +0200

----------------------------------------------------------------------
 build.xml                 |   2 +-
 lib/netty-3.5.9.Final.jar | Bin 1128961 -> 0 bytes
 lib/netty-3.6.6.Final.jar | Bin 0 -> 1206119 bytes
 3 files changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/cassandra/blob/196038b7/build.xml
----------------------------------------------------------------------
diff --git a/build.xml b/build.xml
index ff25e16..9e1de8d 100644
--- a/build.xml
+++ b/build.xml
@@ -386,7 +386,7 @@
 <dependency groupId="com.yammer.metrics" artifactId="metrics-core" version="2.0.3" />
 <dependency groupId="edu.stanford.ppl" artifactId="snaptree" version="0.1" />
 <dependency groupId="org.mindrot" artifactId="jbcrypt" version="0.3m" />
- <dependency groupId="io.netty" artifactId="netty" version="3.5.9.Final" />
+ <dependency groupId="io.netty" artifactId="netty" version="3.6.6.Final" />
 </dependencyManagement>
 <developer id="alakshman" name="Avinash Lakshman"/>
 <developer id="antelder" name="Anthony Elder"/>

http://git-wip-us.apache.org/repos/asf/cassandra/blob/196038b7/lib/netty-3.5.9.Final.jar
----------------------------------------------------------------------
diff --git a/lib/netty-3.5.9.Final.jar b/lib/netty-3.5.9.Final.jar
deleted file mode 100644
index 7f41e0e..0000000
Binary files a/lib/netty-3.5.9.Final.jar and /dev/null differ

http://git-wip-us.apache.org/repos/asf/cassandra/blob/196038b7/lib/netty-3.6.6.Final.jar
----------------------------------------------------------------------
diff --git a/lib/netty-3.6.6.Final.jar b/lib/netty-3.6.6.Final.jar
new file mode 100644
index 0000000..35cb073
Binary files /dev/null and b/lib/netty-3.6.6.Final.jar differ
[jira] [Resolved] (CASSANDRA-5955) The native protocol server can trigger a Netty bug
[ https://issues.apache.org/jira/browse/CASSANDRA-5955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne resolved CASSANDRA-5955.
-----------------------------------------

Resolution: Fixed
Reviewer: jbellis

Alright, dependency updated.

The native protocol server can trigger a Netty bug
--------------------------------------------------

Key: CASSANDRA-5955
URL: https://issues.apache.org/jira/browse/CASSANDRA-5955
Project: Cassandra
Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Fix For: 1.2.10

The patch from CASSANDRA-5926 did fix the original deadlock, but unfortunately we can now run into a netty bug (with MemoryAwareThreadPoolExecutor): https://github.com/netty/netty/issues/1310. That bug has been fixed in netty 3.6.6 but we're currently using an older version (3.5.9). So we should just upgrade our dependency to 3.6.6.
[jira] [Commented] (CASSANDRA-5969) Allow JVM_OPTS to be passed to sstablescrub
[ https://issues.apache.org/jira/browse/CASSANDRA-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756161#comment-13756161 ]

Jeff Potter commented on CASSANDRA-5969:
----------------------------------------

Quoting makes sense -- although there are other scripts that already have JVM_OPTS without quotes (bin/cassandra, bin/sstableloader). Let me know if I should revise the patch.

Allow JVM_OPTS to be passed to sstablescrub
-------------------------------------------

Key: CASSANDRA-5969
URL: https://issues.apache.org/jira/browse/CASSANDRA-5969
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: Adam Hattrell
Assignee: Brandon Williams
[jira] [Commented] (CASSANDRA-5930) Offline scrubs can choke on broken files
[ https://issues.apache.org/jira/browse/CASSANDRA-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756170#comment-13756170 ]

Jeff Potter commented on CASSANDRA-5930:
----------------------------------------

We're seeing this too -- slightly different stack trace, which I'll include here in case it's of use.

WARNING: Non-fatal error reading row (stacktrace follows)
Exception in thread "main" java.io.IOError: java.lang.IllegalArgumentException
	at org.apache.cassandra.db.compaction.Scrubber.scrub(Scrubber.java:244)
	at org.apache.cassandra.tools.StandaloneScrubber.main(StandaloneScrubber.java:125)
Caused by: java.lang.IllegalArgumentException
	at java.nio.Buffer.limit(Buffer.java:247)
	at org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:51)
	at org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:60)
	at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:78)
	at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
	at org.apache.cassandra.db.ArrayBackedSortedColumns.addColumn(ArrayBackedSortedColumns.java:128)
	at org.apache.cassandra.db.AbstractColumnContainer.addColumn(AbstractColumnContainer.java:114)
	at org.apache.cassandra.db.AbstractColumnContainer.addColumn(AbstractColumnContainer.java:109)
	at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:219)
	at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumnsFromSSTable(ColumnFamilySerializer.java:149)
	at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:234)
	at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:114)
	at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:98)
	at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:160)
	at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:166)
	at org.apache.cassandra.db.compaction.Scrubber.scrub(Scrubber.java:173)
	... 1 more

Offline scrubs can choke on broken files
----------------------------------------

Key: CASSANDRA-5930
URL: https://issues.apache.org/jira/browse/CASSANDRA-5930
Project: Cassandra
Issue Type: Bug
Reporter: Jeremiah Jordan
Assignee: Jason Brown
Priority: Minor

There are cases where offline scrub can hit an exception and die, like:

{noformat}
WARNING: Non-fatal error reading row (stacktrace follows)
Exception in thread "main" java.io.IOError: java.io.IOError: java.io.EOFException
	at org.apache.cassandra.db.compaction.Scrubber.scrub(Scrubber.java:242)
	at org.apache.cassandra.tools.StandaloneScrubber.main(StandaloneScrubber.java:121)
Caused by: java.io.IOError: java.io.EOFException
	at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:116)
	at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:99)
	at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:176)
	at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:182)
	at org.apache.cassandra.db.compaction.Scrubber.scrub(Scrubber.java:171)
	... 1 more
Caused by: java.io.EOFException
	at java.io.RandomAccessFile.readFully(RandomAccessFile.java:399)
	at java.io.RandomAccessFile.readFully(RandomAccessFile.java:377)
	at org.apache.cassandra.utils.BytesReadTracker.readFully(BytesReadTracker.java:95)
	at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:401)
	at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:363)
	at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:120)
	at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:37)
	at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:144)
	at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:234)
	at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:112)
	... 5 more
{noformat}

Since the purpose of offline scrub is to fix broken stuff, it should be more resilient to broken stuff...
[jira] [Updated] (CASSANDRA-5958) Unable to find property errors from snakeyaml are confusing
[ https://issues.apache.org/jira/browse/CASSANDRA-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikhail Stepura updated CASSANDRA-5958:
---------------------------------------

Attachment: trunk-5958-v2-print-all-invalid-properties.patch

A new patch. Don't skip missing properties; print them all out and terminate.

Unable to find property errors from snakeyaml are confusing
-----------------------------------------------------------

Key: CASSANDRA-5958
URL: https://issues.apache.org/jira/browse/CASSANDRA-5958
Project: Cassandra
Issue Type: Bug
Reporter: J.B. Langston
Priority: Minor
Attachments: trunk-5958-skip-missing-properties.patch, trunk-5958-v2-print-all-invalid-properties.patch

When an unexpected property is present in cassandra.yaml (e.g. after upgrading), snakeyaml outputs the following message:

{code}Unable to find property 'some_property' on class: org.apache.cassandra.config.Config{code}

The error message is kind of counterintuitive because at first glance it seems to suggest the property is missing from the yaml file, when in fact the error is caused by the *presence* of an unrecognized property. I know if you read it carefully it says it can't find the property on the class, but this has confused more than one user.

I think we should catch this exception and wrap it in another exception that says something like this:

{code}Please remove 'some_property' from your cassandra.yaml. It is not recognized by this version of Cassandra.{code}

Also, it might make sense to make this a warning instead of a fatal error, and just ignore the unwanted property.
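
The behaviour described for the v2 patch (report every unrecognized property, then terminate) can be illustrated with a small sketch. This is not the attached patch and does not use snakeyaml: it assumes the yaml file has already been loaded into a plain mapping, and the names {{ConfigurationError}}, {{find_unknown_properties}} and {{validate_config}} are hypothetical. The wording of the message is the one proposed in the issue description.

```python
from typing import Iterable, List, Mapping


class ConfigurationError(Exception):
    """Raised once, listing every unrecognized property (hypothetical name)."""


def find_unknown_properties(loaded: Mapping[str, object],
                            known: Iterable[str]) -> List[str]:
    """Return the loaded property names that the config class does not define."""
    known_set = set(known)
    return sorted(name for name in loaded if name not in known_set)


def validate_config(loaded: Mapping[str, object], known: Iterable[str]) -> None:
    """Collect all unknown properties first, then fail with one combined message."""
    unknown = find_unknown_properties(loaded, known)
    if unknown:
        lines = [
            f"Please remove '{name}' from your cassandra.yaml. "
            "It is not recognized by this version of Cassandra."
            for name in unknown
        ]
        raise ConfigurationError("\n".join(lines))


if __name__ == "__main__":
    known = ["cluster_name", "num_tokens"]
    loaded = {"cluster_name": "Test", "some_property": 1, "another_property": 2}
    try:
        validate_config(loaded, known)
    except ConfigurationError as err:
        print(err)
```

Collecting all offending names before raising means a user upgrading across several versions sees every property to remove in one run instead of fixing them one restart at a time.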
[jira] [Created] (CASSANDRA-5970) FilteredRangeSlice command for regex searches against column names on known sets of keys
Nate McCall created CASSANDRA-5970:
-----------------------------------

Summary: FilteredRangeSlice command for regex searches against column names on known sets of keys
Key: CASSANDRA-5970
URL: https://issues.apache.org/jira/browse/CASSANDRA-5970
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: Nate McCall

This is the ability to apply a regex against column names when the set of keys is known. In filtering the keys, we would like to allow for the following clauses: E, GTE, LTE, NE, inclusive list, exclusive list.

The end goal is to provide for efficient searching in the case where you have some knowledge of the keys. A specific use case would be, say, searching user agent strings in a given set of date buckets in the classic time-series web log use case. This is a sweet spot for Cassandra, and providing a more direct method of access for it will help a lot of users. Additionally, this will provide some level of feature parity with the RDBMS crowd, who've had this feature for some time.

Internally, this will include the introduction of a new Verb, an SSTableScanner extension and an ExtendedFilter implementation which applies the regex, as well as a new method on StorageProxy.

This issue does not cover exposing this new query method to thrift and CQL, but obviously that will be required for this to be of any practical use. Those should be covered by separate issues.
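
To make the requested semantics concrete, here is a minimal in-memory sketch of filtering keys by clause and then applying a regex to column names. It says nothing about the proposed internals (Verb, SSTableScanner, ExtendedFilter, StorageProxy). The function names are hypothetical, the reading of E/NE/GTE/LTE as equal / not equal / greater-or-equal / less-or-equal is an assumption, and {{IN}} and {{NOT_IN}} are made-up labels for the inclusive and exclusive lists.

```python
import re
from typing import Callable, Dict, List, Mapping, Tuple

Clause = Tuple[str, object]  # (operator, operand), e.g. ("GTE", "2013-09-01")


def key_predicate(op: str, operand) -> Callable[[str], bool]:
    """Build a predicate over row keys for one clause (operator names assumed)."""
    predicates = {
        "E": lambda key: key == operand,
        "NE": lambda key: key != operand,
        "GTE": lambda key: key >= operand,
        "LTE": lambda key: key <= operand,
        "IN": lambda key: key in operand,          # inclusive list
        "NOT_IN": lambda key: key not in operand,  # exclusive list
    }
    return predicates[op]


def filtered_range_slice(rows: Mapping[str, Mapping[str, str]],
                         key_clauses: List[Clause],
                         column_regex: str) -> Dict[str, Dict[str, str]]:
    """Keep rows whose key satisfies every clause, then keep matching column names."""
    pattern = re.compile(column_regex)
    checks = [key_predicate(op, operand) for op, operand in key_clauses]
    result: Dict[str, Dict[str, str]] = {}
    for key in sorted(rows):
        if all(check(key) for check in checks):
            matched = {name: value for name, value in rows[key].items()
                       if pattern.search(name)}
            if matched:
                result[key] = matched
    return result


if __name__ == "__main__":
    # Date buckets as row keys, user agent strings as column names.
    logs = {
        "2013-09-01": {"Mozilla/5.0 Firefox/23.0": "12", "curl/7.30.0": "3"},
        "2013-09-02": {"Mozilla/5.0 Chrome/29.0": "40"},
        "2013-09-03": {"Mozilla/5.0 Firefox/23.0": "7"},
    }
    clauses = [("GTE", "2013-09-01"), ("LTE", "2013-09-02")]
    print(filtered_range_slice(logs, clauses, r"Firefox"))
```

The sketch mirrors the use case from the description: the date buckets are known up front, so only those rows are examined, and the regex is only evaluated against their column names.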
[jira] [Commented] (CASSANDRA-5969) Allow JVM_OPTS to be passed to sstablescrub
[ https://issues.apache.org/jira/browse/CASSANDRA-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756282#comment-13756282 ]

Jeremiah Jordan commented on CASSANDRA-5969:
--------------------------------------------

[~jeffpotter] FYI for the OS X thing, install a newer JNA. See CASSANDRA-5611

Allow JVM_OPTS to be passed to sstablescrub
-------------------------------------------

Key: CASSANDRA-5969
URL: https://issues.apache.org/jira/browse/CASSANDRA-5969
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: Adam Hattrell
Assignee: Brandon Williams
[jira] [Commented] (CASSANDRA-5969) Allow JVM_OPTS to be passed to sstablescrub
[ https://issues.apache.org/jira/browse/CASSANDRA-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756306#comment-13756306 ]

Jeff Potter commented on CASSANDRA-5969:
----------------------------------------

Alas, no luck: upgrading JNA from 3.5.1 to 3.5.2 doesn't resolve it -- the Dock gets an app named bin when running cassandra and removing '-Djava.awt.headless=true' from JVM_OPTS.

Allow JVM_OPTS to be passed to sstablescrub
-------------------------------------------

Key: CASSANDRA-5969
URL: https://issues.apache.org/jira/browse/CASSANDRA-5969
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: Adam Hattrell
Assignee: Brandon Williams
[jira] [Comment Edited] (CASSANDRA-5969) Allow JVM_OPTS to be passed to sstablescrub
[ https://issues.apache.org/jira/browse/CASSANDRA-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756306#comment-13756306 ]

Jeff Potter edited comment on CASSANDRA-5969 at 9/3/13 1:54 AM:

Alas, no luck: upgrading JNA from 3.5.1 to 3.5.2 doesn't resolve it -- the Dock gets an app named bin when running cassandra and removing '-Djava.awt.headless=true' from JVM_OPTS.

Edit: additional info: On OS X 10.8.4

java -version
java version 1.7.0_25
Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
[jira] [Comment Edited] (CASSANDRA-5969) Allow JVM_OPTS to be passed to sstablescrub
[ https://issues.apache.org/jira/browse/CASSANDRA-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756306#comment-13756306 ]

Jeff Potter edited comment on CASSANDRA-5969 at 9/3/13 1:55 AM:

Alas, no luck: upgrading JNA from 3.5.1 to 3.5.2 doesn't resolve it -- the Dock gets an app named bin when running cassandra and removing '-Djava.awt.headless=true' from JVM_OPTS.

Edit: additional info:
- On OS X 10.8.4
- We're running with Java 1.6 to better match what we run in prod: /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home/bin/java
[jira] [Commented] (CASSANDRA-5933) 2.0 read performance is slower than 1.2
[ https://issues.apache.org/jira/browse/CASSANDRA-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756323#comment-13756323 ]

Vijay commented on CASSANDRA-5933:

Ryan, do you mind testing the custom with 5 to 10 ms... I am thinking we might need enough samples for the percentiles to make more sense (if confirmed, we might want to wait till the samples arrive, etc.).

2.0 read performance is slower than 1.2
Key: CASSANDRA-5933
URL: https://issues.apache.org/jira/browse/CASSANDRA-5933
Project: Cassandra
Issue Type: Bug
Reporter: Ryan McGuire
Attachments: 1.2-faster-than-2.0.png, 1.2-faster-than-2.0-stats.png

Over the course of several tests I have observed that 2.0 read performance is noticeably slower than 1.2.

Example: Blue line is 1.2, the rest are various forms of 2.0 rc1 (I've also seen this on rc2, just don't have a good graph handy)

!1.2-faster-than-2.0.png!
!1.2-faster-than-2.0-stats.png!

[See test data here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.node_killed.json&metric=interval_op_rate&operation=stress-read&smoothing=1]
[jira] [Commented] (CASSANDRA-5933) 2.0 read performance is slower than 1.2
[ https://issues.apache.org/jira/browse/CASSANDRA-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756324#comment-13756324 ]

Ryan McGuire commented on CASSANDRA-5933:

Hi [~vijay2...@yahoo.com], I'm not sure what you meant by 'custom with 5 to 10 ms'. Can you please clarify the test scenario you'd like me to run?
[jira] [Commented] (CASSANDRA-5933) 2.0 read performance is slower than 1.2
[ https://issues.apache.org/jira/browse/CASSANDRA-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756329#comment-13756329 ]

Vijay commented on CASSANDRA-5933:

Hi Ryan, You can set a custom speculative execution like the below...

{code}
update column family Standard1 with speculative_retry=10ms;
{code}
[jira] [Comment Edited] (CASSANDRA-5933) 2.0 read performance is slower than 1.2
[ https://issues.apache.org/jira/browse/CASSANDRA-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756331#comment-13756331 ]

Ryan McGuire edited comment on CASSANDRA-5933 at 9/3/13 3:21 AM:

Ah, OK, I can run that test, that also applies to CASSANDRA-5932. However, in this case, I don't believe speculative retry can account for all the difference. The red line has none enabled.
[jira] [Commented] (CASSANDRA-5933) 2.0 read performance is slower than 1.2
[ https://issues.apache.org/jira/browse/CASSANDRA-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756331#comment-13756331 ]

Ryan McGuire commented on CASSANDRA-5933:

Ah, OK, I can run that test, that also applies to CASSANDRA-5332. However, in this case, I don't believe speculative retry can account for all the difference. The red line has none enabled.
[jira] [Updated] (CASSANDRA-5906) Avoid allocating over-large bloom filters
[ https://issues.apache.org/jira/browse/CASSANDRA-5906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-5906:

Priority: Major (was: Minor)

Avoid allocating over-large bloom filters
Key: CASSANDRA-5906
URL: https://issues.apache.org/jira/browse/CASSANDRA-5906
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Jonathan Ellis
Assignee: Yuki Morishita
Fix For: 2.0.1

We conservatively estimate the number of partitions post-compaction to be the total number of partitions pre-compaction. That is, we assume the worst-case scenario of no partition overlap at all. This can result in substantial memory wasted in sstables resulting from highly overlapping compactions.
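To put a rough number on the waste described above, here is a minimal sketch using the textbook Bloom filter sizing formula, m = -n ln(p) / (ln 2)^2. It is not Cassandra's own sizing code; the class name BloomSizing, the 1% false-positive rate, and the scenario of four fully overlapping sstables of one million partitions each are all assumptions made for illustration.

```java
// Hypothetical illustration; not taken from the Cassandra code base.
class BloomSizing {
    // Textbook optimal number of bits for n keys at false-positive rate p.
    static long bitsFor(long n, double p) {
        return (long) Math.ceil(-n * Math.log(p) / (Math.log(2) * Math.log(2)));
    }

    public static void main(String[] args) {
        double fpRate = 0.01;            // assumed false-positive target
        long perSstable = 1_000_000L;    // assumed partitions per input sstable
        int inputs = 4;                  // assumed number of sstables compacted

        // Worst-case estimate: no overlap, so partition counts simply add up.
        long estimated = bitsFor(perSstable * inputs, fpRate);
        // Full overlap: the output holds only the distinct partitions.
        long needed = bitsFor(perSstable, fpRate);

        System.out.println("bits allocated (no-overlap estimate): " + estimated);
        System.out.println("bits needed (full overlap):           " + needed);
        System.out.println("wasted bytes: " + (estimated - needed) / 8);
    }
}
```

Because the formula is linear in n, overestimating the partition count by some factor oversizes the filter by about the same factor.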
[jira] [Created] (CASSANDRA-5971) Get rid of thrift-generated Index* classes usage in C* internals
Aleksey Yeschenko created CASSANDRA-5971:

Summary: Get rid of thrift-generated Index* classes usage in C* internals
Key: CASSANDRA-5971
URL: https://issues.apache.org/jira/browse/CASSANDRA-5971
Project: Cassandra
Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Aleksey Yeschenko
Priority: Trivial
Fix For: 2.1

We've cleaned up most of it previously, but IndexExpression/IndexOperator/IndexType have somehow escaped the purge.
[jira] [Updated] (CASSANDRA-5971) Get rid of thrift-generated Index* classes usage in C* internals
[ https://issues.apache.org/jira/browse/CASSANDRA-5971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Yeschenko updated CASSANDRA-5971:

Attachment: 5971.txt
[jira] [Commented] (CASSANDRA-5970) FilteredRangeSlice command for regex searches against column names on known sets of keys
[ https://issues.apache.org/jira/browse/CASSANDRA-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756339#comment-13756339 ]

Jonathan Ellis commented on CASSANDRA-5970:

It sounds like you have some code already, but ISTM it would be most straightforward to implement this as a predicate to the existing slice verb. Or put another way, making the slice code a closer match to the range (sequential scan) code that already has a concept of predicates being queried.

FilteredRangeSlice command for regex searches against column names on known sets of keys
Key: CASSANDRA-5970
URL: https://issues.apache.org/jira/browse/CASSANDRA-5970
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: Nate McCall

This is the ability to apply a regex against columns when the set of keys is known. In filtering the keys, we would like to allow for the following clauses: E, GTE, LTE, NE, inclusive list, exclusive list.

The end goal is to provide for efficient searching in the case where you have some knowledge of the keys. A specific use case would be, say, searching user agent strings in the given set of date buckets in the classic time-series web log use case. This is a sweet spot for Cassandra and providing a more direct method of access for such will help a lot of users. Additionally, this will provide some level of feature parity with RDBMS crowd who've had this feature for some time.

Internally, this will include the introduction of a new Verb, SSTableScanner extension and an ExtendedFilter implementation which applies the regex as well as a new method on StorageProxy.

This issue does not cover exposing this new query method to thrift and CQL, but obviously that will be required for this to be of any practical use. Those should be covered by separate issues.
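As a rough picture of the query semantics requested above (a regex applied to column names, restricted to a known set of keys), here is a small in-memory sketch. It says nothing about how the Verb, SSTableScanner or ExtendedFilter pieces would be built; the class RegexColumnFilter, the nested-map row model, and the sample date buckets and user-agent column names are all hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.regex.Pattern;

// Hypothetical in-memory model: row key -> (column name -> value).
class RegexColumnFilter {
    // Returns, for each requested key that exists, the columns whose
    // name contains a match for the pattern. Keys with no hits are omitted.
    static Map<String, SortedMap<String, String>> filter(
            Map<String, SortedMap<String, String>> rows,
            Collection<String> keys,
            Pattern columnName) {
        Map<String, SortedMap<String, String>> out = new LinkedHashMap<>();
        for (String key : keys) {
            SortedMap<String, String> columns = rows.get(key);
            if (columns == null) {
                continue; // a requested key that was never written
            }
            SortedMap<String, String> hits = new TreeMap<>();
            for (Map.Entry<String, String> e : columns.entrySet()) {
                if (columnName.matcher(e.getKey()).find()) {
                    hits.put(e.getKey(), e.getValue());
                }
            }
            if (!hits.isEmpty()) {
                out.put(key, hits);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed layout: one row per date bucket, one column per user agent.
        Map<String, SortedMap<String, String>> rows = new HashMap<>();
        SortedMap<String, String> day1 = new TreeMap<>();
        day1.put("ua:Mozilla/5.0", "10");
        day1.put("ua:curl/7.30", "3");
        rows.put("2013-09-01", day1);
        SortedMap<String, String> day2 = new TreeMap<>();
        day2.put("ua:Wget/1.14", "1");
        rows.put("2013-09-02", day2);

        System.out.println(filter(rows,
                Arrays.asList("2013-09-01", "2013-09-02"),
                Pattern.compile("Mozilla")));
    }
}
```

Only the listed keys are visited, which is the point of the request: the regex never has to run over rows outside the known buckets.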
[jira] [Updated] (CASSANDRA-5971) Get rid of thrift-generated Index* classes usage in C* internals
[ https://issues.apache.org/jira/browse/CASSANDRA-5971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-5971:

Reviewer: dbrosius (was: jbellis)
[jira] [Commented] (CASSANDRA-5958) Unable to find property errors from snakeyaml are confusing
[ https://issues.apache.org/jira/browse/CASSANDRA-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756342#comment-13756342 ]

Jonathan Ellis commented on CASSANDRA-5958:

does HashSet.toString actually give us a human-readable error?

Unable to find property errors from snakeyaml are confusing
Key: CASSANDRA-5958
URL: https://issues.apache.org/jira/browse/CASSANDRA-5958
Project: Cassandra
Issue Type: Bug
Reporter: J.B. Langston
Priority: Minor
Attachments: trunk-5958-skip-missing-properties.patch, trunk-5958-v2-print-all-invalid-properties.patch

When an unexpected property is present in cassandra.yaml (e.g. after upgrading), snakeyaml outputs the following message:

{code}Unable to find property 'some_property' on class: org.apache.cassandra.config.Config{code}

The error message is kind of counterintuitive because at first glance it seems to suggest the property is missing from the yaml file, when in fact the error is caused by the *presence* of an unrecognized property. I know if you read it carefully it says it can't find the property on the class, but this has confused more than one user.

I think we should catch this exception and wrap it in another exception that says something like this:

{code}Please remove 'some_property' from your cassandra.yaml. It is not recognized by this version of Cassandra.{code}

Also, it might make sense to make this a warning instead of a fatal error, and just ignore the unwanted property.
[jira] [Commented] (CASSANDRA-5958) Unable to find property errors from snakeyaml are confusing
[ https://issues.apache.org/jira/browse/CASSANDRA-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756348#comment-13756348 ]

Mikhail Stepura commented on CASSANDRA-5958:

For example I have the following in my cassandra.yaml

{code:title=cassandra.yaml}
oh: my
bla: bla
{code}

Then the stacktrace will be

{code}
ERROR 04:06:57 Fatal configuration error
org.apache.cassandra.exceptions.ConfigurationException: Invalid yaml. Please remove properties [bla, oh] from your cassandra.yaml
	at org.apache.cassandra.config.YamlConfigurationLoader$MissingPropertiesChecker.check(YamlConfigurationLoader.java:131) ~[main/:na]
	at org.apache.cassandra.config.YamlConfigurationLoader.loadConfig(YamlConfigurationLoader.java:94) ~[main/:na]
	at org.apache.cassandra.config.DatabaseDescriptor.loadConfig(DatabaseDescriptor.java:128) ~[main/:na]
	at org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:104) ~[main/:na]
	at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:153) ~[main/:na]
	at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:391) ~[main/:na]
	at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:434) ~[main/:na]
Invalid yaml. Please remove properties [bla, oh] from your cassandra.yaml
Fatal configuration error; unable to start. See log for stacktrace.
{code}
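On the readability question raised in this thread: Java collections print as a bracketed, comma-separated list, which is the "[bla, oh]" form seen in the log output. The sketch below is not the code from the attached patches; the class UnknownPropertyMessage and its build method are hypothetical, and copying into a TreeSet is an assumption added here so the order of names is deterministic (a plain HashSet does not guarantee iteration order).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; not the implementation from the 5958 patches.
class UnknownPropertyMessage {
    // Builds one error message naming every unrecognized cassandra.yaml key.
    static String build(Set<String> unknown) {
        // TreeSet sorts the names; AbstractCollection.toString() renders
        // the collection as "[a, b]".
        return "Invalid yaml. Please remove properties " + new TreeSet<>(unknown)
                + " from your cassandra.yaml";
    }

    public static void main(String[] args) {
        Set<String> unknown = new HashSet<>(Arrays.asList("oh", "bla"));
        System.out.println(build(unknown));
    }
}
```

Reporting all offending keys in one message means a user upgrading with several stale settings can fix the file in a single pass instead of restarting once per property.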
[jira] [Commented] (CASSANDRA-5971) Get rid of thrift-generated Index* classes usage in C* internals
[ https://issues.apache.org/jira/browse/CASSANDRA-5971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756350#comment-13756350 ]

Dave Brosius commented on CASSANDRA-5971:

+1