[ https://issues.apache.org/jira/browse/CASSANDRA-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981199#comment-13981199 ]
Jacek Furmankiewicz commented on CASSANDRA-7088: ------------------------------------------------ We are digging into it more and more. It does not seem to be the TRUNCATE after all. It seems that after many writes, we try to query (just before our BDD ends, so it looked like it is stuck trying to start a new one, hence our initial truncate suspicion). Then we get this exception on a regular basis and everything just freezes for a while. We run in DEBUG mode to see what is going on and are getting this: DEBUG [Thrift:2] 2014-04-25 11:10:49,403 CassandraServer.java (line 313) get_slice DEBUG [Thrift:2] 2014-04-25 11:10:49,403 StorageProxy.java (line 1226) Read: 0 ms. DEBUG [Thrift:2] 2014-04-25 11:10:49,404 Tracing.java (line 159) request complete DEBUG [Thrift:1] 2014-04-25 11:10:49,418 CassandraServer.java (line 313) get_slice DEBUG [Thrift:1] 2014-04-25 11:10:49,419 StorageProxy.java (line 1226) Read: 0 ms. DEBUG [Thrift:1] 2014-04-25 11:10:49,419 Tracing.java (line 159) request complete DEBUG [OptionalTasks:1] 2014-04-25 11:10:49,437 ColumnFamilyStore.java (line 298) retryPolicy for sessions is 0.99 DEBUG [OptionalTasks:1] 2014-04-25 11:10:49,439 ColumnFamilyStore.java (line 298) retryPolicy for events is 0.99 DEBUG [Thrift:4] 2014-04-25 11:10:49,533 CassandraServer.java (line 313) get_slice DEBUG [Thrift:4] 2014-04-25 11:10:49,533 StorageProxy.java (line 1226) Read: 0 ms. DEBUG [Thrift:4] 2014-04-25 11:10:49,534 Tracing.java (line 159) request complete DEBUG [Thrift:2] 2014-04-25 11:10:49,534 CassandraServer.java (line 313) get_slice DEBUG [Thrift:2] 2014-04-25 11:10:49,534 StorageProxy.java (line 1226) Read: 0 ms. DEBUG [Thrift:2] 2014-04-25 11:10:49,535 Tracing.java (line 159) request complete DEBUG [Thrift:1] 2014-04-25 11:10:49,535 CassandraServer.java (line 313) get_slice DEBUG [Thrift:1] 2014-04-25 11:10:49,536 StorageProxy.java (line 1226) Read: 0 ms. DEBUG [Thrift:1] 2014-04-25 11:10:49,536 Tracing.java (line 159) request complete DEBUG [Thrift:4] 2014-04-25 11:10:49,536 CassandraServer.java (line 313) get_slice DEBUG [Thrift:4] 2014-04-25 11:10:49,537 StorageProxy.java (line 1226) Read: 0 ms. DEBUG [Thrift:4] 2014-04-25 11:10:49,537 Tracing.java (line 159) request complete DEBUG [Thrift:2] 2014-04-25 11:10:49,537 CassandraServer.java (line 313) get_slice DEBUG [Thrift:2] 2014-04-25 11:10:49,538 StorageProxy.java (line 1226) Read: 0 ms. DEBUG [Thrift:2] 2014-04-25 11:10:49,538 Tracing.java (line 159) request complete DEBUG [Thrift:1] 2014-04-25 11:10:49,538 CassandraServer.java (line 313) get_slice DEBUG [Thrift:1] 2014-04-25 11:10:49,539 StorageProxy.java (line 1226) Read: 0 ms. DEBUG [Thrift:1] 2014-04-25 11:10:49,539 Tracing.java (line 159) request complete DEBUG [Thrift:4] 2014-04-25 11:10:49,539 CassandraServer.java (line 313) get_slice DEBUG [Thrift:4] 2014-04-25 11:10:49,540 StorageProxy.java (line 1226) Read: 0 ms. DEBUG [Thrift:4] 2014-04-25 11:10:49,540 Tracing.java (line 159) request complete DEBUG [Thrift:2] 2014-04-25 11:10:49,540 CassandraServer.java (line 313) get_slice DEBUG [Thrift:2] 2014-04-25 11:10:49,541 StorageProxy.java (line 1226) Read: 0 ms. DEBUG [Thrift:2] 2014-04-25 11:10:49,541 Tracing.java (line 159) request complete DEBUG [Thrift:1] 2014-04-25 11:10:49,541 CassandraServer.java (line 313) get_slice DEBUG [Thrift:1] 2014-04-25 11:10:49,542 StorageProxy.java (line 1226) Read: 0 ms. DEBUG [Thrift:1] 2014-04-25 11:10:49,542 Tracing.java (line 159) request complete DEBUG [Thrift:4] 2014-04-25 11:10:49,542 CassandraServer.java (line 313) get_slice DEBUG [Thrift:4] 2014-04-25 11:10:49,543 StorageProxy.java (line 1226) Read: 0 ms. DEBUG [Thrift:4] 2014-04-25 11:10:49,543 Tracing.java (line 159) request complete DEBUG [Thrift:2] 2014-04-25 11:10:49,543 CassandraServer.java (line 313) get_slice DEBUG [Thrift:2] 2014-04-25 11:10:49,544 StorageProxy.java (line 1226) Read: 0 ms. DEBUG [Thrift:2] 2014-04-25 11:10:49,544 Tracing.java (line 159) request complete DEBUG [Thrift:1] 2014-04-25 11:10:49,544 CassandraServer.java (line 313) get_slice DEBUG [Thrift:1] 2014-04-25 11:10:49,544 StorageProxy.java (line 1226) Read: 0 ms. DEBUG [Thrift:1] 2014-04-25 11:10:49,545 Tracing.java (line 159) request complete DEBUG [Thrift:4] 2014-04-25 11:10:49,545 CassandraServer.java (line 313) get_slice DEBUG [Thrift:4] 2014-04-25 11:10:49,545 StorageProxy.java (line 1226) Read: 0 ms. DEBUG [Thrift:4] 2014-04-25 11:10:49,545 Tracing.java (line 159) request complete DEBUG [Thrift:2] 2014-04-25 11:10:49,546 CassandraServer.java (line 313) get_slice DEBUG [Thrift:2] 2014-04-25 11:10:49,546 StorageProxy.java (line 1226) Read: 0 ms. DEBUG [Thrift:2] 2014-04-25 11:10:49,546 Tracing.java (line 159) request complete DEBUG [Thrift:1] 2014-04-25 11:10:49,546 CassandraServer.java (line 313) get_slice DEBUG [Thrift:1] 2014-04-25 11:10:49,547 StorageProxy.java (line 1226) Read: 0 ms. DEBUG [Thrift:1] 2014-04-25 11:10:49,547 Tracing.java (line 159) request complete DEBUG [Thrift:4] 2014-04-25 11:10:49,547 CassandraServer.java (line 313) get_slice DEBUG [Thrift:4] 2014-04-25 11:10:49,548 StorageProxy.java (line 1226) Read: 0 ms. DEBUG [Thrift:4] 2014-04-25 11:10:49,548 Tracing.java (line 159) request complete DEBUG [Thrift:2] 2014-04-25 11:10:49,548 CassandraServer.java (line 313) get_slice DEBUG [Thrift:2] 2014-04-25 11:10:49,548 StorageProxy.java (line 1226) Read: 0 ms. DEBUG [Thrift:2] 2014-04-25 11:10:49,549 Tracing.java (line 159) request complete DEBUG [Thrift:1] 2014-04-25 11:10:49,549 CassandraServer.java (line 313) get_slice DEBUG [Thrift:1] 2014-04-25 11:10:49,549 StorageProxy.java (line 1226) Read: 0 ms. DEBUG [Thrift:1] 2014-04-25 11:10:49,550 Tracing.java (line 159) request complete DEBUG [Thrift:3] 2014-04-25 11:10:49,983 ReadCallback.java (line 103) Read timeout: org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses. DEBUG [Thrift:3] 2014-04-25 11:10:49,984 StorageProxy.java (line 1243) Read timeout; received 0 of 1 responses DEBUG [Thrift:3] 2014-04-25 11:10:49,984 Tracing.java (line 159) request complete DEBUG [Thrift:4] 2014-04-25 11:10:49,984 CassandraServer.java (line 313) get_slice ERROR [ReadStage:21] 2014-04-25 11:10:49,987 CassandraDaemon.java (line 198) Exception in thread Thread[ReadStage:21,5,main] java.lang.RuntimeException: java.lang.NumberFormatException: Zero length BigInteger at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1920) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: java.lang.NumberFormatException: Zero length BigInteger at java.math.BigInteger.<init>(BigInteger.java:190) at org.apache.cassandra.serializers.IntegerSerializer.deserialize(IntegerSerializer.java:32) at org.apache.cassandra.serializers.IntegerSerializer.deserialize(IntegerSerializer.java:26) at org.apache.cassandra.db.marshal.AbstractType.getString(AbstractType.java:156) at org.apache.cassandra.db.marshal.AbstractCompositeType.getString(AbstractCompositeType.java:242) at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:219) at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122) at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80) at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72) at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297) at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53) at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1540) at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1369) at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:327) at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:65) at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1352) at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1916) ... 3 more DEBUG [Thrift:7] 2014-04-25 11:10:49,990 CustomTThreadPoolServer.java (line 211) Thrift transport error occurred during processing of message. org.apache.thrift.transport.TTransportException at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129) at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:362) at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:284) at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:191) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27) at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:201) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) DEBUG [Thrift:2] 2014-04-25 11:10:50,404 CassandraServer.java (line 313) get_slice > Zero length BigInteger exception causing processing to freeze > -------------------------------------------------------------- > > Key: CASSANDRA-7088 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7088 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: Linux Mint 16 > Reporter: Jacek Furmankiewicz > Attachments: cassandra.yaml > > > We attempted to migrate our developers to Cassandra 2.0.7 from 1.2. > Everything worked perfectly, but we have experienced a massive drop in > developer velocity. > We run integration tests with Cucumber BDD and 1000 BDDs went from 7 minutes > (Cassandra 1.2) to 15 minutes (2.0.7), > This is when we run Cassandra of the ramdisk (/dev/shm) to make it run faster > on dev boxes. > When we tried pointed to actual drives the difference was dramatic: the > entire suite took over 70 minutes (!) vs 15 in Cassandra 1.2. > After investigation, we found that most of the time is spent in the > truncation logic between every scenario, where we truncate all the column > families and start with a clean DB for the next test case. > This used to be super fast in 1.2, is now very slow in 2.0. > It may not seem important, but upgrading to 2.0 has basically cut down > developer velocity by 100%, just by more than doubling the time it takes to > run our BDD suite. > We truncate the CFs using the Ruby driver: > $cassandra.column_families.each do |column_family| > name = column_family[0].to_s > $cassandra.truncate! name > end > I am attaching our cassandra.yaml. Please note we already switched off > auto_compaction before truncate, just as we did in 1.2 for dev boxes, Made no > difference. -- This message was sent by Atlassian JIRA (v6.2#6252)