[ https://issues.apache.org/jira/browse/CASSANDRA-2401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030425#comment-13030425 ]
Jackson Chung commented on CASSANDRA-2401: ------------------------------------------ Here is 1 way that i could 100% reproduce the issue with data being null: Need 2 nodes, 1 is gonna to autobootstrap to the other. Also assuming completely clean start (blow up the /var/lib/cassandra/ or where ever data are stored i am also using brisk beta to test to start: node-A: 1) get brisk 2) start brisk with -t (jobtracker) 3) run a simple hive query : 3a) bin/brisk hive 3b) create table foo (bar INT); 3c) select count(*) from foo; 3d) exit; 4) every thing should be so far so good, let the brisk node continue to be up node-B: 1) get brisk 2) modify the resources/cassandra/conf/cassandra.yaml: 2a) to enable autobootstrap. 2b) point seeds to node-A 3) put a sleep or break point in o.a.c.service.StorageService.joinTokenRing method, right after "Map<InetAddress, Double> loadinfo = StorageLoadBalancer.instance.getLoadInfo();" (personal preference: log a sleep line, add a thread.sleep(a_long_time)) 4) start brisk with -t on node-B 5) wait till the log line "Joining: getting bootstrap token" , it should now reaches your break point (or zz) 6) crash the jvm (personal preference: kill -9 <pid>) back to node-A 1) exit the jvm (BriskDaemon) "normally" (kill <pid>) 2) start the brisk node again (with -t): log from node-A: {noformat} INFO 23:25:00,213 Logging initialized SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/home/riptano/work/brisk/resources/cassandra/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/home/riptano/work/brisk/resources/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. INFO 23:25:00,235 Heap size: 510263296/511311872 INFO 23:25:00,237 JNA not found. Native methods will be disabled. INFO 23:25:00,263 Loading settings from file:/home/riptano/work/brisk/resources/cassandra/conf/cassandra.yaml INFO 23:25:00,470 DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap INFO 23:25:00,496 Detected Hadoop trackers are enabled, setting my DC to Brisk INFO 23:25:00,696 Global memtable threshold is enabled at 162MB INFO 23:25:00,846 Opening /var/lib/cassandra/data/system/IndexInfo-f-1 INFO 23:25:00,912 Opening /var/lib/cassandra/data/system/Schema-f-2 INFO 23:25:00,926 Opening /var/lib/cassandra/data/system/Schema-f-1 INFO 23:25:00,951 Opening /var/lib/cassandra/data/system/Migrations-f-2 INFO 23:25:00,954 Opening /var/lib/cassandra/data/system/Migrations-f-1 INFO 23:25:00,970 Opening /var/lib/cassandra/data/system/LocationInfo-f-2 INFO 23:25:00,989 Opening /var/lib/cassandra/data/system/LocationInfo-f-1 INFO 23:25:01,089 Loading schema version c4fd2440-7900-11e0-0000-ba846f9adcf7 INFO 23:25:01,499 Creating new commitlog segment /var/lib/cassandra/commitlog/CommitLog-1304810701499.log INFO 23:25:01,530 Replaying /var/lib/cassandra/commitlog/CommitLog-1304810455288.log INFO 23:25:01,675 Finished reading /var/lib/cassandra/commitlog/CommitLog-1304810455288.log INFO 23:25:01,730 Enqueuing flush of Memtable-MetaStore@102170028(869/1086 serialized/live bytes, 3 ops) INFO 23:25:01,735 Writing Memtable-MetaStore@102170028(869/1086 serialized/live bytes, 3 ops) INFO 23:25:01,743 Enqueuing flush of Memtable-sblocks@1075051425(3044096/3805120 serialized/live bytes, 17 ops) INFO 23:25:01,747 Enqueuing flush of Memtable-inode.path@780298059(2848/3560 serialized/live bytes, 59 ops) INFO 23:25:01,748 Enqueuing flush of Memtable-inode.sentinel@1934329031(2848/3560 serialized/live bytes, 59 ops) INFO 23:25:01,748 Enqueuing flush of Memtable-inode@1660575731(6393/7991 serialized/live bytes, 134 ops) INFO 23:25:01,821 Completed flushing /var/lib/cassandra/data/HiveMetaStore/MetaStore-f-1-Data.db (989 bytes) INFO 23:25:01,832 Writing Memtable-sblocks@1075051425(3044096/3805120 serialized/live bytes, 17 ops) INFO 23:25:01,927 Completed flushing /var/lib/cassandra/data/cfs/sblocks-f-1-Data.db (3045448 bytes) INFO 23:25:01,928 Writing Memtable-inode.path@780298059(2848/3560 serialized/live bytes, 59 ops) INFO 23:25:01,968 Completed flushing /var/lib/cassandra/data/cfs/inode.path-f-1-Data.db (5346 bytes) INFO 23:25:01,969 Writing Memtable-inode.sentinel@1934329031(2848/3560 serialized/live bytes, 59 ops) INFO 23:25:02,035 Completed flushing /var/lib/cassandra/data/cfs/inode.sentinel-f-1-Data.db (1735 bytes) INFO 23:25:02,036 Writing Memtable-inode@1660575731(6393/7991 serialized/live bytes, 134 ops) INFO 23:25:02,085 Completed flushing /var/lib/cassandra/data/cfs/inode-f-1-Data.db (8582 bytes) INFO 23:25:02,087 Log replay complete INFO 23:25:02,092 Cassandra version: 0.8.0-beta2-SNAPSHOT INFO 23:25:02,092 Thrift API version: 19.10.0 INFO 23:25:02,092 Loading persisted ring state INFO 23:25:02,092 load token size: 0 INFO 23:25:02,093 Starting up server gossip INFO 23:25:02,104 Enqueuing flush of Memtable-LocationInfo@22262475(29/36 serialized/live bytes, 1 ops) INFO 23:25:02,105 Writing Memtable-LocationInfo@22262475(29/36 serialized/live bytes, 1 ops) INFO 23:25:02,127 Completed flushing /var/lib/cassandra/data/system/LocationInfo-f-3-Data.db (80 bytes) INFO 23:25:02,149 Starting Messaging Service on port 7000 INFO 23:25:02,172 Using saved token 152036150612811635197207268153837644139 INFO 23:25:02,173 Enqueuing flush of Memtable-LocationInfo@1977026981(53/66 serialized/live bytes, 2 ops) INFO 23:25:02,174 Writing Memtable-LocationInfo@1977026981(53/66 serialized/live bytes, 2 ops) INFO 23:25:02,190 Completed flushing /var/lib/cassandra/data/system/LocationInfo-f-4-Data.db (163 bytes) INFO 23:25:02,193 Compacting Major: [SSTableReader(path='/var/lib/cassandra/data/system/LocationInfo-f-2-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/LocationInfo-f-1-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/LocationInfo-f-3-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/LocationInfo-f-4-Data.db')] INFO 23:25:02,196 Will not load MX4J, mx4j-tools.jar is not in the classpath INFO 23:25:02,196 Starting up Hadoop trackers INFO 23:25:02,197 Waiting for gossip to start INFO 23:25:02,225 Major@1830423861(system, LocationInfo, 438/741) now compacting at 16777 bytes/ms. INFO 23:25:02,257 Compacted to /var/lib/cassandra/data/system/LocationInfo-tmp-f-5-Data.db. 741 to 447 (~60% of original) bytes for 3 keys. Time: 64ms. INFO 23:25:07,272 Chose seed 10.179.96.212 as jobtracker WARN 23:25:09,331 Metrics system not started: Cannot locate configuration: tried hadoop-metrics2-jobtracker.properties, hadoop-metrics2.properties INFO 23:25:09,994 Chose seed 10.179.96.212 as jobtracker INFO 23:25:10,139 Updating the current master key for generating delegation tokens INFO 23:25:10,143 Starting expired delegation token remover thread, tokenRemoverScanInterval=60 min(s) INFO 23:25:10,143 Scheduler configured with (memSizeForMapSlotOnJT, memSizeForReduceSlotOnJT, limitMaxMemForMapTasks, limitMaxMemForReduceTasks) (-1, -1, -1, -1) INFO 23:25:10,144 Updating the current master key for generating delegation tokens INFO 23:25:10,145 Refreshing hosts (include/exclude) list INFO 23:25:10,223 Starting jobtracker with owner as riptano INFO 23:25:10,245 Starting SocketReader INFO 23:25:10,374 Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog INFO 23:25:10,623 Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter) INFO 23:25:10,673 Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50030 INFO 23:25:10,673 listener.getLocalPort() returned 50030 webServer.getConnectors()[0].getLocalPort() returned 50030 INFO 23:25:10,674 Jetty bound to port 50030 INFO 23:25:10,674 jetty-6.1.21 INFO 23:25:11,140 Started SelectChannelConnector@0.0.0.0:50030 INFO 23:25:11,147 JobTracker up at: 8012 INFO 23:25:11,147 JobTracker webserver: 50030 WARN 23:25:11,276 Incorrect permissions on cassandra://localhost:9160/tmp/hadoop-riptano/mapred/system. Setting it to rwx------ ERROR 23:25:11,321 Fatal exception in thread Thread[ReadStage:4,5,main] java.lang.AssertionError: No data found for NamesQueryFilter(columns=java.nio.HeapByteBuffer[pos=12 lim=16 cap=17]) in DecoratedKey(55249227080490826413412398468829851220, 3165333533353736613164333836353061346636333465656437326131353939):QueryPath(columnFamilyName='inode', superColumnName='null', columnName='null') (original filter NamesQueryFilter(columns=java.nio.HeapByteBuffer[pos=12 lim=16 cap=17])) from expression 'inode.73656e74696e656c EQ 78' at org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1513) at org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:46) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) INFO 23:25:20,059 Deleted /var/lib/cassandra/data/system/LocationInfo-f-3 INFO 23:25:20,060 Deleted /var/lib/cassandra/data/system/LocationInfo-f-4 INFO 23:25:20,576 Deleted /var/lib/cassandra/data/system/LocationInfo-f-1 INFO 23:25:20,577 Deleted /var/lib/cassandra/data/system/LocationInfo-f-2 INFO 23:25:21,297 problem cleaning system directory: cassandra://localhost:9160/tmp/hadoop-riptano/mapred/system java.io.IOException: TimedOutException() at org.apache.cassandra.hadoop.fs.CassandraFileSystemThriftStore.listDeepSubPaths(CassandraFileSystemThriftStore.java:523) at org.apache.cassandra.hadoop.fs.CassandraFileSystemThriftStore.listSubPaths(CassandraFileSystemThriftStore.java:529) at org.apache.cassandra.hadoop.fs.CassandraFileSystem.listStatus(CassandraFileSystem.java:171) at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2374) at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2174) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:303) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:294) at org.apache.cassandra.hadoop.trackers.TrackerInitializer$1.run(TrackerInitializer.java:93) at java.lang.Thread.run(Thread.java:662) Caused by: TimedOutException() at org.apache.cassandra.thrift.CassandraServer.get_indexed_slices(CassandraServer.java:673) at org.apache.cassandra.hadoop.fs.CassandraFileSystemThriftStore.listDeepSubPaths(CassandraFileSystemThriftStore.java:506) ... 8 more WARN 23:25:31,300 Incorrect permissions on cassandra://localhost:9160/tmp/hadoop-riptano/mapred/system. Setting it to rwx------ ERROR 23:25:31,315 Fatal exception in thread Thread[ReadStage:6,5,main] java.lang.AssertionError: No data found for NamesQueryFilter(columns=java.nio.HeapByteBuffer[pos=12 lim=16 cap=17]) in DecoratedKey(55249227080490826413412398468829851220, 3165333533353736613164333836353061346636333465656437326131353939):QueryPath(columnFamilyName='inode', superColumnName='null', columnName='null') (original filter NamesQueryFilter(columns=java.nio.HeapByteBuffer[pos=12 lim=16 cap=17])) from expression 'inode.73656e74696e656c EQ 78' at org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1513) at org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:46) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) INFO 23:25:41,303 problem cleaning system directory: cassandra://localhost:9160/tmp/hadoop-riptano/mapred/system java.io.IOException: TimedOutException() at org.apache.cassandra.hadoop.fs.CassandraFileSystemThriftStore.listDeepSubPaths(CassandraFileSystemThriftStore.java:523) at org.apache.cassandra.hadoop.fs.CassandraFileSystemThriftStore.listSubPaths(CassandraFileSystemThriftStore.java:529) at org.apache.cassandra.hadoop.fs.CassandraFileSystem.listStatus(CassandraFileSystem.java:171) at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2374) at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2174) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:303) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:294) at org.apache.cassandra.hadoop.trackers.TrackerInitializer$1.run(TrackerInitializer.java:93) at java.lang.Thread.run(Thread.java:662) Caused by: TimedOutException() at org.apache.cassandra.thrift.CassandraServer.get_indexed_slices(CassandraServer.java:673) at org.apache.cassandra.hadoop.fs.CassandraFileSystemThriftStore.listDeepSubPaths(CassandraFileSystemThriftStore.java:506) ... 8 more WARN 23:25:51,308 Incorrect permissions on cassandra://localhost:9160/tmp/hadoop-riptano/mapred/system. Setting it to rwx------ ERROR 23:25:51,321 Fatal exception in thread Thread[ReadStage:8,5,main] java.lang.AssertionError: No data found for NamesQueryFilter(columns=java.nio.HeapByteBuffer[pos=12 lim=16 cap=17]) in DecoratedKey(55249227080490826413412398468829851220, 3165333533353736613164333836353061346636333465656437326131353939):QueryPath(columnFamilyName='inode', superColumnName='null', columnName='null') (original filter NamesQueryFilter(columns=java.nio.HeapByteBuffer[pos=12 lim=16 cap=17])) from expression 'inode.73656e74696e656c EQ 78' at org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1513) at org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:46) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {noformat} > getColumnFamily() return null, which is not checked in ColumnFamilyStore.java > scan() method, causing Timeout Exception in query > ------------------------------------------------------------------------------------------------------------------------------- > > Key: CASSANDRA-2401 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2401 > Project: Cassandra > Issue Type: Bug > Components: Core > Affects Versions: 0.7.0 > Environment: Hector 0.7.0-28, Cassandra 0.7.4, Windows 7, Eclipse > Reporter: Tey Kar Shiang > Assignee: Jonathan Ellis > Fix For: 0.7.6 > > Attachments: 2401-v2.txt, 2401-v3.txt, 2401.txt > > > ColumnFamilyStore.java, line near 1680, "ColumnFamily data = > getColumnFamily(new QueryFilter(dk, path, firstFilter))", the data is > returned null, causing NULL exception in "satisfies(data, clause, primary)" > which is not captured. The callback got timeout and return a Timeout > exception to Hector. > The data is empty, as I traced, I have the the columns Count as 0 in > removeDeletedCF(), which return the null there. (I am new and trying to > understand the logics around still). Instead of crash to NULL, could we > bypass the data? > About my test: > A stress-test program to add, modify and delete data to keyspace. I have 30 > threads simulate concurrent users to perform the actions above, and do a > query to all rows periodically. I have Column Family with rows (as File) and > columns as index (e.g. userID, fileType). > No issue on the first day of test, and stopped for 3 days. I restart the test > on 4th day, 1 of the users failed to query the files (timeout exception > received). Most of the users are still okay with the query. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira