[ https://issues.apache.org/jira/browse/CASSANDRA-2401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030425#comment-13030425 ]

Jackson Chung commented on CASSANDRA-2401:
------------------------------------------

Here is one way I could reliably (100%) reproduce the issue with data being null:

You need two nodes; one will autobootstrap to the other. I am also assuming a 
completely clean start (blow away /var/lib/cassandra/, or wherever data is 
stored).

I am also using the Brisk beta to test.

To start:
node-A:
1) get brisk
2) start brisk  with -t (jobtracker)
3) run a simple hive query : 
 3a) bin/brisk hive 
 3b) create table foo (bar INT);
 3c) select count(*) from foo;
 3d) exit;
4) everything should be fine so far; leave the Brisk node up

node-B:
1) get brisk
2) modify resources/cassandra/conf/cassandra.yaml:
 2a) enable autobootstrap
 2b) point seeds to node-A

3) put a sleep or breakpoint in the o.a.c.service.StorageService.joinTokenRing 
method, right after "Map<InetAddress, Double> loadinfo = 
StorageLoadBalancer.instance.getLoadInfo();" (personal preference: log a line, 
then Thread.sleep(a_long_time))
4) start brisk with -t on node-B 
5) wait for the log line "Joining: getting bootstrap token"; execution should 
now reach your breakpoint (or sleep)
6) crash the jvm (personal preference: kill -9 <pid>)
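Step 3's pause can be sketched as below. This is only an illustration of the idea, not a patch against StorageService: the class name and durations here are made up, and in the real joinTokenRing the log line plus Thread.sleep would go directly after the getLoadInfo() call.

```java
// Illustrative sketch of the pause described in step 3 (hypothetical
// class; in practice the two lines in pause() are inserted into
// StorageService.joinTokenRing, right after getLoadInfo()).
public class BootstrapPauseDemo {
    // Log a marker line, block for the given time, then report waking up.
    static String pause(long millis) throws InterruptedException {
        System.out.println("sleeping before bootstrap token selection");
        Thread.sleep(millis); // use a_long_time here to leave room for kill -9
        return "woke up";
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(pause(100)); // shortened for the demo
    }
}
```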

Back on node-A:
1) exit the JVM (BriskDaemon) "normally" (kill <pid>)
2) start the brisk node again (with -t):

Log from node-A:
{noformat}
 INFO 23:25:00,213 Logging initialized
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/home/riptano/work/brisk/resources/cassandra/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/home/riptano/work/brisk/resources/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
 INFO 23:25:00,235 Heap size: 510263296/511311872
 INFO 23:25:00,237 JNA not found. Native methods will be disabled.
 INFO 23:25:00,263 Loading settings from 
file:/home/riptano/work/brisk/resources/cassandra/conf/cassandra.yaml
 INFO 23:25:00,470 DiskAccessMode 'auto' determined to be mmap, indexAccessMode 
is mmap
 INFO 23:25:00,496 Detected Hadoop trackers are enabled, setting my DC to Brisk
 INFO 23:25:00,696 Global memtable threshold is enabled at 162MB
 INFO 23:25:00,846 Opening /var/lib/cassandra/data/system/IndexInfo-f-1
 INFO 23:25:00,912 Opening /var/lib/cassandra/data/system/Schema-f-2
 INFO 23:25:00,926 Opening /var/lib/cassandra/data/system/Schema-f-1
 INFO 23:25:00,951 Opening /var/lib/cassandra/data/system/Migrations-f-2
 INFO 23:25:00,954 Opening /var/lib/cassandra/data/system/Migrations-f-1
 INFO 23:25:00,970 Opening /var/lib/cassandra/data/system/LocationInfo-f-2
 INFO 23:25:00,989 Opening /var/lib/cassandra/data/system/LocationInfo-f-1
 INFO 23:25:01,089 Loading schema version c4fd2440-7900-11e0-0000-ba846f9adcf7
 INFO 23:25:01,499 Creating new commitlog segment 
/var/lib/cassandra/commitlog/CommitLog-1304810701499.log
 INFO 23:25:01,530 Replaying 
/var/lib/cassandra/commitlog/CommitLog-1304810455288.log
 INFO 23:25:01,675 Finished reading 
/var/lib/cassandra/commitlog/CommitLog-1304810455288.log
 INFO 23:25:01,730 Enqueuing flush of Memtable-MetaStore@102170028(869/1086 
serialized/live bytes, 3 ops)
 INFO 23:25:01,735 Writing Memtable-MetaStore@102170028(869/1086 
serialized/live bytes, 3 ops)
 INFO 23:25:01,743 Enqueuing flush of 
Memtable-sblocks@1075051425(3044096/3805120 serialized/live bytes, 17 ops)
 INFO 23:25:01,747 Enqueuing flush of Memtable-inode.path@780298059(2848/3560 
serialized/live bytes, 59 ops)
 INFO 23:25:01,748 Enqueuing flush of 
Memtable-inode.sentinel@1934329031(2848/3560 serialized/live bytes, 59 ops)
 INFO 23:25:01,748 Enqueuing flush of Memtable-inode@1660575731(6393/7991 
serialized/live bytes, 134 ops)
 INFO 23:25:01,821 Completed flushing 
/var/lib/cassandra/data/HiveMetaStore/MetaStore-f-1-Data.db (989 bytes)
 INFO 23:25:01,832 Writing Memtable-sblocks@1075051425(3044096/3805120 
serialized/live bytes, 17 ops)
 INFO 23:25:01,927 Completed flushing 
/var/lib/cassandra/data/cfs/sblocks-f-1-Data.db (3045448 bytes)
 INFO 23:25:01,928 Writing Memtable-inode.path@780298059(2848/3560 
serialized/live bytes, 59 ops)
 INFO 23:25:01,968 Completed flushing 
/var/lib/cassandra/data/cfs/inode.path-f-1-Data.db (5346 bytes)
 INFO 23:25:01,969 Writing Memtable-inode.sentinel@1934329031(2848/3560 
serialized/live bytes, 59 ops)
 INFO 23:25:02,035 Completed flushing 
/var/lib/cassandra/data/cfs/inode.sentinel-f-1-Data.db (1735 bytes)
 INFO 23:25:02,036 Writing Memtable-inode@1660575731(6393/7991 serialized/live 
bytes, 134 ops)
 INFO 23:25:02,085 Completed flushing 
/var/lib/cassandra/data/cfs/inode-f-1-Data.db (8582 bytes)
 INFO 23:25:02,087 Log replay complete
 INFO 23:25:02,092 Cassandra version: 0.8.0-beta2-SNAPSHOT
 INFO 23:25:02,092 Thrift API version: 19.10.0
 INFO 23:25:02,092 Loading persisted ring state
 INFO 23:25:02,092 load token size: 0
 INFO 23:25:02,093 Starting up server gossip
 INFO 23:25:02,104 Enqueuing flush of Memtable-LocationInfo@22262475(29/36 
serialized/live bytes, 1 ops)
 INFO 23:25:02,105 Writing Memtable-LocationInfo@22262475(29/36 serialized/live 
bytes, 1 ops)
 INFO 23:25:02,127 Completed flushing 
/var/lib/cassandra/data/system/LocationInfo-f-3-Data.db (80 bytes)
 INFO 23:25:02,149 Starting Messaging Service on port 7000
 INFO 23:25:02,172 Using saved token 152036150612811635197207268153837644139
 INFO 23:25:02,173 Enqueuing flush of Memtable-LocationInfo@1977026981(53/66 
serialized/live bytes, 2 ops)
 INFO 23:25:02,174 Writing Memtable-LocationInfo@1977026981(53/66 
serialized/live bytes, 2 ops)
 INFO 23:25:02,190 Completed flushing 
/var/lib/cassandra/data/system/LocationInfo-f-4-Data.db (163 bytes)
 INFO 23:25:02,193 Compacting Major: 
[SSTableReader(path='/var/lib/cassandra/data/system/LocationInfo-f-2-Data.db'), 
SSTableReader(path='/var/lib/cassandra/data/system/LocationInfo-f-1-Data.db'), 
SSTableReader(path='/var/lib/cassandra/data/system/LocationInfo-f-3-Data.db'), 
SSTableReader(path='/var/lib/cassandra/data/system/LocationInfo-f-4-Data.db')]
 INFO 23:25:02,196 Will not load MX4J, mx4j-tools.jar is not in the classpath
 INFO 23:25:02,196 Starting up Hadoop trackers
 INFO 23:25:02,197 Waiting for gossip to start
 INFO 23:25:02,225 Major@1830423861(system, LocationInfo, 438/741) now 
compacting at 16777 bytes/ms.
 INFO 23:25:02,257 Compacted to 
/var/lib/cassandra/data/system/LocationInfo-tmp-f-5-Data.db.  741 to 447 (~60% 
of original) bytes for 3 keys.  Time: 64ms.
 INFO 23:25:07,272 Chose seed 10.179.96.212 as jobtracker
 WARN 23:25:09,331 Metrics system not started: Cannot locate configuration: 
tried hadoop-metrics2-jobtracker.properties, hadoop-metrics2.properties
 INFO 23:25:09,994 Chose seed 10.179.96.212 as jobtracker
 INFO 23:25:10,139 Updating the current master key for generating delegation 
tokens
 INFO 23:25:10,143 Starting expired delegation token remover thread, 
tokenRemoverScanInterval=60 min(s)
 INFO 23:25:10,143 Scheduler configured with (memSizeForMapSlotOnJT, 
memSizeForReduceSlotOnJT, limitMaxMemForMapTasks, limitMaxMemForReduceTasks) 
(-1, -1, -1, -1)
 INFO 23:25:10,144 Updating the current master key for generating delegation 
tokens
 INFO 23:25:10,145 Refreshing hosts (include/exclude) list
 INFO 23:25:10,223 Starting jobtracker with owner as riptano
 INFO 23:25:10,245 Starting SocketReader
 INFO 23:25:10,374 Logging to 
org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
 INFO 23:25:10,623 Added global filtersafety 
(class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
 INFO 23:25:10,673 Port returned by webServer.getConnectors()[0].getLocalPort() 
before open() is -1. Opening the listener on 50030
 INFO 23:25:10,673 listener.getLocalPort() returned 50030 
webServer.getConnectors()[0].getLocalPort() returned 50030
 INFO 23:25:10,674 Jetty bound to port 50030
 INFO 23:25:10,674 jetty-6.1.21
 INFO 23:25:11,140 Started SelectChannelConnector@0.0.0.0:50030
 INFO 23:25:11,147 JobTracker up at: 8012
 INFO 23:25:11,147 JobTracker webserver: 50030
 WARN 23:25:11,276 Incorrect permissions on 
cassandra://localhost:9160/tmp/hadoop-riptano/mapred/system. Setting it to 
rwx------
ERROR 23:25:11,321 Fatal exception in thread Thread[ReadStage:4,5,main]
java.lang.AssertionError: No data found for 
NamesQueryFilter(columns=java.nio.HeapByteBuffer[pos=12 lim=16 cap=17]) in 
DecoratedKey(55249227080490826413412398468829851220, 
3165333533353736613164333836353061346636333465656437326131353939):QueryPath(columnFamilyName='inode',
 superColumnName='null', columnName='null') (original filter 
NamesQueryFilter(columns=java.nio.HeapByteBuffer[pos=12 lim=16 cap=17])) from 
expression 'inode.73656e74696e656c EQ 78'
        at 
org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1513)
        at 
org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:46)
        at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
 INFO 23:25:20,059 Deleted /var/lib/cassandra/data/system/LocationInfo-f-3
 INFO 23:25:20,060 Deleted /var/lib/cassandra/data/system/LocationInfo-f-4
 INFO 23:25:20,576 Deleted /var/lib/cassandra/data/system/LocationInfo-f-1
 INFO 23:25:20,577 Deleted /var/lib/cassandra/data/system/LocationInfo-f-2
 INFO 23:25:21,297 problem cleaning system directory: 
cassandra://localhost:9160/tmp/hadoop-riptano/mapred/system
java.io.IOException: TimedOutException()
        at 
org.apache.cassandra.hadoop.fs.CassandraFileSystemThriftStore.listDeepSubPaths(CassandraFileSystemThriftStore.java:523)
        at 
org.apache.cassandra.hadoop.fs.CassandraFileSystemThriftStore.listSubPaths(CassandraFileSystemThriftStore.java:529)
        at 
org.apache.cassandra.hadoop.fs.CassandraFileSystem.listStatus(CassandraFileSystem.java:171)
        at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2374)
        at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2174)
        at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:303)
        at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:294)
        at 
org.apache.cassandra.hadoop.trackers.TrackerInitializer$1.run(TrackerInitializer.java:93)
        at java.lang.Thread.run(Thread.java:662)
Caused by: TimedOutException()
        at 
org.apache.cassandra.thrift.CassandraServer.get_indexed_slices(CassandraServer.java:673)
        at 
org.apache.cassandra.hadoop.fs.CassandraFileSystemThriftStore.listDeepSubPaths(CassandraFileSystemThriftStore.java:506)
        ... 8 more
 WARN 23:25:31,300 Incorrect permissions on 
cassandra://localhost:9160/tmp/hadoop-riptano/mapred/system. Setting it to 
rwx------
ERROR 23:25:31,315 Fatal exception in thread Thread[ReadStage:6,5,main]
java.lang.AssertionError: No data found for 
NamesQueryFilter(columns=java.nio.HeapByteBuffer[pos=12 lim=16 cap=17]) in 
DecoratedKey(55249227080490826413412398468829851220, 
3165333533353736613164333836353061346636333465656437326131353939):QueryPath(columnFamilyName='inode',
 superColumnName='null', columnName='null') (original filter 
NamesQueryFilter(columns=java.nio.HeapByteBuffer[pos=12 lim=16 cap=17])) from 
expression 'inode.73656e74696e656c EQ 78'
        at 
org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1513)
        at 
org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:46)
        at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
 INFO 23:25:41,303 problem cleaning system directory: 
cassandra://localhost:9160/tmp/hadoop-riptano/mapred/system
java.io.IOException: TimedOutException()
        at 
org.apache.cassandra.hadoop.fs.CassandraFileSystemThriftStore.listDeepSubPaths(CassandraFileSystemThriftStore.java:523)
        at 
org.apache.cassandra.hadoop.fs.CassandraFileSystemThriftStore.listSubPaths(CassandraFileSystemThriftStore.java:529)
        at 
org.apache.cassandra.hadoop.fs.CassandraFileSystem.listStatus(CassandraFileSystem.java:171)
        at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2374)
        at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2174)
        at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:303)
        at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:294)
        at 
org.apache.cassandra.hadoop.trackers.TrackerInitializer$1.run(TrackerInitializer.java:93)
        at java.lang.Thread.run(Thread.java:662)
Caused by: TimedOutException()
        at 
org.apache.cassandra.thrift.CassandraServer.get_indexed_slices(CassandraServer.java:673)
        at 
org.apache.cassandra.hadoop.fs.CassandraFileSystemThriftStore.listDeepSubPaths(CassandraFileSystemThriftStore.java:506)
        ... 8 more
 WARN 23:25:51,308 Incorrect permissions on 
cassandra://localhost:9160/tmp/hadoop-riptano/mapred/system. Setting it to 
rwx------
ERROR 23:25:51,321 Fatal exception in thread Thread[ReadStage:8,5,main]
java.lang.AssertionError: No data found for 
NamesQueryFilter(columns=java.nio.HeapByteBuffer[pos=12 lim=16 cap=17]) in 
DecoratedKey(55249227080490826413412398468829851220, 
3165333533353736613164333836353061346636333465656437326131353939):QueryPath(columnFamilyName='inode',
 superColumnName='null', columnName='null') (original filter 
NamesQueryFilter(columns=java.nio.HeapByteBuffer[pos=12 lim=16 cap=17])) from 
expression 'inode.73656e74696e656c EQ 78'
        at 
org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1513)
        at 
org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:46)
        at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

{noformat}



> getColumnFamily() return null, which is not checked in ColumnFamilyStore.java 
> scan() method, causing Timeout Exception in query
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2401
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2401
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0
>         Environment: Hector 0.7.0-28, Cassandra 0.7.4, Windows 7, Eclipse
>            Reporter: Tey Kar Shiang
>            Assignee: Jonathan Ellis
>             Fix For: 0.7.6
>
>         Attachments: 2401-v2.txt, 2401-v3.txt, 2401.txt
>
>
> ColumnFamilyStore.java, near line 1680: in "ColumnFamily data = 
> getColumnFamily(new QueryFilter(dk, path, firstFilter))", the data returned 
> is null, causing a NullPointerException in "satisfies(data, clause, primary)" 
> which is not caught. The callback times out and returns a TimeoutException 
> to Hector.
> The data is empty; as I traced it, the column count is 0 in 
> removeDeletedCF(), which returns the null there. (I am new and still trying 
> to understand the logic.) Instead of crashing on the null, could we skip 
> the data?
> About my test: a stress-test program adds, modifies, and deletes data in the 
> keyspace. I have 30 threads simulating concurrent users performing the 
> actions above, plus a periodic query of all rows. I have a column family 
> with rows (as files) and columns as indexes (e.g. userID, fileType).
> There was no issue on the first day of the test; it was then stopped for 3 
> days. When I restarted the test on the 4th day, one of the users failed to 
> query the files (a timeout exception was received); most of the users could 
> still query fine.
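The null path described in the quoted report can be sketched with a guard like the one below. This is a toy model, not the actual Cassandra code or the committed fix: getColumnFamily() is stood in for by a hypothetical helper that returns null for fully deleted rows, mirroring the removeDeletedCF() behaviour the reporter traced.

```java
import java.util.Map;

// Toy model of the missing null check in ColumnFamilyStore.scan():
// getColumnFamily() can return null when every column in the row was
// deleted (removeDeletedCF() collapses the row to null), so scan()
// must not hand that null straight to satisfies().
public class ScanGuardDemo {
    // Hypothetical stand-in for getColumnFamily(new QueryFilter(...)).
    static Map<String, Integer> getColumnFamily(boolean rowDeleted) {
        return rowDeleted ? null : Map.of("sentinel", 0x78);
    }

    // Guarded version of the satisfies() call site: a null row simply
    // fails to match the index expression instead of throwing.
    static boolean satisfiesSafely(boolean rowDeleted) {
        Map<String, Integer> data = getColumnFamily(rowDeleted);
        if (data == null)   // the check the reporter asks for
            return false;
        return data.containsKey("sentinel");
    }

    public static void main(String[] args) {
        System.out.println(satisfiesSafely(false)); // live row matches
        System.out.println(satisfiesSafely(true));  // deleted row skipped
    }
}
```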

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
