[ https://issues.apache.org/jira/browse/CASSANDRA-2401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029366#comment-13029366 ]
Sylvain Lebresne commented on CASSANDRA-2401:
---------------------------------------------

Comments on the patch:
  * ignoreObsoleteMutation() now forgets to actually remove the obsolete mutation from cf.
  * not sure why mutatedIndexColumns needs to be concurrent. There is no concurrency in ignoreObsoleteMutation, is there?
  * really minor: the change to the debug log "Scanning index row %s ..." seems misleading, since the first argument is not a row name.

Other than that, I do agree with you that there is quite probably a race between reads and concurrent writes. But I also agree that it doesn't seem to be the problem here.

> getColumnFamily() return null, which is not checked in ColumnFamilyStore.java
> scan() method, causing Timeout Exception in query
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2401
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2401
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0
>         Environment: Hector 0.7.0-28, Cassandra 0.7.4, Windows 7, Eclipse
>            Reporter: Tey Kar Shiang
>            Assignee: Jonathan Ellis
>             Fix For: 0.7.6
>
>         Attachments: 2401.txt
>
>
> In ColumnFamilyStore.java, near line 1680, "ColumnFamily data =
> getColumnFamily(new QueryFilter(dk, path, firstFilter))" returns null for
> data, causing a null pointer exception in "satisfies(data, clause, primary)"
> which is not caught. The callback then times out and returns a Timeout
> exception to Hector.
> The data is empty: as I traced it, the column count is 0 in
> removeDeletedCF(), which returns null there. (I am new and still trying to
> understand the logic around it.) Instead of crashing on null, could we
> skip the data?
> About my test:
> A stress-test program adds, modifies and deletes data in a keyspace. I have 30
> threads simulating concurrent users performing the actions above, and
> periodically querying all rows.
> I have a column family with rows (as files) and columns used as indexes
> (e.g. userID, fileType).
> There was no issue on the first day of the test, which I then stopped for
> 3 days. When I restarted the test on the 4th day, one of the users failed to
> query the files (timeout exception received). Most of the users can still
> query without problems.
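
The null handling asked about in the report could look roughly like the sketch below. It is a minimal, self-contained illustration, not the actual ColumnFamilyStore code and not the attached 2401.txt patch. The class NullSafeScanSketch, the map-based store, and the simplified getColumnFamily/satisfies/scan signatures are all assumptions made for this example. The only idea taken from the report is that a lookup may return null for a row with no live columns, and that the scan should skip such a row rather than pass null on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NullSafeScanSketch {
    // Hypothetical stand-in for a row lookup. It returns null when the row
    // is missing or has no live columns, as described in the report.
    static Map<String, String> getColumnFamily(
            Map<String, Map<String, String>> store, String key) {
        Map<String, String> columns = store.get(key);
        return (columns == null || columns.isEmpty()) ? null : columns;
    }

    // Hypothetical predicate: does the row hold the expected column value?
    // It assumes data is non-null, so callers must check first.
    static boolean satisfies(Map<String, String> data, String column, String value) {
        return value.equals(data.get(column));
    }

    // Scans candidate keys and skips rows whose lookup returned null
    // instead of letting the null reach satisfies().
    static List<String> scan(Map<String, Map<String, String>> store,
                             List<String> candidateKeys,
                             String column, String value) {
        List<String> matches = new ArrayList<>();
        for (String key : candidateKeys) {
            Map<String, String> data = getColumnFamily(store, key);
            if (data == null) {
                continue; // row was emptied (e.g. by a concurrent delete)
            }
            if (satisfies(data, column, value)) {
                matches.add(key);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> store = new HashMap<>();
        Map<String, String> file1 = new HashMap<>();
        file1.put("userID", "u1");
        Map<String, String> file3 = new HashMap<>();
        file3.put("userID", "u2");
        store.put("file1", file1);
        store.put("file2", new HashMap<>()); // all columns deleted
        store.put("file3", file3);

        List<String> keys = Arrays.asList("file1", "file2", "file3");
        System.out.println(scan(store, keys, "userID", "u1")); // prints [file1]
    }
}
```

Skipping the row keeps the query from failing, but it does not explain why an index entry pointed at an emptied row in the first place. That is where the race between reads and concurrent writes mentioned in the comment would come in.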