[ https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201436#comment-13201436 ]

Sylvain Lebresne commented on CASSANDRA-3862:
---------------------------------------------

I believe you are absolutely right that this is a bug.

Unfortunately I don't think including the memtables during cache reads really 
solves it. If you miss an update, it won't ever get added to the cached row, 
and the update itself will be flushed at some point and thus no longer be in 
any memtable, at which point the cached row is stale again.

One partial solution I see: when a read 'reads for caching', it starts by 
adding a sentinel object to the cache for the given row key. That sentinel 
would need to be an actual (empty) row, but marked as being only a sentinel. 
When a write checks whether the row is cached and finds a sentinel, we would 
add the write to the sentinel. Once the read returns and we actually put the 
row in the cache, we would merge it (atomically) with the content of the 
sentinel. A read that checks the cache and sees a sentinel would just skip 
the cache (and would not put its result into the cache). Adapting that to the 
serializingCache is trivial.
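
To make the sequence above concrete, here is a minimal sketch in Java. It is 
not Cassandra code: the class SentinelRowCache, its Row type, and the methods 
beginCachingRead, applyWrite, finishCachingRead and getIfCached are 
hypothetical names, rows are reduced to a column-name-to-value map, and 
"merge" is simplified to "the write captured in the sentinel wins" rather 
than real timestamp reconciliation. It relies only on the per-key atomicity 
of ConcurrentHashMap.putIfAbsent and computeIfPresent.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of the sentinel idea; not Cassandra's actual cache. */
class SentinelRowCache {
    /** Immutable row snapshot. A non-null owner marks the row as a sentinel. */
    static final class Row {
        final Object owner;
        final Map<String, String> columns;

        Row(Object owner, Map<String, String> columns) {
            this.owner = owner;
            this.columns = Collections.unmodifiableMap(new TreeMap<>(columns));
        }

        /** Copy of this row with one more column applied (copy-on-write). */
        Row with(String column, String value) {
            Map<String, String> copy = new TreeMap<>(columns);
            copy.put(column, value);
            return new Row(owner, copy);
        }
    }

    private final ConcurrentHashMap<String, Row> cache = new ConcurrentHashMap<>();

    /** Returns a cached row's columns, or null on a miss or when only a sentinel is present. */
    public Map<String, String> getIfCached(String key) {
        Row row = cache.get(key);
        return (row == null || row.owner != null) ? null : row.columns;
    }

    /**
     * Installs an empty sentinel before reading from storage. Returns a token
     * identifying the sentinel, or null if the key is already occupied (in
     * which case the caller must not cache its result).
     */
    public Object beginCachingRead(String key) {
        Object token = new Object();
        Row sentinel = new Row(token, Collections.<String, String>emptyMap());
        return cache.putIfAbsent(key, sentinel) == null ? token : null;
    }

    /** Write path: update whatever is cached, whether a real row or a sentinel. */
    public void applyWrite(String key, String column, String value) {
        cache.computeIfPresent(key, (k, row) -> row.with(column, value));
    }

    /** Atomically replaces our sentinel with storage data merged with captured writes. */
    public void finishCachingRead(String key, Object token, Map<String, String> fromStorage) {
        cache.computeIfPresent(key, (k, current) -> {
            if (current.owner != token) {
                return current; // not our sentinel anymore: leave the entry alone
            }
            Map<String, String> merged = new TreeMap<>(fromStorage);
            merged.putAll(current.columns); // writes seen during the read win
            return new Row(null, merged);
        });
    }
}
```

In this sketch a write that lands between beginCachingRead and 
finishCachingRead ends up in the cached row even if the storage read did not 
see it. A write may be applied twice (once from storage, once from the 
sentinel), which is harmless here only because applying the same column value 
twice is idempotent.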

Unfortunately, this is not perfect because this would screw up counters. 
Though I guess for counters we could do the same thing as we would do for the 
serializingCache, i.e., if a read that 'reads for caching' sees that the 
sentinel is not empty, we would just not cache the result (i.e., a row would 
be cached only if we are sure no writes were done concurrently with the read).
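
The "cache only if no concurrent write" variant can be sketched in the same 
way. Again this is an illustration under assumptions, not Cassandra code: the 
class CacheOnlyIfQuiet and its methods are hypothetical, a cached value is 
reduced to a single Long, the sentinel only records whether a write touched 
it, and a write to an already cached value simply invalidates the entry.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: cache a read result only if no write raced with the read. */
class CacheOnlyIfQuiet {
    /** Marker stored in the cache while a caching read is in flight. */
    static final class Sentinel {
        volatile boolean touched = false;
    }

    // Values are either a Sentinel or a cached Long.
    private final ConcurrentHashMap<String, Object> cache = new ConcurrentHashMap<>();

    /** Returns the cached value, or null on a miss or when only a sentinel is present. */
    public Long getIfCached(String key) {
        Object v = cache.get(key);
        return (v instanceof Long) ? (Long) v : null;
    }

    /** Installs a sentinel; returns null if the key is already occupied. */
    public Sentinel beginCachingRead(String key) {
        Sentinel s = new Sentinel();
        return cache.putIfAbsent(key, s) == null ? s : null;
    }

    /** Write path: mark an in-flight sentinel as touched, or invalidate a cached value. */
    public void noteWrite(String key) {
        cache.computeIfPresent(key, (k, v) -> {
            if (v instanceof Sentinel) {
                ((Sentinel) v).touched = true;
                return v;
            }
            return null; // returning null removes the cached value
        });
    }

    /**
     * Caches the value only if our sentinel is still in place and untouched.
     * Otherwise the sentinel is dropped and nothing is cached.
     * Returns true if the value was cached.
     */
    public boolean finishCachingRead(String key, Sentinel s, Long value) {
        boolean[] cached = {false};
        cache.computeIfPresent(key, (k, v) -> {
            if (v != s) {
                return v;    // someone else owns this entry
            }
            if (s.touched) {
                return null; // a write raced with the read: cache nothing
            }
            cached[0] = true;
            return value;
        });
        return cached[0];
    }
}
```

Because noteWrite and finishCachingRead both go through computeIfPresent on 
the same key, a write is either seen by the sentinel before the swap or 
applied to the entry after it, so the sketch never caches a value that missed 
a write. The cost is that a row under constant write load may never get 
cached.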
                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.7
>            Reporter: Daniel Doubleday
>         Attachments: include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I 
> guess I (re-)found one for the standard on-heap row cache.
> During my stress test I have lots of threads running, some of them only 
> reading, others writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not 
> eagerly read during write (because we want fast writes) so the writer cannot 
> perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked about this some time ago on the mailing list but 
> unfortunately didn't dig further after I got no answer, since I assumed 
> that I had just missed something. In a way I still do, but I haven't found 
> any locking mechanism that makes sure that this cannot happen.
> The problem can be reproduced with every run of my stress test. When I 
> restart the server the expected column is there. It's just missing from the 
> cache.
> To test I have created a patch that merges memtables with the row cache. With 
> the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any 
> relevant change there either so I assume the same applies there.


        
