[ 
https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-3862:
----------------------------------------

    Attachment: 3862-v4.patch

Actually, the handling of the copying cache by the preceding patches is wrong.  
When a put arrives and there is a sentinel, the patch does not add the put to 
the sentinel correctly. But thinking about it, for the copying cache, we should 
avoid having writes check the current value in the cache, because that would 
have a non-negligible performance impact. What we should do instead is let 
invalidate actually invalidate sentinels. The only problem we face if we do 
that is that when a read-for-caching returns, it must make sure its own 
sentinel hasn't been invalidated. In particular, it must be careful of the case 
where the sentinel has been invalidated and another read has set another 
sentinel.

Anyway, attaching a v4 (that includes the comment cleanups) that chooses that 
strategy instead (and thus is, hopefully, not buggy even in the copying cache 
case). Note that this means reads must be able to identify sentinels uniquely 
(not based on their content), so the code assigns a unique ID to each sentinel 
and uses that for comparison.
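
To make the strategy above concrete, here is a minimal sketch of the idea. It 
is not the code from 3862-v4.patch: the class and method names 
(SentinelRowCache, beginRead, finishRead) are hypothetical, rows are stood in 
for by plain strings, and a ConcurrentHashMap is assumed as the backing store. 
The read claims the key with a sentinel, the write invalidates whatever is 
there without looking at it, and the read only publishes its row if its own 
sentinel (matched by ID) is still in place.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch; names are illustrative and not taken from the patch. */
final class SentinelRowCache {
    /** A cache slot holds either a Sentinel (read in progress) or a Row. */
    interface Entry {}

    static final class Row implements Entry {
        final String data;
        Row(String data) { this.data = data; }
    }

    static final class Sentinel implements Entry {
        private static final AtomicLong NEXT_ID = new AtomicLong();
        final long id = NEXT_ID.getAndIncrement();

        // Sentinels are compared by their unique ID, never by content,
        // so the comparison still works if the cache copies its values.
        @Override public boolean equals(Object o) {
            return o instanceof Sentinel && ((Sentinel) o).id == id;
        }
        @Override public int hashCode() { return Long.hashCode(id); }
    }

    private final ConcurrentHashMap<String, Entry> map = new ConcurrentHashMap<>();

    /** Read path, step 1: claim the key. Returns null if an entry already exists. */
    Sentinel beginRead(String key) {
        Sentinel s = new Sentinel();
        return map.putIfAbsent(key, s) == null ? s : null;
    }

    /** Read path, step 2: publish the row only if our own sentinel is still there. */
    boolean finishRead(String key, Sentinel mine, String data) {
        return map.replace(key, mine, new Row(data));
    }

    /** Write path: drop whatever is cached (row or sentinel) without reading it. */
    void invalidate(String key) { map.remove(key); }

    /** Returns the cached row data, or null on a miss or while a read is pending. */
    String get(String key) {
        Entry e = map.get(key);
        return e instanceof Row ? ((Row) e).data : null;
    }
}
```

In this sketch, a reader whose sentinel was invalidated by a write gets false 
from finishRead even if a second reader has since installed a different 
sentinel for the same key, so the stale row never reaches the cache.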

                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Daniel Doubleday
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3862-cleanup.txt, 3862-v2.patch, 3862-v4.patch, 
> 3862.patch, 3862_v3.patch, include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I 
> guess I (re-)found one for the standard on-heap row cache.
> During my stress test I have lots of threads running, with some of them only 
> reading and others writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not 
> eagerly read during write (because we want fast writes) so the writer cannot 
> perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked about this some time ago on the mailing list but 
> unfortunately didn't dig further after I got no answer, since I assumed that 
> I had just missed something. In a way I still do, but I haven't found any 
> locking mechanism that makes sure that this cannot happen.
> The problem can be reproduced with every run of my stress test. When I 
> restart the server the expected column is there. It's just missing from the 
> cache.
> To test I have created a patch that merges memtables with the row cache. With 
> the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any 
> relevant change there either so I assume the same applies there.


        
