[ 
https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215535#comment-13215535
 ] 

Sylvain Lebresne commented on CASSANDRA-3862:
---------------------------------------------

Remarks on v6:
* Since we don't add anything to the sentinel, it has no reason to be a 
subclass of ColumnFamily. We should probably create a CachedRow class extended 
by both Sentinel (which would really just be an identifier, no metadata needed) 
and ColumnFamily, and use that as the cache value type. It'll be cleaner and, 
more importantly, more type safe (a cache lookup won't be able to ignore by 
mistake that it could get a sentinel).
* Not adding anything to the sentinel also means that in getThroughCache the 
counter special case is not needed anymore.
* In getThroughCache, if we fail to replace the sentinel, I think we should 
still return the data rather than looping and re-reading. Better to let the 
next client read cache the data than to get a crappy latency on the current 
read.
* Is it really an improvement to use UUIDs (over an AtomicLong)? I have nothing 
against UUIDs per se, but a UUID takes twice the space (and we serialize them) 
and, without having benchmarked it, I'm willing to bet AtomicLong values are 
much faster to generate. And let's be honest, the risk of overflow with an 
AtomicLong is science fiction (to be precise, at 1 million sentinels created 
per second (which is *way* more than we'll ever see), you'd need more than 
100,000 years of uptime to overflow).
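
To make the remarks above concrete, here is a minimal sketch of what they could 
look like together. It is not the code from any of the attached patches. 
CachedRow is modelled as an interface rather than a class, RowData is a 
hypothetical stand-in for ColumnFamily, the String key and the Supplier-based 
read are simplifications, and the invalidate() hook for writers is an 
assumption about how a racing write would knock out a pending sentinel.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical common type for every value stored in the row cache.
interface CachedRow {}

// Placeholder stored while a read is in flight: just an identifier, no metadata.
final class Sentinel implements CachedRow {
    private static final AtomicLong NEXT_ID = new AtomicLong();
    final long id = NEXT_ID.getAndIncrement();

    @Override public boolean equals(Object o) {
        return o instanceof Sentinel && ((Sentinel) o).id == id;
    }
    @Override public int hashCode() { return Long.hashCode(id); }
}

// Stand-in for ColumnFamily (assumption: the real class holds columns).
final class RowData implements CachedRow {
    final String contents;
    RowData(String contents) { this.contents = contents; }
}

final class RowCacheSketch {
    final ConcurrentMap<String, CachedRow> cache = new ConcurrentHashMap<>();

    RowData getThroughCache(String key, Supplier<RowData> readFromStorage) {
        CachedRow cached = cache.get(key);
        // The CachedRow type forces callers to consider both cases.
        if (cached instanceof RowData)
            return (RowData) cached;
        if (cached instanceof Sentinel)
            return readFromStorage.get(); // another read is pending; don't cache

        Sentinel sentinel = new Sentinel();
        boolean installed = cache.putIfAbsent(key, sentinel) == null;
        RowData data = readFromStorage.get();
        if (installed) {
            // Succeeds only if our own sentinel is still in place. If it is
            // gone, we do not loop and re-read; a later read will cache.
            cache.replace(key, sentinel, data);
        }
        return data;
    }

    // Assumed writer-side hook: a write to the row drops whatever is cached,
    // including a pending sentinel.
    void invalidate(String key) { cache.remove(key); }
}
```

With this shape, a read that races with a write still returns its data to the 
client but leaves nothing in the cache, so the next read repopulates it. On the 
overflow point: a signed 64-bit counter has 2^63 (about 9.2 * 10^18) positive 
values, which at 10^6 sentinels per second is roughly 9.2 * 10^12 seconds, or 
about 292,000 years.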

                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: Daniel Doubleday
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3862-cleanup.txt, 3862-v2.patch, 3862-v4.patch, 
> 3862-v5.txt, 3862-v6.txt, 3862.patch, 3862_v3.patch, 
> include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I 
> guess I (re-)found one for the standard on-heap row cache.
> During my stress test I have lots of threads running, some of them only 
> reading, others writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not 
> eagerly read during write (because we want fast writes) so the writer cannot 
> perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked this some time ago on the mailing list but unfortunately 
> didn't dig further after I got no answer, since I assumed that I had just 
> missed something. In a way I still do, but I haven't found any locking 
> mechanism that makes sure that this cannot happen.
> The problem can be reproduced with every run of my stress test. When I 
> restart the server the expected column is there. It's just missing from the 
> cache.
> To test, I have created a patch that merges memtables with the row cache. With 
> the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any 
> relevant change there either, so I assume the same applies there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
