[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265947#comment-14265947 ]
Robert Stupp commented on CASSANDRA-7438:
-----------------------------------------

The latest (just checked in) benchmark implementation gives much better results. Using {{com.codahale.metrics.Timer#time(java.util.concurrent.Callable<T>)}} eliminates the use of {{System.nanoTime()}} or {{ThreadMXBean.getCurrentThreadCpuTime()}} - it can directly use its internal clock.

The benchmark
{{java -jar ohc-benchmark/target/ohc-benchmark-0.2-SNAPSHOT.jar -rkd 'gaussian(1..20000000,2)' -wkd 'gaussian(1..20000000,2)' -vs 'gaussian(1024..4096,2)' -r .9 -cap 1600000000 -d 30 -t 30}}
improved from 800k reads to 3.3M reads per second (with 8 cores).

So yes - the benchmark was measuring its own mad code. Because of that, I edited my previous comment with the benchmark results, since those are invalid now.

I've added a (still simple) JMH benchmark as a separate module. This one can cause high system CPU usage - at operation rates of 2M per second or more (8 cores). I think these rates are really fine.

Note: these rates cannot be achieved in production, since there you'll obviously have to pay for (de)serialization, too.

So we want to address these topics as follow-up:
* own off-heap allocator
* C* ability to access off-heap cached rows
* C* ability to serialize hot keys directly from off-heap (might be a minor win since it's not triggered that often)
* per-table knob to control whether to add to row-cache on writes -- I strongly believe that this is a useful feature (maybe LHF) on workloads where read and written data work on different (row) keys.
* investigate if counter-cache can benefit
* investigate if key-cache can benefit

bq. You could start with it outside and publish to maven central and if there is an issue getting patches applied quickly we can always fork it in C*.

OK

bq. pluggable row cache

Then I'll start with that - just make row-cache pluggable and the implementation configurable.
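
To illustrate what "pluggable and configurable" could look like, here is a minimal sketch using only the JDK. It is not the actual Cassandra or OHC code: {{SimpleCache}}, {{SimpleCacheProvider}}, {{OnHeapLruProvider}} and {{load}} are hypothetical names made up for this example, and the on-heap LRU map is only a stand-in for a real off-heap implementation. The idea is that the provider class name would come from configuration and be instantiated reflectively.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PluggableCacheSketch {
    /** Hypothetical, minimal cache contract (the real ICache has more methods). */
    public interface SimpleCache<K, V> {
        void put(K key, V value);
        V get(K key);
        int size();
    }

    /** Hypothetical factory that each pluggable implementation would supply. */
    public interface SimpleCacheProvider<K, V> {
        SimpleCache<K, V> create(int capacity);
    }

    /** Stand-in implementation: a synchronized on-heap LRU map. */
    public static class OnHeapLruProvider<K, V> implements SimpleCacheProvider<K, V> {
        public SimpleCache<K, V> create(final int capacity) {
            // accessOrder=true makes iteration order least-recently-used first
            final Map<K, V> map = new LinkedHashMap<K, V>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                    return size() > capacity; // evict once capacity is exceeded
                }
            };
            return new SimpleCache<K, V>() {
                public synchronized void put(K key, V value) { map.put(key, value); }
                public synchronized V get(K key) { return map.get(key); }
                public synchronized int size() { return map.size(); }
            };
        }
    }

    /** Instantiates the provider named in (assumed) configuration and builds a cache. */
    @SuppressWarnings("unchecked")
    public static <K, V> SimpleCache<K, V> load(String providerClassName, int capacity) throws Exception {
        SimpleCacheProvider<K, V> provider = (SimpleCacheProvider<K, V>)
                Class.forName(providerClassName).getDeclaredConstructor().newInstance();
        return provider.create(capacity);
    }

    public static void main(String[] args) throws Exception {
        // In a real setup the class name would be read from a config file.
        SimpleCache<String, String> cache = load(OnHeapLruProvider.class.getName(), 2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // touch "a" so that "b" becomes the eldest entry
        cache.put("c", "3"); // exceeds capacity, evicts "b"
        System.out.println("size=" + cache.size() + " b=" + cache.get("b"));
    }
}
```

With this shape, swapping in a different implementation would only require naming another provider class; callers of {{SimpleCache}} stay unchanged.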

Note: JNA has a synchronized block that's executed at every call - version 4.2.0 fixes this (don't know when it will be released).

> Serializing Row cache alternative (Fully off heap)
> --------------------------------------------------
>
>                 Key: CASSANDRA-7438
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: Linux
>            Reporter: Vijay
>            Assignee: Vijay
>              Labels: performance
>             Fix For: 3.0
>
>         Attachments: 0001-CASSANDRA-7438.patch, tests.zip
>
>
> Currently SerializingCache is partially off heap, keys are still stored in
> JVM heap as BB,
> * There are higher GC costs for a reasonably big cache.
> * Some users have used the row cache efficiently in production for better
> results, but this requires careful tuning.
> * Memory overhead for the cache entries is relatively high.
> So the proposal for this ticket is to move the LRU cache logic completely off
> heap and use JNI to interact with the cache. We might want to ensure that the new
> implementation matches the existing APIs (ICache), and the implementation
> needs to have safe memory access, low overhead in memory and fewer memcpy's
> (as much as possible).
> We might also want to make this cache configurable.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)