[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14274409#comment-14274409 ]
Ariel Weisberg commented on CASSANDRA-7438:
-------------------------------------------

I did another review. The additional test coverage looks great.

Don't throw Error; throw runtime exceptions on things like serialization issues. The only place it makes sense to throw Error is when allocating memory fails. That would match the behavior of ByteBuffer.allocateDirect. I don't see failure to allocate from the heap allocator as recoverable, even in the context of the cache. IOError is thrown from one place in the entire JDK (Console), so it's an odd choice.

freeCapacity should really be a field inside each segment, and full/not-full and eviction decisions should be made inside each segment independently. In practice inside C* it's probably fine as just an AtomicLong, but I want to see OHC be all it can be.

The rehash test could validate the data after the rehash. It could also validate the rehash under concurrent access, say by having a reader thread randomly access values < the already inserted value. I can't tell if the crosscheck test inserts enough values to trigger rehashing.

Inlining the murmur3 changes makes me a little uncomfortable. It's good to see some test coverage comparing with another implementation, but it's over a small set of data. It seems like the unsigned handling necessary to perfectly mimic the native version of murmur3 is missing? Add 2-4 byte code points to the UTF-8 tests.

FastUtil is a 17-megabyte dependency, all to get one array list.

The cross-checking implementation is really nice. Looking at AbstractKeyIterator, I don't see how it can do the right thing when a segment rehashes. It will point to a random spot in the segment after a rehash, right? In practice maybe this doesn't matter, since the segments should size up promptly and it's just an optimization that we dump this stuff at all. I can understand what the current code does, so I lean towards keeping it.
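The per-segment freeCapacity point could look roughly like the following. This is a minimal sketch, not the patch's actual API: Segment, reserve, release, and evictOne are hypothetical names; the idea is only that each segment owns its own counter and makes eviction decisions locally instead of consulting one global AtomicLong.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical segment with its own capacity accounting.
class Segment {
    private final AtomicLong freeCapacity;

    Segment(long capacity) {
        this.freeCapacity = new AtomicLong(capacity);
    }

    /** Reserve space for an entry; evict locally until it fits, or fail. */
    boolean reserve(long bytes) {
        while (true) {
            long free = freeCapacity.get();
            if (free < bytes) {
                // Segment-local decision: no global state consulted.
                if (!evictOne()) return false;
                continue;
            }
            if (freeCapacity.compareAndSet(free, free - bytes)) return true;
        }
    }

    /** Return space when an entry is removed or freed. */
    void release(long bytes) {
        freeCapacity.addAndGet(bytes);
    }

    private boolean evictOne() {
        // A real cache would drop this segment's LRU entry and call
        // release(); the sketch always fails to stay short.
        return false;
    }

    long free() { return freeCapacity.get(); }
}
```

Whether the CAS loop or a per-segment lock is the right tool depends on how the rest of the segment is synchronized; the point is only the ownership boundary.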
There are a couple of places (serializeForPut, putInternal, maybe others) where two exception handlers each de-allocate the same piece of memory. The deallocation could go in a finally instead of the exception handlers, since it always happens.

> Serializing Row cache alternative (Fully off heap)
> --------------------------------------------------
>
>                 Key: CASSANDRA-7438
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: Linux
>            Reporter: Vijay
>            Assignee: Robert Stupp
>              Labels: performance
>             Fix For: 3.0
>
>     Attachments: 0001-CASSANDRA-7438.patch, tests.zip
>
>
> Currently SerializingCache is partially off heap; keys are still stored in
> the JVM heap as ByteBuffers:
> * There is a higher GC cost for a reasonably big cache.
> * Some users have used the row cache efficiently in production for better
> results, but this requires careful tuning.
> * Memory overhead for the cache entries is relatively high.
> So the proposal for this ticket is to move the LRU cache logic completely off
> heap and use JNI to interact with the cache. We might want to ensure that the
> new implementation matches the existing APIs (ICache), and the implementation
> needs to have safe memory access, low memory overhead, and fewer memcpys
> (as much as possible).
> We might also want to make this cache configurable.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)