[ 
https://issues.apache.org/jira/browse/CASSANDRA-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101856#comment-14101856
 ] 

Benedict commented on CASSANDRA-6726:
-------------------------------------

Thanks. Some comments:

* It looks like we're still recycling the bytebuffer we've allocated _along 
with the RAR_ into FileCacheService which is what we're trying to avoid - i.e. 
in close() we should recycle the bytebuffer whether or not we deallocate, and 
rewire it up next time we're returned
* I appreciate what you're doing with the logarithmic buckets and special pool 
classes, but we already have a morass of pooling classes and I'd prefer not to 
complicate that further unless necessary. Here I think we can do much less and 
achieve the same end: currently the block size is fixed (and this storage 
format is has an expiry date attached to it, so this is unlikely to change), so 
a ThreadLocal ArrayDeque containing buffers all of DEFAULT_BUFFER_SIZE should 
be sufficient, perhaps with a low ceiling on size and a shared fallback queue 
for when that one is exausted, so that we still function well on small numbers 
of huge workloads. Let's keep it all within FileCacheService since that's what 
currently manages the cache, although it's conflating concepts a bit
* To clarify this, we could remove the bufferSize parameter so it's always 
clearly DEFAULT_BUFFER_SIZE
* We should probably rename file_cache_size_in_mb to something more accurate 
like file_buffer_pool_size_in_mb
* We should probably introduce a new max_file_cache_handles property, and 
FileCacheService should pool files based on this property, and no longer pay 
attention to getTotalBufferSize()
* There are a few places you allocate non-tiny temporary arrays inside of 
methods. It would be preferable to use a ThreadLocal<byte[]> so they can be 
reused; we do this in a few places, so it might be nice to coalesce them all 
into a ByteBufferUtil method so we don't litter the whole codebase with this 
pattern (currently happens in PureJavaCrc32 and AbstractNativeCell). They can 
all use arrays of the same size - these other two use 256 vs. your 4096; IMO 
256 or 512 is plenty for all of these scenarios (although feel free to 
benchmark the difference; we do have particularly large chunks of memory to 
checksum in the reader)
* Possibly ByteArrayCompressor.uncompress should also have its own ThreadLocal 
temporary array of arbitrary (max encountered) size limit instead of delegating 
to getArray, but this is not a big deal since we don't expect to exercise this 
codepath 

bq. so please let me know of any style, testing, etc. issues, or anything I may 
be missing about the Cassandra review process.

Patch looks good style-wise. Standard practice is to name a diff as #.txt, in 
this case 6726.txt. Personally I much prefer to review branches posted to 
github, for all but the most trivial patches.

> Recycle CompressedRandomAccessReader/RandomAccessReader buffers independently 
> of their owners, and move them off-heap when possible
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6726
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6726
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>            Assignee: Branimir Lambov
>            Priority: Minor
>              Labels: performance
>             Fix For: 3.0
>
>         Attachments: cassandra-6726.patch
>
>
> Whilst CRAR and RAR are pooled, we could and probably should pool the buffers 
> independently, so that they are not tied to a specific sstable. It may be 
> possible to move the RAR buffer off-heap, and the CRAR sometimes (e.g. Snappy 
> may possibly support off-heap buffers)



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to