[ 
https://issues.apache.org/jira/browse/CASSANDRA-5020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528754#comment-13528754
 ] 

Pavel Yaskevich edited comment on CASSANDRA-5020 at 12/11/12 7:37 AM:
----------------------------------------------------------------------

How about we use madvise(MADV_DONTNEED) and mincore calls to check whether a 
file is still being used, instead of waiting for GC to clean up or copying 
contents on read? Basically, when a file is scheduled for deletion, its page 
cache would be dropped and then checked using mincore until 3 checks in 
succession return empty results. The interval could be set to 20-30 seconds, 
as we know that we are actually waiting for the system to "post-process", e.g. 
send pre-existing buffers to the client/coordinator, and there is no way for 
new data to be read from that file.
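
To make the "3 checks in succession" rule concrete, here is a minimal Java 
sketch of just the counting logic. The class name ResidencyPoller and the 
BooleanSupplier probe are hypothetical: the probe stands in for a native 
mincore call (for example through JNA), which is not shown here and is not an 
existing Cassandra API. Whether mincore is a reliable "still in use" signal 
after madvise is the premise of the proposal and is not verified by this sketch.

```java
import java.util.function.BooleanSupplier;

/**
 * Hypothetical helper: tracks how many residency checks in a row came back
 * empty for a file that has been scheduled for deletion.
 */
final class ResidencyPoller {
    // Assumption: returns true if any page of the file is still resident,
    // e.g. by wrapping a native mincore(2) call (not implemented here).
    private final BooleanSupplier anyPageResident;
    private final int requiredEmptyChecks;
    private int consecutiveEmpty = 0;

    ResidencyPoller(BooleanSupplier anyPageResident, int requiredEmptyChecks) {
        if (requiredEmptyChecks < 1) {
            throw new IllegalArgumentException("requiredEmptyChecks must be >= 1");
        }
        this.anyPageResident = anyPageResident;
        this.requiredEmptyChecks = requiredEmptyChecks;
    }

    /** Runs one check; returns true once enough checks in a row were empty. */
    boolean checkOnce() {
        if (anyPageResident.getAsBoolean()) {
            consecutiveEmpty = 0; // pages were touched again: restart the count
        } else {
            consecutiveEmpty++;
        }
        return consecutiveEmpty >= requiredEmptyChecks;
    }

    /** Polls at a fixed interval until deletion looks safe; returns the number of checks made. */
    int awaitSafeToDelete(long intervalMillis) throws InterruptedException {
        int checks = 1;
        while (!checkOnce()) {
            Thread.sleep(intervalMillis);
            checks++;
        }
        return checks;
    }
}
```

With the numbers from the comment, a caller would construct it with 
requiredEmptyChecks = 3 and call awaitSafeToDelete with an interval of 
20000-30000 ms before unlinking the file.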

Edit: as a second option, we could mmap segments on demand and hold them 
through a WeakReference so they can be reclaimed when no longer needed; whether 
the mmap call overhead is acceptable would have to be measured.
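
A rough Java sketch of the on-demand idea, assuming a hypothetical LazySegment 
class (the name, fields, and constructor are illustrative, not existing 
Cassandra code). It maps the region on first use and keeps only a weak 
reference, so the mapping can be collected once no reader holds the buffer. 
It says nothing about the cost of the repeated map calls.

```java
import java.io.IOException;
import java.lang.ref.WeakReference;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical segment that is mmap'd lazily and only weakly retained. */
final class LazySegment {
    private final Path path;
    private final long offset;
    private final long length;
    private WeakReference<MappedByteBuffer> ref = new WeakReference<>(null);

    LazySegment(Path path, long offset, long length) {
        this.path = path;
        this.offset = offset;
        this.length = length;
    }

    /** Returns the mapped region, re-mapping it if the previous mapping was collected. */
    synchronized MappedByteBuffer buffer() throws IOException {
        MappedByteBuffer mapped = ref.get();
        if (mapped == null) {
            // A mapping stays valid after its channel is closed.
            try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
                mapped = channel.map(FileChannel.MapMode.READ_ONLY, offset, length);
            }
            ref = new WeakReference<>(mapped);
        }
        return mapped;
    }
}
```

Readers hold a strong reference to the returned buffer only for the duration 
of a read; once they drop it, the GC is free to reclaim the mapping.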
                
      was (Author: xedin):
    How about we use madvise(MADV_DONTNEED) and mincore calls to check whether 
a file is still being used, instead of waiting for GC to clean up or copying 
contents on read? Basically, when a file is scheduled for deletion, its page 
cache would be dropped and then checked using mincore until 3 checks in 
succession return empty results. The interval could be set to 20-30 seconds, 
as we know that we are actually waiting for the system to "post-process", e.g. 
send pre-existing buffers to the client/coordinator, and there is no way for 
new data to be read from that file.
                  
> Time to switch back to byte[] internally?
> -----------------------------------------
>
>                 Key: CASSANDRA-5020
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5020
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>             Fix For: 2.0
>
>
> We switched to ByteBuffer for column names and values back in 0.7, which gave 
> us a short-term performance boost on mmap'd reads, but we gave that up when 
> we switched to refcounted sstables in 1.0.  (Refcounting all the way up the 
> read path would be too painful, so we copy into an on-heap buffer when 
> reading from an sstable, then release the reference.)
> A HeapByteBuffer wastes a lot of memory compared to a byte[] (5 more ints, a 
> long, and a boolean).
> The hard problem here is how to do the arena allocation we do on writes, 
> which has been very successful in reducing STW CMS collections caused by 
> heap fragmentation.  ByteBuffer is a good fit there.
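
For readers unfamiliar with the arena allocation mentioned in the description, 
the following Java sketch shows the general technique: values are copied into 
large shared slabs and handed out as ByteBuffer slices, so the heap holds a few 
big arrays instead of many small ones. SlabArena and its method names are 
hypothetical and simplified (single-threaded, no slab reuse); this is not 
Cassandra's actual allocator.

```java
import java.nio.ByteBuffer;

/** Hypothetical single-threaded arena that packs values into fixed-size slabs. */
final class SlabArena {
    private final int slabSize;
    private ByteBuffer current;

    SlabArena(int slabSize) {
        if (slabSize < 1) {
            throw new IllegalArgumentException("slabSize must be >= 1");
        }
        this.slabSize = slabSize;
        this.current = ByteBuffer.allocate(slabSize);
    }

    /** Copies value into the current slab and returns a slice covering just that copy. */
    ByteBuffer copy(byte[] value) {
        if (value.length > slabSize) {
            // Oversized values get their own array instead of a slab region.
            return ByteBuffer.wrap(value.clone());
        }
        if (current.remaining() < value.length) {
            // Not enough room left: start a new slab (the old one stays alive
            // for as long as any slice of it is referenced).
            current = ByteBuffer.allocate(slabSize);
        }
        ByteBuffer view = current.duplicate();
        view.limit(view.position() + value.length);
        ByteBuffer slice = view.slice();
        slice.put(value);
        slice.flip();
        current.position(current.position() + value.length);
        return slice;
    }
}
```

The slice carries its own offset and length into the shared array, which is 
the part a bare byte[] cannot express without extra bookkeeping.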
