[ 
https://issues.apache.org/jira/browse/CASSANDRA-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019491#comment-13019491
 ] 

Peter Schuller commented on CASSANDRA-2466:
-------------------------------------------

Yes, it's not decreasing the total amount. The idea is that with small values, 
at least you're now left with a tweakable situation where a suitable heap size 
and a suitable initial occupancy trigger should be sufficient to avoid 
concurrent mode failures, as long as you're avoiding the fragmentation induced 
promotion failures (pragmatically, even if theoretically unclean, one can also 
help CMS along by inserting some minor application level pauses during 
allocations of many chunks as part of a larger allocation). But the memory 
pressure remains.

Regarding off-heap: I didn't suggest it because I'd personally like to avoid 
JNI/JNA if possible for "safety" reasons, and I would also like to avoid adding 
further dependencies on CMS sweeps for external resources (files, off-heap mem, 
etc) for the usual reasons. But maybe I'm more paranoid about that than most.

It would be really nice to just have an explicitly managed pool of fixed-size 
off-heap buffers of a reasonable size (say, a meg a piece, mmap():ed). I'm 
thinking maybe the bloom filters can be explicitly managed more easily than the 
mmaps/brafs for data reading; for example by accepting that for the few 
requests that race with the removal of an sstable, the bloom filter would just 
pretend all bits are set.

Hmmm...



> bloom filters should avoid huge array allocations to avoid fragmentation 
> concerns
> ---------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2466
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2466
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Peter Schuller
>            Priority: Minor
>
> The fact that bloom filters are backed by single large arrays of longs is 
> expected to interact badly with promotion of objects into old gen with CMS, 
> due to fragmentation concerns (as discussed in CASSANDRA-2463).
> It should be less of an issue than CASSANDRA-2463 in the sense that you need 
> to have a lot of rows before the array sizes become truly huge. For 
> comparison, the ~ 143 million row key limit implied by the use of 'int' in 
> BitSet prior to the switch to OpenBitSet translates roughly to 238 MB 
> (assuming the limitation factor there was the addressability of the bits with 
> a 32 bit int, which is my understanding).
> Having a preliminary look at OpenBitSet with an eye towards replacing the 
> single long[] with multiple arrays, it seems that if we're willing to drop 
> some of the functionality that is not used for bloom filter purposes, the 
> bits[i] indexing should be pretty easy to augment with modulo to address an 
> appropriate smaller array. Locality is not an issue since the bloom filter 
> case is the worst possible case for locality anyway, and it doesn't matter 
> whether it's one huge array or a number of ~ 64k arrays.
> Callers may be affected like BloomFilterSerializer which cares about the 
> underlying bit array.
> If the full functionality of OpenBitSet is to be maintained (e.g., xorCount) 
> some additional acrobatics would be necessary and presumably at a noticable 
> performance cost if such operations were to be used in performance critical 
> places.
> An argument against touching OpenBitSet is that it seems to be pretty 
> carefully written and tested and has some non-trivial details and people have 
> seemingly benchmarked it quite carefully. On the other hand, the improvement 
> would then apply to other things as well, such as the bitsets used to keep 
> track of in-core pages (off the cuff for scale, a 64 gig sstable should imply 
> a 2 mb bit set, with one bit per 4k page).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to