[ https://issues.apache.org/jira/browse/CASSANDRA-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934111#comment-13934111 ]
Jonathan Ellis commented on CASSANDRA-6633: ------------------------------------------- so, close this as Later or Notaproblem? > Investigate Bloom Filter Improvements > ------------------------------------- > > Key: CASSANDRA-6633 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6633 > Project: Cassandra > Issue Type: Improvement > Reporter: Benedict > > Investigate various possible improvements to our bloom filters: > 1) Dynamic resizing would be useful. There are a few ways this could be > achieved: with some modification, downsampling could be supported; > partitioning the hash functions so that we may select the number of > hashes/bits dynamically, by loading/unloading a given partition; and there > are some related data structures, such as Quotient Filters that support > resizing and merging. > 2) Faster loading: should be possible to mmap the bloom filter representation > on disk > 3) Most ambitious of all, would be to try to reduce the memory requirement of > bloom filters. Golomb Coded Sets > [1|http://algo2.iti.kit.edu/singler/publications/cacheefficientbloomfilters-wea2007.pdf] > are a possibility, as are other compressed hash structures > [2|http://www.it-c.dk/people/pagh/papers/bloom.pdf]. -- This message was sent by Atlassian JIRA (v6.2#6252)