[ 
https://issues.apache.org/jira/browse/COLLECTIONS-728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16959847#comment-16959847
 ] 

Alex Herbert commented on COLLECTIONS-728:
------------------------------------------

Looking at BloomFilter.java class there is nothing in it that allows you to add 
another filter.

This is a major use case for filters that guard collections, if an object is 
not present then add it to the existing Bloom filter then do something with the 
object.
{code:java}
boolean match(BloomFilter other);

// Possible for a MutableBloomFilter interface?
// Could return true if the filter was changed by the operation.
// Or return void.
boolean add(BloomFilter other);
{code}
The Shape class is a strong statement about the classic definition of a Bloom 
filter. It implements the Bloom equations for m, n, k, p:

 
{noformat}
"computing an array of m bits, and for up to n different elements, test k bits 
using positions chosen using hash functions. If all bits are set, the element 
probably already exists, with a false positive rate of p." 
{noformat}
I think this can be part of the interface but is not required. It could be 
moved to a separate class as it just acts as a calculator.

It does open the following questions:
 # Is a BloomFilter defined by its size and set bits?
 # Can we match BloomFilters of different lengths?

I would assume the answer to 1 is perhaps (if a filter is built using hashing 
functions to choose bits). If there is no other way to build a Bloom filter 
then the size and access to set bits should be available within the interface:
{code:java}
int getNumberOfBits();
boolean get(int bitIndex);
{code}
I assume the answer to 2 is yes. The default would be to pad the smaller filter 
with nothing (i.e. zeros).

I see the core functionality as:
 # Matching another filter
 # Adding another filter
 # Optionally defining key properties of the filter (the size and the set bits)

Separation of filter functionality and implementation would apply to the Hash 
and Skeleton classes since the way to build a Bloom filter is separate from the 
filter functionality.

> BloomFilter contribution
> ------------------------
>
>                 Key: COLLECTIONS-728
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-728
>             Project: Commons Collections
>          Issue Type: Task
>            Reporter: Claude Warren
>            Priority: Minor
>         Attachments: BF_Func.md, BloomFilter.java, BloomFilterI2.java, 
> Usage.md
>
>
> Contribution of BloomFilter library comprising base implementation and gated 
> collections.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to