[ 
https://issues.apache.org/jira/browse/COLLECTIONS-728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000951#comment-17000951
 ] 

Gilles Sadowski commented on COLLECTIONS-728:
---------------------------------------------

{quote}properties are now tracked in the HashFunctionIdentity.
{quote}
Perhaps {{HashFunctionProperties}} or {{HashFunctionSpecification}} would be a 
better name (?).
{quote}An equals method on the HashFunction would require the HashFunction 
object be in some sense Serializable.
{quote}
IIUC (?), a hash function can be implemented by a third party (not 
abiding by any of the APIs to be defined here).
 Then to be usable within the {{BloomFilter}} framework defined here, an 
application developer will need
 * to wrap the function's "properties" in a class (say, 
{{MyHashFunctionIdentity}}) that implements {{HashFunctionIdentity}}
 * to define a serialization scheme for {{MyHashFunctionIdentity}} (if the 
application is distributed).

If so, why not have a {{MyHashFunction}} class that wraps the full third-party 
hash function rather than just its "properties"?
 We could have, in [Collections]
{code:java}
/**
 * Interface for implementations used within this framework.
 */
public interface HashFunction {
    /**
     * Returns {@code true} when from a given input, this instance
     * computes the same output as the {@code other} instance.
     */
    boolean isCompatible(HashFunction other);

    /**
     * Computes the hash value.
     *
     * @param input Input.
     * @param seed Seed.
     * @return the hash.
     */
    long compute(byte[] input, int seed);
}
{code}
As simple as can be (?).

And, the application developer would be responsible for implementing the notion of 
"compatibility" (in the same way that he is responsible for computing the hash 
value, including issues arising from using a buggy function):
{code:java}
import java.io.Serializable;
import com.thirdparty.hash.NiceFunction;
import org.apache.commons.collections.bloomfilter.HashFunction;

public class MyHash implements HashFunction, Serializable {
    private static final byte[] COMP_TEST_A = new byte[] {-19, 45, -34, 65, 1, 22, 17, 74};
    private static final int COMP_TEST_B = -1395561;
    private static final long serialVersionUID = 123456789L;
    private NiceFunction f; // Assuming that "NiceFunction" is "Serializable".

    public MyHash(NiceFunction f) {
        this.f = f;
    }

    @Override
    public boolean isCompatible(HashFunction other) {
        if (other instanceof MyHash) {
            return true;
        } else {
            return Long.compare(compute(COMP_TEST_A, COMP_TEST_B),
                                other.compute(COMP_TEST_A, COMP_TEST_B)) == 0;
        }
    }

    @Override
    public long compute(byte[] input, int seed) {
        // ...
    }
}
{code}
Then, [Collections] could provide wrappers for the functions implemented in 
[Codec]. Casual users will get new functions at release time, while not 
preventing power users from defining their own wrappers around experimental and 
non-standard functions.
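
To make the probe-based notion of "compatibility" concrete, here is a minimal, self-contained sketch. Everything in it is an assumption for illustration only: the {{HashFunction}} interface is copied from the proposal above (it is not an existing [Collections] API), {{ProbedHash}}, its nested {{Raw}} interface and {{CompatibilityDemo}} are hypothetical names, and a JDK {{CRC32}} stands in for a third-party function. Unlike {{MyHash}} above, this variant skips the {{instanceof}} shortcut and always compares the outputs on a fixed probe input.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

/** Hypothetical interface, as proposed in the comment above. */
interface HashFunction {
    boolean isCompatible(HashFunction other);
    long compute(byte[] input, int seed);
}

/** Hypothetical wrapper: delegates hashing and checks compatibility by probing. */
final class ProbedHash implements HashFunction {
    /** Hypothetical stand-in for the signature of a third-party function. */
    interface Raw {
        long apply(byte[] input, int seed);
    }

    // Fixed probe input and seed (same values as in the example above).
    private static final byte[] PROBE = {-19, 45, -34, 65, 1, 22, 17, 74};
    private static final int PROBE_SEED = -1395561;

    private final Raw raw;

    ProbedHash(Raw raw) {
        this.raw = raw;
    }

    @Override
    public boolean isCompatible(HashFunction other) {
        // Compatible when both functions agree on the probe input.
        return compute(PROBE, PROBE_SEED) == other.compute(PROBE, PROBE_SEED);
    }

    @Override
    public long compute(byte[] input, int seed) {
        return raw.apply(input, seed);
    }
}

final class CompatibilityDemo {
    /** Stand-in hash: CRC32 over the four seed bytes followed by the input. */
    static long crc32(byte[] input, int seed) {
        CRC32 crc = new CRC32();
        crc.update(ByteBuffer.allocate(Integer.BYTES).putInt(seed).array());
        crc.update(input);
        return crc.getValue();
    }

    public static void main(String[] args) {
        HashFunction a = new ProbedHash(CompatibilityDemo::crc32);
        HashFunction b = new ProbedHash(CompatibilityDemo::crc32);
        // Same function with all 32 output bits flipped: never equal to the original.
        HashFunction c = new ProbedHash((in, seed) -> crc32(in, seed) ^ 0xFFFFFFFFL);

        System.out.println(a.isCompatible(b)); // true
        System.out.println(a.isCompatible(c)); // false
    }
}
```

A single probe can only show that two functions differ; agreement on one input is evidence of compatibility, not proof, so whether that is good enough remains the application developer's call.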

Or am I completely off base?

bq. I would not expect the actual code for the function to be sent.  This 
would probably require that the listener be a Java based application so that it 
could have the HashFunction implementation.

I'm confused.  Where is the hash computation used?

> BloomFilter contribution
> ------------------------
>
>                 Key: COLLECTIONS-728
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-728
>             Project: Commons Collections
>          Issue Type: Task
>            Reporter: Claude Warren
>            Priority: Minor
>         Attachments: BF_Func.md, BloomFilter.java, BloomFilterI2.java, 
> Usage.md
>
>
> Contribution of BloomFilter library comprising base implementation and gated 
> collections.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
