GitHub user vitamon opened a pull request:

    https://github.com/apache/spark/pull/19409

    fix openHashSet to actually use quadratic probing instead of linear

    The comments in the code state that OpenHashSet uses quadratic probing,
    but in fact it uses linear probing, which "results in primary clustering,
    and as the cluster grows larger, the search for those items hashing within
    the cluster becomes less efficient."
    See https://en.wikipedia.org/wiki/Quadratic_probing

    OpenHashSetSuite passes with both probing methods.
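
As an illustration of the difference between the two probing schemes (this is not the code from the pull request), the following Python sketch lists the slots each scheme visits for a given home slot. The function names are hypothetical. The quadratic variant assumes a power-of-two capacity and triangular-number offsets, which is one common way to implement quadratic probing and may differ from what OpenHashSet does.

```python
def linear_probe_sequence(home, capacity):
    """Slots visited by linear probing: home, home+1, home+2, ... (mod capacity)."""
    return [(home + i) % capacity for i in range(capacity)]


def quadratic_probe_sequence(home, capacity):
    """Slots visited by triangular-number quadratic probing.

    The step grows by one on every probe, so the offsets from the home
    slot are 0, 1, 3, 6, 10, ... Assumes capacity is a power of two, in
    which case every slot is visited exactly once.
    """
    if capacity <= 0 or capacity & (capacity - 1) != 0:
        raise ValueError("capacity must be a power of two")
    mask = capacity - 1
    pos = home & mask
    sequence = []
    for step in range(1, capacity + 1):
        sequence.append(pos)
        pos = (pos + step) & mask  # increasing step, wrapped by the mask
    return sequence


if __name__ == "__main__":
    print(linear_probe_sequence(0, 8))
    print(quadratic_probe_sequence(0, 8))
```

With capacity 8 and home slot 0, the linear sequence is 0, 1, 2, 3, ... while the triangular sequence is 0, 1, 3, 6, 2, 7, 5, 4, so keys that collide near the same slot spread out instead of piling up in adjacent positions.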

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vitamon/spark openhashset

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19409.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19409
    
----
commit 6fb0a407edd3ae8c8d4b9154076768ed03028a09
Author: Vitalii Tamazian <vtamaz...@google.com>
Date:   2017-10-02T14:15:26Z

    fix openHashSet to actually use quadratic probing instead of linear

----


