GitHub user Bcpoole opened a pull request: https://github.com/apache/spark/pull/16864
[SPARK-19527][Core] Approximate Size of Intersection of Bloom Filters

**What changes were proposed in this pull request?**

Added functions to compute the Swamidass & Baldi (2007) approximation for the number of items in a Bloom filter and in the intersection of two filters. Added an exception type, IncompatibleUnionException, mimicking IncompatibleMergeException. Because the intersection approximation needs it, there is also a function that creates the union of two Bloom filters without mutating either one.

**How was this patch tested?**

Manual tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Bcpoole/spark approxItemsInBloomFilterIntersection

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16864.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #16864

----
commit 7a3ad46ff86bd3d2d47f6a56bace1a0c4fd171c8
Author: Bcpoole <brandoncpo...@gmail.com>
Date:   2017-02-09T01:11:07Z

    Swamidass & Baldi approx. items in intersection of two Bloom filters. Also function to create union (non-mutation) of two Bloom filters.

commit b9680c57b2f8b1d93c28884de9a7ebbe52505f6c
Author: Bcpoole <brandoncpo...@gmail.com>
Date:   2017-02-09T01:42:36Z

    Changed createUnionBloomFilter & approxItemsInIntersection to be instance instead of static functions

commit 501ad7e22101b00862c0c77ef8c38e1b166d33a4
Author: Bcpoole <brandoncpo...@gmail.com>
Date:   2017-02-09T01:53:50Z

    Updated abstract class to reflect changes in previous commit

----
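For readers unfamiliar with the estimate named in the summary, here is a minimal sketch of the Swamidass & Baldi approximation. It is not the code from this pull request. The class name `BloomEstimates`, its method names, and the use of `java.util.BitSet` as a stand-in for the filter's bit array are all assumptions made for illustration. For a filter with `m` bits, `k` hash functions, and `X` bits set, the estimated item count is `n* = -(m / k) * ln(1 - X / m)`. The intersection estimate is `n*(A) + n*(B) - n*(A ∪ B)`, where the union is the bitwise OR of two filters that share the same `m` and `k`.

```java
import java.util.BitSet;

/** Illustrative sketch only; names are hypothetical and not part of Spark's API. */
final class BloomEstimates {
    private BloomEstimates() {}

    /**
     * Swamidass & Baldi estimate: n* = -(m / k) * ln(1 - X / m),
     * where X = setBits, m = numBits, k = numHashFunctions.
     * Returns positive infinity when every bit is set.
     */
    static double approxItems(long setBits, long numBits, int numHashFunctions) {
        if (numBits <= 0 || numHashFunctions <= 0 || setBits < 0 || setBits > numBits) {
            throw new IllegalArgumentException("invalid Bloom filter parameters");
        }
        double fillRatio = (double) setBits / numBits;
        // log1p(-x) computes ln(1 - x) accurately for small fill ratios.
        return -((double) numBits / numHashFunctions) * Math.log1p(-fillRatio);
    }

    /**
     * Estimated size of the intersection: n*(A) + n*(B) - n*(A union B).
     * Both bit sets are assumed to come from filters with the same
     * numBits and numHashFunctions; neither input is mutated.
     */
    static double approxItemsInIntersection(
            BitSet a, BitSet b, long numBits, int numHashFunctions) {
        BitSet union = (BitSet) a.clone(); // copy so that 'a' is left untouched
        union.or(b);
        double nA = approxItems(a.cardinality(), numBits, numHashFunctions);
        double nB = approxItems(b.cardinality(), numBits, numHashFunctions);
        double nUnion = approxItems(union.cardinality(), numBits, numHashFunctions);
        return nA + nB - nUnion;
    }
}
```

Because the three terms are independent estimates, the result can come out slightly negative for filters with little or no overlap, so a caller may want to clamp it at zero.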