A question for Spark developers: I see that Bloom filters have been integrated into Spark 2.0 <https://spark.apache.org/docs/2.0.0-preview/api/scala/index.html#org.apache.spark.util.sketch.BloomFilter> .
Hadoop already has some Bloom filter implementations, notably a dynamic one <https://hadoop.apache.org/docs/r2.7.2/api/org/apache/hadoop/util/bloom/DynamicBloomFilter.html> , which is very useful when the number of keys largely exceeds what was anticipated. Is there any rationale (performance, implementation details...) for a new implementation in Spark instead of reusing the one from Hadoop? Thanks!

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/spark-2-0-bloom-filters-tp27297.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
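For readers of the archive: a minimal sketch (my own illustration, not the Spark or Hadoop source) of the distinction the question turns on. A plain Bloom filter is sized once, up front, for an expected key count and false-positive rate; the "dynamic" variant chains fixed-size slices, opening a new slice when the current one has absorbed its expected number of keys, which is roughly what Hadoop's DynamicBloomFilter does. Class names, the double-hashing scheme, and the slice threshold below are all my assumptions for the sketch.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Illustrative only -- NOT the Spark or Hadoop implementation.
// Fixed-size filter, sized with the standard formulas:
//   m = -n * ln(p) / (ln 2)^2   bits
//   k = (m / n) * ln 2          hash functions
class SimpleBloomFilter {
    private final BitSet bits;
    private final int m; // number of bits
    private final int k; // number of hash functions

    SimpleBloomFilter(int expectedItems, double fpp) {
        this.m = (int) Math.ceil(-expectedItems * Math.log(fpp)
                / (Math.log(2) * Math.log(2)));
        this.k = Math.max(1,
                (int) Math.round((double) m / expectedItems * Math.log(2)));
        this.bits = new BitSet(m);
    }

    // Double hashing: derive the i-th index from two base hashes.
    private int index(String item, int i) {
        int h1 = item.hashCode();
        int h2 = Integer.rotateLeft(h1, 16) ^ 0x9E3779B9;
        return Math.floorMod(h1 + i * h2, m);
    }

    void put(String item) {
        for (int i = 0; i < k; i++) bits.set(index(item, i));
    }

    boolean mightContain(String item) {
        for (int i = 0; i < k; i++)
            if (!bits.get(index(item, i))) return false;
        return true; // no false negatives; false positives possible
    }
}

// The "dynamic" idea: when the current slice has seen its expected
// number of keys, start a fresh slice; membership checks every slice.
class DynamicBloomFilter {
    private final int expectedPerSlice;
    private final double fpp;
    private final List<SimpleBloomFilter> slices = new ArrayList<>();
    private int countInCurrent = 0;

    DynamicBloomFilter(int expectedPerSlice, double fpp) {
        this.expectedPerSlice = expectedPerSlice;
        this.fpp = fpp;
        slices.add(new SimpleBloomFilter(expectedPerSlice, fpp));
    }

    void put(String item) {
        if (countInCurrent >= expectedPerSlice) {
            slices.add(new SimpleBloomFilter(expectedPerSlice, fpp));
            countInCurrent = 0;
        }
        slices.get(slices.size() - 1).put(item);
        countInCurrent++;
    }

    boolean mightContain(String item) {
        for (SimpleBloomFilter s : slices)
            if (s.mightContain(item)) return true;
        return false;
    }
}
```

Spark's sketch API exposes only the fixed-size shape (`BloomFilter.create(expectedNumItems, fpp)` plus `put`/`mightContain`), so the trade-off being asked about is real: the dynamic variant degrades more gracefully when the key count overshoots the estimate, at the cost of checking several slices per lookup.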