dclim commented on a change in pull request #6397: Adds bloom filter aggregator to 'druid-bloom-filters' extension
URL: https://github.com/apache/incubator-druid/pull/6397#discussion_r221790755
##########
File path: docs/content/development/extensions-core/bloom-filter.md
##########
@@ -4,21 +4,29 @@ layout: doc_page
 # Druid Bloom Filter
-Make sure to [include](../../operations/including-extensions.html) `druid-bloom-filter` as an extension.
+This extension adds the ability to both construct bloom filters from query results, and filter query results by testing
+against a bloom filter. Make sure to [include](../../operations/including-extensions.html) `druid-bloom-filter` as an
+extension.
-BloomFilter is a probabilistic data structure for set membership check.
+A BloomFilter is a probabilistic data structure for set membership check.
 Following are some characterstics of BloomFilter
 - BloomFilters are highly space efficient when compared to using a HashSet.
-- Because of the probabilistic nature of bloom filter false positive (element not present in bloom filter but test() says true) are possible
-- false negatives are not possible (if element is present then test() will never say false).
-- The false positive probability is configurable (default: 5%) depending on which storage requirement may increase or decrease.
+- Because of the probabilistic nature of bloom filter false positive results are possible (e.g. element was not actually
+present in bloom filter construction, but `test()` says true)
+- False negatives are not possible (if element is present then `test()` will never say false).
+- The false positive probability is configurable (default: 5%) depending on which storage requirement may increase or
+ decrease.
 - Lower the false positive probability greater is the space requirement.
 - Bloom filters are sensitive to number of elements that will be inserted in the bloom filter.
-- During the creation of bloom filter expected number of entries must be specified.If the number of insertions exceed the specified initial number of entries then false positive probability will increase accordingly.
+- During the creation of bloom filter expected number of entries must be specified.If the number of insertions exceed

Review comment:
space after period
