[ https://issues.apache.org/jira/browse/PHOENIX-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ankit Singhal updated PHOENIX-6710:
-----------------------------------
    Fix Version/s: 5.1.3

> Revert PHOENIX-3842 Turn on back default bloomFilter for Phoenix Tables
> -----------------------------------------------------------------------
>
>                 Key: PHOENIX-6710
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6710
>             Project: Phoenix
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 4.11.0
>            Reporter: Ankit Singhal
>            Assignee: Ankit Singhal
>            Priority: Major
>             Fix For: 5.2.0, 5.1.3
>
>
> PHOENIX-3842 was done to work around PHOENIX-3797 and unblock a release,
> on the assumption that Phoenix is not used for GETs.
>
> At one of our users, we saw that they have been doing heavy GETs in their
> custom coprocessor to check whether the key is already present. Up to 99%
> of the time, the key is not expected to be present during the initial
> load, since keys are expected to be random, but there is still a chance
> that about 1% of keys are duplicates. Without a Bloom filter, however,
> HBase has to seek into the HFile to confirm that the key is not present,
> which results in a performance regression of about 2x.
>
> Use cases such as index maintenance and "ON DUPLICATE KEY" queries are
> also impacted without Bloom filters.
>
> Phoenix is still used for GETs by users (SELECT queries with the key as a
> filter), and we also have constructs that intrinsically do GETs, such as
> index maintenance and "ON DUPLICATE KEY". So I believe it is better to
> have the Bloom filter "ON" by default, as I don't see any adverse
> implication of it even when it is not used.
>

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
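
To illustrate the reasoning in the description, here is a minimal sketch of
why a Bloom filter helps a workload where most looked-up keys are absent.
It is not HBase or Phoenix code: the BloomFilter and StoreFile classes, the
run_demo function, the 10 bits per key, and the 7 hash functions are all
assumptions chosen for the example, and a "seek" is only a counter.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: false positives are possible, false negatives are not."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive all bit positions from one SHA-256 digest (double hashing).
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


class StoreFile:
    """Hypothetical stand-in for an HFile that counts simulated seeks."""

    def __init__(self, keys, use_bloom):
        self.keys = set(keys)
        self.seeks = 0
        self.bloom = None
        if use_bloom:
            # Assumed sizing: about 10 bits per stored key, 7 hash functions.
            self.bloom = BloomFilter(10 * max(len(self.keys), 1), 7)
            for key in self.keys:
                self.bloom.add(key)

    def contains(self, key):
        if self.bloom is not None and not self.bloom.might_contain(key):
            return False  # definitely absent, so the seek is skipped
        self.seeks += 1  # simulated seek into the file
        return key in self.keys


def run_demo():
    """Probe 1000 keys, about 1% of which already exist in the file."""
    existing = [f"row-{i}".encode() for i in range(1000)]
    probes = [f"new-{i}".encode() for i in range(990)] + existing[:10]
    results = {}
    for use_bloom in (False, True):
        store = StoreFile(existing, use_bloom)
        hits = sum(store.contains(key) for key in probes)
        results[use_bloom] = (hits, store.seeks)
    return results


if __name__ == "__main__":
    for use_bloom, (hits, seeks) in run_demo().items():
        print(f"bloom={use_bloom}: hits={hits}, seeks={seeks}")
```

Without the filter every probe costs a seek. With it, only the keys that
are actually present, plus a small number of false positives, reach the
file. The exact seek count with the filter depends on the hash and sizing
assumed above.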