[ 
https://issues.apache.org/jira/browse/PHOENIX-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Singhal updated PHOENIX-6710:
-----------------------------------
    Fix Version/s: 5.2.0

> Revert PHOENIX-3842 Turn on back default bloomFilter for Phoenix Tables
> -----------------------------------------------------------------------
>
>                 Key: PHOENIX-6710
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6710
>             Project: Phoenix
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 4.11.0
>            Reporter: Ankit Singhal
>            Assignee: Ankit Singhal
>            Priority: Major
>             Fix For: 5.2.0
>
>
> PHOENIX-3842 was done as a workaround for PHOENIX-3797 to unblock a release,
> on the assumption that Phoenix is not used for GETs.
>  
> At one of our users, we saw heavy GETs being issued from a custom
> coprocessor to check whether a key is already present. About 99% of the
> time the key is not expected to be present during the initial load, because
> keys are expected to be random, but there is still a chance that about 1% of
> keys are duplicates. In the absence of a Bloom filter, HBase has to seek
> into the HFile to confirm that the key is not present, which makes
> performance about 2x slower.
>  
> Use cases like index maintenance and "ON DUPLICATE KEY" queries will also
> be impacted without Bloom filters.
>  
> Phoenix is still used for GETs by users (SELECT queries with the key as a
> filter), and we also have constructs that intrinsically do GETs, like index
> maintenance and "ON DUPLICATE KEY". So I believe it is better to have the
> Bloom filter "ON" by default, as I don't see any downside to it even when
> it is not used.
>  
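The cost described in the issue, where a lookup for an absent key has to seek into the file unless a Bloom filter rules the key out first, can be illustrated with a toy model. This is only a sketch: `BloomFilter`, `ToyStoreFile`, and `count_seeks` are hypothetical names invented for illustration, the "seek" is just a counter, and none of this reflects the actual HBase or Phoenix implementation.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: may give false positives, never false negatives."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key: bytes):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: bytes) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


class ToyStoreFile:
    """Hypothetical stand-in for a store file; counts simulated seeks."""

    def __init__(self, keys, use_bloom: bool):
        self.keys = set(keys)
        self.seeks = 0
        self.bloom = BloomFilter() if use_bloom else None
        if self.bloom is not None:
            for key in self.keys:
                self.bloom.add(key)

    def contains(self, key: bytes) -> bool:
        # With a Bloom filter, a definite "no" avoids the seek entirely.
        if self.bloom is not None and not self.bloom.might_contain(key):
            return False
        self.seeks += 1  # simulated seek into the file
        return key in self.keys


def count_seeks(use_bloom: bool, n: int = 1000) -> int:
    """Probe n absent keys against a file holding n other keys."""
    stored = [f"row-{i}".encode() for i in range(n)]
    absent = [f"new-{i}".encode() for i in range(n)]
    store = ToyStoreFile(stored, use_bloom)
    for key in absent:
        store.contains(key)
    return store.seeks


if __name__ == "__main__":
    print("seeks without Bloom filter:", count_seeks(False))
    print("seeks with Bloom filter:   ", count_seeks(True))
```

In this model every probe for an absent key costs a seek when the filter is off, while with the filter on only the rare false positives do, which is the workload shape the issue describes for the initial load.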



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
