Tanuj Khurana created PHOENIX-7229:
--------------------------------------
Summary: Leverage bloom filters for single key point lookups
Key: PHOENIX-7229
URL: https://issues.apache.org/jira/browse/PHOENIX-7229
Project: Phoenix
Issue Type: Improvement
Affects Versions: 5.1.3
Reporter: Tanuj Khurana
Assignee: Tanuj Khurana
PHOENIX-6710 enabled bloom filters by default when Phoenix tables are created.
However, Phoenix does not currently make use of them, because it translates
point lookups into scans over the range [startkey, stopkey), where startkey is
inclusive and equal to the row key, and stopkey is exclusive and is the next
key after the row key.
Such a scan fails the check in the HBase code in
[StoreFileReader#passesBloomFilter|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileReader.java#L245-L250],
which applies the bloom filter only to scans that are gets. A scan is a get
only if startkey = stopkey and both are inclusive, as defined in
[Scan#isGetScan|https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Scan.java#L253-L255].
Some of our customers have recently brought use cases that involve point
lookups for row keys that are not present in the table. Bloom filters are
ideal for those use cases, because they let HBase skip store files that
cannot contain the requested key.
We can change our scan range for point lookups so that the scan qualifies as
a get and can leverage bloom filters.
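The difference between the two scan ranges can be illustrated with a minimal,
HBase-free sketch. The class ScanRangeSketch and its helper methods are
hypothetical names introduced only for illustration; isGetScan here is a
simplified model of the condition described above, not the actual HBase
implementation. The sketch also assumes that the "next key" is formed by
appending a zero byte, which is the smallest key strictly greater than the row
key in lexicographic byte order; Phoenix may compute it differently.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical, simplified model of a scan range. This is NOT
// org.apache.hadoop.hbase.client.Scan; it only mirrors the rule
// "a scan is a get if startkey = stopkey and both are inclusive".
final class ScanRangeSketch {
    final byte[] startRow;
    final boolean includeStartRow;
    final byte[] stopRow;
    final boolean includeStopRow;

    ScanRangeSketch(byte[] startRow, boolean includeStartRow,
                    byte[] stopRow, boolean includeStopRow) {
        this.startRow = startRow;
        this.includeStartRow = includeStartRow;
        this.stopRow = stopRow;
        this.includeStopRow = includeStopRow;
    }

    // Simplified version of the get-scan condition described in the issue.
    boolean isGetScan() {
        return includeStartRow && includeStopRow
                && Arrays.equals(startRow, stopRow);
    }

    // Assumption: next key = row key followed by a single 0x00 byte.
    // Arrays.copyOf pads the extra position with zero.
    static byte[] nextKey(byte[] rowKey) {
        return Arrays.copyOf(rowKey, rowKey.length + 1);
    }

    // Range as described for the current behavior: [rowKey, nextKey).
    static ScanRangeSketch halfOpenPointLookup(byte[] rowKey) {
        return new ScanRangeSketch(rowKey, true, nextKey(rowKey), false);
    }

    // Range that satisfies the get-scan condition: [rowKey, rowKey].
    static ScanRangeSketch closedPointLookup(byte[] rowKey) {
        return new ScanRangeSketch(rowKey, true, rowKey, true);
    }

    public static void main(String[] args) {
        byte[] key = "row-42".getBytes(StandardCharsets.UTF_8);
        // Half-open range: not a get, so the bloom filter would be skipped.
        System.out.println(halfOpenPointLookup(key).isGetScan()); // prints false
        // Closed single-row range: counts as a get.
        System.out.println(closedPointLookup(key).isGetScan());   // prints true
    }
}
```

Both ranges select the same single row, but only the closed form satisfies the
get condition. With the HBase client, a closed range of this kind can be
expressed with Scan#withStartRow(key, true) and Scan#withStopRow(key, true);
whether the eventual Phoenix change takes exactly this form is not stated in
this issue.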
--
This message was sent by Atlassian Jira
(v8.20.10#820010)