[ https://issues.apache.org/jira/browse/HBASE-20636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guangxu Cheng updated HBASE-20636:
----------------------------------
    Attachment: HBASE-20636.master.001.patch

> Introduce two bloom filter type : ROWPREFIX and ROWPREFIX_DELIMITED
> -------------------------------------------------------------------
>
>                 Key: HBASE-20636
>                 URL: https://issues.apache.org/jira/browse/HBASE-20636
>             Project: HBase
>          Issue Type: New Feature
>          Components: regionserver
>            Reporter: Guangxu Cheng
>            Assignee: Guangxu Cheng
>            Priority: Major
>         Attachments: HBASE-20636.master.001.patch
>
>
> HBase uses Bloom filters (ROW and ROWCOL) to skip files that cannot
> contain the requested data, which improves read performance. However,
> they only support Get and do not support Scan.
> At our company (Tencent), many users need to scan all rows that share
> the same prefix. Tencent Game is one example: each game user's
> operational records are written into HBase, and each user has many
> records. The rowkey is constructed as userid+'#'+timestamp, so we can
> scan all records for a given user within a specified period.
> For this scenario, we designed the prefix Bloom filter. If the startRow
> and stopRow of a Scan have a valid common prefix, the scan is allowed to
> use the Bloom filter to skip files, which improves scan performance.
> This feature has been running on our cluster for over a year, and scan
> performance for this scenario has more than doubled.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
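
The common-prefix check described in the issue can be illustrated with a small, self-contained sketch. This is not code from the attached patch: the class name `PrefixBloomSketch`, the methods `commonPrefix` and `canUsePrefixBloom`, and the idea of a fixed `prefixLength` are all assumptions made for illustration, and the example rowkeys are invented.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch; names and the fixed-length rule are assumptions,
// not the API introduced by HBASE-20636.
public class PrefixBloomSketch {

    /** Returns the longest common prefix of the two row keys (possibly empty). */
    static byte[] commonPrefix(byte[] startRow, byte[] stopRow) {
        int limit = Math.min(startRow.length, stopRow.length);
        int i = 0;
        while (i < limit && startRow[i] == stopRow[i]) {
            i++;
        }
        return Arrays.copyOf(startRow, i);
    }

    /**
     * Assumed rule: a scan may consult a prefix Bloom filter only when the
     * common prefix of startRow and stopRow covers at least prefixLength bytes,
     * so every row in the scan range shares the same filter key.
     */
    static boolean canUsePrefixBloom(byte[] startRow, byte[] stopRow, int prefixLength) {
        return prefixLength > 0
            && commonPrefix(startRow, stopRow).length >= prefixLength;
    }

    public static void main(String[] args) {
        // Rowkeys shaped like userid + '#' + timestamp, as in the issue text.
        byte[] start = "user42#1527000000".getBytes(StandardCharsets.UTF_8);
        byte[] stop = "user42#1527999999".getBytes(StandardCharsets.UTF_8);
        byte[] otherUser = "user43#1527000000".getBytes(StandardCharsets.UTF_8);

        int prefixLength = "user42#".length(); // 7 bytes: userid plus delimiter

        // Same user: the range shares the "user42#" prefix.
        System.out.println(canUsePrefixBloom(start, stop, prefixLength));
        // Different users: the common prefix is shorter than prefixLength.
        System.out.println(canUsePrefixBloom(start, otherUser, prefixLength));
    }
}
```

In this sketch a scan over a single user's records qualifies for the filter, while a scan that spans two users does not, because the rows in its range no longer share one prefix key.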