[ https://issues.apache.org/jira/browse/HBASE-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886041#action_12886041 ]
HBase Review Board commented on HBASE-2265:
-------------------------------------------

Message from: "Kannan Muthukkaruppan" <kan...@facebook.com>

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/257/#review312
-----------------------------------------------------------

trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
<http://review.hbase.org/r/257/#comment1375>

    Could we hoist the cheaper check first? Since isGetScan() has to do a
    byte comparison of start/endRow, it would be better to do this only if
    bloom filters are actually in use.

    So change the second part of the expression to something like:

        ( (this.bloomFilter == null) || (!scan.isGetScan()) || passesBloomFilter(...))

    Or, you could just pass the Scan to passesBloomFilter() instead of
    scan.getStartRow(). There we already check for this.bloomFilter == null
    first. Then you could add the check for "scan.isGetScan", and then the
    rest of the function.

- Kannan

> HFile and Memstore should maintain minimum and maximum timestamps
> -----------------------------------------------------------------
>
>                 Key: HBASE-2265
>                 URL: https://issues.apache.org/jira/browse/HBASE-2265
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Todd Lipcon
>            Assignee: Pranav Khaitan
>
> In order to fix HBASE-1485 and HBASE-29, it would be very helpful to have
> HFile and Memstore track their maximum and minimum timestamps. This has the
> following nice properties:
> - for a straight Get, if an entry has already been found with timestamp
> X, and X >= HFile.maxTimestamp, the HFile doesn't need to be checked. Thus,
> the current fast behavior of get can be maintained for those who use strictly
> increasing timestamps, but "correct" behavior for those who sometimes write
> out-of-order.
> - for a scan, the "latest timestamp" of the storage can be used to decide
> which cell wins, even if the timestamp of the cells is equal. In essence,
> rather than comparing timestamps, instead you are able to compare tuples of
> (row timestamp, storage.max_timestamp)
> - in general, min_timestamp(storage A) >= max_timestamp(storage B) if storage
> A was flushed after storage B.
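
For reference, the check ordering suggested in the review comment above can be sketched roughly as follows. This is only an illustration under stated assumptions: BloomCheckOrder, SimpleScan, RowFilter and shouldSeek are hypothetical stand-ins, not the actual StoreFile or Scan code in r/257, and isGetScan() is modelled here simply as "start row equals stop row". The point is that Java's || short-circuits, so the null check on the bloom filter runs before the byte comparison.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch; none of these types are the real HBase classes.
final class BloomCheckOrder {

    // Stand-in for a bloom filter: false means the row is definitely absent.
    interface RowFilter {
        boolean mightContain(byte[] row);
    }

    // Stand-in for a scan. A Get is modelled as a scan whose start and
    // stop rows are equal (an assumption made for this sketch).
    static final class SimpleScan {
        final byte[] startRow;
        final byte[] stopRow;
        int comparisons = 0; // how many times the byte comparison ran

        SimpleScan(byte[] startRow, byte[] stopRow) {
            this.startRow = startRow;
            this.stopRow = stopRow;
        }

        boolean isGetScan() {
            comparisons++;
            return startRow.length > 0 && Arrays.equals(startRow, stopRow);
        }
    }

    // Cheapest check first: if no bloom filter is in use, the byte
    // comparison inside isGetScan() is never evaluated.
    static boolean shouldSeek(RowFilter bloomFilter, SimpleScan scan) {
        return bloomFilter == null
            || !scan.isGetScan()
            || bloomFilter.mightContain(scan.startRow);
    }

    public static void main(String[] args) {
        byte[] row = "r1".getBytes(StandardCharsets.UTF_8);
        SimpleScan get = new SimpleScan(row, row);
        RowFilter rejectAll = r -> false;

        // No bloom filter: the store file is kept, with no byte comparison.
        System.out.println(shouldSeek(null, get) + " comparisons=" + get.comparisons);
        // Bloom filter present and the scan is a Get: the filter decides.
        System.out.println(shouldSeek(rejectAll, get) + " comparisons=" + get.comparisons);
    }
}
```

The alternative mentioned in the comment (passing the Scan into passesBloomFilter() and doing both checks there) would give the same evaluation order; which of the two reads better in StoreFile.java is a judgment call for the patch author.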