[ 
https://issues.apache.org/jira/browse/HBASE-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886041#action_12886041
 ] 

HBase Review Board commented on HBASE-2265:
-------------------------------------------

Message from: "Kannan Muthukkaruppan" <kan...@facebook.com>

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/257/#review312
-----------------------------------------------------------



trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
<http://review.hbase.org/r/257/#comment1375>

    Could we hoist the cheaper check first?
    
    Since isGetScan() has to do a byte comparison of the start and end rows,
    it would be better to call it only if bloom filters are actually in use.
    
    So change the second part of the expression to something like:
    
    (   (this.bloomFilter == null)
     || (!scan.isGetScan())
     || passesBloomFilter(...))
    
    Or, you could just pass the Scan to passesBloomFilter() instead of
    scan.getStartRow().
    
    That function already checks this.bloomFilter == null first, so you
    could add the scan.isGetScan() check right after it, followed by the
    rest of the function.
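    To illustrate the ordering suggested above, here is a minimal standalone
    sketch. FakeScan, FakeBloom and shouldSeek are hypothetical stand-ins
    written for this example, not the actual StoreFile code; only the order
    of the three checks mirrors the suggested expression.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class BloomCheckOrder {

    /** Hypothetical stand-in for a scan; models only the start and stop rows. */
    static final class FakeScan {
        final byte[] startRow;
        final byte[] stopRow;
        int isGetScanCalls = 0; // counts how often the byte comparison runs

        FakeScan(byte[] startRow, byte[] stopRow) {
            this.startRow = startRow;
            this.stopRow = stopRow;
        }

        /** Assumed semantics: a "get" scan has identical, non-empty start and stop rows. */
        boolean isGetScan() {
            isGetScanCalls++;
            return startRow.length > 0 && Arrays.equals(startRow, stopRow);
        }
    }

    /** Hypothetical stand-in for a bloom filter, backed by an exact set of row keys. */
    static final class FakeBloom {
        private final Set<String> rows = new HashSet<>();

        void add(byte[] row) {
            rows.add(new String(row, StandardCharsets.ISO_8859_1));
        }

        boolean mightContain(byte[] row) {
            return rows.contains(new String(row, StandardCharsets.ISO_8859_1));
        }
    }

    /**
     * Cheapest check first: the null test costs nothing, isGetScan() costs a
     * byte comparison, and the bloom lookup is consulted only for get scans.
     */
    static boolean shouldSeek(FakeBloom bloomFilter, FakeScan scan) {
        return bloomFilter == null
            || !scan.isGetScan()
            || bloomFilter.mightContain(scan.startRow);
    }

    public static void main(String[] args) {
        FakeScan get = new FakeScan(new byte[] {1}, new byte[] {1});
        System.out.println("seek without bloom: " + shouldSeek(null, get));
        System.out.println("isGetScan calls: " + get.isGetScanCalls);
    }
}
```

    With no bloom filter configured, the short-circuit means isGetScan() is
    never evaluated, which is the saving the comment is after.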
    
    


- Kannan





> HFile and Memstore should maintain minimum and maximum timestamps
> -----------------------------------------------------------------
>
>                 Key: HBASE-2265
>                 URL: https://issues.apache.org/jira/browse/HBASE-2265
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Todd Lipcon
>            Assignee: Pranav Khaitan
>
> In order to fix HBASE-1485 and HBASE-29, it would be very helpful to have 
> HFile and Memstore track their maximum and minimum timestamps. This has the 
> following nice properties:
> - for a straight Get, if an entry has already been found with timestamp 
> X, and X >= HFile.maxTimestamp, the HFile doesn't need to be checked. Thus, 
> the current fast behavior of get can be maintained for those who use strictly 
> increasing timestamps, but "correct" behavior for those who sometimes write 
> out-of-order.
> - for a scan, the "latest timestamp" of the storage can be used to decide 
> which cell wins, even if the timestamp of the cells is equal. In essence, 
> rather than comparing timestamps, instead you are able to compare tuples of 
> (row timestamp, storage.max_timestamp)
> - in general, min_timestamp(storage A) >= max_timestamp(storage B) if storage 
> A was flushed after storage B.
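
The first property in the description above can be sketched as follows. This is only an illustration under assumptions: TimestampRange and its methods are hypothetical names invented for this example, not HBase classes, and the skip rule is taken directly from the "X >= HFile.maxTimestamp" condition in the issue text.

```java
final class TimestampRange {
    // An empty range is represented by min > max.
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;

    /** Widen the range to cover one more cell timestamp. */
    void include(long timestamp) {
        min = Math.min(min, timestamp);
        max = Math.max(max, timestamp);
    }

    boolean isEmpty() {
        return min > max;
    }

    long min() {
        return min;
    }

    long max() {
        return max;
    }

    /**
     * For a straight Get that has already found an entry at foundTimestamp:
     * the storage can be skipped when nothing in it is newer than that entry.
     */
    boolean canSkipForGet(long foundTimestamp) {
        return isEmpty() || foundTimestamp >= max;
    }

    public static void main(String[] args) {
        TimestampRange range = new TimestampRange();
        range.include(10L);
        range.include(5L);
        range.include(20L);
        System.out.println("skip for found ts 25: " + range.canSkipForGet(25L));
        System.out.println("skip for found ts 15: " + range.canSkipForGet(15L));
    }
}
```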

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
