[ 
https://issues.apache.org/jira/browse/HBASE-2450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857482#action_12857482
 ] 

Jonathan Gray commented on HBASE-2450:
--------------------------------------

I'm not sure this is only related to minors not respecting deletes.  (In fact, 
minors currently do process the deletes, and they also leave the tombstones in 
place to keep our existing invariant that the current file's deletes only apply 
to later files.)

If we have a delete row, for example, it would be at the start of the row, so 
if we seek directly to the column in question, we would skip past the delete 
row marker, which could actually apply to the column we're looking for.
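
To make the problem concrete, here is a minimal sketch in Python (not HBase 
code).  It assumes a toy store file whose cells are sorted by (row, qualifier) 
and models a full-row delete as a marker with an empty qualifier, so it sorts 
ahead of every column in its row.  The names ROW_DELETE, cells and scan_from 
are hypothetical, and timestamps are ignored entirely.

```python
from bisect import bisect_left

# Toy model of one store file: cells sorted by (row, qualifier).
# A full-row delete is modelled with an empty qualifier, so it sorts
# before every real column of its row. Illustration only, not HBase code.
ROW_DELETE = ""

cells = [
    ("row1", ROW_DELETE, "delete-row"),
    ("row1", "a", "put"),
    ("row1", "m", "put"),
    ("row1", "z", "put"),
    ("row2", "a", "put"),
]
keys = [(row, qual) for row, qual, _ in cells]


def scan_from(row, qualifier):
    """Return the cells of `row` seen when seeking to (row, qualifier)."""
    start = bisect_left(keys, (row, qualifier))
    return [c for c in cells[start:] if c[0] == row]


# Seeking to the start of the row sees the delete marker first.
from_row_start = scan_from("row1", ROW_DELETE)

# Seeking straight to column "m" never sees the marker, so "m" looks live.
from_column = scan_from("row1", "m")
```

In this model the direct seek returns the put for "m" with no hint that the 
row was deleted, which is exactly the case a column seek has to guard against.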

I'm not sure how much additional complexity is added by having separate minor 
and major code paths.  There's not a heck of a lot of difference, and it's 
mostly just having one path use certain elements of our normal read path stuff 
(delete trackers, query matcher, etc).  As I recall, there was a significant 
performance hit from doing the processing required for majors vs minors, but I 
also don't think we were using the ScanDeleteTracker on minors at the time, as 
we are today.  We should probably just do some benchmarking to find out.

Good catch Erik.  We need to think about this more.  We should still be able to 
start at the front of the row and once past any full row deletes then skip down 
to the column in question (using something like HBASE-1517)... right?
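
A rough sketch of that two-step read, again in Python and again only a toy 
model rather than HBase code: start at the front of the row, consume any 
full-row delete markers, and only then jump ahead to the requested column.  
The helper get_column and the empty-qualifier marker are assumptions made for 
illustration.  Timestamps are ignored, so any row delete hides the whole row 
here, which is stricter than the real semantics.

```python
from bisect import bisect_left

ROW_DELETE = ""  # hypothetical stand-in for a full-row delete marker

# Toy store file, sorted by (row, qualifier). Not HBase code.
cells = [
    ("row1", ROW_DELETE, "delete-row"),
    ("row1", "a", "put"),
    ("row1", "m", "put"),
    ("row2", "a", "put"),
    ("row2", "m", "put"),
]
keys = [(row, qual) for row, qual, _ in cells]


def get_column(row, qualifier):
    """Read one column, checking row-level deletes before seeking to it."""
    # Step 1: seek to the front of the row and consume row delete markers.
    i = bisect_left(keys, (row, ROW_DELETE))
    row_deleted = False
    while i < len(keys) and keys[i] == (row, ROW_DELETE):
        row_deleted = True
        i += 1
    if row_deleted:
        return None  # simplification: timestamps are not compared

    # Step 2: past the markers, skip directly to the requested column.
    j = bisect_left(keys, (row, qualifier), i)
    if j < len(keys) and keys[j] == (row, qualifier):
        return cells[j]
    return None
```

With this data, get_column("row1", "m") returns None because the row delete is 
seen first, while get_column("row2", "m") returns the put after a single 
second seek.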

> For single row reads of specific columns, seek to the first column in HFiles 
> rather than start of row
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-2450
>                 URL: https://issues.apache.org/jira/browse/HBASE-2450
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: io, regionserver
>            Reporter: Jonathan Gray
>             Fix For: 0.20.5, 0.21.0
>
>
> Currently we will always seek to the start of a row.  If we are getting 
> specific columns, we should seek to the first column in that row.


        
