[ 
https://issues.apache.org/jira/browse/HBASE-14826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15008540#comment-15008540
 ] 

ramkrishna.s.vasudevan commented on HBASE-14826:
------------------------------------------------

Take this case, where we have two storefiles (one column family and two qualifiers):
Storefile 1: row0, row1, row2, row3, row5 (all with qual 1)
Storefile 2: row4 and row6 (with qual 2)

Now, after fetching row3, if we try to reseek to row4, the scanner on storefile 
1 will already have moved on to row5, so that next() call will have switched 
current to the scanner on storefile 2. 
The reseek to row4 will therefore start on the scanner on storefile 2.
Now suppose we fetch row2 and then reseek to row4. The scanner on storefile 1 
will be at row3, so when we reseek it will see that row3 in that scanner is 
less than row4 (the reseek row) and will try to get the next scanner from the 
heap (this happens as part of reseek). So the PQ.add(current) that reseek() 
does every time can be avoided. 
Let me run this through HadoopQA and see if there is something wrong with my reasoning.
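
To make the two cases above concrete, here is a minimal toy sketch in Java. It is not the actual KeyValueHeap code: rows are modelled as plain ints, and RowScanner, ToyKVHeap, reseekAdds and the addCurrentFirst flag are all hypothetical names made up for illustration. With addCurrentFirst=true it follows the flow as described in the issue (add current to the PQ, then poll); with false it checks current first and only then polls.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

/** Toy scanner over one sorted "storefile"; rows are modelled as ints (assumption). */
final class RowScanner {
    private final int[] rows;
    private int pos = 0;

    RowScanner(int... sortedRows) { this.rows = sortedRows; }

    boolean exhausted() { return pos >= rows.length; }

    int peek() { return rows[pos]; }   // caller must check exhausted() first

    void next() { pos++; }

    /** Forward-only seek to the first row >= key; false if nothing is left. */
    boolean reseek(int key) {
        while (!exhausted() && rows[pos] < key) pos++;
        return !exhausted();
    }
}

/** Toy heap: 'current' lives outside the PQ and always holds the smallest top row. */
final class ToyKVHeap {
    int reseekAdds = 0;   // number of PQ.add calls made inside reseek()
    private final PriorityQueue<RowScanner> heap =
            new PriorityQueue<>(Comparator.comparingInt(RowScanner::peek));
    private RowScanner current;

    ToyKVHeap(RowScanner... scanners) {
        for (RowScanner s : scanners) if (!s.exhausted()) heap.add(s);
        current = heap.poll();
    }

    Integer peek() { return current == null ? null : current.peek(); }

    /** Returns the current row, advances, and re-selects the smallest scanner. */
    Integer next() {
        if (current == null) return null;
        int row = current.peek();
        current.next();
        if (!current.exhausted()) heap.add(current);
        current = heap.poll();
        return row;
    }

    /** Returns true if some scanner ends up positioned at a row >= seekKey. */
    boolean reseek(int seekKey, boolean addCurrentFirst) {
        if (current == null) return false;
        RowScanner scanner = current;
        current = null;
        if (addCurrentFirst) {          // flow described in the issue: add, then poll
            reseekAdds++;
            heap.add(scanner);
            scanner = heap.poll();
        }
        while (scanner != null) {
            if (scanner.peek() >= seekKey) {   // top row is at or after the seek key
                current = scanner;
                return true;
            }
            if (scanner.reseek(seekKey)) {     // still has rows: put it back in the PQ
                reseekAdds++;
                heap.add(scanner);
            }
            scanner = heap.poll();             // move on to the next smallest scanner
        }
        return false;
    }
}
```

In this toy model, with storefile 1 = (0, 1, 2, 3, 5) and storefile 2 = (4, 6), both variants end up on row4 in both cases. After fetching row3, the reseek to row4 needs one PQ add with the initial add and none without it. After fetching row2, it needs two adds with the initial add and one without it.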


> Small improvement in KVHeap seek() API
> --------------------------------------
>
>                 Key: HBASE-14826
>                 URL: https://issues.apache.org/jira/browse/HBASE-14826
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: ramkrishna.s.vasudevan
>            Priority: Minor
>         Attachments: HBASE-14826.patch
>
>
> Currently, in the seek/reseek() APIs we do a lot of priority-queue-related 
> operations. We first add the current scanner to the heap, then poll, and 
> add the scanner back again if the seekKey is greater than the top key in that 
> scanner. Since the KVs are always in increasing order, and in the ideal scan 
> flow every seek/reseek is followed by a next() call, it should be OK to 
> start by checking the current scanner and then do a poll to get the next 
> scanner, avoiding the initial PQ.add(current) call. This could save some 
> comparisons. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
