[ 
https://issues.apache.org/jira/browse/HDFS-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968350#comment-15968350
 ] 

Rushabh S Shah commented on HDFS-11634:
---------------------------------------

bq. Agreed, since we never iterate backwards, we don't need iterators from 
skipped storages.
We do iterate backwards when the requested data size is more than the size 
fetched from the randomly chosen offset.
The code creates a brand-new iterator there; it would be nice if we could 
reuse the iterator created above by resetting its index.
{noformat}
  if (totalSize < size) {
    iter = node.getBlockIterator(); // start from the beginning
    for (int i = 0; i < startBlock && totalSize < size; i++) {
      curBlock = iter.next();
      if (!curBlock.isComplete()) continue;
      if (curBlock.getNumBytes() < getBlocksMinBlockSize) {
        continue;
      }
      totalSize += addBlock(curBlock, results);
    }
  }
{noformat}
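The reuse idea above can be sketched as a list-backed iterator with a rewind 
operation. This is a minimal illustration, not the Hadoop API: the class name 
and {{rewind()}} method are hypothetical, standing in for whatever index reset 
the actual {{BlockIterator}} would expose.

```java
import java.util.Iterator;
import java.util.List;

/**
 * Hypothetical sketch: an iterator that can be rewound to the start,
 * so the wrap-around pass could reuse one iterator instead of asking
 * the node for a second one.
 */
class RewindableIterator<T> implements Iterator<T> {
  private final List<T> items;
  private int pos = 0;

  RewindableIterator(List<T> items) {
    this.items = items;
  }

  @Override
  public boolean hasNext() {
    return pos < items.size();
  }

  @Override
  public T next() {
    return items.get(pos++);
  }

  /** Resets the cursor to the first element for the wrap-around pass. */
  void rewind() {
    pos = 0;
  }
}
```

With this, the second loop would call {{iter.rewind()}} instead of 
{{node.getBlockIterator()}}, avoiding the cost of rebuilding the iterator 
state.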

> Optimize BlockIterator when iterating starts in the middle.
> ------------------------------------------------------------
>
>                 Key: HDFS-11634
>                 URL: https://issues.apache.org/jira/browse/HDFS-11634
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.6.5
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>         Attachments: HDFS-11634.001.patch, HDFS-11634.002.patch, 
> HDFS-11634.003.patch, HDFS-11634.004.patch, HDFS-11643.005.patch
>
>
> {{BlockManager.getBlocksWithLocations()}} needs to iterate blocks from a 
> randomly selected {{startBlock}} index. It creates an iterator which points 
> to the first block and then skips all blocks until {{startBlock}}. It is 
> inefficient when DN has multiple storages. Instead of skipping blocks one by 
> one we can skip entire storages. Should be more efficient on average.
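
The storage-skipping idea in the description can be sketched as locating the 
storage that contains a global {{startBlock}} index by subtracting whole 
per-storage block counts, rather than advancing an iterator block by block. 
This is an illustrative sketch, not the actual patch: the class and method 
names are hypothetical.

```java
import java.util.Arrays;

/**
 * Hypothetical sketch: given the number of blocks held by each storage
 * on a DN, find the storage containing a global startBlock index by
 * skipping entire storages, instead of stepping over blocks one by one.
 */
public class StorageSkipSketch {

  /** Returns {storageIndex, offsetWithinStorage} for startBlock. */
  static int[] locate(int[] blocksPerStorage, int startBlock) {
    int offset = startBlock;
    for (int s = 0; s < blocksPerStorage.length; s++) {
      if (offset < blocksPerStorage[s]) {
        // startBlock falls inside storage s at this offset.
        return new int[] {s, offset};
      }
      // Skip the entire storage in O(1) instead of O(blocks).
      offset -= blocksPerStorage[s];
    }
    // startBlock is past the last block on this DN.
    return new int[] {blocksPerStorage.length, 0};
  }
}
```

For example, with storages holding 4, 2, and 5 blocks, global index 5 lands in 
storage 1 at offset 1 after one subtraction, where the original code would 
have skipped five blocks individually.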



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
