[ https://issues.apache.org/jira/browse/HDFS-9543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15246011#comment-15246011 ]
Anu Engineer commented on HDFS-9543:
------------------------------------

[~eddyxu] Thank you for the code review comments. Please see some of my thoughts on your suggestions below.

bq. Could you put getNextBlock() logic into a separate Iterator, and make it Closable, which will include getBlockToCopy(), openPoolIters(), getNextBlock(), closePoolIters().

There are a few drawbacks to separating them into different functions. I thought this was an excellent idea and explored this path of code organization, but I got stuck on an intractable problem that makes me want to abandon it. Please do let me know if you have suggestions on how I can solve it. To support an Iterator interface, we have to support hasNext(). In our case that means scanning the block list for a block that is smaller than the required move size. There are two ways to do this. We can look for such a block and report true if one is found, but there is no guarantee that it will actually be returned by the next() call, since these blocks can go away underneath us. The common iterator pattern -- while (hasNext()) next() -- then becomes complex to write, because next() can fail even after hasNext() has succeeded. Alternatively, we can keep a pointer to the found block in memory and return it from the next() call, but that means doing unnecessary block state management inside the iterator. I ended up writing all of this, found that the code was getting more complex instead of simpler, and decided to abandon this approach.

bq. 1) The states (i.e. poolIndex,) are stored outside these functions, the caller needs maintain these states.

These are part of the BlockMover class, and copyBlocks() is the only call made by other classes, so this state is not visible to the caller at all.

bq. 2) poolIndex is never initialized and is not able be reset.

poolIndex is an index into a circular array. I can initialize it to 0 if you like, but in most cases we just move to the next block pool and get the next block.
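To make the circular scan concrete, here is a minimal sketch of the idea (a hypothetical illustration only, not the actual patch code; the CircularPoolScan class name and the use of plain Iterator<Long> as a stand-in for per-pool block iterators are assumptions for this example):

```java
import java.util.Iterator;
import java.util.List;

/**
 * Sketch of a circular scan over per-block-pool iterators: poolIndex
 * wraps around the list, and a per-fetch counter tells us when every
 * pool has been visited without finding a block.
 */
class CircularPoolScan {
  private final List<Iterator<Long>> poolIters; // one iterator per block pool
  private int poolIndex = 0;                    // circular index; never needs a reset

  CircularPoolScan(List<Iterator<Long>> poolIters) {
    this.poolIters = poolIters;
  }

  /** Returns the next block from the pools, or null if all pools are exhausted. */
  Long getNextBlock() {
    int visited = 0;                            // re-initialized on every fetch
    while (visited < poolIters.size()) {
      Iterator<Long> it = poolIters.get(poolIndex);
      poolIndex = (poolIndex + 1) % poolIters.size(); // advance circularly
      visited++;
      if (it.hasNext()) {
        return it.next();
      }
    }
    return null;                                // visited every pool; nothing left
  }
}
```

Because poolIndex simply wraps around, its starting value does not matter for correctness; only the visited counter, re-initialized per fetch, decides when the scan terminates.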
Before each block fetch we initialize the count variable so that we know when we have visited all the block pools. In other words, users should not be able to see this variable, nor should they need to reset it.

bq. Please always log the IOEs. And I think it is better to throw IOE here as well as many other places.

I will log a debug trace here. The reason we are not throwing is that we might encounter a damaged block while moving a large number of blocks from one volume to another. Instead of failing or aborting the whole action, we keep a count of the errors we have encountered: we just ignore the bad block and continue copying until we hit max_error_counts. For each move -- that is, a source disk, a destination disk, and a number of bytes to move -- you can specify the maximum error count. Hence we are not throwing but keeping track of the error count.

bq. Can it be a private static List<BlockIterator> openPoolIters() ?

Since we are not doing the iterator interface, I am skipping this too.

bq. In a few such places, should we actually break the while loop? Wouldn't continue here just generate a lot of LOGS and spend CPU cycles?

You are right, and I did think of writing a break here. The reason I chose continue over break is this: it is easier to reason about a loop if it has only one exit point. With break, you have to reason about all the exit points, and this loop is a pretty complicated one since it can exit in many ways. Here are some:
# when we meet the maximum error count
# when we have gotten close enough to the bytes to move -- say, within 10% of the target
# when we get an interrupt call from the client
# when we get a shutdown call
# when we are not able to find any blocks to move
# when we are out of destination space
So instead of making people reason about each of these, we set an exit flag and loop back to the top. Since the while condition turns false, the loop has one single exit, and it exits without any extra logging.

bq. Why do you need to change float to double.
In this case, wouldn't float be good enough?

This is a stylistic change; the Java documentation recommends double as the default choice over float. Hence this fix.

> DiskBalancer : Add Data mover
> ------------------------------
>
>                 Key: HDFS-9543
>                 URL: https://issues.apache.org/jira/browse/HDFS-9543
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode
>            Reporter: Anu Engineer
>            Assignee: Anu Engineer
>         Attachments: HDFS-9543-HDFS-1312.001.patch
>
> This patch adds the actual mover logic to the datanode.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)