[
https://issues.apache.org/jira/browse/HADOOP-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12536802
]
Hairong Kuang commented on HADOOP-1912:
---------------------------------------
Thanks Raghu!
1. Throttling test is in TestBlockReplacement.
2. Regarding the throttler, yes, I totally agree with you. I will make
ThrottlerBase package private, and it would be nice if a throttler could be
shared by multiple threads. Let's see how this could be done.
3. For comment 3, your change is not exactly the same as the logic in the
current code. If you'd like to merge the two checks into one, I will change the
code to
if (priSet.contains(delNodeHint) || (addedNode != null &&
!priSet.contains(addedNode))) { cur = delNodeHint; }
4. For addBlock, the current code may have null locations in machineSet, which
causes a serialization error, so I use an ArrayList first and then convert it
into an array when constructing the result.
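
Points 3 and 4 can be sketched roughly as follows. This is only an illustration, not the code in the patch: plain strings stand in for datanode descriptors, the names useDeleteHint and nonNullLocations are hypothetical, and only priSet, delNodeHint, addedNode and machineSet come from the discussion above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ReplacementSketch {
    // Point 3 (assumed shape): the merged check that decides whether the
    // delete hint should be used as the node to remove the replica from.
    static boolean useDeleteHint(Set<String> priSet, String delNodeHint, String addedNode) {
        return priSet.contains(delNodeHint)
            || (addedNode != null && !priSet.contains(addedNode));
    }

    // Point 4 (assumed shape): gather only the non-null locations in a list,
    // then convert the list to an array so no null entry gets serialized.
    static String[] nonNullLocations(String[] machineSet) {
        List<String> locations = new ArrayList<String>();
        for (String machine : machineSet) {
            if (machine != null) {
                locations.add(machine);
            }
        }
        return locations.toArray(new String[0]);
    }

    public static void main(String[] args) {
        Set<String> priSet = new HashSet<String>(Arrays.asList("dn1", "dn2"));
        System.out.println(useDeleteHint(priSet, "dn1", null));
        System.out.println(Arrays.toString(
            nonNullLocations(new String[] {"dn1", null, "dn3"})));
    }
}
```

The list is used because the number of non-null locations is not known until the loop finishes; the array is sized only at the end.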
> Datanode should support block replacement
> -----------------------------------------
>
> Key: HADOOP-1912
> URL: https://issues.apache.org/jira/browse/HADOOP-1912
> Project: Hadoop
> Issue Type: New Feature
> Components: dfs
> Affects Versions: 0.14.1
> Reporter: Hairong Kuang
> Assignee: Hairong Kuang
> Attachments: replace.patch, replace1.patch, replace2.patch,
> replace3.patch
>
>
> This jira covers Data Node support for rebalancing (HADOOP-1652). When a
> balancer decides to move a block B from Source S to Destination D, it also
> chooses a proxy source PS, which contains a replica of B, to speed up block
> copy. The block placement is carried out in the following steps:
> 1. A block copy command is sent to datanode PS in the format of
> "OP_BLOCK_COPY <block_id_of_B> <source S> <destination D>". It requests PS to
> copy B to datanode D.
> 2. PS then transfers block B to datanode D with a block replacement command
> to D in the format of "OP_BLOCK_REPLACEMENT <block_id_of_B> <source S>
> <data_of_B>".
> 3. Datanode D writes block B to its disk and then sends the namenode a
> blockReceived RPC informing it that block B has been received and asking it
> to delete a replica of B from source S if there is any excess replica.
> 4. The namenode then adds datanode D to block B's map and removes an excess
> replica of B, preferring the one on datanode S.
> In addition, each data node has a limited bandwidth for rebalancing. The
> default value for the bandwidth is 5MB/s. Throttling is done on both the
> source and destination sides. Each data node limits the maximum number of
> concurrent data transfers (both sending and receiving) for rebalancing to 5.
> In the worst case, each data transfer is therefore limited to a bandwidth of
> 1MB/s. Each sender and receiver has a Throttler. The primary method of the
> class is "throttle( int numOfBytes )". The parameter numOfBytes indicates the
> total number of bytes that the caller has sent or received since throttle
> was last called. The method calculates the caller's I/O rate. If the rate is
> faster than the bandwidth limit, it sleeps to slow down the data transfer.
> After it wakes up, it adjusts its bandwidth limit if the number of
> concurrent data transfers has changed.
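
The throttling behaviour described in the issue text above might look roughly like the sketch below. It is not the Throttler from the patch: the class name ThrottlerSketch, the helpers share and sleepMillis, the equal split of the node-wide budget among transfers, and the extra concurrentTransfers parameter on throttle are all assumptions made for illustration.

```java
class ThrottlerSketch {
    private final long totalBytesPerSec; // node-wide budget, e.g. 5MB/s
    private long bytesPerSec;            // this transfer's current limit
    private long periodStartMs;          // start of the measurement window
    private long bytesThisPeriod;        // bytes reported in this window

    ThrottlerSketch(long totalBytesPerSec, int concurrentTransfers) {
        this.totalBytesPerSec = totalBytesPerSec;
        this.bytesPerSec = share(totalBytesPerSec, concurrentTransfers);
        this.periodStartMs = System.currentTimeMillis();
    }

    // Assumed policy: the budget is split evenly among active transfers,
    // which gives 1MB/s each for 5 transfers sharing 5MB/s.
    static long share(long totalBytesPerSec, int concurrentTransfers) {
        return totalBytesPerSec / Math.max(1, concurrentTransfers);
    }

    // Milliseconds to wait so that bytes / elapsed time stays within the limit.
    static long sleepMillis(long bytes, long elapsedMs, long bytesPerSec) {
        long minimumMs = bytes * 1000 / bytesPerSec;
        return Math.max(0, minimumMs - elapsedMs);
    }

    // numOfBytes: bytes sent or received since the previous call.
    // concurrentTransfers: hypothetical extra argument; the source does not
    // say how the throttler learns the current number of transfers.
    synchronized void throttle(int numOfBytes, int concurrentTransfers)
            throws InterruptedException {
        bytesThisPeriod += numOfBytes;
        long elapsedMs = System.currentTimeMillis() - periodStartMs;
        long waitMs = sleepMillis(bytesThisPeriod, elapsedMs, bytesPerSec);
        if (waitMs > 0) {
            Thread.sleep(waitMs); // caller is going too fast, slow it down
        }
        // After waking up, adjust the limit if the transfer count changed.
        long newLimit = share(totalBytesPerSec, concurrentTransfers);
        if (newLimit != bytesPerSec) {
            bytesPerSec = newLimit;
            periodStartMs = System.currentTimeMillis();
            bytesThisPeriod = 0;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ThrottlerSketch throttler = new ThrottlerSketch(5L * 1024 * 1024, 5);
        long start = System.currentTimeMillis();
        throttler.throttle(256 * 1024, 5); // report 256KB against a 1MB/s share
        System.out.println("waited about "
            + (System.currentTimeMillis() - start) + " ms");
    }
}
```

Keeping the rate calculation in a separate method with no clock access makes it easy to check the arithmetic without actually sleeping.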
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.