[ 
https://issues.apache.org/jira/browse/HDFS-9090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791457#comment-14791457
 ] 

He Tianyi commented on HDFS-9090:
---------------------------------

I am not entirely sure, but I think this is orthogonal to 
{{BlockPlacementPolicy}}.

Assume that HDFS-7068 is implemented. In that case, one can configure a 
{{BlockPlacementPolicy}} for a specified INode, so writes under a particular 
directory can certainly be forced to scatter data across the cluster. 
But, given that {{BlockPlacementPolicy}} focuses on where replicas should be 
located, each policy may then have to be split into two versions (with write 
locality, and without).
That is, we have {{BlockPlacementPolicyDefault}}, so perhaps we would need a 
{{BlockPlacementPolicyDefaultWithoutWriteLocality}}.
And for a real case, we have {{BlockPlacementPolicyWithMultiDC}}, so perhaps 
we would also need a {{BlockPlacementPolicyWithMultiDCWithoutWriteLocality}}.
This is the case even though the latter could be implemented by overriding 
just a few methods.

Based on that, how about adding one parameter, perhaps named "localityLevel", 
to {{chooseTarget}}? Then each policy can handle locality in its own way 
without the burden of implementing two versions.

This could also work when multiple policies are not supported.
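
To make the idea concrete, here is a rough, self-contained sketch of what a 
locality-aware {{chooseTarget}} might look like. Everything in it is an 
assumption for illustration only: {{LocalityLevel}}, {{Node}}, 
{{PlacementPolicy}} and {{SimplePolicy}} are hypothetical stand-ins, not the 
real HDFS classes, and the signature is much simpler than the actual 
{{BlockPlacementPolicy#chooseTarget}}. Candidates are taken in list order, 
whereas a real policy would randomize and apply its rack rules.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types are the real HDFS classes.
public class LocalityLevelSketch {

    /** Assumed hint describing how much write locality the client wants. */
    public enum LocalityLevel { NODE_LOCAL, RACK_LOCAL, ANY }

    /** Minimal stand-in for a datanode: a name plus the rack it lives on. */
    public static final class Node {
        public final String name;
        public final String rack;
        public Node(String name, String rack) { this.name = name; this.rack = rack; }
    }

    /** Simplified stand-in for a placement policy with the extra parameter. */
    public interface PlacementPolicy {
        List<Node> chooseTarget(Node writer, int replicas,
                                List<Node> cluster, LocalityLevel level);
    }

    /** One policy class that serves every locality level. */
    public static final class SimplePolicy implements PlacementPolicy {
        @Override
        public List<Node> chooseTarget(Node writer, int replicas,
                                       List<Node> cluster, LocalityLevel level) {
            // Below NODE_LOCAL, drop the writer's own node from the candidates
            // (similar in spirit to adding it to excludedNodes).
            List<Node> candidates = new ArrayList<>();
            for (Node n : cluster) {
                boolean isWriter = n.name.equals(writer.name);
                if (isWriter && level != LocalityLevel.NODE_LOCAL) {
                    continue;
                }
                candidates.add(n);
            }

            // Pick the preferred first replica, if the level asks for one.
            Node first = null;
            for (Node n : candidates) {
                if (level == LocalityLevel.NODE_LOCAL && n.name.equals(writer.name)) {
                    first = n;
                    break;
                }
                if (level == LocalityLevel.RACK_LOCAL && n.rack.equals(writer.rack)) {
                    first = n;
                    break;
                }
            }

            // Fill the remaining replicas in candidate order.
            List<Node> result = new ArrayList<>();
            if (first != null && replicas > 0) {
                result.add(first);
            }
            for (Node n : candidates) {
                if (result.size() >= replicas) {
                    break;
                }
                if (n != first) {
                    result.add(n);
                }
            }
            return result;
        }
    }

    /** Helper that turns a list of nodes into their names. */
    public static List<String> names(List<Node> nodes) {
        List<String> out = new ArrayList<>();
        for (Node n : nodes) {
            out.add(n.name);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Node> cluster = new ArrayList<>();
        cluster.add(new Node("a", "r1"));
        cluster.add(new Node("b", "r1"));
        cluster.add(new Node("c", "r2"));
        cluster.add(new Node("d", "r2"));
        Node writer = new Node("c", "r2");
        PlacementPolicy policy = new SimplePolicy();
        for (LocalityLevel level : LocalityLevel.values()) {
            System.out.println(level + " -> "
                + names(policy.chooseTarget(writer, 3, cluster, level)));
        }
    }
}
```

With this shape, a policy such as the multi-DC one would only need to 
interpret the extra argument inside its existing {{chooseTarget}} instead of 
shipping a second "WithoutWriteLocality" class.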

> Write hot data on few nodes may cause performance issue
> -------------------------------------------------------
>
>                 Key: HDFS-9090
>                 URL: https://issues.apache.org/jira/browse/HDFS-9090
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.3.0
>            Reporter: He Tianyi
>            Assignee: He Tianyi
>
> (I am not sure whether this should be reported as a bug; feel free to modify 
> this.)
> The current block placement policy makes a best effort to place the first 
> replica on the local node whenever possible.
> Consider the following scenario:
> 1. There are 500 datanodes across plenty of racks,
> 2. Raw user action logs (just an example) are being written on only 10 
> nodes, which also have datanodes deployed locally,
> 3. Then, before any balancing, all these logs will have at least one replica 
> on those 10 nodes, implying that one third of the reads of these logs will be 
> served by these 10 nodes if the replication factor is 3, so performance 
> suffers.
> I propose to solve this by introducing a configuration entry that lets the 
> client disable an arbitrary level of write locality.
> Then we can either (A) add local nodes to excludedNodes, or (B) tell NameNode 
> the locality we prefer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
