[jira] [Commented] (HDFS-9090) Write hot data on few nodes may cause performance issue

2015-09-20 Thread He Tianyi (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14900148#comment-14900148
 ] 

He Tianyi commented on HDFS-9090:
-

The combination theory makes sense. Thanks, guys.

Since multiple placement policies are not supported yet, I took the approach of 
having DFSClient add nodes in the local rack to {{excludeNodes}} during calls 
to {{getAdditionalBlock}}. This is ugly but solves the problem for now. 
I'll wait for either HDFS-4894 or HDFS-7068 to be implemented, then use a 
custom policy without write locality only for this data.
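As a rough illustration of that workaround (this is not the actual DFSClient code; node and rack names are invented), the exclude-local-rack computation might look like:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ExcludeLocalRack {
    // Return the nodes that share the writer's rack; the workaround described
    // above would pass these as excludedNodes on each getAdditionalBlock call
    // so the NameNode never places a replica on the writer's rack.
    static List<String> localRackNodes(Map<String, String> nodeToRack, String writer) {
        String rack = nodeToRack.get(writer);
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : nodeToRack.entrySet()) {
            if (e.getValue().equals(rack)) {
                out.add(e.getKey());
            }
        }
        out.sort(null);  // deterministic order for display
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> topology = Map.of(
                "dn1", "/rack1", "dn2", "/rack1", "dn3", "/rack2");
        System.out.println(localRackNodes(topology, "dn1")); // [dn1, dn2]
    }
}
```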

> Write hot data on few nodes may cause performance issue
> ---
>
> Key: HDFS-9090
> URL: https://issues.apache.org/jira/browse/HDFS-9090
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.3.0
>Reporter: He Tianyi
>Assignee: He Tianyi
>
> (I am not sure whether this should be reported as a BUG; feel free to modify 
> this.)
> The current block placement policy makes a best effort to place the first 
> replica on the local node whenever possible.
> Consider the following scenario:
> 1. There are 500 datanodes across plenty of racks.
> 2. Raw user action logs (just an example) are being written on only 10 nodes, 
> which also have datanodes deployed locally.
> 3. Then, before any balancing, all these logs will have at least one replica 
> on these 10 nodes, implying one third of reads on these logs will be served 
> by these 10 nodes if the replication factor is 3; performance suffers.
> I propose to solve this scenario by introducing a configuration entry for the 
> client to disable an arbitrary level of write locality.
> We can then either (A) add local nodes to excludedNodes, or (B) tell the 
> NameNode the locality we prefer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9090) Write hot data on few nodes may cause performance issue

2015-09-19 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14877135#comment-14877135
 ] 

Steve Loughran commented on HDFS-9090:
--

I'm in favour of multiple placement policies, not options on the current one, 
with those policies designed to share as much code as they can (i.e. not the 
way the YARN schedulers currently work).

Allowing apps to choose a placement policy when creating an FS client instance 
would let you deploy different configs for different parts of the cluster; 
this would tie in well with label-driven placement.



[jira] [Commented] (HDFS-9090) Write hot data on few nodes may cause performance issue

2015-09-16 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791500#comment-14791500
 ] 

Walter Su commented on HDFS-9090:
-

bq. Based on that, how about add one parameter, perhaps named "localityLevel" 
to chooseTarget
HDFS-8390 tries to do the same thing, but it isn't worth complicating the 
default policy if it's not a popular demand. If 10 users have 10 special 
needs, we have 2^10 combinations; it would be a disaster to put all of them in 
the default policy.



[jira] [Commented] (HDFS-9090) Write hot data on few nodes may cause performance issue

2015-09-16 Thread He Tianyi (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791457#comment-14791457
 ] 

He Tianyi commented on HDFS-9090:
-

I'm not quite sure, but I do think this is perhaps orthogonal to 
{{BlockPlacementPolicy}}.

Assume that HDFS-7068 is implemented. In that case, one can configure a 
{{BlockPlacementPolicy}} for a specified INode, so write operations under a 
particular directory can be forced to scatter data across the cluster. 
But given that {{BlockPlacementPolicy}} focuses on where replicas should be 
located, each policy may split into two different versions (with locality, and 
without).
That is, we have {{BlockPlacementPolicyDefault}}, so perhaps we need a 
{{BlockPlacementPolicyDefaultWithoutWriteLocality}}.
And for a real case, we have {{BlockPlacementPolicyWithMultiDC}}, so perhaps 
we also need a {{BlockPlacementPolicyWithMultiDCWithoutWriteLocality}}.
That holds even though the latter could be implemented by just overriding a 
few methods.

Based on that, how about adding one parameter, perhaps named "localityLevel", 
to {{chooseTarget}}, so each policy can make its own decision without the 
burden of implementing two versions?

This would also work when multiple policies are not supported.
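A hypothetical sketch of what such a parameter could look like (all names are invented, not actual HDFS code; a real policy would also weigh racks, load, and excluded nodes, and would choose randomly among candidates):

```java
import java.util.ArrayList;
import java.util.List;

public class LocalityLevelSketch {
    enum LocalityLevel { NODE, RACK, NONE }  // how local the first replica may be

    // Toy "default" policy: with NODE locality the writer gets the first
    // replica; with NONE the writer is skipped entirely. Candidates are taken
    // in order here purely to keep the example deterministic.
    static List<String> chooseTarget(List<String> candidates, String writer,
                                     int replicas, LocalityLevel level) {
        List<String> chosen = new ArrayList<>();
        if (level == LocalityLevel.NODE && candidates.contains(writer)) {
            chosen.add(writer);
        }
        for (String node : candidates) {
            if (chosen.size() >= replicas) break;
            if (level == LocalityLevel.NONE && node.equals(writer)) continue;
            if (!chosen.contains(node)) chosen.add(node);
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<String> nodes = List.of("dn1", "dn2", "dn3", "dn4");
        System.out.println(chooseTarget(nodes, "dn1", 3, LocalityLevel.NODE)); // [dn1, dn2, dn3]
        System.out.println(chooseTarget(nodes, "dn1", 3, LocalityLevel.NONE)); // [dn2, dn3, dn4]
    }
}
```

The point is that one knob on {{chooseTarget}} replaces a parallel ...WithoutWriteLocality subclass per policy.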



[jira] [Commented] (HDFS-9090) Write hot data on few nodes may cause performance issue

2015-09-16 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791410#comment-14791410
 ] 

Walter Su commented on HDFS-9090:
-

bq. The placement policy in the erasure coding branch achieves the goal of 
spreading the data across racks.
It could still burden the local racks where the 10 Storm nodes are located. 
Maybe you can customize a policy: just override {{chooseTargetInOrder}} to use 
{{chooseRandom}} entirely. But that hurts YARN applications' performance. 
HDFS-4894 or HDFS-7068 could be very helpful but are not implemented yet. 
Maybe you should take the advice from [~ste...@apache.org] and use ingest 
nodes.
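A toy sketch of that fully random variant (not real {{BlockPlacementPolicy}} code; names are invented, and a fixed seed is used only so runs are repeatable):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class FullyRandomPlacement {
    // Every replica target is drawn uniformly from the whole cluster,
    // ignoring the writer entirely; this spreads write load but gives up
    // read locality for the writing application.
    static List<String> chooseRandom(List<String> cluster, int replicas, long seed) {
        List<String> shuffled = new ArrayList<>(cluster);
        Collections.shuffle(shuffled, new Random(seed));
        return shuffled.subList(0, Math.min(replicas, shuffled.size()));
    }

    public static void main(String[] args) {
        List<String> cluster = List.of("dn1", "dn2", "dn3", "dn4", "dn5");
        System.out.println(chooseRandom(cluster, 3, 42L));
    }
}
```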



[jira] [Commented] (HDFS-9090) Write hot data on few nodes may cause performance issue

2015-09-16 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14790984#comment-14790984
 ] 

Zhe Zhang commented on HDFS-9090:
-

[~ste...@apache.org] Good thought. The placement policy in the erasure coding 
branch achieves the goal of spreading the data across racks; [~walter.k.su] 
did the work under HDFS-8186. Right now we are only switching the policy on 
for EC files.



[jira] [Commented] (HDFS-9090) Write hot data on few nodes may cause performance issue

2015-09-16 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14790966#comment-14790966
 ] 

Steve Loughran commented on HDFS-9090:
--

This deployment makes sense: the traditional placement policy is based on the 
assumption that the writer of the data may want to read it again, so it leaves 
a copy close by. That doesn't hold for something that wants to spread its data 
across the racks, with the expectation that other work will read it in. It 
doesn't just hurt disk usage; it would bias YARN workloads to run on those 
nodes, or at least on the same rack.



[jira] [Commented] (HDFS-9090) Write hot data on few nodes may cause performance issue

2015-09-16 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14790745#comment-14790745
 ] 

Zhe Zhang commented on HDFS-9090:
-

This could also be addressed through multiple / customizable placement 
policies: HDFS-4894 and HDFS-7068.



[jira] [Commented] (HDFS-9090) Write hot data on few nodes may cause performance issue

2015-09-16 Thread He Tianyi (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14769041#comment-14769041
 ] 

He Tianyi commented on HDFS-9090:
-

Alternatively, from a different perspective, perhaps we could also add a 
{{considerLoad}} option to {{getBlockLocations}}, which would affect the sort 
weight of overloaded datanodes. This is similar to {{considerLoad}} during 
writes.
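A toy illustration of that idea (not HDFS code; the load metric and all names are invented): sort a block's replica locations so lightly loaded nodes come first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class LoadAwareSort {
    // Order replica locations for a read: when considerLoad is on, a node's
    // active transfer count (xceivers) drives the sort so overloaded nodes
    // drop toward the end of the returned list.
    static List<String> orderReplicas(List<String> replicas,
                                      Map<String, Integer> xceivers,
                                      boolean considerLoad) {
        List<String> out = new ArrayList<>(replicas);
        if (considerLoad) {
            out.sort(Comparator.comparingInt(n -> xceivers.getOrDefault(n, 0)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> replicas = List.of("dn7", "dn2", "dn5");
        Map<String, Integer> load = Map.of("dn7", 120, "dn2", 3, "dn5", 40);
        System.out.println(orderReplicas(replicas, load, true));  // [dn2, dn5, dn7]
        System.out.println(orderReplicas(replicas, load, false)); // [dn7, dn2, dn5]
    }
}
```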



[jira] [Commented] (HDFS-9090) Write hot data on few nodes may cause performance issue

2015-09-16 Thread He Tianyi (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14768973#comment-14768973
 ] 

He Tianyi commented on HDFS-9090:
-

Thanks, [~ste...@apache.org].

My case may be a little rare. These writer nodes actually have Storm deployed, 
and it is Storm jobs that feed HDFS with logs.
Due to cost control and the budget cycle, it is natural to deploy a DataNode 
on every machine that has enough hardware resources.
(Otherwise it would be a waste to keep the hard disks of 'ingest nodes' almost 
empty.)

IMHO this could be a common scenario for medium-sized startups.



[jira] [Commented] (HDFS-9090) Write hot data on few nodes may cause performance issue

2015-09-16 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747311#comment-14747311
 ] 

Steve Loughran commented on HDFS-9090:
--

Moved to improvement. Generally work is balanced enough across a cluster that 
you don't get overloaded nodes; presumably this is a deployment where you have 
some servers dedicated to a specific role, rather than having YARN place work 
wherever it chooses.

FWIW, a lot of Hadoop clusters have 'edge nodes'/'ingest nodes' that (a) don't 
have any local DN and (b) are sometimes hooked straight to the top-level 
switch at 10 Gb/s, so they get great performance scattering data across the 
cluster.
