[jira] [Commented] (HDFS-16275) [HDFS] Enable considerLoad for localWrite

2021-10-25 Thread Janus Chow (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17433687#comment-17433687
 ] 

Janus Chow commented on HDFS-16275:
---

[~ayushtkn] Thank you for your quick explanation.

In fact, I was curious and confused about the default of "false" here. I 
thought considerLoad was the same kind of check as the existing conditions in 
`isGoodDatanode`, such as excluding stale or slow nodes, which make no 
distinction between local and non-local writes.

For configuration, since considerLoad comes from the config 
"dfs.namenode.replication.considerLoad", would 
"dfs.namenode.replication.locality.considerLoad" be an option?
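
To make the configuration question concrete, here is a minimal sketch of how a separate locality switch could sit next to the existing one. It is only an illustration under stated assumptions: a plain Map stands in for Hadoop's Configuration class, "dfs.namenode.replication.locality.considerLoad" is just the name floated above and not a confirmed Hadoop property, and the class name, helper names, and both default values are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a plain Map stands in for Hadoop's Configuration.
class ConsiderLoadConfigSketch {
    // Existing key named in the comment above.
    static final String CONSIDER_LOAD_KEY =
        "dfs.namenode.replication.considerLoad";
    // Key name proposed in the comment above; not a confirmed Hadoop property.
    static final String LOCALITY_CONSIDER_LOAD_KEY =
        "dfs.namenode.replication.locality.considerLoad";

    // Reads a boolean setting, falling back to the given default when unset.
    static boolean getBoolean(Map<String, String> conf, String key,
                              boolean defaultValue) {
        String value = conf.get(key);
        return value == null ? defaultValue : Boolean.parseBoolean(value.trim());
    }

    // Assumption: the load check applies to local writes only when the general
    // switch is on AND the locality switch is on. The locality switch defaults
    // to false here so that existing behaviour is unchanged unless opted in.
    static boolean considerLoadForLocalWrite(Map<String, String> conf) {
        boolean considerLoad = getBoolean(conf, CONSIDER_LOAD_KEY, true);
        boolean localityConsiderLoad =
            getBoolean(conf, LOCALITY_CONSIDER_LOAD_KEY, false);
        return considerLoad && localityConsiderLoad;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println("unset:  " + considerLoadForLocalWrite(conf));
        conf.put(LOCALITY_CONSIDER_LOAD_KEY, "true");
        System.out.println("opt-in: " + considerLoadForLocalWrite(conf));
    }
}
```

Keeping the new switch off by default means a cluster that never sets it behaves exactly as before.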

 

 

> [HDFS] Enable considerLoad for localWrite
> -
>
> Key: HDFS-16275
> URL: https://issues.apache.org/jira/browse/HDFS-16275
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Janus Chow
>Assignee: Janus Chow
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, when a client is on the same machine as a datanode, it will try to 
> write to the local machine regardless of the datanode's load, that is, its 
> xceiverCount.
> In our production cluster, the datanode and NodeManager run on the same 
> server, so when heavy jobs are running on a labeled queue, the 
> corresponding datanodes have higher xceiverCounts than other datanodes. 
> When other clients then try to write, the exception "could only be 
> replicated to 0 nodes" is thrown.
> This ticket is to enable considerLoad to avoid hot local writes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16275) [HDFS] Enable considerLoad for localWrite

2021-10-25 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17433678#comment-17433678
 ] 

Ayush Saxena commented on HDFS-16275:
-

OK. By any chance, have you explored AvailableSpaceBlockPlacementPolicy? It 
has an optimisation available for the local node as well, in the form of the 
config 
{{dfs.namenode.available-space-block-placement-policy.balance-local-node}}.

I haven't gone through the code, but the proposed change should be 
configurable and turned off by default, for backward compatibility.

 



[jira] [Commented] (HDFS-16275) [HDFS] Enable considerLoad for localWrite

2021-10-25 Thread Janus Chow (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17433671#comment-17433671
 ] 

Janus Chow commented on HDFS-16275:
---

[~ayushtkn] Thanks for the comment.

I think we do want data locality, just not on a node that is too hot. IMHO, 
when the node is not too hot, locality should boost performance. Changing the 
setting from the default "false" is mainly meant to cool the node down.
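
The idea of keeping locality unless the node is too hot can be sketched roughly as follows. This is a simplified model rather than the actual placement-policy code: it assumes "too hot" means the node's xceiverCount exceeds a factor times the cluster-wide average xceiverCount, and the class name, method names, and the factor of 2.0 used in the demo are all hypothetical.

```java
// Hypothetical sketch of a load-aware choice of the local datanode.
class LocalWriteLoadSketch {
    // A node counts as overloaded when its xceiverCount is above
    // loadFactor * average. A non-positive threshold disables the check.
    static boolean isOverloaded(int xceiverCount, double avgXceiverCount,
                                double loadFactor) {
        double maxLoad = loadFactor * avgXceiverCount;
        return maxLoad > 0 && xceiverCount > maxLoad;
    }

    // Returns true when the local node should still receive the write.
    // With the switch off, the local node is always chosen, which matches
    // the behaviour described in the issue.
    static boolean chooseLocalNode(boolean considerLoadForLocal,
                                   int localXceiverCount,
                                   double avgXceiverCount,
                                   double loadFactor) {
        if (!considerLoadForLocal) {
            return true;
        }
        return !isOverloaded(localXceiverCount, avgXceiverCount, loadFactor);
    }

    public static void main(String[] args) {
        double avg = 10.0;     // assumed cluster-wide average xceiverCount
        double factor = 2.0;   // assumed load factor
        System.out.println("cool node, switch on: "
            + chooseLocalNode(true, 15, avg, factor));
        System.out.println("hot node, switch on:  "
            + chooseLocalNode(true, 50, avg, factor));
        System.out.println("hot node, switch off: "
            + chooseLocalNode(false, 50, avg, factor));
    }
}
```

In this model a cool local node keeps the locality benefit, and only a node well above the average load is skipped in favour of another target.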



[jira] [Commented] (HDFS-16275) [HDFS] Enable considerLoad for localWrite

2021-10-23 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17433306#comment-17433306
 ] 

Ayush Saxena commented on HDFS-16275:
-

If that is the problem, why don't you use avoidLocal? From the use case, you 
don't seem to be interested in data locality anyway.
