[jira] [Commented] (HDFS-16275) [HDFS] Enable considerLoad for localWrite
[ https://issues.apache.org/jira/browse/HDFS-16275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17433687#comment-17433687 ]

Janus Chow commented on HDFS-16275:
-----------------------------------

[~ayushtkn] Thank you for the quick explanation.

To be honest, I was puzzled by the default "false" here. I had assumed that considerLoad is the same kind of condition as the other checks in `isGoodDatanode`, such as excluding stale or slow nodes, and those checks apply equally to local and non-local targets.

As for configuration: since considerLoad is read from "dfs.namenode.replication.considerLoad", would "dfs.namenode.replication.locality.considerLoad" be a reasonable key?

> [HDFS] Enable considerLoad for localWrite
> -----------------------------------------
>
>                 Key: HDFS-16275
>                 URL: https://issues.apache.org/jira/browse/HDFS-16275
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Janus Chow
>            Assignee: Janus Chow
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, when a client is on the same machine as a datanode, it tries to
> write to the local machine regardless of the datanode's load, that is, its
> xceiverCount.
> In our production cluster, the datanode and NodeManager run on the same
> server, so when heavy jobs run on a labeled queue, the corresponding
> datanodes have higher xceiverCounts than other datanodes.
> When other clients then try to write, the exception "could only be
> replicated to 0 nodes" is thrown.
> This ticket enables considerLoad to avoid hot local writes.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
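For readers following the thread, the load check under discussion can be sketched roughly as below. This is a simplified illustration, not the code in Hadoop's block placement policy: the class `LocalWriteLoadCheck`, its methods, and the `loadFactor` multiplier on the cluster average are all assumptions made for the example, and `considerLoadForLocal` stands in for the key proposed in this thread ("dfs.namenode.replication.locality.considerLoad"), which is only a suggestion here.

```java
import java.util.List;

/**
 * Simplified sketch of a load-aware target check. All names are hypothetical;
 * this is not the actual Hadoop implementation.
 */
final class LocalWriteLoadCheck {
    // Mirrors "dfs.namenode.replication.considerLoad".
    private final boolean considerLoad;
    // Hypothetical flag for the key proposed in the thread; false keeps today's behaviour.
    private final boolean considerLoadForLocal;
    // Assumed multiplier: a node is "too busy" above loadFactor * average load.
    private final double loadFactor;

    LocalWriteLoadCheck(boolean considerLoad, boolean considerLoadForLocal, double loadFactor) {
        this.considerLoad = considerLoad;
        this.considerLoadForLocal = considerLoadForLocal;
        this.loadFactor = loadFactor;
    }

    /** Average xceiver count over the given datanodes; 0 for an empty list. */
    static double averageLoad(List<Integer> xceiverCounts) {
        return xceiverCounts.stream().mapToInt(Integer::intValue).average().orElse(0.0);
    }

    /** True when the node's xceiver count exceeds the allowed maximum. */
    boolean isTooBusy(int nodeXceivers, double clusterAverage) {
        double maxLoad = loadFactor * clusterAverage;
        return maxLoad > 0 && nodeXceivers > maxLoad;
    }

    /** Decides whether a datanode is acceptable as a write target. */
    boolean acceptTarget(int nodeXceivers, double clusterAverage, boolean isLocalNode) {
        if (!considerLoad) {
            return true; // load is ignored everywhere
        }
        if (isLocalNode && !considerLoadForLocal) {
            return true; // behaviour described in the ticket: local node bypasses the load check
        }
        return !isTooBusy(nodeXceivers, clusterAverage);
    }
}
```

With the hypothetical flag left at false, a hot local datanode is still accepted, which matches the behaviour the ticket describes. Setting it to true applies the same threshold to the local node as to any other target, so a local node that is not too hot is still chosen and locality is preserved.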
[jira] [Commented] (HDFS-16275) [HDFS] Enable considerLoad for localWrite
[ https://issues.apache.org/jira/browse/HDFS-16275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17433678#comment-17433678 ]

Ayush Saxena commented on HDFS-16275:
-------------------------------------

OK. By any chance, have you explored AvailableSpaceBlockPlacementPolicy? It also has an optimisation for the local node, in the form of the config {{dfs.namenode.available-space-block-placement-policy.balance-local-node}}.

I haven't gone through the code, but the proposed change should be configurable and turned off by default, for backward compatibility.
[jira] [Commented] (HDFS-16275) [HDFS] Enable considerLoad for localWrite
[ https://issues.apache.org/jira/browse/HDFS-16275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17433671#comment-17433671 ]

Janus Chow commented on HDFS-16275:
-----------------------------------

[~ayushtkn] Thanks for the comment.

We do want data locality, just not on a node that is too hot. IMHO, when the node is not too hot, locality should boost performance. The change from the default "false" is mainly meant to cool the node down.
[jira] [Commented] (HDFS-16275) [HDFS] Enable considerLoad for localWrite
[ https://issues.apache.org/jira/browse/HDFS-16275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17433306#comment-17433306 ]

Ayush Saxena commented on HDFS-16275:
-------------------------------------

If that is the problem, why don't you use avoidLocal? From the use case, you don't seem to be interested in data locality anyway.