[jira] [Commented] (HDFS-14383) Compute datanode load based on StoragePolicy
[ https://issues.apache.org/jira/browse/HDFS-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17216455#comment-17216455 ]

Ayush Saxena commented on HDFS-14383:
-------------------------------------

Committed to trunk.
Thanx [~elgoiri] and [~LiJinglun] for the review and [~kpalanisamy] for the report!!!

> Compute datanode load based on StoragePolicy
> --------------------------------------------
>
>                 Key: HDFS-14383
>                 URL: https://issues.apache.org/jira/browse/HDFS-14383
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs, namenode
>    Affects Versions: 2.7.3, 3.1.2
>            Reporter: Karthik Palanisamy
>            Assignee: Ayush Saxena
>            Priority: Major
>         Attachments: HDFS-14383-01.patch, HDFS-14383-02.patch
>
>
> Datanode load check logic needs to be changed because existing computation
> will not consider StoragePolicy.
> DatanodeManager#getInServiceXceiverAverage
> {code}
> public double getInServiceXceiverAverage() {
>   double avgLoad = 0;
>   final int nodes = getNumDatanodesInService();
>   if (nodes != 0) {
>     final int xceivers = heartbeatManager
>         .getInServiceXceiverCount();
>     avgLoad = (double)xceivers/nodes;
>   }
>   return avgLoad;
> }
> {code}
>
> For example: with 10 nodes (HOT), average 50 xceivers and 90 nodes (COLD)
> with average 10 xceivers the calculated threshold by the NN is 28 (((500 +
> 900)/100)*2), which means those 10 nodes (the whole HOT tier) become
> unavailable when the COLD tier nodes are barely in use. Turning this check
> off helps to mitigate this issue; however,
> dfs.namenode.replication.considerLoad helps to "balance" the load of the DNs,
> and turning it off can lead to situations where specific DNs are
> "overloaded".

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
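The arithmetic in the quoted description can be reproduced with a small standalone sketch. The class and method names (`LoadThresholdSketch`, `threshold`) are hypothetical and not part of HDFS, and the factor of 2 is simply the `*2` from the example. The sketch compares the single cluster-wide threshold with thresholds computed separately per tier, which is one possible reading of what the issue title asks for.

```java
public class LoadThresholdSketch {
    // Factor taken from the "*2" in the example; assumed constant for this sketch.
    static final double LOAD_FACTOR = 2.0;

    /** Threshold = factor * (total xceivers / nodes); 0 when there are no nodes. */
    static double threshold(int totalXceivers, int nodes) {
        if (nodes == 0) {
            return 0;
        }
        return LOAD_FACTOR * ((double) totalXceivers / nodes);
    }

    public static void main(String[] args) {
        int hotNodes = 10, hotAvg = 50;    // HOT tier from the example
        int coldNodes = 90, coldAvg = 10;  // COLD tier from the example
        int hotXceivers = hotNodes * hotAvg;     // 500
        int coldXceivers = coldNodes * coldAvg;  // 900

        // One average across the whole cluster, as in the existing computation.
        double global = threshold(hotXceivers + coldXceivers, hotNodes + coldNodes);
        // Separate averages per tier (assumed alternative).
        double hotOnly = threshold(hotXceivers, hotNodes);
        double coldOnly = threshold(coldXceivers, coldNodes);

        System.out.println("cluster-wide threshold: " + global);
        System.out.println("HOT-only threshold: " + hotOnly);
        System.out.println("COLD-only threshold: " + coldOnly);
        System.out.println("HOT node excluded (cluster-wide): " + (hotAvg > global));
        System.out.println("HOT node excluded (per tier): " + (hotAvg > hotOnly));
    }
}
```

With the cluster-wide average, a HOT node at 50 xceivers sits above the threshold of 28 and is skipped; measured only against its own tier, the same node is well below the threshold.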
[jira] [Commented] (HDFS-14383) Compute datanode load based on StoragePolicy
[ https://issues.apache.org/jira/browse/HDFS-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17216420#comment-17216420 ]

Jinglun commented on HDFS-14383:
--------------------------------

+1. Yes, it solves my problem!
[jira] [Commented] (HDFS-14383) Compute datanode load based on StoragePolicy
[ https://issues.apache.org/jira/browse/HDFS-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17216171#comment-17216171 ]

Ayush Saxena commented on HDFS-14383:
-------------------------------------

Thanx [~elgoiri] for the review.
[~LiJinglun] Let me know if you are verifying; I plan to push this in another two days.
[jira] [Commented] (HDFS-14383) Compute datanode load based on StoragePolicy
[ https://issues.apache.org/jira/browse/HDFS-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17215648#comment-17215648 ]

Íñigo Goiri commented on HDFS-14383:
------------------------------------

[^HDFS-14383-02.patch] LGTM.
+1
[~LiJinglun], can you verify this solves your use case too?
[jira] [Commented] (HDFS-14383) Compute datanode load based on StoragePolicy
[ https://issues.apache.org/jira/browse/HDFS-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17215242#comment-17215242 ]

Jinglun commented on HDFS-14383:
--------------------------------

I met the same problem recently. This patch makes sense to me. Thanks [~ayushtkn] for your work.
[jira] [Commented] (HDFS-14383) Compute datanode load based on StoragePolicy
[ https://issues.apache.org/jira/browse/HDFS-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17214617#comment-17214617 ]

Ayush Saxena commented on HDFS-14383:
-------------------------------------

Thanx [~elgoiri] for the review. Have handled the comments in v2.
The checkstyle warning is not from new code; it was already there.
Test failures are not related; they are due to {{OOM}}.

> Compute datanode load based on StoragePolicy
> --------------------------------------------
>
>                 Key: HDFS-14383
>                 URL: https://issues.apache.org/jira/browse/HDFS-14383
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs, namenode
>    Affects Versions: 2.7.3, 3.1.2
>            Reporter: Karthik Palanisamy
>            Priority: Major
>         Attachments: HDFS-14383-01.patch, HDFS-14383-02.patch
[jira] [Commented] (HDFS-14383) Compute datanode load based on StoragePolicy
[ https://issues.apache.org/jira/browse/HDFS-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17214612#comment-17214612 ]

Hadoop QA commented on HDFS-14383:
----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 1m 35s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests || ||
| +1 | mvninstall | 34m 14s | | trunk passed |
| +1 | compile | 1m 33s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | compile | 1m 8s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | checkstyle | 0m 51s | | trunk passed |
| +1 | mvnsite | 1m 17s | | trunk passed |
| +1 | shadedclient | 16m 8s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 52s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javadoc | 1m 23s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| 0 | spotbugs | 3m 4s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 3m 2s | | trunk passed |
|| || || || Patch Compile Tests || ||
| +1 | mvninstall | 1m 10s | | the patch passed |
| +1 | compile | 1m 10s | | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javac | 1m 10s | | the patch passed |
| +1 | compile | 1m 0s | | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | javac | 1m 0s | | the patch passed |
| +1 | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 | checkstyle | 0m 42s | [/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt|https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/233/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt] | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 576 unchanged - 1 fixed = 577 total (was 577) |
| +1 | mvnsite | 1m 8s | | the patch passed |
| +1 | xml | 0m 1s | | The patch has no ill-formed XML file. |
| +1 | shadedclient | 14m 4s | | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 45s | | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javadoc | 1m 20s | | the patch passed with JDK Private
[jira] [Commented] (HDFS-14383) Compute datanode load based on StoragePolicy
[ https://issues.apache.org/jira/browse/HDFS-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17214464#comment-17214464 ]

Íñigo Goiri commented on HDFS-14383:
------------------------------------

Minor comments:
* When initializing the {{considerLoadByStorageType}} parameter, we should leave {{conf.getBoolean(}} in the first line. We should also fix the comma.
* We should put the code to get {{inServiceXceiverCount}} in a function with a comment explaining the principle.
* Javadoc for {{getInServiceXceiverAverageByStorageType()}}.
* Javadoc for {{getStorageTypeStats()}}.

Overall the unit test is pretty good at describing the behavior too.

> Compute datanode load based on StoragePolicy
> --------------------------------------------
>
>                 Key: HDFS-14383
>                 URL: https://issues.apache.org/jira/browse/HDFS-14383
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs, namenode
>    Affects Versions: 2.7.3, 3.1.2
>            Reporter: Karthik Palanisamy
>            Priority: Major
>         Attachments: HDFS-14383-01.patch
[jira] [Commented] (HDFS-14383) Compute datanode load based on StoragePolicy
[ https://issues.apache.org/jira/browse/HDFS-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17214001#comment-17214001 ]

Ayush Saxena commented on HDFS-14383:
-------------------------------------

[~elgoiri] [~vinayakumarb] can you help give this a check?

> Compute datanode load based on StoragePolicy
> --------------------------------------------
>
>                 Key: HDFS-14383
>                 URL: https://issues.apache.org/jira/browse/HDFS-14383
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs, namenode
>    Affects Versions: 2.7.3, 3.1.2
>            Reporter: Karthik Palanisamy
>            Priority: Major
>         Attachments: HDFS-14383-01.patch
[jira] [Commented] (HDFS-14383) Compute datanode load based on StoragePolicy
[ https://issues.apache.org/jira/browse/HDFS-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17213200#comment-17213200 ]

Hadoop QA commented on HDFS-14383:
----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 1m 28s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests || ||
| +1 | mvninstall | 20m 45s | | trunk passed |
| +1 | compile | 1m 16s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | compile | 1m 14s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | checkstyle | 0m 50s | | trunk passed |
| +1 | mvnsite | 1m 19s | | trunk passed |
| +1 | shadedclient | 16m 53s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 53s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javadoc | 1m 22s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| 0 | spotbugs | 2m 58s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 2m 56s | | trunk passed |
|| || || || Patch Compile Tests || ||
| +1 | mvninstall | 1m 8s | | the patch passed |
| +1 | compile | 1m 10s | | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javac | 1m 10s | | the patch passed |
| +1 | compile | 1m 5s | | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | javac | 1m 5s | | the patch passed |
| +1 | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 | checkstyle | 0m 41s | [/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt|https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/229/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt] | hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 576 unchanged - 1 fixed = 579 total (was 577) |
| +1 | mvnsite | 1m 8s | | the patch passed |
| +1 | xml | 0m 1s | | The patch has no ill-formed XML file. |
| +1 | shadedclient | 14m 33s | | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 49s | | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javadoc | 1m 26s | | the patch passed with JDK Private
[jira] [Commented] (HDFS-14383) Compute datanode load based on StoragePolicy
[ https://issues.apache.org/jira/browse/HDFS-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211865#comment-17211865 ]

Ayush Saxena commented on HDFS-14383:
-------------------------------------

What is the expectation when a heterogeneous datanode is encountered with this enabled? Do we calculate based on the storages on that datanode, or fall back to the original method?

I was trying this out and have uploaded a patch with the idea (as I understood it); give it a check and see if that sounds good.

> Compute datanode load based on StoragePolicy
> --------------------------------------------
>
>                 Key: HDFS-14383
>                 URL: https://issues.apache.org/jira/browse/HDFS-14383
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs, namenode
>    Affects Versions: 2.7.3, 3.1.2
>            Reporter: Karthik Palanisamy
>            Priority: Major
>         Attachments: HDFS-14383-01.patch
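The two alternatives raised in this comment can be contrasted in a small sketch. Everything in it is hypothetical: the `Node` class, the simplified `StorageType` enum, and both method names are illustrative and do not reproduce the attached patch. In particular, the rule "count every node that shares at least one storage type with the target" is only one assumed reading of "calculate based on the storages on that datanode".

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

public class HeterogeneousLoadSketch {
    /** Simplified stand-in for storage types; not the HDFS enum. */
    enum StorageType { DISK, SSD, ARCHIVE }

    /** Hypothetical datanode: the storage types it hosts and its active xceivers. */
    static final class Node {
        final Set<StorageType> types;
        final int xceivers;

        Node(Set<StorageType> types, int xceivers) {
            this.types = types;
            this.xceivers = xceivers;
        }
    }

    /** Original method: average over every in-service node. */
    static double averageAll(List<Node> nodes) {
        return nodes.stream().mapToInt(n -> n.xceivers).average().orElse(0);
    }

    /** Assumed alternative: average over nodes sharing at least one of the given types. */
    static double averageForTypes(List<Node> nodes, Set<StorageType> types) {
        return nodes.stream()
            .filter(n -> n.types.stream().anyMatch(types::contains))
            .mapToInt(n -> n.xceivers)
            .average()
            .orElse(0);
    }

    public static void main(String[] args) {
        Node diskOnly = new Node(EnumSet.of(StorageType.DISK), 50);
        Node archiveOnly = new Node(EnumSet.of(StorageType.ARCHIVE), 10);
        Node mixed = new Node(EnumSet.of(StorageType.DISK, StorageType.ARCHIVE), 30);
        List<Node> nodes = Arrays.asList(diskOnly, archiveOnly, mixed);

        System.out.println("all nodes: " + averageAll(nodes));
        System.out.println("peers of DISK-only node: " + averageForTypes(nodes, diskOnly.types));
        System.out.println("peers of mixed node: " + averageForTypes(nodes, mixed.types));
    }
}
```

Under this reading, a heterogeneous node that hosts every storage type present in the cluster is compared against all nodes, so its result coincides with the original method, while single-type nodes are compared only against nodes that can serve the same storage.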
[jira] [Commented] (HDFS-14383) Compute datanode load based on StoragePolicy
[ https://issues.apache.org/jira/browse/HDFS-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197097#comment-17197097 ]

Karthik Palanisamy commented on HDFS-14383:
-------------------------------------------

[~weichiu] I am busy with other issues. I think someone can pick this up.

> Compute datanode load based on StoragePolicy
> --------------------------------------------
>
>                 Key: HDFS-14383
>                 URL: https://issues.apache.org/jira/browse/HDFS-14383
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs, namenode
>    Affects Versions: 2.7.3, 3.1.2
>            Reporter: Karthik Palanisamy
>            Priority: Major
[jira] [Commented] (HDFS-14383) Compute datanode load based on StoragePolicy
[ https://issues.apache.org/jira/browse/HDFS-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808050#comment-16808050 ]

Karthik Palanisamy commented on HDFS-14383:
-------------------------------------------

{quote}I suppose this load issue wouldn't occur if you configure datanodes such that they have some HOT and some COLD volumes.
{quote}
Exactly [~jojochuang]. The problem occurs only with datanodes that are dedicated to either COLD or HOT storage. IMO, it is typical to see this setup because hardware and resources are specific to COLD/HOT storage.

> Compute datanode load based on StoragePolicy
> --------------------------------------------
>
>                 Key: HDFS-14383
>                 URL: https://issues.apache.org/jira/browse/HDFS-14383
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs, namenode
>    Affects Versions: 2.7.3, 3.1.2
>            Reporter: Karthik Palanisamy
>            Assignee: Karthik Palanisamy
>            Priority: Major

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (HDFS-14383) Compute datanode load based on StoragePolicy
[ https://issues.apache.org/jira/browse/HDFS-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808004#comment-16808004 ]

Wei-Chiu Chuang commented on HDFS-14383:
----------------------------------------

Seems tricky. I wonder what the typical configuration for a heterogeneous storage node is. Is it typical to see nodes with only COLD storage volumes?

I suppose this load issue wouldn't occur if you configure datanodes such that they have some HOT and some COLD volumes.

> Compute datanode load based on StoragePolicy
> --------------------------------------------
>
>                 Key: HDFS-14383
>                 URL: https://issues.apache.org/jira/browse/HDFS-14383
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs, namenode
>    Affects Versions: 2.7.3, 3.1.2
>            Reporter: Karthik Palanisamy
>            Assignee: Karthik Palanisamy
>            Priority: Major