[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222001#comment-14222001 ]

Hudson commented on HDFS-7331:

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1965 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1965/])
HDFS-7331. Add Datanode network counts to datanode jmx page. Contributed by Charles Lamb. (atm: rev 2d4f3e567e4bb8068c028de12df118a4f3fa6343)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeMetrics.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNodeMXBean.java

> Add Datanode network counts to datanode jmx page
>
> Key: HDFS-7331
> URL: https://issues.apache.org/jira/browse/HDFS-7331
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Reporter: Charles Lamb
> Assignee: Charles Lamb
> Priority: Minor
> Fix For: 2.7.0
>
> Attachments: HDFS-7331.001.patch, HDFS-7331.002.patch, HDFS-7331.003.patch, HDFS-7331.004.patch
>
> Add per-datanode counts to the datanode jmx page. For example, networkErrors could be exposed like this:
> {noformat}
> }, {
> ...
> "DatanodeNetworkCounts" : "{\"dn1\":{\"networkErrors\":1}}",
> ...
> "NamenodeAddresses" : "{\"localhost\":\"BP-1103235125-127.0.0.1-1415057084497\"}",
> "VolumeInfo" : "{\"/tmp/hadoop-cwl/dfs/data/current\":{\"freeSpace\":3092725760,\"usedSpace\":28672,\"reservedSpace\":0}}",
> "ClusterId" : "CID-4b38f2ae-5e58-4e15-b3cf-3ba3f46e724e"
> }, {
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
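In the example from the issue description, the DatanodeNetworkCounts value is a JSON document encoded as a string inside the bean, so a consumer has to decode it twice. A minimal Python sketch of that step follows; the sample payload is modeled on the {noformat} block, while the "beans" wrapper, the bean "name" value, and the function name are assumptions made for illustration and are not taken from the patch.

```python
import json

# Sample response modeled on the {noformat} example in the issue description.
# The "beans" wrapper and the "name" value are assumptions for illustration.
SAMPLE_JMX = r'''
{
  "beans": [
    {
      "name": "Hadoop:service=DataNode,name=DataNodeInfo",
      "DatanodeNetworkCounts": "{\"dn1\":{\"networkErrors\":1}}",
      "ClusterId": "CID-4b38f2ae-5e58-4e15-b3cf-3ba3f46e724e"
    }
  ]
}
'''


def network_error_counts(jmx_text):
    """Return {remote_host: networkErrors} from a jmx response body."""
    counts = {}
    for bean in json.loads(jmx_text).get("beans", []):
        raw = bean.get("DatanodeNetworkCounts")
        if raw is None:
            continue  # bean does not expose the attribute
        # The attribute value is itself JSON stored as a string,
        # so it needs a second decoding pass.
        for host, per_host in json.loads(raw).items():
            counts[host] = per_host.get("networkErrors", 0)
    return counts


print(network_error_counts(SAMPLE_JMX))  # {'dn1': 1}
```

The double decoding mirrors how the neighbouring NamenodeAddresses and VolumeInfo attributes are shown in the same example.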
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221994#comment-14221994 ]

Hudson commented on HDFS-7331:

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #13 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/13/])
HDFS-7331. Add Datanode network counts to datanode jmx page. Contributed by Charles Lamb. (atm: rev 2d4f3e567e4bb8068c028de12df118a4f3fa6343)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNodeMXBean.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeMetrics.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221964#comment-14221964 ]

Hudson commented on HDFS-7331:

SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #13 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/13/])
HDFS-7331. Add Datanode network counts to datanode jmx page. Contributed by Charles Lamb. (atm: rev 2d4f3e567e4bb8068c028de12df118a4f3fa6343)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeMetrics.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNodeMXBean.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221957#comment-14221957 ]

Hudson commented on HDFS-7331:

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1941 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1941/])
HDFS-7331. Add Datanode network counts to datanode jmx page. Contributed by Charles Lamb. (atm: rev 2d4f3e567e4bb8068c028de12df118a4f3fa6343)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeMetrics.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNodeMXBean.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221925#comment-14221925 ]

Hudson commented on HDFS-7331:

SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #13 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/13/])
HDFS-7331. Add Datanode network counts to datanode jmx page. Contributed by Charles Lamb. (atm: rev 2d4f3e567e4bb8068c028de12df118a4f3fa6343)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeMetrics.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNodeMXBean.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221932#comment-14221932 ]

Hudson commented on HDFS-7331:

SUCCESS: Integrated in Hadoop-Yarn-trunk #751 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/751/])
HDFS-7331. Add Datanode network counts to datanode jmx page. Contributed by Charles Lamb. (atm: rev 2d4f3e567e4bb8068c028de12df118a4f3fa6343)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeMetrics.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNodeMXBean.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221650#comment-14221650 ]

Hudson commented on HDFS-7331:

FAILURE: Integrated in Hadoop-trunk-Commit #6590 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6590/])
HDFS-7331. Add Datanode network counts to datanode jmx page. Contributed by Charles Lamb. (atm: rev 2d4f3e567e4bb8068c028de12df118a4f3fa6343)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeMetrics.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNodeMXBean.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220464#comment-14220464 ]

Charles Lamb commented on HDFS-7331:

Thanks for the review and the +1 Haohui.
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220456#comment-14220456 ]

Haohui Mai commented on HDFS-7331:

Thanks for the pointer. The patch looks good to me. +1
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220288#comment-14220288 ]

Aaron T. Myers commented on HDFS-7331:

For reference, here's some docs on RRDTool, which is what Ganglia uses internally. Of particular note:

{quote}
Every rrd input parameter is either of type GAUGE, DERIVE, ABSOLUTE, COMPUTE or COUNTER. GAUGE values store a certain sample at a particular instant. Whereas COUNTER values store the difference from the previous value. COUNTERs are useful when you want to measure bandwidth utilization in which the values monotonically increase. Other types are not so commonly used. If you are curious read the man page of rrdcreate(1).
{quote}
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220285#comment-14220285 ]

Aaron T. Myers commented on HDFS-7331:

bq. I know most of the metric systems will collect metrics and plot graphs directly. I'm unaware of any metrics systems that can do subtraction out-of-the-box. Some pointers are appreciated.

Ganglia (and I bet basically all monitoring software) can do this out of the box. In Ganglia, one need only specify the slope of the metric to be a counter instead of a gauge, in which case the default is to store the values as a per-second rate.

In general, counters are strictly better than calculating the rate on the producer side, since the monitoring software can derive the latter from the former, but not the reverse. Also, the monitoring sampling frequency matters much less if you use a counter: if you calculate the rate on the producer side and your monitoring software's sample frequency is too low, it can miss anomalies in the values and have no way to detect this.
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220235#comment-14220235 ]

Haohui Mai commented on HDFS-7331:

bq. ... somewhat handled by external cluster management tools, since they periodically sample and can do the subtraction to see the change over time.

I know most of the metric systems will collect metrics and plot graphs directly. I'm unaware of any metrics systems that can do subtraction out-of-the-box. Some pointers are appreciated.

This metric is quite useful. Personally, I think some sort of TTL is required so that the metric can be consumed and integrated with existing metric systems.
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220210#comment-14220210 ]

Aaron T. Myers commented on HDFS-7331:

In the abstract it's an interesting idea to have all of our counter metrics be time-bounded, but presently none of them are, and I see no reason that this new metric should be inconsistent with the rest of them. I also agree with [~andrew.wang] that by and large operators consume these metrics via tools that do aggregations and can do their own calculations of occurrences within a time bound based on sampling, so there may not be much of a need to implement this functionality directly in Hadoop.

[~wheat9], to be explicit, are you OK with the current patch as-is? I think we should commit this and think about having TTLs for our counter-based metrics more generally outside of this JIRA.
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215603#comment-14215603 ]

Andrew Wang commented on HDFS-7331:

I think this is somewhat handled by external cluster management tools, since they periodically sample and can do the subtraction to see the change over time.
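The sampling and subtraction described in this comment can be sketched in a few lines of Python. This is an illustration only, not how any particular monitoring tool implements it: the function name and the sample readings are hypothetical, and treating a drop in the value as a DataNode restart is an assumption.

```python
def per_second_rates(samples):
    """Turn (timestamp_seconds, counter_value) samples into per-interval rates."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0:
            continue  # skip duplicate or out-of-order samples
        delta = v1 - v0
        if delta < 0:
            # Assumption: a drop means the DataNode restarted and the counter
            # began again from zero, so v1 is the count since the restart.
            delta = v1
        rates.append((t1, delta / elapsed))
    return rates


# Hypothetical networkErrors readings for one peer, polled once a minute.
polled = [(0, 0), (60, 0), (120, 30), (180, 30)]
print(per_second_rates(polled))

# A poller that only samples every three minutes still sees all 30 errors,
# because the counter itself never loses them.
print(per_second_rates([polled[0], polled[-1]]))
```

The second call shows why a cumulative counter tolerates a slow poller: the burst is averaged over a longer interval, but the total change is preserved.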
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215587#comment-14215587 ]

Haohui Mai commented on HDFS-7331:

Thinking about it a little bit more, from an operational point of view maybe it is important to limit the TTL of these counts. If I understand correctly, the current patch does not reset the counts unless the DN is restarted. The number of network failures in the last 24 hours, the last week, etc. might be more valuable.

Therefore I think it might be more appropriate to limit the TTL of the counts (which should be configurable), instead of bounding the maximum size of the map (which should be effectively bounded by the TTL anyway).
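To make the TTL proposal concrete, here is a minimal Python sketch of counts that expire. It is hypothetical and is not what the patch does (per the comment above, the patch keeps counts until the DN restarts); the class name, method names, and the injectable clock are all assumptions made for illustration.

```python
import time
from collections import defaultdict, deque


class TtlErrorCounts:
    """Per-host error counts that only cover the last ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self.events = defaultdict(deque)  # host -> timestamps of errors

    def record_error(self, host):
        self.events[host].append(self.clock())

    def snapshot(self):
        """Drop expired events and return {host: count} for the window."""
        cutoff = self.clock() - self.ttl
        result = {}
        for host in list(self.events):
            stamps = self.events[host]
            while stamps and stamps[0] <= cutoff:
                stamps.popleft()
            if stamps:
                result[host] = len(stamps)
            else:
                # Hosts with no recent errors are forgotten, which is how
                # the TTL also bounds the size of the map.
                del self.events[host]
        return result
```

Storing one timestamp per error is the simplest form of the idea; a real implementation would more likely use coarse time buckets to cap memory per host.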
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202686#comment-14202686 ]

Hadoop QA commented on HDFS-7331:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12680207/HDFS-7331.004.patch
against trunk revision 1e97f2f.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.server.namenode.TestHDFSConcat
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8690//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8690//console

This message is automatically generated.
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202299#comment-14202299 ]

Hadoop QA commented on HDFS-7331:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12680170/HDFS-7331.003.patch
against trunk revision 42bbe37.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:red}-1 javac{color}. The applied patch generated 1219 javac compiler warnings (more than the trunk's current 1218 warnings).
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8689//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8689//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8689//console

This message is automatically generated.
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14201227#comment-14201227 ]

Charles Lamb commented on HDFS-7331:

bq. Do folks agree with the above?

This makes sense. [~wheat9]?
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14201220#comment-14201220 ]

Aaron T. Myers commented on HDFS-7331:

{quote}
If I understand correctly, the metrics record failures for each DataXceiver endpoint. I think you have the assumption that all accesses come from within the cluster. The assumption no longer holds when the computation and storage are separated. I've no strong opinion on this, I'm fine with using a HashMap in this jira and bounding its size down the road if a problem occurs.
{quote}

Got it. I'd suggest then that we leave support for bounding the size in the patch, and make it configurable, but default it to be unbounded. That way if it ends up causing a problem for someone they can configure the size smaller without having to deploy a new version of the software.

This brings up a related point - I think the current patch groups the counts by remote address, which I believe includes the remote port. In the case of non-DN clients, this port can be anything in the ephemeral port range, which isn't very useful. I think it would be better to group just by IP address, and not include the port.

Do folks agree with the above?
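The grouping change suggested above (key the counts by IP address rather than by address plus port) could be sketched roughly as follows. This is only an illustration, not code from the patch: the class name `PeerKey`, the helper `withoutPort`, and the assumed `/ip:port` address format are all hypothetical, and IPv6 literals are not handled.

```java
/** Hypothetical helper; not part of the HDFS-7331 patch. */
final class PeerKey {
    private PeerKey() {}

    /**
     * Strips an optional leading '/' and a trailing ":port" from an address
     * such as "/10.0.0.7:50010". Assumes IPv4 or hostname input.
     */
    static String withoutPort(String remoteAddress) {
        String s = remoteAddress.startsWith("/")
            ? remoteAddress.substring(1) : remoteAddress;
        int colon = s.lastIndexOf(':');
        return colon < 0 ? s : s.substring(0, colon);
    }

    public static void main(String[] args) {
        // Two connections from the same client on different ephemeral ports
        // collapse to the same key once the port is dropped.
        System.out.println(withoutPort("/10.0.0.7:50010"));
        System.out.println(withoutPort("/10.0.0.7:41873"));
    }
}
```

With a key like this, a non-DN client that reconnects from many ephemeral ports would contribute to one entry instead of one entry per port.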
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14201044#comment-14201044 ]

Haohui Mai commented on HDFS-7331:

bq. By 'it' do you mean the '#getDatanodeNetworkCounts()' method? It's still needed for the MXBean, no? Or do you mean we can get rid of something else (perhaps the servlet?).

Yeah. I think getting rid of the servlet sounds good.

bq. Sorry, let me ask again: why are we making this map have a max size at all?

If I understand correctly, the metrics record failures for each DataXceiver endpoint. I think you have the assumption that all accesses come from within the cluster. The assumption no longer holds when the computation and storage are separated. I've no strong opinion on this, I'm fine with using a {{HashMap}} in this jira and bounding its size down the road if a problem occurs.
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14201031#comment-14201031 ]

Charles Lamb commented on HDFS-7331:

@wheat9, Thanks for the comments.

bq. Does the stats servlet return the exact value of getDatanodeNetworkCounts()? If so we can get rid of it.

By 'it' do you mean the '#getDatanodeNetworkCounts()' method? It's still needed for the MXBean, no? Or do you mean we can get rid of something else (perhaps the servlet?).
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14201030#comment-14201030 ]

Aaron T. Myers commented on HDFS-7331:

bq. Is there a reason to make it configurable? Maybe we can first experiment with a reasonable default value, and make it configurable later.

Sorry, let me ask again: why are we making this map have a max size at all? I can't think of any good reason to do so, and I can think of some bad ones - specifically that this will be used to try to diagnose network issues, and having the possibility of an incomplete view of the failures is not good at all for this purpose.

If you insist on it having an enforceable max size, then it absolutely should be configurable, and the default value should be very high IMO - maybe 5,000 - which should cover almost all clusters in the world.
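To make the trade-off in this exchange concrete, here is a minimal sketch of a size-bounded map built on `java.util.LinkedHashMap`. It is not the structure used in the patch; the class name `BoundedCounts` and the convention that a non-positive `maxSize` means "unbounded" are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical bounded map; evicts the least recently accessed peer. */
final class BoundedCounts extends LinkedHashMap<String, Long> {
    private static final long serialVersionUID = 1L;
    private final int maxSize;

    /** @param maxSize maximum number of peers; values <= 0 mean unbounded (assumed convention) */
    BoundedCounts(int maxSize) {
        super(16, 0.75f, true); // accessOrder=true gives LRU iteration order
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, Long> eldest) {
        // Called by put(); returning true drops the eldest entry.
        return maxSize > 0 && size() > maxSize;
    }

    public static void main(String[] args) {
        BoundedCounts counts = new BoundedCounts(2);
        counts.put("dn1", 1L);
        counts.put("dn2", 4L);
        counts.put("dn3", 2L); // evicts dn1, so its errors are no longer visible
        System.out.println(counts.keySet());
    }
}
```

The eviction in `main` is exactly the "incomplete view of the failures" concern raised above: once the bound is hit, a peer's error history silently disappears, which is why a very high or unbounded default was argued for.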
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14200880#comment-14200880 ]

Haohui Mai commented on HDFS-7331:

Thanks for the work. Some comments:

{code}
+    final int dncCacheMaxSize =
+        conf.getInt(DFS_DATANODE_NETWORK_COUNTS_CACHE_MAX_SIZE_KEY,
+            DFS_DATANODE_NETWORK_COUNTS_CACHE_MAX_SIZE_DEFAULT);
{code}

Is there a reason to make it configurable? Maybe we can first experiment with a reasonable default value, and make it configurable later.

{code}
+  @InterfaceAudience.Private
+  public static class NetworkStatsServlet extends HttpServlet {
+    private static final long serialVersionUID = 1L;
+
+    @Override
+    public void doGet(HttpServletRequest request,
+        HttpServletResponse response) throws IOException {
+      response.setContentType("text/plain");
+
+      final DataNode datanode = (DataNode)
+          getServletContext().getAttribute("datanode");
+      final Map<String, Map<String, Long>> dnc =
+          datanode.getDatanodeNetworkCounts();
+
+      final StringBuilder buffer = new StringBuilder(8*1024);
+      buffer.append("Network counters for datanode ").
+          append(datanode.getDisplayName()).append(":\n");
+      for (Map.Entry<String, Map<String, Long>> ent : dnc.entrySet()) {
+        buffer.append(ent.getKey()).append(":\n");
+        for (Map.Entry<String, Long> ent2 : ent.getValue().entrySet()) {
+          buffer.append(String.format(" %-26s : %8s\n",
+              ent2.getKey(), ent2.getValue()));
+        }
+      }
+      buffer.append("\n");
+      response.getWriter().write(buffer.toString());
+    }
+  }
+
{code}

Does the stats servlet return the exact value of {{getDatanodeNetworkCounts()}}? If so we can get rid of it.
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14200804#comment-14200804 ]

Hadoop QA commented on HDFS-7331:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12679887/HDFS-7331.002.patch
against trunk revision 10f9f51.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:red}-1 javac{color}. The applied patch generated 1218 javac compiler warnings (more than the trunk's current 1217 warnings).
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.TestLeaseRecovery
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8678//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8678//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8678//console

This message is automatically generated.
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14197449#comment-14197449 ]

Aaron T. Myers commented on HDFS-7331:

bq. The size of the map needs to be bounded. A cachemap can do the job.

Why does the size of the map need to be bounded? To take an extreme case, in a 5,000 node cluster we'd be storing maybe an extra 30 bytes for each hostname, an extra 4 bytes for each IP address, and an extra 8 bytes for each long. Add in maybe another 32 bytes for object overhead per map entry, and you're looking at a total of 370KB on each DN. That hardly seems like something to worry about, considering most DN heaps are multiple GBs in size.
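The back-of-the-envelope figure in this comment can be reproduced with a few lines of arithmetic. The per-entry byte sizes below are the rough estimates quoted in the comment, not measured values, and the class and method names are hypothetical.

```java
/** Reproduces the rough memory estimate from the comment above. */
final class CountsFootprint {
    private CountsFootprint() {}

    /** Estimated bytes for one map entry per peer, using the quoted per-entry sizes. */
    static long estimateBytes(int peers) {
        long hostname = 30;      // estimated bytes per hostname
        long ipAddress = 4;      // bytes per IPv4 address
        long counter = 8;        // bytes per long counter
        long entryOverhead = 32; // estimated object overhead per map entry
        return peers * (hostname + ipAddress + counter + entryOverhead);
    }

    public static void main(String[] args) {
        // 5,000 peers * 74 bytes = 370,000 bytes, i.e. about 370KB per DN.
        System.out.println(estimateBytes(5000));
    }
}
```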
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14197393#comment-14197393 ]

Andrew Wang commented on HDFS-7331:

Instead of a Map, couldn't we just use an object? Fairly sure Jackson can do the POJO->JSON conversion automatically.

If there are also plans to put other network stats into this structure, we should consider some more fine-grained synchronization. Stat updates were a major slowdown in the FileSystem client until Colin fixed it. If it's just for error cases, this isn't necessary.
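One possible shape for the finer-grained synchronization mentioned above is a concurrent map of per-peer adders, so that updates for different peers do not contend on a single lock. This is a sketch under assumptions, not what the patch does: it requires Java 8 or later for `LongAdder` and `computeIfAbsent`, and the class name `PeerErrorCounters` and its methods are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical per-peer error counters with low-contention updates. */
final class PeerErrorCounters {
    private final ConcurrentHashMap<String, LongAdder> errors =
        new ConcurrentHashMap<>();

    /** Records one network error for the given peer; safe to call from many threads. */
    void incrementNetworkErrors(String peer) {
        errors.computeIfAbsent(peer, k -> new LongAdder()).increment();
    }

    /** Returns a point-in-time copy suitable for reporting. */
    Map<String, Long> snapshot() {
        Map<String, Long> out = new TreeMap<>();
        errors.forEach((peer, count) -> out.put(peer, count.sum()));
        return out;
    }

    public static void main(String[] args) {
        PeerErrorCounters counters = new PeerErrorCounters();
        counters.incrementNetworkErrors("dn1");
        counters.incrementNetworkErrors("dn1");
        counters.incrementNetworkErrors("dn2");
        System.out.println(counters.snapshot());
    }
}
```

As the comment notes, this level of care matters mainly if the structure is updated on hot paths; for counters touched only on error cases, simpler synchronization is likely sufficient.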
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195656#comment-14195656 ]

Hadoop QA commented on HDFS-7331:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12679087/HDFS-7331.001.patch
against trunk revision 35d353e.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:red}-1 javac{color}. The applied patch generated 1331 javac compiler warnings (more than the trunk's current 1273 warnings).
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.server.balancer.TestBalancer
The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.fs.TestGlobPaths
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8633//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8633//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8633//console

This message is automatically generated.
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195484#comment-14195484 ]

Haohui Mai commented on HDFS-7331:

Got it. It's a nice feature. Some comments:

{code}
+  final Map<String, Map<String, Long>> datanodeNetworkCounts =
+      new HashMap<String, Map<String, Long>>();
+
{code}

The size of the map needs to be bounded. A cachemap can do the job.

{code}
+  @Override // DataNodeMXBean
+  public String getDatanodeNetworkCounts() {
+    return JSON.toString(datanodeNetworkCounts);
+  }
+
{code}

The net effect is that the JMX will return a JSON string, but not a JSON object. You'll need to return a map in the function directly so that the JMX can return a real JSON object. It can take some time to implement -- to me it also makes sense to expose the information in a servlet, given the hierarchical structure of the data.
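The string-versus-map distinction raised here can be observed with plain `javax.management`, independent of Hadoop. In the sketch below, the interface `CountsMXBean`, its two getters, and the object name `demo:type=Counts` are all hypothetical; the point is only that an MXBean getter returning a `String` is exposed as an opaque string, while one returning a nested `Map` is exposed as structured open data (`TabularData`), which a JSON renderer can then emit as nested objects.

```java
import java.lang.management.ManagementFactory;
import java.util.Collections;
import java.util.Map;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.openmbean.TabularData;

final class MXBeanMapDemo {
    /** Hypothetical MXBean interface; the "MXBean" suffix marks it as an MXBean. */
    public interface CountsMXBean {
        String getCountsAsString();
        Map<String, Map<String, Long>> getCountsAsMap();
    }

    static final class Counts implements CountsMXBean {
        private final Map<String, Map<String, Long>> counts =
            Collections.singletonMap("dn1",
                Collections.singletonMap("networkErrors", 1L));

        @Override
        public String getCountsAsString() {
            return "{\"dn1\":{\"networkErrors\":1}}"; // pre-serialized JSON text
        }

        @Override
        public Map<String, Map<String, Long>> getCountsAsMap() {
            return counts;
        }
    }

    /** Registers the bean, reads both attributes, and reports how JMX exposes them. */
    static boolean mapAttributeIsTabular() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("demo:type=Counts");
            server.registerMBean(new Counts(), name);
            try {
                Object asString = server.getAttribute(name, "CountsAsString");
                Object asMap = server.getAttribute(name, "CountsAsMap");
                return asString instanceof String && asMap instanceof TabularData;
            } finally {
                server.unregisterMBean(name);
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(mapAttributeIsTabular());
    }
}
```

How a given JMX-to-JSON servlet renders `TabularData` is up to that servlet; this sketch only shows the attribute types on the JMX side.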
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195470#comment-14195470 ]

Charles Lamb commented on HDFS-7331:

Hi [~wheat9],

The current metric only exposes the total error count for all DNs that a DN is talking to. This Jira adds per-DN error counts and leaves room in the JSON to add other items on a per-DN basis. For example, if DN1 is talking to DN2, DN3, and DN4, this will allow an admin to see the error count for each of DN2, DN3, and DN4 as seen from DN1. That will help the admin isolate a network problem.

Charles
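As an illustration of how per-DN counts could help isolate a problem, the sketch below scans a nested counts map, as seen from one datanode, and picks the peer with the most `networkErrors`. The sample values, the class name `WorstPeer`, and its methods are invented for the example; only the `networkErrors` key and the host-to-counters shape come from the issue description.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical admin-side helper; not part of the patch. */
final class WorstPeer {
    private WorstPeer() {}

    /** Made-up counts as seen from DN1; real values would come from the JMX page. */
    static Map<String, Map<String, Long>> sample() {
        Map<String, Map<String, Long>> counts = new LinkedHashMap<>();
        counts.put("dn2", Collections.singletonMap("networkErrors", 0L));
        counts.put("dn3", Collections.singletonMap("networkErrors", 17L));
        counts.put("dn4", Collections.singletonMap("networkErrors", 1L));
        return counts;
    }

    /** Returns the peer with the highest networkErrors count, or null if the map is empty. */
    static String worstPeer(Map<String, Map<String, Long>> counts) {
        String worst = null;
        long max = -1;
        for (Map.Entry<String, Map<String, Long>> e : counts.entrySet()) {
            Long n = e.getValue().get("networkErrors");
            long errors = (n == null) ? 0L : n;
            if (errors > max) {
                max = errors;
                worst = e.getKey();
            }
        }
        return worst;
    }

    public static void main(String[] args) {
        System.out.println(worstPeer(sample()));
    }
}
```

A single aggregate error count on DN1 would show 18 errors here without indicating that nearly all of them involve one peer.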
[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
[ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195463#comment-14195463 ]

Haohui Mai commented on HDFS-7331:

Since the information has already been exposed in metrics, why put it into JMX again?