[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565906#comment-17565906 ] Masatake Iwasaki commented on HDFS-14084: - update the targets to 3.2.5 for preparing 3.2.4 release. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch, > HDFS-14084.015.patch, HDFS-14084.016.patch, HDFS-14084.017.patch, > HDFS-14084.018.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17203485#comment-17203485 ] Erik Krogen commented on HDFS-14084: Hey [~sodonnell], thanks for the ping. Internally we decided to go with a different approach so I never got around to wrapping this up. Unfortunately I'm not working on HDFS these days and don't have the bandwidth to devote to it beyond maybe some simple reviews. I've unassigned this ticket from myself. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch, > HDFS-14084.015.patch, HDFS-14084.016.patch, HDFS-14084.017.patch, > HDFS-14084.018.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17203481#comment-17203481 ] Stephen O'Donnell commented on HDFS-14084: -- [~xkrogen] It looks like this Jira was very nearly done, and it is something that would be great to get committed. Do you have bandwidth to finish it off? > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Erik Krogen >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch, > HDFS-14084.015.patch, HDFS-14084.016.patch, HDFS-14084.017.patch, > HDFS-14084.018.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17194687#comment-17194687 ] Hadoop QA commented on HDFS-14084: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 11s{color} | {color:red} HDFS-14084 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HDFS-14084 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12956974/HDFS-14084.018.patch | | Console output | https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/150/console | | versions | git=2.17.1 | | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org | This message was automatically generated. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Erik Krogen >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch, > HDFS-14084.015.patch, HDFS-14084.016.patch, HDFS-14084.017.patch, > HDFS-14084.018.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16915860#comment-16915860 ] Hadoop QA commented on HDFS-14084: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 7s{color} | {color:red} HDFS-14084 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HDFS-14084 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12956974/HDFS-14084.018.patch | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/27679/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Erik Krogen >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch, > HDFS-14084.015.patch, HDFS-14084.016.patch, HDFS-14084.017.patch, > HDFS-14084.018.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16915815#comment-16915815 ] Zhankun Tang commented on HDFS-14084: - Preparing for 3.1.3 release. moved all 3.1.3 non-blocker issues to 3.1.4, please move back if it is a blocker for you. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Erik Krogen >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch, > HDFS-14084.015.patch, HDFS-14084.016.patch, HDFS-14084.017.patch, > HDFS-14084.018.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787206#comment-16787206 ] Erik Krogen commented on HDFS-14084: Hey [~dannytbecker], I think that would fix this specific problem, but I think it's also worth considering if the clientID is the right thing to use for that string. Should it instead be named by something more useful, such as the destination NameNode address, the client UGI, etc.? I haven't fully thought through the meaning of that argument yet, but I expect that a randomly generated string isn't the best thing to put there. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Erik Krogen >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch, > HDFS-14084.015.patch, HDFS-14084.016.patch, HDFS-14084.017.patch, > HDFS-14084.018.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787165#comment-16787165 ] Danny Becker commented on HDFS-14084: - [~xkrogen] It looks like that error is caused by the use of Arrays.toString() on clientId which is a byte array. I see that in other parts of Client.java, StringUtils.byteToHexString() is used on clientId to print it or convert it to a string. Should that be used instead of Arrays.toString() for converting the byte array into a string? > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Erik Krogen >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch, > HDFS-14084.015.patch, HDFS-14084.016.patch, HDFS-14084.017.patch, > HDFS-14084.018.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781768#comment-16781768 ] Erik Krogen commented on HDFS-14084: Sounds good, thanks [~jojochuang] :) > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Erik Krogen >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch, > HDFS-14084.015.patch, HDFS-14084.016.patch, HDFS-14084.017.patch, > HDFS-14084.018.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781239#comment-16781239 ] Wei-Chiu Chuang commented on HDFS-14084: [~xkrogen] – since you have a better idea where the issue is, I am assigning the task to you. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Erik Krogen >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch, > HDFS-14084.015.patch, HDFS-14084.016.patch, HDFS-14084.017.patch, > HDFS-14084.018.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779791#comment-16779791 ] Erik Krogen commented on HDFS-14084: Hi [~pranay_singh], do you plan to complete this work? I may take it up if not. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch, > HDFS-14084.015.patch, HDFS-14084.016.patch, HDFS-14084.017.patch, > HDFS-14084.018.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16758738#comment-16758738 ] Erik Krogen commented on HDFS-14084: Also, if the metrics system _is_ initialized properly, you get an error like: {code:java} 19/02/01 22:40:05 WARN util.MBeans: Error creating MBean object name: Hadoop:service=client,name=RpcDetailedActivityForClient[74, 58, 30, 83, 70, 123, 72, -45, -104, 77, 65, -37, -54, 92, 38, 60] org.apache.hadoop.metrics2.MetricsException: javax.management.MalformedObjectNameException: Invalid character ',' inkey part of property at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newObjectName(DefaultMetricsSystem.java:135) at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newMBeanName(DefaultMetricsSystem.java:110) at org.apache.hadoop.metrics2.util.MBeans.getMBeanName(MBeans.java:165) at org.apache.hadoop.metrics2.util.MBeans.register(MBeans.java:97) at org.apache.hadoop.metrics2.util.MBeans.register(MBeans.java:73) at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.startMBeans(MetricsSourceAdapter.java:222) at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.start(MetricsSourceAdapter.java:101) at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.registerSource(MetricsSystemImpl.java:268) at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:233) at org.apache.hadoop.ipc.metrics.RpcDetailedMetrics.create(RpcDetailedMetrics.java:62) at org.apache.hadoop.ipc.Client.(Client.java:1345){code} I think we need to be more careful about how we create stringify the clientID > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch, > HDFS-14084.015.patch, HDFS-14084.016.patch, HDFS-14084.017.patch, > HDFS-14084.018.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16758718#comment-16758718 ] Erik Krogen commented on HDFS-14084: Hey folks, I was testing out this patch and noticed an issue that I consider pretty serious. When this is used in a standard DFS client, the {{DefaultMetricsSystem}} singleton has never been initialized, so there is no proper prefix to use for configurations, and numerous bits of testing code is triggered. For example: {code:title=MetricsSystemImpl#register()} final String finalName = // be friendly to non-metrics tests DefaultMetricsSystem.sourceName(name2, !monitoring); {code} With your patch, this is triggered when {{monitoring}} is false, which is really only intended for testing AFAICT. This is the first instance I'm aware of that is leveraging metrics2 for client-side metrics. I think it means that we need to add a {{DefaultMetricsSystem.init("client")}} in the instantiation of {{Client}}. It will need a corresponding {{shutdown()}}, probably on {{Client#close()}}. Unfortunately both of these methods are expected to be only called once, so we probably need to add some new mechanisms for a "conditional initialization" that only initializes the system if this is the first call. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch, > HDFS-14084.015.patch, HDFS-14084.016.patch, HDFS-14084.017.patch, > HDFS-14084.018.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756827#comment-16756827 ] Hadoop QA commented on HDFS-14084: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 33s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 20m 5s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 59s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 25s{color} | {color:green} root: The patch generated 0 new + 132 unchanged - 2 fixed = 132 total (was 134) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 46s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 0s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 47s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green}100m 12s{color} | {color:green} hadoop-hdfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 52s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}219m 2s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-14084 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12956974/HDFS-14084.018.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 0e29c006e4e6 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / c354195 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/26097/testReport/ | | Max.
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756564#comment-16756564 ] Hadoop QA commented on HDFS-14084: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 25s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 19m 21s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 1s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 24s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 3m 25s{color} | {color:orange} root: The patch generated 2 new + 132 unchanged - 2 fixed = 134 total (was 134) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 13s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 11s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 45s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green}100m 6s{color} | {color:green} hadoop-hdfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 48s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}219m 4s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-14084 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12956922/HDFS-14084.017.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux a90fbf38d70b 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 0e95ae4 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs | v3.1.0-RC1 | | checkstyle |
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756349#comment-16756349 ] Pranay Singh commented on HDFS-14084: - Uploaded the new PATCH, HDFS-14084.017 > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch, > HDFS-14084.015.patch, HDFS-14084.016.patch, HDFS-14084.017.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755498#comment-16755498 ] Hudson commented on HDFS-14084: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15847 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15847/]) Revert "HDFS-14084. Need for more stats in DFSClient. Contributed by (weichiu: rev d1714c20e9309754397588c9b29818b9a74a80d8) * (delete) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/client/impl/TestClientMetrics.java * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/metrics/RpcDetailedMetrics.java * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/ProtobufRpcEngine.java > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch, > HDFS-14084.015.patch, HDFS-14084.016.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755487#comment-16755487 ] Wei-Chiu Chuang commented on HDFS-14084: Reverted. Sorry for my trouble. [~pranay_singh] please go ahead. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch, > HDFS-14084.015.patch, HDFS-14084.016.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755480#comment-16755480 ] Wei-Chiu Chuang commented on HDFS-14084: I'm going to revert the commit from trunk, and then [~pranay_singh] can post a new patch. IMO that makes history cleaner and cherry-pick easier. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Fix For: 3.3.0 > > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch, > HDFS-14084.015.patch, HDFS-14084.016.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754374#comment-16754374 ] Íñigo Goiri commented on HDFS-14084: We have a couple options: * Follow up JIRA to fix this. * Revert [^HDFS-14084.013.patch] and commit one with the fix. * Make [^HDFS-14084.016.patch] and addendum to [^HDFS-14084.013.patch] and clean up the two intermediate ones.. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Fix For: 3.3.0 > > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch, > HDFS-14084.015.patch, HDFS-14084.016.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754261#comment-16754261 ] Pranay Singh commented on HDFS-14084: - That's right so shall I remove the other two patches and rename [^HDFS-14084.016.patch] to [^HDFS-14084.016.patch] > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Fix For: 3.3.0 > > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch, > HDFS-14084.015.patch, HDFS-14084.016.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754195#comment-16754195 ] Íñigo Goiri commented on HDFS-14084: Summarizing, [^HDFS-14084.016.patch] would be an addendum to [^HDFS-14084.013.patch] to fix the issue between hdfs and hdfs-client right? Should we name it as such? > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Fix For: 3.3.0 > > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch, > HDFS-14084.015.patch, HDFS-14084.016.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16753647#comment-16753647 ] Hadoop QA commented on HDFS-14084: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 4s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 14s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}101m 42s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}161m 49s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.qjournal.server.TestJournalNodeSync | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-14084 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12956518/HDFS-14084.016.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 9296202f6209 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 3b49d7a | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/26067/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/26067/testReport/ | | Max. process+thread count | 2921 (vs. ulimit of 1) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/26067/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16753570#comment-16753570 ] Hadoop QA commented on HDFS-14084: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 18s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 47s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 7s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 95m 28s{color} | {color:green} hadoop-hdfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}156m 2s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-14084 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12956507/HDFS-14084.015.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 4c20f006b519 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 47d6b9b | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/26066/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/26066/testReport/ | | Max. process+thread count | 3500 (vs. ulimit of 1) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/26066/console | | Powered by | Apache Yetus 0.8.0
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16753494#comment-16753494 ] Pranay Singh commented on HDFS-14084: - The above failures are related to changing the package details (below) from org.apache.hadoop.hdfs.client.impl. The test fails as it uses classes that are only found in "org.apache.hadoop.hdfs" package, so I have moved TestClientMetrics.java under directory hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs package org.apache.hadoop.hdfs > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Fix For: 3.3.0 > > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch, > HDFS-14084.015.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16753477#comment-16753477 ] Hadoop QA commented on HDFS-14084: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 9s{color} | {color:red} HDFS-14084 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HDFS-14084 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12956501/HDFS-14084.015.patch | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/26065/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Fix For: 3.3.0 > > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch, > HDFS-14084.015.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16753317#comment-16753317 ] Hadoop QA commented on HDFS-14084: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 6s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 52s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 49s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 49s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 46s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 54s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 3m 50s{color} | {color:red} patch has errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 32s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 51s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 49m 36s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-14084 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12956485/HDFS-14084.014.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 11efa5a742ec 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 91649c3 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs | v3.1.0-RC1 | | mvninstall | https://builds.apache.org/job/PreCommit-HDFS-Build/26064/artifact/out/patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs.txt | | compile | https://builds.apache.org/job/PreCommit-HDFS-Build/26064/artifact/out/patch-compile-hadoop-hdfs-project_hadoop-hdfs.txt | | javac | https://builds.apache.org/job/PreCommit-HDFS-Build/26064/artifact/out/patch-compile-hadoop-hdfs-project_hadoop-hdfs.txt | | checkstyle |
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16753309#comment-16753309 ] Pranay Singh commented on HDFS-14084: - Created a new patch to fix the issue. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Fix For: 3.3.0 > > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16753266#comment-16753266 ] Ayush Saxena commented on HDFS-14084: - Hi!!! I guess the newly introduced test seems to be failing at trunk!!! [https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1028/testReport/junit/org.apache.hadoop.hdfs/TestClientMetrics/testGetMetrics/] Even at Pre-Commit Jobs too [https://builds.apache.org/job/PreCommit-HDFS-Build/26062/testReport/junit/org.apache.hadoop.hdfs/TestClientMetrics/testGetMetrics/] Pls give a check to the package name for the Test too. {code:java} b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/client/impl/TestClientMetrics.java {code} There is a difference b/w the location and the package set. {code:java} +package org.apache.hadoop.hdfs; {code} I guess it should be org.apache.hadoop.hdfs.client.impl instead org.apache.hadoop.hdfs from the location where it is put in ; If I am not missing some context. :) > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Fix For: 3.3.0 > > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16752380#comment-16752380 ] Wei-Chiu Chuang commented on HDFS-14084: +1 will commit shortly. I verified the tests in TestAMRMTokens pass now. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16752471#comment-16752471 ] Hudson commented on HDFS-14084: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15830 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15830/]) HDFS-14084. Need for more stats in DFSClient. Contributed by Pranay (weichiu: rev 1d523279da94e199edafc8d4df23107e9c43da3e) * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/ProtobufRpcEngine.java * (add) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/client/impl/TestClientMetrics.java * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/metrics/RpcDetailedMetrics.java * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Fix For: 3.3.0 > > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16740603#comment-16740603 ] Íñigo Goiri commented on HDFS-14084: +1 on [^HDFS-14084.013.patch]. Eventually, I want to reuse this new metrics infrastructure in the client for: * Track speeds in HDFS-12861. * Track the number of errors and exceptions. [~jlowe], this should be fine now; do you mind double checking? > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16740503#comment-16740503 ] Pranay Singh commented on HDFS-14084: - [~elgoiri] I ran all the tests including TestAMRMTokens they are all passing now. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16739965#comment-16739965 ] Hadoop QA commented on HDFS-14084: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 6s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 19m 21s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 58s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 53s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 24s{color} | {color:green} root: The patch generated 0 new + 132 unchanged - 2 fixed = 132 total (was 134) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 16s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 56s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 29s{color} | {color:red} hadoop-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 95m 15s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 48s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}215m 45s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.security.ssl.TestSSLFactory | | | hadoop.hdfs.TestClientMetrics | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-14084 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12954497/HDFS-14084.013.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux bb4a174adf00 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 33c009a4 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs | v3.1.0-RC1
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16739896#comment-16739896 ] Íñigo Goiri commented on HDFS-14084: Have you been able to run {{TestAMRMTokens}} and the others that used to break? > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16739829#comment-16739829 ] Hadoop QA commented on HDFS-14084: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 55s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 46s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 38s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 13m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 41s{color} | {color:green} root: The patch generated 0 new + 132 unchanged - 2 fixed = 132 total (was 134) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 1s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 960 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 15s{color} | {color:red} The patch 600 line(s) with tabs. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 45s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 38s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 27s{color} | {color:red} hadoop-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 31m 58s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 46s{color} | {color:red} The patch generated 3 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}135m 31s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.security.ssl.TestSSLFactory | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-14084 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12954476/HDFS-14084.012.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 9e2dad07f92c 4.4.0-139-generic #165-Ubuntu SMP
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738856#comment-16738856 ] Wangda Tan commented on HDFS-14084: --- Thanks [~jlowe] for letting me know, will redo the RC. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738813#comment-16738813 ] Hudson commented on HDFS-14084: --- FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #15754 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15754/]) Revert "HDFS-14084. Need for more stats in DFSClient. Contributed by (jlowe: rev c634589ab2d602bf80ba513f88d44544e9bedcb5) * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/ProtobufRpcEngine.java * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/metrics/RpcDetailedMetrics.java * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Fix For: 3.0.4, 3.1.2, 3.3.0, 3.2.1 > > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738812#comment-16738812 ] Pranay Singh commented on HDFS-14084: - [~jlowe] I have identified the cause of the problem I'm working on the PATCH that I'll upload by Monday, the issue is that I have used the dummy port # -1 for metrics, reusing that port number is causing these tests to fail. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Fix For: 3.0.4, 3.1.2, 3.3.0, 3.2.1 > > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738808#comment-16738808 ] Wei-Chiu Chuang commented on HDFS-14084: Let's go ahead revert it. Apologize for not verifying YARN/MapReduce. I suspect we can still get away with it by changing the name of metrics registry. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Fix For: 3.0.4, 3.1.2, 3.3.0, 3.2.1 > > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738770#comment-16738770 ] Íñigo Goiri commented on HDFS-14084: [~jlowe], please go ahead with the revert. I still don't get how this happens but we should be able to use TestAMRMTokens for testing. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Fix For: 3.0.4, 3.1.2, 3.3.0, 3.2.1 > > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738768#comment-16738768 ] Jason Lowe commented on HDFS-14084: --- This is breaking more than just tests. MapReduce job tasks are failing with the same error reported in YARN-9183. I verified that jobs work just before this commit went in and fail with this commit. I propose this is reverted until it can be fixed to restore basic functionality to the 3.x builds. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Fix For: 3.0.4, 3.1.2, 3.3.0, 3.2.1 > > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737298#comment-16737298 ] Pranay Singh commented on HDFS-14084: - [~ajisakaa] will look in to the issue today. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Fix For: 3.0.4, 3.1.2, 3.3.0, 3.2.1 > > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736702#comment-16736702 ] Akira Ajisaka commented on HDFS-14084: -- This commit broke TestAMRMTokens and many MapReduce tests. Hi [~pranay_singh], [~elgoiri], and [~jojochuang], would you check YARN-9183? > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Fix For: 3.0.4, 3.1.2, 3.3.0, 3.2.1 > > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16733323#comment-16733323 ] Hudson commented on HDFS-14084: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15699 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15699/]) HDFS-14084. Need for more stats in DFSClient. Contributed by Pranay (weichiu: rev ecdeaa7e6ad43555031aed032e6ba7a14a17d7bc) * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/metrics/RpcDetailedMetrics.java * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/ProtobufRpcEngine.java * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16733290#comment-16733290 ] Wei-Chiu Chuang commented on HDFS-14084: Committing the v011 patch now > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732337#comment-16732337 ] Pranay Singh commented on HDFS-14084: - Seems to be caused by [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8208350] as commented by [~ajisakaa] in HADOOP-16016, there is a patch already done on this one ... > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732325#comment-16732325 ] Pranay Singh commented on HDFS-14084: - OK I'll see if I can triage it further to root cause the failure in TestSSLFactory > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732288#comment-16732288 ] Íñigo Goiri commented on HDFS-14084: I've been seeing the TestSSLFactory failing consistently now. The others are spurious as mentioned. +1 on [^HDFS-14084.011.patch]. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732268#comment-16732268 ] Pranay Singh commented on HDFS-14084: - [~elgoiri] thanks for commenting on the issue, I looked at the details of the failure related to TestSSLFactory. It is caused due to testServerWeakCiphers, which is a sporadic failure and has been reported in HADOOP-16016. - [ERROR] testServerWeakCiphers(org.apache.hadoop.security.ssl.TestSSLFactory) Time elapsed: 0.082 s <<< FAILURE! java.lang.AssertionError: Expected to find 'no cipher suites in common' but got unexpected exception: javax.net.ssl.SSLHandshakeException: No appropriate protocol (protocol is disabled or cipher suites are inappropriate) --- The other two failures are caused because "port is in use" *java.net.BindException: Port in use: localhost:36969 at* org.apache.hadoop.http.HttpServer2.constructBindException(HttpServer2.java:1207) at org.apache.hadoop.http.HttpServer2.bindForSinglePort(HttpServer2.java:1229) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:1288) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:1143) at org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:183) at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:892) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:703) at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:960) at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:933) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1699) at org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2188) at org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNodes(MiniDFSCluster.java:2143) at org.apache.hadoop.hdfs.server.namenode.TestFSEditLogLoader.testHasNonEcBlockUsingStripedIDForAddBlock(TestFSEditLogLoader.java:656) *java.net.BindException: Port in use: localhost:10197 at* org.apache.hadoop.http.HttpServer2.constructBindException(HttpServer2.java:1207) at org.apache.hadoop.http.HttpServer2.bindForSinglePort(HttpServer2.java:1229) at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:1288) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:1143) at org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:183) at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:892) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:703) at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:960) at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:933) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1699) at org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:1316) at org.apache.hadoop.hdfs.MiniDFSCluster.configureNameService(MiniDFSCluster.java:1085) > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: >
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16731409#comment-16731409 ] Íñigo Goiri commented on HDFS-14084: Thanks [~pranay_singh] for the patch. Can you verify the failed unit tests? A little concerned about TestSSLFactory. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730584#comment-16730584 ] Hadoop QA commented on HDFS-14084: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 16m 28s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 41s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 56s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 31s{color} | {color:green} root: The patch generated 0 new + 132 unchanged - 2 fixed = 132 total (was 134) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 13s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 1s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 9m 7s{color} | {color:red} hadoop-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 99m 55s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 46s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}232m 39s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.security.ssl.TestSSLFactory | | | hadoop.hdfs.server.namenode.TestFSEditLogLoader | | | hadoop.hdfs.server.namenode.ha.TestEditLogTailer | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-14084 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12953276/HDFS-14084.011.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 201909b8be72 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / e9a005d | | maven | version:
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730529#comment-16730529 ] Hadoop QA commented on HDFS-14084: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 18m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 48s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 55s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 21m 25s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 4m 34s{color} | {color:orange} root: The patch generated 1 new + 132 unchanged - 2 fixed = 133 total (was 134) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 41s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 7s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 10m 8s{color} | {color:red} hadoop-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}118m 2s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 4s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}247m 44s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.security.ssl.TestSSLFactory | | | hadoop.hdfs.TestPersistBlocks | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-14084 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12953266/HDFS-14084.010.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 50f801c65c95 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / e9a005d | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs |
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730469#comment-16730469 ] Hadoop QA commented on HDFS-14084: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 37s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 6m 45s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 40s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 56s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 5s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 3m 25s{color} | {color:orange} root: The patch generated 3 new + 133 unchanged - 1 fixed = 136 total (was 134) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 8s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 56s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 32s{color} | {color:red} hadoop-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 94m 11s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 48s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}218m 12s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.security.ssl.TestSSLFactory | | | hadoop.hdfs.server.namenode.ha.TestEditLogTailer | | | hadoop.hdfs.server.namenode.ha.TestHAAppend | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-14084 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12953256/HDFS-14084.009.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 0822b05591ef 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 128f340 | | maven | version:
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730058#comment-16730058 ] Hadoop QA commented on HDFS-14084: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 7s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 40s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 47s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 14s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 14m 14s{color} | {color:red} root generated 1 new + 1490 unchanged - 0 fixed = 1491 total (was 1490) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 58s{color} | {color:orange} root: The patch generated 19 new + 133 unchanged - 1 fixed = 152 total (was 134) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 21s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 54s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 24s{color} | {color:red} hadoop-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 74m 45s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 46s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}183m 40s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.security.ssl.TestSSLFactory | | | hadoop.hdfs.web.TestWebHdfsTimeouts | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-14084 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12953185/HDFS-14084.008.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 9cd508e563d6 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / d8f670f | | maven | version: Apache Maven
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730020#comment-16730020 ] Pranay Singh commented on HDFS-14084: - Uploaded new patch > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729820#comment-16729820 ] Wei-Chiu Chuang commented on HDFS-14084: I think the latest patch is mostly good. If you can review the javac and checkstyle warnings and get rid of them then I think we're good to go. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729791#comment-16729791 ] Pranay Singh commented on HDFS-14084: - [~ste...@apache.org] , [~elgoiri] please let me know if there are more issues to be addressed, I have incorporated the above suggested changes. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720611#comment-16720611 ] Hadoop QA commented on HDFS-14084: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 1s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 43s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 4s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 13s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 14m 13s{color} | {color:red} root generated 1 new + 1490 unchanged - 0 fixed = 1491 total (was 1490) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 49s{color} | {color:orange} root: The patch generated 19 new + 133 unchanged - 1 fixed = 152 total (was 134) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 29s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 6s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 25s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m 24s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 39s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}187m 23s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestDFSInotifyEventInputStreamKerberized | | | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.web.TestWebHDFS | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-14084 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12951712/HDFS-14084.007.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 9c2c2445a24c 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720183#comment-16720183 ] Steve Loughran commented on HDFS-14084: --- h3. code RpcDetailedMetrics should be private, not protected. Time.monotonicNow() is only guaranteed to be monotonic on a single core in a single server socket; it is potentially different across sockets [http://steveloughran.blogspot.com/2015/09/time-on-multi-core-multi-socket-servers.html] . I think you may as well go for currentTimeMillis and have special handling for the case where time comes out negative. Because for safe execution in a multisocket system, you need to handle that case too h3. tests can you make the GenericTestUtils.setLogLevel setup in a @BeforeClass and reset in an @AfterClass method? Otherwise the logging will be at debug for the rest of all tests run in the same JVM h3. imports import ordering is all wrong, please do it in the (unenforced yet expected) order of: {code} java.* --- (non-org.apache) --- (org.apache) -- all static and within each block: sorted. You can set a rule for this in the IDE -but do disable any automatic reordering, as it makes merging impossible. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719983#comment-16719983 ] Hadoop QA commented on HDFS-14084: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 11s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 56s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 49s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 13m 49s{color} | {color:red} root generated 1 new + 1490 unchanged - 0 fixed = 1491 total (was 1490) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 43s{color} | {color:orange} root: The patch generated 18 new + 133 unchanged - 1 fixed = 151 total (was 134) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 30s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 42s{color} | {color:red} hadoop-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 75m 59s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 38s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}179m 51s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.ha.TestZKFailoverController | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.server.datanode.TestDirectoryScanner | | | hadoop.hdfs.server.namenode.ha.TestEditLogTailer | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-14084 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12951609/HDFS-14084.006.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 47729ec87f54 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719817#comment-16719817 ] Pranay Singh commented on HDFS-14084: - Thanks [~elgoiri] and [~xkrogen] for you review comments,I have made changes in the latest PATCh as per your suggestions. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719424#comment-16719424 ] Erik Krogen commented on HDFS-14084: Looks good, I have one comment in addition to [~elgoiri]'s above. The log line: {code} LOG.debug(" RPC Client stats: {} " + rb); {code} is wrong, you can see this in your sample output above that it prints the curly braces: {code} 2018-12-10 14:39:54,391 INFO ipc.ProtobufRpcEngine: RPC Client stats: {} AddBlockNumOps = 1 ... {code} It should be: {code} LOG.debug("RPC Client stats: {}", rb); {code} > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719323#comment-16719323 ] Hadoop QA commented on HDFS-14084: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 10s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 45s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 6s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 34s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 14m 34s{color} | {color:red} root generated 1 new + 1490 unchanged - 0 fixed = 1491 total (was 1490) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 59s{color} | {color:orange} root: The patch generated 18 new + 133 unchanged - 1 fixed = 151 total (was 134) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 42s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 2s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 58s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 74m 34s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 50s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}185m 11s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestDFSInotifyEventInputStreamKerberized | | | hadoop.hdfs.server.namenode.TestPersistentStoragePolicySatisfier | | | hadoop.hdfs.web.TestWebHdfsTimeouts | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-14084 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12951526/HDFS-14084.005.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 20997ad9df7f 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719331#comment-16719331 ] Íñigo Goiri commented on HDFS-14084: Thanks [~pranay_singh] for adding the log capturer. Minor comments: * The import in {{Client}} is a dded in a weird place and removing a space. Can you put it in the right order in {{org.apache.hadoop}}? * For {{rpcDetailedMetrics}} can you put the protected first? Follow the common order. * Extra break line after {{Client#218}}. * Add javadoc to {{Client#updateMetrics}} and {{Client#getRpcDetailedMetrics}}? * Why is there a space in the debug message {{ RPC Client stats:}}? It should be right away. * Add a break line in {{RpcDetailedMetrics#83}}. * In {{TestClientMetrics}}, can you extract the output and then actually check for the number of operations? {{output.contains("MkdirsNumOps = 1")}}. * In the {{assertTrue}}, we should print the output: assertTrue("Unexpected output in: " + output, output.contains("MkdirsNumOps = 1")); > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719245#comment-16719245 ] Wei-Chiu Chuang commented on HDFS-14084: LGTM +1 from me pending Jenkins. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718006#comment-16718006 ] Hadoop QA commented on HDFS-14084: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 27s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 50s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 39s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 14m 39s{color} | {color:red} root generated 1 new + 1490 unchanged - 0 fixed = 1491 total (was 1490) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 3m 4s{color} | {color:orange} root: The patch generated 17 new + 133 unchanged - 1 fixed = 150 total (was 134) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 55s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 47s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 7s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 28s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 37s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}185m 21s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.server.namenode.snapshot.TestSnapRootDescendantDiff | | | hadoop.hdfs.server.namenode.sps.TestBlockStorageMovementAttemptedItems | | | hadoop.hdfs.server.namenode.snapshot.TestXAttrWithSnapshot | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-14084 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12951394/HDFS-14084.004.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux f54efbf0554c
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717986#comment-16717986 ] Íñigo Goiri commented on HDFS-14084: We should do: {code} LOG.debug(" RPC Client stats: {} " + rb.toString()); {code} Instead of: {code} LOG.debug(" RPC Client stats: {}", rb); {code} Is there a way to check the output of the logging in the unit test? It looks like GenericTestUtils has some utils like {{LogCapturer}}. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717971#comment-16717971 ] Hadoop QA commented on HDFS-14084: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 22s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m 30s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 52s{color} | {color:orange} root: The patch generated 5 new + 133 unchanged - 1 fixed = 138 total (was 134) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 33s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 15s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 79m 17s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 39s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}184m 30s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.protocol.datatransfer.sasl.TestSaslDataTransfer | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure | | | hadoop.hdfs.TestErasureCodingPoliciesWithRandomECPolicy | | | hadoop.hdfs.server.datanode.TestDirectoryScanner | | | hadoop.hdfs.TestReconstructStripedFile | | | hadoop.hdfs.TestErasureCodingPolicyWithSnapshotWithRandomECPolicy | | | hadoop.hdfs.TestDecommission | | | hadoop.hdfs.TestReadStripedFileWithDNFailure | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-14084 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12951388/HDFS-14084.004.patch | | Optional Tests | dupname asflicense compile
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717399#comment-16717399 ] Hadoop QA commented on HDFS-14084: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s{color} | {color:red} HDFS-14084 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HDFS-14084 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12951377/HDFS-14084.003.patch | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/25763/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716085#comment-16716085 ] Pranay Singh commented on HDFS-14084: - [~xkrogen] I changed the logging from DEBUG to INFO in my test setup and did a listing, copy of file from local filesystem to HDFS and this is what shows up 2018-12-10 14:39:54,391 INFO ipc.ProtobufRpcEngine: RPC Client stats: {} AddBlockNumOps = 1 AddBlockAvgTime = 21.0 CreateNumOps = 1 CreateAvgTime = 39.0 GetFileInfoNumOps = 4 GetFileInfoAvgTime = 1.0 GetServerDefaultsNumOps = 1 GetServerDefaultsAvgTime = 6.0 2018-12-10 14:39:54,471 INFO ipc.ProtobufRpcEngine: RPC Client stats: {} AddBlockNumOps = 1 AddBlockAvgTime = 21.0 CreateNumOps = 1 CreateAvgTime = 39.0 CompleteNumOps = 1 CompleteAvgTime = 4.0 GetFileInfoNumOps = 4 GetFileInfoAvgTime = 1.0 GetServerDefaultsNumOps = 1 GetServerDefaultsAvgTime = 6.0 2018-12-10 14:39:54,880 INFO ipc.ProtobufRpcEngine: RPC Client stats: {} AddBlockNumOps = 1 AddBlockAvgTime = 21.0 CreateNumOps = 1 CreateAvgTime = 39.0 CompleteNumOps = 2 CompleteAvgTime = 7.0 GetFileInfoNumOps = 4 GetFileInfoAvgTime = 1.0 GetServerDefaultsNumOps = 1 GetServerDefaultsAvgTime = 6.0 2018-12-10 14:39:54,896 INFO ipc.ProtobufRpcEngine: RPC Client stats: {} RenameNumOps = 1 RenameAvgTime = 12.0 AddBlockNumOps = 1 AddBlockAvgTime = 21.0 CreateNumOps = 1 CreateAvgTime = 39.0 CompleteNumOps = 2 CompleteAvgTime = 7.0 GetFileInfoNumOps = 4 GetFileInfoAvgTime = 1.0 GetServerDefaultsNumOps = 1 GetServerDefaultsAvgTime = 6.0 > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16713168#comment-16713168 ] Erik Krogen commented on HDFS-14084: Nice, v002 became surprisingly easy :) A few comments: * Can we create a separate log for this e.g. {code}private static final Logger PERF_LOG = LoggerFactory.getLogger(DFSClient.getClass().getCanonicalName() + "_Performance");{code} or something like that? Then you can more easily enable/disable this logging without having to affect the log level of the rest of DFSClient. * Can you please post an example of what the log output looks like? * You took a few things that were guarded by {{isDebugEnabled()}} and now have them execute regardless of log level. I would prefer to see all of this code guarded by {{isDebugEnabled()}} so that none of it is evaluated unless output will be produced. As a more general comment, I would still love to see this integrated with {{StorageStatistics}}, but I think the logging approach makes sense as a complement. It might not be a bad idea to leave this JIRA as the v002 patch and do a follow-on to integrate with {{StorageStatistics}}. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16713153#comment-16713153 ] Íñigo Goiri commented on HDFS-14084: [^HDFS-14084.002.patch] looks surprisingly clean :) This definitely needs a unit test. Minor comments: * You could directly do {{long startTime = Time.now();}}, no need to split. * Should the log be trace? > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16713131#comment-16713131 ] Wei-Chiu Chuang commented on HDFS-14084: v002 patch: Suggest to use Time.monotonicNow() instead of Time.now() Rewrite the following debug log message {code} LOG.debug(" RPC Client stats: " + rb.toString()); {code} with parameterized log messages: {code} LOG.debug(" RPC Client stats: {}", rb); {code} > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712159#comment-16712159 ] Hadoop QA commented on HDFS-14084: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 55s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m 57s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 52s{color} | {color:orange} hadoop-common-project/hadoop-common: The patch generated 4 new + 134 unchanged - 0 fixed = 138 total (was 134) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 43s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 18s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 44s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 96m 44s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-14084 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12950902/HDFS-14084.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux f0dd5599be34 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 019836b | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/25726/artifact/out/diff-checkstyle-hadoop-common-project_hadoop-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/25726/testReport/ | | Max. process+thread count | 1688 (vs. ulimit of 1) | | modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712079#comment-16712079 ] Erik Krogen commented on HDFS-14084: One additional advantage to doing it at the FileSystem level is that it can be exported through standard means. If HADOOP-15125 is completed to finish replacing {{FileSystem#Statistics}} with {{StorageStatistics}}, then this information is exported as MapReduce counters, and can easily be viewed debugging task/job slowness. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712068#comment-16712068 ] Wei-Chiu Chuang commented on HDFS-14084: (deleted my previous comment because I was talking to Pranay offline and then realized I didn't understand what I was talking about) For most part, I am interested in the distribution of latency number. For example, 50%-tile,90%-tile,99%-tile, of OP_DELETE, over some period of time, say the past 1, 5 minutes. We already have something similar at RPC server level (via config key dfs.metrics.percentiles.intervals), just that we don't have that in the client side. Perhaps those metrics can be exported periodically, say 1 minute apart, in the debug log. As I went through the thread, one debate in this thread is whether it should be done at RPC client level or file system level. Either way has its own advantage. HBase sometimes use dfsclient instead of file system, so if it is done only at file system level, hbase won't be able to troubleshoot performance issue. Doing it at file system level makes it generic and applicable to HDFS as well as webhdfs clients. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712046#comment-16712046 ] Wei-Chiu Chuang commented on HDFS-14084: bq. I'm not sure on the best way to get the information out... logging (trace leve?) might be too verbose but I don't see many more ways. IMO logging latency number at trace level (for each op?) doesn't work too well. For most part I am interested in the distribution of latency number. For example, 50%-tile,90%-tile,99%-tile, of OP_DELETE, over some period of time, say the past 30 seconds, 5 minutes. We already have something similar at RPC server level (via config key dfs.metrics.percentiles.intervals), just that we don't have that in the client side. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711760#comment-16711760 ] Erik Krogen commented on HDFS-14084: [~pranay_singh] - thanks for taking this on, this is really valuable work. Regarding: {quote} the problem with StorageStatistics is that it just records the frequency of a particular operation and not it's latency {quote} Ideally this would be great time to _extend_ {{StorageStatistics}} to support latency as well as frequency. This is work I am particularly interested in and was planning on tackling soon, I would love to collaborate. {{StorageStatistics}} has had some great work done on it so far, and it isn't quite where it needs to be yet, but the only way it will get there is by us continuing to make it better. Going down a parallel path that doesn't have long-term and/or shareable benefits isn't good in the long run. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708023#comment-16708023 ] Íñigo Goiri commented on HDFS-14084: As long as we use most of the standard frameworks available in Hadoop it is fine with me. I'm not sure on the best way to get the information out... logging (trace leve?) might be too verbose but I don't see many more ways. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708010#comment-16708010 ] Pranay Singh commented on HDFS-14084: - [~ste...@apache.org] I went through the code and found that the StorageStatistics is already implemented in DistributedFileSystem.java, but the problem with StorageStatistics is that it just records the frequency of a particular operation and not it's latency. It appears that though the stats are collected but they are only used by the test programs perhaps for verification purpose like TestDistributedFileSystem.java, StorageStatisticsTracker.java. The other problem is how to display it, I tried printing the stats through the Iterator on the log but that is too verbose. [~elgoiri] I saw the usage of rpcMetrics(RpcMetrics) in Server.java, which has a function updateMetrics() that records the RPC latency via addRpcProcessingTime() which is called in ProtobufRpcEngine.java at the end RPC call (server side) perhaps similar implementation can be done in client side calling protobuf RPC methods. Please let me know your thoughts on that. This would use the generic Metric interface for publishing the result hence would be more usable. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705135#comment-16705135 ] Pranay Singh commented on HDFS-14084: - Thanks [~ste...@apache.org] I'll go over StorageStatistics stats as well > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16704817#comment-16704817 ] Steve Loughran commented on HDFS-14084: --- We also have the StorageStatistics stats collection which is intended to provide an extensible set of stats, with a standard set of names across filestores and the option for implementations to use The Azure stores (wasb, abfs) were the first clients to take the server-side metrics client side; S3AStorageStatistics pulled that into the StorageStatistics API Why use that instead of implement your own methods? # shipping APIs for clients to use. The S3A committers collect these stats and aggregated them over entire jobs. # consistent across filesystems # transitive access through wrapper filesystems Can I also point you at org.apache.hadoop.fs.s3a.commit.DurationInfo, which collects & logs the duration of operations without all the copy and pasting; it'd be straightforward to do something similar for metrics collections > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16704188#comment-16704188 ] Pranay Singh commented on HDFS-14084: - Thanks [~elgoiri] will check RPCServer. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16704154#comment-16704154 ] Íñigo Goiri commented on HDFS-14084: {code} Íñigo Goiri the issue is that we don't have enough metrics on DFSClient side, so it is hard to troubleshoot issues at the client side. {code} I got that; I'm just pointing out that we should use some of the code abstraction already existing in Hadoop like having proper classes and using metric counters. Right now, this is implemented raw while there are helper tools already available. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16704140#comment-16704140 ] Pranay Singh commented on HDFS-14084: - [~elgoiri] the issue is that we don't have enough metrics on DFSClient side, so it is hard to troubleshoot issues at the client side. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16704126#comment-16704126 ] Íñigo Goiri commented on HDFS-14084: I just realized that [^HDFS-14084.001.patch] was attached. I think this should use the metrics infrastructure already available in Hadoop. RPC server for example collects most of these parameters in a generic way and it provides histograms, etc. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16704045#comment-16704045 ] Hadoop QA commented on HDFS-14084: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 32s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 20s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs-client: The patch generated 21 new + 45 unchanged - 0 fixed = 66 total (was 45) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 36s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 44s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 40s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 55m 42s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs-project/hadoop-hdfs-client | | | Should org.apache.hadoop.hdfs.DFSClient$NamenodeRpcStat be a _static_ inner class? At DFSClient.java:inner class? At DFSClient.java:[lines 210-287] | | | Unread field:field be static? At DFSClient.java:[line 214] | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HDFS-14084 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12950088/HDFS-14084.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux d2177145276d 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 0081b02 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs |
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16704002#comment-16704002 ] Íñigo Goiri commented on HDFS-14084: Internally, we have what we described in HDFS-12861. This is a fair start for investigation on what's going on. We also extended HTrace to collect resource utilization but that goes in a different direction. In addition to throughput (as shown in HDFS-12861) what other metrics you are interested on? DFSClient has plenty of metrics but they are not readily available and the easiest way to get them is to dump them as a log; not the most convenient. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Assignee: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch > > > The usage of HDFS has changed from being used as a map-reduce filesystem, now > it's becoming more of like a general purpose filesystem. In most of the cases > there are issues with the Namenode so we have metrics to know the workload or > stress on Namenode. > However, there is a need to have more statistics collected for different > operations/RPCs in DFSClient to know which RPC operations are taking longer > time or to know what is the frequency of the operation.These statistics can > be exposed to the users of DFS Client and they can periodically log or do > some sort of flow control if the response is slow. This will also help to > isolate HDFS issue in a mixed environment where on a node say we have Spark, > HBase and Impala running together. We can check the throughput of different > operation across client and isolate the problem caused because of noisy > neighbor or network congestion or shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. If we had metrics or stats > in DFSClient we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 ( client side deadlock) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org