[jira] [Comment Edited] (YARN-10718) Fix CapacityScheduler#initScheduler log error.
[ https://issues.apache.org/jira/browse/YARN-10718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17309983#comment-17309983 ] Qi Zhu edited comment on YARN-10718 at 3/30/21, 5:42 AM: - [~pbacsko] [~gandras] [~ebadger] [~brahmareddy] This jira fixes an obvious log error. Could you help review it when you are free? Thanks. was (Author: zhuqi): [~pbacsko] [~gandras] [~ebadger] This jira fixes an obvious log error. Could you help review it when you are free? Thanks. > Fix CapacityScheduler#initScheduler log error. > --- > > Key: YARN-10718 > URL: https://issues.apache.org/jira/browse/YARN-10718 > Project: Hadoop YARN > Issue Type: Bug > Reporter: Qi Zhu > Assignee: Qi Zhu > Priority: Major > Attachments: YARN-10718.001.patch, image-2021-03-28-00-03-28-244.png > > > !image-2021-03-28-00-03-28-244.png|width=972,height=52! > The Resource toString() method already includes the "<" and ">" strings, so it's wrong to add them again.
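To make the log error concrete, here is a minimal sketch in the spirit of CapacityScheduler#initScheduler (the exact message text in the source may differ; this is an illustration, not the committed patch). Resource#toString() already renders as "<memory:X, vCores:Y>", so wrapping it in another pair of angle brackets prints "<<memory:X, vCores:Y>>":

{code:java}
// Before: duplicated brackets in the output.
LOG.info("Initialized CapacityScheduler with calculator="
    + getResourceCalculator().getClass()
    + ", minimumAllocation=<" + getMinimumResourceCapability() + ">"
    + ", maximumAllocation=<" + getMaximumResourceCapability() + ">");

// After: let Resource#toString() supply the brackets itself.
LOG.info("Initialized CapacityScheduler with calculator="
    + getResourceCalculator().getClass()
    + ", minimumAllocation=" + getMinimumResourceCapability()
    + ", maximumAllocation=" + getMaximumResourceCapability());
{code}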
[jira] [Comment Edited] (YARN-10720) YARN WebAppProxyServlet should support connection timeout to prevent too many abnormal connections.
[ https://issues.apache.org/jira/browse/YARN-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17310481#comment-17310481 ] Qi Zhu edited comment on YARN-10720 at 3/30/21, 5:41 AM: - cc [~pbacsko] [~ebadger] [~Jim_Brennan] [~ztang] [~epayne] [~gandras] [~bteke] [~brahmareddy] Could you help review this? Tested in our test cluster; it works well. Thanks. was (Author: zhuqi): cc [~pbacsko] [~ebadger] [~Jim_Brennan] [~ztang] [~epayne] [~gandras] [~bteke] Could you help review this? Tested in our test cluster; it works well. Thanks. > YARN WebAppProxyServlet should support connection timeout to prevent too many abnormal connections. > --- > > Key: YARN-10720 > URL: https://issues.apache.org/jira/browse/YARN-10720 > Project: Hadoop YARN > Issue Type: Bug > Reporter: Qi Zhu > Assignee: Qi Zhu > Priority: Critical > Attachments: YARN-10720.001.patch, YARN-10720.002.patch, YARN-10720.003.patch, image-2021-03-29-14-04-33-776.png, image-2021-03-29-14-05-32-708.png > > > The following shows the proxy server with {color:#de350b}too many connections from one client{color}; this caused the proxy server to hang, and the YARN web UI can't jump to the web proxy. > !image-2021-03-29-14-04-33-776.png|width=632,height=57! > The following shows the abnormal AM; the proxy server does not know it is already abnormal, so the connections can't be closed. We should add timeout support to the proxy server to prevent this. A single abnormal AM may cause hundreds or even thousands of connections, which is very heavy. > !image-2021-03-29-14-05-32-708.png|width=669,height=101! > > After I kill the abnormal AM, the proxy server becomes healthy. This case has happened many times in our production clusters; our clusters are huge, and abnormal AMs occur regularly. > > I will add timeout support to the web proxy server in this jira. > > cc [~pbacsko] [~ebadger] [~Jim_Brennan] [~ztang] [~epayne] [~gandras] [~bteke]
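For readers following along: the proxy servlet forwards requests through an HTTP client, and a minimal sketch of the kind of timeout described above, using Apache HttpClient 4.x APIs, looks like this (the factory class name and the single shared timeout value are assumptions for illustration, not the actual YARN-10720 patch):

{code:java}
import org.apache.http.client.config.RequestConfig;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;

public final class ProxyClientFactory {
  /**
   * Build an HttpClient whose connect and read operations both fail fast
   * instead of hanging forever on an unresponsive AM.
   */
  public static CloseableHttpClient newClient(int timeoutMs) {
    RequestConfig config = RequestConfig.custom()
        .setConnectTimeout(timeoutMs)  // time allowed to establish the TCP connection
        .setSocketTimeout(timeoutMs)   // max idle time between data packets
        .build();
    return HttpClientBuilder.create()
        .setDefaultRequestConfig(config)
        .build();
  }
}
{code}

With a bound like this, a hung AM costs at most timeoutMs per request instead of pinning a proxy connection open indefinitely.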
[jira] [Commented] (YARN-10466) Fix NullPointerException in yarn-services Component.java
[ https://issues.apache.org/jira/browse/YARN-10466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1734#comment-1734 ] Brahma Reddy Battula commented on YARN-10466: - [~dmmkr] thanks for updating the patch. Patch LGTM; will commit shortly. > Fix NullPointerException in yarn-services Component.java > - > > Key: YARN-10466 > URL: https://issues.apache.org/jira/browse/YARN-10466 > Project: Hadoop YARN > Issue Type: Bug > Reporter: D M Murali Krishna Reddy > Assignee: D M Murali Krishna Reddy > Priority: Minor > Attachments: YARN-10466.001.patch, YARN-10466.002.patch > > > Due to changes in [YARN-10219|https://issues.apache.org/jira/browse/YARN-10219], where the constraint is initialised as null, there might be a few scenarios in which an NPE can be thrown in the requestContainers method.
[jira] [Commented] (YARN-1187) Add discrete event-based simulation to yarn scheduler simulator
[ https://issues.apache.org/jira/browse/YARN-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17311100#comment-17311100 ] Anup Agarwal commented on YARN-1187: Trunk gets new commits almost daily. The trunk patch can be applied on the head commit: [HDFS-15918. Replace deprecated RAND_pseudo_bytes (#2811)|https://github.com/apache/hadoop/commit/654555783db0200aef3ae830e381857d2b46701e] (with hash: 654555783db0200aef3ae830e381857d2b46701e) > Add discrete event-based simulation to yarn scheduler simulator > --- > > Key: YARN-1187 > URL: https://issues.apache.org/jira/browse/YARN-1187 > Project: Hadoop YARN > Issue Type: Improvement > Reporter: Wei Yan > Assignee: Andrew Chung > Priority: Major > Attachments: YARN-1187 design doc.pdf, YARN-1187-branch-2.1.3.001.patch, YARN-1187-trunk.001.patch > > > Follow the discussion in YARN-1021. > Discrete event simulation decouples the simulation run from any real-world clock. > This allows users to step through the execution, set debug points, and reliably get a deterministic re-execution.
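For anyone reproducing this locally, a hedged sketch of the steps implied above (assuming the attached patch is a plain git diff; some Hadoop patches instead need the patch tool with -p0):

{code}
git clone https://github.com/apache/hadoop.git && cd hadoop
git checkout 654555783db0200aef3ae830e381857d2b46701e
git apply --check YARN-1187-trunk.001.patch   # dry run to verify it applies
git apply YARN-1187-trunk.001.patch
{code}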
[jira] [Created] (YARN-10722) Improvement to DelegationTokenRenewer in RM
Fengnan Li created YARN-10722: - Summary: Improvement to DelegationTokenRenewer in RM Key: YARN-10722 URL: https://issues.apache.org/jira/browse/YARN-10722 Project: Hadoop YARN Issue Type: Improvement Components: RM, yarn Reporter: Fengnan Li Assignee: Fengnan Li We have seen multiple issues related to the YARN DelegationTokenRenewer, especially when the namenodes that issued the token had problems (e.g. standby down). This component has become a SPOF, blocking all YARN applications from being accepted.
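The jira body above describes the failure mode rather than the fix. Purely to illustrate one mitigation direction (the class and method names here are hypothetical, not the actual YARN-10722 patch), renewal can be bounded with a timeout so a dead standby NameNode cannot block application acceptance:

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.token.Token;

public final class BoundedRenewer {
  private final ExecutorService pool = Executors.newCachedThreadPool();

  /** Renew a token, but never block the caller longer than timeoutMs. */
  public long renewWithTimeout(Token<?> token, Configuration conf, long timeoutMs)
      throws Exception {
    Future<Long> f = pool.submit(() -> token.renew(conf));
    try {
      return f.get(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      f.cancel(true); // one unreachable NameNode must not pin the renewer thread
      throw e;
    }
  }
}
{code}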
[jira] [Commented] (YARN-10721) YARN Service containers are restarted when RM failover
[ https://issues.apache.org/jira/browse/YARN-10721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310928#comment-17310928 ] Hadoop QA commented on YARN-10721: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 20s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red}{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 21m 30s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/867/artifact/out/branch-mvninstall-root.txt{color} | {color:red} root in trunk failed. {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 14s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 17m 43s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. 
{color} | | {color:green}+1{color} | {color:green} spotbugs {color} | {color:green} 0m 53s{color} | {color:green}{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 27s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 22s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 24s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 1s{color} | {color:green}{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 49s{color} | {color:green}{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s{color} | {color:green}{color} | {color:green}
[jira] [Updated] (YARN-10501) Can't remove all node labels after add node label without nodemanager port
[ https://issues.apache.org/jira/browse/YARN-10501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-10501: --- Fix Version/s: 2.10.2 Thanks for the patch/patience [~caozhiqiang]. Finally HadoopQA is back to normal. I fixed up the small checkstyle issue on the patch and committed it to branch-2.10. > Can't remove all node labels after add node label without nodemanager port > -- > > Key: YARN-10501 > URL: https://issues.apache.org/jira/browse/YARN-10501 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn > Affects Versions: 3.4.0 > Reporter: caozhiqiang > Assignee: caozhiqiang > Priority: Critical > Fix For: 3.4.0, 3.3.1, 3.1.5, 2.10.2, 3.2.3 > > Attachments: YARN-10501-branch-2.10.001.patch, YARN-10501.002.patch, YARN-10501.003.patch, YARN-10501.004.patch, YARN-10502-branch-2.10.002.patch, YARN-10502-branch-2.10.003.patch > > > When adding a label to nodes without a nodemanager port, or using the WILDCARD_PORT (0) port, it is impossible to remove all label info on these nodes. > Reproduce process: > {code:java} > 1.yarn rmadmin -addToClusterNodeLabels "cpunode(exclusive=true)" > 2.yarn rmadmin -replaceLabelsOnNode "server001=cpunode" > 3.curl http://RM_IP:8088/ws/v1/cluster/label-mappings > {"labelsToNodes":{"entry":{"key":{"name":"cpunode","exclusivity":"true"},"value":{"nodes":["server001:0","server001:45454"],"partitionInfo":{"resourceAvailable":{"memory":"510","vCores":"1","resourceInformations":{"resourceInformation":[{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"memory-mb","resourceType":"COUNTABLE","units":"Mi","value":"510"},{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"vcores","resourceType":"COUNTABLE","units":"","value":"1"}]}}} > 4.yarn rmadmin -replaceLabelsOnNode "server001" > 5.curl http://RM_IP:8088/ws/v1/cluster/label-mappings > {"labelsToNodes":{"entry":{"key":{"name":"cpunode","exclusivity":"true"},"value":{"nodes":"server001:45454","partitionInfo":{"resourceAvailable":{"memory":"0","vCores":"0","resourceInformations":{"resourceInformation":[{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"memory-mb","resourceType":"COUNTABLE","units":"Mi","value":"0"},{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"vcores","resourceType":"COUNTABLE","units":"","value":"0"}]}}} > {code} > You can see that after step 4, which removes the nodemanager labels, the label info is still present in the node info. > {code:java} > 641 case REPLACE: > 642 replaceNodeForLabels(nodeId, host.labels, labels); > 643 replaceLabelsForNode(nodeId, host.labels, labels); > 644 host.labels.clear(); > 645 host.labels.addAll(labels); > 646 for (Node node : host.nms.values()) { > 647 replaceNodeForLabels(node.nodeId, node.labels, labels); > 649 node.labels = null; > 650 } > 651 break;{code} > The cause is at line 647: when labels are added to a node without a port, both the 0 port and the real NM port are added to the node info, and when labels are removed, the node.labels parameter at line 647 is null, so the old label is not removed.
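A hypothetical sketch of the fix direction implied by that analysis (variable names follow the quoted snippet; this is not necessarily the committed patch): snapshot the host's old labels before they are cleared, and fall back to them when a per-node label set is null, so the stale mapping is actually removed.

{code:java}
case REPLACE:
  replaceNodeForLabels(nodeId, host.labels, labels);
  replaceLabelsForNode(nodeId, host.labels, labels);
  // Snapshot before the clear below, so the loop still sees the old labels.
  Set<String> oldHostLabels = new HashSet<>(host.labels);
  host.labels.clear();
  host.labels.addAll(labels);
  for (Node node : host.nms.values()) {
    // When node.labels is null the node inherited the host's labels,
    // so those are what must be replaced.
    replaceNodeForLabels(node.nodeId,
        node.labels != null ? node.labels : oldHostLabels, labels);
    node.labels = null;
  }
  break;
{code}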
[jira] [Updated] (YARN-10721) YARN Service containers are restarted when RM failover
[ https://issues.apache.org/jira/browse/YARN-10721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kyungwan nam updated YARN-10721: Description: Our cluster has a large number of NMs. When the RM failed over, it took 7 minutes for most of the NMs to register with the RM. Afterwards, I saw that a lot of containers were restarted. I think it is related to YARN-6168. was: Our cluster has a large number of NMs. When the RM failed over, it took 7 minutes for most of the NMs to register with the RM. Afterwards, I saw that a lot of containers were restarted. > YARN Service containers are restarted when RM failover > -- > > Key: YARN-10721 > URL: https://issues.apache.org/jira/browse/YARN-10721 > Project: Hadoop YARN > Issue Type: Bug > Reporter: kyungwan nam > Assignee: kyungwan nam > Priority: Major > Attachments: YARN-10721.001.patch > > > Our cluster has a large number of NMs. > When the RM failed over, it took 7 minutes for most of the NMs to register with the RM. > Afterwards, I saw that a lot of containers were restarted. > I think it is related to YARN-6168.
[jira] [Assigned] (YARN-10721) YARN Service containers are restarted when RM failover
[ https://issues.apache.org/jira/browse/YARN-10721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kyungwan nam reassigned YARN-10721: --- Attachment: YARN-10721.001.patch Assignee: kyungwan nam > YARN Service containers are restarted when RM failover > -- > > Key: YARN-10721 > URL: https://issues.apache.org/jira/browse/YARN-10721 > Project: Hadoop YARN > Issue Type: Bug > Reporter: kyungwan nam > Assignee: kyungwan nam > Priority: Major > Attachments: YARN-10721.001.patch > > > Our cluster has a large number of NMs. > When the RM failed over, it took 7 minutes for most of the NMs to register with the RM. > Afterwards, I saw that a lot of containers were restarted.
[jira] [Created] (YARN-10721) YARN Service containers are restarted when RM failover
kyungwan nam created YARN-10721: --- Summary: YARN Service containers are restarted when RM failover Key: YARN-10721 URL: https://issues.apache.org/jira/browse/YARN-10721 Project: Hadoop YARN Issue Type: Bug Reporter: kyungwan nam Our cluster has a large number of NMs. When the RM failed over, it took 7 minutes for most of the NMs to register with the RM. Afterwards, I saw that a lot of containers were restarted.
[jira] [Commented] (YARN-10720) YARN WebAppProxyServlet should support connection timeout to prevent too many abnormal connections.
[ https://issues.apache.org/jira/browse/YARN-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310757#comment-17310757 ] Hadoop QA commented on YARN-10720: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 15s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 37s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 41s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 8s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 0s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 43s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 22m 12s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 54s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 6s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 30m 52s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. 
{color} | | {color:green}+1{color} | {color:green} spotbugs {color} | {color:green} 4m 43s{color} | {color:green}{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 34s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 15s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 9m 15s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/866/artifact/out/diff-compile-javac-hadoop-yarn-project_hadoop-yarn-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04.txt{color} | {color:red} hadoop-yarn-project_hadoop-yarn-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 generated 1 new + 384 unchanged - 0 fixed = 385 total (was 384) {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 18s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 8m 18s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/866/artifact/out/diff-compile-javac-hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08.txt{color} | {color:red} hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 generated 1 new + 354 unchanged - 0 fixed = 355 total (was 354) {color} | |
[jira] [Updated] (YARN-10720) YARN WebAppProxyServlet should support connection timeout to prevent too many abnormal connections.
[ https://issues.apache.org/jira/browse/YARN-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10720: -- Attachment: YARN-10720.003.patch > YARN WebAppProxyServlet should support connection timeout to prevent too many abnormal connections. > --- > > Key: YARN-10720 > URL: https://issues.apache.org/jira/browse/YARN-10720 > Project: Hadoop YARN > Issue Type: Bug > Reporter: Qi Zhu > Assignee: Qi Zhu > Priority: Critical > Attachments: YARN-10720.001.patch, YARN-10720.002.patch, YARN-10720.003.patch, image-2021-03-29-14-04-33-776.png, image-2021-03-29-14-05-32-708.png > > > The following shows the proxy server with {color:#de350b}too many connections from one client{color}; this caused the proxy server to hang, and the YARN web UI can't jump to the web proxy. > !image-2021-03-29-14-04-33-776.png|width=632,height=57! > The following shows the abnormal AM; the proxy server does not know it is already abnormal, so the connections can't be closed. We should add timeout support to the proxy server to prevent this. A single abnormal AM may cause hundreds or even thousands of connections, which is very heavy. > !image-2021-03-29-14-05-32-708.png|width=669,height=101! > > After I kill the abnormal AM, the proxy server becomes healthy. This case has happened many times in our production clusters; our clusters are huge, and abnormal AMs occur regularly. > > I will add timeout support to the web proxy server in this jira. > > cc [~pbacsko] [~ebadger] [~Jim_Brennan] [~ztang] [~epayne] [~gandras] [~bteke]
[jira] [Commented] (YARN-10501) Can't remove all node labels after add node label without nodemanager port
[ https://issues.apache.org/jira/browse/YARN-10501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310630#comment-17310630 ] Hadoop QA commented on YARN-10501: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 22s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-2.10 Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 48s{color} | {color:green}{color} | {color:green} branch-2.10 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green}{color} | {color:green} branch-2.10 passed with JDK Azul Systems, Inc.-1.7.0_262-b10 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s{color} | {color:green}{color} | {color:green} branch-2.10 passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~16.04-b08 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green}{color} | {color:green} branch-2.10 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 39s{color} | {color:green}{color} | {color:green} branch-2.10 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green}{color} | {color:green} branch-2.10 passed with JDK Azul Systems, Inc.-1.7.0_262-b10 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green}{color} | {color:green} branch-2.10 passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~16.04-b08 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 3m 49s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. 
{color} | | {color:green}+1{color} | {color:green} spotbugs {color} | {color:green} 1m 26s{color} | {color:green}{color} | {color:green} branch-2.10 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 32s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green}{color} | {color:green} the patch passed with JDK Azul Systems, Inc.-1.7.0_262-b10 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~16.04-b08 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 29s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 17s{color} | {color:orange}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/865/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common: The patch generated 2 new + 55 unchanged - 0 fixed = 57 total (was 55) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s{color} | {color:green}{color} | {color:green} the patch passed with JDK Azul Systems, Inc.-1.7.0_262-b10 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~16.04-b08 {color} | | {color:green}+1{color} | {color:green} spotbugs {color} |
[jira] [Updated] (YARN-10501) Can't remove all node labels after add node label without nodemanager port
[ https://issues.apache.org/jira/browse/YARN-10501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] caozhiqiang updated YARN-10501: --- Attachment: YARN-10502-branch-2.10.003.patch > Can't remove all node labels after add node label without nodemanager port > -- > > Key: YARN-10501 > URL: https://issues.apache.org/jira/browse/YARN-10501 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn > Affects Versions: 3.4.0 > Reporter: caozhiqiang > Assignee: caozhiqiang > Priority: Critical > Fix For: 3.4.0, 3.3.1, 3.1.5, 3.2.3 > > Attachments: YARN-10501-branch-2.10.001.patch, YARN-10501.002.patch, YARN-10501.003.patch, YARN-10501.004.patch, YARN-10502-branch-2.10.002.patch, YARN-10502-branch-2.10.003.patch > > > When adding a label to nodes without a nodemanager port, or using the WILDCARD_PORT (0) port, it is impossible to remove all label info on these nodes. > Reproduce process: > {code:java} > 1.yarn rmadmin -addToClusterNodeLabels "cpunode(exclusive=true)" > 2.yarn rmadmin -replaceLabelsOnNode "server001=cpunode" > 3.curl http://RM_IP:8088/ws/v1/cluster/label-mappings > {"labelsToNodes":{"entry":{"key":{"name":"cpunode","exclusivity":"true"},"value":{"nodes":["server001:0","server001:45454"],"partitionInfo":{"resourceAvailable":{"memory":"510","vCores":"1","resourceInformations":{"resourceInformation":[{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"memory-mb","resourceType":"COUNTABLE","units":"Mi","value":"510"},{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"vcores","resourceType":"COUNTABLE","units":"","value":"1"}]}}} > 4.yarn rmadmin -replaceLabelsOnNode "server001" > 5.curl http://RM_IP:8088/ws/v1/cluster/label-mappings > {"labelsToNodes":{"entry":{"key":{"name":"cpunode","exclusivity":"true"},"value":{"nodes":"server001:45454","partitionInfo":{"resourceAvailable":{"memory":"0","vCores":"0","resourceInformations":{"resourceInformation":[{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"memory-mb","resourceType":"COUNTABLE","units":"Mi","value":"0"},{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"vcores","resourceType":"COUNTABLE","units":"","value":"0"}]}}} > {code} > You can see that after step 4, which removes the nodemanager labels, the label info is still present in the node info. > {code:java} > 641 case REPLACE: > 642 replaceNodeForLabels(nodeId, host.labels, labels); > 643 replaceLabelsForNode(nodeId, host.labels, labels); > 644 host.labels.clear(); > 645 host.labels.addAll(labels); > 646 for (Node node : host.nms.values()) { > 647 replaceNodeForLabels(node.nodeId, node.labels, labels); > 649 node.labels = null; > 650 } > 651 break;{code} > The cause is at line 647: when labels are added to a node without a port, both the 0 port and the real NM port are added to the node info, and when labels are removed, the node.labels parameter at line 647 is null, so the old label is not removed.
[jira] [Commented] (YARN-6202) Configuration item Dispatcher.DISPATCHER_EXIT_ON_ERROR_KEY is disregarded
[ https://issues.apache.org/jira/browse/YARN-6202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17310564#comment-17310564 ] zhengchenyu commented on YARN-6202: --- [~yufeigu] I agree that exitOnDispatchException should be true in most scenarios. But for some unimportant components, for example the dispatcher in CommonNodeLabelsManager, I think setting exitOnDispatchException to true is unnecessary. A failed NodeLabel command should not make the resourcemanager crash. Maybe we should provide a configuration item to the user. > Configuration item Dispatcher.DISPATCHER_EXIT_ON_ERROR_KEY is disregarded > - > > Key: YARN-6202 > URL: https://issues.apache.org/jira/browse/YARN-6202 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager, resourcemanager > Affects Versions: 2.9.0, 3.0.0-alpha2 > Reporter: Yufei Gu > Assignee: Yufei Gu > Priority: Major > Fix For: 3.0.0-alpha4 > > Attachments: YARN-6202.001.patch, YARN-6202.002.patch, YARN-6202.003.patch, YARN-6202.004.patch > > > Dispatcher.DISPATCHER_EXIT_ON_ERROR_KEY (yarn.dispatcher.exit-on-error) is always true no matter what value is in the configuration files. This misleads users. Two solutions: > # Remove the configuration item and provide a method to allow {{exitOnDispatchException}}/{{shouldExitOnError}} to be false to enable related unit tests. There is no need for a false value in a real daemon since daemons should crash if their dispatcher quits. > # Make it default to true instead of false, so that we don't need to hard-code it to be true in the RM and NM; it is still configurable, and also provide a method to enable related unit tests. > Other than that, the code around it needs refactoring. {{public static final}} for a variable of an interface isn't necessary, and YARN-related configuration items should be in the YarnConfiguration class.
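To make the trade-off in this thread concrete, here is a toy dispatcher (hypothetical names, not the real AsyncDispatcher API) showing why exit-on-error fits core dispatchers while an auxiliary one, such as node-label updates, could arguably log and continue if made configurable:

{code:java}
public final class SimpleDispatcher {
  private final boolean exitOnError;

  public SimpleDispatcher(boolean exitOnError) {
    this.exitOnError = exitOnError;
  }

  public void dispatch(Runnable event) {
    try {
      event.run();
    } catch (Throwable t) {
      if (exitOnError) {
        // Core dispatcher: internal state is now undefined, so fail fast.
        System.exit(-1);
      } else {
        // Auxiliary dispatcher: record the failure and keep serving events.
        System.err.println("event failed, continuing: " + t);
      }
    }
  }
}
{code}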
[jira] [Commented] (YARN-10720) YARN WebAppProxyServlet should support connection timeout to prevent too many abnormal connections.
[ https://issues.apache.org/jira/browse/YARN-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310554#comment-17310554 ] Hadoop QA commented on YARN-10720: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 44s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 48s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 45s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 30s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 36s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 40s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 40s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 56s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 19s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 30s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 27m 35s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. 
{color} | | {color:green}+1{color} | {color:green} spotbugs {color} | {color:green} 4m 52s{color} | {color:green}{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 39s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 30s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 8m 30s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/864/artifact/out/diff-compile-javac-hadoop-yarn-project_hadoop-yarn-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04.txt{color} | {color:red} hadoop-yarn-project_hadoop-yarn-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 generated 1 new + 384 unchanged - 0 fixed = 385 total (was 384) {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 49s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 7m 49s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/864/artifact/out/diff-compile-javac-hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08.txt{color} | {color:red} hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 generated 1 new + 355 unchanged - 0 fixed = 356 total (was 355) {color} | |
[jira] [Commented] (YARN-10720) YARN WebAppProxyServlet should support connection timeout to prevent too many abnormal connections.
[ https://issues.apache.org/jira/browse/YARN-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310540#comment-17310540 ] Hadoop QA commented on YARN-10720: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 51s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 4s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 23s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 41s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 8s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 42s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 19s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 21s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 12s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 23m 20s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. 
{color} | | {color:green}+1{color} | {color:green} spotbugs {color} | {color:green} 2m 44s{color} | {color:green}{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 55s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 10s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 9m 10s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/862/artifact/out/diff-compile-javac-hadoop-yarn-project_hadoop-yarn-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04.txt{color} | {color:red} hadoop-yarn-project_hadoop-yarn-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 generated 1 new + 385 unchanged - 0 fixed = 386 total (was 385) {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 21s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 8m 21s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/862/artifact/out/diff-compile-javac-hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08.txt{color} | {color:red} hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 generated 1 new + 354 unchanged - 0 fixed = 355 total (was 354) {color} | |
[jira] [Commented] (YARN-10501) Can't remove all node labels after add node label without nodemanager port
[ https://issues.apache.org/jira/browse/YARN-10501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17310497#comment-17310497 ] Hadoop QA commented on YARN-10501: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 17m 42s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-2.10 Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 39s{color} | {color:green}{color} | {color:green} branch-2.10 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s{color} | {color:green}{color} | {color:green} branch-2.10 passed with JDK Azul Systems, Inc.-1.7.0_262-b10 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green}{color} | {color:green} branch-2.10 passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~16.04-b08 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green}{color} | {color:green} branch-2.10 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s{color} | {color:green}{color} | {color:green} branch-2.10 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s{color} | {color:green}{color} | {color:green} branch-2.10 passed with JDK Azul Systems, Inc.-1.7.0_262-b10 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s{color} | {color:green}{color} | {color:green} branch-2.10 passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~16.04-b08 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 4m 5s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. {color} | | {color:green}+1{color} | {color:green} spotbugs {color} | {color:green} 1m 36s{color} | {color:green}{color} | {color:green} branch-2.10 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 35s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/863/artifact/out/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt{color} | {color:red} hadoop-yarn-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 36s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/863/artifact/out/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common-jdkAzulSystems,Inc.-1.7.0_262-b10.txt{color} | {color:red} hadoop-yarn-common in the patch failed with JDK Azul Systems, Inc.-1.7.0_262-b10. 
{color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 36s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/863/artifact/out/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common-jdkAzulSystems,Inc.-1.7.0_262-b10.txt{color} | {color:red} hadoop-yarn-common in the patch failed with JDK Azul Systems, Inc.-1.7.0_262-b10. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 33s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/863/artifact/out/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common-jdkPrivateBuild-1.8.0_282-8u282-b08-0ubuntu1~16.04-b08.txt{color} | {color:red} hadoop-yarn-common in the patch failed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~16.04-b08. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 33s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/863/artifact/out/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common-jdkPrivateBuild-1.8.0_282-8u282-b08-0ubuntu1~16.04-b08.txt{color} | {color:red} hadoop-yarn-common in the patch failed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~16.04-b08. {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 20s{color} |
[jira] [Updated] (YARN-10720) YARN WebAppProxyServlet should support connection timeout to prevent too many abnormal connections.
[ https://issues.apache.org/jira/browse/YARN-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10720: -- Attachment: YARN-10720.002.patch > YARN WebAppProxyServlet should support connection timeout to prevent too many abnormal connections. > --- > > Key: YARN-10720 > URL: https://issues.apache.org/jira/browse/YARN-10720 > Project: Hadoop YARN > Issue Type: Bug > Reporter: Qi Zhu > Assignee: Qi Zhu > Priority: Critical > Attachments: YARN-10720.001.patch, YARN-10720.002.patch, image-2021-03-29-14-04-33-776.png, image-2021-03-29-14-05-32-708.png > > > The following shows the proxy server with {color:#de350b}too many connections from one client{color}; this caused the proxy server to hang, and the YARN web UI can't jump to the web proxy. > !image-2021-03-29-14-04-33-776.png|width=632,height=57! > The following shows the abnormal AM; the proxy server does not know it is already abnormal, so the connections can't be closed. We should add timeout support to the proxy server to prevent this. A single abnormal AM may cause hundreds or even thousands of connections, which is very heavy. > !image-2021-03-29-14-05-32-708.png|width=669,height=101! > > After I kill the abnormal AM, the proxy server becomes healthy. This case has happened many times in our production clusters; our clusters are huge, and abnormal AMs occur regularly. > > I will add timeout support to the web proxy server in this jira. > > cc [~pbacsko] [~ebadger] [~Jim_Brennan] [~ztang] [~epayne] [~gandras] [~bteke]
[jira] [Commented] (YARN-10720) YARN WebAppProxyServlet should support connection timeout to prevent too many abnormal connections.
[ https://issues.apache.org/jira/browse/YARN-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17310481#comment-17310481 ] Qi Zhu commented on YARN-10720: --- cc [~pbacsko] [~ebadger] [~Jim_Brennan] [~ztang] [~epayne] [~gandras] [~bteke] Could you help review this? Tested in our test cluster; it works well. Thanks. > YARN WebAppProxyServlet should support connection timeout to prevent too many abnormal connections. > --- > > Key: YARN-10720 > URL: https://issues.apache.org/jira/browse/YARN-10720 > Project: Hadoop YARN > Issue Type: Bug > Reporter: Qi Zhu > Assignee: Qi Zhu > Priority: Critical > Attachments: YARN-10720.001.patch, image-2021-03-29-14-04-33-776.png, image-2021-03-29-14-05-32-708.png > > > The following shows the proxy server with {color:#de350b}too many connections from one client{color}; this caused the proxy server to hang, and the YARN web UI can't jump to the web proxy. > !image-2021-03-29-14-04-33-776.png|width=632,height=57! > The following shows the abnormal AM; the proxy server does not know it is already abnormal, so the connections can't be closed. We should add timeout support to the proxy server to prevent this. A single abnormal AM may cause hundreds or even thousands of connections, which is very heavy. > !image-2021-03-29-14-05-32-708.png|width=669,height=101! > > After I kill the abnormal AM, the proxy server becomes healthy. This case has happened many times in our production clusters; our clusters are huge, and abnormal AMs occur regularly. > > I will add timeout support to the web proxy server in this jira. > > cc [~pbacsko] [~ebadger] [~Jim_Brennan] [~ztang] [~epayne] [~gandras] [~bteke]
[jira] [Updated] (YARN-10501) Can't remove all node labels after add node label without nodemanager port
[ https://issues.apache.org/jira/browse/YARN-10501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] caozhiqiang updated YARN-10501: --- Attachment: YARN-10502-branch-2.10.002.patch > Can't remove all node labels after add node label without nodemanager port > -- > > Key: YARN-10501 > URL: https://issues.apache.org/jira/browse/YARN-10501 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn > Affects Versions: 3.4.0 > Reporter: caozhiqiang > Assignee: caozhiqiang > Priority: Critical > Fix For: 3.4.0, 3.3.1, 3.1.5, 3.2.3 > > Attachments: YARN-10501-branch-2.10.001.patch, YARN-10501.002.patch, YARN-10501.003.patch, YARN-10501.004.patch, YARN-10502-branch-2.10.002.patch > > > When adding a label to nodes without a nodemanager port, or using the WILDCARD_PORT (0) port, it is impossible to remove all label info on these nodes. > Reproduce process: > {code:java} > 1.yarn rmadmin -addToClusterNodeLabels "cpunode(exclusive=true)" > 2.yarn rmadmin -replaceLabelsOnNode "server001=cpunode" > 3.curl http://RM_IP:8088/ws/v1/cluster/label-mappings > {"labelsToNodes":{"entry":{"key":{"name":"cpunode","exclusivity":"true"},"value":{"nodes":["server001:0","server001:45454"],"partitionInfo":{"resourceAvailable":{"memory":"510","vCores":"1","resourceInformations":{"resourceInformation":[{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"memory-mb","resourceType":"COUNTABLE","units":"Mi","value":"510"},{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"vcores","resourceType":"COUNTABLE","units":"","value":"1"}]}}} > 4.yarn rmadmin -replaceLabelsOnNode "server001" > 5.curl http://RM_IP:8088/ws/v1/cluster/label-mappings > {"labelsToNodes":{"entry":{"key":{"name":"cpunode","exclusivity":"true"},"value":{"nodes":"server001:45454","partitionInfo":{"resourceAvailable":{"memory":"0","vCores":"0","resourceInformations":{"resourceInformation":[{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"memory-mb","resourceType":"COUNTABLE","units":"Mi","value":"0"},{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"vcores","resourceType":"COUNTABLE","units":"","value":"0"}]}}} > {code} > You can see that after step 4, which removes the nodemanager labels, the label info is still present in the node info. > {code:java} > 641 case REPLACE: > 642 replaceNodeForLabels(nodeId, host.labels, labels); > 643 replaceLabelsForNode(nodeId, host.labels, labels); > 644 host.labels.clear(); > 645 host.labels.addAll(labels); > 646 for (Node node : host.nms.values()) { > 647 replaceNodeForLabels(node.nodeId, node.labels, labels); > 649 node.labels = null; > 650 } > 651 break;{code} > The cause is at line 647: when labels are added to a node without a port, both the 0 port and the real NM port are added to the node info, and when labels are removed, the node.labels parameter at line 647 is null, so the old label is not removed.
[jira] [Updated] (YARN-10501) Can't remove all node labels after add node label without nodemanager port
[ https://issues.apache.org/jira/browse/YARN-10501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] caozhiqiang updated YARN-10501: --- Attachment: (was: YARN-10501-branch-2.10.1.002.patch) > Can't remove all node labels after add node label without nodemanager port > -- > > Key: YARN-10501 > URL: https://issues.apache.org/jira/browse/YARN-10501 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.4.0 >Reporter: caozhiqiang >Assignee: caozhiqiang >Priority: Critical > Fix For: 3.4.0, 3.3.1, 3.1.5, 3.2.3 > > Attachments: YARN-10501-branch-2.10.001.patch, YARN-10501.002.patch, > YARN-10501.003.patch, YARN-10501.004.patch, YARN-10502-branch-2.10.002.patch > > > When adding a label to nodes without a nodemanager port, or using the WILDCARD_PORT (0) port, not all label info on these nodes can be removed. > Reproduce process: > {code:java} > 1.yarn rmadmin -addToClusterNodeLabels "cpunode(exclusive=true)" > 2.yarn rmadmin -replaceLabelsOnNode "server001=cpunode" > 3.curl http://RM_IP:8088/ws/v1/cluster/label-mappings > {"labelsToNodes":{"entry":{"key":{"name":"cpunode","exclusivity":"true"},"value":{"nodes":["server001:0","server001:45454"],"partitionInfo":{"resourceAvailable":{"memory":"510","vCores":"1","resourceInformations":{"resourceInformation":[{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"memory-mb","resourceType":"COUNTABLE","units":"Mi","value":"510"},{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"vcores","resourceType":"COUNTABLE","units":"","value":"1"}]}}} > 4.yarn rmadmin -replaceLabelsOnNode "server001" > 5.curl http://RM_IP:8088/ws/v1/cluster/label-mappings > {"labelsToNodes":{"entry":{"key":{"name":"cpunode","exclusivity":"true"},"value":{"nodes":"server001:45454","partitionInfo":{"resourceAvailable":{"memory":"0","vCores":"0","resourceInformations":{"resourceInformation":[{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"memory-mb","resourceType":"COUNTABLE","units":"Mi","value":"0"},{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"vcores","resourceType":"COUNTABLE","units":"","value":"0"}]}}} > {code} > You can see that after step 4, which removes the nodemanager labels, the label info is still present in the node info. > {code:java} > 641 case REPLACE: > 642 replaceNodeForLabels(nodeId, host.labels, labels); > 643 replaceLabelsForNode(nodeId, host.labels, labels); > 644 host.labels.clear(); > 645 host.labels.addAll(labels); > 646 for (Node node : host.nms.values()) { > 647 replaceNodeForLabels(node.nodeId, node.labels, labels); > 649 node.labels = null; > 650 } > 651 break;{code} > The cause is on line 647: when labels are added to a node without a port, both the 0 port and the real NM port are added to the node info, and when the labels are removed, the node.labels parameter on line 647 is null, so the old label is not removed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10501) Can't remove all node labels after add node label without nodemanager port
[ https://issues.apache.org/jira/browse/YARN-10501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] caozhiqiang updated YARN-10501: --- Attachment: YARN-10501-branch-2.10.1.002.patch > Can't remove all node labels after add node label without nodemanager port > -- > > Key: YARN-10501 > URL: https://issues.apache.org/jira/browse/YARN-10501 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.4.0 >Reporter: caozhiqiang >Assignee: caozhiqiang >Priority: Critical > Fix For: 3.4.0, 3.3.1, 3.1.5, 3.2.3 > > Attachments: YARN-10501-branch-2.10.001.patch, > YARN-10501-branch-2.10.1.002.patch, YARN-10501.002.patch, > YARN-10501.003.patch, YARN-10501.004.patch > > > When adding a label to nodes without a nodemanager port, or using the WILDCARD_PORT (0) port, not all label info on these nodes can be removed. > Reproduce process: > {code:java} > 1.yarn rmadmin -addToClusterNodeLabels "cpunode(exclusive=true)" > 2.yarn rmadmin -replaceLabelsOnNode "server001=cpunode" > 3.curl http://RM_IP:8088/ws/v1/cluster/label-mappings > {"labelsToNodes":{"entry":{"key":{"name":"cpunode","exclusivity":"true"},"value":{"nodes":["server001:0","server001:45454"],"partitionInfo":{"resourceAvailable":{"memory":"510","vCores":"1","resourceInformations":{"resourceInformation":[{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"memory-mb","resourceType":"COUNTABLE","units":"Mi","value":"510"},{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"vcores","resourceType":"COUNTABLE","units":"","value":"1"}]}}} > 4.yarn rmadmin -replaceLabelsOnNode "server001" > 5.curl http://RM_IP:8088/ws/v1/cluster/label-mappings > {"labelsToNodes":{"entry":{"key":{"name":"cpunode","exclusivity":"true"},"value":{"nodes":"server001:45454","partitionInfo":{"resourceAvailable":{"memory":"0","vCores":"0","resourceInformations":{"resourceInformation":[{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"memory-mb","resourceType":"COUNTABLE","units":"Mi","value":"0"},{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"vcores","resourceType":"COUNTABLE","units":"","value":"0"}]}}} > {code} > You can see that after step 4, which removes the nodemanager labels, the label info is still present in the node info. > {code:java} > 641 case REPLACE: > 642 replaceNodeForLabels(nodeId, host.labels, labels); > 643 replaceLabelsForNode(nodeId, host.labels, labels); > 644 host.labels.clear(); > 645 host.labels.addAll(labels); > 646 for (Node node : host.nms.values()) { > 647 replaceNodeForLabels(node.nodeId, node.labels, labels); > 649 node.labels = null; > 650 } > 651 break;{code} > The cause is on line 647: when labels are added to a node without a port, both the 0 port and the real NM port are added to the node info, and when the labels are removed, the node.labels parameter on line 647 is null, so the old label is not removed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10501) Can't remove all node labels after add node label without nodemanager port
[ https://issues.apache.org/jira/browse/YARN-10501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] caozhiqiang updated YARN-10501: --- Attachment: (was: YARN-10502-branch-2.10.002.patch) > Can't remove all node labels after add node label without nodemanager port > -- > > Key: YARN-10501 > URL: https://issues.apache.org/jira/browse/YARN-10501 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.4.0 >Reporter: caozhiqiang >Assignee: caozhiqiang >Priority: Critical > Fix For: 3.4.0, 3.3.1, 3.1.5, 3.2.3 > > Attachments: YARN-10501-branch-2.10.001.patch, > YARN-10501-branch-2.10.1.002.patch, YARN-10501.002.patch, > YARN-10501.003.patch, YARN-10501.004.patch > > > When adding a label to nodes without a nodemanager port, or using the WILDCARD_PORT (0) port, not all label info on these nodes can be removed. > Reproduce process: > {code:java} > 1.yarn rmadmin -addToClusterNodeLabels "cpunode(exclusive=true)" > 2.yarn rmadmin -replaceLabelsOnNode "server001=cpunode" > 3.curl http://RM_IP:8088/ws/v1/cluster/label-mappings > {"labelsToNodes":{"entry":{"key":{"name":"cpunode","exclusivity":"true"},"value":{"nodes":["server001:0","server001:45454"],"partitionInfo":{"resourceAvailable":{"memory":"510","vCores":"1","resourceInformations":{"resourceInformation":[{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"memory-mb","resourceType":"COUNTABLE","units":"Mi","value":"510"},{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"vcores","resourceType":"COUNTABLE","units":"","value":"1"}]}}} > 4.yarn rmadmin -replaceLabelsOnNode "server001" > 5.curl http://RM_IP:8088/ws/v1/cluster/label-mappings > {"labelsToNodes":{"entry":{"key":{"name":"cpunode","exclusivity":"true"},"value":{"nodes":"server001:45454","partitionInfo":{"resourceAvailable":{"memory":"0","vCores":"0","resourceInformations":{"resourceInformation":[{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"memory-mb","resourceType":"COUNTABLE","units":"Mi","value":"0"},{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"vcores","resourceType":"COUNTABLE","units":"","value":"0"}]}}} > {code} > You can see that after step 4, which removes the nodemanager labels, the label info is still present in the node info. > {code:java} > 641 case REPLACE: > 642 replaceNodeForLabels(nodeId, host.labels, labels); > 643 replaceLabelsForNode(nodeId, host.labels, labels); > 644 host.labels.clear(); > 645 host.labels.addAll(labels); > 646 for (Node node : host.nms.values()) { > 647 replaceNodeForLabels(node.nodeId, node.labels, labels); > 649 node.labels = null; > 650 } > 651 break;{code} > The cause is on line 647: when labels are added to a node without a port, both the 0 port and the real NM port are added to the node info, and when the labels are removed, the node.labels parameter on line 647 is null, so the old label is not removed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10720) YARN WebAppProxyServlet should support connection timeout to prevent too many abnormal connections.
[ https://issues.apache.org/jira/browse/YARN-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10720: -- Attachment: image-2021-03-29-14-05-32-708.png > YARN WebAppProxyServlet should support connection timeout to prevent too many > abnormal connections. > --- > > Key: YARN-10720 > URL: https://issues.apache.org/jira/browse/YARN-10720 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Critical > Attachments: image-2021-03-29-14-04-33-776.png, > image-2021-03-29-14-05-32-708.png > > > The following shows the proxy server with {color:#de350b}too many connections from one client{color}; this caused the proxy server to hang, and the YARN web UI couldn't redirect to the web proxy. > !image-2021-03-29-14-04-33-776.png|width=632,height=57! > The following shows the abnormal AM; the proxy server doesn't know it is already abnormal, so the connections can't be closed. We should add timeout support to the proxy server to prevent this. A single abnormal AM may cause hundreds or even thousands of connections, which is very heavy. > !image-2021-03-29-13-44-05-579.png|width=657,height=97! > > After I killed the abnormal AM, the proxy server became healthy. This case has happened many times in our production clusters; our clusters are huge, and abnormal AMs appear regularly. > > I will add timeout support to the web proxy server in this jira. > > cc [~pbacsko] [~ebadger] [~Jim_Brennan] [~ztang] [~epayne] [~gandras] > [~bteke] > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10720) YARN WebAppProxyServlet should support connection timeout to prevent too many abnormal connections.
[ https://issues.apache.org/jira/browse/YARN-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10720: -- Attachment: (was: image-2021-03-29-13-42-47-672.png) > YARN WebAppProxyServlet should support connection timeout to prevent too many > abnormal connections. > --- > > Key: YARN-10720 > URL: https://issues.apache.org/jira/browse/YARN-10720 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Critical > Attachments: image-2021-03-29-14-04-33-776.png, > image-2021-03-29-14-05-32-708.png > > > The following shows the proxy server with {color:#de350b}too many connections from one client{color}; this caused the proxy server to hang, and the YARN web UI couldn't redirect to the web proxy. > !image-2021-03-29-14-04-33-776.png|width=632,height=57! > The following shows the abnormal AM; the proxy server doesn't know it is already abnormal, so the connections can't be closed. We should add timeout support to the proxy server to prevent this. A single abnormal AM may cause hundreds or even thousands of connections, which is very heavy. > !image-2021-03-29-14-05-32-708.png|width=669,height=101! > > After I killed the abnormal AM, the proxy server became healthy. This case has happened many times in our production clusters; our clusters are huge, and abnormal AMs appear regularly. > > I will add timeout support to the web proxy server in this jira. > > cc [~pbacsko] [~ebadger] [~Jim_Brennan] [~ztang] [~epayne] [~gandras] > [~bteke] > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10720) YARN WebAppProxyServlet should support connection timeout to prevent too many abnormal connections.
[ https://issues.apache.org/jira/browse/YARN-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10720: -- Attachment: (was: image-2021-03-29-13-44-05-579.png) > YARN WebAppProxyServlet should support connection timeout to prevent too many > abnormal connections. > --- > > Key: YARN-10720 > URL: https://issues.apache.org/jira/browse/YARN-10720 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Critical > Attachments: image-2021-03-29-14-04-33-776.png, > image-2021-03-29-14-05-32-708.png > > > The following shows the proxy server with {color:#de350b}too many connections from one client{color}; this caused the proxy server to hang, and the YARN web UI couldn't redirect to the web proxy. > !image-2021-03-29-14-04-33-776.png|width=632,height=57! > The following shows the abnormal AM; the proxy server doesn't know it is already abnormal, so the connections can't be closed. We should add timeout support to the proxy server to prevent this. A single abnormal AM may cause hundreds or even thousands of connections, which is very heavy. > !image-2021-03-29-14-05-32-708.png|width=669,height=101! > > After I killed the abnormal AM, the proxy server became healthy. This case has happened many times in our production clusters; our clusters are huge, and abnormal AMs appear regularly. > > I will add timeout support to the web proxy server in this jira. > > cc [~pbacsko] [~ebadger] [~Jim_Brennan] [~ztang] [~epayne] [~gandras] > [~bteke] > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10720) YARN WebAppProxyServlet should support connection timeout to prevent too many abnormal connections.
[ https://issues.apache.org/jira/browse/YARN-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10720: -- Description: The following shows the proxy server with {color:#de350b}too many connections from one client{color}; this caused the proxy server to hang, and the YARN web UI couldn't redirect to the web proxy. !image-2021-03-29-14-04-33-776.png|width=632,height=57! The following shows the abnormal AM; the proxy server doesn't know it is already abnormal, so the connections can't be closed. We should add timeout support to the proxy server to prevent this. A single abnormal AM may cause hundreds or even thousands of connections, which is very heavy. !image-2021-03-29-14-05-32-708.png|width=669,height=101! After I killed the abnormal AM, the proxy server became healthy. This case has happened many times in our production clusters; our clusters are huge, and abnormal AMs appear regularly. I will add timeout support to the web proxy server in this jira. cc [~pbacsko] [~ebadger] [~Jim_Brennan] [~ztang] [~epayne] [~gandras] [~bteke] was: The following shows the proxy server with {color:#de350b}too many connections from one client{color}; this caused the proxy server to hang, and the YARN web UI couldn't redirect to the web proxy. !image-2021-03-29-14-04-33-776.png|width=632,height=57! The following shows the abnormal AM; the proxy server doesn't know it is already abnormal, so the connections can't be closed. We should add timeout support to the proxy server to prevent this. A single abnormal AM may cause hundreds or even thousands of connections, which is very heavy. !image-2021-03-29-13-44-05-579.png|width=657,height=97! After I killed the abnormal AM, the proxy server became healthy. This case has happened many times in our production clusters; our clusters are huge, and abnormal AMs appear regularly. I will add timeout support to the web proxy server in this jira. cc [~pbacsko] [~ebadger] [~Jim_Brennan] [~ztang] [~epayne] [~gandras] [~bteke] > YARN WebAppProxyServlet should support connection timeout to prevent too many > abnormal connections. > --- > > Key: YARN-10720 > URL: https://issues.apache.org/jira/browse/YARN-10720 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Critical > Attachments: image-2021-03-29-14-04-33-776.png, > image-2021-03-29-14-05-32-708.png > > > The following shows the proxy server with {color:#de350b}too many connections from one client{color}; this caused the proxy server to hang, and the YARN web UI couldn't redirect to the web proxy. > !image-2021-03-29-14-04-33-776.png|width=632,height=57! > The following shows the abnormal AM; the proxy server doesn't know it is already abnormal, so the connections can't be closed. We should add timeout support to the proxy server to prevent this. A single abnormal AM may cause hundreds or even thousands of connections, which is very heavy. > !image-2021-03-29-14-05-32-708.png|width=669,height=101! > > After I killed the abnormal AM, the proxy server became healthy. This case has happened many times in our production clusters; our clusters are huge, and abnormal AMs appear regularly. > > I will add timeout support to the web proxy server in this jira. > > cc [~pbacsko] [~ebadger] [~Jim_Brennan] [~ztang] [~epayne] [~gandras] > [~bteke] > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10720) YARN WebAppProxyServlet should support connection timeout to prevent too many abnormal connections.
[ https://issues.apache.org/jira/browse/YARN-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10720: -- Attachment: image-2021-03-29-14-04-33-776.png > YARN WebAppProxyServlet should support connection timeout to prevent too many > abnormal connections. > --- > > Key: YARN-10720 > URL: https://issues.apache.org/jira/browse/YARN-10720 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Critical > Attachments: image-2021-03-29-13-42-47-672.png, > image-2021-03-29-13-44-05-579.png, image-2021-03-29-14-04-33-776.png > > > The following shows the proxy server with {color:#de350b}too many connections from one client{color}; this caused the proxy server to hang, and the YARN web UI couldn't redirect to the web proxy. > !image-2021-03-29-13-42-47-672.png|width=718,height=62! > > The following shows the abnormal AM; the proxy server doesn't know it is already abnormal, so the connections can't be closed. We should add timeout support to the proxy server to prevent this. A single abnormal AM may cause hundreds or even thousands of connections, which is very heavy. > !image-2021-03-29-13-44-05-579.png|width=657,height=97! > > After I killed the abnormal AM, the proxy server became healthy. This case has happened many times in our production clusters; our clusters are huge, and abnormal AMs appear regularly. > > I will add timeout support to the web proxy server in this jira. > > cc [~pbacsko] [~ebadger] [~Jim_Brennan] [~ztang] [~epayne] [~gandras] > [~bteke] > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10720) YARN WebAppProxyServlet should support connection timeout to prevent too many abnormal connections.
[ https://issues.apache.org/jira/browse/YARN-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10720: -- Description: The following shows the proxy server with {color:#de350b}too many connections from one client{color}; this caused the proxy server to hang, and the YARN web UI couldn't redirect to the web proxy. !image-2021-03-29-14-04-33-776.png|width=632,height=57! The following shows the abnormal AM; the proxy server doesn't know it is already abnormal, so the connections can't be closed. We should add timeout support to the proxy server to prevent this. A single abnormal AM may cause hundreds or even thousands of connections, which is very heavy. !image-2021-03-29-13-44-05-579.png|width=657,height=97! After I killed the abnormal AM, the proxy server became healthy. This case has happened many times in our production clusters; our clusters are huge, and abnormal AMs appear regularly. I will add timeout support to the web proxy server in this jira. cc [~pbacsko] [~ebadger] [~Jim_Brennan] [~ztang] [~epayne] [~gandras] [~bteke] was: The following shows the proxy server with {color:#de350b}too many connections from one client{color}; this caused the proxy server to hang, and the YARN web UI couldn't redirect to the web proxy. !image-2021-03-29-13-42-47-672.png|width=718,height=62! The following shows the abnormal AM; the proxy server doesn't know it is already abnormal, so the connections can't be closed. We should add timeout support to the proxy server to prevent this. A single abnormal AM may cause hundreds or even thousands of connections, which is very heavy. !image-2021-03-29-13-44-05-579.png|width=657,height=97! After I killed the abnormal AM, the proxy server became healthy. This case has happened many times in our production clusters; our clusters are huge, and abnormal AMs appear regularly. I will add timeout support to the web proxy server in this jira. cc [~pbacsko] [~ebadger] [~Jim_Brennan] [~ztang] [~epayne] [~gandras] [~bteke] > YARN WebAppProxyServlet should support connection timeout to prevent too many > abnormal connections. > --- > > Key: YARN-10720 > URL: https://issues.apache.org/jira/browse/YARN-10720 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Critical > Attachments: image-2021-03-29-13-42-47-672.png, > image-2021-03-29-13-44-05-579.png, image-2021-03-29-14-04-33-776.png > > > The following shows the proxy server with {color:#de350b}too many connections from one client{color}; this caused the proxy server to hang, and the YARN web UI couldn't redirect to the web proxy. > !image-2021-03-29-14-04-33-776.png|width=632,height=57! > The following shows the abnormal AM; the proxy server doesn't know it is already abnormal, so the connections can't be closed. We should add timeout support to the proxy server to prevent this. A single abnormal AM may cause hundreds or even thousands of connections, which is very heavy. > !image-2021-03-29-13-44-05-579.png|width=657,height=97! > > After I killed the abnormal AM, the proxy server became healthy. This case has happened many times in our production clusters; our clusters are huge, and abnormal AMs appear regularly. > > I will add timeout support to the web proxy server in this jira. > > cc [~pbacsko] [~ebadger] [~Jim_Brennan] [~ztang] [~epayne] [~gandras] > [~bteke] > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10720) YARN WebAppProxyServlet should support connection timeout to prevent too many abnormal connections.
[ https://issues.apache.org/jira/browse/YARN-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10720: -- Description: The following shows the proxy server with {color:#de350b}too many connections from one client{color}; this caused the proxy server to hang, and the YARN web UI couldn't redirect to the web proxy. !image-2021-03-29-13-42-47-672.png|width=718,height=62! The following shows the abnormal AM; the proxy server doesn't know it is already abnormal, so the connections can't be closed. We should add timeout support to the proxy server to prevent this. A single abnormal AM may cause hundreds or even thousands of connections, which is very heavy. !image-2021-03-29-13-44-05-579.png|width=657,height=97! After I killed the abnormal AM, the proxy server became healthy. This case has happened many times in our production clusters; our clusters are huge, and abnormal AMs appear regularly. I will add timeout support to the web proxy server in this jira. cc [~pbacsko] [~ebadger] [~Jim_Brennan] [~ztang] [~epayne] [~gandras] [~bteke] was: The following shows the proxy server with {color:#de350b}too many connections from one client{color}; this caused the proxy server to hang, and the YARN web UI couldn't redirect to the web proxy. !image-2021-03-29-13-42-47-672.png|width=718,height=62! The following shows the abnormal AM; the proxy server doesn't know it is already abnormal, so the connections can't be closed. We should add timeout support to the proxy server to prevent this. !image-2021-03-29-13-44-05-579.png|width=657,height=97! After I killed the abnormal AM, the proxy server became healthy. This case has happened many times in our production clusters; our clusters are huge, and abnormal AMs appear regularly. I will add timeout support to the web proxy server in this jira. cc [~pbacsko] [~ebadger] [~Jim_Brennan] [~ztang] [~epayne] [~gandras] [~bteke] > YARN WebAppProxyServlet should support connection timeout to prevent too many > abnormal connections. > --- > > Key: YARN-10720 > URL: https://issues.apache.org/jira/browse/YARN-10720 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Critical > Attachments: image-2021-03-29-13-42-47-672.png, > image-2021-03-29-13-44-05-579.png > > > The following shows the proxy server with {color:#de350b}too many connections from one client{color}; this caused the proxy server to hang, and the YARN web UI couldn't redirect to the web proxy. > !image-2021-03-29-13-42-47-672.png|width=718,height=62! > > The following shows the abnormal AM; the proxy server doesn't know it is already abnormal, so the connections can't be closed. We should add timeout support to the proxy server to prevent this. A single abnormal AM may cause hundreds or even thousands of connections, which is very heavy. > !image-2021-03-29-13-44-05-579.png|width=657,height=97! > > After I killed the abnormal AM, the proxy server became healthy. This case has happened many times in our production clusters; our clusters are huge, and abnormal AMs appear regularly. > > I will add timeout support to the web proxy server in this jira. > > cc [~pbacsko] [~ebadger] [~Jim_Brennan] [~ztang] [~epayne] [~gandras] > [~bteke] > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org