[jira] [Commented] (YARN-7601) Incorrect container states recovered as LevelDB uses alphabetical order
[ https://issues.apache.org/jira/browse/YARN-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671162#comment-16671162 ] Hadoop QA commented on YARN-7601: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 24s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 39s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 32s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 58s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 58s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 32s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 4m 21s{color} | {color:red} patch has errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 24s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 58s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 48m 54s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-7601 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12903697/YARN-7601.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 3eaa12055ebd 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / b13c567 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | mvninstall | https://builds.apache.org/job/PreCommit-YARN-Build/22395/artifact/out/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | compile | https://builds.apache.org/job/PreCommit-YARN-Build/22395/artifact/out/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | javac | https://builds.apache.org/job/PreCommit-YARN-Build/22395/artifact/out/pa
[jira] [Commented] (YARN-7901) Adding profile capability in resourceReq in LocalityMulticastAMRMProxyPolicy
[ https://issues.apache.org/jira/browse/YARN-7901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671144#comment-16671144 ] Hadoop QA commented on YARN-7901: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s{color} | {color:red} YARN-7901 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-7901 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12909626/YARN-7901_trunk.001.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/22396/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Adding profile capability in resourceReq in LocalityMulticastAMRMProxyPolicy > > > Key: YARN-7901 > URL: https://issues.apache.org/jira/browse/YARN-7901 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: lovekesh bansal >Assignee: lovekesh bansal >Priority: Minor > Attachments: YARN-7901_trunk.001.patch > > > in the splitIndividualAny method while creating the resourceRequest we are > not setting the profile capability. > ResourceRequest.newInstance(originalResourceRequest.getPriority(), > originalResourceRequest.getResourceName(), > originalResourceRequest.getCapability(), > originalResourceRequest.getNumContainers(), > originalResourceRequest.getRelaxLocality(), > originalResourceRequest.getNodeLabelExpression(), > originalResourceRequest.getExecutionTypeRequest()); -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
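A minimal sketch of the change the description above points at: when splitIndividualAny clones the original "ANY" request, the profile capability would be carried over as well. The setProfileCapability/getProfileCapability accessors used here are assumptions for illustration; the exact ResourceRequest API on the targeted branch may differ.

{code:java}
// Hedged sketch (accessor names assumed): copy the profile capability along
// with the other fields when splitting the original resource request.
ResourceRequest newRequest = ResourceRequest.newInstance(
    originalResourceRequest.getPriority(),
    originalResourceRequest.getResourceName(),
    originalResourceRequest.getCapability(),
    originalResourceRequest.getNumContainers(),
    originalResourceRequest.getRelaxLocality(),
    originalResourceRequest.getNodeLabelExpression(),
    originalResourceRequest.getExecutionTypeRequest());
// The piece the description says is missing: propagate the profile capability
// from the original request instead of dropping it (hypothetical accessors).
newRequest.setProfileCapability(originalResourceRequest.getProfileCapability());
{code}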
[jira] [Updated] (YARN-4858) start-yarn and stop-yarn scripts to support timeline and sharedcachemanager
[ https://issues.apache.org/jira/browse/YARN-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated YARN-4858: Target Version/s: 2.10.0 (was: 2.9.2) > start-yarn and stop-yarn scripts to support timeline and sharedcachemanager > --- > > Key: YARN-4858 > URL: https://issues.apache.org/jira/browse/YARN-4858 > Project: Hadoop YARN > Issue Type: Improvement > Components: scripts >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > Labels: oct16-easy > Attachments: YARN-4858-001.patch, YARN-4858-branch-2.001.patch > > > The start-yarn and stop-yarn scripts don't have any (even commented out) > support for the timeline and sharedcachemanager > Proposed: > * bash and cmd start-yarn scripts have commented out start actions > * stop-yarn scripts stop the servers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7601) Incorrect container states recovered as LevelDB uses alphabetical order
[ https://issues.apache.org/jira/browse/YARN-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated YARN-7601: Target Version/s: 2.9.3 (was: 2.9.2) > Incorrect container states recovered as LevelDB uses alphabetical order > --- > > Key: YARN-7601 > URL: https://issues.apache.org/jira/browse/YARN-7601 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Sampada Dehankar >Assignee: Sampada Dehankar >Priority: Major > Attachments: YARN-7601.001.patch, YARN-7601.002.patch > > > LevelDB stores key-value pairs in the alphabetical order. Container id > concatenated by its state is used as key. So, even if container goes through > any states in its life cycle, the order of states for following values > retrieved from LevelDB is always going to be as below`: > LAUNCHED > PAUSED > QUEUED > For eg: If a container is LAUNCHED then PAUSED and LAUNCHED again, the > recovered container state is PAUSED currently instead of LAUNCHED. > We propose to store the timestamp as the value while making call to > > storeContainerLaunched > storeContainerPaused > storeContainerQueued > > so that correct container state is recovered based on timestamps. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
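A rough sketch of the recovery-side idea proposed above, assuming each stored state key (LAUNCHED, PAUSED, QUEUED) carries a timestamp as its value: on recovery, the state with the newest timestamp wins instead of the alphabetically last key. The class and method names below are illustrative placeholders, not the actual NM state-store API.

{code:java}
import java.util.Map;

final class ContainerStateRecovery {
  // Hedged sketch: choose the container state written last, based on the
  // timestamp stored as the LevelDB value, rather than relying on key order.
  static String recoverLatestState(Map<String, Long> stateToTimestamp) {
    String latestState = null;
    long latestTimestamp = Long.MIN_VALUE;
    for (Map.Entry<String, Long> entry : stateToTimestamp.entrySet()) {
      if (entry.getValue() > latestTimestamp) {
        latestTimestamp = entry.getValue();
        latestState = entry.getKey();
      }
    }
    // e.g. LAUNCHED is recovered if it was written after PAUSED
    return latestState;
  }
}
{code}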
[jira] [Updated] (YARN-6918) Remove acls after queue delete to avoid memory leak
[ https://issues.apache.org/jira/browse/YARN-6918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated YARN-6918: Target Version/s: 3.0.4, 2.9.3 (was: 3.0.2, 2.9.2) > Remove acls after queue delete to avoid memory leak > --- > > Key: YARN-6918 > URL: https://issues.apache.org/jira/browse/YARN-6918 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Major > Attachments: YARN-6918.001.patch, YARN-6918.002.patch > > > Acl for deleted queue need to removed from allAcls to avoid leak > (Priority,YarnAuthorizer) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7450) ATS Client should retry on intermittent Kerberos issues.
[ https://issues.apache.org/jira/browse/YARN-7450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated YARN-7450: Target Version/s: 2.9.3 (was: 2.9.2) > ATS Client should retry on intermittent Kerberos issues. > > > Key: YARN-7450 > URL: https://issues.apache.org/jira/browse/YARN-7450 > Project: Hadoop YARN > Issue Type: Improvement > Components: ATSv2 >Affects Versions: 2.7.3 > Environment: Hadoop-2.7.3 >Reporter: Ravi Prakash >Priority: Major > > We saw a stack trace (posted in the first comment) in the ResourceManager > logs for the TimelineClientImpl not being able to relogin from keytab. > I'm guessing there was an intermittent issue that failed the kerberos relogin > from keytab. However, I'm assuming this was *not* retried because I only saw > one instance of this stack trace. I propose that this operation should have > been retried. > It seems, this caused events at the ResourceManager to queue up and > eventually stop responding to even basic {{yarn application -list}} commands. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7560) Resourcemanager hangs when resourceUsedWithWeightToResourceRatio return a overflow value
[ https://issues.apache.org/jira/browse/YARN-7560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated YARN-7560: Fix Version/s: (was: 3.0.3) > Resourcemanager hangs when resourceUsedWithWeightToResourceRatio return a > overflow value > -- > > Key: YARN-7560 > URL: https://issues.apache.org/jira/browse/YARN-7560 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, resourcemanager >Affects Versions: 3.0.0 >Reporter: zhengchenyu >Assignee: zhengchenyu >Priority: Major > Attachments: YARN-7560.000.patch, YARN-7560.001.patch > > > In our cluster, we changed the configuration, then refreshQueues, we found > the resourcemanager hangs. And the Resourcemanager can't restart > successfully. We got jstack information, always show like this: > {code} > "main" #1 prio=5 os_prio=0 tid=0x7f98e8017000 nid=0x2f5 runnable > [0x7f98eed9a000] >java.lang.Thread.State: RUNNABLE > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares.resourceUsedWithWeightToResourceRatio(ComputeFairShares.java:182) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares.computeSharesInternal(ComputeFairShares.java:140) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares.computeSteadyShares(ComputeFairShares.java:66) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.FairSharePolicy.computeSteadyShares(FairSharePolicy.java:148) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.recomputeSteadyShares(FSParentQueue.java:102) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getQueue(QueueManager.java:148) > - locked <0x7f8c4a8177a0> (a java.util.HashMap) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getLeafQueue(QueueManager.java:101) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.updateAllocationConfiguration(QueueManager.java:387) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$AllocationReloadListener.onReload(FairScheduler.java:1728) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:422) > - locked <0x7f8c4a7eb2e0> (a > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1597) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1621) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > - locked <0x7f8c4a76ac48> (a java.lang.Object) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:569) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > - locked <0x7f8c49254268> (a java.lang.Object) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:997) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:257) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > - locked <0x7f8c467495e0> (a java.lang.Object) > at > 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1220) > {code} > When we debugged the cluster, we found that resourceUsedWithWeightToResourceRatio > returned a negative value, so the loop could not return. In our cluster, > the sum of all minRes exceeds Integer.MAX_VALUE, which is why > resourceUsedWithWeightToResourceRatio returns a negative value. > Below is the loop. totalResource is a long, so it is always positive, but > resourceUsedWithWeightToResourceRatio returns an int. Our cluster is so big > that resourceUsedWithWeightToResourceRatio overflows and returns a negative > value, so the loop never breaks. > {code} > while (resourceUsedWithWeightToResourceRatio(rMax, schedulables, type) > < totalResource) { > rMax *= 2.0; > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubs
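To make the overflow concrete, here is a small self-contained sketch of the remedy the description implies: accumulate and return the used resource as a long so the comparison against the (long) totalResource can never see a wrapped-negative int. The per-schedulable share computation is a simplified stand-in for the real ComputeFairShares logic, not the actual patch.

{code:java}
// Hedged sketch: a long return value keeps the doubling loop terminating even
// when the summed resources exceed Integer.MAX_VALUE.
static long resourceUsedWithWeightToResourceRatio(double w2rRatio,
    double[] weights) {
  long resourcesTaken = 0;                        // long accumulator, no int wrap-around
  for (double weight : weights) {
    resourcesTaken += (long) (weight * w2rRatio); // simplified share per schedulable
  }
  return resourcesTaken;
}

static double findWeightToResourceRatio(double[] weights, long totalResource) {
  double rMax = 1.0;
  while (resourceUsedWithWeightToResourceRatio(rMax, weights) < totalResource) {
    rMax *= 2.0;  // terminates: the left-hand side stays positive and grows with rMax
  }
  return rMax;
}
{code}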
[jira] [Updated] (YARN-7560) Resourcemanager hangs when resourceUsedWithWeightToResourceRatio return a overflow value
[ https://issues.apache.org/jira/browse/YARN-7560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated YARN-7560: Target Version/s: 3.0.4 (was: 3.0.0) > Resourcemanager hangs when resourceUsedWithWeightToResourceRatio return a > overflow value > -- > > Key: YARN-7560 > URL: https://issues.apache.org/jira/browse/YARN-7560 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, resourcemanager >Affects Versions: 3.0.0 >Reporter: zhengchenyu >Assignee: zhengchenyu >Priority: Major > Attachments: YARN-7560.000.patch, YARN-7560.001.patch > > > In our cluster, we changed the configuration, then refreshQueues, we found > the resourcemanager hangs. And the Resourcemanager can't restart > successfully. We got jstack information, always show like this: > {code} > "main" #1 prio=5 os_prio=0 tid=0x7f98e8017000 nid=0x2f5 runnable > [0x7f98eed9a000] >java.lang.Thread.State: RUNNABLE > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares.resourceUsedWithWeightToResourceRatio(ComputeFairShares.java:182) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares.computeSharesInternal(ComputeFairShares.java:140) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares.computeSteadyShares(ComputeFairShares.java:66) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.FairSharePolicy.computeSteadyShares(FairSharePolicy.java:148) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.recomputeSteadyShares(FSParentQueue.java:102) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getQueue(QueueManager.java:148) > - locked <0x7f8c4a8177a0> (a java.util.HashMap) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getLeafQueue(QueueManager.java:101) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.updateAllocationConfiguration(QueueManager.java:387) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$AllocationReloadListener.onReload(FairScheduler.java:1728) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:422) > - locked <0x7f8c4a7eb2e0> (a > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1597) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1621) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > - locked <0x7f8c4a76ac48> (a java.lang.Object) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:569) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > - locked <0x7f8c49254268> (a java.lang.Object) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:997) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:257) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > - locked <0x7f8c467495e0> (a java.lang.Object) > at > 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1220) > {code} > When we debugged the cluster, we found that resourceUsedWithWeightToResourceRatio > returned a negative value, so the loop could not return. In our cluster, > the sum of all minRes exceeds Integer.MAX_VALUE, which is why > resourceUsedWithWeightToResourceRatio returns a negative value. > Below is the loop. totalResource is a long, so it is always positive, but > resourceUsedWithWeightToResourceRatio returns an int. Our cluster is so big > that resourceUsedWithWeightToResourceRatio overflows and returns a negative > value, so the loop never breaks. > {code} > while (resourceUsedWithWeightToResourceRatio(rMax, schedulables, type) > < totalResource) { > rMax *= 2.0; > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To
[jira] [Updated] (YARN-7901) Adding profile capability in resourceReq in LocalityMulticastAMRMProxyPolicy
[ https://issues.apache.org/jira/browse/YARN-7901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated YARN-7901: Fix Version/s: (was: 3.0.3) > Adding profile capability in resourceReq in LocalityMulticastAMRMProxyPolicy > > > Key: YARN-7901 > URL: https://issues.apache.org/jira/browse/YARN-7901 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: lovekesh bansal >Assignee: lovekesh bansal >Priority: Minor > Attachments: YARN-7901_trunk.001.patch > > > in the splitIndividualAny method while creating the resourceRequest we are > not setting the profile capability. > ResourceRequest.newInstance(originalResourceRequest.getPriority(), > originalResourceRequest.getResourceName(), > originalResourceRequest.getCapability(), > originalResourceRequest.getNumContainers(), > originalResourceRequest.getRelaxLocality(), > originalResourceRequest.getNodeLabelExpression(), > originalResourceRequest.getExecutionTypeRequest()); -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7649) RMContainer state transition exception after container update
[ https://issues.apache.org/jira/browse/YARN-7649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated YARN-7649: Target Version/s: 3.0.4, 2.9.3 (was: 3.0.2, 2.9.2) > RMContainer state transition exception after container update > - > > Key: YARN-7649 > URL: https://issues.apache.org/jira/browse/YARN-7649 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.9.0 >Reporter: Weiwei Yang >Assignee: Arun Suresh >Priority: Major > > I've been seen this in a cluster deployment as well as in UT, run > {{TestAMRMClient#testAMRMClientWithContainerPromotion}} could reproduce this, > it doesn't fail the test case but following error message is shown up in the > log > {noformat} > 2017-12-13 19:41:31,817 ERROR rmcontainer.RMContainerImpl > (RMContainerImpl.java:handle(480)) - Can't handle this event at current state > org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: > RELEASED at ALLOCATED > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487) > at > org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:478) > at > org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:65) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.completedContainer(AbstractYarnScheduler.java:675) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1586) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:155) > at > org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66) > at java.lang.Thread.run(Thread.java:748) > 2017-12-13 19:41:31,817 ERROR rmcontainer.RMContainerImpl > (RMContainerImpl.java:handle(481)) - Invalid event RELEASED on container > container_1513165290804_0001_01_03 > {noformat} > this seems to be related to YARN-6251. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8965) Revisit delay scheduling for cloud environment
Weiwei Yang created YARN-8965: - Summary: Revisit delay scheduling for cloud environment Key: YARN-8965 URL: https://issues.apache.org/jira/browse/YARN-8965 Project: Hadoop YARN Issue Type: Improvement Reporter: Weiwei Yang Assignee: Weiwei Yang Delay scheduling was introduced to honor task locality on a best-effort basis, using node- and rack-level delays. However, in a cloud environment the storage is usually remote to the workload cluster, which makes the locality constraint totally irrelevant. Let's revisit this and create a model for cloud environments that does not require much setup. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8948) PlacementRule interface should be for all YarnSchedulers
[ https://issues.apache.org/jira/browse/YARN-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-8948: --- Priority: Critical (was: Major) > PlacementRule interface should be for all YarnSchedulers > > > Key: YARN-8948 > URL: https://issues.apache.org/jira/browse/YARN-8948 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Priority: Critical > Attachments: YARN-8948.001.patch, YARN-8948.002.patch > > > *Issue 1:* > YARN-3635 intention was to add PlacementRule interface common for all > YarnSchedules. > {code} > 33 public abstract boolean initialize( > 34 CapacitySchedulerContext schedulerContext) throws IOException; > {code} > PlacementRule initialization is done using CapacitySchedulerContext binding > to CapacityScheduler > *Issue 2:* > {{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity > Scheduler > {quote} > * **Queue Mapping Interface based on Default or User Defined Placement > Rules** - This feature allows users to map a job to a specific queue based on > some default placement rule. For instance based on user & group, or > application name. User can also define their own placement rule. > {quote} > As per current UserGroupMapping is always added in placementRule. > {{CapacityScheduler#updatePlacementRules}} > {code} > // Initialize placement rules > Collection placementRuleStrs = conf.getStringCollection( > YarnConfiguration.QUEUE_PLACEMENT_RULES); > List placementRules = new ArrayList<>(); > ... > // add UserGroupMappingPlacementRule if absent > distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE); > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8948) PlacementRule interface should be for all YarnSchedulers
[ https://issues.apache.org/jira/browse/YARN-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-8948: --- Description: *Issue 1:* YARN-3635 intention was to add PlacementRule interface common for all YarnSchedules. {code} 33public abstract boolean initialize( 34CapacitySchedulerContext schedulerContext) throws IOException; {code} PlacementRule initialization is done using CapacitySchedulerContext binding to CapacityScheduler *Issue 2:* {{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity Scheduler {quote} * **Queue Mapping Interface based on Default or User Defined Placement Rules** - This feature allows users to map a job to a specific queue based on some default placement rule. For instance based on user & group, or application name. User can also define their own placement rule. {quote} As per current UserGroupMapping is always added in placementRule. {{CapacityScheduler#updatePlacementRules}} {code} // Initialize placement rules Collection placementRuleStrs = conf.getStringCollection( YarnConfiguration.QUEUE_PLACEMENT_RULES); List placementRules = new ArrayList<>(); ... // add UserGroupMappingPlacementRule if absent distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE); {code} was: YARN-3635 intention was to add PlacementRule interface common for all YarnSchedules. {code} 33public abstract boolean initialize( 34CapacitySchedulerContext schedulerContext) throws IOException; {code} PlacementRule initialization is done using CapacitySchedulerContext binding to CapacityScheduler > PlacementRule interface should be for all YarnSchedulers > > > Key: YARN-8948 > URL: https://issues.apache.org/jira/browse/YARN-8948 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Priority: Major > Attachments: YARN-8948.001.patch, YARN-8948.002.patch > > > *Issue 1:* > YARN-3635 intention was to add PlacementRule interface common for all > YarnSchedules. > {code} > 33 public abstract boolean initialize( > 34 CapacitySchedulerContext schedulerContext) throws IOException; > {code} > PlacementRule initialization is done using CapacitySchedulerContext binding > to CapacityScheduler > *Issue 2:* > {{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity > Scheduler > {quote} > * **Queue Mapping Interface based on Default or User Defined Placement > Rules** - This feature allows users to map a job to a specific queue based on > some default placement rule. For instance based on user & group, or > application name. User can also define their own placement rule. > {quote} > As per current UserGroupMapping is always added in placementRule. > {{CapacityScheduler#updatePlacementRules}} > {code} > // Initialize placement rules > Collection placementRuleStrs = conf.getStringCollection( > YarnConfiguration.QUEUE_PLACEMENT_RULES); > List placementRules = new ArrayList<>(); > ... > // add UserGroupMappingPlacementRule if absent > distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE); > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
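A small sketch of the direction Issue 1 suggests: let a placement rule initialize from a scheduler-neutral handle (for example the ResourceScheduler itself) instead of a CapacitySchedulerContext, so FairScheduler and other schedulers can reuse the same rules. The signature below is illustrative only, not the committed API.

{code:java}
import java.io.IOException;

import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.exceptions.YarnException;
import org.apache.hadoop.yarn.server.resourcemanager.placement.ApplicationPlacementContext;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ResourceScheduler;

// Hedged sketch: a scheduler-agnostic PlacementRule with no CapacityScheduler
// binding. The initialize signature here is an assumption for illustration.
public abstract class PlacementRule {
  public abstract boolean initialize(ResourceScheduler scheduler)
      throws IOException;

  public abstract ApplicationPlacementContext getPlacementForApp(
      ApplicationSubmissionContext asc, String user) throws YarnException;
}
{code}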
[jira] [Commented] (YARN-8394) Improve data locality documentation for Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-8394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671119#comment-16671119 ] Weiwei Yang commented on YARN-8394: --- Hi [~yufeigu], apologies, I missed your last comment. Are you suggesting that when "yarn.scheduler.capacity.node-locality-delay" is set to "-1" we should automatically disable "rack-locality-additional-delay" as well? I think that makes sense. We need a Jira to track this and revisit the locality code in the context of cloud environments; let me open one to track it. Thanks! > Improve data locality documentation for Capacity Scheduler > -- > > Key: YARN-8394 > URL: https://issues.apache.org/jira/browse/YARN-8394 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Major > Fix For: 3.2.0, 3.1.1, 3.0.4 > > Attachments: YARN-8394.001.patch, YARN-8394.002.patch > > > YARN-6344 introduces a new parameter > {{yarn.scheduler.capacity.rack-locality-additional-delay}} in > capacity-scheduler.xml; we need to add documentation to > {{CapacityScheduler.md}} accordingly. > Moreover, we are seeing more and more clusters separating storage from > computation, where the file system is always remote; in such cases we need to > explain how to relax data locality in CS, otherwise MR jobs suffer. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8404) Timeline event publish need to be async to avoid Dispatcher thread leak in case ATS is down
[ https://issues.apache.org/jira/browse/YARN-8404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated YARN-8404: Fix Version/s: 2.9.2 Thanks! Backported to branch-2.9. > Timeline event publish need to be async to avoid Dispatcher thread leak in > case ATS is down > --- > > Key: YARN-8404 > URL: https://issues.apache.org/jira/browse/YARN-8404 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.1.0, 3.0.2 >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S >Priority: Blocker > Fix For: 2.10.0, 3.2.0, 3.1.1, 2.9.2, 3.0.4 > > Attachments: YARN-8404.01.patch > > > It is observed that if ATS1/1.5 daemon is not running, RM recovery is delayed > as long as timeline client get timed out for each applications. By default, > timed out will take around 5 mins. If completed applications are more then > amount of time RM will wait is *(number of completed applications in a > cluster * 5 minutes)* which is kind of hanged. > Primary reason for this behavior is YARN-3044 YARN-4129 which refactor > existing system metric publisher. This refactoring made appFinished event as > synchronous which was asynchronous earlier. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7765) [Atsv2] GSSException: No valid credentials provided - Failed to find any Kerberos tgt thrown by Timelinev2Client & HBaseClient in NM
[ https://issues.apache.org/jira/browse/YARN-7765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated YARN-7765: Target Version/s: 3.0.1, 3.1.0, 2.10.0 (was: 3.1.0, 2.10.0, 3.0.1) Fix Version/s: 2.9.2 Backported to branch-2.9. Thanks [~rohithsharma]. > [Atsv2] GSSException: No valid credentials provided - Failed to find any > Kerberos tgt thrown by Timelinev2Client & HBaseClient in NM > > > Key: YARN-7765 > URL: https://issues.apache.org/jira/browse/YARN-7765 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0 >Reporter: Sumana Sathish >Assignee: Rohith Sharma K S >Priority: Blocker > Fix For: 3.1.0, 2.10.0, 3.0.1, 2.9.2 > > Attachments: YARN-7765.01.patch, YARN-7765.02.patch > > > Secure cluster is deployed and all YARN services are started successfully. > When application is submitted, app collectors which is started as aux-service > throwing below exception. But this exception is *NOT* observed from RM > TimelineCollector. > Cluster is deployed with Hadoop-3.0 and Hbase-1.2.6 secure cluster. All the > YARN and HBase service are started and working perfectly fine. After 24 hours > i.e when token lifetime is expired, HBaseClient in NM and HDFSClient in > HMaster and HRegionServer started getting this error. After sometime, HBase > daemons got shutdown. In NM, JVM didn't shutdown but none of the events got > published. > {noformat} > 2018-01-17 11:04:48,017 FATAL ipc.RpcClientImpl (RpcClientImpl.java:run(684)) > - SASL authentication failed. The most likely cause is missing or invalid > credentials. Consider 'kinit'. > javax.security.sasl.SaslException: GSS initiate failed [Caused by > GSSException: No valid credentials provided (Mechanism level: Failed to find > any Kerberos tgt)] > {noformat} > cc :/ [~vrushalic] [~varun_saxena] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7277) Container Launch expand environment needs to consider bracket matching
[ https://issues.apache.org/jira/browse/YARN-7277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhankun Tang updated YARN-7277: --- Attachment: YARN-7277-trunk.004.patch > Container Launch expand environment needs to consider bracket matching > -- > > Key: YARN-7277 > URL: https://issues.apache.org/jira/browse/YARN-7277 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: balloons >Assignee: Zhankun Tang >Priority: Critical > Attachments: YARN-7277-trunk.001.patch, YARN-7277-trunk.002.patch, > YARN-7277-trunk.003.patch, YARN-7277-trunk.004.patch > > > The SPARK application I submitted always failed and I finally found that the > commands I specified to launch AM Container were changed by NM. > *The following is part of the excerpt I submitted to RM to see the command:* > {code:java} > *'{\"handler\":\"FILLER\",\"inputTable\":\"engine_arch.adult_train\",\"outputTable\":[\"ether_features_filler_\$experimentId_\$taskId_out0\"],\"params\":{\"age\":{\"param\":[\"0\"]}}}'* > {code} > *The following is an excerpt from the corresponding command used when I > observe the NM launch container:* > {code:java} > *'{\"handler\":\"FILLER\",\"inputTable\":\"engine_arch.adult_train\",\"outputTable\":[\"ether_features_filler_\$experimentId_\$taskId_out0\"],\"params\":{\"age\":{\"param\":[\"0\"]}* > {code} > Finally, I found that NM made the following transformation in launch > container which led to this situation: > {code:java} > @VisibleForTesting > public static String expandEnvironment(String var, > Path containerLogDir) { > var = var.replace(ApplicationConstants.LOG_DIR_EXPANSION_VAR, > containerLogDir.toString()); > var = var.replace(ApplicationConstants.CLASS_PATH_SEPARATOR, > File.pathSeparator); > // replace parameter expansion marker. e.g. {{VAR}} on Windows is replaced > // as %VAR% and on Linux replaced as "$VAR" > if (Shell.WINDOWS) { > var = var.replaceAll("(\\{\\{)|(\\}\\})", "%"); > } else { > var = var.replace(ApplicationConstants.PARAMETER_EXPANSION_LEFT, "$"); > *var = var.replace(ApplicationConstants.PARAMETER_EXPANSION_RIGHT, "");* > } > return var; > } > {code} > I think this is a Bug that doesn't even consider the pairing of > "*PARAMETER_EXPANSION_LEFT*" and "*PARAMETER_EXPANSION_RIGHT*" when > substituting. But simply substituting for simple violence. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
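For context on the bracket-matching concern above, here is one illustrative, bracket-aware alternative to the blanket replace() calls: only rewrite {{VAR}} tokens whose braces actually pair up, so literal braces (for example inside embedded JSON arguments) are left untouched. This is a sketch of the approach, not the attached patch.

{code:java}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ParameterExpansion {
  // Matches a {{NAME}} token containing no other braces, so JSON fragments
  // such as {"params":{"age":{"param":["0"]}}} are not mangled.
  private static final Pattern PARAM = Pattern.compile("\\{\\{([^{}]+)\\}\\}");

  // Hedged sketch of the Linux branch: expand well-formed {{VAR}} markers to
  // $VAR and leave every other brace in the command line untouched.
  static String expandEnvironment(String var) {
    Matcher m = PARAM.matcher(var);
    StringBuffer sb = new StringBuffer();
    while (m.find()) {
      m.appendReplacement(sb, Matcher.quoteReplacement("$" + m.group(1)));
    }
    m.appendTail(sb);
    return sb.toString();
  }
}
{code}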
[jira] [Commented] (YARN-8958) Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app
[ https://issues.apache.org/jira/browse/YARN-8958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671044#comment-16671044 ] Weiwei Yang commented on YARN-8958: --- Hi [~Tao Yang], No worries. I saw it several times, it should not be caused by this patch. I think the fix makes sense to me, I'll take one more look today. Thanks > Schedulable entities leak in fair ordering policy when recovering containers > between remove app attempt and remove app > -- > > Key: YARN-8958 > URL: https://issues.apache.org/jira/browse/YARN-8958 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.1 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8958.001.patch, YARN-8958.002.patch > > > We found a NPE in ClientRMService#getApplications when querying apps with > specified queue. The cause is that there is one app which can't be found by > calling RMContextImpl#getRMApps(is finished and swapped out of memory) but > still can be queried from fair ordering policy. > To reproduce schedulable entities leak in fair ordering policy: > (1) create app1 and launch container1 on node1 > (2) restart RM > (3) remove app1 attempt, app1 is removed from the schedulable entities. > (4) recover container1 after node1 reconnected to RM, then the state of > contianer1 is changed to COMPLETED, app1 is bring back to entitiesToReorder > after container released, then app1 will be added back into schedulable > entities after calling FairOrderingPolicy#getAssignmentIterator by scheduler. > (5) remove app1 > To solve this problem, we should make sure schedulableEntities can only be > affected by add or remove app attempt, new entity should not be added into > schedulableEntities by reordering process. > {code:java} > protected void reorderSchedulableEntity(S schedulableEntity) { > //remove, update comparable data, and reinsert to update position in order > schedulableEntities.remove(schedulableEntity); > updateSchedulingResourceUsage( > schedulableEntity.getSchedulingResourceUsage()); > schedulableEntities.add(schedulableEntity); > } > {code} > Related codes above can be improved as follow to make sure only existent > entity can be re-add into schedulableEntities. > {code:java} > protected void reorderSchedulableEntity(S schedulableEntity) { > //remove, update comparable data, and reinsert to update position in order > boolean exists = schedulableEntities.remove(schedulableEntity); > updateSchedulingResourceUsage( > schedulableEntity.getSchedulingResourceUsage()); > if (exists) { > schedulableEntities.add(schedulableEntity); > } else { > LOG.info("Skip reordering non-existent schedulable entity: " > + schedulableEntity.getId()); > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8958) Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app
[ https://issues.apache.org/jira/browse/YARN-8958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671030#comment-16671030 ] Tao Yang commented on YARN-8958: There is no UT failure but still got -1 for unit by Hadoop QA. [~cheersyang], Can you help to see what happened? > Schedulable entities leak in fair ordering policy when recovering containers > between remove app attempt and remove app > -- > > Key: YARN-8958 > URL: https://issues.apache.org/jira/browse/YARN-8958 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.1 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8958.001.patch, YARN-8958.002.patch > > > We found a NPE in ClientRMService#getApplications when querying apps with > specified queue. The cause is that there is one app which can't be found by > calling RMContextImpl#getRMApps(is finished and swapped out of memory) but > still can be queried from fair ordering policy. > To reproduce schedulable entities leak in fair ordering policy: > (1) create app1 and launch container1 on node1 > (2) restart RM > (3) remove app1 attempt, app1 is removed from the schedulable entities. > (4) recover container1 after node1 reconnected to RM, then the state of > contianer1 is changed to COMPLETED, app1 is bring back to entitiesToReorder > after container released, then app1 will be added back into schedulable > entities after calling FairOrderingPolicy#getAssignmentIterator by scheduler. > (5) remove app1 > To solve this problem, we should make sure schedulableEntities can only be > affected by add or remove app attempt, new entity should not be added into > schedulableEntities by reordering process. > {code:java} > protected void reorderSchedulableEntity(S schedulableEntity) { > //remove, update comparable data, and reinsert to update position in order > schedulableEntities.remove(schedulableEntity); > updateSchedulingResourceUsage( > schedulableEntity.getSchedulingResourceUsage()); > schedulableEntities.add(schedulableEntity); > } > {code} > Related codes above can be improved as follow to make sure only existent > entity can be re-add into schedulableEntities. > {code:java} > protected void reorderSchedulableEntity(S schedulableEntity) { > //remove, update comparable data, and reinsert to update position in order > boolean exists = schedulableEntities.remove(schedulableEntity); > updateSchedulingResourceUsage( > schedulableEntity.getSchedulingResourceUsage()); > if (exists) { > schedulableEntities.add(schedulableEntity); > } else { > LOG.info("Skip reordering non-existent schedulable entity: " > + schedulableEntity.getId()); > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8958) Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app
[ https://issues.apache.org/jira/browse/YARN-8958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671024#comment-16671024 ] Tao Yang edited comment on YARN-8958 at 11/1/18 2:09 AM: - Thanks [~cheersyang] for the review. {quote} In testSchedulableEntitiesLeak, why the app attempt is finished, but then you try to recover a container for this app? I suppose by then all containers of this app attempt are done correct? {quote} This can happen after RM restart which is step 2 of the reproduce process. Remove app attempt(step 3) may happen before NM reconnect to RM and recover containers (step 4), so that not all containers are done when app attempt finished. Clarified step 4 as {{(4) recover container1 after node1 reconnected to RM}}. was (Author: tao yang): Thanks [~cheersyang] for the review. {quote} In testSchedulableEntitiesLeak, why the app attempt is finished, but then you try to recover a container for this app? I suppose by then all containers of this app attempt are done correct? {quote} This can happen after RM restart which is step 2 of the reproduce process. Remove app attempt(step 3) may happen before NM reconnect to RM and recover containers (step 4), so that not all containers are done when app attempt finished. > Schedulable entities leak in fair ordering policy when recovering containers > between remove app attempt and remove app > -- > > Key: YARN-8958 > URL: https://issues.apache.org/jira/browse/YARN-8958 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.1 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8958.001.patch, YARN-8958.002.patch > > > We found a NPE in ClientRMService#getApplications when querying apps with > specified queue. The cause is that there is one app which can't be found by > calling RMContextImpl#getRMApps(is finished and swapped out of memory) but > still can be queried from fair ordering policy. > To reproduce schedulable entities leak in fair ordering policy: > (1) create app1 and launch container1 on node1 > (2) restart RM > (3) remove app1 attempt, app1 is removed from the schedulable entities. > (4) recover container1 after node1 reconnected to RM, then the state of > contianer1 is changed to COMPLETED, app1 is bring back to entitiesToReorder > after container released, then app1 will be added back into schedulable > entities after calling FairOrderingPolicy#getAssignmentIterator by scheduler. > (5) remove app1 > To solve this problem, we should make sure schedulableEntities can only be > affected by add or remove app attempt, new entity should not be added into > schedulableEntities by reordering process. > {code:java} > protected void reorderSchedulableEntity(S schedulableEntity) { > //remove, update comparable data, and reinsert to update position in order > schedulableEntities.remove(schedulableEntity); > updateSchedulingResourceUsage( > schedulableEntity.getSchedulingResourceUsage()); > schedulableEntities.add(schedulableEntity); > } > {code} > Related codes above can be improved as follow to make sure only existent > entity can be re-add into schedulableEntities. 
> {code:java} > protected void reorderSchedulableEntity(S schedulableEntity) { > //remove, update comparable data, and reinsert to update position in order > boolean exists = schedulableEntities.remove(schedulableEntity); > updateSchedulingResourceUsage( > schedulableEntity.getSchedulingResourceUsage()); > if (exists) { > schedulableEntities.add(schedulableEntity); > } else { > LOG.info("Skip reordering non-existent schedulable entity: " > + schedulableEntity.getId()); > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8958) Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app
[ https://issues.apache.org/jira/browse/YARN-8958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-8958: --- Description: We found a NPE in ClientRMService#getApplications when querying apps with specified queue. The cause is that there is one app which can't be found by calling RMContextImpl#getRMApps(is finished and swapped out of memory) but still can be queried from fair ordering policy. To reproduce schedulable entities leak in fair ordering policy: (1) create app1 and launch container1 on node1 (2) restart RM (3) remove app1 attempt, app1 is removed from the schedulable entities. (4) recover container1 after node1 reconnected to RM, then the state of contianer1 is changed to COMPLETED, app1 is bring back to entitiesToReorder after container released, then app1 will be added back into schedulable entities after calling FairOrderingPolicy#getAssignmentIterator by scheduler. (5) remove app1 To solve this problem, we should make sure schedulableEntities can only be affected by add or remove app attempt, new entity should not be added into schedulableEntities by reordering process. {code:java} protected void reorderSchedulableEntity(S schedulableEntity) { //remove, update comparable data, and reinsert to update position in order schedulableEntities.remove(schedulableEntity); updateSchedulingResourceUsage( schedulableEntity.getSchedulingResourceUsage()); schedulableEntities.add(schedulableEntity); } {code} Related codes above can be improved as follow to make sure only existent entity can be re-add into schedulableEntities. {code:java} protected void reorderSchedulableEntity(S schedulableEntity) { //remove, update comparable data, and reinsert to update position in order boolean exists = schedulableEntities.remove(schedulableEntity); updateSchedulingResourceUsage( schedulableEntity.getSchedulingResourceUsage()); if (exists) { schedulableEntities.add(schedulableEntity); } else { LOG.info("Skip reordering non-existent schedulable entity: " + schedulableEntity.getId()); } } {code} was: We found a NPE in ClientRMService#getApplications when querying apps with specified queue. The cause is that there is one app which can't be found by calling RMContextImpl#getRMApps(is finished and swapped out of memory) but still can be queried from fair ordering policy. To reproduce schedulable entities leak in fair ordering policy: (1) create app1 and launch container1 on node1 (2) restart RM (3) remove app1 attempt, app1 is removed from the schedulable entities. (4) recover container1, then the state of contianer1 is changed to COMPLETED, app1 is bring back to entitiesToReorder after container released, then app1 will be added back into schedulable entities after calling FairOrderingPolicy#getAssignmentIterator by scheduler. (5) remove app1 To solve this problem, we should make sure schedulableEntities can only be affected by add or remove app attempt, new entity should not be added into schedulableEntities by reordering process. {code:java} protected void reorderSchedulableEntity(S schedulableEntity) { //remove, update comparable data, and reinsert to update position in order schedulableEntities.remove(schedulableEntity); updateSchedulingResourceUsage( schedulableEntity.getSchedulingResourceUsage()); schedulableEntities.add(schedulableEntity); } {code} Related codes above can be improved as follow to make sure only existent entity can be re-add into schedulableEntities. 
{code:java} protected void reorderSchedulableEntity(S schedulableEntity) { //remove, update comparable data, and reinsert to update position in order boolean exists = schedulableEntities.remove(schedulableEntity); updateSchedulingResourceUsage( schedulableEntity.getSchedulingResourceUsage()); if (exists) { schedulableEntities.add(schedulableEntity); } else { LOG.info("Skip reordering non-existent schedulable entity: " + schedulableEntity.getId()); } } {code} > Schedulable entities leak in fair ordering policy when recovering containers > between remove app attempt and remove app > -- > > Key: YARN-8958 > URL: https://issues.apache.org/jira/browse/YARN-8958 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.1 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8958.001.patch, YARN-8958.002.patch > > > We found a NPE in ClientRMService#getApplications when querying apps with > specified queue. The cause is that there is one app which can't be found by > calling RMContextImpl#getRMApps(is finished and
[jira] [Commented] (YARN-8958) Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app
[ https://issues.apache.org/jira/browse/YARN-8958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671024#comment-16671024 ] Tao Yang commented on YARN-8958: Thanks [~cheersyang] for the review. {quote} In testSchedulableEntitiesLeak, why the app attempt is finished, but then you try to recover a container for this app? I suppose by then all containers of this app attempt are done correct? {quote} This can happen after RM restart which is step 2 of the reproduce process. Remove app attempt(step 3) may happen before NM reconnect to RM and recover containers (step 4), so that not all containers are done when app attempt finished. > Schedulable entities leak in fair ordering policy when recovering containers > between remove app attempt and remove app > -- > > Key: YARN-8958 > URL: https://issues.apache.org/jira/browse/YARN-8958 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.1 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8958.001.patch, YARN-8958.002.patch > > > We found a NPE in ClientRMService#getApplications when querying apps with > specified queue. The cause is that there is one app which can't be found by > calling RMContextImpl#getRMApps(is finished and swapped out of memory) but > still can be queried from fair ordering policy. > To reproduce schedulable entities leak in fair ordering policy: > (1) create app1 and launch container1 on node1 > (2) restart RM > (3) remove app1 attempt, app1 is removed from the schedulable entities. > (4) recover container1, then the state of contianer1 is changed to COMPLETED, > app1 is bring back to entitiesToReorder after container released, then app1 > will be added back into schedulable entities after calling > FairOrderingPolicy#getAssignmentIterator by scheduler. > (5) remove app1 > To solve this problem, we should make sure schedulableEntities can only be > affected by add or remove app attempt, new entity should not be added into > schedulableEntities by reordering process. > {code:java} > protected void reorderSchedulableEntity(S schedulableEntity) { > //remove, update comparable data, and reinsert to update position in order > schedulableEntities.remove(schedulableEntity); > updateSchedulingResourceUsage( > schedulableEntity.getSchedulingResourceUsage()); > schedulableEntities.add(schedulableEntity); > } > {code} > Related codes above can be improved as follow to make sure only existent > entity can be re-add into schedulableEntities. > {code:java} > protected void reorderSchedulableEntity(S schedulableEntity) { > //remove, update comparable data, and reinsert to update position in order > boolean exists = schedulableEntities.remove(schedulableEntity); > updateSchedulingResourceUsage( > schedulableEntity.getSchedulingResourceUsage()); > if (exists) { > schedulableEntities.add(schedulableEntity); > } else { > LOG.info("Skip reordering non-existent schedulable entity: " > + schedulableEntity.getId()); > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8932) ResourceUtilization cpu is misused in oversubscription as a percentage
[ https://issues.apache.org/jira/browse/YARN-8932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671022#comment-16671022 ] Hadoop QA commented on YARN-8932: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 9m 3s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} YARN-1011 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 56s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 13s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 6s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 10s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 59s{color} | {color:green} YARN-1011 passed {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 14m 39s{color} | {color:red} branch has errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 50s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s{color} | {color:green} YARN-1011 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 3s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 3m 3s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server generated 16 new + 86 unchanged - 0 fixed = 102 total (was 86) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 0s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 1 new + 203 unchanged - 0 fixed = 204 total (was 203) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 12m 18s{color} | {color:red} patch has errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 1m 15s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 1m 20s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 84m 26s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 | | JIRA Issue | YARN-8932 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12946290/YARN-8932-YARN-1011.02.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 77cdecc1e6f3 4.4.0-134-generic #160~14.04.1-Ubuntu SMP Fri Aug 17 11:07:07 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | YARN-1011 / f3d08c7 | |
[jira] [Commented] (YARN-8404) Timeline event publish need to be async to avoid Dispatcher thread leak in case ATS is down
[ https://issues.apache.org/jira/browse/YARN-8404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671017#comment-16671017 ] Rohith Sharma K S commented on YARN-8404: - Yes, this needs to be backported. > Timeline event publish need to be async to avoid Dispatcher thread leak in > case ATS is down > --- > > Key: YARN-8404 > URL: https://issues.apache.org/jira/browse/YARN-8404 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.1.0, 3.0.2 >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S >Priority: Blocker > Fix For: 2.10.0, 3.2.0, 3.1.1, 3.0.4 > > Attachments: YARN-8404.01.patch > > > It is observed that if the ATS 1/1.5 daemon is not running, RM recovery is delayed > until the timeline client times out for each application. By default, the > timeout takes around 5 minutes. If there are many completed applications, the > amount of time the RM will wait is *(number of completed applications in a > cluster * 5 minutes)*, which effectively looks like a hang. > The primary reason for this behavior is YARN-3044 and YARN-4129, which refactored the > existing system metrics publisher. This refactoring made the appFinished event > synchronous, whereas it was asynchronous earlier. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
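To illustrate the direction described above, a minimal sketch (hypothetical helper, not the actual YARN-8404 patch) of handing timeline publishes to a bounded executor, so that a down ATS only ties up pool threads instead of blocking the caller thread that drives RM recovery:
{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncTimelinePublishSketch {
  // Bounded pool: slow or timed-out ATS calls queue up here instead of
  // stalling the dispatcher / recovery thread that triggered the event.
  private final ExecutorService publishPool = Executors.newFixedThreadPool(4);

  public void appFinished(Runnable publishCall) {
    publishPool.submit(publishCall); // returns immediately
  }

  public void stop() {
    publishPool.shutdown();
  }
}
{code}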
[jira] [Commented] (YARN-7765) [Atsv2] GSSException: No valid credentials provided - Failed to find any Kerberos tgt thrown by Timelinev2Client & HBaseClient in NM
[ https://issues.apache.org/jira/browse/YARN-7765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671016#comment-16671016 ] Rohith Sharma K S commented on YARN-7765: - Yes, this needs to be backported to branch-2.9 because Kerberos support for ATSv2 exists in branch-2.9 > [Atsv2] GSSException: No valid credentials provided - Failed to find any > Kerberos tgt thrown by Timelinev2Client & HBaseClient in NM > > > Key: YARN-7765 > URL: https://issues.apache.org/jira/browse/YARN-7765 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0 >Reporter: Sumana Sathish >Assignee: Rohith Sharma K S >Priority: Blocker > Fix For: 3.1.0, 2.10.0, 3.0.1 > > Attachments: YARN-7765.01.patch, YARN-7765.02.patch > > > A secure cluster is deployed and all YARN services are started successfully. > When an application is submitted, the app collectors started as an aux-service > throw the exception below. But this exception is *NOT* observed from the RM > TimelineCollector. > The cluster is deployed with Hadoop 3.0 and HBase 1.2.6 in secure mode. All the > YARN and HBase services are started and working perfectly fine. After 24 hours, > i.e. when the token lifetime has expired, the HBaseClient in the NM and the HDFSClient in the > HMaster and HRegionServer start getting this error. After some time, the HBase > daemons shut down. In the NM, the JVM didn't shut down but none of the events got > published.
> {noformat}
> 2018-01-17 11:04:48,017 FATAL ipc.RpcClientImpl (RpcClientImpl.java:run(684))
> - SASL authentication failed. The most likely cause is missing or invalid
> credentials. Consider 'kinit'.
> javax.security.sasl.SaslException: GSS initiate failed [Caused by
> GSSException: No valid credentials provided (Mechanism level: Failed to find
> any Kerberos tgt)]
> {noformat}
> cc :/ [~vrushalic] [~varun_saxena] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
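For context only (this is not the YARN-7765 fix itself): long-running daemons that log in from a keytab generally have to refresh the TGT before it expires, otherwise SASL/GSS calls start failing exactly as in the log above. A minimal sketch using Hadoop's UserGroupInformation, assuming the process was started with a keytab login:
{code:java}
import java.io.IOException;
import org.apache.hadoop.security.UserGroupInformation;

public class KeytabReloginSketch {
  // Call periodically (or before issuing RPCs) from a daemon that logged in
  // via a keytab, so the Kerberos TGT is renewed before it expires.
  public static void reloginIfNeeded() throws IOException {
    UserGroupInformation ugi = UserGroupInformation.getLoginUser();
    if (ugi.isFromKeytab()) {
      ugi.checkTGTAndReloginFromKeytab();
    }
  }
}
{code}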
[jira] [Commented] (YARN-8914) Add xtermjs to YARN UI2
[ https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671007#comment-16671007 ] Hadoop QA commented on YARN-8914: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 11s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 18m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 14m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 31m 15s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui hadoop-client-modules/hadoop-client-minicluster . {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 5m 55s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 57s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 3m 21s{color} | {color:orange} root: The patch generated 1 new + 3 unchanged - 0 fixed = 4 total (was 3) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 13m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 5s{color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 34s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui hadoop-client-modules/hadoop-client-minicluster . {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 6m 24s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}156m 43s{color} | {color:red} root in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 54s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}320m 38s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.TestContainerManager | | | hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
[jira] [Commented] (YARN-8778) Add Command Line interface to invoke interactive docker shell
[ https://issues.apache.org/jira/browse/YARN-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671005#comment-16671005 ] Hadoop QA commented on YARN-8778: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 25s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 14m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 30m 7s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project hadoop-client-modules/hadoop-client-minicluster . {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 5m 32s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 42s{color} | {color:green} root: The patch generated 0 new + 322 unchanged - 1 fixed = 322 total (was 323) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 13m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 4s{color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 27s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project hadoop-client-modules/hadoop-client-minicluster . {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 5m 45s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}166m 1s{color} | {color:red} root in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 48s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}331m 33s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.qjournal.server.TestJournalNodeSync | | | hadoop.hdfs.TestDFSInotifyEventInputStreamKerberized | | | hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-8778
[jira] [Commented] (YARN-7225) Add queue and partition info to RM audit log
[ https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670962#comment-16670962 ] Jonathan Hung commented on YARN-7225: - BTW the trunk patch looks good to me, but I found that neither this one nor the branch-2.8 one applies to branch-2/branch-2.9. Do you mind uploading a patch for these? > Add queue and partition info to RM audit log > > > Key: YARN-7225 > URL: https://issues.apache.org/jira/browse/YARN-7225 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.9.1, 2.8.4, 3.0.2, 3.1.1 >Reporter: Jonathan Hung >Assignee: Eric Payne >Priority: Major > Attachments: YARN-7225.001.patch, YARN-7225.002.patch, > YARN-7225.003.patch, YARN-7225.004.patch, YARN-7225.005.patch, > YARN-7225.branch-2.8.001.patch > > > Right now RM audit log has fields such as user, ip, resource, etc. Having > queue and partition is useful for resource tracking. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-7225) Add queue and partition info to RM audit log
[ https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670926#comment-16670926 ] Jonathan Hung edited comment on YARN-7225 at 11/1/18 12:22 AM: --- Thanks [~eepayne], IMO #2 makes the most sense, at least for our use case this information will be used for resource tracking so it seems most useful if it tracks what partition it actually ran on - not sure if you have a similar use case. Furthermore I think #1 introduces other issues since the AM partition is often not the same as the partition of the non-AM containers, whether it be this non-AM container's requested partition or the partition it ran on. was (Author: jhung): Thanks [~eepayne], IMO #2 makes the most sense, at least for our use case this information will be used for resource tracking so it seems most useful if it tracks what partition it actually ran on. Furthermore I think #1 introduces other issues since the AM partition is often not the same as the partition of the non-AM containers, whether it be this non-AM container's requested partition or the partition it ran on. > Add queue and partition info to RM audit log > > > Key: YARN-7225 > URL: https://issues.apache.org/jira/browse/YARN-7225 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.9.1, 2.8.4, 3.0.2, 3.1.1 >Reporter: Jonathan Hung >Assignee: Eric Payne >Priority: Major > Attachments: YARN-7225.001.patch, YARN-7225.002.patch, > YARN-7225.003.patch, YARN-7225.004.patch, YARN-7225.005.patch, > YARN-7225.branch-2.8.001.patch > > > Right now RM audit log has fields such as user, ip, resource, etc. Having > queue and partition is useful for resource tracking. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-7225) Add queue and partition info to RM audit log
[ https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670926#comment-16670926 ] Jonathan Hung edited comment on YARN-7225 at 11/1/18 12:21 AM: --- Thanks [~eepayne], IMO #2 makes the most sense, at least for our use case this information will be used for resource tracking so it seems most useful if it tracks what partition it actually ran on. Furthermore I think #1 introduces other issues since the AM partition is often not the same as the partition of the non-AM containers, whether it be this non-AM container's requested partition or the partition it ran on. was (Author: jhung): Thanks [~eepayne], IMO #2 makes the most sense, at least for our use case this information will be used for resource tracking so it seems most useful if it tracks what partition it actually ran on. Furthermore I think #1 introduces other issues since the AM partition is often not the same as the partition of the non-AM containers. > Add queue and partition info to RM audit log > > > Key: YARN-7225 > URL: https://issues.apache.org/jira/browse/YARN-7225 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.9.1, 2.8.4, 3.0.2, 3.1.1 >Reporter: Jonathan Hung >Assignee: Eric Payne >Priority: Major > Attachments: YARN-7225.001.patch, YARN-7225.002.patch, > YARN-7225.003.patch, YARN-7225.004.patch, YARN-7225.005.patch, > YARN-7225.branch-2.8.001.patch > > > Right now RM audit log has fields such as user, ip, resource, etc. Having > queue and partition is useful for resource tracking. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7225) Add queue and partition info to RM audit log
[ https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670926#comment-16670926 ] Jonathan Hung commented on YARN-7225: - Thanks [~eepayne], IMO #2 makes the most sense, at least for our use case this information will be used for resource tracking so it seems most useful if it tracks what partition it actually ran on. Furthermore I think #1 introduces other issues since the AM partition is often not the same as the partition of the non-AM containers. > Add queue and partition info to RM audit log > > > Key: YARN-7225 > URL: https://issues.apache.org/jira/browse/YARN-7225 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.9.1, 2.8.4, 3.0.2, 3.1.1 >Reporter: Jonathan Hung >Assignee: Eric Payne >Priority: Major > Attachments: YARN-7225.001.patch, YARN-7225.002.patch, > YARN-7225.003.patch, YARN-7225.004.patch, YARN-7225.005.patch, > YARN-7225.branch-2.8.001.patch > > > Right now RM audit log has fields such as user, ip, resource, etc. Having > queue and partition is useful for resource tracking. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8958) Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app
[ https://issues.apache.org/jira/browse/YARN-8958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670851#comment-16670851 ] Hadoop QA commented on YARN-8958: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 28s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 28s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 9s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}105m 51s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}164m 21s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-8958 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12946245/YARN-8958.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 31032a523279 3.13.0-144-generic #193-Ubuntu SMP Thu Mar 15 17:03:53 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 6668c19 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/22388/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22388/testReport/ | | Max. process+thread count | 944 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreComm
[jira] [Commented] (YARN-7225) Add queue and partition info to RM audit log
[ https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670827#comment-16670827 ] Eric Payne commented on YARN-7225: -- Thanks [~jhung]. The behavior is slightly different depending on whether we are using {{appAMNodePartitionName}} vs. { {{partition}}, {{schedulerContainer.getSchedulerNode().getPartition()}}, {{node.getPartition()}} } 1) {{appAMNodePartitionName}} will always be the partition of the application. 2) If the other methods are used to get the partition AND non-exclusive labels are in use, it will be the label of the node on which the container ran. I think #2 is technically the correct behavior, but I also think that it may be confusing if the application was submitted to a queue that does not explicitly allow that label. Thoughts? > Add queue and partition info to RM audit log > > > Key: YARN-7225 > URL: https://issues.apache.org/jira/browse/YARN-7225 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.9.1, 2.8.4, 3.0.2, 3.1.1 >Reporter: Jonathan Hung >Assignee: Eric Payne >Priority: Major > Attachments: YARN-7225.001.patch, YARN-7225.002.patch, > YARN-7225.003.patch, YARN-7225.004.patch, YARN-7225.005.patch, > YARN-7225.branch-2.8.001.patch > > > Right now RM audit log has fields such as user, ip, resource, etc. Having > queue and partition is useful for resource tracking. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
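Purely as an illustration of what the end result could look like (the field names and layout here are hypothetical, not the committed audit-log format), an RM audit entry carrying the proposed queue and partition information might read:
{noformat}
USER=alice  IP=10.1.2.3  OPERATION=AM Allocated Container  TARGET=SchedulerApp  RESULT=SUCCESS  APPID=application_1541030000000_0001  CONTAINERID=container_1541030000000_0001_01_000002  QUEUE=root.analytics  PARTITION=gpu
{noformat}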
[jira] [Commented] (YARN-8854) Upgrade jquery datatable version references to v1.10.19
[ https://issues.apache.org/jira/browse/YARN-8854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670743#comment-16670743 ] Hudson commented on YARN-8854: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15340 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15340/]) YARN-8854. Upgrade jquery datatable version references to v1.10.19. (sunilg: rev d36012b69f01c9ddfd2e95545d1f5e1fbc1c3236) * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/css/custom_datatable.css * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/images/favicon.ico * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/images/sort_desc_disabled.png * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/images/sort_asc_disabled.png * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/css/demo_table.css * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/images/back_enabled.jpg * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/images/sort_desc.png * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/view/JQueryUI.java * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/images/back_enabled.jpg * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/images/sort_both.png * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/images/forward_disabled.jpg * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/images/forward_enabled.jpg * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/images/Sorting icons.psd * (edit) LICENSE.txt * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/images/forward_disabled.jpg * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/images/sort_desc.png * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/css/demo_page.css * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/js/jquery.dataTables.min.js * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/images/sort_asc.png * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/images/sort_desc_disabled.png * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/images/sort_asc_disabled.png * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/images/forward_enabled.jpg * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/images/sort_both.png * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/pom.xml * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/images/sort_asc.png * (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/css/jui-dt.css * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/css/jui-dt.css * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/css/demo_page.css * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/images/back_disabled.jpg * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/images/Sorting icons.psd * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/images/favicon.ico * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/css/demo_table.css * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/css/jquery.dataTables.css * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.7/images/back_disabled.jpg * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/js/jquery.dataTables.min.js > Upgrade jquery datatable version references to v1.10.19 > --- > > Key: YARN-8854
[jira] [Commented] (YARN-6729) Clarify documentation on how to enable cgroup support
[ https://issues.apache.org/jira/browse/YARN-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670744#comment-16670744 ] Hudson commented on YARN-6729: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15340 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15340/]) YARN-6729. Clarify documentation on how to enable cgroup support. (skumpf: rev 277a3d8d9fe1127c75452d083ff7859c603e686d) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManagerCgroups.md > Clarify documentation on how to enable cgroup support > - > > Key: YARN-6729 > URL: https://issues.apache.org/jira/browse/YARN-6729 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Yufei Gu >Assignee: Zhankun Tang >Priority: Major > Attachments: YARN-6729-trunk.001.patch > > > NM percentage-physical-cpu-limit is not honored in > DefaultLCEResourcesHandler, which may cause container CPU usage calculation > issues, e.g. container vcore usage can potentially exceed 100% if > percentage-physical-cpu-limit is set to a value less than 100. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
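Since this issue is specifically about documenting how to turn cgroup support on, a minimal sketch of the yarn-site.xml properties that are typically involved may help; treat it as an outline under the assumption of LinuxContainerExecutor, and consult NodeManagerCgroups.md for the authoritative list and values:
{code:xml}
<!-- Sketch only: typical NodeManager settings for cgroup-based CPU enforcement. -->
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.cgroups.hierarchy</name>
  <value>/hadoop-yarn</value>
</property>
<property>
  <!-- Cap all YARN containers to this share of each node's physical CPU. -->
  <name>yarn.nodemanager.resource.percentage-physical-cpu-limit</name>
  <value>80</value>
</property>
{code}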
[jira] [Commented] (YARN-8776) Container Executor change to create stdin/stdout pipeline
[ https://issues.apache.org/jira/browse/YARN-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670740#comment-16670740 ] Hadoop QA commented on YARN-8776: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 51s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 0 new + 81 unchanged - 3 fixed = 81 total (was 84) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 14s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 43s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 79m 5s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.TestContainerManager | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-8776 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12946237/YARN-8776.006.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc | | uname | Linux 7a807e9fc42b 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 6668c19 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/22389/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results
[jira] [Commented] (YARN-8932) ResourceUtilization cpu is misused in oversubscription as a percentage
[ https://issues.apache.org/jira/browse/YARN-8932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670737#comment-16670737 ] Hadoop QA commented on YARN-8932: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 4s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} YARN-1011 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 52s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 29s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 55s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 24s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 5s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 24s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 6s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s{color} | {color:green} YARN-1011 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 36s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 3m 36s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server generated 16 new + 86 unchanged - 0 fixed = 102 total (was 86) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 11s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 1 new + 203 unchanged - 0 fixed = 204 total (was 203) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 36s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 21m 32s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 72m 15s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}179m 41s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.TestNMProxy | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 | | JIRA Issue | YARN-8932 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12946290/YARN-8932-YARN-1011.02.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 0a371aeadbcb 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
[jira] [Commented] (YARN-8902) Add volume manager that manages CSI volume lifecycle
[ https://issues.apache.org/jira/browse/YARN-8902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670720#comment-16670720 ] Hadoop QA commented on YARN-8902: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 27s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 35s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 16s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 38s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 57s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 4 new + 57 unchanged - 0 fixed = 61 total (was 57) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 41s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 22s{color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}106m 23s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}179m 43s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.TestOpportunisticContainerAllocatorAMService | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-8902 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12946327/YARN-8902.007.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux d4f8734eb0f3 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision |
[jira] [Commented] (YARN-8761) Service AM support for decommissioning component instances
[ https://issues.apache.org/jira/browse/YARN-8761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670722#comment-16670722 ] Hadoop QA commented on YARN-8761: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 40s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 4s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 11m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 33m 35s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 44s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 8m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 43s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 34s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 16 new + 392 unchanged - 1 fixed = 408 total (was 393) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 10s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 24s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 26s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 25m 16s{color} | {color:red} hadoop-yarn-client in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 15m 19s{color} | {color:green} hadoop-yarn-services-core in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 54s{color} | {color:green} hadoop-yarn-services-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 22s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {col
[jira] [Commented] (YARN-8932) ResourceUtilization cpu is misused in oversubscription as a percentage
[ https://issues.apache.org/jira/browse/YARN-8932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670670#comment-16670670 ] Robert Kanter commented on YARN-8932: - +1 pending Jenkins > ResourceUtilization cpu is misused in oversubscription as a percentage > -- > > Key: YARN-8932 > URL: https://issues.apache.org/jira/browse/YARN-8932 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: YARN-1011 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-8932-YARN-1011.00.patch, > YARN-8932-YARN-1011.01.patch, YARN-8932-YARN-1011.02.patch > > > The ResourceUtilization javadoc mistakenly documents the cpu as a percentage > represented by a float number in [0, 1.0f], however it is used as the # of > vcores used in reality. > See javadoc and discussion in YARN-8911. > /** > * Get CPU utilization. > * > * @return CPU utilization normalized to 1 CPU > */ > @Public > @Unstable > public abstract float getCPU(); -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
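A minimal sketch of the mismatch described above, assuming getCPU() really carries the number of vcores used rather than the documented [0, 1.0f] fraction; the node vcore count below is an invented example value, not taken from the JIRA.
{code:java}
// Sketch only: illustrates converting the value getCPU() is said to carry
// (vcores used) into the [0, 1.0f] fraction the javadoc promises.
public final class CpuUtilizationExample {

  /** Convert vcores used into the fraction documented by the javadoc. */
  public static float toDocumentedFraction(float vcoresUsed, int nodeVcores) {
    return vcoresUsed / nodeVcores;
  }

  public static void main(String[] args) {
    float vcoresUsed = 6.0f;   // what getCPU() reportedly returns in practice
    int nodeVcores = 16;       // assumed node capacity, example value only
    System.out.println(toDocumentedFraction(vcoresUsed, nodeVcores)); // 0.375
  }
}
{code}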
[jira] [Updated] (YARN-8916) Define a constant "docker" string in "ContainerRuntimeConstants.java" for better maintainability
[ https://issues.apache.org/jira/browse/YARN-8916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-8916: -- Labels: Docker (was: ) > Define a constant "docker" string in "ContainerRuntimeConstants.java" for > better maintainability > > > Key: YARN-8916 > URL: https://issues.apache.org/jira/browse/YARN-8916 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Zhankun Tang >Assignee: Zhankun Tang >Priority: Minor > Labels: Docker > Fix For: 3.3.0, 3.2.1 > > Attachments: YARN-8916-trunk.001.patch > > > There are several hard-coded "docker" strings. It's better to use a > constant string in "ContainerRuntimeConstants" to make this container type > easier to use. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
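A minimal sketch of what the proposed constant might look like; the constant name and its placement inside ContainerRuntimeConstants are assumptions, not the actual YARN-8916 patch.
{code:java}
// Sketch only: the constant name is an assumption, not the committed change.
public final class ContainerRuntimeConstants {

  private ContainerRuntimeConstants() {
    // constants holder, never instantiated
  }

  /** Canonical name of the Docker container runtime type. */
  public static final String CONTAINER_RUNTIME_DOCKER = "docker";
}
{code}
Call sites would then compare against the constant instead of a string literal, e.g. {{ContainerRuntimeConstants.CONTAINER_RUNTIME_DOCKER.equals(runtimeType)}}, where {{runtimeType}} stands for whatever variable holds the configured runtime name.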
[jira] [Updated] (YARN-8927) Better handling of "docker.trusted.registries" in container-executor's "trusted_image_check" function
[ https://issues.apache.org/jira/browse/YARN-8927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-8927: -- Labels: Docker (was: ) > Better handling of "docker.trusted.registries" in container-executor's > "trusted_image_check" function > - > > Key: YARN-8927 > URL: https://issues.apache.org/jira/browse/YARN-8927 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Zhankun Tang >Assignee: Zhankun Tang >Priority: Major > Labels: Docker > > There are some missing cases that we need to catch when handling > "docker.trusted.registries". > The container-executor.cfg configuration is as follows: > {code:java} > docker.trusted.registries=tangzhankun,ubuntu,centos{code} > It works if we run DistributedShell with "tangzhankun/tensorflow" > {code:java} > "yarn ... -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker -shell_env > YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=tangzhankun/tensorflow > {code} > But running a DistributedShell job with "centos", "centos[:tagName]", "ubuntu" > and "ubuntu[:tagName]" fails: > The error message is like: > {code:java} > "image: centos is not trusted" > {code} > We need to handle the above cases better. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
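The actual check lives in container-executor's C code, but the matching behaviour discussed above can be sketched in Java as follows; the helper name and the implicit "library" namespace for bare image names are assumptions drawn from this discussion and YARN-8949, not the committed fix.
{code:java}
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Sketch of the discussed semantics: an image such as "centos" has no
// registry/namespace prefix, so it can only be trusted if the implicit
// "library" namespace is listed in docker.trusted.registries.
public final class TrustedImageCheckSketch {

  static boolean isTrustedImage(String image, Set<String> trustedRegistries) {
    int slash = image.indexOf('/');
    // Bare image names implicitly live under the "library" namespace.
    String registry = slash < 0 ? "library" : image.substring(0, slash);
    return trustedRegistries.contains(registry);
  }

  public static void main(String[] args) {
    Set<String> trusted = new HashSet<>(
        Arrays.asList("tangzhankun", "ubuntu", "centos"));
    System.out.println(isTrustedImage("tangzhankun/tensorflow", trusted)); // true
    System.out.println(isTrustedImage("centos:latest", trusted)); // false unless "library" is added
  }
}
{code}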
[jira] [Updated] (YARN-8949) Document usage of "library" in "docker.trusted.repositories" to trust local image
[ https://issues.apache.org/jira/browse/YARN-8949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-8949: -- Labels: Docker (was: ) > Document usage of "library" in "docker.trusted.repositories" to trust local > image > - > > Key: YARN-8949 > URL: https://issues.apache.org/jira/browse/YARN-8949 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhankun Tang >Assignee: Zhankun Tang >Priority: Major > Labels: Docker > > As we have come to a solution on what the code changes will be (YARN-8927) to improve > this trusted repo feature, the usage of "library" in > "docker.trusted.repositories" in the current implementation should be documented. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
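For context, a hedged example of the configuration such documentation would describe, reusing the property name from YARN-8927's container-executor.cfg snippet; adding "library" is what lets bare local image names such as "centos" pass the trust check under the discussed semantics.
{code:java}
docker.trusted.registries=library,ubuntu,centos
{code}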
[jira] [Commented] (YARN-8867) Retrieve the status of resource localization
[ https://issues.apache.org/jira/browse/YARN-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670503#comment-16670503 ] Chandni Singh commented on YARN-8867: - {quote} Is this status going to show up in Yarn Service JSON or some other mechanism to surface to the end user? {quote} I don't have any plans yet to expose this in Yarn Service JSON. It will be used by the Service AM to find out when the localization is complete. We can expose it in the Yarn Service JSON when the need arises. {quote} The status definition may also include a state for not yet started, like PENDING. {quote} Right now I am using the information in the {{ResourceSet}} class. All the resources in {{pendingResources}} have their state marked as {{IN_PROGRESS}}. In order to differentiate further between resources for which localization has started and the ones which are queued, we will have to extract the information from {{ResourceLocalizationService}}. > Retrieve the status of resource localization > > > Key: YARN-8867 > URL: https://issues.apache.org/jira/browse/YARN-8867 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8867.wip.patch > > > Refer to YARN-3854. > Currently the NM does not have an API to retrieve the status of localization. > Unless the client can know when the localization of a resource is complete, > irrespective of the type of the resource, it cannot take any appropriate > action. > We need an API in {{ContainerManagementProtocol}} to retrieve the status of > the localization. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
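A rough sketch of the kind of per-resource status record this discussion points toward, including the suggested PENDING state; all class, enum, and method names here are assumptions, not the actual YARN-8867 API.
{code:java}
// Sketch only: names are assumptions, not the YARN-8867 patch.
public class LocalizationStatusSketch {

  /** Possible localization states of a single resource. */
  public enum State {
    PENDING,      // queued in ResourceLocalizationService, download not started
    IN_PROGRESS,  // download in progress (what pendingResources reports today)
    COMPLETED,
    FAILED
  }

  private final String resourceKey;
  private final State state;
  private final String diagnostics;

  public LocalizationStatusSketch(String resourceKey, State state,
      String diagnostics) {
    this.resourceKey = resourceKey;
    this.state = state;
    this.diagnostics = diagnostics;
  }

  public String getResourceKey() { return resourceKey; }
  public State getState() { return state; }
  public String getDiagnostics() { return diagnostics; }
}
{code}
A {{ContainerManagementProtocol}} call could then return a list of such records per container, letting the Service AM poll until every resource reaches COMPLETED.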
[jira] [Commented] (YARN-8960) Can't get submarine service status using the command of "yarn app -status" under security environment
[ https://issues.apache.org/jira/browse/YARN-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670418#comment-16670418 ] Hadoop QA commented on YARN-8960: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 39s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 15s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine: The patch generated 7 new + 41 unchanged - 0 fixed = 48 total (was 41) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 23s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 32s{color} | {color:green} hadoop-yarn-submarine in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 30s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 59m 45s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-8960 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12946367/YARN-8960.003.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 9e28c47e065c 3.13.0-144-generic #193-Ubuntu SMP Thu Mar 15 17:03:53 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 478b2cb | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/22384/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-submarine.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22384/testReport/ | | Max. process+thread count | 306 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine U: hadoop-yarn-
[jira] [Updated] (YARN-8838) Add security check for container user is same as websocket user
[ https://issues.apache.org/jira/browse/YARN-8838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-8838: Attachment: YARN-8838.003.patch > Add security check for container user is same as websocket user > --- > > Key: YARN-8838 > URL: https://issues.apache.org/jira/browse/YARN-8838 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Labels: docker > Attachments: YARN-8838.001.patch, YARN-8838.002.patch, > YARN-8838.003.patch > > > When a user is authenticated via the SPNEGO entry point, the node manager must verify > that the remote user is the same as the container user before starting the web socket > session. One possible solution is to verify that the web request user matches the > yarn container local directory owner during onWebSocketConnect. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
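A minimal sketch of the check described above, written against the Jetty WebSocket API; the class shape and the lookupContainerUser helper are assumptions for illustration, not the YARN-8838 patch.
{code:java}
import org.eclipse.jetty.websocket.api.Session;
import org.eclipse.jetty.websocket.api.StatusCode;
import org.eclipse.jetty.websocket.api.annotations.OnWebSocketConnect;
import org.eclipse.jetty.websocket.api.annotations.WebSocket;

// Sketch only: class shape and helper method are assumptions.
@WebSocket
public class ContainerShellWebSocketSketch {

  @OnWebSocketConnect
  public void onConnect(Session session) {
    String remoteUser = session.getUpgradeRequest()
        .getUserPrincipal().getName();
    String containerUser = lookupContainerUser();  // hypothetical helper
    if (!remoteUser.equals(containerUser)) {
      // Reject the session when the authenticated user does not own the container.
      session.close(StatusCode.POLICY_VIOLATION,
          "User " + remoteUser + " does not own this container");
    }
  }

  private String lookupContainerUser() {
    // Hypothetical: in the NM this would be derived from the container's
    // local directory owner, as suggested in the description above.
    return "container-owner";
  }
}
{code}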
[jira] [Commented] (YARN-8958) Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app
[ https://issues.apache.org/jira/browse/YARN-8958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670258#comment-16670258 ] Weiwei Yang commented on YARN-8958: --- Hi [~Tao Yang], thanks for creating the issue and the fix. I am trying to understand this issue and have a question about the UT. In {{testSchedulableEntitiesLeak}}, why is the app attempt finished, but you then try to recover a container for this app? I suppose by then all containers of this app attempt are done, correct? > Schedulable entities leak in fair ordering policy when recovering containers > between remove app attempt and remove app > -- > > Key: YARN-8958 > URL: https://issues.apache.org/jira/browse/YARN-8958 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.1 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8958.001.patch, YARN-8958.002.patch > > > We found an NPE in ClientRMService#getApplications when querying apps with a > specified queue. The cause is that there is one app which can't be found by > calling RMContextImpl#getRMApps (it is finished and swapped out of memory) but > can still be queried from the fair ordering policy. > To reproduce the schedulable entities leak in fair ordering policy: > (1) create app1 and launch container1 on node1 > (2) restart RM > (3) remove app1 attempt, app1 is removed from the schedulable entities. > (4) recover container1, then the state of container1 is changed to COMPLETED, > app1 is brought back to entitiesToReorder after the container is released, then app1 > will be added back into the schedulable entities after the scheduler calls > FairOrderingPolicy#getAssignmentIterator. > (5) remove app1 > To solve this problem, we should make sure schedulableEntities can only be > affected by adding or removing an app attempt; a new entity should not be added into > schedulableEntities by the reordering process. > {code:java} > protected void reorderSchedulableEntity(S schedulableEntity) { > //remove, update comparable data, and reinsert to update position in order > schedulableEntities.remove(schedulableEntity); > updateSchedulingResourceUsage( > schedulableEntity.getSchedulingResourceUsage()); > schedulableEntities.add(schedulableEntity); > } > {code} > The related code above can be improved as follows to make sure only an existing > entity can be re-added into schedulableEntities. > {code:java} > protected void reorderSchedulableEntity(S schedulableEntity) { > //remove, update comparable data, and reinsert to update position in order > boolean exists = schedulableEntities.remove(schedulableEntity); > updateSchedulingResourceUsage( > schedulableEntity.getSchedulingResourceUsage()); > if (exists) { > schedulableEntities.add(schedulableEntity); > } else { > LOG.info("Skip reordering non-existent schedulable entity: " > + schedulableEntity.getId()); > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8960) Can't get submarine service status using the command of "yarn app -status" under security environment
[ https://issues.apache.org/jira/browse/YARN-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zac Zhou updated YARN-8960: --- Attachment: YARN-8960.003.patch > Can't get submarine service status using the command of "yarn app -status" > under security environment > - > > Key: YARN-8960 > URL: https://issues.apache.org/jira/browse/YARN-8960 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zac Zhou >Assignee: Zac Zhou >Priority: Major > Attachments: YARN-8960.001.patch, YARN-8960.002.patch, > YARN-8960.003.patch > > > After submitting a submarine job, we tried to get service status using the > following command: > yarn app -status ${service_name} > But we got the following error: > HTTP error code : 500 > > The stack in resourcemanager log is : > ERROR org.apache.hadoop.yarn.service.webapp.ApiServer: Get service failed: {} > java.lang.reflect.UndeclaredThrowableException > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1748) > at > org.apache.hadoop.yarn.service.webapp.ApiServer.getServiceFromClient(ApiServer.java:800) > at > org.apache.hadoop.yarn.service.webapp.ApiServer.getService(ApiServer.java:186) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60) > at > com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker > ._dispatch(AbstractResourceMethodDispatchProvider.java:205) > at > com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodD > ispatcher.java:75) > at > com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302) > at > com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) > at > com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108) > at > com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) > at > com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84) > at > com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542) > at > com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473) > at > com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419) > at > com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409) > at > com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409) > at > com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558) > at > com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) > at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772) > at > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:89) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941) > at > 
com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebAppFilter.doFilter(RMWebAppFilter.java:179) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829) > at > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82) > at > com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119) > at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133) > at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130) > at com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203) > at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) > at > org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) > at > org.apache.hadoop.security.authentication.server.AuthenticationFi
[jira] [Commented] (YARN-8954) Reservations list field in ReservationListInfo is not accessible
[ https://issues.apache.org/jira/browse/YARN-8954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669832#comment-16669832 ] Oleksandr Shevchenko commented on YARN-8954: The failed tests are not related to the changes. All tests in TestRMAppTransitions, TestCapacitySchedulerMetrics, and TestFairScheduler passed successfully locally. Could someone review the patch? Thanks! > Reservations list field in ReservationListInfo is not accessible > > > Key: YARN-8954 > URL: https://issues.apache.org/jira/browse/YARN-8954 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager, restapi >Reporter: Oleksandr Shevchenko >Priority: Minor > Attachments: YARN-8954.001.patch > > > We need to add a getter for the Reservations list field since the field cannot > be accessed after unmarshalling. A similar problem is described in YARN-2280. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
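A minimal sketch of the kind of change proposed: exposing the unmarshalled list through a getter. The element name follows the JIRA summary, but the exact shape of ReservationListInfo is an assumption here, so a plain String list keeps the example self-contained.
{code:java}
import java.util.ArrayList;
import java.util.List;
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;

// Sketch only: the real ReservationListInfo lives in the RM webapp DAO
// package and holds reservation objects rather than strings.
@XmlRootElement(name = "reservation-list")
@XmlAccessorType(XmlAccessType.FIELD)
public class ReservationListInfoSketch {

  @XmlElement(name = "reservations")
  private List<String> reservations = new ArrayList<>();

  /** The getter the JIRA asks for, so callers can read the list after unmarshalling. */
  public List<String> getReservations() {
    return reservations;
  }
}
{code}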
[jira] [Commented] (YARN-7277) Container Launch expand environment needs to consider bracket matching
[ https://issues.apache.org/jira/browse/YARN-7277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669692#comment-16669692 ] Weiwei Yang commented on YARN-7277: --- Hi [~tangzhankun], thanks for creating this issue [~balloons] and thanks for the patch [~tangzhankun]. The v3 patch doesn't seem to work on Windows, can you please check? Thanks > Container Launch expand environment needs to consider bracket matching > -- > > Key: YARN-7277 > URL: https://issues.apache.org/jira/browse/YARN-7277 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: balloons >Assignee: Zhankun Tang >Priority: Critical > Attachments: YARN-7277-trunk.001.patch, YARN-7277-trunk.002.patch, > YARN-7277-trunk.003.patch > > > The SPARK application I submitted always failed and I finally found that the > commands I specified to launch the AM Container were changed by the NM. > *The following is an excerpt from the command I submitted to the RM:* > {code:java} > *'{\"handler\":\"FILLER\",\"inputTable\":\"engine_arch.adult_train\",\"outputTable\":[\"ether_features_filler_\$experimentId_\$taskId_out0\"],\"params\":{\"age\":{\"param\":[\"0\"]}}}'* > {code} > *The following is an excerpt from the corresponding command observed when the > NM launches the container:* > {code:java} > *'{\"handler\":\"FILLER\",\"inputTable\":\"engine_arch.adult_train\",\"outputTable\":[\"ether_features_filler_\$experimentId_\$taskId_out0\"],\"params\":{\"age\":{\"param\":[\"0\"]}* > {code} > Finally, I found that the NM makes the following transformation when launching the > container, which leads to this situation: > {code:java} > @VisibleForTesting > public static String expandEnvironment(String var, > Path containerLogDir) { > var = var.replace(ApplicationConstants.LOG_DIR_EXPANSION_VAR, > containerLogDir.toString()); > var = var.replace(ApplicationConstants.CLASS_PATH_SEPARATOR, > File.pathSeparator); > // replace parameter expansion marker. e.g. {{VAR}} on Windows is replaced > // as %VAR% and on Linux replaced as "$VAR" > if (Shell.WINDOWS) { > var = var.replaceAll("(\\{\\{)|(\\}\\})", "%"); > } else { > var = var.replace(ApplicationConstants.PARAMETER_EXPANSION_LEFT, "$"); > *var = var.replace(ApplicationConstants.PARAMETER_EXPANSION_RIGHT, "");* > } > return var; > } > {code} > I think this is a bug: the substitution doesn't consider the pairing of > "*PARAMETER_EXPANSION_LEFT*" and "*PARAMETER_EXPANSION_RIGHT*" when > substituting; it simply substitutes blindly. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
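A hedged sketch of a pair-aware expansion, as opposed to the blind replace() quoted above: only a "}}" that closes an earlier "{{" is rewritten, so stray braces inside JSON arguments survive. This is an illustration covering just the Linux ("$VAR") branch, not the committed fix.
{code:java}
// Illustration only: rewrite matched {{VAR}} pairs to $VAR and leave any
// unpaired braces untouched.
public final class PairAwareExpansionSketch {

  static String expandParameterMarkers(String var) {
    StringBuilder out = new StringBuilder();
    int i = 0;
    while (i < var.length()) {
      int start = var.indexOf("{{", i);
      if (start < 0) {                       // no more markers
        out.append(var, i, var.length());
        break;
      }
      int end = var.indexOf("}}", start + 2);
      if (end < 0) {                         // unmatched "{{": keep as-is
        out.append(var, i, var.length());
        break;
      }
      out.append(var, i, start)
         .append('$')
         .append(var, start + 2, end);       // {{VAR}} -> $VAR
      i = end + 2;
    }
    return out.toString();
  }

  public static void main(String[] args) {
    // Braces that are part of a JSON payload are preserved.
    System.out.println(expandParameterMarkers(
        "{{JAVA_HOME}}/bin --arg '{\"a\":{\"b\":[0]}}'"));
  }
}
{code}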
[jira] [Commented] (YARN-8961) [UI2] Flow Run End Time shows 'Invalid date'
[ https://issues.apache.org/jira/browse/YARN-8961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669690#comment-16669690 ] Charan Hebri commented on YARN-8961: [~akhilpb] below is the response for flow runs,
{noformat}
[
  {
    "metrics": [],
    "events": [],
    "createdtime": 1540969639076,
    "idprefix": 0,
    "id": "hrt_qa@test_flow/1540969639076",
    "type": "YARN_FLOW_RUN",
    "info": {
      "UID": "yarn-cluster!hrt_qa!test_flow!1540969639076",
      "SYSTEM_INFO_FLOW_NAME": "test_flow",
      "SYSTEM_INFO_FLOW_RUN_ID": 1540969639076,
      "SYSTEM_INFO_USER": "hrt_qa",
      "FROM_ID": "yarn-cluster!hrt_qa!test_flow!1540969639076"
    },
    "isrelatedto": {},
    "relatesto": {}
  },
  {
    "metrics": [],
    "events": [],
    "createdtime": 1540969221139,
    "idprefix": 0,
    "id": "hrt_qa@test_flow/1540969221139",
    "type": "YARN_FLOW_RUN",
    "info": {
      "UID": "yarn-cluster!hrt_qa!test_flow!1540969221139",
      "SYSTEM_INFO_FLOW_RUN_END_TIME": 1540969587649,
      "SYSTEM_INFO_FLOW_NAME": "test_flow",
      "SYSTEM_INFO_FLOW_RUN_ID": 1540969221139,
      "SYSTEM_INFO_USER": "hrt_qa",
      "FROM_ID": "yarn-cluster!hrt_qa!test_flow!1540969221139"
    },
    "isrelatedto": {},
    "relatesto": {}
  }
]
{noformat}
You can see that for a flow that hasn't completed yet, SYSTEM_INFO_FLOW_RUN_END_TIME is not available in the response. > [UI2] Flow Run End Time shows 'Invalid date' > > > Key: YARN-8961 > URL: https://issues.apache.org/jira/browse/YARN-8961 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Charan Hebri >Assignee: Akhil PB >Priority: Major > Attachments: Invalid_Date.png > > > End Time for Flow Runs is shown as *Invalid date* for runs that are in > progress. This should be shown as *N/A* just like how it is shown for 'CPU > VCores' and 'Memory Used'. Attached relevant screenshot. > cc [~akhilpb] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org