[jira] [Commented] (YARN-8579) New AM attempt could not retrieve previous attempt component data
[ https://issues.apache.org/jira/browse/YARN-8579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564456#comment-16564456 ]

Hudson commented on YARN-8579:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14679 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14679/])
YARN-8579. Recover NMToken of previous attempted component data. (eyang: rev c7ebcd76bf3dd14127336951f2be3de772e7826a)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/ServiceScheduler.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java

> New AM attempt could not retrieve previous attempt component data
> -----------------------------------------------------------------
>
>                 Key: YARN-8579
>                 URL: https://issues.apache.org/jira/browse/YARN-8579
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 3.1.1
>            Reporter: Yesha Vora
>            Assignee: Gour Saha
>            Priority: Critical
>         Attachments: YARN-8579.001.patch, YARN-8579.002.patch, YARN-8579.003.patch, YARN-8579.004.patch
>
>
> Steps:
> 1) Launch httpd-docker
> 2) Wait for app to be in STABLE state
> 3) Run validation for app (It takes around 3 mins)
> 4) Stop all Zks
> 5) Wait 60 sec
> 6) Kill AM
> 7) wait for 30 sec
> 8) Start all ZKs
> 9) Wait for application to finish
> 10) Validate expected containers of the app
> Expected behavior:
> New attempt of AM should start and docker containers launched by 1st attempt should be recovered by new attempt.
> Actual behavior:
> New AM attempt starts. It can not recover 1st attempt docker containers. It can not read component details from ZK.
> Thus, it starts new attempt for all containers.
> {code}
> 2018-07-19 22:42:47,595 [main] INFO service.ServiceScheduler - Registering appattempt_1531977563978_0015_02, fault-test-zkrm-httpd-docker into registry
> 2018-07-19 22:42:47,611 [main] INFO service.ServiceScheduler - Received 1 containers from previous attempt.
> 2018-07-19 22:42:47,642 [main] INFO service.ServiceScheduler - Could not read component paths: `/users/hrt-qa/services/yarn-service/fault-test-zkrm-httpd-docker/components': No such file or directory: KeeperErrorCode = NoNode for /registry/users/hrt-qa/services/yarn-service/fault-test-zkrm-httpd-docker/components
> 2018-07-19 22:42:47,643 [main] INFO service.ServiceScheduler - Handling container_e08_1531977563978_0015_01_03 from previous attempt
> 2018-07-19 22:42:47,643 [main] INFO service.ServiceScheduler - Record not found in registry for container container_e08_1531977563978_0015_01_03 from previous attempt, releasing
> 2018-07-19 22:42:47,649 [AMRM Callback Handler Thread] INFO impl.TimelineV2ClientImpl - Updated timeline service address to xxx:33019
> 2018-07-19 22:42:47,651 [main] INFO service.ServiceScheduler - Triggering initial evaluation of component httpd
> 2018-07-19 22:42:47,652 [main] INFO component.Component - [INIT COMPONENT httpd]: 2 instances.
> 2018-07-19 22:42:47,652 [main] INFO component.Component - [COMPONENT httpd] Requesting for 2 container(s){code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
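The "Record not found in registry ... releasing" behaviour in the log above can be illustrated with a minimal sketch. This is not the ServiceScheduler code: the class RecoverySketch, the Result holder, the recover method, and the use of a plain Map as a stand-in for the ZooKeeper-backed registry are all hypothetical names chosen for illustration. The sketch only assumes what the log shows, namely that a container handed over from the previous attempt is kept when a registry record exists for it and released otherwise.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these names come from ServiceScheduler. */
class RecoverySketch {

  /** Outcome of examining the containers handed over from the previous attempt. */
  static final class Result {
    final List<String> recovered = new ArrayList<>();
    final List<String> released = new ArrayList<>();
  }

  /**
   * Keeps a container only if the registry still holds a record for it.
   * Containers without a record are released and have to be requested again.
   *
   * @param previousContainers container IDs reported for the previous attempt
   * @param registryRecords    container ID to component name (stand-in for the registry)
   */
  static Result recover(List<String> previousContainers,
                        Map<String, String> registryRecords) {
    Result result = new Result();
    for (String containerId : previousContainers) {
      if (registryRecords.containsKey(containerId)) {
        result.recovered.add(containerId);
      } else {
        result.released.add(containerId);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Container ID copied from the log in the issue description.
    List<String> previous =
        Collections.singletonList("container_e08_1531977563978_0015_01_03");
    // An empty map models the NoNode case: the component records could not be read.
    Map<String, String> emptyRegistry = new HashMap<>();

    Result result = recover(previous, emptyRegistry);
    System.out.println("recovered=" + result.recovered);
    System.out.println("released=" + result.released);
  }
}
```

With an empty registry every previous-attempt container ends up in the released list, which matches the reported symptom: the new attempt then requests fresh containers for the whole component.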
[jira] [Commented] (YARN-8579) New AM attempt could not retrieve previous attempt component data
[ https://issues.apache.org/jira/browse/YARN-8579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564250#comment-16564250 ]

Gour Saha commented on YARN-8579:
---------------------------------

Thanks [~csingh]. [~eyang] please review and commit when you get a chance.

> New AM attempt could not retrieve previous attempt component data
> -----------------------------------------------------------------
>
>                 Key: YARN-8579
>                 URL: https://issues.apache.org/jira/browse/YARN-8579
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 3.1.1
>            Reporter: Yesha Vora
>            Assignee: Gour Saha
>            Priority: Critical
>             Fix For: 3.2.0, 3.1.2
>         Attachments: YARN-8579.001.patch, YARN-8579.002.patch, YARN-8579.003.patch, YARN-8579.004.patch
[jira] [Commented] (YARN-8579) New AM attempt could not retrieve previous attempt component data
[ https://issues.apache.org/jira/browse/YARN-8579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564062#comment-16564062 ]

Chandni Singh commented on YARN-8579:
-------------------------------------

+1 LGTM
[jira] [Commented] (YARN-8579) New AM attempt could not retrieve previous attempt component data
[ https://issues.apache.org/jira/browse/YARN-8579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16562978#comment-16562978 ]

genericqa commented on YARN-8579:
---------------------------------

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 20s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 12s | Maven dependency ordering for branch |
| +1 | mvninstall | 24m 27s | trunk passed |
| +1 | compile | 7m 46s | trunk passed |
| +1 | checkstyle | 0m 18s | trunk passed |
| +1 | mvnsite | 1m 21s | trunk passed |
| +1 | shadedclient | 11m 33s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 0s | trunk passed |
| +1 | javadoc | 0m 53s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 11s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 11s | the patch passed |
| +1 | compile | 7m 36s | the patch passed |
| +1 | javac | 7m 36s | the patch passed |
| +1 | checkstyle | 0m 15s | the patch passed |
| +1 | mvnsite | 1m 24s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 30s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 22s | the patch passed |
| +1 | javadoc | 1m 1s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 70m 20s | hadoop-yarn-server-resourcemanager in the patch passed. |
| +1 | unit | 11m 39s | hadoop-yarn-services-core in the patch passed. |
| +1 | asflicense | 0m 28s | The patch does not generate ASF License warnings. |
| | | 156m 1s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8579 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933679/YARN-8579.004.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 2e3b926d0909 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 08:53:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / ee53602 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21443/testReport/ |
| Max. process+thread count | 860 (vs. ulim
[jira] [Commented] (YARN-8579) New AM attempt could not retrieve previous attempt component data
[ https://issues.apache.org/jira/browse/YARN-8579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16562586#comment-16562586 ]

Gour Saha commented on YARN-8579:
---------------------------------

Ah, nice catch [~csingh]. That's exactly what the issue was. With the fix in FairScheduler.java, the test now passes for both FAIR and CAPACITY schedulers. I am running all the tests now and will upload the updated patch after they all pass.
[jira] [Commented] (YARN-8579) New AM attempt could not retrieve previous attempt component data
[ https://issues.apache.org/jira/browse/YARN-8579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16562355#comment-16562355 ]

Chandni Singh commented on YARN-8579:
-------------------------------------

[~gsaha] please see
{quote}I do have one fundamental question though. I don't understand why for FAIR scheduler the below assert fails (which means no NMTokens are sent over even with this patch). The method where I made the code change is a common method which is called by both Fair and Capacity Schedulers. Any idea? That's why I had to enable this assert for CAPACITY scheduler only. I don't have a cluster setup where I can test FairScheduler.
{quote}
The bug is in the order of the calls to {{SchedulerApplicationAttempt.pullPreviousAttemptContainers()}} and {{SchedulerApplicationAttempt.pullUpdatedNMTokens()}} in {{FairScheduler}}.

{{FiCaSchedulerApp}} uses the right order. It first calls {{pullPreviousAttemptContainers()}}, which updates the NM tokens, and then pulls them with {{pullUpdatedNMTokens()}}:
{code:java}
List<Container> previousAttemptContainers = pullPreviousAttemptContainers();
List<Container> newlyAllocatedContainers = pullNewlyAllocatedContainers();
List<Container> newlyIncreasedContainers = pullNewlyIncreasedContainers();
List<Container> newlyDecreasedContainers = pullNewlyDecreasedContainers();
List<Container> newlyPromotedContainers = pullNewlyPromotedContainers();
List<Container> newlyDemotedContainers = pullNewlyDemotedContainers();
List<NMToken> updatedNMTokens = pullUpdatedNMTokens();
{code}
However, {{FairScheduler}} uses the wrong order: it calls {{pullUpdatedNMTokens()}} before {{pullPreviousAttemptContainers()}}.
{code:java}
return new Allocation(newlyAllocatedContainers, headroom,
    preemptionContainerIds, null, null,
    application.pullUpdatedNMTokens(), null, null,
    application.pullNewlyPromotedContainers(),
    application.pullNewlyDemotedContainers(),
    application.pullPreviousAttemptContainers());
{code}
Since the NMTokens have not been updated at that point, they are null in the allocation.

I think we should fix this instead of modifying the test to only check this for capacity scheduler.
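The ordering problem described in Chandni Singh's comment can be reproduced in isolation. The following is a minimal sketch, not the Hadoop code: PullOrderSketch, AttemptSketch, and AllocationSketch are hypothetical stand-ins, containers and tokens are modelled as plain strings, and a missing token list is modelled as an empty list rather than null. It relies only on the behaviour stated in the comment (pulling previous-attempt containers updates the NM tokens as a side effect) and on the fact that Java evaluates method and constructor arguments left to right.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of the pull-order dependency; not the scheduler code. */
class PullOrderSketch {

  /** Stand-in for an application attempt with two drain-style "pull" methods. */
  static final class AttemptSketch {
    private final List<String> recoveredContainers = new ArrayList<>();
    private final List<String> updatedNMTokens = new ArrayList<>();

    AttemptSketch(List<String> containersFromPreviousAttempt) {
      recoveredContainers.addAll(containersFromPreviousAttempt);
    }

    /** Drains recovered containers and, as a side effect, queues one token per container. */
    List<String> pullPreviousAttemptContainers() {
      List<String> pulled = new ArrayList<>(recoveredContainers);
      recoveredContainers.clear();
      for (String container : pulled) {
        updatedNMTokens.add("token-for-" + container);
      }
      return pulled;
    }

    /** Drains whatever tokens have been queued so far. */
    List<String> pullUpdatedNMTokens() {
      List<String> pulled = new ArrayList<>(updatedNMTokens);
      updatedNMTokens.clear();
      return pulled;
    }
  }

  /** Holder whose constructor arguments are evaluated left to right by Java. */
  static final class AllocationSketch {
    final List<String> nmTokens;
    final List<String> previousContainers;

    AllocationSketch(List<String> nmTokens, List<String> previousContainers) {
      this.nmTokens = nmTokens;
      this.previousContainers = previousContainers;
    }
  }

  /** Tokens are pulled before the call that queues them, so the list is empty. */
  static AllocationSketch tokensFirst(AttemptSketch app) {
    return new AllocationSketch(
        app.pullUpdatedNMTokens(), app.pullPreviousAttemptContainers());
  }

  /** Hoisting the calls into locals fixes the order independently of argument position. */
  static AllocationSketch containersFirst(AttemptSketch app) {
    List<String> previous = app.pullPreviousAttemptContainers();
    List<String> tokens = app.pullUpdatedNMTokens();
    return new AllocationSketch(tokens, previous);
  }

  public static void main(String[] args) {
    List<String> containers = Collections.singletonList("container_1");
    System.out.println(tokensFirst(new AttemptSketch(containers)).nmTokens.size());     // prints 0
    System.out.println(containersFirst(new AttemptSketch(containers)).nmTokens.size()); // prints 1
  }
}
```

Assigning the pulled lists to local variables before building the result keeps the dependency between the two calls explicit, so a later reordering of constructor parameters cannot silently reintroduce the problem. Whether the committed patch takes this exact form is not stated in the thread.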
[jira] [Commented] (YARN-8579) New AM attempt could not retrieve previous attempt component data
[ https://issues.apache.org/jira/browse/YARN-8579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16562280#comment-16562280 ]

genericqa commented on YARN-8579:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 24s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 1m 3s | Maven dependency ordering for branch |
| +1 | mvninstall | 23m 57s | trunk passed |
| +1 | compile | 7m 55s | trunk passed |
| +1 | checkstyle | 0m 22s | trunk passed |
| +1 | mvnsite | 1m 28s | trunk passed |
| +1 | shadedclient | 12m 2s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 57s | trunk passed |
| +1 | javadoc | 0m 59s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 6s | the patch passed |
| +1 | compile | 6m 51s | the patch passed |
| +1 | javac | 6m 51s | the patch passed |
| +1 | checkstyle | 0m 20s | the patch passed |
| +1 | mvnsite | 1m 24s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 51s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 56s | the patch passed |
| +1 | javadoc | 0m 45s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 70m 47s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | unit | 11m 55s | hadoop-yarn-services-core in the patch passed. |
| +1 | asflicense | 0m 38s | The patch does not generate ASF License warnings. |
| | | 156m 40s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8579 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933440/YARN-8579.003.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 3c8f14d9bc28 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3517a47 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| unit | https://buil
[jira] [Commented] (YARN-8579) New AM attempt could not retrieve previous attempt component data
[ https://issues.apache.org/jira/browse/YARN-8579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16560862#comment-16560862 ]

Gour Saha commented on YARN-8579:
---------------------------------

None of the test failures are related to the code change and all patches have completely different non-overlapping test failures.
[jira] [Commented] (YARN-8579) New AM attempt could not retrieve previous attempt component data
[ https://issues.apache.org/jira/browse/YARN-8579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16560819#comment-16560819 ]

genericqa commented on YARN-8579:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 24s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 12s | Maven dependency ordering for branch |
| +1 | mvninstall | 26m 10s | trunk passed |
| +1 | compile | 8m 32s | trunk passed |
| +1 | checkstyle | 0m 21s | trunk passed |
| +1 | mvnsite | 1m 33s | trunk passed |
| +1 | shadedclient | 13m 9s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 13s | trunk passed |
| +1 | javadoc | 1m 0s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 14s | the patch passed |
| +1 | compile | 7m 44s | the patch passed |
| +1 | javac | 7m 44s | the patch passed |
| +1 | checkstyle | 0m 19s | the patch passed |
| +1 | mvnsite | 1m 33s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 9s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 26s | the patch passed |
| +1 | javadoc | 1m 0s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 69m 46s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | unit | 11m 40s | hadoop-yarn-services-core in the patch passed. |
| +1 | asflicense | 0m 37s | The patch does not generate ASF License warnings. |
| | | 162m 10s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart |
| | hadoop.yarn.server.resourcemanager.scheduler.fair.policies.TestDominantResourceFairnessPolicy |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8579 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933440/YARN-8579.003.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 3056e195dbb5 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 59adeb8 |
| maven | v
[jira] [Commented] (YARN-8579) New AM attempt could not retrieve previous attempt component data
[ https://issues.apache.org/jira/browse/YARN-8579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16560530#comment-16560530 ]

genericqa commented on YARN-8579:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 21s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 13s | Maven dependency ordering for branch |
| +1 | mvninstall | 27m 11s | trunk passed |
| +1 | compile | 8m 27s | trunk passed |
| +1 | checkstyle | 0m 22s | trunk passed |
| +1 | mvnsite | 1m 39s | trunk passed |
| +1 | shadedclient | 13m 33s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 7s | trunk passed |
| +1 | javadoc | 1m 0s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 13s | the patch passed |
| +1 | compile | 7m 30s | the patch passed |
| +1 | javac | 7m 30s | the patch passed |
| +1 | checkstyle | 0m 20s | the patch passed |
| +1 | mvnsite | 1m 27s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 19s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 22s | the patch passed |
| +1 | javadoc | 0m 59s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 71m 24s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | unit | 12m 32s | hadoop-yarn-services-core in the patch passed. |
| +1 | asflicense | 0m 37s | The patch does not generate ASF License warnings. |
| | | 165m 37s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart |
| | hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8579 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933440/YARN-8579.003.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux f670b595b9ac 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2cccf40 |
| maven | version: Apache Maven
[jira] [Commented] (YARN-8579) New AM attempt could not retrieve previous attempt component data
[ https://issues.apache.org/jira/browse/YARN-8579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16560415#comment-16560415 ]

Gour Saha commented on YARN-8579:
---------------------------------

Thanks [~csingh] for the review. I uploaded 003 with your suggestion.

I do have one fundamental question, though. I don't understand why the assert below fails for the Fair scheduler (which means no NMTokens are sent over even with this patch). The method where I made the code change is a common method that is called by both the Fair and Capacity schedulers. Any idea? That's why I had to enable this assert for the Capacity scheduler only. I don't have a cluster setup where I can test FairScheduler.
{code}
if (getSchedulerType().equals(SchedulerType.CAPACITY)) {
  Assert.assertEquals(1, nmTokens.size());
  // container 3 is running on node 2
  Assert.assertEquals(nm2Address, nmTokens.get(0).getNodeId().toString());
}
{code}

> New AM attempt could not retrieve previous attempt component data
> -----------------------------------------------------------------
>
>                 Key: YARN-8579
>                 URL: https://issues.apache.org/jira/browse/YARN-8579
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 3.1.1
>            Reporter: Yesha Vora
>            Assignee: Gour Saha
>            Priority: Critical
>             Fix For: 3.2.0, 3.1.2
>
>         Attachments: YARN-8579.001.patch, YARN-8579.002.patch, YARN-8579.003.patch
>
>
> Steps:
> 1) Launch httpd-docker
> 2) Wait for app to be in STABLE state
> 3) Run validation for app (It takes around 3 mins)
> 4) Stop all ZKs
> 5) Wait 60 sec
> 6) Kill AM
> 7) Wait for 30 sec
> 8) Start all ZKs
> 9) Wait for application to finish
> 10) Validate expected containers of the app
> Expected behavior:
> New attempt of AM should start and docker containers launched by 1st attempt should be recovered by new attempt.
> Actual behavior:
> New AM attempt starts. It cannot recover 1st attempt docker containers. It cannot read component details from ZK.
> Thus, it starts new attempt for all containers.
> {code}
> 2018-07-19 22:42:47,595 [main] INFO service.ServiceScheduler - Registering appattempt_1531977563978_0015_02, fault-test-zkrm-httpd-docker into registry
> 2018-07-19 22:42:47,611 [main] INFO service.ServiceScheduler - Received 1 containers from previous attempt.
> 2018-07-19 22:42:47,642 [main] INFO service.ServiceScheduler - Could not read component paths: `/users/hrt-qa/services/yarn-service/fault-test-zkrm-httpd-docker/components': No such file or directory: KeeperErrorCode = NoNode for /registry/users/hrt-qa/services/yarn-service/fault-test-zkrm-httpd-docker/components
> 2018-07-19 22:42:47,643 [main] INFO service.ServiceScheduler - Handling container_e08_1531977563978_0015_01_03 from previous attempt
> 2018-07-19 22:42:47,643 [main] INFO service.ServiceScheduler - Record not found in registry for container container_e08_1531977563978_0015_01_03 from previous attempt, releasing
> 2018-07-19 22:42:47,649 [AMRM Callback Handler Thread] INFO impl.TimelineV2ClientImpl - Updated timeline service address to xxx:33019
> 2018-07-19 22:42:47,651 [main] INFO service.ServiceScheduler - Triggering initial evaluation of component httpd
> 2018-07-19 22:42:47,652 [main] INFO component.Component - [INIT COMPONENT httpd]: 2 instances.
> 2018-07-19 22:42:47,652 [main] INFO component.Component - [COMPONENT httpd] Requesting for 2 container(s)
> {code}
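The assertion discussed in the comment above checks that the NM token for the node still running a container reaches the new AM attempt. A minimal plain-Java sketch of that idea follows. `NmToken`, `TokenCache`, and `recover` are hypothetical stand-ins invented for illustration and are not Hadoop classes; the node addresses are made up, and `IllegalStateException` is used in place of the real exception type. Only the error text "No NMToken sent for" is taken from the AM log reported on this issue.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class NmTokenRecoverySketch {

    /** Hypothetical stand-in for an NM token: the node it is valid for plus an opaque secret. */
    static final class NmToken {
        private final String nodeAddress;
        private final String secret;

        NmToken(String nodeAddress, String secret) {
            this.nodeAddress = nodeAddress;
            this.secret = secret;
        }

        String nodeAddress() { return nodeAddress; }
        String secret() { return secret; }
    }

    /** Hypothetical per-AM token cache keyed by NodeManager address. */
    static final class TokenCache {
        private final Map<String, NmToken> tokens = new HashMap<>();

        void put(NmToken token) {
            tokens.put(token.nodeAddress(), token);
        }

        /** Returns the cached token, or fails with a message like the one in the AM log. */
        NmToken require(String nodeAddress) {
            NmToken token = tokens.get(nodeAddress);
            if (token == null) {
                throw new IllegalStateException("No NMToken sent for " + nodeAddress);
            }
            return token;
        }
    }

    /** Caches every token handed over for containers kept from earlier attempts. */
    static void recover(TokenCache cache, List<NmToken> fromPreviousAttempts) {
        for (NmToken token : fromPreviousAttempts) {
            cache.put(token);
        }
    }

    public static void main(String[] args) {
        TokenCache cache = new TokenCache();
        // Assumption: the surviving container runs on node 2, so one token arrives.
        recover(cache, List.of(new NmToken("node2:1234", "opaque")));
        System.out.println(cache.require("node2:1234").nodeAddress());
        try {
            cache.require("node1:1234"); // no token was recovered for node 1
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the list of recovered tokens is empty, as reported for the Fair scheduler case, every `require` call in this sketch fails, which mirrors why the new attempt could not query container status.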
[jira] [Commented] (YARN-8579) New AM attempt could not retrieve previous attempt component data
[ https://issues.apache.org/jira/browse/YARN-8579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16560414#comment-16560414 ]

genericqa commented on YARN-8579:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 19s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 14s | Maven dependency ordering for branch |
| +1 | mvninstall | 24m 29s | trunk passed |
| +1 | compile | 7m 40s | trunk passed |
| +1 | checkstyle | 0m 15s | trunk passed |
| +1 | mvnsite | 1m 19s | trunk passed |
| +1 | shadedclient | 11m 11s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 56s | trunk passed |
| +1 | javadoc | 0m 49s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 6s | the patch passed |
| +1 | compile | 6m 52s | the patch passed |
| +1 | javac | 6m 52s | the patch passed |
| +1 | checkstyle | 0m 16s | the patch passed |
| +1 | mvnsite | 1m 17s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 23s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 33s | the patch passed |
| +1 | javadoc | 0m 52s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 69m 17s | hadoop-yarn-server-resourcemanager in the patch failed. |
| -1 | unit | 12m 1s | hadoop-yarn-services-core in the patch failed. |
| +1 | asflicense | 0m 33s | The patch does not generate ASF License warnings. |
| | | 153m 59s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.fair.policies.TestDominantResourceFairnessPolicy |
| | hadoop.yarn.service.TestYarnNativeServices |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8579 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933419/YARN-8579.002.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 349032c56f6c 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 08:53:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b429f19 |
| maven | version: Apache Maven 3.3.9 |
| Defaul
[jira] [Commented] (YARN-8579) New AM attempt could not retrieve previous attempt component data
[ https://issues.apache.org/jira/browse/YARN-8579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16560218#comment-16560218 ]

Chandni Singh commented on YARN-8579:
-------------------------------------

[~gsaha] Thanks for debugging the issue. Patch 2 looks good to me. Just a nitpick: since we use slf4j, the log statement can use a {} placeholder instead of string concatenation. Change
{code:java}
LOG.info("Containers recovered after AM registered: ", containers);
{code}
to
{code:java}
LOG.info("Containers recovered after AM registered: {} ", containers);
{code}
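The effect of the `{}` placeholder in the review comment above can be illustrated without slf4j on the classpath. The sketch below uses a hypothetical `format` helper that imitates only the basic substitution rule (each `{}` is replaced by the next argument, surplus arguments are dropped); it is a simplified stand-in written for this example, not slf4j's actual formatter, and the container names are made up.

```java
import java.util.List;

final class PlaceholderLoggingSketch {

    /**
     * Hypothetical, simplified stand-in for slf4j-style formatting: each "{}"
     * is replaced by the next argument; arguments without a placeholder are ignored.
     */
    static String format(String pattern, Object... args) {
        StringBuilder out = new StringBuilder();
        int from = 0;
        int argIndex = 0;
        while (true) {
            int at = pattern.indexOf("{}", from);
            if (at < 0 || argIndex >= args.length) {
                // No more placeholders or arguments: copy the rest verbatim.
                out.append(pattern, from, pattern.length());
                return out.toString();
            }
            out.append(pattern, from, at).append(args[argIndex++]);
            from = at + 2;
        }
    }

    public static void main(String[] args) {
        List<String> containers = List.of("container_1", "container_2");
        // Without a placeholder the argument never reaches the message.
        System.out.println(format("Containers recovered after AM registered: ", containers));
        // With a placeholder the list is rendered into the message.
        System.out.println(format("Containers recovered after AM registered: {}", containers));
    }
}
```

The practical point is that the first form silently loses the container list, while the second includes it and still avoids building the string when the log level is disabled.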
[jira] [Commented] (YARN-8579) New AM attempt could not retrieve previous attempt component data
[ https://issues.apache.org/jira/browse/YARN-8579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16560197#comment-16560197 ]

Gour Saha commented on YARN-8579:
---------------------------------

[~csingh], please review the patch when you get a chance.
[jira] [Commented] (YARN-8579) New AM attempt could not retrieve previous attempt component data
[ https://issues.apache.org/jira/browse/YARN-8579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16560193#comment-16560193 ]

Gour Saha commented on YARN-8579:
---------------------------------

Uploaded 002 with a few more asserts in the test.
[jira] [Commented] (YARN-8579) New AM attempt could not retrieve previous attempt component data
[ https://issues.apache.org/jira/browse/YARN-8579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16559242#comment-16559242 ]

genericqa commented on YARN-8579:
---------------------------------

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 18m 34s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 1m 7s | Maven dependency ordering for branch |
| +1 | mvninstall | 26m 50s | trunk passed |
| +1 | compile | 8m 18s | trunk passed |
| +1 | checkstyle | 0m 23s | trunk passed |
| +1 | mvnsite | 1m 32s | trunk passed |
| +1 | shadedclient | 13m 25s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 7s | trunk passed |
| +1 | javadoc | 0m 59s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 12s | the patch passed |
| +1 | compile | 7m 29s | the patch passed |
| +1 | javac | 7m 29s | the patch passed |
| +1 | checkstyle | 0m 19s | the patch passed |
| +1 | mvnsite | 1m 28s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 13s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 22s | the patch passed |
| +1 | javadoc | 0m 57s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 69m 2s | hadoop-yarn-server-resourcemanager in the patch passed. |
| +1 | unit | 11m 40s | hadoop-yarn-services-core in the patch passed. |
| +1 | asflicense | 0m 39s | The patch does not generate ASF License warnings. |
| | | 180m 49s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8579 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933296/YARN-8579.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 5fa995b24f28 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 8d3c068 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21389/testReport/ |
| Max. process+thread count | 850 (vs. uli
[jira] [Commented] (YARN-8579) New AM attempt could not retrieve previous attempt component data
[ https://issues.apache.org/jira/browse/YARN-8579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16559122#comment-16559122 ]

Gour Saha commented on YARN-8579:
---------------------------------

Uploading patch 001 with a fix that I successfully tested in my cluster.
[jira] [Commented] (YARN-8579) New AM attempt could not retrieve previous attempt component data
[ https://issues.apache.org/jira/browse/YARN-8579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16559121#comment-16559121 ]

Gour Saha commented on YARN-8579:
---------------------------------

I investigated this issue and found that the root cause is missing NM tokens for the containers that are passed to the AM after registration via the onContainersReceivedFromPreviousAttempts callback. Sending these tokens became necessary with the change made in YARN-6168. The exception seen in the AM log is below:
{code}
2018-07-26 23:22:31,373 [pool-5-thread-4] ERROR instance.ComponentInstance - [COMPINSTANCE httpd-proxy-0 : container_e15_1532637883791_0001_01_04] Failed to get container status on ctr-e138-1518143905142-412155-01-05.hwx.site:25454, will try again
org.apache.hadoop.security.token.SecretManager$InvalidToken: No NMToken sent for ctr-e138-1518143905142-412155-01-05.hwx.site:25454
    at org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:262)
    at org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.<init>(ContainerManagementProtocolProxy.java:252)
    at org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:137)
    at org.apache.hadoop.yarn.client.api.impl.NMClientImpl.getContainerStatus(NMClientImpl.java:323)
    at org.apache.hadoop.yarn.service.component.instance.ComponentInstance$ContainerStatusRetriever.run(ComponentInstance.java:596)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}
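
For readers following the root-cause comment above, here is a minimal sketch of why a status poll fails when no NM token was handed over with a container from the previous attempt. It is a toy model in plain Java, not the Hadoop code and not the YARN-8579 patch: the classes RecoveredContainer and TokenCache, the string-valued tokens, and the host name are all hypothetical stand-ins. Only the callback name and the "No NMToken sent for" message are taken from the thread.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the AM-side NM token lookup; all types here are hypothetical. */
public class NmTokenRecoverySketch {

    /** Stand-in for a container handed over from the previous AM attempt. */
    static final class RecoveredContainer {
        final String containerId;
        final String nodeAddress;

        RecoveredContainer(String containerId, String nodeAddress) {
            this.containerId = containerId;
            this.nodeAddress = nodeAddress;
        }
    }

    /** Stand-in for a client-side token cache keyed by NodeManager address. */
    static final class TokenCache {
        private final Map<String, String> tokensByNode = new HashMap<>();

        void setToken(String nodeAddress, String token) {
            tokensByNode.put(nodeAddress, token);
        }

        boolean containsToken(String nodeAddress) {
            return tokensByNode.containsKey(nodeAddress);
        }
    }

    /** Mimics the status poll: without a cached token for the node, it is rejected. */
    static String getContainerStatus(TokenCache cache, RecoveredContainer container) {
        if (!cache.containsToken(container.nodeAddress)) {
            throw new IllegalStateException("No NMToken sent for " + container.nodeAddress);
        }
        return "RUNNING";
    }

    /** Hand-over step: cache whatever tokens arrive together with the containers. */
    static void onContainersReceivedFromPreviousAttempts(
            TokenCache cache,
            List<RecoveredContainer> containers,
            Map<String, String> nmTokensByNode) {
        for (RecoveredContainer container : containers) {
            String token = nmTokensByNode.get(container.nodeAddress);
            if (token != null) {
                cache.setToken(container.nodeAddress, token);
            }
        }
    }

    public static void main(String[] args) {
        TokenCache cache = new TokenCache();
        RecoveredContainer container =
                new RecoveredContainer("container_example_01", "nm-host.example:25454");

        // Before any token is handed over, the poll is rejected.
        try {
            getContainerStatus(cache, container);
        } catch (IllegalStateException e) {
            System.out.println("before hand-over: " + e.getMessage());
        }

        // Once the token arrives with the recovered container, the poll succeeds.
        Map<String, String> tokens = new HashMap<>();
        tokens.put(container.nodeAddress, "opaque-token");
        onContainersReceivedFromPreviousAttempts(
                cache, Collections.singletonList(container), tokens);
        System.out.println("after hand-over: " + getContainerStatus(cache, container));
    }
}
```

In this model the fix amounts to making sure the token map passed alongside the recovered containers is not empty. How the real RM and AM exchange those tokens is described by the patch itself, not by this sketch.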