[jira] [Commented] (YARN-7770) Support for setting application priority for Distributed Shell jobs
[ https://issues.apache.org/jira/browse/YARN-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352040#comment-16352040 ]

Sunil G commented on YARN-7770:
-------------------------------

Yes, it's already available in the DS help message.

> Support for setting application priority for Distributed Shell jobs
> -------------------------------------------------------------------
>
>          Key: YARN-7770
>          URL: https://issues.apache.org/jira/browse/YARN-7770
>      Project: Hadoop YARN
>   Issue Type: Bug
>   Components: applications/distributed-shell
>     Reporter: Charan Hebri
>     Assignee: Sunil G
>     Priority: Major
>
> Currently there isn't a way to submit a Distributed Shell job with an
> application priority, the way it is done for MapReduce jobs via the property
> {noformat}
> mapred.job.priority
> {noformat}
> Creating this issue to track support for the same.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
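Per the comment above, a priority option is already listed in the Distributed Shell client's help output. A hedged usage sketch follows; the jar path, shell command, and priority value are placeholders, not taken from this thread:

```shell
# Sketch: submit a Distributed Shell job with an application priority.
# DSHELL_JAR is a placeholder for the distributedshell jar shipped with
# your Hadoop install; -priority takes an integer (higher = more urgent,
# subject to what the scheduler/queue allows).
DSHELL_JAR="$HADOOP_HOME"/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-*.jar
yarn jar $DSHELL_JAR org.apache.hadoop.yarn.applications.distributedshell.Client \
  -jar $DSHELL_JAR \
  -shell_command "sleep 60" \
  -priority 10
```

This is a command-line fragment that requires a running YARN cluster; run `yarn jar $DSHELL_JAR org.apache.hadoop.yarn.applications.distributedshell.Client -help` to confirm the exact flag name on your version.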
[jira] [Resolved] (YARN-7770) Support for setting application priority for Distributed Shell jobs
[ https://issues.apache.org/jira/browse/YARN-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sunil G resolved YARN-7770.
---------------------------
Resolution: Not A Bug

> Support for setting application priority for Distributed Shell jobs
> -------------------------------------------------------------------
>
>          Key: YARN-7770
>          URL: https://issues.apache.org/jira/browse/YARN-7770
>      Project: Hadoop YARN
>   Issue Type: Bug
>   Components: applications/distributed-shell
>     Reporter: Charan Hebri
>     Assignee: Sunil G
>     Priority: Major
>
> Currently there isn't a way to submit a Distributed Shell job with an
> application priority, the way it is done for MapReduce jobs via the property
> {noformat}
> mapred.job.priority
> {noformat}
> Creating this issue to track support for the same.
[jira] [Commented] (YARN-7292) Revisit Resource Profile Behavior
[ https://issues.apache.org/jira/browse/YARN-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352038#comment-16352038 ]

Sunil G commented on YARN-7292:
-------------------------------

Hi [~leftnoteasy]. Thanks for the patch. Generally the approach seems fine. I will help with clearing jenkins etc. today.

> Revisit Resource Profile Behavior
> ---------------------------------
>
>          Key: YARN-7292
>          URL: https://issues.apache.org/jira/browse/YARN-7292
>      Project: Hadoop YARN
>   Issue Type: Sub-task
>   Components: nodemanager, resourcemanager
>     Reporter: Wangda Tan
>     Assignee: Wangda Tan
>     Priority: Blocker
>  Attachments: YARN-7292.002.patch, YARN-7292.003.patch, YARN-7292.wip.001.patch
>
> Had discussions with [~templedf], [~vvasudev], [~sunilg] offline. There are a
> couple of resource-profile-related behaviors that might need to be updated:
> 1) Configure resource profiles on the server side or the client side.
> Currently resource profiles can only be configured centrally:
> - Advantages: a given resource profile has the same meaning across the
> cluster; it won't change when we run different apps with different
> configurations. A job that runs under Amazon's G2.8X can also run on YARN
> with a G2.8X profile. A side benefit is that the YARN scheduler can
> potentially do better bin packing.
> - Disadvantages: it is hard for applications to add their own resource
> profiles.
> 2) Do we really need mandatory resource profiles such as
> minimum/maximum/default?
> 3) Should we send the resource profile name inside ResourceRequest, or should
> the client/AM translate it to a resource and set it in the existing resource
> fields?
> 4) Related to the above, should we allow resource overrides, or should the
> client/AM send the final resource to the RM?
[jira] [Comment Edited] (YARN-6510) Fix procfs stat file warning caused by process names that include parentheses
[ https://issues.apache.org/jira/browse/YARN-6510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352002#comment-16352002 ]

Cholju Paek edited comment on YARN-6510 at 2/5/18 4:06 AM:
-----------------------------------------------------------

Hello. One of my customers is experiencing the same issue and needs a patch. I have one question: does this bug cause functional problems? For example, could a job fail or data be dropped? We are being very careful about applying the patch because the site is very sensitive to patches. Could you answer that?

was (Author: cholju73):
Hello. One of my customers is experiencing the same issue. They are currently using Cloudera CDH, and a patch is required because of this bug. Does this bug cause functional problems? We are being very careful about applying the patch because the site is very sensitive to patches. Could you answer that?

> Fix procfs stat file warning caused by process names that include parentheses
> -----------------------------------------------------------------------------
>
>              Key: YARN-6510
>              URL: https://issues.apache.org/jira/browse/YARN-6510
>          Project: Hadoop YARN
>       Issue Type: Bug
> Affects Versions: 2.8.0, 3.0.0-alpha2
>         Reporter: Wilfred Spiegelenburg
>         Assignee: Wilfred Spiegelenburg
>         Priority: Major
>          Fix For: 2.9.0, 3.0.0-alpha4
>      Attachments: YARN-6510.01.patch
>
> Even with the fix for YARN-3344 we still have issues with the procfs format.
> This is the case that is causing issues:
> {code}
> [user@nm1 ~]$ cat /proc/2406/stat
> 2406 (ib_fmr(mlx4_0)) S 2 0 0 0 -1 2149613632 0 0 0 0 166 126908 0 0 20 0 1 0
> 4284 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615
> 0 0 17 6 0 0 0 0 0
> {code}
> We do not handle parentheses in the name, which causes the pattern
> matching to fail.
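The stat line quoted above shows the failure mode: the second field (the process name) is itself parenthesized and may contain nested parentheses, so a naive regex or whitespace split misparses it. A hedged sketch of a robust parse follows; this is illustrative code, not the actual Hadoop ProcfsBasedProcessTree implementation:

```java
// Sketch (hypothetical, not the Hadoop code): the comm field in
// /proc/<pid>/stat is wrapped in parentheses and may contain parentheses
// itself, e.g. "(ib_fmr(mlx4_0))". Scanning for the first '(' and the
// LAST ')' recovers the name regardless of its contents.
public class ProcStatParser {
    public static String processName(String statLine) {
        int open = statLine.indexOf('(');
        int close = statLine.lastIndexOf(')');
        if (open < 0 || close < open) {
            throw new IllegalArgumentException("malformed stat line: " + statLine);
        }
        return statLine.substring(open + 1, close);
    }

    // The remaining stat fields start right after the closing parenthesis.
    public static String[] fieldsAfterName(String statLine) {
        int close = statLine.lastIndexOf(')');
        return statLine.substring(close + 1).trim().split("\\s+");
    }

    public static void main(String[] args) {
        String line = "2406 (ib_fmr(mlx4_0)) S 2 0 0 0 -1 2149613632 0 0 0 0 166 126908";
        System.out.println(processName(line));         // ib_fmr(mlx4_0)
        System.out.println(fieldsAfterName(line)[0]);  // S = process state
    }
}
```

The design choice: because the name is the only field that can contain arbitrary characters, anchoring on the last `)` keeps the rest of the line a plain whitespace-separated record.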
[jira] [Commented] (YARN-7892) NodeAttributePBImpl does not implement hashCode and equals properly
[ https://issues.apache.org/jira/browse/YARN-7892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352009#comment-16352009 ]

Sunil G commented on YARN-7892:
-------------------------------

[~Naganarasimha] This looks fine. For NodeAttributeType we are using an enum, so I think a separate compare may not be needed for each type.

> NodeAttributePBImpl does not implement hashCode and equals properly
> -------------------------------------------------------------------
>
>          Key: YARN-7892
>          URL: https://issues.apache.org/jira/browse/YARN-7892
>      Project: Hadoop YARN
>   Issue Type: Sub-task
>   Components: resourcemanager
>     Reporter: Naganarasimha G R
>     Assignee: Naganarasimha G R
>     Priority: Major
>  Attachments: YARN-7892-YARN-3409.001.patch
>
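The comment above notes that since the attribute type is an enum, no per-type comparison logic is needed. A hedged sketch of what a field-based hashCode/equals pair for such an attribute could look like; this is illustrative, not the actual NodeAttributePBImpl, and the field names are assumptions:

```java
import java.util.Objects;

// Hypothetical sketch, not NodeAttributePBImpl: equals/hashCode over the
// attribute's fields. Because the type is an enum, '==' comparison is
// sufficient (enum constants are singletons), matching the comment above.
public class NodeAttributeKey {
    public enum AttributeType { STRING }  // stand-in for NodeAttributeType

    private final String prefix;
    private final String name;
    private final AttributeType type;
    private final String value;

    public NodeAttributeKey(String prefix, String name, AttributeType type, String value) {
        this.prefix = prefix;
        this.name = name;
        this.type = type;
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof NodeAttributeKey)) return false;
        NodeAttributeKey other = (NodeAttributeKey) o;
        return Objects.equals(prefix, other.prefix)
            && Objects.equals(name, other.name)
            && type == other.type                 // enum: reference compare
            && Objects.equals(value, other.value);
    }

    @Override
    public int hashCode() {
        // Must hash the same fields equals() compares to keep the contract.
        return Objects.hash(prefix, name, type, value);
    }

    public static void main(String[] args) {
        NodeAttributeKey a = new NodeAttributeKey("nm.yarn.io", "os", AttributeType.STRING, "linux");
        NodeAttributeKey b = new NodeAttributeKey("nm.yarn.io", "os", AttributeType.STRING, "linux");
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode());  // true
    }
}
```

Keeping equals and hashCode over the identical field set is what makes instances usable as map/set keys, which is the practical point of the JIRA title.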
[jira] [Commented] (YARN-6510) Fix procfs stat file warning caused by process names that include parentheses
[ https://issues.apache.org/jira/browse/YARN-6510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352002#comment-16352002 ]

Cholju Paek commented on YARN-6510:
-----------------------------------

Hello. One of my customers is experiencing the same issue. They are currently using Cloudera CDH, and a patch is required because of this bug. Does this bug cause functional problems? We are being very careful about applying the patch because the site is very sensitive to patches. Could you answer that?

> Fix procfs stat file warning caused by process names that include parentheses
> -----------------------------------------------------------------------------
>
>              Key: YARN-6510
>              URL: https://issues.apache.org/jira/browse/YARN-6510
>          Project: Hadoop YARN
>       Issue Type: Bug
> Affects Versions: 2.8.0, 3.0.0-alpha2
>         Reporter: Wilfred Spiegelenburg
>         Assignee: Wilfred Spiegelenburg
>         Priority: Major
>          Fix For: 2.9.0, 3.0.0-alpha4
>      Attachments: YARN-6510.01.patch
>
> Even with the fix for YARN-3344 we still have issues with the procfs format.
> This is the case that is causing issues:
> {code}
> [user@nm1 ~]$ cat /proc/2406/stat
> 2406 (ib_fmr(mlx4_0)) S 2 0 0 0 -1 2149613632 0 0 0 0 166 126908 0 0 20 0 1 0
> 4284 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615
> 0 0 17 6 0 0 0 0 0
> {code}
> We do not handle parentheses in the name, which causes the pattern
> matching to fail.
[jira] [Resolved] (YARN-7880) CapacityScheduler$ResourceCommitterService throws NPE when running sls
[ https://issues.apache.org/jira/browse/YARN-7880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jiandan Yang resolved YARN-7880.
--------------------------------
Resolution: Duplicate
Assignee: Jiandan Yang
Fix Version/s: 3.0.0

> CapacityScheduler$ResourceCommitterService throws NPE when running sls
> ----------------------------------------------------------------------
>
>              Key: YARN-7880
>              URL: https://issues.apache.org/jira/browse/YARN-7880
>          Project: Hadoop YARN
>       Issue Type: Bug
>       Components: yarn
> Affects Versions: 3.0.0
>         Reporter: Jiandan Yang
>         Assignee: Jiandan Yang
>         Priority: Major
>          Fix For: 3.0.0
>
> SLS test case: node count = 9000, job count = 10k, tasks per job = 500, task
> run time = 100s. The NPE does not occur with node counts of 500 or 2000.
> {code}
> 18/02/02 20:54:28 INFO rmcontainer.RMContainerImpl: container_1517575125794_5707_01_86 Container Transitioned from ACQUIRED to RUNNING
> java.lang.NullPointerException
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.commonCheckContainerAllocation(FiCaSchedulerApp.java:324)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.accept(FiCaSchedulerApp.java:420)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2506)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:541)
> {code}
> Some CapacityScheduler$AsyncScheduleThread threads also throw an NPE:
> {code}
> 18/02/02 20:40:34 INFO resourcemanager.DefaultAMSProcessor: AM registration appattempt_1517575125794_4564_01
> 18/02/02 20:40:34 INFO resourcemanager.RMAuditLogger: USER=default OPERATION=Register App Master TARGET=ApplicationMasterService RESULT=SUCCESS APPID=application_1517575125794_4564 APPATTEMPTID=appattempt_1517575125794_4564_01
> Exception in thread "Thread-43" 18/02/02 20:40:34 INFO appmaster.AMSimulator: Register the application master for application application_1517575125794_4564
> 18/02/02 20:40:34 INFO resourcemanager.MockAMLauncher: Notify AM launcher launched:container_1517575125794_4564_01_01
> 18/02/02 20:40:34 INFO rmcontainer.RMContainerImpl: container_1517575125794_2703_01_01 Container Transitioned from ACQUIRED to RUNNING
> 18/02/02 20:40:34 INFO attempt.RMAppAttemptImpl: appattempt_1517575125794_4564_01 State change from ALLOCATED to LAUNCHED on event = LAUNCHED
> 18/02/02 20:40:34 INFO attempt.RMAppAttemptImpl: appattempt_1517575125794_4564_01 State change from LAUNCHED to RUNNING on event = REGISTERED
> 18/02/02 20:40:34 INFO rmapp.RMAppImpl: application_1517575125794_4564 State change from ACCEPTED to RUNNING on event = ATTEMPT_REGISTERED
> java.lang.NullPointerException
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceRequests(SchedulerApplicationAttempt.java:1341)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.canAssign(RegularContainerAllocator.java:302)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignOffSwitchContainers(RegularContainerAllocator.java:389)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignContainersOnNode(RegularContainerAllocator.java:470)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.tryAllocateOnNode(RegularContainerAllocator.java:252)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.allocate(RegularContainerAllocator.java:816)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignContainers(RegularContainerAllocator.java:854)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.ContainerAllocator.assignContainers(ContainerAllocator.java:54)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.assignContainers(FiCaSchedulerApp.java:856)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:735)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:559)
[jira] [Commented] (YARN-7757) Refactor NodeLabelsProvider to be more generic and reusable for node attributes providers
[ https://issues.apache.org/jira/browse/YARN-7757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351991#comment-16351991 ]

Weiwei Yang commented on YARN-7757:
-----------------------------------

Thanks [~Naganarasimha], [~sunilg] for all the reviews and comments!

> Refactor NodeLabelsProvider to be more generic and reusable for node
> attributes providers
> --------------------------------------------------------------------
>
>          Key: YARN-7757
>          URL: https://issues.apache.org/jira/browse/YARN-7757
>      Project: Hadoop YARN
>   Issue Type: Sub-task
>   Components: nodemanager
>     Reporter: Weiwei Yang
>     Assignee: Weiwei Yang
>     Priority: Blocker
>      Fix For: yarn-3409
>  Attachments: YARN-7757-YARN-3409.001.patch,
>               YARN-7757-YARN-3409.002.patch, YARN-7757-YARN-3409.003.patch,
>               YARN-7757-YARN-3409.004.patch, YARN-7757-YARN-3409.005.patch,
>               YARN-7757-YARN-3409.006.patch,
>               nodeLabelsProvider_refactor_class_hierarchy.pdf,
>               nodeLabelsProvider_refactor_v2.pdf, nodeLabelsProvider_refactor_v3.pdf
>
> Proposing to refactor {{NodeLabelsProvider}} and
> {{AbstractNodeLabelsProvider}} to be more generic, so that node-attribute
> providers can reuse these interfaces/abstract classes.
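The refactoring idea quoted above, generalizing a labels-specific provider so attribute providers can share it, can be sketched roughly as follows. This is an illustrative shape only, with invented class and method names, not the actual Hadoop classes from the patches:

```java
import java.util.Collections;
import java.util.Set;
import java.util.TimerTask;

// Hypothetical sketch of the refactoring direction (not the real Hadoop
// code): hoist the shared caching/polling machinery into a generic base
// so both node-label and node-attribute providers reuse it, each only
// supplying its own fetch logic.
abstract class NodeDescriptorProvider<T> {
    private volatile Set<T> descriptors = Collections.emptySet();

    // Subclasses decide where descriptors come from (script, config, ...).
    protected abstract Set<T> fetchDescriptors();

    // Shared timer logic: refresh the cached descriptor set periodically.
    public TimerTask pollingTask() {
        return new TimerTask() {
            @Override public void run() { descriptors = fetchDescriptors(); }
        };
    }

    public Set<T> getDescriptors() { return descriptors; }
}

public class NodeDescriptorProviderDemo {
    public static void main(String[] args) {
        // A "labels" provider is then just one concrete fetch strategy.
        NodeDescriptorProvider<String> labels = new NodeDescriptorProvider<String>() {
            @Override protected Set<String> fetchDescriptors() {
                return Collections.singleton("GPU");
            }
        };
        labels.pollingTask().run();  // simulate one refresh cycle
        System.out.println(labels.getDescriptors());
    }
}
```

The point of the generic parameter is that labels (plain strings) and attributes (name/type/value tuples) differ only in the element type, so all scheduling and caching code can live in one place.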
[jira] [Commented] (YARN-7880) CapacityScheduler$ResourceCommitterService throws NPE when running sls
[ https://issues.apache.org/jira/browse/YARN-7880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351990#comment-16351990 ]

Jiandan Yang commented on YARN-7880:
------------------------------------

Duplicate of YARN-7591.

> CapacityScheduler$ResourceCommitterService throws NPE when running sls
> ----------------------------------------------------------------------
>
>              Key: YARN-7880
>              URL: https://issues.apache.org/jira/browse/YARN-7880
>          Project: Hadoop YARN
>       Issue Type: Bug
>       Components: yarn
> Affects Versions: 3.0.0
>         Reporter: Jiandan Yang
>         Priority: Major
>
> SLS test case: node count = 9000, job count = 10k, tasks per job = 500, task
> run time = 100s. The NPE does not occur with node counts of 500 or 2000.
> {code}
> 18/02/02 20:54:28 INFO rmcontainer.RMContainerImpl: container_1517575125794_5707_01_86 Container Transitioned from ACQUIRED to RUNNING
> java.lang.NullPointerException
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.commonCheckContainerAllocation(FiCaSchedulerApp.java:324)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.accept(FiCaSchedulerApp.java:420)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2506)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:541)
> {code}
> Some CapacityScheduler$AsyncScheduleThread threads also throw an NPE:
> {code}
> 18/02/02 20:40:34 INFO resourcemanager.DefaultAMSProcessor: AM registration appattempt_1517575125794_4564_01
> 18/02/02 20:40:34 INFO resourcemanager.RMAuditLogger: USER=default OPERATION=Register App Master TARGET=ApplicationMasterService RESULT=SUCCESS APPID=application_1517575125794_4564 APPATTEMPTID=appattempt_1517575125794_4564_01
> Exception in thread "Thread-43" 18/02/02 20:40:34 INFO appmaster.AMSimulator: Register the application master for application application_1517575125794_4564
> 18/02/02 20:40:34 INFO resourcemanager.MockAMLauncher: Notify AM launcher launched:container_1517575125794_4564_01_01
> 18/02/02 20:40:34 INFO rmcontainer.RMContainerImpl: container_1517575125794_2703_01_01 Container Transitioned from ACQUIRED to RUNNING
> 18/02/02 20:40:34 INFO attempt.RMAppAttemptImpl: appattempt_1517575125794_4564_01 State change from ALLOCATED to LAUNCHED on event = LAUNCHED
> 18/02/02 20:40:34 INFO attempt.RMAppAttemptImpl: appattempt_1517575125794_4564_01 State change from LAUNCHED to RUNNING on event = REGISTERED
> 18/02/02 20:40:34 INFO rmapp.RMAppImpl: application_1517575125794_4564 State change from ACCEPTED to RUNNING on event = ATTEMPT_REGISTERED
> java.lang.NullPointerException
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceRequests(SchedulerApplicationAttempt.java:1341)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.canAssign(RegularContainerAllocator.java:302)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignOffSwitchContainers(RegularContainerAllocator.java:389)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignContainersOnNode(RegularContainerAllocator.java:470)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.tryAllocateOnNode(RegularContainerAllocator.java:252)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.allocate(RegularContainerAllocator.java:816)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignContainers(RegularContainerAllocator.java:854)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.ContainerAllocator.assignContainers(ContainerAllocator.java:54)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.assignContainers(FiCaSchedulerApp.java:856)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:735)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:559)
[jira] [Updated] (YARN-7880) CapacityScheduler$ResourceCommitterService throws NPE when running sls
[ https://issues.apache.org/jira/browse/YARN-7880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jiandan Yang updated YARN-7880:
-------------------------------
Description:
SLS test case: node count = 9000, job count = 10k, tasks per job = 500, task run time = 100s. The NPE does not occur with node counts of 500 or 2000.
{code}
18/02/02 20:54:28 INFO rmcontainer.RMContainerImpl: container_1517575125794_5707_01_86 Container Transitioned from ACQUIRED to RUNNING
java.lang.NullPointerException
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.commonCheckContainerAllocation(FiCaSchedulerApp.java:324)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.accept(FiCaSchedulerApp.java:420)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2506)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:541)
{code}
Some CapacityScheduler$AsyncScheduleThread threads also throw an NPE:
{code}
18/02/02 20:40:34 INFO resourcemanager.DefaultAMSProcessor: AM registration appattempt_1517575125794_4564_01
18/02/02 20:40:34 INFO resourcemanager.RMAuditLogger: USER=default OPERATION=Register App Master TARGET=ApplicationMasterService RESULT=SUCCESS APPID=application_1517575125794_4564 APPATTEMPTID=appattempt_1517575125794_4564_01
Exception in thread "Thread-43" 18/02/02 20:40:34 INFO appmaster.AMSimulator: Register the application master for application application_1517575125794_4564
18/02/02 20:40:34 INFO resourcemanager.MockAMLauncher: Notify AM launcher launched:container_1517575125794_4564_01_01
18/02/02 20:40:34 INFO rmcontainer.RMContainerImpl: container_1517575125794_2703_01_01 Container Transitioned from ACQUIRED to RUNNING
18/02/02 20:40:34 INFO attempt.RMAppAttemptImpl: appattempt_1517575125794_4564_01 State change from ALLOCATED to LAUNCHED on event = LAUNCHED
18/02/02 20:40:34 INFO attempt.RMAppAttemptImpl: appattempt_1517575125794_4564_01 State change from LAUNCHED to RUNNING on event = REGISTERED
18/02/02 20:40:34 INFO rmapp.RMAppImpl: application_1517575125794_4564 State change from ACCEPTED to RUNNING on event = ATTEMPT_REGISTERED
java.lang.NullPointerException
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceRequests(SchedulerApplicationAttempt.java:1341)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.canAssign(RegularContainerAllocator.java:302)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignOffSwitchContainers(RegularContainerAllocator.java:389)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignContainersOnNode(RegularContainerAllocator.java:470)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.tryAllocateOnNode(RegularContainerAllocator.java:252)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.allocate(RegularContainerAllocator.java:816)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignContainers(RegularContainerAllocator.java:854)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.ContainerAllocator.assignContainers(ContainerAllocator.java:54)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.assignContainers(FiCaSchedulerApp.java:856)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:735)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:559)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1343)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1337)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1434)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1199)
[jira] [Updated] (YARN-7880) CapacityScheduler$ResourceCommitterService throws NPE when running sls
[ https://issues.apache.org/jira/browse/YARN-7880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jiandan Yang updated YARN-7880:
-------------------------------
Description:
SLS test case: node count = 9000, job count = 10k, tasks per job = 500, task run time = 100s. The NPE does not occur with node counts of 500 or 2000.
{code}
18/02/02 20:54:28 INFO rmcontainer.RMContainerImpl: container_1517575125794_5707_01_86 Container Transitioned from ACQUIRED to RUNNING
java.lang.NullPointerException
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.commonCheckContainerAllocation(FiCaSchedulerApp.java:324)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.accept(FiCaSchedulerApp.java:420)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2506)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:541)
{code}
Some CapacityScheduler$AsyncScheduleThread threads also throw an NPE:
{code}
java.lang.NullPointerException
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceRequests(SchedulerApplicationAttempt.java:1341)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.canAssign(RegularContainerAllocator.java:302)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignOffSwitchContainers(RegularContainerAllocator.java:389)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignContainersOnNode(RegularContainerAllocator.java:470)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.tryAllocateOnNode(RegularContainerAllocator.java:252)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.allocate(RegularContainerAllocator.java:816)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignContainers(RegularContainerAllocator.java:854)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.ContainerAllocator.assignContainers(ContainerAllocator.java:54)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.assignContainers(FiCaSchedulerApp.java:856)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:735)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:559)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1343)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1337)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1434)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1199)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.schedule(CapacityScheduler.java:474)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$AsyncScheduleThread.run(CapacityScheduler.java:501)
{code}

was:
SLS test case: node count = 9000, job count = 10k, tasks per job = 500, task run time = 100s
{code}
18/02/02 20:54:28 INFO rmcontainer.RMContainerImpl: container_1517575125794_5707_01_86 Container Transitioned from ACQUIRED to RUNNING
java.lang.NullPointerException
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.commonCheckContainerAllocation(FiCaSchedulerApp.java:324)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.accept(FiCaSchedulerApp.java:420)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2506)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:541)
{code}
Some CapacityScheduler$AsyncScheduleThread threads also throw an NPE:
{code}
java.lang.NullPointerException
[jira] [Updated] (YARN-7880) CapacityScheduler$ResourceCommitterService throws NPE when running sls
[ https://issues.apache.org/jira/browse/YARN-7880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jiandan Yang updated YARN-7880:
-------------------------------
Description:
SLS test case: node count = 9000, job count = 10k, tasks per job = 500, task run time = 100s
{code}
18/02/02 20:54:28 INFO rmcontainer.RMContainerImpl: container_1517575125794_5707_01_86 Container Transitioned from ACQUIRED to RUNNING
java.lang.NullPointerException
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.commonCheckContainerAllocation(FiCaSchedulerApp.java:324)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.accept(FiCaSchedulerApp.java:420)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2506)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:541)
{code}
Some CapacityScheduler$AsyncScheduleThread threads also throw an NPE:
{code}
java.lang.NullPointerException
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceRequests(SchedulerApplicationAttempt.java:1341)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.canAssign(RegularContainerAllocator.java:302)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignOffSwitchContainers(RegularContainerAllocator.java:389)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignContainersOnNode(RegularContainerAllocator.java:470)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.tryAllocateOnNode(RegularContainerAllocator.java:252)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.allocate(RegularContainerAllocator.java:816)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignContainers(RegularContainerAllocator.java:854)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.ContainerAllocator.assignContainers(ContainerAllocator.java:54)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.assignContainers(FiCaSchedulerApp.java:856)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:735)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:559)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1343)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1337)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1434)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1199)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.schedule(CapacityScheduler.java:474)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$AsyncScheduleThread.run(CapacityScheduler.java:501)
{code}

was:
SLS test case: node count = 9000, job count = 10k, tasks per job = 500, task run time = 100s
{code}
18/02/02 20:54:28 INFO rmcontainer.RMContainerImpl: container_1517575125794_5707_01_86 Container Transitioned from ACQUIRED to RUNNING
java.lang.NullPointerException
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.commonCheckContainerAllocation(FiCaSchedulerApp.java:324)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.accept(FiCaSchedulerApp.java:420)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2506)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:541)
{code}

> CapacityScheduler$ResourceCommitterService throws NPE when running sls
> ----------------------------------------------------------------------
>
>              Key: YARN-7880
>              URL:
[jira] [Updated] (YARN-7880) CapacityScheduler$ResourceCommitterService throws NPE when running sls
[ https://issues.apache.org/jira/browse/YARN-7880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiandan Yang updated YARN-7880: Component/s: yarn > CapacityScheduler$ResourceCommitterService throws NPE when running sls > -- > > Key: YARN-7880 > URL: https://issues.apache.org/jira/browse/YARN-7880 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.0.0 >Reporter: Jiandan Yang >Priority: Major > > sls test case: node count = 9000, job count=10k,task num of job = 500, task > run time = 100s > {code} > 18/02/02 20:54:28 INFO rmcontainer.RMContainerImpl: > container_1517575125794_5707_01_86 Container Transitioned from ACQUIRED > to RUNNING > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.commonCheckContainerAllocation(FiCaSchedulerApp.java:324) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.accept(FiCaSchedulerApp.java:420) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2506) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:541) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7880) CapacityScheduler$ResourceCommitterService throws NPE when running sls
[ https://issues.apache.org/jira/browse/YARN-7880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiandan Yang updated YARN-7880: Affects Version/s: 3.0.0 > CapacityScheduler$ResourceCommitterService throws NPE when running sls > -- > > Key: YARN-7880 > URL: https://issues.apache.org/jira/browse/YARN-7880 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Jiandan Yang >Priority: Major > > sls test case: node count = 9000, job count=10k,task num of job = 500, task > run time = 100s > {code} > 18/02/02 20:54:28 INFO rmcontainer.RMContainerImpl: > container_1517575125794_5707_01_86 Container Transitioned from ACQUIRED > to RUNNING > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.commonCheckContainerAllocation(FiCaSchedulerApp.java:324) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.accept(FiCaSchedulerApp.java:420) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2506) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:541) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7880) CapacityScheduler$ResourceCommitterService throws NPE when running sls
[ https://issues.apache.org/jira/browse/YARN-7880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiandan Yang updated YARN-7880: Description: sls test case: node count = 9000, job count=10k,task num of job = 500, task run time = 100s {code} 18/02/02 20:54:28 INFO rmcontainer.RMContainerImpl: container_1517575125794_5707_01_86 Container Transitioned from ACQUIRED to RUNNING java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.commonCheckContainerAllocation(FiCaSchedulerApp.java:324) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.accept(FiCaSchedulerApp.java:420) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2506) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:541) {code} was: {code} 18/02/02 20:54:28 INFO rmcontainer.RMContainerImpl: container_1517575125794_5707_01_86 Container Transitioned from ACQUIRED to RUNNING java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.commonCheckContainerAllocation(FiCaSchedulerApp.java:324) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.accept(FiCaSchedulerApp.java:420) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2506) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:541) {code} > CapacityScheduler$ResourceCommitterService throws NPE when running sls > -- > > Key: YARN-7880 > URL: https://issues.apache.org/jira/browse/YARN-7880 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jiandan Yang >Priority: Major > > sls test case: node count = 9000, job count=10k,task num of job = 
500, task > run time = 100s > {code} > 18/02/02 20:54:28 INFO rmcontainer.RMContainerImpl: > container_1517575125794_5707_01_86 Container Transitioned from ACQUIRED > to RUNNING > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.commonCheckContainerAllocation(FiCaSchedulerApp.java:324) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.accept(FiCaSchedulerApp.java:420) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2506) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:541) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7880) CapacityScheduler$ResourceCommitterService throws NPE when running sls
[ https://issues.apache.org/jira/browse/YARN-7880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiandan Yang updated YARN-7880: Summary: CapacityScheduler$ResourceCommitterService throws NPE when running sls (was: FiCaSchedulerApp.commonCheckContainerAllocation throws NPE when running sls) > CapacityScheduler$ResourceCommitterService throws NPE when running sls > -- > > Key: YARN-7880 > URL: https://issues.apache.org/jira/browse/YARN-7880 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jiandan Yang >Priority: Major > > {code} > 18/02/02 20:54:28 INFO rmcontainer.RMContainerImpl: > container_1517575125794_5707_01_86 Container Transitioned from ACQUIRED > to RUNNING > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.commonCheckContainerAllocation(FiCaSchedulerApp.java:324) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.accept(FiCaSchedulerApp.java:420) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2506) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:541) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
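[Editor's note] The two YARN-7880 stack traces point at scheduler state being dereferenced after another thread may have removed it: the async scheduling thread calls `getResourceRequests` on an application attempt, and the `ResourceCommitterService` validates an allocation, in both cases hitting a null. The standalone sketch below (class and method names are invented for illustration, not the actual Hadoop code) shows the defensive re-check pattern that avoids this class of NPE under concurrent removal:

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Standalone sketch (hypothetical names, not the real Hadoop classes) of the
 * race behind the reported NPEs: an async scheduling thread looks up an
 * application attempt that a commit/removal thread may already have deleted.
 * Re-checking the lookup result for null before dereferencing avoids the NPE.
 */
public class AsyncScheduleSketch {
    static class AppAttempt {
        final String id;
        AppAttempt(String id) { this.id = id; }
        String getResourceRequests() { return "requests-of-" + id; }
    }

    private final Map<String, AppAttempt> attempts = new ConcurrentHashMap<>();

    void addAttempt(String id) { attempts.put(id, new AppAttempt(id)); }
    void removeAttempt(String id) { attempts.remove(id); }

    /** Returns the requests, or empty if the attempt vanished concurrently. */
    Optional<String> tryGetRequests(String id) {
        AppAttempt attempt = attempts.get(id);   // may be null after removal
        if (attempt == null) {
            return Optional.empty();             // skip this allocation cycle instead of NPE
        }
        return Optional.of(attempt.getResourceRequests());
    }

    public static void main(String[] args) {
        AsyncScheduleSketch s = new AsyncScheduleSketch();
        s.addAttempt("appattempt_1");
        System.out.println(s.tryGetRequests("appattempt_1").orElse("skipped"));
        s.removeAttempt("appattempt_1");
        System.out.println(s.tryGetRequests("appattempt_1").orElse("skipped"));
    }
}
```

In the real scheduler the same idea applies at each point where the commit or async thread re-enters shared state: the lookup result must be re-validated, because the SLS run's high churn (10k jobs finishing) makes concurrent removal routine.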
[jira] [Commented] (YARN-7892) NodeAttributePBImpl does not implement hashcode and Equals properly
[ https://issues.apache.org/jira/browse/YARN-7892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351910#comment-16351910 ] Naganarasimha G R commented on YARN-7892: - [~sunil.gov...@gmail.com], hope this modification is fine? > NodeAttributePBImpl does not implement hashcode and Equals properly > --- > > Key: YARN-7892 > URL: https://issues.apache.org/jira/browse/YARN-7892 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Major > Attachments: YARN-7892-YARN-3409.001.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7892) NodeAttributePBImpl does not implement hashcode and Equals properly
[ https://issues.apache.org/jira/browse/YARN-7892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-7892: Attachment: YARN-7892-YARN-3409.001.patch > NodeAttributePBImpl does not implement hashcode and Equals properly > --- > > Key: YARN-7892 > URL: https://issues.apache.org/jira/browse/YARN-7892 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Major > Attachments: YARN-7892-YARN-3409.001.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7892) NodeAttributePBImpl does not implement hashcode and Equals properly
Naganarasimha G R created YARN-7892: --- Summary: NodeAttributePBImpl does not implement hashcode and Equals properly Key: YARN-7892 URL: https://issues.apache.org/jira/browse/YARN-7892 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Naganarasimha G R Assignee: Naganarasimha G R -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
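[Editor's note] For readers unfamiliar with the contract the YARN-7892 title refers to: `equals()` and `hashCode()` must be overridden together and derived from the same fields, otherwise equal objects can land in different hash buckets and sets/maps misbehave. A minimal sketch using a simplified attribute class (field names are illustrative, not the actual NodeAttributePBImpl):

```java
import java.util.Objects;

/**
 * Simplified sketch of a node attribute (fields are illustrative, not the
 * real NodeAttributePBImpl): equals() and hashCode() are overridden together
 * and computed from the same fields, as the equals/hashCode contract requires.
 */
public class NodeAttributeSketch {
    private final String prefix;
    private final String name;
    private final String value;

    public NodeAttributeSketch(String prefix, String name, String value) {
        this.prefix = prefix;
        this.name = name;
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof NodeAttributeSketch)) return false;
        NodeAttributeSketch other = (NodeAttributeSketch) o;
        return Objects.equals(prefix, other.prefix)
            && Objects.equals(name, other.name)
            && Objects.equals(value, other.value);
    }

    @Override
    public int hashCode() {
        // Must use exactly the same fields as equals(),
        // so that equal objects always hash equally.
        return Objects.hash(prefix, name, value);
    }
}
```

With both methods consistent, two logically equal attributes deduplicate correctly in a `HashSet`, which is the kind of behavior a broken implementation silently violates.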
[jira] [Commented] (YARN-7757) Refactor NodeLabelsProvider to be more generic and reusable for node attributes providers
[ https://issues.apache.org/jira/browse/YARN-7757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351899#comment-16351899 ] Naganarasimha G R commented on YARN-7757: - Thanks [~cheersyang]. Seems like many of the things we plan to track in other jiras. Hope we ensure that it's tracked and not missed. I am +1 on it. [~sunilg], as they were getting blocked on this, I have gone ahead and committed it. > Refactor NodeLabelsProvider to be more generic and reusable for node > attributes providers > - > > Key: YARN-7757 > URL: https://issues.apache.org/jira/browse/YARN-7757 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Blocker > Attachments: YARN-7757-YARN-3409.001.patch, > YARN-7757-YARN-3409.002.patch, YARN-7757-YARN-3409.003.patch, > YARN-7757-YARN-3409.004.patch, YARN-7757-YARN-3409.005.patch, > YARN-7757-YARN-3409.006.patch, > nodeLabelsProvider_refactor_class_hierarchy.pdf, > nodeLabelsProvider_refactor_v2.pdf, nodeLabelsProvider_refactor_v3.pdf > > > Propose to do refactor on {{NodeLabelsProvider}}, > {{AbstractNodeLabelsProvider}} to be more generic, so node attributes > providers can reuse these interface/abstract classes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
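[Editor's note] The YARN-7757 description proposes making the provider classes generic so node-attribute providers can reuse them. One way such a refactor can look (class and method names here are illustrative, not the committed Hadoop API) is a provider abstraction parameterized on the descriptor type, with label providers as a `String`-typed specialization:

```java
import java.util.Set;

/**
 * Sketch of the refactoring direction described in YARN-7757 (names are
 * illustrative, not the committed Hadoop API): a provider abstraction
 * parameterized on the descriptor type, so node-label and node-attribute
 * providers can share the fetch/refresh plumbing.
 */
abstract class NodeDescriptorsProvider<T> {
    private volatile Set<T> descriptors;

    /** Subclasses fetch labels or attributes from a script, config file, etc. */
    protected abstract Set<T> fetchDescriptors();

    /** Shared plumbing: re-fetch and publish the latest descriptors. */
    public void refresh() {
        descriptors = fetchDescriptors();
    }

    public Set<T> getDescriptors() {
        return descriptors;
    }
}

/** A label provider then becomes a String-typed specialization. */
class StaticLabelsProvider extends NodeDescriptorsProvider<String> {
    private final Set<String> labels;

    StaticLabelsProvider(Set<String> labels) {
        this.labels = labels;
    }

    @Override
    protected Set<String> fetchDescriptors() {
        return labels;
    }
}
```

An attribute provider would extend the same base with its attribute type as `T`, which is the reuse the refactor is after.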
[jira] [Commented] (YARN-7292) Revisit Resource Profile Behavior
[ https://issues.apache.org/jira/browse/YARN-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351826#comment-16351826 ] genericqa commented on YARN-7292: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 48s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 40s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 6s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in trunk has 1 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 22s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 40s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 49s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 5 new + 298 unchanged - 11 fixed = 303 total (was 309) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 21s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 42s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 27s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 38s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 8s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 8s{color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 77m 29s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 28m 24s{color} | {color:red} hadoop-yarn-client in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 12m 50s{color} | {color:red} hadoop-yarn-applications-distributedshell in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings.
[jira] [Commented] (YARN-7891) LogAggregationIndexedFileController should support HAR file
[ https://issues.apache.org/jira/browse/YARN-7891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351795#comment-16351795 ] genericqa commented on YARN-7891: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 16m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 28s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 5 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 33s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 8s{color} | {color:green} hadoop-yarn-common in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 63m 24s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7891 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12909138/YARN-7891.1.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle | | uname | Linux 2f188757c339 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 4e9a59c | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/19599/artifact/out/whitespace-eol.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19599/testReport/ | | Max. process+thread count | 410 (vs. ulimit of 5500) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common | | Console
[jira] [Updated] (YARN-7891) LogAggregationIndexedFileController should support HAR file
[ https://issues.apache.org/jira/browse/YARN-7891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-7891: Attachment: YARN-7891.1.patch > LogAggregationIndexedFileController should support HAR file > --- > > Key: YARN-7891 > URL: https://issues.apache.org/jira/browse/YARN-7891 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Xuan Gong >Priority: Major > Attachments: YARN-7891.1.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7292) Revisit Resource Profile Behavior
[ https://issues.apache.org/jira/browse/YARN-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351770#comment-16351770 ] Wangda Tan commented on YARN-7292: -- Rebased to latest trunk (003) > Revisit Resource Profile Behavior > - > > Key: YARN-7292 > URL: https://issues.apache.org/jira/browse/YARN-7292 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Attachments: YARN-7292.002.patch, YARN-7292.003.patch, > YARN-7292.wip.001.patch > > > Had discussions with [~templedf], [~vvasudev], [~sunilg] offline. There're a > couple of resource profile related behaviors might need to be updated: > 1) Configure resource profile in server side or client side: > Currently resource profile can be only configured centrally: > - Advantages: > A given resource profile has a the same meaning in the cluster. It won’t > change when we run different apps in different configurations. A job can run > under Amazon’s G2.8X can also run on YARN with G2.8X profile. A side benefit > is YARN scheduler can potentially do better bin packing. > - Disadvantages: > Hard for applications to add their own resource profiles. > 2) Do we really need mandatory resource profiles such as > minimum/maximum/default? > 3) Should we send resource profile name inside ResourceRequest, or should > client/AM translate it to resource and set it to the existing resource > fields? > 4) Related to above, should we allow resource overrides or client/AM should > send final resource to RM? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7292) Revisit Resource Profile Behavior
[ https://issues.apache.org/jira/browse/YARN-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-7292: - Attachment: YARN-7292.003.patch > Revisit Resource Profile Behavior > - > > Key: YARN-7292 > URL: https://issues.apache.org/jira/browse/YARN-7292 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Attachments: YARN-7292.002.patch, YARN-7292.003.patch, > YARN-7292.wip.001.patch > > > Had discussions with [~templedf], [~vvasudev], [~sunilg] offline. There're a > couple of resource profile related behaviors might need to be updated: > 1) Configure resource profile in server side or client side: > Currently resource profile can be only configured centrally: > - Advantages: > A given resource profile has a the same meaning in the cluster. It won’t > change when we run different apps in different configurations. A job can run > under Amazon’s G2.8X can also run on YARN with G2.8X profile. A side benefit > is YARN scheduler can potentially do better bin packing. > - Disadvantages: > Hard for applications to add their own resource profiles. > 2) Do we really need mandatory resource profiles such as > minimum/maximum/default? > 3) Should we send resource profile name inside ResourceRequest, or should > client/AM translate it to resource and set it to the existing resource > fields? > 4) Related to above, should we allow resource overrides or client/AM should > send final resource to RM? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
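[Editor's note] Points 3 and 4 in the YARN-7292 description ask whether the client/AM should translate a profile name into concrete resource fields before sending the request, and whether overrides should be allowed. A toy sketch of that client-side translation (profile names and values below are invented; this is not the YARN API):

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Toy sketch of the client-side translation debated in YARN-7292 point 3
 * (profile names and values are invented; this is not the YARN API): the
 * client resolves a profile name to concrete memory/vcores and sends only
 * the resolved values, so the RM never needs to know the profile name.
 */
public class ProfileTranslationSketch {
    static class Resource {
        final long memoryMb;
        final int vcores;
        Resource(long memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
    }

    private static final Map<String, Resource> PROFILES = new HashMap<>();
    static {
        // Hypothetical centrally-configured profiles.
        PROFILES.put("minimum", new Resource(1024, 1));
        PROFILES.put("default", new Resource(2048, 2));
        PROFILES.put("g2.8x",   new Resource(61440, 32));
    }

    /** Resolve a profile name, letting explicit overrides win (point 4). */
    static Resource resolve(String profile, Long memOverride, Integer vcoresOverride) {
        Resource base = PROFILES.get(profile);
        if (base == null) {
            throw new IllegalArgumentException("unknown profile: " + profile);
        }
        return new Resource(
            memOverride != null ? memOverride : base.memoryMb,
            vcoresOverride != null ? vcoresOverride : base.vcores);
    }
}
```

The trade-off the issue describes follows directly: with this client-side scheme the RM sees only final resources (simpler scheduler, but profile semantics can drift per client), whereas sending the name server-side keeps one cluster-wide meaning per profile at the cost of making app-specific profiles harder.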
[jira] [Commented] (YARN-7655) avoid AM preemption caused by RRs for specific nodes or racks
[ https://issues.apache.org/jira/browse/YARN-7655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351744#comment-16351744 ] genericqa commented on YARN-7655: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 29s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 37s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 26s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}112m 19s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServiceAppsNodelabel | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7655 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12909131/YARN-7655-002.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux eb1f0bd357ce 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 4e9a59c | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/19596/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19596/testReport/ | | Max. process+thread count | 881 (vs. ulimit of 5500) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U:
[jira] [Commented] (YARN-7292) Revisit Resource Profile Behavior
[ https://issues.apache.org/jira/browse/YARN-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351712#comment-16351712 ] genericqa commented on YARN-7292: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 4s{color} | {color:red} YARN-7292 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-7292 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12909132/YARN-7292.002.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/19597/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Revisit Resource Profile Behavior > - > > Key: YARN-7292 > URL: https://issues.apache.org/jira/browse/YARN-7292 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Attachments: YARN-7292.002.patch, YARN-7292.wip.001.patch > > > Had discussions with [~templedf], [~vvasudev], [~sunilg] offline. There're a > couple of resource profile related behaviors might need to be updated: > 1) Configure resource profile in server side or client side: > Currently resource profile can be only configured centrally: > - Advantages: > A given resource profile has a the same meaning in the cluster. It won’t > change when we run different apps in different configurations. A job can run > under Amazon’s G2.8X can also run on YARN with G2.8X profile. A side benefit > is YARN scheduler can potentially do better bin packing. > - Disadvantages: > Hard for applications to add their own resource profiles. 
> 2) Do we really need mandatory resource profiles such as > minimum/maximum/default? > 3) Should we send the resource profile name inside ResourceRequest, or should the > client/AM translate it to a resource and set it in the existing resource > fields? > 4) Related to the above, should we allow resource overrides, or should the client/AM > send the final resource to the RM?
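Point 3 above can be illustrated with a minimal sketch of the client-side option, in which the client or AM resolves a profile name to concrete memory/vcores before building the request. The profile table and the `resolve` helper here are hypothetical illustrations, not the real ResourceManager API or configuration format:

```java
import java.util.Map;

public class ProfileTranslation {
    // Hypothetical profile table: name -> {memory in MB, vcores}.
    // In the centralized design these would live on the RM instead.
    static final Map<String, long[]> PROFILES = Map.of(
            "minimum", new long[]{1024, 1},
            "default", new long[]{2048, 2},
            "maximum", new long[]{8192, 8});

    // The option discussed in point 3: resolve the profile name on the
    // client/AM side and send plain resource fields in the ResourceRequest,
    // so the RM never needs to understand profile names.
    static long[] resolve(String profile) {
        long[] r = PROFILES.get(profile);
        if (r == null) {
            throw new IllegalArgumentException("unknown profile: " + profile);
        }
        return r;
    }

    public static void main(String[] args) {
        long[] r = resolve("default");
        // Prints the resolved memory and vcores for the "default" profile.
        System.out.println(r[0] + "MB," + r[1] + "vcores");
    }
}
```

The tradeoff mirrors the advantages/disadvantages listed above: client-side resolution makes it trivial for applications to define their own profiles, but the same profile name may then mean different things in different jobs.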
[jira] [Commented] (YARN-7292) Revisit Resource Profile Behavior
[ https://issues.apache.org/jira/browse/YARN-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351707#comment-16351707 ] Wangda Tan commented on YARN-7292: -- Attached ver.2 patch; fixed the reported issues and completed all changes (I think so).
[jira] [Updated] (YARN-7292) Revisit Resource Profile Behavior
[ https://issues.apache.org/jira/browse/YARN-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-7292: - Attachment: YARN-7292.002.patch
[jira] [Commented] (YARN-7655) avoid AM preemption caused by RRs for specific nodes or racks
[ https://issues.apache.org/jira/browse/YARN-7655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351704#comment-16351704 ] Steven Rand commented on YARN-7655: --- Thanks [~yufeigu], a new patch is attached. Unfortunately I'm still struggling to have the starved app be allocated the right number of containers in the test (though the preemption part happens correctly). The details of that are in my first comment above. It seems like the options are: * What the current patch does, which is to just leave a TODO above where we check for allocation. * Only test that the preemption went as expected, and don't test allocation, i.e., don't call {{verifyPreemption}}. * Find a way to have the allocation work out while still guaranteeing that the RR we consider for preemption is the {{NODE_LOCAL}} one. I thought I'd be able to figure this out, but have to admit I've been unsuccessful. > avoid AM preemption caused by RRs for specific nodes or racks > - > > Key: YARN-7655 > URL: https://issues.apache.org/jira/browse/YARN-7655 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.0.0 >Reporter: Steven Rand >Assignee: Steven Rand >Priority: Major > Attachments: YARN-7655-001.patch, YARN-7655-002.patch > > > We frequently see AM preemptions when > {{starvedApp.getStarvedResourceRequests()}} in > {{FSPreemptionThread#identifyContainersToPreempt}} includes one or more RRs > that request containers on a specific node. Since this causes us to only > consider one node to preempt containers on, the really good work that was > done in YARN-5830 doesn't save us from AM preemption. Even though there might > be multiple nodes on which we could preempt enough non-AM containers to > satisfy the app's starvation, we often wind up preempting one or more AM > containers on the single node that we're considering.
> A proposed solution is that if we're going to preempt one or more AM > containers for an RR that specifies a node or rack, then we should instead > expand the search space to consider all nodes. That way we take advantage of > YARN-5830, and only preempt AMs if there's no alternative. I've attached a > patch with an initial implementation of this. We've been running it on a few > clusters, and have seen AM preemptions drop from double-digit occurrences on > many days to zero. > Of course, the tradeoff is some loss of locality, since the starved app is > less likely to be allocated resources at the most specific locality level > that it asked for. My opinion is that this tradeoff is worth it, but I'm > interested to hear what others think as well.
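The proposed fallback can be sketched roughly as follows. This is a toy model of the idea only, not the actual `FSPreemptionThread#identifyContainersToPreempt` code; the `Container` record and `pick` method are illustrative names:

```java
import java.util.ArrayList;
import java.util.List;

public class PreemptionSketch {
    record Container(String node, boolean isAM) {}

    // Sketch of the widened search: first try to satisfy a node-specific RR
    // on the requested node only; if that selection would preempt an AM,
    // expand the search to all nodes and take non-AM containers wherever
    // they run (the YARN-5830 behavior), trading locality for AM survival.
    static List<Container> pick(List<Container> running, String requestedNode, int needed) {
        List<Container> onNode = new ArrayList<>();
        for (Container c : running) {
            if (onNode.size() < needed && c.node().equals(requestedNode)) {
                onNode.add(c);
            }
        }
        if (onNode.stream().noneMatch(Container::isAM)) {
            return onNode;  // locality preserved, no AM harmed
        }
        // Fall back: consider every node, non-AM containers only.
        List<Container> anywhere = new ArrayList<>();
        for (Container c : running) {
            if (anywhere.size() < needed && !c.isAM()) {
                anywhere.add(c);
            }
        }
        return anywhere.size() == needed ? anywhere : onNode;
    }

    public static void main(String[] args) {
        List<Container> running = List.of(
                new Container("n1", true),   // AM on the requested node
                new Container("n1", false),
                new Container("n2", false));
        // Requesting 2 containers on n1 alone would preempt the AM; the
        // widened search preempts the two non-AM containers instead.
        List<Container> victims = pick(running, "n1", 2);
        System.out.println(victims.stream().noneMatch(Container::isAM));
    }
}
```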
[jira] [Updated] (YARN-7655) avoid AM preemption caused by RRs for specific nodes or racks
[ https://issues.apache.org/jira/browse/YARN-7655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rand updated YARN-7655: -- Attachment: YARN-7655-002.patch
[jira] [Commented] (YARN-7872) labeled node cannot be used to satisfy locality specified request
[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351675#comment-16351675 ] Yuqi Wang commented on YARN-7872: - [~leftnoteasy], could you please take a look at this? :) Appreciate your insights! > labeled node cannot be used to satisfy locality specified request > - > > Key: YARN-7872 > URL: https://issues.apache.org/jira/browse/YARN-7872 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, capacityscheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Yuqi Wang >Assignee: Yuqi Wang >Priority: Blocker > Fix For: 2.7.2 > > Attachments: YARN-7872-branch-2.7.2.001.patch > > > *Issue summary:* > A labeled node (i.e. a node with a non-empty node label) cannot be used to > satisfy a locality-specified request (i.e. a container request with a 'not ANY' > resource name and relax locality set to false). > > *For example:* > The node with available resource: > [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: > [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: > \{/default-rack}] > The container request: > [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] > {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: > \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}] > The current RM capacity scheduler's behavior (at least for versions 2.7 > and 2.8) is that the node cannot allocate a container for the request, because the > node label does not match when the leaf queue assigns a container. > > *Possible solution:* > However, node locality and node label should be two orthogonal dimensions for > selecting candidate nodes for a container request. Node label matching > should only be executed for container requests with the ANY resource name, since > only that kind of container request is allowed to have a non-empty node label.
> So, for a container request with a 'not ANY' resource name (which we know > should not have a node label), we should use the requested resource name to > match the node instead of the requested node label. This resource-name > matching should be safe, since a node whose label is not accessible to the > queue will not be offered to the leaf queue. > > *Discussion:* > The attachment is a fix following this principle; please help review. > Without it, we cannot use locality to request containers on these labeled > nodes. > If the fix is acceptable, we should also recheck whether the same issue > exists in trunk and other Hadoop versions. > If it is not acceptable (i.e. the current behavior is by design), then how can we > use locality to request containers on these labeled nodes?
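The proposed matching rule can be sketched as a small predicate. This is an illustration of the principle only, not the actual capacity scheduler code; the method name is hypothetical, `*` stands in for the ANY resource name, and null labels are treated as the empty (default) partition:

```java
public class NodeMatchSketch {
    // Hypothetical predicate for the proposed rule: label matching applies
    // only to ANY requests, which are the only kind allowed to carry a
    // non-empty label; node-specific requests match on the resource name
    // (the host) alone, since locality and labels are orthogonal.
    static boolean matches(String requestName, String requestLabel,
                           String nodeHost, String nodeLabel) {
        if ("*".equals(requestName)) {
            // ANY request: compare node labels.
            String want = requestLabel == null ? "" : requestLabel;
            String have = nodeLabel == null ? "" : nodeLabel;
            return want.equals(have);
        }
        // Node-specific request: match on the requested resource name.
        return requestName.equals(nodeHost);
    }

    public static void main(String[] args) {
        // The scenario from the issue: host SRG carries label "persistent";
        // under the proposed rule, a NODE_LOCAL request for SRG with no
        // label matches it, while an unlabeled ANY request does not.
        System.out.println(matches("SRG", null, "SRG", "persistent"));
    }
}
```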