[jira] [Commented] (YARN-4205) Add a service for monitoring application life time out
[ https://issues.apache.org/jira/browse/YARN-4205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15466519#comment-15466519 ]

Jian He commented on YARN-4205:
-------------------------------

Thanks Rohith, looks good overall. A few comments:
- Can you clarify the definition of "lifetime" in the API, and also mention the unit of time in the getter API?
- RMAppRecoveredTransition: this will produce a lot of logging for active apps on RM recovery. Remove it, since it is already logged in the normal run path? Or move it to debug level?
{code}
    LOG.info("Application " + app.applicationId
        + " is registered with Application lifetime monitor after recovery. "
        + "The lifetime configured is " + applicationLifetime + " seconds");
{code}
- Use getLong?
{code}
    int monitorInterval = conf.getInt(
        YarnConfiguration.RM_APPLICATION_LIFETIME_MONITOR_INTERVAL_MS,
        YarnConfiguration.DEFAULT_RM_APPLICATION_LIFETIME_MONITOR_INTERVAL_MS);
{code}

> Add a service for monitoring application life time out
> ------------------------------------------------------
>
>                 Key: YARN-4205
>                 URL: https://issues.apache.org/jira/browse/YARN-4205
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: scheduler
>            Reporter: nijel
>            Assignee: Rohith Sharma K S
>         Attachments: 0001-YARN-4205.patch, 0002-YARN-4205.patch, YARN-4205_01.patch, YARN-4205_02.patch, YARN-4205_03.patch
>
>
> This JIRA intends to provide a lifetime monitor service.
> The service will monitor the applications for which a lifetime is configured.
> If an application runs beyond its lifetime, it will be killed.
> The lifetime is measured from the submit time.
> The monitoring thread's interval is configurable.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
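For context, a minimal plain-Java sketch of the monitor pass described in the issue: applications whose configured lifetime (measured from submit time, in seconds per the log message above) has elapsed are selected for a kill. All class, field, and method names here are hypothetical stand-ins; the real implementation lives in the ResourceManager and uses Hadoop's Configuration and app types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of an application-lifetime monitor (YARN-4205 idea).
// Names do not match the actual patch; Hadoop types are replaced by plain Java.
public class LifetimeMonitorSketch {

  static class AppInfo {
    final long submitTimeMs;
    final long lifetimeSeconds; // unit kept explicit, as the review asks
    AppInfo(long submitTimeMs, long lifetimeSeconds) {
      this.submitTimeMs = submitTimeMs;
      this.lifetimeSeconds = lifetimeSeconds;
    }
  }

  // appId -> lifetime info for apps that configured a lifetime
  final Map<String, AppInfo> monitored = new ConcurrentHashMap<>();

  // One monitoring pass: returns the apps that ran beyond their lifetime.
  List<String> expiredApps(long nowMs) {
    List<String> expired = new ArrayList<>();
    for (Map.Entry<String, AppInfo> e : monitored.entrySet()) {
      AppInfo a = e.getValue();
      if (nowMs - a.submitTimeMs > a.lifetimeSeconds * 1000L) {
        expired.add(e.getKey());
      }
    }
    return expired;
  }

  public static void main(String[] args) {
    LifetimeMonitorSketch m = new LifetimeMonitorSketch();
    m.monitored.put("app_1", new AppInfo(0L, 10L));   // 10 s lifetime
    m.monitored.put("app_2", new AppInfo(0L, 3600L)); // 1 h lifetime
    // 60 s after submission, only app_1 has exceeded its lifetime
    System.out.println(m.expiredApps(60_000L)); // [app_1]
  }
}
```

In the real service this pass would run on a thread whose wake-up interval comes from the monitor-interval config discussed in the review (hence the suggestion to read it with getLong rather than getInt).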
[jira] [Commented] (YARN-5445) Log aggregation configured to different namenode can fail fast
[ https://issues.apache.org/jira/browse/YARN-5445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15466498#comment-15466498 ]

Chackaravarthy commented on YARN-5445:
--------------------------------------

Could you please take a look at the patch and share your suggestions? Thanks in advance.

> Log aggregation configured to different namenode can fail fast
> --------------------------------------------------------------
>
>                 Key: YARN-5445
>                 URL: https://issues.apache.org/jira/browse/YARN-5445
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.7.1
>            Reporter: Chackaravarthy
>         Attachments: YARN-5445-1.patch
>
>
> Log aggregation is enabled and configured to write app logs to a different cluster or a different namespace (NN federation). In these cases, we would like to have some configuration of the attempts or retries, so that log aggregation fails fast when the other cluster is completely down.
> Currently it takes the default {{dfs.client.failover.max.attempts}} of 15, and hence adds a latency of 2 to 2.5 minutes to each container launch (per node manager).
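Until a dedicated fail-fast config exists, one workaround (an assumption on my part, not something the patch prescribes) is to lower the HDFS client failover settings on the NodeManager side so that a dead remote cluster is detected faster; the values below are purely illustrative:

```xml
<!-- Hypothetical fail-fast tuning for the log-aggregation filesystem;
     values are illustrative, not recommendations. -->
<property>
  <name>dfs.client.failover.max.attempts</name>
  <value>3</value>
</property>
<property>
  <name>dfs.client.failover.sleep.base.millis</name>
  <value>500</value>
</property>
```

Note this would affect every HDFS client in the NM process, which is exactly why a scoped, log-aggregation-specific setting as proposed in the JIRA would be preferable.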
[jira] [Created] (YARN-5618) Support for Intra queue preemption framework
Sunil G created YARN-5618:
-----------------------------

             Summary: Support for Intra queue preemption framework
                 Key: YARN-5618
                 URL: https://issues.apache.org/jira/browse/YARN-5618
             Project: Hadoop YARN
          Issue Type: Sub-task
            Reporter: Sunil G
            Assignee: Sunil G

Currently the inter-queue preemption framework covers the basics (configs, scheduling monitor interval, etc.). This new framework will come as a new CandidateSelector policy. Priority-based and user-limit preemption will be part of this framework. This is a tracking JIRA for the framework implementation alone.
[jira] [Commented] (YARN-4945) [Umbrella] Capacity Scheduler Preemption Within a queue
[ https://issues.apache.org/jira/browse/YARN-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15466489#comment-15466489 ]

Sunil G commented on YARN-4945:
-------------------------------

Thanks [~eepayne]

bq. LeafQueue#getApplications returns an unmodifiable Collection
Yes, I have made changes to handle this scenario.

bq. if it's already in selectedCandidates, it's because an inter-queue preemption policy put it there
I should give some more clarity on what I am trying to do here. It is possible that some containers selected by the priority/user-limit policy were already selected by the inter-queue policies. In that case, we need not mark them again; rather, we can deduct the resource directly, since the container is already marked for preemption.

bq. container's resources twice from toObtainByPartition
That was a mistake; I corrected it in the second patch.

> [Umbrella] Capacity Scheduler Preemption Within a queue
> -------------------------------------------------------
>
>                 Key: YARN-4945
>                 URL: https://issues.apache.org/jira/browse/YARN-4945
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Wangda Tan
>         Attachments: Intra-Queue Preemption Use Cases.pdf, IntraQueuepreemption-CapacityScheduler (Design).pdf, YARN-2009-wip.2.patch, YARN-2009-wip.patch
>
>
> This is an umbrella ticket to track efforts on preemption within a queue, to support features like:
> YARN-2009, YARN-2113, YARN-4781.
[jira] [Commented] (YARN-5608) TestAMRMClient.setup() fails with ArrayOutOfBoundsException
[ https://issues.apache.org/jira/browse/YARN-5608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15466422#comment-15466422 ]

Rohith Sharma K S commented on YARN-5608:
-----------------------------------------

+1 LGTM

> TestAMRMClient.setup() fails with ArrayOutOfBoundsException
> -----------------------------------------------------------
>
>                 Key: YARN-5608
>                 URL: https://issues.apache.org/jira/browse/YARN-5608
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.8.0
>            Reporter: Daniel Templeton
>            Assignee: Daniel Templeton
>         Attachments: YARN-5608.002.patch, YARN-5608.003.patch, YARN-5608.004.patch, YARN-5608.005.patch, YARN-5608.patch
>
>
> After 39 runs of the {{TestAMRMClient}} test, I encountered:
> {noformat}
> java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
>     at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>     at java.util.ArrayList.get(ArrayList.java:411)
>     at org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.setup(TestAMRMClient.java:144)
> {noformat}
> I see it shows up occasionally in the error emails as well.
[jira] [Commented] (YARN-4945) [Umbrella] Capacity Scheduler Preemption Within a queue
[ https://issues.apache.org/jira/browse/YARN-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15465592#comment-15465592 ]

Eric Payne commented on YARN-4945:
----------------------------------

[~leftnoteasy] and [~sunilg],
bq. Using logic similar to {{deductPreemptableResourcesBasedSelectedCandidates}} should be able to achieve this, and I think it doesn't bring too many complexities to the implementation.
I'm sorry, but I'm still not understanding how this can work. In {{PriorityCandidatesSelector#preemptFromLeastStarvedApp}}:
{code}
        if (CapacitySchedulerPreemptionUtils.isContainerAlreadySelected(c,
            selectedCandidates)) {
          Resources.subtractFrom(toObtainByPartition, c.getAllocatedResource());
          Resources.subtractFrom(toObtainByPartition, c.getAllocatedResource());
          continue;
        }
{code}
This code seems to indicate that if a container is already in {{selectedCandidates}}, it will be preempted and then given back to apps in this queue. But if it's already in {{selectedCandidates}}, an inter-queue preemption policy put it there, so it's not likely to end up back in this queue. Please help me understand what I'm missing.
Also, why is it subtracting the container's resources twice from {{toObtainByPartition}}? Should one of those be {{totalPreemptedResourceAllowed}}?

> [Umbrella] Capacity Scheduler Preemption Within a queue
> -------------------------------------------------------
>
>                 Key: YARN-4945
>                 URL: https://issues.apache.org/jira/browse/YARN-4945
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Wangda Tan
>         Attachments: Intra-Queue Preemption Use Cases.pdf, IntraQueuepreemption-CapacityScheduler (Design).pdf, YARN-2009-wip.2.patch, YARN-2009-wip.patch
>
>
> This is an umbrella ticket to track efforts on preemption within a queue, to support features like:
> YARN-2009, YARN-2113, YARN-4781.
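For illustration only, a plain-Java sketch of the accounting Eric's last question seems to point at: an already-selected container is deducted once from the per-partition target and once from the overall preemption budget, rather than twice from the former. This is a hypothetical reading of the intent, not the committed fix, and Hadoop's Resource arithmetic is replaced by plain longs.

```java
// Hypothetical sketch of the YARN-4945 accounting question: charging a
// container that an earlier (inter-queue) policy already selected for
// preemption. All Hadoop types are replaced by plain longs for illustration.
public class PreemptionAccountingSketch {
  long toObtainByPartition;           // resource still needed from this partition
  long totalPreemptedResourceAllowed; // overall preemption budget this round

  PreemptionAccountingSketch(long toObtain, long budget) {
    this.toObtainByPartition = toObtain;
    this.totalPreemptedResourceAllowed = budget;
  }

  // The double subtraction questioned in the review would charge the
  // partition target twice; the presumed intent is one deduction from each.
  void accountAlreadySelected(long containerResource) {
    toObtainByPartition -= containerResource;
    totalPreemptedResourceAllowed -= containerResource;
  }

  public static void main(String[] args) {
    PreemptionAccountingSketch s = new PreemptionAccountingSketch(8L, 20L);
    s.accountAlreadySelected(3L);
    System.out.println(s.toObtainByPartition);           // 5
    System.out.println(s.totalPreemptedResourceAllowed); // 17
  }
}
```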
[jira] [Commented] (YARN-3854) Add localization support for docker images
[ https://issues.apache.org/jira/browse/YARN-3854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15465230#comment-15465230 ]

Zhankun Tang commented on YARN-3854:
------------------------------------

[~vvasudev], thanks for the review. Yes, I want to go ahead and start the implementation. It would be great if [~shaneku...@gmail.com] could help. How about we create the sub-tasks below in this JIRA (or YARN-3611?):
1. Add support for the Docker pull command
2. Add a Docker-type local resource to enable Docker image localization
3. Add support for Docker image clean-up

> Add localization support for docker images
> ------------------------------------------
>
>                 Key: YARN-3854
>                 URL: https://issues.apache.org/jira/browse/YARN-3854
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn
>            Reporter: Sidharta Seethana
>            Assignee: Zhankun Tang
>         Attachments: YARN-3854-branch-2.8.001.patch, YARN-3854_Localization_support_for_Docker_image_v1.pdf, YARN-3854_Localization_support_for_Docker_image_v2.pdf, YARN-3854_Localization_support_for_Docker_image_v3.pdf
>
>
> We need the ability to localize Docker images when those images aren't already available locally. There are various approaches that could be used here, with different trade-offs/issues: image archives on HDFS + docker load, docker pull during the localization phase, or (automatic) docker pull during the run/launch phase.
> We also need the ability to clean up old/stale, unused images.
[jira] [Commented] (YARN-3692) Allow REST API to set a user generated message when killing an application
[ https://issues.apache.org/jira/browse/YARN-3692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15464776#comment-15464776 ]

Steve Loughran commented on YARN-3692:
--------------------------------------

I concur with the incompat issue. This needs to be something which downgrades

> Allow REST API to set a user generated message when killing an application
> --------------------------------------------------------------------------
>
>                 Key: YARN-3692
>                 URL: https://issues.apache.org/jira/browse/YARN-3692
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Rajat Jain
>            Assignee: Rohith Sharma K S
>         Attachments: 0001-YARN-3692.patch, 0002-YARN-3692.patch
>
>
> Currently YARN's REST API supports killing an application without setting a diagnostic message. It would be good to provide that support.
> *Use Case*
> Usually this helps with workflow management in a multi-tenant environment, when the workflow scheduler (or the Hadoop admin) wants to kill a job and let the user know why the job was killed. Killing the job while setting a diagnostic message is a very good solution for that. Ideally, we could set the diagnostic message on all such interfaces:
> yarn kill -applicationId ... -diagnosticMessage "some message added by admin/workflow"
> REST API: { 'state': 'KILLED', 'diagnosticMessage': 'some message added by admin/workflow'}
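To make the proposal concrete, a small sketch of the request body the extended REST call might carry. The RM already exposes `PUT /ws/v1/cluster/apps/{appid}/state` taking `{"state":"KILLED"}`; the extra diagnostics field (and its exact name) is an assumption drawn from the description, not a committed API.

```java
// Builds the JSON body for the proposed kill-with-diagnostics REST call.
// The "diagnosticMessage" key is hypothetical; only "state" exists today.
public class KillRequestSketch {
  static String killBody(String diagnosticMessage) {
    // Minimal manual escaping of backslashes and quotes in the message.
    String escaped = diagnosticMessage
        .replace("\\", "\\\\")
        .replace("\"", "\\\"");
    return "{\"state\":\"KILLED\",\"diagnosticMessage\":\"" + escaped + "\"}";
  }

  public static void main(String[] args) {
    // Would be PUT to: /ws/v1/cluster/apps/<appid>/state
    System.out.println(killBody("some message added by admin/workflow"));
    // {"state":"KILLED","diagnosticMessage":"some message added by admin/workflow"}
  }
}
```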
[jira] [Comment Edited] (YARN-666) [Umbrella] Support rolling upgrades in YARN
[ https://issues.apache.org/jira/browse/YARN-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15464330#comment-15464330 ]

Brahma Reddy Battula edited comment on YARN-666 at 9/5/16 8:33 AM:
-------------------------------------------------------------------

Sorry for coming in late; I did not see any documentation for this. I feel it would be good if the rolling upgrade/downgrade/rollback process were documented as it is for HDFS.

was (Author: brahmareddy):
Sorry for coming late, I feel, it will be good if this needs to be documented like hdfs..?

> [Umbrella] Support rolling upgrades in YARN
> -------------------------------------------
>
>                 Key: YARN-666
>                 URL: https://issues.apache.org/jira/browse/YARN-666
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: graceful, rolling upgrade
>    Affects Versions: 2.0.4-alpha
>            Reporter: Siddharth Seth
>             Fix For: 2.6.0
>
>         Attachments: YARN_Rolling_Upgrades.pdf, YARN_Rolling_Upgrades_v2.pdf
>
>
> Jira to track changes required in YARN to allow rolling upgrades, including documentation and possible upgrade routes.
[jira] [Commented] (YARN-2255) YARN Audit logging not added to log4j.properties
[ https://issues.apache.org/jira/browse/YARN-2255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15464378#comment-15464378 ]

Ying Zhang commented on YARN-2255:
----------------------------------

Hi [~varun_saxena], would you mind if I take over this JIRA and continue working on it?

> YARN Audit logging not added to log4j.properties
> ------------------------------------------------
>
>                 Key: YARN-2255
>                 URL: https://issues.apache.org/jira/browse/YARN-2255
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.4.0
>            Reporter: Varun Saxena
>            Assignee: Varun Saxena
>
>
> The log4j.properties file that is part of the Hadoop package doesn't have YARN audit logging tied to it. This leads to audit logs being generated in the normal log files. Audit logs should be generated in a separate log file.
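For reference, a sketch of the kind of log4j.properties addition being discussed: routing the ResourceManager audit logger to its own appender with additivity disabled, so audit entries stop landing in the normal log. The appender name, file name, and pattern here are assumptions, not the eventual patch:

```properties
# Hypothetical separate appender for YARN RM audit logs (illustrative only)
rm.audit.logger=INFO,RMAUDIT
log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger=${rm.audit.logger}
log4j.additivity.org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger=false
log4j.appender.RMAUDIT=org.apache.log4j.DailyRollingFileAppender
log4j.appender.RMAUDIT.File=${hadoop.log.dir}/rm-audit.log
log4j.appender.RMAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.RMAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
```

A corresponding block for the NodeManager's NMAuditLogger would presumably follow the same shape.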
[jira] [Commented] (YARN-666) [Umbrella] Support rolling upgrades in YARN
[ https://issues.apache.org/jira/browse/YARN-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15464330#comment-15464330 ]

Brahma Reddy Battula commented on YARN-666:
-------------------------------------------

Sorry for coming late; I feel it would be good if this were documented like HDFS.

> [Umbrella] Support rolling upgrades in YARN
> -------------------------------------------
>
>                 Key: YARN-666
>                 URL: https://issues.apache.org/jira/browse/YARN-666
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: graceful, rolling upgrade
>    Affects Versions: 2.0.4-alpha
>            Reporter: Siddharth Seth
>             Fix For: 2.6.0
>
>         Attachments: YARN_Rolling_Upgrades.pdf, YARN_Rolling_Upgrades_v2.pdf
>
>
> Jira to track changes required in YARN to allow rolling upgrades, including documentation and possible upgrade routes.
[jira] [Commented] (YARN-5576) Allow resource localization while container is running
[ https://issues.apache.org/jira/browse/YARN-5576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15464221#comment-15464221 ]

Jian He commented on YARN-5576:
-------------------------------

The failed tests are passing locally for me.

> Allow resource localization while container is running
> ------------------------------------------------------
>
>                 Key: YARN-5576
>                 URL: https://issues.apache.org/jira/browse/YARN-5576
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Jian He
>            Assignee: Jian He
>         Attachments: YARN-5576.1.patch, YARN-5576.2.patch, YARN-5576.3.patch, YARN-5576.4.branch-2.patch, YARN-5576.4.patch