[jira] [Commented] (YARN-4795) ContainerMetrics drops records
[ https://issues.apache.org/jira/browse/YARN-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202571#comment-15202571 ] Hadoop QA commented on YARN-4795:
| (/) *+1 overall* |
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 6m 47s | trunk passed |
| +1 | compile | 0m 25s | trunk passed with JDK v1.8.0_74 |
| +1 | compile | 0m 26s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 15s | trunk passed |
| +1 | mvnsite | 0m 28s | trunk passed |
| +1 | mvneclipse | 0m 12s | trunk passed |
| +1 | findbugs | 0m 51s | trunk passed |
| +1 | javadoc | 0m 19s | trunk passed with JDK v1.8.0_74 |
| +1 | javadoc | 0m 22s | trunk passed with JDK v1.7.0_95 |
| +1 | mvninstall | 0m 24s | the patch passed |
| +1 | compile | 0m 21s | the patch passed with JDK v1.8.0_74 |
| +1 | javac | 0m 21s | the patch passed |
| +1 | compile | 0m 24s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 0m 24s | the patch passed |
| +1 | checkstyle | 0m 13s | the patch passed |
| +1 | mvnsite | 0m 26s | the patch passed |
| +1 | mvneclipse | 0m 11s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 1m 3s | the patch passed |
| +1 | javadoc | 0m 17s | the patch passed with JDK v1.8.0_74 |
| +1 | javadoc | 0m 19s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 9m 23s | hadoop-yarn-server-nodemanager in the patch passed with JDK v1.8.0_74. |
| +1 | unit | 10m 9s | hadoop-yarn-server-nodemanager in the patch passed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 20s | Patch does not generate ASF License warnings. |
| | | 34m 50s | |
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12793202/YARN-4795.001.patch |
| JIRA Issue | YARN-4795 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 6bf5bc037e48 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 33239c9 |
| Default Java | 1.7.0_95 |
| Multi-JDK versi
[jira] [Commented] (YARN-4815) ATS 1.5 timelineclient impl tries to create attempt directory for every event call
[ https://issues.apache.org/jira/browse/YARN-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198733#comment-15198733 ] Li Lu commented on YARN-4815: - Fine... My concern is that we do not need separate caches for each use case if they can be modeled with Guava. I'm fine with either way. > ATS 1.5 timelineclient impl tries to create attempt directory for every event > call > > > Key: YARN-4815 > URL: https://issues.apache.org/jira/browse/YARN-4815 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-4815.1.patch > > > The ATS 1.5 timelineclient impl tries to create the attempt directory for every event > call. Since a single directory-creation call per attempt is enough, this is > causing a perf issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
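To make the Guava-based suggestion concrete, here is a minimal sketch (not the actual patch) of caching the per-attempt directory with a {{LoadingCache}} so the filesystem call happens at most once per attempt instead of once per event; the class name and the {{createAttemptDir}} helper are hypothetical.
{code}
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;

public class AttemptDirCache {
  // Cache the created directory per attempt so the mkdir happens once,
  // not on every event publish.
  private final LoadingCache<String, String> dirCache = CacheBuilder.newBuilder()
      .maximumSize(1000)
      .expireAfterAccess(10, TimeUnit.MINUTES)
      .build(new CacheLoader<String, String>() {
        @Override
        public String load(String attemptId) {
          return createAttemptDir(attemptId);
        }
      });

  public String getOrCreateDir(String attemptId) throws ExecutionException {
    return dirCache.get(attemptId);
  }

  // Placeholder for the real directory creation against the backing filesystem.
  private String createAttemptDir(String attemptId) {
    return "/ats/active/" + attemptId;
  }
}
{code}
The same {{CacheBuilder}} settings (size bound, expiry) could replace a hand-rolled cache, which is presumably the point about not needing separate cache implementations per use case.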
[jira] [Commented] (YARN-4002) make ResourceTrackerService.nodeHeartbeat more concurrent
[ https://issues.apache.org/jira/browse/YARN-4002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202592#comment-15202592 ] Wangda Tan commented on YARN-4002: -- [~rohithsharma], [~zhiguohong], Thanks for working on this patch; it generally looks good. Do you think we need to acquire the read lock in printConfiguredHosts and setDecomissionedNMsMetrics? > make ResourceTrackerService.nodeHeartbeat more concurrent > - > > Key: YARN-4002 > URL: https://issues.apache.org/jira/browse/YARN-4002 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Hong Zhiguo >Assignee: Hong Zhiguo >Priority: Critical > Attachments: 0001-YARN-4002.patch, YARN-4002-lockless-read.patch, > YARN-4002-rwlock.patch, YARN-4002-v0.patch > > > We have multiple RPC threads to handle NodeHeartbeatRequest from NMs. By > design the method ResourceTrackerService.nodeHeartbeat should be concurrent > enough to scale for large clusters. > But we have a "BIG" lock in NodesListManager.isValidNode which I think is > unnecessary. > First, the fields "includes" and "excludes" of HostsFileReader are only > updated on "refresh nodes". All RPC threads handling node heartbeats are > only readers. So an RWLock could be used to allow concurrent access by RPC > threads. > Second, since the fields "includes" and "excludes" of HostsFileReader are > always updated by "reference assignment", which is atomic in Java, the reader-side > lock could just be skipped. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
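A minimal sketch of the "reference assignment" idea from the description: readers dereference a volatile field with no lock, while refresh builds an immutable copy and publishes it by swapping the reference. The class and method names below are illustrative, not the actual HostsFileReader/NodesListManager code.
{code}
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class HostsSnapshot {
  // Heartbeat RPC threads only read these volatile references; the writer
  // swaps in freshly built immutable sets on "refresh nodes".
  private volatile Set<String> includes = Collections.emptySet();
  private volatile Set<String> excludes = Collections.emptySet();

  public boolean isValidNode(String host) {
    // One volatile read per field gives a consistent view of each set.
    Set<String> inc = includes;
    Set<String> exc = excludes;
    return (inc.isEmpty() || inc.contains(host)) && !exc.contains(host);
  }

  public void refresh(Set<String> newIncludes, Set<String> newExcludes) {
    // Build complete immutable copies first, then publish atomically by
    // reference assignment; readers never see a half-built set.
    includes = Collections.unmodifiableSet(new HashSet<>(newIncludes));
    excludes = Collections.unmodifiableSet(new HashSet<>(newExcludes));
  }
}
{code}
A read-write lock would also work, but the lock-free variant keeps the heartbeat path completely uncontended, which is the scalability point being made in the description.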
[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928
[ https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201835#comment-15201835 ] Varun Saxena commented on YARN-4712: bq. Varun Saxena, will you do the honors of committing this patch? Sure. Will commit it shortly. > CPU Usage Metric is not captured properly in YARN-2928 > -- > > Key: YARN-4712 > URL: https://issues.apache.org/jira/browse/YARN-4712 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R > Labels: yarn-2928-1st-milestone > Attachments: YARN-4712-YARN-2928.v1.001.patch, > YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch, > YARN-4712-YARN-2928.v1.004.patch, YARN-4712-YARN-2928.v1.005.patch, > YARN-4712-YARN-2928.v1.006.patch > > > There are 2 issues with CPU usage collection: > * I was able to observe that many times the CPU usage obtained from > {{pTree.getCpuUsagePercent()}} is > ResourceCalculatorProcessTree.UNAVAILABLE (i.e. -1), but ContainersMonitor still does > the calculation, i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore > /resourceCalculatorPlugin.getNumProcessors()}}, because of which the UNAVAILABLE > check in {{NMTimelinePublisher.reportContainerResourceUsage}} is never > triggered, so proper checks need to be added. > * {{EntityColumnPrefix.METRIC}} always uses LongConverter, but > ContainerMonitor is publishing decimal values for the CPU usage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
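A small sketch of the guard being asked for: only derive the total-cores percentage when the raw per-core reading is valid, and propagate the UNAVAILABLE sentinel otherwise so the publisher can skip it. The helper below is hypothetical; it only mirrors the {{cpuUsagePercentPerCore / numProcessors}} calculation quoted above, assuming UNAVAILABLE is -1.
{code}
public final class CpuUsageGuard {
  public static final float UNAVAILABLE = -1f;

  // Return UNAVAILABLE unchanged when the reading is invalid so downstream
  // checks can decide not to publish the metric at all.
  public static float totalCoresPercentage(float cpuUsagePercentPerCore, int numProcessors) {
    if (cpuUsagePercentPerCore < 0 || numProcessors <= 0) {
      return UNAVAILABLE;
    }
    return cpuUsagePercentPerCore / numProcessors;
  }

  public static void main(String[] args) {
    System.out.println(totalCoresPercentage(UNAVAILABLE, 8)); // -1.0, skip publishing
    System.out.println(totalCoresPercentage(160f, 8));        // 20.0
  }
}
{code}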
[jira] [Created] (YARN-4829) Add support for binary units
Varun Vasudev created YARN-4829: --- Summary: Add support for binary units Key: YARN-4829 URL: https://issues.apache.org/jira/browse/YARN-4829 Project: Hadoop YARN Issue Type: Sub-task Reporter: Varun Vasudev Assignee: Varun Vasudev The units conversion util should have support for binary units. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
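As a rough illustration of what "binary units" means here (powers of 1024: Ki, Mi, Gi, ...), a self-contained conversion sketch follows; it is not the actual YARN units-conversion util, just an example of the requested behavior.
{code}
public final class BinaryUnits {
  // Ordered by increasing power of 1024; "" means plain bytes.
  private static final String[] UNITS = {"", "Ki", "Mi", "Gi", "Ti", "Pi"};

  public static long toBytes(long value, String unit) {
    long multiplier = 1L;
    for (String u : UNITS) {
      if (u.equals(unit)) {
        return value * multiplier;
      }
      multiplier *= 1024L;
    }
    throw new IllegalArgumentException("Unknown binary unit: " + unit);
  }

  // Integer division truncates when converting to a coarser unit.
  public static long convert(long value, String from, String to) {
    return toBytes(value, from) / toBytes(1, to);
  }

  public static void main(String[] args) {
    System.out.println(convert(2, "Gi", "Mi"));    // 2048
    System.out.println(convert(4096, "Ki", "Mi")); // 4
  }
}
{code}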
[jira] [Commented] (YARN-4766) NM should not aggregate logs older than the retention policy
[ https://issues.apache.org/jira/browse/YARN-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200655#comment-15200655 ] Haibo Chen commented on YARN-4766: -- Fixed the license and checkstyle issues. The unit test failure is unrelated to the patch. I have created another JIRA to fix the test failure: https://issues.apache.org/jira/browse/YARN-4838 > NM should not aggregate logs older than the retention policy > > > Key: YARN-4766 > URL: https://issues.apache.org/jira/browse/YARN-4766 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation, nodemanager >Reporter: Haibo Chen >Assignee: Haibo Chen > Attachments: yarn4766.001.patch, yarn4766.002.patch > > > When log aggregation fails on the NM, the information for the attempt is > kept in the recovery DB. Log aggregation can fail for multiple reasons, which > are often related to HDFS space or permissions. > On restart the recovery DB is read and, if an application attempt needs its > logs aggregated, the files are scheduled for aggregation without any checks. > The log files could be older than the retention limit, in which case we should > not aggregate them but immediately mark them for deletion from the local file > system. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
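A minimal sketch of the age check implied by the description: before scheduling a recovered attempt's logs for aggregation, compare the file's last-modified time with the retention window (for example the value configured via yarn.log-aggregation.retain-seconds) and mark the file for local deletion instead of uploading it when the window is exceeded. The class and method names are hypothetical.
{code}
import java.io.File;
import java.util.concurrent.TimeUnit;

public final class RetentionCheck {
  // True when the log file is already past the retention window and should
  // be deleted locally rather than aggregated to HDFS.
  public static boolean isOlderThanRetention(File logFile, long retentionSeconds,
      long nowMillis) {
    long ageMillis = nowMillis - logFile.lastModified();
    return ageMillis > TimeUnit.SECONDS.toMillis(retentionSeconds);
  }
}
{code}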
[jira] [Commented] (YARN-4814) ATS 1.5 timelineclient impl call flush after every event write
[ https://issues.apache.org/jira/browse/YARN-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197348#comment-15197348 ] Jason Lowe commented on YARN-4814: -- +1 lgtm, holding off on committing to allow others to comment. I assume this was manually tested to verify the flushes are no longer seen at the filesystem level for every write. > ATS 1.5 timelineclient impl call flush after every event write > -- > > Key: YARN-4814 > URL: https://issues.apache.org/jira/browse/YARN-4814 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Xuan Gong >Assignee: Xuan Gong > Fix For: 2.8.0 > > Attachments: YARN-4814.1.patch, YARN-4814.2.patch > > > ATS 1.5 timelineclient impl call flush after every event write. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
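To make the change concrete, here is a small, purely illustrative sketch of buffering event writes and forcing a flush only every N events instead of after each one; it does not reflect the actual timeline client classes, and a real implementation would also flush on a timer to bound latency.
{code}
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Writer;

public class BatchedEventWriter implements AutoCloseable {
  private final BufferedWriter out;
  private final int flushEveryN;
  private int pending = 0;

  public BatchedEventWriter(Writer raw, int flushEveryN) {
    this.out = new BufferedWriter(raw);
    this.flushEveryN = flushEveryN;
  }

  // Write the serialized event; only force data to the filesystem every
  // N events, so a flush is no longer seen per write.
  public synchronized void writeEvent(String serializedEvent) throws IOException {
    out.write(serializedEvent);
    out.newLine();
    if (++pending >= flushEveryN) {
      out.flush();
      pending = 0;
    }
  }

  @Override
  public synchronized void close() throws IOException {
    out.flush();
    out.close();
  }
}
{code}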
[jira] [Commented] (YARN-4576) Enhancement for tracking Blacklist in AM Launching
[ https://issues.apache.org/jira/browse/YARN-4576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202584#comment-15202584 ] Wangda Tan commented on YARN-4576: -- Thanks [~vinodkv] for starting this discussion. After catching up on most of the discussions in this JIRA and related JIRAs, my suggestions:
1) AM blacklisting is unnecessary to me: - When YARN detects *possible* failures, it should blacklist nodes *within the app* (from [~sjlee0]). If an app's AM container fails on a node because of node-specific reasons, other containers of the app could fail for the same reason. But we shouldn't spread it to other apps, because different apps have different settings; we can only do this if we're confident that the two apps have very similar configs. - When YARN detects fatal failures, it should blacklist nodes globally, i.e. mark the node UNHEALTHY. As [~djp] [commented|https://issues.apache.org/jira/browse/YARN-4576?focusedCommentId=15201559&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15201559], we may need to fix this issue: if a node's state flips between HEALTHY and UNHEALTHY back and forth, we need to detect it and mark the node UNHEALTHY.
2) Framework-specified container blacklists should be transparent to end users. YARN should make correct decisions to select the best places for apps, and apps should trust YARN's decisions. Just like the UNHEALTHY status of a node: it is possible that a node with 90% disk utilization is quite acceptable to some apps, but we shouldn't allow apps to say "I know it's risky, but I still want to schedule on these UNHEALTHY nodes."
3) Apps should have their own choice of preferred nodes, hosts, etc. As Junping commented: bq. We don't really give applications that freedom - where and how to launch an application's AM container has never been the application's business so far. That's why we call it out here - give applications the right to set their own bar for AM launching.
We need this for AMs. I cannot find the original JIRA for AM resource requests, but I believe there's an open JIRA for this. And I think an AM should be able to add blacklisted nodes via the ApplicationSubmissionContext.
> Enhancement for tracking Blacklist in AM Launching > -- > > Key: YARN-4576 > URL: https://issues.apache.org/jira/browse/YARN-4576 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Junping Du >Assignee: Junping Du >Priority: Critical > Attachments: EnhancementAMLaunchingBlacklist.pdf > > > Before YARN-2005, the YARN blacklist mechanism tracked bad nodes per AM: > if an AM's attempts to launch containers on a specific node failed several > times, the AM would blacklist this node in future resource requests. This mechanism > works fine for normal containers. However, from our observation of the behavior > of several clusters: if launching an AM on a problematic node fails, the RM could > pick the same problematic node for the next AM attempts again and again, which > causes application failure when other functional nodes are busy. In the normal > case, the customized health-checker script cannot be sensitive enough to mark a > node as unhealthy when only one or two container launches fail. > After YARN-2005, we can have a BlacklistManager in each RMApp, so nodes > where AM attempts for a specific application previously failed will get > blacklisted. To avoid the risk of all nodes being blacklisted > by the BlacklistManager, a disable-failure-threshold is involved to stop adding > more nodes to the blacklist once a certain ratio is hit. > There are already some enhancements to this AM blacklist mechanism: > YARN-4284 addresses the wider set of AM container launch > failures, and YARN-4389 makes the configuration settings changeable > per app to meet app-specific requirements. However, there are still > several gaps to address more scenarios: > 1. We may need a global blacklist instead of each app maintaining a separate > one. The reason is: an AM has a higher chance to fail if other AMs failed there > before. A quick example: in a busy cluster, all nodes are busy except two > problematic nodes, node a and node b; app1 has already submitted and failed > two AM attempts on a and b. app2 and other apps should wait for the other busy > nodes rather than waste attempts on these two problematic nodes. > 2. If AM container failure is recognized as a global event instead of an app's own > issue, we should consider making the blacklist not a permanent thing but one with a > specific time window. > 3. We could have user-defined blacklist policies to address more possible > cases and scenarios, so it is reasonable to make the blacklist policy pluggable. > 4. For some test scenario, we
[jira] [Updated] (YARN-998) Persistent resource change during NM/RM restart
[ https://issues.apache.org/jira/browse/YARN-998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-998: Attachment: YARN-998-v2.patch Updated the patch to incorporate Jian's comments and fix the Jenkins whitespace and checkstyle issues. The unit test failures (TestAMAuthorization and TestClientRMTokens) are not related, as we have seen them many times on trunk (latest occurrence of the same failures in YARN-4785). > Persistent resource change during NM/RM restart > --- > > Key: YARN-998 > URL: https://issues.apache.org/jira/browse/YARN-998 > Project: Hadoop YARN > Issue Type: Sub-task > Components: graceful, nodemanager, scheduler >Reporter: Junping Du >Assignee: Junping Du > Attachments: YARN-998-sample.patch, YARN-998-v1.patch, > YARN-998-v2.patch > > > When an NM is restarted as planned or after a failure, the previous dynamic resource > setting should be kept for consistency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4830) Add support for resource types in the nodemanager
[ https://issues.apache.org/jira/browse/YARN-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-4830: Description: The RM has support for multiple resource types. The same should be added for the NMs. > Add support for resource types in the nodemanager > - > > Key: YARN-4830 > URL: https://issues.apache.org/jira/browse/YARN-4830 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > > The RM has support for multiple resource types. The same should be added for > the NMs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4837) User facing aspects of 'AM blacklisting' feature need fixing
[ https://issues.apache.org/jira/browse/YARN-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201655#comment-15201655 ] Sangjin Lee commented on YARN-4837: --- I just wanted to add my 2 cents to the discussion, specifically about YARN-4284 where we broadened the cause for blacklisting a node for an AM purpose. AMs repeatedly getting assigned to the same node in spite of failures is one of the most frequent complaints from our users ("why did our AMs keep landing on that bad node, causing our jobs to fail?"). If a node is having a "soft" failure that doesn't quite trip itself over to an unhealthy state, that's the worst possible case. Since the node is still healthy and appears to have a lot of available capacity, the chance that it still gets the next attempt is quite high; i.e. we have node-affinity. And since this is AM, the consequence is much more severe than when a container landed on that node. Oftentimes, the cause for this soft failure situation is varied, and trying to come up with a precise set of exit codes that meet this criteria isn't straightforward. There are even error codes like INVALID which we see quite often (see [my previous comment|https://issues.apache.org/jira/browse/YARN-4284?focusedCommentId=14966248&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14966248]). I know it could blacklist the node for the app for reasons such as the app's configuration error (false positives). However, the reason we could afford to go broad is this blacklisting is *per-app*. The only downside there is to get assigned to another node. We have a number of large busy clusters, and we're using this with success and with little downside. That said, I do recognize that this could be a problem if {{yarn.resourcemanager.am.max-attempts}} is larger than the size of the cluster. > User facing aspects of 'AM blacklisting' feature need fixing > > > Key: YARN-4837 > URL: https://issues.apache.org/jira/browse/YARN-4837 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Vinod Kumar Vavilapalli >Assignee: Vinod Kumar Vavilapalli > > Was reviewing the user-facing aspects that we are releasing as part of 2.8.0. > Looking at the 'AM blacklisting feature', I see several things to be fixed > before we release it in 2.8.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-4831) Recovered containers will be killed after NM stateful restart
[ https://issues.apache.org/jira/browse/YARN-4831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li reassigned YARN-4831: - Assignee: Siqi Li > Recovered containers will be killed after NM stateful restart > -- > > Key: YARN-4831 > URL: https://issues.apache.org/jira/browse/YARN-4831 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Siqi Li >Assignee: Siqi Li > Attachments: YARN-4831.v1.patch > > > {code} > 2016-03-04 19:43:48,130 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: > Container container_1456335621285_0040_01_66 transitioned from NEW to > DONE > 2016-03-04 19:43:48,130 INFO > org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=henkins-service >OPERATION=Container Finished - Killed TARGET=ContainerImpl > RESULT=SUCCESS APPID=application_1456335621285_0040 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4711) NM is going down with NPE's due to single thread processing of events by Timeline client
[ https://issues.apache.org/jira/browse/YARN-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197858#comment-15197858 ] Naganarasimha G R commented on YARN-4711: - Hi [~sjlee0], the following issues need to be fixed as part of this JIRA: # Stop retrying on all exceptions # Implement async calls for metrics # Possible NPE in NMTimelinePublisher$ContainerEventHandler # Possible NPE in putEntity(NMTimelinePublisher.java:213) # Possible NPE in TimelineEntity.toString() when real is not null. I wanted to discuss issue 3 in particular: it seems questionable, now after YARN-3367, whether we still need to queue in the Dispatcher when we are already queuing in the TimelineClient for NM-side events. On further thought, it will be required only for sync events; for example, {{ContainerManagerImpl.ContainerEventDispatcher.handle(ContainerEvent)}} should not be *blocked* on a sync timeline publish event. But one flaw in the current handling is that we put the event into *nmMetricsPublisher.dispatcher* in {{nmMetricsPublisher.publishContainerEvent}}; because of this, if there is any delay in the dispatcher we might fail to publish the events (the NPE we got, as in point 3). So my idea was to create an ATS event to be published, wrap it in the existing *TimelinePublishEvent* and put it in the dispatcher, and for metrics directly get the app's TimelineClient and call *putEntitiesAsync*. Thoughts? > NM is going down with NPE's due to single thread processing of events by > Timeline client > > > Key: YARN-4711 > URL: https://issues.apache.org/jira/browse/YARN-4711 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Critical > Labels: yarn-2928-1st-milestone > Attachments: 4711Analysis.txt > > > After YARN-3367, while testing the latest 2928 branch we came across a few NPEs > due to which the NM shuts down. 
> {code} > 2016-02-21 23:19:54,078 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: > Error in dispatcher thread > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher$ContainerEventHandler.handle(NMTimelinePublisher.java:306) > at > org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher$ContainerEventHandler.handle(NMTimelinePublisher.java:296) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109) > at java.lang.Thread.run(Thread.java:745) > {code} > {code} > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher.putEntity(NMTimelinePublisher.java:213) > at > org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher.publishContainerFinishedEvent(NMTimelinePublisher.java:192) > at > org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher.access$400(NMTimelinePublisher.java:63) > at > org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher$ApplicationEventHandler.handle(NMTimelinePublisher.java:289) > at > org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher$ApplicationEventHandler.handle(NMTimelinePublisher.java:280) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109) > at java.lang.Thread.run(Thread.java:745) > {code} > On analysis found that the there was delay in processing of events, as after > YARN-3367 all the events were getting processed by a single thread inside the > timeline client. > Additionally found one scenario where there is possibility of NPE: > * TimelineEntity.toString() when {{real}} is not null -- This message was sent by Atlassian JIRA (v6.3.4#6332)
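A simplified sketch of the guard-plus-async pattern proposed above: look up the app's timeline client, skip and log when it has already been removed (instead of letting an NPE kill the shared dispatcher thread), and publish asynchronously so the container event dispatcher is never blocked. All names below are hypothetical stand-ins, not the real NMTimelinePublisher/TimelineClient API.
{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class NullSafeMetricPublisher {
  /** Stand-in for the per-app async client; purely illustrative. */
  interface AsyncTimelineClient {
    void putEntitiesAsync(Object entity);
  }

  private final Map<String, AsyncTimelineClient> clientsPerApp = new ConcurrentHashMap<>();

  void registerApp(String appId, AsyncTimelineClient client) {
    clientsPerApp.put(appId, client);
  }

  void removeApp(String appId) {
    clientsPerApp.remove(appId);
  }

  // Guard against the app's client having been removed (e.g. the app already
  // finished) before the event is processed; the async call keeps the
  // container event dispatcher from blocking on the publish.
  public void publishMetric(String appId, Object entity) {
    AsyncTimelineClient client = clientsPerApp.get(appId);
    if (client == null) {
      System.err.println("Skipping timeline metric for " + appId
          + ": no client registered");
      return;
    }
    client.putEntitiesAsync(entity);
  }
}
{code}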
[jira] [Commented] (YARN-998) Persistent resource change during NM/RM restart
[ https://issues.apache.org/jira/browse/YARN-998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197900#comment-15197900 ] Hadoop QA commented on YARN-998:
| (x) *-1 overall* |
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 6m 41s | trunk passed |
| +1 | compile | 0m 26s | trunk passed with JDK v1.8.0_74 |
| +1 | compile | 0m 28s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 18s | trunk passed |
| +1 | mvnsite | 0m 34s | trunk passed |
| +1 | mvneclipse | 0m 15s | trunk passed |
| +1 | findbugs | 1m 3s | trunk passed |
| +1 | javadoc | 0m 22s | trunk passed with JDK v1.8.0_74 |
| +1 | javadoc | 0m 27s | trunk passed with JDK v1.7.0_95 |
| +1 | mvninstall | 0m 29s | the patch passed |
| +1 | compile | 0m 24s | the patch passed with JDK v1.8.0_74 |
| +1 | javac | 0m 24s | the patch passed |
| +1 | compile | 0m 27s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 0m 27s | the patch passed |
| -1 | checkstyle | 0m 16s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: patch generated 2 new + 54 unchanged - 12 fixed = 56 total (was 66) |
| +1 | mvnsite | 0m 32s | the patch passed |
| +1 | mvneclipse | 0m 12s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 1m 15s | the patch passed |
| +1 | javadoc | 0m 19s | the patch passed with JDK v1.8.0_74 |
| +1 | javadoc | 0m 23s | the patch passed with JDK v1.7.0_95 |
| -1 | unit | 73m 30s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_74. |
| -1 | unit | 73m 16s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 17s | Patch does not generate ASF License warnings. |
| | | 163m 5s | |
|| Reason || Tests ||
| JDK v1.8.0_74 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| | hadoop.yarn.server.resourcemanager.TestRMAdminService |
| | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| | hadoop.yarn.server.resourcemanager.TestRMAdminService |
| | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
|| Subsystem || Report/Notes ||
| Docker |
[jira] [Updated] (YARN-4743) ResourceManager crash because TimSort
[ https://issues.apache.org/jira/browse/YARN-4743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zephyr Guo updated YARN-4743: - Attachment: YARN-4743-cdh5.4.7.patch > ResourceManager crash because TimSort > - > > Key: YARN-4743 > URL: https://issues.apache.org/jira/browse/YARN-4743 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.6.4 >Reporter: Zephyr Guo >Assignee: Yufei Gu > Attachments: YARN-4743-cdh5.4.7.patch > > > {code} > 2016-02-26 14:08:50,821 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in > handling event type NODE_UPDATE to the scheduler > java.lang.IllegalArgumentException: Comparison method violates its general > contract! > at java.util.TimSort.mergeHi(TimSort.java:868) > at java.util.TimSort.mergeAt(TimSort.java:485) > at java.util.TimSort.mergeCollapse(TimSort.java:410) > at java.util.TimSort.sort(TimSort.java:214) > at java.util.TimSort.sort(TimSort.java:173) > at java.util.Arrays.sort(Arrays.java:659) > at java.util.Collections.sort(Collections.java:217) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:316) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:240) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1091) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:989) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1185) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:112) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684) > at java.lang.Thread.run(Thread.java:745) > 2016-02-26 14:08:50,822 INFO > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye.. > {code} > Actually, this issue found in 2.6.0-cdh5.4.7. > I think the cause is that we modify {{Resouce}} while we are sorting > {{runnableApps}}. > {code:title=FSLeafQueue.java} > Comparator comparator = policy.getComparator(); > writeLock.lock(); > try { > Collections.sort(runnableApps, comparator); > } finally { > writeLock.unlock(); > } > readLock.lock(); > {code} > {code:title=FairShareComparator} > public int compare(Schedulable s1, Schedulable s2) { > .. > s1.getResourceUsage(), minShare1); > boolean s2Needy = Resources.lessThan(RESOURCE_CALCULATOR, null, > s2.getResourceUsage(), minShare2); > minShareRatio1 = (double) s1.getResourceUsage().getMemory() > / Resources.max(RESOURCE_CALCULATOR, null, minShare1, > ONE).getMemory(); > minShareRatio2 = (double) s2.getResourceUsage().getMemory() > / Resources.max(RESOURCE_CALCULATOR, null, minShare2, > ONE).getMemory(); > .. > {code} > {{getResourceUsage}} will return current Resource. The current Resource is > unstable. > {code:title=FSAppAttempt.java} > @Override > public Resource getResourceUsage() { > // Here the getPreemptedResources() always return zero, except in > // a preemption round > return Resources.subtract(getCurrentConsumption(), > getPreemptedResources()); > } > {code} > {code:title=SchedulerApplicationAttempt} > public Resource getCurrentConsumption() { > return currentConsumption; > } > // This method may modify current Resource. 
> public synchronized void recoverContainer(RMContainer rmContainer) { > .. > Resources.addTo(currentConsumption, rmContainer.getContainer() > .getResource()); > .. > } > {code} > I suggest that we use a stable Resource in the comparator. > Is there something wrong in my thinking? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
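A generic sketch of the "stable Resource" suggestion: snapshot the mutable usage value once per element and sort by the snapshot, so concurrent updates cannot change the ordering mid-sort and trip TimSort's "Comparison method violates its general contract" check. The classes below are illustrative stand-ins, not the FairScheduler code.
{code}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public final class StableSort {
  static class Schedulable {
    volatile long usageMemory;  // mutated concurrently by the scheduler
    Schedulable(long m) { usageMemory = m; }
  }

  // Take one snapshot of the mutable field per element, then sort by the
  // snapshot; the comparator now sees a value that cannot change mid-sort.
  static List<Schedulable> sortByUsage(List<Schedulable> apps) {
    List<long[]> snap = new ArrayList<>(apps.size());
    for (int i = 0; i < apps.size(); i++) {
      snap.add(new long[] {apps.get(i).usageMemory, i});
    }
    snap.sort(Comparator.comparingLong(a -> a[0]));
    List<Schedulable> sorted = new ArrayList<>(apps.size());
    for (long[] s : snap) {
      sorted.add(apps.get((int) s[1]));
    }
    return sorted;
  }
}
{code}
An alternative with the same effect is to have each Schedulable copy its usage into an immutable field just before the sort and have the comparator read only that copy.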
[jira] [Commented] (YARN-4062) Add the flush and compaction functionality via coprocessors and scanners for flow run table
[ https://issues.apache.org/jira/browse/YARN-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199604#comment-15199604 ] Hadoop QA commented on YARN-4062:
| (x) *-1 overall* |
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
| 0 | mvndep | 2m 49s | Maven dependency ordering for branch |
| +1 | mvninstall | 9m 14s | YARN-2928 passed |
| +1 | compile | 1m 59s | YARN-2928 passed with JDK v1.8.0_74 |
| +1 | compile | 2m 21s | YARN-2928 passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 40s | YARN-2928 passed |
| +1 | mvnsite | 1m 29s | YARN-2928 passed |
| +1 | mvneclipse | 0m 46s | YARN-2928 passed |
| +1 | findbugs | 3m 19s | YARN-2928 passed |
| +1 | javadoc | 1m 21s | YARN-2928 passed with JDK v1.8.0_74 |
| +1 | javadoc | 3m 39s | YARN-2928 passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 9s | the patch passed |
| +1 | compile | 1m 55s | the patch passed with JDK v1.8.0_74 |
| +1 | javac | 1m 55s | the patch passed |
| +1 | compile | 2m 20s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 2m 20s | the patch passed |
| -1 | checkstyle | 0m 36s | hadoop-yarn-project/hadoop-yarn: patch generated 2 new + 212 unchanged - 1 fixed = 214 total (was 213) |
| +1 | mvnsite | 1m 21s | the patch passed |
| +1 | mvneclipse | 0m 35s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | findbugs | 3m 48s | the patch passed |
| +1 | javadoc | 1m 11s | the patch passed with JDK v1.8.0_74 |
| +1 | javadoc | 3m 31s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 0m 20s | hadoop-yarn-api in the patch passed with JDK v1.8.0_74. |
| +1 | unit | 1m 54s | hadoop-yarn-common in the patch passed with JDK v1.8.0_74. |
| -1 | unit | 4m 5s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_74. |
| +1 | unit | 0m 23s | hadoop-yarn-api in the patch passed with JDK v1.7.0_95. |
| +1 | unit |
[jira] [Commented] (YARN-3926) Extend the YARN resource model for easier resource-type management and profiles
[ https://issues.apache.org/jira/browse/YARN-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197763#comment-15197763 ] Varun Vasudev commented on YARN-3926: - bq. That should work... But I feel, maybe allow mismatch should be the default. If NM has a super-set of RMs resource types, it will just be ignored, If sub-set, then for those specific resource-types, RM will assign a 0 value for the NM. I don't have any particular preference - I can see scenarios for all 3. I'm fine with making allow mismatch the default. bq. We can also add admin API on the RM to add / remove allowable resource types on the fly. This should be do-able but we need to go through the affect on running apps. > Extend the YARN resource model for easier resource-type management and > profiles > --- > > Key: YARN-3926 > URL: https://issues.apache.org/jira/browse/YARN-3926 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager, resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: Proposal for modifying resource model and profiles.pdf > > > Currently, there are efforts to add support for various resource-types such > as disk(YARN-2139), network(YARN-2140), and HDFS bandwidth(YARN-2681). These > efforts all aim to add support for a new resource type and are fairly > involved efforts. In addition, once support is added, it becomes harder for > users to specify the resources they need. All existing jobs have to be > modified, or have to use the minimum allocation. > This ticket is a proposal to extend the YARN resource model to a more > flexible model which makes it easier to support additional resource-types. It > also considers the related aspect of “resource profiles” which allow users to > easily specify the various resources they need for any given container. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4576) Enhancement for tracking Blacklist in AM Launching
[ https://issues.apache.org/jira/browse/YARN-4576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201559#comment-15201559 ] Junping Du commented on YARN-4576: --
bq. this whole "AM blacklisting" feature is unnecessarily blown way out of proportion - we just don't need this amount of complexity.
Are we not seeing that the failure cost of an AM container can be different from that of normal containers? If so, a separate blacklist for AM container launching, distinct from the one for normal container launching, makes a lot of sense to me. Actually, we could also call it a grey list - a special status between good and bad: risky for launching an AM container but OK for launching normal containers. I would buy into this complexity to address more different scenarios - as long as they are solid.
bq. Containers are marked DISKS_FAILED only if all the disks have become bad, in which case the node itself becomes unhealthy. So there is no need for blacklisting per app at all !!
No. A DISKS_FAILED mark on bad disks is a transient status for the node. For example, if "yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage" is set to 90% (the default) and another job (YARN or not) keeps writing some files and deleting them afterwards, back and forth, then if the node's disk usage happens to hover around 90%, the NM's health status reported to the RM flips between healthy and unhealthy back and forth. A blacklist for AM launching can evaluate the history to decide a better place to launch the AM. The bar for launching normal containers could be different, or we could end up with too few choices.
bq. If an AM is killed due to memory over-flow, blacklisting the node will not help at all!
I agree that if the memory overflow is due to the AM asking for more resources than expected, then scheduling it on other nodes makes no difference. However, some memory issues are caused by memory congestion on a node, in which case the AM could do fine on other nodes that have different memory resources and settings.
bq. When YARN finds a node with configuration / permission issues, it should itself take an action to (a) avoid scheduling on that node, (b) alert administrators etc.
If YARN knows it is a configuration/permission issue, we definitely should do (a) and (b). Sometimes runtime container failures are intermittent (e.g. due to disk or memory resources), so it is nice for the bar for launching AMs to be different from that for normal containers.
bq. But that isn't the case with YARN - part of the reason why we never implemented heuristics based per-app blacklisting in YARN - we left that completely up to applications.
We don't really give applications that freedom - where and how to launch an application's AM container has never been the application's business so far. That's why we call it out here - give applications the right to set their own bar for AM launching.
> Enhancement for tracking Blacklist in AM Launching > -- > > Key: YARN-4576 > URL: https://issues.apache.org/jira/browse/YARN-4576 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Junping Du >Assignee: Junping Du >Priority: Critical > Attachments: EnhancementAMLaunchingBlacklist.pdf > > > Before YARN-2005, the YARN blacklist mechanism tracked bad nodes per AM: > if an AM's attempts to launch containers on a specific node failed several > times, the AM would blacklist this node in future resource requests. This mechanism > works fine for normal containers. However, from our observation of the behavior > of several clusters: if launching an AM on a problematic node fails, the RM could > pick the same problematic node for the next AM attempts again and again, which > causes application failure when other functional nodes are busy. In the normal > case, the customized health-checker script cannot be sensitive enough to mark a > node as unhealthy when only one or two container launches fail. > After YARN-2005, we can have a BlacklistManager in each RMApp, so nodes > where AM attempts for a specific application previously failed will get > blacklisted. To avoid the risk of all nodes being blacklisted > by the BlacklistManager, a disable-failure-threshold is involved to stop adding > more nodes to the blacklist once a certain ratio is hit. > There are already some enhancements to this AM blacklist mechanism: > YARN-4284 addresses the wider set of AM container launch > failures, and YARN-4389 makes the configuration settings changeable > per app to meet app-specific requirements. However, there are still > several gaps to address more scenarios: > 1. We may need a global blacklist instead of each app maintaining a separate > one. The reason is: an AM has a higher chance to fail if other AMs failed there before.
[jira] [Commented] (YARN-4636) Make blacklist tracking policy pluggable for more extensions.
[ https://issues.apache.org/jira/browse/YARN-4636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201373#comment-15201373 ] Junping Du commented on YARN-4636: -- bq. -1 for something like this without understanding the use-cases. We should ask for the use cases first before making a -1 decision. bq. IMO, the "AM blacklisting" doesn't even need to be user-visible (YARN-4837) let alone be pluggable. A pluggable blacklist policy is necessary because applications' requirements for AM robustness differ. Some apps can tolerate AM failure (small and short-running jobs), but some apps don't want any risk (like a large MR job with long-running reduce tasks - an AM restart will kill the reduce tasks no matter how long they have already been running). IMO, allowing various blacklist policies is a good way for YARN to show its extensibility and address different applications' requirements, especially for a cluster formed of heterogeneous nodes. Any comments from the people on the watch list? > Make blacklist tracking policy pluggable for more extensions. > - > > Key: YARN-4636 > URL: https://issues.apache.org/jira/browse/YARN-4636 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Junping Du >Assignee: Sunil G > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
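Purely as an illustration of what "pluggable" could mean here (this is not an existing YARN interface), a minimal policy interface plus one strict implementation that blacklists a node after its first AM failure; a more tolerant policy could require several failures or expire entries after a time window.
{code}
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only; names and shape are hypothetical.
interface AMBlacklistPolicy {
  void onAMContainerFailure(String nodeId, int exitStatus);
  Set<String> getBlacklistedNodes();
}

// A strict policy for failure-sensitive apps: one AM failure blacklists the node.
class FailFastPolicy implements AMBlacklistPolicy {
  private final Map<String, Integer> failures = new ConcurrentHashMap<>();

  @Override
  public void onAMContainerFailure(String nodeId, int exitStatus) {
    failures.merge(nodeId, 1, Integer::sum);
  }

  @Override
  public Set<String> getBlacklistedNodes() {
    return failures.keySet();
  }
}
{code}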
[jira] [Commented] (YARN-4837) User facing aspects of 'AM blacklisting' feature need fixing
[ https://issues.apache.org/jira/browse/YARN-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202323#comment-15202323 ] Sangjin Lee commented on YARN-4837: --- Yes, [~vinodkv], I agree with the direction you laid out. > User facing aspects of 'AM blacklisting' feature need fixing > > > Key: YARN-4837 > URL: https://issues.apache.org/jira/browse/YARN-4837 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Vinod Kumar Vavilapalli >Assignee: Vinod Kumar Vavilapalli > > Was reviewing the user-facing aspects that we are releasing as part of 2.8.0. > Looking at the 'AM blacklisting feature', I see several things to be fixed > before we release it in 2.8.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4686) MiniYARNCluster.start() returns before cluster is completely started
[ https://issues.apache.org/jira/browse/YARN-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199736#comment-15199736 ] Eric Badger commented on YARN-4686: --- [~kasha] When HA is not enabled, the RM will be transitioned to active in the RM serviceStart method. So it is not necessary to transition it to active in the MiniYARNCluster serviceStart method. However, the transitionToActive method will not end up actually transitioning the RM again since it checks to make sure that it is not already active before proceeding. I believe that this makes the MiniYARNCluster start method a little bit cleaner since there are no extra checks. But, it may be more intuitive for the check to be in the MiniYARNCluster start so that it is obvious that the RM is only explicitly transitioned to active in HA setups. The check is made either way, it's just a matter of where we want the check to occur. Currently it is in the RM start method, but if you feel that it is better to put it in the MiniYARNCluster start method then we can add the check there. > MiniYARNCluster.start() returns before cluster is completely started > > > Key: YARN-4686 > URL: https://issues.apache.org/jira/browse/YARN-4686 > Project: Hadoop YARN > Issue Type: Bug > Components: test >Reporter: Rohith Sharma K S >Assignee: Eric Badger > Attachments: MAPREDUCE-6507.001.patch, YARN-4686.001.patch, > YARN-4686.002.patch, YARN-4686.003.patch, YARN-4686.004.patch, > YARN-4686.005.patch, YARN-4686.006.patch > > > TestRMNMInfo fails intermittently. Below is trace for the failure > {noformat} > testRMNMInfo(org.apache.hadoop.mapreduce.v2.TestRMNMInfo) Time elapsed: 0.28 > sec <<< FAILURE! > java.lang.AssertionError: Unexpected number of live nodes: expected:<4> but > was:<3> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.mapreduce.v2.TestRMNMInfo.testRMNMInfo(TestRMNMInfo.java:111) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
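An illustrative toy model (not the real ResourceManager or MiniYARNCluster code) of the behavior described above: in the non-HA case the RM activates itself during start, and the transition method is a no-op when the RM is already active, so an extra call from the mini cluster would be harmless either way.
{code}
class MiniRM {
  private volatile boolean active = false;
  private final boolean haEnabled;

  MiniRM(boolean haEnabled) {
    this.haEnabled = haEnabled;
  }

  void serviceStart() {
    if (!haEnabled) {
      transitionToActive();   // non-HA: the RM activates itself during start
    }
  }

  synchronized void transitionToActive() {
    if (active) {
      return;                 // already active: a second call is a no-op
    }
    active = true;
  }
}
{code}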
[jira] [Commented] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
[ https://issues.apache.org/jira/browse/YARN-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197826#comment-15197826 ] Kuhu Shukla commented on YARN-4311: --- Requesting Jason Lowe, Daniel Templeton for review/comments. Thanks a lot! > Removing nodes from include and exclude lists will not remove them from > decommissioned nodes list > - > > Key: YARN-4311 > URL: https://issues.apache.org/jira/browse/YARN-4311 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.1 >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: YARN-4311-v1.patch, YARN-4311-v10.patch, > YARN-4311-v11.patch, YARN-4311-v11.patch, YARN-4311-v2.patch, > YARN-4311-v3.patch, YARN-4311-v4.patch, YARN-4311-v5.patch, > YARN-4311-v6.patch, YARN-4311-v7.patch, YARN-4311-v8.patch, YARN-4311-v9.patch > > > In order to fully forget about a node, removing the node from include and > exclude list is not sufficient. The RM lists it under Decomm-ed nodes. The > tricky part that [~jlowe] pointed out was the case when include lists are not > used, in that case we don't want the nodes to fall off if they are not active. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4746) yarn web services should convert parse failures of appId to 400
[ https://issues.apache.org/jira/browse/YARN-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200770#comment-15200770 ] Hadoop QA commented on YARN-4746:
| (x) *-1 overall* |
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 18s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| 0 | mvndep | 0m 13s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 18s | trunk passed |
| +1 | compile | 2m 7s | trunk passed with JDK v1.8.0_74 |
| +1 | compile | 2m 19s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 39s | trunk passed |
| +1 | mvnsite | 1m 43s | trunk passed |
| +1 | mvneclipse | 0m 42s | trunk passed |
| +1 | findbugs | 3m 19s | trunk passed |
| +1 | javadoc | 1m 13s | trunk passed with JDK v1.8.0_74 |
| +1 | javadoc | 1m 35s | trunk passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 27s | the patch passed |
| +1 | compile | 1m 58s | the patch passed with JDK v1.8.0_74 |
| +1 | javac | 1m 58s | the patch passed |
| +1 | compile | 2m 18s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 2m 18s | the patch passed |
| +1 | checkstyle | 0m 35s | the patch passed |
| +1 | mvnsite | 1m 39s | the patch passed |
| +1 | mvneclipse | 0m 36s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 4m 5s | the patch passed |
| +1 | javadoc | 1m 7s | the patch passed with JDK v1.8.0_74 |
| +1 | javadoc | 1m 27s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 2m 10s | hadoop-yarn-common in the patch passed with JDK v1.8.0_74. |
| -1 | unit | 9m 12s | hadoop-yarn-server-nodemanager in the patch failed with JDK v1.8.0_74. |
| -1 | unit | 67m 50s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_74. |
| +1 | unit | 2m 10s | hadoop-yarn-common in the patch passed with JDK v1.7.0_95. |
| -1 | unit | 9m 26s | hadoop-yarn-server-nodemanager in the patch failed with JDK v1.7.0_95. |
| -1 | unit | 69m 8s | hadoop-yarn-serve
[jira] [Commented] (YARN-4699) Scheduler UI and REST o/p is not in sync when -replaceLabelsOnNode is used to change label of a node
[ https://issues.apache.org/jira/browse/YARN-4699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200371#comment-15200371 ] Hadoop QA commented on YARN-4699: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 8s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 42s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 5s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 3s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_74. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 53s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 153m 21s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_74 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | | hadoop.yarn.server.resourcemanager.TestAMAuthorization | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12789619/0001-YARN-4699.pat
[jira] [Updated] (YARN-4390) Consider container request size during CS preemption
[ https://issues.apache.org/jira/browse/YARN-4390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-4390: - Issue Type: Sub-task (was: Bug) Parent: YARN-45 > Consider container request size during CS preemption > > > Key: YARN-4390 > URL: https://issues.apache.org/jira/browse/YARN-4390 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Affects Versions: 3.0.0, 2.8.0, 2.7.3 >Reporter: Eric Payne >Assignee: Wangda Tan > > There are multiple reasons why preemption could unnecessarily preempt > containers. One is that an app could be requesting a large container (say > 8-GB), and the preemption monitor could conceivably preempt multiple > containers (say 8, 1-GB containers) in order to fill the large container > request. These smaller containers would then be rejected by the requesting AM > and potentially given right back to the preempted app. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
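A hedged sketch of the idea above, not the preemption monitor's actual logic (all names and the MB-based bookkeeping are illustrative): victims are only selected from a single node that, together with its free space, could really host the pending large container, so eight scattered 1-GB containers are not preempted for an 8-GB ask they can never satisfy.
{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PreemptBySizeSketch {

  /**
   * Picks preemption victims for one pending request of size requestMb.
   * Candidates only count if they free space on a single node that can then
   * host the large container; otherwise nothing is preempted.
   */
  static List<Integer> selectVictims(Map<String, Integer> freeMbByNode,
                                     Map<String, List<Integer>> victimSizesByNode,
                                     int requestMb) {
    for (Map.Entry<String, Integer> node : freeMbByNode.entrySet()) {
      int free = node.getValue();
      List<Integer> chosen = new ArrayList<>();
      for (int size : victimSizesByNode.getOrDefault(node.getKey(),
          Collections.<Integer>emptyList())) {
        if (free >= requestMb) {
          break;                      // enough room already freed on this node
        }
        chosen.add(size);
        free += size;
      }
      if (free >= requestMb) {
        return chosen;                // this node can actually fit the request
      }
    }
    return Collections.emptyList();   // preempting scattered small containers would be wasted
  }

  public static void main(String[] args) {
    Map<String, Integer> free = new HashMap<>();
    free.put("node1", 2048);
    Map<String, List<Integer>> victims = new HashMap<>();
    victims.put("node1", Arrays.asList(1024, 1024, 1024, 1024, 1024, 1024));
    System.out.println(selectVictims(free, victims, 8192));   // all six 1 GB victims on node1
  }
}
{code}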
[jira] [Comment Edited] (YARN-4829) Add support for binary units
[ https://issues.apache.org/jira/browse/YARN-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197449#comment-15197449 ] Varun Vasudev edited comment on YARN-4829 at 3/16/16 3:12 PM: -- Attached a file with the fix. [~asuresh] - do you mind doing a review? The patch is for the YARN-3926 branch. was (Author: vvasudev): Attached a file with the fix. > Add support for binary units > > > Key: YARN-4829 > URL: https://issues.apache.org/jira/browse/YARN-4829 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: YARN-4829-YARN-3926.001.patch > > > The units conversion util should have support for binary units. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4833) For Queue AccessControlException client retries multiple times on both RM
[ https://issues.apache.org/jira/browse/YARN-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202550#comment-15202550 ] Sunil G commented on YARN-4833: --- Hi [~bibinchundatt] and [~jianhe] I have some doubts here. {{AccessControlException}} is used in various places and in many places it's caught by the caller as an IOException. Maybe I didn't fully understand point 2 in the mentioned solution, "Wrap AccessControl exception to YarnException". Do you mean that instead of throwing AccessControlException, it will be thrown as {{YarnException}} with the message from AccessControlException, OR are you planning to have AccessControlException as a YarnException in its inheritance hierarchy? Since {{submitApplication}} is user facing, I am not very sure about changing to YarnException in these ACL-specific cases. RPC RemoteException is handled with the correct RetryPolicy, so is it possible to throw AccessControlException wrapped in an RPC RemoteException? Maybe I missed/overlooked something; please correct me if I am wrong. > For Queue AccessControlException client retries multiple times on both RM > - > > Key: YARN-4833 > URL: https://issues.apache.org/jira/browse/YARN-4833 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > > Submit application to queue where ACL is enabled and submitted user is not > having access. Client retries till failMaxattempt 10 times. > {noformat} > 16/03/18 10:01:06 INFO retry.RetryInvocationHandler: Exception while invoking > submitApplication of class ApplicationClientProtocolPBClientImpl over rm1. > Trying to fail over immediately. > org.apache.hadoop.security.AccessControlException: User hdfs does not have > permission to submit application_1458273884145_0001 to queue default > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:380) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:291) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:618) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:252) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:483) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2360) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2356) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1742) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2356) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:422) > at > org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) > at > org.apache.hadoop.yarn.ipc.RPCUtil.instantiateIOException(RPCUtil.java:80) > at 
> org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:119) > at > org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.submitApplication(ApplicationClientProtocolPBClientImpl.java:272) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:257) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103) > at com.sun.proxy.$Proxy23.submitApplication(Unknown Source) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:261) > at > org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDe
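A minimal, self-contained sketch (JDK only; the class and method names are hypothetical, not the actual YarnClientImpl/RetryInvocationHandler code) of the behaviour being discussed: an ACL rejection is treated as non-retriable so the client fails fast instead of spending all ten attempts against both RMs, while genuinely transient failures still retry.
{code}
import java.io.IOException;

public class SubmitRetrySketch {

  /** Stand-in for the server-side ACL failure. */
  static class AccessControlException extends IOException {
    AccessControlException(String msg) { super(msg); }
  }

  /** Pretend RPC call that always fails the queue ACL check. */
  static void submitApplication() throws IOException {
    throw new AccessControlException(
        "User hdfs does not have permission to submit application to queue default");
  }

  public static void main(String[] args) {
    int maxAttempts = 10;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        submitApplication();
        return;
      } catch (AccessControlException ace) {
        // Non-retriable: an ACL rejection will not succeed on the other RM either,
        // so give up immediately instead of retrying on both RMs.
        System.err.println("Giving up: " + ace.getMessage());
        return;
      } catch (IOException ioe) {
        // Potentially transient (standby RM, connection refused, ...): keep retrying.
        System.err.println("Attempt " + attempt + " failed, retrying: " + ioe);
      }
    }
  }
}
{code}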
[jira] [Updated] (YARN-4694) Document ATS v1.5
[ https://issues.apache.org/jira/browse/YARN-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-4694: Assignee: Li Lu (was: Xuan Gong) > Document ATS v1.5 > - > > Key: YARN-4694 > URL: https://issues.apache.org/jira/browse/YARN-4694 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: 2.8.0 >Reporter: Xuan Gong >Assignee: Li Lu > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4815) ATS 1.5 timelineclinet impl try to create attempt directory for every event call
[ https://issues.apache.org/jira/browse/YARN-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202365#comment-15202365 ] Hadoop QA commented on YARN-4815: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 15m 22s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 48s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 44s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 23s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 6s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 26s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 33s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 34s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 54s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 45s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 45s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 26s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 26s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 36s {color} | {color:red} hadoop-yarn-project/hadoop-yarn: patch generated 3 new + 215 unchanged - 0 fixed = 218 total (was 215) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 59s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 22s {color} | {color:green} the patch 
passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 56s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 18s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 31s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 31s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_74. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 25s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_74. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 28s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 22s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s {color} | {c
[jira] [Commented] (YARN-4837) User facing aspects of 'AM blacklisting' feature need fixing
[ https://issues.apache.org/jira/browse/YARN-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202500#comment-15202500 ] Junping Du commented on YARN-4837: -- Hi [~vinodkv], did you see my comments at YARN-4576 (https://issues.apache.org/jira/browse/YARN-4576?focusedCommentId=15201559&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15201559)? +1 on "explicitly treat known exit-codes", which is exactly the same as the previous proposal in YARN-4576. However, the difference is: "DISKS_FAILED" shouldn't be skipped, for the reason I mentioned in YARN-4576. Also, we cannot simply judge the system innocent when hitting memory issues. Also, hiding all AM scheduling info/preferences from the application doesn't make sense in the long run: an AM can ask for resources for its running containers from the beginning, but an application still cannot say how to place its AM even today, which is sad to me. YARN-4685 is something fixable and much better than the era without blacklisting (we do see AMs keep launching on bad nodes repeatedly and getting stuck in many cases). We just need to go ahead and fix YARN-4685. > User facing aspects of 'AM blacklisting' feature need fixing > > > Key: YARN-4837 > URL: https://issues.apache.org/jira/browse/YARN-4837 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Vinod Kumar Vavilapalli >Assignee: Vinod Kumar Vavilapalli > > Was reviewing the user-facing aspects that we are releasing as part of 2.8.0. > Looking at the 'AM blacklisting feature', I see several things to be fixed > before we release it in 2.8.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1040) De-link container life cycle from an Allocation
[ https://issues.apache.org/jira/browse/YARN-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202391#comment-15202391 ] Bikas Saha commented on YARN-1040: -- This design doc effectively looks like a re-design of almost all core semantics of YARN. This probably deserves a wider discussion on the dev email list and under its own jira. Although it covers YARN-1040 and YARN-4726, the scope looks much wider, and careful thinking about backwards compatibility is needed, etc. Conceptually this changes the current semantic understanding of allocation and container that is widely understood externally. I am afraid that this jira or just the folks on this thread are not enough to make a decision for the given proposal. As far as this jira is concerned, both the previous (say a) & new (say b) proposals sound similar, with startContainer_in_a renamed to startAllocation_in_b & startProcess_in_a renamed to startContainer_in_b. So we may be fine in that restricted part minus the renamings. > De-link container life cycle from an Allocation > --- > > Key: YARN-1040 > URL: https://issues.apache.org/jira/browse/YARN-1040 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran > Attachments: YARN-1040-rough-design.pdf > > > The AM should be able to exec >1 process in a container, rather than have the > NM automatically release the container when the single process exits. > This would let an AM restart a process on the same container repeatedly, > which for HBase would offer locality on a restarted region server. > We may also want the ability to exec multiple processes in parallel, so that > something could be run in the container while a long-lived process was > already running. This can be useful in monitoring and reconfiguring the > long-lived process, as well as shutting it down. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-810) Support CGroup ceiling enforcement on CPU
[ https://issues.apache.org/jira/browse/YARN-810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202256#comment-15202256 ] Karthik Kambatla commented on YARN-810: --- Hasn't this been recently added with strict cpu usage? > Support CGroup ceiling enforcement on CPU > - > > Key: YARN-810 > URL: https://issues.apache.org/jira/browse/YARN-810 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 2.1.0-beta, 2.0.5-alpha >Reporter: Chris Riccomini >Assignee: Wei Yan > Labels: BB2015-05-TBR > Attachments: YARN-810-3.patch, YARN-810-4.patch, YARN-810-5.patch, > YARN-810-6.patch, YARN-810.patch, YARN-810.patch > > > Problem statement: > YARN currently lets you define an NM's pcore count, and a pcore:vcore ratio. > Containers are then allowed to request vcores between the minimum and maximum > defined in the yarn-site.xml. > In the case where a single-threaded container requests 1 vcore, with a > pcore:vcore ratio of 1:4, the container is still allowed to use up to 100% of > the core it's using, provided that no other container is also using it. This > happens, even though the only guarantee that YARN/CGroups is making is that > the container will get "at least" 1/4th of the core. > If a second container then comes along, the second container can take > resources from the first, provided that the first container is still getting > at least its fair share (1/4th). > There are certain cases where this is desirable. There are also certain cases > where it might be desirable to have a hard limit on CPU usage, and not allow > the process to go above the specified resource requirement, even if it's > available. > Here's an RFC that describes the problem in more detail: > http://lwn.net/Articles/336127/ > Solution: > As it happens, when CFS is used in combination with CGroups, you can enforce > a ceiling using two files in cgroups: > {noformat} > cpu.cfs_quota_us > cpu.cfs_period_us > {noformat} > The usage of these two files is documented in more detail here: > https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html > Testing: > I have tested YARN CGroups using the 2.0.5-alpha implementation. By default, > it behaves as described above (it is a soft cap, and allows containers to use > more than they asked for). I then tested CFS CPU quotas manually with YARN. > First, you can see that CFS is in use in the CGroup, based on the file names: > {noformat} > [criccomi@eat1-qa464 ~]$ sudo -u app ls -l /cgroup/cpu/hadoop-yarn/ > total 0 > -r--r--r-- 1 app app 0 Jun 13 16:46 cgroup.procs > drwxr-xr-x 2 app app 0 Jun 13 17:08 container_1371141151815_0004_01_02 > -rw-r--r-- 1 app app 0 Jun 13 16:46 cpu.cfs_period_us > -rw-r--r-- 1 app app 0 Jun 13 16:46 cpu.cfs_quota_us > -rw-r--r-- 1 app app 0 Jun 13 16:46 cpu.rt_period_us > -rw-r--r-- 1 app app 0 Jun 13 16:46 cpu.rt_runtime_us > -rw-r--r-- 1 app app 0 Jun 13 16:46 cpu.shares > -r--r--r-- 1 app app 0 Jun 13 16:46 cpu.stat > -rw-r--r-- 1 app app 0 Jun 13 16:46 notify_on_release > -rw-r--r-- 1 app app 0 Jun 13 16:46 tasks > [criccomi@eat1-qa464 ~]$ sudo -u app cat > /cgroup/cpu/hadoop-yarn/cpu.cfs_period_us > 10 > [criccomi@eat1-qa464 ~]$ sudo -u app cat > /cgroup/cpu/hadoop-yarn/cpu.cfs_quota_us > -1 > {noformat} > Oddly, it appears that the cfs_period_us is set to .1s, not 1s. > We can place processes in hard limits. I have process 4370 running YARN > container container_1371141151815_0003_01_03 on a host. 
By default, it's > running at ~300% cpu usage. > {noformat} > CPU > 4370 criccomi 20 0 1157m 551m 14m S 240.3 0.8 87:10.91 ... > {noformat} > When I set the CFS quota: > {noformat} > echo 1000 > > /cgroup/cpu/hadoop-yarn/container_1371141151815_0003_01_03/cpu.cfs_quota_us > CPU > 4370 criccomi 20 0 1157m 563m 14m S 1.0 0.8 90:08.39 ... > {noformat} > It drops to 1% usage, and you can see the box has room to spare: > {noformat} > Cpu(s): 2.4%us, 1.0%sy, 0.0%ni, 92.2%id, 4.2%wa, 0.0%hi, 0.1%si, > 0.0%st > {noformat} > Turning the quota back to -1: > {noformat} > echo -1 > > /cgroup/cpu/hadoop-yarn/container_1371141151815_0003_01_03/cpu.cfs_quota_us > {noformat} > Burns the cores again: > {noformat} > Cpu(s): 11.1%us, 1.7%sy, 0.0%ni, 83.9%id, 3.1%wa, 0.0%hi, 0.2%si, > 0.0%st > CPU > 4370 criccomi 20 0 1157m 563m 14m S 253.9 0.8 89:32.31 ... > {noformat} > On my dev box, I was testing CGroups by running a python process eight times, > to burn t
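A minimal sketch of the ceiling mechanism described in this report, using plain JDK file I/O (this is not the NodeManager's actual CGroups handler; the cgroup path and the vcore-to-quota mapping are assumptions for illustration).
{code}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CfsCeilingSketch {

  /** Writes a single integer value into a cgroup control file. */
  private static void write(Path cgroupDir, String file, long value) throws IOException {
    Files.write(cgroupDir.resolve(file),
        Long.toString(value).getBytes(StandardCharsets.UTF_8));
  }

  /**
   * Hard-caps a container at {@code vcores} cores: with quota = period * vcores,
   * the cgroup may consume at most that much CPU time per period.
   */
  static void setCeiling(Path containerCgroup, long periodUs, int vcores) throws IOException {
    write(containerCgroup, "cpu.cfs_period_us", periodUs);
    write(containerCgroup, "cpu.cfs_quota_us", periodUs * vcores);
  }

  /** Restores the default soft behaviour (-1) described above. */
  static void removeCeiling(Path containerCgroup) throws IOException {
    write(containerCgroup, "cpu.cfs_quota_us", -1);
  }

  public static void main(String[] args) throws IOException {
    Path cg = Paths.get("/cgroup/cpu/hadoop-yarn/container_1371141151815_0003_01_03");
    setCeiling(cg, 100_000, 1);   // at most one full core per 100 ms period
  }
}
{code}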
[jira] [Created] (YARN-4840) Add option to upload files recursively from container directory
Brook Zhou created YARN-4840: Summary: Add option to upload files recursively from container directory Key: YARN-4840 URL: https://issues.apache.org/jira/browse/YARN-4840 Project: Hadoop YARN Issue Type: Improvement Components: log-aggregation Affects Versions: 2.8.0 Reporter: Brook Zhou Priority: Minor Fix For: 2.8.0 It may be useful to allow users to aggregate their logs recursively from container directories. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4595) Add support for configurable read-only mounts
[ https://issues.apache.org/jira/browse/YARN-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200076#comment-15200076 ] Allen Wittenauer commented on YARN-4595: What's preventing users from mounting files and file systems they shouldn't have access to? > Add support for configurable read-only mounts > - > > Key: YARN-4595 > URL: https://issues.apache.org/jira/browse/YARN-4595 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Billie Rinaldi >Assignee: Billie Rinaldi > Attachments: YARN-4595.1.patch, YARN-4595.2.patch > > > Mounting files or directories from the host is one way of passing > configuration and other information into a docker container. We could allow > the user to set a list of mounts in the environment of ContainerLaunchContext > (e.g. /dir1:/targetdir1,/dir2:/targetdir2). These would be mounted read-only > to the specified target locations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
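One possible answer, sketched under assumptions (the "src:dst" list format comes from the issue description; the whitelist and all names are hypothetical, not part of the attached patches): the NM could refuse any source path an administrator has not explicitly whitelisted before turning the list into read-only bind mounts.
{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReadOnlyMountsSketch {

  /** Hypothetical admin-configured list of mountable source prefixes. */
  static final List<String> ADMIN_WHITELIST =
      Arrays.asList("/etc/hadoop/conf", "/usr/share/data");

  /** Parses "src1:dst1,src2:dst2" into {src, dst, "ro"} triples, enforcing the whitelist. */
  static List<String[]> parseMounts(String envValue) {
    List<String[]> mounts = new ArrayList<>();
    for (String entry : envValue.split(",")) {
      String[] parts = entry.split(":");
      if (parts.length != 2) {
        throw new IllegalArgumentException("Malformed mount entry: " + entry);
      }
      String src = parts[0];
      String dst = parts[1];
      boolean allowed = ADMIN_WHITELIST.stream().anyMatch(src::startsWith);
      if (!allowed) {
        throw new IllegalArgumentException("Source not whitelisted for mounting: " + src);
      }
      mounts.add(new String[] { src, dst, "ro" });
    }
    return mounts;
  }

  public static void main(String[] args) {
    for (String[] m : parseMounts("/etc/hadoop/conf:/conf,/usr/share/data:/data")) {
      System.out.println(String.join(":", m));
    }
  }
}
{code}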
[jira] [Updated] (YARN-4823) Refactor the nested reservation id field in listReservation to simple string field
[ https://issues.apache.org/jira/browse/YARN-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-4823: - Description: The listReservation REST API returns a ReservationId field which has a nested id field which is also called ReservationId. This JIRA proposes to rename the nested field to a string as it's easier to read and moreover what the update/delete APIs take in as input. (was: The listReservation REST API returns a ReservationId field which has a nested id field which is also called ReservationId. This JIRA proposes to rename the nested field to id.) > Refactor the nested reservation id field in listReservation to simple string > field > -- > > Key: YARN-4823 > URL: https://issues.apache.org/jira/browse/YARN-4823 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, fairscheduler, resourcemanager >Reporter: Subru Krishnan >Assignee: Subru Krishnan > > The listReservation REST API returns a ReservationId field which has a nested > id field which is also called ReservationId. This JIRA proposes to rename the > nested field to a string as it's easier to read and moreover what the > update/delete APIs take in as input. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4785) inconsistent value type of the "type" field for LeafQueueInfo in response of RM REST API - cluster/scheduler
[ https://issues.apache.org/jira/browse/YARN-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-4785: Attachment: YARN-4785.branch-2.6.001.patch YARN-4785.branch-2.7.001.patch Thanks for the reviews [~djp] and [~jhsenjaliya]! Junping - I've attached versions of the patch for 2.6 and 2.7. Can you please commit them? > inconsistent value type of the "type" field for LeafQueueInfo in response of > RM REST API - cluster/scheduler > > > Key: YARN-4785 > URL: https://issues.apache.org/jira/browse/YARN-4785 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 2.6.0 >Reporter: Jayesh >Assignee: Varun Vasudev > Labels: REST_API > Attachments: YARN-4785.001.patch, YARN-4785.branch-2.6.001.patch, > YARN-4785.branch-2.7.001.patch > > > I see inconsistent value type ( String and Array ) of the "type" field for > LeafQueueInfo in response of RM REST API - cluster/scheduler > as per the spec it should be always String. > here is the sample output ( removed non-relevant fields ) > {code} > { > "scheduler": { > "schedulerInfo": { > "type": "capacityScheduler", > "capacity": 100, > ... > "queueName": "root", > "queues": { > "queue": [ > { > "type": "capacitySchedulerLeafQueueInfo", > "capacity": 0.1, > > }, > { > "type": [ > "capacitySchedulerLeafQueueInfo" > ], > "capacity": 0.1, > "queueName": "test-queue", > "state": "RUNNING", > > }, > { > "type": [ > "capacitySchedulerLeafQueueInfo" > ], > "capacity": 2.5, > > }, > { > "capacity": 25, > > "state": "RUNNING", > "queues": { > "queue": [ > { > "capacity": 6, > "state": "RUNNING", > "queues": { > "queue": [ > { > "type": "capacitySchedulerLeafQueueInfo", > "capacity": 100, > ... > } > ] > }, > > }, > { > "capacity": 6, > ... > "state": "RUNNING", > "queues": { > "queue": [ > { > "type": "capacitySchedulerLeafQueueInfo", > "capacity": 100, > ... > } > ] > }, > ... > }, > ... > ] > }, > ... > } > ] > } > } > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
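The patches fix the serialization on the server side; a client that still has to talk to unpatched 2.6/2.7 clusters can tolerate both shapes of the field in the meantime. A small defensive sketch using Jackson (a hypothetical helper, not part of the patch):
{code}
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class QueueTypeSketch {

  /** Returns the queue type whether it was serialized as a string or as an array. */
  static String queueType(JsonNode queueNode) {
    JsonNode type = queueNode.get("type");
    if (type == null) {
      return null;                         // parent queues carry no "type" field
    }
    if (type.isArray()) {
      return type.size() > 0 ? type.get(0).asText() : null;
    }
    return type.asText();
  }

  public static void main(String[] args) throws Exception {
    ObjectMapper mapper = new ObjectMapper();
    String asString = "{\"type\":\"capacitySchedulerLeafQueueInfo\",\"capacity\":0.1}";
    String asArray = "{\"type\":[\"capacitySchedulerLeafQueueInfo\"],\"capacity\":0.1}";
    System.out.println(queueType(mapper.readTree(asString)));  // capacitySchedulerLeafQueueInfo
    System.out.println(queueType(mapper.readTree(asArray)));   // capacitySchedulerLeafQueueInfo
  }
}
{code}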
[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928
[ https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201695#comment-15201695 ] Sangjin Lee commented on YARN-4712: --- For the record (and following up on yesterday's offline discussion), I am +1 on the patch. If there are different opinions on this, we can reopen the discussion. Thanks! [~varun_saxena], will you do the honors of committing this patch? > CPU Usage Metric is not captured properly in YARN-2928 > -- > > Key: YARN-4712 > URL: https://issues.apache.org/jira/browse/YARN-4712 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R > Labels: yarn-2928-1st-milestone > Attachments: YARN-4712-YARN-2928.v1.001.patch, > YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch, > YARN-4712-YARN-2928.v1.004.patch, YARN-4712-YARN-2928.v1.005.patch, > YARN-4712-YARN-2928.v1.006.patch > > > There are 2 issues with CPU usage collection: > * I was able to observe that many times the CPU usage got from > {{pTree.getCpuUsagePercent()}} is > ResourceCalculatorProcessTree.UNAVAILABLE (i.e. -1), but ContainersMonitor does > the calculation, i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore > /resourceCalculatorPlugin.getNumProcessors()}}, because of which the UNAVAILABLE > check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not > encountered. So proper checks need to be handled. > * {{EntityColumnPrefix.METRIC}} always uses LongConverter, but > ContainersMonitor is publishing decimal values for the CPU usage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
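A minimal sketch of the first guard the description asks for; the constant and variable names mirror the text above, but the class is a stand-in rather than the actual ContainersMonitor/NMTimelinePublisher change.
{code}
public class CpuUsageGuardSketch {

  static final float UNAVAILABLE = -1.0f;   // mirrors ResourceCalculatorProcessTree.UNAVAILABLE

  /**
   * Returns the per-node CPU percentage, or UNAVAILABLE if the raw per-core reading
   * was itself unavailable, so the publisher can skip the metric instead of reporting
   * a bogus negative value.
   */
  static float cpuUsageTotalCoresPercentage(float cpuUsagePercentPerCore, int numProcessors) {
    if (cpuUsagePercentPerCore == UNAVAILABLE || numProcessors <= 0) {
      return UNAVAILABLE;
    }
    return cpuUsagePercentPerCore / numProcessors;
  }

  public static void main(String[] args) {
    System.out.println(cpuUsageTotalCoresPercentage(UNAVAILABLE, 8));  // -1.0, metric skipped
    System.out.println(cpuUsageTotalCoresPercentage(240.0f, 8));       // 30.0
  }
}
{code}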
[jira] [Created] (YARN-4833) For Queue AccessControlException client retries multiple times on both RM
Bibin A Chundatt created YARN-4833: -- Summary: For Queue AccessControlException client retries multiple times on both RM Key: YARN-4833 URL: https://issues.apache.org/jira/browse/YARN-4833 Project: Hadoop YARN Issue Type: Bug Reporter: Bibin A Chundatt Assignee: Bibin A Chundatt Submit application to queue where ACL is enabled and submitted user is not having access. Client retries till failMaxattempt 10 times. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4829) Add support for binary units
[ https://issues.apache.org/jira/browse/YARN-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198012#comment-15198012 ] Arun Suresh commented on YARN-4829: --- Thanks, LGTM +1 pending Jenkins > Add support for binary units > > > Key: YARN-4829 > URL: https://issues.apache.org/jira/browse/YARN-4829 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: YARN-4829-YARN-3926.001.patch, > YARN-4829-YARN-3926.002.patch, YARN-4829-YARN-3926.003.patch > > > The units conversion util should have support for binary units. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4820) ResourceManager web redirects in HA mode drops query parameters
[ https://issues.apache.org/jira/browse/YARN-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199524#comment-15199524 ] Varun Vasudev commented on YARN-4820: - bq. I was just looking at the redirect logic and noting it was looking at 302's only. [~steve_l] - can you point me to where you found this logic? The logic in RMWebAppFilter.java is that if the node is a standby, it redirects all requests to the active. > ResourceManager web redirects in HA mode drops query parameters > --- > > Key: YARN-4820 > URL: https://issues.apache.org/jira/browse/YARN-4820 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: YARN-4820.001.patch, YARN-4820.002.patch > > > The RMWebAppFilter redirects http requests from the standby to the active. > However it drops all the query parameters when it does the redirect. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
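The general shape of the fix, sketched with plain JDK string handling (this is not the actual RMWebAppFilter change; the helper is hypothetical): rebuild the redirect location with the original query string appended instead of dropping it.
{code}
public class RedirectSketch {

  /** Builds the redirect location, keeping the caller's query parameters. */
  static String redirectTo(String activeRmBase, String requestPath, String queryString) {
    StringBuilder location = new StringBuilder(activeRmBase).append(requestPath);
    if (queryString != null && !queryString.isEmpty()) {
      location.append('?').append(queryString);
    }
    return location.toString();
  }

  public static void main(String[] args) {
    // In the filter, HttpServletRequest#getQueryString() would supply "states=RUNNING" here.
    System.out.println(redirectTo("http://rm2:8088", "/ws/v1/cluster/apps", "states=RUNNING"));
  }
}
{code}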
[jira] [Commented] (YARN-4607) AppAttempt page TotalOutstandingResource Requests table support pagination
[ https://issues.apache.org/jira/browse/YARN-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202269#comment-15202269 ] Hadoop QA commented on YARN-4607: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 15m 52s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 53s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 8s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 15s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 26s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 47s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 50s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 8s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 8s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 18s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s {color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: patch generated 0 new + 21 unchanged - 2 fixed = 21 total (was 23) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s {color} | 
{color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 21s {color} | {color:green} hadoop-yarn-server-common in the patch passed with JDK v1.8.0_74. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 72m 53s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_74. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 24s {color} | {color:green} hadoop-yarn-server-common in the patch passed with JDK v1.7.0_95. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 74m 53s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:gr
[jira] [Commented] (YARN-4815) ATS 1.5 timelineclinet impl try to create attempt directory for every event call
[ https://issues.apache.org/jira/browse/YARN-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202431#comment-15202431 ] Junping Du commented on YARN-4815: -- v2 patch LGTM. However, can we fix the checkstyle issues? > ATS 1.5 timelineclinet impl try to create attempt directory for every event > call > > > Key: YARN-4815 > URL: https://issues.apache.org/jira/browse/YARN-4815 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-4815.1.patch, YARN-4815.2.patch > > > The ATS 1.5 timeline client impl tries to create the attempt directory for every event > call. Since only one directory-creation call per attempt is needed, this is > causing a perf issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
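One way the per-event directory creation could be avoided, as a hedged sketch (the attached patches may take a different approach; the class, paths and IDs are illustrative): remember which attempt directories have already been created and touch the filesystem only on the first event of an attempt.
{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class AttemptDirCacheSketch {

  private final Set<String> createdAttemptDirs = ConcurrentHashMap.newKeySet();
  private final Path baseDir;

  AttemptDirCacheSketch(Path baseDir) {
    this.baseDir = baseDir;
  }

  /** Called on every event; performs the directory creation at most once per attempt. */
  Path attemptDir(String attemptId) throws IOException {
    Path dir = baseDir.resolve(attemptId);
    if (createdAttemptDirs.add(attemptId)) {   // first event for this attempt only
      Files.createDirectories(dir);
    }
    return dir;
  }

  public static void main(String[] args) throws IOException {
    AttemptDirCacheSketch cache = new AttemptDirCacheSketch(Paths.get("/tmp/ats-active"));
    cache.attemptDir("appattempt_0001_000001");   // creates the directory
    cache.attemptDir("appattempt_0001_000001");   // no filesystem call this time
  }
}
{code}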
[jira] [Commented] (YARN-4746) yarn web services should convert parse failures of appId to 400
[ https://issues.apache.org/jira/browse/YARN-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202503#comment-15202503 ] Hadoop QA commented on YARN-4746: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 41s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 59s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 53s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 16s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 1s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 57s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 51s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 27s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 41s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 41s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 41s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 5s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 5s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 47s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 47s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 18s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 39s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 55s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_74. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 21s {color} | {color:green} hadoop-yarn-server-common in the patch passed with JDK v1.8.0_74. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 10s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with JDK v1.8.0_74. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 73m 48s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_74. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 17s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 26s {color} | {color:gree