[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357038#comment-14357038 ] Sangjin Lee commented on YARN-3039: --- Thanks [~djp]! I'll take a look at it and add my comments. [Aggregator wireup] Implement ATS app-appgregator service discovery --- Key: YARN-3039 URL: https://issues.apache.org/jira/browse/YARN-3039 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Junping Du Attachments: Service Binding for applicationaggregator of ATS (draft).pdf, Service Discovery For Application Aggregator of ATS (v2).pdf, YARN-3039-no-test.patch, YARN-3039-v2-incomplete.patch, YARN-3039-v3-core-changes-only.patch, YARN-3039-v4.patch Per design in YARN-2928, implement ATS writer service discovery. This is essential for off-node clients to send writes to the right ATS writer. This should also handle the case of AM failures. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3295) Fix documentation nits found in markdown conversion
[ https://issues.apache.org/jira/browse/YARN-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357044#comment-14357044 ] Hudson commented on YARN-3295: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2079 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2079/]) YARN-3295. Fix documentation nits found in markdown conversion. Contributed by Masatake Iwasaki. (ozawa: rev 30c428a858c179645d6dc82b7027f6b7e871b439) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnCommands.md * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerRestart.md Fix documentation nits found in markdown conversion --- Key: YARN-3295 URL: https://issues.apache.org/jira/browse/YARN-3295 Project: Hadoop YARN Issue Type: Bug Components: documentation Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Trivial Fix For: 2.7.0 Attachments: YARN-3295.001.patch * In ResourceManagerRestart page - Inside the Notes, the _e{epoch}_ , was highlighted before but not now. * yarn container command {noformat} list ApplicationId (should be Application Attempt ID ?) Lists containers for the application attempt. {noformat} * yarn application attempt command {noformat} list ApplicationId Lists applications attempts from the RM (should be Lists applications attempts for the given application) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3248) Display count of nodes blacklisted by apps in the web UI
[ https://issues.apache.org/jira/browse/YARN-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357172#comment-14357172 ] Tsuyoshi Ozawa commented on YARN-3248: -- [~vvasudev] It would be good to have new test file TestApplicationReportPBImpl under ./hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/records/impl/pb/. TestSerializedExceptionPBImpl.java would be helpful for you to write the test cases. Display count of nodes blacklisted by apps in the web UI Key: YARN-3248 URL: https://issues.apache.org/jira/browse/YARN-3248 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, resourcemanager Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: All applications.png, App page.png, Screenshot.jpg, apache-yarn-3248.0.patch, apache-yarn-3248.1.patch, apache-yarn-3248.2.patch, apache-yarn-3248.3.patch It would be really useful when debugging app performance and failure issues to get a count of the nodes blacklisted by individual apps displayed in the web UI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
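As a rough illustration of the kind of test Tsuyoshi suggests (modeled loosely on TestSerializedExceptionPBImpl), here is a minimal proto round-trip sketch; the fields set and the constructor usage are assumptions for illustration, not a finished test for this patch.
{code}
import org.apache.hadoop.yarn.api.records.impl.pb.ApplicationReportPBImpl;
import org.junit.Assert;
import org.junit.Test;

public class TestApplicationReportPBImpl {
  @Test
  public void testProtoRoundTrip() {
    // Build a report, convert it to its proto form and back, then compare fields.
    ApplicationReportPBImpl report = new ApplicationReportPBImpl();
    report.setUser("test-user");
    report.setQueue("default");
    // ... set the remaining fields under test, e.g. the new blacklisted-nodes count ...
    ApplicationReportPBImpl copy = new ApplicationReportPBImpl(report.getProto());
    Assert.assertEquals("test-user", copy.getUser());
    Assert.assertEquals("default", copy.getQueue());
  }
}
{code}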
[jira] [Commented] (YARN-2280) Resource manager web service fields are not accessible
[ https://issues.apache.org/jira/browse/YARN-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357010#comment-14357010 ] Hudson commented on YARN-2280: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #129 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/129/]) YARN-2280. Resource manager web service fields are not accessible (Krisztian Horvath via aw) (aw: rev a5cf985bf501fd032124d121dcae80538db9e380) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/SchedulerTypeInfo.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/NodesInfo.java Resource manager web service fields are not accessible -- Key: YARN-2280 URL: https://issues.apache.org/jira/browse/YARN-2280 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.4.0, 2.4.1 Reporter: Krisztian Horvath Assignee: Krisztian Horvath Priority: Trivial Fix For: 3.0.0 Attachments: YARN-2280.patch Using the resource manager's rest api (org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices) some rest call returns a class where the fields after the unmarshal cannot be accessible. For example SchedulerTypeInfo - schedulerInfo. Using the same classes on client side these fields only accessible via reflection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
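For context, the inaccessible-fields problem described here is typically addressed by adding public accessors to the DAO classes the patch touches; the sketch below is a hedged illustration of that kind of change, and the exact accessor added by the patch is an assumption.
{code}
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement(name = "scheduler")
@XmlAccessorType(XmlAccessType.FIELD)
public class SchedulerTypeInfo {
  protected SchedulerInfo schedulerInfo;

  // A public accessor like this lets REST clients reuse the DAO class
  // after unmarshalling without falling back to reflection.
  public SchedulerInfo getSchedulerInfo() {
    return schedulerInfo;
  }
}
{code}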
[jira] [Created] (YARN-3334) [Event Producers] NM start to posting some app related metrics in early POC stage of phase 2.
Junping Du created YARN-3334: Summary: [Event Producers] NM start to posting some app related metrics in early POC stage of phase 2. Key: YARN-3334 URL: https://issues.apache.org/jira/browse/YARN-3334 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: YARN-2928 Reporter: Junping Du Assignee: Junping Du -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3187) Documentation of Capacity Scheduler Queue mapping based on user or group
[ https://issues.apache.org/jira/browse/YARN-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357013#comment-14357013 ] Hudson commented on YARN-3187: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #129 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/129/]) YARN-3187. Documentation of Capacity Scheduler Queue mapping based on user or group. Contributed by Gururaj Shetty (jianhe: rev a380643d2044a4974e379965f65066df2055d003) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md * hadoop-yarn-project/CHANGES.txt Documentation of Capacity Scheduler Queue mapping based on user or group Key: YARN-3187 URL: https://issues.apache.org/jira/browse/YARN-3187 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, documentation Affects Versions: 2.6.0 Reporter: Naganarasimha G R Assignee: Gururaj Shetty Labels: documentation Fix For: 2.7.0 Attachments: YARN-3187.1.patch, YARN-3187.2.patch, YARN-3187.3.patch, YARN-3187.4.patch YARN-2411 exposes a very useful feature {{support simple user and group mappings to queues}} but its not captured in the documentation. So in this jira we plan to document this feature -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3187) Documentation of Capacity Scheduler Queue mapping based on user or group
[ https://issues.apache.org/jira/browse/YARN-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357042#comment-14357042 ] Hudson commented on YARN-3187: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2079 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2079/]) YARN-3187. Documentation of Capacity Scheduler Queue mapping based on user or group. Contributed by Gururaj Shetty (jianhe: rev a380643d2044a4974e379965f65066df2055d003) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md Documentation of Capacity Scheduler Queue mapping based on user or group Key: YARN-3187 URL: https://issues.apache.org/jira/browse/YARN-3187 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, documentation Affects Versions: 2.6.0 Reporter: Naganarasimha G R Assignee: Gururaj Shetty Labels: documentation Fix For: 2.7.0 Attachments: YARN-3187.1.patch, YARN-3187.2.patch, YARN-3187.3.patch, YARN-3187.4.patch YARN-2411 exposes a very useful feature {{support simple user and group mappings to queues}} but its not captured in the documentation. So in this jira we plan to document this feature -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2280) Resource manager web service fields are not accessible
[ https://issues.apache.org/jira/browse/YARN-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357039#comment-14357039 ] Hudson commented on YARN-2280: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2079 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2079/]) YARN-2280. Resource manager web service fields are not accessible (Krisztian Horvath via aw) (aw: rev a5cf985bf501fd032124d121dcae80538db9e380) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/SchedulerTypeInfo.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/NodesInfo.java Resource manager web service fields are not accessible -- Key: YARN-2280 URL: https://issues.apache.org/jira/browse/YARN-2280 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.4.0, 2.4.1 Reporter: Krisztian Horvath Assignee: Krisztian Horvath Priority: Trivial Fix For: 3.0.0 Attachments: YARN-2280.patch Using the resource manager's rest api (org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices) some rest call returns a class where the fields after the unmarshal cannot be accessible. For example SchedulerTypeInfo - schedulerInfo. Using the same classes on client side these fields only accessible via reflection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3154) Should not upload partial logs for MR jobs or other 'short-running' applications
[ https://issues.apache.org/jira/browse/YARN-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357520#comment-14357520 ] Vinod Kumar Vavilapalli commented on YARN-3154: --- Looks close. Can you also update the javadoc for existing APIs to say that those APIs only take effect on logs that exist at the time of application finish? Should not upload partial logs for MR jobs or other 'short-running' applications - Key: YARN-3154 URL: https://issues.apache.org/jira/browse/YARN-3154 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, resourcemanager Reporter: Xuan Gong Assignee: Xuan Gong Priority: Blocker Attachments: YARN-3154.1.patch, YARN-3154.2.patch, YARN-3154.3.patch Currently, if we are running a MR job, and we do not set the log interval properly, we will have its partial logs uploaded and then removed from the local filesystem, which is not right. We only upload the partial logs for LRS applications. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
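As a rough illustration of the javadoc note Vinod is asking for, a hedged sketch follows; the method shown is only a placeholder for whichever log-aggregation API the patch actually touches, not a confirmed signature.
{code}
/**
 * ...existing description of the log-aggregation pattern...
 * <p>
 * Note: this setting only takes effect on log files that still exist at the
 * time the application finishes; it does not trigger uploading of partial
 * logs for short-running (non-LRS) applications while they are running.
 */
public abstract void setRolledLogsIncludePattern(String includePattern);
{code}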
[jira] [Commented] (YARN-3304) ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters
[ https://issues.apache.org/jira/browse/YARN-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357536#comment-14357536 ] Vinod Kumar Vavilapalli commented on YARN-3304: --- How about we simplify this and throw an explicit exception when we think it is unavailable, and let higher layers handle it appropriately when that happens? IAC, I'd like us to make some progress to unblock 2.7. Tx. ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters Key: YARN-3304 URL: https://issues.apache.org/jira/browse/YARN-3304 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Junping Du Assignee: Karthik Kambatla Priority: Blocker Per discussions in YARN-3296, getCpuUsagePercent() will return -1 for the unavailable case while other resource metrics return 0 in the same case, which sounds inconsistent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
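A minimal sketch of the alternative Vinod suggests, i.e. throwing instead of returning a sentinel value; the exception type and fields below are assumptions, not the committed API.
{code}
// Sketch only: make the unavailable case explicit for callers.
public float getCpuUsagePercent() {
  if (!cpuUsageAvailable) {
    // hypothetical unchecked exception; higher layers decide how to react
    throw new ResourceUnavailableException(
        "CPU usage is not yet available for this process tree");
  }
  return cpuUsagePercent;
}
{code}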
[jira] [Commented] (YARN-3298) User-limit should be enforced in CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357564#comment-14357564 ] Wangda Tan commented on YARN-3298: -- [~nroberts], I think I got your point now. Yes, as you said, if we enforce the limit (used + required <= user-limit) and don't change the user-limit computation, the queue cannot go over its configured capacity. Originally, this ticket was trying to solve the jitter problem we have with YARN-2069. However, YARN-2069 will only take effect when the queue becomes over-satisfied; at that time, CS will not give the queue more resources, so the jitter won't actually happen. Jitter will happen when we have YARN-2113 (preemption will happen to balance usage between users when the queue isn't over its capacity); at that time, user-limit enforcement should be done. Basically, I agree with your method, which is {{current_capacity = max(queue.used, queue.capacity) + now_required}}; it can solve the problem of the queue going over its configured capacity, but it seems not necessary at least for now. We can delay this change until YARN-2113 requires it. Thoughts? Thanks, Wangda User-limit should be enforced in CapacityScheduler -- Key: YARN-3298 URL: https://issues.apache.org/jira/browse/YARN-3298 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler, yarn Reporter: Wangda Tan Assignee: Wangda Tan User-limit is not treated as a hard-limit for now: it will not consider required-resource (the resource of the resource request being allocated), and when a user's used resource equals the user-limit, allocation will still continue. This will generate jitter issues when we have YARN-2069 (the preemption policy kills a container under a user, and the scheduler allocates a container under the same user soon after). The expected behavior should be the same as for the queue's capacity: only when user.usage + required <= user-limit (1) will the queue continue to allocate containers. (1) The user-limit mentioned here is determined by the following computation:
{code}
current-capacity = queue.used + now-required  (when queue.used > queue.capacity)
                   queue.capacity             (when queue.used <= queue.capacity)
user-limit = min(max(current-capacity / #active-users, current-capacity * user-limit / 100), queue-capacity * user-limit-factor)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
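For clarity, here is the user-limit formula above written out as a small self-contained Java sketch; it mirrors only the documented computation and is not the CapacityScheduler implementation (variable names are illustrative).
{code}
// Illustrative only: computes the user-limit from the formula in the description.
final class UserLimitSketch {
  static long userLimit(long queueUsed, long queueCapacity, long nowRequired,
      int activeUsers, int userLimitPercent, float userLimitFactor) {
    // current-capacity depends on whether the queue is already over its capacity
    long currentCapacity = (queueUsed > queueCapacity)
        ? queueUsed + nowRequired
        : queueCapacity;
    long perUserShare = Math.max(currentCapacity / activeUsers,
        currentCapacity * userLimitPercent / 100);
    return Math.min(perUserShare, (long) (queueCapacity * userLimitFactor));
  }
  // A user may keep allocating only while user.usage + required <= userLimit(...).
}
{code}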
[jira] [Commented] (YARN-1963) Support priorities across applications within the same queue
[ https://issues.apache.org/jira/browse/YARN-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357600#comment-14357600 ] Eric Payne commented on YARN-1963: -- Thanks, [~sunilg], for your work on in-queue priorities. Along with [~nroberts], I'm confused about why priority labels are needed. As a user, I just need to know that the higher the number, the higher the priority. Then, I just need a way to see what priority each application is using and a way to set the priority of applications. To me, it just seems like labels will get in the way. Support priorities across applications within the same queue - Key: YARN-1963 URL: https://issues.apache.org/jira/browse/YARN-1963 Project: Hadoop YARN Issue Type: New Feature Components: api, resourcemanager Reporter: Arun C Murthy Assignee: Sunil G Attachments: 0001-YARN-1963-prototype.patch, YARN Application Priorities Design.pdf, YARN Application Priorities Design_01.pdf It will be very useful to support priorities among applications within the same queue, particularly in production scenarios. It allows for finer-grained controls without having to force admins to create a multitude of queues, plus allows existing applications to continue using existing queues which are usually part of institutional memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3267) Timelineserver applies the ACL rules after applying the limit on the number of records
[ https://issues.apache.org/jira/browse/YARN-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357552#comment-14357552 ] Hadoop QA commented on YARN-3267: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703968/YARN-3267.3.patch against trunk revision 30c428a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice: org.apache.hadoop.mapred.TestMRTimelineEventHandling The test build failed in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6920//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6920//console This message is automatically generated. Timelineserver applies the ACL rules after applying the limit on the number of records -- Key: YARN-3267 URL: https://issues.apache.org/jira/browse/YARN-3267 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Prakash Ramachandran Assignee: Chang Li Attachments: YARN-3267.3.patch, YARN_3267_V1.patch, YARN_3267_V2.patch, YARN_3267_WIP.patch, YARN_3267_WIP1.patch, YARN_3267_WIP2.patch, YARN_3267_WIP3.patch While fetching the entities from timelineserver, the limit is applied on the entities to be fetched from leveldb, the ACL filters are applied after this (TimelineDataManager.java::getEntities). this could mean that even if there are entities available which match the query criteria, we could end up not getting any results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3034) [Aggregator wireup] Implement RM starting its ATS writer
[ https://issues.apache.org/jira/browse/YARN-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-3034: Attachment: YARN-3034-20150312-1.patch Have attached a patch with the basic structure up. Please review, [~gtCarrera9], and please check whether the package structuring is fine. [Aggregator wireup] Implement RM starting its ATS writer Key: YARN-3034 URL: https://issues.apache.org/jira/browse/YARN-3034 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Naganarasimha G R Attachments: YARN-3034-20150312-1.patch, YARN-3034.20150205-1.patch Per design in YARN-2928, implement resource managers starting their own ATS writers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2836) RM behaviour on token renewal failures is broken
[ https://issues.apache.org/jira/browse/YARN-2836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-2836: -- Target Version/s: 2.8.0 (was: 2.7.0) This is too late for 2.7. I'll try getting something done in 2.8. Moving it out. RM behaviour on token renewal failures is broken Key: YARN-2836 URL: https://issues.apache.org/jira/browse/YARN-2836 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Priority: Blocker Found this while reviewing YARN-2834. We now completely ignore token renewal failures. For things like Timeline tokens, which are automatically obtained whether the app needs them or not (we should fix this to be user driven), we can ignore failures. But for HDFS Tokens etc., ignoring failures is bad because it (1) wastes resources, as AMs will continue and eventually fail, and (2) the app doesn't know what happened when it eventually fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357579#comment-14357579 ] Varun Saxena commented on YARN-3047: Code related to TimelineClientServiceManager i.e. to handle YARN CLI requests, I will handle as part of another JIRA. Will raise it later. [Data Serving] Set up ATS reader with basic request serving structure and lifecycle --- Key: YARN-3047 URL: https://issues.apache.org/jira/browse/YARN-3047 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Varun Saxena Attachments: YARN-3047.001.patch, YARN-3047.02.patch Per design in YARN-2938, set up the ATS reader as a service and implement the basic structure as a service. It includes lifecycle management, request serving, and so on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357618#comment-14357618 ] Li Lu commented on YARN-3047: - Hi [~varun_saxena], thanks for updating this patch! I have a quick question: is the reader supposed to be a separate daemon in the server (there is a main function in TimelineReaderServer)? I think it would be very helpful if you could have a simple write-up for your current reader architecture. Thanks! [Data Serving] Set up ATS reader with basic request serving structure and lifecycle --- Key: YARN-3047 URL: https://issues.apache.org/jira/browse/YARN-3047 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Varun Saxena Attachments: YARN-3047.001.patch, YARN-3047.02.patch Per design in YARN-2938, set up the ATS reader as a service and implement the basic structure as a service. It includes lifecycle management, request serving, and so on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357629#comment-14357629 ] Hadoop QA commented on YARN-3047: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12704008/YARN-3047.02.patch against trunk revision 344d7cb. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6925//console This message is automatically generated. [Data Serving] Set up ATS reader with basic request serving structure and lifecycle --- Key: YARN-3047 URL: https://issues.apache.org/jira/browse/YARN-3047 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Varun Saxena Attachments: YARN-3047.001.patch, YARN-3047.02.patch Per design in YARN-2938, set up the ATS reader as a service and implement the basic structure as a service. It includes lifecycle management, request serving, and so on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3154) Should not upload partial logs for MR jobs or other 'short-running' applications
[ https://issues.apache.org/jira/browse/YARN-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-3154: Attachment: YARN-3154.4.patch Should not upload partial logs for MR jobs or other 'short-running' applications - Key: YARN-3154 URL: https://issues.apache.org/jira/browse/YARN-3154 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, resourcemanager Reporter: Xuan Gong Assignee: Xuan Gong Priority: Blocker Attachments: YARN-3154.1.patch, YARN-3154.2.patch, YARN-3154.3.patch, YARN-3154.4.patch Currently, if we are running a MR job, and we do not set the log interval properly, we will have its partial logs uploaded and then removed from the local filesystem, which is not right. We only upload the partial logs for LRS applications. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3021) YARN's delegation-token handling disallows certain trust setups to operate properly over DistCp
[ https://issues.apache.org/jira/browse/YARN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357553#comment-14357553 ] Vinod Kumar Vavilapalli commented on YARN-3021: --- bq. Hi Vinod Kumar Vavilapalli and Harsh J, comments on this approach that Jian described above? Caught up with the discussion. The latest proposal seems like a reasonable approach without adding too much throw-away functionality in YARN. +1 for the approach, let's get this done. YARN's delegation-token handling disallows certain trust setups to operate properly over DistCp --- Key: YARN-3021 URL: https://issues.apache.org/jira/browse/YARN-3021 Project: Hadoop YARN Issue Type: Bug Components: security Affects Versions: 2.3.0 Reporter: Harsh J Attachments: YARN-3021.001.patch, YARN-3021.002.patch, YARN-3021.003.patch, YARN-3021.patch Consider this scenario of 3 realms: A, B and COMMON, where A trusts COMMON, and B trusts COMMON (one way trusts both), and both A and B run HDFS + YARN clusters. Now if one logs in with a COMMON credential, and runs a job on A's YARN that needs to access B's HDFS (such as a DistCp), the operation fails in the RM, as it attempts a renewDelegationToken(…) synchronously during application submission (to validate the managed token before it adds it to a scheduler for automatic renewal). The call obviously fails cause B realm will not trust A's credentials (here, the RM's principal is the renewer). In the 1.x JobTracker the same call is present, but it is done asynchronously and once the renewal attempt failed we simply ceased to schedule any further attempts of renewals, rather than fail the job immediately. We should change the logic such that we attempt the renewal but go easy on the failure and skip the scheduling alone, rather than bubble back an error to the client, failing the app submission. This way the old behaviour is retained. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
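As a rough sketch of the behaviour the description asks for (attempt the renewal, tolerate the failure, and only skip scheduling further renewals rather than failing the submission); scheduleRenewal() here is a placeholder, not the actual DelegationTokenRenewer method.
{code}
// Sketch only: tolerate a renewal failure at submission time.
try {
  long expiryTime = token.renew(conf);   // validate the token once
  scheduleRenewal(token, expiryTime);    // placeholder for scheduling future renewals
} catch (Exception e) {
  LOG.warn("Unable to renew token " + token
      + "; skipping automatic renewal, app submission continues", e);
  // no rethrow: matches the old 1.x JobTracker behaviour described above
}
{code}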
[jira] [Commented] (YARN-3021) YARN's delegation-token handling disallows certain trust setups to operate properly over DistCp
[ https://issues.apache.org/jira/browse/YARN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357569#comment-14357569 ] Yongjun Zhang commented on YARN-3021: - Hi [~vinodkv], Thanks for the comments. We do have consensus about the approach too; I have been caught up with other critical stuff. Will try to get to this asap. Thanks. YARN's delegation-token handling disallows certain trust setups to operate properly over DistCp --- Key: YARN-3021 URL: https://issues.apache.org/jira/browse/YARN-3021 Project: Hadoop YARN Issue Type: Bug Components: security Affects Versions: 2.3.0 Reporter: Harsh J Attachments: YARN-3021.001.patch, YARN-3021.002.patch, YARN-3021.003.patch, YARN-3021.patch Consider this scenario of 3 realms: A, B and COMMON, where A trusts COMMON, and B trusts COMMON (one way trusts both), and both A and B run HDFS + YARN clusters. Now if one logs in with a COMMON credential, and runs a job on A's YARN that needs to access B's HDFS (such as a DistCp), the operation fails in the RM, as it attempts a renewDelegationToken(…) synchronously during application submission (to validate the managed token before it adds it to a scheduler for automatic renewal). The call obviously fails cause B realm will not trust A's credentials (here, the RM's principal is the renewer). In the 1.x JobTracker the same call is present, but it is done asynchronously and once the renewal attempt failed we simply ceased to schedule any further attempts of renewals, rather than fail the job immediately. We should change the logic such that we attempt the renewal but go easy on the failure and skip the scheduling alone, rather than bubble back an error to the client, failing the app submission. This way the old behaviour is retained. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3335) Job In Error State Will Lost Jobhistory Of Second and Later Attempts
[ https://issues.apache.org/jira/browse/YARN-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated YARN-3335: --- Summary: Job In Error State Will Lost Jobhistory Of Second and Later Attempts (was: Job In Error State Will Lost Jobhistory For Second and Later Attempts) Job In Error State Will Lost Jobhistory Of Second and Later Attempts Key: YARN-3335 URL: https://issues.apache.org/jira/browse/YARN-3335 Project: Hadoop YARN Issue Type: Bug Reporter: Chang Li Assignee: Chang Li Attachments: YARN-3335.1.patch Related to the fixed issue MAPREDUCE-6230, which causes a Job to get into error state. In that situation the Job's second or a later attempt could succeed, but those later attempts' history files will all be lost, because the first attempt in error state will copy its history file to the intermediate dir while mistakenly thinking of itself as the last attempt. The Jobhistory server will later move the history file of that error attempt from the intermediate dir to the done dir while ignoring all of that job's later attempts' history files in the intermediate dir. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1963) Support priorities across applications within the same queue
[ https://issues.apache.org/jira/browse/YARN-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357241#comment-14357241 ] Sunil G commented on YARN-1963: --- Thank you [~vinodkv] and [~nroberts] for the comments. Considering usability, labels will be handy. And the scheduler must be agnostic of labels and should handle only integers, like in Linux. This will add some complexity to a priority manager inside the RM which will translate label to integer and vice versa. But a call can be taken by looking at all possibilities, and the same can be standardized so that a minimal working version can be pushed in by improvising on the patches submitted (a working prototype was attached). Hoping [~leftnoteasy] and [~eepayne] will join the discussion. Support priorities across applications within the same queue - Key: YARN-1963 URL: https://issues.apache.org/jira/browse/YARN-1963 Project: Hadoop YARN Issue Type: New Feature Components: api, resourcemanager Reporter: Arun C Murthy Assignee: Sunil G Attachments: 0001-YARN-1963-prototype.patch, YARN Application Priorities Design.pdf, YARN Application Priorities Design_01.pdf It will be very useful to support priorities among applications within the same queue, particularly in production scenarios. It allows for finer-grained controls without having to force admins to create a multitude of queues, plus allows existing applications to continue using existing queues which are usually part of institutional memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
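A toy sketch of the label-to-integer translation Sunil describes above; the map contents and class name are purely illustrative assumptions, not the proposed priority-manager API.
{code}
import java.util.HashMap;
import java.util.Map;

// Illustration only: the RM-side manager keeps a label-to-integer mapping,
// while the scheduler itself sees nothing but the integer priority.
final class PriorityLabelSketch {
  private static final Map<String, Integer> LABEL_TO_PRIORITY = new HashMap<>();
  static {
    LABEL_TO_PRIORITY.put("low", 1);
    LABEL_TO_PRIORITY.put("normal", 5);
    LABEL_TO_PRIORITY.put("high", 10);   // higher number == higher priority
  }

  static int toSchedulerPriority(String label, int defaultPriority) {
    return LABEL_TO_PRIORITY.getOrDefault(label, defaultPriority);
  }
}
{code}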
[jira] [Updated] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-3047: --- Attachment: YARN-3047.02.patch [Data Serving] Set up ATS reader with basic request serving structure and lifecycle --- Key: YARN-3047 URL: https://issues.apache.org/jira/browse/YARN-3047 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Varun Saxena Attachments: YARN-3047.001.patch, YARN-3047.02.patch Per design in YARN-2938, set up the ATS reader as a service and implement the basic structure as a service. It includes lifecycle management, request serving, and so on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357625#comment-14357625 ] Varun Saxena commented on YARN-3047: Yes. It is a daemon. Multiple reader instances will come in phase 2(YARN-3118). Sure, will update a write up. [Data Serving] Set up ATS reader with basic request serving structure and lifecycle --- Key: YARN-3047 URL: https://issues.apache.org/jira/browse/YARN-3047 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Varun Saxena Attachments: YARN-3047.001.patch, YARN-3047.02.patch Per design in YARN-2938, set up the ATS reader as a service and implement the basic structure as a service. It includes lifecycle management, request serving, and so on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3267) Timelineserver applies the ACL rules after applying the limit on the number of records
[ https://issues.apache.org/jira/browse/YARN-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357644#comment-14357644 ] Jonathan Eagles commented on YARN-3267: --- [~lichangleo], A couple more minor things with this patch:
* LeveldbTimelineStore, MemoryTimelineStore, and TimelineReader all have an extra UserGroupInformation import
* Spacing issues
** 'Check{' should be written as 'Check {'
** 'ugi=callerUGI;' should be written as 'ugi = callerUGI;'
** 'throws IOException{' should be written as 'throws IOException {'
* check logic simplification
{code}
try {
  if (!timelineACLsManager.checkAccess(
      ugi, ApplicationAccessType.VIEW_APP, entity)) {
    return false;
  }
}
{code}
might be simpler as
{code}
try {
  return timelineACLsManager.checkAccess(
      ugi, ApplicationAccessType.VIEW_APP, entity);
}
{code}
* reduce logging level
{code}
} catch (YarnException e) {
  LOG.error("Error when verifying access for user " + ugi
      + " on the events of the timeline entity "
      + new EntityIdentifier(entity.getEntityId(), entity.getEntityType()), e);
  return false;
}
{code}
this might be better suited as info level since any missing domain can trigger this scenario. Timelineserver applies the ACL rules after applying the limit on the number of records -- Key: YARN-3267 URL: https://issues.apache.org/jira/browse/YARN-3267 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Prakash Ramachandran Assignee: Chang Li Attachments: YARN-3267.3.patch, YARN_3267_V1.patch, YARN_3267_V2.patch, YARN_3267_WIP.patch, YARN_3267_WIP1.patch, YARN_3267_WIP2.patch, YARN_3267_WIP3.patch While fetching the entities from timelineserver, the limit is applied on the entities to be fetched from leveldb, the ACL filters are applied after this (TimelineDataManager.java::getEntities). this could mean that even if there are entities available which match the query criteria, we could end up not getting any results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
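Putting Jonathan's two code suggestions together, a hedged sketch of how the resulting check might read; the enclosing method name and fields are assumptions, not the exact patch.
{code}
// Sketch only: simplified access check with the logging level reduced to info.
private boolean canRead(UserGroupInformation ugi, TimelineEntity entity) {
  try {
    return timelineACLsManager.checkAccess(
        ugi, ApplicationAccessType.VIEW_APP, entity);
  } catch (YarnException e) {
    LOG.info("Error when verifying access for user " + ugi
        + " on the events of the timeline entity "
        + new EntityIdentifier(entity.getEntityId(), entity.getEntityType()), e);
    return false;
  }
}
{code}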
[jira] [Commented] (YARN-3243) CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits.
[ https://issues.apache.org/jira/browse/YARN-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357684#comment-14357684 ] Hadoop QA commented on YARN-3243: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703964/YARN-3243.4.patch against trunk revision 30c428a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 43 warning messages. See https://builds.apache.org/job/PreCommit-YARN-Build/6923//artifact/patchprocess/diffJavadocWarnings.txt for details. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebappAuthentication org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebAppFairScheduler org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6923//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6923//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6923//console This message is automatically generated. CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits. - Key: YARN-3243 URL: https://issues.apache.org/jira/browse/YARN-3243 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-3243.1.patch, YARN-3243.2.patch, YARN-3243.3.patch, YARN-3243.4.patch Now CapacityScheduler has some issues to make sure ParentQueue always obeys its capacity limits, for example: 1) When allocating a container in a parent queue, it will only check parentQueue.usage < parentQueue.max. If a leaf queue allocated a container with size > (parentQueue.max - parentQueue.usage), the parent queue can exceed its max resource limit, as in the following example:
{code}
          A (usage=54, max=55)
         /  \
        A1   A2
(usage=1, max=55) (usage=53, max=53)
{code}
Queue-A2 is able to allocate a container since its usage < max, but if we do that, A's usage can exceed A.max. 2) When doing the continuous reservation check, the parent queue will only tell its children "you need to unreserve *some* resource, so that I will be less than my maximum resource", but it will not tell how much resource needs to be unreserved. This may lead to the parent queue exceeding its configured maximum capacity as well. With YARN-3099/YARN-3124, we now have the {{ResourceUsage}} class in each class; *here is my proposal*:
- ParentQueue will set its children's ResourceUsage.headroom, which means the *maximum resource its children can allocate*.
- ParentQueue will set its children's headroom to be (saying the parent's name is qA): min(qA.headroom, qA.max - qA.used). This will make sure qA's ancestors' capacity will be enforced as well (qA.headroom is set by qA's parent).
- {{needToUnReserve}} is not necessary; instead, children can get how much resource needs to be unreserved to keep within their parent's resource limit.
- Moreover, with this, YARN-3026 will make a clear boundary between LeafQueue and FiCaSchedulerApp; headroom will consider user-limit, etc.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
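A small sketch of the headroom propagation proposed above, assuming illustrative accessor names (setHeadroom/getHeadroom/getMaxResource are not necessarily the actual CSQueue API).
{code}
// Sketch only: parent qA passes min(qA.headroom, qA.max - qA.used) down to its
// children, so every ancestor's limit constrains each child's future allocations.
Resource childLimit = Resources.min(resourceCalculator, clusterResource,
    qA.getHeadroom(),                                        // limit set by qA's own parent
    Resources.subtract(qA.getMaxResource(), qA.getUsed()));  // qA's remaining room
for (CSQueue child : qA.getChildQueues()) {
  child.setHeadroom(childLimit);
}
{code}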
[jira] [Commented] (YARN-3324) TestDockerContainerExecutor should clean test docker image from local repository after test is done
[ https://issues.apache.org/jira/browse/YARN-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357300#comment-14357300 ] Ravindra Naik commented on YARN-3324: - Do you think that these steps will be sufficient ? 1. Stop any container that is using that docker image. 2. Delete any container that is using that docker image. 3. Delete the docker image. TestDockerContainerExecutor should clean test docker image from local repository after test is done --- Key: YARN-3324 URL: https://issues.apache.org/jira/browse/YARN-3324 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.6.0 Reporter: Chen He Attachments: YARN-3324-branch-2.6.0.002.patch, YARN-3324-trunk.002.patch Current TestDockerContainerExecutor only cleans the temp directory in local file system but leaves the test docker image in local docker repository. It should be cleaned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
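A hedged sketch of a JUnit teardown implementing the three steps Ravindra lists above; the container and image names are placeholders for whatever the test actually uses.
{code}
// Sketch only: clean up the test container and image after the test run
// (names below are placeholders).
@After
public void removeTestDockerImage() throws Exception {
  String container = "test_container";
  String image = "test_docker_image";
  Shell.execCommand("bash", "-c", "docker stop " + container + " || true"); // 1. stop the container
  Shell.execCommand("bash", "-c", "docker rm " + container + " || true");   // 2. delete the container
  Shell.execCommand("bash", "-c", "docker rmi " + image + " || true");      // 3. delete the image
}
{code}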
[jira] [Commented] (YARN-3324) TestDockerContainerExecutor should clean test docker image from local repository after test is done
[ https://issues.apache.org/jira/browse/YARN-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357313#comment-14357313 ] Hadoop QA commented on YARN-3324: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703955/YARN-3324-branch-2.6.0.002.patch against trunk revision 30c428a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6917//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6917//console This message is automatically generated. TestDockerContainerExecutor should clean test docker image from local repository after test is done --- Key: YARN-3324 URL: https://issues.apache.org/jira/browse/YARN-3324 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.6.0 Reporter: Chen He Attachments: YARN-3324-branch-2.6.0.002.patch, YARN-3324-trunk.002.patch Current TestDockerContainerExecutor only cleans the temp directory in local file system but leaves the test docker image in local docker repository. It should be cleaned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3204) Fix new findbug warnings in hadoop-yarn-server-resourcemanager(resourcemanager.scheduler.fair)
[ https://issues.apache.org/jira/browse/YARN-3204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357353#comment-14357353 ] Brahma Reddy Battula commented on YARN-3204: Kindly review if you find time ..thanks.. Fix new findbug warnings in hadoop-yarn-server-resourcemanager(resourcemanager.scheduler.fair) -- Key: YARN-3204 URL: https://issues.apache.org/jira/browse/YARN-3204 Project: Hadoop YARN Issue Type: Bug Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Priority: Blocker Attachments: YARN-3204-001.patch, YARN-3204-002.patch, YARN-3204-003.patch Please check following findbug report.. https://builds.apache.org/job/PreCommit-YARN-Build/6644//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile
[ https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357402#comment-14357402 ] Hadoop QA commented on YARN-3080: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703979/YARN-3080.patch against trunk revision 30c428a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6921//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6921//console This message is automatically generated. The DockerContainerExecutor could not write the right pid to container pidFile -- Key: YARN-3080 URL: https://issues.apache.org/jira/browse/YARN-3080 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.6.0 Reporter: Beckham007 Assignee: Abin Shahab Attachments: YARN-3080.patch, YARN-3080.patch, YARN-3080.patch, YARN-3080.patch, YARN-3080.patch The docker_container_executor_session.sh is like this: {quote} #!/usr/bin/env bash echo `/usr/bin/docker inspect --format {{.State.Pid}} container_1421723685222_0008_01_02` /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_02/container_1421723685222_0008_01_02.pid.tmp /bin/mv -f /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_02/container_1421723685222_0008_01_02.pid.tmp /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_02/container_1421723685222_0008_01_02.pid /usr/bin/docker run --rm --name container_1421723685222_0008_01_02 -e GAIA_HOST_IP=c162 -e GAIA_API_SERVER=10.6.207.226:8080 -e GAIA_CLUSTER_ID=shpc-nm_restart -e GAIA_QUEUE=root.tdwadmin -e GAIA_APP_NAME=test_nm_docker -e GAIA_INSTANCE_ID=1 -e GAIA_CONTAINER_ID=container_1421723685222_0008_01_02 --memory=32M --cpu-shares=1024 -v /data/nm_restart/hadoop-2.4.1/data/yarn/container-logs/application_1421723685222_0008/container_1421723685222_0008_01_02:/data/nm_restart/hadoop-2.4.1/data/yarn/container-logs/application_1421723685222_0008/container_1421723685222_0008_01_02 -v /data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_02:/data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_02 -P -e A=B --privileged=true docker.oa.com:8080/library/centos7 bash /data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_02/launch_container.sh {quote} The 
DockerContainerExecutor uses docker inspect before docker run, so docker inspect couldn't get the right pid for the docker container; signalContainer() and NM restart would fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-1884) ContainerReport should have nodeHttpAddress
[ https://issues.apache.org/jira/browse/YARN-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1884: Attachment: YARN-1884.4.patch Addressed all the comments. ContainerReport should have nodeHttpAddress --- Key: YARN-1884 URL: https://issues.apache.org/jira/browse/YARN-1884 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Xuan Gong Attachments: YARN-1884.1.patch, YARN-1884.2.patch, YARN-1884.3.patch, YARN-1884.4.patch In web UI, we're going to show the node, which used to link to the NM web page. However, on AHS web UI, and RM web UI after YARN-1809, the node field has to be set to nodeID where the container is allocated. We need to add nodeHttpAddress to the containerReport to link users to NM web page -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3243) CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits.
[ https://issues.apache.org/jira/browse/YARN-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3243: - Attachment: YARN-3243.4.patch Addressed all comments from Jian except: bq. Do you think passing down a QueueHeadRoom compared with QueueMaxLimit may make the code simpler Some places need the available resource and some places need the limit; there should not be much difference in code effort whether we pass down headroom or limit. Attached new patch (ver.4) CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits. - Key: YARN-3243 URL: https://issues.apache.org/jira/browse/YARN-3243 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-3243.1.patch, YARN-3243.2.patch, YARN-3243.3.patch, YARN-3243.4.patch Now CapacityScheduler has some issues to make sure ParentQueue always obeys its capacity limits, for example: 1) When allocating a container in a parent queue, it will only check parentQueue.usage < parentQueue.max. If a leaf queue allocated a container with size > (parentQueue.max - parentQueue.usage), the parent queue can exceed its max resource limit, as in the following example:
{code}
          A (usage=54, max=55)
         /  \
        A1   A2
(usage=1, max=55) (usage=53, max=53)
{code}
Queue-A2 is able to allocate a container since its usage < max, but if we do that, A's usage can exceed A.max. 2) When doing the continuous reservation check, the parent queue will only tell its children "you need to unreserve *some* resource, so that I will be less than my maximum resource", but it will not tell how much resource needs to be unreserved. This may lead to the parent queue exceeding its configured maximum capacity as well. With YARN-3099/YARN-3124, we now have the {{ResourceUsage}} class in each class; *here is my proposal*:
- ParentQueue will set its children's ResourceUsage.headroom, which means the *maximum resource its children can allocate*.
- ParentQueue will set its children's headroom to be (saying the parent's name is qA): min(qA.headroom, qA.max - qA.used). This will make sure qA's ancestors' capacity will be enforced as well (qA.headroom is set by qA's parent).
- {{needToUnReserve}} is not necessary; instead, children can get how much resource needs to be unreserved to keep within their parent's resource limit.
- Moreover, with this, YARN-3026 will make a clear boundary between LeafQueue and FiCaSchedulerApp; headroom will consider user-limit, etc.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1884) ContainerReport should have nodeHttpAddress
[ https://issues.apache.org/jira/browse/YARN-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357432#comment-14357432 ] Hadoop QA commented on YARN-1884: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703957/YARN-1884.4.patch against trunk revision 30c428a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.client.TestGetGroups org.apache.hadoop.yarn.server.resourcemanager.TestRM org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerNodeLabelUpdate Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6918//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6918//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6918//console This message is automatically generated. ContainerReport should have nodeHttpAddress --- Key: YARN-1884 URL: https://issues.apache.org/jira/browse/YARN-1884 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Xuan Gong Attachments: YARN-1884.1.patch, YARN-1884.2.patch, YARN-1884.3.patch, YARN-1884.4.patch In web UI, we're going to show the node, which used to be to link to the NM web page. However, on AHS web UI, and RM web UI after YARN-1809, the node field has to be set to nodeID where the container is allocated. We need to add nodeHttpAddress to the containerReport to link users to NM web page -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1884) ContainerReport should have nodeHttpAddress
[ https://issues.apache.org/jira/browse/YARN-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357465#comment-14357465 ] Xuan Gong commented on YARN-1884: - Testcase failures are not related ContainerReport should have nodeHttpAddress --- Key: YARN-1884 URL: https://issues.apache.org/jira/browse/YARN-1884 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Xuan Gong Attachments: YARN-1884.1.patch, YARN-1884.2.patch, YARN-1884.3.patch, YARN-1884.4.patch In web UI, we're going to show the node, which used to be to link to the NM web page. However, on AHS web UI, and RM web UI after YARN-1809, the node field has to be set to nodeID where the container is allocated. We need to add nodeHttpAddress to the containerReport to link users to NM web page -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3324) TestDockerContainerExecutor should clean test docker image from local repository after test is done
[ https://issues.apache.org/jira/browse/YARN-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Naik updated YARN-3324: Attachment: (was: YARN-3324-trunk.patch) TestDockerContainerExecutor should clean test docker image from local repository after test is done --- Key: YARN-3324 URL: https://issues.apache.org/jira/browse/YARN-3324 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.6.0 Reporter: Chen He Attachments: YARN-3324-branch-2.6.0.002.patch, YARN-3324-trunk.002.patch Current TestDockerContainerExecutor only cleans the temp directory in local file system but leaves the test docker image in local docker repository. It should be cleaned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3324) TestDockerContainerExecutor should clean test docker image from local repository after test is done
[ https://issues.apache.org/jira/browse/YARN-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Naik updated YARN-3324: Attachment: YARN-3324-branch-2.6.0.002.patch YARN-3324-trunk.002.patch updated patches to consider the case when docker is not installed. TestDockerContainerExecutor should clean test docker image from local repository after test is done --- Key: YARN-3324 URL: https://issues.apache.org/jira/browse/YARN-3324 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.6.0 Reporter: Chen He Attachments: YARN-3324-branch-2.6.0.002.patch, YARN-3324-branch-2.6.0.patch, YARN-3324-trunk.002.patch, YARN-3324-trunk.patch Current TestDockerContainerExecutor only cleans the temp directory in local file system but leaves the test docker image in local docker repository. It should be cleaned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3267) Timelineserver applies the ACL rules after applying the limit on the number of records
[ https://issues.apache.org/jira/browse/YARN-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357292#comment-14357292 ] Chang Li commented on YARN-3267: [~jeagles] Thanks for review. Updated my patch according to your suggestions. Timelineserver applies the ACL rules after applying the limit on the number of records -- Key: YARN-3267 URL: https://issues.apache.org/jira/browse/YARN-3267 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Prakash Ramachandran Assignee: Chang Li Attachments: YARN-3267.3.patch, YARN_3267_V1.patch, YARN_3267_V2.patch, YARN_3267_WIP.patch, YARN_3267_WIP1.patch, YARN_3267_WIP2.patch, YARN_3267_WIP3.patch While fetching the entities from timelineserver, the limit is applied on the entities to be fetched from leveldb, the ACL filters are applied after this (TimelineDataManager.java::getEntities). this could mean that even if there are entities available which match the query criteria, we could end up not getting any results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile
[ https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abin Shahab updated YARN-3080: -- Attachment: YARN-3080.patch Removed gitignore The DockerContainerExecutor could not write the right pid to container pidFile -- Key: YARN-3080 URL: https://issues.apache.org/jira/browse/YARN-3080 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.6.0 Reporter: Beckham007 Assignee: Abin Shahab Attachments: YARN-3080.patch, YARN-3080.patch, YARN-3080.patch, YARN-3080.patch, YARN-3080.patch The docker_container_executor_session.sh is like this: {quote} #!/usr/bin/env bash echo `/usr/bin/docker inspect --format {{.State.Pid}} container_1421723685222_0008_01_02` /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_02/container_1421723685222_0008_01_02.pid.tmp /bin/mv -f /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_02/container_1421723685222_0008_01_02.pid.tmp /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_02/container_1421723685222_0008_01_02.pid /usr/bin/docker run --rm --name container_1421723685222_0008_01_02 -e GAIA_HOST_IP=c162 -e GAIA_API_SERVER=10.6.207.226:8080 -e GAIA_CLUSTER_ID=shpc-nm_restart -e GAIA_QUEUE=root.tdwadmin -e GAIA_APP_NAME=test_nm_docker -e GAIA_INSTANCE_ID=1 -e GAIA_CONTAINER_ID=container_1421723685222_0008_01_02 --memory=32M --cpu-shares=1024 -v /data/nm_restart/hadoop-2.4.1/data/yarn/container-logs/application_1421723685222_0008/container_1421723685222_0008_01_02:/data/nm_restart/hadoop-2.4.1/data/yarn/container-logs/application_1421723685222_0008/container_1421723685222_0008_01_02 -v /data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_02:/data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_02 -P -e A=B --privileged=true docker.oa.com:8080/library/centos7 bash /data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_02/launch_container.sh {quote} The DockerContainerExecutor use docker inspect before docker run, so the docker inspect couldn't get the right pid for the docker, signalContainer() and nm restart would fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
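The ordering problem described above (running docker inspect before docker run, so the pid file is written before the container actually has a PID) can be avoided by launching the container detached and only writing the pid file once docker inspect reports a real PID. The sketch below is illustrative only and is not the attached patch; the class and method names (DockerPidRecorder, launchAndRecordPid) are made up for the example.
{code}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class DockerPidRecorder {

  // Start the container detached, poll `docker inspect` until the daemon
  // reports a non-zero PID, and only then write the pid file.
  public static void launchAndRecordPid(String containerName, String image,
      String launchScript, Path pidFile) throws IOException, InterruptedException {
    new ProcessBuilder("docker", "run", "-d", "--name", containerName,
        image, "bash", launchScript).inheritIO().start().waitFor();

    String pid = "0";
    for (int i = 0; i < 30 && "0".equals(pid); i++) {
      Process inspect = new ProcessBuilder("docker", "inspect",
          "--format", "{{.State.Pid}}", containerName).start();
      inspect.waitFor();
      pid = new String(inspect.getInputStream().readAllBytes(),
          StandardCharsets.UTF_8).trim();
      if (pid.isEmpty() || "0".equals(pid)) {
        pid = "0";
        Thread.sleep(1000);
      }
    }

    // Write via a temp file and move, mirroring the .pid.tmp/.pid dance
    // in the generated session script.
    Path tmp = Paths.get(pidFile.toString() + ".tmp");
    Files.write(tmp, pid.getBytes(StandardCharsets.UTF_8));
    Files.move(tmp, pidFile, StandardCopyOption.REPLACE_EXISTING);
  }
}
{code}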
[jira] [Updated] (YARN-3335) Job In Error State Will Lost Jobhistory For Second and Later Attempts
[ https://issues.apache.org/jira/browse/YARN-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated YARN-3335: --- Attachment: YARN-3335.1.patch Job In Error State Will Lost Jobhistory For Second and Later Attempts - Key: YARN-3335 URL: https://issues.apache.org/jira/browse/YARN-3335 Project: Hadoop YARN Issue Type: Bug Reporter: Chang Li Assignee: Chang Li Attachments: YARN-3335.1.patch Related to the fixed issue MAPREDUCE-6230, which causes a job to get into the ERROR state. In that situation the job's second or a later attempt could succeed, but those later attempts' history files will all be lost, because the first attempt, in the ERROR state, copies its history file to the intermediate dir while mistakenly thinking of itself as the last attempt. The job history server will later move the history file of that error attempt from the intermediate dir to the done dir while ignoring all of that job's later attempt history files in the intermediate dir. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3324) TestDockerContainerExecutor should clean test docker image from local repository after test is done
[ https://issues.apache.org/jira/browse/YARN-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357249#comment-14357249 ] Chen He commented on YARN-3324: --- Hi [~ravindra.naik], thank you for the patch. Actually, just adding a {{docker rmi testImage}} cannot guarantee deletion of the testImage. TestDockerContainerExecutor should clean test docker image from local repository after test is done --- Key: YARN-3324 URL: https://issues.apache.org/jira/browse/YARN-3324 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.6.0 Reporter: Chen He Attachments: YARN-3324-branch-2.6.0.002.patch, YARN-3324-trunk.002.patch Current TestDockerContainerExecutor only cleans the temp directory in local file system but leaves the test docker image in local docker repository. It should be cleaned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3324) TestDockerContainerExecutor should clean test docker image from local repository after test is done
[ https://issues.apache.org/jira/browse/YARN-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Naik updated YARN-3324: Attachment: (was: YARN-3324-branch-2.6.0.patch) TestDockerContainerExecutor should clean test docker image from local repository after test is done --- Key: YARN-3324 URL: https://issues.apache.org/jira/browse/YARN-3324 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.6.0 Reporter: Chen He Attachments: YARN-3324-branch-2.6.0.002.patch, YARN-3324-trunk.002.patch Current TestDockerContainerExecutor only cleans the temp directory in local file system but leaves the test docker image in local docker repository. It should be cleaned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3335) Job In Error State Will Lost Jobhistory For Second and Later Attempts
[ https://issues.apache.org/jira/browse/YARN-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357464#comment-14357464 ] Chang Li commented on YARN-3335: Another plausible solution is to always let a job in the ERROR state retry. Job In Error State Will Lost Jobhistory For Second and Later Attempts - Key: YARN-3335 URL: https://issues.apache.org/jira/browse/YARN-3335 Project: Hadoop YARN Issue Type: Bug Reporter: Chang Li Assignee: Chang Li Attachments: YARN-3335.1.patch Related to the fixed issue MAPREDUCE-6230, which causes a job to get into the ERROR state. In that situation the job's second or a later attempt could succeed, but those later attempts' history files will all be lost, because the first attempt, in the ERROR state, copies its history file to the intermediate dir while mistakenly thinking of itself as the last attempt. The job history server will later move the history file of that error attempt from the intermediate dir to the done dir while ignoring all of that job's later attempt history files in the intermediate dir. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
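The root cause described in this issue is that an attempt ending in the ERROR state still treats itself as the last attempt and publishes its history file to the intermediate directory. The following is a simplified decision-logic sketch of the kind of guard being discussed; it is not the attached patch, and the JobState values and the isLastAMRetry flag are stand-ins for the real MR types.
{code}
public class HistoryPublishSketch {
  enum JobState { SUCCEEDED, FAILED, KILLED, ERROR }

  // Only publish the history file to the intermediate dir when this attempt
  // finished cleanly or is genuinely the last AM retry. An attempt that ended
  // in ERROR with retries remaining should not shadow a later attempt's file.
  static boolean shouldPublishHistory(JobState finalState, boolean isLastAMRetry) {
    boolean finishedCleanly = finalState == JobState.SUCCEEDED
        || finalState == JobState.FAILED
        || finalState == JobState.KILLED;
    return finishedCleanly || isLastAMRetry;
  }

  public static void main(String[] args) {
    // First attempt hits ERROR but more retries remain: do not publish.
    System.out.println(shouldPublishHistory(JobState.ERROR, false)); // false
    // Final retry, whatever the state: publish so history is not lost entirely.
    System.out.println(shouldPublishHistory(JobState.ERROR, true));  // true
  }
}
{code}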
[jira] [Updated] (YARN-3267) Timelineserver applies the ACL rules after applying the limit on the number of records
[ https://issues.apache.org/jira/browse/YARN-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated YARN-3267: --- Attachment: YARN-3267.3.patch Timelineserver applies the ACL rules after applying the limit on the number of records -- Key: YARN-3267 URL: https://issues.apache.org/jira/browse/YARN-3267 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Prakash Ramachandran Assignee: Chang Li Attachments: YARN-3267.3.patch, YARN_3267_V1.patch, YARN_3267_V2.patch, YARN_3267_WIP.patch, YARN_3267_WIP1.patch, YARN_3267_WIP2.patch, YARN_3267_WIP3.patch While fetching the entities from timelineserver, the limit is applied on the entities to be fetched from leveldb, the ACL filters are applied after this (TimelineDataManager.java::getEntities). this could mean that even if there are entities available which match the query criteria, we could end up not getting any results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2828) Enable auto refresh of web pages (using http parameter)
[ https://issues.apache.org/jira/browse/YARN-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vijay Bhat updated YARN-2828: - Attachment: (was: HADOOP-9329.004.patch) Enable auto refresh of web pages (using http parameter) --- Key: YARN-2828 URL: https://issues.apache.org/jira/browse/YARN-2828 Project: Hadoop YARN Issue Type: Improvement Reporter: Tim Robertson Assignee: Vijay Bhat Priority: Minor Attachments: YARN-2828.001.patch, YARN-2828.002.patch, YARN-2828.003.patch, YARN-2828.004.patch The MR1 Job Tracker had a useful HTTP parameter, e.g. refresh=3, that could be appended to URLs and enabled a periodic page reload. This was very useful when developing mapreduce jobs, especially to watch counters changing. This is lost in the YARN interface. It could be implemented as a page element (e.g. a drop-down or so), but I'd recommend that the page not be made more cluttered, and simply bring back the optional refresh HTTP param. It worked really nicely. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2828) Enable auto refresh of web pages (using http parameter)
[ https://issues.apache.org/jira/browse/YARN-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vijay Bhat updated YARN-2828: - Attachment: YARN-2828.004.patch Enable auto refresh of web pages (using http parameter) --- Key: YARN-2828 URL: https://issues.apache.org/jira/browse/YARN-2828 Project: Hadoop YARN Issue Type: Improvement Reporter: Tim Robertson Assignee: Vijay Bhat Priority: Minor Attachments: YARN-2828.001.patch, YARN-2828.002.patch, YARN-2828.003.patch, YARN-2828.004.patch The MR1 Job Tracker had a useful HTTP parameter, e.g. refresh=3, that could be appended to URLs and enabled a periodic page reload. This was very useful when developing mapreduce jobs, especially to watch counters changing. This is lost in the YARN interface. It could be implemented as a page element (e.g. a drop-down or so), but I'd recommend that the page not be made more cluttered, and simply bring back the optional refresh HTTP param. It worked really nicely. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
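For illustration, the refresh-parameter idea can be expressed with the standard HTTP Refresh response header. The servlet below is only a minimal sketch of the concept, not the attached YARN-2828 patch (which goes through the YARN web framework); the class name and page body are invented for the example.
{code}
import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Honor an optional refresh=N query parameter by asking the browser to
// reload the page every N seconds via the Refresh response header.
public class RefreshParamServlet extends HttpServlet {
  @Override
  protected void doGet(HttpServletRequest req, HttpServletResponse resp)
      throws IOException {
    String refresh = req.getParameter("refresh");
    if (refresh != null) {
      try {
        int seconds = Integer.parseInt(refresh);
        if (seconds > 0) {
          resp.setIntHeader("Refresh", seconds);  // e.g. ?refresh=3
        }
      } catch (NumberFormatException ignored) {
        // malformed values are simply ignored
      }
    }
    resp.setContentType("text/html");
    resp.getWriter().println("<html><body>page body goes here</body></html>");
  }
}
{code}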
[jira] [Created] (YARN-3336) FileSystem memory leak in DelegationTokenRenewer
zhihai xu created YARN-3336: --- Summary: FileSystem memory leak in DelegationTokenRenewer Key: YARN-3336 URL: https://issues.apache.org/jira/browse/YARN-3336 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: zhihai xu Assignee: zhihai xu Priority: Critical FileSystem memory leak in DelegationTokenRenewer. Every time DelegationTokenRenewer#obtainSystemTokensForUser is called, a new FileSystem entry is added to FileSystem#CACHE and is never garbage collected. This is the implementation of obtainSystemTokensForUser:
{code}
protected Token<?>[] obtainSystemTokensForUser(String user,
    final Credentials credentials) throws IOException, InterruptedException {
  // Get new hdfs tokens on behalf of this user
  UserGroupInformation proxyUser =
      UserGroupInformation.createProxyUser(user,
          UserGroupInformation.getLoginUser());
  Token<?>[] newTokens = proxyUser.doAs(
      new PrivilegedExceptionAction<Token<?>[]>() {
        @Override
        public Token<?>[] run() throws Exception {
          return FileSystem.get(getConfig()).addDelegationTokens(
              UserGroupInformation.getLoginUser().getUserName(), credentials);
        }
      });
  return newTokens;
}
{code}
The memory leak happens when FileSystem.get(getConfig()) is called with a new proxy user, because createProxyUser always creates a new Subject:
{code}
public static UserGroupInformation createProxyUser(String user,
    UserGroupInformation realUser) {
  if (user == null || user.isEmpty()) {
    throw new IllegalArgumentException("Null user");
  }
  if (realUser == null) {
    throw new IllegalArgumentException("Null real user");
  }
  Subject subject = new Subject();
  Set<Principal> principals = subject.getPrincipals();
  principals.add(new User(user));
  principals.add(new RealUser(realUser));
  UserGroupInformation result = new UserGroupInformation(subject);
  result.setAuthenticationMethod(AuthenticationMethod.PROXY);
  return result;
}
{code}
FileSystem#Cache#Key.equals compares the ugi:
{code}
Key(URI uri, Configuration conf, long unique) throws IOException {
  scheme = uri.getScheme() == null ? "" : uri.getScheme().toLowerCase();
  authority = uri.getAuthority() == null ? "" : uri.getAuthority().toLowerCase();
  this.unique = unique;
  this.ugi = UserGroupInformation.getCurrentUser();
}

public boolean equals(Object obj) {
  if (obj == this) {
    return true;
  }
  if (obj != null && obj instanceof Key) {
    Key that = (Key) obj;
    return isEqual(this.scheme, that.scheme)
        && isEqual(this.authority, that.authority)
        && isEqual(this.ugi, that.ugi)
        && (this.unique == that.unique);
  }
  return false;
}
{code}
UserGroupInformation.equals compares the subject by reference:
{code}
public boolean equals(Object o) {
  if (o == this) {
    return true;
  } else if (o == null || getClass() != o.getClass()) {
    return false;
  } else {
    return subject == ((UserGroupInformation) o).subject;
  }
}
{code}
So in this case, every time createProxyUser and FileSystem.get(getConfig()) are called, a new FileSystem is created and a new entry is added to FileSystem.CACHE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
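For illustration, a small standalone program that reproduces the behavior described above: two proxy UGIs for the same user are never equal (the subject is compared by reference), so each doAs block gets its own FileSystem instance cached under a distinct key. The user name "alice" is arbitrary; this is a sketch, not code from the patch.
{code}
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.UserGroupInformation;

public class FileSystemCacheLeakDemo {
  public static void main(String[] args) throws Exception {
    final Configuration conf = new Configuration();

    FileSystem fs1 = getFsAsProxy("alice", conf);
    FileSystem fs2 = getFsAsProxy("alice", conf);

    // Prints false: a new entry was added to FileSystem.CACHE both times.
    System.out.println("same cached FileSystem? " + (fs1 == fs2));
  }

  static FileSystem getFsAsProxy(String user, final Configuration conf)
      throws Exception {
    UserGroupInformation proxy = UserGroupInformation.createProxyUser(
        user, UserGroupInformation.getLoginUser());
    return proxy.doAs(new PrivilegedExceptionAction<FileSystem>() {
      @Override
      public FileSystem run() throws Exception {
        return FileSystem.get(conf);
      }
    });
  }
}
{code}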
[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)
[ https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357928#comment-14357928 ] Naganarasimha G R commented on YARN-2495: -
bq. It's just that I don't like this areNodeLabelsSetInReq flag in the protocol. Are there other ways of achieving this?
The other way out is to always send the set of labels as part of every heartbeat. We wanted to avoid this traffic, hence we initially came up with this approach when we were supporting multiple labels for a node (maybe in future we might support multiple labels again, right?).
{{I think treating invalid labels as a disaster case will be, well, a disaster.}} : liked the sentence :)
bq. How about we let the node run (just like we let an unhealthy node run) and report it in the diagnostics? I'm okay keeping that same behavior during registration too.
Yes, this is similar to the earlier behavior we had in December's patch. Additionally, to report the failure back, we had added one flag to inform the NM whether the RM accepted the labels or not, and the diagnostic message was also set appropriately. Is this approach fine?
Allow admin specify labels from each NM (Distributed configuration) --- Key: YARN-2495 URL: https://issues.apache.org/jira/browse/YARN-2495 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Wangda Tan Assignee: Naganarasimha G R Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, YARN-2495_20141022.1.patch Target of this JIRA is to allow admin specify labels in each NM, this covers - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or using script suggested by [~aw] (YARN-2729) ) - NM will send labels to RM via ResourceTracker API - RM will set labels in NodeLabelManager when NM register/update labels -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3034) [Aggregator wireup] Implement RM starting its ATS writer
[ https://issues.apache.org/jira/browse/YARN-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357724#comment-14357724 ] Hadoop QA commented on YARN-3034: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12704029/YARN-3034-20150312-1.patch against trunk revision 7a346bc. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6927//console This message is automatically generated. [Aggregator wireup] Implement RM starting its ATS writer Key: YARN-3034 URL: https://issues.apache.org/jira/browse/YARN-3034 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Naganarasimha G R Attachments: YARN-3034-20150312-1.patch, YARN-3034.20150205-1.patch Per design in YARN-2928, implement resource managers starting their own ATS writers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3336) FileSystem memory leak in DelegationTokenRenewer
[ https://issues.apache.org/jira/browse/YARN-3336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated YARN-3336: Attachment: YARN-3336.000.patch FileSystem memory leak in DelegationTokenRenewer Key: YARN-3336 URL: https://issues.apache.org/jira/browse/YARN-3336 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: zhihai xu Assignee: zhihai xu Priority: Critical Attachments: YARN-3336.000.patch FileSystem memory leak in DelegationTokenRenewer. Every time DelegationTokenRenewer#obtainSystemTokensForUser is called, a new FileSystem entry is added to FileSystem#CACHE and is never garbage collected. This is the implementation of obtainSystemTokensForUser:
{code}
protected Token<?>[] obtainSystemTokensForUser(String user,
    final Credentials credentials) throws IOException, InterruptedException {
  // Get new hdfs tokens on behalf of this user
  UserGroupInformation proxyUser =
      UserGroupInformation.createProxyUser(user,
          UserGroupInformation.getLoginUser());
  Token<?>[] newTokens = proxyUser.doAs(
      new PrivilegedExceptionAction<Token<?>[]>() {
        @Override
        public Token<?>[] run() throws Exception {
          return FileSystem.get(getConfig()).addDelegationTokens(
              UserGroupInformation.getLoginUser().getUserName(), credentials);
        }
      });
  return newTokens;
}
{code}
The memory leak happens when FileSystem.get(getConfig()) is called with a new proxy user, because createProxyUser always creates a new Subject:
{code}
public static UserGroupInformation createProxyUser(String user,
    UserGroupInformation realUser) {
  if (user == null || user.isEmpty()) {
    throw new IllegalArgumentException("Null user");
  }
  if (realUser == null) {
    throw new IllegalArgumentException("Null real user");
  }
  Subject subject = new Subject();
  Set<Principal> principals = subject.getPrincipals();
  principals.add(new User(user));
  principals.add(new RealUser(realUser));
  UserGroupInformation result = new UserGroupInformation(subject);
  result.setAuthenticationMethod(AuthenticationMethod.PROXY);
  return result;
}
{code}
FileSystem#Cache#Key.equals compares the ugi:
{code}
Key(URI uri, Configuration conf, long unique) throws IOException {
  scheme = uri.getScheme() == null ? "" : uri.getScheme().toLowerCase();
  authority = uri.getAuthority() == null ? "" : uri.getAuthority().toLowerCase();
  this.unique = unique;
  this.ugi = UserGroupInformation.getCurrentUser();
}

public boolean equals(Object obj) {
  if (obj == this) {
    return true;
  }
  if (obj != null && obj instanceof Key) {
    Key that = (Key) obj;
    return isEqual(this.scheme, that.scheme)
        && isEqual(this.authority, that.authority)
        && isEqual(this.ugi, that.ugi)
        && (this.unique == that.unique);
  }
  return false;
}
{code}
UserGroupInformation.equals compares the subject by reference:
{code}
public boolean equals(Object o) {
  if (o == this) {
    return true;
  } else if (o == null || getClass() != o.getClass()) {
    return false;
  } else {
    return subject == ((UserGroupInformation) o).subject;
  }
}
{code}
So in this case, every time createProxyUser and FileSystem.get(getConfig()) are called, a new FileSystem is created and a new entry is added to FileSystem.CACHE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2828) Enable auto refresh of web pages (using http parameter)
[ https://issues.apache.org/jira/browse/YARN-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357886#comment-14357886 ] Hadoop QA commented on YARN-2828: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12704041/YARN-2828.004.patch against trunk revision 7a346bc. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.security.token.delegation.web.TestWebDelegationToken org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebApp Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6930//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6930//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6930//console This message is automatically generated. Enable auto refresh of web pages (using http parameter) --- Key: YARN-2828 URL: https://issues.apache.org/jira/browse/YARN-2828 Project: Hadoop YARN Issue Type: Improvement Reporter: Tim Robertson Assignee: Vijay Bhat Priority: Minor Attachments: YARN-2828.001.patch, YARN-2828.002.patch, YARN-2828.003.patch, YARN-2828.004.patch The MR1 Job Tracker had a useful HTTP parameter of e.g. refresh=3 that could be appended to URLs which enabled a page reload. This was very useful when developing mapreduce jobs, especially to watch counters changing. This is lost in the the Yarn interface. Could be implemented as a page element (e.g. drop down or so), but I'd recommend that the page not be more cluttered, and simply bring back the optional refresh HTTP param. It worked really nicely. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3243) CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits.
[ https://issues.apache.org/jira/browse/YARN-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357736#comment-14357736 ] Wangda Tan commented on YARN-3243: -- Javadoc warnings are mis-reported by Jenkins, findbugs warnings are tracked by YARN-3204, and the test failures pass locally. CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits. - Key: YARN-3243 URL: https://issues.apache.org/jira/browse/YARN-3243 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-3243.1.patch, YARN-3243.2.patch, YARN-3243.3.patch, YARN-3243.4.patch Now CapacityScheduler has some issues in making sure ParentQueue always obeys its capacity limits, for example:
1) When allocating a container of a parent queue, it will only check parentQueue.usage < parentQueue.max. If a leaf queue allocates a container.size > (parentQueue.max - parentQueue.usage), the parent queue can exceed its max resource limit, as in the following example:
{code}
            A (usage=54, max=55)
           /  \
          A1    A2
(usage=1, max=55)  (usage=53, max=53)
{code}
Queue-A2 is able to allocate a container since its usage < max, but if we do that, A's usage can exceed A.max.
2) When doing the continuous reservation check, the parent queue will only tell children "you need to unreserve *some* resource, so that I will be less than my maximum resource", but it will not tell how much resource needs to be unreserved. This may lead to the parent queue exceeding its configured maximum capacity as well.
With YARN-3099/YARN-3124, now we have the {{ResourceUsage}} class in each queue. *Here is my proposal*:
- ParentQueue will set its children's ResourceUsage.headroom, which means the *maximum resource its children can allocate*.
- ParentQueue will set its children's headroom to (saying the parent's name is qA): min(qA.headroom, qA.max - qA.used). This will make sure qA's ancestors' capacity will be enforced as well (qA.headroom is set by qA's parent).
- {{needToUnReserve}} is not necessary; instead, children can get how much resource needs to be unreserved to keep their parent's resource limit.
- Moreover, with this, YARN-3026 will make a clear boundary between LeafQueue and FiCaSchedulerApp; headroom will consider user-limit, etc.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
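A simplified illustration of the proposed headroom propagation follows, using plain longs for a single resource dimension; the real patch works on the YARN Resource type inside ParentQueue/LeafQueue, so the class and method names here are invented for the example.
{code}
public class HeadroomSketch {

  // Child headroom = min(parent's own headroom, parent's max - parent's used),
  // so every ancestor's limit is enforced on the way down the queue tree.
  static long childHeadroom(long parentHeadroom, long parentMax, long parentUsed) {
    return Math.min(parentHeadroom, parentMax - parentUsed);
  }

  public static void main(String[] args) {
    // Example from the description: A has usage=54, max=55, so even though
    // A2 (usage=53, max=53) is under its own limit, its headroom is 0.
    long aHeadroom = childHeadroom(Long.MAX_VALUE, 55, 54); // root imposes no limit here
    long a2Headroom = childHeadroom(aHeadroom, 53, 53);
    System.out.println("A headroom  = " + aHeadroom);   // 1
    System.out.println("A2 headroom = " + a2Headroom);  // 0
  }
}
{code}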
[jira] [Commented] (YARN-1884) ContainerReport should have nodeHttpAddress
[ https://issues.apache.org/jira/browse/YARN-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357762#comment-14357762 ] Hadoop QA commented on YARN-1884: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703957/YARN-1884.4.patch against trunk revision 344d7cb. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6926//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6926//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6926//console This message is automatically generated. ContainerReport should have nodeHttpAddress --- Key: YARN-1884 URL: https://issues.apache.org/jira/browse/YARN-1884 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Xuan Gong Attachments: YARN-1884.1.patch, YARN-1884.2.patch, YARN-1884.3.patch, YARN-1884.4.patch In web UI, we're going to show the node, which used to be to link to the NM web page. However, on AHS web UI, and RM web UI after YARN-1809, the node field has to be set to nodeID where the container is allocated. We need to add nodeHttpAddress to the containerReport to link users to NM web page -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2792) Have a public Test-only API for creating important records that ecosystem projects can depend on
[ https://issues.apache.org/jira/browse/YARN-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-2792: -- Issue Type: Sub-task (was: Bug) Parent: YARN-1953 Have a public Test-only API for creating important records that ecosystem projects can depend on Key: YARN-2792 URL: https://issues.apache.org/jira/browse/YARN-2792 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Priority: Blocker From YARN-2789, {quote} Sigh. Even though this is a private API, it will be used by downstream projects for testing. It'll be useful for this to be re-instated, maybe with a deprecated annotation, so that older versions of downstream projects can build against Hadoop 2.6. I am inclined to have a separate test-only public util API that keeps compatibility for tests. Rather than opening unwanted APIs up. I'll file a separate ticket for this, we need all YARN apps/frameworks to move to that API instead of these private unstable APIs. For now, I am okay keeping a private compat for the APIs changed in YARN-2698. {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3243) CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits.
[ https://issues.apache.org/jira/browse/YARN-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357846#comment-14357846 ] Jian He commented on YARN-3243: ---
- getResourceLimitsOfChild: code comments should match the variable name
- CapacityScheduler: why is the following code moved?
{code}
// update this node to node label manager
if (labelManager != null) {
  labelManager.activateNode(nodeManager.getNodeID(),
      nodeManager.getTotalCapability());
}
{code}
- needToUnreserve is removed, so this comment is not valid any more:
{code}
// we got here by possibly ignoring parent queue capacity limits. If
// the parameter needToUnreserve i
{code}
- the checkReservedContainers flag in canAssignToThisQueue is not needed; instead, check whether resourceCouldBeUnreserved is none or not
CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits. - Key: YARN-3243 URL: https://issues.apache.org/jira/browse/YARN-3243 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-3243.1.patch, YARN-3243.2.patch, YARN-3243.3.patch, YARN-3243.4.patch Now CapacityScheduler has some issues in making sure ParentQueue always obeys its capacity limits, for example:
1) When allocating a container of a parent queue, it will only check parentQueue.usage < parentQueue.max. If a leaf queue allocates a container.size > (parentQueue.max - parentQueue.usage), the parent queue can exceed its max resource limit, as in the following example:
{code}
            A (usage=54, max=55)
           /  \
          A1    A2
(usage=1, max=55)  (usage=53, max=53)
{code}
Queue-A2 is able to allocate a container since its usage < max, but if we do that, A's usage can exceed A.max.
2) When doing the continuous reservation check, the parent queue will only tell children "you need to unreserve *some* resource, so that I will be less than my maximum resource", but it will not tell how much resource needs to be unreserved. This may lead to the parent queue exceeding its configured maximum capacity as well.
With YARN-3099/YARN-3124, now we have the {{ResourceUsage}} class in each queue. *Here is my proposal*:
- ParentQueue will set its children's ResourceUsage.headroom, which means the *maximum resource its children can allocate*.
- ParentQueue will set its children's headroom to (saying the parent's name is qA): min(qA.headroom, qA.max - qA.used). This will make sure qA's ancestors' capacity will be enforced as well (qA.headroom is set by qA's parent).
- {{needToUnReserve}} is not necessary; instead, children can get how much resource needs to be unreserved to keep their parent's resource limit.
- Moreover, with this, YARN-3026 will make a clear boundary between LeafQueue and FiCaSchedulerApp; headroom will consider user-limit, etc.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)
[ https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357852#comment-14357852 ] Vinod Kumar Vavilapalli commented on YARN-2495: --- bq. Well as craig informed, RegisterNodeManagerRequestProto.nodeLabels is already a set but as by default empty set is provided by protoc, its req to inform whether labels are set as part of request hence areNodeLabelsSetInReq is required. It's just that I don't like this _areNodeLabelsSetInReq_ flag in the protocol. Are there other ways of achieving this? bq. Well i am little confused here, As per wangda's earlier comment i understand that it was your comment to send shutdown (which i felt correct in terms of maintenance) I think treating invalid labels as a disaster case will be, well, a disaster. How about we let the node run (just like we let an unhealthy node run) and report it in the diagnostics? I'm okay keeping that same behavior during registration too. Allow admin specify labels from each NM (Distributed configuration) --- Key: YARN-2495 URL: https://issues.apache.org/jira/browse/YARN-2495 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Wangda Tan Assignee: Naganarasimha G R Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, YARN-2495_20141022.1.patch Target of this JIRA is to allow admin specify labels in each NM, this covers - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or using script suggested by [~aw] (YARN-2729) ) - NM will send labels to RM via ResourceTracker API - RM will set labels in NodeLabelManager when NM register/update labels -- This message was sent by Atlassian JIRA (v6.3.4#6332)
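To make the "report it in the diagnostics" option concrete, the following is a rough, self-contained sketch of the behavior being discussed: if a node reports labels the RM does not recognize, keep the node running and surface the problem through a diagnostic message rather than ordering a shutdown. All class and method names here (LabelCheckResult, checkReportedLabels, ...) are illustrative only and are not the YARN resource-tracker API.
{code}
import java.util.Collections;
import java.util.Set;

public class NodeLabelValidationSketch {

  static class LabelCheckResult {
    final boolean accepted;
    final String diagnostics;
    LabelCheckResult(boolean accepted, String diagnostics) {
      this.accepted = accepted;
      this.diagnostics = diagnostics;
    }
  }

  static LabelCheckResult checkReportedLabels(Set<String> reported,
      Set<String> knownClusterLabels) {
    if (reported == null) {
      // Labels not included in this request; nothing to update.
      return new LabelCheckResult(true, "");
    }
    if (knownClusterLabels.containsAll(reported)) {
      return new LabelCheckResult(true, "");
    }
    // Invalid labels: do not shut the node down, just report it back so the
    // NM can log it, and keep the previously accepted labels in effect.
    return new LabelCheckResult(false,
        "Node labels " + reported + " contain labels unknown to the RM; "
            + "labels were not updated for this node.");
  }

  public static void main(String[] args) {
    System.out.println(checkReportedLabels(
        Collections.singleton("GPU"), Collections.singleton("SSD")).diagnostics);
  }
}
{code}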
[jira] [Commented] (YARN-3336) FileSystem memory leak in DelegationTokenRenewer
[ https://issues.apache.org/jira/browse/YARN-3336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357900#comment-14357900 ] zhihai xu commented on YARN-3336: - I uploaded a patch for this issue. The fix is to call FileSystem.get(getConfig()) outside of the proxyUser.doAs, so FileSystem.get(getConfig()) runs in the current RM user context and returns a FileSystem from FileSystem.CACHE instead of creating a new FileSystem. Since the patch is straightforward and a very small change, I think we don't need a test case for this patch. FileSystem memory leak in DelegationTokenRenewer Key: YARN-3336 URL: https://issues.apache.org/jira/browse/YARN-3336 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: zhihai xu Assignee: zhihai xu Priority: Critical Attachments: YARN-3336.000.patch FileSystem memory leak in DelegationTokenRenewer. Every time DelegationTokenRenewer#obtainSystemTokensForUser is called, a new FileSystem entry is added to FileSystem#CACHE and is never garbage collected. This is the implementation of obtainSystemTokensForUser:
{code}
protected Token<?>[] obtainSystemTokensForUser(String user,
    final Credentials credentials) throws IOException, InterruptedException {
  // Get new hdfs tokens on behalf of this user
  UserGroupInformation proxyUser =
      UserGroupInformation.createProxyUser(user,
          UserGroupInformation.getLoginUser());
  Token<?>[] newTokens = proxyUser.doAs(
      new PrivilegedExceptionAction<Token<?>[]>() {
        @Override
        public Token<?>[] run() throws Exception {
          return FileSystem.get(getConfig()).addDelegationTokens(
              UserGroupInformation.getLoginUser().getUserName(), credentials);
        }
      });
  return newTokens;
}
{code}
The memory leak happens when FileSystem.get(getConfig()) is called with a new proxy user, because createProxyUser always creates a new Subject:
{code}
public static UserGroupInformation createProxyUser(String user,
    UserGroupInformation realUser) {
  if (user == null || user.isEmpty()) {
    throw new IllegalArgumentException("Null user");
  }
  if (realUser == null) {
    throw new IllegalArgumentException("Null real user");
  }
  Subject subject = new Subject();
  Set<Principal> principals = subject.getPrincipals();
  principals.add(new User(user));
  principals.add(new RealUser(realUser));
  UserGroupInformation result = new UserGroupInformation(subject);
  result.setAuthenticationMethod(AuthenticationMethod.PROXY);
  return result;
}
{code}
FileSystem#Cache#Key.equals compares the ugi:
{code}
Key(URI uri, Configuration conf, long unique) throws IOException {
  scheme = uri.getScheme() == null ? "" : uri.getScheme().toLowerCase();
  authority = uri.getAuthority() == null ? "" : uri.getAuthority().toLowerCase();
  this.unique = unique;
  this.ugi = UserGroupInformation.getCurrentUser();
}

public boolean equals(Object obj) {
  if (obj == this) {
    return true;
  }
  if (obj != null && obj instanceof Key) {
    Key that = (Key) obj;
    return isEqual(this.scheme, that.scheme)
        && isEqual(this.authority, that.authority)
        && isEqual(this.ugi, that.ugi)
        && (this.unique == that.unique);
  }
  return false;
}
{code}
UserGroupInformation.equals compares the subject by reference:
{code}
public boolean equals(Object o) {
  if (o == this) {
    return true;
  } else if (o == null || getClass() != o.getClass()) {
    return false;
  } else {
    return subject == ((UserGroupInformation) o).subject;
  }
}
{code}
So in this case, every time createProxyUser and FileSystem.get(getConfig()) are called, a new FileSystem is created and a new entry is added to FileSystem.CACHE. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
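Based on the description of the fix in the comment above, the revised method would look roughly like the following. This is only a sketch derived from that description, not the attached YARN-3336.000.patch.
{code}
protected Token<?>[] obtainSystemTokensForUser(String user,
    final Credentials credentials) throws IOException, InterruptedException {
  // Get new hdfs tokens on behalf of this user
  UserGroupInformation proxyUser =
      UserGroupInformation.createProxyUser(user,
          UserGroupInformation.getLoginUser());
  // Look the FileSystem up in the RM's own user context, so the instance
  // cached under the login UGI is reused instead of a new entry being added
  // to FileSystem.CACHE for every proxy user.
  final FileSystem fs = FileSystem.get(getConfig());
  Token<?>[] newTokens = proxyUser.doAs(
      new PrivilegedExceptionAction<Token<?>[]>() {
        @Override
        public Token<?>[] run() throws Exception {
          return fs.addDelegationTokens(
              UserGroupInformation.getLoginUser().getUserName(), credentials);
        }
      });
  return newTokens;
}
{code}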
[jira] [Updated] (YARN-2828) Enable auto refresh of web pages (using http parameter)
[ https://issues.apache.org/jira/browse/YARN-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vijay Bhat updated YARN-2828: - Attachment: HADOOP-9329.004.patch Enable auto refresh of web pages (using http parameter) --- Key: YARN-2828 URL: https://issues.apache.org/jira/browse/YARN-2828 Project: Hadoop YARN Issue Type: Improvement Reporter: Tim Robertson Assignee: Vijay Bhat Priority: Minor Attachments: HADOOP-9329.004.patch, YARN-2828.001.patch, YARN-2828.002.patch, YARN-2828.003.patch The MR1 Job Tracker had a useful HTTP parameter, e.g. refresh=3, that could be appended to URLs and enabled a periodic page reload. This was very useful when developing mapreduce jobs, especially to watch counters changing. This is lost in the YARN interface. It could be implemented as a page element (e.g. a drop-down or so), but I'd recommend that the page not be made more cluttered, and simply bring back the optional refresh HTTP param. It worked really nicely. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3154) Should not upload partial logs for MR jobs or other 'short-running' applications
[ https://issues.apache.org/jira/browse/YARN-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357706#comment-14357706 ] Hadoop QA commented on YARN-3154: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12704007/YARN-3154.4.patch against trunk revision 344d7cb. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestAllocationFileLoaderService org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerNodeLabelUpdate Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6924//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6924//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6924//console This message is automatically generated. Should not upload partial logs for MR jobs or other short-running' applications - Key: YARN-3154 URL: https://issues.apache.org/jira/browse/YARN-3154 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, resourcemanager Reporter: Xuan Gong Assignee: Xuan Gong Priority: Blocker Attachments: YARN-3154.1.patch, YARN-3154.2.patch, YARN-3154.3.patch, YARN-3154.4.patch Currently, if we are running a MR job, and we do not set the log interval properly, we will have their partial logs uploaded and then removed from the local filesystem which is not right. We only upload the partial logs for LRS applications. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1334) YARN should give more info on errors when running failed distributed shell command
[ https://issues.apache.org/jira/browse/YARN-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357802#comment-14357802 ] Hadoop QA commented on YARN-1334: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12609555/YARN-1334.1.patch against trunk revision 7a346bc. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6931//console This message is automatically generated. YARN should give more info on errors when running failed distributed shell command -- Key: YARN-1334 URL: https://issues.apache.org/jira/browse/YARN-1334 Project: Hadoop YARN Issue Type: Improvement Components: applications/distributed-shell Affects Versions: 2.3.0 Reporter: Tassapol Athiapinya Assignee: Xuan Gong Attachments: YARN-1334.1.patch Run incorrect command such as: /usr/bin/yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar distributedshell jar -shell_command ./test1.sh -shell_script ./ would show shell exit code exception with no useful message. It should print out sysout/syserr of containers/AM of why it is failing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3337) Provide YARN chaos monkey
Steve Loughran created YARN-3337: Summary: Provide YARN chaos monkey Key: YARN-3337 URL: https://issues.apache.org/jira/browse/YARN-3337 Project: Hadoop YARN Issue Type: New Feature Components: test Affects Versions: 2.7.0 Reporter: Steve Loughran To test failure resilience today you either need custom scripts or have to implement Chaos Monkey-like logic in your application (SLIDER-202). Killing AMs and containers on a schedule and with some probability is the core activity here, one that could be handled by a CLI app/client lib that does this:
# entry point to have a startup delay before acting
# frequency of chaos wakeup/polling
# probability of AM failure generation (0-100)
# probability of non-AM container kill
# future: other operations
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
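A bare-bones sketch of such a loop is shown below: start after a delay, wake up at a fixed frequency, and kill the AM or a non-AM container with the configured probabilities. This is only an illustration of the proposal, not an existing tool; killAm() and killRandomContainer() are left as stubs, and a real implementation would go through the YARN client APIs.
{code}
import java.util.Random;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class YarnChaosMonkeySketch {
  private final Random random = new Random();

  void start(long startupDelaySec, long intervalSec,
      int amKillPercent, int containerKillPercent) {
    ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();
    scheduler.scheduleAtFixedRate(() -> {
      if (random.nextInt(100) < amKillPercent) {
        killAm();
      } else if (random.nextInt(100) < containerKillPercent) {
        killRandomContainer();
      }
    }, startupDelaySec, intervalSec, TimeUnit.SECONDS);
  }

  void killAm() { /* e.g. signal/stop the AM container */ }
  void killRandomContainer() { /* e.g. stop one non-AM container */ }

  public static void main(String[] args) {
    // Wait 60s, then every 30s kill the AM with 10% probability or a
    // non-AM container with 25% probability.
    new YarnChaosMonkeySketch().start(60, 30, 10, 25);
  }
}
{code}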
[jira] [Commented] (YARN-2890) MiniMRYarnCluster should turn on timeline service if configured to do so
[ https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357922#comment-14357922 ] Hadoop QA commented on YARN-2890: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12704036/YARN-2890.patch against trunk revision 7a346bc. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests: org.apache.hadoop.mapred.TestSequenceFileAsBinaryOutputFormat org.apache.hadoop.mapred.TestResourceMgrDelegate org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShellWithNodeLabels The following test timeouts occurred in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests: org.apache.hadoop.mapred.TestMRIntermediateDataEncryption org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6929//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6929//console This message is automatically generated. MiniMRYarnCluster should turn on timeline service if configured to do so Key: YARN-2890 URL: https://issues.apache.org/jira/browse/YARN-2890 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: YARN-2890.patch, YARN-2890.patch, YARN-2890.patch, YARN-2890.patch, YARN-2890.patch Currently the MiniMRYarnCluster does not consider the configuration value for enabling timeline service before starting. The MiniYarnCluster should only start the timeline service if it is configured to do so. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
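The check being asked for in this issue amounts to consulting the timeline-service flag before starting the service in the mini cluster. The sketch below only illustrates that idea using the public YarnConfiguration keys; the wrapper class is hypothetical, and the actual change belongs in MiniYARNCluster/MiniMRYarnCluster rather than in a standalone class.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class TimelineServiceToggleSketch {

  // Only start the timeline service when the configuration enables it.
  static boolean shouldStartTimelineService(Configuration conf) {
    return conf.getBoolean(YarnConfiguration.TIMELINE_SERVICE_ENABLED,
        YarnConfiguration.DEFAULT_TIMELINE_SERVICE_ENABLED);
  }

  public static void main(String[] args) {
    Configuration conf = new YarnConfiguration();
    conf.setBoolean(YarnConfiguration.TIMELINE_SERVICE_ENABLED, true);
    if (shouldStartTimelineService(conf)) {
      System.out.println("would start the timeline service in the mini cluster");
    }
  }
}
{code}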
[jira] [Updated] (YARN-2854) The document about timeline service and generic service needs to be updated
[ https://issues.apache.org/jira/browse/YARN-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-2854: Attachment: YARN-2854.20150311-1.patch The document about timeline service and generic service needs to be updated --- Key: YARN-2854 URL: https://issues.apache.org/jira/browse/YARN-2854 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Reporter: Zhijie Shen Assignee: Naganarasimha G R Priority: Critical Attachments: TimelineServer.html, YARN-2854.20141120-1.patch, YARN-2854.20150128.1.patch, YARN-2854.20150304.1.patch, YARN-2854.20150311-1.patch, timeline_structure.jpg -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2854) The document about timeline service and generic service needs to be updated
[ https://issues.apache.org/jira/browse/YARN-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356462#comment-14356462 ] Hadoop QA commented on YARN-2854: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703860/YARN-2854.20150311-1.patch against trunk revision 30c428a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6914//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6914//console This message is automatically generated. The document about timeline service and generic service needs to be updated --- Key: YARN-2854 URL: https://issues.apache.org/jira/browse/YARN-2854 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Reporter: Zhijie Shen Assignee: Naganarasimha G R Priority: Critical Attachments: TimelineServer.html, YARN-2854.20141120-1.patch, YARN-2854.20150128.1.patch, YARN-2854.20150304.1.patch, YARN-2854.20150311-1.patch, timeline_structure.jpg -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1876) Document the REST APIs of timeline and generic history services
[ https://issues.apache.org/jira/browse/YARN-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356479#comment-14356479 ] Gururaj Shetty commented on YARN-1876: -- Hi [~zjshen], I can convert this content to Markdown and append it to YARN-2854. Kindly let me know if I can handle it. Document the REST APIs of timeline and generic history services --- Key: YARN-1876 URL: https://issues.apache.org/jira/browse/YARN-1876 Project: Hadoop YARN Issue Type: Improvement Reporter: Zhijie Shen Assignee: Zhijie Shen Labels: documentaion Attachments: YARN-1876.1.patch, YARN-1876.2.patch, YARN-1876.3.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3295) Fix documentation nits found in markdown conversion
[ https://issues.apache.org/jira/browse/YARN-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356373#comment-14356373 ] Hudson commented on YARN-3295: -- FAILURE: Integrated in Hadoop-trunk-Commit #7303 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7303/]) YARN-3295. Fix documentation nits found in markdown conversion. Contributed by Masatake Iwasaki. (ozawa: rev 30c428a858c179645d6dc82b7027f6b7e871b439) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerRestart.md * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnCommands.md Fix documentation nits found in markdown conversion --- Key: YARN-3295 URL: https://issues.apache.org/jira/browse/YARN-3295 Project: Hadoop YARN Issue Type: Bug Components: documentation Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Trivial Fix For: 2.7.0 Attachments: YARN-3295.001.patch * In ResourceManagerRestart page - Inside the Notes, the _e{epoch}_ , was highlighted before but not now. * yarn container command {noformat} list ApplicationId (should be Application Attempt ID ?) Lists containers for the application attempt. {noformat} * yarn application attempt command {noformat} list ApplicationId Lists applications attempts from the RM (should be Lists applications attempts for the given application) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3248) Display count of nodes blacklisted by apps in the web UI
[ https://issues.apache.org/jira/browse/YARN-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-3248: Attachment: apache-yarn-3248.3.patch Uploaded a new patch which applies after YARN-1809 Display count of nodes blacklisted by apps in the web UI Key: YARN-3248 URL: https://issues.apache.org/jira/browse/YARN-3248 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, resourcemanager Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: All applications.png, App page.png, Screenshot.jpg, apache-yarn-3248.0.patch, apache-yarn-3248.1.patch, apache-yarn-3248.2.patch, apache-yarn-3248.3.patch It would be really useful when debugging app performance and failure issues to get a count of the nodes blacklisted by individual apps displayed in the web UI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (YARN-3329) There's no way to rebuild containers Managed by NMClientAsync If AM restart
[ https://issues.apache.org/jira/browse/YARN-3329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K reopened YARN-3329: - There's no way to rebuild containers Managed by NMClientAsync If AM restart --- Key: YARN-3329 URL: https://issues.apache.org/jira/browse/YARN-3329 Project: Hadoop YARN Issue Type: Bug Components: api, applications, client Affects Versions: 2.6.0 Reporter: sandflee If work-preserving restart is enabled and the AM restarts, the AM couldn't stop containers or query the status of containers launched by the previous AM, because there's no corresponding container in NMClientAsync.containers. And there's no way to rebuild NMClientAsync.containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-3329) There's no way to rebuild containers Managed by NMClientAsync If AM restart
[ https://issues.apache.org/jira/browse/YARN-3329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K resolved YARN-3329. - Resolution: Duplicate Release Note: (was: the same to YARN-3328, sorry for creating twice) There's no way to rebuild containers Managed by NMClientAsync If AM restart --- Key: YARN-3329 URL: https://issues.apache.org/jira/browse/YARN-3329 Project: Hadoop YARN Issue Type: Bug Components: api, applications, client Affects Versions: 2.6.0 Reporter: sandflee If work preserving is enabled and the AM restarts, the AM can't stop containers or query the status of containers launched by the previous AM, because there's no corresponding container in NMClientAsync.containers. And there's no way to rebuild NMClientAsync.containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3187) Documentation of Capacity Scheduler Queue mapping based on user or group
[ https://issues.apache.org/jira/browse/YARN-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356394#comment-14356394 ] Gururaj Shetty commented on YARN-3187: -- Thanks [~jianhe] and [~Naganarasimha Garla] for committing and reviewing the patch. Documentation of Capacity Scheduler Queue mapping based on user or group Key: YARN-3187 URL: https://issues.apache.org/jira/browse/YARN-3187 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, documentation Affects Versions: 2.6.0 Reporter: Naganarasimha G R Assignee: Gururaj Shetty Labels: documentation Fix For: 2.7.0 Attachments: YARN-3187.1.patch, YARN-3187.2.patch, YARN-3187.3.patch, YARN-3187.4.patch YARN-2411 exposes a very useful feature {{support simple user and group mappings to queues}} but its not captured in the documentation. So in this jira we plan to document this feature -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3328) There's no way to rebuild containers Managed by NMClientAsync If AM restart
[ https://issues.apache.org/jira/browse/YARN-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356384#comment-14356384 ] sandflee commented on YARN-3328: Is it really necessary to keep container info in NMClientAsync? YARN-3327 is also caused by this. There's no way to rebuild containers Managed by NMClientAsync If AM restart --- Key: YARN-3328 URL: https://issues.apache.org/jira/browse/YARN-3328 Project: Hadoop YARN Issue Type: Bug Components: api, applications, client Affects Versions: 2.6.0 Reporter: sandflee If work preserving is enabled and the AM restarts, the AM can't stop containers launched by the previous AM, because there's no corresponding container in NMClientAsync.containers. There's no way to rebuild NMClientAsync.containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3248) Display count of nodes blacklisted by apps in the web UI
[ https://issues.apache.org/jira/browse/YARN-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356668#comment-14356668 ] Hadoop QA commented on YARN-3248: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703863/apache-yarn-3248.3.patch against trunk revision 30c428a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 6 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.security.TestRMDelegationTokens Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6915//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6915//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6915//console This message is automatically generated. Display count of nodes blacklisted by apps in the web UI Key: YARN-3248 URL: https://issues.apache.org/jira/browse/YARN-3248 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, resourcemanager Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: All applications.png, App page.png, Screenshot.jpg, apache-yarn-3248.0.patch, apache-yarn-3248.1.patch, apache-yarn-3248.2.patch, apache-yarn-3248.3.patch It would be really useful when debugging app performance and failure issues to get a count of the nodes blacklisted by individual apps displayed in the web UI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3324) TestDockerContainerExecutor should clean test docker image from local repository after test is done
[ https://issues.apache.org/jira/browse/YARN-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Naik updated YARN-3324: Target Version/s: 2.6.0 (was: trunk-win, 2.6.0) Affects Version/s: (was: trunk-win) TestDockerContainerExecutor should clean test docker image from local repository after test is done --- Key: YARN-3324 URL: https://issues.apache.org/jira/browse/YARN-3324 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.6.0 Reporter: Chen He Attachments: YARN-3324-branch-2.6.0.patch, YARN-3324-trunk.patch Current TestDockerContainerExecutor only cleans the temp directory in local file system but leaves the test docker image in local docker repository. It should be cleaned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3324) TestDockerContainerExecutor should clean test docker image from local repository after test is done
[ https://issues.apache.org/jira/browse/YARN-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356709#comment-14356709 ] Hadoop QA commented on YARN-3324: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703871/YARN-3324-trunk.patch against trunk revision 30c428a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: org.apache.hadoop.yarn.server.nodemanager.TestDockerContainerExecutor org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6916//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6916//console This message is automatically generated. TestDockerContainerExecutor should clean test docker image from local repository after test is done --- Key: YARN-3324 URL: https://issues.apache.org/jira/browse/YARN-3324 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.6.0 Reporter: Chen He Attachments: YARN-3324-branch-2.6.0.patch, YARN-3324-trunk.patch Current TestDockerContainerExecutor only cleans the temp directory in local file system but leaves the test docker image in local docker repository. It should be cleaned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3324) TestDockerContainerExecutor should clean test docker image from local repository after test is done
[ https://issues.apache.org/jira/browse/YARN-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Naik updated YARN-3324: Attachment: YARN-3324-trunk.patch YARN-3324-branch-2.6.0.patch TestDockerContainerExecutor should clean test docker image from local repository after test is done --- Key: YARN-3324 URL: https://issues.apache.org/jira/browse/YARN-3324 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: trunk-win, 2.6.0 Reporter: Chen He Attachments: YARN-3324-branch-2.6.0.patch, YARN-3324-trunk.patch Current TestDockerContainerExecutor only cleans the temp directory in local file system but leaves the test docker image in local docker repository. It should be cleaned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3248) Display count of nodes blacklisted by apps in the web UI
[ https://issues.apache.org/jira/browse/YARN-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356857#comment-14356857 ] Varun Vasudev commented on YARN-3248: - Thanks for the review [~ozawa]. Can you help me out - when you say add a test about ApplicationReportPBImpl, can you let me know where to add the tests? I've modified/added tests in ClientRMService but I'm not sure where tests for *PBImpl go. Thanks! Display count of nodes blacklisted by apps in the web UI Key: YARN-3248 URL: https://issues.apache.org/jira/browse/YARN-3248 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, resourcemanager Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: All applications.png, App page.png, Screenshot.jpg, apache-yarn-3248.0.patch, apache-yarn-3248.1.patch, apache-yarn-3248.2.patch, apache-yarn-3248.3.patch It would be really useful when debugging app performance and failure issues to get a count of the nodes blacklisted by individual apps displayed in the web UI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3248) Display count of nodes blacklisted by apps in the web UI
[ https://issues.apache.org/jira/browse/YARN-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356858#comment-14356858 ] Varun Vasudev commented on YARN-3248: - Sorry that should have been TestClientRMService, and not ClientRMService. Display count of nodes blacklisted by apps in the web UI Key: YARN-3248 URL: https://issues.apache.org/jira/browse/YARN-3248 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, resourcemanager Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: All applications.png, App page.png, Screenshot.jpg, apache-yarn-3248.0.patch, apache-yarn-3248.1.patch, apache-yarn-3248.2.patch, apache-yarn-3248.3.patch It would be really useful when debugging app performance and failure issues to get a count of the nodes blacklisted by individual apps displayed in the web UI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3248) Display count of nodes blacklisted by apps in the web UI
[ https://issues.apache.org/jira/browse/YARN-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356831#comment-14356831 ] Tsuyoshi Ozawa commented on YARN-3248: -- [~vvasudev], thank you for updating. Could you add a test for ApplicationReportPBImpl, since we sometimes run into NPEs from the *PBImpl classes? Display count of nodes blacklisted by apps in the web UI Key: YARN-3248 URL: https://issues.apache.org/jira/browse/YARN-3248 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, resourcemanager Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: All applications.png, App page.png, Screenshot.jpg, apache-yarn-3248.0.patch, apache-yarn-3248.1.patch, apache-yarn-3248.2.patch, apache-yarn-3248.3.patch It would be really useful when debugging app performance and failure issues to get a count of the nodes blacklisted by individual apps displayed in the web UI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
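A minimal sketch of the kind of *PBImpl test being discussed in the YARN-3248 comments above, assuming the standard no-arg ApplicationReportPBImpl constructor and the existing ApplicationReport getters/setters; the fields exercised here (name) are illustrative only, not the new blacklisted-node count from the patch.
{code}
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertNotNull;

import org.apache.hadoop.yarn.api.records.impl.pb.ApplicationReportPBImpl;
import org.junit.Test;

public class TestApplicationReportPBImpl {

  @Test
  public void testGettersOnEmptyReportDoNotThrow() {
    // A report backed by an empty proto should return null/defaults, not NPE.
    ApplicationReportPBImpl report = new ApplicationReportPBImpl();
    assertNotNull(report.getProto());
    report.getApplicationId();
    report.getName();
  }

  @Test
  public void testSetGetRoundTrip() {
    ApplicationReportPBImpl report = new ApplicationReportPBImpl();
    report.setName("test-app");
    assertEquals("test-app", report.getName());
  }
}
{code}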
[jira] [Commented] (YARN-2280) Resource manager web service fields are not accessible
[ https://issues.apache.org/jira/browse/YARN-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356758#comment-14356758 ] Hudson commented on YARN-2280: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #129 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/129/]) YARN-2280. Resource manager web service fields are not accessible (Krisztian Horvath via aw) (aw: rev a5cf985bf501fd032124d121dcae80538db9e380) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/SchedulerTypeInfo.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/NodesInfo.java * hadoop-yarn-project/CHANGES.txt Resource manager web service fields are not accessible -- Key: YARN-2280 URL: https://issues.apache.org/jira/browse/YARN-2280 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.4.0, 2.4.1 Reporter: Krisztian Horvath Assignee: Krisztian Horvath Priority: Trivial Fix For: 3.0.0 Attachments: YARN-2280.patch Using the resource manager's rest api (org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices) some rest call returns a class where the fields after the unmarshal cannot be accessible. For example SchedulerTypeInfo - schedulerInfo. Using the same classes on client side these fields only accessible via reflection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3295) Fix documentation nits found in markdown conversion
[ https://issues.apache.org/jira/browse/YARN-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356763#comment-14356763 ] Hudson commented on YARN-3295: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #129 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/129/]) YARN-3295. Fix documentation nits found in markdown conversion. Contributed by Masatake Iwasaki. (ozawa: rev 30c428a858c179645d6dc82b7027f6b7e871b439) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerRestart.md * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnCommands.md Fix documentation nits found in markdown conversion --- Key: YARN-3295 URL: https://issues.apache.org/jira/browse/YARN-3295 Project: Hadoop YARN Issue Type: Bug Components: documentation Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Trivial Fix For: 2.7.0 Attachments: YARN-3295.001.patch * In ResourceManagerRestart page - Inside the Notes, the _e{epoch}_ , was highlighted before but not now. * yarn container command {noformat} list ApplicationId (should be Application Attempt ID ?) Lists containers for the application attempt. {noformat} * yarn application attempt command {noformat} list ApplicationId Lists applications attempts from the RM (should be Lists applications attempts for the given application) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3187) Documentation of Capacity Scheduler Queue mapping based on user or group
[ https://issues.apache.org/jira/browse/YARN-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356761#comment-14356761 ] Hudson commented on YARN-3187: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #129 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/129/]) YARN-3187. Documentation of Capacity Scheduler Queue mapping based on user or group. Contributed by Gururaj Shetty (jianhe: rev a380643d2044a4974e379965f65066df2055d003) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md * hadoop-yarn-project/CHANGES.txt Documentation of Capacity Scheduler Queue mapping based on user or group Key: YARN-3187 URL: https://issues.apache.org/jira/browse/YARN-3187 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, documentation Affects Versions: 2.6.0 Reporter: Naganarasimha G R Assignee: Gururaj Shetty Labels: documentation Fix For: 2.7.0 Attachments: YARN-3187.1.patch, YARN-3187.2.patch, YARN-3187.3.patch, YARN-3187.4.patch YARN-2411 exposes a very useful feature {{support simple user and group mappings to queues}} but its not captured in the documentation. So in this jira we plan to document this feature -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2280) Resource manager web service fields are not accessible
[ https://issues.apache.org/jira/browse/YARN-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356766#comment-14356766 ] Hudson commented on YARN-2280: -- FAILURE: Integrated in Hadoop-Yarn-trunk #863 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/863/]) YARN-2280. Resource manager web service fields are not accessible (Krisztian Horvath via aw) (aw: rev a5cf985bf501fd032124d121dcae80538db9e380) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/SchedulerTypeInfo.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/NodesInfo.java Resource manager web service fields are not accessible -- Key: YARN-2280 URL: https://issues.apache.org/jira/browse/YARN-2280 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.4.0, 2.4.1 Reporter: Krisztian Horvath Assignee: Krisztian Horvath Priority: Trivial Fix For: 3.0.0 Attachments: YARN-2280.patch Using the resource manager's rest api (org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices) some rest call returns a class where the fields after the unmarshal cannot be accessible. For example SchedulerTypeInfo - schedulerInfo. Using the same classes on client side these fields only accessible via reflection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3225) New parameter or CLI for decommissioning node gracefully in RMAdmin CLI
[ https://issues.apache.org/jira/browse/YARN-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356774#comment-14356774 ] Junping Du commented on YARN-3225: -- bq. Here what would happen to the decommissioning node if the RMAdmin issued refreshNodeGracefully() and gets terminated(exited) before issuing the 'refreshNode forcefully'? This can be done by doing Ctrl+C on the command prompt. The Node will be in decommissioning state forever and becomes unusable for new containers allocation. Per the v3 version of the proposal in the umbrella JIRA, if the CLI gets interrupted it won't keep tracking the timeout to forcefully decommission the remaining nodes. However, nodes in "DECOMMISSIONING" will still get terminated later (after their running apps finish) unless the admin explicitly recommissions those nodes via the CLI. A node in decommissioning state is terminated on one of two trigger events: the timeout, or all applications on that node finishing (to be covered in YARN-3212). We can document what it means if the user hits Ctrl+C on the graceful decommissioning CLI. If an application (like an LRS) never ends in this situation, then the user needs to refresh those nodes forcefully (or gracefully with a timeout, without interrupting). Make sense? New parameter or CLI for decommissioning node gracefully in RMAdmin CLI --- Key: YARN-3225 URL: https://issues.apache.org/jira/browse/YARN-3225 Project: Hadoop YARN Issue Type: Sub-task Reporter: Junping Du Assignee: Devaraj K Attachments: YARN-3225.patch, YARN-914.patch A new CLI (or existing CLI with parameters) should put each node on the decommission list into decommissioning status and track a timeout to terminate the nodes that haven't finished. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
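To make the flow described in the YARN-3225 comment above concrete, here is a rough client-side sketch. It is only an illustration under assumptions: refreshNodesGracefully, refreshNodesForcefully, and getDecommissioningNodes are hypothetical stand-ins for whatever admin API this JIRA finally adds, not existing calls.
{code}
import java.util.List;

/** Hypothetical admin facade; the method names stand in for whatever RMAdmin API this JIRA adds. */
interface RMAdmin {
  void refreshNodesGracefully() throws Exception;      // move excluded nodes to DECOMMISSIONING
  void refreshNodesForcefully() throws Exception;      // terminate nodes still decommissioning
  List<String> getDecommissioningNodes() throws Exception;
}

class GracefulDecommissionCli {
  /** Issue a graceful refresh, poll until nodes drain, then force-refresh on timeout. */
  static void run(RMAdmin admin, long timeoutMs) throws Exception {
    admin.refreshNodesGracefully();
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      if (admin.getDecommissioningNodes().isEmpty()) {
        return;                 // all apps on those nodes have finished
      }
      Thread.sleep(5000);       // interrupting this loop (Ctrl+C) is the case discussed above
    }
    // Timeout reached: forcefully decommission whatever is still draining.
    admin.refreshNodesForcefully();
  }
}
{code}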
[jira] [Commented] (YARN-3187) Documentation of Capacity Scheduler Queue mapping based on user or group
[ https://issues.apache.org/jira/browse/YARN-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356769#comment-14356769 ] Hudson commented on YARN-3187: -- FAILURE: Integrated in Hadoop-Yarn-trunk #863 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/863/]) YARN-3187. Documentation of Capacity Scheduler Queue mapping based on user or group. Contributed by Gururaj Shetty (jianhe: rev a380643d2044a4974e379965f65066df2055d003) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md Documentation of Capacity Scheduler Queue mapping based on user or group Key: YARN-3187 URL: https://issues.apache.org/jira/browse/YARN-3187 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, documentation Affects Versions: 2.6.0 Reporter: Naganarasimha G R Assignee: Gururaj Shetty Labels: documentation Fix For: 2.7.0 Attachments: YARN-3187.1.patch, YARN-3187.2.patch, YARN-3187.3.patch, YARN-3187.4.patch YARN-2411 exposes a very useful feature {{support simple user and group mappings to queues}} but its not captured in the documentation. So in this jira we plan to document this feature -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3295) Fix documentation nits found in markdown conversion
[ https://issues.apache.org/jira/browse/YARN-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356771#comment-14356771 ] Hudson commented on YARN-3295: -- FAILURE: Integrated in Hadoop-Yarn-trunk #863 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/863/]) YARN-3295. Fix documentation nits found in markdown conversion. Contributed by Masatake Iwasaki. (ozawa: rev 30c428a858c179645d6dc82b7027f6b7e871b439) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnCommands.md * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerRestart.md Fix documentation nits found in markdown conversion --- Key: YARN-3295 URL: https://issues.apache.org/jira/browse/YARN-3295 Project: Hadoop YARN Issue Type: Bug Components: documentation Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Trivial Fix For: 2.7.0 Attachments: YARN-3295.001.patch * In ResourceManagerRestart page - Inside the Notes, the _e{epoch}_ , was highlighted before but not now. * yarn container command {noformat} list ApplicationId (should be Application Attempt ID ?) Lists containers for the application attempt. {noformat} * yarn application attempt command {noformat} list ApplicationId Lists applications attempts from the RM (should be Lists applications attempts for the given application) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3334) [Event Producers] NM start to posting some app related metrics in early POC stage of phase 2.
[ https://issues.apache.org/jira/browse/YARN-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14358072#comment-14358072 ] Naganarasimha G R commented on YARN-3334: - Is the scope of this jira different from YARN-3045? If it is the same, I was planning to work on this part once YARN-3039 is up. [Event Producers] NM start to posting some app related metrics in early POC stage of phase 2. - Key: YARN-3334 URL: https://issues.apache.org/jira/browse/YARN-3334 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: YARN-2928 Reporter: Junping Du Assignee: Junping Du -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2890) MiniMRYarnCluster should turn on timeline service if configured to do so
[ https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai updated YARN-2890: Attachment: YARN-2890.patch Attaching the updated patch MiniMRYarnCluster should turn on timeline service if configured to do so Key: YARN-2890 URL: https://issues.apache.org/jira/browse/YARN-2890 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: YARN-2890.patch, YARN-2890.patch, YARN-2890.patch, YARN-2890.patch, YARN-2890.patch Currently the MiniMRYarnCluster does not consider the configuration value for enabling timeline service before starting. The MiniYarnCluster should only start the timeline service if it is configured to do so. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2828) Enable auto refresh of web pages (using http parameter)
[ https://issues.apache.org/jira/browse/YARN-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357716#comment-14357716 ] Hadoop QA commented on YARN-2828: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12704032/HADOOP-9329.004.patch against trunk revision 7a346bc. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6928//console This message is automatically generated. Enable auto refresh of web pages (using http parameter) --- Key: YARN-2828 URL: https://issues.apache.org/jira/browse/YARN-2828 Project: Hadoop YARN Issue Type: Improvement Reporter: Tim Robertson Assignee: Vijay Bhat Priority: Minor Attachments: HADOOP-9329.004.patch, YARN-2828.001.patch, YARN-2828.002.patch, YARN-2828.003.patch The MR1 Job Tracker had a useful HTTP parameter of e.g. refresh=3 that could be appended to URLs which enabled a page reload. This was very useful when developing mapreduce jobs, especially to watch counters changing. This is lost in the the Yarn interface. Could be implemented as a page element (e.g. drop down or so), but I'd recommend that the page not be more cluttered, and simply bring back the optional refresh HTTP param. It worked really nicely. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-1551) Allow user-specified reason for killApplication
[ https://issues.apache.org/jira/browse/YARN-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated YARN-1551: - Target Version/s: 2.8.0 (was: 2.4.0) Allow user-specified reason for killApplication --- Key: YARN-1551 URL: https://issues.apache.org/jira/browse/YARN-1551 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.3.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Attachments: YARN-1551.v01.patch, YARN-1551.v02.patch, YARN-1551.v03.patch, YARN-1551.v04.patch, YARN-1551.v05.patch, YARN-1551.v06.patch, YARN-1551.v06.patch This completes MAPREDUCE-5648 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3337) Provide YARN chaos monkey
[ https://issues.apache.org/jira/browse/YARN-3337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357917#comment-14357917 ] Steve Loughran commented on YARN-3337: -- The Slider chaos monkey is pretty sophisticated: it is configured/deployed from within the AM itself and can trigger both container failure and AM failure. I'm proposing something more minimal here, a CLI tool/class that can look up an app by ID or (user, type, name) and repeatedly sleep, decide whether or not to act, and act. Initial actions: kill the AM container, kill other containers. I don't think the current client API lets me do this, as the tool would need # ability to kill a specific container of an application, ideally forcing the exit code # ability to identify which container the AM is currently running in (so as not to kill it in a worker container kill operation, but do kill it in an AM kill operation) In SLIDER-202 we are doing this in-AM, which has the data operations; this is a visible change to the code which could instead be handled in a re-usable lib/CLI tool for better YARN app testing Provide YARN chaos monkey - Key: YARN-3337 URL: https://issues.apache.org/jira/browse/YARN-3337 Project: Hadoop YARN Issue Type: New Feature Components: test Affects Versions: 2.7.0 Reporter: Steve Loughran To test failure resilience today you either need custom scripts or implement Chaos Monkey-like logic in your application (SLIDER-202). Killing AMs and containers on a schedule/probability is the core activity here, one that could be handled by a CLI app/client lib that does this. # entry point to have a startup delay before acting # frequency of chaos wakeup/polling # probability of AM failure generation (0-100) # probability of non-AM container kill # future: other operations -- This message was sent by Atlassian JIRA (v6.3.4#6332)
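A minimal sketch of the CLI loop proposed in the YARN-3337 comment above, assuming the existing YarnClient lookup calls; the container-kill step is exactly the capability noted as missing from the client API today, so it is modeled as a hypothetical hook rather than a real call.
{code}
import java.util.Random;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.client.api.YarnClient;

public class YarnChaosMonkey {

  /** Hypothetical hook: the current client API does not expose container kills. */
  public interface ContainerKiller {
    void killAmContainer(ApplicationId appId) throws Exception;
    void killWorkerContainer(ApplicationId appId) throws Exception;
  }

  public static void run(ApplicationId appId, long startupDelayMs, long intervalMs,
      int amKillPercent, int workerKillPercent, ContainerKiller killer) throws Exception {
    Random random = new Random();
    YarnClient client = YarnClient.createYarnClient();
    client.init(new Configuration());
    client.start();
    try {
      Thread.sleep(startupDelayMs);                    // startup delay before acting
      while (true) {
        ApplicationReport report = client.getApplicationReport(appId);
        switch (report.getYarnApplicationState()) {
          case FINISHED:
          case FAILED:
          case KILLED:
            return;                                    // app is done; nothing left to break
          default:
            break;
        }
        if (random.nextInt(100) < amKillPercent) {
          killer.killAmContainer(appId);               // AM failure injection
        } else if (random.nextInt(100) < workerKillPercent) {
          killer.killWorkerContainer(appId);           // non-AM container kill
        }
        Thread.sleep(intervalMs);                      // chaos wakeup/polling frequency
      }
    } finally {
      client.stop();
    }
  }
}
{code}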
[jira] [Updated] (YARN-3336) FileSystem memory leak in DelegationTokenRenewer
[ https://issues.apache.org/jira/browse/YARN-3336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated YARN-3336: Description: FileSystem memory leak in DelegationTokenRenewer. Every time DelegationTokenRenewer#obtainSystemTokensForUser is called, a new FileSystem entry will be added to FileSystem#CACHE which will never be garbage collected. This is the implementation of obtainSystemTokensForUser:
{code}
protected Token<?>[] obtainSystemTokensForUser(String user, final Credentials credentials)
    throws IOException, InterruptedException {
  // Get new hdfs tokens on behalf of this user
  UserGroupInformation proxyUser =
      UserGroupInformation.createProxyUser(user, UserGroupInformation.getLoginUser());
  Token<?>[] newTokens = proxyUser.doAs(new PrivilegedExceptionAction<Token<?>[]>() {
    @Override
    public Token<?>[] run() throws Exception {
      return FileSystem.get(getConfig()).addDelegationTokens(
          UserGroupInformation.getLoginUser().getUserName(), credentials);
    }
  });
  return newTokens;
}
{code}
The memory leak happens when FileSystem.get(getConfig()) is called with a new proxy user, because createProxyUser always creates a new Subject. The calling sequence is FileSystem.get(getConfig()) -> FileSystem.get(getDefaultUri(conf), conf) -> FileSystem.CACHE.get(uri, conf) -> FileSystem.CACHE.getInternal(uri, conf, key) -> FileSystem.CACHE.map.get(key) -> createFileSystem(uri, conf)
{code}
public static UserGroupInformation createProxyUser(String user, UserGroupInformation realUser) {
  if (user == null || user.isEmpty()) {
    throw new IllegalArgumentException("Null user");
  }
  if (realUser == null) {
    throw new IllegalArgumentException("Null real user");
  }
  Subject subject = new Subject();
  Set<Principal> principals = subject.getPrincipals();
  principals.add(new User(user));
  principals.add(new RealUser(realUser));
  UserGroupInformation result = new UserGroupInformation(subject);
  result.setAuthenticationMethod(AuthenticationMethod.PROXY);
  return result;
}
{code}
FileSystem#Cache#Key.equals will compare the ugi:
{code}
Key(URI uri, Configuration conf, long unique) throws IOException {
  scheme = uri.getScheme() == null ? "" : uri.getScheme().toLowerCase();
  authority = uri.getAuthority() == null ? "" : uri.getAuthority().toLowerCase();
  this.unique = unique;
  this.ugi = UserGroupInformation.getCurrentUser();
}

public boolean equals(Object obj) {
  if (obj == this) {
    return true;
  }
  if (obj != null && obj instanceof Key) {
    Key that = (Key) obj;
    return isEqual(this.scheme, that.scheme)
        && isEqual(this.authority, that.authority)
        && isEqual(this.ugi, that.ugi)
        && (this.unique == that.unique);
  }
  return false;
}
{code}
UserGroupInformation.equals will compare the subject by reference:
{code}
public boolean equals(Object o) {
  if (o == this) {
    return true;
  } else if (o == null || getClass() != o.getClass()) {
    return false;
  } else {
    return subject == ((UserGroupInformation) o).subject;
  }
}
{code}
So in this case, every time createProxyUser and FileSystem.get(getConfig()) are called, a new FileSystem will be created and a new entry will be added to FileSystem.CACHE. was: FileSystem memory leak in DelegationTokenRenewer. Every time DelegationTokenRenewer#obtainSystemTokensForUser is called, a new FileSystem entry will be added to FileSystem#CACHE which will never be garbage collected.
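One possible mitigation for the leak described above, sketched only to make the issue concrete and not necessarily the fix this JIRA will adopt: bypass FileSystem.CACHE with FileSystem.newInstance and close it when done, and/or drop any entries cached under the throwaway proxy UGI with FileSystem.closeAllForUGI once the tokens have been obtained.
{code}
import java.io.IOException;
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.token.Token;

public class TokenFetchSketch {

  static Token<?>[] obtainSystemTokensForUser(String user, final Credentials credentials,
      final Configuration conf) throws IOException, InterruptedException {
    UserGroupInformation proxyUser =
        UserGroupInformation.createProxyUser(user, UserGroupInformation.getLoginUser());
    try {
      return proxyUser.doAs(new PrivilegedExceptionAction<Token<?>[]>() {
        @Override
        public Token<?>[] run() throws Exception {
          // newInstance bypasses FileSystem.CACHE, so nothing keyed on the
          // throwaway proxy UGI is left behind once the instance is closed.
          FileSystem fs = FileSystem.newInstance(conf);
          try {
            return fs.addDelegationTokens(
                UserGroupInformation.getLoginUser().getUserName(), credentials);
          } finally {
            fs.close();
          }
        }
      });
    } finally {
      // Also release anything that did end up cached under the proxy UGI.
      FileSystem.closeAllForUGI(proxyUser);
    }
  }
}
{code}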
[jira] [Commented] (YARN-2280) Resource manager web service fields are not accessible
[ https://issues.apache.org/jira/browse/YARN-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356943#comment-14356943 ] Hudson commented on YARN-2280: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2061 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2061/]) YARN-2280. Resource manager web service fields are not accessible (Krisztian Horvath via aw) (aw: rev a5cf985bf501fd032124d121dcae80538db9e380) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/SchedulerTypeInfo.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/NodesInfo.java Resource manager web service fields are not accessible -- Key: YARN-2280 URL: https://issues.apache.org/jira/browse/YARN-2280 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.4.0, 2.4.1 Reporter: Krisztian Horvath Assignee: Krisztian Horvath Priority: Trivial Fix For: 3.0.0 Attachments: YARN-2280.patch Using the resource manager's rest api (org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices) some rest call returns a class where the fields after the unmarshal cannot be accessible. For example SchedulerTypeInfo - schedulerInfo. Using the same classes on client side these fields only accessible via reflection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3295) Fix documentation nits found in markdown conversion
[ https://issues.apache.org/jira/browse/YARN-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356948#comment-14356948 ] Hudson commented on YARN-3295: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2061 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2061/]) YARN-3295. Fix documentation nits found in markdown conversion. Contributed by Masatake Iwasaki. (ozawa: rev 30c428a858c179645d6dc82b7027f6b7e871b439) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnCommands.md * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerRestart.md Fix documentation nits found in markdown conversion --- Key: YARN-3295 URL: https://issues.apache.org/jira/browse/YARN-3295 Project: Hadoop YARN Issue Type: Bug Components: documentation Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Trivial Fix For: 2.7.0 Attachments: YARN-3295.001.patch * In ResourceManagerRestart page - Inside the Notes, the _e{epoch}_ , was highlighted before but not now. * yarn container command {noformat} list ApplicationId (should be Application Attempt ID ?) Lists containers for the application attempt. {noformat} * yarn application attempt command {noformat} list ApplicationId Lists applications attempts from the RM (should be Lists applications attempts for the given application) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3187) Documentation of Capacity Scheduler Queue mapping based on user or group
[ https://issues.apache.org/jira/browse/YARN-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356946#comment-14356946 ] Hudson commented on YARN-3187: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2061 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2061/]) YARN-3187. Documentation of Capacity Scheduler Queue mapping based on user or group. Contributed by Gururaj Shetty (jianhe: rev a380643d2044a4974e379965f65066df2055d003) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md * hadoop-yarn-project/CHANGES.txt Documentation of Capacity Scheduler Queue mapping based on user or group Key: YARN-3187 URL: https://issues.apache.org/jira/browse/YARN-3187 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, documentation Affects Versions: 2.6.0 Reporter: Naganarasimha G R Assignee: Gururaj Shetty Labels: documentation Fix For: 2.7.0 Attachments: YARN-3187.1.patch, YARN-3187.2.patch, YARN-3187.3.patch, YARN-3187.4.patch YARN-2411 exposes a very useful feature {{support simple user and group mappings to queues}} but its not captured in the documentation. So in this jira we plan to document this feature -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3187) Documentation of Capacity Scheduler Queue mapping based on user or group
[ https://issues.apache.org/jira/browse/YARN-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356966#comment-14356966 ] Hudson commented on YARN-3187: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #120 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/120/]) YARN-3187. Documentation of Capacity Scheduler Queue mapping based on user or group. Contributed by Gururaj Shetty (jianhe: rev a380643d2044a4974e379965f65066df2055d003) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md Documentation of Capacity Scheduler Queue mapping based on user or group Key: YARN-3187 URL: https://issues.apache.org/jira/browse/YARN-3187 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, documentation Affects Versions: 2.6.0 Reporter: Naganarasimha G R Assignee: Gururaj Shetty Labels: documentation Fix For: 2.7.0 Attachments: YARN-3187.1.patch, YARN-3187.2.patch, YARN-3187.3.patch, YARN-3187.4.patch YARN-2411 exposes a very useful feature {{support simple user and group mappings to queues}} but its not captured in the documentation. So in this jira we plan to document this feature -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3295) Fix documentation nits found in markdown conversion
[ https://issues.apache.org/jira/browse/YARN-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356968#comment-14356968 ] Hudson commented on YARN-3295: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #120 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/120/]) YARN-3295. Fix documentation nits found in markdown conversion. Contributed by Masatake Iwasaki. (ozawa: rev 30c428a858c179645d6dc82b7027f6b7e871b439) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnCommands.md * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerRestart.md Fix documentation nits found in markdown conversion --- Key: YARN-3295 URL: https://issues.apache.org/jira/browse/YARN-3295 Project: Hadoop YARN Issue Type: Bug Components: documentation Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Trivial Fix For: 2.7.0 Attachments: YARN-3295.001.patch * In ResourceManagerRestart page - Inside the Notes, the _e{epoch}_ , was highlighted before but not now. * yarn container command {noformat} list ApplicationId (should be Application Attempt ID ?) Lists containers for the application attempt. {noformat} * yarn application attempt command {noformat} list ApplicationId Lists applications attempts from the RM (should be Lists applications attempts for the given application) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2280) Resource manager web service fields are not accessible
[ https://issues.apache.org/jira/browse/YARN-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356963#comment-14356963 ] Hudson commented on YARN-2280: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #120 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/120/]) YARN-2280. Resource manager web service fields are not accessible (Krisztian Horvath via aw) (aw: rev a5cf985bf501fd032124d121dcae80538db9e380) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/SchedulerTypeInfo.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/NodesInfo.java Resource manager web service fields are not accessible -- Key: YARN-2280 URL: https://issues.apache.org/jira/browse/YARN-2280 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.4.0, 2.4.1 Reporter: Krisztian Horvath Assignee: Krisztian Horvath Priority: Trivial Fix For: 3.0.0 Attachments: YARN-2280.patch Using the resource manager's rest api (org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices) some rest call returns a class where the fields after the unmarshal cannot be accessible. For example SchedulerTypeInfo - schedulerInfo. Using the same classes on client side these fields only accessible via reflection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3332) [Umbrella] Unified Resource Statistics Collection per node
[ https://issues.apache.org/jira/browse/YARN-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356418#comment-14356418 ] Karthik Kambatla commented on YARN-3332: bq. the machine level big picture is fragmented between YARN and HDFS (and HBase etc) What constitutes the machine level big picture? Isn't this just the overall node's resource usage? YARN, at least as of today, doesn't need to know about the usage stats of HDFS or HBase. I have nothing against going the server route, except the additional daemon one might end up having to run. bq. I anyways needed a service to expose an API for both admins/users as well as external systems beyond HDFS too - I can imagine tools being built on top of this. It is not as clear to me. Let us say an admin and a user want usage stats about their YARN containers. The service can only provide the usage stats, while YARN will be able to provide other container metadata. Also, we should consider privacy of usage information. Will auth against this new service be additional overhead? bq. That said, it doesn't need to be service or library. I can think of a library that wires into the exposed API, though I haven't found uses for that yet. Sorry, didn't get that. Can you clarify/ elaborate? [Umbrella] Unified Resource Statistics Collection per node -- Key: YARN-3332 URL: https://issues.apache.org/jira/browse/YARN-3332 Project: Hadoop YARN Issue Type: Improvement Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Attachments: Design - UnifiedResourceStatisticsCollection.pdf Today in YARN, NodeManager collects statistics like per container resource usage and overall physical resources available on the machine. Currently this is used internally in YARN by the NodeManager for only a limited usage: automatically determining the capacity of resources on node and enforcing memory usage to what is reserved per container. This proposal is to extend the existing architecture and collect statistics for usage beyond the existing usecases. Proposal attached in comments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2854) The document about timeline service and generic service needs to be updated
[ https://issues.apache.org/jira/browse/YARN-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356431#comment-14356431 ] Naganarasimha G R commented on YARN-2854: - Hi [~zjshen], thanks for reviewing the patch. I have attached a new patch with your comments addressed. Please review. The document about timeline service and generic service needs to be updated --- Key: YARN-2854 URL: https://issues.apache.org/jira/browse/YARN-2854 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Reporter: Zhijie Shen Assignee: Naganarasimha G R Priority: Critical Attachments: TimelineServer.html, YARN-2854.20141120-1.patch, YARN-2854.20150128.1.patch, YARN-2854.20150304.1.patch, YARN-2854.20150311-1.patch, timeline_structure.jpg -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3338) Exclude jline dependency from YARN
[ https://issues.apache.org/jira/browse/YARN-3338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-3338: -- Attachment: YARN-3338.1.patch Create a patch to fix the issue. Exclude jline dependency from YARN -- Key: YARN-3338 URL: https://issues.apache.org/jira/browse/YARN-3338 Project: Hadoop YARN Issue Type: Bug Components: build Reporter: Zhijie Shen Assignee: Zhijie Shen Priority: Blocker Attachments: YARN-3338.1.patch It was fixed in YARN-2815, but is broken again by YARN-1514. -- This message was sent by Atlassian JIRA (v6.3.4#6332)