[jira] [Updated] (YARN-514) Delayed store operations should not result in RM unavailability for app submission

2013-04-16 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-514:
-

Attachment: YARN-514.6.patch

Thanks @Bikas for your investigation. I've modified the code. The newest patch 
contains the following major updates:

1. FAILED => FAILED transition on RMAppEventType.APP_SAVED and KILLED => KILLED 
transition on RMAppEventType.APP_SAVED are defined. This fixes the problem 
pointed out by @Bikas.

2. In addition, I found a problem with the RMApp state transition in the RM 
restart scenario. The stored application will be recovered and an RMApp 
instance will be created; with the previous patch it would transition to 
NEW_SAVING and be stored again. To fix the problem, "isRecovered" is defined in 
RMAppImpl and is set to true when RMAppImpl#recover is called. Then, on 
receiving RMAppEventType.START, the transition is NEW => NEW_SAVING if the 
RMApp instance is not recovered, and NEW => SUBMITTED otherwise (sketched 
below, after this list).

3. Additional test cases are added in TestRMAppTransitions to cover the 
aforementioned transition rules.

4. TestRMRestart should have caught the problem of saving an already-recovered 
RMApp instance again. However, the test case didn't fail with the previous 
patch because MemoryRMStateStore didn't throw an exception when storing a 
duplicate application/attempt. Therefore, in the newest patch, 
MemoryRMStateStore throws an IOException when the application/attempt has 
already been stored, which is consistent with the behavior of 
FileSystemRMStateStore. With that change, the existing TestRMRestart test case 
can catch the problem of saving the RMApp instance twice.
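
Below is a minimal, illustrative sketch of the recovery-aware START handling 
described in point 2. It uses a simplified stand-in class, not the real 
RMAppImpl state machine; only the state names and the isRecovered flag mirror 
the patch description.

{code}
// Illustrative sketch only: a simplified model of the recovery-aware START
// transition. The enum values follow RMAppState names, but the surrounding
// class is hypothetical and not the actual RMAppImpl.
public class RecoveryAwareStartTransition {

  enum AppState { NEW, NEW_SAVING, SUBMITTED }

  static class App {
    AppState state = AppState.NEW;
    boolean isRecovered = false;   // set to true when recover() is called

    void recover() {
      // A recovered app is already in the state store; skip re-saving it.
      isRecovered = true;
    }

    void handleStartEvent() {
      if (state != AppState.NEW) {
        throw new IllegalStateException("START only valid in NEW, was " + state);
      }
      // NEW => NEW_SAVING for fresh submissions (store the app first),
      // NEW => SUBMITTED for recovered apps (already stored).
      state = isRecovered ? AppState.SUBMITTED : AppState.NEW_SAVING;
    }
  }

  public static void main(String[] args) {
    App fresh = new App();
    fresh.handleStartEvent();
    System.out.println("fresh app: " + fresh.state);          // NEW_SAVING

    App recovered = new App();
    recovered.recover();
    recovered.handleStartEvent();
    System.out.println("recovered app: " + recovered.state);  // SUBMITTED
  }
}
{code}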

> Delayed store operations should not result in RM unavailability for app 
> submission
> --
>
> Key: YARN-514
> URL: https://issues.apache.org/jira/browse/YARN-514
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Bikas Saha
>Assignee: Zhijie Shen
> Attachments: YARN-514.1.patch, YARN-514.2.patch, YARN-514.3.patch, 
> YARN-514.4.patch, YARN-514.5.patch, YARN-514.6.patch
>
>
> Currently, app submission is the only store operation performed synchronously 
> because the app must be stored before the request returns with success. This 
> makes the RM susceptible to blocking all client threads on slow store 
> operations, resulting in RM being perceived as unavailable by clients.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-541) getAllocatedContainers() is not returning all the allocated containers

2013-04-16 Thread Krishna Kishore Bonagiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krishna Kishore Bonagiri updated YARN-541:
--

Attachment: yarn-dsadm-resourcemanager-isredeng.out
yarn-dsadm-nodemanager-isredeng.out
AppMaster.stdout

Hi Hitesh,

  I am attaching the logs for the AM, RM, and NM. I have an application being
run in a loop, which requires 5 containers. The 8th run failed with this
getAllocatedContainers() issue: the Application Master couldn't get all 5
containers it required, and the getAllocatedContainers() method returned only
4. The RM's log says that the 5th container was also allocated, through this
message:

2013-04-16 03:32:54,701 INFO  [ResourceManager Event Processor]
rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(220)) -
container_1366096597608_0008_01_06 Container Transitioned from NEW to
ALLOCATED

In the RM's log, you can see this kind of message for the remaining 4
containers also, i.e. container_1366096597608_0008_01_02 to
container_1366096597608_0008_01_05.

Also, as I said before, this issue is seen only randomly.
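
(For context, a rough sketch of the request/accumulate pattern an AM typically 
follows is below; the Allocator interface is a hypothetical stand-in, not the 
real AMResponse/AMRMProtocol API. The problem reported here is that even 
across repeated heartbeats, one allocated container is sometimes never 
returned unless a fresh request is sent.)

{code}
// Illustrative sketch only: an AM normally keeps heartbeating allocate() and
// accumulates newly allocated containers until its request is satisfied.
// "Allocator" is a hypothetical stand-in for the real allocate protocol.
import java.util.ArrayList;
import java.util.List;

interface Allocator {
  // Returns only the containers newly allocated since the previous call.
  List<String> allocate() throws InterruptedException;
}

public class ContainerAccumulator {
  public static List<String> waitForContainers(Allocator am, int needed,
      long heartbeatMillis) throws InterruptedException {
    List<String> acquired = new ArrayList<String>();
    while (acquired.size() < needed) {
      acquired.addAll(am.allocate());    // may return 0..n containers per call
      if (acquired.size() < needed) {
        Thread.sleep(heartbeatMillis);   // wait for the next heartbeat
      }
    }
    return acquired;
  }
}
{code}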

Thanks,
Kishore






> getAllocatedContainers() is not returning all the allocated containers
> --
>
> Key: YARN-541
> URL: https://issues.apache.org/jira/browse/YARN-541
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.0.3-alpha
> Environment: Redhat Linux 64-bit
>Reporter: Krishna Kishore Bonagiri
> Attachments: AppMaster.stdout, yarn-dsadm-nodemanager-isredeng.out, 
> yarn-dsadm-resourcemanager-isredeng.out
>
>
> I am running an application that was written and working well with 
> hadoop-2.0.0-alpha, but when I run the same application against 2.0.3-alpha, 
> the getAllocatedContainers() method called on AMResponse sometimes does not 
> return all the allocated containers. For example, I request 10 containers and 
> this method sometimes gives me only 9, even though the Resource Manager's log 
> shows that the 10th container was also allocated. It happens only sometimes, 
> randomly, and works fine all other times. If I send one more request for the 
> remaining container to the RM after it failed to give it to me the first time 
> (and before releasing the already acquired ones), it does allocate that 
> container. I am running only one application at a time, but 1000s of them one 
> after another.
> My main worry is that even though the RM's log says all 10 requested 
> containers are allocated, the getAllocatedContainers() method is not 
> returning all of them to me; it returned only 9, surprisingly. I never saw 
> this kind of issue in the previous version, i.e. hadoop-2.0.0-alpha.
> Thanks,
> Kishore
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-514) Delayed store operations should not result in RM unavailability for app submission

2013-04-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632655#comment-13632655
 ] 

Hadoop QA commented on YARN-514:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12578893/YARN-514.6.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/745//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/745//console

This message is automatically generated.

> Delayed store operations should not result in RM unavailability for app 
> submission
> --
>
> Key: YARN-514
> URL: https://issues.apache.org/jira/browse/YARN-514
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Bikas Saha
>Assignee: Zhijie Shen
> Attachments: YARN-514.1.patch, YARN-514.2.patch, YARN-514.3.patch, 
> YARN-514.4.patch, YARN-514.5.patch, YARN-514.6.patch
>
>
> Currently, app submission is the only store operation performed synchronously 
> because the app must be stored before the request returns with success. This 
> makes the RM susceptible to blocking all client threads on slow store 
> operations, resulting in RM being perceived as unavailable by clients.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-547) Race condition in Public / Private Localizer may result into resource getting downloaded again

2013-04-16 Thread Omkar Vinit Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omkar Vinit Joshi updated YARN-547:
---

Description: 
Public Localizer:
At present, when multiple containers request a localized resource:
* If the resource is not present, it is first created and resource 
localization starts (the LocalizedResource is in the DOWNLOADING state).
* If multiple ResourceRequestEvents arrive while it is in this state, 
ResourceLocalizationEvents are sent for all of them.

Most of the time this does not result in a duplicate resource download, but 
there is a race condition. Inside ResourceLocalization (for public download), 
all the requests are added to a local attempts map. When a new request comes 
in, it is first checked against this map before a new download starts for the 
same resource. While the current download is in progress, the request is in 
the map, so a duplicate request for the same resource is rejected (i.e. the 
resource is already being downloaded). However, when the current download 
completes, the request is removed from this local map. If a 
LocalizerRequestEvent comes in after this removal, the resource is no longer 
in the local map and will be downloaded again.

Private Localizer:
Here a different but similar race condition is present.
* Inside the findNextResource method call, each LocalizerRunner tries to grab 
a lock on the LocalizedResource. If the lock is not acquired, it keeps trying 
until the resource state changes to LOCALIZED. The lock is released by the 
LocalizerRunner when the download completes.
* If another ContainerLocalizer tries to grab the lock on a resource before 
the LocalizedResource state changes to LOCALIZED, the resource will be 
downloaded again.

In both places the root cause is that all the threads try to acquire the lock 
on the resource, but the current state of the LocalizedResource is not taken 
into consideration.
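
For illustration, here is a minimal sketch (simplified, hypothetical types; 
not the actual ResourceLocalizationService code) of why deciding purely on the 
pending-attempts map or the lock admits this race, and why the 
LocalizedResource state needs to be consulted:

{code}
// Illustrative sketch only: scheduling a download purely on "is there a
// pending attempt / can I grab the lock" admits the race described above;
// consulting the resource state closes it. Types here are simplified
// stand-ins, not the real localizer classes.
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class LocalizerRaceSketch {

  enum ResourceState { INIT, DOWNLOADING, LOCALIZED, LOCALIZATION_FAILED }

  static class LocalizedResource {
    volatile ResourceState state = ResourceState.INIT;
  }

  private final ConcurrentMap<String, LocalizedResource> attempts =
      new ConcurrentHashMap<String, LocalizedResource>();

  // Racy version: the finished download removes its entry from the map, so a
  // late request for the same key sees "no pending attempt" and downloads the
  // resource again.
  boolean shouldDownloadRacy(String key) {
    return attempts.putIfAbsent(key, new LocalizedResource()) == null;
  }

  // State-aware version: a request triggers a download only when the resource
  // is neither already localized nor currently being downloaded.
  boolean shouldDownloadStateAware(LocalizedResource rsrc) {
    synchronized (rsrc) {
      if (rsrc.state == ResourceState.LOCALIZED
          || rsrc.state == ResourceState.DOWNLOADING) {
        return false;               // already available or in flight
      }
      rsrc.state = ResourceState.DOWNLOADING;
      return true;
    }
  }
}
{code}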

  was:
At present when multiple containers try to request a localized resource 
* If the resource is not present then first it is created and Resource 
Localization starts ( LocalizedResource is in DOWNLOADING state)
* Now if in this state multiple ResourceRequestEvents come in then 
ResourceLocalizationEvents are fired for all of them.

Most of the times it is not resulting into a duplicate resource download but 
there is a race condition present there. Inside ResourceLocalization (for 
public download) all the requests are added to local attempts map. If a new 
request comes in then first it is checked in this map before a new download 
starts for the same. For the current download the request will be there in the 
map. Now if a same resource request comes in then it will rejected (i.e. 
resource is getting downloaded already). However if the current download 
completes then the request will be removed from this local map. Now after this 
removal if the LocalizerRequestEvent comes in then as it is not present in 

The root cause for this is the presence of FetchResourceTransition on receiving 
ResourceRequestEvent in DOWNLOADING state.


> Race condition in Public / Private Localizer may result into resource getting 
> downloaded again
> --
>
> Key: YARN-547
> URL: https://issues.apache.org/jira/browse/YARN-547
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
> Attachments: yarn-547-20130411.1.patch, yarn-547-20130411.patch, 
> yarn-547-20130412.patch, yarn-547-20130415.patch
>
>
> Public Localizer :
> At present when multiple containers try to request a localized resource 
> * If the resource is not present then first it is created and Resource 
> Localization starts ( LocalizedResource is in DOWNLOADING state)
> * Now if in this state multiple ResourceRequestEvents arrive then 
> ResourceLocalizationEvents are sent for all of them.
> Most of the times it is not resulting into a duplicate resource download but 
> there is a race condition present there. Inside ResourceLocalization (for 
> public download) all the requests are added to local attempts map. If a new 
> request comes in then first it is checked in this map before a new download 
> starts for the same. For the current download the request will be there in 
> the map. Now if a same resource request comes in then it will rejected (i.e. 
> resource is getting downloaded already). However if the current download 
> completes then the request will be removed from this local map. Now after 
> this removal if the LocalizerRequestEvent comes in then as it is not present 
> in local map the resource will be downloaded again.
> PrivateLocalizer :
> Here a different but similar race condition is present.
> * Here inside findNextResource method call; each LocalizerRunner tries to 
> grab a lock on LocalizerResource. If the lock is not acquired then it will 
> keep trying until the resource state changes to LOCALIZED. This lock will be 
> released by the LocalizerRunner when download completes.
> * Now if another ContainerLocalizer tries to grab the lock on a resource 
> before LocalizedResource state changes to LOCALIZED then resource will be 
> downloaded again.
> At both the places the root cause of this is that all the threads try to 
> acquire the lock on resource however current state of the LocalizedResource 
> is not taken into consideration.

[jira] [Updated] (YARN-547) Race condition in Public / Private Localizer may result into resource getting downloaded again

2013-04-16 Thread Omkar Vinit Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omkar Vinit Joshi updated YARN-547:
---

Attachment: yarn-547-20130416.patch

I have added separate test cases for the public and private localizer. With a 
little modification they should fail against the earlier implementation.
For both test cases (a simplified model of this flow follows the list):
* First, I set up the required ResourceLocalizationService, the other 
dispatchers, and the directory handlers.
* Container-1 makes a request for resource-1. The request is handled and the 
resource download starts (the resource state and semaphore count are verified).
* Container-2 then requests the same resource while it is being downloaded. 
The request is rejected because the lock on the resource cannot be acquired.
* The resource-1 download fails; it is transitioned into the 
LOCALIZATION_FAILED state and the semaphore lock is released.
* Container-3 then requests the same resource. This request is rejected 
because, even though it can acquire the lock, the resource is no longer in the 
DOWNLOADING state.
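
Here is that simplified model (a plain Semaphore plus a state enum; 
hypothetical classes, not the actual test or localizer code):

{code}
// Illustrative sketch only: a simplified model of the accept/reject sequence
// above. This is not the actual test code; it just mirrors the decisions the
// new tests verify.
import java.util.concurrent.Semaphore;

public class LocalizerTestFlowSketch {

  enum ResourceState { DOWNLOADING, LOCALIZED, LOCALIZATION_FAILED }

  static class Resource {
    final Semaphore lock = new Semaphore(1);
    ResourceState state;
  }

  // A request is accepted only if the lock is free AND the resource is still
  // in the DOWNLOADING state (i.e. waiting to be picked up).
  static boolean tryStartDownload(Resource r) {
    if (!r.lock.tryAcquire()) {
      return false;                                // another localizer holds it
    }
    if (r.state != ResourceState.DOWNLOADING) {
      r.lock.release();
      return false;                                // already localized or failed
    }
    return true;                                   // proceed, holding the lock
  }

  public static void main(String[] args) {
    Resource r = new Resource();
    r.state = ResourceState.DOWNLOADING;           // container-1's request event

    boolean c1 = tryStartDownload(r);              // accepted: download starts
    boolean c2 = tryStartDownload(r);              // rejected: lock is held

    r.state = ResourceState.LOCALIZATION_FAILED;   // the download fails
    r.lock.release();                              // semaphore is released

    boolean c3 = tryStartDownload(r);              // rejected: state check

    System.out.println(c1 + " " + c2 + " " + c3);  // true false false
  }
}
{code}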


> Race condition in Public / Private Localizer may result into resource getting 
> downloaded again
> --
>
> Key: YARN-547
> URL: https://issues.apache.org/jira/browse/YARN-547
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
> Attachments: yarn-547-20130411.1.patch, yarn-547-20130411.patch, 
> yarn-547-20130412.patch, yarn-547-20130415.patch, yarn-547-20130416.patch
>
>
> Public Localizer :
> At present when multiple containers try to request a localized resource 
> * If the resource is not present then first it is created and Resource 
> Localization starts ( LocalizedResource is in DOWNLOADING state)
> * Now if in this state multiple ResourceRequestEvents arrive then 
> ResourceLocalizationEvents are sent for all of them.
> Most of the times it is not resulting into a duplicate resource download but 
> there is a race condition present there. Inside ResourceLocalization (for 
> public download) all the requests are added to local attempts map. If a new 
> request comes in then first it is checked in this map before a new download 
> starts for the same. For the current download the request will be there in 
> the map. Now if a same resource request comes in then it will rejected (i.e. 
> resource is getting downloaded already). However if the current download 
> completes then the request will be removed from this local map. Now after 
> this removal if the LocalizerRequestEvent comes in then as it is not present 
> in local map the resource will be downloaded again.
> PrivateLocalizer :
> Here a different but similar race condition is present.
> * Here inside findNextResource method call; each LocalizerRunner tries to 
> grab a lock on LocalizerResource. If the lock is not acquired then it will 
> keep trying until the resource state changes to LOCALIZED. This lock will be 
> released by the LocalizerRunner when download completes.
> * Now if another ContainerLocalizer tries to grab the lock on a resource 
> before LocalizedResource state changes to LOCALIZED then resource will be 
> downloaded again.
> At both the places the root cause of this is that all the threads try to 
> acquire the lock on resource however current state of the LocalizedResource 
> is not taken into consideration.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-547) Race condition in Public / Private Localizer may result into resource getting downloaded again

2013-04-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632678#comment-13632678
 ] 

Hadoop QA commented on YARN-547:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12578902/yarn-547-20130416.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:red}-1 javac{color:red}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/746//console

This message is automatically generated.

> Race condition in Public / Private Localizer may result into resource getting 
> downloaded again
> --
>
> Key: YARN-547
> URL: https://issues.apache.org/jira/browse/YARN-547
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
> Attachments: yarn-547-20130411.1.patch, yarn-547-20130411.patch, 
> yarn-547-20130412.patch, yarn-547-20130415.patch, yarn-547-20130416.patch
>
>
> Public Localizer :
> At present when multiple containers try to request a localized resource 
> * If the resource is not present then first it is created and Resource 
> Localization starts ( LocalizedResource is in DOWNLOADING state)
> * Now if in this state multiple ResourceRequestEvents arrive then 
> ResourceLocalizationEvents are sent for all of them.
> Most of the times it is not resulting into a duplicate resource download but 
> there is a race condition present there. Inside ResourceLocalization (for 
> public download) all the requests are added to local attempts map. If a new 
> request comes in then first it is checked in this map before a new download 
> starts for the same. For the current download the request will be there in 
> the map. Now if a same resource request comes in then it will rejected (i.e. 
> resource is getting downloaded already). However if the current download 
> completes then the request will be removed from this local map. Now after 
> this removal if the LocalizerRequestEvent comes in then as it is not present 
> in local map the resource will be downloaded again.
> PrivateLocalizer :
> Here a different but similar race condition is present.
> * Here inside findNextResource method call; each LocalizerRunner tries to 
> grab a lock on LocalizerResource. If the lock is not acquired then it will 
> keep trying until the resource state changes to LOCALIZED. This lock will be 
> released by the LocalizerRunner when download completes.
> * Now if another ContainerLocalizer tries to grab the lock on a resource 
> before LocalizedResource state changes to LOCALIZED then resource will be 
> downloaded again.
> At both the places the root cause of this is that all the threads try to 
> acquire the lock on resource however current state of the LocalizedResource 
> is not taken into consideration.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-547) Race condition in Public / Private Localizer may result into resource getting downloaded again

2013-04-16 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632691#comment-13632691
 ] 

Omkar Vinit Joshi commented on YARN-547:


There was a problem at my end while creating the patch; resubmitting it. (I 
took the diff against a local trunk in which 2 files were already committed.)

> Race condition in Public / Private Localizer may result into resource getting 
> downloaded again
> --
>
> Key: YARN-547
> URL: https://issues.apache.org/jira/browse/YARN-547
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
> Attachments: yarn-547-20130411.1.patch, yarn-547-20130411.patch, 
> yarn-547-20130412.patch, yarn-547-20130415.patch, yarn-547-20130416.patch
>
>
> Public Localizer :
> At present when multiple containers try to request a localized resource 
> * If the resource is not present then first it is created and Resource 
> Localization starts ( LocalizedResource is in DOWNLOADING state)
> * Now if in this state multiple ResourceRequestEvents arrive then 
> ResourceLocalizationEvents are sent for all of them.
> Most of the times it is not resulting into a duplicate resource download but 
> there is a race condition present there. Inside ResourceLocalization (for 
> public download) all the requests are added to local attempts map. If a new 
> request comes in then first it is checked in this map before a new download 
> starts for the same. For the current download the request will be there in 
> the map. Now if a same resource request comes in then it will rejected (i.e. 
> resource is getting downloaded already). However if the current download 
> completes then the request will be removed from this local map. Now after 
> this removal if the LocalizerRequestEvent comes in then as it is not present 
> in local map the resource will be downloaded again.
> PrivateLocalizer :
> Here a different but similar race condition is present.
> * Here inside findNextResource method call; each LocalizerRunner tries to 
> grab a lock on LocalizerResource. If the lock is not acquired then it will 
> keep trying until the resource state changes to LOCALIZED. This lock will be 
> released by the LocalizerRunner when download completes.
> * Now if another ContainerLocalizer tries to grab the lock on a resource 
> before LocalizedResource state changes to LOCALIZED then resource will be 
> downloaded again.
> At both the places the root cause of this is that all the threads try to 
> acquire the lock on resource however current state of the LocalizedResource 
> is not taken into consideration.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-547) Race condition in Public / Private Localizer may result into resource getting downloaded again

2013-04-16 Thread Omkar Vinit Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omkar Vinit Joshi updated YARN-547:
---

Attachment: yarn-547-20130416.1.patch

> Race condition in Public / Private Localizer may result into resource getting 
> downloaded again
> --
>
> Key: YARN-547
> URL: https://issues.apache.org/jira/browse/YARN-547
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
> Attachments: yarn-547-20130411.1.patch, yarn-547-20130411.patch, 
> yarn-547-20130412.patch, yarn-547-20130415.patch, yarn-547-20130416.1.patch, 
> yarn-547-20130416.patch
>
>
> Public Localizer :
> At present when multiple containers try to request a localized resource 
> * If the resource is not present then first it is created and Resource 
> Localization starts ( LocalizedResource is in DOWNLOADING state)
> * Now if in this state multiple ResourceRequestEvents arrive then 
> ResourceLocalizationEvents are sent for all of them.
> Most of the times it is not resulting into a duplicate resource download but 
> there is a race condition present there. Inside ResourceLocalization (for 
> public download) all the requests are added to local attempts map. If a new 
> request comes in then first it is checked in this map before a new download 
> starts for the same. For the current download the request will be there in 
> the map. Now if a same resource request comes in then it will rejected (i.e. 
> resource is getting downloaded already). However if the current download 
> completes then the request will be removed from this local map. Now after 
> this removal if the LocalizerRequestEvent comes in then as it is not present 
> in local map the resource will be downloaded again.
> PrivateLocalizer :
> Here a different but similar race condition is present.
> * Here inside findNextResource method call; each LocalizerRunner tries to 
> grab a lock on LocalizerResource. If the lock is not acquired then it will 
> keep trying until the resource state changes to LOCALIZED. This lock will be 
> released by the LocalizerRunner when download completes.
> * Now if another ContainerLocalizer tries to grab the lock on a resource 
> before LocalizedResource state changes to LOCALIZED then resource will be 
> downloaded again.
> At both the places the root cause of this is that all the threads try to 
> acquire the lock on resource however current state of the LocalizedResource 
> is not taken into consideration.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-547) Race condition in Public / Private Localizer may result into resource getting downloaded again

2013-04-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632711#comment-13632711
 ] 

Hadoop QA commented on YARN-547:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12578908/yarn-547-20130416.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/747//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/747//console

This message is automatically generated.

> Race condition in Public / Private Localizer may result into resource getting 
> downloaded again
> --
>
> Key: YARN-547
> URL: https://issues.apache.org/jira/browse/YARN-547
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
> Attachments: yarn-547-20130411.1.patch, yarn-547-20130411.patch, 
> yarn-547-20130412.patch, yarn-547-20130415.patch, yarn-547-20130416.1.patch, 
> yarn-547-20130416.patch
>
>
> Public Localizer :
> At present when multiple containers try to request a localized resource 
> * If the resource is not present then first it is created and Resource 
> Localization starts ( LocalizedResource is in DOWNLOADING state)
> * Now if in this state multiple ResourceRequestEvents arrive then 
> ResourceLocalizationEvents are sent for all of them.
> Most of the times it is not resulting into a duplicate resource download but 
> there is a race condition present there. Inside ResourceLocalization (for 
> public download) all the requests are added to local attempts map. If a new 
> request comes in then first it is checked in this map before a new download 
> starts for the same. For the current download the request will be there in 
> the map. Now if a same resource request comes in then it will rejected (i.e. 
> resource is getting downloaded already). However if the current download 
> completes then the request will be removed from this local map. Now after 
> this removal if the LocalizerRequestEvent comes in then as it is not present 
> in local map the resource will be downloaded again.
> PrivateLocalizer :
> Here a different but similar race condition is present.
> * Here inside findNextResource method call; each LocalizerRunner tries to 
> grab a lock on LocalizerResource. If the lock is not acquired then it will 
> keep trying until the resource state changes to LOCALIZED. This lock will be 
> released by the LocalizerRunner when download completes.
> * Now if another ContainerLocalizer tries to grab the lock on a resource 
> before LocalizedResource state changes to LOCALIZED then resource will be 
> downloaded again.
> At both the places the root cause of this is that all the threads try to 
> acquire the lock on resource however current state of the LocalizedResource 
> is not taken into consideration.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Reopened] (YARN-501) Application Master getting killed randomly reporting excess usage of memory

2013-04-16 Thread Krishna Kishore Bonagiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krishna Kishore Bonagiri reopened YARN-501:
---


Added information dated 24th March and 15th April; please see above. As 
mentioned, it is not really an issue of the container using more than what was 
allocated to it. This issue happens randomly when running the same 
application, either mine or the Distributed Shell example, with just a date 
command.

Thanks,
Kishore

> Application Master getting killed randomly reporting excess usage of memory
> ---
>
> Key: YARN-501
> URL: https://issues.apache.org/jira/browse/YARN-501
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell, nodemanager
>Affects Versions: 2.0.3-alpha
>Reporter: Krishna Kishore Bonagiri
>
> I am running a date command using the Distributed Shell example in a loop of 
> 500 times. It ran successfully all the times except one time where it gave 
> the following error.
> 2013-03-22 04:33:25,280 INFO  [main] distributedshell.Client 
> (Client.java:monitorApplication(605)) - Got application report from ASM for, 
> appId=222, clientToken=null, appDiagnostics=Application 
> application_1363938200742_0222 failed 1 times due to AM Container for 
> appattempt_1363938200742_0222_01 exited with  exitCode: 143 due to: 
> Container [pid=21141,containerID=container_1363938200742_0222_01_01] is 
> running beyond virtual memory limits. Current usage: 47.3 Mb of 128 Mb 
> physical memory used; 611.6 Mb of 268.8 Mb virtual memory used. Killing 
> container.
> Dump of the process-tree for container_1363938200742_0222_01_01 :
> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
> |- 21147 21141 21141 21141 (java) 244 12 532643840 11802 
> /home_/dsadm/yarn/jdk//bin/java -Xmx128m 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster 
> --container_memory 10 --num_containers 2 --priority 0 --shell_command date
> |- 21141 8433 21141 21141 (bash) 0 0 108642304 298 /bin/bash -c 
> /home_/dsadm/yarn/jdk//bin/java -Xmx128m 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster 
> --container_memory 10 --num_containers 2 --priority 0 --shell_command date 
> 1>/tmp/logs/application_1363938200742_0222/container_1363938200742_0222_01_01/AppMaster.stdout
>  
> 2>/tmp/logs/application_1363938200742_0222/container_1363938200742_0222_01_01/AppMaster.stderr

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-444) Move special container exit codes from YarnConfiguration to API

2013-04-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632743#comment-13632743
 ] 

Hudson commented on YARN-444:
-

Integrated in Hadoop-Yarn-trunk #185 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/185/])
YARN-444. Moved special container exit codes from YarnConfiguration to API 
where they belong. Contributed by Sandy Ryza.
MAPREDUCE-5151. Updated MR AM to use standard exit codes from the API after 
YARN-444. Contributed by Sandy Ryza. (Revision 1468276)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1468276
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRMContainerAllocator.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ContainerExitStatus.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerStatus.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/dao/ContainerInfo.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java


> Move special container exit codes from YarnConfiguration to API
> ---
>
> Key: YARN-444
> URL: https://issues.apache.org/jira/browse/YARN-444
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications/distributed-shell
>Affects Versions: 2.0.3-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Fix For: 2.0.5-beta
>
> Attachments: YARN-444-1.patch, YARN-444-2.patch, YARN-444-2.patch, 
> YARN-444-2.patch, YARN-444.patch
>
>
> YarnConfiguration currently contains the special container exit codes 
> INVALID_CONTAINER_EXIT_STATUS = -1000, ABORTED_CONTAINER_EXIT_STATUS = -100, 
> and DISKS_FAILED = -101.
> These are not really related to configuration, and 
> YarnConfiguration should not become a place to put miscellaneous constants.
> Per discussion on YARN-417, appmaster writers need to be able to provide 
> special handling for them, so it might make sense to move these to their own 
> user-facing class.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-580) Delay scheduling in capacity scheduler is not ensuring 100% locality

2013-04-16 Thread Nishan Shetty (JIRA)
Nishan Shetty created YARN-580:
--

 Summary: Delay scheduling in capacity scheduler is not ensuring 
100% locality
 Key: YARN-580
 URL: https://issues.apache.org/jira/browse/YARN-580
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.0.1-alpha, 2.0.2-alpha
Reporter: Nishan Shetty


Example

Machine1: 3 blocks
Machine2: 2 blocks
Machine3: 1 block

When we run a job on this data, 100% node locality is not ensured.

Tasks run like below even if slots are available on all nodes:
--
Machine1: 4 tasks
Machine2: 2 tasks
Machine3: No tasks


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (YARN-580) Delay scheduling in capacity scheduler is not ensuring 100% locality

2013-04-16 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K reassigned YARN-580:
--

Assignee: Devaraj K

> Delay scheduling in capacity scheduler is not ensuring 100% locality
> 
>
> Key: YARN-580
> URL: https://issues.apache.org/jira/browse/YARN-580
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 2.0.2-alpha, 2.0.1-alpha
>Reporter: Nishan Shetty
>Assignee: Devaraj K
>
> Example
> 
> Machine1: 3 blocks
> Machine2: 2 blocks
> Machine3: 1 blocks
> When we run job on this data, node locality is not ensured 100%
> Tasks run like below even if slots are available in all nodes:
> --
> Machine1: 4Task
> Machine2: 2Task
> Machine3: No task 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-444) Move special container exit codes from YarnConfiguration to API

2013-04-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632816#comment-13632816
 ] 

Hudson commented on YARN-444:
-

Integrated in Hadoop-Hdfs-trunk #1374 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1374/])
YARN-444. Moved special container exit codes from YarnConfiguration to API 
where they belong. Contributed by Sandy Ryza.
MAPREDUCE-5151. Updated MR AM to use standard exit codes from the API after 
YARN-444. Contributed by Sandy Ryza. (Revision 1468276)

 Result = FAILURE
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1468276
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRMContainerAllocator.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ContainerExitStatus.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerStatus.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/dao/ContainerInfo.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java


> Move special container exit codes from YarnConfiguration to API
> ---
>
> Key: YARN-444
> URL: https://issues.apache.org/jira/browse/YARN-444
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications/distributed-shell
>Affects Versions: 2.0.3-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Fix For: 2.0.5-beta
>
> Attachments: YARN-444-1.patch, YARN-444-2.patch, YARN-444-2.patch, 
> YARN-444-2.patch, YARN-444.patch
>
>
> YarnConfiguration currently contains the special container exit codes 
> INVALID_CONTAINER_EXIT_STATUS = -1000, ABORTED_CONTAINER_EXIT_STATUS = -100, 
> and DISKS_FAILED = -101.
> These are not really related to configuration, and 
> YarnConfiguration should not become a place to put miscellaneous constants.
> Per discussion on YARN-417, appmaster writers need to be able to provide 
> special handling for them, so it might make sense to move these to their own 
> user-facing class.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-580) Delay scheduling in capacity scheduler is not ensuring 100% locality

2013-04-16 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632830#comment-13632830
 ] 

Thomas Graves commented on YARN-580:


Depending on what version you are using, MAPREDUCE-4893 helped make this 
better. Even with that, though, you probably won't get perfect locality.

I looked at this with MR, and the problem there is in the way the AM asks for 
containers, or rather the lack of a way to tell the RM "I need only one of 
these 3 locations" (assuming a block replication factor of 3). As far as the 
RM was concerned, in my case it was giving perfect locality for what was 
requested, but since the MR AM generally asks for 3 locations per block and 
the RM doesn't know that filling one request negates the other 2, it can give 
you containers that don't get you perfect locality. In that case the AM would 
have to be smarter about requesting, or about giving containers back and 
asking again (a rough illustration follows).
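
A rough illustration of the over-ask (hypothetical bookkeeping only, not the 
real ResourceRequest protocol):

{code}
// Illustrative sketch only: with replication factor 3, an AM that asks for a
// host-local container on every replica of every block advertises 3x as many
// host-local asks as containers it needs, and the RM cannot tell that
// satisfying one of the three cancels the other two.
import java.util.HashMap;
import java.util.Map;

public class LocalityOverAskSketch {
  public static void main(String[] args) {
    // Hypothetical replica placement for 3 blocks, replication factor 3.
    String[][] blockReplicas = {
        {"host1", "host2", "host3"},
        {"host2", "host3", "host4"},
        {"host1", "host3", "host4"},
    };

    // The AM asks for a host-local container on every replica of every block.
    Map<String, Integer> perHostAsks = new HashMap<String, Integer>();
    int totalAsks = 0;
    for (String[] replicas : blockReplicas) {
      for (String host : replicas) {
        Integer prev = perHostAsks.get(host);
        perHostAsks.put(host, prev == null ? 1 : prev + 1);
        totalAsks++;
      }
    }

    System.out.println("containers needed : " + blockReplicas.length);  // 3
    System.out.println("host-local asks   : " + totalAsks);             // 9
    System.out.println("per-host asks     : " + perHostAsks);
  }
}
{code}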

> Delay scheduling in capacity scheduler is not ensuring 100% locality
> 
>
> Key: YARN-580
> URL: https://issues.apache.org/jira/browse/YARN-580
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 2.0.2-alpha, 2.0.1-alpha
>Reporter: Nishan Shetty
>Assignee: Devaraj K
>
> Example
> 
> Machine1: 3 blocks
> Machine2: 2 blocks
> Machine3: 1 blocks
> When we run job on this data, node locality is not ensured 100%
> Tasks run like below even if slots are available in all nodes:
> --
> Machine1: 4Task
> Machine2: 2Task
> Machine3: No task 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-444) Move special container exit codes from YarnConfiguration to API

2013-04-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632852#comment-13632852
 ] 

Hudson commented on YARN-444:
-

Integrated in Hadoop-Mapreduce-trunk #1401 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1401/])
YARN-444. Moved special container exit codes from YarnConfiguration to API 
where they belong. Contributed by Sandy Ryza.
MAPREDUCE-5151. Updated MR AM to use standard exit codes from the API after 
YARN-444. Contributed by Sandy Ryza. (Revision 1468276)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1468276
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRMContainerAllocator.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ContainerExitStatus.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerStatus.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/dao/ContainerInfo.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java


> Move special container exit codes from YarnConfiguration to API
> ---
>
> Key: YARN-444
> URL: https://issues.apache.org/jira/browse/YARN-444
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications/distributed-shell
>Affects Versions: 2.0.3-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Fix For: 2.0.5-beta
>
> Attachments: YARN-444-1.patch, YARN-444-2.patch, YARN-444-2.patch, 
> YARN-444-2.patch, YARN-444.patch
>
>
> YarnConfiguration currently contains the special container exit codes 
> INVALID_CONTAINER_EXIT_STATUS = -1000, ABORTED_CONTAINER_EXIT_STATUS = -100, 
> and DISKS_FAILED = -101.
> These are not really related to configuration, and 
> YarnConfiguration should not become a place to put miscellaneous constants.
> Per discussion on YARN-417, appmaster writers need to be able to provide 
> special handling for them, so it might make sense to move these to their own 
> user-facing class.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-168) No way to turn off virtual memory limits without turning off physical memory limits

2013-04-16 Thread Krishna Kishore Bonagiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632868#comment-13632868
 ] 

Krishna Kishore Bonagiri commented on YARN-168:
---

How can I get this fix if I want it now? When would the next release be? I 
mean, the one having this fix!

> No way to turn off virtual memory limits without turning off physical memory 
> limits
> ---
>
> Key: YARN-168
> URL: https://issues.apache.org/jira/browse/YARN-168
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0
>Reporter: Harsh J
>
> Asked and reported by a user (Krishna) on ML:
> {quote}
> This is possible to do, but you've hit a bug with the current YARN
> implementation. Ideally you should be able to configure the vmem-pmem
> ratio (or an equivalent config) to be -1, to indicate disabling of
> virtual memory checks completely (and there's indeed checks for this),
> but it seems like we are enforcing the ratio to be at least 1.0 (and
> hence negatives are disallowed).
> You can't workaround by setting the NM's offered resource.mb to -1
> either, as you'll lose out on controlling maximum allocations.
> Please file a YARN bug on JIRA. The code at fault lies under
> ContainersMonitorImpl#init(…).
> On Thu, Oct 18, 2012 at 4:00 PM, Krishna Kishore Bonagiri
>  wrote:
> > Hi,
> >
> >   Is there a way we can ask the YARN RM for not killing a container when it
> > uses excess virtual memory than the maximum it can use as per the
> > specification in the configuration file yarn-site.xml? We can't always
> > estimate the amount of virtual memory needed for our application running on
> > a container, but we don't want to get it killed in a case it exceeds the
> > maximum limit.
> >
> >   Please suggest as to how can we come across this issue.
> >
> > Thanks,
> > Kishore
> {quote}
> Basically, we're doing:
> {code}
> // / Virtual memory configuration //
> float vmemRatio = conf.getFloat(
> YarnConfiguration.NM_VMEM_PMEM_RATIO,
> YarnConfiguration.DEFAULT_NM_VMEM_PMEM_RATIO);
> Preconditions.checkArgument(vmemRatio > 0.99f,
> YarnConfiguration.NM_VMEM_PMEM_RATIO +
> " should be at least 1.0");
> this.maxVmemAllottedForContainers =
>   (long)(vmemRatio * maxPmemAllottedForContainers);
> {code}
> For virtual memory monitoring to be disabled, maxVmemAllottedForContainers 
> has to be -1. For that to be -1, given the above buggy computation, vmemRatio 
> must be -1 or maxPmemAllottedForContainers must be -1.
> If vmemRatio were -1, we fail the precondition check and exit.
> If maxPmemAllottedForContainers were -1, we also end up disabling physical 
> memory monitoring.
> Or perhaps that makes sense - to disable both physical and virtual memory 
> monitoring, but that way your NM becomes infinite in resource grants, I think.
> We need a way to selectively disable kills done via virtual memory 
> monitoring, which is the base request here.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-514) Delayed store operations should not result in RM unavailability for app submission

2013-04-16 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632975#comment-13632975
 ] 

Bikas Saha commented on YARN-514:
-

StartAppAttemptTransition and RMAppStartingOrSavingTransition can both start an 
app attempt and are currently duplicating the code that starts the attempt.

IMO, some of the new tests seem redundant, e.g. 
testAppSubmittedNoRecoveryKill/testAppSubmittedRecoveryKill are basically 
testing the SUBMITTED->KILL transition. Do they depend on how the app came into 
the SUBMITTED state? If not, then the tests are redundant and increase the 
maintenance burden. The same applies to the other such tests for the SUBMITTED 
state.

In the new testAppFailedFailed(), the comments should say FAILED->FAILED and 
not KILLED->KILLED, right?

The javadoc is wrong since this method is not called by the derived class; in 
fact it's a private method. It looks like it was copied from the doc for 
notify*Attempt. Please correct both javadocs.
{code}
+   * In (@link storeApplication}, derived class can call this method to notify
+   * the application about operation completion
+   * @param appId id of the application that has been saved
+   * @param storedException the exception that is thrown when storing the
+   * application
+   */
+  private void notifyDoneStoringApplication(ApplicationId appId,
{code}
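
For illustration only, one possible corrected wording, keeping the same 
fragment form (the exact phrasing is of course up to the patch author):

{code}
+   * Notify the application about the completion of the store operation
+   * performed in {@link #storeApplication}.
+   * @param appId id of the application that has been saved
+   * @param storedException the exception that was thrown while storing the
+   * application, if any
+   */
+  private void notifyDoneStoringApplication(ApplicationId appId,
{code}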

> Delayed store operations should not result in RM unavailability for app 
> submission
> --
>
> Key: YARN-514
> URL: https://issues.apache.org/jira/browse/YARN-514
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Bikas Saha
>Assignee: Zhijie Shen
> Attachments: YARN-514.1.patch, YARN-514.2.patch, YARN-514.3.patch, 
> YARN-514.4.patch, YARN-514.5.patch, YARN-514.6.patch
>
>
> Currently, app submission is the only store operation performed synchronously 
> because the app must be stored before the request returns with success. This 
> makes the RM susceptible to blocking all client threads on slow store 
> operations, resulting in RM being perceived as unavailable by clients.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-579) Make ApplicationToken part of Container's token list to help RM-restart

2013-04-16 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632992#comment-13632992
 ] 

Bikas Saha commented on YARN-579:
-

In order to help restart, the creation of the appToken needs to move out of 
AMLauncher. The appToken needs to be set before the app attempt is saved, and 
the app attempt is saved after the masterContainer is received from the 
scheduler.

> Make ApplicationToken part of Container's token list to help RM-restart
> ---
>
> Key: YARN-579
> URL: https://issues.apache.org/jira/browse/YARN-579
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
>
> Container is already persisted for helping RM restart. Instead of explicitly 
> setting ApplicationToken in AM's env, if we change it to be in Container, we 
> can avoid env and can also help restart.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-582) Restore appToken and clientToken for app attempt after RM restart

2013-04-16 Thread Bikas Saha (JIRA)
Bikas Saha created YARN-582:
---

 Summary: Restore appToken and clientToken for app attempt after RM 
restart
 Key: YARN-582
 URL: https://issues.apache.org/jira/browse/YARN-582
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Jian He


These need to be saved and restored on a per-app-attempt basis. This is 
required only when work-preserving restart is implemented for secure clusters. 
In a non-work-preserving restart, app attempts are killed, so this does not 
matter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-581) Test and verify that app delegation tokens are restored after RM restart

2013-04-16 Thread Bikas Saha (JIRA)
Bikas Saha created YARN-581:
---

 Summary: Test and verify that app delegation tokens are restored 
after RM restart
 Key: YARN-581
 URL: https://issues.apache.org/jira/browse/YARN-581
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Jian He


The code already saves the delegation tokens in AppSubmissionContext. Upon 
restart the AppSubmissionContext is used to submit the application again and so 
restores the delegation tokens. This jira tracks testing and verifying this 
functionality in a secure setup.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-248) Security related work for RM restart

2013-04-16 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633007#comment-13633007
 ] 

Bikas Saha commented on YARN-248:
-

Closing this because YARN-581, YARN-582 etc have been opened to track specific 
work under YARN-128. Please add specific items that may have been missed. Or 
you could leave a comment here and I can open specific jiras.

> Security related work for RM restart
> 
>
> Key: YARN-248
> URL: https://issues.apache.org/jira/browse/YARN-248
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Tom White
>Assignee: Bikas Saha
>
> On restart, the RM creates a new RMDelegationTokenSecretManager with fresh 
> state. This will cause problems for Oozie jobs running on secure clusters 
> since the delegation tokens stored in the job credentials (used by the Oozie 
> launcher job to submit a job to the RM) will not be recognized by the RM, and 
> recovery will fail.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (YARN-248) Security related work for RM restart

2013-04-16 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha resolved YARN-248.
-

Resolution: Duplicate

> Security related work for RM restart
> 
>
> Key: YARN-248
> URL: https://issues.apache.org/jira/browse/YARN-248
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Tom White
>Assignee: Bikas Saha
>
> On restart, the RM creates a new RMDelegationTokenSecretManager with fresh 
> state. This will cause problems for Oozie jobs running on secure clusters 
> since the delegation tokens stored in the job credentials (used by the Oozie 
> launcher job to submit a job to the RM) will not be recognized by the RM, and 
> recovery will fail.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-501) Application Master getting killed randomly reporting excess usage of memory

2013-04-16 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633064#comment-13633064
 ] 

Hitesh Shah commented on YARN-501:
--

@Kishore, thanks for the logs. Looking at them. Would it be possible for you to 
generate another set but with the RM and NM running at DEBUG log level? 

> Application Master getting killed randomly reporting excess usage of memory
> ---
>
> Key: YARN-501
> URL: https://issues.apache.org/jira/browse/YARN-501
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell, nodemanager
>Affects Versions: 2.0.3-alpha
>Reporter: Krishna Kishore Bonagiri
>
> I am running a date command using the Distributed Shell example in a loop 
> 500 times. It ran successfully every time except once, when it gave the 
> following error.
> 2013-03-22 04:33:25,280 INFO  [main] distributedshell.Client 
> (Client.java:monitorApplication(605)) - Got application report from ASM for, 
> appId=222, clientToken=null, appDiagnostics=Application 
> application_1363938200742_0222 failed 1 times due to AM Container for 
> appattempt_1363938200742_0222_01 exited with  exitCode: 143 due to: 
> Container [pid=21141,containerID=container_1363938200742_0222_01_01] is 
> running beyond virtual memory limits. Current usage: 47.3 Mb of 128 Mb 
> physical memory used; 611.6 Mb of 268.8 Mb virtual memory used. Killing 
> container.
> Dump of the process-tree for container_1363938200742_0222_01_01 :
> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
> |- 21147 21141 21141 21141 (java) 244 12 532643840 11802 
> /home_/dsadm/yarn/jdk//bin/java -Xmx128m 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster 
> --container_memory 10 --num_containers 2 --priority 0 --shell_command date
> |- 21141 8433 21141 21141 (bash) 0 0 108642304 298 /bin/bash -c 
> /home_/dsadm/yarn/jdk//bin/java -Xmx128m 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster 
> --container_memory 10 --num_containers 2 --priority 0 --shell_command date 
> 1>/tmp/logs/application_1363938200742_0222/container_1363938200742_0222_01_01/AppMaster.stdout
>  
> 2>/tmp/logs/application_1363938200742_0222/container_1363938200742_0222_01_01/AppMaster.stderr

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-514) Delayed store operations should not result in RM unavailability for app submission

2013-04-16 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633072#comment-13633072
 ] 

Bikas Saha commented on YARN-514:
-

I am thinking of the following: RMAppManager.recover can call 
submitApplication and send a RECOVER event to the RMApp instead of START. In 
RMApp we can then transition from NEW->SUBMITTED when we receive the RECOVER 
event. That may be less code change than saving a recovered flag and making 
transitions depend on it, and a clean recover transition is also clearer than 
flag-dependent transitions. Currently, going NEW->SUBMITTED on the recover 
event, a new app will be created; later, for work-preserving restart, on the 
recover event the app could move from NEW->RUNNING based on the last running 
app attempt. This will also prevent more bugs like storing the same app twice, 
because we will go down the recovery transition and not the normal transition.
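A plain-Java toy illustration of the proposed transition table (not the real 
RMAppImpl/StateMachineFactory code; RECOVER is only the event name suggested 
in this comment): a recovered app skips the NEW_SAVING store step because it 
is already in the state store.
{code}
import java.util.EnumMap;
import java.util.Map;

public class RecoverTransitionSketch {
  enum State { NEW, NEW_SAVING, SUBMITTED }
  enum Event { START, RECOVER }

  private static final Map<Event, State> FROM_NEW =
      new EnumMap<Event, State>(Event.class);
  static {
    FROM_NEW.put(Event.START, State.NEW_SAVING);  // normal submission: store the app first
    FROM_NEW.put(Event.RECOVER, State.SUBMITTED); // recovery: already stored, skip saving
  }

  public static void main(String[] args) {
    System.out.println("START   -> " + FROM_NEW.get(Event.START));
    System.out.println("RECOVER -> " + FROM_NEW.get(Event.RECOVER));
  }
}
{code}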

> Delayed store operations should not result in RM unavailability for app 
> submission
> --
>
> Key: YARN-514
> URL: https://issues.apache.org/jira/browse/YARN-514
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Bikas Saha
>Assignee: Zhijie Shen
> Attachments: YARN-514.1.patch, YARN-514.2.patch, YARN-514.3.patch, 
> YARN-514.4.patch, YARN-514.5.patch, YARN-514.6.patch
>
>
> Currently, app submission is the only store operation performed synchronously 
> because the app must be stored before the request returns with success. This 
> makes the RM susceptible to blocking all client threads on slow store 
> operations, resulting in RM being perceived as unavailable by clients.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-499) On container failure, surface logs to client

2013-04-16 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-499:


Summary: On container failure, surface logs to client  (was: On container 
failure, include last n lines of logs in diagnostics)

> On container failure, surface logs to client
> 
>
> Key: YARN-499
> URL: https://issues.apache.org/jira/browse/YARN-499
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.0.3-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-499.patch
>
>
> When a container fails, the only way to diagnose it is to look at the logs.  
> ContainerStatuses include a diagnostic string that is reported back to the 
> resource manager by the node manager.
> Currently in MR2 I believe whatever is sent to the task's standard out is 
> added to the diagnostics string, but for MR standard out is redirected to a 
> file called stdout.  In MR1, this string was populated with the last few 
> lines of the task's stdout file, and got printed to the console, allowing for 
> easy debugging.
> Handling this would help to soothe the infuriating problem of an AM dying for 
> a mysterious reason before setting a tracking URL (MAPREDUCE-3688).
> This could be done in one of two ways.
> * Use tee to send MR's standard out to both the stdout file and standard out. 
>  This requires modifying ShellCmdExecutor to roll what it reads in, as we 
> wouldn't want to be storing the entire task log in NM memory.
> * Read the task's log files.  This would require standardizing or making the 
> container log files configurable.  Right now the log files are determined in 
> userland, and all that YARN is aware of is the log directory.
> Does this present any issues I'm not considering?  If so, this might only 
> be needed for AMs?
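As a rough illustration of the first option's "roll what it reads in" 
requirement, here is a minimal bounded tail buffer (hypothetical, not 
ShellCmdExecutor): only the last N lines of the child's output are kept, so 
the NM never holds the whole task log in memory.
{code}
import java.util.ArrayDeque;
import java.util.Deque;

public class TailBufferSketch {
  private final int maxLines;
  private final Deque<String> lines = new ArrayDeque<String>();

  TailBufferSketch(int maxLines) {
    this.maxLines = maxLines;
  }

  void addLine(String line) {
    if (lines.size() == maxLines) {
      lines.removeFirst(); // drop the oldest line to stay within the bound
    }
    lines.addLast(line);
  }

  String tail() {
    return String.join("\n", lines);
  }

  public static void main(String[] args) {
    TailBufferSketch tail = new TailBufferSketch(2);
    for (String l : new String[] {"line 1", "line 2", "line 3"}) {
      tail.addLine(l);
    }
    System.out.println(tail.tail()); // only "line 2" and "line 3" survive
  }
}
{code}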

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-562) NM should reject containers allocated by previous RM

2013-04-16 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-562:
-

Attachment: YARN-562.3.patch

change container_allocation_timestamp to cluster_timestamp

> NM should reject containers allocated by previous RM
> 
>
> Key: YARN-562
> URL: https://issues.apache.org/jira/browse/YARN-562
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-562.1.patch, YARN-562.2.patch, YARN-562.3.patch
>
>
> It's possible that after an RM shutdown, before the AM goes down, the AM may 
> still call startContainer on the NM with containers allocated by the previous 
> RM. When the RM comes back, the NM doesn't know whether a container launch 
> request comes from the previous RM or the current RM. We should reject 
> containers allocated by the previous RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-168) No way to turn off virtual memory limits without turning off physical memory limits

2013-04-16 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633110#comment-13633110
 ] 

Hitesh Shah commented on YARN-168:
--

As stated in YARN-470, this is fixed for 2.0.4-alpha. Refer to 
http://hadoop.markmail.org/thread/mfaywrkctaezvlrv. 

> No way to turn off virtual memory limits without turning off physical memory 
> limits
> ---
>
> Key: YARN-168
> URL: https://issues.apache.org/jira/browse/YARN-168
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0
>Reporter: Harsh J
>
> Asked and reported by a user (Krishna) on ML:
> {quote}
> This is possible to do, but you've hit a bug with the current YARN
> implementation. Ideally you should be able to configure the vmem-pmem
> ratio (or an equivalent config) to be -1, to indicate disabling of
> virtual memory checks completely (and there's indeed checks for this),
> but it seems like we are enforcing the ratio to be at least 1.0 (and
> hence negatives are disallowed).
> You can't workaround by setting the NM's offered resource.mb to -1
> either, as you'll lose out on controlling maximum allocations.
> Please file a YARN bug on JIRA. The code at fault lies under
> ContainersMonitorImpl#init(…).
> On Thu, Oct 18, 2012 at 4:00 PM, Krishna Kishore Bonagiri
>  wrote:
> > Hi,
> >
> >   Is there a way we can ask the YARN RM not to kill a container when it
> > uses more virtual memory than the maximum it can use as per the
> > specification in the configuration file yarn-site.xml? We can't always
> > estimate the amount of virtual memory needed for our application running on
> > a container, but we don't want it killed in case it exceeds the
> > maximum limit.
> >
> >   Please suggest how we can get around this issue.
> >
> > Thanks,
> > Kishore
> {quote}
> Basically, we're doing:
> {code}
> // / Virtual memory configuration //
> float vmemRatio = conf.getFloat(
>     YarnConfiguration.NM_VMEM_PMEM_RATIO,
>     YarnConfiguration.DEFAULT_NM_VMEM_PMEM_RATIO);
> Preconditions.checkArgument(vmemRatio > 0.99f,
>     YarnConfiguration.NM_VMEM_PMEM_RATIO +
>     " should be at least 1.0");
> this.maxVmemAllottedForContainers =
>     (long)(vmemRatio * maxPmemAllottedForContainers);
> {code}
> For virtual memory monitoring to be disabled, maxVmemAllottedForContainers 
> has to be -1. For that to be -1, given the above buggy computation, vmemRatio 
> must be -1 or maxPmemAllottedForContainers must be -1.
> If vmemRatio were -1, we fail the precondition check and exit.
> If maxPmemAllottedForContainers, we also end up disabling physical memory 
> monitoring.
> Or perhaps that makes sense - to disable both physical and virtual memory 
> monitoring, but that way your NM becomes infinite in resource grants, I think.
> We need a way to selectively disable kills done via virtual memory 
> monitoring, which is the base request here.
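A toy sketch of the selective-disable idea requested here, assuming a separate 
boolean switch for the virtual memory check (this is not necessarily how 
YARN-470 implemented it): with such a flag, neither the ratio nor the pmem 
limit has to be overloaded with a -1 sentinel.
{code}
public class VmemCheckSketch {
  static long maxVmemForContainers(boolean vmemCheckEnabled, float vmemRatio,
      long maxPmemForContainers) {
    if (!vmemCheckEnabled) {
      return -1; // convention in the description above: -1 means monitoring disabled
    }
    if (vmemRatio < 1.0f) {
      throw new IllegalArgumentException("vmem-pmem ratio should be at least 1.0");
    }
    return (long) (vmemRatio * maxPmemForContainers);
  }

  public static void main(String[] args) {
    long eightGb = 8L << 30;
    System.out.println(maxVmemForContainers(true, 2.1f, eightGb));  // ratio applied
    System.out.println(maxVmemForContainers(false, 2.1f, eightGb)); // -1, disabled
  }
}
{code}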

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-560) If AM fails due to overrunning resource limits, error not visible through UI sometimes

2013-04-16 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated YARN-560:
--

Attachment: YARN-560.patch

Attaching a patch for trunk and branch-2. Doesn't apply cleanly to branch-0.23

> If AM fails due to overrunning resource limits, error not visible through UI 
> sometimes
> --
>
> Key: YARN-560
> URL: https://issues.apache.org/jira/browse/YARN-560
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 0.23.3, 2.0.0-alpha
>Reporter: Todd Lipcon
>Assignee: Ravi Prakash
>Priority: Minor
>  Labels: usability
> Attachments: MAPREDUCE-3949.patch, YARN-560.patch
>
>
> I had a case where an MR AM eclipsed the configured memory limit. This caused 
> the AM's container to get killed, but nowhere accessible through the web UI 
> showed these diagnostics. I had to go view the NM's logs via ssh before I 
> could figure out what had happened to my application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-562) NM should reject containers allocated by previous RM

2013-04-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633165#comment-13633165
 ] 

Hadoop QA commented on YARN-562:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12578973/YARN-562.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/748//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/748//console

This message is automatically generated.

> NM should reject containers allocated by previous RM
> 
>
> Key: YARN-562
> URL: https://issues.apache.org/jira/browse/YARN-562
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-562.1.patch, YARN-562.2.patch, YARN-562.3.patch
>
>
> It's possible that after an RM shutdown, before the AM goes down, the AM may 
> still call startContainer on the NM with containers allocated by the previous 
> RM. When the RM comes back, the NM doesn't know whether a container launch 
> request comes from the previous RM or the current RM. We should reject 
> containers allocated by the previous RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-560) If AM fails due to overrunning resource limits, error not visible through UI sometimes

2013-04-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633174#comment-13633174
 ] 

Hadoop QA commented on YARN-560:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12578979/YARN-560.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/749//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/749//console

This message is automatically generated.

> If AM fails due to overrunning resource limits, error not visible through UI 
> sometimes
> --
>
> Key: YARN-560
> URL: https://issues.apache.org/jira/browse/YARN-560
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 0.23.3, 2.0.0-alpha
>Reporter: Todd Lipcon
>Assignee: Ravi Prakash
>Priority: Minor
>  Labels: usability
> Attachments: MAPREDUCE-3949.patch, YARN-560.patch
>
>
> I had a case where an MR AM eclipsed the configured memory limit. This caused 
> the AM's container to get killed, but nowhere accessible through the web UI 
> showed these diagnostics. I had to go view the NM's logs via ssh before I 
> could figure out what had happened to my application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-560) If AM fails due to overrunning resource limits, error not visible through UI sometimes

2013-04-16 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633183#comment-13633183
 ] 

Sandy Ryza commented on YARN-560:
-

Ravi,

The patch looks good to me.  Nit: can you use block comments ( /** ... */ ) 
instead of single line comments over the tests?

> If AM fails due to overrunning resource limits, error not visible through UI 
> sometimes
> --
>
> Key: YARN-560
> URL: https://issues.apache.org/jira/browse/YARN-560
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 0.23.3, 2.0.0-alpha
>Reporter: Todd Lipcon
>Assignee: Ravi Prakash
>Priority: Minor
>  Labels: usability
> Attachments: MAPREDUCE-3949.patch, YARN-560.patch
>
>
> I had a case where an MR AM eclipsed the configured memory limit. This caused 
> the AM's container to get killed, but nowhere accessible through the web UI 
> showed these diagnostics. I had to go view the NM's logs via ssh before I 
> could figure out what had happened to my application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-562) NM should reject containers allocated by previous RM

2013-04-16 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633194#comment-13633194
 ] 

Karthik Kambatla commented on YARN-562:
---

Hey Jian, cluster_timestamp makes me think of vector clocks and the like 
(distributed synchronized clocks); container_allocation_timestamp makes more 
sense. Any specific reason to change it?

> NM should reject containers allocated by previous RM
> 
>
> Key: YARN-562
> URL: https://issues.apache.org/jira/browse/YARN-562
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-562.1.patch, YARN-562.2.patch, YARN-562.3.patch
>
>
> It's possible that after an RM shutdown, before the AM goes down, the AM may 
> still call startContainer on the NM with containers allocated by the previous 
> RM. When the RM comes back, the NM doesn't know whether a container launch 
> request comes from the previous RM or the current RM. We should reject 
> containers allocated by the previous RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-560) If AM fails due to overrunning resource limits, error not visible through UI sometimes

2013-04-16 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated YARN-560:
--

Attachment: YARN-560.patch

Thanks for your review, Sandy. This patch uses block comments.

> If AM fails due to overrunning resource limits, error not visible through UI 
> sometimes
> --
>
> Key: YARN-560
> URL: https://issues.apache.org/jira/browse/YARN-560
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 0.23.3, 2.0.0-alpha
>Reporter: Todd Lipcon
>Assignee: Ravi Prakash
>Priority: Minor
>  Labels: usability
> Attachments: MAPREDUCE-3949.patch, YARN-560.patch, YARN-560.patch
>
>
> I had a case where an MR AM eclipsed the configured memory limit. This caused 
> the AM's container to get killed, but nowhere accessible through the web UI 
> showed these diagnostics. I had to go view the NM's logs via ssh before I 
> could figure out what had happened to my application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-562) NM should reject containers allocated by previous RM

2013-04-16 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633215#comment-13633215
 ] 

Jian He commented on YARN-562:
--

cluster_timestamp here is merely the time recorded when the RM starts, not a 
vector clock. All containers launched by the same RM instance would then carry 
the same timestamp. I think both approaches would work; having the same stamp 
for all of them looks clearer.
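A toy sketch of the check being discussed, with hypothetical names (not the 
actual NM code): each container carries the start timestamp of the RM that 
allocated it, and the NM rejects a launch whose timestamp does not match the 
RM it is currently registered with.
{code}
public class RmTimestampCheckSketch {
  static boolean shouldReject(long containerRmTimestamp, long currentRmStartTimestamp) {
    // A mismatch means the container was allocated by a previous RM instance.
    return containerRmTimestamp != currentRmStartTimestamp;
  }

  public static void main(String[] args) {
    long previousRm = 1366000000000L; // hypothetical start time of the old RM
    long currentRm  = 1366000500000L; // hypothetical start time after restart
    System.out.println(shouldReject(previousRm, currentRm)); // true  -> reject the launch
    System.out.println(shouldReject(currentRm, currentRm));  // false -> accept the launch
  }
}
{code}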

> NM should reject containers allocated by previous RM
> 
>
> Key: YARN-562
> URL: https://issues.apache.org/jira/browse/YARN-562
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-562.1.patch, YARN-562.2.patch, YARN-562.3.patch
>
>
> It's possible that after an RM shutdown, before the AM goes down, the AM may 
> still call startContainer on the NM with containers allocated by the previous 
> RM. When the RM comes back, the NM doesn't know whether a container launch 
> request comes from the previous RM or the current RM. We should reject 
> containers allocated by the previous RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-562) NM should reject containers allocated by previous RM

2013-04-16 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633219#comment-13633219
 ] 

Vinod Kumar Vavilapalli commented on YARN-562:
--

bq. Hey Jian, cluster_timestamp makes me think of vector clocks and such 
(distributed synch clock), container_allocation_timestamp makes more sense. Any 
specific reason to change it?
Haven't looked at the patch yet, but per my suggestion (in the first comment 
on this ticket), it shouldn't be container_allocation_timestamp, as that would 
imply the timestamp is per container. It is the RM's startTimeStamp; maybe we 
can call it that.

> NM should reject containers allocated by previous RM
> 
>
> Key: YARN-562
> URL: https://issues.apache.org/jira/browse/YARN-562
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-562.1.patch, YARN-562.2.patch, YARN-562.3.patch
>
>
> It's possible that after an RM shutdown, before the AM goes down, the AM may 
> still call startContainer on the NM with containers allocated by the previous 
> RM. When the RM comes back, the NM doesn't know whether a container launch 
> request comes from the previous RM or the current RM. We should reject 
> containers allocated by the previous RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-560) If AM fails due to overrunning resource limits, error not visible through UI sometimes

2013-04-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633230#comment-13633230
 ] 

Hadoop QA commented on YARN-560:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12578990/YARN-560.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/750//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/750//console

This message is automatically generated.

> If AM fails due to overrunning resource limits, error not visible through UI 
> sometimes
> --
>
> Key: YARN-560
> URL: https://issues.apache.org/jira/browse/YARN-560
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 0.23.3, 2.0.0-alpha
>Reporter: Todd Lipcon
>Assignee: Ravi Prakash
>Priority: Minor
>  Labels: usability
> Attachments: MAPREDUCE-3949.patch, YARN-560.patch, YARN-560.patch
>
>
> I had a case where an MR AM eclipsed the configured memory limit. This caused 
> the AM's container to get killed, but nowhere accessible through the web UI 
> showed these diagnostics. I had to go view the NM's logs via ssh before I 
> could figure out what had happened to my application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-562) NM should reject containers allocated by previous RM

2013-04-16 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633256#comment-13633256
 ] 

Karthik Kambatla commented on YARN-562:
---

rm_start_timestamp makes absolute sense.

> NM should reject containers allocated by previous RM
> 
>
> Key: YARN-562
> URL: https://issues.apache.org/jira/browse/YARN-562
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-562.1.patch, YARN-562.2.patch, YARN-562.3.patch
>
>
> It's possible that after an RM shutdown, before the AM goes down, the AM may 
> still call startContainer on the NM with containers allocated by the previous 
> RM. When the RM comes back, the NM doesn't know whether a container launch 
> request comes from the previous RM or the current RM. We should reject 
> containers allocated by the previous RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-583) Application cache files should be localized under local-dir/usercache/userid/appcache/appid/filecache

2013-04-16 Thread Omkar Vinit Joshi (JIRA)
Omkar Vinit Joshi created YARN-583:
--

 Summary: Application cache files should be localized under 
local-dir/usercache/userid/appcache/appid/filecache
 Key: YARN-583
 URL: https://issues.apache.org/jira/browse/YARN-583
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Omkar Vinit Joshi


Currently, application cache files are getting localized under 
local-dir/usercache/userid/appcache/appid/. However, they should be localized 
under the filecache subdirectory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (YARN-583) Application cache files should be localized under local-dir/usercache/userid/appcache/appid/filecache

2013-04-16 Thread Omkar Vinit Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omkar Vinit Joshi reassigned YARN-583:
--

Assignee: Omkar Vinit Joshi

> Application cache files should be localized under 
> local-dir/usercache/userid/appcache/appid/filecache
> -
>
> Key: YARN-583
> URL: https://issues.apache.org/jira/browse/YARN-583
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
>
> Currently, application cache files are getting localized under 
> local-dir/usercache/userid/appcache/appid/. However, they should be localized 
> under the filecache subdirectory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-583) Application cache files should be localized under local-dir/usercache/userid/appcache/appid/filecache

2013-04-16 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633340#comment-13633340
 ] 

Omkar Vinit Joshi commented on YARN-583:


Making sure application-specific cache files are localized under the filecache 
subdirectory.

Earlier one:
<local-dir>/usercache/<user>/appcache/<app-id>/

Updated one:
<local-dir>/usercache/<user>/appcache/<app-id>/filecache

> Application cache files should be localized under 
> local-dir/usercache/userid/appcache/appid/filecache
> -
>
> Key: YARN-583
> URL: https://issues.apache.org/jira/browse/YARN-583
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
>
> Currently, application cache files are getting localized under 
> local-dir/usercache/userid/appcache/appid/. However, they should be localized 
> under the filecache subdirectory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-583) Application cache files should be localized under local-dir/usercache/userid/appcache/appid/filecache

2013-04-16 Thread Omkar Vinit Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omkar Vinit Joshi updated YARN-583:
---

Attachment: yarn-583-20130416.patch

No tests added to cover this.

> Application cache files should be localized under 
> local-dir/usercache/userid/appcache/appid/filecache
> -
>
> Key: YARN-583
> URL: https://issues.apache.org/jira/browse/YARN-583
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
>     Attachments: yarn-583-20130416.patch
>
>
> Currently, application cache files are getting localized under 
> local-dir/usercache/userid/appcache/appid/. However, they should be localized 
> under the filecache subdirectory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-583) Application cache files should be localized under local-dir/usercache/userid/appcache/appid/filecache

2013-04-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633373#comment-13633373
 ] 

Hadoop QA commented on YARN-583:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12579013/yarn-583-20130416.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/751//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/751//console

This message is automatically generated.

> Application cache files should be localized under 
> local-dir/usercache/userid/appcache/appid/filecache
> -
>
> Key: YARN-583
> URL: https://issues.apache.org/jira/browse/YARN-583
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
> Attachments: yarn-583-20130416.patch
>
>
> Currently, application cache files are getting localized under 
> local-dir/usercache/userid/appcache/appid/. However, they should be localized 
> under the filecache subdirectory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-583) Application cache files should be localized under local-dir/usercache/userid/appcache/appid/filecache

2013-04-16 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633494#comment-13633494
 ] 

Chris Nauroth commented on YARN-583:


Hi, Omkar.  Nice to meet you!

{code}
 String path =
 "." + Path.SEPARATOR + ContainerLocalizer.USERCACHE + Path.SEPARATOR
 + user + Path.SEPARATOR + ContainerLocalizer.APPCACHE
-+ Path.SEPARATOR + appId;
++ Path.SEPARATOR + appId + ContainerLocalizer.FILECACHE;
{code}

It looks like you'll need to add another path separator after the appId.  Also, 
this might read better by using {{StringUtils#join}}, so that you won't need 
the repeated references to {{Path.SEPARATOR}}.

> Application cache files should be localized under 
> local-dir/usercache/userid/appcache/appid/filecache
> -
>
> Key: YARN-583
> URL: https://issues.apache.org/jira/browse/YARN-583
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
> Attachments: yarn-583-20130416.patch
>
>
> Currently, application cache files are getting localized under 
> local-dir/usercache/userid/appcache/appid/. However, they should be localized 
> under the filecache subdirectory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-126) yarn rmadmin help message contains reference to hadoop cli and JT

2013-04-16 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/YARN-126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rémy SAISSY updated YARN-126:
-

Attachment: YARN-126.patch

Patch with a GNU Readline-based implementation of RMAdmin.java.

> yarn rmadmin help message contains reference to hadoop cli and JT
> -
>
> Key: YARN-126
> URL: https://issues.apache.org/jira/browse/YARN-126
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>  Labels: usability
> Attachments: YARN-126.patch
>
>
> has option to specify a job tracker and the last line for general command 
> line syntax had "bin/hadoop command [genericOptions] [commandOptions]"
> ran "yarn rmadmin" to get usage:
> RMAdmin
> Usage: java RMAdmin
>[-refreshQueues]
>[-refreshNodes]
>[-refreshUserToGroupsMappings]
>[-refreshSuperUserGroupsConfiguration]
>[-refreshAdminAcls]
>[-refreshServiceAcl]
>[-help [cmd]]
> Generic options supported are
> -conf <configuration file>     specify an application configuration file
> -D <property=value>            use value for given property
> -fs <local|namenode:port>      specify a namenode
> -jt <local|jobtracker:port>    specify a job tracker
> -files <comma separated list of files>    specify comma separated files to be 
> copied to the map reduce cluster
> -libjars <comma separated list of jars>    specify comma separated jar files 
> to include in the classpath.
> -archives <comma separated list of archives>    specify comma separated 
> archives to be unarchived on the compute machines.
> The general command line syntax is
> bin/hadoop command [genericOptions] [commandOptions]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-126) yarn rmadmin help message contains reference to hadoop cli and JT

2013-04-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633527#comment-13633527
 ] 

Hadoop QA commented on YARN-126:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12579048/YARN-126.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/752//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/752//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-client.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/752//console

This message is automatically generated.

> yarn rmadmin help message contains reference to hadoop cli and JT
> -
>
> Key: YARN-126
> URL: https://issues.apache.org/jira/browse/YARN-126
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>  Labels: usability
> Attachments: YARN-126.patch
>
>
> has option to specify a job tracker and the last line for general command 
> line syntax had "bin/hadoop command [genericOptions] [commandOptions]"
> ran "yarn rmadmin" to get usage:
> RMAdmin
> Usage: java RMAdmin
>[-refreshQueues]
>[-refreshNodes]
>[-refreshUserToGroupsMappings]
>[-refreshSuperUserGroupsConfiguration]
>[-refreshAdminAcls]
>[-refreshServiceAcl]
>[-help [cmd]]
> Generic options supported are
> -conf <configuration file>     specify an application configuration file
> -D <property=value>            use value for given property
> -fs <local|namenode:port>      specify a namenode
> -jt <local|jobtracker:port>    specify a job tracker
> -files <comma separated list of files>    specify comma separated files to be 
> copied to the map reduce cluster
> -libjars <comma separated list of jars>    specify comma separated jar files 
> to include in the classpath.
> -archives <comma separated list of archives>    specify comma separated 
> archives to be unarchived on the compute machines.
> The general command line syntax is
> bin/hadoop command [genericOptions] [commandOptions]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-583) Application cache files should be localized under local-dir/usercache/userid/appcache/appid/filecache

2013-04-16 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633534#comment-13633534
 ] 

Omkar Vinit Joshi commented on YARN-583:


[~cnauroth] Yeah, thanks for figuring out the problem. StringUtils.join is 
really a helpful method. I am updating getUserAppCachePath and 
getUserFileCachePath to use it, which makes the path generation very clear.
getUserAppCachePath :- ./usercache/testuser/appcache/appid/filecache
getUserFileCachePath :- ./usercache/testuser/filecache
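A self-contained sketch of the joined-path construction, with a local join 
helper standing in for StringUtils#join and a plain "/" standing in for 
Path.SEPARATOR (hypothetical names, not the attached patch):
{code}
public class AppCachePathSketch {
  private static final String SEP = "/"; // stands in for Path.SEPARATOR

  static String join(String... parts) {
    StringBuilder sb = new StringBuilder();
    for (String p : parts) {
      if (sb.length() > 0) {
        sb.append(SEP);
      }
      sb.append(p);
    }
    return sb.toString();
  }

  static String getUserAppCachePath(String user, String appId) {
    // ./usercache/<user>/appcache/<appId>/filecache
    return join(".", "usercache", user, "appcache", appId, "filecache");
  }

  static String getUserFileCachePath(String user) {
    // ./usercache/<user>/filecache
    return join(".", "usercache", user, "filecache");
  }

  public static void main(String[] args) {
    System.out.println(getUserAppCachePath("testuser", "appid"));
    System.out.println(getUserFileCachePath("testuser"));
  }
}
{code}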

> Application cache files should be localized under 
> local-dir/usercache/userid/appcache/appid/filecache
> -
>
> Key: YARN-583
> URL: https://issues.apache.org/jira/browse/YARN-583
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
>     Attachments: yarn-583-20130416.patch
>
>
> Currently, application cache files are getting localized under 
> local-dir/usercache/userid/appcache/appid/. However, they should be localized 
> under the filecache subdirectory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-583) Application cache files should be localized under local-dir/usercache/userid/appcache/appid/filecache

2013-04-16 Thread Omkar Vinit Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omkar Vinit Joshi updated YARN-583:
---

Attachment: yarn-583-20130416.1.patch

> Application cache files should be localized under 
> local-dir/usercache/userid/appcache/appid/filecache
> -
>
> Key: YARN-583
> URL: https://issues.apache.org/jira/browse/YARN-583
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
> Attachments: yarn-583-20130416.1.patch, yarn-583-20130416.patch
>
>
> Currently, application cache files are getting localized under 
> local-dir/usercache/userid/appcache/appid/. However, they should be localized 
> under the filecache subdirectory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-583) Application cache files should be localized under local-dir/usercache/userid/appcache/appid/filecache

2013-04-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633559#comment-13633559
 ] 

Hadoop QA commented on YARN-583:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12579055/yarn-583-20130416.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/753//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/753//console

This message is automatically generated.

> Application cache files should be localized under 
> local-dir/usercache/userid/appcache/appid/filecache
> -
>
> Key: YARN-583
> URL: https://issues.apache.org/jira/browse/YARN-583
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
> Attachments: yarn-583-20130416.1.patch, yarn-583-20130416.patch
>
>
> Currently, application cache files are getting localized under 
> local-dir/usercache/userid/appcache/appid/. However, they should be localized 
> under the filecache subdirectory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-534) AM max attempts is not checked when RM restart and try to recover attempts

2013-04-16 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-534:
-

Attachment: YARN-534.5.patch

Addressed the last comments.
DEFAULT_RM_AM_MAX_ATTEMPTS is already overridden to 5 at the very beginning.

> AM max attempts is not checked when RM restart and try to recover attempts
> --
>
> Key: YARN-534
> URL: https://issues.apache.org/jira/browse/YARN-534
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jian He
>Assignee: Jian He
> Fix For: 2.0.5-beta
>
> Attachments: YARN-534.1.patch, YARN-534.2.patch, YARN-534.3.patch, 
> YARN-534.4.patch, YARN-534.5.patch
>
>
> Currently, AM max attempts is only checked when the current attempt fails, to 
> decide whether to create a new attempt. If the RM restarts before the 
> max attempt fails, it will not clean the state store; when the RM comes back, 
> it will retry the attempt again.
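A toy sketch of the recovery-time check this issue asks for (hypothetical, not 
the attached patch): an app whose recovered attempts already reached the 
configured maximum should not get yet another attempt after an RM restart.
{code}
public class MaxAttemptRecoverySketch {
  static boolean mayStartNewAttempt(int recoveredAttempts, int maxAttempts) {
    return recoveredAttempts < maxAttempts;
  }

  public static void main(String[] args) {
    int maxAttempts = 2; // e.g. yarn.resourcemanager.am.max-attempts
    System.out.println(mayStartNewAttempt(1, maxAttempts)); // true:  retry allowed
    System.out.println(mayStartNewAttempt(2, maxAttempts)); // false: mark the app failed
  }
}
{code}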

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-534) AM max attempts is not checked when RM restart and try to recover attempts

2013-04-16 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633574#comment-13633574
 ] 

Jian He commented on YARN-534:
--

The new patch adds comments for the work-preserving restart case and puts two 
more assertions in the test case.

> AM max attempts is not checked when RM restart and try to recover attempts
> --
>
> Key: YARN-534
> URL: https://issues.apache.org/jira/browse/YARN-534
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jian He
>Assignee: Jian He
> Fix For: 2.0.5-beta
>
> Attachments: YARN-534.1.patch, YARN-534.2.patch, YARN-534.3.patch, 
> YARN-534.4.patch, YARN-534.5.patch
>
>
> Currently, AM max attempts is only checked when the current attempt fails, to 
> decide whether to create a new attempt. If the RM restarts before the 
> max attempt fails, it will not clean the state store; when the RM comes back, 
> it will retry the attempt again.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-584) In fair scheduler web UI, queues unexpand on refresh

2013-04-16 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-584:
---

 Summary: In fair scheduler web UI, queues unexpand on refresh
 Key: YARN-584
 URL: https://issues.apache.org/jira/browse/YARN-584
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza


In the fair scheduler web UI, you can expand queue information.  Refreshing the 
page causes the expansions to go away, which is annoying for someone who wants 
to monitor the scheduler page and needs to reopen all the queues they care 
about each time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-576) RM should not allow registrations from NMs that do not satisfy minimum scheduler allocations

2013-04-16 Thread Kenji Kikushima (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenji Kikushima updated YARN-576:
-

Attachment: YARN-576.patch

Added a registration check for yarn.scheduler.minimum-allocation-mb and 
yarn.scheduler.minimum-allocation-vcores.
If the NM's registration request doesn't satisfy the minimum allocations, the 
RM returns NodeAction.SHUTDOWN.
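A toy sketch of the check described above, with hypothetical names (not the 
attached patch): a node whose reported capability falls below either minimum 
allocation is answered with NodeAction.SHUTDOWN at registration time.
{code}
public class MinAllocationCheckSketch {
  enum NodeAction { NORMAL, SHUTDOWN }

  static NodeAction register(int nodeMemMb, int nodeVcores, int minMemMb, int minVcores) {
    if (nodeMemMb < minMemMb || nodeVcores < minVcores) {
      return NodeAction.SHUTDOWN; // node can never satisfy even one minimum container
    }
    return NodeAction.NORMAL;
  }

  public static void main(String[] args) {
    // e.g. yarn.scheduler.minimum-allocation-mb=1024, minimum-allocation-vcores=1
    System.out.println(register(512, 4, 1024, 1));  // SHUTDOWN
    System.out.println(register(4096, 4, 1024, 1)); // NORMAL
  }
}
{code}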

> RM should not allow registrations from NMs that do not satisfy minimum 
> scheduler allocations
> 
>
> Key: YARN-576
> URL: https://issues.apache.org/jira/browse/YARN-576
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Hitesh Shah
>  Labels: newbie
> Attachments: YARN-576.patch
>
>
> If the minimum resource allocation configured for the RM scheduler is 1 GB, 
> the RM should drop all NMs that register with a total capacity of less than 1 
> GB. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-457) Setting updated nodes from null to null causes NPE in AllocateResponsePBImpl

2013-04-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633699#comment-13633699
 ] 

Hadoop QA commented on YARN-457:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12578398/YARN-457-4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/755//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/755//console

This message is automatically generated.

> Setting updated nodes from null to null causes NPE in AllocateResponsePBImpl
> 
>
> Key: YARN-457
> URL: https://issues.apache.org/jira/browse/YARN-457
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api
>Affects Versions: 2.0.3-alpha
>Reporter: Sandy Ryza
>Assignee: Kenji Kikushima
>Priority: Minor
>  Labels: Newbie
> Attachments: YARN-457-2.patch, YARN-457-3.patch, YARN-457-4.patch, 
> YARN-457.patch
>
>
> {code}
> if (updatedNodes == null) {
>   this.updatedNodes.clear();
>   return;
> }
> {code}
> If the internal updatedNodes list is still null, the call to clear() throws a 
> NullPointerException.
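
A minimal null-safe version of the setter, to illustrate the kind of guard the 
attached patches add (simplified; this uses a made-up holder class, not the 
actual AllocateResponsePBImpl code):

{code}
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.yarn.api.records.NodeReport;

// Illustrative only: guard against clear() being called on a still-null field.
final class UpdatedNodesHolder {
  private List<NodeReport> updatedNodes;

  public synchronized void setUpdatedNodes(final List<NodeReport> updatedNodes) {
    if (updatedNodes == null) {
      if (this.updatedNodes != null) {  // the reported NPE came from clearing a null field
        this.updatedNodes.clear();
      }
      return;
    }
    if (this.updatedNodes == null) {
      this.updatedNodes = new ArrayList<NodeReport>();
    }
    this.updatedNodes.clear();
    this.updatedNodes.addAll(updatedNodes);
  }
}
{code}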

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-500) ResourceManager webapp is using next port if configured port is already in use

2013-04-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633700#comment-13633700
 ] 

Hadoop QA commented on YARN-500:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12577743/YARN-500-2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/754//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/754//console

This message is automatically generated.

> ResourceManager webapp is using next port if configured port is already in use
> --
>
> Key: YARN-500
> URL: https://issues.apache.org/jira/browse/YARN-500
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.0.2-alpha, 2.0.1-alpha
>Reporter: Nishan Shetty
>Assignee: Kenji Kikushima
> Attachments: YARN-500-2.patch, YARN-500.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-561) Nodemanager should set some key information into the environment of every container that it launches.

2013-04-16 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-561:
---

Attachment: YARN-561.2.patch

Remove the duplicate in ApplicationConstants

> Nodemanager should set some key information into the environment of every 
> container that it launches.
> -
>
> Key: YARN-561
> URL: https://issues.apache.org/jira/browse/YARN-561
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Hitesh Shah
>Assignee: Xuan Gong
>  Labels: usability
> Attachments: YARN-561.1.patch, YARN-561.2.patch
>
>
> Information such as containerId, nodemanager hostname, nodemanager port is 
> not set in the environment when any container is launched. 
> For an AM, the RM does all of this for it but for a container launched by an 
> application, all of the above need to be set by the ApplicationMaster. 
> At the minimum, container id would be a useful piece of information. If the 
> container wishes to talk to its local NM, the nodemanager related information 
> would also come in handy. 
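
As a minimal sketch of the environment injection being discussed (the variable 
names below are illustrative, not necessarily the constants the patch defines in 
ApplicationConstants):

{code}
import java.util.HashMap;
import java.util.Map;

// Illustrative only: the extra entries the NM would merge into a container's
// launch environment before starting it.
final class ContainerEnvSketch {
  static Map<String, String> extraEnvironment(String containerId, String nmHost,
                                              int nmPort, int nmHttpPort) {
    Map<String, String> env = new HashMap<String, String>();
    env.put("CONTAINER_ID", containerId);             // hypothetical variable names,
    env.put("NM_HOST", nmHost);                       // not the committed constants
    env.put("NM_PORT", String.valueOf(nmPort));
    env.put("NM_HTTP_PORT", String.valueOf(nmHttpPort));
    return env;
  }
}
{code}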

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-493) NodeManager job control logic flaws on Windows

2013-04-16 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated YARN-493:
---

Attachment: YARN-493.4.patch

Attaching rebased patch and resubmitting to Jenkins.

> NodeManager job control logic flaws on Windows
> --
>
> Key: YARN-493
> URL: https://issues.apache.org/jira/browse/YARN-493
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Fix For: 3.0.0
>
> Attachments: YARN-493.1.patch, YARN-493.2.patch, YARN-493.3.patch, 
> YARN-493.4.patch
>
>
> Both product and test code contain some platform-specific assumptions, such 
> as availability of bash for executing a command in a container and signals to 
> check existence of a process and terminate it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-534) AM max attempts is not checked when RM restart and try to recover attempts

2013-04-16 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633739#comment-13633739
 ] 

Vinod Kumar Vavilapalli commented on YARN-534:
--

Jian/Bikas, can we open a new ticket for this instead of continuing on a 
resolved issue? Tx.

> AM max attempts is not checked when RM restart and try to recover attempts
> --
>
> Key: YARN-534
> URL: https://issues.apache.org/jira/browse/YARN-534
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jian He
>Assignee: Jian He
> Fix For: 2.0.5-beta
>
> Attachments: YARN-534.1.patch, YARN-534.2.patch, YARN-534.3.patch, 
> YARN-534.4.patch, YARN-534.5.patch
>
>
> Currently,AM max attempts is only checked if the current attempt fails and 
> check to see whether to create new attempt. If the RM restarts before the 
> max-attempt fails, it'll not clean the state store, when RM comes back, it 
> will retry attempt again.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-571) User should not be part of ContainerLaunchContext

2013-04-16 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633743#comment-13633743
 ] 

Chris Nauroth commented on YARN-571:


This patch conflicts with the patch pending on YARN-493, so whoever loses the 
commit race will need to rebase.  :-)


> User should not be part of ContainerLaunchContext
> -
>
> Key: YARN-571
> URL: https://issues.apache.org/jira/browse/YARN-571
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Hitesh Shah
>Assignee: Vinod Kumar Vavilapalli
> Attachments: YARN-571-20130415.2.txt
>
>
> Today, a user is expected to set the user name in the CLC when either 
> submitting an application or launching a container from the AM. This does not 
> make sense as the user can/has been identified by the RM as part of the RPC 
> layer.
> Solution would be to move the user information into either the Container 
> object or directly into the ContainerToken which can then be used by the NM 
> to launch the container. This user information would be set into the container 
> by the RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-500) ResourceManager webapp is using next port if configured port is already in use

2013-04-16 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633745#comment-13633745
 ] 

Vinod Kumar Vavilapalli commented on YARN-500:
--

This looks good, checking it in..

> ResourceManager webapp is using next port if configured port is already in use
> --
>
> Key: YARN-500
> URL: https://issues.apache.org/jira/browse/YARN-500
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.0.2-alpha, 2.0.1-alpha
>Reporter: Nishan Shetty
>Assignee: Kenji Kikushima
> Attachments: YARN-500-2.patch, YARN-500.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-561) Nodemanager should set some key information into the environment of every container that it launches.

2013-04-16 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-561:
---

Attachment: YARN-561.3.patch

Fix the test case failures

> Nodemanager should set some key information into the environment of every 
> container that it launches.
> -
>
> Key: YARN-561
> URL: https://issues.apache.org/jira/browse/YARN-561
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Hitesh Shah
>Assignee: Xuan Gong
>  Labels: usability
> Attachments: YARN-561.1.patch, YARN-561.2.patch, YARN-561.3.patch
>
>
> Information such as containerId, nodemanager hostname, nodemanager port is 
> not set in the environment when any container is launched. 
> For an AM, the RM does all of this for it but for a container launched by an 
> application, all of the above need to be set by the ApplicationMaster. 
> At the minimum, container id would be a useful piece of information. If the 
> container wishes to talk to its local NM, the nodemanager related information 
> would also come in handy. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-500) ResourceManager webapp is using next port if configured port is already in use

2013-04-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633760#comment-13633760
 ] 

Hudson commented on YARN-500:
-

Integrated in Hadoop-trunk-Commit #3620 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3620/])
YARN-500. Fixed YARN webapps to not roll-over ports when explicitly asked 
to use non-ephemeral ports. Contributed by Kenji Kikushima. (Revision 1468739)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1468739
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/WebApps.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/webapp/TestWebApp.java


> ResourceManager webapp is using next port if configured port is already in use
> --
>
> Key: YARN-500
> URL: https://issues.apache.org/jira/browse/YARN-500
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.0.2-alpha, 2.0.1-alpha
>Reporter: Nishan Shetty
>Assignee: Kenji Kikushima
> Fix For: 2.0.5-beta
>
> Attachments: YARN-500-2.patch, YARN-500.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-441) Clean up unused collection methods in various APIs

2013-04-16 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-441:
---

Attachment: YARN-441.6.patch

> Clean up unused collection methods in various APIs
> --
>
> Key: YARN-441
> URL: https://issues.apache.org/jira/browse/YARN-441
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Xuan Gong
> Attachments: YARN-441.1.patch, YARN-441.2.patch, YARN-441.3.patch, 
> YARN-441.4.patch, YARN-441.5.patch, YARN-441.6.patch
>
>
> There's a bunch of unused methods, like getAskCount() and getAsk(index), in 
> AllocateRequest and other interfaces. These should be removed.
> In YARN, they were found in the classes below; MR will have its own set.
> AllocateRequest
> StartContainerResponse

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-493) NodeManager job control logic flaws on Windows

2013-04-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633787#comment-13633787
 ] 

Hadoop QA commented on YARN-493:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12579087/YARN-493.4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/756//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/756//console

This message is automatically generated.

> NodeManager job control logic flaws on Windows
> --
>
> Key: YARN-493
> URL: https://issues.apache.org/jira/browse/YARN-493
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Fix For: 3.0.0
>
> Attachments: YARN-493.1.patch, YARN-493.2.patch, YARN-493.3.patch, 
> YARN-493.4.patch
>
>
> Both product and test code contain some platform-specific assumptions, such 
> as availability of bash for executing a command in a container and signals to 
> check existence of a process and terminate it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-441) Clean up unused collection methods in various APIs

2013-04-16 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633788#comment-13633788
 ] 

Xuan Gong commented on YARN-441:


1. Replaced all occurrences of get*().put() and get*().clear() with setters (see 
the sketch below).
2. Added the synchronized keyword back to StartContainerResponsePBImpl.
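
For illustration, the kind of rewrite item 1 refers to; the response type and 
accessor names below are made up, not the real API:

{code}
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a made-up response type standing in for the PBImpl records.
interface ServiceDataResponse {
  Map<String, ByteBuffer> getServiceData();
  void setServiceData(Map<String, ByteBuffer> serviceData);
}

final class SetterPatternSketch {
  // Before: response.getServiceData().put(serviceName, data);  // mutates the getter's collection
  // After: build the map locally and hand it to the setter.
  static void addServiceData(ServiceDataResponse response, String serviceName, ByteBuffer data) {
    Map<String, ByteBuffer> serviceData =
        new HashMap<String, ByteBuffer>(response.getServiceData());
    serviceData.put(serviceName, data);
    response.setServiceData(serviceData);
  }
}
{code}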


> Clean up unused collection methods in various APIs
> --
>
> Key: YARN-441
> URL: https://issues.apache.org/jira/browse/YARN-441
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Xuan Gong
> Attachments: YARN-441.1.patch, YARN-441.2.patch, YARN-441.3.patch, 
> YARN-441.4.patch, YARN-441.5.patch, YARN-441.6.patch
>
>
> There's a bunch of unused methods, like getAskCount() and getAsk(index), in 
> AllocateRequest and other interfaces. These should be removed.
> In YARN, they were found in the classes below; MR will have its own set.
> AllocateRequest
> StartContainerResponse

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-561) Nodemanager should set some key information into the environment of every container that it launches.

2013-04-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633789#comment-13633789
 ] 

Hadoop QA commented on YARN-561:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12579088/YARN-561.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/757//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/757//console

This message is automatically generated.

> Nodemanager should set some key information into the environment of every 
> container that it launches.
> -
>
> Key: YARN-561
> URL: https://issues.apache.org/jira/browse/YARN-561
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Hitesh Shah
>Assignee: Xuan Gong
>  Labels: usability
> Attachments: YARN-561.1.patch, YARN-561.2.patch, YARN-561.3.patch
>
>
> Information such as containerId, nodemanager hostname, nodemanager port is 
> not set in the environment when any container is launched. 
> For an AM, the RM does all of this for it but for a container launched by an 
> application, all of the above need to be set by the ApplicationMaster. 
> At the minimum, container id would be a useful piece of information. If the 
> container wishes to talk to its local NM, the nodemanager related information 
> would also come in handy. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-441) Clean up unused collection methods in various APIs

2013-04-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633797#comment-13633797
 ] 

Hadoop QA commented on YARN-441:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12579092/YARN-441.6.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/758//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/758//console

This message is automatically generated.

> Clean up unused collection methods in various APIs
> --
>
> Key: YARN-441
> URL: https://issues.apache.org/jira/browse/YARN-441
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Xuan Gong
> Attachments: YARN-441.1.patch, YARN-441.2.patch, YARN-441.3.patch, 
> YARN-441.4.patch, YARN-441.5.patch, YARN-441.6.patch
>
>
> There's a bunch of unused methods, like getAskCount() and getAsk(index), in 
> AllocateRequest and other interfaces. These should be removed.
> In YARN, they were found in the classes below; MR will have its own set.
> AllocateRequest
> StartContainerResponse

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-580) Delay scheduling in capacity scheduler is not ensuring 100% locality

2013-04-16 Thread Devaraj K (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633807#comment-13633807
 ] 

Devaraj K commented on YARN-580:


As per my observation on this issue, when the delay is configured to be larger 
than the number of cluster nodes, the capacity scheduler should check each node 
once for NODE_LOCAL free resources before assigning. However, it misses checking 
the last node for NODE_LOCAL and instead assigns on the second-to-last node as 
RACK_LOCAL or OFF_SWITCH.

We need to correct this for this issue; a simplified sketch of the intended 
check follows.
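
Illustrative only (the helper and variable names below are made up; the real 
logic lives in the capacity scheduler's locality-delay handling):

{code}
// Illustrative only: keep waiting for NODE_LOCAL until every cluster node has been offered once.
final class LocalityDelaySketch {
  static boolean keepWaitingForNodeLocal(int missedOpportunities,
                                         int configuredDelay,
                                         int numClusterNodes) {
    int effectiveDelay = Math.min(configuredDelay, numClusterNodes);
    // An off-by-one here (e.g. using effectiveDelay - 1, or counting the current node
    // as already missed) skips the last node, matching the behavior described above.
    return missedOpportunities < effectiveDelay;
  }
}
{code}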

> Delay scheduling in capacity scheduler is not ensuring 100% locality
> 
>
> Key: YARN-580
> URL: https://issues.apache.org/jira/browse/YARN-580
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 2.0.2-alpha, 2.0.1-alpha
>Reporter: Nishan Shetty
>Assignee: Devaraj K
>
> Example
> 
> Machine1: 3 blocks
> Machine2: 2 blocks
> Machine3: 1 blocks
> When we run job on this data, node locality is not ensured 100%
> Tasks run like below even if slots are available in all nodes:
> --
> Machine1: 4Task
> Machine2: 2Task
> Machine3: No task 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-514) Delayed store operations should not result in RM unavailability for app submission

2013-04-16 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-514:
-

Attachment: YARN-514.7.patch

This patch contains the following updates:

1. Instead of setting a flag to differentiate a new RMApp from a recovered one, a 
new event type, RECOVER, is defined: NEW->SUBMITTED on RECOVER, while 
NEW->NEW_SAVING on START. StartAppAttemptTransition is invoked on either 
NEW->SUBMITTED or NEW_SAVING->SUBMITTED; in this transition, the stored exception 
is checked only when the event type is APP_SAVED (see the sketch below).

2. With this modification, starting the attempt is now invoked in one place 
instead of two.

3. Wrong comments in testAppFailedFailed() are fixed.

4. Test cases are simplified: only the NEW to SUBMITTED transition is tested, 
without the subsequent redundant steps.

In addition, I've verified that the delayed application store works correctly on 
a single-node cluster.
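
To make the NEW-state wiring in item 1 concrete, a minimal self-contained sketch; 
the enums and method below are illustrative stand-ins for RMAppImpl's state 
machine, not the patch itself:

{code}
// Illustrative only: compact stand-in for the transitions described above.
final class NewStateSketch {
  enum AppState { NEW, NEW_SAVING, SUBMITTED }
  enum AppEvent { START, RECOVER, APP_SAVED }

  static AppState next(AppState state, AppEvent event) {
    if (state == AppState.NEW && event == AppEvent.START) {
      return AppState.NEW_SAVING;   // brand-new app: store it first
    }
    if (state == AppState.NEW && event == AppEvent.RECOVER) {
      return AppState.SUBMITTED;    // recovered app: skip re-storing
    }
    if (state == AppState.NEW_SAVING && event == AppEvent.APP_SAVED) {
      return AppState.SUBMITTED;    // the stored exception would be checked here
    }
    return state;                   // other combinations left unchanged in this sketch
  }
}
{code}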

> Delayed store operations should not result in RM unavailability for app 
> submission
> --
>
> Key: YARN-514
> URL: https://issues.apache.org/jira/browse/YARN-514
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Bikas Saha
>Assignee: Zhijie Shen
> Attachments: YARN-514.1.patch, YARN-514.2.patch, YARN-514.3.patch, 
> YARN-514.4.patch, YARN-514.5.patch, YARN-514.6.patch, YARN-514.7.patch
>
>
> Currently, app submission is the only store operation performed synchronously 
> because the app must be stored before the request returns with success. This 
> makes the RM susceptible to blocking all client threads on slow store 
> operations, resulting in RM being perceived as unavailable by clients.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-514) Delayed store operations should not result in RM unavailability for app submission

2013-04-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633819#comment-13633819
 ] 

Hadoop QA commented on YARN-514:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12579098/YARN-514.7.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/759//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/759//console

This message is automatically generated.

> Delayed store operations should not result in RM unavailability for app 
> submission
> --
>
> Key: YARN-514
> URL: https://issues.apache.org/jira/browse/YARN-514
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Bikas Saha
>Assignee: Zhijie Shen
> Attachments: YARN-514.1.patch, YARN-514.2.patch, YARN-514.3.patch, 
> YARN-514.4.patch, YARN-514.5.patch, YARN-514.6.patch, YARN-514.7.patch
>
>
> Currently, app submission is the only store operation performed synchronously 
> because the app must be stored before the request returns with success. This 
> makes the RM susceptible to blocking all client threads on slow store 
> operations, resulting in RM being perceived as unavailable by clients.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-501) Application Master getting killed randomly reporting excess usage of memory

2013-04-16 Thread Krishna Kishore Bonagiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633829#comment-13633829
 ] 

Krishna Kishore Bonagiri commented on YARN-501:
---

No problem Hitesh, I can do that. Can you please tell me how to set that log 
level?

Thanks,
Kishore





> Application Master getting killed randomly reporting excess usage of memory
> ---
>
> Key: YARN-501
> URL: https://issues.apache.org/jira/browse/YARN-501
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell, nodemanager
>Affects Versions: 2.0.3-alpha
>Reporter: Krishna Kishore Bonagiri
>
> I am running a date command using the Distributed Shell example in a loop of 
> 500 times. It ran successfully all the times except one time where it gave 
> the following error.
> 2013-03-22 04:33:25,280 INFO  [main] distributedshell.Client 
> (Client.java:monitorApplication(605)) - Got application report from ASM for, 
> appId=222, clientToken=null, appDiagnostics=Application 
> application_1363938200742_0222 failed 1 times due to AM Container for 
> appattempt_1363938200742_0222_01 exited with  exitCode: 143 due to: 
> Container [pid=21141,containerID=container_1363938200742_0222_01_01] is 
> running beyond virtual memory limits. Current usage: 47.3 Mb of 128 Mb 
> physical memory used; 611.6 Mb of 268.8 Mb virtual memory used. Killing 
> container.
> Dump of the process-tree for container_1363938200742_0222_01_01 :
> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
> |- 21147 21141 21141 21141 (java) 244 12 532643840 11802 
> /home_/dsadm/yarn/jdk//bin/java -Xmx128m 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster 
> --container_memory 10 --num_containers 2 --priority 0 --shell_command date
> |- 21141 8433 21141 21141 (bash) 0 0 108642304 298 /bin/bash -c 
> /home_/dsadm/yarn/jdk//bin/java -Xmx128m 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster 
> --container_memory 10 --num_containers 2 --priority 0 --shell_command date 
> 1>/tmp/logs/application_1363938200742_0222/container_1363938200742_0222_01_01/AppMaster.stdout
>  
> 2>/tmp/logs/application_1363938200742_0222/container_1363938200742_0222_01_01/AppMaster.stderr

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira