[jira] [Assigned] (YARN-1551) Allow user-specified reason for killApplication

2013-12-30 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov reassigned YARN-1551:
---

Assignee: Gera Shegalov

 Allow user-specified reason for killApplication
 ---

 Key: YARN-1551
 URL: https://issues.apache.org/jira/browse/YARN-1551
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Gera Shegalov
Assignee: Gera Shegalov

 This completes MAPREDUCE-5648



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (YARN-1551) Allow user-specified reason for killApplication

2013-12-30 Thread Gera Shegalov (JIRA)
Gera Shegalov created YARN-1551:
---

 Summary: Allow user-specified reason for killApplication
 Key: YARN-1551
 URL: https://issues.apache.org/jira/browse/YARN-1551
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Gera Shegalov


This completes MAPREDUCE-5648





[jira] [Updated] (YARN-1551) Allow user-specified reason for killApplication

2013-12-30 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated YARN-1551:


Attachment: YARN-1551.v01.patch

 Allow user-specified reason for killApplication
 ---

 Key: YARN-1551
 URL: https://issues.apache.org/jira/browse/YARN-1551
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1551.v01.patch


 This completes MAPREDUCE-5648





[jira] [Commented] (YARN-1551) Allow user-specified reason for killApplication

2013-12-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858803#comment-13858803
 ] 

Hadoop QA commented on YARN-1551:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12620843/YARN-1551.v01.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2757//console

This message is automatically generated.

 Allow user-specified reason for killApplication
 ---

 Key: YARN-1551
 URL: https://issues.apache.org/jira/browse/YARN-1551
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1551.v01.patch


 This completes MAPREDUCE-5648





[jira] [Commented] (YARN-1551) Allow user-specified reason for killApplication

2013-12-30 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858871#comment-13858871
 ] 

Karthik Kambatla commented on YARN-1551:


We should wrap this JIRA up before MAPREDUCE-5648. 

Just skimmed through the patch. Looks reasonable at first glance. There are 
still MapReduce changes in the patch; can we move them to the MR JIRA? Also, 
we should add some YARN-specific tests. 

 Allow user-specified reason for killApplication
 ---

 Key: YARN-1551
 URL: https://issues.apache.org/jira/browse/YARN-1551
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1551.v01.patch


 This completes MAPREDUCE-5648





[jira] [Commented] (YARN-1399) Allow users to annotate an application with multiple tags

2013-12-30 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858993#comment-13858993
 ] 

Zhijie Shen commented on YARN-1399:
---

Hm... It makes sense. A single tag from Oozie does not seem to be enough to 
identify the apps of a workflow. For example, in a multi-tenant setup, two 
Oozie workflows submitted by two different users may happen to have the same 
workflow ID. Killing the apps that have this workflow ID would then disrupt 
both workflows.

To solve the problem in YARN-1390, we may need more information to identify the 
apps belonging to a unique workflow:
1. Having GUID in the workflow ID to prevent conflicting IDs in a YARN cluster 
(unless others intentionally  attack)
2. Check owner of the application as well to avoid the case that attackers copy 
the workflow ID.
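
The second point can be sketched as a filter that requires both the workflow tag and the expected owner to match. This is only an illustration: TaggedApp and WorkflowAppFilter are hypothetical names, not YARN classes, and the real lookup would go through the RM rather than an in-memory list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for an application report; not a YARN class.
final class TaggedApp {
  final String appId;
  final String owner;
  final Set<String> tags;

  TaggedApp(String appId, String owner, Set<String> tags) {
    this.appId = appId;
    this.owner = owner;
    this.tags = tags;
  }
}

final class WorkflowAppFilter {
  // Returns the IDs of apps that carry the workflow tag AND belong to the
  // given owner, so a copied workflow ID alone is not enough to match.
  static List<String> appsOfWorkflow(List<TaggedApp> apps, String workflowId,
      String owner) {
    List<String> matches = new ArrayList<>();
    for (TaggedApp app : apps) {
      if (app.tags.contains(workflowId) && app.owner.equals(owner)) {
        matches.add(app.appId);
      }
    }
    return matches;
  }
}
```

With this check, an app submitted by another user with the same tag is left alone when the workflow's apps are selected for killing.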

 Allow users to annotate an application with multiple tags
 -

 Key: YARN-1399
 URL: https://issues.apache.org/jira/browse/YARN-1399
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Zhijie Shen
Assignee: Zhijie Shen

 Currently, when submitting an application, users can fill in the applicationType 
 field to make it easier to find later. IMHO, it would be good to accept multiple 
 tags so that users can describe their applications from multiple aspects, 
 including the application type. Searching by tags may then be a more efficient 
 way for users to reach the set of applications they want. It's much like the 
 tagging systems of online photo, video, and music sites.





[jira] [Commented] (YARN-1461) RM API and RM changes to handle tags for running jobs

2013-12-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859025#comment-13859025
 ] 

Sandy Ryza commented on YARN-1461:
--

The patch is looking good.  A few comments:

{code}
+  @Private
+  @Unstable
+  public abstract Set<String> getTags();
{code}
Getters in GetApplicationsRequest should be Public/Stable.

{code}
+
+
+  /**
+   * Get tags for the application
{code}
Two lines not needed here.

{code}
+  public abstract void setTags(Set<String> tags) throws 
IllegalArgumentException;
{code}
IllegalArgumentException is a RuntimeException - we don't need a throws for it. 
 I think using a checked exception here would require unnecessary try/catch for 
users who are doing it right. If they're using tags that violate the 
constraints, they should be changing their code, not handling the exception.
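
A minimal sketch of the point about unchecked exceptions, assuming a made-up tag limit: TagHolderSketch and MAX_TAGS are hypothetical and are not part of the patch. The setter validates its argument and throws IllegalArgumentException without declaring it, so callers that pass valid tags need no try/catch.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical holder used only to illustrate the exception style.
final class TagHolderSketch {
  // Assumed limit for illustration; the actual constraint is not in the thread.
  static final int MAX_TAGS = 10;

  private Set<String> tags = new HashSet<>();

  // No throws clause: IllegalArgumentException extends RuntimeException,
  // so the compiler does not force callers to handle it.
  void setTags(Set<String> newTags) {
    if (newTags.size() > MAX_TAGS) {
      throw new IllegalArgumentException("Too many tags: " + newTags.size());
    }
    this.tags = new HashSet<>(newTags);
  }

  Set<String> getTags() {
    return tags;
  }
}
```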

{code}
+Set<String> appTags = parseQueries(tags, false);
+if (!appTags.isEmpty()) {
+  checkAppTags = true;
+}
{code}
Null check?

{code}
+appInfo.getTags(.append(\,\)
+  .append(StringEscapeUtils.escapeJavaScript(StringEscapeUtils.escapeHtml(
{code}
We need to add a column in the table header for this, right?

{code}
+  protected String tags = ""; // initialize to an empty string
{code}
Comment is unnecessary

{code}
+  if (app.getTags() != null && !app.getTags().isEmpty()) {
+for (String tag : app.getTags()) {
+  this.tags += tag + ",";
+}
+this.tags = this.tags.substring(0, tags.length() - 1);
+  }
{code}
Use Joiner.on(",")
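
The suggestion refers to Guava's Joiner. To stay self-contained, the sketch below uses the standard-library String.join (Java 8 and later) as a stand-in, which is a swap made for this example and not what the review asked for. TagJoinSketch is a hypothetical name. Either way the manual loop and the trailing-comma trimming go away.

```java
import java.util.Set;

final class TagJoinSketch {
  // Joins tags with commas; returns an empty string for null or empty input,
  // mirroring the null/empty guard in the loop under review.
  static String joinTags(Set<String> tags) {
    if (tags == null || tags.isEmpty()) {
      return "";
    }
    return String.join(",", tags);
  }
}
```

The output order follows the iteration order of the set, so a sorted or insertion-ordered set is needed if the displayed order matters.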

{code}
+
+  @Override
+  public Set<String> getTags() { return null; }
{code}
Use multiple lines as is done for other methods that return null in the class.

 RM API and RM changes to handle tags for running jobs
 -

 Key: YARN-1461
 URL: https://issues.apache.org/jira/browse/YARN-1461
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: yarn-1461-1.patch, yarn-1461-2.patch, yarn-1461-3.patch, 
 yarn-1461-4.patch, yarn-1461-5.patch








[jira] [Updated] (YARN-1029) Allow embedding leader election into the RM

2013-12-30 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1029:
---

Attachment: yarn-1029-8.patch

Noticed occasional failures of TestRMFailover in some Jenkins builds (e.g. 
https://builds.apache.org/job/PreCommit-YARN-Build/2750//testReport/org.apache.hadoop.yarn.client/TestRMFailover/testExplicitFailover/).
 New patch that bumps up the timeout for NM-RM connection to 10 seconds 
(granularity remains 100 ms). 

 Allow embedding leader election into the RM
 ---

 Key: YARN-1029
 URL: https://issues.apache.org/jira/browse/YARN-1029
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla
 Attachments: embedded-zkfc-approach.patch, yarn-1029-0.patch, 
 yarn-1029-0.patch, yarn-1029-1.patch, yarn-1029-2.patch, yarn-1029-3.patch, 
 yarn-1029-4.patch, yarn-1029-5.patch, yarn-1029-6.patch, yarn-1029-7.patch, 
 yarn-1029-7.patch, yarn-1029-8.patch, yarn-1029-approach.patch


 It should be possible to embed the common ActiveStandbyElector into the RM such 
 that ZooKeeper-based leader election and notification are built in. In 
 conjunction with a ZK state store, this configuration will be a simple 
 deployment option.





[jira] [Commented] (YARN-1551) Allow user-specified reason for killApplication

2013-12-30 Thread Gera Shegalov (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859036#comment-13859036
 ] 

Gera Shegalov commented on YARN-1551:
-

[~kkambatl], I find it difficult to separate the MR code out of this patch, 
because ResourceMgrDelegate extends the abstract YarnClient. You are right that 
the test is not YARN-specific, but it tests the CLI and the YARN API end to end 
on an MR job.

[~vinodkv], thanks for the suggestion. I'll work on incorporating a limit here 
and in MAPREDUCE-5648. 

 Allow user-specified reason for killApplication
 ---

 Key: YARN-1551
 URL: https://issues.apache.org/jira/browse/YARN-1551
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1551.v01.patch


 This completes MAPREDUCE-5648





[jira] [Updated] (YARN-1461) RM API and RM changes to handle tags for running jobs

2013-12-30 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1461:
---

Attachment: yarn-1461-6.patch

 RM API and RM changes to handle tags for running jobs
 -

 Key: YARN-1461
 URL: https://issues.apache.org/jira/browse/YARN-1461
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: yarn-1461-1.patch, yarn-1461-2.patch, yarn-1461-3.patch, 
 yarn-1461-4.patch, yarn-1461-5.patch, yarn-1461-6.patch








[jira] [Created] (YARN-1552) Promote GetApplicationsRequest APIs to Stable

2013-12-30 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created YARN-1552:
--

 Summary: Promote GetApplicationsRequest APIs to Stable
 Key: YARN-1552
 URL: https://issues.apache.org/jira/browse/YARN-1552
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api
Affects Versions: 2.2.0
Reporter: Karthik Kambatla


GetApplicationsRequest is used to fetch applications from the RM. Currently, 
all APIs are Public/Unstable. I think it is time to graduate these to Stable 
APIs





[jira] [Commented] (YARN-1029) Allow embedding leader election into the RM

2013-12-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859044#comment-13859044
 ] 

Sandy Ryza commented on YARN-1029:
--

Looked over the minicluster changes.

A couple tiny nits, otherwise LGTM:
* In MiniYarnCluster, failoverTimeout does not need to be initialized to 0 
because it will always get set in serviceInit.
* In initResourceManager, index does not need to be final
* In initResourceManager, having the open paren on the line after register 
looks a little weird, and new EventHandler<RMAppAttemptEvent>() should be at 
the same indentation level as RMAppAttemptEventType.class.
* The thread in startResourceManager should be given a name (including the 
index).  Though if that's unrelated to this patch, leaving it how it is is fine.
* Why the added null check in getActiveRMIndex?  When would one of the entries 
in the resourceManagers array be null?

 Allow embedding leader election into the RM
 ---

 Key: YARN-1029
 URL: https://issues.apache.org/jira/browse/YARN-1029
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla
 Attachments: embedded-zkfc-approach.patch, yarn-1029-0.patch, 
 yarn-1029-0.patch, yarn-1029-1.patch, yarn-1029-2.patch, yarn-1029-3.patch, 
 yarn-1029-4.patch, yarn-1029-5.patch, yarn-1029-6.patch, yarn-1029-7.patch, 
 yarn-1029-7.patch, yarn-1029-8.patch, yarn-1029-approach.patch


 It should be possible to embed the common ActiveStandbyElector into the RM such 
 that ZooKeeper-based leader election and notification are built in. In 
 conjunction with a ZK state store, this configuration will be a simple 
 deployment option.





[jira] [Commented] (YARN-1461) RM API and RM changes to handle tags for running jobs

2013-12-30 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859048#comment-13859048
 ] 

Karthik Kambatla commented on YARN-1461:


Thanks Sandy. Posted a patch that incorporates most of your suggestions.

bq. Getters in GetApplicationsRequest should be Public/Stable.
All other getters and setters in GetApplicationsRequest are Public/Unstable. I 
would like to leave it this way for this patch. Created YARN-1552 to graduate 
all of them to Public/Stable.

{quote}
{code}
+Set<String> appTags = parseQueries(tags, false);
+if (!appTags.isEmpty()) {
+  checkAppTags = true;
+}
{code}
Null check?
{quote}
parseQueries always returns a non-null set. There are a couple of other uses 
without a null check. I'll leave it consistent with the others, if that is okay. 
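
The contract described above can be sketched as follows. The helper parseCsv is hypothetical and is not the actual parseQueries implementation. It shows a parser that returns an empty set for missing input, so callers can call isEmpty() without a null check.

```java
import java.util.HashSet;
import java.util.Set;

final class QueryParseSketch {
  // Hypothetical helper: splits a comma-separated query value into a set.
  // It never returns null; null or blank input yields an empty set.
  static Set<String> parseCsv(String raw) {
    Set<String> out = new HashSet<>();
    if (raw == null) {
      return out;
    }
    for (String part : raw.split(",")) {
      String trimmed = part.trim();
      if (!trimmed.isEmpty()) {
        out.add(trimmed);
      }
    }
    return out;
  }
}
```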

 RM API and RM changes to handle tags for running jobs
 -

 Key: YARN-1461
 URL: https://issues.apache.org/jira/browse/YARN-1461
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: yarn-1461-1.patch, yarn-1461-2.patch, yarn-1461-3.patch, 
 yarn-1461-4.patch, yarn-1461-5.patch, yarn-1461-6.patch








[jira] [Commented] (YARN-1461) RM API and RM changes to handle tags for running jobs

2013-12-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859047#comment-13859047
 ] 

Sandy Ryza commented on YARN-1461:
--

In GetApplicationsRequest, getTags still hasn't been changed to Public/Stable

{code}
-  public void setApplicationStates(EnumSet<YarnApplicationState> 
applicationStates) {
+  public void setApplicationStates(EnumSet <YarnApplicationState> 
applicationStates) {
{code}

 RM API and RM changes to handle tags for running jobs
 -

 Key: YARN-1461
 URL: https://issues.apache.org/jira/browse/YARN-1461
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: yarn-1461-1.patch, yarn-1461-2.patch, yarn-1461-3.patch, 
 yarn-1461-4.patch, yarn-1461-5.patch, yarn-1461-6.patch








[jira] [Commented] (YARN-1461) RM API and RM changes to handle tags for running jobs

2013-12-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859051#comment-13859051
 ] 

Hadoop QA commented on YARN-1461:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12620866/yarn-1461-6.patch
  against trunk revision .

{color:red}-1 patch{color}.  Trunk compilation may be broken.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2759//console

This message is automatically generated.

 RM API and RM changes to handle tags for running jobs
 -

 Key: YARN-1461
 URL: https://issues.apache.org/jira/browse/YARN-1461
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: yarn-1461-1.patch, yarn-1461-2.patch, yarn-1461-3.patch, 
 yarn-1461-4.patch, yarn-1461-5.patch, yarn-1461-6.patch








[jira] [Updated] (YARN-1552) Make all GetApplicationsRequest getter APIs Public/Stable

2013-12-30 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1552:
-

Description: 
GetApplicationsRequest is used to fetch applications from the RM. Currently, 
all setters and some getters are Private/Unstable and some getters are 
Public/Stable.  We should graduate all the getters to Public/Stable.


  was:GetApplicationsRequest is used to fetch applications from the RM. 
Currently, all APIs are Public/Unstable. I think it is time to graduate these 
to Stable APIs

Summary: Make all GetApplicationsRequest getter APIs Public/Stable  
(was: Promote GetApplicationsRequest APIs to Stable)

 Make all GetApplicationsRequest getter APIs Public/Stable
 -

 Key: YARN-1552
 URL: https://issues.apache.org/jira/browse/YARN-1552
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api
Affects Versions: 2.2.0
Reporter: Karthik Kambatla

 GetApplicationsRequest is used to fetch applications from the RM. Currently, 
 all setters and some getters are Private/Unstable and some getters are 
 Public/Stable.  We should graduate all the getters to Public/Stable.





[jira] [Commented] (YARN-1399) Allow users to annotate an application with multiple tags

2013-12-30 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859053#comment-13859053
 ] 

Robert Kanter commented on YARN-1399:
-

{quote}
Having GUID in the workflow ID to prevent conflicting IDs in a YARN cluster 
(unless others intentionally attack)
{quote}
Oozie workflow IDs are configurable, but by default they are 
{{job_number\-timestamp\-system_id-job_type}}.  These should be unique 
so we don't need a separate GUID.  We'd actually use the action ID instead of 
the workflow ID, but that's just {{workflow_ID@action_name}}, which is also 
unique.  For example: {{005-131218150050953-oozie-oozi-W@pig-node}}

Unless the user configured multiple Oozie servers such that the IDs would be 
similar, the only way to get the same workflow or action id would be to start 
the two servers at exactly the same time so they'd have the same timestamp, 
which I don't think we need to worry about.  
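
As a small illustration of the action ID format described above (workflow ID, then "@", then the action name), the sketch below builds an action ID and recovers the workflow ID from it. OozieIdSketch and its methods are hypothetical helpers, not Oozie APIs.

```java
final class OozieIdSketch {
  // Builds an action ID of the form workflow_ID@action_name.
  static String actionId(String workflowId, String actionName) {
    return workflowId + "@" + actionName;
  }

  // Returns the workflow ID portion of an action ID, or the input unchanged
  // if it contains no '@' separator.
  static String workflowIdOf(String actionId) {
    int at = actionId.indexOf('@');
    return at < 0 ? actionId : actionId.substring(0, at);
  }
}
```

For the example ID quoted above, workflowIdOf would return the part before "@pig-node".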

 Allow users to annotate an application with multiple tags
 -

 Key: YARN-1399
 URL: https://issues.apache.org/jira/browse/YARN-1399
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Zhijie Shen
Assignee: Zhijie Shen

 Currently, when submitting an application, users can fill in the applicationType 
 field to make it easier to find later. IMHO, it would be good to accept multiple 
 tags so that users can describe their applications from multiple aspects, 
 including the application type. Searching by tags may then be a more efficient 
 way for users to reach the set of applications they want. It's much like the 
 tagging systems of online photo, video, and music sites.





[jira] [Updated] (YARN-1461) RM API and RM changes to handle tags for running jobs

2013-12-30 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1461:
---

Attachment: yarn-1461-6.patch

Resubmitting same patch to see if Jenkins likes it better.

 RM API and RM changes to handle tags for running jobs
 -

 Key: YARN-1461
 URL: https://issues.apache.org/jira/browse/YARN-1461
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: yarn-1461-1.patch, yarn-1461-2.patch, yarn-1461-3.patch, 
 yarn-1461-4.patch, yarn-1461-5.patch, yarn-1461-6.patch, yarn-1461-6.patch








[jira] [Commented] (YARN-1029) Allow embedding leader election into the RM

2013-12-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859062#comment-13859062
 ] 

Hadoop QA commented on YARN-1029:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12620865/yarn-1029-8.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 4 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests:

  org.apache.hadoop.ha.TestZKFailoverController

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2758//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/2758//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2758//console

This message is automatically generated.

 Allow embedding leader election into the RM
 ---

 Key: YARN-1029
 URL: https://issues.apache.org/jira/browse/YARN-1029
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla
 Attachments: embedded-zkfc-approach.patch, yarn-1029-0.patch, 
 yarn-1029-0.patch, yarn-1029-1.patch, yarn-1029-2.patch, yarn-1029-3.patch, 
 yarn-1029-4.patch, yarn-1029-5.patch, yarn-1029-6.patch, yarn-1029-7.patch, 
 yarn-1029-7.patch, yarn-1029-8.patch, yarn-1029-approach.patch


 It should be possible to embed the common ActiveStandbyElector into the RM such 
 that ZooKeeper-based leader election and notification are built in. In 
 conjunction with a ZK state store, this configuration will be a simple 
 deployment option.





[jira] [Commented] (YARN-1461) RM API and RM changes to handle tags for running jobs

2013-12-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859083#comment-13859083
 ] 

Hadoop QA commented on YARN-1461:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12620867/yarn-1461-6.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 7 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2760//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2760//console

This message is automatically generated.

 RM API and RM changes to handle tags for running jobs
 -

 Key: YARN-1461
 URL: https://issues.apache.org/jira/browse/YARN-1461
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: yarn-1461-1.patch, yarn-1461-2.patch, yarn-1461-3.patch, 
 yarn-1461-4.patch, yarn-1461-5.patch, yarn-1461-6.patch, yarn-1461-6.patch








[jira] [Updated] (YARN-1522) TestApplicationCleanup.testAppCleanup occasionally fails

2013-12-30 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-1522:
--

Attachment: YARN-1522-2.txt

Same patch as before, for appeasing Jenkins.

 TestApplicationCleanup.testAppCleanup occasionally fails
 

 Key: YARN-1522
 URL: https://issues.apache.org/jira/browse/YARN-1522
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Liyin Liang
Assignee: Liyin Liang
 Attachments: YARN-1522-1.diff, YARN-1522-2.txt, YARN-1522-2.txt


 TestApplicationCleanup is occasionally failing with the error:
 {code}
 ---
 Test set: org.apache.hadoop.yarn.server.resourcemanager.TestApplicationCleanup
 ---
 Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 6.215 sec  
 FAILURE! - in 
 org.apache.hadoop.yarn.server.resourcemanager.TestApplicationCleanup
 testAppCleanup(org.apache.hadoop.yarn.server.resourcemanager.TestApplicationCleanup)
  Time elapsed: 5.555 sec  FAILURE!
 junit.framework.AssertionFailedError: expected:<1> but was:<0>
 at 
 org.apache.hadoop.yarn.server.resourcemanager.TestApplicationCleanup.testAppCleanup(TestApplicationCleanup.java:119)
 {code}





[jira] [Updated] (YARN-1461) RM API and RM changes to handle tags for running jobs

2013-12-30 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1461:
---

Attachment: yarn-1461-7.patch

Removed one false change in GetApplicationsRequestPBImpl.

 RM API and RM changes to handle tags for running jobs
 -

 Key: YARN-1461
 URL: https://issues.apache.org/jira/browse/YARN-1461
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: yarn-1461-1.patch, yarn-1461-2.patch, yarn-1461-3.patch, 
 yarn-1461-4.patch, yarn-1461-5.patch, yarn-1461-6.patch, yarn-1461-6.patch, 
 yarn-1461-7.patch








[jira] [Commented] (YARN-1461) RM API and RM changes to handle tags for running jobs

2013-12-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859103#comment-13859103
 ] 

Sandy Ryza commented on YARN-1461:
--

+1

 RM API and RM changes to handle tags for running jobs
 -

 Key: YARN-1461
 URL: https://issues.apache.org/jira/browse/YARN-1461
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: yarn-1461-1.patch, yarn-1461-2.patch, yarn-1461-3.patch, 
 yarn-1461-4.patch, yarn-1461-5.patch, yarn-1461-6.patch, yarn-1461-6.patch, 
 yarn-1461-7.patch








[jira] [Commented] (YARN-1461) RM API and RM changes to handle tags for running jobs

2013-12-30 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859130#comment-13859130
 ] 

Zhijie Shen commented on YARN-1461:
---

[~kkambatl], given Vinod's concern about using tags to identify the applications 
of an Oozie workflow (see YARN-1399), would you mind holding off on your patches 
for a while, so that we can think more about it?

 RM API and RM changes to handle tags for running jobs
 -

 Key: YARN-1461
 URL: https://issues.apache.org/jira/browse/YARN-1461
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: yarn-1461-1.patch, yarn-1461-2.patch, yarn-1461-3.patch, 
 yarn-1461-4.patch, yarn-1461-5.patch, yarn-1461-6.patch, yarn-1461-6.patch, 
 yarn-1461-7.patch








[jira] [Commented] (YARN-1522) TestApplicationCleanup.testAppCleanup occasionally fails

2013-12-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859140#comment-13859140
 ] 

Hudson commented on YARN-1522:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4940 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4940/])
YARN-1522. Fixed a race condition in the test TestApplicationCleanup that was 
causing it to randomly fail. Contributed by Liyin Liang. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1554328)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationCleanup.java


 TestApplicationCleanup.testAppCleanup occasionally fails
 

 Key: YARN-1522
 URL: https://issues.apache.org/jira/browse/YARN-1522
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Liyin Liang
Assignee: Liyin Liang
 Fix For: 2.4.0

 Attachments: YARN-1522-1.diff, YARN-1522-2.txt, YARN-1522-2.txt


 TestApplicationCleanup is occasionally failing with the error:
 {code}
 ---
 Test set: org.apache.hadoop.yarn.server.resourcemanager.TestApplicationCleanup
 ---
 Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 6.215 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestApplicationCleanup
 testAppCleanup(org.apache.hadoop.yarn.server.resourcemanager.TestApplicationCleanup)  Time elapsed: 5.555 sec  <<< FAILURE!
 junit.framework.AssertionFailedError: expected:<1> but was:<0>
   at org.apache.hadoop.yarn.server.resourcemanager.TestApplicationCleanup.testAppCleanup(TestApplicationCleanup.java:119)
 {code}





[jira] [Updated] (YARN-1121) RMStateStore should flush all pending store events before closing

2013-12-30 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-1121:
--

Attachment: YARN-1121.11.patch

 RMStateStore should flush all pending store events before closing
 -

 Key: YARN-1121
 URL: https://issues.apache.org/jira/browse/YARN-1121
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Bikas Saha
Assignee: Jian He
 Fix For: 2.4.0

 Attachments: YARN-1121.1.patch, YARN-1121.10.patch, 
 YARN-1121.11.patch, YARN-1121.2.patch, YARN-1121.2.patch, YARN-1121.3.patch, 
 YARN-1121.4.patch, YARN-1121.5.patch, YARN-1121.6.patch, YARN-1121.6.patch, 
 YARN-1121.7.patch, YARN-1121.8.patch, YARN-1121.9.patch


 on serviceStop it should wait for all internal pending events to drain before 
 stopping.





[jira] [Commented] (YARN-1121) RMStateStore should flush all pending store events before closing

2013-12-30 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859141#comment-13859141
 ] 

Jian He commented on YARN-1121:
---

Agree, made the change to check the blockNewEvents flag before acquiring the 
lock and calling notify, and added the comment also.
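
The drain-on-stop pattern described in this comment can be sketched roughly as follows. This is a simplified illustration, not the AsyncDispatcher code from the patch: the class name DrainingDispatcher, the pending counter, and the 100 ms timed wait are assumptions made for the example. It shows the handler thread checking the blockNewEvents flag before it takes the lock and calls notify, so the lock is only touched while the service is stopping.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical, simplified dispatcher; it is not the real AsyncDispatcher. */
class DrainingDispatcher {
  private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
  private final AtomicInteger pending = new AtomicInteger();
  private final Object waitForDrained = new Object();
  private volatile boolean blockNewEvents = false;
  private final Thread worker;

  DrainingDispatcher() {
    worker = new Thread(this::handleEvents, "event-handler");
    worker.setDaemon(true);
    worker.start();
  }

  /** Returns false once stop() has begun, so late events are rejected. */
  boolean submit(Runnable event) {
    pending.incrementAndGet();          // count first, then check the flag
    if (blockNewEvents) {
      pending.decrementAndGet();
      return false;
    }
    queue.add(event);
    return true;
  }

  private void handleEvents() {
    while (true) {
      Runnable event;
      try {
        event = queue.take();
      } catch (InterruptedException e) {
        return;                         // stop() interrupts after draining
      }
      try {
        event.run();
      } finally {
        int left = pending.decrementAndGet();
        // Check the flag first: the lock is only taken while stopping.
        if (blockNewEvents && left == 0) {
          synchronized (waitForDrained) {
            waitForDrained.notifyAll();
          }
        }
      }
    }
  }

  /** Blocks new events, waits for pending ones to drain, then stops the worker. */
  void stop() {
    blockNewEvents = true;
    try {
      synchronized (waitForDrained) {
        // Timed wait: re-check periodically in case a notify was missed.
        while (pending.get() != 0 && worker.isAlive()) {
          waitForDrained.wait(100);
        }
      }
      worker.interrupt();
      worker.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```

The timed wait plus the isAlive() check keeps stop() from hanging if the handler thread dies or a notification is missed.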

 RMStateStore should flush all pending store events before closing
 -

 Key: YARN-1121
 URL: https://issues.apache.org/jira/browse/YARN-1121
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Bikas Saha
Assignee: Jian He
 Fix For: 2.4.0

 Attachments: YARN-1121.1.patch, YARN-1121.10.patch, 
 YARN-1121.11.patch, YARN-1121.2.patch, YARN-1121.2.patch, YARN-1121.3.patch, 
 YARN-1121.4.patch, YARN-1121.5.patch, YARN-1121.6.patch, YARN-1121.6.patch, 
 YARN-1121.7.patch, YARN-1121.8.patch, YARN-1121.9.patch


 on serviceStop it should wait for all internal pending events to drain before 
 stopping.





[jira] [Commented] (YARN-1399) Allow users to annotate an application with multiple tags

2013-12-30 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859148#comment-13859148
 ] 

Karthik Kambatla commented on YARN-1399:


bq.  Tags or the source/group originally proposed won't help the oozie case as 
described on YARN-1390. Or to be more accurate, they make it unwieldy. Let's 
say oozie uses a tag workflow_123_566 for all apps in a workflow, any other 
application from any other user SHOULD not set that tag. Or run the risk of 
getting killed by oozie. 
As Robert mentioned workflow@action is an involved string, and I don't think 
a user accidentally setting that tag should be a concern. Oozie will kill an 
application as the user who submitted the job. I agree Oozie will kill an app 
if the same user (who submitted the workflow) submits another app with a tag 
matching the action id. It doesn't sound likely, and I don't think it is really 
a concern. Even if it were a concern, I don't think it is YARN's responsibility 
to limit what tags can be set or not.

bq. To avoid it, we'll need to depend on oozie to not kill as a privileged user.
Oozie runs jobs as the user that submits the workflow. The same rules will 
apply to killing a job. Even if Oozie somehow becomes malicious, it is YARN's 
responsibility to not let it run/kill jobs as a privileged user. No? 

bq. Further, I could make any other user's application-search cumbersome by 
reusing his/her tags for my own applications. Seems like the tag-search should 
be linked to and limited by some other entity like user - search for apps 
matching a tag for a given user/queue etc.
That is definitely a thought. I am open to requiring that either a user or a 
queue be specified when searching by tag. However, in principle, this could happen 
with application-types as well: a user could submit a number of random YARN 
applications with type MAPREDUCE. I thought the way we were restricting 
exposing these (tags/types) was through ACLs on a secure cluster. Only users 
with permissions to view someone else's apps should be able to view the tags. 
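
A tag search that is scoped by user, as discussed above, could look roughly like the sketch below. The TaggedApp and TagSearchSketch types are hypothetical and exist only for this illustration; they are not YARN APIs. The point is that another user reusing the same tag does not show up in the result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical application record; a real application report is richer. */
class TaggedApp {
  final String id;
  final String user;
  final Set<String> tags;

  TaggedApp(String id, String user, Set<String> tags) {
    this.id = id;
    this.user = user;
    this.tags = tags;
  }
}

class TagSearchSketch {
  /** Returns ids of apps that carry the tag AND belong to the given user. */
  static List<String> findByTag(List<TaggedApp> apps, String user, String tag) {
    List<String> matches = new ArrayList<>();
    for (TaggedApp app : apps) {
      if (app.user.equals(user) && app.tags.contains(tag)) {
        matches.add(app.id);
      }
    }
    return matches;
  }
}
```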

 Allow users to annotate an application with multiple tags
 -

 Key: YARN-1399
 URL: https://issues.apache.org/jira/browse/YARN-1399
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Zhijie Shen
Assignee: Zhijie Shen

 Currently, when submitting an application, users can fill in the applicationType 
 field to make it easier to search for later. IMHO, it would be good to accept 
 multiple tags so that users can describe their applications along multiple 
 dimensions, including the application type. Searching by tags may then be a 
 more efficient way for users to find the applications they want. It's much 
 like the tag systems of online photo/video/music services.





[jira] [Commented] (YARN-1399) Allow users to annotate an application with multiple tags

2013-12-30 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859151#comment-13859151
 ] 

Karthik Kambatla commented on YARN-1399:


The following blurb in {{ClientRMService#getApplications}} restricts view 
access: 
{code}
  boolean allowAccess = checkAccess(callerUGI, application.getUser(),
  ApplicationAccessType.VIEW_APP, application);
  reports.add(application.createAndGetApplicationReport(
  callerUGI.getUserName(), allowAccess));
{code}

 Allow users to annotate an application with multiple tags
 -

 Key: YARN-1399
 URL: https://issues.apache.org/jira/browse/YARN-1399
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Zhijie Shen
Assignee: Zhijie Shen

 Currently, when submitting an application, users can fill in the applicationType 
 field to make it easier to search for later. IMHO, it would be good to accept 
 multiple tags so that users can describe their applications along multiple 
 dimensions, including the application type. Searching by tags may then be a 
 more efficient way for users to find the applications they want. It's much 
 like the tag systems of online photo/video/music services.





[jira] [Commented] (YARN-1489) [Umbrella] Work-preserving ApplicationMaster restart

2013-12-30 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859156#comment-13859156
 ] 

Zhijie Shen commented on YARN-1489:
---

Thanks Vinod for the proposal. One thought when I read the following point.

bq. In case of apps like MapReduce where containers need to communicate 
directly with AMs, the old running-containers don’t know where the new 
ApplicationMaster is running and how to reach it (service addresses).

During an AM restart, containers in some applications may try to send messages 
to the AM, and these messages may get lost. Would it be good to buffer the 
outstanding messages and send them to the AM when rebinding?
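
One way the buffering idea could look is sketched below. It is only an illustration of the question raised here, not part of the proposal: the BufferingAmClient class and its send/unbind/rebind methods are hypothetical names. Messages sent while no AM is bound are queued and then replayed in order when a new AM is bound.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Consumer;

/** Hypothetical container-side client; not an existing YARN class. */
class BufferingAmClient {
  private final Queue<String> outstanding = new ArrayDeque<>();
  private Consumer<String> am;   // null while the AM is restarting

  /** Delivers the message, or buffers it if no AM is currently bound. */
  synchronized void send(String message) {
    if (am == null) {
      outstanding.add(message);
    } else {
      am.accept(message);
    }
  }

  /** Called when the old AM goes away. */
  synchronized void unbind() {
    am = null;
  }

  /** Binds to the new AM and flushes buffered messages in send order. */
  synchronized void rebind(Consumer<String> newAm) {
    am = newAm;
    String message;
    while ((message = outstanding.poll()) != null) {
      newAm.accept(message);
    }
  }
}
```

An unbounded buffer is used here for brevity; a real design would presumably need a size limit or a timeout.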

 [Umbrella] Work-preserving ApplicationMaster restart
 

 Key: YARN-1489
 URL: https://issues.apache.org/jira/browse/YARN-1489
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Attachments: Work preserving AM restart.pdf


 Today if AMs go down,
  - RM kills all the containers of that ApplicationAttempt
  - New ApplicationAttempt doesn't know where the previous containers are 
 running
  - Old running containers don't know where the new AM is running.
 We need to fix this to enable work-preserving AM restart. The latter two 
 potentially can be done at the app level, but it is good to have a common 
 solution for all apps wherever possible.





[jira] [Commented] (YARN-1121) RMStateStore should flush all pending store events before closing

2013-12-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859157#comment-13859157
 ] 

Hadoop QA commented on YARN-1121:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12620880/YARN-1121.11.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2763//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2763//console

This message is automatically generated.

 RMStateStore should flush all pending store events before closing
 -

 Key: YARN-1121
 URL: https://issues.apache.org/jira/browse/YARN-1121
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Bikas Saha
Assignee: Jian He
 Fix For: 2.4.0

 Attachments: YARN-1121.1.patch, YARN-1121.10.patch, 
 YARN-1121.11.patch, YARN-1121.2.patch, YARN-1121.2.patch, YARN-1121.3.patch, 
 YARN-1121.4.patch, YARN-1121.5.patch, YARN-1121.6.patch, YARN-1121.6.patch, 
 YARN-1121.7.patch, YARN-1121.8.patch, YARN-1121.9.patch


 on serviceStop it should wait for all internal pending events to drain before 
 stopping.





[jira] [Updated] (YARN-1029) Allow embedding leader election into the RM

2013-12-30 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1029:
---

Attachment: yarn-1029-9.patch

 Allow embedding leader election into the RM
 ---

 Key: YARN-1029
 URL: https://issues.apache.org/jira/browse/YARN-1029
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla
 Attachments: embedded-zkfc-approach.patch, yarn-1029-0.patch, 
 yarn-1029-0.patch, yarn-1029-1.patch, yarn-1029-2.patch, yarn-1029-3.patch, 
 yarn-1029-4.patch, yarn-1029-5.patch, yarn-1029-6.patch, yarn-1029-7.patch, 
 yarn-1029-7.patch, yarn-1029-8.patch, yarn-1029-9.patch, 
 yarn-1029-approach.patch


 It should be possible to embed the common ActiveStandbyElector into the RM such 
 that ZooKeeper-based leader election and notification are built in. In 
 conjunction with a ZK state store, this configuration will be a simple 
 deployment option.





[jira] [Commented] (YARN-1029) Allow embedding leader election into the RM

2013-12-30 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859160#comment-13859160
 ] 

Karthik Kambatla commented on YARN-1029:


Thanks Sandy. Posted a new patch that addresses most of your comments.

bq. Why the added null check in getActiveRMIndex? When would one of the entries 
in the resourceManagers array be null?
stopResourceManager(i) stops and nullifies resourceManagers[i]. restart() 
resets this and points to a new RM. Currently, we are forced to do this because 
a stopped service can't be restarted, and the only way to trigger a failover 
automatically is to kill the RM. Have marked these two methods @Private to 
limit their access to YARN.
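
The reason for the null check can be illustrated with a small sketch. The names below (FakeRm, MiniClusterSketch, restartResourceManager, transitionToActive) are hypothetical stand-ins and not the classes touched by the patch; the sketch only shows why a stopped slot has to be skipped when looking for the active RM.

```java
/** Hypothetical stand-in for a ResourceManager instance. */
class FakeRm {
  boolean active;

  FakeRm(boolean active) {
    this.active = active;
  }
}

/** Hypothetical mini cluster holding several RMs, some possibly stopped. */
class MiniClusterSketch {
  private final FakeRm[] resourceManagers;

  MiniClusterSketch(int count) {
    resourceManagers = new FakeRm[count];
    for (int i = 0; i < count; i++) {
      resourceManagers[i] = new FakeRm(i == 0);   // first RM starts active
    }
  }

  /** A stopped service cannot be restarted, so the slot is nulled out. */
  void stopResourceManager(int i) {
    resourceManagers[i] = null;
  }

  /** Points the slot at a fresh, standby RM instance. */
  void restartResourceManager(int i) {
    resourceManagers[i] = new FakeRm(false);
  }

  void transitionToActive(int i) {
    resourceManagers[i].active = true;
  }

  /** Returns the index of the active RM, or -1; skips stopped (null) slots. */
  int getActiveRMIndex() {
    for (int i = 0; i < resourceManagers.length; i++) {
      FakeRm rm = resourceManagers[i];
      if (rm != null && rm.active) {
        return i;
      }
    }
    return -1;
  }
}
```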

 Allow embedding leader election into the RM
 ---

 Key: YARN-1029
 URL: https://issues.apache.org/jira/browse/YARN-1029
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla
 Attachments: embedded-zkfc-approach.patch, yarn-1029-0.patch, 
 yarn-1029-0.patch, yarn-1029-1.patch, yarn-1029-2.patch, yarn-1029-3.patch, 
 yarn-1029-4.patch, yarn-1029-5.patch, yarn-1029-6.patch, yarn-1029-7.patch, 
 yarn-1029-7.patch, yarn-1029-8.patch, yarn-1029-9.patch, 
 yarn-1029-approach.patch


 It should be possible to embed the common ActiveStandbyElector into the RM such 
 that ZooKeeper-based leader election and notification are built in. In 
 conjunction with a ZK state store, this configuration will be a simple 
 deployment option.





[jira] [Commented] (YARN-1552) Make all GetApplicationsRequest getter APIs Public/Stable

2013-12-30 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859167#comment-13859167
 ] 

Zhijie Shen commented on YARN-1552:
---

Having checked GetApplicationsRequest, I think two issues may be treated 
independently:

1. Private -> Public: the methods are supposed to be user oriented. Not sure 
why they were Private before.
2. Unstable -> Stable: I agree with Hitesh on not doing this until we figure 
out all the filters we require. It seems that RM's webservices still 
provide more filtering options than RPC. In the future, we may add tags as well.

 Make all GetApplicationsRequest getter APIs Public/Stable
 -

 Key: YARN-1552
 URL: https://issues.apache.org/jira/browse/YARN-1552
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api
Affects Versions: 2.2.0
Reporter: Karthik Kambatla

 GetApplicationsRequest is used to fetch applications from the RM. Currently, 
 all setters and some getters are Private/Unstable and some getters are 
 Public/Stable.  We should graduate all the getters to Public/Stable.





[jira] [Commented] (YARN-1552) Make all GetApplicationsRequest getter APIs Public/Stable

2013-12-30 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859173#comment-13859173
 ] 

Karthik Kambatla commented on YARN-1552:


Makes sense to wait until history server work is done. [~zjshen] - can you add 
a blocker link on a related history server item? 

 Make all GetApplicationsRequest getter APIs Public/Stable
 -

 Key: YARN-1552
 URL: https://issues.apache.org/jira/browse/YARN-1552
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api
Affects Versions: 2.2.0
Reporter: Karthik Kambatla

 GetApplicationsRequest is used to fetch applications from the RM. Currently, 
 all setters and some getters are Private/Unstable and some getters are 
 Public/Stable.  We should graduate all the getters to Public/Stable.





[jira] [Commented] (YARN-1552) Make all GetApplicationsRequest getter APIs Public/Stable

2013-12-30 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859178#comment-13859178
 ] 

Hitesh Shah commented on YARN-1552:
---

[~kkambatl] [~zjshen] Not sure that there should be a blocker history server 
task for this. What I meant to imply is that once the history server is 
implemented and merged in, a fresh look should be taken at this api and 
eventually be marked stable once the history server apis also reach stability. 
I doubt the history server apis will be stable when the work is ready to be 
merged into trunk.

 Make all GetApplicationsRequest getter APIs Public/Stable
 -

 Key: YARN-1552
 URL: https://issues.apache.org/jira/browse/YARN-1552
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api
Affects Versions: 2.2.0
Reporter: Karthik Kambatla

 GetApplicationsRequest is used to fetch applications from the RM. Currently, 
 all setters and some getters are Private/Unstable and some getters are 
 Public/Stable.  We should graduate all the getters to Public/Stable.





[jira] [Updated] (YARN-1136) Replace junit.framework.Assert with org.junit.Assert

2013-12-30 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated YARN-1136:
--

Attachment: yarn1136.patch

 Replace junit.framework.Assert with org.junit.Assert
 

 Key: YARN-1136
 URL: https://issues.apache.org/jira/browse/YARN-1136
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Karthik Kambatla
Assignee: Chen He
  Labels: newbie, test
 Attachments: yarn1136.patch


 There are several places where we are using junit.framework.Assert instead of 
 org.junit.Assert.
 {code}grep -rn junit.framework.Assert hadoop-yarn-project/ 
 --include=*.java{code} 





[jira] [Commented] (YARN-1552) Make all GetApplicationsRequest getter APIs Public/Stable

2013-12-30 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859180#comment-13859180
 ] 

Karthik Kambatla commented on YARN-1552:


I should have been clearer. I meant to say, this JIRA should depend on (linked 
as - blocked by) one of the HistoryServer API JIRAs, so we know when to look 
into this again.

 Make all GetApplicationsRequest getter APIs Public/Stable
 -

 Key: YARN-1552
 URL: https://issues.apache.org/jira/browse/YARN-1552
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api
Affects Versions: 2.2.0
Reporter: Karthik Kambatla

 GetApplicationsRequest is used to fetch applications from the RM. Currently, 
 all setters and some getters are Private/Unstable and some getters are 
 Public/Stable.  We should graduate all the getters to Public/Stable.





[jira] [Commented] (YARN-1029) Allow embedding leader election into the RM

2013-12-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859196#comment-13859196
 ] 

Hadoop QA commented on YARN-1029:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12620883/yarn-1029-9.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 4 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2764//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/2764//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2764//console

This message is automatically generated.

 Allow embedding leader election into the RM
 ---

 Key: YARN-1029
 URL: https://issues.apache.org/jira/browse/YARN-1029
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla
 Attachments: embedded-zkfc-approach.patch, yarn-1029-0.patch, 
 yarn-1029-0.patch, yarn-1029-1.patch, yarn-1029-2.patch, yarn-1029-3.patch, 
 yarn-1029-4.patch, yarn-1029-5.patch, yarn-1029-6.patch, yarn-1029-7.patch, 
 yarn-1029-7.patch, yarn-1029-8.patch, yarn-1029-9.patch, 
 yarn-1029-approach.patch


 It should be possible to embed the common ActiveStandbyElector into the RM such 
 that ZooKeeper-based leader election and notification are built in. In 
 conjunction with a ZK state store, this configuration will be a simple 
 deployment option.





[jira] [Commented] (YARN-1410) Handle client failover during 2 step client API's like app submission

2013-12-30 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859199#comment-13859199
 ] 

Karthik Kambatla commented on YARN-1410:


Like the idea of calling createApplication() from within submitApplication() if 
the appId is not set in ASC. There should also be a version of 
ASC.newInstance() that doesn't require an applicationId and can be used by the 
clients.

Another bizarre alternative would be to prematurely write the appId to the 
state-store on createApplication() even before an application is submitted. So, 
there could be appIds in the store that don't correspond to any applications - 
which is kind of weird and can lead to the store bloating up and other 
undesirable side-effects.
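
The first idea, having submitApplication() create the appId itself when the context does not carry one, might be sketched as follows. RmSketch and SubmissionContextSketch are hypothetical types invented for this example, and the id format is only assumed to embed the cluster timestamp as described on this JIRA. The sketch shows that a context without an appId can be submitted to whichever RM is active after failover, while a stale id from the old RM is rejected.

```java
/** Hypothetical submission context; applicationId may be left unset (null). */
class SubmissionContextSketch {
  String applicationId;
  final String name;

  SubmissionContextSketch(String name) {
    this.name = name;
  }
}

/** Hypothetical RM that embeds its cluster timestamp in the ids it issues. */
class RmSketch {
  private final long clusterTimestamp;
  private int nextId = 1;

  RmSketch(long clusterTimestamp) {
    this.clusterTimestamp = clusterTimestamp;
  }

  synchronized String createApplication() {
    return "application_" + clusterTimestamp + "_" + String.format("%04d", nextId++);
  }

  /** Allocates an id when the client did not supply one; rejects foreign ids. */
  synchronized String submitApplication(SubmissionContextSketch ctx) {
    if (ctx.applicationId == null) {
      ctx.applicationId = createApplication();
    } else if (!ctx.applicationId.startsWith("application_" + clusterTimestamp + "_")) {
      throw new IllegalArgumentException(
          "appId was not issued by this RM: " + ctx.applicationId);
    }
    return ctx.applicationId;
  }
}
```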

 Handle client failover during 2 step client API's like app submission
 -

 Key: YARN-1410
 URL: https://issues.apache.org/jira/browse/YARN-1410
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Xuan Gong
 Attachments: YARN-1410.1.patch


 App submission involves
 1) creating appId
 2) using that appId to submit an ApplicationSubmissionContext to the RM.
 The client may have obtained an appId from an RM, the RM may have failed 
 over, and the client may submit the app to the new RM.
 Since the new RM has a different notion of cluster timestamp (used to create 
 app id) the new RM may reject the app submission resulting in unexpected 
 failure on the client side.
 The same may happen for other 2 step client API operations.





[jira] [Commented] (YARN-1521) Mark appropriate protocol methods with the idempotent annotation

2013-12-30 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859206#comment-13859206
 ] 

Karthik Kambatla commented on YARN-1521:


I don't know the contracts on each of these APIs well enough (yet) to say 
whether they are idempotent - it would be nice for someone like [~vinodkv] to 
comment on this.

I think allocate(), finishApplicationMaster(), getQueueInfo() are idempotent. 
If not already, get/renewDelegationToken(), getCluster*() should also be (made) 
idempotent. Other items in the list sound okay to me. 
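
How an idempotent marker could drive the retry decision on failover is sketched below. The annotation IdempotentSketch, the interface ClientProtocolSketch, and the helper RetryPolicySketch are all hypothetical and defined locally for the example; they are not Hadoop's own annotation or protocol classes, and the choice of which methods are marked is illustrative only.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

/** Hypothetical marker: calling the method twice has the same effect as once. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface IdempotentSketch {}

/** Hypothetical protocol; the annotations here are for illustration only. */
interface ClientProtocolSketch {
  @IdempotentSketch
  String getQueueInfo(String queue);

  String submitApplication(String name);
}

class RetryPolicySketch {
  /** A call may be blindly retried after failover only if marked idempotent. */
  static boolean mayRetryOnFailover(Class<?> protocol, String methodName) {
    for (Method method : protocol.getMethods()) {
      if (method.getName().equals(methodName)) {
        return method.isAnnotationPresent(IdempotentSketch.class);
      }
    }
    throw new IllegalArgumentException("no such method: " + methodName);
  }
}
```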

 Mark appropriate protocol methods with the idempotent annotation
 

 Key: YARN-1521
 URL: https://issues.apache.org/jira/browse/YARN-1521
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong

 After YARN-1028, automatic failover was added to RMProxy. This JIRA is 
 to identify whether we need to add the idempotent annotation and which methods 
 can be marked as idempotent.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (YARN-1553) Do not use HttpConfig.isSecure() in YARN

2013-12-30 Thread Haohui Mai (JIRA)
Haohui Mai created YARN-1553:


 Summary: Do not use HttpConfig.isSecure() in YARN
 Key: YARN-1553
 URL: https://issues.apache.org/jira/browse/YARN-1553
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Haohui Mai


HDFS-5305 and related JIRAs decided that each individual project will have its 
own configuration for HTTP policy. {{HttpConfig.isSecure}} is a global static 
method, which no longer fits that design. The same functionality should be 
moved into the YARN code base.






[jira] [Updated] (YARN-1553) Do not use HttpConfig.isSecure() in YARN

2013-12-30 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated YARN-1553:
-

Attachment: YARN-1553.000.patch

 Do not use HttpConfig.isSecure() in YARN
 

 Key: YARN-1553
 URL: https://issues.apache.org/jira/browse/YARN-1553
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Haohui Mai
 Attachments: YARN-1553.000.patch


 HDFS-5305 and related JIRAs decided that each individual project will have 
 its own configuration for HTTP policy. {{HttpConfig.isSecure}} is a global 
 static method, which no longer fits that design. The same functionality 
 should be moved into the YARN code base.





[jira] [Updated] (YARN-1549) TestUnmanagedAMLauncher#testDSShell fails in trunk

2013-12-30 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-1549:
--

Attachment: YARN-1549.1.patch

Same patch with better formatting and the logic reordered for readability.

Will check it in if Jenkins says okay.

 TestUnmanagedAMLauncher#testDSShell fails in trunk
 --

 Key: YARN-1549
 URL: https://issues.apache.org/jira/browse/YARN-1549
 Project: Hadoop YARN
  Issue Type: Test
Affects Versions: 2.2.0
Reporter: Ted Yu
Assignee: haosdent
 Attachments: YARN-1549.1.patch, YARN-1549.patch


 The following error is reproducible:
 {code}
 testDSShell(org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher)  Time elapsed: 14.911 sec  <<< ERROR!
 java.lang.RuntimeException: Failed to receive final expected state in ApplicationReport, CurrentState=RUNNING, ExpectedStates=FINISHED,FAILED,KILLED
   at org.apache.hadoop.yarn.applications.unmanagedamlauncher.UnmanagedAMLauncher.monitorApplication(UnmanagedAMLauncher.java:447)
   at org.apache.hadoop.yarn.applications.unmanagedamlauncher.UnmanagedAMLauncher.run(UnmanagedAMLauncher.java:352)
   at org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher.testDSShell(TestUnmanagedAMLauncher.java:147)
 {code}
 See https://builds.apache.org/job/Hadoop-Yarn-trunk/435





[jira] [Commented] (YARN-1121) RMStateStore should flush all pending store events before closing

2013-12-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859227#comment-13859227
 ] 

Hudson commented on YARN-1121:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4941 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4941/])
YARN-1121. Addendum patch. Fixed AsyncDispatcher hang issue during stop due to 
a race condition caused by the previous patch. Contributed by Jian He. 
(vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1554344)
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java


 RMStateStore should flush all pending store events before closing
 -

 Key: YARN-1121
 URL: https://issues.apache.org/jira/browse/YARN-1121
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Bikas Saha
Assignee: Jian He
 Fix For: 2.4.0

 Attachments: YARN-1121.1.patch, YARN-1121.10.patch, 
 YARN-1121.11.patch, YARN-1121.2.patch, YARN-1121.2.patch, YARN-1121.3.patch, 
 YARN-1121.4.patch, YARN-1121.5.patch, YARN-1121.6.patch, YARN-1121.6.patch, 
 YARN-1121.7.patch, YARN-1121.8.patch, YARN-1121.9.patch


 on serviceStop it should wait for all internal pending events to drain before 
 stopping.





[jira] [Commented] (YARN-1492) truly shared cache for jars (jobjar/libjar)

2013-12-30 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859231#comment-13859231
 ] 

Karthik Kambatla commented on YARN-1492:


Comments on the design:
# In the client protocol, if a cleaner instance (or run) starts after R2 and 
before R2', the client wouldn't know of this cleaner's existence. 
# Dangling cleaner locks: Using ZK here would probably make it easier to handle 
these dangling locks. If the Cleaner crashes, the corresponding connection to 
ZK is severed, and all locks are automatically cleaned up (if using ephemeral 
nodes). As others have mentioned earlier, I think it is okay to assume one ZK 
quorum running. For instance, RM HA requires this. 
# We should probably mandate running CleanerService if shared-cache is enabled, 
and it should run periodically as part of the RM.

 truly shared cache for jars (jobjar/libjar)
 ---

 Key: YARN-1492
 URL: https://issues.apache.org/jira/browse/YARN-1492
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.0.4-alpha
Reporter: Sangjin Lee
Assignee: Sangjin Lee
 Attachments: shared_cache_design.pdf, shared_cache_design_v2.pdf, 
 shared_cache_design_v3.pdf, shared_cache_design_v4.pdf


 Currently there is the distributed cache that enables you to cache jars and 
 files so that attempts from the same job can reuse them. However, sharing is 
 limited with the distributed cache because it is normally on a per-job basis. 
 On a large cluster, sometimes copying of jobjars and libjars becomes so 
 prevalent that it consumes a large portion of the network bandwidth, not to 
 speak of defeating the purpose of bringing compute to where data is. This 
 is wasteful because in most cases code doesn't change much across many jobs.
 I'd like to propose and discuss feasibility of introducing a truly shared 
 cache so that multiple jobs from multiple users can share and cache jars. 
 This JIRA is to open the discussion.





[jira] [Commented] (YARN-1549) TestUnmanagedAMLauncher#testDSShell fails in trunk

2013-12-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859236#comment-13859236
 ] 

Hadoop QA commented on YARN-1549:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12620892/YARN-1549.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2765//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2765//console

This message is automatically generated.

 TestUnmanagedAMLauncher#testDSShell fails in trunk
 --

 Key: YARN-1549
 URL: https://issues.apache.org/jira/browse/YARN-1549
 Project: Hadoop YARN
  Issue Type: Test
Affects Versions: 2.2.0
Reporter: Ted Yu
Assignee: haosdent
 Attachments: YARN-1549.1.patch, YARN-1549.patch


 The following error is reproducible:
 {code}
 testDSShell(org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher)
   Time elapsed: 14.911 sec   ERROR!
 java.lang.RuntimeException: Failed to receive final expected state in 
 ApplicationReport, CurrentState=RUNNING, ExpectedStates=FINISHED,FAILED,KILLED
   at 
 org.apache.hadoop.yarn.applications.unmanagedamlauncher.UnmanagedAMLauncher.monitorApplication(UnmanagedAMLauncher.java:447)
   at 
 org.apache.hadoop.yarn.applications.unmanagedamlauncher.UnmanagedAMLauncher.run(UnmanagedAMLauncher.java:352)
   at 
 org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher.testDSShell(TestUnmanagedAMLauncher.java:147)
 {code}
 See https://builds.apache.org/job/Hadoop-Yarn-trunk/435





[jira] [Commented] (YARN-1549) TestUnmanagedAMLauncher#testDSShell fails in trunk

2013-12-30 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859276#comment-13859276
 ] 

haosdent commented on YARN-1549:


[~vinodkv] Thank you very much for your review.






[jira] [Commented] (YARN-1408) Preemption caused Invalid State Event: ACQUIRED at KILLED and caused a task timeout for 30mins

2013-12-30 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859336#comment-13859336
 ] 

Sunil G commented on YARN-1408:
---


Hi Devaraj,

I have made the changes as per your comments.

1. Need to handle the invalid transition, and during the transition the 
container needs to be removed from ContainerAllocationExpirer to avoid the 
timeout.
[Sunil]: Once the extra preempted container is removed from 
newlyAllocatedContainers, the invalid transition no longer occurs. 
When the next heartbeat comes, this container is no longer in 
newlyAllocatedContainers, so the ACQUIRED event is not fired at it.

2. The patch removes the container from newlyAllocatedContainers by 
iterating, checking and then removing. It can be removed directly using 
java.util.List.remove(Object o) instead.
[Sunil]: Yes, I changed it to remove the container directly from the list.

3. Can you also add a test to demonstrate this case?
[Sunil]: The change only removes an element from newlyAllocatedContainers; 
no new functions were added. For now, verification was done by manual testing 
to confirm that the removal is performed.
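
The idea in points 1 and 2 can be sketched with a simplified model. The names
newlyAllocatedContainers, ACQUIRED and KILLED come from the discussion above;
everything else (the class, the methods, the three-state enum) is a
hypothetical stand-in and not the actual RM scheduler code or the patch.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Simplified model of the fix discussed above. Hypothetical sketch only,
 * not the ResourceManager implementation.
 */
class PreemptionRemovalSketch {
    enum State { ALLOCATED, ACQUIRED, KILLED }

    static final class Container {
        final String id;
        State state = State.ALLOCATED;
        Container(String id) { this.id = id; }

        /** In this model ACQUIRED is only valid from ALLOCATED. */
        void acquire() {
            if (state != State.ALLOCATED) {
                throw new IllegalStateException("Invalid event: ACQUIRED at " + state);
            }
            state = State.ACQUIRED;
        }
    }

    final List<Container> newlyAllocatedContainers = new ArrayList<>();

    Container allocate(String id) {
        Container c = new Container(id);
        newlyAllocatedContainers.add(c);
        return c;
    }

    /** Preempt: mark KILLED and drop the container from the pending list. */
    boolean preempt(Container c) {
        c.state = State.KILLED;
        // List.remove(Object) removes the first equal element; no manual iteration.
        return newlyAllocatedContainers.remove(c);
    }

    /** AM heartbeat: hand over pending containers, firing ACQUIRED on each. */
    List<String> pullNewlyAllocated() {
        List<String> handedOver = new ArrayList<>();
        for (Container c : newlyAllocatedContainers) {
            c.acquire();
            handedOver.add(c.id);
        }
        newlyAllocatedContainers.clear();
        return handedOver;
    }

    public static void main(String[] args) {
        PreemptionRemovalSketch app = new PreemptionRemovalSketch();
        app.allocate("container_1");
        Container extra = app.allocate("container_2");
        app.preempt(extra);
        System.out.println(app.pullNewlyAllocated());
    }
}
```

If preempt() only marked the container KILLED and left it in the list, the
next pullNewlyAllocated() call in this model would throw on the killed
container, which mirrors the "ACQUIRED at KILLED" error in the report.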


 Preemption caused Invalid State Event: ACQUIRED at KILLED and caused a task 
 timeout for 30mins
 --

 Key: YARN-1408
 URL: https://issues.apache.org/jira/browse/YARN-1408
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Sunil G
 Fix For: 2.2.0

 Attachments: Yarn-1408.1.patch, Yarn-1408.2.patch, Yarn-1408.patch


 Capacity preemption is enabled as follows.
  *  yarn.resourcemanager.scheduler.monitor.enable=true
  *  yarn.resourcemanager.scheduler.monitor.policies=org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy
 Queue = a,b
 Capacity of Queue A = 80%
 Capacity of Queue B = 20%
 Step 1: Submit a big jobA to queue a which uses the full cluster capacity
 Step 2: Submit a jobB to queue b which would use less than 20% of cluster 
 capacity
 The jobA task which uses queue b capacity is preempted and killed.
 This caused the problem below:
 1. A new container got allocated for jobA in Queue A as per a node update 
 from an NM.
 2. This container was preempted immediately by the preemption policy.
 An "ACQUIRED at KILLED" invalid state exception occurred when the next AM 
 heartbeat reached the RM.
 ERROR 
 org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
 Can't handle this event at current state
 org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
 ACQUIRED at KILLED
 This also caused the task to time out after 30 minutes, as this container 
 was already killed by preemption.
 attempt_1380289782418_0003_m_00_0 Timed out after 1800 secs





[jira] [Updated] (YARN-1408) Preemption caused Invalid State Event: ACQUIRED at KILLED and caused a task timeout for 30mins

2013-12-30 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-1408:
--

Attachment: Yarn-1408.2.patch

Updated as per comments



