[jira] [Commented] (YARN-1531) Update yarn command document

2014-01-30 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13886410#comment-13886410
 ] 

Akira AJISAKA commented on YARN-1531:
-------------------------------------

[~kkambatl], would you please review the v2 patch?

> Update yarn command document
> ----------------------------
>
> Key: YARN-1531
> URL: https://issues.apache.org/jira/browse/YARN-1531
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Reporter: Akira AJISAKA
>Assignee: Akira AJISAKA
>  Labels: documentaion
> Attachments: YARN-1531.2.patch, YARN-1531.patch
>
>
> There are some options that are not documented in the YARN commands document.
> For example, the "yarn rmadmin" command options are as follows:
> {code}
>  Usage: yarn rmadmin
>-refreshQueues 
>-refreshNodes 
>-refreshSuperUserGroupsConfiguration 
>-refreshUserToGroupsMappings 
>-refreshAdminAcls 
>-refreshServiceAcl 
>-getGroups [username]
>-help [cmd]
>-transitionToActive 
>-transitionToStandby 
>-failover [--forcefence] [--forceactive]  
>-getServiceState 
>-checkHealth 
> {code}
> But some of the new options, such as "-getGroups", "-transitionToActive", and 
> "-transitionToStandby", are not documented.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (YARN-1618) Fix invalid RMApp transition from NEW to FINAL_SAVING

2014-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13886496#comment-13886496
 ] 

Hudson commented on YARN-1618:
------------------------------

SUCCESS: Integrated in Hadoop-Yarn-trunk #466 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/466/])
YARN-1618. Fix invalid RMApp transition from NEW to FINAL_SAVING (kasha) 
(kasha: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1562529)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppEventType.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java


> Fix invalid RMApp transition from NEW to FINAL_SAVING
> -----------------------------------------------------
>
> Key: YARN-1618
> URL: https://issues.apache.org/jira/browse/YARN-1618
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.2.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>Priority: Blocker
> Fix For: 2.3.0
>
> Attachments: yarn-1618-1.patch, yarn-1618-2.patch, yarn-1618-3.patch, 
> yarn-1618-branch-2.3.patch
>
>
> YARN-891 augments the RMStateStore to store information on completed 
> applications. In the process, it adds transitions from NEW to FINAL_SAVING. 
> This leads to the RM trying to update entries in the state-store that do not 
> exist. On ZKRMStateStore, this leads to the RM crashing. 
> Previous description:
> ZKRMStateStore fails to handle updates to znodes that don't exist. For 
> instance, this can happen when an app transitions from NEW to FINAL_SAVING. 
> In these cases, the store should create the missing znode and handle the 
> update.





[jira] [Commented] (YARN-1600) RM does not startup when security is enabled without spnego configured

2014-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13886500#comment-13886500
 ] 

Hudson commented on YARN-1600:
------------------------------

SUCCESS: Integrated in Hadoop-Yarn-trunk #466 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/466/])
YARN-1600. RM does not startup when security is enabled without spnego 
configured. Contributed by Haohui Mai (jlowe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1562482)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/WebApps.java


> RM does not startup when security is enabled without spnego configured
> ----------------------------------------------------------------------
>
> Key: YARN-1600
> URL: https://issues.apache.org/jira/browse/YARN-1600
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Jason Lowe
>Assignee: Haohui Mai
>Priority: Blocker
> Fix For: 3.0.0, 2.3.0
>
> Attachments: YARN-1600.000.patch
>
>
> We have a custom auth filter in front of our various UI pages that handles 
> user authentication.  However, the RM currently assumes that if security is 
> enabled, the user must have configured SPNEGO for the RM web pages as well, 
> which is not true in our case.





[jira] [Created] (YARN-1677) Potential bugs in exception handlers

2014-01-30 Thread Ding Yuan (JIRA)
Ding Yuan created YARN-1677:
----------------------------

 Summary: Potential bugs in exception handlers
 Key: YARN-1677
 URL: https://issues.apache.org/jira/browse/YARN-1677
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.2.0
Reporter: Ding Yuan


Hi YARN developers,
We are a group of researchers working on software reliability. We recently did 
a study and found that the majority of the most severe failures in Hadoop are 
caused by bugs in exception-handling logic, so we built a simple checking tool 
that automatically detects bug patterns that have caused some very severe 
failures. I am reporting some of the results for YARN here. Any feedback is 
much appreciated!

==
Case 1:
Line: 551, File: 
"org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java"

{noformat}
switch (monitoringEvent.getType()) {
case START_MONITORING_CONTAINER:
  .. ..
default:
  // TODO: Wrong event.
}
{noformat}

The default case (which should handle any unexpected event) is empty. 
Should we at least print an error message here?
==
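A minimal sketch of what the suggestion could look like. The enum values and method names here are made up for illustration, not the actual ContainersMonitorImpl types:

```java
// Sketch of a non-empty default branch; all names are illustrative only.
public class MonitorSketch {
    enum EventType { START_MONITORING_CONTAINER, STOP_MONITORING_CONTAINER }

    // Returns a description of how the event was handled, so an
    // unexpected event is surfaced instead of silently dropped.
    static String handleEvent(EventType type) {
        switch (type) {
            case START_MONITORING_CONTAINER:
                return "started monitoring";
            default:
                // Instead of an empty default, report the unexpected event.
                return "WARN: unexpected event type " + type;
        }
    }
}
```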

==
Case 2:
  Line: 491, File: 
"org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java"
{noformat}
  } catch (Throwable e) {

// TODO Better error handling. Thread can die with the rest of the
// NM still running.
LOG.error("Caught exception in status-updater", e);
  }
{noformat}

The handler of this very general exception only logs the error. The TODO seems 
to indicate it is not sufficient.
==
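For what it's worth, one common alternative is to let the failure reach an uncaught-exception handler that can react (for example, stop the service) rather than only logging inside catch (Throwable). A minimal sketch with made-up names, not the actual NodeStatusUpdaterImpl code:

```java
// Sketch: a failing updater thread whose death is observed by an
// uncaught-exception handler instead of being swallowed by a catch block.
public class UpdaterSketch {
    // Runs a thread that always fails; returns true if the handler saw it.
    static boolean runFailingUpdater() {
        final boolean[] seen = { false };
        Thread updater = new Thread(() -> {
            throw new RuntimeException("status update failed");
        });
        // The handler gets a chance to react, e.g. shut the service down.
        updater.setUncaughtExceptionHandler((t, e) -> seen[0] = true);
        updater.start();
        try {
            updater.join();  // join() happens-after the handler has run
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // don't lose the interrupt
        }
        return seen[0];
    }
}
```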

==
Case 3:
Line: 861, File: 
"org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java"

{noformat}
for (LocalResourceStatus stat : remoteResourceStatuses) {
  LocalResource rsrc = stat.getResource();
  LocalResourceRequest req = null;
  try {
    req = new LocalResourceRequest(rsrc);
  } catch (URISyntaxException e) {
    // TODO fail? Already translated several times...
  }
{noformat}

The handler for URISyntaxException is empty, and the TODO seems to indicate it 
is not sufficient.
The same code pattern can also be found at:
Line: 901, File: 
"org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java"
Line: 838, File: 
"org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java"
Line: 878, File: 
"org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java"
At line: 803, File: 
org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java, 
the handler of URISyntaxException also seems insufficient:
{noformat}
try {
  shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI(
      shellScriptPath)));
} catch (URISyntaxException e) {
  LOG.error("Error when trying to use shell script path specified"
  + " in env, path=" + shellScriptPath);
  e.printStackTrace();

  // A failure scenario on bad input such as invalid shell script path
  // We know we cannot continue launching the container
  // so we should release it.
  // TODO
  numCompletedContainers.incrementAndGet();
  numFailedContainers.incrementAndGet();
  return;
}
{noformat}
==
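A sketch of one option the TODO hints at, with hypothetical names: turn the bad URI into a failure the caller must handle, so localization cannot silently continue with a null request:

```java
import java.net.URI;
import java.net.URISyntaxException;

// Sketch: wrap the checked URISyntaxException so the caller cannot
// silently continue with a null request. Names are illustrative.
public class RequestSketch {
    static URI parseOrFail(String raw) {
        try {
            return new URI(raw);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Invalid resource URI: " + raw, e);
        }
    }
}
```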

==
Case 4:
Line: 627, File: 
"org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java"

{noformat}
  try {
/* keep the master in sync with the state machine */
this.stateMachine.doTransition(event.getType(), event);
  } catch (InvalidStateTransitonException e) {
LOG.error("Can't handle this event at current state", e);
/* TODO fail the application on the failed transition */
  }
{noformat}

The handler of this exception only logs the error. The TODO seems to indicate 
it is not sufficient.

This exact same code pattern can also be found at:
Line: 573, File: 
"org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java"
==
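A minimal sketch of the TODO's intent. The state machine here is invented and far simpler than RMAppAttemptImpl's: on an invalid transition, move to FAILED rather than only logging:

```java
// Sketch: an invalid transition fails the application instead of being
// logged and ignored. All names are illustrative.
public class TransitionSketch {
    enum State { NEW, RUNNING, FAILED }

    static class InvalidTransitionException extends Exception {
        InvalidTransitionException(String msg) { super(msg); }
    }

    static State doTransition(State current, String event)
            throws InvalidTransitionException {
        if (current == State.NEW && "START".equals(event)) {
            return State.RUNNING;
        }
        throw new InvalidTransitionException(event + " at " + current);
    }

    static State handle(State current, String event) {
        try {
            return doTransition(current, event);
        } catch (InvalidTransitionException e) {
            // Fail the application on the failed transition.
            return State.FAILED;
        }
    }
}
```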

==
Case 5: empty handler for exception: java.lang.InterruptedException
  Line: 123, File: "org/apache/hadoop/yarn/server/webproxy/WebAppProxy.java"

{noformat}
  public void join() {
if(proxyServer != null) {
  try {
proxyServer.join();
  } catch (InterruptedException e) {
  }
}
  }
{noformat}

The InterruptedException is completely ignored. As a result, any events causing 
this interrupt will be lost.

More info on why InterruptedException shouldn't be ignored: 
http://stackoverflow.com/questions/1087475/when-does-javas-thread-sleep-throw-interruptedexception
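The standard fix, per the linked discussion, is to restore the thread's interrupt status so the interrupt is not lost. A sketch under that assumption:

```java
// Sketch: re-assert the interrupt instead of swallowing it, so code
// further up the stack can still observe that an interrupt happened.
public class JoinSketch {
    static void joinQuietly(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            // Restore the interrupt status rather than ignoring the event.
            Thread.currentThread().interrupt();
        }
    }
}
```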

[jira] [Updated] (YARN-1677) Potential bugs in exception handlers

2014-01-30 Thread Ding Yuan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ding Yuan updated YARN-1677:


Description: 
Hi YARN developers,
We are a group of researchers working on software reliability. We recently did 
a study and found that the majority of the most severe failures in Hadoop are 
caused by bugs in exception-handling logic, so we built a simple checking tool 
that automatically detects bug patterns that have caused some very severe 
failures. I am reporting some of the results for YARN here. Any feedback is 
much appreciated!

==
Case 1:
Line: 551, File: 
"org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java"

{noformat}
switch (monitoringEvent.getType()) {
case START_MONITORING_CONTAINER:
  .. ..
default:
  // TODO: Wrong event.
}
{noformat}

The default case (which should handle any unexpected event) is empty. 
Should we at least print an error message here?
==

==
Case 2:
  Line: 491, File: 
"org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java"
{noformat}
  } catch (Throwable e) {

// TODO Better error handling. Thread can die with the rest of the
// NM still running.
LOG.error("Caught exception in status-updater", e);
  }
{noformat}

The handler of this very general exception only logs the error. The TODO seems 
to indicate it is not sufficient.
==

==
Case 3:
Line: 861, File: 
"org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java"

{noformat}
for (LocalResourceStatus stat : remoteResourceStatuses) {
  LocalResource rsrc = stat.getResource();
  LocalResourceRequest req = null;
  try {
    req = new LocalResourceRequest(rsrc);
  } catch (URISyntaxException e) {
    // TODO fail? Already translated several times...
  }
{noformat}

The handler for URISyntaxException is empty, and the TODO seems to indicate it 
is not sufficient.
The same code pattern can also be found at:
Line: 901, File: 
"org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java"
Line: 838, File: 
"org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java"
Line: 878, File: 
"org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java"
At line: 803, File: 
org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java, 
the handler of URISyntaxException also seems insufficient:
{noformat}
try {
  shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI(
      shellScriptPath)));
} catch (URISyntaxException e) {
  LOG.error("Error when trying to use shell script path specified"
  + " in env, path=" + shellScriptPath);
  e.printStackTrace();

  // A failure scenario on bad input such as invalid shell script path
  // We know we cannot continue launching the container
  // so we should release it.
  // TODO
  numCompletedContainers.incrementAndGet();
  numFailedContainers.incrementAndGet();
  return;
}
{noformat}
==

==
Case 4:
Line: 627, File: 
"org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java"

{noformat}
  try {
/* keep the master in sync with the state machine */
this.stateMachine.doTransition(event.getType(), event);
  } catch (InvalidStateTransitonException e) {
LOG.error("Can't handle this event at current state", e);
/* TODO fail the application on the failed transition */
  }
{noformat}

The handler of this exception only logs the error. The TODO seems to indicate 
it is not sufficient.

This exact same code pattern can also be found at:
Line: 573, File: 
"org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java"
==

==
Case 5: empty handler for exception: java.lang.InterruptedException
  Line: 123, File: "org/apache/hadoop/yarn/server/webproxy/WebAppProxy.java"

{noformat}
  public void join() {
if(proxyServer != null) {
  try {
proxyServer.join();
  } catch (InterruptedException e) {
  }
}
  }
{noformat}

The InterruptedException is completely ignored. As a result, any events causing 
this interrupt will be lost.

More info on why InterruptedException shouldn't be ignored: 
http://stackoverflow.com/questions/1087475/when-does-javas-thread-sleep-throw-interruptedexception

This pattern of handling InterruptedException can be found in a few other 
places:
Line: 434, File: 
org/apache/hadoop/yarn/server/resourcemanager/ResourceM


[jira] [Commented] (YARN-1600) RM does not startup when security is enabled without spnego configured

2014-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13886566#comment-13886566
 ] 

Hudson commented on YARN-1600:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1683 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1683/])
YARN-1600. RM does not startup when security is enabled without spnego 
configured. Contributed by Haohui Mai (jlowe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1562482)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/WebApps.java


> RM does not startup when security is enabled without spnego configured
> ----------------------------------------------------------------------
>
> Key: YARN-1600
> URL: https://issues.apache.org/jira/browse/YARN-1600
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Jason Lowe
>Assignee: Haohui Mai
>Priority: Blocker
> Fix For: 3.0.0, 2.3.0
>
> Attachments: YARN-1600.000.patch
>
>
> We have a custom auth filter in front of our various UI pages that handles 
> user authentication.  However, the RM currently assumes that if security is 
> enabled, the user must have configured SPNEGO for the RM web pages as well, 
> which is not true in our case.





[jira] [Commented] (YARN-1618) Fix invalid RMApp transition from NEW to FINAL_SAVING

2014-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13886562#comment-13886562
 ] 

Hudson commented on YARN-1618:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1683 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1683/])
YARN-1618. Fix invalid RMApp transition from NEW to FINAL_SAVING (kasha) 
(kasha: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1562529)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppEventType.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java


> Fix invalid RMApp transition from NEW to FINAL_SAVING
> -----------------------------------------------------
>
> Key: YARN-1618
> URL: https://issues.apache.org/jira/browse/YARN-1618
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.2.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>Priority: Blocker
> Fix For: 2.3.0
>
> Attachments: yarn-1618-1.patch, yarn-1618-2.patch, yarn-1618-3.patch, 
> yarn-1618-branch-2.3.patch
>
>
> YARN-891 augments the RMStateStore to store information on completed 
> applications. In the process, it adds transitions from NEW to FINAL_SAVING. 
> This leads to the RM trying to update entries in the state-store that do not 
> exist. On ZKRMStateStore, this leads to the RM crashing. 
> Previous description:
> ZKRMStateStore fails to handle updates to znodes that don't exist. For 
> instance, this can happen when an app transitions from NEW to FINAL_SAVING. 
> In these cases, the store should create the missing znode and handle the 
> update.





[jira] [Commented] (YARN-1600) RM does not startup when security is enabled without spnego configured

2014-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13886584#comment-13886584
 ] 

Hudson commented on YARN-1600:
------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1658 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1658/])
YARN-1600. RM does not startup when security is enabled without spnego 
configured. Contributed by Haohui Mai (jlowe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1562482)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/WebApps.java


> RM does not startup when security is enabled without spnego configured
> ----------------------------------------------------------------------
>
> Key: YARN-1600
> URL: https://issues.apache.org/jira/browse/YARN-1600
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Jason Lowe
>Assignee: Haohui Mai
>Priority: Blocker
> Fix For: 3.0.0, 2.3.0
>
> Attachments: YARN-1600.000.patch
>
>
> We have a custom auth filter in front of our various UI pages that handles 
> user authentication.  However, the RM currently assumes that if security is 
> enabled, the user must have configured SPNEGO for the RM web pages as well, 
> which is not true in our case.





[jira] [Commented] (YARN-1618) Fix invalid RMApp transition from NEW to FINAL_SAVING

2014-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13886580#comment-13886580
 ] 

Hudson commented on YARN-1618:
------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1658 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1658/])
YARN-1618. Fix invalid RMApp transition from NEW to FINAL_SAVING (kasha) 
(kasha: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1562529)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppEventType.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java


> Fix invalid RMApp transition from NEW to FINAL_SAVING
> -----------------------------------------------------
>
> Key: YARN-1618
> URL: https://issues.apache.org/jira/browse/YARN-1618
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.2.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>Priority: Blocker
> Fix For: 2.3.0
>
> Attachments: yarn-1618-1.patch, yarn-1618-2.patch, yarn-1618-3.patch, 
> yarn-1618-branch-2.3.patch
>
>
> YARN-891 augments the RMStateStore to store information on completed 
> applications. In the process, it adds transitions from NEW to FINAL_SAVING. 
> This leads to the RM trying to update entries in the state-store that do not 
> exist. On ZKRMStateStore, this leads to the RM crashing. 
> Previous description:
> ZKRMStateStore fails to handle updates to znodes that don't exist. For 
> instance, this can happen when an app transitions from NEW to FINAL_SAVING. 
> In these cases, the store should create the missing znode and handle the 
> update.





[jira] [Updated] (YARN-1670) aggregated log writer can write more log data than it says is the log length

2014-01-30 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated YARN-1670:


Priority: Critical  (was: Major)

> aggregated log writer can write more log data than it says is the log length
> ----------------------------------------------------------------------------
>
> Key: YARN-1670
> URL: https://issues.apache.org/jira/browse/YARN-1670
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 0.23.10, 2.2.0
>Reporter: Thomas Graves
>Priority: Critical
>
> We have seen exceptions when using 'yarn logs' to read log files:
> {noformat}
> at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> at java.lang.Long.parseLong(Long.java:441)
> at java.lang.Long.parseLong(Long.java:483)
> at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readAContainerLogsForALogType(AggregatedLogFormat.java:518)
> at org.apache.hadoop.yarn.logaggregation.LogDumper.dumpAContainerLogs(LogDumper.java:178)
> at org.apache.hadoop.yarn.logaggregation.LogDumper.run(LogDumper.java:130)
> at org.apache.hadoop.yarn.logaggregation.LogDumper.main(LogDumper.java:246)
> {noformat}
> We traced it down to the reader trying to read the file type of the next file, 
> but what it reads is still log data from the previous file.  What happened 
> was that the Log Length was written as a certain size, but the log data was 
> actually longer than that.
> Inside the write() routine in LogValue, it first writes what the logfile 
> length is, but when it goes to write the log itself it just copies to the 
> end of the file.  There is a race condition here: if someone is still 
> writing to the file when it is aggregated, the length written could be 
> too small.
> We should have the write() routine stop once it has written the length it 
> declared.  It would be nice if we could somehow tell the user the log might 
> be truncated, but I'm not sure of a good way to do this.
> We also noticed a bug in readAContainerLogsForALogType, where it uses an 
> int for curRead whereas it should be using a long:
> {noformat}
>   while (len != -1 && curRead < fileLength) {
> {noformat}
> This isn't actually a problem right now, as it looks like the underlying 
> decoder is doing the right thing and the len condition exits.





[jira] [Commented] (YARN-1461) RM API and RM changes to handle tags for running jobs

2014-01-30 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13886847#comment-13886847
 ] 

Karthik Kambatla commented on YARN-1461:


Thanks for the review, [~zjshen]. Sorry for the delay in following up on the 
comments. 

bq. The previous pattern of defining enum in proto is to have non proto 
corresponding enum, and map them one-to-one. It avoid using proto object in 
GetApplicationsRequest.
I am sorry, I didn't quite get that. Are you suggesting not having methods to 
get and set Scope in GetApplicationsRequest? If yes, how do you propose we 
allow users to set the Scope to the non-default value? 

> RM API and RM changes to handle tags for running jobs
> -----------------------------------------------------
>
> Key: YARN-1461
> URL: https://issues.apache.org/jira/browse/YARN-1461
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.2.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-1461-1.patch, yarn-1461-2.patch, yarn-1461-3.patch, 
> yarn-1461-4.patch, yarn-1461-5.patch, yarn-1461-6.patch, yarn-1461-6.patch, 
> yarn-1461-7.patch, yarn-1461-8.patch
>
>






[jira] [Updated] (YARN-1669) Make admin refreshServiceAcls work across RM failover

2014-01-30 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1669:


Attachment: YARN-1669.1.patch

Created the patch based on YARN-1611 for the admin refreshServiceAcls changes.

> Make admin refreshServiceAcls work across RM failover
> -----------------------------------------------------
>
> Key: YARN-1669
> URL: https://issues.apache.org/jira/browse/YARN-1669
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-1669.1.patch
>
>






[jira] [Updated] (YARN-1504) RM changes for moving apps between queues

2014-01-30 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1504:
-----------------------------

Attachment: YARN-1504-1.patch

> RM changes for moving apps between queues
> -----------------------------------------
>
> Key: YARN-1504
> URL: https://issues.apache.org/jira/browse/YARN-1504
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.2.0
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1504-1.patch, YARN-1504.patch
>
>






[jira] [Commented] (YARN-1504) RM changes for moving apps between queues

2014-01-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13886921#comment-13886921
 ] 

Sandy Ryza commented on YARN-1504:
----------------------------------

Attached a patch that addresses Karthik's comments.  Regarding the tests, most 
of the error cases from ClientRMService#move*() were covered in 
TestMoveApplication.  The updated patch covers the one that was missing: 
checking permissions.

> RM changes for moving apps between queues
> -
>
> Key: YARN-1504
> URL: https://issues.apache.org/jira/browse/YARN-1504
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.2.0
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1504-1.patch, YARN-1504.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (YARN-1530) [Umbrella] Store, manage and serve per-framework application-timeline data

2014-01-30 Thread Billie Rinaldi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Billie Rinaldi updated YARN-1530:
-

Attachment: application timeline design-20140130.pdf

> [Umbrella] Store, manage and serve per-framework application-timeline data
> --
>
> Key: YARN-1530
> URL: https://issues.apache.org/jira/browse/YARN-1530
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
> Attachments: application timeline design-20140108.pdf, application 
> timeline design-20140116.pdf, application timeline design-20140130.pdf
>
>
> This is a sibling JIRA for YARN-321.
> Today, each application/framework has to store and serve per-framework 
> data all by itself as YARN doesn't have a common solution. This JIRA attempts 
> to solve the storage, management and serving of per-framework data from 
> various applications, both running and finished. The aim is to change YARN to 
> collect and store data in a generic manner with plugin points for frameworks 
> to do their own thing w.r.t. interpretation and serving.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (YARN-1461) RM API and RM changes to handle tags for running jobs

2014-01-30 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1461:
---

Attachment: yarn-1461-9.patch

Patch that sets the default scope to ALL, and addresses [~zjshen]'s review 
comments. 

> RM API and RM changes to handle tags for running jobs
> -
>
> Key: YARN-1461
> URL: https://issues.apache.org/jira/browse/YARN-1461
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.2.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-1461-1.patch, yarn-1461-2.patch, yarn-1461-3.patch, 
> yarn-1461-4.patch, yarn-1461-5.patch, yarn-1461-6.patch, yarn-1461-6.patch, 
> yarn-1461-7.patch, yarn-1461-8.patch, yarn-1461-9.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (YARN-1634) Define an in-memory implementation of ApplicationTimelineStore

2014-01-30 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-1634:
--

Attachment: YARN-1634.1.patch

Uploaded a patch of the in-memory implementation of ApplicationTimelineStore with 
the test cases available.
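The pluggable-store-plus-in-memory-test-implementation pattern described here can be sketched as below. All names (TimelineStore, putEntity, getEntities) are hypothetical simplifications for illustration; the actual ApplicationTimelineStore interface is being defined in YARN-1659 and is far richer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, heavily simplified store interface (not the real YARN-1659 API).
interface TimelineStore {
    void putEntity(String entityId, String entityType);
    List<String> getEntities(String entityType);
}

// In-memory implementation of the pluggable store, suitable for unit tests.
class MemoryTimelineStore implements TimelineStore {
    private final Map<String, List<String>> entitiesByType = new HashMap<>();

    @Override
    public void putEntity(String entityId, String entityType) {
        // Group entity ids by type, creating the bucket on first use.
        entitiesByType.computeIfAbsent(entityType, k -> new ArrayList<>()).add(entityId);
    }

    @Override
    public List<String> getEntities(String entityType) {
        return entitiesByType.getOrDefault(entityType, new ArrayList<>());
    }
}

public class TimelineStoreDemo {
    public static void main(String[] args) {
        TimelineStore store = new MemoryTimelineStore(); // swap implementations here
        store.putEntity("app_1", "YARN_APPLICATION");
        System.out.println(store.getEntities("YARN_APPLICATION")); // prints [app_1]
    }
}
```

Callers program against the interface, so a persistent backend (e.g. the leveldb store of YARN-1635) can later replace the in-memory one without touching test or client code.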

> Define an in-memory implementation of ApplicationTimelineStore
> --
>
> Key: YARN-1634
> URL: https://issues.apache.org/jira/browse/YARN-1634
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Zhijie Shen
> Attachments: YARN-1634.1.patch
>
>
> As per the design doc, the store needs to be pluggable. We need a base 
> interface, and an in-memory implementation for testing.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (YARN-1659) Define ApplicationTimelineStore interface and store-facing entity, entity-info and event objects

2014-01-30 Thread Billie Rinaldi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Billie Rinaldi updated YARN-1659:
-

Attachment: YARN-1659-3.patch

> Define ApplicationTimelineStore interface and store-facing entity, 
> entity-info and event objects
> 
>
> Key: YARN-1659
> URL: https://issues.apache.org/jira/browse/YARN-1659
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Billie Rinaldi
>Assignee: Billie Rinaldi
> Attachments: YARN-1659-1.patch, YARN-1659-3.patch, YARN-1659.2.patch
>
>
> These will be used by the ApplicationTimelineStore interface.  The web services 
> will convert the store-facing objects to the user-facing objects.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (YARN-1634) Define an in-memory implementation of ApplicationTimelineStore

2014-01-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13886969#comment-13886969
 ] 

Hadoop QA commented on YARN-1634:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12626156/YARN-1634.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2968//console

This message is automatically generated.

> Define an in-memory implementation of ApplicationTimelineStore
> --
>
> Key: YARN-1634
> URL: https://issues.apache.org/jira/browse/YARN-1634
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Zhijie Shen
> Attachments: YARN-1634.1.patch
>
>
> As per the design doc, the store needs to be pluggable. We need a base 
> interface, and an in-memory implementation for testing.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (YARN-1504) RM changes for moving apps between queues

2014-01-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13886968#comment-13886968
 ] 

Hadoop QA commented on YARN-1504:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12626147/YARN-1504-1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-tools/hadoop-sls 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2966//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2966//console

This message is automatically generated.

> RM changes for moving apps between queues
> -
>
> Key: YARN-1504
> URL: https://issues.apache.org/jira/browse/YARN-1504
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.2.0
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1504-1.patch, YARN-1504.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (YARN-1461) RM API and RM changes to handle tags for running jobs

2014-01-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887004#comment-13887004
 ] 

Hadoop QA commented on YARN-1461:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12626151/yarn-1461-9.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 7 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2967//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2967//console

This message is automatically generated.

> RM API and RM changes to handle tags for running jobs
> -
>
> Key: YARN-1461
> URL: https://issues.apache.org/jira/browse/YARN-1461
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.2.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-1461-1.patch, yarn-1461-2.patch, yarn-1461-3.patch, 
> yarn-1461-4.patch, yarn-1461-5.patch, yarn-1461-6.patch, yarn-1461-6.patch, 
> yarn-1461-7.patch, yarn-1461-8.patch, yarn-1461-9.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (YARN-713) ResourceManager can exit unexpectedly if DNS is unavailable

2014-01-30 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-713:


Target Version/s: 2.3.0

> ResourceManager can exit unexpectedly if DNS is unavailable
> ---
>
> Key: YARN-713
> URL: https://issues.apache.org/jira/browse/YARN-713
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.1.0-beta
>Reporter: Jason Lowe
>Assignee: Omkar Vinit Joshi
>Priority: Critical
> Attachments: YARN-713.09052013.1.patch, YARN-713.09062013.1.patch, 
> YARN-713.1.patch, YARN-713.2.patch, YARN-713.20130910.1.patch, 
> YARN-713.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch
>
>
> As discussed in MAPREDUCE-5261, there's a possibility that a DNS outage could 
> lead to an unhandled exception in the ResourceManager's AsyncDispatcher, and 
> that ultimately would cause the RM to exit.  The RM should not exit during 
> DNS hiccups.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (YARN-1635) Implement a Leveldb based ApplicationTimelineStore

2014-01-30 Thread Billie Rinaldi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Billie Rinaldi updated YARN-1635:
-

Attachment: YARN-1635.1.patch

> Implement a Leveldb based ApplicationTimelineStore
> --
>
> Key: YARN-1635
> URL: https://issues.apache.org/jira/browse/YARN-1635
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Billie Rinaldi
> Attachments: YARN-1635.1.patch
>
>
> As per the design doc, we need a levelDB + local-filesystem based 
> implementation to start with and for small deployments.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (YARN-321) Generic application history service

2014-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887036#comment-13887036
 ] 

Hudson commented on YARN-321:
-

SUCCESS: Integrated in Hadoop-trunk-Commit #5074 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5074/])
Updating trunk's YARN CHANGES.txt after YARN-321 merge. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1562950)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt


> Generic application history service
> ---
>
> Key: YARN-321
> URL: https://issues.apache.org/jira/browse/YARN-321
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Luke Lu
> Attachments: AHS Diagram.pdf, ApplicationHistoryServiceHighLevel.pdf, 
> Generic Application History - Design-20131219.pdf, HistoryStorageDemo.java
>
>
> The mapreduce job history server currently needs to be deployed as a trusted 
> server in sync with the mapreduce runtime. Every new application would need a 
> similar application history server. Having to deploy O(T*V) (where T is the 
> number of application types and V is the number of application versions) 
> trusted servers is clearly not scalable.
> Job history storage handling itself is pretty generic: move the logs and 
> history data into a particular directory for later serving. Job history data 
> is already stored as json (or binary avro). I propose that we create only one 
> trusted application history server, which can have a generic UI (display json 
> as a tree of strings) as well. Specific application/version can deploy 
> untrusted webapps (a la AMs) to query the application history server and 
> interpret the json for its specific UI and/or analytics.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (YARN-1659) Define ApplicationTimelineStore interface and store-facing entity, entity-info and event objects

2014-01-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887047#comment-13887047
 ] 

Hadoop QA commented on YARN-1659:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12626159/YARN-1659-3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2969//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/2969//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-applicationhistoryservice.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2969//console

This message is automatically generated.

> Define ApplicationTimelineStore interface and store-facing entity, 
> entity-info and event objects
> 
>
> Key: YARN-1659
> URL: https://issues.apache.org/jira/browse/YARN-1659
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Billie Rinaldi
>Assignee: Billie Rinaldi
> Attachments: YARN-1659-1.patch, YARN-1659-3.patch, YARN-1659.2.patch
>
>
> These will be used by the ApplicationTimelineStore interface.  The web services 
> will convert the store-facing objects to the user-facing objects.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (YARN-1498) Common scheduler changes for moving apps between queues

2014-01-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887043#comment-13887043
 ] 

Sandy Ryza commented on YARN-1498:
--

bq. AppSchedulingInfo: Not sure I understand the relevance of the following 
change to this JIRA. Am I missing something or is it just cleanup?
This isn't required, but I thought it made the code clearer, as we're adding to 
the number of places where incrPendingResources and decrPendingResources get called. 
AppSchedulingInfo#move is confusing already, and I wanted to avoid having a 
monstrosity like
{code}
-metrics.incrPendingResources(user, request.getNumContainers()
-- lastRequestContainers, Resources.subtractFrom( // save a clone
-Resources.multiply(request.getCapability(), request
-.getNumContainers()), Resources.multiply(lastRequestCapability,
-lastRequestContainers)));
{code}
in it.  I can revert it if you think it's not worth it.

bq. Can we throw an Exception instead of returning null.
I copied this from the Capacity Scheduler, so would rather keep it consistent.

> Common scheduler changes for moving apps between queues
> ---
>
> Key: YARN-1498
> URL: https://issues.apache.org/jira/browse/YARN-1498
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.2.0
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1498-1.patch, YARN-1498.patch, YARN-1498.patch
>
>
> This JIRA is to track changes that aren't in particular schedulers but that 
> help them support moving apps between queues.  In particular, it makes sure 
> that QueueMetrics are properly updated when an app changes queue.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (YARN-1659) Define ApplicationTimelineStore interface and store-facing entity, entity-info and event objects

2014-01-30 Thread Billie Rinaldi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Billie Rinaldi updated YARN-1659:
-

Attachment: YARN-1659-4.patch

Fixed findbugs warnings.

> Define ApplicationTimelineStore interface and store-facing entity, 
> entity-info and event objects
> 
>
> Key: YARN-1659
> URL: https://issues.apache.org/jira/browse/YARN-1659
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Billie Rinaldi
>Assignee: Billie Rinaldi
> Attachments: YARN-1659-1.patch, YARN-1659-3.patch, YARN-1659-4.patch, 
> YARN-1659.2.patch
>
>
> These will be used by the ApplicationTimelineStore interface.  The web services 
> will convert the store-facing objects to the user-facing objects.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (YARN-1678) Fair scheduler gabs incessantly about reservations

2014-01-30 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-1678:


 Summary: Fair scheduler gabs incessantly about reservations
 Key: YARN-1678
 URL: https://issues.apache.org/jira/browse/YARN-1678
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: Sandy Ryza
Assignee: Sandy Ryza


Come on FS. We really don't need to know every time a node with a reservation 
on it heartbeats.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (YARN-1666) Make admin refreshNodes work across RM failover

2014-01-30 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1666:


Attachment: YARN-1666.1.patch

Created the patch based on YARN-1611 for the refreshNodes changes.

> Make admin refreshNodes work across RM failover
> ---
>
> Key: YARN-1666
> URL: https://issues.apache.org/jira/browse/YARN-1666
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-1666.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (YARN-1659) Define ApplicationTimelineStore interface and store-facing entity, entity-info and event objects

2014-01-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887168#comment-13887168
 ] 

Hadoop QA commented on YARN-1659:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12626191/YARN-1659-4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2970//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2970//console

This message is automatically generated.

> Define ApplicationTimelineStore interface and store-facing entity, 
> entity-info and event objects
> 
>
> Key: YARN-1659
> URL: https://issues.apache.org/jira/browse/YARN-1659
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Billie Rinaldi
>Assignee: Billie Rinaldi
> Attachments: YARN-1659-1.patch, YARN-1659-3.patch, YARN-1659-4.patch, 
> YARN-1659.2.patch
>
>
> These will be used by the ApplicationTimelineStore interface.  The web services 
> will convert the store-facing objects to the user-facing objects.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (YARN-1678) Fair scheduler gabs incessantly about reservations

2014-01-30 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1678:
-

Attachment: YARN-1678.patch

> Fair scheduler gabs incessantly about reservations
> --
>
> Key: YARN-1678
> URL: https://issues.apache.org/jira/browse/YARN-1678
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1678.patch
>
>
> Come on FS. We really don't need to know every time a node with a reservation 
> on it heartbeats.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (YARN-1678) Fair scheduler gabs incessantly about reservations

2014-01-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887192#comment-13887192
 ] 

Sandy Ryza commented on YARN-1678:
--

Attached patch avoids unnecessary info messages and documents some of the 
reserve code in AppSchedulable.

> Fair scheduler gabs incessantly about reservations
> --
>
> Key: YARN-1678
> URL: https://issues.apache.org/jira/browse/YARN-1678
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1678.patch
>
>
> Come on FS. We really don't need to know every time a node with a reservation 
> on it heartbeats.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (YARN-1617) Remove ancient comment and surround LOG.debug in AppSchedulingInfo.allocate

2014-01-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887193#comment-13887193
 ] 

Sandy Ryza commented on YARN-1617:
--

Thanks for the reviews Akira and Karthik.  Committing this.
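The fix named in the title, guarding a LOG.debug call so the message string is only constructed when debug logging is actually enabled, can be illustrated with a self-contained sketch. java.util.logging is used here purely so the example runs standalone (Hadoop itself uses commons-logging, whose equivalent guard is LOG.isDebugEnabled()); the class and counter below are hypothetical, not the committed patch.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class GuardedDebug {
    private static final Logger LOG = Logger.getLogger(GuardedDebug.class.getName());
    static int messageBuilds = 0; // counts how often the debug string is constructed

    static String buildMessage(String appId, String containerId) {
        messageBuilds++; // stands in for the expensive string concatenation
        return "allocate: applicationId=" + appId + " container=" + containerId;
    }

    static void allocate(String appId, String containerId) {
        // The guard skips message construction entirely when debug is off,
        // which matters on hot paths like per-container allocation.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(buildMessage(appId, containerId));
        }
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.INFO);              // debug (FINE) disabled
        allocate("app_1", "container_1");      // message is never built
        System.out.println("builds with debug off: " + messageBuilds); // 0
        LOG.setLevel(Level.FINE);              // debug enabled
        allocate("app_1", "container_2");      // message is built once
        System.out.println("builds with debug on: " + messageBuilds);  // 1
    }
}
```

Without the guard, the concatenation in the original allocate() snippet runs on every call regardless of log level, which is exactly the waste the guard removes.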

> Remove ancient comment and surround LOG.debug in AppSchedulingInfo.allocate
> ---
>
> Key: YARN-1617
> URL: https://issues.apache.org/jira/browse/YARN-1617
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 2.2.0
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1617.patch
>
>
> {code}
>   synchronized private void allocate(Container container) {
> // Update consumption and track allocations
> //TODO: fixme sharad
> /* try {
> store.storeContainer(container);
>   } catch (IOException ie) {
> // TODO fix this. we shouldnt ignore
>   }*/
> 
> LOG.debug("allocate: applicationId=" + applicationId + " container="
> + container.getId() + " host="
> + container.getNodeId().toString());
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (YARN-1617) Remove ancient comment and surround LOG.debug in AppSchedulingInfo.allocate

2014-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887207#comment-13887207
 ] 

Hudson commented on YARN-1617:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5076 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5076/])
YARN-1617. Remove ancient comment and surround LOG.debug in 
AppSchedulingInfo.allocate (Sandy Ryza) (sandy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1563004)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java


> Remove ancient comment and surround LOG.debug in AppSchedulingInfo.allocate
> ---
>
> Key: YARN-1617
> URL: https://issues.apache.org/jira/browse/YARN-1617
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 2.2.0
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Fix For: 2.4.0
>
> Attachments: YARN-1617.patch
>
>
> {code}
>   synchronized private void allocate(Container container) {
> // Update consumption and track allocations
> //TODO: fixme sharad
> /* try {
> store.storeContainer(container);
>   } catch (IOException ie) {
> // TODO fix this. we shouldnt ignore
>   }*/
> 
> LOG.debug("allocate: applicationId=" + applicationId + " container="
> + container.getId() + " host="
> + container.getNodeId().toString());
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (YARN-1611) Make admin refresh of configuration work across RM failover

2014-01-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887221#comment-13887221
 ] 

Sandy Ryza commented on YARN-1611:
--

Thanks Xuan.  One other thing is that the current patch won't work for 
refreshing queues for the Fair Scheduler, which does not get its settings from 
a Configuration object.  The fair-scheduler.xml file is in a different format 
than a typical Hadoop configuration file.

> Make admin refresh of configuration work across RM failover
> ---
>
> Key: YARN-1611
> URL: https://issues.apache.org/jira/browse/YARN-1611
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-1611.1.patch, YARN-1611.2.patch, YARN-1611.2.patch, 
> YARN-1611.3.patch, YARN-1611.3.patch, YARN-1611.4.patch, YARN-1611.5.patch
>
>
> Currently, if we do refresh* for a standby RM, it will fail over to the 
> current active RM, and do the refresh* based on the local configuration file 
> of the active RM. 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (YARN-1678) Fair scheduler gabs incessantly about reservations

2014-01-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887232#comment-13887232
 ] 

Hadoop QA commented on YARN-1678:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12626205/YARN-1678.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2971//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2971//console

This message is automatically generated.

> Fair scheduler gabs incessantly about reservations
> --
>
> Key: YARN-1678
> URL: https://issues.apache.org/jira/browse/YARN-1678
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1678.patch
>
>
> Come on FS. We really don't need to know every time a node with a reservation 
> on it heartbeats.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (YARN-1611) Make admin refresh of scheduler configuration work across RM failover

2014-01-30 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-1611:
--

Summary: Make admin refresh of scheduler configuration work across RM 
failover  (was: Make admin refresh of configuration work across RM failover)

Editing title as we are only focusing on scheduler configuration in this ticket.

> Make admin refresh of scheduler configuration work across RM failover
> -
>
> Key: YARN-1611
> URL: https://issues.apache.org/jira/browse/YARN-1611
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-1611.1.patch, YARN-1611.2.patch, YARN-1611.2.patch, 
> YARN-1611.3.patch, YARN-1611.3.patch, YARN-1611.4.patch, YARN-1611.5.patch
>
>
> Currently, if we do refresh* for a standby RM, it will fail over to the 
> current active RM and do the refresh* based on the local configuration file 
> of the active RM. 





[jira] [Commented] (YARN-1504) RM changes for moving apps between queues

2014-01-30 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887252#comment-13887252
 ] 

Karthik Kambatla commented on YARN-1504:


Looks good to me, +1. 

Let us leave this open for a day, in case anyone else wants to take a look at 
it. 

> RM changes for moving apps between queues
> -
>
> Key: YARN-1504
> URL: https://issues.apache.org/jira/browse/YARN-1504
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.2.0
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1504-1.patch, YARN-1504.patch
>
>






[jira] [Commented] (YARN-1461) RM API and RM changes to handle tags for running jobs

2014-01-30 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887257#comment-13887257
 ] 

Zhijie Shen commented on YARN-1461:
---

bq. I am sorry, I didn't quite get that. Are you suggesting not having methods 
to get and set Scope in GetApplicationsRequest? If yes, how do you propose we 
allow users to set the Scope to the non-default value?

No, I mean the generated proto class is not supposed to be used in API records. 
Usually, what we do is to define a Java enum, use it in API records, and map it 
to the corresponding proto class. The mapping is invoked in PBImpl of the API 
records. Please take a look at YarnApplicationState and 
YarnApplicationStateProto. 
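A minimal, self-contained sketch of the mapping pattern described above. The "Proto" enum here is only a stand-in for the protobuf-generated class (which protoc would normally produce), and the converter methods stand in for the ProtoUtils helpers invoked from the PBImpl:

```java
// Sketch of the Java-enum <-> proto-enum mapping pattern; names are
// illustrative stand-ins, not the actual generated protobuf classes.
public class EnumMappingSketch {

    // Java-side enum exposed in the public API records.
    enum ApplicationsRequestScope { ALL, VIEWABLE, OWN }

    // Stand-in for the protobuf-generated enum.
    enum ApplicationsRequestScopeProto { ALL, VIEWABLE, OWN }

    // Mapping helpers; in YARN these typically live in a ProtoUtils-style
    // class and are called from the PBImpl of the API record.
    static ApplicationsRequestScopeProto convertToProto(ApplicationsRequestScope s) {
        return ApplicationsRequestScopeProto.valueOf(s.name());
    }

    static ApplicationsRequestScope convertFromProto(ApplicationsRequestScopeProto p) {
        return ApplicationsRequestScope.valueOf(p.name());
    }

    public static void main(String[] args) {
        for (ApplicationsRequestScope s : ApplicationsRequestScope.values()) {
            // A 1:1 mapping must round-trip losslessly.
            if (convertFromProto(convertToProto(s)) != s) {
                throw new AssertionError("round trip failed for " + s);
            }
        }
    }
}
```

Because the mapping goes by name, the Java enum stays decoupled from the generated proto class while keeping the conversion trivial.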

> RM API and RM changes to handle tags for running jobs
> -
>
> Key: YARN-1461
> URL: https://issues.apache.org/jira/browse/YARN-1461
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.2.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-1461-1.patch, yarn-1461-2.patch, yarn-1461-3.patch, 
> yarn-1461-4.patch, yarn-1461-5.patch, yarn-1461-6.patch, yarn-1461-6.patch, 
> yarn-1461-7.patch, yarn-1461-8.patch, yarn-1461-9.patch
>
>






[jira] [Commented] (YARN-1498) Common scheduler changes for moving apps between queues

2014-01-30 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887260#comment-13887260
 ] 

Karthik Kambatla commented on YARN-1498:


The arithmetic change in AppSchedulingInfo is exercised by several existing 
tests, so I am assuming the change is correct.

+1. 

> Common scheduler changes for moving apps between queues
> ---
>
> Key: YARN-1498
> URL: https://issues.apache.org/jira/browse/YARN-1498
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.2.0
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1498-1.patch, YARN-1498.patch, YARN-1498.patch
>
>
> This JIRA is to track changes that aren't in particular schedulers but that 
> help them support moving apps between queues.  In particular, it makes sure 
> that QueueMetrics are properly updated when an app changes queue.





[jira] [Updated] (YARN-1678) Fair scheduler gabs incessantly about reservations

2014-01-30 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1678:
-

Description: 
Come on FS. We really don't need to know every time a node with a reservation 
on it heartbeats.

{code}
2014-01-29 03:48:16,043 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
Trying to fulfill reservation for application 
appattempt_1390547864213_0347_01 on node: host: 
a2330.halxg.cloudera.com:8041 #containers=8 available= 
used=
2014-01-29 03:48:16,043 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable: 
Making reservation: node=a2330.halxg.cloudera.com 
app_id=application_1390547864213_0347
2014-01-29 03:48:16,043 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
 Application application_1390547864213_0347 reserved container 
container_1390547864213_0347_01_03 on node host: 
a2330.halxg.cloudera.com:8041 #containers=8 available= 
used=, currently has 6 at priority 0; currentReservation 
6144
2014-01-29 03:48:16,044 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode: 
Updated reserved container container_1390547864213_0347_01_03 on node host: 
a2330.halxg.cloudera.com:8041 #containers=8 available= 
used= for application 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerApp@1cb01d20
{code}

  was:Come on FS. We really don't need to know every time a node with a 
reservation on it heartbeats.


> Fair scheduler gabs incessantly about reservations
> --
>
> Key: YARN-1678
> URL: https://issues.apache.org/jira/browse/YARN-1678
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1678.patch
>
>
> Come on FS. We really don't need to know every time a node with a reservation 
> on it heartbeats.
> {code}
> 2014-01-29 03:48:16,043 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
> Trying to fulfill reservation for application 
> appattempt_1390547864213_0347_01 on node: host: 
> a2330.halxg.cloudera.com:8041 #containers=8 available= 
> used=
> 2014-01-29 03:48:16,043 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable: 
> Making reservation: node=a2330.halxg.cloudera.com 
> app_id=application_1390547864213_0347
> 2014-01-29 03:48:16,043 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
>  Application application_1390547864213_0347 reserved container 
> container_1390547864213_0347_01_03 on node host: 
> a2330.halxg.cloudera.com:8041 #containers=8 available= 
> used=, currently has 6 at priority 0; 
> currentReservation 6144
> 2014-01-29 03:48:16,044 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode: 
> Updated reserved container container_1390547864213_0347_01_03 on node 
> host: a2330.halxg.cloudera.com:8041 #containers=8 available= vCores:8> used= for application 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerApp@1cb01d20
> {code}





[jira] [Commented] (YARN-1678) Fair scheduler gabs incessantly about reservations

2014-01-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887277#comment-13887277
 ] 

Sandy Ryza commented on YARN-1678:
--

Looks like I was missing an exclamation point.
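For context, the kind of change under discussion (silencing per-heartbeat messages) is usually done by demoting them to DEBUG behind a guard; the sketch below is illustrative only, not the actual patch, and shows the guard whose negation is easy to get wrong:

```java
// Illustrative sketch: a chatty per-heartbeat reservation message demoted
// to DEBUG behind an isDebugEnabled() guard. Not the actual YARN-1678 patch.
public class ReservationLoggingSketch {

    // Minimal stand-in for a commons-logging style Log.
    interface Log {
        boolean isDebugEnabled();
        void debug(Object msg);
    }

    static int debugCalls = 0;

    static final Log LOG = new Log() {
        public boolean isDebugEnabled() { return false; } // DEBUG off
        public void debug(Object msg) { debugCalls++; }
    };

    static void onNodeHeartbeat(String node, String app) {
        // Guard so the message is neither built nor emitted unless DEBUG
        // is enabled; a stray (or missing) '!' here inverts the behavior.
        if (LOG.isDebugEnabled()) {
            LOG.debug("Trying to fulfill reservation for application " + app
                + " on node: " + node);
        }
    }

    public static void main(String[] args) {
        onNodeHeartbeat("a2330.halxg.cloudera.com:8041",
            "application_1390547864213_0347");
    }
}
```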

> Fair scheduler gabs incessantly about reservations
> --
>
> Key: YARN-1678
> URL: https://issues.apache.org/jira/browse/YARN-1678
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1678-1.patch, YARN-1678.patch
>
>
> Come on FS. We really don't need to know every time a node with a reservation 
> on it heartbeats.
> {code}
> 2014-01-29 03:48:16,043 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
> Trying to fulfill reservation for application 
> appattempt_1390547864213_0347_01 on node: host: 
> a2330.halxg.cloudera.com:8041 #containers=8 available= 
> used=
> 2014-01-29 03:48:16,043 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable: 
> Making reservation: node=a2330.halxg.cloudera.com 
> app_id=application_1390547864213_0347
> 2014-01-29 03:48:16,043 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
>  Application application_1390547864213_0347 reserved container 
> container_1390547864213_0347_01_03 on node host: 
> a2330.halxg.cloudera.com:8041 #containers=8 available= 
> used=, currently has 6 at priority 0; 
> currentReservation 6144
> 2014-01-29 03:48:16,044 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode: 
> Updated reserved container container_1390547864213_0347_01_03 on node 
> host: a2330.halxg.cloudera.com:8041 #containers=8 available= vCores:8> used= for application 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerApp@1cb01d20
> {code}





[jira] [Updated] (YARN-1678) Fair scheduler gabs incessantly about reservations

2014-01-30 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1678:
-

Attachment: YARN-1678-1.patch

> Fair scheduler gabs incessantly about reservations
> --
>
> Key: YARN-1678
> URL: https://issues.apache.org/jira/browse/YARN-1678
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1678-1.patch, YARN-1678.patch
>
>
> Come on FS. We really don't need to know every time a node with a reservation 
> on it heartbeats.
> {code}
> 2014-01-29 03:48:16,043 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
> Trying to fulfill reservation for application 
> appattempt_1390547864213_0347_01 on node: host: 
> a2330.halxg.cloudera.com:8041 #containers=8 available= 
> used=
> 2014-01-29 03:48:16,043 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable: 
> Making reservation: node=a2330.halxg.cloudera.com 
> app_id=application_1390547864213_0347
> 2014-01-29 03:48:16,043 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
>  Application application_1390547864213_0347 reserved container 
> container_1390547864213_0347_01_03 on node host: 
> a2330.halxg.cloudera.com:8041 #containers=8 available= 
> used=, currently has 6 at priority 0; 
> currentReservation 6144
> 2014-01-29 03:48:16,044 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode: 
> Updated reserved container container_1390547864213_0347_01_03 on node 
> host: a2330.halxg.cloudera.com:8041 #containers=8 available= vCores:8> used= for application 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerApp@1cb01d20
> {code}





[jira] [Commented] (YARN-1498) Common scheduler changes for moving apps between queues

2014-01-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887289#comment-13887289
 ] 

Sandy Ryza commented on YARN-1498:
--

Committed this to trunk.

> Common scheduler changes for moving apps between queues
> ---
>
> Key: YARN-1498
> URL: https://issues.apache.org/jira/browse/YARN-1498
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.2.0
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Fix For: 3.0.0
>
> Attachments: YARN-1498-1.patch, YARN-1498.patch, YARN-1498.patch
>
>
> This JIRA is to track changes that aren't in particular schedulers but that 
> help them support moving apps between queues.  In particular, it makes sure 
> that QueueMetrics are properly updated when an app changes queue.





[jira] [Commented] (YARN-1498) Common scheduler changes for moving apps between queues

2014-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887301#comment-13887301
 ] 

Hudson commented on YARN-1498:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5078 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5078/])
YARN-1498. Common scheduler changes for moving apps between queues (Sandy Ryza) 
(sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1563021)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/Queue.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/QueueMetrics.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSParentQueue.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestQueueMetrics.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestSchedulerApplicationAttempt.java


> Common scheduler changes for moving apps between queues
> ---
>
> Key: YARN-1498
> URL: https://issues.apache.org/jira/browse/YARN-1498
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.2.0
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Fix For: 3.0.0
>
> Attachments: YARN-1498-1.patch, YARN-1498.patch, YARN-1498.patch
>
>
> This JIRA is to track changes that aren't in particular schedulers but that 
> help them support moving apps between queues.  In particular, it makes sure 
> that QueueMetrics are properly updated when an app changes queue.





[jira] [Commented] (YARN-1678) Fair scheduler gabs incessantly about reservations

2014-01-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887319#comment-13887319
 ] 

Hadoop QA commented on YARN-1678:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12626219/YARN-1678-1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2972//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2972//console

This message is automatically generated.

> Fair scheduler gabs incessantly about reservations
> --
>
> Key: YARN-1678
> URL: https://issues.apache.org/jira/browse/YARN-1678
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1678-1.patch, YARN-1678.patch
>
>
> Come on FS. We really don't need to know every time a node with a reservation 
> on it heartbeats.
> {code}
> 2014-01-29 03:48:16,043 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
> Trying to fulfill reservation for application 
> appattempt_1390547864213_0347_01 on node: host: 
> a2330.halxg.cloudera.com:8041 #containers=8 available= 
> used=
> 2014-01-29 03:48:16,043 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable: 
> Making reservation: node=a2330.halxg.cloudera.com 
> app_id=application_1390547864213_0347
> 2014-01-29 03:48:16,043 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
>  Application application_1390547864213_0347 reserved container 
> container_1390547864213_0347_01_03 on node host: 
> a2330.halxg.cloudera.com:8041 #containers=8 available= 
> used=, currently has 6 at priority 0; 
> currentReservation 6144
> 2014-01-29 03:48:16,044 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode: 
> Updated reserved container container_1390547864213_0347_01_03 on node 
> host: a2330.halxg.cloudera.com:8041 #containers=8 available= vCores:8> used= for application 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerApp@1cb01d20
> {code}





[jira] [Commented] (YARN-1461) RM API and RM changes to handle tags for running jobs

2014-01-30 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887322#comment-13887322
 ] 

Karthik Kambatla commented on YARN-1461:


bq. Usually, what we do is to define a Java enum, use it in API records, and 
map it to the corresponding proto class. The mapping is invoked in PBImpl of 
the API records. Please take a look at YarnApplicationState and 
YarnApplicationStateProto.

I see. I just looked at YarnApplicationState and YarnApplicationStateProto; we 
could do something similar here too. However, I am curious: why is having two 
different enums, one in Java and one in proto, plus a converter between them, 
preferable to a single enum with no converter? Particularly since, in this 
case, the mapping between the two is always going to be 1:1. 

> RM API and RM changes to handle tags for running jobs
> -
>
> Key: YARN-1461
> URL: https://issues.apache.org/jira/browse/YARN-1461
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.2.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-1461-1.patch, yarn-1461-2.patch, yarn-1461-3.patch, 
> yarn-1461-4.patch, yarn-1461-5.patch, yarn-1461-6.patch, yarn-1461-6.patch, 
> yarn-1461-7.patch, yarn-1461-8.patch, yarn-1461-9.patch
>
>






[jira] [Updated] (YARN-1611) Make admin refresh of scheduler configuration work across RM failover

2014-01-30 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1611:


Attachment: YARN-1611.6.patch

> Make admin refresh of scheduler configuration work across RM failover
> -
>
> Key: YARN-1611
> URL: https://issues.apache.org/jira/browse/YARN-1611
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-1611.1.patch, YARN-1611.2.patch, YARN-1611.2.patch, 
> YARN-1611.3.patch, YARN-1611.3.patch, YARN-1611.4.patch, YARN-1611.5.patch, 
> YARN-1611.6.patch
>
>
> Currently, if we do refresh* for a standby RM, it will fail over to the 
> current active RM and do the refresh* based on the local configuration file 
> of the active RM. 





[jira] [Commented] (YARN-1611) Make admin refresh of scheduler configuration work across RM failover

2014-01-30 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887339#comment-13887339
 ] 

Xuan Gong commented on YARN-1611:
-

bq. Fix formatting in the code. There are several non-standard instances of 
formatting, and lines crossing the 80-char boundary.

DONE

bq. conf.store -> remote-configuration.store

changed

bq. Change DEFAULT_RM_CONF_STORE to be something like /yarn/conf?

changed 

bq. Add javadoc for the RemoteConfiguration class and all its methods.

added

bq. Same for RemoteConfigurationFactory.

added

bq. Move RC and RCF to yarn.conf package

Moved

bq. FileSystemBasedRemoteConfiguration: If path doesn't exist, we should not 
silently log it. Throw an exception.

changed

bq. CapacityScheduler.java: Remote conf is loaded only on refresh but not on 
init?

Yes, we need to do this. But this will be fixed in YARN-1459

bq. Instead of getConfigurationFileName(), create static constants for each 
config-file and directly code them into the caller.

DONE. Created CS_CONFIGURATION_FILE in this ticket; constants for the other 
conf files will be created in the related JIRA tickets.

bq. When can conf be null in HA mode? Even if it can be, the response to 
refresh should indicate an exception.

We now throw an exception in FileSystemBasedRemoteConfiguration if the path 
doesn't exist, so the conf will not be null. Changed.
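A hedged sketch of the fail-fast behavior discussed above: a remote-configuration loader that throws when the configuration path does not exist, rather than silently logging. The class and method names below are illustrative only; the real patch works against Hadoop's FileSystem and Configuration classes, not java.nio:

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Illustrative stand-in for a FileSystemBasedRemoteConfiguration: load a
// config file from a shared store directory, failing fast if it is missing
// so callers never proceed with a null/empty conf.
public class FileSystemBasedRemoteConfigurationSketch {
    private final Path storeDir;

    public FileSystemBasedRemoteConfigurationSketch(String storeDir) {
        this.storeDir = Paths.get(storeDir);
    }

    public Properties load(String configFileName) throws IOException {
        Path file = storeDir.resolve(configFileName);
        if (!Files.exists(file)) {
            // Throw instead of silently logging, per the review comment.
            throw new FileNotFoundException(
                "Remote configuration not found: " + file);
        }
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        }
        return props;
    }
}
```

With this shape, both an active and a (failed-over) standby RM read the same store path, so a refresh* after failover sees the same configuration.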

> Make admin refresh of scheduler configuration work across RM failover
> -
>
> Key: YARN-1611
> URL: https://issues.apache.org/jira/browse/YARN-1611
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-1611.1.patch, YARN-1611.2.patch, YARN-1611.2.patch, 
> YARN-1611.3.patch, YARN-1611.3.patch, YARN-1611.4.patch, YARN-1611.5.patch, 
> YARN-1611.6.patch
>
>
> Currently, if we do refresh* for a standby RM, it will fail over to the 
> current active RM and do the refresh* based on the local configuration file 
> of the active RM. 





[jira] [Updated] (YARN-1611) Make admin refresh of capacity scheduler configuration work across RM failover

2014-01-30 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1611:


Summary: Make admin refresh of capacity scheduler configuration work across 
RM failover  (was: Make admin refresh of scheduler configuration work across RM 
failover)

> Make admin refresh of capacity scheduler configuration work across RM failover
> --
>
> Key: YARN-1611
> URL: https://issues.apache.org/jira/browse/YARN-1611
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-1611.1.patch, YARN-1611.2.patch, YARN-1611.2.patch, 
> YARN-1611.3.patch, YARN-1611.3.patch, YARN-1611.4.patch, YARN-1611.5.patch, 
> YARN-1611.6.patch
>
>
> Currently, if we do refresh* for a standby RM, it will fail over to the 
> current active RM and do the refresh* based on the local configuration file 
> of the active RM. 





[jira] [Created] (YARN-1679) Make admin refresh of Fair scheduler configuration work across RM failover

2014-01-30 Thread Xuan Gong (JIRA)
Xuan Gong created YARN-1679:
---

 Summary: Make admin refresh of Fair scheduler configuration work 
across RM failover
 Key: YARN-1679
 URL: https://issues.apache.org/jira/browse/YARN-1679
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong








[jira] [Commented] (YARN-1611) Make admin refresh of capacity scheduler configuration work across RM failover

2014-01-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887382#comment-13887382
 ] 

Hadoop QA commented on YARN-1611:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12626235/YARN-1611.6.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2973//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/2973//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2973//console

This message is automatically generated.

> Make admin refresh of capacity scheduler configuration work across RM failover
> --
>
> Key: YARN-1611
> URL: https://issues.apache.org/jira/browse/YARN-1611
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-1611.1.patch, YARN-1611.2.patch, YARN-1611.2.patch, 
> YARN-1611.3.patch, YARN-1611.3.patch, YARN-1611.4.patch, YARN-1611.5.patch, 
> YARN-1611.6.patch
>
>
> Currently, if we do refresh* for a standby RM, it will fail over to the 
> current active RM and do the refresh* based on the local configuration file 
> of the active RM. 





[jira] [Commented] (YARN-1611) Make admin refresh of capacity scheduler configuration work across RM failover

2014-01-30 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887391#comment-13887391
 ] 

Xuan Gong commented on YARN-1611:
-

This findbugs -1 is unrelated.

> Make admin refresh of capacity scheduler configuration work across RM failover
> --
>
> Key: YARN-1611
> URL: https://issues.apache.org/jira/browse/YARN-1611
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-1611.1.patch, YARN-1611.2.patch, YARN-1611.2.patch, 
> YARN-1611.3.patch, YARN-1611.3.patch, YARN-1611.4.patch, YARN-1611.5.patch, 
> YARN-1611.6.patch
>
>
> Currently, if we do refresh* for a standby RM, it will fail over to the 
> current active RM and do the refresh* based on the local configuration file 
> of the active RM. 





[jira] [Updated] (YARN-1461) RM API and RM changes to handle tags for running jobs

2014-01-30 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1461:
---

Attachment: yarn-1461-10.patch

Updated patch adds a new Java enum ApplicationsRequestScope to complement the 
proto enum ApplicationsRequestScopeProto. 

> RM API and RM changes to handle tags for running jobs
> -
>
> Key: YARN-1461
> URL: https://issues.apache.org/jira/browse/YARN-1461
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.2.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-1461-1.patch, yarn-1461-10.patch, 
> yarn-1461-2.patch, yarn-1461-3.patch, yarn-1461-4.patch, yarn-1461-5.patch, 
> yarn-1461-6.patch, yarn-1461-6.patch, yarn-1461-7.patch, yarn-1461-8.patch, 
> yarn-1461-9.patch
>
>






[jira] [Commented] (YARN-1461) RM API and RM changes to handle tags for running jobs

2014-01-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887430#comment-13887430
 ] 

Hadoop QA commented on YARN-1461:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12626247/yarn-1461-10.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2974//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/2974//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2974//console

This message is automatically generated.

> RM API and RM changes to handle tags for running jobs
> -
>
> Key: YARN-1461
> URL: https://issues.apache.org/jira/browse/YARN-1461
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.2.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-1461-1.patch, yarn-1461-10.patch, 
> yarn-1461-2.patch, yarn-1461-3.patch, yarn-1461-4.patch, yarn-1461-5.patch, 
> yarn-1461-6.patch, yarn-1461-6.patch, yarn-1461-7.patch, yarn-1461-8.patch, 
> yarn-1461-9.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (YARN-1662) Capacity Scheduler reservation issue cause Job Hang

2014-01-30 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887471#comment-13887471
 ] 

Sunil G commented on YARN-1662:
---

If we could implement a timed reservation logic here, it would be safer: the fresh 
allocation could then be tried on some other node.
I have reviewed the scheduler part and found that this can be achieved without a 
separate timer thread.

addReReservation() is invoked when the same node tries to re-reserve the same 
application's requests on that node.
This is a multiset, so the internal count is incremented every time 
addReReservation() is performed.
Also, it is incremented only once per node heartbeat interval (1 second).
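Conceptually, the per-priority counting that addReReservation() provides can be sketched as below. This is only an illustrative sketch, not the actual CapacityScheduler API: the class and method names here are hypothetical, and the real code keeps this state on the scheduler-side application object.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch (hypothetical names) of the multiset-style
// re-reservation counter described above: each addReReservation() call
// for a priority bumps that priority's count, and the scheduler compares
// the count against a configurable limit before deciding to unreserve.
class ReReservationCounter {
  // Acts like a multiset: priority -> number of re-reservations observed.
  private final Map<Integer, Integer> reReservations = new HashMap<>();

  void addReReservation(int priority) {
    reReservations.merge(priority, 1, Integer::sum);
  }

  int getReReservations(int priority) {
    return reReservations.getOrDefault(priority, 0);
  }

  // Cleared once the reservation is satisfied or unreserved.
  void resetReReservations(int priority) {
    reReservations.remove(priority);
  }

  // Decision helper mirroring the proposed check in assignContainer():
  // true means "unreserve and let another node try a fresh allocation".
  boolean shouldUnreserve(int priority, int limit) {
    return limit != 0 && getReReservations(priority) > limit;
  }
}
```

Since the count only grows by one per node heartbeat, a limit of N roughly corresponds to N seconds of the reservation sitting unsatisfied on that node.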

I would like to add code like the below in the LeafQueue::assignContainer() method. 
If the limit is exceeded, I will try to unreserve from the node.
This code is hit when the same application tries to re-reserve again on the same 
node.

} else {
  // Reserve by 'charging' in advance...
  reserve(application, priority, node, rmContainer, container);

  // Check the re-reservation limit. If it is exceeded, unreserve and let
  // the application try for a fresh allocation on another node.
  if (RESERVATION_TIME_LIMIT != 0
      && application.getReReservations(priority) > RESERVATION_TIME_LIMIT) {
    unreserve(application, priority, node, rmContainer);
    return Resources.none();
  }
}

So on the next node update from some other node, the CS can try to allocate 
resources to this application.

NB: the point of reservation is to ensure that a task can stick to the node where 
it is best placed to run.
A larger configurable limit, chosen based on the nature of the running tasks, can 
still preserve that behavior.

Please share your thoughts.

> Capacity Scheduler reservation issue cause Job Hang
> ---
>
> Key: YARN-1662
> URL: https://issues.apache.org/jira/browse/YARN-1662
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.2.0
> Environment: Suse 11 SP1 + Linux
>Reporter: Sunil G
>
> There are 2 node managers in my cluster.
> NM1 with 8GB
> NM2 with 8GB
> I am submitting a Job with below details:
> AM with 2GB
> Map needs 5GB
> Reducer needs 3GB
> slowstart is enabled with 0.5
> 10maps and 50reducers are assigned.
> 5maps are completed. Now few reducers got scheduled.
> Now NM1 has the 2GB AM and 3GB Reducer_1 [5GB used]
> NM2 has 3GB Reducer_2 [3GB used]
> A Map container (5GB) is now reserved on NM1, which has only 3GB free.
> It hangs forever.
> The potential issue is that the reservation is now stuck on NM1 for a Map which 
> needs 5GB.
> But Reducer_1 hangs waiting for some map outputs.
> Reducer-side preemption also did not happen, as some headroom is still available.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)