[jira] [Updated] (YARN-1024) Define a CPU resource(s) unambiguously

2013-09-30 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1024:
-

Summary: Define a CPU resource(s) unambiguously  (was: Define a virtual 
core unambiguously)

> Define a CPU resource(s) unambiguously
> --
>
> Key: YARN-1024
> URL: https://issues.apache.org/jira/browse/YARN-1024
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Attachments: CPUasaYARNresource.pdf
>
>
> We need to define the meaning of a virtual core unambiguously so that it's 
> easy to migrate applications between clusters.
> For example, here is Amazon EC2's definition of an ECU: 
> http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it
> Essentially we need to clearly define a YARN Virtual Core (YVC).
> Equivalently, we can use ECU itself: *One EC2 Compute Unit provides the 
> equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.*



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (YARN-976) Document the meaning of a virtual core

2013-09-30 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-976:


Issue Type: Sub-task  (was: Task)
Parent: YARN-1024

> Document the meaning of a virtual core
> --
>
> Key: YARN-976
> URL: https://issues.apache.org/jira/browse/YARN-976
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-976.patch
>
>
> As virtual cores are a somewhat novel concept, it would be helpful to have 
> thorough documentation that clarifies their meaning.





[jira] [Updated] (YARN-1089) Add YARN compute units alongside virtual cores

2013-09-30 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1089:
-

Issue Type: Sub-task  (was: Improvement)
Parent: YARN-1024

> Add YARN compute units alongside virtual cores
> --
>
> Key: YARN-1089
> URL: https://issues.apache.org/jira/browse/YARN-1089
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1089-1.patch, YARN-1089.patch
>
>
> Based on discussion in YARN-1024, we will add YARN compute units as a 
> resource for requesting and scheduling CPU processing power.





[jira] [Updated] (YARN-976) Document the meaning of a virtual core

2013-09-30 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-976:


Attachment: YARN-976.patch

> Document the meaning of a virtual core
> --
>
> Key: YARN-976
> URL: https://issues.apache.org/jira/browse/YARN-976
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-976.patch
>
>
> As virtual cores are a somewhat novel concept, it would be helpful to have 
> thorough documentation that clarifies their meaning.





[jira] [Commented] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues

2013-09-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782566#comment-13782566
 ] 

Sandy Ryza commented on YARN-1241:
--

Uploaded patch to fix the new findbugs warnings

> In Fair Scheduler maxRunningApps does not work for non-leaf queues
> --
>
> Key: YARN-1241
> URL: https://issues.apache.org/jira/browse/YARN-1241
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1241-1.patch, YARN-1241-2.patch, YARN-1241-3.patch, 
> YARN-1241.patch
>
>
> Setting the maxRunningApps property on a parent queue should make it so that 
> the sum of apps in all subqueues can't exceed it.





[jira] [Updated] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues

2013-09-30 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1241:
-

Attachment: YARN-1241-3.patch

> In Fair Scheduler maxRunningApps does not work for non-leaf queues
> --
>
> Key: YARN-1241
> URL: https://issues.apache.org/jira/browse/YARN-1241
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1241-1.patch, YARN-1241-2.patch, YARN-1241-3.patch, 
> YARN-1241.patch
>
>
> Setting the maxRunningApps property on a parent queue should make it so that 
> the sum of apps in all subqueues can't exceed it.





[jira] [Created] (YARN-1259) In Fair Scheduler web UI, queue num pending and num active apps switched

2013-09-30 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-1259:


 Summary: In Fair Scheduler web UI, queue num pending and num 
active apps switched
 Key: YARN-1259
 URL: https://issues.apache.org/jira/browse/YARN-1259
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.1.1-beta
Reporter: Sandy Ryza


The values returned in FairSchedulerLeafQueueInfo by numPendingApplications and 
numActiveApplications should be switched.





[jira] [Commented] (YARN-1010) FairScheduler: decouple container scheduling from nodemanager heartbeats

2013-09-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782473#comment-13782473
 ] 

Sandy Ryza commented on YARN-1010:
--

This looks almost there to me.  A few nits:
{code}
+LOG.warn("Error while doing sleep in continuous scheduling: " +
+e.toString(), e);
{code}
There should be indentation on the second line here.

{code}
+  private void continuousScheduling() {
{code}
Better to have method names be verbs.  Maybe "scheduleContinuously".

Most of the Fair Scheduler properties use dashes at the end instead of dots and 
I think this is a good convention.  We should change 
yarn.scheduler.fair.locality.threshold.node.time.ms to 
yarn.scheduler.fair.locality-delay-node-ms. (And the same for rack).  We should 
also change yarn.scheduler.fair.continuous.scheduling.enabled to 
yarn.scheduler.fair.continuous-scheduling-enabled and 
yarn.scheduler.fair.continuous.scheduling.sleep.time.ms to 
yarn.scheduler.fair.continuous-scheduling-sleep-ms.
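For illustration, the renamed properties would then appear in yarn-site.xml roughly as follows (the values here are made-up examples, not recommended defaults):

```xml
<!-- Sketch of the proposed dashed property names; values are illustrative. -->
<property>
  <name>yarn.scheduler.fair.locality-delay-node-ms</name>
  <value>2000</value>
</property>
<property>
  <name>yarn.scheduler.fair.continuous-scheduling-enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.scheduler.fair.continuous-scheduling-sleep-ms</name>
  <value>5</value>
</property>
```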

Adding multi-second sleeps in the unit tests will slow down build times and is 
still theoretically open to races if the OS pauses.  Better would be to use the 
clock interface.  In the test you can use a MockClock like in 
TestFairScheduler#testChoiceOfPreemptedContainers, and you can change the start 
time in AppSchedulable to come from scheduler.getClock().getTime(). 
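As a sketch of that suggestion (the Clock/MockClock names mirror the Fair Scheduler test utilities, but these minimal versions are illustrative stand-ins, not the actual classes):

```java
// A controllable clock lets tests advance time instantly instead of sleeping.
interface Clock {
  long getTime();
}

class MockClock implements Clock {
  private long time = 0;
  public long getTime() { return time; }
  public void tick(long ms) { time += ms; }  // advance virtual time
}

// Stand-in for AppSchedulable: take the start time from the injected clock
// rather than from System.currentTimeMillis().
class AppSchedulable {
  private final long startTime;
  AppSchedulable(Clock clock) {
    this.startTime = clock.getTime();
  }
  long getStartTime() { return startTime; }
}

public class ClockDemo {
  public static void main(String[] args) {
    MockClock clock = new MockClock();
    clock.tick(5000);  // "wait" five seconds with no real sleep
    AppSchedulable app = new AppSchedulable(clock);
    System.out.println(app.getStartTime()); // prints 5000
  }
}
```

A test can then assert on start times without any real delay or race with the OS scheduler.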

> FairScheduler: decouple container scheduling from nodemanager heartbeats
> 
>
> Key: YARN-1010
> URL: https://issues.apache.org/jira/browse/YARN-1010
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 2.1.0-beta
>Reporter: Alejandro Abdelnur
>Assignee: Wei Yan
>Priority: Critical
> Attachments: YARN-1010.patch
>
>
> Currently scheduling for a node is done when a node heartbeats.
> For large cluster where the heartbeat interval is set to several seconds this 
> delays scheduling of incoming allocations significantly.
> We could have a continuous loop scanning all nodes and doing scheduling. If 
> there is availability AMs will get the allocation in the next heartbeat after 
> the one that placed the request.





[jira] [Created] (YARN-1258) Allow configuring the Fair Scheduler root queue

2013-09-30 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-1258:


 Summary: Allow configuring the Fair Scheduler root queue
 Key: YARN-1258
 URL: https://issues.apache.org/jira/browse/YARN-1258
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.1.1-beta
Reporter: Sandy Ryza


This would be useful for ACLs, maxRunningApps, scheduling modes, etc.

The allocation file should be able to accept both:
* An implicit root queue
* A root queue at the top of the hierarchy with all queues under/inside of it
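As a sketch, the two equivalent forms might look like this (the queue name and values are hypothetical; the elements follow the existing allocation file syntax):

```xml
<!-- Form 1: implicit root queue -->
<allocations>
  <queue name="engineering">
    <maxRunningApps>50</maxRunningApps>
  </queue>
</allocations>
```

```xml
<!-- Form 2: explicit root queue, with root-level settings now configurable -->
<allocations>
  <queue name="root">
    <maxRunningApps>100</maxRunningApps>
    <queue name="engineering">
      <maxRunningApps>50</maxRunningApps>
    </queue>
  </queue>
</allocations>
```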





[jira] [Updated] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues

2013-09-30 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1241:
-

Attachment: YARN-1241-2.patch

> In Fair Scheduler maxRunningApps does not work for non-leaf queues
> --
>
> Key: YARN-1241
> URL: https://issues.apache.org/jira/browse/YARN-1241
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1241-1.patch, YARN-1241-2.patch, YARN-1241.patch
>
>
> Setting the maxRunningApps property on a parent queue should make it so that 
> the sum of apps in all subqueues can't exceed it.





[jira] [Commented] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues

2013-09-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782448#comment-13782448
 ] 

Sandy Ryza commented on YARN-1241:
--

Uploaded patch to fix findbugs warnings

> In Fair Scheduler maxRunningApps does not work for non-leaf queues
> --
>
> Key: YARN-1241
> URL: https://issues.apache.org/jira/browse/YARN-1241
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1241-1.patch, YARN-1241-2.patch, YARN-1241.patch
>
>
> Setting the maxRunningApps property on a parent queue should make it so that 
> the sum of apps in all subqueues can't exceed it.





[jira] [Commented] (YARN-1221) With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely

2013-09-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782322#comment-13782322
 ] 

Sandy Ryza commented on YARN-1221:
--

I just committed this to trunk, branch-2, and branch-2.1-beta.  Thanks Siqi!

> With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
> -
>
> Key: YARN-1221
> URL: https://issues.apache.org/jira/browse/YARN-1221
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, scheduler
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Siqi Li
> Fix For: 2.1.2-beta
>
> Attachments: YARN1221_v1.patch.txt, YARN1221_v2.patch.txt, 
> YARN1221_v3.patch.txt, YARN1221_v4.patch, YARN1221_v5.patch, YARN1221_v6.patch
>
>






[jira] [Commented] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues

2013-09-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782319#comment-13782319
 ] 

Sandy Ryza commented on YARN-1241:
--

Rebased on trunk

> In Fair Scheduler maxRunningApps does not work for non-leaf queues
> --
>
> Key: YARN-1241
> URL: https://issues.apache.org/jira/browse/YARN-1241
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1241-1.patch, YARN-1241.patch
>
>
> Setting the maxRunningApps property on a parent queue should make it so that 
> the sum of apps in all subqueues can't exceed it.





[jira] [Updated] (YARN-1221) With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely

2013-09-30 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1221:
-

Assignee: Siqi Li

> With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
> -
>
> Key: YARN-1221
> URL: https://issues.apache.org/jira/browse/YARN-1221
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, scheduler
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Siqi Li
> Attachments: YARN1221_v1.patch.txt, YARN1221_v2.patch.txt, 
> YARN1221_v3.patch.txt, YARN1221_v4.patch, YARN1221_v5.patch, YARN1221_v6.patch
>
>






[jira] [Updated] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues

2013-09-30 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1241:
-

Attachment: YARN-1241-1.patch

> In Fair Scheduler maxRunningApps does not work for non-leaf queues
> --
>
> Key: YARN-1241
> URL: https://issues.apache.org/jira/browse/YARN-1241
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1241-1.patch, YARN-1241.patch
>
>
> Setting the maxRunningApps property on a parent queue should make it so that 
> the sum of apps in all subqueues can't exceed it.





[jira] [Commented] (YARN-1221) With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely

2013-09-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782283#comment-13782283
 ] 

Sandy Ryza commented on YARN-1221:
--

+1

> With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
> -
>
> Key: YARN-1221
> URL: https://issues.apache.org/jira/browse/YARN-1221
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, scheduler
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
> Attachments: YARN1221_v1.patch.txt, YARN1221_v2.patch.txt, 
> YARN1221_v3.patch.txt, YARN1221_v4.patch, YARN1221_v5.patch, YARN1221_v6.patch
>
>






[jira] [Updated] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues

2013-09-30 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1241:
-

Attachment: YARN-1241.patch

> In Fair Scheduler maxRunningApps does not work for non-leaf queues
> --
>
> Key: YARN-1241
> URL: https://issues.apache.org/jira/browse/YARN-1241
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1241.patch
>
>
> Setting the maxRunningApps property on a parent queue should make it so that 
> the sum of apps in all subqueues can't exceed it.





[jira] [Commented] (YARN-1221) With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely

2013-09-26 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779516#comment-13779516
 ] 

Sandy Ryza commented on YARN-1221:
--

Thanks [~l201514]!  Good point about the rootQueueMetrics update.  I think it's 
an artifact from when we didn't have hierarchical queues.

Looks like you probably need to rebase on latest trunk - do you mind removing 
the whitespace change on the line with the if when you do?

> With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
> -
>
> Key: YARN-1221
> URL: https://issues.apache.org/jira/browse/YARN-1221
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, scheduler
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
> Attachments: YARN1221_v1.patch.txt, YARN1221_v2.patch.txt
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1221) With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely

2013-09-26 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779173#comment-13779173
 ] 

Sandy Ryza commented on YARN-1221:
--

bq. It will affect the amount shown in the web UI, since they all have a parent 
QueueMetrics, which is the root queue metrics.
Ah, you are totally right.  Also, applied your patch and the issue went away 
for me.

For the ClusterMetricsInfo part, I'm still not convinced on the change, but 
either way we should do it in a separate JIRA.

Also, are you able to add a test?  An easy way to do this might be to just find 
an existing test in TestFairScheduler that, without the patch, has an incorrect 
value of reserved MB at the end and add an assert there.

> With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
> -
>
> Key: YARN-1221
> URL: https://issues.apache.org/jira/browse/YARN-1221
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, scheduler
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
> Attachments: YARN1221_v1.patch.txt
>
>




[jira] [Commented] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues

2013-09-26 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778967#comment-13778967
 ] 

Sandy Ryza commented on YARN-1241:
--

Currently this property works by starting all apps as not runnable, and then 
marking some as runnable in the update thread. Because this thread runs only 
every half second, app start can be delayed by up to a half second.  The 
changes required to make the property work for parent queues are a good 
opportunity for solving this problem as well.

I propose splitting apps in leaf queues into runnable and non-runnable lists, 
and keeping a count of runnable apps in parent queues.  Determining an 
application's runnability at add-time will only require looking at the size of 
the runnable list of the leaf queue and the numRunnable in all the parent 
queues.

Then, when an application is removed, we need to check to see whether any 
applications can now be made runnable.  We find the highest queue in the 
hierarchy that's a parent of the application's queue and that was previously at 
its maxRunningApps capacity.  We go through the apps in all leaf queues under 
that queue in order of start time to see if any can be made runnable.
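The add-time check described above could be sketched as follows (an illustrative toy, not the actual patch; class and field names are hypothetical):

```java
// An app is runnable only if every queue from its leaf up to the root is
// below its maxRunningApps cap, so the check is one walk up the hierarchy.
class Queue {
  final Queue parent;
  final int maxRunningApps;
  int numRunnableApps = 0;
  Queue(Queue parent, int maxRunningApps) {
    this.parent = parent;
    this.maxRunningApps = maxRunningApps;
  }
}

public class RunnabilityDemo {
  static boolean canAppBeRunnable(Queue leaf) {
    for (Queue q = leaf; q != null; q = q.parent) {
      if (q.numRunnableApps >= q.maxRunningApps) {
        return false;
      }
    }
    return true;
  }

  static void markRunnable(Queue leaf) {
    for (Queue q = leaf; q != null; q = q.parent) {
      q.numRunnableApps++;  // keep the count in every ancestor
    }
  }

  public static void main(String[] args) {
    Queue root = new Queue(null, 2);   // parent caps at 2 running apps
    Queue leafA = new Queue(root, 10);
    Queue leafB = new Queue(root, 10);

    markRunnable(leafA);
    markRunnable(leafB);
    // root is now at its cap, so a third app anywhere under it must wait
    System.out.println(canAppBeRunnable(leafA)); // prints false
  }
}
```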

> In Fair Scheduler maxRunningApps does not work for non-leaf queues
> --
>
> Key: YARN-1241
> URL: https://issues.apache.org/jira/browse/YARN-1241
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
>
> Setting the maxRunningApps property on a parent queue should make it so that 
> the sum of apps in all subqueues can't exceed it.



[jira] [Commented] (YARN-1221) With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely

2013-09-25 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778208#comment-13778208
 ] 

Sandy Ryza commented on YARN-1221:
--

bq. The reason I removed the code above is that there is no corresponding 
unreserve method got called.
Good catch.  But that shouldn't affect the amount shown in the web UI, because 
the metrics for which there is double counting are the leaf queue metrics, 
whereas the value in the web UI is based only on the root queue metrics.  
Is that not right?

bq. As far as I saw from the webUI, the available memory never get decremented 
when it allocates memory to mr jobs.
Where are you seeing the available memory reported on the web UI?

To be clear, I'm referring to what's shown under the "Cluster Metrics" section 
when you go to http://:/cluster

> With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
> -
>
> Key: YARN-1221
> URL: https://issues.apache.org/jira/browse/YARN-1221
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, scheduler
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
> Attachments: YARN1221_v1.patch.txt
>
>




[jira] [Created] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues

2013-09-25 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-1241:


 Summary: In Fair Scheduler maxRunningApps does not work for 
non-leaf queues
 Key: YARN-1241
 URL: https://issues.apache.org/jira/browse/YARN-1241
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Sandy Ryza


Setting the maxRunningApps property on a parent queue should make it so that 
the sum of apps in all subqueues can't exceed it.



[jira] [Commented] (YARN-1221) With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely

2013-09-25 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778174#comment-13778174
 ] 

Sandy Ryza commented on YARN-1221:
--

{code}
-this.totalMB = availableMB + reservedMB + allocatedMB;
+this.totalMB = availableMB;
{code}
Total MB should still include allocatedMB.  I agree that reservedMB should be 
removed from it, but I think this is work for a separate JIRA.  This one is for 
dealing with why reservedMB is calculated incorrectly.
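To make the arithmetic concrete (a toy sketch, not the actual ClusterMetricsInfo code; the numbers are made up):

```java
// The point being made: the cluster total should still count allocated
// memory; whether reservedMB belongs in it is the separate JIRA's question.
public class TotalMbDemo {
  public static void main(String[] args) {
    long availableMB = 6144;
    long allocatedMB = 2048;
    long reservedMB = 1024;  // proposed to be dropped from the total
    long totalMB = availableMB + allocatedMB;
    System.out.println(totalMB); // prints 8192
  }
}
```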

{code}
-  getMetrics().reserveResource(app.getUser(),
-  container.getResource());
{code}
Can you explain the rationale behind removing this?

> With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
> -
>
> Key: YARN-1221
> URL: https://issues.apache.org/jira/browse/YARN-1221
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, scheduler
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
> Attachments: YARN1221_v1.patch.txt
>
>




[jira] [Updated] (YARN-1228) Clean up Fair Scheduler configuration loading

2013-09-25 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1228:
-

Attachment: YARN-1228-2.patch

> Clean up Fair Scheduler configuration loading
> -
>
> Key: YARN-1228
> URL: https://issues.apache.org/jira/browse/YARN-1228
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 2.1.1-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1228-1.patch, YARN-1228-2.patch, YARN-1228.patch
>
>
> Currently the Fair Scheduler is configured in two ways
> * An allocations file that has a different format than the standard Hadoop 
> configuration file, which makes it easier to specify hierarchical objects 
> like queues and their properties. 
> * With properties like yarn.scheduler.fair.max.assign that are specified in 
> the standard Hadoop configuration format.
> The standard and default way of configuring it is to use fair-scheduler.xml 
> as the allocations file and to put the yarn.scheduler properties in 
> yarn-site.xml.
> It is also possible to specify a different file as the allocations file, and 
> to place the yarn.scheduler properties in fair-scheduler.xml, which will be 
> interpreted as in the standard Hadoop configuration format.  This flexibility 
> is both confusing and unnecessary.
> Additionally, the allocation file is loaded as fair-scheduler.xml from the 
> classpath if it is not specified, but is loaded as a File if it is.  This 
> causes two problems
> 1. We see different behavior when not setting the 
> yarn.scheduler.fair.allocation.file, and setting it to fair-scheduler.xml, 
> which is its default.
> 2. Classloaders may choose to cache resources, which can break the reload 
> logic when yarn.scheduler.fair.allocation.file is not specified.
> We should never allow the yarn.scheduler properties to go into 
> fair-scheduler.xml.  And we should always load the allocations file as a 
> file, not as a resource on the classpath.  To preserve existing behavior and 
> allow loading files from the classpath, we can look for files on the 
> classpath, but strip off their scheme and interpret them as Files.



[jira] [Commented] (YARN-1228) Clean up Fair Scheduler configuration loading

2013-09-25 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777661#comment-13777661
 ] 

Sandy Ryza commented on YARN-1228:
--

Updated patch adds license header to test-fair-scheduler.xml

> Clean up Fair Scheduler configuration loading
> -
>
> Key: YARN-1228
> URL: https://issues.apache.org/jira/browse/YARN-1228
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 2.1.1-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1228-1.patch, YARN-1228-2.patch, YARN-1228.patch
>
>
> Currently the Fair Scheduler is configured in two ways
> * An allocations file that has a different format than the standard Hadoop 
> configuration file, which makes it easier to specify hierarchical objects 
> like queues and their properties. 
> * With properties like yarn.scheduler.fair.max.assign that are specified in 
> the standard Hadoop configuration format.
> The standard and default way of configuring it is to use fair-scheduler.xml 
> as the allocations file and to put the yarn.scheduler properties in 
> yarn-site.xml.
> It is also possible to specify a different file as the allocations file, and 
> to place the yarn.scheduler properties in fair-scheduler.xml, which will be 
> interpreted as in the standard Hadoop configuration format.  This flexibility 
> is both confusing and unnecessary.
> Additionally, the allocation file is loaded as fair-scheduler.xml from the 
> classpath if it is not specified, but is loaded as a File if it is.  This 
> causes two problems
> 1. We see different behavior when not setting the 
> yarn.scheduler.fair.allocation.file, and setting it to fair-scheduler.xml, 
> which is its default.
> 2. Classloaders may choose to cache resources, which can break the reload 
> logic when yarn.scheduler.fair.allocation.file is not specified.
> We should never allow the yarn.scheduler properties to go into 
> fair-scheduler.xml.  And we should always load the allocations file as a 
> file, not as a resource on the classpath.  To preserve existing behavior and 
> allow loading files from the classpath, we can look for files on the 
> classpath, but strip off their scheme and interpret them as Files.



[jira] [Updated] (YARN-1236) FairScheduler setting queue name in RMApp is not working

2013-09-25 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1236:
-

Attachment: YARN-1236.patch

> FairScheduler setting queue name in RMApp is not working 
> -
>
> Key: YARN-1236
> URL: https://issues.apache.org/jira/browse/YARN-1236
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1236.patch
>
>
> The fair scheduler sometimes picks a different queue than the one an 
> application was submitted to, such as when user-as-default-queue is turned 
> on.  It needs to update the queue name in the RMApp so that this choice will 
> be reflected in the UI.
> This isn't working because the scheduler is looking up the RMApp by 
> application attempt id instead of app id and failing to find it.



[jira] [Created] (YARN-1236) FairScheduler setting queue name in RMApp is not working

2013-09-25 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-1236:


 Summary: FairScheduler setting queue name in RMApp is not working 
 Key: YARN-1236
 URL: https://issues.apache.org/jira/browse/YARN-1236
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-1236.patch

The fair scheduler sometimes picks a different queue than the one an 
application was submitted to, such as when user-as-default-queue is turned on.  
It needs to update the queue name in the RMApp so that this choice will be 
reflected in the UI.

This isn't working because the scheduler is looking up the RMApp by application 
attempt id instead of app id and failing to find it.



[jira] [Commented] (YARN-1228) Clean up Fair Scheduler configuration loading

2013-09-25 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777572#comment-13777572
 ] 

Sandy Ryza commented on YARN-1228:
--

Updated patch includes a test

> Clean up Fair Scheduler configuration loading
> -
>
> Key: YARN-1228
> URL: https://issues.apache.org/jira/browse/YARN-1228
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 2.1.1-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1228-1.patch, YARN-1228.patch
>
>
> Currently the Fair Scheduler is configured in two ways
> * An allocations file that has a different format than the standard Hadoop 
> configuration file, which makes it easier to specify hierarchical objects 
> like queues and their properties. 
> * With properties like yarn.scheduler.fair.max.assign that are specified in 
> the standard Hadoop configuration format.
> The standard and default way of configuring it is to use fair-scheduler.xml 
> as the allocations file and to put the yarn.scheduler properties in 
> yarn-site.xml.
> It is also possible to specify a different file as the allocations file, and 
> to place the yarn.scheduler properties in fair-scheduler.xml, which will be 
> interpreted as in the standard Hadoop configuration format.  This flexibility 
> is both confusing and unnecessary.
> Additionally, the allocation file is loaded as fair-scheduler.xml from the 
> classpath if it is not specified, but is loaded as a File if it is.  This 
> causes two problems
> 1. We see different behavior when not setting the 
> yarn.scheduler.fair.allocation.file, and setting it to fair-scheduler.xml, 
> which is its default.
> 2. Classloaders may choose to cache resources, which can break the reload 
> logic when yarn.scheduler.fair.allocation.file is not specified.
> We should never allow the yarn.scheduler properties to go into 
> fair-scheduler.xml.  And we should always load the allocations file as a 
> file, not as a resource on the classpath.  To preserve existing behavior and 
> allow loading files from the classpath, we can look for files on the 
> classpath, but strip off their scheme and interpret them as Files.
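A minimal sketch of the loading rule proposed in this description (the resolver name and lookup order are assumptions, not the actual patch): look the file up on the classpath when the configured path is not absolute, then strip the URL scheme and always hand back a plain File, so the reload logic stats the file itself rather than going through a possibly-caching ClassLoader.

```java
// Hypothetical allocation-file resolver following the rule above.
import java.io.File;
import java.net.URL;

public class AllocationFileResolver {
  static File resolve(String allocFilePath) {
    File f = new File(allocFilePath);
    if (f.isAbsolute()) {
      return f;                       // explicit path: use it directly
    }
    // Relative path: treat it as a classpath resource name,
    // e.g. the default fair-scheduler.xml.
    URL url = Thread.currentThread().getContextClassLoader()
        .getResource(allocFilePath);
    if (url == null) {
      return null;                    // not found anywhere
    }
    // Strip the scheme ("file:") so callers get a plain File whose
    // modification time can be polled for reloads.
    return new File(url.getPath());
  }

  public static void main(String[] args) {
    System.out.println(resolve("/etc/hadoop/fair-scheduler.xml"));
  }
}
```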



[jira] [Updated] (YARN-1228) Clean up Fair Scheduler configuration loading

2013-09-25 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1228:
-

Attachment: YARN-1228-1.patch

> Clean up Fair Scheduler configuration loading
> -
>
> Key: YARN-1228
> URL: https://issues.apache.org/jira/browse/YARN-1228
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 2.1.1-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1228-1.patch, YARN-1228.patch
>
>
> Currently the Fair Scheduler is configured in two ways
> * An allocations file that has a different format than the standard Hadoop 
> configuration file, which makes it easier to specify hierarchical objects 
> like queues and their properties. 
> * With properties like yarn.scheduler.fair.max.assign that are specified in 
> the standard Hadoop configuration format.
> The standard and default way of configuring it is to use fair-scheduler.xml 
> as the allocations file and to put the yarn.scheduler properties in 
> yarn-site.xml.
> It is also possible to specify a different file as the allocations file, and 
> to place the yarn.scheduler properties in fair-scheduler.xml, which will be 
> interpreted as in the standard Hadoop configuration format.  This flexibility 
> is both confusing and unnecessary.
> Additionally, the allocation file is loaded as fair-scheduler.xml from the 
> classpath if it is not specified, but is loaded as a File if it is.  This 
> causes two problems
> 1. We see different behavior when not setting the 
> yarn.scheduler.fair.allocation.file, and setting it to fair-scheduler.xml, 
> which is its default.
> 2. Classloaders may choose to cache resources, which can break the reload 
> logic when yarn.scheduler.fair.allocation.file is not specified.
> We should never allow the yarn.scheduler properties to go into 
> fair-scheduler.xml.  And we should always load the allocations file as a 
> file, not as a resource on the classpath.  To preserve existing behavior and 
> allow loading files from the classpath, we can look for files on the 
> classpath, but strip off their scheme and interpret them as Files.



[jira] [Updated] (YARN-1128) FifoPolicy.computeShares throws NPE on empty list of Schedulables

2013-09-24 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1128:
-

Hadoop Flags: Reviewed

Committed to trunk, branch-2, and branch-2.1-beta

> FifoPolicy.computeShares throws NPE on empty list of Schedulables
> -
>
> Key: YARN-1128
> URL: https://issues.apache.org/jira/browse/YARN-1128
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Karthik Kambatla
> Fix For: 2.1.2-beta
>
> Attachments: yarn-1128-1.patch
>
>
> FifoPolicy gives all of a queue's share to the earliest-scheduled application.
> {code}
> Schedulable earliest = null;
> for (Schedulable schedulable : schedulables) {
>   if (earliest == null ||
>   schedulable.getStartTime() < earliest.getStartTime()) {
> earliest = schedulable;
>   }
> }
> earliest.setFairShare(Resources.clone(totalResources));
> {code}
> If the queue has no schedulables in it, earliest will be left null, leading 
> to an NPE on the last line.
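One way to guard against the empty list, sketched with a stripped-down stand-in for Schedulable (a sketch of the idea, not the committed yarn-1128 patch):

```java
// Guarded version of the FifoPolicy share computation quoted above:
// bail out on an empty list so 'earliest' can never be left null.
import java.util.Collections;
import java.util.List;

public class FifoShareSketch {
  static class Schedulable {
    long startTime;
    long fairShare = -1;  // -1 means "no share assigned yet"
    Schedulable(long t) { startTime = t; }
  }

  // Give the queue's whole share to the earliest-started schedulable.
  static void computeShares(List<Schedulable> schedulables, long total) {
    if (schedulables.isEmpty()) {
      return;  // guard that avoids the NPE described above
    }
    Schedulable earliest = null;
    for (Schedulable s : schedulables) {
      if (earliest == null || s.startTime < earliest.startTime) {
        earliest = s;
      }
    }
    earliest.fairShare = total;
  }

  public static void main(String[] args) {
    computeShares(Collections.<Schedulable>emptyList(), 100);  // no NPE
  }
}
```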



[jira] [Commented] (YARN-1089) Add YARN compute units alongside virtual cores

2013-09-24 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777071#comment-13777071
 ] 

Sandy Ryza commented on YARN-1089:
--

As was requested, I posted a summary of the proposal on YARN-1024.

In case it's not clear on the summary, here's the problem we're trying to solve:
We want jobs to be portable between clusters. CPU is not a fluid resource in 
the way memory is. The number of cores on a machine is just as important as its 
total processing power when scheduling tasks.

Imagine a cluster where every node has powerful CPUs with many cores.  One type 
of task that will be run on the cluster saturates a full CPU, but another type 
of task that will be run on the cluster contains two threads, each of which can 
saturate only half a full CPU.  If we have a single dimension for CPU requests, 
these tasks will request an equal amount of it.  What happens if we then 
move those tasks to a cluster with CPUs whose cores are half as fast?  The 
first task will run half as fast, and the second task will run in the same 
amount of time.  It's in the first task's interest to only request half as many 
CPU resources on that cluster.

I'm also afraid of things getting complicated, but I can't think of anything 
better that doesn't require having the meaning of a virtual core vary widely 
from cluster to cluster.
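The portability argument can be made concrete with made-up numbers; the "units" below are illustrative and not part of any YARN API. Each thread can use at most one core's worth of power, so the single-thread task should halve its request on half-speed cores while the two-thread task should not:

```java
// Illustrative arithmetic for the portability argument above.
public class ComputeUnitSketch {
  // Units a task can actually use: each thread is capped at one core.
  static int usableUnits(int threads, int perThreadUnits, int coreUnits) {
    return threads * Math.min(perThreadUnits, coreUnits);
  }

  public static void main(String[] args) {
    int fastCore = 4, slowCore = 2;  // made-up units per core
    // Task 1: one thread that saturates a full fast core.
    System.out.println(usableUnits(1, 4, fastCore));  // 4 on the fast cluster
    System.out.println(usableUnits(1, 4, slowCore));  // 2 on the slow cluster
    // Task 2: two threads, each saturating half a fast core.
    System.out.println(usableUnits(2, 2, fastCore));  // 4 on the fast cluster
    System.out.println(usableUnits(2, 2, slowCore));  // 4 on the slow cluster
  }
}
```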

> Add YARN compute units alongside virtual cores
> --
>
> Key: YARN-1089
> URL: https://issues.apache.org/jira/browse/YARN-1089
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: api
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1089-1.patch, YARN-1089.patch
>
>
> Based on discussion in YARN-1024, we will add YARN compute units as a 
> resource for requesting and scheduling CPU processing power.



[jira] [Commented] (YARN-1089) Add YARN compute units alongside virtual cores

2013-09-24 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777034#comment-13777034
 ] 

Sandy Ryza commented on YARN-1089:
--

I'm ok with waiting until 2.3.  In case it's not clear, the consequence of 
this is that until then it will be impossible to place more tasks on a node 
than its number of virtual cores, which is essentially its number of physical 
cores.

I think we should make YARN-976, documenting the meaning of vcores, a blocker 
for 2.2.

> Add YARN compute units alongside virtual cores
> --
>
> Key: YARN-1089
> URL: https://issues.apache.org/jira/browse/YARN-1089
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: api
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1089-1.patch, YARN-1089.patch
>
>
> Based on discussion in YARN-1024, we will add YARN compute units as a 
> resource for requesting and scheduling CPU processing power.



[jira] [Created] (YARN-1230) Fair scheduler aclSubmitApps does not handle acls with only groups

2013-09-23 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-1230:


 Summary: Fair scheduler aclSubmitApps does not handle acls with 
only groups
 Key: YARN-1230
 URL: https://issues.apache.org/jira/browse/YARN-1230
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.1.1-beta
Reporter: Sandy Ryza


ACLs are specified like "user1,user2 group1,group2".  An ACL containing only 
groups must be given as " group1,group2", with a leading space, but it will be 
interpreted incorrectly by the Fair Scheduler because it trims the leading 
space.
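A toy parser (hypothetical, not the Fair Scheduler's actual code) shows why the trimming matters: the string is split on the first space, so a leading space is what marks the users part as empty.

```java
// Sketch of the ACL parsing problem described above: "users groups"
// is split on the first space, so trimming a groups-only entry
// silently turns the groups into users.
public class AclParseSketch {
  static String[] parse(String acl) {
    int i = acl.indexOf(' ');
    String users = (i == -1) ? acl : acl.substring(0, i);
    String groups = (i == -1) ? "" : acl.substring(i + 1);
    return new String[] { users, groups };
  }

  public static void main(String[] args) {
    String groupsOnly = " group1,group2";
    String[] ok = parse(groupsOnly);            // users="", groups set
    String[] broken = parse(groupsOnly.trim()); // groups misread as users
    System.out.println(ok[1]);      // group1,group2
    System.out.println(broken[0]);  // group1,group2
  }
}
```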



[jira] [Commented] (YARN-1228) Clean up Fair Scheduler configuration loading

2013-09-23 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13775769#comment-13775769
 ] 

Sandy Ryza commented on YARN-1228:
--

Existing tests verify that absolute paths and the no-file case both work.  Adding 
a file to the classpath at runtime is difficult, so I verified that it picks up 
files from the classpath by manually testing on a pseudo-distributed cluster.


> Clean up Fair Scheduler configuration loading
> -
>
> Key: YARN-1228
> URL: https://issues.apache.org/jira/browse/YARN-1228
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 2.1.1-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1228.patch
>
>
> Currently the Fair Scheduler is configured in two ways
> * An allocations file that has a different format than the standard Hadoop 
> configuration file, which makes it easier to specify hierarchical objects 
> like queues and their properties. 
> * With properties like yarn.scheduler.fair.max.assign that are specified in 
> the standard Hadoop configuration format.
> The standard and default way of configuring it is to use fair-scheduler.xml 
> as the allocations file and to put the yarn.scheduler properties in 
> yarn-site.xml.
> It is also possible to specify a different file as the allocations file, and 
> to place the yarn.scheduler properties in fair-scheduler.xml, which will be 
> interpreted as in the standard Hadoop configuration format.  This flexibility 
> is both confusing and unnecessary.
> Additionally, the allocation file is loaded as fair-scheduler.xml from the 
> classpath if it is not specified, but is loaded as a File if it is.  This 
> causes two problems
> 1. We see different behavior when not setting the 
> yarn.scheduler.fair.allocation.file, and setting it to fair-scheduler.xml, 
> which is its default.
> 2. Classloaders may choose to cache resources, which can break the reload 
> logic when yarn.scheduler.fair.allocation.file is not specified.
> We should never allow the yarn.scheduler properties to go into 
> fair-scheduler.xml.  And we should always load the allocations file as a 
> file, not as a resource on the classpath.  To preserve existing behavior and 
> allow loading files from the classpath, we can look for files on the 
> classpath, but strip off their scheme and interpret them as Files.



[jira] [Updated] (YARN-1228) Clean up Fair Scheduler configuration loading

2013-09-23 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1228:
-

Attachment: YARN-1228.patch

> Clean up Fair Scheduler configuration loading
> -
>
> Key: YARN-1228
> URL: https://issues.apache.org/jira/browse/YARN-1228
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 2.1.1-beta
>Reporter: Sandy Ryza
> Attachments: YARN-1228.patch
>
>
> Currently the Fair Scheduler is configured in two ways
> * An allocations file that has a different format than the standard Hadoop 
> configuration file, which makes it easier to specify hierarchical objects 
> like queues and their properties. 
> * With properties like yarn.scheduler.fair.max.assign that are specified in 
> the standard Hadoop configuration format.
> The standard and default way of configuring it is to use fair-scheduler.xml 
> as the allocations file and to put the yarn.scheduler properties in 
> yarn-site.xml.
> It is also possible to specify a different file as the allocations file, and 
> to place the yarn.scheduler properties in fair-scheduler.xml, which will be 
> interpreted as in the standard Hadoop configuration format.  This flexibility 
> is both confusing and unnecessary.
> Additionally, the allocation file is loaded as fair-scheduler.xml from the 
> classpath if it is not specified, but is loaded as a File if it is.  This 
> causes two problems
> 1. We see different behavior when not setting the 
> yarn.scheduler.fair.allocation.file, and setting it to fair-scheduler.xml, 
> which is its default.
> 2. Classloaders may choose to cache resources, which can break the reload 
> logic when yarn.scheduler.fair.allocation.file is not specified.
> We should never allow the yarn.scheduler properties to go into 
> fair-scheduler.xml.  And we should always load the allocations file as a 
> file, not as a resource on the classpath.  To preserve existing behavior and 
> allow loading files from the classpath, we can look for files on the 
> classpath, but strip off their scheme and interpret them as Files.



[jira] [Updated] (YARN-1228) Clean up Fair Scheduler configuration loading

2013-09-23 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1228:
-

Description: 
Currently the Fair Scheduler is configured in two ways
* An allocations file that has a different format than the standard Hadoop 
configuration file, which makes it easier to specify hierarchical objects like 
queues and their properties. 
* With properties like yarn.scheduler.fair.max.assign that are specified in the 
standard Hadoop configuration format.

The standard and default way of configuring it is to use fair-scheduler.xml as 
the allocations file and to put the yarn.scheduler properties in yarn-site.xml.

It is also possible to specify a different file as the allocations file, and to 
place the yarn.scheduler properties in fair-scheduler.xml, which will be 
interpreted as in the standard Hadoop configuration format.  This flexibility 
is both confusing and unnecessary.

Additionally, the allocation file is loaded as fair-scheduler.xml from the 
classpath if it is not specified, but is loaded as a File if it is.  This 
causes two problems
1. We see different behavior when not setting the 
yarn.scheduler.fair.allocation.file, and setting it to fair-scheduler.xml, 
which is its default.
2. Classloaders may choose to cache resources, which can break the reload logic 
when yarn.scheduler.fair.allocation.file is not specified.

We should never allow the yarn.scheduler properties to go into 
fair-scheduler.xml.  And we should always load the allocations file as a file, 
not as a resource on the classpath.  To preserve existing behavior and allow 
loading files from the classpath, we can look for files on the classpath, but 
strip off their scheme and interpret them as Files.


  was:
Currently the Fair Scheduler is configured in two ways
* An allocations file that has a different format than the standard Hadoop 
configuration file, which makes it easier to specify hierarchical objects like 
queues and their properties. 
* With properties like yarn.scheduler.fair.max.assign that are specified in the 
standard Hadoop configuration format.

The standard and default way of configuring it is to use fair-scheduler.xml as 
the allocations file and to put the yarn.scheduler properties in yarn-site.xml.

It is also possible to specify a different file as the allocations file, and to 
place the yarn.scheduler properties in fair-scheduler.xml, which will be 
interpreted as in the standard Hadoop configuration format.  This flexibility 
is both confusing and unnecessary.  There's no need to keep around the second 
way.



> Clean up Fair Scheduler configuration loading
> -
>
> Key: YARN-1228
> URL: https://issues.apache.org/jira/browse/YARN-1228
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 2.1.1-beta
>Reporter: Sandy Ryza
>
> Currently the Fair Scheduler is configured in two ways
> * An allocations file that has a different format than the standard Hadoop 
> configuration file, which makes it easier to specify hierarchical objects 
> like queues and their properties. 
> * With properties like yarn.scheduler.fair.max.assign that are specified in 
> the standard Hadoop configuration format.
> The standard and default way of configuring it is to use fair-scheduler.xml 
> as the allocations file and to put the yarn.scheduler properties in 
> yarn-site.xml.
> It is also possible to specify a different file as the allocations file, and 
> to place the yarn.scheduler properties in fair-scheduler.xml, which will be 
> interpreted as in the standard Hadoop configuration format.  This flexibility 
> is both confusing and unnecessary.
> Additionally, the allocation file is loaded as fair-scheduler.xml from the 
> classpath if it is not specified, but is loaded as a File if it is.  This 
> causes two problems
> 1. We see different behavior when not setting the 
> yarn.scheduler.fair.allocation.file, and setting it to fair-scheduler.xml, 
> which is its default.
> 2. Classloaders may choose to cache resources, which can break the reload 
> logic when yarn.scheduler.fair.allocation.file is not specified.
> We should never allow the yarn.scheduler properties to go into 
> fair-scheduler.xml.  And we should always load the allocations file as a 
> file, not as a resource on the classpath.  To preserve existing behavior and 
> allow loading files from the classpath, we can look for files on the 
> classpath, but strip off their scheme and interpret them as Files.



[jira] [Updated] (YARN-1228) Clean up Fair Scheduler configuration loading

2013-09-23 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1228:
-

Summary: Clean up Fair Scheduler configuration loading  (was: Don't allow 
other file than fair-scheduler.xml to be Fair Scheduler allocations file)

> Clean up Fair Scheduler configuration loading
> -
>
> Key: YARN-1228
> URL: https://issues.apache.org/jira/browse/YARN-1228
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 2.1.1-beta
>Reporter: Sandy Ryza
>
> Currently the Fair Scheduler is configured in two ways
> * An allocations file that has a different format than the standard Hadoop 
> configuration file, which makes it easier to specify hierarchical objects 
> like queues and their properties. 
> * With properties like yarn.scheduler.fair.max.assign that are specified in 
> the standard Hadoop configuration format.
> The standard and default way of configuring it is to use fair-scheduler.xml 
> as the allocations file and to put the yarn.scheduler properties in 
> yarn-site.xml.
> It is also possible to specify a different file as the allocations file, and 
> to place the yarn.scheduler properties in fair-scheduler.xml, which will be 
> interpreted as in the standard Hadoop configuration format.  This flexibility 
> is both confusing and unnecessary.  There's no need to keep around the second 
> way.



[jira] [Updated] (YARN-1227) Update Single Cluster doc to use yarn.resourcemanager.hostname

2013-09-23 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1227:
-

Labels: newbie  (was: )

> Update Single Cluster doc to use yarn.resourcemanager.hostname
> --
>
> Key: YARN-1227
> URL: https://issues.apache.org/jira/browse/YARN-1227
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>  Labels: newbie
>
> Now that yarn.resourcemanager.hostname can be used in place of 
> yarn.resourcemanager.address, yarn.resourcemanager.scheduler.address, etc., 
> we should update the doc to use it.



[jira] [Updated] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler

2013-09-23 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1188:
-

Assignee: Tsuyoshi OZAWA

> The context of QueueMetrics becomes 'default' when using FairScheduler
> --
>
> Key: YARN-1188
> URL: https://issues.apache.org/jira/browse/YARN-1188
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Akira AJISAKA
>Assignee: Tsuyoshi OZAWA
>Priority: Minor
>  Labels: metrics, newbie
> Attachments: YARN-1188.1.patch
>
>
> I found the context of QueueMetrics changed to 'default' from 'yarn' when I 
> was using FairScheduler.
> The context should always be 'yarn' by adding an annotation to FSQueueMetrics 
> like below:
> {code}
> + @Metrics(context="yarn")
> public class FSQueueMetrics extends QueueMetrics {
> {code}



[jira] [Created] (YARN-1228) Don't allow other file than fair-scheduler.xml to be Fair Scheduler allocations file

2013-09-23 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-1228:


 Summary: Don't allow other file than fair-scheduler.xml to be Fair 
Scheduler allocations file
 Key: YARN-1228
 URL: https://issues.apache.org/jira/browse/YARN-1228
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.1.1-beta
Reporter: Sandy Ryza


Currently the Fair Scheduler is configured in two ways
* An allocations file that has a different format than the standard Hadoop 
configuration file, which makes it easier to specify hierarchical objects like 
queues and their properties. 
* With properties like yarn.scheduler.fair.max.assign that are specified in the 
standard Hadoop configuration format.

The standard and default way of configuring it is to use fair-scheduler.xml as 
the allocations file and to put the yarn.scheduler properties in yarn-site.xml.

It is also possible to specify a different file as the allocations file, and to 
place the yarn.scheduler properties in fair-scheduler.xml, which will be 
interpreted as in the standard Hadoop configuration format.  This flexibility 
is both confusing and unnecessary.  There's no need to keep around the second 
way.




[jira] [Created] (YARN-1227) Update Single Cluster doc to use yarn.resourcemanager.hostname

2013-09-23 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-1227:


 Summary: Update Single Cluster doc to use 
yarn.resourcemanager.hostname
 Key: YARN-1227
 URL: https://issues.apache.org/jira/browse/YARN-1227
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza


Now that yarn.resourcemanager.hostname can be used in place of 
yarn.resourcemanager.address, yarn.resourcemanager.scheduler.address, etc., we 
should update the doc to use it.



[jira] [Commented] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler

2013-09-23 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774344#comment-13774344
 ] 

Sandy Ryza commented on YARN-1188:
--

I just committed this to trunk and branch-2.  Thanks Tsuyoshi!

> The context of QueueMetrics becomes 'default' when using FairScheduler
> --
>
> Key: YARN-1188
> URL: https://issues.apache.org/jira/browse/YARN-1188
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Akira AJISAKA
>Assignee: Akira AJISAKA
>Priority: Minor
>  Labels: metrics, newbie
> Attachments: YARN-1188.1.patch
>
>
> I found the context of QueueMetrics changed to 'default' from 'yarn' when I 
> was using FairScheduler.
> The context should always be 'yarn' by adding an annotation to FSQueueMetrics 
> like below:
> {code}
> + @Metrics(context="yarn")
> public class FSQueueMetrics extends QueueMetrics {
> {code}



[jira] [Updated] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler

2013-09-23 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1188:
-

Assignee: (was: Akira AJISAKA)

> The context of QueueMetrics becomes 'default' when using FairScheduler
> --
>
> Key: YARN-1188
> URL: https://issues.apache.org/jira/browse/YARN-1188
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Akira AJISAKA
>Priority: Minor
>  Labels: metrics, newbie
> Attachments: YARN-1188.1.patch
>
>
> I found the context of QueueMetrics changed to 'default' from 'yarn' when I 
> was using FairScheduler.
> The context should always be 'yarn' by adding an annotation to FSQueueMetrics 
> like below:
> {code}
> + @Metrics(context="yarn")
> public class FSQueueMetrics extends QueueMetrics {
> {code}



[jira] [Updated] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler

2013-09-23 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1188:
-

Assignee: Akira AJISAKA

> The context of QueueMetrics becomes 'default' when using FairScheduler
> --
>
> Key: YARN-1188
> URL: https://issues.apache.org/jira/browse/YARN-1188
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Akira AJISAKA
>Assignee: Akira AJISAKA
>Priority: Minor
>  Labels: metrics, newbie
> Attachments: YARN-1188.1.patch
>
>
> I found the context of QueueMetrics changed to 'default' from 'yarn' when I 
> was using FairScheduler.
> The context should always be 'yarn' by adding an annotation to FSQueueMetrics 
> like below:
> {code}
> + @Metrics(context="yarn")
> public class FSQueueMetrics extends QueueMetrics {
> {code}



[jira] [Commented] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler

2013-09-23 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774336#comment-13774336
 ] 

Sandy Ryza commented on YARN-1188:
--

+1

> The context of QueueMetrics becomes 'default' when using FairScheduler
> --
>
> Key: YARN-1188
> URL: https://issues.apache.org/jira/browse/YARN-1188
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Akira AJISAKA
>Priority: Minor
>  Labels: metrics, newbie
> Attachments: YARN-1188.1.patch
>
>
> I found the context of QueueMetrics changed to 'default' from 'yarn' when I 
> was using FairScheduler.
> The context should always be 'yarn' by adding an annotation to FSQueueMetrics 
> like below:
> {code}
> + @Metrics(context="yarn")
> public class FSQueueMetrics extends QueueMetrics {
> {code}



[jira] [Updated] (YARN-1089) Add YARN compute units alongside virtual cores

2013-09-19 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1089:
-

Attachment: YARN-1089-1.patch

Updated patch should fix TestFairScheduler and TestSchedulerUtils.  The 
TestRMContainerAllocator failure looks like MAPREDUCE-5514.

> Add YARN compute units alongside virtual cores
> --
>
> Key: YARN-1089
> URL: https://issues.apache.org/jira/browse/YARN-1089
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: api
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1089-1.patch, YARN-1089.patch
>
>
> Based on discussion in YARN-1024, we will add YARN compute units as a 
> resource for requesting and scheduling CPU processing power.



[jira] [Created] (YARN-1221) With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely

2013-09-18 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-1221:


 Summary: With Fair Scheduler, reserved MB reported in RM web UI 
increases indefinitely
 Key: YARN-1221
 URL: https://issues.apache.org/jira/browse/YARN-1221
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager, scheduler
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza






[jira] [Commented] (YARN-1128) FifoPolicy.computeShares throws NPE on empty list of Schedulables

2013-09-18 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770823#comment-13770823
 ] 

Sandy Ryza commented on YARN-1128:
--

+1

> FifoPolicy.computeShares throws NPE on empty list of Schedulables
> -
>
> Key: YARN-1128
> URL: https://issues.apache.org/jira/browse/YARN-1128
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Karthik Kambatla
> Attachments: yarn-1128-1.patch
>
>
> FifoPolicy gives all of a queue's share to the earliest-scheduled application.
> {code}
> Schedulable earliest = null;
> for (Schedulable schedulable : schedulables) {
>   if (earliest == null ||
>   schedulable.getStartTime() < earliest.getStartTime()) {
> earliest = schedulable;
>   }
> }
> earliest.setFairShare(Resources.clone(totalResources));
> {code}
> If the queue has no schedulables in it, earliest will be left null, leading 
> to an NPE on the last line.



[jira] [Created] (YARN-1213) Add an equivalent of mapred.fairscheduler.allow.undeclared.pools to the Fair Scheduler

2013-09-17 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-1213:


 Summary: Add an equivalent of 
mapred.fairscheduler.allow.undeclared.pools to the Fair Scheduler
 Key: YARN-1213
 URL: https://issues.apache.org/jira/browse/YARN-1213
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza






[jira] [Updated] (YARN-1089) Add YARN compute units alongside virtual cores

2013-09-17 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1089:
-

Attachment: YARN-1089.patch

> Add YARN compute units alongside virtual cores
> --
>
> Key: YARN-1089
> URL: https://issues.apache.org/jira/browse/YARN-1089
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: api
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1089.patch
>
>
> Based on discussion in YARN-1024, we will add YARN compute units as a 
> resource for requesting and scheduling CPU processing power.



[jira] [Updated] (YARN-1089) Add YARN compute units alongside virtual cores

2013-09-17 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1089:
-

Attachment: (was: YARN-1089.patch)

> Add YARN compute units alongside virtual cores
> --
>
> Key: YARN-1089
> URL: https://issues.apache.org/jira/browse/YARN-1089
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: api
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
>
> Based on discussion in YARN-1024, we will add YARN compute units as a 
> resource for requesting and scheduling CPU processing power.



[jira] [Updated] (YARN-1089) Add YARN compute units alongside virtual cores

2013-09-17 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1089:
-

Attachment: YARN-1089.patch

> Add YARN compute units alongside virtual cores
> --
>
> Key: YARN-1089
> URL: https://issues.apache.org/jira/browse/YARN-1089
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: api
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-1089.patch
>
>
> Based on discussion in YARN-1024, we will add YARN compute units as a 
> resource for requesting and scheduling CPU processing power.



[jira] [Commented] (YARN-1206) Container logs link is broken on RM web UI after application finished

2013-09-17 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770430#comment-13770430
 ] 

Sandy Ryza commented on YARN-1206:
--

Did this work differently prior to YARN-649?  My impression was that this was 
the existing behavior.

> Container logs link is broken on RM web UI after application finished
> -
>
> Key: YARN-1206
> URL: https://issues.apache.org/jira/browse/YARN-1206
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Priority: Blocker
>  Labels: 2.1.1-beta
>
> With log aggregation disabled, when container is running, its logs link works 
> properly, but after the application is finished, the link shows 'Container 
> does not exist.'



[jira] [Updated] (YARN-1206) Container logs link is broken on RM web UI after application finished

2013-09-17 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1206:
-

Labels:   (was: 2.1.1-beta)

> Container logs link is broken on RM web UI after application finished
> -
>
> Key: YARN-1206
> URL: https://issues.apache.org/jira/browse/YARN-1206
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Priority: Blocker
>
> With log aggregation disabled, when container is running, its logs link works 
> properly, but after the application is finished, the link shows 'Container 
> does not exist.'



[jira] [Created] (YARN-1212) Add a Scheduling Concepts page to the web doc

2013-09-17 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-1212:


 Summary: Add a Scheduling Concepts page to the web doc
 Key: YARN-1212
 URL: https://issues.apache.org/jira/browse/YARN-1212
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.1.1-beta
Reporter: Sandy Ryza
Assignee: Sandy Ryza


It would be helpful to have a page that explains some of the non-obvious 
concepts used in YARN scheduling.  We get a lot of questions about these on the 
user lists because they aren't really documented anywhere.

An incomplete list of concepts to cover:
* Resources (memory / CPU)
* Reservations
* ResourceRequest format
* Disabling locality relaxation



[jira] [Updated] (YARN-1184) ClassCastException is thrown during preemption When a huge job is submitted to a queue B whose resources is used by a job in queueA

2013-09-13 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1184:
-

Component/s: capacityscheduler

> ClassCastException is thrown during preemption When a huge job is submitted 
> to a queue B whose resources is used by a job in queueA
> ---
>
> Key: YARN-1184
> URL: https://issues.apache.org/jira/browse/YARN-1184
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler, resourcemanager
>Affects Versions: 2.1.0-beta
>Reporter: J.Andreina
>Assignee: Devaraj K
>
> Preemption is enabled.
> Queues = a, b
> a capacity = 30%
> b capacity = 70%
> Step 1: Assign a big job to queue a (so that job_a will utilize some 
> resources from queue b)
> Step 2: Assign a big job to queue b.
> The following exception is thrown at the ResourceManager:
> {noformat}
> 2013-09-12 10:42:32,535 ERROR [SchedulingMonitor 
> (ProportionalCapacityPreemptionPolicy)] yarn.YarnUncaughtExceptionHandler 
> (YarnUncaughtExceptionHandler.java:uncaughtException(68)) - Thread 
> Thread[SchedulingMonitor (ProportionalCapacityPreemptionPolicy),5,main] threw 
> an Exception.
> java.lang.ClassCastException: java.util.Collections$UnmodifiableSet cannot be 
> cast to java.util.NavigableSet
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.getContainersToPreempt(ProportionalCapacityPreemptionPolicy.java:403)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.containerBasedPreemptOrKill(ProportionalCapacityPreemptionPolicy.java:202)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.editSchedule(ProportionalCapacityPreemptionPolicy.java:173)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.monitor.SchedulingMonitor.invokePolicy(SchedulingMonitor.java:72)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.monitor.SchedulingMonitor$PreemptionChecker.run(SchedulingMonitor.java:82)
>   at java.lang.Thread.run(Thread.java:662)
> {noformat}



[jira] [Commented] (YARN-938) Hadoop 2 benchmarking

2013-09-10 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763829#comment-13763829
 ] 

Sandy Ryza commented on YARN-938:
-

On vacation now, but I'll try to assemble them into a presentable form when I 
get back.

> Hadoop 2 benchmarking 
> --
>
> Key: YARN-938
> URL: https://issues.apache.org/jira/browse/YARN-938
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Mayank Bansal
>Assignee: Mayank Bansal
> Attachments: Hadoop-benchmarking-2.x-vs-1.x.xls
>
>
> I am running the benchmarks on Hadoop 2 and will update the results soon.
> Thanks,
> Mayank



[jira] [Commented] (YARN-938) Hadoop 2 benchmarking

2013-09-10 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763798#comment-13763798
 ] 

Sandy Ryza commented on YARN-938:
-

Thanks for working on these, [~mayank_bansal].  The results are pretty 
consistent with some internal benchmarking we've done at Cloudera.

A few questions:
* In MR1 was io.sort.record.percent tuned to spill the same number of times as 
MR2 does?
* What was slowstart completed maps set to?
* How many slots and MB were the TTs and NMs configured with?
* Any idea what caused the improvement between RC1 and the final release?  I'm 
guessing MAPREDUCE-5399 helped.


> Hadoop 2 benchmarking 
> --
>
> Key: YARN-938
> URL: https://issues.apache.org/jira/browse/YARN-938
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Mayank Bansal
>Assignee: Mayank Bansal
> Attachments: Hadoop-benchmarking-2.x-vs-1.x.xls
>
>
> I am running the benchmarks on Hadoop 2 and will update the results soon.
> Thanks,
> Mayank



[jira] [Created] (YARN-1171) Add defaultQueueSchedulingPolicy to Fair Scheduler documentation

2013-09-09 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-1171:


 Summary: Add defaultQueueSchedulingPolicy to Fair Scheduler 
documentation 
 Key: YARN-1171
 URL: https://issues.apache.org/jira/browse/YARN-1171
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: documentation, scheduler
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza


The Fair Scheduler doc is missing the defaultQueueSchedulingPolicy property.  I 
suspect there are a few other ones too that provide defaults for all queues.



[jira] [Commented] (YARN-1049) ContainerExistStatus should define a status for preempted containers

2013-09-07 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761164#comment-13761164
 ] 

Sandy Ryza commented on YARN-1049:
--

+1

> ContainerExistStatus should define a status for preempted containers
> 
>
> Key: YARN-1049
> URL: https://issues.apache.org/jira/browse/YARN-1049
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: api
>Affects Versions: 2.1.0-beta
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
>Priority: Blocker
> Fix For: 2.1.1-beta
>
> Attachments: YARN-1049.patch
>
>
> With the current behavior it is impossible to determine whether a container 
> has been preempted or lost due to an NM crash.
> Adding a PREEMPTED exit status (-102) will help an AM determine that a 
> container has been preempted.
> Note the change of scope from the original summary/description. The original 
> scope proposed API/behavior changes. Because we are past 2.1.0-beta, I'm 
> reducing the scope of this JIRA.



[jira] [Commented] (YARN-910) Allow auxiliary services to listen for container starts and completions

2013-09-05 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13759511#comment-13759511
 ] 

Sandy Ryza commented on YARN-910:
-

Can we state in the initializeContainer doc exactly when it will be called?  
I.e. whether it's after the AM requests a container launch or after the 
actual container is launched.

There's a spurious whitespace change in ApplicationImpl.java.

Otherwise, LGTM

> Allow auxiliary services to listen for container starts and completions
> ---
>
> Key: YARN-910
> URL: https://issues.apache.org/jira/browse/YARN-910
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Alejandro Abdelnur
> Attachments: YARN-910.patch, YARN-910.patch
>
>
> Making container start and completion events available to auxiliary services 
> would allow them to be resource-aware.  An auxiliary service could notify a 
> co-located service that opportunistically uses free capacity whenever 
> allocations change.



[jira] [Commented] (YARN-1100) Giving multiple commands to ContainerLaunchContext doesn't work as expected

2013-09-03 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756795#comment-13756795
 ] 

Sandy Ryza commented on YARN-1100:
--

In the MR and distributed shell code, the command list does not seem to be used 
for holding arguments to the same process.  All arguments are concatenated and 
placed in a single entry in the commands list.

> Giving multiple commands to ContainerLaunchContext doesn't work as expected
> ---
>
> Key: YARN-1100
> URL: https://issues.apache.org/jira/browse/YARN-1100
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: api, nodemanager
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>
> A ContainerLaunchContext accepts a list of commands (as strings) to be 
> executed to launch the container.  I would expect that giving a list with the 
> two commands "echo yolo" and "date" would print something like
> {code}
> yolo
> Mon Aug 26 14:40:23 PDT 2013
> {code}
> Instead it prints
> {code}
> yolo date
> {code}
> This is because the commands get executed with:
> {code}
> exec /bin/bash -c "echo yolo date"
> {code}
> To get the expected behavior I have to include semicolons at the end of each 
> command.  At the very least this should be documented, but it would be better 
> for the NM to insert the semicolons itself.



[jira] [Commented] (YARN-649) Make container logs available over HTTP in plain text

2013-09-03 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756793#comment-13756793
 ] 

Sandy Ryza commented on YARN-649:
-

Thanks Vinod!

> Make container logs available over HTTP in plain text
> -
>
> Key: YARN-649
> URL: https://issues.apache.org/jira/browse/YARN-649
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.0.4-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Fix For: 2.3.0
>
> Attachments: YARN-649-2.patch, YARN-649-3.patch, YARN-649-4.patch, 
> YARN-649-5.patch, YARN-649-6.patch, YARN-649-7.patch, YARN-649.patch, 
> YARN-752-1.patch
>
>
> It would be good to make container logs available over the REST API for 
> MAPREDUCE-4362 and so that they can be accessed programmatically in general.



[jira] [Commented] (YARN-1135) Revisit log message levels in FairScheduler.

2013-09-02 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755977#comment-13755977
 ] 

Sandy Ryza commented on YARN-1135:
--

Agreed

> Revisit log message levels in FairScheduler.
> 
>
> Key: YARN-1135
> URL: https://issues.apache.org/jira/browse/YARN-1135
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 2.1.0-beta
>Reporter: Alejandro Abdelnur
>Assignee: Sandy Ryza
> Fix For: 2.1.1-beta
>
>
> FairScheduler logs allocation attempts at the INFO level; those should be at 
> the DEBUG level:
> {code}
> 2013-09-02 09:14:47,815 INFO  [ResourceManager Event Processor] 
> fair.AppSchedulable (AppSchedulable.java:assignContainer(277)) - Node offered 
> to app: application_1378106082360_0001 reserved: false
> 2013-09-02 09:14:47,815 INFO  [ResourceManager Event Processor] 
> fair.AppSchedulable (AppSchedulable.java:assignContainer(277)) - Node offered 
> to app: application_1378106082360_0002 reserved: false
> 2013-09-02 09:14:48,247 INFO  [ResourceManager Event Processor] 
> fair.AppSchedulable (AppSchedulable.java:assignContainer(277)) - Node offered 
> to app: application_1378106082360_0001 reserved: false
> ...
> {code}
> We should curate all log message levels
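A minimal sketch of the proposed change, using java.util.logging so the example is self-contained (the scheduler itself uses a different logging API): the per-offer message is demoted to debug level and guarded, so the string concatenation is skipped entirely when that level is disabled.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LogLevelDemo {
    private static final Logger LOG = Logger.getLogger("fair.AppSchedulable");

    public static void main(String[] args) {
        LOG.setLevel(Level.INFO); // a typical production default

        // Per-heartbeat messages like "Node offered to app" fire once per
        // node/app pair, so they belong at debug level (FINE here). The guard
        // avoids building the message string when the level is off.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("Node offered to app: application_1378106082360_0001"
                + " reserved: false");
        }
        System.out.println("debug-level logging enabled: "
            + LOG.isLoggable(Level.FINE));
    }
}
```

With the logger at INFO, the guarded branch is never entered, which is exactly the cost reduction the JIRA is after.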



[jira] [Commented] (YARN-1090) Job does not get into Pending State

2013-08-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755368#comment-13755368
 ] 

Sandy Ryza commented on YARN-1090:
--

Ah, ok, makes total sense

> Job does not get into Pending State
> ---
>
> Key: YARN-1090
> URL: https://issues.apache.org/jira/browse/YARN-1090
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: yeshavora
>Assignee: Jian He
> Attachments: YARN-1090.patch
>
>
> When there are no resources available to run a job, the next job should go 
> into the pending state. The RM UI should show the next job as a pending app 
> and increment the pending-app counter.
> Currently, however, the next job stays in the ACCEPTED state, no AM has been 
> assigned to it, and the pending app count is not incremented.
> Running 'job -status' shows job state=PREP.
> $ mapred job -status job_1377122233385_0002
> 13/08/21 21:59:23 INFO client.RMProxy: Connecting to ResourceManager at 
> host1/ip1
> Job: job_1377122233385_0002
> Job File: /ABC/.staging/job_1377122233385_0002/job.xml
> Job Tracking URL : http://host1:port1/application_1377122233385_0002/
> Uber job : false
> Number of maps: 0
> Number of reduces: 0
> map() completion: 0.0
> reduce() completion: 0.0
> Job state: PREP
> retired: false
> reason for failure:



[jira] [Commented] (YARN-1090) Job does not get into Pending State

2013-08-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755219#comment-13755219
 ] 

Sandy Ryza commented on YARN-1090:
--

My understanding was that an application is "Pending" if no AM has yet been 
allocated to it.  Can you go a little more into what the issue is with this 
definition?

Also, whatever we decide, is there somewhere where we can document exactly what 
these mean?

> Job does not get into Pending State
> ---
>
> Key: YARN-1090
> URL: https://issues.apache.org/jira/browse/YARN-1090
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: yeshavora
>Assignee: Jian He
> Attachments: YARN-1090.patch
>
>
> When there are no resources available to run a job, the next job should go 
> into the pending state. The RM UI should show the next job as a pending app 
> and increment the pending-app counter.
> Currently, however, the next job stays in the ACCEPTED state, no AM has been 
> assigned to it, and the pending app count is not incremented.
> Running 'job -status' shows job state=PREP.
> $ mapred job -status job_1377122233385_0002
> 13/08/21 21:59:23 INFO client.RMProxy: Connecting to ResourceManager at 
> host1/ip1
> Job: job_1377122233385_0002
> Job File: /ABC/.staging/job_1377122233385_0002/job.xml
> Job Tracking URL : http://host1:port1/application_1377122233385_0002/
> Uber job : false
> Number of maps: 0
> Number of reduces: 0
> map() completion: 0.0
> reduce() completion: 0.0
> Job state: PREP
> retired: false
> reason for failure:



[jira] [Created] (YARN-1128) FifoPolicy.computeShares throws NPE on empty list of Schedulables

2013-08-30 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-1128:


 Summary: FifoPolicy.computeShares throws NPE on empty list of 
Schedulables
 Key: YARN-1128
 URL: https://issues.apache.org/jira/browse/YARN-1128
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza


FifoPolicy gives all of a queue's share to the earliest-scheduled application.

{code}
Schedulable earliest = null;
for (Schedulable schedulable : schedulables) {
  if (earliest == null ||
  schedulable.getStartTime() < earliest.getStartTime()) {
earliest = schedulable;
  }
}
earliest.setFairShare(Resources.clone(totalResources));
{code}

If the queue has no schedulables in it, earliest will be left null, leading to 
an NPE on the last line.
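One plausible fix (a sketch, not the attached yarn-1128-1.patch; the Schedulable stand-in below is a hypothetical minimal class, not YARN's interface) is simply to guard the final assignment:

```java
import java.util.Collections;
import java.util.List;

public class FifoShareDemo {
    // Minimal stand-in for the Fair Scheduler's Schedulable, with only the
    // fields this sketch needs.
    static class Schedulable {
        final long startTime;
        long fairShare;
        Schedulable(long startTime) { this.startTime = startTime; }
    }

    // The loop from the report, plus the missing null check: with an empty
    // list, earliest stays null and the original code would throw an NPE.
    static Schedulable assignToEarliest(List<Schedulable> schedulables,
                                        long totalResources) {
        Schedulable earliest = null;
        for (Schedulable s : schedulables) {
            if (earliest == null || s.startTime < earliest.startTime) {
                earliest = s;
            }
        }
        if (earliest != null) {
            earliest.fairShare = totalResources;
        }
        return earliest;
    }

    public static void main(String[] args) {
        // Empty queue: previously an NPE, with the guard a harmless no-op.
        System.out.println(
            assignToEarliest(Collections.<Schedulable>emptyList(), 100) == null);
    }
}
```

The empty-queue call returns null instead of throwing, which is the behavior the bug report asks for.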



[jira] [Commented] (YARN-1122) FairScheduler user-as-default-queue always defaults to 'default'

2013-08-29 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753804#comment-13753804
 ] 

Sandy Ryza commented on YARN-1122:
--

Thanks for taking this up, [~lohit].  Would it be possible to add a test?

> FairScheduler user-as-default-queue always defaults to 'default'
> 
>
> Key: YARN-1122
> URL: https://issues.apache.org/jira/browse/YARN-1122
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 2.0.5-alpha
>Reporter: Lohit Vijayarenu
> Attachments: YARN-1122.1.patch
>
>
> By default the YARN FairScheduler should use the user name as the queue name, 
> but we see that in our clusters all jobs were ending up in the default queue. 
> Even after picking YARN-333, which is part of trunk, the behavior remains the 
> same. Jobs do end up in the right queue, but from the UI perspective they are 
> shown as running under the default queue. It looks like there is a small bug 
> with
> {noformat}
> RMApp rmApp = rmContext.getRMApps().get(applicationAttemptId);
> {noformat}
> which should actually be
> {noformat}
> RMApp rmApp = 
> rmContext.getRMApps().get(applicationAttemptId.getApplicationId());
> {noformat}
> There is also a simple js change needed for filtering of jobs on 
> fairscheduler UI page.



[jira] [Updated] (YARN-1034) Remove "experimental" in the Fair Scheduler documentation

2013-08-28 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1034:
-

Fix Version/s: 2.1.1-beta

> Remove "experimental" in the Fair Scheduler documentation
> -
>
> Key: YARN-1034
> URL: https://issues.apache.org/jira/browse/YARN-1034
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: documentation, scheduler
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Karthik Kambatla
>Priority: Trivial
>  Labels: doc
> Fix For: 2.1.1-beta
>
> Attachments: yarn-1034-1.patch
>
>
> The YARN Fair Scheduler is largely stable now, and should no longer be 
> declared experimental.



[jira] [Commented] (YARN-1034) Remove "experimental" in the Fair Scheduler documentation

2013-08-28 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753194#comment-13753194
 ] 

Sandy Ryza commented on YARN-1034:
--

+1

> Remove "experimental" in the Fair Scheduler documentation
> -
>
> Key: YARN-1034
> URL: https://issues.apache.org/jira/browse/YARN-1034
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: documentation, scheduler
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Karthik Kambatla
>Priority: Trivial
>  Labels: doc
> Attachments: yarn-1034-1.patch
>
>
> The YARN Fair Scheduler is largely stable now, and should no longer be 
> declared experimental.



[jira] [Created] (YARN-1110) NodeManager doesn't complete container after transition from LOCALIZED to KILLING

2013-08-27 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-1110:


 Summary: NodeManager doesn't complete container after transition 
from LOCALIZED to KILLING
 Key: YARN-1110
 URL: https://issues.apache.org/jira/browse/YARN-1110
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza


Multiple containers are sticking around on an NM, taking up resources, after 
they have been killed.

{code}
2013-08-27 15:56:36,597 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl:
 Start request for container_1377559361179_0018_01_001337 by user llama
2013-08-27 15:56:36,597 INFO 
org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=llama
IP=10.20.191.233OPERATION=Start Container Request   
TARGET=ContainerManageImpl  RESULT=SUCCESS  
APPID=application_1377559361179_0018
CONTAINERID=container_1377559361179_0018_01_001337
2013-08-27 15:56:36,598 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
 Adding container_1377559361179_0018_01_001337 to application 
application_1377559361179_0018
2013-08-27 15:56:36,598 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: 
Container container_1377559361179_0018_01_001337 transitioned from NEW to 
LOCALIZED
2013-08-27 15:56:36,613 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl:
 Stopping container with container Id: container_1377559361179_0018_01_001337
2013-08-27 15:56:36,616 INFO 
org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=llama
IP=10.20.191.233    OPERATION=Stop Container Request
TARGET=ContainerManageImpl  RESULT=SUCCESS  
APPID=application_1377559361179_0018
CONTAINERID=container_1377559361179_0018_01_001337
2013-08-27 15:56:36,616 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: 
Container container_1377559361179_0018_01_001337 transitioned from LOCALIZED to 
KILLING
2013-08-27 15:56:36,616 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch:
 Cleaning up container container_1377559361179_0018_01_001337
2013-08-27 15:56:36,616 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch:
 Container container_1377559361179_0018_01_001337 not launched. No cleanup 
needed to be done
2013-08-27 15:56:36,617 INFO 
org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending out 
status for container: container_id {, app_attempt_id {, application_id {, id: 
18, cluster_timestamp: 1377559361179, }, attemptId: 1, }, id: 402, }, state: 
C_RUNNING, diagnostics: "", exit_status: -1000, 
{code}

This is the last time the container is mentioned in the logs.  We never get a 
{code}
2013-08-27 15:56:38,832 INFO 
org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Removed 
completed container 
{code}
like we do for other completed containers.



[jira] [Created] (YARN-1109) Consider throttling or demoting NodeManager "Sending out status for container" logs

2013-08-27 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-1109:


 Summary: Consider throttling or demoting NodeManager "Sending out 
status for container" logs
 Key: YARN-1109
 URL: https://issues.apache.org/jira/browse/YARN-1109
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza


Diagnosing NodeManager and container launch problems is made more difficult by 
the enormous number of logs like
{code}
Sending out status for container: container_id {, app_attempt_id {, 
application_id {, id: 18, cluster_timestamp: 1377559361179, }, attemptId: 1, }, 
id: 1337, }, state: C_RUNNING, diagnostics: "Container killed by the 
ApplicationMaster.\n", exit_status: -1000
{code}

On an NM with a few containers I am seeing tens of these per second.



[jira] [Updated] (YARN-832) Update Resource javadoc to clarify units for memory

2013-08-27 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-832:


Target Version/s: 2.3.0

> Update Resource javadoc to clarify units for memory
> ---
>
> Key: YARN-832
> URL: https://issues.apache.org/jira/browse/YARN-832
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bikas Saha
>  Labels: newbie
>
> These values are supposed to be megabytes (need to check MB vs MiB, i.e. 1000 vs 
> 1024)
>   /**
>* Get memory of the resource.
>* @return memory of the resource
>*/
>   @Public
>   @Stable
>   public abstract int getMemory();
>   
>   /**
>* Set memory of the resource.
>* @param memory memory of the resource
>*/
>   @Public
>   @Stable
>   public abstract void setMemory(int memory);
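The 1000-vs-1024 ambiguity the issue asks the javadoc to settle can be shown with a short sketch. The class and constants below are illustrative only, not YARN code:

```java
// Illustrative sketch of the MB (decimal, SI) vs MiB (binary, IEC)
// ambiguity behind Resource.getMemory()/setMemory().
public class MemoryUnits {
    static final long MB  = 1000L * 1000L; // megabyte:  10^6 bytes
    static final long MiB = 1024L * 1024L; // mebibyte: 2^20 bytes

    public static void main(String[] args) {
        long containerMemory = 2048; // the int value a caller passes to setMemory
        // The two interpretations differ by almost 100 MB at this size:
        System.out.println(containerMemory * MB);  // 2048000000 bytes if MB
        System.out.println(containerMemory * MiB); // 2147483648 bytes if MiB
    }
}
```

(In practice the YARN value is treated as binary megabytes, which is exactly why the javadoc should say so.)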



[jira] [Updated] (YARN-832) Update Resource javadoc to clarify units for memory

2013-08-27 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-832:


Labels: newbie  (was: )

> Update Resource javadoc to clarify units for memory
> ---
>
> Key: YARN-832
> URL: https://issues.apache.org/jira/browse/YARN-832
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bikas Saha
>  Labels: newbie
> Fix For: 2.3.0
>
>
> These values are supposed to be megabytes (need to check MB vs MiB, i.e. 1000 vs 
> 1024)
>   /**
>* Get memory of the resource.
>* @return memory of the resource
>*/
>   @Public
>   @Stable
>   public abstract int getMemory();
>   
>   /**
>* Set memory of the resource.
>* @param memory memory of the resource
>*/
>   @Public
>   @Stable
>   public abstract void setMemory(int memory);



[jira] [Updated] (YARN-832) Update Resource javadoc to clarify units for memory

2013-08-27 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-832:


Fix Version/s: (was: 2.3.0)

> Update Resource javadoc to clarify units for memory
> ---
>
> Key: YARN-832
> URL: https://issues.apache.org/jira/browse/YARN-832
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bikas Saha
>  Labels: newbie
>
> These values are supposed to be megabytes (need to check MB vs MiB, i.e. 1000 vs 
> 1024)
>   /**
>* Get memory of the resource.
>* @return memory of the resource
>*/
>   @Public
>   @Stable
>   public abstract int getMemory();
>   
>   /**
>* Set memory of the resource.
>* @param memory memory of the resource
>*/
>   @Public
>   @Stable
>   public abstract void setMemory(int memory);



[jira] [Commented] (YARN-723) Yarn default value of physical cpu cores to virtual core is 2

2013-08-27 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13751775#comment-13751775
 ] 

Sandy Ryza commented on YARN-723:
-

Resolving as invalid now that YARN-782 removed the vcores-pcores-ratio

> Yarn default value of physical cpu cores to virtual core is 2
> -
>
> Key: YARN-723
> URL: https://issues.apache.org/jira/browse/YARN-723
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.0.4-alpha
>Reporter: Bikas Saha
> Fix For: 2.3.0
>
>
> The default virtual core allocation in the RM is 1. That means every 
> container will get 1 virtual core == 1/2 a physical core. Not sure if this 
> breaks implicit MR assumptions of maps/reduces getting at least 1 physical 
> cpu.



[jira] [Resolved] (YARN-723) Yarn default value of physical cpu cores to virtual core is 2

2013-08-27 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza resolved YARN-723.
-

Resolution: Invalid

> Yarn default value of physical cpu cores to virtual core is 2
> -
>
> Key: YARN-723
> URL: https://issues.apache.org/jira/browse/YARN-723
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.0.4-alpha
>Reporter: Bikas Saha
> Fix For: 2.3.0
>
>
> The default virtual core allocation in the RM is 1. That means every 
> container will get 1 virtual core == 1/2 a physical core. Not sure if this 
> breaks implicit MR assumptions of maps/reduces getting at least 1 physical 
> cpu.



[jira] [Created] (YARN-1100) Giving multiple commands to ContainerLaunchContext doesn't work as expected

2013-08-26 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-1100:


 Summary: Giving multiple commands to ContainerLaunchContext 
doesn't work as expected
 Key: YARN-1100
 URL: https://issues.apache.org/jira/browse/YARN-1100
 Project: Hadoop YARN
  Issue Type: Bug
  Components: api, nodemanager
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza


A ContainerLaunchContext accepts a list of commands (as strings) to be executed 
to launch the container.  I would expect that giving a list with the two 
commands "echo yolo" and "date" would print something like
{code}
yolo
Mon Aug 26 14:40:23 PDT 2013
{code}

Instead it prints
{code}
yolo date
{code}

This is because the commands get executed with:
{code}
exec /bin/bash -c "echo yolo date"
{code}

To get the expected behavior I have to include semicolons at the end of each 
command. At the very least, this should be documented, but I think better would 
be for the NM to insert the semicolons.
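The fix suggested above (the NM inserting separators itself) can be sketched outside of YARN. The helper below is hypothetical, not the NodeManager's actual code; it joins with `&&` rather than `;` so that a failing command also stops the chain:

```java
import java.util.Arrays;
import java.util.List;

// Sketch: joining a ContainerLaunchContext-style command list so each
// command runs as a separate shell statement under "bash -c".
public class CommandJoin {
    // ";" would match the workaround in the report (run unconditionally);
    // "&&" additionally short-circuits on the first failure.
    static String join(List<String> commands) {
        return String.join(" && ", commands);
    }

    public static void main(String[] args) {
        List<String> cmds = Arrays.asList("echo yolo", "date");
        // prints: exec /bin/bash -c "echo yolo && date"
        System.out.println("exec /bin/bash -c \"" + join(cmds) + "\"");
    }
}
```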



[jira] [Updated] (YARN-649) Make container logs available over HTTP in plain text

2013-08-26 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-649:


Attachment: YARN-649-7.patch

> Make container logs available over HTTP in plain text
> -
>
> Key: YARN-649
> URL: https://issues.apache.org/jira/browse/YARN-649
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.0.4-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-649-2.patch, YARN-649-3.patch, YARN-649-4.patch, 
> YARN-649-5.patch, YARN-649-6.patch, YARN-649-7.patch, YARN-649.patch, 
> YARN-752-1.patch
>
>
> It would be good to make container logs available over the REST API for 
> MAPREDUCE-4362 and so that they can be accessed programmatically in general.



[jira] [Updated] (YARN-942) In Fair Scheduler documentation, inconsistency on which properties have prefix

2013-08-26 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-942:


Assignee: Akira AJISAKA

> In Fair Scheduler documentation, inconsistency on which properties have prefix
> --
>
> Key: YARN-942
> URL: https://issues.apache.org/jira/browse/YARN-942
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Akira AJISAKA
>  Labels: documentation, newbie
> Attachments: YARN-942.patch
>
>
> locality.threshold.node and locality.threshold.rack should have the 
> yarn.scheduler.fair prefix like the items before them
> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html



[jira] [Commented] (YARN-1093) Corrections to Fair Scheduler documentation

2013-08-26 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750489#comment-13750489
 ] 

Sandy Ryza commented on YARN-1093:
--

I just committed this to trunk, branch-2, and branch-2.1-beta

> Corrections to Fair Scheduler documentation
> ---
>
> Key: YARN-1093
> URL: https://issues.apache.org/jira/browse/YARN-1093
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.0.4-alpha
>Reporter: Wing Yew Poon
> Fix For: 2.1.1-beta
>
> Attachments: YARN-1093.patch
>
>
> The fair scheduler is still evolving, but the current documentation contains 
> some inaccuracies.



[jira] [Updated] (YARN-942) In Fair Scheduler documentation, inconsistency on which properties have prefix

2013-08-26 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-942:


Labels: documentation newbie  (was: docuentation newbie)

> In Fair Scheduler documentation, inconsistency on which properties have prefix
> --
>
> Key: YARN-942
> URL: https://issues.apache.org/jira/browse/YARN-942
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>  Labels: documentation, newbie
> Attachments: YARN-942.patch
>
>
> locality.threshold.node and locality.threshold.rack should have the 
> yarn.scheduler.fair prefix like the items before them
> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html



[jira] [Updated] (YARN-1093) Corrections to Fair Scheduler documentation

2013-08-26 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1093:
-

Hadoop Flags: Reviewed

> Corrections to Fair Scheduler documentation
> ---
>
> Key: YARN-1093
> URL: https://issues.apache.org/jira/browse/YARN-1093
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.0.4-alpha
>Reporter: Wing Yew Poon
> Fix For: 2.1.1-beta
>
> Attachments: YARN-1093.patch
>
>
> The fair scheduler is still evolving, but the current documentation contains 
> some inaccuracies.



[jira] [Updated] (YARN-1093) Corrections to fair scheduler documentation

2013-08-26 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1093:
-

Summary: Corrections to fair scheduler documentation  (was: corrections to 
fair scheduler documentation)

> Corrections to fair scheduler documentation
> ---
>
> Key: YARN-1093
> URL: https://issues.apache.org/jira/browse/YARN-1093
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.0.4-alpha
>Reporter: Wing Yew Poon
> Attachments: YARN-1093.patch
>
>
> The fair scheduler is still evolving, but the current documentation contains 
> some inaccuracies.



[jira] [Updated] (YARN-1093) Corrections to Fair Scheduler documentation

2013-08-26 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1093:
-

Summary: Corrections to Fair Scheduler documentation  (was: Corrections to 
fair scheduler documentation)

> Corrections to Fair Scheduler documentation
> ---
>
> Key: YARN-1093
> URL: https://issues.apache.org/jira/browse/YARN-1093
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.0.4-alpha
>Reporter: Wing Yew Poon
> Attachments: YARN-1093.patch
>
>
> The fair scheduler is still evolving, but the current documentation contains 
> some inaccuracies.



[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-26 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750286#comment-13750286
 ] 

Sandy Ryza commented on YARN-1024:
--

bq. It seems to me that the only time you'd want a YCU value that's not -1 is 
when you're running a thread that uses less than 100% of the CPU. Is that a 
correct statement?
That's correct.  This is common for data-intensive tasks that can be more 
I/O-bound than CPU-bound.

bq. As an end user, how do I know what YCU value is reasonable for my job?
I think selecting the right value is an inherently difficult task. I think we 
would expect different users with different amounts of technical proficiency to 
do it in different ways.  Something like:
* Simple: Use the default value on the cluster.
* Intermediate: Notice your tasks are running too slow and increase YCUs.  Or 
notice your tasks aren't getting scheduled enough and decrease them.
* Advanced: Measure your tasks' actual CPU usage with a tool like top and set 
YCUs accordingly.

> Define a virtual core unambigiously
> ---
>
> Key: YARN-1024
> URL: https://issues.apache.org/jira/browse/YARN-1024
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Attachments: CPUasaYARNresource.pdf
>
>
> We need to clearly define the meaning of a virtual core unambiguously so that 
> it's easy to migrate applications between clusters.
> For e.g. here is Amazon EC2 definition of ECU: 
> http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it
> Essentially we need to clearly define a YARN Virtual Core (YVC).
> Equivalently, we can use ECU itself: *One EC2 Compute Unit provides the 
> equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.*



[jira] [Commented] (YARN-905) Add state filters to nodes CLI

2013-08-23 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13749142#comment-13749142
 ] 

Sandy Ryza commented on YARN-905:
-

I just committed this before Vinod's comment.  I don't think the current 
version is harmful in such a way that it needs to be reverted.  I would prefer 
to make these extra changes in a separate JIRA, but would also be happy to 
review/commit an addendum here.

> Add state filters to nodes CLI
> --
>
> Key: YARN-905
> URL: https://issues.apache.org/jira/browse/YARN-905
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.0.4-alpha
>Reporter: Sandy Ryza
>Assignee: Wei Yan
> Attachments: Yarn-905.patch, YARN-905.patch, YARN-905.patch
>
>
> It would be helpful for the nodes CLI to have a node-states option that 
> allows it to return nodes that are not just in the RUNNING state.



[jira] [Commented] (YARN-942) In Fair Scheduler documentation, inconsistency on which properties have prefix

2013-08-23 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748988#comment-13748988
 ] 

Sandy Ryza commented on YARN-942:
-

Thanks [~ajisakaa].  +1 pending Jenkins.

> In Fair Scheduler documentation, inconsistency on which properties have prefix
> --
>
> Key: YARN-942
> URL: https://issues.apache.org/jira/browse/YARN-942
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>  Labels: docuentation, newbie
> Attachments: YARN-942.patch
>
>
> locality.threshold.node and locality.threshold.rack should have the 
> yarn.scheduler.fair prefix like the items before them
> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html



[jira] [Commented] (YARN-1093) corrections to fair scheduler documentation

2013-08-23 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748879#comment-13748879
 ] 

Sandy Ryza commented on YARN-1093:
--

Thanks Wing Yew! +1 pending Jenkins.

> corrections to fair scheduler documentation
> ---
>
> Key: YARN-1093
> URL: https://issues.apache.org/jira/browse/YARN-1093
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.0.4-alpha
>Reporter: Wing Yew Poon
> Attachments: YARN-1093.patch
>
>
> The fair scheduler is still evolving, but the current documentation contains 
> some inaccuracies.



[jira] [Updated] (YARN-1093) corrections to fair scheduler documentation

2013-08-23 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1093:
-

Fix Version/s: (was: 2.1.0-beta)

> corrections to fair scheduler documentation
> ---
>
> Key: YARN-1093
> URL: https://issues.apache.org/jira/browse/YARN-1093
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.0.4-alpha
>Reporter: Wing Yew Poon
> Attachments: YARN-1093.patch
>
>
> The fair scheduler is still evolving, but the current documentation contains 
> some inaccuracies.



[jira] [Updated] (YARN-1024) Define a virtual core unambigiously

2013-08-22 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1024:
-

Attachment: CPUasaYARNresource.pdf

> Define a virtual core unambigiously
> ---
>
> Key: YARN-1024
> URL: https://issues.apache.org/jira/browse/YARN-1024
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Attachments: CPUasaYARNresource.pdf
>
>
> We need to clearly define the meaning of a virtual core unambiguously so that 
> it's easy to migrate applications between clusters.
> For e.g. here is Amazon EC2 definition of ECU: 
> http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it
> Essentially we need to clearly define a YARN Virtual Core (YVC).
> Equivalently, we can use ECU itself: *One EC2 Compute Unit provides the 
> equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.*



[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-22 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748062#comment-13748062
 ] 

Sandy Ryza commented on YARN-1024:
--

I wrote up a more detailed proposal and attached a PDF of it.

> Define a virtual core unambigiously
> ---
>
> Key: YARN-1024
> URL: https://issues.apache.org/jira/browse/YARN-1024
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Attachments: CPUasaYARNresource.pdf
>
>
> We need to clearly define the meaning of a virtual core unambiguously so that 
> it's easy to migrate applications between clusters.
> For e.g. here is Amazon EC2 definition of ECU: 
> http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it
> Essentially we need to clearly define a YARN Virtual Core (YVC).
> Equivalently, we can use ECU itself: *One EC2 Compute Unit provides the 
> equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.*



[jira] [Commented] (YARN-1089) Add YARN compute units alongside virtual cores

2013-08-22 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747881#comment-13747881
 ] 

Sandy Ryza commented on YARN-1089:
--

Yeah, I'll write up a document and post it on YARN-1024.  I'm hoping to keep 
the broader discussion there so we can use this (and perhaps additional JIRAs) 
for the actual implementation.

> Add YARN compute units alongside virtual cores
> --
>
> Key: YARN-1089
> URL: https://issues.apache.org/jira/browse/YARN-1089
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: api
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
>
> Based on discussion in YARN-1024, we will add YARN compute units as a 
> resource for requesting and scheduling CPU processing power.



[jira] [Updated] (YARN-1089) Add YARN compute units alongside virtual cores

2013-08-22 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1089:
-

Description: Based on discussion in YARN-1024, we will add YARN compute 
units as a resource for requesting and scheduling CPU processing power.

> Add YARN compute units alongside virtual cores
> --
>
> Key: YARN-1089
> URL: https://issues.apache.org/jira/browse/YARN-1089
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: api
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
>
> Based on discussion in YARN-1024, we will add YARN compute units as a 
> resource for requesting and scheduling CPU processing power.



[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-22 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747701#comment-13747701
 ] 

Sandy Ryza commented on YARN-1024:
--

Filed YARN-1089 for adding YCUs.

> Define a virtual core unambigiously
> ---
>
> Key: YARN-1024
> URL: https://issues.apache.org/jira/browse/YARN-1024
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
>
> We need to clearly define the meaning of a virtual core unambiguously so that 
> it's easy to migrate applications between clusters.
> For e.g. here is Amazon EC2 definition of ECU: 
> http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it
> Essentially we need to clearly define a YARN Virtual Core (YVC).
> Equivalently, we can use ECU itself: *One EC2 Compute Unit provides the 
> equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.*



[jira] [Created] (YARN-1089) Add YARN compute units alongside virtual cores

2013-08-21 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-1089:


 Summary: Add YARN compute units alongside virtual cores
 Key: YARN-1089
 URL: https://issues.apache.org/jira/browse/YARN-1089
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Sandy Ryza






[jira] [Resolved] (YARN-972) Allow requests and scheduling for fractional virtual cores

2013-08-21 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza resolved YARN-972.
-

Resolution: Won't Fix

> Allow requests and scheduling for fractional virtual cores
> --
>
> Key: YARN-972
> URL: https://issues.apache.org/jira/browse/YARN-972
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: api, scheduler
>Affects Versions: 2.0.5-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
>
> As this idea sparked a fair amount of discussion on YARN-2, I'd like to go 
> deeper into the reasoning.
> Currently the virtual core abstraction hides two orthogonal goals.  The first 
> is that a cluster might have heterogeneous hardware and that the processing 
> power of different makes of cores can vary wildly.  The second is that a 
> different (combinations of) workloads can require different levels of 
> granularity.  E.g. one admin might want every task on their cluster to use at 
> least a core, while another might want applications to be able to request 
> quarters of cores.  The former would configure a single vcore per core.  The 
> latter would configure four vcores per core.
> I don't think that the abstraction is a good way of handling the second goal. 
>  Having virtual cores refer to different magnitudes of processing power on 
> different clusters will make the difficult problem of deciding how many cores 
> to request for a job even more confusing.
> Can we not handle this with dynamic oversubscription?
> Dynamic oversubscription, i.e. adjusting the number of cores offered by a 
> machine based on measured CPU-consumption, should work as a complement to 
> fine-granularity scheduling.  Dynamic oversubscription is never going to be 
> perfect, as the amount of CPU a process consumes can vary widely over its 
> lifetime.  A task that first loads a bunch of data over the network and then 
> performs complex computations on it will suffer if additional CPU-heavy tasks 
> are scheduled on the same node because its initial CPU-utilization was low.  
> To guard against this, we will need to be conservative with how we 
> dynamically oversubscribe.  If a user wants to explicitly hint to the 
> scheduler that their task will not use much CPU, the scheduler should be able 
> to take this into account.
> On YARN-2, there are concerns that including floating point arithmetic in the 
> scheduler will slow it down.  I question this assumption, and it is perhaps 
> worth debating, but I think we can sidestep the issue by multiplying 
> CPU-quantities inside the scheduler by a decently sized number like 1000 and 
> keeping the computations on integers.
> The relevant APIs are marked as evolving, so there's no need for the change 
> to delay 2.1.0-beta.
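
The integer-scaling idea above can be sketched as follows. This is an
illustrative sketch only, not actual YARN scheduler code: the class name,
method names, and the choice of 1000 as the multiplier are assumptions.

```java
// Sketch of the integer-scaling idea: represent CPU quantities as
// "milli-vcores" so the scheduler compares longs instead of doing
// floating-point arithmetic. Names here are hypothetical, not YARN's.
public class MilliVcores {
    static final long SCALE = 1000; // 1 vcore == 1000 milli-vcores

    // Convert a possibly fractional vcore request into an integer quantity.
    static long toMilliVcores(double vcores) {
        return Math.round(vcores * SCALE);
    }

    // Scheduler-side check: does the node have room for the request?
    static boolean fits(long requestedMilliVcores, long availableMilliVcores) {
        return requestedMilliVcores <= availableMilliVcores;
    }

    public static void main(String[] args) {
        long quarterCore = toMilliVcores(0.25); // 250
        long available = toMilliVcores(1.0);    // 1000
        System.out.println(fits(quarterCore, available));        // true
        System.out.println(fits(toMilliVcores(1.5), available)); // false
    }
}
```

A request for a quarter of a core becomes the integer 250, and all
comparisons and subtractions in the scheduler stay in integer arithmetic.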

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-972) Allow requests and scheduling for fractional virtual cores

2013-08-21 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747142#comment-13747142
 ] 

Sandy Ryza commented on YARN-972:
-

Based on the approach agreed upon in YARN-1024, which allows separate values 
to be set for processing power and parallelism, closing this as Won't Fix.

> Allow requests and scheduling for fractional virtual cores
> --
>
> Key: YARN-972
> URL: https://issues.apache.org/jira/browse/YARN-972
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: api, scheduler
>Affects Versions: 2.0.5-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
>
> As this idea sparked a fair amount of discussion on YARN-2, I'd like to go 
> deeper into the reasoning.
> Currently the virtual core abstraction hides two orthogonal goals.  The first 
> is that a cluster might have heterogeneous hardware and that the processing 
> power of different makes of cores can vary wildly.  The second is that
> different (combinations of) workloads can require different levels of 
> granularity.  E.g. one admin might want every task on their cluster to use at 
> least a core, while another might want applications to be able to request 
> quarters of cores.  The former would configure a single vcore per core.  The 
> latter would configure four vcores per core.
> I don't think that the abstraction is a good way of handling the second goal. 
>  Having virtual cores refer to different magnitudes of processing power on
> different clusters will make the difficult problem of deciding how many cores 
> to request for a job even more confusing.
> Can we not handle this with dynamic oversubscription?
> Dynamic oversubscription, i.e. adjusting the number of cores offered by a 
> machine based on measured CPU-consumption, should work as a complement to 
> fine-granularity scheduling.  Dynamic oversubscription is never going to be 
> perfect, as the amount of CPU a process consumes can vary widely over its 
> lifetime.  A task that first loads a bunch of data over the network and then 
> performs complex computations on it will suffer if additional CPU-heavy tasks 
> are scheduled on the same node because its initial CPU-utilization was low.  
> To guard against this, we will need to be conservative with how we 
> dynamically oversubscribe.  If a user wants to explicitly hint to the 
> scheduler that their task will not use much CPU, the scheduler should be able 
> to take this into account.
> On YARN-2, there are concerns that including floating point arithmetic in the 
> scheduler will slow it down.  I question this assumption, and it is perhaps 
> worth debating, but I think we can sidestep the issue by multiplying 
> CPU-quantities inside the scheduler by a decently sized number like 1000 and 
> keeping the computations on integers.
> The relevant APIs are marked as evolving, so there's no need for the change 
> to delay 2.1.0-beta.



[jira] [Updated] (YARN-649) Make container logs available over HTTP in plain text

2013-08-21 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-649:


Attachment: YARN-649-6.patch

> Make container logs available over HTTP in plain text
> -
>
> Key: YARN-649
> URL: https://issues.apache.org/jira/browse/YARN-649
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.0.4-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-649-2.patch, YARN-649-3.patch, YARN-649-4.patch, 
> YARN-649-5.patch, YARN-649-6.patch, YARN-649.patch, YARN-752-1.patch
>
>
> It would be good to make container logs available over the REST API for 
> MAPREDUCE-4362 and so that they can be accessed programmatically in general.



[jira] [Commented] (YARN-649) Make container logs available over HTTP in plain text

2013-08-21 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13746880#comment-13746880
 ] 

Sandy Ryza commented on YARN-649:
-

Uploading a new patch that
* Puts IOUtils.skipFully(logByteStream, start) back in.  My mistake.
* Changes the annotation to Unstable and includes documentation on how long 
logs will be available.
* Removes the mortbay log and throws a YarnException on the URISyntaxException

I manually verified that the buffering works by creating a log file larger than 
the NodeManager memory, retrieving it with the API, and observing that the 
NodeManager did not fall over.
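
The skip-then-stream pattern described above can be sketched as follows.
This is not the YARN-649 patch itself: the class name, the 4 KB buffer
size, and the hand-rolled skipFully loop (standing in for Hadoop's
IOUtils.skipFully) are illustrative assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Sketch: jump to a start offset in a log stream (the role skipFully
// plays in the patch), then copy the remainder through a small fixed
// buffer so a log larger than NodeManager memory is never fully buffered.
public class LogStreamer {
    static void skipFully(InputStream in, long start) throws IOException {
        while (start > 0) {
            long skipped = in.skip(start);
            if (skipped <= 0) {
                // skip() may legally return 0; fall back to reading a byte.
                if (in.read() == -1) {
                    throw new EOFException("log shorter than requested offset");
                }
                skipped = 1;
            }
            start -= skipped;
        }
    }

    static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[4096]; // bounded buffer: O(1) memory per request
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream("hello world".getBytes());
        skipFully(in, 6);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copy(in, out);
        System.out.println(out); // prints "world"
    }
}
```

Because the copy loop reuses one fixed buffer, memory use is independent of
log size, which matches the manual verification described in the comment.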

> Make container logs available over HTTP in plain text
> -
>
> Key: YARN-649
> URL: https://issues.apache.org/jira/browse/YARN-649
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.0.4-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-649-2.patch, YARN-649-3.patch, YARN-649-4.patch, 
> YARN-649-5.patch, YARN-649.patch, YARN-752-1.patch
>
>
> It would be good to make container logs available over the REST API for 
> MAPREDUCE-4362 and so that they can be accessed programmatically in general.


