[jira] [Updated] (YARN-1024) Define a CPU resource(s) unambiguously
[ https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandy Ryza updated YARN-1024:
-----------------------------
    Summary: Define a CPU resource(s) unambiguously  (was: Define a virtual core unambiguously)

> Define a CPU resource(s) unambiguously
> --------------------------------------
>
>          Key: YARN-1024
>          URL: https://issues.apache.org/jira/browse/YARN-1024
>      Project: Hadoop YARN
>   Issue Type: Improvement
>     Reporter: Arun C Murthy
>     Assignee: Arun C Murthy
>  Attachments: CPUasaYARNresource.pdf
>
>
> We need to clearly define the meaning of a virtual core unambiguously so that
> it's easy to migrate applications between clusters.
> For example, here is Amazon EC2's definition of an ECU:
> http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it
> Essentially we need to clearly define a YARN Virtual Core (YVC).
> Equivalently, we can use the ECU itself: *One EC2 Compute Unit provides the
> equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.*

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Updated] (YARN-976) Document the meaning of a virtual core
[ https://issues.apache.org/jira/browse/YARN-976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandy Ryza updated YARN-976:
----------------------------
    Issue Type: Sub-task  (was: Task)
        Parent: YARN-1024

> Document the meaning of a virtual core
> --------------------------------------
>
>          Key: YARN-976
>          URL: https://issues.apache.org/jira/browse/YARN-976
>      Project: Hadoop YARN
>   Issue Type: Sub-task
>   Components: documentation
> Affects Versions: 2.1.0-beta
>     Reporter: Sandy Ryza
>     Assignee: Sandy Ryza
>  Attachments: YARN-976.patch
>
>
> As virtual cores are a somewhat novel concept, it would be helpful to have
> thorough documentation that clarifies their meaning.
[jira] [Updated] (YARN-1089) Add YARN compute units alongside virtual cores
[ https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandy Ryza updated YARN-1089:
-----------------------------
    Issue Type: Sub-task  (was: Improvement)
        Parent: YARN-1024

> Add YARN compute units alongside virtual cores
> ----------------------------------------------
>
>          Key: YARN-1089
>          URL: https://issues.apache.org/jira/browse/YARN-1089
>      Project: Hadoop YARN
>   Issue Type: Sub-task
>   Components: api
> Affects Versions: 2.1.0-beta
>     Reporter: Sandy Ryza
>     Assignee: Sandy Ryza
>  Attachments: YARN-1089-1.patch, YARN-1089.patch
>
>
> Based on discussion in YARN-1024, we will add YARN compute units as a
> resource for requesting and scheduling CPU processing power.
[jira] [Updated] (YARN-976) Document the meaning of a virtual core
[ https://issues.apache.org/jira/browse/YARN-976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandy Ryza updated YARN-976:
----------------------------
    Attachment: YARN-976.patch

> Document the meaning of a virtual core
> --------------------------------------
>
>          Key: YARN-976
>          URL: https://issues.apache.org/jira/browse/YARN-976
>      Project: Hadoop YARN
>   Issue Type: Task
>   Components: documentation
> Affects Versions: 2.1.0-beta
>     Reporter: Sandy Ryza
>     Assignee: Sandy Ryza
>  Attachments: YARN-976.patch
>
>
> As virtual cores are a somewhat novel concept, it would be helpful to have
> thorough documentation that clarifies their meaning.
[jira] [Commented] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues
[ https://issues.apache.org/jira/browse/YARN-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782566#comment-13782566 ]

Sandy Ryza commented on YARN-1241:
----------------------------------

Uploaded patch to fix the new findbugs warnings.

> In Fair Scheduler maxRunningApps does not work for non-leaf queues
> ------------------------------------------------------------------
>
>          Key: YARN-1241
>          URL: https://issues.apache.org/jira/browse/YARN-1241
>      Project: Hadoop YARN
>   Issue Type: Bug
> Affects Versions: 2.1.0-beta
>     Reporter: Sandy Ryza
>     Assignee: Sandy Ryza
>  Attachments: YARN-1241-1.patch, YARN-1241-2.patch, YARN-1241-3.patch, YARN-1241.patch
>
>
> Setting the maxRunningApps property on a parent queue should make it so that
> the sum of apps in all subqueues can't exceed it.
[jira] [Updated] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues
[ https://issues.apache.org/jira/browse/YARN-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandy Ryza updated YARN-1241:
-----------------------------
    Attachment: YARN-1241-3.patch

> In Fair Scheduler maxRunningApps does not work for non-leaf queues
> ------------------------------------------------------------------
>
>          Key: YARN-1241
>          URL: https://issues.apache.org/jira/browse/YARN-1241
>      Project: Hadoop YARN
>   Issue Type: Bug
> Affects Versions: 2.1.0-beta
>     Reporter: Sandy Ryza
>     Assignee: Sandy Ryza
>  Attachments: YARN-1241-1.patch, YARN-1241-2.patch, YARN-1241-3.patch, YARN-1241.patch
>
>
> Setting the maxRunningApps property on a parent queue should make it so that
> the sum of apps in all subqueues can't exceed it.
[jira] [Created] (YARN-1259) In Fair Scheduler web UI, queue num pending and num active apps switched
Sandy Ryza created YARN-1259:
-----------------------------

      Summary: In Fair Scheduler web UI, queue num pending and num active apps switched
          Key: YARN-1259
          URL: https://issues.apache.org/jira/browse/YARN-1259
      Project: Hadoop YARN
   Issue Type: Bug
   Components: scheduler
 Affects Versions: 2.1.1-beta
     Reporter: Sandy Ryza

The values returned in FairSchedulerLeafQueueInfo by numPendingApplications and numActiveApplications should be switched.
[jira] [Commented] (YARN-1010) FairScheduler: decouple container scheduling from nodemanager heartbeats
[ https://issues.apache.org/jira/browse/YARN-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782473#comment-13782473 ]

Sandy Ryza commented on YARN-1010:
----------------------------------

This looks almost there to me. A few nits:

{code}
+LOG.warn("Error while doing sleep in continuous scheduling: " +
+e.toString(), e);
{code}
There should be indentation on the second line here.

{code}
+  private void continuousScheduling() {
{code}
Better to have method names be verbs. Maybe "scheduleContinuously".

Most of the Fair Scheduler properties use dashes instead of dots, and I think this is a good convention. We should change yarn.scheduler.fair.locality.threshold.node.time.ms to yarn.scheduler.fair.locality-delay-node-ms (and the same for rack). We should also change yarn.scheduler.fair.continuous.scheduling.enabled to yarn.scheduler.fair.continuous-scheduling-enabled, and yarn.scheduler.fair.continuous.scheduling.sleep.time.ms to yarn.scheduler.fair.continuous-scheduling-sleep-ms.

Adding multi-second sleeps in the unit tests will slow down build times and is still theoretically open to races if the OS pauses. Better would be to use the clock interface. In the test you can use a MockClock like in TestFairScheduler#testChoiceOfPreemptedContainers, and you can change the start time in AppSchedulable to come from scheduler.getClock().getTime().

> FairScheduler: decouple container scheduling from nodemanager heartbeats
> ------------------------------------------------------------------------
>
>          Key: YARN-1010
>          URL: https://issues.apache.org/jira/browse/YARN-1010
>      Project: Hadoop YARN
>   Issue Type: Improvement
>   Components: scheduler
> Affects Versions: 2.1.0-beta
>     Reporter: Alejandro Abdelnur
>     Assignee: Wei Yan
>     Priority: Critical
>  Attachments: YARN-1010.patch
>
>
> Currently, scheduling for a node is done when the node heartbeats.
> For large clusters where the heartbeat interval is set to several seconds,
> this delays scheduling of incoming allocations significantly.
> We could have a continuous loop scanning all nodes and doing scheduling. If
> there is availability, AMs will get the allocation in the next heartbeat
> after the one that placed the request.
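[Editor's note: the clock-injection approach recommended in the comment above can be sketched in isolation. The Clock and MockClock types below are simplified illustrative stand-ins, not Hadoop's actual classes; the point is that a test advances a fake clock instead of calling Thread.sleep.]

```java
public class Main {
  // Simplified stand-in for a scheduler clock interface; illustrative only.
  interface Clock {
    long getTime();
  }

  // A mock clock a test can advance instantly, with no real sleeping.
  static class MockClock implements Clock {
    private long time = 0;
    public long getTime() { return time; }
    public void tick(long ms) { time += ms; }
  }

  // Records a start time from the injected clock, advances the clock by
  // advanceMs, and reports the elapsed time -- deterministic and instant.
  static long elapsedAfterAdvance(long advanceMs) {
    MockClock clock = new MockClock();
    long start = clock.getTime();
    clock.tick(advanceMs);
    return clock.getTime() - start;
  }

  public static void main(String[] args) {
    // A "5 second" wait measured with zero wall-clock delay.
    System.out.println(elapsedAfterAdvance(5000));
  }
}
```

Because the code under test reads time only through the injected clock, the test is immune to OS pauses and adds no seconds to the build.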
[jira] [Created] (YARN-1258) Allow configuring the Fair Scheduler root queue
Sandy Ryza created YARN-1258:
-----------------------------

      Summary: Allow configuring the Fair Scheduler root queue
          Key: YARN-1258
          URL: https://issues.apache.org/jira/browse/YARN-1258
      Project: Hadoop YARN
   Issue Type: Improvement
   Components: scheduler
 Affects Versions: 2.1.1-beta
     Reporter: Sandy Ryza

This would be useful for acls, maxRunningApps, scheduling modes, etc. The allocation file should be able to accept both:
* An implicit root queue
* A root queue at the top of the hierarchy with all queues under/inside of it
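[Editor's note: a sketch of the proposed explicit-root form of the allocation file. The queue name and property values are hypothetical, and the explicit root element is the proposal in this JIRA, not existing behavior; the implicit form would simply list child queues directly under <allocations>.]

```xml
<allocations>
  <queue name="root">
    <!-- Settings here would apply to the root queue itself -->
    <maxRunningApps>50</maxRunningApps>
    <queue name="queueA">
      <maxRunningApps>10</maxRunningApps>
    </queue>
  </queue>
</allocations>
```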
[jira] [Updated] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues
[ https://issues.apache.org/jira/browse/YARN-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandy Ryza updated YARN-1241:
-----------------------------
    Attachment: YARN-1241-2.patch

> In Fair Scheduler maxRunningApps does not work for non-leaf queues
> ------------------------------------------------------------------
>
>          Key: YARN-1241
>          URL: https://issues.apache.org/jira/browse/YARN-1241
>      Project: Hadoop YARN
>   Issue Type: Bug
> Affects Versions: 2.1.0-beta
>     Reporter: Sandy Ryza
>     Assignee: Sandy Ryza
>  Attachments: YARN-1241-1.patch, YARN-1241-2.patch, YARN-1241.patch
>
>
> Setting the maxRunningApps property on a parent queue should make it so that
> the sum of apps in all subqueues can't exceed it.
[jira] [Commented] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues
[ https://issues.apache.org/jira/browse/YARN-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782448#comment-13782448 ]

Sandy Ryza commented on YARN-1241:
----------------------------------

Uploaded patch to fix findbugs warnings.

> In Fair Scheduler maxRunningApps does not work for non-leaf queues
> ------------------------------------------------------------------
>
>          Key: YARN-1241
>          URL: https://issues.apache.org/jira/browse/YARN-1241
>      Project: Hadoop YARN
>   Issue Type: Bug
> Affects Versions: 2.1.0-beta
>     Reporter: Sandy Ryza
>     Assignee: Sandy Ryza
>  Attachments: YARN-1241-1.patch, YARN-1241-2.patch, YARN-1241.patch
>
>
> Setting the maxRunningApps property on a parent queue should make it so that
> the sum of apps in all subqueues can't exceed it.
[jira] [Commented] (YARN-1221) With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
[ https://issues.apache.org/jira/browse/YARN-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782322#comment-13782322 ]

Sandy Ryza commented on YARN-1221:
----------------------------------

I just committed this to trunk, branch-2, and branch-2.1-beta. Thanks Siqi!

> With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
> -----------------------------------------------------------------------------
>
>          Key: YARN-1221
>          URL: https://issues.apache.org/jira/browse/YARN-1221
>      Project: Hadoop YARN
>   Issue Type: Bug
>   Components: resourcemanager, scheduler
> Affects Versions: 2.1.0-beta
>     Reporter: Sandy Ryza
>     Assignee: Siqi Li
>      Fix For: 2.1.2-beta
>  Attachments: YARN1221_v1.patch.txt, YARN1221_v2.patch.txt, YARN1221_v3.patch.txt, YARN1221_v4.patch, YARN1221_v5.patch, YARN1221_v6.patch
>
[jira] [Commented] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues
[ https://issues.apache.org/jira/browse/YARN-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782319#comment-13782319 ]

Sandy Ryza commented on YARN-1241:
----------------------------------

Rebased on trunk.

> In Fair Scheduler maxRunningApps does not work for non-leaf queues
> ------------------------------------------------------------------
>
>          Key: YARN-1241
>          URL: https://issues.apache.org/jira/browse/YARN-1241
>      Project: Hadoop YARN
>   Issue Type: Bug
> Affects Versions: 2.1.0-beta
>     Reporter: Sandy Ryza
>     Assignee: Sandy Ryza
>  Attachments: YARN-1241-1.patch, YARN-1241.patch
>
>
> Setting the maxRunningApps property on a parent queue should make it so that
> the sum of apps in all subqueues can't exceed it.
[jira] [Updated] (YARN-1221) With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
[ https://issues.apache.org/jira/browse/YARN-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandy Ryza updated YARN-1221:
-----------------------------
    Assignee: Siqi Li

> With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
> -----------------------------------------------------------------------------
>
>          Key: YARN-1221
>          URL: https://issues.apache.org/jira/browse/YARN-1221
>      Project: Hadoop YARN
>   Issue Type: Bug
>   Components: resourcemanager, scheduler
> Affects Versions: 2.1.0-beta
>     Reporter: Sandy Ryza
>     Assignee: Siqi Li
>  Attachments: YARN1221_v1.patch.txt, YARN1221_v2.patch.txt, YARN1221_v3.patch.txt, YARN1221_v4.patch, YARN1221_v5.patch, YARN1221_v6.patch
>
[jira] [Updated] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues
[ https://issues.apache.org/jira/browse/YARN-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandy Ryza updated YARN-1241:
-----------------------------
    Attachment: YARN-1241-1.patch

> In Fair Scheduler maxRunningApps does not work for non-leaf queues
> ------------------------------------------------------------------
>
>          Key: YARN-1241
>          URL: https://issues.apache.org/jira/browse/YARN-1241
>      Project: Hadoop YARN
>   Issue Type: Bug
> Affects Versions: 2.1.0-beta
>     Reporter: Sandy Ryza
>     Assignee: Sandy Ryza
>  Attachments: YARN-1241-1.patch, YARN-1241.patch
>
>
> Setting the maxRunningApps property on a parent queue should make it so that
> the sum of apps in all subqueues can't exceed it.
[jira] [Commented] (YARN-1221) With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
[ https://issues.apache.org/jira/browse/YARN-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782283#comment-13782283 ]

Sandy Ryza commented on YARN-1221:
----------------------------------

+1

> With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
> -----------------------------------------------------------------------------
>
>          Key: YARN-1221
>          URL: https://issues.apache.org/jira/browse/YARN-1221
>      Project: Hadoop YARN
>   Issue Type: Bug
>   Components: resourcemanager, scheduler
> Affects Versions: 2.1.0-beta
>     Reporter: Sandy Ryza
>  Attachments: YARN1221_v1.patch.txt, YARN1221_v2.patch.txt, YARN1221_v3.patch.txt, YARN1221_v4.patch, YARN1221_v5.patch, YARN1221_v6.patch
>
[jira] [Updated] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues
[ https://issues.apache.org/jira/browse/YARN-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandy Ryza updated YARN-1241:
-----------------------------
    Attachment: YARN-1241.patch

> In Fair Scheduler maxRunningApps does not work for non-leaf queues
> ------------------------------------------------------------------
>
>          Key: YARN-1241
>          URL: https://issues.apache.org/jira/browse/YARN-1241
>      Project: Hadoop YARN
>   Issue Type: Bug
> Affects Versions: 2.1.0-beta
>     Reporter: Sandy Ryza
>     Assignee: Sandy Ryza
>  Attachments: YARN-1241.patch
>
>
> Setting the maxRunningApps property on a parent queue should make it so that
> the sum of apps in all subqueues can't exceed it.
[jira] [Commented] (YARN-1221) With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
[ https://issues.apache.org/jira/browse/YARN-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779516#comment-13779516 ]

Sandy Ryza commented on YARN-1221:
----------------------------------

Thanks [~l201514]! Good point about the rootQueueMetrics update. I think it's an artifact from when we didn't have hierarchical queues.

Looks like you probably need to rebase on latest trunk - do you mind removing the whitespace change on the line with the if when you do?

> With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
> -----------------------------------------------------------------------------
>
>          Key: YARN-1221
>          URL: https://issues.apache.org/jira/browse/YARN-1221
>      Project: Hadoop YARN
>   Issue Type: Bug
>   Components: resourcemanager, scheduler
> Affects Versions: 2.1.0-beta
>     Reporter: Sandy Ryza
>  Attachments: YARN1221_v1.patch.txt, YARN1221_v2.patch.txt
>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1221) With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
[ https://issues.apache.org/jira/browse/YARN-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779173#comment-13779173 ]

Sandy Ryza commented on YARN-1221:
----------------------------------

bq. It will affect the amount shown in the web UI, since they all have a parent QueueMetrics, which is the root queue metrics.
Ah, you are totally right. Also, I applied your patch and the issue went away for me.

For the ClusterMetricsInfo part, I'm still not convinced on the change, but either way we should do it in a separate JIRA.

Also, are you able to add a test? An easy way to do this might be to find an existing test in TestFairScheduler that, without the patch, has an incorrect value of reserved MB at the end, and add an assert there.

> With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
> -----------------------------------------------------------------------------
>
>          Key: YARN-1221
>          URL: https://issues.apache.org/jira/browse/YARN-1221
>      Project: Hadoop YARN
>   Issue Type: Bug
>   Components: resourcemanager, scheduler
> Affects Versions: 2.1.0-beta
>     Reporter: Sandy Ryza
>  Attachments: YARN1221_v1.patch.txt
>
[jira] [Commented] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues
[ https://issues.apache.org/jira/browse/YARN-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778967#comment-13778967 ]

Sandy Ryza commented on YARN-1241:
----------------------------------

Currently this property works by starting all apps as not runnable, and then marking some as runnable in the update thread. Because this thread runs only every half second, app start can be delayed by up to a half second. The changes required to make the property work for parent queues are a good opportunity to solve this problem as well.

I propose splitting apps in leaf queues into runnable and non-runnable lists, and keeping a count of runnable apps in parent queues. Determining an application's runnability at add time will then only require looking at the size of the runnable list in the leaf queue and the runnable-app count in all of the parent queues.

When an application is removed, we need to check whether any applications can now be made runnable. We find the highest queue in the hierarchy that is a parent of the application's queue and that was previously at its maxRunningApps capacity, then go through the apps in all leaf queues under that queue, in order of start time, to see if any can be made runnable.

> In Fair Scheduler maxRunningApps does not work for non-leaf queues
> ------------------------------------------------------------------
>
>          Key: YARN-1241
>          URL: https://issues.apache.org/jira/browse/YARN-1241
>      Project: Hadoop YARN
>   Issue Type: Bug
> Affects Versions: 2.1.0-beta
>     Reporter: Sandy Ryza
>     Assignee: Sandy Ryza
>
>
> Setting the maxRunningApps property on a parent queue should make it so that
> the sum of apps in all subqueues can't exceed it.
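[Editor's note: the add-time runnability check proposed in the comment above can be sketched as a walk up the queue hierarchy. The Queue type and method names below are hypothetical illustrations of the bookkeeping, not the actual patch.]

```java
public class Main {
  // Minimal stand-in for a Fair Scheduler queue node; illustrative only.
  static class Queue {
    final Queue parent;
    int maxRunningApps = Integer.MAX_VALUE;
    int numRunnableApps = 0;
    Queue(Queue parent) { this.parent = parent; }
  }

  // An app added to a leaf queue is runnable only if every queue on the
  // path to the root is below its maxRunningApps limit.
  static boolean canAppBeRunnable(Queue leaf) {
    for (Queue q = leaf; q != null; q = q.parent) {
      if (q.numRunnableApps >= q.maxRunningApps) {
        return false;
      }
    }
    return true;
  }

  // Marking an app runnable increments the count in the leaf queue and
  // every ancestor, so parent limits are enforced at add time.
  static void markRunnable(Queue leaf) {
    for (Queue q = leaf; q != null; q = q.parent) {
      q.numRunnableApps++;
    }
  }

  // Demo: a parent queue with maxRunningApps=1 and two leaf queues.
  static boolean[] demo() {
    Queue root = new Queue(null);
    Queue parent = new Queue(root);
    parent.maxRunningApps = 1;
    Queue leafA = new Queue(parent);
    Queue leafB = new Queue(parent);

    boolean firstAppRunnable = canAppBeRunnable(leafA);
    markRunnable(leafA);
    // parent is now at its limit, so an app in the sibling leaf must wait
    boolean secondAppRunnable = canAppBeRunnable(leafB);
    return new boolean[] { firstAppRunnable, secondAppRunnable };
  }

  public static void main(String[] args) {
    boolean[] r = demo();
    System.out.println(r[0] + " " + r[1]);
  }
}
```

The remove-time path described in the comment would do the reverse walk: decrement the counts, find the highest previously-saturated ancestor, and re-check apps under it in start-time order.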
[jira] [Commented] (YARN-1221) With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
[ https://issues.apache.org/jira/browse/YARN-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778208#comment-13778208 ]

Sandy Ryza commented on YARN-1221:
----------------------------------

bq. The reason I removed the code above is that there is no corresponding unreserve method got called.
Good catch. But that shouldn't affect the amount shown in the web UI, because the metrics for which there is double counting are the leaf queue metrics, whereas the value in the web UI is based only off of the root queue metrics. Is that not right?

bq. As far as I saw from the webUI, the available memory never get decremented when it allocates memory to mr jobs.
Where are you seeing the available memory reported on the web UI? To be clear, I'm referring to what's shown under the "Cluster Metrics" section when you go to http://:/cluster

> With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
> -----------------------------------------------------------------------------
>
>          Key: YARN-1221
>          URL: https://issues.apache.org/jira/browse/YARN-1221
>      Project: Hadoop YARN
>   Issue Type: Bug
>   Components: resourcemanager, scheduler
> Affects Versions: 2.1.0-beta
>     Reporter: Sandy Ryza
>  Attachments: YARN1221_v1.patch.txt
>
[jira] [Created] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues
Sandy Ryza created YARN-1241:
-----------------------------

      Summary: In Fair Scheduler maxRunningApps does not work for non-leaf queues
          Key: YARN-1241
          URL: https://issues.apache.org/jira/browse/YARN-1241
      Project: Hadoop YARN
   Issue Type: Bug
 Affects Versions: 2.1.0-beta
     Reporter: Sandy Ryza
     Assignee: Sandy Ryza

Setting the maxRunningApps property on a parent queue should make it so that the sum of apps in all subqueues can't exceed it.
[jira] [Commented] (YARN-1221) With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
[ https://issues.apache.org/jira/browse/YARN-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778174#comment-13778174 ]

Sandy Ryza commented on YARN-1221:
----------------------------------

{code}
-this.totalMB = availableMB + reservedMB + allocatedMB;
+this.totalMB = availableMB;
{code}
Total MB should still include allocatedMB. I agree that reservedMB should be removed from it, but I think this is work for a separate JIRA. This one is for dealing with why reservedMB is calculated incorrectly.

{code}
- getMetrics().reserveResource(app.getUser(),
- container.getResource());
{code}
Can you explain the rationale behind removing this?

> With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
> -----------------------------------------------------------------------------
>
>          Key: YARN-1221
>          URL: https://issues.apache.org/jira/browse/YARN-1221
>      Project: Hadoop YARN
>   Issue Type: Bug
>   Components: resourcemanager, scheduler
> Affects Versions: 2.1.0-beta
>     Reporter: Sandy Ryza
>  Attachments: YARN1221_v1.patch.txt
>
[jira] [Updated] (YARN-1228) Clean up Fair Scheduler configuration loading
[ https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandy Ryza updated YARN-1228:
-----------------------------
    Attachment: YARN-1228-2.patch

> Clean up Fair Scheduler configuration loading
> ---------------------------------------------
>
>          Key: YARN-1228
>          URL: https://issues.apache.org/jira/browse/YARN-1228
>      Project: Hadoop YARN
>   Issue Type: Improvement
>   Components: scheduler
> Affects Versions: 2.1.1-beta
>     Reporter: Sandy Ryza
>     Assignee: Sandy Ryza
>  Attachments: YARN-1228-1.patch, YARN-1228-2.patch, YARN-1228.patch
>
>
> Currently the Fair Scheduler is configured in two ways:
> * An allocations file that has a different format than the standard Hadoop
> configuration file, which makes it easier to specify hierarchical objects
> like queues and their properties.
> * Properties like yarn.scheduler.fair.max.assign that are specified in the
> standard Hadoop configuration format.
> The standard and default way of configuring it is to use fair-scheduler.xml
> as the allocations file and to put the yarn.scheduler properties in
> yarn-site.xml.
> It is also possible to specify a different file as the allocations file, and
> to place the yarn.scheduler properties in fair-scheduler.xml, which will be
> interpreted as in the standard Hadoop configuration format. This flexibility
> is both confusing and unnecessary.
> Additionally, the allocation file is loaded as fair-scheduler.xml from the
> classpath if it is not specified, but is loaded as a File if it is. This
> causes two problems:
> 1. We see different behavior when not setting
> yarn.scheduler.fair.allocation.file and when setting it to
> fair-scheduler.xml, which is its default.
> 2. Classloaders may choose to cache resources, which can break the reload
> logic when yarn.scheduler.fair.allocation.file is not specified.
> We should never allow the yarn.scheduler properties to go into
> fair-scheduler.xml, and we should always load the allocations file as a
> file, not as a resource on the classpath. To preserve existing behavior and
> allow loading files from the classpath, we can look for files on the
> classpath, but strip off their scheme and interpret them as Files.
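[Editor's note: the last step described above, finding a file on the classpath but stripping its URL scheme so it can be read and watched as a plain File, might look roughly like this. The method name and error handling are hypothetical, not the actual patch.]

```java
import java.io.File;
import java.net.URL;

public class Main {
  // Resolve the allocation file either as a filesystem path or, failing
  // that, as a classpath resource whose file: URL is reduced to a File path.
  static File resolveAllocationFile(String name) {
    File file = new File(name);
    if (file.exists()) {
      return file;
    }
    URL url = Thread.currentThread().getContextClassLoader().getResource(name);
    if (url == null) {
      return null; // not on the filesystem or the classpath
    }
    if (!"file".equals(url.getProtocol())) {
      // e.g. a jar: URL -- the reload logic can't watch this for changes
      throw new RuntimeException("Allocation file " + name
          + " found on the classpath is not a plain file: " + url);
    }
    // Strip the scheme and interpret the rest as an ordinary File.
    return new File(url.getPath());
  }

  public static void main(String[] args) {
    System.out.println(resolveAllocationFile("fair-scheduler.xml"));
  }
}
```

Reading through a File rather than a classpath resource sidesteps classloader caching, so periodic reload sees edits to the file.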
[jira] [Commented] (YARN-1228) Clean up Fair Scheduler configuration loading
[ https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777661#comment-13777661 ]

Sandy Ryza commented on YARN-1228:
----------------------------------

Updated patch adds license header to test-fair-scheduler.xml.

> Clean up Fair Scheduler configuration loading
> ---------------------------------------------
>
>          Key: YARN-1228
>          URL: https://issues.apache.org/jira/browse/YARN-1228
>      Project: Hadoop YARN
>   Issue Type: Improvement
>   Components: scheduler
> Affects Versions: 2.1.1-beta
>     Reporter: Sandy Ryza
>     Assignee: Sandy Ryza
>  Attachments: YARN-1228-1.patch, YARN-1228-2.patch, YARN-1228.patch
>
>
> Currently the Fair Scheduler is configured in two ways:
> * An allocations file that has a different format than the standard Hadoop
> configuration file, which makes it easier to specify hierarchical objects
> like queues and their properties.
> * Properties like yarn.scheduler.fair.max.assign that are specified in the
> standard Hadoop configuration format.
> The standard and default way of configuring it is to use fair-scheduler.xml
> as the allocations file and to put the yarn.scheduler properties in
> yarn-site.xml.
> It is also possible to specify a different file as the allocations file, and
> to place the yarn.scheduler properties in fair-scheduler.xml, which will be
> interpreted as in the standard Hadoop configuration format. This flexibility
> is both confusing and unnecessary.
> Additionally, the allocation file is loaded as fair-scheduler.xml from the
> classpath if it is not specified, but is loaded as a File if it is. This
> causes two problems:
> 1. We see different behavior when not setting
> yarn.scheduler.fair.allocation.file and when setting it to
> fair-scheduler.xml, which is its default.
> 2. Classloaders may choose to cache resources, which can break the reload
> logic when yarn.scheduler.fair.allocation.file is not specified.
> We should never allow the yarn.scheduler properties to go into
> fair-scheduler.xml, and we should always load the allocations file as a
> file, not as a resource on the classpath. To preserve existing behavior and
> allow loading files from the classpath, we can look for files on the
> classpath, but strip off their scheme and interpret them as Files.
[jira] [Updated] (YARN-1236) FairScheduler setting queue name in RMApp is not working
[ https://issues.apache.org/jira/browse/YARN-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandy Ryza updated YARN-1236:
-----------------------------
    Attachment: YARN-1236.patch

> FairScheduler setting queue name in RMApp is not working
> --------------------------------------------------------
>
>          Key: YARN-1236
>          URL: https://issues.apache.org/jira/browse/YARN-1236
>      Project: Hadoop YARN
>   Issue Type: Bug
>   Components: resourcemanager
> Affects Versions: 2.1.0-beta
>     Reporter: Sandy Ryza
>     Assignee: Sandy Ryza
>  Attachments: YARN-1236.patch
>
>
> The fair scheduler sometimes picks a different queue than the one an
> application was submitted to, such as when user-as-default-queue is turned
> on. It needs to update the queue name in the RMApp so that this choice will
> be reflected in the UI.
> This isn't working because the scheduler is looking up the RMApp by
> application attempt id instead of app id and failing to find it.
[jira] [Created] (YARN-1236) FairScheduler setting queue name in RMApp is not working
Sandy Ryza created YARN-1236:
-----------------------------

      Summary: FairScheduler setting queue name in RMApp is not working
          Key: YARN-1236
          URL: https://issues.apache.org/jira/browse/YARN-1236
      Project: Hadoop YARN
   Issue Type: Bug
   Components: resourcemanager
 Affects Versions: 2.1.0-beta
     Reporter: Sandy Ryza
     Assignee: Sandy Ryza
  Attachments: YARN-1236.patch

The fair scheduler sometimes picks a different queue than the one an application was submitted to, such as when user-as-default-queue is turned on. It needs to update the queue name in the RMApp so that this choice will be reflected in the UI.

This isn't working because the scheduler is looking up the RMApp by application attempt id instead of app id and failing to find it.
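[Editor's note: the bug described above reduces to a map lookup with the wrong key type. The record types below are hypothetical stand-ins for illustration only; the real YARN types are ApplicationId and ApplicationAttemptId, where the attempt id carries its application id.]

```java
import java.util.HashMap;
import java.util.Map;

public class Main {
  // Illustrative stand-ins: an attempt id wraps the application id it
  // belongs to, mirroring the relationship between the real YARN records.
  record AppId(int id) {}
  record AppAttemptId(AppId appId, int attempt) {}

  // The RMApp table is keyed by application id, not attempt id. Looking it
  // up with the attempt id itself would always miss; deriving the app id
  // from the attempt first (as below) finds the entry.
  static String lookupQueue(Map<AppId, String> rmApps, AppAttemptId attemptId) {
    return rmApps.get(attemptId.appId());
  }

  public static void main(String[] args) {
    Map<AppId, String> rmApps = new HashMap<>();
    AppId app = new AppId(42);
    rmApps.put(app, "userQueue");

    AppAttemptId attempt = new AppAttemptId(app, 1);
    System.out.println(lookupQueue(rmApps, attempt));
  }
}
```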
[jira] [Commented] (YARN-1228) Clean up Fair Scheduler configuration loading
[ https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777572#comment-13777572 ] Sandy Ryza commented on YARN-1228: -- Updated patch includes a test > Clean up Fair Scheduler configuration loading > - > > Key: YARN-1228 > URL: https://issues.apache.org/jira/browse/YARN-1228 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 2.1.1-beta >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: YARN-1228-1.patch, YARN-1228.patch > > > Currently the Fair Scheduler is configured in two ways > * An allocations file that has a different format than the standard Hadoop > configuration file, which makes it easier to specify hierarchical objects > like queues and their properties. > * With properties like yarn.scheduler.fair.max.assign that are specified in > the standard Hadoop configuration format. > The standard and default way of configuring it is to use fair-scheduler.xml > as the allocations file and to put the yarn.scheduler properties in > yarn-site.xml. > It is also possible to specify a different file as the allocations file, and > to place the yarn.scheduler properties in fair-scheduler.xml, which will be > interpreted as in the standard Hadoop configuration format. This flexibility > is both confusing and unnecessary. > Additionally, the allocation file is loaded as fair-scheduler.xml from the > classpath if it is not specified, but is loaded as a File if it is. This > causes two problems > 1. We see different behavior when not setting the > yarn.scheduler.fair.allocation.file, and setting it to fair-scheduler.xml, > which is its default. > 2. Classloaders may choose to cache resources, which can break the reload > logic when yarn.scheduler.fair.allocation.file is not specified. > We should never allow the yarn.scheduler properties to go into > fair-scheduler.xml. 
And we should always load the allocations file as a > file, not as a resource on the classpath. To preserve existing behavior and > allow loading files from the classpath, we can look for files on the > classpath, but strip off their scheme and interpret them as Files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
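A minimal sketch of the loading rule proposed above, assuming a hypothetical helper (this is illustration, not the actual YARN-1228 patch): prefer a path on the local filesystem, fall back to a classpath lookup, and in the classpath case strip the URL scheme so the resource is still handled as a plain java.io.File rather than a cached classloader resource:

```java
import java.io.File;
import java.net.URL;

public class AllocationFileResolver {
    // Hypothetical helper: always return a java.io.File so reload logic can
    // watch modification times, even when the allocations file came from the
    // classpath.
    static File resolve(String configured) {
        File direct = new File(configured);
        if (direct.exists()) {
            return direct;  // an absolute or relative path on disk wins
        }
        // Fall back to the classpath, then drop the URL scheme so the
        // resource is treated as a plain File; this sidesteps classloader
        // resource caching, which can break reload checks.
        URL url = Thread.currentThread().getContextClassLoader()
            .getResource(configured);
        return url == null ? null : new File(url.getPath());
    }
}
```

Handling both cases through one File-returning path also removes the behavior difference between leaving yarn.scheduler.fair.allocation.file unset and setting it to its default.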
[jira] [Updated] (YARN-1228) Clean up Fair Scheduler configuration loading
[ https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1228: - Attachment: YARN-1228-1.patch > Clean up Fair Scheduler configuration loading > - > > Key: YARN-1228 > URL: https://issues.apache.org/jira/browse/YARN-1228 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 2.1.1-beta >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: YARN-1228-1.patch, YARN-1228.patch > > > Currently the Fair Scheduler is configured in two ways > * An allocations file that has a different format than the standard Hadoop > configuration file, which makes it easier to specify hierarchical objects > like queues and their properties. > * With properties like yarn.scheduler.fair.max.assign that are specified in > the standard Hadoop configuration format. > The standard and default way of configuring it is to use fair-scheduler.xml > as the allocations file and to put the yarn.scheduler properties in > yarn-site.xml. > It is also possible to specify a different file as the allocations file, and > to place the yarn.scheduler properties in fair-scheduler.xml, which will be > interpreted as in the standard Hadoop configuration format. This flexibility > is both confusing and unnecessary. > Additionally, the allocation file is loaded as fair-scheduler.xml from the > classpath if it is not specified, but is loaded as a File if it is. This > causes two problems > 1. We see different behavior when not setting the > yarn.scheduler.fair.allocation.file, and setting it to fair-scheduler.xml, > which is its default. > 2. Classloaders may choose to cache resources, which can break the reload > logic when yarn.scheduler.fair.allocation.file is not specified. > We should never allow the yarn.scheduler properties to go into > fair-scheduler.xml. And we should always load the allocations file as a > file, not as a resource on the classpath. 
To preserve existing behavior and > allow loading files from the classpath, we can look for files on the > classpath, but strip off their scheme and interpret them as Files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1128) FifoPolicy.computeShares throws NPE on empty list of Schedulables
[ https://issues.apache.org/jira/browse/YARN-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1128: - Hadoop Flags: Reviewed Committed to trunk, branch-2, and branch-2.1-beta > FifoPolicy.computeShares throws NPE on empty list of Schedulables > - > > Key: YARN-1128 > URL: https://issues.apache.org/jira/browse/YARN-1128 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Karthik Kambatla > Fix For: 2.1.2-beta > > Attachments: yarn-1128-1.patch > > > FifoPolicy gives all of a queue's share to the earliest-scheduled application. > {code} > Schedulable earliest = null; > for (Schedulable schedulable : schedulables) { > if (earliest == null || > schedulable.getStartTime() < earliest.getStartTime()) { > earliest = schedulable; > } > } > earliest.setFairShare(Resources.clone(totalResources)); > {code} > If the queue has no schedulables in it, earliest will be left null, leading > to an NPE on the last line. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
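The fix described above amounts to guarding the final dereference. A minimal self-contained sketch (Schedulable here is a stripped-down stand-in for the YARN interface, for illustration only):

```java
import java.util.List;

// Stand-in for org.apache.hadoop.yarn...Schedulable, reduced to the two
// methods the loop uses.
interface Schedulable {
    long getStartTime();
    void setFairShare(long share);
}

public class FifoShareSketch {
    static void computeShares(List<Schedulable> schedulables, long totalResources) {
        Schedulable earliest = null;
        for (Schedulable s : schedulables) {
            if (earliest == null || s.getStartTime() < earliest.getStartTime()) {
                earliest = s;
            }
        }
        // The guard: on an empty list there is nothing to assign, so we no
        // longer dereference a null `earliest`.
        if (earliest != null) {
            earliest.setFairShare(totalResources);
        }
    }
}
```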
[jira] [Commented] (YARN-1089) Add YARN compute units alongside virtual cores
[ https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777071#comment-13777071 ] Sandy Ryza commented on YARN-1089: -- As was requested, I posted a summary of the proposal on YARN-1024. In case it's not clear on the summary, here's the problem we're trying to solve: We want jobs to be portable between clusters. CPU is not a fluid resource in the way memory is. The number of cores on a machine is just as important as its total processing power when scheduling tasks. Imagine a cluster where every node has powerful CPUs with many cores. One type of task that will be run on the cluster saturates a full CPU, but another type of task that will be run on the cluster contains two threads, each of which can saturate only half a full CPU. If we have a single dimension for CPU requests, these tasks will request an equal number of those. What happens if we then move those tasks to a cluster with CPUs whose cores are half as fast? The first task will run half as fast, and the second task will run in the same amount of time. It's in the first task's interest to only request half as many CPU resources on that cluster. I'm also afraid of things getting complicated, but I can't think of anything better that doesn't require having the meaning of a virtual core vary widely from cluster to cluster. > Add YARN compute units alongside virtual cores > -- > > Key: YARN-1089 > URL: https://issues.apache.org/jira/browse/YARN-1089 > Project: Hadoop YARN > Issue Type: Improvement > Components: api >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: YARN-1089-1.patch, YARN-1089.patch > > > Based on discussion in YARN-1024, we will add YARN compute units as a > resource for requesting and scheduling CPU processing power. -- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1089) Add YARN compute units alongside virtual cores
[ https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777034#comment-13777034 ] Sandy Ryza commented on YARN-1089: -- I'm ok with waiting until 2.3. In case it's not clear, the consequence of this is that until then it will be impossible to place more tasks on a node than its number of virtual cores, which is essentially its number of physical cores. I think we should make YARN-976, documenting the meaning of vcores, a blocker for 2.2. > Add YARN compute units alongside virtual cores > -- > > Key: YARN-1089 > URL: https://issues.apache.org/jira/browse/YARN-1089 > Project: Hadoop YARN > Issue Type: Improvement > Components: api >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: YARN-1089-1.patch, YARN-1089.patch > > > Based on discussion in YARN-1024, we will add YARN compute units as a > resource for requesting and scheduling CPU processing power. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-1230) Fair scheduler aclSubmitApps does not handle acls with only groups
Sandy Ryza created YARN-1230: Summary: Fair scheduler aclSubmitApps does not handle acls with only groups Key: YARN-1230 URL: https://issues.apache.org/jira/browse/YARN-1230 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.1.1-beta Reporter: Sandy Ryza ACLs are specified like "user1,user2 group1,group2". An ACL with only groups should look like " group1,group2", but it will be interpreted incorrectly by the Fair Scheduler because it trims the leading space. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
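A hypothetical illustration of the pitfall (not the actual patch): ACLs have the form "users groups", so a groups-only ACL is " group1,group2" with a leading space and an empty user part. Splitting on the first space without trimming keeps the groups in the group position:

```java
public class AclSplitSketch {
    /**
     * Returns {users, groups}; either part may be empty. Deliberately does
     * NOT trim() first: a leading space means the user part is empty, and
     * trimming would shift the groups into the user position.
     */
    static String[] split(String acl) {
        int idx = acl.indexOf(' ');
        if (idx < 0) {
            return new String[] { acl, "" };  // users only, no groups
        }
        return new String[] { acl.substring(0, idx), acl.substring(idx + 1) };
    }
}
```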
[jira] [Commented] (YARN-1228) Clean up Fair Scheduler configuration loading
[ https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13775769#comment-13775769 ] Sandy Ryza commented on YARN-1228: -- Existing tests verify that absolute paths and not giving any file work. Adding a file to the classpath at runtime is difficult, so I verified that it picks up files from the classpath by manually testing on a pseudo-distributed cluster. > Clean up Fair Scheduler configuration loading > - > > Key: YARN-1228 > URL: https://issues.apache.org/jira/browse/YARN-1228 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 2.1.1-beta >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: YARN-1228.patch > > > Currently the Fair Scheduler is configured in two ways > * An allocations file that has a different format than the standard Hadoop > configuration file, which makes it easier to specify hierarchical objects > like queues and their properties. > * With properties like yarn.scheduler.fair.max.assign that are specified in > the standard Hadoop configuration format. > The standard and default way of configuring it is to use fair-scheduler.xml > as the allocations file and to put the yarn.scheduler properties in > yarn-site.xml. > It is also possible to specify a different file as the allocations file, and > to place the yarn.scheduler properties in fair-scheduler.xml, which will be > interpreted as in the standard Hadoop configuration format. This flexibility > is both confusing and unnecessary. > Additionally, the allocation file is loaded as fair-scheduler.xml from the > classpath if it is not specified, but is loaded as a File if it is. This > causes two problems > 1. We see different behavior when not setting the > yarn.scheduler.fair.allocation.file, and setting it to fair-scheduler.xml, > which is its default. > 2. 
Classloaders may choose to cache resources, which can break the reload > logic when yarn.scheduler.fair.allocation.file is not specified. > We should never allow the yarn.scheduler properties to go into > fair-scheduler.xml. And we should always load the allocations file as a > file, not as a resource on the classpath. To preserve existing behavior and > allow loading files from the classpath, we can look for files on the > classpath, but strip off their scheme and interpret them as Files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1228) Clean up Fair Scheduler configuration loading
[ https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1228: - Attachment: YARN-1228.patch > Clean up Fair Scheduler configuration loading > - > > Key: YARN-1228 > URL: https://issues.apache.org/jira/browse/YARN-1228 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 2.1.1-beta >Reporter: Sandy Ryza > Attachments: YARN-1228.patch > > > Currently the Fair Scheduler is configured in two ways > * An allocations file that has a different format than the standard Hadoop > configuration file, which makes it easier to specify hierarchical objects > like queues and their properties. > * With properties like yarn.scheduler.fair.max.assign that are specified in > the standard Hadoop configuration format. > The standard and default way of configuring it is to use fair-scheduler.xml > as the allocations file and to put the yarn.scheduler properties in > yarn-site.xml. > It is also possible to specify a different file as the allocations file, and > to place the yarn.scheduler properties in fair-scheduler.xml, which will be > interpreted as in the standard Hadoop configuration format. This flexibility > is both confusing and unnecessary. > Additionally, the allocation file is loaded as fair-scheduler.xml from the > classpath if it is not specified, but is loaded as a File if it is. This > causes two problems > 1. We see different behavior when not setting the > yarn.scheduler.fair.allocation.file, and setting it to fair-scheduler.xml, > which is its default. > 2. Classloaders may choose to cache resources, which can break the reload > logic when yarn.scheduler.fair.allocation.file is not specified. > We should never allow the yarn.scheduler properties to go into > fair-scheduler.xml. And we should always load the allocations file as a > file, not as a resource on the classpath. 
To preserve existing behavior and > allow loading files from the classpath, we can look for files on the > classpath, but strip off their scheme and interpret them as Files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1228) Clean up Fair Scheduler configuration loading
[ https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1228: - Description: Currently the Fair Scheduler is configured in two ways * An allocations file that has a different format than the standard Hadoop configuration file, which makes it easier to specify hierarchical objects like queues and their properties. * With properties like yarn.scheduler.fair.max.assign that are specified in the standard Hadoop configuration format. The standard and default way of configuring it is to use fair-scheduler.xml as the allocations file and to put the yarn.scheduler properties in yarn-site.xml. It is also possible to specify a different file as the allocations file, and to place the yarn.scheduler properties in fair-scheduler.xml, which will be interpreted as in the standard Hadoop configuration format. This flexibility is both confusing and unnecessary. Additionally, the allocation file is loaded as fair-scheduler.xml from the classpath if it is not specified, but is loaded as a File if it is. This causes two problems 1. We see different behavior when not setting the yarn.scheduler.fair.allocation.file, and setting it to fair-scheduler.xml, which is its default. 2. Classloaders may choose to cache resources, which can break the reload logic when yarn.scheduler.fair.allocation.file is not specified. We should never allow the yarn.scheduler properties to go into fair-scheduler.xml. And we should always load the allocations file as a file, not as a resource on the classpath. To preserve existing behavior and allow loading files from the classpath, we can look for files on the classpath, but strip off their scheme and interpret them as Files. was: Currently the Fair Scheduler is configured in two ways * An allocations file that has a different format than the standard Hadoop configuration file, which makes it easier to specify hierarchical objects like queues and their properties.
* With properties like yarn.scheduler.fair.max.assign that are specified in the standard Hadoop configuration format. The standard and default way of configuring it is to use fair-scheduler.xml as the allocations file and to put the yarn.scheduler properties in yarn-site.xml. It is also possible to specify a different file as the allocations file, and to place the yarn.scheduler properties in fair-scheduler.xml, which will be interpreted as in the standard Hadoop configuration format. This flexibility is both confusing and unnecessary. There's no need to keep around the second way. > Clean up Fair Scheduler configuration loading > - > > Key: YARN-1228 > URL: https://issues.apache.org/jira/browse/YARN-1228 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 2.1.1-beta >Reporter: Sandy Ryza > > Currently the Fair Scheduler is configured in two ways > * An allocations file that has a different format than the standard Hadoop > configuration file, which makes it easier to specify hierarchical objects > like queues and their properties. > * With properties like yarn.scheduler.fair.max.assign that are specified in > the standard Hadoop configuration format. > The standard and default way of configuring it is to use fair-scheduler.xml > as the allocations file and to put the yarn.scheduler properties in > yarn-site.xml. > It is also possible to specify a different file as the allocations file, and > to place the yarn.scheduler properties in fair-scheduler.xml, which will be > interpreted as in the standard Hadoop configuration format. This flexibility > is both confusing and unnecessary. > Additionally, the allocation file is loaded as fair-scheduler.xml from the > classpath if it is not specified, but is loaded as a File if it is. This > causes two problems > 1. We see different behavior when not setting the > yarn.scheduler.fair.allocation.file, and setting it to fair-scheduler.xml, > which is its default. > 2. 
Classloaders may choose to cache resources, which can break the reload > logic when yarn.scheduler.fair.allocation.file is not specified. > We should never allow the yarn.scheduler properties to go into > fair-scheduler.xml. And we should always load the allocations file as a > file, not as a resource on the classpath. To preserve existing behavior and > allow loading files from the classpath, we can look for files on the > classpath, but strip off their scheme and interpret them as Files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1228) Clean up Fair Scheduler configuration loading
[ https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1228: - Summary: Clean up Fair Scheduler configuration loading (was: Don't allow other file than fair-scheduler.xml to be Fair Scheduler allocations file) > Clean up Fair Scheduler configuration loading > - > > Key: YARN-1228 > URL: https://issues.apache.org/jira/browse/YARN-1228 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 2.1.1-beta >Reporter: Sandy Ryza > > Currently the Fair Scheduler is configured in two ways > * An allocations file that has a different format than the standard Hadoop > configuration file, which makes it easier to specify hierarchical objects > like queues and their properties. > * With properties like yarn.scheduler.fair.max.assign that are specified in > the standard Hadoop configuration format. > The standard and default way of configuring it is to use fair-scheduler.xml > as the allocations file and to put the yarn.scheduler properties in > yarn-site.xml. > It is also possible to specify a different file as the allocations file, and > to place the yarn.scheduler properties in fair-scheduler.xml, which will be > interpreted as in the standard Hadoop configuration format. This flexibility > is both confusing and unnecessary. There's no need to keep around the second > way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1227) Update Single Cluster doc to use yarn.resourcemanager.hostname
[ https://issues.apache.org/jira/browse/YARN-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1227: - Labels: newbie (was: ) > Update Single Cluster doc to use yarn.resourcemanager.hostname > -- > > Key: YARN-1227 > URL: https://issues.apache.org/jira/browse/YARN-1227 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza > Labels: newbie > > Now that yarn.resourcemanager.hostname can be used in place of > yarn.resourcemanager.address, yarn.resourcemanager.scheduler.address, etc., > we should update the doc to use it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
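A hedged sketch of what the updated doc could show in yarn-site.xml (the host name below is a placeholder): a single hostname entry from which the individual RM addresses default, instead of setting each address separately.

```xml
<!-- yarn-site.xml: instead of spelling out yarn.resourcemanager.address,
     yarn.resourcemanager.scheduler.address, etc., one hostname entry lets
     the individual addresses take their default ports on that host. -->
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>rm.example.com</value> <!-- placeholder host -->
</property>
```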
[jira] [Updated] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler
[ https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1188: - Assignee: Tsuyoshi OZAWA > The context of QueueMetrics becomes 'default' when using FairScheduler > -- > > Key: YARN-1188 > URL: https://issues.apache.org/jira/browse/YARN-1188 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.1.0-beta >Reporter: Akira AJISAKA >Assignee: Tsuyoshi OZAWA >Priority: Minor > Labels: metrics, newbie > Attachments: YARN-1188.1.patch > > > I found the context of QueueMetrics changed to 'default' from 'yarn' when I > was using FairScheduler. > The context should always be 'yarn' by adding an annotation to FSQueueMetrics > like below: > {code} > + @Metrics(context="yarn") > public class FSQueueMetrics extends QueueMetrics { > {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-1228) Don't allow other file than fair-scheduler.xml to be Fair Scheduler allocations file
Sandy Ryza created YARN-1228: Summary: Don't allow other file than fair-scheduler.xml to be Fair Scheduler allocations file Key: YARN-1228 URL: https://issues.apache.org/jira/browse/YARN-1228 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Affects Versions: 2.1.1-beta Reporter: Sandy Ryza Currently the Fair Scheduler is configured in two ways * An allocations file that has a different format than the standard Hadoop configuration file, which makes it easier to specify hierarchical objects like queues and their properties. * With properties like yarn.scheduler.fair.max.assign that are specified in the standard Hadoop configuration format. The standard and default way of configuring it is to use fair-scheduler.xml as the allocations file and to put the yarn.scheduler properties in yarn-site.xml. It is also possible to specify a different file as the allocations file, and to place the yarn.scheduler properties in fair-scheduler.xml, which will be interpreted as in the standard Hadoop configuration format. This flexibility is both confusing and unnecessary. There's no need to keep around the second way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-1227) Update Single Cluster doc to use yarn.resourcemanager.hostname
Sandy Ryza created YARN-1227: Summary: Update Single Cluster doc to use yarn.resourcemanager.hostname Key: YARN-1227 URL: https://issues.apache.org/jira/browse/YARN-1227 Project: Hadoop YARN Issue Type: Improvement Components: documentation Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Now that yarn.resourcemanager.hostname can be used in place of yarn.resourcemanager.address, yarn.resourcemanager.scheduler.address, etc., we should update the doc to use it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler
[ https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774344#comment-13774344 ] Sandy Ryza commented on YARN-1188: -- I just committed this to trunk and branch-2. Thanks Tsuyoshi! > The context of QueueMetrics becomes 'default' when using FairScheduler > -- > > Key: YARN-1188 > URL: https://issues.apache.org/jira/browse/YARN-1188 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.1.0-beta >Reporter: Akira AJISAKA >Assignee: Akira AJISAKA >Priority: Minor > Labels: metrics, newbie > Attachments: YARN-1188.1.patch > > > I found the context of QueueMetrics changed to 'default' from 'yarn' when I > was using FairScheduler. > The context should always be 'yarn' by adding an annotation to FSQueueMetrics > like below: > {code} > + @Metrics(context="yarn") > public class FSQueueMetrics extends QueueMetrics { > {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler
[ https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1188: - Assignee: (was: Akira AJISAKA) > The context of QueueMetrics becomes 'default' when using FairScheduler > -- > > Key: YARN-1188 > URL: https://issues.apache.org/jira/browse/YARN-1188 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.1.0-beta >Reporter: Akira AJISAKA >Priority: Minor > Labels: metrics, newbie > Attachments: YARN-1188.1.patch > > > I found the context of QueueMetrics changed to 'default' from 'yarn' when I > was using FairScheduler. > The context should always be 'yarn' by adding an annotation to FSQueueMetrics > like below: > {code} > + @Metrics(context="yarn") > public class FSQueueMetrics extends QueueMetrics { > {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler
[ https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1188: - Assignee: Akira AJISAKA > The context of QueueMetrics becomes 'default' when using FairScheduler > -- > > Key: YARN-1188 > URL: https://issues.apache.org/jira/browse/YARN-1188 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.1.0-beta >Reporter: Akira AJISAKA >Assignee: Akira AJISAKA >Priority: Minor > Labels: metrics, newbie > Attachments: YARN-1188.1.patch > > > I found the context of QueueMetrics changed to 'default' from 'yarn' when I > was using FairScheduler. > The context should always be 'yarn' by adding an annotation to FSQueueMetrics > like below: > {code} > + @Metrics(context="yarn") > public class FSQueueMetrics extends QueueMetrics { > {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler
[ https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774336#comment-13774336 ] Sandy Ryza commented on YARN-1188: -- +1 > The context of QueueMetrics becomes 'default' when using FairScheduler > -- > > Key: YARN-1188 > URL: https://issues.apache.org/jira/browse/YARN-1188 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.1.0-beta >Reporter: Akira AJISAKA >Priority: Minor > Labels: metrics, newbie > Attachments: YARN-1188.1.patch > > > I found the context of QueueMetrics changed to 'default' from 'yarn' when I > was using FairScheduler. > The context should always be 'yarn' by adding an annotation to FSQueueMetrics > like below: > {code} > + @Metrics(context="yarn") > public class FSQueueMetrics extends QueueMetrics { > {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1089) Add YARN compute units alongside virtual cores
[ https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1089: - Attachment: YARN-1089-1.patch Updated patch should fix TestFairScheduler and TestSchedulerUtils. The TestRMContainerAllocator failure looks like MAPREDUCE-5514. > Add YARN compute units alongside virtual cores > -- > > Key: YARN-1089 > URL: https://issues.apache.org/jira/browse/YARN-1089 > Project: Hadoop YARN > Issue Type: Improvement > Components: api >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: YARN-1089-1.patch, YARN-1089.patch > > > Based on discussion in YARN-1024, we will add YARN compute units as a > resource for requesting and scheduling CPU processing power. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-1221) With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
Sandy Ryza created YARN-1221: Summary: With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely Key: YARN-1221 URL: https://issues.apache.org/jira/browse/YARN-1221 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.1.0-beta Reporter: Sandy Ryza -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1128) FifoPolicy.computeShares throws NPE on empty list of Schedulables
[ https://issues.apache.org/jira/browse/YARN-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770823#comment-13770823 ] Sandy Ryza commented on YARN-1128: -- +1 > FifoPolicy.computeShares throws NPE on empty list of Schedulables > - > > Key: YARN-1128 > URL: https://issues.apache.org/jira/browse/YARN-1128 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Karthik Kambatla > Attachments: yarn-1128-1.patch > > > FifoPolicy gives all of a queue's share to the earliest-scheduled application. > {code} > Schedulable earliest = null; > for (Schedulable schedulable : schedulables) { > if (earliest == null || > schedulable.getStartTime() < earliest.getStartTime()) { > earliest = schedulable; > } > } > earliest.setFairShare(Resources.clone(totalResources)); > {code} > If the queue has no schedulables in it, earliest will be left null, leading > to an NPE on the last line. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-1213) Add an equivalent of mapred.fairscheduler.allow.undeclared.pools to the Fair Scheduler
Sandy Ryza created YARN-1213: Summary: Add an equivalent of mapred.fairscheduler.allow.undeclared.pools to the Fair Scheduler Key: YARN-1213 URL: https://issues.apache.org/jira/browse/YARN-1213 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Affects Versions: 2.1.0-beta Reporter: Sandy Ryza -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1089) Add YARN compute units alongside virtual cores
[ https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1089: - Attachment: YARN-1089.patch > Add YARN compute units alongside virtual cores > -- > > Key: YARN-1089 > URL: https://issues.apache.org/jira/browse/YARN-1089 > Project: Hadoop YARN > Issue Type: Improvement > Components: api >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: YARN-1089.patch > > > Based on discussion in YARN-1024, we will add YARN compute units as a > resource for requesting and scheduling CPU processing power.
[jira] [Updated] (YARN-1089) Add YARN compute units alongside virtual cores
[ https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1089: - Attachment: (was: YARN-1089.patch) > Add YARN compute units alongside virtual cores > -- > > Key: YARN-1089 > URL: https://issues.apache.org/jira/browse/YARN-1089 > Project: Hadoop YARN > Issue Type: Improvement > Components: api >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Sandy Ryza > > Based on discussion in YARN-1024, we will add YARN compute units as a > resource for requesting and scheduling CPU processing power.
[jira] [Updated] (YARN-1089) Add YARN compute units alongside virtual cores
[ https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1089: - Attachment: YARN-1089.patch > Add YARN compute units alongside virtual cores > -- > > Key: YARN-1089 > URL: https://issues.apache.org/jira/browse/YARN-1089 > Project: Hadoop YARN > Issue Type: Improvement > Components: api >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: YARN-1089.patch > > > Based on discussion in YARN-1024, we will add YARN compute units as a > resource for requesting and scheduling CPU processing power.
[jira] [Commented] (YARN-1206) Container logs link is broken on RM web UI after application finished
[ https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770430#comment-13770430 ] Sandy Ryza commented on YARN-1206: -- Did this work differently prior to YARN-649? My impression was that this was the existing behavior. > Container logs link is broken on RM web UI after application finished > - > > Key: YARN-1206 > URL: https://issues.apache.org/jira/browse/YARN-1206 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Priority: Blocker > Labels: 2.1.1-beta > > With log aggregation disabled, when container is running, its logs link works > properly, but after the application is finished, the link shows 'Container > does not exist.'
[jira] [Updated] (YARN-1206) Container logs link is broken on RM web UI after application finished
[ https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1206: - Labels: (was: 2.1.1-beta) > Container logs link is broken on RM web UI after application finished > - > > Key: YARN-1206 > URL: https://issues.apache.org/jira/browse/YARN-1206 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Priority: Blocker > > With log aggregation disabled, when container is running, its logs link works > properly, but after the application is finished, the link shows 'Container > does not exist.'
[jira] [Created] (YARN-1212) Add a Scheduling Concepts page to the web doc
Sandy Ryza created YARN-1212: Summary: Add a Scheduling Concepts page to the web doc Key: YARN-1212 URL: https://issues.apache.org/jira/browse/YARN-1212 Project: Hadoop YARN Issue Type: Improvement Components: documentation Affects Versions: 2.1.1-beta Reporter: Sandy Ryza Assignee: Sandy Ryza It would be helpful to have a page that explains some of the non-obvious concepts used in YARN scheduling. We get a lot of questions about these on the user lists because they aren't really documented anywhere. An incomplete list of concepts to cover: * Resources (memory / CPU) * Reservations * ResourceRequest format * Disabling locality relaxation
[jira] [Updated] (YARN-1184) ClassCastException is thrown during preemption When a huge job is submitted to a queue B whose resources is used by a job in queueA
[ https://issues.apache.org/jira/browse/YARN-1184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1184: - Component/s: capacityscheduler > ClassCastException is thrown during preemption When a huge job is submitted > to a queue B whose resources is used by a job in queueA > --- > > Key: YARN-1184 > URL: https://issues.apache.org/jira/browse/YARN-1184 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler, resourcemanager >Affects Versions: 2.1.0-beta >Reporter: J.Andreina >Assignee: Devaraj K > > Preemption is enabled. > Queue = a,b > a capacity = 30% > b capacity = 70% > Step 1: Assign a big job to queue a (so that job_a will utilize some > resources from queue b) > Step 2: Assign a big job to queue b. > The following exception is thrown at the Resource Manager: > {noformat} > 2013-09-12 10:42:32,535 ERROR [SchedulingMonitor > (ProportionalCapacityPreemptionPolicy)] yarn.YarnUncaughtExceptionHandler > (YarnUncaughtExceptionHandler.java:uncaughtException(68)) - Thread > Thread[SchedulingMonitor (ProportionalCapacityPreemptionPolicy),5,main] threw > an Exception. 
> java.lang.ClassCastException: java.util.Collections$UnmodifiableSet cannot be > cast to java.util.NavigableSet > at > org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.getContainersToPreempt(ProportionalCapacityPreemptionPolicy.java:403) > at > org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.containerBasedPreemptOrKill(ProportionalCapacityPreemptionPolicy.java:202) > at > org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.editSchedule(ProportionalCapacityPreemptionPolicy.java:173) > at > org.apache.hadoop.yarn.server.resourcemanager.monitor.SchedulingMonitor.invokePolicy(SchedulingMonitor.java:72) > at > org.apache.hadoop.yarn.server.resourcemanager.monitor.SchedulingMonitor$PreemptionChecker.run(SchedulingMonitor.java:82) > at java.lang.Thread.run(Thread.java:662) > {noformat}
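The root cause shown in the stack trace is that `Collections.unmodifiableSet` returns a plain `Set` wrapper that does not implement `NavigableSet`, so the downcast fails at runtime even when the backing set is a `TreeSet`. A self-contained sketch of the failure mode and one possible defensive repair (copying into a fresh `TreeSet`); this is an illustration of the Java behavior, not the actual YARN-1184 patch:

```java
import java.util.NavigableSet;
import java.util.Set;
import java.util.TreeSet;

class PreemptionCastSketch {
    // Reproduces the failure mode: casting a Collections$UnmodifiableSet
    // back to NavigableSet throws ClassCastException, because the wrapper
    // only implements Set regardless of the underlying implementation.
    static boolean castFails(Set<Integer> wrapped) {
        try {
            NavigableSet<Integer> ns = (NavigableSet<Integer>) wrapped;
            ns.first(); // never reached when the cast fails
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // One defensive option: copy into a fresh TreeSet, which is a
    // NavigableSet, instead of casting the unmodifiable wrapper.
    static NavigableSet<Integer> asNavigable(Set<Integer> s) {
        return new TreeSet<>(s);
    }
}
```

(Java 8 later added `Collections.unmodifiableNavigableSet`, which preserves the `NavigableSet` interface and avoids the copy.)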
[jira] [Commented] (YARN-938) Hadoop 2 benchmarking
[ https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763829#comment-13763829 ] Sandy Ryza commented on YARN-938: - On vacation now, but I'll try to assemble them into a presentable form when I get back. > Hadoop 2 benchmarking > -- > > Key: YARN-938 > URL: https://issues.apache.org/jira/browse/YARN-938 > Project: Hadoop YARN > Issue Type: Task >Reporter: Mayank Bansal >Assignee: Mayank Bansal > Attachments: Hadoop-benchmarking-2.x-vs-1.x.xls > > > I am running the benchmarks on Hadoop 2 and will update the results soon. > Thanks, > Mayank
[jira] [Commented] (YARN-938) Hadoop 2 benchmarking
[ https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763798#comment-13763798 ] Sandy Ryza commented on YARN-938: - Thanks for working on these, [~mayank_bansal]. The results are pretty consistent with some internal benchmarking we've done at Cloudera. A few questions: * In MR1 was io.sort.record.percent tuned to spill the same number of times as MR2 does? * What was slowstart completed maps set to? * How many slots and MB were the TTs and NMs configured with? * Any idea what caused the improvement between RC1 and the final release? I'm guessing MAPREDUCE-5399 helped. > Hadoop 2 benchmarking > -- > > Key: YARN-938 > URL: https://issues.apache.org/jira/browse/YARN-938 > Project: Hadoop YARN > Issue Type: Task >Reporter: Mayank Bansal >Assignee: Mayank Bansal > Attachments: Hadoop-benchmarking-2.x-vs-1.x.xls > > > I am running the benchmarks on Hadoop 2 and will update the results soon. > Thanks, > Mayank
[jira] [Created] (YARN-1171) Add defaultQueueSchedulingPolicy to Fair Scheduler documentation
Sandy Ryza created YARN-1171: Summary: Add defaultQueueSchedulingPolicy to Fair Scheduler documentation Key: YARN-1171 URL: https://issues.apache.org/jira/browse/YARN-1171 Project: Hadoop YARN Issue Type: Improvement Components: documentation, scheduler Affects Versions: 2.1.0-beta Reporter: Sandy Ryza The Fair Scheduler doc is missing the defaultQueueSchedulingPolicy property. I suspect there are a few other ones too that provide defaults for all queues.
[jira] [Commented] (YARN-1049) ContainerExistStatus should define a status for preempted containers
[ https://issues.apache.org/jira/browse/YARN-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761164#comment-13761164 ] Sandy Ryza commented on YARN-1049: -- +1 > ContainerExistStatus should define a status for preempted containers > > > Key: YARN-1049 > URL: https://issues.apache.org/jira/browse/YARN-1049 > Project: Hadoop YARN > Issue Type: Bug > Components: api >Affects Versions: 2.1.0-beta >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur >Priority: Blocker > Fix For: 2.1.1-beta > > Attachments: YARN-1049.patch > > > With the current behavior it is impossible to determine if a container has been > preempted or lost due to a NM crash. > Adding a PREEMPTED exit status (-102) will help an AM determine that a > container has been preempted. > Note the change of scope from the original summary/description. The original > scope proposed API/behavior changes. Because we are past 2.1.0-beta I'm > reducing the scope of this JIRA.
[jira] [Commented] (YARN-910) Allow auxiliary services to listen for container starts and completions
[ https://issues.apache.org/jira/browse/YARN-910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13759511#comment-13759511 ] Sandy Ryza commented on YARN-910: - Can we add in the initializeContainer doc exactly when this will be called? I.e. whether it's after getting a container launch command from the AM or whether it's after launching the actual container. There is a spurious whitespace change in ApplicationImpl.java. Otherwise, LGTM. > Allow auxiliary services to listen for container starts and completions > --- > > Key: YARN-910 > URL: https://issues.apache.org/jira/browse/YARN-910 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Alejandro Abdelnur > Attachments: YARN-910.patch, YARN-910.patch > > > Making container start and completion events available to auxiliary services > would allow them to be resource-aware. The auxiliary service would be able > to notify a co-located service that is opportunistically using free capacity > of allocation changes.
[jira] [Commented] (YARN-1100) Giving multiple commands to ContainerLaunchContext doesn't work as expected
[ https://issues.apache.org/jira/browse/YARN-1100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756795#comment-13756795 ] Sandy Ryza commented on YARN-1100: -- In the MR and distributed shell code, the command list does not seem to be used for holding arguments to the same process. All arguments are concatenated and placed in a single entry in the commands list. > Giving multiple commands to ContainerLaunchContext doesn't work as expected > --- > > Key: YARN-1100 > URL: https://issues.apache.org/jira/browse/YARN-1100 > Project: Hadoop YARN > Issue Type: Bug > Components: api, nodemanager >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza > > A ContainerLaunchContext accepts a list of commands (as strings) to be > executed to launch the container. I would expect that giving a list with the > two commands "echo yolo" and "date" would print something like > {code} > yolo > Mon Aug 26 14:40:23 PDT 2013 > {code} > Instead it prints > {code} > yolo date > {code} > This is because the commands get executed with: > {code} > exec /bin/bash -c "echo yolo date" > {code} > To get the expected behavior I have to include semicolons at the end of each > command. At the very least, this should be documented, but I think better > would be for the NM to insert the semicolons. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
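The behavior described in YARN-1100 can be sketched as a small helper that terminates each command with a semicolon before the list is flattened into the single string handed to `bash -c`, as the report suggests the NM could do. The helper name `joinForBash` is hypothetical, not actual NodeManager code:

```java
import java.util.List;
import java.util.stream.Collectors;

class LaunchCommandSketch {
    // The NM effectively runs: exec /bin/bash -c "<commands joined by spaces>",
    // so "echo yolo" and "date" collapse into the single command
    // "echo yolo date". Terminating each entry with ';' keeps them as
    // separate shell commands after the join.
    static String joinForBash(List<String> commands) {
        return commands.stream()
            .map(c -> c.endsWith(";") ? c : c + ";")
            .collect(Collectors.joining(" "));
    }
}
```

With this, the example from the report becomes `echo yolo; date;`, which the shell runs as two commands.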
[jira] [Commented] (YARN-649) Make container logs available over HTTP in plain text
[ https://issues.apache.org/jira/browse/YARN-649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756793#comment-13756793 ] Sandy Ryza commented on YARN-649: - Thanks Vinod! > Make container logs available over HTTP in plain text > - > > Key: YARN-649 > URL: https://issues.apache.org/jira/browse/YARN-649 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.0.4-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Fix For: 2.3.0 > > Attachments: YARN-649-2.patch, YARN-649-3.patch, YARN-649-4.patch, > YARN-649-5.patch, YARN-649-6.patch, YARN-649-7.patch, YARN-649.patch, > YARN-752-1.patch > > > It would be good to make container logs available over the REST API for > MAPREDUCE-4362 and so that they can be accessed programmatically in general.
[jira] [Commented] (YARN-1135) Revisit log message levels in FairScheduler.
[ https://issues.apache.org/jira/browse/YARN-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755977#comment-13755977 ] Sandy Ryza commented on YARN-1135: -- Agreed > Revisit log message levels in FairScheduler. > > > Key: YARN-1135 > URL: https://issues.apache.org/jira/browse/YARN-1135 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 2.1.0-beta >Reporter: Alejandro Abdelnur >Assignee: Sandy Ryza > Fix For: 2.1.1-beta > > > FairScheduler logs allocation attempts at INFO level; those should be at > DEBUG level: > {code} > 2013-09-02 09:14:47,815 INFO [ResourceManager Event Processor] > fair.AppSchedulable (AppSchedulable.java:assignContainer(277)) - Node offered > to app: application_1378106082360_0001 reserved: false > 2013-09-02 09:14:47,815 INFO [ResourceManager Event Processor] > fair.AppSchedulable (AppSchedulable.java:assignContainer(277)) - Node offered > to app: application_1378106082360_0002 reserved: false > 2013-09-02 09:14:48,247 INFO [ResourceManager Event Processor] > fair.AppSchedulable (AppSchedulable.java:assignContainer(277)) - Node offered > to app: application_1378106082360_0001 reserved: false > ... > {code} > We should curate all log message levels.
[jira] [Commented] (YARN-1090) Job does not get into Pending State
[ https://issues.apache.org/jira/browse/YARN-1090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755368#comment-13755368 ] Sandy Ryza commented on YARN-1090: -- Ah, ok, makes total sense > Job does not get into Pending State > --- > > Key: YARN-1090 > URL: https://issues.apache.org/jira/browse/YARN-1090 > Project: Hadoop YARN > Issue Type: Bug >Reporter: yeshavora >Assignee: Jian He > Attachments: YARN-1090.patch > > > When there is no resource available to run a job, the next job should go into > the pending state. The RM UI should show the next job as a pending app, and the > counter for pending apps should be incremented. > But currently the next job stays in the ACCEPTED state, no AM is assigned to > it, and the pending app count is not incremented. > Running 'job -status' shows job state=PREP. > $ mapred job -status job_1377122233385_0002 > 13/08/21 21:59:23 INFO client.RMProxy: Connecting to ResourceManager at > host1/ip1 > Job: job_1377122233385_0002 > Job File: /ABC/.staging/job_1377122233385_0002/job.xml > Job Tracking URL : http://host1:port1/application_1377122233385_0002/ > Uber job : false > Number of maps: 0 > Number of reduces: 0 > map() completion: 0.0 > reduce() completion: 0.0 > Job state: PREP > retired: false > reason for failure:
[jira] [Commented] (YARN-1090) Job does not get into Pending State
[ https://issues.apache.org/jira/browse/YARN-1090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755219#comment-13755219 ] Sandy Ryza commented on YARN-1090: -- My understanding was that an application is "Pending" if no AM has yet been allocated to it. Can you go a little more into what the issue is with this definition? Also, whatever we decide, is there somewhere we can document exactly what these mean? > Job does not get into Pending State > --- > > Key: YARN-1090 > URL: https://issues.apache.org/jira/browse/YARN-1090 > Project: Hadoop YARN > Issue Type: Bug >Reporter: yeshavora >Assignee: Jian He > Attachments: YARN-1090.patch > > > When there is no resource available to run a job, the next job should go into > the pending state. The RM UI should show the next job as a pending app, and the > counter for pending apps should be incremented. > But currently the next job stays in the ACCEPTED state, no AM is assigned to > it, and the pending app count is not incremented. > Running 'job -status' shows job state=PREP. > $ mapred job -status job_1377122233385_0002 > 13/08/21 21:59:23 INFO client.RMProxy: Connecting to ResourceManager at > host1/ip1 > Job: job_1377122233385_0002 > Job File: /ABC/.staging/job_1377122233385_0002/job.xml > Job Tracking URL : http://host1:port1/application_1377122233385_0002/ > Uber job : false > Number of maps: 0 > Number of reduces: 0 > map() completion: 0.0 > reduce() completion: 0.0 > Job state: PREP > retired: false > reason for failure:
[jira] [Created] (YARN-1128) FifoPolicy.computeShares throws NPE on empty list of Schedulables
Sandy Ryza created YARN-1128: Summary: FifoPolicy.computeShares throws NPE on empty list of Schedulables Key: YARN-1128 URL: https://issues.apache.org/jira/browse/YARN-1128 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.1.0-beta Reporter: Sandy Ryza FifoPolicy gives all of a queue's share to the earliest-scheduled application. {code} Schedulable earliest = null; for (Schedulable schedulable : schedulables) { if (earliest == null || schedulable.getStartTime() < earliest.getStartTime()) { earliest = schedulable; } } earliest.setFairShare(Resources.clone(totalResources)); {code} If the queue has no schedulables in it, earliest will be left null, leading to an NPE on the last line.
[jira] [Commented] (YARN-1122) FairScheduler user-as-default-queue always defaults to 'default'
[ https://issues.apache.org/jira/browse/YARN-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753804#comment-13753804 ] Sandy Ryza commented on YARN-1122: -- Thanks for taking this up, [~lohit]. Would it be possible to add a test? > FairScheduler user-as-default-queue always defaults to 'default' > > > Key: YARN-1122 > URL: https://issues.apache.org/jira/browse/YARN-1122 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 2.0.5-alpha >Reporter: Lohit Vijayarenu > Attachments: YARN-1122.1.patch > > > By default YARN fairscheduler should use user name as queue name, but we see > that in our clusters all jobs were ending up in default queue. Even after > picking YARN-333 which is part of trunk, the behavior remains the same. Jobs > do end up in right queue, but from UI perspective they are shown as running > under default queue. It looks like there is small bug with > {noformat} > RMApp rmApp = rmContext.getRMApps().get(applicationAttemptId); > {noformat} > which should actually be > {noformat} > RMApp rmApp = > rmContext.getRMApps().get(applicationAttemptId.getApplicationId()); > {noformat} > There is also a simple js change needed for filtering of jobs on > fairscheduler UI page.
[jira] [Updated] (YARN-1034) Remove "experimental" in the Fair Scheduler documentation
[ https://issues.apache.org/jira/browse/YARN-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1034: - Fix Version/s: 2.1.1-beta > Remove "experimental" in the Fair Scheduler documentation > - > > Key: YARN-1034 > URL: https://issues.apache.org/jira/browse/YARN-1034 > Project: Hadoop YARN > Issue Type: Task > Components: documentation, scheduler >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Karthik Kambatla >Priority: Trivial > Labels: doc > Fix For: 2.1.1-beta > > Attachments: yarn-1034-1.patch > > > The YARN Fair Scheduler is largely stable now, and should no longer be > declared experimental.
[jira] [Commented] (YARN-1034) Remove "experimental" in the Fair Scheduler documentation
[ https://issues.apache.org/jira/browse/YARN-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753194#comment-13753194 ] Sandy Ryza commented on YARN-1034: -- +1 > Remove "experimental" in the Fair Scheduler documentation > - > > Key: YARN-1034 > URL: https://issues.apache.org/jira/browse/YARN-1034 > Project: Hadoop YARN > Issue Type: Task > Components: documentation, scheduler >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Karthik Kambatla >Priority: Trivial > Labels: doc > Attachments: yarn-1034-1.patch > > > The YARN Fair Scheduler is largely stable now, and should no longer be > declared experimental.
[jira] [Created] (YARN-1110) NodeManager doesn't complete container after transition from LOCALIZED to KILLING
Sandy Ryza created YARN-1110: Summary: NodeManager doesn't complete container after transition from LOCALIZED to KILLING Key: YARN-1110 URL: https://issues.apache.org/jira/browse/YARN-1110 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Multiple containers are sticking around on an NM, taking up resources, after they have been killed. {code} 2013-08-27 15:56:36,597 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Start request for container_1377559361179_0018_01_001337 by user llama 2013-08-27 15:56:36,597 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=llama IP=10.20.191.233OPERATION=Start Container Request TARGET=ContainerManageImpl RESULT=SUCCESS APPID=application_1377559361179_0018 CONTAINERID=container_1377559361179_0018_01_001337 2013-08-27 15:56:36,598 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Adding container_1377559361179_0018_01_001337 to application application_1377559361179_0018 2013-08-27 15:56:36,598 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1377559361179_0018_01_001337 transitioned from NEW to LOCALIZED 2013-08-27 15:56:36,613 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Stopping container with container Id: container_1377559361179_0018_01_001337 2013-08-27 15:56:36,616 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=llama IP=10.20.191.233OPERATION=Stop Container Request TARGET=ContainerManageImpl RESULT=SUCCESS APPID=application_1377559361179_0018 CONTAINERID=container_1377559361179_0018_01_001337 2013-08-27 15:56:36,616 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1377559361179_0018_01_001337 transitioned from LOCALIZED to KILLING 2013-08-27 15:56:36,616 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_1377559361179_0018_01_001337 2013-08-27 15:56:36,616 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Container container_1377559361179_0018_01_001337 not launched. No cleanup needed to be done 2013-08-27 15:56:36,617 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending out status for container: container_id {, app_attempt_id {, application_id {, id: 18, cluster_timestamp: 1377559361179, }, attemptId: 1, }, id: 402, }, state: C_RUNNING, diagnostics: "", exit_status: -1000, {code} This is the last time the container is mentioned in the logs. We never get a {code} 2013-08-27 15:56:38,832 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Removed completed container {code} like we do for other completed containers.
[jira] [Created] (YARN-1109) Consider throttling or demoting NodeManager "Sending out status for container" logs
Sandy Ryza created YARN-1109: Summary: Consider throttling or demoting NodeManager "Sending out status for container" logs Key: YARN-1109 URL: https://issues.apache.org/jira/browse/YARN-1109 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Diagnosing NodeManager and container launch problems is made more difficult by the enormous number of logs like {code} Sending out status for container: container_id {, app_attempt_id {, application_id {, id: 18, cluster_timestamp: 1377559361179, }, attemptId: 1, }, id: 1337, }, state: C_RUNNING, diagnostics: "Container killed by the ApplicationMaster.\n", exit_status: -1000 {code} On an NM with a few containers I am seeing tens of these per second.
[jira] [Updated] (YARN-832) Update Resource javadoc to clarify units for memory
[ https://issues.apache.org/jira/browse/YARN-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-832: Target Version/s: 2.3.0 > Update Resource javadoc to clarify units for memory > --- > > Key: YARN-832 > URL: https://issues.apache.org/jira/browse/YARN-832 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Bikas Saha > Labels: newbie > > These values are supposed to be megabytes (need to check MB vs MiB, i.e. 1000 vs > 1024) > /** >* Get memory of the resource. >* @return memory of the resource >*/ > @Public > @Stable > public abstract int getMemory(); > > /** >* Set memory of the resource. >* @param memory memory of the resource >*/ > @Public > @Stable > public abstract void setMemory(int memory);
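A sketch of what the clarified javadoc might look like. `ResourceSketch` is an illustrative stand-in for the actual `Resource` class, and "megabytes" is hedged because the report itself notes that MB vs. MiB still needs to be confirmed:

```java
// Illustrative only: the real Resource class is abstract and carries
// @Public/@Stable annotations; this concrete version exists so the
// clarified javadoc can be shown in a runnable form.
class ResourceSketch {
    private int memory;

    /**
     * Get the memory of the resource.
     * @return memory of the resource, in megabytes
     */
    public int getMemory() { return memory; }

    /**
     * Set the memory of the resource.
     * @param memory memory of the resource, in megabytes
     */
    public void setMemory(int memory) { this.memory = memory; }
}
```

The only substantive change over the quoted javadoc is stating the unit in both the `@return` and `@param` tags.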
[jira] [Updated] (YARN-832) Update Resource javadoc to clarify units for memory
[ https://issues.apache.org/jira/browse/YARN-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-832: Labels: newbie (was: ) > Update Resource javadoc to clarify units for memory > --- > > Key: YARN-832 > URL: https://issues.apache.org/jira/browse/YARN-832 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Bikas Saha > Labels: newbie > Fix For: 2.3.0 > > > These values are supposed to be megabytes (need to check MB vs MiB ie 1000 vs > 1024) > /** >* Get memory of the resource. >* @return memory of the resource >*/ > @Public > @Stable > public abstract int getMemory(); > > /** >* Set memory of the resource. >* @param memory memory of the resource >*/ > @Public > @Stable > public abstract void setMemory(int memory);
[jira] [Updated] (YARN-832) Update Resource javadoc to clarify units for memory
[ https://issues.apache.org/jira/browse/YARN-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-832: Fix Version/s: (was: 2.3.0) > Update Resource javadoc to clarify units for memory > --- > > Key: YARN-832 > URL: https://issues.apache.org/jira/browse/YARN-832 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Bikas Saha > Labels: newbie > > These values are supposed to be megabytes (need to check MB vs MiB ie 1000 vs > 1024) > /** >* Get memory of the resource. >* @return memory of the resource >*/ > @Public > @Stable > public abstract int getMemory(); > > /** >* Set memory of the resource. >* @param memory memory of the resource >*/ > @Public > @Stable > public abstract void setMemory(int memory); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
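The MB-vs-MiB ambiguity YARN-832 flags (1000^2 vs 1024^2 bytes) can be made concrete. A minimal sketch of the two readings; the class and method names here are illustrative, not part of the YARN Resource API:

```java
// Illustrates the MB-vs-MiB ambiguity the Resource javadoc should resolve.
// Hypothetical names; only the constants reflect the SI/binary definitions.
public class MemoryUnits {
    static final long MB  = 1_000L * 1_000L; // SI megabyte: 1,000,000 bytes
    static final long MiB = 1_024L * 1_024L; // binary mebibyte: 1,048,576 bytes

    // The same integer "memory" value means different byte counts
    // depending on which unit the javadoc specifies.
    static long asMegabytes(int memory) { return memory * MB; }
    static long asMebibytes(int memory) { return memory * MiB; }
}
```

For a typical 2048-unit container the two readings differ by roughly 5%, which is why the javadoc needs to commit to one of them.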
[jira] [Commented] (YARN-723) Yarn default value of physical cpu cores to virtual core is 2
[ https://issues.apache.org/jira/browse/YARN-723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13751775#comment-13751775 ] Sandy Ryza commented on YARN-723: - Resolving as invalid now that YARN-782 removed the vcores-pcores-ratio > Yarn default value of physical cpu cores to virtual core is 2 > - > > Key: YARN-723 > URL: https://issues.apache.org/jira/browse/YARN-723 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.0.4-alpha >Reporter: Bikas Saha > Fix For: 2.3.0 > > > The default virtual core allocation in the RM is 1. That means every > container will get 1 virtual core == 1/2 a physical core. Not sure if this > breaks implicit MR assumptions of maps/reduces getting at least 1 physical > cpu. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (YARN-723) Yarn default value of physical cpu cores to virtual core is 2
[ https://issues.apache.org/jira/browse/YARN-723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza resolved YARN-723. - Resolution: Invalid > Yarn default value of physical cpu cores to virtual core is 2 > - > > Key: YARN-723 > URL: https://issues.apache.org/jira/browse/YARN-723 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.0.4-alpha >Reporter: Bikas Saha > Fix For: 2.3.0 > > > The default virtual core allocation in the RM is 1. That means every > container will get 1 virtual core == 1/2 a physical core. Not sure if this > breaks implicit MR assumptions of maps/reduces getting at least 1 physical > cpu. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-1100) Giving multiple commands to ContainerLaunchContext doesn't work as expected
Sandy Ryza created YARN-1100: Summary: Giving multiple commands to ContainerLaunchContext doesn't work as expected Key: YARN-1100 URL: https://issues.apache.org/jira/browse/YARN-1100 Project: Hadoop YARN Issue Type: Bug Components: api, nodemanager Affects Versions: 2.1.0-beta Reporter: Sandy Ryza A ContainerLaunchContext accepts a list of commands (as strings) to be executed to launch the container. I would expect that giving a list with the two commands "echo yolo" and "date" would print something like {code} yolo Mon Aug 26 14:40:23 PDT 2013 {code} Instead it prints {code} yolo date {code} This is because the commands get executed with: {code} exec /bin/bash -c "echo yolo date" {code} To get the expected behavior I have to include semicolons at the end of each command. At the very least, this should be documented, but I think better would be for the NM to insert the semicolons. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
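Because the NM concatenates the command strings before handing them to `bash -c`, clients currently have to add the separators themselves. A sketch of the client-side workaround; `joinCommands` is a hypothetical helper, not a YARN API, and `&&` is used instead of the semicolons mentioned in the report so that a failed command stops the chain:

```java
import java.util.Arrays;
import java.util.List;

public class CommandJoiner {
    // Join the ContainerLaunchContext command list so that bash -c runs
    // each command in sequence, instead of treating later commands as
    // extra arguments to the first one.
    static String joinCommands(List<String> commands) {
        return String.join(" && ", commands);
    }

    public static void main(String[] args) {
        List<String> cmds = Arrays.asList("echo yolo", "date");
        // Without a separator, bash -c would see: echo yolo date
        System.out.println("/bin/bash -c \"" + joinCommands(cmds) + "\"");
    }
}
```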
[jira] [Updated] (YARN-649) Make container logs available over HTTP in plain text
[ https://issues.apache.org/jira/browse/YARN-649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-649: Attachment: YARN-649-7.patch > Make container logs available over HTTP in plain text > - > > Key: YARN-649 > URL: https://issues.apache.org/jira/browse/YARN-649 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 2.0.4-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: YARN-649-2.patch, YARN-649-3.patch, YARN-649-4.patch, > YARN-649-5.patch, YARN-649-6.patch, YARN-649-7.patch, YARN-649.patch, > YARN-752-1.patch > > > It would be good to make container logs available over the REST API for > MAPREDUCE-4362 and so that they can be accessed programmatically in general. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-942) In Fair Scheduler documentation, inconsistency on which properties have prefix
[ https://issues.apache.org/jira/browse/YARN-942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-942: Assignee: Akira AJISAKA > In Fair Scheduler documentation, inconsistency on which properties have prefix > -- > > Key: YARN-942 > URL: https://issues.apache.org/jira/browse/YARN-942 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Akira AJISAKA > Labels: documentation, newbie > Attachments: YARN-942.patch > > > locality.threshold.node and locality.threshold.rack should have the > yarn.scheduler.fair prefix like the items before them > http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1093) Corrections to Fair Scheduler documentation
[ https://issues.apache.org/jira/browse/YARN-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750489#comment-13750489 ] Sandy Ryza commented on YARN-1093: -- I just committed this to trunk, branch-2, and branch-2.1-beta > Corrections to Fair Scheduler documentation > --- > > Key: YARN-1093 > URL: https://issues.apache.org/jira/browse/YARN-1093 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Affects Versions: 2.0.4-alpha >Reporter: Wing Yew Poon > Fix For: 2.1.1-beta > > Attachments: YARN-1093.patch > > > The fair scheduler is still evolving, but the current documentation contains > some inaccuracies. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-942) In Fair Scheduler documentation, inconsistency on which properties have prefix
[ https://issues.apache.org/jira/browse/YARN-942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-942: Labels: documentation newbie (was: docuentation newbie) > In Fair Scheduler documentation, inconsistency on which properties have prefix > -- > > Key: YARN-942 > URL: https://issues.apache.org/jira/browse/YARN-942 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza > Labels: documentation, newbie > Attachments: YARN-942.patch > > > locality.threshold.node and locality.threshold.rack should have the > yarn.scheduler.fair prefix like the items before them > http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1093) Corrections to Fair Scheduler documentation
[ https://issues.apache.org/jira/browse/YARN-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1093: - Hadoop Flags: Reviewed > Corrections to Fair Scheduler documentation > --- > > Key: YARN-1093 > URL: https://issues.apache.org/jira/browse/YARN-1093 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Affects Versions: 2.0.4-alpha >Reporter: Wing Yew Poon > Fix For: 2.1.1-beta > > Attachments: YARN-1093.patch > > > The fair scheduler is still evolving, but the current documentation contains > some inaccuracies. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1093) Corrections to fair scheduler documentation
[ https://issues.apache.org/jira/browse/YARN-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1093: - Summary: Corrections to fair scheduler documentation (was: corrections to fair scheduler documentation) > Corrections to fair scheduler documentation > --- > > Key: YARN-1093 > URL: https://issues.apache.org/jira/browse/YARN-1093 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Affects Versions: 2.0.4-alpha >Reporter: Wing Yew Poon > Attachments: YARN-1093.patch > > > The fair scheduler is still evolving, but the current documentation contains > some inaccuracies. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1093) Corrections to Fair Scheduler documentation
[ https://issues.apache.org/jira/browse/YARN-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1093: - Summary: Corrections to Fair Scheduler documentation (was: Corrections to fair scheduler documentation) > Corrections to Fair Scheduler documentation > --- > > Key: YARN-1093 > URL: https://issues.apache.org/jira/browse/YARN-1093 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Affects Versions: 2.0.4-alpha >Reporter: Wing Yew Poon > Attachments: YARN-1093.patch > > > The fair scheduler is still evolving, but the current documentation contains > some inaccuracies. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1024) Define a virtual core unambigiously
[ https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750286#comment-13750286 ] Sandy Ryza commented on YARN-1024: -- bq. It seems to me that the only time you'd want a YCU value that's not -1 is when you're running a thread that uses less than 100% of the CPU. Is that a correct statement? That's correct. This is common for data-intensive tasks that can be more I/O-bound than CPU-bound. bq. As an end user, how do I know what YCU value is reasonable for my job? I think selecting the right value is an inherently difficult task. I think we would expect different users with different amounts of technical proficiency to do it in different ways. Something like: * Simple: Use the default value on the cluster. * Intermediate: Notice your tasks are running too slow and increase YCUs. Or notice your tasks aren't getting scheduled enough and decrease them. * Advanced: Do the thing with top. > Define a virtual core unambigiously > --- > > Key: YARN-1024 > URL: https://issues.apache.org/jira/browse/YARN-1024 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Arun C Murthy >Assignee: Arun C Murthy > Attachments: CPUasaYARNresource.pdf > > > We need to clearly define the meaning of a virtual core unambiguously so that > it's easy to migrate applications between clusters. > For e.g. here is Amazon EC2 definition of ECU: > http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it > Essentially we need to clearly define a YARN Virtual Core (YVC). > Equivalently, we can use ECU itself: *One EC2 Compute Unit provides the > equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.* -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-905) Add state filters to nodes CLI
[ https://issues.apache.org/jira/browse/YARN-905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13749142#comment-13749142 ] Sandy Ryza commented on YARN-905: - I just committed this before Vinod's comment. I don't think the current version is harmful in such a way that it needs to be reverted. I would prefer to make these extra changes in a separate JIRA, but would also be happy to review/commit an addendum here. > Add state filters to nodes CLI > -- > > Key: YARN-905 > URL: https://issues.apache.org/jira/browse/YARN-905 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.0.4-alpha >Reporter: Sandy Ryza >Assignee: Wei Yan > Attachments: Yarn-905.patch, YARN-905.patch, YARN-905.patch > > > It would be helpful for the nodes CLI to have a node-states option that > allows it to return nodes that are not just in the RUNNING state. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-942) In Fair Scheduler documentation, inconsistency on which properties have prefix
[ https://issues.apache.org/jira/browse/YARN-942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748988#comment-13748988 ] Sandy Ryza commented on YARN-942: - Thanks [~ajisakaa]. +1 pending jenkins. > In Fair Scheduler documentation, inconsistency on which properties have prefix > -- > > Key: YARN-942 > URL: https://issues.apache.org/jira/browse/YARN-942 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza > Labels: docuentation, newbie > Attachments: YARN-942.patch > > > locality.threshold.node and locality.threshold.rack should have the > yarn.scheduler.fair prefix like the items before them > http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1093) corrections to fair scheduler documentation
[ https://issues.apache.org/jira/browse/YARN-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748879#comment-13748879 ] Sandy Ryza commented on YARN-1093: -- Thanks Wing Yew! +1 pending Jenkins. > corrections to fair scheduler documentation > --- > > Key: YARN-1093 > URL: https://issues.apache.org/jira/browse/YARN-1093 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Affects Versions: 2.0.4-alpha >Reporter: Wing Yew Poon > Attachments: YARN-1093.patch > > > The fair scheduler is still evolving, but the current documentation contains > some inaccuracies. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1093) corrections to fair scheduler documentation
[ https://issues.apache.org/jira/browse/YARN-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1093: - Fix Version/s: (was: 2.1.0-beta) > corrections to fair scheduler documentation > --- > > Key: YARN-1093 > URL: https://issues.apache.org/jira/browse/YARN-1093 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Affects Versions: 2.0.4-alpha >Reporter: Wing Yew Poon > Attachments: YARN-1093.patch > > > The fair scheduler is still evolving, but the current documentation contains > some inaccuracies. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1024) Define a virtual core unambigiously
[ https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1024: - Attachment: CPUasaYARNresource.pdf > Define a virtual core unambigiously > --- > > Key: YARN-1024 > URL: https://issues.apache.org/jira/browse/YARN-1024 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Arun C Murthy >Assignee: Arun C Murthy > Attachments: CPUasaYARNresource.pdf > > > We need to clearly define the meaning of a virtual core unambiguously so that > it's easy to migrate applications between clusters. > For e.g. here is Amazon EC2 definition of ECU: > http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it > Essentially we need to clearly define a YARN Virtual Core (YVC). > Equivalently, we can use ECU itself: *One EC2 Compute Unit provides the > equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.* -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1024) Define a virtual core unambigiously
[ https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748062#comment-13748062 ] Sandy Ryza commented on YARN-1024: -- I wrote up a more detailed proposal and attached a PDF of it. > Define a virtual core unambigiously > --- > > Key: YARN-1024 > URL: https://issues.apache.org/jira/browse/YARN-1024 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Arun C Murthy >Assignee: Arun C Murthy > Attachments: CPUasaYARNresource.pdf > > > We need to clearly define the meaning of a virtual core unambiguously so that > it's easy to migrate applications between clusters. > For e.g. here is Amazon EC2 definition of ECU: > http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it > Essentially we need to clearly define a YARN Virtual Core (YVC). > Equivalently, we can use ECU itself: *One EC2 Compute Unit provides the > equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.* -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1089) Add YARN compute units alongside virtual cores
[ https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747881#comment-13747881 ] Sandy Ryza commented on YARN-1089: -- Yeah, I'll write up a document and post it on YARN-1024. I'm hoping to keep the broader discussion there so we can use this (and perhaps additional JIRAs) for the actual implementation. > Add YARN compute units alongside virtual cores > -- > > Key: YARN-1089 > URL: https://issues.apache.org/jira/browse/YARN-1089 > Project: Hadoop YARN > Issue Type: Improvement > Components: api >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Sandy Ryza > > Based on discussion in YARN-1024, we will add YARN compute units as a > resource for requesting and scheduling CPU processing power. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1089) Add YARN compute units alongside virtual cores
[ https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1089: - Description: Based on discussion in YARN-1024, we will add YARN compute units as a resource for requesting and scheduling CPU processing power. > Add YARN compute units alongside virtual cores > -- > > Key: YARN-1089 > URL: https://issues.apache.org/jira/browse/YARN-1089 > Project: Hadoop YARN > Issue Type: Improvement > Components: api >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Sandy Ryza > > Based on discussion in YARN-1024, we will add YARN compute units as a > resource for requesting and scheduling CPU processing power. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1024) Define a virtual core unambigiously
[ https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747701#comment-13747701 ] Sandy Ryza commented on YARN-1024: -- Filed YARN-1089 for adding YCUs. > Define a virtual core unambigiously > --- > > Key: YARN-1024 > URL: https://issues.apache.org/jira/browse/YARN-1024 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Arun C Murthy >Assignee: Arun C Murthy > > We need to clearly define the meaning of a virtual core unambiguously so that > it's easy to migrate applications between clusters. > For e.g. here is Amazon EC2 definition of ECU: > http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it > Essentially we need to clearly define a YARN Virtual Core (YVC). > Equivalently, we can use ECU itself: *One EC2 Compute Unit provides the > equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.* -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-1089) Add YARN compute units alongside virtual cores
Sandy Ryza created YARN-1089: Summary: Add YARN compute units alongside virtual cores Key: YARN-1089 URL: https://issues.apache.org/jira/browse/YARN-1089 Project: Hadoop YARN Issue Type: Improvement Components: api Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Sandy Ryza -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (YARN-972) Allow requests and scheduling for fractional virtual cores
[ https://issues.apache.org/jira/browse/YARN-972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza resolved YARN-972. - Resolution: Won't Fix > Allow requests and scheduling for fractional virtual cores > -- > > Key: YARN-972 > URL: https://issues.apache.org/jira/browse/YARN-972 > Project: Hadoop YARN > Issue Type: Improvement > Components: api, scheduler >Affects Versions: 2.0.5-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > > As this idea sparked a fair amount of discussion on YARN-2, I'd like to go > deeper into the reasoning. > Currently the virtual core abstraction hides two orthogonal goals. The first > is that a cluster might have heterogeneous hardware and that the processing > power of different makes of cores can vary wildly. The second is that > different (combinations of) workloads can require different levels of > granularity. E.g. one admin might want every task on their cluster to use at > least a core, while another might want applications to be able to request > quarters of cores. The former would configure a single vcore per core. The > latter would configure four vcores per core. > I don't think that the abstraction is a good way of handling the second goal. > Having virtual cores refer to different magnitudes of processing power on > different clusters will make the difficult problem of deciding how many cores > to request for a job even more confusing. > Can we not handle this with dynamic oversubscription? > Dynamic oversubscription, i.e. adjusting the number of cores offered by a > machine based on measured CPU-consumption, should work as a complement to > fine-granularity scheduling. Dynamic oversubscription is never going to be > perfect, as the amount of CPU a process consumes can vary widely over its > lifetime.
A task that first loads a bunch of data over the network and then > performs complex computations on it will suffer if additional CPU-heavy tasks > are scheduled on the same node because its initial CPU-utilization was low. > To guard against this, we will need to be conservative with how we > dynamically oversubscribe. If a user wants to explicitly hint to the > scheduler that their task will not use much CPU, the scheduler should be able > to take this into account. > On YARN-2, there are concerns that including floating point arithmetic in the > scheduler will slow it down. I question this assumption, and it is perhaps > worth debating, but I think we can sidestep the issue by multiplying > CPU-quantities inside the scheduler by a decently sized number like 1000 and > keep doing the computations on integers. > The relevant APIs are marked as evolving, so there's no need for the change > to delay 2.1.0-beta. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
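The integer workaround described in the description above (multiply CPU quantities by a factor like 1000 at the API boundary so the scheduler never does floating-point arithmetic) can be sketched as follows; the class and method names are illustrative, not anything YARN ships:

```java
public class Millicores {
    static final int SCALE = 1000; // 1 core == 1000 scheduling units

    // Convert a fractional core request to an integer quantity once,
    // at the API boundary; all scheduler comparisons stay integral.
    static int toScaledUnits(double cores) {
        return (int) Math.round(cores * SCALE);
    }

    // Integer-only admission check, as the description proposes.
    static boolean fits(int requestedUnits, int availableUnits) {
        return requestedUnits <= availableUnits;
    }
}
```

With this scheme a quarter-core request becomes 250 units, and the scheduler's hot path compares plain ints.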
[jira] [Commented] (YARN-972) Allow requests and scheduling for fractional virtual cores
[ https://issues.apache.org/jira/browse/YARN-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747142#comment-13747142 ] Sandy Ryza commented on YARN-972: - Based on the approach we've agreed upon in YARN-1024 that allows separate values to be set for processing power and parallelism, closing this as won't fix. > Allow requests and scheduling for fractional virtual cores > -- > > Key: YARN-972 > URL: https://issues.apache.org/jira/browse/YARN-972 > Project: Hadoop YARN > Issue Type: Improvement > Components: api, scheduler >Affects Versions: 2.0.5-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > > As this idea sparked a fair amount of discussion on YARN-2, I'd like to go > deeper into the reasoning. > Currently the virtual core abstraction hides two orthogonal goals. The first > is that a cluster might have heterogeneous hardware and that the processing > power of different makes of cores can vary wildly. The second is that > different (combinations of) workloads can require different levels of > granularity. E.g. one admin might want every task on their cluster to use at > least a core, while another might want applications to be able to request > quarters of cores. The former would configure a single vcore per core. The > latter would configure four vcores per core. > I don't think that the abstraction is a good way of handling the second goal. > Having virtual cores refer to different magnitudes of processing power on > different clusters will make the difficult problem of deciding how many cores > to request for a job even more confusing. > Can we not handle this with dynamic oversubscription? > Dynamic oversubscription, i.e. adjusting the number of cores offered by a > machine based on measured CPU-consumption, should work as a complement to > fine-granularity scheduling. Dynamic oversubscription is never going to be > perfect, as the amount of CPU a process consumes can vary widely over its > lifetime.
A task that first loads a bunch of data over the network and then > performs complex computations on it will suffer if additional CPU-heavy tasks > are scheduled on the same node because its initial CPU-utilization was low. > To guard against this, we will need to be conservative with how we > dynamically oversubscribe. If a user wants to explicitly hint to the > scheduler that their task will not use much CPU, the scheduler should be able > to take this into account. > On YARN-2, there are concerns that including floating point arithmetic in the > scheduler will slow it down. I question this assumption, and it is perhaps > worth debating, but I think we can sidestep the issue by multiplying > CPU-quantities inside the scheduler by a decently sized number like 1000 and > keep doing the computations on integers. > The relevant APIs are marked as evolving, so there's no need for the change > to delay 2.1.0-beta. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-649) Make container logs available over HTTP in plain text
[ https://issues.apache.org/jira/browse/YARN-649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-649: Attachment: YARN-649-6.patch > Make container logs available over HTTP in plain text > - > > Key: YARN-649 > URL: https://issues.apache.org/jira/browse/YARN-649 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 2.0.4-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: YARN-649-2.patch, YARN-649-3.patch, YARN-649-4.patch, > YARN-649-5.patch, YARN-649-6.patch, YARN-649.patch, YARN-752-1.patch > > > It would be good to make container logs available over the REST API for > MAPREDUCE-4362 and so that they can be accessed programmatically in general. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-649) Make container logs available over HTTP in plain text
[ https://issues.apache.org/jira/browse/YARN-649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13746880#comment-13746880 ] Sandy Ryza commented on YARN-649: - Uploading a new patch that * Puts IOUtils.skipFully(logByteStream, start) back in. My mistake. * Changes the annotation to Unstable and includes documentation on how long logs will be available. * Removes the mortbay log and throws a YarnException on the URISyntaxException I manually verified that the buffering works by creating a log file larger than the NodeManager memory, retrieving it with the API, and observing that the NodeManager did not fall over. > Make container logs available over HTTP in plain text > - > > Key: YARN-649 > URL: https://issues.apache.org/jira/browse/YARN-649 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 2.0.4-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: YARN-649-2.patch, YARN-649-3.patch, YARN-649-4.patch, > YARN-649-5.patch, YARN-649.patch, YARN-752-1.patch > > > It would be good to make container logs available over the REST API for > MAPREDUCE-4362 and so that they can be accessed programmatically in general. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
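The `IOUtils.skipFully(logByteStream, start)` call restored in this patch matters because `InputStream.skip` may skip fewer bytes than requested, so serving a byte range of a log needs a loop that consumes the full offset. A standalone sketch of the same idea using only the JDK (Hadoop's real helper lives in `org.apache.hadoop.io.IOUtils`; this `LogSkip` class is hypothetical):

```java
import java.io.IOException;
import java.io.InputStream;

public class LogSkip {
    // Skip exactly n bytes or fail: a single InputStream.skip call is
    // permitted to skip fewer bytes than asked, so keep looping until
    // the whole offset has been consumed.
    static void skipFully(InputStream in, long n) throws IOException {
        while (n > 0) {
            long skipped = in.skip(n);
            if (skipped <= 0) {
                throw new IOException("Premature EOF with " + n + " bytes left to skip");
            }
            n -= skipped;
        }
    }
}
```

Reading the log from `start` then becomes: open the stream, `skipFully(in, start)`, and copy the remaining bytes to the response.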