[ https://issues.apache.org/jira/browse/YARN-2791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14206993#comment-14206993 ]
Karthik Kambatla commented on YARN-2791:
----------------------------------------

Thanks [~sdaingade] for sharing the design doc. Well articulated.

The designs on YARN-2139 and YARN-2791 are very similar, except that the disk resources are called vdisks in YARN-2139 and spindles in YARN-2791. In addition to the items specified here, YARN-2139 talks about isolation as well. Other than that, do you see any major items that YARN-2791 covers and YARN-2139 does not?

The WebUI is good and very desirable; we should definitely include it.

Also, I suggest we make this (as is, or split into multiple JIRAs) a sub-task of YARN-2139. Discussing the high-level details on one JIRA helps with aligning on one final design doc based on everyone's suggestions.

> Add Disk as a resource for scheduling
> -------------------------------------
>
>                 Key: YARN-2791
>                 URL: https://issues.apache.org/jira/browse/YARN-2791
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: scheduler
>    Affects Versions: 2.5.1
>            Reporter: Swapnil Daingade
>            Assignee: Yuliya Feldman
>         Attachments: DiskDriveAsResourceInYARN.pdf
>
>
> Currently, the number of disks present on a node is not considered a factor
> while scheduling containers on that node. Having a large amount of memory on a
> node can lead to a high number of containers being launched on that node, all
> of which compete for I/O bandwidth. This multiplexing of I/O across
> containers can lead to slower overall progress and sub-optimal resource
> utilization, as containers starved for I/O bandwidth hold on to other
> resources like CPU and memory. This problem can be solved by considering disk
> as a resource and including it in deciding how many containers can be
> concurrently run on a node.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)