[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2016-01-30 Thread He Tianyi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124824#comment-15124824
 ] 

He Tianyi commented on YARN-2139:
-

We recently introduced SSDs in our cluster for MapReduce shuffle. 
One issue came up: if the map output gets too large, it cannot be placed on the 
SSD. We had to implement a custom strategy (called SSDFirst) that makes a best 
effort to use the SSD but falls back to HDD when available SSD space gets tight. 
This works in most cases, but it is only a local optimum. To achieve a global 
optimum, the scheduler must be aware of and manage these resources.
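
A minimal sketch of what such a best-effort strategy might look like, written in Python rather than Hadoop's Java for brevity. The function name, the `reserve_bytes` headroom and the injectable `free_bytes` probe are all hypothetical; the actual SSDFirst implementation is not shown in this thread.

```python
import shutil
from typing import Callable, Sequence


def _disk_free(path: str) -> int:
    """Free bytes on the filesystem holding `path`."""
    return shutil.disk_usage(path).free


def choose_output_dir(
    ssd_dirs: Sequence[str],
    hdd_dirs: Sequence[str],
    needed_bytes: int,
    reserve_bytes: int = 0,
    free_bytes: Callable[[str], int] = _disk_free,
) -> str:
    """Prefer an SSD dir; fall back to HDD when SSD space is tight.

    All names here are illustrative assumptions, not Hadoop APIs.
    """
    # First pass: any SSD dir that still has room after keeping a reserve.
    for d in ssd_dirs:
        if free_bytes(d) - reserve_bytes >= needed_bytes:
            return d
    # Fallback: the HDD dir with the most free space that can hold the output.
    candidates = [d for d in hdd_dirs if free_bytes(d) >= needed_bytes]
    if not candidates:
        raise OSError("no local dir has enough free space")
    return max(candidates, key=free_bytes)
```

Each node decides using only its own free space, which is why a strategy of this kind stays a local optimum.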

> [Umbrella] Support for Disk as a Resource in YARN 
> --
>
> Key: YARN-2139
> URL: https://issues.apache.org/jira/browse/YARN-2139
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Wei Yan
> Attachments: Disk_IO_Isolation_Scheduling_3.pdf, 
> Disk_IO_Scheduling_Design_1.pdf, Disk_IO_Scheduling_Design_2.pdf, 
> YARN-2139-prototype-2.patch, YARN-2139-prototype.patch
>
>
> YARN should consider disk as another resource for (1) scheduling tasks on 
> nodes, (2) isolation at runtime, (3) spindle locality. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2015-06-04 Thread kassiano josé matteussi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573468#comment-14573468
 ] 

kassiano josé matteussi commented on YARN-2139:
---

Dear all, 

I have been studying resource management for Hadoop applications running inside 
Linux containers, and I have had trouble restricting disk I/O with cgroups 
(bps_write, bps_read). 

Does anybody know if it is possible to do so?

I have heard that limiting I/O with cgroups is restricted to synchronous 
writes (SYNC), and that this is why it wouldn't work well with Hadoop + HDFS. Is 
this still true in more recent kernel implementations?

Best Regards,
Kassiano
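
For reference, the cgroup v1 blkio controller takes throttle rules of the form `major:minor bytes_per_second`, written to `blkio.throttle.write_bps_device` or `blkio.throttle.read_bps_device`; these are presumably the bps_write / bps_read limits mentioned above. A minimal Python sketch follows. The helper names and the cgroup directory passed in are hypothetical, and writing to a real cgroup directory requires root.

```python
import os


def throttle_rule(major: int, minor: int, bytes_per_sec: int) -> str:
    """Format one rule as 'major:minor bytes_per_second'."""
    if bytes_per_sec < 0:
        raise ValueError("bytes_per_sec must be non-negative")
    return f"{major}:{minor} {bytes_per_sec}"


def apply_bps_limit(cgroup_dir: str, direction: str,
                    major: int, minor: int, bytes_per_sec: int) -> str:
    """Write a read or write bandwidth limit into a blkio cgroup directory.

    `cgroup_dir` is whatever directory the container's blkio cgroup lives in
    (an assumption of this sketch). Returns the path of the file written.
    """
    if direction not in ("read", "write"):
        raise ValueError("direction must be 'read' or 'write'")
    path = os.path.join(cgroup_dir, f"blkio.throttle.{direction}_bps_device")
    with open(path, "w") as f:
        f.write(throttle_rule(major, minor, bytes_per_sec))
    return path
```

This only shows the rule format; it does not settle whether buffered writes are throttled on a given kernel.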



[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2015-05-06 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14530887#comment-14530887
 ] 

Vinod Kumar Vavilapalli commented on YARN-2139:
---

YARN-2619 already covered some of the disk isolation work in, well, isolation. 
It doesn't introduce any new concepts like vdisks; all it does is give every 
container an 'equal' share of local disk resources.



[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2015-01-14 Thread Swapnil Daingade (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14277969#comment-14277969
 ] 

Swapnil Daingade commented on YARN-2139:


I had a look at the latest design doc and was wondering if it would be possible 
to make the isolation part separate from, and optional to, the part that avoids 
over-allocation. Enforcing isolation using cgroups may not always work, 
especially in cases where HDFS is not the default DFS.





[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2014-12-03 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14233341#comment-14233341
 ] 

Bikas Saha commented on YARN-2139:
--

So to be clear, currently vdisks is counting the number of physical drives 
present on the box.

Something to keep in mind is whether this also entails a change in the NM 
policy of providing a directory on every local dir (which typically maps to 
every disk) to every task. Tasks are free to choose one or more of those 
dirs (disks) to write to. This puts the spinning disk head under contention and 
affects the performance of all writers on that disk, because seeks are 
expensive. The rule of thumb tends to be to allocate as many tasks to a machine 
as it has disks (maybe 2x) so as to keep this seek cost low. Should we 
consider evaluating a change in this policy that gives 1 local dir 
to a container with 1 vdisk? This way, a machine with 6 disks (and 6 vdisks) 
would have 6 tasks running, each with its own dedicated disk. Offhand, it's 
hard to say how this would compare with allocating all 6 disks to all 6 tasks 
and letting cgroups enforce sharing. If multiple tasks end up choosing the same 
disk for their writes, then they may not end up getting the allocation that 
they thought they would get.



[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2014-12-03 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14233605#comment-14233605
 ] 

Karthik Kambatla commented on YARN-2139:


bq. currently vdisks is counting the number of physical drives present on the 
box.
We see vdisks as a multiple of the number of physical disks on the box. Again, 
it is just one of the ways, and we can add more ways to share disk resources in 
the future. 

bq. Should we consider evaluating a change in this policy that gives a 
container 1 local dir to a container with 1 vdisk. This way for a machine with 
6 disks (and 6 vdisks) would have 6 tasks running, each with their own 
dedicated disk. 
Good point. We were thinking of giving the AM the option to choose the amount 
of disk IO parallelism at the time of launching the container, as part of the 
spindle locality work. I see AMs wanting to either (1) pick a single local 
directory for guaranteed performance or (2) stripe accesses across multiple 
disks for potentially higher throughput based on other work on the node.

Initially, we could provide a global config for all containers that says 
whether vdisks should span the fewest or the most disks. 
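
A rough sketch of what a "fewest or most disks" choice could mean when picking local dirs for a container, in Python for brevity. The function name, the `load` bookkeeping (vdisks already assigned per local dir) and the policy strings are assumptions for illustration; none of this comes from the design doc.

```python
import math
from typing import Dict, List


def pick_local_dirs(load: Dict[str, int], vdisks: int,
                    vdisks_per_disk: int, policy: str) -> List[str]:
    """Choose local dirs (one per physical disk) for a container.

    load            -- vdisks already assigned on each local dir (hypothetical)
    vdisks          -- vdisks requested by the container
    vdisks_per_disk -- how many vdisks one physical disk represents
    policy          -- "fewest": pack onto as few disks as possible
                       "most":   stripe across as many disks as possible
    """
    if vdisks < 1 or vdisks_per_disk < 1:
        raise ValueError("vdisks and vdisks_per_disk must be positive")
    if policy == "fewest":
        count = math.ceil(vdisks / vdisks_per_disk)
    elif policy == "most":
        count = vdisks
    else:
        raise ValueError(f"unknown policy: {policy}")
    count = min(count, len(load))
    # Prefer the least-loaded disks; break ties by name for determinism.
    by_load = sorted(load, key=lambda d: (load[d], d))
    return by_load[:count]
```

With "fewest", a container asking for 1 vdisk gets a single directory for predictable performance; with "most", the same node-wide setting stripes its accesses across disks.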



[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2014-12-02 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14232133#comment-14232133
 ] 

Bikas Saha commented on YARN-2139:
--

Thanks for the update.
It's not clear to me how we are going to cleanly decouple 1) and 2) from 3). 
At first thought, scheduling is what prevents over-allocation and the NM 
enforces the scheduling decision.
Could you please shed some light on that?




[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2014-12-02 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14232293#comment-14232293
 ] 

Karthik Kambatla commented on YARN-2139:


The NMs will specify the amount of disk resources on a node, and the RM 
automatically allocates a fixed amount to each container. For example, each NM 
could report a vdisks value equal to the number of disks on the node, and each 
container would get one vdisk. That way, we limit the number of containers 
running on a node to the number of disks on that node. 
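
The accounting described here can be sketched in a few lines of Python. The class and field names are hypothetical; this only illustrates the arithmetic of one vdisk per container, not the RM's actual bookkeeping.

```python
from dataclasses import dataclass


@dataclass
class NodeDiskCapacity:
    """vdisks reported by an NM and vdisks handed out so far (illustrative)."""
    total_vdisks: int
    used_vdisks: int = 0

    def try_allocate(self, vdisks: int = 1) -> bool:
        """Grant the request only if the node still has enough vdisks."""
        if self.used_vdisks + vdisks > self.total_vdisks:
            return False
        self.used_vdisks += vdisks
        return True


# A node with 6 disks reports 6 vdisks; each container asks for 1.
node = NodeDiskCapacity(total_vdisks=6)
granted = [node.try_allocate() for _ in range(7)]
```

In this sketch the first six requests are granted and the seventh is refused, which is the cap on containers per node described above.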



[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2014-12-02 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14232448#comment-14232448
 ] 

Bikas Saha commented on YARN-2139:
--

Does the concept of a vdisk represent a spinning disk, or is it going to be 
some pluggable API?



[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2014-12-02 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14232642#comment-14232642
 ] 

Karthik Kambatla commented on YARN-2139:


disk-vdisks is one of the ways to represent disk resources; it captures disk 
shares for weighted sharing of spinning disks/SSDs. In the future, we could 
add other dimensions like disk-bandwidth, disk-iops, disk-capacity, etc. To 
specify the dimension(s) to consider for isolation and scheduling, one could 
set yarn.nodemanager.resource.disk-dimensions and 
yarn.scheduler.disk-dimensions. The design doc, 
Disk_IO_Isolation_Scheduling_3.pdf, has more details. 



[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2014-11-24 Thread Swapnil Daingade (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14223772#comment-14223772
 ] 

Swapnil Daingade commented on YARN-2139:


+1 for having an abstract policy to wrap spindles / disk affinity / iops / 
bandwidth, etc.



[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2014-11-22 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14222111#comment-14222111
 ] 

Steve Loughran commented on YARN-2139:
--

I'd assumed the vspindles you asked for were equivalent to SATA HDD spindles, 
so on an SSD, mapping multiple vspindles to one physical device would make 
sense. And if even faster persistent storage or storage interconnect comes 
out, you'd increase the number.





[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2014-11-21 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14221561#comment-14221561
 ] 

Karthik Kambatla commented on YARN-2139:


[~leftnoteasy] - I completely agree with both Arun and you on the 
spindle-locality-affinity front. The design doc hints at it, but doesn't cover 
it in as much detail as it should. I am all for accomplishing that here too; 
I can work on fleshing out the locality-affinity pieces as we start getting the 
remaining parts in. 

I am considering starting the development on a feature branch so we have a 
chance to change things before merging into trunk and branch-2. Are people 
okay with that? 



[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2014-11-21 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14221591#comment-14221591
 ] 

Bikas Saha commented on YARN-2139:
--

Given that this design and possible implementation might go through unstable 
rounds and are currently not abstracted enough in the core code, doing this on 
a branch seems prudent. 

Given that SSDs are becoming common, thinking of storage as only spinning disks 
may be limiting. Multiple writers may affect each other more negatively on 
spinning disks than on SSDs. It may be useful to see if the consideration of 
storage could be abstracted into a plugin so that storage could have a 
different resource allocation policy by storage type (e.g. allocate/share by 
spindle for spinning disk storage vs allocate/share by IOPS on SSD storage vs 
allocate/share by network bandwidth for non-DAS storage). If we can abstract 
the policy into a plugin on trunk itself, then perhaps we would not need a 
branch. 

Secondly, it will probably take a long time to agree on what a common 
policy should be, and the consensus decision will probably not be a good fit for 
a large percentage of real clusters because of hardware variety. So making this 
a plugin would enable quicker development, trial and usage of disk-based 
allocation compared to arriving at a grand unified allocation model for storage.
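
To make the plugin idea concrete, here is one possible shape for it, sketched in Python rather than Java. Every class name, dictionary key and storage-type label below is hypothetical; the point is only that each storage type counts capacity in its own unit behind a common interface.

```python
from abc import ABC, abstractmethod
from typing import Dict


class StoragePolicy(ABC):
    """Hypothetical plugin interface: one allocation policy per storage type."""

    @abstractmethod
    def capacity(self, node: Dict[str, int]) -> int:
        """Total allocatable units on this node under this policy."""


class SpindlePolicy(StoragePolicy):
    """Spinning disks: allocate/share by spindle."""
    def capacity(self, node: Dict[str, int]) -> int:
        return node["spindles"]


class IopsPolicy(StoragePolicy):
    """SSD: allocate/share by IOPS."""
    def capacity(self, node: Dict[str, int]) -> int:
        return node["iops"]


class NetworkBandwidthPolicy(StoragePolicy):
    """Non-DAS storage: allocate/share by network bandwidth."""
    def capacity(self, node: Dict[str, int]) -> int:
        return node["network_mbps"]


# Registry a scheduler could consult; a real plugin would be loaded by config.
POLICIES: Dict[str, StoragePolicy] = {
    "hdd": SpindlePolicy(),
    "ssd": IopsPolicy(),
    "remote": NetworkBandwidthPolicy(),
}


def can_allocate(storage_type: str, node: Dict[str, int],
                 allocated: int, request: int) -> bool:
    """True if `request` more units fit on the node under its policy."""
    policy = POLICIES[storage_type]
    return allocated + request <= policy.capacity(node)
```

The scheduler code path stays the same for every storage type; only the registered policy changes, which is what would let clusters with different hardware try different models.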



[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2014-11-21 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14221669#comment-14221669
 ] 

Wangda Tan commented on YARN-2139:
--

Thanks [~bikassaha] and [~kasha],

+1 for working on a branch. There might be a large number of changes across all 
the major modules, and frequent rebasing might be an issue if this is based on 
trunk.
I also totally agree about having an abstract policy to wrap disk affinity / 
iops / bandwidth, etc.



[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2014-11-21 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14221783#comment-14221783
 ] 

Karthik Kambatla commented on YARN-2139:


Valid points, Bikas. [~ywskycn] and I will spend some time and propose a design 
that would allow plugging in these multiple dimensions.



[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2014-11-20 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14220047#comment-14220047
 ] 

Arun C Murthy commented on YARN-2139:
-

Sorry, been busy with 2.6.0 - just coming up for air.

What are we modeling with vdisk again? What is the metric? Is it directly the 
blkio parameter? If so, that is my biggest concern.



[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2014-11-20 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14220077#comment-14220077
 ] 

Karthik Kambatla commented on YARN-2139:


It is very similar to vcores. vdisks is the number of virtual disks: no metric, 
just a number. 

If we want to allow up to 'n' tasks to share a disk, {{vdisks = n * num-disks}}. 
For cases with n > 1, spindle locality will help with ensuring all the 'n' 
vdisks correspond to the same spindle(s). 
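
A small Python sketch of the arithmetic. The formula is the one given above; the mapping from a vdisk index to a physical disk by integer division is purely an assumption of this example, chosen so that each run of 'n' consecutive vdisks lands on the same spindle.

```python
def total_vdisks(n: int, num_disks: int) -> int:
    """vdisks = n * num-disks, where up to n tasks may share one disk."""
    return n * num_disks


def spindle_for_vdisk(vdisk_index: int, n: int) -> int:
    """Hypothetical mapping: vdisks 0..n-1 -> disk 0, n..2n-1 -> disk 1, etc."""
    return vdisk_index // n


# 6 physical disks, up to 2 tasks per disk.
vdisks = total_vdisks(2, 6)
mapping = [spindle_for_vdisk(i, 2) for i in range(vdisks)]
```

With 6 disks and n = 2 the node advertises 12 vdisks, and under this mapping each physical disk backs exactly two of them.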



[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2014-11-11 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206994#comment-14206994
 ] 

Karthik Kambatla commented on YARN-2139:


Thanks for the prototype, Wei. In light of the updates on YARN-2791 and 
YARN-2817, I propose we incorporate suggestions from [~sdaingade] and 
[~acmurthy] before posting patches for sub-tasks. 

Updated JIRA title, description, and marked it unassigned as this is an 
umbrella JIRA. 

