[ 
https://issues.apache.org/jira/browse/YARN-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13663125#comment-13663125
 ] 

Carlo Curino commented on YARN-624:
-----------------------------------

I have two levels of comments: the first is to clarify the intent of my earlier 
messages, and the second is to match Robert's use-case description with one for 
ML frameworks.

Intent:
[~vinodkv], I completely agree with you that we should be very deliberate in 
choosing which use cases to support, and make sure we only add features that 
target concrete and, I would argue, imminent use cases. 
Reflecting on a conversation I had with Alejandro, I was trying to steer this 
conversation toward the following form:
1) push for a broad discussion of the use cases for gang scheduling that we 
know of, so that we understand the entire complexity of the problem (hence the 
comments around more advanced features such as an OR of gangs)
2) let a set of core features emerge from the most concrete short-term needs we 
have (the Storm example is a good place to start for this)
3) try to devise a protocol that supports the core features well, but that is 
amenable to future expansion (inasmuch as we can guess our future needs based 
on 1)
So in terms of concrete actions I am totally aligned with your request for 
"groundedness", but I think it would really benefit us to also spell out some 
of the future requirements, so that we have a chance to design for 
extensibility (similarly to what you guys pushed for in YARN-45, which I 
thought was a really good call).

ML Use Cases:
I asked Markus Weimer (ML/Systems guy in our group) to summarize why he sees 
gang scheduling as key for ML frameworks (which I think are going to flock 
to YARN in the coming months/years). 

Here is his response:
"In many iterative algorithms, it is imperative to load all the data into the 
main memory to minimize execution time. This is true for systems like Giraph, 
Mahout and many others that will over time be on YARN. In order to satisfy 
their memory requirement, they will block holding on to idle slots until YARN 
has delivered all the resources needed. Exposing that pattern via gang 
scheduling seems beneficial.
Furthermore, these systems are often communications intensive. Hence, they’d 
benefit from a gang of containers that are collocated on the network. This is a 
gang-wide property of the resource ask that cannot be captured easily without 
gang scheduling. The alternatives (e.g. getting a container on each rack, then 
expand from there to see which rack “wins”) are quite wasteful in comparison.
Lastly, scheduling with alternatives at the gang level would be helpful. If 
e.g. the training data for a machine learning algorithm needs 128GB of RAM, any 
combination of containers with that amount of RAM would satisfy the need. 
However, preference is given to fewer machines as that reduces the 
communication overhead."
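
To make the last point concrete, here is a minimal sketch of an all-or-nothing 
gang placement that prefers fewer machines. It is not YARN code and does not 
use the AM-RM protocol: the function name place_gang, the node names, and the 
free-memory figures are all hypothetical, and memory is assumed to be freely 
divisible within a node.

```python
from typing import Dict, Optional


def place_gang(free_gb: Dict[str, int], needed_gb: int) -> Optional[Dict[str, int]]:
    """Return an all-or-nothing allocation {node: GB} that covers needed_gb,
    or None if the whole gang cannot be satisfied right now.

    Hypothetical helper for illustration only; not part of any YARN API.
    """
    # Gang semantics: grant nothing rather than hand out a partial gang
    # that would sit idle while waiting for the rest.
    if sum(free_gb.values()) < needed_gb:
        return None

    allocation: Dict[str, int] = {}
    remaining = needed_gb
    # Visiting nodes by decreasing free memory covers the request with the
    # fewest machines (ties are broken by node name for determinism).
    for node, free in sorted(free_gb.items(), key=lambda kv: (-kv[1], kv[0])):
        if remaining <= 0:
            break
        if free <= 0:
            continue
        take = min(free, remaining)
        allocation[node] = take
        remaining -= take
    return allocation


# Made-up cluster state: free memory per node, in GB.
cluster = {"n1": 64, "n2": 96, "n3": 32, "n4": 48}
print(place_gang(cluster, 128))
print(place_gang(cluster, 300))
```

With the made-up cluster above, the 128GB request fits on two nodes, while the 
300GB request is refused outright instead of being partially granted.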

While I appreciate that the urgency of what Markus describes is not comparable 
to that of Storm, I see ML as an important future use case for YARN. And 
gang scheduling seems to be one of those features that will determine whether 
people build on YARN or on something like Mesos.

                
> Support gang scheduling in the AM RM protocol
> ---------------------------------------------
>
>                 Key: YARN-624
>                 URL: https://issues.apache.org/jira/browse/YARN-624
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, scheduler
>    Affects Versions: 2.0.4-alpha
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
>
> Per discussion on YARN-392 and elsewhere, gang scheduling, in which a 
> scheduler runs a set of tasks when they can all be run at the same time, 
> would be a useful feature for YARN schedulers to support.
> Currently, AMs can approximate this by holding on to containers until they 
> get all the ones they need.  However, this lends itself to deadlocks when 
> different AMs are waiting on the same containers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
