[ https://issues.apache.org/jira/browse/YARN-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13663125#comment-13663125 ]
Carlo Curino commented on YARN-624:
-----------------------------------

I have two levels of comments: the first clarifies the intent of my earlier messages, and the second echoes Robert's description of a use case for ML frameworks.

Intent:

[~vinodkv], I completely agree with you that we should be very deliberate in choosing which use cases to support, and that we should only add features that target concrete and, I would argue, imminent use cases. Reflecting on a conversation I had with Alejandro, I was trying to steer this conversation toward the following form:

1) push for a broad discussion of the use cases for gang scheduling we know of, so that we understand the entire complexity of the problem (hence the comments around more advanced features such as OR of gangs)
2) let a set of core features emerge from the most concrete short-term needs we have (the Storm example is a good place to start for this)
3) try to devise a protocol that supports the core features well, but that is amenable to future expansion (inasmuch as we can guess our future needs based on 1)

So in terms of concrete actions I am totally aligned with your request for "groundedness", but I think it would really benefit us to also spell out some of the future requirements, so that we have a chance to design for extensibility (similar to what you guys pushed for in YARN-45, which I thought was a really good call).

ML Use Cases:

I asked Markus Weimer (ML/Systems guy in our group) to summarize why he sees gang scheduling as key for ML frameworks (which I think are going to flock to YARN in the coming months/years). Here is his response:

"In many iterative algorithms, it is imperative to load all the data into main memory to minimize execution time. This is true for systems like Giraph, Mahout and many others that will over time be on YARN. In order to satisfy their memory requirement, they will block, holding on to idle slots until YARN has delivered all the resources needed.
Exposing that pattern via gang scheduling seems beneficial.

Furthermore, these systems are often communication-intensive. Hence, they’d benefit from a gang of containers that are collocated on the network. This is a gang-wide property of the resource ask that cannot be captured easily without gang scheduling. The alternatives (e.g. getting a container on each rack, then expanding from there to see which rack “wins”) are quite wasteful in comparison.

Lastly, scheduling with alternatives at the gang level would be helpful. If, e.g., the training data for a machine learning algorithm needs 128GB of RAM, any combination of containers with that amount of RAM would satisfy the need. However, preference is given to fewer machines, as that reduces the communication overhead."

While I appreciate that the urgency of what Markus describes is not comparable to that of the Storm use case, I see ML as an important future use case for YARN. And gang scheduling seems to be one of those features that will determine whether people build on YARN or on something like Mesos.

> Support gang scheduling in the AM RM protocol
> ---------------------------------------------
>
>                 Key: YARN-624
>                 URL: https://issues.apache.org/jira/browse/YARN-624
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, scheduler
>    Affects Versions: 2.0.4-alpha
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
>
> Per discussion on YARN-392 and elsewhere, gang scheduling, in which a
> scheduler runs a set of tasks when they can all be run at the same time,
> would be a useful feature for YARN schedulers to support.
> Currently, AMs can approximate this by holding on to containers until they
> get all the ones they need. However, this lends itself to deadlocks when
> different AMs are waiting on the same containers.
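
The deadlock mentioned in the issue description (AMs holding on to containers while waiting for the rest of their gang) can be illustrated with a toy simulation. This is only a sketch under stated assumptions: it does not use any YARN API, the function names `hold_and_wait` and `gang_schedule` are hypothetical, containers are treated as identical slots, and the scheduling policies (round-robin grants, FIFO all-or-nothing admission) are simplifications chosen for illustration.

```python
def hold_and_wait(capacity, demands):
    """Hypothetical model of today's workaround: containers are granted one at
    a time, round-robin, and each AM holds what it has until its gang is full.

    Returns the set of AMs whose full demand was met. An empty set once the
    cluster is exhausted means every AM is stuck holding a partial gang.
    """
    held = {am: 0 for am in demands}
    free = capacity
    while free > 0:
        progressed = False
        for am, need in demands.items():
            if free > 0 and held[am] < need:
                held[am] += 1
                free -= 1
                progressed = True
        if not progressed:
            break  # everyone is satisfied; spare capacity remains
    return {am for am, need in demands.items() if held[am] == need}


def gang_schedule(capacity, demands):
    """Hypothetical all-or-nothing policy: a gang is admitted only if it fits
    entirely in the free capacity; admitted gangs finish and release their
    containers at the end of each round.

    Returns a list of rounds, each a list of the AMs that ran in that round.
    """
    pending = list(demands.items())
    rounds = []
    while pending:
        free = capacity
        admitted, waiting = [], []
        for am, need in pending:
            if need > capacity:
                raise ValueError(f"{am} needs more than the whole cluster")
            if need <= free:
                free -= need
                admitted.append(am)
            else:
                waiting.append((am, need))
        rounds.append(admitted)
        pending = waiting
    return rounds


if __name__ == "__main__":
    demands = {"am1": 3, "am2": 3}  # two AMs, each needing a gang of 3
    print(hold_and_wait(4, demands))
    print(gang_schedule(4, demands))
```

With four containers and two gangs of three, the hold-and-wait model leaves each AM holding two containers and neither can start, whereas the all-or-nothing model runs the gangs one after the other.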