[ https://issues.apache.org/jira/browse/MYRIAD-179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15096851#comment-15096851 ]
DarinJ commented on MYRIAD-179:
-------------------------------

This would be a really cool feature, but I have some concerns about its difficulty and maintainability. In particular, if we want to kill a placeholder task, we'll need to go deeper into the NodeManager than MyriadExecutorAuxServices to kill the actual container. This might mean extending the NodeManager class (and others), and possibly using reflection to access some private variables and methods. While possible, that may be very difficult to maintain; hopefully we could contribute changes to the YARN project to make this easier. There have been some recent commits/issues on the 2.8.0 branch that may help, in particular YARN-291.

> Support Revocable resources in Mesos
> ------------------------------------
>
>                 Key: MYRIAD-179
>                 URL: https://issues.apache.org/jira/browse/MYRIAD-179
>             Project: Myriad
>          Issue Type: Improvement
>          Components: Scheduler
>    Affects Versions: Myriad 0.1.0
>            Reporter: John Omernik
>
> Mesos has introduced revocable resources. Based on my reading of things,
> Myriad would be an awesome use case for oversubscription, especially when
> you combine it with Fine Grain Scaling (FGS).
> Based on what I've read on oversubscription, if Myriad were aware of
> oversubscription, we could have Myriad be smart about various YARN
> containers. Some jobs, such as production jobs, could be tagged in such a
> way that they run on non-revocable resources, while other YARN jobs with
> certain users/flags, especially in FGS mode, could be submitted using the
> revocable resources. This would be exceptionally powerful for big
> MapReduce jobs etc.
> These are the jobs that would be ad hoc in nature. In addition to not
> using resources when no jobs are running, the node managers, when they did
> run certain jobs, would run on the revocable resources so they could be
> killed if needed.
> I am not speaking from a dev perspective here, so this may be a lot harder
> than it seems; I am just trying to outline use cases.
> Another use case (I think both are very valid and worth pursuing) would be,
> once we have multi-tenancy built in, to have a whole Myriad framework
> dedicated to ad hoc jobs and another Myriad framework dedicated to
> production jobs. The ad hoc framework could be set up in such a way that all
> submissions would run with revocable resources, making it appropriate for
> dev work or other non-production jobs. Obviously this hinges on being
> able to run two Myriad clusters on the same Mesos cluster.
> The other thing is that if a whole framework were set to use revocable
> resources, we'd have to ensure the resource manager was running on
> non-revocable resources. While containers for YARN jobs can be killed, we
> don't want the whole framework to die.
> I see use cases for both; this just seems to add another layer of awesome
> flexibility as it pertains to jobs on the cluster.
> I'd be interested in fleshing this idea out more with the dev team.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)