I agree that making it pluggable opens up a lot of new possibilities. +1 for the idea. In the short term, though, we are having enough problems with just CPU and memory that it may be a little while before we get to a pluggable solution. Once YARN-2 goes in, if you can put together an initial proof-of-concept patch for a generic solution, I would be happy to review it and push for it to go in.
--Bobby

On 10/22/12 5:41 AM, "Radim Kolar" <h...@filez.com> wrote:

>I have a proposal for improved resource scheduling:
>
>https://issues.apache.org/jira/browse/MAPREDUCE-4256
>
>As I see it, development seems to be going the other way. For example, in
>https://issues.apache.org/jira/browse/YARN-2, every added kind of
>resource requires significant rework.
>
>Do you not see the benefits of a framework that can handle custom
>resource types? It's not all about memory and cores. You need to schedule
>jobs based on other factors (network capacity, availability of GPU
>cores, data locality).
>
>And every cluster might have special considerations, for example not
>overloading a central SQL database. We usually have a few hundred
>submitted jobs, so proper resource sharing is essential. There is no
>point in running a job that needs a GPU which is in use by another
>mapper; it is better to run other jobs until the GPU becomes available
>again.
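
For illustration only, here is a minimal sketch of what "custom resource types" could look like if resources were treated as a generic name-to-quantity map rather than fixed memory and core fields. It is not taken from MAPREDUCE-4256 or YARN-2; the class, the method names, and the resource names ("memoryMb", "vcores", "gpu") are all hypothetical. It shows the GPU case from the quoted message: a job whose request cannot be met is skipped and later jobs are started instead.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from YARN or MAPREDUCE-4256.
final class GenericResourceSketch {

    // A job and the amount of each named resource it asks for.
    static final class Job {
        final String name;
        final Map<String, Long> request;

        Job(String name, Map<String, Long> request) {
            this.name = name;
            this.request = request;
        }
    }

    // True if every requested resource is available in sufficient quantity.
    // A resource type that is not advertised at all counts as zero.
    static boolean fits(Map<String, Long> available, Map<String, Long> request) {
        for (Map.Entry<String, Long> e : request.entrySet()) {
            long have = available.getOrDefault(e.getKey(), 0L);
            if (have < e.getValue()) {
                return false;
            }
        }
        return true;
    }

    // Walks the queue in order, starts every job that fits in what is still
    // free, and skips (rather than blocks on) jobs that do not fit.
    static List<String> schedule(List<Job> queue, Map<String, Long> available) {
        Map<String, Long> free = new LinkedHashMap<>(available); // work on a copy
        List<String> started = new ArrayList<>();
        for (Job job : queue) {
            if (!fits(free, job.request)) {
                continue; // e.g. the GPU is busy: try the next job instead
            }
            for (Map.Entry<String, Long> e : job.request.entrySet()) {
                free.merge(e.getKey(), -e.getValue(), Long::sum);
            }
            started.add(job.name);
        }
        return started;
    }

    public static void main(String[] args) {
        Map<String, Long> node = Map.of("memoryMb", 4096L, "vcores", 4L, "gpu", 1L);
        List<Job> queue = List.of(
            new Job("train-a", Map.of("memoryMb", 1024L, "vcores", 1L, "gpu", 1L)),
            new Job("train-b", Map.of("memoryMb", 1024L, "vcores", 1L, "gpu", 1L)),
            new Job("etl-c", Map.of("memoryMb", 2048L, "vcores", 2L)));
        // train-b is skipped because train-a holds the only GPU.
        System.out.println(schedule(queue, node)); // prints [train-a, etl-c]
    }
}
```

Adding a new kind of resource in this sketch means adding a new key to the maps, with no change to the scheduling code. A real scheduler would also need fairness, locality, and release of resources when jobs finish, none of which is modeled here.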