[ https://issues.apache.org/jira/browse/MAPREDUCE-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800729#action_12800729 ]
Arun C Murthy commented on MAPREDUCE-1380:
------------------------------------------

bq. Note that the current approach to estimate the completion time of jobs is quite simplistic: it is based on the time it takes to finish each task, so it works well with regular jobs

Polo - Can you please expand on your definition of 'regular' jobs? Are these, for e.g. part of regular workflows?

IAC, how do you propose to communicate this information to the AdaptiveScheduler?

> Adaptive Scheduler
> ------------------
>
>                 Key: MAPREDUCE-1380
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1380
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Jordà Polo
>            Priority: Minor
>
> The Adaptive Scheduler is a pluggable Hadoop scheduler that automatically
> adjusts the amount of used resources depending on the performance of jobs and
> on user-defined high-level business goals.
> Existing Hadoop schedulers are focused on managing large, static clusters in
> which nodes are added or removed manually. On the other hand, the goal of
> this scheduler is to improve the integration of Hadoop and the applications
> that run on top of it with environments that allow a more dynamic
> provisioning of resources.
> The current implementation is quite straightforward. Users specify a deadline
> at job submission time, and the scheduler adjusts the resources to meet that
> deadline (at the moment, the scheduler can be configured to either minimize
> or maximize the amount of resources). If multiple jobs are run
> simultaneously, the scheduler prioritizes them by deadline. Note that the
> current approach to estimate the completion time of jobs is quite simplistic:
> it is based on the time it takes to finish each task, so it works well with
> regular jobs, but there is still room for improvement for unpredictable jobs.
> The idea is to further integrate it with cloud-like and virtual environments
> (such as Amazon EC2, Emotive, etc.) so that if, for instance, a job isn't
> able to meet its deadline, the scheduler automatically requests more
> resources.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
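
For readers following the thread, the task-time-based estimate mentioned in the issue description might look roughly like the sketch below. This is only an illustration under stated assumptions, not the Adaptive Scheduler's actual code: the `Job` record, its fields, and the functions `mean_task_time`, `slots_needed`, and `by_deadline` are all hypothetical names. It assumes tasks of a job take about as long as the ones already finished, which is what would make "regular" jobs easy to estimate.

```python
import math
from dataclasses import dataclass


@dataclass
class Job:
    # All fields are hypothetical; they are not taken from the Hadoop code base.
    name: str
    deadline: float            # absolute time (seconds) by which the job should finish
    pending_tasks: int         # tasks not yet completed
    completed_durations: list  # observed durations (seconds) of finished tasks


def mean_task_time(job):
    """Average duration of the tasks finished so far, or None if there are none."""
    if not job.completed_durations:
        return None
    return sum(job.completed_durations) / len(job.completed_durations)


def slots_needed(job, now):
    """Smallest number of parallel slots expected to finish the job by its deadline.

    Returns None when no estimate is possible (no finished tasks yet), the
    deadline has already passed, or not even one task fits in the time left.
    """
    avg = mean_task_time(job)
    time_left = job.deadline - now
    if avg is None or time_left <= 0:
        return None
    if job.pending_tasks == 0:
        return 0
    # Each slot can run this many tasks back to back before the deadline.
    tasks_per_slot = math.floor(time_left / avg)
    if tasks_per_slot == 0:
        return None
    return math.ceil(job.pending_tasks / tasks_per_slot)


def by_deadline(jobs):
    """Order jobs so that the earliest deadline comes first."""
    return sorted(jobs, key=lambda j: j.deadline)
```

With ten pending tasks, an observed average of 10 seconds per task, and 50 seconds left, each slot fits five tasks, so the sketch asks for two slots. A `None` result marks the cases where a task-time estimate alone cannot say how to meet the deadline, for example a job whose tasks have not yet produced any timings.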