[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run
[ https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Scott Chen updated MAPREDUCE-1783:
----------------------------------

    Fix Version/s: 0.22.0

> Task Initialization should be delayed till when a job can be run
> ----------------------------------------------------------------
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Assignee: Ramkumar Vadali
> Fix For: 0.22.0, 0.23.0
>
> Attachments: 0001-Pool-aware-job-initialization.patch,
> 0001-Pool-aware-job-initialization.patch.1, MAPREDUCE-1783.patch,
> submit-mapreduce-1783.patch
>
>
> The FairScheduler task scheduler uses PoolManager to impose limits on the
> number of jobs that can be running at a given time. However, jobs that are
> submitted are initialized immediately by EagerTaskInitializationListener by
> calling JobInProgress.initTasks. This causes the job split file to be read
> into memory. The split information is not needed until the number of running
> jobs is less than the maximum specified. If the amount of split information
> is large, this leads to unnecessary memory pressure on the Job Tracker.
> To ease memory pressure, FairScheduler can use another implementation of
> JobInProgressListener that is aware of PoolManager limits and can delay task
> initialization until the number of running jobs is below the maximum.

--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
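The idea in the issue description can be illustrated with a minimal, self-contained Java sketch. It is not the code from the attached patches and does not use the real Hadoop classes: `Job`, `PoolAwareInitializer`, the per-pool limit map, and the pool name `etl` are hypothetical stand-ins for JobInProgress, a pool-aware JobInProgressListener, and PoolManager limits. The sketch only shows the deferral logic: `initTasks()` (the step that would read the split file) is called when a pool has a free running-job slot, not at submission time.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

public class DeferredInitSketch {

    /** Hypothetical stand-in for JobInProgress. */
    static final class Job {
        final String id;
        final String pool;
        boolean initialized = false;

        Job(String id, String pool) {
            this.id = id;
            this.pool = pool;
        }

        /** Models the expensive step that loads split information into memory. */
        void initTasks() {
            initialized = true;
        }
    }

    /** Hypothetical listener that delays initTasks() until the pool is below its limit. */
    static final class PoolAwareInitializer {
        private final Map<String, Integer> maxRunning;                  // assumed per-pool limits
        private final Map<String, Integer> running = new HashMap<>();   // initialized jobs per pool
        private final Map<String, Queue<Job>> waiting = new HashMap<>();

        PoolAwareInitializer(Map<String, Integer> maxRunning) {
            this.maxRunning = new HashMap<>(maxRunning);
        }

        /** Called on submission: queue the job, then initialize it only if a slot is free. */
        void jobAdded(Job job) {
            waiting.computeIfAbsent(job.pool, p -> new ArrayDeque<>()).add(job);
            admit(job.pool);
        }

        /** Called when a job finishes or is removed: free its slot and admit waiting jobs. */
        void jobRemoved(Job job) {
            if (!job.initialized) {
                // The job never started; just drop it from the waiting queue.
                Queue<Job> q = waiting.get(job.pool);
                if (q != null) {
                    q.remove(job);
                }
                return;
            }
            running.merge(job.pool, -1, Integer::sum);
            admit(job.pool);
        }

        /** Initializes waiting jobs in FIFO order while the pool is below its limit. */
        private void admit(String pool) {
            Queue<Job> q = waiting.get(pool);
            if (q == null) {
                return;
            }
            // Pools without a configured limit are treated as unlimited (an assumption).
            int limit = maxRunning.getOrDefault(pool, Integer.MAX_VALUE);
            while (!q.isEmpty() && running.getOrDefault(pool, 0) < limit) {
                Job next = q.poll();
                next.initTasks();
                running.merge(pool, 1, Integer::sum);
            }
        }
    }

    public static void main(String[] args) {
        PoolAwareInitializer listener = new PoolAwareInitializer(Map.of("etl", 1));
        Job first = new Job("job_1", "etl");
        Job second = new Job("job_2", "etl");

        listener.jobAdded(first);
        listener.jobAdded(second);
        System.out.println(first.id + " initialized: " + first.initialized);
        System.out.println(second.id + " initialized: " + second.initialized);

        listener.jobRemoved(first);
        System.out.println(second.id + " initialized after slot freed: " + second.initialized);
    }
}
```

With a limit of one running job in the pool, the second job stays uninitialized until the first is removed, so its split information would not occupy memory while it waits.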
[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run
[ https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Scott Chen updated MAPREDUCE-1783:
----------------------------------

    Resolution: Fixed
    Fix Version/s: (was: 0.22.0) 0.23.0
    Status: Resolved (was: Patch Available)

I just committed this. Thanks Ram.

> Task Initialization should be delayed till when a job can be run
> ----------------------------------------------------------------
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Assignee: Ramkumar Vadali
> Fix For: 0.23.0
>
> Attachments: 0001-Pool-aware-job-initialization.patch,
> 0001-Pool-aware-job-initialization.patch.1, MAPREDUCE-1783.patch,
> submit-mapreduce-1783.patch

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run
[ https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramkumar Vadali updated MAPREDUCE-1783:
---------------------------------------

    Status: Patch Available (was: Open)

> Task Initialization should be delayed till when a job can be run
> ----------------------------------------------------------------
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Assignee: Ramkumar Vadali
> Fix For: 0.22.0
>
> Attachments: 0001-Pool-aware-job-initialization.patch,
> 0001-Pool-aware-job-initialization.patch.1, MAPREDUCE-1783.patch,
> submit-mapreduce-1783.patch
[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run
[ https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramkumar Vadali updated MAPREDUCE-1783:
---------------------------------------

    Attachment: MAPREDUCE-1783.patch

Patch after svn up

> Task Initialization should be delayed till when a job can be run
> ----------------------------------------------------------------
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Assignee: Ramkumar Vadali
> Fix For: 0.22.0
>
> Attachments: 0001-Pool-aware-job-initialization.patch,
> 0001-Pool-aware-job-initialization.patch.1, MAPREDUCE-1783.patch,
> submit-mapreduce-1783.patch
[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run
[ https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramkumar Vadali updated MAPREDUCE-1783:
---------------------------------------

    Status: Open (was: Patch Available)

Will submit an up-to-date patch.

> Task Initialization should be delayed till when a job can be run
> ----------------------------------------------------------------
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Assignee: Ramkumar Vadali
> Fix For: 0.22.0
>
> Attachments: 0001-Pool-aware-job-initialization.patch,
> 0001-Pool-aware-job-initialization.patch.1, submit-mapreduce-1783.patch
[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run
[ https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramkumar Vadali updated MAPREDUCE-1783:
---------------------------------------

    Status: Open (was: Patch Available)

> Task Initialization should be delayed till when a job can be run
> ----------------------------------------------------------------
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Fix For: 0.22.0
>
> Attachments: 0001-Pool-aware-job-initialization.patch,
> 0001-Pool-aware-job-initialization.patch.1, submit-mapreduce-1783.patch
[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run
[ https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramkumar Vadali updated MAPREDUCE-1783:
---------------------------------------

    Status: Patch Available (was: Open)

Trying again

> Task Initialization should be delayed till when a job can be run
> ----------------------------------------------------------------
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Fix For: 0.22.0
>
> Attachments: 0001-Pool-aware-job-initialization.patch,
> 0001-Pool-aware-job-initialization.patch.1, submit-mapreduce-1783.patch
[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run
[ https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramkumar Vadali updated MAPREDUCE-1783:
---------------------------------------

    Status: Patch Available (was: Open)
    Hadoop Flags: [Reviewed]

> Task Initialization should be delayed till when a job can be run
> ----------------------------------------------------------------
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Fix For: 0.22.0
>
> Attachments: 0001-Pool-aware-job-initialization.patch,
> 0001-Pool-aware-job-initialization.patch.1, submit-mapreduce-1783.patch
[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run
[ https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramkumar Vadali updated MAPREDUCE-1783:
---------------------------------------

    Status: Open (was: Patch Available)

Patch was not generated correctly

> Task Initialization should be delayed till when a job can be run
> ----------------------------------------------------------------
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Fix For: 0.22.0
>
> Attachments: 0001-Pool-aware-job-initialization.patch,
> 0001-Pool-aware-job-initialization.patch.1, submit-mapreduce-1783.patch
[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run
[ https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramkumar Vadali updated MAPREDUCE-1783:
---------------------------------------

    Attachment: submit-mapreduce-1783.patch

Formatted patch, this should work.

> Task Initialization should be delayed till when a job can be run
> ----------------------------------------------------------------
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Fix For: 0.22.0
>
> Attachments: 0001-Pool-aware-job-initialization.patch,
> 0001-Pool-aware-job-initialization.patch.1, submit-mapreduce-1783.patch
[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run
[ https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramkumar Vadali updated MAPREDUCE-1783:
---------------------------------------

    Status: Patch Available (was: Open)
    Fix Version/s: 0.22.0

> Task Initialization should be delayed till when a job can be run
> ----------------------------------------------------------------
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Fix For: 0.22.0
>
> Attachments: 0001-Pool-aware-job-initialization.patch,
> 0001-Pool-aware-job-initialization.patch.1
[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run
[ https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramkumar Vadali updated MAPREDUCE-1783:
---------------------------------------

    Attachment: 0001-Pool-aware-job-initialization.patch.1

Made a fix per Scott's comments.

> Task Initialization should be delayed till when a job can be run
> ----------------------------------------------------------------
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
>
> Attachments: 0001-Pool-aware-job-initialization.patch,
> 0001-Pool-aware-job-initialization.patch.1
[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run
[ https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramkumar Vadali updated MAPREDUCE-1783:
---------------------------------------

    Attachment: 0001-Pool-aware-job-initialization.patch

> Task Initialization should be delayed till when a job can be run
> ----------------------------------------------------------------
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
>
> Attachments: 0001-Pool-aware-job-initialization.patch