[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run

2011-02-01 Thread Scott Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated MAPREDUCE-1783:
--

Fix Version/s: 0.22.0

> Task Initialization should be delayed till when a job can be run
> 
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Assignee: Ramkumar Vadali
> Fix For: 0.22.0, 0.23.0
>
> Attachments: 0001-Pool-aware-job-initialization.patch, 
> 0001-Pool-aware-job-initialization.patch.1, MAPREDUCE-1783.patch, 
> submit-mapreduce-1783.patch
>
>
> The FairScheduler task scheduler uses PoolManager to limit the number of 
> jobs that can run at a given time. However, submitted jobs are initialized 
> immediately by EagerTaskInitializationListener, which calls 
> JobInProgress.initTasks. This reads the job split file into memory, even 
> though the split information is not needed until the number of running 
> jobs drops below the specified maximum. If the split information is large, 
> this puts unnecessary memory pressure on the JobTracker.
> To ease memory pressure, FairScheduler can use another implementation of 
> JobInProgressListener that is aware of PoolManager limits and can delay task 
> initialization until the number of running jobs is below the maximum.
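
The idea described above can be sketched as follows. This is a minimal, self-contained illustration and not the committed patch: SketchJob and DeferredInitListener are hypothetical stand-ins for JobInProgress and a pool-aware JobInProgressListener, and a single global limit is assumed in place of PoolManager's per-pool limits.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/**
 * Illustrative sketch only: none of these classes are Hadoop's.
 * A single global limit replaces per-pool limits for brevity.
 */
class PoolAwareInitSketch {

    /** Hypothetical stand-in for a submitted job. */
    static class SketchJob {
        final String id;
        boolean initialized;
        boolean finished;

        SketchJob(String id) { this.id = id; }

        /** Where a real job would read its split file into memory. */
        void initTasks() { initialized = true; }
    }

    /** Queues submitted jobs and initializes them only while under the limit. */
    static class DeferredInitListener {
        private final int maxRunningJobs;
        private final Queue<SketchJob> waiting = new ArrayDeque<>();
        private int running;

        DeferredInitListener(int maxRunningJobs) {
            this.maxRunningJobs = maxRunningJobs;
        }

        /** Called on submission: enqueue, then initialize if a slot is free. */
        void jobAdded(SketchJob job) {
            waiting.add(job);
            initializeWhileUnderLimit();
        }

        /** Called on completion: free the slot and admit the next waiting job. */
        void jobCompleted(SketchJob job) {
            if (job.initialized && !job.finished) {
                job.finished = true;
                running--;
                initializeWhileUnderLimit();
            }
        }

        int waitingCount() { return waiting.size(); }

        private void initializeWhileUnderLimit() {
            while (running < maxRunningJobs && !waiting.isEmpty()) {
                SketchJob next = waiting.poll();
                next.initTasks();
                running++;
            }
        }
    }

    public static void main(String[] args) {
        DeferredInitListener listener = new DeferredInitListener(1);
        SketchJob first = new SketchJob("job_1");
        SketchJob second = new SketchJob("job_2");
        listener.jobAdded(first);
        listener.jobAdded(second);
        System.out.println(second.initialized); // false: the limit of 1 is reached
        listener.jobCompleted(first);
        System.out.println(second.initialized); // true: the freed slot admits job_2
    }
}
```

The sketch is not thread-safe; a real listener would need to synchronize access to the queue and the running count.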

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run

2010-11-30 Thread Scott Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated MAPREDUCE-1783:
--

   Resolution: Fixed
Fix Version/s: 0.23.0  (was: 0.22.0)
       Status: Resolved  (was: Patch Available)

I just committed this. Thanks Ram.

> Task Initialization should be delayed till when a job can be run
> 
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Assignee: Ramkumar Vadali
> Fix For: 0.23.0
>
> Attachments: 0001-Pool-aware-job-initialization.patch, 
> 0001-Pool-aware-job-initialization.patch.1, MAPREDUCE-1783.patch, 
> submit-mapreduce-1783.patch

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run

2010-11-19 Thread Ramkumar Vadali (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramkumar Vadali updated MAPREDUCE-1783:
---

Status: Patch Available  (was: Open)

> Task Initialization should be delayed till when a job can be run
> 
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Assignee: Ramkumar Vadali
> Fix For: 0.22.0
>
> Attachments: 0001-Pool-aware-job-initialization.patch, 
> 0001-Pool-aware-job-initialization.patch.1, MAPREDUCE-1783.patch, 
> submit-mapreduce-1783.patch



[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run

2010-11-19 Thread Ramkumar Vadali (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramkumar Vadali updated MAPREDUCE-1783:
---

Attachment: MAPREDUCE-1783.patch

Patch after svn up

> Task Initialization should be delayed till when a job can be run
> 
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Assignee: Ramkumar Vadali
> Fix For: 0.22.0
>
> Attachments: 0001-Pool-aware-job-initialization.patch, 
> 0001-Pool-aware-job-initialization.patch.1, MAPREDUCE-1783.patch, 
> submit-mapreduce-1783.patch



[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run

2010-10-27 Thread Ramkumar Vadali (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramkumar Vadali updated MAPREDUCE-1783:
---

Status: Open  (was: Patch Available)

Will submit an up-to-date patch.

> Task Initialization should be delayed till when a job can be run
> 
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Assignee: Ramkumar Vadali
> Fix For: 0.22.0
>
> Attachments: 0001-Pool-aware-job-initialization.patch, 
> 0001-Pool-aware-job-initialization.patch.1, submit-mapreduce-1783.patch



[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run

2010-06-30 Thread Ramkumar Vadali (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramkumar Vadali updated MAPREDUCE-1783:
---

Status: Open  (was: Patch Available)

> Task Initialization should be delayed till when a job can be run
> 
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Fix For: 0.22.0
>
> Attachments: 0001-Pool-aware-job-initialization.patch, 
> 0001-Pool-aware-job-initialization.patch.1, submit-mapreduce-1783.patch



[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run

2010-06-30 Thread Ramkumar Vadali (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramkumar Vadali updated MAPREDUCE-1783:
---

Status: Patch Available  (was: Open)

Trying again

> Task Initialization should be delayed till when a job can be run
> 
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Fix For: 0.22.0
>
> Attachments: 0001-Pool-aware-job-initialization.patch, 
> 0001-Pool-aware-job-initialization.patch.1, submit-mapreduce-1783.patch



[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run

2010-05-21 Thread Ramkumar Vadali (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramkumar Vadali updated MAPREDUCE-1783:
---

      Status: Patch Available  (was: Open)
Hadoop Flags: [Reviewed]

> Task Initialization should be delayed till when a job can be run
> 
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Fix For: 0.22.0
>
> Attachments: 0001-Pool-aware-job-initialization.patch, 
> 0001-Pool-aware-job-initialization.patch.1, submit-mapreduce-1783.patch



[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run

2010-05-21 Thread Ramkumar Vadali (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramkumar Vadali updated MAPREDUCE-1783:
---

Status: Open  (was: Patch Available)

Patch was not generated correctly

> Task Initialization should be delayed till when a job can be run
> 
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Fix For: 0.22.0
>
> Attachments: 0001-Pool-aware-job-initialization.patch, 
> 0001-Pool-aware-job-initialization.patch.1, submit-mapreduce-1783.patch



[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run

2010-05-21 Thread Ramkumar Vadali (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramkumar Vadali updated MAPREDUCE-1783:
---

Attachment: submit-mapreduce-1783.patch

Formatted patch, this should work.

> Task Initialization should be delayed till when a job can be run
> 
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Fix For: 0.22.0
>
> Attachments: 0001-Pool-aware-job-initialization.patch, 
> 0001-Pool-aware-job-initialization.patch.1, submit-mapreduce-1783.patch



[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run

2010-05-20 Thread Ramkumar Vadali (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramkumar Vadali updated MAPREDUCE-1783:
---

       Status: Patch Available  (was: Open)
Fix Version/s: 0.22.0

> Task Initialization should be delayed till when a job can be run
> 
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Fix For: 0.22.0
>
> Attachments: 0001-Pool-aware-job-initialization.patch, 
> 0001-Pool-aware-job-initialization.patch.1



[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run

2010-05-20 Thread Ramkumar Vadali (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramkumar Vadali updated MAPREDUCE-1783:
---

Attachment: 0001-Pool-aware-job-initialization.patch.1

Made a fix per Scott's comments.

> Task Initialization should be delayed till when a job can be run
> 
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Attachments: 0001-Pool-aware-job-initialization.patch, 
> 0001-Pool-aware-job-initialization.patch.1



[jira] Updated: (MAPREDUCE-1783) Task Initialization should be delayed till when a job can be run

2010-05-18 Thread Ramkumar Vadali (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramkumar Vadali updated MAPREDUCE-1783:
---

Attachment: 0001-Pool-aware-job-initialization.patch

> Task Initialization should be delayed till when a job can be run
> 
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Attachments: 0001-Pool-aware-job-initialization.patch