[jira] Commented: (MAPREDUCE-1526) Cache the job related information while submitting the job , this would avoid many RPC calls to JobTracker.

rahul k singh (JIRA) Sat, 10 Apr 2010 04:49:04 -0700

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855541#action_12855541
 ]


rahul k singh commented on MAPREDUCE-1526:
------------------------------------------

Added a new patch. The following review comments are addressed in it:

- Restore the behavior of default seq == -1 for GridmixJob and 
GenerateData
- Why do we need this? Wouldn't noOfRunningJobs always be the same as 
jobMaps.size()?
{noformat}
+  private static AtomicInteger noOfRunningJobs= new AtomicInteger(0);
{noformat}
- The proper logic for addJobStats should be: first check if seq < 0; 
if so, ignore the job. Then, if jobdesc is null, we should throw an 
exception instead of setting #maps to 1.
- The implementation of Statistics.add(Job job) is still wrong: You 
should hold the return value of jobMaps.remove(), and call 
StatListener<JobStats>.update() with the return value iff the return 
value is not null.
- Should we eliminate the variable runningJobs in StressJobFactory?
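
For reference, the addJobStats and Statistics.add(Job) points above can be sketched roughly as follows. This is only a minimal illustration, not the code in the patch: the class StatisticsSketch, the simplified JobStats and StatListener types, and the use of a nullable Integer in place of the job description are all assumptions made to keep the example self-contained. It also shows why a separate noOfRunningJobs counter would be redundant if jobMaps.size() is available.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for the Gridmix Statistics class.
class StatisticsSketch {
    // Hypothetical stand-in for JobStats: just the sequence id and map count.
    static final class JobStats {
        final int seq;
        final int noOfMaps;
        JobStats(int seq, int noOfMaps) {
            this.seq = seq;
            this.noOfMaps = noOfMaps;
        }
    }

    // Hypothetical stand-in for StatListener<JobStats>.
    interface StatListener<T> {
        void update(T item);
    }

    private final Map<Integer, JobStats> jobMaps = new ConcurrentHashMap<>();
    private final List<StatListener<JobStats>> listeners = new ArrayList<>();

    void addListener(StatListener<JobStats> listener) {
        listeners.add(listener);
    }

    // noOfMaps plays the role of the job description here (assumption):
    // null means "no description available".
    boolean addJobStats(int seq, Integer noOfMaps) {
        if (seq < 0) {
            return false; // jobs without a valid sequence id are ignored
        }
        if (noOfMaps == null) {
            // fail loudly instead of silently assuming a single map
            throw new IllegalArgumentException("No job description for seq " + seq);
        }
        jobMaps.put(seq, new JobStats(seq, noOfMaps));
        return true;
    }

    // Called when a job completes: hold the value returned by remove() and
    // notify listeners only if the job was actually being tracked.
    boolean add(int seq) {
        JobStats stats = jobMaps.remove(seq);
        if (stats == null) {
            return false;
        }
        for (StatListener<JobStats> listener : listeners) {
            listener.update(stats);
        }
        return true;
    }

    // The number of running jobs is derived from the map itself.
    int runningJobs() {
        return jobMaps.size();
    }

    public static void main(String[] args) {
        StatisticsSketch stats = new StatisticsSketch();
        stats.addListener(js -> System.out.println("completed job seq=" + js.seq));
        stats.addJobStats(0, 10);
        stats.add(0);
        System.out.println("running jobs: " + stats.runningJobs());
    }
}
```

Deriving the running-job count from the map keeps a single source of truth, so the count cannot drift from the set of tracked jobs.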

Minor things:
- The following comments from my previous review were not addressed:
> - I think the following statement should be Log.debug() instead of 
> Log.info() (and be protected by a check of LOG.isDebugEnabled()):
> {noformat}
> -    if (LOG.isDebugEnabled()) {
> -      LOG.info(
> +    LOG.info(
>         System.currentTimeMillis() + " Overloaded is " + Boolean.toString(
>           overloaded) + " incompleteMapTasks " + relOp + " " +
>           OVERLAOD_MAPTASK_MAPSLOT_RATIO + "*mapSlotCapacity" + "(" +
>           incompleteMapTasks + " " + relOp + " " +
>           OVERLAOD_MAPTASK_MAPSLOT_RATIO + "*" +
>           clusterStatus.getMaxMapTasks() + ")");
> -    }
> +
> {noformat}
- static List<InputSplit> pullDescription(JobContext jobCtxt) can be 
implemented on top of GridmixJob.getJobSeqId
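
The debug-guard pattern requested in the quoted comment can be sketched as below. This is an illustration under stated assumptions, not the Gridmix code: it uses java.util.logging (Level.FINE with isLoggable) in place of the commons-logging LOG.isDebugEnabled() call so that it runs without extra dependencies, and the class DebugGuardSketch, the helper overloadMessage, the messagesBuilt counter, and the way relOp is derived are all hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical class showing a guarded debug-level log statement.
class DebugGuardSketch {
    private static final Logger LOG =
        Logger.getLogger(DebugGuardSketch.class.getName());

    // Counts how often the (relatively expensive) message is built,
    // only so the effect of the guard can be observed.
    static int messagesBuilt = 0;

    static String overloadMessage(boolean overloaded, int incompleteMapTasks,
                                  float ratio, int maxMapTasks) {
        messagesBuilt++;
        // Assumption: relOp describes how the task count compares to the limit.
        String relOp = overloaded ? ">" : "<=";
        return System.currentTimeMillis() + " Overloaded is " + overloaded
            + " incompleteMapTasks " + relOp + " " + ratio + "*mapSlotCapacity"
            + "(" + incompleteMapTasks + " " + relOp + " " + ratio + "*"
            + maxMapTasks + ")";
    }

    static void logOverload(boolean overloaded, int incompleteMapTasks,
                            float ratio, int maxMapTasks) {
        // Build and emit the message only when debug-level logging is enabled.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(overloadMessage(overloaded, incompleteMapTasks,
                                     ratio, maxMapTasks));
        }
    }

    public static void main(String[] args) {
        logOverload(true, 250, 2.5f, 80);
        System.out.println("messages built: " + messagesBuilt);
    }
}
```

With the guard in place, the string concatenation is skipped entirely whenever debug logging is off, which matters for a statement evaluated on every load check.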

- removed the redundant GridmixJob.getJobSeqId() calls.
- fixed a minor bug in Statistics.addJobStats(Job, JobStats)


> Cache the job related information while submitting the job , this would avoid 
> many RPC calls to JobTracker.
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1526
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1526
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/gridmix
>            Reporter: rahul k singh
>         Attachments: 1526-yahadoop-20-101-2.patch, 
> 1526-yahadoop-20-101-3.patch, 1526-yahadoop-20-101.patch, 
> 1526-yhadoop-20-101-4.patch, 1526-yhadoop-20-101-4.patch
>
>


