[jira] Created: (MAPREDUCE-1690) Using BuddySystem to reduce the ReduceTask's mem usage in the step of shuffle
Using BuddySystem to reduce the ReduceTask's mem usage in the step of shuffle
-----------------------------------------------------------------------------

Key: MAPREDUCE-1690
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1690
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: luoli

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (MAPREDUCE-1689) Write MR wire protocols in Avro IDL
[ https://issues.apache.org/jira/browse/MAPREDUCE-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855674#action_12855674 ]

Jeff Hammerbacher commented on MAPREDUCE-1689:
----------------------------------------------

Hey Arun,

I think there's a typo in your description. Do you mean "write all MapReduce protocols in Avro IDL"?

Thanks,
Jeff

> Write MR wire protocols in Avro IDL
> -----------------------------------
>
> Key: MAPREDUCE-1689
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1689
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: client, jobtracker, task, tasktracker
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
>
> As part of the move to AVRO and wire compatibility, write all HDFS protocols in AVRO IDL. This is analogous to HDFS-1069.
[jira] Created: (MAPREDUCE-1689) Write MR wire protocols in Avro IDL
Write MR wire protocols in Avro IDL
-----------------------------------

Key: MAPREDUCE-1689
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1689
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: client, jobtracker, task, tasktracker
Reporter: Arun C Murthy
Assignee: Arun C Murthy

As part of the move to AVRO and wire compatibility, write all HDFS protocols in AVRO IDL. This is analogous to HDFS-1069.
[jira] Commented: (MAPREDUCE-1526) Cache the job related information while submitting the job , this would avoid many RPC calls to JobTracker.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855541#action_12855541 ]

rahul k singh commented on MAPREDUCE-1526:
------------------------------------------

Added the new patch. The following review comments are addressed in it:

- Restore the behavior of default seq == -1 for GridmixJob and GenerateData.
- Why do we need this? Wouldn't noOfRunningJobs always be the same as jobMaps.size()?
{noformat}
+ private static AtomicInteger noOfRunningJobs= new AtomicInteger(0);
{noformat}
- The proper logic for addJobStats should be: first check if seq < 0, and if so, ignore the job; then, if jobdesc is null, throw an exception instead of adjusting #maps to 1.
- The implementation of Statistics.add(Job job) is still wrong: you should hold the return value of jobMaps.remove(), and call StatListener.update() with the return value iff the return value is not null.
- We should eliminate the variable runningJobs in StressJobFactory?

Minor things:
- The following comments from my previous review were not addressed:
> - I think the following statement should be Log.debug() instead of
> Log.info() (and be protected by a check of LOG.isDebugEnabled()):
> {noformat}
> -if (LOG.isDebugEnabled()) {
> -  LOG.info(
> +LOG.info(
>    System.currentTimeMillis() + " Overloaded is " +
>    Boolean.toString(
>    overloaded) + " incompleteMapTasks " + relOp + " " +
>    OVERLAOD_MAPTASK_MAPSLOT_RATIO + "*mapSlotCapacity" + "(" +
>    incompleteMapTasks + " " + relOp + " " +
>    OVERLAOD_MAPTASK_MAPSLOT_RATIO + "*" +
>    clusterStatus.getMaxMapTasks() + ")");
> -}
> +
> {noformat}
- static List pullDescription(JobContext jobCtxt) can be implemented on top of GridmixJob.getJobSeqId.

Also in this patch:
- Removed the redundant GridmixJob.getJobSeqId() calls.
- Fixed a minor bug in Statistics.addJobStats(Job, JobStats).

> Cache the job related information while submitting the job , this would avoid many RPC calls to JobTracker.
> ---------------------------------------------------------------------------
>
> Key: MAPREDUCE-1526
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1526
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/gridmix
> Reporter: rahul k singh
> Attachments: 1526-yahadoop-20-101-2.patch, 1526-yahadoop-20-101-3.patch, 1526-yahadoop-20-101.patch, 1526-yhadoop-20-101-4.patch, 1526-yhadoop-20-101-4.patch
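The bookkeeping rules requested in the review above (ignore a job when seq < 0, throw when the job description is missing, and notify listeners only when jobMaps.remove() actually returned an entry) can be sketched roughly as follows. This is a minimal illustration and not the Gridmix code: the JobStats fields, the int-keyed method signatures, addListener and runningJobs() are hypothetical; only the names jobMaps, addJobStats, add and StatListener.update come from the comment. It also shows why a separate noOfRunningJobs counter may be redundant, since the map size already carries that number.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: JobStats, the int-keyed signatures, addListener and
// runningJobs() are hypothetical stand-ins, not the real Gridmix API.
class StatisticsSketch {

  /** Hypothetical per-job record kept while a job is running. */
  static final class JobStats {
    final int seq;
    final int noOfMaps;

    JobStats(int seq, int noOfMaps) {
      this.seq = seq;
      this.noOfMaps = noOfMaps;
    }
  }

  interface StatListener {
    void update(JobStats stats);
  }

  private final Map<Integer, JobStats> jobMaps = new ConcurrentHashMap<>();
  private final List<StatListener> listeners = new ArrayList<>();

  void addListener(StatListener listener) {
    listeners.add(listener);
  }

  /** Registers a submitted job, following the order of checks from the review. */
  void addJobStats(int seq, JobStats stats) {
    if (seq < 0) {
      return; // negative sequence id: ignore the job
    }
    if (stats == null) {
      // fail loudly instead of silently assuming a single map
      throw new IllegalArgumentException("no job description for seq " + seq);
    }
    jobMaps.put(seq, stats);
  }

  /** Called when a job completes. */
  void add(int seq) {
    JobStats removed = jobMaps.remove(seq); // hold the return value
    if (removed != null) {                  // notify iff something was removed
      for (StatListener listener : listeners) {
        listener.update(removed);
      }
    }
  }

  /** The number of running jobs is just the size of the map. */
  int runningJobs() {
    return jobMaps.size();
  }

  public static void main(String[] args) {
    StatisticsSketch stats = new StatisticsSketch();
    stats.addListener(s -> System.out.println("finished seq " + s.seq));
    stats.addJobStats(-1, null);               // ignored, no exception
    stats.addJobStats(7, new JobStats(7, 40));
    stats.add(7);                              // listener is notified once
    stats.add(7);                              // nothing left to remove, no notification
    System.out.println("running: " + stats.runningJobs());
  }
}
```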
[jira] Updated: (MAPREDUCE-1526) Cache the job related information while submitting the job , this would avoid many RPC calls to JobTracker.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rahul k singh updated MAPREDUCE-1526:
------------------------------------

Attachment: 1526-yhadoop-20-101-4.patch

> Cache the job related information while submitting the job , this would avoid many RPC calls to JobTracker.
> ---------------------------------------------------------------------------
>
> Key: MAPREDUCE-1526
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1526
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/gridmix
> Reporter: rahul k singh
> Attachments: 1526-yahadoop-20-101-2.patch, 1526-yahadoop-20-101-3.patch, 1526-yahadoop-20-101.patch, 1526-yhadoop-20-101-4.patch, 1526-yhadoop-20-101-4.patch
[jira] Commented: (MAPREDUCE-1683) Remove JNI calls from ClusterStatus cstr
[ https://issues.apache.org/jira/browse/MAPREDUCE-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855540#action_12855540 ]

Vinod K V commented on MAPREDUCE-1683:
--------------------------------------

+1 for leaving the memory vals in the detailed info. Patch for 20 looks good too.

When working on the trunk patch, we should clean up/remove all the useless constructors. The constructors are all package private, so there is no harm in doing a cleanup here, I think.

> Remove JNI calls from ClusterStatus cstr
> ----------------------------------------
>
> Key: MAPREDUCE-1683
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1683
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: jobtracker
> Affects Versions: 0.20.2
> Reporter: Chris Douglas
> Attachments: MAPREDUCE-1683_yhadoop_20_9.patch
>
> The {{ClusterStatus}} constructor makes two JNI calls to the {{Runtime}} to fetch memory information. {{ClusterStatus}} instances are often created inside the {{JobTracker}} to obtain other, unrelated metrics (sometimes from schedulers' inner loops). Given that this information is related to the {{JobTracker}} process and not the cluster, the metrics are also available via {{JvmMetrics}}, and the jsps can gather this information for themselves: these fields can be removed from {{ClusterStatus}}.
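The idea in the description, keeping the status object cheap to construct and letting the few callers that need process memory figures ask the Runtime themselves, might look roughly like the sketch below. It is an illustration under stated assumptions, not the actual patch: the class and field names are hypothetical, and it is an assumption that the two calls in question are Runtime.totalMemory() and Runtime.maxMemory().

```java
// Sketch only: ClusterStatusSketch and JobTrackerMemory are hypothetical
// names; the choice of totalMemory()/maxMemory() is an assumption.
final class ClusterStatusSketch {
  private final int mapTasks;
  private final int maxMapTasks;

  // Cheap constructor: plain field assignments, no calls into Runtime,
  // so creating instances inside a scheduler's inner loop stays inexpensive.
  ClusterStatusSketch(int mapTasks, int maxMapTasks) {
    this.mapTasks = mapTasks;
    this.maxMapTasks = maxMapTasks;
  }

  int getMapTasks() {
    return mapTasks;
  }

  int getMaxMapTasks() {
    return maxMapTasks;
  }

  public static void main(String[] args) {
    ClusterStatusSketch status = new ClusterStatusSketch(12, 40);
    System.out.println("maps: " + status.getMapTasks() + "/" + status.getMaxMapTasks());
    // Only code that really wants memory figures pays for the lookup.
    System.out.println("heap in use by this JVM: " + JobTrackerMemory.totalMemory());
  }
}

// Process-level memory figures, fetched on demand by whoever displays them
// (for example a status page) rather than on every status object creation.
final class JobTrackerMemory {
  private JobTrackerMemory() {}

  static long totalMemory() {
    return Runtime.getRuntime().totalMemory();
  }

  static long maxMemory() {
    return Runtime.getRuntime().maxMemory();
  }
}
```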
[jira] Commented: (MAPREDUCE-1505) Cluster class should create the rpc client only when needed
[ https://issues.apache.org/jira/browse/MAPREDUCE-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855539#action_12855539 ]

Vinod K V commented on MAPREDUCE-1505:
--------------------------------------

I looked at the latest 20 patch. Looks good except that the changes in {{ensureState()}} are never reachable.

> Cluster class should create the rpc client only when needed
> -----------------------------------------------------------
>
> Key: MAPREDUCE-1505
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1505
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: client
> Affects Versions: 0.20.2
> Reporter: Devaraj Das
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1505_yhadoop20.patch, MAPREDUCE-1505_yhadoop20_9.patch
>
> It will be good to have the org.apache.hadoop.mapreduce.Cluster create the rpc client object only when needed (when a call to the jobtracker is actually required). org.apache.hadoop.mapreduce.Job constructs the Cluster object internally and in many cases the application that created the Job object really wants to look at the configuration only. It'd help to not have these connections to the jobtracker especially when Job is used in the tasks (e.g., Pig calls mapreduce.FileInputFormat.setInputPath in the tasks and that requires a Job object to be passed).
> In Hadoop 20, the Job object internally creates the JobClient object, and the same argument applies there too.
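Creating the rpc client only when a call to the jobtracker is actually required is a form of lazy initialization. A minimal sketch of that pattern, assuming a hypothetical ClientProtocol interface and a factory supplied by the caller (neither is the real Hadoop API), could look like this:

```java
import java.util.function.Supplier;

// Sketch only: ClientProtocol, the Supplier-based constructor and
// isConnected() are hypothetical; they are not taken from the patch.
class LazyClusterSketch {

  /** Hypothetical stand-in for the rpc interface to the jobtracker. */
  interface ClientProtocol {
    String getJobStatus(String jobId);
  }

  private final Supplier<ClientProtocol> clientFactory;
  private ClientProtocol client; // stays null until the first remote call

  // Construction only records how to connect; no connection is opened here,
  // so code that merely reads configuration never touches the jobtracker.
  LazyClusterSketch(Supplier<ClientProtocol> clientFactory) {
    this.clientFactory = clientFactory;
  }

  // Creates the client on first use and reuses it afterwards.
  private synchronized ClientProtocol client() {
    if (client == null) {
      client = clientFactory.get();
    }
    return client;
  }

  String getJobStatus(String jobId) {
    return client().getJobStatus(jobId);
  }

  synchronized boolean isConnected() {
    return client != null;
  }

  public static void main(String[] args) {
    LazyClusterSketch cluster = new LazyClusterSketch(() -> {
      System.out.println("connecting");
      return jobId -> "RUNNING";
    });
    System.out.println("connected after construction: " + cluster.isConnected());
    System.out.println("status: " + cluster.getJobStatus("job_1"));
    System.out.println("connected after first call: " + cluster.isConnected());
  }
}
```

The synchronized accessor keeps the sketch simple; whether the real class needs that locking depends on how Cluster instances are shared, which the issue does not say.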