[ https://issues.apache.org/jira/browse/MAPREDUCE-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amareshwari Sriramadasu updated MAPREDUCE-1853:
-----------------------------------------------

    Hadoop Flags: [Reviewed]

> MultipleOutputs does not cache TaskAttemptContext
> -------------------------------------------------
>
>                 Key: MAPREDUCE-1853
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1853
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.21.0
>         Environment: OSX 10.6
>                      java6
>            Reporter: Torsten Curdt
>            Priority: Critical
>             Fix For: 0.21.0
>
>         Attachments: cache-task-attempts.diff
>
>
> In MultipleOutputs there is
> [code]
>   private TaskAttemptContext getContext(String nameOutput) throws IOException {
>     // The following trick leverages the instantiation of a record writer via
>     // the job thus supporting arbitrary output formats.
>     Job job = new Job(context.getConfiguration());
>     job.setOutputFormatClass(getNamedOutputFormatClass(context, nameOutput));
>     job.setOutputKeyClass(getNamedOutputKeyClass(context, nameOutput));
>     job.setOutputValueClass(getNamedOutputValueClass(context, nameOutput));
>     TaskAttemptContext taskContext =
>       new TaskAttemptContextImpl(job.getConfiguration(),
>                                  context.getTaskAttemptID());
>     return taskContext;
>   }
> [code]
> so for every reduce call it creates a new Job instance ...which creates a new
> LocalJobRunner.
> That does not sound like a good idea.
> You end up with a flood of "jvm.JvmMetrics: Cannot initialize JVM Metrics
> with processName=JobTracker, sessionId= - already initialized"
> This should probably also be added to 0.22.
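
The attachment name (cache-task-attempts.diff) suggests the fix is to build the TaskAttemptContext once per named output and reuse it. The patch itself is not quoted in this message, so the following is only a minimal sketch of that caching idea, not the committed change. It deliberately avoids Hadoop types so it runs on its own: NamedContextCache, its factory argument, and the creations() counter are hypothetical names introduced for illustration. In MultipleOutputs the factory would presumably be the body of getContext() shown above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical illustration of caching one context object per named output.
 * C stands in for TaskAttemptContext; the factory stands in for the
 * expensive "new Job(...)" construction quoted in the report.
 */
final class NamedContextCache<C> {
    private final Map<String, C> cache = new HashMap<>();
    private final Function<String, C> factory;
    private int creations = 0; // counts how often the expensive path ran

    NamedContextCache(Function<String, C> factory) {
        this.factory = factory;
    }

    /** Returns the cached context for nameOutput, creating it on first use only. */
    C get(String nameOutput) {
        C ctx = cache.get(nameOutput);
        if (ctx == null) {
            ctx = factory.apply(nameOutput); // expensive construction happens once per name
            creations++;
            cache.put(nameOutput, ctx);
        }
        return ctx;
    }

    int creations() {
        return creations;
    }

    public static void main(String[] args) {
        NamedContextCache<Object> cache = new NamedContextCache<>(name -> new Object());
        // Simulate many reduce calls writing to the same named output.
        for (int i = 0; i < 1000; i++) {
            cache.get("text");
        }
        cache.get("seq");
        System.out.println(cache.creations()); // prints 2
    }
}
```

A plain HashMap is used here on the assumption that a single task thread performs the lookups; if that assumption does not hold, a concurrent map would be the safer choice.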