[jira] Assigned: (MAPREDUCE-1888) Streaming overrides user given output key and value types.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi reassigned MAPREDUCE-1888:
---
Assignee: Ravi Gummadi

Streaming overrides user given output key and value types.
--
Key: MAPREDUCE-1888
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1888
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/streaming
Affects Versions: 0.21.0
Reporter: Amareshwari Sriramadasu
Assignee: Ravi Gummadi
Fix For: 0.22.0

The following code in StreamJob.java overrides user given output key and value types.

{code}
idResolver.resolve(conf.get(StreamJobConfig.MAP_OUTPUT, IdentifierResolver.TEXT_ID));
conf.setClass(StreamJobConfig.MAP_OUTPUT_READER_CLASS, idResolver.getOutputReaderClass(), OutputReader.class);
job.setMapOutputKeyClass(idResolver.getOutputKeyClass());
job.setMapOutputValueClass(idResolver.getOutputValueClass());

idResolver.resolve(conf.get(StreamJobConfig.REDUCE_OUTPUT, IdentifierResolver.TEXT_ID));
conf.setClass(StreamJobConfig.REDUCE_OUTPUT_READER_CLASS, idResolver.getOutputReaderClass(), OutputReader.class);
job.setOutputKeyClass(idResolver.getOutputKeyClass());
job.setOutputValueClass(idResolver.getOutputValueClass());
{code}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
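One way to avoid the override reported in MAPREDUCE-1888 is to treat the resolver's types as defaults only. The sketch below is not the committed fix and does not use the Hadoop API: a plain Map stands in for the job configuration, and the class name OutputTypeDefaults and the property names are hypothetical, chosen only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a Map stands in for the Hadoop job configuration.
class OutputTypeDefaults {
    // Illustrative property names; assumptions, not real Hadoop keys.
    static final String OUTPUT_KEY_CLASS = "example.output.key.class";
    static final String OUTPUT_VALUE_CLASS = "example.output.value.class";

    /** Applies resolver-derived types only where the user set nothing. */
    static void applyResolved(Map<String, String> conf,
                              String resolvedKeyClass,
                              String resolvedValueClass) {
        // putIfAbsent leaves any user-supplied value untouched.
        conf.putIfAbsent(OUTPUT_KEY_CLASS, resolvedKeyClass);
        conf.putIfAbsent(OUTPUT_VALUE_CLASS, resolvedValueClass);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // The user explicitly asked for a non-text key type.
        conf.put(OUTPUT_KEY_CLASS, "org.apache.hadoop.io.LongWritable");
        applyResolved(conf, "org.apache.hadoop.io.Text", "org.apache.hadoop.io.Text");
        System.out.println(conf.get(OUTPUT_KEY_CLASS));
        System.out.println(conf.get(OUTPUT_VALUE_CLASS));
    }
}
```

With this shape, the user's key class survives while the unset value class falls back to the resolver's choice.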
[jira] Commented: (MAPREDUCE-1857) Remove unused streaming configuration from src
[ https://issues.apache.org/jira/browse/MAPREDUCE-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12882065#action_12882065 ] Hadoop QA commented on MAPREDUCE-1857: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12447015/patch-1857-1.txt against trunk revision 957283. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 27 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/584/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/584/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/584/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/584/console This message is automatically generated. Remove unused streaming configuration from src -- Key: MAPREDUCE-1857 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1857 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/streaming Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Priority: Trivial Fix For: 0.22.0 Attachments: patch-1857-1.txt, patch-1857.txt The configuration stream.numinputspecs is just set and not read anywhere. It can be removed. 
[jira] Commented: (MAPREDUCE-1857) Remove unused streaming configuration from src
[ https://issues.apache.org/jira/browse/MAPREDUCE-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12882066#action_12882066 ] Amareshwari Sriramadasu commented on MAPREDUCE-1857: Test failure is because of MAPREDUCE-1834. Will check this in. Remove unused streaming configuration from src -- Key: MAPREDUCE-1857 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1857 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/streaming Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Priority: Trivial Fix For: 0.22.0 Attachments: patch-1857-1.txt, patch-1857.txt The configuration stream.numinputspecs is just set and not read anywhere. It can be removed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1699) JobHistory shouldn't be disabled for any reason
[ https://issues.apache.org/jira/browse/MAPREDUCE-1699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated MAPREDUCE-1699:
-
Status: Open (was: Patch Available)

This needs more work...
# Remove JobHistory.disableHistory
# JobHistory.logSubmitted shouldn't throw an IOException; failure to log job-history shouldn't fail a job.

We probably need a map from JobID -> disabled, which should be checked everywhere in lieu of JobHistory.disableHistory.

JobHistory shouldn't be disabled for any reason
---
Key: MAPREDUCE-1699
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1699
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: jobtracker
Affects Versions: 0.20.2
Reporter: Arun C Murthy
Assignee: Krishna Ramachandran
Fix For: 0.20.3
Attachments: mapred-1699-1.patch, mapred-1699-2.patch, mapred-1699-3.patch, mapred-1699.patch

Recently we have had issues with JobTracker silently disabling job-history and starting to keep all completed jobs in memory. This leads to OOM on the JobTracker. We should never do this.
[jira] Updated: (MAPREDUCE-1857) Remove unused streaming configuration from src
[ https://issues.apache.org/jira/browse/MAPREDUCE-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-1857: --- Status: Resolved (was: Patch Available) Resolution: Fixed I just committed this. Remove unused streaming configuration from src -- Key: MAPREDUCE-1857 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1857 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/streaming Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Priority: Trivial Fix For: 0.22.0 Attachments: patch-1857-1.txt, patch-1857.txt The configuration stream.numinputspecs is just set and not read anywhere. It can be removed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1864) PipeMapRed.java has uninitialized members log_ and LOGNAME
[ https://issues.apache.org/jira/browse/MAPREDUCE-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated MAPREDUCE-1864:
---
Attachment: patch-1864.txt

Patch removes uninitialized and unused variables: fs_, debug_, debugFailEarly_, debugFailDuring_, debugFailLate_, jobLog_, log_, LOGNAME and mapredKey_.

bq. If some debug statements are needed, we could add them as..
Two of the debug statements do not look relevant, so I removed them. I replaced the other one with LOG.debug.

bq. mapredKey_ is not set at all but referenced in getContext(). We need to set it appropriately.
The variable has never been set since the first version, and the next line in getContext() already adds the last output, so I removed it. I also removed new Date() from getContext(), because getContext() is called from a log message and the logger will prefix the log with the current time anyway.

PipeMapRed.java has uninitialized members log_ and LOGNAME
---
Key: MAPREDUCE-1864
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1864
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/streaming
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
Fix For: 0.22.0
Attachments: patch-1864.txt

PipeMapRed.java has members log_ and LOGNAME, which are never initialized and they are used in code for logging in several places. They should be removed and PipeMapRed should use commons LogFactory and Log for logging. This would improve code maintainability. Also, as per [comment | https://issues.apache.org/jira/browse/MAPREDUCE-1851?focusedCommentId=12878530&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12878530], stream.joblog_ configuration property can be removed.
[jira] Updated: (MAPREDUCE-1864) PipeMapRed.java has uninitialized members log_ and LOGNAME
[ https://issues.apache.org/jira/browse/MAPREDUCE-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-1864: --- Status: Patch Available (was: Open) PipeMapRed.java has uninitialized members log_ and LOGNAME --- Key: MAPREDUCE-1864 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1864 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/streaming Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Fix For: 0.22.0 Attachments: patch-1864.txt PipeMapRed.java has members log_ and LOGNAME, which are never initialized and they are used in code for logging in several places. They should be removed and PipeMapRed should use commons LogFactory and Log for logging. This would improve code maintainability. Also, as per [comment | https://issues.apache.org/jira/browse/MAPREDUCE-1851?focusedCommentId=12878530page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12878530], stream.joblog_ configuration property can be removed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1857) Remove unused streaming configuration from src
[ https://issues.apache.org/jira/browse/MAPREDUCE-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-1857: --- Hadoop Flags: [Reviewed] Remove unused streaming configuration from src -- Key: MAPREDUCE-1857 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1857 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/streaming Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Priority: Trivial Fix For: 0.22.0 Attachments: patch-1857-1.txt, patch-1857.txt The configuration stream.numinputspecs is just set and not read anywhere. It can be removed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1851) Document configuration parameters in streaming
[ https://issues.apache.org/jira/browse/MAPREDUCE-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-1851: --- Hadoop Flags: [Reviewed] Document configuration parameters in streaming -- Key: MAPREDUCE-1851 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1851 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/streaming, documentation Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Fix For: 0.22.0 Attachments: patch-1851-1.txt, patch-1851-2.txt, patch-1851.txt There are several streaming options such as stream.map.output.field.separator, stream.num.map.output.key.fields, stream.map.input.field.separator, stream.reduce.input.field.separator, stream.map.input.ignoreKey, stream.non.zero.exit.is.failure etc which are spread everywhere. These should be documented at single place with description and default-value. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1853) MultipleOutputs does not cache TaskAttemptContext
[ https://issues.apache.org/jira/browse/MAPREDUCE-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated MAPREDUCE-1853:
---
Hadoop Flags: [Reviewed]

MultipleOutputs does not cache TaskAttemptContext
-
Key: MAPREDUCE-1853
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1853
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.21.0
Environment: OSX 10.6 java6
Reporter: Torsten Curdt
Priority: Critical
Fix For: 0.21.0
Attachments: cache-task-attempts.diff

In MultipleOutputs there is

[code]
private TaskAttemptContext getContext(String nameOutput) throws IOException {
  // The following trick leverages the instantiation of a record writer via
  // the job thus supporting arbitrary output formats.
  Job job = new Job(context.getConfiguration());
  job.setOutputFormatClass(getNamedOutputFormatClass(context, nameOutput));
  job.setOutputKeyClass(getNamedOutputKeyClass(context, nameOutput));
  job.setOutputValueClass(getNamedOutputValueClass(context, nameOutput));
  TaskAttemptContext taskContext = new TaskAttemptContextImpl(job.getConfiguration(), context.getTaskAttemptID());
  return taskContext;
}
[code]

so for every reduce call it creates a new Job instance ...which creates a new LocalJobRunner. That does not sound like a good idea. You end up with a flood of

jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized

This should probably also be added to 0.22.
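The caching that the MAPREDUCE-1853 title asks for can be sketched as a map keyed by the named output. This is a minimal illustration, not the attached cache-task-attempts.diff: the class ContextCache and its factory argument are hypothetical, and the type parameter C only stands in for TaskAttemptContext so that the example runs without Hadoop on the classpath.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical cache; C stands in for TaskAttemptContext.
class ContextCache<C> {
    private final Map<String, C> cache = new HashMap<>();
    private final Function<String, C> factory;
    private int creations = 0;

    /** factory plays the role of the expensive "new Job(...)" construction. */
    ContextCache(Function<String, C> factory) {
        this.factory = factory;
    }

    /** Returns the cached context for nameOutput, building it at most once. */
    C get(String nameOutput) {
        C ctx = cache.get(nameOutput);
        if (ctx == null) {
            ctx = factory.apply(nameOutput);
            creations++;
            cache.put(nameOutput, ctx);
        }
        return ctx;
    }

    /** Number of times the factory was actually invoked. */
    int creations() {
        return creations;
    }
}
```

Repeated calls for the same named output then reuse one context, so the expensive construction happens once per output name rather than once per reduce call.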
[jira] Updated: (MAPREDUCE-1318) Document exit codes and their meanings used by linux task controller
[ https://issues.apache.org/jira/browse/MAPREDUCE-1318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-1318: --- Status: Open (was: Patch Available) The patch needs to be updated to trunk. Document exit codes and their meanings used by linux task controller Key: MAPREDUCE-1318 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1318 Project: Hadoop Map/Reduce Issue Type: Improvement Components: documentation Reporter: Sreekanth Ramakrishnan Assignee: Anatoli Fomenko Fix For: 0.21.0 Attachments: HADOOP-5912.1.patch, MAPREDUCE-1318.1.patch, MAPREDUCE-1318.2.patch, MAPREDUCE-1318.patch Currently, the linux task controller binary uses a set of exit codes, which are not documented. These should be documented.
[jira] Commented: (MAPREDUCE-1744) DistributedCache creates its own FileSytem instance when adding a file/archive to the path
[ https://issues.apache.org/jira/browse/MAPREDUCE-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882093#action_12882093 ] Amareshwari Sriramadasu commented on MAPREDUCE-1744: Shouldn't the implementation of DistributedCache.addFileToClassPath(Path file, Configuration conf) call file.getFileSystem(conf) instead of FileSystem.get(conf)? A similar change would be needed for DistributedCache.addArchiveToClassPath(Path archive, Configuration conf) also. Also, the javadoc should not have two usage messages for deprecation. That will be confusing for users. DistributedCache creates its own FileSytem instance when adding a file/archive to the path -- Key: MAPREDUCE-1744 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1744 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Dick King Assignee: Krishna Ramachandran Attachments: BZ-3503564--2010-05-06.patch, h1744.patch, mapred-1744-1.patch, mapred-1744-2.patch, mapred-1744-3.patch, mapred-1744.patch, MAPREDUCE-1744.patch According to the contract of {{UserGroupInformation.doAs()}} the only required operations within the {{doAs()}} block are the creation of a {{JobClient}} or getting a {{FileSystem}}. The {{DistributedCache.add(File/Archive)ToClasspath()}} methods create a {{FileSystem}} instance outside of the {{doAs()}} block; this {{FileSystem}} instance is not in the scope of the proxy user but of the superuser, and permissions may make the method fail. One option is to overload the methods above to receive a filesystem. Another option is to obtain the {{FileSystem}} within a {{doAs()}} block; for this it would be required to have the proxy user set in the passed configuration. The second option seems nicer, but I don't know if the proxy user is available as a property in the jobconf.
[jira] Updated: (MAPREDUCE-1741) Automate the test scenario of job related files are moved from history directory to done directory
[ https://issues.apache.org/jira/browse/MAPREDUCE-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Iyappan Srinivasan updated MAPREDUCE-1741: -- Attachment: TestJobHistoryLocation-ydist-security-patch.txt Addressing Vinay's comments. Automate the test scenario of job related files are moved from history directory to done directory --- Key: MAPREDUCE-1741 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1741 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 0.22.0 Reporter: Iyappan Srinivasan Fix For: 0.22.0 Attachments: TestJobHistoryLocation-ydist-security-patch.txt, TestJobHistoryLocation-ydist-security-patch.txt, TestJobHistoryLocation.patch, TestJobHistoryLocation.patch, TestJobHistoryLocation.patch Job related files are moved from history directory to done directory, when 1) Job succeeds 2) Job is killed 3) When 100 files are put in the done directory 4) When multiple jobs are completed at the same time, some successful, some failed. Also, two files, conf.xml and job files should be present in the done directory. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1848) Put number of speculative, data local, rack local tasks in JobTracker metrics
[ https://issues.apache.org/jira/browse/MAPREDUCE-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12882111#action_12882111 ] Hadoop QA commented on MAPREDUCE-1848: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12447894/MAPREDUCE-1848-20100623.txt against trunk revision 957283. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 9 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/265/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/265/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/265/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/265/console This message is automatically generated. 
Put number of speculative, data local, rack local tasks in JobTracker metrics - Key: MAPREDUCE-1848 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1848 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobtracker Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 Attachments: MAPREDUCE-1848-20100614.txt, MAPREDUCE-1848-20100617.txt, MAPREDUCE-1848-20100623.txt It would be nice to collect this information in JobTracker metrics.
[jira] Commented: (MAPREDUCE-1887) MRAsyncDiskService does not properly absolutize volume root paths
[ https://issues.apache.org/jira/browse/MAPREDUCE-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12882132#action_12882132 ] Hadoop QA commented on MAPREDUCE-1887: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12447778/MAPREDUCE-1887.3.patch against trunk revision 957437. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/585/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/585/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/585/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/585/console This message is automatically generated. MRAsyncDiskService does not properly absolutize volume root paths - Key: MAPREDUCE-1887 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1887 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Aaron Kimball Assignee: Aaron Kimball Attachments: MAPREDUCE-1887.2.patch, MAPREDUCE-1887.3.patch, MAPREDUCE-1887.patch In MRAsyncDiskService, volume names are sometimes specified as relative paths, which are not converted to absolute paths. 
This can cause errors of the form cannot delete /full/path/to/foo since it is outside of relative/volume/root even though the actual path is inside the root. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
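The failure mode described in MAPREDUCE-1887 can be reproduced with plain java.nio.file paths. This is a minimal sketch, not the MRAsyncDiskService code or the attached patch: the class VolumeRootCheck, its method names, and the working directory passed in are assumptions made for illustration. It resolves a possibly relative volume root against a working directory before testing containment.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; not part of MRAsyncDiskService.
class VolumeRootCheck {
    /** Resolves a possibly relative volume root against workingDir. */
    static Path absolutize(Path workingDir, String volume) {
        // resolve() returns the argument unchanged when it is already absolute.
        return workingDir.resolve(volume).normalize();
    }

    /** True when target lies under the (already absolute) root. */
    static boolean isInside(Path absoluteRoot, Path target) {
        return target.normalize().startsWith(absoluteRoot);
    }

    public static void main(String[] args) {
        Path cwd = Paths.get("/full/path");
        Path target = Paths.get("/full/path/relative/volume/root/foo");
        // Comparing against the raw relative root fails to match.
        System.out.println(target.startsWith(Paths.get("relative/volume/root")));
        // Comparing against the absolutized root matches.
        System.out.println(isInside(absolutize(cwd, "relative/volume/root"), target));
    }
}
```

The example paths assume a Unix-style file system; the point is only that the containment test is meaningful once both sides are absolute.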
[jira] Commented: (MAPREDUCE-1823) Reduce the number of calls of HarFileSystem.getFileStatus in RaidNode
[ https://issues.apache.org/jira/browse/MAPREDUCE-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12882148#action_12882148 ] Hadoop QA commented on MAPREDUCE-1823: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12447914/MAPREDUCE-1823.txt against trunk revision 957437. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/266/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/266/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/266/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/266/console This message is automatically generated. 
Reduce the number of calls of HarFileSystem.getFileStatus in RaidNode - Key: MAPREDUCE-1823 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1823 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 Attachments: MAPREDUCE-1823.txt RaidNode makes lots of calls of HarFileSystem.getFileStatus. This method fetches information from DataNode so it is slow. It becomes the bottleneck of the RaidNode. It will be nice if we can make this more efficient. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1831) Delete the co-located replicas when raiding file
[ https://issues.apache.org/jira/browse/MAPREDUCE-1831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882225#action_12882225 ] Scott Chen commented on MAPREDUCE-1831: --- Thanks, Dhruba. Yes, this one makes sense only when MAPREDUCE-1861 is available. Delete the co-located replicas when raiding file Key: MAPREDUCE-1831 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1831 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/raid Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 Attachments: MAPREDUCE-1831.20100610.txt, MAPREDUCE-1831.txt, MAPREDUCE-1831.v1.1.txt In raid, it is good to have the blocks on the same stripe located on different machines. This way, when one machine is down, it does not break two blocks on the stripe. By doing this, we can decrease the block error probability in raid from O(p^3) to O(p^4), which can be a huge improvement (where p is the replica missing probability). One way to do this is to add a new BlockPlacementPolicy which deletes the replicas that are co-located. So when raiding the file, we can make the remaining replicas live on different machines.
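To give a sense of scale for the O(p^3) to O(p^4) claim in MAPREDUCE-1831, here is a rough worked example. It ignores the constant factors hidden by the O notation, and the value p = 0.01 is an assumption chosen only for illustration.

```latex
\frac{p^{4}}{p^{3}} = p,
\qquad
p = 10^{-2} \;\Rightarrow\; p^{3} = 10^{-6}, \quad p^{4} = 10^{-8}
```

Under that assumption the leading term of the block error probability shrinks by a factor of about 1/p = 100.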
[jira] Updated: (MAPREDUCE-1889) [herriot] Ability to restart a single node for pushconfig
[ https://issues.apache.org/jira/browse/MAPREDUCE-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Rajagopalan updated MAPREDUCE-1889: -- Attachment: restartDaemon.txt [herriot] Ability to restart a single node for pushconfig - Key: MAPREDUCE-1889 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1889 Project: Hadoop Map/Reduce Issue Type: New Feature Components: test Reporter: Balaji Rajagopalan Assignee: Balaji Rajagopalan Attachments: restartDaemon.txt Right now pushconfig is supported only at the cluster level. This jira will introduce support for it at the node level.
[jira] Updated: (MAPREDUCE-1854) [herriot] Automate health script system test
[ https://issues.apache.org/jira/browse/MAPREDUCE-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Rajagopalan updated MAPREDUCE-1854: -- Attachment: health_script_7.txt Addresses all the code review comments. [herriot] Automate health script system test Key: MAPREDUCE-1854 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1854 Project: Hadoop Map/Reduce Issue Type: New Feature Components: test Environment: Herriot framework Reporter: Balaji Rajagopalan Assignee: Balaji Rajagopalan Attachments: health_script_5.txt, health_script_7.txt Original Estimate: 120h Remaining Estimate: 120h There are three scenarios. 1. Induce an error from the health script and verify that the task tracker is blacklisted. 2. Make the health script time out and verify that the task tracker is blacklisted. 3. Make an error in the health script path and make sure the task tracker stays healthy.
[jira] Commented: (MAPREDUCE-1864) PipeMapRed.java has uninitialized members log_ and LOGNAME
[ https://issues.apache.org/jira/browse/MAPREDUCE-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12882243#action_12882243 ] Hadoop QA commented on MAPREDUCE-1864: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12447931/patch-1864.txt against trunk revision 957437. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/586/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/586/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/586/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/586/console This message is automatically generated. 
PipeMapRed.java has uninitialized members log_ and LOGNAME --- Key: MAPREDUCE-1864 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1864 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/streaming Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Fix For: 0.22.0 Attachments: patch-1864.txt PipeMapRed.java has members log_ and LOGNAME, which are never initialized and they are used in code for logging in several places. They should be removed and PipeMapRed should use commons LogFactory and Log for logging. This would improve code maintainability. Also, as per [comment | https://issues.apache.org/jira/browse/MAPREDUCE-1851?focusedCommentId=12878530page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12878530], stream.joblog_ configuration property can be removed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1699) JobHistory shouldn't be disabled for any reason
[ https://issues.apache.org/jira/browse/MAPREDUCE-1699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882320#action_12882320 ]

Krishna Ramachandran commented on MAPREDUCE-1699:
-
Yes, this is good. I have added a map, and logSubmitted will initialize it in case of an error for that job and use that for subsequent logging (failed, killed, etc.).

Will update the patch once testing is completed.

I am not able to take out disableHistory entirely because of this code (the javadoc is incorrect: there is no jobId param):

    /**
     * Logs history meta-info to the history file. This needs to be called once
     * per history file.
     * @param jobId job id, assigned by jobtracker.
     */
    static void logMetaInfo(ArrayList<PrintWriter> writers){
      if (!disableHistory){
        if (null != writers){
          JobHistory.log(writers, RecordTypes.Meta,
              new Keys[] {Keys.VERSION},
              new String[] {String.valueOf(VERSION)});
        }
      }
    }

Also, there are a couple of (public) methods that use this flag, though I do not see where they are called:

    /**
     * Returns history disable status. by default history is enabled so this
     * method returns false.
     * @return true if history logging is disabled, false otherwise.
     */
    public static boolean isDisableHistory() {
      return disableHistory;
    }

    /**
     * Enable/disable history logging. Default value is false, so history
     * is enabled by default.
     * @param disableHistory true if history should be disabled, false otherwise.
     */
    public static void setDisableHistory(boolean disableHistory) {
      JobHistory.disableHistory = disableHistory;
    }

JobHistory shouldn't be disabled for any reason
---
Key: MAPREDUCE-1699
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1699
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: jobtracker
Affects Versions: 0.20.2
Reporter: Arun C Murthy
Assignee: Krishna Ramachandran
Fix For: 0.20.3
Attachments: mapred-1699-1.patch, mapred-1699-2.patch, mapred-1699-3.patch, mapred-1699.patch

Recently we have had issues with JobTracker silently disabling job-history and starting to keep all completed jobs in memory. This leads to OOM on the JobTracker. We should never do this.
[jira] Commented: (MAPREDUCE-1845) FairScheduler.tasksToPeempt() can return negative number
[ https://issues.apache.org/jira/browse/MAPREDUCE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882348#action_12882348 ] Matei Zaharia commented on MAPREDUCE-1845: -- This looks good. Thanks for adding the unit tests too. We should check this into 0.21 as well, if that's not out yet. The only concern I have is that the existing unit tests, such as testMinAndFairSharePreemption, work correctly even without the fix. This seems to be because those tests either exercise only one type of preemption (min-share or fair-share), or they place the over-scheduled jobs in pools that have no min share set. This means that one of the values in max(tasksDueToMinShare, tasksDueToFairShare) is zero. Would you mind creating a second copy of testMinAndFairSharePreemption where job 1 is in a pool with a min share set (i.e. not in the default pool)? A minor comment on clarity: rather than adding the line tasksToPreempt = tasksToPreempt < 0 ? 0 : tasksToPreempt, it would be better to make sure that tasksDueToMinShare and tasksDueToFairShare are themselves never negative. You can do it by adding a max(0, ...) on the lines where they are computed (for example, tasksDueToMinShare = Math.max(0, target - sched.getRunningTasks())). FairScheduler.tasksToPeempt() can return negative number Key: MAPREDUCE-1845 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1845 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/fair-share Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 Attachments: MAPREDUCE-1845.20100717.txt This method can return a negative number. This will cause the preemption to under-preempt. The bug was discovered by Joydeep. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
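The clamping suggested in the comment above can be sketched roughly as follows. This is a minimal illustration, not the FairScheduler code: the class name, the method signature, and the three int parameters are assumptions made for the example; only tasksDueToMinShare, tasksDueToFairShare, and the Math.max(0, ...) idea come from the comment.

```java
// Hypothetical sketch: class name, method signature and parameters are
// illustrative and do not come from FairScheduler itself.
final class PreemptionSketch {

    /**
     * Number of tasks to preempt for one scheduler entry.
     * Each deficit is clamped at zero where it is computed, so the final
     * max() can never be negative and needs no extra correction.
     */
    static int tasksToPreempt(int minShareTarget, int fairShareTarget, int runningTasks) {
        int tasksDueToMinShare = Math.max(0, minShareTarget - runningTasks);
        int tasksDueToFairShare = Math.max(0, fairShareTarget - runningTasks);
        return Math.max(tasksDueToMinShare, tasksDueToFairShare);
    }
}
```

With both targets below the running count (for example targets 5 and 3 with 10 running tasks) the result is 0 rather than a negative value.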
[jira] Updated: (MAPREDUCE-1309) I want to change the rumen job trace generator to use a more modular internal structure, to allow for more input log formats
[ https://issues.apache.org/jira/browse/MAPREDUCE-1309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Tang updated MAPREDUCE-1309: - Attachment: rumen-yhadoop-20.patch Backport to hadoop 20.1xx branch. Not to be committed. I want to change the rumen job trace generator to use a more modular internal structure, to allow for more input log formats - Key: MAPREDUCE-1309 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1309 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Dick King Assignee: Dick King Fix For: 0.21.0 Attachments: demuxer-plus-concatenated-files--2009-12-21.patch, demuxer-plus-concatenated-files--2010-01-06.patch, demuxer-plus-concatenated-files--2010-01-08-b.patch, demuxer-plus-concatenated-files--2010-01-08-c.patch, demuxer-plus-concatenated-files--2010-01-08-d.patch, demuxer-plus-concatenated-files--2010-01-08.patch, demuxer-plus-concatenated-files--2010-01-11.patch, mapreduce-1309--2009-01-14-a.patch, mapreduce-1309--2009-01-14.patch, mapreduce-1309--2010-01-20.patch, mapreduce-1309--2010-02-03.patch, mapreduce-1309--2010-02-04.patch, mapreduce-1309--2010-02-10.patch, mapreduce-1309--2010-02-12.patch, mapreduce-1309--2010-02-16-a.patch, mapreduce-1309--2010-02-16.patch, mapreduce-1309--2010-02-17.patch, rumen-yhadoop-20.patch There are two orthogonal questions to answer when processing a job tracker log: how will the logs and the xml configuration files be packaged, and in which release of hadoop map/reduce were the logs generated? The existing rumen only has a couple of answers to this question. The new engine will handle three answers to the version question: 0.18, 0.20 and current, and two answers to the packaging question: separate files with names derived from the job ID, and concatenated files with a header between sections [used for easier file interchange]. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1887) MRAsyncDiskService does not properly absolutize volume root paths
[ https://issues.apache.org/jira/browse/MAPREDUCE-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882372#action_12882372 ] Zheng Shao commented on MAPREDUCE-1887: --- Can you take a look at the failed contrib tests? MRAsyncDiskService does not properly absolutize volume root paths - Key: MAPREDUCE-1887 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1887 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Aaron Kimball Assignee: Aaron Kimball Attachments: MAPREDUCE-1887.2.patch, MAPREDUCE-1887.3.patch, MAPREDUCE-1887.patch In MRAsyncDiskService, volume names are sometimes specified as relative paths, which are not converted to absolute paths. This can cause errors of the form "cannot delete /full/path/to/foo since it is outside of relative/volume/root" even though the actual path is inside the root. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
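The failure mode described in this issue can be reproduced in miniature with plain java.nio.file paths. The sketch below is only an analogy: MRAsyncDiskService works with Hadoop's own path and file system classes, and the class and method names here (VolumePathSketch, absolutize, isInside) are made up for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical stand-in using java.nio.file instead of Hadoop's classes.
final class VolumePathSketch {

    /** Converts a possibly relative volume name into an absolute, normalized root. */
    static Path absolutize(String volume) {
        return Paths.get(volume).toAbsolutePath().normalize();
    }

    /**
     * Containment check as a prefix test. It only gives the right answer when
     * root is itself absolute: an absolute candidate never starts with a
     * relative root.
     */
    static boolean isInside(Path root, Path candidate) {
        return candidate.toAbsolutePath().normalize().startsWith(root);
    }
}
```

Comparing an absolute candidate against the raw relative root reports "outside", while the same comparison against absolutize(volume) reports "inside", which is the behavior the fix is after.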
[jira] Commented: (MAPREDUCE-1845) FairScheduler.tasksToPeempt() can return negative number
[ https://issues.apache.org/jira/browse/MAPREDUCE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882373#action_12882373 ] Scott Chen commented on MAPREDUCE-1845: --- Thanks. Good suggestions. I will update the patch. FairScheduler.tasksToPeempt() can return negative number Key: MAPREDUCE-1845 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1845 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/fair-share Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 Attachments: MAPREDUCE-1845.20100717.txt This method can return a negative number. This will cause the preemption to under-preempt. The bug was discovered by Joydeep. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-1893) Multiple reducers for Slive
Multiple reducers for Slive --- Key: MAPREDUCE-1893 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1893 Project: Hadoop Map/Reduce Issue Type: Improvement Components: benchmarks, test Affects Versions: 0.22.0 Reporter: Konstantin Shvachko Assignee: Konstantin Shvachko Fix For: 0.22.0 Slive currently uses a single reducer. It could use multiple ones. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-1894) DistributedRaidFileSystem.readFully() does not return
DistributedRaidFileSystem.readFully() does not return - Key: MAPREDUCE-1894 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1894 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/raid Reporter: Ramkumar Vadali DistributedRaidFileSystem.readFully() has a while(true) loop with no return. The read(*) functions do not have this problem. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
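For comparison, a readFully loop that does terminate usually tracks how many bytes are still outstanding and returns once that count reaches zero. The following is a generic sketch over java.io.InputStream, not the DistributedRaidFileSystem code; the class name ReadFullySketch and the exact signature are assumptions for the example.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch of a terminating readFully; not the contrib/raid code.
final class ReadFullySketch {

    /**
     * Reads exactly len bytes into buf starting at off.
     * The loop condition is the exit: it ends when all bytes have arrived,
     * and a premature end of stream is reported instead of spinning forever.
     */
    static void readFully(InputStream in, byte[] buf, int off, int len) throws IOException {
        int done = 0;
        while (done < len) {
            int n = in.read(buf, off + done, len - done);
            if (n < 0) {
                throw new EOFException("stream ended after " + done + " of " + len + " bytes");
            }
            done += n;
        }
    }
}
```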
[jira] Commented: (MAPREDUCE-1893) Multiple reducers for Slive
[ https://issues.apache.org/jira/browse/MAPREDUCE-1893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882398#action_12882398 ] Konstantin Shvachko commented on MAPREDUCE-1893: Slive maps output different stats of the operations performed by the test. The stats are currently aggregated by a single reducer, which may run for a long time if the amount of generated data is large. I propose to add a parameter to Slive args, which specifies the number of reducers R. Then, for each output stat, SlivePartitioner calculates a hash value of the operation type, modulo R. This defines the reducer number. Multiple reducers for Slive --- Key: MAPREDUCE-1893 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1893 Project: Hadoop Map/Reduce Issue Type: Improvement Components: benchmarks, test Affects Versions: 0.22.0 Reporter: Konstantin Shvachko Assignee: Konstantin Shvachko Fix For: 0.22.0 Slive currently uses a single reducer. It could use multiple ones. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
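The partitioning rule proposed above (hash of the operation type, modulo R) could look roughly like this. It is a standalone sketch with no Hadoop dependency: the class and method names are hypothetical, and treating the operation type as a String is an assumption. In a real job the same arithmetic would presumably sit inside a Partitioner's getPartition method.

```java
// Hypothetical sketch of the "hash of operation type, modulo R" rule.
final class SlivePartitionSketch {

    /**
     * Maps an operation type to a reducer index in [0, numReducers).
     * Masking with Integer.MAX_VALUE clears the sign bit, so a negative
     * hashCode cannot produce a negative partition number.
     */
    static int partitionFor(String operationType, int numReducers) {
        return (operationType.hashCode() & Integer.MAX_VALUE) % numReducers;
    }
}
```

Because the index depends only on the operation type, all stats for one operation type land on the same reducer, and with R = 1 everything maps to reducer 0 as before.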
[jira] Commented: (MAPREDUCE-1699) JobHistory shouldn't be disabled for any reason
[ https://issues.apache.org/jira/browse/MAPREDUCE-1699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882402#action_12882402 ] Krishna Ramachandran commented on MAPREDUCE-1699: - In trunk, I believe all job state logging is handled in jobHistory.logEvent(jobId):

{
  /**
   * Method to log the specified event
   * @param event The event to log
   * @param id The Job ID of the event
   */
  public void logEvent(HistoryEvent event, JobID id) {
    try {
      final MetaInfo mi = fileMap.get(id);
      if (mi != null) {
        mi.writeEvent(event);
      }
    } catch (IOException e) {
      LOG.error("Error Logging event, " + e.getMessage());
    }
  }
}

If logging fails, it just logs a message. This is a lot cleaner: disableHistory is never used. JobHistory shouldn't be disabled for any reason --- Key: MAPREDUCE-1699 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1699 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.2 Reporter: Arun C Murthy Assignee: Krishna Ramachandran Fix For: 0.20.3 Attachments: mapred-1699-1.patch, mapred-1699-2.patch, mapred-1699-3.patch, mapred-1699.patch Recently we have had issues with JobTracker silently disabling job-history and starting to keep all completed jobs in memory. This leads to OOM on the JobTracker. We should never do this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1893) Multiple reducers for Slive
[ https://issues.apache.org/jira/browse/MAPREDUCE-1893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-1893: --- Attachment: SliveMultiR.patch In the patch:
- Introduced SlivePartitioner, and changed code to collect reports from multiple output files.
- Made Slive a Tool, so that people could specify generic options.
- Updated TestSlive to run with 2 mappers and 2 reducers. Also tuned it up to run in eclipse, and fixed some stream closing issue.
- Improved some JavaDoc and log messages.
- Updated the design document in https://issues.apache.org/jira/secure/attachment/12448004/SLiveTest.pdf
Multiple reducers for Slive --- Key: MAPREDUCE-1893 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1893 Project: Hadoop Map/Reduce Issue Type: Improvement Components: benchmarks, test Affects Versions: 0.22.0 Reporter: Konstantin Shvachko Assignee: Konstantin Shvachko Fix For: 0.22.0 Attachments: SliveMultiR.patch Slive currently uses a single reducer. It could use multiple ones. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1893) Multiple reducers for Slive
[ https://issues.apache.org/jira/browse/MAPREDUCE-1893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-1893: --- Status: Patch Available (was: Open) Multiple reducers for Slive --- Key: MAPREDUCE-1893 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1893 Project: Hadoop Map/Reduce Issue Type: Improvement Components: benchmarks, test Affects Versions: 0.22.0 Reporter: Konstantin Shvachko Assignee: Konstantin Shvachko Fix For: 0.22.0 Attachments: SliveMultiR.patch Slive currently uses a single reducer. It could use multiple ones. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1887) MRAsyncDiskService does not properly absolutize volume root paths
[ https://issues.apache.org/jira/browse/MAPREDUCE-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882412#action_12882412 ] Aaron Kimball commented on MAPREDUCE-1887: -- The only failing test has failed for the last 37 builds. Unrelated to this patch. I think we're good. MRAsyncDiskService does not properly absolutize volume root paths - Key: MAPREDUCE-1887 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1887 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Aaron Kimball Assignee: Aaron Kimball Attachments: MAPREDUCE-1887.2.patch, MAPREDUCE-1887.3.patch, MAPREDUCE-1887.patch In MRAsyncDiskService, volume names are sometimes specified as relative paths, which are not converted to absolute paths. This can cause errors of the form "cannot delete /full/path/to/foo since it is outside of relative/volume/root" even though the actual path is inside the root. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1887) MRAsyncDiskService does not properly absolutize volume root paths
[ https://issues.apache.org/jira/browse/MAPREDUCE-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Shao updated MAPREDUCE-1887: -- Status: Resolved (was: Patch Available) Hadoop Flags: [Reviewed] Release Note: MAPREDUCE-1887. MRAsyncDiskService now properly absolutizes volume root paths. (Aaron Kimball via zshao) Fix Version/s: 0.22.0 Resolution: Fixed Committed revision 957772. Thanks Aaron! MRAsyncDiskService does not properly absolutize volume root paths - Key: MAPREDUCE-1887 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1887 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Aaron Kimball Assignee: Aaron Kimball Fix For: 0.22.0 Attachments: MAPREDUCE-1887.2.patch, MAPREDUCE-1887.3.patch, MAPREDUCE-1887.patch In MRAsyncDiskService, volume names are sometimes specified as relative paths, which are not converted to absolute paths. This can cause errors of the form "cannot delete /full/path/to/foo since it is outside of relative/volume/root" even though the actual path is inside the root. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1699) JobHistory shouldn't be disabled for any reason
[ https://issues.apache.org/jira/browse/MAPREDUCE-1699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krishna Ramachandran updated MAPREDUCE-1699: Attachment: mapred-1699-5.patch Arun, I took out the references to disableHistory. I do not believe we need a separate structure to store disabled_state for each jobId. In jobHistory.logSubmitted (the entry point for history logging), a writer is initialized for each jobID. If fileManager.addWriter(jobId, writer) fails, we log an error in the catch block (just like trunk). During subsequent writing of all log events (started/failed/killed/completed/inited/logInfo ...) we check that fileManager.getWriter(jobId) != null (it will be null if logSubmitted failed to initialize). This should be sufficient and is similar to what you proposed (jobId-disabled). I am modifying the patch accordingly. Let me know! JobHistory shouldn't be disabled for any reason --- Key: MAPREDUCE-1699 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1699 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.2 Reporter: Arun C Murthy Assignee: Krishna Ramachandran Fix For: 0.20.3 Attachments: mapred-1699-1.patch, mapred-1699-2.patch, mapred-1699-3.patch, mapred-1699-5.patch, mapred-1699.patch Recently we have had issues with JobTracker silently disabling job-history and starting to keep all completed jobs in memory. This leads to OOM on the JobTracker. We should never do this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
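The per-job approach described in this update (register a writer in logSubmitted, then treat a missing writer as "history unavailable for this job only") can be sketched as below. This is an illustration under assumptions, not the patch: the class JobHistorySketch, the String job IDs, and the boolean return value are invented for the example, and a null writer stands in for a failed addWriter call.

```java
import java.io.PrintWriter;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a failure affects one job's history, never a global flag.
final class JobHistorySketch {
    private final Map<String, PrintWriter> writers = new ConcurrentHashMap<>();

    /** Entry point: registers the writer for a job, or logs an error if none could be created. */
    void logSubmitted(String jobId, PrintWriter writer) {
        if (writer == null) {
            // Stand-in for the catch block: report the problem and move on.
            System.err.println("Failed to initialize history writer for " + jobId);
            return;
        }
        writers.put(jobId, writer);
        writer.println(jobId + " SUBMITTED");
    }

    /** Later events are written only if logSubmitted succeeded for this job. */
    boolean logEvent(String jobId, String event) {
        PrintWriter writer = writers.get(jobId);
        if (writer == null) {
            return false; // this job has no history; other jobs are unaffected
        }
        writer.println(jobId + " " + event);
        return true;
    }
}
```

The point of the design is that there is no shared disableHistory switch left to flip, so one bad job cannot silently turn off history for the whole JobTracker.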
[jira] Updated: (MAPREDUCE-1850) Include job submit host information (name and ip) in jobconf and jobdetails display
[ https://issues.apache.org/jira/browse/MAPREDUCE-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krishna Ramachandran updated MAPREDUCE-1850: Attachment: mapred-1850-4.patch I think I got all suggested changes (previous comment) Include job submit host information (name and ip) in jobconf and jobdetails display --- Key: MAPREDUCE-1850 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1850 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.22.0 Reporter: Krishna Ramachandran Assignee: Krishna Ramachandran Attachments: mapred-1850-1.patch, mapred-1850-2.patch, mapred-1850-3.patch, mapred-1850-4.patch, mapred-1850.patch, mapred-1850.patch Enhancement to identify the source (submit host and ip) of a job request. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1850) Include job submit host information (name and ip) in jobconf and jobdetails display
[ https://issues.apache.org/jira/browse/MAPREDUCE-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-1850: --- Status: Open (was: Patch Available) JobConf still uses the configuration names, not String constants. Also, Javadoc for JobConf.getJobSubmitHostAddress() still has links. The new configuration should be added in MRJobConfig.java, not MRConfig. Include job submit host information (name and ip) in jobconf and jobdetails display --- Key: MAPREDUCE-1850 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1850 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.22.0 Reporter: Krishna Ramachandran Assignee: Krishna Ramachandran Attachments: mapred-1850-1.patch, mapred-1850-2.patch, mapred-1850-3.patch, mapred-1850-4.patch, mapred-1850.patch, mapred-1850.patch Enhancement to identify the source (submit host and ip) of a job request. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1863) [Rumen] Null failedMapAttemptCDFs in job traces generated by Rumen
[ https://issues.apache.org/jira/browse/MAPREDUCE-1863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amar Kamat updated MAPREDUCE-1863: -- Attachment: rumen-npe-v1.1-bin.patch Attaching a patch that includes the binary diffs of the gold-standard files. [Rumen] Null failedMapAttemptCDFs in job traces generated by Rumen -- Key: MAPREDUCE-1863 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1863 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0 Reporter: Amar Kamat Assignee: Amar Kamat Fix For: 0.22.0 Attachments: counters-test-trace.json.gz, dispatch-trace-output.json.gz, rumen-npe-v1.1-bin.patch, rumen-npe-v1.1.patch All the traces generated by Rumen for jobs having failed task attempts have a null value for failedMapAttemptCDFs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1863) [Rumen] Null failedMapAttemptCDFs in job traces generated by Rumen
[ https://issues.apache.org/jira/browse/MAPREDUCE-1863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amar Kamat updated MAPREDUCE-1863: -- Status: Patch Available (was: Open) Running through hudson. [Rumen] Null failedMapAttemptCDFs in job traces generated by Rumen -- Key: MAPREDUCE-1863 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1863 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0 Reporter: Amar Kamat Assignee: Amar Kamat Fix For: 0.22.0 Attachments: counters-test-trace.json.gz, dispatch-trace-output.json.gz, rumen-npe-v1.1-bin.patch, rumen-npe-v1.1.patch All the traces generated by Rumen for jobs having failed task attempts have a null value for failedMapAttemptCDFs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-1895) MapEventFetcherThread should not iterate over jobs that are not localized
MapEventFetcherThread should not iterate over jobs that are not localized - Key: MAPREDUCE-1895 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1895 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Reporter: Amareshwari Sriramadasu We have seen a scenario of lost trackers on our clusters because of the following: TaskLauncher has locked a TaskTracker$RunningJob and is doing localizeJob, which involves DFS operations. Map-event fetcher has locked the TaskTracker.runningJobs map and is waiting to lock the RunningJob object. TaskTracker offerService is waiting to lock the TaskTracker.runningJobs map, and thus fails to send heartbeats for 10 minutes. So, I think the map-event fetcher should short-circuit jobs that are not localized. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1895) MapEventFetcherThread should not iterate over jobs that are not localized
[ https://issues.apache.org/jira/browse/MAPREDUCE-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882456#action_12882456 ] Amareshwari Sriramadasu commented on MAPREDUCE-1895: Here is the stacktrace for the above scenario:
{noformat}
"TaskLauncher for MAP tasks" daemon prio=10 tid=0xaf51f800 nid=0x70ce in Object.wait() [0xaf6ad000..0xaf6adf30]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	at java.lang.Object.wait(Object.java:485)
	at org.apache.hadoop.ipc.Client.call(Client.java:937)
	- locked <0xee8d2218> (a org.apache.hadoop.ipc.Client$Call)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:223)
	at $Proxy7.getFileInfo(Unknown Source)
	at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
	at $Proxy7.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:676)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:507)
	at org.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:700)
	at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:218)
	at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:157)
	at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1255)
	at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1236)
	at org.apache.hadoop.mapred.TaskTracker.localizeJobJarFile(TaskTracker.java:1171)
	at org.apache.hadoop.mapred.TaskTracker.localizeJobFiles(TaskTracker.java:1046)
	at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:954)
	- locked <0xba4b6f40> (a org.apache.hadoop.mapred.TaskTracker$RunningJob)
	at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:2165)
	at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:2130)

"Map-events fetcher for all reduce tasks on tracker_gsgd40932.gold.ygrid.yahoo.com:localhost/127.0.0.1:50542" daemon prio=10 tid=0xaf597800 nid=0x70c9 waiting for monitor entry [0xaefe1000..0xaefe2130]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.apache.hadoop.mapred.TaskTracker$MapEventsFetcherThread.reducesInShuffle(TaskTracker.java:777)
	- waiting to lock <0xba4b6f40> (a org.apache.hadoop.mapred.TaskTracker$RunningJob)
	at org.apache.hadoop.mapred.TaskTracker$MapEventsFetcherThread.run(TaskTracker.java:812)
	- locked <0xb4f02fe8> (a java.util.TreeMap)

"main" prio=10 tid=0x0805ac00 nid=0x70a2 waiting for monitor entry [0xf7fbb000..0xf7fbc1f8]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.apache.hadoop.mapred.TaskTracker.removeTaskFromJob(TaskTracker.java:449)
	- waiting to lock <0xb4f02fe8> (a java.util.TreeMap)
	at org.apache.hadoop.mapred.TaskTracker.purgeTask(TaskTracker.java:1882)
	at org.apache.hadoop.mapred.TaskTracker.markUnresponsiveTasks(TaskTracker.java:1737)
	- locked <0xb4ea44b8> (a org.apache.hadoop.mapred.TaskTracker)
	at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1501)
	at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2236)
	at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3414)
{noformat}
MapEventFetcherThread should not iterate over jobs that are not localized - Key: MAPREDUCE-1895 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1895 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Reporter: Amareshwari Sriramadasu We have seen a scenario of lost trackers on our clusters because of the following: TaskLauncher has locked a TaskTracker$RunningJob and is doing localizeJob, which involves DFS operations. Map-event fetcher has locked the TaskTracker.runningJobs map and is waiting to lock the RunningJob object. TaskTracker offerService is waiting to lock the TaskTracker.runningJobs map, and thus fails to send heartbeats for 10 minutes. So, I think the map-event fetcher should short-circuit jobs that are not localized. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
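One way to read the proposal in this issue is that the fetcher checks a "localized" flag without taking the job's monitor, so it never waits on a job whose lock is held during a slow DFS copy. The sketch below illustrates that idea only; it is not the TaskTracker code. The classes FetcherSketch and RunningJob, the volatile flag, and the String job IDs are all assumptions for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of skipping non-localized jobs while holding the map lock.
final class FetcherSketch {

    static final class RunningJob {
        final String id;
        // volatile so the fetcher can read it without locking the job
        volatile boolean localized;

        RunningJob(String id, boolean localized) {
            this.id = id;
            this.localized = localized;
        }
    }

    /** Returns the IDs of jobs the fetcher should poll, skipping jobs still being localized. */
    static List<String> jobsToPoll(Map<String, RunningJob> runningJobs) {
        List<String> result = new ArrayList<>();
        synchronized (runningJobs) {
            for (RunningJob job : runningJobs.values()) {
                if (!job.localized) {
                    continue; // do not block on a job locked by the localizing thread
                }
                synchronized (job) {
                    result.add(job.id);
                }
            }
        }
        return result;
    }
}
```

With this ordering the map lock is held only briefly, so a thread such as the heartbeat loop that needs the same map is not stalled behind a long localization.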