[jira] [Commented] (MAPREDUCE-4937) MR AM handles an oversized split metainfo file poorly
[ https://issues.apache.org/jira/browse/MAPREDUCE-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975143#comment-13975143 ]

Hudson commented on MAPREDUCE-4937:
-----------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1738 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1738/])
MAPREDUCE-4937. MR AM handles an oversized split metainfo file poorly. Contributed by Eric Payne (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1588559)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/event/JobEventType.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestJobImpl.java

MR AM handles an oversized split metainfo file poorly
-----------------------------------------------------

                 Key: MAPREDUCE-4937
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4937
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: mr-am
    Affects Versions: 2.0.2-alpha, 0.23.5
            Reporter: Jason Lowe
            Assignee: Eric Payne
             Fix For: 3.0.0, 2.5.0
         Attachments: MAPREDUCE-4937.MRAMHandlOversizeSplits.txt, MAPREDUCE-4937.MRAMHandlOversizeSplits.txt, MAPREDUCE-4937.MRAMHandlOversizeSplits.txt

When a job runs with a split metainfo file larger than the AM has been configured to handle, the AM simply crashes. This leaves the user with a less-than-ideal debugging session, since no useful diagnostic messages are sent to the client for this failure. In addition, the AM crashes before registering/unregistering with the RM and without generating job history, so the proxy URL is not very useful and there is no archived configuration to check which setting the AM was using when it encountered the error. The AM should handle this error case more gracefully and treat the failure as it does any other failed job, with a proper unregistration from the RM and with history.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
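For context, the failure mode above is a size check during metainfo loading. A minimal standalone sketch of the graceful behavior being asked for (not Hadoop's actual code; only the property name `mapreduce.job.split.metainfo.maxsize` and its 10,000,000-byte default are taken from MapReduce's configuration):

```java
import java.io.IOException;

public class SplitMetaInfoCheck {
    // Default mirrors mapreduce.job.split.metainfo.maxsize (10,000,000 bytes).
    static final long DEFAULT_MAX_METAINFO_SIZE = 10_000_000L;

    /**
     * Instead of letting an oversized file crash the AM, raise a descriptive
     * exception that can be reported to the client as a normal job failure.
     */
    static void checkMetaInfoSize(long actualSize, long maxSize) throws IOException {
        if (maxSize > 0 && actualSize > maxSize) {
            throw new IOException(
                "Split metainfo size " + actualSize + " exceeds limit " + maxSize
                + "; raise mapreduce.job.split.metainfo.maxsize or reduce the number of splits");
        }
    }

    public static void main(String[] args) throws IOException {
        checkMetaInfoSize(5_000L, DEFAULT_MAX_METAINFO_SIZE); // within limit: no-op
        try {
            checkMetaInfoSize(20_000_000L, DEFAULT_MAX_METAINFO_SIZE);
        } catch (IOException e) {
            // The AM would route this diagnostic into the job-failure path.
            System.out.println("diagnostic: " + e.getMessage());
        }
    }
}
```

The key point of the fix is that the exception is caught and converted into an ordinary failed-job transition (with RM unregistration and history), rather than propagating and killing the AM.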
[jira] [Commented] (MAPREDUCE-5642) TestMiniMRChildTask fails on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975140#comment-13975140 ]

Hudson commented on MAPREDUCE-5642:
-----------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1738 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1738/])
MAPREDUCE-5642. TestMiniMRChildTask fails on Windows. Contributed by Chuan Liu. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1588605)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMiniMRChildTask.java

TestMiniMRChildTask fails on Windows
------------------------------------

                 Key: MAPREDUCE-5642
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5642
             Project: Hadoop Map/Reduce
          Issue Type: Test
          Components: test
    Affects Versions: 3.0.0, 2.4.0
            Reporter: Chuan Liu
            Assignee: Chuan Liu
            Priority: Minor
             Fix For: 3.0.0, 2.5.0
         Attachments: MAPREDUCE-5642.patch

The test fails on Windows as a regression from MAPREDUCE-5451. In MAPREDUCE-5451, we set the default value of mapreduce.admin.user.env to PATH=%PATH%;%HADOOP_COMMON_HOME%\\bin on Windows. In the test, we set PATH=%PATH%;tmp for mapreduce.map.env and mapreduce.reduce.env. Because of the change in MAPREDUCE-5451, PATH is now set twice, and the value we get in the child tasks no longer matches the previously expected value.
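The double expansion can be reproduced without Hadoop. The sketch below is illustrative only (the `expand` helper and the map-based environment are stand-ins, not the actual task-launch code); it shows how applying the admin env and then the user env, each of which references %PATH%, splices the admin entry into the child's PATH ahead of the test's `tmp` suffix:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PathEnvDemo {
    /** Expand a literal %PATH% reference against the current environment map. */
    static String expand(String value, Map<String, String> env) {
        return value.replace("%PATH%", env.getOrDefault("PATH", ""));
    }

    /** Apply the admin env, then the user env, as the Windows task launch would. */
    static String finalPath() {
        Map<String, String> env = new LinkedHashMap<>();
        env.put("PATH", "C:\\Windows"); // inherited PATH (illustrative)

        // mapreduce.admin.user.env default from MAPREDUCE-5451:
        env.put("PATH", expand("%PATH%;%HADOOP_COMMON_HOME%\\bin", env));
        // user-level setting from the test:
        env.put("PATH", expand("%PATH%;tmp", env));
        return env.get("PATH");
    }

    public static void main(String[] args) {
        // The child sees the admin entry between the inherited PATH and "tmp",
        // so the test's old expectation of "C:\Windows;tmp" no longer matches.
        System.out.println(finalPath());
    }
}
```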
[jira] [Commented] (MAPREDUCE-5402) DynamicInputFormat should allow overriding of MAX_CHUNKS_TOLERABLE
[ https://issues.apache.org/jira/browse/MAPREDUCE-5402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975212#comment-13975212 ]

David Rosenstrauch commented on MAPREDUCE-5402:
-----------------------------------------------

Looks like the patch is working. I tested it on a few heavy-load jobs today, and it definitely seemed to work around the "Too many chunks created" error I was having. I probably need to tune the params a bit to optimize the distcp runs I'm doing, but the basic functionality does seem to work. (Note: I did my testing using your patch with the backported version of distcp at https://github.com/QwertyManiac/hadoop-distcp-mr1, as described in https://issues.cloudera.org/browse/DISTRO-420, and not the current Hadoop trunk version of the code.)

I'm still seeing a bit of long-tail behavior, but it's more like 100 mappers taking longer to complete, rather than 1 or 2, which indicates that the copy is being distributed more evenly. Here are some stats recorded from a few of these jobs:

Job 1:
  Total number of files: 20,027
  Number of files copied: 20,017
  Number of files skipped: 10
  Number of bytes copied: 84,802,510,328
  Number of mappers: 512
  Split ratio: 10
  Max chunks tolerable: 10,000
  Number of dynamic-chunk-files created: 5,012
  Run time: 5 mins, 19 sec

Job 2:
  Total number of files: 36,374
  Number of files copied: 17,160
  Number of files skipped: 19,214
  Number of bytes copied: 1,196,591,437,407
  Number of mappers: 512
  Split ratio: 10
  Max chunks tolerable: 10,000
  Number of dynamic-chunk-files created: 4,714
  Run time: 50 mins, 50 sec

Job 2 can obviously be optimized further. (I.e., it will distribute much better if I eliminate all those skipped files.) But this is still an improvement: it used to take over 2 hours to run that job. Thanks again for working on the fix for this issue. If you have any additional questions, please let me know.
DynamicInputFormat should allow overriding of MAX_CHUNKS_TOLERABLE
------------------------------------------------------------------

                 Key: MAPREDUCE-5402
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5402
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: distcp, mrv2
            Reporter: David Rosenstrauch
            Assignee: Tsuyoshi OZAWA
         Attachments: MAPREDUCE-5402.1.patch, MAPREDUCE-5402.2.patch, MAPREDUCE-5402.3.patch

In MAPREDUCE-2765, which provided the design spec for DistCpV2, the author describes the implementation of DynamicInputFormat, citing as one of the main motivations reducing the chance of long tails, where a few leftover mappers run much longer than the rest.

However, today I ran into a situation where I experienced exactly such a long tail using DistCpV2 and DynamicInputFormat. When I tried to alleviate the problem by overriding the number of mappers and the split ratio used by DynamicInputFormat, I was prevented from doing so by the hard-coded limit set by the MAX_CHUNKS_TOLERABLE constant (currently 400). This constant is set quite low for production use. (See a description of my use case below.) And although MAPREDUCE-2765 states that this is an overridable maximum, reading through the code there does not actually appear to be any mechanism available to override it. This should be changed: it should be possible to expand the maximum number of chunks beyond this arbitrary limit.

For example, here is the situation I ran into today: I ran a distcpv2 job on a cluster of 8 machines with 128 map slots. The job consisted of copying ~2800 files from HDFS to Amazon S3. I overrode the number of mappers for the job from the default of 20 to 128, so as to properly parallelize the copy across the cluster. The number of chunk files created was calculated as 241, and mapred.num.entries.per.chunk was calculated as 12. As the job ran on, it reached a point where only 4 map tasks remained, each of which had been running for over 2 hours.
The reason was that each of the 12 files those mappers were copying was quite large (several hundred megabytes) and took ~20 minutes each. Meanwhile, the other 124 mappers sat idle. In theory I should be able to alleviate this problem with DynamicInputFormat: if I could, say, quadruple the number of chunk files created, each chunk would contain only 3 files, and these large files would have been distributed better around the cluster and copied in parallel. However, when I tried to do that, by overriding mapred.listing.split.ratio to, say, 10, DynamicInputFormat responded with an exception ("Too many chunks created with splitRatio:10, numMaps:128. Reduce numMaps or decrease split-ratio.")
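The arithmetic behind that exception is simple and can be checked in isolation. The constant name and the reported numbers come from this issue; the method itself is a back-of-the-envelope sketch, not DistCp's actual code:

```java
public class ChunkMath {
    // Hard-coded limit cited in this issue for DynamicInputFormat.
    static final int MAX_CHUNKS_TOLERABLE = 400;

    /** Requested chunk count is roughly numMaps * splitRatio. */
    static boolean tooManyChunks(int numMaps, int splitRatio) {
        return numMaps * splitRatio > MAX_CHUNKS_TOLERABLE;
    }

    public static void main(String[] args) {
        // Reporter's failed attempt: 128 maps * split ratio 10 = 1280 > 400.
        System.out.println(tooManyChunks(128, 10));
        // A split ratio of 2 with the same 128 maps stays under the cap (256).
        System.out.println(tooManyChunks(128, 2));
    }
}
```

With the patch applied, the cap becomes configurable; the test runs reported in the comment above used a max-chunks-tolerable of 10,000, which the 5,012 and 4,714 chunk counts fit under comfortably.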
[jira] [Commented] (MAPREDUCE-5597) Missing alternatives in javadocs for deprecated constructors in mapreduce.Job
[ https://issues.apache.org/jira/browse/MAPREDUCE-5597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975355#comment-13975355 ]

edward pacheco commented on MAPREDUCE-5597:
-------------------------------------------

I've just downloaded Hadoop 2.4, applied the patch MAPREDUCE-5597.2.patch, and it works OK. Thanks.

Missing alternatives in javadocs for deprecated constructors in mapreduce.Job
-----------------------------------------------------------------------------

                 Key: MAPREDUCE-5597
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5597
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: client, documentation, job submission
    Affects Versions: 2.2.0
            Reporter: Christopher Tubbs
            Assignee: Akira AJISAKA
            Priority: Minor
              Labels: newbie
         Attachments: MAPREDUCE-5597.2.patch, MAPREDUCE-5597.patch

Deprecated APIs, such as `new Job()`, don't have javadocs explaining what the alternatives are. (It would also help if the new methods had @since tags, to help determine whether one could safely use that API on older versions at runtime.)
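For illustration, the kind of javadoc pointer being requested might look like the following. This is a sketch, not the wording actually committed; the `Job.getInstance` factory methods do exist in `org.apache.hadoop.mapreduce.Job` as the replacements for the deprecated constructors:

```java
/**
 * @deprecated Use {@link Job#getInstance()} or
 *             {@link Job#getInstance(Configuration)} instead,
 *             which return a properly initialized Job.
 */
@Deprecated
public Job() throws IOException { ... }
```

Callers would then migrate from `new Job(conf)` to `Job.getInstance(conf)`.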
[jira] [Commented] (MAPREDUCE-4718) MapReduce fails If I pass a parameter as a S3 folder
[ https://issues.apache.org/jira/browse/MAPREDUCE-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975383#comment-13975383 ]

Benjamin Kim commented on MAPREDUCE-4718:
-----------------------------------------

Hi Chen. I tested it with CDH4.5.0 (hadoop-2.0.0+1518) and it doesn't seem to have the same problem. I'm able to successfully run a wordcount MRv1 job with the s3n protocol. So is it pretty safe to say this issue is fixed on other 2.x.x versions?

MapReduce fails if I pass a parameter as an S3 folder
-----------------------------------------------------

                 Key: MAPREDUCE-4718
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4718
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: job submission
    Affects Versions: 1.0.0, 1.0.3
         Environment: Hadoop with default configurations
            Reporter: Benjamin Kim

I'm running a wordcount MR job as follows:

hadoop jar WordCount.jar wordcount.WordCountDriver s3n://bucket/wordcount/input s3n://bucket/wordcount/output

s3n://bucket/wordcount/input is an S3 object that contains other input files. However, I get the following NPE:

12/10/02 18:56:23 INFO mapred.JobClient: map 0% reduce 0%
12/10/02 18:56:54 INFO mapred.JobClient: map 50% reduce 0%
12/10/02 18:56:56 INFO mapred.JobClient: Task Id : attempt_201210021853_0001_m_01_0, Status : FAILED
java.lang.NullPointerException
	at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.close(NativeS3FileSystem.java:106)
	at java.io.BufferedInputStream.close(BufferedInputStream.java:451)
	at java.io.FilterInputStream.close(FilterInputStream.java:155)
	at org.apache.hadoop.util.LineReader.close(LineReader.java:83)
	at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.close(LineRecordReader.java:144)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:497)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)

MR runs fine if I specify a more specific input path, such as s3n://bucket/wordcount/input/file.txt. MR fails if I pass an S3 folder as a parameter.

In summary, this works:

hadoop jar ./hadoop-examples-1.0.3.jar wordcount /user/hadoop/wordcount/input/ s3n://bucket/wordcount/output/

This doesn't work:

hadoop jar ./hadoop-examples-1.0.3.jar wordcount s3n://bucket/wordcount/input/ s3n://bucket/wordcount/output/

(Both input paths are directories.)
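The top frame of that trace is a `close()` that dereferences a stream field which can be null. A standalone sketch of the defect pattern and the usual fix (class and field names are illustrative, not NativeS3FileSystem's actual source):

```java
import java.io.IOException;
import java.io.InputStream;

public class GuardedStream extends InputStream {
    private InputStream in; // may be null if never opened (or already closed)

    GuardedStream(InputStream in) {
        this.in = in;
    }

    @Override
    public int read() throws IOException {
        if (in == null) {
            throw new IOException("stream not open");
        }
        return in.read();
    }

    @Override
    public void close() throws IOException {
        // The unguarded version would NPE here when 'in' is null,
        // which is what the stack trace above shows.
        if (in != null) {
            in.close();
            in = null; // also makes close() idempotent
        }
    }

    public static void main(String[] args) throws IOException {
        GuardedStream s = new GuardedStream(null);
        s.close(); // no NPE: the null guard turns this into a no-op
        System.out.println("closed cleanly");
    }
}
```

Wrapping streams (BufferedInputStream, LineReader) propagate close() down the chain, which is why an unguarded innermost close() surfaces as a task failure rather than a local error.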