[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Albert Chu updated MAPREDUCE-5528:
----------------------------------

    Attachment: MAPREDUCE-5528.patch
    
> TeraSort fails with "can't read paritions file" - does not read partition 
> file from distributed cache
> -----------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5528
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5528
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: examples
>    Affects Versions: 3.0.0
>            Reporter: Albert Chu
>            Priority: Minor
>         Attachments: MAPREDUCE-5528.patch
>
>
> I was trying to run TeraSort against a parallel networked file system, 
> setting things up via the 'file://' scheme.  I always got the following 
> error when running TeraSort:
> {noformat}
> 13/09/23 11:15:12 INFO mapreduce.Job: Task Id : 
> attempt_1379960046506_0001_m_000080_1, Status : FAILED
> Error: java.lang.IllegalArgumentException: can't read paritions file
>         at 
> org.apache.hadoop.examples.terasort.TeraSort$TotalOrderPartitioner.setConf(TeraSort.java:254)
>         at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
>         at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>         at 
> org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:678)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:747)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:171)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1499)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:166)
> Caused by: java.io.FileNotFoundException: File _partition.lst does not exist
>         at org.apache.hadoop.fs.Stat.parseExecResult(Stat.java:124)
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:486)
>         at org.apache.hadoop.util.Shell.run(Shell.java:417)
>         at org.apache.hadoop.fs.Stat.getFileStatus(Stat.java:74)
>         at 
> org.apache.hadoop.fs.RawLocalFileSystem.getNativeFileLinkStatus(RawLocalFileSystem.java:808)
>         at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:740)
>         at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:525)
>         at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:398)
>         at 
> org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:137)
>         at 
> org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:339)
>         at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:763)
>         at 
> org.apache.hadoop.examples.terasort.TeraSort$TotalOrderPartitioner.readPartitions(TeraSort.java:161)
>         at 
> org.apache.hadoop.examples.terasort.TeraSort$TotalOrderPartitioner.setConf(TeraSort.java:246)
>         ... 10 more
> {noformat}
> After digging into TeraSort, I noticed that the partitions file is created 
> in the output directory and then added to the distributed cache:
> {noformat}
> Path outputDir = new Path(args[1]);
> ...
> Path partitionFile = new Path(outputDir, TeraInputFormat.PARTITION_FILENAME);
> ...
> job.addCacheFile(partitionUri);
> {noformat}
> but the partitions file doesn't seem to be read back from the output 
> directory or distributed cache:
> {noformat}
> FileSystem fs = FileSystem.getLocal(conf);
> ...
> Path partFile = new Path(TeraInputFormat.PARTITION_FILENAME);
> splitPoints = readPartitions(fs, partFile, conf);
> {noformat}
> It seems the file is being read from whatever the working directory is for 
> the filesystem returned by FileSystem.getLocal(conf).
> Under HDFS this code happens to work: the distributed cache files seem to 
> be localized into the task's working directory by default.
> But when I set things up with the networked file system and the 'file://' 
> scheme, the working directory was the directory I was running my Hadoop 
> binaries out of.
> The attached patch fixed things for me.  It always reads the partition file 
> from the distributed cache, instead of trusting the working directory to 
> resolve to the right place.  It seems to be the right thing to do.
> Apologies, I was unable to reproduce this under the TeraSort example tests, 
> such as TestTeraSort.java, so no test is added; I'm not sure what the subtle 
> difference in the setup is.  I tested under both HDFS and the 'file' scheme, 
> and the patch worked under both.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
