Nathan Marz wrote:
I have some unit tests which run MapReduce jobs and test the
inputs/outputs in standalone mode. I recently started using
DistributedCache in one of these jobs, but now my tests fail with
errors such as:
Caused by: java.io.IOException: Incomplete HDFS URI, no host: hdfs:///tmp/file.data
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:70)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1367)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:56)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1379)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:215)
    at org.apache.hadoop.filecache.DistributedCache.getTimestamp(DistributedCache.java:472)
    at org.apache.hadoop.mapred.JobClient.configureCommandLineOptions(JobClient.java:676)
Does anyone know of a way to get DistributedCache working in a test
environment?
You can look at the source code for
org.apache.hadoop.mapred.TestMiniMRDFSCaching for an example of testing
with DistributedCache.
Note that DistributedCache does not work with LocalJobRunner; see
http://issues.apache.org/jira/browse/HADOOP-2914
-Amareshwari
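
The "no host" part of the error suggests that the URI handed to DistributedCache, hdfs:///tmp/file.data, has an empty authority, which is what one would expect in standalone mode where no NameNode address is configured. The following is only a sketch using java.net.URI from the JDK, not Hadoop itself; the class name HdfsUriCheck and the localhost:9000 address are hypothetical placeholders. It shows how the URI from the trace differs from a fully qualified one.

```java
import java.net.URI;

public class HdfsUriCheck {
    // Returns true when the URI string carries a host component.
    public static boolean hasHost(String uriString) {
        return URI.create(uriString).getHost() != null;
    }

    public static void main(String[] args) {
        // The URI from the stack trace: a scheme and a path, but an empty authority.
        System.out.println(hasHost("hdfs:///tmp/file.data"));
        // A fully qualified URI; the host and port here are assumed placeholders.
        System.out.println(hasHost("hdfs://localhost:9000/tmp/file.data"));
    }
}
```

This only illustrates why the URI is reported as incomplete. It does not make DistributedCache work under LocalJobRunner, which is the limitation tracked in HADOOP-2914.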