Hi all,
I've been tinkering with Hadoop for some time, but am new to the
mailing list. Please forgive me if this has already been asked and
answered. I am attempting to use the DistributedCache to allow my
MapReduce job to access some lookup files. I have the following code
to add the files to the distributed cache (showing only a single file
for brevity):

    tmpPath = new Path(cl.getOptionValue("lookup_file"));
    conf.set("lookupfileName", tmpPath.getName());
    DistributedCache.addCacheFile(tmpPath.toUri(), conf);
    System.out.println("added " + tmpPath.toUri().toString() + " as "
            + tmpPath.getName());

and the following code in the Mapper.setup method to access these files:

    Path[] localFiles = DistributedCache.getLocalCacheFiles(conf);
    for (Path file : localFiles) {
        if (file.getName().equals(conf.get("lookupfileName"))) {
            parser.registerResource("bad_uas",
                    new FileReader(new File(file.toUri())));
        }
        // further checks for other files in cache
    }

This generates the exception "java.lang.IllegalArgumentException:
URI is not absolute" when I attempt to instantiate the File object.
The registerResource method is currently designed to accept an
instance of a reader from which it pulls its information. That method
is under my control, and I can reconfigure it to take a more
appropriate input if such exists.
I have tried a few variations on this specific method, and all seem to
come back to the "URI is not absolute" error. What is the piece I am
missing here?
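
For reference, here is a minimal sketch, outside Hadoop, that reproduces
the same message using only java.io.File and java.net.URI. It assumes
(I have not confirmed this) that the paths coming back from the cache
convert to URIs with no scheme. The class name, the helper
tryFileFromUri, and the path /tmp/cache/bad_uas.txt are made up for
illustration.

```java
import java.io.File;
import java.net.URI;

public class UriNotAbsoluteDemo {

    // Hypothetical helper: try File(URI) and report what happened as a string.
    static String tryFileFromUri(URI uri) {
        try {
            return new File(uri).getPath();
        } catch (IllegalArgumentException e) {
            return "IllegalArgumentException: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A URI with no scheme is "relative" in java.net.URI terms,
        // even though its path component starts with a slash.
        URI noScheme = URI.create("/tmp/cache/bad_uas.txt");
        System.out.println("isAbsolute: " + noScheme.isAbsolute());
        System.out.println(tryFileFromUri(noScheme));

        // The same path with a file: scheme is accepted by File(URI).
        URI withScheme = URI.create("file:///tmp/cache/bad_uas.txt");
        System.out.println(tryFileFromUri(withScheme));

        // Building the File from the path string avoids the URI check entirely.
        System.out.println(new File(noScheme.getPath()).getPath());
    }
}
```

If that assumption holds, constructing the File from the path string
(for example new File(file.toString())) instead of from file.toUri()
might sidestep the check, but I would welcome confirmation.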
Thanks,
--Mark Tozzi