I'm trying to use the distributed cache in a MapReduce job written against
the new API (org.apache.hadoop.mapreduce.*). In my "Tool" class, a file path
is added to the distributed cache as follows:

    public int run(String[] args) throws Exception {
        Configuration conf = getConf();
        Job job = new Job(conf, "Job");
        ...
        DistributedCache.addCacheFile(new Path(args[0]).toUri(), conf);
        ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

The "setup()" method in my mapper tries to read the path as follows:

    protected void setup(Context context) throws IOException {
        Path[] paths = DistributedCache.getLocalCacheFiles(context
                .getConfiguration());
    }

But "paths" is null.

I'm assuming I'm setting up the distributed cache incorrectly. I've seen a
few hints in previous mailing list postings that indicate that the
distributed cache is accessed via the Job and JobContext objects in the
revised API, but the javadocs don't seem to support that.
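
One possibility worth checking, offered as an assumption rather than something
confirmed in this thread: if Job takes its own copy of the Configuration when
it is constructed, an entry added to "conf" after "new Job(conf, ...)" would
never reach the job, and the mapper would see nothing. The sketch below models
only that copy behaviour with plain maps. SketchJob, the CACHE_KEY value, and
the file path are hypothetical stand-ins, not Hadoop classes or settings.

```java
import java.util.HashMap;
import java.util.Map;

class CacheConfSketch {
    // Hypothetical stand-in for a job that snapshots its configuration
    // at construction time (assumed behaviour, not the real Hadoop Job).
    static final class SketchJob {
        private final Map<String, String> conf;

        SketchJob(Map<String, String> conf) {
            this.conf = new HashMap<>(conf); // defensive copy
        }

        Map<String, String> getConfiguration() {
            return conf;
        }
    }

    // Illustrative key name only.
    static final String CACHE_KEY = "mapred.cache.files";

    // Returns the cache entry as the job would see it.
    static String cacheFilesSeenByJob(boolean useJobConfiguration) {
        Map<String, String> conf = new HashMap<>();
        SketchJob job = new SketchJob(conf);

        // Either write to the job's own copy or to the original object.
        Map<String, String> target =
                useJobConfiguration ? job.getConfiguration() : conf;
        target.put(CACHE_KEY, "/data/lookup.txt"); // hypothetical path

        return job.getConfiguration().get(CACHE_KEY);
    }

    public static void main(String[] args) {
        System.out.println("added to original conf: " + cacheFilesSeenByJob(false));
        System.out.println("added to job's conf:     " + cacheFilesSeenByJob(true));
    }
}
```

If that is what is happening here, passing job.getConfiguration() instead of
"conf" to DistributedCache.addCacheFile, or adding the file before the Job is
constructed, may be enough to make the entry visible in setup().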

Thanks.
Larry
