Ted,

Thanks. I have looked at that example. The javadocs for DistributedCache still refer to deprecated classes, like JobConf. I'm trying to use the revised API.
Larry

On Thu, Apr 15, 2010 at 4:07 PM, Ted Yu <[email protected]> wrote:
> Please see the sample within
> src\core\org\apache\hadoop\filecache\DistributedCache.java:
>
>     * JobConf job = new JobConf();
>     * DistributedCache.addCacheFile(new URI("/myapp/lookup.dat#lookup.dat"),
>     *                               job);
>
> On Thu, Apr 15, 2010 at 12:56 PM, Larry Compton <[email protected]> wrote:
> > I'm trying to use the distributed cache in a MapReduce job written to the
> > new API (org.apache.hadoop.mapreduce.*). In my "Tool" class, a file path
> > is added to the distributed cache as follows:
> >
> >     public int run(String[] args) throws Exception {
> >         Configuration conf = getConf();
> >         Job job = new Job(conf, "Job");
> >         ...
> >         DistributedCache.addCacheFile(new Path(args[0]).toUri(), conf);
> >         ...
> >         return job.waitForCompletion(true) ? 0 : 1;
> >     }
> >
> > The "setup()" method in my mapper tries to read the path as follows:
> >
> >     protected void setup(Context context) throws IOException {
> >         Path[] paths = DistributedCache.getLocalCacheFiles(context
> >                 .getConfiguration());
> >     }
> >
> > But "paths" is null.
> >
> > I'm assuming I'm setting up the distributed cache incorrectly. I've seen
> > a few hints in previous mailing list postings that indicate that the
> > distributed cache is accessed via the Job and JobContext objects in the
> > revised API, but the javadocs don't seem to support that.
> >
> > Thanks.
> > Larry
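For what it's worth, a likely culprit in the quoted run() method: if I remember right, Job's constructor copies the Configuration it is given, so calling DistributedCache.addCacheFile(..., conf) after constructing the Job mutates a Configuration the job no longer reads; passing job.getConfiguration() instead should make the cache file visible in setup(). A minimal plain-Java sketch of that copy pitfall (Config is a hypothetical stand-in for Hadoop's Configuration, not actual Hadoop code):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Hadoop's Configuration. The point being
// illustrated: a copy constructor snapshots the properties, so later
// mutations of the original object are invisible to the copy.
class Config {
    private final Map<String, String> props = new HashMap<>();

    Config() {}

    Config(Config other) {            // copy constructor, like new Job(conf)
        props.putAll(other.props);
    }

    void set(String key, String value) { props.put(key, value); }

    String get(String key) { return props.get(key); }
}

public class CopyPitfall {
    public static void main(String[] args) {
        Config conf = new Config();
        Config jobCopy = new Config(conf);   // the job snapshots conf here

        // Setting the cache-file property AFTER the copy was taken...
        conf.set("mapred.cache.files", "/myapp/lookup.dat");

        // ...means the job's own copy never sees it:
        System.out.println(jobCopy.get("mapred.cache.files")); // prints null
        System.out.println(conf.get("mapred.cache.files"));    // prints /myapp/lookup.dat
    }
}
```

The same reasoning suggests adding cache files through the configuration object the job actually holds, or adding them before the job is constructed.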
