Thanks. That clears it up. Larry
On Fri, Apr 16, 2010 at 1:05 AM, Amareshwari Sri Ramadasu <[email protected]> wrote:

> Hi,
>
> @Ted, the code below is internal code. Users are not expected to call
> DistributedCache.getLocalCache(); they cannot use it either, since they
> do not know all the parameters.
>
> @Larry, DistributedCache has not been changed to use the new API in
> branch 0.20. The change was made only from branch 0.21 onward; see
> MAPREDUCE-898 (https://issues.apache.org/jira/browse/MAPREDUCE-898).
> If you are using branch 0.20, you are encouraged to use the deprecated
> JobConf itself.
>
> You can try the following change in your code. Change the line
>
>     DistributedCache.addCacheFile(new Path(args[0]).toUri(), conf);
>
> to
>
>     DistributedCache.addCacheFile(new Path(args[0]).toUri(),
>             job.getConfiguration());
>
> Thanks
> Amareshwari
>
> On 4/16/10 2:27 AM, "Ted Yu" <[email protected]> wrote:
>
>> Please take a look at the loop starting at line 158 in TaskRunner.java:
>>
>>         p[i] = DistributedCache.getLocalCache(files[i], conf,
>>                 new Path(baseDir),
>>                 fileStatus,
>>                 false, Long.parseLong(fileTimestamps[i]),
>>                 new Path(workDir.getAbsolutePath()),
>>                 false);
>>     }
>>     DistributedCache.setLocalFiles(conf, stringifyPathArray(p));
>>
>> I think the confusing part is that DistributedCache.getLocalCacheFiles()
>> is paired with DistributedCache.setLocalFiles().
>>
>> Cheers
>>
>> On Thu, Apr 15, 2010 at 1:16 PM, Larry Compton
>> <[email protected]> wrote:
>>
>>> Ted,
>>>
>>> Thanks. I have looked at that example. The javadocs for
>>> DistributedCache still refer to deprecated classes, like JobConf.
>>> I'm trying to use the revised API.
>>>
>>> Larry
>>>
>>> On Thu, Apr 15, 2010 at 4:07 PM, Ted Yu <[email protected]> wrote:
>>>
>>>> Please see the sample within
>>>> src\core\org\apache\hadoop\filecache\DistributedCache.java:
>>>>
>>>>     JobConf job = new JobConf();
>>>>     DistributedCache.addCacheFile(
>>>>             new URI("/myapp/lookup.dat#lookup.dat"),
>>>>             job);
>>>>
>>>> On Thu, Apr 15, 2010 at 12:56 PM, Larry Compton
>>>> <[email protected]> wrote:
>>>>
>>>>> I'm trying to use the distributed cache in a MapReduce job written
>>>>> to the new API (org.apache.hadoop.mapreduce.*). In my "Tool" class,
>>>>> a file path is added to the distributed cache as follows:
>>>>>
>>>>>     public int run(String[] args) throws Exception {
>>>>>         Configuration conf = getConf();
>>>>>         Job job = new Job(conf, "Job");
>>>>>         ...
>>>>>         DistributedCache.addCacheFile(new Path(args[0]).toUri(), conf);
>>>>>         ...
>>>>>         return job.waitForCompletion(true) ? 0 : 1;
>>>>>     }
>>>>>
>>>>> The "setup()" method in my mapper tries to read the path as follows:
>>>>>
>>>>>     protected void setup(Context context) throws IOException {
>>>>>         Path[] paths = DistributedCache.getLocalCacheFiles(
>>>>>                 context.getConfiguration());
>>>>>     }
>>>>>
>>>>> But "paths" is null.
>>>>>
>>>>> I'm assuming I'm setting up the distributed cache incorrectly. I've
>>>>> seen a few hints in previous mailing list postings that indicate
>>>>> that the distributed cache is accessed via the Job and JobContext
>>>>> objects in the revised API, but the javadocs don't seem to support
>>>>> that.
>>>>>
>>>>> Thanks.
>>>>> Larry
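The root cause behind Amareshwari's fix can be illustrated with a plain-Java analogy (MiniConf and MiniJob below are hypothetical stand-ins, not Hadoop classes): in the new API, Job's constructor makes its own copy of the Configuration passed to it, so entries added to the original conf object afterwards, such as the distributed-cache file list, never reach the job. Writing through job.getConfiguration() updates the copy the job actually submits.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for org.apache.hadoop.conf.Configuration.
class MiniConf {
    private final Map<String, String> entries = new HashMap<>();
    MiniConf() {}
    MiniConf(MiniConf other) { entries.putAll(other.entries); } // defensive copy
    void set(String key, String value) { entries.put(key, value); }
    String get(String key) { return entries.get(key); }
}

// Hypothetical stand-in for org.apache.hadoop.mapreduce.Job, which copies
// the Configuration it is handed in its constructor.
class MiniJob {
    private final MiniConf conf;
    MiniJob(MiniConf conf) { this.conf = new MiniConf(conf); }
    MiniConf getConfiguration() { return conf; }
}

public class CacheConfDemo {
    public static void main(String[] args) {
        MiniConf conf = new MiniConf();
        MiniJob job = new MiniJob(conf);   // job snapshots conf here

        // Mirrors the bug: the write lands in conf, which the job no
        // longer reads.
        conf.set("cache.files", "/myapp/lookup.dat");
        // Mirrors the fix: write through the job's own copy.
        job.getConfiguration().set("cache.files.fixed", "/myapp/lookup.dat");

        System.out.println(job.getConfiguration().get("cache.files"));       // null
        System.out.println(job.getConfiguration().get("cache.files.fixed")); // /myapp/lookup.dat
    }
}
```

The property names here are made up for the demo; in Hadoop 0.20 the real effect is the same, which is why DistributedCache.getLocalCacheFiles() returns null in the mapper when addCacheFile() was applied to the original conf instead of job.getConfiguration().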
