Thanks. That clears it up.

Larry

On Fri, Apr 16, 2010 at 1:05 AM, Amareshwari Sri Ramadasu <
[email protected]> wrote:

> Hi,
> @Ted, the code below is internal. Users are not expected to call
> DistributedCache.getLocalCache(), and in practice they cannot, because they
> do not know all of its parameters.
> @Larry, DistributedCache was not changed to use the new API in branch 0.20.
> That change was made only from branch 0.21 onward. See MAPREDUCE-898 (
> https://issues.apache.org/jira/browse/MAPREDUCE-898).
> If you are using branch 0.20, you are encouraged to keep using the
> deprecated JobConf.
> You can try the following change in your code. Change the line
>        DistributedCache.addCacheFile(new Path(args[0]).toUri(), conf);
> to
>        DistributedCache.addCacheFile(new Path(args[0]).toUri(),
>                                      job.getConfiguration());
>
> Thanks
> Amareshwari
>
> On 4/16/10 2:27 AM, "Ted Yu" <[email protected]> wrote:
>
> Please take a look at the loop starting at line 158 in TaskRunner.java:
>            p[i] = DistributedCache.getLocalCache(files[i], conf,
>                                                  new Path(baseDir),
>                                                  fileStatus,
>                                                  false,
>                                                  Long.parseLong(fileTimestamps[i]),
>                                                  new Path(workDir.getAbsolutePath()),
>                                                  false);
>          }
>          DistributedCache.setLocalFiles(conf, stringifyPathArray(p));
>
> I think the confusing part is that DistributedCache.getLocalCacheFiles() is
> paired with DistributedCache.setLocalFiles()
>
> Cheers
>
> On Thu, Apr 15, 2010 at 1:16 PM, Larry Compton
> <[email protected]> wrote:
>
> > Ted,
> >
> > Thanks. I have looked at that example. The javadocs for DistributedCache
> > still refer to deprecated classes, like JobConf. I'm trying to use the
> > revised API.
> >
> > Larry
> >
> > On Thu, Apr 15, 2010 at 4:07 PM, Ted Yu <[email protected]> wrote:
> >
> > > Please see the sample within
> > > src\core\org\apache\hadoop\filecache\DistributedCache.java:
> > >
> > >  *     JobConf job = new JobConf();
> > >  *     DistributedCache.addCacheFile(new URI("/myapp/lookup.dat#lookup.dat"),
> > >  *                                   job);
> > >
> > >
> > > On Thu, Apr 15, 2010 at 12:56 PM, Larry Compton
> > > <[email protected]> wrote:
> > >
> > > > I'm trying to use the distributed cache in a MapReduce job written to the
> > > > new API (org.apache.hadoop.mapreduce.*). In my "Tool" class, a file path is
> > > > added to the distributed cache as follows:
> > > >
> > > >    public int run(String[] args) throws Exception {
> > > >        Configuration conf = getConf();
> > > >        Job job = new Job(conf, "Job");
> > > >        ...
> > > >        DistributedCache.addCacheFile(new Path(args[0]).toUri(), conf);
> > > >        ...
> > > >        return job.waitForCompletion(true) ? 0 : 1;
> > > >    }
> > > >
> > > > The "setup()" method in my mapper tries to read the path as follows:
> > > >
> > > >    protected void setup(Context context) throws IOException {
> > > >        Path[] paths = DistributedCache.getLocalCacheFiles(context
> > > >                .getConfiguration());
> > > >    }
> > > >
> > > > But "paths" is null.
> > > >
> > > > I'm assuming I'm setting up the distributed cache incorrectly. I've seen a
> > > > few hints in previous mailing list postings that indicate that the
> > > > distributed cache is accessed via the Job and JobContext objects in the
> > > > revised API, but the javadocs don't seem to support that.
> > > >
> > > > Thanks.
> > > > Larry
> > > >
> > >
> >
>
>
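
For anyone reading this thread later, here is a rough sketch of why the
one-line change suggested above helps. It assumes (my understanding, not
stated in the thread) that the new-API Job takes its own copy of the
Configuration at construction time, so a cache file registered on the original
"conf" after "new Job(conf, ...)" never reaches the job. SketchConfiguration
and SketchJob are hypothetical stand-ins written for this illustration, not
Hadoop classes, and the "mapred.cache.files" key is likewise an assumption
used only to make the model concrete.

```java
import java.util.HashMap;
import java.util.Map;

// Model of the thread's problem using stand-in classes (NOT the Hadoop API).
class CacheConfigSketch {
    // Assumed property name, used here only for illustration.
    static final String CACHE_FILES_KEY = "mapred.cache.files";

    // Stand-in for DistributedCache.addCacheFile(uri, conf): appends to a property.
    static void addCacheFile(String uri, SketchConfiguration conf) {
        String existing = conf.get(CACHE_FILES_KEY);
        conf.set(CACHE_FILES_KEY, existing == null ? uri : existing + "," + uri);
    }

    // Builds a job, registers one cache file, and returns what the job itself sees.
    static String cacheFilesSeenByJob(boolean useJobConfiguration) {
        SketchConfiguration conf = new SketchConfiguration();
        SketchJob job = new SketchJob(conf); // the job copies conf here
        addCacheFile("/myapp/lookup.dat",
                     useJobConfiguration ? job.getConfiguration() : conf);
        return job.getConfiguration().get(CACHE_FILES_KEY);
    }

    public static void main(String[] args) {
        // Original code path: the file is registered on the caller's conf.
        System.out.println("registered on conf: " + cacheFilesSeenByJob(false));
        // Suggested fix: the file is registered on the job's own configuration.
        System.out.println("registered on job.getConfiguration(): "
                           + cacheFilesSeenByJob(true));
    }
}

// Hypothetical stand-in for a configuration object: a string map with a copy constructor.
class SketchConfiguration {
    private final Map<String, String> props = new HashMap<>();

    SketchConfiguration() {}

    SketchConfiguration(SketchConfiguration other) {
        props.putAll(other.props);
    }

    void set(String key, String value) { props.put(key, value); }

    String get(String key) { return props.get(key); }
}

// Hypothetical stand-in for a job that keeps a private copy of its configuration.
class SketchJob {
    private final SketchConfiguration conf;

    SketchJob(SketchConfiguration conf) {
        this.conf = new SketchConfiguration(conf); // later edits to the argument are not seen
    }

    SketchConfiguration getConfiguration() { return conf; }
}
```

In this model the first call yields null, which matches the null "paths"
reported in the original question, and the second yields the registered path.
In real 0.20 code the equivalent options are to pass job.getConfiguration() to
DistributedCache.addCacheFile(), as suggested above, or to register the file
on "conf" before constructing the Job.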
