So a common use case for the Distributed Cache would be to store a lookup
table for use during a map task, perhaps?
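
Something like the following is what I have in mind. It is only a sketch in plain Java with no Hadoop calls, so it says nothing about how the file actually gets into the cache. The class name, the method names, and the tab-separated file format are all made up for illustration. The assumption is that each task ends up with a local copy of the file, loads it into memory once, and then consults it for every record it maps.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

public class LookupJoin {
    // Load a "key<TAB>value" file into memory. In a real job this would run
    // once per task, against the task-local copy of the cached file
    // (assumption: how that local path is obtained is not shown here).
    public static Map<String, String> loadTable(Path localFile) throws IOException {
        Map<String, String> table = new HashMap<>();
        try (BufferedReader in = Files.newBufferedReader(localFile, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                int tab = line.indexOf('\t');
                if (tab < 0) {
                    continue; // skip lines without a tab separator
                }
                table.put(line.substring(0, tab), line.substring(tab + 1));
            }
        }
        return table;
    }

    // Stand-in for the body of a map call: annotate one input key with the
    // value found in the lookup table, or a placeholder if it is missing.
    public static String mapRecord(Map<String, String> table, String key) {
        return key + "\t" + table.getOrDefault(key, "UNKNOWN");
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical lookup data written to a temp file for the demo.
        Path tmp = Files.createTempFile("lookup", ".tsv");
        Files.write(tmp, "us\tUnited States\nfr\tFrance\n".getBytes(StandardCharsets.UTF_8));
        Map<String, String> table = loadTable(tmp);
        System.out.println(mapRecord(table, "fr"));
        System.out.println(mapRecord(table, "xx"));
        Files.delete(tmp);
    }
}
```

The point of loading the table once per task rather than once per record is that the file is read-only and shared by every record the task sees.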

On 9/10/07, Jeff Hammerbacher <[EMAIL PROTECTED]> wrote:
>
> Thanks, Owen and Doug.  I am looking at that presentation with fresh eyes,
> Owen, and it's great!  If you could toss me the OmniGraffle file for the
> "Process Diagram" on page 10, that would be awesome.  That will serve as the
> main diagram for understanding how a job gets run, and I would love to just
> flesh it out a bit more (e.g., throw some data structures on there, and
> some of the other threads that the JobTracker/TaskTracker run).  I am also
> chatting with some of our Flash guys to see if we can make the diagram
> dynamic, so that you could drill down on various components.
>
> Much appreciated,
> Jeff
>
> On 9/10/07, Doug Cutting <[EMAIL PROTECTED]> wrote:
> >
> > Owen O'Malley wrote:
> > >
> > > On Sep 9, 2007, at 11:18 PM, Jeff Hammerbacher wrote:
> > >
> > >> What's the DistributedCache for, in words?
> > >
> > > > It is for distributing large read-only files that need to be available
> > > > to each task in the job. I've added an entry for it at the bottom of
> > >
> > > http://wiki.apache.org/lucene-hadoop/FAQ
> > >
> > > The answer needs more meat about how to set it up, but at least I
> > > started the entry.
> >
> > We should really improve the javadoc for this and link to it.  The
> > javadoc should be good reference documentation, but is not currently.
> > The wiki and website should provide "user guide" style documentation,
> > but we should primarily rely on javadoc for reference.  Thus the
> > class-level documentation in DistributedCache.java should describe how
> > it's configured, and link to other relevant javadocs (e.g., command line
> > programs that add files to the cache).
> >
> > Doug
> >
>
>
