Thanks Jason.

My object is relatively small, but how do I pass it via the JobConf object?
Could you elaborate a bit?



Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Sat, May 2, 2009 at 11:53 PM, jason hadoop <jason.had...@gmail.com> wrote:

> If it is relatively small, you can pass it via the JobConf object, storing a
> serialized version of your dataset.
> If it is larger, you can pass a serialized version via the distributed
> cache.
> Your map task will need to deserialize the object in its configure method.
>
> None of the above methods gives you an object that is write-shared between
> map tasks.
>
> Please remember that the map tasks execute in separate JVMs on distinct
> machines in the normal MapReduce environment.
>
>
>
> On Sat, May 2, 2009 at 10:59 PM, Amandeep Khurana <ama...@gmail.com>
> wrote:
>
> > How can I create a global variable for each node running my map task? For
> > example, a common ArrayList that my map function can access for every k,v
> > pair it works on. It doesn't really need to create the ArrayList every
> > time.
> >
> > If I create it in the main function of the job, the map function gets a
> > null pointer exception. Where else can this be created?
> >
> > Amandeep
> >
> >
> > Amandeep Khurana
> > Computer Science Graduate Student
> > University of California, Santa Cruz
> >
>
>
>
> --
> Alpha Chapters of my book on Hadoop are available
> http://www.apress.com/book/view/9781430219422
>
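
For later readers of this thread, here is a minimal sketch of the JobConf
approach Jason describes, using the old-style org.apache.hadoop.mapred API.
It assumes the shared object is a small list of strings; the configuration
key "my.shared.list", the comma-separated encoding, and the word-matching
map() body are illustrative choices, not anything Hadoop requires.

import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class SharedListExample {

  // Driver side (main): "serialize" the small object into the job
  // configuration. Here it is just a comma-joined string; a more general
  // object could be Java-serialized and Base64-encoded instead.
  public static void setSharedList(JobConf conf, ArrayList<String> list) {
    StringBuilder sb = new StringBuilder();
    for (String s : list) {
      if (sb.length() > 0) sb.append(',');
      sb.append(s);
    }
    conf.set("my.shared.list", sb.toString()); // illustrative key name
  }

  public static class MyMapper extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, LongWritable> {

    private ArrayList<String> sharedList;

    // configure() runs once per task (per JVM), before any map() calls,
    // so the object is rebuilt once here rather than for every k,v pair.
    @Override
    public void configure(JobConf job) {
      String raw = job.get("my.shared.list", "");
      sharedList = raw.isEmpty()
          ? new ArrayList<String>()
          : new ArrayList<String>(Arrays.asList(raw.split(",")));
    }

    // Every (k, v) pair handled by this task sees the same sharedList, but
    // it is a private copy: writes are not visible to other map tasks.
    public void map(LongWritable key, Text value,
                    OutputCollector<Text, LongWritable> out, Reporter reporter)
        throws IOException {
      for (String s : sharedList) {
        if (value.toString().contains(s)) {
          out.collect(new Text(s), new LongWritable(1));
        }
      }
    }
  }
}

For a dataset too large to fit comfortably in the configuration, the same
configure() hook can instead read a file shipped to each node with
DistributedCache.addCacheFile(...), as Jason suggests above.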
