Thanks, Jason. My object is relatively small. But how do I pass it via the JobConf object? Can you elaborate a bit...
Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz

On Sat, May 2, 2009 at 11:53 PM, jason hadoop <jason.had...@gmail.com> wrote:

> If it is relatively small, you can pass it via the JobConf object, storing a
> serialized version of your dataset.
> If it is larger, you can pass a serialized version via the distributed
> cache.
> Your map task will need to deserialize the object in the configure method.
>
> None of the above methods give you an object that is write-shared between
> map tasks.
>
> Please remember that the map tasks execute in separate JVMs on distinct
> machines in the normal MapReduce environment.
>
> On Sat, May 2, 2009 at 10:59 PM, Amandeep Khurana <ama...@gmail.com> wrote:
>
> > How can I create a global variable for each node running my map task? For
> > example, a common ArrayList that my map function can access for every (k, v)
> > pair it works on. It doesn't really need to create the ArrayList every
> > time.
> >
> > If I create it in the main function of the job, the map function gets a
> > NullPointerException. Where else can this be created?
> >
> > Amandeep
> >
> > Amandeep Khurana
> > Computer Science Graduate Student
> > University of California, Santa Cruz
>
> --
> Alpha Chapters of my book on Hadoop are available
> http://www.apress.com/book/view/9781430219422
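
Here is a minimal sketch of the JobConf approach Jason describes, assuming the old
org.apache.hadoop.mapred API that was current in 2009. The property name
my.shared.list, the class names, and the ArrayList<String> payload are all
illustrative, not part of Hadoop itself. The idea: the driver Java-serializes the
object and Base64-encodes it into a configuration string, and each map task decodes
it once in configure().

    import java.io.*;
    import java.util.ArrayList;

    import org.apache.commons.codec.binary.Base64;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.*;

    public class SharedListJob {

      // Driver side: serialize the list once and stash it in the JobConf.
      public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(SharedListJob.class);

        ArrayList<String> shared = new ArrayList<String>();
        shared.add("some lookup value");

        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(shared);
        oos.close();

        // JobConf values are strings, so Base64-encode the serialized bytes.
        conf.set("my.shared.list",
            new String(Base64.encodeBase64(bos.toByteArray()), "UTF-8"));

        conf.setMapperClass(SharedListMapper.class);
        // ... set input/output paths and formats as usual, then:
        JobClient.runJob(conf);
      }

      public static class SharedListMapper extends MapReduceBase
          implements Mapper<LongWritable, Text, Text, Text> {

        private ArrayList<String> shared;

        // Runs once per task JVM, before any map() calls.
        @Override
        public void configure(JobConf job) {
          try {
            byte[] raw = Base64.decodeBase64(
                job.get("my.shared.list").getBytes("UTF-8"));
            ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(raw));
            shared = (ArrayList<String>) ois.readObject();
            ois.close();
          } catch (Exception e) {
            throw new RuntimeException("Could not deserialize shared list", e);
          }
        }

        public void map(LongWritable key, Text value,
            OutputCollector<Text, Text> output, Reporter reporter)
            throws IOException {
          // 'shared' is now available for every (k, v) pair this task
          // processes; it is a per-task copy, not write-shared across tasks.
        }
      }
    }

Note that each task JVM ends up with its own deserialized copy, so writes made in
one map task are not visible to others, matching Jason's caveat. For anything more
than a small object, the distributed cache route he mentions is the better fit,
since configuration values are shipped with every task.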