My question is simply: how can I have a global variable (e.g. a Hashtable) in
Hadoop that is available to all mappers?

Please help.

Thank you,
Maha

On Feb 7, 2011, at 11:21 AM, maha wrote:

> Thanks Vijay. Now my question is: how can I build one inverted index and have
> it ready to be accessed by all Mappers?
>
> I had my main function initialize a global variable declared in the main
> class as:
>
>     public static Hashtable<String,String> hashtable = new Hashtable<String,String>();
>
> Yet the mappers find it null.
>
> Any help is appreciated,
>
> Maha
>
> Depending on the scale of the data, of the two options it would be best to
> store it in HDFS and use the built-in InputFormats, as that is more scalable.
>
> If necessary (depending on how the data is stored), build a custom
> InputFormat as per the API and set it for the job:
> http://hadoop.apache.org/common/docs/r0.20.0/api/org/apache/hadoop/mapred/InputFormat.html
>
> --
> Vijay
>
> ----- Original Message ----
>> From: maha <m...@umail.ucsb.edu>
>> To: common-user <common-user@hadoop.apache.org>
>> Sent: Sun, February 6, 2011 5:09:38 PM
>> Subject: Mapper reading from local directory or global variable?
>>
>> Hello,
>>
>> I'm wondering which option is more efficient for storing "People's Names"
>> to be processed by Mappers:
>>
>> 1. Store it in a global variable declared in the main class?
>>
>> 2. Store it in HDFS to be distributed and read in each map.
>>
>> Note that the number of mappers so far is around 1000.
>> I appreciate any thoughts :)
>>
>> Thank you,
>>
>> Maha
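
A note on why the static Hashtable shows up empty or null in the mappers: each map task runs in its own JVM, usually on a different node from the one that ran main(), so anything assigned to a static field in the driver is never carried over. One common workaround for a small table is to serialize it into the job configuration in the driver and rebuild it when the mapper is configured. The following is only a minimal sketch of that idea. It uses a plain Map as a stand-in for Hadoop's JobConf so that it runs without Hadoop on the classpath, and every name in it (SideTableDemo, INDEX_KEY, LookupMapper, the sample entries) is hypothetical and not taken from the thread.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch: a plain Map<String,String> stands in for Hadoop's JobConf.
final class SideTableDemo {
    // Assumed property name; any unused configuration key would do.
    static final String INDEX_KEY = "demo.inverted.index";

    // Driver side: flatten the table into one string that fits in a config property.
    static String encode(Map<String, String> table) {
        Properties props = new Properties();
        props.putAll(table);
        StringWriter out = new StringWriter();
        try {
            props.store(out, null); // Properties takes care of escaping keys and values
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    // Mapper side: rebuild the table from the string read out of the configuration.
    static Hashtable<String, String> decode(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Hashtable<String, String> table = new Hashtable<>();
        for (String key : props.stringPropertyNames()) {
            table.put(key, props.getProperty(key));
        }
        return table;
    }

    // Stand-in for a mapper: the table is an instance field filled in configure(),
    // not a static set by the driver.
    static final class LookupMapper {
        private Hashtable<String, String> index = new Hashtable<>();

        // Plays the role of the configure(JobConf) hook in the old mapred API.
        void configure(Map<String, String> conf) {
            String encoded = conf.get(INDEX_KEY);
            if (encoded != null) {
                index = decode(encoded);
            }
        }

        // Returns the entry for a name, or null if the name is not in the table.
        String map(String name) {
            return index.get(name);
        }
    }

    public static void main(String[] args) {
        // "Driver": build the table and place it in the configuration.
        Hashtable<String, String> index = new Hashtable<>();
        index.put("maha", "doc1 doc7");
        index.put("vijay", "doc2");
        Map<String, String> conf = new HashMap<>();
        conf.put(INDEX_KEY, encode(index));

        // "Task": a fresh mapper sees only what travelled in the configuration.
        LookupMapper mapper = new LookupMapper();
        mapper.configure(conf);
        System.out.println(mapper.map("maha"));
    }
}
```

In a real job the driver would presumably call set() on the JobConf and the mapper would call get() on the JobConf passed to configure(). This only suits data small enough to ship with every task. For a larger table, and with around 1000 mappers, keeping the file in HDFS as Vijay suggests (or distributing it with DistributedCache) and loading it once in configure() is likely the better fit.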