Between the two, and depending on the scale of the data, it would be best to
store it in HDFS and read it with one of the built-in InputFormats, as that is
more scalable.

If necessary (depending on how the data is stored), build a custom
InputFormat as per the API and set it for the job:
http://hadoop.apache.org/common/docs/r0.20.0/api/org/apache/hadoop/mapred/InputFormat.html
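
For what it's worth, here is a rough, untested sketch of what setting the
InputFormat on a job might look like with the old org.apache.hadoop.mapred
API from that release. It assumes the names sit one per line in text files
on HDFS, so the built-in TextInputFormat is enough. The class names NamesJob
and NameMapper, the job name, and the input/output arguments are made up for
illustration, and the mapper body is only a placeholder for the real
processing.

```java
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;

// Hypothetical driver class; the name is illustrative only.
public class NamesJob {

    // Hypothetical mapper: receives one line (assumed to be one name) per call.
    public static class NameMapper extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);

        public void map(LongWritable offset, Text line,
                        OutputCollector<Text, IntWritable> out, Reporter reporter)
                throws IOException {
            String name = line.toString().trim();
            if (!name.isEmpty()) {
                // Placeholder for the real per-name processing.
                out.collect(new Text(name), ONE);
            }
        }
    }

    // Builds the job configuration without submitting it.
    public static JobConf configure(String inputDir, String outputDir) {
        JobConf conf = new JobConf(NamesJob.class);
        conf.setJobName("names");
        // Built-in format: key = byte offset in the file, value = the line.
        // A custom InputFormat class would be set here instead.
        conf.setInputFormat(TextInputFormat.class);
        conf.setMapperClass(NameMapper.class);
        conf.setNumReduceTasks(0); // map-only job in this sketch
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        FileInputFormat.setInputPaths(conf, new Path(inputDir));
        FileOutputFormat.setOutputPath(conf, new Path(outputDir));
        return conf;
    }

    public static void main(String[] args) throws IOException {
        // args[0] = HDFS input directory, args[1] = HDFS output directory
        JobClient.runJob(configure(args[0], args[1]));
    }
}
```

With this layout the framework splits the input files and hands each mapper
its own slice, so nothing has to be copied around by hand.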


 
--
  Vijay



----- Original Message ----
> From: maha <m...@umail.ucsb.edu>
> To: common-user <common-user@hadoop.apache.org>
> Sent: Sun, February 6, 2011 5:09:38 PM
> Subject: Mapper reading from local directory or global variable?
> 
> Hello,
> 
>   I'm wondering which option is more efficient to store "People's Names" to
> be processed by Mappers.
>
> 
>  1. Store it in a  global variable declared in the main class?
> 
>  2. Store it in the HDFS to  be distributed and read in each map.
> 
> 
>   Note that the number of mappers until now is around 1000 mappers.
> Appreciate any thought :)
> 
> Thank  you,
> 
> Maha
