Hadoop scheduler and number of reducers config

2011-04-13 Thread Hrishikesh Gadre
Hello All, I have a question about configuring the number-of-reducers property when using a non-FIFO scheduler (either the Capacity or the Fair-share scheduler). As per the guidelines on the Hadoop wiki page, we should set number of reducers = 0.75 * maximum_reduce_slots_available_in_cluster (minimum) h
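
For reference, the rule of thumb quoted in the message above (number of reducers = 0.75 * maximum reduce slots in the cluster) works out as in the rough sketch below. The cluster size, the slots-per-node figure, and the function name `suggested_reducers` are hypothetical and not taken from the thread; the 0.75 factor is the one the poster cites.

```python
import math


def suggested_reducers(reduce_slots_per_node: int, num_nodes: int,
                       factor: float = 0.75) -> int:
    """Apply the quoted rule: reducers = factor * max reduce slots in the cluster.

    All inputs are illustrative assumptions, not values from the thread.
    """
    max_reduce_slots = reduce_slots_per_node * num_nodes
    # Round down so the job does not ask for more slots than the rule allows,
    # but always request at least one reducer.
    return max(1, math.floor(factor * max_reduce_slots))


# Hypothetical cluster: 10 nodes with 2 reduce slots each, i.e. 20 slots total.
print(suggested_reducers(reduce_slots_per_node=2, num_nodes=10))
```

On Hadoop releases of that era the resulting value would typically be passed to a job through the `mapred.reduce.tasks` property. Note that under the Capacity or Fair-share scheduler a single job may not get all cluster slots at once, which is the situation the question is asking about.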

map-reduce across data centers

2010-11-02 Thread Hrishikesh Gadre
Hello everyone, I am curious to know whether anyone has tried using map-reduce across multiple data centers. The use case I have in mind is one where the dataset is geographically distributed across multiple data centers and it may not be cost effective to move the data to a single site (e.g. due t