Hello All,
I have a question about configuring the number-of-reducers property when
using a non-FIFO scheduler (either the Capacity or the Fair-share scheduler).
As per the guidelines on the Hadoop wiki page, we should set the number of
reducers = 0.75 * maximum_reduce_slots_available_in_cluster (minimum)
h
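
For concreteness, here is how the rule of thumb quoted above works out as plain arithmetic. This is only a sketch: the function name, the rounding down, the floor of one reducer, and the 10-node, 4-slot example cluster are illustrative assumptions, not something taken from the wiki.

```python
import math


def suggested_reducers(max_reduce_slots: int, factor: float = 0.75) -> int:
    """Apply the quoted guideline: factor * maximum reduce slots in the cluster.

    Rounding down and never returning fewer than 1 are assumptions made
    for this sketch; the guideline itself does not specify them.
    """
    if max_reduce_slots < 0:
        raise ValueError("max_reduce_slots must be non-negative")
    return max(1, math.floor(factor * max_reduce_slots))


if __name__ == "__main__":
    # Hypothetical cluster: 10 nodes with 4 reduce slots each = 40 slots.
    print(suggested_reducers(10 * 4))  # 0.75 * 40 = 30
```

The slot count passed in is whatever you consider "available"; with a Capacity or Fair-share scheduler that may differ from the cluster-wide total.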
Hello everyone,
I am curious to know whether anyone has tried running map-reduce across multiple
data centers. The use case I have in mind is one where the dataset is
geographically distributed across multiple data centers and it may not be
cost-effective to move the data to a single site (e.g. due t