Re: Reduce copy speed too slow

Marcos Ortiz Tue, 20 Mar 2012 09:12:42 -0700

Hi, Gayatri


On 03/20/2012 11:59 AM, Gayatri Rao wrote:

Hi all,

I am running a MapReduce job on EC2 instances and it seems to be very
slow. It takes hours to do a simple projection and aggregation of
data.

What filesystem are you using for data storage: HDFS on EC2 or Amazon S3?
How much data are you analyzing?

From what I observed, the reduce copy speed is 0.01 MB/sec. I
am new to Hadoop. Could anyone please share insights on what reduce
copy speeds are good to work with? If anyone has experience with this,
any tips on improving it would be appreciated.

Hadoop Map/Reduce jobs shuffle lots of data, so the recommended
configuration is to use 10Gbps networks for the underlying connection
(and dedicated switches on dual-gigabit networks).
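
To put the reported 0.01 MB/sec in perspective, here is a rough
back-of-the-envelope sketch in Python. The 1024 MB shuffle size and the
comparison rates are assumptions chosen only for illustration (they are
not from your job), and 125 MB/sec is just the theoretical peak of a
single gigabit link, not something a real shuffle will sustain.

```python
def copy_time_hours(data_mb: float, rate_mb_per_sec: float) -> float:
    """Hours needed to copy data_mb megabytes at rate_mb_per_sec."""
    if rate_mb_per_sec <= 0:
        raise ValueError("rate_mb_per_sec must be positive")
    return data_mb / rate_mb_per_sec / 3600.0


# Hypothetical amount of map output one reducer has to fetch (assumption).
SHUFFLE_MB = 1024.0

# 0.01 MB/sec is the rate reported in this thread; the other two rates
# are illustrative comparison points only.
RATES_MB_PER_SEC = {
    "reported copy speed": 0.01,
    "1 MB/sec": 1.0,
    "gigabit link, theoretical peak": 125.0,
}

for label, rate in RATES_MB_PER_SEC.items():
    hours = copy_time_hours(SHUFFLE_MB, rate)
    print(f"{label}: {hours * 3600:.0f} seconds ({hours:.2f} hours)")
```

At the reported rate, even about 1 GB of map output per reducer would
take more than a day to copy, so a speed that low probably points to a
network or configuration problem rather than normal shuffle behaviour.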

Remember too that Hadoop is not a real-time system; if you need
real-time random access to your data, use HBase:

http://hbase.apache.org

Regards


Thanks
Gayatri


10th ANNIVERSARY OF THE FOUNDING OF THE UNIVERSIDAD DE LAS CIENCIAS
INFORMATICAS...
CONNECTED TO THE FUTURE, CONNECTED TO THE REVOLUTION

http://www.uci.cu
http://www.facebook.com/universidad.uci
http://www.flickr.com/photos/universidad_uci


--
Marcos Luis Ortíz Valmaseda (@marcosluis2186)
 Data Engineer at UCI
 http://marcosluis2186.posterous.com


