I'm trying to run a MapReduce job on a cluster of 4 DataNodes with 4 cores
each.
My input data is 4GB in size, split into 100MB files. The configuration is
the defaults, so the block size is 64MB.

If I understand it correctly, Hadoop should be running roughly 64 mappers to
process the data (4096MB / 64MB per block), or closer to 80 since splits don't
cross file boundaries, so each 100MB file yields a 64MB split plus a 36MB split.

I'm running a simple data-counting MapReduce job and it's taking about 30
minutes to complete. This seems like way too long, doesn't it?
Is there any tuning you would recommend to try and see an improvement
in performance?
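
For context, the job is essentially just counting input records; it looks
something like the sketch below (class names are illustrative, not my exact
code, and I'm using the org.apache.hadoop.mapreduce API):

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class RecordCount {

        // Emits a single key with a count of 1 for every input record.
        public static class CountMapper
                extends Mapper<LongWritable, Text, Text, LongWritable> {
            private static final Text KEY = new Text("records");
            private static final LongWritable ONE = new LongWritable(1);

            @Override
            protected void map(LongWritable offset, Text line, Context context)
                    throws IOException, InterruptedException {
                context.write(KEY, ONE);
            }
        }

        // Sums the per-record counts into a single total.
        public static class SumReducer
                extends Reducer<Text, LongWritable, Text, LongWritable> {
            @Override
            protected void reduce(Text key, Iterable<LongWritable> counts,
                                  Context context)
                    throws IOException, InterruptedException {
                long total = 0;
                for (LongWritable c : counts) {
                    total += c.get();
                }
                context.write(key, new LongWritable(total));
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = new Job(conf, "record count");
            job.setJarByClass(RecordCount.class);
            job.setMapperClass(CountMapper.class);
            job.setCombinerClass(SumReducer.class); // combine locally to cut shuffle traffic
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(LongWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }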

Thanks,
Pony
