Hi, can somebody give me a rough estimate of the time (in hours or minutes) that a cluster of, say, 30 nodes would take to run a MapReduce word count job on, say, 10 TB of data, assuming that the hardware and the MapReduce program are tuned optimally?
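To make the question concrete, this is the naive calculation I would start from. It is only a sketch: it assumes the job is limited purely by how fast each node can stream its share of the input, and it ignores shuffle, sort, replication, and job startup overhead. The 100 MB/s per-node rate is a placeholder I picked for illustration, not a measurement, and the function name `estimate_hours` and its parameters are my own.

```python
def estimate_hours(data_tb, nodes, mb_per_sec_per_node):
    """Naive lower-bound estimate: total data / aggregate scan rate.

    Assumes perfectly even data distribution and a purely
    throughput-bound job (no shuffle, sort, or startup cost).
    """
    total_mb = data_tb * 1_000_000           # decimal TB -> MB
    cluster_mb_per_sec = nodes * mb_per_sec_per_node
    seconds = total_mb / cluster_mb_per_sec
    return seconds / 3600


if __name__ == "__main__":
    ASSUMED_RATE = 100  # MB/s per node: a placeholder, not a benchmark
    for size_tb in (5, 10, 20):
        hours = estimate_hours(size_tb, nodes=30, mb_per_sec_per_node=ASSUMED_RATE)
        print(f"{size_tb} TB on 30 nodes: about {hours:.2f} hours")
```

With that placeholder rate, 10 TB on 30 nodes comes out a little under one hour, and in this model the time scales linearly with the data size. What I am asking is how far real jobs are from this kind of figure.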
A rough estimate is enough; the data could be 5 TB, 10 TB, or 20 TB. If not a word count, the job could be any simple analysis over that much data. Regards, Shashidhar