30,000 records in 10 MB files.
Both the file sizes and the number of records can vary.




If the data is 10 MB and you have 30k records, and each record takes ~2
minutes to process, I'd suggest using the map phase to distribute the
records across several reducers and then doing the actual processing in
reduce.
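The idea above can be sketched without a Hadoop cluster: the mapper does almost no work and only tags each record with a partition key so the framework spreads them evenly, the shuffle groups records by that key, and the reducers run the expensive per-record computation. A minimal, self-contained simulation (the record names and the trivial `process` function are made up for illustration; in a real job the heavy work would replace `process`):

```python
from collections import defaultdict

def map_phase(records, num_reducers):
    # The mapper is cheap: it only emits (partition, record) pairs
    # so records are spread round-robin across the reducers.
    for i, record in enumerate(records):
        yield i % num_reducers, record

def shuffle(pairs):
    # Group records by partition key, as the framework's shuffle would.
    groups = defaultdict(list)
    for key, record in pairs:
        groups[key].append(record)
    return groups

def reduce_phase(groups, process):
    # The reducers do the actual (expensive) per-record processing.
    return {key: [process(r) for r in recs] for key, recs in groups.items()}

records = [f"record-{n}" for n in range(10)]
groups = shuffle(map_phase(records, num_reducers=3))
results = reduce_phase(groups, process=str.upper)
```

With 3 reducers, the 10 records split 4/3/3, so each reducer's share of the ~2-minutes-per-record work runs in parallel with the others.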
Hmmm... good idea, thanks. But is the reduce phase actually optimized for doing the heavy part of the computation?
