30,000 records in 10 MB files. Both the file sizes and the number of records can vary.
Hmmm... good idea, thanks. But is 'Reduce' optimized to do the heavy part of the computation?

If the data is 10 MB and you have 30k records, each taking ~2 minutes to process, I'd suggest using the map phase only to distribute the records across several reducers, then doing the actual heavy processing in reduce.
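A minimal sketch of that pattern, assuming the newer `org.apache.hadoop.mapreduce` API: the mapper does no real work and only tags each record with a random bucket key so records spread across reducers, and the reducer carries the expensive per-record computation. The bucket count, the reducer count, and `processRecord()` are placeholders, not anything from the original question.

```java
import java.io.IOException;
import java.util.Random;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class HeavyReduceJob {

    // Mapper: no real work here, just tag each record with a random bucket
    // key so the ~30k records spread roughly evenly across the reducers.
    public static class SpreadMapper
            extends Mapper<LongWritable, Text, IntWritable, Text> {
        private static final int NUM_BUCKETS = 50; // assumption: ~number of reducers
        private final Random random = new Random();
        private final IntWritable bucket = new IntWritable();

        @Override
        protected void map(LongWritable offset, Text record, Context context)
                throws IOException, InterruptedException {
            bucket.set(random.nextInt(NUM_BUCKETS));
            context.write(bucket, record);
        }
    }

    // Reducer: runs the expensive ~2-minute-per-record computation on its
    // share of the records; with N reducers the wall-clock time drops ~N-fold.
    public static class HeavyReducer
            extends Reducer<IntWritable, Text, Text, Text> {
        @Override
        protected void reduce(IntWritable bucket, Iterable<Text> records, Context context)
                throws IOException, InterruptedException {
            for (Text record : records) {
                // processRecord() is a hypothetical stand-in for the real work.
                String result = processRecord(record.toString());
                context.write(record, new Text(result));
            }
        }

        private String processRecord(String record) {
            return record; // placeholder for the actual long-running computation
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "heavy-reduce");
        job.setJarByClass(HeavyReduceJob.class);
        job.setMapperClass(SpreadMapper.class);
        job.setReducerClass(HeavyReducer.class);
        job.setNumReduceTasks(50);              // match NUM_BUCKETS above
        job.setMapOutputKeyClass(IntWritable.class);
        job.setMapOutputValueClass(Text.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

The point of the random key is simply load balancing: reduce itself isn't "optimized" for heavy work, but since the data is tiny and the cost is per-record CPU time, pushing the computation to many reduce tasks gets you the parallelism you'd otherwise lose with one small input split.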