Hi, in a MapReduce step I need to extract text from various files using Tika. I put the text extraction into reduce() because I write the extracted text to the output on HDFS. It now occurs to me that I could just as well do the extraction in map() and rely on the default reduce(), which writes every map() result out unchanged. Is that true?
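To make the question concrete, here is a rough sketch of the map-side variant I have in mind. It is only a sketch under assumptions: the class names, the counter names, and the input layout (a text file listing one HDFS path per line) are hypothetical and not part of my current job. It sets the number of reducers to 0, so each map() output goes straight to the output directory. As far as I understand, the default identity reducer would also write every record, but only after the sort and shuffle.

```java
import java.io.IOException;
import java.io.InputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.tika.Tika;
import org.apache.tika.exception.TikaException;

// Hypothetical driver and mapper names, for illustration only.
public class TikaExtractJob {

    // Assumption: each input line is the HDFS path of one file to extract.
    public static class ExtractMapper extends Mapper<LongWritable, Text, Text, Text> {
        private final Tika tika = new Tika();

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            String name = line.toString().trim();
            if (name.isEmpty()) {
                return; // skip blank lines in the path list
            }
            Path file = new Path(name);
            FileSystem fs = file.getFileSystem(context.getConfiguration());
            try (InputStream in = fs.open(file)) {
                // Tika detects the file type and returns the plain text.
                String text = tika.parseToString(in);
                context.write(new Text(name), new Text(text));
            } catch (TikaException e) {
                // Count files Tika could not parse instead of failing the task.
                context.getCounter("tika", "parse_failures").increment(1);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "tika-extract");
        job.setJarByClass(TikaExtractJob.class);
        job.setMapperClass(ExtractMapper.class);
        job.setNumReduceTasks(0); // map-only: no shuffle, no reduce phase
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // path list
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output dir
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

If I left out the setNumReduceTasks(0) call, I would expect the same records in the output, just sorted by key and at the cost of an extra shuffle.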
Thank you, Mark