Setting the following property has had no effect:

mapreduce.input.fileinputformat.split.maxsize = 67108864
I'm still getting 1 mapper per file.

On Thu, May 14, 2015 at 10:27 AM, Ankit Bhatnagar <ank...@yahoo-inc.com> wrote:

> You can explicitly set the split size.
>
> On Wednesday, May 13, 2015 11:37 PM, Pradeep Gollakota <pradeep...@gmail.com> wrote:
>
> Hi All,
>
> I'm writing an MR job to read data using HCatInputFormat. However, the
> job is generating too many splits. I don't have this problem when running
> queries in Hive, since it combines splits by default.
>
> Is there an equivalent in MR so that I'm not generating thousands of
> mappers?
>
> Thanks,
> Pradeep
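
For anyone following the thread, here is a rough sketch of why a max split size alone may not reduce the mapper count. It assumes the table's underlying storage format follows the stock FileInputFormat split logic, where the split size is max(minSize, min(maxSize, blockSize)) and a split never spans more than one file. The class name, the 128 MB block size, and the 1000 files of 10 MB each are hypothetical values chosen for illustration, and the sketch ignores the small slop FileInputFormat allows on the last split of a file.

```java
// Illustrative only: plain arithmetic, no Hadoop dependency.
class SplitSizeSketch {

    // Same formula FileInputFormat uses to pick a split size.
    public static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Splits for a single non-empty file: ceiling division.
    // Simplification: ignores the slop factor on the final split.
    public static long splitsForFile(long fileLength, long splitSize) {
        return (fileLength + splitSize - 1) / splitSize;
    }

    // Splits are computed per file, so small files are never merged.
    public static long totalSplits(long[] fileLengths, long splitSize) {
        long total = 0;
        for (long len : fileLengths) {
            total += splitsForFile(len, splitSize);
        }
        return total;
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024; // hypothetical HDFS block size
        long minSize = 1L;                   // assumed minimum split size
        long maxSize = 67108864L;            // value from this thread (64 MB)

        long splitSize = computeSplitSize(blockSize, minSize, maxSize);

        // Hypothetical input: 1000 files of 10 MB each.
        long[] files = new long[1000];
        java.util.Arrays.fill(files, 10L * 1024 * 1024);

        System.out.println("split size = " + splitSize);
        System.out.println("total splits = " + totalSplits(files, splitSize));
    }
}
```

Under these assumptions the max split size only caps how large a split can get. Files smaller than the cap still produce one split each, which would match the one-mapper-per-file behaviour reported above.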