Still no effect. I set minsize to 32M and maxsize to 64M.

On Thu, May 14, 2015 at 11:07 AM, Ankit Bhatnagar <ank...@yahoo-inc.com> wrote:
> try these
> mapred.max.split.size=
> mapred.min.split.size=
>
> mapreduce.input.fileinputformat.split.maxsize=
> mapreduce.input.fileinputformat.split.minsize=
>
> On Thursday, May 14, 2015 11:04 AM, Pradeep Gollakota <pradeep...@gmail.com> wrote:
>
> The following property has had no effect.
>
> mapreduce.input.fileinputformat.split.maxsize = 67108864
>
> I'm still getting 1 Mapper per file.
>
> On Thu, May 14, 2015 at 10:27 AM, Ankit Bhatnagar <ank...@yahoo-inc.com> wrote:
>
> you can explicitly set the split size
>
> On Wednesday, May 13, 2015 11:37 PM, Pradeep Gollakota <pradeep...@gmail.com> wrote:
>
> Hi All,
>
> I'm writing an MR job to read data using HCatInputFormat... however, the
> job is generating too many splits. I don't have this problem when running
> queries in Hive since it combines splits by default.
>
> Is there an equivalent in MR so that I'm not generating thousands of
> mappers?
>
> Thanks,
> Pradeep
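
One possible reading of why the max split size makes no difference, sketched under assumptions: FileInputFormat-style sizing uses max(minSize, min(maxSize, blockSize)) as the split size, and a split does not span files. If the format underneath HCatInputFormat behaves this way (an assumption, not confirmed in this thread) and the input files are smaller than the split size, each file still yields one split whatever maxsize is set to. The class name `SplitSizeSketch`, the 128 MB block size, and the 1000 files of 5 MB are hypothetical, and the model ignores Hadoop's slop factor for the last split of a file.

```java
import java.util.Arrays;

// Simplified model of FileInputFormat-style split sizing. Not Hadoop code.
class SplitSizeSketch {

    // Split size rule: max(minSize, min(maxSize, blockSize)).
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Splits are cut per file and never span files, so every file
    // contributes at least one split, however small it is.
    static long countSplits(long[] fileLengths, long splitSize) {
        long total = 0;
        for (long len : fileLengths) {
            long perFile = (len + splitSize - 1) / splitSize; // ceiling division
            total += Math.max(1, perFile);
        }
        return total;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long blockSize = 128 * mb; // assumed HDFS block size
        long minSize = 32 * mb;    // minsize from the thread
        long maxSize = 64 * mb;    // maxsize from the thread (67108864)

        long splitSize = computeSplitSize(blockSize, minSize, maxSize);

        // Hypothetical input: 1000 files of 5 MB each.
        long[] smallFiles = new long[1000];
        Arrays.fill(smallFiles, 5 * mb);

        System.out.println("split size = " + splitSize);
        System.out.println("splits     = " + countSplits(smallFiles, splitSize));
    }
}
```

In this model the min/max settings only change how a large file is cut up; they never merge small files, which would be consistent with still seeing 1 Mapper per file.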