Re: HCatInputFormat combine splits

2015-05-14 Thread Pradeep Gollakota

Setting the following property had no effect.

mapreduce.input.fileinputformat.split.maxsize = 67108864

I'm still getting 1 Mapper per file.

On Thu, May 14, 2015 at 10:27 AM, Ankit Bhatnagar ank...@yahoo-inc.com
wrote:

 you can explicitly set the split size



   On Wednesday, May 13, 2015 11:37 PM, Pradeep Gollakota 
 pradeep...@gmail.com wrote:


 Hi All,

 I'm writing an MR job to read data using HCatInputFormat... however, the
 job is generating too many splits. I don't have this problem when running
 queries in Hive since it combines splits by default.

 Is there an equivalent in MR so that I'm not generating thousands of
 mappers?

 Thanks,
 Pradeep
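
One possible reading of the "still 1 Mapper per file" result above: in the `mapreduce` FileInputFormat, split.maxsize only caps how large a single split may be. Splits are computed file by file, so a smaller maxsize can cut big files into more pieces but never merges small files. The sketch below mirrors that sizing rule in plain Java with no Hadoop dependency. The class and method names are hypothetical, the 1 MB file size and 128 MB block size are assumed for illustration, and the slop factor Hadoop applies to the last chunk of a file is ignored. Whether HCatInputFormat follows exactly this path depends on the underlying storage format, which the thread does not state.

```java
class SplitSizeSketch {
    // Same formula as FileInputFormat.computeSplitSize in the mapreduce API.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Number of splits for ONE file: ceil(fileLen / splitSize), and at least
    // one split even for an empty file. Simplified: no slop factor.
    static long splitsForFile(long fileLen, long splitSize) {
        long chunks = (fileLen + splitSize - 1) / splitSize;
        return Math.max(1, chunks);
    }

    // Splits are counted per file and summed; files are never merged.
    static long totalSplits(long[] fileLens, long blockSize, long minSize, long maxSize) {
        long splitSize = computeSplitSize(blockSize, minSize, maxSize);
        long total = 0;
        for (long len : fileLens) {
            total += splitsForFile(len, splitSize);
        }
        return total;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long[] files = new long[1000];          // assumed: 1000 small files
        java.util.Arrays.fill(files, 1 * mb);   // assumed: 1 MB each
        // maxSize = 67108864 (64 MB), the value tried in the thread.
        System.out.println(totalSplits(files, 128 * mb, 1L, 67108864L));
    }
}
```

With these assumed inputs every file is smaller than the split size, so the count equals the number of files regardless of the maxsize value.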

Re: HCatInputFormat combine splits

2015-05-14 Thread Ankit Bhatnagar

try these:

mapred.max.split.size=
mapred.min.split.size=
mapreduce.input.fileinputformat.split.maxsize=
mapreduce.input.fileinputformat.split.minsize=
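
For contrast with the per-file sizing these properties control, here is a minimal sketch of what "combining splits" means: several small files are packed into one split until a size cap is reached. Hive's default input format (CombineHiveInputFormat) and Hadoop's CombineFileInputFormat work along these lines, but they also group by node and rack, which this sketch leaves out. The class name, method name, and greedy packing order are assumptions for illustration, not the library's algorithm, and nothing here shows that such a format can be plugged into HCatInputFormat.

```java
import java.util.ArrayList;
import java.util.List;

class CombineSketch {
    // Greedily pack file lengths into groups whose total stays <= maxSize.
    // Each group stands for one combined split, i.e. one mapper.
    static List<List<Long>> combine(long[] fileLens, long maxSize) {
        List<List<Long>> groups = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentSize = 0;
        for (long len : fileLens) {
            // Close the current group if adding this file would exceed the cap.
            if (!current.isEmpty() && currentSize + len > maxSize) {
                groups.add(current);
                current = new ArrayList<>();
                currentSize = 0;
            }
            current.add(len);
            currentSize += len;
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long[] files = new long[1000];          // assumed: 1000 small files
        java.util.Arrays.fill(files, 1 * mb);   // assumed: 1 MB each
        System.out.println(combine(files, 64 * mb).size());
    }
}
```

Under the same assumed inputs as before (1000 files of 1 MB, 64 MB cap), packing yields far fewer groups than files, which is the behaviour the original question is asking for.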



 On Thursday, May 14, 2015 11:04 AM, Pradeep Gollakota 
pradeep...@gmail.com wrote:
   

 Setting the following property had no effect.

 mapreduce.input.fileinputformat.split.maxsize = 67108864

 I'm still getting 1 Mapper per file.

 On Thu, May 14, 2015 at 10:27 AM, Ankit Bhatnagar ank...@yahoo-inc.com wrote:

you can explicitly set the split size 


 On Wednesday, May 13, 2015 11:37 PM, Pradeep Gollakota 
pradeep...@gmail.com wrote:
   

 Hi All,

 I'm writing an MR job to read data using HCatInputFormat... however, the job is
 generating too many splits. I don't have this problem when running queries in
 Hive since it combines splits by default.

 Is there an equivalent in MR so that I'm not generating thousands of mappers?

 Thanks,
 Pradeep
