I think I understand that from the last 2 replies :) But my question is: can I change the configuration to split the file into 250K chunks so that multiple mappers are invoked?
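For the archives, a minimal sketch of what that could look like in a Pig script. Assumptions: the property name is the pre-0.21 `mapred.max.split.size` (later Hadoop versions renamed it `mapreduce.input.fileinputformat.split.maxsize`), and Pig's `set` command passes it through to the job configuration; the load schema below is hypothetical.

```pig
-- Sketch, not tested: cap each input split at ~250 KB so a file larger
-- than that is processed by more than one mapper. Property name varies
-- by Hadoop version.
set mapred.max.split.size 250000;

raw  = LOAD 'excite-small.log' AS (user:chararray, time:long, query:chararray);
srtd = ORDER raw BY user PARALLEL 4;  -- PARALLEL sets the reducer count
STORE srtd INTO 'sorted-out';
```

Note that with more mappers on a tiny file the per-task overhead will likely outweigh any parallelism gained; small split sizes mainly make sense for CPU-heavy record processing.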
On Thu, May 26, 2011 at 3:41 PM, James Seigel <ja...@tynt.com> wrote:
> have more data for it to process :)
>
> On 2011-05-26, at 4:30 PM, Mohit Anchlia wrote:
>> I ran a simple pig script on this file:
>>
>> -rw-r--r-- 1 root root 208348 May 26 13:43 excite-small.log
>>
>> that orders the contents by name. But it only created one mapper. How
>> can I change this to distribute across multiple machines?
>>
>> On Thu, May 26, 2011 at 3:08 PM, jagaran das <jagaran_...@yahoo.co.in> wrote:
>>> Hi Mohit,
>>>
>>> No. of Maps - it depends on the Total File Size / Block Size.
>>> No. of Reducers - you can specify.
>>>
>>> Regards,
>>> Jagaran
>>>
>>> ________________________________
>>> From: Mohit Anchlia <mohitanch...@gmail.com>
>>> To: common-user@hadoop.apache.org
>>> Sent: Thu, 26 May, 2011 2:48:20 PM
>>> Subject: No. of Map and reduce tasks
>>>
>>> How can I tell how the map and reduce tasks were spread across the
>>> cluster? I looked at the JobTracker web page but can't find that info.
>>>
>>> Also, can I specify how many map or reduce tasks I want to be launched?
>>>
>>> From what I understand, it's based on the number of input files
>>> passed to Hadoop. So if I have 4 files there will be 4 map tasks
>>> launched, and the reducer count depends on the HashPartitioner.
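The "Total File Size / Block Size" rule Jagaran mentions can be sketched as a quick back-of-the-envelope calculation (a simplified model: one map task per input split, ignoring per-file rounding in multi-file jobs):

```python
import math

def num_map_tasks(file_size_bytes, split_size_bytes):
    """Approximate map-task count: one map task per input split."""
    return max(1, math.ceil(file_size_bytes / split_size_bytes))

# excite-small.log is 208,348 bytes; with a default 64 MB block size the
# whole file fits in a single split, hence the single mapper observed.
print(num_map_tasks(208348, 64 * 1024 * 1024))  # -> 1

# Capping the split size at 100 KB would yield 3 splits / 3 mappers.
print(num_map_tasks(208348, 100 * 1024))  # -> 3
```

This is why a 208 KB file never produces more than one mapper under default settings: the split size, not the mapper count, is the knob you actually control.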