Er, sorry I meant mapred.map.tasks = 1

On Thu, Jul 12, 2012 at 10:44 AM, Harsh J <ha...@cloudera.com> wrote:
> Try passing mapred.map.tasks = 0 or set a higher min-split size?
>
> On Thu, Jul 12, 2012 at 10:36 AM, Yang <teddyyyy...@gmail.com> wrote:
>> Thanks Harsh
>>
>> I see
>>
>> then there seems to be some small problems with the Splitter / InputFormat.
>>
>> I'm just reading a 1-line text file through pig:
>>
>> A = LOAD 'myinput.txt' ;
>>
>> supposedly it should generate at most 1 mapper.
>>
>> but in reality, it seems that Pig generated 3 mappers and basically fed
>> empty input to 2 of them
>>
>>
>> Thanks
>> Yang
>>
>> On Wed, Jul 11, 2012 at 10:00 PM, Harsh J <ha...@cloudera.com> wrote:
>>
>>> Yang,
>>>
>>> No, those three are attempts of three individual tasks, not three
>>> attempts of one task.
>>>
>>> This is how you may generally dissect an attempt ID when reading it:
>>>
>>> attempt_201207111710_0024_m_000000_0
>>>
>>> 1. "attempt" - indicates it's an attempt ID you'll be reading
>>> 2. "201207111710" - The job tracker timestamp ID, indicating which
>>> instance of JT ran this job
>>> 3. "0024" - The Job ID for which this was a task attempt
>>> 4. "m" - Indicating this is a mapper (reducers are "r")
>>> 5. "000000" - The task ID of the mapper (000000 is the first mapper,
>>> 000001 is the second, etc.)
>>> 6. "0" - The attempt # for the task ID. 0 means it is the first
>>> attempt, 1 indicates the second attempt, etc.
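
The breakdown above can be sketched as a small parser. This is only an illustration in Python, not anything Hadoop ships: the names `AttemptId` and `parse_attempt_id` are made up for this example, and the pattern assumes IDs shaped exactly like the ones in this thread.

```python
import re
from typing import NamedTuple


class AttemptId(NamedTuple):
    """Fields of a task attempt ID (hypothetical helper type)."""
    jobtracker_id: str   # JobTracker start timestamp, e.g. "201207111710"
    job_number: int      # job sequence number under that JobTracker
    task_type: str       # "map" or "reduce"
    task_number: int     # index of the task within the job
    attempt_number: int  # 0 for the first attempt, 1 for the second, ...


# Assumed layout: attempt_<jt id>_<job>_<m|r>_<task>_<attempt>
_ATTEMPT_RE = re.compile(r"^attempt_(\d+)_(\d+)_([mr])_(\d+)_(\d+)$")


def parse_attempt_id(text: str) -> AttemptId:
    """Split an attempt ID string into its labelled parts."""
    match = _ATTEMPT_RE.match(text)
    if match is None:
        raise ValueError(f"not a recognised attempt ID: {text!r}")
    jt_id, job, kind, task, attempt = match.groups()
    task_type = "map" if kind == "m" else "reduce"
    return AttemptId(jt_id, int(job), task_type, int(task), int(attempt))


if __name__ == "__main__":
    # The three directory names from the userlogs listing in this thread.
    for name in (
        "attempt_201207111710_0024_m_000000_0",
        "attempt_201207111710_0024_m_000001_0",
        "attempt_201207111710_0024_m_000002_0",
    ):
        print(parse_attempt_id(name))
```

Run against the three directory names from the listing, the task number differs (0, 1, 2) while the attempt number is 0 every time, which is what distinguishes three map tasks from three attempts of one task.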
>>>
>>> On Thu, Jul 12, 2012 at 9:16 AM, Yang <teddyyyy...@gmail.com> wrote:
>>> > I set the following params to be false in my pig script (0.10.0)
>>> >
>>> > SET mapred.map.tasks.speculative.execution false;
>>> > SET mapred.reduce.tasks.speculative.execution false;
>>> >
>>> >
>>> > I also verified in the jobtracker UI in the job.xml that they are indeed
>>> > set correctly.
>>> >
>>> > when the job finished, jobtracker UI shows that there is only one attempt
>>> > for each task (in fact I have only 1 task too).
>>> >
>>> > but when I went to the tasktracker node and looked under the
>>> > /var/log/hadoop/userlogs/job_id_here/
>>> > dir, there are 3 attempt dirs:
>>> > job_201207111710_0024 # ls
>>> > attempt_201207111710_0024_m_000000_0
>>> > attempt_201207111710_0024_m_000001_0
>>> > attempt_201207111710_0024_m_000002_0  job-acls.xml
>>> >
>>> > so 3 attempts were indeed fired ??
>>> >
>>> > I have to get this controlled correctly because I'm trying to debug the
>>> > mappers through eclipse,
>>> > but if more than 1 mapper process is fired, they all try to connect to the
>>> > same debugger port, and the end result is that nobody is able to
>>> > hook to the debugger.
>>> >
>>> >
>>> > Thanks
>>> > Yang
>>>
>>>
>>>
>>> --
>>> Harsh J
>>>
>
>
>
> --
> Harsh J



-- 
Harsh J
