Re: Performance of mappers

Iman E Fri, 05 Aug 2011 13:34:36 -0700

Milind, are you talking about the cache specified by the parameter 
local.cache.size. I have not actually changed its value and I can see that the 
default is 10GB. 
ThanksIman

________________________________
From: "milind.bhandar...@emc.com" <milind.bhandar...@emc.com>
To: mapreduce-user@hadoop.apache.org; hadoop_...@yahoo.com
Sent: Friday, August 5, 2011 3:47 PM
Subject: Re: Performance of mappers

Iman,

Are you using cache archives ? If yes, what's the size of the cache archive?

- milind
---
---
Milind Bhandarkar
Greenplum Labs, EMC
(Disclaimer: Opinions expressed in this email are those of the author, and do
not necessarily represent the views of any organization, past or present, the 
author might be affiliated with.)

From: Iman E <hadoop_...@yahoo.com<mailto:hadoop_...@yahoo.com>>
Reply-To: 
"mapreduce-user@hadoop.apache.org<mailto:mapreduce-user@hadoop.apache.org>" 
<mapreduce-user@hadoop.apache.org<mailto:mapreduce-user@hadoop.apache.org>>, 
Iman E <hadoop_...@yahoo.com<mailto:hadoop_...@yahoo.com>>
Date: Fri, 5 Aug 2011 15:31:51 -0400
To: "mapreduce-user@hadoop.apache.org<mailto:mapreduce-user@hadoop.apache.org>" 
<mapreduce-user@hadoop.apache.org<mailto:mapreduce-user@hadoop.apache.org>>
Subject: Re: Performance of mappers

Hi Arun,
Thanks for your reply. I am running Hadoop-0.20.1 and I also tried the cloudera 
hadoop-0.20.1+152.

The task tracker logs does not show any problem. These are the log entries for 
a task attempt that is too slow
2011-08-05 14:28:01,644 INFO org.apache.hadoop.mapred.TaskTracker: 
LaunchTaskAction (registerTask): attempt_201108041814_0035_m_000000_0 task's 
state:UNASSIGNED
2011-08-05 14:28:01,644 INFO org.apache.hadoop.mapred.TaskTracker: Trying to 
launch : attempt_201108041814_0035_m_000000_0
2011-08-05 14:28:01,644 INFO org.apache.hadoop.mapred.TaskTracker: In 
TaskLauncher, current free slots : 2 and trying to launch 
attempt_201108041814_0035_m_000000_0
2011-08-05 14:28:03,097 INFO org.apache.hadoop.mapred.TaskTracker: JVM with ID: 
jvm_201108041814_0035_m_1371719584 given task: 
attempt_201108041814_0035_m_000000_0
2011-08-05 14:32:52,341 INFO org.apache.hadoop.mapred.TaskTracker: 
attempt_201108041814_0035_m_000000_0 0.07052739%
2011-08-05 14:32:55,398 INFO org.apache.hadoop.mapred.TaskTracker: 
attempt_201108041814_0035_m_000000_0 0.123025686%
2011-08-05 14:32:58,402 INFO org.apache.hadoop.mapred.TaskTracker: 
attempt_201108041814_0035_m_000000_0 0.16794641%
2011-08-05 14:33:01,419 INFO org.apache.hadoop.mapred.TaskTracker: 
attempt_201108041814_0035_m_000000_0 0.41990894%
2011-08-05 14:33:04,804 INFO org.apache.hadoop.mapred.TaskTracker: 
attempt_201108041814_0035_m_000000_0 0.8607056%
2011-08-05 14:33:06,617 INFO org.apache.hadoop.mapred.TaskTracker: 
attempt_201108041814_0035_m_000000_0 1.0%
2011-08-05 14:33:06,625 INFO org.apache.hadoop.mapred.TaskTracker: Task 
attempt_201108041814_0035_m_000000_0 is done.

Thanks
Iman

From: Arun C Murthy <a...@hortonworks.com<mailto:a...@hortonworks.com>>
To: mapreduce-user@hadoop.apache.org<mailto:mapreduce-user@hadoop.apache.org>; 
Iman E <hadoop_...@yahoo.com<mailto:hadoop_...@yahoo.com>>
Sent: Friday, August 5, 2011 2:05 PM
Subject: Re: Performance of mappers

Which release of Hadoop are you running?

What do the logs on the TaskTracker tell you during the time the slow tasks are 
getting launched?

hadoop-0.20.203 has a ton of bug fixes since hadoop-0.20.2 which help fix 
issues with slow launches - you might want to upgrade.

Arun

On Aug 5, 2011, at 11:02 AM, Iman E wrote:

Hello all,
I have a question regarding the mappers. I can see from the logs that the start 
time of the mapper is different from start time of logging. I am having a 
problem because that time difference sometimes is few seconds, but other times 
it is

For example, one mapper that is supposed to read 65 MB. Its start time is 
12:30:53 whereis the logging start time is 12:33:01 and the end time is 
12:33:20. All the laoded data are local to the same rack.
In a perfect run, these numbers are as follows: the start time is 18:15:45, 
logging start time: 18:15:48, and end time: 18:16:02.

I am running a job of more than 2400 mapper. Because of the above problem, 
instead of the job taking 15-20 mins  on 14 machine ( it happened in few runs), 
other times it is taking more than 70 mins. Any suggestions how to fix this 
problem or what could possibly be causing it.

Thanks,
Iman

Re: Performance of mappers

Reply via email to