Re: Performance of mappers

2011-08-05 Thread Arun C Murthy
On Aug 5, 2011, at 12:31 PM, Iman E wrote: > > > The task tracker logs do not show any problem. These are the log entries > for a task attempt that is too slow > 2011-08-05 14:28:01,644 INFO org.apache.hadoop.mapred.TaskTracker: > LaunchTaskAction (registerTask): attempt_201108041814_0035_m_

Re: Performance of mappers

2011-08-05 Thread Milind.Bhandarkar
No, I was wondering if you are specifying -cacheArchive or -cacheFile. These are fetched by the tasktracker prior to task startup, and can delay task launch. * Milind * --- Milind Bhandarkar Greenplum Labs, EMC (Disclaimer: Opinions expressed in this email are those of the author, and do no

Re: Performance of mappers

2011-08-05 Thread Iman E
Milind, are you talking about the cache specified by the parameter local.cache.size? I have not actually changed its value, and I can see that the default is 10GB. Thanks, Iman From: "milind.bhandar...@emc.com" To: mapreduce-user@hadoop.apache.org; hadoop_...@yah

Re: Performance of mappers

2011-08-05 Thread Milind.Bhandarkar
Iman, Are you using cache archives? If yes, what's the size of the cache archive? - milind --- Milind Bhandarkar Greenplum Labs, EMC (Disclaimer: Opinions expressed in this email are those of the author, and do not necessarily represent the views of any organization, past or present, the au

Re: Performance of mappers

2011-08-05 Thread Iman E
Hi Arun, Thanks for your reply. I am running Hadoop-0.20.1 and I also tried the Cloudera hadoop-0.20.1+152. The task tracker logs do not show any problem. These are the log entries for a task attempt that is too slow: 2011-08-05 14:28:01,644 INFO org.apache.hadoop.mapred.TaskTracker: Launch

Re: Reducer Run on Which Machine?

2011-08-05 Thread Milind.Bhandarkar
Arun, As we discussed several years ago, there are many use cases where reducer scheduling control would be beneficial. Suhendry, Currently it is not possible to specify hints for reducer scheduling, so patches are welcome. - milind --- Milind Bhandarkar Greenplum Labs, EMC (Disclaimer: Opinio

Re: Performance of mappers

2011-08-05 Thread Arun C Murthy
Which release of Hadoop are you running? What do the logs on the TaskTracker tell you during the time the slow tasks are getting launched? hadoop-0.20.203 has a ton of bug fixes since hadoop-0.20.2 which help fix issues with slow launches - you might want to upgrade. Arun On Aug 5, 2011, at 1

Performance of mappers

2011-08-05 Thread Iman E
Hello all, I have a question regarding the mappers. I can see from the logs that the start time of the mapper is different from the start time of logging. I am having a problem because that time difference is sometimes a few seconds, but other times it is   For example, one mapper that is supposed to
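
The gap described above, between a task's launch and its first log line, can be measured directly from log timestamps. The following is a minimal sketch, not part of the original thread. It assumes both lines begin with a timestamp in the format seen in the TaskTracker entries quoted in this thread (for example "2011-08-05 14:28:01,644"). The function names are hypothetical, and the second sample line is invented purely for illustration.

```python
from datetime import datetime

# Timestamp layout assumed from the TaskTracker entries quoted in the thread,
# e.g. "2011-08-05 14:28:01,644" (milliseconds after the comma).
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
LOG_TS_LENGTH = 23  # len("2011-08-05 14:28:01,644")


def parse_log_timestamp(line: str) -> datetime:
    """Parse the leading timestamp of a log line (hypothetical helper)."""
    return datetime.strptime(line[:LOG_TS_LENGTH], LOG_TS_FORMAT)


def launch_delay_seconds(launch_line: str, first_task_line: str) -> float:
    """Seconds between the launch entry and the task's first log entry."""
    delta = parse_log_timestamp(first_task_line) - parse_log_timestamp(launch_line)
    return delta.total_seconds()


# The first line uses the timestamp quoted in the thread; the second line is
# an invented example of a task's first log entry, not real output.
launch_line = (
    "2011-08-05 14:28:01,644 INFO org.apache.hadoop.mapred.TaskTracker: "
    "LaunchTaskAction (registerTask)"
)
first_task_line = "2011-08-05 14:28:09,144 INFO example.Task: first log line"

if __name__ == "__main__":
    delay = launch_delay_seconds(launch_line, first_task_line)
    print(f"launch-to-first-log delay: {delay} s")
```

Running this over each slow task attempt would show whether the delay falls before the task process starts (for example while cached files are being fetched) or inside the task itself.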

Re: Reducer Run on Which Machine?

2011-08-05 Thread Felix Halim
Suppose we want to override the scheduling to force a reducer to run on a particular machine. Are there any particular classes in Hadoop that we can override to achieve that? Thanks, Felix Halim On Fri, Aug 5, 2011 at 1:27 PM, Arun C Murthy wrote: > Nope, currently we don't do any smart sch