What is the property for setting the number of tolerated failed tasks in one job

2011-05-10 Thread Jeff Zhang
Hi all, I remember there's a property for setting the number of failed tasks that can be tolerated in one job. Does anyone know the property name? -- Best Regards Jeff Zhang

Re: purpose of JobTracker copying job.jar

2011-05-10 Thread Dmitriy Ryaboy
Looking around 0.22, it looks like this behavior is gone there; it's not copying the jar. The only copyToLocal call it makes is to get the job file (not the jar file). 0.20 did both. Does anyone remember why 0.20 needed the jar? -D On Mon, May 9, 2011 at 12:55 AM, Raghu Angadi wrote: > On Wed, Ap

Accessing jobs on the JobTracker

2011-05-10 Thread Christoph Schmitz
Hi, for reporting and monitoring purposes, I would like to access - from Java code - the job configuration of Jobs that someone else has submitted to a JobTracker (in 0.20.169). Basically, this would mean doing a lot of what "hadoop job -status " does (to get to the location of the job.xml fi
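
For what it's worth, a minimal sketch of that approach using the old 0.20 JobClient API; the class name is illustrative and the job ID string is passed in as an argument rather than hard-coded:

    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.JobID;
    import org.apache.hadoop.mapred.RunningJob;

    public class JobInspector {
      public static void main(String[] args) throws Exception {
        // Connects to the JobTracker configured in the local Hadoop config
        JobClient client = new JobClient(new JobConf());

        // Look up a job (possibly someone else's) by its ID string
        RunningJob job = client.getJob(JobID.forName(args[0]));

        // Location of the submitted job.xml, similar to what
        // "hadoop job -status" reports
        System.out.println(job.getJobFile());
      }
    }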

Re: What is the property for setting the number of tolerated failed tasks in one job

2011-05-10 Thread Wantao
I guess the property is "mapreduce.job.maxtaskfailures.per.tracker", and the default value is 4. Wantao -- Original -- From: "Jeff Zhang"; Date: Tue, May 10, 2011 04:32 PM To: "mapreduce-user"; Subject: What is the property for setting the number of tolerat

Re: FW: NNbench and MRBench

2011-05-10 Thread Wantao
Hi Stanley, Evaluating and analysing the benchmark results of a system is complicated; it requires a deep understanding of the target system. You can probably get some hints from papers about MapReduce/Hadoop, in which the experiment section typically provides good examples on how to evaluate te

RE: Where are map and reduce functions from the Gridmix2 examples

2011-05-10 Thread stanley.shi
I was just reading the gridmix2 code, and I'm happy to find that I can answer this question: for javasort, I see these two lines of code: jobConf.setMapperClass(IdentityMapper.class); jobConf.setReducerClass(IdentityReducer.
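
For context, a rough sketch of what that kind of sort-only job setup looks like in the old API (the class name, job name, input format choice, and paths here are illustrative, not Gridmix2's actual code):

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.KeyValueTextInputFormat;
    import org.apache.hadoop.mapred.lib.IdentityMapper;
    import org.apache.hadoop.mapred.lib.IdentityReducer;

    public class JavaSortSketch {
      public static void main(String[] args) throws Exception {
        JobConf jobConf = new JobConf(JavaSortSketch.class);
        jobConf.setJobName("javasort-sketch");

        // Identity map and reduce: records pass through unchanged,
        // so the only real work is the framework's sort/shuffle.
        jobConf.setMapperClass(IdentityMapper.class);
        jobConf.setReducerClass(IdentityReducer.class);

        // Text keys and values, so the identity pipeline type-checks
        jobConf.setInputFormat(KeyValueTextInputFormat.class);
        jobConf.setOutputKeyClass(Text.class);
        jobConf.setOutputValueClass(Text.class);

        FileInputFormat.setInputPaths(jobConf, new Path(args[0]));
        FileOutputFormat.setOutputPath(jobConf, new Path(args[1]));

        JobClient.runJob(jobConf);
      }
    }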

Is there a way to see what file killed a mapper?

2011-05-10 Thread Jonathan Coveney
I have a basic job that is dying, I think, on one badly compressed file. Is there a way to see what file it is choking on? Via the job tracker I can find the mapper that is dying but I cannot find a record of the file that it died on. Thank you for your help

RE: Is there a way to see what file killed a mapper?

2011-05-10 Thread GOEKE, MATTHEW [AG/1000]
Someone might have a more graceful method of determining this, but I've found that outputting that kind of data to counters is the most effective way. Otherwise you could use stderr or stdout, but then you would need to mine the log data on each node to figure it out. Matt From: Jonathan Coveney [
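
A minimal sketch of the counter approach in the old API; the class, group, and counter names are made up for illustration:

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    public class CountingMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {

      public void map(LongWritable key, Text value,
                      OutputCollector<Text, Text> output, Reporter reporter)
          throws IOException {
        try {
          // ... real record processing would go here ...
          output.collect(new Text("ok"), value);
        } catch (Exception e) {
          // Counters roll up to the JobTracker UI, so there is no need
          // to mine stdout/stderr logs on each node afterwards.
          reporter.incrCounter("Problems", "BAD_RECORDS", 1);
        }
      }
    }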

Re: What is the property for setting the number of tolerated failure task in one job

2011-05-10 Thread Amar Kamat
The property to set the maximum number of task failures a job can tolerate is 'mapred.max.map.failures.percent' in the old API and 'mapreduce.map.failures.maxpercent' in the new API. This determines job failure. Amar On 5/10/11 2:02 PM, "Jeff Zhang" wrote: Hi all, I just remember there's
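
A small sketch of setting this through the old API's JobConf helpers (the 10% threshold is an arbitrary example):

    import org.apache.hadoop.mapred.JobConf;

    public class FailureTolerantJob {
      public static JobConf configure() {
        JobConf conf = new JobConf(FailureTolerantJob.class);
        // Tolerate up to 10% of map tasks failing before the job is
        // failed; equivalent to setting
        // "mapred.max.map.failures.percent" directly.
        conf.setMaxMapTaskFailuresPercent(10);
        // The reduce-side counterpart
        conf.setMaxReduceTaskFailuresPercent(10);
        return conf;
      }
    }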

Re: Is there a way to see what file killed a mapper?

2011-05-10 Thread Amar Kamat
MapReduce updates the task's configuration and sets 'map.input.file' to point to the file on which the task will work. In the new MapReduce API, it's renamed to 'mapreduce.map.input.file'. You can print the value corresponding to 'map.input.file'. Similarly, 'map.input.start' points to th
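
A minimal sketch of reading that value from a mapper in the old API (class name is illustrative):

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    public class FileAwareMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {

      @Override
      public void configure(JobConf job) {
        // Set per task by file-based input formats in the old API;
        // 'mapreduce.map.input.file' in the new API. Printing it lands
        // in the task attempt's stdout log, so a failed attempt's log
        // shows which file it was reading.
        System.out.println("Input split file: " + job.get("map.input.file"));
      }

      public void map(LongWritable key, Text value,
                      OutputCollector<Text, Text> output, Reporter reporter) {
        // ... processing ...
      }
    }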

Re: Including external libraries in my job.

2011-05-10 Thread Amar Kamat
You can place the extra library JARs in the $HADOOP_HOME/lib folder and Hadoop will pick them up from there. Amar On 5/3/11 7:12 PM, "Niels Basjes" wrote: Hi, I've written my first very simple job that does something with HBase. Now when I try to submit my jar in my cluster I get this: [nbasj
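
Besides dropping JARs into $HADOOP_HOME/lib on every node, a per-job alternative is the DistributedCache; a minimal sketch, assuming the JAR has already been uploaded to HDFS (the class name and path are hypothetical):

    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.JobConf;

    public class ClasspathSetup {
      public static void addHBaseJar(JobConf conf) throws Exception {
        // Adds the HDFS-resident JAR to every task's classpath
        // for this job only.
        DistributedCache.addFileToClassPath(new Path("/libs/hbase.jar"), conf);
      }
    }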

Re: Is there a way to see what file killed a mapper?

2011-05-10 Thread Jonathan Coveney
Thanks, these are quite useful. 2011/5/10 Amar Kamat > MapReduce updates the task’s configuration and sets ‘map.input.file’ to > point to the file on which the task intends to work on. In the new MapReduce > API, its renamed to ‘mapreduce.map.input.file’. You can print the value > corresponding

Null pointer exception in Mapper initialization

2011-05-10 Thread Mapred Learn
Hi, I get an error like: java.lang.NullPointerException at org.apache.hadoop.io.serializer.SerializationFactory.getSerializer(SerializationFactory.java:73) at org.apache.hadoop.mapred.MapTask$MapOut

Getting (or setting) a job ID

2011-05-10 Thread Adam Phelps
We have intermittently seen cases where a job will "freeze" for some as-yet-unknown reason, and thereby block other processes waiting for that job to complete. I'm trying to modify our job-launching scripts to kill the job if it doesn't complete in a reasonable amount of time; however, to do th

RE: Getting (or setting) a job ID

2011-05-10 Thread Aaron Baff
Once the job is submitted, grab the JobID from the job object, print it to STDOUT or to a file, and have your startup script(s) parse it from there. --Aaron -Original Message- From: Adam Phelps [mailto:a...@opendns.com] Sent: Tuesday, May 10, 2011 3:45 PM
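
A minimal sketch of that flow with the new-API Job class (class and job names are illustrative):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class SubmitAndPrint {
      public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "my-job");
        // ... set mapper/reducer/input/output here ...
        job.submit();  // returns immediately, unlike waitForCompletion()
        // Print the ID so a wrapper script can capture it and later
        // run "hadoop job -kill <id>" if the job hangs.
        System.out.println(job.getJobID());
      }
    }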

Re: Getting (or setting) a job ID

2011-05-10 Thread Adam Phelps
We could make a change of that nature. We currently use job.waitForCompletion() to submit jobs; if we switch to job.submit(), is there an alternate blocking call to wait for the job to finish other than polling job.isComplete()? I'm not currently seeing anything in the API. - Adam On 5/10/1

Re: Getting (or setting) a job ID

2011-05-10 Thread Harsh J
Hey Adam, On Wed, May 11, 2011 at 5:17 AM, Adam Phelps wrote: > We could make a change of that nature.  We currently use > job.waitForCompletion() to submit jobs, if we switch to job.submit() is > there an alternate blocking call to wait for the job to finish other than > polling job.isComplete()
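
For reference, a polling loop along those lines, which also covers the original kill-after-timeout goal (the 5-second interval is arbitrary):

    import org.apache.hadoop.mapreduce.Job;

    public class TimedWait {
      // Returns true if the job completed successfully within
      // timeoutMillis; kills the job and returns false on timeout.
      public static boolean waitWithTimeout(Job job, long timeoutMillis)
          throws Exception {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!job.isComplete()) {
          if (System.currentTimeMillis() > deadline) {
            job.killJob();  // give up on a hung job
            return false;
          }
          Thread.sleep(5000);  // poll every 5 seconds
        }
        return job.isSuccessful();
      }
    }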