On Nov 20, 2011, at 5:18 PM, Mat Kelcey wrote:

> Thanks for the suggestion Arun, I hadn't seen these params before.
>
> No way to do it for a job in flight though I guess?
Unfortunately, no. You'll need to re-run the job.

Also, you want to use 'bin/mapred job -fail-task <taskattemptid>' 4 times
to abandon the task. If you use '-kill-task' it will continue to be re-run.

Arun

> Cheers,
> Mat
>
> On 20 November 2011 16:43, Arun C Murthy <a...@hortonworks.com> wrote:
>> Mat,
>>
>> Take a look at mapred.max.(map|reduce).failures.percent.
>>
>> See:
>>
>> http://hadoop.apache.org/common/docs/r0.20.205.0/api/org/apache/hadoop/mapred/JobConf.html#setMaxMapTaskFailuresPercent(int)
>>
>> http://hadoop.apache.org/common/docs/r0.20.205.0/api/org/apache/hadoop/mapred/JobConf.html#setMaxReduceTaskFailuresPercent(int)
>>
>> hth,
>> Arun
>>
>> On Nov 20, 2011, at 1:31 PM, Mat Kelcey wrote:
>>
>>> Hi,
>>>
>>> I have a largish job running that, due to the quirks of the third
>>> party input format I'm using, has 280,000 map tasks. (I know this is
>>> far from ideal but it'll do for me.)
>>>
>>> I'm passing this data (the Common Crawl web crawl dataset) through a
>>> visible-text-from-html extraction library (boilerpipe) which is
>>> struggling with _1_ particular task. It hits a sequence of records
>>> that are _insanely_ slow to parse for some reason. Rather than a few
>>> minutes per split, it took 7+ hrs before I started explicitly trying
>>> to fail the task (hadoop job -fail-task). Since I'm running with bad
>>> record skipping I was hoping I could issue -fail-task a few times and
>>> ride over the bad records, but it looks like there are quite a few of
>>> them. Since it's only 1 of the 280,000 I'm actually happy to just
>>> give up on the entire split.
>>>
>>> Now if I were running a map-only job I'd just kill the job, since I'd
>>> have the output of the other 279,999. This job has a no-op reduce
>>> step though, since I wanted to take the chance to compact the output
>>> into a much smaller number of sequence files (I regret that decision
>>> now). As such I can't just kill the job, since I'd lose the rest of
>>> the processed data (if I understand correctly?)
>>>
>>> So does anyone know a way to just abandon the entire split?
>>>
>>> Cheers,
>>> Mat
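(A minimal sketch of the -fail-task sequence Arun describes; the job and
attempt ids below are placeholders. Each failed attempt is retried under a
new attempt id with an incremented suffix, so the command is reissued
against the latest attempt each time; after mapred.map.max.attempts
failures, 4 by default, the task is given up for good:

    bin/mapred job -fail-task attempt_201111201643_0001_m_000123_0
    bin/mapred job -fail-task attempt_201111201643_0001_m_000123_1
    bin/mapred job -fail-task attempt_201111201643_0001_m_000123_2
    bin/mapred job -fail-task attempt_201111201643_0001_m_000123_3

Note that without the failures-percent settings below, four failed attempts
fail the whole job, which is why the job has to be re-run with those params
set before abandoning the task this way.)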
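(And a minimal sketch of the JobConf calls from the links above, using the
old mapred API; the class name, job name, and 1% threshold are just
example placeholders. With 280,000 maps, even 1% leaves far more headroom
than the single bad task needs:

    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class TolerantJob {
      public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(TolerantJob.class);
        conf.setJobName("boilerpipe-extract");   // placeholder name

        // Allow up to 1% of map tasks to fail without failing the job.
        // With 280,000 maps that is 2,800 tasks -- ample room for the
        // one split being abandoned via -fail-task.
        conf.setMaxMapTaskFailuresPercent(1);

        // Keep reduces strict: any failed reduce still fails the job.
        conf.setMaxReduceTaskFailuresPercent(0);

        // ... input/output formats, paths, mapper and reducer as usual ...

        JobClient.runJob(conf);
      }
    }

The records from the abandoned task are simply missing from the output;
everything from the other 279,999 tasks still flows through the reduce
step as normal.)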