Hi Oleg

From the job tracker page, you can get to the failed tasks and see which file split was processed by each task. The split information is available under the status column for each task.
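If you also want the file name to appear in the task logs themselves, one option is to log the split's path when the mapper starts. This is only a sketch, assuming the new `mapreduce` API and a plain (non-combined) file input format, where each map task's split is a `FileSplit`; the class name `LoggingMapper` and the key/value types are illustrative:

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class LoggingMapper extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    protected void setup(Context context)
            throws IOException, InterruptedException {
        // For gzip input the file is not splittable, so each map task
        // reads exactly one whole .gz file; the split's path names it.
        FileSplit split = (FileSplit) context.getInputSplit();
        System.err.println("Processing input file: " + split.getPath());
    }

    // map(...) as usual. If the task then dies with the EOFException,
    // the last "Processing input file" line in that task's stderr log
    // identifies the corrupted file.
}
```
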
The file split information is not available in the job history.

Regards
Bejoy KS

On Tue, Jul 24, 2012 at 1:49 PM, Oleg Ruchovets <oruchov...@gmail.com> wrote:
> Hi,
> I got the following exception running a hadoop job:
>
> java.io.EOFException: Unexpected end of input stream
>     at org.apache.hadoop.io.compress.DecompressorStream.getCompressedData(DecompressorStream.java:99)
>     at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:87)
>     at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:75)
>     at java.io.InputStream.read(InputStream.java:85)
>     at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:205)
>     at org.apache.hadoop.util.LineReader.readLine(LineReader.java:169)
>     at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:114)
>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:456)
>     at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>     at org.apache.hadoop.mapred.Child$4.run(Child.
>
> As I understood, some of my files are corrupted (I am working with the GZ
> format).
>
> I resolved the issue using conf.set("mapred.max.map.failures.percent", "1"),
> but I don't know which file caused the problem.
>
> Question:
> How can I get the name of the corrupted file?
>
> Thanks in advance
> Oleg.
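Independent of the job tracker UI, you can also find the bad file by checking the gzip files directly: a truncated or corrupt .gz stream fails with an EOFException during decompression, just like in the stack trace above. A minimal sketch using only the JDK (the class name `GzipCheck` is mine; you would first copy the inputs locally, e.g. with `hadoop fs -get`):

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;

public class GzipCheck {

    // Returns true if the .gz file decompresses cleanly end to end,
    // false if it is truncated or otherwise corrupt.
    static boolean isReadable(File f) {
        byte[] buf = new byte[8192];
        try (GZIPInputStream in = new GZIPInputStream(new FileInputStream(f))) {
            while (in.read(buf) != -1) {
                // drain the stream; we only care whether it ends cleanly
            }
            return true;
        } catch (IOException e) { // EOFException, ZipException, ...
            return false;
        }
    }

    public static void main(String[] args) {
        for (String name : args) {
            System.out.println((isReadable(new File(name)) ? "OK      " : "CORRUPT ") + name);
        }
    }
}
```

The same check can be done without any code via `gzip -t file.gz`, which exits non-zero for a corrupt archive.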