Sincerely, Vadim
I have a bunch of gzip files which I am trying to process with a Hadoop
task. The task fails with the following exception:
java.io.EOFException: Unexpected end of ZLIB input stream
    at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:223)
    at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:141)
    at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:92)
    at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream.read(GzipCodec.java:124)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
    at org.apache.hadoop.mapred.LineRecordReader.readLine(LineRecordReader.java:136)
    at org.apache.hadoop.mapred.LineRecordReader.readLine(LineRecordReader.java:128)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:117)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:39)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:147)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:208)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2016)
I guess some of the files are invalid. However, I could not find the
name of the file causing this exception anywhere in the logs. Because
the dataset is huge, I would rather not extract the files from DFS and
verify them with gzip one by one. Any suggestions? Thanks!
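
For illustration only, here is a minimal sketch of one way to check a gzip stream without unpacking it to disk. It uses only the JDK's java.util.zip classes. The class name GzipCheck and the method isValidGzip are made up for this example and are not part of Hadoop. The sketch assumes that a truncated or corrupt file shows up as an EOFException or ZipException while the stream is read to the end, which matches the exception in the trace above.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.ZipException;

// Hypothetical helper, not part of Hadoop: reads a gzip stream to the end
// and reports whether it decompressed cleanly.
class GzipCheck {

    // Returns true if the whole stream inflates without error, false if it
    // is truncated (EOFException) or corrupt (ZipException).
    // Other I/O errors are propagated to the caller.
    public static boolean isValidGzip(InputStream raw) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(raw)) {
            byte[] buf = new byte[64 * 1024];
            while (in.read(buf) != -1) {
                // Discard the data; we only care whether reading succeeds.
            }
            return true;
        } catch (EOFException | ZipException e) {
            return false;
        }
    }

    // Reads gzip data from standard input and prints OK or BROKEN.
    public static void main(String[] args) throws IOException {
        System.out.println(isValidGzip(System.in) ? "OK" : "BROKEN");
    }
}
```

Assuming the class is compiled locally, a file could be streamed through it with something like "hadoop fs -cat /some/path/part.gz | java GzipCheck" (the path is a placeholder), which avoids copying the file out of DFS first. It still reads every byte, so for a very large dataset it may be more practical to log the input file name from inside the map task instead; in the old mapred API this is, as far as I know, available through the map.input.file job property.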
- broken gzip file Vadim Zaliva
