Thanks, Harsh. After digging some more, it appears the file itself was corrupted, and that is what caused the exception. After regenerating the gzip file from source, I no longer see the issue.
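
For anyone who hits the same trace, here is a minimal sketch of how a suspect gzip file could be checked before it is handed to a job. It assumes Python 3 is available on the machine holding a local copy of the file. The function name `is_gzip_intact` and the file name in the usage note are illustrative only and are not part of Hadoop or Pig. It simply reads the file end to end and reports whether decompression failed anywhere.

```python
import gzip
import zlib


def is_gzip_intact(path, chunk_size=1 << 20):
    """Return True if the whole gzip file at `path` decompresses cleanly."""
    try:
        with gzip.open(path, "rb") as stream:
            # Read to the end so every deflate block and the trailing
            # CRC/length check are exercised, not just the header.
            while stream.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # OSError: bad header or CRC mismatch; EOFError: truncated file;
        # zlib.error: invalid deflate data inside the stream.
        return False
    return True
```

Calling `is_gzip_intact("part-00000.gz")` (a hypothetical file name) on a local copy should return False for a truncated or damaged file, which would point to the input rather than to the decompressor.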

On Jul 20, 2012, at 8:48 PM, Harsh J <ha...@cloudera.com> wrote:

> Prashant,
>
> Can you add in some context on how these files were written, etc.?
> Perhaps open a JIRA with a sample file and test-case to reproduce
> this? Other env stuff with info on version of hadoop, etc. would help
> too.
>
> On Sat, Jul 21, 2012 at 2:05 AM, Prashant Kommireddi
> <prash1...@gmail.com> wrote:
>> I am seeing these exceptions, anyone know what they might be caused due to?
>> Case of corrupt file?
>>
>> java.io.IOException: too many length or distance symbols
>>     at org.apache.hadoop.io.compress.zlib.ZlibDecompressor.inflateBytesDirect(Native Method)
>>     at org.apache.hadoop.io.compress.zlib.ZlibDecompressor.decompress(ZlibDecompressor.java:221)
>>     at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:80)
>>     at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
>>     at java.io.InputStream.read(InputStream.java:85)
>>     at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
>>     at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:97)
>>     at org.apache.pig.builtin.PigStorage.getNext(PigStorage.java:109)
>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:187)
>>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
>>     at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>>     at org.apache.hadoop.mapred.Child.main(Child.java:170)
>>
>>
>> Thanks,
>> Prashant
>
>
>
> --
> Harsh J