Todd, you rock. We'll integrate this into Kevin's branch right away. -Dmitriy
On Wed, Apr 7, 2010 at 5:24 PM, Todd Lipcon <t...@cloudera.com> wrote:
> For Dmitriy and anyone else who has seen this error, I just committed a fix
> to my github repository:
>
> http://github.com/toddlipcon/hadoop-lzo/commit/f3bc3f8d003bb8e24f254b25bca2053f731cdd58
>
> The problem turned out to be an assumption that InputStream.read() would
> return all the bytes that were asked for. This turns out to almost always be
> true on local filesystems, but on HDFS it's not true if the read crosses a
> block boundary. So, every couple of TB of lzo-compressed data, one might see
> this error.
>
> Big thanks to Alex Roetter, who was able to provide a file that exhibited
> the bug!
>
> Thanks
> -Todd
>
> On Tue, Apr 6, 2010 at 10:35 AM, Todd Lipcon <t...@cloudera.com> wrote:
>
>> Hi Alex,
>> Unfortunately I wasn't able to reproduce the problem, and the data Dmitriy
>> is working with is sensitive.
>> Do you have some data you could upload (or send me off-list) that
>> exhibits the issue?
>> -Todd
>>
>> On Tue, Apr 6, 2010 at 9:50 AM, Alex Roetter <aroet...@imageshack.net>
>> wrote:
>> >
>> > Todd Lipcon <t...@...> writes:
>> >
>> > > Hey Dmitriy,
>> > >
>> > > This is very interesting (and worrisome, in a way!). I'll try to take
>> > > a look this afternoon.
>> > >
>> > > -Todd
>> >
>> > Hi Todd,
>> >
>> > I wanted to see if you had made any progress on this front. I'm seeing a
>> > very similar error trying to run an MR job (Hadoop 0.20.1) over a bunch
>> > of LZOP-compressed / indexed files (using Kevin Weil's package), and I
>> > have one map task that always fails in what looks like the same place
>> > as described in the previous post. I haven't yet done the
>> > experimentation mentioned above (isolating the input file corresponding
>> > to the failed map task, decompressing and recompressing it, testing it
>> > while operating directly on local disk instead of HDFS, etc.).
>> >
>> > However, since I am crashing in exactly the same place, it seems likely
>> > this is related, and I thought I'd check on your work in the meantime.
>> >
>> > FYI, my stack trace is below:
>> >
>> > 2010-04-05 18:15:16,895 FATAL org.apache.hadoop.mapred.TaskTracker: Error
>> > running child : java.lang.InternalError: lzo1x_decompress_safe returned:
>> >   at com.hadoop.compression.lzo.LzoDecompressor.decompressBytesDirect(Native Method)
>> >   at com.hadoop.compression.lzo.LzoDecompressor.decompress(LzoDecompressor.java:303)
>> >   at com.hadoop.compression.lzo.LzopDecompressor.decompress(LzopDecompressor.java:104)
>> >   at com.hadoop.compression.lzo.LzopInputStream.decompress(LzopInputStream.java:223)
>> >   at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
>> >   at java.io.InputStream.read(InputStream.java:85)
>> >   at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
>> >   at org.apache.hadoop.util.LineReader.readLine(LineReader.java:187)
>> >   at com.hadoop.mapreduce.LzoLineRecordReader.nextKeyValue(LzoLineRecordReader.java:126)
>> >   at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
>> >   at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>> >   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>> >   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:583)
>> >   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>> >   at org.apache.hadoop.mapred.Child.main(Child.java:170)
>> >
>> > Any update much appreciated,
>> > Alex
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
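For anyone who hits this class of error elsewhere: the root cause Todd describes is
the classic short-read assumption. Below is a minimal illustrative sketch of the bug
pattern and its fix in Java. It is not the actual hadoop-lzo patch (see the commit
link above for that), and the class and method names are invented for the example.

import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class ReadFullyExample {

    // Buggy pattern: on a local file a single read() almost always returns
    // all `len` bytes, but on HDFS a read that crosses a block boundary may
    // return fewer, leaving the tail of the buffer unfilled.
    static void readBuggy(InputStream in, byte[] buf, int len) throws IOException {
        int n = in.read(buf, 0, len); // may return fewer than len bytes
        if (n == -1) {
            throw new EOFException("unexpected end of stream");
        }
        // A decompressor then operating on the partially filled buffer can
        // fail in ways that look like data corruption (e.g. the
        // lzo1x_decompress_safe InternalError in Alex's stack trace).
    }

    // Fixed pattern: loop until the requested number of bytes has arrived,
    // or fail loudly on a genuine EOF.
    static void readFully(InputStream in, byte[] buf, int len) throws IOException {
        int off = 0;
        while (off < len) {
            int n = in.read(buf, off, len - off);
            if (n == -1) {
                throw new EOFException("premature end of stream: wanted "
                        + len + " bytes, got " + off);
            }
            off += n;
        }
    }
}

The fixed loop is the same idea as Hadoop's org.apache.hadoop.io.IOUtils.readFully
helper; any code that hands a fixed-size buffer to a decompressor should fill it
this way rather than trusting a single InputStream.read() call.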