Re: Errors reading lzo-compressed files from Hadoop

2010-04-08 Thread Todd Lipcon
Doh, a couple more silly bugs in there. Don't use that version quite yet -
I'll put up a better patch later today. (Thanks to Kevin and Ted Yu for
pointing out the additional problems.)

-Todd

On Wed, Apr 7, 2010 at 5:24 PM, Todd Lipcon t...@cloudera.com wrote:

 For Dmitriy and anyone else who has seen this error, I just committed a fix
 to my github repository:


 http://github.com/toddlipcon/hadoop-lzo/commit/f3bc3f8d003bb8e24f254b25bca2053f731cdd58

 The problem turned out to be an assumption that InputStream.read() would
 return all the bytes that were asked for. That is almost always true on
 local filesystems, but on HDFS it is not when the read crosses a block
 boundary. So, every couple of TB of LZO-compressed data, one might see
 this error.
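
 As an illustration (a minimal sketch of the general technique, not the
 actual patch -- see the commit above for the real change), the safe
 pattern is to loop until the requested byte count is satisfied:

   import java.io.EOFException;
   import java.io.IOException;
   import java.io.InputStream;

   final class ReadFullyUtil {
     // Hypothetical helper: read exactly 'len' bytes. A single
     // InputStream.read() may legally return fewer than requested --
     // on HDFS this happens when the read crosses a block boundary.
     static void readFully(InputStream in, byte[] buf, int off, int len)
         throws IOException {
       while (len > 0) {
         int n = in.read(buf, off, len);
         if (n == -1) {
           throw new EOFException("Premature EOF in compressed stream");
         }
         off += n;
         len -= n;
       }
     }
   }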

 Big thanks to Alex Roetter, who was able to provide a file that exhibited
 the bug!

 Thanks
 -Todd


 On Tue, Apr 6, 2010 at 10:35 AM, Todd Lipcon t...@cloudera.com wrote:

 Hi Alex,
 Unfortunately I wasn't able to reproduce, and the data Dmitriy is
 working with is sensitive.
 Do you have some data you could upload (or send me off list) that
 exhibits the issue?
 -Todd





-- 
Todd Lipcon
Software Engineer, Cloudera


Re: Errors reading lzo-compressed files from Hadoop

2010-04-08 Thread Todd Lipcon
OK, fixed, unit tests passing again. If anyone sees any more problems, let
one of us know!

Thanks
-Todd


-- 
Todd Lipcon
Software Engineer, Cloudera


Re: Errors reading lzo-compressed files from Hadoop

2010-04-08 Thread Dmitriy Ryaboy
Both Kevin's and Todd's branches now pass my tests. Thanks again, Todd.

-D




Re: Errors reading lzo-compressed files from Hadoop

2010-04-06 Thread Alex Roetter

Todd Lipcon t...@... writes:

 
 Hey Dmitriy,
 
 This is very interesting (and worrisome in a way!) I'll try to take a look
 this afternoon.
 
 -Todd
 

Hi Todd,

I wanted to see if you made any progress on this front. I'm seeing a very
similar error trying to run an MR job (Hadoop 0.20.1) over a bunch of
LZOP-compressed / indexed files (using Kevin Weil's package), and I have one
map task that always fails in what looks like the same place as described in
the previous post. I haven't yet done the experimentation mentioned above
(isolating the input file corresponding to the failed map task, decompressing
and recompressing it, testing it directly on local disk instead of HDFS,
etc.).

However, since I am crashing in exactly the same place, it seems likely this
is related, and I thought I'd check on your work in the meantime.

FYI, my stack trace is below:

2010-04-05 18:15:16,895 FATAL org.apache.hadoop.mapred.TaskTracker: Error
running child : java.lang.InternalError: lzo1x_decompress_safe returned:
        at com.hadoop.compression.lzo.LzoDecompressor.decompressBytesDirect(Native Method)
        at com.hadoop.compression.lzo.LzoDecompressor.decompress(LzoDecompressor.java:303)
        at com.hadoop.compression.lzo.LzopDecompressor.decompress(LzopDecompressor.java:104)
        at com.hadoop.compression.lzo.LzopInputStream.decompress(LzopInputStream.java:223)
        at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
        at java.io.InputStream.read(InputStream.java:85)
        at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
        at org.apache.hadoop.util.LineReader.readLine(LineReader.java:187)
        at com.hadoop.mapreduce.LzoLineRecordReader.nextKeyValue(LzoLineRecordReader.java:126)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
        at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:583)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)


Any update much appreciated,
Alex


Re: Errors reading lzo-compressed files from Hadoop

2010-04-01 Thread Todd Lipcon
Hey Dmitriy,

This is very interesting (and worrisome in a way!) I'll try to take a look
this afternoon.

-Todd

On Thu, Apr 1, 2010 at 12:16 AM, Dmitriy Ryaboy dmit...@twitter.com wrote:

 Hi folks,
 We write a lot of lzo-compressed files to HDFS -- some via scribe,
 some using internal tools. Occasionally, we discover that the created
 lzo files cannot be read from HDFS -- they get through some (often
 large) portion of the file, and then fail with the following stack
 trace:

 Exception in thread "main" java.lang.InternalError: lzo1x_decompress_safe returned:
        at com.hadoop.compression.lzo.LzoDecompressor.decompressBytesDirect(Native Method)
        at com.hadoop.compression.lzo.LzoDecompressor.decompress(LzoDecompressor.java:303)
        at com.hadoop.compression.lzo.LzopDecompressor.decompress(LzopDecompressor.java:122)
        at com.hadoop.compression.lzo.LzopInputStream.decompress(LzopInputStream.java:223)
        at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
        at java.io.InputStream.read(InputStream.java:85)
        at com.twitter.twadoop.jobs.LzoReadTest.main(LzoReadTest.java:51)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

 The initial thought is of course that the lzo file is corrupt --
 however, plain-jane lzop is able to read these files. Moreover, if we
 pull the files out of hadoop, uncompress them, compress them again,
 and put them back into HDFS, we can usually read them from HDFS as
 well.

 We've been thinking that this strange behavior is caused by a bug in
 the hadoop-lzo libraries (we use the version with Twitter and Cloudera
 fixes, on github: http://github.com/kevinweil/hadoop-lzo ).
 However, today I discovered that using the exact same environment,
 codec, and InputStreams, we can successfully read from the local file
 system but cannot read from HDFS. This appears to point at possible
 issues in FSDataInputStream or further down the stack.

 Here's a small test class that tries to read the same file from HDFS
 and from the local FS, and the output of running it on our cluster.
 We are using the CDH2 distribution.

 https://gist.github.com/e1bf7e4327c7aef56303
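
 A minimal reproduction along these lines (a sketch of the same idea, not
 the gist's actual contents -- the class layout and example paths are
 assumptions) would decompress the same .lzo file through the same codec
 from both filesystems:

   import java.io.InputStream;
   import org.apache.hadoop.conf.Configuration;
   import org.apache.hadoop.fs.FileSystem;
   import org.apache.hadoop.fs.Path;
   import org.apache.hadoop.io.compress.CompressionCodec;
   import org.apache.hadoop.io.compress.CompressionCodecFactory;

   public class LzoReadTest {
     public static void main(String[] args) throws Exception {
       // Assumes the LZO codec is registered via io.compression.codecs.
       Configuration conf = new Configuration();
       CompressionCodecFactory factory = new CompressionCodecFactory(conf);
       // e.g. args = { "file:///data/foo.lzo", "hdfs:///data/foo.lzo" }
       for (String uri : args) {
         Path path = new Path(uri);
         FileSystem fs = path.getFileSystem(conf);
         CompressionCodec codec = factory.getCodec(path);
         if (codec == null) {
           throw new IllegalArgumentException("No codec registered for " + uri);
         }
         byte[] buf = new byte[64 * 1024];
         long total = 0;
         InputStream in = codec.createInputStream(fs.open(path));
         try {
           // Drain the stream; a corrupt-looking block triggers the
           // InternalError partway through on HDFS but not on local disk.
           for (int n; (n = in.read(buf)) != -1; ) {
             total += n;
           }
         } finally {
           in.close();
         }
         System.out.println(uri + ": " + total + " uncompressed bytes");
       }
     }
   }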

 Any ideas on what could be going on?

 Thanks,
 -Dmitriy




-- 
Todd Lipcon
Software Engineer, Cloudera