This is from our production environment. Unfortunately, I cannot test this on any newer version until it is upgraded to cdh4 (0.23).
But since this is the cdh3u1 release, it presumably already contains a lot of bug fixes backported from 0.21. The SIGSEGV is just one of the issues. EOFException is raised gracefully for the other cases.

On 5/22/12, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> If you are getting a SIGSEGV it never hurts to try a more recent JVM.
> 21 has many bug fixes at this point.
>
> On Tue, May 22, 2012 at 11:45 AM, Jason B <urg...@gmail.com> wrote:
>> JIRA entry created:
>>
>> https://issues.apache.org/jira/browse/HADOOP-8423
>>
>> On 5/21/12, Jason B <urg...@gmail.com> wrote:
>>> Sorry about using an attachment. The code is below for reference.
>>> (I will also file a JIRA as you suggested.)
>>>
>>> package codectest;
>>>
>>> import com.hadoop.compression.lzo.LzoCodec;
>>> import java.io.IOException;
>>> import java.util.Formatter;
>>> import org.apache.hadoop.conf.Configuration;
>>> import org.apache.hadoop.fs.FileSystem;
>>> import org.apache.hadoop.io.MapFile;
>>> import org.apache.hadoop.io.SequenceFile.CompressionType;
>>> import org.apache.hadoop.io.Text;
>>> import org.apache.hadoop.io.compress.CompressionCodec;
>>> import org.apache.hadoop.io.compress.DefaultCodec;
>>> import org.apache.hadoop.io.compress.SnappyCodec;
>>> import org.apache.hadoop.util.Tool;
>>> import org.apache.hadoop.util.ToolRunner;
>>>
>>> public class MapFileCodecTest implements Tool {
>>>     private Configuration conf = new Configuration();
>>>
>>>     private void createMapFile(Configuration conf, FileSystem fs, String path,
>>>             CompressionCodec codec, CompressionType type, int records)
>>>             throws IOException {
>>>         MapFile.Writer writer = new MapFile.Writer(conf, fs, path,
>>>                 Text.class, Text.class, type, codec, null);
>>>         Text key = new Text();
>>>         for (int j = 0; j < records; j++) {
>>>             StringBuilder sb = new StringBuilder();
>>>             Formatter formatter = new Formatter(sb);
>>>             formatter.format("%03d", j);
>>>             key.set(sb.toString());
>>>             writer.append(key, key);
>>>         }
>>>         writer.close();
>>>     }
>>>
>>>     private void testCodec(Configuration conf,
>>>             Class<? extends CompressionCodec> clazz,
>>>             CompressionType type, int records) throws IOException {
>>>         FileSystem fs = FileSystem.getLocal(conf);
>>>         try {
>>>             System.out.println("Creating MapFiles with " + records +
>>>                     " records using codec " + clazz.getSimpleName());
>>>             String path = clazz.getSimpleName() + records;
>>>             createMapFile(conf, fs, path, clazz.newInstance(), type, records);
>>>             MapFile.Reader reader = new MapFile.Reader(fs, path, conf);
>>>             Text key1 = new Text("002");
>>>             if (reader.get(key1, new Text()) != null) {
>>>                 System.out.println("1st key found");
>>>             }
>>>             Text key2 = new Text("004");
>>>             if (reader.get(key2, new Text()) != null) {
>>>                 System.out.println("2nd key found");
>>>             }
>>>         } catch (Throwable ex) {
>>>             ex.printStackTrace();
>>>         }
>>>     }
>>>
>>>     @Override
>>>     public int run(String[] strings) throws Exception {
>>>         System.out.println("Using native library " +
>>>                 System.getProperty("java.library.path"));
>>>
>>>         testCodec(conf, DefaultCodec.class, CompressionType.RECORD, 100);
>>>         testCodec(conf, SnappyCodec.class, CompressionType.RECORD, 100);
>>>         testCodec(conf, LzoCodec.class, CompressionType.RECORD, 100);
>>>
>>>         testCodec(conf, DefaultCodec.class, CompressionType.RECORD, 10);
>>>         testCodec(conf, SnappyCodec.class, CompressionType.RECORD, 10);
>>>         testCodec(conf, LzoCodec.class, CompressionType.RECORD, 10);
>>>
>>>         testCodec(conf, DefaultCodec.class, CompressionType.BLOCK, 100);
>>>         testCodec(conf, SnappyCodec.class, CompressionType.BLOCK, 100);
>>>         testCodec(conf, LzoCodec.class, CompressionType.BLOCK, 100);
>>>
>>>         testCodec(conf, DefaultCodec.class, CompressionType.BLOCK, 10);
>>>         testCodec(conf, SnappyCodec.class, CompressionType.BLOCK, 10);
>>>         testCodec(conf, LzoCodec.class, CompressionType.BLOCK, 10);
>>>         return 0;
>>>     }
>>>
>>>     @Override
>>>     public void setConf(Configuration c) {
>>>         this.conf = c;
>>>     }
>>>
>>>     @Override
>>>     public Configuration getConf() {
>>>         return conf;
>>>     }
>>>
>>>     public static void main(String[] args) throws Exception {
>>>         ToolRunner.run(new MapFileCodecTest(), args);
>>>     }
>>> }
>>>
>>> On 5/21/12, Todd Lipcon <t...@cloudera.com> wrote:
>>>> Hi Jason,
>>>>
>>>> Sounds like a bug. Unfortunately, the mailing list strips attachments.
>>>>
>>>> Can you file a jira in the HADOOP project, and attach your test case
>>>> there?
>>>>
>>>> Thanks
>>>> Todd
>>>>
>>>> On Mon, May 21, 2012 at 3:57 PM, Jason B <urg...@gmail.com> wrote:
>>>>> I am using the Cloudera distribution cdh3u1.
>>>>>
>>>>> When trying out native codecs such as Snappy or LZO for better
>>>>> decompression performance, I ran into issues with random access
>>>>> using the MapFile.Reader.get(key, value) method.
>>>>> The first call of MapFile.Reader.get() works, but a second call fails.
>>>>>
>>>>> I am also getting different exceptions depending on the number of
>>>>> entries in a map file.
>>>>> With LzoCodec and a 10-record file, the JVM gets aborted.
>>>>>
>>>>> At the same time, the DefaultCodec works fine for all cases, as does
>>>>> record compression for the native codecs.
>>>>>
>>>>> I created a simple test program (attached) that creates map files
>>>>> locally with sizes of 10 and 100 records for three codecs: Default,
>>>>> Snappy, and LZO.
>>>>> (The test requires the corresponding native libraries to be available.)
>>>>>
>>>>> A summary of the problems is given below:
>>>>>
>>>>> Map Size: 100
>>>>> Compression: RECORD
>>>>> ==================
>>>>> DefaultCodec: OK
>>>>> SnappyCodec: OK
>>>>> LzoCodec: OK
>>>>>
>>>>> Map Size: 10
>>>>> Compression: RECORD
>>>>> ==================
>>>>> DefaultCodec: OK
>>>>> SnappyCodec: OK
>>>>> LzoCodec: OK
>>>>>
>>>>> Map Size: 100
>>>>> Compression: BLOCK
>>>>> ==================
>>>>> DefaultCodec: OK
>>>>>
>>>>> SnappyCodec: java.io.EOFException at
>>>>> org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:114)
>>>>>
>>>>> LzoCodec: java.io.EOFException at
>>>>> org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:114)
>>>>>
>>>>> Map Size: 10
>>>>> Compression: BLOCK
>>>>> ==================
>>>>> DefaultCodec: OK
>>>>>
>>>>> SnappyCodec: java.lang.NoClassDefFoundError: Ljava/lang/InternalError
>>>>> at org.apache.hadoop.io.compress.snappy.SnappyDecompressor.decompressBytesDirect(Native Method)
>>>>>
>>>>> LzoCodec:
>>>>> #
>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>> #
>>>>> # SIGSEGV (0xb) at pc=0x00002b068ffcbc00, pid=6385, tid=47304763508496
>>>>> #
>>>>> # JRE version: 6.0_21-b07
>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (17.0-b17 mixed mode linux-amd64)
>>>>> # Problematic frame:
>>>>> # C  [liblzo2.so.2+0x13c00]  lzo1x_decompress+0x1a0
>>>>> #
>>>>> # An error report file with more information is saved as:
>>>>> # /hadoop/user/yurgis/testapp/hs_err_pid6385.log
>>>>> #
>>>>> # If you would like to submit a bug report, please visit:
>>>>> # http://java.sun.com/webapps/bugreport/crash.jsp
>>>>> # The crash happened outside the Java Virtual Machine in native code.
>>>>> # See problematic frame for where to report the bug.
>>>>> #
>>>>
>>>> --
>>>> Todd Lipcon
>>>> Software Engineer, Cloudera
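For anyone trying to reproduce this, a run of the test program above might be launched along these lines. This is only a sketch: the jar name (codectest.jar), the hadoop-lzo jar path, and the native library directory are assumptions that vary by installation; they are not taken from the thread.

```shell
# Hypothetical invocation -- codectest.jar, the hadoop-lzo jar, and the
# native library directory below are assumptions; adjust for your install.
# java.library.path must point at the directory holding the Snappy/LZO
# native libraries, or the native codecs will fail to load.
java -Djava.library.path=/usr/lib/hadoop/lib/native/Linux-amd64-64 \
     -cp "codectest.jar:/usr/lib/hadoop/lib/hadoop-lzo.jar:$(hadoop classpath)" \
     codectest.MapFileCodecTest
```

The program prints the java.library.path it sees as its first line, which makes it easy to confirm the native libraries were actually on the path for a given run.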