[ https://issues.apache.org/jira/browse/HADOOP-12990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031283#comment-17031283 ]
Redriver commented on HADOOP-12990:
-----------------------------------

Is there a quick way to decompress Hadoop '.lz4' files? I ran into this problem when I tried to decompress event logs downloaded from the Spark history server.

> lz4 incompatibility between OS and Hadoop
> -----------------------------------------
>
>                 Key: HADOOP-12990
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12990
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io, native
>    Affects Versions: 2.6.0
>            Reporter: John Zhuge
>            Priority: Minor
>
> {{hdfs dfs -text}} hits an exception when trying to view a compressed file
> created by the Linux lz4 tool.
> The Hadoop version has HADOOP-11184 "update lz4 to r123", so it is using
> the LZ4 library at release r123.
> Linux lz4 version:
> {code}
> $ /tmp/lz4 -h 2>&1 | head -1
> *** LZ4 Compression CLI 64-bits r123, by Yann Collet (Apr 1 2016) ***
> {code}
> Test steps:
> {code}
> $ cat 10rows.txt
> 001|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 002|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 003|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 004|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 005|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 006|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 007|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 008|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 009|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 010|c1|c2|c3|c4|c5|c6|c7|c8|c9
> $ /tmp/lz4 10rows.txt 10rows.txt.r123.lz4
> Compressed 310 bytes into 105 bytes ==> 33.87%
> $ hdfs dfs -put 10rows.txt.r123.lz4 /tmp
> $ hdfs dfs -text /tmp/10rows.txt.r123.lz4
> 16/04/01 08:19:07 INFO compress.CodecPool: Got brand-new decompressor [.lz4]
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>     at org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:123)
>     at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:98)
>     at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
>     at java.io.InputStream.read(InputStream.java:101)
>     at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:85)
>     at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:59)
>     at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:119)
>     at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:106)
>     at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:101)
>     at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317)
>     at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289)
>     at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271)
>     at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255)
>     at org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:118)
>     at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
>     at org.apache.hadoop.fs.FsShell.run(FsShell.java:315)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>     at org.apache.hadoop.fs.FsShell.main(FsShell.java:372)
> {code}
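
On the question of decompressing a Hadoop '.lz4' file outside Hadoop, here is a minimal Python sketch, not a tested tool. It assumes the layout that Hadoop's block compressor stream is understood to write: a 4-byte big-endian uncompressed length for each block, followed by one or more chunks, each a 4-byte big-endian compressed length plus a raw LZ4 block. That layout is an assumption here, not something stated in this issue, and it differs from the frame format the lz4 command-line tool produces, which would be consistent with the failure quoted above. The function names are hypothetical. Spark event logs may also use yet another framing, depending on which codec wrote them, in which case this sketch does not apply.

```python
import struct


def lz4_block_decompress(src: bytes) -> bytes:
    """Decode one raw LZ4 block (no frame header), in pure Python."""
    out = bytearray()
    i, n = 0, len(src)
    while i < n:
        token = src[i]
        i += 1
        # High nibble: literal length, extended by 255-valued bytes.
        lit_len = token >> 4
        if lit_len == 15:
            while True:
                b = src[i]
                i += 1
                lit_len += b
                if b != 255:
                    break
        out += src[i:i + lit_len]
        i += lit_len
        if i >= n:
            break  # the last sequence carries literals only
        # 2-byte little-endian offset back into the output so far.
        offset = src[i] | (src[i + 1] << 8)
        i += 2
        if offset == 0 or offset > len(out):
            raise ValueError("invalid LZ4 match offset")
        # Low nibble: match length minus 4, extended the same way.
        match_len = token & 0x0F
        if match_len == 15:
            while True:
                b = src[i]
                i += 1
                match_len += b
                if b != 255:
                    break
        match_len += 4
        start = len(out) - offset
        # Copy byte by byte so overlapping matches repeat correctly.
        for k in range(match_len):
            out.append(out[start + k])
    return bytes(out)


def hadoop_lz4_decompress(data: bytes) -> bytes:
    """Decode the assumed Hadoop block-stream framing around raw LZ4 blocks."""
    out = bytearray()
    pos = 0
    while pos < len(data):
        # Assumed: 4-byte big-endian uncompressed size of this block.
        (orig_len,) = struct.unpack_from(">I", data, pos)
        pos += 4
        produced = 0
        while produced < orig_len:
            # Assumed: 4-byte big-endian compressed size of the next chunk.
            (chunk_len,) = struct.unpack_from(">I", data, pos)
            pos += 4
            chunk = lz4_block_decompress(data[pos:pos + chunk_len])
            pos += chunk_len
            if not chunk:
                raise ValueError("empty chunk before block was complete")
            out += chunk
            produced += len(chunk)
    return bytes(out)
```

To try it on a file, pass the raw bytes to `hadoop_lz4_decompress`, for example the result of `open(path, "rb").read()`. The whole input is held in memory, which is fine for small files such as the one in the test steps above but not for large logs.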