[ https://issues.apache.org/jira/browse/PIG-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tomas Hudik updated PIG-4599:
-----------------------------
    Assignee:     (was: Tomas Hudik)

> tar.gz compression doesn't produce correct output
> -------------------------------------------------
>
>                 Key: PIG-4599
>                 URL: https://issues.apache.org/jira/browse/PIG-4599
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.12.1
>            Reporter: Tomas Hudik
>              Labels: compression, easytest
>
> I'm not completely sure whether this is the right place to report this
> issue: Pig is involved, but Pig leaves decompression of tar.gz files to
> hadoop-common.
> How to reproduce the issue:
> # put a simple file (file1) with arbitrary text lines into in1 in HDFS
> # compress the same file (file1) with tar -cvzf file1.tar.gz file and put
> the archive into in2 in HDFS
> # issue these simple commands in Pig:
> {quote}
> raw = load 'in1/' USING TextLoader AS (line: bytearray);
> dump raw;
> {quote}
> run them for both inputs (compressed and uncompressed)
> # for the compressed input, the first line of the output is garbage:
> {quote}
> a0000644000570000001440000000002512534073736011260 0ustar loadhadoopusersa
> ...
> {quote}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
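
The garbage first line in the report looks like a tar header block: a file name ("a"), an octal mode (0000644), more octal fields, and the "ustar" magic. A likely explanation, stated here as an assumption rather than a confirmed diagnosis, is that only the gzip layer is removed on the Hadoop side, so the loader reads the raw tar stream as text. The Python sketch below reproduces that effect outside Pig and Hadoop. The helper names (build_tar_gz, gunzip_only) and the sample payload are hypothetical and are not taken from the issue.

```python
import gzip
import io
import tarfile


def build_tar_gz(name: str, payload: bytes) -> bytes:
    """Build an in-memory archive roughly like `tar -cvzf <name>.tar.gz <name>`."""
    buf = io.BytesIO()
    # GNU_FORMAT is chosen on the assumption that the archive came from GNU tar.
    with tarfile.open(fileobj=buf, mode="w:gz", format=tarfile.GNU_FORMAT) as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        info.mode = 0o644
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def gunzip_only(archive: bytes) -> bytes:
    """Remove only the gzip layer; the result is still a tar stream."""
    return gzip.decompress(archive)


payload = b"first line\nsecond line\n"
archive = build_tar_gz("file1", payload)

# What a line-oriented reader sees if only gzip is undone (assumed behaviour).
raw = gunzip_only(archive)
header = raw[:512]                      # a tar header block is 512 bytes
first_line = raw.split(b"\n", 1)[0]     # header bytes run into the first text line

# What the reader should see: the archive member itself.
with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tar:
    clean = tar.extractfile("file1").read()

# Printable view of the polluted first line, with NUL padding removed.
print(first_line.replace(b"\x00", b"").decode("ascii"))
```

If this is what happens, a workaround on the user side would be to compress the input with plain gzip (no tar), or to unpack the archive before loading it.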