I get the meaning now. Since sort uses SequenceFileOutputFormat to write the
output file, using gzip as the compression type requires the native library.
I will try that.
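
On the earlier question of how to confirm that an output file really is
gzip-compressed, here is a minimal sketch using only the Java standard
library. It assumes the part file has first been copied to the local disk
(for example with "hadoop dfs -get"), and the names GzipCheck, looksGzipped
and gzip are made up for illustration. It only tests for the two gzip magic
bytes at the start of the file. As far as I understand, SequenceFile output
would not pass this test even when compression is on, because a SequenceFile
starts with its own "SEQ" header and compresses the records inside the
container.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of Hadoop.
class GzipCheck {

    // A gzip stream starts with the magic bytes 0x1f 0x8b (RFC 1952).
    static boolean looksGzipped(byte[] head) {
        return head.length >= 2
                && (head[0] & 0xff) == 0x1f
                && (head[1] & 0xff) == 0x8b;
    }

    // Reads the first two bytes of a local file and compares them
    // with the gzip magic bytes.
    static boolean looksGzipped(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            int b0 = in.read();
            int b1 = in.read();
            return b0 == 0x1f && b1 == 0x8b;
        }
    }

    // Compresses a byte array in memory; only used to produce sample data.
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buf)) {
            out.write(data);
        }
        return buf.toByteArray();
    }

    // Usage: java GzipCheck <local file> [<local file> ...]
    public static void main(String[] args) throws IOException {
        for (String name : args) {
            boolean gz = looksGzipped(Paths.get(name));
            System.out.println(name + ": " + (gz ? "gzip" : "not gzip"));
        }
    }
}
```

Running it against a local copy of a part file reports whether that file
begins with a gzip header. It says nothing about whether the rest of the
stream is valid.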


On Wed, May 19, 2010 at 3:17 PM, stan lee <lee.stan...@gmail.com> wrote:

> Thanks all. So if we don't call the setCompressOutput() and
> setOutputCompressorClass() functions in the sort program, and just set
> mapred.output.compress to true and mapred.output.compression.codec to
> org.apache.hadoop.io.compress.GzipCodec, we wouldn't get compressed
> output files like part-xxxx.gz?
>
> On Wed, May 19, 2010 at 1:31 AM, Harsh J <qwertyman...@gmail.com> wrote:
>
>> Hi stan,
>>
>> You can do something of this sort if you use FileOutputFormat, from
>> within your Job Driver:
>>
>>    FileOutputFormat.setCompressOutput(job, true);
>>    FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);
>>    // GzipCodec from org.apache.hadoop.io.compress.
>>    // and where 'job' is either JobConf or Job object.
>>
>> This will write the simple file output in Gzip format. You also have
>> BZip2Codec.
>>
>> On Tue, May 18, 2010 at 9:14 PM, stan lee <lee.stan...@gmail.com> wrote:
>> > Hi Guys,
>> >
>> > I am trying to use compression to reduce the IO workload when running
>> > a job, but it failed. I have several questions that need your help.
>> >
>> > For lzo compression, I found a guide at
>> > http://code.google.com/p/hadoop-gpl-compression/wiki/FAQ. Why does it
>> > say "Note that you must have both 32-bit and 64-bit liblzo2
>> > installed"? I am not sure whether it means that we also need the
>> > 32-bit liblzo2 installed even when we are on a 64-bit system. If so,
>> > why?
>> >
>> > Also, when I don't use lzo compression and try to use gzip to compress
>> > the final reduce output file, I just set the value below in
>> > mapred-site.xml, but it doesn't seem to work. (How can I find the
>> > final compressed .gz file? I used "hadoop dfs -ls <dir>" and didn't
>> > find it.) My questions: can we use gzip to compress the final result
>> > when it's not a streaming job? How can we ensure that compression
>> > has been enabled during a job execution?
>> >
>> > <property>
>> >       <name>mapred.output.compress</name>
>> >       <value>true</value>
>> > </property>
>> >
>> > Thanks!
>> > Stan Lee
>> >
>>
>>
>>
>>  --
>> Harsh J
>> www.harshj.com
>>
>
>
