Hi guys, I am trying to use compression to reduce the I/O workload of a job, but it failed. I have several questions and would appreciate your help.
For LZO compression, I found a guide at http://code.google.com/p/hadoop-gpl-compression/wiki/FAQ. Why does it say "Note that you must have both 32-bit and 64-bit liblzo2 installed"? I am not sure whether this means we also need the 32-bit liblzo2 installed even on a 64-bit system. If so, why?

Also, when I skip LZO and try to use gzip to compress the final reduce output, I only set the property below in mapred-site.xml, but it doesn't seem to work. (How can I find the final compressed .gz file? I ran "hadoop dfs -ls <dir>" and didn't see one.)

My questions: can we use gzip to compress the final result when the job is not a streaming job? And how can we confirm that compression was actually enabled during a job's execution?

<property>
  <name>mapred.output.compress</name>
  <value>true</value>
</property>

Thanks!
Stan Lee
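
For reference, a fuller version of the setting quoted above might look like the sketch below. It assumes the Hadoop 0.20/1.x property names. The mapred.output.compression.codec entry is not part of the original setting and is added here only as a guess at what may be missing. This has not been verified on the cluster in question.

```xml
<!-- Sketch only: goes inside <configuration> in mapred-site.xml.
     Property names assume Hadoop 0.20/1.x. -->

<!-- Turn on compression of the final job output (the setting from the post). -->
<property>
  <name>mapred.output.compress</name>
  <value>true</value>
</property>

<!-- Assumption: choose gzip explicitly. If this is left unset, the default
     codec is org.apache.hadoop.io.compress.DefaultCodec, which writes
     .deflate files rather than .gz files. -->
<property>
  <name>mapred.output.compression.codec</name>
  <value>org.apache.hadoop.io.compress.GzipCodec</value>
</property>
```

If these settings take effect and the job writes through TextOutputFormat, the part files should appear with a .gz suffix (for example part-00000.gz) in the output of "hadoop dfs -ls <dir>". A job that sets its own values in its JobConf would normally override what is in the site file.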