I just got Hadoop running on EC2 (0.19, only because that's the AMI the scripts seemed to go for). The PI example worked, and I believe the wordcount example worked too. However, the output file is in .deflate format. "hadoop fs -text" fails to decompress the file -- it produces the same binary output as "hadoop fs -cat", which I find counterintuitive; isn't -text specifically supposed to handle this situation?

I copied the file to local and tried manually decompressing it with gunzip and lzop (by appending appropriate suffixes), but both tools failed to recognize the file. To add to the confusion, I see this in the default configuration offered by the EC2 scripts:

  <name>mapred.output.compress</name>
  <value>false</value>
  <description>Should the job outputs be compressed?</description>

...so I don't understand why the output was compressed in the first place.

At this point, I'm kind of stuck. The output shouldn't be compressed to begin with, and all attempts to decompress it have failed.

Any ideas?

Thanks.

________________________________________________________________________________
Keith Wiley     kwi...@keithwiley.com     keithwiley.com     music.keithwiley.com

"And what if we picked the wrong religion? Every week, we're just making God madder and madder!"
                                           -- Homer Simpson
________________________________________________________________________________
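
A minimal sketch of one further local check, assuming the .deflate suffix means the file is a zlib-wrapped DEFLATE stream rather than gzip or LZO (that is an assumption, not something established above). It sniffs the first two bytes and picks a decompressor from Python's standard zlib and gzip modules. The function names and the command-line handling are made up for illustration.

```python
import gzip
import sys
import zlib


def sniff_format(data: bytes) -> str:
    """Guess the container format from the leading bytes."""
    if data[:2] == b"\x1f\x8b":
        return "gzip"
    # A zlib header is two bytes: the low nibble of the first byte is 8
    # (DEFLATE), and the two bytes read as a big-endian integer are a
    # multiple of 31.
    if len(data) >= 2 and data[0] & 0x0F == 8 and ((data[0] << 8) | data[1]) % 31 == 0:
        return "zlib"
    return "unknown"


def decompress_any(data: bytes) -> bytes:
    """Decompress gzip or zlib data; otherwise try headerless DEFLATE."""
    fmt = sniff_format(data)
    if fmt == "gzip":
        return gzip.decompress(data)
    if fmt == "zlib":
        return zlib.decompress(data)
    # Last resort (assumption): a raw DEFLATE stream with no header at all.
    # Raises zlib.error if the data is not DEFLATE.
    return zlib.decompress(data, -zlib.MAX_WBITS)


if __name__ == "__main__" and len(sys.argv) > 1:
    # The path of the local copy is passed on the command line.
    with open(sys.argv[1], "rb") as f:
        sys.stdout.buffer.write(decompress_any(f.read()))
```

Saved under any name (say "inflate.py", hypothetical), it would be run against the local copy of the output file with stdout redirected to a text file. If the sniff reports "unknown" and the raw fallback raises zlib.error, the file is probably not DEFLATE data at all.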