Re: Hadoop and .tgz files
On Mon, 01 Dec 2008 12:16:28 EST, Ryan LeCompte wrote:
> I believe I spoke a little too soon. Looks like Hadoop supports .gz
> files, not .tgz. :-)
>
> On Mon, Dec 1, 2008 at 10:46 AM, Ryan LeCompte [EMAIL PROTECTED] wrote:
>> Hello all,
>>
>> I'm using Hadoop 0.19 and just discovered that it has no problems
>> processing .tgz files that contain text files. I was under the
>> impression that it wouldn't be able to break a .tgz file up into
>> multiple maps, but instead just treat it as 1 map per .tgz file. Was
>> this a recent change or enhancement? I'm noticing that it is breaking
>> up the .tgz file into multiple maps.
>>
>> Thanks,
>> Ryan

Work is in progress to support splitting of .bz2 files. See
http://issues.apache.org/jira/browse/HADOOP-4012

I don't believe splitting of .tgz files is possible, something
compressed with gzip can only be uncompressed from the beginning.

-John Heidemann
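The point about gzip only being readable from the beginning can be checked with a small Python sketch. It is not from the thread and does not touch Hadoop. The sample data and the helper name `try_decompress_from` are made up for illustration. It compresses some text, then tries to start decoding at byte 0 and at an arbitrary offset in the middle of the stream, which is roughly the position a map task would be in if it were handed the second half of a .gz file.

```python
import gzip
import zlib

# Hypothetical sample data standing in for a large text file.
text = b"".join(b"line %d of a log file\n" % i for i in range(10000))
compressed = gzip.compress(text)


def try_decompress_from(offset):
    """Try to decode the gzip stream starting at an arbitrary byte offset.

    Returns the decoded bytes, or None if zlib rejects the data.
    """
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)
    try:
        return decoder.decompress(compressed[offset:])
    except zlib.error:
        return None


middle = len(compressed) // 2
from_start = try_decompress_from(0)        # decodes normally
from_middle = try_decompress_from(middle)  # no header at this offset, so it fails

print("from start :", "ok" if from_start == text else "failed")
print("from middle:", "ok" if from_middle is not None else "failed")
```

The sketch only shows the header check failing. Even with the header skipped, a deflate decoder would still need the preceding data to resolve back-references, which is why a reader cannot simply seek to a split boundary.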
Hadoop and .tgz files
Hello all, I'm using Hadoop 0.19 and just discovered that it has no problems processing .tgz files that contain text files. I was under the impression that it wouldn't be able to break a .tgz file up into multiple maps, but instead just treat it as 1 map per .tgz file. Was this a recent change or enhancement? I'm noticing that it is breaking up the .tgz file into multiple maps. Thanks, Ryan
Re: Hadoop and .tgz files
I believe I spoke a little too soon. Looks like Hadoop supports .gz
files, not .tgz. :-)

On Mon, Dec 1, 2008 at 10:46 AM, Ryan LeCompte [EMAIL PROTECTED] wrote:
> Hello all,
>
> I'm using Hadoop 0.19 and just discovered that it has no problems
> processing .tgz files that contain text files. I was under the
> impression that it wouldn't be able to break a .tgz file up into
> multiple maps, but instead just treat it as 1 map per .tgz file. Was
> this a recent change or enhancement? I'm noticing that it is breaking
> up the .tgz file into multiple maps.
>
> Thanks,
> Ryan
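For anyone wondering why .gz and .tgz behave differently: a .tgz is a tar archive wrapped in gzip, so undoing only the gzip layer yields a tar stream (headers, padding, then file contents) rather than plain text. The Python sketch below illustrates the file formats only and makes no claim about what Hadoop does internally. The member name `notes.txt` and the payload are hypothetical.

```python
import gzip
import io
import tarfile

# Build a small .tgz in memory containing one hypothetical text file.
payload = b"hello from inside the archive\n"
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz", format=tarfile.USTAR_FORMAT) as tar:
    info = tarfile.TarInfo(name="notes.txt")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
tgz_bytes = buf.getvalue()

# Undo only the gzip layer, as a plain .gz reader would.
unzipped = gzip.decompress(tgz_bytes)

# The result is a tar stream: a 512-byte header block comes first.
starts_with_member_name = unzipped.startswith(b"notes.txt")
has_tar_magic = unzipped[257:262] == b"ustar"
is_plain_text = unzipped == payload

print("starts with member name:", starts_with_member_name)
print("has tar magic          :", has_tar_magic)
print("is just the text       :", is_plain_text)
```

The text is still in there, but it sits behind the tar header, so a line-oriented reader would see header bytes and NUL padding as part of its input.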