Re: Hadoop and .tgz files
On Mon, 01 Dec 2008 12:16:28 EST, "Ryan LeCompte" wrote:
>I believe I spoke a little too soon. Looks like Hadoop supports .gz
>files, not .tgz. :-)
>
>
>On Mon, Dec 1, 2008 at 10:46 AM, Ryan LeCompte <[EMAIL PROTECTED]> wrote:
>> Hello all,
>>
>> I'm using Hadoop 0.19 and just discovered that it has no problems
>> processing .tgz files that contain text files. I was under the
>> impression that it wouldn't be able to break a .tgz file up into
>> multiple maps, but instead just treat it as 1 map per .tgz file. Was
>> this a recent change or enhancement? I'm noticing that it is breaking
>> up the .tgz file into multiple maps.
>>
>> Thanks,
>> Ryan
>>
>

Work is in progress to support splitting of .bz2 files. See
http://issues.apache.org/jira/browse/HADOOP-4012

I don't believe splitting of .tgz files is possible: something
compressed with gzip can only be uncompressed from the beginning.

   -John Heidemann
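
To make the gzip point concrete, here is a minimal Python sketch (not
Hadoop code) that tries to start decompressing a gzip stream from the
middle. The helper name gunzip_from and the sample records are made up
for illustration. It assumes a reader handed a split that begins at an
arbitrary byte offset sees only the bytes from that offset onward.

```python
import gzip
import zlib

# 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
GZIP_WBITS = 16 + zlib.MAX_WBITS


def gunzip_from(blob, offset):
    """Try to decompress blob starting at offset.

    Returns the decompressed bytes, or None if zlib rejects the data.
    (Hypothetical helper, used only for this illustration.)
    """
    try:
        return zlib.decompressobj(GZIP_WBITS).decompress(blob[offset:])
    except zlib.error:
        return None


# Sample input: many short text records, like a line-oriented log file.
original = "".join(f"record {i}\n" for i in range(10000)).encode()
blob = gzip.compress(original)

from_start = gunzip_from(blob, 0)
from_middle = gunzip_from(blob, len(blob) // 2)

print("from start :", from_start == original)
print("from middle:", from_middle is not None)
```

Reading from offset 0 recovers the input. Reading from the midpoint
fails here because the bytes at that offset are not a gzip header, and
the compressed data that follows depends on everything before it. That
is why a gzip file normally ends up as one map per file.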
Re: Hadoop and .tgz files
I believe I spoke a little too soon. Looks like Hadoop supports .gz
files, not .tgz. :-)

On Mon, Dec 1, 2008 at 10:46 AM, Ryan LeCompte <[EMAIL PROTECTED]> wrote:
> Hello all,
>
> I'm using Hadoop 0.19 and just discovered that it has no problems
> processing .tgz files that contain text files. I was under the
> impression that it wouldn't be able to break a .tgz file up into
> multiple maps, but instead just treat it as 1 map per .tgz file. Was
> this a recent change or enhancement? I'm noticing that it is breaking
> up the .tgz file into multiple maps.
>
> Thanks,
> Ryan
>
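
For anyone wondering why .gz and .tgz differ, a small Python sketch
(again not Hadoop code) may help. A .tgz is a tar archive wrapped in
gzip, so undoing only the gzip layer yields tar header blocks and
padding around the text, not the plain text itself. The file name
part-00000.txt and the helper make_tgz are made up for the example.

```python
import gzip
import io
import tarfile


def make_tgz(name, text):
    """Build an in-memory .tgz holding a single text file.

    (Hypothetical helper, used only for this illustration.)
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(text)
        tar.addfile(info, io.BytesIO(text))
    return buf.getvalue()


text = b"line one\nline two\n"
tgz = make_tgz("part-00000.txt", text)

# Undo only the gzip layer, as a plain gzip reader would.
raw = gzip.decompress(tgz)

print("same as the text file?  ", raw == text)
print("text present somewhere? ", text in raw)
print("tar header at the front?", b"part-00000.txt" in raw[:512])
print("length in 512-byte blocks:", len(raw) // 512)
```

So even a reader that gunzips the whole .tgz correctly would pass tar
metadata to a line-oriented record reader along with the data.
Compressing each text file on its own as .gz avoids that.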
Hadoop and .tgz files
Hello all,

I'm using Hadoop 0.19 and just discovered that it has no problems
processing .tgz files that contain text files. I was under the
impression that it wouldn't be able to break a .tgz file up into
multiple maps, but instead just treat it as 1 map per .tgz file. Was
this a recent change or enhancement? I'm noticing that it is breaking
up the .tgz file into multiple maps.

Thanks,
Ryan