I believe 2 lines of Pig script will also work:

a = load 'source' using PigStorage();
store a into 'target' using PigStorage();

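A rough sketch of running those two statements non-interactively, assuming the `pig` client is on the PATH and that its `-e` option is used to pass the statements inline. The function name `pig_gunzip` and the paths are hypothetical. PigStorage chooses the gzip codec from the `.gz` extension on load, so the source path needs that suffix. The target ends up as a directory of part files, not a single file.

```shell
# Hypothetical helper: decompress a gzip file on HDFS with a two-statement Pig job.
# Assumptions: the `pig` client is installed; $1 ends in .gz so PigStorage
# decompresses it on load; $2 does not exist yet and does not end in .gz,
# so the stored output should be plain text.
pig_gunzip() {
  src="$1"
  dst="$2"
  pig -e "a = load '$src' using PigStorage(); store a into '$dst' using PigStorage();"
}

# Usage (paths are placeholders):
#   pig_gunzip /data/logs/events.gz /data/logs/events_plain
```

Because this runs as a job on the cluster, the data should not have to pass through the machine that launches it.
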
Thank you everyone for your help.

-Ayon
See My Photos on Flickr
Also check out my Blog for answers to commonly asked questions.

________________________________
From: Charles Gonçalves <charles...@gmail.com>
To: hdfs-user@hadoop.apache.org; Ayon Sinha <ayonsi...@yahoo.com>
Sent: Friday, June 17, 2011 10:07 AM
Subject: Re: unzip gz file in HDFS ?

I know that it's not perfect but you can always use unix pipe ;P

hdfs -cat X | gzip -d | hdfs -put - Y

Or something like that!

On Fri, Jun 17, 2011 at 1:02 PM, Ayon Sinha <ayonsi...@yahoo.com> wrote:

The hadoop dfs -cp or -mv seem like the perfect candidate to add an uncompress option.
>
>-Ayon
>
>________________________________
>From: Harsh J <ha...@cloudera.com>
>To: Ayon Sinha <ayonsi...@yahoo.com>
>Cc: "hdfs-user@hadoop.apache.org" <hdfs-user@hadoop.apache.org>
>Sent: Friday, June 17, 2011 1:42 AM
>Subject: Re: unzip gz file in HDFS ?
>
>Ayon,
>
>We could write a utility for that, but the issue is that there's no
>"server-side" for processing files on HDFS alone. The utility will
>have to run an MR job either way, to avoid incurring network transfers
>to and back from the invocation machine.
>
>Perhaps it could be added to examples, or to a set of general tools MR
>provides (not aware of one)?
>
>On Fri, Jun 17, 2011 at 2:07 PM, Ayon Sinha <ayonsi...@yahoo.com> wrote:
>> Yup, thought about that. That sounds like the only way. I was hoping
>> someone already wrote a hadoop shell command equivalent like:
>> hadoop dfs -unzip
>>
>> -Ayon
>>
>> ________________________________
>> From: Harsh J <ha...@cloudera.com>
>> To: hdfs-user@hadoop.apache.org; Ayon Sinha <ayonsi...@yahoo.com>
>> Sent: Friday, June 17, 2011 1:00 AM
>> Subject: Re: unzip gz file in HDFS ?
>>
>> Ayon,
>>
>> You can run an identity map job with no output compression set to it.
>>
>> On Fri, Jun 17, 2011 at 12:59 PM, Ayon Sinha <ayonsi...@yahoo.com> wrote:
>>> Is there a way to unzip a gzip file within HDFS where source & target both
>>> live on HDFS? I don't want to pull a large file to local and put it back.
>>>
>>> -Ayon
>>>
>>
>> --
>> Harsh J
>>
>
>--
>Harsh J

--
Charles Ferreira Gonçalves
http://homepages.dcc.ufmg.br/~charles/
UFMG - ICEx - Dcc
Cel.: 55 31 87741485
Tel.: 55 31 34741485
Lab.: 55 31 34095840
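
The pipe suggested in the thread can be wrapped in a small shell function. This is a minimal sketch, assuming the `hadoop fs` front end is available (`-cat` to read, `-put -` to write from standard input). The function name `hdfs_gunzip` and the paths are hypothetical. As Harsh points out in the thread, the bytes still travel through the machine that runs the command, so this avoids a local copy on disk but not the network transfer.

```shell
# Hypothetical helper: stream a gzip file out of HDFS, decompress it on the
# invoking machine, and stream the result back into HDFS without touching
# local disk.
#   $1 = HDFS path of the .gz source
#   $2 = HDFS path of the uncompressed target (must not already exist)
hdfs_gunzip() {
  src="$1"
  dst="$2"
  # `hadoop fs -put - <dst>` reads its data from standard input.
  # In plain POSIX sh the exit status is that of the last command (-put),
  # so a failure in -cat or gzip is not reported by the return code alone.
  hadoop fs -cat "$src" | gzip -d | hadoop fs -put - "$dst"
}

# Usage (paths are placeholders):
#   hdfs_gunzip /data/logs/events.gz /data/logs/events.txt
```

For large files, a map-only job (the identity map job or the Pig script mentioned in the thread) keeps the work on the cluster, at the cost of producing part files in an output directory.
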