Re: tar or hadoop archive

2011-07-06 Thread Manhee Jo
Yes, you can see a picture describing HAR files in this old blog post:
http://www.cloudera.com/blog/2009/02/the-small-files-problem/
-Joey
On Mon, Jun 27, 2011 at 4:36 PM, Rita wrote:
So, it does an index of the file?
On Mon, Jun 27, 2011 at

Re: tar or hadoop archive

2011-06-27 Thread Joey Echeverria
Yes, you can see a picture describing HAR files in this old blog post:
http://www.cloudera.com/blog/2009/02/the-small-files-problem/
-Joey
On Mon, Jun 27, 2011 at 4:36 PM, Rita wrote:
> So, it does an index of the file?
>
> On Mon, Jun 27, 2011 at 10:10 AM, Joey Echeverria wrote:
>> The

Re: tar or hadoop archive

2011-06-27 Thread Rita
So, it does an index of the file?
On Mon, Jun 27, 2011 at 10:10 AM, Joey Echeverria wrote:
> The advantage of a Hadoop archive file is that it lets you access the
> files stored in it directly. For example, if you archived three files
> (a.txt, b.txt, c.txt) in an archive called foo.har, you could

Re: tar or hadoop archive

2011-06-27 Thread Joey Echeverria
The advantage of a Hadoop archive file is that it lets you access the files stored in it directly. For example, if you archived three files (a.txt, b.txt, c.txt) in an archive called foo.har, you could cat one of the three files using the hadoop command line:
hadoop fs -cat har:///user/joey/out/foo.ha
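To make the "direct access" point above a bit more concrete, here is a rough Python sketch of the general idea of an indexed archive next to a plain tar. It is only a toy under stated assumptions: the byte layout, the file contents, and the helper names (build_indexed_archive, read_indexed, build_tar, read_from_tar) are made up for illustration and are not the actual HAR on-disk format or any Hadoop API.

```python
import io
import tarfile

# Hypothetical contents for the three files named in the thread.
FILES = {"a.txt": b"alpha\n", "b.txt": b"bravo\n", "c.txt": b"charlie\n"}


def build_indexed_archive(files):
    """Concatenate file contents and record (offset, length) for each name."""
    data = bytearray()
    index = {}
    for name, content in files.items():
        index[name] = (len(data), len(content))
        data.extend(content)
    return bytes(data), index


def read_indexed(data, index, name):
    """Look up the entry in the index and slice out just that file's bytes."""
    offset, length = index[name]
    return data[offset:offset + length]


def build_tar(files):
    """Pack the same files into an in-memory, uncompressed tar."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, content in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()


def read_from_tar(tar_bytes, name):
    """Open the tar and extract one member; tar has no separate index,
    so the reader walks member headers until it finds the name."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as tar:
        return tar.extractfile(name).read()


if __name__ == "__main__":
    data, index = build_indexed_archive(FILES)
    print(index["b.txt"])
    print(read_indexed(data, index, "b.txt"))
    print(read_from_tar(build_tar(FILES), "b.txt"))
```

Both reads return the same bytes. The difference is that the indexed version jumps straight to the recorded offset, while a tar stored as one opaque file in HDFS has to be fetched and unpacked by the client before a single member can be read.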

tar or hadoop archive

2011-06-27 Thread Rita
We use Hadoop/HDFS to archive data. I archive a lot of files by creating one large tar file and then placing it in HDFS. Is it better to use a Hadoop archive for this, or is it essentially the same thing?
--
Get your facts first, then you can distort them as you please.