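Yes, a HAR keeps an index of the files it contains, which is how individual files can be read directly. You can see a picture describing the layout of HAR files in this old blog post: http://www.cloudera.com/blog/2009/02/the-small-files-problem/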
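
As a rough sketch of what this looks like in practice, the commands below build the foo.har archive from the earlier example and then look at it both as a plain HDFS directory and through the har:// filesystem. The input directory /user/joey/in and the small run helper are assumptions made for illustration, and the exact archive flags may differ between Hadoop versions, so check `hadoop archive` usage on your cluster first.

```shell
# Assumed paths, following the foo.har example from earlier in the thread.
ARCHIVE_DIR=/user/joey/out
HAR_URI="har://${ARCHIVE_DIR}/foo.har"

# Hypothetical helper: print each command, and run it only when a
# hadoop client is actually installed on this machine.
run() {
  echo "+ $*"
  if command -v hadoop >/dev/null 2>&1; then "$@"; fi
}

# Build the archive from three files under the assumed parent dir /user/joey/in.
run hadoop archive -archiveName foo.har -p /user/joey/in a.txt b.txt c.txt "$ARCHIVE_DIR"

# On HDFS the .har is a directory: index files plus part files holding the data.
run hadoop fs -ls "$ARCHIVE_DIR/foo.har"

# Through the har:// scheme the original files are visible again.
run hadoop fs -ls "$HAR_URI"
run hadoop fs -cat "$HAR_URI/a.txt"
```

The second listing is the one that answers the index question: the plain HDFS view shows the index and part files, while the har:// view uses that index to present a.txt, b.txt and c.txt as ordinary files.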

-Joey

On Mon, Jun 27, 2011 at 4:36 PM, Rita <rmorgan...@gmail.com> wrote:
> So, it does an index of the file?
>
> On Mon, Jun 27, 2011 at 10:10 AM, Joey Echeverria <j...@cloudera.com> wrote:
>
>> The advantage of a Hadoop archive file is that it lets you access the
>> files stored in it directly. For example, if you archived three files
>> (a.txt, b.txt, c.txt) in an archive called foo.har, you could cat one
>> of the three files using the hadoop command line:
>>
>> hadoop fs -cat har:///user/joey/out/foo.har/a.txt
>>
>> You can also copy files out of the archive or use files in the archive
>> as input to MapReduce jobs.
>>
>> -Joey
>>
>> On Mon, Jun 27, 2011 at 3:06 AM, Rita <rmorgan...@gmail.com> wrote:
>> > We use Hadoop/HDFS to archive data. I archive a lot of files by creating
>> > one large tar file and then placing it in HDFS. Is it better to use a
>> > Hadoop archive for this, or is it essentially the same thing?
>> >
>> > --
>> > --- Get your facts first, then you can distort them as you please.--
>>
>> --
>> Joseph Echeverria
>> Cloudera, Inc.
>> 443.305.9434
>
> --
> --- Get your facts first, then you can distort them as you please.--

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434