---------- Forwarded message ----------
From: Weishung Chung <weish...@gmail.com>
Date: Tue, Mar 22, 2011 at 11:31 AM
Subject: Re: File formats in Hadoop
To: Vivek Krishna <vivekris...@gmail.com>
Cc: u...@hbase.apache.org, common-u...@hadoop.apache.org, qwertyman...@gmail.com, Doug Cutting <cutt...@apache.org>

I also found this informative article:
http://cloudepr.blogspot.com/2009/09/hfile-block-indexed-file-format-to.html

Would the key/value pairs look like this, e.g. for column family1 with one
qualifier (qualifier1) and 2 versions?

key1:   rowkey1+column family1:qualifier1+timestamp1
value1: corresponding cell value 1
key2:   rowkey1+column family1:qualifier1+timestamp2
value2: corresponding cell value 2
key3:   rowkey2+column family1:qualifier1+timestamp1
value3: corresponding cell value 3

On Tue, Mar 22, 2011 at 10:58 AM, Vivek Krishna <vivekris...@gmail.com> wrote:

> http://nosql.mypopescu.com/post/3220921756/hbase-internals-hfile-explained
> might help.
>
> Viv
>
> On Tue, Mar 22, 2011 at 11:43 AM, Weishung Chung <weish...@gmail.com> wrote:
>
>> My fellow superb hbase experts,
>>
>> Looking at the HFile specs and have some questions.
>> How is a particular table cell in a HBase table being represented in the
>> HFile? Does the key of the key value pair represent the rowkey+column
>> family:qualifier+timestamp and the value represent the corresponding cell
>> value? If so, to read a row, multiple key/value pair reads have to be
>> done?
>>
>> Thank you :)
>>
>> On Tue, Mar 22, 2011 at 9:09 AM, Weishung Chung <weish...@gmail.com>
>> wrote:
>>
>> > Thank you, I will definitely take a look. Also, the TFile spec below
>> > helps me to understand more,
>> > what an exciting work !
>> >
>> > https://issues.apache.org/jira/secure/attachment/12396286/TFile+Specification+20081217.pdf
>> >
>> > On Mon, Mar 21, 2011 at 11:41 AM, Doug Cutting <cutt...@apache.org>
>> > wrote:
>> >
>> >> On 03/19/2011 09:01 AM, Weishung Chung wrote:
>> >> > I am browsing through the hadoop.io package and was wondering what
>> >> > other file formats are available in hadoop other than SequenceFile
>> >> > and TFile? Is all data written through hadoop including those from
>> >> > hbase saved in the above formats? It seems like SequenceFile is in
>> >> > key value pair format.
>> >>
>> >> Avro includes a file format that works with Hadoop.
>> >>
>> >> http://avro.apache.org/docs/current/api/java/org/apache/avro/mapred/package-summary.html
>> >>
>> >> Doug
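
To make the key layout asked about at the top of this thread concrete, here is a minimal Python sketch of cells stored as sorted key/value pairs. It is only an illustration, not HBase code: the names Cell, sort_key and read_row are hypothetical, the integer timestamps are made up, and the real HFile key also carries a key-type field that is left out here. It assumes the ordering HBase uses for cell keys: row, family and qualifier ascending, then timestamp descending, so the newest version of a cell comes first.

```python
from typing import List, NamedTuple


class Cell(NamedTuple):
    """One stored key/value pair (hypothetical stand-in for an HBase cell)."""
    row: bytes
    family: bytes
    qualifier: bytes
    timestamp: int
    value: bytes


def sort_key(cell: Cell):
    # Row, family, qualifier ascending; timestamp negated so newer versions sort first.
    return (cell.row, cell.family, cell.qualifier, -cell.timestamp)


def read_row(sorted_cells: List[Cell], row: bytes) -> List[Cell]:
    # A row is not one record: it is every key/value pair sharing the row key.
    # Because the cells are sorted, those pairs sit next to each other.
    return [c for c in sorted_cells if c.row == row]


cells = [
    Cell(b"rowkey2", b"family1", b"qualifier1", 1, b"value3"),
    Cell(b"rowkey1", b"family1", b"qualifier1", 1, b"value1"),
    Cell(b"rowkey1", b"family1", b"qualifier1", 2, b"value2"),
]
ordered = sorted(cells, key=sort_key)
row1 = read_row(ordered, b"rowkey1")
```

Under these assumptions, reading rowkey1 returns two key/value pairs, one per version, with the timestamp2 version ahead of the timestamp1 version.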