On Tue, Sep 22, 2009 at 10:10 PM, stchu <stchu.cl...@gmail.com> wrote:

> Hi Stack and Erik,
>
> Thanks for your answers. I think the timestamp is also contained in the
> mapfiles (in binary format?), am I right?
>

Yes, it's a serialized long.

> HFile looks better. I will migrate my program to Hadoop 0.20 and HBase 0.20
> after I finish my experiments on 0.19.
> But it will take some effort because of the incompatible APIs... :P
>

Well, the old APIs are still in place, just deprecated, so hopefully you
shouldn't have to migrate anything.

Go easy,
St.Ack

> stchu
>
>
> 2009/9/23 stack <st...@duboce.net>
>
> > Yes, what Erik said. MapFile is a binary format. What you are seeing is
> > some preamble up front listing the key and value class types plus some
> > miscellaneous metadata. Then, per key and value, these are serialized
> > Writable types.
> >
> > Move to HBase 0.20.0. It uses HFile instead of MapFile. There is a nice
> > little utility that does a toString on the HFile binary serializations
> > that prints prettier than the below.
> >
> > St.Ack
> >
> >
> > On Tue, Sep 22, 2009 at 3:10 AM, stchu <stchu.cl...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I use Hadoop 0.19.1 and HBase 0.19.3.
> > > I wrote a simple table which has 2 column families (Level0:trail_id,
> > > Level1:trail_id).
> > > And I put the data (4 rows) into the HBase table:
> > >  120_25      column=Level0:trail_id, timestamp=2009091613240001, value=39999;21234
> > >  121.1_23.4  column=Level1:trail_id, timestamp=2009091613240001, value=50001;00048;111110
> > >  121.1_25.0  column=Level1:trail_id, timestamp=2009091613240001, value=39999;21234
> > >  121_25      column=Level0:trail_id, timestamp=2009091613240003, value=39999;21234;000001;000003
> > >
> > >
> > > I find the content of the files in HDFS is:
> > >
> > > for the mapfile Level0:
> > > SEQ
> > > !org.apache.hadoop.hbase.HStoreKey1org.apache.hadoop.hbase.io.ImmutableBytesWritable�������h
> > > =
> > > �p{9
> > > ��1������.��� 120_25 Level0:trail_id� #B ����� 39999;21234���<��� 121_25 Level0:trail_id� #B ����� 39999;21234;000001;000003
> > >
> > > for the mapfile Level1:
> > > SEQ
> > > !org.apache.hadoop.hbase.HStoreKey1org.apache.hadoop.hbase.io.ImmutableBytesWritable�������>T�
> > > �4�q-�� ��.���9���#
> > > 121.1_23.4 Level1:trail_id� #B ����� 50001;00048;111110���2���#
> > > 121.1_25.0 Level1:trail_id� #B ����� 39999;21234
> > >
> > >
> > > I wonder what the messy characters mean. Are they offsets and/or
> > > timestamps?
> > > Besides, since HBase stores a separate mapfile per column family, why
> > > do we need to save the family name (in this case: Level0 and Level1)
> > > in every entry?
> > >
> > > I would appreciate any help or guidance.
> > >
> > > stchu
> > >
> > >
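
To make the "serialized long" answer concrete, here is a small Python sketch that reproduces a few of the printable bytes visible in the dumps quoted above. It rests on two assumptions that the thread itself does not confirm: that the timestamp is written the way Java's DataOutput.writeLong writes a long (8 bytes, most significant byte first), and that the file header stores each class name with a one-byte length prefix (which holds for Hadoop's variable-length int encoding when the length is under 128). The helper names as_text and length_prefixed are made up for this illustration.

```python
import struct

# Timestamp of the first three cells, taken from the shell output in the thread.
ts = 2009091613240001

# Assumption: the long is written big-endian in 8 bytes, as Java's
# DataOutput.writeLong does.
raw = struct.pack(">q", ts)
print(raw.hex())


def as_text(data):
    """Show bytes roughly as a text viewer would: printable ASCII kept, rest as '.'."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in data)


# Two of the eight bytes happen to be printable ASCII.
print(as_text(raw))

# Class names that appear in the header of both dumps.
key_class = "org.apache.hadoop.hbase.HStoreKey"
value_class = "org.apache.hadoop.hbase.io.ImmutableBytesWritable"


def length_prefixed(name):
    """Assumption: a single length byte precedes the UTF-8 bytes of the name."""
    encoded = name.encode("utf-8")
    if len(encoded) >= 128:
        raise ValueError("a single-byte length prefix only covers lengths under 128")
    return bytes([len(encoded)]) + encoded


header_fragment = length_prefixed(key_class) + length_prefixed(value_class)
print(as_text(header_fragment))
```

Under these assumptions, the "#B" that shows up before every value in the dumps is the third and fourth byte of the timestamp (0x23 and 0x42), and the "!" and "1" in front of the two class names are simply their lengths (33 and 49) rendered as ASCII characters. The remaining unprintable bytes would then be the other timestamp bytes, record and key lengths, and the sync marker, though that part is a guess and not something the sketch verifies.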