I completely agree with the need to understand both the fundamental HBase
API and how HBase stores data at a low level.  Both are very important in
knowing how to structure your data for best performance, which you should
figure out before moving on to other niceties.

As far as the actual data storage, Lars George did a really informative
write-up:
http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html

And of course there's the HBase Architecture doc:
http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture
and the Google BigTable paper.
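
To make the row key point concrete, here is a minimal sketch of the
BytePacker.pack idea Paul describes further down.  Treat it as an
illustration only: the BytePacker class is hypothetical (it is not part of
HBase), it uses plain java.nio.ByteBuffer instead of the HBase Bytes class,
and it assumes every id is a fixed-width 4-byte int.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of HBase: packs several ints into one
// byte[] that could be used as a composite row key.
final class BytePacker {
    private static final int INT_WIDTH = 4; // bytes per packed int

    private BytePacker() {}

    // Writes each int as 4 big-endian bytes (ByteBuffer's default order),
    // one after another, in the order given.
    static byte[] pack(int... values) {
        ByteBuffer buf = ByteBuffer.allocate(INT_WIDTH * values.length);
        for (int v : values) {
            buf.putInt(v);
        }
        return buf.array();
    }

    // Reverses pack(): splits the key back into its 4-byte ints.
    static int[] unpack(byte[] key) {
        if (key.length % INT_WIDTH != 0) {
            throw new IllegalArgumentException(
                "key length must be a multiple of " + INT_WIDTH);
        }
        ByteBuffer buf = ByteBuffer.wrap(key);
        int[] values = new int[key.length / INT_WIDTH];
        for (int i = 0; i < values.length; i++) {
            values[i] = buf.getInt();
        }
        return values;
    }
}
```

Since the bytes are big-endian, keys built from non-negative ids compare
byte by byte in the same order as the ids themselves, which is how HBase
orders rows.  Negative values would sort after positive ones, so they need
separate handling if they can occur.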


On Tue, Dec 15, 2009 at 11:21 AM, Edward Capriolo <[email protected]> wrote:

> On Tue, Dec 15, 2009 at 11:04 AM, Gary Helmling <[email protected]>
> wrote:
> > This definitely seems to be a common initial hurdle, though I think each
> > project comes at it with their own specific needs.  There are a variety
> of
> > frameworks or libraries you can check out on the Supporting Projects
> page:
> > http://wiki.apache.org/hadoop/SupportingProjects
> >
> > In my case, I wanted a simple object -> hbase mapping layer that would
> take
> > care of the boilerplate work of persistence and provide a slightly higher
> > level API for queries.  It's been open-sourced on github:
> >
> > http://github.com/ghelmling/meetup.beeno
> >
> > It still only really accounts for my project needs -- we're serving
> realtime
> > requests from our web site and not currently doing any MR processing.
>  But
> > if it could be of use, I could always use feedback on how to evolve it.
> :)
> >
> > Some of the other projects listed on the wiki page are doubtless more
> > mature, so they may meet your needs as well.  If none of them are quite
> what
> > you're looking for, then there's always room for another!
> >
> > --gh
> >
> >
> > On Tue, Dec 15, 2009 at 10:39 AM, Edward Capriolo <[email protected]>
> > wrote:
> >
> >> On Tue, Dec 15, 2009 at 1:03 AM, stack <[email protected]> wrote:
> >> > HBase requires java 6 (1.6) or above.
> >> > St.Ack
> >> >
> >> > On Mon, Dec 14, 2009 at 7:41 PM, Paul Smith <[email protected]>
> wrote:
> >> >
> >> >> Just wondering if anyone knows of an existing Hbase utility library
> that
> >> is
> >> >> open sourced that can assist those that have Java5 and above.  I'm
> >> starting
> >> >> off in Hbase, and thinking it'd be great to have API calls similar to
> >> the
> >> >> Google Collections framework.  If one doesn't exist, I think I could
> >> start
> >> >> off a new project in Google Code (ASL it).  I think Hbase is
> targeted <
> >> >> Java 5, so can't take advantage of this yet internally.
> >> >>
> >> >> The sorts of API functions I thought would be useful to make code
> more
> >> >> readable would be something like:
> >> >>
> >> >>
> >> >>        HTable hTable = new
> >> >> TableBuilder(hbaseConfiguration).withTableName("foo")
> >> >>                .withSimpleColumnFamilies("bar", "eek",
> >> >> "moo").deleteAndRecreate();
> >> >>
> >> >> and
> >> >>
> >> >>        ResultScanner scanner = new
> >> >> ResultScannerBuilder(hTable).withColumnFamilies(
> >> >>                "family1", "family2").build();
> >> >>
> >> >>
> >> >> taking advantage of varargs liberally and using nice Patterns etc.
> >>  While
> >> >> the Bytes class is useful, I'd personally benefit from an API that
> can
> >> >> easily pack arbitrary multiple ints (and other data types) together
> into
> >> >> byte[] for RowKeyGoodness(tm) ala:
> >> >>
> >> >> byte[] rowKey = BytePacker.pack(fooId, barId, eekId, mooId);
> >> >>
> >> >> (behind the scenes this is a vararg method that recursively packs each
> >> >> into byte[] via Bytes.add(byte[] b1, byte[] b2) etc.)
> >> >>
> >> >> If anyone knows of a library that does this, pointers please.
> >> >>
> >> >> cheers,
> >> >>
> >> >> Paul
> >> >
> >>
> >> I could see this being very useful. My first barrier to hbase was
> >> trying to figure out how to turn what I knew of as an SQL select clause
> >> into a set of HBase server-side filters. Mostly, I pieced this
> >> together with help from the list, and the Test Cases. That could be
> >> frustrating for some.  Now that I am used to it, I notice that the
> >> HBase way is actually much cleaner and much less code.
> >>
> >> So, yes a helper library is a great thing.
> >>
> >> As part of the "proof of concept" I am working on, large sections of
> >> it are mostly descriptions of doing things like column projections in
> >> both SQL and HBase with filters. So I think both are very helpful for
> >> making Hbase more attractive to an end user.
> >>
> >
>
> All interesting. In a sense, I believe you should learn to walk before
> you can run :). It is hard to troubleshoot how an ORM mapper is
> working if you are basically clueless about the HBase API.
>
> You know when lots of user tools get pulled in the mix:
> q: How do I only get column X?
> a: You need to get a Spring-injectable, Grails, RESTful, ORM mapper,
> that is only found in git, but there are like 4 forks of it, so pick
> this one :)
>
