On Mon, May 11, 2009 at 5:41 AM, Eran Bergman <[email protected]> wrote:
> The map function basically uses the column I wish to join by as the key
> and the rest of the columns are used as value.
> The reducer just combines all of the values into a single row.

Pardon me, I don't follow.  Would you mind providing a little more
detail?  How does the table row key get into the mix?  Tell us also about
your hbase schema and how you did the join here (scan and random accesses,
then writing the result back into hbase?).

> Should there be such a big overhead (50 times multiplier) when using HBase
> instead of Hadoop?

Which version of hbase are you using?
Are you in a hurry?  (Hopefully w/i the month we'll have caching and
in-memory tables working in TRUNK.)

Is this an RDF project by any chance?

Thanks,
St.Ack
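For readers following along: a minimal sketch, in plain Python rather than an actual Hadoop job, of the reduce-side join Eran describes above (the map emits the join column as the key and the remaining columns as the value; the reduce merges all values sharing a key into one row). The sample tables and column layout here are hypothetical, invented purely for illustration.

```python
from collections import defaultdict

def map_phase(rows, join_index):
    """Emit (join key, remaining columns) for each input row."""
    for row in rows:
        key = row[join_index]
        value = row[:join_index] + row[join_index + 1:]
        yield key, value

def reduce_phase(pairs):
    """Group by join key and combine all values into a single row per key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].extend(value)
    return {key: [key] + values for key, values in grouped.items()}

# Two hypothetical tables joined on their first column.
table_a = [("k1", "a1"), ("k2", "a2")]
table_b = [("k1", "b1"), ("k2", "b2")]

pairs = list(map_phase(table_a, 0)) + list(map_phase(table_b, 0))
joined = reduce_phase(pairs)
# joined["k1"] -> ["k1", "a1", "b1"]
```

In a real MapReduce job the grouping step is done by the shuffle, not by an in-memory dict; this sketch only shows the key/value layout being discussed.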
