Re: loading from HBase - Pig 0.7

Dmitriy Ryaboy Mon, 25 Oct 2010 15:02:45 -0700

Anze, the reason we bumped up to 0.20.6 in the ticket was that HBase
0.20.2 had a bug in it. Ask the HBase folks, but I'd say you should
upgrade.
FWIW, we upgraded from 0.20.2 to 0.20.6 a few months back and it's been
working smoothly.


The Elephant-Bird HBase loader for Pig 0.6 does add row keys and most
of the other features we added to the built-in loader for Pig 0.8
(notably, it does not do storage). But I don't recommend downgrading
to Pig 0.6, as 0.7 and especially 0.8 are great improvements to the
software.
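
For reference, a minimal sketch of what loading with row keys might look
like using the built-in loader in Pig 0.8. The table 'mytable', the
column family 'cf', the columns 'name' and 'age', and the aliases are all
hypothetical placeholders, and the exact option syntax should be checked
against the Pig release in use (some releases document the option as
'-loadKey true'):

```
-- Hypothetical table, column family, and column names; adjust to your schema.
-- '-loadKey' asks the loader to return the row key as the first field.
raw = LOAD 'hbase://mytable' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:name cf:age', '-loadKey') AS (rowkey:bytearray, name:bytearray, age:bytearray);

-- Values come back as bytearrays here, so cast them before use.
typed = FOREACH raw GENERATE (chararray) rowkey, (chararray) name, (int) age;
DUMP typed;
```

The row key comes first in the schema, followed by the requested columns
in the order they were listed.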

-D


On Mon, Oct 25, 2010 at 7:01 AM, Anze <anzen...@volja.net> wrote:
> Hi all!
>
> I am struggling to find a working solution to load data from HBase directly. I
> am using Cloudera CDH3b3 which comes with Pig 0.7. What would be the easiest
> way to load data from HBase?
> If it matters: we need the rows to be included, too.
>
> I have checked ElephantBird, but it seems to require Pig 0.6. I could
> downgrade, but it seems... well... :)
>
> On the other hand, loading from HBase with rows is only added in Pig 0.8:
> https://issues.apache.org/jira/browse/PIG-915
> https://issues.apache.org/jira/browse/PIG-1205
> But judging from the last issue Pig 0.8 requires HBase 0.20.6?
>
> I can install latest Pig from source if needed, but I'd rather leave Hadoop
> and HBase at their versions (0.20.2 and 0.89.20100924 respectively).
>
> Should I write my own UDF? I'd appreciate some pointers.
>
> Thanks,
>
> Anze
>
