Anze, we bumped the requirement up to 0.20.6 in the ticket because HBase 0.20.2 had a bug in it. Ask the HBase folks, but I'd say you should upgrade. FWIW, we upgraded from 0.20.2 to 0.20.6 a few months back and it's been working smoothly.
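
If you do end up on Pig 0.8, loading with row keys via the built-in loader looks roughly like the sketch below. The table name 'mytable', the column family 'cf', and the column names are placeholders I made up for illustration; the '-loadKey' option is what puts the row key in the first field. I haven't run this against your setup, so treat it as a starting point rather than a tested script.

```pig
-- Hypothetical table 'mytable' with column family 'cf'; adjust to your schema.
-- '-loadKey' asks HBaseStorage to return the row key as the first field.
raw = LOAD 'hbase://mytable'
      USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
          'cf:col1 cf:col2', '-loadKey')
      AS (rowkey:chararray, col1:chararray, col2:chararray);

-- Quick sanity check: print a handful of rows.
few = LIMIT raw 10;
DUMP few;
```

The HBase and ZooKeeper jars (and your hbase-site.xml) need to be on Pig's classpath for this to find the cluster.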
The Elephant-Bird HBase loader for Pig 0.6 does add row keys and most of the other features we added to the built-in loader for Pig 0.8 (notably, it does not do storage). But I don't recommend downgrading to Pig 0.6, as 0.7 and especially 0.8 are great improvements to the software.

-D

On Mon, Oct 25, 2010 at 7:01 AM, Anze <anzen...@volja.net> wrote:
> Hi all!
>
> I am struggling to find a working solution to load data from HBase directly. I
> am using Cloudera CDH3b3 which comes with Pig 0.7. What would be the easiest
> way to load data from HBase?
> If it matters: we need the rows to be included, too.
>
> I have checked ElephantBird, but it seems to require Pig 0.6. I could
> downgrade, but it seems... well... :)
>
> On the other hand, loading from HBase with rows is only added in Pig 0.8:
> https://issues.apache.org/jira/browse/PIG-915
> https://issues.apache.org/jira/browse/PIG-1205
> But judging from the last issue Pig 0.8 requires HBase 0.20.6?
>
> I can install latest Pig from source if needed, but I'd rather leave Hadoop
> and HBase at their versions (0.20.2 and 0.89.20100924 respectively).
>
> Should I write my own UDF? I'd appreciate some pointers.
>
> Thanks,
>
> Anze
>