Hi Eduardo, there is no 0.9.1. Do you mean you built it from the 0.9
branch?
Could you try trunk?

On Tue, Sep 6, 2011 at 9:50 AM, Eduardo Afonso Ferreira
<eafon...@yahoo.com> wrote:

> Hi there,
>
> We hit a possible issue with Pig (version 0.9.1) and HBaseStorage where we
> try to LOAD multiple sets of data and UNION them. Here's a simple example
> that shows the problem:
>
> HBase Data (use hbase shell to create table and add rows):
>
> create 'test', {NAME => 'data', VERSIONS => 1}
>
> put 'test', '11111', 'data:value', '1'
> put 'test', '11112', 'data:value', '2'
> put 'test', '11113', 'data:value', '3'
> put 'test', '22221', 'data:value', '4'
> put 'test', '22222', 'data:value', '5'
> put 'test', '22223', 'data:value', '6'
>
> Pig Statements (create file test.pig):
>
> load1 = LOAD 'hbase://test'
>     USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('data:*', '-loadKey -gte 11110 -lte 22220')
>     AS (key:chararray, map:map[]);
> load2 = LOAD 'hbase://test'
>     USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('data:*', '-loadKey -gte 22220 -lte 33330')
>     AS (key:chararray, map:map[]);
> result = UNION load1, load2;
> dump result;
>
> Run Script:
> pig -x local test.pig
>
> Result:
> (11111,[value#1])
> (11112,[value#2])
> (11113,[value#3])
> (11111,[value#1])
> (11112,[value#2])
> (11113,[value#3])
>
> The result should be the following:
> (11111,[value#1])
> (11112,[value#2])
> (11113,[value#3])
> (22221,[value#4])
> (22222,[value#5])
> (22223,[value#6])
>
> If we dump load1 or load2 we see the results we expect, but when the UNION
> is performed, it does not put the expected data together.
>
> Is this a known issue with Pig/HBaseStorage or are we not using them as we
> should?
> If it's a usage problem, what would be the proper way of loading multiple
> sets of data and unioning them?
>
> Thanks in advance.
> Eduardo.
>
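
For reference, the behaviour the quoted script is expected to have can be modelled outside Pig. The following is only a minimal Python sketch, not Pig or HBase code: the `ROWS` dictionary and the `load_range` helper are hypothetical stand-ins for the 'test' table and for a key-bounded HBaseStorage load. It assumes that `-gte`/`-lte` bound the row key inclusively in lexicographic order and that UNION simply combines the tuples of both inputs.

```python
# Hypothetical in-memory stand-in for the HBase table 'test' from the quoted message.
ROWS = {
    "11111": {"value": "1"},
    "11112": {"value": "2"},
    "11113": {"value": "3"},
    "22221": {"value": "4"},
    "22222": {"value": "5"},
    "22223": {"value": "6"},
}


def load_range(rows, gte, lte):
    """Return (key, columns) pairs whose row key satisfies gte <= key <= lte.

    Keys are compared as strings, which mirrors lexicographic row-key
    ordering for these equal-length ASCII keys (an assumption of this sketch).
    """
    return [(key, rows[key]) for key in sorted(rows) if gte <= key <= lte]


# The two bounded loads from test.pig, modelled as range filters.
load1 = load_range(ROWS, "11110", "22220")
load2 = load_range(ROWS, "22220", "33330")

# UNION modelled as concatenation; Pig itself does not guarantee tuple order.
result = load1 + load2

for key, columns in result:
    print(key, columns)
```

Under these assumptions each load contributes three distinct rows, so the union holds six tuples, matching the expected result listed in the quoted message rather than the duplicated 1111x rows actually observed.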
