HBaseStorage not casting correctly
----------------------------------

                 Key: PIG-2611
                 URL: https://issues.apache.org/jira/browse/PIG-2611
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.9.2
         Environment: Ubuntu 11.10, Hadoop 0.20.2, HBase 0.92.0
            Reporter: David Arthur
            Priority: Minor


When storing data into HBase with HBaseStorage, a type declared in the record 
schema is not applied to the underlying value, which leads to a casting error.

Here is the relevant code snippet:
{code}
B = group A by (time_tuple, some_scalar);
C = foreach B {
        -- UDF to generate id (bytearray)
        generate id, flatten(group.$0), COUNT(A);
}
{code}

At this point the schema for C is unknown, so I declare a schema with a foreach 
statement:

{code}
D = foreach C generate $0 as id:bytearray, $1 as year:int, $2 as month:int,
        $3 as date:int, $4 as count:int;
{code}

Even though I've declared $4 as an int in D, the underlying value is still a 
long (COUNT returns a long). When I store D into HBase I get a 
ClassCastException, since the declared type (int) does not match the actual 
tuple value (long). I can work around this by explicitly casting when I declare 
the schema:

{code}
D = foreach C generate $0 as id:bytearray, $1 as year:int, $2 as month:int,
        $3 as date:int, (int)$4 as count:int;
{code}
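
A related sketch, in case it helps with reproducing this: applying the cast 
where COUNT is produced should keep the value and its declared type in agreement 
from the start. The load path, field names and field types below are 
assumptions for illustration only, not my actual script, and the id UDF is 
left out.

{code}
-- Hypothetical input: path and schema are made up for this example
A = load 'input' as (time_tuple:tuple(year:int, month:int, date:int),
                     some_scalar:chararray);
B = group A by (time_tuple, some_scalar);
-- Cast the long returned by COUNT at the point where it is generated
C = foreach B generate flatten(group.$0), (int)COUNT(A) as count:int;
-- Print the schema Pig has recorded for C
describe C;
{code}

Running describe on each relation shows the schema Pig has recorded, which can 
then be compared against the types of the values actually stored.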

Is this expected behavior? If not, is this an HBaseStorage issue, where it does 
not honor the declared schema before casting the values?

Cheers,
David
