[ 
https://issues.apache.org/jira/browse/PIG-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12987991#action_12987991
 ] 

Dmitriy V. Ryaboy commented on PIG-1782:
----------------------------------------

Bill, I think what you are suggesting is the "correct" way, but I'd prefer not 
to break people's existing scripts, which is what would happen if your 
proposal changed what we return when a schema like 'cf2:foo cf2:bar' is 
specified...

There are also usability benefits to having the flat return schema you get from 
HBaseStorage now -- it looks exactly like loading from PigStorage, so no 
surprises. You ask for 2 columns and get 2 values in a tuple, which is pretty 
much what you'd expect.
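
To make the flat shape concrete, here is a small Python model of it. This is 
only an illustration, not Pig or HBaseStorage code: the flat_load function, 
the row-as-dict representation, and the choice to return None for a missing 
column are all assumptions made for the sketch.

```python
# Illustrative model only: this is NOT Pig or HBaseStorage code.
# An HBase row is modelled as a dict of "family:qualifier" -> value.

def flat_load(row_key, row, column_spec, load_key=False):
    """Return one flat tuple with one value per requested column, in request order."""
    columns = column_spec.split()
    # Assumption: a column missing from the row becomes None (a null field).
    values = tuple(row.get(col) for col in columns)
    # Optionally prepend the row key, mirroring the '-loadKey' option in the issue.
    return (row_key,) + values if load_key else values


row = {"cf2:foo": "1", "cf2:bar": "2", "cf2:baz": "3"}

# Ask for 2 columns, get 2 values in a tuple.
print(flat_load("row1", row, "cf2:foo cf2:bar"))
# Same request with the row key as the first field.
print(flat_load("row1", row, "cf2:foo cf2:bar", load_key=True))
```

The point of the sketch is that the width of the tuple is fixed by the column 
list in the script, so existing scripts that address fields by position keep 
working only as long as that shape does not change.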

Perhaps we take your suggestion, put that into builtins.AdvancedHBaseStorage, 
deprecate the current HBaseStorage, and move the current code to 
builtins.SimpleHBaseStorage?

> Add ability to load data by column family in HBaseStorage
> ---------------------------------------------------------
>
>                 Key: PIG-1782
>                 URL: https://issues.apache.org/jira/browse/PIG-1782
>             Project: Pig
>          Issue Type: New Feature
>         Environment: Java 6, Mac OS X 10.6
>            Reporter: Eric Yang
>            Assignee: Bill Graham
>
> It would be nice to load all columns in a column family by using shorthand 
> syntax like:
> {noformat}
> CpuMetrics = load 'hbase://SystemMetrics' USING 
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cpu:','-loadKey');
> {noformat}
> Assuming there are columns cpu:sys.0, cpu:sys.1, cpu:user.0, and cpu:user.1 
> in the cpu column family, CpuMetrics would contain something like:
> {noformat}
> (rowKey, cpu:sys.0, cpu:sys.1, cpu:user.0, cpu:user.1)
> {noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
