[ https://issues.apache.org/jira/browse/PIG-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bill Graham updated PIG-1782:
-----------------------------
    Attachment: apply-PIG-1782-patch.sh
                PIG-1782_1.patch

Attached are two files, a patch and a script to apply it. A few things to note about this patch:

* It relies on HBase 0.89.0 or greater and it effectively replaces PIG-1680.
* I've updated HBaseStorage for now. If we want to deprecate that class and create a new one instead, I can do that.
* I added support for a {{columnPrefix}} option to filter down the columns returned. Proper column prefix functionality, though, requires HBASE-3550.
* I had to do some hackery in {{setStoreLocation}} and {{getOutputFormat}} with the conf objects to keep NPEs from being thrown from HBase (see comments in code). A review of what I'm doing with the conf objects in that part of the code would be good.
* There are still no unit tests for this code, since it's a tricky thing to test. I have a few simple HBase and Pig scripts that I've been using that I could provide.

> Add ability to load data by column family in HBaseStorage
> ---------------------------------------------------------
>
>                 Key: PIG-1782
>                 URL: https://issues.apache.org/jira/browse/PIG-1782
>             Project: Pig
>          Issue Type: New Feature
>         Environment: Java 6, Mac OS X 10.6
>            Reporter: Eric Yang
>            Assignee: Bill Graham
>         Attachments: PIG-1782_1.patch, apply-PIG-1782-patch.sh
>
>
> It would be nice to load all columns in the column family by using shorthand
> syntax like:
> {noformat}
> CpuMetrics = load 'hbase://SystemMetrics' USING
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cpu:','-loadKey');
> {noformat}
> Assuming there are columns cpu:sys.0, cpu:sys.1, cpu:user.0, cpu:user.1 in
> the cpu column family,
> CpuMetrics would contain something like:
> {noformat}
> (rowKey, cpu:sys.0, cpu:sys.1, cpu:user.0, cpu:user.1)
> {noformat}

--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira