[
https://issues.apache.org/jira/browse/PIG-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitriy V. Ryaboy updated PIG-1782:
-----------------------------------
Resolution: Fixed
Fix Version/s: 0.9.0
Release Note:
Enhanced HBaseStorage functionality to support loading dynamically named
columns by column family or by column name prefixes.
Javadoc:
/**
* An HBase implementation of LoadFunc and StoreFunc.
* <P>
* Below is an example showing how to load data from HBase:
* <pre>{@code
* raw = LOAD 'hbase://SampleTable'
* USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
* 'info:first_name info:last_name friends:* info:*', '-loadKey true -limit 5')
* AS (id:bytearray, first_name:chararray, last_name:chararray,
* friends_map:map[], info_map:map[]);
* }</pre>
* This example loads data redundantly from the info column family just to
* illustrate usage. Note that the row key is inserted first in the result
* schema.
* To load only column names that start with a given prefix, specify the column
* name with a trailing '*'. For example, passing <code>friends:bob_*</code> to
* the constructor in the above example would cause only columns that start with
* <i>bob_</i> to be loaded.
* <P>
* Below is an example showing how to store data into HBase:
* <pre>{@code
* STORE raw INTO 'hbase://SampleTableCopy'
* USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
* 'info:first_name info:last_name buddies:* info:*');
* }</pre>
* Note that STORE will expect the first value in the tuple to be the row key.
* Scalar values need to map to an explicit column descriptor and maps need to
* map to a column family name. In the above examples, the <code>friends</code>
* column family data from <code>SampleTable</code> will be written to a
* <code>buddies</code> column family in the <code>SampleTableCopy</code> table.
*
*/
was:Enhanced HBaseStorage functionality to support loading dynamically named
columns by column family or by column name prefixes.
Status: Resolved (was: Patch Available)
Committed to 0.9 trunk.
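For readers unfamiliar with the descriptor syntax, the selection rules described in the Javadoc above (an exact column, a whole family via 'family:*', or a prefix via 'family:prefix*') can be sketched in plain Java. This is an illustration only, not the code from the committed patch: the class name ColumnSpecSketch and the method matches are hypothetical, and the sketch assumes a trailing '*' is the only wildcard form.

```java
public class ColumnSpecSketch {

    /**
     * Hypothetical helper: returns true if the column identified by
     * (family, qualifier) is selected by a descriptor of the form
     * "family:qualifier", "family:prefix*" or "family:*".
     */
    static boolean matches(String descriptor, String family, String qualifier) {
        int sep = descriptor.indexOf(':');
        if (sep < 0) {
            // Assumption: this sketch only handles the family:qualifier form.
            return false;
        }
        String descFamily = descriptor.substring(0, sep);
        String descQualifier = descriptor.substring(sep + 1);
        if (!descFamily.equals(family)) {
            return false;
        }
        if (descQualifier.endsWith("*")) {
            // A trailing '*' turns the descriptor into a prefix match;
            // a bare "*" has an empty prefix and matches the whole family.
            String prefix = descQualifier.substring(0, descQualifier.length() - 1);
            return qualifier.startsWith(prefix);
        }
        // No wildcard: the qualifier must match exactly.
        return descQualifier.equals(qualifier);
    }

    public static void main(String[] args) {
        String[] qualifiers = {"bob_1", "bob_2", "alice_1"};
        for (String q : qualifiers) {
            System.out.println("friends:" + q
                + " family=" + matches("friends:*", "friends", q)
                + " prefix=" + matches("friends:bob_*", "friends", q));
        }
    }
}
```

Under these assumptions, 'friends:*' selects every qualifier in the friends family, while 'friends:bob_*' selects bob_1 and bob_2 but not alice_1.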
> Add ability to load data by column family in HBaseStorage
> ---------------------------------------------------------
>
> Key: PIG-1782
> URL: https://issues.apache.org/jira/browse/PIG-1782
> Project: Pig
> Issue Type: New Feature
> Environment: Java 6, Mac OS X 10.6
> Reporter: Eric Yang
> Assignee: Bill Graham
> Fix For: 0.9.0
>
> Attachments: PIG-1782_1.patch, PIG-1782_4.patch, PIG_1782_2.patch,
> PIG_1782_3.patch, apply-PIG-1782-patch.sh
>
>
> It would be nice to load all columns in a column family by using shorthand
> syntax like:
> {noformat}
> CpuMetrics = load 'hbase://SystemMetrics' USING
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cpu:','-loadKey');
> {noformat}
> Assuming the cpu column family contains the columns cpu:sys.0, cpu:sys.1,
> cpu:user.0 and cpu:user.1,
> CpuMetrics would contain something like:
> {noformat}
> (rowKey, cpu:sys.0, cpu:sys.1, cpu:user.0, cpu:user.1)
> {noformat}