[ 
https://issues.apache.org/jira/browse/PIG-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459177#comment-13459177
 ] 

Bill Graham commented on PIG-2921:
----------------------------------

I don't think we'd want to do anything too crazy on the client side here, since 
the proper solution would be to insert a marker into the logical/physical plan 
that gets executed on the client. Pig isn't set up to support custom use cases 
like this yet (PIG-2906 could help).

HBaseStorage could always just write HFiles, which the caller then needs to 
bulk import. This puts the burden on the caller to know what to do. Not the 
greatest solution.

I think it would be worth exploring whether we can do this with PIG-1891 
though. This kind of use case (or, similarly, doing a table swap in SQL) is 
what I was hoping PIG-1891 would handle. The tricky bit though is that you 
only want *one* of the mappers or reducers to take the success action.
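
To make the "only one task acts" constraint concrete, here is a minimal sketch 
of one possible approach: every task tries to create the same marker file, and 
only the task whose create succeeds runs the success action. This is not 
Pig's API and not what PIG-1891 provides. The class name SuccessActionGuard, 
its methods, and the marker path are hypothetical, and java.nio on a local 
filesystem stands in for whatever shared store (HDFS, for example) the tasks 
can all see.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: lets exactly one caller per marker path run an action. */
public class SuccessActionGuard {

    /**
     * Tries to claim the success action by creating the marker file.
     * Files.createFile checks for existence and creates the file as a single
     * atomic operation, so at most one caller gets true for a given path.
     */
    public static boolean tryClaim(Path marker) throws IOException {
        try {
            Files.createFile(marker);
            return true;
        } catch (FileAlreadyExistsException e) {
            // Another task already claimed the action.
            return false;
        }
    }

    /** Runs successAction only if this caller won the claim; returns whether it ran. */
    public static boolean runOnce(Path marker, Runnable successAction) throws IOException {
        if (!tryClaim(marker)) {
            return false;
        }
        successAction.run(); // e.g. trigger the bulk import of the HFiles
        return true;
    }
}
```

A sketch like this only picks a single winner. It does not check that all 
other tasks have finished, so on its own it cannot tell whether the job as a 
whole succeeded.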

                
> Provide a bulkloadable option in HBaseStorage
> ---------------------------------------------
>
>                 Key: PIG-2921
>                 URL: https://issues.apache.org/jira/browse/PIG-2921
>             Project: Pig
>          Issue Type: New Feature
>          Components: data
>    Affects Versions: 0.9.2
>            Reporter: Harsh J
>
> Right now, the Pig HBaseStorage writes Puts directly into HBase. This is slow 
> for bulk operations (which is exactly what Pig does). Puts/Deletes are meant 
> more for realtime operations, so it would be nice if Pig had an automatic 
> mechanism to prepare bulkloadable HFiles for the target table and bulk load 
> them right at the end of the job.
> For compatibility reasons, this can be optional and turned off by default 
> until it is agreed that it should be the default (at which point a turn-off 
> option can still be provided).

