[ https://issues.apache.org/jira/browse/PIG-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459177#comment-13459177 ]
Bill Graham commented on PIG-2921:
----------------------------------

I don't think we'd want to do anything too crazy on the client side here, since the proper solution would be to insert a marker into the logical/physical plan that gets executed on the client. Pig isn't set up to support custom use cases like this yet (PIG-2906 could help).

HBaseStorage could always just write HFiles, which the caller then needs to bulk import. This puts the burden on the caller to know what to do. Not the greatest solution.

I think it would be worth exploring whether we can do this with PIG-1891 though. This kind of use case (or similarly doing a table swap in SQL, for example) is what I was hoping PIG-1891 would handle. The tricky bit though is that you only want *one* of the mappers or reducers to take the success action.

> Provide a bulkloadable option in HBaseStorage
> ---------------------------------------------
>
>                 Key: PIG-2921
>                 URL: https://issues.apache.org/jira/browse/PIG-2921
>             Project: Pig
>          Issue Type: New Feature
>          Components: data
>    Affects Versions: 0.9.2
>            Reporter: Harsh J
>
> Right now, the Pig HBaseStorage writes Puts directly into HBase. This is slow
> for bulk operations (exactly the kind Pig does). Puts/Deletes are meant more
> for realtime operations, so it would be nice if Pig had an automatic
> mechanism to prepare bulk-loadable HFiles for the target table and bulk load
> them right at the end of the job.
> For compatibility reasons, this can be optional and turned off by default
> until it is agreed that this must be the default (but can continue to provide
> a turn-off option).
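
The "only *one* of the mappers or reducers should take the success action" concern from the comment can be illustrated with a minimal sketch. It is not Pig, HBase, or PIG-1891 code: the class name `SingleSuccessAction`, the method `runOnce`, and the marker file name are hypothetical, and a local directory stands in for the job's output location. The idea is that every task attempts an atomic "create if absent" on a shared marker, and only the task that wins the creation runs the action (for example, triggering the bulk load).

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for illustration only; not part of Pig or HBase. */
public class SingleSuccessAction {

    /**
     * Runs the action only if this caller is the first to create the marker.
     * Files.createFile checks for existence and creates the file as a single
     * atomic operation, so at most one caller can succeed.
     */
    public static boolean runOnce(Path marker, Runnable action) throws IOException {
        try {
            Files.createFile(marker);
        } catch (FileAlreadyExistsException e) {
            return false; // another task already claimed the success action
        }
        action.run();
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a temp directory stands in for the shared job output path.
        Path dir = Files.createTempDirectory("job-output");
        Path marker = dir.resolve("_SUCCESS_ACTION_CLAIMED");

        int ran = 0;
        for (int task = 0; task < 4; task++) {
            final int id = task;
            // Each simulated task tries to claim the action; only one wins.
            if (runOnce(marker, () -> System.out.println("task " + id + " takes the success action"))) {
                ran++;
            }
        }
        System.out.println("actions run: " + ran);
    }
}
```

In a real job the marker would have to live on storage shared by all tasks and offer the same atomic-create guarantee, and the sketch does not handle a winning task that fails midway through the action.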