[
https://issues.apache.org/jira/browse/PHOENIX-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16235025#comment-16235025
]
James Taylor commented on PHOENIX-4344:
---------------------------------------
Here's a possible way to proceed with this:
- In PhoenixInputFormat, we drive things based on a QueryPlan. I think the
first thing we'll need is PHOENIX-4342 - providing a way of getting the
underlying QueryPlan from a MutationPlan (which is what you get when you
compile a DELETE statement).
- Create a different implementation of PhoenixInputFormat.getQueryPlan() that
compiles the DELETE statement and gets the QueryPlan from the MutationPlan.
- Keep the same logic that ends up setting up one mapper per scan in the
QueryPlan
- Instead of executing each individual scan, you'd want to execute a DELETE
statement bounded by the start/stop key of each scan
- Run code similar to FormatToBytesWritableMapper to put together the list
of Delete mutations
- Make sure we've got the write-to-multiple HTables working correctly (I
believe MultiHfileOutputFormat does that)
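
To make the per-scan idea concrete, here's a rough sketch in plain Java. It
does not use any Phoenix or HBase classes: the table is modeled as a sorted
map, and KeyRange, collectDeletes and the sample data are all hypothetical
names made up for illustration. It only shows the shape of the approach: one
unit of work per scan range, each collecting the row keys to delete within its
own start (inclusive) / stop (exclusive) key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.function.Predicate;

public class DeletePerScanSketch {

    /**
     * Hypothetical stand-in for one scan's key range: [startKey, stopKey).
     * An empty string means "unbounded" on that side.
     */
    static final class KeyRange {
        final String startKey;
        final String stopKey;

        KeyRange(String startKey, String stopKey) {
            this.startKey = startKey;
            this.stopKey = stopKey;
        }
    }

    /**
     * Work done by one "mapper": collect the row keys inside this scan's
     * range whose value matches the DELETE's WHERE condition.
     */
    static List<String> collectDeletes(NavigableMap<String, Integer> table,
                                       KeyRange range,
                                       Predicate<Integer> where) {
        NavigableMap<String, Integer> view = table;
        if (!range.startKey.isEmpty()) {
            view = view.tailMap(range.startKey, true);  // start key is inclusive
        }
        if (!range.stopKey.isEmpty()) {
            view = view.headMap(range.stopKey, false);  // stop key is exclusive
        }
        List<String> deletes = new ArrayList<>();
        for (Map.Entry<String, Integer> row : view.entrySet()) {
            if (where.test(row.getValue())) {
                deletes.add(row.getKey());
            }
        }
        return deletes;
    }

    public static void main(String[] args) {
        // Made-up sample data: row key -> a single numeric column.
        NavigableMap<String, Integer> table = new TreeMap<>();
        table.put("apple", 3);
        table.put("grape", 12);
        table.put("melon", 8);
        table.put("zucchini", 20);

        // One range per scan, as if taken from the query plan's scans.
        List<KeyRange> scans = Arrays.asList(
                new KeyRange("", "g"),
                new KeyRange("g", "p"),
                new KeyRange("p", ""));

        // Stand-in for the DELETE statement's WHERE clause.
        Predicate<Integer> where = value -> value >= 8;

        for (KeyRange scan : scans) {
            List<String> deletes = collectDeletes(table, scan, where);
            System.out.println("[" + scan.startKey + ", " + scan.stopKey + ") -> " + deletes);
        }
    }
}
```

In the real job, each range would presumably come from a scan in the QueryPlan,
and the collected keys would become Delete mutations for the data table and its
indexes instead of a list of strings.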
> MapReduce Delete Support
> ------------------------
>
> Key: PHOENIX-4344
> URL: https://issues.apache.org/jira/browse/PHOENIX-4344
> Project: Phoenix
> Issue Type: New Feature
> Affects Versions: 4.12.0
> Reporter: Geoffrey Jacoby
> Assignee: Geoffrey Jacoby
> Priority: Major
>
> Phoenix already has the ability to use MapReduce for asynchronous handling of
> long-running SELECTs. It would be really useful to have this capability for
> long-running DELETEs, particularly of tables with indexes where using HBase's
> own MapReduce integration would be prohibitively complicated.