[ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13897489#comment-13897489
 ] 

James Taylor commented on PHOENIX-6:
------------------------------------

This could be done in two stages: first for UPSERT VALUES and next for UPSERT 
SELECT. Here's one way this could be approached:
- add an ON DUPLICATE KEY IGNORE clause to UPSERT in the SQL grammar.
- pass this through the UpsertStatement as a new ignoreDuplicateKeys boolean
- modify UpsertCompiler to pass this boolean into MutationState
- modify MutationState to create a different operation than Put. Unfortunately 
checkAndPut is not batch-able, so you need to either create a new class that 
implements org.apache.hadoop.hbase.client.Row or you might be able to "borrow" 
the Append operation (as Phoenix doesn't support this operation). The former 
would be better (and [~lhofhansl] mentioned to me before that this would not be 
difficult).
- modify Indexer.preBatchMutate to look for instances of your new class. 
You'd want to collect these up and turn them into region.checkAndPut 
operations instead. You could test for the existence of our empty key value 
(the column family depends on the table and comes from 
SchemaUtil.getEmptyColumnFamily(ptable); the column qualifier is 
QueryConstants.EMPTY_COLUMN_BYTES). I'm not sure whether the regular Put 
coprocessor hooks fire when the check in a checkAndPut passes ([~jesse_yates] 
might know). If they do, that'd be good, because the index maintenance code 
would then kick in, which is what you want. If not, you'll need to get the row 
lock yourself, do the region.get() to see if the row exists, and then do the 
region.put() if it doesn't (see SequenceRegionObserver for an example).
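
The fallback in the last step (take the row lock, get, then put only if the 
row is absent) can be sketched with a toy in-memory model. This is only an 
illustration of the control flow, not HBase or Phoenix code: ToyRegion, 
putIfAbsent, get and the "_empty" marker column are hypothetical names standing 
in for the region, the new operation and the empty key value.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Toy in-memory stand-in for a region; none of these names are HBase or Phoenix APIs.
class ToyRegion {
    // Placeholder for the "empty key value" that marks an existing row (assumption).
    static final String EMPTY_COLUMN = "_empty";

    private final Map<String, Map<String, String>> rows = new ConcurrentHashMap<>();
    private final Map<String, ReentrantLock> rowLocks = new ConcurrentHashMap<>();

    // Writes the row only if no empty-column marker exists yet.
    // Returns true if the row was written, false if the duplicate was ignored.
    boolean putIfAbsent(String rowKey, Map<String, String> cells) {
        ReentrantLock lock = rowLocks.computeIfAbsent(rowKey, k -> new ReentrantLock());
        lock.lock(); // take the row lock so the check and the write are atomic per row
        try {
            Map<String, String> existing = rows.get(rowKey); // the "get" step
            if (existing != null && existing.containsKey(EMPTY_COLUMN)) {
                return false; // row already exists: ignore the duplicate key
            }
            Map<String, String> row = new HashMap<>(cells);
            row.put(EMPTY_COLUMN, ""); // marker cell written alongside the data
            rows.put(rowKey, row); // the "put" step
            return true;
        } finally {
            lock.unlock();
        }
    }

    // Returns the stored value for a column, or null if the row or column is absent.
    String get(String rowKey, String column) {
        Map<String, String> row = rows.get(rowKey);
        return row == null ? null : row.get(column);
    }
}
```

Calling putIfAbsent twice with the same row key leaves the first value in 
place; the second call returns false. In the real coprocessor the write would 
still have to go through the normal Put path so that index maintenance runs.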

> Support on duplicate key ignore construct
> -----------------------------------------
>
>                 Key: PHOENIX-6
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: James Taylor
>
> To support inserting a new row only if it doesn't already exist, we should 
> support the "on duplicate key ignore" construct (or its SQL standard 
> equivalent) for UPSERT.
> See this discussion for more detail: 
> https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)