[jira] [Commented] (PHOENIX-6) Support ON DUPLICATE KEY construct

James Taylor (JIRA) Tue, 25 Oct 2016 20:22:39 -0700

    [ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15607265#comment-15607265
 ]


James Taylor commented on PHOENIX-6:
------------------------------------

[[email protected]] - Can you give us some information on the expected usage 
pattern for this feature to help guide [~mujtabachohan]'s performance 
evaluation? In particular:
- How many rows will you be incrementing in one commit batch? Is it just a 
single-row commit? Or could it be hundreds of rows at a time?
- Will one row be updated multiple times in the same commit batch?
- Any idea of the velocity of updates? How many rows per second would you 
expect to be updated?
- How frequently will the same rows be updated? Is it highly skewed in that the 
same rows will be incremented over and over again? Or is it a pretty even 
distribution across the board?

> Support ON DUPLICATE KEY construct
> ----------------------------------
>
>                 Key: PHOENIX-6
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: James Taylor
>            Assignee: James Taylor
>             Fix For: 4.9.0
>
>         Attachments: PHOENIX-6.patch, PHOENIX-6_4.x-HBase-0.98.patch, 
> PHOENIX-6_wip1.patch, PHOENIX-6_wip2.patch, PHOENIX-6_wip3.patch, 
> PHOENIX-6_wip4.patch
>
>
> To support inserting a new row only if it doesn't already exist, we should 
> support the "on duplicate key" construct for UPSERT. With this construct, the 
> UPSERT VALUES statement would run atomically and would thus require a read 
> before write, which would have a negative impact on performance. For 
> an example of similar syntax, see the MySQL documentation at 
> http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
> See this discussion for more detail: 
> https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J. 
> A related discussion is on PHOENIX-2909.
> Initially we'd support the following:
> # This would prevent the setting of VAL to 0 if the row already exists:
> {code}
> UPSERT INTO T (PK, VAL) VALUES ('a',0) 
> ON DUPLICATE KEY IGNORE;
> {code}
> # This would increment the values of COUNTER1 and COUNTER2 if the row already 
> exists and otherwise initialize them to 0:
> {code}
> UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
> ON DUPLICATE KEY UPDATE COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
> {code}
> So the general form is:
> {code}
> UPSERT ... VALUES ... [ ON DUPLICATE KEY [IGNORE | UPDATE 
> <column>=<expression>, ...] ]
> {code}
> The following restrictions will apply:
> * The <column> may not be part of the primary key constraint - only KeyValue 
> columns will be allowed.
> * This new clause cannot be used with:
> ** Immutable tables since the whole point is to atomically update a row in 
> place which isn't allowed for immutable tables. 
> ** Transactional tables because these use optimistic concurrency as their 
> mechanism for consistency and isolation.
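
For readers who want to check their understanding of the proposed semantics, here is a minimal sketch in Python. It is not Phoenix code and says nothing about how the patch is implemented: ToyTable, IGNORE, and the on_duplicate parameter are hypothetical names made up for illustration, and a single in-process lock stands in for whatever makes the read-before-write atomic on the server.

```python
import threading
from typing import Callable, Dict, Optional, Union

Row = Dict[str, int]
# Hypothetical sentinel standing in for "ON DUPLICATE KEY IGNORE".
IGNORE = object()
# Hypothetical stand-in for "ON DUPLICATE KEY UPDATE col = expr, ...":
# each expression is evaluated against the row as it was before the update.
Updates = Dict[str, Callable[[Row], int]]


class ToyTable:
    """In-memory toy table keyed by primary key (assumption: not Phoenix)."""

    def __init__(self) -> None:
        self._rows: Dict[str, Row] = {}
        self._lock = threading.Lock()  # makes the read-then-write atomic

    def upsert(
        self,
        pk: str,
        values: Row,
        on_duplicate: Optional[Union[object, Updates]] = None,
    ) -> Row:
        with self._lock:
            existing = self._rows.get(pk)  # the read before the write
            if existing is None or on_duplicate is None:
                # New row, or a plain UPSERT: write the supplied values.
                row = dict(existing or {})
                row.update(values)
                self._rows[pk] = row
            elif on_duplicate is IGNORE:
                pass  # row already exists: leave it untouched
            else:
                # Evaluate every expression against the old row first,
                # then apply the results. Only non-key columns can be
                # targeted, since the key is not stored in the row dict.
                new_vals = {c: f(existing) for c, f in on_duplicate.items()}
                existing.update(new_vals)
            return dict(self._rows[pk])


if __name__ == "__main__":
    t = ToyTable()
    inc: Updates = {
        "COUNTER1": lambda r: r["COUNTER1"] + 1,
        "COUNTER2": lambda r: r["COUNTER2"] + 1,
    }
    print(t.upsert("a", {"COUNTER1": 0, "COUNTER2": 0}, inc))  # initializes
    print(t.upsert("a", {"COUNTER1": 0, "COUNTER2": 0}, inc))  # increments
```

The first call finds no row and initializes both counters to 0; the second finds the row and increments them, mirroring the two cases in the description above.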



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
