[ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-6:
-------------------------------
    Description: 
To support inserting a new row only if it doesn't already exist, we should 
support the "on duplicate key" construct for UPSERT. With this construct, the 
UPSERT VALUES statement would run atomically and would thus require a read 
before write which would obviously have a negative impact on performance. For 
an example of similar syntax , see MySQL documentation at 
http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html

See this discussion for more detail: 
https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J. A 
related discussion is on PHOENIX-2909.

Initially we'd support the following:
# This would prevent the setting of VAL to 0 if the row already exists:
{code}
UPSERT INTO T (PK, VAL) VALUES ('a',0) 
ON DUPLICATE KEY IGNORE;
{code}
# This would increment the valueS of COUNTER1 and COUNTER2 if the row already 
exists and otherwise initialize them to 0:
{code}
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
{code}

So the general form is:
{code}
UPSERT ... VALUES ... [ ON DUPLICATE KEY [IGNORE | UPDATE 
<column>=<expression>, ...] ]
{code}
The following restrictions will apply:
- The <column> may not be part of the primary key constraint - only KeyValue 
columns will be allowed.
- If the table is immutable, the <column> may not appear in a secondary index. 
This is because the mutations for indexes on immutable tables are calculated on 
the client-side, while this new syntax would potentially modify the value on 
the server-side.

  was:
To support inserting a new row only if it doesn't already exist, we should 
support the "on duplicate key" construct for UPSERT. With this construct, the 
UPSERT VALUES statement would run atomically and would thus require a read 
before write which would obviously have a negative impact on performance. For 
an example of similar syntax , see MySQL documentation at 
http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html

See this discussion for more detail: 
https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J. A 
related discussion is on PHOENIX-2909.

Initially we'd support the following:
# This would prevent the setting of VAL to 0 if the row already exists:
{code}
UPSERT INTO T (PK, VAL) VALUES ('a',0) 
ON DUPLICATE KEY IGNORE;
{code}
# This would increment the valueS of COUNTER1 and COUNTER2 if the row already 
exists and otherwise initialize them to 0:
{code}
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
{code}



> Support ON DUPLICATE KEY construct
> ----------------------------------
>
>                 Key: PHOENIX-6
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: James Taylor
>            Assignee: James Taylor
>             Fix For: 4.9.0
>
>
> To support inserting a new row only if it doesn't already exist, we should 
> support the "on duplicate key" construct for UPSERT. With this construct, the 
> UPSERT VALUES statement would run atomically and would thus require a read 
> before write which would obviously have a negative impact on performance. For 
> an example of similar syntax , see MySQL documentation at 
> http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
> See this discussion for more detail: 
> https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J. 
> A related discussion is on PHOENIX-2909.
> Initially we'd support the following:
> # This would prevent the setting of VAL to 0 if the row already exists:
> {code}
> UPSERT INTO T (PK, VAL) VALUES ('a',0) 
> ON DUPLICATE KEY IGNORE;
> {code}
> # This would increment the valueS of COUNTER1 and COUNTER2 if the row already 
> exists and otherwise initialize them to 0:
> {code}
> UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
> ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
> {code}
> So the general form is:
> {code}
> UPSERT ... VALUES ... [ ON DUPLICATE KEY [IGNORE | UPDATE 
> <column>=<expression>, ...] ]
> {code}
> The following restrictions will apply:
> - The <column> may not be part of the primary key constraint - only KeyValue 
> columns will be allowed.
> - If the table is immutable, the <column> may not appear in a secondary 
> index. This is because the mutations for indexes on immutable tables are 
> calculated on the client-side, while this new syntax would potentially modify 
> the value on the server-side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to