[jira] [Commented] (PHOENIX-6) Support ON DUPLICATE KEY construct

2016-10-14 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576237#comment-15576237
 ] 

Cameron Hatfield commented on PHOENIX-6:


And in general, I agree with the heavyweightness of the merge statement. I 
could also see ON DUPLICATE KEY having slightly different semantics / 
optimization opportunities, ala Postgreses discussion around the feature, where 
having two statements makes sense, even if the more general statement can 
handle the same problem.

> Support ON DUPLICATE KEY construct
> --
>
> Key: PHOENIX-6
> URL: https://issues.apache.org/jira/browse/PHOENIX-6
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
> Attachments: PHOENIX-6_wip1.patch
>
>
> To support inserting a new row only if it doesn't already exist, we should 
> support the "on duplicate key" construct for UPSERT. With this construct, the 
> UPSERT VALUES statement would run atomically and would thus require a read 
> before write which would obviously have a negative impact on performance. For 
> an example of similar syntax , see MySQL documentation at 
> http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
> See this discussion for more detail: 
> https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J. 
> A related discussion is on PHOENIX-2909.
> Initially we'd support the following:
> # This would prevent the setting of VAL to 0 if the row already exists:
> {code}
> UPSERT INTO T (PK, VAL) VALUES ('a',0) 
> ON DUPLICATE KEY IGNORE;
> {code}
> # This would increment the valueS of COUNTER1 and COUNTER2 if the row already 
> exists and otherwise initialize them to 0:
> {code}
> UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
> ON DUPLICATE KEY UPDATE COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
> {code}
> So the general form is:
> {code}
> UPSERT ... VALUES ... [ ON DUPLICATE KEY [IGNORE | UPDATE 
> =, ...] ]
> {code}
> The following restrictions will apply:
> - The  may not be part of the primary key constraint - only KeyValue 
> columns will be allowed.
> To handle the maintenance of immutable indexes, we'll need to push the 
> maintenance to the server side.
> This is because the mutations for indexes on immutable tables are calculated 
> on the client-side, while this new syntax would potentially modify the value 
> on the server-side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (PHOENIX-6) Support ON DUPLICATE KEY construct

2016-10-14 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576149#comment-15576149
 ] 

Cameron Hatfield edited comment on PHOENIX-6 at 10/14/16 7:15 PM:
--

It would work for me, though I'm not sure if it would be consistent the other 
statements that exist within phoenix, as far as read isolation goes. 
Specifically, with the following ordering: 
* UPSERT DUPLICATE KEY
* UPSERT FROM SELECT
* UPSERT DUPLICATE KEY
* UPSERT FROM SELECT

Does the first FROM SELECT see the results of the first DUPLICATE KEY? 
Does the second DUPLICATE KEY see the results from the first FROM SELECT? 
Does the second FROM SELECT see the results from the first FROM SELECT?

Would definitely be counter intuitive if only some of the results from 
statements in the same commit batch could occur in different orders. It might 
be better, at least long term, just to support per-statement auto-commit 
batching at the client level (since commit basically handles non-autocommit 
batching), though the difficulty would be higher.


was (Author: cameron.hatfield):
It would work for me, though I'm not sure if it would be consistent the other 
statements that exist within phoenix, as far as read isolation goes. 
Specifically, with the following ordering: 
* UPSERT DUPLICATE KEY
* UPSERT FROM SELECT
* UPSERT DUPLICATE KEY
* UPSERT FROM SELECT

Does the first FROM SELECT see the results of the first DUPLICATE KEY? 
Does the second DUPLICATE KEY see the results from the first FROM SELECT? 
Does the second FROM SELECT see the results from the first FROM SELECT?

Would definitely be counter intuitive if only some statements in the same 
commit batch could occur in different orders. It might be better, at least long 
term, just to support per-statement auto-commit batching at the client level 
(since commit basically handles non-autocommit batching), though the difficulty 
would be higher.

> Support ON DUPLICATE KEY construct
> --
>
> Key: PHOENIX-6
> URL: https://issues.apache.org/jira/browse/PHOENIX-6
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
> Attachments: PHOENIX-6_wip1.patch
>
>
> To support inserting a new row only if it doesn't already exist, we should 
> support the "on duplicate key" construct for UPSERT. With this construct, the 
> UPSERT VALUES statement would run atomically and would thus require a read 
> before write which would obviously have a negative impact on performance. For 
> an example of similar syntax , see MySQL documentation at 
> http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
> See this discussion for more detail: 
> https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J. 
> A related discussion is on PHOENIX-2909.
> Initially we'd support the following:
> # This would prevent the setting of VAL to 0 if the row already exists:
> {code}
> UPSERT INTO T (PK, VAL) VALUES ('a',0) 
> ON DUPLICATE KEY IGNORE;
> {code}
> # This would increment the valueS of COUNTER1 and COUNTER2 if the row already 
> exists and otherwise initialize them to 0:
> {code}
> UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
> ON DUPLICATE KEY UPDATE COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
> {code}
> So the general form is:
> {code}
> UPSERT ... VALUES ... [ ON DUPLICATE KEY [IGNORE | UPDATE 
> =, ...] ]
> {code}
> The following restrictions will apply:
> - The  may not be part of the primary key constraint - only KeyValue 
> columns will be allowed.
> To handle the maintenance of immutable indexes, we'll need to push the 
> maintenance to the server side.
> This is because the mutations for indexes on immutable tables are calculated 
> on the client-side, while this new syntax would potentially modify the value 
> on the server-side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (PHOENIX-6) Support ON DUPLICATE KEY construct

2016-10-14 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576149#comment-15576149
 ] 

Cameron Hatfield edited comment on PHOENIX-6 at 10/14/16 6:44 PM:
--

It would work for me, though I'm not sure if it would be consistent the other 
statements that exist within phoenix, as far as read isolation goes. 
Specifically, with the following ordering: 
* UPSERT DUPLICATE KEY
* UPSERT FROM SELECT
* UPSERT DUPLICATE KEY
* UPSERT FROM SELECT

Does the first FROM SELECT see the results of the first DUPLICATE KEY? 
Does the second DUPLICATE KEY see the results from the first FROM SELECT? 
Does the second FROM SELECT see the results from the first FROM SELECT?

Would definitely be counter intuitive if only some statements in the same 
commit batch could occur in different orders. It might be better, at least long 
term, just to support per-statement auto-commit batching at the client level 
(since commit basically handles non-autocommit batching), though the difficulty 
would be higher.


was (Author: cameron.hatfield):
It would work for me, though I'm not sure if it would be consistent the other 
statements that exist within phoenix, as far as read isolation goes. 
Specifically, with the following ordering: 
* UPSERT DUPLICATE KEY
* UPSERT FROM SELECT
* UPSERT DUPLICATE KEY
* UPSERT FROM SELECT

Does the first FROM SELECT see the results of the first DUPLICATE KEY? Does the 
second DUPLICATE KEY see the results from the first FROM SELECT? Does the 
second FROM SELECT see the results from the first FROM SELECT?

Would definitely be counter intuitive if only some statements in the same 
commit batch could occur in different orders. It might be better, at least long 
term, just to support per-statement auto-commit batching at the client level 
(since commit basically handles non-autocommit batching), though the difficulty 
would be higher.

> Support ON DUPLICATE KEY construct
> --
>
> Key: PHOENIX-6
> URL: https://issues.apache.org/jira/browse/PHOENIX-6
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
> Attachments: PHOENIX-6_wip1.patch
>
>
> To support inserting a new row only if it doesn't already exist, we should 
> support the "on duplicate key" construct for UPSERT. With this construct, the 
> UPSERT VALUES statement would run atomically and would thus require a read 
> before write which would obviously have a negative impact on performance. For 
> an example of similar syntax , see MySQL documentation at 
> http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
> See this discussion for more detail: 
> https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J. 
> A related discussion is on PHOENIX-2909.
> Initially we'd support the following:
> # This would prevent the setting of VAL to 0 if the row already exists:
> {code}
> UPSERT INTO T (PK, VAL) VALUES ('a',0) 
> ON DUPLICATE KEY IGNORE;
> {code}
> # This would increment the valueS of COUNTER1 and COUNTER2 if the row already 
> exists and otherwise initialize them to 0:
> {code}
> UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
> ON DUPLICATE KEY UPDATE COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
> {code}
> So the general form is:
> {code}
> UPSERT ... VALUES ... [ ON DUPLICATE KEY [IGNORE | UPDATE 
> =, ...] ]
> {code}
> The following restrictions will apply:
> - The  may not be part of the primary key constraint - only KeyValue 
> columns will be allowed.
> To handle the maintenance of immutable indexes, we'll need to push the 
> maintenance to the server side.
> This is because the mutations for indexes on immutable tables are calculated 
> on the client-side, while this new syntax would potentially modify the value 
> on the server-side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (PHOENIX-6) Support ON DUPLICATE KEY construct

2016-10-14 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576149#comment-15576149
 ] 

Cameron Hatfield edited comment on PHOENIX-6 at 10/14/16 6:43 PM:
--

It would work for me, though I'm not sure if it would be consistent the other 
statements that exist within phoenix, as far as read isolation goes. 
Specifically, with the following ordering: 
*UPSERT DUPLICATE KEY
*UPSERT FROM SELECT
*UPSERT DUPLICATE KEY
*UPSERT FROM SELECT
Does the first FROM SELECT see the results of the first DUPLICATE KEY? Does the 
second DUPLICATE KEY see the results from the first FROM SELECT? Does the 
second FROM SELECT see the results from the first FROM SELECT?

Would definitely be counter intuitive if only some statements in the same 
commit batch could occur in different orders. It might be better, at least long 
term, just to support per-statement auto-commit batching at the client level 
(since commit basically handles non-autocommit batching), though the difficulty 
would be higher.


was (Author: cameron.hatfield):
It would work for me, though I'm not sure if it would be consistent the 
statements that exist within phoenix, as far as read isolation goes. 
Specifically, with the following ordering: 
*UPSERT DUPLICATE KEY
*UPSERT FROM SELECT
*UPSERT DUPLICATE KEY
*UPSERT FROM SELECT
Does the first FROM SELECT see the results of the first DUPLICATE KEY? Does the 
second DUPLICATE KEY see the results from the first FROM SELECT? Does the 
second FROM SELECT see the results from the first FROM SELECT?

Would definitely be counter intuitive if only some statements in the same 
commit batch could occur in different orders. It might be better, at least long 
term, just to support per-statement auto-commit batching at the client level 
(since commit basically handles non-autocommit batching), though the difficulty 
would be higher.

> Support ON DUPLICATE KEY construct
> --
>
> Key: PHOENIX-6
> URL: https://issues.apache.org/jira/browse/PHOENIX-6
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
> Attachments: PHOENIX-6_wip1.patch
>
>
> To support inserting a new row only if it doesn't already exist, we should 
> support the "on duplicate key" construct for UPSERT. With this construct, the 
> UPSERT VALUES statement would run atomically and would thus require a read 
> before write which would obviously have a negative impact on performance. For 
> an example of similar syntax , see MySQL documentation at 
> http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
> See this discussion for more detail: 
> https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J. 
> A related discussion is on PHOENIX-2909.
> Initially we'd support the following:
> # This would prevent the setting of VAL to 0 if the row already exists:
> {code}
> UPSERT INTO T (PK, VAL) VALUES ('a',0) 
> ON DUPLICATE KEY IGNORE;
> {code}
> # This would increment the valueS of COUNTER1 and COUNTER2 if the row already 
> exists and otherwise initialize them to 0:
> {code}
> UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
> ON DUPLICATE KEY UPDATE COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
> {code}
> So the general form is:
> {code}
> UPSERT ... VALUES ... [ ON DUPLICATE KEY [IGNORE | UPDATE 
> =, ...] ]
> {code}
> The following restrictions will apply:
> - The  may not be part of the primary key constraint - only KeyValue 
> columns will be allowed.
> To handle the maintenance of immutable indexes, we'll need to push the 
> maintenance to the server side.
> This is because the mutations for indexes on immutable tables are calculated 
> on the client-side, while this new syntax would potentially modify the value 
> on the server-side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (PHOENIX-6) Support ON DUPLICATE KEY construct

2016-10-14 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576149#comment-15576149
 ] 

Cameron Hatfield edited comment on PHOENIX-6 at 10/14/16 6:43 PM:
--

It would work for me, though I'm not sure if it would be consistent the other 
statements that exist within phoenix, as far as read isolation goes. 
Specifically, with the following ordering: 
* UPSERT DUPLICATE KEY
* UPSERT FROM SELECT
* UPSERT DUPLICATE KEY
* UPSERT FROM SELECT
Does the first FROM SELECT see the results of the first DUPLICATE KEY? Does the 
second DUPLICATE KEY see the results from the first FROM SELECT? Does the 
second FROM SELECT see the results from the first FROM SELECT?

Would definitely be counter intuitive if only some statements in the same 
commit batch could occur in different orders. It might be better, at least long 
term, just to support per-statement auto-commit batching at the client level 
(since commit basically handles non-autocommit batching), though the difficulty 
would be higher.


was (Author: cameron.hatfield):
It would work for me, though I'm not sure if it would be consistent the other 
statements that exist within phoenix, as far as read isolation goes. 
Specifically, with the following ordering: 
*UPSERT DUPLICATE KEY
*UPSERT FROM SELECT
*UPSERT DUPLICATE KEY
*UPSERT FROM SELECT
Does the first FROM SELECT see the results of the first DUPLICATE KEY? Does the 
second DUPLICATE KEY see the results from the first FROM SELECT? Does the 
second FROM SELECT see the results from the first FROM SELECT?

Would definitely be counter intuitive if only some statements in the same 
commit batch could occur in different orders. It might be better, at least long 
term, just to support per-statement auto-commit batching at the client level 
(since commit basically handles non-autocommit batching), though the difficulty 
would be higher.

> Support ON DUPLICATE KEY construct
> --
>
> Key: PHOENIX-6
> URL: https://issues.apache.org/jira/browse/PHOENIX-6
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
> Attachments: PHOENIX-6_wip1.patch
>
>
> To support inserting a new row only if it doesn't already exist, we should 
> support the "on duplicate key" construct for UPSERT. With this construct, the 
> UPSERT VALUES statement would run atomically and would thus require a read 
> before write which would obviously have a negative impact on performance. For 
> an example of similar syntax , see MySQL documentation at 
> http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
> See this discussion for more detail: 
> https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J. 
> A related discussion is on PHOENIX-2909.
> Initially we'd support the following:
> # This would prevent the setting of VAL to 0 if the row already exists:
> {code}
> UPSERT INTO T (PK, VAL) VALUES ('a',0) 
> ON DUPLICATE KEY IGNORE;
> {code}
> # This would increment the valueS of COUNTER1 and COUNTER2 if the row already 
> exists and otherwise initialize them to 0:
> {code}
> UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
> ON DUPLICATE KEY UPDATE COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
> {code}
> So the general form is:
> {code}
> UPSERT ... VALUES ... [ ON DUPLICATE KEY [IGNORE | UPDATE 
> =, ...] ]
> {code}
> The following restrictions will apply:
> - The  may not be part of the primary key constraint - only KeyValue 
> columns will be allowed.
> To handle the maintenance of immutable indexes, we'll need to push the 
> maintenance to the server side.
> This is because the mutations for indexes on immutable tables are calculated 
> on the client-side, while this new syntax would potentially modify the value 
> on the server-side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (PHOENIX-6) Support ON DUPLICATE KEY construct

2016-10-14 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576149#comment-15576149
 ] 

Cameron Hatfield edited comment on PHOENIX-6 at 10/14/16 6:44 PM:
--

It would work for me, though I'm not sure if it would be consistent the other 
statements that exist within phoenix, as far as read isolation goes. 
Specifically, with the following ordering: 
* UPSERT DUPLICATE KEY
* UPSERT FROM SELECT
* UPSERT DUPLICATE KEY
* UPSERT FROM SELECT

Does the first FROM SELECT see the results of the first DUPLICATE KEY? Does the 
second DUPLICATE KEY see the results from the first FROM SELECT? Does the 
second FROM SELECT see the results from the first FROM SELECT?

Would definitely be counter intuitive if only some statements in the same 
commit batch could occur in different orders. It might be better, at least long 
term, just to support per-statement auto-commit batching at the client level 
(since commit basically handles non-autocommit batching), though the difficulty 
would be higher.


was (Author: cameron.hatfield):
It would work for me, though I'm not sure if it would be consistent the other 
statements that exist within phoenix, as far as read isolation goes. 
Specifically, with the following ordering: 
* UPSERT DUPLICATE KEY
* UPSERT FROM SELECT
* UPSERT DUPLICATE KEY
* UPSERT FROM SELECT
Does the first FROM SELECT see the results of the first DUPLICATE KEY? Does the 
second DUPLICATE KEY see the results from the first FROM SELECT? Does the 
second FROM SELECT see the results from the first FROM SELECT?

Would definitely be counter intuitive if only some statements in the same 
commit batch could occur in different orders. It might be better, at least long 
term, just to support per-statement auto-commit batching at the client level 
(since commit basically handles non-autocommit batching), though the difficulty 
would be higher.

> Support ON DUPLICATE KEY construct
> --
>
> Key: PHOENIX-6
> URL: https://issues.apache.org/jira/browse/PHOENIX-6
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
> Attachments: PHOENIX-6_wip1.patch
>
>
> To support inserting a new row only if it doesn't already exist, we should 
> support the "on duplicate key" construct for UPSERT. With this construct, the 
> UPSERT VALUES statement would run atomically and would thus require a read 
> before write which would obviously have a negative impact on performance. For 
> an example of similar syntax , see MySQL documentation at 
> http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
> See this discussion for more detail: 
> https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J. 
> A related discussion is on PHOENIX-2909.
> Initially we'd support the following:
> # This would prevent the setting of VAL to 0 if the row already exists:
> {code}
> UPSERT INTO T (PK, VAL) VALUES ('a',0) 
> ON DUPLICATE KEY IGNORE;
> {code}
> # This would increment the valueS of COUNTER1 and COUNTER2 if the row already 
> exists and otherwise initialize them to 0:
> {code}
> UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
> ON DUPLICATE KEY UPDATE COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
> {code}
> So the general form is:
> {code}
> UPSERT ... VALUES ... [ ON DUPLICATE KEY [IGNORE | UPDATE 
> =, ...] ]
> {code}
> The following restrictions will apply:
> - The  may not be part of the primary key constraint - only KeyValue 
> columns will be allowed.
> To handle the maintenance of immutable indexes, we'll need to push the 
> maintenance to the server side.
> This is because the mutations for indexes on immutable tables are calculated 
> on the client-side, while this new syntax would potentially modify the value 
> on the server-side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-6) Support ON DUPLICATE KEY construct

2016-10-14 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576149#comment-15576149
 ] 

Cameron Hatfield commented on PHOENIX-6:


It would work for me, though I'm not sure if it would be consistent the 
statements that exist within phoenix, as far as read isolation goes. 
Specifically, with the following ordering: 
*UPSERT DUPLICATE KEY
*UPSERT FROM SELECT
*UPSERT DUPLICATE KEY
*UPSERT FROM SELECT
Does the first FROM SELECT see the results of the first DUPLICATE KEY? Does the 
second DUPLICATE KEY see the results from the first FROM SELECT? Does the 
second FROM SELECT see the results from the first FROM SELECT?

Would definitely be counter intuitive if only some statements in the same 
commit batch could occur in different orders. It might be better, at least long 
term, just to support per-statement auto-commit batching at the client level 
(since commit basically handles non-autocommit batching), though the difficulty 
would be higher.

> Support ON DUPLICATE KEY construct
> --
>
> Key: PHOENIX-6
> URL: https://issues.apache.org/jira/browse/PHOENIX-6
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
> Attachments: PHOENIX-6_wip1.patch
>
>
> To support inserting a new row only if it doesn't already exist, we should 
> support the "on duplicate key" construct for UPSERT. With this construct, the 
> UPSERT VALUES statement would run atomically and would thus require a read 
> before write which would obviously have a negative impact on performance. For 
> an example of similar syntax , see MySQL documentation at 
> http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
> See this discussion for more detail: 
> https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J. 
> A related discussion is on PHOENIX-2909.
> Initially we'd support the following:
> # This would prevent the setting of VAL to 0 if the row already exists:
> {code}
> UPSERT INTO T (PK, VAL) VALUES ('a',0) 
> ON DUPLICATE KEY IGNORE;
> {code}
> # This would increment the valueS of COUNTER1 and COUNTER2 if the row already 
> exists and otherwise initialize them to 0:
> {code}
> UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
> ON DUPLICATE KEY UPDATE COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
> {code}
> So the general form is:
> {code}
> UPSERT ... VALUES ... [ ON DUPLICATE KEY [IGNORE | UPDATE 
> =, ...] ]
> {code}
> The following restrictions will apply:
> - The  may not be part of the primary key constraint - only KeyValue 
> columns will be allowed.
> To handle the maintenance of immutable indexes, we'll need to push the 
> maintenance to the server side.
> This is because the mutations for indexes on immutable tables are calculated 
> on the client-side, while this new syntax would potentially modify the value 
> on the server-side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-6) Support ON DUPLICATE KEY construct

2016-10-14 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576121#comment-15576121
 ] 

Cameron Hatfield commented on PHOENIX-6:


Here is what MSSQL does to handle not having to repeat the actual values 
(though the clause is still repeated), as well as how it supports multiple 
values for the same merge:
{code:sql}
MERGE INTO Sales.SalesReason AS Target  
  USING (VALUES ('Recommendation','Other'), ('Review', 'Marketing'), 
('Internet', 'Promotion'))  
   AS Source (NewName, NewReasonType)  
  ON Target.Name = Source.NewName  
WHEN MATCHED THEN  
  UPDATE SET ReasonType = Source.NewReasonType  
WHEN NOT MATCHED BY TARGET THEN  
  INSERT (Name, ReasonType) VALUES (NewName, NewReasonType)
{code}

> Support ON DUPLICATE KEY construct
> --
>
> Key: PHOENIX-6
> URL: https://issues.apache.org/jira/browse/PHOENIX-6
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
> Attachments: PHOENIX-6_wip1.patch
>
>
> To support inserting a new row only if it doesn't already exist, we should 
> support the "on duplicate key" construct for UPSERT. With this construct, the 
> UPSERT VALUES statement would run atomically and would thus require a read 
> before write which would obviously have a negative impact on performance. For 
> an example of similar syntax , see MySQL documentation at 
> http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
> See this discussion for more detail: 
> https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J. 
> A related discussion is on PHOENIX-2909.
> Initially we'd support the following:
> # This would prevent the setting of VAL to 0 if the row already exists:
> {code}
> UPSERT INTO T (PK, VAL) VALUES ('a',0) 
> ON DUPLICATE KEY IGNORE;
> {code}
> # This would increment the valueS of COUNTER1 and COUNTER2 if the row already 
> exists and otherwise initialize them to 0:
> {code}
> UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
> ON DUPLICATE KEY UPDATE COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
> {code}
> So the general form is:
> {code}
> UPSERT ... VALUES ... [ ON DUPLICATE KEY [IGNORE | UPDATE 
> =, ...] ]
> {code}
> The following restrictions will apply:
> - The  may not be part of the primary key constraint - only KeyValue 
> columns will be allowed.
> To handle the maintenance of immutable indexes, we'll need to push the 
> maintenance to the server side.
> This is because the mutations for indexes on immutable tables are calculated 
> on the client-side, while this new syntax would potentially modify the value 
> on the server-side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-6) Support ON DUPLICATE KEY construct

2016-10-14 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576102#comment-15576102
 ] 

Cameron Hatfield commented on PHOENIX-6:


In general, the main use case I have is reducing RPC calls across 
"compare-and-set" requests. A tertiary use case would be reducing the amount of 
work the parser/compiler has to do (issuing and parsing one statement is 
generally easier/faster then issuing and parsing multiple of the same 
statements). The first could be handled by either supporting batching at the 
client level (so one RPC call for multiple statements being run), though there 
are questions on what autocommit means in that respect, or by supporting a 
single statement that supports multiple values (such as a values statement that 
supports multiple rows). 

The allowing multiple statements for the same row key would basically be 
equivalent to an "autocommit" after each individual statement, so that would 
work for me.

> Support ON DUPLICATE KEY construct
> --
>
> Key: PHOENIX-6
> URL: https://issues.apache.org/jira/browse/PHOENIX-6
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
> Attachments: PHOENIX-6_wip1.patch
>
>
> To support inserting a new row only if it doesn't already exist, we should 
> support the "on duplicate key" construct for UPSERT. With this construct, the 
> UPSERT VALUES statement would run atomically and would thus require a read 
> before write which would obviously have a negative impact on performance. For 
> an example of similar syntax , see MySQL documentation at 
> http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
> See this discussion for more detail: 
> https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J. 
> A related discussion is on PHOENIX-2909.
> Initially we'd support the following:
> # This would prevent the setting of VAL to 0 if the row already exists:
> {code}
> UPSERT INTO T (PK, VAL) VALUES ('a',0) 
> ON DUPLICATE KEY IGNORE;
> {code}
> # This would increment the valueS of COUNTER1 and COUNTER2 if the row already 
> exists and otherwise initialize them to 0:
> {code}
> UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
> ON DUPLICATE KEY UPDATE COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
> {code}
> So the general form is:
> {code}
> UPSERT ... VALUES ... [ ON DUPLICATE KEY [IGNORE | UPDATE 
> =, ...] ]
> {code}
> The following restrictions will apply:
> - The  may not be part of the primary key constraint - only KeyValue 
> columns will be allowed.
> To handle the maintenance of immutable indexes, we'll need to push the 
> maintenance to the server side.
> This is because the mutations for indexes on immutable tables are calculated 
> on the client-side, while this new syntax would potentially modify the value 
> on the server-side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (PHOENIX-6) Support ON DUPLICATE KEY construct

2016-10-04 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15546874#comment-15546874
 ] 

Cameron Hatfield edited comment on PHOENIX-6 at 10/4/16 10:56 PM:
--

For the help with my use case question, not really, due to the round-trip per 
row insert (same reasoning as the other "batch checkandput" bugs within phoenix 
and HBase).

For the question about commit, and this is just me trying to understand it, if 
you run an UPSERT VALUES and an UPSERT SELECT in the same commit, in that 
order, the UPSERT SELECT should not see the results of UPSERT VALUES (or a 
delete, or select), according to that wording. Is that correct? For example, 
what happens in the following case (when using jdbc driver commands for 
autocommit / commit):
{code:sql}
AUTOCOMMIT OFF;
UPSERT INTO T (PK, V) VALUES (0,0);

-- Run batch size + 1 times
UPSERT INTO entity.stuff (PK, V) SELECT PK + 1 as PK, V + 1 FROM entity.stuff;

COMMIT;
{code}

In theory, according to SQL + the docs, I would expect to see only one new row 
added to the table. But if more then one batch occurs, and the same PK is in 
both sides of the batch, an extra two rows should exist. The most likely time 
for this to show up, of course, is multiple upsert selects within the same 
commit on a large enough table. In playing with this in squirrel sql, I don't 
really have enough knowledge to what would actually trigger multiple batches to 
happen, so I have no repro for this :).

Anyways, this is only mildly related to this bug, as it this same problem would 
be more pronounced with an on duplicate key command, as you would probably 
actually be using it for things like increments / etc. if the problem exists at 
all.


was (Author: cameron.hatfield):
For the help with my use case question, not really, due to the round-trip per 
row insert (same reasoning as the other "batch checkandput" bugs within phoenix 
and HBase).

For the question about commit, and this is just me trying to understand it, if 
you run an UPSERT VALUES and an UPSERT SELECT in the same commit, in that 
order, the UPSERT SELECT should not see the results of UPSERT VALUES (or a 
delete, or select), according to that wording. Is that correct? For example, 
what happens in the following case (when using jdbc driver commands for 
autocommit / commit):
{code:sql}
AUTOCOMMIT OFF;
UPSERT INTO T (PK, V) VALUES (0,0);

-- Run batch size + 1 times
UPSERT INTO entity.stuff (PK, V) SELECT PK + 1 as PK, V + 1 FROM entity.stuff;

COMMIT;
{code}

In theory, according to SQL + the docs, I would expect to see only one new row 
added to the table. But if more then one batch occurs, and the same PK is in 
both sides of the batch, an extra two rows should exist. The most likely time 
for this to show up, of course, is multiple upsert selects within the same 
commit on a large enough table.

Anyways, this is only mildly related to this bug, as it this same problem would 
be more pronounced with an on duplicate key command, as you would probably 
actually be using it for things like increments / etc.

> Support ON DUPLICATE KEY construct
> --
>
> Key: PHOENIX-6
> URL: https://issues.apache.org/jira/browse/PHOENIX-6
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
>
> To support inserting a new row only if it doesn't already exist, we should 
> support the "on duplicate key" construct for UPSERT. With this construct, the 
> UPSERT VALUES statement would run atomically and would thus require a read 
> before write which would obviously have a negative impact on performance. For 
> an example of similar syntax , see MySQL documentation at 
> http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
> See this discussion for more detail: 
> https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J. 
> A related discussion is on PHOENIX-2909.
> Initially we'd support the following:
> # This would prevent the setting of VAL to 0 if the row already exists:
> {code}
> UPSERT INTO T (PK, VAL) VALUES ('a',0) 
> ON DUPLICATE KEY IGNORE;
> {code}
> # This would increment the valueS of COUNTER1 and COUNTER2 if the row already 
> exists and otherwise initialize them to 0:
> {code}
> UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
> ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
> {code}
> So the general form is:
> {code}
> UPSERT ... VALUES ... [ ON DUPLICATE KEY [IGNORE | UPDATE 
> =, ...] ]
> {code}
> The following restrictions will apply:
> - The  may not be part of the primary key constraint - only KeyValue 
> columns will be allowed.
> - If the table is immutable, the  may not appear in a secondary 
> index. This is because the mutations for indexes on immutable tables 

[jira] [Commented] (PHOENIX-6) Support ON DUPLICATE KEY construct

2016-10-04 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15546874#comment-15546874
 ] 

Cameron Hatfield commented on PHOENIX-6:


For the help with my use case question, not really, due to the round-trip per 
row insert (same reasoning as the other "batch checkandput" bugs within phoenix 
and HBase).

For the question about commit, and this is just me trying to understand it, if 
you run an UPSERT VALUES and an UPSERT SELECT in the same commit, in that 
order, the UPSERT SELECT should not see the results of UPSERT VALUES (or a 
delete, or select), according to that wording. Is that correct? For example, 
what happens in the following case (when using jdbc driver commands for 
autocommit / commit):
{code:sql}
AUTOCOMMIT OFF;
UPSERT INTO T (PK, V) VALUES (0,0);

-- Run batch size + 1 times
UPSERT INTO entity.stuff (PK, V) SELECT PK + 1 as PK, V + 1 FROM entity.stuff;

COMMIT;
{code}

In theory, according to SQL + the docs, I would expect to see only one new row 
added to the table. But if more then one batch occurs, an extra two rows should 
exist. The most likely time for this to show up, of course, is multiple upsert 
selects within the same commit on a large enough table.

Anyways, this is only mildly related to this bug, as it this same problem would 
be more pronounced with an on duplicate key command, as you would probably 
actually be using it for things like increments / etc.

> Support ON DUPLICATE KEY construct
> --
>
> Key: PHOENIX-6
> URL: https://issues.apache.org/jira/browse/PHOENIX-6
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
>
> To support inserting a new row only if it doesn't already exist, we should 
> support the "on duplicate key" construct for UPSERT. With this construct, the 
> UPSERT VALUES statement would run atomically and would thus require a read 
> before write which would obviously have a negative impact on performance. For 
> an example of similar syntax , see MySQL documentation at 
> http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
> See this discussion for more detail: 
> https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J. 
> A related discussion is on PHOENIX-2909.
> Initially we'd support the following:
> # This would prevent the setting of VAL to 0 if the row already exists:
> {code}
> UPSERT INTO T (PK, VAL) VALUES ('a',0) 
> ON DUPLICATE KEY IGNORE;
> {code}
> # This would increment the valueS of COUNTER1 and COUNTER2 if the row already 
> exists and otherwise initialize them to 0:
> {code}
> UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
> ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
> {code}
> So the general form is:
> {code}
> UPSERT ... VALUES ... [ ON DUPLICATE KEY [IGNORE | UPDATE 
> =, ...] ]
> {code}
> The following restrictions will apply:
> - The  may not be part of the primary key constraint - only KeyValue 
> columns will be allowed.
> - If the table is immutable, the  may not appear in a secondary 
> index. This is because the mutations for indexes on immutable tables are 
> calculated on the client-side, while this new syntax would potentially modify 
> the value on the server-side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (PHOENIX-6) Support ON DUPLICATE KEY construct

2016-10-04 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15546874#comment-15546874
 ] 

Cameron Hatfield edited comment on PHOENIX-6 at 10/4/16 10:34 PM:
--

For the help with my use case question, not really, due to the round-trip per 
row insert (same reasoning as the other "batch checkandput" bugs within phoenix 
and HBase).

For the question about commit, and this is just me trying to understand it, if 
you run an UPSERT VALUES and an UPSERT SELECT in the same commit, in that 
order, the UPSERT SELECT should not see the results of UPSERT VALUES (or a 
delete, or select), according to that wording. Is that correct? For example, 
what happens in the following case (when using jdbc driver commands for 
autocommit / commit):
{code:sql}
AUTOCOMMIT OFF;
UPSERT INTO T (PK, V) VALUES (0,0);

-- Run batch size + 1 times
UPSERT INTO entity.stuff (PK, V) SELECT PK + 1 as PK, V + 1 FROM entity.stuff;

COMMIT;
{code}

In theory, according to SQL + the docs, I would expect to see only one new row 
added to the table. But if more then one batch occurs, and the same PK is in 
both sides of the batch, an extra two rows should exist. The most likely time 
for this to show up, of course, is multiple upsert selects within the same 
commit on a large enough table.

Anyways, this is only mildly related to this bug, as it this same problem would 
be more pronounced with an on duplicate key command, as you would probably 
actually be using it for things like increments / etc.


was (Author: cameron.hatfield):
For the help with my use case question, not really, due to the round-trip per 
row insert (same reasoning as the other "batch checkandput" bugs within phoenix 
and HBase).

For the question about commit, and this is just me trying to understand it, if 
you run an UPSERT VALUES and an UPSERT SELECT in the same commit, in that 
order, the UPSERT SELECT should not see the results of UPSERT VALUES (or a 
delete, or select), according to that wording. Is that correct? For example, 
what happens in the following case (when using jdbc driver commands for 
autocommit / commit):
{code:sql}
AUTOCOMMIT OFF;
UPSERT INTO T (PK, V) VALUES (0,0);

-- Run batch size + 1 times
UPSERT INTO entity.stuff (PK, V) SELECT PK + 1 as PK, V + 1 FROM entity.stuff;

COMMIT;
{code}

In theory, according to SQL + the docs, I would expect to see only one new row 
added to the table. But if more then one batch occurs, an extra two rows should 
exist. The most likely time for this to show up, of course, is multiple upsert 
selects within the same commit on a large enough table.

Anyways, this is only mildly related to this bug, as it this same problem would 
be more pronounced with an on duplicate key command, as you would probably 
actually be using it for things like increments / etc.

> Support ON DUPLICATE KEY construct
> --
>
> Key: PHOENIX-6
> URL: https://issues.apache.org/jira/browse/PHOENIX-6
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
>
> To support inserting a new row only if it doesn't already exist, we should 
> support the "on duplicate key" construct for UPSERT. With this construct, the 
> UPSERT VALUES statement would run atomically and would thus require a read 
> before write which would obviously have a negative impact on performance. For 
> an example of similar syntax , see MySQL documentation at 
> http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
> See this discussion for more detail: 
> https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J. 
> A related discussion is on PHOENIX-2909.
> Initially we'd support the following:
> # This would prevent the setting of VAL to 0 if the row already exists:
> {code}
> UPSERT INTO T (PK, VAL) VALUES ('a',0) 
> ON DUPLICATE KEY IGNORE;
> {code}
> # This would increment the valueS of COUNTER1 and COUNTER2 if the row already 
> exists and otherwise initialize them to 0:
> {code}
> UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
> ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
> {code}
> So the general form is:
> {code}
> UPSERT ... VALUES ... [ ON DUPLICATE KEY [IGNORE | UPDATE 
> =, ...] ]
> {code}
> The following restrictions will apply:
> - The  may not be part of the primary key constraint - only KeyValue 
> columns will be allowed.
> - If the table is immutable, the  may not appear in a secondary 
> index. This is because the mutations for indexes on immutable tables are 
> calculated on the client-side, while this new syntax would potentially modify 
> the value on the server-side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-6) Support ON DUPLICATE KEY construct

2016-10-04 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15546605#comment-15546605
 ] 

Cameron Hatfield commented on PHOENIX-6:


Ahh, I saw where I was getting confused. I thought that Phoenix implemented 
executebatch according to the JDBC contract (multiple statements will be sent 
in one go to the server), however instead they send each statement separately 
to the underlying db.

Would probably be worthwhile to mention that if you want to use this for 
counters / etc, that each separate statement hitting the same row requires a 
separate commit (just to make it more obvious that it is how it would work). 
This would also be an interesting interaction with how UPSERT ... SELECT 
batches inserts, when duplicates are involved, since the it would act 
differently depending on where duplicates where in the batch. Technically, I 
believe this is a problem now, as UPSERT .. SELECT would allow you to read data 
that is not yet "committed", so its possible that the documentation for that 
should be updated (as a separate bug): "Non transactional tables have no 
guarantees above and beyond the HBase guarantee of row level atomicity (see 
here). *In addition, non transactional tables will not see their updates until 
after a commit has occurred.*"



> Support ON DUPLICATE KEY construct
> --
>
> Key: PHOENIX-6
> URL: https://issues.apache.org/jira/browse/PHOENIX-6
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
>
> To support inserting a new row only if it doesn't already exist, we should 
> support the "on duplicate key" construct for UPSERT. With this construct, the 
> UPSERT VALUES statement would run atomically and would thus require a read 
> before write which would obviously have a negative impact on performance. For 
> an example of similar syntax , see MySQL documentation at 
> http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
> See this discussion for more detail: 
> https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J. 
> A related discussion is on PHOENIX-2909.
> Initially we'd support the following:
> # This would prevent the setting of VAL to 0 if the row already exists:
> {code}
> UPSERT INTO T (PK, VAL) VALUES ('a',0) 
> ON DUPLICATE KEY IGNORE;
> {code}
> # This would increment the valueS of COUNTER1 and COUNTER2 if the row already 
> exists and otherwise initialize them to 0:
> {code}
> UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
> ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
> {code}
> So the general form is:
> {code}
> UPSERT ... VALUES ... [ ON DUPLICATE KEY [IGNORE | UPDATE 
> =, ...] ]
> {code}
> The following restrictions will apply:
> - The  may not be part of the primary key constraint - only KeyValue 
> columns will be allowed.
> - If the table is immutable, the  may not appear in a secondary 
> index. This is because the mutations for indexes on immutable tables are 
> calculated on the client-side, while this new syntax would potentially modify 
> the value on the server-side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (PHOENIX-6) Support ON DUPLICATE KEY construct

2016-10-04 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15543705#comment-15543705
 ] 

Cameron Hatfield edited comment on PHOENIX-6 at 10/4/16 7:45 PM:
-

At what point will the expressions be evaluated in the context of update 
statement?

Specifically, intra-query order (Notice the use of COUNTER1 and COUNTER2 on the 
RHS of each expression):
Statements ran in order, in two separate execute round trips to Phoenix/HBase
{code:sql}
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER2 + 1, COUNTER2 = COUNTER1 + 1;
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER2 + 1, COUNTER2 = COUNTER1 + 1;
{code}
After the second upsert, which would be the result?
COUNTER1, COUNTER2
1: 1, 1
2: 1, 2
3: 2, 1

Last time I looked at this during the update statement thread, I believe it was 
number 1 that should be the result according to the SQL standard.


inter-query (Same query as example query in description, twice):
Multiple queries, within the same batch sent to the Phoenix coprocessor
{code:sql}
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
{code}
After the batch is run, which would be the result?
COUNTER1, COUNTER2
1: 0,0
2: 1,1

This is the one we had issues with choosing how we wanted to do it with the CAS 
change. Since they aren't in the memstore until the whole batch is finished, 
you have to keep state within the coprocessor, for each edit, to ensure it came 
out with number 2. On top of that, if you have two different queries that will 
have very different results in different orders, then batching can readily 
affect the outcome. Right now, even though all of the warnings within HBase 
talk about ordering not guaranteed for batching, for a single row, it seems to 
be ordered in batch order as sent by the client. Aside from that, since 
batching is usually a performance optimization, I wouldn't want the 
optimization actually affecting the results.

>From a SQL standard direction, I believe 2 would be correct here, as there is 
>no idea of a "transaction" for a batch in phoenix (even per key), so I would 
>assume that it would be equiv. to auto commit after each statement.




was (Author: cameron.hatfield):
At what point will the expressions be evaluated in the context of update 
statement?

Specifically, intra-query order (Notice the use of COUNTER1 and COUNTER2 on the 
RHS of each expression):
Statements ran in order, in two separate execute round trips to Phoenix/HBase

UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER2 + 1, COUNTER2 = COUNTER1 + 1;
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER2 + 1, COUNTER2 = COUNTER1 + 1;

After the second upsert, which would be the result?
COUNTER1, COUNTER2
1: 1, 1
2: 1, 2
3: 2, 1

Last time I looked at this during the update statement thread, I believe it was 
number 1 that should be the result according to the SQL standard.


inter-query (Same query as example query in description, twice):
Multiple queries, within the same batch sent to the Phoenix coprocessor
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;

After the batch is run, which would be the result?
COUNTER1, COUNTER2
1: 0,0
2: 1,1

This is the one we had issues with choosing how we wanted to do it with the CAS 
change. Since they aren't in the memstore until the whole batch is finished, 
you have to keep state within the coprocessor, for each edit, to ensure it came 
out with number 2. On top of that, if you have two different queries that will 
have very different results in different orders, then batching can readily 
affect the outcome. Right now, even though all of the warnings within HBase 
talk about ordering not guaranteed for batching, for a single row, it seems to 
be ordered in batch order as sent by the client. Aside from that, since 
batching is usually a performance optimization, I wouldn't want the 
optimization actually affecting the results.

>From a SQL standard direction, I believe 2 would be correct here, as there is 
>no idea of a "transaction" for a batch in phoenix (even per key), so I would 
>assume that it would be equiv. to auto commit after each statement.



> Support ON DUPLICATE KEY construct
> --
>
> Key: PHOENIX-6
> URL: https://issues.apache.org/jira/browse/PHOENIX-6
> Project: Phoenix
>  Issue 

[jira] [Comment Edited] (PHOENIX-6) Support ON DUPLICATE KEY construct

2016-10-04 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15546436#comment-15546436
 ] 

Cameron Hatfield edited comment on PHOENIX-6 at 10/4/16 7:43 PM:
-

Except, in the second case (two statements, same batch), the value of row 'a' 
of would non-existent, as they have never been inserted before. So wouldn't the 
result be 0,0 if it was in the same batch?


was (Author: cameron.hatfield):
Except, in the second case (two statements, same batch), the value of row 'a' 
of would non-existent, as they have never been interested before. So wouldn't 
the result be 0,0 if it was in the same batch?

> Support ON DUPLICATE KEY construct
> --
>
> Key: PHOENIX-6
> URL: https://issues.apache.org/jira/browse/PHOENIX-6
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
>
> To support inserting a new row only if it doesn't already exist, we should 
> support the "on duplicate key" construct for UPSERT. With this construct, the 
> UPSERT VALUES statement would run atomically and would thus require a read 
> before write which would obviously have a negative impact on performance. For 
> an example of similar syntax , see MySQL documentation at 
> http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
> See this discussion for more detail: 
> https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J. 
> A related discussion is on PHOENIX-2909.
> Initially we'd support the following:
> # This would prevent the setting of VAL to 0 if the row already exists:
> {code}
> UPSERT INTO T (PK, VAL) VALUES ('a',0) 
> ON DUPLICATE KEY IGNORE;
> {code}
> # This would increment the valueS of COUNTER1 and COUNTER2 if the row already 
> exists and otherwise initialize them to 0:
> {code}
> UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
> ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
> {code}
> So the general form is:
> {code}
> UPSERT ... VALUES ... [ ON DUPLICATE KEY [IGNORE | UPDATE 
> =, ...] ]
> {code}
> The following restrictions will apply:
> - The  may not be part of the primary key constraint - only KeyValue 
> columns will be allowed.
> - If the table is immutable, the  may not appear in a secondary 
> index. This is because the mutations for indexes on immutable tables are 
> calculated on the client-side, while this new syntax would potentially modify 
> the value on the server-side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-6) Support ON DUPLICATE KEY construct

2016-10-04 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15546436#comment-15546436
 ] 

Cameron Hatfield commented on PHOENIX-6:


Except, in the second case (two statements, same batch), the value of row 'a' 
of would non-existent, as they have never been interested before. So wouldn't 
the result be 0,0 if it was in the same batch?

> Support ON DUPLICATE KEY construct
> --
>
> Key: PHOENIX-6
> URL: https://issues.apache.org/jira/browse/PHOENIX-6
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
>
> To support inserting a new row only if it doesn't already exist, we should 
> support the "on duplicate key" construct for UPSERT. With this construct, the 
> UPSERT VALUES statement would run atomically and would thus require a read 
> before write which would obviously have a negative impact on performance. For 
> an example of similar syntax , see MySQL documentation at 
> http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
> See this discussion for more detail: 
> https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J. 
> A related discussion is on PHOENIX-2909.
> Initially we'd support the following:
> # This would prevent the setting of VAL to 0 if the row already exists:
> {code}
> UPSERT INTO T (PK, VAL) VALUES ('a',0) 
> ON DUPLICATE KEY IGNORE;
> {code}
> # This would increment the valueS of COUNTER1 and COUNTER2 if the row already 
> exists and otherwise initialize them to 0:
> {code}
> UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
> ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
> {code}
> So the general form is:
> {code}
> UPSERT ... VALUES ... [ ON DUPLICATE KEY [IGNORE | UPDATE 
> =, ...] ]
> {code}
> The following restrictions will apply:
> - The  may not be part of the primary key constraint - only KeyValue 
> columns will be allowed.
> - If the table is immutable, the  may not appear in a secondary 
> index. This is because the mutations for indexes on immutable tables are 
> calculated on the client-side, while this new syntax would potentially modify 
> the value on the server-side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (PHOENIX-6) Support ON DUPLICATE KEY construct

2016-10-04 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15543705#comment-15543705
 ] 

Cameron Hatfield edited comment on PHOENIX-6 at 10/4/16 6:01 PM:
-

At what point will the expressions be evaluated in the context of update 
statement?

Specifically, intra-query order (Notice the use of COUNTER1 and COUNTER2 on the 
RHS of each expression):
Statements ran in order, in two separate execute round trips to Phoenix/HBase

UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER2 + 1, COUNTER2 = COUNTER1 + 1;
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER2 + 1, COUNTER2 = COUNTER1 + 1;

After the second upsert, which would be the result?
COUNTER1, COUNTER2
1: 1, 1
2: 1, 2
3: 2, 1

Last time I looked at this during the update statement thread, I believe it was 
number 1 that should be the result according to the SQL standard.


inter-query (Same query as example query in description, twice):
Multiple queries, within the same batch sent to the Phoenix coprocessor
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;

After the batch is run, which would be the result?
COUNTER1, COUNTER2
1: 0,0
2: 1,1

This is the one we had issues with choosing how we wanted to do it with the CAS 
change. Since they aren't in the memstore until the whole batch is finished, 
you have to keep state within the coprocessor, for each edit, to ensure it came 
out with number 2. On top of that, if you have two different queries that will 
have very different results in different orders, then batching can readily 
affect the outcome. Right now, even though all of the warnings within HBase 
talk about ordering not guaranteed for batching, for a single row, it seems to 
be ordered in batch order as sent by the client. Aside from that, since 
batching is usually a performance optimization, I wouldn't want the 
optimization actually affecting the results.

>From a SQL standard direction, I believe 2 would be correct here, as there is 
>no idea of a "transaction" for a batch in phoenix (even per key), so I would 
>assume that it would be equiv. to auto commit after each statement.




was (Author: cameron.hatfield):
At what point will the expressions be evaluated in the context of update 
statement?

Specifically, inter-query order (Notice the use of COUNTER1 and COUNTER2 on the 
RHS of each expression):
Statements ran in order, in two separate execute round trips to Phoenix/HBase

UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER2 + 1, COUNTER2 = COUNTER1 + 1;
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER2 + 1, COUNTER2 = COUNTER1 + 1;

After the second upsert, which would be the result?
COUNTER1, COUNTER2
1: 1, 1
2: 1, 2
3: 2, 1

Last time I looked at this during the update statement thread, I believe it was 
number 1 that should be the result according to the SQL standard.


intra-query (Same querie as example query in description, twice):
Multiple queries, within the same batch sent to the Phoenix coprocessor
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;

After the batch is run, which would be the result?
COUNTER1, COUNTER2
1: 0,0
2: 1,1

This is the one we had issues with choosing how we wanted to do it with the CAS 
change. Since they aren't in the memstore until the whole batch is finished, 
you have to keep state within the coprocessor, for each edit, to ensure it came 
out with number 2. On top of that, if you have two different queries that will 
have very different results in different orders, then batching can readily 
affect the outcome. Right now, even though all of the warnings within HBase 
talk about ordering not guaranteed for batching, for a single row, it seems to 
be ordered in batch order as sent by the client. Aside from that, since 
batching is usually a performance optimization, I wouldn't want the 
optimization actually affecting the results.

>From a SQL standard direction, I believe 2 would be correct here, as there is 
>no idea of a "transaction" for a batch in phoenix (even per key), so I would 
>assume that it would be equiv. to auto commit after each statement.



> Support ON DUPLICATE KEY construct
> --
>
> Key: PHOENIX-6
> URL: https://issues.apache.org/jira/browse/PHOENIX-6
> Project: Phoenix
>  Issue Type: New Feature
>

[jira] [Comment Edited] (PHOENIX-6) Support ON DUPLICATE KEY construct

2016-10-03 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15543705#comment-15543705
 ] 

Cameron Hatfield edited comment on PHOENIX-6 at 10/3/16 11:21 PM:
--

At what point will the expressions be evaluated in the context of update 
statement?

Specifically, inter-query order (Notice the use of COUNTER1 and COUNTER2 on the 
RHS of each expression):
Statements ran in order, in two separate execute round trips to Phoenix/HBase

UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER2 + 1, COUNTER2 = COUNTER1 + 1;
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER2 + 1, COUNTER2 = COUNTER1 + 1;

After the second upsert, which would be the result?
COUNTER1, COUNTER2
1: 1, 1
2: 1, 2
3: 2, 1

Last time I looked at this during the update statement thread, I believe it was 
number 1 that should be the result according to the SQL standard.


intra-query (Same querie as example query in description, twice):
Multiple queries, within the same batch sent to the Phoenix coprocessor
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;

After the batch is run, which would be the result?
COUNTER1, COUNTER2
1: 0,0
2: 1,1

This is the one we had issues with choosing how we wanted to do it with the CAS 
change. Since they aren't in the memstore until the whole batch is finished, 
you have to keep state within the coprocessor, for each edit, to ensure it came 
out with number 2. On top of that, if you have two different queries that will 
have very different results in different orders, then batching can readily 
affect the outcome. Right now, even though all of the warnings within HBase 
talk about ordering not guaranteed for batching, for a single row, it seems to 
be ordered in batch order as sent by the client. Aside from that, since 
batching is usually a performance optimization, I wouldn't want the 
optimization actually affecting the results.

>From a SQL standard direction, I believe 2 would be correct here, as there is 
>no idea of a "transaction" for a batch in phoenix (even per key), so I would 
>assume that it would be equiv. to auto commit after each statement.




was (Author: cameron.hatfield):
At what point will the expressions be evaluated in the context of update 
statement?

Specifically, inter-query order (Notice the use of COUNTER1 and COUNTER2 on the 
RHS of each expression):
Statements ran in order, in two separate execute round trips to Phoenix/HBase

UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER2 + 1, COUNTER2 = COUNTER1 + 1;
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER2 + 1, COUNTER2 = COUNTER1 + 1;

After the second upsert, which would be the result?
COUNTER1, COUNTER2
1: 1, 1
2: 1, 2
3: 2, 1

Last time I looked at this during the update statement thread, I believe it was 
number 1 that should be the result according to the SQL standard.


intra-query (Same querie as example query in description, twice):
Multiple queries, within the same batch sent to the Phoenix coprocessor
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;

After the batch is run, which would be the result?
COUNTER1, COUNTER2
1: 0,0
2: 1,1

This is the one we had issues with choosing how we wanted to do it with the CAS 
change. Since they aren't in the memstore until the whole batch is finished, 
you have to keep state within the coprocessor, for each edit, to ensure it came 
out with number 2. On top of that, if you have two different queries that will 
have very different results in different orders, then batching can readily 
affect the outcome. Right now, even though all of the warnings within HBase 
talk about ordering not guaranteed for batching, for a single row, it seems to 
be ordered in batch order as sent by the client.

>From a SQL standard direction, I believe 2 would be correct here, as there is 
>no idea of a "transaction" for a batch in phoenix (even per key), so I would 
>assume that it would be equiv. to auto commit after each statement.



> Support ON DUPLICATE KEY construct
> --
>
> Key: PHOENIX-6
> URL: https://issues.apache.org/jira/browse/PHOENIX-6
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
>
> To support inserting a new row only if it 

[jira] [Commented] (PHOENIX-6) Support ON DUPLICATE KEY construct

2016-10-03 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15543719#comment-15543719
 ] 

Cameron Hatfield commented on PHOENIX-6:


And as a general note, though inefficent, when combined with an IF-like 
function, this would also cover our CAS use cases. However, it will be highly 
dependent on the ordering guarantees above. Here's the first blog post example 
I found of doing CAS like things in another SQL engine, 
https://thewebfellas.com/blog/conditional-duplicate-key-updates-with-mysql

> Support ON DUPLICATE KEY construct
> --
>
> Key: PHOENIX-6
> URL: https://issues.apache.org/jira/browse/PHOENIX-6
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
>
> To support inserting a new row only if it doesn't already exist, we should 
> support the "on duplicate key" construct for UPSERT. With this construct, the 
> UPSERT VALUES statement would run atomically and would thus require a read 
> before write which would obviously have a negative impact on performance. For 
> an example of similar syntax , see MySQL documentation at 
> http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
> See this discussion for more detail: 
> https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J. 
> A related discussion is on PHOENIX-2909.
> Initially we'd support the following:
> # This would prevent the setting of VAL to 0 if the row already exists:
> {code}
> UPSERT INTO T (PK, VAL) VALUES ('a',0) 
> ON DUPLICATE KEY IGNORE;
> {code}
> # This would increment the valueS of COUNTER1 and COUNTER2 if the row already 
> exists and otherwise initialize them to 0:
> {code}
> UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
> ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
> {code}
> So the general form is:
> {code}
> UPSERT ... VALUES ... [ ON DUPLICATE KEY [IGNORE | UPDATE 
> =, ...] ]
> {code}
> The following restrictions will apply:
> - The  may not be part of the primary key constraint - only KeyValue 
> columns will be allowed.
> - If the table is immutable, the  may not appear in a secondary 
> index. This is because the mutations for indexes on immutable tables are 
> calculated on the client-side, while this new syntax would potentially modify 
> the value on the server-side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (PHOENIX-6) Support ON DUPLICATE KEY construct

2016-10-03 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15543705#comment-15543705
 ] 

Cameron Hatfield edited comment on PHOENIX-6 at 10/3/16 11:13 PM:
--

At what point will the expressions be evaluated in the context of update 
statement?

Specifically, inter-query order (Notice the use of COUNTER1 and COUNTER2 on the 
RHS of each expression):
Statements ran in order, in two separate execute round trips to Phoenix/HBase

UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER2 + 1, COUNTER2 = COUNTER1 + 1;
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER2 + 1, COUNTER2 = COUNTER1 + 1;

After the second upsert, which would be the result?
COUNTER1, COUNTER2
1: 1, 1
2: 1, 2
3: 2, 1

Last time I looked at this during the update statement thread, I believe it was 
number 1 that should be the result according to the SQL standard.


intra-query (Same querie as example query in description, twice):
Multiple queries, within the same batch sent to the Phoenix coprocessor
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;

After the batch is run, which would be the result?
COUNTER1, COUNTER2
1: 0,0
2: 1,1

This is the one we had issues with choosing how we wanted to do it with the CAS 
change. Since they aren't in the memstore until the whole batch is finished, 
you have to keep state within the coprocessor, for each edit, to ensure it came 
out with number 2. On top of that, if you have two different queries that will 
have very different results in different orders, then batching can readily 
affect the outcome. Right now, even though all of the warnings within HBase 
talk about ordering not guaranteed for batching, for a single row, it seems to 
be ordered in batch order as sent by the client.

>From a SQL standard direction, I believe 2 would be correct here, as there is 
>no idea of a "transaction" for a batch in phoenix (even per key), so I would 
>assume that it would be equiv. to auto commit after each statement.




was (Author: cameron.hatfield):
At what point will the expressions be evaluated in the context of update 
statement?

Specifically, inter-query order (Notice the use of COUNTER1 and COUNTER2 on the 
RHS of each expression):
Statements ran in order, in two separate execute round trips to Phoenix/HBase

UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER2 + 1, COUNTER2 = COUNTER1 + 1;
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER2 + 1, COUNTER2 = COUNTER1 + 1;

After the second upsert, which would be the result?
1: 1, 1
2: 1, 2
3: 2, 1

Last time I looked at this during the update statement thread, I believe it was 
number 1 that should be the result according to the SQL standard.


intra-query (Same querie as example query in description, twice):
Multiple queries, within the same batch sent to the Phoenix coprocessor
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;

After the batch is run, which would be the result?
1: 0,0
2: 1,1

This is the one we had issues with choosing how we wanted to do it with the CAS 
change. Since they aren't in the memstore until the whole batch is finished, 
you have to keep state within the coprocessor, for each edit, to ensure it came 
out with number 2. On top of that, if you have two different queries that will 
have very different results in different orders, then batching can readily 
affect the outcome. Right now, even though all of the warnings within HBase 
talk about ordering not guaranteed for batching, for a single row, it seems to 
be ordered in batch order as sent by the client.

>From a SQL standard direction, I believe 2 would be correct here, as there is 
>no idea of a "transaction" for a batch in phoenix (even per key), so I would 
>assume that it would be equiv. to auto commit after each statement.



> Support ON DUPLICATE KEY construct
> --
>
> Key: PHOENIX-6
> URL: https://issues.apache.org/jira/browse/PHOENIX-6
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
>
> To support inserting a new row only if it doesn't already exist, we should 
> support the "on duplicate key" construct for UPSERT. With this construct, the 
> UPSERT VALUES statement would run atomically and would 

[jira] [Comment Edited] (PHOENIX-6) Support ON DUPLICATE KEY construct

2016-10-03 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15543705#comment-15543705
 ] 

Cameron Hatfield edited comment on PHOENIX-6 at 10/3/16 11:12 PM:
--

At what point will the expressions be evaluated in the context of update 
statement?

Specifically, inter-query order (Notice the use of COUNTER1 and COUNTER2 on the 
RHS of each expression):
Statements ran in order, in two separate execute round trips to Phoenix/HBase

UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER2 + 1, COUNTER2 = COUNTER1 + 1;
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER2 + 1, COUNTER2 = COUNTER1 + 1;

After the second upsert, which would be the result?
1: 1, 1
2: 1, 2
3: 2, 1

Last time I looked at this during the update statement thread, I believe it was 
number 1 that should be the result according to the SQL standard.


intra-query (Same querie as example query in description, twice):
Multiple queries, within the same batch sent to the Phoenix coprocessor
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;

After the batch is run, which would be the result?
1: 0,0
2: 1,1

This is the one we had issues with choosing how we wanted to do it with the CAS 
change. Since they aren't in the memstore until the whole batch is finished, 
you have to keep state within the coprocessor, for each edit, to ensure it came 
out with number 2. On top of that, if you have two different queries that will 
have very different results in different orders, then batching can readily 
affect the outcome. Right now, even though all of the warnings within HBase 
talk about ordering not guaranteed for batching, for a single row, it seems to 
be ordered in batch order as sent by the client.

>From a SQL standard direction, I believe 2 would be correct here, as there is 
>no idea of a "transaction" for a batch in phoenix (even per key), so I would 
>assume that it would be equiv. to auto commit after each statement.




was (Author: cameron.hatfield):
At what point will the expressions be evaluated in the context of update 
statement?

Specifically, inter-query order (Notice the use of COUNTER1 and COUNTER2 on the 
RHS of each expression):
Both run in two separate round trips to Phoenix/HBase

UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER2 + 1, COUNTER2 = COUNTER1 + 1;
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER2 + 1, COUNTER2 = COUNTER1 + 1;

After the second upsert, which would be the result?
1: 1, 1
2: 1, 2
3: 2, 1

Last time I looked at this during the update statement thread, I believe it was 
number 1 that should be the result according to the SQL standard.


intra-query (Same querie as example query in description, twice):
Multiple queries, within the same batch sent to the Phoenix coprocessor
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;

After the batch is run, which would be the result?
1: 0,0
2: 1,1

This is the one we had issues with choosing how we wanted to do it with the CAS 
change. Since they aren't in the memstore until the whole batch is finished, 
you have to keep state within the coprocessor, for each edit, to ensure it came 
out with number 2. On top of that, if you have two different queries that will 
have very different results in different orders, then batching can readily 
affect the outcome. Right now, even though all of the warnings within HBase 
talk about ordering not guaranteed for batching, for a single row, it seems to 
be ordered in batch order as sent by the client.

>From a SQL standard direction, I believe 2 would be correct here, as there is 
>no idea of a "transaction" for a batch in phoenix (even per key), so I would 
>assume that it would be equiv. to auto commit after each statement.



> Support ON DUPLICATE KEY construct
> --
>
> Key: PHOENIX-6
> URL: https://issues.apache.org/jira/browse/PHOENIX-6
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
>
> To support inserting a new row only if it doesn't already exist, we should 
> support the "on duplicate key" construct for UPSERT. With this construct, the 
> UPSERT VALUES statement would run atomically and would thus require a read 
> before write which would obviously have a 

[jira] [Commented] (PHOENIX-6) Support ON DUPLICATE KEY construct

2016-10-03 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15543705#comment-15543705
 ] 

Cameron Hatfield commented on PHOENIX-6:


At what point will the expressions be evaluated in the context of update 
statement?

Specifically, inter-query order (Notice the use of COUNTER1 and COUNTER2 on the 
RHS of each expression):
Both run in two separate round trips to Phoenix/HBase

UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER2 + 1, COUNTER2 = COUNTER1 + 1;
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER2 + 1, COUNTER2 = COUNTER1 + 1;

After the second upsert, which would be the result?
1: 1, 1
2: 1, 2
3: 2, 1

Last time I looked at this during the update statement thread, I believe it was 
number 1 that should be the result according to the SQL standard.


intra-query (Same querie as example query in description, twice):
Multiple queries, within the same batch sent to the Phoenix coprocessor
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;

After the batch is run, which would be the result?
1: 0,0
2: 1,1

This is the one we had issues with choosing how we wanted to do it with the CAS 
change. Since they aren't in the memstore until the whole batch is finished, 
you have to keep state within the coprocessor, for each edit, to ensure it came 
out with number 2. On top of that, if you have two different queries that will 
have very different results in different orders, then batching can readily 
affect the outcome. Right now, even though all of the warnings within HBase 
talk about ordering not guaranteed for batching, for a single row, it seems to 
be ordered in batch order as sent by the client.

>From a SQL standard direction, I believe 2 would be correct here, as there is 
>no idea of a "transaction" for a batch in phoenix (even per key), so I would 
>assume that it would be equiv. to auto commit after each statement.



> Support ON DUPLICATE KEY construct
> --
>
> Key: PHOENIX-6
> URL: https://issues.apache.org/jira/browse/PHOENIX-6
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
>
> To support inserting a new row only if it doesn't already exist, we should 
> support the "on duplicate key" construct for UPSERT. With this construct, the 
> UPSERT VALUES statement would run atomically and would thus require a read 
> before write which would obviously have a negative impact on performance. For 
> an example of similar syntax , see MySQL documentation at 
> http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
> See this discussion for more detail: 
> https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J. 
> A related discussion is on PHOENIX-2909.
> Initially we'd support the following:
> # This would prevent the setting of VAL to 0 if the row already exists:
> {code}
> UPSERT INTO T (PK, VAL) VALUES ('a',0) 
> ON DUPLICATE KEY IGNORE;
> {code}
> # This would increment the valueS of COUNTER1 and COUNTER2 if the row already 
> exists and otherwise initialize them to 0:
> {code}
> UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
> ON DUPLICATE KEY COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
> {code}
> So the general form is:
> {code}
> UPSERT ... VALUES ... [ ON DUPLICATE KEY [IGNORE | UPDATE 
> =, ...] ]
> {code}
> The following restrictions will apply:
> - The  may not be part of the primary key constraint - only KeyValue 
> columns will be allowed.
> - If the table is immutable, the  may not appear in a secondary 
> index. This is because the mutations for indexes on immutable tables are 
> calculated on the client-side, while this new syntax would potentially modify 
> the value on the server-side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-2909) Surface checkAndPut through UPDATE statement

2016-08-24 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435860#comment-15435860
 ] 

Cameron Hatfield commented on PHOENIX-2909:
---

I would agree, and I would say is a good argument for having something else 
within the merge statement call out the use of whether this will be a fast / 
slow version. My basic worry is that when I see something like UPDATE vs UPSERT 
or INSERT vs INSERT ... ON DUPLICATE KEY IGNORE, I always expect the more 
complicated / does more for me statement to be either the same speed or slower 
in the general case. With the proposed change, the more constrained case UPDATE 
is actually slower. Whether that is a valid concern is up to debate.


How does this update statement interact with transaction support?

> Surface checkAndPut through UPDATE statement
> 
>
> Key: PHOENIX-2909
> URL: https://issues.apache.org/jira/browse/PHOENIX-2909
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
>
> We can surface atomic checkAndPut like functionality through support of the 
> SQL UPSERT statement.
> For example, the following could use do a get under row lock to perform the 
> row update atomically
> {code}
> UPDATE  my_table SET counter=coalesce(counter,0) + 1 
> FROM my_table WHERE pk1 = 1 AND pk2 = 2;
> {code}
> To force prior MVCC transactions to complete (making it serializable as an 
> Increment is), we'd have code like this:
> {code}
> mvcc = region.getMVCC();
> mvcc.completeMemstoreInsert(mvcc.beginMemstoreInsert());
> {code}
> By users setting auto commit to true and issuing an UPDATE statement over a 
> non transactional table, they'd get a way for row updates to be atomic. This 
> would work especially well to support counters.
> An UPDATE statement would simply be translated to an equivalent UPSERT SELECT 
> with a flag being passed to the server such that the row lock and read occurs 
> when executed. For example, the above statement would become:
> {code}
> UPSERT INTO  my_table(pk1,pk2,counter) SELECT pk1, pk2, coalesce(counter,0) + 
> 1 
> FROM my_table WHERE pk1 = 1 AND pk2 = 2;
> {code}
> Note that the coalesce call above handles the case where counter is null. 
> This could be made prettier with support for the DEFAULT clause at CREATE 
> TABLE time (PHOENIX-476).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (PHOENIX-2909) Surface checkAndPut through UPDATE statement

2016-08-24 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435680#comment-15435680
 ] 

Cameron Hatfield edited comment on PHOENIX-2909 at 8/24/16 9:04 PM:


I like the idea of this (of course), but the issue I have is one that this 
doesn't support our use case because it requires an existing row, unlike an 
upsert statement. Also seems like it starts getting confusing when you have two 
separate statements, that are essentially doing the same thing, but have a 
slightly different use case due to the purely backend change to it flushing the 
memstore. I would expect an UPDATE statement to behave exactly the same as an 
UPSERT, in the case that a row already exists. But violating the principle of 
least surprise, it actually ends up being more expensive, with little to no 
warning to the user aside from a note in the documentation.

Wouldn't this be better to add as a flag / etc on top of the existing upsert 
statement? Should we reconsider the work already done on PHOENIX-2271?


was (Author: cameron.hatfield):
I like the idea of this (of course), but the issue I have is one that this 
doesn't support our use case because it requires an existing row, unlike an 
upsert statement. Also seems like it starts getting confusing when you have two 
separate statements, that are essentially doing the same thing, but have a 
slightly different use case due to the purely backend change to it flushing the 
memstore. I would expect an UPDATE statement to behave exactly the same as an 
UPSERT, in the case that it already exists. But violating the principle of 
least surprise, it actually ends up being more expensive, with little to no 
warning to the user aside from a note in the documentation.

Wouldn't this be better to add as a flag / etc on top of the existing upsert 
statement? Should we reconsider the work already done on PHOENIX-2271?

> Surface checkAndPut through UPDATE statement
> 
>
> Key: PHOENIX-2909
> URL: https://issues.apache.org/jira/browse/PHOENIX-2909
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
>
> We can surface atomic checkAndPut like functionality through support of the 
> SQL UPSERT statement.
> For example, the following could use do a get under row lock to perform the 
> row update atomically
> {code}
> UPDATE  my_table SET counter=coalesce(counter,0) + 1 
> FROM my_table WHERE pk1 = 1 AND pk2 = 2;
> {code}
> To force prior MVCC transactions to complete (making it serializable as an 
> Increment is), we'd have code like this:
> {code}
> mvcc = region.getMVCC();
> mvcc.completeMemstoreInsert(mvcc.beginMemstoreInsert());
> {code}
> By users setting auto commit to true and issuing an UPDATE statement over a 
> non transactional table, they'd get a way for row updates to be atomic. This 
> would work especially well to support counters.
> An UPDATE statement would simply be translated to an equivalent UPSERT SELECT 
> with a flag being passed to the server such that the row lock and read occurs 
> when executed. For example, the above statement would become:
> {code}
> UPSERT INTO  my_table(pk1,pk2,counter) SELECT pk1, pk2, coalesce(counter,0) + 
> 1 
> FROM my_table WHERE pk1 = 1 AND pk2 = 2;
> {code}
> Note that the coalesce call above handles the case where counter is null. 
> This could be made prettier with support for the DEFAULT clause at CREATE 
> TABLE time (PHOENIX-476).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-2909) Surface checkAndPut through UPDATE statement

2016-08-24 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435680#comment-15435680
 ] 

Cameron Hatfield commented on PHOENIX-2909:
---

I like the idea of this (of course), but the issue I have is one that this 
doesn't support our use case because it requires an existing row, unlike an 
upsert statement. Also seems like it starts getting confusing when you have two 
separate statements, that are essentially doing the same thing, but have a 
slightly different use case due to the purely backend change to it flushing the 
memstore. I would expect an UPDATE statement to behave exactly the same as an 
UPSERT, in the case that it already exists. But violating the principle of 
least surprise, it actually ends up being more expensive, with little to no 
warning to the user aside from a note in the documentation.

Wouldn't this be better to add as a flag / etc on top of the existing upsert 
statement? Should we reconsider the work already done on PHOENIX-2271?

> Surface checkAndPut through UPDATE statement
> 
>
> Key: PHOENIX-2909
> URL: https://issues.apache.org/jira/browse/PHOENIX-2909
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
>
> We can surface atomic checkAndPut like functionality through support of the 
> SQL UPSERT statement.
> For example, the following could use do a get under row lock to perform the 
> row update atomically
> {code}
> UPDATE  my_table SET counter=coalesce(counter,0) + 1 
> FROM my_table WHERE pk1 = 1 AND pk2 = 2;
> {code}
> To force prior MVCC transactions to complete (making it serializable as an 
> Increment is), we'd have code like this:
> {code}
> mvcc = region.getMVCC();
> mvcc.completeMemstoreInsert(mvcc.beginMemstoreInsert());
> {code}
> By users setting auto commit to true and issuing an UPDATE statement over a 
> non transactional table, they'd get a way for row updates to be atomic. This 
> would work especially well to support counters.
> An UPDATE statement would simply be translated to an equivalent UPSERT SELECT 
> with a flag being passed to the server such that the row lock and read occurs 
> when executed. For example, the above statement would become:
> {code}
> UPSERT INTO  my_table(pk1,pk2,counter) SELECT pk1, pk2, coalesce(counter,0) + 
> 1 
> FROM my_table WHERE pk1 = 1 AND pk2 = 2;
> {code}
> Note that the coalesce call above handles the case where counter is null. 
> This could be made prettier with support for the DEFAULT clause at CREATE 
> TABLE time (PHOENIX-476).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-2909) Surface checkAndPut through UPDATE statement

2016-08-24 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435629#comment-15435629
 ] 

Cameron Hatfield commented on PHOENIX-2909:
---

What has changed since the original discussion that no longer makes 
PHOENIX-2275 the correct answer?

> Surface checkAndPut through UPDATE statement
> 
>
> Key: PHOENIX-2909
> URL: https://issues.apache.org/jira/browse/PHOENIX-2909
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
>
> We can surface atomic checkAndPut like functionality through support of the 
> SQL UPSERT statement.
> For example, the following could use do a get under row lock to perform the 
> row update atomically
> {code}
> UPDATE  my_table SET counter=coalesce(counter,0) + 1 
> FROM my_table WHERE pk1 = 1 AND pk2 = 2;
> {code}
> To force prior MVCC transactions to complete (making it serializable as an 
> Increment is), we'd have code like this:
> {code}
> mvcc = region.getMVCC();
> mvcc.completeMemstoreInsert(mvcc.beginMemstoreInsert());
> {code}
> By users setting auto commit to true and issuing an UPDATE statement over a 
> non transactional table, they'd get a way for row updates to be atomic. This 
> would work especially well to support counters.
> An UPDATE statement would simply be translated to an equivalent UPSERT SELECT 
> with a flag being passed to the server such that the row lock and read occurs 
> when executed. For example, the above statement would become:
> {code}
> UPSERT INTO  my_table(pk1,pk2,counter) SELECT pk1, pk2, coalesce(counter,0) + 
> 1 
> FROM my_table WHERE pk1 = 1 AND pk2 = 2;
> {code}
> Note that the coalesce call above handles the case where counter is null. 
> This could be made prettier with support for the DEFAULT clause at CREATE 
> TABLE time (PHOENIX-476).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-2909) Surface checkAndPut through UPDATE statement

2016-08-24 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435628#comment-15435628
 ] 

Cameron Hatfield commented on PHOENIX-2909:
---

Previous discussion on check-and-put

> Surface checkAndPut through UPDATE statement
> 
>
> Key: PHOENIX-2909
> URL: https://issues.apache.org/jira/browse/PHOENIX-2909
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.9.0
>
>
> We can surface atomic checkAndPut like functionality through support of the 
> SQL UPSERT statement.
> For example, the following could use do a get under row lock to perform the 
> row update atomically
> {code}
> UPDATE  my_table SET counter=coalesce(counter,0) + 1 
> FROM my_table WHERE pk1 = 1 AND pk2 = 2;
> {code}
> To force prior MVCC transactions to complete (making it serializable as an 
> Increment is), we'd have code like this:
> {code}
> mvcc = region.getMVCC();
> mvcc.completeMemstoreInsert(mvcc.beginMemstoreInsert());
> {code}
> By users setting auto commit to true and issuing an UPDATE statement over a 
> non transactional table, they'd get a way for row updates to be atomic. This 
> would work especially well to support counters.
> An UPDATE statement would simply be translated to an equivalent UPSERT SELECT 
> with a flag being passed to the server such that the row lock and read occurs 
> when executed. For example, the above statement would become:
> {code}
> UPSERT INTO  my_table(pk1,pk2,counter) SELECT pk1, pk2, coalesce(counter,0) + 
> 1 
> FROM my_table WHERE pk1 = 1 AND pk2 = 2;
> {code}
> Note that the coalesce call above handles the case where counter is null. 
> This could be made prettier with support for the DEFAULT clause at CREATE 
> TABLE time (PHOENIX-476).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-2271) Upsert - CheckAndPut like functionality

2015-09-25 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908804#comment-14908804
 ] 

Cameron Hatfield commented on PHOENIX-2271:
---

For values:
Thanks for the link, answers the problems that I was unsure about.

The particular case I was mentioning isn't necessarily just count, but that the 
statement can return the keys themselves that were updated / inserted / 
deleted, generally used for finding out things such as what was already in the 
table, what wasn't inserted cause it already existed, etc. The count itself is 
also something that would be nice to have, and probably much easier to return 
compared to actual rows. I have actually used this feature in TSQL the past, 
though I can't remember if it was for anything more then counts. (Which I 
believe is how TSQL got around the fact that you only return one count).

The next question would then be, can we get this patch in, with a limited 
subset of merge implemented (a insert/update only merge, for values only), with 
the assumption being that adding table support would be harder problem then 
just the values that it is now, or should we wait on the longer-burn timeline 
of calcite.




> Upsert - CheckAndPut like functionality
> ---
>
> Key: PHOENIX-2271
> URL: https://issues.apache.org/jira/browse/PHOENIX-2271
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Babar Tareen
> Attachments: patch.diff
>
>
> The Upsert statement does not support HBase's checkAndPut api, thus making it 
> difficult to conditionally update a row. Based on the comments from 
> PHOENIX-6, I have implemented such functionality. The Upsert statement is 
> modified to support compare clause, which allows us to pass in an expression. 
> The expression is evaluated against the current record and Upsert is only 
> performed when the expression evaluates to true. More details 
> [here|https://github.com/babartareen/phoenix].
> h4. Examples
> Given that the FirstName is always set for the users, create a user record if 
> one doesn't already exist.
> {code:sql}
> UPSERT INTO User (UserId, FirstName, LastName, Phone, Address, PIN) VALUES 
> (1, 'Alice', 'A', '123 456 7890', 'Some St. in a city', 1122) COMPARE 
> FirstName IS NULL;
> {code}
> Update the phone number for UserId '1' if the FirstName is set. Given that 
> the FirstName is always set for the users, this will only update the record 
> if it already exists.
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> IS NOT NULL;
> {code}
> Update the phone number if the first name for UserId '1' starts with 'Al' and 
> last name is 'A'
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> LIKE 'Al%' AND LastName = 'A';  
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-2271) Upsert - CheckAndPut like functionality

2015-09-25 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908805#comment-14908805
 ] 

Cameron Hatfield commented on PHOENIX-2271:
---

For values:
Thanks for the link, answers the problems that I was unsure about.

The particular case I was mentioning isn't necessarily just count, but that the 
statement can return the keys themselves that were updated / inserted / 
deleted, generally used for finding out things such as what was already in the 
table, what wasn't inserted cause it already existed, etc. The count itself is 
also something that would be nice to have, and probably much easier to return 
compared to actual rows. I have actually used this feature in TSQL the past, 
though I can't remember if it was for anything more then counts. (Which I 
believe is how TSQL got around the fact that you only return one count).

The next question would then be, can we get this patch in, with a limited 
subset of merge implemented (a insert/update only merge, for values only), with 
the assumption being that adding table support would be harder problem then 
just the values that it is now, or should we wait on the longer-burn timeline 
of calcite.




> Upsert - CheckAndPut like functionality
> ---
>
> Key: PHOENIX-2271
> URL: https://issues.apache.org/jira/browse/PHOENIX-2271
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Babar Tareen
> Attachments: patch.diff
>
>
> The Upsert statement does not support HBase's checkAndPut api, thus making it 
> difficult to conditionally update a row. Based on the comments from 
> PHOENIX-6, I have implemented such functionality. The Upsert statement is 
> modified to support compare clause, which allows us to pass in an expression. 
> The expression is evaluated against the current record and Upsert is only 
> performed when the expression evaluates to true. More details 
> [here|https://github.com/babartareen/phoenix].
> h4. Examples
> Given that the FirstName is always set for the users, create a user record if 
> one doesn't already exist.
> {code:sql}
> UPSERT INTO User (UserId, FirstName, LastName, Phone, Address, PIN) VALUES 
> (1, 'Alice', 'A', '123 456 7890', 'Some St. in a city', 1122) COMPARE 
> FirstName IS NULL;
> {code}
> Update the phone number for UserId '1' if the FirstName is set. Given that 
> the FirstName is always set for the users, this will only update the record 
> if it already exists.
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> IS NOT NULL;
> {code}
> Update the phone number if the first name for UserId '1' starts with 'Al' and 
> last name is 'A'
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> LIKE 'Al%' AND LastName = 'A';  
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (PHOENIX-2271) Upsert - CheckAndPut like functionality

2015-09-25 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908805#comment-14908805
 ] 

Cameron Hatfield edited comment on PHOENIX-2271 at 9/25/15 10:49 PM:
-

For values:
Thanks for the link, answers the problems that I was unsure about.

The particular case I was mentioning isn't necessarily just count, but that the 
statement can return the keys themselves that were updated / inserted / 
deleted, generally used for finding out things such as what was already in the 
table, what wasn't inserted cause it already existed, etc. The count itself is 
also something that would be nice to have, and probably much easier to return 
compared to actual rows. I have actually used this feature in TSQL the past, 
though I can't remember if it was for anything more then counts. (Which I 
believe is how TSQL got around the fact that you only return one count).

The next question would then be, can we get this patch in, with a limited 
subset of merge implemented (a insert/update only merge, for values only), with 
the assumption being that adding table support would be harder problem then 
just the values that it is now, or should we wait on the longer-burn timeline 
of calcite (since I believe we have previously removed the requirement of 
waiting for transactions).





was (Author: cameron.hatfield):
For values:
Thanks for the link, answers the problems that I was unsure about.

The particular case I was mentioning isn't necessarily just count, but that the 
statement can return the keys themselves that were updated / inserted / 
deleted, generally used for finding out things such as what was already in the 
table, what wasn't inserted cause it already existed, etc. The count itself is 
also something that would be nice to have, and probably much easier to return 
compared to actual rows. I have actually used this feature in TSQL the past, 
though I can't remember if it was for anything more then counts. (Which I 
believe is how TSQL got around the fact that you only return one count).

The next question would then be, can we get this patch in, with a limited 
subset of merge implemented (a insert/update only merge, for values only), with 
the assumption being that adding table support would be harder problem then 
just the values that it is now, or should we wait on the longer-burn timeline 
of calcite.




> Upsert - CheckAndPut like functionality
> ---
>
> Key: PHOENIX-2271
> URL: https://issues.apache.org/jira/browse/PHOENIX-2271
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Babar Tareen
> Attachments: patch.diff
>
>
> The Upsert statement does not support HBase's checkAndPut api, thus making it 
> difficult to conditionally update a row. Based on the comments from 
> PHOENIX-6, I have implemented such functionality. The Upsert statement is 
> modified to support compare clause, which allows us to pass in an expression. 
> The expression is evaluated against the current record and Upsert is only 
> performed when the expression evaluates to true. More details 
> [here|https://github.com/babartareen/phoenix].
> h4. Examples
> Given that the FirstName is always set for the users, create a user record if 
> one doesn't already exist.
> {code:sql}
> UPSERT INTO User (UserId, FirstName, LastName, Phone, Address, PIN) VALUES 
> (1, 'Alice', 'A', '123 456 7890', 'Some St. in a city', 1122) COMPARE 
> FirstName IS NULL;
> {code}
> Update the phone number for UserId '1' if the FirstName is set. Given that 
> the FirstName is always set for the users, this will only update the record 
> if it already exists.
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> IS NOT NULL;
> {code}
> Update the phone number if the first name for UserId '1' starts with 'Al' and 
> last name is 'A'
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> LIKE 'Al%' AND LastName = 'A';  
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-2271) Upsert - CheckAndPut like functionality

2015-09-23 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905376#comment-14905376
 ] 

Cameron Hatfield commented on PHOENIX-2271:
---

For the second part, in reference to rowprocessor / multiple conditional checks 
/ etc:
The currently discussed patch supports all of those features, using the 
currently existing Indexer Region Observer, allowing for easy integration with 
the indexing process that exists. If it makes sense to pull that out to a 
rowprocessor, then that seems a reasonable change to make, but I don't believe 
we need to bring in or use any of mentioned extensions to HBase, as they don't 
necessarily integrate as nicely with Phoenixes indexes (though that is just 
from our understanding of the codebase of both Phoenix and HBase, which neither 
of us can claim expertise in), nor how the phoenix client works.

> Upsert - CheckAndPut like functionality
> ---
>
> Key: PHOENIX-2271
> URL: https://issues.apache.org/jira/browse/PHOENIX-2271
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Babar Tareen
> Attachments: patch.diff
>
>
> The Upsert statement does not support HBase's checkAndPut api, thus making it 
> difficult to conditionally update a row. Based on the comments from 
> PHOENIX-6, I have implemented such functionality. The Upsert statement is 
> modified to support compare clause, which allows us to pass in an expression. 
> The expression is evaluated against the current record and Upsert is only 
> performed when the expression evaluates to true. More details 
> [here|https://github.com/babartareen/phoenix].
> h4. Examples
> Given that the FirstName is always set for the users, create a user record if 
> one doesn't already exist.
> {code:sql}
> UPSERT INTO User (UserId, FirstName, LastName, Phone, Address, PIN) VALUES 
> (1, 'Alice', 'A', '123 456 7890', 'Some St. in a city', 1122) COMPARE 
> FirstName IS NULL;
> {code}
> Update the phone number for UserId '1' if the FirstName is set. Given that 
> the FirstName is always set for the users, this will only update the record 
> if it already exists.
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> IS NOT NULL;
> {code}
> Update the phone number if the first name for UserId '1' starts with 'Al' and 
> last name is 'A'
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> LIKE 'Al%' AND LastName = 'A';  
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (PHOENIX-2271) Upsert - CheckAndPut like functionality

2015-09-23 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905383#comment-14905383
 ] 

Cameron Hatfield edited comment on PHOENIX-2271 at 9/23/15 10:05 PM:
-

For the issue of trying to automatically determine when to do what for MERGE:
The main issue I see is that transactions and compare-and-set offer similar, 
but subtly different concurrency guarantees, 
that I believe are impossible to automatically choose for a user.

Specifically, imagine the following example:

CREATE TABLE MyTable (id INT primary key, value INT)
UPSERT INTO MyTable (id, value) values(1, 1)


Now, you have merge statement A:
MERGE INTO MyTable
USING (VALUES (1, 2)) AS NewData (id, value)
ON MyTable.id = NewData.id and value >= 1
WHEN MATCHED THEN UPDATE SET MyTable.value = MyTable.value
WHEN NOT MATCHED THEN INSERT VALUES (NewData.id, NewData.value)

And merge statement B
MERGE INTO MyTable
USING (VALUES (1, 3)) AS NewData (id, value)
ON MyTable.id = NewData.id and value >= 1
WHEN MATCHED THEN UPDATE SET MyTable.value = MyTable.value
WHEN NOT MATCHED THEN INSERT VALUES (NewData.id, NewData.value)
 

The main difference between the two is that value is 2 in the first, and value 
is 3 in the other.

Now what happens if they conflict?

In a compare-and-set world, the last one to commit will win (since the 
condition will match, no matter the order). No exception will ever be thrown.
But in a transaction world, the first one to commit will win, as a transaction 
error would be thrown when the second one is committed.

So the options seem to become either adding a special option to merge, that 
allows you to force compare-and-set like functionality, which may mean the 
grammar has some cases 
where it has features that are not supported for compare-and-set vs a 
transaction based merged, as well as having two slightly different MERGE 
statements with the only difference being a single keywords, with two subtly 
different semantics; or use the already non-standard UPSERT command as to add 
that support, with a syntax that calls out that this is a different operation. 
Having is part of the UPSERT command would also help differentiate UPSERT from 
MERGE, allowing you to do more low-level / different types of concurrency, as 
otherwise UPSERT becomes essentially a subset of the MERGE command.


was (Author: cameron.hatfield):
For the issue of trying to automatically determine when to do what for MERGE:
The main issue I see is that transactions and compare-and-set offer similar, 
but subtly different concurrency guarantees, 
that I believe are impossible to automatically choose for a user.

Specifically, imagine the following example:

CREATE TABLE MyTable (id INT primary key, value INT)
UPSERT INTO MyTable (id, value) values(1, 1)


Now, you have merge statement A:
MERGE INTO MyTable
USING (VALUES (1, 2)) AS NewData (id, value)
ON MyTable.id = NewData.id and value >= 1
WHEN MATCHED THEN UPDATE SET MyTable.value = MyTable.value
WHEN NOT MATCHED THEN INSERT VALUES (NewData.id, NewData.value)

And merge statement B
MERGE INTO MyTable
USING (VALUES (1, 3)) AS NewData (id, value)
ON MyTable.id = NewData.id and value >= 1
WHEN MATCHED THEN UPDATE SET MyTable.value = MyTable.value
WHEN NOT MATCHED THEN INSERT VALUES (NewData.id, NewData.value)
 

The main difference between the two is that value is 2 in the first, and value 
is 3 in the other.

Now what happens if they conflict?

In a compare-and-set world, the last one to commit will win (since the 
condition will match, no matter the order). No exception will ever be thrown.
But in a transaction world, the first one to commit will win, as a transaction 
error would be thrown when the second one is committed.

So the options seem to become either adding a special option to merge, that 
allows you to force compare-and-set like functionality, which may mean the 
grammar has some cases 
where it has features that are not supported for compare-and-set vs a 
transaction based merged, as well as having two slightly different MERGE 
statements with the only difference being a single keywords, with two subtly 
different semantics; or use the already non-standard UPSERT command as to add 
that support, with a syntax that calls out that this is a different operation. 
This would
also help differentiate UPSERT from MERGE, allowing you to do more low-level / 
different types of concurrency, as otherwise UPSERT becomes essentially a 
subset of the merge operation.

> Upsert - CheckAndPut like functionality
> ---
>
> Key: PHOENIX-2271
> URL: https://issues.apache.org/jira/browse/PHOENIX-2271
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Babar Tareen
> Attachments: patch.diff
>
>
> The Upsert statement does not support HBase's checkAndPut api, thus 

[jira] [Commented] (PHOENIX-2271) Upsert - CheckAndPut like functionality

2015-09-23 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905383#comment-14905383
 ] 

Cameron Hatfield commented on PHOENIX-2271:
---

For the issue of trying to automatically determine when to do what for MERGE:
The main issue I see is that transactions and compare-and-set offer similar, 
but subtly different concurrency guarantees, 
that I believe are impossible to automatically choose for a user.

Specifically, imagine the following example:

CREATE TABLE MyTable (id INT primary key, value INT)
UPSERT INTO MyTable (id, value) values(1, 1)


Now, you have merge statement A:
MERGE INTO MyTable
USING (VALUES (1, 2)) AS NewData (id, value)
ON MyTable.id = NewData.id and value >= 1
WHEN MATCHED THEN UPDATE SET MyTable.value = MyTable.value
WHEN NOT MATCHED THEN INSERT VALUES (NewData.id, NewData.value)

And merge statement B
MERGE INTO MyTable
USING (VALUES (1, 3)) AS NewData (id, value)
ON MyTable.id = NewData.id and value >= 1
WHEN MATCHED THEN UPDATE SET MyTable.value = MyTable.value
WHEN NOT MATCHED THEN INSERT VALUES (NewData.id, NewData.value)
 

The main difference between the two is that value is 2 in the first, and value 
is 3 in the other.

Now what happens if they conflict?

In a compare-and-set world, the last one to commit will win (since the 
condition will match, no matter the order). No exception will ever be thrown.
But in a transaction world, the first one to commit will win, as a transaction 
error would be thrown when the second one is committed.

So the options seem to become either adding a special option to merge, that 
allows you to force compare-and-set like functionality, which may mean the 
grammar has some cases 
where it has features that are not supported for compare-and-set vs a 
transaction based merged, as well as having two slightly different MERGE 
statements with the only difference being a single keywords, with two subtly 
different semantics; or use the already non-standard UPSERT command as to add 
that support, with a syntax that calls out that this is a different operation. 
This would
also help differentiate UPSERT from MERGE, allowing you to do more low-level / 
different types of concurrency, as otherwise UPSERT becomes essentially a 
subset of the merge operation.

> Upsert - CheckAndPut like functionality
> ---
>
> Key: PHOENIX-2271
> URL: https://issues.apache.org/jira/browse/PHOENIX-2271
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Babar Tareen
> Attachments: patch.diff
>
>
> The Upsert statement does not support HBase's checkAndPut api, thus making it 
> difficult to conditionally update a row. Based on the comments from 
> PHOENIX-6, I have implemented such functionality. The Upsert statement is 
> modified to support compare clause, which allows us to pass in an expression. 
> The expression is evaluated against the current record and Upsert is only 
> performed when the expression evaluates to true. More details 
> [here|https://github.com/babartareen/phoenix].
> h4. Examples
> Given that the FirstName is always set for the users, create a user record if 
> one doesn't already exist.
> {code:sql}
> UPSERT INTO User (UserId, FirstName, LastName, Phone, Address, PIN) VALUES 
> (1, 'Alice', 'A', '123 456 7890', 'Some St. in a city', 1122) COMPARE 
> FirstName IS NULL;
> {code}
> Update the phone number for UserId '1' if the FirstName is set. Given that 
> the FirstName is always set for the users, this will only update the record 
> if it already exists.
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> IS NOT NULL;
> {code}
> Update the phone number if the first name for UserId '1' starts with 'Al' and 
> last name is 'A'
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> LIKE 'Al%' AND LastName = 'A';  
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (PHOENIX-2271) Upsert - CheckAndPut like functionality

2015-09-23 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905452#comment-14905452
 ] 

Cameron Hatfield edited comment on PHOENIX-2271 at 9/23/15 10:56 PM:
-

Would you be thinking of extending the standard merge statement (which, from 
what I can tell without access to the ISO doc, only supports working on direct 
table references) with non-standard support for a value constructor, ala the 
syntax I stole from TSQL? Without that, it seems relatively hard to use it for 
our use case (compare-and-set for a row).
The other minus I would see is the inability to mix and match transactions and 
compare-and-set within the same table, forcing you to have to choose one or the 
other. Though I am unsure how much sense that would make anyways.

Another issue, with either path, would be if the MERGE / UPSERT would need to 
be extended to return information on success or not of the result, similar to 
POSTGRESQLs RETURNING. Currently we don't have a requirement for this, though I 
know there are use cases for doing so.


was (Author: cameron.hatfield):
Would you be thinking of extending the standard merge statement (which, from 
what I can tell without access to the ISO doc, only supports working on direct 
table references) with non-standard support for a value constructor, ala the 
syntax I stole from TSQL? Without that, it seems relatively hard to use it for 
our use case (compare-and-set for a row).
The other minus I would see is the inability to mix and match transactions and 
compare-and-set within the same table, forcing you to have to choose one or the 
other. Though I am unsure how much sense that would make anyways.

The another issue, with either decision should, would be if the MERGE / UPSERT 
would need to be extended to return information on success or not of the 
UPSERT, similar to POSTGRESQLs RETURNING. Currently we don't have a requirement 
for this, though I know there are use cases for doing so.

> Upsert - CheckAndPut like functionality
> ---
>
> Key: PHOENIX-2271
> URL: https://issues.apache.org/jira/browse/PHOENIX-2271
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Babar Tareen
> Attachments: patch.diff
>
>
> The Upsert statement does not support HBase's checkAndPut api, thus making it 
> difficult to conditionally update a row. Based on the comments from 
> PHOENIX-6, I have implemented such functionality. The Upsert statement is 
> modified to support compare clause, which allows us to pass in an expression. 
> The expression is evaluated against the current record and Upsert is only 
> performed when the expression evaluates to true. More details 
> [here|https://github.com/babartareen/phoenix].
> h4. Examples
> Given that the FirstName is always set for the users, create a user record if 
> one doesn't already exist.
> {code:sql}
> UPSERT INTO User (UserId, FirstName, LastName, Phone, Address, PIN) VALUES 
> (1, 'Alice', 'A', '123 456 7890', 'Some St. in a city', 1122) COMPARE 
> FirstName IS NULL;
> {code}
> Update the phone number for UserId '1' if the FirstName is set. Given that 
> the FirstName is always set for the users, this will only update the record 
> if it already exists.
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> IS NOT NULL;
> {code}
> Update the phone number if the first name for UserId '1' starts with 'Al' and 
> last name is 'A'
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> LIKE 'Al%' AND LastName = 'A';  
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-2271) Upsert - CheckAndPut like functionality

2015-09-23 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905452#comment-14905452
 ] 

Cameron Hatfield commented on PHOENIX-2271:
---

Would you be thinking of extending the standard merge statement (which, from 
what I can tell without access to the ISO doc, only supports working on direct 
table references) with non-standard support for a value constructor, ala the 
syntax I stole from TSQL? Without that, it seems relatively hard to use it for 
our use case (compare-and-set for a row).
The other minus I would see is the inability to mix and match transactions and 
compare-and-set within the same table, forcing you to have to choose one or the 
other. Though I am unsure how much sense that would make anyways.

The another issue, with either decision should, would be if the MERGE / UPSERT 
would need to be extended to return information on success or not of the 
UPSERT, similar to POSTGRESQLs RETURNING. Currently we don't have a requirement 
for this, though I know there are use cases for doing so.

> Upsert - CheckAndPut like functionality
> ---
>
> Key: PHOENIX-2271
> URL: https://issues.apache.org/jira/browse/PHOENIX-2271
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Babar Tareen
> Attachments: patch.diff
>
>
> The Upsert statement does not support HBase's checkAndPut api, thus making it 
> difficult to conditionally update a row. Based on the comments from 
> PHOENIX-6, I have implemented such functionality. The Upsert statement is 
> modified to support compare clause, which allows us to pass in an expression. 
> The expression is evaluated against the current record and Upsert is only 
> performed when the expression evaluates to true. More details 
> [here|https://github.com/babartareen/phoenix].
> h4. Examples
> Given that the FirstName is always set for the users, create a user record if 
> one doesn't already exist.
> {code:sql}
> UPSERT INTO User (UserId, FirstName, LastName, Phone, Address, PIN) VALUES 
> (1, 'Alice', 'A', '123 456 7890', 'Some St. in a city', 1122) COMPARE 
> FirstName IS NULL;
> {code}
> Update the phone number for UserId '1' if the FirstName is set. Given that 
> the FirstName is always set for the users, this will only update the record 
> if it already exists.
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> IS NOT NULL;
> {code}
> Update the phone number if the first name for UserId '1' starts with 'Al' and 
> last name is 'A'
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> LIKE 'Al%' AND LastName = 'A';  
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (PHOENIX-2271) Upsert - CheckAndPut like functionality

2015-09-23 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905452#comment-14905452
 ] 

Cameron Hatfield edited comment on PHOENIX-2271 at 9/23/15 10:57 PM:
-

Would you be thinking of extending the standard merge statement (which, from 
what I can tell without access to the ISO doc, only supports working on direct 
table references) with non-standard support for a value constructor, ala the 
syntax I stole from TSQL? Without that, it seems relatively hard to use it for 
our use case (compare-and-set for a row).
The other minus I would see is the inability to mix and match transactions and 
compare-and-set within the same table, forcing you to have to choose one or the 
other. Though I am unsure how much sense that would make anyways.

Another issue, with either path, would be if the MERGE / UPSERT would need to 
be extended to return information on success or not of the result, similar to 
POSTGRESQLs RETURNING or the OUTPUT clause in TSQL. Currently we don't have a 
requirement for this, though I know there are use cases for doing so.


was (Author: cameron.hatfield):
Would you be thinking of extending the standard merge statement (which, from 
what I can tell without access to the ISO doc, only supports working on direct 
table references) with non-standard support for a value constructor, ala the 
syntax I stole from TSQL? Without that, it seems relatively hard to use it for 
our use case (compare-and-set for a row).
The other minus I would see is the inability to mix and match transactions and 
compare-and-set within the same table, forcing you to have to choose one or the 
other. Though I am unsure how much sense that would make anyways.

Another issue, with either path, would be if the MERGE / UPSERT would need to 
be extended to return information on success or not of the result, similar to 
POSTGRESQLs RETURNING. Currently we don't have a requirement for this, though I 
know there are use cases for doing so.

> Upsert - CheckAndPut like functionality
> ---
>
> Key: PHOENIX-2271
> URL: https://issues.apache.org/jira/browse/PHOENIX-2271
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Babar Tareen
> Attachments: patch.diff
>
>
> The Upsert statement does not support HBase's checkAndPut api, thus making it 
> difficult to conditionally update a row. Based on the comments from 
> PHOENIX-6, I have implemented such functionality. The Upsert statement is 
> modified to support compare clause, which allows us to pass in an expression. 
> The expression is evaluated against the current record and Upsert is only 
> performed when the expression evaluates to true. More details 
> [here|https://github.com/babartareen/phoenix].
> h4. Examples
> Given that the FirstName is always set for the users, create a user record if 
> one doesn't already exist.
> {code:sql}
> UPSERT INTO User (UserId, FirstName, LastName, Phone, Address, PIN) VALUES 
> (1, 'Alice', 'A', '123 456 7890', 'Some St. in a city', 1122) COMPARE 
> FirstName IS NULL;
> {code}
> Update the phone number for UserId '1' if the FirstName is set. Given that 
> the FirstName is always set for the users, this will only update the record 
> if it already exists.
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> IS NOT NULL;
> {code}
> Update the phone number if the first name for UserId '1' starts with 'Al' and 
> last name is 'A'
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> LIKE 'Al%' AND LastName = 'A';  
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-2271) Upsert - CheckAndPut like functionality

2015-09-22 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903074#comment-14903074
 ] 

Cameron Hatfield commented on PHOENIX-2271:
---

It seems like it would be nice to expose an underlying primitive for compare 
and set, allowing you to get to features of HBase that aren't currently 
exposed, for when you need lighter weight features, especially since compare 
and set can open up more then a couple possibilities for different types of 
distributed computations (such as the one we are trying to support)

For the upsert workaround, it seems a bit hacky to try to use udfs and a select 
upsert, instead of exposing a real primitive for it. That would also require a 
UDF call for each changed column, to compare the two versions, correct? (Don't 
have enough background with UDFs, as I haven't taken a good look at the 
underlying code yet).

For the merge statement, you would also add the RPC and implementation overhead 
of defining both a when / when not pair, vs just the values + the compare 
statement, not including complicating the optimization detection for the case 
of simple compare and set.

Of the two, the merge statement seems the most promising, but only if it: 
Supports inserting multiple values, instead of a single value (so that you get 
the advantage of doing compare and set in one statement, instead of one upsert 
per compare and set)
Supports a way to ensure you're using the compare and set, vs relying on the 
optimizer  to choose the option that does a compare and set under the hood. 
This is because it won't do the correct thing, *unless* you happened to start a 
transaction before running the statement.

The main issue I see is the second one is that requires you to basically expose 
some non-sql standard way of doing things, in a sql standard statement, vs 
exposing it in the already non-sql standard statement.

Aside from that, what is the timeline for calcite integration?


> Upsert - CheckAndPut like functionality
> ---
>
> Key: PHOENIX-2271
> URL: https://issues.apache.org/jira/browse/PHOENIX-2271
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Babar Tareen
> Attachments: patch.diff
>
>
> The Upsert statement does not support HBase's checkAndPut api, thus making it 
> difficult to conditionally update a row. Based on the comments from 
> PHOENIX-6, I have implemented such functionality. The Upsert statement is 
> modified to support compare clause, which allows us to pass in an expression. 
> The expression is evaluated against the current record and Upsert is only 
> performed when the expression evaluates to true. More details 
> [here|https://github.com/babartareen/phoenix].
> h4. Examples
> Given that the FirstName is always set for the users, create a user record if 
> one doesn't already exist.
> {code:sql}
> UPSERT INTO User (UserId, FirstName, LastName, Phone, Address, PIN) VALUES 
> (1, 'Alice', 'A', '123 456 7890', 'Some St. in a city', 1122) COMPARE 
> FirstName IS NULL;
> {code}
> Update the phone number for UserId '1' if the FirstName is set. Given that 
> the FirstName is always set for the users, this will only update the record 
> if it already exists.
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> IS NOT NULL;
> {code}
> Update the phone number if the first name for UserId '1' starts with 'Al' and 
> last name is 'A'
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> LIKE 'Al%' AND LastName = 'A';  
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-2271) Upsert - CheckAndPut like functionality

2015-09-22 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903398#comment-14903398
 ] 

Cameron Hatfield commented on PHOENIX-2271:
---

Something along on the lines of (using TSQL syntax as an example, since it was 
the first I ran across):
MERGE INTO A
   USING (VALUES ('foo', 1), ('bar', 2)) AS B (ColA, ColB)
  ON A.ColA = B.ColA
  WHEN MATCHED THEN UPDATE SET A.ColB = B.ColB
  WHEN NOT MATCHED THEN INSERT VALUES (B.ColA, B.ColB)

And then in theory you would have some type of extension to say "Use compare 
and set".

Though it still seems a bit like trying to force a square peg into a round 
hole. Yes, you can use merge to do it, but it seems like that would be more 
confusing then having a dedicated syntax for it.

Another thing that might be worth considering is a syntax related to 
postgresql's syntax they are considering, or at least their discussion of the 
issues of MERGE vs their upsert statement:
https://wiki.postgresql.org/wiki/UPSERT#SQL_MERGE_syntax

Their syntax could also be extended to support a conditional clause in the case 
of a conflict as well.

> Upsert - CheckAndPut like functionality
> ---
>
> Key: PHOENIX-2271
> URL: https://issues.apache.org/jira/browse/PHOENIX-2271
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Babar Tareen
> Attachments: patch.diff
>
>
> The Upsert statement does not support HBase's checkAndPut api, thus making it 
> difficult to conditionally update a row. Based on the comments from 
> PHOENIX-6, I have implemented such functionality. The Upsert statement is 
> modified to support compare clause, which allows us to pass in an expression. 
> The expression is evaluated against the current record and Upsert is only 
> performed when the expression evaluates to true. More details 
> [here|https://github.com/babartareen/phoenix].
> h4. Examples
> Given that the FirstName is always set for the users, create a user record if 
> one doesn't already exist.
> {code:sql}
> UPSERT INTO User (UserId, FirstName, LastName, Phone, Address, PIN) VALUES 
> (1, 'Alice', 'A', '123 456 7890', 'Some St. in a city', 1122) COMPARE 
> FirstName IS NULL;
> {code}
> Update the phone number for UserId '1' if the FirstName is set. Given that 
> the FirstName is always set for the users, this will only update the record 
> if it already exists.
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> IS NOT NULL;
> {code}
> Update the phone number if the first name for UserId '1' starts with 'Al' and 
> last name is 'A'
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> LIKE 'Al%' AND LastName = 'A';  
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (PHOENIX-2271) Upsert - CheckAndPut like functionality

2015-09-22 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903409#comment-14903409
 ] 

Cameron Hatfield edited comment on PHOENIX-2271 at 9/22/15 9:10 PM:


The other possible point of confusion is "HBase CAS is based on a single cell. 
How would you see it extended to multiple cells?"

The patch [~babartareen] has made actually does multi-cell CAS, due to how 
rowlocks are acquired in the coprocessor. So it is technically a more powerful 
extension of the built-in checkAndPut operations provided by default HBase. The 
use of coprocessors for multiCheckAndPut is vaguely mentioned in 
https://issues.apache.org/jira/browse/HBASE-8458, and the reasonable extension 
is that since I have a rowlock, why not multiple compares. Multiple cell 
checkandput are referenced in this apparently dead patch 
https://issues.apache.org/jira/browse/HBASE-11274


was (Author: cameron.hatfield):
The other possible point of confusion is "HBase CAS is based on a single cell. 
How would you see it extended to multiple cells?"

The patch [~babartareen] has made actually does multi-cell CAS, due to how 
rowlocks are acquired in the coprocessor. So it is technically a more powerful 
extension of the built-in checkAndPut operations provided by default HBase. The 
use of coprocessors for multiCheckAndPut is vaguely mentioned in 
https://issues.apache.org/jira/browse/HBASE-8458, and the reasonable extension 
is that since I have a rowlock, why not multiple compares.

> Upsert - CheckAndPut like functionality
> ---
>
> Key: PHOENIX-2271
> URL: https://issues.apache.org/jira/browse/PHOENIX-2271
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Babar Tareen
> Attachments: patch.diff
>
>
> The Upsert statement does not support HBase's checkAndPut api, thus making it 
> difficult to conditionally update a row. Based on the comments from 
> PHOENIX-6, I have implemented such functionality. The Upsert statement is 
> modified to support compare clause, which allows us to pass in an expression. 
> The expression is evaluated against the current record and Upsert is only 
> performed when the expression evaluates to true. More details 
> [here|https://github.com/babartareen/phoenix].
> h4. Examples
> Given that the FirstName is always set for the users, create a user record if 
> one doesn't already exist.
> {code:sql}
> UPSERT INTO User (UserId, FirstName, LastName, Phone, Address, PIN) VALUES 
> (1, 'Alice', 'A', '123 456 7890', 'Some St. in a city', 1122) COMPARE 
> FirstName IS NULL;
> {code}
> Update the phone number for UserId '1' if the FirstName is set. Given that 
> the FirstName is always set for the users, this will only update the record 
> if it already exists.
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> IS NOT NULL;
> {code}
> Update the phone number if the first name for UserId '1' starts with 'Al' and 
> last name is 'A'
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> LIKE 'Al%' AND LastName = 'A';  
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (PHOENIX-2271) Upsert - CheckAndPut like functionality

2015-09-22 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903409#comment-14903409
 ] 

Cameron Hatfield edited comment on PHOENIX-2271 at 9/22/15 9:10 PM:


The other possible point of confusion is "HBase CAS is based on a single cell. 
How would you see it extended to multiple cells?"

The patch [~babartareen] has made actually does multi-cell CAS, due to how 
rowlocks are acquired in the coprocessor. So it is technically a more powerful 
extension of the built-in checkAndPut operations provided by default HBase. The 
use of coprocessors for multiCheckAndPut is vaguely mentioned in 
https://issues.apache.org/jira/browse/HBASE-8458, and the reasonable extension 
is that since I have a rowlock, why not multiple compares. Multiple cell 
checkandput is referenced in this apparently dead patch 
https://issues.apache.org/jira/browse/HBASE-11274


was (Author: cameron.hatfield):
The other possible point of confusion is "HBase CAS is based on a single cell. 
How would you see it extended to multiple cells?"

The patch [~babartareen] has made actually does multi-cell CAS, due to how 
rowlocks are acquired in the coprocessor. So it is technically a more powerful 
extension of the built-in checkAndPut operations provided by default HBase. The 
use of coprocessors for multiCheckAndPut is vaguely mentioned in 
https://issues.apache.org/jira/browse/HBASE-8458, and the reasonable extension 
is that since I have a rowlock, why not multiple compares. Multiple cell 
checkandput are referenced in this apparently dead patch 
https://issues.apache.org/jira/browse/HBASE-11274

> Upsert - CheckAndPut like functionality
> ---
>
> Key: PHOENIX-2271
> URL: https://issues.apache.org/jira/browse/PHOENIX-2271
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Babar Tareen
> Attachments: patch.diff
>
>
> The Upsert statement does not support HBase's checkAndPut api, thus making it 
> difficult to conditionally update a row. Based on the comments from 
> PHOENIX-6, I have implemented such functionality. The Upsert statement is 
> modified to support compare clause, which allows us to pass in an expression. 
> The expression is evaluated against the current record and Upsert is only 
> performed when the expression evaluates to true. More details 
> [here|https://github.com/babartareen/phoenix].
> h4. Examples
> Given that the FirstName is always set for the users, create a user record if 
> one doesn't already exist.
> {code:sql}
> UPSERT INTO User (UserId, FirstName, LastName, Phone, Address, PIN) VALUES 
> (1, 'Alice', 'A', '123 456 7890', 'Some St. in a city', 1122) COMPARE 
> FirstName IS NULL;
> {code}
> Update the phone number for UserId '1' if the FirstName is set. Given that 
> the FirstName is always set for the users, this will only update the record 
> if it already exists.
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> IS NOT NULL;
> {code}
> Update the phone number if the first name for UserId '1' starts with 'Al' and 
> last name is 'A'
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> LIKE 'Al%' AND LastName = 'A';  
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (PHOENIX-2271) Upsert - CheckAndPut like functionality

2015-09-22 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903409#comment-14903409
 ] 

Cameron Hatfield edited comment on PHOENIX-2271 at 9/22/15 9:08 PM:


The other possible point of confusion is "HBase CAS is based on a single cell. 
How would you see it extended to multiple cells?"

The patch [~babartareen] has made actually does multi-cell CAS, due to how 
rowlocks are acquired in the coprocessor. So it is technically a more powerful 
extension of the built-in checkAndPut operations provided by default HBase. The 
use of coprocessors for multiCheckAndPut is vaguely mentioned in 
https://issues.apache.org/jira/browse/HBASE-8458, and the reasonable extension 
is that since I have a rowlock, why not multiple compares.


was (Author: cameron.hatfield):
The other possible point of confusion is "HBase CAS is based on a single cell. 
How would you see it extended to multiple cells?"

The patch [~babartareen] has made actually does multi-cell CAS, due to how 
rowlocks are acquired in the coprocessor. So it is technically a more powerful 
extension of the built-in checkAndPut operations provided by default HBase. 
This is vaguely mentioned in https://issues.apache.org/jira/browse/HBASE-8458

> Upsert - CheckAndPut like functionality
> ---
>
> Key: PHOENIX-2271
> URL: https://issues.apache.org/jira/browse/PHOENIX-2271
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Babar Tareen
> Attachments: patch.diff
>
>
> The Upsert statement does not support HBase's checkAndPut api, thus making it 
> difficult to conditionally update a row. Based on the comments from 
> PHOENIX-6, I have implemented such functionality. The Upsert statement is 
> modified to support compare clause, which allows us to pass in an expression. 
> The expression is evaluated against the current record and Upsert is only 
> performed when the expression evaluates to true. More details 
> [here|https://github.com/babartareen/phoenix].
> h4. Examples
> Given that the FirstName is always set for the users, create a user record if 
> one doesn't already exist.
> {code:sql}
> UPSERT INTO User (UserId, FirstName, LastName, Phone, Address, PIN) VALUES 
> (1, 'Alice', 'A', '123 456 7890', 'Some St. in a city', 1122) COMPARE 
> FirstName IS NULL;
> {code}
> Update the phone number for UserId '1' if the FirstName is set. Given that 
> the FirstName is always set for the users, this will only update the record 
> if it already exists.
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> IS NOT NULL;
> {code}
> Update the phone number if the first name for UserId '1' starts with 'Al' and 
> last name is 'A'
> {code:sql}
> UPSERT INTO User (UserId, Phone) VALUES (1, '987 654 3210') COMPARE FirstName 
> LIKE 'Al%' AND LastName = 'A';  
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)