Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2024-03-26 Thread Dean Rasheed
On Thu, 21 Mar 2024 at 09:35, Dean Rasheed  wrote:
>
> Trivial rebase forced by 6185c9737c.
>

I think it would be good to get this committed.

It has had a decent amount of review, at least up to v9, but a number
of things have changed since then:

1). Concurrent update behaviour -- now if a concurrent update causes a
matched candidate row to no longer match the join condition, it will
execute the first qualifying NOT MATCHED BY SOURCE action, and then
the first qualifying NOT MATCHED [BY TARGET] action. I.e., it may
execute 2 actions, which makes sense because if the rows had started
out not matching, the full join would have output 2 rows.

2). ResultRelInfo now has 3 lists of actions, one per match kind.
Previously I was putting the NOT MATCHED BY SOURCE actions in the same
list as the MATCHED actions, since they are both handled by
ExecMergeMatched(). However, to achieve (1) above, it turned out to be
easier to have 3 separate lists, and this makes some other code a
little neater.

3). I've added a new field Query.mergeJoinCondition so that
transformMergeStmt() no longer puts the join conditions in
qry->jointree->quals. That's necessary to make it work correctly on an
auto-updatable view which might have its own quals, but it also seems
neater anyway.

4). To distinguish the MATCHED case from NOT MATCHED BY SOURCE case in
the executor, it now uses the join condition (previously it added a
"source IS [NOT] NULL" clause to each merge action). This has the
advantage that it involves just one qual check per candidate row,
rather than one for each action. On the downside, it's checking the
join condition twice (since the source subplan's join node already
checked it), but I couldn't see an easy way round that. (It only does
this if there are both MATCHED and NOT MATCHED BY SOURCE actions, so
it's not making any existing queries worse.)

5). To support (4), I added new fields
ModifyTablePath.mergeJoinConditions, ModifyTable.mergeJoinConditions
and ResultRelInfo.ri_MergeJoinCondition, since the attribute numbers
in the join condition might vary by partition.

6). I got rid of Query.mergeSourceRelation, which is no longer needed,
because of (4). (And as before, it also gets rid of
Query.mergeUseOuterJoin, since the parser is no longer making the
decision about what kind of join to build.)

7). To support (1), I added a new field
ModifyTableState.mt_merge_pending_not_matched, because if it has to
execute 2 actions following a concurrent update, and there is a
RETURNING clause, it has to defer the second action until the next
call to ExecModifyTable().

8). I've added isolation tests to test (1).

9). I've added a lot more regression tests.

10). I've made a lot of comment changes in nodeModifyTable.c,
especially relating to the discussion around concurrent updates.

Overall, I feel like this is in pretty good shape, and it manages to
make a few code simplifications that look quite nice.

Regards,
Dean




Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2024-03-13 Thread Dean Rasheed
Rebased version attached.

Regards,
Dean
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
new file mode 100644
index f8f83d4..380d0c9
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -394,10 +394,14 @@
 conditions for each action are re-evaluated on the updated version of
 the row, starting from the first action, even if the action that had
 originally matched appears later in the list of actions.
-On the other hand, if the row is concurrently updated or deleted so
-that the join condition fails, then MERGE will
-evaluate the condition's NOT MATCHED actions next,
-and execute the first one that succeeds.
+On the other hand, if the row is concurrently updated so that the join
+condition fails, then MERGE will evaluate the
+command's NOT MATCHED BY SOURCE and
+NOT MATCHED [BY TARGET] actions next, and execute
+the first one of each kind that succeeds.
+If the row is concurrently deleted, then MERGE
+will evaluate the command's NOT MATCHED [BY TARGET]
+actions, and execute the first one that succeeds.
 If MERGE attempts an INSERT
 and a unique index is present and a duplicate row is concurrently
 inserted, then a uniqueness violation error is raised;
diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml
new file mode 100644
index e745fbd..6845f14
--- a/doc/src/sgml/ref/merge.sgml
+++ b/doc/src/sgml/ref/merge.sgml
@@ -33,7 +33,8 @@ USING dat
 and when_clause is:
 
 { WHEN MATCHED [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } |
-  WHEN NOT MATCHED [ AND condition ] THEN { merge_insert | DO NOTHING } }
+  WHEN NOT MATCHED BY SOURCE [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } |
+  WHEN NOT MATCHED [BY TARGET] [ AND condition ] THEN { merge_insert | DO NOTHING } }
 
 and merge_insert is:
 
@@ -72,7 +73,9 @@ DELETE
from data_source to
the target table
producing zero or more candidate change rows.  For each candidate change
-   row, the status of MATCHED or NOT MATCHED
+   row, the status of MATCHED,
+   NOT MATCHED BY SOURCE,
+   or NOT MATCHED [BY TARGET]
is set just once, after which WHEN clauses are evaluated
in the order specified.  For each candidate change row, the first clause to
evaluate as true is executed.  No more than one WHEN
@@ -254,18 +257,40 @@ DELETE
   At least one WHEN clause is required.
  
  
+  The WHEN clause may specify WHEN MATCHED,
+  WHEN NOT MATCHED BY SOURCE, or
+  WHEN NOT MATCHED [BY TARGET].
+  Note that the SQL standard only defines
+  WHEN MATCHED and WHEN NOT MATCHED
+  (which is defined to mean no matching target row).
+  WHEN NOT MATCHED BY SOURCE is an extension to the
+  SQL standard, as is the option to append
+  BY TARGET to WHEN NOT MATCHED, to
+  make its meaning more explicit.
+ 
+ 
   If the WHEN clause specifies WHEN MATCHED
   and the candidate change row matches a row in the
-  target table,
-  the WHEN clause is executed if the
+  data_source to a row in the
+  target table, the WHEN clause is executed if the
   condition is
   absent or it evaluates to true.
  
  
-  Conversely, if the WHEN clause specifies
-  WHEN NOT MATCHED
-  and the candidate change row does not match a row in the
-  target table,
+  If the WHEN clause specifies
+  WHEN NOT MATCHED BY SOURCE and the candidate change
+  row represents a row in the target table that does not match a row in the
+  data_source, the
+  WHEN clause is executed if the
+  condition is
+  absent or it evaluates to true.
+ 
+ 
+  If the WHEN clause specifies
+  WHEN NOT MATCHED [BY TARGET] and the candidate change
+  row represents a row in the
+  data_source that does not
+  match a row in the target table,
   the WHEN clause is executed if the
   condition is
   absent or it evaluates to true.
@@ -285,7 +310,10 @@ DELETE
  
   A condition on a WHEN MATCHED clause can refer to columns
   in both the source and the target relations. A condition on a
-  WHEN NOT MATCHED clause can only refer to columns from
+  WHEN NOT MATCHED BY SOURCE clause can only refer to
+  columns from the target relation, since by definition there is no matching
+  source row. A condition on a WHEN NOT MATCHED [BY TARGET]
+  clause can only refer to columns from
   the source relation, since by definition there is no matching target row.
   Only the system attributes from the target table are accessible.
  
@@ -409,8 +437,10 @@ DELETE
   WHEN MATCHED clause, the expression can use values
   from the original row in the target table, and values from the
   data_source row.
-  If used in a WHEN NOT MATCHED clause, the
-  expression can use values from the
+  If used in a WHEN NOT MATCHED BY SOURCE clause, the
+  

Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2024-01-29 Thread Dean Rasheed
On Fri, 26 Jan 2024 at 15:57, Alvaro Herrera  wrote:
>
> Well, firstly this is clearly a feature we want to have, even though
> it's non-standard, because people use it and other implementations have
> it.  (Eh, so maybe somebody should be talking to the SQL standard
> committee about it).  As for code quality, I didn't do a comprehensive
> review, but I think it is quite reasonable.  Therefore, my inclination
> would be to get it committed soonish, and celebrate it widely so that
> people can test it soon and complain if they see something they don't
> like.
>

Thanks. I have been going over this patch again, and for the most
part, I'm pretty happy with it.

One thing that's bothering me though is what happens if a row being
merged is concurrently updated. Specifically, if a concurrent update
causes a formerly matching row to no longer match the join condition,
and there are both NOT MATCHED BY SOURCE and NOT MATCHED BY TARGET
actions, so that it's doing in full join between the source and target
relations. In this case, when the EPQ mechanism rescans the subplan
node, there will be 2 possible output tuples (one with source null,
and one with target null), and EvalPlanQual() will just return the
first one, which is a more-or-less arbitrary choice, depending on the
type of join (hash/merge), and (for a mergejoin) the values of the
inner and outer join keys. Thus, it may execute a NOT MATCHED BY
SOURCE action, or a NOT MATCHED BY TARGET action, and it's difficult
to predict which.

Arguably it's not worth worrying too much about what happens in a
corner-case concurrent update like this, when MERGE is already
inconsistent under other concurrent update scenarios, but I don't like
having unpredictable results like this, which can depend on the plan
chosen.

I think the best (and probably simplest) solution is to always opt for
a NOT MATCHED BY TARGET action in this case, so then the result is
predictable, and we can document what is expected to happen.

Regards,
Dean




Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2024-01-26 Thread Alvaro Herrera
On 2024-Jan-26, Dean Rasheed wrote:

> I think it has had a decent amount of review and all the review
> comments have been addressed. I'm not quite sure from Alvaro's last
> comment whether he was implying that he thought it was ready for
> commit.

Well, firstly this is clearly a feature we want to have, even though
it's non-standard, because people use it and other implementations have
it.  (Eh, so maybe somebody should be talking to the SQL standard
committee about it).  As for code quality, I didn't do a comprehensive
review, but I think it is quite reasonable.  Therefore, my inclination
would be to get it committed soonish, and celebrate it widely so that
people can test it soon and complain if they see something they don't
like.

I have to say that I find the idea of booting patches as Returned with
Feedback just because of inactivity (as opposed to unresponsive authors)
rather wrong-headed, and I wish we didn't do it.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2024-01-26 Thread Dean Rasheed
n Fri, 26 Jan 2024 at 14:59, vignesh C  wrote:
>
> CFBot shows that the patch does not apply anymore as in [1]:
>

Rebased version attached.

Regards,
Dean
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
new file mode 100644
index f8f83d4..6ef0c2b
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -396,8 +396,8 @@
 originally matched appears later in the list of actions.
 On the other hand, if the row is concurrently updated or deleted so
 that the join condition fails, then MERGE will
-evaluate the condition's NOT MATCHED actions next,
-and execute the first one that succeeds.
+evaluate the condition's NOT MATCHED [BY TARGET]
+actions next, and execute the first one that succeeds.
 If MERGE attempts an INSERT
 and a unique index is present and a duplicate row is concurrently
 inserted, then a uniqueness violation error is raised;
diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml
new file mode 100644
index 655f7dc..f421716
--- a/doc/src/sgml/ref/merge.sgml
+++ b/doc/src/sgml/ref/merge.sgml
@@ -33,7 +33,8 @@ USING dat
 and when_clause is:
 
 { WHEN MATCHED [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } |
-  WHEN NOT MATCHED [ AND condition ] THEN { merge_insert | DO NOTHING } }
+  WHEN NOT MATCHED BY SOURCE [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } |
+  WHEN NOT MATCHED [BY TARGET] [ AND condition ] THEN { merge_insert | DO NOTHING } }
 
 and merge_insert is:
 
@@ -70,7 +71,9 @@ DELETE
from data_source to
target_table_name
producing zero or more candidate change rows.  For each candidate change
-   row, the status of MATCHED or NOT MATCHED
+   row, the status of MATCHED,
+   NOT MATCHED BY SOURCE,
+   or NOT MATCHED [BY TARGET]
is set just once, after which WHEN clauses are evaluated
in the order specified.  For each candidate change row, the first clause to
evaluate as true is executed.  No more than one WHEN
@@ -226,16 +229,37 @@ DELETE
   At least one WHEN clause is required.
  
  
+  The WHEN clause may specify WHEN MATCHED,
+  WHEN NOT MATCHED BY SOURCE, or
+  WHEN NOT MATCHED [BY TARGET].
+  Note that the SQL standard only defines
+  WHEN MATCHED and WHEN NOT MATCHED
+  (which is defined to mean no matching target row).
+  WHEN NOT MATCHED BY SOURCE is an extension to the
+  SQL standard, as is the option to append
+  BY TARGET to WHEN NOT MATCHED, to
+  make its meaning more explicit.
+ 
+ 
   If the WHEN clause specifies WHEN MATCHED
-  and the candidate change row matches a row in the
+  and the candidate change row matches a row in the source to a row in the
   target_table_name,
   the WHEN clause is executed if the
   condition is
   absent or it evaluates to true.
  
  
-  Conversely, if the WHEN clause specifies
-  WHEN NOT MATCHED
+  If the WHEN clause specifies
+  WHEN NOT MATCHED BY SOURCE and the candidate change
+  row represents a row in the
+  target_table_name that does
+  not match a source row, the WHEN clause is executed
+  if the condition is
+  absent or it evaluates to true.
+ 
+ 
+  If the WHEN clause specifies
+  WHEN NOT MATCHED [BY TARGET]
   and the candidate change row does not match a row in the
   target_table_name,
   the WHEN clause is executed if the
@@ -257,7 +281,10 @@ DELETE
  
   A condition on a WHEN MATCHED clause can refer to columns
   in both the source and the target relations. A condition on a
-  WHEN NOT MATCHED clause can only refer to columns from
+  WHEN NOT MATCHED BY SOURCE clause can only refer to
+  columns from the target relation, since by definition there is no matching
+  source row. A condition on a WHEN NOT MATCHED [BY TARGET]
+  clause can only refer to columns from
   the source relation, since by definition there is no matching target row.
   Only the system attributes from the target table are accessible.
  
@@ -382,8 +409,10 @@ DELETE
   WHEN MATCHED clause, the expression can use values
   from the original row in the target table, and values from the
   data_source row.
-  If used in a WHEN NOT MATCHED clause, the
-  expression can use values from the data_source.
+  If used in a WHEN NOT MATCHED BY SOURCE clause, the
+  expression can only use values from the original row in the target table.
+  If used in a WHEN NOT MATCHED [BY TARGET] clause, the
+  expression can only use values from the data_source.
  
 

@@ -452,8 +481,9 @@ MERGE tot

 
  
-  Evaluate whether each row is MATCHED or
-  NOT MATCHED.
+  Evaluate whether each row is MATCHED,
+  NOT MATCHED BY SOURCE, or
+  NOT MATCHED [BY TARGET].
  
 
 
@@ -528,7 +558,8 @@ MERGE tot
   
If a 

Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2024-01-26 Thread Dean Rasheed
On Mon, 22 Jan 2024 at 02:10, Peter Smith  wrote:
>
> Hi, this patch was marked in CF as "Needs Review" [1], but there has
> been no activity on this thread for 6+ months.
>
> Is anything else planned? Can you post something to elicit more
> interest in the latest patch? Otherwise, if nothing happens then the
> CF entry will be closed ("Returned with feedback") at the end of this
> CF.
>

I think it has had a decent amount of review and all the review
comments have been addressed. I'm not quite sure from Alvaro's last
comment whether he was implying that he thought it was ready for
commit.

Looking back through the thread, the general sentiment seems to be in
favour of adding this feature, and I still think it's worth doing, but
I haven't managed to find much time to progress it recently.

Regards,
Dean




Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2024-01-26 Thread vignesh C
On Sat, 1 Jul 2023 at 18:04, Dean Rasheed  wrote:
>
> On Tue, 21 Mar 2023 at 12:26, Alvaro Herrera  wrote:
> >
> > On 2023-Mar-21, Dean Rasheed wrote:
> >
> > > Looking at it with fresh eyes though, I realise that I could have just 
> > > written
> > >
> > > action->qual = make_and_qual((Node *) ntest, action->qual);
> > >
> > > which is equivalent, but more concise.
> >
> > Nice.
> >
> > I have no further observations about this patch.
> >
>
> Looking at this one afresh, it seems that the change to make Vars
> outer-join aware broke it -- the Var in the qual to test whether the
> source row is null needs to be marked as nullable by the join added by
> transform_MERGE_to_join(). That's something that needs to be done in
> transform_MERGE_to_join(), so it makes more sense to add the new qual
> there rather than in transformMergeStmt().
>
> Also, now that MERGE has ruleutils support, it's clear that adding the
> qual in transformMergeStmt() isn't right anyway, since it would then
> appear in the deparsed output.
>
> So attached is an updated patch doing that, which seems neater all
> round, since adding the qual is closely related to the join-type
> choice, which is now a decision taken entirely in
> transform_MERGE_to_join(). This requires a new "mergeSourceRelation"
> field on the Query structure, but as before, it does away with the
> "mergeUseOuterJoin" field.
>
> I've also updated the ruleutils support. In the absence of any WHEN
> NOT MATCHED BY SOURCE actions, this will output not-matched actions
> simply as "WHEN NOT MATCHED" for backwards compatibility, and to be
> SQL-standard-compliant. If there are any WHEN NOT MATCHED BY SOURCE
> actions though, I think it's preferable to output explicit "BY SOURCE"
> and "BY TARGET" qualifiers for all not-matched actions, to make the
> meaning clearer.

CFBot shows that the patch does not apply anymore as in [1]:
=== Applying patches on top of PostgreSQL commit ID
f2bf8fb04886e3ea82e7f7f86696ac78e06b7e60 ===
=== applying patch ./support-merge-when-not-matched-by-source-v8.patch
...
patching file doc/src/sgml/ref/merge.sgml
Hunk #5 FAILED at 409.
Hunk #9 FAILED at 673.
2 out of 9 hunks FAILED -- saving rejects to file
doc/src/sgml/ref/merge.sgml.rej
..
patching file src/include/nodes/parsenodes.h
Hunk #1 succeeded at 175 (offset -8 lines).
Hunk #2 succeeded at 1657 (offset -6 lines).
Hunk #3 succeeded at 1674 (offset -6 lines).
Hunk #4 FAILED at 1696.
1 out of 4 hunks FAILED -- saving rejects to file
src/include/nodes/parsenodes.h.rej

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_4092.log

Regards,
Vignesh




Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2024-01-21 Thread Peter Smith
Hi, this patch was marked in CF as "Needs Review" [1], but there has
been no activity on this thread for 6+ months.

Is anything else planned? Can you post something to elicit more
interest in the latest patch? Otherwise, if nothing happens then the
CF entry will be closed ("Returned with feedback") at the end of this
CF.

==
[1]  https://commitfest.postgresql.org/46/4092/

Kind Regards,
Peter Smith.




Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2023-07-01 Thread Dean Rasheed
On Tue, 21 Mar 2023 at 12:26, Alvaro Herrera  wrote:
>
> On 2023-Mar-21, Dean Rasheed wrote:
>
> > Looking at it with fresh eyes though, I realise that I could have just 
> > written
> >
> > action->qual = make_and_qual((Node *) ntest, action->qual);
> >
> > which is equivalent, but more concise.
>
> Nice.
>
> I have no further observations about this patch.
>

Looking at this one afresh, it seems that the change to make Vars
outer-join aware broke it -- the Var in the qual to test whether the
source row is null needs to be marked as nullable by the join added by
transform_MERGE_to_join(). That's something that needs to be done in
transform_MERGE_to_join(), so it makes more sense to add the new qual
there rather than in transformMergeStmt().

Also, now that MERGE has ruleutils support, it's clear that adding the
qual in transformMergeStmt() isn't right anyway, since it would then
appear in the deparsed output.

So attached is an updated patch doing that, which seems neater all
round, since adding the qual is closely related to the join-type
choice, which is now a decision taken entirely in
transform_MERGE_to_join(). This requires a new "mergeSourceRelation"
field on the Query structure, but as before, it does away with the
"mergeUseOuterJoin" field.

I've also updated the ruleutils support. In the absence of any WHEN
NOT MATCHED BY SOURCE actions, this will output not-matched actions
simply as "WHEN NOT MATCHED" for backwards compatibility, and to be
SQL-standard-compliant. If there are any WHEN NOT MATCHED BY SOURCE
actions though, I think it's preferable to output explicit "BY SOURCE"
and "BY TARGET" qualifiers for all not-matched actions, to make the
meaning clearer.

Regards,
Dean
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
new file mode 100644
index f8f83d4..6ef0c2b
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -396,8 +396,8 @@
 originally matched appears later in the list of actions.
 On the other hand, if the row is concurrently updated or deleted so
 that the join condition fails, then MERGE will
-evaluate the condition's NOT MATCHED actions next,
-and execute the first one that succeeds.
+evaluate the condition's NOT MATCHED [BY TARGET]
+actions next, and execute the first one that succeeds.
 If MERGE attempts an INSERT
 and a unique index is present and a duplicate row is concurrently
 inserted, then a uniqueness violation error is raised;
diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml
new file mode 100644
index 0995fe0..8ef121a
--- a/doc/src/sgml/ref/merge.sgml
+++ b/doc/src/sgml/ref/merge.sgml
@@ -33,7 +33,8 @@ USING dat
 and when_clause is:
 
 { WHEN MATCHED [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } |
-  WHEN NOT MATCHED [ AND condition ] THEN { merge_insert | DO NOTHING } }
+  WHEN NOT MATCHED BY SOURCE [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } |
+  WHEN NOT MATCHED [BY TARGET] [ AND condition ] THEN { merge_insert | DO NOTHING } }
 
 and merge_insert is:
 
@@ -70,7 +71,9 @@ DELETE
from data_source to
target_table_name
producing zero or more candidate change rows.  For each candidate change
-   row, the status of MATCHED or NOT MATCHED
+   row, the status of MATCHED,
+   NOT MATCHED BY SOURCE,
+   or NOT MATCHED [BY TARGET]
is set just once, after which WHEN clauses are evaluated
in the order specified.  For each candidate change row, the first clause to
evaluate as true is executed.  No more than one WHEN
@@ -226,16 +229,37 @@ DELETE
   At least one WHEN clause is required.
  
  
+  The WHEN clause may specify WHEN MATCHED,
+  WHEN NOT MATCHED BY SOURCE, or
+  WHEN NOT MATCHED [BY TARGET].
+  Note that the SQL standard only defines
+  WHEN MATCHED and WHEN NOT MATCHED
+  (which is defined to mean no matching target row).
+  WHEN NOT MATCHED BY SOURCE is an extension to the
+  SQL standard, as is the option to append
+  BY TARGET to WHEN NOT MATCHED, to
+  make its meaning more explicit.
+ 
+ 
   If the WHEN clause specifies WHEN MATCHED
-  and the candidate change row matches a row in the
+  and the candidate change row matches a row in the source to a row in the
   target_table_name,
   the WHEN clause is executed if the
   condition is
   absent or it evaluates to true.
  
  
-  Conversely, if the WHEN clause specifies
-  WHEN NOT MATCHED
+  If the WHEN clause specifies
+  WHEN NOT MATCHED BY SOURCE and the candidate change
+  row represents a row in the
+  target_table_name that does
+  not match a source row, the WHEN clause is executed
+  if the condition is
+  absent or it evaluates to true.
+ 
+ 
+  If the WHEN clause specifies
+  WHEN NOT MATCHED [BY TARGET]
   and the candidate change row does not match a row in the
   target_table_name,
   the WHEN 

Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2023-03-21 Thread Alvaro Herrera
On 2023-Mar-21, Dean Rasheed wrote:

> Looking at it with fresh eyes though, I realise that I could have just written
> 
> action->qual = make_and_qual((Node *) ntest, action->qual);
> 
> which is equivalent, but more concise.

Nice.

I have no further observations about this patch.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2023-03-21 Thread Dean Rasheed
On Tue, 21 Mar 2023 at 10:28, Alvaro Herrera  wrote:
>
> > + /* Combine it with the action's WHEN condition */
> > + if (action->qual == NULL)
> > + action->qual = (Node *) ntest;
> > + else
> > + action->qual =
> > + (Node *) makeBoolExpr(AND_EXPR,
> > +   
> > list_make2(ntest, action->qual),
> > +   
> > -1);
>
> Hmm, I think ->qual is already in implicit-and form, so do you really
> need to makeBoolExpr, or would it be sufficient to append this new
> condition to the list?
>

No, this has come directly from transformWhereClause() in the parser,
so it's an expression tree, not a list. Transforming to implicit-and
form doesn't happen until later.

Looking at it with fresh eyes though, I realise that I could have just written

action->qual = make_and_qual((Node *) ntest, action->qual);

which is equivalent, but more concise.

Regards,
Dean




Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2023-03-21 Thread Alvaro Herrera
On 2023-Mar-19, Dean Rasheed wrote:

> diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
> new file mode 100644
> index efe88cc..e1ebc8d
> --- a/src/backend/parser/gram.y
> +++ b/src/backend/parser/gram.y

> +merge_when_tgt_matched:
> + WHEN MATCHED{ $$ = 
> MERGE_WHEN_MATCHED; }
> + | WHEN NOT MATCHED BY SOURCE{ $$ = 
> MERGE_WHEN_NOT_MATCHED_BY_SOURCE; }
> + ;

I think a one-line comment on why this "matched" production matches "NOT
MATCHED BY" would be useful.  I think you have a big one in
transformMergeStmt already.


> + /* Combine it with the action's WHEN condition */
> + if (action->qual == NULL)
> + action->qual = (Node *) ntest;
> + else
> + action->qual =
> + (Node *) makeBoolExpr(AND_EXPR,
> + 
>   list_make2(ntest, action->qual),
> + 
>   -1);

Hmm, I think ->qual is already in implicit-and form, so do you really
need to makeBoolExpr, or would it be sufficient to append this new
condition to the list?

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"El miedo atento y previsor es la madre de la seguridad" (E. Burke)




Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2023-03-19 Thread Dean Rasheed
I see the PlaceHolderVar issue turned out to be a pre-existing bug after all.
Rebased version attached.

Regards,
Dean
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
new file mode 100644
index b87ad5c..1482ede
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -396,8 +396,8 @@
 originally matched appears later in the list of actions.
 On the other hand, if the row is concurrently updated or deleted so
 that the join condition fails, then MERGE will
-evaluate the condition's NOT MATCHED actions next,
-and execute the first one that succeeds.
+evaluate the condition's NOT MATCHED [BY TARGET]
+actions next, and execute the first one that succeeds.
 If MERGE attempts an INSERT
 and a unique index is present and a duplicate row is concurrently
 inserted, then a uniqueness violation error is raised;
diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml
new file mode 100644
index 0995fe0..8ef121a
--- a/doc/src/sgml/ref/merge.sgml
+++ b/doc/src/sgml/ref/merge.sgml
@@ -33,7 +33,8 @@ USING dat
 and when_clause is:
 
 { WHEN MATCHED [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } |
-  WHEN NOT MATCHED [ AND condition ] THEN { merge_insert | DO NOTHING } }
+  WHEN NOT MATCHED BY SOURCE [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } |
+  WHEN NOT MATCHED [BY TARGET] [ AND condition ] THEN { merge_insert | DO NOTHING } }
 
 and merge_insert is:
 
@@ -70,7 +71,9 @@ DELETE
from data_source to
target_table_name
producing zero or more candidate change rows.  For each candidate change
-   row, the status of MATCHED or NOT MATCHED
+   row, the status of MATCHED,
+   NOT MATCHED BY SOURCE,
+   or NOT MATCHED [BY TARGET]
is set just once, after which WHEN clauses are evaluated
in the order specified.  For each candidate change row, the first clause to
evaluate as true is executed.  No more than one WHEN
@@ -226,16 +229,37 @@ DELETE
   At least one WHEN clause is required.
  
  
+  The WHEN clause may specify WHEN MATCHED,
+  WHEN NOT MATCHED BY SOURCE, or
+  WHEN NOT MATCHED [BY TARGET].
+  Note that the SQL standard only defines
+  WHEN MATCHED and WHEN NOT MATCHED
+  (which is defined to mean no matching target row).
+  WHEN NOT MATCHED BY SOURCE is an extension to the
+  SQL standard, as is the option to append
+  BY TARGET to WHEN NOT MATCHED, to
+  make its meaning more explicit.
+ 
+ 
   If the WHEN clause specifies WHEN MATCHED
-  and the candidate change row matches a row in the
+  and the candidate change row matches a row in the source to a row in the
   target_table_name,
   the WHEN clause is executed if the
   condition is
   absent or it evaluates to true.
  
  
-  Conversely, if the WHEN clause specifies
-  WHEN NOT MATCHED
+  If the WHEN clause specifies
+  WHEN NOT MATCHED BY SOURCE and the candidate change
+  row represents a row in the
+  target_table_name that does
+  not match a source row, the WHEN clause is executed
+  if the condition is
+  absent or it evaluates to true.
+ 
+ 
+  If the WHEN clause specifies
+  WHEN NOT MATCHED [BY TARGET]
   and the candidate change row does not match a row in the
   target_table_name,
   the WHEN clause is executed if the
@@ -257,7 +281,10 @@ DELETE
  
   A condition on a WHEN MATCHED clause can refer to columns
   in both the source and the target relations. A condition on a
-  WHEN NOT MATCHED clause can only refer to columns from
+  WHEN NOT MATCHED BY SOURCE clause can only refer to
+  columns from the target relation, since by definition there is no matching
+  source row. A condition on a WHEN NOT MATCHED [BY TARGET]
+  clause can only refer to columns from
   the source relation, since by definition there is no matching target row.
   Only the system attributes from the target table are accessible.
  
@@ -382,8 +409,10 @@ DELETE
   WHEN MATCHED clause, the expression can use values
   from the original row in the target table, and values from the
   data_source row.
-  If used in a WHEN NOT MATCHED clause, the
-  expression can use values from the data_source.
+  If used in a WHEN NOT MATCHED BY SOURCE clause, the
+  expression can only use values from the original row in the target table.
+  If used in a WHEN NOT MATCHED [BY TARGET] clause, the
+  expression can only use values from the data_source.
  
 

@@ -452,8 +481,9 @@ MERGE tot

 
  
-  Evaluate whether each row is MATCHED or
-  NOT MATCHED.
+  Evaluate whether each row is MATCHED,
+  NOT MATCHED BY SOURCE, or
+  NOT MATCHED [BY TARGET].
  
 
 
@@ -528,7 +558,8 @@ MERGE tot
   
If a WHEN clause omits an AND

Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2023-02-25 Thread Vik Fearing

On 1/4/23 12:57, Alvaro Herrera wrote:

I haven't read this patch other than superficially; I suppose the
feature it's introducing is an OK one to have as an extension to the
standard.  (I hope the community members that are committee members
will propose this extension to become part of the standard.)



I have been doing some research on this, reading the original papers 
that introduced the feature and its improvements.


I don't see anything that ever considered what this patch proposes, even 
though SQL Server has it. (The initial MERGE didn't even have DELETE!)


SOURCE and TARGET are not currently keywords, but the only things that 
can come after MATCHED are THEN and AND, so I don't foresee any issues 
with us implementing this before the committee accepts such a change 
proposal.  I also don't see how the committee could possibly change the 
semantics of this, and two implementations having it is a good argument 
for getting it in.


We should be cautious in doing something differently from SQL Server 
here, and I would appreciate any differences being brought to my 
attention so I can incorporate them into a specification, even if that 
means resorting to the hated "implementation-defined".

--
Vik Fearing





Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2023-02-07 Thread Dean Rasheed
On Sat, 21 Jan 2023 at 14:18, Ted Yu  wrote:
>
> On Sat, Jan 21, 2023 at 3:05 AM Dean Rasheed  wrote:
>>
>> Rebased version, following 8eba3e3f02 and 5d29d525ff.
>>

Another rebased version attached.

> In transform_MERGE_to_join :
>
> +   if (action->matchKind == 
> MERGE_WHEN_NOT_MATCHED_BY_SOURCE)
> +   tgt_only_tuples = true;
> +   if (action->matchKind == 
> MERGE_WHEN_NOT_MATCHED_BY_TARGET)
>
> There should be an `else` in front of the second `if`.
> When tgt_only_tuples and src_only_tuples are both true, we can come out of 
> the loop.
>

I decided not to do that. Adding an "else" doesn't change the code
that the compiler generates, and IMO it's slightly more readable
without it, since it keeps the line length shorter, and the test
conditions aligned, but that's a matter of opinion / personal
preference.

I think adding extra logic to exit the loop early if both
tgt_only_tuples and src_only_tuples are true would be a premature
optimisation, increasing the code size for no real benefit. In
practice, there are unlikely to be more than a few merge actions in
the list.

Regards,
Dean
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
new file mode 100644
index b87ad5c..1482ede
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -396,8 +396,8 @@
 originally matched appears later in the list of actions.
 On the other hand, if the row is concurrently updated or deleted so
 that the join condition fails, then MERGE will
-evaluate the condition's NOT MATCHED actions next,
-and execute the first one that succeeds.
+evaluate the condition's NOT MATCHED [BY TARGET]
+actions next, and execute the first one that succeeds.
 If MERGE attempts an INSERT
 and a unique index is present and a duplicate row is concurrently
 inserted, then a uniqueness violation error is raised;
diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml
new file mode 100644
index 0995fe0..8ef121a
--- a/doc/src/sgml/ref/merge.sgml
+++ b/doc/src/sgml/ref/merge.sgml
@@ -33,7 +33,8 @@ USING dat
 and when_clause is:
 
 { WHEN MATCHED [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } |
-  WHEN NOT MATCHED [ AND condition ] THEN { merge_insert | DO NOTHING } }
+  WHEN NOT MATCHED BY SOURCE [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } |
+  WHEN NOT MATCHED [BY TARGET] [ AND condition ] THEN { merge_insert | DO NOTHING } }
 
 and merge_insert is:
 
@@ -70,7 +71,9 @@ DELETE
from data_source to
target_table_name
producing zero or more candidate change rows.  For each candidate change
-   row, the status of MATCHED or NOT MATCHED
+   row, the status of MATCHED,
+   NOT MATCHED BY SOURCE,
+   or NOT MATCHED [BY TARGET]
is set just once, after which WHEN clauses are evaluated
in the order specified.  For each candidate change row, the first clause to
evaluate as true is executed.  No more than one WHEN
@@ -226,16 +229,37 @@ DELETE
   At least one WHEN clause is required.
  
  
+  The WHEN clause may specify WHEN MATCHED,
+  WHEN NOT MATCHED BY SOURCE, or
+  WHEN NOT MATCHED [BY TARGET].
+  Note that the SQL standard only defines
+  WHEN MATCHED and WHEN NOT MATCHED
+  (which is defined to mean no matching target row).
+  WHEN NOT MATCHED BY SOURCE is an extension to the
+  SQL standard, as is the option to append
+  BY TARGET to WHEN NOT MATCHED, to
+  make its meaning more explicit.
+ 
+ 
   If the WHEN clause specifies WHEN MATCHED
-  and the candidate change row matches a row in the
+  and the candidate change row matches a row in the source to a row in the
   target_table_name,
   the WHEN clause is executed if the
   condition is
   absent or it evaluates to true.
  
  
-  Conversely, if the WHEN clause specifies
-  WHEN NOT MATCHED
+  If the WHEN clause specifies
+  WHEN NOT MATCHED BY SOURCE and the candidate change
+  row represents a row in the
+  target_table_name that does
+  not match a source row, the WHEN clause is executed
+  if the condition is
+  absent or it evaluates to true.
+ 
+ 
+  If the WHEN clause specifies
+  WHEN NOT MATCHED [BY TARGET]
   and the candidate change row does not match a row in the
   target_table_name,
   the WHEN clause is executed if the
@@ -257,7 +281,10 @@ DELETE
  
   A condition on a WHEN MATCHED clause can refer to columns
   in both the source and the target relations. A condition on a
-  WHEN NOT MATCHED clause can only refer to columns from
+  WHEN NOT MATCHED BY SOURCE clause can only refer to
+  columns from the target relation, since by definition there is no matching
+  source row. A condition on a WHEN NOT MATCHED [BY TARGET]
+  clause can only refer to columns from
   the source relation, since by 

Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2023-01-21 Thread Ted Yu
On Sat, Jan 21, 2023 at 3:05 AM Dean Rasheed 
wrote:

> On Tue, 10 Jan 2023 at 14:43, Dean Rasheed 
> wrote:
> >
> > Rebased version attached.
> >
>
> Rebased version, following 8eba3e3f02 and 5d29d525ff.
>
> Regards,
> Dean
>
Hi,
In transform_MERGE_to_join :

+   if (action->matchKind ==
MERGE_WHEN_NOT_MATCHED_BY_SOURCE)
+   tgt_only_tuples = true;
+   if (action->matchKind ==
MERGE_WHEN_NOT_MATCHED_BY_TARGET)

There should be an `else` in front of the second `if`.
When tgt_only_tuples and src_only_tuples are both true, we can come out of
the loop.

Cheers


Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2023-01-21 Thread Dean Rasheed
On Tue, 10 Jan 2023 at 14:43, Dean Rasheed  wrote:
>
> Rebased version attached.
>

Rebased version, following 8eba3e3f02 and 5d29d525ff.

Regards,
Dean
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
new file mode 100644
index b87ad5c..1482ede
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -396,8 +396,8 @@
 originally matched appears later in the list of actions.
 On the other hand, if the row is concurrently updated or deleted so
 that the join condition fails, then MERGE will
-evaluate the condition's NOT MATCHED actions next,
-and execute the first one that succeeds.
+evaluate the condition's NOT MATCHED [BY TARGET]
+actions next, and execute the first one that succeeds.
 If MERGE attempts an INSERT
 and a unique index is present and a duplicate row is concurrently
 inserted, then a uniqueness violation error is raised;
diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml
new file mode 100644
index 0995fe0..8ef121a
--- a/doc/src/sgml/ref/merge.sgml
+++ b/doc/src/sgml/ref/merge.sgml
@@ -33,7 +33,8 @@ USING dat
 and when_clause is:
 
 { WHEN MATCHED [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } |
-  WHEN NOT MATCHED [ AND condition ] THEN { merge_insert | DO NOTHING } }
+  WHEN NOT MATCHED BY SOURCE [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } |
+  WHEN NOT MATCHED [BY TARGET] [ AND condition ] THEN { merge_insert | DO NOTHING } }
 
 and merge_insert is:
 
@@ -70,7 +71,9 @@ DELETE
from data_source to
target_table_name
producing zero or more candidate change rows.  For each candidate change
-   row, the status of MATCHED or NOT MATCHED
+   row, the status of MATCHED,
+   NOT MATCHED BY SOURCE,
+   or NOT MATCHED [BY TARGET]
is set just once, after which WHEN clauses are evaluated
in the order specified.  For each candidate change row, the first clause to
evaluate as true is executed.  No more than one WHEN
@@ -226,16 +229,37 @@ DELETE
   At least one WHEN clause is required.
  
  
+  The WHEN clause may specify WHEN MATCHED,
+  WHEN NOT MATCHED BY SOURCE, or
+  WHEN NOT MATCHED [BY TARGET].
+  Note that the SQL standard only defines
+  WHEN MATCHED and WHEN NOT MATCHED
+  (which is defined to mean no matching target row).
+  WHEN NOT MATCHED BY SOURCE is an extension to the
+  SQL standard, as is the option to append
+  BY TARGET to WHEN NOT MATCHED, to
+  make its meaning more explicit.
+ 
+ 
   If the WHEN clause specifies WHEN MATCHED
-  and the candidate change row matches a row in the
+  and the candidate change row matches a row in the source to a row in the
   target_table_name,
   the WHEN clause is executed if the
   condition is
   absent or it evaluates to true.
  
  
-  Conversely, if the WHEN clause specifies
-  WHEN NOT MATCHED
+  If the WHEN clause specifies
+  WHEN NOT MATCHED BY SOURCE and the candidate change
+  row represents a row in the
+  target_table_name that does
+  not match a source row, the WHEN clause is executed
+  if the condition is
+  absent or it evaluates to true.
+ 
+ 
+  If the WHEN clause specifies
+  WHEN NOT MATCHED [BY TARGET]
   and the candidate change row does not match a row in the
   target_table_name,
   the WHEN clause is executed if the
@@ -257,7 +281,10 @@ DELETE
  
   A condition on a WHEN MATCHED clause can refer to columns
   in both the source and the target relations. A condition on a
-  WHEN NOT MATCHED clause can only refer to columns from
+  WHEN NOT MATCHED BY SOURCE clause can only refer to
+  columns from the target relation, since by definition there is no matching
+  source row. A condition on a WHEN NOT MATCHED [BY TARGET]
+  clause can only refer to columns from
   the source relation, since by definition there is no matching target row.
   Only the system attributes from the target table are accessible.
  
@@ -382,8 +409,10 @@ DELETE
   WHEN MATCHED clause, the expression can use values
   from the original row in the target table, and values from the
   data_source row.
-  If used in a WHEN NOT MATCHED clause, the
-  expression can use values from the data_source.
+  If used in a WHEN NOT MATCHED BY SOURCE clause, the
+  expression can only use values from the original row in the target table.
+  If used in a WHEN NOT MATCHED [BY TARGET] clause, the
+  expression can only use values from the data_source.
  
 

@@ -452,8 +481,9 @@ MERGE tot

 
  
-  Evaluate whether each row is MATCHED or
-  NOT MATCHED.
+  Evaluate whether each row is MATCHED,
+  NOT MATCHED BY SOURCE, or
+  NOT MATCHED [BY TARGET].
  
 
 
@@ -528,7 +558,8 @@ MERGE tot
   
If a WHEN 

Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2023-01-10 Thread Dean Rasheed
Rebased version attached.

Regards,
Dean
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
new file mode 100644
index b87ad5c..1482ede
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -396,8 +396,8 @@
 originally matched appears later in the list of actions.
 On the other hand, if the row is concurrently updated or deleted so
 that the join condition fails, then MERGE will
-evaluate the condition's NOT MATCHED actions next,
-and execute the first one that succeeds.
+evaluate the condition's NOT MATCHED [BY TARGET]
+actions next, and execute the first one that succeeds.
 If MERGE attempts an INSERT
 and a unique index is present and a duplicate row is concurrently
 inserted, then a uniqueness violation error is raised;
diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml
new file mode 100644
index 0995fe0..8ef121a
--- a/doc/src/sgml/ref/merge.sgml
+++ b/doc/src/sgml/ref/merge.sgml
@@ -33,7 +33,8 @@ USING dat
 and when_clause is:
 
 { WHEN MATCHED [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } |
-  WHEN NOT MATCHED [ AND condition ] THEN { merge_insert | DO NOTHING } }
+  WHEN NOT MATCHED BY SOURCE [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } |
+  WHEN NOT MATCHED [BY TARGET] [ AND condition ] THEN { merge_insert | DO NOTHING } }
 
 and merge_insert is:
 
@@ -70,7 +71,9 @@ DELETE
from data_source to
target_table_name
producing zero or more candidate change rows.  For each candidate change
-   row, the status of MATCHED or NOT MATCHED
+   row, the status of MATCHED,
+   NOT MATCHED BY SOURCE,
+   or NOT MATCHED [BY TARGET]
is set just once, after which WHEN clauses are evaluated
in the order specified.  For each candidate change row, the first clause to
evaluate as true is executed.  No more than one WHEN
@@ -226,16 +229,37 @@ DELETE
   At least one WHEN clause is required.
  
  
+  The WHEN clause may specify WHEN MATCHED,
+  WHEN NOT MATCHED BY SOURCE, or
+  WHEN NOT MATCHED [BY TARGET].
+  Note that the SQL standard only defines
+  WHEN MATCHED and WHEN NOT MATCHED
+  (which is defined to mean no matching target row).
+  WHEN NOT MATCHED BY SOURCE is an extension to the
+  SQL standard, as is the option to append
+  BY TARGET to WHEN NOT MATCHED, to
+  make its meaning more explicit.
+ 
+ 
   If the WHEN clause specifies WHEN MATCHED
-  and the candidate change row matches a row in the
+  and the candidate change row matches a row in the source to a row in the
   target_table_name,
   the WHEN clause is executed if the
   condition is
   absent or it evaluates to true.
  
  
-  Conversely, if the WHEN clause specifies
-  WHEN NOT MATCHED
+  If the WHEN clause specifies
+  WHEN NOT MATCHED BY SOURCE and the candidate change
+  row represents a row in the
+  target_table_name that does
+  not match a source row, the WHEN clause is executed
+  if the condition is
+  absent or it evaluates to true.
+ 
+ 
+  If the WHEN clause specifies
+  WHEN NOT MATCHED [BY TARGET]
   and the candidate change row does not match a row in the
   target_table_name,
   the WHEN clause is executed if the
@@ -257,7 +281,10 @@ DELETE
  
   A condition on a WHEN MATCHED clause can refer to columns
   in both the source and the target relations. A condition on a
-  WHEN NOT MATCHED clause can only refer to columns from
+  WHEN NOT MATCHED BY SOURCE clause can only refer to
+  columns from the target relation, since by definition there is no matching
+  source row. A condition on a WHEN NOT MATCHED [BY TARGET]
+  clause can only refer to columns from
   the source relation, since by definition there is no matching target row.
   Only the system attributes from the target table are accessible.
  
@@ -382,8 +409,10 @@ DELETE
   WHEN MATCHED clause, the expression can use values
   from the original row in the target table, and values from the
   data_source row.
-  If used in a WHEN NOT MATCHED clause, the
-  expression can use values from the data_source.
+  If used in a WHEN NOT MATCHED BY SOURCE clause, the
+  expression can only use values from the original row in the target table.
+  If used in a WHEN NOT MATCHED [BY TARGET] clause, the
+  expression can only use values from the data_source.
  
 

@@ -452,8 +481,9 @@ MERGE tot

 
  
-  Evaluate whether each row is MATCHED or
-  NOT MATCHED.
+  Evaluate whether each row is MATCHED,
+  NOT MATCHED BY SOURCE, or
+  NOT MATCHED [BY TARGET].
  
 
 
@@ -528,7 +558,8 @@ MERGE tot
   
If a WHEN clause omits an AND
sub-clause, it becomes the final reachable clause of that
-   kind (MATCHED or NOT 

Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2023-01-07 Thread Dean Rasheed
On Thu, 5 Jan 2023 at 13:21, Dean Rasheed  wrote:
>
> On Thu, 5 Jan 2023 at 11:03, Alvaro Herrera  wrote:
> >
> > > + /* Join type required */
> > > + if (left_join && right_join)
> > > + qry->mergeJoinType = JOIN_FULL;
> > > + else if (left_join)
> > > + qry->mergeJoinType = JOIN_LEFT;
> > > + else if (right_join)
> > > + qry->mergeJoinType = JOIN_RIGHT;
> > > + else
> > > + qry->mergeJoinType = JOIN_INNER;
> >
> > One of the review comments that MERGE got initially was that parse
> > analysis was not a place to "do query optimization", in the sense that
> > the original code was making a decision whether to make an outer or
> > inner join based on the set of WHEN clauses that appear in the command.
> > That's how we ended up with transform_MERGE_to_join and
> > mergeUseOuterJoin instead.  This new code is certainly not the same, but
> > it makes me a bit unconfortable.  Maybe it's OK, though.
> >
>
> Yeah I agree, it's a bit ugly. Perhaps a better solution would be to
> do away with that field entirely and just make the decision in
> transform_MERGE_to_join() by examining the action list again.
>

Attached is an updated patch taking that approach, allowing
mergeUseOuterJoin to be removed from the Query node, which I think is
probably a good thing.

Aside from that, it includes a few additional comment updates in the
executor that I'd missed, and psql tab completion support.

Regards,
Dean
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
new file mode 100644
index b87ad5c..1482ede
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -396,8 +396,8 @@
 originally matched appears later in the list of actions.
 On the other hand, if the row is concurrently updated or deleted so
 that the join condition fails, then MERGE will
-evaluate the condition's NOT MATCHED actions next,
-and execute the first one that succeeds.
+evaluate the condition's NOT MATCHED [BY TARGET]
+actions next, and execute the first one that succeeds.
 If MERGE attempts an INSERT
 and a unique index is present and a duplicate row is concurrently
 inserted, then a uniqueness violation error is raised;
diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml
new file mode 100644
index 0995fe0..8ef121a
--- a/doc/src/sgml/ref/merge.sgml
+++ b/doc/src/sgml/ref/merge.sgml
@@ -33,7 +33,8 @@ USING dat
 and when_clause is:
 
 { WHEN MATCHED [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } |
-  WHEN NOT MATCHED [ AND condition ] THEN { merge_insert | DO NOTHING } }
+  WHEN NOT MATCHED BY SOURCE [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } |
+  WHEN NOT MATCHED [BY TARGET] [ AND condition ] THEN { merge_insert | DO NOTHING } }
 
 and merge_insert is:
 
@@ -70,7 +71,9 @@ DELETE
from data_source to
target_table_name
producing zero or more candidate change rows.  For each candidate change
-   row, the status of MATCHED or NOT MATCHED
+   row, the status of MATCHED,
+   NOT MATCHED BY SOURCE,
+   or NOT MATCHED [BY TARGET]
is set just once, after which WHEN clauses are evaluated
in the order specified.  For each candidate change row, the first clause to
evaluate as true is executed.  No more than one WHEN
@@ -226,16 +229,37 @@ DELETE
   At least one WHEN clause is required.
  
  
+  The WHEN clause may specify WHEN MATCHED,
+  WHEN NOT MATCHED BY SOURCE, or
+  WHEN NOT MATCHED [BY TARGET].
+  Note that the SQL standard only defines
+  WHEN MATCHED and WHEN NOT MATCHED
+  (which is defined to mean no matching target row).
+  WHEN NOT MATCHED BY SOURCE is an extension to the
+  SQL standard, as is the option to append
+  BY TARGET to WHEN NOT MATCHED, to
+  make its meaning more explicit.
+ 
+ 
   If the WHEN clause specifies WHEN MATCHED
-  and the candidate change row matches a row in the
+  and the candidate change row matches a row in the source to a row in the
   target_table_name,
   the WHEN clause is executed if the
   condition is
   absent or it evaluates to true.
  
  
-  Conversely, if the WHEN clause specifies
-  WHEN NOT MATCHED
+  If the WHEN clause specifies
+  WHEN NOT MATCHED BY SOURCE and the candidate change
+  row represents a row in the
+  target_table_name that does
+  not match a source row, the WHEN clause is executed
+  if the condition is
+  absent or it evaluates to true.
+ 
+ 
+  If the WHEN clause specifies
+  WHEN NOT MATCHED [BY TARGET]
   and the candidate change row does not match a row in the
   target_table_name,
   the WHEN clause is executed if the
@@ -257,7 +281,10 @@ DELETE
  
   A condition on a WHEN MATCHED clause can refer to columns
   in both the source and the target relations. A condition on a
-  WHEN NOT MATCHED clause can only 

Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2023-01-05 Thread Dean Rasheed
On Thu, 5 Jan 2023 at 11:03, Alvaro Herrera  wrote:
>
> I haven't read this patch other than superficially; I suppose the
> feature it's introducing is an OK one to have as an extension to the
> standard.  (I hope the community members that are committee members
> will propose this extension to become part of the standard.)
>

Thanks for looking!

> > --- a/src/backend/optimizer/prep/preptlist.c
> > +++ b/src/backend/optimizer/prep/preptlist.c
> > @@ -157,15 +157,14 @@ preprocess_targetlist(PlannerInfo *root)
> >   /*
> >* Add resjunk entries for any Vars used in each 
> > action's
> >* targetlist and WHEN condition that belong to 
> > relations other
> > -  * than target.  Note that aggregates, window 
> > functions and
> > -  * placeholder vars are not possible anywhere in 
> > MERGE's WHEN
> > -  * clauses.  (PHVs may be added later, but they don't 
> > concern us
> > -  * here.)
> > +  * than target.  Note that aggregates and window 
> > functions are not
> > +  * possible anywhere in MERGE's WHEN clauses, but 
> > PlaceHolderVars
> > +  * may have been added by subquery pullup.
> >*/
> >   vars = pull_var_clause((Node *)
> >  
> > list_concat_copy((List *) action->qual,
> > 
> >   action->targetList),
> > -0);
> > +
> > PVC_INCLUDE_PLACEHOLDERS);
>
> Hmm, is this new because of NOT MATCHED BY SOURCE, or is it something
> that can already be hit by existing features of MERGE?  In other words
> -- is this a bug fix that should be backpatched ahead of introducing NOT
> MATCHED BY SOURCE?
>

It's new because of NOT MATCHED BY SOURCE, and I also found that I had
to make the same change in the MERGE INTO view patch, in the case
where the target view is simple enough to allow subquery pullup, but
also had INSTEAD OF triggers causing the pullup to happen in the
planner rather than the rewriter.

I couldn't think of a way that it could happen with the existing MERGE
code though, so I don't think it's a bug that needs fixing and
back-patching.

> > @@ -127,10 +143,12 @@ transformMergeStmt(ParseState *pstate, M
> >*/
> >   is_terminal[0] = false;
> >   is_terminal[1] = false;
> > + is_terminal[2] = false;
>
> I think these 0/1/2 should be replaced by the values of MergeMatchKind.
>

Agreed.

> > + /* Join type required */
> > + if (left_join && right_join)
> > + qry->mergeJoinType = JOIN_FULL;
> > + else if (left_join)
> > + qry->mergeJoinType = JOIN_LEFT;
> > + else if (right_join)
> > + qry->mergeJoinType = JOIN_RIGHT;
> > + else
> > + qry->mergeJoinType = JOIN_INNER;
>
> One of the review comments that MERGE got initially was that parse
> analysis was not a place to "do query optimization", in the sense that
> the original code was making a decision whether to make an outer or
> inner join based on the set of WHEN clauses that appear in the command.
> That's how we ended up with transform_MERGE_to_join and
> mergeUseOuterJoin instead.  This new code is certainly not the same, but
> it makes me a bit unconfortable.  Maybe it's OK, though.
>

Yeah I agree, it's a bit ugly. Perhaps a better solution would be to
do away with that field entirely and just make the decision in
transform_MERGE_to_join() by examining the action list again. That
would require making MergeAction's "matched" field a MergeMatchKind
rather than a bool, but maybe that's not so bad, since retaining that
information might prove useful one day.

Regards,
Dean




Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2023-01-05 Thread Alvaro Herrera
I haven't read this patch other than superficially; I suppose the
feature it's introducing is an OK one to have as an extension to the
standard.  (I hope the community members that are committee members 
will propose this extension to become part of the standard.)

On 2023-Jan-02, Dean Rasheed wrote:

> --- a/src/backend/optimizer/prep/preptlist.c
> +++ b/src/backend/optimizer/prep/preptlist.c
> @@ -157,15 +157,14 @@ preprocess_targetlist(PlannerInfo *root)
>   /*
>* Add resjunk entries for any Vars used in each 
> action's
>* targetlist and WHEN condition that belong to 
> relations other
> -  * than target.  Note that aggregates, window functions 
> and
> -  * placeholder vars are not possible anywhere in 
> MERGE's WHEN
> -  * clauses.  (PHVs may be added later, but they don't 
> concern us
> -  * here.)
> +  * than target.  Note that aggregates and window 
> functions are not
> +  * possible anywhere in MERGE's WHEN clauses, but 
> PlaceHolderVars
> +  * may have been added by subquery pullup.
>*/
>   vars = pull_var_clause((Node *)
>  
> list_concat_copy((List *) action->qual,
>   
> action->targetList),
> -0);
> +
> PVC_INCLUDE_PLACEHOLDERS);

Hmm, is this new because of NOT MATCHED BY SOURCE, or is it something
that can already be hit by existing features of MERGE?  In other words
-- is this a bug fix that should be backpatched ahead of introducing NOT
MATCHED BY SOURCE?

> @@ -127,10 +143,12 @@ transformMergeStmt(ParseState *pstate, M
>*/
>   is_terminal[0] = false;
>   is_terminal[1] = false;
> + is_terminal[2] = false;

I think these 0/1/2 should be replaced by the values of MergeMatchKind.

> + /* Join type required */
> + if (left_join && right_join)
> + qry->mergeJoinType = JOIN_FULL;
> + else if (left_join)
> + qry->mergeJoinType = JOIN_LEFT;
> + else if (right_join)
> + qry->mergeJoinType = JOIN_RIGHT;
> + else
> + qry->mergeJoinType = JOIN_INNER;

One of the review comments that MERGE got initially was that parse
analysis was not a place to "do query optimization", in the sense that
the original code was making a decision whether to make an outer or
inner join based on the set of WHEN clauses that appear in the command.
That's how we ended up with transform_MERGE_to_join and
mergeUseOuterJoin instead.  This new code is certainly not the same, but
it makes me a bit unconfortable.  Maybe it's OK, though.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2023-01-02 Thread Dean Rasheed
On Fri, 30 Dec 2022 at 16:56, Dean Rasheed  wrote:
>
> Attached is a WIP patch.
>

Updated patch attached, now with updated docs and some other minor tidying up.

Regards,
Dean
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
new file mode 100644
index b87ad5c..1482ede
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -396,8 +396,8 @@
 originally matched appears later in the list of actions.
 On the other hand, if the row is concurrently updated or deleted so
 that the join condition fails, then MERGE will
-evaluate the condition's NOT MATCHED actions next,
-and execute the first one that succeeds.
+evaluate the condition's NOT MATCHED [BY TARGET]
+actions next, and execute the first one that succeeds.
 If MERGE attempts an INSERT
 and a unique index is present and a duplicate row is concurrently
 inserted, then a uniqueness violation error is raised;
diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml
new file mode 100644
index 0995fe0..8ef121a
--- a/doc/src/sgml/ref/merge.sgml
+++ b/doc/src/sgml/ref/merge.sgml
@@ -33,7 +33,8 @@ USING dat
 and when_clause is:
 
 { WHEN MATCHED [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } |
-  WHEN NOT MATCHED [ AND condition ] THEN { merge_insert | DO NOTHING } }
+  WHEN NOT MATCHED BY SOURCE [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } |
+  WHEN NOT MATCHED [BY TARGET] [ AND condition ] THEN { merge_insert | DO NOTHING } }
 
 and merge_insert is:
 
@@ -70,7 +71,9 @@ DELETE
from data_source to
target_table_name
producing zero or more candidate change rows.  For each candidate change
-   row, the status of MATCHED or NOT MATCHED
+   row, the status of MATCHED,
+   NOT MATCHED BY SOURCE,
+   or NOT MATCHED [BY TARGET]
is set just once, after which WHEN clauses are evaluated
in the order specified.  For each candidate change row, the first clause to
evaluate as true is executed.  No more than one WHEN
@@ -226,16 +229,37 @@ DELETE
   At least one WHEN clause is required.
  
  
+  The WHEN clause may specify WHEN MATCHED,
+  WHEN NOT MATCHED BY SOURCE, or
+  WHEN NOT MATCHED [BY TARGET].
+  Note that the SQL standard only defines
+  WHEN MATCHED and WHEN NOT MATCHED
+  (which is defined to mean no matching target row).
+  WHEN NOT MATCHED BY SOURCE is an extension to the
+  SQL standard, as is the option to append
+  BY TARGET to WHEN NOT MATCHED, to
+  make its meaning more explicit.
+ 
+ 
   If the WHEN clause specifies WHEN MATCHED
-  and the candidate change row matches a row in the
+  and the candidate change row matches a row in the source to a row in the
   target_table_name,
   the WHEN clause is executed if the
   condition is
   absent or it evaluates to true.
  
  
-  Conversely, if the WHEN clause specifies
-  WHEN NOT MATCHED
+  If the WHEN clause specifies
+  WHEN NOT MATCHED BY SOURCE and the candidate change
+  row represents a row in the
+  target_table_name that does
+  not match a source row, the WHEN clause is executed
+  if the condition is
+  absent or it evaluates to true.
+ 
+ 
+  If the WHEN clause specifies
+  WHEN NOT MATCHED [BY TARGET]
   and the candidate change row does not match a row in the
   target_table_name,
   the WHEN clause is executed if the
@@ -257,7 +281,10 @@ DELETE
  
   A condition on a WHEN MATCHED clause can refer to columns
   in both the source and the target relations. A condition on a
-  WHEN NOT MATCHED clause can only refer to columns from
+  WHEN NOT MATCHED BY SOURCE clause can only refer to
+  columns from the target relation, since by definition there is no matching
+  source row. A condition on a WHEN NOT MATCHED [BY TARGET]
+  clause can only refer to columns from
   the source relation, since by definition there is no matching target row.
   Only the system attributes from the target table are accessible.
  
@@ -382,8 +409,10 @@ DELETE
   WHEN MATCHED clause, the expression can use values
   from the original row in the target table, and values from the
   data_source row.
-  If used in a WHEN NOT MATCHED clause, the
-  expression can use values from the data_source.
+  If used in a WHEN NOT MATCHED BY SOURCE clause, the
+  expression can only use values from the original row in the target table.
+  If used in a WHEN NOT MATCHED [BY TARGET] clause, the
+  expression can only use values from the data_source.
  
 

@@ -452,8 +481,9 @@ MERGE tot

 
  
-  Evaluate whether each row is MATCHED or
-  NOT MATCHED.
+  Evaluate whether each row is MATCHED,
+  NOT MATCHED BY SOURCE, or
+  NOT MATCHED [BY TARGET].
  
 
 
@@ -528,7 +558,8 @@