Re: MERGE ... WHEN NOT MATCHED BY SOURCE
On Thu, 21 Mar 2024 at 09:35, Dean Rasheed wrote: > > Trivial rebase forced by 6185c9737c. > I think it would be good to get this committed. It has had a decent amount of review, at least up to v9, but a number of things have changed since then: 1). Concurrent update behaviour -- now if a concurrent update causes a matched candidate row to no longer match the join condition, it will execute the first qualifying NOT MATCHED BY SOURCE action, and then the first qualifying NOT MATCHED [BY TARGET] action. I.e., it may execute 2 actions, which makes sense because if the rows had started out not matching, the full join would have output 2 rows. 2). ResultRelInfo now has 3 lists of actions, one per match kind. Previously I was putting the NOT MATCHED BY SOURCE actions in the same list as the MATCHED actions, since they are both handled by ExecMergeMatched(). However, to achieve (1) above, it turned out to be easier to have 3 separate lists, and this makes some other code a little neater. 3). I've added a new field Query.mergeJoinCondition so that transformMergeStmt() no longer puts the join conditions in qry->jointree->quals. That's necessary to make it work correctly on an auto-updatable view which might have its own quals, but it also seems neater anyway. 4). To distinguish the MATCHED case from NOT MATCHED BY SOURCE case in the executor, it now uses the join condition (previously it added a "source IS [NOT] NULL" clause to each merge action). This has the advantage that it involves just one qual check per candidate row, rather than one for each action. On the downside, it's checking the join condition twice (since the source subplan's join node already checked it), but I couldn't see an easy way round that. (It only does this if there are both MATCHED and NOT MATCHED BY SOURCE actions, so it's not making any existing queries worse.) 5). To support (4), I added new fields ModifyTablePath.mergeJoinConditions, ModifyTable.mergeJoinConditions and ResultRelInfo.ri_MergeJoinCondition, since the attribute numbers in the join condition might vary by partition. 6). I got rid of Query.mergeSourceRelation, which is no longer needed, because of (4). (And as before, it also gets rid of Query.mergeUseOuterJoin, since the parser is no longer making the decision about what kind of join to build.) 7). To support (1), I added a new field ModifyTableState.mt_merge_pending_not_matched, because if it has to execute 2 actions following a concurrent update, and there is a RETURNING clause, it has to defer the second action until the next call to ExecModifyTable(). 8). I've added isolation tests to test (1). 9). I've added a lot more regression tests. 10). I've made a lot of comment changes in nodeModifyTable.c, especially relating to the discussion around concurrent updates. Overall, I feel like this is in pretty good shape, and it manages to make a few code simplifications that look quite nice. Regards, Dean
Re: MERGE ... WHEN NOT MATCHED BY SOURCE
Rebased version attached. Regards, Dean diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml new file mode 100644 index f8f83d4..380d0c9 --- a/doc/src/sgml/mvcc.sgml +++ b/doc/src/sgml/mvcc.sgml @@ -394,10 +394,14 @@ conditions for each action are re-evaluated on the updated version of the row, starting from the first action, even if the action that had originally matched appears later in the list of actions. -On the other hand, if the row is concurrently updated or deleted so -that the join condition fails, then MERGE will -evaluate the condition's NOT MATCHED actions next, -and execute the first one that succeeds. +On the other hand, if the row is concurrently updated so that the join +condition fails, then MERGE will evaluate the +command's NOT MATCHED BY SOURCE and +NOT MATCHED [BY TARGET] actions next, and execute +the first one of each kind that succeeds. +If the row is concurrently deleted, then MERGE +will evaluate the command's NOT MATCHED [BY TARGET] +actions, and execute the first one that succeeds. If MERGE attempts an INSERT and a unique index is present and a duplicate row is concurrently inserted, then a uniqueness violation error is raised; diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml new file mode 100644 index e745fbd..6845f14 --- a/doc/src/sgml/ref/merge.sgml +++ b/doc/src/sgml/ref/merge.sgml @@ -33,7 +33,8 @@ USING dat and when_clause is: { WHEN MATCHED [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } | - WHEN NOT MATCHED [ AND condition ] THEN { merge_insert | DO NOTHING } } + WHEN NOT MATCHED BY SOURCE [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } | + WHEN NOT MATCHED [BY TARGET] [ AND condition ] THEN { merge_insert | DO NOTHING } } and merge_insert is: @@ -72,7 +73,9 @@ DELETE from data_source to the target table producing zero or more candidate change rows. For each candidate change - row, the status of MATCHED or NOT MATCHED + row, the status of MATCHED, + NOT MATCHED BY SOURCE, + or NOT MATCHED [BY TARGET] is set just once, after which WHEN clauses are evaluated in the order specified. For each candidate change row, the first clause to evaluate as true is executed. No more than one WHEN @@ -254,18 +257,40 @@ DELETE At least one WHEN clause is required. + The WHEN clause may specify WHEN MATCHED, + WHEN NOT MATCHED BY SOURCE, or + WHEN NOT MATCHED [BY TARGET]. + Note that the SQL standard only defines + WHEN MATCHED and WHEN NOT MATCHED + (which is defined to mean no matching target row). + WHEN NOT MATCHED BY SOURCE is an extension to the + SQL standard, as is the option to append + BY TARGET to WHEN NOT MATCHED, to + make its meaning more explicit. + + If the WHEN clause specifies WHEN MATCHED and the candidate change row matches a row in the - target table, - the WHEN clause is executed if the + data_source to a row in the + target table, the WHEN clause is executed if the condition is absent or it evaluates to true. - Conversely, if the WHEN clause specifies - WHEN NOT MATCHED - and the candidate change row does not match a row in the - target table, + If the WHEN clause specifies + WHEN NOT MATCHED BY SOURCE and the candidate change + row represents a row in the target table that does not match a row in the + data_source, the + WHEN clause is executed if the + condition is + absent or it evaluates to true. + + + If the WHEN clause specifies + WHEN NOT MATCHED [BY TARGET] and the candidate change + row represents a row in the + data_source that does not + match a row in the target table, the WHEN clause is executed if the condition is absent or it evaluates to true. @@ -285,7 +310,10 @@ DELETE A condition on a WHEN MATCHED clause can refer to columns in both the source and the target relations. A condition on a - WHEN NOT MATCHED clause can only refer to columns from + WHEN NOT MATCHED BY SOURCE clause can only refer to + columns from the target relation, since by definition there is no matching + source row. A condition on a WHEN NOT MATCHED [BY TARGET] + clause can only refer to columns from the source relation, since by definition there is no matching target row. Only the system attributes from the target table are accessible. @@ -409,8 +437,10 @@ DELETE WHEN MATCHED clause, the expression can use values from the original row in the target table, and values from the data_source row. - If used in a WHEN NOT MATCHED clause, the - expression can use values from the + If used in a WHEN NOT MATCHED BY SOURCE clause, the + expre
Re: MERGE ... WHEN NOT MATCHED BY SOURCE
On Fri, 26 Jan 2024 at 15:57, Alvaro Herrera wrote: > > Well, firstly this is clearly a feature we want to have, even though > it's non-standard, because people use it and other implementations have > it. (Eh, so maybe somebody should be talking to the SQL standard > committee about it). As for code quality, I didn't do a comprehensive > review, but I think it is quite reasonable. Therefore, my inclination > would be to get it committed soonish, and celebrate it widely so that > people can test it soon and complain if they see something they don't > like. > Thanks. I have been going over this patch again, and for the most part, I'm pretty happy with it. One thing that's bothering me though is what happens if a row being merged is concurrently updated. Specifically, if a concurrent update causes a formerly matching row to no longer match the join condition, and there are both NOT MATCHED BY SOURCE and NOT MATCHED BY TARGET actions, so that it's doing in full join between the source and target relations. In this case, when the EPQ mechanism rescans the subplan node, there will be 2 possible output tuples (one with source null, and one with target null), and EvalPlanQual() will just return the first one, which is a more-or-less arbitrary choice, depending on the type of join (hash/merge), and (for a mergejoin) the values of the inner and outer join keys. Thus, it may execute a NOT MATCHED BY SOURCE action, or a NOT MATCHED BY TARGET action, and it's difficult to predict which. Arguably it's not worth worrying too much about what happens in a corner-case concurrent update like this, when MERGE is already inconsistent under other concurrent update scenarios, but I don't like having unpredictable results like this, which can depend on the plan chosen. I think the best (and probably simplest) solution is to always opt for a NOT MATCHED BY TARGET action in this case, so then the result is predictable, and we can document what is expected to happen. Regards, Dean
Re: MERGE ... WHEN NOT MATCHED BY SOURCE
On 2024-Jan-26, Dean Rasheed wrote: > I think it has had a decent amount of review and all the review > comments have been addressed. I'm not quite sure from Alvaro's last > comment whether he was implying that he thought it was ready for > commit. Well, firstly this is clearly a feature we want to have, even though it's non-standard, because people use it and other implementations have it. (Eh, so maybe somebody should be talking to the SQL standard committee about it). As for code quality, I didn't do a comprehensive review, but I think it is quite reasonable. Therefore, my inclination would be to get it committed soonish, and celebrate it widely so that people can test it soon and complain if they see something they don't like. I have to say that I find the idea of booting patches as Returned with Feedback just because of inactivity (as opposed to unresponsive authors) rather wrong-headed, and I wish we didn't do it. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
Re: MERGE ... WHEN NOT MATCHED BY SOURCE
n Fri, 26 Jan 2024 at 14:59, vignesh C wrote: > > CFBot shows that the patch does not apply anymore as in [1]: > Rebased version attached. Regards, Dean diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml new file mode 100644 index f8f83d4..6ef0c2b --- a/doc/src/sgml/mvcc.sgml +++ b/doc/src/sgml/mvcc.sgml @@ -396,8 +396,8 @@ originally matched appears later in the list of actions. On the other hand, if the row is concurrently updated or deleted so that the join condition fails, then MERGE will -evaluate the condition's NOT MATCHED actions next, -and execute the first one that succeeds. +evaluate the condition's NOT MATCHED [BY TARGET] +actions next, and execute the first one that succeeds. If MERGE attempts an INSERT and a unique index is present and a duplicate row is concurrently inserted, then a uniqueness violation error is raised; diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml new file mode 100644 index 655f7dc..f421716 --- a/doc/src/sgml/ref/merge.sgml +++ b/doc/src/sgml/ref/merge.sgml @@ -33,7 +33,8 @@ USING dat and when_clause is: { WHEN MATCHED [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } | - WHEN NOT MATCHED [ AND condition ] THEN { merge_insert | DO NOTHING } } + WHEN NOT MATCHED BY SOURCE [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } | + WHEN NOT MATCHED [BY TARGET] [ AND condition ] THEN { merge_insert | DO NOTHING } } and merge_insert is: @@ -70,7 +71,9 @@ DELETE from data_source to target_table_name producing zero or more candidate change rows. For each candidate change - row, the status of MATCHED or NOT MATCHED + row, the status of MATCHED, + NOT MATCHED BY SOURCE, + or NOT MATCHED [BY TARGET] is set just once, after which WHEN clauses are evaluated in the order specified. For each candidate change row, the first clause to evaluate as true is executed. No more than one WHEN @@ -226,16 +229,37 @@ DELETE At least one WHEN clause is required. + The WHEN clause may specify WHEN MATCHED, + WHEN NOT MATCHED BY SOURCE, or + WHEN NOT MATCHED [BY TARGET]. + Note that the SQL standard only defines + WHEN MATCHED and WHEN NOT MATCHED + (which is defined to mean no matching target row). + WHEN NOT MATCHED BY SOURCE is an extension to the + SQL standard, as is the option to append + BY TARGET to WHEN NOT MATCHED, to + make its meaning more explicit. + + If the WHEN clause specifies WHEN MATCHED - and the candidate change row matches a row in the + and the candidate change row matches a row in the source to a row in the target_table_name, the WHEN clause is executed if the condition is absent or it evaluates to true. - Conversely, if the WHEN clause specifies - WHEN NOT MATCHED + If the WHEN clause specifies + WHEN NOT MATCHED BY SOURCE and the candidate change + row represents a row in the + target_table_name that does + not match a source row, the WHEN clause is executed + if the condition is + absent or it evaluates to true. + + + If the WHEN clause specifies + WHEN NOT MATCHED [BY TARGET] and the candidate change row does not match a row in the target_table_name, the WHEN clause is executed if the @@ -257,7 +281,10 @@ DELETE A condition on a WHEN MATCHED clause can refer to columns in both the source and the target relations. A condition on a - WHEN NOT MATCHED clause can only refer to columns from + WHEN NOT MATCHED BY SOURCE clause can only refer to + columns from the target relation, since by definition there is no matching + source row. A condition on a WHEN NOT MATCHED [BY TARGET] + clause can only refer to columns from the source relation, since by definition there is no matching target row. Only the system attributes from the target table are accessible. @@ -382,8 +409,10 @@ DELETE WHEN MATCHED clause, the expression can use values from the original row in the target table, and values from the data_source row. - If used in a WHEN NOT MATCHED clause, the - expression can use values from the data_source. + If used in a WHEN NOT MATCHED BY SOURCE clause, the + expression can only use values from the original row in the target table. + If used in a WHEN NOT MATCHED [BY TARGET] clause, the + expression can only use values from the data_source. @@ -452,8 +481,9 @@ MERGE tot - Evaluate whether each row is MATCHED or - NOT MATCHED. + Evaluate whether each row is MATCHED, + NOT MATCHED BY SOURCE, or + NOT MATCHED [BY TARGET]. @@ -528,7 +558,8 @@ MERGE tot If a WHE
Re: MERGE ... WHEN NOT MATCHED BY SOURCE
On Mon, 22 Jan 2024 at 02:10, Peter Smith wrote: > > Hi, this patch was marked in CF as "Needs Review" [1], but there has > been no activity on this thread for 6+ months. > > Is anything else planned? Can you post something to elicit more > interest in the latest patch? Otherwise, if nothing happens then the > CF entry will be closed ("Returned with feedback") at the end of this > CF. > I think it has had a decent amount of review and all the review comments have been addressed. I'm not quite sure from Alvaro's last comment whether he was implying that he thought it was ready for commit. Looking back through the thread, the general sentiment seems to be in favour of adding this feature, and I still think it's worth doing, but I haven't managed to find much time to progress it recently. Regards, Dean
Re: MERGE ... WHEN NOT MATCHED BY SOURCE
On Sat, 1 Jul 2023 at 18:04, Dean Rasheed wrote: > > On Tue, 21 Mar 2023 at 12:26, Alvaro Herrera wrote: > > > > On 2023-Mar-21, Dean Rasheed wrote: > > > > > Looking at it with fresh eyes though, I realise that I could have just > > > written > > > > > > action->qual = make_and_qual((Node *) ntest, action->qual); > > > > > > which is equivalent, but more concise. > > > > Nice. > > > > I have no further observations about this patch. > > > > Looking at this one afresh, it seems that the change to make Vars > outer-join aware broke it -- the Var in the qual to test whether the > source row is null needs to be marked as nullable by the join added by > transform_MERGE_to_join(). That's something that needs to be done in > transform_MERGE_to_join(), so it makes more sense to add the new qual > there rather than in transformMergeStmt(). > > Also, now that MERGE has ruleutils support, it's clear that adding the > qual in transformMergeStmt() isn't right anyway, since it would then > appear in the deparsed output. > > So attached is an updated patch doing that, which seems neater all > round, since adding the qual is closely related to the join-type > choice, which is now a decision taken entirely in > transform_MERGE_to_join(). This requires a new "mergeSourceRelation" > field on the Query structure, but as before, it does away with the > "mergeUseOuterJoin" field. > > I've also updated the ruleutils support. In the absence of any WHEN > NOT MATCHED BY SOURCE actions, this will output not-matched actions > simply as "WHEN NOT MATCHED" for backwards compatibility, and to be > SQL-standard-compliant. If there are any WHEN NOT MATCHED BY SOURCE > actions though, I think it's preferable to output explicit "BY SOURCE" > and "BY TARGET" qualifiers for all not-matched actions, to make the > meaning clearer. CFBot shows that the patch does not apply anymore as in [1]: === Applying patches on top of PostgreSQL commit ID f2bf8fb04886e3ea82e7f7f86696ac78e06b7e60 === === applying patch ./support-merge-when-not-matched-by-source-v8.patch ... patching file doc/src/sgml/ref/merge.sgml Hunk #5 FAILED at 409. Hunk #9 FAILED at 673. 2 out of 9 hunks FAILED -- saving rejects to file doc/src/sgml/ref/merge.sgml.rej .. patching file src/include/nodes/parsenodes.h Hunk #1 succeeded at 175 (offset -8 lines). Hunk #2 succeeded at 1657 (offset -6 lines). Hunk #3 succeeded at 1674 (offset -6 lines). Hunk #4 FAILED at 1696. 1 out of 4 hunks FAILED -- saving rejects to file src/include/nodes/parsenodes.h.rej Please post an updated version for the same. [1] - http://cfbot.cputube.org/patch_46_4092.log Regards, Vignesh
Re: MERGE ... WHEN NOT MATCHED BY SOURCE
Hi, this patch was marked in CF as "Needs Review" [1], but there has been no activity on this thread for 6+ months. Is anything else planned? Can you post something to elicit more interest in the latest patch? Otherwise, if nothing happens then the CF entry will be closed ("Returned with feedback") at the end of this CF. == [1] https://commitfest.postgresql.org/46/4092/ Kind Regards, Peter Smith.
Re: MERGE ... WHEN NOT MATCHED BY SOURCE
On Tue, 21 Mar 2023 at 12:26, Alvaro Herrera wrote: > > On 2023-Mar-21, Dean Rasheed wrote: > > > Looking at it with fresh eyes though, I realise that I could have just > > written > > > > action->qual = make_and_qual((Node *) ntest, action->qual); > > > > which is equivalent, but more concise. > > Nice. > > I have no further observations about this patch. > Looking at this one afresh, it seems that the change to make Vars outer-join aware broke it -- the Var in the qual to test whether the source row is null needs to be marked as nullable by the join added by transform_MERGE_to_join(). That's something that needs to be done in transform_MERGE_to_join(), so it makes more sense to add the new qual there rather than in transformMergeStmt(). Also, now that MERGE has ruleutils support, it's clear that adding the qual in transformMergeStmt() isn't right anyway, since it would then appear in the deparsed output. So attached is an updated patch doing that, which seems neater all round, since adding the qual is closely related to the join-type choice, which is now a decision taken entirely in transform_MERGE_to_join(). This requires a new "mergeSourceRelation" field on the Query structure, but as before, it does away with the "mergeUseOuterJoin" field. I've also updated the ruleutils support. In the absence of any WHEN NOT MATCHED BY SOURCE actions, this will output not-matched actions simply as "WHEN NOT MATCHED" for backwards compatibility, and to be SQL-standard-compliant. If there are any WHEN NOT MATCHED BY SOURCE actions though, I think it's preferable to output explicit "BY SOURCE" and "BY TARGET" qualifiers for all not-matched actions, to make the meaning clearer. Regards, Dean diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml new file mode 100644 index f8f83d4..6ef0c2b --- a/doc/src/sgml/mvcc.sgml +++ b/doc/src/sgml/mvcc.sgml @@ -396,8 +396,8 @@ originally matched appears later in the list of actions. On the other hand, if the row is concurrently updated or deleted so that the join condition fails, then MERGE will -evaluate the condition's NOT MATCHED actions next, -and execute the first one that succeeds. +evaluate the condition's NOT MATCHED [BY TARGET] +actions next, and execute the first one that succeeds. If MERGE attempts an INSERT and a unique index is present and a duplicate row is concurrently inserted, then a uniqueness violation error is raised; diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml new file mode 100644 index 0995fe0..8ef121a --- a/doc/src/sgml/ref/merge.sgml +++ b/doc/src/sgml/ref/merge.sgml @@ -33,7 +33,8 @@ USING dat and when_clause is: { WHEN MATCHED [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } | - WHEN NOT MATCHED [ AND condition ] THEN { merge_insert | DO NOTHING } } + WHEN NOT MATCHED BY SOURCE [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } | + WHEN NOT MATCHED [BY TARGET] [ AND condition ] THEN { merge_insert | DO NOTHING } } and merge_insert is: @@ -70,7 +71,9 @@ DELETE from data_source to target_table_name producing zero or more candidate change rows. For each candidate change - row, the status of MATCHED or NOT MATCHED + row, the status of MATCHED, + NOT MATCHED BY SOURCE, + or NOT MATCHED [BY TARGET] is set just once, after which WHEN clauses are evaluated in the order specified. For each candidate change row, the first clause to evaluate as true is executed. No more than one WHEN @@ -226,16 +229,37 @@ DELETE At least one WHEN clause is required. + The WHEN clause may specify WHEN MATCHED, + WHEN NOT MATCHED BY SOURCE, or + WHEN NOT MATCHED [BY TARGET]. + Note that the SQL standard only defines + WHEN MATCHED and WHEN NOT MATCHED + (which is defined to mean no matching target row). + WHEN NOT MATCHED BY SOURCE is an extension to the + SQL standard, as is the option to append + BY TARGET to WHEN NOT MATCHED, to + make its meaning more explicit. + + If the WHEN clause specifies WHEN MATCHED - and the candidate change row matches a row in the + and the candidate change row matches a row in the source to a row in the target_table_name, the WHEN clause is executed if the condition is absent or it evaluates to true. - Conversely, if the WHEN clause specifies - WHEN NOT MATCHED + If the WHEN clause specifies + WHEN NOT MATCHED BY SOURCE and the candidate change + row represents a row in the + target_table_name that does + not match a source row, the WHEN clause is executed + if the condition is + absent or it evaluates to true. + + + If the WHEN clause specifies + WHEN NOT MATCHED [BY TARGET] and the candidate change row does not match a row in the target_table_name, the WHEN cla
Re: MERGE ... WHEN NOT MATCHED BY SOURCE
On 2023-Mar-21, Dean Rasheed wrote: > Looking at it with fresh eyes though, I realise that I could have just written > > action->qual = make_and_qual((Node *) ntest, action->qual); > > which is equivalent, but more concise. Nice. I have no further observations about this patch. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
Re: MERGE ... WHEN NOT MATCHED BY SOURCE
On Tue, 21 Mar 2023 at 10:28, Alvaro Herrera wrote: > > > + /* Combine it with the action's WHEN condition */ > > + if (action->qual == NULL) > > + action->qual = (Node *) ntest; > > + else > > + action->qual = > > + (Node *) makeBoolExpr(AND_EXPR, > > + > > list_make2(ntest, action->qual), > > + > > -1); > > Hmm, I think ->qual is already in implicit-and form, so do you really > need to makeBoolExpr, or would it be sufficient to append this new > condition to the list? > No, this has come directly from transformWhereClause() in the parser, so it's an expression tree, not a list. Transforming to implicit-and form doesn't happen until later. Looking at it with fresh eyes though, I realise that I could have just written action->qual = make_and_qual((Node *) ntest, action->qual); which is equivalent, but more concise. Regards, Dean
Re: MERGE ... WHEN NOT MATCHED BY SOURCE
On 2023-Mar-19, Dean Rasheed wrote: > diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y > new file mode 100644 > index efe88cc..e1ebc8d > --- a/src/backend/parser/gram.y > +++ b/src/backend/parser/gram.y > +merge_when_tgt_matched: > + WHEN MATCHED{ $$ = > MERGE_WHEN_MATCHED; } > + | WHEN NOT MATCHED BY SOURCE{ $$ = > MERGE_WHEN_NOT_MATCHED_BY_SOURCE; } > + ; I think a one-line comment on why this "matched" production matches "NOT MATCHED BY" would be useful. I think you have a big one in transformMergeStmt already. > + /* Combine it with the action's WHEN condition */ > + if (action->qual == NULL) > + action->qual = (Node *) ntest; > + else > + action->qual = > + (Node *) makeBoolExpr(AND_EXPR, > + > list_make2(ntest, action->qual), > + > -1); Hmm, I think ->qual is already in implicit-and form, so do you really need to makeBoolExpr, or would it be sufficient to append this new condition to the list? -- Álvaro HerreraBreisgau, Deutschland — https://www.EnterpriseDB.com/ "El miedo atento y previsor es la madre de la seguridad" (E. Burke)
Re: MERGE ... WHEN NOT MATCHED BY SOURCE
I see the PlaceHolderVar issue turned out to be a pre-existing bug after all. Rebased version attached. Regards, Dean diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml new file mode 100644 index b87ad5c..1482ede --- a/doc/src/sgml/mvcc.sgml +++ b/doc/src/sgml/mvcc.sgml @@ -396,8 +396,8 @@ originally matched appears later in the list of actions. On the other hand, if the row is concurrently updated or deleted so that the join condition fails, then MERGE will -evaluate the condition's NOT MATCHED actions next, -and execute the first one that succeeds. +evaluate the condition's NOT MATCHED [BY TARGET] +actions next, and execute the first one that succeeds. If MERGE attempts an INSERT and a unique index is present and a duplicate row is concurrently inserted, then a uniqueness violation error is raised; diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml new file mode 100644 index 0995fe0..8ef121a --- a/doc/src/sgml/ref/merge.sgml +++ b/doc/src/sgml/ref/merge.sgml @@ -33,7 +33,8 @@ USING dat and when_clause is: { WHEN MATCHED [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } | - WHEN NOT MATCHED [ AND condition ] THEN { merge_insert | DO NOTHING } } + WHEN NOT MATCHED BY SOURCE [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } | + WHEN NOT MATCHED [BY TARGET] [ AND condition ] THEN { merge_insert | DO NOTHING } } and merge_insert is: @@ -70,7 +71,9 @@ DELETE from data_source to target_table_name producing zero or more candidate change rows. For each candidate change - row, the status of MATCHED or NOT MATCHED + row, the status of MATCHED, + NOT MATCHED BY SOURCE, + or NOT MATCHED [BY TARGET] is set just once, after which WHEN clauses are evaluated in the order specified. For each candidate change row, the first clause to evaluate as true is executed. No more than one WHEN @@ -226,16 +229,37 @@ DELETE At least one WHEN clause is required. + The WHEN clause may specify WHEN MATCHED, + WHEN NOT MATCHED BY SOURCE, or + WHEN NOT MATCHED [BY TARGET]. + Note that the SQL standard only defines + WHEN MATCHED and WHEN NOT MATCHED + (which is defined to mean no matching target row). + WHEN NOT MATCHED BY SOURCE is an extension to the + SQL standard, as is the option to append + BY TARGET to WHEN NOT MATCHED, to + make its meaning more explicit. + + If the WHEN clause specifies WHEN MATCHED - and the candidate change row matches a row in the + and the candidate change row matches a row in the source to a row in the target_table_name, the WHEN clause is executed if the condition is absent or it evaluates to true. - Conversely, if the WHEN clause specifies - WHEN NOT MATCHED + If the WHEN clause specifies + WHEN NOT MATCHED BY SOURCE and the candidate change + row represents a row in the + target_table_name that does + not match a source row, the WHEN clause is executed + if the condition is + absent or it evaluates to true. + + + If the WHEN clause specifies + WHEN NOT MATCHED [BY TARGET] and the candidate change row does not match a row in the target_table_name, the WHEN clause is executed if the @@ -257,7 +281,10 @@ DELETE A condition on a WHEN MATCHED clause can refer to columns in both the source and the target relations. A condition on a - WHEN NOT MATCHED clause can only refer to columns from + WHEN NOT MATCHED BY SOURCE clause can only refer to + columns from the target relation, since by definition there is no matching + source row. A condition on a WHEN NOT MATCHED [BY TARGET] + clause can only refer to columns from the source relation, since by definition there is no matching target row. Only the system attributes from the target table are accessible. @@ -382,8 +409,10 @@ DELETE WHEN MATCHED clause, the expression can use values from the original row in the target table, and values from the data_source row. - If used in a WHEN NOT MATCHED clause, the - expression can use values from the data_source. + If used in a WHEN NOT MATCHED BY SOURCE clause, the + expression can only use values from the original row in the target table. + If used in a WHEN NOT MATCHED [BY TARGET] clause, the + expression can only use values from the data_source. @@ -452,8 +481,9 @@ MERGE tot - Evaluate whether each row is MATCHED or - NOT MATCHED. + Evaluate whether each row is MATCHED, + NOT MATCHED BY SOURCE, or + NOT MATCHED [BY TARGET]. @@ -528,7 +558,8 @@ MERGE tot If a WHEN clause omits an AND sub-clause,
Re: MERGE ... WHEN NOT MATCHED BY SOURCE
On 1/4/23 12:57, Alvaro Herrera wrote: I haven't read this patch other than superficially; I suppose the feature it's introducing is an OK one to have as an extension to the standard. (I hope the community members that are committee members will propose this extension to become part of the standard.) I have been doing some research on this, reading the original papers that introduced the feature and its improvements. I don't see anything that ever considered what this patch proposes, even though SQL Server has it. (The initial MERGE didn't even have DELETE!) SOURCE and TARGET are not currently keywords, but the only things that can come after MATCHED are THEN and AND, so I don't foresee any issues with us implementing this before the committee accepts such a change proposal. I also don't see how the committee could possibly change the semantics of this, and two implementations having it is a good argument for getting it in. We should be cautious in doing something differently from SQL Server here, and I would appreciate any differences being brought to my attention so I can incorporate them into a specification, even if that means resorting to the hated "implementation-defined". -- Vik Fearing
Re: MERGE ... WHEN NOT MATCHED BY SOURCE
On Sat, 21 Jan 2023 at 14:18, Ted Yu wrote: > > On Sat, Jan 21, 2023 at 3:05 AM Dean Rasheed wrote: >> >> Rebased version, following 8eba3e3f02 and 5d29d525ff. >> Another rebased version attached. > In transform_MERGE_to_join : > > + if (action->matchKind == > MERGE_WHEN_NOT_MATCHED_BY_SOURCE) > + tgt_only_tuples = true; > + if (action->matchKind == > MERGE_WHEN_NOT_MATCHED_BY_TARGET) > > There should be an `else` in front of the second `if`. > When tgt_only_tuples and src_only_tuples are both true, we can come out of > the loop. > I decided not to do that. Adding an "else" doesn't change the code that the compiler generates, and IMO it's slightly more readable without it, since it keeps the line length shorter, and the test conditions aligned, but that's a matter of opinion / personal preference. I think adding extra logic to exit the loop early if both tgt_only_tuples and src_only_tuples are true would be a premature optimisation, increasing the code size for no real benefit. In practice, there are unlikely to be more than a few merge actions in the list. Regards, Dean diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml new file mode 100644 index b87ad5c..1482ede --- a/doc/src/sgml/mvcc.sgml +++ b/doc/src/sgml/mvcc.sgml @@ -396,8 +396,8 @@ originally matched appears later in the list of actions. On the other hand, if the row is concurrently updated or deleted so that the join condition fails, then MERGE will -evaluate the condition's NOT MATCHED actions next, -and execute the first one that succeeds. +evaluate the condition's NOT MATCHED [BY TARGET] +actions next, and execute the first one that succeeds. If MERGE attempts an INSERT and a unique index is present and a duplicate row is concurrently inserted, then a uniqueness violation error is raised; diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml new file mode 100644 index 0995fe0..8ef121a --- a/doc/src/sgml/ref/merge.sgml +++ b/doc/src/sgml/ref/merge.sgml @@ -33,7 +33,8 @@ USING dat and when_clause is: { WHEN MATCHED [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } | - WHEN NOT MATCHED [ AND condition ] THEN { merge_insert | DO NOTHING } } + WHEN NOT MATCHED BY SOURCE [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } | + WHEN NOT MATCHED [BY TARGET] [ AND condition ] THEN { merge_insert | DO NOTHING } } and merge_insert is: @@ -70,7 +71,9 @@ DELETE from data_source to target_table_name producing zero or more candidate change rows. For each candidate change - row, the status of MATCHED or NOT MATCHED + row, the status of MATCHED, + NOT MATCHED BY SOURCE, + or NOT MATCHED [BY TARGET] is set just once, after which WHEN clauses are evaluated in the order specified. For each candidate change row, the first clause to evaluate as true is executed. No more than one WHEN @@ -226,16 +229,37 @@ DELETE At least one WHEN clause is required. + The WHEN clause may specify WHEN MATCHED, + WHEN NOT MATCHED BY SOURCE, or + WHEN NOT MATCHED [BY TARGET]. + Note that the SQL standard only defines + WHEN MATCHED and WHEN NOT MATCHED + (which is defined to mean no matching target row). + WHEN NOT MATCHED BY SOURCE is an extension to the + SQL standard, as is the option to append + BY TARGET to WHEN NOT MATCHED, to + make its meaning more explicit. + + If the WHEN clause specifies WHEN MATCHED - and the candidate change row matches a row in the + and the candidate change row matches a row in the source to a row in the target_table_name, the WHEN clause is executed if the condition is absent or it evaluates to true. - Conversely, if the WHEN clause specifies - WHEN NOT MATCHED + If the WHEN clause specifies + WHEN NOT MATCHED BY SOURCE and the candidate change + row represents a row in the + target_table_name that does + not match a source row, the WHEN clause is executed + if the condition is + absent or it evaluates to true. + + + If the WHEN clause specifies + WHEN NOT MATCHED [BY TARGET] and the candidate change row does not match a row in the target_table_name, the WHEN clause is executed if the @@ -257,7 +281,10 @@ DELETE A condition on a WHEN MATCHED clause can refer to columns in both the source and the target relations. A condition on a - WHEN NOT MATCHED clause can only refer to columns from + WHEN NOT MATCHED BY SOURCE clause can only refer to + columns from the target relation, since by definition there is no matching + source row. A condition on a WHEN NOT MATCHED [BY TARGET] + clause can only refer to columns from the source relation, since by defi
Re: MERGE ... WHEN NOT MATCHED BY SOURCE
On Sat, Jan 21, 2023 at 3:05 AM Dean Rasheed wrote: > On Tue, 10 Jan 2023 at 14:43, Dean Rasheed > wrote: > > > > Rebased version attached. > > > > Rebased version, following 8eba3e3f02 and 5d29d525ff. > > Regards, > Dean > Hi, In transform_MERGE_to_join : + if (action->matchKind == MERGE_WHEN_NOT_MATCHED_BY_SOURCE) + tgt_only_tuples = true; + if (action->matchKind == MERGE_WHEN_NOT_MATCHED_BY_TARGET) There should be an `else` in front of the second `if`. When tgt_only_tuples and src_only_tuples are both true, we can come out of the loop. Cheers
Re: MERGE ... WHEN NOT MATCHED BY SOURCE
On Tue, 10 Jan 2023 at 14:43, Dean Rasheed wrote: > > Rebased version attached. > Rebased version, following 8eba3e3f02 and 5d29d525ff. Regards, Dean diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml new file mode 100644 index b87ad5c..1482ede --- a/doc/src/sgml/mvcc.sgml +++ b/doc/src/sgml/mvcc.sgml @@ -396,8 +396,8 @@ originally matched appears later in the list of actions. On the other hand, if the row is concurrently updated or deleted so that the join condition fails, then MERGE will -evaluate the condition's NOT MATCHED actions next, -and execute the first one that succeeds. +evaluate the condition's NOT MATCHED [BY TARGET] +actions next, and execute the first one that succeeds. If MERGE attempts an INSERT and a unique index is present and a duplicate row is concurrently inserted, then a uniqueness violation error is raised; diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml new file mode 100644 index 0995fe0..8ef121a --- a/doc/src/sgml/ref/merge.sgml +++ b/doc/src/sgml/ref/merge.sgml @@ -33,7 +33,8 @@ USING dat and when_clause is: { WHEN MATCHED [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } | - WHEN NOT MATCHED [ AND condition ] THEN { merge_insert | DO NOTHING } } + WHEN NOT MATCHED BY SOURCE [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } | + WHEN NOT MATCHED [BY TARGET] [ AND condition ] THEN { merge_insert | DO NOTHING } } and merge_insert is: @@ -70,7 +71,9 @@ DELETE from data_source to target_table_name producing zero or more candidate change rows. For each candidate change - row, the status of MATCHED or NOT MATCHED + row, the status of MATCHED, + NOT MATCHED BY SOURCE, + or NOT MATCHED [BY TARGET] is set just once, after which WHEN clauses are evaluated in the order specified. For each candidate change row, the first clause to evaluate as true is executed. No more than one WHEN @@ -226,16 +229,37 @@ DELETE At least one WHEN clause is required. + The WHEN clause may specify WHEN MATCHED, + WHEN NOT MATCHED BY SOURCE, or + WHEN NOT MATCHED [BY TARGET]. + Note that the SQL standard only defines + WHEN MATCHED and WHEN NOT MATCHED + (which is defined to mean no matching target row). + WHEN NOT MATCHED BY SOURCE is an extension to the + SQL standard, as is the option to append + BY TARGET to WHEN NOT MATCHED, to + make its meaning more explicit. + + If the WHEN clause specifies WHEN MATCHED - and the candidate change row matches a row in the + and the candidate change row matches a row in the source to a row in the target_table_name, the WHEN clause is executed if the condition is absent or it evaluates to true. - Conversely, if the WHEN clause specifies - WHEN NOT MATCHED + If the WHEN clause specifies + WHEN NOT MATCHED BY SOURCE and the candidate change + row represents a row in the + target_table_name that does + not match a source row, the WHEN clause is executed + if the condition is + absent or it evaluates to true. + + + If the WHEN clause specifies + WHEN NOT MATCHED [BY TARGET] and the candidate change row does not match a row in the target_table_name, the WHEN clause is executed if the @@ -257,7 +281,10 @@ DELETE A condition on a WHEN MATCHED clause can refer to columns in both the source and the target relations. A condition on a - WHEN NOT MATCHED clause can only refer to columns from + WHEN NOT MATCHED BY SOURCE clause can only refer to + columns from the target relation, since by definition there is no matching + source row. A condition on a WHEN NOT MATCHED [BY TARGET] + clause can only refer to columns from the source relation, since by definition there is no matching target row. Only the system attributes from the target table are accessible. @@ -382,8 +409,10 @@ DELETE WHEN MATCHED clause, the expression can use values from the original row in the target table, and values from the data_source row. - If used in a WHEN NOT MATCHED clause, the - expression can use values from the data_source. + If used in a WHEN NOT MATCHED BY SOURCE clause, the + expression can only use values from the original row in the target table. + If used in a WHEN NOT MATCHED [BY TARGET] clause, the + expression can only use values from the data_source. @@ -452,8 +481,9 @@ MERGE tot - Evaluate whether each row is MATCHED or - NOT MATCHED. + Evaluate whether each row is MATCHED, + NOT MATCHED BY SOURCE, or + NOT MATCHED [BY TARGET]. @@ -528,7 +558,8 @@ MERGE tot If a WHEN c
Re: MERGE ... WHEN NOT MATCHED BY SOURCE
Rebased version attached. Regards, Dean diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml new file mode 100644 index b87ad5c..1482ede --- a/doc/src/sgml/mvcc.sgml +++ b/doc/src/sgml/mvcc.sgml @@ -396,8 +396,8 @@ originally matched appears later in the list of actions. On the other hand, if the row is concurrently updated or deleted so that the join condition fails, then MERGE will -evaluate the condition's NOT MATCHED actions next, -and execute the first one that succeeds. +evaluate the condition's NOT MATCHED [BY TARGET] +actions next, and execute the first one that succeeds. If MERGE attempts an INSERT and a unique index is present and a duplicate row is concurrently inserted, then a uniqueness violation error is raised; diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml new file mode 100644 index 0995fe0..8ef121a --- a/doc/src/sgml/ref/merge.sgml +++ b/doc/src/sgml/ref/merge.sgml @@ -33,7 +33,8 @@ USING dat and when_clause is: { WHEN MATCHED [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } | - WHEN NOT MATCHED [ AND condition ] THEN { merge_insert | DO NOTHING } } + WHEN NOT MATCHED BY SOURCE [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } | + WHEN NOT MATCHED [BY TARGET] [ AND condition ] THEN { merge_insert | DO NOTHING } } and merge_insert is: @@ -70,7 +71,9 @@ DELETE from data_source to target_table_name producing zero or more candidate change rows. For each candidate change - row, the status of MATCHED or NOT MATCHED + row, the status of MATCHED, + NOT MATCHED BY SOURCE, + or NOT MATCHED [BY TARGET] is set just once, after which WHEN clauses are evaluated in the order specified. For each candidate change row, the first clause to evaluate as true is executed. No more than one WHEN @@ -226,16 +229,37 @@ DELETE At least one WHEN clause is required. + The WHEN clause may specify WHEN MATCHED, + WHEN NOT MATCHED BY SOURCE, or + WHEN NOT MATCHED [BY TARGET]. + Note that the SQL standard only defines + WHEN MATCHED and WHEN NOT MATCHED + (which is defined to mean no matching target row). + WHEN NOT MATCHED BY SOURCE is an extension to the + SQL standard, as is the option to append + BY TARGET to WHEN NOT MATCHED, to + make its meaning more explicit. + + If the WHEN clause specifies WHEN MATCHED - and the candidate change row matches a row in the + and the candidate change row matches a row in the source to a row in the target_table_name, the WHEN clause is executed if the condition is absent or it evaluates to true. - Conversely, if the WHEN clause specifies - WHEN NOT MATCHED + If the WHEN clause specifies + WHEN NOT MATCHED BY SOURCE and the candidate change + row represents a row in the + target_table_name that does + not match a source row, the WHEN clause is executed + if the condition is + absent or it evaluates to true. + + + If the WHEN clause specifies + WHEN NOT MATCHED [BY TARGET] and the candidate change row does not match a row in the target_table_name, the WHEN clause is executed if the @@ -257,7 +281,10 @@ DELETE A condition on a WHEN MATCHED clause can refer to columns in both the source and the target relations. A condition on a - WHEN NOT MATCHED clause can only refer to columns from + WHEN NOT MATCHED BY SOURCE clause can only refer to + columns from the target relation, since by definition there is no matching + source row. A condition on a WHEN NOT MATCHED [BY TARGET] + clause can only refer to columns from the source relation, since by definition there is no matching target row. Only the system attributes from the target table are accessible. @@ -382,8 +409,10 @@ DELETE WHEN MATCHED clause, the expression can use values from the original row in the target table, and values from the data_source row. - If used in a WHEN NOT MATCHED clause, the - expression can use values from the data_source. + If used in a WHEN NOT MATCHED BY SOURCE clause, the + expression can only use values from the original row in the target table. + If used in a WHEN NOT MATCHED [BY TARGET] clause, the + expression can only use values from the data_source. @@ -452,8 +481,9 @@ MERGE tot - Evaluate whether each row is MATCHED or - NOT MATCHED. + Evaluate whether each row is MATCHED, + NOT MATCHED BY SOURCE, or + NOT MATCHED [BY TARGET]. @@ -528,7 +558,8 @@ MERGE tot If a WHEN clause omits an AND sub-clause, it becomes the final reachable clause of that - kind (MATCHED or NOT MATCHE
Re: MERGE ... WHEN NOT MATCHED BY SOURCE
On Thu, 5 Jan 2023 at 13:21, Dean Rasheed wrote: > > On Thu, 5 Jan 2023 at 11:03, Alvaro Herrera wrote: > > > > > + /* Join type required */ > > > + if (left_join && right_join) > > > + qry->mergeJoinType = JOIN_FULL; > > > + else if (left_join) > > > + qry->mergeJoinType = JOIN_LEFT; > > > + else if (right_join) > > > + qry->mergeJoinType = JOIN_RIGHT; > > > + else > > > + qry->mergeJoinType = JOIN_INNER; > > > > One of the review comments that MERGE got initially was that parse > > analysis was not a place to "do query optimization", in the sense that > > the original code was making a decision whether to make an outer or > > inner join based on the set of WHEN clauses that appear in the command. > > That's how we ended up with transform_MERGE_to_join and > > mergeUseOuterJoin instead. This new code is certainly not the same, but > > it makes me a bit unconfortable. Maybe it's OK, though. > > > > Yeah I agree, it's a bit ugly. Perhaps a better solution would be to > do away with that field entirely and just make the decision in > transform_MERGE_to_join() by examining the action list again. > Attached is an updated patch taking that approach, allowing mergeUseOuterJoin to be removed from the Query node, which I think is probably a good thing. Aside from that, it includes a few additional comment updates in the executor that I'd missed, and psql tab completion support. Regards, Dean diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml new file mode 100644 index b87ad5c..1482ede --- a/doc/src/sgml/mvcc.sgml +++ b/doc/src/sgml/mvcc.sgml @@ -396,8 +396,8 @@ originally matched appears later in the list of actions. On the other hand, if the row is concurrently updated or deleted so that the join condition fails, then MERGE will -evaluate the condition's NOT MATCHED actions next, -and execute the first one that succeeds. +evaluate the condition's NOT MATCHED [BY TARGET] +actions next, and execute the first one that succeeds. If MERGE attempts an INSERT and a unique index is present and a duplicate row is concurrently inserted, then a uniqueness violation error is raised; diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml new file mode 100644 index 0995fe0..8ef121a --- a/doc/src/sgml/ref/merge.sgml +++ b/doc/src/sgml/ref/merge.sgml @@ -33,7 +33,8 @@ USING dat and when_clause is: { WHEN MATCHED [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } | - WHEN NOT MATCHED [ AND condition ] THEN { merge_insert | DO NOTHING } } + WHEN NOT MATCHED BY SOURCE [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } | + WHEN NOT MATCHED [BY TARGET] [ AND condition ] THEN { merge_insert | DO NOTHING } } and merge_insert is: @@ -70,7 +71,9 @@ DELETE from data_source to target_table_name producing zero or more candidate change rows. For each candidate change - row, the status of MATCHED or NOT MATCHED + row, the status of MATCHED, + NOT MATCHED BY SOURCE, + or NOT MATCHED [BY TARGET] is set just once, after which WHEN clauses are evaluated in the order specified. For each candidate change row, the first clause to evaluate as true is executed. No more than one WHEN @@ -226,16 +229,37 @@ DELETE At least one WHEN clause is required. + The WHEN clause may specify WHEN MATCHED, + WHEN NOT MATCHED BY SOURCE, or + WHEN NOT MATCHED [BY TARGET]. + Note that the SQL standard only defines + WHEN MATCHED and WHEN NOT MATCHED + (which is defined to mean no matching target row). + WHEN NOT MATCHED BY SOURCE is an extension to the + SQL standard, as is the option to append + BY TARGET to WHEN NOT MATCHED, to + make its meaning more explicit. + + If the WHEN clause specifies WHEN MATCHED - and the candidate change row matches a row in the + and the candidate change row matches a row in the source to a row in the target_table_name, the WHEN clause is executed if the condition is absent or it evaluates to true. - Conversely, if the WHEN clause specifies - WHEN NOT MATCHED + If the WHEN clause specifies + WHEN NOT MATCHED BY SOURCE and the candidate change + row represents a row in the + target_table_name that does + not match a source row, the WHEN clause is executed + if the condition is + absent or it evaluates to true. + + + If the WHEN clause specifies + WHEN NOT MATCHED [BY TARGET] and the candidate change row does not match a row in the target_table_name, the WHEN clause is executed if the @@ -257,7 +281,10 @@ DELETE A condition on a WHEN MATCHED clause can refer to columns in both the source and the target relations. A condition on a - WHEN NOT MATCHED clause can only refe
Re: MERGE ... WHEN NOT MATCHED BY SOURCE
On Thu, 5 Jan 2023 at 11:03, Alvaro Herrera wrote: > > I haven't read this patch other than superficially; I suppose the > feature it's introducing is an OK one to have as an extension to the > standard. (I hope the community members that are committee members > will propose this extension to become part of the standard.) > Thanks for looking! > > --- a/src/backend/optimizer/prep/preptlist.c > > +++ b/src/backend/optimizer/prep/preptlist.c > > @@ -157,15 +157,14 @@ preprocess_targetlist(PlannerInfo *root) > > /* > >* Add resjunk entries for any Vars used in each > > action's > >* targetlist and WHEN condition that belong to > > relations other > > - * than target. Note that aggregates, window > > functions and > > - * placeholder vars are not possible anywhere in > > MERGE's WHEN > > - * clauses. (PHVs may be added later, but they don't > > concern us > > - * here.) > > + * than target. Note that aggregates and window > > functions are not > > + * possible anywhere in MERGE's WHEN clauses, but > > PlaceHolderVars > > + * may have been added by subquery pullup. > >*/ > > vars = pull_var_clause((Node *) > > > > list_concat_copy((List *) action->qual, > > > > action->targetList), > > -0); > > + > > PVC_INCLUDE_PLACEHOLDERS); > > Hmm, is this new because of NOT MATCHED BY SOURCE, or is it something > that can already be hit by existing features of MERGE? In other words > -- is this a bug fix that should be backpatched ahead of introducing NOT > MATCHED BY SOURCE? > It's new because of NOT MATCHED BY SOURCE, and I also found that I had to make the same change in the MERGE INTO view patch, in the case where the target view is simple enough to allow subquery pullup, but also had INSTEAD OF triggers causing the pullup to happen in the planner rather than the rewriter. I couldn't think of a way that it could happen with the existing MERGE code though, so I don't think it's a bug that needs fixing and back-patching. > > @@ -127,10 +143,12 @@ transformMergeStmt(ParseState *pstate, M > >*/ > > is_terminal[0] = false; > > is_terminal[1] = false; > > + is_terminal[2] = false; > > I think these 0/1/2 should be replaced by the values of MergeMatchKind. > Agreed. > > + /* Join type required */ > > + if (left_join && right_join) > > + qry->mergeJoinType = JOIN_FULL; > > + else if (left_join) > > + qry->mergeJoinType = JOIN_LEFT; > > + else if (right_join) > > + qry->mergeJoinType = JOIN_RIGHT; > > + else > > + qry->mergeJoinType = JOIN_INNER; > > One of the review comments that MERGE got initially was that parse > analysis was not a place to "do query optimization", in the sense that > the original code was making a decision whether to make an outer or > inner join based on the set of WHEN clauses that appear in the command. > That's how we ended up with transform_MERGE_to_join and > mergeUseOuterJoin instead. This new code is certainly not the same, but > it makes me a bit unconfortable. Maybe it's OK, though. > Yeah I agree, it's a bit ugly. Perhaps a better solution would be to do away with that field entirely and just make the decision in transform_MERGE_to_join() by examining the action list again. That would require making MergeAction's "matched" field a MergeMatchKind rather than a bool, but maybe that's not so bad, since retaining that information might prove useful one day. Regards, Dean
Re: MERGE ... WHEN NOT MATCHED BY SOURCE
I haven't read this patch other than superficially; I suppose the feature it's introducing is an OK one to have as an extension to the standard. (I hope the community members that are committee members will propose this extension to become part of the standard.) On 2023-Jan-02, Dean Rasheed wrote: > --- a/src/backend/optimizer/prep/preptlist.c > +++ b/src/backend/optimizer/prep/preptlist.c > @@ -157,15 +157,14 @@ preprocess_targetlist(PlannerInfo *root) > /* >* Add resjunk entries for any Vars used in each > action's >* targetlist and WHEN condition that belong to > relations other > - * than target. Note that aggregates, window functions > and > - * placeholder vars are not possible anywhere in > MERGE's WHEN > - * clauses. (PHVs may be added later, but they don't > concern us > - * here.) > + * than target. Note that aggregates and window > functions are not > + * possible anywhere in MERGE's WHEN clauses, but > PlaceHolderVars > + * may have been added by subquery pullup. >*/ > vars = pull_var_clause((Node *) > > list_concat_copy((List *) action->qual, > > action->targetList), > -0); > + > PVC_INCLUDE_PLACEHOLDERS); Hmm, is this new because of NOT MATCHED BY SOURCE, or is it something that can already be hit by existing features of MERGE? In other words -- is this a bug fix that should be backpatched ahead of introducing NOT MATCHED BY SOURCE? > @@ -127,10 +143,12 @@ transformMergeStmt(ParseState *pstate, M >*/ > is_terminal[0] = false; > is_terminal[1] = false; > + is_terminal[2] = false; I think these 0/1/2 should be replaced by the values of MergeMatchKind. > + /* Join type required */ > + if (left_join && right_join) > + qry->mergeJoinType = JOIN_FULL; > + else if (left_join) > + qry->mergeJoinType = JOIN_LEFT; > + else if (right_join) > + qry->mergeJoinType = JOIN_RIGHT; > + else > + qry->mergeJoinType = JOIN_INNER; One of the review comments that MERGE got initially was that parse analysis was not a place to "do query optimization", in the sense that the original code was making a decision whether to make an outer or inner join based on the set of WHEN clauses that appear in the command. That's how we ended up with transform_MERGE_to_join and mergeUseOuterJoin instead. This new code is certainly not the same, but it makes me a bit unconfortable. Maybe it's OK, though. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
Re: MERGE ... WHEN NOT MATCHED BY SOURCE
On Fri, 30 Dec 2022 at 16:56, Dean Rasheed wrote: > > Attached is a WIP patch. > Updated patch attached, now with updated docs and some other minor tidying up. Regards, Dean diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml new file mode 100644 index b87ad5c..1482ede --- a/doc/src/sgml/mvcc.sgml +++ b/doc/src/sgml/mvcc.sgml @@ -396,8 +396,8 @@ originally matched appears later in the list of actions. On the other hand, if the row is concurrently updated or deleted so that the join condition fails, then MERGE will -evaluate the condition's NOT MATCHED actions next, -and execute the first one that succeeds. +evaluate the condition's NOT MATCHED [BY TARGET] +actions next, and execute the first one that succeeds. If MERGE attempts an INSERT and a unique index is present and a duplicate row is concurrently inserted, then a uniqueness violation error is raised; diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml new file mode 100644 index 0995fe0..8ef121a --- a/doc/src/sgml/ref/merge.sgml +++ b/doc/src/sgml/ref/merge.sgml @@ -33,7 +33,8 @@ USING dat and when_clause is: { WHEN MATCHED [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } | - WHEN NOT MATCHED [ AND condition ] THEN { merge_insert | DO NOTHING } } + WHEN NOT MATCHED BY SOURCE [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } | + WHEN NOT MATCHED [BY TARGET] [ AND condition ] THEN { merge_insert | DO NOTHING } } and merge_insert is: @@ -70,7 +71,9 @@ DELETE from data_source to target_table_name producing zero or more candidate change rows. For each candidate change - row, the status of MATCHED or NOT MATCHED + row, the status of MATCHED, + NOT MATCHED BY SOURCE, + or NOT MATCHED [BY TARGET] is set just once, after which WHEN clauses are evaluated in the order specified. For each candidate change row, the first clause to evaluate as true is executed. No more than one WHEN @@ -226,16 +229,37 @@ DELETE At least one WHEN clause is required. + The WHEN clause may specify WHEN MATCHED, + WHEN NOT MATCHED BY SOURCE, or + WHEN NOT MATCHED [BY TARGET]. + Note that the SQL standard only defines + WHEN MATCHED and WHEN NOT MATCHED + (which is defined to mean no matching target row). + WHEN NOT MATCHED BY SOURCE is an extension to the + SQL standard, as is the option to append + BY TARGET to WHEN NOT MATCHED, to + make its meaning more explicit. + + If the WHEN clause specifies WHEN MATCHED - and the candidate change row matches a row in the + and the candidate change row matches a row in the source to a row in the target_table_name, the WHEN clause is executed if the condition is absent or it evaluates to true. - Conversely, if the WHEN clause specifies - WHEN NOT MATCHED + If the WHEN clause specifies + WHEN NOT MATCHED BY SOURCE and the candidate change + row represents a row in the + target_table_name that does + not match a source row, the WHEN clause is executed + if the condition is + absent or it evaluates to true. + + + If the WHEN clause specifies + WHEN NOT MATCHED [BY TARGET] and the candidate change row does not match a row in the target_table_name, the WHEN clause is executed if the @@ -257,7 +281,10 @@ DELETE A condition on a WHEN MATCHED clause can refer to columns in both the source and the target relations. A condition on a - WHEN NOT MATCHED clause can only refer to columns from + WHEN NOT MATCHED BY SOURCE clause can only refer to + columns from the target relation, since by definition there is no matching + source row. A condition on a WHEN NOT MATCHED [BY TARGET] + clause can only refer to columns from the source relation, since by definition there is no matching target row. Only the system attributes from the target table are accessible. @@ -382,8 +409,10 @@ DELETE WHEN MATCHED clause, the expression can use values from the original row in the target table, and values from the data_source row. - If used in a WHEN NOT MATCHED clause, the - expression can use values from the data_source. + If used in a WHEN NOT MATCHED BY SOURCE clause, the + expression can only use values from the original row in the target table. + If used in a WHEN NOT MATCHED [BY TARGET] clause, the + expression can only use values from the data_source. @@ -452,8 +481,9 @@ MERGE tot - Evaluate whether each row is MATCHED or - NOT MATCHED. + Evaluate whether each row is MATCHED, + NOT MATCHED BY SOURCE, or + NOT MATCHED [BY TARGET]. @@ -528,7 +558,8 @@ MERGE
MERGE ... WHEN NOT MATCHED BY SOURCE
This allows MERGE to UPDATE or DELETE target rows where there is no matching source row. In addition, it allows the existing "WHEN NOT MATCHED" syntax to include an optional "BY TARGET" to make its meaning more explicit. E.g., MERGE INTO tgt USING src ON ... WHEN NOT MATCHED BY SOURCE THEN UPDATE/DELETE ... WHEN NOT MATCHED BY TARGET THEN INSERT ... AFAIK, this is not part of the standard (though I only have a very old draft copy). It is supported by at least 2 other major DB vendors though, and I think it usefully rounds off the set of possible MERGE actions. Attached is a WIP patch. I haven't updated the docs yet, and there are probably a few other things to tidy up and test, but the basic functionality is there. Regards, Dean diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c new file mode 100644 index 4cec12a..ed538e5 --- a/src/backend/optimizer/prep/prepjointree.c +++ b/src/backend/optimizer/prep/prepjointree.c @@ -142,7 +142,6 @@ transform_MERGE_to_join(Query *parse) { RangeTblEntry *joinrte; JoinExpr *joinexpr; - JoinType jointype; int joinrti; List *vars; @@ -152,20 +151,10 @@ transform_MERGE_to_join(Query *parse) /* XXX probably bogus */ vars = NIL; - /* - * When any WHEN NOT MATCHED THEN INSERT clauses exist, we need to use an - * outer join so that we process all unmatched tuples from the source - * relation. If none exist, we can use an inner join. - */ - if (parse->mergeUseOuterJoin) - jointype = JOIN_RIGHT; - else - jointype = JOIN_INNER; - /* Manufacture a join RTE to use. */ joinrte = makeNode(RangeTblEntry); joinrte->rtekind = RTE_JOIN; - joinrte->jointype = jointype; + joinrte->jointype = parse->mergeJoinType; joinrte->joinmergedcols = 0; joinrte->joinaliasvars = vars; joinrte->joinleftcols = NIL; /* MERGE does not allow JOIN USING */ @@ -189,7 +178,7 @@ transform_MERGE_to_join(Query *parse) * Create a JOIN between the target and the source relation. */ joinexpr = makeNode(JoinExpr); - joinexpr->jointype = jointype; + joinexpr->jointype = parse->mergeJoinType; joinexpr->isNatural = false; joinexpr->larg = (Node *) makeNode(RangeTblRef); ((RangeTblRef *) joinexpr->larg)->rtindex = parse->resultRelation; diff --git a/src/backend/optimizer/prep/preptlist.c b/src/backend/optimizer/prep/preptlist.c new file mode 100644 index 137b283..1a4e13e --- a/src/backend/optimizer/prep/preptlist.c +++ b/src/backend/optimizer/prep/preptlist.c @@ -157,15 +157,14 @@ preprocess_targetlist(PlannerInfo *root) /* * Add resjunk entries for any Vars used in each action's * targetlist and WHEN condition that belong to relations other - * than target. Note that aggregates, window functions and - * placeholder vars are not possible anywhere in MERGE's WHEN - * clauses. (PHVs may be added later, but they don't concern us - * here.) + * than target. Note that aggregates and window functions are not + * possible anywhere in MERGE's WHEN clauses, but PlaceHolderVars + * may have been added by subquery pullup. */ vars = pull_var_clause((Node *) list_concat_copy((List *) action->qual, action->targetList), - 0); + PVC_INCLUDE_PLACEHOLDERS); foreach(l2, vars) { Var *var = (Var *) lfirst(l2); diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y new file mode 100644 index 63b4baa..95531ab --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -273,6 +273,7 @@ static Node *makeRecursiveViewSelect(cha struct SelectLimit *selectlimit; SetQuantifier setquantifier; struct GroupClause *groupclause; + MergeMatchKind mergematch; MergeWhenClause *mergewhen; struct KeyActions *keyactions; struct KeyAction *keyaction; @@ -512,6 +513,7 @@ static Node *makeRecursiveViewSelect(cha %type opt_on_conflict %type merge_insert merge_update merge_delete +%type merge_when_tgt_matched merge_when_tgt_not_matched %type merge_when_clause opt_merge_when_condition %type merge_when_list @@ -740,11 +742,11 @@ static Node *makeRecursiveViewSelect(cha SAVEPOINT SCHEMA SCHEMAS SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES SERIALIZABLE SERVER SESSION SESSION_USER SET SETS SETOF SHARE SHOW - SIMILAR SIMPLE SKIP SMALLINT SNAPSHOT SOME SQL_P STABLE STANDALONE_P + SIMILAR SIMPLE SKIP SMALLINT SNAPSHOT SOME SOURCE SQL_P STABLE STANDALONE_P START STATEMENT STATISTICS STDIN STDOUT STORAGE STORED STRICT_P STRIP_P SUBSCRIPTION SUBSTRING SUPPORT SYMMETRIC SYSID SYSTEM_P SYSTEM_USER - TABLE TABLES TABLESAMPLE TABLESPACE TEMP TEMPLATE TEMPORARY TEXT_P THEN + TABLE TABLES TABLESAMPLE TABLESPACE TARGET TEMP TEMPLATE TEMPORARY TEXT_P THEN TIES TIME TIMESTAMP TO TRAILING TRANSACTION TRANSFORM TREAT TRIGGER TRIM TRUE_P TRUNCATE TRUSTED TYPE_P TYPES_P @@ -12260,49 +12262,59 @@ merge_when_list: ; merge_when_clause: - WHEN MATCHED opt_merge_when_condition THEN merge_update