Minor fixes for couple some comments around MERGE RETURNING

2024-05-18 Thread David Rowley
I noticed that PlannedStmt.hasReturning and hasModifyingCTE have an
outdated comment now that MERGE supports RETURNING (per commit
c649fa24a)

i.e. these two:

> bool hasReturning; /* is it insert|update|delete RETURNING? */

> bool hasModifyingCTE; /* has insert|update|delete in WITH? */

transformWithClause() has:

/* must be a data-modifying statement */
Assert(IsA(cte->ctequery, InsertStmt) ||
   IsA(cte->ctequery, UpdateStmt) ||
   IsA(cte->ctequery, DeleteStmt) ||
   IsA(cte->ctequery, MergeStmt));

pstate->p_hasModifyingCTE = true;

which eventually makes it into PlannedStmt.hasModifyingCTE.

The attached trivial patch fixes these.

David


merge_returning_comments.patch
Description: Binary data


Re: MERGE ... RETURNING

2024-03-18 Thread Dean Rasheed
On Fri, 15 Mar 2024 at 17:14, Jeff Davis  wrote:
>
> On Fri, 2024-03-15 at 11:20 +, Dean Rasheed wrote:
>
> > So barring any further objections, I'd like to go ahead and get this
> > patch committed.
>
> I like this feature from a user perspective. So +1 from me.
>

I have committed this. Thanks for all the feedback everyone.

Regards,
Dean




Re: MERGE ... RETURNING

2024-03-15 Thread Jeff Davis
On Fri, 2024-03-15 at 11:20 +, Dean Rasheed wrote:
> To recap, this adds support for a single RETURNING list at the end of
> a MERGE command, and a special MERGE_ACTION() function that may be
> used in the RETURNING list to return the action command string
> ('INSERT', 'UPDATE', or 'DELETE') that was executed.

...

> So barring any further objections, I'd like to go ahead and get this
> patch committed.

All of my concerns have been extensively discussed and it seems like
they are just the cost of having a good feature. Thank you for going
through so many alternative approaches, I think the one you've arrived
at is consistent with what Vik endorsed[1].

The MERGE_ACTION keyword is added to the 'col_name_keyword' and the
'bare_label_keyword' lists. That has some annoying effects, like:

   CREATE FUNCTION merge_action() RETURNS TEXT
  LANGUAGE SQL AS $$ SELECT 'asdf'; $$;
   ERROR:  syntax error at or near "("
   LINE 1: CREATE FUNCTION merge_action() RETURNS TEXT

I didn't see any affirmative endorsement of exactly how the keyword is
implemented, but that patch has been around for a while, and I didn't
see any objection, either.

I like this feature from a user perspective. So +1 from me.

Regards,
Jeff Davis

[1]
https://www.postgresql.org/message-id/7db39b45-821f-4894-ada9-c19570b11...@postgresfriends.org




Re: MERGE ... RETURNING

2024-03-15 Thread Dean Rasheed
On Fri, 15 Mar 2024 at 11:06, Dean Rasheed  wrote:
>
> Updated patch attached.
>

I have gone over this patch again in detail, and I believe that the
code is in good shape. All review comments have been addressed, and
the only thing remaining is the syntax question.

To recap, this adds support for a single RETURNING list at the end of
a MERGE command, and a special MERGE_ACTION() function that may be
used in the RETURNING list to return the action command string
('INSERT', 'UPDATE', or 'DELETE') that was executed.

Looking for similar precedents in other databases, SQL Server uses a
slightly different (non-standard) syntax for MERGE, and uses "OUTPUT"
instead of "RETURNING" to return rows. But it does allow "$action" in
the output list, which is functionally equivalent to MERGE_ACTION():

https://learn.microsoft.com/en-us/sql/t-sql/statements/merge-transact-sql?view=sql-server-ver16#output_clause

In the future, we may choose to support the SQL standard syntax for
returning rows modified by INSERT, UPDATE, DELETE, and MERGE commands,
but I don't think that this patch needs to do that.

What this patch does is to make MERGE more consistent with INSERT,
UPDATE, and DELETE, by allowing RETURNING. And if the patch to add
support for returning OLD/NEW values [1] makes it in too, it will be
more powerful than the SQL standard syntax, since it will allow both
old and new values to be returned at the same time, in arbitrary
expressions.

So barring any further objections, I'd like to go ahead and get this
patch committed.

Regards,
Dean

[1] 
https://www.postgresql.org/message-id/flat/CAEZATCWx0J0-v=Qjc6gXzR=KtsdvAE7Ow=D=mu50agoe+pv...@mail.gmail.com




Re: MERGE ... RETURNING

2024-03-14 Thread Alvaro Herrera
On 2024-Mar-13, Dean Rasheed wrote:

> On Wed, 13 Mar 2024 at 06:44, jian he  wrote:
> >
> > 
> > [ WITH with_query [, ...] ]
> > MERGE INTO [ ONLY ]  >
> > here the "WITH" part should have "[ RECURSIVE ]"
> 
> Actually, no. MERGE doesn't support WITH RECURSIVE.
> 
> It's not entirely clear to me why though. I did a quick test, removing
> that restriction in the parse analysis code, and it seemed to work
> fine. Alvaro, do you remember why that restriction is there?

There's no real reason for it, other than I didn't want to have to think
it through; I did suspect that it might Just Work, but I felt I would
have had to come up with more nontrivial test cases than I wanted to
write at the time.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"People get annoyed when you try to debug them."  (Larry Wall)




Re: MERGE ... RETURNING

2024-03-13 Thread jian he
Hi
mainly document issues. Other than that, it looks good!

MERGE not supported in COPY
MERGE not supported in WITH query
These entries in src/backend/po.* need to be deleted if this patch is
committed?
--
  
   RETURNING
  

  
   INSERT
   RETURNING
  

  
   UPDATE
   RETURNING
  

  
   DELETE
   RETURNING
  

  
   MERGE
   RETURNING
  

in doc/src/sgml/dml.sgml, what is the point of these?
It is not rendered in the html file, deleting it still generates all the
html file.
--
The following part is about doc/src/sgml/plpgsql.sgml.


 The query used in this type of
FOR
 statement can be any SQL command that returns rows to the caller:
 SELECT is the most common case,
 but you can also use INSERT,
UPDATE, or
 DELETE with a RETURNING clause.
Some utility
 commands such as EXPLAIN will work too.

here we need to add MERGE?


   
Row-level triggers fired BEFORE can return null to
signal the
trigger manager to skip the rest of the operation for this row
(i.e., subsequent triggers are not fired, and the

INSERT/UPDATE/DELETE
does not occur
for this row).  If a nonnull
here we need to add MERGE?


   
Variable substitution currently works only in SELECT,
INSERT, UPDATE,
DELETE, and commands containing one of
these (such as EXPLAIN and CREATE TABLE
... AS SELECT),
because the main SQL engine allows query parameters only in these
commands.  To use a non-constant name or value in other statement
types (generically called utility statements), you must construct
the utility statement as a string and EXECUTE it.
   
here we need to add MERGE?
demo:
CREATE OR REPlACE FUNCTION stamp_user2(id int, comment text) RETURNS void
AS $$
<>
DECLARE
curtime timestamp := now();
BEGIN
MERGE INTO users
USING (SELECT 1)
ON true
WHEN MATCHED and (users.id = stamp_user2.id) THEN
  update SET last_modified = fn.curtime, comment =
stamp_user2.comment;
raise notice 'test';
END;
$$ LANGUAGE plpgsql;


INSTEAD OF triggers (which are always row-level
triggers,
and may only be used on views) can return null to signal that they did
not perform any updates, and that the rest of the operation for this
row should be skipped (i.e., subsequent triggers are not fired, and the
row is not counted in the rows-affected status for the surrounding

INSERT/UPDATE/DELETE).
I am not sure we need to add MERGE. Maybe not.


Re: MERGE ... RETURNING

2024-03-13 Thread Dean Rasheed
On Wed, 13 Mar 2024 at 08:58, Dean Rasheed  wrote:
>
> I think I'll go make those doc changes, and back-patch them
> separately, since they're not related to this patch.
>

OK, I've done that. Here is a rebased patch on top of that, with the
other changes you suggested.

Regards,
Dean
diff --git a/doc/src/sgml/dml.sgml b/doc/src/sgml/dml.sgml
new file mode 100644
index cbbc5e2..3d95bdb
--- a/doc/src/sgml/dml.sgml
+++ b/doc/src/sgml/dml.sgml
@@ -283,10 +283,15 @@ DELETE FROM products;
RETURNING
   
 
+  
+   MERGE
+   RETURNING
+  
+
   
Sometimes it is useful to obtain data from modified rows while they are
being manipulated.  The INSERT, UPDATE,
-   and DELETE commands all have an
+   DELETE, and MERGE commands all have an
optional RETURNING clause that supports this.  Use
of RETURNING avoids performing an extra database query to
collect the data, and is especially valuable when it would otherwise be
@@ -339,6 +344,21 @@ DELETE FROM products
 
   
 
+  
+   In a MERGE, the data available to RETURNING is
+   the content of the source row plus the content of the inserted, updated, or
+   deleted target row.  Since it is quite common for the source and target to
+   have many of the same columns, specifying RETURNING *
+   can lead to a lot of duplicated columns, so it is often more useful to
+   qualify it so as to return just the source or target row.  For example:
+
+MERGE INTO products p USING new_products n ON p.product_no = n.product_no
+  WHEN NOT MATCHED THEN INSERT VALUES (n.product_no, n.name, n.price)
+  WHEN MATCHED THEN UPDATE SET name = n.name, price = n.price
+  RETURNING p.*;
+
+  
+
   
If there are triggers () on the target table,
the data available to RETURNING is the row as modified by
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index 0bb7aeb..0e2b888
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -22421,6 +22421,85 @@ SELECT count(*) FROM sometable;
 
  
 
+ 
+  Merge Support Functions
+
+  
+   MERGE
+   RETURNING
+  
+
+  
+   PostgreSQL includes one merge support function
+   that may be used in the RETURNING list of a
+command to identify the action taken for each
+   row.
+  
+
+  
+   Merge Support Functions
+
+   
+
+ 
+  
+   Function
+  
+  
+   Description
+  
+ 
+
+
+
+ 
+  
+   
+merge_action
+   
+   merge_action ( )
+   text
+  
+  
+   Returns the merge action command executed for the current row.  This
+   will be 'INSERT', 'UPDATE', or
+   'DELETE'.
+  
+ 
+
+   
+  
+
+  
+   Example:
+
+  
+
+  
+   Note that this function can only be used in the RETURNING
+   list of a MERGE command.  It is an error to use it in any
+   other part of a query.
+  
+
+ 
+
  
   Subquery Expressions
 
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
new file mode 100644
index 8c2f114..a81c17a
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1442,9 +1442,9 @@
  to a client upon the
  completion of an SQL command, usually a
  SELECT but it can be an
- INSERT, UPDATE, or
- DELETE command if the RETURNING
- clause is specified.
+ INSERT, UPDATE,
+ DELETE, or MERGE command if the
+ RETURNING clause is specified.
 
 
  The fact that a result set is a relation means that a query can be used
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
new file mode 100644
index c2b9c6a..406834e
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -1043,8 +1043,8 @@ INSERT INTO mytable VALUES (1,'one'), (2
 
 
 
- If the command does return rows (for example SELECT,
- or INSERT/UPDATE/DELETE
+ If the command does return rows (for example SELECT, or
+ INSERT/UPDATE/DELETE/MERGE
  with RETURNING), there are two ways to proceed.
  When the command will return at most one row, or you only care about
  the first row of output, write the command as usual but add
@@ -1172,6 +1172,7 @@ SELECT select_expressionsexpressions INTO STRICT target;
 UPDATE ... RETURNING expressions INTO STRICT target;
 DELETE ... RETURNING expressions INTO STRICT target;
+MERGE ... RETURNING expressions INTO STRICT target;
 
 
  where target can be a record variable, a row
@@ -1182,8 +1183,8 @@ DELETE ... RETURNING expres
  INTO clause) just as described above,
  and the plan is cached in the same way.
  This works for SELECT,
- INSERT/UPDATE/DELETE with
- RETURNING, and certain utility commands
+ INSERT/UPDATE/DELETE/MERGE
+ with RETURNING, and certain utility commands
  that return row sets, such as EXPLAIN.
  Except for the INTO clause, the SQL command is the same
  as it would be written outside PL/pgSQL.
@@ -1259,7 +1260,7 @@ END;
 
 
 
- For INSERT/UPDATE/DELETE with
+ For INSERT/UPDATE/DELETE/MERGE with
  R

Re: MERGE ... RETURNING

2024-03-13 Thread Dean Rasheed
On Wed, 13 Mar 2024 at 06:44, jian he  wrote:
>
> 
> [ WITH with_query [, ...] ]
> MERGE INTO [ ONLY ] 
> here the "WITH" part should have "[ RECURSIVE ]"

Actually, no. MERGE doesn't support WITH RECURSIVE.

It's not entirely clear to me why though. I did a quick test, removing
that restriction in the parse analysis code, and it seemed to work
fine. Alvaro, do you remember why that restriction is there?

It's probably worth noting it in the docs, since it's different from
INSERT, UPDATE and DELETE. I think this would suffice:

   
with_query

 
  The WITH clause allows you to specify one or more
  subqueries that can be referenced by name in the MERGE
  query. See  and 
  for details.  Note that WITH RECURSIVE is not supported
  by MERGE.
 

   

And then maybe we can remove that restriction in HEAD, if there really
isn't any need for it anymore.

I also noticed that the "UPDATE SET ..." syntax in the synopsis is
missing a couple of options that are supported -- the optional "ROW"
keyword in the multi-column assignment syntax, and the syntax to
assign from a subquery that returns multiple columns. So this should
be updated to match update.sgml:

UPDATE SET { column_name
= { expression | DEFAULT
} |
 ( column_name [, ...] ) = [ ROW ] ( {
expression | DEFAULT } [,
...] ) |
 ( column_name [, ...] ) = ( sub-SELECT )
   } [, ...]

and then in the parameter section:

   
sub-SELECT

 
  A SELECT sub-query that produces as many output columns
  as are listed in the parenthesized column list preceding it.  The
  sub-query must yield no more than one row when executed.  If it
  yields one row, its column values are assigned to the target columns;
  if it yields no rows, NULL values are assigned to the target columns.
  The sub-query can refer to values from the original row in the
target table,
  and values from the data_source.
 

   

(basically copied verbatim from update.sgml)

I think I'll go make those doc changes, and back-patch them
separately, since they're not related to this patch.

Regards,
Dean




Re: MERGE ... RETURNING

2024-03-13 Thread jian he
Hi, some minor issues:


[ WITH with_query [, ...] ]
MERGE INTO [ ONLY ] target_table_name [ * ] [ [ AS ]
target_alias ]
USING data_source ON
join_condition
when_clause [...]
[ RETURNING * | output_expression [ [ AS ]
output_name ] [, ...] ]

here the "WITH" part should have "[ RECURSIVE ]" like:
[ WITH [ RECURSIVE ] with_query [, ...] ]

+  An expression to be computed and returned by the MERGE
+  command after each row is merged.  The expression can use any columns of
+  the source or target tables, or the 
+  function to return additional information about the action executed.
+ 
should be:
+  An expression to be computed and returned by the MERGE
+  command after each row is changed.


one minor issue:
add
`
table sq_target;
table sq_source;
`
before `-- RETURNING` in src/test/regress/sql/merge.sql, so we can
easily understand the tests.




Re: MERGE ... RETURNING

2024-03-08 Thread walther

Jeff Davis:

To summarize, most of the problem has been in retrieving the action
(INSERT/UPDATE/DELETE) taken or the WHEN-clause number applied to a
particular matched row. The reason this is important is because the row
returned is the old row for a DELETE action, and the new row for an
INSERT or UPDATE action. Without a way to distinguish the particular
action, the RETURNING clause returns a mixture of old and new rows,
which would be hard to use sensibly.


It seems to me that all of this is only a problem, because there is only
one RETURNING clause.

Dean Rasheed wrote in the very first post to this thread:

I considered allowing a separate RETURNING list at the end of each
action, but rapidly dismissed that idea. Firstly, it introduces
shift/reduce conflicts to the grammar. These can be resolved by making
the "AS" before column aliases non-optional, but that's pretty ugly,
and there may be a better way. More serious drawbacks are that this
syntax is much more cumbersome for the end user, having to repeat the
RETURNING clause several times, and the implementation is likely to be
pretty complex, so I didn't pursue it.


I can't judge the grammar and complexity issues, but as a potential user
it seems to me to be less complex to have multiple RETURNING clauses, 
where I could inject my own constants about the specific actions, than 
to have to deal with any of the suggested functions / clauses. More 
repetitive, yes - but not more complex.


More importantly, I could add RETURNING to only some of the actions and 
not always all at the same time - which seems pretty useful to me.


Best,

Wolfgang




Re: MERGE ... RETURNING

2024-03-06 Thread Merlin Moncure
On Thu, Feb 29, 2024 at 1:49 PM Jeff Davis  wrote:

>
> Can we get some input on whether the current MERGE ... RETURNING patch
> is the right approach from a language standpoint?
>

MERGE_CLAUSE_NUMBER() seems really out of place to me, it feels out of
place to identify output set by number rather than some kind of name.  Did
not see a lot of support for that position though.

merlin


Re: MERGE ... RETURNING

2024-03-06 Thread Dean Rasheed
On Wed, 6 Mar 2024 at 08:51, Peter Eisentraut  wrote:
>
> For comparison with standard SQL (see ):
>
> For an INSERT you could write
>
> SELECT whatever FROM NEW TABLE (INSERT statement here)
>
> or for an DELETE
>
> SELECT whatever FROM OLD TABLE (DELETE statement here)
>
> And for an UPDATE could can pick either OLD or NEW.
>

Thanks, that's very interesting. I hadn't seen that syntax before.

Over on [1], I have a patch in the works that extends RETURNING,
allowing it to return OLD.colname, NEW.colname, OLD.*, and NEW.*. It
looks like this new SQL standard syntax could be built on top of that
(perhaps by having the rewriter turn queries of the above form into
CTEs).

However, the RETURNING syntax is more powerful, because it allows OLD
and NEW to be used together in arbitrary expressions, for example:

RETURNING ..., NEW.val - OLD.val AS delta, ...

> > The current implementation uses a special function MERGING (a
> > grammatical construct without an OID that parses into a new MergingFunc
> > expr), which takes keywords ACTION or CLAUSE_NUMBER in the argument
> > positions. That's not totally unprecedented in SQL -- the XML and JSON
> > functions are kind of similar. But it's different in the sense that
> > MERGING is also context-sensitive: grammatically, it fits pretty much
> > anywhere a function fits, but then gets rejected at parse analysis time
> > (or perhaps even execution time?) if it's not called from the right
> > place.
>
> An analogy here might be that MATCH_RECOGNIZE (row-pattern recognition)
> has a magic function MATCH_NUMBER() that can be used inside that clause.
>   So a similar zero-argument magic function might make sense.  I don't
> like the MERGING(ACTION) syntax, but something like MERGE_ACTION() might
> make sense.  (This is just in terms of what kind of syntax might be
> palatable.  Depending on where the syntax of the overall clause ends up,
> we might not need it (see above).)
>

It could be that having the ability to return OLD and NEW values, as
in [1], is sufficient for use in MERGE, to identify the action
performed. However, I still think that dedicated functions would be
useful, if we can agree on names/syntax.

I think that I prefer the names MERGE_ACTION() and
MERGE_CLAUSE_NUMBER() from an aesthetic point of view, but it requires
2 new COL_NAME_KEYWORD keywords. Maybe that's OK, I don't know.

Alternatively, we could avoid adding new keywords by going back to
making these regular functions, as they were in an earlier version of
this patch, and then use some special-case code during parse analysis
to turn them into MergeFunc nodes (not quite a complete revert back to
an earlier version of the patch, but not far off).

Regards,
Dean

[1] 
https://www.postgresql.org/message-id/flat/CAEZATCWx0J0-v=Qjc6gXzR=KtsdvAE7Ow=D=mu50agoe+pv...@mail.gmail.com




Re: MERGE ... RETURNING

2024-03-06 Thread Peter Eisentraut

On 29.02.24 20:49, Jeff Davis wrote:

To summarize, most of the problem has been in retrieving the action
(INSERT/UPDATE/DELETE) taken or the WHEN-clause number applied to a
particular matched row. The reason this is important is because the row
returned is the old row for a DELETE action, and the new row for an
INSERT or UPDATE action. Without a way to distinguish the particular
action, the RETURNING clause returns a mixture of old and new rows,
which would be hard to use sensibly.


For comparison with standard SQL (see ):

For an INSERT you could write

SELECT whatever FROM NEW TABLE (INSERT statement here)

or for an DELETE

SELECT whatever FROM OLD TABLE (DELETE statement here)

And for an UPDATE could can pick either OLD or NEW.

(There is also FINAL, which appears to be valid in cases where NEW is 
valid.  Here is an explanation: 
<https://www.ibm.com/docs/en/db2oc?topic=statement-result-sets-from-sql-data-changes>)


For a MERGE statement, whether you can specify OLD or NEW (or FINAL) 
depends on what actions appear in the MERGE statement.


So if we were to translate that to our syntax, it might be something like

    MERGE ... RETURNING OLD *

or

    MERGE ... RETURNING NEW *

This wouldn't give you the ability to return both old and new.  (Is that 
useful?)  But maybe you could also do something like


    MERGE ... RETURNING OLD 'old'::text, * RETURNING NEW 'new'::text, *

(I mean here you could insert your own constants into the returning lists.)


The current implementation uses a special function MERGING (a
grammatical construct without an OID that parses into a new MergingFunc
expr), which takes keywords ACTION or CLAUSE_NUMBER in the argument
positions. That's not totally unprecedented in SQL -- the XML and JSON
functions are kind of similar. But it's different in the sense that
MERGING is also context-sensitive: grammatically, it fits pretty much
anywhere a function fits, but then gets rejected at parse analysis time
(or perhaps even execution time?) if it's not called from the right
place.


An analogy here might be that MATCH_RECOGNIZE (row-pattern recognition) 
has a magic function MATCH_NUMBER() that can be used inside that clause. 
 So a similar zero-argument magic function might make sense.  I don't 
like the MERGING(ACTION) syntax, but something like MERGE_ACTION() might 
make sense.  (This is just in terms of what kind of syntax might be 
palatable.  Depending on where the syntax of the overall clause ends up, 
we might not need it (see above).)






Re: MERGE ... RETURNING

2024-02-29 Thread Jeff Davis
On Thu, 2024-02-29 at 19:24 +, Dean Rasheed wrote:
> Attached is a rebased version on top of 5f2e179bd3 (support for MERGE
> into views), with a few additional tests to confirm that MERGE ...
> RETURNING works for views as well as tables.

Thank you for the rebase. I just missed your message (race condition),
so I replied to v15:

https://www.postgresql.org/message-id/e03a87eb4e728c5e475b360b5845979f78d49020.camel%40j-davis.com

> I see that this patch was discussed at the PostgreSQL Developers
> Meeting. Did anything new come out of that discussion?

I don't think we made any conclusions at the meeting, but I expressed
that we need input from one of them on this patch.

Regards,
Jeff Davis





Re: MERGE ... RETURNING

2024-02-29 Thread Jeff Davis


Can we get some input on whether the current MERGE ... RETURNING patch
is the right approach from a language standpoint?

We've gone through a lot of iterations -- thank you Dean, for
implementing so many variations.

To summarize, most of the problem has been in retrieving the action
(INSERT/UPDATE/DELETE) taken or the WHEN-clause number applied to a
particular matched row. The reason this is important is because the row
returned is the old row for a DELETE action, and the new row for an
INSERT or UPDATE action. Without a way to distinguish the particular
action, the RETURNING clause returns a mixture of old and new rows,
which would be hard to use sensibly.

Granted, DELETE in a MERGE may be a less common case. But given that we
also have INSERT ... ON CONFLICT, MERGE commands are more likely to be
the complicated cases where distinguishing the action or clause number
is important.

But linguistically it's not clear where the action or clause number
should come from. The clauses don't have assigned numbers, and even if
they did, linguistically it's not clear how to refer to the clause
number in a language like SQL. Would it be a special identifier, a
function, a special function, or be a column in a special table
reference? Or, do we just have one RETURNING-clause per WHEN-clause,
and let the user use a literal of their choice in the RETURNING clause?

The current implementation uses a special function MERGING (a
grammatical construct without an OID that parses into a new MergingFunc
expr), which takes keywords ACTION or CLAUSE_NUMBER in the argument
positions. That's not totally unprecedented in SQL -- the XML and JSON
functions are kind of similar. But it's different in the sense that
MERGING is also context-sensitive: grammatically, it fits pretty much
anywhere a function fits, but then gets rejected at parse analysis time
(or perhaps even execution time?) if it's not called from the right
place.

Is that a reasonable thing to do?

Another related topic came up, which is that the RETURNING clause (for
UPDATE as well as MERGE) should probably accept some kind of alias like
NEW/OLD or BEFORE/AFTER to address the version of the row that you
want. That doesn't eliminate the need for the MERGING function, but
it's good to think about how that might fit in with whatever we do
here.

Regards,
Jeff Davis





Re: MERGE ... RETURNING

2024-01-30 Thread jian he
I didn't find any issue with v15.
no commit message in the patch, If a commit message is there, I can
help proofread.




Re: MERGE ... RETURNING

2024-01-28 Thread jian he
On Fri, Jan 19, 2024 at 1:44 AM Dean Rasheed  wrote:
>
>
> Thanks for reviewing. Updated patch attached.
>
> The wider question is whether people are happy with the overall
> approach this patch now takes, and the new MERGING() function and
> MergingFunc node.
>

one minor white space issue:

git diff --check
doc/src/sgml/func.sgml:22482: trailing whitespace.
+ action | clause_number | product_id | in_stock | quantity


@@ -3838,7 +3904,7 @@ ExecModifyTable(PlanState *pstate)
  }
  slot = ExecGetUpdateNewTuple(resultRelInfo, context.planSlot,
  oldSlot);
- context.relaction = NULL;
+ node->mt_merge_action = NULL;

I wonder what's the purpose of setting node->mt_merge_action to null ?
I add `node->mt_merge_action = NULL;` at the end of each branch in
`switch (operation)`.
All the tests still passed.
Other than this question, this patch is very good.




Re: MERGE ... RETURNING

2024-01-17 Thread jian he
On Sat, Nov 18, 2023 at 8:55 PM Dean Rasheed  wrote:
>
> The v13 patch still applies on top of this, so I won't re-post it.
>

Hi.
minor issues based on v13.

+
+MERGING (
property )
+
+   The following are valid property values specifying what to return:
+
+   
+
+ ACTION
+ 
+  
+   The merge action command executed for the current row
+   ('INSERT', 'UPDATE', or
+   'DELETE').
+  
+ 
+
do we change to property?
Maybe the main para should be two sentences like:
The merge action command executed for the current row. Possible values
are: 'INSERT', 'UPDATE',
'DELETE'.

 static Node *
+transformMergingFunc(ParseState *pstate, MergingFunc *f)
+{
+ /*
+ * Check that we're in the RETURNING list of a MERGE command.
+ */
+ if (pstate->p_expr_kind != EXPR_KIND_MERGE_RETURNING)
+ {
+ ParseState *parent_pstate = pstate->parentParseState;
+
+ while (parent_pstate &&
+   parent_pstate->p_expr_kind != EXPR_KIND_MERGE_RETURNING)
+ parent_pstate = parent_pstate->parentParseState;
+
+ if (!parent_pstate ||
+ parent_pstate->p_expr_kind != EXPR_KIND_MERGE_RETURNING)
+ ereport(ERROR,
+ errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("MERGING() can only be used in the RETURNING list of a MERGE command"),
+ parser_errposition(pstate, f->location));
+ }
+
+ return (Node *) f;
+}
+
the object is correct, but not in the right place.
maybe we should change errcode(ERRCODE_WRONG_OBJECT_TYPE) to
errcode(ERRCODE_INVALID_OBJECT_DEFINITION)
also do we need to add some comments explain that why we return it as
is when it's EXPR_KIND_MERGE_RETURNING.
(my understanding is that, if key words not match, then it will fail
at gram.y, like syntax error, else MERGING will based on keywords make
a MergingFunc node and assign mfop, mftype, location to it)

in src/backend/executor/functions.c
/*
* Break from loop if we didn't shut down (implying we got a
* lazily-evaluated row).  Otherwise we'll press on till the whole
* function is done, relying on the tuplestore to keep hold of the
* data to eventually be returned.  This is necessary since an
* INSERT/UPDATE/DELETE RETURNING that sets the result might be
* followed by additional rule-inserted commands, and we want to
* finish doing all those commands before we return anything.
*/
Does the above comments need to change to INSERT/UPDATE/DELETE/MERGE?

in src/backend/nodes/nodeFuncs.c
case T_UpdateStmt:
{
UpdateStmt *stmt = (UpdateStmt *) node;
if (WALK(stmt->relation))
return true;
if (WALK(stmt->targetList))
return true;
if (WALK(stmt->whereClause))
return true;
if (WALK(stmt->fromClause))
return true;
if (WALK(stmt->returningList))
return true;
if (WALK(stmt->withClause))
return true;
}
break;
case T_MergeStmt:
{
MergeStmt  *stmt = (MergeStmt *) node;
if (WALK(stmt->relation))
return true;
if (WALK(stmt->sourceRelation))
return true;
if (WALK(stmt->joinCondition))
return true;
if (WALK(stmt->mergeWhenClauses))
return true;
if (WALK(stmt->withClause))
return true;
}
break;

you add "returningList" to MergeStmt.
do you need to do the following similar to UpdateStmt, even though
it's so abstract, i have no idea what's going on.
`
if (WALK(stmt->returningList))
return true;
`




Re: MERGE ... RETURNING

2023-11-18 Thread Dean Rasheed
On Fri, 17 Nov 2023 at 04:30, jian he  wrote:
>
> I think it should be:
> +   You will require the SELECT privilege on any column(s)
> +   of the data_source and
> +   target_table_name referred to
> +   in any condition or expression.
>

Ah, of course. As always, I'm blind to grammatical errors in my own
text, no matter how many times I read it. Thanks for checking!

Pushed.

The v13 patch still applies on top of this, so I won't re-post it.

Regards,
Dean




Re: MERGE ... RETURNING

2023-11-16 Thread jian he
>
> Attached is a separate patch with those doc updates, intended to be
> applied and back-patched independently of the main RETURNING patch.
>
> Regards,
> Dean

+   You will require the SELECT privilege and any column(s)
+   of the data_source and
+   target_table_name referred to
+   in any condition or expression.

I think it should be:
+   You will require the SELECT privilege on any column(s)
+   of the data_source and
+   target_table_name referred to
+   in any condition or expression.

Other than that, it looks fine.




Re: MERGE ... RETURNING

2023-11-15 Thread Dean Rasheed
On Mon, 13 Nov 2023 at 05:29, jian he  wrote:
>
> v13 works fine. all tests passed. The code is very intuitive. played
> with multi WHEN clauses, even with before/after row triggers, work as
> expected.
>

Thanks for the review and testing!

> I don't know when replace_outer_merging will be invoked. even set a
> breakpoint on it. coverage shows replace_outer_merging only called
> once.
>

It's used when MERGING() is used in a subquery in the RETURNING list.
The MergingFunc node in the subquery is replaced by a Param node,
referring to the outer MERGE query, so that the result from MERGING()
is available in the SELECT subquery (under any other circumstances,
you're not allowed to use MERGING() in a SELECT). This is similar to
what happens when a subquery contains an aggregate over columns from
an outer query only -- for example, see:

https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-AGGREGATES:~:text=When%20an%20aggregate,aggregate%20belongs%20to.

https://github.com/postgres/postgres/commit/e649796f128bd8702ba5744d36f4e8cb81f0b754

A MERGING() expression in a subquery in the RETURNING list is
analogous, in that it belongs to the outer MERGE query, not the SELECT
subquery.

> sql-merge.html miss mentioned RETURNING need select columns privilege?
> in sql-insert.html, we have:
> "Use of the RETURNING clause requires SELECT privilege on all columns
> mentioned in RETURNING. If you use the query clause to insert rows
> from a query, you of course need to have SELECT privilege on any table
> or column used in the query."
>

Ah, good point. I don't think I looked at the privileges paragraph on
the MERGE page. Currently it says:

   You will require the SELECT privilege on the data_source
   and any column(s) of the target_table_name referred to in a
   condition.

Being pedantic, there are 2 problems with that:

1. It might be taken to imply that you need the SELECT privilege on
every column of the data_source, which isn't the case.

2. It mentions conditions, but not expressions (such as those that can
appear in INSERT and UPDATE actions).

A more accurate statement would be:

   You will require the SELECT privilege and any column(s)
   of the data_source and target_table_name referred to in any
   condition or expression.

which is also consistent with the wording used on the UPDATE manual page.

Done that way, I don't think it would need to be updated to mention
RETURNING, because RETURNING just returns a list of expressions.
Again, that would be consistent with the UPDATE page, which doesn't
mention RETURNING in its discussion of privileges.

> I saw the change in src/sgml/glossary.sgml, So i looked around. in the
> "Materialized view (relation)" part. "It cannot be modified via
> INSERT, UPDATE, or DELETE operations.". Do we need to put "MERGE" into
> that sentence?
> also there is SELECT, INSERT, UPDATE, DELETE,  do we need to add a
> MERGE entry in glossary.sgml?

Yes, that makes sense.

Attached is a separate patch with those doc updates, intended to be
applied and back-patched independently of the main RETURNING patch.

Regards,
Dean


merge-docs.patch.no-cfbot
Description: Binary data


Re: MERGE ... RETURNING

2023-11-12 Thread jian he
Hi.
v13 works fine. all tests passed. The code is very intuitive. played
with multi WHEN clauses, even with before/after row triggers, work as
expected.

I don't know when replace_outer_merging will be invoked. even set a
breakpoint on it. coverage shows replace_outer_merging only called
once.

sql-merge.html miss mentioned RETURNING need select columns privilege?
in sql-insert.html, we have:
"Use of the RETURNING clause requires SELECT privilege on all columns
mentioned in RETURNING. If you use the query clause to insert rows
from a query, you of course need to have SELECT privilege on any table
or column used in the query."

I saw the change in src/sgml/glossary.sgml, So i looked around. in the
"Materialized view (relation)" part. "It cannot be modified via
INSERT, UPDATE, or DELETE operations.". Do we need to put "MERGE" into
that sentence?
also there is SELECT, INSERT, UPDATE, DELETE,  do we need to add a
MERGE entry in glossary.sgml?




Re: MERGE ... RETURNING

2023-11-09 Thread Dean Rasheed
On Sun, 5 Nov 2023 at 11:52, Dean Rasheed  wrote:
>
> OK, that's a fair point. Attached is a new version, replacing those
> parts of the implementation with a new MergingFunc node. It doesn't
> add that much more complexity, and I think the new code is much
> neater.
>

Rebased version attached, following the changes made in 615f5f6faa and
a4f7d33a90.

Regards,
Dean
diff --git a/doc/src/sgml/dml.sgml b/doc/src/sgml/dml.sgml
new file mode 100644
index cbbc5e2..3d95bdb
--- a/doc/src/sgml/dml.sgml
+++ b/doc/src/sgml/dml.sgml
@@ -283,10 +283,15 @@ DELETE FROM products;
RETURNING
   
 
+  
+   MERGE
+   RETURNING
+  
+
   
Sometimes it is useful to obtain data from modified rows while they are
being manipulated.  The INSERT, UPDATE,
-   and DELETE commands all have an
+   DELETE, and MERGE commands all have an
optional RETURNING clause that supports this.  Use
of RETURNING avoids performing an extra database query to
collect the data, and is especially valuable when it would otherwise be
@@ -339,6 +344,21 @@ DELETE FROM products
 
   
 
+  
+   In a MERGE, the data available to RETURNING is
+   the content of the source row plus the content of the inserted, updated, or
+   deleted target row.  Since it is quite common for the source and target to
+   have many of the same columns, specifying RETURNING *
+   can lead to a lot of duplicated columns, so it is often more useful to
+   qualify it so as to return just the source or target row.  For example:
+
+MERGE INTO products p USING new_products n ON p.product_no = n.product_no
+  WHEN NOT MATCHED THEN INSERT VALUES (n.product_no, n.name, n.price)
+  WHEN MATCHED THEN UPDATE SET name = n.name, price = n.price
+  RETURNING p.*;
+
+  
+
   
If there are triggers () on the target table,
the data available to RETURNING is the row as modified by
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index d963f0a..b25de53
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21990,6 +21990,84 @@ SELECT count(*) FROM sometable;
 
  
 
+ 
+  Merge Support Functions
+
+  
+   MERGE
+   RETURNING
+  
+
+  
+   MERGING
+  
+
+  
+   PostgreSQL includes one merge support function
+   that may be used in the RETURNING list of a
+command to return additional information about
+   the action taken for each row:
+
+MERGING ( property )
+
+   The following are valid property values specifying what to return:
+
+   
+
+ ACTION
+ 
+  
+   The merge action command executed for the current row
+   ('INSERT', 'UPDATE', or
+   'DELETE').
+  
+ 
+
+
+
+ CLAUSE_NUMBER
+ 
+  
+   The 1-based index of the WHEN clause executed for
+   the current row.
+  
+ 
+
+   
+  
+
+  
+   The following example illustrates how this may be used to determine
+   which actions were performed for each row affected by the
+   MERGE command:
+
+  
+
+  
+   Note that this function can only be used in the RETURNING
+   list of a MERGE command.  It is an error to use it in any
+   other part of a query.
+  
+
+ 
+
  
   Subquery Expressions
 
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
new file mode 100644
index fe8def4..51a15ca
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1387,9 +1387,9 @@
  to a client upon the
  completion of an SQL command, usually a
  SELECT but it can be an
- INSERT, UPDATE, or
- DELETE command if the RETURNING
- clause is specified.
+ INSERT, UPDATE,
+ DELETE, or MERGE command if the
+ RETURNING clause is specified.
 
 
  The fact that a result set is a relation means that a query can be used
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
new file mode 100644
index 5977534..4ad9894
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -1020,8 +1020,8 @@ INSERT INTO mytable VALUES (1,'one'), (2
 
 
 
- If the command does return rows (for example SELECT,
- or INSERT/UPDATE/DELETE
+ If the command does return rows (for example SELECT, or
+ INSERT/UPDATE/DELETE/MERGE
  with RETURNING), there are two ways to proceed.
  When the command will return at most one row, or you only care about
  the first row of output, write the command as usual but add
@@ -1149,6 +1149,7 @@ SELECT select_expressionsexpressions INTO STRICT target;
 UPDATE ... RETURNING expressions INTO STRICT target;
 DELETE ... RETURNING expressions INTO STRICT target;
+MERGE ... RETURNING expressions INTO STRICT target;
 
 
  where target can be a record variable, a row
@@ -1159,8 +1160,8 @@ DELETE ... RETURNING expres
  INTO clause) just as described above,
  and the plan is cached in the same way.
  This works for SELECT,
- INSERT/UPDATE/DELETE with
- RETURNING, and certain utility commands
+ INSERT/UPDATE/DELETE/MERGE
+ with RETURNING, and certain utility commands

Re: MERGE ... RETURNING

2023-11-05 Thread Dean Rasheed
On Wed, 1 Nov 2023 at 17:49, Jeff Davis  wrote:
>
> Most of my concern is that parts of the implementation feel like a
> hack, which makes me concerned that we're approaching it the wrong way.
>

OK, that's a fair point. Attached is a new version, replacing those
parts of the implementation with a new MergingFunc node. It doesn't
add that much more complexity, and I think the new code is much
neater.

Also, I think this makes it easier / more natural to add additional
returning options, like Merlin's suggestion to return a user-defined
label value, though I haven't implemented that.

I have gone with the name originally suggested by Vik -- MERGING(),
which means that that has to be a new col-name keyword. I'm not
especially wedded to that name, but I think that it's not a bad
choice, and I think going with that is preferable to making MERGE a
col-name keyword.

So (quoting the example from the docs), the new syntax looks like this:

MERGE INTO products p
  USING stock s ON p.product_id = s.product_id
  WHEN MATCHED AND s.quantity > 0 THEN
UPDATE SET in_stock = true, quantity = s.quantity
  WHEN MATCHED THEN
UPDATE SET in_stock = false, quantity = 0
  WHEN NOT MATCHED THEN
INSERT (product_id, in_stock, quantity)
  VALUES (s.product_id, true, s.quantity)
  RETURNING MERGING(ACTION), MERGING(CLAUSE_NUMBER), p.*;

 action | clause_number | product_id | in_stock | quantity
+---++--+--
 UPDATE | 1 |   1001 | t|   50
 UPDATE | 2 |   1002 | f|0
 INSERT | 3 |   1003 | t|   10

By default, the returned column names are automatically taken from the
argument to the MERGING() function (which isn't actually a function
anymore).

There's one bug that I know about, to do with cross-partition updates,
but since that's a pre-existing bug, I'll start a new thread for it.

Regards,
Dean
diff --git a/doc/src/sgml/dml.sgml b/doc/src/sgml/dml.sgml
new file mode 100644
index cbbc5e2..3d95bdb
--- a/doc/src/sgml/dml.sgml
+++ b/doc/src/sgml/dml.sgml
@@ -283,10 +283,15 @@ DELETE FROM products;
RETURNING
   
 
+  
+   MERGE
+   RETURNING
+  
+
   
Sometimes it is useful to obtain data from modified rows while they are
being manipulated.  The INSERT, UPDATE,
-   and DELETE commands all have an
+   DELETE, and MERGE commands all have an
optional RETURNING clause that supports this.  Use
of RETURNING avoids performing an extra database query to
collect the data, and is especially valuable when it would otherwise be
@@ -339,6 +344,21 @@ DELETE FROM products
 
   
 
+  
+   In a MERGE, the data available to RETURNING is
+   the content of the source row plus the content of the inserted, updated, or
+   deleted target row.  Since it is quite common for the source and target to
+   have many of the same columns, specifying RETURNING *
+   can lead to a lot of duplicated columns, so it is often more useful to
+   qualify it so as to return just the source or target row.  For example:
+
+MERGE INTO products p USING new_products n ON p.product_no = n.product_no
+  WHEN NOT MATCHED THEN INSERT VALUES (n.product_no, n.name, n.price)
+  WHEN MATCHED THEN UPDATE SET name = n.name, price = n.price
+  RETURNING p.*;
+
+  
+
   
If there are triggers () on the target table,
the data available to RETURNING is the row as modified by
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index a6fcac0..06d5fe8
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21960,6 +21960,84 @@ SELECT count(*) FROM sometable;
 
  
 
+ 
+  Merge Support Functions
+
+  
+   MERGE
+   RETURNING
+  
+
+  
+   MERGING
+  
+
+  
+   PostgreSQL includes one merge support function
+   that may be used in the RETURNING list of a
+command to return additional information about
+   the action taken for each row:
+
+MERGING ( property )
+
+   The following are valid property values specifying what to return:
+
+   
+
+ ACTION
+ 
+  
+   The merge action command executed for the current row
+   ('INSERT', 'UPDATE', or
+   'DELETE').
+  
+ 
+
+
+
+ CLAUSE_NUMBER
+ 
+  
+   The 1-based index of the WHEN clause executed for
+   the current row.
+  
+ 
+
+   
+  
+
+  
+   The following example illustrates how this may be used to determine
+   which actions were performed for each row affected by the
+   MERGE command:
+
+  
+
+  
+   Note that this function can only be used in the RETURNING
+   list of a MERGE command.  It is an error to use it in any
+   other part of a query.
+  
+
+ 
+
  
   Subquery Expressions
 
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
new file mode 100644
index fe8def4..51a15ca
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1387,9 +1387,9 @@
  to a client upon the
  completion of an SQL c

Re: MERGE ... RETURNING

2023-11-01 Thread Jeff Davis
On Wed, 2023-11-01 at 10:12 +, Dean Rasheed wrote:
> Something I'm wondering about is to what extent this discussion is
> driven by concerns about aspects of the implementation (specifically,
> references to function OIDs in code), versus a desire for a different
> user-visible syntax. To a large extent, those are orthogonal
> questions.

Most of my concern is that parts of the implementation feel like a
hack, which makes me concerned that we're approaching it the wrong way.

At a language level, I'm also concerned that we don't have a way to
access the before/after versions of the tuple. I won't insist on this
because I'm hoping that could be solved as part of a later patch that
also addresses UPDATE ... RETURNING.

> (As an aside, I would note that there are already around a dozen
> references to specific function OIDs in the parse analysis code, and
> a
> lot more if you grep more widely across the whole of the backend
> code.)

If you can point to a precedent, then I'm much more inclined to be OK
with the implementation.

Regards,
Jeff Davis





Re: MERGE ... RETURNING

2023-11-01 Thread Merlin Moncure
On Wed, Nov 1, 2023 at 5:12 AM Dean Rasheed 
wrote:

> On Tue, 31 Oct 2023 at 23:19, Vik Fearing  wrote:
> >
> > On 10/31/23 19:28, Jeff Davis wrote:
> >
> > > Assuming we have one RETURNING clause at the end, then it creates the
> > > problem of how to communicate which WHEN clause a tuple came from,
> > > whether it's the old or the new version, and/or which action was
> > > performed on that tuple.
> > >
> > > How do we communicate any of those things? We need to get that
> > > information into the result table somehow, so it should probably be
> > > some kind of expression that can exist in the RETURNING clause. But
> > > what kind of expression?
> > >
> > > (a) It could be a totally new expression kind with a new keyword (or
> > > recycling some existing keywords for the same effect, or something that
> > > looks superficially like a function call but isn't) that's only valid
> > > in the RETURNING clause of a MERGE statement. If you use it in another
> > > expression (say the targetlist of a SELECT statement), then you'd get a
> > > failure at parse analysis time.
> >
> > This would be my choice, the same as how the standard GROUPING()
> > "function" for grouping sets is implemented by GroupingFunc.
> >
>
> Something I'm wondering about is to what extent this discussion is
> driven by concerns about aspects of the implementation (specifically,
> references to function OIDs in code), versus a desire for a different
> user-visible syntax. To a large extent, those are orthogonal
> questions.
>
> (As an aside, I would note that there are already around a dozen
> references to specific function OIDs in the parse analysis code, and a
> lot more if you grep more widely across the whole of the backend
> code.)
>
> At one point, as I was writing this patch, I went part-way down the
> route of adding a new node type (I think I called it MergeFunc), for
> these merge support functions, somewhat inspired by GroupingFunc. In
> the end, I backed out of that approach, because it seemed to be
> introducing a lot of unnecessary additional complexity, and I decided
> that a regular FuncExpr would suffice.
>
> If pg_merge_action() and pg_merge_when_clause_number() were
> implemented using a MergeFunc node, it would reduce the number of
> places that refer to specific function OIDs. Basically, a MergeFunc
> node would be very much like a FuncExpr node, except that it would
> have a "levels up" field, set during parse analysis, at the point
> where we check that it is being used in a merge returning clause, and
> this field would be used during subselect planning. Note, however,
> that that doesn't entirely eliminate references to specific function
> OIDs -- the parse analysis code would still do that. Also, additional
> special-case code in the executor would be required to handle
> MergeFunc nodes. Also, code like IncrementVarSublevelsUp() would need
> adjusting, and anything else like that.
>
> A separate question is what the syntax should be. We could invent a
> new syntax, like GROUPING(). Perhaps:
>
>   MERGE(ACTION) instead of pg_merge_action()
>   MERGE(CLAUSE NUMBER) instead of pg_merge_when_clause_number()
>

Hm, still struggling with this merge action and (especially) number stuff.
Currently we have:

 WHEN MATCHED [ AND *condition* ] THEN { *merge_update* |
*merge_delete* | DO NOTHING } |
  WHEN NOT MATCHED [ AND *condition* ] THEN { *merge_insert* | DO NOTHING } }

What about extending to something like:

WHEN MATCHED [ AND *condition* ] [ AS *merge_clause_name ]*

WHEN MATCHED AND tid > 2 AS giraffes THEN UPDATE SET balance = t.balance +
delta

...and have pg_merge_clause() return 'giraffes' (of name type).  If merge
clause is not identified, maybe don't return any data for that clause
through returning,, or return NULL.  Maybe 'returning' clause doesn't have
to be extended or molested in any way, it would follow mechanics as per
'update', and could not refer to identified merge_clauses, but would allow
for pg_merge_clause() functioning.  You wouldn't need to identify action or
number.  Food for thought, -- may have missed some finer details upthread.

for example,
with r as (
  merge into x using y on x.a = y.a
  when matched and x.c > 0 as good then do nothing
  when matched and x.c <= 0 as bad then do nothing
  returning pg_merge_clause(), x.*
) ...

yielding
pg_merge_clause a  c
good1  5
good2  7
bad 3  0
...

...maybe allow pg_merge_clause()  take to optionally yield column name:
  returning pg_merge_clause('result'), x.*
) ...

yielding
result a  c
good   1  5
good   2  7
bad3  0
...

merlin


Re: MERGE ... RETURNING

2023-11-01 Thread Vik Fearing

On 11/1/23 11:12, Dean Rasheed wrote:

On Tue, 31 Oct 2023 at 23:19, Vik Fearing  wrote:


On 10/31/23 19:28, Jeff Davis wrote:


Assuming we have one RETURNING clause at the end, then it creates the
problem of how to communicate which WHEN clause a tuple came from,
whether it's the old or the new version, and/or which action was
performed on that tuple.

How do we communicate any of those things? We need to get that
information into the result table somehow, so it should probably be
some kind of expression that can exist in the RETURNING clause. But
what kind of expression?

(a) It could be a totally new expression kind with a new keyword (or
recycling some existing keywords for the same effect, or something that
looks superficially like a function call but isn't) that's only valid
in the RETURNING clause of a MERGE statement. If you use it in another
expression (say the targetlist of a SELECT statement), then you'd get a
failure at parse analysis time.


This would be my choice, the same as how the standard GROUPING()
"function" for grouping sets is implemented by GroupingFunc.



Something I'm wondering about is to what extent this discussion is
driven by concerns about aspects of the implementation (specifically,
references to function OIDs in code), versus a desire for a different
user-visible syntax. To a large extent, those are orthogonal
questions.



For my part, I am most concerned about the language level.  I am 
sympathetic to the implementers' issues, but that is not my main focus.


So please do not take my implementation advice into account when I voice 
my opinions.

--
Vik Fearing





Re: MERGE ... RETURNING

2023-11-01 Thread Dean Rasheed
On Tue, 31 Oct 2023 at 23:19, Vik Fearing  wrote:
>
> On 10/31/23 19:28, Jeff Davis wrote:
>
> > Assuming we have one RETURNING clause at the end, then it creates the
> > problem of how to communicate which WHEN clause a tuple came from,
> > whether it's the old or the new version, and/or which action was
> > performed on that tuple.
> >
> > How do we communicate any of those things? We need to get that
> > information into the result table somehow, so it should probably be
> > some kind of expression that can exist in the RETURNING clause. But
> > what kind of expression?
> >
> > (a) It could be a totally new expression kind with a new keyword (or
> > recycling some existing keywords for the same effect, or something that
> > looks superficially like a function call but isn't) that's only valid
> > in the RETURNING clause of a MERGE statement. If you use it in another
> > expression (say the targetlist of a SELECT statement), then you'd get a
> > failure at parse analysis time.
>
> This would be my choice, the same as how the standard GROUPING()
> "function" for grouping sets is implemented by GroupingFunc.
>

Something I'm wondering about is to what extent this discussion is
driven by concerns about aspects of the implementation (specifically,
references to function OIDs in code), versus a desire for a different
user-visible syntax. To a large extent, those are orthogonal
questions.

(As an aside, I would note that there are already around a dozen
references to specific function OIDs in the parse analysis code, and a
lot more if you grep more widely across the whole of the backend
code.)

At one point, as I was writing this patch, I went part-way down the
route of adding a new node type (I think I called it MergeFunc), for
these merge support functions, somewhat inspired by GroupingFunc. In
the end, I backed out of that approach, because it seemed to be
introducing a lot of unnecessary additional complexity, and I decided
that a regular FuncExpr would suffice.

If pg_merge_action() and pg_merge_when_clause_number() were
implemented using a MergeFunc node, it would reduce the number of
places that refer to specific function OIDs. Basically, a MergeFunc
node would be very much like a FuncExpr node, except that it would
have a "levels up" field, set during parse analysis, at the point
where we check that it is being used in a merge returning clause, and
this field would be used during subselect planning. Note, however,
that that doesn't entirely eliminate references to specific function
OIDs -- the parse analysis code would still do that. Also, additional
special-case code in the executor would be required to handle
MergeFunc nodes. Also, code like IncrementVarSublevelsUp() would need
adjusting, and anything else like that.

A separate question is what the syntax should be. We could invent a
new syntax, like GROUPING(). Perhaps:

  MERGE(ACTION) instead of pg_merge_action()
  MERGE(CLAUSE NUMBER) instead of pg_merge_when_clause_number()

But note that those could equally well generate either FuncExpr nodes
or MergeFunc nodes, so the syntax question remains orthogonal to that
internal implementation question.

If MERGE(...) (or MERGING(...), or whatever) were part of the SQL
standard, then that would be the clear choice. But since it's not, I
don't see any real advantage to inventing special syntax here, rather
than just using a regular function call. In fact, it's worse, because
if this were to work like GROUPING(), it would require MERGE (or
MERGING, or whatever) to be a COL_NAME_KEYWORD, where currently MERGE
is an UNRESERVED_KEYWORD, and that would break any existing
user-defined functions with that name, whereas the "pg_" prefix of my
functions makes that much less likely.

So on the syntax question, in the absence of anything specific from
the SQL standard, I think we should stick to builtin functions,
without inventing special syntax. That doesn't preclude adding special
syntax later, if the SQL standard mandates it, but that might be
harder, if we invent our own syntax now.

On the implementation question, I'm not completely against the idea of
a MergeFunc node, but it does feel a little over-engineered.

Regards,
Dean




Re: MERGE ... RETURNING

2023-10-31 Thread Vik Fearing

On 10/31/23 19:28, Jeff Davis wrote:

On Tue, 2023-10-31 at 12:45 +0100, Vik Fearing wrote:

On 10/24/23 21:10, Jeff Davis wrote:

Can we revisit the idea of a per-WHEN RETURNING clause?


For the record, I dislike this idea.


I agree that it makes things awkward, and if it creates grammatical
problems as well, then it's not very appealing.

There are only so many approaches though, so it would be helpful if you
could say which approach you prefer.



This isn't as easy to answer for me as it seems because I care deeply 
about respecting the standard.  The standard does not have RETURNING at 
all but instead has  and the results for that 
for a MERGE statement are clearly defined.


On the other hand, we don't have that and we do have RETURNING so we 
should improve upon that if we can.


One thing I don't like about either solution is that you cannot get at 
both the old row versions and the new row versions at the same time.  I 
don't see how s can be fixed to support that, 
but RETURNING certainly can be with some spelling of OLD/NEW or 
BEFORE/AFTER or whatever.




Assuming we have one RETURNING clause at the end, then it creates the
problem of how to communicate which WHEN clause a tuple came from,
whether it's the old or the new version, and/or which action was
performed on that tuple.

How do we communicate any of those things? We need to get that
information into the result table somehow, so it should probably be
some kind of expression that can exist in the RETURNING clause. But
what kind of expression?

(a) It could be a totally new expression kind with a new keyword (or
recycling some existing keywords for the same effect, or something that
looks superficially like a function call but isn't) that's only valid
in the RETURNING clause of a MERGE statement. If you use it in another
expression (say the targetlist of a SELECT statement), then you'd get a
failure at parse analysis time.



This would be my choice, the same as how the standard GROUPING() 
"function" for grouping sets is implemented by GroupingFunc.


--
Vik Fearing





Re: MERGE ... RETURNING

2023-10-31 Thread Jeff Davis
On Tue, 2023-10-31 at 12:45 +0100, Vik Fearing wrote:
> On 10/24/23 21:10, Jeff Davis wrote:
> > Can we revisit the idea of a per-WHEN RETURNING clause?
> 
> For the record, I dislike this idea.

I agree that it makes things awkward, and if it creates grammatical
problems as well, then it's not very appealing.

There are only so many approaches though, so it would be helpful if you
could say which approach you prefer.

Assuming we have one RETURNING clause at the end, then it creates the
problem of how to communicate which WHEN clause a tuple came from,
whether it's the old or the new version, and/or which action was
performed on that tuple.

How do we communicate any of those things? We need to get that
information into the result table somehow, so it should probably be
some kind of expression that can exist in the RETURNING clause. But
what kind of expression?

(a) It could be a totally new expression kind with a new keyword (or
recycling some existing keywords for the same effect, or something that
looks superficially like a function call but isn't) that's only valid
in the RETURNING clause of a MERGE statement. If you use it in another
expression (say the targetlist of a SELECT statement), then you'd get a
failure at parse analysis time.

(b) It could be a FuncExpr that is passed the information out-of-band
(i.e. not as an argument) and would fail at runtime if called in the
wrong context.

(c) It could be a Var (or perhaps a Param?), but to which table would
it refer? The source or the target table could be used, but we would
also need a special table reference to represent additional context
(WHEN clause number, action, or "after" version of the tuple).

Dean's v11 patch had kind of a combination of (a) and (b). It's raises
an error at parse analysis time like (a), but without any grammar
changes or new expr kind because it's a function. I must admit that
might be a very reasonable compromise and I certainly won't reject it
without a clearly better alternative. It does feel like a hack though
in the sense that it's hard-coding function OIDs into the parse
analysis and I'm not sure that's a great thing to do. I wonder if it
would be worth thinking about a way to make it generic by really making
it into a different kind of function with pg_proc support? That feels
like over-engineering, and I hate to generalize from a single use case,
but it might be a good thought exercise.

The cleanest from a SQL perspective (in my opinion) would be something
more like (c), because the merge action and WHEN clause number would be
passed in tuple data. It also would be good precedent for something
like BEFORE/AFTER aliases, which could be useful for UPDATE actions.
But given the implementation complexities brought up earlier (I haven't
looked into the details, but others have), that might be over-
engineering.

Regards,
Jeff Davis





Re: MERGE ... RETURNING

2023-10-31 Thread Vik Fearing

On 10/24/23 21:10, Jeff Davis wrote:

Can we revisit the idea of a per-WHEN RETURNING clause?


For the record, I dislike this idea.
--
Vik Fearing





Re: MERGE ... RETURNING

2023-10-30 Thread Jeff Davis
On Fri, 2023-10-27 at 15:46 +0100, Dean Rasheed wrote:
> 
> One fiddly part is resolving the shift/reduce conflicts in the
> grammar. Specifically, on seeing "RETURNING expr when ...", there is
> ambiguity over whether the "when" is a column alias or the start of
> the next merge action. I've resolved that by assigning a slightly
> higher precedence to an expression without an alias, so WHEN is
> assumed to not be an alias. It seems pretty ugly though (in terms of
> having to duplicate so much code), and I'd be interested to know if
> there's a neater way to do it.

Can someone else comment on whether this is a reasonable solution to
the grammar problem?

> From a usability perspective, I'm still somewhat sceptical about this
> approach. It's a much more verbose syntax, and it gets quite tedious
> having to repeat the RETURNING list for every action, and keep them
> in
> sync.

If we go with the single RETURNING-clause-at-the-end approach, how
important is it that the action can be a part of an arbitrary
expression?

Perhaps something closer to your original proposal would be a good
compromise (sorry to backtrack yet again...)? It couldn't be used in an
arbitrary expression, but that also means that it couldn't end up in
the wrong kind of expression.

Regards,
Jeff Davis





Re: MERGE ... RETURNING

2023-10-27 Thread Dean Rasheed
 skip this for MERGE, since it uses per-action
+	 * RETURNING lists, which are handled below.
 	 */
-	if (node && node->returningLists != NIL)
+	if (node && node->returningLists != NIL && node->operation != CMD_MERGE)
 	{
 		TupleTableSlot *slot;
 		ExprContext *econtext;
 		List	   *returningList;
 
 		/* See the comment above for WCO lists. */
-		/* (except no RETURNING support for MERGE yet) */
 		Assert((node->operation == CMD_INSERT &&
 list_length(node->returningLists) == 1 &&
 list_length(node->resultRelations) == 1) ||
@@ -959,6 +959,25 @@ ExecInitPartitionInfo(ModifyTableState *
 	_whole_row);
 			action_state->mas_whenqual =
 ExecInitQual((List *) action->qual, >ps);
+
+			/* Build a projection for the action's RETURNING list, if any */
+			if (action->returningList)
+			{
+/* found_whole_row intentionally ignored. */
+action->returningList = (List *)
+	map_variable_attnos((Node *) action->returningList,
+		firstVarno, 0,
+		part_attmap,
+		RelationGetForm(partrel)->reltype,
+		_whole_row);
+
+action_state->mas_proj_returning =
+	ExecBuildProjectionInfo(action->returningList,
+			econtext,
+			mtstate->ps.ps_ResultTupleSlot,
+			>ps,
+			RelationGetDescr(partrel));
+			}
 		}
 	}
 	MemoryContextSwitchTo(oldcxt);
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
new file mode 100644
index bace252..68cb4fa
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -1660,8 +1660,8 @@ check_sql_fn_retval(List *queryTreeLists
 
 	/*
 	 * If it's a plain SELECT, it returns whatever the targetlist says.
-	 * Otherwise, if it's INSERT/UPDATE/DELETE with RETURNING, it returns
-	 * that. Otherwise, the function return type must be VOID.
+	 * Otherwise, if it's INSERT/UPDATE/DELETE/MERGE with RETURNING, it
+	 * returns that. Otherwise, the function return type must be VOID.
 	 *
 	 * Note: eventually replace this test with QueryReturnsTuples?	We'd need
 	 * a more general method of determining the output type, though.  Also, it
@@ -1679,7 +1679,8 @@ check_sql_fn_retval(List *queryTreeLists
 	else if (parse &&
 			 (parse->commandType == CMD_INSERT ||
 			  parse->commandType == CMD_UPDATE ||
-			  parse->commandType == CMD_DELETE) &&
+			  parse->commandType == CMD_DELETE ||
+			  parse->commandType == CMD_MERGE) &&
 			 parse->returningList)
 	{
 		tlist = parse->returningList;
@@ -1693,7 +1694,7 @@ check_sql_fn_retval(List *queryTreeLists
 (errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
  errmsg("return type mismatch in function declared to return %s",
 		format_type_be(rettype)),
- errdetail("Function's final statement must be SELECT or INSERT/UPDATE/DELETE RETURNING.")));
+ errdetail("Function's final statement must be SELECT or INSERT/UPDATE/DELETE/MERGE RETURNING.")));
 		return false;			/* keep compiler quiet */
 	}
 
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
new file mode 100644
index 299c2c7..b8a3500
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -36,8 +36,7 @@
  *		RETURNING tuple after completing each row insert, update, or delete.
  *		It must be called again to continue the operation.  Without RETURNING,
  *		we just loop within the node until all the work is done, then
- *		return NULL.  This avoids useless call/return overhead.  (MERGE does
- *		not support RETURNING.)
+ *		return NULL.  This avoids useless call/return overhead.
  */
 
 #include "postgres.h"
@@ -90,6 +89,7 @@ typedef struct ModifyTableContext
 
 	/* MERGE specific */
 	MergeActionState *relaction;	/* MERGE action in progress */
+	int			relaction_idx;		/* its position in the list */
 
 	/*
 	 * Information about the changes that were made concurrently to a tuple
@@ -153,13 +153,14 @@ static TupleTableSlot *ExecMerge(ModifyT
  ItemPointer tupleid,
  bool canSetTag);
 static void ExecInitMerge(ModifyTableState *mtstate, EState *estate);
-static bool ExecMergeMatched(ModifyTableContext *context,
-			 ResultRelInfo *resultRelInfo,
-			 ItemPointer tupleid,
-			 bool canSetTag);
-static void ExecMergeNotMatched(ModifyTableContext *context,
-ResultRelInfo *resultRelInfo,
-bool canSetTag);
+static TupleTableSlot *ExecMergeMatched(ModifyTableContext *context,
+		ResultRelInfo *resultRelInfo,
+		ItemPointer tupleid,
+		bool canSetTag,
+		bool *matched);
+static TupleTableSlot *ExecMergeNotMatched(ModifyTableContext *context,
+		   ResultRelInfo *resultRelInfo,
+		   bool canSetTag);
 
 
 /*
@@ -779,6 +780,19 @@ ExecInsert(ModifyTableContext *context,
 		slot = ExecPrepareTupleRouting(mtstate, estate, pr

Re: MERGE ... RETURNING

2023-10-24 Thread Merlin Moncure
On Tue, Oct 24, 2023 at 2:11 PM Jeff Davis  wrote:

> On Wed, 2023-08-23 at 11:58 +0100, Dean Rasheed wrote:
> > Updated version attached, fixing an uninitialized-variable warning
> > from the cfbot.
>
> I took another look and I'm still not comfortable with the special
> IsMergeSupportFunction() functions. I don't object necessarily -- if
> someone else wants to commit it, they can -- but I don't plan to commit
> it in this form.
>
> Can we revisit the idea of a per-WHEN RETURNING clause? The returning
> clauses could be treated kind of like a UNION, which makes sense
> because it really is a union of different results (the returned tuples
> from an INSERT are different than the returned tuples from a DELETE).
> You can just add constants to the target lists to distinguish which
> WHEN clause they came from.
>
> I know you rejected that approach early on, but perhaps it's worth
> discussing further?
>

 Yeah.  Side benefit, the 'action_number' felt really out of place, and
that neatly might solve it.  It doesn't match tg_op, for example.  With the
current approach, return a text, or an enum? Why doesn't it match concepts
that are pretty well established elsewhere?  SQL has a pretty good track
record for not inventing weird numbers with no real meaning (sadly, not so
much the developers).   Having said that, pg_merge_action() doesn't feel
too bad if the syntax issues can be worked out.

Very supportive of the overall goal.

merlin


Re: MERGE ... RETURNING

2023-10-24 Thread Jeff Davis
On Wed, 2023-08-23 at 11:58 +0100, Dean Rasheed wrote:
> Updated version attached, fixing an uninitialized-variable warning
> from the cfbot.

I took another look and I'm still not comfortable with the special
IsMergeSupportFunction() functions. I don't object necessarily -- if
someone else wants to commit it, they can -- but I don't plan to commit
it in this form.

Can we revisit the idea of a per-WHEN RETURNING clause? The returning
clauses could be treated kind of like a UNION, which makes sense
because it really is a union of different results (the returned tuples
from an INSERT are different than the returned tuples from a DELETE).
You can just add constants to the target lists to distinguish which
WHEN clause they came from.

I know you rejected that approach early on, but perhaps it's worth
discussing further?

Regards,
Jeff Davis





Re: MERGE ... RETURNING

2023-08-23 Thread Dean Rasheed
Updated version attached, fixing an uninitialized-variable warning
from the cfbot.

Regards,
Dean
diff --git a/doc/src/sgml/dml.sgml b/doc/src/sgml/dml.sgml
new file mode 100644
index cbbc5e2..7f65f6e
--- a/doc/src/sgml/dml.sgml
+++ b/doc/src/sgml/dml.sgml
@@ -283,10 +283,15 @@ DELETE FROM products;
RETURNING
   
 
+  
+   MERGE
+   RETURNING
+  
+
   
Sometimes it is useful to obtain data from modified rows while they are
being manipulated.  The INSERT, UPDATE,
-   and DELETE commands all have an
+   DELETE, and MERGE commands all have an
optional RETURNING clause that supports this.  Use
of RETURNING avoids performing an extra database query to
collect the data, and is especially valuable when it would otherwise be
@@ -339,6 +344,24 @@ DELETE FROM products
 
   
 
+  
+   In a MERGE, the data available to RETURNING is
+   the content of the source row plus the content of the inserted, updated, or
+   deleted target row.  In addition, the merge support functions
+and 
+   may be used to return information about which merge action was executed for
+   each row.  Since it is quite common for the source and target to have many
+   of the same columns, specifying RETURNING * can lead to a
+   lot of duplicated columns, so it is often more useful to just return the
+   target row.  For example:
+
+MERGE INTO products p USING new_products n ON p.product_no = n.product_no
+  WHEN NOT MATCHED THEN INSERT VALUES (n.product_no, n.name, n.price)
+  WHEN MATCHED THEN UPDATE SET name = n.name, price = n.price
+  RETURNING pg_merge_action(), p.*;
+
+  
+
   
If there are triggers () on the target table,
the data available to RETURNING is the row as modified by
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index be2f54c..e79e2ad
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21772,6 +21772,100 @@ SELECT count(*) FROM sometable;
 
  
 
+ 
+  Merge Support Functions
+
+  
+   MERGE
+   RETURNING
+  
+
+  
+   The merge support functions shown in
+may be used in the
+   RETURNING list of a  command
+   to return additional information about the action taken for each row.
+  
+
+  
+Merge Support Functions
+
+
+ 
+  
+   
+Function
+   
+   
+Description
+   
+  
+ 
+
+ 
+  
+   
+
+ pg_merge_action
+
+pg_merge_action ( )
+text
+   
+   
+Returns the action performed on the current row ('INSERT',
+'UPDATE', or 'DELETE').
+   
+  
+
+  
+   
+
+ pg_merge_when_clause_number
+
+pg_merge_when_clause_number ( )
+integer
+   
+   
+Returns the 1-based index of the WHEN clause executed
+for the current row.
+   
+  
+ 
+
+  
+
+  
+   Example:
+
+  
+
+  
+   Note that these functions can only be used in the RETURNING
+   list of a MERGE command.  It is an error to use them in
+   any other part of a query.
+  
+
+ 
+
  
   Subquery Expressions
 
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
new file mode 100644
index fe8def4..51a15ca
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1387,9 +1387,9 @@
  to a client upon the
  completion of an SQL command, usually a
  SELECT but it can be an
- INSERT, UPDATE, or
- DELETE command if the RETURNING
- clause is specified.
+ INSERT, UPDATE,
+ DELETE, or MERGE command if the
+ RETURNING clause is specified.
 
 
  The fact that a result set is a relation means that a query can be used
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
new file mode 100644
index f55e901..6803240
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -1020,8 +1020,8 @@ INSERT INTO mytable VALUES (1,'one'), (2
 
 
 
- If the command does return rows (for example SELECT,
- or INSERT/UPDATE/DELETE
+ If the command does return rows (for example SELECT, or
+ INSERT/UPDATE/DELETE/MERGE
  with RETURNING), there are two ways to proceed.
  When the command will return at most one row, or you only care about
  the first row of output, write the command as usual but add
@@ -1149,6 +1149,7 @@ SELECT select_expressionsexpressions INTO STRICT target;
 UPDATE ... RETURNING expressions INTO STRICT target;
 DELETE ... RETURNING expressions INTO STRICT target;
+MERGE ... RETURNING expressions INTO STRICT target;
 
 
  where target can be a record variable, a row
@@ -1159,8 +1160,8 @@ DELETE ... RETURNING expres
  INTO clause) just as described above,
  and the plan is cached in the same way.
  This works for SELECT,
- INSERT/UPDATE/DELETE with
- RETURNING, and certain utility commands
+ INSERT/UPDATE/DELETE/MERGE
+ with RETURNING, and certain utility commands
  that return row sets, such as EXPLAIN.
  Except

Re: MERGE ... RETURNING

2023-08-23 Thread Dean Rasheed
On Tue, 25 Jul 2023 at 21:46, Gurjeet Singh  wrote:
>
> There seems to be other use cases which need us to invent a method to
> expose a command-specific alias. See Tatsuo Ishii's call for help in
> his patch for Row Pattern Recognition [1].
>

I think that's different though, because in that example "START" is a
row from the table, and "price" is a table column, so using the table
alias syntax "START.price" makes sense, to refer to a value from the
table.

In this case "MERGE" and "action" have nothing to do with table rows
or columns, so saying "MERGE.action" doesn't fit the pattern.


> I performed the review vo v9 patch by comparing it to v8 and v7
> patches, and then comparing to HEAD.
>

Many thanks for looking at this.


> The v9 patch is more or less a complete revert to v7 patch, plus the
> planner support for calling the merge-support functions in subqueries,
> parser catching use of merge-support functions outside MERGE command,
> and name change for one of the support functions.
>

Yes, that's a fair summary.


> But reverting to v7 also means that some of my gripes with that
> version also return; e.g. invention of EXPR_KIND_MERGE_RETURNING. And
> as noted in v7 review, I don't have a better proposal.
>

True, but I think that it's in keeping with the purpose of the
ParseExprKind enumeration:

/*
 * Expression kinds distinguished by transformExpr().  Many of these are not
 * semantically distinct so far as expression transformation goes; rather,
 * we distinguish them so that context-specific error messages can be printed.
 */

which matches what the patch is using EXPR_KIND_MERGE_RETURNING for.


> - * Uplevel PlaceHolderVars and aggregates are replaced, too.
> + * Uplevel PlaceHolderVars, aggregates, GROUPING() expressions and merge
> + * support functions are replaced, too.
>
> Needs Oxford comma: s/GROUPING() expressions and/GROUPING() expressions, and/
>

Added.


> +pg_merge_action(PG_FUNCTION_ARGS)
> +{
> ...
> +relaction = mtstate->mt_merge_action;
> +if (relaction)
> +{
> ..
> +}
> +
> +PG_RETURN_NULL();
> +}
>
> Under what circumstances would the relaction be null? Is it okay to
> return NULL from this function if relaction is null, or is it better
> to throw an error? These questions apply to the
> pg_merge_when_clause_number() function as well.
>

Yes, it's really a "should never happen" situation, so I've converted
it to elog() an error. Similarly, commandType should never be
CMD_NOTHING in pg_merge_action(), so that also now throws an error.
Also, the planner code now throws an error if it sees a merge support
function outside a MERGE. Again, that should never happen, due to the
parser check, but it seems better to be sure, and catch it early.

While at it, I tidied up the planner code a bit, making the merge
support function handling more like the other cases in
replace_correlation_vars_mutator(), and making
replace_outer_merge_support_function() more like its neighbouring
functions, such as replace_outer_grouping(). In particular, it is now
only called if it is a reference to an upper-level MERGE, not for
local references, which matches the pattern used in the neighbouring
functions.

Finally, I have added some new RLS code and tests, to apply SELECT
policies to new rows inserted by MERGE INSERT actions, if a RETURNING
clause is specified, to make it consistent with a plain INSERT ...
RETURNING command (see commit c2e08b04c9).

Regards,
Dean
diff --git a/doc/src/sgml/dml.sgml b/doc/src/sgml/dml.sgml
new file mode 100644
index cbbc5e2..7f65f6e
--- a/doc/src/sgml/dml.sgml
+++ b/doc/src/sgml/dml.sgml
@@ -283,10 +283,15 @@ DELETE FROM products;
RETURNING
   
 
+  
+   MERGE
+   RETURNING
+  
+
   
Sometimes it is useful to obtain data from modified rows while they are
being manipulated.  The INSERT, UPDATE,
-   and DELETE commands all have an
+   DELETE, and MERGE commands all have an
optional RETURNING clause that supports this.  Use
of RETURNING avoids performing an extra database query to
collect the data, and is especially valuable when it would otherwise be
@@ -339,6 +344,24 @@ DELETE FROM products
 
   
 
+  
+   In a MERGE, the data available to RETURNING is
+   the content of the source row plus the content of the inserted, updated, or
+   deleted target row.  In addition, the merge support functions
+and 
+   may be used to return information about which merge action was executed for
+   each row.  Since it is quite common for the source and target to have many
+   of the same columns, specifying RETURNING * can lead to a
+   lot of duplicated columns, so it is often more useful to just return the
+   target row.  For example:
+
+MERGE INTO products p USING new_products n ON p.product_no = n.product_no
+  WHEN NOT MATCHED THEN INSERT 

Re: MERGE ... RETURNING

2023-07-25 Thread Gurjeet Singh
On Fri, Jul 21, 2023 at 7:17 PM Dean Rasheed  wrote:
>
> On Mon, 17 Jul 2023 at 20:43, Jeff Davis  wrote:
> >
> > > > Maybe instead of a function it could be a special table reference
> > > > like:
> > > >
> > > >   MERGE ... RETURNING MERGE.action, MERGE.action_number, id, val?
> > > >
> > The benefits are:
> >
> > 1. It is naturally constrained to the right context. It doesn't require
> > global variables and the PG_TRY/PG_FINALLY, and can't be called in the
> > wrong contexts (like SELECT).
> >
> > 2. More likely to be consistent with eventual support for NEW/OLD
> > (actually BEFORE/AFTER for reasons the prior thread discussed).
> >
>
> Thinking about this some more, I think that the point about
> constraining these functions to the right context is a reasonable one,
> and earlier versions of this patch did that better, without needing
> global variables or a PG_TRY/PG_FINALLY block.
>
> Here is an updated patch that goes back to doing it that way. This is
> more like the way that aggregate functions and GROUPING() work, in
> that the parser constrains the location from which the functions can
> be used, and at execution time, the functions rely on the relevant
> context being passed via the FunctionCallInfo context.
>
> It's still possible to use these functions in subqueries in the
> RETURNING list, but attempting to use them anywhere else (like a
> SELECT on its own) will raise an error at parse time. If they do
> somehow get invoked in a non-MERGE context, they will elog an error
> (again, just like aggregate functions), because that's a "shouldn't
> happen" error.
>
> This does nothing to be consistent with eventual support for
> BEFORE/AFTER, but I think that's really an entirely separate thing,

+1

> From a user perspective, writing something like "BEFORE.id" is quite
> natural, because it's clear that "id" is a column, and "BEFORE" is the
> old state of the table. Writing something like "MERGE.action" seems a
> lot more counter-intuitive, because "action" isn't a column of
> anything (and if it was, I think this syntax would potentially cause
> even more confusion).
>
> So really, I think "MERGE.action" is an abuse of the syntax,
> inconsistent with any other SQL syntax, and using functions is much
> more natural, akin to GROUPING(), for example.

There seems to be other use cases which need us to invent a method to
expose a command-specific alias. See Tatsuo Ishii's call for help in
his patch for Row Pattern Recognition [1].


I was not able to find a way to implement expressions like START.price
(START is not a table alias). Any suggestion is greatly appreciated.


It looks like the SQL standard has started using more of such
context-specific keywords, and I'd expect such keywords to only
increase in future, as the SQL committee tries to introduce more
features into the standard.

So if MERGE.action is not to your taste, perhaps you/someone can
suggest an alternative that doesn't cause a confusion, and yet
implementing it would open up the way for more such context-specific
keywords.

I performed the review vo v9 patch by comparing it to v8 and v7
patches, and then comparing to HEAD.

The v9 patch is more or less a complete revert to v7 patch, plus the
planner support for calling the merge-support functions in subqueries,
parser catching use of merge-support functions outside MERGE command,
and name change for one of the support functions.

But reverting to v7 also means that some of my gripes with that
version also return; e.g. invention of EXPR_KIND_MERGE_RETURNING. And
as noted in v7 review, I don't have a better proposal.

Function name changed from pg_merge_when_clause() to
pg_merge_when_clause_number(). That's better, even though it's a bit
of mouthful.

Doc changes (compared to v7) look good.

The changes made to tests (compared to v7) are for the better.

- * Uplevel PlaceHolderVars and aggregates are replaced, too.
+ * Uplevel PlaceHolderVars, aggregates, GROUPING() expressions and merge
+ * support functions are replaced, too.

Needs Oxford comma: s/GROUPING() expressions and/GROUPING() expressions, and/

+pg_merge_action(PG_FUNCTION_ARGS)
+{
...
+relaction = mtstate->mt_merge_action;
+if (relaction)
+{
..
+}
+
+PG_RETURN_NULL();
+}

Under what circumstances would the relaction be null? Is it okay to
return NULL from this function if relaction is null, or is it better
to throw an error? These questions apply to the
pg_merge_when_clause_number() function as well.

[1]: Row pattern recognition
https://www.postgresql.org/message-id/flat/20230625.210509.1276733411677577841.t-ishii%40sranhm.sra.co.jp

Best regards,
Gurjeet
http://Gurje.et




Re: MERGE ... RETURNING

2023-07-21 Thread Dean Rasheed
On Mon, 17 Jul 2023 at 20:43, Jeff Davis  wrote:
>
> > > Maybe instead of a function it could be a special table reference
> > > like:
> > >
> > >   MERGE ... RETURNING MERGE.action, MERGE.action_number, id, val?
> > >
> The benefits are:
>
> 1. It is naturally constrained to the right context. It doesn't require
> global variables and the PG_TRY/PG_FINALLY, and can't be called in the
> wrong contexts (like SELECT).
>
> 2. More likely to be consistent with eventual support for NEW/OLD
> (actually BEFORE/AFTER for reasons the prior thread discussed).
>

Thinking about this some more, I think that the point about
constraining these functions to the right context is a reasonable one,
and earlier versions of this patch did that better, without needing
global variables or a PG_TRY/PG_FINALLY block.

Here is an updated patch that goes back to doing it that way. This is
more like the way that aggregate functions and GROUPING() work, in
that the parser constrains the location from which the functions can
be used, and at execution time, the functions rely on the relevant
context being passed via the FunctionCallInfo context.

It's still possible to use these functions in subqueries in the
RETURNING list, but attempting to use them anywhere else (like a
SELECT on its own) will raise an error at parse time. If they do
somehow get invoked in a non-MERGE context, they will elog an error
(again, just like aggregate functions), because that's a "shouldn't
happen" error.

This does nothing to be consistent with eventual support for
BEFORE/AFTER, but I think that's really an entirely separate thing,
and likely to work quite differently, internally.

>From a user perspective, writing something like "BEFORE.id" is quite
natural, because it's clear that "id" is a column, and "BEFORE" is the
old state of the table. Writing something like "MERGE.action" seems a
lot more counter-intuitive, because "action" isn't a column of
anything (and if it was, I think this syntax would potentially cause
even more confusion).

So really, I think "MERGE.action" is an abuse of the syntax,
inconsistent with any other SQL syntax, and using functions is much
more natural, akin to GROUPING(), for example.

Regards,
Dean
diff --git a/doc/src/sgml/dml.sgml b/doc/src/sgml/dml.sgml
new file mode 100644
index cbbc5e2..7f65f6e
--- a/doc/src/sgml/dml.sgml
+++ b/doc/src/sgml/dml.sgml
@@ -283,10 +283,15 @@ DELETE FROM products;
RETURNING
   
 
+  
+   MERGE
+   RETURNING
+  
+
   
Sometimes it is useful to obtain data from modified rows while they are
being manipulated.  The INSERT, UPDATE,
-   and DELETE commands all have an
+   DELETE, and MERGE commands all have an
optional RETURNING clause that supports this.  Use
of RETURNING avoids performing an extra database query to
collect the data, and is especially valuable when it would otherwise be
@@ -339,6 +344,24 @@ DELETE FROM products
 
   
 
+  
+   In a MERGE, the data available to RETURNING is
+   the content of the source row plus the content of the inserted, updated, or
+   deleted target row.  In addition, the merge support functions
+and 
+   may be used to return information about which merge action was executed for
+   each row.  Since it is quite common for the source and target to have many
+   of the same columns, specifying RETURNING * can lead to a
+   lot of duplicated columns, so it is often more useful to just return the
+   target row.  For example:
+
+MERGE INTO products p USING new_products n ON p.product_no = n.product_no
+  WHEN NOT MATCHED THEN INSERT VALUES (n.product_no, n.name, n.price)
+  WHEN MATCHED THEN UPDATE SET name = n.name, price = n.price
+  RETURNING pg_merge_action(), p.*;
+
+  
+
   
If there are triggers () on the target table,
the data available to RETURNING is the row as modified by
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index b948276..b84a95a
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21705,6 +21705,100 @@ SELECT count(*) FROM sometable;
 
  
 
+ 
+  Merge Support Functions
+
+  
+   MERGE
+   RETURNING
+  
+
+  
+   The merge support functions shown in
+may be used in the
+   RETURNING list of a  command
+   to return additional information about the action taken for each row.
+  
+
+  
+Merge Support Functions
+
+
+ 
+  
+   
+Function
+   
+   
+Description
+   
+  
+ 
+
+ 
+  
+   
+
+ pg_merge_action
+
+pg_merge_action ( )
+text
+   
+   
+Returns the action performed on the current row ('INSERT',
+'UPDATE', or 'DELETE').
+   
+  
+
+  
+   
+
+ pg_merge_when_clause_number
+
+pg_merge_w

Re: MERGE ... RETURNING

2023-07-21 Thread Jeff Davis
On Thu, 2023-07-20 at 23:19 -0700, Gurjeet Singh wrote:
> Plus, if we were able to make it work as SQL syntax, it's very likely
> we can use the same technique to implement BEFORE and AFTER behaviour
> in UPDATE ... RETURNING that the old thread could not accomplish a
> decade ago.

To clarify, I don't think having a special table alias will require any
changes in gram.y and I don't consider it a syntactical change.

I haven't looked into the implementation yet.

Regards,
Jeff Davis





Re: MERGE ... RETURNING

2023-07-21 Thread Gurjeet Singh
On Mon, Jul 17, 2023 at 12:43 PM Jeff Davis  wrote:
>
> On Fri, 2023-07-14 at 09:55 +0100, Dean Rasheed wrote:
> > I found a 10-year-old thread discussing adding support for OLD/NEW to
> > RETURNING [1], but it doesn't look like anything close to a
> > committable solution was developed, or even a design that might lead
> > to one. That's a shame, because there seemed to be a lot of demand
> > for
> > the feature, but it's not clear how much effort it would be to
> > implement.
>
> It looks like progress was made in the direction of using a table alias
> with executor support to bring the right attributes along.

That patch introduced RTE_ALIAS to carry that info through to the
executor, and having to special-case that that in many places was seen
as a bad thing.

> There was some concern about how exactly the table alias should work
> such that it doesn't look too much like a join. Not sure how much of a
> problem that is.

My understanding of that thread is that the join example Robert shared
was for illustrative purposes only, to show that executor already has
enough information to produce the desired output, and to show that
it's a matter of tweaking the parser and planner to tell the executor
what output to produce. But later reviewers pointed out that it's not
that simple (example was given of ExecDelete performing
pullups/working hard to get the correct values of the old version of
the row).

The concerns were mainly around use of OLD.* and NEW.*, too much
special-casing of RTE_ALIAS, and then the code quality of the patch
itself.

> > > Maybe instead of a function it could be a special table reference
> > > like:
> > >
> > >   MERGE ... RETURNING MERGE.action, MERGE.action_number, id, val?
> > >
> >
> > Well, that's a little more concise, but I'm not sure that it really
> > buys us that much, to be worth the extra complication. Presumably
> > something in the planner would turn that into something the executor
> > could handle, which might just end up being the existing functions
> > anyway.
>
> The benefits are:
>
> 1. It is naturally constrained to the right context.

+1

> I'm not sure how much extra complication it would cause, though.

If that attempt with UPDATE RETURNING a decade ago is any indication,
it's probably a tough one.

Best regards,
Gurjeet
http://Gurje.et




Re: MERGE ... RETURNING

2023-07-21 Thread Gurjeet Singh
On Fri, Jul 14, 2023 at 1:55 AM Dean Rasheed  wrote:
>
> On Thu, 13 Jul 2023 at 20:14, Jeff Davis  wrote:
> >
> > On Thu, 2023-07-13 at 18:01 +0100, Dean Rasheed wrote:
> > > For some use cases, I can imagine allowing OLD/NEW.colname would mean
> > > you wouldn't need pg_merge_action() (if the column was NOT NULL), so
> > > I
> > > think the features should work well together.
> >
> > For use cases where a user could do it either way, which would you
> > expect to be the "typical" way (assuming we supported the new/old)?
> >
> >   MERGE ... RETURNING pg_merge_action(), id, val;
> >
> > or
> >
> >   MERGE ... RETURNING id, OLD.val, NEW.val;
> >
> > ?
> >
>
> I think it might depend on whether OLD.val and NEW.val were actually
> required, but I think I would still probably use pg_merge_action() to
> get the action, since it doesn't rely on specific table columns being
> NOT NULL.

+1. It would be better to expose the action explicitly, rather than
asking the user to deduce it based on the old and new values of a
column. The server providing that value is better than letting users
rely on error-prone methods.

> I found a 10-year-old thread discussing adding support for OLD/NEW to
> RETURNING [1],

Thanks for digging up that thread. An important concern brought up in
that thread was how the use of names OLD and NEW will affect plpgsql
(an possibly other PLs) trigger functions, which rely on specific
meaning for those names. The names BEFORE and AFTER, proposed there
are not as intuitive as OLD/NEW for the purpose of identifying old and
new versions of the row, but I don't have a better proposal. Perhaps
PREVIOUS and CURRENT?

> but it doesn't look like anything close to a
> committable solution was developed, or even a design that might lead
> to one. That's a shame, because there seemed to be a lot of demand for
> the feature,

+1

> > I am still bothered that pg_merge_action() is so context-sensitive.
> > "SELECT pg_merge_action()" by itself doesn't make any sense, but it's
> > allowed in the v8 patch. We could make that a runtime error, which
> > would be better, but it feels like it's structurally wrong. This is not
> > an objection, but it's just making me think harder about alternatives.
> >
> > Maybe instead of a function it could be a special table reference like:
> >
> >   MERGE ... RETURNING MERGE.action, MERGE.action_number, id, val?

I believe Jeff meant s/action_number/when_number/. Not that we've
settled on a name for this virtual column.

> Well, that's a little more concise, but I'm not sure that it really
> buys us that much, to be worth the extra complication.

After considering the options, and their pros and cons (ease of
implementation, possibility of conflict with SQL spec, intuitiveness
of syntax), I'm now strongly leaning towards the SQL syntax variant.
Exposing the action taken via a context-sensitive function feels
kludgy, when compared to Jeff's proposed SQL syntax. Don't get me
wrong, I still feel it was very clever how you were able to make the
function context sensitive, and make it work in expressions deeper in
the subqueries.

Plus, if we were able to make it work as SQL syntax, it's very likely
we can use the same technique to implement BEFORE and AFTER behaviour
in UPDATE ... RETURNING that the old thread could not accomplish a
decade ago.

> Presumably
> something in the planner would turn that into something the executor
> could handle, which might just end up being the existing functions
> anyway.

If the current patch's functions can serve the needs of the SQL syntax
variant, that'd be a neat win!

Best regards,
Gurjeet
http://Gurje.et




Re: MERGE ... RETURNING

2023-07-17 Thread Jeff Davis
On Fri, 2023-07-14 at 09:55 +0100, Dean Rasheed wrote:
> I found a 10-year-old thread discussing adding support for OLD/NEW to
> RETURNING [1], but it doesn't look like anything close to a
> committable solution was developed, or even a design that might lead
> to one. That's a shame, because there seemed to be a lot of demand
> for
> the feature, but it's not clear how much effort it would be to
> implement.

It looks like progress was made in the direction of using a table alias
with executor support to bring the right attributes along.

There was some concern about how exactly the table alias should work
such that it doesn't look too much like a join. Not sure how much of a
problem that is.

> > Maybe instead of a function it could be a special table reference
> > like:
> > 
> >   MERGE ... RETURNING MERGE.action, MERGE.action_number, id, val?
> > 
> 
> Well, that's a little more concise, but I'm not sure that it really
> buys us that much, to be worth the extra complication. Presumably
> something in the planner would turn that into something the executor
> could handle, which might just end up being the existing functions
> anyway.

The benefits are:

1. It is naturally constrained to the right context. It doesn't require
global variables and the PG_TRY/PG_FINALLY, and can't be called in the
wrong contexts (like SELECT).

2. More likely to be consistent with eventual support for NEW/OLD
(actually BEFORE/AFTER for reasons the prior thread discussed).

I'm not sure how much extra complication it would cause, though.

Regards,
Jeff Davis





Re: MERGE ... RETURNING

2023-07-14 Thread Dean Rasheed
On Thu, 13 Jul 2023 at 20:14, Jeff Davis  wrote:
>
> On Thu, 2023-07-13 at 18:01 +0100, Dean Rasheed wrote:
> > For some use cases, I can imagine allowing OLD/NEW.colname would mean
> > you wouldn't need pg_merge_action() (if the column was NOT NULL), so
> > I
> > think the features should work well together.
>
> For use cases where a user could do it either way, which would you
> expect to be the "typical" way (assuming we supported the new/old)?
>
>   MERGE ... RETURNING pg_merge_action(), id, val;
>
> or
>
>   MERGE ... RETURNING id, OLD.val, NEW.val;
>
> ?
>

I think it might depend on whether OLD.val and NEW.val were actually
required, but I think I would still probably use pg_merge_action() to
get the action, since it doesn't rely on specific table columns being
NOT NULL. It's a little like writing a trigger function that handles
multiple command types. You could use OLD and NEW to deduce whether it
was an INSERT, UPDATE or DELETE, or you could use TG_OP. I tend to use
TG_OP, but maybe there are situations where using OLD and NEW is more
natural.

I found a 10-year-old thread discussing adding support for OLD/NEW to
RETURNING [1], but it doesn't look like anything close to a
committable solution was developed, or even a design that might lead
to one. That's a shame, because there seemed to be a lot of demand for
the feature, but it's not clear how much effort it would be to
implement.

> I am still bothered that pg_merge_action() is so context-sensitive.
> "SELECT pg_merge_action()" by itself doesn't make any sense, but it's
> allowed in the v8 patch. We could make that a runtime error, which
> would be better, but it feels like it's structurally wrong. This is not
> an objection, but it's just making me think harder about alternatives.
>
> Maybe instead of a function it could be a special table reference like:
>
>   MERGE ... RETURNING MERGE.action, MERGE.action_number, id, val?
>

Well, that's a little more concise, but I'm not sure that it really
buys us that much, to be worth the extra complication. Presumably
something in the planner would turn that into something the executor
could handle, which might just end up being the existing functions
anyway.

Regards,
Dean

[1] https://www.postgresql.org/message-id/flat/51822C0F.5030807%40gmail.com




Re: MERGE ... RETURNING

2023-07-13 Thread Jeff Davis
On Thu, 2023-07-13 at 18:01 +0100, Dean Rasheed wrote:
> For some use cases, I can imagine allowing OLD/NEW.colname would mean
> you wouldn't need pg_merge_action() (if the column was NOT NULL), so
> I
> think the features should work well together.

For use cases where a user could do it either way, which would you
expect to be the "typical" way (assuming we supported the new/old)?

  MERGE ... RETURNING pg_merge_action(), id, val;

or

  MERGE ... RETURNING id, OLD.val, NEW.val;

?

I am still bothered that pg_merge_action() is so context-sensitive.
"SELECT pg_merge_action()" by itself doesn't make any sense, but it's
allowed in the v8 patch. We could make that a runtime error, which
would be better, but it feels like it's structurally wrong. This is not
an objection, but it's just making me think harder about alternatives.

Maybe instead of a function it could be a special table reference like:

  MERGE ... RETURNING MERGE.action, MERGE.action_number, id, val?

Regards,
Jeff Davis





Re: MERGE ... RETURNING

2023-07-13 Thread Dean Rasheed
On Thu, 13 Jul 2023 at 17:43, Vik Fearing  wrote:
>
> There is also the WITH ORDINALITY and FOR ORDINALITY examples.
>

True. I just think "number" is a friendlier, more familiar word than "ordinal".

> So perhaps pg_merge_when_clause_number() would
> > be a better name. It's still quite long, but it's the best I can think
> > of.
>
> How about pg_merge_match_number() or pg_merge_ordinality()?

I think "match_number" is problematic, because it might be a "matched"
or a "not matched" action. "when_clause" is the term used on the MERGE
doc page.

Regards,
Dean




Re: MERGE ... RETURNING

2023-07-13 Thread Dean Rasheed
On Thu, 13 Jul 2023 at 17:01, Jeff Davis  wrote:
>
> MERGE can end up combining old and new values in a way that doesn't
> happen with INSERT/UPDATE/DELETE. For instance, a "MERGE ... RETURNING
> id" would return a mix of NEW.id (for INSERT/UPDATE actions) and OLD.id
> (for DELETE actions).
>

Right, but allowing OLD/NEW.colname in the RETURNING list would remove
that complication, and it shouldn't change how a bare colname
reference behaves.

> The pg_merge_action() can differentiate the old and new values, but
> it's a bit more awkward.
>

For some use cases, I can imagine allowing OLD/NEW.colname would mean
you wouldn't need pg_merge_action() (if the column was NOT NULL), so I
think the features should work well together.

Regards,
Dean




Re: MERGE ... RETURNING

2023-07-13 Thread Vik Fearing

On 7/13/23 17:38, Dean Rasheed wrote:

On Tue, 11 Jul 2023 at 21:43, Gurjeet Singh  wrote:



I think the name of function pg_merge_when_clause() can be improved.
How about pg_merge_when_clause_ordinal().


That's a bit of a mouthful, but I don't have a better idea right now.
I've left the names alone for now, in case something better occurs to
anyone.


+1. How do we make sure we don't forget that it needs to be named
better. Perhaps a TODO item within the patch will help.



Thinking about that some more, I think the word "number" is more
familiar to most people than "ordinal". There's the row_number()
function, for example. 



There is also the WITH ORDINALITY and FOR ORDINALITY examples.


So perhaps pg_merge_when_clause_number() would

be a better name. It's still quite long, but it's the best I can think
of.



How about pg_merge_match_number() or pg_merge_ordinality()?
--
Vik Fearing





Re: MERGE ... RETURNING

2023-07-13 Thread Gurjeet Singh
On Thu, Jul 13, 2023 at 8:38 AM Dean Rasheed  wrote:
>
> On Tue, 11 Jul 2023 at 21:43, Gurjeet Singh  wrote:

> >  { oid => '9499', descr => 'command type of current MERGE action',
> > -  proname => 'pg_merge_action',  provolatile => 'v',
> > +  proname => 'pg_merge_action',  provolatile => 'v', proparallel => 'r',
> > 
> >  { oid => '9500', descr => 'index of current MERGE WHEN clause',
> > -  proname => 'pg_merge_when_clause',  provolatile => 'v',
> > +  proname => 'pg_merge_when_clause',  provolatile => 'v', proparallel => 
> > 'r',
> > 
> >
> > I see that you've now set proparallel of these functions to 'r'. I'd
> > just like to understand how you got to that conclusion.
> >
>
> Now that these functions can appear in subqueries in the RETURNING
> list, there exists the theoretical possibility that the subquery might
> use a parallel plan (actually that can't happen today, for any query
> that modifies data, but maybe someday it may become a possibility),
> and it's possible to use these functions in a SELECT query inside a
> PL/pgSQL function called from the RETURNING list, which might consider
> a parallel plan. Since these functions rely on access to executor
> state that isn't copied to parallel workers, they must be run on the
> leader, hence I think PARALLEL RESTRICTED is the right level to use. A
> similar example is pg_trigger_depth().

Thanks for the explanation. That helps.

Best regards,
Gurjeet
http://Gurje.et




Re: MERGE ... RETURNING

2023-07-13 Thread Jeff Davis
On Mon, 2023-01-09 at 12:29 +, Dean Rasheed wrote:
> > Would it be useful to have just the action? Perhaps "WITH ACTION"?
> > My idea is that this would return an enum of INSERT, UPDATE, DELETE
> > (so is "action" the right word?). It seems to me in many situations
> > I would be more likely to care about which of these 3 happened
> > rather than the exact clause that applied. This isn't necessarily
> > meant to be instead of your suggestion because I can imagine
> > wanting to know the exact clause, just an alternative that might
> > suffice in many situations. Using it would also avoid problems
> > arising from editing the query in a way which changes the numbers
> > of the clauses.
> > 
> 
> Hmm, perhaps that's something that can be added as well. Both use
> cases seem useful.

Can you expand a bit on the use cases for identifying individual WHEN
clauses? I see that it offers a new capability beyond just the action
type, but I'm having trouble thinking of real use cases.

Regards,
Jeff Davis





Re: MERGE ... RETURNING

2023-07-13 Thread Jeff Davis
On Mon, 2023-01-09 at 12:29 +, Dean Rasheed wrote:
> > Would it be feasible to allow specifying old.column or new.column?
> > These would always be NULL for INSERT and DELETE respectively but
> > more useful with UPDATE. Actually I've been meaning to ask this
> > question about UPDATE … RETURNING.
> > 
> 
> I too have wished for the ability to do that with UPDATE ...
> RETURNING, though I'm not sure how feasible it is.
> 
> I think it's something best considered separately though. I haven't
> given any thought as to how to make it work, so there might be
> technical difficulties. But if it could be made to work for UPDATE,
> it
> shouldn't be much more effort to make it work for MERGE.

MERGE can end up combining old and new values in a way that doesn't
happen with INSERT/UPDATE/DELETE. For instance, a "MERGE ... RETURNING
id" would return a mix of NEW.id (for INSERT/UPDATE actions) and OLD.id
(for DELETE actions).

The pg_merge_action() can differentiate the old and new values, but
it's a bit more awkward.

I'm fine considering that as a separate patch, but it does seem worth
discussing briefly here.

Regards,
Jeff Davis





Re: MERGE ... RETURNING

2023-07-13 Thread Dean Rasheed
On Tue, 11 Jul 2023 at 21:43, Gurjeet Singh  wrote:
>
> > > I think the name of function pg_merge_when_clause() can be improved.
> > > How about pg_merge_when_clause_ordinal().
> >
> > That's a bit of a mouthful, but I don't have a better idea right now.
> > I've left the names alone for now, in case something better occurs to
> > anyone.
>
> +1. How do we make sure we don't forget that it needs to be named
> better. Perhaps a TODO item within the patch will help.
>

Thinking about that some more, I think the word "number" is more
familiar to most people than "ordinal". There's the row_number()
function, for example. So perhaps pg_merge_when_clause_number() would
be a better name. It's still quite long, but it's the best I can think
of.

> I believe this elog can be safely turned into an Assert.
>
> +switch (mergeActionCmdType)
>  {
> -CmdType commandType = relaction->mas_action->commandType;
> 
> +case CMD_INSERT:
> 
> +default:
> +elog(ERROR, "unrecognized commandType: %d", (int)
> mergeActionCmdType);
>
> The same treatment can be applied to the elog(ERROR) in pg_merge_action().
>

Hmm, that's a very common code pattern used in dozens, if not hundreds
of places throughout the backend code, so I don't think this should be
different.

> > > +CREATE FUNCTION merge_into_sq_target(sid int, balance int, delta int,
> > > + OUT action text, OUT tid int,
> > > OUT new_balance int)
> > > +LANGUAGE sql AS
> > > +$$
> > > +MERGE INTO sq_target t
> > > +USING (VALUES ($1, $2, $3)) AS v(sid, balance, delta)
> > > +ON tid = v.sid
> > > +WHEN MATCHED AND tid > 2 THEN
> > >
> > > Again, for consistency, the comparison operator should be >=. There
> > > are a few more occurrences of this comparison in the rest of the file,
> > >  that need the same treatment.
> > >
> >
> > I changed the new tests to use ">= 2" (and the COPY test now returns 3
> > rows, with an action of each type, which is easier to read), but I
> > don't think it's really necessary to change all the existing tests
> > from "> 2". There's nothing wrong with the "= 2" case having no
> > action, as long as the tests give decent coverage.
>
> I was just trying to drive these tests towards a consistent pattern.
> As a reader, if I see these differences, the first and the
> conservative thought I have is that these differences must be there
> for a reason, and then I have to work to find out what those reasons
> might be. And that's a lot of wasted effort, just in case someone
> intends to change something in these tests.
>

OK, I see what you're saying. I think it should be a follow-on patch
though, because I don't like the idea of this patch to be making
changes to tests not related to the feature being added.

>  { oid => '9499', descr => 'command type of current MERGE action',
> -  proname => 'pg_merge_action',  provolatile => 'v',
> +  proname => 'pg_merge_action',  provolatile => 'v', proparallel => 'r',
> 
>  { oid => '9500', descr => 'index of current MERGE WHEN clause',
> -  proname => 'pg_merge_when_clause',  provolatile => 'v',
> +  proname => 'pg_merge_when_clause',  provolatile => 'v', proparallel => 'r',
> 
>
> I see that you've now set proparallel of these functions to 'r'. I'd
> just like to understand how you got to that conclusion.
>

Now that these functions can appear in subqueries in the RETURNING
list, there exists the theoretical possibility that the subquery might
use a parallel plan (actually that can't happen today, for any query
that modifies data, but maybe someday it may become a possibility),
and it's possible to use these functions in a SELECT query inside a
PL/pgSQL function called from the RETURNING list, which might consider
a parallel plan. Since these functions rely on access to executor
state that isn't copied to parallel workers, they must be run on the
leader, hence I think PARALLEL RESTRICTED is the right level to use. A
similar example is pg_trigger_depth().

> --- error when using MERGE support functions outside MERGE
> -SELECT pg_merge_action() FROM sq_target;
>
> I believe it would be worthwhile to keep a record of the expected
> outputs of these invocations in the tests, just in case they change
> over time.
>

Yeah, that makes sense. I'll post an update soon.

Regards,
Dean




Re: MERGE ... RETURNING

2023-07-13 Thread Jeff Davis
On Sun, 2023-01-08 at 12:28 +, Dean Rasheed wrote:
> I considered allowing a separate RETURNING list at the end of each
> action, but rapidly dismissed that idea.

One potential benefit of that approach is that it would be more natural
to specify output specific to the action, e.g. 

   WHEN MATCHED THEN UPDATE ... RETURNING 'UPDATE', ...

which would be an alternative to the special function pg_merge_action()
or "WITH WHEN".

I agree that it can be awkward to specify multiple RETURNING clauses
and get the columns to match up, but it's hard for me to say whether
it's better or worse without knowing more about the use cases.

Regards,
Jeff Davis





Re: MERGE ... RETURNING

2023-07-12 Thread Vik Fearing

On 7/13/23 01:48, Jeff Davis wrote:

On Wed, 2023-07-12 at 03:47 +0200, Vik Fearing wrote:


There is no RETURNING clause in Standard SQL, and the way they would
do
this is:

  SELECT ...
  FROM OLD TABLE (
  MERGE ...
  ) AS m

The rules for that for MERGE are well defined.


I only see OLD TABLE referenced as part of a trigger definition. Where
is it defined for MERGE?


Look up  for that syntax.  For how MERGE 
generates those, see 9075-2:2023 Section 14.12  General 
Rules 6.b and 6.c.



In any case, as long as the SQL standard doesn't conflict, then we're
fine. And it looks unlikely to cause a conflict right now that wouldn't
also be a conflict with our existing RETURNING clause elsewhere, so I'm
not seeing a problem here.


I do not see a problem either, which was what I was trying to express 
(perhaps poorly).  At least not with the syntax.  I have not yet tested 
that the returned rows match the standard.

--
Vik Fearing





Re: MERGE ... RETURNING

2023-07-12 Thread Jeff Davis
On Wed, 2023-07-12 at 03:47 +0200, Vik Fearing wrote:

> There is no RETURNING clause in Standard SQL, and the way they would
> do 
> this is:
> 
>  SELECT ...
>  FROM OLD TABLE (
>  MERGE ...
>  ) AS m
> 
> The rules for that for MERGE are well defined.

I only see OLD TABLE referenced as part of a trigger definition. Where
is it defined for MERGE?

In any case, as long as the SQL standard doesn't conflict, then we're
fine. And it looks unlikely to cause a conflict right now that wouldn't
also be a conflict with our existing RETURNING clause elsewhere, so I'm
not seeing a problem here.

Regards,
Jeff Davis





Re: MERGE ... RETURNING

2023-07-11 Thread Vik Fearing

On 7/12/23 02:43, Jeff Davis wrote:

On Sun, 2023-01-22 at 19:58 +0100, Alvaro Herrera wrote:


(We do have to keep our fingers
crossed that they will decide to use the same RETURNING syntax as we
do
in this patch, of course.)


Do we have a reason to think that they will accept something similar?


We have reason to think that they won't care at all.

There is no RETURNING clause in Standard SQL, and the way they would do 
this is:


SELECT ...
FROM OLD TABLE (
MERGE ...
) AS m

The rules for that for MERGE are well defined.
--
Vik Fearing





Re: MERGE ... RETURNING

2023-07-11 Thread Jeff Davis
On Sun, 2023-01-22 at 19:58 +0100, Alvaro Herrera wrote:
> 
> (We do have to keep our fingers
> crossed that they will decide to use the same RETURNING syntax as we
> do
> in this patch, of course.)

Do we have a reason to think that they will accept something similar?

Regards,
Jeff Davis





Re: MERGE ... RETURNING

2023-07-11 Thread Gurjeet Singh
On Tue, Jul 11, 2023 at 1:43 PM Gurjeet Singh  wrote:
>
> In the above hunk, if there's an exception/ERROR, I believe we should
> PG_RE_THROW(). If there's a reason to continue, we should at least set
> rslot = NULL, otherwise we may be returning an uninitialized value to
> the caller.

Excuse the brain-fart on my part. There's not need to PG_RE_THROW(),
since there's no PG_CATCH(). Re-learning the code's infrastructure
slowly :-)

Best regards,
Gurjeet
http://Gurje.et




Re: MERGE ... RETURNING

2023-07-11 Thread Gurjeet Singh
On Fri, Jul 7, 2023 at 3:48 PM Dean Rasheed  wrote:
>
> On Thu, 6 Jul 2023 at 06:13, Gurjeet Singh  wrote:
> >
> > > One minor annoyance is that, due to the way that pg_merge_action() and
> > > pg_merge_when_clause() require access to the MergeActionState, they
> > > only work if they appear directly in the RETURNING list. They can't,
> > > for example, appear in a subquery in the RETURNING list, and I don't
> > > see an easy way round that limitation.
> >
> > I believe that's a serious limitation, and would be a blocker for the 
> > feature.
>
> Yes, that was bugging me for quite a while.
>
> Attached is a new version that removes that restriction, allowing the
> merge support functions to appear anywhere. This requires a bit of
> planner support, to deal with merge support functions in subqueries,
> in a similar way to how aggregates and GROUPING() expressions are
> handled. But a number of the changes from the previous patch are no
> longer necessary, so overall, this version of the patch is slightly
> smaller.

+1

> > I think the name of function pg_merge_when_clause() can be improved.
> > How about pg_merge_when_clause_ordinal().
> >
>
> That's a bit of a mouthful, but I don't have a better idea right now.
> I've left the names alone for now, in case something better occurs to
> anyone.

+1. How do we make sure we don't forget that it needs to be named
better. Perhaps a TODO item within the patch will help.

> > In the doc page of MERGE, you've moved the 'with_query' from the
> > bottom of Parameters section to the top of that section. Any reason
> > for this? Since the Parameters section is quite large, for a moment I
> > thought the patch was _adding_ the description of 'with_query'.
> >
>
> Ah yes, I was just making the order consistent with the
> INSERT/UPDATE/DELETE pages. That could probably be committed
> separately.

I don't think that's necessary, if it improves consistency with related docs.

> > +/*
> > + * Merge support functions should only be called directly from a MERGE
> > + * command, and need access to the parent ModifyTableState.  The parser
> > + * should have checked that such functions only appear in the RETURNING
> > + * list of a MERGE, so this should never fail.
> > + */
> > +if (IsMergeSupportFunction(funcid))
> > +{
> > +if (!state->parent ||
> > +!IsA(state->parent, ModifyTableState) ||
> > +((ModifyTableState *) state->parent)->operation != CMD_MERGE)
> > +elog(ERROR, "merge support function called in non-merge 
> > context");
> >
> > As the comment says, this is an unexpected condition, and should have
> > been caught and reported by the parser. So I think it'd be better to
> > use an Assert() here. Moreover, there's ERROR being raised by the
> > pg_merge_action() and pg_merge_when_clause() functions, when they
> > detect the function context is not appropriate.
> >
> > I found it very innovative to allow these functions to be called only
> > in a certain part of certain SQL command. I don't think  there's a
> > precedence for this in  Postgres; I'd be glad to learn if there are
> > other functions that Postgres exposes this way.
> >
> > -EXPR_KIND_RETURNING,/* RETURNING */
> > +EXPR_KIND_RETURNING,/* RETURNING in INSERT/UPDATE/DELETE */
> > +EXPR_KIND_MERGE_RETURNING,  /* RETURNING in MERGE */
> >
> > Having to invent a whole new ParseExprKind enum, feels like a bit of
> > an overkill. My only objection is that this is exactly like
> > EXPR_KIND_RETURNING, hence EXPR_KIND_MERGE_RETURNING needs to be dealt
> > with exactly in as many places. But I don't have any alternative
> > proposals.
> >
>
> That's all gone now from the new patch, since there is no longer any
> restriction on where these functions can appear.

I believe this elog can be safely turned into an Assert.

+switch (mergeActionCmdType)
 {
-CmdType commandType = relaction->mas_action->commandType;

+case CMD_INSERT:

+default:
+elog(ERROR, "unrecognized commandType: %d", (int)
mergeActionCmdType);

The same treatment can be applied to the elog(ERROR) in pg_merge_action().

> > +CREATE FUNCTION merge_into_sq_target(sid int, balance int, delta int,
> > + OUT action text, OUT tid int,
> > OUT new_balance int)
> > +LANGUAGE sql AS
> > +$$
> > +MERGE INTO sq_target t
> > +USING (VALUES ($1, $2, $3)) AS v(sid, balance, delta)
> > +ON tid = v.sid
> > +WHEN MATCHED AND tid > 2 THEN
> >
> > Again, for consistency, the comparison operator should be >=. There
> > are a few more occurrences of this comparison in the rest of the file,
> >  that need the same treatment.
> >
>
> I changed the new tests to use ">= 2" (and the COPY test now returns 3
> rows, with an action of each type, which is easier to read), but I
> don't think it's really necessary to change all the existing tests
> from "> 2". There's nothing 

Re: MERGE ... RETURNING

2023-07-07 Thread Dean Rasheed
On Thu, 6 Jul 2023 at 06:13, Gurjeet Singh  wrote:
>
> Most of the review was done with the v6 of the patch, minus the doc
> build step. The additional changes in v7 of the patch were eyeballed,
> and tested with `make check`.
>

Thanks for the review!

> > One minor annoyance is that, due to the way that pg_merge_action() and
> > pg_merge_when_clause() require access to the MergeActionState, they
> > only work if they appear directly in the RETURNING list. They can't,
> > for example, appear in a subquery in the RETURNING list, and I don't
> > see an easy way round that limitation.
>
> I believe that's a serious limitation, and would be a blocker for the feature.
>

Yes, that was bugging me for quite a while.

Attached is a new version that removes that restriction, allowing the
merge support functions to appear anywhere. This requires a bit of
planner support, to deal with merge support functions in subqueries,
in a similar way to how aggregates and GROUPING() expressions are
handled. But a number of the changes from the previous patch are no
longer necessary, so overall, this version of the patch is slightly
smaller.

> I think the name of function pg_merge_when_clause() can be improved.
> How about pg_merge_when_clause_ordinal().
>

That's a bit of a mouthful, but I don't have a better idea right now.
I've left the names alone for now, in case something better occurs to
anyone.

> In the doc page of MERGE, you've moved the 'with_query' from the
> bottom of Parameters section to the top of that section. Any reason
> for this? Since the Parameters section is quite large, for a moment I
> thought the patch was _adding_ the description of 'with_query'.
>

Ah yes, I was just making the order consistent with the
INSERT/UPDATE/DELETE pages. That could probably be committed
separately.

> +/*
> + * Merge support functions should only be called directly from a MERGE
> + * command, and need access to the parent ModifyTableState.  The parser
> + * should have checked that such functions only appear in the RETURNING
> + * list of a MERGE, so this should never fail.
> + */
> +if (IsMergeSupportFunction(funcid))
> +{
> +if (!state->parent ||
> +!IsA(state->parent, ModifyTableState) ||
> +((ModifyTableState *) state->parent)->operation != CMD_MERGE)
> +elog(ERROR, "merge support function called in non-merge 
> context");
>
> As the comment says, this is an unexpected condition, and should have
> been caught and reported by the parser. So I think it'd be better to
> use an Assert() here. Moreover, there's ERROR being raised by the
> pg_merge_action() and pg_merge_when_clause() functions, when they
> detect the function context is not appropriate.
>
> I found it very innovative to allow these functions to be called only
> in a certain part of certain SQL command. I don't think  there's a
> precedence for this in  Postgres; I'd be glad to learn if there are
> other functions that Postgres exposes this way.
>
> -EXPR_KIND_RETURNING,/* RETURNING */
> +EXPR_KIND_RETURNING,/* RETURNING in INSERT/UPDATE/DELETE */
> +EXPR_KIND_MERGE_RETURNING,  /* RETURNING in MERGE */
>
> Having to invent a whole new ParseExprKind enum, feels like a bit of
> an overkill. My only objection is that this is exactly like
> EXPR_KIND_RETURNING, hence EXPR_KIND_MERGE_RETURNING needs to be dealt
> with exactly in as many places. But I don't have any alternative
> proposals.
>

That's all gone now from the new patch, since there is no longer any
restriction on where these functions can appear.

> --- a/src/include/catalog/pg_proc.h
> +++ b/src/include/catalog/pg_proc.h
> +/* Is this a merge support function?  (Requires fmgroids.h) */
> +#define IsMergeSupportFunction(funcid) \
> +((funcid) == F_PG_MERGE_ACTION || \
> + (funcid) == F_PG_MERGE_WHEN_CLAUSE)
>
> If it doesn't cause recursion or other complications, I think we
> should simply include the fmgroids.h in pg_proc.h. I understand that
> not all .c files/consumers that include pg_proc.h may need fmgroids.h,
> but #include'ing it will eliminate the need to keep the "Requires..."
> note above, and avoid confusion, too.
>

There's now only one place that uses this macro, whereas there are a
lot of places that include pg_proc.h and not fmgroids.h, so I don't
think forcing them all to include fmgroids.h is a good idea. (BTW,
this approach and comment is borrowed from IsTrueArrayType() in
pg_type.h)

> --- a/src/test/regress/expected/merge.out
> +++ b/src/test/regress/expected/merge.out
>
> -WHEN MATCHED AND tid > 2 THEN
> +WHEN MATCHED AND tid >= 2 THEN
>
> This change can be treated as a bug fix :-)
>

Re: MERGE ... RETURNING

2023-07-06 Thread Gurjeet Singh
On Thu, Jul 6, 2023 at 4:07 AM jian he  wrote:
>
> On Thu, Jul 6, 2023 at 1:13 PM Gurjeet Singh  wrote:

> > I think the name of function pg_merge_when_clause() can be improved.
> > How about pg_merge_when_clause_ordinal().
>
> another idea: pg_merge_action_ordinal()

Since there can be many occurrences of the same action
(INSERT/UPDATE/DELETE) in a MERGE command associated with different
conditions, I don't think action_ordinal would make sense for this
function name.

e.g.
WHEN  MATCHED and src.col1 = val1 THEN UPDATE col2 = someval1
WHEN  MATCHED and src.col1 = val2 THEN UPDATE col2 = someval2
...

When looking at the implementation code, as well, we see that the code
in this function tracks and reports the lexical position of the WHEN
clause, irrespective of the action associated with that WHEN clause.

foreach(l, stmt->mergeWhenClauses)
{
...
action->index = foreach_current_index(l) + 1;

Best regards,
Gurjeet
http://Gurje.et




Re: MERGE ... RETURNING

2023-07-06 Thread Gurjeet Singh
On Thu, Jul 6, 2023 at 3:39 AM Alvaro Herrera  wrote:
>
> On 2023-Jul-05, Gurjeet Singh wrote:

> > I expected the .out file to have captured the stdout. I'm gradually,
> > and gladly, re-learning bits of the test infrastructure.
> >
> > The DELETE command tag in the output does not feel appropriate for a
> > COPY command that's using MERGE as the source of the data.
>
> You misread this one :-)  The COPY output is there, the tag is not.  So
> DELETE is the value from pg_merge_action(), and "1 100" correspond to
> the columns in the the sq_target row that was deleted.  The command tag
> is presumably MERGE 1.

:-) That makes more sense. It matches my old mental model. Thanks for
clarifying!

Best regards,
Gurjeet
http://Gurje.et




Re: MERGE ... RETURNING

2023-07-06 Thread jian he
On Thu, Jul 6, 2023 at 1:13 PM Gurjeet Singh  wrote:
>
> On Sat, Jul 1, 2023 at 4:08 AM Dean Rasheed  wrote:
> >
> > On Mon, 13 Mar 2023 at 13:36, Dean Rasheed  wrote:
> > >
> > > And another rebase.
> > >
> >
> > I ran out of cycles to pursue the MERGE patches in v16, but hopefully
> > I can make more progress in v17.
> >
> > Looking at this one with fresh eyes, it looks mostly in good shape.
>
> +1
>
> Most of the review was done with the v6 of the patch, minus the doc
> build step. The additional changes in v7 of the patch were eyeballed,
> and tested with `make check`.
>
> > To
> > recap, this adds support for the RETURNING clause in MERGE, together
> > with new support functions pg_merge_action() and
> > pg_merge_when_clause() that can be used in the RETURNING list of MERGE
>
> nit: s/can be used in/can be used only in/
>
> > to retrieve the kind of action (INSERT/UPDATE/DELETE), and the index
> > of the WHEN clause executed for each row merged. In addition,
> > RETURNING support allows MERGE to be used as the source query in COPY
> > TO and WITH queries.
> >
> > One minor annoyance is that, due to the way that pg_merge_action() and
> > pg_merge_when_clause() require access to the MergeActionState, they
> > only work if they appear directly in the RETURNING list. They can't,
> > for example, appear in a subquery in the RETURNING list, and I don't
> > see an easy way round that limitation.
>
> I believe that's a serious limitation, and would be a blocker for the feature.
>
> > Attached is an updated patch with some cosmetic updates, plus updated
> > ruleutils support.
>
> With each iteration of your patch, it is becoming increasingly clear
> that this will be a documentation-heavy patch :-)
>
> I think the name of function pg_merge_when_clause() can be improved.
> How about pg_merge_when_clause_ordinal().
>

> I think the name of function pg_merge_when_clause() can be improved.
> How about pg_merge_when_clause_ordinal().

another idea: pg_merge_action_ordinal()




Re: MERGE ... RETURNING

2023-07-06 Thread Alvaro Herrera
On 2023-Jul-05, Gurjeet Singh wrote:

> +BEGIN;
> +COPY (
> +MERGE INTO sq_target t
> +USING v
> +ON tid = sid
> +WHEN MATCHED AND tid > 2 THEN
> +UPDATE SET balance = t.balance + delta
> +WHEN NOT MATCHED THEN
> +INSERT (balance, tid) VALUES (balance + delta, sid)
> +WHEN MATCHED AND tid < 2 THEN
> +DELETE
> +RETURNING pg_merge_action(), t.*
> +) TO stdout;
> +DELETE  1   100
> +ROLLBACK;
> 
> I expected the .out file to have captured the stdout. I'm gradually,
> and gladly, re-learning bits of the test infrastructure.
> 
> The DELETE command tag in the output does not feel appropriate for a
> COPY command that's using MERGE as the source of the data.

You misread this one :-)  The COPY output is there, the tag is not.  So
DELETE is the value from pg_merge_action(), and "1 100" correspond to
the columns in the the sq_target row that was deleted.  The command tag
is presumably MERGE 1.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/




Re: MERGE ... RETURNING

2023-07-05 Thread Gurjeet Singh
On Sat, Jul 1, 2023 at 4:08 AM Dean Rasheed  wrote:
>
> On Mon, 13 Mar 2023 at 13:36, Dean Rasheed  wrote:
> >
> > And another rebase.
> >
>
> I ran out of cycles to pursue the MERGE patches in v16, but hopefully
> I can make more progress in v17.
>
> Looking at this one with fresh eyes, it looks mostly in good shape.

+1

Most of the review was done with the v6 of the patch, minus the doc
build step. The additional changes in v7 of the patch were eyeballed,
and tested with `make check`.

> To
> recap, this adds support for the RETURNING clause in MERGE, together
> with new support functions pg_merge_action() and
> pg_merge_when_clause() that can be used in the RETURNING list of MERGE

nit: s/can be used in/can be used only in/

> to retrieve the kind of action (INSERT/UPDATE/DELETE), and the index
> of the WHEN clause executed for each row merged. In addition,
> RETURNING support allows MERGE to be used as the source query in COPY
> TO and WITH queries.
>
> One minor annoyance is that, due to the way that pg_merge_action() and
> pg_merge_when_clause() require access to the MergeActionState, they
> only work if they appear directly in the RETURNING list. They can't,
> for example, appear in a subquery in the RETURNING list, and I don't
> see an easy way round that limitation.

I believe that's a serious limitation, and would be a blocker for the feature.

> Attached is an updated patch with some cosmetic updates, plus updated
> ruleutils support.

With each iteration of your patch, it is becoming increasingly clear
that this will be a documentation-heavy patch :-)

I think the name of function pg_merge_when_clause() can be improved.
How about pg_merge_when_clause_ordinal().

In the doc page of MERGE, you've moved the 'with_query' from the
bottom of Parameters section to the top of that section. Any reason
for this? Since the Parameters section is quite large, for a moment I
thought the patch was _adding_ the description of 'with_query'.


+/*
+ * Merge support functions should only be called directly from a MERGE
+ * command, and need access to the parent ModifyTableState.  The parser
+ * should have checked that such functions only appear in the RETURNING
+ * list of a MERGE, so this should never fail.
+ */
+if (IsMergeSupportFunction(funcid))
+{
+if (!state->parent ||
+!IsA(state->parent, ModifyTableState) ||
+((ModifyTableState *) state->parent)->operation != CMD_MERGE)
+elog(ERROR, "merge support function called in non-merge context");

As the comment says, this is an unexpected condition, and should have
been caught and reported by the parser. So I think it'd be better to
use an Assert() here. Moreover, there's ERROR being raised by the
pg_merge_action() and pg_merge_when_clause() functions, when they
detect the function context is not appropriate.

I found it very innovative to allow these functions to be called only
in a certain part of certain SQL command. I don't think  there's a
precedence for this in  Postgres; I'd be glad to learn if there are
other functions that Postgres exposes this way.

-EXPR_KIND_RETURNING,/* RETURNING */
+EXPR_KIND_RETURNING,/* RETURNING in INSERT/UPDATE/DELETE */
+EXPR_KIND_MERGE_RETURNING,  /* RETURNING in MERGE */

Having to invent a whole new ParseExprKind enum, feels like a bit of
an overkill. My only objection is that this is exactly like
EXPR_KIND_RETURNING, hence EXPR_KIND_MERGE_RETURNING needs to be dealt
with exactly in as many places. But I don't have any alternative
proposals.

--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
+/* Is this a merge support function?  (Requires fmgroids.h) */
+#define IsMergeSupportFunction(funcid) \
+((funcid) == F_PG_MERGE_ACTION || \
+ (funcid) == F_PG_MERGE_WHEN_CLAUSE)

If it doesn't cause recursion or other complications, I think we
should simply include the fmgroids.h in pg_proc.h. I understand that
not all .c files/consumers that include pg_proc.h may need fmgroids.h,
but #include'ing it will eliminate the need to keep the "Requires..."
note above, and avoid confusion, too.

--- a/src/test/regress/expected/merge.out
+++ b/src/test/regress/expected/merge.out

-WHEN MATCHED AND tid > 2 THEN
+WHEN MATCHED AND tid >= 2 THEN

This change can be treated as a bug fix :-)

+-- COPY (MERGE ... RETURNING) TO ...
+BEGIN;
+COPY (
+MERGE INTO sq_target t
+USING v
+ON tid = sid
+WHEN MATCHED AND tid > 2 THEN

For consistency, the check should be tid >= 2, like you've fixed in
command referenced above.

+BEGIN;
+COPY (
+MERGE INTO sq_target t
+USING v
+ON tid = sid
+WHEN MATCHED AND tid > 2 THEN
+UPDATE SET balance = t.balance + delta
+WHEN NOT MATCHED THEN
+INSERT (balance, tid) VALUES (balance + delta, sid)
+WHE

Re: MERGE ... RETURNING

2023-07-01 Thread Dean Rasheed
On Mon, 13 Mar 2023 at 13:36, Dean Rasheed  wrote:
>
> And another rebase.
>

I ran out of cycles to pursue the MERGE patches in v16, but hopefully
I can make more progress in v17.

Looking at this one with fresh eyes, it looks mostly in good shape. To
recap, this adds support for the RETURNING clause in MERGE, together
with new support functions pg_merge_action() and
pg_merge_when_clause() that can be used in the RETURNING list of MERGE
to retrieve the kind of action (INSERT/UPDATE/DELETE), and the index
of the WHEN clause executed for each row merged. In addition,
RETURNING support allows MERGE to be used as the source query in COPY
TO and WITH queries.

One minor annoyance is that, due to the way that pg_merge_action() and
pg_merge_when_clause() require access to the MergeActionState, they
only work if they appear directly in the RETURNING list. They can't,
for example, appear in a subquery in the RETURNING list, and I don't
see an easy way round that limitation.

Attached is an updated patch with some cosmetic updates, plus updated
ruleutils support.

Regards,
Dean
diff --git a/doc/src/sgml/dml.sgml b/doc/src/sgml/dml.sgml
new file mode 100644
index cbbc5e2..ff2a827
--- a/doc/src/sgml/dml.sgml
+++ b/doc/src/sgml/dml.sgml
@@ -283,10 +283,15 @@ DELETE FROM products;
RETURNING
   
 
+  
+   MERGE
+   RETURNING
+  
+
   
Sometimes it is useful to obtain data from modified rows while they are
being manipulated.  The INSERT, UPDATE,
-   and DELETE commands all have an
+   DELETE, and MERGE commands all have an
optional RETURNING clause that supports this.  Use
of RETURNING avoids performing an extra database query to
collect the data, and is especially valuable when it would otherwise be
@@ -339,6 +344,24 @@ DELETE FROM products
 
   
 
+  
+   In a MERGE, the data available to RETURNING is
+   the content of the source row plus the content of the inserted, updated, or
+   deleted target row.  In addition, the merge support functions
+and 
+   may be used to return information about which merge action was executed for
+   each row.  Since it is quite common for the source and target to have many
+   of the same columns, specifying RETURNING * can lead to a
+   lot of duplicated columns, so it is often more useful to just return the
+   target row.  For example:
+
+MERGE INTO products p USING new_products n ON p.product_no = n.product_no
+  WHEN NOT MATCHED THEN INSERT VALUES (n.product_no, n.name, n.price)
+  WHEN MATCHED THEN UPDATE SET name = n.name, price = n.price
+  RETURNING pg_merge_action(), p.*;
+
+  
+
   
If there are triggers () on the target table,
the data available to RETURNING is the row as modified by
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index 5a47ce4..ca1b3cc
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21705,6 +21705,100 @@ SELECT count(*) FROM sometable;
 
  
 
+ 
+  Merge Support Functions
+
+  
+   MERGE
+   RETURNING
+  
+
+  
+   The merge support functions shown in
+may be used in the
+   RETURNING list of a  command
+   to return additional information about the action taken for each row.
+  
+
+  
+Merge Support Functions
+
+
+ 
+  
+   
+Function
+   
+   
+Description
+   
+  
+ 
+
+ 
+  
+   
+
+ pg_merge_action
+
+pg_merge_action ( )
+text
+   
+   
+Returns the action performed on the current row ('INSERT',
+'UPDATE', or 'DELETE').
+   
+  
+
+  
+   
+
+ pg_merge_when_clause
+
+pg_merge_when_clause ( )
+integer
+   
+   
+Returns the 1-based index of the WHEN clause executed
+for the current row.
+   
+  
+ 
+
+  
+
+  
+   Example:
+
+  
+
+  
+   Note that these functions can only be used in the RETURNING
+   list of a MERGE command.  It is an error to use them in
+   any other part of a query, or in a subquery.
+  
+
+ 
+
  
   Subquery Expressions
 
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
new file mode 100644
index fe8def4..51a15ca
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1387,9 +1387,9 @@
  to a client upon the
  completion of an SQL command, usually a
  SELECT but it can be an
- INSERT, UPDATE, or
- DELETE command if the RETURNING
- clause is specified.
+ INSERT, UPDATE,
+ DELETE, or MERGE command if the
+ RETURNING clause is specified.
 
 
  The fact that a result set is a relation means that a query can be used
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
new file mode 100644
index f55e901..6803240
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -1020,8 +1020,8 @@ INSERT INTO mytable VALUES (1,'one'), (2
 
 
 
- If the command does return rows (for example SELECT,
- or INSE

Re: MERGE ... RETURNING

2023-03-13 Thread Dean Rasheed
On Sun, 26 Feb 2023 at 09:50, Dean Rasheed  wrote:
>
> Another rebase.
>

And another rebase.

Regards,
Dean
diff --git a/doc/src/sgml/dml.sgml b/doc/src/sgml/dml.sgml
new file mode 100644
index cbbc5e2..ff2a827
--- a/doc/src/sgml/dml.sgml
+++ b/doc/src/sgml/dml.sgml
@@ -283,10 +283,15 @@ DELETE FROM products;
RETURNING
   
 
+  
+   MERGE
+   RETURNING
+  
+
   
Sometimes it is useful to obtain data from modified rows while they are
being manipulated.  The INSERT, UPDATE,
-   and DELETE commands all have an
+   DELETE, and MERGE commands all have an
optional RETURNING clause that supports this.  Use
of RETURNING avoids performing an extra database query to
collect the data, and is especially valuable when it would otherwise be
@@ -339,6 +344,24 @@ DELETE FROM products
 
   
 
+  
+   In a MERGE, the data available to RETURNING is
+   the content of the source row plus the content of the inserted, updated, or
+   deleted target row.  In addition, the merge support functions
+and 
+   may be used to return information about which merge action was executed for
+   each row.  Since it is quite common for the source and target to have many
+   of the same columns, specifying RETURNING * can lead to a
+   lot of duplicated columns, so it is often more useful to just return the
+   target row.  For example:
+
+MERGE INTO products p USING new_products n ON p.product_no = n.product_no
+  WHEN NOT MATCHED THEN INSERT VALUES (n.product_no, n.name, n.price)
+  WHEN MATCHED THEN UPDATE SET name = n.name, price = n.price
+  RETURNING pg_merge_action(), p.*;
+
+  
+
   
If there are triggers () on the target table,
the data available to RETURNING is the row as modified by
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index 9c6107f..d07f90a
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21216,6 +21216,99 @@ SELECT count(*) FROM sometable;
 
  
 
+ 
+  Merge Support Functions
+
+  
+   MERGE
+   RETURNING
+  
+
+  
+   The merge support functions shown in
+may be used in the
+   RETURNING list of a  command
+   to return additional information about the action taken for each row.
+  
+
+  
+Merge Support Functions
+
+
+ 
+  
+   
+Function
+   
+   
+Description
+   
+  
+ 
+
+ 
+  
+   
+
+ pg_merge_action
+
+pg_merge_action ( )
+text
+   
+   
+Returns the action performed on the current row ('INSERT',
+'UPDATE', or 'DELETE').
+   
+  
+
+  
+   
+
+ pg_merge_when_clause
+
+pg_merge_when_clause ( )
+integer
+   
+   
+Returns the 1-based index of the WHEN clause executed
+for the current row.
+   
+  
+ 
+
+  
+
+  
+   Example:
+
+  
+
+  
+   Note that it is an error to use these functions anywhere other than the
+   RETURNING list of a MERGE command.
+  
+
+ 
+
  
   Subquery Expressions
 
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
new file mode 100644
index 7c01a54..4317cfc
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1323,9 +1323,9 @@
  to a client upon the
  completion of an SQL command, usually a
  SELECT but it can be an
- INSERT, UPDATE, or
- DELETE command if the RETURNING
- clause is specified.
+ INSERT, UPDATE,
+ DELETE, or MERGE command if the
+ RETURNING clause is specified.
 
 
  The fact that a result set is a relation means that a query can be used
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
new file mode 100644
index 7c8a49f..b839695
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -1020,8 +1020,8 @@ INSERT INTO mytable VALUES (1,'one'), (2
 
 
 
- If the command does return rows (for example SELECT,
- or INSERT/UPDATE/DELETE
+ If the command does return rows (for example SELECT, or
+ INSERT/UPDATE/DELETE/MERGE
  with RETURNING), there are two ways to proceed.
  When the command will return at most one row, or you only care about
  the first row of output, write the command as usual but add
@@ -1149,6 +1149,7 @@ SELECT select_expressionsexpressions INTO STRICT target;
 UPDATE ... RETURNING expressions INTO STRICT target;
 DELETE ... RETURNING expressions INTO STRICT target;
+MERGE ... RETURNING expressions INTO STRICT target;
 
 
  where target can be a record variable, a row
@@ -1159,8 +1160,8 @@ DELETE ... RETURNING expres
  INTO clause) just as described above,
  and the plan is cached in the same way.
  This works for SELECT,
- INSERT/UPDATE/DELETE with
- RETURNING, and certain utility commands
+ INSERT/UPDATE/DELETE/MERGE
+ with RETURNING, and certain utility commands
  that return row sets, such as EXPLAIN.
  Except for the INTO clause, the SQL command is

Re: MERGE ... RETURNING

2023-02-26 Thread Dean Rasheed
On Fri, 24 Feb 2023 at 05:46, Dean Rasheed  wrote:
>
> Rebased version attached.
>

Another rebase.

Regards,
Dean
diff --git a/doc/src/sgml/dml.sgml b/doc/src/sgml/dml.sgml
new file mode 100644
index cbbc5e2..ff2a827
--- a/doc/src/sgml/dml.sgml
+++ b/doc/src/sgml/dml.sgml
@@ -283,10 +283,15 @@ DELETE FROM products;
RETURNING
   
 
+  
+   MERGE
+   RETURNING
+  
+
   
Sometimes it is useful to obtain data from modified rows while they are
being manipulated.  The INSERT, UPDATE,
-   and DELETE commands all have an
+   DELETE, and MERGE commands all have an
optional RETURNING clause that supports this.  Use
of RETURNING avoids performing an extra database query to
collect the data, and is especially valuable when it would otherwise be
@@ -339,6 +344,24 @@ DELETE FROM products
 
   
 
+  
+   In a MERGE, the data available to RETURNING is
+   the content of the source row plus the content of the inserted, updated, or
+   deleted target row.  In addition, the merge support functions
+and 
+   may be used to return information about which merge action was executed for
+   each row.  Since it is quite common for the source and target to have many
+   of the same columns, specifying RETURNING * can lead to a
+   lot of duplicated columns, so it is often more useful to just return the
+   target row.  For example:
+
+MERGE INTO products p USING new_products n ON p.product_no = n.product_no
+  WHEN NOT MATCHED THEN INSERT VALUES (n.product_no, n.name, n.price)
+  WHEN MATCHED THEN UPDATE SET name = n.name, price = n.price
+  RETURNING pg_merge_action(), p.*;
+
+  
+
   
If there are triggers () on the target table,
the data available to RETURNING is the row as modified by
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index 0cbdf63..224e9b8
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21216,6 +21216,99 @@ SELECT count(*) FROM sometable;
 
  
 
+ 
+  Merge Support Functions
+
+  
+   MERGE
+   RETURNING
+  
+
+  
+   The merge support functions shown in
+may be used in the
+   RETURNING list of a  command
+   to return additional information about the action taken for each row.
+  
+
+  
+Merge Support Functions
+
+
+ 
+  
+   
+Function
+   
+   
+Description
+   
+  
+ 
+
+ 
+  
+   
+
+ pg_merge_action
+
+pg_merge_action ( )
+text
+   
+   
+Returns the action performed on the current row ('INSERT',
+'UPDATE', or 'DELETE').
+   
+  
+
+  
+   
+
+ pg_merge_when_clause
+
+pg_merge_when_clause ( )
+integer
+   
+   
+Returns the 1-based index of the WHEN clause executed
+for the current row.
+   
+  
+ 
+
+  
+
+  
+   Example:
+
+  
+
+  
+   Note that it is an error to use these functions anywhere other than the
+   RETURNING list of a MERGE command.
+  
+
+ 
+
  
   Subquery Expressions
 
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
new file mode 100644
index 7c01a54..4317cfc
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1323,9 +1323,9 @@
  to a client upon the
  completion of an SQL command, usually a
  SELECT but it can be an
- INSERT, UPDATE, or
- DELETE command if the RETURNING
- clause is specified.
+ INSERT, UPDATE,
+ DELETE, or MERGE command if the
+ RETURNING clause is specified.
 
 
  The fact that a result set is a relation means that a query can be used
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
new file mode 100644
index 7c8a49f..b839695
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -1020,8 +1020,8 @@ INSERT INTO mytable VALUES (1,'one'), (2
 
 
 
- If the command does return rows (for example SELECT,
- or INSERT/UPDATE/DELETE
+ If the command does return rows (for example SELECT, or
+ INSERT/UPDATE/DELETE/MERGE
  with RETURNING), there are two ways to proceed.
  When the command will return at most one row, or you only care about
  the first row of output, write the command as usual but add
@@ -1149,6 +1149,7 @@ SELECT select_expressionsexpressions INTO STRICT target;
 UPDATE ... RETURNING expressions INTO STRICT target;
 DELETE ... RETURNING expressions INTO STRICT target;
+MERGE ... RETURNING expressions INTO STRICT target;
 
 
  where target can be a record variable, a row
@@ -1159,8 +1160,8 @@ DELETE ... RETURNING expres
  INTO clause) just as described above,
  and the plan is cached in the same way.
  This works for SELECT,
- INSERT/UPDATE/DELETE with
- RETURNING, and certain utility commands
+ INSERT/UPDATE/DELETE/MERGE
+ with RETURNING, and certain utility commands
  that return row sets, such as EXPLAIN.
  Except for the INTO clause, the SQL command is

Re: MERGE ... RETURNING

2023-02-23 Thread Dean Rasheed
On Tue, 7 Feb 2023 at 10:56, Dean Rasheed  wrote:
>
> Attached is a more complete patch
>

Rebased version attached.

Regards,
Dean
diff --git a/doc/src/sgml/dml.sgml b/doc/src/sgml/dml.sgml
new file mode 100644
index cbbc5e2..ff2a827
--- a/doc/src/sgml/dml.sgml
+++ b/doc/src/sgml/dml.sgml
@@ -283,10 +283,15 @@ DELETE FROM products;
RETURNING
   
 
+  
+   MERGE
+   RETURNING
+  
+
   
Sometimes it is useful to obtain data from modified rows while they are
being manipulated.  The INSERT, UPDATE,
-   and DELETE commands all have an
+   DELETE, and MERGE commands all have an
optional RETURNING clause that supports this.  Use
of RETURNING avoids performing an extra database query to
collect the data, and is especially valuable when it would otherwise be
@@ -339,6 +344,24 @@ DELETE FROM products
 
   
 
+  
+   In a MERGE, the data available to RETURNING is
+   the content of the source row plus the content of the inserted, updated, or
+   deleted target row.  In addition, the merge support functions
+and 
+   may be used to return information about which merge action was executed for
+   each row.  Since it is quite common for the source and target to have many
+   of the same columns, specifying RETURNING * can lead to a
+   lot of duplicated columns, so it is often more useful to just return the
+   target row.  For example:
+
+MERGE INTO products p USING new_products n ON p.product_no = n.product_no
+  WHEN NOT MATCHED THEN INSERT VALUES (n.product_no, n.name, n.price)
+  WHEN MATCHED THEN UPDATE SET name = n.name, price = n.price
+  RETURNING pg_merge_action(), p.*;
+
+  
+
   
If there are triggers () on the target table,
the data available to RETURNING is the row as modified by
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index 0cbdf63..224e9b8
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21216,6 +21216,99 @@ SELECT count(*) FROM sometable;
 
  
 
+ 
+  Merge Support Functions
+
+  
+   MERGE
+   RETURNING
+  
+
+  
+   The merge support functions shown in
+may be used in the
+   RETURNING list of a  command
+   to return additional information about the action taken for each row.
+  
+
+  
+Merge Support Functions
+
+
+ 
+  
+   
+Function
+   
+   
+Description
+   
+  
+ 
+
+ 
+  
+   
+
+ pg_merge_action
+
+pg_merge_action ( )
+text
+   
+   
+Returns the action performed on the current row ('INSERT',
+'UPDATE', or 'DELETE').
+   
+  
+
+  
+   
+
+ pg_merge_when_clause
+
+pg_merge_when_clause ( )
+integer
+   
+   
+Returns the 1-based index of the WHEN clause executed
+for the current row.
+   
+  
+ 
+
+  
+
+  
+   Example:
+
+  
+
+  
+   Note that it is an error to use these functions anywhere other than the
+   RETURNING list of a MERGE command.
+  
+
+ 
+
  
   Subquery Expressions
 
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
new file mode 100644
index 7c01a54..4317cfc
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1323,9 +1323,9 @@
  to a client upon the
  completion of an SQL command, usually a
  SELECT but it can be an
- INSERT, UPDATE, or
- DELETE command if the RETURNING
- clause is specified.
+ INSERT, UPDATE,
+ DELETE, or MERGE command if the
+ RETURNING clause is specified.
 
 
  The fact that a result set is a relation means that a query can be used
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
new file mode 100644
index 8897a54..3249e9c
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -1020,8 +1020,8 @@ INSERT INTO mytable VALUES (1,'one'), (2
 
 
 
- If the command does return rows (for example SELECT,
- or INSERT/UPDATE/DELETE
+ If the command does return rows (for example SELECT, or
+ INSERT/UPDATE/DELETE/MERGE
  with RETURNING), there are two ways to proceed.
  When the command will return at most one row, or you only care about
  the first row of output, write the command as usual but add
@@ -1044,7 +1044,8 @@ INSERT INTO mytable VALUES (1,'one'), (2
  PL/pgSQL variable values can be
  automatically inserted into optimizable SQL commands, which
  are SELECT, INSERT,
- UPDATE, DELETE, and certain
+ UPDATE, DELETE,
+ MERGE and certain
  utility commands that incorporate one of these, such
  as EXPLAIN and CREATE TABLE ... AS
  SELECT.  In these commands,
@@ -1148,6 +1149,7 @@ SELECT select_expressionsexpressions INTO STRICT target;
 UPDATE ... RETURNING expressions INTO STRICT target;
 DELETE ... RETURNING expressions INTO STRICT target;
+MERGE ... RETURNING expressions INTO STRICT target;
 
 
  where target can be a record variable, a row
@

Re: MERGE ... RETURNING

2023-02-07 Thread Dean Rasheed
On Mon, 23 Jan 2023 at 16:54, Dean Rasheed  wrote:
>
> On Sun, 22 Jan 2023 at 19:08, Alvaro Herrera  wrote:
> >
> > Regarding mas_action_idx, I would have thought that it belongs in
> > MergeAction rather than MergeActionState.  After all, you determine it
> > once at parse time, and it is a constant from there onwards, right?
>
> Oh, yes that makes sense (and removes the need for a couple of the
> executor changes). Thanks for looking.
>

Attached is a more complete patch, with that change and more doc and
test updates.

A couple of latest changes from this patch look like they represent
pre-existing issues with MERGE that should really be extracted from
this patch and applied to HEAD+v15. I'll take a closer look at that,
and start new threads for those.

Regards,
Dean
diff --git a/doc/src/sgml/dml.sgml b/doc/src/sgml/dml.sgml
new file mode 100644
index cbbc5e2..ff2a827
--- a/doc/src/sgml/dml.sgml
+++ b/doc/src/sgml/dml.sgml
@@ -283,10 +283,15 @@ DELETE FROM products;
RETURNING
   
 
+  
+   MERGE
+   RETURNING
+  
+
   
Sometimes it is useful to obtain data from modified rows while they are
being manipulated.  The INSERT, UPDATE,
-   and DELETE commands all have an
+   DELETE, and MERGE commands all have an
optional RETURNING clause that supports this.  Use
of RETURNING avoids performing an extra database query to
collect the data, and is especially valuable when it would otherwise be
@@ -339,6 +344,24 @@ DELETE FROM products
 
   
 
+  
+   In a MERGE, the data available to RETURNING is
+   the content of the source row plus the content of the inserted, updated, or
+   deleted target row.  In addition, the merge support functions
+and 
+   may be used to return information about which merge action was executed for
+   each row.  Since it is quite common for the source and target to have many
+   of the same columns, specifying RETURNING * can lead to a
+   lot of duplicated columns, so it is often more useful to just return the
+   target row.  For example:
+
+MERGE INTO products p USING new_products n ON p.product_no = n.product_no
+  WHEN NOT MATCHED THEN INSERT VALUES (n.product_no, n.name, n.price)
+  WHEN MATCHED THEN UPDATE SET name = n.name, price = n.price
+  RETURNING pg_merge_action(), p.*;
+
+  
+
   
If there are triggers () on the target table,
the data available to RETURNING is the row as modified by
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index e09e289..6927228
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21202,6 +21202,99 @@ SELECT count(*) FROM sometable;
 
  
 
+ 
+  Merge Support Functions
+
+  
+   MERGE
+   RETURNING
+  
+
+  
+   The merge support functions shown in
+may be used in the
+   RETURNING list of a  command
+   to return additional information about the action taken for each row.
+  
+
+  
+Merge Support Functions
+
+
+ 
+  
+   
+Function
+   
+   
+Description
+   
+  
+ 
+
+ 
+  
+   
+
+ pg_merge_action
+
+pg_merge_action ( )
+text
+   
+   
+Returns the action performed on the current row ('INSERT',
+'UPDATE', or 'DELETE').
+   
+  
+
+  
+   
+
+ pg_merge_when_clause
+
+pg_merge_when_clause ( )
+integer
+   
+   
+Returns the 1-based index of the WHEN clause executed
+for the current row.
+   
+  
+ 
+
+  
+
+  
+   Example:
+
+  
+
+  
+   Note that it is an error to use these functions anywhere other than the
+   RETURNING list of a MERGE command.
+  
+
+ 
+
  
   Subquery Expressions
 
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
new file mode 100644
index 7c01a54..4317cfc
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1323,9 +1323,9 @@
  to a client upon the
  completion of an SQL command, usually a
  SELECT but it can be an
- INSERT, UPDATE, or
- DELETE command if the RETURNING
- clause is specified.
+ INSERT, UPDATE,
+ DELETE, or MERGE command if the
+ RETURNING clause is specified.
 
 
  The fact that a result set is a relation means that a query can be used
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
new file mode 100644
index 8897a54..3249e9c
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -1020,8 +1020,8 @@ INSERT INTO mytable VALUES (1,'one'), (2
 
 
 
- If the command does return rows (for example SELECT,
- or INSERT/UPDATE/DELETE
+ If the command does return rows (for example SELECT, or
+ INSERT/UPDATE/DELETE/MERGE
  with RETURNING), there are two ways to proceed.
  When the command will return at most one row, or you only care about
  the first row of output, write the command as usual but add
@@ -1044,7 +1044,8 @

Re: MERGE ... RETURNING

2023-01-23 Thread Dean Rasheed
On Sun, 22 Jan 2023 at 19:08, Alvaro Herrera  wrote:
>
> > diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
> > new file mode 100644
> > index e34f583..aa3cca0
> > --- a/src/backend/commands/copy.c
> > +++ b/src/backend/commands/copy.c
> > @@ -274,12 +274,6 @@ DoCopy(ParseState *pstate, const CopyStm
> >   {
> >   Assert(stmt->query);
> >
> > - /* MERGE is allowed by parser, but unimplemented. Reject for 
> > now */
> > - if (IsA(stmt->query, MergeStmt))
> > - ereport(ERROR,
> > - 
> > errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> > - errmsg("MERGE not supported in 
> > COPY"));
>
> Does this COPY stuff come from another branch where you're adding
> support for MERGE in COPY?  I see that you add a test that MERGE without
> RETURNING fails, but you didn't add any tests that it works with
> RETURNING.  Anyway, I suspect these small changes shouldn't be here.
>

A few of the code changes I made were entirely untested (this was
after all just a proof-of-concept, intended to gather opinions and get
consensus about the overall shape of the feature). They serve as
useful reminders of things to test. In fact, since then, I've been
doing more testing, and so far everything I have tried has just
worked, including COPY (MERGE ... RETURNING ...) TO ... Thinking about
it, I can't see any reason why it wouldn't.

Still, there's a lot more testing to do. Just going through the docs
looking for references to RETURNING gave me a lot more ideas of things
to test.

> Overall, the idea of using Postgres-specific functions for extracting
> context in the RETURNING clause looks acceptable to me.

Cool.

> We can change
> that to add support to whatever clauses the SQL committee offers, when
> they get around to offering something.  (We do have to keep our fingers
> crossed that they will decide to use the same RETURNING syntax as we do
> in this patch, of course.)
>

Yes indeed. At least, done this way, the only non-SQL-standard syntax
is the RETURNING keyword itself, which we've already settled on for
INSERT/UPDATE/DELETE. Let's just hope they don't decide to use
RETURNING in an incompatible way in the future.

> Regarding mas_action_idx, I would have thought that it belongs in
> MergeAction rather than MergeActionState.  After all, you determine it
> once at parse time, and it is a constant from there onwards, right?
>

Oh, yes that makes sense (and removes the need for a couple of the
executor changes). Thanks for looking.

Regards,
Dean




Re: MERGE ... RETURNING

2023-01-22 Thread Alvaro Herrera
> diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
> new file mode 100644
> index e34f583..aa3cca0
> --- a/src/backend/commands/copy.c
> +++ b/src/backend/commands/copy.c
> @@ -274,12 +274,6 @@ DoCopy(ParseState *pstate, const CopyStm
>   {
>   Assert(stmt->query);
>  
> - /* MERGE is allowed by parser, but unimplemented. Reject for 
> now */
> - if (IsA(stmt->query, MergeStmt))
> - ereport(ERROR,
> - errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> - errmsg("MERGE not supported in COPY"));

Does this COPY stuff come from another branch where you're adding
support for MERGE in COPY?  I see that you add a test that MERGE without
RETURNING fails, but you didn't add any tests that it works with
RETURNING.  Anyway, I suspect these small changes shouldn't be here.


Overall, the idea of using Postgres-specific functions for extracting
context in the RETURNING clause looks acceptable to me.  We can change
that to add support to whatever clauses the SQL committee offers, when
they get around to offering something.  (We do have to keep our fingers
crossed that they will decide to use the same RETURNING syntax as we do
in this patch, of course.)

Regarding mas_action_idx, I would have thought that it belongs in
MergeAction rather than MergeActionState.  After all, you determine it
once at parse time, and it is a constant from there onwards, right?

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: MERGE ... RETURNING

2023-01-22 Thread Dean Rasheed
On Mon, 9 Jan 2023 at 17:44, Dean Rasheed  wrote:
>
> On Mon, 9 Jan 2023 at 16:23, Vik Fearing  wrote:
> >
> > Bikeshedding here.  Instead of Yet Another WITH Clause, could we perhaps
> > make a MERGING() function analogous to the GROUPING() function that goes
> > with grouping sets?
> >
> > MERGE ...
> > RETURNING *, MERGING('clause'), MERGING('action');
> >
>
> Hmm, possibly, but I think that would complicate the implementation quite a 
> bit.
>
> GROUPING() is not really a function (in the sense that there is no
> pg_proc entry for it, you can't do "\df grouping", and it isn't
> executed with its arguments like a normal function). Rather, it
> requires special-case handling in the parser, through to the executor,
> and I think MERGING() would be similar.
>
> Also, it masks any user function with the same name, and would
> probably require MERGING to be some level of reserved keyword.
>

I thought about this some more, and I think functions do make more
sense here, rather than inventing a special WITH syntax. However,
rather than using a special MERGING() function like GROUPING(), which
isn't really a function at all, I think it's better (and much simpler
to implement) to have a pair of normal functions (one returning int,
and one text).

The example from the tests shows the sort of thing this allows:

MERGE INTO sq_target t USING sq_source s ON tid = sid
  WHEN MATCHED AND tid >= 2 THEN UPDATE SET balance = t.balance + delta
  WHEN NOT MATCHED THEN INSERT (balance, tid) VALUES (balance + delta, sid)
  WHEN MATCHED AND tid < 2 THEN DELETE
  RETURNING pg_merge_when_clause() AS when_clause,
pg_merge_action() AS merge_action,
t.*,
CASE pg_merge_action()
  WHEN 'INSERT' THEN 'Inserted '||t
  WHEN 'UPDATE' THEN 'Added '||delta||' to balance'
  WHEN 'DELETE' THEN 'Removed '||t
END AS description;

 when_clause | merge_action | tid | balance | description
-+--+-+-+-
   3 | DELETE   |   1 | 100 | Removed (1,100)
   1 | UPDATE   |   2 | 220 | Added 20 to balance
   2 | INSERT   |   4 |  40 | Inserted (4,40)
(3 rows)

I think this is easier to use than the WITH syntax, and more flexible,
since the new functions can be used anywhere in the RETURNING list,
including in expressions.

There is one limitation though. Due to the way these functions need
access to the originating query, they need to appear directly in
MERGE's RETURNING list, not in subqueries, plpgsql function bodies, or
anything else that amounts to a different query. Maybe there's a way
round that, but it looks tricky. In practice though, it's easy to work
around, if necessary (e.g., by wrapping the MERGE in a CTE).

Regards,
Dean
diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml
new file mode 100644
index 0995fe0..bedb2c8
--- a/doc/src/sgml/ref/merge.sgml
+++ b/doc/src/sgml/ref/merge.sgml
@@ -25,6 +25,7 @@ PostgreSQL documentation
 MERGE INTO [ ONLY ] target_table_name [ * ] [ [ AS ] target_alias ]
 USING data_source ON join_condition
 when_clause [...]
+[ RETURNING * | output_expression [ [ AS ] output_name ] [, ...] ]
 
 where data_source is:
 
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
new file mode 100644
index e34f583..aa3cca0
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -274,12 +274,6 @@ DoCopy(ParseState *pstate, const CopyStm
 	{
 		Assert(stmt->query);
 
-		/* MERGE is allowed by parser, but unimplemented. Reject for now */
-		if (IsA(stmt->query, MergeStmt))
-			ereport(ERROR,
-	errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-	errmsg("MERGE not supported in COPY"));
-
 		query = makeNode(RawStmt);
 		query->stmt = stmt->query;
 		query->stmt_location = stmt_location;
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
new file mode 100644
index 8043b4e..e02d7d0
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -510,7 +510,8 @@ BeginCopyTo(ParseState *pstate,
 		{
 			Assert(query->commandType == CMD_INSERT ||
    query->commandType == CMD_UPDATE ||
-   query->commandType == CMD_DELETE);
+   query->commandType == CMD_DELETE ||
+   query->commandType == CMD_MERGE);
 
 			ereport(ERROR,
 	(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
new file mode 100644
index 812ead9..8572b01
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -48,6 +48,7 @@
 #include "utils/array.h"
 #include "utils/builtins.h"
 #include "utils/datum.h"
+#include "utils/fmgroids.h"
 #include "utils/lsyscache.h"
 #include "utils/typcache.h"

Re: MERGE ... RETURNING

2023-01-09 Thread Dean Rasheed
On Mon, 9 Jan 2023 at 16:23, Vik Fearing  wrote:
>
> Bikeshedding here.  Instead of Yet Another WITH Clause, could we perhaps
> make a MERGING() function analogous to the GROUPING() function that goes
> with grouping sets?
>
> MERGE ...
> RETURNING *, MERGING('clause'), MERGING('action');
>

Hmm, possibly, but I think that would complicate the implementation quite a bit.

GROUPING() is not really a function (in the sense that there is no
pg_proc entry for it, you can't do "\df grouping", and it isn't
executed with its arguments like a normal function). Rather, it
requires special-case handling in the parser, through to the executor,
and I think MERGING() would be similar.

Also, it masks any user function with the same name, and would
probably require MERGING to be some level of reserved keyword.

I'm not sure that's worth it, just to have a more standard-looking
RETURNING list, without a WITH clause.

Regards,
Dean




Re: MERGE ... RETURNING

2023-01-09 Thread Vik Fearing

On 1/9/23 13:29, Dean Rasheed wrote:

On Sun, 8 Jan 2023 at 20:09, Isaac Morland  wrote:


Would it be useful to have just the action? Perhaps "WITH ACTION"? My idea is that this 
would return an enum of INSERT, UPDATE, DELETE (so is "action" the right word?). It seems 
to me in many situations I would be more likely to care about which of these 3 happened rather than 
the exact clause that applied. This isn't necessarily meant to be instead of your suggestion 
because I can imagine wanting to know the exact clause, just an alternative that might suffice in 
many situations. Using it would also avoid problems arising from editing the query in a way which 
changes the numbers of the clauses.



Hmm, perhaps that's something that can be added as well. Both use
cases seem useful.


Bikeshedding here.  Instead of Yet Another WITH Clause, could we perhaps 
make a MERGING() function analogous to the GROUPING() function that goes 
with grouping sets?


MERGE ...
RETURNING *, MERGING('clause'), MERGING('action');

Or something.
--
Vik Fearing





Re: MERGE ... RETURNING

2023-01-09 Thread Dean Rasheed
On Sun, 8 Jan 2023 at 20:09, Isaac Morland  wrote:
>
> Would it be useful to have just the action? Perhaps "WITH ACTION"? My idea is 
> that this would return an enum of INSERT, UPDATE, DELETE (so is "action" the 
> right word?). It seems to me in many situations I would be more likely to 
> care about which of these 3 happened rather than the exact clause that 
> applied. This isn't necessarily meant to be instead of your suggestion 
> because I can imagine wanting to know the exact clause, just an alternative 
> that might suffice in many situations. Using it would also avoid problems 
> arising from editing the query in a way which changes the numbers of the 
> clauses.
>

Hmm, perhaps that's something that can be added as well. Both use
cases seem useful.

>> 1 row is returned for each merge action executed (other than DO
>> NOTHING actions), and as usual, the values represent old target values
>> for DELETE actions, and new target values for INSERT/UPDATE actions.
>
> Would it be feasible to allow specifying old.column or new.column? These 
> would always be NULL for INSERT and DELETE respectively but more useful with 
> UPDATE. Actually I've been meaning to ask this question about UPDATE … 
> RETURNING.
>

I too have wished for the ability to do that with UPDATE ...
RETURNING, though I'm not sure how feasible it is.

I think it's something best considered separately though. I haven't
given any thought as to how to make it work, so there might be
technical difficulties. But if it could be made to work for UPDATE, it
shouldn't be much more effort to make it work for MERGE.

>> It's also possible to return the source values, and a bare "*" in the
>> returning list expands to all the source columns, followed by all the
>> target columns.
>
> Does this lead to a problem in the event there are same-named columns between 
> source and target?
>

Not really. It's exactly the same as doing "SELECT * FROM src JOIN tgt
ON ...". That may lead to duplicate column names in the result, but
that's not necessarily a problem.

Regards,
Dean




Re: MERGE ... RETURNING

2023-01-08 Thread Isaac Morland
On Sun, 8 Jan 2023 at 07:28, Dean Rasheed  wrote:

So playing around with it (and inspired by the WITH ORDINALITY syntax
> for SRFs), I had the idea of allowing "WITH WHEN CLAUSE" at the end of
> the returning list, which adds an integer column to the list, whose
> value is set to the index of the when clause executed, as in the
> attached very rough patch.
>

Would it be useful to have just the action? Perhaps "WITH ACTION"? My idea
is that this would return an enum of INSERT, UPDATE, DELETE (so is "action"
the right word?). It seems to me in many situations I would be more likely
to care about which of these 3 happened rather than the exact clause that
applied. This isn't necessarily meant to be instead of your suggestion
because I can imagine wanting to know the exact clause, just an alternative
that might suffice in many situations. Using it would also avoid problems
arising from editing the query in a way which changes the numbers of the
clauses.

So, quoting an example from the tests, this allows things like:
>
> WITH t AS (
>   MERGE INTO sq_target t USING v ON tid = sid
> WHEN MATCHED AND tid > 2 THEN UPDATE SET balance = t.balance + delta
> WHEN NOT MATCHED THEN INSERT (balance, tid) VALUES (balance + delta,
> sid)
> WHEN MATCHED AND tid < 2 THEN DELETE
> RETURNING t.* WITH WHEN CLAUSE
> )
> SELECT CASE when_clause
>  WHEN 1 THEN 'UPDATE'
>  WHEN 2 THEN 'INSERT'
>  WHEN 3 THEN 'DELETE'
>END, *
> FROM t;
>
>   case  | tid | balance | when_clause
> +-+-+-
>  INSERT |  -1 | -11 |   2
>  DELETE |   1 | 100 |   3
> (2 rows)
>
> 1 row is returned for each merge action executed (other than DO
> NOTHING actions), and as usual, the values represent old target values
> for DELETE actions, and new target values for INSERT/UPDATE actions.
>

Would it be feasible to allow specifying old.column or new.column? These
would always be NULL for INSERT and DELETE respectively but more useful
with UPDATE. Actually I've been meaning to ask this question about UPDATE …
RETURNING.

Actually, even with DELETE/INSERT, I can imagine wanting, for example, to
get only the new values associated with INSERT or UPDATE and not the values
removed by a DELETE. So I can imagine specifying new.column to get NULLs
for any row that was deleted but still get the new values for other rows.

It's also possible to return the source values, and a bare "*" in the
> returning list expands to all the source columns, followed by all the
> target columns.
>

Does this lead to a problem in the event there are same-named columns
between source and target?

The name of the added column, if included, can be changed by
> specifying "WITH WHEN CLAUSE [AS] col_alias". I chose the syntax "WHEN
> CLAUSE" and "when_clause" as the default column name because those
> match the existing terminology used in the docs.
>


MERGE ... RETURNING

2023-01-08 Thread Dean Rasheed
ve a good query shape, so now look at the WHEN conditions and
 	 * action targetlists.
@@ -390,9 +418,6 @@ transformMergeStmt(ParseState *pstate, M
 
 	qry->mergeActionList = mergeActionList;
 
-	/* RETURNING could potentially be added in the future, but not in SQL std */
-	qry->returningList = NULL;
-
 	qry->hasTargetSRFs = false;
 	qry->hasSubLinks = pstate->p_hasSubLinks;
 
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
new file mode 100644
index 5389a0e..162a305
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -2313,9 +2313,10 @@ addRangeTableEntryForCTE(ParseState *pst
 		cte->cterefcount++;
 
 	/*
-	 * We throw error if the CTE is INSERT/UPDATE/DELETE without RETURNING.
-	 * This won't get checked in case of a self-reference, but that's OK
-	 * because data-modifying CTEs aren't allowed to be recursive anyhow.
+	 * We throw error if the CTE is INSERT/UPDATE/DELETE/MERGE without
+	 * RETURNING.  This won't get checked in case of a self-reference, but
+	 * that's OK because data-modifying CTEs aren't allowed to be recursive
+	 * anyhow.
 	 */
 	if (IsA(cte->ctequery, Query))
 	{
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
new file mode 100644
index 1960dad..ac5a0c7
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -3640,7 +3640,8 @@ RewriteQuery(Query *parsetree, List *rew
 			if (!(ctequery->commandType == CMD_SELECT ||
   ctequery->commandType == CMD_UPDATE ||
   ctequery->commandType == CMD_INSERT ||
-  ctequery->commandType == CMD_DELETE))
+  ctequery->commandType == CMD_DELETE ||
+  ctequery->commandType == CMD_MERGE))
 			{
 /*
  * Currently it could only be NOTIFY; this error message will
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
new file mode 100644
index c7d9d96..d393f51
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -2131,11 +2131,10 @@ QueryReturnsTuples(Query *parsetree)
 		case CMD_SELECT:
 			/* returns tuples */
 			return true;
-		case CMD_MERGE:
-			return false;
 		case CMD_INSERT:
 		case CMD_UPDATE:
 		case CMD_DELETE:
+		case CMD_MERGE:
 			/* the forms with RETURNING return tuples */
 			if (parsetree->returningList)
 return true;
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
new file mode 100644
index 38f9b10..024e75a
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -962,13 +962,17 @@ PrintQueryResult(PGresult *result, bool
 			else
 success = true;
 
-			/* if it's INSERT/UPDATE/DELETE RETURNING, also print status */
+			/*
+			 * If it's INSERT/UPDATE/DELETE/MERGE RETURNING, also print
+			 * status.
+			 */
 			if (last || pset.show_all_results)
 			{
 cmdstatus = PQcmdStatus(result);
 if (strncmp(cmdstatus, "INSERT", 6) == 0 ||
 	strncmp(cmdstatus, "UPDATE", 6) == 0 ||
-	strncmp(cmdstatus, "DELETE", 6) == 0)
+	strncmp(cmdstatus, "DELETE", 6) == 0 ||
+	strncmp(cmdstatus, "MERGE", 5) == 0)
 	PrintQueryStatus(result, printStatusFout);
 			}
 
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
new file mode 100644
index 20f4c8b..3a25c93
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -415,6 +415,7 @@ typedef struct MergeActionState
 	NodeTag		type;
 
 	MergeAction *mas_action;	/* associated MergeAction node */
+	int			mas_action_idx;	/* 1-based index of MergeAction node */
 	ProjectionInfo *mas_proj;	/* projection of the action's targetlist for
  * this rel */
 	ExprState  *mas_whenqual;	/* WHEN [NOT] MATCHED AND conditions */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
new file mode 100644
index cfeca96..c8fc76c
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -160,6 +160,7 @@ typedef struct Query
 
 	List	   *mergeActionList;	/* list of actions for MERGE (only) */
 	bool		mergeUseOuterJoin;	/* whether to use outer join */
+	bool		hasReturningWhenClause;	/* has RETURNING WITH WHEN CLAUSE */
 
 	List	   *targetList;		/* target list (of TargetEntry) */
 
@@ -1713,6 +1714,9 @@ typedef struct MergeStmt
 	Node	   *sourceRelation; /* source relation */
 	Node	   *joinCondition;	/* join condition between source and target */
 	List	   *mergeWhenClauses;	/* list of MergeWhenClause(es) */
+	List	   *returningList;	/* list of expressions to return */
+	char	   *returningWhenClause;	/* RETURNING ... WITH WHEN CLAUSE
+		 * column alias */
 	WithClause *withClause;		/* WITH clause */
 } MergeStmt;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
new file mode 100644
index c1234fc..c8c8b97
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -254,6 +254,7 @@ typedef struct ModifyTable
 	List	   *exclRelTlist;	/* tlist of t