[HACKERS] PostgreSQL 10 parenthesized single-column updates can produce errors

2017-11-01 Thread Rob McColl
Between 9.6.5 and 10, the handling of parenthesized single-column UPDATE
statements changed. In 9.6.5, they were treated identically to
unparenthesized single-column UPDATES. In 10, they are treated as
multiple-column updates.  This results in this being valid in Postgres
9.6.5, but an error in Postgres 10:

CREATE TABLE test (id INT PRIMARY KEY, data INT);
INSERT INTO test VALUES (1, 1);
UPDATE test SET (data) = (2) WHERE id = 1;

In 10 and the current master, this produces the error:

errmsg("source for a multiple-column UPDATE item must be a sub-SELECT or
ROW() expression")

I believe that this is not an intended change or behavior, but is instead
an unintentional side effect of 906bfcad7ba7cb3863fe0e2a7810be8e3cd84fbd
Improve handling of "UPDATE ... SET (column_list) = row_constructor". (
https://github.com/postgres/postgres/commit/906bfcad7ba7cb3863fe0e2a7810be
8e3cd84fbd).

This is a small patch to the grammar that restores the previous behavior by
adding a rule to the set_clause rule and modifying the final rule of the
set_clause rule to only match lists of more then one element.  I'm not sure
if there are more elegant or preferred ways to address this.

Compiled and tested on Ubuntu 17.04 Linux 4.10.0-33-generic x86_64.

Regression test added under the update test to cover the parenthesized
single-column case.

I see no reason this would affect performance.

Thanks,
-rob

-- 
Rob McColl
@robmccoll
r...@robmccoll.com
205.422.0909 <(205)%20422-0909>


Re: [HACKERS] PostgreSQL 10 parenthesized single-column updates can produce errors

2017-10-31 Thread Tom Lane
"David G. Johnston"  writes:
> ​Definitely moderates my opinion in my concurrent emai​l...though
> postponement is not strictly bad given the seeming frequency of the
> existing problematic syntax in the wild already.

Yeah, I'd hoped to get some capability extensions done here before v10
shipped, in line with the theory I've expressed in the past that it's
better if you can point to actual new features justifying a compatibility
break.  However, that didn't happen in time.

I'm disinclined to revert the change though; if there are people making
use of this oddity now, then the longer we leave it in place the more
people are going to be hurt when we do break it.

If I had a time machine, I'd go back and fix the original multi-column
SET patch so that it required the word ROW in all cases --- then at
least it'd have accepted a conformant subset of the standard syntax.
Oh well.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 10 parenthesized single-column updates can produce errors

2017-10-31 Thread David G. Johnston
On Tue, Oct 31, 2017 at 3:43 PM, Tom Lane  wrote:

> According to the spec, the elements of a parenthesized
> SET list should be assigned from the fields of a composite RHS.  If
> there's just one element of the SET list, the RHS should be a single-field
> composite value, and this syntax should result in extracting its field.
> This patch doesn't do that; it preserves our previous broken behavior,
> and thus just puts off the day of reckoning that inevitably will come.
>

​Definitely moderates my opinion in my concurrent emai​l...though
postponement is not strictly bad given the seeming frequency of the
existing problematic syntax in the wild already.

David J.


Re: [HACKERS] PostgreSQL 10 parenthesized single-column updates can produce errors

2017-10-31 Thread David G. Johnston
On Tue, Oct 31, 2017 at 3:14 PM, Rob McColl  wrote:

>
>> I believe that this is not an intended change or behavior, but is instead
>> an unintentional side effect of 906bfcad7ba7cb3863fe0e2a7810be8e3cd84fbd
>> Improve handling of "UPDATE ... SET (column_list) = row_constructor". (
>> https://github.com/postgres/postgres/commit/906bfcad7ba7cb3
>> 863fe0e2a7810be8e3cd84fbd).
>>
>
​At this point the intent of 906bfcad doesn't really matter - and given the
number of complaints in the month since ​v10 went live I'm tending to lean
toward bringing the pre-10 behavior back. There is no ambiguity involved
here and the breakage of existing applications seems considerably worse
than the technical oddity of allowing (val) to be interpreted as a
row_constructor in this situation.  From a standards perspective we are
strictly more permissive so no new problem there.

On a related note, the 10.0 syntax guide is wrong, it needs to break out
the parenthesized single-column and multi-column variants separately:

Presently: ( column_name [, ...] ) = [ ROW ] ( { expression | DEFAULT } [,
...] )

Basically one cannot specify only a single column_name AND omit ROW

It would have been nice if the syntax for that variant would have been:

( column_name, column_name [, ...] ) = [ ROW ] ( { expression | DEFAULT },
{ expression | DEFAULT } [, ...] )
​
If the v10 behavior remains the above change should be made as well as
adding:

( column_name ) = ROW ( expression | DEFAULT )

If we revert 10 to the pre-10 behavior the existing syntax will work.

David J.


Re: [HACKERS] PostgreSQL 10 parenthesized single-column updates can produce errors

2017-10-31 Thread Tom Lane
Rob McColl  writes:
> Attaching patch... :-/

The reason why hacking your way to a backwards-compatible solution is
a bad idea is that it breaks the SQL standard compliance we're trying to
achieve here.  According to the spec, the elements of a parenthesized
SET list should be assigned from the fields of a composite RHS.  If
there's just one element of the SET list, the RHS should be a single-field
composite value, and this syntax should result in extracting its field.
This patch doesn't do that; it preserves our previous broken behavior,
and thus just puts off the day of reckoning that inevitably will come.

As a concrete example, the spec says this should work:

create table t (f1 int);
update t set (f1) = row(4);

and it does in v10, but (without having tested) your patch will break it.
This is not so exciting for simple row constructors, where you could just
leave off the word ROW; but it is a critical distinction if we're ever to
get to the point of allowing other composite-returning constructs here,
for example composite-returning functions.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 10 parenthesized single-column updates can produce errors

2017-10-31 Thread Rob McColl
Attaching patch... :-/

On Tue, Oct 31, 2017 at 4:27 PM, Rob McColl  wrote:

> Between 9.6.5 and 10, the handling of parenthesized single-column UPDATE
> statements changed. In 9.6.5, they were treated identically to
> unparenthesized single-column UPDATES. In 10, they are treated as
> multiple-column updates.  This results in this being valid in Postgres
> 9.6.5, but an error in Postgres 10:
>
> CREATE TABLE test (id INT PRIMARY KEY, data INT);
> INSERT INTO test VALUES (1, 1);
> UPDATE test SET (data) = (2) WHERE id = 1;
>
> In 10 and the current master, this produces the error:
>
> errmsg("source for a multiple-column UPDATE item must be a sub-SELECT or
> ROW() expression")
>
> I believe that this is not an intended change or behavior, but is instead
> an unintentional side effect of 906bfcad7ba7cb3863fe0e2a7810be8e3cd84fbd
> Improve handling of "UPDATE ... SET (column_list) = row_constructor". (
> https://github.com/postgres/postgres/commit/906bfcad7ba7cb3
> 863fe0e2a7810be8e3cd84fbd).
>
> This is a small patch to the grammar that restores the previous behavior
> by adding a rule to the set_clause rule and modifying the final rule of the
> set_clause rule to only match lists of more then one element.  I'm not sure
> if there are more elegant or preferred ways to address this.
>
> Compiled and tested on Ubuntu 17.04 Linux 4.10.0-33-generic x86_64.
>
> Regression test added under the update test to cover the parenthesized
> single-column case.
>
> I see no reason this would affect performance.
>
> Thanks,
> -rob
>
> --
> Rob McColl
> @robmccoll
> r...@robmccoll.com
> 205.422.0909 <(205)%20422-0909>
>
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
new file mode 100644
index 4c83a63..b5f4ccf
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
*** set_clause:
*** 10694,10712 
  	$1->val = (Node *) $3;
  	$$ = list_make1($1);
  }
! 			| '(' set_target_list ')' '=' a_expr
  {
! 	int ncolumns = list_length($2);
  	int i = 1;
  	ListCell *col_cell;
  
  	/* Create a MultiAssignRef source for each target */
  	foreach(col_cell, $2)
  	{
  		ResTarget *res_col = (ResTarget *) lfirst(col_cell);
  		MultiAssignRef *r = makeNode(MultiAssignRef);
  
! 		r->source = (Node *) $5;
  		r->colno = i;
  		r->ncolumns = ncolumns;
  		res_col->val = (Node *) r;
--- 10694,10720 
  	$1->val = (Node *) $3;
  	$$ = list_make1($1);
  }
! 			| '(' set_target ')' '=' a_expr
  {
! 	$2->val = (Node *) $5;
! 	$$ = list_make1($2);
! }
! 			| '(' set_target_list ',' set_target ')' '=' a_expr
! {
! 	int ncolumns;
  	int i = 1;
  	ListCell *col_cell;
  
+ 	$2 = lappend($2,$4);
+ 	ncolumns = list_length($2);
+ 
  	/* Create a MultiAssignRef source for each target */
  	foreach(col_cell, $2)
  	{
  		ResTarget *res_col = (ResTarget *) lfirst(col_cell);
  		MultiAssignRef *r = makeNode(MultiAssignRef);
  
! 		r->source = (Node *) $7;
  		r->colno = i;
  		r->ncolumns = ncolumns;
  		res_col->val = (Node *) r;
diff --git a/src/test/regress/expected/update.out b/src/test/regress/expected/update.out
new file mode 100644
index cef70b1..90d33ad
*** a/src/test/regress/expected/update.out
--- b/src/test/regress/expected/update.out
*** SELECT * FROM update_test;
*** 76,82 
   100 | 21 | 
  (4 rows)
  
! UPDATE update_test SET (c,b,a) = ('bugle', b+11, DEFAULT) WHERE c = 'foo';
  SELECT * FROM update_test;
a  | b  |   c   
  -++---
--- 76,93 
   100 | 21 | 
  (4 rows)
  
! -- parenthesized single column should be valid
! UPDATE update_test SET (c) = ('bungle') WHERE c = 'foo';
! SELECT * FROM update_test;
!   a  | b  |   c
! -++
!  100 | 20 | 
!  100 | 21 | 
!  100 | 20 | bungle
!  100 | 21 | bungle
! (4 rows)
! 
! UPDATE update_test SET (c,b,a) = ('bugle', b+11, DEFAULT) WHERE c = 'bungle';
  SELECT * FROM update_test;
a  | b  |   c   
  -++---
diff --git a/src/test/regress/sql/update.sql b/src/test/regress/sql/update.sql
new file mode 100644
index 66d1fec..0ed5f6a
*** a/src/test/regress/sql/update.sql
--- b/src/test/regress/sql/update.sql
*** UPDATE update_test SET a = v.* FROM (VAL
*** 51,57 
  INSERT INTO update_test SELECT a,b+1,c FROM update_test;
  SELECT * FROM update_test;
  
! UPDATE update_test SET (c,b,a) = ('bugle', b+11, DEFAULT) WHERE c = 'foo';
  SELECT * FROM update_test;
  UPDATE update_test SET (c,b) = ('car', a+b), a = a + 1 WHERE a = 10;
  SELECT * FROM update_test;
--- 51,60 
  INSERT INTO update_test SELECT a,b+1,c FROM update_test;
  SELECT * FROM update_test;
  
! -- parenthesized single column should be valid
! UPDATE update_test SET (c) = ('bungle') WHERE c = 'foo';
! SELECT * FROM update_test;
! UPDATE update_test SET (c,b,a) = ('bugle', b+11, DEFAULT) WHERE c = 'bungle';
  SELECT * FROM update_test;
  UPDATE update_test SET (c,b) =