Amit-san,

Thanks for the comments!

(2019/04/18 14:08), Amit Langote wrote:
On 2019/04/18 14:06, Amit Langote wrote:
On 2019/04/17 21:49, Etsuro Fujita wrote:

I think the reason for that is: the row routed to remp is incorrectly
fetched as a to-be-updated row when updating remp as a subplan targetrel.

Yeah.  In the fully-local case, that is, where both the source and the
target partition of a row movement operation are local tables, heap AM
ensures that tuples that's moved into a given relation in the same command
(by way of row movement) are not returned as to-be-updated, because it
deems such tuples invisible.  The "same command" part being crucial for
that to work.

In the case where the target of a row movement operation is a foreign
table partition, the INSERT used as part of row movement and subsequent
UPDATE of the same foreign table are distinct commands for the remote
server.  So, the rows inserted by the 1st command (as part of the row
movement) are deemed visible by the 2nd command (UPDATE) even if both are
operating within the same transaction.

Yeah, I think so too.

I guess there's no easy way for postgres_fdw to make the remote server
consider them as a single command.  IOW, no way to make the remote server
not touch those "moved-in" rows during the UPDATE part of the local query.

I agree.

The right way to fix this would be to have some way in postgres_fdw in
which we don't fetch such rows when updating such subplan targetrels.  I
tried to figure out a (simple) way to do that, but I couldn't.

Yeah, that seems a bit hard to ensure with our current infrastructure.

Yeah, I think we should leave that for future work.

So what I'm thinking is to throw an error for cases like this.  (Though, I
think we should keep to allow rows to be moved from local partitions to a
foreign-table subplan targetrel that has been updated already.)

Hmm, how would you distinguish (totally inside postgred_fdw I presume) the
two cases?

One thing I came up with to do that is this:

@@ -1917,6 +1937,23 @@ postgresBeginForeignInsert(ModifyTableState *mtstate,

        initStringInfo(&sql);

+       /*
+ * If the foreign table is an UPDATE subplan resultrel that hasn't yet
+        * been updated, routing tuples to the table might yield incorrect
+ * results, because if routing tuples, routed tuples will be mistakenly + * read from the table and updated twice when updating the table --- it + * would be nice if we could handle this case; but for now, throw an error
+        * for safety.
+        */
+       if (plan && plan->operation == CMD_UPDATE &&
+               (resultRelInfo->ri_usesFdwDirectModify ||
+                resultRelInfo->ri_FdwState) &&
+ resultRelInfo > mtstate->resultRelInfo + mtstate->mt_whichplan)
+               ereport(ERROR,
+                               (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot route tuples into foreign table to be updated \"%s\"", + RelationGetRelationName(rel))));

Forgot to say that since this is a separate issue from the original bug
report, maybe we can first finish fixing the latter.  What to do you think?

Yeah, I think we can do that, but my favorite would be to fix the two issues in a single shot, because they seem to me rather close to each other. So I updated a new version in a single patch, which I'm attaching.

Notes:

* I kept all the changes in the previous patch, because otherwise postgres_fdw would fail to release resources for a foreign-insert operation created by postgresBeginForeignInsert() for a tuple-routable foreign table (ie, a foreign-table subplan resultrel that has been updated already) during postgresEndForeignInsert().

* I revised a comment according to your previous comment, though I changed a state name: s/sub_fmstate/aux_fmstate/

* DirectModify also has the latter issue.  Here is an example:

postgres=# create table p (a int, b text) partition by list (a);
postgres=# create table p1 partition of p for values in (1);
postgres=# create table p2base (a int check (a = 2 or a = 3), b text);
postgres=# create foreign table p2 partition of p for values in (2, 3) server loopback options (table_name 'p2base');

postgres=# insert into p values (1, 'foo');
INSERT 0 1
postgres=# explain verbose update p set a = a + 1;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Update on public.p  (cost=0.00..176.21 rows=2511 width=42)
   Update on public.p1
   Foreign Update on public.p2
   ->  Seq Scan on public.p1  (cost=0.00..25.88 rows=1270 width=42)
         Output: (p1.a + 1), p1.b, p1.ctid
-> Foreign Update on public.p2 (cost=100.00..150.33 rows=1241 width=42)
         Remote SQL: UPDATE public.p2base SET a = (a + 1)
(7 rows)

postgres=# update p set a = a + 1;
UPDATE 2
postgres=# select * from p;
 a |  b
---+-----
 3 | foo
(1 row)

As shown in the above bit added to postgresBeginForeignInsert(), I modified the patch so that we throw an error for cases like this even when using a direct modification plan, and I also added regression test cases for that.

Best regards,
Etsuro Fujita
*** a/contrib/postgres_fdw/expected/postgres_fdw.out
--- b/contrib/postgres_fdw/expected/postgres_fdw.out
***************
*** 7734,7739 **** update utrtest set a = 1 where a = 2 returning *;
--- 7734,7873 ----
  (1 row)
  
  drop trigger loct_br_insert_trigger on loct;
+ delete from utrtest;
+ insert into utrtest values (1, 'foo');
+ insert into utrtest values (2, 'qux');
+ -- We can move rows to a foreign partition that has been updated already,
+ -- but can't move rows to a foreign partition that hasn't yet been updated
+ -- Test the former case:
+ -- with a direct modification plan
+ explain (verbose, costs off)
+ update utrtest set a = 1 returning *;
+                            QUERY PLAN                            
+ -----------------------------------------------------------------
+  Update on public.utrtest
+    Output: remp.a, remp.b
+    Foreign Update on public.remp
+    Update on public.locp
+    ->  Foreign Update on public.remp
+          Remote SQL: UPDATE public.loct SET a = 1 RETURNING a, b
+    ->  Seq Scan on public.locp
+          Output: 1, locp.b, locp.ctid
+ (8 rows)
+ 
+ update utrtest set a = 1 returning *;
+  a |  b  
+ ---+-----
+  1 | foo
+  1 | qux
+ (2 rows)
+ 
+ delete from utrtest;
+ insert into utrtest values (1, 'foo');
+ insert into utrtest values (2, 'qux');
+ -- with a non-direct modification plan
+ explain (verbose, costs off)
+ update utrtest set a = 1 from (values (1), (2)) s(x) where a = s.x returning *;
+                                   QUERY PLAN                                  
+ ------------------------------------------------------------------------------
+  Update on public.utrtest
+    Output: remp.a, remp.b, "*VALUES*".column1
+    Foreign Update on public.remp
+      Remote SQL: UPDATE public.loct SET a = $2 WHERE ctid = $1 RETURNING a, b
+    Update on public.locp
+    ->  Hash Join
+          Output: 1, remp.b, remp.ctid, "*VALUES*".*, "*VALUES*".column1
+          Hash Cond: (remp.a = "*VALUES*".column1)
+          ->  Foreign Scan on public.remp
+                Output: remp.b, remp.ctid, remp.a
+                Remote SQL: SELECT a, b, ctid FROM public.loct FOR UPDATE
+          ->  Hash
+                Output: "*VALUES*".*, "*VALUES*".column1
+                ->  Values Scan on "*VALUES*"
+                      Output: "*VALUES*".*, "*VALUES*".column1
+    ->  Hash Join
+          Output: 1, locp.b, locp.ctid, "*VALUES*".*, "*VALUES*".column1
+          Hash Cond: (locp.a = "*VALUES*".column1)
+          ->  Seq Scan on public.locp
+                Output: locp.b, locp.ctid, locp.a
+          ->  Hash
+                Output: "*VALUES*".*, "*VALUES*".column1
+                ->  Values Scan on "*VALUES*"
+                      Output: "*VALUES*".*, "*VALUES*".column1
+ (24 rows)
+ 
+ update utrtest set a = 1 from (values (1), (2)) s(x) where a = s.x returning *;
+  a |  b  | x 
+ ---+-----+---
+  1 | foo | 1
+  1 | qux | 2
+ (2 rows)
+ 
+ -- Change the definition of utrtest so that the foreign partition get
+ -- processed after the local partition
+ delete from utrtest;
+ alter table utrtest detach partition remp;
+ drop foreign table remp;
+ alter table loct drop constraint loct_a_check;
+ alter table loct add check (a in (3));
+ create foreign table remp (a int check (a in (3)), b text) server loopback options (table_name 'loct');
+ alter table utrtest attach partition remp for values in (3);
+ insert into utrtest values (2, 'qux');
+ insert into utrtest values (3, 'xyzzy');
+ -- Test the latter case:
+ -- with a direct modification plan
+ explain (verbose, costs off)
+ update utrtest set a = 3 returning *;
+                            QUERY PLAN                            
+ -----------------------------------------------------------------
+  Update on public.utrtest
+    Output: locp.a, locp.b
+    Update on public.locp
+    Foreign Update on public.remp
+    ->  Seq Scan on public.locp
+          Output: 3, locp.b, locp.ctid
+    ->  Foreign Update on public.remp
+          Remote SQL: UPDATE public.loct SET a = 3 RETURNING a, b
+ (8 rows)
+ 
+ update utrtest set a = 3 returning *; -- ERROR
+ ERROR:  cannot route tuples into foreign table to be updated "remp"
+ delete from utrtest;
+ insert into utrtest values (2, 'qux');
+ insert into utrtest values (3, 'xyzzy');
+ -- with a non-direct modification plan
+ explain (verbose, costs off)
+ update utrtest set a = 3 from (values (2), (3)) s(x) where a = s.x returning *;
+                                   QUERY PLAN                                  
+ ------------------------------------------------------------------------------
+  Update on public.utrtest
+    Output: locp.a, locp.b, "*VALUES*".column1
+    Update on public.locp
+    Foreign Update on public.remp
+      Remote SQL: UPDATE public.loct SET a = $2 WHERE ctid = $1 RETURNING a, b
+    ->  Hash Join
+          Output: 3, locp.b, locp.ctid, "*VALUES*".*, "*VALUES*".column1
+          Hash Cond: (locp.a = "*VALUES*".column1)
+          ->  Seq Scan on public.locp
+                Output: locp.b, locp.ctid, locp.a
+          ->  Hash
+                Output: "*VALUES*".*, "*VALUES*".column1
+                ->  Values Scan on "*VALUES*"
+                      Output: "*VALUES*".*, "*VALUES*".column1
+    ->  Hash Join
+          Output: 3, remp.b, remp.ctid, "*VALUES*".*, "*VALUES*".column1
+          Hash Cond: (remp.a = "*VALUES*".column1)
+          ->  Foreign Scan on public.remp
+                Output: remp.b, remp.ctid, remp.a
+                Remote SQL: SELECT a, b, ctid FROM public.loct FOR UPDATE
+          ->  Hash
+                Output: "*VALUES*".*, "*VALUES*".column1
+                ->  Values Scan on "*VALUES*"
+                      Output: "*VALUES*".*, "*VALUES*".column1
+ (24 rows)
+ 
+ update utrtest set a = 3 from (values (2), (3)) s(x) where a = s.x returning *; -- ERROR
+ ERROR:  cannot route tuples into foreign table to be updated "remp"
  drop table utrtest;
  drop table loct;
  -- Test copy tuple routing
*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
***************
*** 185,190 **** typedef struct PgFdwModifyState
--- 185,194 ----
  
  	/* working memory context */
  	MemoryContext temp_cxt;		/* context for per-tuple temporary data */
+ 
+ 	/* for update row movement, if subplan resultrel */
+ 	struct PgFdwModifyState *aux_fmstate;	/* foreign-insert state, if
+ 											 * created */
  } PgFdwModifyState;
  
  /*
***************
*** 1844,1851 **** postgresExecForeignInsert(EState *estate,
  						  TupleTableSlot *slot,
  						  TupleTableSlot *planSlot)
  {
! 	return execute_foreign_modify(estate, resultRelInfo, CMD_INSERT,
  								  slot, planSlot);
  }
  
  /*
--- 1848,1871 ----
  						  TupleTableSlot *slot,
  						  TupleTableSlot *planSlot)
  {
! 	PgFdwModifyState *fmstate = (PgFdwModifyState *) resultRelInfo->ri_FdwState;
! 	TupleTableSlot *rslot;
! 
! 	/*
! 	 * If the fmstate has aux_fmstate set, it means the foreign table is an
! 	 * UPDATE subplan resultrel, in which case, the aux_fmstate is provided
! 	 * for the INSERT operation on the table, whereas the fmstate is provided
! 	 * for the UPDATE operation on the table; use the aux_fmstate
! 	 */
! 	if (fmstate->aux_fmstate)
! 		resultRelInfo->ri_FdwState = fmstate->aux_fmstate;
! 	rslot = execute_foreign_modify(estate, resultRelInfo, CMD_INSERT,
  								  slot, planSlot);
+ 	/* Revert that change */
+ 	if (fmstate->aux_fmstate)
+ 		resultRelInfo->ri_FdwState = fmstate;
+ 
+ 	return rslot;
  }
  
  /*
***************
*** 1917,1922 **** postgresBeginForeignInsert(ModifyTableState *mtstate,
--- 1937,1959 ----
  
  	initStringInfo(&sql);
  
+ 	/*
+ 	 * If the foreign table is an UPDATE subplan resultrel that hasn't yet
+ 	 * been updated, routing tuples to the table might yield incorrect
+ 	 * results, because if routing tuples, routed tuples will be mistakenly
+ 	 * read from the table and updated twice when updating the table --- it
+ 	 * would be nice if we could handle this case; but for now, throw an error
+ 	 * for safety.
+ 	 */
+ 	if (plan && plan->operation == CMD_UPDATE &&
+ 		(resultRelInfo->ri_usesFdwDirectModify ||
+ 		 resultRelInfo->ri_FdwState) &&
+ 		resultRelInfo > mtstate->resultRelInfo + mtstate->mt_whichplan)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ 				 errmsg("cannot route tuples into foreign table to be updated \"%s\"",
+ 						RelationGetRelationName(rel))));
+ 
  	/* We transmit all columns that are defined in the foreign table. */
  	for (attnum = 1; attnum <= tupdesc->natts; attnum++)
  	{
***************
*** 1983,1989 **** postgresBeginForeignInsert(ModifyTableState *mtstate,
  									retrieved_attrs != NIL,
  									retrieved_attrs);
  
! 	resultRelInfo->ri_FdwState = fmstate;
  }
  
  /*
--- 2020,2038 ----
  									retrieved_attrs != NIL,
  									retrieved_attrs);
  
! 	/*
! 	 * If the given resultRelInfo already has PgFdwModifyState set, it means
! 	 * the foreign table is an UPDATE subplan resultrel; in which case, store
! 	 * the resulting state into the aux_fmstate of the PgFdwModifyState.
! 	 */
! 	if (resultRelInfo->ri_FdwState)
! 	{
! 		Assert(plan && plan->operation == CMD_UPDATE);
! 		Assert(resultRelInfo->ri_usesFdwDirectModify == false);
! 		((PgFdwModifyState *) resultRelInfo->ri_FdwState)->aux_fmstate = fmstate;
! 	}
! 	else
! 		resultRelInfo->ri_FdwState = fmstate;
  }
  
  /*
***************
*** 1998,2003 **** postgresEndForeignInsert(EState *estate,
--- 2047,2061 ----
  
  	Assert(fmstate != NULL);
  
+ 	/*
+ 	 * If the fmstate has aux_fmstate set, it means the foreign table is an
+ 	 * UPDATE subplan resultrel, in which case, the aux_fmstate is provided
+ 	 * for the INSERT operation on the table, whereas the fmstate is provided
+ 	 * for the UPDATE operation on the table; get the aux_fmstate
+ 	 */
+ 	if (fmstate->aux_fmstate)
+ 		fmstate = fmstate->aux_fmstate;
+ 
  	/* Destroy the execution state */
  	finish_foreign_modify(fmstate);
  }
***************
*** 3482,3487 **** create_foreign_modify(EState *estate,
--- 3540,3548 ----
  
  	Assert(fmstate->p_nums <= n_params);
  
+ 	/* Initialize auxiliary state */
+ 	fmstate->aux_fmstate = NULL;
+ 
  	return fmstate;
  }
  
*** a/contrib/postgres_fdw/sql/postgres_fdw.sql
--- b/contrib/postgres_fdw/sql/postgres_fdw.sql
***************
*** 2015,2020 **** update utrtest set a = 1 where a = 2 returning *;
--- 2015,2069 ----
  
  drop trigger loct_br_insert_trigger on loct;
  
+ delete from utrtest;
+ insert into utrtest values (1, 'foo');
+ insert into utrtest values (2, 'qux');
+ 
+ -- We can move rows to a foreign partition that has been updated already,
+ -- but can't move rows to a foreign partition that hasn't yet been updated
+ 
+ -- Test the former case:
+ -- with a direct modification plan
+ explain (verbose, costs off)
+ update utrtest set a = 1 returning *;
+ update utrtest set a = 1 returning *;
+ 
+ delete from utrtest;
+ insert into utrtest values (1, 'foo');
+ insert into utrtest values (2, 'qux');
+ 
+ -- with a non-direct modification plan
+ explain (verbose, costs off)
+ update utrtest set a = 1 from (values (1), (2)) s(x) where a = s.x returning *;
+ update utrtest set a = 1 from (values (1), (2)) s(x) where a = s.x returning *;
+ 
+ -- Change the definition of utrtest so that the foreign partition get
+ -- processed after the local partition
+ delete from utrtest;
+ alter table utrtest detach partition remp;
+ drop foreign table remp;
+ alter table loct drop constraint loct_a_check;
+ alter table loct add check (a in (3));
+ create foreign table remp (a int check (a in (3)), b text) server loopback options (table_name 'loct');
+ alter table utrtest attach partition remp for values in (3);
+ insert into utrtest values (2, 'qux');
+ insert into utrtest values (3, 'xyzzy');
+ 
+ -- Test the latter case:
+ -- with a direct modification plan
+ explain (verbose, costs off)
+ update utrtest set a = 3 returning *;
+ update utrtest set a = 3 returning *; -- ERROR
+ 
+ delete from utrtest;
+ insert into utrtest values (2, 'qux');
+ insert into utrtest values (3, 'xyzzy');
+ 
+ -- with a non-direct modification plan
+ explain (verbose, costs off)
+ update utrtest set a = 3 from (values (2), (3)) s(x) where a = s.x returning *;
+ update utrtest set a = 3 from (values (2), (3)) s(x) where a = s.x returning *; -- ERROR
+ 
  drop table utrtest;
  drop table loct;
  

Reply via email to