(2014/09/08 16:18), Albe Laurenz wrote:
I wrote:
I gave it a spin and could not find any undesirable behaviour, and the
output of EXPLAIN ANALYZE looks like I'd expect.

Thank you for the review!

I noticed that you use the list length of fdw_private to check if
the UPDATE or DELETE is pushed down to the remote server or not.

While this works fine, I wonder if it wouldn't be better to have some
explicit flag in fdw_private for that purpose.  Future modifications that
change the list length might easily overlook that it is used for this
purpose, thereby breaking the code.

Other than that it looks alright to me.

Maybe I should have mentioned that I have set the patch to "Waiting for Author"
because I'd like to hear your opinion on that, but I'm prepared to set it
to "Ready for Committer" soon.

I agree with you on that point. So, I've updated the patch to have the explicit flag, as you proposed. Attached is the updated version of the patch. In this version, I've also revised code and its comments a bit.

Sorry for the delay.

Best regards,
Etsuro Fujita
*** a/contrib/postgres_fdw/deparse.c
--- b/contrib/postgres_fdw/deparse.c
***************
*** 188,197 **** is_foreign_expr(PlannerInfo *root,
  	if (!foreign_expr_walker((Node *) expr, &glob_cxt, &loc_cxt))
  		return false;
  
- 	/* Expressions examined here should be boolean, ie noncollatable */
- 	Assert(loc_cxt.collation == InvalidOid);
- 	Assert(loc_cxt.state == FDW_COLLATE_NONE);
- 
  	/*
  	 * An expression which includes any mutable functions can't be sent over
  	 * because its result is not stable.  For example, sending now() remote
--- 188,193 ----
***************
*** 927,932 **** deparseUpdateSql(StringInfo buf, PlannerInfo *root,
--- 923,982 ----
  }
  
  /*
+  * deparse remote UPDATE statement
+  *
+  * The statement text is appended to buf, and we also create an integer List
+  * of the columns being retrieved by RETURNING (if any), which is returned
+  * to *retrieved_attrs.
+  */
+ void
+ deparseDirectUpdateSql(StringInfo buf, PlannerInfo *root,
+ 					   Index rtindex, Relation rel,
+ 					   List	*remote_conds,
+ 					   List	*targetlist,
+ 					   List *targetAttrs,
+ 					   List *returningList,
+ 					   List **retrieved_attrs)
+ {
+ 	RelOptInfo *baserel = root->simple_rel_array[rtindex];
+ 	List	   *params_list = NIL;
+ 	deparse_expr_cxt context;
+ 	bool		first;
+ 	ListCell   *lc;
+ 
+ 	/* Set up context struct for recursion */
+ 	context.root = root;
+ 	context.foreignrel = baserel;
+ 	context.buf = buf;
+ 	context.params_list = &params_list;
+ 
+ 	appendStringInfoString(buf, "UPDATE ");
+ 	deparseRelation(buf, rel);
+ 	appendStringInfoString(buf, " SET ");
+ 
+ 	first = true;
+ 	foreach(lc, targetAttrs)
+ 	{
+ 		int			attnum = lfirst_int(lc);
+ 		TargetEntry *tle = get_tle_by_resno(targetlist, attnum);
+ 
+ 		if (!first)
+ 			appendStringInfoString(buf, ", ");
+ 		first = false;
+ 
+ 		deparseColumnRef(buf, rtindex, attnum, root);
+ 		appendStringInfo(buf, " = ");
+ 		deparseExpr((Expr *) tle->expr, &context);
+ 	}
+ 	if (remote_conds)
+ 		appendWhereClause(buf, root, baserel, remote_conds,
+ 						  true, &params_list);
+ 
+ 	deparseReturningList(buf, root, rtindex, rel, false,
+ 						 returningList, retrieved_attrs);
+ }
+ 
+ /*
   * deparse remote DELETE statement
   *
   * The statement text is appended to buf, and we also create an integer List
***************
*** 949,954 **** deparseDeleteSql(StringInfo buf, PlannerInfo *root,
--- 999,1031 ----
  }
  
  /*
+  * deparse remote DELETE statement
+  *
+  * The statement text is appended to buf, and we also create an integer List
+  * of the columns being retrieved by RETURNING (if any), which is returned
+  * to *retrieved_attrs.
+  */
+ void
+ deparseDirectDeleteSql(StringInfo buf, PlannerInfo *root,
+ 					   Index rtindex, Relation rel,
+ 					   List	*remote_conds,
+ 					   List *returningList,
+ 					   List **retrieved_attrs)
+ {
+ 	RelOptInfo *baserel = root->simple_rel_array[rtindex];
+ 	List	   *params_list = NIL;
+ 
+ 	appendStringInfoString(buf, "DELETE FROM ");
+ 	deparseRelation(buf, rel);
+ 	if (remote_conds)
+ 		appendWhereClause(buf, root, baserel, remote_conds,
+ 						  true, &params_list);
+ 
+ 	deparseReturningList(buf, root, rtindex, rel, false,
+ 						 returningList, retrieved_attrs);
+ }
+ 
+ /*
   * Add a RETURNING clause, if needed, to an INSERT/UPDATE/DELETE.
   */
  static void
*** a/contrib/postgres_fdw/expected/postgres_fdw.out
--- b/contrib/postgres_fdw/expected/postgres_fdw.out
***************
*** 998,1004 **** INSERT INTO ft2 (c1,c2,c3)
--- 998,1025 ----
  (3 rows)
  
  INSERT INTO ft2 (c1,c2,c3) VALUES (1104,204,'ddd'), (1105,205,'eee');
+ EXPLAIN (verbose, costs off)
+ UPDATE ft2 SET c2 = c2 + 300, c3 = c3 || '_update3' WHERE c1 % 10 = 3;              -- can be pushed down
+                                                       QUERY PLAN                                                      
+ ----------------------------------------------------------------------------------------------------------------------
+  Update on public.ft2
+    ->  Foreign Scan on public.ft2
+          Output: c1, c2, NULL::integer, c3, c4, c5, c6, c7, c8, ctid
+          Remote SQL: UPDATE "S 1"."T 1" SET c2 = (c2 + 300), c3 = (c3 || '_update3'::text) WHERE ((("C 1" % 10) = 3))
+ (4 rows)
+ 
  UPDATE ft2 SET c2 = c2 + 300, c3 = c3 || '_update3' WHERE c1 % 10 = 3;
+ EXPLAIN (verbose, costs off)
+ UPDATE ft2 SET c2 = c2 + 400, c3 = c3 || '_update7' WHERE c1 % 10 = 7 RETURNING *;  -- can be pushed down
+                                                                             QUERY PLAN                                                                            
+ ------------------------------------------------------------------------------------------------------------------------------------------------------------------
+  Update on public.ft2
+    Output: c1, c2, c3, c4, c5, c6, c7, c8
+    ->  Foreign Scan on public.ft2
+          Output: c1, c2, NULL::integer, c3, c4, c5, c6, c7, c8, ctid
+          Remote SQL: UPDATE "S 1"."T 1" SET c2 = (c2 + 400), c3 = (c3 || '_update7'::text) WHERE ((("C 1" % 10) = 7)) RETURNING "C 1", c2, c3, c4, c5, c6, c7, c8
+ (5 rows)
+ 
  UPDATE ft2 SET c2 = c2 + 400, c3 = c3 || '_update7' WHERE c1 % 10 = 7 RETURNING *;
    c1  | c2  |         c3         |              c4              |            c5            | c6 |     c7     | c8  
  ------+-----+--------------------+------------------------------+--------------------------+----+------------+-----
***************
*** 1108,1114 **** UPDATE ft2 SET c2 = c2 + 400, c3 = c3 || '_update7' WHERE c1 % 10 = 7 RETURNING
  
  EXPLAIN (verbose, costs off)
  UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
!   FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;
                                                                              QUERY PLAN                                                                             
  -------------------------------------------------------------------------------------------------------------------------------------------------------------------
   Update on public.ft2
--- 1129,1135 ----
  
  EXPLAIN (verbose, costs off)
  UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
!   FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;                               -- can't be pushed down
                                                                              QUERY PLAN                                                                             
  -------------------------------------------------------------------------------------------------------------------------------------------------------------------
   Update on public.ft2
***************
*** 1129,1144 **** UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
  UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
    FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;
  EXPLAIN (verbose, costs off)
!   DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
!                                        QUERY PLAN                                       
! ----------------------------------------------------------------------------------------
   Delete on public.ft2
     Output: c1, c4
-    Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1 RETURNING "C 1", c4
     ->  Foreign Scan on public.ft2
           Output: ctid
!          Remote SQL: SELECT ctid FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 5)) FOR UPDATE
! (6 rows)
  
  DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
    c1  |              c4              
--- 1150,1164 ----
  UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
    FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;
  EXPLAIN (verbose, costs off)
!   DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;                               -- can be pushed down
!                                          QUERY PLAN                                         
! --------------------------------------------------------------------------------------------
   Delete on public.ft2
     Output: c1, c4
     ->  Foreign Scan on public.ft2
           Output: ctid
!          Remote SQL: DELETE FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 5)) RETURNING "C 1", c4
! (5 rows)
  
  DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
    c1  |              c4              
***************
*** 1249,1255 **** DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
  (103 rows)
  
  EXPLAIN (verbose, costs off)
! DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;
                                                        QUERY PLAN                                                      
  ----------------------------------------------------------------------------------------------------------------------
   Delete on public.ft2
--- 1269,1275 ----
  (103 rows)
  
  EXPLAIN (verbose, costs off)
! DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;                -- can't be pushed down
                                                        QUERY PLAN                                                      
  ----------------------------------------------------------------------------------------------------------------------
   Delete on public.ft2
***************
*** 2092,2097 **** SELECT c1,c2,c3,c4 FROM ft2 ORDER BY c1;
--- 2112,2130 ----
   1104 | 204 | ddd                | 
  (819 rows)
  
+ EXPLAIN (verbose, costs off)
+ UPDATE ft2 SET c7 = DEFAULT WHERE c1 % 10 = 0 AND date(c4) = '1970-01-01'::date;    -- can't be pushed down
+                                                       QUERY PLAN                                                       
+ -----------------------------------------------------------------------------------------------------------------------
+  Update on public.ft2
+    Remote SQL: UPDATE "S 1"."T 1" SET c7 = $2 WHERE ctid = $1
+    ->  Foreign Scan on public.ft2
+          Output: c1, c2, NULL::integer, c3, c4, c5, c6, 'ft2       '::character(10), c8, ctid
+          Filter: (date(ft2.c4) = '01-01-1970'::date)
+          Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c8, ctid FROM "S 1"."T 1" WHERE ((("C 1" % 10) = 0)) FOR UPDATE
+ (6 rows)
+ 
+ UPDATE ft2 SET c7 = DEFAULT WHERE c1 % 10 = 0 AND date(c4) = '1970-01-01'::date;
  -- Test that trigger on remote table works as expected
  CREATE OR REPLACE FUNCTION "S 1".F_BRTRIG() RETURNS trigger AS $$
  BEGIN
***************
*** 2233,2239 **** CONTEXT:  Remote SQL command: INSERT INTO "S 1"."T 1"("C 1", c2, c3, c4, c5, c6,
  UPDATE ft1 SET c2 = -c2 WHERE c1 = 1;  -- c2positive
  ERROR:  new row for relation "T 1" violates check constraint "c2positive"
  DETAIL:  Failing row contains (1, -1, 00001_trig_update, 1970-01-02 08:00:00+00, 1970-01-02 00:00:00, 1, 1         , foo).
! CONTEXT:  Remote SQL command: UPDATE "S 1"."T 1" SET c2 = $2 WHERE ctid = $1
  -- Test savepoint/rollback behavior
  select c2, count(*) from ft2 where c2 < 500 group by 1 order by 1;
   c2  | count 
--- 2266,2272 ----
  UPDATE ft1 SET c2 = -c2 WHERE c1 = 1;  -- c2positive
  ERROR:  new row for relation "T 1" violates check constraint "c2positive"
  DETAIL:  Failing row contains (1, -1, 00001_trig_update, 1970-01-02 08:00:00+00, 1970-01-02 00:00:00, 1, 1         , foo).
! CONTEXT:  Remote SQL command: UPDATE "S 1"."T 1" SET c2 = (- c2) WHERE (("C 1" = 1))
  -- Test savepoint/rollback behavior
  select c2, count(*) from ft2 where c2 < 500 group by 1 order by 1;
   c2  | count 
***************
*** 2392,2398 **** savepoint s3;
  update ft2 set c2 = -2 where c2 = 42 and c1 = 10; -- fail on remote side
  ERROR:  new row for relation "T 1" violates check constraint "c2positive"
  DETAIL:  Failing row contains (10, -2, 00010_trig_update_trig_update, 1970-01-11 08:00:00+00, 1970-01-11 00:00:00, 0, 0         , foo).
! CONTEXT:  Remote SQL command: UPDATE "S 1"."T 1" SET c2 = $2 WHERE ctid = $1
  rollback to savepoint s3;
  select c2, count(*) from ft2 where c2 < 500 group by 1 order by 1;
   c2  | count 
--- 2425,2431 ----
  update ft2 set c2 = -2 where c2 = 42 and c1 = 10; -- fail on remote side
  ERROR:  new row for relation "T 1" violates check constraint "c2positive"
  DETAIL:  Failing row contains (10, -2, 00010_trig_update_trig_update, 1970-01-11 08:00:00+00, 1970-01-11 00:00:00, 0, 0         , foo).
! CONTEXT:  Remote SQL command: UPDATE "S 1"."T 1" SET c2 = (-2) WHERE ((c2 = 42)) AND (("C 1" = 10))
  rollback to savepoint s3;
  select c2, count(*) from ft2 where c2 < 500 group by 1 order by 1;
   c2  | count 
***************
*** 2840,2845 **** NOTICE:  NEW: (13,"test triggered !")
--- 2873,3083 ----
   (0,27)
  (1 row)
  
+ -- cleanup
+ DROP TRIGGER trig_row_before ON rem1;
+ DROP TRIGGER trig_row_after ON rem1;
+ DROP TRIGGER trig_local_before ON loc1;
+ -- Test update-pushdown functionality
+ -- Test with statement-level triggers
+ CREATE TRIGGER trig_stmt_before
+ 	BEFORE DELETE OR INSERT OR UPDATE ON rem1
+ 	FOR EACH STATEMENT EXECUTE PROCEDURE trigger_func();
+ EXPLAIN (verbose, costs off)
+ UPDATE rem1 set f2 = '';          -- can be pushed down
+                         QUERY PLAN                        
+ ----------------------------------------------------------
+  Update on public.rem1
+    ->  Foreign Scan on public.rem1
+          Output: f1, f2, ctid
+          Remote SQL: UPDATE public.loc1 SET f2 = ''::text
+ (4 rows)
+ 
+ EXPLAIN (verbose, costs off)
+ DELETE FROM rem1;                 -- can be pushed down
+                  QUERY PLAN                  
+ ---------------------------------------------
+  Delete on public.rem1
+    ->  Foreign Scan on public.rem1
+          Output: ctid
+          Remote SQL: DELETE FROM public.loc1
+ (4 rows)
+ 
+ DROP TRIGGER trig_stmt_before ON rem1;
+ CREATE TRIGGER trig_stmt_after
+ 	AFTER DELETE OR INSERT OR UPDATE ON rem1
+ 	FOR EACH STATEMENT EXECUTE PROCEDURE trigger_func();
+ EXPLAIN (verbose, costs off)
+ UPDATE rem1 set f2 = '';          -- can be pushed down
+                         QUERY PLAN                        
+ ----------------------------------------------------------
+  Update on public.rem1
+    ->  Foreign Scan on public.rem1
+          Output: f1, f2, ctid
+          Remote SQL: UPDATE public.loc1 SET f2 = ''::text
+ (4 rows)
+ 
+ EXPLAIN (verbose, costs off)
+ DELETE FROM rem1;                 -- can be pushed down
+                  QUERY PLAN                  
+ ---------------------------------------------
+  Delete on public.rem1
+    ->  Foreign Scan on public.rem1
+          Output: ctid
+          Remote SQL: DELETE FROM public.loc1
+ (4 rows)
+ 
+ DROP TRIGGER trig_stmt_after ON rem1;
+ -- Test with row-level ON INSERT triggers
+ CREATE TRIGGER trig_row_before_insert
+ BEFORE INSERT ON rem1
+ FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
+ EXPLAIN (verbose, costs off)
+ UPDATE rem1 set f2 = '';          -- can be pushed down
+                         QUERY PLAN                        
+ ----------------------------------------------------------
+  Update on public.rem1
+    ->  Foreign Scan on public.rem1
+          Output: f1, f2, ctid
+          Remote SQL: UPDATE public.loc1 SET f2 = ''::text
+ (4 rows)
+ 
+ EXPLAIN (verbose, costs off)
+ DELETE FROM rem1;                 -- can be pushed down
+                  QUERY PLAN                  
+ ---------------------------------------------
+  Delete on public.rem1
+    ->  Foreign Scan on public.rem1
+          Output: ctid
+          Remote SQL: DELETE FROM public.loc1
+ (4 rows)
+ 
+ DROP TRIGGER trig_row_before_insert ON rem1;
+ CREATE TRIGGER trig_row_after_insert
+ AFTER INSERT ON rem1
+ FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
+ EXPLAIN (verbose, costs off)
+ UPDATE rem1 set f2 = '';          -- can be pushed down
+                         QUERY PLAN                        
+ ----------------------------------------------------------
+  Update on public.rem1
+    ->  Foreign Scan on public.rem1
+          Output: f1, f2, ctid
+          Remote SQL: UPDATE public.loc1 SET f2 = ''::text
+ (4 rows)
+ 
+ EXPLAIN (verbose, costs off)
+ DELETE FROM rem1;                 -- can be pushed down
+                  QUERY PLAN                  
+ ---------------------------------------------
+  Delete on public.rem1
+    ->  Foreign Scan on public.rem1
+          Output: ctid
+          Remote SQL: DELETE FROM public.loc1
+ (4 rows)
+ 
+ DROP TRIGGER trig_row_after_insert ON rem1;
+ -- Test with row-level ON UPDATE triggers
+ CREATE TRIGGER trig_row_before_update
+ BEFORE UPDATE ON rem1
+ FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
+ EXPLAIN (verbose, costs off)
+ UPDATE rem1 set f2 = '';          -- can't be pushed down
+                              QUERY PLAN                              
+ ---------------------------------------------------------------------
+  Update on public.rem1
+    Remote SQL: UPDATE public.loc1 SET f2 = $2 WHERE ctid = $1
+    ->  Foreign Scan on public.rem1
+          Output: f1, ''::text, ctid, rem1.*
+          Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
+ (5 rows)
+ 
+ EXPLAIN (verbose, costs off)
+ DELETE FROM rem1;                 -- can be pushed down
+                  QUERY PLAN                  
+ ---------------------------------------------
+  Delete on public.rem1
+    ->  Foreign Scan on public.rem1
+          Output: ctid
+          Remote SQL: DELETE FROM public.loc1
+ (4 rows)
+ 
+ DROP TRIGGER trig_row_before_update ON rem1;
+ CREATE TRIGGER trig_row_after_update
+ AFTER UPDATE ON rem1
+ FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
+ EXPLAIN (verbose, costs off)
+ UPDATE rem1 set f2 = '';          -- can't be pushed down
+                                   QUERY PLAN                                   
+ -------------------------------------------------------------------------------
+  Update on public.rem1
+    Remote SQL: UPDATE public.loc1 SET f2 = $2 WHERE ctid = $1 RETURNING f1, f2
+    ->  Foreign Scan on public.rem1
+          Output: f1, ''::text, ctid, rem1.*
+          Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
+ (5 rows)
+ 
+ EXPLAIN (verbose, costs off)
+ DELETE FROM rem1;                 -- can be pushed down
+                  QUERY PLAN                  
+ ---------------------------------------------
+  Delete on public.rem1
+    ->  Foreign Scan on public.rem1
+          Output: ctid
+          Remote SQL: DELETE FROM public.loc1
+ (4 rows)
+ 
+ DROP TRIGGER trig_row_after_update ON rem1;
+ -- Test with row-level ON DELETE triggers
+ CREATE TRIGGER trig_row_before_delete
+ BEFORE DELETE ON rem1
+ FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
+ EXPLAIN (verbose, costs off)
+ UPDATE rem1 set f2 = '';          -- can be pushed down
+                         QUERY PLAN                        
+ ----------------------------------------------------------
+  Update on public.rem1
+    ->  Foreign Scan on public.rem1
+          Output: f1, f2, ctid
+          Remote SQL: UPDATE public.loc1 SET f2 = ''::text
+ (4 rows)
+ 
+ EXPLAIN (verbose, costs off)
+ DELETE FROM rem1;                 -- can't be pushed down
+                              QUERY PLAN                              
+ ---------------------------------------------------------------------
+  Delete on public.rem1
+    Remote SQL: DELETE FROM public.loc1 WHERE ctid = $1
+    ->  Foreign Scan on public.rem1
+          Output: ctid, rem1.*
+          Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
+ (5 rows)
+ 
+ DROP TRIGGER trig_row_before_delete ON rem1;
+ CREATE TRIGGER trig_row_after_delete
+ AFTER DELETE ON rem1
+ FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
+ EXPLAIN (verbose, costs off)
+ UPDATE rem1 set f2 = '';          -- can be pushed down
+                         QUERY PLAN                        
+ ----------------------------------------------------------
+  Update on public.rem1
+    ->  Foreign Scan on public.rem1
+          Output: f1, f2, ctid
+          Remote SQL: UPDATE public.loc1 SET f2 = ''::text
+ (4 rows)
+ 
+ EXPLAIN (verbose, costs off)
+ DELETE FROM rem1;                 -- can't be pushed down
+                                QUERY PLAN                               
+ ------------------------------------------------------------------------
+  Delete on public.rem1
+    Remote SQL: DELETE FROM public.loc1 WHERE ctid = $1 RETURNING f1, f2
+    ->  Foreign Scan on public.rem1
+          Output: ctid, rem1.*
+          Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
+ (5 rows)
+ 
+ DROP TRIGGER trig_row_after_delete ON rem1;
  -- ===================================================================
  -- test IMPORT FOREIGN SCHEMA
  -- ===================================================================
*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
***************
*** 86,92 **** typedef struct PgFdwRelationInfo
   * planner to executor.  Currently we store:
   *
   * 1) SELECT statement text to be sent to the remote server
!  * 2) Integer list of attribute numbers retrieved by the SELECT
   *
   * These items are indexed with the enum FdwScanPrivateIndex, so an item
   * can be fetched with list_nth().  For example, to get the SELECT statement:
--- 86,98 ----
   * planner to executor.  Currently we store:
   *
   * 1) SELECT statement text to be sent to the remote server
!  * 2) List of restriction clauses that can be executed remotely
!  * 3) Integer list of attribute numbers retrieved by the SELECT
!  * 4) Boolean flag showing if an UPDATE/DELETE is pushed down
!  * 5) UPDATE/DELETE statement text to be sent to the remote server
!  * 6) Boolean flag showing if we set the command es_processed
!  * 7) Boolean flag showing if the remote query has a RETURNING clause
!  * 8) Integer list of attribute numbers retrieved by RETURNING, if any
   *
   * These items are indexed with the enum FdwScanPrivateIndex, so an item
   * can be fetched with list_nth().  For example, to get the SELECT statement:
***************
*** 96,103 **** enum FdwScanPrivateIndex
  {
  	/* SQL statement to execute remotely (as a String node) */
  	FdwScanPrivateSelectSql,
! 	/* Integer list of attribute numbers retrieved by the SELECT */
! 	FdwScanPrivateRetrievedAttrs
  };
  
  /*
--- 102,121 ----
  {
  	/* SQL statement to execute remotely (as a String node) */
  	FdwScanPrivateSelectSql,
! 	/* List of restriction clauses that can be executed remotely */
! 	FdwScanPrivateRemoteConds,
! 	/* Integer list of attribute numbers retrieved by SELECT */
! 	FdwScanPrivateRetrievedAttrsBySelect,
! 	/* UPDATE/DELETE-pushdown flag (as an integer Value node) */
! 	FdwScanPrivateUpdatePushdown,
! 	/* UPDATE/DELETE statement to execute remotely (as a String node) */
! 	FdwScanPrivateUpdateSql,
! 	/* set-processed flag (as an integer Value node) */
! 	FdwScanPrivateSetProcessed,
! 	/* has-returning flag (as an integer Value node) */
! 	FdwScanPrivateHasReturning,
! 	/* Integer list of attribute numbers retrieved by RETURNING */
! 	FdwScanPrivateRetrievedAttrsByReturning
  };
  
  /*
***************
*** 131,138 **** typedef struct PgFdwScanState
  	AttInMetadata *attinmeta;	/* attribute datatype conversion metadata */
  
  	/* extracted fdw_private data */
! 	char	   *query;			/* text of SELECT command */
  	List	   *retrieved_attrs;	/* list of retrieved attribute numbers */
  
  	/* for remote query execution */
  	PGconn	   *conn;			/* connection for the scan */
--- 149,158 ----
  	AttInMetadata *attinmeta;	/* attribute datatype conversion metadata */
  
  	/* extracted fdw_private data */
! 	char	   *query;			/* text of SELECT or UPDATE/DELETE command */
  	List	   *retrieved_attrs;	/* list of retrieved attribute numbers */
+ 	bool		set_processed;	/* do we set the command es_processed? */
+ 	bool		has_returning;	/* is there a RETURNING clause? */
  
  	/* for remote query execution */
  	PGconn	   *conn;			/* connection for the scan */
***************
*** 152,157 **** typedef struct PgFdwScanState
--- 172,182 ----
  	int			fetch_ct_2;		/* Min(# of fetches done, 2) */
  	bool		eof_reached;	/* true if last fetch reached EOF */
  
+ 	/* for update pushdown */
+ 	bool		update_pushdown;	/* is an UPDATE/DELETE pushed down? */
+ 	PGresult   *result;			/* result of an UPDATE/DELETE query */
+ 	TupleTableSlot *rslot;		/* slot containing the result tuple */
+ 
  	/* working memory contexts */
  	MemoryContext batch_cxt;	/* context holding current batch of tuples */
  	MemoryContext temp_cxt;		/* context for per-tuple temporary data */
***************
*** 180,185 **** typedef struct PgFdwModifyState
--- 205,214 ----
  	int			p_nums;			/* number of parameters to transmit */
  	FmgrInfo   *p_flinfo;		/* output conversion functions for them */
  
+ 	/* for update pushdown */
+ 	bool		update_pushdown;	/* is an UPDATE/DELETE pushed down? */
+ 	PgFdwScanState *fsstate;	/* execution state of a foreign scan */
+ 
  	/* working memory context */
  	MemoryContext temp_cxt;		/* context for per-tuple temporary data */
  } PgFdwModifyState;
***************
*** 308,319 **** static bool ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
  static void create_cursor(ForeignScanState *node);
  static void fetch_more_data(ForeignScanState *node);
  static void close_cursor(PGconn *conn, unsigned int cursor_number);
  static void prepare_foreign_modify(PgFdwModifyState *fmstate);
  static const char **convert_prep_stmt_params(PgFdwModifyState *fmstate,
  						 ItemPointer tupleid,
  						 TupleTableSlot *slot);
! static void store_returning_result(PgFdwModifyState *fmstate,
! 					   TupleTableSlot *slot, PGresult *res);
  static int postgresAcquireSampleRowsFunc(Relation relation, int elevel,
  							  HeapTuple *rows, int targrows,
  							  double *totalrows,
--- 337,367 ----
  static void create_cursor(ForeignScanState *node);
  static void fetch_more_data(ForeignScanState *node);
  static void close_cursor(PGconn *conn, unsigned int cursor_number);
+ static bool update_is_pushdown_safe(PlannerInfo *root,
+ 									ModifyTable *plan,
+ 									Index resultRelation,
+ 									int subplan_index,
+ 									Relation rel,
+ 									List *targetAttrs);
+ static List *push_update_down(PlannerInfo *root,
+ 							  ModifyTable *plan,
+ 							  Index resultRelation,
+ 							  int subplan_index,
+ 							  Relation rel,
+ 							  List *targetAttrs);
+ static List *rewrite_targetlist(Index resultRelation, Relation rel,
+ 								List *targetlist, List *targetAttrs);
  static void prepare_foreign_modify(PgFdwModifyState *fmstate);
  static const char **convert_prep_stmt_params(PgFdwModifyState *fmstate,
  						 ItemPointer tupleid,
  						 TupleTableSlot *slot);
! static void store_returning_result(TupleTableSlot *slot,
! 					   PGresult *res,
! 					   int row,
! 					   Relation rel,
! 					   AttInMetadata *attinmeta,
! 					   List *retrieved_attrs,
! 					   MemoryContext temp_context);
  static int postgresAcquireSampleRowsFunc(Relation relation, int elevel,
  							  HeapTuple *rows, int targrows,
  							  double *totalrows,
***************
*** 852,859 **** postgresGetForeignPlan(PlannerInfo *root,
  	 * Build the fdw_private list that will be available to the executor.
  	 * Items in the list must match enum FdwScanPrivateIndex, above.
  	 */
! 	fdw_private = list_make2(makeString(sql.data),
! 							 retrieved_attrs);
  
  	/*
  	 * Create the ForeignScan node from target list, local filtering
--- 900,909 ----
  	 * Build the fdw_private list that will be available to the executor.
  	 * Items in the list must match enum FdwScanPrivateIndex, above.
  	 */
! 	fdw_private = list_make4(makeString(sql.data),
! 							 remote_conds,
! 							 retrieved_attrs,
! 							 makeInteger(false));
  
  	/*
  	 * Create the ForeignScan node from target list, local filtering
***************
*** 914,935 **** postgresBeginForeignScan(ForeignScanState *node, int eflags)
  	server = GetForeignServer(table->serverid);
  	user = GetUserMapping(userid, server->serverid);
  
  	/*
  	 * Get connection to the foreign server.  Connection manager will
  	 * establish new connection if necessary.
  	 */
  	fsstate->conn = GetConnection(server, user, false);
  
- 	/* Assign a unique ID for my cursor */
- 	fsstate->cursor_number = GetCursorNumber(fsstate->conn);
- 	fsstate->cursor_exists = false;
- 
- 	/* Get private info created by planner functions. */
- 	fsstate->query = strVal(list_nth(fsplan->fdw_private,
- 									 FdwScanPrivateSelectSql));
- 	fsstate->retrieved_attrs = (List *) list_nth(fsplan->fdw_private,
- 											   FdwScanPrivateRetrievedAttrs);
- 
  	/* Create contexts for batches of tuples and per-tuple temp workspace. */
  	fsstate->batch_cxt = AllocSetContextCreate(estate->es_query_cxt,
  											   "postgres_fdw tuple data",
--- 964,997 ----
  	server = GetForeignServer(table->serverid);
  	user = GetUserMapping(userid, server->serverid);
  
+ 	/* Get private info created by planner functions. */
+ 	fsstate->update_pushdown = intVal(list_nth(fsplan->fdw_private,
+ 											   FdwScanPrivateUpdatePushdown));
+ 	if (fsstate->update_pushdown)
+ 	{
+ 		fsstate->query = strVal(list_nth(fsplan->fdw_private,
+ 										 FdwScanPrivateUpdateSql));
+ 		fsstate->set_processed = intVal(list_nth(fsplan->fdw_private,
+ 												 FdwScanPrivateSetProcessed));
+ 		fsstate->has_returning = intVal(list_nth(fsplan->fdw_private,
+ 												 FdwScanPrivateHasReturning));
+ 		fsstate->retrieved_attrs = (List *) list_nth(fsplan->fdw_private,
+ 									  FdwScanPrivateRetrievedAttrsByReturning);
+ 	}
+ 	else
+ 	{
+ 		fsstate->query = strVal(list_nth(fsplan->fdw_private,
+ 										 FdwScanPrivateSelectSql));
+ 		fsstate->retrieved_attrs = (List *) list_nth(fsplan->fdw_private,
+ 										 FdwScanPrivateRetrievedAttrsBySelect);
+ 	}
+ 
  	/*
  	 * Get connection to the foreign server.  Connection manager will
  	 * establish new connection if necessary.
  	 */
  	fsstate->conn = GetConnection(server, user, false);
  
  	/* Create contexts for batches of tuples and per-tuple temp workspace. */
  	fsstate->batch_cxt = AllocSetContextCreate(estate->es_query_cxt,
  											   "postgres_fdw tuple data",
***************
*** 945,950 **** postgresBeginForeignScan(ForeignScanState *node, int eflags)
--- 1007,1025 ----
  	/* Get info we'll need for input data conversion. */
  	fsstate->attinmeta = TupleDescGetAttInMetadata(RelationGetDescr(fsstate->rel));
  
+ 	/*
+ 	 * If pushing update down, we've got no more to do.
+ 	 */
+ 	if (fsstate->update_pushdown)
+ 	{
+ 		fsstate->num_tuples = -1;		/* -1 means not set yet */
+ 		return;
+ 	}
+ 
+ 	/* Assign a unique ID for my cursor */
+ 	fsstate->cursor_number = GetCursorNumber(fsstate->conn);
+ 	fsstate->cursor_exists = false;
+ 
  	/* Prepare for output conversion of parameters used in remote query. */
  	numParams = list_length(fsplan->fdw_exprs);
  	fsstate->numParams = numParams;
***************
*** 995,1000 **** postgresIterateForeignScan(ForeignScanState *node)
--- 1070,1166 ----
  	TupleTableSlot *slot = node->ss.ss_ScanTupleSlot;
  
  	/*
+ 	 * Get the result of an UPDATE/DELETE RETURNING, if pushing the command down.
+ 	 */
+ 	if (fsstate->update_pushdown)
+ 	{
+ 		MemoryContext oldcontext;
+ 
+ 		/*
+ 		 * If this is the first call after Begin, we need to execute the statement,
+ 		 * and check for success.
+ 		 */
+ 		if (fsstate->num_tuples == -1)
+ 		{
+ 			/*
+ 			 * We don't use a PG_TRY block here, so be careful not to throw error
+ 			 * without releasing the PGresult.
+ 			 */
+ 			fsstate->result = PQexec(fsstate->conn, fsstate->query);
+ 			if (PQresultStatus(fsstate->result) !=
+ 				(fsstate->has_returning ? PGRES_TUPLES_OK : PGRES_COMMAND_OK))
+ 				pgfdw_report_error(ERROR, fsstate->result, fsstate->conn, true,
+ 								   fsstate->query);
+ 
+ 			/* Check number of rows affected. */
+ 			if (fsstate->has_returning)
+ 				fsstate->num_tuples = PQntuples(fsstate->result);
+ 			else
+ 				fsstate->num_tuples = atoi(PQcmdTuples(fsstate->result));
+ 		}
+ 
+ 		/*
+ 		 * If the update query doesn't have a RETURNING clause, then there is
+ 		 * nothing to do, so we just return an empty slot.
+ 		 */
+ 		if (!fsstate->has_returning)
+ 		{
+ 			/*
+ 			 * Increment the command es_processed count if necessary.
+ 			 * (Note: ModifyTable cannot do that by itself in this case.)
+ 			 */
+ 			if (fsstate->set_processed)
+ 			{
+ 				EState	   *estate = node->ss.ps.state;
+ 
+ 				estate->es_processed += fsstate->num_tuples;
+ 			}
+ 
+ 			/*
+ 			 * Increment the tuple count for EXPLAIN ANALYZE if necessary.
+ 			 * (Note: EXPLAIN ANALYZE cannot do that by itself in this case.)
+ 			 */
+ 			if (node->ss.ps.instrument)
+ 			{
+ 				Instrumentation *instr = node->ss.ps.instrument;
+ 
+ 				instr->ntuples += fsstate->num_tuples;
+ 			}
+ 
+ 			return ExecClearTuple(slot);
+ 		}
+ 
+ 		/* If we didn't get any tuples, must be end of data. */
+ 		if (fsstate->next_tuple >= fsstate->num_tuples)
+ 			return ExecClearTuple(slot);
+ 
+ 		/* OK, we'll store RETURNING tuples in the batch_cxt. */
+ 		oldcontext = MemoryContextSwitchTo(fsstate->batch_cxt);
+ 
+ 		/* Fetch the next tuple. */
+ 		store_returning_result(slot,
+ 							   fsstate->result,
+ 							   fsstate->next_tuple,
+ 							   fsstate->rel,
+ 							   fsstate->attinmeta,
+ 							   fsstate->retrieved_attrs,
+ 							   fsstate->temp_cxt);
+ 		fsstate->rslot = slot;
+ 		fsstate->next_tuple++;
+ 
+ 		MemoryContextSwitchTo(oldcontext);
+ 
+ 		/*
+ 		 * Return slot.  Note that this is safe because we can avoid applying
+ 		 * ExecQual to the tuple due to no local quals (see the comment for
+ 		 * update_is_pushdown_safe) and because the tuple can be safely
+ 		 * projected by ExecProject (see push_update_down) and would then be
+ 		 * ignored by postgresExecForeignUpdate or postgresExecForeignDelete.
+ 		 */
+ 		return slot;
+ 	}
+ 
+ 	/*
  	 * If this is the first call after Begin or ReScan, we need to create the
  	 * cursor on the remote side.
  	 */
***************
*** 1036,1041 **** postgresReScanForeignScan(ForeignScanState *node)
--- 1202,1210 ----
  	char		sql[64];
  	PGresult   *res;
  
+ 	/* This shouldn't be called in update pushdown case. */
+ 	Assert(fsstate->update_pushdown == false);
+ 
  	/* If we haven't created the cursor yet, nothing to do. */
  	if (!fsstate->cursor_exists)
  		return;
***************
*** 1094,1099 **** postgresEndForeignScan(ForeignScanState *node)
--- 1263,1275 ----
  	if (fsstate == NULL)
  		return;
  
+ 	/* if pushing update down, nothing to do other than cleanup */
+ 	if (fsstate->update_pushdown)
+ 	{
+ 		if (fsstate->result)
+ 			PQclear(fsstate->result);
+ 	}
+ 
  	/* Close the cursor if open, to prevent accumulation of cursors */
  	if (fsstate->cursor_exists)
  		close_cursor(fsstate->conn, fsstate->cursor_number);
***************
*** 1145,1157 **** postgresAddForeignUpdateTargets(Query *parsetree,
  /*
   * postgresPlanForeignModify
   *		Plan an insert/update/delete operation on a foreign table
-  *
-  * Note: currently, the plan tree generated for UPDATE/DELETE will always
-  * include a ForeignScan that retrieves ctids (using SELECT FOR UPDATE)
-  * and then the ModifyTable node will have to execute individual remote
-  * UPDATE/DELETE commands.  If there are no local conditions or joins
-  * needed, it'd be better to let the scan node do UPDATE/DELETE RETURNING
-  * and then do nothing at ModifyTable.  Room for future optimization ...
   */
  static List *
  postgresPlanForeignModify(PlannerInfo *root,
--- 1321,1326 ----
***************
*** 1167,1174 **** postgresPlanForeignModify(PlannerInfo *root,
  	List	   *returningList = NIL;
  	List	   *retrieved_attrs = NIL;
  
- 	initStringInfo(&sql);
- 
  	/*
  	 * Core code already has some lock on each rel being planned, so we can
  	 * use NoLock here.
--- 1336,1341 ----
***************
*** 1210,1215 **** postgresPlanForeignModify(PlannerInfo *root,
--- 1377,1405 ----
  	}
  
  	/*
+ 	 * For UPDATE/DELETE, try to push the command down into the remote.
+ 	 */
+ 	if (operation == CMD_UPDATE || operation == CMD_DELETE)
+ 	{
+ 		/* Check to see whether it's safe to push the command down. */
+ 		if (update_is_pushdown_safe(root, plan,
+ 									resultRelation,
+ 									subplan_index,
+ 									rel, targetAttrs))
+ 		{
+ 			List	   *fdw_private;
+ 
+ 			/* OK, modify plan so as to push the command down. */
+ 			fdw_private = push_update_down(root, plan,
+ 										   resultRelation,
+ 										   subplan_index,
+ 										   rel, targetAttrs);
+ 			heap_close(rel, NoLock);
+ 			return fdw_private;
+ 		}
+ 	}
+ 
+ 	/*
  	 * Extract the relevant RETURNING list if any.
  	 */
  	if (plan->returningLists)
***************
*** 1218,1223 **** postgresPlanForeignModify(PlannerInfo *root,
--- 1408,1414 ----
  	/*
  	 * Construct the SQL command string.
  	 */
+ 	initStringInfo(&sql);
  	switch (operation)
  	{
  		case CMD_INSERT:
***************
*** 1288,1293 **** postgresBeginForeignModify(ModifyTableState *mtstate,
--- 1479,1520 ----
  	fmstate = (PgFdwModifyState *) palloc0(sizeof(PgFdwModifyState));
  	fmstate->rel = rel;
  
+ 	/* Deconstruct fdw_private data. */
+ 	fmstate->query = strVal(list_nth(fdw_private,
+ 									 FdwModifyPrivateUpdateSql));
+ 	fmstate->target_attrs = (List *) list_nth(fdw_private,
+ 											  FdwModifyPrivateTargetAttnums);
+ 	fmstate->has_returning = intVal(list_nth(fdw_private,
+ 											 FdwModifyPrivateHasReturning));
+ 	fmstate->retrieved_attrs = (List *) list_nth(fdw_private,
+ 											 FdwModifyPrivateRetrievedAttrs);
+ 
+ 	/*
+ 	 * if query is NULL, we are in update pushdown case.
+ 	 */
+ 	if (fmstate->query == NULL)
+ 	{
+ 		PlanState	   *node = mtstate->mt_plans[subplan_index];
+ 		PgFdwScanState *fsstate;
+ 
+ 		Assert(fmstate->target_attrs == NIL);
+ 		Assert(fmstate->has_returning == false);
+ 		Assert(fmstate->retrieved_attrs == NIL);
+ 
+ 		Assert(nodeTag(node) == T_ForeignScanState);
+ 		fsstate = (PgFdwScanState *) ((ForeignScanState *) node)->fdw_state;
+ 		Assert(fsstate->update_pushdown);
+ 
+ 		fmstate->update_pushdown = true;
+ 		if (fsstate->has_returning)
+ 		{
+ 			fmstate->has_returning = true;
+ 			fmstate->fsstate = fsstate;
+ 		}
+ 		resultRelInfo->ri_FdwState = fmstate;
+ 		return;
+ 	}
+ 
  	/*
  	 * Identify which user to do the remote access as.  This should match what
  	 * ExecCheckRTEPerms() does.
***************
*** 1304,1319 **** postgresBeginForeignModify(ModifyTableState *mtstate,
  	fmstate->conn = GetConnection(server, user, true);
  	fmstate->p_name = NULL;		/* prepared statement not made yet */
  
- 	/* Deconstruct fdw_private data. */
- 	fmstate->query = strVal(list_nth(fdw_private,
- 									 FdwModifyPrivateUpdateSql));
- 	fmstate->target_attrs = (List *) list_nth(fdw_private,
- 											  FdwModifyPrivateTargetAttnums);
- 	fmstate->has_returning = intVal(list_nth(fdw_private,
- 											 FdwModifyPrivateHasReturning));
- 	fmstate->retrieved_attrs = (List *) list_nth(fdw_private,
- 											 FdwModifyPrivateRetrievedAttrs);
- 
  	/* Create context for per-tuple temp workspace. */
  	fmstate->temp_cxt = AllocSetContextCreate(estate->es_query_cxt,
  											  "postgres_fdw temporary data",
--- 1531,1536 ----
***************
*** 1411,1417 **** postgresExecForeignInsert(EState *estate,
  	{
  		n_rows = PQntuples(res);
  		if (n_rows > 0)
! 			store_returning_result(fmstate, slot, res);
  	}
  	else
  		n_rows = atoi(PQcmdTuples(res));
--- 1628,1638 ----
  	{
  		n_rows = PQntuples(res);
  		if (n_rows > 0)
! 			store_returning_result(slot, res, 0,
! 								   fmstate->rel,
! 								   fmstate->attinmeta,
! 								   fmstate->retrieved_attrs,
! 								   fmstate->temp_cxt);
  	}
  	else
  		n_rows = atoi(PQcmdTuples(res));
***************
*** 1442,1447 **** postgresExecForeignUpdate(EState *estate,
--- 1663,1676 ----
  	PGresult   *res;
  	int			n_rows;
  
+ 	/* Just return the slot of the ForeignScan node, if pushing update down */
+ 	if (fmstate->update_pushdown)
+ 	{
+ 		Assert(fmstate->has_returning);
+ 		Assert(fmstate->fsstate->rslot);
+ 		return fmstate->fsstate->rslot;
+ 	}
+ 
  	/* Set up the prepared statement on the remote server, if we didn't yet */
  	if (!fmstate->p_name)
  		prepare_foreign_modify(fmstate);
***************
*** 1481,1487 **** postgresExecForeignUpdate(EState *estate,
  	{
  		n_rows = PQntuples(res);
  		if (n_rows > 0)
! 			store_returning_result(fmstate, slot, res);
  	}
  	else
  		n_rows = atoi(PQcmdTuples(res));
--- 1710,1720 ----
  	{
  		n_rows = PQntuples(res);
  		if (n_rows > 0)
! 			store_returning_result(slot, res, 0,
! 								   fmstate->rel,
! 								   fmstate->attinmeta,
! 								   fmstate->retrieved_attrs,
! 								   fmstate->temp_cxt);
  	}
  	else
  		n_rows = atoi(PQcmdTuples(res));
***************
*** 1512,1517 **** postgresExecForeignDelete(EState *estate,
--- 1745,1758 ----
  	PGresult   *res;
  	int			n_rows;
  
+ 	/* Just return the slot of the ForeignScan node, if pushing update down */
+ 	if (fmstate->update_pushdown)
+ 	{
+ 		Assert(fmstate->has_returning);
+ 		Assert(fmstate->fsstate->rslot);
+ 		return fmstate->fsstate->rslot;
+ 	}
+ 
  	/* Set up the prepared statement on the remote server, if we didn't yet */
  	if (!fmstate->p_name)
  		prepare_foreign_modify(fmstate);
***************
*** 1551,1557 **** postgresExecForeignDelete(EState *estate,
  	{
  		n_rows = PQntuples(res);
  		if (n_rows > 0)
! 			store_returning_result(fmstate, slot, res);
  	}
  	else
  		n_rows = atoi(PQcmdTuples(res));
--- 1792,1802 ----
  	{
  		n_rows = PQntuples(res);
  		if (n_rows > 0)
! 			store_returning_result(slot, res, 0,
! 								   fmstate->rel,
! 								   fmstate->attinmeta,
! 								   fmstate->retrieved_attrs,
! 								   fmstate->temp_cxt);
  	}
  	else
  		n_rows = atoi(PQcmdTuples(res));
***************
*** 1579,1584 **** postgresEndForeignModify(EState *estate,
--- 1824,1833 ----
  	if (fmstate == NULL)
  		return;
  
+ 	/* If pushing update down, nothing to do */
+ 	if (fmstate->update_pushdown)
+ 		return;
+ 
  	/* If we created a prepared statement, destroy it */
  	if (fmstate->p_name)
  	{
***************
*** 1657,1667 **** postgresExplainForeignScan(ForeignScanState *node, ExplainState *es)
  {
  	List	   *fdw_private;
  	char	   *sql;
  
  	if (es->verbose)
  	{
  		fdw_private = ((ForeignScan *) node->ss.ps.plan)->fdw_private;
! 		sql = strVal(list_nth(fdw_private, FdwScanPrivateSelectSql));
  		ExplainPropertyText("Remote SQL", sql, es);
  	}
  }
--- 1906,1922 ----
  {
  	List	   *fdw_private;
  	char	   *sql;
+ 	bool		update_pushdown;
  
  	if (es->verbose)
  	{
  		fdw_private = ((ForeignScan *) node->ss.ps.plan)->fdw_private;
! 		update_pushdown = intVal(list_nth(fdw_private,
! 										  FdwScanPrivateUpdatePushdown));
! 		if (update_pushdown)
! 			sql = strVal(list_nth(fdw_private, FdwScanPrivateUpdateSql));
! 		else
! 			sql = strVal(list_nth(fdw_private, FdwScanPrivateSelectSql));
  		ExplainPropertyText("Remote SQL", sql, es);
  	}
  }
***************
*** 1682,1688 **** postgresExplainForeignModify(ModifyTableState *mtstate,
  		char	   *sql = strVal(list_nth(fdw_private,
  										  FdwModifyPrivateUpdateSql));
  
! 		ExplainPropertyText("Remote SQL", sql, es);
  	}
  }
  
--- 1937,1944 ----
  		char	   *sql = strVal(list_nth(fdw_private,
  										  FdwModifyPrivateUpdateSql));
  
! 		if (sql != NULL)
! 			ExplainPropertyText("Remote SQL", sql, es);
  	}
  }
  
***************
*** 1911,1916 **** ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
--- 2167,2402 ----
  }
  
  /*
+  * Check to see whether it's safe to push an UPDATE/DELETE command down.
+  *
+  * Conditions checked here:
+  *
+  * 1. If the target relation has any row-level local BEFORE/AFTER triggers, we
+  * must not push the command down, since that breaks execution of the triggers.
+  *
+  * 2. If there are any local joins needed, we mustn't push the command down,
+  * because that breaks execution of the joins.
+  *
+  * 3. If there are any quals that can't be evaluated remotely, we mustn't push
+  * the command down, because that breaks evaluation of the quals.
+  *
+  * 4. In UPDATE, if it is unsafe to evaluate any expressions to assign to the
+  * target columns on the remote server, we must not push the command down.
+  */
+ static bool
+ update_is_pushdown_safe(PlannerInfo *root,
+ 						ModifyTable *plan,
+ 						Index resultRelation,
+ 						int subplan_index,
+ 						Relation rel,
+ 						List *targetAttrs)
+ {
+ 	CmdType		operation = plan->operation;
+ 	RelOptInfo *baserel = root->simple_rel_array[resultRelation];
+ 	Plan	   *subplan = (Plan *) list_nth(plan->plans, subplan_index);
+ 	ListCell   *lc;
+ 
+ 	/* Check point 1 */
+ 	if (rel->trigdesc &&
+ 		((operation == CMD_UPDATE &&
+ 		  (rel->trigdesc->trig_update_after_row ||
+ 		   rel->trigdesc->trig_update_before_row)) ||
+ 		 (operation == CMD_DELETE &&
+ 		  (rel->trigdesc->trig_delete_after_row ||
+ 		   rel->trigdesc->trig_delete_before_row))))
+ 		return false;
+ 
+ 	/* Check point 2 */
+ 	if (nodeTag(subplan) != T_ForeignScan)
+ 		return false;
+ 
+ 	/* Check point 3 */
+ 	if (subplan->qual != NIL)
+ 		return false;
+ 
+ 	/* Check point 4 */
+ 	foreach(lc, targetAttrs)
+ 	{
+ 		int			attnum = lfirst_int(lc);
+ 		TargetEntry *tle = get_tle_by_resno(subplan->targetlist,
+ 											attnum);
+ 
+ 		if (!is_foreign_expr(root, baserel, (Expr *) tle->expr))
+ 			return false;
+ 	}
+ 
+ 	return true;
+ }
+ 
+ /*
+  * Modify a plan so as to push an UPDATE/DELETE command down.
+  */
+ static List *
+ push_update_down(PlannerInfo *root,
+ 				 ModifyTable *plan,
+ 				 Index resultRelation,
+ 				 int subplan_index,
+ 				 Relation rel,
+ 				 List *targetAttrs)
+ {
+ 	CmdType		operation = plan->operation;
+ 	bool		canSetTag = plan->canSetTag;
+ 	Plan	   *subplan = (Plan *) list_nth(plan->plans, subplan_index);
+ 	ForeignScan *fscan = (ForeignScan *) subplan;
+ 	StringInfoData sql;
+ 	List	   *remote_conds;
+ 	List	   *returningList = NIL;
+ 	List	   *retrieved_attrs = NIL;
+ 	List	   *fdw_private;
+ 	Value	   *v;
+ 
+ 	Assert(operation == CMD_UPDATE || operation == CMD_DELETE);
+ 
+ 	initStringInfo(&sql);
+ 
+ 	/*
+ 	 * Extract the baserestrictinfo clauses that can be evaluated remotely.
+ 	 */
+ 	remote_conds = (List *) list_nth(fscan->fdw_private,
+ 									 FdwScanPrivateRemoteConds);
+ 
+ 	/*
+ 	 * Extract the relevant RETURNING list if any.
+ 	 */
+ 	if (plan->returningLists)
+ 		returningList = (List *) list_nth(plan->returningLists, subplan_index);
+ 
+ 	/*
+ 	 * Construct the SQL command string.
+ 	 */
+ 	if (operation == CMD_UPDATE)
+ 	{
+ 		List	   *targetlist = subplan->targetlist;
+ 
+ 		deparseDirectUpdateSql(&sql, root, resultRelation, rel,
+ 							   remote_conds,
+ 							   targetlist,
+ 							   targetAttrs,
+ 							   returningList,
+ 							   &retrieved_attrs);
+ 
+ 		/*
+ 		 * Rrewrite targetlist for safety of ExecProject.
+ 		 */
+ 		subplan->targetlist = rewrite_targetlist(resultRelation,
+ 												 rel,
+ 												 targetlist,
+ 												 targetAttrs);
+ 	}
+ 	else
+ 	{
+ 		Assert(operation == CMD_DELETE);
+ 
+ 		deparseDirectDeleteSql(&sql, root, resultRelation, rel,
+ 							   remote_conds,
+ 							   returningList,
+ 							   &retrieved_attrs);
+ 	}
+ 
+ 	/*
+ 	 * Update the fdw_private list that will be available to the executor.
+ 	 * Items in the list must match enum FdwScanPrivateIndex, above.
+ 	 */
+ 	fscan->fdw_private = lappend(fscan->fdw_private, makeString(sql.data));
+ 	fscan->fdw_private = lappend(fscan->fdw_private, makeInteger(canSetTag));
+ 	fscan->fdw_private = lappend(fscan->fdw_private,
+ 								 makeInteger((retrieved_attrs != NIL)));
+ 	fscan->fdw_private = lappend(fscan->fdw_private, retrieved_attrs);
+ 
+ 	/*
+ 	 * Also set the UPDATE/DELETE-pushdown flag in the fdw_private list.
+ 	 */
+ 	v = (Value *) list_nth(fscan->fdw_private, FdwScanPrivateUpdatePushdown);
+ 	v->val.ival = true;
+ 
+ 	/*
+ 	 * Build the fdw_private list that will be available to the executor.
+ 	 * Items in the list must match enum FdwModifyPrivateIndex, above.
+ 	 */
+ 	fdw_private = list_make4(makeString(NULL), NIL, makeInteger(false), NIL);
+ 
+ 	return fdw_private;
+ }
+ 
+ /*
+  * Rrewrite the targetlist of an UPDATE.
+  *
+  * This is called for safety of ExecProject if pushing the command down.
+  */
+ static List *
+ rewrite_targetlist(Index resultRelation, Relation rel,
+ 				   List *targetlist, List *targetAttrs)
+ {
+ 	List	   *new_tlist = NIL;
+ 	int			numattrs = RelationGetNumberOfAttributes(rel);
+ 	int			attrno = 1;
+ 	ListCell   *lc;
+ 
+ 	foreach(lc, targetlist)
+ 	{
+ 		TargetEntry *tle = (TargetEntry *) lfirst(lc);
+ 
+ 		if (tle->resjunk)
+ 		{
+ 			new_tlist = lappend(new_tlist, tle);
+ 			continue;
+ 		}
+ 
+ 		if (attrno > numattrs)
+ 			ereport(ERROR,
+ 					(errcode(ERRCODE_DATATYPE_MISMATCH),
+ 					 errmsg("table row type and query-specified row type do not match"),
+ 					 errdetail("Query has too many columns.")));
+ 
+ 		if (!list_member_int(targetAttrs, attrno))
+ 			new_tlist = lappend(new_tlist, tle);
+ 		else
+ 		{
+ 			Form_pg_attribute attr = rel->rd_att->attrs[attrno - 1];
+ 			Oid			atttype;
+ 			int32		atttypmod;
+ 			Oid			attcollation;
+ 			Node	   *new_expr;
+ 			TargetEntry *new_tle;
+ 
+ 			Assert(!attr->attisdropped);
+ 			atttype = attr->atttypid;
+ 			atttypmod = attr->atttypmod;
+ 			attcollation = attr->attcollation;
+ 
+ 			new_expr = (Node *) makeVar(resultRelation,
+ 										attrno,
+ 										atttype,
+ 										atttypmod,
+ 										attcollation,
+ 										0);
+ 
+ 			new_tle = makeTargetEntry((Expr *) new_expr,
+ 									  attrno,
+ 									  pstrdup(NameStr(attr->attname)),
+ 									  false);
+ 
+ 			new_tlist = lappend(new_tlist, new_tle);
+ 		}
+ 
+ 		attrno++;
+ 	}
+ 
+ 	if (attrno != numattrs + 1)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_DATATYPE_MISMATCH),
+ 				 errmsg("table row type and query-specified row type do not match"),
+ 				 errdetail("Query has too few columns.")));
+ 
+ 	return new_tlist;
+ }
+ 
+ /*
   * Create cursor for node's query with current parameter values.
   */
  static void
***************
*** 2258,2275 **** convert_prep_stmt_params(PgFdwModifyState *fmstate,
   * have PG_TRY blocks to ensure this happens.
   */
  static void
! store_returning_result(PgFdwModifyState *fmstate,
! 					   TupleTableSlot *slot, PGresult *res)
  {
  	PG_TRY();
  	{
  		HeapTuple	newtup;
  
! 		newtup = make_tuple_from_result_row(res, 0,
! 											fmstate->rel,
! 											fmstate->attinmeta,
! 											fmstate->retrieved_attrs,
! 											fmstate->temp_cxt);
  		/* tuple will be deleted when it is cleared from the slot */
  		ExecStoreTuple(newtup, slot, InvalidBuffer, true);
  	}
--- 2744,2766 ----
   * have PG_TRY blocks to ensure this happens.
   */
  static void
! store_returning_result(TupleTableSlot *slot,
! 					   PGresult *res,
! 					   int row,
! 					   Relation rel,
! 					   AttInMetadata *attinmeta,
! 					   List *retrieved_attrs,
! 					   MemoryContext temp_context)
  {
  	PG_TRY();
  	{
  		HeapTuple	newtup;
  
! 		newtup = make_tuple_from_result_row(res, row,
! 											rel,
! 											attinmeta,
! 											retrieved_attrs,
! 											temp_context);
  		/* tuple will be deleted when it is cleared from the slot */
  		ExecStoreTuple(newtup, slot, InvalidBuffer, true);
  	}
*** a/contrib/postgres_fdw/postgres_fdw.h
--- b/contrib/postgres_fdw/postgres_fdw.h
***************
*** 66,75 **** extern void deparseUpdateSql(StringInfo buf, PlannerInfo *root,
--- 66,87 ----
  				 Index rtindex, Relation rel,
  				 List *targetAttrs, List *returningList,
  				 List **retrieved_attrs);
+ extern void deparseDirectUpdateSql(StringInfo buf, PlannerInfo *root,
+ 								   Index rtindex, Relation rel,
+ 								   List	*remote_conds,
+ 								   List	*targetlist,
+ 								   List *targetAttrs,
+ 								   List *returningList,
+ 								   List **retrieved_attrs);
  extern void deparseDeleteSql(StringInfo buf, PlannerInfo *root,
  				 Index rtindex, Relation rel,
  				 List *returningList,
  				 List **retrieved_attrs);
+ extern void deparseDirectDeleteSql(StringInfo buf, PlannerInfo *root,
+ 								   Index rtindex, Relation rel,
+ 								   List	*remote_conds,
+ 								   List *returningList,
+ 								   List **retrieved_attrs);
  extern void deparseAnalyzeSizeSql(StringInfo buf, Relation rel);
  extern void deparseAnalyzeSql(StringInfo buf, Relation rel,
  				  List **retrieved_attrs);
*** a/contrib/postgres_fdw/sql/postgres_fdw.sql
--- b/contrib/postgres_fdw/sql/postgres_fdw.sql
***************
*** 320,339 **** INSERT INTO ft2 (c1,c2,c3) SELECT c1+1000,c2+100, c3 || c3 FROM ft2 LIMIT 20;
  INSERT INTO ft2 (c1,c2,c3)
    VALUES (1101,201,'aaa'), (1102,202,'bbb'), (1103,203,'ccc') RETURNING *;
  INSERT INTO ft2 (c1,c2,c3) VALUES (1104,204,'ddd'), (1105,205,'eee');
  UPDATE ft2 SET c2 = c2 + 300, c3 = c3 || '_update3' WHERE c1 % 10 = 3;
  UPDATE ft2 SET c2 = c2 + 400, c3 = c3 || '_update7' WHERE c1 % 10 = 7 RETURNING *;
  EXPLAIN (verbose, costs off)
  UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
!   FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;
  UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
    FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;
  EXPLAIN (verbose, costs off)
!   DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
  DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
  EXPLAIN (verbose, costs off)
! DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;
  DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;
  SELECT c1,c2,c3,c4 FROM ft2 ORDER BY c1;
  
  -- Test that trigger on remote table works as expected
  CREATE OR REPLACE FUNCTION "S 1".F_BRTRIG() RETURNS trigger AS $$
--- 320,346 ----
  INSERT INTO ft2 (c1,c2,c3)
    VALUES (1101,201,'aaa'), (1102,202,'bbb'), (1103,203,'ccc') RETURNING *;
  INSERT INTO ft2 (c1,c2,c3) VALUES (1104,204,'ddd'), (1105,205,'eee');
+ EXPLAIN (verbose, costs off)
+ UPDATE ft2 SET c2 = c2 + 300, c3 = c3 || '_update3' WHERE c1 % 10 = 3;              -- can be pushed down
  UPDATE ft2 SET c2 = c2 + 300, c3 = c3 || '_update3' WHERE c1 % 10 = 3;
+ EXPLAIN (verbose, costs off)
+ UPDATE ft2 SET c2 = c2 + 400, c3 = c3 || '_update7' WHERE c1 % 10 = 7 RETURNING *;  -- can be pushed down
  UPDATE ft2 SET c2 = c2 + 400, c3 = c3 || '_update7' WHERE c1 % 10 = 7 RETURNING *;
  EXPLAIN (verbose, costs off)
  UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
!   FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;                               -- can't be pushed down
  UPDATE ft2 SET c2 = ft2.c2 + 500, c3 = ft2.c3 || '_update9', c7 = DEFAULT
    FROM ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 9;
  EXPLAIN (verbose, costs off)
!   DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;                               -- can be pushed down
  DELETE FROM ft2 WHERE c1 % 10 = 5 RETURNING c1, c4;
  EXPLAIN (verbose, costs off)
! DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;                -- can't be pushed down
  DELETE FROM ft2 USING ft1 WHERE ft1.c1 = ft2.c2 AND ft1.c1 % 10 = 2;
  SELECT c1,c2,c3,c4 FROM ft2 ORDER BY c1;
+ EXPLAIN (verbose, costs off)
+ UPDATE ft2 SET c7 = DEFAULT WHERE c1 % 10 = 0 AND date(c4) = '1970-01-01'::date;    -- can't be pushed down
+ UPDATE ft2 SET c7 = DEFAULT WHERE c1 % 10 = 0 AND date(c4) = '1970-01-01'::date;
  
  -- Test that trigger on remote table works as expected
  CREATE OR REPLACE FUNCTION "S 1".F_BRTRIG() RETURNS trigger AS $$
***************
*** 616,621 **** UPDATE rem1 SET f2 = 'testo';
--- 623,712 ----
  -- Test returning a system attribute
  INSERT INTO rem1(f2) VALUES ('test') RETURNING ctid;
  
+ -- cleanup
+ DROP TRIGGER trig_row_before ON rem1;
+ DROP TRIGGER trig_row_after ON rem1;
+ DROP TRIGGER trig_local_before ON loc1;
+ 
+ 
+ -- Test update-pushdown functionality
+ 
+ -- Test with statement-level triggers
+ CREATE TRIGGER trig_stmt_before
+ 	BEFORE DELETE OR INSERT OR UPDATE ON rem1
+ 	FOR EACH STATEMENT EXECUTE PROCEDURE trigger_func();
+ EXPLAIN (verbose, costs off)
+ UPDATE rem1 set f2 = '';          -- can be pushed down
+ EXPLAIN (verbose, costs off)
+ DELETE FROM rem1;                 -- can be pushed down
+ DROP TRIGGER trig_stmt_before ON rem1;
+ 
+ CREATE TRIGGER trig_stmt_after
+ 	AFTER DELETE OR INSERT OR UPDATE ON rem1
+ 	FOR EACH STATEMENT EXECUTE PROCEDURE trigger_func();
+ EXPLAIN (verbose, costs off)
+ UPDATE rem1 set f2 = '';          -- can be pushed down
+ EXPLAIN (verbose, costs off)
+ DELETE FROM rem1;                 -- can be pushed down
+ DROP TRIGGER trig_stmt_after ON rem1;
+ 
+ -- Test with row-level ON INSERT triggers
+ CREATE TRIGGER trig_row_before_insert
+ BEFORE INSERT ON rem1
+ FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
+ EXPLAIN (verbose, costs off)
+ UPDATE rem1 set f2 = '';          -- can be pushed down
+ EXPLAIN (verbose, costs off)
+ DELETE FROM rem1;                 -- can be pushed down
+ DROP TRIGGER trig_row_before_insert ON rem1;
+ 
+ CREATE TRIGGER trig_row_after_insert
+ AFTER INSERT ON rem1
+ FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
+ EXPLAIN (verbose, costs off)
+ UPDATE rem1 set f2 = '';          -- can be pushed down
+ EXPLAIN (verbose, costs off)
+ DELETE FROM rem1;                 -- can be pushed down
+ DROP TRIGGER trig_row_after_insert ON rem1;
+ 
+ -- Test with row-level ON UPDATE triggers
+ CREATE TRIGGER trig_row_before_update
+ BEFORE UPDATE ON rem1
+ FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
+ EXPLAIN (verbose, costs off)
+ UPDATE rem1 set f2 = '';          -- can't be pushed down
+ EXPLAIN (verbose, costs off)
+ DELETE FROM rem1;                 -- can be pushed down
+ DROP TRIGGER trig_row_before_update ON rem1;
+ 
+ CREATE TRIGGER trig_row_after_update
+ AFTER UPDATE ON rem1
+ FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
+ EXPLAIN (verbose, costs off)
+ UPDATE rem1 set f2 = '';          -- can't be pushed down
+ EXPLAIN (verbose, costs off)
+ DELETE FROM rem1;                 -- can be pushed down
+ DROP TRIGGER trig_row_after_update ON rem1;
+ 
+ -- Test with row-level ON DELETE triggers
+ CREATE TRIGGER trig_row_before_delete
+ BEFORE DELETE ON rem1
+ FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
+ EXPLAIN (verbose, costs off)
+ UPDATE rem1 set f2 = '';          -- can be pushed down
+ EXPLAIN (verbose, costs off)
+ DELETE FROM rem1;                 -- can't be pushed down
+ DROP TRIGGER trig_row_before_delete ON rem1;
+ 
+ CREATE TRIGGER trig_row_after_delete
+ AFTER DELETE ON rem1
+ FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
+ EXPLAIN (verbose, costs off)
+ UPDATE rem1 set f2 = '';          -- can be pushed down
+ EXPLAIN (verbose, costs off)
+ DELETE FROM rem1;                 -- can't be pushed down
+ DROP TRIGGER trig_row_after_delete ON rem1;
+ 
  -- ===================================================================
  -- test IMPORT FOREIGN SCHEMA
  -- ===================================================================
*** a/doc/src/sgml/postgres-fdw.sgml
--- b/doc/src/sgml/postgres-fdw.sgml
***************
*** 414,419 ****
--- 414,428 ----
     <literal>WHERE</> clauses are not sent to the remote server unless they use
     only built-in data types, operators, and functions.  Operators and
     functions in the clauses must be <literal>IMMUTABLE</> as well.
+    For an <command>UPDATE</> or <command>DELETE</> query,
+    <filename>postgres_fdw</> attempts to optimize the query execution by
+    sending the whole query to the remote server if there are no query
+    <literal>WHERE</> clauses that cannot be sent to the remote server,
+    no local joins for the query, or no row-level local <literal>BEFORE</> or
+    <literal>AFTER</> triggers on the target table.  In <command>UPDATE</>,
+    expressions to assign to target columns must use only built-in data types,
+    <literal>IMMUTABLE</> operators, and <literal>IMMUTABLE</> functions,
+    to reduce the risk of misexecution of the query.
    </para>
  
    <para>
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to