Re: [HACKERS] Minor comment update in setrefs.c

2016-01-15 Thread Etsuro Fujita

On 2015/12/11 2:21, Robert Haas wrote:

On Tue, Dec 8, 2015 at 6:16 AM, Etsuro Fujita
 wrote:

Attached is a small patch to adjust a comment in setrefs.c; in
set_foreignscan_references, fdw_recheck_quals also gets adjusted to
reference foreign scan tuple, in case of a foreign join, so I added
"etc.", to a comment there, as the comment in case of a simple foreign
table scan.



Doesn't apply any more.  I suppose we could sync up the similar
comments in set_customscan_references() too.  But to be honest I'm not
sure this is adding any clarity.  "etc." may not be the least
informative thing you can put in a comment, but it's pretty close.


The point in the previous patch was to update the list of expressions to 
be adjusted for the case of scanrelid=0 like that for the case of 
scanrelid>0 case in set_foreignscan_references.  So, I'd like to propose 
to add *fdw_recheck_quals* to both lists, then.  Updated patch attached.


Best regards,
Etsuro Fujita
*** a/src/backend/optimizer/plan/setrefs.c
--- b/src/backend/optimizer/plan/setrefs.c
***
*** 1108,1114  set_foreignscan_references(PlannerInfo *root,
  
  	if (fscan->fdw_scan_tlist != NIL || fscan->scan.scanrelid == 0)
  	{
! 		/* Adjust tlist, qual, fdw_exprs to reference foreign scan tuple */
  		indexed_tlist *itlist = build_tlist_index(fscan->fdw_scan_tlist);
  
  		fscan->scan.plan.targetlist = (List *)
--- 1108,1117 
  
  	if (fscan->fdw_scan_tlist != NIL || fscan->scan.scanrelid == 0)
  	{
! 		/*
! 		 * Adjust tlist, qual, fdw_exprs, fdw_recheck_quals to reference
! 		 * foreign scan tuple
! 		 */
  		indexed_tlist *itlist = build_tlist_index(fscan->fdw_scan_tlist);
  
  		fscan->scan.plan.targetlist = (List *)
***
*** 1142,1148  set_foreignscan_references(PlannerInfo *root,
  	}
  	else
  	{
! 		/* Adjust tlist, qual, fdw_exprs, etc. in the standard way */
  		fscan->scan.plan.targetlist =
  			fix_scan_list(root, fscan->scan.plan.targetlist, rtoffset);
  		fscan->scan.plan.qual =
--- 1145,1154 
  	}
  	else
  	{
! 		/*
! 		 * Adjust tlist, qual, fdw_exprs, fdw_recheck_quals in the standard
! 		 * way
! 		 */
  		fscan->scan.plan.targetlist =
  			fix_scan_list(root, fscan->scan.plan.targetlist, rtoffset);
  		fscan->scan.plan.qual =

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minor code improvements to create_foreignscan_plan/ExecInitForeignScan

2016-01-15 Thread Etsuro Fujita

On 2016/01/12 18:00, Etsuro Fujita wrote:

On 2016/01/12 2:36, Alvaro Herrera wrote:

I wonder,



--- 2166,2213 
   }

   /*
!  * If rel is a base relation, detect whether any system columns
are
!  * requested from the rel.  (If rel is a join relation,
rel->relid will be
!  * 0, but there can be no Var in the target list with relid 0,
so we skip
!  * this in that case.  Note that any such system columns are
assumed to be
!  * contained in fdw_scan_tlist, so we never need fsSystemCol to
be true in
!  * the joinrel case.)  This is a bit of a kluge and might go
away someday,
!  * so we intentionally leave it out of the API presented to FDWs.
*/
! scan_plan->fsSystemCol = false;
! if (scan_relid > 0)
   {
! Bitmapset  *attrs_used = NULL;
! ListCell   *lc;
! inti;

! /*
!  * First, examine all the attributes needed for joins or
final output.
!  * Note: we must look at reltargetlist, not the attr_needed
data,
!  * because attr_needed isn't computed for inheritance child
rels.
!  */
! pull_varattnos((Node *) rel->reltargetlist, scan_relid,
&attrs_used);

! /* Add all the attributes used by restriction clauses. */
! foreach(lc, rel->baserestrictinfo)
   {
! RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
!
! pull_varattnos((Node *) rinfo->clause, scan_relid,
&attrs_used);
   }

! /* Now, are any system columns requested from rel? */
! for (i = FirstLowInvalidHeapAttributeNumber + 1; i < 0; i++)
! {
! if (bms_is_member(i -
FirstLowInvalidHeapAttributeNumber, attrs_used))
! {
! scan_plan->fsSystemCol = true;
! break;
! }
! }
!
! bms_free(attrs_used);
! }

   return scan_plan;
   }



Would it make sense to call pull_varattnos(reltargetlist), then walk the
bitmapset and break if we see a system column, then call
pull_varattnos() on the rinfo->clause?  That way, if the targetlist
request a system column we don't have to walk the RestrictInfos.



Seems like a good idea.  Will update the patch.


Done.  Attached is an updated version of the patch.

Best regards,
Etsuro Fujita
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
***
*** 2097,2106  create_foreignscan_plan(PlannerInfo *root, ForeignPath *best_path,
  	RelOptInfo *rel = best_path->path.parent;
  	Index		scan_relid = rel->relid;
  	Oid			rel_oid = InvalidOid;
- 	Bitmapset  *attrs_used = NULL;
  	Plan	   *outer_plan = NULL;
- 	ListCell   *lc;
- 	int			i;
  
  	Assert(rel->fdwroutine != NULL);
  
--- 2097,2103 
***
*** 2169,2204  create_foreignscan_plan(PlannerInfo *root, ForeignPath *best_path,
  	}
  
  	/*
! 	 * Detect whether any system columns are requested from rel.  This is a
! 	 * bit of a kluge and might go away someday, so we intentionally leave it
! 	 * out of the API presented to FDWs.
! 	 *
! 	 * First, examine all the attributes needed for joins or final output.
! 	 * Note: we must look at reltargetlist, not the attr_needed data, because
! 	 * attr_needed isn't computed for inheritance child rels.
  	 */
! 	pull_varattnos((Node *) rel->reltargetlist, rel->relid, &attrs_used);
! 
! 	/* Add all the attributes used by restriction clauses. */
! 	foreach(lc, rel->baserestrictinfo)
  	{
! 		RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
  
! 		pull_varattnos((Node *) rinfo->clause, rel->relid, &attrs_used);
! 	}
  
! 	/* Now, are any system columns requested from rel? */
! 	scan_plan->fsSystemCol = false;
! 	for (i = FirstLowInvalidHeapAttributeNumber + 1; i < 0; i++)
! 	{
! 		if (bms_is_member(i - FirstLowInvalidHeapAttributeNumber, attrs_used))
  		{
! 			scan_plan->fsSystemCol = true;
! 			break;
  		}
- 	}
  
! 	bms_free(attrs_used);
  
  	return scan_plan;
  }
--- 2166,2229 
  	}
  
  	/*
! 	 * If rel is a base relation, detect whether any system columns are
! 	 * requested from the rel.  (If rel is a join relation, rel->relid will be
! 	 * 0, but there can be no Var with relid 0 in the reltargetlist or the
! 	 * restriction clauses, so we skip this in that case.  Note that any such
! 	 * columns in base relations that were joined are assumed to be contained
! 	 * in fdw_scan_tlist.)  This is a bit of a kluge and might go away someday,
! 	 * so we intentionally leave it out of the API presented to FDWs.
  	 */
! 	scan_plan->fsSystemCol = false;
! 	if (scan_relid > 0)
  	{
! 		Bitmapset  *attrs_used = NULL;
! 		ListCell   *lc;
! 		int			i;
  
! 		/*
! 		 * First, examine all the attributes needed for joins or final output.
! 		 * Note: we must look at reltargetlist, not the attr_needed data,
! 		 * because attr_needed isn't computed fo

Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW

2016-01-14 Thread Etsuro Fujita

On 2016/01/14 21:36, Rushabh Lathia wrote:

On Thu, Jan 14, 2016 at 2:00 PM, Etsuro Fujita
mailto:fujita.ets...@lab.ntt.co.jp>> wrote:



On 2016/01/12 20:31, Rushabh Lathia wrote:



On Thu, Jan 7, 2016 at 6:15 PM, Etsuro Fujita
mailto:fujita.ets...@lab.ntt.co.jp>
<mailto:fujita.ets...@lab.ntt.co.jp
<mailto:fujita.ets...@lab.ntt.co.jp>>> wrote:
 On 2016/01/06 18:58, Rushabh Lathia wrote:
 .) What the need of following change ?

 @@ -833,9 +833,6 @@ appendWhereClause(StringInfo buf,
   int nestlevel;
   ListCell   *lc;

 -   if (params)
 -   *params = NIL;  /* initialize result
list to
 empty */
 -
   /* Set up context struct for recursion */
   context.root = root;
   context.foreignrel = baserel;
 @@ -971,6 +968,63 @@ deparseUpdateSql(StringInfo buf,
 PlannerInfo *root,
}



 It is needed for deparsePushedDownUpdateSql to store params
in both
 WHERE clauses and expressions to assign to the target columns
 into one params_list list.



Hmm sorry but I am still not getting the point, can you provide some
example to explain this ?



Sorry, maybe my explanation was not enough.  Consider:

postgres=# create foreign table ft1 (a int, b int) server myserver
options (table_name 't1');
postgres=# insert into ft1 values (0, 0);
postgres=# prepare mt(int, int) as update ft1 set a = $1 where b = $2;
postgres=# explain verbose execute mt(1, 0);
postgres=# explain verbose execute mt(1, 0);
postgres=# explain verbose execute mt(1, 0);
postgres=# explain verbose execute mt(1, 0);
postgres=# explain verbose execute mt(1, 0);

After the 5 executions of mt we have

postgres=# explain verbose execute mt(1, 0);
  QUERY PLAN


  Update on public.ft1  (cost=100.00..140.35 rows=12 width=10)
->  Foreign Update on public.ft1  (cost=100.00..140.35 rows=12
width=10)
  Remote SQL: UPDATE public.t1 SET a = $1::integer WHERE ((b
= $2::integer))
(3 rows)

If we do that initialization in appendWhereClause, we would get a
wrong params_list list and a wrong remote pushed-down query for the
last mt() in deparsePushedDownUpdateSql.



Strange, I am seeing same behaviour with or without that initialization in
appendWhereClause. After the 5 executions of mt I with or without I am
getting following output:

postgres=# explain verbose execute mt(1, 0);
  QUERY PLAN

  Update on public.ft2  (cost=100.00..140.35 rows=12 width=10)
->  Foreign Update on public.ft2  (cost=100.00..140.35 rows=12 width=10)
  Remote SQL: UPDATE public.t2 SET a = $1::integer WHERE ((b =
$2::integer))
(3 rows)


Really?  With that initialization in appendWhereClause, I got the 
following wrong result (note that both parameter numbers are $1):


postgres=# explain verbose execute mt(1, 0);
 QUERY PLAN

 Update on public.ft1  (cost=100.00..140.35 rows=12 width=10)
   ->  Foreign Update on public.ft1  (cost=100.00..140.35 rows=12 width=10)
 Remote SQL: UPDATE public.t1 SET a = $1::integer WHERE ((b = 
$1::integer))

(3 rows)


BTW, I keep a ForeignScan node pushing down an update to the remote
server, in the updated patches.  I have to admit that that seems
like rather a misnomer.  So, it might be worth adding a new
ForeignUpdate node, but my concern about that is that if doing so,
we would have a lot of duplicate code in ForeignUpdate and
ForeignScan.  What do you think about that?



Yes, I noticed that in the patch and I was about to point that out in my
final review. As first review I was mainly focused on the functionality
testing
and other overview things. Another reason I haven't posted that in my
first review round is, I was not quite sure whether we need the
separate new node ForeignUpdate, ForeignDelete  and want to duplicate
code? Was also not quite sure about the fact that what we will achieve
by doing that.

So I thought, I will have this open question in my final review comment,
and will take committer's opinion on this. Since you already raised this
question lets take others opinion on this.


OK, let's do that.

Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To ma

Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW

2016-01-14 Thread Etsuro Fujita

On 2016/01/12 20:31, Rushabh Lathia wrote:

On Thu, Jan 7, 2016 at 6:15 PM, Etsuro Fujita
mailto:fujita.ets...@lab.ntt.co.jp>> wrote:
On 2016/01/06 18:58, Rushabh Lathia wrote:
.) What the need of following change ?

@@ -833,9 +833,6 @@ appendWhereClause(StringInfo buf,
  int nestlevel;
  ListCell   *lc;

-   if (params)
-   *params = NIL;  /* initialize result list to
empty */
-
  /* Set up context struct for recursion */
  context.root = root;
  context.foreignrel = baserel;
@@ -971,6 +968,63 @@ deparseUpdateSql(StringInfo buf,
PlannerInfo *root,
   }



It is needed for deparsePushedDownUpdateSql to store params in both
WHERE clauses and expressions to assign to the target columns
into one params_list list.



Hmm sorry but I am still not getting the point, can you provide some
example to explain this ?


Sorry, maybe my explanation was not enough.  Consider:

postgres=# create foreign table ft1 (a int, b int) server myserver 
options (table_name 't1');

postgres=# insert into ft1 values (0, 0);
postgres=# prepare mt(int, int) as update ft1 set a = $1 where b = $2;
postgres=# explain verbose execute mt(1, 0);
postgres=# explain verbose execute mt(1, 0);
postgres=# explain verbose execute mt(1, 0);
postgres=# explain verbose execute mt(1, 0);
postgres=# explain verbose execute mt(1, 0);

After the 5 executions of mt we have

postgres=# explain verbose execute mt(1, 0);
 QUERY PLAN

 Update on public.ft1  (cost=100.00..140.35 rows=12 width=10)
   ->  Foreign Update on public.ft1  (cost=100.00..140.35 rows=12 width=10)
 Remote SQL: UPDATE public.t1 SET a = $1::integer WHERE ((b = 
$2::integer))

(3 rows)

If we do that initialization in appendWhereClause, we would get a wrong 
params_list list and a wrong remote pushed-down query for the last mt() 
in deparsePushedDownUpdateSql.



.) When Tom Lane and Stephen Frost suggested getting the core
code involved,
I thought that we can do the mandatory checks into core it self
and making
completely out of dml_is_pushdown_safe(). Please correct me



The reason why I put that function in postgres_fdw.c is Check point 4:

+  * 4. We can't push an UPDATE down, if any expressions to assign
to the target
+  * columns are unsafe to evaluate on the remote server.



Here I was talking about checks related to triggers, or to LIMIT. I think
earlier thread talked about those mandatory check to the core. So may
be we can move those checks into make_modifytable() before calling
the PlanDMLPushdown.


Noticed that.  Will do.

BTW, I keep a ForeignScan node pushing down an update to the remote 
server, in the updated patches.  I have to admit that that seems like 
rather a misnomer.  So, it might be worth adding a new ForeignUpdate 
node, but my concern about that is that if doing so, we would have a lot 
of duplicate code in ForeignUpdate and ForeignScan.  What do you think 
about that?


Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FDW join pushdown and scanclauses

2016-01-13 Thread Etsuro Fujita

On 2016/01/08 22:05, Ashutosh Bapat wrote:

In add_paths_to_joinrel(), the FDW specific hook GetForeignJoinPaths()
is called. This hook if implemented should add ForeignPaths for pushed
down joins. create_plan_recurse() calls create_scan_plan() on seeing
these paths.



create_scan_plan() generates a list of clauses to be applied on scan
from rel->baserestrictinfo and parameterization clauses. This list is
passed to create_*scan_plan routine as scanclauses. This code is very
specific for single relations scans. Now that we are using
create_scan_plan() for creating plan for join relations, it needs some
changes so that quals relevant to a join can be passed to
create_foreignscan_plan().


Do we really need that?  The relevant join quals are passed to 
GetForeignJoinPaths as extra->restrictlist, so I think we can get those 
quals during GetForeignPlan, by looking at the selected ForeignPath that 
is passed to that function as a parameter.



A related problem is in
create_foreignscan_plan(), which sets ForeignScan::fsSystemCol if a
system column is being used in the targetlist or quals. Right now it
only checks rel->baserestrictinfo, which is NULL for a joinrel. Thus in
case a system column appears in the joinclauses it will not be considered.


IIUC, we assume that such system columns are assumed to be contained in 
fdw_scan_tlist in the joinrel case.


Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Odd behavior in foreign table modification (Was: Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW)

2016-01-12 Thread Etsuro Fujita

On 2016/01/12 20:36, Thom Brown wrote:

On 8 January 2016 at 05:08, Etsuro Fujita  wrote:



On 2016/01/06 20:37, Thom Brown wrote:

I've run into an issue:

*# UPDATE master_customers SET id = 22 WHERE id = 16 RETURNING
tableoid::regclass;
ERROR:
CONTEXT:  Remote SQL command: UPDATE public.customers SET id = 22
WHERE ((id = 16)) RETURNING NULL



While working on this, I noticed that the existing postgres_fdw system shows
similar behavior, so I changed the subject.

IIUC, the reason for that is when the local query specifies "RETURNING
tableoid::regclass", the FDW has fmstate->has_returning=false while the
remote query executed at ModifyTable has "RETURNING NULL", as shown in the
above example; that would cause an abnormal exit in executing the remote
query in postgresExecForeignUpdate, since that the FDW would get
PGRES_TUPLES_OK as a result of the query while the FDW would think that the
right result to get should be PGRES_COMMAND_OK, from the flag
fmstate->has_returning=false.



Attached is a patch to fix that.



I can't apply this patch in tandem with FDW DML pushdown patch (either
v2 or v3).


That patch is for fixing the similar issue in the existing postgres_fdw 
system.  So, please apply that patch without the DML pushdown patch.  If 
that patch is reasonable as a fix for the issue, I'll update the DML 
pushdown patch (v3) on top of that patch.


Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW

2016-01-12 Thread Etsuro Fujita

On 2016/01/07 21:45, Etsuro Fujita wrote:

On 2016/01/06 18:58, Rushabh Lathia wrote:

.) Documentation for the new API is missing (fdw-callbacks).



Will add the docs.


I added docs for new FDW APIs.

Other changes:

* Rename relation_has_row_level_triggers to relation_has_row_triggers 
shortly, and move it to rewriteHandler.c.  I'm not sure rewriteHandler.c 
is a good place for that, though.


* Revise code, including a helper function get_result_result, whcih I 
implemented using a modified version of store_returning_result in the 
previous patch, but on second thought, I think that that is a bit too 
invasive.  So, I re-implemented that function directly using 
make_tuple_from_result_row.


* Add more comments.

* Add more regression tests.

Attached is an updated version of the patch.  Comments are wellcome! 
(If the fix [1] is okay, I'd like to update this patch on top of the 
patch in [1].)


Best regards,
Etsuro Fujita

[1] http://www.postgresql.org/message-id/568f4430.6060...@lab.ntt.co.jp
*** a/contrib/postgres_fdw/deparse.c
--- b/contrib/postgres_fdw/deparse.c
***
*** 816,822  deparseTargetList(StringInfo buf,
   *
   * If params is not NULL, it receives a list of Params and other-relation Vars
   * used in the clauses; these values must be transmitted to the remote server
!  * as parameter values.
   *
   * If params is NULL, we're generating the query for EXPLAIN purposes,
   * so Params and other-relation Vars should be replaced by dummy values.
--- 816,822 
   *
   * If params is not NULL, it receives a list of Params and other-relation Vars
   * used in the clauses; these values must be transmitted to the remote server
!  * as parameter values.  Caller is responsible for initializing it to empty.
   *
   * If params is NULL, we're generating the query for EXPLAIN purposes,
   * so Params and other-relation Vars should be replaced by dummy values.
***
*** 833,841  appendWhereClause(StringInfo buf,
  	int			nestlevel;
  	ListCell   *lc;
  
- 	if (params)
- 		*params = NIL;			/* initialize result list to empty */
- 
  	/* Set up context struct for recursion */
  	context.root = root;
  	context.foreignrel = baserel;
--- 833,838 
***
*** 971,976  deparseUpdateSql(StringInfo buf, PlannerInfo *root,
--- 968,1030 
  }
  
  /*
+  * deparse remote UPDATE statement
+  *
+  * The statement text is appended to buf, and we also create an integer List
+  * of the columns being retrieved by RETURNING (if any), which is returned
+  * to *retrieved_attrs.
+  */
+ void
+ deparsePushedDownUpdateSql(StringInfo buf, PlannerInfo *root,
+ 		   Index rtindex, Relation rel,
+ 		   List	*targetlist,
+ 		   List *targetAttrs,
+ 		   List	*remote_conds,
+ 		   List **params_list,
+ 		   List *returningList,
+ 		   List **retrieved_attrs)
+ {
+ 	RelOptInfo *baserel = root->simple_rel_array[rtindex];
+ 	deparse_expr_cxt context;
+ 	bool		first;
+ 	ListCell   *lc;
+ 
+ 	if (params_list)
+ 		*params_list = NIL;		/* initialize result list to empty */
+ 
+ 	/* Set up context struct for recursion */
+ 	context.root = root;
+ 	context.foreignrel = baserel;
+ 	context.buf = buf;
+ 	context.params_list = params_list;
+ 
+ 	appendStringInfoString(buf, "UPDATE ");
+ 	deparseRelation(buf, rel);
+ 	appendStringInfoString(buf, " SET ");
+ 
+ 	first = true;
+ 	foreach(lc, targetAttrs)
+ 	{
+ 		int			attnum = lfirst_int(lc);
+ 		TargetEntry *tle = get_tle_by_resno(targetlist, attnum);
+ 
+ 		if (!first)
+ 			appendStringInfoString(buf, ", ");
+ 		first = false;
+ 
+ 		deparseColumnRef(buf, rtindex, attnum, root);
+ 		appendStringInfoString(buf, " = ");
+ 		deparseExpr((Expr *) tle->expr, &context);
+ 	}
+ 	if (remote_conds)
+ 		appendWhereClause(buf, root, baserel, remote_conds,
+ 		  true, params_list);
+ 
+ 	deparseReturningList(buf, root, rtindex, rel, false,
+ 		 returningList, retrieved_attrs);
+ }
+ 
+ /*
   * deparse remote DELETE statement
   *
   * The statement text is appended to buf, and we also create an integer List
***
*** 993,998  deparseDeleteSql(StringInfo buf, PlannerInfo *root,
--- 1047,1082 
  }
  
  /*
+  * deparse remote DELETE statement
+  *
+  * The statement text is appended to buf, and we also create an integer List
+  * of the columns being retrieved by RETURNING (if any), which is returned
+  * to *retrieved_attrs.
+  */
+ void
+ deparsePushedDownDeleteSql(StringInfo buf, PlannerInfo *root,
+ 		   Index rtindex, Relation rel,
+ 		   List	*remote_conds,
+ 		   List	**params_list,
+ 		   List *returningList,
+ 		   List **retrieved_attrs)
+ {
+ 	RelOptInfo *baserel = root->simple_rel_array[rtindex];
+ 
+ 	if (params_list)
+ 		*params_list = NIL;		/* initialize result list to empty */
+ 
+ 	appendStringInfoString(buf, "DELETE FROM ");
+ 	deparseRelation(buf, rel);
+ 	if (remote_conds)

Re: [HACKERS] Minor code improvements to create_foreignscan_plan/ExecInitForeignScan

2016-01-12 Thread Etsuro Fujita

On 2016/01/12 2:36, Alvaro Herrera wrote:

I wonder,


--- 2166,2213 
}

/*
!* If rel is a base relation, detect whether any system columns are
!* requested from the rel.  (If rel is a join relation, rel->relid will 
be
!* 0, but there can be no Var in the target list with relid 0, so we 
skip
!* this in that case.  Note that any such system columns are assumed to 
be
!* contained in fdw_scan_tlist, so we never need fsSystemCol to be true 
in
!* the joinrel case.)  This is a bit of a kluge and might go away 
someday,
!* so we intentionally leave it out of the API presented to FDWs.
 */
!   scan_plan->fsSystemCol = false;
!   if (scan_relid > 0)
{
!   Bitmapset  *attrs_used = NULL;
!   ListCell   *lc;
!   int i;

!   /*
!* First, examine all the attributes needed for joins or final 
output.
!* Note: we must look at reltargetlist, not the attr_needed 
data,
!* because attr_needed isn't computed for inheritance child 
rels.
!*/
!   pull_varattnos((Node *) rel->reltargetlist, scan_relid, 
&attrs_used);

!   /* Add all the attributes used by restriction clauses. */
!   foreach(lc, rel->baserestrictinfo)
{
!   RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
!
!   pull_varattnos((Node *) rinfo->clause, scan_relid, 
&attrs_used);
}

!   /* Now, are any system columns requested from rel? */
!   for (i = FirstLowInvalidHeapAttributeNumber + 1; i < 0; i++)
!   {
!   if (bms_is_member(i - 
FirstLowInvalidHeapAttributeNumber, attrs_used))
!   {
!   scan_plan->fsSystemCol = true;
!   break;
!   }
!   }
!
!   bms_free(attrs_used);
!   }

return scan_plan;
   }


Would it make sense to call pull_varattnos(reltargetlist), then walk the
bitmapset and break if we see a system column, then call
pull_varattnos() on the rinfo->clause?  That way, if the targetlist
request a system column we don't have to walk the RestrictInfos.


Seems like a good idea.  Will update the patch.

Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Odd behavior in foreign table modification (Was: Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW)

2016-01-07 Thread Etsuro Fujita

On 2016/01/07 21:50, Etsuro Fujita wrote:

On 2016/01/06 20:37, Thom Brown wrote:

On 25 December 2015 at 10:00, Etsuro Fujita
 wrote:

Attached is an updated version of the patch, which is
still WIP, but I'd be happy if I could get any feedback.



I've run into an issue:

*# UPDATE master_customers SET id = 22 WHERE id = 16 RETURNING
tableoid::regclass;
ERROR:
CONTEXT:  Remote SQL command: UPDATE public.customers SET id = 22
WHERE ((id = 16)) RETURNING NULL



Will fix.


While working on this, I noticed that the existing postgres_fdw system 
shows similar behavior, so I changed the subject.


IIUC, the reason for that is when the local query specifies "RETURNING 
tableoid::regclass", the FDW has fmstate->has_returning=false while the 
remote query executed at ModifyTable has "RETURNING NULL", as shown in 
the above example; that would cause an abnormal exit in executing the 
remote query in postgresExecForeignUpdate, since that the FDW would get 
PGRES_TUPLES_OK as a result of the query while the FDW would think that 
the right result to get should be PGRES_COMMAND_OK, from the flag 
fmstate->has_returning=false.


Attached is a patch to fix that.

Best regards,
Etsuro Fujita
*** a/contrib/postgres_fdw/deparse.c
--- b/contrib/postgres_fdw/deparse.c
***
*** 1003,1008  deparseReturningList(StringInfo buf, PlannerInfo *root,
--- 1003,1009 
  	 List **retrieved_attrs)
  {
  	Bitmapset  *attrs_used = NULL;
+ 	bool		has_returning = false;
  
  	if (trig_after_row)
  	{
***
*** 1021,1031  deparseReturningList(StringInfo buf, PlannerInfo *root,
--- 1022,1072 
  	   &attrs_used);
  	}
  
+ 	/*
+ 	 * Check to see whether the remote query has a RETURNING clause.
+ 	 *
+ 	 * XXX be careful to keep this in sync with deparseTargetList.
+ 	 */
  	if (attrs_used != NULL)
  	{
+ 		if (bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber,
+ 		  attrs_used))
+ 			has_returning = true;
+ 		else
+ 		{
+ 			TupleDesc	tupdesc = RelationGetDescr(rel);
+ 			bool		have_wholerow;
+ 			int			i;
+ 
+ 			/* If there's a whole-row reference, we'll need all the columns. */
+ 			have_wholerow = bms_is_member(0 - FirstLowInvalidHeapAttributeNumber,
+ 		  attrs_used);
+ 
+ 			for (i = 1; i <= tupdesc->natts; i++)
+ 			{
+ Form_pg_attribute attr = tupdesc->attrs[i - 1];
+ 
+ /* Ignore dropped attributes. */
+ if (attr->attisdropped)
+ 	continue;
+ 
+ if (have_wholerow ||
+ 	bms_is_member(i - FirstLowInvalidHeapAttributeNumber,
+   attrs_used))
+ {
+ 	has_returning = true;
+ 	break;
+ }
+ 			}
+ 		}
+ 	}
+ 
+ 	if (has_returning != false)
+ 	{
  		appendStringInfoString(buf, " RETURNING ");
  		deparseTargetList(buf, root, rtindex, rel, attrs_used,
  		  retrieved_attrs);
+ 		Assert(*retrieved_attrs != NIL);
  	}
  	else
  		*retrieved_attrs = NIL;
*** a/contrib/postgres_fdw/expected/postgres_fdw.out
--- b/contrib/postgres_fdw/expected/postgres_fdw.out
***
*** 2408,2413  SELECT c1,c2,c3,c4 FROM ft2 ORDER BY c1;
--- 2408,2466 
   1104 | 204 | ddd| 
  (819 rows)
  
+ EXPLAIN (verbose, costs off)
+ INSERT INTO ft2 (c1,c2,c3) VALUES (,999,'foo') RETURNING tableoid::regclass;
+QUERY PLAN
+ -
+  Insert on public.ft2
+Output: (tableoid)::regclass
+Remote SQL: INSERT INTO "S 1"."T 1"("C 1", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
+->  Result
+  Output: , 999, NULL::integer, 'foo'::text, NULL::timestamp with time zone, NULL::timestamp without time zone, NULL::character varying, 'ft2   '::character(10), NULL::user_enum
+ (5 rows)
+ 
+ INSERT INTO ft2 (c1,c2,c3) VALUES (,999,'foo') RETURNING tableoid::regclass;
+  tableoid 
+ --
+  ft2
+ (1 row)
+ 
+ EXPLAIN (verbose, costs off)
+ UPDATE ft2 SET c3 = 'bar' WHERE c1 =  RETURNING tableoid::regclass;
+ QUERY PLAN 
+ ---
+  Update on public.ft2
+Output: (tableoid)::regclass
+Remote SQL: UPDATE "S 1"."T 1" SET c3 = $2 WHERE ctid = $1
+->  Foreign Scan on public.ft2
+  Output: c1, c2, NULL::integer, 'bar'::text, c4, c5, c6, c7, c8, ctid
+  Remote SQL: SELEC

Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW

2016-01-07 Thread Etsuro Fujita

On 2016/01/06 20:37, Thom Brown wrote:

On 25 December 2015 at 10:00, Etsuro Fujita  wrote:

Attached is an updated version of the patch, which is
still WIP, but I'd be happy if I could get any feedback.



I've run into an issue:

*# UPDATE master_customers SET id = 22 WHERE id = 16 RETURNING
tableoid::regclass;
ERROR:
CONTEXT:  Remote SQL command: UPDATE public.customers SET id = 22
WHERE ((id = 16)) RETURNING NULL

However, this works:

*# UPDATE master_customers SET id = 22 WHERE id = 16 RETURNING
tableoid::regclass, *;
 tableoid | id | name  |company| registered_date |
expiry_date | active | status  | account_level
-++---+---+-+-++-+---
  local_customers | 22 | Bruce | Jo's Cupcakes | 2015-01-15  |
2017-01-14  | t  | running | basic
(1 row)

In this example, "local_customers" inherits from the remote table
"public"."customers", which inherits again from the local table
"master_customers"

Same issue with DELETE of course, and the ::regclass isn't important here.


Will fix.

Thanks for the testing!

Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW

2016-01-07 Thread Etsuro Fujita

On 2016/01/06 18:58, Rushabh Lathia wrote:

I started looking at updated patch and its definitely iked the new
approach.


Thanks for the review!


With the initial look and test overall things looking great, I am still
reviewing the code changes but here are few early doubts/questions:



.) What the need of following change ?

@@ -833,9 +833,6 @@ appendWhereClause(StringInfo buf,
 int nestlevel;
 ListCell   *lc;

-   if (params)
-   *params = NIL;  /* initialize result list to empty */
-
 /* Set up context struct for recursion */
 context.root = root;
 context.foreignrel = baserel;
@@ -971,6 +968,63 @@ deparseUpdateSql(StringInfo buf, PlannerInfo *root,
  }


It is needed for deparsePushedDownUpdateSql to store params in both 
WHERE clauses and expressions to assign to the target columns

into one params_list list.


.) When Tom Lane and Stephen Frost suggested getting the core code involved,
I thought that we can do the mandatory checks into core it self and making
completely out of dml_is_pushdown_safe(). Please correct me


The reason why I put that function in postgres_fdw.c is Check point 4:

+  * 4. We can't push an UPDATE down, if any expressions to assign to 
the target

+  * columns are unsafe to evaluate on the remote server.

I think this depends on the capabilities of the FDW.


.) Documentation for the new API is missing (fdw-callbacks).


Will add the docs.

Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minor code improvements to create_foreignscan_plan/ExecInitForeignScan

2015-12-27 Thread Etsuro Fujita

On 2015/12/23 2:47, Robert Haas wrote:

On Tue, Dec 22, 2015 at 7:32 AM, Michael Paquier
 wrote:

Moved to next CF because of a lack of reviews.


Thanks, Michael!


I just took a look at this.  I think the basic idea of this patch is
good, but the comments need some work, because they don't really
explain why this should be skipped in the join case.  Maybe something
like this:


Thanks for the review, Robert!


If rel is a base relation, detect whether any system columns were
requested.  (If rel is a join relation, rel->relid will be 0, but
there can be no Var in the target list with relid 0, so we skip this
in that case.) This is a bit of a kluge and might go away someday, so
we intentionally leave it out of the API presented to FDWs.
And the rest as it is currently written.


Agreed.


It might be good, also, to say something about why we never need
fsSystemCol to be true in the joinrel case.


+1 for that.  How about adding something like this:

Note that any such system columns are assumed to be contained in 
fdw_scan_tlist, so we never need fsSystemCol to be true in the joinrel case.


Attached is an updated version of the patch.

Best regards,
Etsuro Fujita
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
***
*** 2097,2106  create_foreignscan_plan(PlannerInfo *root, ForeignPath *best_path,
  	RelOptInfo *rel = best_path->path.parent;
  	Index		scan_relid = rel->relid;
  	Oid			rel_oid = InvalidOid;
- 	Bitmapset  *attrs_used = NULL;
  	Plan	   *outer_plan = NULL;
- 	ListCell   *lc;
- 	int			i;
  
  	Assert(rel->fdwroutine != NULL);
  
--- 2097,2103 
***
*** 2169,2204  create_foreignscan_plan(PlannerInfo *root, ForeignPath *best_path,
  	}
  
  	/*
! 	 * Detect whether any system columns are requested from rel.  This is a
! 	 * bit of a kluge and might go away someday, so we intentionally leave it
! 	 * out of the API presented to FDWs.
! 	 *
! 	 * First, examine all the attributes needed for joins or final output.
! 	 * Note: we must look at reltargetlist, not the attr_needed data, because
! 	 * attr_needed isn't computed for inheritance child rels.
  	 */
! 	pull_varattnos((Node *) rel->reltargetlist, rel->relid, &attrs_used);
! 
! 	/* Add all the attributes used by restriction clauses. */
! 	foreach(lc, rel->baserestrictinfo)
  	{
! 		RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
  
! 		pull_varattnos((Node *) rinfo->clause, rel->relid, &attrs_used);
! 	}
  
! 	/* Now, are any system columns requested from rel? */
! 	scan_plan->fsSystemCol = false;
! 	for (i = FirstLowInvalidHeapAttributeNumber + 1; i < 0; i++)
! 	{
! 		if (bms_is_member(i - FirstLowInvalidHeapAttributeNumber, attrs_used))
  		{
! 			scan_plan->fsSystemCol = true;
! 			break;
  		}
- 	}
  
! 	bms_free(attrs_used);
  
  	return scan_plan;
  }
--- 2166,2213 
  	}
  
  	/*
! 	 * If rel is a base relation, detect whether any system columns are
! 	 * requested from the rel.  (If rel is a join relation, rel->relid will be
! 	 * 0, but there can be no Var in the target list with relid 0, so we skip
! 	 * this in that case.  Note that any such system columns are assumed to be
! 	 * contained in fdw_scan_tlist, so we never need fsSystemCol to be true in
! 	 * the joinrel case.)  This is a bit of a kluge and might go away someday,
! 	 * so we intentionally leave it out of the API presented to FDWs.
  	 */
! 	scan_plan->fsSystemCol = false;
! 	if (scan_relid > 0)
  	{
! 		Bitmapset  *attrs_used = NULL;
! 		ListCell   *lc;
! 		int			i;
  
! 		/*
! 		 * First, examine all the attributes needed for joins or final output.
! 		 * Note: we must look at reltargetlist, not the attr_needed data,
! 		 * because attr_needed isn't computed for inheritance child rels.
! 		 */
! 		pull_varattnos((Node *) rel->reltargetlist, scan_relid, &attrs_used);
  
! 		/* Add all the attributes used by restriction clauses. */
! 		foreach(lc, rel->baserestrictinfo)
  		{
! 			RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
! 
! 			pull_varattnos((Node *) rinfo->clause, scan_relid, &attrs_used);
  		}
  
! 		/* Now, are any system columns requested from rel? */
! 		for (i = FirstLowInvalidHeapAttributeNumber + 1; i < 0; i++)
! 		{
! 			if (bms_is_member(i - FirstLowInvalidHeapAttributeNumber, attrs_used))
! 			{
! scan_plan->fsSystemCol = true;
! break;
! 			}
! 		}
! 
! 		bms_free(attrs_used);
! 	}
  
  	return scan_plan;
  }

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW

2015-12-25 Thread Etsuro Fujita

On 2015/12/24 4:34, Robert Haas wrote:

On Wed, Dec 23, 2015 at 5:50 AM, Rushabh Lathia
 wrote:

+1.

I like idea of separate FDW API for the DML Pushdown. Was thinking can't we
can re-use the  IterateForeignScan(ForeignScanState *node) rather then
introducing IterateDMLPushdown(ForeignScanState *node) new API ?



Yeah, I think we need to ask ourselves what advantage we're getting
out of adding any new core APIs.  Marking the scan as a pushed-down
update or delete has some benefit in terms of making the information
visible via EXPLAIN, but even that's a pretty thin benefit.  The
iterate method seems to just complicate the core code without any
benefit at all.  More generally, there is very, very little code in
this patch that accomplishes anything that could not be done just as
well with the existing methods.  So why are we even doing these core
changes?


From the FDWs' point of view, ISTM that what FDWs have to do for 
IterateDMLPushdown is quite different from what FDWs have to do for 
IterateForeignScan; eg, IterateDMLPushdown must work in accordance with 
presence/absence of a RETURNING list.  (In addition to that, 
IterateDMLPushdown has been designed so that it must make the scan tuple 
available to later RETURNING projection in nodeModifyTable.c.)  So, I 
think that it's better to FDWs to add separate APIs for the DML 
pushdown, making the FDW code much simpler.  So based on that idea, I 
added the postgres_fdw changes to the patch.  Attached is an updated 
version of the patch, which is still WIP, but I'd be happy if I could 
get any feedback.



Tom seemed to think that we could centralize some checks in the core
code, say, related to triggers, or to LIMIT.  But there's nothing like
that in this patch, so I'm not really understanding the point.


For the trigger check, I added relation_has_row_level_triggers.  I 
placed that function in postgres_fdw.c in the updated patch, but I think 
that by placing that function in the core, FDWs can share that function. 
 As for the LIMIT, I'm not sure we can do something about that.


I think the current design allows us to handle a pushed-down update on a 
join, ie, "UPDATE foo ... FROM bar ..." where both foo and bar are 
remote, which was Tom's concern, but I'll leave that for another patch. 
 Also, I think the current design could also extend to push down INSERT 
.. RETURNING .., but I'd like to leave that for future work.


I'll add this to the next CF.

Best regards,
Etsuro Fujita
*** a/contrib/postgres_fdw/deparse.c
--- b/contrib/postgres_fdw/deparse.c
***
*** 816,822  deparseTargetList(StringInfo buf,
   *
   * If params is not NULL, it receives a list of Params and other-relation Vars
   * used in the clauses; these values must be transmitted to the remote server
!  * as parameter values.
   *
   * If params is NULL, we're generating the query for EXPLAIN purposes,
   * so Params and other-relation Vars should be replaced by dummy values.
--- 816,822 
   *
   * If params is not NULL, it receives a list of Params and other-relation Vars
   * used in the clauses; these values must be transmitted to the remote server
!  * as parameter values.  Caller is responsible for initializing it to empty.
   *
   * If params is NULL, we're generating the query for EXPLAIN purposes,
   * so Params and other-relation Vars should be replaced by dummy values.
***
*** 833,841  appendWhereClause(StringInfo buf,
  	int			nestlevel;
  	ListCell   *lc;
  
- 	if (params)
- 		*params = NIL;			/* initialize result list to empty */
- 
  	/* Set up context struct for recursion */
  	context.root = root;
  	context.foreignrel = baserel;
--- 833,838 
***
*** 971,976  deparseUpdateSql(StringInfo buf, PlannerInfo *root,
--- 968,1030 
  }
  
  /*
+  * deparse remote UPDATE statement
+  *
+  * The statement text is appended to buf, and we also create an integer List
+  * of the columns being retrieved by RETURNING (if any), which is returned
+  * to *retrieved_attrs.
+  */
+ void
+ deparsePushedDownUpdateSql(StringInfo buf, PlannerInfo *root,
+ 		   Index rtindex, Relation rel,
+ 		   List	*targetlist,
+ 		   List *targetAttrs,
+ 		   List	*remote_conds,
+ 		   List **params_list,
+ 		   List *returningList,
+ 		   List **retrieved_attrs)
+ {
+ 	RelOptInfo *baserel = root->simple_rel_array[rtindex];
+ 	deparse_expr_cxt context;
+ 	bool		first;
+ 	ListCell   *lc;
+ 
+ 	if (params_list)
+ 		*params_list = NIL;		/* initialize result list to empty */
+ 
+ 	/* Set up context struct for recursion */
+ 	context.root = root;
+ 	context.foreignrel = baserel;
+ 	context.buf = buf;
+ 	context.params_list = params_list;
+ 
+ 	appendStringInfoString(buf, "UPDATE ");
+ 	deparseRelation(buf, rel);
+ 	appendStringInfoString(buf, " SET ");
+ 
+ 	first = true;
+ 	foreach(lc, targetAttrs)
+ 	{
+ 		i

Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW

2015-12-21 Thread Etsuro Fujita

On 2015/11/26 18:00, Etsuro Fujita wrote:

On 2015/11/25 20:36, Thom Brown wrote:

On 13 May 2015 at 04:10, Etsuro Fujita 
wrote:

On 2015/05/13 0:55, Stephen Frost wrote:

While the EXPLAIN output changed, the structure hasn't really changed
from what was discussed previously and there's not been any real
involvment from the core code in what's happening here.

Clearly, the documentation around how to use the FDW API hasn't changed
at all and there's been no additions to it for handling bulk work.
Everything here continues to be done inside of postgres_fdw, which
essentially ignores the prescribed "Update/Delete one tuple" interface
for ExecForeignUpdate/ExecForeignDelete.

I've spent the better part of the past two days trying to reason my way
around that while reviewing this patch and I haven't come out the other
side any happier with this approach than I was back in
20140911153049.gc16...@tamriel.snowman.net.

There are other things that don't look right to me, such as what's
going
on at the bottom of push_update_down(), but I don't think there's much
point going into it until we figure out what the core FDW API here
should look like.  It might not be all that far from what we have now,
but I don't think we can just ignore the existing, documented, API.



OK, I'll try to introduce the core FDW API for this (and make changes
to the
core code) to address your previous comments.



I'm a bit behind in reading up on this, so maybe it's been covered
since, but is there a discussion of this API on another thread, or a
newer patch available?


To address Stephen's comments, I'd like to propose the following FDW APIs:

bool
PlanDMLPushdown (PlannerInfo *root,
 ModifyTable *plan,
 Index resultRelation,
 int subplan_index);

This is called in make_modifytable, before calling PlanForeignModify. 
This checks to see whether a given UPDATE/DELETE .. RETURNING .. is 
pushdown-safe and if so, performs planning actions needed for the DML 
pushdown.  The idea is to modify a ForeignScan subplan accordingly as in 
the previous patch.  If the DML is pushdown-safe, this returns true, and 
we don't call PlanForeignModify anymore.  (Else returns false and call 
PlanForeignModify as before.)  When the DML is pushdown-safe, we hook 
the following FDW APIs located in nodeForeignscan.c, instead of 
BeginForeignModify, ExecForeignUpdate/ExecForeignDelete and 
EndForeignModify:


void
BeginDMLPushdown (ForeignScanState *node,
  int eflags);

This initializes the DML pushdown, like BeginForeignScan.

TupleTableSlot *
IterateDMLPushdown (ForeignScanState *node);

This fetches one returning result from the foreign server, like 
IterateForeignScan, if having a RETURNING clause.  If not, just return 
an empty slot.  (I'm thinking that it's required that the FDW replaces 
the targetlist of the ForeignScan subplan to the RETURNING clause during 
PlanDMLPushdown, if having the clause, so that we do nothing at 
ModifyTable.)


void
EndDMLPushdown (ForeignScanState *node);

This finishes the DML pushdown, like EndForeignScan.

I'm attaching a WIP patch, which only includes changes to the core.  I'm 
now working on the postgres_fdw patch to demonstrate that these APIs 
work well, but I'd be happy if I could get any feedback earlier.


Best regards,
Etsuro Fujita
*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
***
*** 887,893  ExplainNode(PlanState *planstate, List *ancestors,
  			pname = sname = "WorkTable Scan";
  			break;
  		case T_ForeignScan:
! 			pname = sname = "Foreign Scan";
  			break;
  		case T_CustomScan:
  			sname = "Custom Scan";
--- 887,911 
  			pname = sname = "WorkTable Scan";
  			break;
  		case T_ForeignScan:
! 			sname = "Foreign Scan";
! 			switch (((ForeignScan *) plan)->operation)
! 			{
! case CMD_SELECT:
! 	pname = "Foreign Scan";
! 	operation = "Select";
! 	break;
! case CMD_UPDATE:
! 	pname = "Foreign Update";
! 	operation = "Update";
! 	break;
! case CMD_DELETE:
! 	pname = "Foreign Delete";
! 	operation = "Delete";
! 	break;
! default:
! 	pname = "???";
! 	break;
! 			}
  			break;
  		case T_CustomScan:
  			sname = "Custom Scan";
***
*** 1658,1663  show_plan_tlist(PlanState *planstate, List *ancestors, ExplainState *es)
--- 1676,1686 
  		return;
  	if (IsA(plan, RecursiveUnion))
  		return;
+ 	/* Likewise for ForeignScan in case of pushed-down UPDATE/DELETE */
+ 	if (IsA(plan, ForeignScan) &&
+ 		(((ForeignScan *) plan)->operation == CMD_UPDATE ||
+ 		 ((ForeignScan *) plan)->operation == CMD_DELETE))
+ 		return;
  
  	/* Set up de

Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2015-12-11 Thread Etsuro Fujita

On 2015/12/11 14:16, Ashutosh Bapat wrote:

On Thu, Dec 10, 2015 at 11:20 PM, Robert Haas mailto:robertmh...@gmail.com>> wrote:



On Tue, Dec 8, 2015 at 6:40 AM, Etsuro Fujita
mailto:fujita.ets...@lab.ntt.co.jp>>
wrote:
> IMO I want to see the EvalPlanQual fix in the first version for 9.6.

+1.



I think there is still a lot functionality that is offered without
EvalPlanQual fix. As long as we do not push joins when there are
RowMarks involved, implementation of that hook is not required. We won't
be able to push down joins for DMLs and when there are FOR SHARE/UPDATE
clauses in the query. And there are huge number of queries, which will
be benefitted by the push down even without that support. There's
nothing in this patch, which comes in way of implementing the
EvalPlanQual fix. It can be easily added after committing the first
version. On the other hand, getting minimal (it's not really minimal,
it's much more than that) support for postgres_fdw support committed
opens up possibility to work on multiple items (as listed in my mail) in
parallel.



I am not saying that we do not need EvalPlanQual fix in 9.6. But it's
not needed in the first cut. If we get the first cut in first couple of
months of 2016, there's plenty of room for the fix to go in 9.6. It
would be really bad situation if we could not get postgres_fdw join
pushdown supported in 9.6 because EvalPlanQual hook could not be
committed while the rest of the code is ready. EvalPlanQual fix in core
was being discussed since April 2015. It took 8 months to get that
fixed. Hopefully we won't need that long to implement the hook in
postgres_fdw, but that number says something about the complexity of the
feature.


ISTM that further enhancements are of secondary importance. Let's do the 
EvalPlanQual fix first. I'll add the RecheckForeignScan callback routine 
to your version of the postgres_fdw patch as soon as possible.


Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining 9.5 open items

2015-12-11 Thread Etsuro Fujita

On 2015/12/11 1:18, Robert Haas wrote:

On Wed, Dec 9, 2015 at 2:52 AM, Etsuro Fujita
 wrote:

Thank you for committing the patch!

Sorry, I overlooked a typo in docs: s/more that one/more than one/ Please
find attached a patch.



Committed, thanks.


Thanks!

Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-12-09 Thread Etsuro Fujita

On 2015/12/09 13:26, Kouhei Kaigai wrote:

On Tue, Dec 8, 2015 at 10:00 PM, Etsuro Fujita
 wrote:

I think the actual regression test outputs are fine, and that your
desire to suppress part of the plan tree from showing up in the
EXPLAIN output is misguided.  I like it just the way it is.  To
prevent user confusion, I think that when we add support to
postgres_fdw for this we might also want to add some documentation
explaining how to interpret this EXPLAIN output, but I don't think
there's any problem with the output itself.



I'm not sure that that's a good idea.  one reason for that is I think that
that would be more confusing to users when more than two foreign tables are
involved in a foreign join as shown in the following example.  Note that the
outer plans will be shown recursively.  Another reason is there is no
consistency between the costs of the outer plans and that of the main plan.



I still don't really see a problem here, but, regardless, the solution
can't be to hide nodes that are in fact present from the user.  We can
talk about making further changes here, but hiding the nodes
altogether is categorically out in my mind.



If you really want to hide the alternative sub-plan, you can move the
outer planstate onto somewhere private field on BeginForeignScan,
then kick ExecProcNode() at the ForeignRecheck callback by itself.
Explain walks down the sub-plan if outerPlanState(planstate) is
valid. So, as long as your extension keeps the planstate privately,
it is not visible from the EXPLAIN.

Of course, I don't recommend it.


Sorry, my explanation might be not enough, but I'm not saying to hide 
the subplan.  I think it would be better to show the subplan somewhere 
in the EXPLAIN outout, but I'm not sure that it's a good idea to show 
that in the current form.  We have two plan trees; one for normal query 
execution and another for EvalPlanQual testing.  I think it'd be better 
to show the EXPLAIN output the way that allows users to easily identify 
each of the plan trees.


Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining 9.5 open items

2015-12-08 Thread Etsuro Fujita

On 2015/12/09 2:56, Robert Haas wrote:

On Thu, Dec 3, 2015 at 9:51 PM, Noah Misch  wrote:

On Tue, Dec 01, 2015 at 11:05:47AM -0500, Robert Haas wrote:

On Mon, Nov 30, 2015 at 2:43 PM, Tom Lane  wrote:

* Foreign join pushdown vs EvalPlanQual

Is this fixed by 5fc4c26db?  If not, what remains to do?



Unfortunately, no.  That commit allows FDWs to do proper EPQ handling
for plain table scans, but it proves to be inadequate for EPQ handling
for joins. Solving that problem will require another patch, and,
modulo a bunch of cosmetic issues, I'm reasonably happy with KaiGai
Kohei's latest submission.  I'll respond in more detail on that
thread, but the question I want to raise here is: do we want to
back-patch those changes to 9.5 at this late date?



Yes.  If 9.5 added a bad interface, better to fix the interface even now than
to live with the bad one.



OK, I've pushed the latest patch for that issue to master and 9.5.
I'm not completely positive we've killed this problem dead, but I hope
so.


Thank you for committing the patch!

Sorry, I overlooked a typo in docs: s/more that one/more than one/ 
Please find attached a patch.


Best regards,
Etsuro Fujita
diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
index 0090e24..dc2d890 100644
--- a/doc/src/sgml/fdwhandler.sgml
+++ b/doc/src/sgml/fdwhandler.sgml
@@ -793,7 +793,7 @@ RecheckForeignScan (ForeignScanState *node, TupleTableSlot *slot);
  ForeignScan.  When a recheck is required, this subplan
  can be executed and the resulting tuple can be stored in the slot.
  This plan need not be efficient since no base table will return more
- that one row; for example, it may implement all joins as nested loops.
+ than one row; for example, it may implement all joins as nested loops.
 

 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-12-08 Thread Etsuro Fujita

On 2015/12/09 1:13, Robert Haas wrote:

On Tue, Dec 8, 2015 at 5:49 AM, Etsuro Fujita
 wrote:

I'd like to discuss the next
thing about his patch.  As I mentioned in [1], the following change in
the patch will break the EXPLAIN output.

@@ -205,6 +218,11 @@ ExecInitForeignScan(ForeignScan *node, EState
*estate, int eflags)
 scanstate->fdwroutine = fdwroutine;
 scanstate->fdw_state = NULL;

+   /* Initialize any outer plan. */
+   if (outerPlanState(scanstate))
+   outerPlanState(scanstate) =
+   ExecInitNode(outerPlan(node), estate, eflags);
+

As pointed out by Horiguchi-san, that's not correct, though; we should
initialize the outer plan if outerPlan(node) != NULL, not
outerPlanState(scanstate) != NULL.  Attached is an updated version of
his patch.



I'm also attaching an updated version of the postgres_fdw
join pushdown patch.



Is that based on Ashutosh's version of the patch, or are the two of
you developing independent of each other?  We should avoid dueling
patches if possible.


That's not based on his version.  I'll add to his patch changes I've 
made.  IIUC, his version is an updated version of Hanada-san's original 
patches that I've modified, so I guess that I could do that easily. 
(I've added a helper function for creating a local join execution plan 
for a given foreign join, but that is a rush work.  So, I'll rewrite that.)



You can find the breaking examples by doing the
regression tests in the postgres_fdw patch.  Please apply the patches in
the following order:

epq-recheck-v6-efujita (attached)
usermapping_matching.patch in [2]
add_GetUserMappingById.patch in [2]
foreign_join_v16_efujita2.patch (attached)

As I proposed upthread, I think we could fix that by handling the outer
plan as in the patch [3]; a) the core initializes the outer plan and
stores it into somewhere in the ForeignScanState node, not the lefttree
of the ForeignScanState node, during ExecInitForeignScan, and b) when
the RecheckForeignScan routine gets called, the FDW extracts the plan
from the given ForeignScanState node and executes it.  What do you think
about that?



I think the actual regression test outputs are fine, and that your
desire to suppress part of the plan tree from showing up in the
EXPLAIN output is misguided.  I like it just the way it is.  To
prevent user confusion, I think that when we add support to
postgres_fdw for this we might also want to add some documentation
explaining how to interpret this EXPLAIN output, but I don't think
there's any problem with the output itself.


I'm not sure that that's a good idea.  one reason for that is I think 
that that would be more confusing to users when more than two foreign 
tables are involved in a foreign join as shown in the following example. 
 Note that the outer plans will be shown recursively.  Another reason 
is there is no consistency between the costs of the outer plans and that 
of the main plan.


postgres=# explain verbose select * from foo, bar, baz where foo.a = 
bar.a and bar.a = baz.a for update;


 QUERY PLAN




 LockRows  (cost=100.00..100.45 rows=15 width=96)
   Output: foo.a, bar.a, baz.a, foo.*, bar.*, baz.*
   ->  Foreign Scan  (cost=100.00..100.30 rows=15 width=96)
 Output: foo.a, bar.a, baz.a, foo.*, bar.*, baz.*
 Relations: ((public.foo) INNER JOIN (public.bar)) INNER JOIN 
(public.baz)
 Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, r.a1, r.a2 FROM 
(SELECT l.a1, l.a2, r.a1, r.a2 FROM (SELECT l.a9, ROW(l.a9) FROM (SELECT 
a a9 FROM p
ublic.foo FOR UPDATE) l) l (a1, a2) INNER JOIN (SELECT r.a9, ROW(r.a9) 
FROM (SELECT a a9 FROM public.bar FOR UPDATE) r) r (a1, a2) ON ((l.a1 = 
r.a1))) l
 (a1, a2, a3, a4) INNER JOIN (SELECT r.a9, ROW(r.a9) FROM (SELECT a a9 
FROM public.baz FOR UPDATE) r) r (a1, a2) ON ((l.a1 = r.a1))

 ->  Hash Join  (cost=272.13..272.69 rows=15 width=96)
   Output: foo.a, foo.*, bar.a, bar.*, baz.a, baz.*
   Hash Cond: (foo.a = baz.a)
   ->  Foreign Scan  (cost=100.00..100.04 rows=2 width=64)
 Output: foo.a, foo.*, bar.a, bar.*
 Relations: (public.foo) INNER JOIN (public.bar)
 Remote SQL: SELECT l.a1, l.a2, r.a1, r.a2 FROM 
(SELECT l.a9, ROW(l.a9) FROM (SELECT a a9 FROM public.foo FOR UPDATE) l) 
l (a1, a2)
INNER JOIN (SELECT r.a

Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2015-12-08 Thread Etsuro Fujita

On 2015/12/08 17:27, Ashutosh Bapat wrote:

On Thu, Dec 3, 2015 at 12:36 PM, Etsuro Fujita
mailto:fujita.ets...@lab.ntt.co.jp>> wrote:



Generating paths




A join between two foreign relations is considered safe to push
down if



4. The join conditions (e.g. conditions in ON clause) are all
safe to
push down.
 This is important for OUTER joins as pushing down join clauses
partially and
 applying rest locally changes the result. There are ways [1] by
which partial
 OUTER join can be completed by applying unpushable clauses
locally
and then
 nullifying the nullable side and eliminating duplicate
non-nullable side
 rows. But that's again out of scope of first version of
postgres_fdw
join
 pushdown.



As for 4, as commented in the patch, we could relax the requirement
that all the join conditions (given by JoinPathExtraData's
restrictlist) need to be safe to push down to the remote server;
* In case of inner join, all the conditions would not need to be safe.
* In case of outer join, all the "otherclauses" would not need to be
safe, while I think all the "joinclauses" need to be safe to get the
right results (where "joinclauses" and "otherclauses" are defined by
extract_actual_join_clauses).  And I think we should do this
relaxation to some extent for 9.6, to allow more joins to be pushed
down.



agreed. I will work on those.


Great!


Generating plan
===



Rest of this section describes the logic to construct
the SQL
for join; the logic is implemented as function
deparseSelectSqlForRel().

deparseSelectSqlForRel() builds the SQL for given joinrel (and
now for
baserel
asd well) recursively.
For joinrels
1. it constructs SQL representing either side of join, by
calling itself
 in recursive fashion.
2. These SQLs are converted into subqueries and become part of
the FROM
clause
 with appropriate JOIN type and clauses. The left and right
subqueries are
 given aliases "l" and "r" respectively. The columns in each
subquery are
 aliased as "a1", "a2", "a3" and so on. Thus the third
column on left
side can
 be referenced as "l.a3" at any recursion level.
3. Targetlist is added representing the columns in the join result
expected at
 that level.
4. The join clauses are added as part of ON clause
5. Any clauses that planner has deemed fit to be evaluated at
that level
of join
 are added as part of WHERE clause.



Honestly, I'm not sure that that is a good idea.  One reason for
that is that a query string constructed by the procedure is
difficult to read especially when the procedure is applied
recursively.  So, I'm thinking to revise the procedure so as to
construct a query string with a flattened FROM clause, as discussed
in eg, [2].



Just to confirm, the hook discussed in [2] is not in place right? I can
find only one hook for foreign join
  50 typedef void (*GetForeignJoinPaths_function) (PlannerInfo *root,
  51
RelOptInfo *joinrel,
  52 RelOptInfo
*outerrel,
  53 RelOptInfo
*innerrel,
  54   JoinType
jointype,
  55
JoinPathExtraData *extra);
This hook takes an inner and outer relation, so can not be used for
N-way join as discussed in that thread.

Are you suggesting that we should add that hook before we implement join
pushdown in postgres_fdw? Am I missing something?


I don't mean it.  I'm thinking that I'll just revise the procedure so as 
to generate a FROM clause that is something like "from c left join d on 
(...) full join e on (...)" based on the existing hook you mentioned.



TODOs
=



In another thread Robert, Fujita-san and Kaigai-san are
discussing about
EvalPlanQual support for foreign joins. Corresponding changes to
postgres_fdw
will need to be added once those changes get committed.



Yeah, we would need those changes including helper functions to
create a local join execution plan for that support.  I'd like to
add those changes to your updated patch if it's okay.



Right now, we do not have any support for postgres_fdw join pushdown. I
was thinking of adding at least minimal support for the same using this
patch, may be by preventi

[HACKERS] Minor comment update in setrefs.c

2015-12-08 Thread Etsuro Fujita
Hi,

Attached is a small patch to adjust a comment in setrefs.c; in
set_foreignscan_references, fdw_recheck_quals also gets adjusted to
reference foreign scan tuple, in case of a foreign join, so I added
"etc.", to a comment there, as the comment in case of a simple foreign
table scan.

Best regards,
Etsuro Fujita
*** a/src/backend/optimizer/plan/setrefs.c
--- b/src/backend/optimizer/plan/setrefs.c
***
*** 1108,1114  set_foreignscan_references(PlannerInfo *root,
  
  	if (fscan->fdw_scan_tlist != NIL || fscan->scan.scanrelid == 0)
  	{
! 		/* Adjust tlist, qual, fdw_exprs to reference foreign scan tuple */
  		indexed_tlist *itlist = build_tlist_index(fscan->fdw_scan_tlist);
  
  		fscan->scan.plan.targetlist = (List *)
--- 1108,1117 
  
  	if (fscan->fdw_scan_tlist != NIL || fscan->scan.scanrelid == 0)
  	{
! 		/*
! 		 * Adjust tlist, qual, fdw_exprs, etc. to reference foreign scan
! 		 * tuple
! 		 */
  		indexed_tlist *itlist = build_tlist_index(fscan->fdw_scan_tlist);
  
  		fscan->scan.plan.targetlist = (List *)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-12-06 Thread Etsuro Fujita

On 2015/12/05 5:15, Robert Haas wrote:

On Tue, Dec 1, 2015 at 10:20 PM, Etsuro Fujita
 wrote:

One thing I can think of is that we can keep both the structure of a
ForeignPath node and the API of create_foreignscan_path as-is.  The latter
is a good thing for FDW authors.  And IIUC the patch you posted today, I
think we could make create_foreignscan_plan a bit simpler too.  Ie, in your
patch, you modified that function as follows:

@@ -2129,7 +2134,9 @@ create_foreignscan_plan(PlannerInfo *root, ForeignPath
*best_path,
  */
 scan_plan = rel->fdwroutine->GetForeignPlan(root, rel, rel_oid,

best_path,
-
tlist, scan_clauses);
+
tlist,
+
scan_clauses);
+   outerPlan(scan_plan) = fdw_outerplan;

I think that would be OK, but I think we would have to do a bit more here
about the fdw_outerplan's targetlist and qual; I think that the targetlist
needs to be changed to fdw_scan_tlist, as in the patch [1], and that it'd be
better to change the qual to remote conditions, ie, quals not in the
scan_plan's scan.plan.qual, to avoid duplicate evaluation of local
conditions.  (In the patch [1], I didn't do anything about the qual because
the current postgres_fdw join pushdown patch assumes that all the the
scan_plan's scan.plan.qual are pushed down.) Or, FDW authors might want to
do something about fdw_recheck_quals for a foreign-join while creating the
fdw_outerplan.  So if we do that during GetForeignPlan, I think we could
make create_foreignscan_plan a bit simpler, or provide flexibility to FDW
authors.



It's certainly true that we need the alternative plan's tlist to match
that of the main plan; otherwise, it's going to be difficult for the
FDW to make use of that alternative subplan to fill its slot, which is
kinda the point of all this.


OK.


However, I'm quite reluctant to
introduce code into create_foreignscan_plan() that forces the
subplan's tlist to match that of the main plan.  For one thing, that
would likely foreclose the possibility of an FDW ever using the outer
plan for any purpose other than EPQ rechecks.  It may be hard to
imagine what else you'd do with the outer plan as things are today,
but right now the two haves of the patch - letting FDWs have an outer
subplan, and providing them with a way of overriding the EPQ recheck
behavior - are technically independent.  Putting tlist-altering
behavior into create_foreignscan_plan() ties those two things together
irrevocably.


Agreed.


Instead, I think we should go the opposite direction and pass the
outerplan to GetForeignPlan after all.  I was lulled into a full sense
of security by the realization that every FDW that uses this feature
MUST want to do outerPlan(scan_plan) = fdw_outerplan.  That's true,
but irrelevant.  The point is that the FDW might want to do something
additional, like frob the outer plan's tlist, and it can't do that if
we don't pass it fdw_outerplan.  So we should do that, after all.


As I proposed upthread, another idea would be to 1) to store an 
fdw_outerpath in the fdw_private list of a ForeignPath node, and then 2) 
to create an fdw_outerplan from *the fdw_outerpath stored into
the fdw_private* in GetForeignPlan.  One good thing for this is that we 
keep the API of create_foreignscan_path as-is.  What do you think about 
that?



Updated patch attached.  This fixes a couple of whitespace issues that
were pointed out, also.


Thanks for updating the patch!

Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining 9.5 open items

2015-12-03 Thread Etsuro Fujita

On 2015/12/04 11:51, Noah Misch wrote:

On Tue, Dec 01, 2015 at 11:05:47AM -0500, Robert Haas wrote:

On Mon, Nov 30, 2015 at 2:43 PM, Tom Lane  wrote:

* Foreign join pushdown vs EvalPlanQual

Is this fixed by 5fc4c26db?  If not, what remains to do?



Unfortunately, no.  That commit allows FDWs to do proper EPQ handling
for plain table scans, but it proves to be inadequate for EPQ handling
for joins. Solving that problem will require another patch, and,
modulo a bunch of cosmetic issues, I'm reasonably happy with KaiGai
Kohei's latest submission.  I'll respond in more detail on that
thread, but the question I want to raise here is: do we want to
back-patch those changes to 9.5 at this late date?



Yes.  If 9.5 added a bad interface, better to fix the interface even now than
to live with the bad one.


I'd vote for fixing this.

I think the latest version of the patch for this is in good shape, but 
that would need some changes as proposed on that thread.  So, if there 
are no objections, I'll update the patch.


Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2015-12-02 Thread Etsuro Fujita

Hi Ashutosh,

On 2015/12/02 20:45, Ashutosh Bapat wrote:

It's been a long time since last patch on this thread was posted. I have
started
to work on supporting join pushdown for postgres_fdw.


Thanks for the work!


Generating paths




A join between two foreign relations is considered safe to push down if
1. The joining sides are pushable
2. The type of join is OUTER or INNER (LEFT/RIGHT/FULL/INNER). SEMI and ANTI
joins are not considered right now, because of difficulties in
constructing
the queries involving those. The join clauses of SEMI/ANTI joins are
not in a
form that can be readily converted to IN/EXISTS/NOT EXIST kind of
expression.
We might consider this as future optimization.
3. Joining sides do not have clauses which can not be pushed down to the
foreign
server. For an OUTER join this is important since those clauses need
to be
applied before performing the join and thus join can not be pushed
to the
foreign server. An example is
SELECT * FROM ft1 LEFT JOIN (SELECT * FROM ft2 where local_cond) ft2
ON (join clause)
Here the local_cond on ft2 needs to be executed before performing
LEFT JOIN
between ft1 and ft2.
This condition can be relaxed for an INNER join by pulling the local
clauses
up the join tree. But this needs more investigation and is not
considered in
this version.
4. The join conditions (e.g. conditions in ON clause) are all safe to
push down.
This is important for OUTER joins as pushing down join clauses
partially and
applying rest locally changes the result. There are ways [1] by
which partial
OUTER join can be completed by applying unpushable clauses locally
and then
nullifying the nullable side and eliminating duplicate non-nullable side
rows. But that's again out of scope of first version of postgres_fdw
join
pushdown.


As for 4, as commented in the patch, we could relax the requirement that 
all the join conditions (given by JoinPathExtraData's restrictlist) need 
to be safe to push down to the remote server;

* In case of inner join, all the conditions would not need to be safe.
* In case of outer join, all the "otherclauses" would not need to be 
safe, while I think all the "joinclauses" need to be safe to get the 
right results (where "joinclauses" and "otherclauses" are defined by 
extract_actual_join_clauses).  And I think we should do this relaxation 
to some extent for 9.6, to allow more joins to be pushed down.  I don't 
know about [1].  May I see more information about [1]?



Generating plan
===



Rest of this section describes the logic to construct
the SQL
for join; the logic is implemented as function deparseSelectSqlForRel().

deparseSelectSqlForRel() builds the SQL for given joinrel (and now for
baserel
asd well) recursively.
For joinrels
1. it constructs SQL representing either side of join, by calling itself
in recursive fashion.
2. These SQLs are converted into subqueries and become part of the FROM
clause
with appropriate JOIN type and clauses. The left and right
subqueries are
given aliases "l" and "r" respectively. The columns in each subquery are
aliased as "a1", "a2", "a3" and so on. Thus the third column on left
side can
be referenced as "l.a3" at any recursion level.
3. Targetlist is added representing the columns in the join result
expected at
that level.
4. The join clauses are added as part of ON clause
5. Any clauses that planner has deemed fit to be evaluated at that level
of join
are added as part of WHERE clause.


Honestly, I'm not sure that that is a good idea.  One reason for that is 
that a query string constructed by the procedure is difficult to read 
especially when the procedure is applied recursively.  So, I'm thinking 
to revise the procedure so as to construct a query string with a 
flattened FROM clause, as discussed in eg, [2].



TODOs
=
This patch is very much WIP patch to show case the approach and invite early
comments. I will continue to improve the patch and some of the areas
that will
be improved are
1. Costing of foreign join paths.
2. Various TODOs in the patch, making it more readable, finishing etc.
3. Tests
4. Any comments/suggestions on approach or the attached patch.


That would be great!


In another thread Robert, Fujita-san and Kaigai-san are discussing about
EvalPlanQual support for foreign joins. Corresponding changes to
postgres_fdw
will need to be added once those changes get committed.


Yeah, we would need those changes including helper functions to create a 
local join execution plan for that support.  I'd like to add those 
changes to your updated patch if it's okay.


Best regards,
Etsuro Fujita

[2] 
http://www.postgresql.org/message-id/ca+tgmozh9pb8bc+z3re7wo8cwuxaf7vp3066isg39qfr1jj...@mail.gmail.com





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-12-02 Thread Etsuro Fujita

On 2015/12/02 1:54, Robert Haas wrote:

On Fri, Nov 27, 2015 at 1:25 AM, Kouhei Kaigai  wrote:

Sorry, I don't understand this.  In my understanding, fdw_recheck_quals
can be defined for a foreign join, regardless of the join type,



Yes, "can be defined", but will not be workable if either side of
joined tuple is NULL because of outer join. SQL functions returns
NULL prior to evaluation, then ExecQual() treats this result as FALSE.
However, a joined tuple that has NULL fields may be a valid tuple.

We don't need to care about unmatched tuple if INNER JOIN.



This is a really good point, and a very strong argument for the design
KaiGai has chosen here.


Maybe my explanation was not enough.  Sorry about that.  But I mean that 
we define fdw_recheck_quals for a foreign-join as quals that 1) were 
extracted by extract_actual_join_clauses as "otherclauses" 
(rinfo->is_pushed_down=true) and that 2) were pushed down to the remote 
server, not scan quals relevant to all the base tables invoved in the 
foreign-join.  So in this definition, I think fdw_recheck_quals for a 
foreign-join will be workable, regardless of the join type.


Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-12-02 Thread Etsuro Fujita

On 2015/12/02 14:54, Kouhei Kaigai wrote:

On 2015/12/02 1:41, Robert Haas wrote:

On Thu, Nov 26, 2015 at 7:59 AM, Etsuro Fujita
 wrote:

The attached patch adds: Path *fdw_outerpath field to ForeignPath node.
FDW driver can set arbitrary but one path-node here.
After that, this path-node shall be transformed to plan-node by
createplan.c, then passed to FDW driver using GetForeignPlan callback.



I understand this, as I also did the same thing in my patches, but actually,
that seems a bit complicated to me.  Instead, could we keep the
fdw_outerpath in the fdw_private of a ForeignPath node when creating the
path node during GetForeignPaths, and then create an outerplan accordingly
from the fdw_outerpath stored into the fdw_private during GetForeignPlan, by
using create_plan_recurse there?  I think that that would make the core
involvment much simpler.



I can't see how it's going to get much simpler than this.  The core
core is well under a hundred lines, and it all looks pretty
straightforward to me.  All of our existing path and plan types keep
lists of paths and plans separate from other kinds of data, and I
don't think we're going to win any awards for deviating from that
principle here.



One thing I can think of is that we can keep both the structure of a
ForeignPath node and the API of create_foreignscan_path as-is.  The
latter is a good thing for FDW authors.  And IIUC the patch you posted
today, I think we could make create_foreignscan_plan a bit simpler too.
   Ie, in your patch, you modified that function as follows:

@@ -2129,7 +2134,9 @@ create_foreignscan_plan(PlannerInfo *root,
ForeignPath *best_path,
 */
scan_plan = rel->fdwroutine->GetForeignPlan(root, rel, rel_oid,

best_path,
-
tlist, scan_clauses);
+
tlist,
+
scan_clauses);
+   outerPlan(scan_plan) = fdw_outerplan;

I think that would be OK, but I think we would have to do a bit more
here about the fdw_outerplan's targetlist and qual; I think that the
targetlist needs to be changed to fdw_scan_tlist, as in the patch [1],



Hmm... you are right. The sub-plan shall generate a tuple according to
the fdw_scan_tlist, if valid. Do you think the surgical operation is best
to apply alternative target-list than build_path_tlist()?


Sorry, I'm not sure about that.  I thought changing it to fdw_scan_tlist 
just because that's simple.



and that it'd be better to change the qual to remote conditions, ie,
quals not in the scan_plan's scan.plan.qual, to avoid duplicate
evaluation of local conditions.  (In the patch [1], I didn't do anything
about the qual because the current postgres_fdw join pushdown patch
assumes that all the the scan_plan's scan.plan.qual are pushed down.)
Or, FDW authors might want to do something about fdw_recheck_quals for a
foreign-join while creating the fdw_outerplan.  So if we do that during
GetForeignPlan, I think we could make create_foreignscan_plan a bit
simpler, or provide flexibility to FDW authors.



So, you suggest it is better to pass fdw_outerplan on the GetForeignPlan
callback, to allow FDW to adjust target-list and quals of sub-plans.


I think that is one option for us.  Another option, which I proposed 
above, is 1) store an fdw_outerpath in the fdw_private when creating the 
ForeignPath node in GetForeignPaths, and then 2) create an fdw_outerplan 
from the fdw_outerpath stored into the fdw_private when creating the 
ForeignScan node in GetForeignPlan, by using create_plan_recurse in 
GetForeignPlan.  (To do so, I was thinking to make that function 
extern.)  One good point about that is that we can keep the API of 
create_foreignscan_path as-is, which I think would be a good thing for 
FDW authors that don't care about join pushdown.



I think it is reasonable argue. Only FDW knows which qualifiers are
executable on remote side, so it is not easy to remove qualifiers to be
executed on host-side only, from the sub-plan tree.


Yeah, we could provide the flexibility to FDW authors.


@@ -85,6 +86,18 @@ ForeignRecheck(ForeignScanState *node, TupleTableSlot
*slot)

  ResetExprContext(econtext);

+   /*
+* FDW driver has to recheck visibility of EPQ tuple towards
+* the scan qualifiers once it gets pushed down.
+* In addition, if this node represents a join sub-tree, not
+* a scan, FDW driver is also responsible to reconstruct
+* a joined tuple according to the primitive EPQ tuples.
+*/
+   if (fdwroutine->RecheckForeignScan)
+   {
+   if (!fdwroutine->RecheckForeignScan(node, slot))
+   return false;
+   }

Maybe I'm missing something, but I think we should let FDW do the work if
scanrelid==0, not just if fdwroutine->RecheckForeignScan is given. (And if
scanrelid==0 and fdwroutine->RecheckForeign

Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-12-02 Thread Etsuro Fujita

On 2015/12/02 1:53, Robert Haas wrote:

On Fri, Nov 27, 2015 at 1:33 AM, Etsuro Fujita
 wrote: Plan   *plan =
&node->scan.plan;

@@ -3755,7 +3763,7 @@ make_foreignscan(List *qptlist,
 /* cost will be filled in by create_foreignscan_plan */
 plan->targetlist = qptlist;
 plan->qual = qpqual;
-   plan->lefttree = NULL;
+   plan->lefttree = fdw_outerplan;
 plan->righttree = NULL;
 node->scan.scanrelid = scanrelid;

I think that that would break the EXPLAIN output.



In what way?  EXPLAIN recurses into the left and right trees of every
plan node regardless of what type it is, so superficially I feel like
this ought to just work. What problem do you foresee?

I do think that ExecInitForeignScan ought to be changed to
ExecInitNode on it's outer plan if present rather than leaving that to
the FDW's BeginForeignScan method.


IIUC, I think the EXPLAIN output for eg,

select localtab.* from localtab, ft1, ft2 where localtab.a = ft1.a and 
ft1.a = ft2.a for update


would be something like this:

 LockRows
   ->  Nested Loop
 Join Filter: (ft1.a = localtab.a)
 ->  Seq Scan on localtab
 ->  Foreign Scan on ft1/ft2-foreign-join
   -> Nested Loop
Join Filter: (ft1.a = ft2.a)
->  Foreign Scan on ft1
->  Foreign Scan on ft2

The subplan below the Foreign Scan on the foreign-join seems odd to me. 
 One option to avoid that is to handle the subplan as in my patch [2], 
which I created to address your comment that we should not break the 
equivalence discussed below.  I'm not sure that the patch's handling of 
chgParam for the subplan is a good idea, though.



One option to avoid that
is to set the fdw_outerplan in ExecInitForeignScan as in my patch [1], or
BeginForeignScan as you proposed.  That breaks the equivalence that the Plan
tree and the PlanState tree should be mirror images of each other, but I
think that that break would be harmless.



I'm not sure how many times I have to say this, but we are not doing
that.  I will not commit any patch that does that, and I will
vigorously argue against anyone else committing such a patch either.
That *would* break EXPLAIN, because EXPLAIN relies on being able to
walk the PlanState tree and find all the Plan nodes from the
corresponding PlanState nodes.  Now you might think that it would be
OK to omit a plan node that we decided we weren't ever going to
execute, but today we don't do that, and I don't think we should.  I
think it could be very confusing if EXPLAIN and EXPLAIN ANALYZE show
different sets of plan nodes for the same query.  Quite apart from
EXPLAIN, there are numerous other places that assume that they can
walk the PlanState tree and find all the Plan nodes.  Breaking that
assumption would be bad news.


Agreed.  Thanks for the explanation!

Best regards,
Etsuro Fujita

[2] http://www.postgresql.org/message-id/5624d583.10...@lab.ntt.co.jp




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-12-01 Thread Etsuro Fujita

On 2015/12/02 1:41, Robert Haas wrote:

On Thu, Nov 26, 2015 at 7:59 AM, Etsuro Fujita
 wrote:

The attached patch adds: Path *fdw_outerpath field to ForeignPath node.
FDW driver can set arbitrary but one path-node here.
After that, this path-node shall be transformed to plan-node by
createplan.c, then passed to FDW driver using GetForeignPlan callback.



I understand this, as I also did the same thing in my patches, but actually,
that seems a bit complicated to me.  Instead, could we keep the
fdw_outerpath in the fdw_private of a ForeignPath node when creating the
path node during GetForeignPaths, and then create an outerplan accordingly
from the fdw_outerpath stored into the fdw_private during GetForeignPlan, by
using create_plan_recurse there?  I think that that would make the core
involvment much simpler.



I can't see how it's going to get much simpler than this.  The core
core is well under a hundred lines, and it all looks pretty
straightforward to me.  All of our existing path and plan types keep
lists of paths and plans separate from other kinds of data, and I
don't think we're going to win any awards for deviating from that
principle here.


One thing I can think of is that we can keep both the structure of a 
ForeignPath node and the API of create_foreignscan_path as-is.  The 
latter is a good thing for FDW authors.  And IIUC the patch you posted 
today, I think we could make create_foreignscan_plan a bit simpler too. 
 Ie, in your patch, you modified that function as follows:


@@ -2129,7 +2134,9 @@ create_foreignscan_plan(PlannerInfo *root, 
ForeignPath *best_path,

 */
scan_plan = rel->fdwroutine->GetForeignPlan(root, rel, rel_oid,

best_path,
-   
tlist, scan_clauses);
+   
tlist,
+   
scan_clauses);
+   outerPlan(scan_plan) = fdw_outerplan;

I think that would be OK, but I think we would have to do a bit more 
here about the fdw_outerplan's targetlist and qual; I think that the 
targetlist needs to be changed to fdw_scan_tlist, as in the patch [1], 
and that it'd be better to change the qual to remote conditions, ie, 
quals not in the scan_plan's scan.plan.qual, to avoid duplicate 
evaluation of local conditions.  (In the patch [1], I didn't do anything 
about the qual because the current postgres_fdw join pushdown patch 
assumes that all the the scan_plan's scan.plan.qual are pushed down.) 
Or, FDW authors might want to do something about fdw_recheck_quals for a 
foreign-join while creating the fdw_outerplan.  So if we do that during 
GetForeignPlan, I think we could make create_foreignscan_plan a bit 
simpler, or provide flexibility to FDW authors.



@@ -85,6 +86,18 @@ ForeignRecheck(ForeignScanState *node, TupleTableSlot
*slot)

 ResetExprContext(econtext);

+   /*
+* FDW driver has to recheck visibility of EPQ tuple towards
+* the scan qualifiers once it gets pushed down.
+* In addition, if this node represents a join sub-tree, not
+* a scan, FDW driver is also responsible to reconstruct
+* a joined tuple according to the primitive EPQ tuples.
+*/
+   if (fdwroutine->RecheckForeignScan)
+   {
+   if (!fdwroutine->RecheckForeignScan(node, slot))
+   return false;
+   }

Maybe I'm missing something, but I think we should let FDW do the work if
scanrelid==0, not just if fdwroutine->RecheckForeignScan is given. (And if
scanrelid==0 and fdwroutine->RecheckForeignScan is not given, we should
abort the transaction.)



That would be unnecessarily restrictive.  On the one hand, even if
scanrelid != 0, the FDW can decide that it prefers to do the rechecks
using RecheckForeignScan rather than fdw_recheck_quals.  For most
FDWs, I expect using fdw_recheck_quals to be more convenient, but
there may be cases where somebody prefers to use RecheckForeignScan,
and allowing that costs nothing.


I suppose that the flexibility would probably be a good thing, but I'm a 
little bit concerned that that might be rather confusing to FDW authors. 
 Maybe I'm missing something, though.


Best regards,
Etsuro Fujita

[1] http://www.postgresql.org/message-id/5624d583.10...@lab.ntt.co.jp




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-26 Thread Etsuro Fujita
ignscan(List *qptlist,
 List *fdw_exprs,
 List *fdw_private,
 List *fdw_scan_tlist,
-List *fdw_recheck_quals)
+List *fdw_recheck_quals,
+Plan *fdw_outerplan)
 {
ForeignScan *node = makeNode(ForeignScan);
Plan   *plan = &node->scan.plan;
@@ -3755,7 +3763,7 @@ make_foreignscan(List *qptlist,
/* cost will be filled in by create_foreignscan_plan */
plan->targetlist = qptlist;
plan->qual = qpqual;
-   plan->lefttree = NULL;
+   plan->lefttree = fdw_outerplan;
plan->righttree = NULL;
node->scan.scanrelid = scanrelid;

I think that that would break the EXPLAIN output.  One option to avoid 
that is to set the fdw_outerplan in ExecInitForeignScan as in my patch 
[1], or BeginForeignScan as you proposed.  That breaks the equivalence 
that the Plan tree and the PlanState tree should be mirror images of 
each other, but I think that that break would be harmless.


Best regards,
Etsuro Fujita

[1] http://www.postgresql.org/message-id/55def5f0@lab.ntt.co.jp



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-26 Thread Etsuro Fujita
On 2015/11/27 0:14, Kouhei Kaigai wrote:

>> The documentation says as following so I think the former has.
>>
>> # I don't understhad what 'can or must' means, though... 'can and
>> # must'?
>>
>> + Also, this callback can or must recheck scan qualifiers and join
>> + conditions which are pushed down. Especially, it needs special

> If fdw_recheck_quals is set up correctly and join type is inner join,
> FDW driver does not recheck by itself. Elsewhere, it has to recheck
> the joined tuple, not only reconstruction.

Sorry, I don't understand this.  In my understanding, fdw_recheck_quals
can be defined for a foreign join, regardless of the join type, and when
the fdw_recheck_quals are defined, the RecheckForeignScan callback
routine doesn't need to evaluate the fdw_recheck_quals by itself.  No?

Best regards,
Etsuro Fujita



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-26 Thread Etsuro Fujita
andling the es_epqScanDone flags.  (Do you 
think that such FDWs should do something like what ExecScanFtch is doing 
about the flags, in their RecheckForeignScans?  If so, I think we need 
docs for that.)



There seems to be no changes to make_foreignscan.  Is that OK?



create_foreignscan_path(), not only make_foreignscan().


OK


This patch is not tested by actual FDW extensions, so it is helpful
to enhance postgres_fdw to run the alternative sub-plan on EPQ recheck.


Will do.

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW

2015-11-26 Thread Etsuro Fujita

Hi Thom,

Thank you for paying attention to this!

On 2015/11/25 20:36, Thom Brown wrote:

On 13 May 2015 at 04:10, Etsuro Fujita  wrote:

On 2015/05/13 0:55, Stephen Frost wrote:

While the EXPLAIN output changed, the structure hasn't really changed
from what was discussed previously and there's not been any real
involvment from the core code in what's happening here.

Clearly, the documentation around how to use the FDW API hasn't changed
at all and there's been no additions to it for handling bulk work.
Everything here continues to be done inside of postgres_fdw, which
essentially ignores the prescribed "Update/Delete one tuple" interface
for ExecForeignUpdate/ExecForeignDelete.

I've spent the better part of the past two days trying to reason my way
around that while reviewing this patch and I haven't come out the other
side any happier with this approach than I was back in
20140911153049.gc16...@tamriel.snowman.net.

There are other things that don't look right to me, such as what's going
on at the bottom of push_update_down(), but I don't think there's much
point going into it until we figure out what the core FDW API here
should look like.  It might not be all that far from what we have now,
but I don't think we can just ignore the existing, documented, API.



OK, I'll try to introduce the core FDW API for this (and make changes to the
core code) to address your previous comments.



I'm a bit behind in reading up on this, so maybe it's been covered
since, but is there a discussion of this API on another thread, or a
newer patch available?


Actually, I'm now working on this.  My basic idea is to add new FDW APIs 
for handling the bulk work, in order not to make messy the prescribed 
"Update/Delete one tuple" interface; 1) add to nodeModifyTable.c or 
nodeForeignscan.c the new FDW APIs BeginPushedDownForeignModify, 
ExecPushedDownForeignModify and EndPushedDownForeignModify for that, and 
2) call these FDW APIs, instead of BeginForeignModify, 
ExecForeignUpdate/ExecForeignDelete and EndForeignModify, when doing 
update pushdown.  I'd like to propose that in more detail as soon as 
possible, probably with an updated patch.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-23 Thread Etsuro Fujita

On 2015/11/09 9:26, Kouhei Kaigai wrote:

The attached patch is an adjusted version of the previous one.


There seems to be no changes to make_foreignscan.  Is that OK?

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-23 Thread Etsuro Fujita

On 2015/11/20 22:45, Kouhei Kaigai wrote:
I wrote:

* This patch means we can define fdw_recheck_quals even for the case of
foreign tables with non-NIL fdw_scan_tlist.  However, we discussed in
another thread [1] that such foreign tables might break EvalPlanQual
tests.  Where are we on that issue?



In case of later locking, RefetchForeignRow() will set a base tuple
that have compatible layout of the base relation, not fdw_scan_tlist,
because RefetchForeignRow() does not have information about scan node.


IIUC, I think the base tuple would be stored into EPQ state not only in 
case of late row locking but in case of early row locking.



* For the case of foreign joins, I think fdw_recheck_quals can be
defined for example, the same way as for the case of foreign tables, ie,
quals not in scan.plan.qual, or ones defined as "otherclauses"
(rinfo->is_pushed_down=true) pushed down to the remote.  But since it's
required that the FDW has to add to the fdw_scan_tlist the set of
columns needed to check quals in fdw_recheck_quals in preparation for
EvalPlanQual tests, it's likely that fdw_scan_tlist will end up being
long, leading to an increase in a total data transfer amount from the
remote.  So, that seems not practical to me.  Maybe I'm missing
something, but what use cases are you thinking?



It is trade-off. What solution do you think we can have?
To avoid data transfer used for EPQ recheck only, we can implement
FDW driver to issue remote join again on EPQ recheck, however, it
is not a wise design, isn't it?

If we would be able to have no extra data transfer and no remote
join execution during EPQ recheck, it is a perfect.


I was thinking that in an approach using a local join execution plan, I 
would just set fdw_recheck_quals set to NIL and evaluate the 
otherclauses as part of the local join execution plan, so that 
fdw_scan_tlist won't end up being longer, as in the patch [1].  (Note 
that in that patch, remote_exprs==NIL when calling make_foreignscan 
during postgresGetForeignPlan in case of foreign joins.)


Best regards,
Etsuro Fujita

[1] http://www.postgresql.org/message-id/5624d583.10...@lab.ntt.co.jp



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-23 Thread Etsuro Fujita

On 2015/11/24 2:41, Robert Haas wrote:

On Fri, Nov 20, 2015 at 12:11 AM, Kouhei Kaigai  wrote:

One subplan means FDW driver run an entire join sub-tree with local
alternative sub-plan; that is my expectation for the majority case.



What I'm imagining is that we'd add handling that allows the
ForeignScan to have inner and outer children.  If the FDW wants to
delegate the EvalPlanQual handling to a local plan, it can use the
outer child for that.  Or the inner one, if it likes.  The other one
is available for some other purposes which we can't imagine yet.  If
this is too weird, we can only add handling for an outer subplan and
forget about having an inner subplan for now.  I just thought to make
it symmetric, since outer and inner subplans are pretty deeply baked
into the structure of the system.


I'd vote for only allowing an outer subplan.

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-20 Thread Etsuro Fujita

On 2015/11/19 12:32, Robert Haas wrote:

On Tue, Nov 17, 2015 at 8:47 PM, Kouhei Kaigai  wrote:

The attached patch is the portion cut from the previous EPQ recheck
patch.



Thanks, committed.


Thanks, Robert and KaiGai-san.

Sorry, I'm a bit late to the party.  Here are my questions:

* This patch means we can define fdw_recheck_quals even for the case of 
foreign tables with non-NIL fdw_scan_tlist.  However, we discussed in 
another thread [1] that such foreign tables might break EvalPlanQual 
tests.  Where are we on that issue?


* For the case of foreign joins, I think fdw_recheck_quals can be 
defined for example, the same way as for the case of foreign tables, ie, 
quals not in scan.plan.qual, or ones defined as "otherclauses" 
(rinfo->is_pushed_down=true) pushed down to the remote.  But since it's 
required that the FDW has to add to the fdw_scan_tlist the set of 
columns needed to check quals in fdw_recheck_quals in preparation for 
EvalPlanQual tests, it's likely that fdw_scan_tlist will end up being 
long, leading to an increase in a total data transfer amount from the 
remote.  So, that seems not practical to me.  Maybe I'm missing 
something, but what use cases are you thinking?


Best regards,
Etsuro Fujita

[1] http://www.postgresql.org/message-id/55af3c08.1070...@lab.ntt.co.jp



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-19 Thread Etsuro Fujita

On 2015/11/20 6:57, Robert Haas wrote:

On Wed, Nov 18, 2015 at 10:54 PM, Etsuro Fujita
 wrote:

Noted, but let's do it that way and move on.  It would be a shame if
we didn't end up with a working FDW join pushdown system in 9.6
because of a disagreement on this point.



Another idea would be to consider join pushdown as unsupported for now when
select-for-update is involved in 9.5, as described in [1], and revisit this
issue when adding join pushdown to postgres_fdw in 9.6.



Well, I think it's probably too late to squeeze this into 9.5 at this
point, but I'm eager to get it fixed for 9.6.


OK, I'll update the postgres_fdw-join-pushdown patch so as to work with 
that callback routine, if needed.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-18 Thread Etsuro Fujita

On 2015/11/19 12:34, Robert Haas wrote:

On Tue, Nov 17, 2015 at 9:30 PM, Etsuro Fujita
 wrote:

I suppose you (and KaiGai-san) are probably right, but I really fail to see
it actually doing that.



Noted, but let's do it that way and move on.  It would be a shame if
we didn't end up with a working FDW join pushdown system in 9.6
because of a disagreement on this point.


Another idea would be to consider join pushdown as unsupported for now 
when select-for-update is involved in 9.5, as described in [1], and 
revisit this issue when adding join pushdown to postgres_fdw in 9.6.


Best regards,
Etsuro Fujita

[1] https://wiki.postgresql.org/wiki/Open_Items



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minor comment improvement to create_foreignscan_plan

2015-11-18 Thread Etsuro Fujita

On 2015/11/19 5:29, Robert Haas wrote:

On Tue, Nov 17, 2015 at 9:29 PM, Etsuro Fujita
 wrote:

On 2015/11/18 2:57, Robert Haas wrote:

On Sun, Nov 15, 2015 at 9:25 PM, Etsuro Fujita
 wrote:

Oops, I've found another one.  I think we should update a comment in
postgresGetForeignPlan, too; add remote filtering expressions to the list
of
information needed to create a ForeignScan node.



Instead of saying "remote/local", how about saying "remote and local"
or just leaving it out altogether as in the attached?



+1 for your patch.



OK, committed.


Thanks!

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-17 Thread Etsuro Fujita

On 2015/11/18 3:19, Robert Haas wrote:

On Thu, Nov 12, 2015 at 12:54 AM, Etsuro Fujita
 wrote:

Really?  I think there would be not a little burden on an FDW author; when
postgres_fdw delegates to the subplan to the remote server, for example, it
would need to create a remote join query by looking at tuples possibly
fetched and stored in estate->es_epqTuple[], send the query and receive the
result during the callback routine.  Furthermore, what I'm most concerned
about is that wouldn't be efficient.  So, my question about that approach is
whether FDWs really do some thing like that during the callback routine,
instead of performing a secondary join plan locally.  As I said before, I
know that KaiGai-san considers that that approach would be useful for custom
joins.  But I see zero evidence that there is a good use-case for an FDW.



It could do that.  But it could also just invoke a subplan as you are
proposing.  Or at least, I think we should set it up so that such a
thing is possible.  In which case I don't see the problem.


I suppose you (and KaiGai-san) are probably right, but I really fail to 
see it actually doing that.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minor comment improvement to create_foreignscan_plan

2015-11-17 Thread Etsuro Fujita

On 2015/11/18 2:57, Robert Haas wrote:

On Sun, Nov 15, 2015 at 9:25 PM, Etsuro Fujita
 wrote:

Oops, I've found another one.  I think we should update a comment in
postgresGetForeignPlan, too; add remote filtering expressions to the list of
information needed to create a ForeignScan node.



Instead of saying "remote/local", how about saying "remote and local"
or just leaving it out altogether as in the attached?


+1 for your patch.

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-15 Thread Etsuro Fujita

On 2015/11/13 13:44, Kyotaro HORIGUCHI wrote:

I wrote:

What I think is, I
see zero evidence that there is a good use-case for an FDW to do
something other than doing an ExecProcNode in the callback routine, as I
said below, so I don't see the need to add such a routine while that
would cause maybe not a large, but not a little burden for writing such
a routine on FDW authors.


KaiGai-san wrote:

It is quite natural because we cannot predicate what kind of extension
is implemented on FDW interface. You might know the initial version of
PG-Strom is implemented on FDW (about 4 years before...). If I would
continue to stick FDW, it became a FDW driver with own join engine.



 From the standpoint of interface design, if we would not admit flexibility
of implementation unless community don't see a working example, a reasonable
tactics *for extension author* is to follow the interface restriction even
if it is not best approach from his standpoint.
It does not mean the approach by majority is also best for the minority.
It just requires the minority a compromise.



Or try to open the way to introduce the feature he/she wants.


I think the biggest difference between KaiGai-san's patch and mine is 
that KaiGai-san's patch introduces a callback routine to allow an FDW 
author not only to execute a secondary plan but to do something else, 
instead of executing the plan, if he/she wants to do so.  His approach 
would provide the flexibility, but IMHO I think major FDWs that would be 
implementing join pushdown, such as postgres_fdw, wouldn't be utilizing 
the flexibility; probably, they would be just executing the secondary 
plan in the routine.  Furthermore, since that for executing the plan, 
his approach would require that an FDW author has to add code not only 
for creating the plan but for initializing/executing/ending it to 
his/her FDW by itself while in my approach, he/she only has to add code 
for the plan creation, his approach would impose a more development 
burden on such major FDWs' authors than mine.  I think the flexibility 
would be a good thing, but I also think it's important not to burden FDW 
authors.  Maybe I'm missing something, though.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-15 Thread Etsuro Fujita

On 2015/11/13 11:31, Kouhei Kaigai wrote:

On 2015/11/12 2:53, Robert Haas wrote:

  From my point of view I'm now
thinking this solution has two parts:

(1) Let foreign scans have inner and outer subplans.  For this
purpose, we only need one, but it's no more work to enable both, so we
may as well.  If we had some reason, we could add a list of subplans
of arbitrary length, but there doesn't seem to be an urgent need for
that.


I wrote:

But one thing I'm concerned about is enable both inner and
outer plans, because I think that that would make the planner
postprocessing complicated, depending on what the foreign scans do by
the inner/outer subplans.  Is it worth doing so?  Maybe I'm missing
something, though.



If you persuade other person who has different opinion, you need to
explain why was it complicated, how much complicated and what was
the solution you tried at that time.
The "complicated" is a subjectively-based term. At least, we don't
share your experience, so it is hard to understand the how complexity.


I don't mean to object that idea.  I'm unfamiliar with that idea, so I 
just wanted to know the reason, or use cases.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minor comment improvement to create_foreignscan_plan

2015-11-15 Thread Etsuro Fujita

On 2015/11/12 19:02, Etsuro Fujita wrote:

On 2015/11/10 3:53, Robert Haas wrote:

On Mon, Nov 9, 2015 at 5:34 AM, Etsuro Fujita
 wrote:

Here is a small patch to update an comment in create_foreignscan_plan;
add fdw_recheck_quals to the list of expressions that need the
replace_nestloop_params processing.  I should have updated the comment
when I proposed the patch for the fdw_recheck_quals.



OK, not a big deal, but thanks.  Committed.



Thanks!


Oops, I've found another one.  I think we should update a comment in 
postgresGetForeignPlan, too; add remote filtering expressions to the 
list of information needed to create a ForeignScan node.


Best regards,
Etsuro Fujita
*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
***
*** 902,908  postgresGetForeignPlan(PlannerInfo *root,
  			 retrieved_attrs);
  
  	/*
! 	 * Create the ForeignScan node from target list, local filtering
  	 * expressions, remote parameter expressions, and FDW private information.
  	 *
  	 * Note that the remote parameter expressions are stored in the fdw_exprs
--- 902,908 
  			 retrieved_attrs);
  
  	/*
! 	 * Create the ForeignScan node from target list, remote/local filtering
  	 * expressions, remote parameter expressions, and FDW private information.
  	 *
  	 * Note that the remote parameter expressions are stored in the fdw_exprs

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-12 Thread Etsuro Fujita

Horiguchi-san,

On 2015/11/12 16:10, Kyotaro HORIGUCHI wrote:

I really don't see why you're fighting on this point.  Making this a
generic feature will require only a few extra lines of code for FDW
authors.  If this were going to cause some great inconvenience for FDW
authors, then I'd agree it isn't worth it.  But I see zero evidence
that this is actually the case.



Really?  I think there would be not a little burden on an FDW author;
when postgres_fdw delegates to the subplan to the remote server, for
example, it would need to create a remote join query by looking at
tuples possibly fetched and stored in estate->es_epqTuple[], send the
query and receive the result during the callback routine.



Do you mind that FDW cannot generate a plan so that make a tuple
from eqpTules then apply fdw_quals from predefined executor
nodes?


No.  Please see my previous email.  Sorry for my unfinished email.

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minor comment improvement to create_foreignscan_plan

2015-11-12 Thread Etsuro Fujita

On 2015/11/10 3:53, Robert Haas wrote:

On Mon, Nov 9, 2015 at 5:34 AM, Etsuro Fujita
 wrote:

Here is a small patch to update an comment in create_foreignscan_plan;
add fdw_recheck_quals to the list of expressions that need the
replace_nestloop_params processing.  I should have updated the comment
when I proposed the patch for the fdw_recheck_quals.



OK, not a big deal, but thanks.  Committed.


Thanks!

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-12 Thread Etsuro Fujita

Robert and Kaigai-san,

Sorry, I sent in an unfinished email.

On 2015/11/12 15:30, Kouhei Kaigai wrote:

On 2015/11/12 2:53, Robert Haas wrote:

On Sun, Nov 8, 2015 at 11:13 PM, Etsuro Fujita
 wrote:

To test this change, I think we should update the postgres_fdw patch so as
to add the RecheckForeignScan.

Having said that, as I said previously, I don't see much value in adding the
callback routine, to be honest.  I know KaiGai-san considers that that would
be useful for custom joins, but I don't think that that would be useful even
for foreign joins, because I think that in case of foreign joins, the
practical implementation of that routine in FDWs would be to create a
secondary plan and execute that plan by performing ExecProcNode, as my patch
does [1].  Maybe I'm missing something, though.



I really don't see why you're fighting on this point.  Making this a
generic feature will require only a few extra lines of code for FDW
authors.  If this were going to cause some great inconvenience for FDW
authors, then I'd agree it isn't worth it.  But I see zero evidence
that this is actually the case.



Really?  I think there would be not a little burden on an FDW author;
when postgres_fdw delegates to the subplan to the remote server, for
example, it would need to create a remote join query by looking at
tuples possibly fetched and stored in estate->es_epqTuple[], send the
query and receive the result during the callback routine.



I cannot understand why it is the only solution.


I didn't say that.


Furthermore,
what I'm most concerned about is that wouldn't be efficient. So, my



You have to add "because ..." sentence here because I and Robert
think a little inefficiency is not a problem.


Sorry, my explanation was not enough.  The reason for that is that in 
the above postgres_fdw case for example, the overhead in sending the 
query to the remote end and transferring the result to the local end 
would not be negligible.  Yeah, we might be able to apply a special 
handling for the improved efficiency when using early row locking, but 
otherwise can we do the same thing?



Please don't start the sentence from "I think ...". We all knows
your opinion, but what I've wanted to see is "the reason why my
approach is valuable is ...".


I didn't say that my approach is *valuable* either.  What I think is, I 
see zero evidence that there is a good use-case for an FDW to do 
something other than doing an ExecProcNode in the callback routine, as I 
said below, so I don't see the need to add such a routine while that 
would cause maybe not a large, but not a little burden for writing such 
a routine on FDW authors.



Nobody prohibits postgres_fdw performs a secondary join here.
All you need to do is, picking up a sub-plan tree from FDW's private
field then call ExecProcNode() inside the callback.



As I said before, I know that KaiGai-san considers that
that approach would be useful for custom joins.  But I see zero evidence
that there is a good use-case for an FDW.



 From my point of view I'm now
thinking this solution has two parts:

(1) Let foreign scans have inner and outer subplans.  For this
purpose, we only need one, but it's no more work to enable both, so we
may as well.  If we had some reason, we could add a list of subplans
of arbitrary length, but there doesn't seem to be an urgent need for
that.


I did the same thing in an earlier version of the patch I posted. 
Although I agreed on Robert's comment "The Plan tree and the PlanState 
tree should be mirror images of each other; breaking that equivalence 
will cause confusion, at least.", I think that that would make code much 
simpler, especially the code for setting chgParam for inner/outer 
subplans.  But one thing I'm concerned about is enable both inner and 
outer plans, because I think that that would make the planner 
postprocessing complicated, depending on what the foreign scans do by 
the inner/outer subplans.  Is it worth doing so?  Maybe I'm missing 
something, though.



(2) Add a recheck callback.

If the foreign data wrapper wants to adopt the solution you're
proposing, the recheck callback can call
ExecProcNode(outerPlanState(node)).  I don't think this should end up
being more than a few lines of code, although of course we should
verify that.


Yeah, I think FDWs would probably need to create a subplan accordingly 
at planning time, and then initializing/closing the plan at execution 
time.  I think we could facilitate subplan creation by providing helper 
functions for that, though.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-11 Thread Etsuro Fujita

On 2015/11/12 2:53, Robert Haas wrote:

On Sun, Nov 8, 2015 at 11:13 PM, Etsuro Fujita
 wrote:

To test this change, I think we should update the postgres_fdw patch so as
to add the RecheckForeignScan.

Having said that, as I said previously, I don't see much value in adding the
callback routine, to be honest.  I know KaiGai-san considers that that would
be useful for custom joins, but I don't think that that would be useful even
for foreign joins, because I think that in case of foreign joins, the
practical implementation of that routine in FDWs would be to create a
secondary plan and execute that plan by performing ExecProcNode, as my patch
does [1].  Maybe I'm missing something, though.



I really don't see why you're fighting on this point.  Making this a
generic feature will require only a few extra lines of code for FDW
authors.  If this were going to cause some great inconvenience for FDW
authors, then I'd agree it isn't worth it.  But I see zero evidence
that this is actually the case.


Really?  I think there would be not a little burden on an FDW author; 
when postgres_fdw delegates to the subplan to the remote server, for 
example, it would need to create a remote join query by looking at 
tuples possibly fetched and stored in estate->es_epqTuple[], send the 
query and receive the result during the callback routine.  Furthermore, 
what I'm most concerned about is that wouldn't be efficient.  So, my 
question about that approach is whether FDWs really do some thing like 
that during the callback routine, instead of performing a secondary join 
plan locally.  As I said before, I know that KaiGai-san considers that 
that approach would be useful for custom joins.  But I see zero evidence 
that there is a good use-case for an FDW.



From my point of view I'm now
thinking this solution has two parts:

(1) Let foreign scans have inner and outer subplans.  For this
purpose, we only need one, but it's no more work to enable both, so we
may as well.  If we had some reason, we could add a list of subplans
of arbitrary length, but there doesn't seem to be an urgent need for
that.

(2) Add a recheck callback.

If the foreign data wrapper wants to adopt the solution you're
proposing, the recheck callback can call
ExecProcNode(outerPlanState(node)).  I don't think this should end up
being more than a few lines of code, although of course we should
verify that.  So no problem: postgres_fdw and any other FDWs where the
remote side is a database can easily delegate to a subplan, and
anybody who wants to do something else still can.

What is not to like about that?





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Minor comment improvement to create_foreignscan_plan

2015-11-09 Thread Etsuro Fujita
Hi,

Here is a small patch to update an comment in create_foreignscan_plan;
add fdw_recheck_quals to the list of expressions that need the
replace_nestloop_params processing.  I should have updated the comment
when I proposed the patch for the fdw_recheck_quals.

Best regards,
Etsuro Fujita
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
***
*** 2141,2151  create_foreignscan_plan(PlannerInfo *root, ForeignPath *best_path,
  	scan_plan->fs_relids = best_path->path.parent->relids;
  
  	/*
! 	 * Replace any outer-relation variables with nestloop params in the qual
! 	 * and fdw_exprs expressions.  We do this last so that the FDW doesn't
! 	 * have to be involved.  (Note that parts of fdw_exprs could have come
! 	 * from join clauses, so doing this beforehand on the scan_clauses
! 	 * wouldn't work.)  We assume fdw_scan_tlist contains no such variables.
  	 */
  	if (best_path->path.param_info)
  	{
--- 2141,2152 
  	scan_plan->fs_relids = best_path->path.parent->relids;
  
  	/*
! 	 * Replace any outer-relation variables with nestloop params in the qual,
! 	 * fdw_exprs and fdw_recheck_quals expressions.  We do this last so that
! 	 * the FDW doesn't have to be involved.  (Note that parts of fdw_exprs
! 	 * or fdw_recheck_quals could have come from join clauses, so doing this
! 	 * beforehand on the scan_clauses wouldn't work.)  We assume
! 	 * fdw_scan_tlist contains no such variables.
  	 */
  	if (best_path->path.param_info)
  	{

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-09 Thread Etsuro Fujita

On 2015/11/09 13:40, Kouhei Kaigai wrote:

Having said that, as I said previously, I don't see much value in adding
the callback routine, to be honest.  I know KaiGai-san considers that
that would be useful for custom joins, but I don't think that that would
be useful even for foreign joins, because I think that in case of
foreign joins, the practical implementation of that routine in FDWs
would be to create a secondary plan and execute that plan by performing
ExecProcNode, as my patch does [1].  Maybe I'm missing something, though.



I've never denied that alternative local sub-plan is one of the best
approach for postgres_fdw, however, I've also never heard why you can
say the best approach for postgres_fdw is definitely also best for
others.
If we would justify less flexible interface specification because of
comfort for a particular extension, it should not be an extension,
but a built-in feature.
My standpoint has been consistent through the discussion; we can never
predicate which feature shall be implemented on FDW interface, therefore,
we also cannot predicate which implementation is best for EPQ rechecks
also. Only FDW driver knows which is the "best" for them, not us.


What the RecheckForeignScan routine does for the foreign-join case would 
be the following for tuples stored in estate->es_epqTuple[]:


1. Apply relevant restriction clauses, including fdw_recheck_quals, to 
the tuples for the baserels involved in a foreign-join, and see if the 
tuples still pass the clauses.


2. If so, form a join tuple, while applying relevant join clauses to the 
tuples, and set the join tuple in the given slot.  Else set empty.


I think these would be more efficiently processed internally in core 
than externally in FDWs.  That's why I don't see much value in adding 
the routine.  I have to admit that that means no flexibility, though.


However, the routine as-is doesn't seem good enough, either.  For 
example, since the routine is called after each of the tuples was 
re-fetched from the remote end or re-computed from the whole-row var and 
stored in the corresponding estate->es_epqTuple[], the routine wouldn't 
allow for what Robert proposed in [2].  To do such a thing, I think we 
would probably need to change the existing EPQ machinery more 
drastically and rethink the right place for calling the routine.


Best regards,
Etsuro Fujita

[2] 
http://www.postgresql.org/message-id/ca+tgmozdpu_fcspozxxpd1xvyq3czcawd7-x3avwbkgsfoh...@mail.gmail.com




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-08 Thread Etsuro Fujita

On 2015/11/09 9:26, Kouhei Kaigai wrote:

I think ForeignRecheck should first call ExecQual to test
fdw_recheck_quals.  If it returns false, return false.  If it returns
true, then give the FDW callback a chance, if one is defined.  If that
returns false, return false.   If we haven't yet returned false,
return true.



I think ExecQual on fdw_recheck_quals shall be called next to the
RecheckForeignScan callback, because econtext->ecxt_scantuple shall
not be reconstructed unless RecheckForeignScan callback is not called
if scanrelid==0.


I agree with KaiGai-san.  I think we can define fdw_recheck_quals for 
the foreign-join case as quals not in scan.plan.qual, the same way as 
the simple foreign scan case.  (In other words, the quals would be 
defind as "otherclauses", ie, rinfo->is_pushed_down=true, that have been 
pushed down to the remote server.  For checking the fdw_recheck_quals, 
however, I think we should reconstruct the join tuple first, which I 
think is essential for cases where an outer join is performed remotely, 
to avoid changing the semantics.  BTW, in my patch [1], a secondary plan 
will be created to evaluate such otherclauses after reconstructing the 
join tuple.



The attached patch is an adjusted version of the previous one.
Even though it co-exists a new callback and fdw_recheck_quals,
the callback is kicked first as follows.


Thanks for the patch!



@@ -85,6 +86,18 @@ ForeignRecheck(ForeignScanState *node, TupleTableSlot *slot)

ResetExprContext(econtext);

+   /*
+* FDW driver has to recheck visibility of EPQ tuple towards
+* the scan qualifiers once it gets pushed down.
+* In addition, if this node represents a join sub-tree, not
+* a scan, FDW driver is also responsible to reconstruct
+* a joined tuple according to the primitive EPQ tuples.
+*/
+   if (fdwroutine->RecheckForeignScan)
+   {
+   if (!fdwroutine->RecheckForeignScan(node, slot))
+   return false;
+   }
return ExecQual(node->fdw_recheck_quals, econtext, false);
  }


If callback is invoked first, FDW driver can reconstruct a joined tuple
with its comfortable way, then remaining checks can be done by ExecQual
and fds_recheck_quals on the caller side.
If callback would be located on the tail, FDW driver has no choice.


To test this change, I think we should update the postgres_fdw patch so 
as to add the RecheckForeignScan.


Having said that, as I said previously, I don't see much value in adding 
the callback routine, to be honest.  I know KaiGai-san considers that 
that would be useful for custom joins, but I don't think that that would 
be useful even for foreign joins, because I think that in case of 
foreign joins, the practical implementation of that routine in FDWs 
would be to create a secondary plan and execute that plan by performing 
ExecProcNode, as my patch does [1].  Maybe I'm missing something, though.


Best regards,
Etsuro Fujita

[1] http://www.postgresql.org/message-id/5624d583.10...@lab.ntt.co.jp



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-05 Thread Etsuro Fujita

On 2015/11/04 18:50, Etsuro Fujita wrote:

On 2015/11/04 17:10, Kouhei Kaigai wrote:

On 2015/10/28 6:04, Robert Haas wrote:

On Tue, Oct 20, 2015 at 12:39 PM, Etsuro Fujita
 wrote:

Sorry, my explanation was not correct.  (Needed to take in
caffeine.) What
I'm concerned about is the following:

SELECT * FROM localtab JOIN (ft1 LEFT JOIN ft2 ON ft1.x = ft2.x) ON
localtab.id = ft1.id FOR UPDATE OF ft1



If an EPQ recheck was invoked
due to a concurrent transaction on the remote server that changed
only the
value x of the ft1 tuple previously retrieved, then we would have to
generate a fake ft1/ft2-join tuple with nulls for ft2. (Assume that
the ft2
tuple previously retrieved was not a null tuple.) However, I'm not
sure how
we can do that in ForeignRecheck; we can't know for example, which
one is
outer and which one is inner, without an alternative local join
execution
plan.  Maybe I'm missing something, though.



I would expect it to issue a new query like: SELECT * FROM ft1 LEFT
JOIN ft2 WHERE ft1.x = ft2.x AND ft1.tid = $0 AND ft2.tid = $1.



We assume here that ft1 uses late row locking, so I thought the above
SQL should include "FOR UPDATE of ft1".  But I still don't think that
that is right; the SQL with "FOR UPDATE of ft1" wouldn't generate the
fake ft1/ft2-join tuple with nulls for ft2, as expected.  The reason for
that is that the updated version of the ft1 tuple wouldn't satisfy the
ft1.tid = $0 condition in an EPQ recheck, because the ctid for the
updated version of the ft1 tuple has changed.  (IIUC, I think that if we
use a TID scan for ft1, the SQL would generate the expected result,
because I think that the TID condition would be ignored in the EPQ
recheck, but I don't think it's guaranteed to use a TID scan for ft1.)
Maybe I'm missing something, though.



It looks to me, we should not use ctid system column to identify remote
row when postgres_fdw tries to support late row locking.



The "rowid" should not be changed once it is fetched from the remote side
until it is actually updated, deleted or locked, for correct
identification.
If ctid is used for this purpose, it is safe only when remote row is
locked
when it is fetched - it is exactly early row locking behavior, isn't it?



In case of SELECT FOR UPDATE, I think we are allowed to use ctid to
identify target rows for late row locking, but I think the above SQL
should be changed to something like this:

SELECT * FROM (SELECT * FROM ft1 WHERE ft1.tid = $0 FOR UPDATE) ss1 LEFT
JOIN (SELECT * FROM ft2 WHERE ft2.tid = $1) ss2 ON ss1.x = ss2.x


I noticed that the modofied SQL was still wrong; ss1 would produce no 
tuple, if using eg, a sequential scan for ss1, as discussed above. 
Sheesh, where is my brain?


I still think we are allowed to do that, but what is the right SQL for 
that?  In the current implementation of postgres_fdw, we need not take 
into consideration that what was fetched was an updated version of the 
tuple rather than the same version previously obtained, since that 
always uses at least REPEATABLE READ in the remote session.  But 
otherwise it would be possible that what was fetched was an updated 
version of the tuple, having a different ctid value, which wouldn't 
satisfy the condition like "ft1.tid = $0" in ss1 any more.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-04 Thread Etsuro Fujita

On 2015/11/04 17:28, Kouhei Kaigai wrote:

I think we need to consider a general solution that can be applied not
only to the case where the component tables in a foreign join all use
ROW_MARK_COPY but to the case where those tables use different rowmark
types such as ROW_MARK_COPY and ROW_MARK_EXCLUSIVE, as I pointed out
upthread.



In mixture case, FDW/CSP can choose local recheck & reconstruction based
on the EPQ tuples of base relation. Nobody enforce FDW/CSP to return
a joined tuple always even if author don't want to support the feature.
Why do you think it is not a generic solution? FDW/CSP driver "can choose"
the best solution according to its implementation and capability.


It looked to me that you were discussing only the case where component 
foreign tables in a foreign join all use ROW_MARK_COPY, so I commented 
that.  Sorry for my misunderstanding.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-04 Thread Etsuro Fujita

On 2015/11/04 17:10, Kouhei Kaigai wrote:

On 2015/10/28 6:04, Robert Haas wrote:

On Tue, Oct 20, 2015 at 12:39 PM, Etsuro Fujita
 wrote:

Sorry, my explanation was not correct.  (Needed to take in caffeine.) What
I'm concerned about is the following:

SELECT * FROM localtab JOIN (ft1 LEFT JOIN ft2 ON ft1.x = ft2.x) ON
localtab.id = ft1.id FOR UPDATE OF ft1

LockRows
-> Nested Loop
   Join Filter: (localtab.id = ft1.id)
   -> Seq Scan on localtab
   -> Foreign Scan on 
Remote SQL: SELECT * FROM ft1 LEFT JOIN ft2 WHERE ft1.x = ft2.x
FOR UPDATE OF ft1

Assume that ft1 performs late row locking.



If the SQL includes "FOR UPDATE of ft1", then it clearly performs
early row locking.  I assume you meant to omit that.



If an EPQ recheck was invoked
due to a concurrent transaction on the remote server that changed only the
value x of the ft1 tuple previously retrieved, then we would have to
generate a fake ft1/ft2-join tuple with nulls for ft2. (Assume that the ft2
tuple previously retrieved was not a null tuple.) However, I'm not sure how
we can do that in ForeignRecheck; we can't know for example, which one is
outer and which one is inner, without an alternative local join execution
plan.  Maybe I'm missing something, though.



I would expect it to issue a new query like: SELECT * FROM ft1 LEFT
JOIN ft2 WHERE ft1.x = ft2.x AND ft1.tid = $0 AND ft2.tid = $1.



We assume here that ft1 uses late row locking, so I thought the above
SQL should include "FOR UPDATE of ft1".  But I still don't think that
that is right; the SQL with "FOR UPDATE of ft1" wouldn't generate the
fake ft1/ft2-join tuple with nulls for ft2, as expected.  The reason for
that is that the updated version of the ft1 tuple wouldn't satisfy the
ft1.tid = $0 condition in an EPQ recheck, because the ctid for the
updated version of the ft1 tuple has changed.  (IIUC, I think that if we
use a TID scan for ft1, the SQL would generate the expected result,
because I think that the TID condition would be ignored in the EPQ
recheck, but I don't think it's guaranteed to use a TID scan for ft1.)
Maybe I'm missing something, though.



It looks to me, we should not use ctid system column to identify remote
row when postgres_fdw tries to support late row locking.

The documentation says:
   
http://www.postgresql.org/docs/devel/static/fdw-callbacks.html#FDW-CALLBACKS-UPDATE

   UPDATE and DELETE operations are performed against rows previously
   fetched by the table-scanning functions. The FDW may need extra information,
   such as a row ID or the values of primary-key columns, to ensure that it can
   identify the exact row to update or delete

The "rowid" should not be changed once it is fetched from the remote side
until it is actually updated, deleted or locked, for correct identification.
If ctid is used for this purpose, it is safe only when remote row is locked
when it is fetched - it is exactly early row locking behavior, isn't it?


Yeah, we should use early row locking for a target foreign table in 
UPDATE/DELETE.


In case of SELECT FOR UPDATE, I think we are allowed to use ctid to 
identify target rows for late row locking, but I think the above SQL 
should be changed to something like this:


SELECT * FROM (SELECT * FROM ft1 WHERE ft1.tid = $0 FOR UPDATE) ss1 LEFT 
JOIN (SELECT * FROM ft2 WHERE ft2.tid = $1) ss2 ON ss1.x = ss2.x



This should be significantly more efficient than fetching the base
rows from each of two tables with two separate queries.



Maybe I think we could fix the SQL, so I have to admit that, but I'm
just wondering (1) what would happen for the case when ft1 uses late row
rocking and ft2 uses early row rocking and (2) that would be still more
efficient than re-fetching only the base row from ft1.



It should be decision by FDW driver. It is not easy to estimate a certain
FDW driver mixes up early and late locking policy within a same remote join
query. Do you really want to support such a mysterious implementation?


Yeah, the reason for that is because GetForeignRowMarkType allows that.


Or, do you expect all the FDW driver is enforced to return a joined tuple
if remote join case?


No.  That wouldn't make sense if at least one component table involved 
in a foreign join uses the rowmark type other than ROW_MARK_COPY.



It is different from my idea; it shall be an extra
optimization option if FDW can fetch a joined tuple at once, but not always.
So, if FDW driver does not support this optimal behavior, your driver can
fetch two base tables then run local alternative join (or something other).


OK, so if we all agree that the joined-tuple optimization is just an 
option for the case where all the component tables use ROW_MARK_COPY, 
I'd propose to leave that for 9.6.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-04 Thread Etsuro Fujita

On 2015/11/03 22:15, Kouhei Kaigai wrote:

A challenge is that junk wholerow references on behalf of ROW_MARK_COPY
are injected by preprocess_targetlist(). It is earlier than the main path
consideration by query_planner(), thus, it is not predictable how remote
query shall be executed at this point.
If ROW_MARK_COPY, base tuple image is fetched using this junk attribute.
So, here is two options if we allow to put joined tuple on either of
es_epqTuple[].

options-1) We ignore record type definition. FDW returns a joined tuple
towards the whole-row reference of either of the base relations in this
join. The junk attribute shall be filtered out eventually and only FDW
driver shall see, so it is harmless to do (probably).
This option takes no big changes, however, we need a little brave to adopt.

options-2) We allow FDW/CSP to adjust target-list of the relevant nodes
after these paths get chosen by planner. It enables to remove whole-row
reference of base relations and add alternative whole-row reference instead
if FDW/CSP can support it.
This feature can be relevant to target-list push-down to the remote side,
not only EPQ rechecks, because adjustment of target-list means we allows
FDW/CSP to determine which expression shall be executed locally, or shall
not be.
I think, this option is more straightforward, however, needs a little bit
deeper consideration, because we have to design the best hook point and
need to ensure how path-ification will perform.

Therefore, I think we need two steps towards the entire solution.
Step-1) FDW/CSP will recheck base EPQ tuples and support local
reconstruction on the fly. It does not need something special
enhancement on the planner - so we can fix up by v9.5 release.
Step-2) FDW/CSP will support adjustment of target-list to add whole-row
reference of joined tuple instead of multiple base relations, then FDW/CSP
will be able to put a joined tuple on either of EPQ slot if it wants - it
takes a new feature enhancement, so v9.6 is a suitable timeline.

How about your opinion towards the direction?
I don't want to drop extra optimization opportunity, however, we are now in
November. I don't have enough brave to add none-obvious new feature here.


I think we need to consider a general solution that can be applied not 
only to the case where the component tables in a foreign join all use 
ROW_MARK_COPY but to the case where those tables use different rowmark 
types such as ROW_MARK_COPY and ROW_MARK_EXCLUSIVE, as I pointed out 
upthread.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ATT_FOREIGN_TABLE and ATWrongRelkindError()

2015-11-03 Thread Etsuro Fujita

On 2015/10/28 20:10, Robert Haas wrote:

On Fri, Oct 23, 2015 at 11:51 AM, Etsuro Fujita
 wrote:

BTW, I found an incorrect error message in ATWrongRelkindError. Attached is
a patch for fixing the message.



Committed and back-patched to 9.3.


Thanks!

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-03 Thread Etsuro Fujita

On 2015/10/28 6:04, Robert Haas wrote:

On Tue, Oct 20, 2015 at 12:39 PM, Etsuro Fujita
 wrote:

Sorry, my explanation was not correct.  (Needed to take in caffeine.) What
I'm concerned about is the following:

SELECT * FROM localtab JOIN (ft1 LEFT JOIN ft2 ON ft1.x = ft2.x) ON
localtab.id = ft1.id FOR UPDATE OF ft1

LockRows
-> Nested Loop
  Join Filter: (localtab.id = ft1.id)
  -> Seq Scan on localtab
  -> Foreign Scan on 
   Remote SQL: SELECT * FROM ft1 LEFT JOIN ft2 WHERE ft1.x = ft2.x
FOR UPDATE OF ft1

Assume that ft1 performs late row locking.



If the SQL includes "FOR UPDATE of ft1", then it clearly performs
early row locking.  I assume you meant to omit that.


Right.  Sorry for my mistake.


If an EPQ recheck was invoked
due to a concurrent transaction on the remote server that changed only the
value x of the ft1 tuple previously retrieved, then we would have to
generate a fake ft1/ft2-join tuple with nulls for ft2. (Assume that the ft2
tuple previously retrieved was not a null tuple.) However, I'm not sure how
we can do that in ForeignRecheck; we can't know for example, which one is
outer and which one is inner, without an alternative local join execution
plan.  Maybe I'm missing something, though.



I would expect it to issue a new query like: SELECT * FROM ft1 LEFT
JOIN ft2 WHERE ft1.x = ft2.x AND ft1.tid = $0 AND ft2.tid = $1.


We assume here that ft1 uses late row locking, so I thought the above 
SQL should include "FOR UPDATE of ft1".  But I still don't think that 
that is right; the SQL with "FOR UPDATE of ft1" wouldn't generate the 
fake ft1/ft2-join tuple with nulls for ft2, as expected.  The reason for 
that is that the updated version of the ft1 tuple wouldn't satisfy the 
ft1.tid = $0 condition in an EPQ recheck, because the ctid for the 
updated version of the ft1 tuple has changed.  (IIUC, I think that if we 
use a TID scan for ft1, the SQL would generate the expected result, 
because I think that the TID condition would be ignored in the EPQ 
recheck, but I don't think it's guaranteed to use a TID scan for ft1.) 
Maybe I'm missing something, though.



This should be significantly more efficient than fetching the base
rows from each of two tables with two separate queries.


Maybe I think we could fix the SQL, so I have to admit that, but I'm 
just wondering (1) what would happen for the case when ft1 uses late row 
rocking and ft2 uses early row rocking and (2) that would be still more 
efficient than re-fetching only the base row from ft1.


What I thought to improve the efficiency in the secondary-plan approach 
that I proposed was that if we could parallelize re-fetching foreign 
rows in ExecLockRows and EvalPlanQualFetchRowMarks, we would be able to 
improve the efficiency not only for the case when performing a join of 
foreign tables remotely but for the case when performing the join locally.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ATT_FOREIGN_TABLE and ATWrongRelkindError()

2015-10-23 Thread Etsuro Fujita

On 2015/10/23 6:06, Robert Haas wrote:

On Wed, Oct 21, 2015 at 1:51 AM, Amit Langote
 wrote:

This may just be nitpicking but I noticed that ATWrongRelkindError() could
emit a better message in case of such errors during ALTER COLUMN DEFAULT
and  ALTER COLUMN SET STORAGE than "%s is of the wrong type" which is what
it would emit now. Just need to add a couple of cases to the switch there:

+ case ATT_TABLE | ATT_VIEW | ATT_FOREIGN_TABLE:
+ msg = _("\"%s\" is not a table, view or foreign table");
+ break;

+ case ATT_TABLE | ATT_MATVIEW | ATT_FOREIGN_TABLE:
+ msg = _("\"%s\" is not a table, materialized view, or foreign table");
+ break;

Attached adds those.



Good catch.  Committed and back-patched to 9.5.


Thanks, Amit and Robert!

This is really really nitpicking, but I noticed that there is an 
implicit rule concerning the message format in ATWrongRelkindError; if 
more than two objects are present, the message is "\"%s\" is not a foo, 
bar, or baz". ("or" is preceded by a comma!)  So, would it be better 
that the former is "\"%s\" is not a table, view, or foreign table"?


BTW, I found an incorrect error message in ATWrongRelkindError. 
Attached is a patch for fixing the message.


Best regards,
Etsuro Fujita
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index a5bc508..6436d0c 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -4358,7 +4358,7 @@ ATWrongRelkindError(Relation rel, int allowed_targets)
 			msg = _("\"%s\" is not a table, composite type, or foreign table");
 			break;
 		case ATT_TABLE | ATT_MATVIEW | ATT_INDEX | ATT_FOREIGN_TABLE:
-			msg = _("\"%s\" is not a table, materialized view, composite type, or foreign table");
+			msg = _("\"%s\" is not a table, materialized view, index, or foreign table");
 			break;
 		case ATT_VIEW:
 			msg = _("\"%s\" is not a view");

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-22 Thread Etsuro Fujita

On 2015/10/20 9:36, Kouhei Kaigai wrote:

Even if we fetch whole-row of both side, join pushdown is exactly working
because we can receive less number of rows than local join + 2 of foreign-
scan. (If planner works well, we can expect join-path that increases number
of rows shall be dropped.)

One downside of my proposition is growth of width for individual rows.
It is a trade-off situation. The above approach takes no changes for
existing EPQ infrastructure, thus, its implementation design is clear.
On the other hands, your approach will reduce traffic over the network,
however, it is still unclear how we integrate scanrelid==0 with EPQ
infrastructure.


I agree with KaiGai-san that his proposition (or my proposition based on 
secondary plans) is still a performance improvement over the current 
implementation on local joining plus early row locking, since that that 
wouldn't have to transfer useless data that didn't satisfy join 
conditions at all!



On the other hands, in case of custom-scan that takes underlying local
scan-nodes, thus, any kind of ROW_MARK_* except for ROW_MARK_COPY will
happen. I think width of the joined tuples are relatively minor issue
than FDW cases. However, we cannot expect the fetched rows are protected
by early row-locking mechanism, so probability of re-fetching rows and
reconstruction of joined-tuple has relatively higher priority.


I see.


There is also some possible loss of efficiency with this approach.
Suppose that we have two tables ft1 and ft2 which are being joined,
and we push down the join.  They are being joined on an integer
column, and the join needs to select several other columns as well.
However, ft1 and ft2 are very wide tables that also contain some text
columns.   The query is like this:

SELECT localtab.a, ft1.p, ft2.p FROM localtab LEFT JOIN (ft1 JOIN ft2
ON ft1.x = ft2.x AND ft1.huge ~ 'stuff' AND f2.huge2 ~ 'nonsense') ON
localtab.q = ft1.q;

If we refetch each row individually, we will need a wholerow image of
ft1 and ft2 that includes all columns, or at least f1.huge and
f2.huge2.  If we just fetch a wholerow image of the join output, we
can exclude those.  The only thing we need to recheck is that it's
still the case that localtab.q = ft1.q (because the value of
localtab.q might have changed).


As KaiGai-san mentioned above, what we need to discuss more about with 
Robert's proposition is how to integrate that into the existing EPQ 
machinery.  For example, when, where, and how should we refetch the 
whole-row image of the join output in the case of late row locking?  IMV 
I think that that would need to add a new FDW API different from 
RefetchForeignRow, say RefetchForeignJoinRow.


IMO I think that another benefit from the proposition from KaiGai-san 
(or me) would be that that could provide the whole functionality for row 
locking in remote joins, without an additional development burden on an 
FDW author; the author only has to write GetForeignRowMarkType and 
RefetchForeignRow, which I think is relatively easy.  I think that in 
the proposition, the use of rowmark types such as ROW_MARK_SHARE or 
ROW_MARK_EXCLUSIVE for foreign tables in remote joins would be quite 
inefficient, but I think that the use of ROW_MARK_REFERENCE instead of 
ROW_MARK_COPY would be an option for the workload where EPQ rechecks are 
rarely invoked, because we just need to transfer ctids, not whole-row 
images.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-21 Thread Etsuro Fujita

On 2015/10/21 13:34, Kouhei Kaigai wrote:

On 2015/10/20 13:11, Etsuro Fujita wrote:

On 2015/10/20 5:34, Robert Haas wrote:

No.  You just need to populate fdw_recheck_quals correctly, same as
for the scan case.



As I said yesterday, that opinion of me is completely wrong.  Sorry for
the incorrectness.  Let me explain a little bit more.  I still think
that even if ROW_MARK_COPY is in use, we would need to locally rejoin
the tuples populated from the whole-row images for the foreign tables
involved in a remote join, using a secondary plan.  Consider eg,

SELECT localtab.*, ft2 from localtab, ft1, ft2
   WHERE ft1.x = ft2.x AND ft1.y = localtab.y FOR UPDATE

In this case, since the output of the foreign join would not include any
ft1 columns, I don't think we could do the same thing as for the scan
case, even if populating fdw_recheck_quals correctly.



As an aside, could you introduce the reason why you think so? It is
significant point in discussion, if we want to reach the consensus.



On the other hands, the joined-tuple we're talking about in this context
is a tuple prior to projection; formed according to the fdw_scan_tlist.
So, it contains all the necessary information to run scan/join qualifiers
towards the joined-tuple. It is not affected by the target-list of user
query.


After research into the planner, I noticed that I was still wrong; IIUC, 
the planner requires that the output of foreign join include the column 
ft1.y even for that case.  (I don't understand the reason why the 
planner requires that.)  So, as Robert mentioned, the clause ft1.y = 
localtab.y could be rechecked during an EPQ recheck, if populating 
fdw_recheck_quals correctly.  Sorry again for the incorrectness.



Even though I think the approach with joined-tuple reconstruction is
reasonable solution here, it is not a fair reason to introduce disadvantage
of Robert's suggestion.


Agreed.


Also, please don't mix up "what we do" and "how we do".

It is "what we do" to discuss which format of tuples shall be returned
to the core backend from the extension, because it determines the role
of interface. If our consensus is to return a joined-tuple, we need to
design the interface according to the consensus.

On the other hands, it is "how we do" discussion whether we should
enforce all the FDW/CSP extension to have alternative plan, or not.
Once we got a consensus in "what we do" discussion, there are variable
options to solve the requirement by the consensus, however, we cannot
prioritize "how we do" without "what we do".


Agreed.

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-20 Thread Etsuro Fujita

On 2015/10/20 13:11, Etsuro Fujita wrote:

On 2015/10/20 5:34, Robert Haas wrote:

On Mon, Oct 19, 2015 at 3:45 AM, Etsuro Fujita
 wrote:

As Tom mentioned, just recomputing the original join tuple is not good
enough.  We would need to rejoin the test tuples for the baserels
even if
ROW_MARK_COPY is in use.  Consider:

A=# BEGIN;
A=# UPDATE t SET a = a + 1 WHERE b = 1;
B=# SELECT * from t, ft1, ft2
  WHERE t.a = ft1.a AND t.b = ft2.b AND ft1.c = ft2.c FOR UPDATE;
A=# COMMIT;

where the plan for the SELECT FOR UPDATE is

LockRows
-> Nested Loop
-> Seq Scan on t
-> Foreign Scan on 
 Remote SQL: SELECT * FROM ft1 JOIN ft2 WHERE ft1.c = ft2.c
AND ft1.a
= $1 AND ft2.b = $2

If an EPQ recheck is invoked by the A's UPDATE, just recomputing the
original join tuple from the whole-row image that you proposed would
output
an incorrect result in the EQP recheck since the value a in the updated
version of a to-be-joined tuple in t would no longer match the value
ft1.a
extracted from the whole-row image if the A's UPDATE has committed
successfully.  So I think we would need to rejoin the tuples
populated from
the whole-row images for the baserels ft1 and ft2, by executing the
secondary plan with the new parameter values for a and b.



No.  You just need to populate fdw_recheck_quals correctly, same as
for the scan case.



Yeah, I think we can probably do that for the case where a pushed-down
join clause is an inner-join one, but I'm not sure that we can do that
for the case where that clause is an outer-join one.  Maybe I'm missing
something, though.


As I said yesterday, that opinion of me is completely wrong.  Sorry for 
the incorrectness.  Let me explain a little bit more.  I still think 
that even if ROW_MARK_COPY is in use, we would need to locally rejoin 
the tuples populated from the whole-row images for the foreign tables 
involved in a remote join, using a secondary plan.  Consider eg,


SELECT localtab.*, ft2 from localtab, ft1, ft2
 WHERE ft1.x = ft2.x AND ft1.y = localtab.y FOR UPDATE

In this case, since the output of the foreign join would not include any 
ft1 columns, I don't think we could do the same thing as for the scan 
case, even if populating fdw_recheck_quals correctly.  And I think we 
would need to rejoin the tuples, using a local join execution plan, 
which would have the parameterization for the to-be-pushed-down clause 
ft1.y = localtab.y.  I'm still missing something, though.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Typos in plannodes.h

2015-10-20 Thread Etsuro Fujita

On 2015/10/21 0:13, Robert Haas wrote:

On Tue, Oct 20, 2015 at 7:45 AM, Etsuro Fujita
 wrote:

I found typos in plannodes.h: s/scan.plan.quals/scan.plan.qual/g  Please
find attached a patch.



Oops.  Good catch.  Committed.


Thanks for picking this up!

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Typos in plannodes.h

2015-10-20 Thread Etsuro Fujita
Hi,

I found typos in plannodes.h: s/scan.plan.quals/scan.plan.qual/g  Please
find attached a patch.

Best regards,
Etsuro Fujita
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 92fd8e4..6b28c8e 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -513,7 +513,7 @@ typedef struct WorkTableScan
  * describing what is in the scan tuple's columns.
  *
  * fdw_recheck_quals should contain any quals which the core system passed to
- * the FDW but which were not added to scan.plan.quals; that is, it should
+ * the FDW but which were not added to scan.plan.qual; that is, it should
  * contain the quals being checked remotely.  This is needed for correct
  * behavior during EvalPlanQual rechecks.
  *
@@ -529,7 +529,7 @@ typedef struct ForeignScan
 	List	   *fdw_exprs;		/* expressions that FDW may evaluate */
 	List	   *fdw_private;	/* private data for FDW */
 	List	   *fdw_scan_tlist; /* optional tlist describing scan tuple */
-	List	   *fdw_recheck_quals;	/* original quals not in scan.plan.quals */
+	List	   *fdw_recheck_quals;	/* original quals not in scan.plan.qual */
 	Bitmapset  *fs_relids;		/* RTIs generated by this scan */
 	bool		fsSystemCol;	/* true if any "system column" is needed */
 } ForeignScan;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-20 Thread Etsuro Fujita

On Mon, Oct 19, 2015 at 3:45 AM, Etsuro Fujita
 wrote:

As Tom mentioned, just recomputing the original join tuple is not good
enough.  We would need to rejoin the test tuples for the baserels even if
ROW_MARK_COPY is in use.  Consider:

A=# BEGIN;
A=# UPDATE t SET a = a + 1 WHERE b = 1;
B=# SELECT * from t, ft1, ft2
   WHERE t.a = ft1.a AND t.b = ft2.b AND ft1.c = ft2.c FOR UPDATE;
A=# COMMIT;

where the plan for the SELECT FOR UPDATE is

LockRows
-> Nested Loop
 -> Seq Scan on t
 -> Foreign Scan on 
  Remote SQL: SELECT * FROM ft1 JOIN ft2 WHERE ft1.c = ft2.c AND ft1.a
= $1 AND ft2.b = $2

If an EPQ recheck is invoked by the A's UPDATE, just recomputing the
original join tuple from the whole-row image that you proposed would output
an incorrect result in the EQP recheck since the value a in the updated
version of a to-be-joined tuple in t would no longer match the value ft1.a
extracted from the whole-row image if the A's UPDATE has committed
successfully.  So I think we would need to rejoin the tuples populated from
the whole-row images for the baserels ft1 and ft2, by executing the
secondary plan with the new parameter values for a and b.


Robert Haas wrote:

No.  You just need to populate fdw_recheck_quals correctly, same as
for the scan case.


I wrote:

Yeah, I think we can probably do that for the case where a pushed-down
join clause is an inner-join one, but I'm not sure that we can do that
for the case where that clause is an outer-join one.  Maybe I'm missing
something, though.


On 2015/10/20 15:42, Kouhei Kaigai wrote:

Please check my message yesterday. The non-nullable side of outer-join is
always visible regardless of the join-clause pushed down, as long as it
satisfies the scan-quals pushed-down.


Sorry, my explanation was not correct.  (Needed to take in caffeine.) 
What I'm concerned about is the following:


SELECT * FROM localtab JOIN (ft1 LEFT JOIN ft2 ON ft1.x = ft2.x) ON 
localtab.id = ft1.id FOR UPDATE OF ft1


LockRows
-> Nested Loop
 Join Filter: (localtab.id = ft1.id)
 -> Seq Scan on localtab
 -> Foreign Scan on 
  Remote SQL: SELECT * FROM ft1 LEFT JOIN ft2 WHERE ft1.x = 
ft2.x FOR UPDATE OF ft1


Assume that ft1 performs late row locking.  If an EPQ recheck was 
invoked due to a concurrent transaction on the remote server that 
changed only the value x of the ft1 tuple previously retrieved, then we 
would have to generate a fake ft1/ft2-join tuple with nulls for ft2. 
(Assume that the ft2 tuple previously retrieved was not a null tuple.) 
However, I'm not sure how we can do that in ForeignRecheck; we can't 
know for example, which one is outer and which one is inner, without an 
alternative local join execution plan.  Maybe I'm missing something, though.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-19 Thread Etsuro Fujita

On 2015/10/20 5:34, Robert Haas wrote:

On Mon, Oct 19, 2015 at 3:45 AM, Etsuro Fujita
 wrote:

As Tom mentioned, just recomputing the original join tuple is not good
enough.  We would need to rejoin the test tuples for the baserels even if
ROW_MARK_COPY is in use.  Consider:

A=# BEGIN;
A=# UPDATE t SET a = a + 1 WHERE b = 1;
B=# SELECT * from t, ft1, ft2
  WHERE t.a = ft1.a AND t.b = ft2.b AND ft1.c = ft2.c FOR UPDATE;
A=# COMMIT;

where the plan for the SELECT FOR UPDATE is

LockRows
-> Nested Loop
-> Seq Scan on t
-> Foreign Scan on 
 Remote SQL: SELECT * FROM ft1 JOIN ft2 WHERE ft1.c = ft2.c AND ft1.a
= $1 AND ft2.b = $2

If an EPQ recheck is invoked by the A's UPDATE, just recomputing the
original join tuple from the whole-row image that you proposed would output
an incorrect result in the EQP recheck since the value a in the updated
version of a to-be-joined tuple in t would no longer match the value ft1.a
extracted from the whole-row image if the A's UPDATE has committed
successfully.  So I think we would need to rejoin the tuples populated from
the whole-row images for the baserels ft1 and ft2, by executing the
secondary plan with the new parameter values for a and b.



No.  You just need to populate fdw_recheck_quals correctly, same as
for the scan case.


Yeah, I think we can probably do that for the case where a pushed-down 
join clause is an inner-join one, but I'm not sure that we can do that 
for the case where that clause is an outer-join one.  Maybe I'm missing 
something, though.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-19 Thread Etsuro Fujita
I wrote:
>> As Robert mentioned in [1], I think that if we're inside EPQ,
>> pushed-down quals and/or pushed-down joins should be locally rechecked
>> in the same way as other cases such as IndexRecheck.  So, I'll propose
>> the updated version of the patch.

On 2015/10/16 18:48, Kouhei Kaigai wrote:
> You have never answered my question for two months.
> 
> I never deny to execute the pushed-down qualifiers locally.
> It is likely the best tactics in most cases.
> But, why you try to enforce all the people a particular manner?
> 
> Here are various kind of FDW drivers. How do you guarantee it is
> the best solution for all the people? It is basically impossible.
> (Please google "Probatio diabolica")
> 
> You try to add two special purpose fields in ForeignScan;
> fdw_recheck_plan and fdw_recheck_quals.
> It requires FDW drivers to have pushed-down qualifier in a particular
> data format, and also requires FDW drivers to process EPQ recheck by
> alternative local plan, even if a part of FDW drivers can process
> these jobs by its own implementation better.
> 
> I've repeatedly pointed out this issue, but never get reasonable
> answer from you.
> 
> Again, I also admit alternative plan may be reasonable tactics for
> most of FDW drivers. However, only FDW author can "decide" it is
> the best tactics to handle the task for their module, not us.
> 
> I don't think it is a good interface design to enforce people to
> follow a particular implementation manner. It should be discretion
> of the extension.

I think that if you think so, you should give at least one concrete
example for that.  Ideally accompanied by a demo of how that works well.

Best regards,
Etsuro Fujita



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-19 Thread Etsuro Fujita

On 2015/10/16 19:03, Kouhei Kaigai wrote:

*** 48,59  ExecScanFetch(ScanState *node,
+   /*
+* Execute recheck plan and get the next tuple if foreign join.
+*/
+   if (scanrelid == 0)
+   {
+   (*recheckMtd) (node, slot);
+   return slot;
+   }

Ensure the slot is empty if recheckMtd returned false, as base relation
case doing so.


Fixed.


*** 347,352  ExecScanReScan(ScanState *node)
{
Index   scanrelid = ((Scan *) node->ps.plan)->scanrelid;

+   if (scanrelid == 0)
+   return; /* nothing to do */
+
Assert(scanrelid > 0);

estate->es_epqScanDone[scanrelid - 1] = false;

Why nothing to do?
Base relations managed by ForeignScan are tracked in fs_relids bitmap.


I think the estate->es_epqScanDone flag should be initialized when we do 
ExecScanReSacn for each of the component ForeignScanState nodes in the 
local join execution plan state tree.



As you introduced a few days before, if ForeignScan has parametalized
remote join, EPQ slot contains invalid tuples based on old outer tuple.


Maybe my explanation was not enough, but I haven't said such a thing. 
The problem in that case is that just returning the previously-returned 
foeign-join tuple would produce an incorrect result if an outer tuple to 
be joined has changed due to a concurrent transaction, as explained 
upthread.  (I think that the EPQ slots would contain valid tuples.)


Attached is an updated version of the patch.

Other changes:
* remove unnecessary memory-context handling for the foreign-join case 
in ForeignRecheck

* revise code a bit and add a bit more comments

Thanks for the comments!

Best regards,
Etsuro Fujita
*** a/contrib/file_fdw/file_fdw.c
--- b/contrib/file_fdw/file_fdw.c
***
*** 525,530  fileGetForeignPaths(PlannerInfo *root,
--- 525,531 
  	 total_cost,
  	 NIL,		/* no pathkeys */
  	 NULL,		/* no outer rel either */
+ 	 NULL,		/* no alternative path */
  	 coptions));
  
  	/*
*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
***
*** 560,565  postgresGetForeignPaths(PlannerInfo *root,
--- 560,566 
     fpinfo->total_cost,
     NIL, /* no pathkeys */
     NULL,		/* no outer rel either */
+    NULL,		/* no alternative path */
     NIL);		/* no fdw_private list */
  	add_path(baserel, (Path *) path);
  
***
*** 727,732  postgresGetForeignPaths(PlannerInfo *root,
--- 728,734 
  	   total_cost,
  	   NIL,		/* no pathkeys */
  	   param_info->ppi_req_outer,
+ 	   NULL,	/* no alternative path */
  	   NIL);	/* no fdw_private list */
  		add_path(baserel, (Path *) path);
  	}
*** a/src/backend/executor/execScan.c
--- b/src/backend/executor/execScan.c
***
*** 48,59  ExecScanFetch(ScanState *node,
  		 * conditions.
  		 */
  		Index		scanrelid = ((Scan *) node->ps.plan)->scanrelid;
  
  		Assert(scanrelid > 0);
  		if (estate->es_epqTupleSet[scanrelid - 1])
  		{
- 			TupleTableSlot *slot = node->ss_ScanTupleSlot;
- 
  			/* Return empty slot if we already returned a tuple */
  			if (estate->es_epqScanDone[scanrelid - 1])
  return ExecClearTuple(slot);
--- 48,67 
  		 * conditions.
  		 */
  		Index		scanrelid = ((Scan *) node->ps.plan)->scanrelid;
+ 		TupleTableSlot *slot = node->ss_ScanTupleSlot;
+ 
+ 		if (scanrelid == 0)
+ 		{
+ 			/* Execute recheck plan and store result in the slot */
+ 			if (!(*recheckMtd) (node, slot))
+ ExecClearTuple(slot);	/* would not be returned by scan */
+ 
+ 			return slot;
+ 		}
  
  		Assert(scanrelid > 0);
  		if (estate->es_epqTupleSet[scanrelid - 1])
  		{
  			/* Return empty slot if we already returned a tuple */
  			if (estate->es_epqScanDone[scanrelid - 1])
  return ExecClearTuple(slot);
***
*** 347,352  ExecScanReScan(ScanState *node)
--- 355,363 
  	{
  		Index		scanrelid = ((Scan *) node->ps.plan)->scanrelid;
  
+ 		if (scanrelid == 0)
+ 			return;/* nothing to do */
+ 
  		Assert(scanrelid > 0);
  
  		estate->es_epqScanDone[scanrelid - 1] = false;
*** a/src/backend/executor/nodeForeignscan.c
--- b/src/backend/executor/nodeForeignscan.c
***
*** 24,29 
--- 24,30 
  
  #include "executor/executor.h"
  #include "executor/nodeForeignscan.h"
+ #include "executor/tuptable.h"
  #include "foreign/fdwapi.h"
  #include "utils/memutils.h"
  #include "utils/rel.h"
***
*** 73,80  ForeignNext(ForeignScanState *node)
--- 74,99 
  static bool
  ForeignRecheck(ForeignScanState *node, TupleTableSlot *slot)
  {
+ 	

Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-19 Thread Etsuro Fujita

On 2015/10/17 9:58, Robert Haas wrote:

But with Etsuro Fujita's patch, and I think what you have proposed has
been similar, how are you going to do it?  The proposal is to call the
recheck method and hope for the best, but what is the recheck method
going to do?  Where is it going to get the previously-returned tuple?


As I explained in a previous email, just returning the 
previously-returned tuple is not good enough.



How will it know if it has already returned it during the lifetime of
this EPQ check?  Offhand, it looks to me like, at least in some
circumstances, you're probably going to return whatever tuple you
returned most recently (which has a good chance of being the right
one, but not necessarily) over and over again.  That's not going to
fly.


No.  Since the local join execution plan is created so that the scan 
slot for each foreign table involved in the pushed-down join looks at 
its EPQ slot, I think the plan can return at most one tuple.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-19 Thread Etsuro Fujita

On 2015/10/17 12:22, Robert Haas wrote:

On Fri, Oct 16, 2015 at 9:51 PM, Tom Lane  wrote:

Robert Haas  writes:

Both you and Etsuro Fujita are proposing to fix this problem by
somehow making it the FDW's problem to reconstruct the tuple
previously produced by the join from whole-row images of the baserels.
But that's not looking back far enough: why are we asking for
whole-row images of the baserels when what we really want is a
whole-row image of the output of the join?  The output of the join is
what we need to re-return.



There are multiple components to the requirement though:



3. If so, form a join row and return that.  Else return NULL.



Not check.

Suppose we've got two foreign tables ft1 and ft2, using postgres_fdw.
There is a local table t.  The user does something like UPDATE t SET
... FROM ft1, ft2 WHERE t = ft1.a AND ft1.b = ft2.b AND   The
query planner generates something like:

Update
-> Join
   -> Scan on t
   -> Foreign Scan on 

If an EPQ recheck occurs, the only thing that matters is that the
Foreign Scan return the right output row (or possibly now rows, if the
row it would have formed no longer matches the quals).  It doesn't
matter how it does this.  Let's say the columns actually needed by the
query from the ft1-ft2 join are ft1.a, ft1.b, ft2.a, and ft2.b.
Currently, the output of the foreign scan is something like: ft1.a,
ft1.b, ft2.a, ft.b, ft1.*, ft2.*.  The EPQ recheck has access to ft1.*
and ft2.*, but it's not straightforward for postgres_fdw to regenerate
the join tuple from that.  Maybe the pushed-down was a left join,
maybe it was a right join, maybe it was a full join.  So some of the
columns could have gone to NULL.  To figure it out, you need to build
a secondary plan tree that mimics the structure of the join you pushed
down, which is kinda hairy.


As Tom mentioned, just recomputing the original join tuple is not good 
enough.  We would need to rejoin the test tuples for the baserels even 
if ROW_MARK_COPY is in use.  Consider:


A=# BEGIN;
A=# UPDATE t SET a = a + 1 WHERE b = 1;
B=# SELECT * from t, ft1, ft2
 WHERE t.a = ft1.a AND t.b = ft2.b AND ft1.c = ft2.c FOR UPDATE;
A=# COMMIT;

where the plan for the SELECT FOR UPDATE is

LockRows
-> Nested Loop
   -> Seq Scan on t
   -> Foreign Scan on 
Remote SQL: SELECT * FROM ft1 JOIN ft2 WHERE ft1.c = ft2.c AND 
ft1.a = $1 AND ft2.b = $2


If an EPQ recheck is invoked by the A's UPDATE, just recomputing the 
original join tuple from the whole-row image that you proposed would 
output an incorrect result in the EQP recheck since the value a in the 
updated version of a to-be-joined tuple in t would no longer match the 
value ft1.a extracted from the whole-row image if the A's UPDATE has 
committed successfully.  So I think we would need to rejoin the tuples 
populated from the whole-row images for the baserels ft1 and ft2, by 
executing the secondary plan with the new parameter values for a and b.


As for the secondary plan, I think we could create the corresponding 
local join execution path during GetForeignJoinPaths, (1) by looking at 
the pathlist of the joinrel RelOptInfo, which would have already 
contained some local join execution paths, as does the patch, or (2) by 
calling a helper function that creates a local join execution path from 
given outer/inner paths selected from the pathlists of the 
outerrel/innerrel RelOptInfos, as proposed be KaiGai-san before.  ISTM 
that the latter would be better, so I plan to propose such a function as 
part of the postgres_fdw join pushdown patch for 9.6.



This example is of the early row locking case, but I think the story
is about the same if the FDW wants to do late row locking instead.  If
there's an EPQ recheck, it could issue individual row re-fetches
against every base table and then re-do all the joins that it pushed
down locally.  But it would be faster and cleaner, I think, to send
one query to the remote side that re-fetches all the rows at once, and
whose target list is exactly what we need, rather than whole row
targetlists for each baserel that then have to be rejiggered on our
side.


I agree with you on that point.  (In fact, I thought that too!)  But 
considering that many FDWs including postgres_fdw use early row locking 
(ie, ROW_MARK_COPY) currently, I'd like to leave that for future work.



I think what Kaigai-san and Etsuro-san are after is trying to find a way
to reuse some of the existing EPQ machinery to help with that.  This may
not be practical, or it may end up being messier than a standalone
implementation; but it's not silly on its face to want to reuse some of
that code.



Yeah, I think we're all in agreement that reusing as much of the EPQ
machinery as is sensible is something we should do.  We are not in
agreement on which parts of it need to be changed or extended.


Agreed.

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers ma

Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-16 Thread Etsuro Fujita

On 2015/10/14 17:31, Etsuro Fujita wrote:

As KaiGai-san also pointed out before, I think we should address this in
each of the following cases:

1) remote qual (scanrelid>0)
2) remote join (scanrelid==0)


As for #2, I updated the patch, which uses a local join execution plan 
for an EvalPlanQual rechech, according to the comment from Robert [1]. 
Attached is an updated version of the patch.  This is a WIP patch, but 
it would be appreciated if I could get feedback earlier.


For tests, apply the patches:

foreign-recheck-for-foreign-join-1.patch
usermapping_matching.patch [2]
add_GetUserMappingById.patch [2]
foreign_join_v16_efujita.patch [3]

Since that as I said upthread, what I'd like to discuss is changes to 
the PG core, I didn't do anything about the postgres_fdw patches.


Best regards,
Etsuro Fujita

[1] 
http://www.postgresql.org/message-id/ca+tgmoaazs0dr23r7ptbseqfwotuvcpnbqdhxebo9gi+dmx...@mail.gmail.com
[2] 
http://www.postgresql.org/message-id/CAEZqfEe9KGy=1_wagh2rgzpg0o4pqgd+iauyaj8wtze+cyj...@mail.gmail.com

[3] http://www.postgresql.org/message-id/55cb2d45.7040...@lab.ntt.co.jp
*** a/contrib/file_fdw/file_fdw.c
--- b/contrib/file_fdw/file_fdw.c
***
*** 525,530  fileGetForeignPaths(PlannerInfo *root,
--- 525,531 
  	 total_cost,
  	 NIL,		/* no pathkeys */
  	 NULL,		/* no outer rel either */
+ 	 NULL,		/* no alternative path */
  	 coptions));
  
  	/*
*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
***
*** 560,565  postgresGetForeignPaths(PlannerInfo *root,
--- 560,566 
     fpinfo->total_cost,
     NIL, /* no pathkeys */
     NULL,		/* no outer rel either */
+    NULL,		/* no alternative path */
     NIL);		/* no fdw_private list */
  	add_path(baserel, (Path *) path);
  
***
*** 727,732  postgresGetForeignPaths(PlannerInfo *root,
--- 728,734 
  	   total_cost,
  	   NIL,		/* no pathkeys */
  	   param_info->ppi_req_outer,
+ 	   NULL,	/* no alternative path */
  	   NIL);	/* no fdw_private list */
  		add_path(baserel, (Path *) path);
  	}
*** a/src/backend/executor/execScan.c
--- b/src/backend/executor/execScan.c
***
*** 48,59  ExecScanFetch(ScanState *node,
  		 * conditions.
  		 */
  		Index		scanrelid = ((Scan *) node->ps.plan)->scanrelid;
  
  		Assert(scanrelid > 0);
  		if (estate->es_epqTupleSet[scanrelid - 1])
  		{
- 			TupleTableSlot *slot = node->ss_ScanTupleSlot;
- 
  			/* Return empty slot if we already returned a tuple */
  			if (estate->es_epqScanDone[scanrelid - 1])
  return ExecClearTuple(slot);
--- 48,67 
  		 * conditions.
  		 */
  		Index		scanrelid = ((Scan *) node->ps.plan)->scanrelid;
+ 		TupleTableSlot *slot = node->ss_ScanTupleSlot;
+ 
+ 		/*
+ 		 * Execute recheck plan and get the next tuple if foreign join.
+ 		 */
+ 		if (scanrelid == 0)
+ 		{
+ 			(*recheckMtd) (node, slot);
+ 			return slot;
+ 		}
  
  		Assert(scanrelid > 0);
  		if (estate->es_epqTupleSet[scanrelid - 1])
  		{
  			/* Return empty slot if we already returned a tuple */
  			if (estate->es_epqScanDone[scanrelid - 1])
  return ExecClearTuple(slot);
***
*** 347,352  ExecScanReScan(ScanState *node)
--- 355,363 
  	{
  		Index		scanrelid = ((Scan *) node->ps.plan)->scanrelid;
  
+ 		if (scanrelid == 0)
+ 			return;/* nothing to do */
+ 
  		Assert(scanrelid > 0);
  
  		estate->es_epqScanDone[scanrelid - 1] = false;
*** a/src/backend/executor/nodeForeignscan.c
--- b/src/backend/executor/nodeForeignscan.c
***
*** 24,29 
--- 24,30 
  
  #include "executor/executor.h"
  #include "executor/nodeForeignscan.h"
+ #include "executor/tuptable.h"
  #include "foreign/fdwapi.h"
  #include "utils/memutils.h"
  #include "utils/rel.h"
***
*** 80,85  ForeignRecheck(ForeignScanState *node, TupleTableSlot *slot)
--- 81,103 
  	 */
  	econtext = node->ss.ps.ps_ExprContext;
  
+ 	if (node->fdw_recheck_plan != NULL)
+ 	{
+ 		TupleTableSlot *result;
+ 		MemoryContext oldcontext;
+ 
+ 		/* Must be in query context to call recheck plan */
+ 		oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory);
+ 		/* Execute recheck plan */
+ 		result = ExecProcNode(node->fdw_recheck_plan);
+ 		MemoryContextSwitchTo(oldcontext);
+ 		if (TupIsNull(result))
+ 			return false;
+ 		/* Store result in the given slot */
+ 		ExecCopySlot(slot, result);
+ 		return true;
+ 	}
+ 
  	/* Does the tuple meet the remote qual condition? */
  	econtext->ecxt_scantuple = slot;
  
***
*** 200,205  ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags)
--- 218,229 
  	ExecAssignScanProjectionInfoWithVarno(&scanstate->ss, tlistvarno)

Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-16 Thread Etsuro Fujita
>> On 2015/10/15 11:36, Kouhei Kaigai wrote:
>>> In case of scanrelid==0, expectation to ForeignScan/CustomScan is to
>>> behave as if local join exists here. It requires ForeignScan to generate
>>> joined-tuple as a result of remote join, that may contains multiple junk
>>> TLEs to carry whole-var references of base foreign tables.
>>> According to the criteria, the desirable behavior is clear as below:
>>>
>>> 1. FDW/CSP picks up base relation's tuple from the EPQ slots.
>>>  It shall be setup by whole-row reference if earlier row-lock semantics,
>>>  or by RefetchForeignRow if later row-lock semantics.
>>>
>>> 2. Fill up ss_ScanTupleSlot according to the xxx_scan_tlist.
>>>  We may be able to provide a common support function here, because this
>>>  list keeps relation between a particular attribute of the joined-tuple
>>>  and its source column.
>>>
>>> 3. Apply join-clause and base-restrict that were pushed down.
>>>  setrefs.c initializes expressions kept in fdw_exprs/custom_exprs to run
>>>  on the ss_ScanTupleSlot. It is the easiest way to check here.
>>>
>>> 4. If joined-tuple is still visible after the step 3, FDW/CSP returns
>>>  joined-tuple. Elsewhere, returns an empty slot.
>>>
>>> It is entirely compatible behavior even if local join is located on
>>> the point of ForeignScan/CustomScan with scanrelid==0.
>>>
>>> Even if remote join is parametalized by other relation, we can simply
>>> use param-info delivered from the corresponding outer scan at the step-3.
>>> EState should have the parameters already updated, FDW driver needs to
>>> care about nothing.
>>>
>>> It is quite less invasive approach towards the existing EPQ recheck
>>> mechanism.

I wrote:
>> I see.  That's an idea, but I guess that step 2 and 3 would need to add
>> a lot of code to the core.  Why don't you use a local join execution
>> plan that we discussed?  I think that that would make the series of
>> processing much simpler.  I'm now revising the patch that I created for
>> that.  If it's okay, I'd like to propose an updated version of the patch
>> in a few days.

On 2015/10/15 20:19, Kouhei Kaigai wrote:
> I have to introduce why above idea is simpler and suitable for v9.5
> timeline.
> As I've consistently proposed for this two months, the step-2 and 3
> are assumed to be handled in the callback routine to be kicked from
> ForeignRecheck().

Honestly, I still don't think I would see the much value in doing so.
As Robert mentioned in [1], I think that if we're inside EPQ,
pushed-down quals and/or pushed-down joins should be locally rechecked
in the same way as other cases such as IndexRecheck.  So, I'll propose
the updated version of the patch.

Thanks for the explanation!

Best regards,
Etsuro Fujita

[1]
http://www.postgresql.org/message-id/CA+Tgmoau7jVTLF0Oh9a_Mu9S=vrw7i6u_h7jspzbxv0xtyo...@mail.gmail.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-15 Thread Etsuro Fujita

On 2015/10/16 2:14, Robert Haas wrote:

On Thu, Oct 15, 2015 at 3:04 AM, Kyotaro HORIGUCHI
 wrote:

I confirmed that an epqtuple of foreign parameterized scan is
correctly rejected by fdw_recheck_quals with modified outer
tuple.

I have no objection to this and have two humble comments.

In file_fdw.c, the comment for the last parameter just after the
added line seems to be better to be aligned with other comments.



I've pgindented the file.  Any other space we might choose would just
be changed by the next pgindent run, so there's no point in trying to
vary.



In subselect.c, the added break is in the added curly-braces but
it would be better to place it after the closing brace, like the
other cases.



Changed that, and committed.


Thanks, Robert and Horiguchi-san.

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-15 Thread Etsuro Fujita
On 2015/10/15 11:36, Kouhei Kaigai wrote:
>>> Once again, if FDW driver is responsible to construct join-tuple from
>>> the base relation's tuple cached in EPQ slot, this case don't need to
>>> kick remote query again, because all the materials to construct join-
>>> tuple are already held locally. Right?

I now understand clearly what you mean.  Sorry for my misunderstanding.

> In case of scanrelid==0, expectation to ForeignScan/CustomScan is to
> behave as if local join exists here. It requires ForeignScan to generate
> joined-tuple as a result of remote join, that may contains multiple junk
> TLEs to carry whole-var references of base foreign tables.
> According to the criteria, the desirable behavior is clear as below:
> 
> 1. FDW/CSP picks up base relation's tuple from the EPQ slots.
> It shall be setup by whole-row reference if earlier row-lock semantics,
> or by RefetchForeignRow if later row-lock semantics.
> 
> 2. Fill up ss_ScanTupleSlot according to the xxx_scan_tlist.
> We may be able to provide a common support function here, because this
> list keeps relation between a particular attribute of the joined-tuple
> and its source column.
> 
> 3. Apply join-clause and base-restrict that were pushed down.
> setrefs.c initializes expressions kept in fdw_exprs/custom_exprs to run
> on the ss_ScanTupleSlot. It is the easiest way to check here.
> 
> 4. If joined-tuple is still visible after the step 3, FDW/CSP returns
> joined-tuple. Elsewhere, returns an empty slot.
> 
> It is entirely compatible behavior even if local join is located on
> the point of ForeignScan/CustomScan with scanrelid==0.
> 
> Even if remote join is parametalized by other relation, we can simply
> use param-info delivered from the corresponding outer scan at the step-3.
> EState should have the parameters already updated, FDW driver needs to
> care about nothing.
> 
> It is quite less invasive approach towards the existing EPQ recheck
> mechanism.

I see.  That's an idea, but I guess that step 2 and 3 would need to add
a lot of code to the core.  Why don't you use a local join execution
plan that we discussed?  I think that that would make the series of
processing much simpler.  I'm now revising the patch that I created for
that.  If it's okay, I'd like to propose an updated version of the patch
in a few days.

> I cannot understand why Fujita-san never "try" this approach.

Maybe my explanation was not correct, but I didn't say such a thing.
What I rather objected against was to add a new FDW callback routine for
rechecking pushed-down quals or pushed-down joins, which I think you
insisted on.

Best regards,
Etsuro Fujita



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-14 Thread Etsuro Fujita
On 2015/10/14 12:07, Kouhei Kaigai wrote:
>> On 2015/10/07 15:39, Etsuro Fujita wrote:
>> I noticed that the approach using a column to populate the foreign
>> scan's slot directly wouldn't work well in some cases.  For example,
>> consider:
>>
>> SELECT * FROM verysmall v LEFT JOIN (bigft1 JOIN bigft2 ON bigft1.x =
>> bigft2.x) ON v.q = bigft1.q AND v.r = bigft2.r FOR UPDATE OF v;
>>
>> The best plan is presumably something like this as you said before:
>>
>> LockRows
>> -> Nested Loop
>>  -> Seq Scan on verysmall v
>>  -> Foreign Scan on bigft1 and bigft2
>>   Remote SQL: SELECT * FROM bigft1 JOIN bigft2 ON bigft1.x =
>> bigft2.x AND bigft1.q = $1 AND bigft2.r = $2
>>
>> Consider the EvalPlanQual testing to see if the updated version of a
>> tuple in v satisfies the query.  If we use the column in the testing, we
>> would get the wrong results in some cases.

> In this case, does ForeignScan have to be reset prior to ExecProcNode()?
> Once ExecReScanForeignScan() gets called by ExecNestLoop(), it marks EPQ
> slot is invalid. So, more or less, ForeignScan needs to kick the remote
> join again based on the new parameter come from the latest verysmall tuple.
> Please correct me, if I don't understand correctly.
> In case of unparametalized ForeignScan case, the cached join-tuple work
> well because it is independent from verysmall.
> 
> Once again, if FDW driver is responsible to construct join-tuple from
> the base relation's tuple cached in EPQ slot, this case don't need to
> kick remote query again, because all the materials to construct join-
> tuple are already held locally. Right?

Sorry, maybe I misunderstand your words, but we are talking here about
an approach using a whole-row var that would populate a join tuple that
is returned by an FDW and stored in the scan slot in the corresponding
ForeingScanState node in the parent state tree.

Best regards,
Etsuro Fujita



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-14 Thread Etsuro Fujita

On 2015/10/10 10:17, Robert Haas wrote:

On Thu, Oct 8, 2015 at 11:00 PM, Etsuro Fujita
 wrote:

The best plan is presumably something like this as you said before:

LockRows
-> Nested Loop
 -> Seq Scan on verysmall v
 -> Foreign Scan on bigft1 and bigft2
  Remote SQL: SELECT * FROM bigft1 JOIN bigft2 ON bigft1.x =
bigft2.x AND bigft1.q = $1 AND bigft2.r = $2

Consider the EvalPlanQual testing to see if the updated version of a
tuple in v satisfies the query.  If we use the column in the testing, we
would get the wrong results in some cases.



More precisely, we would get the wrong result when the value of v.q or v.r
in the updated version has changed.



Interesting test case.  It's worth considering why this works if you
were to replace the Foreign Scan with an Index Scan; suppose the query
is SELECT * FROM verysmall v LEFT JOIN realbiglocaltable t ON v.x =
t.x FOR UPDATE OF v, so that you get:

LockRows
-> Nested Loop
   -> Seq Scan on verysmall v
   -> Foreign Scan on realbiglocaltable t
 Index Cond: v.x = t.x

In your example, the remote SQL pushes down certain quals to the
remote server, and so if we just return the same tuple they might no
longer be satisfied.  In this example, the index qual is essentially a
filter condition that has been "pushed down" into the index AM.  The
EvalPlanQual machinery prevents this from generating wrong answers by
rechecking the index cond - see IndexRecheck.  Even though it's
normally the AM's job to enforce the index cond, and the executor does
not need to recheck, in the EvalPlanQual case it does need to recheck.

I think the foreign data wrapper case should be handled the same way.
Any condition that we initially pushed down to the foreign server
needs to be locally rechecked if we're inside EPQ.


Agreed.

As KaiGai-san also pointed out before, I think we should address this in 
each of the following cases:


1) remote qual (scanrelid>0)
2) remote join (scanrelid==0)

As for #1, I noticed that there is a bug in handling the same kind of 
FDW queries, which will be shown below.  As you said, I think this 
should be addressed by rechecking the remote quals *locally*.  (I 
thought another fix for this kind of bug before, though.)  IIUC, I think 
this should be fixed separately from #2, as this is a bug not only in 
9.5, but in back branches.  Please find attached a patch.


Create an environment:

mydatabase=# create table t1 (a int primary key, b text);
mydatabase=# insert into t1 select a, 'notsolongtext' from 
generate_series(1, 100) a;


postgres=# create server myserver foreign data wrapper postgres_fdw 
options (dbname 'mydatabase');

postgres=# create user mapping for current_user server myserver;
postgres=# create foreign table ft1 (a int, b text) server myserver 
options (table_name 't1');

postgres=# alter foreign table ft1 options (add use_remote_estimate 'true');
postgres=# create table inttab (a int);
postgres=# insert into inttab select a from generate_series(1, 10) a;
postgres=# analyze ft1;
postgres=# analyze inttab;

Run concurrent transactions that produce incorrect result:

[Terminal1]
postgres=# begin;
BEGIN
postgres=# update inttab set a = a + 1 where a = 1;
UPDATE 1

[Terminal2]
postgres=# explain verbose select * from inttab, ft1 where inttab.a = 
ft1.a limit 1 for update;

   QUERY PLAN
-
 Limit  (cost=100.43..198.99 rows=1 width=70)
   Output: inttab.a, ft1.a, ft1.b, inttab.ctid, ft1.*
   ->  LockRows  (cost=100.43..1086.00 rows=10 width=70)
 Output: inttab.a, ft1.a, ft1.b, inttab.ctid, ft1.*
 ->  Nested Loop  (cost=100.43..1085.90 rows=10 width=70)
   Output: inttab.a, ft1.a, ft1.b, inttab.ctid, ft1.*
   ->  Seq Scan on public.inttab  (cost=0.00..1.10 rows=10 
width=10)

 Output: inttab.a, inttab.ctid
   ->  Foreign Scan on public.ft1  (cost=100.43..108.47 
rows=1 width=18)

 Output: ft1.a, ft1.b, ft1.*
 Remote SQL: SELECT a, b FROM public.t1 WHERE 
(($1::integer = a)) FOR UPDATE

(11 rows)

postgres=# select * from inttab, ft1 where inttab.a = ft1.a limit 1 for 
update;


[Terminal1]
postgres=# commit;
COMMIT

[Terminal2]
(After the commit in Terminal1, the following result will be shown in 
Terminal2.  Note that the values of inttab.a and ft1.a wouldn't satisfy 
the remote qual!)

 a | a |   b
---+---+---
 2 | 1 | notsolongtext
(1 row)

As for #2, I didn't come up with any solution to locally rechecking 
pushed-down join conditions against a joined tuple populated from a 
column that we discussed.  Instead, I'd like to revise a 
local-join-execution-plan-based approach that we discussed before, by 
addressing your comments such

Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-09 Thread Etsuro Fujita

On 2015/09/12 1:38, Robert Haas wrote:

On Thu, Sep 10, 2015 at 11:36 PM, Etsuro Fujita
 wrote:

I've proposed the following API changes:

* I modified create_foreignscan_path, which is called from
postgresGetForeignJoinPaths/postgresGetForeignPaths, so that a path,
subpath, is passed as the eighth argument of the function. subpath
represents a local join execution path if scanrelid==0, but NULL if
scanrelid>0.



OK, I see now.  But I don't much like the way
get_unsorted_unparameterized_path() looks.

First, it's basically praying that MergePath, NodePath, and NestPath
can be flat-copied without breaking anything.  In general, we have
copyfuncs.c support for nodes that we need to be able to copy, and we
use copyObject() to do it.  Even if what you've got here works today,
it's not very future-proof.


Agreed.


Second, what guarantee do we have that we'll find a path with no
pathkeys and a NULL param_info?  Why can't all of the paths for a join
relation have pathkeys?  Why can't they all be parameterized?  I can't
think of anything that would guarantee that.


No.  The reason why I've modified the patch that way is simply because 
the latest postgres_fdw patch doesn't support creating a remote query 
for a presorted or parameterized path for a remote join.



Third, even if such a guarantee existed, why is this the right
behavior?  Any join type will produce the same output; it's just a
question of performance.  And if you have only one tuple on each side,
surely a nested loop would be fine.


Yeah, I think we would also need to consider the parameterization.


It seems to me that what you ought to be doing is using data hung off
the fdw_private field of each RelOptInfo to cache a NestPath that can
be used for EPQ rechecks at that level.  When you go to consider
pushing down another join, you can build up a new NestPath that's
suitable for the new level.  That seems much cleaner than groveling
through the list of surviving paths and hoping you find the right kind
of thing.


Agreed.

(From the first, I am not against that an FDW author creates the local 
join execution path by itself.  The reason why I've modified the patch 
so as to find a local join execution path from the path list is simply 
because that is simple.  The main point I'd like to discuss about the 
patch is the changes to the core code).



And all that having been said, I still don't really understand why you
are resisting the idea of providing a callback so that the FDW can
execute arbitrary code in the recheck path.  There doesn't seem to be
any reason not to let the FDW take control of the rechecks if it
wishes, and there's no real cost in complexity that I can see.


IMO I thought there would be not a little development burden on an FDW 
author.  So, I was rather against the idea of providing such a callback.


I know we still haven't reached a consensus on whether we address this 
issue by using a local join execution path.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Hooking at standard_join_search (Was: Re: [HACKERS] Foreign join pushdown vs EvalPlanQual)

2015-10-09 Thread Etsuro Fujita

On 2015/09/11 6:30, Robert Haas wrote:

On Wed, Sep 9, 2015 at 2:30 AM, Etsuro Fujita
 wrote:

But that path might have already been discarded on the basis of cost.
I think Tom's idea is better: let the FDW consult some state cached
for this purpose in the RelOptInfo



Do you have an idea of what information would be collected into the state
and how the FDW would derive parameterizations to consider producing
pushed-down joins with from that information?  What I'm concerned about that
is to reduce the number of parameterizations to consider, to reduce overhead
in costing the corresponding queries.  I'm missing something, though.



I think the thing we'd want to store in the state would be enough
information to reconstruct a valid join nest.  For example, the
reloptinfo for (A B) might note that A needs to be left-joined to B.
When we go to construct paths for (A B C), and there is no
SpecialJoinInfo that mentions C, we know that we can construct (A LJ
B) IJ C rather than (A IJ B) IJ C.  If any paths survived, we could
find a way to pull that information out of the path, but pulling it
out of the RelOptInfo should always work.

I am not sure what to do about parameterizations.  That's one of my
remaining concerns about moving the hook.


Do you have any plan about the hook?

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-09 Thread Etsuro Fujita

On 2015/10/09 15:04, Kyotaro HORIGUCHI wrote:

At Fri, 9 Oct 2015 12:00:30 +0900, Etsuro Fujita  wrote 
in <56172dce.7080...@lab.ntt.co.jp>

On 2015/10/08 19:55, Etsuro Fujita wrote:

I noticed that the approach using a column to populate the foreign
scan's slot directly wouldn't work well in some cases.  For example,
consider:

SELECT * FROM verysmall v LEFT JOIN (bigft1 JOIN bigft2 ON bigft1.x =
bigft2.x) ON v.q = bigft1.q AND v.r = bigft2.r FOR UPDATE OF v;



Oops, I should have written "JOIN", not "LEFT JOIN".



The best plan is presumably something like this as you said before:

LockRows
-> Nested Loop
 -> Seq Scan on verysmall v
 -> Foreign Scan on bigft1 and bigft2
  Remote SQL: SELECT * FROM bigft1 JOIN bigft2 ON bigft1.x =
bigft2.x AND bigft1.q = $1 AND bigft2.r = $2

Consider the EvalPlanQual testing to see if the updated version of a
tuple in v satisfies the query.  If we use the column in the testing,
we
would get the wrong results in some cases.



More precisely, we would get the wrong result when the value of v.q or
v.r in the updated version has changed.



What do you think the right behavior?


IIUC, I think that the foreign scan's slot should be set empty, that the 
join should fail, and that the updated version of the tuple in v should 
be ignored in that scenario since that for the updated version of the 
tuple in v, the tuples obtained from those two foreign tables wouldn't 
satisfy the remote query.  But if populating the foreign scan's slot 
from that column, then the join would success and the updated version of 
the tuple in v would be returned wrongly, I think.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-08 Thread Etsuro Fujita

On 2015/10/08 19:55, Etsuro Fujita wrote:

I noticed that the approach using a column to populate the foreign
scan's slot directly wouldn't work well in some cases.  For example,
consider:

SELECT * FROM verysmall v LEFT JOIN (bigft1 JOIN bigft2 ON bigft1.x =
bigft2.x) ON v.q = bigft1.q AND v.r = bigft2.r FOR UPDATE OF v;


Oops, I should have written "JOIN", not "LEFT JOIN".


The best plan is presumably something like this as you said before:

LockRows
-> Nested Loop
-> Seq Scan on verysmall v
-> Foreign Scan on bigft1 and bigft2
 Remote SQL: SELECT * FROM bigft1 JOIN bigft2 ON bigft1.x =
bigft2.x AND bigft1.q = $1 AND bigft2.r = $2

Consider the EvalPlanQual testing to see if the updated version of a
tuple in v satisfies the query.  If we use the column in the testing, we
would get the wrong results in some cases.


More precisely, we would get the wrong result when the value of v.q or 
v.r in the updated version has changed.


I don't have a good idea for this, so would an approach using an local 
join execution plan be the good way to go?


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-08 Thread Etsuro Fujita

On 2015/10/07 15:39, Etsuro Fujita wrote:

On 2015/10/07 15:06, Kyotaro HORIGUCHI wrote:

At Wed, 7 Oct 2015 00:24:57 -0400, Robert Haas 
wrote

I think it rather requires *replacing* two resjunk columns by one new
one.  The whole-row references for the individual foreign tables are
only there to support EvalPlanQual; if we instead have a column to
populate the foreign scan's slot directly, then we can use that column
for that purpose directly and there's no remaining use for the
whole-row vars on the baserels.



It is what I had in mind.



OK  I'll investigate this further.


I noticed that the approach using a column to populate the foreign 
scan's slot directly wouldn't work well in some cases.  For example, 
consider:


SELECT * FROM verysmall v LEFT JOIN (bigft1 JOIN bigft2 ON bigft1.x = 
bigft2.x) ON v.q = bigft1.q AND v.r = bigft2.r FOR UPDATE OF v;


The best plan is presumably something like this as you said before:

LockRows
-> Nested Loop
   -> Seq Scan on verysmall v
   -> Foreign Scan on bigft1 and bigft2
Remote SQL: SELECT * FROM bigft1 JOIN bigft2 ON bigft1.x = 
bigft2.x AND bigft1.q = $1 AND bigft2.r = $2


Consider the EvalPlanQual testing to see if the updated version of a 
tuple in v satisfies the query.  If we use the column in the testing, we 
would get the wrong results in some cases.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-06 Thread Etsuro Fujita

On 2015/10/07 15:06, Kyotaro HORIGUCHI wrote:

At Wed, 7 Oct 2015 00:24:57 -0400, Robert Haas  wrote

I think it rather requires *replacing* two resjunk columns by one new
one.  The whole-row references for the individual foreign tables are
only there to support EvalPlanQual; if we instead have a column to
populate the foreign scan's slot directly, then we can use that column
for that purpose directly and there's no remaining use for the
whole-row vars on the baserels.



It is what I had in mind.


OK  I'll investigate this further.

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-06 Thread Etsuro Fujita

On 2015/10/07 6:19, Robert Haas wrote:

On Fri, Oct 2, 2015 at 4:26 AM, Kyotaro HORIGUCHI
 wrote:

During join search, a joinrel should be comptible between local
joins and remote joins, of course target list also should be
so. So it is quite difficult to add wholerow resjunk for joinrels
before whole join tree is completed even if we allow row marks
that are not bound to base RTEs.



Suppose ROW_MARK_COPY is in use, and suppose the query is: SELECT
ft1.a, ft1.b, ft2.a, ft2.b FROM ft1, ft2 WHERE ft1.x = ft2.x;

When the foreign join is executed, there's going to be a slot that
needs to be populated with ft1.a, ft1.b, ft2.a, ft2.b, and a whole row
reference. Now, let's suppose the slot descriptor has 5 columns: those
4, plus a whole-row reference for ROW_MARK_COPY.


IIUC, I think that if ROW_MARK_COPY is in use, the descriptor would have 
6 columns: those 4, plus a whole-row var for ft1 and another whole-row 
bar for ft2.  Maybe I'm missing something, though.



4, plus a whole-row reference for ROW_MARK_COPY.  If we know what
values we're going to store in columns 1..4, couldn't we just form
them into a tuple to populate column 5? We don't actually need to be
able to fetch such a tuple from the remote side because we can just
construct it.  I think.


I also was thinking whether we could replace one of the whole-row vars 
with a whole-row var that represents the scan slot of the 
ForeignScanState node.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Obsolete comment in tidpath.c

2015-10-06 Thread Etsuro Fujita
On 2015/10/07 7:01, Tom Lane wrote:
> Robert Haas  writes:
>> On Mon, Oct 5, 2015 at 3:05 AM, Etsuro Fujita
>>  wrote:
>>> I think "best_inner_indexscan()" in the following comment in tidpath.c
>>> is obsolete.
>>>
>>> * There is currently no special support for joins involving CTID; in
>>> * particular nothing corresponding to best_inner_indexscan().  Since it's
>>> * not very useful to store TIDs of one table in another table, there
>>> * doesn't seem to be enough use-case to justify adding a lot of code
>>> * for that.
>>>
>>> How about s/best_inner_indexscan()/parameterized scans/?
> 
>> I'm not sure that's altogether clear.
> 
> Probably consider_index_join_clauses() is the closest current equivalent.
> However, it may not be such a great idea to have this comment referencing
> a static function in another file, as it wouldn't occur to people to look
> here when rewriting indxpath.c.  (Ahem.)
> 
> Perhaps "in particular, no ability to produce parameterized paths here".

Works for me.

Best regards,
Etsuro Fujita



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-05 Thread Etsuro Fujita

On 2015/09/29 16:36, Etsuro Fujita wrote:

For the foreign table case (scanrelid>0), I imagined an approach
different than yours.  In that case, I thought the issue would be
probably addressed by just modifying the remote query performed in
RefetchForeignRow, which would be of the form "SELECT ctid, * FROM
remote table WHERE ctid = $1", so that the modified query would be of
the form "SELECT ctid, * FROM remote table WHERE ctid = $1 AND *remote
quals*".


Sorry, I was wrong.  I noticed that the modifieid query (that will be 
sent to the remote server during postgresRefetchForeignRow) should be of 
the form "SELECT * FROM (SELECT ctid, * FROM remote table WHERE ctid = 
$1) ss WHERE *remote quals*".  (I think the query of the form "SELECT 
ctid, * FROM remote table WHERE ctid = $1 AND *remote quals*" would be 
okay if using a TID scan on the remote side, though.)


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Obsolete comment in tidpath.c

2015-10-05 Thread Etsuro Fujita
I think "best_inner_indexscan()" in the following comment in tidpath.c
is obsolete.

 * There is currently no special support for joins involving CTID; in
 * particular nothing corresponding to best_inner_indexscan().  Since it's
 * not very useful to store TIDs of one table in another table, there
 * doesn't seem to be enough use-case to justify adding a lot of code
 * for that.

How about s/best_inner_indexscan()/parameterized scans/?

Best regards,
Etsuro Fujita



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusing remark about UPSERT in fdwhandler.sgml

2015-10-04 Thread Etsuro Fujita

On 2015/10/03 5:57, Robert Haas wrote:

On Fri, Oct 2, 2015 at 4:04 AM, Peter Geoghegan  wrote:

On Fri, Oct 2, 2015 at 1:00 AM, Etsuro Fujita
 wrote:

ISTM that the sentence "as remote constraints are not locally known" is
somewhat confusing, because check constrains on remote tables can be
defined locally in 9.5.  How about "unique constraints or exclusion
constraints on remote tables are not locally known"?  Attached is a
patch for that.



Makes sense to me.



Me, too.  Committed.


Thanks!

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Confusing remark about UPSERT in fdwhandler.sgml

2015-10-02 Thread Etsuro Fujita
The following is a remark about UPSERT in fdwhandler.sgml.

 INSERT with an ON CONFLICT clause does not
 support specifying the conflict target, as remote constraints are not
 locally known. This in turn implies that ON CONFLICT DO
 UPDATE is not supported, since the specification is mandatory there.

ISTM that the sentence "as remote constraints are not locally known" is
somewhat confusing, because check constrains on remote tables can be
defined locally in 9.5.  How about "unique constraints or exclusion
constraints on remote tables are not locally known"?  Attached is a
patch for that.

Best regards,
Etsuro Fujita
*** a/doc/src/sgml/fdwhandler.sgml
--- b/doc/src/sgml/fdwhandler.sgml
***
*** 1196,1204  GetForeignServerByName(const char *name, bool missing_ok);
  
  
   INSERT with an ON CONFLICT clause does not
!  support specifying the conflict target, as remote constraints are not
!  locally known. This in turn implies that ON CONFLICT DO
!  UPDATE is not supported, since the specification is mandatory there.
  
  
 
--- 1196,1205 
  
  
   INSERT with an ON CONFLICT clause does not
!  support specifying the conflict target, as unique constraints or
!  exclusion constraints on remote tables are not locally known. This
!  in turn implies that ON CONFLICT DO UPDATE is not supported,
!  since the specification is mandatory there.
  
  
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Typo in /src/backend/optimizer/README

2015-10-01 Thread Etsuro Fujita
On 2015/10/01 22:31, Tom Lane wrote:
> Etsuro Fujita  writes:
>> The following is a remark added to /src/backend/optimizer/README by
>> commit 8703059c6b55c427100e00a09f66534b6ccbfaa1, and IIUC, I think "LHS"
>> in the last sentence "We prevent that by forcing the min LHS for the
>> upper join to include B." should be "RHS".

> Mmm, yeah, that's a typo.  Will fix, thanks.

Thanks!

Best regards,
Etsuro Fujita



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-01 Thread Etsuro Fujita

On 2015/10/02 9:50, Kyotaro HORIGUCHI wrote:

As long as FDW author can choose their best way to produce a joined
tuple, it may be worth to investigate.

My comments are:
* ForeignRecheck is the best location to call RefetchForeignJoinRow
   when scanrelid==0, not ExecScanFetch. Why you try to add special
   case for FDW in the common routine.


In my understanding, the job that ExecScanRecheckMtd should do is to 
check if the test tuple *already stored* in the plan node's scan slot 
meets the access-method conditions, in general.  So, ISTM that it'd be
somewhat odd to replace RefetchForeignJoinRow within ForeignRecheck, to 
store the remote join tuple in the slot.  Also, RefetchForeignRow is 
called from the common routines ExecLockRows/EvalPlanQualFetchRowMarks ...



* It is FDW's choice where the remote join tuple is kept, even though
   most of FDW will keep it on the private field of ForeignScanState.


I see.

To make it possible that the FDW doesn't have to do anything for cases 
where the FDW doesn't do any late row locking, however, I think it'd be 
more promising to use the remote join tuple stored in the scan slot of 
the corresponding ForeignScanState node in the parent's planstate tree. 
 I haven't had a good idea for that yet, though.



EvalPlanQualFetchRowMarks fetches the possiblly
modified row then EvalPlanQualNext does recheck for the new
row.


Really?  EvalPlanQualFetchRowMarks fetches the tuples for any non-locked 
relations, so I think that that function should fetch the same version 
previously obtained for each such relation successfully.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-01 Thread Etsuro Fujita

On 2015/10/01 19:02, Kyotaro HORIGUCHI wrote:

At Thu, 1 Oct 2015 17:50:25 +0900, Etsuro Fujita  wrote 
in <560cf3d1.9060...@lab.ntt.co.jp>

From: pgsql-hackers-ow...@postgresql.org
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas



So, if we wanted to fix this in a way that preserves the spirit of
what's there now, it seems to me that we'd want the FDW to return
something that's like a whole row reference, but represents the output
of the foreign join rather than some underlying base table.  And then
get the EPQ machinery to have the evaluation of the ForeignScan for
the join, when it happens in an EPQ context, to return that tuple.
But I don't really have a good idea how to do that.



So, I'd like to investigate another approach that preserves the
applicability of late row locking to the join pushdown case as well as
the spirit of what's there now.  The basic idea is (1) add a new
callback routine RefetchForeignJoinRow that refetches one foreign-join
tuple from the foreign server, after locking remote tuples for the
component foreign tables if required,



It would be the case that at least one of the component relations
of a foreign join is other than ROW_MARK_COPY, which is not
possible so far on postgres_fdw.


Yes.  To be exact, it's possible for the component relations to have 
rowmark methods other than ROW_MARK_COPY using GetForeignRowMarkType, in 
principle, but the server crashes ...



For the case that some of the
component relations are other than ROW_MARK_COPY, we might should
call RefetchForeignRow for such relations and construct joined
row involving ROW_MARK_COPY relations.


You are saying that we should construct the joined row using an 
alternative local join execution plan?



Indeed we could consider some logic for the case, it is obvious
that the case now we should focus on is a "foreign join" scan
with all underlying foreign scans are ROW_MARK_COPY, I
think. "foreign join" scan with ROW_MARK_COPY looks to be
promising (for me) and in future it would be able to coexist with
refetch mechanism maybe in your mind from this point of
view... Maybe:p


I agree that the approach "foreign-join scan with ROW_MARK_COPY" would 
be promising.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Typo in /src/backend/optimizer/README

2015-10-01 Thread Etsuro Fujita
The following is a remark added to /src/backend/optimizer/README by
commit 8703059c6b55c427100e00a09f66534b6ccbfaa1, and IIUC, I think "LHS"
in the last sentence "We prevent that by forcing the min LHS for the
upper join to include B." should be "RHS".

The use of minimum Relid sets has some pitfalls; consider a query like
A leftjoin (B leftjoin (C innerjoin D) on (Pbcd)) on Pa
where Pa doesn't mention B/C/D at all.  In this case a naive computation
would give the upper leftjoin's min LHS as {A} and min RHS as {C,D} (since
we know that the innerjoin can't associate out of the leftjoin's RHS, and
enforce that by including its relids in the leftjoin's min RHS).  And the
lower leftjoin has min LHS of {B} and min RHS of {C,D}.  Given such
information, join_is_legal would think it's okay to associate the upper
join into the lower join's RHS, transforming the query to
B leftjoin (A leftjoin (C innerjoin D) on Pa) on (Pbcd)
which yields totally wrong answers.  We prevent that by forcing the min LHS
for the upper join to include B.

Best regards,
Etsuro Fujita



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-01 Thread Etsuro Fujita

On 2015/10/01 11:15, Kouhei Kaigai wrote:

From: pgsql-hackers-ow...@postgresql.org
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
On Mon, Sep 28, 2015 at 11:15 PM, Etsuro Fujita
 wrote:

I thought the same thing [1].  While I thought it was relatively easy to
make changes to RefetchForeignRow that way for the foreign table case
(scanrelid>0), I was not sure how hard it would be to do so for the foreign
join case (scanrelid==0).  So, I proposed to leave that changes for 9.6.
I'll have a rethink on this issue along the lines of that approach.



So, if we wanted to fix this in a way that preserves the spirit of
what's there now, it seems to me that we'd want the FDW to return
something that's like a whole row reference, but represents the output
of the foreign join rather than some underlying base table.  And then
get the EPQ machinery to have the evaluation of the ForeignScan for
the join, when it happens in an EPQ context, to return that tuple.
But I don't really have a good idea how to do that.



Alternative built-in join execution?
Once it is executed under the EPQ context, built-in join node fetches
a tuple from both of inner and outer side for each. It is eventually
fetched from the EPQ slot, then the alternative join produce a result
tuple.
In case when FDW is not designed to handle join by itself, it is
a reasonable fallback I think.

I expect FDW driver needs to handle EPQ recheck in the case below:
* ForeignScan on base relation and it uses late row locking.
* ForeignScan on join relation, even if early locking.


I also think the approach would be one choice.  But one thing I'm 
concerned about is plan creation for that by the FDW author; that would 
make life hard for the FDW author.  (That was proposed by me ...)


So, I'd like to investigate another approach that preserves the 
applicability of late row locking to the join pushdown case as well as 
the spirit of what's there now.  The basic idea is (1) add a new 
callback routine RefetchForeignJoinRow that refetches one foreign-join 
tuple from the foreign server, after locking remote tuples for the 
component foreign tables if required, and (2) call that routine in 
ExecScanFetch if the target scan is for a foreign join and the component 
foreign tables require to be locked lately, else just return the 
foreign-join tuple stored in the parent's state tree, which is the tuple 
mentioned by Robert, for preserving the spirit of what's there now.  I 
think that ExecLockRows and EvalPlanQualFetchRowMarks should probably be 
modified so as to skip foreign tables involved in a foreign join.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-01 Thread Etsuro Fujita
On 2015/10/01 15:38, Kyotaro HORIGUCHI wrote:
>> I expect FDW driver needs to handle EPQ recheck in the case below:
>> * ForeignScan on base relation and it uses late row locking.

> I think this is indisputable.

I think so.  But I think this case would probably be handled by the
existing RefetchForeignRow routine as I said upthread.

>> * ForeignScan on join relation, even if early locking.

> This could be unnecessary if the "foreign join" scan node can
> have its own rowmark of ROW_MARK_COPY.

That's an idea, but I'd vote for preserving the applicability of late
row locking to the foreign join case, allowing component foreign tables
involved in a foreign join to have different rowmark methods other than
ROW_MARK_COPY.

Best regards,
Etsuro Fujita



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-09-30 Thread Etsuro Fujita

On 2015/09/30 6:55, Robert Haas wrote:

On Mon, Sep 28, 2015 at 11:15 PM, Etsuro Fujita
 wrote:

I thought the same thing [1].  While I thought it was relatively easy to
make changes to RefetchForeignRow that way for the foreign table case
(scanrelid>0), I was not sure how hard it would be to do so for the foreign
join case (scanrelid==0).  So, I proposed to leave that changes for 9.6.
I'll have a rethink on this issue along the lines of that approach.


Well, I spent some more time looking at this today, and testing it out
using a fixed-up version of your foreign_join_v16 patch, and I decided
that RefetchForeignRow is basically a red herring.  That's only used
for FDWs that do late row locking, but postgres_fdw (and probably many
others) do early row locking, in which case RefetchForeignRow never
gets called. Instead, the row is treated as a "non-locked source row"
by ExecLockRows (even though it is in fact locked) and is re-fetched
by EvalPlanQualFetchRowMarks.  We should probably update the comment
about non-locked source rows to mention the case of FDWs that do early
row locking.

Anyway, everything appears to work OK up to this point: we correctly
retrieve the saved whole-rows from the foreign side and call
EvalPlanQualSetTuple on each one, setting es_epqTuple[rti - 1] and
es_epqTupleSet[rti - 1].  So far, so good.  Now we call
EvalPlanQualNext, and that's where we get into trouble.  We've got the
already-locked tuples from the foreign side and those tuples CANNOT
have gone away or been modified because we have already locked them.
So, all the foreign join needs to do is return the same tuple that it
returned before: the EPQ recheck was triggered by some *other* table
involved in the plan, not our table.  A local table also involved in
the query, or conceivably a foreign table that does late row locking,
could have had something change under it after the row was fetched,
but in postgres_fdw that can't happen because we locked the row up
front.  And thus, again, all we need to do is re-return the same
tuple.  But we don't have that.  Instead, the ROW_MARK_COPY logic has
caused us to preserve a copy of each *baserel* tuple.

Now, this is as sad as can be.  Early row locking has huge advantages
for FDWs, both in terms of minimizing server round trips and also
because the FDW doesn't really need to do anything about EPQ.  Sure,
it's inefficient to carry around whole-row references, but it makes
life easy for the FDW author.

So, if we wanted to fix this in a way that preserves the spirit of
what's there now, it seems to me that we'd want the FDW to return
something that's like a whole row reference, but represents the output
of the foreign join rather than some underlying base table.  And then
get the EPQ machinery to have the evaluation of the ForeignScan for
the join, when it happens in an EPQ context, to return that tuple.
But I don't really have a good idea how to do that.


I like a general solution.  Can't we extend that idea so that foreign 
tables involved in a foreign join are allowed to have different rowmark 
methods other than ROW_MARK_COPY, eg, ROW_MARK_EXCLUSIVE?


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-09-30 Thread Etsuro Fujita

On 2015/09/29 21:38, Kouhei Kaigai wrote:

Also note that EvalPlanQualFetchRowMarks() will raise an error
if RefetchForeignRow callback returned NULL tuple.
Is it right or expected behavior?



IIUC, I think that that behavior is reasonable.



It looks to me this callback is designed to pull out a particular
tuple identified by the remote-row-id, regardless of the qualifier
checks based on the latest value.



Because erm->markType==ROW_MARK_REFERENCE, I don't think that that
behavior would cause any problem.  Maybe I'm missing something, though.



Really?


Yeah, I think RefetchForeignRow should work differently depending on the 
rowmark type.  When erm->markType==ROW_MARK_REFERENCE, the callback 
should fetch a particular tuple identified by the rowid (ie, the same 
version previously obtained) successfully.  So for that case, I don't 
think the remote quals need to be checked during RefetchForeignRow.



ExecLockRows() calls EvalPlanQualFetchRowMarks() to fill up EPQ tuple
slot prior to EvalPlanQualNext(), because these tuples are referenced
during EPQ rechecks.
The purpose of EvalPlanQualNext() is evaluate whether the current bunch
of rows are visible towards the qualifiers of underlying scan/join.
Then, if not visible, it *ignores* the current tuples, as follows.

 /*
  * Now fetch any non-locked source rows --- the EPQ logic knows how to
  * do that.
  */
 EvalPlanQualSetSlot(&node->lr_epqstate, slot);
 EvalPlanQualFetchRowMarks(&node->lr_epqstate);   <--- LOAD REMOTE ROWS

 /*
  * And finally we can re-evaluate the tuple.
  */
 slot = EvalPlanQualNext(&node->lr_epqstate); <--- EVALUATE 
QUALIFIERS
 if (TupIsNull(slot))
 {
 /* Updated tuple fails qual, so ignore it and go on */
 goto lnext;   <-- IGNORE THE ROW, NOT RAISE AN ERROR
 }

What happen if RefetchForeignRow raise an error in case when the latest
row exists but violated towards the "remote quals" ?
This is the case to be ignored, unlike the case when remote row identified
by row-id didn't exist.


IIUC, I think that that depends on where RefetchForeignRow is called 
(ie, the rowmark type).  When it is called from 
EvalPlanQualFetchRowMarks, the transaction should be aborted as I 
mentioned above, if it couldn't fetch the same version previously 
obtained.  But when RefetchForeignRow is called from ExecLockRows, the 
tuple should be just ignored as the above code, if the latest version on 
the remote side didn't satisfy the remote quals.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Comment update to pathnode.c

2015-09-29 Thread Etsuro Fujita

On 2015/09/29 20:51, Robert Haas wrote:

On Tue, Sep 29, 2015 at 1:55 AM, Etsuro Fujita
 wrote:

Thanks for the comments!  Attached is an updated version of the patch.


Committed and back-patched to 9.5.


Thanks!

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-09-29 Thread Etsuro Fujita

On 2015/09/29 17:49, Kouhei Kaigai wrote:

From: pgsql-hackers-ow...@postgresql.org
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Etsuro Fujita



RefetchForeignRow() does not take ForeignScanState as its argument,
so it is not obvious to access its private field, isn't it?
ExecRowMark contains "rti" field, so it might be feasible to find out
the target PlanState using walker routine recently supported, although
it is not a simple enough.
Unless we don't have reference to the private field, it is not feasible
to access expression that was pushed down to the remote-side, therefore,
it does not allow to apply proper rechecks here.



Could you introduce us (1) how to access private data field of
ForeignScanState from the RefetchForeignRow callback?



For the foreign table case (scanrelid>0), I imagined an approach
different than yours.  In that case, I thought the issue would be
probably addressed by just modifying the remote query performed in
RefetchForeignRow, which would be of the form "SELECT ctid, * FROM
remote table WHERE ctid = $1", so that the modified query would be of
the form "SELECT ctid, * FROM remote table WHERE ctid = $1 AND *remote
quals*".


Sorry, I forgot to add "FOR UPDATE" to the before/after queries.


My question is how to pull expression of the remote query.
It shall be stored at somewhere private field of ForeignScanState,
however, RefetchForeignRow does not have direct access to the
relevant ForeignScanState node.
It is what I asked at the question (1).


I imagined the following steps to get the remote query string: (1) 
create the remote query string and store it in fdw_private during 
postgresGetForeignPlan, (2) extract the string from fdw_private and 
store it in erm->ermExtra during postgresBeginForeignScan, and (3) 
extract the string from erm->ermExtra in postgresRefetchForeignRow.



Also note that EvalPlanQualFetchRowMarks() will raise an error
if RefetchForeignRow callback returned NULL tuple.
Is it right or expected behavior?


IIUC, I think that that behavior is reasonable.


It looks to me this callback is designed to pull out a particular
tuple identified by the remote-row-id, regardless of the qualifier
checks based on the latest value.


Because erm->markType==ROW_MARK_REFERENCE, I don't think that that 
behavior would cause any problem.  Maybe I'm missing something, though.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-09-29 Thread Etsuro Fujita

On 2015/09/29 13:55, Kouhei Kaigai wrote:

From: pgsql-hackers-ow...@postgresql.org
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Etsuro Fujita
On 2015/09/29 9:13, Kouhei Kaigai wrote:



From: pgsql-hackers-ow...@postgresql.org
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
On Mon, Sep 28, 2015 at 3:34 AM, Kouhei Kaigai  wrote:



The attached patch allows FDW driver to handle EPQ recheck by its own
preferable way, even if it is alternative local join or ExecQual to
the expression being pushed down.



Thanks.  I was all set to commit this, or at least part of it, when I
noticed that we already have an FDW callback called RefetchForeignRow.
We seem to be intending that this new callback should refetch the row
from the foreign server and verify that any pushed-down quals apply to
it.  But why can't RefetchForeignRow do that?  That seems to be what
it's for.



I thought the same thing [1].  While I thought it was relatively easy to
make changes to RefetchForeignRow that way for the foreign table case
(scanrelid>0), I was not sure how hard it would be to do so for the
foreign join case (scanrelid==0).  So, I proposed to leave that changes
for 9.6.  I'll have a rethink on this issue along the lines of that
approach.



Even if base relation case, is it really easy to do?

RefetchForeignRow() does not take ForeignScanState as its argument,
so it is not obvious to access its private field, isn't it?
ExecRowMark contains "rti" field, so it might be feasible to find out
the target PlanState using walker routine recently supported, although
it is not a simple enough.
Unless we don't have reference to the private field, it is not feasible
to access expression that was pushed down to the remote-side, therefore,
it does not allow to apply proper rechecks here.

In addition, it is problematic when scanrelid==0 because we have no
relevant ForeignScanState which represents the base relations, even
though ExecRowMark is associated with a particular base relation.
In case of scanrelid==0, EPQ recheck routine also have to ensure
the EPQ tuple is visible towards the join condition in addition to
the qualifier of base relation. These information is also stored within
private data field, so it has to have a reference to the private data
of ForeignScanState of the remote join (scanrelid==0) which contains
the target relation.

Could you introduce us (1) how to access private data field of
ForeignScanState from the RefetchForeignRow callback? (2) why it
is reasonable to implement than the callback on ForeignRecheck().


For the foreign table case (scanrelid>0), I imagined an approach 
different than yours.  In that case, I thought the issue would be 
probably addressed by just modifying the remote query performed in 
RefetchForeignRow, which would be of the form "SELECT ctid, * FROM 
remote table WHERE ctid = $1", so that the modified query would be of 
the form "SELECT ctid, * FROM remote table WHERE ctid = $1 AND *remote 
quals*".


For the foreign join case (scanrelid==0), in my vision, I think we would 
need some changes not only to RefetchForeignRow but to the existing 
EvalPlanQual machinery in the core.  I've not had a clear image yet, though.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Comment update to pathnode.c

2015-09-28 Thread Etsuro Fujita

On 2015/09/12 4:26, Robert Haas wrote:

On Fri, Sep 11, 2015 at 6:22 AM, Etsuro Fujita
 wrote:

The comments for create_foreignscan_path says as follows, but that it's
now possible that the function is called by GetForeignJoinPaths, which
was added in 9.5.

1450 /*
1451  * create_foreignscan_path
1452  *Creates a path corresponding to a scan of a foreign table,
1453  *returning the pathnode.
1454  *
1455  * This function is never called from core Postgres; rather, it's
expected
1456  * to be called by the GetForeignPaths function of a foreign data
wrapper.
1457  * We make the FDW supply all fields of the path, since we do not
have any
1458  * way to calculate them in core.
1459  */

So, I've updated the comments.  Please find attached a patch.


I would write "to be called by the GetForeignPaths or
GetForeignJoinPaths function" to keep it a bit more concise.  And I
would reflow the paragraph.


Thanks for the comments!  Attached is an updated version of the patch.

Best regards,
Etsuro Fujita
*** a/src/backend/optimizer/util/pathnode.c
--- b/src/backend/optimizer/util/pathnode.c
***
*** 1449,1461  create_worktablescan_path(PlannerInfo *root, RelOptInfo *rel,
  
  /*
   * create_foreignscan_path
!  *	  Creates a path corresponding to a scan of a foreign table,
!  *	  returning the pathnode.
   *
   * This function is never called from core Postgres; rather, it's expected
!  * to be called by the GetForeignPaths function of a foreign data wrapper.
!  * We make the FDW supply all fields of the path, since we do not have any
!  * way to calculate them in core.
   */
  ForeignPath *
  create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
--- 1449,1461 
  
  /*
   * create_foreignscan_path
!  *	  Creates a path corresponding to a scan of a foreign table or
!  *	  a foreign join, returning the pathnode.
   *
   * This function is never called from core Postgres; rather, it's expected
!  * to be called by the GetForeignPaths or GetForeignJoinPaths function of
!  * a foreign data wrapper.  We make the FDW supply all fields of the path,
!  * since we do not have any way to calculate them in core.
   */
  ForeignPath *
  create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-09-28 Thread Etsuro Fujita

On 2015/09/29 9:13, Kouhei Kaigai wrote:

-Original Message-
From: pgsql-hackers-ow...@postgresql.org
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
Sent: Tuesday, September 29, 2015 5:46 AM
To: Kaigai Kouhei(海外 浩平)
Cc: Etsuro Fujita; PostgreSQL-development; 花田茂
Subject: Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

On Mon, Sep 28, 2015 at 3:34 AM, Kouhei Kaigai  wrote:

The attached patch allows FDW driver to handle EPQ recheck by its own
preferable way, even if it is alternative local join or ExecQual to
the expression being pushed down.


Thanks for the work, KaiGai-san!


Thanks.  I was all set to commit this, or at least part of it, when I
noticed that we already have an FDW callback called RefetchForeignRow.
We seem to be intending that this new callback should refetch the row
from the foreign server and verify that any pushed-down quals apply to
it.  But why can't RefetchForeignRow do that?  That seems to be what
it's for.


Thanks for the comments, Robert!

I thought the same thing [1].  While I thought it was relatively easy to 
make changes to RefetchForeignRow that way for the foreign table case 
(scanrelid>0), I was not sure how hard it would be to do so for the 
foreign join case (scanrelid==0).  So, I proposed to leave that changes 
for 9.6.  I'll have a rethink on this issue along the lines of that 
approach.


Sorry for having had no response.  I was on vacation.

Best regards,
Etsuro Fujita

[1] http://www.postgresql.org/message-id/55deb5a9.8010...@lab.ntt.co.jp



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


<    1   2   3   4   5   6   7   8   9   10   >