On 2016/03/08 2:35, Robert Haas wrote:
On Mon, Mar 7, 2016 at 7:53 AM, Etsuro Fujita
<fujita.ets...@lab.ntt.co.jp> wrote:
Another option to avoid such a hazard would be to remove the two changes
from ExecInitModifyTable and create ExecAuxRowMarks and junk filters even in
the pushdown case.  I made the changes because we won't use ExecAuxRowMarks
in that case since we don't need to do EvalPlanQual rechecks and because we
won't use junk filters in that case since we do UPDATE/DELETE in the
subplan.  But the creating cost is enough small, so simply removing the
changes seems like a good idea.

Sure, that works.

OK, I removed the changes.

This issue crops up elsewhere as well.  The changes to
ExecModifyTable() have the same problem -- in that case, it might be
wise to move the code that's going to have to be indented yet another
level into a separate function.   That code also says this:

+                       /* No need to provide scan tuple to
ExecProcessReturning. */
+                       slot = ExecProcessReturning(resultRelInfo,
NULL, planSlot);

...but, uh, why not?  The comment says what the code does, but what it
should do is explain why it does it.

As documented in IterateDMLPushdown in fdwhandler.sgml, the reason for that
is that in the pushdown case it's the IterateDMLPushdown's responsiblity to
get actually inserted/updated/deleted tuples and make those tuples available
to the ExecProcessReturning.  I'll add comments.

Comments are good things to have.  :-)

Yeah, I added comments.

On a broader level, I'm not very happy with the naming this patch
uses.  Here's an example:

+    <para>
+     If an FDW supports optimizing foreign table updates, it still needs
to
+     provide <function>PlanDMLPushdown</>, <function>BeginDMLPushdown</>,
+     <function>IterateDMLPushdown</> and <function>EndDMLPushdown</>
+     described below.
+    </para>

"Optimizing foreign table updates" is both inaccurate (since it
doesn't only optimize updates) and so vague as to be meaningless
unless you already know what it means.  The actual patch uses
terminology like "fdwPushdowns" which is just as bad.  We might push a
lot of things to the foreign side -- sorts, joins, aggregates, limits
-- and this is just one of them.  Worse, "pushdown" is itself
something of a term of art - will people who haven't been following
all of the mammoth, multi-hundred-email threads on this topic know
what that means?  I think we need some better terminology here.

The best thing that I can come up with offhand is "bulk modify".  So
we'd have PlanBulkModify, BeginBulkModify, IterateBulkModify,
EndBulkModify, ExplainBulkModify.  Other suggestions welcome.   The
ResultRelInfo flag could be ri_usesFDWBulkModify.

I'm not sure that "bulk modify" is best.  Yeah, this would improve the
performance especially in the bulk-modification case, but would improve the
performance even in the case where an UPDATE/DELETE modifies just a single
row.  Let me explain using an example.  Without the patch, we have the
following plan for an UPDATE on a foreign table that updates a single row:

postgres=# explain verbose update foo set a = a + 1 where a = 1;
                                     QUERY PLAN
----------------------------------------------------------------------------------
  Update on public.foo  (cost=100.00..101.05 rows=1 width=14)
    Remote SQL: UPDATE public.foo SET a = $2 WHERE ctid = $1
    ->  Foreign Scan on public.foo  (cost=100.00..101.05 rows=1 width=14)
          Output: (a + 1), b, ctid
          Remote SQL: SELECT a, b, ctid FROM public.foo WHERE ((a = 1)) FOR
UPDATE
(5 rows)

The plan requires two queries, SELECT and UPDATE, to do the update.
(Actually, the plan have additional overheads in creating a cursor for the
SELECT and establishing a prepared statement for the UPDATE.)  But with the
patch, we have:

postgres=# explain verbose update foo set a = a + 1 where a = 1;
                                 QUERY PLAN
---------------------------------------------------------------------------
  Update on public.foo  (cost=100.00..101.05 rows=1 width=14)
    ->  Foreign Update on public.foo  (cost=100.00..101.05 rows=1 width=14)
          Remote SQL: UPDATE public.foo SET a = (a + 1) WHERE ((a = 1))
(3 rows)

The optimized plan requires just a single UPDATE query to do that!  So, even
in the single-row-modification case the patch could improve the performance.

How about "Direct Modify"; PlanDirectModify, BeginDirectModify,
IterateDirectModify, EndDirectModify, ExplainDirectModify, and
ri_usesFDWDirectModify.

Works for me!

Great! I changed the naming. I also updated docs as proposed by you in a previous email, and rebased the patch to the latest HEAD. Please find attached an updated version of the patch.

Best regards,
Etsuro Fujita

Attachment: fdw-dml-pushdown-v10.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to