> -----Original Message-----
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kyotaro HORIGUCHI
> Sent: Wednesday, July 22, 2015 4:10 PM
> To: robertmh...@gmail.com
> Cc: hlinn...@iki.fi; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Asynchronous execution on FDW
>
> Hello, thank you for the comment.
>
> At Fri, 17 Jul 2015 14:34:53 -0400, Robert Haas <robertmh...@gmail.com> wrote
> in <ca+tgmoaijk1svzw_gkfu+zssxcijkfelqu2aomvuphpsfw4...@mail.gmail.com>
> > On Fri, Jul 3, 2015 at 4:41 PM, Heikki Linnakangas <hlinn...@iki.fi> wrote:
> > > At a quick glance, I think this has all the same problems as starting the
> > > execution at ExecInit phase. The correct way to do this is to kick off the
> > > queries in the first IterateForeignScan() call. You said that "ExecProc
> > > phase does not fit" - why not?
> >
> > What exactly are those problems?
> >
> > I can think of these:
> >
> > 1. If the scan is parametrized, we probably can't do it for lack of
> > knowledge of what they will be. This seems easy; just don't do it in
> > that case.
>
> We can put an early kick to foreign scans only for the first shot
> if we do it outside (before) ExecProc phase.
>
> Nestloop
>   -> SeqScan
>   -> Append
>      -> Foreign (Index) Scan
>      -> Foreign (Index) Scan
>      ..
>
> This plan premises precise (even to some extent) estimate for
> remote query but async execution within ExecProc phase would be
> in effect for this case.
>
>
> > 2. It's possible that we're down inside some subtree of the plan that
> > won't actually get executed. This is trickier.
>
> As for current postgres_fdw, it is done simply abandoning queued
> result then close the cursor.
>
> > Consider this:
> >
> > Append
> > -> Foreign Scan
> > -> Foreign Scan
> > -> Foreign Scan
> > <repeat 17 more times>
> >
> > If we don't start each foreign scan until the first tuple is fetched,
> > we will not get any benefit here, because we won't fetch the first
> > tuple from query #2 until we finish reading the results of query #1.
> > If the result of the Append node will be needed in its entirety, we
> > really, really want to launch of those queries as early as possible.
> > OTOH, if there's a Limit node with a small limit on top of the Append
> > node, that could be quite wasteful.
>
> It's the nature of speculative execution, but the Limit will be
> pushed down onto every Foreign Scans near future.
>
> > We could decide not to care: after all, if our limit is
> > satisfied, we can just bang the remote connections shut, and if
> > they wasted some CPU, well, tough luck for them. But it would
> > be nice to be smarter. I'm not sure how, though.
>
> Appropriate fetch size will cap the harm and the case will be
> handled as I mentioned above as for postgres_fdw.
>

Horiguchi-san,
Let me ask a basic question.

If we have a ParallelAppend node that kicks off a background worker
process for each underlying child node in parallel, does ForeignScan
need to do anything special?

The expected waste of CPU or I/O is a common problem to be solved;
however, I don't think solving it requires special-case handling in
ForeignScan.

What is your opinion?

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kai...@ak.jp.nec.com>