> On Sat, Mar 14, 2015 at 3:48 AM, Robert Haas <robertmh...@gmail.com> wrote:
> 
> 
>       On Fri, Mar 13, 2015 at 2:31 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>       > Robert Haas <robertmh...@gmail.com> writes:
>       >> Another bit of this that I think we could commit without fretting
>       >> about it too much is the code adding set_join_pathlist_hook.  This
> is
>       >> - I think - analogous to set_rel_pathlist_hook, and like that hook,
>       >> could be used for other purposes than custom plan generation - e.g.
> to
>       >> delete paths we do not want to use.  I've extracted this portion of
>       >> the patch and adjusted the comments; if there are no objections, I
>       >> will commit this bit also.
>       >
>       > I don't object to the concept, but I think that is a pretty bad place
>       > to put the hook call: add_paths_to_joinrel is typically called 
> multiple
>       > (perhaps *many*) times per joinrel and thus this placement would force
>       > any user of the hook to do a lot of repetitive work.
> 
>       Interesting point.  I guess the question is whether a some or all
>       callers are going to actually *want* a separate call for each
>       invocation of add_paths_to_joinrel(), or whether they'll be happy to
>       operate on the otherwise-complete path list.  It's true that if your
>       goal is to delete paths, it's probably best to be called just once
>       after the path list is complete, and there might be a use case for
>       that, but I guess it's less useful than for baserels.  For a baserel,
>       as long as you don't nuke the sequential-scan path, there is always
>       going to be a way to complete the plan; so this would be a fine way to
>       implement a disable-an-index extension.  But for joinrels, it's not so
>       easy to rule out, say, a hash-join here.  Neither hook placement is
>       much good for that; the path you want to get rid of may have already
>       dominated paths you want to keep.
> 
>       Suppose you want to add paths - e.g. you have an extension that goes
>       and looks for a materialized view that matches this subtree of the
>       query, and if it finds one, it substitutes a scan of the materialized
>       view for a scan of the baserel.  Or, as in KaiGai's case, you have an
>       extension that can perform the whole join in GPU-land and produce the
>       same results we would have gotten via normal execution.  Either way,
>       you want - and this is the central point of the whole patch here - to
>       inject a scan path into a joinrel.  It is not altogether obvious to me
>       what the best placement for this is.  In the materialized view case,
>       you probably need a perfect match between the baserels in the view and
>       the baserels in the joinrel to do anything.  There's no point in
>       re-checking that for every innerrels/outerrels combination.  I don't
>       know enough about the GPU case to reason about it intelligently; maybe
>       KaiGai can comment.
> 
>       I think the foreign data wrapper join pushdown case, which also aims
>       to substitute a scan for a join, is interesting to think about, even
>       though it's likely to be handled by a new FDW method instead of via
>       the hook.  Where should the FDW method get called from?  Currently,
>       the FDW method in KaiGai's patch is GetForeignJoinPaths, and that gets
>       called from add_paths_to_joinrel().  The patch at
>       http://www.postgresql.org/message-id/CAEZqfEfy7p=uRpwN-Q-NNgzb8kwHbf
> qf82ysb9ztfzg7zn6...@mail.gmail.com
>       uses that to implement join pushdown in postgres_fdw; if you have A
>       JOIN B JOIN C all on server X, we'll notice that the join with A and B
>       can be turned into a foreign scan on A JOIN B, and similarly for A-C
>       and B-C.  Then, if it turns out that the cheapest path for A-B is the
>       foreign join, and the cheapest path for C is a foreign scan, we'll
>       arrive at the idea of a foreign scan on A-B-C, and we'll realize the
>       same thing in each of the other combinations as well.  So, eventually
>       the foreign join gets pushed down.
> 
>       But there's another possible approach: suppose that
>       join_search_one_level, after considering left-sided and right-sided
>       joins and after considering bushy joins, checks whether every relation
>       it's got is from the same foreign server, and if so, asks that foreign
>       server whether it would like to contribute any paths. Would that be
>       better or worse?  A disadvantage is that if you've got something like
>       A LEFT JOIN B LEFT JOIN C LEFT JOIN D LEFT JOIN E LEFT JOIN F LEFT
>       JOIN G LEFT JOIN H LEFT JOIN I but none of the joins can be pushed
>       down (say, each join clause calls a non-pushdown-safe function) you'll
>       end up examining a pile of joinrels - at every level of the join tree
>       - and individually rejecting each one.  With the
>       build-it-up-incrementally approach, you'll figure that all out at
>       level 2, and then after that there's nothing to do but give up
>       quickly.  On the other hand, I'm afraid the incremental approach might
>       miss a trick: consider small LEFT JOIN (big INNER JOIN huge ON big.x =
>       huge.x) ON small.y = big.y AND small.z = huge.z, where all three are
>       foreign tables on the same server.  If the output of the big/huge join
>       is big, none of those paths are going to survive at level 2, but the
>       overall join size might be very small, so we surely want a chance to
>       recover at level 3.  (We discussed test cases of this form quite a bit
>       in the context of e2fa76d80ba571d4de8992de6386536867250474.)
> 
> 
> 
> 
> The real problem here, is that with FDW in picture, the "optimal substructure"
> property required by dynamic programming is broken. If A foreign join B 
> foreign
> join C is optimal solution for problem A join B join C, A foreign join B is 
> not
> necessarily optimal solution for subproblem A join B. While for local 
> relations,
> PostgreSQL has to compute each two way join itself, and thus chooses the 
> cheapest
> path for each two way join, FDW (esp. those working with real foreign servers)
> do not compute the joins in two-way fashion and don't need to choose the 
> cheapest
> path for each two way join.
>
I cannot agree 100% because we cannot know whether A foreign join B foreign
join C is optimal than A join B join C. For example, if (A x B) is estimated
to generate O(N) rows but (A x B) x C is estimated to generate O(N x M) rows,
local join may be optimal to process the final stage.
Even if N-way remote join might be possible, we need to estimate the cost of
remote join for each level, and make a decision whether it shall be pushed-
down to the remote server based on the estimated cost.
The hooks location Tom suggested requires FDW to compute a foreign-scan path
for each joinrel during concentration of join combinations, but not multiple
times for each joinrel.

> A way to work around this is to leave the ForeignPaths (there can possibly be
> only one foreign path per join relation) in the joinrel without removing them.
> FDW should work on joining two relations if they have foreign paths in the 
> list
> of paths, irrespective of whether the cheapest path is foreign join path or 
> not.
> For the topmost joinrel, if the foreign path happens to be the cheapest one, 
> the
> whole join tree will be pushed down.
>
> On the other thread implementing foreign join for postgres_fdw,
> postgresGetForeignJoinPaths(), is just looking at the cheapest path, which 
> would
> cause the problem you have described above.
>
It might be an idea if foreign-scan path is not wiped out regardless of the
estimated cost, we will be able to construct an entirely remote-join path
even if intermediation path is expensive than local join.
A problem is, how to distinct these special paths from usual paths that are
eliminated on the previous stage once its path is more expensive.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kai...@ak.jp.nec.com>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to