Hi hackers, I'm working on $SUBJECT and would like to get comments about the design. Attached patch is for the design below. Note that the patch requires Kaigai-san's custom foriegn join patch[1]
[1] http://www.postgresql.org/message-id/9a28c8860f777e439aa12e8aea7694f80108c...@bpxm15gp.gisp.nec.co.jp Joins to be pushed down ======================= We have two levels of decision about Join push-down, core and FDW. I think we should allow them to push down joins as much as we can unless it doesn't break the semantics of join. Anyway FDWs should decide whether the join can be pushed down or not, on the basis of the FDW's capability. Here is the list of checks which should be done in core: 1. Join source relations All of foreign tables used in a join should be managed by one foreign data wrapper. I once proposed that all source tables should belong to one server, because at that time I assumed that FDWs use SERVER to express physical place of data source. But Robert's comment gave me an idea that SERVER is not important for some FDWs, so now I think check about server matching should be done by FDWs. USER MAPPING is another important attribute of foreign scan/join, and IMO it should be checked by FDW because some of FDWs don't require USER MAPPING. If an FDW want to check user mapping, all tables in the join should belong to the same server and have same RangeTablEntry#checkAsUser to ensure that only one user mapping is derived. 2. Join type Join type can be any, except JOIN_UNIQUE_OUTER and JOIN_UNIQUE_INNER, though most of FDWs would support only INNER and OUTER. Pushing down CROSS joins might seem inefficient, because obviously CROSS JOIN always produces more result than retrieving all rows from each foreign table separately. However, some FDW might be able to accelerate such join with cache or something. So I think we should leave such decision to FDWs. Here is the list of checks which shold be done in postgres_fdw: 1. Join source relations As described above, postgres_fdw (and most of SQL-based FDWs) needs to check that 1) all foreign tables in the join belong to a server, and 2) all foreign tables have same checkAsUser. In addition to that, I add extra limitation that both inner/outer should be plain foreign tables, not a result of foreign join. This limiation makes SQL generator simple. Fundamentally it's possible to join even join relations, so N-way join is listed as enhancement item below. 2. Join type In the first proposal, postgres_fdw allows INNER and OUTER joins to be pushed down. CROSS, SEMI and ANTI would have much less use cases. 3. Join conditions and WHERE clauses Join conditions should consist of semantically safe expressions. Where the "semantically safe" means is same as WHERE clause push-down. Planned enhancements for 9.5 ============================ These features will be proposed as enhancements, hopefully in the 9.5 development cycle, but probably in 9.6. 1. Remove unnecessary column from SELECT clause Columns which are used for only join conditions can be removed from the target list, as postgres_fdw does in simple foreign scans. 2. Support N-way joins Mostly for difficulty of SQL generation, I didn't add support of N-Way joins. 3. Proper cost estimation Currently postgres_fdw always gives 0 as the cost of a foreign join, as a compromise. This is because estimating costs of each join without round-trip (EXPLAIN) is not easy. A rough idea about that I currently have is to use local statistics, but determining join method used at remote might require whole planner to run for the join subtree. Regards, -- Shigeru HANADA
join_pushdown.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers