Hi all, In 2011 I proposed join push-down support for foreign tables, which would improve performance of queries which contain join between foreign tables in one server, but it has not finished before time-up. This performance improvement would widen application range of foreign tables, so I'd like to tackle the work again.
The descriptions below are based on previous discussions and additional studies. Background ========== At the moment FDWs can't handle join, so every join are processed on local side even if the source relations are on the same server. It's apparently inefficient to fetch possible rows from remote and join them on local and waste some of them since join condition doesn't match. If FDW (typically SQL-based FDWs like postgres_fdw) can get control of JOIN operation, it would optimize queries for source tables into a join query and avoid transfer of un-match rows. With this improvement, most of joins in usual use, especially joins between large foreign tables which don't match much, would become remarkablly fast, for the reasons below. a) less data transfer Especially for inner joins, result of join is usually much smaller than source tables. If the original target list doesn't contain join keys, FDW might be able to omit from the SELECT list of remote queries because they are only necessary on remote side. b) more optimization on remote side Join query would provide remote data source more optimization chances, such as using index. Changes expected ================ In the past development trial, these changes seem necessary at least. (1) Add server oid field to RelOptInfo This attribute is set only when the RelOptInfo is a joinrel, and all underlying base relations are foreign tables and they have same server oid. This field is set through join consideration from lower join level to high (many tables) level, IOW from the bottom to the top. If all base relations joined in a query are on same server, top RelOptInfo which represents final output has valid server oid. In such case, whole query could be pushed down to the server and user can get most efficient result. New helper function GetFdwRoutineByServerId(Oid serverid) which returns FdwRoutine of given server oid would be handy. (2) Add new path node for foreign join New path node ForeignJoinPath, which inherits JoinPath like other join path nodes, represents a join between ForeignPath or ForeignJoinPath. ForeignJoinPath has fdw_private list to hold FDW-specific information through the path consideration phase. This is similar to fdw_private of ForeignPath path node. This node cares only type of join such as INNER JOIN and LEFT OUTER JOIN, but doesn't care how to do it. IOW foreign join is not special case of existing join nodes such as nested loops, merge join and hash join. FDW can implement a foreign join in arbitrary way, for instance, file_fdw can have already-joined file for particular combination for optimization, and postgres_fdw can generate a SELECT query which contains JOIN clause and avoid essentially unnecessary data transfer. At the moment I'm not sure whether we should support SEMI/ANTI join in the context of foreign join. It would require postgres_fdw (or other SQL-based FDWs) to generate query with subquery connected with IN/NOT IN clause, but it seems too far to head to in the first version. We (and especially FDW authors) need to note that join push-down is not the best way in some situations. In most cases OUTER JOIN populates data on remote side more than current FDWs transfer, especially for FULL OUTER JOIN and CROSS JOIN (cartesian product). (3) Add new plan node for foreign join New plan node ForeignJoin, which inherits Join like other join plan nodes. This node is similar to other join plan nodes such as NestLoop, MergeJoin and HashJoin, but it delegates actual processing to FDW associated to the server. This means that new plan state node for ForeignJoin, say ForeignJoinState, is also needed. (4) Add new FDW API functions Adding Join push-down support requires some functions to be added to FdwRoutine to give control to FDWs. a) GetForeignJoinPaths() This allows FDWs to provide alternative join paths for a join RelOptInfo. This is called from add_paths_to_joinrel() after considering other join possibilities, and FDW should call add_path() for each possible foreign join path. Foreign join paths are built similarly to existing join paths, in a bottom-up manner. FDWs may push ordered or unordered paths here, but combination of sort keys would bloat up easily if FDW has no information about efficient patterns such as remote indexes. FDW should not add too many paths to prevent exponential overhead of join combination. b) GetForeignJoinPlan() This creates ForeignJoin plan node from ForeignJoinPath and other planner infromation. c) Executor functions for ForeignJoin plan node A set of funcitons for executing ForeignJoin plan node is also needed. Begin/ReScan/Iterate/End are basic operations of a plan node, so we need to provide them for ForeignJoin node. Issues ====== (1) Separate cost estimation phases? For existing join paths, planner estimates their costs in two phaeses. In the first phase initial_cost_foo(), here foo is one of nestloop/mergejoin/hashjoin, produces lower-bound estimates for elimination. The second phase is done for only promising paths which passed add_path_precheck(), by final_cost_foo() for cost and result size. I'm not sure that we need to follow this manner, since FDWs would be able to estimate final cost/size with their own methods. (2) How to reflect cost of transfer Cost of transfer is dominant in foreign table operations, including foreign scans. It would be nice to have some mechanism to reflect actual time of transfer to the cost estimation. An idea is to have a FDW option which represents cost factor of transfer, say transfer_cost. (3) SELECT-with-Join SQL generation in postgres_fdw Probably Postgres-XC's shipping code would help us for implementing deparse JOIN SQL, but I've not studied it fully, I'll continue the study. (4) criteria for push-down It is assumed that FDWs can push joins down to remote when all foreign tables are in same server. IMO a SERVER objects represents a logical data source. For instance database for postgres_fdw and other connection-based FDWs, and disk volumes (or directory?) for file_fdw. Is this reasonable assumption? Perhaps more issues would come out later, but I'd like to get comments about the design. (5) Terminology I used "foreign join" as a process which joins foreign tables on *remote* side, but is this enough intuitive? Another idea is using "remote join", is this more appropriate for this kind of process? I hesitate to use "remote join" because it implies client-server FDWs, but foreign join is not limited to such FDWs, e.g. file_fdw can have extra file which is already joined files accessed via foreign tables. -- Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers