Hanada-san,

The proposed patch put an invocation of PlanForeignJoin on the
create_foreignjoin_path() being also called by match_unsorted_outer().
Is it a suitable position to make a decision whether a join can be
pushed-down?

I think; it needs an additional functionality to provide higher priority
on the foreign-join plan that other plans, when fdw determind a particular
join can be pushed-down.
(Sorry, I have no idea right now.)

Let's see the following result.

postgres=# EXPLAIN SELECT * FROM ft1 , ft2, lt3 WHERE ft1.a = ft2.x
AND ft1.a = lt3.s;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=205.08..758.83 rows=30750 width=108)
   Merge Cond: (ft1.a = lt3.s)
   ->  Merge Join  (cost=119.66..199.66 rows=5000 width=72)
         Merge Cond: (ft1.a = ft2.x)
         ->  Sort  (cost=59.83..62.33 rows=1000 width=36)
               Sort Key: ft1.a
               ->  Foreign Scan on ft1  (cost=10.00..10.00 rows=1000 width=36)
                     Remote SQL: DECLARE pgsql_fdw_cursor_0 SCROLL
CURSOR FOR SELECT ft1.a, ft1.b FROM public.ft1 ft1
         ->  Sort  (cost=59.83..62.33 rows=1000 width=36)
               Sort Key: ft2.x
               ->  Foreign Scan on ft2  (cost=10.00..10.00 rows=1000 width=36)
                     Remote SQL: DECLARE pgsql_fdw_cursor_1 SCROLL
CURSOR FOR SELECT ft2.x, ft2.y FROM public.ft2 ft2
   ->  Sort  (cost=85.43..88.50 rows=1230 width=36)
         Sort Key: lt3.s
         ->  Seq Scan on lt3  (cost=0.00..22.30 rows=1230 width=36)
(15 rows)

Then, I turned off the enable_mergejoin.

postgres=# EXPLAIN SELECT * FROM ft1 , ft2, lt3 WHERE ft1.a = ft2.x
AND ft1.a = lt3.s;
                                                   QUERY PLAN
-------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=37.67..1126.42 rows=30750 width=108)
   Hash Cond: (ft1.a = lt3.s)
   ->  Foreign Scan on multiple foreign tables  (cost=0.00..0.00
rows=5000 width=72)
         Remote SQL: SELECT ft1.a, ft1.b, ft2.x, ft2.y FROM public.ft1
ft1, public.ft2 ft2 WHERE (ft1.a = ft2.x)
   ->  Hash  (cost=22.30..22.30 rows=1230 width=36)
         ->  Seq Scan on lt3  (cost=0.00..22.30 rows=1230 width=36)
(6 rows)

Probably, the basic design is correct. However, the planner gives
higher priority on the join plan between
local and foreign than pushing-down foreign relations.

Does it make sense not to consider any other possible plans when FDW
decided a particular join can be
pushed down?

Thanks,

2011年10月7日18:06 Kohei KaiGai <kai...@kaigai.gr.jp>:
> 2011年10月4日12:08 Shigeru Hanada <shigeru.han...@gmail.com>:
>>> In my opinion, FdwRoutine should have an additional API to inform the core 
>>> its
>>> supported features; such as inner-join, outer-join, order-by,
>>> group-by, aggregate
>>> functions, insert, update, delete, etc... in the future version.
>>
>> Sure, so in my design PlanForeignJoin is optional.
>>
>> The lack of capability is informed from FDW with setting function
>> pointer in FdwRoutine to NULL.  If PlanForeignJoin was NULL, core
>> (planner) will give up to consider join push-down, and use one of local
>> join methods such as NestLoop and MergeJoin for those foreign tables.
>> As you say, other push-down-able features would also have optional
>> handler function for each.
>>
> Sorry, I overlooked it was already implemented at create_foreignjoin_path().
>
> I additionally tried several cases using pgsql_fdw.
> In some cases, it seems to me the planner don't push down the join tree
> as you probably expected.
> Please see the following example:
>
> I defined three foreign tables: ft1(a int, b text), ft2(x int, y
> text), ft3(s int, t text),
> and lt1, lt2, lt3 are regular local tables.
>
> postgres=# explain SELECT * FROM (ft1 join ft2 on a = x join ft3 on a = s);
>
>  QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------
>  Foreign Scan on multiple foreign tables  (cost=0.00..0.00 rows=25000 
> width=108)
>   Remote SQL: SELECT ft1.a, ft1.b, ft2.x, ft2.y, ft3.s, ft3.t FROM
> public.ft1 ft1, public.ft2 ft2, public.ft3 ft3 WHERE (ft1.a = ft3.s)
> AND (ft1.a = ft2.x)
> (2 rows)
>
> It works good.
> (P.S. I noticed that pgsql_fdw has incorrect Assert(). Please fix
> pgsql_fdw.c:730)
>
> However, an existence of local relation makes planner confused.
> It seems to me you expect "ft1 join ft2 on a = x"
>
> postgres=# explain SELECT * FROM (ft1 join ft2 on a = x join lt3 on a = s);
>                                                      QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------
>  Merge Join  (cost=205.08..758.83 rows=30750 width=108)
>   Merge Cond: (ft1.a = lt3.s)
>   ->  Merge Join  (cost=119.66..199.66 rows=5000 width=72)
>         Merge Cond: (ft1.a = ft2.x)
>         ->  Sort  (cost=59.83..62.33 rows=1000 width=36)
>               Sort Key: ft1.a
>               ->  Foreign Scan on ft1  (cost=10.00..10.00 rows=1000 width=36)
>                     Remote SQL: DECLARE pgsql_fdw_cursor_2 SCROLL
> CURSOR FOR SELECT ft1.a, ft1.b FROM public.ft1 ft1
>         ->  Sort  (cost=59.83..62.33 rows=1000 width=36)
>               Sort Key: ft2.x
>               ->  Foreign Scan on ft2  (cost=10.00..10.00 rows=1000 width=36)
>                     Remote SQL: DECLARE pgsql_fdw_cursor_3 SCROLL
> CURSOR FOR SELECT ft2.x, ft2.y FROM public.ft2 ft2
>   ->  Sort  (cost=85.43..88.50 rows=1230 width=36)
>         Sort Key: lt3.s
>         ->  Seq Scan on lt3  (cost=0.00..22.30 rows=1230 width=36)
> (15 rows)
>
> What is the reason why the foreign join is not pushed down?
> Maybe, injected Sort plan prevent the planner to consider both side of
> relations being foreign scan owned by same server? I'm still
> investigating the reason.
>
> I hope comments from committers. :-(
>
> A collateral evidence is below.
> If we try to sort the result by a key being not used to join, the both
> of foreign scan gets pushed down.
>
> postgres=# explain SELECT * FROM (ft1 join ft2 on a = x) order by y;
>                                                   QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------
>  Sort  (cost=307.19..319.69 rows=5000 width=72)
>   Sort Key: ft2.y
>   ->  Foreign Scan on multiple foreign tables  (cost=0.00..0.00
> rows=5000 width=72)
>         Remote SQL: SELECT ft1.a, ft1.b, ft2.x, ft2.y FROM public.ft1
> ft1, public.ft2 ft2 WHERE (ft1.a = ft2.x)
> (4 rows)
>
> However, when I tried to sort by a key being used to join, the both of
> foreign scan was not pushed down.
>
> postgres=# explain SELECT * FROM (ft1 join ft2 on a = x) order by a;
>                                                   QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------
>  Merge Join  (cost=119.66..199.66 rows=5000 width=72)
>   Merge Cond: (ft1.a = ft2.x)
>   ->  Sort  (cost=59.83..62.33 rows=1000 width=36)
>         Sort Key: ft1.a
>         ->  Foreign Scan on ft1  (cost=10.00..10.00 rows=1000 width=36)
>               Remote SQL: DECLARE pgsql_fdw_cursor_6 SCROLL CURSOR
> FOR SELECT ft1.a, ft1.b FROM public.ft1 ft1
>   ->  Sort  (cost=59.83..62.33 rows=1000 width=36)
>         Sort Key: ft2.x
>         ->  Foreign Scan on ft2  (cost=10.00..10.00 rows=1000 width=36)
>               Remote SQL: DECLARE pgsql_fdw_cursor_7 SCROLL CURSOR
> FOR SELECT ft2.x, ft2.y FROM public.ft2 ft2
> (10 rows)
>
> Thanks,
> --
> KaiGai Kohei <kai...@kaigai.gr.jp>
>



-- 
KaiGai Kohei <kai...@kaigai.gr.jp>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to