On 04/16/2014 06:12 AM, Hannu Krosing wrote: > On 04/16/2014 01:25 AM, Tom Lane wrote: >> Hannu Krosing <ha...@2ndquadrant.com> writes: >>> Is there a way to force it to prefer a plan where the results of (select >>> id from onemillion where data > '0.9' limit 100) >>> are passed to FDW as a single IN ( = ANY(...)) query and are retrieved >>> all at once ? >> You could write the query like that: >> >> select * from onemillion_pgfdw where id = any (array(select id from >> onemillion where data > '0.9' limit 100)); > My actual use-case was about a join between a local and a remote table > and without rewriting the query (they come from ORM) > > I was hoping to be able to nudge postgresql towards a better plan via some > tuning of table/fdw options or GUCs. > > for example, would postgresql use the WHERE id IN (...) query on remote > side for a query like > > select r.data, l.data > from onemillion_pgfdw r > join onemillion l > on r.id = l.id and l.data > '0.999'; > > if it recognizes that the local side returns only 1000 rows ? > > or would it still use 1000 individual WHERE id = $1 queries. > > Is getting the foreign data via IN and then turning the data into a hash > for joining one of the plans it considers at all ? It sees that could we need an extra tuning parameter for choosing the
ID IN (...) + HASH plan over individual SELECT .. WHERE ID = $1 something between `fdw_startup_cost` and `fdw_tuple_cost` to signify that an IN query returning 1000 rows runs faster than 1000 = queries as I understan currently they both would be estimated as fdw_startup_cost + 1000 * fdw_tuple_cost the new parameter could be fdw_call_cost or fdw_query_cost and would estimate how much each individual call to fdw costs, thus favouring calls which return more data in one call Cheers Hannu > > Best > Hannu > >> Or at least you should be able to, except when I try it I get >> >> explain analyze >> select * from onemillion_pgfdw where id = any (array(select id from >> onemillion where data > '0.9' limit 100)); >> ERROR: operator does not exist: integer = integer[] >> HINT: No operator matches the given name and argument type(s). You might >> need to add explicit type casts. >> CONTEXT: Remote SQL command: EXPLAIN SELECT id, inserted, data FROM >> public.onemillion WHERE ((id = ANY ((SELECT null::integer[])))) >> >> so there's something the remote-estimate code is getting wrong here. >> (It seems to work without remote_estimate, though.) >> >> regards, tom lane > -- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers