Re: Asynchronous Append on postgres_fdw nodes.

Andrey Lepikhov Fri, 25 Sep 2020 11:34:41 -0700

Your AsyncAppend doesn't switch to another source if the data in currentleader is available:


/*
 * The request for the next node cannot be sent before the leader
 * responds. Finish the current leader if possible.
 */
if (PQisBusy(leader_state->s.conn))
{

int rc = WaitLatchOrSocket(NULL, WL_SOCKET_READABLE | WL_TIMEOUT |WL_EXIT_ON_PM_DEATH, PQsocket(leader_state->s.conn), 0,WAIT_EVENT_ASYNC_WAIT);

  if (!(rc & WL_SOCKET_READABLE))
    available = false;
}


/* fetch the leader's data and enqueue it for the next request */
if (available)
{
  fetch_received_data(leader);
  add_async_waiter(leader);
}

I don't understand, why it is needed. If we have fdw connections withdifferent latency, then we will read data from the fast connectionfirst. I think this may be a source of skew and decrease efficiency ofasynchronous append.


For example, see below synthetic query:
CREATE TABLE l (a integer) PARTITION BY LIST (a);

CREATE FOREIGN TABLE f1 PARTITION OF l FOR VALUES IN (1) SERVER lbOPTIONS (table_name 'l1');CREATE FOREIGN TABLE f2 PARTITION OF l FOR VALUES IN (2) SERVER lbOPTIONS (table_name 'l2');


INSERT INTO l (a) SELECT 2 FROM generate_series(1,200) as gs;
INSERT INTO l (a) SELECT 1 FROM generate_series(1,1000) as gs;

EXPLAIN ANALYZE (SELECT * FROM f1) UNION ALL (SELECT * FROM f2) LIMIT 400;

Result:

Limit (cost=100.00..122.21 rows=400 width=4) (actual time=0.483..1.183rows=400 loops=1)-> Append (cost=100.00..424.75 rows=5850 width=4) (actualtime=0.482..1.149 rows=400 loops=1)-> Foreign Scan on f1 (cost=100.00..197.75 rows=2925width=4) (actual time=0.481..1.115 rows=400 loops=1)-> Foreign Scan on f2 (cost=100.00..197.75 rows=2925width=4) (never executed)


As you can see, executor scans one input and doesn't tried to scan another.

--
regards,
Andrey Lepikhov
Postgres Professional

Re: Asynchronous Append on postgres_fdw nodes.

Reply via email to