Dear Kato-san,

Thank you for your interest!

> > I also want you to review the postgres_fdw part,
> > but I think it should not be attached because cfbot cannot understand
> > such a dependency
> > and will throw build error. Do you know how to deal with them in this
> > case?
> I don't know how to deal with them, but I hope you will attach the PoC,
> as it may be easier to review.

OK, I attached the PoC along with the dependent patches. Please see the zip 
The add_helth_check_... patch was written by me, and the other two patches are
just copied from [1].
In the new callback function, ConnectionHash is scanned sequentially and
WaitEventSetWait() is called to check for the WL_SOCKET_CLOSED socket event.
This event is added by the dependent patches.
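
As a rough, standalone illustration of the idea (not the code in the patch),
the sketch below scans a table of cached connections and asks the kernel,
without blocking, whether the peer has closed each socket. It assumes Linux,
where select.POLLRDHUP is available; my understanding is that
WL_SOCKET_CLOSED relies on a similar readiness flag, but that mapping is an
assumption here. The names peer_has_closed and find_closed are hypothetical.

```python
import select
import socket


def peer_has_closed(sock: socket.socket) -> bool:
    """Return True if the peer has closed (or half-closed) the connection.

    The check never blocks: poll() is called with a zero timeout.
    """
    poller = select.poll()
    # POLLRDHUP is Linux-specific; POLLHUP and POLLERR are always reported.
    poller.register(sock.fileno(), select.POLLRDHUP)
    closed_mask = select.POLLRDHUP | select.POLLHUP | select.POLLERR
    for _fd, revents in poller.poll(0):
        if revents & closed_mask:
            return True
    return False


def find_closed(connections: dict) -> list:
    """Scan every cached connection in order and collect the dead ones."""
    return [name for name, sock in connections.items() if peer_has_closed(sock)]


if __name__ == "__main__":
    # A local socket pair stands in for a coordinator/worker connection.
    coordinator_side, worker_side = socket.socketpair()
    cache = {"remote": coordinator_side}
    print(find_closed(cache))   # worker still up
    worker_side.close()         # simulate the worker going away
    print(find_closed(cache))   # the closed peer is now reported
```

Polling with a zero timeout keeps the check cheap enough to run
periodically, which is the role the check interval parameter plays in the
PoC.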

How to use

I'll explain how to use it briefly.

1. boot two postmaster processes. One is the coordinator, and the other is the worker
2. set remote_servers_connection_check_interval to a non-zero value at the 
3. create tables on the worker DB cluster.
4. create a foreign server, user mapping, and foreign table on the coordinator.
5. connect to the coordinator via psql.
6. open a transaction and access the foreign tables.
7. run the "pg_ctl stop" command against the worker DB cluster.
8. execute some commands that do not access a foreign table.
9. Finally, the following output will be shown:

ERROR:  Postgres foreign server XXX might be down.

Examples for some steps

3. at worker

postgres=# \d
        List of relations
 Schema |  Name  | Type  | Owner  
 public | remote | table | hayato
(1 row)

4. at coordinator 

postgres=# select * from pg_foreign_server ;
  oid  | srvname | srvowner | srvfdw | srvtype | srvversion | srvacl |         
 16406 | remote  |       10 |  16402 |         |            |        | 
(1 row)

postgres=# select * from pg_user_mapping ;
  oid  | umuser | umserver |   umoptions   
 16407 |     10 |    16406 | {user=hayato}
(1 row)

postgres=# \d
            List of relations
 Schema |  Name  |     Type      | Owner  
 public | local  | table         | hayato
 public | remote | foreign table | hayato
(2 rows)

6-9. at coordinator

postgres=# begin;
postgres=*# select * from remote ;
(1 row)

postgres=*# select * from local ;
ERROR:  Postgres foreign server remote might be down.

Note that some keepalive settings are needed
if you want to detect cable breakdown events.
In my understanding, the following parameters are needed as server options:

* keepalives_idle
* keepalives_count
* keepalives_interval
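
To show what these three settings control, here is a minimal sketch at the
socket level, assuming Linux. As far as I know, the libpq parameters
correspond to the TCP_KEEPIDLE, TCP_KEEPCNT, and TCP_KEEPINTVL socket
options. The helper names and the values 5, 2, and 3 are made up for
illustration, and the delay formula is only a rough estimate.

```python
import socket


def enable_keepalive(sock: socket.socket, idle: int, interval: int, count: int) -> None:
    """Turn on TCP keepalive probes for one socket (Linux option names)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Seconds of idleness before the first probe (cf. keepalives_idle).
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    # Seconds between unanswered probes (cf. keepalives_interval).
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    # Unanswered probes before the connection is dropped (cf. keepalives_count).
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)


def detection_delay(idle: int, interval: int, count: int) -> int:
    """Rough number of seconds before a silent peer is declared dead."""
    return idle + interval * count


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s, idle=5, interval=2, count=3)
    print(detection_delay(5, 2, 3))
    s.close()
```

Without such settings, a pulled cable produces no FIN or RST, so the socket
never looks closed and a check like the one in the PoC has nothing to
detect.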


Best Regards,
Hayato Kuroda


Attachment: v01_add_checking_infrastracture.patch
Description: v01_add_checking_infrastracture.patch
