Hi all,

While playing with services on Windows, I noticed inconsistent behavior
in the way failures are handled when using a service for a Postgres
instance.

Let's assume that there is a running service called postgres:
$ psql -At -c 'select version()'
PostgreSQL 9.5devel, compiled by Visual C++ build 1600, 64-bit
$ tasklist.exe -svc -FI "SERVICES eq postgres"

Image Name                     PID Services
========================= ========
pg_ctl.exe                     556 postgres

When pg_ctl is killed directly, the service manager is able to detect a failure:
$ taskkill.exe -PID 556 -F
SUCCESS: The process with PID 556 has been terminated.
$ sc query postgres
SERVICE_NAME: postgres
        TYPE               : 10  WIN32_OWN_PROCESS
        STATE              : 1  STOPPED
        WIN32_EXIT_CODE    : 1067  (0x42b)
        SERVICE_EXIT_CODE  : 0  (0x0)
        CHECKPOINT         : 0x0
        WAIT_HINT          : 0x0
In this case, 1067 means that the process terminated unexpectedly. Note that
at this point the Postgres instance is still running, but we can use the
failure callback to run a script that could do some cleanup before
properly restarting the service.

However, when a backend process is killed directly, something different happens:
$ tasklist.exe -FI "IMAGENAME eq postgres.exe"

Image Name                     PID Session Name        Session#    Mem Usage
========================= ======== ================ =========== ============
postgres.exe                  2088 Services                   0     17,380 K
postgres.exe                  2132 Services                   0      4,400 K
postgres.exe                  2236 Services                   0      5,064 K
postgres.exe                  1524 Services                   0      6,304 K
postgres.exe                  2084 Services                   0      9,200 K
postgres.exe                  2384 Services                   0      5,968 K
postgres.exe                  2652 Services                   0      4,500 K
postgres.exe                  2116 Services                   0      4,384 K
$ taskkill.exe -PID 2084 -F
SUCCESS: The process with PID 2084 has been terminated.
After that, some processes remain:
$ tasklist.exe -FI "IMAGENAME eq postgres.exe"
Image Name                     PID Session Name        Session#    Mem Usage
========================= ======== ================ =========== ============
postgres.exe                  2088 Services                   0      5,708 K
postgres.exe                  2132 Services                   0      4,400 K
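To make the difference between the two listings explicit, here is a minimal sketch in Python (not part of the original session) that extracts the PIDs from the tasklist output above and computes which processes disappeared in addition to the one that was killed. The helper name parse_tasklist_pids is hypothetical, and the parsing assumes the column layout shown above.

```python
def parse_tasklist_pids(output, image="postgres.exe"):
    """Return the set of PIDs listed for `image` in tasklist text output."""
    pids = set()
    for line in output.splitlines():
        fields = line.split()
        # Data rows look like: postgres.exe  2088 Services  0  17,380 K
        if len(fields) >= 2 and fields[0].lower() == image and fields[1].isdigit():
            pids.add(int(fields[1]))
    return pids


# Listings copied from the session above (header lines are skipped by the parser).
BEFORE = """
postgres.exe                  2088 Services                   0     17,380 K
postgres.exe                  2132 Services                   0      4,400 K
postgres.exe                  2236 Services                   0      5,064 K
postgres.exe                  1524 Services                   0      6,304 K
postgres.exe                  2084 Services                   0      9,200 K
postgres.exe                  2384 Services                   0      5,968 K
postgres.exe                  2652 Services                   0      4,500 K
postgres.exe                  2116 Services                   0      4,384 K
"""
AFTER = """
postgres.exe                  2088 Services                   0      5,708 K
postgres.exe                  2132 Services                   0      4,400 K
"""

killed = 2084
before = parse_tasklist_pids(BEFORE)
after = parse_tasklist_pids(AFTER)

# Processes that went away even though only PID 2084 was killed.
collateral = before - after - {killed}

print("still running:", sorted(after))
print("gone besides the killed one:", sorted(collateral))
```

With the listings above, two processes survive and five others are gone besides PID 2084.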

The remaining processes are immediately taken down when a connection to
the server is attempted. Note that before any connection attempt, the
service is considered to be running normally:
$ sc query postgres
SERVICE_NAME: postgres
        TYPE               : 10  WIN32_OWN_PROCESS
        STATE              : 4  RUNNING
                                (STOPPABLE, PAUSABLE, ACCEPTS_SHUTDOWN)
        WIN32_EXIT_CODE    : 0  (0x0)
        SERVICE_EXIT_CODE  : 0  (0x0)
        CHECKPOINT         : 0x0
        WAIT_HINT          : 0x0
$ psql
psql: could not connect to server: Connection refused (0x0000274D/10061)
        Is the server running on host "localhost" (::1) and accepting
        TCP/IP connections on port 5432?
could not connect to server: Connection refused (0x0000274D/10061)
        Is the server running on host "localhost" (127.0.0.1) and accepting
        TCP/IP connections on port 5432?
$ tasklist.exe -FI "IMAGENAME eq postgres.exe"
INFO: No tasks are running which match the specified criteria.

But now the service has stopped, and it is not considered as having failed:
$ sc query postgres
SERVICE_NAME: postgres
        TYPE               : 10  WIN32_OWN_PROCESS
        STATE              : 1  STOPPED
        WIN32_EXIT_CODE    : 0  (0x0)
        SERVICE_EXIT_CODE  : 0  (0x0)
        CHECKPOINT         : 0x0
        WAIT_HINT          : 0x0
This seems like inconsistent behavior in error detection.
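As a rough illustration of the inconsistency, here is a small Python sketch (not something that exists in Postgres or pg_ctl) that parses the text printed by "sc query" and classifies the two stopped states seen above. The function names parse_sc_query and classify are hypothetical, and the classification only reflects what an outside monitoring script could infer from STATE and WIN32_EXIT_CODE.

```python
import re


def parse_sc_query(output):
    """Extract the state name and WIN32_EXIT_CODE from `sc query` text output."""
    info = {}
    for line in output.splitlines():
        m = re.match(r"\s*STATE\s*:\s*\d+\s+(\S+)", line)
        if m:
            info["state"] = m.group(1)
        m = re.match(r"\s*WIN32_EXIT_CODE\s*:\s*(\d+)", line)
        if m:
            info["win32_exit_code"] = int(m.group(1))
    return info


def classify(info):
    """Describe the service status as an external script would see it."""
    if info.get("state") != "STOPPED":
        return "not stopped"
    if info.get("win32_exit_code", 0) != 0:
        return "stopped, failure reported"
    return "stopped, no failure reported"


# Output copied from the two cases above.
PG_CTL_KILLED = """
SERVICE_NAME: postgres
        TYPE               : 10  WIN32_OWN_PROCESS
        STATE              : 1  STOPPED
        WIN32_EXIT_CODE    : 1067  (0x42b)
        SERVICE_EXIT_CODE  : 0  (0x0)
"""
BACKEND_KILLED = """
SERVICE_NAME: postgres
        TYPE               : 10  WIN32_OWN_PROCESS
        STATE              : 1  STOPPED
        WIN32_EXIT_CODE    : 0  (0x0)
        SERVICE_EXIT_CODE  : 0  (0x0)
"""

print(classify(parse_sc_query(PG_CTL_KILLED)))
print(classify(parse_sc_query(BACKEND_KILLED)))
```

Both cases end with the service stopped after a forced kill, but only the first one carries a non-zero exit code that a script could use to tell a failure from a clean stop.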

I am guessing that pg_ctl is not able to properly track failures that
happen on the postgres side. Are there things we could do to improve
failure detection here?

