Re: [HACKERS] Feature freeze date for 8.1

Heikki Linnakangas Mon, 02 May 2005 08:49:50 -0700

On Mon, 2 May 2005, Hannu Krosing wrote:

Well, I've had problems with clients which resolve DB timeouts by
closing the current connection and establishing a new one.

If it is an actual DB timeout, then all is OK: the server soon notices
that the client connection is closed and kills itself.

Problems happen when the timeout is caused by actual network problems -
when I have 300 clients (server's max_connections=500) which try to
reconnect after a network outage, only 200 of them can do so, as the
server is still holding on to 300 old connections.

In my case this has nothing to do with locks or transactions.

It would be nice if I could set up some timeout using keepalives (like
ssh's "ProtocolKeepAlives") and use similar timeouts on client and server.


FWIW, I've been bitten by this problem twice with other applications.

1. We had a DB2 database with clients running on other computers in the network. A faulty switch caused random network outages. If the connection timed out and the client was unable to send its request to the server, the client would notice that the connection was down and open a new one. But the server never noticed that the old connection was dead. Eventually, the maximum number of connections was reached, and the administrator had to kill all the connections manually.

2. We had a custom client-server application using TCP across a network. There was a stateful firewall between the server and the clients that dropped idle connections at night when there was no activity. After a couple of days, the server reached the maximum number of threads on the platform and stopped accepting new connections.

In case 1, the switch was fixed. If another switch fails, the same will happen again. In case 2, we added an application-level heartbeat that sends a dummy message from server to client every 10 minutes.

TCP keep-alive with a small interval would have saved the day in both cases. Unfortunately, according to RFC 1122, the default interval must be at least 2 hours.

On most platforms, including Windows and Linux, the TCP keep-alive interval can't be set on a per-connection basis. The ideal solution would be to modify the operating system to support it.
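For reference, switching keep-alive probing on for a single socket is a standard socket option. The sketch below shows only that on/off switch, which is why it does not help with the interval problem: the probes still run on whatever interval the system is configured with. The helper name enable_keepalive is made up for illustration.

```python
import socket


def enable_keepalive(sock: socket.socket) -> bool:
    """Turn on TCP keep-alive probing for one socket.

    Only the on/off switch is set here; the probe interval stays at
    whatever the operating system is configured to use.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Read the option back so the caller can confirm it took effect.
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
```

With the default interval of two hours or more, a server would still hold a dead connection for hours before the first probe goes out.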

What we can do in PostgreSQL is to introduce an application-level heartbeat. A simple "Hello world" message sent from server to client that the client would ignore would do the trick.
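A rough sketch of what such a heartbeat could look like, in Python rather than in the backend's C. Everything here is an assumption made for illustration: the HEARTBEAT marker, the function names, and the idea of sweeping over a list of connections are not part of any existing PostgreSQL protocol or code. The server periodically writes a dummy message to each connection and drops the ones where the write fails; the client simply filters the dummy message out.

```python
import socket

# Hypothetical marker bytes; not part of any real wire protocol.
HEARTBEAT = b"\x00HB\n"


def send_heartbeat(conn: socket.socket) -> bool:
    """Send one dummy message; return False if the write fails."""
    try:
        conn.sendall(HEARTBEAT)
        return True
    except OSError:
        # Broken pipe, connection reset, timed out, and similar errors.
        return False


def reap_dead(connections: list) -> list:
    """Heartbeat every connection; close and drop those that fail."""
    alive = []
    for conn in connections:
        if send_heartbeat(conn):
            alive.append(conn)
        else:
            conn.close()  # frees the slot held by the dead client
    return alive


def strip_heartbeats(data: bytes) -> bytes:
    """Client side: ignore heartbeat messages in the received bytes."""
    return data.replace(HEARTBEAT, b"")
```

The server would call reap_dead() from a timer, say every 10 minutes as in case 2. Note that when the network path is simply gone, the write does not fail right away; the error only surfaces once TCP gives up retransmitting the unacknowledged heartbeat, so a dead connection is detected some time after the first heartbeat rather than immediately.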

- Heikki
