I wrote:
> Unfortunately, it seems the Solaris implementors didn't read Stevens,
> because it looks to me like they *do* return ECONNREFUSED on accept queue
> overflow.  Still, it's hard to see how that would be the issue if we're
> still seeing this failure with only five clients.

Also, after further inspection of the source code, it appears to me that
the kernel's limit on accept queue length is hard-wired at 4096 in
Solaris.  So there's basically no way that we're hitting that limit in the
regression tests, and the MAX_CONNECTIONS configuration is irrelevant.

We seem to be left with the race condition theory.  In that connection,
this comment in /usr/src/uts/common/io/tl.c is interesting:

 *      The T_CONN_CON is generated when processing the T_CONN_REQ i.e. before
 *      a T_CONN_RES is received from the acceptor. This means that a socket
 *      connect will complete before the peer has called accept.

I'm not sure that explains anything of value, but it's probably unlike any
other implementation, which makes it perhaps relevant.  It implies that
this is totally unrelated to any server-side behavior; so if it's possible
for us to work around it at all, we'd have to do so client-side.
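
For concreteness, here is a rough sketch of what a client-side workaround could look like: retry the connect a few times when it fails with ECONNREFUSED.  This is only an illustration under assumptions, not libpq code.  The helper name connect_with_retry, the attempt count, and the backoff delays are all made up for the example, and whether retrying is even the right response to this failure is itself an open question.

```python
import socket
import time


def connect_with_retry(family, address, attempts=5, delay=0.05):
    """Connect a stream socket, retrying only on ECONNREFUSED.

    Hypothetical helper for illustration: `attempts` and `delay` are
    arbitrary values, and the delay doubles after each refusal.
    """
    for attempt in range(1, attempts + 1):
        sock = socket.socket(family, socket.SOCK_STREAM)
        try:
            sock.connect(address)
            return sock  # caller owns the connected socket
        except ConnectionRefusedError:
            sock.close()
            if attempt == attempts:
                raise  # still refused on the last try: give up
            time.sleep(delay)
            delay *= 2
        except OSError:
            sock.close()
            raise  # any other error is not treated as transient
```

A caller would use it as connect_with_retry(socket.AF_UNIX, path) for a Unix-domain socket, or with socket.AF_INET and a (host, port) pair.  Note that a retry loop like this also delays the report of a server that is genuinely down, which is one reason to keep the attempt count small.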

                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
