Symptom:
In my code for a particular server, select() returns with a file
descriptor set in the read set. When I read() from it, read() returns
a negative number and errno is set to EPIPE. I then try to close the
file descriptor using shutdown(), which also fails, this time with
ENOTCONN.

Here is the code for the read(). It is mostly copied from
_Unix Network Programming_ by Stevens. MAXLINE is 1024.
available_space is some number up to MAXLINE.

    available_space = request_buffer_space (input_buffer[fdcheck], MAXLINE);
    if (available_space > 0) {
    again:
        if ( (int) (numread = read (fdcheck, read_buf, available_space)) < 0) {
            if (errno == EINTR) {   /* got an interrupt */
                goto again;
            } else {
                if (debug.errnos)
                    fprintf (stderr, "read %d got errno %d\n", fdcheck, errno);
            }
        } else if (numread == 0) {
            return fdcheck;         /* got EOF */
        } else
            numcopied = copy_this_much_to_buffer (input_buffer[fdcheck], read_buf,
                                                  numread);
    }
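
For anyone who wants to experiment with this read path in isolation,
here is a minimal, self-contained sketch of it. The names read_retry
and drop_connection are hypothetical and are not part of the server
code above or of Stevens' book. The sketch assumes that a descriptor
which returns EOF or a hard error is simply being given up on, and
that a failing shutdown() (for example with ENOTCONN) can be ignored
because close() is the call that actually releases the descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read() that retries when interrupted by a signal.
 * Returns the number of bytes read (> 0), 0 on EOF, or -1 with errno set
 * to something other than EINTR. */
ssize_t read_retry(int fd, void *buf, size_t len)
{
    ssize_t n;

    assert(buf != NULL);
    do {
        n = read(fd, buf, len);
    } while (n < 0 && errno == EINTR);
    return n;
}

/* Hypothetical helper: give up on a descriptor after EOF or a hard error.
 * The result of shutdown() is deliberately ignored, since it can fail
 * (e.g. ENOTCONN) when the connection is already gone. close() is what
 * frees the descriptor; its result is returned to the caller. */
int drop_connection(int fd)
{
    (void) shutdown(fd, SHUT_RDWR);
    return close(fd);
}
```

A caller would treat a return of 0 or -1 from read_retry as the cue
to call drop_connection, and would presumably also need to remove the
descriptor from the set passed to select() so that it is not polled
again.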
The above code is successful for several thousand calls, then begins
crapping out with EPIPE many times for many different clients. The
clients start giving up. When I look at netstat after this happens, I
see the following:
Proto Recv-Q Send-Q Local Address           Foreign Address         State Timer
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2067 CLOSE on (1.80/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2068 CLOSE on (1.74/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2069 CLOSE on (0.69/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2070 CLOSE on (2.39/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2071 CLOSE on (2.16/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2072 CLOSE on (2.30/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2073 CLOSE on (0.78/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2074 CLOSE on (1.17/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2075 CLOSE on (1.74/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2076 CLOSE on (0.63/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2077 CLOSE on (1.16/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2078 CLOSE on (2.25/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2079 CLOSE on (0.20/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2080 CLOSE on (0.72/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2081 CLOSE on (1.90/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2082 CLOSE on (1.96/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2083 CLOSE on (2.02/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2084 CLOSE on (2.21/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2085 CLOSE on (2.19/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2086 CLOSE on (2.27/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2088 CLOSE on (0.33/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2090 CLOSE on (0.46/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2091 CLOSE on (0.59/0)
tcp        1      5 honyara.rad.direct:3119 fred.rad.directint:2092 CLOSE on (0.36/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2093 CLOSE on (0.85/0)
tcp        1      5 honyara.rad.direct:3119 fred.rad.directint:2094 CLOSE on (0.36/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2095 CLOSE on (2.18/0)
tcp        1      5 honyara.rad.direct:3119 fred.rad.directint:2096 CLOSE on (0.36/0)
tcp        1      5 honyara.rad.direct:3119 fred.rad.directint:2097 CLOSE on (0.34/0)
tcp        1      5 honyara.rad.direct:3119 fred.rad.directint:2098 CLOSE on (0.36/0)
tcp        1     10 honyara.rad.direct:3119 fred.rad.directint:2099 CLOSE on (0.36/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2100 CLOSE on (0.25/0)
tcp      304      5 honyara.rad.direct:3119 fred.rad.directint:2101 CLOSE on (0.36/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2102 CLOSE on (1.04/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2103 CLOSE on (1.51/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2104 CLOSE on (2.35/0)
tcp        0      5 honyara.rad.direct:3119 fred.rad.direct:eklogin CLOSE on (1.05/0)
tcp        0      5 honyara.rad.direct:3119 fred.rad.directint:2106 CLOSE on (1.05/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2107 CLOSE on (0.18/0)
tcp        0      5 honyara.rad.direct:3119 fred.rad.directint:2108 CLOSE on (0.19/0)

All these CLOSE state sockets can't possibly be good.

Question 1: What happened before the read() that caused it to fail
with EPIPE? I have read through _Unix Network Programming_ and cannot
find any reference to this situation.

Question 2: What can I do to avoid this situation?

Question 3: Above, some of the CLOSE state sockets have a Recv-Q
greater than zero. Why would they have any problem reading? Shouldn't
I expect to see the Recv-Q go down to zero before read() returns any
errors?

Question 4: What do the timer numbers in the netstat output mean? They
don't appear to be counting down.

Thanks for your help,
Dave