Symptom:

In my code for a particular server, select() returns with a file
descriptor set in the read set.  I then call read() on that descriptor,
but read() returns a negative number, and when I check errno it is
equal to EPIPE.  I try to close the file descriptor using shutdown(),
which also returns an error, this time ENOTCONN.
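
For reference, shutdown() by itself does not release a descriptor; only
close() does that.  Below is a minimal sketch of a cleanup routine that
calls both.  The name drop_connection() is hypothetical, and the choice
to ignore ENOTCONN from shutdown() is an assumption about what is safe
here, not something taken from the server code.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * drop_connection() is a hypothetical name.  shutdown() only ends the
 * conversation; close() is the call that gives the descriptor back.
 * ENOTCONN from shutdown() is ignored because the peer may already be gone.
 * Returns 0 on success, or -1 with errno set to the first other failure.
 */
int drop_connection(int fd)
{
    int rc = 0;
    int saved = 0;

    if (shutdown(fd, SHUT_RDWR) < 0 && errno != ENOTCONN) {
        rc = -1;
        saved = errno;                 /* remember the shutdown() failure */
    }
    if (close(fd) < 0 && rc == 0) {    /* always attempt the close() */
        rc = -1;
        saved = errno;
    }
    if (rc < 0)
        errno = saved;
    return rc;
}
```

With something like this, an ENOTCONN from shutdown() would no longer
prevent the descriptor from being closed.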

Here is the code for the read().  It is mostly copied from
_Unix_Network_Programming_ by Stevens.  MAXLINE is 1024.
available_space is some number up to MAXLINE.

    available_space = request_buffer_space (input_buffer[fdcheck], MAXLINE);

    if (available_space > 0) {
    again:
      if ( (int) (numread = read (fdcheck, read_buf, available_space)) < 0) {
        if (errno == EINTR) {                             /* got an interrupt */
          goto again;
        } else {
          if (debug.errnos)
            fprintf (stderr, "read %d got errno %d\n", fdcheck, errno);
        }
      } else if (numread == 0) {
        return fdcheck;                                   /* got EOF */
      } else
        numcopied = copy_this_much_to_buffer (input_buffer[fdcheck], read_buf,
                                              numread);
    }
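
The same retry-on-EINTR logic can also be written without the goto and
the cast.  This is only a sketch: read_retry() is a hypothetical helper
that is not in the server, and it assumes the caller is happy to get
ssize_t back and to have the error reported by name via strerror().

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * read_retry() is a hypothetical helper, not part of the server above.
 * It retries read() when a signal interrupts it and reports any other
 * error by name.  Returns bytes read, 0 on EOF, or -1 with errno set.
 */
ssize_t read_retry(int fd, void *buf, size_t len)
{
    ssize_t n;

    do {
        n = read(fd, buf, len);
    } while (n < 0 && errno == EINTR);   /* interrupted: try again */

    if (n < 0) {
        int saved = errno;               /* fprintf may change errno */
        fprintf(stderr, "read %d failed: %s (errno %d)\n",
                fd, strerror(saved), saved);
        errno = saved;
    }
    return n;
}
```

Keeping the result in a signed ssize_t avoids the (int) cast, and the
caller can still tell EOF (0) apart from an error (-1).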

The above code is successful for several thousand calls, then begins
crapping out with EPIPE many times for many different clients.  The
clients start giving up.  When I look at netstat after this happens, I
see the following:

Proto Recv-Q Send-Q Local Address           Foreign Address         State       Timer
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2067 CLOSE       on (1.80/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2068 CLOSE       on (1.74/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2069 CLOSE       on (0.69/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2070 CLOSE       on (2.39/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2071 CLOSE       on (2.16/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2072 CLOSE       on (2.30/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2073 CLOSE       on (0.78/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2074 CLOSE       on (1.17/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2075 CLOSE       on (1.74/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2076 CLOSE       on (0.63/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2077 CLOSE       on (1.16/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2078 CLOSE       on (2.25/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2079 CLOSE       on (0.20/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2080 CLOSE       on (0.72/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2081 CLOSE       on (1.90/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2082 CLOSE       on (1.96/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2083 CLOSE       on (2.02/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2084 CLOSE       on (2.21/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2085 CLOSE       on (2.19/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2086 CLOSE       on (2.27/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2088 CLOSE       on (0.33/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2090 CLOSE       on (0.46/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2091 CLOSE       on (0.59/0)
tcp        1      5 honyara.rad.direct:3119 fred.rad.directint:2092 CLOSE       on (0.36/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2093 CLOSE       on (0.85/0)
tcp        1      5 honyara.rad.direct:3119 fred.rad.directint:2094 CLOSE       on (0.36/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2095 CLOSE       on (2.18/0)
tcp        1      5 honyara.rad.direct:3119 fred.rad.directint:2096 CLOSE       on (0.36/0)
tcp        1      5 honyara.rad.direct:3119 fred.rad.directint:2097 CLOSE       on (0.34/0)
tcp        1      5 honyara.rad.direct:3119 fred.rad.directint:2098 CLOSE       on (0.36/0)
tcp        1     10 honyara.rad.direct:3119 fred.rad.directint:2099 CLOSE       on (0.36/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2100 CLOSE       on (0.25/0)
tcp      304      5 honyara.rad.direct:3119 fred.rad.directint:2101 CLOSE       on (0.36/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2102 CLOSE       on (1.04/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2103 CLOSE       on (1.51/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2104 CLOSE       on (2.35/0)
tcp        0      5 honyara.rad.direct:3119 fred.rad.direct:eklogin CLOSE       on (1.05/0)
tcp        0      5 honyara.rad.direct:3119 fred.rad.directint:2106 CLOSE       on (1.05/0)
tcp        0      0 honyara.rad.direct:3119 fred.rad.directint:2107 CLOSE       on (0.18/0)
tcp        0      5 honyara.rad.direct:3119 fred.rad.directint:2108 CLOSE       on (0.19/0)

All these CLOSE state sockets can't possibly be good.

Question 1: What happened before the read() that caused read() to
return EPIPE?  I have read _Unix Network Programming_ through and
cannot find any reference to this situation.

Question 2: What can I do to avoid this situation?

Question 3: Above, some of the CLOSE state sockets have Recv-Q greater
than zero.  Why would they have any problem reading?  Shouldn't I
expect to see the Recv-Q go down to zero before the read returns any
errors?
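
One way to compare what netstat shows with what the process sees would
be to query the socket just before the read().  The sketch below is a
hypothetical debugging helper (socket_status() is not in the server);
it assumes the FIONREAD ioctl and the SO_ERROR socket option behave on
this kernel as they are documented to.

```c
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/socket.h>

/*
 * Hypothetical debugging helper: ask the kernel how many bytes are queued
 * for reading on fd and whether the socket has a pending error.
 * Note that reading SO_ERROR also clears the pending error.
 * Returns 0 on success, -1 if either query fails.
 */
int socket_status(int fd, int *pending_bytes, int *pending_error)
{
    socklen_t len = sizeof *pending_error;

    if (ioctl(fd, FIONREAD, pending_bytes) < 0)
        return -1;
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, pending_error, &len) < 0)
        return -1;
    return 0;
}
```

Logging both values next to the errno from read() should show whether
the queued bytes are still visible to the process when the error is
returned.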

Question 4: What do the timer numbers mean in the netstat?  They don't
appear to be counting down.

Thanks for your help,
Dave