Amos Shapira wrote:
On 15/05/07, *guy keren* <[EMAIL PROTECTED]> wrote:

     > I think you are tinkering with semantics and so miss the real
     > issue (do you work as a consultant? :).

    did you write that to rafi or to me? i'm not dealing with semantics - i
    am dealing with a real problem, that stable applications have to deal
    with - when the network breaks, and you never get the close from the
    other side.


I wrote this to you, Guy. Rafi maybe used "disconnect" when he basically meant that the TCP connection went down from the other side, while you seemed to hang on "disconnect" being defined as "cable eaten by an alligator" :).

let's leave this subject. i brought it up because many programmers new to socket programming are surprised by the fact that a network disconnection does not cause the socket to close, and that the connection may stay there for hours.
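A minimal sketch of what the application actually sees in that situation, written in Python (whose socket and select modules wrap the same system calls). The function name classify_peer and the idle timeout are made up for illustration, and a socketpair stands in for a TCP connection whose peer has gone silent. The point is only that a silent peer produces neither data nor a close, so the application has to decide for itself how long to wait.

```python
import select
import socket


def classify_peer(sock, idle_timeout):
    """Return 'idle', 'closed' or 'data' for a connected socket.

    'idle' means select() timed out: no data and no close arrived, which
    is all the application sees when the network silently breaks.
    """
    readable, _, _ = select.select([sock], [], [], idle_timeout)
    if not readable:
        return "idle"
    # MSG_PEEK looks at pending data without consuming it.
    chunk = sock.recv(1, socket.MSG_PEEK)
    return "closed" if chunk == b"" else "data"


if __name__ == "__main__":
    ours, theirs = socket.socketpair()
    print(classify_peer(ours, 0.1))   # peer is silent
    theirs.sendall(b"x")
    print(classify_peer(ours, 0.1))   # data is pending
    ours.recv(1)
    theirs.close()
    print(classify_peer(ours, 0.1))   # peer closed its side
    ours.close()
```

What to do with an 'idle' result (keep waiting, probe the peer, drop the connection) is an application policy that this sketch deliberately leaves open.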

As long as Rafi feels happy about the replies that's not relevant any more, IMHO.

     > Alas - I think that I've just read not long
     > ago that there is a bug in Linux' select in implementing just that
     > and it might miss the close from the other side sometimes

    what you are describing here sounds astonishing - that such a basic
    feature of the sockets implementation is broken? i find this hard to
    believe, without clear evidence.


Here is something about what I read before. It's the other way around, and possibly only relevant to UDP, but I'm not sure: if a packet arrives with a bad CRC, it's possible that the FD will be marked as "ready to read" by select, but then the packet will be discarded (because of the CRC error), and when the process reads the socket it won't get anything. That would make the process get a "0 read right after select" which does NOT indicate a close from the other side.

http://www.uwsg.indiana.edu/hypermail/linux/kernel/0410.2/0001.html

I don't know what would be a select(2)-based work-around, if required at all.

first, it does not return a '0 read'. this situation could have two different effects, depending on the blocking-mode of the socket.

if the socket is in blocking mode (the default mode) - select() might state there's data to be read, but recvmsg (or read) will block.

if the socket is in non-blocking mode - select() might state there's data to be read, but recvmsg (or read) will return with -1, and errno set to EAGAIN.

in neither case will read return 0. the only time that read is allowed to return 0, is when it encounters an EOF. for a socket, this happens ONLY if the other side closed the sending-side of the connection.
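The difference between "nothing to read" and EOF can be sketched in Python, where a non-blocking recv() that would fail with EAGAIN in C raises BlockingIOError, and EOF shows up as an empty bytes object. The helper name try_read is hypothetical, and a socketpair is used instead of a real network connection, so this illustrates the return values only, not the discarded-packet case itself.

```python
import socket


def try_read(sock, size=4096):
    """Non-blocking read: return None when nothing is available (EAGAIN),
    b'' on EOF, or the bytes that were read."""
    try:
        return sock.recv(size)
    except BlockingIOError:
        # The underlying call returned -1 with errno EAGAIN/EWOULDBLOCK.
        return None


if __name__ == "__main__":
    ours, theirs = socket.socketpair()
    ours.setblocking(False)
    print(try_read(ours))            # nothing to read yet
    theirs.shutdown(socket.SHUT_WR)  # peer closes its sending side
    print(try_read(ours))            # EOF
    ours.close()
    theirs.close()
```

A select loop written this way can treat None as "go back to select" and only b'' as "the other side closed".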

of course, whenever i did select-based socket programming, i always set the sockets to non-blocking mode. this requires some careful programming, to avoid busy-waits, but it's the only way to guarantee fully non-blocking behaviour. people should also note that the socket should be set to non-blocking mode before calling connect, and be ready to handle the peculiar way that the connect call works for non-blocking sockets.
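For readers who have not met the non-blocking connect sequence, here is one possible sketch in Python for Linux/POSIX systems. The function name nonblocking_connect and its timeout parameter are invented for this example. The usual pattern is: connect reports EINPROGRESS, the socket becomes writable once the handshake finishes, and SO_ERROR then tells whether it succeeded.

```python
import errno
import select
import socket


def nonblocking_connect(address, timeout):
    """Connect without blocking; return the connected socket or raise OSError."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)            # set before calling connect
    err = sock.connect_ex(address)
    if err not in (0, errno.EINPROGRESS):
        sock.close()
        raise OSError(err, "connect failed immediately")
    if err == errno.EINPROGRESS:
        # The handshake continues in the background; the socket becomes
        # writable when it finishes, whether it succeeded or failed.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            sock.close()
            raise TimeoutError("connect timed out")
        # SO_ERROR holds the real outcome of the connect attempt.
        err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        if err != 0:
            sock.close()
            raise OSError(err, "connect failed")
    return sock


if __name__ == "__main__":
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = nonblocking_connect(listener.getsockname(), 2.0)
    print(client.getpeername())
    client.close()
    listener.close()
```

In a real server the wait for writability would be folded into the main select loop rather than done in a helper like this.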

doing socket programming without referencing stevens' latest TCP/IP book is foolish.


     > (sorry, can't find
     > a reference with a quick google, closest I got to might be:
     > http://forum.java.sun.com/thread.jspa?threadID=767657&messageID=4386218).
     > I don't remember what was the work-around to that.

    you're describing an issue with JVM - not with linux. i never
    encountered such a problem when doing socket programming in C or C++.

    if you can find something clearer about this, that will be very
    interesting.


Yes, it was a JVM bug, but it mentioned differences on Linux vs. other POSIX systems, so I thought it might be related.

probably not in this case. because the problem you originally described most likely does not exist. the other way around does exist, if one uses blocking sockets. but then again, no one uses blocking sockets in server software, unless they have a pair of reader+writer threads per socket - and even that may cause problems when shutting down the application.

    it helps avoid copying too much data to/from kernel space on a sparse
    sockets list, and it helps avoid having to scan large sets in the
    kernel, to initialize its own internal data structures.


Actually, epoll looks really cool, and Boost's ASIO seems to provide a portable C++ interface around it: http://asio.sourceforge.net/ On the other hand - if you are listening on many FD's which turn out to be ready, then epoll apparently loses because it requires a syscall (or kernel intervention) on every single FD, making select(2) (/poll(2)?) more attractive.

besides, epoll is non-portable, and thus it doesn't get used too much (that, and the fact that people are not familiar with its existence).
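For those who have not seen the interface, a small Linux-only sketch using Python's select.epoll wrapper is below. The helper name ready_to_read is made up for this example. It shows the two halves discussed above: one registration call per descriptor, then a single wait call that hands back only the descriptors that are ready.

```python
import select
import socket


def ready_to_read(socks, timeout):
    """Return the sockets from `socks` that are readable, using epoll."""
    by_fd = {s.fileno(): s for s in socks}
    ep = select.epoll()
    try:
        for fd in by_fd:
            ep.register(fd, select.EPOLLIN)   # one epoll_ctl() per descriptor
        # A single epoll_wait() returns only the descriptors that are ready.
        return [by_fd[fd] for fd, _events in ep.poll(timeout)]
    finally:
        ep.close()


if __name__ == "__main__":
    a, b = socket.socketpair()
    c, d = socket.socketpair()
    b.sendall(b"x")                  # only `a` has data pending
    ready = ready_to_read([a, c], 0.1)
    print([s.fileno() for s in ready])
    for s in (a, b, c, d):
        s.close()
```

A long-running server would create the epoll object once and keep the registrations across calls; registering on every call, as this helper does for brevity, throws away most of the benefit.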

--guy
