Re: need some help with tcp/ip programming

guy keren Mon, 14 May 2007 17:30:58 -0700

Amos Shapira wrote:

On 15/05/07, *guy keren* <[EMAIL PROTECTED]> wrote:
     > I think you are tinkering with semantics and so miss the real issue (do
     > you work as a consultant? :).

    did you write that to rafi or to me? i'm not dealing with semantics - i
    am dealing with a real problem, that stable applications have to deal
    with - when the network breaks, and you never get the close from the
    other side.
I wrote this to you, Guy. Rafi maybe used "disconnect" when he basically meant that the TCP connection went down from the other side, while you seemed to hang on "disconnect" being defined as "cable eaten by an alligator" :).

let's leave this subject. i brought it up because many programmers new to socket programming are surprised by the fact that a network disconnection does not cause the socket to close, and that the connection may stay there for hours.

As long as Rafi feels happy about the replies that's not relevant anymore, IMHO.
     > Alas - I think that I've just read not long
     > ago that there is a bug in Linux' select in implementing just
    that and
     > it might miss the close from the other side sometimes

    what you are describing here sounds astonishing - that such a basic
    feature of the sockets implementation is broken? i find this hard to
    believe, without clear evidence.
Here is something about what I read before, it's the other way around, and possibly only relevant to UDP but I'm not sure - if a packet arrives with bad CRC, it's possible that the FD will be marked as "ready to read" by select but then the packet will be discarded (because of the CRC error) and when the process reads the socket it won't get anything. That would make the process get a "0 read right after select" which does NOT indicate a close from the other side.
http://www.uwsg.indiana.edu/hypermail/linux/kernel/0410.2/0001.html
I don't know what would be a select(2)-based work-around, if required at all.

first, it does not return a '0 read'. this situation could have two different effects, depending on the blocking-mode of the socket.

if the socket is in blocking mode (the default mode) - select() might state there's data to be read, but recvmsg (or read) will block.

if the socket is in non-blocking mode - select() might state there's data to be read, but recvmsg (or read) will return with -1, and errno set to EAGAIN.

in neither case will read return 0. the only time that read is allowed to return 0, is when it encounters an EOF. for a socket, this happens ONLY if the other side closed the sending-side of the connection.
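
A minimal C sketch of the outcomes described above, assuming a stream socket and a non-zero buffer size. The names set_nonblocking, read_once and the read_status values are made up for illustration; only fcntl() and recv() are real calls.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical result codes, named here for illustration only. */
enum read_status { READ_DATA, READ_EOF, READ_RETRY, READ_ERROR };

/* Put a descriptor into non-blocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* One recv() attempt after select() reported the socket readable.
 * Assumes fd is a non-blocking stream socket and cap > 0. */
enum read_status read_once(int fd, char *buf, size_t cap, ssize_t *n_out)
{
    ssize_t n = recv(fd, buf, cap, 0);
    *n_out = n;
    if (n > 0)
        return READ_DATA;   /* n bytes were stored in buf */
    if (n == 0)
        return READ_EOF;    /* peer closed its sending side */
    if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
        return READ_RETRY;  /* nothing there after all: go back to select() */
    return READ_ERROR;      /* real failure, inspect errno */
}
```

On READ_RETRY the caller should return to select() rather than call recv() again in a tight loop, which is the busy-wait to avoid.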

of course, whenever i did select-based socket programming, i always set the sockets to non-blocking mode. this requires some careful programming, to avoid busy-waits, but it's the only way to guarantee fully non-blocking behaviour. and people should also note that the socket should be set to non-blocking mode before calling connect, and be ready to handle the peculiar way that the connect call works for non-blocking sockets.
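
The connect behaviour mentioned above can be sketched roughly as follows, assuming IPv4 TCP and a descriptor number below FD_SETSIZE. connect_tcp_nonblocking and fail_close are hypothetical helper names. The usual pattern is: connect() fails with EINPROGRESS, the socket later becomes writable, and the real result is read with getsockopt(SO_ERROR).

```c
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Close fd while preserving errno for the caller; always returns -1. */
static int fail_close(int fd)
{
    int saved = errno;
    close(fd);
    errno = saved;
    return -1;
}

/* Hypothetical helper: returns a connected non-blocking fd, or -1 with errno set. */
int connect_tcp_nonblocking(const struct sockaddr_in *addr, int timeout_sec)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Switch to non-blocking mode BEFORE calling connect(). */
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return fail_close(fd);

    if (connect(fd, (const struct sockaddr *)addr, sizeof *addr) == 0)
        return fd;                      /* connected immediately */
    if (errno != EINPROGRESS)
        return fail_close(fd);          /* immediate failure */

    /* Attempt in progress: the socket turns writable when it finishes. */
    fd_set wfds;
    FD_ZERO(&wfds);
    FD_SET(fd, &wfds);
    struct timeval tv = { .tv_sec = timeout_sec, .tv_usec = 0 };
    int rc = select(fd + 1, NULL, &wfds, NULL, &tv);
    if (rc == 0)
        errno = ETIMEDOUT;              /* nothing happened within the timeout */
    if (rc <= 0)
        return fail_close(fd);

    /* Writable does not mean success: fetch the actual outcome. */
    int err = 0;
    socklen_t errlen = sizeof err;
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &errlen) < 0)
        return fail_close(fd);
    if (err != 0) {
        errno = err;
        return fail_close(fd);
    }
    return fd;
}
```

This helper waits inside its own select() call to keep the sketch short. A real event loop would more likely add the descriptor to its write set and run the SO_ERROR check when the main select() reports it writable.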

doing socket programming without referencing stevens' latest TCP/IP book is foolish.


     > (sorry, can't find
     > a reference with a quick google, closest I got to might be:
     > http://forum.java.sun.com/thread.jspa?threadID=767657&messageID=4386218).
     > I don't remember what was the work-around to that.

    you're describing an issue with JVM - not with linux. i never
    encountered such a problem when doing socket programming in C or C++.

    if you can find something clearer about this, that will be very
    interesting.

Yes, it was a JVM bug but it mentioned differences on Linux vs. other POSIX systems so I thought it might be related.

probably not in this case, because the problem you originally described most likely does not exist. the other way around does exist, if one uses blocking sockets. but then again, no one uses blocking sockets in server software, unless they have a pair of reader+writer threads per socket - and even that may cause problems when shutting down the application.

    it helps avoiding copying too much data to/from kernel space on a sparse
    sockets list, and it helps avoiding having to scan large sets in the
    kernel, to initialize its own internal data structures.
Actually, epoll looks really cool, and Boost's ASIO seems to provide a portable C++ interface around it: http://asio.sourceforge.net/
On the other hand - if you are listening on many FD's which turn out to be ready then epoll apparently loses because it requires a syscall (or kernel intervention) on every single FD, making select(2) (/poll(2)?) more attractive.

besides epoll being non-portable, and thus it doesn't get used too much (that, and the fact people are not familiar with its existence).
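
For readers who have not seen epoll, here is a small Linux-only sketch in C. watch_readable and wait_ready are hypothetical wrapper names, and the 64-entry batch size is an arbitrary choice. Registration costs one epoll_ctl() call per descriptor, while a single epoll_wait() call can report several ready descriptors at once.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Register fd for readability: one epoll_ctl() call per descriptor,
 * done once rather than on every wait. Returns 0 on success, -1 on error. */
int watch_readable(int epfd, int fd)
{
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Wait up to timeout_ms and copy at most cap ready descriptors into out.
 * Returns the number of ready descriptors, 0 on timeout, -1 on error. */
int wait_ready(int epfd, int *out, int cap, int timeout_ms)
{
    struct epoll_event evs[64];   /* arbitrary batch size for this sketch */
    if (cap > 64)
        cap = 64;
    int n = epoll_wait(epfd, evs, cap, timeout_ms);
    for (int i = 0; i < n; i++)
        out[i] = evs[i].data.fd;
    return n;
}
```

The epoll descriptor itself comes from epoll_create1(0) and is released with close(). Unlike select(), the interest set lives in the kernel, so it is not copied in and rescanned on each wait.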


--guy
