Re: need some help with tcp/ip programming
Rafi Cohen wrote: Hi Shachar, can you please give a more detailed explanation of why a thread per socket is not a wise idea? Not that I'm in a hurry to implement it this way, but I'll give you an example where I thought this could be a solution for me. One of the requirements of my project asks that my application, when it is a client, will actually make endless attempts to connect to some of the servers, until any of the servers comes alive and the connection is established. (An interesting subject of its own; I'll relate to one problematic issue of this further below.) You probably agree that this cannot be done from the main thread. I ended up implementing a thread just for this: it attempts to connect endlessly until the connection is established, then exits while raising a flag which the main thread checks just before each select, in order to add the appropriate file descriptor to the select/poll array of descriptors. However, I do ask myself: what's wrong with creating a thread per socket? In this thread I could try to connect, which, in case it does not connect, would not disturb any other thread, and when connected, it would continue to operate concurrently with the other threads. So, yes, please explain what could be unwise in such an implementation.

Concerning the endless loop of connect(): a person on the remote computer told me that when a server is not alive there, those endless connects get the computer stuck, and his only way to free it is by unplugging the ethernet wire. Upon replugging it, everything works again. So, I'm asked to sleep for some seconds between each attempt to connect. Can you, the experienced people, comment on this? Is this event a known one, and would a delay of some seconds indeed solve this? Or is the reason for this event something else?

if i understand you correctly, then naturally, 'connect' could fail immediately, so doing this in a tight loop is a bad idea. in general, whenever you perform retries of something, you should put some thresholds on:
- how many times to retry it.
- how much to wait between 2 retries.
there are certain cases when you should retry something immediately, because the protocol states that - but even then you should handle the case when many immediate retries fail. for example, in SCSI, there is an error called 'unit attention', and the spec states that if you get it, it means there was a config change on the other side, and you should retry immediately. however, sometimes the server you talk to has a bug that causes it to send these errors infinitely, or many times in a row. if you want your software to be robust, you should handle this, either by having an upper limit on the number of retries, or by stating: if X retries in a row failed, let's delay for Y milliseconds, and then try again. --guy

Thanks, Rafi. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Shachar Shemesh Sent: Tuesday, May 15, 2007 8:14 AM To: Amos Shapira Cc: Linux-IL Subject: Re: need some help with tcp/ip programming

Amos Shapira wrote: in neither case will read return 0. the only time that read is allowed to return 0, is when it encounters an EOF. for a socket, this happens ONLY if the other side closed the sending-side of the connection. Is there an on-line reference (or a manual page) to support this? man 2 read. From what I remember about select, the definition of it returning a ready to read bit set is that the next read won't block, which will be true for non-blocking sockets at any time, and therefore they weren't encouraged together with select.
I believe you two are talking about two different things here. There is a world of difference between UDP and TCP in that regard. UDP is connectionless. This means that read's error codes relate only to the latest packet received. UDP also doesn't have a 100% clear concept of what CRC/checksum actually means. I still think it's a bug for select to report activity on a socket that merely received a packet with a bad checksum (there is no CRC in TCP/IP), as how do you even know it was intended for this socket? In TCP, on the other hand, a read is related to the connection. Packets in TCP are coincidental. Under TCP, read returning 0 means just one thing - the connection is closed.

if you have so many connections that multiple threads will become a problem, then a single thread having to cycle through all these connections one by one will also slow things down.

No, my experience begs to differ here. When I tested netchat (http://sourceforge.net/projects/nch), I found out that a single thread had no problem saturating the machine's capacity for network in/out communication. As long as your per-socket handling does not require too much processing to slow you down, merely cycling through the sockets will not be the problem, if you are careful enough. With netchat, I used libevent for that (see further on for details), so I was using epoll. Your mileage may vary
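To make guy's retry advice above concrete, here is a minimal C sketch of a bounded connect() loop with a delay between attempts. The limits (MAX_RETRIES, RETRY_DELAY_SECS), the helper name, and the IPv4 setup are illustrative assumptions, not something from the thread:

#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define MAX_RETRIES      100  /* upper limit instead of an endless loop */
#define RETRY_DELAY_SECS   5  /* sleep between attempts */

int connect_with_retries(const char *ip, unsigned short port)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    inet_pton(AF_INET, ip, &addr.sin_addr);

    for (int i = 0; i < MAX_RETRIES; i++) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0)
            return -1;
        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
            return fd;               /* connected */
        close(fd);                   /* this attempt failed - discard the socket */
        sleep(RETRY_DELAY_SECS);     /* do NOT retry in a tight loop */
    }
    return -1;                       /* give up after MAX_RETRIES */
}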
Re: need some help with tcp/ip programming
i didn't find any ebooks by richard stevens. there is the source code of the examples from his book - you might find it useful (although i think without the book they are a bit out of context). http://www.kohala.com/start/unpv12e/unpv12e.tar.gz --guy (p.s. stevens himself died in 1999 - so don't expect new books :0)

Rafi Cohen wrote: Hi Guy, well, to continue your terminology, I feel not only foolish but a total moron, since I was not aware at all of this book. I'll be glad to receive a reference to this book in order to become cleverer. After all, this is a desire of any programmer, if he does not already feel so beforehand. But seriously now, I based my learning, apart from man pages, on something I found on the internet called Beej's Guide to Network Programming Using Internet Sockets. Not a deeply detailed one, but still not bad, with lots of valuable examples. Just what I needed as a programmer working under managers that want their project finished yesterday, whereas the programmer himself needs to learn the subject just today. Again, I'll be glad to learn more on this subject regardless of the current project. However, as a blind person, reading printed books requires much effort and time for me, and I sure prefer electronic format books (pdf etc.). So if you can point me to an electronic format of Stevens' book, I'll much appreciate it.

2 more comments to what you said below: 1. As Amos already mentioned in a separate message, I also got the impression from the documents that using select makes it irrelevant to set the sockets as nonblocking, so I did not bother with this at all. But if you say that there still are cases where read after select does block, you probably base this on your experience, and no doubt you are much more experienced than me in this area, so I'll take this under consideration. 2. I indeed was one of the people that were not aware of epoll until you mentioned it. I began to read the man pages related to epoll, but for the time being, terminology like edge triggered or level triggered needs deeper understanding from my side. I'll search the net for better documentation, but if anybody can give me a brief explanation of this and which of them is relevant for my case, I'll be more than glad to listen and learn. Thanks, Rafi.

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of guy keren Sent: Tuesday, May 15, 2007 3:17 AM To: Amos Shapira Cc: Linux-IL Subject: Re: need some help with tcp/ip programming

Amos Shapira wrote: On 15/05/07, *guy keren* [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: I think you are tinkering with semantics and so miss the real issue (do you work as a consultant? :). did you write that to rafi or to me? i'm not dealing with semantics - i am dealing with a real problem, that stable applications have to deal with - when the network breaks, and you never get the close from the other side. I wrote this to you, Guy. Rafi maybe used disconnect when he basically meant that the TCP connection went down from the other side, while you seemed to hang on disconnect being defined as cable eaten by an alligator :). let's leave this subject. i brought it up, because many programmers new to socket programming are surprised by the fact that a network disconnection does not cause the socket to close, and that the connection may stay there for hours. As long as Rafi feels happy about the replies, that's not relevant any more, IMHO.
Alas - I think that I've just read not long ago that there is a bug in Linux' select in implementing just that, and it might miss the close from the other side sometimes. what you are describing here sounds astonishing - that such a basic feature of the sockets implementation is broken? i find this hard to believe, without clear evidence.

Here is something about what I read before; it's the other way around, and possibly only relevant to UDP, but I'm not sure - if a packet arrives with a bad CRC, it's possible that the FD will be marked as ready to read by select, but then the packet will be discarded (because of the CRC error), and when the process reads the socket it won't get anything. That would make the process get a 0 read right after select, which does NOT indicate a close from the other side. http://www.uwsg.indiana.edu/hypermail/linux/kernel/0410.2/0001.html I don't know what would be a select(2)-based work-around, if required at all.

first, it does not return a '0 read'. this situation could have two different effects, depending on the blocking-mode of the socket. if the socket is in blocking mode (the default mode) - select() might state there's data to be read, but recvmsg (or read) will block. if the socket is in non-blocking mode - select() might state there's data to be read, but recvmsg (or read) will return with -1, and errno set to EAGAIN. in neither case will read return 0.
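A short sketch of the behaviour guy describes: once a socket is put in non-blocking mode, a read() after select() can legitimately return -1 with errno EAGAIN, which must not be confused with the 0-byte EOF case. handle_readable() is a hypothetical helper name:

#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* call after select() marks fd as readable */
static void handle_readable(int fd)
{
    char buf[4096];
    ssize_t n = read(fd, buf, sizeof(buf));

    if (n > 0) {
        /* got n bytes of real data */
    } else if (n == 0) {
        /* EOF: the peer closed its sending side */
    } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
        /* spurious readiness (e.g. a discarded bad-checksum packet):
         * NOT a close - just go back to select() */
    } else {
        /* a real error */
    }
}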
Re: need some help with tcp/ip programming
On Wed, May 16, 2007, Rafi Cohen wrote about RE: need some help with tcp/ip programming: Hi Nadav, well, in my case I do not have thousands of concurrent connections, about 20-30 only.

In this case, the performance difference between an implementation using select, poll, epoll, or even one thread per connection, should be very small. The last one (one thread from a pre-existing pool, per connection) is easiest to implement, and might very well be enough for your needs.

Let me give you an example of when using a single thread for all connections (and epoll or one of its friends) is important, rather than having a thread (or process) per connection. Imagine a Web server serving static files. Now imagine that (as is the case in a realistic situation) a large chunk of your clients have slow connections, and fetch from you around 10 KB a second. Your machine's CPU, disk, kernel, and network card can easily handle 1,000 such connections concurrently (the total of all these connections moves just 10 MB per second). But you could not conceivably have a thread (or worse, a process) for each of these connections, because threads have significant overheads. If, for example, each thread takes up just 2 MB of memory (for its stack, kernel structures, and perhaps other things), these 1,000 connections take up 2 GB of memory (and now, try to imagine moving to a 1 Gbit network card, and hoping that you could serve 5 times more concurrent connections...). The point is that all these threads don't do anything most of the time, and just wait for their chance to send their 10 K every second. It is much more efficient - in speed and certainly in memory - to have a single thread, which epoll()s (or whatever) to find the next connection that is ready (for writing, reading, open or close) and processes it, all in a single thread. Or, of course, if your machine is an SMP with N CPUs, then you should have N threads.

However, in some cases the input from those sockets is actually queries to a database, and it may also end in operations on this database. Not a heavy database, but still insert/delete/update is done occasionally. Now, if I understand you correctly, you say that using a single thread with epoll, even with many concurrent connections, will not decrease performance.

Writing a pure single-threaded server is *very hard*. You must take extreme care not to wait, ever. If your server needs to wait to get the content it needs to serve, e.g., to read the content from disk or to get it from a database, then other connections are not being served at the same time, and your CPU is being wasted! This is why such servers usually have complex state machines - e.g., when you get a database request from the client, you send it to the database, and do not wait for the result - rather, you remember that this connection is in a "sending to database" state, remember the command you need to send to the DB, and add the database connection to the poll list; when this connection is ready to write, you write the command to it (you may not be able to do it in a single go, if it's a long command), and then you start waiting for reading the result from the DB, and at the same time you send it to the client (remembering to do flow control - if the client is not ready to be sent to, don't read from the database response). Doing all this is extremely complex, and you wouldn't want to do it except in extreme cases, when performance is of utmost importance and you're expecting thousands of concurrent connections.
A simpler approach in your case is to have a hybrid server: a single thread using a poll (or whatever) loop waits for these commands, and when it gets a command, it sends it to a small thread pool that acts on these commands. In some situations this can be more efficient than the straightforward one-thread-per-connection server - imagine for example that you have 30 concurrent connections, but each sends just one command a second which takes 0.1 seconds to process - in this case a thread pool with just 3 threads might be enough. Of course, like I said, if you're only expecting 30 concurrent connections, I would suggest that you just use the simplest approach: have one thread per connection. You will never notice the difference (I believe).

-- Nadav Har'El | Wednesday, May 16 2007, 28 Iyyar 5767 | [EMAIL PROTECTED] | Phone +972-523-790466, ICQ 13349191 | http://nadav.harel.org.il | "A Life? Cool! Where can I download one of those from?"
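A hedged sketch of the hybrid design Nadav describes: one I/O thread blocks in poll() and hands ready descriptors to a small pool of workers that may block on the database. The queue, its size, and the worker count are illustrative assumptions; queue-overflow handling is omitted for brevity:

#include <pthread.h>

#define NWORKERS 3
#define QSIZE    64

static int queue[QSIZE];          /* ready fds waiting for a worker */
static int qhead, qtail;          /* empty when qhead == qtail; overflow unhandled */
static pthread_mutex_t qlock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  qcond = PTHREAD_COND_INITIALIZER;

static void enqueue(int fd)
{
    pthread_mutex_lock(&qlock);
    queue[qtail] = fd;
    qtail = (qtail + 1) % QSIZE;
    pthread_cond_signal(&qcond);
    pthread_mutex_unlock(&qlock);
}

static int dequeue(void)
{
    int fd;
    pthread_mutex_lock(&qlock);
    while (qhead == qtail)
        pthread_cond_wait(&qcond, &qlock);
    fd = queue[qhead];
    qhead = (qhead + 1) % QSIZE;
    pthread_mutex_unlock(&qlock);
    return fd;
}

static void *worker(void *arg)
{
    (void)arg;
    for (;;) {
        int fd = dequeue();
        (void)fd;
        /* read the command from fd and do the (possibly blocking)
         * database work here, then re-arm fd in the poll set */
    }
    return NULL;
}

void start_workers(void)
{
    pthread_t tid;
    for (int i = 0; i < NWORKERS; i++)
        pthread_create(&tid, NULL, worker, NULL);
}

/* the I/O thread: after poll() returns, call enqueue(fd) for every fd
 * with POLLIN set, and remove it from the poll set until the worker is
 * done, so two threads never read the same connection at once. */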
Re: need some help with tcp/ip programming
Shachar Shemesh wrote: Yes, it is probably better to use a single thread that does the event waiting, and a thread pool for the actual processing. Having one thread per socket, however, is not a wise idea IMHO. Shachar

Are there any rules for setting how many connections will be handled by one thread when using the thread pool method?

Chava
Re: need some help with tcp/ip programming
Chava Leviatan wrote: Are there any rules for setting how many connections will be handled by one thread when using the thread pool method? It greatly depends on the amount of offline processing that needs to be done. If, like netchat, it is mostly a case of "receive, compare, send", then a single thread can handle everything. If it's a case of "receive, compute mandelbrot encrypted with an AES key you need to brute force", then if you don't assign one thread per socket, you will end up with an application that receives X connections (where X is the number of threads you have) and then stops responding. Middle ground cases will fit, well, the middle ground. Shachar
Re: need some help with tcp/ip programming
Shachar Shemesh wrote: guy keren wrote: Amos Shapira wrote: On 14/05/07, *guy keren* [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: Alas - I think that I've just read not long ago that there is a bug in Linux' select in implementing just that, and it might miss the close from the other side sometimes. what you are describing here sounds astonishing - that such a basic feature of the sockets implementation is broken? i find this hard to believe, without clear evidence.

Jumping in in the middle here: I don't have any clear evidence (or any evidence at all) for what Amos was talking about, but I did run across this worrying change in Wine: http://www.winehq.org/pipermail/wine-cvs/2006-November/027552.html Now, it is totally unclear to me whether the fd leak in question is a result of a Wine bug around select, or of select itself. This may, after all, prove to be nothing important. Then again, being as it is that all Wine used to do was translate the Windows version of select to the almost identical Linux version, I find this worrying. Shachar

very nice. so to solve the problem, they switched to performing a memory allocation and freeing for each call to poll. sounds like the performance for busy programs will go down (as memory management is a rather CPU-intensive operation). i wonder if it's significantly less expensive than the poll system call itself. since i don't understand the fd leakage bug mentioned there, i don't know the answer myself. --guy
Re: need some help with tcp/ip programming
On Mon, May 14, 2007, guy keren wrote about Re: need some help with tcp/ip programming: this is interesting. can anyone provide more info on this? the problem with select, is that it is unable to optimize handling of 'holes' in the file descriptor set. suppose that you need to select on file descriptors 2 and 4000. you need to pass info about all file descriptors up to 4000 (i.e. many '0' bits, and only two '1' bits, in the different select sets). with poll, you pass an array of the descriptors you care about. so the size of the array is proportional to the amount of descriptors you are interested in, while with select it is proportional to the numeric value of the largest descriptor you are interested in.

This is indeed an accurate assessment of the difference between poll and select. Another point worth noticing is that in select(), the same array is used both as input and output. This means that after every event, you need to refill this array, which can be quite slow if you have many thousands of events per second.

In some cases, you reach a point where you are listening to thousands of file descriptors, and getting thousands of events per second. For example, one can write a single-threaded HTTP server which handles thousands of concurrent connections with amazing performance (I wrote such a server once, and it was a really interesting experience). When you reach such high demands, even poll() is not good enough - every time one fd is ready to act on, something which can happen thousands of times per second, you need to call the poll() system call again, and pass the long array of fds from userspace to the kernel. The problem is that the time poll() takes is proportional to the number of fds to poll, rather than to the number of fds on which something actually happened. To solve this problem, the /dev/epoll interface was added to Linux in 2001, and later, apparently because Linus Torvalds doesn't like /dev tricks, new system calls were added instead (see epoll(4)). It was amazing to see a system on which Apache struggled to keep 200 concurrent open connections suddenly keep thousands of concurrent open connections, using only one thread (or N threads on a machine with N CPUs). Together with sendfile(), this allows you to create killer Web servers :-)

when you use poll, you can use the trick of having 2 threads - one polls on idle sockets (i.e. sockets that did not have I/O in the last X seconds), and one listens on 'active' sockets (i.e. sockets that had I/O in the last X seconds). this avoids the major problem with both select and poll - that after an event on a single socket, the info for all the sockets has to be copied to user space (when select/poll returns), and then to kernel space again (when invoking poll/select again). i think that people added epoll support in order to avoid waking the poll function altogether - by receiving a signal from the kernel with the exact info, instead of having to return from poll.

Indeed (see my above explanation). epoll() *does* return from the poll every time, but it immediately lets you know what changed (no need to check a long array), and more importantly - when you want to call epoll() again, there is no need to pass the long list of fds to the kernel again.

-- Nadav Har'El | Tuesday, May 15 2007, 27 Iyyar 5767 | [EMAIL PROTECTED] | Phone +972-523-790466, ICQ 13349191 | http://nadav.harel.org.il | "Ms Piggy's last words: I'm pink, therefore I'm ham."
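For readers who, like Rafi, have not met epoll before, here is a minimal sketch of the interface Nadav describes: descriptors are registered once with epoll_ctl(), and epoll_wait() then returns only the descriptors that actually have events. handle_io() is a hypothetical per-fd handler:

#include <sys/epoll.h>

#define MAX_EVENTS 64

extern void handle_io(int fd);   /* hypothetical per-fd handler */

void run_event_loop(int sockfd)  /* sockfd: an already-open socket */
{
    struct epoll_event ev, events[MAX_EVENTS];
    int epfd = epoll_create(256);      /* the size hint is ignored by modern kernels */

    ev.events = EPOLLIN;               /* level-triggered readability */
    ev.data.fd = sockfd;
    epoll_ctl(epfd, EPOLL_CTL_ADD, sockfd, &ev);

    for (;;) {
        /* returns only the fds with pending events - no long array re-passed */
        int n = epoll_wait(epfd, events, MAX_EVENTS, -1);
        for (int i = 0; i < n; i++)
            handle_io(events[i].data.fd);
    }
}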
RE: need some help with tcp/ip programming
Hi Guy, well, to continue your terminology, I feel not only foolish but a total moron, since I was not aware at all of this book. I'll be glad to receive a reference to this book in order to become cleverer. After all, this is a desire of any programmer, if he does not already feel so beforehand. But seriously now, I based my learning, apart from man pages, on something I found on the internet called Beej's Guide to Network Programming Using Internet Sockets. Not a deeply detailed one, but still not bad, with lots of valuable examples. Just what I needed as a programmer working under managers that want their project finished yesterday, whereas the programmer himself needs to learn the subject just today. Again, I'll be glad to learn more on this subject regardless of the current project. However, as a blind person, reading printed books requires much effort and time for me, and I sure prefer electronic format books (pdf etc.). So if you can point me to an electronic format of Stevens' book, I'll much appreciate it.

2 more comments to what you said below: 1. As Amos already mentioned in a separate message, I also got the impression from the documents that using select makes it irrelevant to set the sockets as nonblocking, so I did not bother with this at all. But if you say that there still are cases where read after select does block, you probably base this on your experience, and no doubt you are much more experienced than me in this area, so I'll take this under consideration. 2. I indeed was one of the people that were not aware of epoll until you mentioned it. I began to read the man pages related to epoll, but for the time being, terminology like edge triggered or level triggered needs deeper understanding from my side. I'll search the net for better documentation, but if anybody can give me a brief explanation of this and which of them is relevant for my case, I'll be more than glad to listen and learn. Thanks, Rafi.

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of guy keren Sent: Tuesday, May 15, 2007 3:17 AM To: Amos Shapira Cc: Linux-IL Subject: Re: need some help with tcp/ip programming

Amos Shapira wrote: On 15/05/07, *guy keren* [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: I think you are tinkering with semantics and so miss the real issue (do you work as a consultant? :). did you write that to rafi or to me? i'm not dealing with semantics - i am dealing with a real problem, that stable applications have to deal with - when the network breaks, and you never get the close from the other side. I wrote this to you, Guy. Rafi maybe used disconnect when he basically meant that the TCP connection went down from the other side, while you seemed to hang on disconnect being defined as cable eaten by an alligator :). let's leave this subject. i brought it up, because many programmers new to socket programming are surprised by the fact that a network disconnection does not cause the socket to close, and that the connection may stay there for hours. As long as Rafi feels happy about the replies, that's not relevant any more, IMHO. Alas - I think that I've just read not long ago that there is a bug in Linux' select in implementing just that, and it might miss the close from the other side sometimes. what you are describing here sounds astonishing - that such a basic feature of the sockets implementation is broken? i find this hard to believe, without clear evidence.

Here is something about what I read before; it's the other way around, and possibly only relevant to UDP, but I'm not sure - if a packet arrives with a bad CRC, it's possible that the FD will be marked as ready to read by select, but then the packet will be discarded (because of the CRC error), and when the process reads the socket it won't get anything. That would make the process get a 0 read right after select, which does NOT indicate a close from the other side. http://www.uwsg.indiana.edu/hypermail/linux/kernel/0410.2/0001.html I don't know what would be a select(2)-based work-around, if required at all.

first, it does not return a '0 read'. this situation could have two different effects, depending on the blocking-mode of the socket. if the socket is in blocking mode (the default mode) - select() might state there's data to be read, but recvmsg (or read) will block. if the socket is in non-blocking mode - select() might state there's data to be read, but recvmsg (or read) will return with -1, and errno set to EAGAIN. in neither case will read return 0. the only time that read is allowed to return 0, is when it encounters an EOF. for a socket, this happens ONLY if the other side closed the sending-side of the connection. of course, whenever i did select-based socket programming, i always set the sockets to non-blocking mode. this requires some careful programming
Re: need some help with tcp/ip programming
Rafi Cohen wrote: 2. I indeed was one of the people that were not aware of epoll until you mentioned it. I began to read the man pages related to epoll, but for the time being, terminology like edge triggered or level triggered needs deeper understanding from my side.

Edge triggered - will only report the file descriptor (trigger the epoll) when it changes state (an edge). Level triggered - will report the descriptor whenever it has anything interesting on it (the level is high). It's an EE term. Shachar
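A small sketch of what edge-triggered means in practice: with EPOLLET you are notified once per state change, so you must drain the socket until EAGAIN before waiting again. consume() is a hypothetical stand-in for real processing:

#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

extern void consume(const char *buf, ssize_t n);  /* hypothetical */

/* fd was registered with EPOLLET and must be in O_NONBLOCK mode */
void on_edge_readable(int fd)
{
    char buf[4096];
    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0) {
            consume(buf, n);   /* keep reading: no new edge until drained */
        } else if (n == 0) {
            break;             /* peer closed its sending side */
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            break;             /* drained - safe to epoll_wait() again */
        } else {
            break;             /* a real error */
        }
    }
}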
RE: need some help with tcp/ip programming
Hi Shachar, can you please give a more detailed explanation of why a thread per socket is not a wise idea? Not that I'm in a hurry to implement it this way, but I'll give you an example where I thought this could be a solution for me. One of the requirements of my project asks that my application, when it is a client, will actually make endless attempts to connect to some of the servers, until any of the servers comes alive and the connection is established. (An interesting subject of its own; I'll relate to one problematic issue of this further below.) You probably agree that this cannot be done from the main thread. I ended up implementing a thread just for this: it attempts to connect endlessly until the connection is established, then exits while raising a flag which the main thread checks just before each select, in order to add the appropriate file descriptor to the select/poll array of descriptors. However, I do ask myself: what's wrong with creating a thread per socket? In this thread I could try to connect, which, in case it does not connect, would not disturb any other thread, and when connected, it would continue to operate concurrently with the other threads. So, yes, please explain what could be unwise in such an implementation.

Concerning the endless loop of connect(): a person on the remote computer told me that when a server is not alive there, those endless connects get the computer stuck, and his only way to free it is by unplugging the ethernet wire. Upon replugging it, everything works again. So, I'm asked to sleep for some seconds between each attempt to connect. Can you, the experienced people, comment on this? Is this event a known one, and would a delay of some seconds indeed solve this? Or is the reason for this event something else? Thanks, Rafi.

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Shachar Shemesh Sent: Tuesday, May 15, 2007 8:14 AM To: Amos Shapira Cc: Linux-IL Subject: Re: need some help with tcp/ip programming

Amos Shapira wrote: in neither case will read return 0. the only time that read is allowed to return 0, is when it encounters an EOF. for a socket, this happens ONLY if the other side closed the sending-side of the connection. Is there an on-line reference (or a manual page) to support this? man 2 read. From what I remember about select, the definition of it returning a ready to read bit set is that the next read won't block, which will be true for non-blocking sockets at any time, and therefore they weren't encouraged together with select.

I believe you two are talking about two different things here. There is a world of difference between UDP and TCP in that regard. UDP is connectionless. This means that read's error codes relate only to the latest packet received. UDP also doesn't have a 100% clear concept of what CRC/checksum actually means. I still think it's a bug for select to report activity on a socket that merely received a packet with a bad checksum (there is no CRC in TCP/IP), as how do you even know it was intended for this socket? In TCP, on the other hand, a read is related to the connection. Packets in TCP are coincidental. Under TCP, read returning 0 means just one thing - the connection is closed.

if you have so many connections that multiple threads will become a problem, then a single thread having to cycle through all these connections one by one will also slow things down.

No, my experience begs to differ here.
When I tested netchat (http://sourceforge.net/projects/nch), I found out that a single thread had no problem saturating the machine's capacity for network in/out communication. As long as your per-socket handling does not require too much processing to slow you down, merely cycling through the sockets will not be the problem, if you are careful enough. With netchat, I used libevent for that (see further on for details), so I was using epoll. Your mileage may vary with the other technologies.

Not to mention the signal problem, and just generally the fact that one connection taking too much time to handle will slow the handling of other connections.

Yes, it is probably better to use a single thread that does the event waiting, and a thread pool for the actual processing. Having one thread per socket, however, is not a wise idea IMHO.

A possible go-between might be to select/poll on multiple FDs and then hand the work to threads from a thread pool, but such a job would be justifiable only for a large number of connections, IMHO.

It's not that difficult to pull off, and I believe your analysis failed to account for the overhead of creating new threads for each new connection, as well as destroying the threads for connections that are no longer needed. If you insist on using a single thread, then select seems to be the underdog today - poll is just as portable (AFAICT), and Boost ASIO (and I'd expect ACE) allows making portable code which uses the superior APIs such as epoll, kqueue and /dev/poll
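Since Shachar mentions using libevent in netchat, here is a hedged sketch of that pattern in the libevent 1.x style API (details may differ between versions); libevent picks the best available backend, e.g. epoll on Linux:

#include <event.h>

static void on_read(int fd, short which, void *arg)
{
    (void)which; (void)arg; (void)fd;
    /* read from fd here; EV_PERSIST keeps the event registered */
}

int watch_socket(int sockfd)
{
    static struct event ev;

    event_init();                                  /* once per process */
    event_set(&ev, sockfd, EV_READ | EV_PERSIST, on_read, NULL);
    event_add(&ev, NULL);                          /* NULL timeout: wait forever */
    return event_dispatch();                       /* run the event loop */
}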
RE: need some help with tcp/ip programming
Hi Nadav, well, in my case I do not have thousands of concurrent connections, about 20-30 only. However, in some cases the input from those sockets is actually queries to a database, and it may also end in operations on this database. Not a heavy database, but still insert/delete/update is done occasionally. Now, if I understand you correctly, you say that using a single thread with epoll, even with many concurrent connections, will not decrease performance. Does this remain correct for my case, as explained above? For the time being, I do use a single thread with the select mechanism, which I'll modify to epoll once I understand its usage. However, I do feel that multithreading after epoll returns with an event may significantly increase concurrency and performance. I'll be glad to have your opinion about this. Thanks, Rafi.

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Nadav Har'El Sent: Tuesday, May 15, 2007 3:23 PM To: guy keren Cc: [EMAIL PROTECTED]; Amos Shapira; Linux-IL Subject: Re: need some help with tcp/ip programming

On Mon, May 14, 2007, guy keren wrote about Re: need some help with tcp/ip programming: this is interesting. can anyone provide more info on this? the problem with select, is that it is unable to optimize handling of 'holes' in the file descriptor set. suppose that you need to select on file descriptors 2 and 4000. you need to pass info about all file descriptors up to 4000 (i.e. many '0' bits, and only two '1' bits, in the different select sets). with poll, you pass an array of the descriptors you care about. so the size of the array is proportional to the amount of descriptors you are interested in, while with select it is proportional to the numeric value of the largest descriptor you are interested in.

This is indeed an accurate assessment of the difference between poll and select. Another point worth noticing is that in select(), the same array is used both as input and output. This means that after every event, you need to refill this array, which can be quite slow if you have many thousands of events per second.

In some cases, you reach a point where you are listening to thousands of file descriptors, and getting thousands of events per second. For example, one can write a single-threaded HTTP server which handles thousands of concurrent connections with amazing performance (I wrote such a server once, and it was a really interesting experience). When you reach such high demands, even poll() is not good enough - every time one fd is ready to act on, something which can happen thousands of times per second, you need to call the poll() system call again, and pass the long array of fds from userspace to the kernel. The problem is that the time poll() takes is proportional to the number of fds to poll, rather than to the number of fds on which something actually happened. To solve this problem, the /dev/epoll interface was added to Linux in 2001, and later, apparently because Linus Torvalds doesn't like /dev tricks, new system calls were added instead (see epoll(4)). It was amazing to see a system on which Apache struggled to keep 200 concurrent open connections suddenly keep thousands of concurrent open connections, using only one thread (or N threads on a machine with N CPUs). Together with sendfile(), this allows you to create killer Web servers :-)

when you use poll, you can use the trick of having 2 threads - one polls on idle sockets (i.e. sockets that did not have I/O in the last X seconds), and one listens on 'active' sockets (i.e. sockets that had I/O in the last X seconds). this avoids the major problem with both select and poll - that after an event on a single socket, the info for all the sockets has to be copied to user space (when select/poll returns), and then to kernel space again (when invoking poll/select again). i think that people added epoll support in order to avoid waking the poll function altogether - by receiving a signal from the kernel with the exact info, instead of having to return from poll.

Indeed (see my above explanation). epoll() *does* return from the poll every time, but it immediately lets you know what changed (no need to check a long array), and more importantly - when you want to call epoll() again, there is no need to pass the long list of fds to the kernel again.

-- Nadav Har'El | Tuesday, May 15 2007, 27 Iyyar 5767 | [EMAIL PROTECTED] | Phone +972-523-790466, ICQ 13349191 | http://nadav.harel.org.il | "Ms Piggy's last words: I'm pink, therefore I'm ham."
need some help with tcp/ip programming
Hi, as a project for a company, I'm writing a tcp/ip application on linux using the C language. My application has 2 connections as a client to remote servers, and is by itself a server accepting remote client connections. I'm using the select() mechanism to manage all those connections. Everything works nicely until any of the remote sides disconnects. In some of such cases, I'm not succeeding to reconnect to this remote side as a client, or to re-accept a connection from the remote client.

Reading some documentation on tcp/ip programming, I had the impression that the select mechanism should detect such a remote disconnect event, thus enabling me to make a further read from this socket, which should end in reading 0 bytes. Reading 0 bytes should indicate disconnection, and let me disconnect properly from my side and try to reconnect. However, it seems that select does not detect all those disconnect events, and even worse, I can not see any rule behind when it does detect this and when it does not.

So, I have a couple of questions, and I'll most appreciate any assistance. 1. Would you confirm that select, indeed, does not detect each and every remote disconnection, and do you know if there is a rule behind those cases? 2. If I need to detect those disconnections to react properly in my application, and I can not rely on select, what would you suggest me to do? Should I use a kind of ping mechanism to check every few seconds if the connection is still alive? Or maybe use multithreading instead of select, where each thread is responsible for each connection source, and instead of select I loop on read from this source, and so can detect when I read 0 bytes, which is the disconnect indication, and react accordingly? Does this make sense, or do you see issues in such an implementation also? Thanks, Rafi.
Re: need some help with tcp/ip programming
Rafi Cohen wrote: Hi, as a project for a company, I'm writing a tcp/ip application on linux using the C language. ah welcome, welcome to the pleasure dome... My application has 2 connections as a client to remote servers, and is by itself a server accepting remote client connections. I'm using the select() mechanism to manage all those connections. Everything works nicely until any of the remote sides disconnects. In some of such cases, I'm not succeeding to reconnect to this remote side as a client, or to re-accept a connection from the remote client. you're not describing things properly here, since later you say you are not managing to disconnect (the problem is not with re-connecting - which, if it happened, would imply a different problem altogether).

Reading some documentation on tcp/ip programming, I had the impression that the select mechanism should detect such a remote disconnect event, thus enabling me to make a further read from this socket, which should end in reading 0 bytes. Reading 0 bytes should indicate disconnection, and let me disconnect properly from my side and try to reconnect. However, it seems that select does not detect all those disconnect events, and even worse, I can not see any rule behind when it does detect this and when it does not. select does not notice disconnections. it only notices if the socket was closed by the remote side. that's a completely different issue, and that's also the only time when you get a 0 return value from the read() system call.

So, I have a couple of questions, and I'll most appreciate any assistance. 1. Would you confirm that select, indeed, does not detect each and every remote disconnection, and do you know if there is a rule behind those cases? TCP/IP stacks on linux, by default, will only notice a network disconnection (i.e. the network went down) reliably after 2 hours. that's how TCP/IP's internal keep-alive mechanism is set. 2 hours is a completely impractical value for any sane system you might develop. you can tweak this parameter of the TCP/IP stack for specific sockets, on current linux kernels, using a socket option (man 7 tcp - and look for 'keepalive' - there are 3 parameters for this). i never used this mechanism, since it was only possible to make this change globally when i needed it.

2. If I need to detect those disconnections to react properly in my application, and I can not rely on select, what would you suggest me to do? Should I use a kind of ping mechanism to check every few seconds if the connection is still alive? Or maybe use multithreading instead of select, where each thread is responsible for each connection source, and instead of select I loop on read from this source, and so can detect when I read 0 bytes, which is the disconnect indication, and react accordingly? Does this make sense, or do you see issues in such an implementation also? read would fail just like select does, and for the same reason. you could implement the keepalive in your application, in case tweaking the keepalive parameters of the TCP stack does not work, for some reason. --guy
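A sketch of the per-socket keepalive tweak guy points at in man 7 tcp. The option names are the real Linux ones; the timing values are illustrative assumptions, replacing the 2-hour default:

#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

int enable_fast_keepalive(int fd)
{
    int on    = 1;
    int idle  = 30;  /* seconds of silence before the first probe */
    int intvl = 5;   /* seconds between probes */
    int cnt   = 3;   /* unanswered probes before the connection is declared dead */

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,  &idle,  sizeof(idle))  < 0 ||
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl)) < 0 ||
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,   &cnt,   sizeof(cnt))   < 0)
        return -1;
    return 0;
}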
RE: need some help with tcp/ip programming
Hi Guy. Rafi Cohen wrote: Hi, as a project for a company, I'm writing a tcp/ip application on linux using the C language. ah welcome, welcome to the pleasure dome... Hmm, thanks for your warm greetings. My application has 2 connections as a client to remote servers, and is by itself a server accepting remote client connections. I'm using the select() mechanism to manage all those connections. Everything works nicely until any of the remote sides disconnects. In some of such cases, I'm not succeeding to reconnect to this remote side as a client, or to re-accept a connection from the remote client. you're not describing things properly here, since later you say you are not managing to disconnect (the problem is not with re-connecting - which, if it happened, would imply a different problem altogether). You are correct; what I indeed meant to say is that in order to re-connect, first I need to disconnect properly, and here lies my problem.

Reading some documentation on tcp/ip programming, I had the impression that the select mechanism should detect such a remote disconnect event, thus enabling me to make a further read from this socket, which should end in reading 0 bytes. Reading 0 bytes should indicate disconnection, and let me disconnect properly from my side and try to reconnect. However, it seems that select does not detect all those disconnect events, and even worse, I can not see any rule behind when it does detect this and when it does not. select does not notice disconnections. it only notices if the socket was closed by the remote side. that's a completely different issue, and that's also the only time when you get a 0 return value from the read() system call. Yes, this does make sense, and I need to check with the developer of the software to which mine is connecting remotely, whether he indeed closes the socket when disconnecting. You gave me a clue, thanks.

So, I have a couple of questions, and I'll most appreciate any assistance. 1. Would you confirm that select, indeed, does not detect each and every remote disconnection, and do you know if there is a rule behind those cases? TCP/IP stacks on linux, by default, will only notice a network disconnection (i.e. the network went down) reliably after 2 hours. that's how TCP/IP's internal keep-alive mechanism is set. 2 hours is a completely impractical value for any sane system you might develop. you can tweak this parameter of the TCP/IP stack for specific sockets, on current linux kernels, using a socket option (man 7 tcp - and look for 'keepalive' - there are 3 parameters for this). i never used this mechanism, since it was only possible to make this change globally when i needed it. I know of this option and will look into it deeper. Maybe I'm missing something here, but this option may be relevant for the case where my application is a server. I wonder how it would apply, if at all, when my application is a client.

2. If I need to detect those disconnections to react properly in my application, and I can not rely on select, what would you suggest me to do? Should I use a kind of ping mechanism to check every few seconds if the connection is still alive? Or maybe use multithreading instead of select, where each thread is responsible for each connection source, and instead of select I loop on read from this source, and so can detect when I read 0 bytes, which is the disconnect indication, and react accordingly? Does this make sense, or do you see issues in such an implementation also? read would fail just like select does, and for the same reason. you could implement the keepalive in your application, in case tweaking the keepalive parameters of the TCP stack does not work, for some reason. --guy Thanks, Rafi.
Re: need some help with tcp/ip programming
On 14/05/07, guy keren [EMAIL PROTECTED] wrote: Rafi Cohen wrote: Reading some documentation on tcp/ip programming, I had the impression that the select mechanism should detect such a remote disconnect event, thus enabling me to make a further read from this socket, which should end in reading 0 bytes. Reading 0 bytes should indicate disconnection, and let me disconnect properly from my side and try to reconnect. However, it seems that select does not detect all those disconnect events, and even worse, I can not see any rule behind when it does detect this and when it does not. select does not notice disconnections. it only notices if the socket was closed by the remote side. that's a completely different issue, and that's also the only time when you get a 0 return value from the read() system call.

I think you are tinkering with semantics and so miss the real issue (do you work as a consultant? :). Basically - Rafi expects (as he should) that a read(fd,...)==0 after a select(2) call that indicated activity on fd means that the other side has closed the connection. Alas - I think that I've just read not long ago that there is a bug in Linux' select in implementing just that, and it might miss the close from the other side sometimes (sorry, can't find a reference with a quick google; the closest I got might be: http://forum.java.sun.com/thread.jspa?threadID=767657&messageID=4386218). I don't remember what was the work-around for that.

Another point to check - does the read(2) after select(2) return an error? See select_tut(2) for more details on how to program with select - you should check for errors as well, instead of just assuming that read(2) must succeed (e.g. interrupt). Also, while you are at it - check whether pselect(2) can help you improve your program's robustness. Maybe using poll(2) will help you around that (I also heard that poll is generally more efficient, because it helps the kernel avoid having to re-interpret the syscall parameters on every call). --Amos
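A minimal sketch of the poll(2) alternative Amos suggests: you pass an array containing only the descriptors you care about, and check revents per entry afterwards. The fds array and the MAX_FDS bound are assumptions for illustration:

#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_FDS 64

void poll_loop(const int *fds, int count)   /* fds: up to MAX_FDS connected sockets */
{
    struct pollfd pfds[MAX_FDS];
    char buf[4096];

    for (int i = 0; i < count; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;            /* readable, including EOF on close */
    }
    for (;;) {
        if (poll(pfds, count, -1) < 0)      /* -1: block until something happens */
            break;                          /* EINTR handling omitted for brevity */
        for (int i = 0; i < count; i++) {
            if (pfds[i].revents & (POLLIN | POLLHUP | POLLERR)) {
                ssize_t n = read(pfds[i].fd, buf, sizeof(buf));
                if (n == 0) {
                    /* peer closed: remove pfds[i].fd from the set */
                }
            }
        }
    }
}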
RE: need some help with tcp/ip programming
Amos, thanks for the ideas. I thought about poll and will look into this. I'm checking read also for errors (values < 0), but in this case there even cannot be errors. Since the socket is disconnected, select does not detect any event on this socket, and so does not give me any opportunity to read from it and even get an error. But thanks anyway. Rafi.

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Amos Shapira Sent: Monday, May 14, 2007 1:16 PM To: Linux-IL Subject: Re: need some help with tcp/ip programming

On 14/05/07, guy keren [EMAIL PROTECTED] wrote: Rafi Cohen wrote: Reading some documentation on tcp/ip programming, I had the impression that the select mechanism should detect such a remote disconnect event, thus enabling me to make a further read from this socket, which should end in reading 0 bytes. Reading 0 bytes should indicate disconnection, and let me disconnect properly from my side and try to reconnect. However, it seems that select does not detect all those disconnect events, and even worse, I can not see any rule behind when it does detect this and when it does not. select does not notice disconnections. it only notices if the socket was closed by the remote side. that's a completely different issue, and that's also the only time when you get a 0 return value from the read() system call.

I think you are tinkering with semantics and so miss the real issue (do you work as a consultant? :). Basically - Rafi expects (as he should) that a read(fd,...)==0 after a select(2) call that indicated activity on fd means that the other side has closed the connection. Alas - I think that I've just read not long ago that there is a bug in Linux' select in implementing just that, and it might miss the close from the other side sometimes (sorry, can't find a reference with a quick google; the closest I got might be: http://forum.java.sun.com/thread.jspa?threadID=767657&messageID=4386218). I don't remember what was the work-around for that.

Another point to check - does the read(2) after select(2) return an error? See select_tut(2) for more details on how to program with select - you should check for errors as well, instead of just assuming that read(2) must succeed (e.g. interrupt). Also, while you are at it - check whether pselect(2) can help you improve your program's robustness. Maybe using poll(2) will help you around that (I also heard that poll is generally more efficient, because it helps the kernel avoid having to re-interpret the syscall parameters on every call). --Amos
Re: need some help with tcp/ip programming
this is a great thread - i'm learning a lot by reading it, even though i've been programming sockets for years. thanks for the question, and thanks for all the great answers.

Another point to check - does the read(2) after select(2) return an error? See select_tut(2) for more details on how to program with select - you should check for errors as well, instead of just assuming that read(2) must succeed (e.g. interrupt). Also, while you are at it - check whether pselect(2) can help you improve your program's robustness.

there is a distinction between different types of read() errors. at the very least, if you set the non-blocking option, you will get a response indicating that there is no data. if you select with a non-zero timeout, you get a different return value indicating the timeout was hit w/ no data. if the other side disconnects, you get a different failure, and finally, there is a catch-all error for other reasons. IIRC, this was also somewhat implementation specific. it's been a while, and it may have been Linux/OS X differences, but i would recommend looking at the appropriate man page for your stack. if you do a random google, you may get the netbsd/freebsd stack, which may be different.

Maybe using poll(2) will help you around that (I also heard that poll is generally more efficient, because it helps the kernel avoid having to re-interpret the syscall parameters on every call).

this is interesting. can anyone provide more info on this? thanks, michael
Re: need some help with tcp/ip programming
(rafi - your quoting mixes your text with mine - you might want to fix this - it was very hard to read your letter). see my comments below:

Rafi Cohen wrote: Hi Guy. Rafi Cohen wrote: So, I have a couple of questions, and I'll most appreciate any assistance. 1. Would you confirm that select, indeed, does not detect each and every remote disconnection, and do you know if there is a rule behind those cases? TCP/IP stacks on linux, by default, will only notice a network disconnection (i.e. the network went down) reliably after 2 hours. that's how TCP/IP's internal keep-alive mechanism is set. 2 hours is a completely impractical value for any sane system you might develop. you can tweak this parameter of the TCP/IP stack for specific sockets, on current linux kernels, using a socket option (man 7 tcp - and look for 'keepalive' - there are 3 parameters for this). i never used this mechanism, since it was only possible to make this change globally when i needed it. and rafi responds: I know of this option and will look into it deeper. Maybe I'm missing something here, but this option may be relevant for the case where my application is a server. I wonder how it would apply, if at all, when my application is a client.

it does not matter if you have a client or a server - you want to know about network problems either way - unless you assume that the client is a GUI and the user will simply hit the 'cancel' button. since in your case it does not appear to be a GUI - rather a longer-living server - you might want to handle the disconnection issues both on your side, and on the side of the server. note that for some applications, it is enough to have a 'close the socket if it was idle for X time' rule - i.e. if you didn't get any data from a socket during X minutes - you close it. Thanks, Rafi. hope this makes it clearer. --guy
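A small sketch of guy's "close the socket if it was idle for X time" suggestion: keep a last-activity timestamp per connection and sweep before each select/poll round. The conn[] bookkeeping and the IDLE_LIMIT value are illustrative assumptions:

#include <time.h>
#include <unistd.h>

#define MAX_CONNS   64
#define IDLE_LIMIT 300   /* seconds: 'X minutes' of silence before we drop it */

struct conn { int fd; time_t last_io; };
static struct conn conns[MAX_CONNS];
static int nconns;

/* call this just before each select/poll round */
void sweep_idle(void)
{
    time_t now = time(NULL);
    for (int i = 0; i < nconns; i++) {
        if (conns[i].fd >= 0 && now - conns[i].last_io > IDLE_LIMIT) {
            close(conns[i].fd);
            conns[i].fd = -1;   /* slot is free; drop it from the poll set too */
        }
    }
}

/* and after every successful read()/write() on conns[i].fd:
 *     conns[i].last_io = time(NULL);                          */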
Re: need some help with tcp/ip programming
[EMAIL PROTECTED] wrote: amos shapira wrote: Maybe using poll(2) will help you around that (I also heard that poll is generally more efficient, because it helps the kernel avoid having to re-interpret the syscall parameters on every call). this is interesting. can anyone provide more info on this?

the problem with select, is that it is unable to optimize handling of 'holes' in the file descriptor set. suppose that you need to select on file descriptors 2 and 4000. you need to pass info about all file descriptors up to 4000 (i.e. many '0' bits, and only two '1' bits, in the different select sets). with poll, you pass an array of the descriptors you care about. so the size of the array is proportional to the amount of descriptors you are interested in, while with select it is proportional to the numeric value of the largest descriptor you are interested in. note that this is relevant only for applications that have many open sockets.

when you use poll, you can use the trick of having 2 threads - one polls on idle sockets (i.e. sockets that did not have I/O in the last X seconds), and one listens on 'active' sockets (i.e. sockets that had I/O in the last X seconds). this avoids the major problem with both select and poll - that after an event on a single socket, the info for all the sockets has to be copied to user space (when select/poll returns), and then to kernel space again (when invoking poll/select again). i think that people added epoll support in order to avoid waking the poll function altogether - by receiving a signal from the kernel with the exact info, instead of having to return from poll. --guy
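To illustrate the 'holes' problem guy describes: watching just descriptors 2 and 4000 with select() still makes the kernel scan every slot up to the highest one. A toy sketch (note that the default FD_SETSIZE of 1024 would actually have to be raised for fd 4000 - an assumption here):

#include <sys/select.h>

void demo_select_holes(void)
{
    fd_set rfds;

    FD_ZERO(&rfds);
    FD_SET(2, &rfds);
    FD_SET(4000, &rfds);   /* assumes FD_SETSIZE was raised above 4000 */

    /* nfds must be highest-fd + 1, so the kernel walks all 4001 slots: */
    select(4000 + 1, &rfds, NULL, NULL, NULL);

    /* with poll(), the same interest set is just a 2-element pollfd array */
}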
Re: need some help with tcp/ip programming
Amos Shapira wrote: On 14/05/07, *guy keren* [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: Rafi Cohen wrote: Reading some documentation on tcp/ip programming, I had the impression that the select mechanism should detect such a remote disconnect event, thus enabling me to make a further read from this socket, which should end in reading 0 bytes. Reading 0 bytes should indicate disconnection, and let me disconnect properly from my side and try to reconnect. However, it seems that select does not detect all those disconnect events, and even worse, I can not see any rule behind when it does detect this and when it does not. select does not notice disconnections. it only notices if the socket was closed by the remote side. that's a completely different issue, and that's also the only time when you get a 0 return value from the read() system call. I think you are tinkering with semantics and so miss the real issue (do you work as a consultant? :).

did you write that to rafi or to me? i'm not dealing with semantics - i am dealing with a real problem, that stable applications have to deal with - when the network breaks, and you never get the close from the other side.

Basically - Rafi expects (as he should) that a read(fd,...)==0 after a select(2) call that indicated activity on fd means that the other side has closed the connection. if this is what he expects, then, indeed, this is what happens.

Alas - I think that I've just read not long ago that there is a bug in Linux' select in implementing just that, and it might miss the close from the other side sometimes. what you are describing here sounds astonishing - that such a basic feature of the sockets implementation is broken? i find this hard to believe, without clear evidence. (sorry, can't find a reference with a quick google; the closest I got might be: http://forum.java.sun.com/thread.jspa?threadID=767657&messageID=4386218). I don't remember what was the work-around for that. you're describing an issue with the JVM - not with linux. i never encountered such a problem when doing socket programming in C or C++. if you can find something clearer about this, that will be very interesting.

Another point to check - does the read(2) after select(2) return an error? See select_tut(2) for more details on how to program with select - you should check for errors as well, instead of just assuming that read(2) must succeed (e.g. interrupt). Also, while you are at it - check whether pselect(2) can help you improve your program's robustness. Maybe using poll(2) will help you around that (I also heard that poll is generally more efficient, because it helps the kernel avoid having to re-interpret the syscall parameters on every call). it helps avoid copying too much data to/from kernel space on a sparse sockets list, and it helps avoid having to scan large sets in the kernel, to initialize its own internal data structures. --guy
RE: need some help with tcp/ip programming
Thank you very much Guy, and sorry for not writing the text in an appropriate way. Usually, I reply above the original message, but this time I tried to mix my comments close to your text so that they make sense and you don't lose the context. Next time I'll try to do better. Thanks for the most valuable information, Rafi.

-Original Message- From: guy keren [mailto:[EMAIL PROTECTED] Sent: Monday, May 14, 2007 11:18 PM To: Rafi Cohen Cc: '[EMAIL PROTECTED] Org. Il' Subject: Re: need some help with tcp/ip programming

(rafi - your quoting mixes your text with mine - you might want to fix this - it was very hard to read your letter). see my comments below:

Rafi Cohen wrote: Hi Guy Rafi Cohen wrote: So, I have a couple of questions and I'll most appreciate any assistance. 1. Would you confirm that select, indeed, does not detect each and every remote disconnection, and do you know if there is a rule behind those cases?

TCP/IP stacks on linux, by default, will only notice a network disconnection (i.e. the network went down) reliably after 2 hours. that's how TCP/IP's internal keep-alive mechanism is set. 2 hours is a completely impractical value for any sane system you might develop. you can tweak this parameter of the TCP/IP stack for specific sockets, on current linux kernels, using a socket option (man 7 tcp - and look for 'keepalive' - there are 3 parameters for this). i never used this mechanism, since it was only possible to make this change globally when i needed it.

and rafi responds: I know of this option and will look into it deeper. Maybe I'm missing something here, but this option seems relevant for the case where my application is a server. I wonder how it would affect things, if at all, when my application is a client.

it does not matter if you have a client or a server - you want to know about network problems either way - unless you assume that the client is a GUI and the user will simply hit the 'cancel' button. since in your case it does not appear to be a GUI - rather a longer-living server - you might want to handle the disconnection issues both on your side, and on the side of the server. note that for some applications, it is enough to have a 'close the socket if it was idle for X time' rule - i.e. if you didn't get any data from a socket during X minutes - you close it.

Thanks, Rafi.

hope this makes it clearer. --guy
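For reference, here is a sketch of what the per-socket keepalive tuning from man 7 tcp might look like on a current linux kernel; the function name and the timing values are arbitrary examples, not recommendations:

    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>

    /* minimal sketch: enable keepalive on one socket and shorten the
       default 2-hour detection window to roughly 90 seconds. */
    int enable_keepalive(int sock)
    {
        int on = 1, idle = 60, interval = 10, count = 3;

        if (setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
            return -1;
        /* start probing after 60 idle seconds */
        if (setsockopt(sock, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) < 0)
            return -1;
        /* send a probe every 10 seconds */
        if (setsockopt(sock, IPPROTO_TCP, TCP_KEEPINTVL, &interval, sizeof(interval)) < 0)
            return -1;
        /* declare the peer dead after 3 unanswered probes */
        if (setsockopt(sock, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count)) < 0)
            return -1;
        return 0;
    }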
Re: need some help with tcp/ip programming
On 15/05/07, guy keren [EMAIL PROTECTED] wrote: I think you are tinkering with semantics and so miss the real issue (do you work as a consultant? :).

did you write that to rafi or to me? i'm not dealing with semantics - i am dealing with a real problem, that stable applications have to deal with - when the network breaks, and you never get the close from the other side.

I wrote this to you, Guy. Rafi maybe used "disconnect" when he basically meant that the TCP connection went down from the other side, while you seemed to hang on "disconnect" being defined as "cable eaten by an alligator" :). As long as Rafi feels happy about the replies, that's not relevant any more, IMHO.

Alas - I think that I've just read not long ago that there is a bug in Linux' select in implementing just that and it might miss the close from the other side sometimes

what you are describing here sounds astonishing - that such a basic feature of the sockets implementation is broken? i find this hard to believe, without clear evidence.

Here is something about what I read before; it's the other way around, and possibly only relevant to UDP, but I'm not sure - if a packet arrives with a bad CRC, it's possible that the FD will be marked as ready to read by select, but then the packet will be discarded (because of the CRC error), and when the process reads the socket it won't get anything. That would make the process get a 0 read right after select, which does NOT indicate a close from the other side. http://www.uwsg.indiana.edu/hypermail/linux/kernel/0410.2/0001.html I don't know what would be a select(2)-based work-around, if required at all.

(sorry, can't find a reference with a quick google, closest I got to might be: http://forum.java.sun.com/thread.jspa?threadID=767657&messageID=4386218). I don't remember what was the work-around to that.

you're describing an issue with the JVM - not with linux. i never encountered such a problem when doing socket programming in C or C++. if you can find something clearer about this, that will be very interesting.

Yes, it was a JVM bug, but it mentioned differences on Linux vs. other POSIX systems so I thought it might be related.

it helps avoid copying too much data to/from kernel space for a sparse sockets list, and it helps avoid having to scan large sets in the kernel to initialize its own internal data structures.

Actually, epoll looks really cool, and Boost's ASIO seems to provide a portable C++ interface around it: http://asio.sourceforge.net/ On the other hand - if you are listening on many FDs which turn out to be ready, then epoll apparently loses, because it requires a syscall (or kernel intervention) on every single FD, making select(2) (/poll(2)?) more attractive.

Cheers, --Amos
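For readers who haven't met epoll yet, a minimal sketch of the interface being discussed: each descriptor is registered once with epoll_ctl, and epoll_wait returns only the ready ones. The handle_io helper is a made-up placeholder:

    #include <sys/epoll.h>

    /* minimal epoll sketch: register one socket, then loop forever. */
    void event_loop(int sock, void (*handle_io)(int fd))
    {
        int epfd = epoll_create(64);    /* the size argument is only a hint */
        struct epoll_event ev, events[64];

        ev.events = EPOLLIN;
        ev.data.fd = sock;
        epoll_ctl(epfd, EPOLL_CTL_ADD, sock, &ev);

        for (;;) {
            int n = epoll_wait(epfd, events, 64, -1);  /* -1 = wait forever */
            for (int i = 0; i < n; i++)
                handle_io(events[i].data.fd);  /* only ready fds come back */
        }
    }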
Re: need some help with tcp/ip programming
Amos Shapira wrote: On 15/05/07, *guy keren* [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: I think you are tinkering with semantics and so miss the real issue (do you work as a consultant? :).

did you write that to rafi or to me? i'm not dealing with semantics - i am dealing with a real problem, that stable applications have to deal with - when the network breaks, and you never get the close from the other side.

I wrote this to you, Guy. Rafi maybe used "disconnect" when he basically meant that the TCP connection went down from the other side, while you seemed to hang on "disconnect" being defined as "cable eaten by an alligator" :).

let's leave this subject. i brought it up because many programmers new to socket programming are surprised by the fact that a network disconnection does not cause the socket to close, and that the connection may stay there for hours.

As long as Rafi feels happy about the replies, that's not relevant any more, IMHO.

Alas - I think that I've just read not long ago that there is a bug in Linux' select in implementing just that and it might miss the close from the other side sometimes

what you are describing here sounds astonishing - that such a basic feature of the sockets implementation is broken? i find this hard to believe, without clear evidence.

Here is something about what I read before; it's the other way around, and possibly only relevant to UDP, but I'm not sure - if a packet arrives with a bad CRC, it's possible that the FD will be marked as ready to read by select, but then the packet will be discarded (because of the CRC error), and when the process reads the socket it won't get anything. That would make the process get a 0 read right after select, which does NOT indicate a close from the other side. http://www.uwsg.indiana.edu/hypermail/linux/kernel/0410.2/0001.html I don't know what would be a select(2)-based work-around, if required at all.

first, it does not return a '0 read'. this situation could have two different effects, depending on the blocking mode of the socket. if the socket is in blocking mode (the default mode) - select() might state there's data to be read, but recvmsg (or read) will block. if the socket is in non-blocking mode - select() might state there's data to be read, but recvmsg (or read) will return with -1, and errno set to EAGAIN. in neither case will read return 0. the only time that read is allowed to return 0 is when it encounters an EOF. for a socket, this happens ONLY if the other side closed the sending side of the connection.

of course, whenever i did select-based socket programming, i always set the sockets to non-blocking mode. this requires some careful programming, to avoid busy-waits, but it's the only way to guarantee fully non-blocking behaviour. and people should also note that the socket should be set to non-blocking mode before calling connect, and be ready to handle the peculiar way that the connect call works for non-blocking sockets. doing socket programming without referencing stevens' latest TCP/IP book is foolish.

(sorry, can't find a reference with a quick google, closest I got to might be: http://forum.java.sun.com/thread.jspa?threadID=767657&messageID=4386218). I don't remember what was the work-around to that.

you're describing an issue with the JVM - not with linux.
i never encountered such a problem when doing socket programming in C or C++. if you can find something clearer about this, that will be very interesting.

Yes, it was a JVM bug, but it mentioned differences on Linux vs. other POSIX systems so I thought it might be related.

probably not in this case, because the problem you originally described most likely does not exist. the other way around does exist, if one uses blocking sockets. but then again, no one uses blocking sockets in server software, unless they have a pair of reader+writer threads per socket - and even that may cause problems when shutting down the application.

it helps avoid copying too much data to/from kernel space for a sparse sockets list, and it helps avoid having to scan large sets in the kernel to initialize its own internal data structures.

Actually, epoll looks really cool, and Boost's ASIO seems to provide a portable C++ interface around it: http://asio.sourceforge.net/ On the other hand - if you are listening on many FDs which turn out to be ready, then epoll apparently loses, because it requires a syscall (or kernel intervention) on every single FD, making select(2) (/poll(2)?) more attractive.

besides epoll being non-portable, and thus it doesn't get used too
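To illustrate the "peculiar way" connect works for a non-blocking socket, here is a minimal sketch (error paths trimmed, function name invented): connect usually returns -1 with errno set to EINPROGRESS, the socket becomes writable when the attempt completes, and the final result is collected via SO_ERROR:

    #include <errno.h>
    #include <fcntl.h>
    #include <sys/select.h>
    #include <sys/socket.h>

    /* minimal sketch: connect a non-blocking socket and wait for the result */
    int connect_nonblocking(int sock, const struct sockaddr *addr, socklen_t alen)
    {
        fcntl(sock, F_SETFL, fcntl(sock, F_GETFL, 0) | O_NONBLOCK);

        if (connect(sock, addr, alen) == 0)
            return 0;                   /* connected immediately (rare) */
        if (errno != EINPROGRESS)
            return -1;                  /* immediate failure */

        /* attempt is in flight: wait until the socket becomes writable */
        fd_set wfds;
        FD_ZERO(&wfds);
        FD_SET(sock, &wfds);
        if (select(sock + 1, NULL, &wfds, NULL, NULL) <= 0)
            return -1;

        /* fetch the final status of the connection attempt */
        int err = 0;
        socklen_t elen = sizeof(err);
        if (getsockopt(sock, SOL_SOCKET, SO_ERROR, &err, &elen) < 0 || err != 0)
            return -1;
        return 0;
    }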
Re: need some help with tcp/ip programming
On 15/05/07, guy keren [EMAIL PROTECTED] wrote: Amos Shapira wrote: On 15/05/07, *guy keren* [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: I think you are tinkering with semantics and so miss the real issue (do you work as a consultant? :).

did you write that to rafi or to me? i'm not dealing with semantics - i am dealing with a real problem, that stable applications have to deal with - when the network breaks, and you never get the close from the other side.

I wrote this to you, Guy. Rafi maybe used "disconnect" when he basically meant that the TCP connection went down from the other side, while you seemed to hang on "disconnect" being defined as "cable eaten by an alligator" :).

let's leave this subject. i brought it up because many programmers new to socket programming are surprised by the fact that a network disconnection does not cause the socket to close, and that the connection may stay there for hours.

As long as Rafi feels happy about the replies, that's not relevant any more, IMHO.

Alas - I think that I've just read not long ago that there is a bug in Linux' select in implementing just that and it might miss the close from the other side sometimes

what you are describing here sounds astonishing - that such a basic feature of the sockets implementation is broken? i find this hard to believe, without clear evidence.

Here is something about what I read before; it's the other way around, and possibly only relevant to UDP, but I'm not sure - if a packet arrives with a bad CRC, it's possible that the FD will be marked as ready to read by select, but then the packet will be discarded (because of the CRC error), and when the process reads the socket it won't get anything. That would make the process get a 0 read right after select, which does NOT indicate a close from the other side. http://www.uwsg.indiana.edu/hypermail/linux/kernel/0410.2/0001.html I don't know what would be a select(2)-based work-around, if required at all.

first, it does not return a '0 read'. this situation could have two different effects, depending on the blocking mode of the socket. if the socket is in blocking mode (the default mode) - select() might state there's data to be read, but recvmsg (or read) will block. if the socket is in non-blocking mode - select() might state there's data to be read, but recvmsg (or read) will return with -1, and errno set to EAGAIN. in neither case will read return 0. the only time that read is allowed to return 0 is when it encounters an EOF. for a socket, this happens ONLY if the other side closed the sending side of the connection.

Is there an on-line reference (or a manual page) to support this? From what I remember about select, the definition of it returning a "ready to read" bit set is that the next read won't block - which will be true for non-blocking sockets at any time, and therefore they weren't encouraged together with select.

of course, whenever i did select-based socket programming, i always set the sockets to non-blocking mode. this requires some careful programming, to avoid busy-waits, but it's the only way to guarantee fully non-blocking behaviour. and people should also note that the socket should be set to non-blocking mode before calling connect, and be ready to handle the peculiar way that the connect call works for non-blocking sockets.

Also there is the issue of signals. If you want robust programs then you'll have to use pselect.

doing socket programming without referencing stevens' latest TCP/IP book is foolish.
Sorry for being foolish, I learned TCP/IP from RFCs and socket programming from BSD 4.2 sources in '86; Stevens' book wasn't available then. :^) I have since read the early editions of his books (circa early 90's; I remember reading one volume while the later ones were still in the making), but it's been a while since I had to write a complete C socket program with select in earnest, and I accept that some interfaces may have changed over the years.

These days, with pthreads being mainstream, I'd consider using multiple threads. select() is nice when you absolutely *must* use a single thread (which was the case back when pthreads hadn't been invented yet, or later when the various UNIX versions had their own ideas about thread APIs), but if you have so many connections that multiple threads will become a problem, then a single thread having to cycle through all these connections one by one will also slow things down. Not to mention the signal problem, and just generally the fact that one connection taking too much time to handle will slow the handling of other connections. A possible go-between might be to select/poll on multiple FDs and then hand the work to threads from a thread pool, but such a job would be justifiable only for a large number of connections, IMHO.

If you insist on using a single thread then select seems to be the underdog today - poll is just as portable (AFAIKT), and Boost ASIO (and I'd expect ACE) allows making portable code which uses the superior API's such as epoll/kqueue/dev/poll.
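As an aside, a minimal sketch of the pselect(2) idiom behind "the signal problem": the signal stays blocked while the flag is tested, and pselect restores the original mask atomically only while it sleeps, so a signal can't slip in between the check and the wait. The names (on_sigterm, got_sigterm, sock) are invented for the example:

    #include <signal.h>
    #include <sys/select.h>

    static volatile sig_atomic_t got_sigterm;

    static void on_sigterm(int signum)
    {
        (void)signum;
        got_sigterm = 1;
    }

    /* event loop sketch; 'sock' is an assumed connected socket */
    void serve(int sock)
    {
        sigset_t blocked, orig;

        sigemptyset(&blocked);
        sigaddset(&blocked, SIGTERM);
        sigprocmask(SIG_BLOCK, &blocked, &orig);   /* SIGTERM now held off */
        signal(SIGTERM, on_sigterm);

        while (!got_sigterm) {                     /* safe: SIGTERM is blocked here */
            fd_set rfds;
            FD_ZERO(&rfds);
            FD_SET(sock, &rfds);
            /* the original mask is restored only while pselect sleeps */
            if (pselect(sock + 1, &rfds, NULL, NULL, NULL, &orig) > 0) {
                /* ... read from sock and handle the data ... */
            }
        }
    }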
Re: need some help with tcp/ip programming
guy keren wrote: Amos Shapira wrote: On 14/05/07, *guy keren* [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: Rafi Cohen wrote: Reading some documentation on tcp/ip programming, I had the impression that the select mechanism should detect such remote disconnect events, thus enabling me to make a further read from this socket which should end in reading 0 bytes. Reading 0 bytes should indicate disconnection and let me disconnect properly from my side and try to reconnect. However, it seems that select does not detect all those disconnect events and, even worse, I can not see any rule behind when it does detect this and when it does not.

select does not notice disconnections. it only notices if the socket was closed by the remote side. that's a completely different issue, and that's also the only time when you get a 0 return value from the read() system call.

I think you are tinkering with semantics and so miss the real issue (do you work as a consultant? :).

did you write that to rafi or to me? i'm not dealing with semantics - i am dealing with a real problem, that stable applications have to deal with - when the network breaks, and you never get the close from the other side.

Basically - Rafi expects (as he should) that a read(fd,...)==0 after a select(2) call that indicated activity on fd means that the other side has closed the connection.

if this is what he expects then, indeed, this is what happens.

Alas - I think that I've just read not long ago that there is a bug in Linux' select in implementing just that and it might miss the close from the other side sometimes

what you are describing here sounds astonishing - that such a basic feature of the sockets implementation is broken? i find this hard to believe, without clear evidence.

Jumping in in the middle here: I don't have any clear evidence (or any evidence at all) for what Amos was talking about, but I did run across this worrying change in Wine: http://www.winehq.org/pipermail/wine-cvs/2006-November/027552.html Now, it is totally unclear to me whether the fd leak in question is a result of a Wine bug around select, or of select itself. This may, after all, prove to be nothing important. Then again, given that all Wine used to do was translate the Windows version of select to the almost identical Linux version, I find this worrying.

Shachar
Re: need some help with tcp/ip programming
Amos Shapira wrote: in neither case will read return 0. the only time that read is allowed to return 0, is when it encounters an EOF. for a socket, this happens ONLY if the other side closed the sending-side of the connection. Is there an on-line reference (or a manual page) to support this?

man 2 read

From what I remember about select, the definition of it returning a ready to read bit set is the next read won't block, which will be true for non-blocking sockets any time and therefore they weren't encouraged together with select.

I believe you two are talking about two different things here. There is a world of difference between UDP and TCP in that regard. UDP is connectionless. This means that read's error codes relate only to the latest packet received. UDP also doesn't have a 100% clear concept of what CRC/checksum actually means. I still think it's a bug for select to report activity on a socket that merely received a packet with a bad checksum (there is no CRC in TCP/IP) - after all, how do you even know it was intended for this socket? In TCP, on the other hand, a read relates to the connection; packets in TCP are incidental. Under TCP, read returning 0 means just one thing - the connection is closed.

if you have so many connections that multiple threads will become a problem then a single thread having to cycle through all these connections one by one will also slow things down.

No, my experience begs to differ here. When I tested netchat (http://sourceforge.net/projects/nch), I found out that a single thread had no problem saturating the machine's capacity for network in/out communication. As long as your per-socket handling does not require too much processing to slow you down, merely cycling through the sockets will not be the problem, if you are careful enough. With netchat, I used libevent for that (see further on for details), so I was using epoll. Your mileage may vary with the other technologies.

Not to mention the signal problem and just generally the fact that one connection taking too much time to handle will slow the handling of other connections.

Yes, it is probably better to use a single thread that does the event waiting, and a thread pool for the actual processing. Having one thread per socket, however, is not a wise idea IMHO.

A possible go-between might be to select/poll on multiple FD's then handing the work to threads from a thread pool, but such a job would be justifiable only for a large number of connections, IMHO.

It's not that difficult to pull off, and I believe your analysis failed to account for the overhead of creating a new thread for each new connection, as well as destroying the threads of connections that are no longer needed.

If you insist on using a single thread then select seems to be the underdog today - poll is just as portable (AFAIKT), and Boost ASIO (and I'd expect ACE) allows making portable code which uses the superior API's such as epoll/kqueue/dev/poll.

Personally, I use libevent (http://www.monkey.org/~provos/libevent/), which has the advantage of having C linkage (ASIO is C++, as is ACE). It also has a standard license (three-clause BSD). I skimmed over the Boost license, and it doesn't seem problematic, but I don't like people creating new licenses with no clear justification.

Shachar
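For readers who haven't seen libevent, a minimal sketch along the lines described above, using the 1.x C API; the creation of 'sock' and the processing inside on_readable are placeholders:

    #include <sys/types.h>
    #include <unistd.h>
    #include <event.h>

    /* called by libevent whenever 'fd' becomes readable */
    static void on_readable(int fd, short what, void *arg)
    {
        char buf[4096];
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n <= 0) {
            /* 0 = peer closed, <0 = error: stop watching and clean up */
            event_del((struct event *)arg);
            close(fd);
            return;
        }
        /* ... process the n bytes in buf ... */
    }

    /* 'sock' is an assumed already-connected, non-blocking socket */
    int run_loop(int sock)
    {
        static struct event ev;

        event_init();                        /* set up the default event base */
        event_set(&ev, sock, EV_READ | EV_PERSIST, on_readable, &ev);
        event_add(&ev, NULL);                /* NULL = no timeout */
        return event_dispatch();             /* loop until no events remain */
    }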