On Wed, May 16, 2007, Rafi Cohen wrote about "RE: need some help with tcp/ip programming":
> Hi Nadav, well, in my case I do not have thousands of concurrent
> connections, about 20-30 only.
In this case, the performance difference between an implementation using select, poll, epoll, or even one thread per connection, should be very small. The last one (one thread from a pre-existing pool, per connection) is the easiest to implement, and may very well be enough for your needs.

Let me give you an example of when using a single thread for all connections (with epoll or one of its friends) is important, rather than having a thread (or process) per connection. Imagine a Web server serving static files. Now imagine that (as is the case in a realistic situation) a large chunk of your clients have slow connections, and each fetches around 10 KB a second from you. Your machine's CPU, disk, kernel, and network card can easily handle 1,000 such connections concurrently (all these connections together move just 10 MB per second). But you could not conceivably have a thread (or worse, a process) for each of these connections, because threads have significant overheads. If, for example, each thread takes up "just" 2 MB of memory (for its stack, kernel structures, and perhaps other things), these 1,000 connections take up 2 GB of memory (and now, try to imagine moving to a 1 Gbit network card, and hoping that you could serve 5 times more concurrent connections...).

The point is that all these threads don't do anything most of the time, and just wait for their chance to send their 10 KB every second. It is much more efficient - in speed and certainly in memory - to have a single thread which calls epoll (or whatever) to find the next connection that is ready (for writing, reading, open or close) and processes it. Or, of course, if your machine is an SMP with N CPUs, then you should have N such threads.

> However, in some cases the input from those sockets is actually queries
> to a database and it may also end in operations on this database. Not a
> heavy database, but still insert/delete/update is done occasionally.
> Now, if I understand you correctly, you say that using a single thread
> with epoll, even with many concurrent connections will not decrease
> performance.

Writing a pure single-threaded server is *very hard*. You must take extreme care not to wait, ever. If your server needs to wait to get the content it needs to serve, e.g., to read the content from disk or to get it from a database, then other connections are not being served in the meantime, and your CPU is being wasted!

This is why such servers usually have complex state machines. For example, when you get a database request from the client, you do not send it to the database and wait for the result. Rather, you remember that this connection is in a "sending to database" state, remember the command you need to send to the DB, and add the database connection to the poll list. When the database connection is ready for writing, you write the command to it (you may not be able to do it in a single go, if it's a long command). Then you start waiting to read the result from the DB, and as it arrives you send it on to the client (remembering to do flow control - if the client is not ready to be sent to, don't read from the database response).

Doing all this is extremely complex, and you wouldn't want to do it except in extreme cases, when performance is of utmost importance and you're expecting thousands of concurrent connections.

A simpler approach in your case is to have a hybrid server: a single thread using a poll (or whatever) loop waits for these "commands", and when it gets a command, it hands it to a small thread pool that acts on these commands. In some situations this can be more efficient than the straightforward one-thread-per-connection server - imagine for example that you have 30 concurrent connections, but each sends just one command a second, which takes 0.1 seconds to process - in this case a thread pool with just 3 threads might be enough.
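To make the hybrid design a bit more concrete, here is a minimal sketch in Python, not a definitive implementation. The names serve, run_command, handle_command and pool_size_for are made up for illustration, handle_command just sleeps 0.1 seconds to stand in for the database work, and the sketch assumes that every recv() returns exactly one complete command, which a real server cannot rely on.

```python
import selectors
import socket
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def pool_size_for(connections, commands_per_second, ms_per_command):
    """Average number of threads kept busy (hypothetical helper)."""
    busy_ms = connections * commands_per_second * ms_per_command
    return max(1, -(-busy_ms // 1000))  # integer ceiling division


def handle_command(command):
    """Hypothetical slow work; the sleep stands in for a database call."""
    time.sleep(0.1)
    return b"done: " + command.strip() + b"\n"


def run_command(conn, command):
    """Runs in a pool thread, so blocking here does not stall the loop."""
    try:
        conn.sendall(handle_command(command))
    except OSError:
        pass  # the client went away before we could answer


def serve(listener, pool_size, stop):
    """One thread waits on all sockets; commands go to a small pool."""
    sel = selectors.DefaultSelector()  # epoll on Linux, when available
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ)
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        while not stop.is_set():
            for key, _events in sel.select(timeout=0.1):
                if key.fileobj is listener:
                    try:
                        conn, _addr = listener.accept()
                    except BlockingIOError:
                        continue  # the pending connection vanished
                    sel.register(conn, selectors.EVENT_READ)
                    continue
                conn = key.fileobj
                try:
                    data = conn.recv(4096)
                except OSError:
                    data = b""
                if data:
                    # Assumption: one recv() == one whole command.
                    pool.submit(run_command, conn, data)
                else:
                    sel.unregister(conn)
                    conn.close()
    # Leaving the "with" block waits for commands still in progress.
    for key in list(sel.get_map().values()):
        if key.fileobj is not listener:
            key.fileobj.close()
    sel.close()
```

With the numbers above, pool_size_for(30, 1, 100) gives 3: 30 connections times one command a second times 0.1 seconds is 3 seconds of work per second. That is only the average load, so bursts would need some headroom. Note also that two commands from the same connection may be answered out of order in this sketch.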
Of course, like I said, if you're only expecting 30 concurrent connections, I would suggest that you just use the simplest approach: have one thread per connection. You will never notice the difference (I believe).

--
Nadav Har'El | Wednesday, May 16 2007, 28 Iyyar 5767
http://nadav.harel.org.il | A Life? Cool! Where can I download one of those from?