Re: What's the cost of using hundreds of threads?
> I'm a bit confused by your math. Fifty connections should be 102 threads, which is quite reasonable.

My formula applies to one forwarded ('load-balanced') connection. Every such connection creates n further connections (pipes) which share the load. Every pipe requires two threads to be spawned, and every 'main connection' spawns two more threads of its own. So my formula, 2*pipes+2, gives the number of threads spawned per 'main connection'. Now if conn_count connections are established, the thread count equals:

conn_count * threads_per_main_connection = conn_count * (2*pipes+2)

For 50 connections and about 10 pipes that gives 1100 threads.

> My experience with lots of threads dates back to Python 1.5.2, but I rarely saw much improvement with more than a hundred threads, even for heavily I/O-bound applications on a multi-CPU system. However, if your focus is algorithmic complexity, you should be able to handle a couple of thousand threads easily enough.

I don't spawn them for computational reasons, but because it makes my code much simpler. I use built-in TCP features to achieve load balancing: every flow (directed through a pipe) has its own dedicated threads, separate for download and upload. For every 'main connection' these threads share the send and receive buffers. If any of the pipes is congested, the corresponding threads block in their send / recv calls without affecting the independence of the other data flows.

Using threads gives me VERY simple code. To achieve this with poll / select would be much more difficult. And to guarantee concurrency and maximal throughput for all of the pipes I would probably have to mirror code from the Linux TCP stack (I mean window shifting, data acknowledgement, retransmission queues). Or perhaps I exaggerate.

-- http://mail.python.org/mailman/listinfo/python-list
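For readers following the arithmetic, here is a minimal sketch of the formula in Python. The function names are made up for illustration and are not from the poster's program. The last line shows one presumable source of the "102 threads" figure: reading the 50 connections as the pipes of a single main connection.

```python
def threads_per_main_connection(pipes: int) -> int:
    """Two threads per pipe, plus two for the main connection itself."""
    return 2 * pipes + 2


def total_threads(conn_count: int, pipes: int) -> int:
    """Threads needed when conn_count main connections each use `pipes` pipes."""
    return conn_count * threads_per_main_connection(pipes)


if __name__ == "__main__":
    print(threads_per_main_connection(10))  # 22 threads per main connection
    print(total_threads(50, 10))            # 1100 threads for 50 connections
    # Treating 50 as the pipe count of ONE main connection gives 102 instead.
    print(threads_per_main_connection(50))  # 102
```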
Re: What's the cost of using hundreds of threads?
Thanks for your comments on the WinXP thread implementation. You've confirmed my conviction that I shouldn't use Windows. Personally I use Linux with a 2.6.10 kernel, so hopefully I don't have to share your grief. ;)
Re: What's the cost of using hundreds of threads?
Steve Holden wrote:
> Apache, for example, can easily spawn more threads under Windows, and I've written code that uses 200 threads with excellent performance. Things seem to slow down around the 2,000 mark for some reason I'm not familiar with.

As far as I know, the default Windows thread stack size is 2 MB. Do the math :)

On NT4, beyond a couple of hundred threads a *heck* of a lot of time ends up being spent in the kernel doing context switches (and you can kiss even vaguely deterministic response times good-bye). Using a more recent version of Windows improves matters significantly.

Cheers,
Nick.

--
Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
http://boredomandlaziness.skystorm.net
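"Do the math" can be spelled out in a few lines. This is only a sketch: the 2 MB figure is the poster's recollection (Microsoft's documentation gives 1 MB as the linker's default reservation), so the stack size is a parameter here, and `stack_reservation_mb` is a name invented for the example. The comparison point is the 2 GB of user address space a 32-bit Windows process gets by default. The final lines use `threading.stack_size()`, which was added in Python 2.5, after this thread was written.

```python
import threading

MB = 1024 * 1024


def stack_reservation_mb(thread_count: int, stack_size_bytes: int) -> float:
    """Address space reserved for thread stacks alone, in MB."""
    return thread_count * stack_size_bytes / MB


if __name__ == "__main__":
    for n in (200, 1100, 2000):
        # Stack size is an input: 2 MB is the poster's figure, 1 MB the documented default.
        print(n, "threads:",
              stack_reservation_mb(n, 1 * MB), "MB at 1 MB each,",
              stack_reservation_mb(n, 2 * MB), "MB at 2 MB each")

    # Optionally request smaller stacks for threads created from now on.
    # 256 KiB is an arbitrary illustrative value, not a recommendation.
    try:
        threading.stack_size(256 * 1024)
        print("reserved with 256 KiB stacks:",
              stack_reservation_mb(2000, 256 * 1024), "MB")
    except (ValueError, RuntimeError) as exc:
        print("could not change the stack size on this platform:", exc)
```

At either stack size, 2,000 threads reserve on the order of the whole 2 GB user address space, which is one plausible explanation for trouble around the 2,000 mark.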
Re: What's the cost of using hundreds of threads?
In article [EMAIL PROTECTED], Przemysław Różycki [EMAIL PROTECTED] wrote:
> Thanks for your comments on winXP threads implementation. You confirmed me in conviction that I shouldn't use windows. Personally I use linux with 2.6.10 kernel, so hopefully I don't have to share your grief. ;)

? !? I'm confused, and apparently I'm confusing others. The one message I posted in this thread--largely reinforced by others--emphasizes only that WinXP is far *better* than earlier Win* flavors in its thread management. While I not only agree that Windows has disadvantages, but have stopped buying it for our company, my reasons have absolutely nothing to do with the details of implementation of WinXP.
Re: What's the cost of using hundreds of threads?
> In article [EMAIL PROTECTED], Przemysław Różycki [EMAIL PROTECTED] wrote:
>> Thanks for your comments on winXP threads implementation. You confirmed me in conviction that I shouldn't use windows. Personally I use linux with 2.6.10 kernel, so hopefully I don't have to share your grief. ;)
>
> ? !? I'm confused, and apparently I'm confusing others. The one message I posted in this thread--largely reinforced by others--emphasizes only that WinXP is far *better* than earlier Win* flavors in its thread management. While I not only agree that Windows has disadvantages, but have stopped buying it for our company, my reasons have absolutely nothing to do with the details of implementation of WinXP.

:) OK, perhaps my answer wasn't that precise. I wrote my post only to say that the discussion of Windows threading performance doesn't concern me, because my program is written for a Linux environment. And yes, I agree that my comment could sound a bit enigmatic.
Re: What's the cost of using hundreds of threads?
In article [EMAIL PROTECTED], Przemysław Różycki [EMAIL PROTECTED] wrote:
> I don't spawn them for computational reasons, but because it makes my code much simpler. I use built-in TCP features to achieve load balancing: every flow (directed through a pipe) has its own dedicated threads, separate for download and upload. For every 'main connection' these threads share the send and receive buffers. If any of the pipes is congested, the corresponding threads block in their send / recv calls without affecting the independence of the other data flows.
>
> Using threads gives me VERY simple code. To achieve this with poll / select would be much more difficult. And to guarantee concurrency and maximal throughput for all of the pipes I would probably have to mirror code from the Linux TCP stack (I mean window shifting, data acknowledgement, retransmission queues). Or perhaps I exaggerate.

Maybe it would help if you explained what these pipes do. Based on what you've said so far, what I'd do in your situation is create one thread per pipe and one thread per connection, then use Queue to move data between threads.

--
Aahz ([EMAIL PROTECTED]) * http://www.pythoncraft.com/

The joy of coding Python should be in seeing short, concise, readable classes that express a lot of action in a small amount of clear code -- not in reams of trivial code that bores the reader to death. --GvR
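A rough sketch of the layout suggested above (one thread per connection, one thread per pipe, a queue between them) might look like the following. It covers one direction of traffic only and uses no real sockets: "sending" is simulated by appending to a list. The names `connection_reader`, `pipe_sender`, `run` and `STOP` are invented for the example, and the module is spelled `queue` in Python 3 (it was `Queue` when this thread was written).

```python
import queue
import threading

STOP = object()  # sentinel telling a pipe thread to exit


def connection_reader(chunks, outbound):
    """One thread per main connection: hand each chunk to the shared queue."""
    for chunk in chunks:
        outbound.put(chunk)  # blocks while the queue is full (back-pressure)


def pipe_sender(pipe_id, outbound, sent, lock):
    """One thread per pipe: take whichever chunk is next and 'send' it."""
    while True:
        chunk = outbound.get()
        if chunk is STOP:
            break
        with lock:
            sent.append((pipe_id, chunk))  # stand-in for sock.sendall(chunk)


def run(chunks, pipe_count=3):
    """Push all chunks through pipe_count pipe threads; return what was sent."""
    outbound = queue.Queue(maxsize=8)
    sent, lock = [], threading.Lock()
    pipes = [threading.Thread(target=pipe_sender, args=(i, outbound, sent, lock))
             for i in range(pipe_count)]
    reader = threading.Thread(target=connection_reader, args=(chunks, outbound))
    for t in pipes:
        t.start()
    reader.start()
    reader.join()
    for _ in pipes:
        outbound.put(STOP)  # one sentinel per pipe thread
    for t in pipes:
        t.join()
    return sent


if __name__ == "__main__":
    for pipe_id, chunk in run([b"a", b"b", b"c", b"d", b"e"]):
        print(pipe_id, chunk)
```

A congested pipe only stalls its own thread while the others keep draining the queue, which preserves the independence the original poster wanted. Chunks may leave through different pipes out of order, so a real balancer would still need sequence numbers or similar for reassembly.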
What's the cost of using hundreds of threads?
Hello,

I have written some code which creates many threads for each connection ('main connection'). The purpose of this code is to balance the load between several connections ('pipes'). The number of spawned threads depends on how many pipes I create (= 2*n+2, where n is the number of pipes).

For good results I'll presumably share the main connection's load between 10 pipes, so 22 threads will be spawned. Now if about 50 connections are forwarded, the number of threads rises to about a thousand (or several thousand if even more connections are established).

My questions are:
- What is the cost (in memory / CPU usage) of creating such a number of threads?
- Is there any 'upper boundary' that limits the number of threads? (Is it Python / OS related?)
- Is that a sign of 'clumsy programming', i.e. is creating so many threads a bad habit? (I must say that it simplified the solution of my problem very much.)

Limiting the number of threads is possible, but would affect the independence of data flows. (OK, I admit: a tricky algorithm could perhaps guarantee concurrency without spawning so many threads, but this is the simplest solution to the problem :) ).
Re: What's the cost of using hundreds of threads?
Przemysław Różycki wrote:
> Hello,
>
> I have written some code which creates many threads for each connection ('main connection'). The purpose of this code is to balance the load between several connections ('pipes'). The number of spawned threads depends on how many pipes I create (= 2*n+2, where n is the number of pipes).
>
> For good results I'll presumably share the main connection's load between 10 pipes, so 22 threads will be spawned. Now if about 50 connections are forwarded, the number of threads rises to about a thousand (or several thousand if even more connections are established).
>
> My questions are:
> - What is the cost (in memory / CPU usage) of creating such a number of threads?
> - Is there any 'upper boundary' that limits the number of threads? (Is it Python / OS related?)
> - Is that a sign of 'clumsy programming', i.e. is creating so many threads a bad habit? (I must say that it simplified the solution of my problem very much.)
>
> Limiting the number of threads is possible, but would affect the independence of data flows. (OK, I admit: a tricky algorithm could perhaps guarantee concurrency without spawning so many threads, but this is the simplest solution to the problem :) ).

PR,

I notice there's a resource module with a getrusage(who) that looks like it would support a test to get what you need.

wes
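A test along those lines could be sketched as follows. The `resource` module is Unix-only, which suits the Linux setup described elsewhere in the thread. The helper names `max_rss` and `measure` are invented for the example, and the threads simply idle on an event rather than doing any socket work.

```python
import resource
import threading


def max_rss():
    """Peak resident set size of this process (KiB on Linux, bytes on macOS)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def measure(thread_count):
    """Return peak RSS (before, after) starting thread_count idle threads."""
    release = threading.Event()
    before = max_rss()
    threads = [threading.Thread(target=release.wait) for _ in range(thread_count)]
    for t in threads:
        t.start()
    after = max_rss()  # sampled while all threads are still alive
    release.set()      # let every thread finish
    for t in threads:
        t.join()
    return before, after


if __name__ == "__main__":
    for n in (10, 100, 1000):
        before, after = measure(n)
        print(n, "threads: peak RSS grew by", after - before)
```

Note that `ru_maxrss` reports resident memory, not reserved address space, so mostly untouched thread stacks show up as a much smaller increase than their nominal size would suggest.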
Re: What's the cost of using hundreds of threads?
Przemysław Różycki wrote:
> - Is there any 'upper boundary' that limits the number of threads? (Is it Python / OS related?)
> - Is that a sign of 'clumsy programming', i.e. is creating so many threads a bad habit? (I must say that it simplified the solution of my problem very much.)

I've read somewhere (I can't recall where, though; was it MSDN?) that Windows is not well suited to running more than 32 threads per process. Most of the code I've seen doesn't spawn more than half of that.

--
Jarek Zgoda
http://jpa.berlios.de/ | http://www.zgodowie.org/
Re: What's the cost of using hundreds of threads?
Jarek Zgoda wrote:
> Przemysław Różycki wrote:
>> - Is there any 'upper boundary' that limits the number of threads? (Is it Python / OS related?)
>> - Is that a sign of 'clumsy programming', i.e. is creating so many threads a bad habit? (I must say that it simplified the solution of my problem very much.)
>
> I've read somewhere (I can't recall where, though; was it MSDN?) that Windows is not well suited to running more than 32 threads per process. Most of the code I've seen doesn't spawn more than half of that.

This is apocryphal. Do you have any hard evidence for this assertion?

Apache, for example, can easily spawn more threads under Windows, and I've written code that uses 200 threads with excellent performance. Things seem to slow down around the 2,000 mark for some reason I'm not familiar with.

regards
Steve

--
Meet the Python developers and your c.l.py favorites March 23-25
Come to PyCon DC 2005 http://www.pycon.org/
Steve Holden http://www.holdenweb.com/
Re: What's the cost of using hundreds of threads?
In article [EMAIL PROTECTED], Przemysław Różycki [EMAIL PROTECTED] wrote:
> I have written some code which creates many threads for each connection ('main connection'). The purpose of this code is to balance the load between several connections ('pipes'). The number of spawned threads depends on how many pipes I create (= 2*n+2, where n is the number of pipes).
>
> For good results I'll presumably share the main connection's load between 10 pipes, so 22 threads will be spawned. Now if about 50 connections are forwarded, the number of threads rises to about a thousand (or several thousand if even more connections are established).

I'm a bit confused by your math. Fifty connections should be 102 threads, which is quite reasonable.

> My questions are:
> - What is the cost (in memory / CPU usage) of creating such a number of threads?
> - Is there any 'upper boundary' that limits the number of threads? (Is it Python / OS related?)
> - Is that a sign of 'clumsy programming', i.e. is creating so many threads a bad habit? (I must say that it simplified the solution of my problem very much.)
>
> Limiting the number of threads is possible, but would affect the independence of data flows. (OK, I admit: a tricky algorithm could perhaps guarantee concurrency without spawning so many threads, but this is the simplest solution to the problem :) ).

My experience with lots of threads dates back to Python 1.5.2, but I rarely saw much improvement with more than a hundred threads, even for heavily I/O-bound applications on a multi-CPU system. However, if your focus is algorithmic complexity, you should be able to handle a couple of thousand threads easily enough.

--
Aahz ([EMAIL PROTECTED]) * http://www.pythoncraft.com/

The joy of coding Python should be in seeing short, concise, readable classes that express a lot of action in a small amount of clear code -- not in reams of trivial code that bores the reader to death. --GvR
Re: What's the cost of using hundreds of threads?
In article [EMAIL PROTECTED], Steve Holden [EMAIL PROTECTED] wrote:
> . . .
>> I've read somewhere (I can't recall where, though; was it MSDN?) that Windows is not well suited to running more than 32 threads per process. Most of the code I've seen doesn't spawn more than half of that.
>
> This is apocryphal. Do you have any hard evidence for this assertion?
>
> Apache, for example, can easily spawn more threads under Windows, and I've written code that uses 200 threads with excellent performance. Things seem to slow down around the 2,000 mark for some reason I'm not familiar with.
> . . .

I'll support Mr. Zgoda's apocrypha. The thing is, as so often obtains, you're both right--early Windows flavors could dismember themselves entertainingly when a process launched a few dozen threads, but WinXP vastly improves that condition.

I assert that I could substantiate my claims with appropriate references. I choose not to do so today.