Re: What's the cost of using hundreds of threads?

2005-03-02 Thread Przemysław Różycki
> I'm a bit confused by your math.  Fifty connections should be 102
> threads, which is quite reasonable.

My formula applies to one forwarded ('load-balanced') connection. Every
such connection creates n further connections (pipes) which share the
load. Every pipe requires two threads to be spawned. Every 'main
connection' spawns two other threads, so my formula, 2*pipes+2, gives
the number of threads spawned per 'main connection'.

Now if conn_count connections are established, the thread count
equals:
conn_count * threads_per_main_connection = conn_count * (2*pipes+2)

For 50 connections with about 10 pipes each, that gives 50 * 22 = 1100 threads.
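
As a quick check of the arithmetic above, here is a minimal Python sketch. The function names are illustrative only and do not come from the poster's program.

```python
def threads_per_main_connection(pipes):
    # Two threads per pipe, plus two for the main connection itself.
    return 2 * pipes + 2


def total_threads(conn_count, pipes):
    # Every main connection spawns its own full set of threads.
    return conn_count * threads_per_main_connection(pipes)


print(threads_per_main_connection(10))  # 22
print(total_threads(50, 10))            # 1100
```

The total grows with the product of connections and pipes, which is why 50 connections end up in the thousands rather than at 102.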
> My experience with lots of threads dates back to Python 1.5.2, but I
> rarely saw much improvement with more than a hundred threads, even for
> heavily I/O-bound applications on a multi-CPU system.  However, if your
> focus is algorithmic complexity, you should be able to handle a couple of
> thousand threads easily enough.

I don't spawn them for computational reasons, but because it makes my
code much simpler. I use built-in TCP features to achieve load
balancing: every flow (directed through a pipe) has its own dedicated
threads, separate for download and upload. For every 'main connection'
these threads share send and receive buffers. If any of the pipes is
congested, the corresponding threads block in their send / recv
calls without affecting the independence of the other data flows.
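
The two-threads-per-pipe idea can be sketched roughly as follows. This is a simplified illustration, not the poster's code: it relays directly between two sockets and leaves out the shared send and receive buffers described above. The names pump and start_pipe are hypothetical.

```python
import socket
import threading


def pump(src, dst, bufsize=4096):
    """Copy bytes from src to dst using blocking calls until src hits EOF."""
    while True:
        data = src.recv(bufsize)   # blocks while nothing is available
        if not data:
            break
        dst.sendall(data)          # blocks while the peer is congested
    try:
        dst.shutdown(socket.SHUT_WR)  # pass the end-of-stream downstream
    except OSError:
        pass                          # peer has already gone away


def start_pipe(a, b):
    """Start one thread per direction for a single pipe (a->b and b->a)."""
    threads = [
        threading.Thread(target=pump, args=(a, b), daemon=True),
        threading.Thread(target=pump, args=(b, a), daemon=True),
    ]
    for t in threads:
        t.start()
    return threads
```

Because each direction has its own thread, a congested pipe only stalls the thread blocked on it. The kernel's TCP flow control does the rest.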

Using threads gives me VERY simple code. Achieving this with poll /
select would be much more difficult. And to guarantee concurrency and
maximal throughput for all of the pipes I would probably have to mirror
code from the Linux TCP stack (I mean window shifting, data
acknowledgement, retransmission queues). Or perhaps I exaggerate.
--
http://mail.python.org/mailman/listinfo/python-list


Re: What's the cost of using hundreds of threads?

2005-03-02 Thread Przemysław Różycki
Thanks for your comments on the WinXP thread implementation. You confirmed
my conviction that I shouldn't use Windows.
Personally I use Linux with a 2.6.10 kernel, so hopefully I don't have to
share your grief. ;)


Re: What's the cost of using hundreds of threads?

2005-03-02 Thread Nick Coghlan
Steve Holden wrote:
> Apache, for example, can easily spawn more threads under Windows, and
> I've written code that uses 200 threads with excellent performance.
> Things seem to slow down around the 2,000 mark for some reason I'm not
> familiar with.

As far as I know, the default Windows thread stack size is 2 MB. Do the 
math :)
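
Spelled out, the math might look like the sketch below. The 2 MB figure is the one quoted above. Microsoft's documentation, as far as I know, gives 1 MB as the default stack reservation, so both are shown. The 2 GiB user address space of a 32-bit Windows process is an assumption stated in the code.

```python
# Assumption: a 32-bit Windows process has 2 GiB of user address space.
USER_ADDRESS_SPACE_MIB = 2048


def stack_reservation_mib(thread_count, stack_mib):
    """Address space reserved for thread stacks alone, in MiB."""
    return thread_count * stack_mib


for stack_mib in (1, 2):
    needed = stack_reservation_mib(2000, stack_mib)
    verdict = "fits" if needed <= USER_ADDRESS_SPACE_MIB else "does not fit"
    print("%d MiB stacks: %d MiB for 2000 threads, %s"
          % (stack_mib, needed, verdict))
```

Either way, around 2,000 threads the stacks alone take up most or all of the address space. Python later gained threading.stack_size() (Python 2.5 and newer), which can request a smaller stack for threads created afterwards.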
On NT4, beyond a couple of hundred threads a *heck* of a lot of time ends up 
being spent in the kernel doing context switches (and you can kiss even vaguely 
deterministic response times good-bye).

Using a more recent version of Windows improves matters significantly.
Cheers,
Nick.
--
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
http://boredomandlaziness.skystorm.net


Re: What's the cost of using hundreds of threads?

2005-03-02 Thread Cameron Laird
In article [EMAIL PROTECTED],
Przemysław Różycki  [EMAIL PROTECTED] wrote:
> Thanks for your comments on the WinXP thread implementation. You confirmed
> my conviction that I shouldn't use Windows.
> Personally I use Linux with a 2.6.10 kernel, so hopefully I don't have to
> share your grief. ;)

?  !?  I'm confused, and apparently I'm confusing others.
The one message I posted in this thread--largely reinforced
by others--emphasizes only that WinXP is far *better* than
earlier Win* flavors in its thread management.  While I not
only agree that Windows has disadvantages, but have stopped
buying it for our company, my reasons have absolutely nothing
to do with the details of implementation of WinXP.


Re: What's the cost of using hundreds of threads?

2005-03-02 Thread Przemysław Różycki
> In article [EMAIL PROTECTED],
> Przemysław Różycki  [EMAIL PROTECTED] wrote:
>
>> Thanks for your comments on the WinXP thread implementation. You
>> confirmed my conviction that I shouldn't use Windows.
>> Personally I use Linux with a 2.6.10 kernel, so hopefully I don't have
>> to share your grief. ;)
>
> ?  !?  I'm confused, and apparently I'm confusing others.
> The one message I posted in this thread--largely reinforced
> by others--emphasizes only that WinXP is far *better* than
> earlier Win* flavors in its thread management.  While I not
> only agree that Windows has disadvantages, but have stopped
> buying it for our company, my reasons have absolutely nothing
> to do with the details of implementation of WinXP.

:) . OK, perhaps my answer wasn't that precise. I wrote my post only to
say that your discussion of Windows threading performance doesn't
concern me, because my program is written for a Linux environment. And
yes, I agree that my comment could sound a bit enigmatic.


Re: What's the cost of using hundreds of threads?

2005-03-02 Thread Aahz
In article [EMAIL PROTECTED],
Przemysław Różycki  [EMAIL PROTECTED] wrote:

> I don't spawn them for computational reasons, but because it makes my
> code much simpler. I use built-in TCP features to achieve load
> balancing: every flow (directed through a pipe) has its own dedicated
> threads, separate for download and upload. For every 'main connection'
> these threads share send and receive buffers. If any of the pipes is
> congested, the corresponding threads block in their send / recv
> calls without affecting the independence of the other data flows.
>
> Using threads gives me VERY simple code. Achieving this with poll /
> select would be much more difficult. And to guarantee concurrency and
> maximal throughput for all of the pipes I would probably have to mirror
> code from the Linux TCP stack (I mean window shifting, data
> acknowledgement, retransmission queues). Or perhaps I exaggerate.

Maybe it would help if you explained what these pipes do.  Based on
what you've said so far, what I'd do in your situation is create one
thread per pipe and one thread per connection, then use Queue to move
data between threads.
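
One way to read that suggestion is sketched below, under several assumptions: it handles only one direction of traffic, a shared bounded queue hands each chunk to whichever pipe thread is free, and reordering at the far end is not handled. All names (forward, pipe_worker, connection_worker) are hypothetical. In the Python 2 of this thread's era the module is spelled Queue rather than queue.

```python
import queue
import threading

STOP = object()  # sentinel telling a pipe thread to exit


def connection_worker(chunks, outbox, pipe_count):
    """One thread per connection: feed incoming chunks into the queue."""
    for chunk in chunks:
        outbox.put(chunk)          # blocks when every pipe is busy
    for _ in range(pipe_count):
        outbox.put(STOP)           # one sentinel per pipe thread


def pipe_worker(outbox, send):
    """One thread per pipe: take the next chunk and send it."""
    while True:
        chunk = outbox.get()
        if chunk is STOP:
            break
        send(chunk)                # a slow pipe simply takes fewer chunks


def forward(chunks, senders, maxsize=16):
    """Run one connection thread and one thread per pipe until done."""
    outbox = queue.Queue(maxsize)
    workers = [threading.Thread(target=pipe_worker, args=(outbox, send))
               for send in senders]
    workers.append(threading.Thread(
        target=connection_worker, args=(chunks, outbox, len(senders))))
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```

With this layout the thread count per connection is pipes + 1 for each direction handled, and a congested pipe slows only its own thread.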
-- 
Aahz ([EMAIL PROTECTED])   * http://www.pythoncraft.com/

The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code -- 
not in reams of trivial code that bores the reader to death.  --GvR


What's the cost of using hundreds of threads?

2005-03-01 Thread Przemysław Różycki
Hello,
I have written some code that creates many threads for each connection
('main connection'). The purpose of this code is to balance the load
between several connections ('pipes'). The number of spawned threads
depends on how many pipes I create (= 2*n+2, where n is the number of
pipes).

For good results I'll presumably share the main connection's load between
10 pipes, so 22 threads will be spawned. Now if about 50
connections are forwarded, the number of threads rises to over a thousand
(or several thousand if even more connections are established).

My questions are:
- What is the cost (in memory / CPU usage) of creating so many
threads?
- Is there any upper bound on the number of threads? (Is
it Python or OS related?)
- Is that a sign of 'clumsy programming', i.e. is creating so many
threads a bad habit? (I must say that it simplified the solution of my
problem very much.)

Limiting the number of threads is possible, but would affect the
independence of the data flows. (OK, I admit that a tricky algorithm
could perhaps guarantee concurrency without spawning so many threads,
but threads are the simplest solution to this problem :) ).


Re: What's the cost of using hundreds of threads?

2005-03-01 Thread wes weston
Przemysław Różycki wrote:
> Hello,
> I have written some code that creates many threads for each connection
> ('main connection'). The purpose of this code is to balance the load
> between several connections ('pipes'). The number of spawned threads
> depends on how many pipes I create (= 2*n+2, where n is the number of
> pipes).
>
> For good results I'll presumably share the main connection's load between
> 10 pipes, so 22 threads will be spawned. Now if about 50
> connections are forwarded, the number of threads rises to over a thousand
> (or several thousand if even more connections are established).
>
> My questions are:
> - What is the cost (in memory / CPU usage) of creating so many
> threads?
> - Is there any upper bound on the number of threads? (Is
> it Python or OS related?)
> - Is that a sign of 'clumsy programming', i.e. is creating so many
> threads a bad habit? (I must say that it simplified the solution of my
> problem very much.)
>
> Limiting the number of threads is possible, but would affect the
> independence of the data flows. (OK, I admit that a tricky algorithm
> could perhaps guarantee concurrency without spawning so many threads,
> but threads are the simplest solution to this problem :) ).

PR,
   I notice there's a resource module with a
getrusage(who) that looks like it would support
a test to get what you need.
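
Such a test could look roughly like the sketch below. It is an assumption-laden illustration: the resource module exists on Unix only, ru_maxrss is reported in kilobytes on Linux but in bytes on macOS, and the helper name peak_rss_with_threads is made up for this example.

```python
import resource
import threading


def peak_rss_with_threads(n):
    """Start n idle threads; return peak RSS before and after starting them."""
    stop = threading.Event()
    before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    threads = [threading.Thread(target=stop.wait) for _ in range(n)]
    for t in threads:
        t.start()
    after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    stop.set()                     # release the idle threads
    for t in threads:
        t.join()
    return before, after


before, after = peak_rss_with_threads(100)
print("peak RSS before:", before, "after:", after)
```

Note that this measures resident memory, not reserved address space. Idle threads touch very little of their stacks, so the reported growth may be much smaller than thread count times stack size.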
wes


Re: What's the cost of using hundreds of threads?

2005-03-01 Thread Jarek Zgoda
Przemysław Różycki wrote:
> - Is there any upper bound on the number of threads? (Is
> it Python or OS related?)
> - Is that a sign of 'clumsy programming', i.e. is creating so many
> threads a bad habit? (I must say that it simplified the solution of my
> problem very much.)

I've read somewhere (I can't recall where, though; was it MSDN?) that
Windows is not well suited to run more than 32 threads per process. Most
of the code I saw doesn't spawn more than half that many threads.

--
Jarek Zgoda
http://jpa.berlios.de/ | http://www.zgodowie.org/


Re: What's the cost of using hundreds of threads?

2005-03-01 Thread Steve Holden
Jarek Zgoda wrote:
> Przemysław Różycki wrote:
>> - Is there any upper bound on the number of threads? (Is
>> it Python or OS related?)
>> - Is that a sign of 'clumsy programming', i.e. is creating so many
>> threads a bad habit? (I must say that it simplified the solution of my
>> problem very much.)
>
> I've read somewhere (I can't recall where, though; was it MSDN?) that
> Windows is not well suited to run more than 32 threads per process. Most
> of the code I saw doesn't spawn more than half that many threads.

This is apocryphal. Do you have any hard evidence for this assertion?
Apache, for example, can easily spawn more threads under Windows, and 
I've written code that uses 200 threads with excellent performance. 
Things seem to slow down around the 2,000 mark for some reason I'm not 
familiar with.

regards
 Steve
--
Meet the Python developers and your c.l.py favorites March 23-25
Come to PyCon DC 2005  http://www.pycon.org/
Steve Holden   http://www.holdenweb.com/


Re: What's the cost of using hundreds of threads?

2005-03-01 Thread Aahz
In article [EMAIL PROTECTED],
Przemysław Różycki  [EMAIL PROTECTED] wrote:

> I have written some code that creates many threads for each connection
> ('main connection'). The purpose of this code is to balance the load
> between several connections ('pipes'). The number of spawned threads
> depends on how many pipes I create (= 2*n+2, where n is the number of
> pipes).
>
> For good results I'll presumably share the main connection's load between
> 10 pipes, so 22 threads will be spawned. Now if about 50
> connections are forwarded, the number of threads rises to over a thousand
> (or several thousand if even more connections are established).

I'm a bit confused by your math.  Fifty connections should be 102
threads, which is quite reasonable.

> My questions are:
> - What is the cost (in memory / CPU usage) of creating so many
> threads?
> - Is there any upper bound on the number of threads? (Is
> it Python or OS related?)
> - Is that a sign of 'clumsy programming', i.e. is creating so many
> threads a bad habit? (I must say that it simplified the solution of my
> problem very much.)
>
> Limiting the number of threads is possible, but would affect the
> independence of the data flows. (OK, I admit that a tricky algorithm
> could perhaps guarantee concurrency without spawning so many threads,
> but threads are the simplest solution to this problem :) ).

My experience with lots of threads dates back to Python 1.5.2, but I
rarely saw much improvement with more than a hundred threads, even for
heavily I/O-bound applications on a multi-CPU system.  However, if your
focus is algorithmic complexity, you should be able to handle a couple of
thousand threads easily enough.
-- 
Aahz ([EMAIL PROTECTED])   * http://www.pythoncraft.com/

The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code -- 
not in reams of trivial code that bores the reader to death.  --GvR


Re: What's the cost of using hundreds of threads?

2005-03-01 Thread Cameron Laird
In article [EMAIL PROTECTED],
Steve Holden  [EMAIL PROTECTED] wrote:
.
.
.
>> I've read somewhere (I can't recall where, though; was it MSDN?) that
>> Windows is not well suited to run more than 32 threads per process. Most
>> of the code I saw doesn't spawn more than half that many threads.
>
> This is apocryphal. Do you have any hard evidence for this assertion?
>
> Apache, for example, can easily spawn more threads under Windows, and
> I've written code that uses 200 threads with excellent performance.
> Things seem to slow down around the 2,000 mark for some reason I'm not
> familiar with.
.
.
.
I'll support Mr. Zgoda's apocrypha.  The thing is, as so often
obtains, you're both right--early Windows flavors could dismember
themselves entertainingly when a process launched a few dozen
threads, but WinXP vastly improves that condition.

I assert that I could substantiate my claims with appropriate
references.  I choose not to do so today.