Re: [Tutor] Python performance resources & resource usage hints

2006-04-07 Thread Liam Clarke
Well, thanks very much Kent, Hugo and Danny.

I went with the "never-ending blocking queues" and sentinel data approach.
When running tests with a continual stream of packets being received
3ms apart, CPU usage peaked at 15%, was usually around 7-9%, and when
deployed the packets will be separated by seconds rather than
milliseconds.

Thanks for the assistance, I've now overcome my fear of blocking I/O :).

Regards,

Liam Clarke

On 4/8/06, Liam Clarke <[EMAIL PROTECTED]> wrote:
> Thanks very much all. :) I'll have a crack this afternoon and let you know.
>
> Kent - the increase in the queue size for the socket server is to
> allow for any delay in processing packets; it has a default queue size
> of 5 and then it starts rejecting packets; more of a safety policy
> when reducing CPU usage than a direct attempt to reduce CPU usage.
>
> Once again, thanks for the input!
>
> On 4/8/06, Danny Yoo <[EMAIL PROTECTED]> wrote:
> >
> >
> > On Fri, 7 Apr 2006, Kent Johnson wrote:
> >
> > > Hugo González Monteverde wrote:
> > > > You are not using the optional timeout and blocking which 'get'
> > > > provides (!)
> > > >
> > > > Try setting it and see your CPU usage go down. This will implement
> > > > blocking, and the queue will be used as soon as data is there.  Set
> > > > block to True and a timeout if you need to use it (looks like you
> > > > only need blocking)
> > > >
> > > >  while True:
> > > >      data = self.queue.get(True)
> > > >      self.DAO.send_data(data)
> > >
> > > I think he will need the timeout too, otherwise the shutdown flag will
> > > only be checked when there is work which is probably not what he wants.
> >
> >
> > This could be fixed: when setting the 'shutdown' flag, also push a
> > sentinel piece of data into the queue.  That'll wake the thread back up,
> > and that'll give it the opportunity to process a shutdown.
> >
> > The thread here:
> >
> > http://mail.python.org/pipermail/tutor/2006-January/044557.html
> >
> > has some more examples of this.
> >
> >
> > Hope this helps!
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Python performance resources & resource usage hints

2006-04-07 Thread Liam Clarke
Thanks very much all. :) I'll have a crack this afternoon and let you know.

Kent - the increase in the queue size for the socket server is to
allow for any delay in processing packets; it has a default queue size
of 5 and then it starts rejecting packets; more of a safety policy
when reducing CPU usage than a direct attempt to reduce CPU usage.

Once again, thanks for the input!

On 4/8/06, Danny Yoo <[EMAIL PROTECTED]> wrote:
>
>
> On Fri, 7 Apr 2006, Kent Johnson wrote:
>
> > Hugo González Monteverde wrote:
> > > You are not using the optional timeout and blocking which 'get'
> > > provides (!)
> > >
> > > Try setting it and see your CPU usage go down. This will implement
> > > blocking, and the queue will be used as soon as data is there.  Set
> > > block to True and a timeout if you need to use it (looks like you
> > > only need blocking)
> > >
> > >  while True:
> > >      data = self.queue.get(True)
> > >      self.DAO.send_data(data)
> >
> > I think he will need the timeout too, otherwise the shutdown flag will
> > only be checked when there is work which is probably not what he wants.
>
>
> This could be fixed: when setting the 'shutdown' flag, also push a
> sentinel piece of data into the queue.  That'll wake the thread back up,
> and that'll give it the opportunity to process a shutdown.
>
> The thread here:
>
> http://mail.python.org/pipermail/tutor/2006-January/044557.html
>
> has some more examples of this.
>
>
> Hope this helps!


Re: [Tutor] Python performance resources & resource usage hints

2006-04-07 Thread Danny Yoo


On Fri, 7 Apr 2006, Kent Johnson wrote:

> Hugo González Monteverde wrote:
> > You are not using the optional timeout and blocking which 'get' provides (!)
> >
> > Try setting it and see your CPU usage go down. This will implement
> > blocking, and the queue will be used as soon as data is there.  Set
> > block to True and a timeout if you need to use it (looks like you only
> > need blocking)
> >
> >  while True:
> >      data = self.queue.get(True)
> >      self.DAO.send_data(data)
>
> I think he will need the timeout too, otherwise the shutdown flag will
> only be checked when there is work which is probably not what he wants.


This could be fixed: when setting the 'shutdown' flag, also push a
sentinel piece of data into the queue.  That'll wake the thread back up,
and that'll give it the opportunity to process a shutdown.
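
As a rough sketch of the flag-plus-sentinel idea (written for current
Python 3, where the module is named queue rather than Queue; the Worker
class, SENTINEL, stop() and the sent list are names made up for
illustration, with sent standing in for the DAO):

```python
import queue
import threading

SENTINEL = object()  # hypothetical marker; any unique object would do


class Worker(threading.Thread):
    def __init__(self):
        super().__init__()
        self.queue = queue.Queue()
        self.shutdown = False
        self.sent = []  # stand-in for self.DAO.send_data()

    def run(self):
        while True:
            data = self.queue.get(True)  # sleeps until something arrives
            if self.shutdown and data is SENTINEL:
                return  # woken up only to notice the shutdown flag
            self.sent.append(data)

    def stop(self):
        self.shutdown = True
        self.queue.put(SENTINEL)  # wake the blocked thread


worker = Worker()
worker.start()
worker.queue.put("packet 1")
worker.queue.put("packet 2")
worker.stop()
worker.join(timeout=5)
print(worker.sent)
```

Because the queue is first-in first-out, anything queued before the
sentinel is still handled before the thread exits.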

The thread here:

http://mail.python.org/pipermail/tutor/2006-January/044557.html

has some more examples of this.


Hope this helps!


Re: [Tutor] Python performance resources & resource usage hints

2006-04-07 Thread Kent Johnson
Hugo González Monteverde wrote:
> You are not using the optional timeout and blocking which 'get' provides (!)
> 
> Try setting it and see your CPU usage go down. This will implement
> blocking, and the queue will be used as soon as data is there.  Set
> block to True and a timeout if you need to use it (looks like you only
> need blocking)
>
>  while True:
>      data = self.queue.get(True)
>      self.DAO.send_data(data)

I think he will need the timeout too, otherwise the shutdown flag will 
only be checked when there is work which is probably not what he wants.

I agree, this is a good solution.

Kent


Re: [Tutor] Python performance resources & resource usage hints

2006-04-07 Thread Hugo González Monteverde
Liam Clarke wrote:

> Each thread's run() method basically looks like this -
> 
> while True:
>     try:
>         data = self.queue.get(False)
>         self.DAO.send_data(data)
>     except Empty:
>         if self.shutdown:
>             print "\DAO closing"
>             return
>         continue
> 
> 

Hi Liam,

I checked the response by Kent. I completely agree with him in that the 
problem is the endless polling you have.

However, you do not have to implement a sleep in your loop; the main 
issue may be that you are not using the facilities of the 'get' method 
in the queue. From the queue docs:

get([block[, timeout]])
    Remove and return an item from the queue. If optional args block is
    true and timeout is None (the default), block if necessary until an
    item is available. If timeout is a positive number, it blocks at most
    timeout seconds and raises the Empty exception if no item was
    available within that time. Otherwise (block is false), return an
    item if one is immediately available, else raise the Empty exception
    (timeout is ignored in that case).
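
To see the three behaviours side by side, here is a small sketch. It is
written for current Python 3, where the module is called queue (it was
Queue in Python 2); the variable names and the 0.2 second timeout are
only for illustration:

```python
import queue  # named "Queue" in Python 2
import time

q = queue.Queue()

# Non-blocking: returns at once, raising Empty when nothing is queued.
try:
    q.get(False)
    nonblocking_result = "got an item"
except queue.Empty:
    nonblocking_result = "empty"

# Blocking with a timeout: waits up to 0.2 seconds, then raises Empty.
start = time.monotonic()
try:
    q.get(True, 0.2)
    timed_result = "got an item"
except queue.Empty:
    timed_result = "empty"
waited = time.monotonic() - start

# Blocking with no timeout: returns as soon as an item is available.
q.put("packet")
blocking_result = q.get(True)

print(nonblocking_result, timed_result, blocking_result)
```

While the timed get is waiting, the thread is asleep rather than
spinning, which is where the CPU saving comes from.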


You are not using the optional timeout and blocking which 'get' provides (!)

Try setting it and see your CPU usage go down. This will implement 
blocking, and the queue will be used as soon as data is there.  Set block 
to True and a timeout if you need to use it (looks like you only need 
blocking)

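 while True:
     # Block until an item arrives. With no timeout, Empty is never
     # raised, so the try/except is not needed here.
     data = self.queue.get(True)
     self.DAO.send_data(data)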

Hope that helps,

Hugo


Re: [Tutor] Python performance resources & resource usage hints

2006-04-07 Thread Kent Johnson
Liam Clarke wrote:
> Hi,
> 
> I've developed what would be my largest Python app to date. And, going
> from the crude Windows Task Manager, it's trying to use as much CPU
> time as it can get when it's idle.
> 
> This, no doubt, is due to my design.
> 
> I have a UDP socket server, a packet cruncher, and a DAO.
> Each resides in its own thread, and each thread is connected to the
> next by a Queue.
> 
> So it's server ---> cruncher --> DAO.
> 
> Each thread's run() method basically looks like this -
> 
> while True:
>     try:
>         data = self.queue.get(False)
>         self.DAO.send_data(data)
>     except Empty:
>         if self.shutdown:
>             print "\DAO closing"
>             return
>         continue
> 
> 
> i.e. each thread is looping endlessly until data arrives via the
> queue. I can't believe it chews the CPU time the way it does, but I
> suppose it's easy to do so.

Yes, it's easy. You have basically told the thread to check the queue as 
often as possible. So it is running as fast as it can, checking the 
queue and handling whatever comes its way.
> 
> My query to the list is twofold -
> 
> First, if anyone knows of any websites with articles on Python
> threading optimisation/pitfalls, I'd be greatly appreciative.

There's not much. This might help:
http://linuxgazette.net/107/pai.html

This book is excellent for teaching some of the tools that are used for 
communication and synchronization between threads:
http://greenteapress.com/semaphores/

There are many threading related recipes in the Python Cookbook, both 
online and printed.
http://aspn.activestate.com/ASPN/Cookbook/Python

> Okay, the 2nd piece of advice I'm seeking, is what would be the most
> efficient path here?
> My initial thoughts are along three lines:
> 
> Either:
> 
> A) Increase the queue size of the socket servers

I don't see how that would help.

> B) Use timer threads to 'pulse' my threads.

That's a lot of additional complexity.

> A) Increase the queue size of the socket servers
> B) Use blocking queues
> 
> Or:
> 
> A) Use blocking queues with a timeout
> B) Use the socket servers to "wake" processing threads

These are all better.
> 
> As Python doesn't have sleeping threads etc, the third option is my
> current measure of last resort, as there'll be some substantial
> rejigging, and I'm not overly experienced with threads.

Use time.sleep() to sleep a thread.

The simplest fix is to add a time.sleep() into your loops. Then the 
thread will check the queue, process any work, and sleep for a little 
while. (This is still polling, but it is no longer a tight busy loop.)

The disadvantage of this solution is that there is a trade off between 
CPU usage and responsiveness. If the sleep is very short, the thread 
will be very responsive but it will still run a lot and use CPU when 
idling. If you make the sleep long, the idle CPU will be very low but 
responsiveness will be poor.

If you can live with a response delay of 0.05 or 0.1 second, try that, 
it should cut the CPU usage dramatically. Even a 0.01 delay might make a 
big difference.
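
Something along these lines, as a sketch only (Python 3 spelling, where
the module is queue; poll(), POLL_INTERVAL, the handled list and the
threading.Event used as the shutdown flag are all my own stand-ins, not
names from your app):

```python
import queue
import threading
import time

POLL_INTERVAL = 0.05  # seconds; hypothetical value, tune to taste


def poll(q, handled, stop_event):
    while not stop_event.is_set():
        try:
            handled.append(q.get(False))  # non-blocking check
        except queue.Empty:
            time.sleep(POLL_INTERVAL)  # idle: nap instead of spinning


q = queue.Queue()
handled = []
stop_event = threading.Event()
for packet in ("a", "b", "c"):
    q.put(packet)

t = threading.Thread(target=poll, args=(q, handled, stop_event))
t.start()
time.sleep(0.3)  # give the poller time to drain the queue
stop_event.set()
t.join(timeout=5)
print(handled)
```

This version sleeps only when the queue is empty, so a burst of packets
is drained without any added delay between items.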


A better fix is to use blocking queues or other blocking events. In this 
approach, a thread will block until there is something to do, then wake 
up, do its work and go back to sleep.

The hitch here is you need to find another way to signal the thread to 
exit. One possibility is just to mark them as daemon threads, then they 
will exit when your app exits. This is a very simple solution if you 
don't have any cleanup actions you want to do in the threads.
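
A minimal sketch of the daemon approach, in Python 3 spelling (the
worker() function and handled list are placeholders for your own thread
body and DAO):

```python
import queue
import threading


def worker(q, handled):
    while True:
        handled.append(q.get(True))  # block until there is work
        q.task_done()


q = queue.Queue()
handled = []

# daemon=True means this thread will not keep the interpreter alive.
t = threading.Thread(target=worker, args=(q, handled), daemon=True)
t.start()

q.put("packet")
q.join()  # wait until every queued item has been marked done
print(handled)
# The main thread ends here; the daemon thread, still blocked in get(),
# is abandoned without any cleanup.
```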

Another possibility might be to send a "quit" message in the queue. When 
the thread sees the special quit message, it forwards it to the next 
thread and exits.
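
Roughly like this, for a cruncher and a DAO stage (a sketch in Python 3
spelling; QUIT, stage(), the str.upper "crunching" and the stored list
are invented for the example):

```python
import queue
import threading

QUIT = object()  # hypothetical "quit" message


def stage(inbox, handle, outbox=None):
    while True:
        item = inbox.get(True)  # block until something arrives
        if item is QUIT:
            if outbox is not None:
                outbox.put(QUIT)  # pass the quit message downstream
            return
        result = handle(item)
        if outbox is not None:
            outbox.put(result)


raw, crunched = queue.Queue(), queue.Queue()
stored = []  # stand-in for what the DAO would write

cruncher = threading.Thread(target=stage, args=(raw, str.upper, crunched))
dao = threading.Thread(target=stage, args=(crunched, stored.append))
cruncher.start()
dao.start()

for packet in ("a", "b"):
    raw.put(packet)
raw.put(QUIT)  # the server would send this when shutting down

cruncher.join(timeout=5)
dao.join(timeout=5)
print(stored)
```

Since the quit message travels through the same queues as the data,
each stage finishes everything queued ahead of it before exiting.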

If neither of these works then you could use a queue.get() with a timeout 
so you check the done flag periodically.
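
For example (a sketch in Python 3 spelling; run(), the 0.1 second
timeout, the handled list and the threading.Event standing in for your
shutdown flag are assumptions on my part):

```python
import queue
import threading


def run(q, done, handled):
    while True:
        try:
            # Sleep for up to 0.1 seconds waiting for work.
            handled.append(q.get(True, 0.1))
        except queue.Empty:
            if done.is_set():  # checked only when the queue is idle
                return


q = queue.Queue()
done = threading.Event()
handled = []
q.put("packet 1")
q.put("packet 2")

t = threading.Thread(target=run, args=(q, done, handled))
t.start()
done.set()  # request shutdown; queued work is still drained first
t.join(timeout=5)
print(handled)
```

The timeout bounds how long shutdown can take, and the thread wakes
only about ten times a second when idle.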

Kent
