[issue14301] xmlrpc client transport and threading problem

2019-03-15 Thread Mark Lawrence


Change by Mark Lawrence :


--
nosy:  -BreamoreBoy

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue14301] xmlrpc client transport and threading problem

2015-02-12 Thread Demian Brecht

Changes by Demian Brecht demianbre...@gmail.com:


--
nosy:  -demian.brecht

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue14301



[issue14301] xmlrpc client transport and threading problem

2014-07-21 Thread Demian Brecht

Changes by Demian Brecht demianbre...@gmail.com:


--
nosy: +demian.brecht

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue14301



[issue14301] xmlrpc client transport and threading problem

2014-07-03 Thread Mark Lawrence

Mark Lawrence added the comment:

@Kees, sorry for the delay in getting back to you.

@Martin, can you comment on this, please?

--
components: +Library (Lib) -None
nosy: +BreamoreBoy, loewis

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue14301



[issue14301] xmlrpc client transport and threading problem

2012-03-14 Thread Kees Bos

New submission from Kees Bos k@zx.nl:

The transport (second parameter to ServerProxy) must be unique for every 
thread. This was not the case before Python 2.7. It is caused by the reuse of 
the connection (stored in _connection). This could be handled by saving the 
thread id as well.

I don't know whether this is a coding error or a documentation omission.
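A minimal sketch of a per-thread workaround (illustrative only, not the tracker's eventual fix; `get_proxy` and the URI are hypothetical, and Python 3's `xmlrpc.client` naming is used): each thread lazily builds its own ServerProxy, so the connection cached on the transport is never shared across threads.

```python
import threading
import xmlrpc.client   # xmlrpclib on Python 2

_local = threading.local()

def get_proxy(uri="http://localhost:8000/"):
    # Each thread gets its own ServerProxy, so the HTTP connection
    # cached on the transport (_connection) is never shared.
    if not hasattr(_local, "proxy"):
        _local.proxy = xmlrpc.client.ServerProxy(uri)
    return _local.proxy
```

Constructing a ServerProxy does not open a connection, so this adds no per-thread network cost until the first call.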

--
components: None
messages: 155751
nosy: kees
priority: normal
severity: normal
status: open
title: xmlrpc client transport and threading problem
type: behavior
versions: Python 2.7

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue14301



Threading problem

2010-04-25 Thread sdistefano
I have the following issue:

My program runs a thread called the MainThread, which loops through a
number of URLs and decides when it's the right time for one to be
fetched.  Once a URL has to be fetched, it's added to a Queue object,
from which the FetchingThread picks it up and does the actual work. Often,
URLs have to be fetched at intervals of 100 ms, so the objects
get added to the queue repeatedly. Even though it takes more than
100 ms to get the URL and process it, ideally what would happen is:
ms0: Request 1 is sent
ms100: request 2 is sent
ms150: request 1 is processed
ms200: request 3 is sent
ms250: request 2 is processed

and so on... The problem is that for some reason Python runs the main
thread considerably more often than the checking thread. If I ask them both
to 'print' when run, this becomes obvious; it happens even if I create more
check threads than main threads (I even tried 50 check threads and 1
main thread). Despite my terminology, both the main thread and the
fetch thread are created in exactly the same way.

what part of threading in python am I not properly understanding?

thanks!
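A hedged sketch of the architecture described above (all names hypothetical; a `time.sleep` stands in for the real HTTP request, which likewise releases the GIL while blocked on the socket):

```python
import queue
import threading
import time

def fetch(url):
    # stand-in for the real HTTP request; a real socket read releases
    # the GIL while blocked, letting the other threads run
    time.sleep(0.01)
    return "body of " + url

def worker(jobs, results):
    while True:
        url = jobs.get()
        if url is None:             # sentinel: no more work
            return
        results.append((url, fetch(url)))

jobs = queue.Queue()
results = []                        # list.append is thread-safe in CPython
threads = [threading.Thread(target=worker, args=(jobs, results))
           for _ in range(4)]
for t in threads:
    t.start()

# the MainThread's scheduling loop would put a URL here every ~100 ms
for url in ["http://a", "http://b", "http://c"]:
    jobs.put(url)
for _ in threads:
    jobs.put(None)                  # one sentinel per worker
for t in threads:
    t.join()
```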
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threading problem

2010-04-25 Thread Patrick Maupin
On Apr 25, 2:55 pm, sdistefano sdistef...@gmail.com wrote:
 I have the following issue:

 My program runs a thread called the MainThread, that loops trough a
 number of URLs and decides when it's the right time for one to be
 fetched.  Once a URL has to be fetched, it's added to a Queue object,
 where the FetchingThread picks up and does the actual work. Often,
 URLs have to be fetched with frequencies of 100ms, so the objects will
 get added to the queue repeatedly. Even though it takes more than
 100ms to get the URL and process it, ideally what would happen is:
 ms0: Request 1 is sent
 ms100: request 2 is sent
 ms150: request 1 is processed
 ms200: request 3 is sent
 ms250: request 2 is processed

 and so on... The problem is that for some reason python runs the main
 thread considerably more than the checking thread. If I ask them both
 to 'print' when run, this becomes obvious ; even if I create more
 check threads than main threads (I even tried 50 check threads and 1
 main thread). Despite my terminology, both the mainthread and
 fetchthread are created in exactly the same way.

 what part of threading in python am I not properly understanding?

Unless I'm missing something, your description doesn't make this sound
like either a python-specific problem, or a threading problem. Threads
run when it's their turn and they aren't blocked, and you haven't
described any code that would ever block your main thread, but your
subsidiary threads will often be blocked at a socket waiting for their
HTTP requests to complete.


Threading problem / Paramiko problem ?

2009-12-28 Thread mk

Hello everyone,

I wrote a concurrent ssh client using Paramiko, available here: 
http://python.domeny.com/cssh.py


This program has a function for concurrent remote file/dir copying 
(class SSHThread, method 'sendfile'). One thread per host specified is 
started for copying (with a working queue of maximum length, of course).


It does have a problem with threading or Paramiko, though:

- If I specify, say, 3 hosts, the 3 threads that are started copy onto 
remote hosts fast (on a virtual machine, 10-15 MB/s), using somewhat below 
100% of CPU all the time (I wish it were less CPU-consuming, but I'm 
sending the file portion by portion and it's coded in Python, plus 
there are other calculations, well..)


- If I specify, say, 10 hosts, copying is fast and the CPU is under load until 
there are 2-3 threads left; then, CPU load goes down to some 15% and 
copying gets slow (to some 1 MB/s).


It looks as if the CPU time gets divided in more or less even portions 
for each thread running at the moment when the maximum number of threads 
is active (10 in this example) *and it stays this way even if some 
threads are finished and join()ed *.


I do join() the finished threads (take a look at code, someone). Yet the 
CPU consumption and copying speed go down.


Now, it's either that, or Paramiko caps the sending bandwidth per 
thread at the total divided by the number of senders. I have no idea which 
it is and, what's worse, no idea how to test this. I've done profiling, which 
indicated nothing: basically all function calls except time.sleep take 
negligible time.


Regards,
mk



Re: Threading problem / Paramiko problem ?

2009-12-28 Thread MRAB

mk wrote:

Hello everyone,

I wrote concurrent ssh client using Paramiko, available here: 
http://python.domeny.com/cssh.py


This program has a function for concurrent remote file/dir copying 
(class SSHThread, method 'sendfile'). One thread per host specified is 
started for copying (with a working queue of maximum length, of course).


It does have a problem with threading or Paramiko, though:

- If I specify, say, 3 hosts, the 3 threads started start copying onto 
remote hosts fast (on virtual machine, 10-15MB/s), using somewhat below 
100% of CPU all the time (I wish it were less CPU-consuming but I'm 
doing sending file portion by portion and it's coded in Python, plus 
there are other calculations, well..)


- If I specify say 10 hosts, copying is fast and CPU is under load until 
there are 2-3 threads left; then, CPU load goes down to some 15% and 
copying gets slow (at some 1MB/s).


It looks as if the CPU time gets divided in more or less even portions 
for each thread running at the moment when the maximum number of threads 
is active (10 in this example) *and it stays this way even if some 
threads are finished and join()ed *.


I do join() the finished threads (take a look at code, someone). Yet the 
CPU consumption and copying speed go down.


Now, it's either that, or Paramiko maxes out sending bandwidth per 
thread to the total divided by number of senders. I have no idea which 
and what's worse, no idea how to test this. I've done profiling which 
indicated nothing, basically all function calls except time.sleep take 
negligible time.



From what I can see, your script basically does a busy wait in
mainprog(), repeatedly checking whether any threads have finished.

It might use less CPU time if you used the Queue module and the threads
informed the main loop of their progress and when they are about to
finish by putting messages in the queue. The main loop would get the
messages from the queue, updating the progress display or starting a new
thread as appropriate. It wouldn't be constantly polling the threads.
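A minimal sketch of that suggestion (illustrative names; `copy_host` stands in for SSHThread's sendfile): workers report progress through a Queue, and the main loop blocks on `get()` instead of polling.

```python
import queue
import threading

def copy_host(host, progress):
    # stand-in for the real per-host copy: report progress, then completion
    for pct in (50, 100):
        progress.put((host, "progress", pct))
    progress.put((host, "done", None))

progress = queue.Queue()
hosts = ["host1", "host2", "host3"]
for h in hosts:
    threading.Thread(target=copy_host, args=(h, progress)).start()

finished = 0
events = []
while finished < len(hosts):
    host, kind, value = progress.get()   # blocks; no busy-wait polling
    events.append((host, kind, value))
    if kind == "done":
        finished += 1                    # here one could start a new thread
```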


Re: Producer-consumer threading problem

2008-06-12 Thread George Sakkis
On Jun 11, 3:07 pm, Carl Banks [EMAIL PROTECTED] wrote:

 On Jun 10, 11:33 pm, George Sakkis [EMAIL PROTECTED] wrote:

 I pasted my current solution at http://codepad.org/FXF2SWmg. Any
  feedback, especially if it has to do with proving or disproving its
  correctness, will be appreciated.

 It seems like you're reinventing the wheel.  The Queue class does all
 this, and it's been thoroughly battle-tested.

Synchronized queues are an extremely useful data structure in many
situations. The producer/consumer paradigm, however, is a more general
model and doesn't depend on any specific data structure.

 So first of all, can you tell us why the following wouldn't work?  It
 might help us understand the issue you're facing (never mind the
 produce and consume arguments for now--I'll cover that below).

 def iter_consumed(items):
     q = Queue.Queue()
     sentinel = object()
     def produce_all():
         for item in items:
             q.put(item)
         q.put(sentinel)
     producer = threading.Thread(target=produce_all)
     producer.start()
     try:
         while True:
             item = q.get()
             if item is sentinel:
                 return
             yield item
     finally:
         # for robustness, notify producer to wrap things up
         # left as exercise
         producer.join()

As it is, all this does is yield the original items, but slower, which
is pretty much useless. The whole idea is to transform some inputs to
some outputs. How exactly each input is mapped to an output is
irrelevant at this point; this is the power of the producer/consumer
model. "Produce" might mean "send an email to address X" and "consume"
might mean "wait for an automated email response, parse it and return
a value Y". No queue has to be involved; it *may* be involved of
course, but that's an implementation detail: it shouldn't make a
difference to iter_consumed().

If you replace q.put/q.get with produce/consume respectively
and make them the last two parameters, you're much closer to my idea.
What's left is getting rid of the sentinel, since the producer and the
consumer may have been written independently, not aware of
iter_consumed. E.g. for the email example, all producers and consumers
(there may be more than one) must agree in advance on a sentinel
email. For a given situation that might not be a big issue, but then
again, if iter_consumed() is written once without assuming a sentinel,
it makes life easier for all future producers and consumers.

 If you want to customize the effect of getting and putting, you can
 subclass Queue and override the _get and _put methods (however, last
 time I checked, the Queue class expects _put to always add an item to
 the internal sequence representing the queue--not necessarily to the
 top--and _get to always remove an item--not necessarily from the
 bottom).

Assuming that you're talking about the stdlib Queue.Queue class and
not some abstract Queue interface, extending it may help for limited
customization but can only get you so far. For instance the producer
and the consumer may not live in the same address space, or even in
the same machine.

 One issue from your function.  This line:

 done_remaining[1] += 1

 is not atomic, but your code depends on it being so.

No, it doesn't; it is protected by the condition object.

George


Re: Producer-consumer threading problem

2008-06-11 Thread Rhamphoryncus
Why not use a normal Queue, put a dummy value (such as None) in when
your producer has finished, and have the main thread use the normal
Thread.join() method on all your child threads?


Re: Producer-consumer threading problem

2008-06-11 Thread Carl Banks
On Jun 10, 11:33 pm, George Sakkis [EMAIL PROTECTED] wrote:
 I'd like some feedback on a solution to a variant of the producer-
 consumer problem. My first few attempts turned out to deadlock
 occasionally; this one seems to be deadlock-free so far but I can't
 tell if it's provably correct, and if so, whether it can be
 simplified.

 The generic producer-consumer situation with unlimited buffer capacity
 is illustrated at http://docs.python.org/lib/condition-objects.html.
 That approach assumes that the producer will keep producing items
 indefinitely, otherwise the consumer ends up waiting forever. The
 extension to the problem I am considering requires the consumer to be
 notified not only when there is a new produced item, but also when
 there is not going to be a new item so that it stops waiting.


Sounds like a sentinel would work for this.  The producer puts a
specific object (say, None) in the queue and the consumer checks for
this object and stops consuming when it sees it.  But that seems so
obvious I suspect there's something else up.
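A minimal sketch of that sentinel pattern, under the assumption of a single producer and a single consumer (names illustrative):

```python
import queue
import threading

SENTINEL = None

def producer(q, items):
    for item in items:
        q.put(item)
    q.put(SENTINEL)          # tell the consumer no more items are coming

def consume_all(q):
    out = []
    while True:
        item = q.get()       # blocks until the producer supplies something
        if item is SENTINEL:
            break
        out.append(item)
    return out

q = queue.Queue()
threading.Thread(target=producer, args=(q, range(5))).start()
result = consume_all(q)      # → [0, 1, 2, 3, 4]
```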


Carl Banks


Re: Producer-consumer threading problem

2008-06-11 Thread George Sakkis
On Jun 11, 1:59 am, Rhamphoryncus [EMAIL PROTECTED] wrote:

 Why not use a normal Queue, put a dummy value (such as None) in when
 you're producer has finished, and have the main thread use the normal
 Thread.join() method on all your child threads?

I just gave two reasons:
- Concurrency / interactivity. The main thread shouldn't wait for all
one million items to be produced to get to see even one of them.
- Limiting resources. Just like iterating over the lines of a file is
more memory efficient than reading the whole file in memory, getting
each consumed item as it becomes available is more memory efficient
than waiting for all of them to finish.

George


Re: Producer-consumer threading problem

2008-06-11 Thread Larry Bates

George Sakkis wrote:

On Jun 10, 11:47 pm, Larry Bates [EMAIL PROTECTED] wrote:

I had a little trouble understanding what exact problem it is that you are
trying to solve but I'm pretty sure that you can do it with one of two methods:


Ok, let me try again with a different example: I want to do what can
be easily done with 2.5 Queues using Queue.task_done()/Queue.join()
(see example at http://docs.python.org/lib/QueueObjects.html), but
instead of  having to first put all items and then wait until all are
done, get each item as soon as it is done.


1) Write the producer as a generator using yield method that yields a result
every time it is called (something like os.walk does).  I guess you could yield
None if there wasn't anything to consume to prevent blocking.


Actually the way items are generated is not part of the problem; it
can be abstracted away as an arbitrary iterable input. As with all
iterables, there are no more items is communicated simply by a
StopIteration.


2) Usw somethink like Twisted insted that uses callbacks instead to handle
multiple asynchronous calls to produce.  You could have callbacks that don't do
anything if there is nothing to consume (sort of null objects I guess).


Twisted is interesting and very powerful but requires a different way
of thinking about the problem and designing a solution. More to the
point, callbacks often provide a less flexible and simple API than an
iterator that yields results (consumed items). For example, say that
you want to store the results to a dictionary. Using callbacks, you
would have to explicitly synchronize each access to the dictionary
since they may fire independently. OTOH an iterator by definition
yields items sequentially, so the client doesn't have to bother with
synchronization. Note that with client I mean the user of an API/
framework/library; the implementation of the library itself may of
course use callbacks under the hood (e.g. to put incoming results to a
Queue) but expose the API as a simple iterator.

George


If you use a queue and the producer/collector are running in different threads, 
you don't have to first put all items and then wait until all are
done.  The producer can push items onto the queue while the collector 
asynchronously pops them off.


I'm virtually certain that I read on this forum that dictionary access is atomic, 
if that helps.


-Larry


Re: Producer-consumer threading problem

2008-06-11 Thread giltay
 Sounds like a sentinel would work for this.  The producer puts a
 specific object (say, None) in the queue and the consumer checks for
 this object and stops consuming when it sees it.  But that seems so
 obvious I suspect there's something else up.

 There's a decent implementation of this in the Python Cookbook,
Second Edition (9.4: Working with a Thread Pool), available from
Safari as a preview:
http://my.safaribooksonline.com/0596007973/pythoncook2-CHP-9-SECT-4

 Basically, there's a request_work function that adds (command,
data) pairs to the input Queue.  The command 'stop' is used to
terminate each worker thread (there's the sentinel).
stop_and_free_thread_pool() just puts N ('stop', None) pairs and
join()s each thread.

 The threadpool put()s the consumed items in an output Queue; they
can be retrieved concurrently using get().  You don't call join()
until you want to stop producing; you can get() at any time.
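A hedged sketch of that (command, data) pool pattern (not the Cookbook's code; names illustrative, and `data * 2` is a placeholder for the real work):

```python
import queue
import threading

def worker(inq, outq):
    # each worker consumes (command, data) pairs until told to stop
    while True:
        command, data = inq.get()
        if command == "stop":       # the sentinel command
            return
        outq.put(data * 2)          # placeholder for the real work

inq, outq = queue.Queue(), queue.Queue()
pool = [threading.Thread(target=worker, args=(inq, outq)) for _ in range(3)]
for t in pool:
    t.start()

for n in range(6):
    inq.put(("process", n))         # cf. request_work()
for _ in pool:                      # cf. stop_and_free_thread_pool():
    inq.put(("stop", None))         #   one stop pair per worker...
for t in pool:
    t.join()                        #   ...then join() each thread

results = sorted(outq.get() for _ in range(6))
```

Output can be drained from `outq` concurrently at any point; the join() only matters once production is finished.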

Geoff Gilmour-Taylor

(I ended up using this recipe in my own code, but with a completely
different stopping mechanism---I'm using the worker threads to control
subprocesses; I want to terminate the subprocesses but keep the worker
threads running---and a callback rather than an output queue.)


Re: Producer-consumer threading problem

2008-06-11 Thread Jean-Paul Calderone

On Tue, 10 Jun 2008 22:46:37 -0700 (PDT), George Sakkis [EMAIL PROTECTED] 
wrote:

On Jun 10, 11:47 pm, Larry Bates [EMAIL PROTECTED] wrote:


I had a little trouble understanding what exact problem it is that you are
trying to solve but I'm pretty sure that you can do it with one of two methods:


Ok, let me try again with a different example: I want to do what can
be easily done with 2.5 Queues using Queue.task_done()/Queue.join()
(see example at http://docs.python.org/lib/QueueObjects.html), but
instead of  having to first put all items and then wait until all are
done, get each item as soon as it is done.


1) Write the producer as a generator using yield method that yields a result
every time it is called (something like os.walk does).  I guess you could yield
None if there wasn't anything to consume to prevent blocking.


Actually the way items are generated is not part of the problem; it
can be abstracted away as an arbitrary iterable input. As with all
iterables, there are no more items is communicated simply by a
StopIteration.


2) Usw somethink like Twisted insted that uses callbacks instead to handle
multiple asynchronous calls to produce.  You could have callbacks that don't do
anything if there is nothing to consume (sort of null objects I guess).


Twisted is interesting and very powerful but requires a different way
of thinking about the problem and designing a solution. More to the
point, callbacks often provide a less flexible and simple API than an
iterator that yields results (consumed items). For example, say that
you want to store the results to a dictionary. Using callbacks, you
would have to explicitly synchronize each access to the dictionary
since they may fire independently.


This isn't true.  Access is synchronized automatically by virtue of the
fact that there is no pre-emptive multithreading in operation.  This is
one of the many advantages of avoiding threads. :)  There may be valid
arguments against callbacks, but this isn't one of them.

Jean-Paul


Re: Producer-consumer threading problem

2008-06-11 Thread Rhamphoryncus
On Jun 11, 6:00 am, George Sakkis [EMAIL PROTECTED] wrote:
 On Jun 11, 1:59 am, Rhamphoryncus [EMAIL PROTECTED] wrote:

  Why not use a normal Queue, put a dummy value (such as None) in when
  you're producer has finished, and have the main thread use the normal
  Thread.join() method on all your child threads?

 I just gave two reasons:
 - Concurrency / interactivity. The main thread shouldn't wait for all
 one million items to be produced to get to see even one of them.

Then don't wait.  The main thread can easily do other work while the
producer and consumer threads go about their business.


 - Limiting resources. Just like iterating over the lines of a file is
 more memory efficient than reading the whole file in memory, getting
 each consumed item as it becomes available is more memory efficient
 than waiting for all of them to finish.

That's why you give Queue a maxsize.  Put it at maybe 5 or 10.  Enough
that the producer can operate in a burst (usually more efficient than
switching threads after each item), then the consumer can grab them
all in a burst.

Then again, you may find it easier to use an event-driven architecture
(like Twisted, as others have suggested.)
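A minimal illustration of the bounded-queue point above (numbers illustrative): with `maxsize=5` the producer blocks once five items are waiting, so memory use stays capped no matter how many items there are in total.

```python
import queue
import threading

q = queue.Queue(maxsize=5)   # producer blocks once 5 items are waiting

def producer():
    for i in range(20):
        q.put(i)             # blocks while the buffer is full
    q.put(None)              # sentinel: production finished

threading.Thread(target=producer).start()

total = 0
while True:
    item = q.get()
    if item is None:
        break
    total += item            # consume items in bursts as they arrive
```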


Re: Producer-consumer threading problem

2008-06-11 Thread Carl Banks
On Jun 10, 11:33 pm, George Sakkis [EMAIL PROTECTED] wrote:
 I pasted my current solution at http://codepad.org/FXF2SWmg. Any
 feedback, especially if it has to do with proving or disproving its
 correctness, will be appreciated.


It seems like you're reinventing the wheel.  The Queue class does all
this, and it's been thoroughly battle-tested.

So first of all, can you tell us why the following wouldn't work?  It
might help us understand the issue you're facing (never mind the
produce and consume arguments for now--I'll cover that below).


def iter_consumed(items):
    q = Queue.Queue()
    sentinel = object()
    def produce_all():
        for item in items:
            q.put(item)
        q.put(sentinel)
    producer = threading.Thread(target=produce_all)
    producer.start()
    try:
        while True:
            item = q.get()
            if item is sentinel:
                return
            yield item
    finally:
        # for robustness, notify producer to wrap things up
        # left as exercise
        producer.join()


If you want to customize the effect of getting and putting, you can
subclass Queue and override the _get and _put methods (however, last
time I checked, the Queue class expects _put to always add an item to
the internal sequence representing the queue--not necessarily to the
top--and _get to always remove an item--not necessarily from the
bottom).

However, even that's only necessary if you want to get items in a
different order than you put them.  If you just want to filter items
as they're produced or consumed, you should simply define
produce_filter and consume_filter, which are called before q.put and
after q.get, respectively.


One issue from your function.  This line:

done_remaining[1] += 1

is not atomic, but your code depends on it being so.  It can get out
of sync if there is an intervening thread switch between the read and
the write.  This was discussed on the list a while back.  I posted an atomic
counter object in that thread (it was written in C--no other way) for
which the += is atomic.  Otherwise you have to use a lock.
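A sketch of the lock-based alternative in pure Python (names hypothetical): the read-modify-write runs while holding a Lock, so concurrent increments are never lost.

```python
import threading

class Counter:
    # += on a shared int is a read-modify-write, so it runs under a lock
    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def incr(self):
        with self._lock:
            self.value += 1

c = Counter()

def bump(n):
    for _ in range(n):
        c.incr()

threads = [threading.Thread(target=bump, args=(1000,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# c.value is now exactly 8 * 1000, with no lost updates
```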


Carl Banks


Re: Producer-consumer threading problem

2008-06-11 Thread Aahz
In article [EMAIL PROTECTED],
George Sakkis  [EMAIL PROTECTED] wrote:

I'd like some feedback on a solution to a variant of the producer-
consumer problem. My first few attempts turned out to deadlock
occasionally; this one seems to be deadlock-free so far but I can't
tell if it's provably correct, and if so, whether it can be
simplified.

Take a look at the threading tutorial on my web page, specifically the
threadpool spider.
-- 
Aahz ([EMAIL PROTECTED])   * http://www.pythoncraft.com/

as long as we like the same operating system, things are cool. --piranha


Re: Producer-consumer threading problem

2008-06-11 Thread George Sakkis
On Jun 11, 10:13 am, Jean-Paul Calderone [EMAIL PROTECTED] wrote:
 On Tue, 10 Jun 2008 22:46:37 -0700 (PDT), George Sakkis [EMAIL PROTECTED] 
 wrote:
 On Jun 10, 11:47 pm, Larry Bates [EMAIL PROTECTED] wrote:

  I had a little trouble understanding what exact problem it is that you are
  trying to solve but I'm pretty sure that you can do it with one of two 
  methods:

 Ok, let me try again with a different example: I want to do what can
 be easily done with 2.5 Queues using Queue.task_done()/Queue.join()
 (see example at http://docs.python.org/lib/QueueObjects.html), but
 instead of  having to first put all items and then wait until all are
 done, get each item as soon as it is done.

  1) Write the producer as a generator using yield method that yields a 
  result
  every time it is called (something like os.walk does).  I guess you could 
  yield
  None if there wasn't anything to consume to prevent blocking.

 Actually the way items are generated is not part of the problem; it
 can be abstracted away as an arbitrary iterable input. As with all
 iterables, there are no more items is communicated simply by a
 StopIteration.

  2) Usw somethink like Twisted insted that uses callbacks instead to handle
  multiple asynchronous calls to produce.  You could have callbacks that 
  don't do
  anything if there is nothing to consume (sort of null objects I guess).

 Twisted is interesting and very powerful but requires a different way
 of thinking about the problem and designing a solution. More to the
 point, callbacks often provide a less flexible and simple API than an
 iterator that yields results (consumed items). For example, say that
 you want to store the results to a dictionary. Using callbacks, you
 would have to explicitly synchronize each access to the dictionary
 since they may fire independently.

 This isn't true.  Access is synchronized automatically by virtue of the
 fact that there is no pre-emptive multithreading in operation.  This is
 one of the many advantages of avoiding threads. :)  There may be valid
 arguments against callbacks, but this isn't one of them.

 Jean-Paul

Thanks for the correction; that's an important advantage for callbacks
in this case.

George


Producer-consumer threading problem

2008-06-10 Thread George Sakkis
I'd like some feedback on a solution to a variant of the producer-
consumer problem. My first few attempts turned out to deadlock
occasionally; this one seems to be deadlock-free so far but I can't
tell if it's provably correct, and if so, whether it can be
simplified.

The generic producer-consumer situation with unlimited buffer capacity
is illustrated at http://docs.python.org/lib/condition-objects.html.
That approach assumes that the producer will keep producing items
indefinitely, otherwise the consumer ends up waiting forever. The
extension to the problem I am considering requires the consumer to be
notified not only when there is a new produced item, but also when
there is not going to be a new item so that it stops waiting. More
specifically, I want a generator (or iterator class) with the
following generic signature:

def iter_consumed(items, produce, consume):
    '''Return an iterator over the consumed items.

    :param items: An iterable of objects to be `produce`()d and
        `consume`()d.

    :param produce: A callable `f(item)` that produces a single item;
        the return value is ignored. What produce exactly means is
        application-specific.

    :param consume: A callable `f()` that consumes a previously
        produced item and returns the consumed item. What consume
        exactly means is application-specific. The only assumption is
        that if `produce` is called `N` times, then the next `N` calls
        to `consume` will (eventually, though not necessarily
        immediately) return, i.e. they will not block indefinitely.
    '''

One straightforward approach would be to serialize the problem: first
produce all `N` items and then call consume() exactly N times.
Although this makes the solution trivial, there are at least two
shortcomings. First, the client may have to wait too long for the
first item to arrive. Second, each call to produce() typically
requires resources for the produced task, so the maximum resource
requirement can be arbitrarily high as `N` increases. Therefore
produce() and consume() should run concurrently, with the invariant
that the number of calls to consume never exceeds the number of calls
to produce. Also, after `N` calls to produce and consume, neither
should be left waiting.
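A hedged sketch of one way to meet this signature (this is not the pasted codepad solution, and correctness under all interleavings is not claimed): a condition variable counts produced items, so consume() is never called more times than produce(), and a done flag replaces the sentinel.

```python
import threading

def iter_consumed(items, produce, consume):
    cond = threading.Condition()
    state = {"produced": 0, "done": False}

    def produce_all():
        try:
            for item in items:
                produce(item)
                with cond:
                    state["produced"] += 1
                    cond.notify()
        finally:
            with cond:
                state["done"] = True   # no sentinel item needed
                cond.notify()

    threading.Thread(target=produce_all).start()

    consumed = 0
    while True:
        with cond:
            # wait until an unconsumed item exists, or production is over
            while consumed >= state["produced"] and not state["done"]:
                cond.wait()
            if consumed >= state["produced"] and state["done"]:
                return
        yield consume()   # invariant: consume() calls <= produce() calls
        consumed += 1
```

For instance, with a Queue as the transport, `iter_consumed(range(5), q.put, q.get)` yields each item as soon as it is available rather than after all five are produced.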

I pasted my current solution at http://codepad.org/FXF2SWmg. Any
feedback, especially if it has to do with proving or disproving its
correctness, will be appreciated.

George


Re: Producer-consumer threading problem

2008-06-10 Thread Larry Bates

George Sakkis wrote:

I'd like some feedback on a solution to a variant of the producer-
consumer problem. My first few attempts turned out to deadlock
occasionally; this one seems to be deadlock-free so far but I can't
tell if it's provably correct, and if so, whether it can be
simplified.

The generic producer-consumer situation with unlimited buffer capacity
is illustrated at http://docs.python.org/lib/condition-objects.html.
That approach assumes that the producer will keep producing items
indefinitely, otherwise the consumer ends up waiting forever. The
extension to the problem I am considering requires the consumer to be
notified not only when there is a new produced item, but also when
there is not going to be a new item so that it stops waiting. More
specifically, I want a generator (or iterator class) with the
following generic signature:

def iter_consumed(items, produce, consume):
    '''Return an iterator over the consumed items.

    :param items: An iterable of objects to be `produce`()d and
        `consume`()d.

    :param produce: A callable `f(item)` that produces a single item;
        the return value is ignored. What produce exactly means is
        application-specific.

    :param consume: A callable `f()` that consumes a previously
        produced item and returns the consumed item. What consume
        exactly means is application-specific. The only assumption is
        that if `produce` is called `N` times, then the next `N` calls
        to `consume` will (eventually, though not necessarily
        immediately) return, i.e. they will not block indefinitely.
    '''

One straightforward approach would be to serialize the problem: first
produce all `N` items and then call consume() exactly N times.
Although this makes the solution trivial, there are at least two
shortcomings. First, the client may have to wait too long for the
first item to arrive. Second, each call to produce() typically
requires resources for the produced task, so the maximum resource
requirement can be arbitrarily high as `N` increases. Therefore
produce() and consume() should run concurrently, with the invariant
that the calls to consume are no more than the calls to produce. Also,
after `N` calls to produce and consume, neither should be left
waiting.

I pasted my current solution at http://codepad.org/FXF2SWmg. Any
feedback, especially if it has to do with proving or disproving its
correctness, will be appreciated.

George
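One possible shape for such an iter_consumed is sketched below with a counting semaphore; this is an illustration only, not the codepad solution referenced above, and the bookkeeping details are assumptions. The producer thread releases one permit per produce() call plus one final shutdown permit, so the consumer never calls consume() more times than produce() was called and neither side is left waiting after N items.

```python
import threading

def iter_consumed(items, produce, consume):
    """Yield consumed items as they complete; produce() runs concurrently."""
    ready = threading.Semaphore(0)       # one permit per successful produce()
    state = {"produced": 0, "done": False}

    def producer():
        for item in items:
            produce(item)
            state["produced"] += 1
            ready.release()
        state["done"] = True
        ready.release()                  # final wake-up so the consumer can exit

    threading.Thread(target=producer, daemon=True).start()
    consumed = 0
    while True:
        ready.acquire()                  # never consume more than was produced
        if state["done"] and consumed == state["produced"]:
            return
        yield consume()
        consumed += 1
```

Total releases equal total acquires (N item permits plus one shutdown permit), which is what keeps the two sides from deadlocking on an empty or exhausted input.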



Re: Producer-consumer threading problem

2008-06-10 Thread Larry Bates

I had a little trouble understanding what exact problem it is that you are 
trying to solve but I'm pretty sure that you can do it with one of two methods:


1) Write the producer as a generator using yield that yields a result 
every time it is called (something like os.walk does).  I guess you could 
yield None if there wasn't anything to consume, to prevent blocking.


2) Use something like Twisted instead, which uses callbacks to handle 
multiple asynchronous calls to produce.  You could have callbacks that 
don't do anything if there is nothing to consume (sort of null objects, I 
guess).


I don't know if either of these help or not.

-Larry
--
http://mail.python.org/mailman/listinfo/python-list


Re: Producer-consumer threading problem

2008-06-10 Thread George Sakkis
On Jun 10, 11:47 pm, Larry Bates [EMAIL PROTECTED] wrote:

 I had a little trouble understanding what exact problem it is that you are
 trying to solve but I'm pretty sure that you can do it with one of two 
 methods:

Ok, let me try again with a different example: I want to do what can
be easily done with 2.5 Queues using Queue.task_done()/Queue.join()
(see example at http://docs.python.org/lib/QueueObjects.html), but
instead of  having to first put all items and then wait until all are
done, get each item as soon as it is done.

 1) Write the producer as a generator using yield that yields a result
 every time it is called (something like os.walk does).  I guess you could
 yield None if there wasn't anything to consume, to prevent blocking.

Actually the way items are generated is not part of the problem; it
can be abstracted away as an arbitrary iterable input. As with all
iterables, there are no more items is communicated simply by a
StopIteration.

 2) Use something like Twisted instead, which uses callbacks to handle
 multiple asynchronous calls to produce.  You could have callbacks that
 don't do anything if there is nothing to consume (sort of null objects, I
 guess).

Twisted is interesting and very powerful but requires a different way
of thinking about the problem and designing a solution. More to the
point, callbacks often provide a less flexible and simple API than an
iterator that yields results (consumed items). For example, say that
you want to store the results to a dictionary. Using callbacks, you
would have to explicitly synchronize each access to the dictionary
since they may fire independently. OTOH an iterator by definition
yields items sequentially, so the client doesn't have to bother with
synchronization. Note that with client I mean the user of an API/
framework/library; the implementation of the library itself may of
course use callbacks under the hood (e.g. to put incoming results to a
Queue) but expose the API as a simple iterator.

George


Multi Threading Problem with Python + Django + PostgreSQL.

2008-03-31 Thread Pradip
Hello everybody. I am new to this forum and also to Python.
I have read many things about multi-threading in python, but I am still
having a problem.

I am using Django Framework with Python having PostgreSQL as backend
database with Linux OS. My applications are long running. I am using
threading.
The problem I am facing is that the connections that are being created
for database (postgres) updates are not getting closed, even though my
threads have returned and updated the database successfully. It is not
that the connections are not being reused. They are being reused, but
after some time a new one is created. This way it creates too many
connections, eventually exceeding the MAX_CONNECTION limit of the
postgres conf.

** I am using psycopg2 as adaptor for python to postgres connection.
which itself handles the connections(open/close)

Now the problem is with Django / Python / psycopg2 or any thing else??

HELP ME OUT!


Re: Multi Threading Problem with Python + Django + PostgreSQL.

2008-03-31 Thread Nikita the Spider
In article 
[EMAIL PROTECTED],
 Pradip [EMAIL PROTECTED] wrote:

 Hello every body. I am new to this forum and also in Python.
 Read many things about multi threading in python. But still having
 problem.
 
 I am using Django Framework with Python having PostgreSQL as backend
 database with Linux OS. My applications are long running. I am using
 threading.
 The problem I am facing is that the connections that are being created
 for database(postgres) update are not getting closed even though my
 threads had returned and updated database successfully. It is not like
 that the connections are not being reused. They r being reused but
 after sometime new one is created. Like this it creates too many
 connections and hence exceeding MAX_CONNECTION limit of postgres conf.
 
 ** I am using psycopg2 as adaptor for python to postgres connection.
 which itself handles the connections(open/close)

Hi Pradip,
A common problem that new users of Python encounter is that they expect 
database statements to COMMIT automatically. Psycopg2 follows the Python 
DB-API specification and does not autocommit transactions unless you ask 
it to do so. Perhaps your connections are not closing because they have 
open transactions? 

To enable autocommit, call this on your connection object:
connection.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)

 Now the problem is with Django / Python / psycopg2 or any thing else??

Are you asking if there are bugs in this code that are responsible for 
your persistent connections? If so, then I'd say the answer is almost 
certainly no. Of course it's possible, but Django/Psycopg/Postgres is a 
pretty popular stack. The odds that there's a major bug in this popular 
code examined by many eyes, versus a bug in your code, are pretty low, I 
think. Don't take it personally; the same applies to me and my code. 
=)

Happy debugging

-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more


threading problem..

2007-10-25 Thread Abandoned
Hi..
I want to use threading but I have an interesting error..
==
class SelectAll(threading.Thread):
    def __init__(self, name):
        threading.Thread.__init__(self)
        self.name = name  # kelime

    def run(self):
        self.result = ...

nglist = []
current = SelectAll(name)
nglist.append(current)
current.start()
for aaa in nglist:
    aaa.join()

=

and it gives me this error:
aaa.join()

AttributeError: 'dict' object has no attribute 'join'


What is the problem? I can't understand this error :(



Re: threading problem..

2007-10-25 Thread Diez B. Roggisch
Abandoned schrieb:
 Hi..
 I want to threading but i have a interesting error..
 ==
 class SelectAll(threading.Thread):
def __init__(self, name):
   threading.Thread.__init__(self)
   self.name = name #kelime
 
def run(self):
 
 
 self.result=...
 nglist=[]
 current = SelectAll(name)
 nglist.append(current)
 current.start()
  for aaa in nglist:
   aaa.join()
 
 =
 
 and it gives me this error:
 aaa.join()
 
 AttributeError: 'dict' object has no attribute 'join'
 
 
 What is the problem i can't understand this error :(

You have an attribute 'join' on your SelectAll objects that shadows 
Thread.join.

Diez



Re: threading problem..

2007-10-25 Thread John Nagle
Tried that on Python 2.5 on Windows and it worked.

John Nagle

Abandoned wrote:
 Hi..
 I want to threading but i have a interesting error..
 ==
 class SelectAll(threading.Thread):
def __init__(self, name):
   threading.Thread.__init__(self)
   self.name = name #kelime
 
def run(self):
 
 
 self.result=...
 nglist=[]
 current = SelectAll(name)
 nglist.append(current)
 current.start()
  for aaa in nglist:
   aaa.join()
 
 =
 
 and it gives me this error:
 aaa.join()
 
 AttributeError: 'dict' object has no attribute 'join'
 
 
 What is the problem i can't understand this error :(
 


Re: Threading problem when many sockets open

2007-08-25 Thread Lawrence D'Oliveiro
In message [EMAIL PROTECTED], Philip
Zigoris wrote:

 ... and the server
 would go into a state where the master thread repeatedly polled the
 socket and printed an error.

Is that because it cannot create a new socket to accept the connection on?


Threading problem when many sockets open

2007-08-11 Thread Philip Zigoris
Hi all,

I have written a socket based service in python and under fairly heavy
traffic it performs really well.  But I have encountered the following
problem: when the system runs out of file descriptors, it seems to
stop switching control between threads.

Here is some more detail:

The system has n+2 threads, where n is usually around 10.  This was
implemented using the 'threading' and 'socket' modules in the python
2.5 standard library.

-- The master thread accepts new socket connections and then
enqueues the connection on a Queue (from the standard library).

-- There are n handler threads that pop a connection off of the
queue, read some number of bytes (~10), do some processing and then
send ~100 bytes back over the connection and close it.

-- The last thread is just a logging thread that has a queue of
messages that it writes to either stdout or a file.

Under pretty substantial load, the processing is quick enough that the
connections do not pile up very quickly.  But, in some cases they do.
And what I found was that as soon as the size of the Queue of
connections reached a high enough number (and it was always the same),
all of the processing seemed to stay in the master thread.  I
created a siege client that opened up 1000+ connections and the server
would go into a state where the master thread repeatedly polled the
socket and printed an error.  The Queue size stayed fixed at 997 even
if I shut down all of the connections on the client side.  Under
normal operating conditions, those connections would be detected as
broken in the handler thread and a message would be logged.  So it
seems that the handler threads aren't doing anything. (And they are
still alive; I check that in the master thread.)

OK.  I hope all of this is clear.  Currently, I've solved the problem
by putting a cap on the queue size and i haven't seen the problem
reoccur.  But it would be nice to understand exactly what went wrong.

Thanks in advance.

-- 
--
Philip Zigoris I SPOCK I 650.366.1165
Spock is Hiring!
www.spock.com/jobs


Re: Threading problem

2006-04-18 Thread Diez B. Roggisch
 Here the error message:
 Exception in thread Thread-1:
 Traceback (most recent call last):
   File "C:\Program Files\Python\lib\threading.py", line 442, in __bootstrap
     self.run()
   File "G:\Robot teleskop\VRT\test\test2.py", line 25, in run
     Document.OpenFile('F:/Images/VRT/'+name)
   File "C:\Program Files\Python\Lib\site-packages\win32com\client\dynamic.py", line 496, in __getattr__
     raise AttributeError, "%s.%s" % (self._username_, attr)
 AttributeError: MaxIm.Document.OpenFile

The fact that you think it should work doesn't impress the COM object of
type MaxIm.Document very much. Play around with it without threads first
to check whether it works at all.

Diez


Threading problem

2006-04-17 Thread Aleksandar Cikota
Hi all,

I have a problem with threading. The following part should be running in the main program all the time, but so that the main program also works (like 2 separate programs, but in one).
How can I integrate the code part into the main program, so that the main program keeps working?

Code:

import win32com.client
import time
import os
import threading

Document = win32com.client.Dispatch('MaxIm.Document')
Application = win32com.client.Dispatch('MaxIm.Application')
p = win32com.client.dynamic.Dispatch('PinPoint.Plate')

class TestThread(threading.Thread):
    path_to_watch = "F:/Images/VRT/"
    before = dict([(f, None) for f in os.listdir(path_to_watch)])
    while 1:
        time.sleep(2)
        after2 = dict([(f, None) for f in os.listdir(path_to_watch)])
        added = [f for f in after2 if not f in before]
        if added:
            name = ' ,'.join(added)
            if str(name[-3:]) == 'fit':
                Document.OpenFile('F:/Images/VRT/' + name)
                Document.SaveFile('F:/Images/VRT/' + str(name[0:-4]) + '.jpg', 6, 1024, 2)
                Application.CloseAll()
            try:
                p.AttachFITS('F:/Images/VRT/' + name)
                p.ArcsecPerPixelHoriz = -1.7
                p.ArcsecPerPixelVert = -1.7
                p.MaxMatchResidual = 1.0
                p.FitOrder = 3
                p.CentroidAlgorithm = 0
                p.RightAscension = p.TargetRightAscension
                p.Declination = p.TargetDeclination
                p.Catalog = 0  # GSC
                p.CatalogPath = 'F:/'
                p.ProjectionType = 1
                p.Solve()
                p.DetachFITS()

                pRA = p.RightAscension
                pDec = p.Declination
                print pRA
                print pDec

            except:
                p.DetachFITS()
                print 'Error'
            before = after2

TestThread().start()

For your prompt reply, I say thank you in advance.
Best regards,
Aleksandar

Re: Threading problem

2006-04-17 Thread Faber
Aleksandar Cikota wrote:

 How to integrate the Code-part in the main programm, so that the
 mainprogramm works?
 
 Code:
 
 import win32com.client
 import time
 import os
 import threading
 
 Document = win32com.client.Dispatch('MaxIm.Document')
 Application = win32com.client.Dispatch('MaxIm.Application')
 p = win32com.client.dynamic.Dispatch('PinPoint.Plate')
 
 class TestThread ( threading.Thread ):
 path_to_watch = "F:/Images/VRT/"

def run(self):
# Put the following code in the run method

 before = dict ([(f, None) for f in os.listdir (path_to_watch)])
 while 1:

[cut]

 TestThread().start()

This should work

-- 
Faber
http://faberbox.com/
http://smarking.com/

The man who trades freedom for security does not deserve nor will he ever
receive either. -- Benjamin Franklin


Re: Threading problem

2006-04-17 Thread Aleksandar Cikota
Thank You, but now it cannot open a file, but it should work...

Here the error message:
Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\Program Files\Python\lib\threading.py", line 442, in __bootstrap
    self.run()
  File "G:\Robot teleskop\VRT\test\test2.py", line 25, in run
    Document.OpenFile('F:/Images/VRT/'+name)
  File "C:\Program Files\Python\Lib\site-packages\win32com\client\dynamic.py", line 496, in __getattr__
    raise AttributeError, "%s.%s" % (self._username_, attr)
AttributeError: MaxIm.Document.OpenFile



And here the Code:

import win32com.client
import time
import os
import threading

Document = win32com.client.Dispatch('MaxIm.Document')
Application = win32com.client.Dispatch('MaxIm.Application')
p = win32com.client.dynamic.Dispatch('PinPoint.Plate')

class TestThread (threading.Thread):
def run (self):
  path_to_watch = "F:/Images/VRT/"
  before = dict ([(f, None) for f in os.listdir (path_to_watch)])
  while 1:
time.sleep(2)
after2 = dict ([(f, None) for f in os.listdir (path_to_watch)])
added = [f for f in after2 if not f in before]
if added:
name= ' ,'.join (added)
if str(name[-3:])=='fit':
Document.OpenFile('F:/Images/VRT/'+name)
                Document.SaveFile('F:/Images/VRT/' + str(name[0:-4]) + '.jpg', 6, 1024, 2)
Application.CloseAll()
try:
p.AttachFITS('F:/Images/VRT/'+name)
p.ArcsecPerPixelHoriz = -1.7
p.ArcsecPerPixelVert = -1.7
p.MaxMatchResidual = 1.0
p.FitOrder = 3
p.CentroidAlgorithm = 0
p.RightAscension = p.TargetRightAscension
p.Declination = p.TargetDeclination
p.Catalog = 0  # GSC
p.CatalogPath = 'F:/'
p.ProjectionType = 1 #
p.Solve()
p.DetachFITS()
pRA = p.RightAscension
pDec = p.Declination
print pRA
print pDec
except:
p.DetachFITS()
print 'Error'
before = after2
TestThread().start()




raise AttributeError, "%s.%s" % (self._username_, attr), what does it mean?


For your prompt reply, I say thank you in advance.

Best regards,
Aleksandar




Faber [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 Aleksandar Cikota wrote:

 How to integrate the Code-part in the main programm, so that the
 mainprogramm works?

 Code:

 import win32com.client
 import time
 import os
 import threading

 Document = win32com.client.Dispatch('MaxIm.Document')
 Application = win32com.client.Dispatch('MaxIm.Application')
 p = win32com.client.dynamic.Dispatch('PinPoint.Plate')

 class TestThread ( threading.Thread ):
 path_to_watch = "F:/Images/VRT/"

def run(self):
# Put the following code in the run method

 before = dict ([(f, None) for f in os.listdir (path_to_watch)])
 while 1:

 [cut]

 TestThread().start()

 This should work

 -- 
 Faber
 http://faberbox.com/
 http://smarking.com/

 The man who trades freedom for security does not deserve nor will he ever
 receive either. -- Benjamin Franklin 




Re: Threading Problem

2004-12-22 Thread Steve Holden
Norbert wrote:
Hello *,
i am experimenting with threads and get puzzling results.
Consider the following example:
#
import threading, time
def threadfunction():
.print "threadfunction: entered"
.x = 10
.while x < 40:
.time.sleep(1) # time unit is seconds
.print "threadfunction x=%d" % x
.x += 10

print "start"
th = threading.Thread(target = threadfunction())
th.start()
print "start completed"
#
(the dots are inserted because Google mangles the lines otherwise)
This program gives the following result :

start
threadfunction: entered
threadfunction x=10
threadfunction x=20
threadfunction x=30
start completed

My aim was that the main program should continue to run while
threadfunction runs in parallel. That's the point of threads after all,
isn't it ?
I expected something like the following :
start
threadfunction: entered
start completed   <---
threadfunction x=10
threadfunction x=20
threadfunction x=30
Does anyone know what's going on here ?
Well, I don't believe there's any guarantee that a thread will get run 
preference over its starter - they're both threads, after all. Try 
putting a sleep after th.start() and before the print statement and you 
should see that the worker thread runs while the main thread sleeps.

The same would be true if each were making OS calls and so on. When 
everything is more or less continuous computation, and so short it can be 
over in microseconds, there's no motivation for the scheduler to stop 
one thread and start another.

regards
 Steve
--
Steve Holden   http://www.holdenweb.com/
Python Web Programming  http://pydish.holdenweb.com/
Holden Web LLC  +1 703 861 4237  +1 800 494 3119


Re: Threading Problem

2004-12-22 Thread Alan Kennedy
[Norbert]
 i am experimenting with threads and get puzzling results.
 Consider the following example:
 #
 import threading, time

 def threadfunction():
 print "threadfunction: entered"
 x = 10
 while x < 40:
 time.sleep(1) # time unit is seconds
 print "threadfunction x=%d" % x
 x += 10



 print "start"
 th = threading.Thread(target = threadfunction())
The problem is here^^
You are *invoking* threadfunction, and passing its return value as the 
target, rather than passing the function itself as the target. That's 
why threadfunction's output appears in the output stream before the 
thread has even started.

Try this instead
#---
import threading, time

def threadfunction():
    print "threadfunction: entered"
    x = 10
    while x < 40:
        time.sleep(1) # time unit is seconds
        print "threadfunction x=%d" % x
        x += 10

print "start"
th = threading.Thread(target = threadfunction)
th.start()
print "start completed"
#
Which should output the expected

start
threadfunction: entered
start completed
threadfunction x=10
threadfunction x=20
threadfunction x=30
regards,
--
alan kennedy
--
email alan:  http://xhaus.com/contact/alan


Re: Threading Problem

2004-12-22 Thread Norbert
Thanks a lot, Steve, for your fast reply.
But the behaviour is the same if 'threadfunction' sleeps longer than
just 1 second. 'threadfunction' is of course a dummy to show the
problem, imagine a longrunning background-task.

If you are right, the question remains 'How can I assure that the
starting function finishes, while the other thread still runs ?' .  As
I said, this is the purpose of threading.

Thanks again
Norbert



Re: Threading Problem

2004-12-22 Thread Norbert
Thanks Alan,
i hoped it would be something trivial :)

Norbert



Re: Threading Problem

2004-12-22 Thread Fredrik Lundh
Steve Holden wrote:

 Well, I don't believe there's any guarantee that a thread will get run
 preference over its starter - they're both threads, after all. Try
 putting a sleep after th.start() and before the print statement and you
 should see that the worker thread runs while the main thread sleeps.

that's correct, but the threading module does a 0.01-second sleep
to get around this, no matter what thread scheduler you're using.

if you're building threads on top of the lower-level thread api, you have
to do that yourself.

/F 





Re: threading problem

2004-12-10 Thread Egor Bolonev
On Fri, 10 Dec 2004 00:12:16 GMT, Steven Bethard  
[EMAIL PROTECTED] wrote:

I think if you change the call to look like:
threading.Thread(target=run, args=(os.path.join('c:\\', path),)).start()
oh i see now. thanks
>>> s = 'sdgdfgdfg'
>>> s == (s)
True
>>> s == (s,)
False
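The one-element-tuple point is the classic gotcha with Thread's args parameter; a tiny self-contained illustration (the path value here is made up):

```python
import threading

results = []

def run(path):
    results.append(path)

# args must be a tuple: (x) is just x in parentheses, while (x,) is a
# one-element tuple, hence the trailing comma.
t = threading.Thread(target=run, args=('c:\\photos',))
t.start()
t.join()
```

Without the comma, threading would try to iterate over the string itself and pass each character as a separate argument.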



Re: A little threading problem

2004-12-02 Thread Alban Hertroys
Jeremy Jones wrote:
(not waiting, because it already did happen).  What is it exactly that 
you are trying to accomplish?  I'm sure there is a better approach.
I think I saw at least a bit of the light, reading up on readers and 
writers (A colleague showed up with a book called Operating system 
concepts that has a chapter on process synchronization).
It looks like I should be writing and reading 3 Queues instead of trying 
to halt and pause the threads explicitly. That looks a lot easier...

Thanks for pointing out the problem area.


Re: A little threading problem

2004-12-02 Thread Jeremy Jones
Alban Hertroys wrote:
Jeremy Jones wrote:
(not waiting, because it already did happen).  What is it exactly 
that you are trying to accomplish?  I'm sure there is a better approach.

I think I saw at least a bit of the light, reading up on readers and 
writers (A colleague showed up with a book called Operating system 
concepts that has a chapter on process synchronization).
It looks like I should be writing and reading 3 Queues instead of 
trying to halt and pause the threads explicitly. That looks a lot 
easier...

Thanks for pointing out the problem area.
That's actually along the lines of what I was going to recommend after 
getting more detail on what you are doing.  A couple of things that may 
(or may not) help you are:

* the Queue class in the Python standard library has a maxsize 
parameter.  When you create a queue, you can specify how large you want 
it to grow.  You can have your three threads busily parsing XML and 
extracting data from it and putting it into a queue and when there are a 
total of maxsize items in the queue, the next put() call (to put data 
into the queue) will block until the consumer thread has reduced the 
number of items in the queue.  I've never used 
xml.parsers.xmlproc.xmlproc.Application, but looking at the data, it 
seems to resemble a SAX parser, so you should have no problem putting 
(potentially blocking) calls to the queue into your handler.  The only 
thing this really buys you is that you won't have read the whole XML file 
into memory.
* the get method on a queue object has a block flag.  You can 
effectively poll your queues something like this:

#untested code
#a_done, b_done and c_done are just checks to see if that particular
#document is done
while not (a_done and b_done and c_done):
    got_a, got_b, got_c = False, False, False
    item_a, item_b, item_c = None, None, None
    while (not a_done) and (not got_a):
        try:
            #the 0 says don't block and raise an Empty exception if
            #there's nothing there
            item_a = queue_a.get(0)
            got_a = True
        except Queue.Empty:
            time.sleep(.3)
    while (not b_done) and (not got_b):
        try:
            item_b = queue_b.get(0)
            got_b = True
        except Queue.Empty:
            time.sleep(.3)
    while (not c_done) and (not got_c):
        try:
            item_c = queue_c.get(0)
            got_c = True
        except Queue.Empty:
            time.sleep(.3)
    put_into_database_or_whatever(item_a, item_b, item_c)

This will allow you to deal with one item at a time, and if the xml files 
are different sizes it should still work - you'll just pass None to 
put_into_database_or_whatever for that particular file.

HTH.
Jeremy Jones


A little threading problem

2004-12-01 Thread Alban Hertroys
Hello all,
I need your wisdom again. I'm working on a multi-threaded application 
that handles multiple data sources in small batches each time. The idea 
is that there are 3 threads that run simultaneously, each reads a fixed 
number of records, and then they wait for each other. After that the main 
thread does some processing, and the threads are allowed to continue 
reading data.

I summarized this part of the application in the attached python script, 
which locks up rather early, for reasons that I don't understand (I 
don't have a computer science education), and I'm pretty sure the 
problem is related to what I'm trying to fix in my application. Can 
anybody explain what's happening (Or maybe even show me a better way of 
doing this)?

Regards,
Alban Hertroys,
MAG Productions.
import sys
import threading

class AThread(threading.Thread):
	def __init__(self, name, mainCond, allowedCond):
		self.counter	= 0
		self.name		= name
		self.mainCond	= mainCond
		self.condAllowed = allowedCond
		self.waitUntilRunning = threading.Condition()

		threading.Thread.__init__(self, None, None, name, [])

	def start(self):
		threading.Thread.start(self)

		# Let the main thread wait until this thread is ready to accept Notify
		# events.
		self.waitUntilRunning.acquire()
		self.waitUntilRunning.wait()
		self.waitUntilRunning.release()

	def run(self):
		threading.Thread.run(self)

		# Print numbers 1 - 25
		while self.counter < 25:
			self.condAllowed.acquire()

			# Tell the main thread that we're ready to receive Notifies
			self.waitUntilRunning.acquire()
			self.waitUntilRunning.notify()
			print "Running"
			self.waitUntilRunning.release()

			# Wait for a Notify from the main thread
			print "Wait"
			self.condAllowed.wait()
			self.condAllowed.release()

			self.counter += 1

			print "Thread %s: counter = %d" % (self.name, self.counter)


			# Tell the main thread that a thread has reached the end of the loop
			self.mainCond.acquire()
			self.mainCond.notify()
			self.mainCond.release()

class Main(object):
	def __init__(self):
		self.condWait = threading.Condition()
		self.condAllowed = threading.Condition()

		self.threads = [
			AThread('A', self.condWait, self.condAllowed),
			AThread('B', self.condWait, self.condAllowed),
			AThread('C', self.condWait, self.condAllowed),
		]

		# Start the threads
		for thread in self.threads:
			thread.start()

		while True:
			# Allow the threads to run another iteration
			self.condAllowed.acquire()
			print "Notify"
			self.condAllowed.notifyAll()
			self.condAllowed.release()

			# Wait until all threads reached the end of their loop
			for thread in self.threads:
				self.condWait.acquire()
				self.condWait.wait()
				self.condWait.release()


main = Main()


Re: A little threading problem

2004-12-01 Thread Jeremy Jones




Alban Hertroys wrote:
 Hello all,

 I need your wisdom again. I'm working on a multi-threaded application
 that handles multiple data sources in small batches each time. The idea
 is that there are 3 threads that run simultaneously, each reads a fixed
 number of records, and then they wait for each other. After that the
 main thread does some processing, and the threads are allowed to
 continue reading data.

 I summarized this part of the application in the attached python
 script, which locks up rather early, for reasons that I don't
 understand (I don't have a computer science education), and I'm pretty
 sure the problem is related to what I'm trying to fix in my
 application. Can anybody explain what's happening (Or maybe even show
 me a better way of doing this)?

 Regards,
 Alban Hertroys,
 MAG Productions.

import sys
import threading

class AThread(threading.Thread):
	def __init__(self, name, mainCond, allowedCond):
		self.counter	= 0
		self.name		= name
		self.mainCond	= mainCond
		self.condAllowed = allowedCond
		self.waitUntilRunning = threading.Condition()

		threading.Thread.__init__(self, None, None, name, [])

	def start(self):
		threading.Thread.start(self)

		# Let the main thread wait until this thread is ready to accept Notify
		# events.
		self.waitUntilRunning.acquire()
		self.waitUntilRunning.wait()
		self.waitUntilRunning.release()

	def run(self):
		threading.Thread.run(self)

		# Print numbers 1 - 25
		while self.counter < 25:
			self.condAllowed.acquire()

			# Tell the main thread that we're ready to receive Notifies
			self.waitUntilRunning.acquire()
			self.waitUntilRunning.notify()
			print "Running"
			self.waitUntilRunning.release()

			# Wait for a Notify from the main thread
			print "Wait"
			self.condAllowed.wait()
			self.condAllowed.release()

			self.counter += 1

			print "Thread %s: counter = %d" % (self.name, self.counter)


			# Tell the main thread that a thread has reached the end of the loop
			self.mainCond.acquire()
			self.mainCond.notify()
			self.mainCond.release()

class Main(object):
	def __init__(self):
		self.condWait = threading.Condition()
		self.condAllowed = threading.Condition()

		self.threads = [
			AThread('A', self.condWait, self.condAllowed),
			AThread('B', self.condWait, self.condAllowed),
			AThread('C', self.condWait, self.condAllowed),
		]

		# Start the threads
		for thread in self.threads:
			thread.start()

		while True:
			# Allow the threads to run another iteration
			self.condAllowed.acquire()
			print "Notify"
			self.condAllowed.notifyAll()
			self.condAllowed.release()

			# Wait until all threads reached the end of their loop
			for thread in self.threads:
				self.condWait.acquire()
				self.condWait.wait()
				self.condWait.release()


main = Main()

  

You've got a deadlock. I modified your script to add a print "T-%s" %
self.name before an acquire and after a release in the threads you spun
off (not in the main thread). Here is the output:

[EMAIL PROTECTED] threading]$ python tt.py
T-A: acquiring condAllowed
T-A: acquiring waitUntilRunning
T-A: Running
T-A: released waitUntilRunning
T-A: Wait
T-B: acquiring condAllowed
T-B: acquiring waitUntilRunning
T-B: Running
T-B: released waitUntilRunning
T-B: Wait
T-C: acquiring condAllowed
T-C: acquiring waitUntilRunning
T-C: Running
T-C: released waitUntilRunning
T-C: Wait
Notify
T-A: released condAllowed
T-A: counter = 1
T-A: acquiring mainCond
T-A: released mainCond
T-A: acquiring condAllowed
T-A: acquiring waitUntilRunning
T-A: Running
T-A: released waitUntilRunning
T-A: Wait
T-C: released condAllowed
T-C: counter = 1
T-C: acquiring mainCond
T-C: released mainCond
T-C: acquiring condAllowed
T-C: acquiring waitUntilRunning
T-C: Running
T-C: released waitUntilRunning
T-C: Wait
T-B: released condAllowed
T-B: counter = 1
T-B: acquiring mainCond
T-B: released mainCond
T-B: acquiring condAllowed
Notify   <--- Here is your problem
T-A: released condAllowed
T-A: counter = 2
T-A: acquiring mainCond
T-A: released mainCond
T-A: acquiring condAllowed
T-A: acquiring waitUntilRunning
T-A: Running
T-A: released waitUntilRunning
T-A: Wait
T-B: acquiring waitUntilRunning
T-B: Running
T-B: released waitUntilRunning
T-B: Wait
T-C: released condAllowed
T-C: counter = 2
T-C: acquiring mainCond
T-C: released mainCond
T-C: acquiring condAllowed
T-C: acquiring waitUntilRunning
T-C: Running
T-C: released waitUntilRunning
T-C: Wait

Notify is called before thread B (in this case) hits the
condAllowed.wait() piece of code. So, it sits at that wait() for
forever (because it doesn't get notified, because the notification
already happened), waiting to be notified from the main thread, and the
main thread is waiting on thread B (again, in this case) to call
mainCond.notify(). This approach is a deadlock just wanting to happen
(not waiting, because it already did happen). What is it exactly that
you are trying to accomplish? I'm sure there is a better approach.

Jeremy Jones
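A lost notify like the one diagnosed above is usually avoided by replacing bare Condition notifications with Queues, which buffer signals instead of dropping them when nobody happens to be waiting. A hypothetical lock-step rendezvous along those lines, in modern Python (all names invented, not code from the thread):

```python
import threading
import queue

def run_in_steps(worker_fn, num_workers=3, iterations=5):
    """Lock-step rounds: a put() into a Queue is never lost the way a
    Condition.notify() with no waiter is, so each round is a safe rendezvous."""
    go = [queue.Queue() for _ in range(num_workers)]   # main -> worker signals
    done = queue.Queue()                               # worker -> main signals

    def worker(i):
        for step in range(iterations):
            go[i].get()                  # wait for permission to run this step
            worker_fn(i, step)
            done.put(i)                  # report back; buffered, never dropped

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for t in threads:
        t.start()
    for _ in range(iterations):
        for q in go:
            q.put(True)                  # release every worker for one round
        for _ in range(num_workers):
            done.get()                   # wait until all workers report back
    for t in threads:
        t.join()
```

Because the signal is a queued token rather than a transient notification, it does not matter whether the worker reaches its get() before or after the main thread sends the token.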


