Re: [Tutor] Changing instance attributes in different threads

2006-02-09 Thread Michael Lange
On Wed, 08 Feb 2006 18:14:14 -0500
Kent Johnson <[EMAIL PROTECTED]> wrote:

> > Sorry, I forgot to include the time.sleep(0.1) from my original while
> > loop in the example above.
> > The reason for using time.sleep() is that I need to avoid lots of loops
> > over an empty buffer.
> > The amount of time until the producer thread reads a new data fragment
> > into the buffer may be significant, depending on the fragment size
> > requested by the driver (e.g. my fm801 card wants fragments of 16384
> > bytes, which is about 0.09 audio seconds). On the other hand, the
> > buffer may contain hundreds of kB of data if other processes cause a
> > lot of disk I/O.
> 
> Using Queue.get() will do this for you automatically. If there is no
> data it will block until something is added to the queue and you avoid a
> polling loop. If there is data it will return it quickly.
>
> > Not if I call it with block=0; if I understand the docs correctly the
> > queue will raise a Queue.Empty exception if the queue is currently
> > locked by another thread.
> 
> No, the block flag controls whether the call will wait until something 
> is in the queue or return immediately. The call to Queue.get() will 
> always block waiting for the lock that controls access to the Queue; it 
> can't even tell if the Queue is empty until it gets this lock.
> 

Ah, then I misunderstood what the docs meant by "return an item if one is
immediately available, else raise the Empty exception".
I thought "immediately available" meant that the Queue is not currently
locked by another thread.

Thanks again

Michael

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Changing instance attributes in different threads

2006-02-08 Thread Kent Johnson
Michael Lange wrote:
> Sorry, I forgot to include the time.sleep(0.1) from my original while
> loop in the example above.
> The reason for using time.sleep() is that I need to avoid lots of loops
> over an empty buffer.
> The amount of time until the producer thread reads a new data fragment
> into the buffer may be significant, depending on the fragment size
> requested by the driver (e.g. my fm801 card wants fragments of 16384
> bytes, which is about 0.09 audio seconds). On the other hand, the
> buffer may contain hundreds of kB of data if other processes cause a
> lot of disk I/O.

Using Queue.get() will do this for you automatically. If there is no
data it will block until something is added to the queue and you avoid a
polling loop. If there is data it will return it quickly.
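
A minimal sketch of a consumer that relies on the blocking get() instead of sleeping and polling. It is written for Python 3, where the module is named `queue` (the thread uses the older `Queue` name); the `_STOP` sentinel and the function names are hypothetical and not part of the code discussed here.

```python
import queue
import threading

_STOP = object()  # hypothetical sentinel that tells the consumer to exit


def consume(q, sink):
    """Append each item from q to sink until the sentinel arrives."""
    while True:
        item = q.get()  # blocks while the queue is empty; no sleep() needed
        if item is _STOP:
            break
        sink.append(item)


def run_demo():
    q = queue.Queue()
    received = []
    worker = threading.Thread(target=consume, args=(q, received))
    worker.start()
    for fragment in (b"aa", b"bb", b"cc"):
        q.put(fragment)
    q.put(_STOP)  # wake the consumer one last time so it can finish
    worker.join()
    return received
```

The sentinel is one way to stop a consumer that would otherwise block forever once the producer goes quiet.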
> 
>>You should be able to take the try/except out of your version, since 
>>this code is the only consumer from the Queue, if queue.empty() is 
>>false, queue.get() should succeed.
>>
> Not if I call it with block=0; if I understand the docs correctly the
> queue will raise a Queue.Empty exception if the queue is currently
> locked by another thread.

No, the block flag controls whether the call will wait until something 
is in the queue or return immediately. The call to Queue.get() will 
always block waiting for the lock that controls access to the Queue; it 
can't even tell if the Queue is empty until it gets this lock.
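
To illustrate the block flag, here is a small sketch (Python 3 `queue` module assumed; `try_get` is a hypothetical helper). A non-blocking get raises Empty only because nothing is queued, regardless of what other threads are doing with the queue's internal lock.

```python
import queue


def try_get(q):
    """Return (True, item) if an item is queued, else (False, None)."""
    try:
        return True, q.get(block=False)
    except queue.Empty:
        return False, None


q = queue.Queue()
empty_result = try_get(q)  # nothing has been queued yet
q.put("fragment")
full_result = try_get(q)   # one item is waiting
```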

Kent



Re: [Tutor] Changing instance attributes in different threads

2006-02-08 Thread Michael Lange
On Wed, 08 Feb 2006 13:47:39 -0500
Kent Johnson <[EMAIL PROTECTED]> wrote:

> > while self.recording:
> >     data = []
> >     while not self.rec_queue.empty():
> >         try:
> >             data.append(self.rec_queue.get(block=0))
> >         except Queue.Empty:
> >             break
> >     for d in data:
> >         self._waveobj.writeframesraw(d)

> I don't understand why this is better than my code. It's a little 
> different - you get all the data, then write all the data; I get a 
> little, write a little - but since you are writing one datum at a time 
> in both cases, I don't know why it would make much difference.
> 

Sorry, I forgot to include the time.sleep(0.1) from my original while loop
in the example above.
The reason for using time.sleep() is that I need to avoid lots of loops
over an empty buffer.
The amount of time until the producer thread reads a new data fragment into
the buffer may be significant, depending on the fragment size requested by
the driver (e.g. my fm801 card wants fragments of 16384 bytes, which is
about 0.09 audio seconds). On the other hand, the buffer may contain
hundreds of kB of data if other processes cause a lot of disk I/O.

> You should be able to take the try/except out of your version, since 
> this code is the only consumer from the Queue, if queue.empty() is 
> false, queue.get() should succeed.
> 

Not if I call it with block=0; if I understand the docs correctly the queue
will raise a Queue.Empty exception if the queue is currently locked by
another thread.

> In the vu meter thread I can see you might want to consume a minimum 
> length of data but again that doesn't seem so hard.
> 
> OTOH your latest code looks OK too, this was just a suggestion.
> 

It was a good suggestion; I had never looked at the Queue class before, and
it's definitely good to know about it.
In my special case here, though, your other suggestion of using Condition()
seems to make handling my buffers easier. Another part of my code that I did
not post here, which is executed when the recording is stopped (i.e.
self.recording is set to 0), also seems to be easier to implement with the
Condition technique.

> > All Tkinter access must be from the main thread (or, more precisely,
> > the thread that called mainloop).
> 
> OK, yes. What this means is that any code that changes the state of the 
> GUI should be called from the main thread. In your case, that means that 
> the thread that updates the vu meter must be the main thread. If you are 
> calling get_peaks() from a scheduled Tkinter task (scheduled with 
> after() or after_idle()) you will be fine.
> 
Yes, that's what I am doing. I think I was just confused because I did not
understand what the Condition class does.
Now I think I see more clearly; thanks for all your help.

Michael





Re: [Tutor] Changing instance attributes in different threads

2006-02-08 Thread Kent Johnson
Michael Lange wrote:
> On Wed, 08 Feb 2006 08:37:18 -0500
> Kent Johnson <[EMAIL PROTECTED]> wrote:
>>child thread 2:
>>
>>     while self.recording:
>>         # each queue item is a single fragment, so write it directly
>>         data = self.rec_queue.get()
>>         self._waveobj.writeframesraw(data)  # write data to file
>>
> 
> Thanks Kent,
> 
> the problem with Queues is that Queue.get() returns only one item at a
> time, but I found that, depending on CPU load and disk usage, hundreds of
> data fragments may accumulate in the recording buffer, so in the "writer"
> thread I would have to use something like this (and similarly in the
> get_peaks() method):
>
>     while self.recording:
>         data = []
>         while not self.rec_queue.empty():
>             try:
>                 data.append(self.rec_queue.get(block=0))
>             except Queue.Empty:
>                 break
>         for d in data:
>             self._waveobj.writeframesraw(d)
>
> I am not sure if this approach is more robust than the one that uses
> Condition() objects; however, I don't think the code looks cleaner.

I don't understand why this is better than my code. It's a little 
different - you get all the data, then write all the data; I get a 
little, write a little - but since you are writing one datum at a time 
in both cases, I don't know why it would make much difference.

You should be able to take the try/except out of your version, since 
this code is the only consumer from the Queue, if queue.empty() is 
false, queue.get() should succeed.
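
If writing a whole batch per pass is preferred, one possible sketch is to block for the first item and then collect whatever else is already queued. This assumes the Python 3 `queue` module; `drain` is a hypothetical helper, not something from the code in this thread. Here the except clause is simply the loop's exit condition rather than a guard against another consumer.

```python
import queue


def drain(q, timeout=None):
    """Wait for one item, then collect everything else already queued."""
    # Blocks until an item arrives; raises queue.Empty if timeout expires.
    items = [q.get(timeout=timeout)]
    while True:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            return items


demo_q = queue.Queue()
for n in range(3):
    demo_q.put(n)
batch = drain(demo_q)
```

Because the first get() blocks, a writer loop built on this helper needs no time.sleep() between passes.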

In the vu meter thread I can see you might want to consume a minimum 
length of data but again that doesn't seem so hard.

OTOH your latest code looks OK too, this was just a suggestion.

>>>This *seems* to work, however it looks like this code does not properly
>>>separate the gui from the child threads, which everyone says should be
>>>avoided in any case.
>>
>>I don't understand your concern here.
>>
> Maybe it is just because I have not fully understood how the Condition
> objects work; what I had in mind are warnings like this one (from
> http://www.astro.washington.edu/owen/TkinterSummary.html):
>
> All Tkinter access must be from the main thread (or, more precisely,
> the thread that called mainloop).

OK, yes. What this means is that any code that changes the state of the 
GUI should be called from the main thread. In your case, that means that 
the thread that updates the vu meter must be the main thread. If you are 
calling get_peaks() from a scheduled Tkinter task (scheduled with 
after() or after_idle()) you will be fine.

One way this problem comes up is if you have started a thread to run a 
long-running task and you want to update a status bar in the GUI. You 
can't update the GUI directly from the worker thread, you have to pass 
the status info back into the main thread somehow.
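
A rough sketch of that hand-off, assuming the Python 3 `queue` module: the worker only puts messages on a queue, and a function running in the GUI thread polls the queue and reschedules itself with the widget's after() method. The names `worker`, `pending_status`, `poll` and the `show` callback are hypothetical.

```python
import queue
import threading


def worker(status_q):
    """Long-running task; it reports progress through the queue only."""
    for pct in (25, 50, 75, 100):
        status_q.put("%d%% done" % pct)


def pending_status(status_q):
    """Return every status message queued so far, without blocking."""
    messages = []
    while True:
        try:
            messages.append(status_q.get_nowait())
        except queue.Empty:
            return messages


def poll(widget, status_q, show, interval_ms=100):
    """Run in the GUI thread: show new messages, then reschedule itself."""
    for msg in pending_status(status_q):
        show(msg)  # e.g. a function that updates a status bar label
    widget.after(interval_ms, poll, widget, status_q, show, interval_ms)
```

In a real program poll() would be called once from the main thread before mainloop() starts, with `widget` being any Tkinter widget such as the root window.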

Kent



Re: [Tutor] Changing instance attributes in different threads

2006-02-08 Thread Michael Lange
On Wed, 08 Feb 2006 08:37:18 -0500
Kent Johnson <[EMAIL PROTECTED]> wrote:

> Another architecture you might consider is to have thread 1 put the 
> actual acquired buffers into two Queues that are read by the two 
> consumer threads. This would save you a lot of copying and give you a 
> cleaner implementation. It may block on the producer thread but the 
> Queue is locked only while something is actually being put in or taken 
> out so the blocks should be short.
> 
> For example (not tested!):
>
>     def get_peaks(self):
>         try:
>             data = self.vu_queue.get_nowait()
>         except Queue.Empty:
>             return None
>         # each queue item is a single fragment, so no inner loop is needed
>         left = audioop.max(audioop.tomono(data, 2, 1, 0), 2)
>         right = audioop.max(audioop.tomono(data, 2, 0, 1), 2)
>         return left, right
>
> child thread 1:
>
>     while self.running:
>         data = self._audioobj.read(self._fragsize)  # read data from soundcard
>         self.vu_queue.put(data)
>         self.rec_queue.put(data)
>
> child thread 2:
>
>     while self.recording:
>         data = self.rec_queue.get()  # blocks until a fragment is available
>         self._waveobj.writeframesraw(data)  # write data to file
>
> 

Thanks Kent,

the problem with Queues is that Queue.get() returns only one item at a time,
but I found that, depending on CPU load and disk usage, hundreds of data
fragments may accumulate in the recording buffer, so in the "writer" thread
I would have to use something like this (and similarly in the get_peaks()
method):

    while self.recording:
        data = []
        while not self.rec_queue.empty():
            try:
                data.append(self.rec_queue.get(block=0))
            except Queue.Empty:
                break
        for d in data:
            self._waveobj.writeframesraw(d)

I am not sure if this approach is more robust than the one that uses
Condition() objects; however, I don't think the code looks cleaner.

> 
> > This *seems* to work, however it looks like this code does not separate 
> > properly the gui from the child
> > threads which everyone says should be avoided in any case.
> 
> I don't understand your concern here.
> 

Maybe it is just because I have not fully understood how the Condition
objects work; what I had in mind are warnings like this one (from
http://www.astro.washington.edu/owen/TkinterSummary.html):

    All Tkinter access must be from the main thread (or, more precisely, the
    thread that called mainloop). Violating this is likely to cause nasty and
    mysterious symptoms such as freezes or core dumps. Yes this makes
    combining multi-threading and Tkinter very difficult. The only fully safe
    technique I have found is polling (e.g. use after from the main loop to
    poll a threading Queue that your thread writes). I have seen it suggested
    that a thread can safely use event_create to communicate with the main
    thread, but have found this is not safe.

I guess I will have to give this some more thought.

Thanks again

Michael




Re: [Tutor] Changing instance attributes in different threads

2006-02-08 Thread Kent Johnson
Michael Lange wrote:
> On Tue, 7 Feb 2006 23:31:06 +0100
> Michael Lange <[EMAIL PROTECTED]> wrote:
> 
> 
>>So I think I need two Condition objects here; it is most important here
>>that thread 1 does not block to minimize the risk of recording buffer
>>overruns, but from reading the docs I am not sure about the correct
>>procedure. So if I change self.rec_locked and self.vu_locked from the
>>code above to be Condition objects is it correct to do:
>>
> 
> 
> 
> Ok, some testing gave me the answer: with the code I posted I got an
> AssertionError, so obviously the release() call has to be inside the
> "if self.rec_lock.acquire(blocking=0):" block.
> So now my functions look like:
>
> gui thread (periodically called by Tkinter):
>
>     def get_peaks(self):
>         if not self.vu_lock.acquire(blocking=0):
>             return None
>         data = [x for x in self.vubuffer]
>         self.vubuffer = []
>         self.vu_lock.release()
>         if not data:
>             return None
>         left, right = 0, 0
>         for d in data:
>             left = max(audioop.max(audioop.tomono(d, 2, 1, 0), 2), left)
>             right = max(audioop.max(audioop.tomono(d, 2, 0, 1), 2), right)
>         return left, right
>
> child thread 1:
>
>     vubuffer = []
>     recbuffer = []
>     while self.running:
>         data = self._audioobj.read(self._fragsize)  # read data from soundcard
>         vubuffer.append(data)
>         if self.vu_lock.acquire(blocking=0):
>             self.vubuffer += vubuffer
>             vubuffer = []
>             self.vu_lock.release()
>         if self.recording:
>             recbuffer.append(data)
>             if self.rec_lock.acquire(blocking=0):
>                 self.recbuffer += recbuffer
>                 recbuffer = []
>                 self.rec_lock.release()
>
> child thread 2:
>
>     while self.recording:
>         # wait a moment until there is something in the buffer to be written
>         data = []
>         time.sleep(0.1)
>         if self.rec_lock.acquire(blocking=0):
>             data = [x for x in self.recbuffer]
>             self.recbuffer = []
>             self.rec_lock.release()
>         for d in data:
>             self._waveobj.writeframesraw(d)  # write data to file
>
> 

This is much better than your original.

Another architecture you might consider is to have thread 1 put the 
actual acquired buffers into two Queues that are read by the two 
consumer threads. This would save you a lot of copying and give you a 
cleaner implementation. It may block on the producer thread but the 
Queue is locked only while something is actually being put in or taken 
out so the blocks should be short.

For example (not tested!):

    def get_peaks(self):
        try:
            data = self.vu_queue.get_nowait()
        except Queue.Empty:
            return None
        # each queue item is a single fragment, so no inner loop is needed
        left = audioop.max(audioop.tomono(data, 2, 1, 0), 2)
        right = audioop.max(audioop.tomono(data, 2, 0, 1), 2)
        return left, right

child thread 1:

    while self.running:
        data = self._audioobj.read(self._fragsize)  # read data from soundcard
        self.vu_queue.put(data)
        self.rec_queue.put(data)

child thread 2:

    while self.recording:
        data = self.rec_queue.get()  # blocks until a fragment is available
        self._waveobj.writeframesraw(data)  # write data to file


> This *seems* to work, however it looks like this code does not properly
> separate the gui from the child threads, which everyone says should be
> avoided in any case.

I don't understand your concern here.

> On the other hand, with the technique I used before, with a boolean as
> "lock", like:
>
>     if not self.vu_locked:
>         self.vu_locked = 1
>         self.vubuffer += vubuffer
>         vubuffer = []
>         self.vu_locked = 0
>
> it seems like the worst case is that both the gui and the child thread
> pass the test "if not self.vu_locked" at the same time, which might cause
> some data to be lost from the vubuffer list; probably that is something
> I could live with.

You will be reading and writing the same list from two different
threads, which seems like a bad idea.

> So now my question:
> Does anyone know how a threading.Condition() object is handled
> internally, so maybe its methods actually can be called safely from the
> gui thread?

Take a look at the source; the threading module is written in Python.

Kent



Re: [Tutor] Changing instance attributes in different threads

2006-02-08 Thread Michael Lange
On Tue, 7 Feb 2006 23:31:06 +0100
Michael Lange <[EMAIL PROTECTED]> wrote:

> 
> So I think I need two Condition objects here; it is most important here
> that thread 1 does not block to minimize the risk of recording buffer
> overruns, but from reading the docs I am not sure about the correct
> procedure. So if I change self.rec_locked and self.vu_locked from the
> code above to be Condition objects is it correct to do:
> 


Ok, some testing gave me the answer: with the code I posted I got an
AssertionError, so obviously the release() call has to be inside the
"if self.rec_lock.acquire(blocking=0):" block.
So now my functions look like:

gui thread (periodically called by Tkinter):

    def get_peaks(self):
        if not self.vu_lock.acquire(blocking=0):
            return None
        data = [x for x in self.vubuffer]
        self.vubuffer = []
        self.vu_lock.release()
        if not data:
            return None
        left, right = 0, 0
        for d in data:
            left = max(audioop.max(audioop.tomono(d, 2, 1, 0), 2), left)
            right = max(audioop.max(audioop.tomono(d, 2, 0, 1), 2), right)
        return left, right

child thread 1:

    vubuffer = []
    recbuffer = []
    while self.running:
        data = self._audioobj.read(self._fragsize)  # read data from soundcard
        vubuffer.append(data)
        if self.vu_lock.acquire(blocking=0):
            self.vubuffer += vubuffer
            vubuffer = []
            self.vu_lock.release()
        if self.recording:
            recbuffer.append(data)
            if self.rec_lock.acquire(blocking=0):
                self.recbuffer += recbuffer
                recbuffer = []
                self.rec_lock.release()

child thread 2:

    while self.recording:
        # wait a moment until there is something in the buffer to be written
        data = []
        time.sleep(0.1)
        if self.rec_lock.acquire(blocking=0):
            data = [x for x in self.recbuffer]
            self.recbuffer = []
            self.rec_lock.release()
        for d in data:
            self._waveobj.writeframesraw(d)  # write data to file
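
The non-blocking acquire pattern used above can be sketched in isolation as follows. This is only an illustration: it uses a plain threading.Lock rather than a Condition, and `try_flush` and its parameters are hypothetical names. The try/finally makes sure release() runs only after a successful acquire(), which is the rule the AssertionError pointed to.

```python
import threading


def try_flush(lock, local_buf, shared_buf):
    """Move local_buf into shared_buf if the lock is free; never wait."""
    if not lock.acquire(blocking=False):
        return False  # lock is busy: keep the data and retry on the next pass
    try:
        shared_buf.extend(local_buf)
        del local_buf[:]  # empty the local list in place
    finally:
        lock.release()  # only reached when acquire() succeeded
    return True
```

When the lock is busy the producer simply keeps its local fragments and tries again after the next read, so it never blocks.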

This *seems* to work, however it looks like this code does not properly
separate the gui from the child threads, which everyone says should be
avoided in any case.
On the other hand, with the technique I used before, with a boolean as
"lock", like:

    if not self.vu_locked:
        self.vu_locked = 1
        self.vubuffer += vubuffer
        vubuffer = []
        self.vu_locked = 0

it seems like the worst case is that both the gui and the child thread pass
the test "if not self.vu_locked" at the same time, which might cause some
data to be lost from the vubuffer list; probably that is something
I could live with.
So now my question:
Does anyone know how a threading.Condition() object is handled internally,
so maybe its methods actually can be called safely from the gui thread?

Thanks

Michael


Re: [Tutor] Changing instance attributes in different threads

2006-02-07 Thread Michael Lange
On Tue, 07 Feb 2006 06:02:45 -0500
Kent Johnson <[EMAIL PROTECTED]> wrote:

> 
> One way to make this code thread-safe is to use a threading.Condition() 
> instead of a boolean variable:
> 
> thread 1 does:
>
>     self.lock.acquire()
>     if condition:
>         self.name = 'bob'
>     else:
>         self.name = 'mike'
>     self.lock.release()
>
> thread 2 does:
>
>     self.lock.acquire()
>     n = self.name
>     self.lock.release()
>     if n == 'bob':
>         ...
>     else:
>         ...
> 
> If this is the only communication or synchronization between the two 
> threads I don't think the lock is needed at all - thread 2 is presumably 
> in a loop and thread 1 is controlling the behaviour of the loop 
> asynchronously. If there is some other kind of synchronization between 
> the loops, and thread 2 is only supposed to run once for each setting of 
> self.name in thread 1, you could use Condition.wait() and 
> Condition.notify() to do the synchronization.
> 

Thanks Kent,

In fact I have three threads: a main gui thread and two child threads. Child
thread 1 reads data from the soundcard and appends this data to two lists
which I use as recording buffers. The data from list 1 is used by the gui
thread to draw a vumeter; the data from list 2 is written to the target
file. So the communication between the threads occurs when the gui thread
and child thread 2, respectively, read and empty the buffer lists to process
the data.
So my functions currently (with boolean "locks") look like:

gui thread (this function is called periodically by Tkinter):

    def get_peaks(self):
        if self.vu_locked:
            return None
        self.vu_locked = 1
        data = [x for x in self.vubuffer]
        self.vubuffer = []
        self.vu_locked = 0
        if not data:
            return None
        left, right = 0, 0
        for d in data:
            left = max(audioop.max(audioop.tomono(d, 2, 1, 0), 2), left)
            right = max(audioop.max(audioop.tomono(d, 2, 0, 1), 2), right)
        return left, right

thread 1:

    vubuffer = []
    recbuffer = []
    while self.running:
        data = self._audioobj.read(self._fragsize)  # read data from soundcard
        vubuffer.append(data)
        if not self.vu_locked:
            self.vu_locked = 1
            self.vubuffer += vubuffer
            vubuffer = []
            self.vu_locked = 0
        if self.recording:
            recbuffer.append(data)
            if not self.rec_locked:
                self.rec_locked = 1
                self.recbuffer += recbuffer
                recbuffer = []
                self.rec_locked = 0

thread 2:

    while self.recording:
        # wait a moment until there is something in the buffer to be written
        time.sleep(0.1)
        if not self.rec_locked:
            self.rec_locked = 1
            data = [x for x in self.recbuffer]
            self.recbuffer = []
            self.rec_locked = 0
            for d in data:
                self._waveobj.writeframesraw(d)  # write the data to a file

So I think I need two Condition objects here; it is most important here that
thread 1 does not block to minimize the risk of recording buffer overruns,
but from reading the docs I am not sure about the correct procedure. So if I
change self.rec_locked and self.vu_locked from the code above to be
Condition objects is it correct to do:

thread 1:

    vubuffer = []
    recbuffer = []
    while self.running:
        data = self._audioobj.read(self._fragsize)  # read data from soundcard
        vubuffer.append(data)
        if self.vu_locked.acquire(blocking=0):
            self.vubuffer += vubuffer
            vubuffer = []
        self.vu_locked.release()
        if self.recording:
            recbuffer.append(data)
            if self.rec_locked.acquire(blocking=0):
                self.recbuffer += recbuffer
                recbuffer = []
            self.rec_locked.release()

thread 2:

    while self.recording:
        # wait a moment until there is something in the buffer to be written
        time.sleep(0.1)
        data = []
        if self.rec_locked.acquire(blocking=0):
            data = [x for x in self.recbuffer]
            self.recbuffer = []
        self.rec_locked.release()
        for d in data:
            self._waveobj.writeframesraw(d)  # write the data to a file

Thanks

Michael


Re: [Tutor] Changing instance attributes in different threads

2006-02-07 Thread Bernard Lebel
Hi Kent,

To answer your first concern, not all changes need to be intercepted
by the child thread. I have not given out all the details about
the program, but if the parent thread gets certain values from the
database, it will take actions that may affect the child thread other
than just by setting the localjobstatus attribute.

As for your last comments, well, the conclusion I draw is that in fact
I should not take any chances with that, and implement some thread
safety using Queue or thread conditions. I'll check that out.



Thanks for all the help!
Bernard




Re: [Tutor] Changing instance attributes in different threads

2006-02-07 Thread Kent Johnson
Bernard Lebel wrote:
> Hi Kent,
> 
> I have put together a little script to give a rough idea about what
> the program does.
> 
> http://www.bernardlebel.com/scripts/nonxsi/help/bl_threadtest.py

In this code, there is no guarantee that callMeWhenAttributeChanged() 
will see every change to oBernard.firstname. It could change more than 
once while checkFirstName() is sleeping and looping. The change to 
oBernard.firstname in main could stomp on a change from 
changeAttribute() before it is seen in checkFirstName().

I don't know what the consequences of missing a change are. If the names
are some kind of command token I would use a Queue: in this example you
have two producers and one consumer, and a Queue will ensure that every
command is seen. If there is no consequence of missing a change you
could leave it as-is or use a threading.Condition to replace the polling
loop with wait() / notify().
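
As a rough illustration of the Queue option (Python 3 `queue` module assumed; the producer names and command strings are made up for the example), two producer threads put commands on one queue and a single consumer sees every one of them, with nothing overwritten:

```python
import queue
import threading


def producer(q, name, commands):
    """Put each command on the shared queue, tagged with its sender."""
    for cmd in commands:
        q.put((name, cmd))


def run_demo():
    q = queue.Queue()
    producers = [
        threading.Thread(target=producer, args=(q, "main", ["start", "kill"])),
        threading.Thread(target=producer, args=(q, "child", ["pending"])),
    ]
    for t in producers:
        t.start()
    for t in producers:
        t.join()
    seen = []
    # empty() is reliable here only because every producer has finished
    # and this is the sole consumer.
    while not q.empty():
        seen.append(q.get())
    return seen
```

The order between the two producers is not fixed, but each producer's own commands arrive in the order they were put.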
> 
> 
> The true program does this:
> 
> - the top program file imports a module called fcJob
> 
> - the top program instantiates the only class in the fcJob module. The
> class is named fcJob as well, the instance is named simply "job". This
> instance has a few attributes; the one that I'm interested in now is
> "localjobstatus".
> 
> - the top program file enters a while loop where it checks a variety
> of things, and if certain conditions are met, will start a big
> function in a separate thread.
> 
> - the function that runs in the separate thread reads and writes the
> localjobstatus attributes.
> 
> - while the child thread is running, the main thread checks a database
> every 5 seconds to test the value of certain fields.
> 
> - should the value of a specific field change to certain values, the top
> thread will set the localjobstatus value to something like "killed".
>
> - the child thread, also running a while loop, tests the local job
> attribute three times during a single iteration. If it gets a "Killed"
> value, it will call a function that basically terminates this child
> thread in a clean way. Ultimately, it will set the localjobstatus to
> "Pending".

This sounds unsafe. What happens if the "Pending" value is overwritten 
by another "Killed" value? It might be fine if you use two different 
status variables.

When working with threads you have to imagine what would happen if one 
thread stopped indefinitely at any point, and another thread ran long 
enough to do something unexpected. Or two threads interleave statements 
in the worst possible way. Your code seems to have several opportunities 
for bad things to happen. I can't tell what the consequences of them 
are, which will determine how hard you should work to prevent them.

HTH
Kent



Re: [Tutor] Changing instance attributes in different threads

2006-02-07 Thread Kent Johnson
Michael Lange wrote:
> I have used a boolean to control access to a variable that is used by two 
> threads,
> as in this example:
> 
> thread 1 does:
>
>     while self.locked:
>         pass
>     self.locked = 1
>     if condition:
>         self.name = 'bob'
>     else:
>         self.name = 'mike'
>     self.locked = 0
>
> thread 2 does:
>
>     while self.locked:
>         pass
>     self.locked = 1
>     n = self.name
>     self.locked = 0
>     if n == 'bob':
>         ...
>     else:
>         ...
> 
> I *thought* this would be safe, but now, reading this thread, I am
> starting to doubt it.
> Are there any pitfalls I overlooked in this technique?

If the intent is to ensure that only one thread at a time will be in the 
code where self.locked == 1, this code will not ensure that. 
Test-and-set code needs a lock to be thread safe.

Imagine both threads reach 'while self.locked' when self.locked == 0.
Both threads might finish the while loop before either one sets
self.locked. Then both threads might continue through to
'self.locked = 0' together.

In this case I'm not sure what would happen if the above scenario took 
place; it might be harmless. But your lock is definitely not doing what 
you think it is.

One way to make this code thread-safe is to use a threading.Condition() 
instead of a boolean variable:

thread 1 does:

    self.lock.acquire()
    if condition:
        self.name = 'bob'
    else:
        self.name = 'mike'
    self.lock.release()

thread 2 does:

    self.lock.acquire()
    n = self.name
    self.lock.release()
    if n == 'bob':
        ...
    else:
        ...

If this is the only communication or synchronization between the two 
threads I don't think the lock is needed at all - thread 2 is presumably 
in a loop and thread 1 is controlling the behaviour of the loop 
asynchronously. If there is some other kind of synchronization between 
the loops, and thread 2 is only supposed to run once for each setting of 
self.name in thread 1, you could use Condition.wait() and 
Condition.notify() to do the synchronization.
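
A minimal sketch of that wait() / notify() handshake, assuming Python 3. The `NameBox` class and its `_fresh` flag are hypothetical; they are just one way to let the waiting thread run once per new value.

```python
import threading


class NameBox:
    """Hypothetical holder: one thread sets a name, another waits for it."""

    def __init__(self):
        self._cond = threading.Condition()
        self._name = None
        self._fresh = False  # True while a value has not been consumed yet

    def set(self, name):
        with self._cond:  # acquires the condition's underlying lock
            self._name = name
            self._fresh = True
            self._cond.notify()  # wake one waiting thread, if any

    def wait_for_new(self, timeout=None):
        with self._cond:
            while not self._fresh:
                # wait() releases the lock while sleeping and returns
                # False if the timeout expired first.
                if not self._cond.wait(timeout):
                    return None
            self._fresh = False
            return self._name
```

If set() is called twice before the waiting thread wakes up, only the latest name is seen; when every value matters, a Queue is the safer choice.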

Kent



Re: [Tutor] Changing instance attributes in different threads

2006-02-07 Thread Michael Lange
On Mon, 06 Feb 2006 18:34:18 -0500
Kent Johnson <[EMAIL PROTECTED]> wrote:


> 
> It sounds like you have some attributes that you are using as flags to 
> allow one thread to control another. There are definitely some pitfalls 
> here. You probably want to use threading.Condition or Queue.Queue to 
> communicate between the threads. Can you give more details of what you 
> are trying to do?
> 

I have used a boolean to control access to a variable that is used by two 
threads, as in this example:

thread 1 does:

while self.locked:
    pass
self.locked = 1
if condition:
    self.name = 'bob'
else:
    self.name = 'mike'
self.locked = 0

thread 2 does:

while self.locked:
    pass
self.locked = 1
n = self.name
self.locked = 0
if n == 'bob':

else:


I *thought* this would be safe, but after reading this thread I am starting to doubt it.
Are there any pitfalls I overlooked in this technique?

Thanks

Michael




Re: [Tutor] Changing instance attributes in different threads

2006-02-06 Thread Bernard Lebel
Hi Kent,

I have put together a little script to give a rough idea about what
the program does.

http://www.bernardlebel.com/scripts/nonxsi/help/bl_threadtest.py


The true program does this:

- the top program file imports a module called fcJob

- the top program instantiates the only class in the fcJob module. The
class is named fcJob as well, the instance is named simply "job". This
instance has a few attributes; the one that I'm interested in now is
"localjobstatus".

- the top program file enters a while loop where it checks a variety
of things, and if certain conditions are met, will start a big
function in a separate thread.

- the function that runs in the separate thread reads and writes the
localjobstatus attribute.

- while the child thread is running, the main thread checks a database
every 5 seconds to test the value of certain fields.

- should the value of a specific field change to certain values, the top
thread will set the localjobstatus value to something like "killed".

- the child thread, also running a while loop, tests the localjobstatus
attribute 3 times during a single iteration. If it gets a "Killed"
value, it will call a function that basically terminates this child
thread in a clean way. Ultimately, it will set the localjobstatus to
"Pending".

So in essence, there are two threads reading and writing the
localjobstatus attribute, the main thread and a child thread. The
child thread is reading the value in order to control its flow.
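
In code, the arrangement is roughly the following. This is a much simplified sketch written for a current Python 3, not the actual program: only the localjobstatus attribute and the "killed" and "Pending" values come from the description above, while the FcJob class, the child function, the "Running" starting value and the sleep standing in for real work are placeholders.

```python
import threading
import time

class FcJob:
    """Simplified stand-in for the real fcJob class."""
    def __init__(self):
        self.localjobstatus = "Running"   # placeholder initial value

def child(job, max_iterations=1000):
    # Child thread: loops, testing the attribute to control its own flow.
    for _ in range(max_iterations):
        if job.localjobstatus == "killed":
            break                         # clean termination path
        time.sleep(0.01)                  # placeholder for a slice of real work
    job.localjobstatus = "Pending"        # ultimately reset by the child

def run_demo():
    job = FcJob()
    t = threading.Thread(target=child, args=(job,))
    t.start()
    job.localjobstatus = "killed"         # main thread, after its database check
    t.join()
    return job.localjobstatus
```

Both threads write the same attribute here, which is exactly the part I am unsure about.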

Hope I'm making sense.


Thanks
Bernard



On 2/6/06, Kent Johnson <[EMAIL PROTECTED]> wrote:
> Bernard Lebel wrote:
> > Example:
> >
> > - Class instance Bernard has attribute "name", whose value is "bernard".
> > - A function in a thread tests the value of "name". If "name" ==
> > "bernard", do nothing.
> > - A function in another thread, for some reason, changes "name" to "bob".
> > - The first function, a few moments later, tests again the value of
> > "name". It sees that the value has changed from "bernard" to "bob", and
> > this change causes the function to take an action.
> >
> > Is that what you mean by "invoking" code?
>
> No, it's not what I meant, I was talking about constructs that can cause
> an assignment like self.x = 1 to directly call code that you wrote. It
> sounds like you are not doing that.
>
> It sounds like you have some attributes that you are using as flags to
> allow one thread to control another. There are definitely some pitfalls
> here. You probably want to use threading.Condition or Queue.Queue to
> communicate between the threads. Can you give more details of what you
> are trying to do?
>
> Kent


Re: [Tutor] Changing instance attributes in different threads

2006-02-06 Thread Kent Johnson
Bernard Lebel wrote:
> Example:
> 
> - Class instance Bernard has attribute "name", whose value is "bernard".
> - A function in a thread tests the value of "name". If "name" ==
> "bernard", do nothing.
> - A function in another thread, for some reason, changes "name" to "bob".
> - The first function, a few moments later, tests again the value of
> "name". It sees that the value has changed from "bernard" to "bob", and
> this change causes the function to take an action.
> 
> Is that what you mean by "invoking" code?

No, it's not what I meant, I was talking about constructs that can cause 
an assignment like self.x = 1 to directly call code that you wrote. It 
sounds like you are not doing that.

It sounds like you have some attributes that you are using as flags to 
allow one thread to control another. There are definitely some pitfalls 
here. You probably want to use threading.Condition or Queue.Queue to 
communicate between the threads. Can you give more details of what you 
are trying to do?
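
For a rough idea of the queue approach, here is a minimal sketch. It assumes a current Python 3, where the Queue module is spelled queue; the controller and watcher functions, the None sentinel and the "acted on" strings are invented for the example and say nothing about how your program is structured.

```python
import queue
import threading

def controller(commands):
    # One thread puts each new value on the queue ...
    for name in ("bernard", "bob"):
        commands.put(name)
    commands.put(None)            # sentinel: nothing more to come

def watcher(commands, actions):
    # ... the other blocks in get() until something arrives, so no flag
    # has to be polled and no value can be missed or seen twice.
    while True:
        name = commands.get()
        if name is None:
            break
        if name != "bernard":
            actions.append("acted on " + name)

def run_demo():
    commands = queue.Queue()
    actions = []
    t = threading.Thread(target=watcher, args=(commands, actions))
    t.start()
    controller(commands)
    t.join()
    return actions
```

The queue does its own locking, so neither thread needs an explicit lock around put() or get().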

Kent



Re: [Tutor] Changing instance attributes in different threads

2006-02-06 Thread Bernard Lebel
Hi Kent,

See [Bernard] below.


On 2/6/06, Kent Johnson <[EMAIL PROTECTED]> wrote:
> Bernard Lebel wrote:
> > Hello,
> >
> > I have an instance attribute (a string of a few characters) that two
> > separate threads may change, potentially both at the same time.
> > My program doesn't implement thread safety for this particular task.
> >
> > So far I have never run into issues with this, but I have been reading
> > about data corruption when two threads try to write to the same data
> > at once. I'm just about to deploy my program over 70 computers, so now
> > I'm having some second thoughts of dread.
> >
> > Should changing instance attributes be done with maximum thread
> > safety? That is, to use some sort of queue (like the Queue class) so
> > that all changes to the attribute go through this queue and can
> > never happen concurrently?
>
> Are you talking about a simple assignment like self.x = 'abc'?

[Bernard] Yes.

> If there is no Python code invoked by setting the attribute, I think the
> attribute will always be correctly set to the value from one of the
> threads. In this case the setattr will happen in a single Python byte
> code and it will be atomic. In other words, both of the sets will
> succeed and whichever one happens last will persist.
>
> If you *care* which of the setattrs succeeds, then you have a race
> condition that you need to address somehow.
>
> If Python code is invoked by setting the attribute then you could have a
> problem.

[Bernard] This is where I'm getting a bit confused. Changing the
attribute doesn't automatically invoke code. But at various points in
the program, the value of this attribute is tested, and if it matches
certain strings, a function is called.

Example:

- Class instance Bernard has attribute "name", whose value is "bernard".
- A function in a thread tests the value of "name". If "name" ==
"bernard", do nothing.
- A function in another thread, for some reason, changes "name" to "bob".
- The first function, a few moments later, tests again the value of
"name". It sees that the value has changed from "bernard" to "bob", and
this change causes the function to take an action.

Is that what you mean by "invoking" code?


> This could happen, for example, if the attribute is a property,
> if the class (or one of its base classes) overrides __setattr__(), and
> probably in several other ways I haven't thought of. In this case you
> need to look at whether the Python code is thread-safe.

[Bernard] There is nothing of that sort in my program. Classes are all
"independent", that is, are not derived from another class. A handful
of their attributes like the one I described are changed at various
moments, only to be tested by other functions, in order to control the
flow of execution.


Bernard


Re: [Tutor] Changing instance attributes in different threads

2006-02-06 Thread Kent Johnson
Bernard Lebel wrote:
> Hello,
> 
> I have an instance attribute (a string of a few characters) that two
> separate threads may change, potentially both at the same time.
> My program doesn't implement thread safety for this particular task.
> 
> So far I have never run into issues with this, but I have been reading
> about data corruption when two threads try to write to the same data
> at once. I'm just about to deploy my program over 70 computers, so now
> I'm having some second thoughts of dread.
> 
> Should changing instance attributes be done with maximum thread
> safety? That is, to use some sort of queue (like the Queue class) so
> that all changes to the attribute go through this queue and can
> never happen concurrently?

Are you talking about a simple assignment like self.x = 'abc'? If there 
is no Python code invoked by setting the attribute, I think the 
attribute will always be correctly set to the value from one of the 
threads. In this case the setattr will happen in a single Python byte 
code and it will be atomic. In other words, both of the sets will 
succeed and whichever one happens last will persist.
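
One rough way to check the single-byte-code claim on CPython is to disassemble the assignment. The sketch below assumes a current Python 3; Thing and store_ops are made-up names for illustration. The atomicity also relies on the interpreter's global lock, so treat it as a property of standard CPython rather than of the language.

```python
import dis

class Thing:
    def set_x(self):
        self.x = 'abc'

def store_ops(func):
    """Return the names of the attribute-store instructions in func."""
    return [ins.opname for ins in dis.get_instructions(func)
            if ins.opname.startswith("STORE_ATTR")]

if __name__ == "__main__":
    dis.dis(Thing.set_x)            # full listing, for reading by eye
    print(store_ops(Thing.set_x))   # the attribute-store instructions only
```

The loads before the store only push the value and the object; the attribute itself changes in the one store instruction.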

If you *care* which of the setattrs succeeds, then you have a race 
condition that you need to address somehow.

If Python code is invoked by setting the attribute then you could have a 
problem. This could happen, for example, if the attribute is a property, 
if the class (or one of its base classes) overrides __setattr__(), and 
probably in several other ways I haven't thought of. In this case you 
need to look at whether the Python code is thread-safe.
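
To illustrate what I mean by Python code being invoked, here are two hypothetical classes (current Python 3; the class and attribute names are invented for the example). In both, an innocent-looking assignment runs several lines of Python, and another thread could be scheduled in the middle of them.

```python
class WithProperty:
    """Assignment to x goes through the property setter."""
    def __init__(self):
        self._x = None
        self.set_count = 0

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        self.set_count += 1    # read-modify-write: not atomic across threads
        self._x = value

class Logged:
    """Every attribute assignment goes through __setattr__."""
    def __init__(self):
        self.history = []      # this assignment goes through __setattr__ too

    def __setattr__(self, name, value):
        if name != "history":
            self.history.append((name, value))
        object.__setattr__(self, name, value)
```

With classes like these, obj.x = 'abc' is no longer a single step, so the setter or __setattr__ body has to be made thread-safe itself, for example with a lock.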

Kent
