Re: Finding the source of an exception in a python multiprocessing program

2013-04-25 Thread Neil Cerutti
On 2013-04-24, William Ray Wing  wrote:
> On Apr 24, 2013, at 4:31 PM, Neil Cerutti  wrote:
>
>> On 2013-04-24, William Ray Wing  wrote:
>>> When I look at the pool module, the error is occurring in
>>> get(self, timeout=None) on the line after the final else:
>>> 
>>>     def get(self, timeout=None):
>>>         self.wait(timeout)
>>>         if not self._ready:
>>>             raise TimeoutError
>>>         if self._success:
>>>             return self._value
>>>         else:
>>>             raise self._value
>>
>> The code that's failing is in self.wait. Somewhere in there you
>> must be masking an exception and storing it in self._value
>> instead of letting it propagate and crash your program. This is
>> hiding the actual context.
>
> I'm sorry, I'm not following you.  The "get" routine (and thus
> self.wait) is part of the "pool" module in the Python
> multiprocessing library. None of my code has a class or
> function named "get".

Oops! I failed to notice it was part of the pool module and not
your own code. 

-- 
Neil Cerutti
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Finding the source of an exception in a python multiprocessing program

2013-04-24 Thread Dave Angel

On 04/24/2013 08:00 PM, Oscar Benjamin wrote:
> On 25 April 2013 00:26, Dave Angel  wrote:
>> On 04/24/2013 05:09 PM, William Ray Wing wrote:
>>
>> My question is why bother with multithreading?  Why not just do these as
>> separate processes?  You said "they in no way interact with each other" and
>> that's a clear clue that separate processes would be cleaner.
>
> It's using multiprocessing rather than threads: they are separate processes.

You're right;  I was completely off base.  Brain-freeze.

> It's state that is passed to it by the subprocess and should only be
> accessed by the top-level process after the subprocess completes (I
> think!).
>
>> Separate processes will find it much more difficult to interact, which is a
>> good thing most of the time.  Further, they seem to be scheduled more
>> efficiently because of the GIL, though that may not make that much
>> difference when you're time-limited by network data.
>
> They are separate processes and do not share the GIL (unless I'm very
> much mistaken).

No, you're not mistaken.  Somehow I interpreted the original as saying
multi-thread, and everything else was wrong as a result.  Now it sounds
like a bug in, or misuse of, the Pool class.

--
DaveA


Re: Finding the source of an exception in a python multiprocessing program

2013-04-24 Thread Oscar Benjamin
On 25 April 2013 00:26, Dave Angel  wrote:
> On 04/24/2013 05:09 PM, William Ray Wing wrote:
>>
>> On Apr 24, 2013, at 4:31 PM, Neil Cerutti  wrote:
>>
>>> On 2013-04-24, William Ray Wing  wrote:

>>>> When I look at the pool module, the error is occurring in
>>>> get(self, timeout=None) on the line after the final else:
>>>>
>>>>     def get(self, timeout=None):
>>>>         self.wait(timeout)
>>>>         if not self._ready:
>>>>             raise TimeoutError
>>>>         if self._success:
>>>>             return self._value
>>>>         else:
>>>>             raise self._value
>>>
>>>
>>> The code that's failing is in self.wait. Somewhere in there you
>>> must be masking an exception and storing it in self._value
>>> instead of letting it propagate and crash your program. This is
>>> hiding the actual context.
>>>
>>> --
>>> Neil Cerutti
>>
>>
>> I'm sorry, I'm not following you.  The "get" routine (and thus self.wait)
>> is part of the "pool" module in the Python multiprocessing library.
>> None of my code has a class or function named "get".
>>
>> -Bill
>>
>
> My question is why bother with multithreading?  Why not just do these as
> separate processes?  You said "they in no way interact with each other" and
> that's a clear clue that separate processes would be cleaner.

It's using multiprocessing rather than threads: they are separate processes.

>
> Without knowing anything about those libraries, I'd guess that somewhere
> they do store state in a global attribute or equivalent, and when that is
> accessed by both threads, it can crash.

It's state that is passed to it by the subprocess and should only be
accessed by the top-level process after the subprocess completes (I
think!).

>
> Separate processes will find it much more difficult to interact, which is a
> good thing most of the time.  Further, they seem to be scheduled more
> efficiently because of the GIL, though that may not make that much
> difference when you're time-limited by network data.

They are separate processes and do not share the GIL (unless I'm very
much mistaken). Also I think the underlying program is limited by the
call to sleep for 15 seconds.
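
A quick way to check the separate-process point is to have each Pool worker report its process ID. This is only a sketch and is not taken from the monitor code; `whoami` is a made-up helper. The PIDs the workers report differ from the parent's, so each worker has its own interpreter and its own GIL.

```python
import os
from multiprocessing import Pool


def whoami(tag):
    """Hypothetical helper: return the tag and the PID of the process running it."""
    return tag, os.getpid()


if __name__ == "__main__":
    pool = Pool(processes=2)
    try:
        # Each call runs in a worker process, not in this one.
        results = pool.map(whoami, ["a", "b"])
    finally:
        pool.close()
        pool.join()
    print("parent pid: %d" % os.getpid())
    for tag, pid in results:
        print("%s ran in pid %d" % (tag, pid))
```

Both tasks may land on the same worker, so the two worker PIDs are not guaranteed to differ from each other, only from the parent's.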


Oscar


Re: Finding the source of an exception in a python multiprocessing program

2013-04-24 Thread Dave Angel

On 04/24/2013 05:09 PM, William Ray Wing wrote:
> On Apr 24, 2013, at 4:31 PM, Neil Cerutti  wrote:
>
>> On 2013-04-24, William Ray Wing  wrote:
>>> When I look at the pool module, the error is occurring in
>>> get(self, timeout=None) on the line after the final else:
>>>
>>>     def get(self, timeout=None):
>>>         self.wait(timeout)
>>>         if not self._ready:
>>>             raise TimeoutError
>>>         if self._success:
>>>             return self._value
>>>         else:
>>>             raise self._value
>>
>> The code that's failing is in self.wait. Somewhere in there you
>> must be masking an exception and storing it in self._value
>> instead of letting it propagate and crash your program. This is
>> hiding the actual context.
>>
>> --
>> Neil Cerutti
>
> I'm sorry, I'm not following you.  The "get" routine (and thus self.wait)
> is part of the "pool" module in the Python multiprocessing library.
> None of my code has a class or function named "get".
>
> -Bill

My question is why bother with multithreading?  Why not just do these as
separate processes?  You said "they in no way interact with each other"
and that's a clear clue that separate processes would be cleaner.

Without knowing anything about those libraries, I'd guess that somewhere
they do store state in a global attribute or equivalent, and when that
is accessed by both threads, it can crash.

Separate processes will find it much more difficult to interact, which
is a good thing most of the time.  Further, they seem to be scheduled
more efficiently because of the GIL, though that may not make that much
difference when you're time-limited by network data.

--
DaveA


Re: Finding the source of an exception in a python multiprocessing program

2013-04-24 Thread MRAB

On 24/04/2013 20:25, William Ray Wing wrote:

> I run a bit of python code that monitors my connection to the greater
> Internet.  It checks connectivity to the requested target IP addresses,
> logging both successes and failures, once every 15 seconds.  I see failures
> quite regularly, predictably on Sunday nights after midnight when various
> networks are undergoing maintenance.  I'm trying to use python's
> multiprocessing library to run multiple copies in parallel to check
> connectivity to different parts of the country (they in no way interact with
> each other).
>
> On rare occasions (maybe once every couple of months) I get the following
> exception and traceback:
>
> Traceback (most recent call last):
>   File "./CM_Harness.py", line 12, in <module>
>     Foo = pool.map(monitor, targets)  # and hands off two targets
>   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 227, in map
>     return self.map_async(func, iterable, chunksize).get()
>   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 528, in get
>     raise self._value
> IndexError: list index out of range
>
> The code where the traceback occurs is:
>
> #!/usr/bin/env python
>
> """ Harness to call multiple parallel copies
>     of the basic monitor program
> """
>
> from multiprocessing import Pool
> from Connection_Monitor import monitor
>
> targets = ["8.8.8.8", "www.ncsa.edu"]
> pool = Pool(processes=2)  # start 2 worker processes
> Foo = pool.map(monitor, targets)  # and hands off two targets
>
>
> Line 12 in my code is simply the line that launches the underlying monitor
> code.  I'm assuming that the real error is occurring in the monitor program
> that is being launched, but I'm at a loss as to what to do to get a better
> handle on what's going wrong. Since, as I said, I see failures quite
> regularly, typically on Sunday nights after midnight when various networks
> are undergoing maintenance, I don't _think_ the exception is being triggered
> by that sort of failure.


[snip]
If the exception is being raised by 'monitor', you could try catching
the exception within that (or write a simple wrapper function which
calls it), write the traceback to a logfile, and then re-raise.
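
The wrapper idea could look roughly like the sketch below. The `monitor` here is an invented stand-in that fails with an IndexError for one target (the real Connection_Monitor code is not shown in the thread), and the log file name is also an assumption. The point is only that the traceback is written out inside the worker, where the real context still exists, before the exception is re-raised to the pool.

```python
import traceback
from multiprocessing import Pool

LOG_PATH = "monitor_errors.log"  # assumed log file name, not from the thread


def monitor(target):
    """Invented stand-in for Connection_Monitor.monitor (real code not shown)."""
    fields = target.split(".")
    return fields[3]  # IndexError when the target has fewer than four fields


def logged_monitor(target):
    """Call monitor(); on failure, log the worker-side traceback, then re-raise."""
    try:
        return monitor(target)
    except Exception:
        with open(LOG_PATH, "a") as log:
            log.write("monitor(%r) failed:\n%s\n" % (target, traceback.format_exc()))
        raise


if __name__ == "__main__":
    pool = Pool(processes=2)
    try:
        # Hand the wrapper, not monitor itself, to the pool.
        pool.map(logged_monitor, ["8.8.8.8", "www.ncsa.edu"])
    except IndexError:
        print("a worker failed; see " + LOG_PATH)
    finally:
        pool.close()
        pool.join()
```

The wrapper has to be a module-level function so the pool can pickle it. In the harness, the only change would be passing `logged_monitor` to `pool.map` in place of `monitor`.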



Re: Finding the source of an exception in a python multiprocessing program

2013-04-24 Thread Oscar Benjamin
On 24 April 2013 20:25, William Ray Wing  wrote:
> I run a bit of python code that monitors my connection to the greater 
> Internet.  It checks connectivity to the requested target IP addresses, 
> logging both successes and failures, once every 15 seconds.  I see failures 
> quite regularly, predictably on Sunday nights after midnight when various 
> networks are undergoing maintenance.  I'm trying to use python's 
> multiprocessing library to run multiple copies in parallel to check 
> connectivity to different parts of the country (they in no way interact with 
> each other).
>
> On rare occasions (maybe once every couple of months) I get the following 
> exception and traceback:
>
> Traceback (most recent call last):
>   File "./CM_Harness.py", line 12, in <module>
>     Foo = pool.map(monitor, targets)  # and hands off two targets
>   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 227, in map
>     return self.map_async(func, iterable, chunksize).get()
>   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 528, in get
>     raise self._value
> IndexError: list index out of range
>
> The code where the traceback occurs is:
>
> #!/usr/bin/env python
>
> """ Harness to call multiple parallel copies
> of the basic monitor program
> """
>
> from multiprocessing import Pool
> from Connection_Monitor import monitor
>
> targets = ["8.8.8.8", "www.ncsa.edu"]
> pool = Pool(processes=2)  # start 2 worker processes
> Foo = pool.map(monitor, targets)  # and hands off two targets
>
>
> Line 12 in my code is simply the line that launches the underlying monitor 
> code.  I'm assuming that the real error is occurring in the monitor program 
> that is being launched, but I'm at a loss as to what to do to get a better 
> handle on what's going wrong. Since, as I said, I see failures quite 
> regularly, typically on Sunday nights after midnight when various networks 
> are undergoing maintenance, I don't _think_ the exception is being triggered 
> by that sort of failure.
>
> When I look at the pool module, the error is occurring in get(self, 
> timeout=None) on the line after the final else:
>
>     def get(self, timeout=None):
>         self.wait(timeout)
>         if not self._ready:
>             raise TimeoutError
>         if self._success:
>             return self._value
>         else:
>             raise self._value
>
>
> Python v 2.7.3, from Python.org, running on Mac OS-X 10.8.3

This looks to me like a bug in multiprocessing but I'm not very
experienced with it. Perhaps it would be good to open an issue on the
tracker. It might not be solvable without an easier way of reproducing
it though.
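
For what it's worth, a traceback of this shape can be reproduced without any bug in multiprocessing: Pool catches whatever the worker function raises, sends the exception object back, and re-raises it in the parent from get(). That does not rule out a library bug, but it suggests looking at the worker first. A minimal sketch, where `flaky` is an invented stand-in for monitor and not the real code:

```python
from multiprocessing import Pool


def flaky(target):
    """Invented worker: raises IndexError for targets with fewer than four fields."""
    parts = target.split(".")
    return parts[3]


def run(targets):
    """Map flaky() over targets; return the exception the parent sees, or None."""
    pool = Pool(processes=2)
    try:
        pool.map(flaky, targets)
    except Exception as exc:
        return exc
    finally:
        pool.close()
        pool.join()
    return None


if __name__ == "__main__":
    error = run(["8.8.8.8", "www.ncsa.edu"])
    # The parent sees the worker's exception type, raised from inside pool.py.
    print(type(error).__name__)
    # Where available, the worker-side traceback rides along as __cause__.
    print(getattr(error, "__cause__", None))
```

On Python 2.7 the worker-side traceback is lost, which is why the report ends in pool.py. On Python 3.4 and later, as far as I know, the original traceback text is attached to the re-raised exception as `__cause__`.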


Oscar


Re: Finding the source of an exception in a python multiprocessing program

2013-04-24 Thread William Ray Wing
On Apr 24, 2013, at 4:31 PM, Neil Cerutti  wrote:

> On 2013-04-24, William Ray Wing  wrote:
>> When I look at the pool module, the error is occurring in
>> get(self, timeout=None) on the line after the final else:
>> 
>>     def get(self, timeout=None):
>>         self.wait(timeout)
>>         if not self._ready:
>>             raise TimeoutError
>>         if self._success:
>>             return self._value
>>         else:
>>             raise self._value
> 
> The code that's failing is in self.wait. Somewhere in there you
> must be masking an exception and storing it in self._value
> instead of letting it propagate and crash your program. This is
> hiding the actual context.
> 
> -- 
> Neil Cerutti

I'm sorry, I'm not following you.  The "get" routine (and thus self.wait) is 
part of the "pool" module in the Python multiprocessing library.
None of my code has a class or function named "get".

-Bill



Re: Finding the source of an exception in a python multiprocessing program

2013-04-24 Thread Neil Cerutti
On 2013-04-24, William Ray Wing  wrote:
> When I look at the pool module, the error is occurring in
> get(self, timeout=None) on the line after the final else:
>
>     def get(self, timeout=None):
>         self.wait(timeout)
>         if not self._ready:
>             raise TimeoutError
>         if self._success:
>             return self._value
>         else:
>             raise self._value

The code that's failing is in self.wait. Somewhere in there you
must be masking an exception and storing it in self._value
instead of letting it propagate and crash your program. This is
hiding the actual context.

-- 
Neil Cerutti


Finding the source of an exception in a python multiprocessing program

2013-04-24 Thread William Ray Wing
I run a bit of python code that monitors my connection to the greater Internet. 
 It checks connectivity to the requested target IP addresses, logging both 
successes and failures, once every 15 seconds.  I see failures quite regularly, 
predictably on Sunday nights after midnight when various networks are 
undergoing maintenance.  I'm trying to use python's multiprocessing library to 
run multiple copies in parallel to check connectivity to different parts of the 
country (they in no way interact with each other).

On rare occasions (maybe once every couple of months) I get the following 
exception and traceback:

Traceback (most recent call last):
  File "./CM_Harness.py", line 12, in <module>
    Foo = pool.map(monitor, targets)  # and hands off two targets
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 227, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 528, in get
    raise self._value
IndexError: list index out of range

The code where the traceback occurs is:

#!/usr/bin/env python

""" Harness to call multiple parallel copies
of the basic monitor program
"""

from multiprocessing import Pool
from Connection_Monitor import monitor

targets = ["8.8.8.8", "www.ncsa.edu"]
pool = Pool(processes=2)  # start 2 worker processes
Foo = pool.map(monitor, targets)  # and hands off two targets


Line 12 in my code is simply the line that launches the underlying monitor 
code.  I'm assuming that the real error is occurring in the monitor program 
that is being launched, but I'm at a loss as to what to do to get a better 
handle on what's going wrong. Since, as I said, I see failures quite regularly, 
typically on Sunday nights after midnight when various networks are undergoing 
maintenance, I don't _think_ the exception is being triggered by that sort of 
failure.

When I look at the pool module, the error is occurring in get(self, 
timeout=None) on the line after the final else:

    def get(self, timeout=None):
        self.wait(timeout)
        if not self._ready:
            raise TimeoutError
        if self._success:
            return self._value
        else:
            raise self._value


Python v 2.7.3, from Python.org, running on Mac OS-X 10.8.3

Thanks for any suggestions,
Bill