Re: avoiding deadlocks on client handle close w/ python/c api

2010-05-04 Thread Kapil Thangavelu
I've constructed a simple example, using just the zkpython library and
condition variables, that will deadlock. I've filed a new ticket for it:

https://issues.apache.org/jira/browse/ZOOKEEPER-763

the gdb stack traces look suspiciously like the ones in 591, but sans the
watchers.
https://issues.apache.org/jira/browse/ZOOKEEPER-591

the attached example on the ticket will deadlock in zk 3.3.0 (which has the
fix for 591) and trunk.

-kapil

On Mon, May 3, 2010 at 9:48 PM, Kapil Thangavelu kapil.f...@gmail.com wrote:

 Hi Folks,

 I'm constructing an async api on top of the zookeeper python bindings for
 twisted. The intent is a thin wrapper around the existing async api that
 integrates with the twisted python event loop
 (http://www.twistedmatrix.com).

 One issue I'm running into while developing unit tests: deadlocks occur
 if we attempt to close a handle while there are any outstanding async
 requests (aget, acreate, etc). Normally on close both the io thread and
 the completion thread are terminated and joined; however, with
 outstanding async requests, the completion thread won't be in a
 joinable state, and we effectively hang when the main thread does the join.
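
 To make the hang concrete, here is a minimal sketch using only Python's
 standard library. It is not zkpython code: the `resource` lock, the
 `completion_thread_body` function, and `close_with_timeout` are all
 hypothetical stand-ins, and the lock simply represents whatever the
 completion thread is waiting on. It shows that a join cannot return while
 the joined thread is blocked, and that a timeout at least makes the
 condition observable.

```python
import threading

# Hypothetical stand-in for whatever the completion thread is waiting on;
# this is not part of zkpython.
resource = threading.Lock()


def completion_thread_body():
    # Blocks until the main thread releases the resource.
    with resource:
        pass


def close_with_timeout(worker, timeout):
    """Join the worker; return True if it terminated, False if it is still alive."""
    worker.join(timeout)
    return not worker.is_alive()


resource.acquire()  # the main thread holds the resource
worker = threading.Thread(target=completion_thread_body, daemon=True)
worker.start()

# The worker is blocked, so the join gives up after the timeout.
hung = not close_with_timeout(worker, timeout=0.2)

# Once the resource is released the worker can finish and be joined.
resource.release()
finished = close_with_timeout(worker, timeout=2.0)

print("join timed out:", hung, "| finished after release:", finished)
```

 Without the timeout, the first join would block forever, which matches the
 symptom described above.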

 I'm curious whether this would be considered a bug. afaics the ideal
 behavior would be, on close of a handle, to effectively clear out any
 remaining callbacks and let the completion thread terminate.
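
 The behavior suggested above can be sketched in plain Python. This is an
 illustration under stated assumptions, not how the zookeeper C client or
 zkpython is implemented: `CompletionQueue`, `submit`, and the `_STOP`
 sentinel are hypothetical names. On close, callbacks that have not yet been
 delivered are discarded, and the worker is told to exit so that the join can
 return.

```python
import queue
import threading

_STOP = object()  # sentinel that tells the worker thread to exit


class CompletionQueue:
    """Hypothetical stand-in for a client's completion thread and its queue."""

    def __init__(self):
        self._pending = queue.Queue()
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def _run(self):
        # Deliver queued completions until the stop sentinel arrives.
        while True:
            item = self._pending.get()
            if item is _STOP:
                return
            callback, result = item
            callback(result)

    def submit(self, callback, result):
        self._pending.put((callback, result))

    def close(self, timeout=2.0):
        # Discard every callback that has not been delivered yet.
        dropped = 0
        while True:
            try:
                self._pending.get_nowait()
                dropped += 1
            except queue.Empty:
                break
        # Ask the worker to exit, then join it.
        self._pending.put(_STOP)
        self._thread.join(timeout)
        return dropped, not self._thread.is_alive()


seen = []
completions = CompletionQueue()
for value in range(3):
    completions.submit(seen.append, value)
dropped, terminated = completions.close()
print("delivered:", len(seen), "dropped:", dropped, "terminated:", terminated)
```

 How many callbacks are delivered versus dropped depends on thread timing;
 the point is only that every request is accounted for and the worker always
 becomes joinable.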

 I've tried adding some bookkeeping to the api to guard against closing
 while there is an outstanding completion request, but it's an imperfect
 solution due to the nature of the event loop integration. The problem is that
 the python callback invoked by the completion thread in turn schedules a
 function for the main thread. In twisted the api for this is implemented by
 appending the function to a list attribute on the reactor and then writing a
 byte to a pipe to wake up the main thread. If a thread switch to the main
 thread occurs before the completion thread callback returns, the scheduled
 function runs and the rest of the application keeps processing. The last
 step of that processing in the unit tests is to close the connection, which
 results in a deadlock.
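
 The window described above can be reproduced deterministically with the
 standard library alone. This is a simulation, not Twisted or zkpython code:
 `main_queue` stands in for the reactor's list of scheduled calls,
 `schedule_on_main` for a call such as Twisted's `reactor.callFromThread`,
 and `in_flight`, `safe_to_close`, and the `callback_may_return` event are
 hypothetical names. The event forces the thread switch to happen before the
 callback returns.

```python
import queue
import threading

main_queue = queue.Queue()   # stands in for the reactor's scheduled-call list
in_flight = 0                # callbacks the completion thread has not returned from
in_flight_lock = threading.Lock()
callback_may_return = threading.Event()
observed = []


def schedule_on_main(fn):
    # A real reactor would also write a wakeup byte to a pipe here.
    main_queue.put(fn)


def on_result():
    # Runs in the main thread; records how many callbacks are still in flight.
    observed.append(in_flight)


def completion_callback():
    schedule_on_main(on_result)
    # Force the "thread switch before the callback returns" case.
    callback_may_return.wait()


def completion_thread():
    global in_flight
    completion_callback()
    # Only after the callback has returned is the request truly finished.
    with in_flight_lock:
        in_flight -= 1


def safe_to_close():
    with in_flight_lock:
        return in_flight == 0


in_flight = 1  # one outstanding async request
worker = threading.Thread(target=completion_thread)
worker.start()

scheduled = main_queue.get(timeout=2.0)
scheduled()                       # main thread runs while the callback is still active
closed_early = safe_to_close()    # closing now would hit the deadlock window

callback_may_return.set()         # let the callback return
worker.join(2.0)
closed_late = safe_to_close()

print("in flight when scheduled function ran:", observed[0])
print("safe to close early:", closed_early, "| safe to close later:", closed_late)
```

 A guard of this kind only helps if the counter is decremented after the
 callback returns, and if close is deferred (rather than attempted) while the
 count is non-zero.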

 I've included some of the client log and gdb stack traces from a deadlocked
 client process.

 thanks,

 Kapil






Re: avoiding deadlocks on client handle close w/ python/c api

2010-05-04 Thread Patrick Hunt

Thanks Kapil. Mahadev, perhaps you could take a look at this as well?

Patrick


Re: avoiding deadlocks on client handle close w/ python/c api

2010-05-04 Thread Mahadev Konar
Sure, I'll take a look at it.

Thanks
mahadev

