[jira] Commented: (ZOOKEEPER-897) C Client seg faults during close

2010-10-28 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925846#action_12925846
 ] 

Patrick Hunt commented on ZOOKEEPER-897:


Perhaps we should rely on existing testing for this one, but open a new JIRA 
to refactor the client specifically to allow testing (i.e., a way to inject the 
helper code without needing to edit zookeeper.c directly)?
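
One way such an injection point could look is sketched below. This is only an illustration, not anything that exists in the client: demo_test_hook, demo_set_test_hook(), demo_hook_point(), demo_process_read() and the "before_queue_buffer" label are all hypothetical names. The idea is that the library calls a no-op hook at interesting points, and a test installs a callback (for example one that sleeps) to widen a race window without editing the source file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook type: called with the name of the point reached. */
typedef void (*demo_test_hook)(const char *point, void *ctx);

static demo_test_hook demo_hook = NULL;   /* no hook installed by default */
static void *demo_hook_ctx = NULL;

/* Tests call this to install (or, with NULL, remove) a hook. */
static void demo_set_test_hook(demo_test_hook hook, void *ctx) {
    demo_hook = hook;
    demo_hook_ctx = ctx;
}

/* Library code calls this at points where a test may want to interfere. */
static void demo_hook_point(const char *point) {
    assert(point != NULL);
    if (demo_hook != NULL)
        demo_hook(point, demo_hook_ctx);
}

/* Stand-in for a client routine that contains a race window. */
static int demo_process_read(void) {
    demo_hook_point("before_queue_buffer");  /* assumed hook location */
    return 0;
}

/* Example hook: counts how often any hook point was reached. */
static void demo_count_hook(const char *point, void *ctx) {
    (void)point;
    ++*(int *)ctx;
}
```

A test would install a hook that sleeps or signals another thread instead of the counting hook shown here; in production builds no hook is installed and the call costs one pointer comparison.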

 C Client seg faults during close
 

 Key: ZOOKEEPER-897
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-897
 Project: Zookeeper
  Issue Type: Bug
  Components: c client
Reporter: Jared Cantwell
Assignee: Jared Cantwell
 Fix For: 3.3.2, 3.4.0

 Attachments: ZOOKEEEPER-897.diff, ZOOKEEPER-897.patch


 We observed a crash while closing our C client.  It occurred in the do_io() 
 thread, which was still processing during the close() call.
 #0  queue_buffer (list=0x6bd4f8, b=0x0, add_to_front=0) at src/zookeeper.c:969
 #1  0x0046234e in check_events (zh=0x6bd480, events=<value optimized out>) at src/zookeeper.c:1687
 #2  0x00462d74 in zookeeper_process (zh=0x6bd480, events=2) at src/zookeeper.c:1971
 #3  0x00469c34 in do_io (v=0x6bd480) at src/mt_adaptor.c:311
 #4  0x77bc59ca in start_thread () from /lib/libpthread.so.0
 #5  0x76f706fd in clone () from /lib/libc.so.6
 #6  0x in ?? ()
 We tracked down the sequence of events, and the cause is that input_buffer is 
 being freed from a thread other than the do_io thread that relies on it:
 1. do_io() calls check_events()
 2. if (events&ZOOKEEPER_READ) branch executes
 3. if (rc > 0) branch executes
 4. if (zh->input_buffer != &zh->primer_buffer) branch executes
 ...in the meantime...
  5. zookeeper_close() called
  6. if (inc_ref_counter(zh,0)!=0) branch executes
  7. cleanup_bufs() is called
  8. input_buffer is freed at the end
 ...back to check_events()...
 9. queue_buffer() is called on a NULL buffer.
 I believe the fix is to call only free_completions() in zookeeper_close() 
 and not cleanup_bufs().  cleanup_bufs() was originally added there to call 
 any outstanding synchronous completions, so only free_completions() (which 
 is guarded) is needed.  I will submit a patch for review with this change.
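
 The ownership rule behind the proposed fix can be sketched as follows. This is a simplified, single-threaded illustration under stated assumptions, not the actual zookeeper.c code: demo_handle and the demo_* functions are hypothetical stand-ins, and the real client would need a lock or an atomic counter around ref_count. The point it shows is that a close request made while another user still holds a reference leaves input_buffer alone, and the buffers are released only when the last user leaves.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the client handle (not the real zhandle_t). */
typedef struct {
    int ref_count;        /* users currently inside the handle */
    int close_requested;  /* set by demo_close() */
    char *input_buffer;   /* used by the I/O path while it runs */
    int buffers_freed;    /* set once cleanup has run, for illustration */
} demo_handle;

/* Frees the buffers; must only run when nobody else uses the handle. */
static void demo_cleanup_bufs(demo_handle *h) {
    free(h->input_buffer);
    h->input_buffer = NULL;
    h->buffers_freed = 1;
}

/* A user (e.g. the I/O path) takes a reference before touching buffers. */
static void demo_enter(demo_handle *h) {
    h->ref_count++;
}

/* The last user to leave after a close request performs the cleanup. */
static void demo_leave(demo_handle *h) {
    assert(h->ref_count > 0);
    if (--h->ref_count == 0 && h->close_requested)
        demo_cleanup_bufs(h);
}

/* Close frees the buffers only if no other user is active;
 * otherwise cleanup is deferred to the final demo_leave(). */
static void demo_close(demo_handle *h) {
    h->close_requested = 1;
    if (h->ref_count == 0)
        demo_cleanup_bufs(h);
}
```

 With this rule, the interleaving in steps 5 to 9 above cannot hand a NULL buffer to the I/O path, because step 8 is postponed until the I/O path has released its reference.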

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-897) C Client seg faults during close

2010-10-28 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925858#action_12925858
 ] 

Mahadev konar commented on ZOOKEEPER-897:
-

Jared, Pat,
 I am OK without a test case for this one, because it's quite hard to create 
one. I just wanted someone else to run the tests on their machines to 
verify (since I rarely see any problems in the C tests on my machine). I will go 
ahead and commit this patch for now.





[jira] Commented: (ZOOKEEPER-897) C Client seg faults during close

2010-10-25 Thread Jared Cantwell (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924748#action_12924748
 ] 

Jared Cantwell commented on ZOOKEEPER-897:
--

We are using the 3.3.2 release.  I don't think the patch leaks memory, because 
destroy() will eventually get called (by the reentrant call to 
zookeeper_close()), which calls cleanup_bufs() and frees those buffers, right?  
Also, I had a test that reproduced this error, but it was easiest to reproduce 
if I injected artificial sleeps into the zookeeper.c file.  If that's OK, then 
I can submit that.  Otherwise, I'll see if I can devise a test that reproduces 
it some other way.
