[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15003080#comment-15003080
 ] 

Hadriel Kaplan commented on ZOOKEEPER-1556:
-------------------------------------------

We've found the same thing at my company. In our case, the leak occurred
because the completion thread can exit before the IO thread does, just before
the IO thread creates a completion entry for the completion thread to process.
We added a mutex guard that prevents the completion thread (do_completion())
from exiting its while-loop until the IO thread (do_io()) has exited its own
while-loop, and we also added an extra call to process_completions(zh) just
after the while-loop in case there is one more entry to process.
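
To make the idea concrete, here is a minimal sketch of that pattern, not the
actual patch. Every name in it (handle_t, io_done, io_loop, completion_loop,
run_demo) is hypothetical and only models the roles of do_io(),
do_completion() and process_completions(); it also assumes the guard is a
plain flag protected by the queue mutex, and that the IO side releases it
itself when its loop ends.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names come from the ZooKeeper sources. */
typedef struct entry { struct entry *next; } entry_t;

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    entry_t *head;        /* pending completion entries */
    int close_requested;  /* set by the closing thread */
    int io_done;          /* the guard: set once the IO loop has exited */
    int processed;        /* number of entries freed by the completion side */
} handle_t;

static void enqueue(handle_t *h) {
    entry_t *e = malloc(sizeof *e);
    pthread_mutex_lock(&h->lock);
    e->next = h->head;
    h->head = e;
    pthread_cond_broadcast(&h->cond);
    pthread_mutex_unlock(&h->lock);
}

/* Free every queued entry; the caller must hold h->lock. */
static void drain_locked(handle_t *h) {
    while (h->head) {
        entry_t *e = h->head;
        h->head = e->next;
        free(e);
        h->processed++;
    }
}

/* Models the IO thread: it queues one entry after close was requested. */
static void *io_loop(void *arg) {
    handle_t *h = arg;
    pthread_mutex_lock(&h->lock);
    while (!h->close_requested)
        pthread_cond_wait(&h->cond, &h->lock);
    pthread_mutex_unlock(&h->lock);
    enqueue(h);                        /* the late entry that used to leak */
    pthread_mutex_lock(&h->lock);
    h->io_done = 1;                    /* release the guard */
    pthread_cond_broadcast(&h->cond);
    pthread_mutex_unlock(&h->lock);
    return NULL;
}

/* Models the completion thread: it may not leave its loop before the IO loop. */
static void *completion_loop(void *arg) {
    handle_t *h = arg;
    pthread_mutex_lock(&h->lock);
    while (!(h->close_requested && h->io_done)) {
        drain_locked(h);
        pthread_cond_wait(&h->cond, &h->lock);
    }
    drain_locked(h);                   /* extra pass after the loop */
    pthread_mutex_unlock(&h->lock);
    return NULL;
}

/* Runs one shutdown; returns the number of entries left in the queue. */
static int run_demo(int *processed) {
    handle_t h;
    pthread_t io, comp;
    int leaked = 0;
    pthread_mutex_init(&h.lock, NULL);
    pthread_cond_init(&h.cond, NULL);
    h.head = NULL; h.close_requested = 0; h.io_done = 0; h.processed = 0;
    pthread_create(&comp, NULL, completion_loop, &h);
    pthread_create(&io, NULL, io_loop, &h);
    enqueue(&h);
    enqueue(&h);
    pthread_mutex_lock(&h.lock);
    h.close_requested = 1;
    pthread_cond_broadcast(&h.cond);
    pthread_mutex_unlock(&h.lock);
    pthread_join(io, NULL);
    pthread_join(comp, NULL);
    for (entry_t *e = h.head; e; e = e->next) leaked++;
    *processed = h.processed;
    pthread_cond_destroy(&h.cond);
    pthread_mutex_destroy(&h.lock);
    return leaked;
}
```

Without the io_done test in completion_loop(), the completion side could
return as soon as close_requested is set, and the late entry would stay in
the queue with nobody left to free it.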

We clear the new mutex guard inside adaptor_finish(), after the IO thread has
ended. The reason is that the IO thread (and the completion thread) can finish
their while-loops as soon as zh->close_requested is set to 1 in
zookeeper_close(), possibly before free_completions() has run. Since
free_completions() can create one or more fake responses for the completion
thread to process (for unanswered requests), it can also add entries to the
completion queue after the completion thread has exited, which causes the
memory leak. So keeping the completion thread inside its while-loop until
adaptor_finish() is called seems to make sense.
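
The shutdown ordering can be modelled roughly as below. This is a sketch
under stated assumptions, not ZooKeeper code: client_t, hold_completion,
fail_pending(), finish() and close_client() are hypothetical names standing
in for the handle, the new guard, free_completions(), adaptor_finish() and
zookeeper_close(), and the call order inside close_client() is an assumption
made for the illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical model; the names only mirror the roles described above. */
typedef struct node { struct node *next; } node_t;

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    node_t *queue;
    int close_requested;
    int hold_completion;   /* the guard: completion thread stays while set */
    int handled;           /* entries freed by the completion thread */
    pthread_t io, completion;
} client_t;

static void push(client_t *c) {
    node_t *n = malloc(sizeof *n);
    pthread_mutex_lock(&c->lock);
    n->next = c->queue;
    c->queue = n;
    pthread_cond_broadcast(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

/* IO thread: leaves its loop as soon as close is requested. */
static void *io_main(void *arg) {
    client_t *c = arg;
    pthread_mutex_lock(&c->lock);
    while (!c->close_requested)
        pthread_cond_wait(&c->cond, &c->lock);
    pthread_mutex_unlock(&c->lock);
    return NULL;
}

/* Completion thread: drains, then exits only once the guard is released. */
static void *completion_main(void *arg) {
    client_t *c = arg;
    pthread_mutex_lock(&c->lock);
    for (;;) {
        while (c->queue) {
            node_t *n = c->queue;
            c->queue = n->next;
            free(n);
            c->handled++;
        }
        if (c->close_requested && !c->hold_completion)
            break;
        pthread_cond_wait(&c->cond, &c->lock);
    }
    pthread_mutex_unlock(&c->lock);
    return NULL;
}

/* Stand-in for free_completions(): queue fake responses for unanswered requests. */
static void fail_pending(client_t *c, int unanswered) {
    for (int i = 0; i < unanswered; i++)
        push(c);
}

/* Stand-in for adaptor_finish(): join the IO thread, then release the guard. */
static void finish(client_t *c) {
    pthread_join(c->io, NULL);
    pthread_mutex_lock(&c->lock);
    c->hold_completion = 0;
    pthread_cond_broadcast(&c->cond);
    pthread_mutex_unlock(&c->lock);
    pthread_join(c->completion, NULL);
}

/* Stand-in for zookeeper_close(); returns how many fake responses were handled. */
static int close_client(int unanswered) {
    client_t c;
    int handled;
    pthread_mutex_init(&c.lock, NULL);
    pthread_cond_init(&c.cond, NULL);
    c.queue = NULL; c.close_requested = 0; c.hold_completion = 1; c.handled = 0;
    pthread_create(&c.completion, NULL, completion_main, &c);
    pthread_create(&c.io, NULL, io_main, &c);
    pthread_mutex_lock(&c.lock);
    c.close_requested = 1;
    pthread_cond_broadcast(&c.cond);
    pthread_mutex_unlock(&c.lock);
    fail_pending(&c, unanswered);   /* runs after close was requested */
    finish(&c);
    assert(c.queue == NULL);        /* nothing is left behind to leak */
    handled = c.handled;
    pthread_cond_destroy(&c.cond);
    pthread_mutex_destroy(&c.lock);
    return handled;
}
```

Because the completion thread checks hold_completion only right after
draining the queue under the same lock, every fake response queued before
finish() releases the guard is processed before the thread exits.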

> Memory leak reported by valgrind mt version
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1556
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1556
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.4.4
>            Reporter: André Martin
>            Priority: Minor
>
> Valgrind reports the following memory leak when using the c-client (mt):
> ==11674== 18 bytes in 9 blocks are indirectly lost in loss record 14 of 173
> ==11674==    at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==11674==    by 0xC8064A: ia_deserialize_string (recordio.c:271)
> ==11674==    by 0xC81F2E: deserialize_String_vector (zookeeper.jute.c:247)
> ==11674==    by 0xC842F9: deserialize_GetChildrenResponse (zookeeper.jute.c:874)
> ==11674==    by 0xC7E9F0: zookeeper_process (zookeeper.c:1904)
> ==11674==    by 0xC7FE5B: do_io (mt_adaptor.c:439)
> ==11674==    by 0x4E39E99: start_thread (pthread_create.c:308)
> ==11674==    by 0x5FA6DBC: clone (clone.S:112)
> ==11674== 
> ==11674== 90 (72 direct, 18 indirect) bytes in 49 blocks are definitely lost in loss record 139 of 173
> ==11674==    at 0x4C29DB4: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==11674==    by 0xC81EEE: deserialize_String_vector (zookeeper.jute.c:245)
> ==11674==    by 0xC842F9: deserialize_GetChildrenResponse (zookeeper.jute.c:874)
> ==11674==    by 0xC7E9F0: zookeeper_process (zookeeper.c:1904)
> ==11674==    by 0xC7FE5B: do_io (mt_adaptor.c:439)
> ==11674==    by 0x4E39E99: start_thread (pthread_create.c:308)
> ==11674==    by 0x5FA6DBC: clone (clone.S:112)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
