[ https://issues.apache.org/jira/browse/ZOOKEEPER-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15003080#comment-15003080 ]
Hadriel Kaplan commented on ZOOKEEPER-1556:
-------------------------------------------

We've found the same thing at my company. In our case, the leak happened because the completion thread can end before the IO thread, just before the IO thread creates a completion entry for the completion thread to process.

We added a mutex guard to prevent the completion thread (do_completion()) from exiting its while-loop until the IO thread (do_io()) has exited its own while-loop, and also added an extra call to process_completions(zh) just after the while-loop in case there is one more entry to process. We clear the new mutex guard inside adaptor_finish(), after the IO thread has ended.

The reason is that the IO thread (and the completion thread) can finish their while-loops as soon as zh->close_requested is set to 1 in zookeeper_close(), including before free_completions() has run. Since free_completions() can create one or more fake responses for the completion thread to process (for unanswered requests), it's possible for free_completions() to also add entries to the completion queue and cause the memory leak. So keeping the completion thread inside its while-loop until adaptor_finish() is called seems to make sense.
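To make the ordering described above concrete, here is a minimal sketch of the idea using plain pthreads. It is a toy model, not the actual patch and not the ZooKeeper client code: toy_handle, io_active, io_thread, completion_thread and run_shutdown_demo are hypothetical names, and process_completions here is a local stand-in that only frees queue entries. It assumes the guard is a flag protected by a mutex, which is one plausible reading of "mutex guard".

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Toy model only: every name here is hypothetical, not the ZooKeeper C client's. */
typedef struct node { struct node *next; } node_t;

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    node_t *head;          /* pending completion entries */
    int close_requested;   /* plays the role of zh->close_requested */
    int io_active;         /* the guard: stays 1 until the IO thread is joined */
    int produced;
    int processed;
} toy_handle;

/* Free every queued entry. Caller must hold h->lock. */
static void process_completions(toy_handle *h) {
    while (h->head != NULL) {
        node_t *n = h->head;
        h->head = n->next;
        free(n);
        h->processed++;
    }
}

/* Stand-in for do_io(): may enqueue one last entry after close is requested. */
static void *io_thread(void *arg) {
    toy_handle *h = arg;
    for (;;) {
        pthread_mutex_lock(&h->lock);
        int stop = h->close_requested;
        pthread_mutex_unlock(&h->lock);
        if (stop) break;
        node_t *n = malloc(sizeof *n);   /* close may be requested right here */
        if (n == NULL) abort();          /* demo only */
        pthread_mutex_lock(&h->lock);
        n->next = h->head;
        h->head = n;
        h->produced++;
        pthread_cond_broadcast(&h->cond);
        pthread_mutex_unlock(&h->lock);
    }
    return NULL;
}

/* Stand-in for do_completion(): stays in its loop while the guard is held. */
static void *completion_thread(void *arg) {
    toy_handle *h = arg;
    pthread_mutex_lock(&h->lock);
    while (!h->close_requested || h->io_active) {
        if (h->head == NULL)
            pthread_cond_wait(&h->cond, &h->lock);
        process_completions(h);
    }
    process_completions(h);              /* extra pass after the loop */
    pthread_mutex_unlock(&h->lock);
    return NULL;
}

/* Runs one start/close cycle; returns entries produced but never freed. */
int run_shutdown_demo(int min_items) {
    toy_handle h = { .head = NULL, .io_active = 1 };
    pthread_t io, comp;
    pthread_mutex_init(&h.lock, NULL);
    pthread_cond_init(&h.cond, NULL);
    if (pthread_create(&comp, NULL, completion_thread, &h) != 0) abort();
    if (pthread_create(&io, NULL, io_thread, &h) != 0) abort();

    pthread_mutex_lock(&h.lock);
    while (h.produced < min_items)       /* let some traffic flow first */
        pthread_cond_wait(&h.cond, &h.lock);
    h.close_requested = 1;               /* what zookeeper_close() does */
    pthread_cond_broadcast(&h.cond);
    pthread_mutex_unlock(&h.lock);

    pthread_join(io, NULL);              /* IO thread has fully ended */
    pthread_mutex_lock(&h.lock);
    h.io_active = 0;                     /* clear the guard, as in adaptor_finish() */
    pthread_cond_broadcast(&h.cond);
    pthread_mutex_unlock(&h.lock);
    pthread_join(comp, NULL);

    assert(h.head == NULL);
    int leaked = h.produced - h.processed;
    pthread_mutex_destroy(&h.lock);
    pthread_cond_destroy(&h.cond);
    return leaked;
}
```

Calling run_shutdown_demo(100) should return 0. In this model, removing io_active from the completion loop condition reintroduces the race: the completion thread can leave as soon as close_requested is set, and an entry the IO thread enqueues afterwards is never freed.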
> Memory leak reported by valgrind mt version
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1556
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1556
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.4.4
>            Reporter: André Martin
>            Priority: Minor
>
> Valgrind reports the following memory leak when using the c-client (mt):
> ==11674== 18 bytes in 9 blocks are indirectly lost in loss record 14 of 173
> ==11674==    at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==11674==    by 0xC8064A: ia_deserialize_string (recordio.c:271)
> ==11674==    by 0xC81F2E: deserialize_String_vector (zookeeper.jute.c:247)
> ==11674==    by 0xC842F9: deserialize_GetChildrenResponse (zookeeper.jute.c:874)
> ==11674==    by 0xC7E9F0: zookeeper_process (zookeeper.c:1904)
> ==11674==    by 0xC7FE5B: do_io (mt_adaptor.c:439)
> ==11674==    by 0x4E39E99: start_thread (pthread_create.c:308)
> ==11674==    by 0x5FA6DBC: clone (clone.S:112)
> ==11674==
> ==11674== 90 (72 direct, 18 indirect) bytes in 49 blocks are definitely lost in loss record 139 of 173
> ==11674==    at 0x4C29DB4: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==11674==    by 0xC81EEE: deserialize_String_vector (zookeeper.jute.c:245)
> ==11674==    by 0xC842F9: deserialize_GetChildrenResponse (zookeeper.jute.c:874)
> ==11674==    by 0xC7E9F0: zookeeper_process (zookeeper.c:1904)
> ==11674==    by 0xC7FE5B: do_io (mt_adaptor.c:439)
> ==11674==    by 0x4E39E99: start_thread (pthread_create.c:308)
> ==11674==    by 0x5FA6DBC: clone (clone.S:112)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)