[ https://issues.apache.org/jira/browse/ZOOKEEPER-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexander A. Strelets updated ZOOKEEPER-2894: --------------------------------------------- Description: ZooKeeper C Client *+single thread+* build First of all, ZooKeeper C Client design allows calling _zookeeper_close()_ in two ways: a) from a ZooKeeper callback handler (completion or watcher) which in turn is called through _zookeeper_process()_ b) and from other places -- i.e., when the call-stack does not pass through any of zookeeper mechanics prior to enter into mentioned _zookeeper_close()_ The issue described here below is +specific only to the case (b)+. So, it's Ok with the case (a). When _zookeeper_close()_ is called in the (b) way, the following happens: 1. If there are requests waiting for responses in _zh.sent_requests_ queue, they all are removed from this queue and each of them is "completed" with personal fake response having status ZCLOSING. Such fake responses are put into _zh.completions_to_process_ queue. It's Ok 2. But then, _zh.completions_to_process_ queue is left unhandled. +Neither completion callbacks are called, nor dynamic memory allocated for fake responses is freed.+ 3. Different structures within _zh_ are dismissed and finally _zh_ is freed This is illustrated on the screenshot attached to this ticket: you may see that the next instruction to execute will be _free(zh)_ while _zh.completions_to_process_ queue is not empty (see the "Variables" tab to the right). *Consequently:* 1. At least there is definitely the +memory leak+ in the case (b) 2. And it looks like a great misbehavior not to call completions on sent requests in the case (b) as soon as they are called with ZCLOSING in the case (a) -- so, the +completions leak+ To reproduce the case (b) do the following: - open ZooKeeper session, connect to a server, receive and process connected-watch, etc. - then call for example _zoo_acreate()_ with valid arguments -- it shall return ZOK - then, immediately after it returned, call _zookeeper_close()_ - note that completion callback handler for _zoo_acreate()_ will not be called To reproduce the case (a) do about the same, but call _zookeeper_close()_ from a watcher or another completion callback handler. To fix this I propose calling to _process_completions()_ from _destroy(zhandle_t *zh)_ as it is done in _handle_error(zhandle_t *zh,int rc)_. was: ZooKeeper C Client *+single thread+* build First of all, ZooKeeper C Client design allows calling _zookeeper_close()_ in two ways: a) from a ZooKeeper callback handler (completion or watcher) which in turn is called through _zookeeper_process()_ b) and from other places -- i.e., when the call-stack does not pass through any of zookeeper mechanics prior to enter into mentioned _zookeeper_close()_ The issue described here below is +specific only to the case (b)+. So, it's Ok with the case (a). When _zookeeper_close()_ is called in the (b) way, the following happens: 1. If there are requests waiting for responses in _zh.sent_requests_ queue, they all are removed from this queue and each of them is "completed" with personal fake response having status ZCLOSING. Such fake responses are put into the _zh.completions_to_process_ queue. It's Ok 2. But then, _zh.completions_to_process_ queue is left unhandled. +Neither completion callbacks are called, nor dynamic memory allocated for fake responses is freed.+ 3. Different structures within _zh_ are dismissed and finally _zh_ is freed This is illustrated on the screenshot attached to this ticket: you may see that the next instruction to execute will be _free(zh)_ while _zh.completions_to_process_ queue is not empty (see the "Variables" tab to the right). *Consequently:* 1. At least there is definitely the +memory leak+ in the case (b) 2. And it looks like a great misbehavior not to call completions on sent requests in the case (b) as soon as they are called with ZCLOSING in the case (a) -- so, the +completions leak+ To reproduce the case (b) do the following: - open ZooKeeper session, connect to a server, receive and process connected-watch, etc. - then call for example _zoo_acreate()_ with valid arguments -- it shall return ZOK - then, immediately after it returned, call _zookeeper_close()_ - note that completion callback handler for _zoo_acreate()_ will not be called To reproduce the case (a) do about the same, but call _zookeeper_close()_ from a watcher or another completion callback handler. To fix this I propose calling to _process_completions()_ from _destroy(zhandle_t *zh)_ as it is done in _handle_error(zhandle_t *zh,int rc)_. > Memory and completions leak on zookeeper_close. > ----------------------------------------------- > > Key: ZOOKEEPER-2894 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2894 > Project: ZooKeeper > Issue Type: Bug > Components: c client > Affects Versions: 3.4.10 > Environment: Linux ubuntu 4.4.0-87-generic > gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609 > https://github.com/apache/zookeeper.git > branch-3.4 > Reporter: Alexander A. Strelets > Priority: Critical > Fix For: 3.4.10 > > > ZooKeeper C Client *+single thread+* build > First of all, ZooKeeper C Client design allows calling _zookeeper_close()_ in > two ways: > a) from a ZooKeeper callback handler (completion or watcher) which in turn is > called through _zookeeper_process()_ > b) and from other places -- i.e., when the call-stack does not pass through > any of zookeeper mechanics prior to enter into mentioned _zookeeper_close()_ > The issue described here below is +specific only to the case (b)+. So, it's > Ok with the case (a). > When _zookeeper_close()_ is called in the (b) way, the following happens: > 1. If there are requests waiting for responses in _zh.sent_requests_ queue, > they all are removed from this queue and each of them is "completed" with > personal fake response having status ZCLOSING. Such fake responses are put > into _zh.completions_to_process_ queue. It's Ok > 2. But then, _zh.completions_to_process_ queue is left unhandled. +Neither > completion callbacks are called, nor dynamic memory allocated for fake > responses is freed.+ > 3. Different structures within _zh_ are dismissed and finally _zh_ is freed > This is illustrated on the screenshot attached to this ticket: you may see > that the next instruction to execute will be _free(zh)_ while > _zh.completions_to_process_ queue is not empty (see the "Variables" tab to > the right). > *Consequently:* > 1. At least there is definitely the +memory leak+ in the case (b) > 2. And it looks like a great misbehavior not to call completions on sent > requests in the case (b) as soon as they are called with ZCLOSING in the case > (a) -- so, the +completions leak+ > To reproduce the case (b) do the following: > - open ZooKeeper session, connect to a server, receive and process > connected-watch, etc. > - then call for example _zoo_acreate()_ with valid arguments -- it shall > return ZOK > - then, immediately after it returned, call _zookeeper_close()_ > - note that completion callback handler for _zoo_acreate()_ will not be called > To reproduce the case (a) do about the same, but call _zookeeper_close()_ > from a watcher or another completion callback handler. > To fix this I propose calling to _process_completions()_ from > _destroy(zhandle_t *zh)_ as it is done in _handle_error(zhandle_t *zh,int > rc)_. -- This message was sent by Atlassian JIRA (v6.4.14#64029)