[ https://issues.apache.org/jira/browse/ZOOKEEPER-2891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexander A. Strelets updated ZOOKEEPER-2891:
---------------------------------------------
    Description: 
Function *_deserialize_multi()_* hits *_assert(entry)_* when called for the so-called "fake response", which is fabricated by the function _free_completions()_, for example when _zookeeper_close()_ is called while there is a pending _multi_ request.

Such a fake response includes only the header and zero bytes for the body. Because of this, {{deserialize_MultiHeader(ia, "multiheader", &mhdr)}}, which is called repeatedly for each {{completion_list_t *entry = dequeue_completion(clist)}}, does not assign _mhdr_ and leaves _mhdr.done == 0_ as it was originally initialized. Consequently the _while (!mhdr.done)_ loop never terminates on its own, and it finally hits _assert(entry)_ with _entry == NULL_ once all sub-requests are "completed". ~// Normally on my platform assert raises SIGABRT.~

I propose making _deserialize_multi()_ break out of the loop on _entry == NULL_ if it was called for an unsuccessful overall status of the multi response, in particular for the fake response with _ZCLOSING_ (-116) status. I have introduced the _rc0_ parameter for this.

*Another issue* with this function is that even if the while-loop exits properly, the function returns _rc == 0_, and this return code +overrides+ the true status value via {{rc = deserialize_multi(xid, cptr, ia, rc)}} in the _deserialize_response()_ function. So the _multi_ response callback +handler would be called with _rc == ZOK_ instead of _rc == ZCLOSING_+, which is strictly wrong. To fix this I propose initializing _rc_ with the introduced _rc0_ instead of zero (which is indeed _ZOK_).

This is a proposed fix: https://github.com/apache/zookeeper/pull/360

> SIGABRT from assert during fake completion on zookeeper_close.
> --------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2891
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2891
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.4.10
>         Environment: Linux ubuntu 4.4.0-87-generic
> gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
> https://github.com/apache/zookeeper.git
> branch-3.4
>            Reporter: Alexander A. Strelets
>            Priority: Critical
>              Labels: easyfix
>             Fix For: 3.4.10

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
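
For readers who want to see the failure mode from the description in isolation, here is a minimal, self-contained C sketch of the loop and of the two proposed changes. It is not the code from the pull request and does not use the real client types: every {{sketch_*}} name is a hypothetical stand-in, the completion list is reduced to a counter, and the input archive is reduced to a count of remaining multi-headers (zero for a fake response). Only the names _rc0_, _ZOK_ and _ZCLOSING_ (-116) come from the description.

```c
#include <assert.h>
#include <stddef.h>

enum { ZOK = 0, ZCLOSING = -116 };

/* Hypothetical stand-in for the completion list: a plain counter. */
typedef struct {
    int pending;    /* sub-requests still queued */
    int completed;  /* sub-requests already handed to their callbacks */
} sketch_clist_t;

/* Hypothetical stand-in for the input archive: the number of multi-headers
 * left in the body. A fake response has an empty body, so this is zero. */
typedef struct {
    int headers_left;
} sketch_archive_t;

typedef struct {
    int done;
} sketch_mhdr_t;

/* Mimics the behaviour described for deserialize_MultiHeader(): on an empty
 * body it fails and leaves *mhdr untouched, so mhdr->done stays 0. */
static int sketch_read_header(sketch_archive_t *ia, sketch_mhdr_t *mhdr)
{
    if (ia->headers_left == 0)
        return -1;
    ia->headers_left--;
    mhdr->done = (ia->headers_left == 0); /* last header is the terminator */
    return 0;
}

/* Mimics dequeue_completion(): returns 1 if an entry was dequeued, 0 if the
 * list is empty (the real function would return NULL in that case). */
static int sketch_dequeue(sketch_clist_t *clist)
{
    if (clist->pending == 0)
        return 0;
    clist->pending--;
    return 1;
}

/* Sketch of the loop with both proposed changes applied. */
static int sketch_deserialize_multi(sketch_clist_t *clist,
                                    sketch_archive_t *ia, int rc0)
{
    int rc = rc0;               /* change 2: start from the overall status */
    sketch_mhdr_t mhdr = { 0 };

    sketch_read_header(ia, &mhdr);
    while (!mhdr.done) {
        int have_entry = sketch_dequeue(clist);
        if (rc0 != ZOK && !have_entry)
            break;              /* change 1: failed response, list drained */
        assert(have_entry);     /* still an invariant for a real response */
        clist->completed++;     /* the sub-request callback would run here */
        sketch_read_header(ia, &mhdr);
    }
    return rc;
}
```

Calling {{sketch_deserialize_multi()}} with three pending sub-requests, an empty archive and _rc0 == ZCLOSING_ drains the list and returns _ZCLOSING_ without tripping the assert. Without the early break, the fourth iteration would reach {{assert(have_entry)}} with nothing left to dequeue, which corresponds to the SIGABRT reported above.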