[ https://issues.apache.org/jira/browse/ZOOKEEPER-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050289#comment-13050289 ]
Craig Calef commented on ZOOKEEPER-770:
---------------------------------------

This bug was also causing me some problems. After digging into the C code, it occurred to me that the problem runs from add_auth() down to send_info_packet(), which puts the auth packet on the to_send queue but does not wake up the IO thread via adaptor_send_queue() the way send_ping() does. As a result, the auth packet did not go out until the zoo_interest() timeout occurred (internally this looks to be about 1/3 of the timeout specified in zookeeper_init, but I don't 100% grok what zoo_interest is doing).

Attached is a very simple patch which remedies this problem and does not have any other appreciable impact. For me it cut the time it took for the add_auth watch to fire to practically instantaneous.

> Slow add_auth calls with multi-threaded client
> ----------------------------------------------
>
>                 Key: ZOOKEEPER-770
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-770
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: c client, contrib-bindings
>    Affects Versions: 3.3.0, 3.4.0
>         Environment: ubuntu lucid (10.04), zk trunk (3.4)
>            Reporter: Kapil Thangavelu
>            Priority: Minor
>         Attachments: ZOOKEEPER-770-FIX.patch, authtest.py
>
>
> Calls to add_auth are a bit slow from the c client library. The auth callback
> typically takes multiple seconds to fire. I instrumented the java, c binding,
> and python binding with a few log statements to find out where the slowness
> was occurring (
> http://bazaar.launchpad.net/~hazmat/zookeeper/fast-auth-instrumented/revision/647).
> It looks like when the io thread polls, it doesn't register interest in the
> incoming packet, so the auth success message from the server and the auth
> callback are only processed when the poll times out. I tried modifying
> mt_adapter.c so the poll registers interest in both events; this causes
> considerably more wakeups but it does address the issue of making add_auth
> fast.
> I think the ideal solution would be some sort of additional auth
> handshake state on the handle, that zookeeper_interest could utilize to
> suggest both POLLIN|POLLOUT are wanted for subsequent calls to poll during
> the auth handshake handle state.
>
> I'm attaching a script that takes 13s or 1.6s for the auth callback depending
> on the session timeout value (which in turn figures into the calculation of
> the poll timeout).
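
For anyone following along, here is a minimal Python sketch of the behaviour discussed in this thread: a packet is queued for an IO thread that is blocked waiting with a timeout, and it only goes out promptly if the queuing side also wakes that thread. This is not the ZooKeeper C client code. The names time_to_send, wake and poll_timeout_s are hypothetical, the wake-up is modelled with a self-pipe (an assumption about how such a wake-up can work, not a statement about the patch), and it assumes a Unix-like system where select() works on pipes.

```python
import os
import select
import threading
import time


def time_to_send(poll_timeout_s, wake):
    """Return seconds between queuing a packet and the IO thread flushing it.

    Hypothetical model only: 'to_send' stands in for an outgoing queue and
    the pipe stands in for whatever mechanism wakes the IO thread.
    """
    wake_r, wake_w = os.pipe()      # self-pipe used purely for wake-ups
    to_send = []                    # toy outgoing queue
    lock = threading.Lock()
    flushed = threading.Event()

    def io_thread():
        while not flushed.is_set():
            # Block until the wake-up pipe is readable or the timeout expires.
            readable, _, _ = select.select([wake_r], [], [], poll_timeout_s)
            if readable:
                os.read(wake_r, 1)  # drain the wake-up byte
            with lock:
                if to_send:
                    to_send.clear() # stand-in for writing to the socket
                    flushed.set()

    worker = threading.Thread(target=io_thread)
    worker.start()
    time.sleep(0.05)                # give the IO thread time to block in select()

    start = time.monotonic()
    with lock:
        to_send.append(b"auth")     # queue the packet
    if wake:
        os.write(wake_w, b"x")      # nudge the IO thread immediately
    flushed.wait()
    elapsed = time.monotonic() - start

    worker.join()
    os.close(wake_r)
    os.close(wake_w)
    return elapsed


if __name__ == "__main__":
    print("without wake-up: %.3fs" % time_to_send(0.5, wake=False))
    print("with wake-up:    %.3fs" % time_to_send(0.5, wake=True))
```

Without the wake-up the queued packet waits for the remainder of the poll timeout, which is consistent with the delay in the report scaling with the session timeout. With the wake-up the wait should be close to zero.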