[ https://issues.apache.org/jira/browse/ZOOKEEPER-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16642943#comment-16642943 ]
Jorge Machado commented on ZOOKEEPER-1029:
------------------------------------------

Hi [~cnauroth]

I think there is a bug here. On Mesos we have a problem: if one host is not DNS-resolvable, the client does not try the other ones. I checked zookeeper.c, and in the function resolve_hosts (line 813) we jump to next and from there we return ZOK. I think some kind of looping is missing. I'm no C expert, sorry...

In the Java API I see that we use InetSocketAddress.createUnresolved so that it does not try to resolve the hosts. Could it be that we have an issue there?

> C client bug in zookeeper_init (if bad hostname is given)
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-1029
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1029
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.3.2, 3.4.6, 3.5.0
>            Reporter: Dheeraj Agrawal
>            Assignee: Flavio Junqueira
>            Priority: Blocker
>             Fix For: 3.4.7, 3.5.2, 3.6.0
>
>         Attachments: ZOOKEEPER-1029-3.4.patch, ZOOKEEPER-1029-3.4.patch,
> ZOOKEEPER-1029-3.4.patch, ZOOKEEPER-1029-3.4.patch, ZOOKEEPER-1029-3.4.patch,
> ZOOKEEPER-1029-3.4.patch, ZOOKEEPER-1029-3.5.patch, ZOOKEEPER-1029-3.5.patch
>
>
> If you give an invalid hostname to the zookeeper_init method, it's not able to
> resolve it, and it tries to do the cleanup (free buffer/completion lists/etc).
> The adaptor_init() is not called for this code path, so the lock and cond
> variables (for adaptor, completion lists) are not initialized.
> As part of the cleanup it's trying to clean up some buffers and acquires
> locks and unlocks (where the locks have not yet been initialized, so
> unlocking fails)
>     lock_completion_list(&zh->sent_requests); - pthread_mutex/cond not initialized
>     tmp_list = zh->sent_requests;
>     zh->sent_requests.head = 0;
>     zh->sent_requests.last = 0;
>     unlock_completion_list(&zh->sent_requests); trying to broadcast here on uninitialized cond
> It should do error checking to see if locking succeeds before unlocking it.
> If locking fails, then appropriate error handling has to be done.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
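
To illustrate the "check that locking succeeds before unlocking" suggestion from the description, here is a minimal sketch. It is not the actual zookeeper.c code and not the attached patch: the struct, the `initialized` flag, and the names `completion_list_init`, `checked_lock`, `checked_unlock` and `detach_all` are all hypothetical. The flag is my own assumption, added because calling pthread functions on a mutex that was never initialized is undefined behaviour, so a return-code check alone may not be enough.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the client's completion list (not the real struct). */
typedef struct completion_list {
    void *head;
    void *last;
    pthread_mutex_t lock;
    pthread_cond_t cond;
    int initialized; /* assumption: set to 1 only after lock and cond are initialized */
} completion_list_t;

/* Initialize the list and its primitives; returns 0 or a pthread error code. */
static int completion_list_init(completion_list_t *l)
{
    int rc;

    l->head = NULL;
    l->last = NULL;
    l->initialized = 0;
    rc = pthread_mutex_init(&l->lock, NULL);
    if (rc != 0)
        return rc;
    rc = pthread_cond_init(&l->cond, NULL);
    if (rc != 0) {
        pthread_mutex_destroy(&l->lock);
        return rc;
    }
    l->initialized = 1;
    return 0;
}

/* Lock only if the primitives exist; returns 0 on success, an error code otherwise. */
static int checked_lock(completion_list_t *l)
{
    if (!l->initialized)
        return EINVAL;
    return pthread_mutex_lock(&l->lock);
}

/* Wake waiters and release the lock; must only be called after checked_lock() returned 0. */
static int checked_unlock(completion_list_t *l)
{
    int rc_cond, rc_unlock;

    if (!l->initialized)
        return EINVAL;
    rc_cond = pthread_cond_broadcast(&l->cond);
    rc_unlock = pthread_mutex_unlock(&l->lock);
    return rc_cond != 0 ? rc_cond : rc_unlock;
}

/*
 * Detach all entries from the list and hand the old head to the caller.
 * If locking fails, the list is left untouched and no unlock is attempted.
 */
static int detach_all(completion_list_t *l, void **old_head)
{
    int rc = checked_lock(l);

    if (rc != 0)
        return rc; /* the lock was never taken, so there is nothing to unlock */
    *old_head = l->head;
    l->head = NULL;
    l->last = NULL;
    return checked_unlock(l);
}
```

With a zero-initialized list (the situation where adaptor_init() was never called), `detach_all` returns `EINVAL` instead of touching the mutex or broadcasting on the cond. What the caller should do with that error during cleanup is a separate design question that this sketch leaves open.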