[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16642943#comment-16642943
 ] 

Jorge Machado commented on ZOOKEEPER-1029:
------------------------------------------

Hi [~cnauroth], I think there is a bug here. On Mesos we have a problem: if
one host is not DNS-resolvable, the client does not try the other ones.

I checked zookeeper.c, and in the function resolve_hosts, at line 813, we jump
to next and from there we return ZOK. I think some kind of looping is missing.
I'm no C expert, sorry... In the Java API I see that we use
InetSocketAddress.createUnresolved so that it does not try to resolve the
hosts. Could it be that we have an issue there?
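
A minimal sketch of the kind of looping described above, assuming a
comma-separated "host:port" connect string. The function name
resolve_host_list and its skipped counter are hypothetical and are not taken
from zookeeper.c; the sketch only shows the idea of skipping an entry that
fails to resolve and moving on to the next one, instead of returning on the
first failure.

```c
#define _POSIX_C_SOURCE 200809L
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not the real resolve_hosts): walks a comma-separated
 * "host:port" list, skips entries that cannot be resolved, and returns the
 * number of entries that did resolve, or -1 if memory allocation fails.
 * *skipped receives the number of entries that were passed over. */
int resolve_host_list(const char *hosts, int *skipped)
{
    int resolved = 0;
    char *save = NULL;
    char *entry;
    char *copy = strdup(hosts);   /* strtok_r modifies its input */

    *skipped = 0;
    if (copy == NULL)
        return -1;

    for (entry = strtok_r(copy, ",", &save); entry != NULL;
         entry = strtok_r(NULL, ",", &save)) {
        struct addrinfo hints;
        struct addrinfo *res = NULL;
        char *port = strrchr(entry, ':');
        int rc;

        if (port == NULL) {       /* malformed entry: no port given */
            (*skipped)++;
            continue;
        }
        *port++ = '\0';           /* split "host:port" in place */

        memset(&hints, 0, sizeof(hints));
        hints.ai_family = AF_UNSPEC;
        hints.ai_socktype = SOCK_STREAM;

        rc = getaddrinfo(entry, port, &hints, &res);
        if (rc != 0) {
            /* Log and keep going: one bad host must not hide the others. */
            fprintf(stderr, "skipping %s: %s\n", entry, gai_strerror(rc));
            (*skipped)++;
            continue;
        }

        /* A real client would copy the addresses in res somewhere here. */
        freeaddrinfo(res);
        resolved++;
    }

    free(copy);
    return resolved;
}
```

With a list such as "no-such-host.invalid:2181,127.0.0.1:2181", the first
entry would be logged and skipped and the second would still be tried. A
caller would then treat only a return value of 0 (nothing resolved) as fatal.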

> C client bug in zookeeper_init (if bad hostname is given)
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-1029
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1029
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.3.2, 3.4.6, 3.5.0
>            Reporter: Dheeraj Agrawal
>            Assignee: Flavio Junqueira
>            Priority: Blocker
>             Fix For: 3.4.7, 3.5.2, 3.6.0
>
>         Attachments: ZOOKEEPER-1029-3.4.patch, ZOOKEEPER-1029-3.4.patch, 
> ZOOKEEPER-1029-3.4.patch, ZOOKEEPER-1029-3.4.patch, ZOOKEEPER-1029-3.4.patch, 
> ZOOKEEPER-1029-3.4.patch, ZOOKEEPER-1029-3.5.patch, ZOOKEEPER-1029-3.5.patch
>
>
> If you give an invalid hostname to the zookeeper_init method, it is not able
> to resolve it, and it tries to do the cleanup (free buffers, completion
> lists, etc.). adaptor_init() is not called on this code path, so the lock and
> cond variables (for the adaptor and the completion lists) are not initialized.
> As part of the cleanup, it tries to clean up some buffers and acquires and
> releases locks (which have not yet been initialized, so unlocking fails):
>     /* pthread mutex/cond not initialized */
>     lock_completion_list(&zh->sent_requests);
>     tmp_list = zh->sent_requests;
>     zh->sent_requests.head = 0;
>     zh->sent_requests.last = 0;
>     /* trying to broadcast here on the uninitialized cond */
>     unlock_completion_list(&zh->sent_requests);
> It should check whether locking succeeded before unlocking. If locking
> fails, appropriate error handling has to be done.
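
A minimal sketch of the "check the lock result before unlocking" pattern the
description asks for. The names demo_list_t, demo_list_init and
demo_list_take are hypothetical stand-ins, not the client's real completion
list API. The sketch assumes the mutex was initialized (here as an
error-checking mutex); using a mutex that was never initialized is undefined
behavior in POSIX, so a return-code check alone cannot make that case safe,
and the init path still has to run first.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a completion list guarded by a mutex. */
typedef struct {
    pthread_mutex_t lock;
    void *head;
    void *last;
} demo_list_t;

/* Initialize the list with an error-checking mutex so that misuse such as
 * relocking from the same thread is reported instead of deadlocking.
 * Returns 0 on success or a pthread error code. */
int demo_list_init(demo_list_t *l)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(&l->lock, &attr);
    pthread_mutexattr_destroy(&attr);

    l->head = NULL;
    l->last = NULL;
    return rc;
}

/* Detach the list contents into *out_head.
 * Returns 0 on success or the error code from the failing pthread call. */
int demo_list_take(demo_list_t *l, void **out_head)
{
    int rc = pthread_mutex_lock(&l->lock);

    if (rc != 0)
        return rc;   /* lock failed: leave the list alone, do not unlock */

    *out_head = l->head;
    l->head = NULL;
    l->last = NULL;

    return pthread_mutex_unlock(&l->lock);
}
```

The point of the design is that the unlock call is reached only when the
lock call returned 0, and the error code is handed back to the caller rather
than ignored.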



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
