[jira] [Issue Comment Deleted] (MESOS-2186) Mesos crashes if any configured zookeeper does not resolve.

2018-10-08 Thread Jorge Machado (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jorge Machado updated MESOS-2186:
-
Comment: was deleted

(was: Hi [~neilc], 

I think this needs to be re-opened. I have this situation on a Mesos 1.3.2 
cluster. 

 

Running on machine: mesosAgentNode
 Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
 F1008 07:34:48.771600 12897 zookeeper.cpp:132] Failed to create ZooKeeper, 
zookeeper_init: No such file or directory [2]

We have 5 ZooKeeper servers configured, and the last of them was removed from 
our network. The cluster is now totally broken. 

This should not happen.)
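
The situation described in this comment (one entry in the ZooKeeper server list no longer resolving) can be detected before the daemon starts. The following is only a minimal sketch, not anything taken from Mesos: the function names `parse_hosts` and `unresolvable_hosts` are hypothetical, the default port 2181 is an assumption, and the input is assumed to be a plain comma-separated `host:port` list with any `zk://` prefix and trailing path already stripped. IPv6 literals are not handled.

```python
import socket
from typing import Callable, List, Tuple


def parse_hosts(connect_string: str) -> List[Tuple[str, int]]:
    """Split 'host1:2181,host2:2181' into (host, port) pairs.

    Entries without an explicit port get 2181 (an assumed default).
    """
    pairs = []
    for entry in connect_string.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, sep, port = entry.rpartition(":")
        if not sep:
            # No ':' present, so the whole entry is the hostname.
            host, port = entry, "2181"
        pairs.append((host, int(port)))
    return pairs


def unresolvable_hosts(
    connect_string: str,
    resolve: Callable = socket.getaddrinfo,
) -> List[str]:
    """Return the hostnames that fail name resolution.

    `resolve` is injectable so the check can be exercised without DNS.
    """
    failed = []
    for host, port in parse_hosts(connect_string):
        try:
            resolve(host, port)
        except socket.gaierror:
            failed.append(host)
    return failed
```

A wrapper script could call `unresolvable_hosts` on the configured server list and log or drop the failing entries before launching the agent; whether dropping entries is acceptable depends on the deployment.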

> Mesos crashes if any configured zookeeper does not resolve.
> ---
>
> Key: MESOS-2186
> URL: https://issues.apache.org/jira/browse/MESOS-2186
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 0.21.0, 0.26.0
> Environment: Zookeeper: 3.4.5+28-1.cdh4.7.1.p0.13.el6
>   Mesos: 0.21.0-1.0.centos65
>   CentOS: CentOS release 6.6 (Final)
> Reporter: Daniel Hall
> Priority: Critical
> Labels: mesosphere
>
> When starting Mesos, if one of the configured ZooKeeper servers does not 
> resolve in DNS, Mesos will crash and refuse to start. We noticed this issue 
> while we were rebuilding one of our ZooKeeper hosts in Google Compute Engine 
> (which bases its DNS records on the machines currently running).
> Here is a log from a failed startup (hostnames and IP addresses have been 
> sanitised).
> {noformat}
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: I1209 22:54:54.088835 
> 28627 main.cpp:292] Starting Mesos master
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: 2014-12-09 
> 22:54:54,095:28627(0x7fa9f042f700):ZOO_ERROR@getaddrs@599: getaddrinfo: No 
> such file or directory
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: 
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: F1209 22:54:54.095239 
> 28642 zookeeper.cpp:113] Failed to create ZooKeeper, zookeeper_init: No such 
> file or directory [2]
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: *** Check failure stack 
> trace: ***
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: 2014-12-09 
> 22:54:54,097:28627(0x7fa9ed22a700):ZOO_ERROR@getaddrs@599: getaddrinfo: No 
> such file or directory
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: 
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: F1209 22:54:54.097718 
> 28647 zookeeper.cpp:113] Failed to create ZooKeeper, zookeeper_init: No such 
> file or directory [2]
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: *** Check failure stack 
> trace: ***
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: @ 0x7fa9f56a0160  
> google::LogMessage::Fail()
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: @ 0x7fa9f56a0160  
> google::LogMessage::Fail()
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: @ 0x7fa9f56a00b9  
> google::LogMessage::SendToLog()
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: 2014-12-09 
> 22:54:54,108:28627(0x7fa9ef02d700):ZOO_ERROR@getaddrs@599: getaddrinfo: No 
> such file or directory
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: 
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: F1209 22:54:54.097718 
> 28647 zookeeper.cpp:113] Failed to create ZooKeeper, zookeeper_init: No such 
> file or directory [2]F1209 22:54:54.108422 28644 zookeeper.cpp:113] Failed to 
> create ZooKeeper, zookeeper_init: No such file or directory [2]
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: *** Check failure stack 
> trace: ***
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: @ 0x7fa9f56a0160  
> google::LogMessage::Fail()
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: 2014-12-09 
> 22:54:54,109:28627(0x7fa9f0e30700):ZOO_ERROR@getaddrs@599: getaddrinfo: No 
> such file or directory
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: 
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: F1209 22:54:54.097718 
> 28647 zookeeper.cpp:113] Failed to create ZooKeeper, zookeeper_init: No such 
> file or directory [2]F1209 22:54:54.108422 28644 zookeeper.cpp:113] Failed to 
> create ZooKeeper, zookeeper_init: No such file or directory [2]F1209 
> 22:54:54.109864 28641 zookeeper.cpp:113] Failed to create ZooKeeper, 
> zookeeper_init: No such file or directory [2]
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: *** Check failure stack 
> trace: ***
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: @ 0x7fa9f56a0160  
> google::LogMessage::Fail()
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: @ 0x7fa9f56a00b9  
> google::LogMessage::SendToLog()
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: @ 0x7fa9f56a00b9  
> google::LogMessage::SendToLog()
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: I1209 22:54:54.123208 
> 28640 master.c

[jira] [Issue Comment Deleted] (MESOS-2186) Mesos crashes if any configured zookeeper does not resolve.

2015-10-21 Thread Steven Schlansker (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Schlansker updated MESOS-2186:
-
Comment: was deleted

(was: If zookeeper_init() returns NULL, that in fact means that ZOOKEEPER-1029 
is unrelated, yeah?)
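
In the ZooKeeper C client, `zookeeper_init()` returns NULL on failure and sets errno, which matches the "No such file or directory [2]" in the log below. One alternative to aborting on the first NULL is to retry a bounded number of times. The sketch below only illustrates that idea in Python and is not the Mesos implementation: `init_with_retry` and its parameters are hypothetical names, and `init` stands in for whatever call creates the handle and returns `None` on failure.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def init_with_retry(
    init: Callable[[], Optional[T]],
    attempts: int = 5,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call `init` until it returns a handle or `attempts` is exhausted.

    Returns the handle on success, or None if every attempt failed.
    `sleep` is injectable so the delay can be skipped in tests.
    """
    for attempt in range(1, attempts + 1):
        handle = init()
        if handle is not None:
            return handle
        if attempt < attempts:
            # Wait before the next attempt; no wait after the last one.
            sleep(delay_seconds)
    return None
```

The caller still has to decide what to do when `None` comes back after the last attempt. The point of the sketch is only that a transient resolution failure does not have to be fatal on the first try.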

> Mesos crashes if any configured zookeeper does not resolve.
> ---
>
> Key: MESOS-2186
> URL: https://issues.apache.org/jira/browse/MESOS-2186
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 0.21.0, 0.26.0
> Environment: Zookeeper: 3.4.5+28-1.cdh4.7.1.p0.13.el6
>   Mesos: 0.21.0-1.0.centos65
>   CentOS: CentOS release 6.6 (Final)
> Reporter: Daniel Hall
> Priority: Critical
> Labels: mesosphere
>
> When starting Mesos, if one of the configured ZooKeeper servers does not 
> resolve in DNS, Mesos will crash and refuse to start. We noticed this issue 
> while we were rebuilding one of our ZooKeeper hosts in Google Compute Engine 
> (which bases its DNS records on the machines currently running).
> Here is a log from a failed startup (hostnames and IP addresses have been 
> sanitised).
> {noformat}
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: I1209 22:54:54.088835 
> 28627 main.cpp:292] Starting Mesos master
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: 2014-12-09 
> 22:54:54,095:28627(0x7fa9f042f700):ZOO_ERROR@getaddrs@599: getaddrinfo: No 
> such file or directory
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: 
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: F1209 22:54:54.095239 
> 28642 zookeeper.cpp:113] Failed to create ZooKeeper, zookeeper_init: No such 
> file or directory [2]
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: *** Check failure stack 
> trace: ***
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: 2014-12-09 
> 22:54:54,097:28627(0x7fa9ed22a700):ZOO_ERROR@getaddrs@599: getaddrinfo: No 
> such file or directory
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: 
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: F1209 22:54:54.097718 
> 28647 zookeeper.cpp:113] Failed to create ZooKeeper, zookeeper_init: No such 
> file or directory [2]
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: *** Check failure stack 
> trace: ***
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: @ 0x7fa9f56a0160  
> google::LogMessage::Fail()
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: @ 0x7fa9f56a0160  
> google::LogMessage::Fail()
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: @ 0x7fa9f56a00b9  
> google::LogMessage::SendToLog()
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: 2014-12-09 
> 22:54:54,108:28627(0x7fa9ef02d700):ZOO_ERROR@getaddrs@599: getaddrinfo: No 
> such file or directory
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: 
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: F1209 22:54:54.097718 
> 28647 zookeeper.cpp:113] Failed to create ZooKeeper, zookeeper_init: No such 
> file or directory [2]F1209 22:54:54.108422 28644 zookeeper.cpp:113] Failed to 
> create ZooKeeper, zookeeper_init: No such file or directory [2]
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: *** Check failure stack 
> trace: ***
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: @ 0x7fa9f56a0160  
> google::LogMessage::Fail()
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: 2014-12-09 
> 22:54:54,109:28627(0x7fa9f0e30700):ZOO_ERROR@getaddrs@599: getaddrinfo: No 
> such file or directory
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: 
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: F1209 22:54:54.097718 
> 28647 zookeeper.cpp:113] Failed to create ZooKeeper, zookeeper_init: No such 
> file or directory [2]F1209 22:54:54.108422 28644 zookeeper.cpp:113] Failed to 
> create ZooKeeper, zookeeper_init: No such file or directory [2]F1209 
> 22:54:54.109864 28641 zookeeper.cpp:113] Failed to create ZooKeeper, 
> zookeeper_init: No such file or directory [2]
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: *** Check failure stack 
> trace: ***
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: @ 0x7fa9f56a0160  
> google::LogMessage::Fail()
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: @ 0x7fa9f56a00b9  
> google::LogMessage::SendToLog()
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: @ 0x7fa9f56a00b9  
> google::LogMessage::SendToLog()
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: I1209 22:54:54.123208 
> 28640 master.cpp:318] Master 20141209-225454-4155764746-5050-28627 
> (mesosmaster-2.internal) started on 10.x.x.x:5050
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: I1209 22:54:54.123306 
> 28640 master.cpp:366] Master allowing unauthenticated frameworks to register
> Dec  9 22:54:54 mesosmaster-2 mesos-master[28627]: I1209 22:54:54.123327 
> 28640 master.cpp:371] Master allowing unauthenticated