Hi All,

I am running CentOS 7.1, ZooKeeper 3.4.7 and Mesos 0.26.0.
After starting ZooKeeper, I tried to start the Mesos master (mesos-master.sh)
with --quorum=0 (everything runs on the same machine, set up as a distributed
deployment rather than in local mode), and the master crashed.
This happened immediately after a fresh install.
When I changed to --quorum=1, the Mesos master ran fine and the slave could
connect.
On later restarts of the Mesos master there was no issue. The crash was seen the
very first time only.
The error stack is incomprehensible to me.
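
As a side note on the quorum value: as far as I understand the Mesos documentation, --quorum is meant to be a strict majority of the masters, which for a single master works out to 1, so 0 may simply not be a valid setting. A minimal sketch of that arithmetic is below. The helper name majority_quorum is hypothetical and not part of Mesos, and the commented-out invocation just reuses the flags from this report with the quorum value changed.

```shell
#!/bin/sh
# Hypothetical helper (not part of Mesos): smallest strict majority
# for a given number of masters, i.e. floor(N / 2) + 1.
majority_quorum() {
  masters="$1"
  echo $(( masters / 2 + 1 ))
}

# Single master on one machine, as in this report.
QUORUM="$(majority_quorum 1)"
echo "$QUORUM"

# Assumed invocation: same flags as in this report, only --quorum differs.
# ./bin/mesos-master.sh --ip=10.10.10.118 --work_dir=/var/lib/mesos \
#   --zk=zk://10.10.10.118:2181/mesos --quorum="$QUORUM"
```

With three or five masters the same helper gives 2 and 3 respectively.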

Has anyone seen this issue before?
The error log was:

[root@abc123 build]# ./bin/mesos-master.sh --ip=10.10.10.118 --work_dir=/var/lib/mesos --zk=zk://10.10.10.118:2181/mesos --quorum=0
I1229 13:41:24.925851  3345 main.cpp:232] Build: 2015-12-29 12:29:36 by root
I1229 13:41:24.925983  3345 main.cpp:234] Version: 0.26.0
I1229 13:41:24.929131  3345 main.cpp:255] Using 'HierarchicalDRF' allocator
I1229 13:41:24.953929  3345 leveldb.cpp:176] Opened db in 24.529078ms
I1229 13:41:24.955523  3345 leveldb.cpp:183] Compacted db in 1.525191ms
I1229 13:41:24.955688  3345 leveldb.cpp:198] Created db iterator in 107413ns
I1229 13:41:24.955724  3345 leveldb.cpp:204] Seeked to beginning of db in 4553ns
I1229 13:41:24.955737  3345 leveldb.cpp:273] Iterated through 0 keys in the db 
in 224ns
I1229 13:41:24.956120  3345 replica.cpp:780] Replica recovered with log 
positions 0 -> 0 with 1 holes and 0 unlearned
I1229 13:41:24.961802  3345 main.cpp:464] Starting Mesos master
I1229 13:41:24.965438  3345 master.cpp:367] Master 
a38658f7-89c1-4b1f-84f9-5796234b2104 (localhost) started on 10.10.10.118:5050
I1229 13:41:24.965459  3345 master.cpp:369] Flags at startup: 
--allocation_interval="1secs" --allocator="HierarchicalDRF" 
--authenticate="false" --authenticate_slaves="false" --authenticators="crammd5" 
--authorizers="local" --framework_sorter="drf" --help="false" 
--hostname_lookup="true" --initialize_driver_logging="true" --ip="10.10.10.118" 
--log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" 
--max_slave_ping_timeouts="5" --port="5050" --quiet="false" --quorum="0" 
--recovery_slave_removal_limit="100%" --registry="replicated_log" 
--registry_fetch_timeout="1mins" --registry_store_timeout="5secs" 
--registry_strict="false" --root_submissions="true" 
--slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" 
--user_sorter="drf" --version="false" 
--webui_dir="/home/admin/mesos/build/../src/webui" --work_dir="/var/lib/mesos" 
--zk="zk://10.10.10.118:2181/mesos" --zk_session_timeout="10secs"
I1229 13:41:24.965761  3345 master.cpp:416] Master allowing unauthenticated 
frameworks to register
I1229 13:41:24.965772  3345 master.cpp:421] Master allowing unauthenticated 
slaves to register
I1229 13:41:24.965837  3345 master.cpp:458] Using default 'crammd5' 
authenticator
W1229 13:41:24.965867  3345 authenticator.cpp:513] No credentials provided, 
authentication requests will be refused
I1229 13:41:24.965881  3345 authenticator.cpp:520] Initializing server SASL
I1229 13:41:24.966788  3364 log.cpp:238] Attempting to join replica to 
ZooKeeper group
2015-12-29 13:41:24,970:3345(0x7f21042f3700):ZOO_INFO@log_env@712: Client 
environment:zookeeper.version=zookeeper C client 3.4.5
2015-12-29 13:41:24,970:3345(0x7f21042f3700):ZOO_INFO@log_env@716: Client 
environment:host.name=abc.def.com
2015-12-29 13:41:24,970:3345(0x7f21042f3700):ZOO_INFO@log_env@723: Client 
environment:os.name=Linux
2015-12-29 13:41:24,970:3345(0x7f21042f3700):ZOO_INFO@log_env@724: Client 
environment:os.arch=3.10.0-229.el7.x86_64
2015-12-29 13:41:24,970:3345(0x7f21042f3700):ZOO_INFO@log_env@725: Client 
environment:os.version=#1 SMP Fri Mar 6 11:36:42 UTC 2015
2015-12-29 13:41:24,970:3345(0x7f21042f3700):ZOO_INFO@log_env@733: Client 
environment:user.name=root
2015-12-29 13:41:24,970:3345(0x7f21042f3700):ZOO_INFO@log_env@741: Client 
environment:user.home=/root
2015-12-29 13:41:24,970:3345(0x7f21042f3700):ZOO_INFO@log_env@753: Client 
environment:user.dir=/home/admin/mesos/build
2015-12-29 13:41:24,970:3345(0x7f21042f3700):ZOO_INFO@zookeeper_init@786: 
Initiating client connection, host=10.10.10.118:2181 sessionTimeout=10000 
watcher=0x7f2110e3f16c sessionId=0 sessionPasswd=<null> context=0x7f20fc0038c0 
flags=0
2015-12-29 13:41:24,970:3345(0x7f2103af2700):ZOO_INFO@log_env@712: Client 
environment:zookeeper.version=zookeeper C client 3.4.5
2015-12-29 13:41:24,970:3345(0x7f2103af2700):ZOO_INFO@log_env@716: Client 
environment:host.name=abc.def.com
2015-12-29 13:41:24,970:3345(0x7f2103af2700):ZOO_INFO@log_env@723: Client 
environment:os.name=Linux
2015-12-29 13:41:24,970:3345(0x7f2103af2700):ZOO_INFO@log_env@724: Client 
environment:os.arch=3.10.0-229.el7.x86_64
2015-12-29 13:41:24,970:3345(0x7f2103af2700):ZOO_INFO@log_env@725: Client 
environment:os.version=#1 SMP Fri Mar 6 11:36:42 UTC 2015
2015-12-29 13:41:24,970:3345(0x7f2103af2700):ZOO_INFO@log_env@733: Client 
environment:user.name=root
2015-12-29 13:41:24,970:3345(0x7f2103af2700):ZOO_INFO@log_env@741: Client 
environment:user.home=/root
2015-12-29 13:41:24,970:3345(0x7f2103af2700):ZOO_INFO@log_env@753: Client 
environment:user.dir=/home/admin/mesos/build
2015-12-29 13:41:24,970:3345(0x7f2103af2700):ZOO_INFO@zookeeper_init@786: 
Initiating client connection, host=10.10.10.118:2181 sessionTimeout=10000 
watcher=0x7f2110e3f16c sessionId=0 sessionPasswd=<null> context=0x7f20fc003a70 
flags=0
I1229 13:41:24.971629  3360 recover.cpp:449] Starting replica recovery
2015-12-29 13:41:24,972:3345(0x7f21062f7700):ZOO_INFO@log_env@712: Client 
environment:zookeeper.version=zookeeper C client 3.4.5
2015-12-29 13:41:24,972:3345(0x7f21062f7700):ZOO_INFO@log_env@716: Client 
environment:host.name=abc.def.com
2015-12-29 13:41:24,972:3345(0x7f21062f7700):ZOO_INFO@log_env@723: Client 
environment:os.name=Linux
2015-12-29 13:41:24,972:3345(0x7f21062f7700):ZOO_INFO@log_env@724: Client 
environment:os.arch=3.10.0-229.el7.x86_64
2015-12-29 13:41:24,972:3345(0x7f21062f7700):ZOO_INFO@log_env@725: Client 
environment:os.version=#1 SMP Fri Mar 6 11:36:42 UTC 2015
2015-12-29 13:41:24,972:3345(0x7f21062f7700):ZOO_INFO@log_env@733: Client 
environment:user.name=root
2015-12-29 13:41:24,972:3345(0x7f21062f7700):ZOO_INFO@log_env@741: Client 
environment:user.home=/root
2015-12-29 13:41:24,972:3345(0x7f21062f7700):ZOO_INFO@log_env@753: Client 
environment:user.dir=/home/admin/mesos/build
2015-12-29 13:41:24,972:3345(0x7f21062f7700):ZOO_INFO@zookeeper_init@786: 
Initiating client connection, host=10.10.10.118:2181 sessionTimeout=10000 
watcher=0x7f2110e3f16c sessionId=0 sessionPasswd=<null> context=0x7f20fc0078b0 
flags=0
2015-12-29 13:41:24,972:3345(0x7f2105af6700):ZOO_INFO@log_env@712: Client 
environment:zookeeper.version=zookeeper C client 3.4.5
2015-12-29 13:41:24,972:3345(0x7f2105af6700):ZOO_INFO@log_env@716: Client 
environment:host.name=abc.def.com
2015-12-29 13:41:24,972:3345(0x7f2105af6700):ZOO_INFO@log_env@723: Client 
environment:os.name=Linux
2015-12-29 13:41:24,972:3345(0x7f2105af6700):ZOO_INFO@log_env@724: Client 
environment:os.arch=3.10.0-229.el7.x86_64
2015-12-29 13:41:24,972:3345(0x7f2105af6700):ZOO_INFO@log_env@725: Client 
environment:os.version=#1 SMP Fri Mar 6 11:36:42 UTC 2015
2015-12-29 13:41:24,972:3345(0x7f2105af6700):ZOO_INFO@log_env@733: Client 
environment:user.name=root
2015-12-29 13:41:24,972:3345(0x7f2105af6700):ZOO_INFO@log_env@741: Client 
environment:user.home=/root
2015-12-29 13:41:24,972:3345(0x7f2105af6700):ZOO_INFO@log_env@753: Client 
environment:user.dir=/home/admin/mesos/build
2015-12-29 13:41:24,972:3345(0x7f2105af6700):ZOO_INFO@zookeeper_init@786: 
Initiating client connection, host=10.10.10.118:2181 sessionTimeout=10000 
watcher=0x7f2110e3f16c sessionId=0 sessionPasswd=<null> context=0x7f20fc007f00 
flags=0
I1229 13:41:24.973780  3362 recover.cpp:475] Replica is in EMPTY status
I1229 13:41:24.979076  3362 replica.cpp:676] Replica in EMPTY status received a 
broadcasted recover request from (4)@10.10.10.118:5050
I1229 13:41:24.979863  3362 recover.cpp:195] Received a recover response from a 
replica in EMPTY status
F1229 13:41:24.980000  3362 recover.cpp:219] CHECK_SOME(lowestBeginPosition): 
is NONE
*** Check failure stack trace: ***
    @     0x7f211143a6a2  google::LogMessage::Fail()
    @     0x7f211143a601  google::LogMessage::SendToLog()
2015-12-29 13:41:24,995:3345(0x7f20f37fe700):ZOO_INFO@check_events@1703: 
initiated connection to server [10.10.10.118:2181]
2015-12-29 13:41:25,004:3345(0x7f21008d9700):ZOO_INFO@check_events@1703: 
initiated connection to server [10.10.10.118:2181]
2015-12-29 13:41:25,004:3345(0x7f21018db700):ZOO_INFO@check_events@1703: 
initiated connection to server [10.10.10.118:2181]
2015-12-29 13:41:25,004:3345(0x7f20f27fc700):ZOO_INFO@check_events@1703: 
initiated connection to server [10.10.10.118:2181]
    @     0x7f211143a012  google::LogMessage::Flush()
    @     0x7f211143cd46  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f211056d44c  _CheckFatal::~_CheckFatal()
    @     0x7f211125a243  
mesos::internal::log::RecoverProtocolProcess::received()
    @     0x7f2111265ae6  
_ZZN7process8dispatchI6OptionIN5mesos8internal3log15RecoverResponseEENS4_22RecoverProtocolProcessERKNS_6FutureIS5_EES9_EENS8_IT_EERKNS_3PIDIT0_EEMSF_FSD_T1_ET2_ENKUlPNS_11ProcessBaseEE_clESO_
    @     0x7f211127c5d5  
_ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchI6OptionIN5mesos8internal3log15RecoverResponseEENS8_22RecoverProtocolProcessERKNS0_6FutureIS9_EESD_EENSC_IT_EERKNS0_3PIDIT0_EEMSJ_FSH_T1_ET2_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_
    @     0x7f21113c0d7d  std::function<>::operator()()
    @     0x7f21113a8b95  process::ProcessBase::visit()
    @     0x7f21113ac960  process::DispatchEvent::visit()
    @           0x471dd8  process::ProcessBase::serve()
I1229 13:41:25.136451  3366 contender.cpp:149] Joining the ZK group
    @     0x7f21113a4f81  process::ProcessManager::resume()
    @     0x7f21113a21b2  
_ZZN7process14ProcessManager12init_threadsEvENKUlRKSt11atomic_boolE_clES3_
    @     0x7f21113ac18c  
_ZNSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS3_EEE6__callIvIEILm0EEEET_OSt5tupleIIDpT0_EESt12_Index_tupleIIXspT1_EEE
    @     0x7f21113ac13c  
_ZNSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS3_EEEclIIEvEET0_DpOT_
    @     0x7f21113ac0ce  
_ZNSt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS4_EEEvEE9_M_invokeIIEEEvSt12_Index_tupleIIXspT_EEE
    @     0x7f21113ac025  
_ZNSt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS4_EEEvEEclEv
    @     0x7f21113abfbe  
_ZNSt6thread5_ImplISt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS6_EEEvEEE6_M_runEv
    @     0x7f210ce3f220  (unknown)
    @     0x7f210d099dc5  start_thread
    @     0x7f210c5a721d  __clone
Aborted (core dumped)
