Hi all, I am running CentOS 7.1, ZooKeeper 3.4.7 and Mesos 0.26.0. After starting ZooKeeper, I tried to start the Mesos master (mesos-master.sh) with --quorum=0 (everything runs on the same machine, set up as a distributed deployment rather than in local mode), and the master crashed. This happened immediately after fresh installs. When I changed to --quorum=1, the Mesos master ran fine and the slave could connect. On later restarts of the Mesos master there was no issue; the crash was seen only on the very first start. The error stack is incomprehensible to me.
Has anyone seen this issue before? The error log was:

[root@abc123 build]# ./bin/mesos-master.sh --ip=10.10.10.118 --work_dir=/var/lib/mesos --zk=zk://10.10.10.118:2181/mesos --quorum=0
I1229 13:41:24.925851 3345 main.cpp:232] Build: 2015-12-29 12:29:36 by root
I1229 13:41:24.925983 3345 main.cpp:234] Version: 0.26.0
I1229 13:41:24.929131 3345 main.cpp:255] Using 'HierarchicalDRF' allocator
I1229 13:41:24.953929 3345 leveldb.cpp:176] Opened db in 24.529078ms
I1229 13:41:24.955523 3345 leveldb.cpp:183] Compacted db in 1.525191ms
I1229 13:41:24.955688 3345 leveldb.cpp:198] Created db iterator in 107413ns
I1229 13:41:24.955724 3345 leveldb.cpp:204] Seeked to beginning of db in 4553ns
I1229 13:41:24.955737 3345 leveldb.cpp:273] Iterated through 0 keys in the db in 224ns
I1229 13:41:24.956120 3345 replica.cpp:780] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned
I1229 13:41:24.961802 3345 main.cpp:464] Starting Mesos master
I1229 13:41:24.965438 3345 master.cpp:367] Master a38658f7-89c1-4b1f-84f9-5796234b2104 (localhost) started on 10.10.10.118:5050
I1229 13:41:24.965459 3345 master.cpp:369] Flags at startup: --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate="false" --authenticate_slaves="false" --authenticators="crammd5" --authorizers="local" --framework_sorter="drf" --help="false" --hostname_lookup="true" --initialize_driver_logging="true" --ip="10.10.10.118" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" --port="5050" --quiet="false" --quorum="0" --recovery_slave_removal_limit="100%" --registry="replicated_log" --registry_fetch_timeout="1mins" --registry_store_timeout="5secs" --registry_strict="false" --root_submissions="true" --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" --webui_dir="/home/admin/mesos/build/../src/webui" --work_dir="/var/lib/mesos" --zk="zk://10.10.10.118:2181/mesos" --zk_session_timeout="10secs"
I1229 13:41:24.965761 3345 master.cpp:416] Master allowing unauthenticated frameworks to register
I1229 13:41:24.965772 3345 master.cpp:421] Master allowing unauthenticated slaves to register
I1229 13:41:24.965837 3345 master.cpp:458] Using default 'crammd5' authenticator
W1229 13:41:24.965867 3345 authenticator.cpp:513] No credentials provided, authentication requests will be refused
I1229 13:41:24.965881 3345 authenticator.cpp:520] Initializing server SASL
I1229 13:41:24.966788 3364 log.cpp:238] Attempting to join replica to ZooKeeper group
2015-12-29 13:41:24,970:3345(0x7f21042f3700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.5
2015-12-29 13:41:24,970:3345(0x7f21042f3700):ZOO_INFO@log_env@716: Client environment:host.name=abc.def.com
2015-12-29 13:41:24,970:3345(0x7f21042f3700):ZOO_INFO@log_env@723: Client environment:os.name=Linux
2015-12-29 13:41:24,970:3345(0x7f21042f3700):ZOO_INFO@log_env@724: Client environment:os.arch=3.10.0-229.el7.x86_64
2015-12-29 13:41:24,970:3345(0x7f21042f3700):ZOO_INFO@log_env@725: Client environment:os.version=#1 SMP Fri Mar 6 11:36:42 UTC 2015
2015-12-29 13:41:24,970:3345(0x7f21042f3700):ZOO_INFO@log_env@733: Client environment:user.name=root
2015-12-29 13:41:24,970:3345(0x7f21042f3700):ZOO_INFO@log_env@741: Client environment:user.home=/root
2015-12-29 13:41:24,970:3345(0x7f21042f3700):ZOO_INFO@log_env@753: Client environment:user.dir=/home/admin/mesos/build
2015-12-29 13:41:24,970:3345(0x7f21042f3700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=10.10.10.118:2181 sessionTimeout=10000 watcher=0x7f2110e3f16c sessionId=0 sessionPasswd=<null> context=0x7f20fc0038c0 flags=0
2015-12-29 13:41:24,970:3345(0x7f2103af2700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.5
2015-12-29 13:41:24,970:3345(0x7f2103af2700):ZOO_INFO@log_env@716: Client environment:host.name=abc.def.com
2015-12-29 13:41:24,970:3345(0x7f2103af2700):ZOO_INFO@log_env@723: Client environment:os.name=Linux
2015-12-29 13:41:24,970:3345(0x7f2103af2700):ZOO_INFO@log_env@724: Client environment:os.arch=3.10.0-229.el7.x86_64
2015-12-29 13:41:24,970:3345(0x7f2103af2700):ZOO_INFO@log_env@725: Client environment:os.version=#1 SMP Fri Mar 6 11:36:42 UTC 2015
2015-12-29 13:41:24,970:3345(0x7f2103af2700):ZOO_INFO@log_env@733: Client environment:user.name=root
2015-12-29 13:41:24,970:3345(0x7f2103af2700):ZOO_INFO@log_env@741: Client environment:user.home=/root
2015-12-29 13:41:24,970:3345(0x7f2103af2700):ZOO_INFO@log_env@753: Client environment:user.dir=/home/admin/mesos/build
2015-12-29 13:41:24,970:3345(0x7f2103af2700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=10.10.10.118:2181 sessionTimeout=10000 watcher=0x7f2110e3f16c sessionId=0 sessionPasswd=<null> context=0x7f20fc003a70 flags=0
I1229 13:41:24.971629 3360 recover.cpp:449] Starting replica recovery
2015-12-29 13:41:24,972:3345(0x7f21062f7700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.5
2015-12-29 13:41:24,972:3345(0x7f21062f7700):ZOO_INFO@log_env@716: Client environment:host.name=abc.def.com
2015-12-29 13:41:24,972:3345(0x7f21062f7700):ZOO_INFO@log_env@723: Client environment:os.name=Linux
2015-12-29 13:41:24,972:3345(0x7f21062f7700):ZOO_INFO@log_env@724: Client environment:os.arch=3.10.0-229.el7.x86_64
2015-12-29 13:41:24,972:3345(0x7f21062f7700):ZOO_INFO@log_env@725: Client environment:os.version=#1 SMP Fri Mar 6 11:36:42 UTC 2015
2015-12-29 13:41:24,972:3345(0x7f21062f7700):ZOO_INFO@log_env@733: Client environment:user.name=root
2015-12-29 13:41:24,972:3345(0x7f21062f7700):ZOO_INFO@log_env@741: Client environment:user.home=/root
2015-12-29 13:41:24,972:3345(0x7f21062f7700):ZOO_INFO@log_env@753: Client environment:user.dir=/home/admin/mesos/build
2015-12-29 13:41:24,972:3345(0x7f21062f7700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=10.10.10.118:2181 sessionTimeout=10000 watcher=0x7f2110e3f16c sessionId=0 sessionPasswd=<null> context=0x7f20fc0078b0 flags=0
2015-12-29 13:41:24,972:3345(0x7f2105af6700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.5
2015-12-29 13:41:24,972:3345(0x7f2105af6700):ZOO_INFO@log_env@716: Client environment:host.name=abc.def.com
2015-12-29 13:41:24,972:3345(0x7f2105af6700):ZOO_INFO@log_env@723: Client environment:os.name=Linux
2015-12-29 13:41:24,972:3345(0x7f2105af6700):ZOO_INFO@log_env@724: Client environment:os.arch=3.10.0-229.el7.x86_64
2015-12-29 13:41:24,972:3345(0x7f2105af6700):ZOO_INFO@log_env@725: Client environment:os.version=#1 SMP Fri Mar 6 11:36:42 UTC 2015
2015-12-29 13:41:24,972:3345(0x7f2105af6700):ZOO_INFO@log_env@733: Client environment:user.name=root
2015-12-29 13:41:24,972:3345(0x7f2105af6700):ZOO_INFO@log_env@741: Client environment:user.home=/root
2015-12-29 13:41:24,972:3345(0x7f2105af6700):ZOO_INFO@log_env@753: Client environment:user.dir=/home/admin/mesos/build
2015-12-29 13:41:24,972:3345(0x7f2105af6700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=10.10.10.118:2181 sessionTimeout=10000 watcher=0x7f2110e3f16c sessionId=0 sessionPasswd=<null> context=0x7f20fc007f00 flags=0
I1229 13:41:24.973780 3362 recover.cpp:475] Replica is in EMPTY status
I1229 13:41:24.979076 3362 replica.cpp:676] Replica in EMPTY status received a broadcasted recover request from (4)@10.10.10.118:5050
I1229 13:41:24.979863 3362 recover.cpp:195] Received a recover response from a replica in EMPTY status
F1229 13:41:24.980000 3362 recover.cpp:219] CHECK_SOME(lowestBeginPosition): is NONE
*** Check failure stack trace: ***
@ 0x7f211143a6a2 google::LogMessage::Fail()
@ 0x7f211143a601 google::LogMessage::SendToLog()
2015-12-29 13:41:24,995:3345(0x7f20f37fe700):ZOO_INFO@check_events@1703: initiated connection to server [10.10.10.118:2181]
2015-12-29 13:41:25,004:3345(0x7f21008d9700):ZOO_INFO@check_events@1703: initiated connection to server [10.10.10.118:2181]
2015-12-29 13:41:25,004:3345(0x7f21018db700):ZOO_INFO@check_events@1703: initiated connection to server [10.10.10.118:2181]
2015-12-29 13:41:25,004:3345(0x7f20f27fc700):ZOO_INFO@check_events@1703: initiated connection to server [10.10.10.118:2181]
@ 0x7f211143a012 google::LogMessage::Flush()
@ 0x7f211143cd46 google::LogMessageFatal::~LogMessageFatal()
@ 0x7f211056d44c _CheckFatal::~_CheckFatal()
@ 0x7f211125a243 mesos::internal::log::RecoverProtocolProcess::received()
@ 0x7f2111265ae6 _ZZN7process8dispatchI6OptionIN5mesos8internal3log15RecoverResponseEENS4_22RecoverProtocolProcessERKNS_6FutureIS5_EES9_EENS8_IT_EERKNS_3PIDIT0_EEMSF_FSD_T1_ET2_ENKUlPNS_11ProcessBaseEE_clESO_
@ 0x7f211127c5d5 _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchI6OptionIN5mesos8internal3log15RecoverResponseEENS8_22RecoverProtocolProcessERKNS0_6FutureIS9_EESD_EENSC_IT_EERKNS0_3PIDIT0_EEMSJ_FSH_T1_ET2_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_
@ 0x7f21113c0d7d std::function<>::operator()()
@ 0x7f21113a8b95 process::ProcessBase::visit()
@ 0x7f21113ac960 process::DispatchEvent::visit()
@ 0x471dd8 process::ProcessBase::serve()
I1229 13:41:25.136451 3366 contender.cpp:149] Joining the ZK group
@ 0x7f21113a4f81 process::ProcessManager::resume()
@ 0x7f21113a21b2 _ZZN7process14ProcessManager12init_threadsEvENKUlRKSt11atomic_boolE_clES3_
@ 0x7f21113ac18c _ZNSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS3_EEE6__callIvIEILm0EEEET_OSt5tupleIIDpT0_EESt12_Index_tupleIIXspT1_EEE
@ 0x7f21113ac13c _ZNSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS3_EEEclIIEvEET0_DpOT_
@ 0x7f21113ac0ce _ZNSt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS4_EEEvEE9_M_invokeIIEEEvSt12_Index_tupleIIXspT_EEE
@ 0x7f21113ac025 _ZNSt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS4_EEEvEEclEv
@ 0x7f21113abfbe _ZNSt6thread5_ImplISt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS6_EEEvEEE6_M_runEv
@ 0x7f210ce3f220 (unknown)
@ 0x7f210d099dc5 start_thread
@ 0x7f210c5a721d __clone
Aborted (core dumped)
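
For context on the --quorum values tried above: as far as I understand the Mesos high-availability documentation, the quorum should be a strict majority of the masters, which would make 1 the only sensible value for a single master. Whether this is what trips the CHECK in recover.cpp is only my guess. A minimal Python sketch of that majority rule follows; the function names are hypothetical and this is not Mesos code.

```python
def majority_quorum(num_masters: int) -> int:
    """Smallest quorum that is a strict majority of num_masters (assumed rule)."""
    if num_masters < 1:
        raise ValueError("need at least one master")
    return num_masters // 2 + 1


def is_valid_quorum(quorum: int, num_masters: int) -> bool:
    """True if quorum is a strict majority and not larger than the master count."""
    return num_masters // 2 < quorum <= num_masters


if __name__ == "__main__":
    # Single-master setup, as in this post.
    print(majority_quorum(1))
    print(is_valid_quorum(0, 1))
```

Under this rule, the single-master setup described here would use --quorum=1, and --quorum=0 would be rejected.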