Hi,

It seems that the C++ scheduler driver doesn't detect the loss of its connection to the master when not using ZooKeeper.
A simple way to reproduce this is to start a master passing it e.g. "--ip=127.0.0.1", start the scheduler driver passing it "127.0.0.1:5050", and then send a SIGKILL to the master.

The scheduler logs the following:

I1220 10:56:11.679347 10635 process.cpp:2928] Resuming __reaper__(1)@192.168.65.76:34345 at 2019-12-20 10:56:11.679366144+00:00
I1220 10:56:11.679392 10635 clock.cpp:279] Created a timer for __reaper__(1)@192.168.65.76:34345 in 100ms in the future (2019-12-20 10:56:11.779389952+00:00)
I1220 10:56:11.690646 10631 process.cpp:2928] Resuming scheduler-6a93a8e3-5a8f-4195-bde2-718b5832d317@192.168.65.76:34345 at 2019-12-20 10:56:11.690665984+00:00
I1220 10:56:11.690775 10632 process.cpp:2928] Resuming __http__(1)@192.168.65.76:34345 at 2019-12-20 10:56:11.690784000+00:00
I1220 10:56:11.690806 10632 process.cpp:3088] Cleaning up __http__(1)@192.168.65.76:34345
I1220 10:56:11.690914 10632 process.cpp:2928] Resuming help@192.168.65.76:34345 at 2019-12-20 10:56:11.690921984+00:00

An strace confirms that the process receives EOF when reading from the socket, but Scheduler::disconnected isn't called.

Is that expected? Or is it assumed that the scheduler relies on ZooKeeper for detection?

Cheers,

Charles
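
P.S. For reference, here is a minimal sketch of what "receives EOF" means at the socket level. It is not Mesos or libprocess code: read_after_peer_close is a hypothetical name, and a local AF_UNIX socket pair stands in (as an assumption) for the TCP connection to the master. The point is only that read() returning 0 is the signal I would expect to be turned into a disconnection event.

```cpp
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper, for illustration only.
// Returns what read() reports on one end of a stream socket pair after the
// other end has been closed: 0 means EOF, -1 means an error.
ssize_t read_after_peer_close() {
  int fds[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) {
    return -1;
  }

  // Closing the peer end stands in for the master being killed; the kernel
  // closes a killed process's sockets in the same way.
  close(fds[1]);

  char buf[16];
  ssize_t n = read(fds[0], buf, sizeof(buf));  // no data pending, peer gone

  close(fds[0]);
  return n;
}
```

Calling read_after_peer_close() on a POSIX system should return 0, which matches the EOF that strace shows on the scheduler's socket.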