Hi Yu, please enable debug mode with GLOG_v=3 to get more detailed logs.
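A rough sketch of what I mean, assuming Chronos is started from a shell wrapper. The launch command and the log path are placeholders I made up, and `abort_context` is just a hypothetical helper name, so adjust all of them to your setup:

```shell
# glog reads its flags from GLOG_-prefixed environment variables, so this
# raises verbosity for the process that hosts the Mesos scheduler driver.
export GLOG_v=3

# Hypothetical launch command; replace with however Chronos is started.
# For a container, the variable has to be passed into the container
# environment (for example with `docker run -e GLOG_v=3 ...`).
# /path/to/start-chronos

# Hypothetical helper: print the abort line together with the 50 log lines
# that precede it, to see what the driver did just before the abort.
# $1 is the path to the scheduler driver log (location is an assumption).
abort_context() {
  grep -B 50 "Asked to abort the driver" "$1"
}
```

Once the problem happens again, running `abort_context` on the Chronos log should show the verbose driver output leading up to the abort.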

2016-08-12 14:27 GMT+08:00 志昌 余 <yuzhichang_...@hotmail.com>:

> Hi Anindya,
>
> The problem occurred again. The following is the scheduler driver log
> on the Chronos side:
>
> I0812 08:15:43.902712 96 sched.cpp:1937] Asked to abort the driver
> I0812 08:15:43.902763 96 sched.cpp:981] Scheduler::statusUpdate took 1.436378441secs
> I0812 08:15:43.902788 96 sched.cpp:988] Not sending status update acknowledgment message because the driver is not running!
> I0812 08:15:43.902866 96 sched.cpp:919] Ignoring task status update message because the driver is not running!
>
> However, the earlier log gives no clue as to why the scheduler driver
> was aborted.
>
> Thanks,
> Zhichang Yu
>
> ------------------------------
> *From:* 志昌 余 <yuzhichang_...@hotmail.com>
> *Sent:* August 9, 2016 18:03:31
> *To:* user@mesos.apache.org
> *Subject:* Re: Deactivationg framework unexpectly
>
> Hi Anindya,
>
> Thanks for the info. I'll enable the scheduler driver log to see what
> happens.
>
> Regards,
> Zhichang Yu
>
> ------------------------------
> *From:* anindya_si...@apple.com <anindya_si...@apple.com> on behalf of
> Anindya Sinha <anindya_si...@apple.com>
> *Sent:* August 8, 2016 23:50:10
> *To:* user@mesos.apache.org
> *Subject:* Re: Deactivationg framework unexpectly
>
> Looks like your framework (Chronos) is sending a
> DeactivateFrameworkMessage to the master. The scheduler driver also
> sends a DeactivateFrameworkMessage if it is aborted
> (https://github.com/apache/mesos/blob/master/src/sched/sched.cpp#L1224).
>
> Also, the master can deactivate your framework if your framework
> disconnects or fails over. Please check the master logs, or see whether
> your framework received a FrameworkErrorMessage.
>
> Thanks
> Anindya
>
> On Aug 8, 2016, at 3:35 AM, 志昌 余 <yuzhichang_...@hotmail.com> wrote:
>
> Hi,
> I recently faced a weird problem. I'm running Mesos + Chronos.
> Chronos often (once every several days) stops scheduling tasks because
> Mesos deactivated the framework.
> The following is the log from the leading Mesos master:
>
> # grep -iP "activat|disconnected" /var/log/mesos/mesos-master.INFO
> I0806 13:40:33.143658 30 master.cpp:2551] Deactivating framework 90a6a7dc-7256-4e55-bd7e-573233c5df74-0000 (chronos-2.5.0-SNAPSHOT) at scheduler-86a64d22-5201-4bb0-8a2c-70d3e97afae6@10.8.139.246:34544
> I0806 13:40:33.143908 23 hierarchical.cpp:375] Deactivated framework 90a6a7dc-7256-4e55-bd7e-573233c5df74-0000
>
> The fix is to manually reboot the Chronos leader.
>
> My env:
> There are 3 physical machines, each running a containerized Mesos
> master and Chronos. When the issue occurred, the Mesos leader and the
> Chronos leader were both running on the same machine.
>
> Software versions:
> mesos-master:0.28.0-2.0.16.ubuntu1404
> chronos:2.5.0-ce4469d.ubuntu1404-mesos-0.28.0-2.0.16.ubuntu1404
>
> Can anyone give some insight into this problem?
>
> Thanks,
> Zhichang Yu

--
Deshi Xiao
Twitter: xds2000
E-mail: xiaods(AT)gmail.com