Frédéric Comte created MESOS-9936:
-------------------------------------
Summary: Slave recovery is very slow with large local persistent
volumes (Marathon app)
Key: MESOS-9936
URL: https://issues.apache.org/jira/browse/MESOS-9936
Project: Mesos
Issue Type: Bug
Components: agent
Reporter: Frédéric Comte
I run some applications that use local persistent volumes.
After an unplanned shutdown of the nodes running these applications, I
see that the Mesos recovery process takes a long time (more than 8
hours).
The duration depends on the amount of data in those volumes.
What does Mesos do during this process?
{code:java}
Jul 08 07:40:44 boss1 mesos-agent[13345]: I0708 07:40:44.771447 13370 docker.cpp:890] Recovering Docker containers
Jul 08 07:40:44 boss1 mesos-agent[13345]: I0708 07:40:44.783957 13375 containerizer.cpp:801] Recovering Mesos containers
Jul 08 07:40:44 boss1 mesos-agent[13345]: I0708 07:40:44.799252 13373 linux_launcher.cpp:286] Recovering Linux launcher
Jul 08 07:40:44 boss1 mesos-agent[13345]: I0708 07:40:44.810429 13375 containerizer.cpp:1127] Recovering isolators
Jul 08 07:40:44 boss1 mesos-agent[13345]: I0708 07:40:44.817328 13389 containerizer.cpp:1166] Recovering provisioner
Jul 08 14:42:10 boss1 mesos-agent[13345]: I0708 14:42:10.928683 13373 composing.cpp:339] Finished recovering all containerizers
Jul 08 14:42:10 boss1 mesos-agent[13345]: I0708 14:42:10.950503 13354 status_update_manager_process.hpp:314] Recovering operation status update manager
Jul 08 14:42:10 boss1 mesos-agent[13345]: I0708 14:42:10.957418 13399 slave.cpp:7729] Recovering executors
{code}
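To see which recovery step the time is spent in, the pause between consecutive log entries can be computed from the timestamps. The following is only a minimal sketch in Python, assuming every entry carries the glog-style prefix seen in the excerpt above; the helper names parse and largest_gap are hypothetical and not part of Mesos.

```python
import re
from datetime import datetime

# glog-style prefix as seen in the excerpt (assumed layout):
#   I<MMDD> <HH:MM:SS.ffffff> <thread id> <file>:<line>] <message>
GLOG = re.compile(r"[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6}) \d+ (\S+:\d+)\] (.*)")


def parse(lines):
    """Yield (timestamp, source location, message) for each line that matches."""
    for line in lines:
        m = GLOG.search(line)
        if m:
            mmdd, clock, where, msg = m.groups()
            # The prefix has no year, so strptime falls back to 1900.
            # That is fine for gaps that stay within one calendar year.
            ts = datetime.strptime(f"{mmdd} {clock}", "%m%d %H:%M:%S.%f")
            yield ts, where, msg.strip()


def largest_gap(lines):
    """Return (gap, earlier entry, later entry) for the longest pause.

    Raises ValueError if fewer than two entries could be parsed.
    """
    entries = list(parse(lines))
    pairs = zip(entries, entries[1:])
    return max(((b[0] - a[0], a, b) for a, b in pairs), key=lambda t: t[0])


if __name__ == "__main__":
    sample = [
        "Jul 08 07:40:44 boss1 mesos-agent[13345]: I0708 07:40:44.817328 13389 "
        "containerizer.cpp:1166] Recovering provisioner",
        "Jul 08 14:42:10 boss1 mesos-agent[13345]: I0708 14:42:10.928683 13373 "
        "composing.cpp:339] Finished recovering all containerizers",
    ]
    gap, before, after = largest_gap(sample)
    print(f"{gap} between '{before[2]}' and '{after[2]}'")
```

Applied to the excerpt above, the longest pause lies between "Recovering provisioner" (07:40:44) and "Finished recovering all containerizers" (14:42:10), which is roughly seven hours.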
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)