[ https://issues.apache.org/jira/browse/MESOS-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557021#comment-14557021 ]

Joris Van Remoortere commented on MESOS-2706:
---------------------------------------------

I made a comment regarding this in MESOS-2254.
For the docker case, after talking with [~timchen], we may be able to avoid this problem by collecting the usage statistics out of cgroups, rather than /proc, since we run docker using cgroups. That would avoid the issues presented in MESOS-2254.
I believe the cgroups stats are much more efficient, though we should verify.

> When the docker-tasks grow, the time spare between Queuing task and Starting container grows
> --------------------------------------------------------------------------------------------
>
>                 Key: MESOS-2706
>                 URL: https://issues.apache.org/jira/browse/MESOS-2706
>             Project: Mesos
>          Issue Type: Bug
>          Components: docker
>    Affects Versions: 0.22.0
>         Environment: My Environment info:
> Mesos 0.22.0 & Marathon 0.82-RC1, both running on one host server.
> Every docker task requires 0.02 CPU and 128MB, and the server has 8 CPUs and
> 24G of memory.
> So Mesos can launch thousands of tasks in theory.
> And the docker task is very lightweight: it just launches an sshd service.
>            Reporter: chenqiuhao
>
> At the beginning, Marathon could launch docker tasks very fast, but when the
> number of tasks on the single mesos-slave host reached 50, Marathon seemed to
> launch docker tasks slowly.
> So I checked the mesos-slave log, and I found that the time between
> Queuing task and Starting container grew.
> For example,
> launching the 1st docker task takes about 0.008s:
> [root@CNSH231434 mesos-slave]# tail -f slave.out |egrep 'Queuing task|Starting container'
> I0508 15:54:00.188350 225779 slave.cpp:1378] Queuing task 'dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b' for executor dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b of framework '20150202-112355-2684495626-5050-26153-0000
> I0508 15:54:00.196832 225781 docker.cpp:581] Starting container 'd0b0813a-6cb6-4dfd-bbce-f1b338744285' for task 'dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b' (and executor 'dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b') of framework '20150202-112355-2684495626-5050-26153-0000'
> launching the 50th docker task takes about 4.9s:
> I0508 16:12:10.908596 225781 slave.cpp:1378] Queuing task 'dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b' for executor dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b of framework '20150202-112355-2684495626-5050-26153-0000
> I0508 16:12:15.801503 225778 docker.cpp:581] Starting container '482dd47f-b9ab-4b09-b89e-e361d6f004a4' for task 'dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b' (and executor 'dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b') of framework '20150202-112355-2684495626-5050-26153-0000'
> And when I launch the 100th docker task, it takes about 13s!
> I ran the same test on a server host with 24 CPUs and 256G of memory, and got
> the same result.
> Has anybody had the same experience, or can anyone help run the same stress
> test?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
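
For anyone who wants to repeat the measurement, the gap between the "Queuing task" and "Starting container" lines can be computed from slave.out with a short script. This is only a rough sketch, not part of Mesos: the function names (`parse_stamp`, `launch_gaps`) are made up for illustration, the year is an assumption because the log prefix carries only month and day, and the sample lines are the ones quoted in the issue, joined back onto single lines.

```python
import re
from datetime import datetime

# Log prefix as it appears in the quoted slave.out lines: I<MMDD> <HH:MM:SS.ffffff>
STAMP = re.compile(r"^[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6}) ")
# Task id as it appears in both line types: task '<id>'
TASK = re.compile(r"task '([^']+)'")


def parse_stamp(line, year):
    """Return the timestamp of a log line, or None if the prefix does not match."""
    m = STAMP.match(line)
    if m is None:
        return None
    # The log omits the year, so the caller supplies one (an assumption).
    return datetime.strptime(f"{year}{m.group(1)} {m.group(2)}", "%Y%m%d %H:%M:%S.%f")


def launch_gaps(lines, year=2015):
    """Map task id -> seconds between 'Queuing task' and 'Starting container'."""
    queued = {}
    gaps = {}
    for line in lines:
        stamp = parse_stamp(line, year)
        task = TASK.search(line)
        if stamp is None or task is None:
            continue
        task_id = task.group(1)
        if "Queuing task" in line:
            queued[task_id] = stamp
        elif "Starting container" in line and task_id in queued:
            gaps[task_id] = (stamp - queued.pop(task_id)).total_seconds()
    return gaps


# The four log lines quoted in the issue, one entry per line.
SAMPLE = [
    "I0508 15:54:00.188350 225779 slave.cpp:1378] Queuing task 'dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b' for executor dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b of framework '20150202-112355-2684495626-5050-26153-0000",
    "I0508 15:54:00.196832 225781 docker.cpp:581] Starting container 'd0b0813a-6cb6-4dfd-bbce-f1b338744285' for task 'dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b' (and executor 'dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b') of framework '20150202-112355-2684495626-5050-26153-0000'",
    "I0508 16:12:10.908596 225781 slave.cpp:1378] Queuing task 'dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b' for executor dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b of framework '20150202-112355-2684495626-5050-26153-0000",
    "I0508 16:12:15.801503 225778 docker.cpp:581] Starting container '482dd47f-b9ab-4b09-b89e-e361d6f004a4' for task 'dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b' (and executor 'dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b') of framework '20150202-112355-2684495626-5050-26153-0000'",
]

gaps = launch_gaps(SAMPLE)
for task_id, seconds in gaps.items():
    print(f"{task_id}: {seconds:.3f}s")
```

On a live slave the same function could be fed `open("slave.out")` instead of `SAMPLE`, which would give one gap per launched task and make the growth with task count easy to plot.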