[jira] [Commented] (MESOS-2706) When the docker-tasks grow, the time spare between Queuing task and Starting container grows
[ https://issues.apache.org/jira/browse/MESOS-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15220889#comment-15220889 ]

Timothy Chen commented on MESOS-2706:
--------------------------------------

Btw, we already changed the stats collection to read from cgroups instead of the /proc stats path, so this should ideally be fixed now. [~kairu1987], can you help verify?

> When the docker-tasks grow, the time spare between Queuing task and Starting container grows
>
> Key: MESOS-2706
> URL: https://issues.apache.org/jira/browse/MESOS-2706
> Project: Mesos
> Issue Type: Bug
> Components: docker
> Affects Versions: 0.22.0
> Environment: Mesos 0.22.0 and Marathon 0.8.2-RC1, both running on one host-server.
> Every docker-task requires 0.02 CPU and 128MB of memory, and the server has 8 CPUs and 24GB of memory,
> so in theory Mesos can launch thousands of tasks.
> Each docker-task is a very lightweight sshd service.
> Reporter: chenqiuhao
>
> At the beginning, Marathon launches docker-tasks very quickly, but once the number of tasks on the single mesos-slave host reached 50, Marathon seemed to launch docker-tasks slowly.
> I checked the mesos-slave log and found that the time span between "Queuing task" and "Starting container" grew.
>
> For example, launching the 1st docker task takes about 0.008s:
>
> [root@CNSH231434 mesos-slave]# tail -f slave.out | egrep 'Queuing task|Starting container'
> I0508 15:54:00.188350 225779 slave.cpp:1378] Queuing task 'dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b' for executor dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b of framework '20150202-112355-2684495626-5050-26153-
> I0508 15:54:00.196832 225781 docker.cpp:581] Starting container 'd0b0813a-6cb6-4dfd-bbce-f1b338744285' for task 'dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b' (and executor 'dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b') of framework '20150202-112355-2684495626-5050-26153-'
>
> Launching the 50th docker task takes about 4.9s:
>
> I0508 16:12:10.908596 225781 slave.cpp:1378] Queuing task 'dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b' for executor dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b of framework '20150202-112355-2684495626-5050-26153-
> I0508 16:12:15.801503 225778 docker.cpp:581] Starting container '482dd47f-b9ab-4b09-b89e-e361d6f004a4' for task 'dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b' (and executor 'dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b') of framework '20150202-112355-2684495626-5050-26153-'
>
> And when I launch the 100th docker task, it takes about 13s!
> I ran the same test on a host with 24 CPUs and 256GB of memory and got the same result.
> Has anybody had the same experience, or can anyone help run the same pressure test?
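The delay reported above is the gap between the slave's "Queuing task" line and the Docker containerizer's "Starting container" line for the same task. As a rough illustration only (not part of the original report), here is a minimal C++ sketch of how that gap can be pulled out of slave.out automatically; it assumes the glog line format shown in the excerpt (an "IMMDD HH:MM:SS.ffffff" prefix, with the task name as the first quoted string after "Queuing task" and after "for task"), and that both lines for a task land on the same day.

{code}
#include <fstream>
#include <iostream>
#include <map>
#include <sstream>
#include <string>

// Convert a glog prefix like "I0508 16:12:10.908596 ..." into seconds
// since midnight (the month/day part is ignored).
double timeOfDay(const std::string& line)
{
  int hours = 0, minutes = 0;
  double seconds = 0;
  char sep;
  std::istringstream in(line.substr(6)); // Skip the "IMMDD " prefix.
  in >> hours >> sep >> minutes >> sep >> seconds;
  return hours * 3600 + minutes * 60 + seconds;
}

// Extract the first single-quoted string in the line (the task name).
std::string taskName(const std::string& line)
{
  size_t begin = line.find('\'') + 1;
  return line.substr(begin, line.find('\'', begin) - begin);
}

int main()
{
  std::ifstream log("slave.out");
  std::map<std::string, double> queued; // task name -> "Queuing task" time
  std::string line;

  while (std::getline(log, line)) {
    if (line.find("Queuing task") != std::string::npos) {
      queued[taskName(line)] = timeOfDay(line);
    } else if (line.find("Starting container") != std::string::npos) {
      // On "Starting container" lines the task name is the quoted string
      // that follows "for task".
      std::string task = taskName(line.substr(line.find("for task")));
      if (queued.count(task) > 0) {
        std::cout << task << ": "
                  << (timeOfDay(line) - queued[task]) << "s" << std::endl;
      }
    }
  }
  return 0;
}
{code}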
[jira] [Commented] (MESOS-2706) When the docker-tasks grow, the time spare between Queuing task and Starting container grows
[ https://issues.apache.org/jira/browse/MESOS-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695609#comment-14695609 ]

Alexander Rukletsov commented on MESOS-2706:
---------------------------------------------

This could also be a Docker-related issue: the docker daemon processes requests more slowly in the presence of numerous containers.
[jira] [Commented] (MESOS-2706) When the docker-tasks grow, the time spare between Queuing task and Starting container grows
[ https://issues.apache.org/jira/browse/MESOS-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681976#comment-14681976 ]

Alexander Rukletsov commented on MESOS-2706:
---------------------------------------------

I was able to reproduce the issue as well. Here is an excerpt of slave events for one of the first tasks in the sequence:
{code}
I0806 08:24:32.835737  1245 slave.cpp:1144] Got assigned task
I0806 08:24:32.936342  1245 slave.cpp:4208] Launching executor
I0806 08:24:32.938051  1245 slave.cpp:1401] Queuing task
I0806 08:24:32.952314  1242 docker.cpp:626] Starting container
I0806 08:24:33.637434  1243 docker.cpp:277] Checkpointing pid 21930
I0806 08:24:33.716608  1240 slave.cpp:3165] Monitoring executor
I0806 08:24:33.830728  1241 slave.cpp:1555] Sending queued task ... to executor
I0806 08:24:33.858212  1245 slave.cpp:2776] Forwarding the update TASK_RUNNING
{code}
and for one of the last:
{code}
I0806 08:31:27.482077  1245 slave.cpp:1144] Got assigned task
I0806 08:31:27.502507  1245 slave.cpp:4208] Launching executor
I0806 08:31:27.503300  1245 slave.cpp:1401] Queuing task
I0806 08:31:39.053246  1246 docker.cpp:626] Starting container
I0806 08:32:47.695961  1246 docker.cpp:277] Checkpointing pid 19414
I0806 08:33:11.880014  1241 slave.cpp:3165] Monitoring executor
I0806 08:33:12.060046  1241 slave.cpp:1555] Sending queued task ... to executor
I0806 08:33:12.076020  1240 slave.cpp:2776] Forwarding the update TASK_RUNNING
{code}
[jira] [Commented] (MESOS-2706) When the docker-tasks grow, the time spare between Queuing task and Starting container grows
[ https://issues.apache.org/jira/browse/MESOS-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557021#comment-14557021 ]

Joris Van Remoortere commented on MESOS-2706:
----------------------------------------------

I made a comment regarding this in MESOS-2254. For the Docker case, after talking with [~timchen], we may be able to avoid this problem by collecting the usage statistics from cgroups rather than /proc, since we run Docker using cgroups. That would avoid the issues presented in MESOS-2254. I believe the cgroups stats are much more efficient, though we should verify.
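For illustration only (this is not Mesos's implementation; the cgroup mount point and the "docker/<container-id>" hierarchy below are assumptions that vary by distribution and Docker configuration), reading a container's usage from its cgroup amounts to a couple of small file reads per container, rather than a walk over every /proc/<pid> entry on the host:

{code}
#include <fstream>
#include <iostream>
#include <string>

struct Usage
{
  unsigned long long userTicks = 0;   // from cpuacct.stat "user"
  unsigned long long systemTicks = 0; // from cpuacct.stat "system"
  unsigned long long memoryBytes = 0; // from memory.usage_in_bytes
};

// Read CPU and memory usage for one container from its cgroup (cgroup v1
// layout assumed, with Docker containers under a "docker" hierarchy).
Usage cgroupUsage(const std::string& containerId)
{
  Usage usage;

  // CPU accounting: two lines, "user <ticks>" and "system <ticks>".
  std::ifstream cpu(
      "/sys/fs/cgroup/cpuacct/docker/" + containerId + "/cpuacct.stat");
  std::string key;
  unsigned long long value;
  while (cpu >> key >> value) {
    if (key == "user") usage.userTicks = value;
    if (key == "system") usage.systemTicks = value;
  }

  // Memory usage: a single integer in bytes.
  std::ifstream mem(
      "/sys/fs/cgroup/memory/docker/" + containerId + "/memory.usage_in_bytes");
  mem >> usage.memoryBytes;

  return usage;
}

int main(int argc, char** argv)
{
  if (argc < 2) {
    std::cerr << "usage: " << argv[0] << " <container-id>" << std::endl;
    return 1;
  }

  Usage usage = cgroupUsage(argv[1]);
  std::cout << "user ticks:   " << usage.userTicks << "\n"
            << "system ticks: " << usage.systemTicks << "\n"
            << "memory bytes: " << usage.memoryBytes << std::endl;
  return 0;
}
{code}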
[jira] [Commented] (MESOS-2706) When the docker-tasks grow, the time spare between Queuing task and Starting container grows
[ https://issues.apache.org/jira/browse/MESOS-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545901#comment-14545901 ]

Benjamin Mahler commented on MESOS-2706:
-----------------------------------------

Linking in another ticket related to high CPU usage caused by the one-second usage polling in the slave.
[jira] [Commented] (MESOS-2706) When the docker-tasks grow, the time spare between Queuing task and Starting container grows
[ https://issues.apache.org/jira/browse/MESOS-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545342#comment-14545342 ]

chenqiuhao commented on MESOS-2706:
------------------------------------

I used strace and found that the slave process reads every /proc/<pid>/stat and /proc/<pid>/cmdline to compute the usage of each docker-task, round after round.

For example, when I launch one docker-task on a host that is already running 500 other processes (counted with ps -ef | wc -l), the mesos-slave process keeps reading /proc/<pid>/stat and /proc/<pid>/cmdline 500+500 times per round. When the number of docker-tasks reaches 50, these massive reads of /proc/<pid>/stat and /proc/<pid>/cmdline exhaust one whole CPU.
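To make the cost concrete, here is a minimal sketch (not the actual Mesos code) of the access pattern strace shows: for each of the C monitored containers, every polling round reads /proc/<pid>/stat and /proc/<pid>/cmdline for all P processes on the host, i.e. roughly 2*P*C file reads per round, which grows with both the total process count and the number of running docker-tasks.

{code}
#include <dirent.h>

#include <cctype>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

// Collect every numeric entry in /proc, i.e. every pid on the host.
std::vector<std::string> allPids()
{
  std::vector<std::string> pids;
  DIR* dir = opendir("/proc");
  if (dir == nullptr) return pids;
  while (dirent* entry = readdir(dir)) {
    std::string name = entry->d_name;
    if (!name.empty() && std::isdigit(static_cast<unsigned char>(name[0]))) {
      pids.push_back(name);
    }
  }
  closedir(dir);
  return pids;
}

int main()
{
  const int containers = 50; // pretend we poll 50 docker-tasks per round

  long reads = 0;
  for (int c = 0; c < containers; ++c) {
    for (const std::string& pid : allPids()) {
      // Two reads per process, per container, per polling round.
      std::ifstream stat("/proc/" + pid + "/stat");
      std::ifstream cmdline("/proc/" + pid + "/cmdline");
      std::string unused;
      std::getline(stat, unused);
      std::getline(cmdline, unused, '\0');
      reads += 2;
    }
  }

  std::cout << "one polling round issued ~" << reads << " /proc reads"
            << std::endl;
  return 0;
}
{code}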
[jira] [Commented] (MESOS-2706) When the docker-tasks grow, the time spare between Queuing task and Starting container grows
[ https://issues.apache.org/jira/browse/MESOS-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545124#comment-14545124 ]

chenqiuhao commented on MESOS-2706:
------------------------------------

PS:
1. In my environment, when the number of tasks reached 22, the CPU usage reached 100%.
2. From top -Hp we can see that only one thread of the lt-mesos-slave process is running and the rest are sleeping, so the process can consume at most 100% of one CPU:
== Threads: 12 total, 1 running, 11 sleeping, 0 stopped, 0 zombie ==
[jira] [Commented] (MESOS-2706) When the docker-tasks grow, the time spare between Queuing task and Starting container grows
[ https://issues.apache.org/jira/browse/MESOS-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545012#comment-14545012 ]

chenqiuhao commented on MESOS-2706:
------------------------------------

Hi Tim, I updated Mesos to 0.22.1 today and reproduced the issue as well.

I noticed that as my tasks grew, the CPU usage of the 'lt-mesos-slave' process increased and stayed at 100% (one full CPU), as shown below.

top result:
{code}
top - 14:12:17 up 48 days, 16:41,  8 users,  load average: 1.13, 1.36, 2.00
Tasks: 1717 total,   1 running, 1716 sleeping,   0 stopped,   0 zombie
%Cpu(s):  8.2 us,  8.2 sy,  0.0 ni, 83.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:  32782296 total, 17602948 used, 15179348 free,   193112 buffers
KiB Swap: 20479996 total,        0 used, 20479996 free. 10514872 cached Mem

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
271790 root      20   0  952028  54868  14192 S 100.6  0.2 163:26.57 lt-mesos-slave
180453 root      20   0 12.791g 955736  26216 S   4.2  2.9  11:12.81 java
......
{code}

When I used top -Hp 271790 to find which thread of lt-mesos-slave costs the most CPU, I found it was thread 271808.

top -Hp 271790 result:
{code}
top - 14:16:20 up 48 days, 16:45,  8 users,  load average: 1.36, 1.35, 1.84
Threads:  12 total,   1 running,  11 sleeping,   0 stopped,   0 zombie
%Cpu(s):  8.2 us,  8.0 sy,  0.0 ni, 83.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:  32782296 total, 17599804 used, 15182492 free,   193112 buffers
KiB Swap: 20479996 total,        0 used, 20479996 free. 10515348 cached Mem

   PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
271808 root      20   0  952028  54868  14192 R 99.9  0.2 165:04.18 lt-mesos-slave
271807 root      20   0  952028  54868  14192 S  0.7  0.2   0:21.79 lt-mesos-slave
271810 root      20   0  952028  54868  14192 S  0.3  0.2   0:16.86 lt-mesos-slave
271813 root      20   0  952028  54868  14192 S  0.3  0.2   0:19.85 lt-mesos-slave
271814 root      20   0  952028  54868  14192 S  0.3  0.2   0:23.27 lt-mesos-slave
271815 root      20   0  952028  54868  14192 S  0.3  0.2   0:06.96 lt-mesos-slave
271790 root      20   0  952028  54868  14192 S  0.0  0.2   0:00.05 lt-mesos-slave
271809 root      20   0  952028  54868  14192 S  0.0  0.2   0:17.30 lt-mesos-slave
271811 root      20   0  952028  54868  14192 S  0.0  0.2   0:22.63 lt-mesos-slave
271812 root      20   0  952028  54868  14192 S  0.0  0.2   0:16.02 lt-mesos-slave
271820 root      20   0  952028  54868  14192 S  0.0  0.2   0:00.53 lt-mesos-slave
271821 root      20   0  952028  54868  14192 S  0.0  0.2   0:00.00 lt-mesos-slave
{code}

I then ran pstack several times to check what thread 271808 was doing, and found:

1st time: pstack 271808
{code}
Thread 1 (process 271808):
#0  0x7fcb05f5725d in read () from /lib64/libpthread.so.0
#1  0x7fcb05cbe567 in std::__basic_file::xsgetn(char*, long) () from /lib64/libstdc++.so.6
#2  0x7fcb05cf7c50 in std::basic_filebuf >::underflow() () from /lib64/libstdc++.so.6
#3  0x7fcb05cc16e9 in std::istream::get(std::basic_streambuf >&, char) () from /lib64/libstdc++.so.6
#4  0x7fcb06837205 in proc::cmdline(Option const&) () from /wls/mesos-0.22.1/build/src/.libs/libmesos-0.22.1.so
#5  0x7fcb06837ec8 in os::process(int) () from /wls/mesos-0.22.1/build/src/.libs/libmesos-0.22.1.so
#6  0x7fcb06b5fd1c in mesos::internal::usage(int, bool, bool) () from /wls/mesos-0.22.1/build/src/.libs/libmesos-0.22.1.so
#7  0x7fcb06a97eb0 in mesos::internal::slave::DockerContainerizer
{code}
[jira] [Commented] (MESOS-2706) When the docker-tasks grow, the time spare between Queuing task and Starting container grows
[ https://issues.apache.org/jira/browse/MESOS-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542405#comment-14542405 ]

Timothy Chen commented on MESOS-2706:
--------------------------------------

Hi, sorry, I didn't have time to reproduce yet; I was hoping to today. If you can provide more detailed information, such as the biggest time factor that you found, that would be helpful.

Tim
[jira] [Commented] (MESOS-2706) When the docker-tasks grow, the time spare between Queuing task and Starting container grows
[ https://issues.apache.org/jira/browse/MESOS-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541617#comment-14541617 ]

chenqiuhao commented on MESOS-2706:
------------------------------------

Hi Timothy, were you able to reproduce it?
[jira] [Commented] (MESOS-2706) When the docker-tasks grow, the time spare between Queuing task and Starting container grows
[ https://issues.apache.org/jira/browse/MESOS-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541616#comment-14541616 ]

chenqiuhao commented on MESOS-2706:
------------------------------------

It seems nobody has replied...
[jira] [Commented] (MESOS-2706) When the docker-tasks grow, the time spare between Queuing task and Starting container grows
[ https://issues.apache.org/jira/browse/MESOS-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539190#comment-14539190 ]

yuhe commented on MESOS-2706:
------------------------------

Waiting for your reply...
[jira] [Commented] (MESOS-2706) When the docker-tasks grow, the time spare between Queuing task and Starting container grows
[ https://issues.apache.org/jira/browse/MESOS-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537637#comment-14537637 ]

yuhe commented on MESOS-2706:
------------------------------

The time it takes the executor to register with the slave delays the start time. Who can help? Thanks.
[jira] [Commented] (MESOS-2706) When the docker-tasks grow, the time spare between Queuing task and Starting container grows
[ https://issues.apache.org/jira/browse/MESOS-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537634#comment-14537634 ]

yuhe commented on MESOS-2706:
------------------------------

mesos/src/slave/slave.cpp:
{code}
if (!executor->isCommandExecutor()) {
  // If the executor is _not_ a command executor, this means that
  // the task will include the executor to run. The actual task to
  // run will be enqueued and subsequently handled by the executor
  // when it has registered to the slave.
  launch = slave->containerizer->launch(
      containerId,
      executorInfo_, // modified to include the task's resources.
      executor->directory,
      slave->flags.switch_user ? Option<string>(user) : None(),
      slave->info.id(),
      slave->self(),
      info.checkpoint());
} else {
  // An executor has _not_ been provided by the task and will
  // instead define a command and/or container to run. Right now,
  // these tasks will require an executor anyway and the slave
  // creates a command executor. However, it is up to the
  // containerizer how to execute those tasks and the generated
  // executor info works as a placeholder.
  // TODO(nnielsen): Obsolete the requirement for executors to run
  // one-off tasks.
  launch = slave->containerizer->launch(
      containerId,
      taskInfo,
      executorInfo_,
      executor->directory,
      slave->flags.switch_user ? Option<string>(user) : None(),
      slave->info.id(),
      slave->self(),
      info.checkpoint());
}
{code}
[jira] [Commented] (MESOS-2706) When the docker-tasks grow, the time spare between Queuing task and Starting container grows
[ https://issues.apache.org/jira/browse/MESOS-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537474#comment-14537474 ]

chenqiuhao commented on MESOS-2706:
------------------------------------

Thanks, any help would be appreciated.
[jira] [Commented] (MESOS-2706) When the docker-tasks grow, the time spare between Queuing task and Starting container grows
[ https://issues.apache.org/jira/browse/MESOS-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14534960#comment-14534960 ]

Timothy Chen commented on MESOS-2706:
--------------------------------------

This seems like it could be a Mesos problem; I can try to reproduce it next week.
[jira] [Commented] (MESOS-2706) When the docker-tasks grow, the time spare between Queuing task and Starting container grows
[ https://issues.apache.org/jira/browse/MESOS-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14534755#comment-14534755 ]

Marco Massenzio commented on MESOS-2706:
-----------------------------------------

[~drexin], can you please look into this? We think it was fixed for the //Build demo, but this may be something different.