[jira] [Commented] (MESOS-4492) Add metrics for {RESERVE, UNRESERVE} and {CREATE, DESTROY} offer operation
[ https://issues.apache.org/jira/browse/MESOS-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182695#comment-15182695 ] Fan Du commented on MESOS-4492: --- [~greggomann] I see that this ticket has not been accepted by a committer yet. Could you please help with that, so that I can then update the JIRA workflow? One more question: since you have already given a "ship it", what do I need to do before [~jieyu] merges the patch? Thanks a lot for reviewing :) > Add metrics for {RESERVE, UNRESERVE} and {CREATE, DESTROY} offer operation > -- > > Key: MESOS-4492 > URL: https://issues.apache.org/jira/browse/MESOS-4492 > Project: Mesos > Issue Type: Improvement > Components: master >Reporter: Fan Du >Assignee: Fan Du >Priority: Minor > > This ticket aims to let users and operators inspect statistics for operations > such as RESERVE, UNRESERVE, CREATE and DESTROY; the current implementation only > supports LAUNCH. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4881) Rescind all outstanding offers after changing some weights.
Yongqiao Wang created MESOS-4881: Summary: Rescind all outstanding offers after changing some weights. Key: MESOS-4881 URL: https://issues.apache.org/jira/browse/MESOS-4881 Project: Mesos Issue Type: Task Reporter: Yongqiao Wang Assignee: Yongqiao Wang
[jira] [Commented] (MESOS-2043) framework auth fail with timeout error and never get authenticated
[ https://issues.apache.org/jira/browse/MESOS-2043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182680#comment-15182680 ] Adam B commented on MESOS-2043: --- I suspect that we need to actually implement the backoff/limit on authentication retries that is referenced in TODOs in the agent and scheduler driver code: https://github.com/apache/mesos/blob/0.28.0-rc1/src/slave/slave.cpp#L811 https://github.com/apache/mesos/blob/0.28.0-rc1/src/slave/slave.cpp#L916 https://github.com/apache/mesos/blob/0.28.0-rc1/src/sched/sched.cpp#L331 https://github.com/apache/mesos/blob/0.28.0-rc1/src/sched/sched.cpp#L459 cc [~vinodkone] > framework auth fail with timeout error and never get authenticated > -- > > Key: MESOS-2043 > URL: https://issues.apache.org/jira/browse/MESOS-2043 > Project: Mesos > Issue Type: Bug > Components: master, scheduler driver, security, slave >Affects Versions: 0.21.0 >Reporter: Bhuvan Arumugam >Priority: Critical > Labels: mesosphere, security > Attachments: aurora-scheduler.20141104-1606-1706.log, > mesos-master.20141104-1606-1706.log > > > I'm facing this issue in master as of > https://github.com/apache/mesos/commit/74ea59e144d131814c66972fb0cc14784d3503d4 > As [~adam-mesos] mentioned in IRC, this sounds similar to MESOS-1866. I'm > running 1 master and 1 scheduler (aurora). 
The framework authentication fails > due to a timeout: > Error on the Mesos master: > {code} > I1104 19:37:17.741449 8329 master.cpp:3874] Authenticating > scheduler-d2d4437b-d375-4467-a583-362152fe065a@SCHEDULER_IP:8083 > I1104 19:37:17.741585 8329 master.cpp:3885] Using default CRAM-MD5 > authenticator > I1104 19:37:17.742106 8336 authenticator.hpp:169] Creating new server SASL > connection > W1104 19:37:22.742959 8329 master.cpp:3953] Authentication timed out > W1104 19:37:22.743548 8329 master.cpp:3930] Failed to authenticate > scheduler-d2d4437b-d375-4467-a583-362152fe065a@SCHEDULER_IP:8083: > Authentication discarded > {code} > Scheduler error: > {code} > I1104 19:38:57.885486 49012 sched.cpp:283] Authenticating with master > master@MASTER_IP:PORT > I1104 19:38:57.885928 49002 authenticatee.hpp:133] Creating new client SASL > connection > I1104 19:38:57.890581 49007 authenticatee.hpp:224] Received SASL > authentication mechanisms: CRAM-MD5 > I1104 19:38:57.890656 49007 authenticatee.hpp:250] Attempting to authenticate > with mechanism 'CRAM-MD5' > W1104 19:39:02.891196 49005 sched.cpp:378] Authentication timed out > I1104 19:39:02.891850 49018 sched.cpp:338] Failed to authenticate with master > master@MASTER_IP:PORT: Authentication discarded > {code} > It looks like 2 instances {{scheduler-20f88a53-5945-4977-b5af-28f6c52d3c94}} & > {{scheduler-d2d4437b-d375-4467-a583-362152fe065a}} of the same framework are > trying to authenticate and failing. > {code} > W1104 19:36:30.769420 8319 master.cpp:3930] Failed to authenticate > scheduler-20f88a53-5945-4977-b5af-28f6c52d3c94@SCHEDULER_IP:8083: Failed to > communicate with authenticatee > I1104 19:36:42.701441 8328 master.cpp:3860] Queuing up authentication > request from scheduler-d2d4437b-d375-4467-a583-362152fe065a@SCHEDULER_IP:8083 > because authentication is still in progress > {code} > Restarting the master and scheduler didn't fix it. > This particular issue happens with 1 master and 1 scheduler after MESOS-1866 > was fixed.
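The backoff/limit on authentication retries suggested in the MESOS-2043 comment could be sketched roughly as below. This is only an illustration in Python, not the Mesos C++ code behind the linked TODOs: the names `backoff_delays` and `authenticate_with_retries`, the default values, and the choice of exponential backoff with random jitter are all assumptions made for the example.

```python
import random


def backoff_delays(base=1.0, factor=2.0, max_delay=60.0, max_retries=5,
                   jitter=None):
    """Yield the delay (in seconds) to wait before each authentication retry.

    Only max_retries delays are produced, so the caller eventually gives up
    instead of retrying forever. All defaults are illustrative assumptions.
    """
    jitter = jitter if jitter is not None else random.random
    delay = base
    for _ in range(max_retries):
        # Scale the delay by a random factor in [0, 1) so that clients which
        # timed out together do not all retry at the same moment.
        yield delay * jitter()
        # Grow the delay exponentially, but never beyond max_delay.
        delay = min(delay * factor, max_delay)


def authenticate_with_retries(authenticate, sleep, **backoff_options):
    """Call authenticate() until it returns True or the retry limit is hit.

    `authenticate` and `sleep` are passed in by the caller (hypothetical
    hooks), which keeps this sketch independent of any real driver code.
    """
    if authenticate():
        return True
    for delay in backoff_delays(**backoff_options):
        sleep(delay)
        if authenticate():
            return True
    return False
```

A caller would pass its own attempt function and `time.sleep`, for example `authenticate_with_retries(try_once, time.sleep, max_retries=3)`, where `try_once` is whatever performs a single authentication attempt and returns a boolean.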
[jira] [Updated] (MESOS-4316) Support get non-default weights by /weights
[ https://issues.apache.org/jira/browse/MESOS-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongqiao Wang updated MESOS-4316: - Shepherd: Adam B > Support get non-default weights by /weights > --- > > Key: MESOS-4316 > URL: https://issues.apache.org/jira/browse/MESOS-4316 > Project: Mesos > Issue Type: Task >Reporter: Yongqiao Wang >Assignee: Yongqiao Wang >Priority: Minor > > As with /quota, we should also add query logic for /weights to stay consistent. > Then /roles no longer needs to show weight information.
[jira] [Updated] (MESOS-2043) framework auth fail with timeout error and never get authenticated
[ https://issues.apache.org/jira/browse/MESOS-2043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-2043: -- Component/s: slave security scheduler driver > framework auth fail with timeout error and never get authenticated > -- > > Key: MESOS-2043 > URL: https://issues.apache.org/jira/browse/MESOS-2043 > Project: Mesos > Issue Type: Bug > Components: master, scheduler driver, security, slave >Affects Versions: 0.21.0 >Reporter: Bhuvan Arumugam >Priority: Critical > Labels: mesosphere, security > Attachments: aurora-scheduler.20141104-1606-1706.log, > mesos-master.20141104-1606-1706.log
[jira] [Updated] (MESOS-2043) framework auth fail with timeout error and never get authenticated
[ https://issues.apache.org/jira/browse/MESOS-2043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-2043: -- Priority: Critical (was: Major) > framework auth fail with timeout error and never get authenticated > -- > > Key: MESOS-2043 > URL: https://issues.apache.org/jira/browse/MESOS-2043 > Project: Mesos > Issue Type: Bug > Components: master, scheduler driver, security, slave >Affects Versions: 0.21.0 >Reporter: Bhuvan Arumugam >Priority: Critical > Labels: mesosphere, security > Attachments: aurora-scheduler.20141104-1606-1706.log, > mesos-master.20141104-1606-1706.log
[jira] [Commented] (MESOS-4316) Support get non-default weights by /weights
[ https://issues.apache.org/jira/browse/MESOS-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182668#comment-15182668 ] Yongqiao Wang commented on MESOS-4316: -- Only add the GET for /weights in the same release as the PUT/POST. We can't change what's displayed in /roles without going through a deprecation cycle, but we can show the weights in both places for now. > Support get non-default weights by /weights > --- > > Key: MESOS-4316 > URL: https://issues.apache.org/jira/browse/MESOS-4316 > Project: Mesos > Issue Type: Task >Reporter: Yongqiao Wang >Assignee: Yongqiao Wang >Priority: Minor > > As with /quota, we should also add query logic for /weights to stay consistent. > Then /roles no longer needs to show weight information.
[jira] [Updated] (MESOS-2043) framework auth fail with timeout error and never get authenticated
[ https://issues.apache.org/jira/browse/MESOS-2043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-2043: -- Labels: mesosphere security (was: security) > framework auth fail with timeout error and never get authenticated > -- > > Key: MESOS-2043 > URL: https://issues.apache.org/jira/browse/MESOS-2043 > Project: Mesos > Issue Type: Bug > Components: master >Affects Versions: 0.21.0 >Reporter: Bhuvan Arumugam > Labels: mesosphere, security > Attachments: aurora-scheduler.20141104-1606-1706.log, > mesos-master.20141104-1606-1706.log
[jira] [Commented] (MESOS-3368) Add device support in cgroups abstraction
[ https://issues.apache.org/jira/browse/MESOS-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182646#comment-15182646 ] Abhishek Dasgupta commented on MESOS-3368: -- Please review: https://reviews.apache.org/r/44439/ > Add device support in cgroups abstraction > - > > Key: MESOS-3368 > URL: https://issues.apache.org/jira/browse/MESOS-3368 > Project: Mesos > Issue Type: Task >Reporter: Niklas Quarfot Nielsen >Assignee: Abhishek Dasgupta > > Add support for [device > cgroups|https://www.kernel.org/doc/Documentation/cgroup-v1/devices.txt] to > aid isolators controlling access to devices. > In the future, we could think about how to enumerate and control access to > devices as a resource or as task/container policy.
[jira] [Commented] (MESOS-4877) Mesos containerizer can't handle top level docker image like "alpine" (must use "library/alpine")
[ https://issues.apache.org/jira/browse/MESOS-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182596#comment-15182596 ] Guangya Liu commented on MESOS-4877: Yes, that makes sense. Do you know how Docker handles this case with a local registry? Does it still add a prefix, or does it just fail? > Mesos containerizer can't handle top level docker image like "alpine" (must > use "library/alpine") > - > > Key: MESOS-4877 > URL: https://issues.apache.org/jira/browse/MESOS-4877 > Project: Mesos > Issue Type: Bug > Components: containerization, docker >Affects Versions: 0.27.0, 0.27.1 >Reporter: Shuai Lin >Assignee: Shuai Lin > > This can be demonstrated with the {{mesos-execute}} command: > # Docker containerizer with image {{alpine}}: success > {code} > sudo ./build/src/mesos-execute --docker_image=alpine --containerizer=docker > --name=just-a-test --command="sleep 1000" --master=localhost:5050 > {code} > # Mesos containerizer with image {{alpine}}: failure > {code} > sudo ./build/src/mesos-execute --docker_image=alpine --containerizer=mesos > --name=just-a-test --command="sleep 1000" --master=localhost:5050 > {code} > # Mesos containerizer with image {{library/alpine}}: success > {code} > sudo ./build/src/mesos-execute --docker_image=library/alpine > --containerizer=mesos --name=just-a-test --command="sleep 1000" > --master=localhost:5050 > {code} > In the slave logs: > {code} > ea-4460-83 > 9c-838da86af34c-0007' > I0306 16:32:41.418269 3403 metadata_manager.cpp:159] Looking for image > 'alpine:latest' > I0306 16:32:41.418699 3403 registry_puller.cpp:194] Pulling image > 'alpine:latest' from > 'docker-manifest://registry-1.docker.io:443alpine?latest#https' to > '/tmp/mesos-test > /store/docker/staging/ka7MlQ' > E0306 16:32:43.098131 3400 slave.cpp:3773] Container > '4bf9132d-9a57-4baa-a78c-e7164e93ace6' for executor 'just-a-test' of > framework 4f055c6f-1bea-4460-839c-838da86af34c-0 > 007 failed to start: Collect failed: Unexpected 
HTTP response '401 > Unauthorized > {code} > curl command executed: > {code} > $ sudo sysdig -A -p "*%evt.time %proc.cmdline" evt.type=execve and > proc.name=curl >16:42:53.198998042 curl -s -S -L -D - > https://registry-1.docker.io:443/v2/alpine/manifests/latest > 16:42:53.784958541 curl -s -S -L -D - > https://auth.docker.io/token?service=registry.docker.io=repository:alpine:pull > 16:42:54.294192024 curl -s -S -L -D - -H Authorization: Bearer > eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCIsIng1YyI6WyJNSUlDTHpDQ0FkU2dBd0lCQWdJQkFEQUtCZ2dxaGtqT1BRUURBakJHTVVRd1FnWURWUVFERXp0Uk5Gb3pPa2RYTjBrNldGUlFSRHBJVFRSUk9rOVVWRmc2TmtGRlF6cFNUVE5ET2tGU01rTTZUMFkzTnpwQ1ZrVkJPa2xHUlVrNlExazFTekFlRncweE5UQTJNalV4T1RVMU5EWmFGdzB4TmpBMk1qUXhPVFUxTkRaYU1FWXhSREJDQmdOVkJBTVRPMGhHU1UwNldGZFZWam8yUVZkSU9sWlpUVEk2TTFnMVREcFNWREkxT2s5VFNrbzZTMVExUmpwWVRsSklPbFJMTmtnNlMxUkxOanBCUVV0VU1Ga3dFd1lIS29aSXpqMENBUVlJS29aSXpqMERBUWNEUWdBRXl2UzIvdEI3T3JlMkVxcGRDeFdtS1NqV1N2VmJ2TWUrWGVFTUNVMDByQjI0akNiUVhreFdmOSs0MUxQMlZNQ29BK0RMRkIwVjBGZGdwajlOWU5rL2pxT0JzakNCcnpBT0JnTlZIUThCQWY4RUJBTUNBSUF3RHdZRFZSMGxCQWd3QmdZRVZSMGxBREJFQmdOVkhRNEVQUVE3U0VaSlRUcFlWMVZXT2paQlYwZzZWbGxOTWpveldEVk1PbEpVTWpVNlQxTktTanBMVkRWR09saE9Va2c2VkVzMlNEcExWRXMyT2tGQlMxUXdSZ1lEVlIwakJEOHdQWUE3VVRSYU16cEhWemRKT2xoVVVFUTZTRTAwVVRwUFZGUllPalpCUlVNNlVrMHpRenBCVWpKRE9rOUdOemM2UWxaRlFUcEpSa1ZKT2tOWk5Vc3dDZ1lJS29aSXpqMEVBd0lEU1FBd1JnSWhBTXZiT2h4cHhrTktqSDRhMFBNS0lFdXRmTjZtRDFvMWs4ZEJOVGxuWVFudkFpRUF0YVJGSGJSR2o4ZlVSSzZ4UVJHRURvQm1ZZ3dZelR3Z3BMaGJBZzNOUmFvPSJdfQ.eyJhY2Nlc3MiOltdLCJhdWQiOiJyZWdpc3RyeS5kb2NrZXIuaW8iLCJleHAiOjE0NTcyODI4NzQsImlhdCI6MTQ1NzI4MjU3NCwiaXNzIjoiYXV0aC5kb2NrZXIuaW8iLCJqdGkiOiJaOGtyNXZXNEJMWkNIRS1IcVJIaCIsIm5iZiI6MTQ1NzI4MjU3NCwic3ViIjoiIn0.C2wtJq_P-m0buPARhmQjDfh6ztIAhcvgN3tfWIZEClSgXlVQ_sAQXAALNZKwAQL2Chj7NpHX--0GW-aeL_28Aw > https://registry-1.docker.io:443/v2/alpine/manifests/latest > {code} > Also got the same result with {{ubuntu}} docker image. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2684) mesos-slave should not abort when a single task has e.g. a 'mkdir' failure
[ https://issues.apache.org/jira/browse/MESOS-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182557#comment-15182557 ] wangxiaosen commented on MESOS-2684: Hi, take a look at /etc/cron.daily/tmpwatch: files in /tmp that are more than 10 days old get deleted. Change the limit to 200d or add -x /tmp/mesos to the exclusions: /usr/sbin/tmpwatch "$flags" -x /tmp/.X11-unix -x /tmp/.XIM-unix \ -x /tmp/.font-unix -x /tmp/.ICE-unix -x /tmp/.Test-unix \ -X '/tmp/hsperfdata_*' 10d /tmp > mesos-slave should not abort when a single task has e.g. a 'mkdir' failure > -- > > Key: MESOS-2684 > URL: https://issues.apache.org/jira/browse/MESOS-2684 > Project: Mesos > Issue Type: Bug > Components: docker, slave >Affects Versions: 0.21.1 >Reporter: Steven Schlansker > Attachments: mesos-slave-restart.txt > > > mesos-slave can encounter a variety of problems while attempting to launch a > task. If the task fails, that is unfortunate, but not the end of the world. > Other tasks should not be affected. > However, if the task failure happens to trigger an assertion, the entire > slave comes crashing down: > F0501 19:10:46.095464 1705 paths.hpp:342] CHECK_SOME(mkdir): No space left > on device Failed to create executor directory > '/mnt/mesos/slaves/20150327-194449-419644938-5050-1649-S71/frameworks/Singularity/executors/pp-gc-eventlog-teamcity.2015.03.31T23.55.14-1430507446029-2-10.70.8.160-us_west_2b/runs/95a54aeb-322c-48e9-9f6f-5b359bccbc01' > Immediately afterwards, all tasks on this slave were declared TASK_KILLED > when mesos-slave restarted. > Something as simple as a 'mkdir' failing is not worthy of an assertion > failure.
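The behaviour MESOS-2684 asks for, failing only the affected task when the executor directory cannot be created rather than aborting the whole slave, might look something like the following. This is a Python sketch of the idea only, not the Mesos C++ code in paths.hpp; `create_executor_dir`, `launch_task` and the returned status dictionary are hypothetical names chosen for the example.

```python
import os


def create_executor_dir(path):
    """Try to create the executor directory.

    Returns (True, None) on success or (False, reason) on failure, so the
    caller decides what to do instead of the process asserting and exiting.
    """
    try:
        os.makedirs(path, exist_ok=True)
        return True, None
    except OSError as e:
        # Covers e.g. ENOSPC ("No space left on device") and EACCES.
        reason = "Failed to create executor directory '%s': %s" % (
            path, e.strerror)
        return False, reason


def launch_task(task_id, directory):
    """Hypothetical launch step that reports a per-task failure.

    Returns a status dict for the failed task, or None when the directory
    exists and the rest of the launch could continue.
    """
    ok, reason = create_executor_dir(directory)
    if not ok:
        return {"task_id": task_id, "state": "TASK_FAILED", "message": reason}
    return None
```

The point of the design is that the error travels back to the caller as a value, so one task can be marked as failed while other tasks on the same agent keep running.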
[jira] [Commented] (MESOS-4877) Mesos containerizer can't handle top level docker image like "alpine" (must use "library/alpine")
[ https://issues.apache.org/jira/browse/MESOS-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182551#comment-15182551 ] Shuai Lin commented on MESOS-4877: -- Yeah, we need to think about what to do if the user is using a local registry. But I still think this needs to be fixed on the Mesos side, because it could confuse users. They may ask: I can use the "ubuntu" image with the docker containerizer (or the docker command line), so why do I have to change the name to "library/ubuntu" when using the mesos containerizer? > Mesos containerizer can't handle top level docker image like "alpine" (must > use "library/alpine") > - > > Key: MESOS-4877 > URL: https://issues.apache.org/jira/browse/MESOS-4877 > Project: Mesos > Issue Type: Bug > Components: containerization, docker >Affects Versions: 0.27.0, 0.27.1 >Reporter: Shuai Lin >Assignee: Shuai Lin
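One possible shape for the Mesos-side fix discussed in MESOS-4877 is to normalize top-level image names before building the manifest URL, as sketched below in Python. The function names, the set of Docker Hub host names, and in particular the decision to leave names untouched for any other (for example local) registry are assumptions for illustration; the thread leaves the local-registry question open, and this is not the actual registry puller code.

```python
# Host names treated as Docker Hub in this sketch (an assumption).
DOCKER_HUB_HOSTS = {"", "docker.io", "index.docker.io", "registry-1.docker.io"}


def normalize_repository(repository, registry=""):
    """Prefix top-level Docker Hub images with 'library/'.

    'alpine' becomes 'library/alpine'. Names that already contain a
    namespace, or that live on another registry, are returned unchanged.
    """
    host = registry.split(":")[0]  # ignore an optional ':port' suffix
    if host in DOCKER_HUB_HOSTS and "/" not in repository:
        return "library/" + repository
    return repository


def manifest_url(repository, tag="latest", registry="registry-1.docker.io"):
    """Build a v2 manifest URL using the normalized repository name."""
    return "https://%s/v2/%s/manifests/%s" % (
        registry, normalize_repository(repository, registry), tag)
```

With this sketch, `manifest_url("alpine")` points at `/v2/library/alpine/manifests/latest` instead of the `/v2/alpine/manifests/latest` path seen in the curl trace, while `normalize_repository("alpine", "localhost:5000")` stays `alpine`.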
[jira] [Commented] (MESOS-4878) Task stuck in TASK_STAGING when docker fetcher failed to fetch the image
[ https://issues.apache.org/jira/browse/MESOS-4878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182545#comment-15182545 ] Shuai Lin commented on MESOS-4878: -- Thanks for the reply! But the problem is not "slow to pull docker images" but "failed to pull the image" (due to [MESOS-4877]). See my RR and the newly added test for details. > Task stuck in TASK_STAGING when docker fetcher failed to fetch the image > > > Key: MESOS-4878 > URL: https://issues.apache.org/jira/browse/MESOS-4878 > Project: Mesos > Issue Type: Bug > Components: containerization, docker >Affects Versions: 0.27.0, 0.27.1 >Reporter: Shuai Lin >Assignee: Shuai Lin > > When a task is launched with the mesos containerizer and a docker image, if > the docker fetcher failed to pull the image, no more task updates are sent to > the scheduler. > {code} > I0306 17:28:57.627169 17647 registry_puller.cpp:194] Pulling image > 'alpine:latest' from > 'docker-manifest://registry-1.docker.io:443alpine?latest#https' to > '/tmp/mesos-test/store/docker/staging/V2dqJv' > E0306 17:29:00.749889 17651 slave.cpp:3773] Container > '6b98026b-a58d-434c-9432-b517012edc35' for executor 'just-a-test' of > framework a4ff93ba-2141-48e2-92a9-7354e4028282- failed to start: Collect > failed: Unexpected HTTP response '401 Unauthorized' when trying to get the > manifest > I0306 17:29:00.751579 17646 containerizer.cpp:1392] Destroying container > '6b98026b-a58d-434c-9432-b517012edc35' > I0306 17:29:00.752188 17646 containerizer.cpp:1395] Waiting for the isolators > to complete preparing before destroying the container > I0306 17:29:57.618649 17649 slave.cpp:4322] Terminating executor > ''just-a-test' of framework a4ff93ba-2141-48e2-92a9-73 > {code} > Scheduler logs: > {code} > sudo ./build/src/mesos-execute --docker_image=alpine:latest > --containerizer=mesos --name=just-a-test --command="sleep 1000" > --master=33.33.33.33:5050 > WARNING: Logging before InitGoogleLogging() is written to STDERR > W0306 
17:28:57.491081 17740 sched.cpp:1642] > ** > Scheduler driver bound to loopback interface! Cannot communicate with remote > master(s). You might want to set 'LIBPROCESS_IP' environment variable to use > a routable IP address. > ** > I0306 17:28:57.498028 17740 sched.cpp:222] Version: 0.29.0 > I0306 17:28:57.533071 17761 sched.cpp:326] New master detected at > master@33.33.33.33:5050 > I0306 17:28:57.536761 17761 sched.cpp:336] No credentials provided. > Attempting to register without authentication > I0306 17:28:57.557729 17759 sched.cpp:703] Framework registered with > a4ff93ba-2141-48e2-92a9-7354e4028282- > Framework registered with a4ff93ba-2141-48e2-92a9-7354e4028282- > task just-a-test submitted to slave a4ff93ba-2141-48e2-92a9-7354e4028282-S0 > {code}
[jira] [Updated] (MESOS-4878) Task stuck in TASK_STAGING when docker fetcher failed to fetch the image
[ https://issues.apache.org/jira/browse/MESOS-4878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuai Lin updated MESOS-4878: - Component/s: docker containerization > Task stuck in TASK_STAGING when docker fetcher failed to fetch the image > > > Key: MESOS-4878 > URL: https://issues.apache.org/jira/browse/MESOS-4878 > Project: Mesos > Issue Type: Bug > Components: containerization, docker >Affects Versions: 0.27.0, 0.27.1 >Reporter: Shuai Lin >Assignee: Shuai Lin
[jira] [Updated] (MESOS-4877) Mesos containerizer can't handle top level docker image like "alpine" (must use "library/alpine")
[ https://issues.apache.org/jira/browse/MESOS-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuai Lin updated MESOS-4877: - Component/s: docker containerization > Mesos containerizer can't handle top level docker image like "alpine" (must > use "library/alpine") > - > > Key: MESOS-4877 > URL: https://issues.apache.org/jira/browse/MESOS-4877 > Project: Mesos > Issue Type: Bug > Components: containerization, docker >Affects Versions: 0.27.0, 0.27.1 >Reporter: Shuai Lin >Assignee: Shuai Lin > > This can be demonstrated with the {{mesos-execute}} command: > # Docker containerizer with image {{alpine}}: success > {code} > sudo ./build/src/mesos-execute --docker_image=alpine --containerizer=docker > --name=just-a-test --command="sleep 1000" --master=localhost:5050 > {code} > # Mesos containerizer with image {{alpine}}: failure > {code} > sudo ./build/src/mesos-execute --docker_image=alpine --containerizer=mesos > --name=just-a-test --command="sleep 1000" --master=localhost:5050 > {code} > # Mesos containerizer with image {{library/alpine}}: success > {code} > sudo ./build/src/mesos-execute --docker_image=library/alpine > --containerizer=mesos --name=just-a-test --command="sleep 1000" > --master=localhost:5050 > {code} > In the slave logs: > {code} > ea-4460-83 > 9c-838da86af34c-0007' > I0306 16:32:41.418269 3403 metadata_manager.cpp:159] Looking for image > 'alpine:latest' > I0306 16:32:41.418699 3403 registry_puller.cpp:194] Pulling image > 'alpine:latest' from > 'docker-manifest://registry-1.docker.io:443alpine?latest#https' to > '/tmp/mesos-test > /store/docker/staging/ka7MlQ' > E0306 16:32:43.098131 3400 slave.cpp:3773] Container > '4bf9132d-9a57-4baa-a78c-e7164e93ace6' for executor 'just-a-test' of > framework 4f055c6f-1bea-4460-839c-838da86af34c-0 > 007 failed to start: Collect failed: Unexpected HTTP response '401 > Unauthorized > {code} > curl command executed: > {code} > $ sudo sysdig -A -p "*%evt.time %proc.cmdline" evt.type=execve 
and > proc.name=curl >16:42:53.198998042 curl -s -S -L -D - > https://registry-1.docker.io:443/v2/alpine/manifests/latest > 16:42:53.784958541 curl -s -S -L -D - > https://auth.docker.io/token?service=registry.docker.io=repository:alpine:pull > 16:42:54.294192024 curl -s -S -L -D - -H Authorization: Bearer > eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCIsIng1YyI6WyJNSUlDTHpDQ0FkU2dBd0lCQWdJQkFEQUtCZ2dxaGtqT1BRUURBakJHTVVRd1FnWURWUVFERXp0Uk5Gb3pPa2RYTjBrNldGUlFSRHBJVFRSUk9rOVVWRmc2TmtGRlF6cFNUVE5ET2tGU01rTTZUMFkzTnpwQ1ZrVkJPa2xHUlVrNlExazFTekFlRncweE5UQTJNalV4T1RVMU5EWmFGdzB4TmpBMk1qUXhPVFUxTkRaYU1FWXhSREJDQmdOVkJBTVRPMGhHU1UwNldGZFZWam8yUVZkSU9sWlpUVEk2TTFnMVREcFNWREkxT2s5VFNrbzZTMVExUmpwWVRsSklPbFJMTmtnNlMxUkxOanBCUVV0VU1Ga3dFd1lIS29aSXpqMENBUVlJS29aSXpqMERBUWNEUWdBRXl2UzIvdEI3T3JlMkVxcGRDeFdtS1NqV1N2VmJ2TWUrWGVFTUNVMDByQjI0akNiUVhreFdmOSs0MUxQMlZNQ29BK0RMRkIwVjBGZGdwajlOWU5rL2pxT0JzakNCcnpBT0JnTlZIUThCQWY4RUJBTUNBSUF3RHdZRFZSMGxCQWd3QmdZRVZSMGxBREJFQmdOVkhRNEVQUVE3U0VaSlRUcFlWMVZXT2paQlYwZzZWbGxOTWpveldEVk1PbEpVTWpVNlQxTktTanBMVkRWR09saE9Va2c2VkVzMlNEcExWRXMyT2tGQlMxUXdSZ1lEVlIwakJEOHdQWUE3VVRSYU16cEhWemRKT2xoVVVFUTZTRTAwVVRwUFZGUllPalpCUlVNNlVrMHpRenBCVWpKRE9rOUdOemM2UWxaRlFUcEpSa1ZKT2tOWk5Vc3dDZ1lJS29aSXpqMEVBd0lEU1FBd1JnSWhBTXZiT2h4cHhrTktqSDRhMFBNS0lFdXRmTjZtRDFvMWs4ZEJOVGxuWVFudkFpRUF0YVJGSGJSR2o4ZlVSSzZ4UVJHRURvQm1ZZ3dZelR3Z3BMaGJBZzNOUmFvPSJdfQ.eyJhY2Nlc3MiOltdLCJhdWQiOiJyZWdpc3RyeS5kb2NrZXIuaW8iLCJleHAiOjE0NTcyODI4NzQsImlhdCI6MTQ1NzI4MjU3NCwiaXNzIjoiYXV0aC5kb2NrZXIuaW8iLCJqdGkiOiJaOGtyNXZXNEJMWkNIRS1IcVJIaCIsIm5iZiI6MTQ1NzI4MjU3NCwic3ViIjoiIn0.C2wtJq_P-m0buPARhmQjDfh6ztIAhcvgN3tfWIZEClSgXlVQ_sAQXAALNZKwAQL2Chj7NpHX--0GW-aeL_28Aw > https://registry-1.docker.io:443/v2/alpine/manifests/latest > {code} > Also got the same result with {{ubuntu}} docker image. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
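The three curl calls captured above follow the Docker Registry v2 token flow: request the manifest, fetch a bearer token from the auth server, then retry the manifest request with that token. The sketch below only builds the two URLs so the effect of the repository name is easy to see. It is a minimal Python illustration, not the Mesos registry puller: the function names are hypothetical, and the auth server and service name are passed in as plain parameters (a real client would normally take them from the WWW-Authenticate header of the 401 response).

```python
from urllib.parse import urlencode


def manifest_url(registry: str, repository: str, tag: str) -> str:
    """Build the v2 manifest URL for a repository and tag (hypothetical helper)."""
    return f"https://{registry}/v2/{repository}/manifests/{tag}"


def token_url(auth_server: str, service: str, repository: str) -> str:
    """Build the token request URL asking for pull access to one repository."""
    query = urlencode(
        {"service": service, "scope": f"repository:{repository}:pull"},
        safe=":/",  # keep ':' and '/' readable instead of percent-encoding them
    )
    return f"https://{auth_server}/token?{query}"


if __name__ == "__main__":
    # Compare the bare name with the library/-prefixed one from the report.
    for repo in ("alpine", "library/alpine"):
        print(manifest_url("registry-1.docker.io:443", repo, "latest"))
        print(token_url("auth.docker.io", "registry.docker.io", repo))
```

With the bare name the token scope refers to the repository alpine, while the official image is published under library/alpine, which matches the failure described in this report.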
[jira] [Updated] (MESOS-4354) Implement isolator for Docker network
[ https://issues.apache.org/jira/browse/MESOS-4354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guangya Liu updated MESOS-4354: --- Assignee: Srinivas (was: Qian Zhang) > Implement isolator for Docker network > - > > Key: MESOS-4354 > URL: https://issues.apache.org/jira/browse/MESOS-4354 > Project: Mesos > Issue Type: Improvement > Components: docker, isolation >Reporter: Qian Zhang >Assignee: Srinivas > > In Docker, a user can create a network with the Docker CLI, e.g., {{docker network create my-network}}. We need to implement an isolator so that containers launched by MesosContainerizer can use such a network. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-4355) Implement isolator for Docker volume
[ https://issues.apache.org/jira/browse/MESOS-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guangya Liu reassigned MESOS-4355: -- Assignee: Guangya Liu (was: Qian Zhang) > Implement isolator for Docker volume > > > Key: MESOS-4355 > URL: https://issues.apache.org/jira/browse/MESOS-4355 > Project: Mesos > Issue Type: Improvement > Components: docker, isolation >Reporter: Qian Zhang >Assignee: Guangya Liu > > In Docker, a user can create a volume with the Docker CLI, e.g., {{docker volume create --name my-volume}}. We need to implement an isolator so that containers launched by MesosContainerizer can use such a volume. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4880) Create test filter for swap
Zhitao Li created MESOS-4880: Summary: Create test filter for swap Key: MESOS-4880 URL: https://issues.apache.org/jira/browse/MESOS-4880 Project: Mesos Issue Type: Bug Components: testing, tests Reporter: Zhitao Li Priority: Minor The following tests fail when swap is enabled: CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_Listen and CgroupsAnyHierarchyMemoryPressureTest.ROOT_IncreaseRSS. They should be automatically skipped with a test filter like ROOT/CGROUPS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
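A filter of this kind first has to decide whether swap is enabled on the host. The sketch below is a minimal Python illustration of one way to check that on Linux by reading the SwapTotal field of /proc/meminfo. It is not the Mesos test-filter code (which is C++), and the function names swap_total_kb and swap_enabled are hypothetical.

```python
import re
from pathlib import Path


def swap_total_kb(meminfo_text: str) -> int:
    """Return the SwapTotal value in kB from /proc/meminfo-style text, or 0 if absent."""
    match = re.search(r"^SwapTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) if match else 0


def swap_enabled(meminfo_path: str = "/proc/meminfo") -> bool:
    """True if the host reports a non-zero amount of swap (Linux only)."""
    try:
        return swap_total_kb(Path(meminfo_path).read_text()) > 0
    except OSError:
        # The file is missing or unreadable (e.g. non-Linux host):
        # report swap as not detected rather than raising.
        return False
```

A test runner could call swap_enabled() once at startup and skip the swap-sensitive tests when it returns True.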
[jira] [Updated] (MESOS-4879) Update glog patch to support PowerPC LE
[ https://issues.apache.org/jira/browse/MESOS-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Zhiwei updated MESOS-4879: --- Description: This is part of the PowerPC LE porting effort > Update glog patch to support PowerPC LE > -- > > Key: MESOS-4879 > URL: https://issues.apache.org/jira/browse/MESOS-4879 > Project: Mesos > Issue Type: Improvement >Reporter: Chen Zhiwei >Assignee: Chen Zhiwei > > This is part of the PowerPC LE porting effort -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4879) Update glog patch to support PowerPC LE
[ https://issues.apache.org/jira/browse/MESOS-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182464#comment-15182464 ] Chen Zhiwei commented on MESOS-4879: https://reviews.apache.org/r/44252/ > Update glog patch to support PowerPC LE > -- > > Key: MESOS-4879 > URL: https://issues.apache.org/jira/browse/MESOS-4879 > Project: Mesos > Issue Type: Improvement >Reporter: Chen Zhiwei >Assignee: Chen Zhiwei > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4879) Update glog patch to support PowerPC LE
Chen Zhiwei created MESOS-4879: -- Summary: Update glog patch to support PowerPC LE Key: MESOS-4879 URL: https://issues.apache.org/jira/browse/MESOS-4879 Project: Mesos Issue Type: Improvement Reporter: Chen Zhiwei Assignee: Chen Zhiwei -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4877) Mesos containerizer can't handle top level docker image like "alpine" (must use "library/alpine")
[ https://issues.apache.org/jira/browse/MESOS-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182394#comment-15182394 ] Guangya Liu commented on MESOS-4877: {quote} I think we can just add a library/ prefix to the image name if both of the following conditions are true: the image name doesn't have a slash in the ImageReference.repository part, and the image host is empty or the same as the one specified in --docker_registry {quote} This solution only fits the case where the end user pulls from the remote Docker registry. It cannot handle a local registry, because images there may not always have {{library/}} as a prefix. What about clarifying this in the documentation? > Mesos containerizer can't handle top level docker image like "alpine" (must > use "library/alpine") > - > > Key: MESOS-4877 > URL: https://issues.apache.org/jira/browse/MESOS-4877 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.27.0, 0.27.1 >Reporter: Shuai Lin >Assignee: Shuai Lin > > This can be demonstrated with the {{mesos-execute}} command: > # Docker containerizer with image {{alpine}}: success > {code} > sudo ./build/src/mesos-execute --docker_image=alpine --containerizer=docker > --name=just-a-test --command="sleep 1000" --master=localhost:5050 > {code} > # Mesos containerizer with image {{alpine}}: failure > {code} > sudo ./build/src/mesos-execute --docker_image=alpine --containerizer=mesos > --name=just-a-test --command="sleep 1000" --master=localhost:5050 > {code} > # Mesos containerizer with image {{library/alpine}}: success > {code} > sudo ./build/src/mesos-execute --docker_image=library/alpine > --containerizer=mesos --name=just-a-test --command="sleep 1000" > --master=localhost:5050 > {code} > In the slave logs: > {code} > ea-4460-83 > 9c-838da86af34c-0007' > I0306 16:32:41.418269 3403 metadata_manager.cpp:159] Looking for image > 'alpine:latest' > I0306 16:32:41.418699 3403 registry_puller.cpp:194] Pulling image > 'alpine:latest' from >
'docker-manifest://registry-1.docker.io:443alpine?latest#https' to > '/tmp/mesos-test > /store/docker/staging/ka7MlQ' > E0306 16:32:43.098131 3400 slave.cpp:3773] Container > '4bf9132d-9a57-4baa-a78c-e7164e93ace6' for executor 'just-a-test' of > framework 4f055c6f-1bea-4460-839c-838da86af34c-0 > 007 failed to start: Collect failed: Unexpected HTTP response '401 > Unauthorized > {code} > curl command executed: > {code} > $ sudo sysdig -A -p "*%evt.time %proc.cmdline" evt.type=execve and > proc.name=curl >16:42:53.198998042 curl -s -S -L -D - > https://registry-1.docker.io:443/v2/alpine/manifests/latest > 16:42:53.784958541 curl -s -S -L -D - > https://auth.docker.io/token?service=registry.docker.io=repository:alpine:pull > 16:42:54.294192024 curl -s -S -L -D - -H Authorization: Bearer > eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCIsIng1YyI6WyJNSUlDTHpDQ0FkU2dBd0lCQWdJQkFEQUtCZ2dxaGtqT1BRUURBakJHTVVRd1FnWURWUVFERXp0Uk5Gb3pPa2RYTjBrNldGUlFSRHBJVFRSUk9rOVVWRmc2TmtGRlF6cFNUVE5ET2tGU01rTTZUMFkzTnpwQ1ZrVkJPa2xHUlVrNlExazFTekFlRncweE5UQTJNalV4T1RVMU5EWmFGdzB4TmpBMk1qUXhPVFUxTkRaYU1FWXhSREJDQmdOVkJBTVRPMGhHU1UwNldGZFZWam8yUVZkSU9sWlpUVEk2TTFnMVREcFNWREkxT2s5VFNrbzZTMVExUmpwWVRsSklPbFJMTmtnNlMxUkxOanBCUVV0VU1Ga3dFd1lIS29aSXpqMENBUVlJS29aSXpqMERBUWNEUWdBRXl2UzIvdEI3T3JlMkVxcGRDeFdtS1NqV1N2VmJ2TWUrWGVFTUNVMDByQjI0akNiUVhreFdmOSs0MUxQMlZNQ29BK0RMRkIwVjBGZGdwajlOWU5rL2pxT0JzakNCcnpBT0JnTlZIUThCQWY4RUJBTUNBSUF3RHdZRFZSMGxCQWd3QmdZRVZSMGxBREJFQmdOVkhRNEVQUVE3U0VaSlRUcFlWMVZXT2paQlYwZzZWbGxOTWpveldEVk1PbEpVTWpVNlQxTktTanBMVkRWR09saE9Va2c2VkVzMlNEcExWRXMyT2tGQlMxUXdSZ1lEVlIwakJEOHdQWUE3VVRSYU16cEhWemRKT2xoVVVFUTZTRTAwVVRwUFZGUllPalpCUlVNNlVrMHpRenBCVWpKRE9rOUdOemM2UWxaRlFUcEpSa1ZKT2tOWk5Vc3dDZ1lJS29aSXpqMEVBd0lEU1FBd1JnSWhBTXZiT2h4cHhrTktqSDRhMFBNS0lFdXRmTjZtRDFvMWs4ZEJOVGxuWVFudkFpRUF0YVJGSGJSR2o4ZlVSSzZ4UVJHRURvQm1ZZ3dZelR3Z3BMaGJBZzNOUmFvPSJdfQ.eyJhY2Nlc3MiOltdLCJhdWQiOiJyZWdpc3RyeS5kb2NrZXIuaW8iLCJleHAiOjE0NTcyODI4NzQsImlhdCI6MTQ1NzI4MjU3NCwiaXNzIjoiYXV0aC5kb2NrZXIuaW8iLCJqdGkiOiJaOGtyNXZX
NEJMWkNIRS1IcVJIaCIsIm5iZiI6MTQ1NzI4MjU3NCwic3ViIjoiIn0.C2wtJq_P-m0buPARhmQjDfh6ztIAhcvgN3tfWIZEClSgXlVQ_sAQXAALNZKwAQL2Chj7NpHX--0GW-aeL_28Aw > https://registry-1.docker.io:443/v2/alpine/manifests/latest > {code} > Also got the same result with {{ubuntu}} docker image. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4878) Task stuck in TASK_STAGING when docker fetcher failed to fetch the image
[ https://issues.apache.org/jira/browse/MESOS-4878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182391#comment-15182391 ] Guangya Liu commented on MESOS-4878: It seems the executor failed to register back because the isolators were not ready. Could you try again with the parameter "--executor_registration_timeout=5mins" when starting the agent? Pulling docker images from China is a bit slow. > Task stuck in TASK_STAGING when docker fetcher failed to fetch the image > > > Key: MESOS-4878 > URL: https://issues.apache.org/jira/browse/MESOS-4878 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.27.0, 0.27.1 >Reporter: Shuai Lin >Assignee: Shuai Lin > > When a task is launched with the mesos containerizer and a docker image, if > the docker fetcher failed to pull the image, no more task updates are sent to > the scheduler. > {code} > I0306 17:28:57.627169 17647 registry_puller.cpp:194] Pulling image > 'alpine:latest' from > 'docker-manifest://registry-1.docker.io:443alpine?latest#https' to > '/tmp/mesos-test/store/docker/staging/V2dqJv' > E0306 17:29:00.749889 17651 slave.cpp:3773] Container > '6b98026b-a58d-434c-9432-b517012edc35' for executor 'just-a-test' of > framework a4ff93ba-2141-48e2-92a9-7354e4028282- failed to start: Collect > failed: Unexpected HTTP response '401 Unauthorized' when trying to get the > manifest > I0306 17:29:00.751579 17646 containerizer.cpp:1392] Destroying container > '6b98026b-a58d-434c-9432-b517012edc35' > I0306 17:29:00.752188 17646 containerizer.cpp:1395] Waiting for the isolators > to complete preparing before destroying the container > I0306 17:29:57.618649 17649 slave.cpp:4322] Terminating executor > ''just-a-test' of framework a4ff93ba-2141-48e2-92a9-73 > {code} > Scheduler logs: > {code} > sudo ./build/src/mesos-execute --docker_image=alpine:latest > --containerizer=mesos --name=just-a-test --command="sleep 1000" > --master=33.33.33.33:5050 > WARNING: Logging before
InitGoogleLogging() is written to STDERR > W0306 17:28:57.491081 17740 sched.cpp:1642] > ** > Scheduler driver bound to loopback interface! Cannot communicate with remote > master(s). You might want to set 'LIBPROCESS_IP' environment variable to use > a routable IP address. > ** > I0306 17:28:57.498028 17740 sched.cpp:222] Version: 0.29.0 > I0306 17:28:57.533071 17761 sched.cpp:326] New master detected at > master@33.33.33.33:5050 > I0306 17:28:57.536761 17761 sched.cpp:336] No credentials provided. > Attempting to register without authentication > I0306 17:28:57.557729 17759 sched.cpp:703] Framework registered with > a4ff93ba-2141-48e2-92a9-7354e4028282- > Framework registered with a4ff93ba-2141-48e2-92a9-7354e4028282- > task just-a-test submitted to slave a4ff93ba-2141-48e2-92a9-7354e4028282-S0 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4878) Task stuck in TASK_STAGING when docker fetcher failed to fetch the image
[ https://issues.apache.org/jira/browse/MESOS-4878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182289#comment-15182289 ] Shuai Lin commented on MESOS-4878: -- Seems it's because the containerizer is waiting for the preparation of the isolators, but the prepare is not started at all because the provision failed. > Task stuck in TASK_STAGING when docker fetcher failed to fetch the image > > > Key: MESOS-4878 > URL: https://issues.apache.org/jira/browse/MESOS-4878 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.27.0, 0.27.1 >Reporter: Shuai Lin >Assignee: Shuai Lin > > When a task is launched with the mesos containerizer and a docker image, if > the docker fetcher failed to pull the image, no more task updates are sent to > the scheduler. > {code} > I0306 17:28:57.627169 17647 registry_puller.cpp:194] Pulling image > 'alpine:latest' from > 'docker-manifest://registry-1.docker.io:443alpine?latest#https' to > '/tmp/mesos-test/store/docker/staging/V2dqJv' > E0306 17:29:00.749889 17651 slave.cpp:3773] Container > '6b98026b-a58d-434c-9432-b517012edc35' for executor 'just-a-test' of > framework a4ff93ba-2141-48e2-92a9-7354e4028282- failed to start: Collect > failed: Unexpected HTTP response '401 Unauthorized' when trying to get the > manifest > I0306 17:29:00.751579 17646 containerizer.cpp:1392] Destroying container > '6b98026b-a58d-434c-9432-b517012edc35' > I0306 17:29:00.752188 17646 containerizer.cpp:1395] Waiting for the isolators > to complete preparing before destroying the container > I0306 17:29:57.618649 17649 slave.cpp:4322] Terminating executor > ''just-a-test' of framework a4ff93ba-2141-48e2-92a9-73 > {code} > Scheduler logs: > {code} > sudo ./build/src/mesos-execute --docker_image=alpine:latest > --containerizer=mesos --name=just-a-test --command="sleep 1000" > --master=33.33.33.33:5050 > WARNING: Logging before InitGoogleLogging() is written to STDERR > W0306 17:28:57.491081 17740 sched.cpp:1642] > ** > Scheduler driver 
bound to loopback interface! Cannot communicate with remote > master(s). You might want to set 'LIBPROCESS_IP' environment variable to use > a routable IP address. > ** > I0306 17:28:57.498028 17740 sched.cpp:222] Version: 0.29.0 > I0306 17:28:57.533071 17761 sched.cpp:326] New master detected at > master@33.33.33.33:5050 > I0306 17:28:57.536761 17761 sched.cpp:336] No credentials provided. > Attempting to register without authentication > I0306 17:28:57.557729 17759 sched.cpp:703] Framework registered with > a4ff93ba-2141-48e2-92a9-7354e4028282- > Framework registered with a4ff93ba-2141-48e2-92a9-7354e4028282- > task just-a-test submitted to slave a4ff93ba-2141-48e2-92a9-7354e4028282-S0 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
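The hang described in Shuai Lin's comment on this issue can be illustrated with a small state model. The sketch below is a hypothetical Python illustration, not the Mesos containerizer code: the State names, the step names and the destroy_steps function are all assumptions made for the example. It shows the idea that a destroy should wait for isolator preparation only when preparation has actually started, and should take a different path when the container is still being provisioned.

```python
from enum import Enum, auto
from typing import List


class State(Enum):
    PROVISIONING = auto()  # image is still being fetched / provisioned
    PREPARING = auto()     # isolators are preparing the container
    RUNNING = auto()       # executor has been launched


def destroy_steps(state: State) -> List[str]:
    """Return the ordered steps a destroy would take from the given state."""
    steps: List[str] = []
    if state is State.PROVISIONING:
        # Preparation never started, so waiting on the isolators here
        # would block forever; wait on provisioning instead.
        steps.append("wait_for_provisioning")
    elif state is State.PREPARING:
        steps.append("wait_for_isolators_to_prepare")
    elif state is State.RUNNING:
        steps.append("kill_processes")
    steps.append("cleanup")
    steps.append("report_termination")
    return steps
```

In this model a failed image pull leaves the container in PROVISIONING, so the destroy path never waits on the isolators and still reaches report_termination, which is the step that would let a task update reach the scheduler.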
[jira] [Updated] (MESOS-4878) Task stuck in TASK_STAGING when docker fetcher failed to fetch the image
[ https://issues.apache.org/jira/browse/MESOS-4878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuai Lin updated MESOS-4878: - Affects Version/s: 0.27.0 0.27.1 > Task stuck in TASK_STAGING when docker fetcher failed to fetch the image > > > Key: MESOS-4878 > URL: https://issues.apache.org/jira/browse/MESOS-4878 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.27.0, 0.27.1 >Reporter: Shuai Lin >Assignee: Shuai Lin > > When a task is launched with the mesos containerizer and a docker image, if > the docker fetcher failed to pull the image, no more task updates are sent to > the scheduler. > {code} > I0306 17:28:57.627169 17647 registry_puller.cpp:194] Pulling image > 'alpine:latest' from > 'docker-manifest://registry-1.docker.io:443alpine?latest#https' to > '/tmp/mesos-test/store/docker/staging/V2dqJv' > E0306 17:29:00.749889 17651 slave.cpp:3773] Container > '6b98026b-a58d-434c-9432-b517012edc35' for executor 'just-a-test' of > framework a4ff93ba-2141-48e2-92a9-7354e4028282- failed to start: Collect > failed: Unexpected HTTP response '401 Unauthorized' when trying to get the > manifest > I0306 17:29:00.751579 17646 containerizer.cpp:1392] Destroying container > '6b98026b-a58d-434c-9432-b517012edc35' > I0306 17:29:00.752188 17646 containerizer.cpp:1395] Waiting for the isolators > to complete preparing before destroying the container > I0306 17:29:57.618649 17649 slave.cpp:4322] Terminating executor > ''just-a-test' of framework a4ff93ba-2141-48e2-92a9-73 > {code} > Scheduler logs: > {code} > sudo ./build/src/mesos-execute --docker_image=alpine:latest > --containerizer=mesos --name=just-a-test --command="sleep 1000" > --master=33.33.33.33:5050 > WARNING: Logging before InitGoogleLogging() is written to STDERR > W0306 17:28:57.491081 17740 sched.cpp:1642] > ** > Scheduler driver bound to loopback interface! Cannot communicate with remote > master(s). 
You might want to set 'LIBPROCESS_IP' environment variable to use > a routable IP address. > ** > I0306 17:28:57.498028 17740 sched.cpp:222] Version: 0.29.0 > I0306 17:28:57.533071 17761 sched.cpp:326] New master detected at > master@33.33.33.33:5050 > I0306 17:28:57.536761 17761 sched.cpp:336] No credentials provided. > Attempting to register without authentication > I0306 17:28:57.557729 17759 sched.cpp:703] Framework registered with > a4ff93ba-2141-48e2-92a9-7354e4028282- > Framework registered with a4ff93ba-2141-48e2-92a9-7354e4028282- > task just-a-test submitted to slave a4ff93ba-2141-48e2-92a9-7354e4028282-S0 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4878) Task stuck in TASK_STAGING when docker fetcher failed to fetch the image
Shuai Lin created MESOS-4878: Summary: Task stuck in TASK_STAGING when docker fetcher failed to fetch the image Key: MESOS-4878 URL: https://issues.apache.org/jira/browse/MESOS-4878 Project: Mesos Issue Type: Bug Reporter: Shuai Lin Assignee: Shuai Lin When a task is launched with the mesos containerizer and a docker image, if the docker fetcher failed to pull the image, no more task updates are sent to the scheduler. {code} I0306 17:28:57.627169 17647 registry_puller.cpp:194] Pulling image 'alpine:latest' from 'docker-manifest://registry-1.docker.io:443alpine?latest#https' to '/tmp/mesos-test/store/docker/staging/V2dqJv' E0306 17:29:00.749889 17651 slave.cpp:3773] Container '6b98026b-a58d-434c-9432-b517012edc35' for executor 'just-a-test' of framework a4ff93ba-2141-48e2-92a9-7354e4028282- failed to start: Collect failed: Unexpected HTTP response '401 Unauthorized' when trying to get the manifest I0306 17:29:00.751579 17646 containerizer.cpp:1392] Destroying container '6b98026b-a58d-434c-9432-b517012edc35' I0306 17:29:00.752188 17646 containerizer.cpp:1395] Waiting for the isolators to complete preparing before destroying the container I0306 17:29:57.618649 17649 slave.cpp:4322] Terminating executor ''just-a-test' of framework a4ff93ba-2141-48e2-92a9-73 {code} Scheduler logs: {code} sudo ./build/src/mesos-execute --docker_image=alpine:latest --containerizer=mesos --name=just-a-test --command="sleep 1000" --master=33.33.33.33:5050 WARNING: Logging before InitGoogleLogging() is written to STDERR W0306 17:28:57.491081 17740 sched.cpp:1642] ** Scheduler driver bound to loopback interface! Cannot communicate with remote master(s). You might want to set 'LIBPROCESS_IP' environment variable to use a routable IP address. ** I0306 17:28:57.498028 17740 sched.cpp:222] Version: 0.29.0 I0306 17:28:57.533071 17761 sched.cpp:326] New master detected at master@33.33.33.33:5050 I0306 17:28:57.536761 17761 sched.cpp:336] No credentials provided. 
Attempting to register without authentication I0306 17:28:57.557729 17759 sched.cpp:703] Framework registered with a4ff93ba-2141-48e2-92a9-7354e4028282- Framework registered with a4ff93ba-2141-48e2-92a9-7354e4028282- task just-a-test submitted to slave a4ff93ba-2141-48e2-92a9-7354e4028282-S0 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4877) Mesos containerizer can't handle top level docker image like "alpine" (must use "library/alpine")
[ https://issues.apache.org/jira/browse/MESOS-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuai Lin updated MESOS-4877: - Affects Version/s: 0.27.0 > Mesos containerizer can't handle top level docker image like "alpine" (must > use "library/alpine") > - > > Key: MESOS-4877 > URL: https://issues.apache.org/jira/browse/MESOS-4877 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.27.0, 0.27.1 >Reporter: Shuai Lin >Assignee: Shuai Lin > > This can be demonstrated with the {{mesos-execute}} command: > # Docker containerizer with image {{alpine}}: success > {code} > sudo ./build/src/mesos-execute --docker_image=alpine --containerizer=docker > --name=just-a-test --command="sleep 1000" --master=localhost:5050 > {code} > # Mesos containerizer with image {{alpine}}: failure > {code} > sudo ./build/src/mesos-execute --docker_image=alpine --containerizer=mesos > --name=just-a-test --command="sleep 1000" --master=localhost:5050 > {code} > # Mesos containerizer with image {{library/alpine}}: success > {code} > sudo ./build/src/mesos-execute --docker_image=library/alpine > --containerizer=mesos --name=just-a-test --command="sleep 1000" > --master=localhost:5050 > {code} > In the slave logs: > {code} > ea-4460-83 > 9c-838da86af34c-0007' > I0306 16:32:41.418269 3403 metadata_manager.cpp:159] Looking for image > 'alpine:latest' > I0306 16:32:41.418699 3403 registry_puller.cpp:194] Pulling image > 'alpine:latest' from > 'docker-manifest://registry-1.docker.io:443alpine?latest#https' to > '/tmp/mesos-test > /store/docker/staging/ka7MlQ' > E0306 16:32:43.098131 3400 slave.cpp:3773] Container > '4bf9132d-9a57-4baa-a78c-e7164e93ace6' for executor 'just-a-test' of > framework 4f055c6f-1bea-4460-839c-838da86af34c-0 > 007 failed to start: Collect failed: Unexpected HTTP response '401 > Unauthorized > {code} > curl command executed: > {code} > $ sudo sysdig -A -p "*%evt.time %proc.cmdline" evt.type=execve and > proc.name=curl >16:42:53.198998042 curl -s -S 
-L -D - > https://registry-1.docker.io:443/v2/alpine/manifests/latest > 16:42:53.784958541 curl -s -S -L -D - > https://auth.docker.io/token?service=registry.docker.io=repository:alpine:pull > 16:42:54.294192024 curl -s -S -L -D - -H Authorization: Bearer > eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCIsIng1YyI6WyJNSUlDTHpDQ0FkU2dBd0lCQWdJQkFEQUtCZ2dxaGtqT1BRUURBakJHTVVRd1FnWURWUVFERXp0Uk5Gb3pPa2RYTjBrNldGUlFSRHBJVFRSUk9rOVVWRmc2TmtGRlF6cFNUVE5ET2tGU01rTTZUMFkzTnpwQ1ZrVkJPa2xHUlVrNlExazFTekFlRncweE5UQTJNalV4T1RVMU5EWmFGdzB4TmpBMk1qUXhPVFUxTkRaYU1FWXhSREJDQmdOVkJBTVRPMGhHU1UwNldGZFZWam8yUVZkSU9sWlpUVEk2TTFnMVREcFNWREkxT2s5VFNrbzZTMVExUmpwWVRsSklPbFJMTmtnNlMxUkxOanBCUVV0VU1Ga3dFd1lIS29aSXpqMENBUVlJS29aSXpqMERBUWNEUWdBRXl2UzIvdEI3T3JlMkVxcGRDeFdtS1NqV1N2VmJ2TWUrWGVFTUNVMDByQjI0akNiUVhreFdmOSs0MUxQMlZNQ29BK0RMRkIwVjBGZGdwajlOWU5rL2pxT0JzakNCcnpBT0JnTlZIUThCQWY4RUJBTUNBSUF3RHdZRFZSMGxCQWd3QmdZRVZSMGxBREJFQmdOVkhRNEVQUVE3U0VaSlRUcFlWMVZXT2paQlYwZzZWbGxOTWpveldEVk1PbEpVTWpVNlQxTktTanBMVkRWR09saE9Va2c2VkVzMlNEcExWRXMyT2tGQlMxUXdSZ1lEVlIwakJEOHdQWUE3VVRSYU16cEhWemRKT2xoVVVFUTZTRTAwVVRwUFZGUllPalpCUlVNNlVrMHpRenBCVWpKRE9rOUdOemM2UWxaRlFUcEpSa1ZKT2tOWk5Vc3dDZ1lJS29aSXpqMEVBd0lEU1FBd1JnSWhBTXZiT2h4cHhrTktqSDRhMFBNS0lFdXRmTjZtRDFvMWs4ZEJOVGxuWVFudkFpRUF0YVJGSGJSR2o4ZlVSSzZ4UVJHRURvQm1ZZ3dZelR3Z3BMaGJBZzNOUmFvPSJdfQ.eyJhY2Nlc3MiOltdLCJhdWQiOiJyZWdpc3RyeS5kb2NrZXIuaW8iLCJleHAiOjE0NTcyODI4NzQsImlhdCI6MTQ1NzI4MjU3NCwiaXNzIjoiYXV0aC5kb2NrZXIuaW8iLCJqdGkiOiJaOGtyNXZXNEJMWkNIRS1IcVJIaCIsIm5iZiI6MTQ1NzI4MjU3NCwic3ViIjoiIn0.C2wtJq_P-m0buPARhmQjDfh6ztIAhcvgN3tfWIZEClSgXlVQ_sAQXAALNZKwAQL2Chj7NpHX--0GW-aeL_28Aw > https://registry-1.docker.io:443/v2/alpine/manifests/latest > {code} > Also got the same result with {{ubuntu}} docker image. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4877) Mesos containerizer can't handle top level docker image like "alpine" (must use "library/alpine")
[ https://issues.apache.org/jira/browse/MESOS-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuai Lin updated MESOS-4877: - Affects Version/s: 0.27.1 > Mesos containerizer can't handle top level docker image like "alpine" (must > use "library/alpine") > - > > Key: MESOS-4877 > URL: https://issues.apache.org/jira/browse/MESOS-4877 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.27.1 >Reporter: Shuai Lin >Assignee: Shuai Lin > > This can be demonstrated with the {{mesos-execute}} command: > # Docker containerizer with image {{alpine}}: success > {code} > sudo ./build/src/mesos-execute --docker_image=alpine --containerizer=docker > --name=just-a-test --command="sleep 1000" --master=localhost:5050 > {code} > # Mesos containerizer with image {{alpine}}: failure > {code} > sudo ./build/src/mesos-execute --docker_image=alpine --containerizer=mesos > --name=just-a-test --command="sleep 1000" --master=localhost:5050 > {code} > # Mesos containerizer with image {{library/alpine}}: success > {code} > sudo ./build/src/mesos-execute --docker_image=library/alpine > --containerizer=mesos --name=just-a-test --command="sleep 1000" > --master=localhost:5050 > {code} > In the slave logs: > {code} > ea-4460-83 > 9c-838da86af34c-0007' > I0306 16:32:41.418269 3403 metadata_manager.cpp:159] Looking for image > 'alpine:latest' > I0306 16:32:41.418699 3403 registry_puller.cpp:194] Pulling image > 'alpine:latest' from > 'docker-manifest://registry-1.docker.io:443alpine?latest#https' to > '/tmp/mesos-test > /store/docker/staging/ka7MlQ' > E0306 16:32:43.098131 3400 slave.cpp:3773] Container > '4bf9132d-9a57-4baa-a78c-e7164e93ace6' for executor 'just-a-test' of > framework 4f055c6f-1bea-4460-839c-838da86af34c-0 > 007 failed to start: Collect failed: Unexpected HTTP response '401 > Unauthorized > {code} > curl command executed: > {code} > $ sudo sysdig -A -p "*%evt.time %proc.cmdline" evt.type=execve and > proc.name=curl >16:42:53.198998042 curl -s -S -L -D - 
> https://registry-1.docker.io:443/v2/alpine/manifests/latest > 16:42:53.784958541 curl -s -S -L -D - > https://auth.docker.io/token?service=registry.docker.io=repository:alpine:pull > 16:42:54.294192024 curl -s -S -L -D - -H Authorization: Bearer > eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCIsIng1YyI6WyJNSUlDTHpDQ0FkU2dBd0lCQWdJQkFEQUtCZ2dxaGtqT1BRUURBakJHTVVRd1FnWURWUVFERXp0Uk5Gb3pPa2RYTjBrNldGUlFSRHBJVFRSUk9rOVVWRmc2TmtGRlF6cFNUVE5ET2tGU01rTTZUMFkzTnpwQ1ZrVkJPa2xHUlVrNlExazFTekFlRncweE5UQTJNalV4T1RVMU5EWmFGdzB4TmpBMk1qUXhPVFUxTkRaYU1FWXhSREJDQmdOVkJBTVRPMGhHU1UwNldGZFZWam8yUVZkSU9sWlpUVEk2TTFnMVREcFNWREkxT2s5VFNrbzZTMVExUmpwWVRsSklPbFJMTmtnNlMxUkxOanBCUVV0VU1Ga3dFd1lIS29aSXpqMENBUVlJS29aSXpqMERBUWNEUWdBRXl2UzIvdEI3T3JlMkVxcGRDeFdtS1NqV1N2VmJ2TWUrWGVFTUNVMDByQjI0akNiUVhreFdmOSs0MUxQMlZNQ29BK0RMRkIwVjBGZGdwajlOWU5rL2pxT0JzakNCcnpBT0JnTlZIUThCQWY4RUJBTUNBSUF3RHdZRFZSMGxCQWd3QmdZRVZSMGxBREJFQmdOVkhRNEVQUVE3U0VaSlRUcFlWMVZXT2paQlYwZzZWbGxOTWpveldEVk1PbEpVTWpVNlQxTktTanBMVkRWR09saE9Va2c2VkVzMlNEcExWRXMyT2tGQlMxUXdSZ1lEVlIwakJEOHdQWUE3VVRSYU16cEhWemRKT2xoVVVFUTZTRTAwVVRwUFZGUllPalpCUlVNNlVrMHpRenBCVWpKRE9rOUdOemM2UWxaRlFUcEpSa1ZKT2tOWk5Vc3dDZ1lJS29aSXpqMEVBd0lEU1FBd1JnSWhBTXZiT2h4cHhrTktqSDRhMFBNS0lFdXRmTjZtRDFvMWs4ZEJOVGxuWVFudkFpRUF0YVJGSGJSR2o4ZlVSSzZ4UVJHRURvQm1ZZ3dZelR3Z3BMaGJBZzNOUmFvPSJdfQ.eyJhY2Nlc3MiOltdLCJhdWQiOiJyZWdpc3RyeS5kb2NrZXIuaW8iLCJleHAiOjE0NTcyODI4NzQsImlhdCI6MTQ1NzI4MjU3NCwiaXNzIjoiYXV0aC5kb2NrZXIuaW8iLCJqdGkiOiJaOGtyNXZXNEJMWkNIRS1IcVJIaCIsIm5iZiI6MTQ1NzI4MjU3NCwic3ViIjoiIn0.C2wtJq_P-m0buPARhmQjDfh6ztIAhcvgN3tfWIZEClSgXlVQ_sAQXAALNZKwAQL2Chj7NpHX--0GW-aeL_28Aw > https://registry-1.docker.io:443/v2/alpine/manifests/latest > {code} > Also got the same result with {{ubuntu}} docker image. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4877) Mesos containerizer can't handle top level docker image like "alpine" (must use "library/alpine")
[ https://issues.apache.org/jira/browse/MESOS-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182241#comment-15182241 ] Shuai Lin commented on MESOS-4877: -- I think we can just add a {{library/}} prefix to the image name if both of the following conditions are true: * the image name doesn't have a slash in the {{ImageReference.repository}} part, and * the image host is empty or the same as the one specified in {{--docker_registry}} This is how docker engine handles the image name: https://github.com/docker/docker/blob/v1.10.2/reference/reference.go#L171-L173 > Mesos containerizer can't handle top level docker image like "alpine" (must > use "library/alpine") > - > > Key: MESOS-4877 > URL: https://issues.apache.org/jira/browse/MESOS-4877 > Project: Mesos > Issue Type: Bug >Reporter: Shuai Lin >Assignee: Shuai Lin > > This can be demonstrated with the {{mesos-execute}} command: > # Docker containerizer with image {{alpine}}: success > {code} > sudo ./build/src/mesos-execute --docker_image=alpine --containerizer=docker > --name=just-a-test --command="sleep 1000" --master=localhost:5050 > {code} > # Mesos containerizer with image {{alpine}}: failure > {code} > sudo ./build/src/mesos-execute --docker_image=alpine --containerizer=mesos > --name=just-a-test --command="sleep 1000" --master=localhost:5050 > {code} > # Mesos containerizer with image {{library/alpine}}: success > {code} > sudo ./build/src/mesos-execute --docker_image=library/alpine > --containerizer=mesos --name=just-a-test --command="sleep 1000" > --master=localhost:5050 > {code} > In the slave logs: > {code} > ea-4460-83 > 9c-838da86af34c-0007' > I0306 16:32:41.418269 3403 metadata_manager.cpp:159] Looking for image > 'alpine:latest' > I0306 16:32:41.418699 3403 registry_puller.cpp:194] Pulling image > 'alpine:latest' from > 'docker-manifest://registry-1.docker.io:443alpine?latest#https' to > '/tmp/mesos-test > /store/docker/staging/ka7MlQ' > E0306 16:32:43.098131 3400 
slave.cpp:3773] Container > '4bf9132d-9a57-4baa-a78c-e7164e93ace6' for executor 'just-a-test' of > framework 4f055c6f-1bea-4460-839c-838da86af34c-0 > 007 failed to start: Collect failed: Unexpected HTTP response '401 > Unauthorized > {code} > curl command executed: > {code} > $ sudo sysdig -A -p "*%evt.time %proc.cmdline" evt.type=execve and > proc.name=curl >16:42:53.198998042 curl -s -S -L -D - > https://registry-1.docker.io:443/v2/alpine/manifests/latest > 16:42:53.784958541 curl -s -S -L -D - > https://auth.docker.io/token?service=registry.docker.io=repository:alpine:pull > 16:42:54.294192024 curl -s -S -L -D - -H Authorization: Bearer > eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCIsIng1YyI6WyJNSUlDTHpDQ0FkU2dBd0lCQWdJQkFEQUtCZ2dxaGtqT1BRUURBakJHTVVRd1FnWURWUVFERXp0Uk5Gb3pPa2RYTjBrNldGUlFSRHBJVFRSUk9rOVVWRmc2TmtGRlF6cFNUVE5ET2tGU01rTTZUMFkzTnpwQ1ZrVkJPa2xHUlVrNlExazFTekFlRncweE5UQTJNalV4T1RVMU5EWmFGdzB4TmpBMk1qUXhPVFUxTkRaYU1FWXhSREJDQmdOVkJBTVRPMGhHU1UwNldGZFZWam8yUVZkSU9sWlpUVEk2TTFnMVREcFNWREkxT2s5VFNrbzZTMVExUmpwWVRsSklPbFJMTmtnNlMxUkxOanBCUVV0VU1Ga3dFd1lIS29aSXpqMENBUVlJS29aSXpqMERBUWNEUWdBRXl2UzIvdEI3T3JlMkVxcGRDeFdtS1NqV1N2VmJ2TWUrWGVFTUNVMDByQjI0akNiUVhreFdmOSs0MUxQMlZNQ29BK0RMRkIwVjBGZGdwajlOWU5rL2pxT0JzakNCcnpBT0JnTlZIUThCQWY4RUJBTUNBSUF3RHdZRFZSMGxCQWd3QmdZRVZSMGxBREJFQmdOVkhRNEVQUVE3U0VaSlRUcFlWMVZXT2paQlYwZzZWbGxOTWpveldEVk1PbEpVTWpVNlQxTktTanBMVkRWR09saE9Va2c2VkVzMlNEcExWRXMyT2tGQlMxUXdSZ1lEVlIwakJEOHdQWUE3VVRSYU16cEhWemRKT2xoVVVFUTZTRTAwVVRwUFZGUllPalpCUlVNNlVrMHpRenBCVWpKRE9rOUdOemM2UWxaRlFUcEpSa1ZKT2tOWk5Vc3dDZ1lJS29aSXpqMEVBd0lEU1FBd1JnSWhBTXZiT2h4cHhrTktqSDRhMFBNS0lFdXRmTjZtRDFvMWs4ZEJOVGxuWVFudkFpRUF0YVJGSGJSR2o4ZlVSSzZ4UVJHRURvQm1ZZ3dZelR3Z3BMaGJBZzNOUmFvPSJdfQ.eyJhY2Nlc3MiOltdLCJhdWQiOiJyZWdpc3RyeS5kb2NrZXIuaW8iLCJleHAiOjE0NTcyODI4NzQsImlhdCI6MTQ1NzI4MjU3NCwiaXNzIjoiYXV0aC5kb2NrZXIuaW8iLCJqdGkiOiJaOGtyNXZXNEJMWkNIRS1IcVJIaCIsIm5iZiI6MTQ1NzI4MjU3NCwic3ViIjoiIn0.C2wtJq_P-m0buPARhmQjDfh6ztIAhcvgN3tfWIZEClSgXlVQ_sAQXAALNZKwAQL2Chj7NpHX--0GW-aeL_28Aw > 
https://registry-1.docker.io:443/v2/alpine/manifests/latest > {code} > Also got the same result with {{ubuntu}} docker image. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
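The prefixing rule proposed in the comment above (add {{library/}} only when the repository has no slash and the registry host is empty or equal to the configured one) could be sketched roughly as follows. This is an illustrative Python sketch, not Mesos code: the function name `normalize_repository`, its parameters, and the `DEFAULT_REGISTRY` constant are hypothetical stand-ins for {{ImageReference.repository}}, the image host, and the {{--docker_registry}} value.

```python
from typing import Optional

# Assumption: stands in for the host configured through --docker_registry.
DEFAULT_REGISTRY = "registry-1.docker.io"


def normalize_repository(
    repository: str,
    registry: Optional[str],
    default_registry: str = DEFAULT_REGISTRY,
) -> str:
    """Add the "library/" prefix when both proposed conditions hold."""
    # Condition 1: the repository part contains no slash (a top-level image).
    is_top_level = "/" not in repository
    # Condition 2: the image host is empty or matches the configured registry.
    uses_default_registry = not registry or registry == default_registry
    if is_top_level and uses_default_registry:
        return "library/" + repository
    return repository


if __name__ == "__main__":
    for repo, host in [("alpine", None), ("library/alpine", None), ("alpine", "example.com:5000")]:
        print(repo, host, "->", normalize_repository(repo, host))
```

Under these assumptions, an image pulled from a different registry host is left untouched, since the {{library/}} namespace convention is only being applied to the configured registry.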
[jira] [Commented] (MESOS-4877) Mesos containerizer can't handle top level docker image like "alpine" (must use "library/alpine")
[ https://issues.apache.org/jira/browse/MESOS-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182236#comment-15182236 ] Shuai Lin commented on MESOS-4877: -- I've tried to send the requests to the docker registry by hand: # First get the token: {code} curl 'https://auth.docker.io/token?service=registry.docker.io&scope=repository:alpine:pull' {code} # Then request the manifest from the registry: {code} curl -H 'Authorization: Bearer <token>' 'https://registry-1.docker.io:443/v2/alpine/manifests/latest' {code} It would fail with status code 401. In the response header, there is a message {{error="insufficient_scope"}}: {code} Www-Authenticate: Bearer realm="https://auth.docker.io/token",service="registry.docker.io",scope="repository:alpine:pull",error="insufficient_scope" {code} > Mesos containerizer can't handle top level docker image like "alpine" (must > use "library/alpine") > - > > Key: MESOS-4877 > URL: https://issues.apache.org/jira/browse/MESOS-4877 > Project: Mesos > Issue Type: Bug >Reporter: Shuai Lin >Assignee: Shuai Lin > > This can be demonstrated with the {{mesos-execute}} command: > # Docker containerizer with image {{alpine}}: success > {code} > sudo ./build/src/mesos-execute --docker_image=alpine --containerizer=docker > --name=just-a-test --command="sleep 1000" --master=localhost:5050 > {code} > # Mesos containerizer with image {{alpine}}: failure > {code} > sudo ./build/src/mesos-execute --docker_image=alpine --containerizer=mesos > --name=just-a-test --command="sleep 1000" --master=localhost:5050 > {code} > # Mesos containerizer with image {{library/alpine}}: success > {code} > sudo ./build/src/mesos-execute --docker_image=library/alpine > --containerizer=mesos --name=just-a-test --command="sleep 1000" > --master=localhost:5050 > {code} > In the slave logs: > {code} > ea-4460-83 > 9c-838da86af34c-0007' > I0306 16:32:41.418269 3403 metadata_manager.cpp:159] Looking for image > 'alpine:latest' > I0306 16:32:41.418699
3403 registry_puller.cpp:194] Pulling image > 'alpine:latest' from > 'docker-manifest://registry-1.docker.io:443alpine?latest#https' to > '/tmp/mesos-test > /store/docker/staging/ka7MlQ' > E0306 16:32:43.098131 3400 slave.cpp:3773] Container > '4bf9132d-9a57-4baa-a78c-e7164e93ace6' for executor 'just-a-test' of > framework 4f055c6f-1bea-4460-839c-838da86af34c-0 > 007 failed to start: Collect failed: Unexpected HTTP response '401 > Unauthorized > {code} > curl command executed: > {code} > $ sudo sysdig -A -p "*%evt.time %proc.cmdline" evt.type=execve and > proc.name=curl >16:42:53.198998042 curl -s -S -L -D - > https://registry-1.docker.io:443/v2/alpine/manifests/latest > 16:42:53.784958541 curl -s -S -L -D - > https://auth.docker.io/token?service=registry.docker.io=repository:alpine:pull > 16:42:54.294192024 curl -s -S -L -D - -H Authorization: Bearer > eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCIsIng1YyI6WyJNSUlDTHpDQ0FkU2dBd0lCQWdJQkFEQUtCZ2dxaGtqT1BRUURBakJHTVVRd1FnWURWUVFERXp0Uk5Gb3pPa2RYTjBrNldGUlFSRHBJVFRSUk9rOVVWRmc2TmtGRlF6cFNUVE5ET2tGU01rTTZUMFkzTnpwQ1ZrVkJPa2xHUlVrNlExazFTekFlRncweE5UQTJNalV4T1RVMU5EWmFGdzB4TmpBMk1qUXhPVFUxTkRaYU1FWXhSREJDQmdOVkJBTVRPMGhHU1UwNldGZFZWam8yUVZkSU9sWlpUVEk2TTFnMVREcFNWREkxT2s5VFNrbzZTMVExUmpwWVRsSklPbFJMTmtnNlMxUkxOanBCUVV0VU1Ga3dFd1lIS29aSXpqMENBUVlJS29aSXpqMERBUWNEUWdBRXl2UzIvdEI3T3JlMkVxcGRDeFdtS1NqV1N2VmJ2TWUrWGVFTUNVMDByQjI0akNiUVhreFdmOSs0MUxQMlZNQ29BK0RMRkIwVjBGZGdwajlOWU5rL2pxT0JzakNCcnpBT0JnTlZIUThCQWY4RUJBTUNBSUF3RHdZRFZSMGxCQWd3QmdZRVZSMGxBREJFQmdOVkhRNEVQUVE3U0VaSlRUcFlWMVZXT2paQlYwZzZWbGxOTWpveldEVk1PbEpVTWpVNlQxTktTanBMVkRWR09saE9Va2c2VkVzMlNEcExWRXMyT2tGQlMxUXdSZ1lEVlIwakJEOHdQWUE3VVRSYU16cEhWemRKT2xoVVVFUTZTRTAwVVRwUFZGUllPalpCUlVNNlVrMHpRenBCVWpKRE9rOUdOemM2UWxaRlFUcEpSa1ZKT2tOWk5Vc3dDZ1lJS29aSXpqMEVBd0lEU1FBd1JnSWhBTXZiT2h4cHhrTktqSDRhMFBNS0lFdXRmTjZtRDFvMWs4ZEJOVGxuWVFudkFpRUF0YVJGSGJSR2o4ZlVSSzZ4UVJHRURvQm1ZZ3dZelR3Z3BMaGJBZzNOUmFvPSJdfQ.eyJhY2Nlc3MiOltdLCJhdWQiOiJyZWdpc3RyeS5kb2NrZXIuaW8iLCJleHAiOjE0NTcyODI4NzQsIml
hdCI6MTQ1NzI4MjU3NCwiaXNzIjoiYXV0aC5kb2NrZXIuaW8iLCJqdGkiOiJaOGtyNXZXNEJMWkNIRS1IcVJIaCIsIm5iZiI6MTQ1NzI4MjU3NCwic3ViIjoiIn0.C2wtJq_P-m0buPARhmQjDfh6ztIAhcvgN3tfWIZEClSgXlVQ_sAQXAALNZKwAQL2Chj7NpHX--0GW-aeL_28Aw > https://registry-1.docker.io:443/v2/alpine/manifests/latest > {code} > Also got the same result with {{ubuntu}} docker image. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
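The {{Www-Authenticate}} challenge quoted in the comment above carries the realm, service and scope needed to ask the auth server for a token. A minimal Python sketch of turning such a challenge into a token URL is shown below. The helper names `parse_bearer_challenge` and `token_url` are made up for illustration, and the parsing is deliberately simplistic (it assumes every parameter is a double-quoted value without embedded quotes); it is not how Mesos or docker implement this.

```python
import re
from urllib.parse import urlencode


def parse_bearer_challenge(header_value: str) -> dict:
    """Split a 'Bearer key="value",...' challenge into a dict of parameters."""
    scheme, _, params = header_value.partition(" ")
    if scheme.lower() != "bearer":
        raise ValueError("not a Bearer challenge: %r" % header_value)
    # Assumption: all parameters are of the form key="value".
    return dict(re.findall(r'(\w+)="([^"]*)"', params))


def token_url(challenge: dict) -> str:
    """Build the token request URL from the realm, service and scope."""
    query = {k: challenge[k] for k in ("service", "scope") if k in challenge}
    # Keep ':' and '/' readable in the scope, e.g. repository:alpine:pull.
    return challenge["realm"] + "?" + urlencode(query, safe=":/")


if __name__ == "__main__":
    header = (
        'Bearer realm="https://auth.docker.io/token",'
        'service="registry.docker.io",'
        'scope="repository:alpine:pull",error="insufficient_scope"'
    )
    challenge = parse_bearer_challenge(header)
    print(challenge.get("error"))
    print(token_url(challenge))
```

The `error` parameter is kept in the parsed dict so a caller can distinguish a missing token from a token whose scope the registry considers insufficient, which is the case reported here.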
[jira] [Assigned] (MESOS-4877) Mesos containerizer can't handle top level docker image like "alpine" (must use "library/alpine")
[ https://issues.apache.org/jira/browse/MESOS-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuai Lin reassigned MESOS-4877: Assignee: Shuai Lin > Mesos containerizer can't handle top level docker image like "alpine" (must > use "library/alpine") > - > > Key: MESOS-4877 > URL: https://issues.apache.org/jira/browse/MESOS-4877 > Project: Mesos > Issue Type: Bug >Reporter: Shuai Lin >Assignee: Shuai Lin > > This can be demonstrated with the {{mesos-execute}} command: > # Docker containerizer with image {{alpine}}: success > {code} > sudo ./build/src/mesos-execute --docker_image=alpine --containerizer=docker > --name=just-a-test --command="sleep 1000" --master=localhost:5050 > {code} > # Mesos containerizer with image {{alpine}}: failure > {code} > sudo ./build/src/mesos-execute --docker_image=alpine --containerizer=mesos > --name=just-a-test --command="sleep 1000" --master=localhost:5050 > {code} > # Mesos containerizer with image {{library/alpine}}: success > {code} > sudo ./build/src/mesos-execute --docker_image=library/alpine > --containerizer=mesos --name=just-a-test --command="sleep 1000" > --master=localhost:5050 > {code} > In the slave logs: > {code} > ea-4460-83 > 9c-838da86af34c-0007' > I0306 16:32:41.418269 3403 metadata_manager.cpp:159] Looking for image > 'alpine:latest' > I0306 16:32:41.418699 3403 registry_puller.cpp:194] Pulling image > 'alpine:latest' from > 'docker-manifest://registry-1.docker.io:443alpine?latest#https' to > '/tmp/mesos-test > /store/docker/staging/ka7MlQ' > E0306 16:32:43.098131 3400 slave.cpp:3773] Container > '4bf9132d-9a57-4baa-a78c-e7164e93ace6' for executor 'just-a-test' of > framework 4f055c6f-1bea-4460-839c-838da86af34c-0 > 007 failed to start: Collect failed: Unexpected HTTP response '401 > Unauthorized > {code} > curl command executed: > {code} > $ sudo sysdig -A -p "*%evt.time %proc.cmdline" evt.type=execve and > proc.name=curl >16:42:53.198998042 curl -s -S -L -D - > 
https://registry-1.docker.io:443/v2/alpine/manifests/latest > 16:42:53.784958541 curl -s -S -L -D - > https://auth.docker.io/token?service=registry.docker.io=repository:alpine:pull > 16:42:54.294192024 curl -s -S -L -D - -H Authorization: Bearer > eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCIsIng1YyI6WyJNSUlDTHpDQ0FkU2dBd0lCQWdJQkFEQUtCZ2dxaGtqT1BRUURBakJHTVVRd1FnWURWUVFERXp0Uk5Gb3pPa2RYTjBrNldGUlFSRHBJVFRSUk9rOVVWRmc2TmtGRlF6cFNUVE5ET2tGU01rTTZUMFkzTnpwQ1ZrVkJPa2xHUlVrNlExazFTekFlRncweE5UQTJNalV4T1RVMU5EWmFGdzB4TmpBMk1qUXhPVFUxTkRaYU1FWXhSREJDQmdOVkJBTVRPMGhHU1UwNldGZFZWam8yUVZkSU9sWlpUVEk2TTFnMVREcFNWREkxT2s5VFNrbzZTMVExUmpwWVRsSklPbFJMTmtnNlMxUkxOanBCUVV0VU1Ga3dFd1lIS29aSXpqMENBUVlJS29aSXpqMERBUWNEUWdBRXl2UzIvdEI3T3JlMkVxcGRDeFdtS1NqV1N2VmJ2TWUrWGVFTUNVMDByQjI0akNiUVhreFdmOSs0MUxQMlZNQ29BK0RMRkIwVjBGZGdwajlOWU5rL2pxT0JzakNCcnpBT0JnTlZIUThCQWY4RUJBTUNBSUF3RHdZRFZSMGxCQWd3QmdZRVZSMGxBREJFQmdOVkhRNEVQUVE3U0VaSlRUcFlWMVZXT2paQlYwZzZWbGxOTWpveldEVk1PbEpVTWpVNlQxTktTanBMVkRWR09saE9Va2c2VkVzMlNEcExWRXMyT2tGQlMxUXdSZ1lEVlIwakJEOHdQWUE3VVRSYU16cEhWemRKT2xoVVVFUTZTRTAwVVRwUFZGUllPalpCUlVNNlVrMHpRenBCVWpKRE9rOUdOemM2UWxaRlFUcEpSa1ZKT2tOWk5Vc3dDZ1lJS29aSXpqMEVBd0lEU1FBd1JnSWhBTXZiT2h4cHhrTktqSDRhMFBNS0lFdXRmTjZtRDFvMWs4ZEJOVGxuWVFudkFpRUF0YVJGSGJSR2o4ZlVSSzZ4UVJHRURvQm1ZZ3dZelR3Z3BMaGJBZzNOUmFvPSJdfQ.eyJhY2Nlc3MiOltdLCJhdWQiOiJyZWdpc3RyeS5kb2NrZXIuaW8iLCJleHAiOjE0NTcyODI4NzQsImlhdCI6MTQ1NzI4MjU3NCwiaXNzIjoiYXV0aC5kb2NrZXIuaW8iLCJqdGkiOiJaOGtyNXZXNEJMWkNIRS1IcVJIaCIsIm5iZiI6MTQ1NzI4MjU3NCwic3ViIjoiIn0.C2wtJq_P-m0buPARhmQjDfh6ztIAhcvgN3tfWIZEClSgXlVQ_sAQXAALNZKwAQL2Chj7NpHX--0GW-aeL_28Aw > https://registry-1.docker.io:443/v2/alpine/manifests/latest > {code} > Also got the same result with {{ubuntu}} docker image. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4877) Mesos containerizer can't handle top level docker image like "alpine" (must use "library/alpine")
Shuai Lin created MESOS-4877: Summary: Mesos containerizer can't handle top level docker image like "alpine" (must use "library/alpine") Key: MESOS-4877 URL: https://issues.apache.org/jira/browse/MESOS-4877 Project: Mesos Issue Type: Bug Reporter: Shuai Lin This can be demonstrated with the {{mesos-execute}} command: # Docker containerizer with image {{alpine}}: success {code} sudo ./build/src/mesos-execute --docker_image=alpine --containerizer=docker --name=just-a-test --command="sleep 1000" --master=localhost:5050 {code} # Mesos containerizer with image {{alpine}}: failure {code} sudo ./build/src/mesos-execute --docker_image=alpine --containerizer=mesos --name=just-a-test --command="sleep 1000" --master=localhost:5050 {code} # Mesos containerizer with image {{library/alpine}}: success {code} sudo ./build/src/mesos-execute --docker_image=library/alpine --containerizer=mesos --name=just-a-test --command="sleep 1000" --master=localhost:5050 {code} In the slave logs: {code} ea-4460-83 9c-838da86af34c-0007' I0306 16:32:41.418269 3403 metadata_manager.cpp:159] Looking for image 'alpine:latest' I0306 16:32:41.418699 3403 registry_puller.cpp:194] Pulling image 'alpine:latest' from 'docker-manifest://registry-1.docker.io:443alpine?latest#https' to '/tmp/mesos-test /store/docker/staging/ka7MlQ' E0306 16:32:43.098131 3400 slave.cpp:3773] Container '4bf9132d-9a57-4baa-a78c-e7164e93ace6' for executor 'just-a-test' of framework 4f055c6f-1bea-4460-839c-838da86af34c-0 007 failed to start: Collect failed: Unexpected HTTP response '401 Unauthorized {code} curl command executed: {code} $ sudo sysdig -A -p "*%evt.time %proc.cmdline" evt.type=execve and proc.name=curl 16:42:53.198998042 curl -s -S -L -D - https://registry-1.docker.io:443/v2/alpine/manifests/latest 16:42:53.784958541 curl -s -S -L -D - https://auth.docker.io/token?service=registry.docker.io&scope=repository:alpine:pull 16:42:54.294192024 curl -s -S -L -D - -H Authorization: Bearer
eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCIsIng1YyI6WyJNSUlDTHpDQ0FkU2dBd0lCQWdJQkFEQUtCZ2dxaGtqT1BRUURBakJHTVVRd1FnWURWUVFERXp0Uk5Gb3pPa2RYTjBrNldGUlFSRHBJVFRSUk9rOVVWRmc2TmtGRlF6cFNUVE5ET2tGU01rTTZUMFkzTnpwQ1ZrVkJPa2xHUlVrNlExazFTekFlRncweE5UQTJNalV4T1RVMU5EWmFGdzB4TmpBMk1qUXhPVFUxTkRaYU1FWXhSREJDQmdOVkJBTVRPMGhHU1UwNldGZFZWam8yUVZkSU9sWlpUVEk2TTFnMVREcFNWREkxT2s5VFNrbzZTMVExUmpwWVRsSklPbFJMTmtnNlMxUkxOanBCUVV0VU1Ga3dFd1lIS29aSXpqMENBUVlJS29aSXpqMERBUWNEUWdBRXl2UzIvdEI3T3JlMkVxcGRDeFdtS1NqV1N2VmJ2TWUrWGVFTUNVMDByQjI0akNiUVhreFdmOSs0MUxQMlZNQ29BK0RMRkIwVjBGZGdwajlOWU5rL2pxT0JzakNCcnpBT0JnTlZIUThCQWY4RUJBTUNBSUF3RHdZRFZSMGxCQWd3QmdZRVZSMGxBREJFQmdOVkhRNEVQUVE3U0VaSlRUcFlWMVZXT2paQlYwZzZWbGxOTWpveldEVk1PbEpVTWpVNlQxTktTanBMVkRWR09saE9Va2c2VkVzMlNEcExWRXMyT2tGQlMxUXdSZ1lEVlIwakJEOHdQWUE3VVRSYU16cEhWemRKT2xoVVVFUTZTRTAwVVRwUFZGUllPalpCUlVNNlVrMHpRenBCVWpKRE9rOUdOemM2UWxaRlFUcEpSa1ZKT2tOWk5Vc3dDZ1lJS29aSXpqMEVBd0lEU1FBd1JnSWhBTXZiT2h4cHhrTktqSDRhMFBNS0lFdXRmTjZtRDFvMWs4ZEJOVGxuWVFudkFpRUF0YVJGSGJSR2o4ZlVSSzZ4UVJHRURvQm1ZZ3dZelR3Z3BMaGJBZzNOUmFvPSJdfQ.eyJhY2Nlc3MiOltdLCJhdWQiOiJyZWdpc3RyeS5kb2NrZXIuaW8iLCJleHAiOjE0NTcyODI4NzQsImlhdCI6MTQ1NzI4MjU3NCwiaXNzIjoiYXV0aC5kb2NrZXIuaW8iLCJqdGkiOiJaOGtyNXZXNEJMWkNIRS1IcVJIaCIsIm5iZiI6MTQ1NzI4MjU3NCwic3ViIjoiIn0.C2wtJq_P-m0buPARhmQjDfh6ztIAhcvgN3tfWIZEClSgXlVQ_sAQXAALNZKwAQL2Chj7NpHX--0GW-aeL_28Aw https://registry-1.docker.io:443/v2/alpine/manifests/latest {code} Also got the same result with {{ubuntu}} docker image. -- This message was sent by Atlassian JIRA (v6.3.4#6332)