[ https://issues.apache.org/jira/browse/MESOS-8158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234831#comment-16234831 ]
Charles Allen commented on MESOS-8158: -------------------------------------- [~jieyu] the socket is exposed. > Mesos Agent in docker neglects to retry discovering Task docker containers > -------------------------------------------------------------------------- > > Key: MESOS-8158 > URL: https://issues.apache.org/jira/browse/MESOS-8158 > Project: Mesos > Issue Type: Bug > Components: agent, containerization, docker, executor > Affects Versions: 1.4.0 > Environment: Windows 10 with Docker version 17.09.0-ce, build afdb6d4 > Reporter: Charles Allen > Priority: Normal > > I have attempted to launch Mesos agents inside of a docker container in such > a way where the agent docker can be replaced and recovered. Unfortunately I > hit a major snag in the way the mesos docker launching works. > To test simple functionality a marathon app is setup that simply has the > following command: {{date && python -m SimpleHTTPServer $PORT0}} > That way the HTTP port can be accessed to assure things are being assigned > correctly, and the date is printed out in the log. > When I attempt to start this marathon app, the mesos agent (inside a docker > container) properly launches an executor which properly creates a second task > that launches the python code. Here's the output from the executor logs (this > looks correct): > {code} > I1101 20:34:03.420210 68270 exec.cpp:162] Version: 1.4.0 > I1101 20:34:03.427455 68281 exec.cpp:237] Executor registered on agent > d9bb6e96-ee26-43c2-977e-0c404fdd4e81-S0 > I1101 20:34:03.428414 68283 executor.cpp:120] Registered docker executor on > 10.0.75.2 > I1101 20:34:03.428680 68281 executor.cpp:160] Starting task > testapp.fe35282f-bf43-11e7-a24b-0242ac110002 > I1101 20:34:03.428941 68281 docker.cpp:1080] Running docker -H > unix:///var/run/docker.sock run --cpu-shares 1024 --memory 134217728 -e > HOST=10.0.75.2 -e MARATHON_APP_DOCKER_IMAGE=python:2 -e > MARATHON_APP_ID=/testapp -e MARATHON_APP_LABELS= -e MARATHON_APP_RESOURCE_CPUS > =1.0 -e MARATHON_APP_RESOURCE_DISK=0.0 -e MARATHON_APP_RESOURCE_GPUS=0 -e > MARATHON_APP_RESOURCE_MEM=128.0 -e > MARATHON_APP_VERSION=2017-11-01T20:33:44.869Z -e > MESOS_CONTAINER_NAME=mesos-84f9ae30-9d4c-484a-860c-ca7845b7ec75 -e > MESOS_SANDBOX=/mnt/mesos/sandbox -e MESOS_TA > SK_ID=testapp.fe35282f-bf43-11e7-a24b-0242ac110002 -e PORT=31464 -e > PORT0=31464 -e PORTS=31464 -e PORT_10000=31464 -e PORT_HTTP=31464 -v > /var/run/mesos/slaves/d9bb6e96-ee26-43c2-977e-0c404fdd4e81-S0/frameworks/a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001/executors/testapp > .fe35282f-bf43-11e7-a24b-0242ac110002/runs/84f9ae30-9d4c-484a-860c-ca7845b7ec75:/mnt/mesos/sandbox > --net host --entrypoint /bin/sh --name > mesos-84f9ae30-9d4c-484a-860c-ca7845b7ec75 > --label=MESOS_TASK_ID=testapp.fe35282f-bf43-11e7-a24b-0242ac110002 python:2 > -c date && p > ython -m SimpleHTTPServer $PORT0 > I1101 20:34:03.430402 68281 docker.cpp:1243] Running docker -H > unix:///var/run/docker.sock inspect mesos-84f9ae30-9d4c-484a-860c-ca7845b7ec75 > I1101 20:34:03.520303 68286 docker.cpp:1290] Retrying inspect with non-zero > status code. cmd: 'docker -H unix:///var/run/docker.sock inspect > mesos-84f9ae30-9d4c-484a-860c-ca7845b7ec75', interval: 500ms > I1101 20:34:04.021216 68288 docker.cpp:1243] Running docker -H > unix:///var/run/docker.sock inspect mesos-84f9ae30-9d4c-484a-860c-ca7845b7ec75 > I1101 20:34:04.124490 68281 docker.cpp:1290] Retrying inspect with non-zero > status code. cmd: 'docker -H unix:///var/run/docker.sock inspect > mesos-84f9ae30-9d4c-484a-860c-ca7845b7ec75', interval: 500ms > I1101 20:34:04.624964 68288 docker.cpp:1243] Running docker -H > unix:///var/run/docker.sock inspect mesos-84f9ae30-9d4c-484a-860c-ca7845b7ec75 > I1101 20:34:04.934087 68286 docker.cpp:1345] Retrying inspect since container > not yet started. cmd: 'docker -H unix:///var/run/docker.sock inspect > mesos-84f9ae30-9d4c-484a-860c-ca7845b7ec75', interval: 500ms > I1101 20:34:05.435145 68288 docker.cpp:1243] Running docker -H > unix:///var/run/docker.sock inspect mesos-84f9ae30-9d4c-484a-860c-ca7845b7ec75 > Wed Nov 1 20:34:06 UTC 2017 > {code} > But, somehow there is a TASK_FAILED message sent to marathon. > Upon further investigation, the following snippet can be found in the agent > logs (running in a docker container) > {code} > I1101 20:34:00.949129 9 slave.cpp:1736] Got assigned task > 'testapp.fe35282f-bf43-11e7-a24b-0242ac110002' for framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 > I1101 20:34:00.950150 9 gc.cpp:93] Unscheduling > '/var/run/mesos/slaves/d9bb6e96-ee26-43c2-977e-0c404fdd4e81-S0/frameworks/a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001' > from gc > I1101 20:34:00.950225 9 gc.cpp:93] Unscheduling > '/var/run/mesos/meta/slaves/d9bb6e96-ee26-43c2-977e-0c404fdd4e81-S0/frameworks/a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001' > from gc > I1101 20:34:00.950472 12 slave.cpp:2003] Authorizing task > 'testapp.fe35282f-bf43-11e7-a24b-0242ac110002' for framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 > I1101 20:34:00.951210 12 slave.cpp:2171] Launching task > 'testapp.fe35282f-bf43-11e7-a24b-0242ac110002' for framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 > I1101 20:34:00.952265 12 paths.cpp:578] Trying to chown > '/var/run/mesos/slaves/d9bb6e96-ee26-43c2-977e-0c404fdd4e81-S0/frameworks/a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001/executors/testapp.fe35282f-bf43-11e7-a24b-0242ac110002/runs/84f9ae30-9d4c-484a-860c-ca7845b7ec7 > 5' to user 'root' > I1101 20:34:00.952733 12 slave.cpp:7256] Launching executor > 'testapp.fe35282f-bf43-11e7-a24b-0242ac110002' of framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 with resources > [{"allocation_info":{"role":"*"},"name":"cpus","scalar":{"value":0.1},"type":"SCALAR"},{"a > llocation_info":{"role":"*"},"name":"mem","scalar":{"value":32.0},"type":"SCALAR"}] > in work directory > '/var/run/mesos/slaves/d9bb6e96-ee26-43c2-977e-0c404fdd4e81-S0/frameworks/a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001/executors/testapp.fe35282f-bf43-11e7-a24b-0242ac1100 > 02/runs/84f9ae30-9d4c-484a-860c-ca7845b7ec75' > I1101 20:34:00.953045 12 slave.cpp:2858] Launching container > 84f9ae30-9d4c-484a-860c-ca7845b7ec75 for executor > 'testapp.fe35282f-bf43-11e7-a24b-0242ac110002' of framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 > I1101 20:34:00.955057 6 docker.cpp:1136] Starting container > '84f9ae30-9d4c-484a-860c-ca7845b7ec75' for task > 'testapp.fe35282f-bf43-11e7-a24b-0242ac110002' (and executor > 'testapp.fe35282f-bf43-11e7-a24b-0242ac110002') of framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45 > -0001 > I1101 20:34:00.955263 12 slave.cpp:2400] Queued task > 'testapp.fe35282f-bf43-11e7-a24b-0242ac110002' for executor > 'testapp.fe35282f-bf43-11e7-a24b-0242ac110002' of framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 > I1101 20:34:00.955965 6 docker.cpp:1531] Running docker -H > unix:///var/run/docker.sock inspect python:2 > I1101 20:34:01.037293 8 docker.cpp:454] Docker pull python:2 completed > I1101 20:34:01.038180 8 docker.cpp:1080] Running docker -H > unix:///var/run/docker.sock run --cpu-shares 1126 --memory 167772160 -e > LIBPROCESS_IP=10.0.75.2 -e LIBPROCESS_PORT=0 -e > MESOS_AGENT_ENDPOINT=10.0.75.2:5051 -e MESOS_CHECKPOINT=1 -e > MESOS_CONTAINER_NAME=mesos > -84f9ae30-9d4c-484a-860c-ca7845b7ec75.executor -e > MESOS_DIRECTORY=/var/run/mesos/slaves/d9bb6e96-ee26-43c2-977e-0c404fdd4e81-S0/frameworks/a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001/executors/testapp.fe35282f-bf43-11e7-a24b-0242ac110002/runs/84f9ae30-9d4c-484a-860c-ca784 > 5b7ec75 -e MESOS_EXECUTOR_ID=testapp.fe35282f-bf43-11e7-a24b-0242ac110002 -e > MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD=5secs -e > MESOS_FRAMEWORK_ID=a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 -e > MESOS_HTTP_COMMAND_EXECUTOR=0 -e > MESOS_NATIVE_JAVA_LIBRARY=/usr/lib/libmesos-1.4.0. > so -e MESOS_NATIVE_LIBRARY=/usr/lib/libmesos-1.4.0.so -e > MESOS_RECOVERY_TIMEOUT=15mins -e MESOS_SANDBOX=/mnt/mesos/sandbox -e > MESOS_SLAVE_ID=d9bb6e96-ee26-43c2-977e-0c404fdd4e81-S0 -e > MESOS_SLAVE_PID=slave(1)@10.0.75.2:5051 -e > MESOS_SUBSCRIPTION_BACKOFF_MAX=2secs -v /va > r/run/docker.sock:/var/run/docker.sock:ro -v > /var/run/mesos/slaves/d9bb6e96-ee26-43c2-977e-0c404fdd4e81-S0/frameworks/a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001/executors/testapp.fe35282f-bf43-11e7-a24b-0242ac110002/runs/84f9ae30-9d4c-484a-860c-ca7845b7ec75:/var/run/meso > s/slaves/d9bb6e96-ee26-43c2-977e-0c404fdd4e81-S0/frameworks/a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001/executors/testapp.fe35282f-bf43-11e7-a24b-0242ac110002/runs/84f9ae30-9d4c-484a-860c-ca7845b7ec75:rw > -v /var/run/mesos/slaves/d9bb6e96-ee26-43c2-977e-0c404fdd4e81-S0/fra > meworks/a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001/executors/testapp.fe35282f-bf43-11e7-a24b-0242ac110002/runs/84f9ae30-9d4c-484a-860c-ca7845b7ec75:/mnt/mesos/sandbox > --net host --name mesos-84f9ae30-9d4c-484a-860c-ca7845b7ec75.executor > --pid=host --cap-add=SYS_ADMIN --c > ap-add=SYS_PTRACE mesos-docker /usr/libexec/mesos/mesos-docker-executor > --cgroups_enable_cfs=false > --container=mesos-84f9ae30-9d4c-484a-860c-ca7845b7ec75 --docker=docker > --docker_socket=/var/run/docker.sock --help=false > --initialize_driver_logging=true --launcher_dir=/u > sr/libexec/mesos --logbufsecs=0 --logging_level=INFO > --mapped_directory=/mnt/mesos/sandbox --quiet=false > --sandbox_directory=/var/run/mesos/slaves/d9bb6e96-ee26-43c2-977e-0c404fdd4e81-S0/frameworks/a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001/executors/testapp.fe35282f-bf4 > 3-11e7-a24b-0242ac110002/runs/84f9ae30-9d4c-484a-860c-ca7845b7ec75 > --stop_timeout=0ns > I1101 20:34:01.040096 8 docker.cpp:1243] Running docker -H > unix:///var/run/docker.sock inspect > mesos-84f9ae30-9d4c-484a-860c-ca7845b7ec75.executor > I1101 20:34:01.138551 5 docker.cpp:1290] Retrying inspect with non-zero > status code. cmd: 'docker -H unix:///var/run/docker.sock inspect > mesos-84f9ae30-9d4c-484a-860c-ca7845b7ec75.executor', interval: 1secs > I1101 20:34:02.138964 13 docker.cpp:1243] Running docker -H > unix:///var/run/docker.sock inspect > mesos-84f9ae30-9d4c-484a-860c-ca7845b7ec75.executor > I1101 20:34:03.423805 5 slave.cpp:3935] Got registration for executor > 'testapp.fe35282f-bf43-11e7-a24b-0242ac110002' of framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 from executor(1)@10.0.75.2:35675 > I1101 20:34:03.424316 5 docker.cpp:1616] Ignoring updating container > 84f9ae30-9d4c-484a-860c-ca7845b7ec75 because resources passed to update are > identical to existing resources > I1101 20:34:03.424396 5 slave.cpp:2605] Sending queued task > 'testapp.fe35282f-bf43-11e7-a24b-0242ac110002' to executor > 'testapp.fe35282f-bf43-11e7-a24b-0242ac110002' of framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 at executor(1)@10.0.75.2:35675 > I1101 20:34:04.052783 11 docker.cpp:1243] Running docker -H > unix:///var/run/docker.sock inspect mesos-84f9ae30-9d4c-484a-860c-ca7845b7ec75 > E1101 20:34:04.156435 12 slave.cpp:5292] Container > '84f9ae30-9d4c-484a-860c-ca7845b7ec75' for executor > 'testapp.fe35282f-bf43-11e7-a24b-0242ac110002' of framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 failed to start: Failed to run > 'docker -H unix:///var/run/dock > er.sock inspect mesos-84f9ae30-9d4c-484a-860c-ca7845b7ec75': exited with > status 1; stderr='Error: No such object: > mesos-84f9ae30-9d4c-484a-860c-ca7845b7ec75 > ' > I1101 20:34:04.156497 12 docker.cpp:2078] Container > 84f9ae30-9d4c-484a-860c-ca7845b7ec75 launch failed > I1101 20:34:04.156622 7 slave.cpp:5405] Executor > 'testapp.fe35282f-bf43-11e7-a24b-0242ac110002' of framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 has terminated with unknown status > I1101 20:34:04.156958 7 slave.cpp:4399] Handling status update > TASK_FAILED (UUID: 32c43a03-cede-49f8-9676-fd9411382c58) for task > testapp.fe35282f-bf43-11e7-a24b-0242ac110002 of framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 from @0.0.0.0:0 > E1101 20:34:04.157133 7 slave.cpp:4721] Failed to update resources for > container 84f9ae30-9d4c-484a-860c-ca7845b7ec75 of executor > 'testapp.fe35282f-bf43-11e7-a24b-0242ac110002' running task > testapp.fe35282f-bf43-11e7-a24b-0242ac110002 on status update for terminal t > ask, destroying container: Container not found > W1101 20:34:04.157173 11 composing.cpp:582] Attempted to destroy unknown > container 84f9ae30-9d4c-484a-860c-ca7845b7ec75 > I1101 20:34:04.159068 12 status_update_manager.cpp:323] Received status > update TASK_FAILED (UUID: 32c43a03-cede-49f8-9676-fd9411382c58) for task > testapp.fe35282f-bf43-11e7-a24b-0242ac110002 of framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 > I1101 20:34:04.159276 12 status_update_manager.cpp:834] Checkpointing > UPDATE for status update TASK_FAILED (UUID: > 32c43a03-cede-49f8-9676-fd9411382c58) for task > testapp.fe35282f-bf43-11e7-a24b-0242ac110002 of framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 > I1101 20:34:04.159399 12 slave.cpp:4880] Forwarding the update TASK_FAILED > (UUID: 32c43a03-cede-49f8-9676-fd9411382c58) for task > testapp.fe35282f-bf43-11e7-a24b-0242ac110002 of framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 to master@10.0.75.2:5050 > I1101 20:34:04.294680 8 status_update_manager.cpp:395] Received status > update acknowledgement (UUID: 32c43a03-cede-49f8-9676-fd9411382c58) for task > testapp.fe35282f-bf43-11e7-a24b-0242ac110002 of framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 > I1101 20:34:04.294747 8 status_update_manager.cpp:834] Checkpointing ACK > for status update TASK_FAILED (UUID: 32c43a03-cede-49f8-9676-fd9411382c58) > for task testapp.fe35282f-bf43-11e7-a24b-0242ac110002 of framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 > I1101 20:34:04.294945 8 slave.cpp:5516] Cleaning up executor > 'testapp.fe35282f-bf43-11e7-a24b-0242ac110002' of framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 at executor(1)@10.0.75.2:35675 > I1101 20:34:04.295308 8 slave.cpp:5612] Cleaning up framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 > I1101 20:34:04.295418 8 gc.cpp:59] Scheduling > '/var/run/mesos/slaves/d9bb6e96-ee26-43c2-977e-0c404fdd4e81-S0/frameworks/a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001/executors/testapp.fe35282f-bf43-11e7-a24b-0242ac110002/runs/84f9ae30-9d4c-484a-860c-ca7845b7ec75' > for gc > 6.99999658438519days in the future > I1101 20:34:04.295459 8 gc.cpp:59] Scheduling > '/var/run/mesos/slaves/d9bb6e96-ee26-43c2-977e-0c404fdd4e81-S0/frameworks/a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001/executors/testapp.fe35282f-bf43-11e7-a24b-0242ac110002' > for gc 6.99999658329778days in the future > I1101 20:34:04.295481 8 gc.cpp:59] Scheduling > '/var/run/mesos/meta/slaves/d9bb6e96-ee26-43c2-977e-0c404fdd4e81-S0/frameworks/a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001/executors/testapp.fe35282f-bf43-11e7-a24b-0242ac110002/runs/84f9ae30-9d4c-484a-860c-ca7845b7ec75' > f > or gc 6.99999658267259days in the future > I1101 20:34:04.295516 8 gc.cpp:59] Scheduling > '/var/run/mesos/meta/slaves/d9bb6e96-ee26-43c2-977e-0c404fdd4e81-S0/frameworks/a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001/executors/testapp.fe35282f-bf43-11e7-a24b-0242ac110002' > for gc 6.99999658223407days in the future > I1101 20:34:04.295537 8 gc.cpp:59] Scheduling > '/var/run/mesos/slaves/d9bb6e96-ee26-43c2-977e-0c404fdd4e81-S0/frameworks/a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001' > for gc 6.99999658143407days in the future > I1101 20:34:04.295558 8 gc.cpp:59] Scheduling > '/var/run/mesos/meta/slaves/d9bb6e96-ee26-43c2-977e-0c404fdd4e81-S0/frameworks/a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001' > for gc 6.9999965810637days in the future > I1101 20:34:04.295581 8 status_update_manager.cpp:285] Closing status > update streams for framework a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 > I1101 20:34:06.748510 5 slave.cpp:4399] Handling status update > TASK_RUNNING (UUID: 001f20d1-fe18-4cb0-9b39-cb9cd3cb9741) for task > testapp.fe35282f-bf43-11e7-a24b-0242ac110002 of framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 from executor(1)@10.0.75.2:35675 > W1101 20:34:06.748546 5 slave.cpp:4455] Ignoring status update > TASK_RUNNING (UUID: 001f20d1-fe18-4cb0-9b39-cb9cd3cb9741) for task > testapp.fe35282f-bf43-11e7-a24b-0242ac110002 of framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 for unknown framework a5eb6da1-f8ac- > 4642-8d66-cdd2e5b14d45-0001 > I1101 20:34:45.960857 9 slave.cpp:5828] Framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 seems to have exited. Ignoring > registration timeout for executor > 'testapp.f53fc2ce-bf43-11e7-a24b-0242ac110002' > I1101 20:34:52.039710 12 slave.cpp:5920] Current disk usage 0.50%. Max > allowed age: 6.265284309533982days > I1101 20:35:00.955883 12 slave.cpp:5828] Framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 seems to have exited. Ignoring > registration timeout for executor > 'testapp.fe35282f-bf43-11e7-a24b-0242ac110002' > I1101 20:35:52.040870 11 slave.cpp:5920] Current disk usage 0.50%. Max > allowed age: 6.265284309533982days > I1101 20:36:52.041280 5 slave.cpp:5920] Current disk usage 0.50%. Max > allowed age: 6.265284309533982days > I1101 20:37:52.042034 11 slave.cpp:5920] Current disk usage 0.50%. Max > allowed age: 6.265284309533982days > {code} > Of particular note is the following line: > {code} > I1101 20:34:01.138551 5 docker.cpp:1290] Retrying inspect with non-zero > status code. cmd: 'docker -H unix:///var/run/docker.sock inspect > mesos-84f9ae30-9d4c-484a-860c-ca7845b7ec75.executor', interval: 1secs > I1101 20:34:02.138964 13 docker.cpp:1243] Running docker -H > unix:///var/run/docker.sock inspect > mesos-84f9ae30-9d4c-484a-860c-ca7845b7ec75.executor > I1101 20:34:03.423805 5 slave.cpp:3935] Got registration for executor > 'testapp.fe35282f-bf43-11e7-a24b-0242ac110002' of framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 from executor(1)@10.0.75.2:35675 > I1101 20:34:03.424316 5 docker.cpp:1616] Ignoring updating container > 84f9ae30-9d4c-484a-860c-ca7845b7ec75 because resources passed to update are > identical to existing resources > I1101 20:34:03.424396 5 slave.cpp:2605] Sending queued task > 'testapp.fe35282f-bf43-11e7-a24b-0242ac110002' to executor > 'testapp.fe35282f-bf43-11e7-a24b-0242ac110002' of framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 at executor(1)@10.0.75.2:35675 > I1101 20:34:04.052783 11 docker.cpp:1243] Running docker -H > unix:///var/run/docker.sock inspect mesos-84f9ae30-9d4c-484a-860c-ca7845b7ec75 > E1101 20:34:04.156435 12 slave.cpp:5292] Container > '84f9ae30-9d4c-484a-860c-ca7845b7ec75' for executor > 'testapp.fe35282f-bf43-11e7-a24b-0242ac110002' of framework > a5eb6da1-f8ac-4642-8d66-cdd2e5b14d45-0001 failed to start: Failed to run > 'docker -H unix:///var/run/dock > er.sock inspect mesos-84f9ae30-9d4c-484a-860c-ca7845b7ec75': exited with > status 1; stderr='Error: No such object: > mesos-84f9ae30-9d4c-484a-860c-ca7845b7ec75 > ' > {code} > For some reason it seems like the AGENT is trying to connect to the TASK > docker container in addition to the executor container, and is failing to > find it and instantly dying instead of obeying a retry mechanism. I do not > have an environment setup to trace the code path of this error state, but the > expected behavior is either that the Agent leaves it to the executor to track > the task, or that the agent respects retries. Neither of which seem to be > happening here. -- This message was sent by Atlassian JIRA (v6.4.14#64029)