Chen Zhiwei created MESOS-5839:
----------------------------------

             Summary: Mesos docker image can't be started by docker-py
                 Key: MESOS-5839
                 URL: https://issues.apache.org/jira/browse/MESOS-5839
             Project: Mesos
          Issue Type: Bug
            Reporter: Chen Zhiwei


I can start a Mesos agent container with the `docker run` command, but starting 
the same container with docker-py fails.

When I start the Mesos agent container with docker-py, the agent logs the 
following and then exits:
{quote}
I0713 02:07:55.161175 24808 logging.cpp:193] INFO level logging started!
I0713 02:07:55.162096 24808 main.cpp:264] Build: 2016-06-22 05:16:01 by root
I0713 02:07:55.162303 24808 main.cpp:266] Version: 1.0.0
I0713 02:07:55.162593 24808 main.cpp:269] Git tag: 1.0.0-rc1
I0713 02:07:55.162782 24808 main.cpp:273] Git SHA: 
dfe62665df67162e4c1064f524d6c0180100a9d2
I0713 02:07:55.169150 24808 systemd.cpp:237] systemd version `229` detected
I0713 02:07:55.277576 24808 containerizer.cpp:198] Using isolation: 
posix/cpu,posix/mem,filesystem/posix,network/cni
I0713 02:07:55.298914 24808 linux_launcher.cpp:101] Using 
/sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
I0713 02:07:55.302297 24808 main.cpp:406] Starting Mesos agent
I0713 02:07:55.302889 24820 slave.cpp:203] Agent started on 1)@9.21.60.205:5051
I0713 02:07:55.303081 24820 slave.cpp:204] Flags at startup: 
--appc_simple_discovery_uri_prefix="http://" 
--appc_store_dir="/tmp/mesos/store/appc" --authenticate_http="false" 
--authenticatee="crammd5" --authorizer="local" 
--cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" 
--cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" 
--cgroups_root="mesos" --container_disk_watch_interval="15secs" 
--containerizers="docker,mesos" --default_role="*" 
--disk_watch_interval="1mins" --docker="docker" --docker_kill_orphans="true" 
--docker_registry="https://registry-1.docker.io" --docker_remove_delay="6hrs" 
--docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" 
--docker_store_dir="/tmp/mesos/store/docker" 
--docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume" 
--enforce_container_disk_quota="false" --executor_registration_timeout="1mins" 
--executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/tmp/mesos/fetch" 
--fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" 
--gc_disk_headroom="0.1" --hadoop_home="" --help="false" 
--hostname_lookup="true" --http_authenticators="basic" 
--http_command_executor="false" --image_provisioner_backend="copy" 
--initialize_driver_logging="true" --isolation="posix/cpu,posix/mem" 
--launcher_dir="/usr/libexec/mesos" --log_dir="/var/log/mesos" --logbufsecs="0" 
--logging_level="INFO" --master="9.21.60.90:5050" 
--oversubscribed_resources_interval="15secs" --perf_duration="10secs" 
--perf_interval="1mins" --port="5051" --qos_correction_interval_min="0ns" 
--quiet="false" --recover="reconnect" --recovery_timeout="15mins" 
--registration_backoff_factor="1secs" --revocable_cpu_low_priority="true" 
--sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="false" 
--systemd_enable_support="false" 
--systemd_runtime_directory="/run/systemd/system" --version="false" 
--work_dir="/var/lib/mesos"
..........
W0713 02:07:55.481570 24808 logging.cpp:91] {color:blue}*RAW: Received signal 
SIGTERM from process 19221 of user 0; exiting*{color}
{quote}

Process 19221 is the docker-containerd daemon. I am not sure whether this issue 
is related to the fix in https://issues.apache.org/jira/browse/MESOS-4279 , but 
mesosphere/mesos:0.28.2 does not have this problem.

Following is the Docker daemon log:

{quote}{code}
Jul 12 22:07:54 ubuntu5 docker[19214]: 
time="2016-07-12T22:07:54.954452857-04:00" level=warning msg="Security options 
with `:` as a separator are deprecated and will be completely unsupported in 
1.13, use `=` instead."
Jul 12 22:07:54 ubuntu5 kernel: [141068.247724] aufs 
au_opts_verify:1597:docker[26535]: dirperm1 breaks the protection by the 
permission bits on the lower branch
Jul 12 22:07:54 ubuntu5 kernel: [141068.269016] aufs 
au_opts_verify:1597:docker[26535]: dirperm1 breaks the protection by the 
permission bits on the lower branch
Jul 12 22:07:54 ubuntu5 kernel: [141068.287598] aufs 
au_opts_verify:1597:docker[19714]: dirperm1 breaks the protection by the 
permission bits on the lower branch
Jul 12 22:07:55 ubuntu5 docker[19214]: 
time="2016-07-12T22:07:55.509756825-04:00" level=info msg="Container 
ea0f9128d489f56f3b9f64a24926a53e499b4b6e5243301a01c69e171d4f054b failed to exit 
within 0 seconds of signal 15 - using the force"
Jul 12 22:07:55 ubuntu5 docker[19214]: 
time="2016-07-12T22:07:55.568161690-04:00" level=warning msg="container 
ea0f9128d489f56f3b9f64a24926a53e499b4b6e5243301a01c69e171d4f054b restart 
canceled"
{code}{quote}

h2. Docker run commands:
{code}
docker run -d --net=host --pid=host --privileged \
  -e MESOS_MASTER=9.21.60.192:5050 \
  -e MESOS_SWITCH_USER=0 \
  -e MESOS_CONTAINERIZERS=docker,mesos \
  -e MESOS_LOG_DIR=/var/log/mesos \
  -e MESOS_WORK_DIR=/var/lib/mesos \
  -v /var/log/mesos:/var/log/mesos -v /var/lib/mesos:/var/lib/mesos \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /sys:/sys -v /var/lib/docker:/var/lib/docker \
  -v /cgroup:/cgroup -v /dev:/dev \
  chenzhiwei/mesos:1.0.0-rc1 mesos-slave --no-systemd_enable_support
{code}

h2. docker-py code:
{code}
from docker import Client

cli = Client(base_url='unix://var/run/docker.sock')

# Host-level options: privileged mode, host network/PID namespaces, bind mounts.
host_config = cli.create_host_config(
    privileged=True,
    network_mode="host",
    pid_mode="host",
    binds=["/dev:/dev", "/sys:/sys", "/cgroup:/cgroup",
           "/var/lib/mesos:/var/lib/mesos", "/var/log/mesos:/var/log/mesos",
           "/var/lib/docker:/var/lib/docker",
           "/var/run/docker.sock:/var/run/docker.sock"],
)

container = cli.create_container(
    image='chenzhiwei/mesos:1.0.0-rc1',
    command="mesos-slave --no-systemd_enable_support",
    environment={"MESOS_LOG_DIR": "/var/log/mesos",
                 "MESOS_WORK_DIR": "/var/lib/mesos",
                 "MESOS_MASTER": "9.21.60.192:5050",
                 "MESOS_SWITCH_USER": 0,
                 "MESOS_CONTAINERIZERS": "docker,mesos"},
    volumes=["/dev", "/sys", "/cgroup", "/var/lib/mesos", "/var/log/mesos",
             "/var/lib/docker", "/var/run/docker.sock"],
    host_config=host_config,
    name="mesos-agent",
)

# Start the container using the ID returned by create_container.
cli.start(container=container.get('Id'))
{code}

I also tested with `docker create` followed by `docker start`, and that works 
as well. I am not sure whether this issue belongs to Mesos or docker-py.
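
One way to narrow this down might be to compare what Docker actually recorded for the working container and for the failing one. The following is only a minimal sketch: `diff_config` is a hypothetical helper (not part of docker-py), and the sample dictionaries are made-up values for illustration, not output from my host. In practice the two inputs would come from docker-py's `cli.inspect_container(...)`.

```python
def diff_config(a, b, path=""):
    """Return (path, value_a, value_b) for every leaf that differs.

    Nested dicts are walked recursively; any other values (strings,
    lists, None, ...) are compared with plain equality.
    """
    diffs = []
    if isinstance(a, dict) and isinstance(b, dict):
        for key in sorted(set(a) | set(b)):
            # A key missing on one side shows up as None on that side.
            diffs.extend(diff_config(a.get(key), b.get(key), path + "/" + key))
    elif a != b:
        diffs.append((path, a, b))
    return diffs


# Made-up sample data standing in for two inspect results.
working = {"HostConfig": {"Privileged": True, "PidMode": "host"}}
failing = {"HostConfig": {"Privileged": True, "PidMode": ""}}

for path, left, right in diff_config(working, failing):
    print(path, repr(left), repr(right))

# Against a real daemon (assumption: both containers still exist):
#   working = cli.inspect_container("<id of the docker run container>")
#   failing = cli.inspect_container("mesos-agent")
#   diff_config(working["HostConfig"], failing["HostConfig"])
#   diff_config(working["Config"], failing["Config"])
```

Diffing only the `HostConfig` and `Config` sections avoids noise from fields that always differ between two containers, such as the ID and creation time.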

I also tried the official Mesosphere Docker image (mesosphere/mesos:1.0.0-rc1), 
with the same result: the `docker run` command works, but docker-py fails with 
the same error message.

*NOTE*: My Docker host runs Ubuntu 16.04 and the Docker image is 
chenzhiwei/mesos:1.0.0-rc1, but you can also reproduce this with 
mesosphere/mesos:1.0.0-rc1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
