Yu Yang created MESOS-6810:
------------------------------

             Summary: Tasks getting stuck in STAGING state when using unified containerizer
                 Key: MESOS-6810
                 URL: https://issues.apache.org/jira/browse/MESOS-6810
             Project: Mesos
          Issue Type: Bug
          Components: containerization, docker
    Affects Versions: 1.1.0, 1.0.1, 1.0.0
         Environment: *OS*: Ubuntu 16.04, 64-bit
*mesos*: 1.1.0, one master and one agent on the same machine
*Agent flag*: {{sudo ./bin/mesos-agent.sh --master=192.168.1.192:5050 
--work_dir=/tmp/mesos_slave --image_providers=docker 
--isolation=docker/runtime,filesystem/linux,cgroups/devices,gpu/nvidia 
--containerizers=mesos,docker --executor_environment_variables="{}"}}
            Reporter: Yu Yang


When submitting tasks with container settings like:
{
        "container": {
                "mesos": {
                        "image": {
                                "docker": {
                                        "name": "nvidia/cuda"
                                },
                                "type": "DOCKER"
                        }
                },
                "type": "MESOS"
        }
}
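
For anyone trying to reproduce this, here is a minimal sketch that builds the same container settings in Python and serializes them as valid JSON. The helper name {{mesos_container_with_docker_image}} is hypothetical and not part of any Mesos API. How the fragment is embedded in the full task definition depends on the framework, which this report does not specify.

```python
import json


def mesos_container_with_docker_image(image_name):
    """Build the container settings from this report as a plain dict.

    Hypothetical helper for illustration only: it mirrors the JSON
    fragment above (MESOS container type with a DOCKER image).
    """
    return {
        "container": {
            "type": "MESOS",
            "mesos": {
                "image": {
                    "type": "DOCKER",
                    "docker": {"name": image_name},
                }
            },
        }
    }


# Same image name as in the report.
settings = mesos_container_with_docker_image("nvidia/cuda")

# json.dumps always emits syntactically valid JSON (no trailing commas).
print(json.dumps(settings, indent=2))
```

Generating the fragment this way avoids hand-editing mistakes such as stray commas, which strict JSON parsers reject.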

the task gets stuck in the STAGING state and eventually fails with the 
message {{Failed to launch container: Collect failed: Failed to perform 'curl': 
curl: (56) GnuTLS recv error (-54): Error in pull function}}.

This is the related log on the agent:

{quote}
I1217 13:05:35.406365 20780 slave.cpp:1539] Got assigned task 
'mesos_containerizer_test.2a845a72-7b54-4a95-b6fa-6aeda8c6b591' for framework 
02083c57-b2d9-4054-babe-90e962816813-0001
I1217 13:05:35.406749 20780 slave.cpp:1701] Launching task 
'mesos_containerizer_test.2a845a72-7b54-4a95-b6fa-6aeda8c6b591' for framework 
02083c57-b2d9-4054-babe-90e962816813-0001
I1217 13:05:35.406970 20780 paths.cpp:536] Trying to chown 
'/tmp/mesos_slave/slaves/02083c57-b2d9-4054-babe-90e962816813-S0/frameworks/02083c57-b2d9-4054-babe-90e962816813-0001/executors/mesos_containerizer_test.2a845a72-7b54-4a95-b6fa-6aeda8c6b591/runs/8be3b5cd-afa3-4189-aa2a-f09d73529f8c'
 to user 'root'
I1217 13:05:35.409272 20780 slave.cpp:6179] Launching executor 
'mesos_containerizer_test.2a845a72-7b54-4a95-b6fa-6aeda8c6b591' of framework 
02083c57-b2d9-4054-babe-90e962816813-0001 with resources cpus(*):0.1; mem(*):32 
in work directory 
'/tmp/mesos_slave/slaves/02083c57-b2d9-4054-babe-90e962816813-S0/frameworks/02083c57-b2d9-4054-babe-90e962816813-0001/executors/mesos_containerizer_test.2a845a72-7b54-4a95-b6fa-6aeda8c6b591/runs/8be3b5cd-afa3-4189-aa2a-f09d73529f8c'
I1217 13:05:35.409958 20780 slave.cpp:1987] Queued task 
'mesos_containerizer_test.2a845a72-7b54-4a95-b6fa-6aeda8c6b591' for executor 
'mesos_containerizer_test.2a845a72-7b54-4a95-b6fa-6aeda8c6b591' of framework 
02083c57-b2d9-4054-babe-90e962816813-0001
I1217 13:05:35.410163 20779 docker.cpp:1000] Skipping non-docker container
I1217 13:05:35.410636 20776 containerizer.cpp:938] Starting container 
8be3b5cd-afa3-4189-aa2a-f09d73529f8c for executor 
'mesos_containerizer_test.2a845a72-7b54-4a95-b6fa-6aeda8c6b591' of framework 
02083c57-b2d9-4054-babe-90e962816813-0001
I1217 13:05:44.459362 20778 slave.cpp:4992] Terminating executor 
''cuda_mesos_nvidia_tf.72e9b9cf-8220-49bd-86fe-1667ee5e7a02' of framework 
02083c57-b2d9-4054-babe-90e962816813-0001' because it did not register within 
1mins
I1217 13:05:53.586819 20780 slave.cpp:5044] Current disk usage 63.59%. Max 
allowed age: 1.848503351525151days
I1217 13:06:35.410905 20777 slave.cpp:4992] Terminating executor 
''mesos_containerizer_test.2a845a72-7b54-4a95-b6fa-6aeda8c6b591' of framework 
02083c57-b2d9-4054-babe-90e962816813-0001' because it did not register within 
1mins
I1217 13:06:35.411175 20780 containerizer.cpp:1950] Destroying container 
8be3b5cd-afa3-4189-aa2a-f09d73529f8c in PROVISIONING state
{quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
