[ https://issues.apache.org/jira/browse/MESOS-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258999#comment-16258999 ]
Wang Qiang commented on MESOS-5028:
-----------------------------------

I am not sure whether I should post this here, but I seem to have a similar issue with Mesos 1.2.1. I am trying to follow the Mesos GPU tutorial (using UCR), but the launch fails with:
```
message: 'Failed to launch container: Collect failed: Failed to remove the entries under the directory labeled as opaque whiteout '/data/mesos/slave/provisioner/containers/d7651be4-5eb9-4973-86e5-e018e207a327/backends/copy/rootfses/d9515065-5806-4a19-82f5-eb80ddb040bf/usr/local/cuda-9.0': No such file or directory'
```
The image I am using is nvidia:cuda. The Dockerfile for the CUDA image contains:
```
RUN apt-get update && apt-get install -y --no-install-recommends \
        cuda-cudart-$CUDA_PKG_VERSION && \
    ln -s cuda-9.0 /usr/local/cuda && \
    rm -rf /var/lib/apt/lists/*
```

> Copy provisioner cannot replace directory with symlink
> ------------------------------------------------------
>
>                 Key: MESOS-5028
>                 URL: https://issues.apache.org/jira/browse/MESOS-5028
>             Project: Mesos
>          Issue Type: Bug
>          Components: containerization
>            Reporter: Zhitao Li
>            Assignee: Chun-Hung Hsiao
>             Fix For: 1.1.2, 1.2.1, 1.3.0
>
>
> I'm trying to play with the new image provisioner on our custom docker
> images, but one of the layers failed to get copied, possibly due to a dangling
> symlink.
> Error log with Glog_v=1:
> {quote}
> I0324 05:42:48.926678 15067 copy.cpp:127] Copying layer path
> '/tmp/mesos/store/docker/layers/5df0888641196b88dcc1b97d04c74839f02a73b8a194a79e134426d6a8fcb0f1/rootfs'
> to rootfs
> '/var/lib/mesos/provisioner/containers/5f05be6c-c970-4539-aa64-fd0eef2ec7ae/backends/copy/rootfses/507173f3-e316-48a3-a96e-5fdea9ffe9f6'
> E0324 05:42:49.028506 15062 slave.cpp:3773] Container
> '5f05be6c-c970-4539-aa64-fd0eef2ec7ae' for executor 'test' of framework
> 75932a89-1514-4011-bafe-beb6a208bb2d-0004 failed to start: Collect failed:
> Collect failed: Failed to copy layer: cp: cannot overwrite directory
> ‘/var/lib/mesos/provisioner/containers/5f05be6c-c970-4539-aa64-fd0eef2ec7ae/backends/copy/rootfses/507173f3-e316-48a3-a96e-5fdea9ffe9f6/etc/apt’
> with non-directory
> {quote}
> Content of
> _/tmp/mesos/store/docker/layers/5df0888641196b88dcc1b97d04c74839f02a73b8a194a79e134426d6a8fcb0f1/rootfs/etc/apt_
> points to a non-existing absolute path (cannot provide exact path but it's a
> result of us trying to mount apt keys into docker container at build time).
> I believe what happened is that we executed a script at build time, which
> contains equivalent of:
> {quote}
> rm -rf /etc/apt/* && ln -sf /build-mount-point/ /etc/apt
> {quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
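
The `cp` failure quoted in the issue description can probably be reproduced outside Mesos. The following is a minimal sketch, not the actual Mesos patch: it assumes a scratch directory from `mktemp`, two hypothetical layer directories (`layer-lower`, `layer-upper`), and a `cp` that supports `-a` and refuses to overwrite a directory with a symlink (as GNU `cp` appears to do in the log). It first tries the plain copy, then applies one possible workaround: delete any real directory in the rootfs that the incoming layer replaces with a non-directory, and copy again.

```shell
#!/bin/sh
# Reproduction sketch. All directory names below are hypothetical.
set -eu

work=$(mktemp -d)
lower="$work/layer-lower"   # earlier layer: etc/apt is a real directory
upper="$work/layer-upper"   # later layer: etc/apt is a dangling symlink
rootfs="$work/rootfs"       # stand-in for the provisioned rootfs

mkdir -p "$lower/etc/apt" "$upper/etc" "$rootfs"
touch "$lower/etc/apt/sources.list"
ln -s /build-mount-point "$upper/etc/apt"

# Apply the earlier layer.
cp -a "$lower/." "$rootfs/"

# Plain copy of the later layer. With a cp that behaves like the one in
# the log, this is expected to fail on etc/apt; record the outcome
# instead of aborting the script.
if cp -a "$upper/." "$rootfs/" 2>"$work/cp.err"; then
    naive=succeeded
else
    naive=failed
fi
echo "plain copy: $naive"

# Workaround sketch: for every non-directory entry in the later layer,
# remove the matching path in the rootfs if it is a real directory.
# (Assumes no newlines in file names.)
(cd "$upper" && find . ! -type d) | while IFS= read -r rel; do
    dst="$rootfs/$rel"
    if [ -d "$dst" ] && [ ! -L "$dst" ]; then
        rm -rf "$dst"
    fi
done

# The copy can now place the symlink where the directory used to be.
cp -a "$upper/." "$rootfs/"
echo "etc/apt now points to: $(readlink "$rootfs/etc/apt")"
```

The scratch directory is left in place so the result can be inspected; remove it with `rm -rf "$work"` when done. This sketch does not cover the opaque whiteout error reported in the comment above.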