Craig,

Thanks for reporting! I believe this might be related to MESOS-7208 <https://issues.apache.org/jira/browse/MESOS-7208>, which is fixed in the 1.2.x branch, so 1.2.1 should have this issue resolved. Is there a way for you to test the 1.2.x branch to see if this problem still exists?

- Jie

On Wed, Apr 26, 2017 at 12:54 PM, De Groot (CTR), Craig <craig.degroot....@usgs.gov> wrote:

> Joseph,
>
> Below is the error log from the agent. The user has permission to read
> the file (docker.tar.gz). It just can't create the file because the copy
> runs as the specified user but the sandbox directory is owned by root.
> Curiously, both the stderr and stdout files are owned by the specified
> user. I see similar errors (cp: cannot create regular file) in the stderr
> file in the sandbox.
>
> ---------------------------
>
> W0413 11:59:29.657481 43771 fetcher.cpp:896] Begin fetcher log (stderr in sandbox) for container 51a13bcd-9598-423d-b437-54324960f5f7 from running command: /usr/libexec/mesos/mesos-fetcher
> I0413 11:59:29.617892 43795 fetcher.cpp:531] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/14820533-768a-4c32-8a99-80cbca9958af-S4\/testuser","items":[{"action":"BYPASS_CACHE","uri":{"cache":false,"executable":false,"extract":true,"value":"\/usr\/local\/usgs\/bridge\/docker.tar.gz"}}],"sandbox_directory":"\/usr\/local\/usgs\/mesos\/working\/slaves\/14820533-768a-4c32-8a99-80cbca9958af-S4\/frameworks\/f61717d6-3ee2-40a7-bbd0-dfdd36a11932-0000\/executors\/marathon-test.8eed2d73-206a-11e7-b25d-02421f7b2f1a\/runs\/51a13bcd-9598-423d-b437-54324960f5f7","user":"testuser"}
> I0413 11:59:29.634415 43795 fetcher.cpp:442] Fetching URI '/usr/local/usgs/bridge/docker.tar.gz'
> I0413 11:59:29.634462 43795 fetcher.cpp:283] Fetching directly into the sandbox directory
> I0413 11:59:29.634488 43795 fetcher.cpp:220] Fetching URI '/usr/local/usgs/bridge/docker.tar.gz'
> cp: cannot create regular file ‘/usr/local/usgs/mesos/working/slaves/14820533-768a-4c32-8a99-80cbca9958af-S4/frameworks/f61717d6-3ee2-40a7-bbd0-dfdd36a11932-0000/executors/marathon-test.8eed2d73-206a-11e7-b25d-02421f7b2f1a/runs/51a13bcd-9598-423d-b437-54324960f5f7/docker.tar.gz’: Permission denied
> Failed to fetch '/usr/local/usgs/bridge/docker.tar.gz': Failed to copy '/usr/local/usgs/bridge/docker.tar.gz': exited with status 1
>
> End fetcher log for container 51a13bcd-9598-423d-b437-54324960f5f7
> E0413 11:59:29.657582 43771 fetcher.cpp:558] Failed to run mesos-fetcher: Failed to fetch all URIs for container '51a13bcd-9598-423d-b437-54324960f5f7' with exit status: 256
> E0413 11:59:29.670099 43770 slave.cpp:4650] Container '51a13bcd-9598-423d-b437-54324960f5f7' for executor 'marathon-test.8eed2d73-206a-11e7-b25d-02421f7b2f1a' of framework f61717d6-3ee2-40a7-bbd0-dfdd36a11932-0000 failed to start: Failed to fetch all URIs for container '51a13bcd-9598-423d-b437-54324960f5f7' with exit status: 256
>
> __________________________________________________
> Craig De Groot
> Systems Engineer
> Stinger Ghaffarian Technologies (SGT)
> Technical Support Services Contractor to the
> U.S. Geological Survey (USGS)
> Earth Resources Observation and Science (EROS) Center
> 47914 252nd Street
> Sioux Falls, SD 57198-0001
> Ph: 605-594-2507
>
> On Wed, Apr 26, 2017 at 1:58 PM, Joseph Wu <jos...@mesosphere.io> wrote:
>
>> There was a change in 1.2.0 which changed how the fetcher would chown the
>> sandbox:
>> https://issues.apache.org/jira/browse/MESOS-5218
>>
>> Prior to 1.2, when the fetcher ran, it would recursively chown the entire
>> sandbox to the given user. This was incorrect behavior, since the Mesos
>> agent will create the sandbox under the same user (but might put some root
>> files in the non-root sandbox).
>>
>> Can you check your agent logs and paste the fetcher's error here?
>>
>> On Wed, Apr 26, 2017 at 9:06 AM, De Groot (CTR), Craig <craig.degroot....@usgs.gov> wrote:
>>
>>> We recently upgraded from Mesos 1.1.0 to 1.2.0 and are encountering
>>> errors with code that previously worked in 1.1.0. I believe that this is a
>>> bug in the new version. If not, I would like to know the correct procedure
>>> for using the sandbox as a user other than root.
>>>
>>> Here is the scenario:
>>> 1) Set up a job in Marathon which specifies a URI to our private
>>> docker.tar.gz
>>> - See this for an example: https://mesosphere.github.io/marathon/docs/native-docker-private-registry.html
>>> - This is a local file on each node
>>>
>>> 2) Specify a User (other than root) in the Marathon UI
>>>
>>> 3) Mesos will try to fetch the file and fails during the copy because
>>> the ownership of the sandbox directory is not changed to the specified
>>> user.
>>> - Note that 1.1.0 correctly set the sandbox directory to the specified
>>> user
>>> - This behavior is documented in the Mesos Docs here (see "specifying
>>> a user name"): http://mesos.apache.org/documentation/latest/fetcher/
>>>
>>> Thanks in advance for the help!
>>>
>>> __________________________________________________
>>> Craig De Groot
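
For anyone hitting the same "cp: cannot create regular file ... Permission denied" error, the symptom described in this thread (the copy runs as the specified user while the sandbox directory is owned by root) can be confirmed from the agent host. The following is only a rough sketch and is not part of Mesos: the function name can_create_in is made up for illustration, and it looks only at the classic owner/group/other mode bits of the directory itself, ignoring ACLs and the permissions of parent directories.

```python
import os
import pwd
import stat


def can_create_in(directory, username):
    """Return True if `username` could create a file in `directory`.

    Hypothetical diagnostic helper: judges only from the directory's
    owner/group/other mode bits, not ACLs or parent directories.
    """
    user = pwd.getpwnam(username)
    info = os.stat(directory)

    # root bypasses ordinary mode-bit checks.
    if user.pw_uid == 0:
        return True

    # Creating a file needs write and search (execute) permission
    # on the directory, taken from whichever class applies to the user.
    if info.st_uid == user.pw_uid:
        needed = stat.S_IWUSR | stat.S_IXUSR
    elif info.st_gid in os.getgrouplist(user.pw_name, user.pw_gid):
        needed = stat.S_IWGRP | stat.S_IXGRP
    else:
        needed = stat.S_IWOTH | stat.S_IXOTH

    return (info.st_mode & needed) == needed
```

Calling can_create_in() with the "sandbox_directory" and "user" values from the Fetcher Info log line would show whether the specified user is able to write into the sandbox at all.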