Craig,

Thanks for reporting! I believe this might be related to MESOS-7208
<https://issues.apache.org/jira/browse/MESOS-7208>, which is fixed in the
1.2.x branch, so 1.2.1 should have this issue resolved. Is there a way for
you to test the 1.2.x branch to see if this problem still exists?
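
In the meantime, one way to confirm the ownership mismatch described below
is to inspect the sandbox directory on the agent. This is only a rough
sketch and not part of Mesos: the function names are hypothetical, and the
check looks only at the classic owner/group/other permission bits, so it
ignores ACLs and other access controls.

```python
import os
import pwd
import stat


def sandbox_owner(path):
    """Return (uid, user name or None, permission bits) for a directory."""
    st = os.stat(path)
    try:
        name = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        # The uid has no passwd entry on this host.
        name = None
    return st.st_uid, name, stat.S_IMODE(st.st_mode)


def can_create_files(path, uid, gids=()):
    """Approximate check: may `uid` create files directly in `path`?

    Creating a file needs write and search (execute) permission on the
    directory. Only the mode bits are consulted; root is assumed to pass.
    """
    if uid == 0:
        return True
    st = os.stat(path)
    mode = stat.S_IMODE(st.st_mode)
    if st.st_uid == uid:
        bits = (mode >> 6) & 7   # owner bits
    elif st.st_gid in gids:
        bits = (mode >> 3) & 7   # group bits
    else:
        bits = mode & 7          # other bits
    return bits & 3 == 3         # write (2) and search (1)
```

Calling can_create_files() with the run directory from the agent log and
the uid of the task user (for example pwd.getpwnam("testuser").pw_uid)
should return False if the sandbox is still owned by root with typical 0755
permissions, which would match the "cp: cannot create regular file" error.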

- Jie

On Wed, Apr 26, 2017 at 12:54 PM, De Groot (CTR), Craig <
craig.degroot....@usgs.gov> wrote:

> Joseph,
>
> Below is the error log from the agent.  The user has permission to read
> the file (docker.tar.gz).  It just can't create the file because the copy
> runs as the specified user but the sandbox directory is owned by root.
> Curiously, both the stderr and stdout files are owned by the specified
> user.  I see similar errors (cp: cannot create regular file) in the stderr
> file in the sandbox.
>
> ---------------------------
>
> W0413 11:59:29.657481 43771 fetcher.cpp:896] Begin fetcher log (stderr in
> sandbox) for container 51a13bcd-9598-423d-b437-54324960f5f7 from running
> command: /usr/libexec/mesos/mesos-fetcher
> I0413 11:59:29.617892 43795 fetcher.cpp:531] Fetcher Info:
> {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/14820533-768a-4c32-8a99-
> 80cbca9958af-S4\/testuser","items":[{"action":"BYPASS_
> CACHE","uri":{"cache":false,"executable":false,"extract":
> true,"value":"\/usr\/local\/usgs\/bridge\/docker.tar.gz"}}
> ],"sandbox_directory":"\/usr\/local\/usgs\/mesos\/working\/
> slaves\/14820533-768a-4c32-8a99-80cbca9958af-S4\/
> frameworks\/f61717d6-3ee2-40a7-bbd0-dfdd36a11932-0000\/
> executors\/marathon-test.8eed2d73-206a-11e7-b25d-
> 02421f7b2f1a\/runs\/51a13bcd-9598-423d-b437-54324960f5f7","
> user":"testuser"}
> I0413 11:59:29.634415 43795 fetcher.cpp:442] Fetching URI
> '/usr/local/usgs/bridge/docker.tar.gz'
> I0413 11:59:29.634462 43795 fetcher.cpp:283] Fetching directly into the
> sandbox directory
> I0413 11:59:29.634488 43795 fetcher.cpp:220] Fetching URI
> '/usr/local/usgs/bridge/docker.tar.gz'
> cp: cannot create regular file ‘/usr/local/usgs/mesos/
> working/slaves/14820533-768a-4c32-8a99-80cbca9958af-S4/
> frameworks/f61717d6-3ee2-40a7-bbd0-dfdd36a11932-0000/
> executors/marathon-test.8eed2d73-206a-11e7-b25d-
> 02421f7b2f1a/runs/51a13bcd-9598-423d-b437-54324960f5f7/docker.tar.gz’:
> Permission denied
> Failed to fetch '/usr/local/usgs/bridge/docker.tar.gz': Failed to copy
> '/usr/local/usgs/bridge/docker.tar.gz': exited with status 1
>
> End fetcher log for container 51a13bcd-9598-423d-b437-54324960f5f7
> E0413 11:59:29.657582 43771 fetcher.cpp:558] Failed to run mesos-fetcher:
> Failed to fetch all URIs for container '51a13bcd-9598-423d-b437-54324960f5f7'
> with exit status: 256
> E0413 11:59:29.670099 43770 slave.cpp:4650] Container
> '51a13bcd-9598-423d-b437-54324960f5f7' for executor
> 'marathon-test.8eed2d73-206a-11e7-b25d-02421f7b2f1a' of framework
> f61717d6-3ee2-40a7-bbd0-dfdd36a11932-0000 failed to start: Failed to
> fetch all URIs for container '51a13bcd-9598-423d-b437-54324960f5f7' with
> exit status: 256
>
>
>
> __________________________________________________
> Craig De Groot
> Systems Engineer
> Stinger Ghaffarian Technologies (SGT)
> Technical Support Services Contractor to the
> U.S. Geological Survey (USGS)
> Earth Resources Observation and Science (EROS) Center
> 47914 252nd Street
> Sioux Falls, SD 57198-0001
> Ph: 605-594-2507
>
>
> On Wed, Apr 26, 2017 at 1:58 PM, Joseph Wu <jos...@mesosphere.io> wrote:
>
>> There was a change in 1.2.0 which changed how the fetcher would chown the
>> sandbox:
>> https://issues.apache.org/jira/browse/MESOS-5218
>>
>> Prior to 1.2, when the fetcher ran, it would recursively chown the entire
>> sandbox to the given user.  This was incorrect behavior, since the Mesos
>> agent will create the sandbox under the same user (but might put some root
>> files in the non-root sandbox).
>>
>> Can you check your agent logs and paste the fetcher's error here?
>>
>> On Wed, Apr 26, 2017 at 9:06 AM, De Groot (CTR), Craig <
>> craig.degroot....@usgs.gov> wrote:
>>
>>> We recently upgraded from Mesos 1.1.0 to 1.2.0 and are encountering
>>> errors with code that previously worked in 1.1.0.  I believe that this is a
>>> bug in the new version.  If not, I would like to know the correct procedure
>>> for using the sandbox as a user other than root.
>>>
>>> Here is the scenario:
>>> 1) Set up a job in Marathon which specifies a URI to our private
>>> docker.tar.gz
>>>   - See this page for an example:
>>> https://mesosphere.github.io/marathon/docs/native-docker-private-registry.html
>>>   - This is a local file on each node
>>>
>>> 2) Specify a User (other than root) in the Marathon UI
>>>
>>> 3) Mesos will try to fetch the file and fails during the copy because
>>> the ownership of the sandbox directory is not changed to the specified
>>> user.
>>>   - Note that 1.1.0 correctly set the sandbox directory to the specified
>>> user
>>>   - This behavior is documented in the Mesos Docs here (see "specifying
>>> a user name"):  http://mesos.apache.org/documentation/latest/fetcher/
>>>
>>> Thanks in advance for the help!
>>>
>>> __________________________________________________
>>> Craig De Groot
>>>
>>>
>>>
>>
>
