Re: Mesos fetcher error when running as non-root user

2017-04-26 Thread Jie Yu
Craig,

Thanks for reporting! I believe this might be related to MESOS-7208, which
is fixed in the 1.2.x branch, so 1.2.1 should have this issue resolved. Is
there a way to test the 1.2.x branch to see if this problem still exists?

- Jie


Re: Mesos fetcher error when running as non-root user

2017-04-26 Thread De Groot (CTR), Craig
Joseph,

Below is the error log from the agent.  The user has permission to read the
file (docker.tar.gz).  It just can't create the file because the copy runs
as the specified user but the sandbox directory is owned by root.
Curiously, both the stderr and stdout files are owned by the specified
user.  I see similar errors (cp: cannot create regular file) in the stderr
file in the sandbox.

---

W0413 11:59:29.657481 43771 fetcher.cpp:896] Begin fetcher log (stderr in
sandbox) for container 51a13bcd-9598-423d-b437-54324960f5f7 from running
command: /usr/libexec/mesos/mesos-fetcher
I0413 11:59:29.617892 43795 fetcher.cpp:531] Fetcher Info:
{"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/14820533-768a-4c32-8a99-80cbca9958af-S4\/testuser","items":[{"action":"BYPASS_CACHE","uri":{"cache":false,"executable":false,"extract":true,"value":"\/usr\/local\/usgs\/bridge\/docker.tar.gz"}}],"sandbox_directory":"\/usr\/local\/usgs\/mesos\/working\/slaves\/14820533-768a-4c32-8a99-80cbca9958af-S4\/frameworks\/f61717d6-3ee2-40a7-bbd0-dfdd36a11932-\/executors\/marathon-test.8eed2d73-206a-11e7-b25d-02421f7b2f1a\/runs\/51a13bcd-9598-423d-b437-54324960f5f7","user":"testuser"}
I0413 11:59:29.634415 43795 fetcher.cpp:442] Fetching URI
'/usr/local/usgs/bridge/docker.tar.gz'
I0413 11:59:29.634462 43795 fetcher.cpp:283] Fetching directly into the
sandbox directory
I0413 11:59:29.634488 43795 fetcher.cpp:220] Fetching URI
'/usr/local/usgs/bridge/docker.tar.gz'
cp: cannot create regular file
‘/usr/local/usgs/mesos/working/slaves/14820533-768a-4c32-8a99-80cbca9958af-S4/frameworks/f61717d6-3ee2-40a7-bbd0-dfdd36a11932-/executors/marathon-test.8eed2d73-206a-11e7-b25d-02421f7b2f1a/runs/51a13bcd-9598-423d-b437-54324960f5f7/docker.tar.gz’:
Permission denied
Failed to fetch '/usr/local/usgs/bridge/docker.tar.gz': Failed to copy
'/usr/local/usgs/bridge/docker.tar.gz': exited with status 1

End fetcher log for container 51a13bcd-9598-423d-b437-54324960f5f7
E0413 11:59:29.657582 43771 fetcher.cpp:558] Failed to run mesos-fetcher:
Failed to fetch all URIs for container
'51a13bcd-9598-423d-b437-54324960f5f7' with exit status: 256
E0413 11:59:29.670099 43770 slave.cpp:4650] Container
'51a13bcd-9598-423d-b437-54324960f5f7' for executor
'marathon-test.8eed2d73-206a-11e7-b25d-02421f7b2f1a' of framework
f61717d6-3ee2-40a7-bbd0-dfdd36a11932- failed to start: Failed to fetch
all URIs for container '51a13bcd-9598-423d-b437-54324960f5f7' with exit
status: 256
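
For anyone comparing their own agent against the log above, here is a
minimal sketch (not part of Mesos) that pulls the sandbox directory and task
user out of a "Fetcher Info" JSON line and reports who owns that directory.
The helper names parse_fetcher_info and describe_owner are made up for this
example, and the pwd module assumes a Unix host.

```python
import json
import os
import pwd


def parse_fetcher_info(raw):
    """Return (sandbox_directory, user) from a 'Fetcher Info' JSON string."""
    info = json.loads(raw)  # json.loads turns the escaped "\/" back into "/"
    return info["sandbox_directory"], info.get("user")


def describe_owner(path):
    """Return (owner_name, octal_mode) for path; fall back to the numeric uid."""
    st = os.stat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        # No passwd entry for this uid (common inside containers).
        owner = str(st.st_uid)
    return owner, oct(st.st_mode & 0o777)


if __name__ == "__main__":
    # Hypothetical, shortened stand-in for the JSON line in the agent log.
    raw = '{"sandbox_directory":"\\/tmp","user":"testuser","items":[]}'
    sandbox, user = parse_fetcher_info(raw)
    owner, mode = describe_owner(sandbox)
    print(f"sandbox={sandbox} owner={owner} mode={mode} task user={user}")
```

If the reported owner is root and the mode gives no write access to others,
a copy running as the task user would fail with the same "Permission denied"
seen in the log.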



__
Craig De Groot
Systems Engineer
Stinger Ghaffarian Technologies (SGT)
Technical Support Services Contractor to the
U.S. Geological Survey (USGS)
Earth Resources Observation and Science (EROS) Center
47914 252nd Street
Sioux Falls, SD 57198-0001
Ph: 605-594-2507




Re: Mesos fetcher error when running as non-root user

2017-04-26 Thread Joseph Wu
A change in 1.2.0 altered how the fetcher chowns the sandbox:
https://issues.apache.org/jira/browse/MESOS-5218

Prior to 1.2, when the fetcher ran, it would recursively chown the entire
sandbox to the given user.  This was incorrect behavior, since the Mesos
agent will create the sandbox under the same user (but might put some root
files in the non-root sandbox).

Can you check your agent logs and paste the fetcher's error here?
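
To see which sandbox entries end up owned by someone other than the task
user, something like the following could be run on the agent host. This is
only a diagnostic sketch, not how the fetcher or agent works internally; the
function name entries_not_owned_by is hypothetical.

```python
import os


def entries_not_owned_by(sandbox, uid):
    """Return paths under sandbox (and sandbox itself) not owned by uid."""
    mismatched = []
    # lstat inspects a symlink itself rather than the file it points to.
    if os.lstat(sandbox).st_uid != uid:
        mismatched.append(sandbox)
    for root, dirs, files in os.walk(sandbox):
        for name in dirs + files:
            path = os.path.join(root, name)
            if os.lstat(path).st_uid != uid:
                mismatched.append(path)
    return sorted(mismatched)


if __name__ == "__main__":
    # Check the current directory against the current user as a smoke test.
    for path in entries_not_owned_by(".", os.getuid()):
        print(path)
```

An empty result means everything in the sandbox belongs to the given uid; if
the sandbox directory itself shows up, the task user most likely cannot
create files in it.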



Mesos fetcher error when running as non-root user

2017-04-26 Thread De Groot (CTR), Craig
We recently upgraded from Mesos 1.1.0 to 1.2.0 and are encountering errors
with code that previously worked in 1.1.0.  I believe that this is a bug in
the new version.  If not, I would like to know the correct procedure for
using the sandbox as a user other than root.

Here is the scenario:
1) Set up a job in Marathon that specifies a URI to our private
docker.tar.gz
  - See this for an example:
https://mesosphere.github.io/marathon/docs/native-docker-private-registry.html
  - This is a local file on each node

2) Specify a user (other than root) in the Marathon UI

3) Mesos tries to fetch the file and fails during the copy because the
ownership of the sandbox directory is not changed to the specified user.
  - Note that 1.1.0 correctly set the sandbox directory's owner to the
specified user
  - This behavior is documented in the Mesos docs here (see "specifying a
user name"):  http://mesos.apache.org/documentation/latest/fetcher/
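
For reference, the scenario above corresponds roughly to an app definition
like the one this sketch builds. The "user", "uris" and "container" fields
follow the Marathon app format shown on the linked private-registry page;
the app id, image name and resource values are placeholders, not our real
configuration, and build_app is a hypothetical helper.

```python
import json


def build_app(app_id, user, uri, image):
    """Build a Marathon-style app definition as a plain dict."""
    return {
        "id": app_id,
        "user": user,  # the non-root user the task should run as
        "cpus": 0.5,  # placeholder resource values
        "mem": 256,
        "instances": 1,
        "container": {
            "type": "DOCKER",
            "docker": {"image": image, "network": "HOST"},
        },
        # The fetcher copies this archive into the task sandbox.
        "uris": [uri],
    }


if __name__ == "__main__":
    app = build_app(
        "/marathon-test",
        "testuser",
        "/usr/local/usgs/bridge/docker.tar.gz",
        "registry.example.com/namespace/repo",  # hypothetical image name
    )
    print(json.dumps(app, indent=2))
```

The printed JSON could be posted to Marathon or pasted into its UI; step 3
is where the fetch into the sandbox fails for us.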

Thanks in advance for the help!

__
Craig De Groot


Last chance: ApacheCon is just three weeks away

2017-04-26 Thread Rich Bowen
ApacheCon is just three weeks away, in Miami, Florida, May 15th - 18th.
http://apachecon.com/

There's still time to register and attend. ApacheCon is the best place
to find out about tomorrow's software, today.

ApacheCon is the official convention of The Apache Software Foundation,
and includes the co-located events:
  * Apache: Big Data
  * Apache: IoT
  * TomcatCon
  * FlexJS Summit
  * CloudStack Collaboration Conference
  * BarCampApache
  * ApacheCon Lightning Talks

And there are dozens of opportunities to meet your fellow Apache
enthusiasts, both from your project and from the other 200+ projects at
the Apache Software Foundation.

Register here:
http://events.linuxfoundation.org/events/apachecon-north-america/attend/register-

More information here: http://apachecon.com/

Follow us and learn more about ApacheCon:
  * Twitter: @ApacheCon
  * Discussion mailing list:
https://lists.apache.org/list.html?apachecon-disc...@apache.org
  * Podcasts and speaker interviews: http://feathercast.apache.org/
  * IRC: #apachecon on https://freenode.net/

We look forward to seeing you in Miami!

-- 
Rich Bowen - VP Conferences, The Apache Software Foundation
http://apachecon.com/
@apachecon


