Is there anything in the NFS server log files? Maybe it squashes root by default, and darkness's membership in the root group (gid 0) falls into that?
Regards,
Bjoern

> On May 12, 2015, at 5:53 AM, John Omernik <j...@omernik.com> wrote:
>
> So I tried su darkness and su - darkness and both allowed a file write with
> no issues. On the group thing, while it is "weird", would it actually hurt
> for it to contain that group? Even if I set the directory to 777 I still get a
> failure on a create within it. I am guessing this has more to do
> with MapR's NFS than Mesos at this point, but if anyone has any other
> tips on troubleshooting to confirm that, I'd appreciate it.
>
> John
>
>> On Mon, May 11, 2015 at 5:18 PM, Marco Massenzio <ma...@mesosphere.io> wrote:
>> Looks to me that while 'uid' is 1000
>> uid=1000(darkness) gid=1000(darkness) groups=1000(darkness),0(root)
>>
>> this is still root's env when run from Mesos (also, weird that groups
>> contains 0(root)):
>> USER=root
>>
>> again - not sure how we su to a different user, but this usually happens if
>> one does `su darkness` (instead of `su - darkness`) from the shell, at any
>> rate.
>>
>> Marco Massenzio
>> Distributed Systems Engineer
>>
>>> On Mon, May 11, 2015 at 6:54 AM, John Omernik <j...@omernik.com> wrote:
>>> Paul: I checked in multiple places and I don't see root_squash being used. I
>>> am using the MapR NFS server, and I do not believe that is a common option
>>> in the default setup (I will follow up more closely on that).
>>>
>>> Adam and Maxime: So I included the output of both id (instead of whoami)
>>> and env (as seen below) and I believe that your ideas may be getting
>>> somewhere. There are a number of things that strike me as odd in the
>>> outputs, and I'd like your thoughts on them. First of all, remember that
>>> the permissions on the folders are 775 right now, so with the primary group
>>> set (which it appears to be, based on id) and the user set, it still should
>>> have write access.
>>> That said, the SUed process doesn't have any of the
>>> other groups (I want to test whether any of those control access,
>>> especially with MapR). At the risk of exposing too much information about my
>>> test network in a public forum, I left all the details in the ENV to see if
>>> there are things others may see that could be causing me issues.
>>>
>>> Thanks for the replies so far!
>>>
>>> New Script:
>>>
>>> #!/bin/bash
>>> echo "Writing id information to stderr for one stop logging" 1>&2
>>> id 1>&2
>>> echo "" 1>&2
>>> echo "Printing out the env command to std err for one stop loggins" 1>&2
>>> env 1>&2
>>> mkdir /mapr/brewpot/mesos/storm/test/test1
>>> touch /mapr/brewpot/mesos/storm/test/test1/testing.go
>>>
>>> Run within Mesos:
>>>
>>> I0511 08:41:02.804448 8048 exec.cpp:132] Version: 0.21.0
>>> I0511 08:41:02.814324 8059 exec.cpp:206] Executor registered on slave 20150505-145508-1644210368-5050-8608-S2
>>> Writing id information to stderr for one stop logging
>>> uid=1000(darkness) gid=1000(darkness) groups=1000(darkness),0(root)
>>>
>>> Printing out the env command to std err for one stop loggins
>>> LIBPROCESS_IP=192.168.0.98
>>> HOST=hadoopmapr3.brewingintel.com
>>> SHELL=/bin/bash
>>> TERM=unknown
>>> PORT_10005=31783
>>> MESOS_DIRECTORY=/tmp/mesos/slaves/20150505-145508-1644210368-5050-8608-S2/frameworks/20150302-094409-1644210368-5050-2134-0003/executors/permtest.5f822976-f7e3-11e4-a22d-56847afe9799/runs/e53dc010-dd3c-4993-8f39-f8b532e5cf8b
>>> PORT0=31783
>>> MESOS_TASK_ID=permtest.5f822976-f7e3-11e4-a22d-56847afe9799
>>> USER=root
>>> LD_LIBRARY_PATH=:/usr/local/lib
>>> SUDO_USER=darkness
>>> MESOS_EXECUTOR_ID=permtest.5f822976-f7e3-11e4-a22d-56847afe9799
>>> SUDO_UID=1000
>>> USERNAME=root
>>> PATH=/home/darkness:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
>>> MAIL=/var/mail/root
>>> PWD=/opt/mapr/mesos/tmp/slave/slaves/20150505-145508-1644210368-5050-8608-S2/frameworks/20150302-094409-1644210368-5050-2134-0003/executors/permtest.5f822976-f7e3-11e4-a22d-56847afe9799/runs/e53dc010-dd3c-4993-8f39-f8b532e5cf8b
>>> MESOS_NATIVE_JAVA_LIBRARY=/usr/local/lib/libmesos-0.21.0.so
>>> MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos-0.21.0.so
>>> LANG=en_US.UTF-8
>>> PORTS=31783
>>> MESOS_SLAVE_PID=slave(1)@192.168.0.98:5051
>>> MESOS_FRAMEWORK_ID=20150302-094409-1644210368-5050-2134-0003
>>> MESOS_CHECKPOINT=1
>>> SUDO_COMMAND=/usr/local/bin/mesos daemon.sh mesos-slave --master=192.168.0.98:5050 --ip=192.168.0.98 --log_dir=/opt/mapr/mesos/tmp/slave_log/ --containerizers=docker,mesos --gc_delay=600mins --disk_watch_interval=60secs
>>> HOME=/home/darkness
>>> SHLVL=2
>>> LIBPROCESS_PORT=0
>>> MARATHON_APP_ID=/permtest
>>> PYTHONPATH=:/usr/local/libexec/mesos/python
>>> MARATHON_APP_VERSION=2015-05-11T13:41:04.218Z
>>> LOGNAME=root
>>> MESOS_SLAVE_ID=20150505-145508-1644210368-5050-8608-S2
>>> PORT=31783
>>> SUDO_GID=1000
>>> MESOS_RECOVERY_TIMEOUT=15mins
>>> _=/usr/bin/env
>>> mkdir: cannot create directory `/mapr/brewpot/mesos/storm/test/test1': Permission denied
>>> touch: cannot touch `/mapr/brewpot/mesos/storm/test/test1/testing.go': No such file or directory
>>>
>>> Run from command line:
>>>
>>> Writing id information to stderr for one stop logging
>>> uid=1000(darkness) gid=1000(darkness) groups=1000(darkness),4(adm),24(cdrom),27(sudo),30(dip),42(shadow),46(plugdev),111(lpadmin),112(sambashare),700(mapr),2000(brewclub),2001(lcusers)
>>>
>>> Printing out the env command to std err for one stop loggins
>>> SHELL=/bin/bash
>>> TERM=xterm-256color
>>> XDG_SESSION_COOKIE=fd12ce903630f14654f11d12000006ce-1431349941.139006-807917506
>>> SSH_CLIENT=192.168.0.186 57204 22
>>> SSH_TTY=/dev/pts/0
>>> USER=darkness
>>> LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lz=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.axa=00;36:*.oga=00;36:*.spx=00;36:*.xspf=00;36:
>>> PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/lib/scala/bin
>>> MAIL=/var/mail/darkness
>>> PWD=/mnt
>>> LANG=en_US.UTF-8
>>> NODE_PATH=/usr/lib/nodejs:/usr/lib/node_modules:/usr/share/javascript
>>> HOME=/home/darkness
>>> SHLVL=2
>>> LOGNAME=darkness
>>> SSH_CONNECTION=192.168.0.186 57204 192.168.0.100 22
>>> LESSOPEN=| /usr/bin/lesspipe %s
>>> LESSCLOSE=/usr/bin/lesspipe %s %s
>>> _=/usr/bin/env
>>>
>>>> On Mon, May 11, 2015 at 1:05 AM, Maxime Brugidou <maxime.brugi...@gmail.com> wrote:
>>>> Mesos does not set the groups of the process correctly. There is a JIRA
>>>> ticket for that. It only sets the gid.
>>>> I believe that this could explain
>>>> the issue if your user needs to be in a specific NFS group to be able to write.
>>>>
>>>> See
>>>> https://issues.apache.org/jira/plugins/servlet/mobile#issue/MESOS-719
>>>>
>>>>> On May 11, 2015 3:51 AM, "Paul Brett" <pbr...@twitter.com> wrote:
>>>>> Can you check on the NFS server to see if the filesystem has been
>>>>> exported with the root_squash option? That's a commonly used option which
>>>>> converts root uid on NFS clients to nobody on the server.
>>>>>
>>>>> -- Paul Brett
>>>>>
>>>>>> On May 10, 2015 5:15 PM, "Adam Bordelon" <a...@mesosphere.io> wrote:
>>>>>> Go ahead and run `env` in your script too, and see if there are any
>>>>>> interesting differences when run via Marathon vs. directly.
>>>>>> Maybe you're running in a different shell?
>>>>>>
>>>>>>> On Sun, May 10, 2015 at 2:21 PM, John Omernik <j...@omernik.com> wrote:
>>>>>>> I believe the slave IS running as root. FWIW when I ran the script from
>>>>>>> above as root, it did work as intended (created the files on the NFS
>>>>>>> share).
>>>>>>>
>>>>>>>> On Sun, May 10, 2015 at 9:08 AM, Dick Davies <d...@hellooperator.net> wrote:
>>>>>>>> Any idea what user mesos is running as? This could just be a
>>>>>>>> filesystem permission thing (ISTR last time I used NFS mounts, they
>>>>>>>> had a 'root squash' option that prevented local root from writing to
>>>>>>>> the NFS mount).
>>>>>>>>
>>>>>>>> On 9 May 2015 at 22:13, John Omernik <j...@omernik.com> wrote:
>>>>>>>> > I am not specifying isolators. The Default? :) Is that a per slave
>>>>>>>> > setting?
>>>>>>>> >
>>>>>>>> > On Sat, May 9, 2015 at 3:33 PM, James DeFelice <james.defel...@gmail.com> wrote:
>>>>>>>> >>
>>>>>>>> >> What isolators are you using?
>>>>>>>> >>
>>>>>>>> >> On Sat, May 9, 2015 at 3:48 PM, John Omernik <j...@omernik.com> wrote:
>>>>>>>> >>>
>>>>>>>> >>> Marco... great idea... thank you.
>>>>>>>> >>> I just tried it and it worked when I
>>>>>>>> >>> had a /mnt/permtesting with the same permissions. So it appears to be
>>>>>>>> >>> something to do with NFS and Mesos (remember, I tested just NFS and that
>>>>>>>> >>> worked fine; it's the combination that is causing this).
>>>>>>>> >>>
>>>>>>>> >>> On Sat, May 9, 2015 at 1:09 PM, Marco Massenzio <ma...@mesosphere.io> wrote:
>>>>>>>> >>>>
>>>>>>>> >>>> Out of my own curiosity (sorry, I have no fresh insights into the issue
>>>>>>>> >>>> here) did you try to run the script and write to a non-NFS mounted
>>>>>>>> >>>> directory? (same ownership/permissions)
>>>>>>>> >>>>
>>>>>>>> >>>> This way we could at least find out whether it's something related to
>>>>>>>> >>>> NFS, or a more general permission-related issue.
>>>>>>>> >>>>
>>>>>>>> >>>> Marco Massenzio
>>>>>>>> >>>> Distributed Systems Engineer
>>>>>>>> >>>>
>>>>>>>> >>>> On Sat, May 9, 2015 at 5:10 AM, John Omernik <j...@omernik.com> wrote:
>>>>>>>> >>>>>
>>>>>>>> >>>>> Here is the testing I am doing. I used a simple script (run.sh). It
>>>>>>>> >>>>> writes the user it is running as to stderr (so it's in the same log as the
>>>>>>>> >>>>> errors from file writing) and then tries to make a directory in NFS, and
>>>>>>>> >>>>> then touch a file in NFS. Note: this script, run directly, works on every
>>>>>>>> >>>>> node. You can see the JSON I used in Marathon, and in the sandbox results
>>>>>>>> >>>>> you can see the user is indeed darkness and the directory cannot be created.
>>>>>>>> >>>>> However, when run directly, the script, with the same user, creates the
>>>>>>>> >>>>> directory with no issue.
>>>>>>>> >>>>> Now, I realize this COULD still be an NFS quirk
>>>>>>>> >>>>> here; however, this testing points at some restriction in how Marathon
>>>>>>>> >>>>> kicks off the cmd. Any thoughts on where to look would be very helpful!
>>>>>>>> >>>>>
>>>>>>>> >>>>> John
>>>>>>>> >>>>>
>>>>>>>> >>>>> Script:
>>>>>>>> >>>>>
>>>>>>>> >>>>> #!/bin/bash
>>>>>>>> >>>>> echo "Writing whoami to stderr for one stop logging" 1>&2
>>>>>>>> >>>>> whoami 1>&2
>>>>>>>> >>>>> mkdir /mapr/brewpot/mesos/storm/test/test1
>>>>>>>> >>>>> touch /mapr/brewpot/mesos/storm/test/test1/testing.go
>>>>>>>> >>>>>
>>>>>>>> >>>>> Run Via Marathon
>>>>>>>> >>>>>
>>>>>>>> >>>>> {
>>>>>>>> >>>>>   "cmd": "/mapr/brewpot/mesos/storm/run.sh",
>>>>>>>> >>>>>   "cpus": 1.0,
>>>>>>>> >>>>>   "mem": 1024,
>>>>>>>> >>>>>   "id": "permtest",
>>>>>>>> >>>>>   "user": "darkness",
>>>>>>>> >>>>>   "instances": 1
>>>>>>>> >>>>> }
>>>>>>>> >>>>>
>>>>>>>> >>>>> I0509 07:02:52.457242 9562 exec.cpp:132] Version: 0.21.0
>>>>>>>> >>>>> I0509 07:02:52.462700 9570 exec.cpp:206] Executor registered on slave 20150505-145508-1644210368-5050-8608-S0
>>>>>>>> >>>>> Writing whoami to stderr for one stop logging
>>>>>>>> >>>>> darkness
>>>>>>>> >>>>> mkdir: cannot create directory `/mapr/brewpot/mesos/storm/test/test1': Permission denied
>>>>>>>> >>>>> touch: cannot touch `/mapr/brewpot/mesos/storm/test/test1/testing.go': No such file or directory
>>>>>>>> >>>>>
>>>>>>>> >>>>> Run Via Shell:
>>>>>>>> >>>>>
>>>>>>>> >>>>> $ /mapr/brewpot/mesos/storm/run.sh
>>>>>>>> >>>>> Writing whoami to stderr for one stop logging
>>>>>>>> >>>>> darkness
>>>>>>>> >>>>> darkness@hadoopmapr1:/mapr/brewpot/mesos/storm$ ls ./test/
>>>>>>>> >>>>> test1
>>>>>>>> >>>>> darkness@hadoopmapr1:/mapr/brewpot/mesos/storm$ ls ./test/test1/
>>>>>>>> >>>>> testing.go
>>>>>>>> >>>>>
>>>>>>>> >>>>> On Sat, May 9, 2015 at 3:14 AM, Adam Bordelon <a...@mesosphere.io> wrote:
>>>>>>>> >>>>>>
>>>>>>>> >>>>>> I don't know of anything inside of Mesos that would prevent you from
>>>>>>>> >>>>>> writing to NFS. Maybe examine the environment variables set when running
>>>>>>>> >>>>>> as that user. Or are you running in a Docker container? Those can have
>>>>>>>> >>>>>> additional restrictions.
>>>>>>>> >>>>>>
>>>>>>>> >>>>>> On Fri, May 8, 2015 at 4:44 PM, John Omernik <j...@omernik.com> wrote:
>>>>>>>> >>>>>>>
>>>>>>>> >>>>>>> I am doing something where people may recommend against my course of
>>>>>>>> >>>>>>> action. However, I am curious whether there is "a way". Basically, I have
>>>>>>>> >>>>>>> a process being kicked off in Marathon that is trying to write to an NFS
>>>>>>>> >>>>>>> location. The permissions of the user running the task and of the NFS
>>>>>>>> >>>>>>> location are good. So what component of Mesos or Marathon is keeping me
>>>>>>>> >>>>>>> from writing here? (I am getting permission denied.) Is this one of those
>>>>>>>> >>>>>>> things that is just not allowed, or is there an option to pass to
>>>>>>>> >>>>>>> Marathon to allow this? Thanks!
>>>>>>>> >>>>>>>
>>>>>>>> >>>>>>> --
>>>>>>>> >>>>>>> Sent from my iThing
>>>>>>>> >>
>>>>>>>> >> --
>>>>>>>> >> James DeFelice
>>>>>>>> >> 585.241.9488 (voice)
>>>>>>>> >> 650.649.6071 (fax)
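
One way to check the supplementary-group theory from inside the task itself is to compare the groups the running process actually holds with the groups the user database lists for the same user. What follows is only a rough sketch, not something taken from Mesos or MapR. It assumes GNU `id`, where `id -G` with no argument reports on the calling process and `id -G USER` looks the user up in the database. `missing_groups` is a hypothetical helper name made up for illustration.

```shell
#!/bin/bash

# Print every gid that appears in the first list but not in the second.
# Both arguments are space-separated gid lists, as printed by `id -G`.
missing_groups() {
    local expected="$1" actual="$2" gid
    for gid in $expected; do
        case " $actual " in
            *" $gid "*) ;;               # the process already holds this group
            *) printf '%s\n' "$gid" ;;   # listed for the user, absent from the process
        esac
    done
}

user="$(id -un)"
process_groups="$(id -G)"             # groups the running process actually holds
database_groups="$(id -G "$user")"    # groups the user database lists for that user

# Same convention as run.sh: everything goes to stderr so it lands in one log.
echo "process groups:  $process_groups" 1>&2
echo "database groups: $database_groups" 1>&2

missing="$(missing_groups "$database_groups" "$process_groups")"
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the gids print on one line.
    echo "gids listed for $user but missing from this process:" $missing 1>&2
fi
```

If this is dropped into run.sh and launched through Marathon, a non-empty "missing" list would suggest the write is being refused because the task lacks a supplementary group, rather than because of anything specific to NFS.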