The issue was indeed a permissions problem.
I’m using Fedora 22 with mesos-0.25.0-0.2.70.centos701406.x86_64.
The following file is installed with odd permissions:
-rwxr-x---. 1 root root 5202 Oct 12 21:08 /bin/mesos-init-wrapper
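To make the effect of that mode string concrete, here is a minimal sketch in Python that pulls out the permission triplet applying to a user who is neither the owner nor in the owning group. It assumes the mesos user is not a member of the root group; the helper name access_for_other is made up for illustration.

```python
import stat


def access_for_other(mode_string):
    """Return the rwx triplet for users who are neither owner nor in the group.

    An ls-style mode string is one file-type character followed by the
    owner, group and other triplets, so "other" sits at positions 7..9.
    Any trailing SELinux/ACL marker ('.' or '+') is ignored by the slice.
    """
    return mode_string[7:10]


# Mode as shipped by the package: nothing at all for "other".
print(access_for_other("-rwxr-x---."))

# Mode after `chmod a+x`: execute, but still no read permission.
print(access_for_other("-rwxr-x--x."))

# Mode 755, as set by the Ansible task: read and execute.
print(access_for_other(stat.filemode(0o100755)))
```

If the wrapper is a shell script (an assumption on my part), execute permission alone is not enough for a non-root user, because the interpreter also has to read the file. That would be consistent with the status=126 exits in the quoted log after the chmod a+x, though I have not confirmed it.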
I was able to use Ansible to fix the permissions across my cluster by changing
the ownership and mode on the following:

- name: Setting ownership for mesos on /bin/mesos-init-wrapper
  file: path=/bin/mesos-init-wrapper owner=root group=mesos mode=755
  notify: restart mesos service
  tags: set_perms

- name: Setting permissions for mesos group on root owned mesos files
  file: path={{ item }} owner=root group=mesos mode=0764
  with_items:
    - /etc/default/mesos
    - /etc/default/mesos-master
    - /etc/default/mesos-slave
  notify: restart mesos service
  tags: set_perms

- name: Setting permissions for mesos user/group on dirs
  file: path={{ item }} owner=mesos group=mesos mode=0765 recurse=yes
  with_items:
    - /var/lib/mesos
    - /var/log/mesos
    - /etc/mesos
    - /tmp/mesos
  notify: restart mesos service
  tags: set_perms

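After the play has run, the result can be spot-checked on a node. The following is a rough sketch in Python, not part of the playbook: the function names are hypothetical, and the wrapper path is the one used in the task above.

```python
import os
import stat


def mode_of(path):
    """Return the permission bits of path as an octal string such as '0755'."""
    return format(stat.S_IMODE(os.stat(path).st_mode), "04o")


def other_can_read_and_exec(path):
    """True if users outside the owner and group may both read and execute path."""
    needed = stat.S_IROTH | stat.S_IXOTH
    return (os.stat(path).st_mode & needed) == needed


if __name__ == "__main__":
    # Path taken from the task above; adjust if the wrapper lives elsewhere.
    wrapper = "/bin/mesos-init-wrapper"
    if os.path.exists(wrapper):
        print(wrapper, mode_of(wrapper), other_can_read_and_exec(wrapper))
```

This only inspects the "other" bits, so it does not account for group membership, ACLs, or SELinux policy, any of which can also affect whether the mesos user can run the wrapper.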
--
Rodrick Brown / DevOps Engineer
+1 917 445 6839 / [email protected]
Orchard Platform
101 5th Avenue, 4th Floor, New York, NY 10003
http://www.orchardplatform.com
Orchard Blog <http://www.orchardplatform.com/blog/> | Marketplace Lending
Meetup <http://www.meetup.com/Peer-to-Peer-Lending-P2P/>

> On Oct 28, 2015, at 5:29 PM, Joris Van Remoortere <[email protected]> wrote:
>
> This may be related to the systemd support we added in 0.25.
> If the agent detects it is running on systemd it will try to launch a systemd
> slice under which to run the executors. If your non-root user does not have
> sufficient permissions to perform these operations that will be a problem.
> Can you share the agent logs to verify this? You should be able to access
> them using journalctl.
>
> Joris
>
> —
> Joris Van Remoortere
> Mesosphere
>
> On Wed, Oct 28, 2015 at 12:33 PM, haosdent <[email protected]> wrote:
> Does the Mesos slave have any logs?
>
> On Wed, Oct 28, 2015 at 11:42 PM, Rodrick Brown <[email protected]> wrote:
> After I upgraded, the first thing I noticed was the permissions on the wrapper
> script:
>
> # ls -al /usr/bin/mesos-init-wrapper
> -rwxr-x---. 1 root root 5202 Oct 12 21:08 /usr/bin/mesos-init-wrapper
>
> So systemd was unable to exec this script.
>
> I changed the permissions on the wrapper:
> # chmod a+x /usr/bin/mesos-init-wrapper
>
> However, I’m still unable to bring up the process via systemd:
>
> Oct 28 15:39:27 prod-mesos-s-1.aws.orchardplatform.com systemd[1]: Started Mesos Slave.
> Oct 28 15:39:27 prod-mesos-s-1.aws.orchardplatform.com systemd[1]: Starting Mesos Slave...
> Oct 28 15:39:27 prod-mesos-s-1.aws.orchardplatform.com systemd[1]: mesos-slave.service: main process exited, code=exited, status=126/n/a
> Oct 28 15:39:27 prod-mesos-s-1.aws.orchardplatform.com systemd[1]: Unit mesos-slave.service entered failed state.
> Oct 28 15:39:27 prod-mesos-s-1.aws.orchardplatform.com systemd[1]: mesos-slave.service failed.
> Oct 28 15:39:47 prod-mesos-s-1.aws.orchardplatform.com systemd[1]: mesos-slave.service holdoff time over, scheduling restart.
> Oct 28 15:39:47 prod-mesos-s-1.aws.orchardplatform.com systemd[1]: Started Mesos Slave.
> Oct 28 15:39:47 prod-mesos-s-1.aws.orchardplatform.com systemd[1]: Starting Mesos Slave...
> Oct 28 15:39:47 prod-mesos-s-1.aws.orchardplatform.com systemd[1]: mesos-slave.service: main process exited, code=exited, status=126/n/a
> Oct 28 15:39:47 prod-mesos-s-1.aws.orchardplatform.com systemd[1]: Unit mesos-slave.service entered failed state.
> Oct 28 15:39:47 prod-mesos-s-1.aws.orchardplatform.com systemd[1]: mesos-slave.service failed.
>
> # cat /usr/lib/systemd/system/mesos-slave.service
> [Unit]
> Description=Mesos Slave
> After=network.target
> Wants=network.target
>
> [Service]
> User=mesos
> ExecStart=/usr/bin/mesos-init-wrapper slave
> KillMode=process
> Restart=always
> RestartSec=20
> LimitNOFILE=16384
> CPUAccounting=true
> MemoryAccounting=true
>
> [Install]
> WantedBy=multi-user.target
>
> The only change I made to the unit file was to add User=mesos; this worked in
> previous versions of Mesos.
>
> If I remove User=mesos and have systemd bring the process up as root, the
> slave joins the cluster and everything works as designed.
> Was something changed in 0.24.1 or 0.25?
>
> Thanks.
>
> NOTICE TO RECIPIENTS: This communication is confidential and intended for the
> use of the addressee only. If you are not an intended recipient of this
> communication, please delete it immediately and notify the sender by return
> email. Unauthorized reading, dissemination, distribution or copying of this
> communication is prohibited. This communication does not constitute an offer
> to sell or a solicitation of an indication of interest to purchase any loan,
> security or any other financial product or instrument, nor is it an offer to
> sell or a solicitation of an indication of interest to purchase any products
> or services to any persons who are prohibited from receiving such information
> under applicable law. The contents of this communication may not be accurate
> or complete and are subject to change without notice. As such, Orchard App,
> Inc. (including its subsidiaries and affiliates, "Orchard") makes no
> representation regarding the accuracy or completeness of the information
> contained herein. The intended recipient is advised to consult its own
> professional advisors, including those specializing in legal, tax and
> accounting matters. Orchard does not provide legal, tax or accounting advice.
>
>
>
> --
> Best Regards,
> Haosdent Huang
>