[ovirt-users] Re: rebooting an ovirt cluster

Strahil Sun, 03 Feb 2019 17:18:10 -0800

> 2. Reboot all nodes. I was testing for power outage response. All nodes come up, but glusterd is not running (it seems to have failed for some reason). I can manually restart glusterd on all nodes and it comes up and starts communicating normally. However, the engine does not come online. So I figure out where it last lived and try to start it manually through the web interface. This fails because vdsm-ovirtmgmt is not up. I figured out that the correct way to start the engine would be through the CLI via hosted-engine --vm-start. This does work, but it takes a very long time, and it usually starts up on any node other than the one I told it to start on.

If you use fstab, prepare for pain... systemd mount units are more effective.

Here is a sample:

[root@ovirt1 ~]# systemctl cat gluster_bricks-engine.mount
# /etc/systemd/system/gluster_bricks-engine.mount
[Unit]
Description=Mount glusterfs brick - ENGINE
Requires = vdo.service
After = vdo.service
Before = glusterd.service
Conflicts = umount.target

[Mount]
What=/dev/mapper/gluster_vg_md0-gluster_lv_engine
Where=/gluster_bricks/engine
Type=xfs
Options=inode64,noatime,nodiratime

[Install]
WantedBy=glusterd.service

[root@ovirt1 ~]# systemctl cat glusterd.service
# /etc/systemd/system/glusterd.service
[Unit]
Description=GlusterFS, a clustered file-system server
Requires=rpcbind.service gluster_bricks-engine.mount gluster_bricks-data.mount
After=network.target rpcbind.service gluster_bricks-engine.mount
Before=network-online.target

[Service]
Type=forking
PIDFile=/var/run/glusterd.pid
LimitNOFILE=65536
Environment="LOG_LEVEL=INFO"
EnvironmentFile=-/etc/sysconfig/glusterd
ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level $LOG_LEVEL
KillMode=process
SuccessExitStatus=15

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/glusterd.service.d/99-cpu.conf
[Service]
CPUAccounting=yes
Slice=glusterfs.slice

Note: Some of the 'After=' and 'Requires=' entries were lost during copy-pasting.
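One detail that is easy to miss in the sample: systemd requires a mount unit's file name to match its Where= path, which is why the unit for /gluster_bricks/engine is called gluster_bricks-engine.mount. The following is only a simplified sketch of that naming rule; mount_unit_name is a hypothetical helper, and it handles plain paths only (no "-" or other special characters, which systemd escapes).

```shell
#!/bin/sh
# Simplified sketch: derive the unit name systemd expects for a mount point.
# Assumption: the path contains only letters, digits and underscores between
# slashes. Paths with "-" or other special characters need the real escaping
# rules, so use systemd-escape for those.
mount_unit_name() {
    path=${1#/}        # drop the leading slash
    path=${path%/}     # drop a trailing slash, if any
    # Remaining slashes become dashes, then the .mount suffix is appended.
    printf '%s.mount\n' "$(printf '%s' "$path" | tr '/' '-')"
}

mount_unit_name /gluster_bricks/engine
```

On a real host, `systemd-escape -p --suffix=mount /gluster_bricks/engine` does the same job with full escaping. After creating the unit file, `systemctl daemon-reload` followed by `systemctl enable gluster_bricks-engine.mount` should make systemd pick it up.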

> So I guess two (or three) questions. What is the expected operation after a full cluster reboot (i.e. in the event of a power failure)? Why doesn't the engine start automatically, and what might be causing glusterd to fail, when it can be restarted manually and works fine?

Expected: everything to be up and running.

Root cause: the system's fstab generator starts after Gluster tries to start the bricks, and of course that fails.

Then everything further down the chain fails.

Just use systemd's mount units (I have also added automount) and you won't have such issues.
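The automount unit mentioned above is not shown in this mail. As a rough sketch only, a minimal one for the engine brick could be generated like this; write_engine_automount is a hypothetical helper, and the unit contents (description, WantedBy= target) are assumptions rather than my actual file. An automount unit must carry the same name as its mount unit, with the .automount suffix.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal automount unit for the engine brick
# into the directory given as $1 (on a real host: /etc/systemd/system).
# The unit contents are an assumed minimal example, not a copy of a real file.
write_engine_automount() {
    cat > "$1/gluster_bricks-engine.automount" <<'EOF'
[Unit]
Description=Automount glusterfs brick - ENGINE

[Automount]
Where=/gluster_bricks/engine

[Install]
WantedBy=multi-user.target
EOF
}

# Dry run into a scratch directory so nothing on the host is touched.
scratch=$(mktemp -d)
write_engine_automount "$scratch"
cat "$scratch/gluster_bricks-engine.automount"
```

The matching gluster_bricks-engine.mount unit still has to exist, since the automount unit only triggers it on first access to the path.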

Best Regards,

Strahil Nikolov

