Hi,

On Fri, Jan 15, 2016 at 04:54:37PM +0900, yuta takeshita wrote:
> Hi,
>
> Thanks for responding and making a patch.
>
> 2016-01-14 19:16 GMT+09:00 Dejan Muhamedagic <deja...@fastmail.fm>:
> > On Thu, Jan 14, 2016 at 11:04:09AM +0100, Dejan Muhamedagic wrote:
> > > Hi,
> > >
> > > On Thu, Jan 14, 2016 at 04:20:19PM +0900, yuta takeshita wrote:
> > > > Hello.
> > > >
> > > > I have had a problem with the nfsserver RA on RHEL 7.1 and systemd.
> > > > When the nfsd processes are lost through an unexpected failure,
> > > > nfsserver_monitor() doesn't detect it and doesn't trigger a failover.
> > > >
> > > > I use the RA below (but this problem may affect the latest nfsserver
> > > > RA as well):
> > > > https://github.com/ClusterLabs/resource-agents/blob/v3.9.6/heartbeat/nfsserver
> > > >
> > > > The cause is the following:
> > > >
> > > > 1. After running "pkill -9 nfsd", "systemctl status nfs-server.service"
> > > > still returns 0.
> > >
> > > I think that it should be systemctl is-active. We already had a
> > > problem with systemctl status not being what one would assume
> > > status to be. Can you please test that and then open either a
> > > pull request or an issue at
> > > https://github.com/ClusterLabs/resource-agents
> >
> > I already made a pull request:
> >
> > https://github.com/ClusterLabs/resource-agents/pull/741
> >
> > Please test it if you find time.
>
> I tested the code, but the problem remains.
> systemctl is-active returns "active" and the exit code is 0, just like
> systemctl status. Perhaps it is inappropriate to use systemctl for
> monitoring kernel processes.
OK. My patch was too naive and didn't take the systemd/kernel
intricacies into account.

> Kay Sievers, one of the systemd developers, said in the following
> thread that systemd doesn't monitor kernel processes:
> http://comments.gmane.org/gmane.comp.sysutils.systemd.devel/34367

Thanks for the reference. One interesting option could also be reading
/proc/fs/nfsd/threads instead of checking for the existence of the
processes. Furthermore, we could do some RPC-based monitoring, but that
would, I guess, be better suited for another monitor depth.

Cheers,

Dejan

> I have replied to your pull request.
>
> Regards,
> Yuta Takeshita
>
> > Thanks for reporting!
> >
> > Dejan
> >
> > > Thanks,
> > >
> > > Dejan
> > >
> > > > 2. nfsserver_monitor() judges by the return value of "systemctl
> > > > status nfs-server.service".
> > > >
> > > > ----------------------------------------------------------------------
> > > > # ps ax | grep nfsd
> > > > 25193 ?        S<     0:00 [nfsd4]
> > > > 25194 ?        S<     0:00 [nfsd4_callbacks]
> > > > 25197 ?        S      0:00 [nfsd]
> > > > 25198 ?        S      0:00 [nfsd]
> > > > 25199 ?        S      0:00 [nfsd]
> > > > 25200 ?        S      0:00 [nfsd]
> > > > 25201 ?        S      0:00 [nfsd]
> > > > 25202 ?        S      0:00 [nfsd]
> > > > 25203 ?        S      0:00 [nfsd]
> > > > 25204 ?        S      0:00 [nfsd]
> > > > 25238 pts/0    S+     0:00 grep --color=auto nfsd
> > > > #
> > > > # pkill -9 nfsd
> > > > #
> > > > # systemctl status nfs-server.service
> > > > ● nfs-server.service - NFS server and services
> > > >    Loaded: loaded (/etc/systemd/system/nfs-server.service; disabled; vendor preset: disabled)
> > > >    Active: active (exited) since 木 2016-01-14 11:35:39 JST; 1min 3s ago
> > > >   Process: 25184 ExecStart=/usr/sbin/rpc.nfsd $RPCNFSDARGS (code=exited, status=0/SUCCESS)
> > > >   Process: 25182 ExecStartPre=/usr/sbin/exportfs -r (code=exited, status=0/SUCCESS)
> > > >  Main PID: 25184 (code=exited, status=0/SUCCESS)
> > > >    CGroup: /system.slice/nfs-server.service
> > > > (snip)
> > > > #
> > > > # echo $?
> > > > 0
> > > > #
> > > > # ps ax | grep nfsd
> > > > 25256 pts/0    S+     0:00 grep --color=auto nfsd
> > > > ----------------------------------------------------------------------
> > > >
> > > > This is because the nfsd processes are kernel processes, and systemd
> > > > does not monitor the state of running kernel processes.
> > > >
> > > > Is there a good way to handle this?
> > > > (When I use "pidof" instead of "systemctl status", the failover is
> > > > successful.)
> > > >
> > > > Regards,
> > > > Yuta Takeshita
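
P.S. To make the /proc/fs/nfsd/threads idea a bit more concrete, here is a
rough, untested sketch of what such a check might look like in the RA's
monitor path. The helper name is made up for illustration, and it assumes
that the nfsd filesystem is mounted on /proc/fs/nfsd and that the usual OCF
return-code variables from ocf-shellfuncs are available:

    # Return 0 if nfsd kernel threads appear to be running, non-zero otherwise.
    nfsd_threads_running() {
        threads=""
        # /proc/fs/nfsd/threads holds the configured number of nfsd
        # threads; it reads 0 once the threads are gone.
        if [ -r /proc/fs/nfsd/threads ]; then
            read threads < /proc/fs/nfsd/threads
            [ "$threads" -gt 0 ] && return 0
        fi
        # Fallback: look for the kernel threads directly, along the lines
        # of the pidof observation above.
        pidof nfsd >/dev/null 2>&1
    }

    nfsserver_monitor() {
        if nfsd_threads_running; then
            return $OCF_SUCCESS
        fi
        return $OCF_NOT_RUNNING
    }

Unlike systemctl status/is-active, this looks at the kernel threads
themselves, so it should notice when they disappear even though the
nfs-server unit stays "active (exited)". For a deeper monitor level, an
RPC-based check could be something as simple as
"rpcinfo -t localhost nfs 3 >/dev/null 2>&1", though that is a separate
discussion.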

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org