The /proc messages are not related to the stop issue ... the problem most 
probably is that monit isn't able to check the mount table entry for the CEPH 
filesystem, as it doesn't support that mount source string style. Monit supports 
CIFS and NFS, which use similar (but simpler) strings - a source code 
modification would be needed to support CEPH.
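
For illustration, this is roughly what the kernel mount table entry looks like 
(taken from the mount output further down in this thread) - the source field is 
a comma-separated monitor list, rather than the single "host:/path" (NFS) or 
"//host/share" (CIFS) style source:

    # inspect the entry monit would have to parse
    grep ' ceph ' /proc/self/mounts
    # => 192.168.100.104:6789,192.168.100.105:6789,192.168.100.106:6789:/ /mnt/vdicube_ceph_fs ceph rw,relatime,name=admin,... 0 0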

Would it be possible to get access to some test system with a CEPH filesystem? 
We can implement CEPH support, but we have no experience with CEPH so far, so 
creating a CEPH cluster ourselves would take some time.
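
In the meantime, a possible (untested) workaround for the "filesystem still 
reports OK while the cluster is down" part could be a check program that stats 
the mountpoint with a timeout, along the lines of Paul's suggestion below. The 
script name and timeouts are only examples, and a hard-hung mount may still 
block in an unkillable state:

    check program check_cephfs_alive with path /usr/local/bin/check_cephfs_alive.sh with timeout 15 seconds
        if status != 0 then alert

    check_cephfs_alive.sh:

    #!/bin/bash
    # stat the mountpoint, but give up after 5 seconds so a dead ceph
    # mount cannot block the check; the exit status is the check result
    timeout 5 stat /mnt/vdicube_ceph_fs >/dev/null 2>&1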

Best regards,
Martin


> On 17 Mar 2019, at 11:28, Oscar Segarra <oscar.sega...@gmail.com> wrote:
> 
> Hi, 
> 
> Any clue about how to fix this issue?
> 
> Thanks a lot
> 
> On Fri, 8 Mar 2019 at 22:08, Oscar Segarra <oscar.sega...@gmail.com> wrote:
> Hi, 
> 
> I have tried executing the monit process as 
> 
> monit -vvI 
> 
> And I get the following messages:
> 
> Cannot read proc file '/proc/33989/attr/current' -- Invalid argument
> Cannot read proc file '/proc/41/attr/current' -- Invalid argument
> Cannot read proc file '/proc/42/attr/current' -- Invalid argument
> Cannot read proc file '/proc/43/attr/current' -- Invalid argument
> Cannot read proc file '/proc/44/attr/current' -- Invalid argument
> Cannot read proc file '/proc/45/attr/current' -- Invalid argument
> Cannot read proc file '/proc/47/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5029/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5032/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5034/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5036/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5037/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5038/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5039/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5041/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5094/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5104/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5105/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5180/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5181/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5185/attr/current' -- Invalid argument
> Cannot read proc file '/proc/5192/attr/current' -- Invalid argument
> Cannot read proc file '/proc/6/attr/current' -- Invalid argument
> Cannot read proc file '/proc/60/attr/current' -- Invalid argument
> Cannot read proc file '/proc/6032/attr/current' -- Invalid argument
> Cannot read proc file '/proc/6035/attr/current' -- Invalid argument
> Cannot read proc file '/proc/6043/attr/current' -- Invalid argument
> Cannot read proc file '/proc/6046/attr/current' -- Invalid argument
> Cannot read proc file '/proc/6057/attr/current' -- Invalid argument
> Cannot read proc file '/proc/6059/attr/current' -- Invalid argument
> Cannot read proc file '/proc/6069/attr/current' -- Invalid argument
> Cannot read proc file '/proc/6082/attr/current' -- Invalid argument
> Cannot read proc file '/proc/6931/attr/current' -- Invalid argument
> Cannot read proc file '/proc/7/attr/current' -- Invalid argument
> Cannot read proc file '/proc/7113/attr/current' -- Invalid argument
> Cannot read proc file '/proc/7114/attr/current' -- Invalid argument
> Cannot read proc file '/proc/7116/attr/current' -- Invalid argument
> Cannot read proc file '/proc/7118/attr/current' -- Invalid argument
> Cannot read proc file '/proc/7127/attr/current' -- Invalid argument
> Cannot read proc file '/proc/7196/attr/current' -- Invalid argument
> Cannot read proc file '/proc/7441/attr/current' -- Invalid argument
> Cannot read proc file '/proc/7450/attr/current' -- Invalid argument
> Cannot read proc file '/proc/7451/attr/current' -- Invalid argument
> Cannot read proc file '/proc/8/attr/current' -- Invalid argument
> Cannot read proc file '/proc/8450/attr/current' -- Invalid argument
> Cannot read proc file '/proc/8776/attr/current' -- Invalid argument
> Cannot read proc file '/proc/9/attr/current' -- Invalid argument
> Cannot read proc file '/proc/91/attr/current' -- Invalid argument
> 'check_cephfs' stop on user request
> Monit daemon with PID 33978 awakened
> 
> And of course, monit gets stuck in stop pending:
> 
> Filesystem 'check_cephfs'
>   status                       OK - stop pending
>   monitoring status            Monitored
>   monitoring mode              active
>   on reboot                    start
>   filesystem type              ceph
>   filesystem flags             
> rw,relatime,name=admin,secret=<hidden>,acl,wsize=16777216
>   permission                   755
>   uid                          27
>   gid                          27
>   block size                   4 MB
>   space total                  224 MB (of which 0.0% is reserved for root 
> user)
>   space free for non superuser 60 MB [26.8%]
>   space free total             60 MB [26.8%]
>   inodes total                 165
>   inodes free                  -1 [-0.6%]
>   data collected               Fri, 08 Mar 2019 21:59:26
> 
> System 'vdicnode04'
>   status                       OK
>   monitoring status            Monitored
>   monitoring mode              active
>   on reboot                    start
>   load average                 [0.21] [0.25] [0.58]
>   cpu                          0.5%us 0.7%sy 0.0%wa
>   memory usage                 706.1 MB [38.8%]
>   swap usage                   264 kB [0.0%]
>   uptime                       27m
>   boot time                    Fri, 08 Mar 2019 21:39:18
>   data collected               Fri, 08 Mar 2019 21:59:26
> 
> I don't know whether those "Cannot read proc file" messages could be the cause 
> of the eternal "stop pending".
> 
> Thanks a lot in advance to everybody,
> Óscar 
> 
> On Fri, 8 Mar 2019 at 12:43, Oscar Segarra <oscar.sega...@gmail.com> wrote:
> Hi Paul,
> 
> The problem is not starting or stopping the ceph server modules. My problem is 
> on the client side, where I want to be able to power off my client machine even 
> when the cephfs servers are not available.
> 
> Thanks a lot 
> 
> On Fri, 8 Mar 2019 at 1:49, Paul Theodoropoulos <p...@anastrophe.com> wrote:
> I've zero experience with ceph, however - 
> 
> What about just incorporating ceph's status-checking facilities as the 
> trigger, instead of watching the mount? For example:
> 
> monit monitor:
> 
> check program ceph-status with path /usr/local/bin/ceph-status.sh
> start program = "/bin/systemctl start ceph.target"
> stop  program = "/bin/systemctl stop ceph\*.service ceph\*.target"
> if status != 0 then restart
> 
> ceph-status.sh:
> 
> #!/bin/bash
> ceph status >/dev/null 2>&1
> 
> As I said, no experience with ceph, just had a quick look at some of the 
> documentation - I could be completely wrong about the feasibility of this...
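> 
> One caveat worth checking (again, an assumption rather than tested fact): 
> "ceph status" itself may hang if the monitors are unreachable, so it could be 
> worth wrapping the call in a timeout inside the script, e.g.:
> 
> #!/bin/bash
> # fail the check quickly if the cluster does not answer within 10 seconds
> timeout 10 ceph status >/dev/null 2>&1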
> 
> 
> On 3/7/19 15:46, Oscar Segarra wrote:
>> Hi Martin, 
>> 
>> Thanks a lot for your quick response.
>> 
>> I have done some tests, but it looks like your approach does not work at 
>> all:
>> 
>> This is my simple configuration:
>> 
>> cat << EOT > /etc/monit.d/monit_vdicube
>> check filesystem check_cephfs with path /mnt/vdicube_ceph_fs
>>     start program  = "/bin/mount -t ceph -o name=admin -o secret=AQDenzBcEyQ8BBAABQjoGn3DTnKN2v5hZm7gMw== 192.168.100.104:6789,192.168.100.105:6789,192.168.100.106:6789:/ /mnt/vdicube_ceph_fs"
>>     stop program   = "/bin/umount -f -l /mnt/vdicube_ceph_fs"
>>     IF CHANGED FSFLAGS THEN start
>> EOT
>> 
>> In this case, when the ceph monitor servers (192.168.100.104:6789, 
>> 192.168.100.105:6789, 192.168.100.106:6789) are available, everything works 
>> fine. Start, stop and restart work great.
>> 
>> Nevertheless, if I lose connectivity with the ceph servers (I stop them 
>> manually), the monit service doesn't notice and keeps showing "OK" even 
>> though, of course, none of the data can be accessed. This may be expected, 
>> because the mount entry is still there:
>> 
>> [root@vdicnode04 mnt]# mount | grep ceph
>> 192.168.100.104:6789,192.168.100.105:6789,192.168.100.106:6789:/ on 
>> /mnt/vdicube_ceph_fs type ceph 
>> (rw,relatime,name=admin,secret=<hidden>,acl,wsize=16777216)
>> 
>> In this scenario, if I execute the stop command manually as root from the 
>> command line:
>> 
>> /bin/umount -f -l /mnt/vdicube_ceph_fs 
>> 
>> It unmounts the FS immediately. However, if I stop it using the monit CLI: 
>> 
>> [root@vdicnode04 /]# monit stop check_cephfs
>> [root@vdicnode04 /]# monit status
>> Monit 5.25.1 uptime: 4m
>> 
>> Filesystem 'check_cephfs'
>>   status                       OK - stop pending
>>   monitoring status            Monitored
>>   monitoring mode              active
>>   on reboot                    start
>>   filesystem type              ceph
>>   filesystem flags             
>> rw,relatime,name=admin,secret=<hidden>,acl,wsize=16777216
>>   permission                   755
>>   uid                          27
>>   gid                          27
>>   block size                   4 MB
>>   space total                  228 MB (of which 0.0% is reserved for root 
>> user)
>>   space free for non superuser 64 MB [28.1%]
>>   space free total             64 MB [28.1%]
>>   inodes total                 165
>>   inodes free                  -1 [-0.6%]
>>   data collected               Fri, 08 Mar 2019 00:35:28
>> 
>> System 'vdicnode04'
>>   status                       OK
>>   monitoring status            Monitored
>>   monitoring mode              active
>>   on reboot                    start
>>   load average                 [0.02] [0.20] [0.21]
>>   cpu                          1.2%us 1.0%sy 0.0%wa
>>   memory usage                 514.8 MB [28.3%]
>>   swap usage                   0 B [0.0%]
>>   uptime                       59m
>>   boot time                    Thu, 07 Mar 2019 23:40:21
>>   data collected               Fri, 08 Mar 2019 00:35:28
>> 
>> [root@vdicnode04 /]#
>> 
>> It gets stuck in the "stop pending" status.
>> 
>> In the logs I can see the following:
>> 
>> [CET Mar  8 00:39:55] info     : 'check_cephfs' stop on user request
>> [CET Mar  8 00:39:55] info     : Monit daemon with PID 121791 awakened
>> 
>> Of course, the mount is still there until I manually run the umount command:
>> 
>> [root@vdicnode04 /]# mount | grep ceph
>> 192.168.100.104:6789,192.168.100.105:6789,192.168.100.106:6789:/ on 
>> /mnt/vdicube_ceph_fs type ceph 
>> (rw,relatime,name=admin,secret=<hidden>,acl,wsize=16777216)
>> [root@vdicnode04 /]# umount -f -l /mnt/vdicube_ceph_fs
>> [root@vdicnode04 /]# mount | grep ceph
>> [root@vdicnode04 /]#
>> 
>> Even in this situation, the monit status is still "stop pending":
>> 
>> [root@vdicnode04 /]# monit status
>> Monit 5.25.1 uptime: 4m
>> 
>> Filesystem 'check_cephfs'
>>   status                       OK - stop pending
>>   monitoring status            Monitored
>> 
>> Any help will be welcome!
>> 
>> Óscar.
>> 
>> 
>> On Thu, 7 Mar 2019 at 22:06, mart...@tildeslash.com <mart...@tildeslash.com> wrote:
>> Hi,
>> 
>> we haven't tested with ceph; you can try a generic configuration, for example:
>> 
>>         check filesystem myfs with path /mydata
>>                 start program = ...    #note: set the start command (mount)
>>                 stop program = ...      #note: set the stop command (umount)
>> 
>> It is possible that monit won't be able to collect I/O statistics ... in 
>> that case we can implement support for ceph.
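>> 
>> For the "mount it only when the ceph server's TCP port answers" part of your 
>> question, one (untested) option could be a separate host check, something like 
>> the following (the address and port here are only placeholders):
>> 
>>         check host cephserver with address 192.168.1.10   #note: set a ceph server address
>>                 if failed port 6789 then alert            #note: set the port to test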
>> 
>> Best regards,
>> Martin
>> 
>> 
>> > On 7 Mar 2019, at 15:55, Oscar Segarra <oscar.sega...@gmail.com> wrote:
>> > 
>> > Hi,
>> > 
>> > I'd like to mount a cephfs filesystem when it is available (just by checking 
>> > the ceph metadata server TCP port).
>> > 
>> > And, on powering off the server, I'd like to force-umount the previously 
>> > mounted cephfs volume if it is still mounted. This is because if the ceph 
>> > metadata server is not available, the server loops infinitely trying to 
>> > umount the cephfs mount point.
>> > 
>> > Can these two use cases be implemented with monit? 
>> > 
>> > Thanks a lot in advance 
>> > Óscar 
>> 
>> 
> 
> -- 
> Paul Theodoropoulos
> www.anastrophe.com

-- 
To unsubscribe:
https://lists.nongnu.org/mailman/listinfo/monit-general
