The kernel version is 3.10.0-1160.42.2.el7.x86_64, and the script that stops the problematic service normally takes care of deleting the pidfile. I couldn't check what actually happened on that occasion, though.
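If it happens again, a small check like the one below might help tell a missing pidfile apart from a stale one (pidfile present, but no matching entry under /proc). This is only a sketch under my own assumptions: the function name `check_pidfile` and the `proc_root` parameter are made up for illustration, and it is not how monit itself gathers process data.

```python
import os


def check_pidfile(pidfile, proc_root="/proc"):
    """Return a short status string for the process named in a pidfile.

    Hypothetical helper: proc_root is a parameter only so the check can be
    pointed at a fake directory when testing.
    """
    try:
        with open(pidfile) as fh:
            text = fh.read().strip()
    except FileNotFoundError:
        return "pidfile missing"

    # A usable pidfile holds a single positive integer.
    if not text.isdigit():
        return "pidfile invalid"

    # On Linux every live process has a directory /proc/<pid>.
    if os.path.isdir(os.path.join(proc_root, text)):
        return "running"
    return "stale pidfile"
```

Calling `check_pidfile()` with the path of the application server's pidfile right after boot would show which of the four cases applies. Note that "running" only means some process owns that PID, not necessarily the expected daemon.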

On Mon, 22 Nov 2021 at 13:39, Luca Cazzaniga <fusilla...@gmail.com> wrote:
>
> sorry i miss to say I also double check the load of the server on
> grafana, and it was not on heavy load for what I can see..
> We will evaluate to upgrade monit release if it would be helpful to
> solve the issue
>
> On Mon, 22 Nov 2021 at 13:38, Luca Cazzaniga <fusilla...@gmail.com> wrote:
> >
> > Hi Lutz, the pid was saved in a bin directory on a xfs fs, the pidfile
> > was removed by the script of the application server, or at least I
> > suppose so, in that occasion there's been a service stop via monit and
> > the daemon log reported the service was successfully terminated.
> > After that there's been an upgrade of the os, a restart of the server
> > and at the boot monit should have started the daemon again but it
> > looped with the error "sistd failed to get process data" for one
> > application server. Anyway it did its work for other services
> > configured with pidfile without problem.
> > Really strange..
> > Not sure how procfs is updated/flushed, it's an interface to the
> > kernel structures for process statistics written on a cache.. i wonder
> > if it could be deleted or unavailable for some reason..
> >
> > On Mon, 22 Nov 2021 at 08:21, Lutz Mader <lutz.ma...@freenet.de> wrote:
> > >
> > > Hello Luca Cazzaniga,
> > > thanks for your answer, response. I can not reproduce my problems too,
> > > but the problem will came back from time to time, on some of my systems.
> > >
> > > > In our case the server was just restarted and the problem had persisted
> > > > all over the week end. The annoying part is that no alert was triggered
> > > > also if the process wasn't actually running... I don't think the server
> > > > was overloaded upon restart anyway but I can't reproduce the problem
> > > > neither...
> > > > Maybe an event should be launched in that kinds of situation. Just to
> > > > let the sysop know
> > >
> > > Where are the pid files stored, is a tmpfs used?
> > > Do you remove old pid files?
> > > Your Linux version is?
> > >
> > > This is an old version of Monit, any plan to use a new one?
> > >
> > > >> The monit release on the server is 5.25.3
> > >
> > > With regards,
> > > Lutz
> > >