I second the monit suggestion. I have used it for exactly this purpose (watching/restarting slave threads) in the past.
regards,
Darren

On 23 April 2018 at 05:47, Bill Houle <[email protected]> wrote:

> As someone who recently had to implement a monitor of non-smokeping
> processes, might I suggest “monit”? It is a fairly mainstream package that
> is readily available in yum and apt-get repos.
>
> Monit is a locally-installed (i.e. per-slave) daemon process that can
> monitor files (by timestamp or checksum), processes (by PID), programs (by
> exit code), and the system (by resource consumption). It has a flexible
> config language that can alert/start/stop/exec based on those monitor
> conditions.
>
> I could see monit being used to watch each slave and alert and/or
> auto-restart the data collection.
>
> --bill
>
> On Apr 22, 2018, at 11:29 AM, Gregory Sloop <[email protected]> wrote:
>
> This is an awesome idea - and one I've wished for in the past - but never
> got around to working on.
> Checking the slave data files' modification times seems plausible as a way
> to check updates - but you'd have to test to be sure. [IIRC that will work
> though.]
>
> Personally, I'd probably try to write it in bash - or something completely
> external to smokeping. [Bash because of few dependencies - though you'll
> probably want/need something like sendemail for email notifications...]
>
> If slaves are behind NAT or something similar, you'll have to have a way
> to get to the slave for handling a restart, but that's really outside the
> scope of what you're doing.
>
> Honestly, simply getting notification that a slave is not pushing updates
> would be more than enough - even without the restart.
>
> Sounds fab to me. And I can't think of a better way, offhand.
>
> -Greg
>
> Hello,
>
> I have a Debian Jessie box with Smokeping 2.6 installed on it.
>
> It receives data from Slaves over the Internet (10 slaves or so).
> Each Slave roughly monitors xDSL or fiber links.
>
> Every Monday, I can see that data from one or two slaves is missing.
> Then I remotely restart the smokeping service on the slave where data is
> missing.
>
> I would like to implement something like:
>
> - if no data at all from Slave for a given period of time, then restart
>   Slave's smokeping service and send a Notice email
>
> - if no data at all from Slave for a longer period of time and Slave's
>   restart already attempted, then send a Warning email
>
> As the Slaves' data is stored in a known directory in the Master's
> filesystem, I think I can detect when data from a slave has not been
> modified lately by reading the files' modification times.
>
> Is there a better way to do so? Alert settings seem more appropriate
> for when WAN links, in my case, are slower.
>
> Best regards
>
> --
> Gregory Sloop, Principal: Sloop Network & Computer Consulting
> Voice: 503.251.0452 x82
> EMail: [email protected]
> http://www.sloop.net
> ---
>
> _______________________________________________
> smokeping-users mailing list
> [email protected]
> https://lists.oetiker.ch/cgi-bin/listinfo/smokeping-users
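
For anyone who wants to try the modification-time check discussed in the thread, here is a rough Python sketch of the idea, not a tested or official approach. It assumes the master stores each slave's data in files named "<target>~<slave>.rrd" under one data directory; please verify that against your own installation. The function names and the two thresholds are made up for illustration, and the sketch does not track whether a restart was already attempted: it only uses a shorter and a longer staleness threshold.

```python
import time
from pathlib import Path


def newest_mtime(data_dir, slave):
    """Return the most recent mtime among a slave's data files, or None if none exist."""
    # Assumption: slave data files are named "<target>~<slave>.rrd".
    pattern = f"*~{slave}.rrd"
    mtimes = [p.stat().st_mtime for p in Path(data_dir).rglob(pattern)]
    return max(mtimes) if mtimes else None


def classify_slave(data_dir, slave, restart_after, warn_after, now=None):
    """Return "ok", "restart" or "warn" depending on how stale the slave's data is.

    restart_after and warn_after are ages in seconds, with warn_after the longer one.
    """
    now = time.time() if now is None else now
    newest = newest_mtime(data_dir, slave)
    if newest is None:
        # No data files at all: treat the same as very stale data.
        return "warn"
    age = now - newest
    if age >= warn_after:
        return "warn"
    if age >= restart_after:
        return "restart"
    return "ok"
```

A cron job on the master could call classify_slave() for each slave name and then send the Notice or Warning email, or trigger the remote restart, by whatever means suits your setup.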
