As someone who recently had to implement a monitor for non-smokeping processes,
might I suggest “monit”? It is a fairly mainstream package that is readily
available in the yum and apt-get repos.

Monit is a locally installed (i.e. per-slave) daemon that can monitor
files (by timestamp or checksum), processes (by PID), programs (by exit code),
and the system (by resource consumption). It has a flexible config language that
can alert/start/stop/exec based on those monitor conditions.

I could see monit being used to watch each slave and alert and/or auto-restart 
the data collection. 
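
For the master-side timestamp check discussed in the quoted thread, here is a
minimal sketch in Python of the same idea that a file-timestamp test applies.
Everything in it is an assumption for illustration: the per-slave directory
layout, the thresholds, and the function names are hypothetical and are not
provided by smokeping or monit. It only decides what to do; the actual restart
and email steps are left out.

```python
import os
import time
from pathlib import Path


def newest_mtime(slave_dir: Path) -> float:
    """Return the most recent mtime of any file under slave_dir (0.0 if none)."""
    latest = 0.0
    for root, _dirs, files in os.walk(slave_dir):
        for name in files:
            try:
                latest = max(latest, os.path.getmtime(os.path.join(root, name)))
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
    return latest


def data_age_seconds(slave_dir, now=None) -> float:
    """Seconds since the slave's data was last modified.

    A directory with no files yields a very large age, so it counts as stale.
    """
    now = time.time() if now is None else now
    return now - newest_mtime(Path(slave_dir))


def classify(age_seconds, restart_attempted,
             notice_after=1800, warning_after=7200) -> str:
    """Map data age and restart history to an action (thresholds are guesses)."""
    if age_seconds < notice_after:
        return "ok"
    if not restart_attempted:
        # First stage: restart the slave's smokeping and send a Notice email.
        return "restart_and_notice"
    if age_seconds >= warning_after:
        # Second stage: restart was already tried and data is still missing.
        return "warning"
    # Restart was tried recently; give it time before escalating.
    return "waiting"
```

A cron job on the master could call `classify(data_age_seconds(path), restart_attempted)`
for each slave directory and keep the "restart attempted" flag in a small state
file between runs.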

—bill



> On Apr 22, 2018, at 11:29 AM, Gregory Sloop <[email protected]> wrote:
> 
> This is an awesome idea - and one I've wished for in the past - but never got 
> around to working on.
> Checking the slave data files' modification times seems plausible as a way to 
> check updates - but you'd have to test to be sure. [IIRC that will work 
> though.]
> 
> Personally, I'd probably try to write it in bash - or something completely 
> external to smokeping. [Bash because of few dependencies - though you'll 
> probably want/need something like sendemail for email notifications...]
> 
> If slaves are behind NAT or something similar, you'll have to have a way to 
> get to the slave for handling a restart, but that's really outside the scope 
> of what you're doing. 
> 
> Honestly, simply getting notification that a slave is not pushing updates 
> would be more than enough - even without the restart.
> 
> Sounds fab to me. And I can't think of a better way, offhand.
> 
> -Greg
> 
> 
> Hello,
> 
> I have a Debian Jessie box with Smokeping 2.6 installed on it.
> 
> It receives data from Slaves over the Internet (10 slaves or so).
> Each Slave roughly monitors xDSL or fiber links.
> 
> Every Monday, I can see that data from one or two slaves is missing.
> Then I remotely restart the smokeping service on the slave whose data is missing.
> 
> I would like to implement something like:
> 
> - if no data at all from Slave for a given period of time, then restart 
> Slave's smokeping service and send a Notice email
> 
> - if no data at all from Slave for a longer period of time and Slave's 
> restart already attempted, then send a Warning email
> 
> As the Slaves' data is stored in a known directory in the Master's filesystem, I 
> think I can detect when data from a slave has not been modified recently by 
> reading the modification times of the directories or files.
> 
> Is there a better way to do so? Alert settings seem more appropriate for when 
> the WAN links, in my case, are slower.
> 
> Best regards
> 
> 
> -- 
> Gregory Sloop, Principal: Sloop Network & Computer Consulting
> Voice: 503.251.0452 x82
> EMail: [email protected]
> http://www.sloop.net
> ---
> _______________________________________________
> smokeping-users mailing list
> [email protected]
> https://lists.oetiker.ch/cgi-bin/listinfo/smokeping-users
