Monit won't help if the slave went down because someone unplugged it, or some
other disaster befell it. It also won't help if the process is still running,
but not actually pushing data to the master.
However, it does have the benefit of being easy to install and configure, with
no development/debug time required.
For the use cases I've got, I think something running on the master would be
more likely to be helpful more of the time. [I can't recall a single case where
the slave was still up and functional, where Monit would do anything, yet the
smokeping slave process was borked. But that may just be me.]
-Greg
I second the monit suggestion. I have used it for exactly this purpose
(watching/restarting slave threads) in the past.
regards,
Darren
On 23 April 2018 at 05:47, Bill Houle <[email protected]> wrote:
As someone who recently had to implement a monitor of not-smokeping processes,
might I suggest “monit”? It is a fairly mainstream package that is readily
available in yum and apt-get repos.
Monit is a locally-installed (i.e. per slave) daemon process that can monitor
files (by timestamp or checksum), processes (by PID), programs (by exit code),
and system (by resource consumption). It has a flexible config language that
can alert/start/stop/exec based on those monitor conditions.
I could see monit being used to watch each slave and alert and/or auto-restart
the data collection.
—bill
On Apr 22, 2018, at 11:29 AM, Gregory Sloop <[email protected]> wrote:
This is an awesome idea - and one I've wished for in the past - but never got
around to working on.
Checking the slave data files modification times seems plausible as a way to
check updates - but you'd have to test to be sure. [IIRC that will work though.]
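A rough sketch of that modification-time check, run on the master, might look like the following. It is only an illustration: it assumes the slave data lands as files named "<target>~<slave>.rrd" somewhere under the master's data directory, which you should confirm on your own install, and the function names are made up for this example.

```python
import time
from pathlib import Path


def newest_update_per_slave(data_dir):
    """Map each slave name to the newest mtime among its RRD files.

    Assumption: slave files are named "<target>~<slave>.rrd" under data_dir.
    Files without a "~" in the name are ignored.
    """
    newest = {}
    for rrd in Path(data_dir).rglob("*~*.rrd"):
        # "link1~slaveA.rrd" -> stem "link1~slaveA" -> slave "slaveA"
        slave = rrd.stem.rsplit("~", 1)[1]
        mtime = rrd.stat().st_mtime
        newest[slave] = max(newest.get(slave, 0.0), mtime)
    return newest


def stale_slaves(data_dir, max_age_seconds, now=None):
    """Return {slave: age_in_seconds} for slaves with no recent update."""
    now = time.time() if now is None else now
    return {
        slave: now - mtime
        for slave, mtime in newest_update_per_slave(data_dir).items()
        if now - mtime > max_age_seconds
    }
```

Calling stale_slaves() from cron with the master's data directory and a threshold of a few polling intervals would give the list of slaves to complain about. As noted above, you'd still have to test that the mtimes really do change on every slave update.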
Personally, I'd probably try to write it in bash - or something completely
external to smokeping. [Bash because it has few dependencies - though you'll
probably want/need something like sendemail for email notifications...]
If slaves are behind NAT or something similar, you'll have to have a way to get
to the slave for handling a restart, but that's really outside the scope of
what you're doing.
Honestly, simply getting notification that a slave is not pushing updates would
be more than enough - even without the restart.
Sounds fab to me. And I can't think of a better way, off hand.
-Greg
Hello,
I have a Debian Jessie box with Smokeping 2.6 installed on it.
It receives data from Slaves over the Internet (10 slaves or so).
Each Slave roughly monitors xDSL or fiber links.
Every Monday, I can see that data from one or two slaves is missing.
I then remotely restart the smokeping service on the slave whose data is missing.
I would like to implement something like:
- if no data at all from Slave for a given period of time, then restart Slave's
smokeping service and send a Notice email
- if no data at all from Slave for a longer period of time and Slave's restart
already attempted, then send a Warning email
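The two rules above could be sketched as a small decision function that a periodic job calls once per slave. This is only one possible reading of the rules: the SlaveState class, the action names, and both thresholds are hypothetical, and the actual restart and email steps are left out.

```python
from dataclasses import dataclass


@dataclass
class SlaveState:
    """Per-slave memory kept between runs (hypothetical structure)."""
    restart_attempted: bool = False
    warning_sent: bool = False


def decide(age_seconds, state, restart_after, warn_after):
    """Pick the action for one slave, given the age of its newest data.

    Returns "ok", "restart_and_notice", "warning", or "wait".
    """
    if age_seconds < restart_after:
        # Data is flowing again: forget any earlier restart or warning.
        state.restart_attempted = False
        state.warning_sent = False
        return "ok"
    if not state.restart_attempted:
        # First rule: restart the slave's service and send a Notice.
        state.restart_attempted = True
        return "restart_and_notice"
    if age_seconds >= warn_after and not state.warning_sent:
        # Second rule: restart already tried and still no data.
        state.warning_sent = True
        return "warning"
    return "wait"
```

Remembering whether a warning has gone out keeps the job from sending the same email on every run. The state would need to be saved somewhere (a small file per slave, for instance) if the check runs from cron.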
As the Slaves' data is stored in a known directory in the Master's filesystem, I
think I can detect when data from a slave has not been modified recently by
reading the modification times of the files in those directories.
Is there a better way to do this? In my case, the Alerts settings seem more
appropriate for detecting when WAN links are slower.
Best regards
--
Gregory Sloop, Principal: Sloop Network & Computer Consulting
Voice: 503.251.0452 x82
EMail: [email protected]
http://www.sloop.net
---
_______________________________________________
smokeping-users mailing list
[email protected]
https://lists.oetiker.ch/cgi-bin/listinfo/smokeping-users