Hi,

I haven't used nProbe, but you might also consider tackling this from the
other end.  On some systems, we've had services that for some reason just
stop.  Sometimes it is a temporary problem, with some patch or another
messing things up.  Others last longer...

I've written a few 'nanny' scripts scheduled either in cron or the Windows
Scheduler that check to see if a specific process is running, and try to
restart it if it isn't.  I also have these e-mail me when they've had to
restart a process or when they've failed to restart a process.  Here's one
we had for another service that I modified for nprobe.  You'll need to
change the nprobe arguments for your system and maybe the path too.  Set the
e-mail recipients and SMTP server for your system also.  The Perl
'Proc::ProcessTable' module is not installed by default.  On OS X, just do
"sudo perl -MCPAN -e 'install Proc::ProcessTable'" to install it before
trying the script.

Hope it helps!
Ted

#!/usr/bin/perl
use strict;
use warnings;
use Proc::ProcessTable;
use Net::SMTP;

my $svc='nprobe';
my $dir='/usr/local/bin';
my $pid_dir='/var/run';
my $pid_file="$pid_dir/$svc.pid";
my $args="--daemon-mode --verbose 1 --pid-file $pid_file";
my $start_cmd="$dir/$svc $args";
## Mail notification config
my @recipients=qw( [email protected] );
my $smtp_server='smtp.your-domain.com';

# Number of seconds to wait before checking if
# 'kill' command, if executed, worked.
my $wait_time=3;
# Number of times to try to kill a process before
# giving up.
my $num_kill_tries=5;

########################
## MAIN
########################

my ($res,$whole_msg,$msg,$expected_pid,$actual_pid);

($expected_pid,$msg)=&check_for_pid_file($pid_file);
$whole_msg .= $msg;
($actual_pid,$msg)=&check_for_process();
$whole_msg .= $msg;
## Just quit now if the process is running correctly.
if ($expected_pid && $actual_pid) {
  if ($expected_pid == $actual_pid) {
    $msg = "Expected and actual PIDs match.  Nothing to do.\n";
    $whole_msg .= $msg;
    print $whole_msg;
    exit;
  } else {
    $msg = "Problem: Actual PID doesn't match that in $pid_file.\n";
    $whole_msg .= $msg;
  }
}
## Try to restart the service and send a message.
($res,$msg)=&clean_and_stop_svc($actual_pid,$pid_file);
$whole_msg .= $msg;
($res,$msg)=&start_svc($start_cmd);
$whole_msg .= $msg;
&notify($whole_msg);

########################
## FUNCTIONS
########################
# Checks for $pid_file.  If found, returns PID it contains
# if not, returns undef. Also returns info text as 2nd value.
sub check_for_pid_file {
  my ($f)=@_;
  my ($fh,$msg);
  my $pid=undef;
  if (-f $f) {
    if (open($fh,'<',$f)) {
      while (<$fh>) {
        chomp;
        $pid=$_;
        $msg="PID file \"$f\" exists, read PID $pid.\n";
        last;
      }
      close $fh;
    } else {
      $msg="PID file \"$f\" exists but could not be read.\n";
    }
  } else {
    $msg="PID file \"$f\" does not exist.\n";
  }
  return ($pid,$msg);
}

# Checks for running process. If found, returns process ID.
# If not, returns undef. Also returns info text as 2nd value.
sub check_for_process {
  my ($found_pid,$msg);
  my $pt = Proc::ProcessTable->new;
  my (@fields) = $pt->fields;

  # Find pid of desired service
  foreach my $proc ( @{$pt->table} ) {
    for my $field (@fields) {
      if ($field eq 'fname' && $proc->$field() eq $svc) {
        $found_pid=$proc->pid;
      }
    }
  }
  if (! $found_pid) {
    $msg="Process $svc is not running.\n";
  } else {
    $msg="Process $svc is running with PID $found_pid.\n";
  }
  return ($found_pid,$msg);
}

# Stops running service. Deletes any old .pid files.
# Returns 1 or undef, plus a text message.  1 means success, undef failure.
sub clean_and_stop_svc {
  my ($pid,$pid_file)=@_;
  my $res;
  # Local message buffer, so the file-level $msg is not clobbered.
  my $msg='';
  my $tries=0;
  while (defined($pid)) {
    $res=kill 9, $pid;
    sleep $wait_time;
    $tries += 1;
    ($pid,$msg)=&check_for_process();
    if ($tries >= $num_kill_tries) {
      last;
    }
  }
  if (-f $pid_file) {
    $res=unlink $pid_file;
    if (! $res) {
      $msg="Failed to delete old .pid file \"$pid_file\".\n";
    } else {
      $msg="Deleted old .pid file \"$pid_file\".\n";
    }
  }
  if (! $pid) {
    if ($tries > 0) {
      $msg .= "Successfully stopped service $svc.\n";
      $pid=1;
    }
  } else {
    $msg .= "Failed to stop service $svc after $num_kill_tries attempts.\n";
    $pid=undef;
  }
  return ($pid,$msg);
}

# Starts service.  Returns 1 or undef, plus a text message.
# 1 means success, undef means failure.
sub start_svc {
  my ($cmd)=@_;
  my $res=system($cmd);
  my $msg;
  if ($res) {
    $msg="ERROR trying to start $svc with command \"$start_cmd\".\n";
    # $! holds the OS error when the command could not be launched.
    $msg .= "Return code: $res, error text: $!.\n";
    $res=undef;
  } else {
    $msg="Successfully started $svc.\n";
    $res=1;
  }
  return ($res,$msg);
}

# Sends notification of what this nanny did and when.
sub notify {
  my ($msg)=@_;
  my $now=localtime(time);
  my $me=`basename $0`;
  my $hostname=`hostname`;
  chomp $me;
  chomp $hostname;
  my $smtp;
  my $subject="Notice: $svc on $hostname has been restarted.";
  if ($msg =~ /error/i || $msg =~ /fail/i) {
    $subject="Warning: $svc on $hostname is not running.";
  }
  $msg .= "\n---------------------------------------------------\n";
  $msg .= "This message was sent from $hostname by $me script on $now.\n";

  $smtp = Net::SMTP->new($smtp_server)
    or die "Could not connect to SMTP server $smtp_server.\n";
  $smtp->mail($ENV{USER});
  $smtp->recipient(@recipients);
  $smtp->data();
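  # The sender's local part was garbled in the archive; reusing the
  # envelope sender ($ENV{USER}) here is an assumption.
  $smtp->datasend("From: $ENV{USER}\@$hostname\n");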
  $smtp->datasend("To: $svc.service.managers.\n");
  $smtp->datasend("Subject: $subject\n");
  $smtp->datasend("\n");
  $smtp->datasend($msg);
  $smtp->dataend();
  $smtp->quit;
}
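
If you'd rather not scan the whole process table on every run, a lighter
check is to send signal 0 to the PID read from the .pid file.  This is only
a minimal sketch and is not part of the script above; the helper name
pid_is_alive is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if some process currently has the
# given PID, 0 otherwise.
sub pid_is_alive {
  my ($pid)=@_;
  # Reject missing or non-numeric input (e.g. an empty .pid file).
  return 0 unless defined $pid && $pid =~ /^\d+$/ && $pid > 0;
  # Signal 0 delivers nothing; kill only reports whether the PID
  # could be signalled.
  return 1 if kill 0, $pid;
  # EPERM means the process exists but is owned by another user.
  return $!{EPERM} ? 1 : 0;
}

# Usage: check this script's own PID.
print "PID $$ is ", (pid_is_alive($$) ? "alive" : "gone"), ".\n";
```

Note that this only tells you that *some* process has that PID.  If the
service died and the PID was reused, it will still report success, so the
process-name check in the script above is the safer test.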



On Mon, Dec 21, 2009 at 11:54 PM, Damian Halloran
<[email protected]>wrote:

> Hello all,
>
> I need assistance with monitoring that I am receiving probe data from a
> remote probe.
>
> The set up I have is an IM server receiving probes from a remote site. The
> probe source is at a client's site with a Mac Mini running OS X 10.6 and
> nProbe.
>
> All is working really well and it is fantastic.
>
> There is the occasional problem with nProbe where it will stop running and
> IM will receive no data from the mini. Currently the only way I see that
> this has happened is if I open the Flows windows and see that there is no
> data.
>
> Is there a way to set up a notifier that will let me know when the flows
> data has stopped being received by the IM server?
>
> Thanks
>
> Damian.
> --
> Damian Halloran B Comp CCNA
> Capital IT Pty Ltd
> http://www.capitalit.net.au
> 03 9024 0631
> 0419 308 036
> [email protected]
>
>
>
> ____________________________________________________________________
> List archives:
> http://www.mail-archive.com/intermapper-talk%40list.dartware.com/
> To unsubscribe: send email to: [email protected]
>
>