first, see my attached example of a per-minute cronjob with its config file.
this has worked perfectly for years; with F15 it does NOT work the same way
because a forced restart from the shell would not be recognized, a reboot would
not be recognized, and since "systemctl" gives NO FEEDBACK the mails generated
this way are useless

it does not matter whether YOU like this implementation as long as it
has worked perfectly for years and i am responsible for it

On 01.09.2011 13:53, "Jóhann B. Guðmundsson" wrote:
> On 09/01/2011 11:07 AM, Reindl Harald wrote:
>>
>> On 01.09.2011 12:48, "Jóhann B. Guðmundsson" wrote:
>>> On 09/01/2011 09:42 AM, Reindl Harald wrote:
>>>> yes and that is why "systemd" should generate a notify-mail to root as
>>>> self-written scripts have been doing for years so they could really be
>>>> replaced with systemd - the silent restart is an unfinished thought
>>> Again, with my admin hat on, it should not.
>> it should be able to because you are not the admin of my servers :-)
>
>
> Thankfully I'm not.
>
> I manage up to 100 server instances, which is kinda enough and fills my quota.
>

well mine are 30 (mail, web, dns, epp-interfaces, netatalk, samba,
spamfirewalls) and since i am a developer at the same time you can believe
my quota is more than full for dealing with half-baked changes

>>
>>> It should provide the admin with the means to take action if a failure
>>> occurs, since in large deployments you might want to send different email
>>> depending on which service is failing, like sending hostmas...@example.com
>>> mail if bind goes down, webmas...@example.com if apache goes down etc.
>> in your environment maybe, but it should generally have SIMPLE options
>> for setups where hostmaster, webmaster, postmaster is the same person
>
> Then create a unit that only sends failure notifications to root. Hey, you can
> even call it RestartNotifyMail.service....
>

without any documentation?
where is the documentation on how to do this?
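
for reference, going by the OnFailure= description in man systemd.unit, the
suggestion above would presumably look something like the following. this is
only a sketch under assumptions: the unit name notify-mail@.service, the use
of a "mail" command and the httpd example are hypothetical, nothing here ships
with systemd, and how OnFailure= interacts with Restart= is exactly what is
disputed in this thread.

```ini
# /etc/systemd/system/notify-mail@.service
# hypothetical template unit; %i is the instance name given after the "@"
[Unit]
Description=Send a failure notification mail for %i

[Service]
Type=oneshot
# assumes a "mail" command (e.g. from mailx) is installed and mail to root is delivered
ExecStart=/bin/sh -c 'echo "unit %i entered the failed state" | mail -s "service failure: %i" root'

# ---------------------------------------------------------------------------
# separate fragment: line to add to the [Unit] section of the monitored
# service (httpd.service is used here only as an example)
[Unit]
OnFailure=notify-mail@httpd.service
```

the template is started with the instance name "httpd", so one notification
unit could serve every monitored service.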

>>
>>> and at the same time take other actions, like potentially triggering abrt
>>> to send a bug report, or reverting changes and restarting the daemon etc.,
>>> as opposed to piping a simple mail notification to root, and systemd
>>> provides exactly that to admins via OnFailure=, see man systemd.unit
>>> for details...
>> http://0pointer.de/public/systemd-man/systemd.unit.html does not contain the
>> word "mail"
>> "RestartNotifyMail=root" would be what is needed in SOHO environments!
>>
>> well but a) there are missing options and b) OnFailure is simply a joke
>> if i say "Restart=always" then OnFailure is NOT triggered
>
> Why should it be? Honestly, I would like to hear those arguments, which should
> be rather interesting...

if a server is primarily a WEBSERVER, httpd has to be restarted if it does not run
if somebody does a "killall httpd" it SHOULD be started again
if i want httpd down for whatever reason i stop crond - easy to manage
if it is restarted i want a notify mail

currently a cronjob does this every minute, and it recognizes if reboot/shutdown
or /sbin/service is active so that it does not force the restart

this no longer works with Fedora 15
so i need a replacement
but systemd is only a blind batcher

>> if i say "Restart=always" i want a mail if this happens
>> not an extra unit-file
>
> This is just laughable, no admin would do this, so go ahead and shoot yourself

do not explain my job to me - i know exactly what i do. the only thing i do not
know is how to deal with systemd so that all the perfectly working things keep
working exactly as they have for years

> in the foot, but don't be complaining on this list about the lack of options
> (or some option not working) and whatnot in the process. So anyone who might
> be passing by on the worldwideweb: ignore this and set a more sane Restart
> option for your environment/deployment, as in Restart=on-failure, and create a
> set of good units to use with that in OnFailure=.

IT IS missing

|on-failure| AND |on-abort| is valid and better than always
but the lack of a way to configure both forces me to use "always"
what exactly do you not understand?

> When service(s) fail they fail for a reason and the admin should inspect the
> cause of that...
> (though some apparently would like to be spammed to death when that service
> is stuck in a restart loop)...

you tell me nothing new and that is why i want a simple mail if it happens
and that is why my cronjob is running once per minute and a maximum of
60 mails per hour is NOT spamming to death
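
for illustration, the cron side of such a setup can be sketched as follows
(the file name and the script path are assumptions). cron mails whatever a
job prints to the MAILTO address, and the attached watchdog only prints
something when it acts, so there is at most one mail per run and none at all
while everything is up.

```
# /etc/cron.d/rh_watchdog   (hypothetical file name)
MAILTO=root
# run the watchdog every minute as root; the script path is an assumption
* * * * * root /usr/local/bin/rh_watchdog.php
```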

but this does not change the fact that SOMETIMES services are hanging, like
dbmail-lmtpd did sometimes; when i woke up i noticed this, started the service
and ran a "postqueue -f"

the service had not failed for days, and if i had had my cron-job back then
NOT A SINGLE PROBLEM for customers would have existed

so do not explain to me after ten years what i need for our services
i have all i need for them on F14 and now my problem is how to deal with
systemd since it brings
<?php
 $services[] = array('name'=>'httpd', 'process'=>'httpd', 'listen'=>'0.0.0.0:80');
 $services[] = array('name'=>'httpd-worker', 'process'=>'httpd-worker', 'listen'=>'127.0.0.1:8888');
 $services[] = array('name'=>'smb', 'process'=>'smbd', 'listen'=>'0.0.0.0:445');
 $services[] = array('name'=>'pure-ftpd', 'process'=>'pure-ftpd', 'listen'=>'0.0.0.0:21');
 $services[] = array('name'=>'postfix', 'process'=>'master', 'listen'=>'10.0.0.6:25');
 $services[] = array('name'=>'named', 'process'=>'named', 'listen'=>'10.0.0.6:53');
 $services[] = array('name'=>'php_service', 'process'=>'php /path/script.php');
?>
#!/usr/bin/php
<?php
 /** Include the services to be monitored */
 $services = array();
 require(dirname(__FILE__) . '/' . 'rh_watchdog.conf.php');

 /** Paths to shell commands */
 $check_cmd   = '/bin/netstat -ltnT';
 $restart_cmd = '/sbin/service';
 $killall_cmd = '/usr/bin/killall';

 /** Fetch and cache the list of open TCP addresses/ports */
 $running = rh_list_networkservices($check_cmd);

 /** Walk through the list of monitored services and restart them if needed */
 foreach($services as $service)
 {
  /** Real network services */
  if(isset($service['listen']) && !empty($service['listen']))
  {
   /** If the network port is not open, restart the service */
   if(!in_array($service['listen'], $running))
   {
    /** The first time, wait 5 seconds and check again */
    echo '"' . $service['name'] . '" not running, waiting 5 seconds and restarting if it does not come back' . "\n";
    sleep(5);
    $running = rh_list_networkservices($check_cmd);
    /** Actually restart the service */
    if(!in_array($service['listen'], $running))
    {
     /** Make sure no admin is currently operating on the service */
     if(!watchdog_count_proc('/sbin/service ' . $service['name']))
     {
      /** Not while a shutdown/reboot is in progress */
      if(!watchdog_count_proc('halt') && !watchdog_count_proc('reboot') &&
         !watchdog_count_proc('shutdown') && !watchdog_count_proc('consolehelper'))
      {
       sleep(1);
       echo rh_exec_shellcmd($restart_cmd . ' ' . $service['name'] . ' stop');
       sleep(1);
       rh_exec_shellcmd($killall_cmd . ' ' . $service['process']);
       echo rh_exec_shellcmd($restart_cmd . ' ' . $service['name'] . ' start');
      }
      /** Cron message: watchdog collided with a reboot/shutdown */
      else
      {
       echo 'shutdown/reboot in progress: ' . $service['name'] . ' (watchdog ignored)' . "\n";
      }
     }
     /** Cron message: watchdog collided with an admin command */
     else
     {
      echo '"/sbin/service ' . $service['name'] . '" running (watchdog ignored)' . "\n";
     }
    }
   }
  }
  /** Internal long-running processes without network ports */
  else
  {
   /** Check whether the process is running */
   if(watchdog_count_proc($service['process'], $service['name']) < 1)
   {
    /** The first time, wait 5 seconds and check again */
    echo '"' . $service['name'] . '" not running, waiting 5 seconds and restarting if it does not come back' . "\n";
    sleep(5);
    /** Actually restart the service */
    if(watchdog_count_proc($service['process'], $service['name']) < 1)
    {
     /** Make sure the machine is not currently shutting down or rebooting */
     if(!watchdog_count_proc('halt') && !watchdog_count_proc('reboot') &&
        !watchdog_count_proc('shutdown') && !watchdog_count_proc('consolehelper'))
     {
      sleep(1);
      echo rh_exec_shellcmd($restart_cmd . ' ' . $service['name'] . ' restart', /**$background*/0);
     }
     /** Cron message: watchdog collided with admin access or a reboot/shutdown */
     else
     {
      echo 'admin-access or shutdown/reboot in progress: ' . $service['name'] . ' (watchdog ignored)' . "\n";
     }
    }
   }
  }
 }

 /** Return an array of all listening network services */
 function rh_list_networkservices($check_cmd)
 {
  $listening   = rh_exec_shellcmd($check_cmd);
  $temp        = explode("\n", $listening);
  $running     = array();
  foreach($temp as $line)
  {
   if(stristr($line, 'LISTEN'))
   {
    $line      = preg_replace('/  +|' . "\t|  +" . '/', ' ', $line);
    $zsp       = explode(" ", $line);
    $running[] = $zsp[3];
   }
  }
  return $running;
 }

 /** Run a shell command via passthru() with output buffering and return its output */
 function rh_exec_shellcmd($cmd, $background=0)
 {
  if($background)
  {
   $cmd = $cmd . ' &';
  }
  ob_start();
  passthru($cmd);
  return ob_get_clean();
 }

 /**
  * Count running instances of a system command
  *
  * @param  string  $proc_name     process string to search for
  * @param  string  $service_name  service name, used to keep the watchdog from firing during a manual restart
  * @return integer
 */
 function watchdog_count_proc($proc_name, $service_name='')
 {
  /** CLI only */
  if(php_sapi_name() != 'cli')
  {
   return 0;
  }
  /** Abort if the shell is not '/bin/sh' or '/bin/bash' because the behaviour is unknown */
  settype($_ENV['SHELL'], 'string');
  $shell = strtolower($_ENV['SHELL']);
  if($shell != '/bin/bash' && $shell != '/bin/sh')
  {
   return 0;
  }
  /** Build the ps command, run it and cache the output */
  $out = rh_exec_shellcmd('/bin/ps ax');
  $out = trim(utf8_decode($out));
  /** Split the output into an array and walk through every line */
  $arr = explode("\n", $out);
  $xcount = 0;
  foreach($arr as $line)
  {
   /** If "/sbin/service $service_name" is active, treat it as an admin action and do not intervene */
   if(!empty($service_name))
   {
    if(stripos($line, '/sbin/service ' . $service_name) !== false)
    {
     $xcount++;
    }
   }
   /** Check whether the wanted command occurs in the line, ignoring 'ps' and 'grep';
       strict === so that a match at offset 0 is not mistaken for "not found" */
   if(stripos($line, $proc_name) !== false && stripos($line, 'ps ax') === false
      && stripos($line, 'ps aux') === false && stripos($line, 'grep') === false)
   {
    $xcount++;
   }
  }
  /** Return the count */
  return $xcount;
 }
?>
