Re: [Simple-evcorr-users] shellcmd with timeout
On 05/23/2011 04:51 PM, Matthieu Pérotin wrote:

Hi Risto, the solutions you list are fine with us. The only objections I may have are:
- the shell-only solution induces an additional fork, which is not necessary with a patch. It may be problematic on heavily loaded systems;
- we are losing the return value of the 'true' command: the one that will get caught by waitpid will be the one of kill. The perl solution did not have this issue, but it made it even more difficult to send anything other than a SIGALRM;
- it renders the rules more complicated and less readable, as it introduces some advanced job control features inside the execution instruction.

...since you mentioned that you would like to have the opportunity to use not only ALRM or TERM, perhaps a flexible solution is to set up the following subroutine at SEC startup:

  type=single
  ptype=regexp
  pattern=(SEC_STARTUP|SEC_RESTART)
  context=SEC_INTERNAL_EVENT
  desc=compile child fork
  action=eval %child_with_timeout ( sub { if (scalar(@_) < 3) { return -1; } \
    my($int) = shift @_; my($sig) = shift @_; my($pid) = fork(); \
    if (!defined($pid)) { return -1; } elsif ($pid > 0) { return 0; } \
    $pid = fork(); if (!defined($pid)) { exit(1); } \
    if ($pid == 0) { exec(@_); } else { \
    $SIG{ALRM} = sub { kill $sig, $pid; exit(0); }; \
    alarm($int); while (wait() != -1) {}; exit(0); } } )

This function runs the custom program through a double fork. The intermediate process is necessary for controlling your program and terminating it. Note that Perl's fork() returns undef on failure, which is why the result is checked with defined(). The function takes three parameters -- the timeout value ($int), the signal number ($sig), and the command line.
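A small aside on the signal-number parameter: only a handful of numbers (such as 9, 14 and 15) are the same practically everywhere, and the rest can differ between platforms. The sketch below, assuming a POSIX shell, shows one way to check what a number means on the local system before hard-coding it in a rule. The helper name signal_name is hypothetical and not part of SEC.

```shell
#!/bin/sh
# signal_name NUMBER
# Print the name (without the SIG prefix) that the local system
# assigns to the given signal number. "signal_name" is a hypothetical
# helper for illustration; it only wraps the standard "kill -l".
signal_name() {
    kill -l "$1"
}
```

For example, `signal_name 9` prints KILL. Checking the number this way is cheap and avoids sending an unintended signal on a platform with a different numbering.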
Once you have compiled this compact function, you can use it all over your ruleset in this simple way:

  type=single
  ptype=substr
  pattern=test3
  desc=if script has run for 10 seconds, simulate Floating Point exception
  action=call %o %child_with_timeout 10 8 /home/risto/SEC-misc/test.sh

  type=single
  ptype=substr
  pattern=test4
  desc=if script has run for 10 seconds, end it with SIGKILL
  action=call %o %child_with_timeout 10 9 /home/risto/SEC-misc/test.sh

Just another way of tackling the problem (it would be quite tricky to augment 'spawn', 'shellcmd', 'pipe' and SingleWithScript with all this functionality).

kind regards,
risto

___
Simple-evcorr-users mailing list
Simple-evcorr-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/simple-evcorr-users
[Simple-evcorr-users] shellcmd with timeout
Hi,

we recently experienced an annoying problem with processes that, in some circumstances, would get stuck and never return. The fault here is clearly on the processes' side, but one can never be sure that a process will return nicely...

The consequence on SEC's side is that child processes remain attached to the SEC process, cluttering its %children hash table and adding to the complexity of the check_children sub.

A solution to the problem would be to have the possibility to give a timeout option to the shellcmd action: on expiration a SIGTERM (or SIGKILL, I'm still not sure) would be issued to the process that was launched.

I could not find in the mailing list archives any message about a similar issue, and as we really needed this feature I implemented it as a new action (to retain backward compatibility, it could not bear the same name), which takes two parameters: a timeout in seconds and the command to launch.

I'm not quite sure this mailing list is the right place for proposing patches. If not, could someone give me the right place for that?

Regards,
Matthieu.

--
Open Software RD / Cluster Management
BARD / Bruyères-Le-Châtel
Bull, Architect of an Open World TM (www.bull.com)
phone +33 (0)1 69 26 62 51
matthieu.pero...@bull.net
Re: [Simple-evcorr-users] shellcmd with timeout
On 05/20/2011 02:56 PM, Matthieu Pérotin wrote:

Hi, we recently experienced an annoying problem with processes that, in some circumstances, would get stuck and never return. The fault here is clearly on the processes' side, but one can never be sure that a process will return nicely... The consequence on SEC's side is that child processes remain attached to the SEC process, cluttering its %children hash table and adding to the complexity of the check_children sub. A solution to the problem would be to have the possibility to give a timeout option to the shellcmd action: on expiration a SIGTERM (or SIGKILL, I'm still not sure) would be issued to the process that was launched. I could not find in the mailing list archives any message about a similar issue, and as we really needed this feature I implemented it as a new action (to retain backward compatibility, it could not bear the same name), which takes two parameters: a timeout in seconds and the command to launch. I'm not quite sure this mailing list is the right place for proposing patches. If not, could someone give me the right place for that? Regards, Matthieu.

hi Matthieu,

indeed, the mailing list is the proper place for proposing patches. However, in this case it looks to me that the issue can quite easily be tackled with the means provided by the standard UNIX shell, for example:

  action=shellcmd (/bin/yourprog & PROID=$! ; sleep 10; kill -9 $PROID)

Since the shellcmd action allows for shell interpretation of the command line (provided that shell metacharacters are present), this action will run /bin/yourprog in the background and assign its PID to the variable PROID. Then, the shell that started /bin/yourprog will sleep for 10 seconds and then kill /bin/yourprog (provided the process is still running).
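The one-liner above keeps the shell around for the full 10 seconds even when the program finishes early, and the status that is reported is that of kill rather than of the program. A possible variant is sketched below, assuming a POSIX shell. The function name run_with_timeout and its variable names are hypothetical and not part of SEC, and the sketch sends TERM rather than KILL.

```shell
#!/bin/sh
# run_with_timeout SECONDS COMMAND [ARGS...]
# Hypothetical helper (not part of SEC): run COMMAND in the background,
# send it SIGTERM if it is still running after SECONDS, and return the
# exit status of COMMAND itself.
run_with_timeout() {
    timeout_secs=$1
    shift

    # Start the real command in the background and remember its PID.
    "$@" &
    cmd_pid=$!

    # Watchdog: sleeps, then signals the command. Its output is sent to
    # /dev/null so that a leftover sleep does not hold the caller's
    # stdout or stderr open.
    ( sleep "$timeout_secs"; kill -TERM "$cmd_pid" ) >/dev/null 2>&1 &
    watchdog_pid=$!

    # Wait for the command; this yields its exit status, or a value
    # above 128 if it was terminated by the signal.
    wait "$cmd_pid"
    status=$?

    # The command is gone, so the watchdog is no longer needed.
    kill "$watchdog_pid" 2>/dev/null || true
    wait "$watchdog_pid" 2>/dev/null || true

    return "$status"
}
```

Calling `run_with_timeout 10 /bin/yourprog` returns as soon as the program exits. The price is one extra process for the watchdog, and the sleep started by the watchdog may linger until its timer runs out.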
There are a number of other ways of tackling the issue, like the employment of Perl:

  action=shellcmd ( perl -e 'alarm(10); exec("/bin/yourprog")' )

In this case, since the command line does not contain shell metacharacters, an interpreting shell is not started, but SEC rather runs perl directly. In the newly started process, we invoke the alarm(2) system call for delivering the ALRM signal to the process itself after 10 seconds. Then we simply run /bin/yourprog within the current process, and since the alarm timer is inherited by /bin/yourprog, the process will get the signal after 10 seconds and terminate (provided /bin/yourprog does not set a handler for ALRM).

In the past, users have taken advantage of similar shell/Perl features for advanced job control, e.g., see http://simple-evcorr.sourceforge.net/FAQ.html#21. So instead of patching SEC, I'd take advantage of the features of shell or Perl, since they are simply so much more advanced.

hope this helps,
risto