Hi,

we recently ran into an annoying problem with processes that, in some
circumstances, get stuck and never return. The fault clearly lies with
the processes themselves, but one can never be sure that a process will
return nicely... The consequence on SEC's side is that such child
processes remain attached to the SEC process, cluttering its %children
hash table and adding to the work done by the check_children sub.

One solution would be to allow a timeout option on the shellcmd action:
on expiration, a SIGTERM (or SIGKILL, I'm still not sure which) would be
sent to the process that was launched.
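
To make the timeout idea concrete, here is a minimal sketch in Python
(not SEC's Perl, and not the actual patch). It assumes a POSIX system;
the function name run_with_timeout and the grace parameter are
hypothetical. One way around the SIGTERM-versus-SIGKILL question is to
send SIGTERM first and fall back to SIGKILL only if the command is still
alive after a short grace period:

```python
import os
import signal
import subprocess


def run_with_timeout(command, timeout, grace=2.0):
    """Run a shell command and return its exit status.

    If the command outlives `timeout` seconds, its process group gets
    SIGTERM, then SIGKILL after `grace` more seconds. A negative return
    value is the number of the signal that ended the command.
    (Hypothetical helper for illustration, not part of SEC.)
    """
    # start_new_session=True puts the shell in its own process group,
    # so the signal also reaches anything the command has forked.
    proc = subprocess.Popen(command, shell=True, start_new_session=True)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        pass

    _signal_group(proc.pid, signal.SIGTERM)
    try:
        return proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        # The command ignored SIGTERM; SIGKILL cannot be caught or ignored.
        _signal_group(proc.pid, signal.SIGKILL)
        return proc.wait()


def _signal_group(pgid, signum):
    """Signal a process group, tolerating a group that is already gone."""
    try:
        os.killpg(pgid, signum)
    except ProcessLookupError:
        pass
```

Calling run_with_timeout("sleep 30", 5) would return after roughly five
seconds instead of thirty. Because the child is always waited for, it
never lingers as an unreaped entry in the parent's bookkeeping, which is
the role the %children table plays in SEC.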

I could not find any message about a similar issue in the mailing list
archives, and as we really needed this feature I implemented it as a new
action (it could not bear the same name, in order to retain backward
compatibility), which takes two parameters: a timeout in seconds and the
command to launch.

I'm not quite sure this mailing list is the right place for proposing
patches. If not, could someone point me to the right place for that?

Regards,
Matthieu.
-- 
Open Software R&D / Cluster Management
BARD / Bruyères-Le-Châtel
Bull, Architect of an Open World TM (www.bull.com)
phone +33 (0)1 69 26 62 51
matthieu.pero...@bull.net

_______________________________________________
Simple-evcorr-users mailing list
Simple-evcorr-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/simple-evcorr-users