I appreciate this patch, Eygene. Thanks.

Eygene Ryabinkin wrote:
> Good day.
>
> I ran into the following bug with CFEngine 2.x: when cfrun is used to
> start cfagents on the servers and it spawns many children (maxchild is
> set), a single cfrun instance that was forked to hail a specific server
> and then runs into a timeout will kill the whole process group. So the
> entire cfrun horde, including the master process, is killed.
>
> I hit this issue when some of the nodes I maintain with CFEngine had
> failed, so the connection was never going to be established and cfrun
> would wait for it (virtually) indefinitely. The problem is that
> ALARM_PID, being an uninitialized static variable, is set to 0. So,
> when SIGALRM is delivered to one of the children, it issues
> 'kill(0, SIGTERM)', and this means "kill the entire process group".
>
> The attached patch fixes the situation. It works on both Linux- and
> FreeBSD-powered CFEngine servers running 2.2.9. The patch tries to
> initialize ALARM_PID to -1 in all relevant places, not only in globals.c.
>
> A second small patch that fixes a typo is attached too.
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Bug-cfengine mailing list
> [email protected]
> https://cfengine.org/mailman/listinfo/bug-cfengine
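
The failure mode described in the report can be illustrated with a minimal sketch in C. This is not the attached patch and not CFEngine's actual code: the ALARM_PID variable here is a stand-in for the global of the same name, and signal_alarm_target() is a hypothetical helper introduced only for illustration. It assumes the fix amounts to treating -1 as "no target" and refusing to call kill() with a non-positive pid, since POSIX defines kill(0, sig) as signalling every process in the caller's process group.

```c
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Stand-in for the CFEngine global (assumption): -1 means "no child to signal".
 * A static left uninitialized would start at 0, which is the buggy case. */
pid_t ALARM_PID = -1;

/* Hypothetical helper: send sig to ALARM_PID only when it names one process.
 * Returns 1 if kill() was called and succeeded, 0 if skipped or failed. */
int signal_alarm_target(int sig)
{
    /* pid == 0 targets the caller's whole process group;
     * pid < 0 targets other groups (or every process for -1). Skip both. */
    if (ALARM_PID <= 0) {
        return 0;
    }
    return kill(ALARM_PID, sig) == 0 ? 1 : 0;
}
```

With this guard, a timeout handler that fires before ALARM_PID has been set to a real child pid does nothing, instead of terminating the master cfrun process and all of its siblings.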
-- 
Mark Burgess

-------------------------------------------------
Professor of Network and System Administration
Oslo University College, Norway

Personal Web: http://www.iu.hio.no/~mark
Office Telf : +47 22453272
-------------------------------------------------
