I found a file descriptor leak, occurring only on Linux, in the
latest change that added CPU utilization monitoring. I am testing it
now. Please check too and tell me if things get better.

SiliconSlick wrote:
> Howdy all,
> 
> I rebuilt an RPM using the latest SVN code
> late last week and installed it here.  I'm
> now getting a lot of cfenvd failures.  It
> appears to be leaking file descriptors.
> 
> A sample from the log gives the following:
> 
> Apr 19 22:34:52 gemenon cfenvd[22964]:  File descriptor 1021 of child
> higher than MAXFD, check for defunct children
> Apr 19 22:34:52 gemenon cfenvd[22964]:  File descriptor 1022 of child
> 13520 higher than MAXFD, check for defunct children
> Apr 19 22:34:52 gemenon cfenvd[22964]:  File descriptor 1022 of child
> higher than MAXFD, check for defunct children
> Apr 19 22:37:22 gemenon cfenvd[22964]:  File descriptor 1022 of child
> 13533 higher than MAXFD, check for defunct children
> Apr 19 22:37:22 gemenon cfenvd[22964]:  File descriptor 1022 of child
> higher than MAXFD, check for defunct children
> Apr 19 22:40:05 gemenon cfenvd[22964]:  Couldn't open average
> database /var/cfengine/state/cf_observations.db
> Apr 19 22:40:05 gemenon cfenvd[22964]:  db_open: Too many open files
> Apr 19 22:40:05 gemenon cfenvd[22964]:  Error reading average database
> Apr 19 23:01:43 gemenon xxx_cfengine_cfexecd:gemenon:[13852]: Executing
> shell command: /etc/rc.d/init.d/cfenvd restart
> Apr 19 23:01:43 gemenon xxx_cfengine_cfexecd:gemenon:[13852]: Restart:
> Stopping cfengine anomaly detection service (cfenvd): [FAILED]
> Apr 19 23:01:43 gemenon cfenvd[14285]: Lock
> lock.db.localhost.cfenvd.daemon_2743 expired (after 2575/1 minutes)
> Apr 19 23:01:43 gemenon cfenvd[14283]: cfenvd: starting
> Apr 19 23:01:43 gemenon xxx_cfengine_cfexecd:gemenon:[13852]: Restart:
> Starting cfengine anomaly detection service (cfenvd): [  OK  ]
> Apr 19 23:01:44 gemenon xxx_cfengine_cfexecd:gemenon:[13852]: (Done
> with /etc/rc.d/init.d/cfenvd restart)
> Apr 19 23:24:21 gemenon cfenvd[14285]:  LDT Buffer full at 10
> Apr 19 23:39:22 gemenon cfenvd[14285]:  File descriptor 20 of child
> 14584 higher than MAXFD, check for defunct children
> Apr 19 23:39:22 gemenon cfenvd[14285]:  File descriptor 20 of child
> higher than MAXFD, check for defunct children
> Apr 19 23:41:52 gemenon cfenvd[14285]:  File descriptor 20 of child
> 14602 higher than MAXFD, check for defunct children
> Apr 19 23:41:52 gemenon cfenvd[14285]:  File descriptor 20 of child
> higher than MAXFD, check for defunct children
> 
> It appears that after 40 minutes it has reached MAXFD==20.  It goes
> along for a while and then eventually dies about a day and a half later
> (at ~30 fds/hour and with 1024 fds available, about 34 hours).
> 
> Given Mark's recent changes and request for help with cfenvd,
> I thought it might be related.  Looking at the diff between
> revision 550 and 553 of cfenvd.c[*], I'm thinking the culprit
> might be a return without "fclose(fp)" on line 1404[**].   I haven't
> tested a fix yet since I'm not sure what the fix is (close
> the file first?... don't return?).
> 
> Does this seem like it could be the cause of the problem
> I'm seeing above?  Anyone else having similar problems?
> 
> jack/SiliconSlick
> 
> [*]
> http://svn.iu.hio.no/viewvc/trunk/src/cfenvd.c?root=Cfengine-2&r1=550&r2=553
> 
> [**] this bit:
> 
>    else
>       {
>       Verbose("Found nothing (%s)\n",cpuname);
>       index = ob_spare;
>       return;
>       }
> 
> _______________________________________________
> Bug-cfengine mailing list
> [email protected]
> https://cfengine.org/mailman/listinfo/bug-cfengine
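
For reference, a minimal sketch of the pattern discussed above. This is
not the actual cfenvd.c code: the function name find_stat_line and its
parameters are made up for illustration. The point is only that the
"found nothing" branch should fall through to a single fclose(fp)
rather than return early, so every call gives its descriptor back.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from cfenvd.c): look for the first line in
 * the file at `path` that starts with `name` and copy it into `out`.
 * Returns 0 when a line was found, -1 when not found or on error. */
int find_stat_line(const char *path, const char *name,
                   char *out, size_t outlen)
{
    FILE *fp;
    char line[256];
    int result = -1;

    if (outlen == 0)
        return -1;

    if ((fp = fopen(path, "r")) == NULL)
        return -1;              /* nothing was opened, nothing to close */

    while (fgets(line, sizeof line, fp) != NULL) {
        if (strncmp(line, name, strlen(name)) == 0) {
            snprintf(out, outlen, "%s", line);
            result = 0;
            break;
        }
    }

    /* Single exit point: the "found nothing" case lands here as well,
     * instead of returning early with fp still open. */
    fclose(fp);
    return result;
}
```

Calling this repeatedly on a file with no matching line should leave the
number of open descriptors unchanged, whereas an early return before the
fclose would use up one descriptor per call.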

-- 


Mark Burgess

Web: http://www.iu.hio.no/~mark
Tlf: +47 22453272
