Hi all,

> > this fits with my symptoms. For the record my set-up is 4.1.2 on a
> > 166mmx 64M & 4.3gig. Is the symptom occurring on higher spec machines?
>
> I don't know, I don't have one :-)

  We are testing on a Celeron 733 with 256 MB RAM, under very light load
(just for testing). I guess that's why we are not getting these kinds of
errors.

> The problem is a number of synchronisation problems with the way that the
> data is collected. What you have is cron running a perl script, which runs
> lots of commands, then updates a database. And every minute it does the
> same thing. And it tries to do the same thing for three different monitors
> (plus one every fifteen minutes).

  Exactly :)

> This could make the machine a lot busier than it otherwise is, and the
> commands meminfo.pl etc might take more than a minute to run. If that
> happens, then two instances of meminfo.pl may finish in the same second,
> and try to update the database with new memory info timestamped at that
> same second. And this is what rrdupdate is complaining about.

  Even more so now that it runs a test -e command, and if you have a lot of
interfaces in /proc/net/dev you might need to update a lot of databases
every minute.

> The solutions to this are twofold:
>
> - run a long lasting perl program which becomes active every minute, and
>   takes and logs a new sample. (This will save the perl
>   interpreter/compiler startup time, lots of times).

  I agree. This was pointed out to us privately some time ago by another
list member. We agree that's the way to go. We will rewrite the collecting
side of e-smith monitor this way (version 2.0).
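
  To make the idea concrete, here is a minimal sketch of such a long-running
collector. It is written in Python rather than Perl purely for illustration,
and the names run_collector and take_sample are hypothetical, not part of
e-smith monitor. It sleeps until the next minute boundary, takes one sample,
and refuses to log two samples for the same minute, which is the situation
rrdupdate complains about.

```python
import time

def seconds_until_next_minute(now):
    """Seconds from `now` (epoch seconds) to the next whole-minute boundary."""
    return 60.0 - (now % 60.0)

def run_collector(take_sample, iterations=None, clock=time.time, sleep=time.sleep):
    """Call take_sample(timestamp) once per minute.

    take_sample is a hypothetical callback that reads the counters and
    updates the database. iterations=None means run forever; clock and
    sleep are parameters only so the loop can be exercised without waiting.
    """
    last_minute = None
    done = 0
    while iterations is None or done < iterations:
        sleep(seconds_until_next_minute(clock()))
        minute = int(clock() // 60)
        if minute != last_minute:
            # Timestamp is aligned to the minute, so two samples can
            # never carry the same timestamp.
            take_sample(minute * 60)
            last_minute = minute
        done += 1
```

  Because the interpreter starts only once, the per-minute cost is just the
sampling itself, and a slow sample delays the next one instead of overlapping
with it.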

> - be very lightweight in the way you take the samples. Most of the data
>   can be read straight out of the /proc filesystem, without running the
>   sar command, for instance.

  I have given some suggestions on this in a different thread on the list.
You could remove the test -e command as long as you install the package
carefully (with all the interfaces present in /proc/net/dev), and even
reduce the sar command from 3 iterations to just 1. BTW, we use the sar
command because we don't know of any value in the /proc structure that gives
us the same info (yes, you could use the same info as the uptime command,
but that's a relative measure, not a %). We would love to use such a value,
as it would reduce package dependencies and be a lot faster, but we don't
know which one to use.
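
  One candidate for that value is the aggregate "cpu" line in /proc/stat. It
holds cumulative tick counters (user, nice, system, idle, and more columns on
later kernels), so a percentage has to be computed from the difference
between two readings. The sketch below is in Python for illustration only;
the function names are hypothetical, and whether the result matches what sar
reports on a given system is an assumption that would need checking.

```python
def read_cpu_times(stat_text):
    """Return (busy, total) ticks from the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            values = [int(v) for v in fields[1:]]
            idle = values[3]            # 4th column is idle time
            # Use at most the first 8 columns; later ones (guest time)
            # are already included in the user column.
            total = sum(values[:8])
            return total - idle, total
    raise ValueError("no aggregate 'cpu' line found")

def cpu_percent(before, after):
    """Percent of non-idle time between two (busy, total) readings."""
    busy = after[0] - before[0]
    total = after[1] - before[1]
    return 100.0 * busy / total if total > 0 else 0.0

def read_proc_stat(path="/proc/stat"):
    """Read the current counters from the running system."""
    with open(path) as handle:
        return read_cpu_times(handle.read())
```

  A long-running collector could keep the previous reading in memory and call
cpu_percent(previous, current) each minute, which gives an average over the
whole minute without running any external command.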

  In response to this same email, Richard has said:

>> Ahh, mutual exclusion - maybe some form of semaphore handling or
>> transaction queuing and locking?

  Well, I believe this is completely out of our reach. As I said, my perl
knowledge is very light and my friend's is not much greater than mine. Using
these kinds of techniques would be too much for us, at least for now. We
will try other possibilities first.
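
  For what it's worth, the simplest form of mutual exclusion is a lock file:
each run tries to take an exclusive, non-blocking lock and skips the sample
if a previous run still holds it. The sketch below is in Python for
illustration (Perl has a built-in flock that works along the same lines); the
lock path and the name run_exclusively are hypothetical.

```python
import fcntl
import os

def run_exclusively(lock_path, job):
    """Run job() only if no other run holds the lock.

    Returns True if job() ran, False if another run was still active.
    """
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        try:
            # LOCK_NB makes the call fail at once instead of waiting.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return False
        job()
        return True
    finally:
        os.close(fd)    # closing the descriptor releases the lock
```

  With this in place, a run that overlaps a slow one exits immediately rather
than finishing in the same second and writing a duplicate timestamp.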

  Regards.

