Hi,
I’m trying to get some dependable data on Nagios IO.
Nagios does a lot of disk IO, which is known, but there are no hard numbers for it.
It gets especially interesting for systems that _have_ best practices applied:
- rrdcached is running, volatile data is written to a RAM disk, etc.
My current approach is to use SystemTap and collect only write accesses and
their latencies.
I have this working, based on the syscall IO probe examples here:
https://sourceware.org/systemtap/examples/keyword-index.html#FILE
...and grep, since I don’t really understand all of it.
To turn it into something more useful that more people can use and that shows
results easily, I want to use flame graphs as described at
http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html
The whole toolkit seems to be able to work with SystemTap.
The problem:
I’m apparently just too stupid. I don’t know how to get started.
I do not remotely grasp how to take the FlameGraph git repo and the script I
have and make them do "something"
(something being a sort on IO time spent per path element of the files written
to).
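For what it's worth, as far as I understand it flamegraph.pl reads "folded"
lines of the form "frame1;frame2;frame3 value", so the conversion step might
look roughly like the sketch below. This is only a guess at a starting point:
the input format (one write per line, latency in microseconds followed by the
file path) and the function name fold_write_latencies are assumptions of mine,
not what the SystemTap example script actually prints.

```python
from collections import defaultdict


def fold_write_latencies(lines):
    """Sum write latency per file path and return flame graph 'folded' lines.

    Assumed (hypothetical) input: one write per line, "<latency_us> <path>".
    Each path element becomes one frame, so /var/log/x turns into var;log;x.
    """
    totals = defaultdict(int)
    for line in lines:
        parts = line.strip().split(None, 1)
        if len(parts) != 2:
            continue  # skip lines that do not look like "<latency> <path>"
        latency, path = parts
        try:
            latency_us = int(latency)
        except ValueError:
            continue  # skip lines whose first field is not a number
        frames = [element for element in path.split("/") if element]
        if not frames:
            continue
        totals[";".join(frames)] += latency_us
    # One folded line per distinct path: "frame;frame;frame total_latency"
    return ["%s %d" % (stack, total) for stack, total in sorted(totals.items())]
```

Printing the returned lines to a file and passing that file to flamegraph.pl
should then give a graph where the width of each path element reflects the
summed write latency below it, rather than CPU samples.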
Has any of you tried something similar?
Has any of you worked with flame graphs and could give some advice?
Florian
_______________________________________________
Discuss mailing list
[email protected]
https://lists.lopsa.org/cgi-bin/mailman/listinfo/discuss
This list provided by the League of Professional System Administrators
http://lopsa.org/