I have looked for I/O spikes, but have not found anything that
correlates with the load spikes.
Java garbage collection does seem a possible cause.
I will let the list know how things go.
Chris
On 11/13/13 07:51, David Lang wrote:
The other thing I would look for is I/O spikes. Try running iostat for
a bit and see if you have a disk getting hammered. If so, you can tweak
kernel settings so that less data gets cached before it starts
getting written out.
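
The kernel settings are not named above. On Linux the usual candidates
are the vm.dirty_* writeback tunables, so the sketch below assumes those
are what is meant. It only reads the current values from /proc/sys/vm so
they can be recorded before anything is changed; the function name and
the choice of keys are illustrative.

```python
import os

# Writeback tunables under /proc/sys/vm (assumed to be the settings meant).
DIRTY_KEYS = (
    "dirty_background_ratio",
    "dirty_ratio",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
)


def read_dirty_settings(vm_dir="/proc/sys/vm"):
    """Return {name: int} for each writeback tunable that can be read."""
    settings = {}
    for key in DIRTY_KEYS:
        try:
            with open(os.path.join(vm_dir, key)) as f:
                settings[key] = int(f.read().strip())
        except (OSError, ValueError):
            continue  # not present on this kernel, or not an integer
    return settings
```

Lower dirty_background_ratio and dirty_ratio values make the kernel
start writing dirty pages out sooner, which is the direction suggested
above. Whether that helps depends on iostat actually showing a disk
being saturated during a spike.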
David Lang
On Tue, 12 Nov 2013, Leon Towns-von Stauber wrote:
Random-seeming resource usage spikes on a Java application server?
I'd suspect garbage collection.
- Leon
On Nov 12, 2013, at 7:57 PM, Chris Picton <[email protected]> wrote:
Hi all
I have a set of servers running Asterisk and some Java apps which
have (so far) unexplained spikes in load average.
A typical spike, which occurs at seemingly random times, sees the
1-minute load average go from around 4 to upwards of 50, sometimes
approaching 200, within one second.
From the proc manpage, the 1-minute load average is the "number of
jobs in the run queue (state R) or waiting for disk I/O (state D)
averaged over 1 minute".
I am collecting many different stats from proc every second, but
nothing I have found correlates with the spike in load average.
The process counts from /proc/stat and /proc/loadavg do not match
up with the sudden spike. I have looked at memory paging, IRQs,
number of threads, CPU states (intr/iowait/etc.), network traffic,
disk I/O, etc., but no metric I have found so far changes behaviour
at the same time as the load average spikes.
As I am writing this, I have realized that I am not actually
tracking the numbers that directly drive the load average. To do
that, I would loop through all processes, extract the process state
from /proc/<pid>/stat, and add up the counts for each state.
This would (hopefully) provide a match so I could see that the load
average numbers are "correct", and may indicate a cause (many
processes waiting for I/O, or lots of the same process (Asterisk or
Java) being scheduled to run at the same time).
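
A minimal sketch of that per-state count, in Python. It makes one
assumption beyond the description above: it walks
/proc/<pid>/task/<tid>/stat rather than only /proc/<pid>/stat, because
the Linux load average counts individual threads, and a multi-threaded
Java or Asterisk process would otherwise show up as a single entry.
The function names are illustrative.

```python
import os
from collections import Counter


def parse_state(stat_text):
    """Return the one-letter state field from the text of a stat file."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')' characters, so cut at the last ')' instead of
    # splitting the whole line on whitespace.
    after_comm = stat_text[stat_text.rindex(")") + 1:]
    return after_comm.split()[0]


def count_task_states(proc_root="/proc"):
    """Count tasks (threads) by state for every process under proc_root."""
    counts = Counter()
    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue  # skip non-process entries such as "meminfo"
        task_dir = os.path.join(proc_root, pid, "task")
        try:
            tids = os.listdir(task_dir)
        except OSError:
            continue  # process exited while we were scanning
        for tid in tids:
            try:
                with open(os.path.join(task_dir, tid, "stat")) as f:
                    counts[parse_state(f.read())] += 1
            except (OSError, ValueError, IndexError):
                continue  # thread exited, or the line was not parseable
    return counts
```

Calling count_task_states() once a second and logging the "R" and "D"
counts next to the contents of /proc/loadavg should show whether a
spike is made up of runnable tasks or tasks blocked on I/O. A scan
takes a snapshot that is not atomic, so short-lived tasks can be
missed.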
While I do that, would anyone have some other idea of how to
troubleshoot the cause of very high load spikes?
Regards
Chris
_______________________________________________
Tech mailing list
[email protected]
https://lists.lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
http://lopsa.org/