> On Fri, Oct 29, 2010 at 2:50 PM, Robin Cotgrove <ro...@rjcnet.co.uk> wrote:
> > Sorry guys. Swap is not the issue. We've had this confirmed by Oracle
> > and I can clearly see there is 96GB of swap available on the system
> > and ~50GB of main memory.
>
> By whom at Oracle? Not everyone is equally qualified. I would tend to
> trust Jim Mauro (who co-wrote the books[1] on Solaris internals,
> performance, & dtrace) over most of the people you will get to through
> normal support channels.
Agreed. The normal support channel told us the GUDS script would be better
than producing a memory dump for capturing the root cause.

> 1. http://www.amazon.com/Jim-Mauro/e/B001ILM8NC/
>
> How do you know that available swap doesn't momentarily drop?

Because I have been monitoring it with vmstat during the issues, and I
understand the workload on the platform well enough to know that nothing
suddenly starts with huge memory requirements. This is a VCS cluster with
Oracle Database Resource Groups. DISM is not in use by the various Oracle
DBs, as we ran into a bug with it some months ago. We've since patched the
system, but we don't need DISM on this dev/test Oracle VCS cluster.

> I've run into plenty of instances where a system has tens of gigabytes
> of free memory but is woefully short on reservable swap (virtual memory,
> as Jim approximates). Usually "vmstat 1" is helpful in observing spikes,
> but as I said before this could miss very short spikes. If you've
> already done this to see that swap is unlikely to be an issue, knowing
> that would be useful. If you are measuring the amount of reservable swap
> with "swap -l", you are doing it wrong.

Agreed. I don't use it, and I don't trust the output from the top utility
either :-)

> I do agree that there can be other shortfalls that can cause this. This
> may call for speculative tracing of stacks across the fork entry and
> return calls, displaying results only when the fork fails with EAGAIN.
> Jim's second script is similar to what I suggest, except that it doesn't
> show the code path taken between syscall::forksys:entry and
> syscall::forksys:return.
>
> Also, I would be a little careful running the second script as is for
> long periods of time if you have a lot of forksys activity with unique
> stacks. I think that as it is @ks may grow rather large over time
> because the successful forks are not cleared.
>
> --
> Mike Gerdts
> http://mgerdts.blogspot.com/
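
For the archives, here is a minimal sketch of the speculative approach
described above, modelled on the specopen.d example from the Solaris
Dynamic Tracing Guide. It is untested on this cluster, and the fbt clause
enables every kernel function-entry probe, so it is only sensible for a
short reproduction window on a dev/test box like this one:

#pragma D option quiet
/* Allow a few concurrent speculations (the default nspec is 1). */
#pragma D option nspec=16

syscall::forksys:entry
{
	/* Start a speculation and note who is forking. */
	self->spec = speculation();
	speculate(self->spec);
	printf("forksys by %s (pid %d)\n", execname, pid);
	stack();
}

/*
 * Speculatively record the kernel code path taken between
 * forksys entry and return.
 */
fbt:::entry
/self->spec/
{
	speculate(self->spec);
	printf("  -> %s`%s\n", probemod, probefunc);
}

syscall::forksys:return
/self->spec && errno == EAGAIN/
{
	/* The fork failed with EAGAIN: publish the speculative data. */
	commit(self->spec);
	self->spec = 0;
}

syscall::forksys:return
/self->spec/
{
	/* The fork succeeded, or failed for another reason: discard. */
	discard(self->spec);
	self->spec = 0;
}

Because successful forks are discard()ed rather than aggregated, a sketch
along these lines also sidesteps the @ks growth concern above.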