> On Fri, Oct 29, 2010 at 2:50 PM, Robin Cotgrove
> <ro...@rjcnet.co.uk> wrote:
> > Sorry guys. Swap is not the issue. We've had this confirmed by Oracle
> > and I can clearly see there is 96GB of swap available on the system
> > and ~50GB of main memory.
> 
> By whom at Oracle?  Not everyone is equally qualified.  I would tend to
> trust Jim Mauro (who co-wrote the books[1] on Solaris internals,
> performance, & dtrace) over most of the people you will get to through
> normal support channels.

Agreed. The normal support channel told us the GUDS script would be a better 
way to capture the root cause than producing a memory dump.

> 
> 1. http://www.amazon.com/Jim-Mauro/e/B001ILM8NC/
> 
> How do you know that available swap doesn't momentarily drop?

Because I have been monitoring it with vmstat during the issues, and I 
understand the workload on the platform well enough to know that nothing 
suddenly starts up with huge memory requirements. This is a VCS cluster with 
Oracle Database resource groups. DISM is not in use by the various Oracle DBs; 
we ran into a bug with it some months ago. We've since patched the system, but 
we don't need DISM on this dev/test Oracle VCS cluster.

> I've run into plenty of instances where a system has tens of gigabytes
> of free memory but is woefully short on reservable swap (virtual memory,
> as Jim approximates).  Usually "vmstat 1" is helpful in observing spikes,
> but as I said before this could miss very short spikes.  If you've
> already done this and seen that swap is unlikely to be an issue, that
> would be useful to know.  If you are measuring the amount of reservable
> swap with "swap -l", you are doing it wrong.

Agreed. I don't use it and I don't trust the output from the top utility either 
:-) 
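
For what it's worth, one way to catch a momentary shortfall that a one-second 
vmstat interval might miss is to watch the reservation path directly with 
DTrace. A rough sketch, assuming anon_resvmem() is the swap reservation 
routine on this kernel and that fbt can see it (worth confirming with 
"dtrace -l -n fbt::anon_resvmem:return" first):

#!/usr/sbin/dtrace -s

/*
 * Assumption: anon_resvmem() returns 0 when a swap reservation fails,
 * so this fires on every failure however briefly the shortfall lasts.
 */
fbt::anon_resvmem:return
/arg1 == 0/
{
        printf("%Y: swap reservation failed in %s (pid %d)\n",
            walltimestamp, execname, pid);
        stack();
}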

> 
> I do agree that there can be other shortfalls that can cause this.
> This may call for speculative tracing of stacks across the fork entry
> and return calls, displaying results only when the fork fails with
> EAGAIN.  Jim's second script is similar to what I suggest, except that
> it doesn't show the code path taken between syscall::forksys:entry and
> syscall::forksys:return.
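
Along those lines, here is a rough sketch of what that speculative version 
might look like, following the pattern from the speculation chapter of the 
Dynamic Tracing Guide. The fbt:::entry clause is deliberately broad and 
therefore heavyweight; narrow it to the relevant modules once you know where 
to look, and only run it in short windows:

#!/usr/sbin/dtrace -s

#pragma D option quiet
#pragma D option nspec=8

syscall::forksys:entry
{
        self->spec = speculation();
        speculate(self->spec);
        printf("%Y: forksys by %s (pid %d)\n", walltimestamp, execname, pid);
        ustack();
}

/*
 * Speculatively record the kernel code path taken while the fork is in
 * flight.  Matching every fbt entry is expensive; narrow this down.
 */
fbt:::entry
/self->spec/
{
        speculate(self->spec);
        printf("  -> %s\n", probefunc);
}

syscall::forksys:return
/self->spec && errno == EAGAIN/
{
        speculate(self->spec);
        printf("forksys returned EAGAIN\n");
}

syscall::forksys:return
/self->spec && errno == EAGAIN/
{
        /* The fork failed; keep everything recorded above. */
        commit(self->spec);
        self->spec = 0;
}

syscall::forksys:return
/self->spec/
{
        /* The fork succeeded (or failed some other way); throw it away. */
        discard(self->spec);
        self->spec = 0;
}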
> 
> Also, I would be a little careful running the second script as is for
> long periods of time if you have a lot of forksys activity with unique
> stacks.  I think that, as it is, @ks may grow rather large over time
> because the successful forks are not cleared.
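
If something like it does need to run for a long stretch, one way to keep the 
aggregation bounded is to aggregate only on the failing returns and drain it 
periodically; a minimal sketch (the 30-second drain interval is arbitrary):

#!/usr/sbin/dtrace -s

/* Only accumulate stacks for forks that actually fail with EAGAIN. */
syscall::forksys:return
/errno == EAGAIN/
{
        @ks[execname, ustack()] = count();
}

/* Print and clear every 30 seconds so @ks stays small over a long run. */
tick-30sec
{
        printa(@ks);
        trunc(@ks);
}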
> 
> -- 
> Mike Gerdts
> http://mgerdts.blogspot.com/
-- 
This message posted from opensolaris.org
_______________________________________________
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org
