Hi Gabriele,

The "prstat -Z" numbers by themselves do not tell you whether paging is
going on. All you can really glean from them is that the values in the
"MEMORY" percent column add up to less than 50% of overall system memory.

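As a quick check of that arithmetic, here is a minimal Python sketch that sums the MEMORY column. The rows are the zone summary lines from the prstat -Z output quoted further down in this message; the function name is made up for illustration, and the parsing assumes the column order shown in that output.

```python
# Zone summary rows as they appear in the prstat -Z output quoted in this thread.
# The global row carries no ZONEID in this copy, so fields are counted from the right.
PRSTAT_ZONES = """\
7 855 8697M 6877M 28% 16:03:15 2.6% cloudserver
8 292 2978M 1766M 7.2% 0:20:09 0.4% www.sonicle.com
92 1129M 477M 1.9% 110:18:26 0.1% global
1 23 41M 31M 0.1% 11:55:15 0.0% asterisk
5 26 524M 460M 1.9% 11:23:06 0.0% pkgserver
3 434 2212M 1231M 5.0% 39:04:23 0.0% encoserver
2 138 1402M 774M 3.2% 4:04:16 0.0% demo.sonicle.com
"""


def memory_percent_by_zone(text):
    """Map zone name -> MEMORY percentage (hypothetical helper).

    Assumes the trailing columns are MEMORY, TIME, CPU, ZONE in that order.
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the header, whose MEMORY field has no "%".
        if len(fields) < 4 or not fields[-4].endswith("%"):
            continue
        result[fields[-1]] = float(fields[-4].rstrip("%"))
    return result


zones = memory_percent_by_zone(PRSTAT_ZONES)
total = round(sum(zones.values()), 1)
print(f"MEMORY total across {len(zones)} zones: {total}%")
```

For the seven rows above this prints a total of 47.3%, which is consistent with "less than 50%" but, as noted, says nothing about whether paging is happening.
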
Look at "vmstat 1" output, the "pi" and "po" columns in particular, while
it prints one line per second. You should be able to do this from within
each zone as well as in the global zone. That will tell you if active
paging/swapping is going on. You could also use "iostat -xn 1" to see
whether the drives used by your swap space are busy, as a further clue.

Regards,

Marion

> Date: Wed, 25 Nov 2015 23:00:55 +0100
> From: Gabriele Bulfon <[email protected]>
> Reply-To: <[email protected]>
> To: Marion Hakanson <[email protected]>, <[email protected]>
> Subject: Re: [discuss] deadlocks
>
> Thanks Marion!
> this is what I was thinking, look at my global zone prstat -Z :
> ZONEID    NPROC  SWAP   RSS MEMORY      TIME  CPU ZONE
>      7      855 8697M 6877M    28%  16:03:15 2.6% cloudserver
>      8      292 2978M 1766M   7.2%   0:20:09 0.4% www.sonicle.com
>              92 1129M  477M   1.9% 110:18:26 0.1% global
>      1       23   41M   31M   0.1%  11:55:15 0.0% asterisk
>      5       26  524M  460M   1.9%  11:23:06 0.0% pkgserver
>      3      434 2212M 1231M   5.0%  39:04:23 0.0% encoserver
>      2      138 1402M  774M   3.2%   4:04:16 0.0% demo.sonicle.com
>
> cloudserver and www.sonicle.com are the said zones.
> I'm not sure what the SWAP column is saying, is cloudserver using all that
> swap?
> There's no clue about swap inside the zones, swap -l says:
> sonicle@cloudserver:~$ swap -l
> swapfile             dev    swaplo   blocks     free
> /dev/swap             -          8 25149432 24050136
> but maybe it's not real...
> ----------------------------------------------------------------------------------
> From: Marion Hakanson [email protected]
> Date: 25 November 2015 20:06:54 CET
> Subject: Re: [discuss] deadlocks
>
> Hi Gabriele,
> The behavior you describe could be caused by saturation of any of
> the resources on the system, not only CPU/load. If all of RAM were
> used up, for example, a new SSH session would pause until processes
> were swapped/paged out enough to give room for a new SSH to run
> in memory. Similar results could happen if network or disk resources
> were saturated.
>
> Have a look at the USE method:
> http://www.brendangregg.com/usemethod.html
>
> On illumos/Solaris-based systems, you can use these commands to start with:
> prstat 1 (for CPU)
> vmstat 1 (for memory)
> iostat -xn 1 (for disk)
> dladm show-link -s -i 1 (for network)
>
> Also check /var/adm/messages for errors being logged at or near the
> time of the issue.
>
> Regards,
>
> Marion
>
> Date: Wed, 25 Nov 2015 19:39:38 +0100
> From: Gabriele Bulfon
> To:
> Subject: [discuss] deadlocks
>
> Hi,
> I'm looking for help to find a solution to strange slowdowns on a
> long-lived XStream/illumos server.
> This server runs 5-6 zones, on intel 8 cores, 24GB ram, separate boot on sata
> mirror rpool, and data on sas raidz pool.
> Two of these zones run essentially the same software: apache, tomcat, cyrus,
> postfix, amavis, postgres
> Apache front ends http to tomcat, running our collab webapps, working all
> day on postfix smtp, cyrus imap and postgres db.
> 1st zone is our own dev machine, running 4-5 users actually on all the stack.
> 2nd zone is our customers machine, running around 1000 users on all the
> stack, separated into about 10 cyrus domains
> and their separate 10 instances of both webapps and databases.
> Recently, it happens from time to time (1-2 times a week) that everything
> starts to slow down.
> Stopping one or the other zone's tomcat/apache gets everything back: sometimes
> it's ours, sometimes it's the cloud.
> Ok, at first sight one would say: your web app has problems.
> But... then why do I have a hard time connecting via ssh to the zones during
> these situations? Login takes minutes,
> password to shell takes many more minutes, but prstat/top don't show any
> high cpu usage on the global zone, nor inside the zones.
> Then I stop one tomcat (sometimes one, sometimes the other), and everything
> gets freed up.
> Imap processes during these times are around 1000 in one machine, around 100
> on the other.
> Then they abruptly drop, obviously because the web app closes connections.
> So my question is... how can I dig into this problem?
> I would think that if the webapp is the problem, inside java/tomcat, I
> should not experience problems during ssh.
> Any possible limits on sockets? Any other idea?
> Thanks....
> Gabriele
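
To make the "vmstat 1" check from the top of this message less manual, a small sketch along these lines could watch the "pi" and "po" columns. It is an illustration only: the function names and the five-sample threshold are assumptions, not something from the thread. It relies on the column header line that vmstat prints so that it can locate pi and po by name instead of by fixed position.

```python
from itertools import islice


def paging_samples(lines):
    """Yield (pi, po) integer pairs from vmstat output lines (hypothetical helper).

    The column positions are taken from the header line that names them,
    since the number of disk columns varies between systems.
    """
    pi_idx = po_idx = None
    for line in lines:
        fields = line.split()
        if "pi" in fields and "po" in fields:
            # Column header line; vmstat repeats it periodically.
            pi_idx, po_idx = fields.index("pi"), fields.index("po")
            continue
        if pi_idx is None or len(fields) <= max(pi_idx, po_idx):
            continue  # no header seen yet, or a short group-header line
        try:
            yield int(fields[pi_idx]), int(fields[po_idx])
        except ValueError:
            continue  # not a data row


def sustained_paging(samples, min_consecutive=5, skip_first=True):
    """Return True once pi or po is nonzero for min_consecutive samples in a row.

    The first vmstat line is skipped by default because it reports averages
    since boot, not the current interval. The threshold of 5 is an
    arbitrary assumption; adjust to taste.
    """
    run = 0
    for pi, po in islice(samples, 1 if skip_first else 0, None):
        run = run + 1 if (pi > 0 or po > 0) else 0
        if run >= min_consecutive:
            return True
    return False
```

One way to use it would be to pipe "vmstat 1" into a script that calls sustained_paging(paging_samples(sys.stdin)), in the global zone and in each zone in turn. A result of True only says that page-in/page-out activity persisted for several seconds; the iostat check on the swap devices is still worth doing alongside it.
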
