On 25 November 2015, 23:32:13 CET, Marion Hakanson <[email protected]> wrote:
>Hi Gabriele,
>
>The "prstat -Z" numbers by themselves do not tell you if paging is going on.
>All you can really glean from it is that all values in the "MEMORY" percent
>column add up to less than 50% of the overall system memory.
>
>Look at "vmstat 1" output, the "pi" and "po" columns in particular,
>while it prints out one line per second over time. You should be
>able to do this from within each zone as well as in the global zone.
>That will tell you if active paging/swapping is going on.
>
>You could also use "iostat -xn 1" and see if the drives used by your
>swap space are busy, to look for clues.
>
>Regards,
>
>Marion
>
>
>> Date: Wed, 25 Nov 2015 23:00:55 +0100
>> From: Gabriele Bulfon <[email protected]>
>> Reply-To: <[email protected]>
>> To: Marion Hakanson <[email protected]>, <[email protected]>
>> Subject: Re: [discuss] deadlocks
>>
>> Thanks Marion!
>> This is what I was thinking. Look at my global zone prstat -Z:
>>
>> ZONEID    NPROC  SWAP   RSS MEMORY      TIME  CPU ZONE
>>      7      855 8697M 6877M    28%  16:03:15 2.6% cloudserver
>>      8      292 2978M 1766M   7.2%   0:20:09 0.4% www.sonicle.com
>>              92 1129M  477M   1.9% 110:18:26 0.1% global
>>      1       23   41M   31M   0.1%  11:55:15 0.0% asterisk
>>      5       26  524M  460M   1.9%  11:23:06 0.0% pkgserver
>>      3      434 2212M 1231M   5.0%  39:04:23 0.0% encoserver
>>      2      138 1402M  774M   3.2%   4:04:16 0.0% demo.sonicle.com
>>
>> cloudserver and www.sonicle.com are the zones in question.
>> I'm not sure what the SWAP column is saying. Is cloudserver using all that swap?
>> There's no clue about swap inside the zones; swap -l says:
>>
>> sonicle@cloudserver:~$ swap -l
>> swapfile             dev    swaplo   blocks     free
>> /dev/swap            -          8 25149432 24050136
>>
>> but maybe it's not real...
>> ----------------------------------------------------------------------------------
>> From: Marion Hakanson [email protected]
>> Date: 25 November 2015 20:06:54 CET
>> Subject: Re: [discuss] deadlocks
>>
>> Hi Gabriele,
>>
>> The behavior you describe could be caused by saturation of any of
>> the resources on the system, not only CPU/load. If all of RAM were
>> used up, for example, a new SSH session would pause until processes
>> were swapped/paged out enough to give room for a new SSH to run
>> in memory. Similar results could happen if network or disk resources
>> were saturated.
>>
>> Have a look at the USE method:
>> http://www.brendangregg.com/usemethod.html
>>
>> On illumos/Solaris-based systems, you can use these commands to start with:
>> prstat 1 (for CPU)
>> vmstat 1 (for memory)
>> iostat -xn 1 (for disk)
>> dladm show-link -s -i 1 (for network)
>>
>> Also check /var/adm/messages for errors being logged at or near the
>> time of the issue.
>>
>> Regards,
>> Marion
>>
>> Date: Wed, 25 Nov 2015 19:39:38 +0100
>> From: Gabriele Bulfon
>> To:
>> Subject: [discuss] deadlocks
>>
>> Hi,
>> I'm looking for help finding a solution to strange slowdowns on a long-lived XStream/illumos server.
>> This server runs 5-6 zones on an 8-core Intel machine with 24GB RAM, a separate boot rpool on a SATA mirror, and data on a SAS raidz pool.
>> Two of these zones run essentially the same software: apache, tomcat, cyrus, postfix, amavis, postgres.
>> Apache front-ends HTTP to tomcat, which runs our collab webapps; these work all day against postfix smtp, cyrus imap and the postgres db.
>> The 1st zone is our own dev machine, with 4-5 users actually running on the whole stack.
>> The 2nd zone is our customers' machine, with around 1000 users on the whole stack, separated into about 10 cyrus domains
>> and their 10 separate instances of both webapps and databases.
>> Recently, it happens from time to time (1-2 times a week) that everything starts to slow down.
>> Stopping one or the other zone's tomcat/apache gets everything back: sometimes it's ours, sometimes it's the cloud.
>> OK, at first sight one would say: your web app has problems.
>> But... then why do I have a hard time connecting via ssh to the zones during these situations? Login takes minutes,
>> getting from password to shell takes many more minutes, but prstat/top don't show any high CPU usage in the global zone, nor inside the zones.
>> Then I stop one tomcat (sometimes one, sometimes the other), and everything frees up.
>> Imap processes during these times number around 1000 on one machine, around 100 on the other.
>> Then they abruptly drop, obviously because the web app closes its connections.
>> So my question is... how can I dig into this problem?
>> I would think that if the webapp is the problem, inside java/tomcat, I should not experience problems with ssh.
>> Any possible limits on sockets? Any other idea?
>> Thanks....
>> Gabriele
>>
Also do not forget that in Solaris, swap does not necessarily mean allocated memory - it can quite well be just reserved. VMs tend to do that - requiring enough swap to be available in case a VM's RAM needs to be swapped out. Usually it is never really used.

Try to leave a terminal open (maybe in a vnc console) running 'vmstat 1' and 'iostat -Xnz 1', so you can see whether the system is swapping during the slowdown.

Also, maybe walking the fragmented memory is a time-consuming task. Revise your JVM settings (GC etc.) to rule that out - or maybe even solve the issue by cleaning up often enough that single ops are not fatally slow.

HTH,
Jim
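
For anyone who wants to redo the arithmetic from this thread, here is a minimal Python sketch. It sums the MEMORY column of the prstat -Z output quoted above and converts the swap -l block counts to MiB. The function names are made up for illustration, and the 512-byte block size for swap -l is an assumption based on what the swap man page documents; check it on your own system. This only restates numbers already posted, it does not detect paging.

```python
"""Redo the prstat -Z and swap -l arithmetic from the quoted output."""

# Sample copied from the prstat -Z output quoted in the thread. The global
# zone row lost its ZONEID in the quote, so fields are counted from the right.
PRSTAT_SAMPLE = """\
ZONEID NPROC SWAP RSS MEMORY TIME CPU ZONE
7 855 8697M 6877M 28% 16:03:15 2.6% cloudserver
8 292 2978M 1766M 7.2% 0:20:09 0.4% www.sonicle.com
92 1129M 477M 1.9% 110:18:26 0.1% global
1 23 41M 31M 0.1% 11:55:15 0.0% asterisk
5 26 524M 460M 1.9% 11:23:06 0.0% pkgserver
3 434 2212M 1231M 5.0% 39:04:23 0.0% encoserver
2 138 1402M 774M 3.2% 4:04:16 0.0% demo.sonicle.com
"""

# Assumption: swap -l reports sizes in 512-byte blocks.
SWAP_BLOCK_BYTES = 512


def memory_by_zone(prstat_text):
    """Map zone name -> MEMORY percentage (ZONE is last, MEMORY is 4th from last)."""
    result = {}
    for line in prstat_text.splitlines():
        fields = line.split()
        # Skip blank lines and the header (its MEMORY field has no '%').
        if len(fields) < 4 or not fields[-4].endswith("%"):
            continue
        result[fields[-1]] = float(fields[-4].rstrip("%"))
    return result


def total_memory_percent(prstat_text):
    """Sum of the MEMORY column over all zones."""
    return sum(memory_by_zone(prstat_text).values())


def swap_used_mib(blocks, free):
    """Space in use on a swap device, in MiB, from swap -l 'blocks' and 'free'."""
    return (blocks - free) * SWAP_BLOCK_BYTES / (1024 * 1024)


if __name__ == "__main__":
    print(f"MEMORY total across zones: {total_memory_percent(PRSTAT_SAMPLE):.1f}%")
    print(f"swap device in use: {swap_used_mib(25149432, 24050136):.0f} MiB")
```

With the figures quoted in the thread this gives about 47.3% for the MEMORY total, consistent with Marion's "less than 50%", and roughly 537 MiB in use on a swap device of about 12 GiB. That fits the point that the large SWAP column in prstat is mostly reservation rather than pages actually written out.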
