Hi Gabriele,

The behavior you describe could be caused by saturation of any resource on the system, not only CPU/load. If all of RAM were used up, for example, a new SSH session would stall until enough memory had been swapped or paged out to make room for the new process. Similar stalls can happen when network or disk resources are saturated.
Have a look at the USE method:
  http://www.brendangregg.com/usemethod.html

On illumos/Solaris-based systems, you can use these commands to start with:
  prstat 1    (for CPU)
  vmstat 1    (for memory)
  iostat -xn 1    (for disk)
  dladm show-link -s -i 1 <interface>    (for network)

Also check /var/adm/messages for errors being logged at or near the time
of the issue.

Regards,

Marion

> Date: Wed, 25 Nov 2015 19:39:38 +0100
> From: Gabriele Bulfon <[email protected]>
> To: <[email protected]>
> Subject: [discuss] deadlocks
>
> Hi,
> I'm looking for help to find a solution to strange slowdowns on a
> long-lived XStream/illumos server.
> This server runs 5-6 zones, on intel 8 cores, 24GB ram, separate boot on sata
> mirror rpool, and data on sas raidz pool.
> Two of these zones run essentially the same software: apache, tomcat, cyrus,
> postfix, amavis, postgres
> Apache front ends http to tomcat, running our collab webapps, working all
> day on postfix smtp, cyrus imap and postgres db.
> 1st zone is our own dev machine, running 4-5 users actually on all the stack.
> 2nd zone is our customers machine, running around 1000 users on all the
> stack, separated into about 10 cyrus domains
> and their separate 10 instances of both webapps and databases.
> Recently, it happens from time to time (1-2 times a week) that everything
> starts to slow down.
> Stopping one or the other zone's tomcat/apache gets everything back: sometimes
> it's ours, sometimes it's the cloud.
> Ok, at first sight one would say: your web app has problems.
> But....then why do I have a hard time connecting via ssh to the zones in
> these situations? Login takes minutes,
> password to shell another lot of minutes, but prstat/top don't show any high
> cpu usage on the global zone, nor inside the zones.
> Then I stop one tomcat (sometimes one, sometimes the other), and everything
> is freed up.
> Imap processes during these times are around 1000 in one machine, around 100
> on the other.
> Then they abruptly go down, obviously the web app closes connections.
> So my question is....how can I dig into this problem?
> I would think that if the webapp is the problem, inside java/tomcat, I
> should not experience problems during ssh.
> Any possible limits on sockets? Any other idea?
> Thanks....
> Gabriele
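
Since the slowdowns only show up once or twice a week, it may help to capture the output of the commands listed in the reply in one go while a slowdown is in progress. The following is only a rough sketch under stated assumptions: the names OUTDIR, SAMPLES, run_sampler and collect_all are made up for illustration, the output directory under /var/tmp is an arbitrary choice, and the network link name has to be supplied by whoever runs it.

```shell
#!/bin/sh
# Sketch: save a short burst of CPU, memory, disk and network statistics,
# plus the tail of the system log, into one directory for later comparison.
# OUTDIR, SAMPLES, run_sampler and collect_all are illustrative names.

OUTDIR="${OUTDIR:-/var/tmp/slowdown-$(date +%Y%m%d-%H%M%S)}"
SAMPLES="${SAMPLES:-10}"    # number of one-second samples per tool

# run_sampler NAME CMD [ARGS...]
# Runs CMD and saves its output (stdout and stderr) to $OUTDIR/NAME.out.
# If CMD is not installed, a note is written to the file instead.
run_sampler() {
    name="$1"; shift
    mkdir -p "$OUTDIR" || return 1
    if command -v "$1" >/dev/null 2>&1; then
        "$@" >"$OUTDIR/$name.out" 2>&1
    else
        echo "skipped: $1 not found" >"$OUTDIR/$name.out"
    fi
}

# collect_all [LINK]
# Runs all samplers at the same time so their outputs cover the same seconds.
collect_all() {
    link="$1"
    net_pid=""
    mkdir -p "$OUTDIR" || return 1

    # dladm with -i prints until interrupted, so it is started in the
    # background here and stopped by hand once the other tools finish.
    if [ -n "$link" ] && command -v dladm >/dev/null 2>&1; then
        dladm show-link -s -i 1 "$link" >"$OUTDIR/net.out" 2>&1 &
        net_pid=$!
    fi

    run_sampler cpu  prstat 1 "$SAMPLES" &
    cpu_pid=$!
    run_sampler mem  vmstat 1 "$SAMPLES" &
    mem_pid=$!
    run_sampler disk iostat -xn 1 "$SAMPLES" &
    disk_pid=$!
    run_sampler messages tail -n 200 /var/adm/messages &
    log_pid=$!

    wait "$cpu_pid" "$mem_pid" "$disk_pid" "$log_pid"

    if [ -n "$net_pid" ]; then
        kill "$net_pid" 2>/dev/null
    fi
    echo "output saved in $OUTDIR"
}
```

After loading the functions into a shell, something like `collect_all net0` would run one capture (net0 is only a placeholder link name; `dladm show-link` lists the real ones). Running it once when the system is healthy and once during a slowdown gives two directories to compare side by side.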
