Hi David,
I have a theory. But before I present it, I would like a bit more
information.
- Output of "/usr/sbin/prtdiag"
- If it is a Sun Fire, the output of "cfgadm -av"
Thanks,
Sherry
On Wed, Aug 31, 2005 at 02:13:47PM -0700, David McDaniel wrote:
> Our application consists of multiple cooperating multithreaded processes. The
> application is both latency and throughput sensitive. Since it originated
> long ago, several artifacts are less than optimal, but that's the way it is
> for a while longer. Anyway, I digress. Most threads run in the TS class with
> boosted "nice" values so as to limit the possible interference from the
> occasional background task; the exceptions are a few very lightweight,
> infrequent, but urgent ones that run in the RT class. Additionally, hires
> tick is set. As a result, the default rechoose_interval period is 3ms rather
> than the normal 30ms.
> The curious thing is that on a 12 cpu system I can observe that some cpus
> are much busier than others, and latency as observed via prstat -Lm is higher
> than expected on a lightly loaded system. I presume this is an artifact of
> threads queueing up for rechoose_interval on the last cpu they ran on instead
> of migrating. This seems to be borne out by the fact that I can use psrset to
> create a set containing one of the otherwise idle cpus, bind a process to it,
> then delete the processor set and see that the previously bound process
> appears to stick on the previously idle cpu. OK, so far, but the other
> processes still seem to be contending for busy cpus, which is suboptimal for
> our application.
> Now comes the real puzzler, to me at least. I set rechoose_interval=0 in
> /etc/system, rebooted, and took it from the top. I thought this would result
> in the load being spread out over time as threads migrated to and then stuck
> to uncontended cpus, but that's not what I see. Here is an mpstat snapshot:
> CPU minf mjf xcal intr ithr  csw icsw migr smtx srw syscl usr sys wt idl
>   0   14   0  211 2976 1957 2245   84  375  156   0 20874  17   9  0  74
>   1    0   0  149   98    2 2612   89  583   73   0 19032  16  11  0  73
>   2   12   0  184   86    6 2523   76  589   76   0 17215  13   9  0  77
>   3    0   0   96  650  581 2387   64  530   85   0 13249  11   7  0  82
>   8   56   0   11    6    1  581    2  227   25   0  1401   2   2  0  97
>   9    0   0    6    4    1  550    0  111    8   0   398   1   1  0  98
>  10    5   0    8   28   25  546    0   44   14   0   165   0   1  0  99
>  11    0   0   16  390  388  219    0   23   18   0    75   0   1  0  99
>  16   52   0   13   10    7  223    1   22    5   0   212   0   1  0  99
>  17    0   0    5    4    1  322    0   34    5   0   525   0   1  0  99
>  18    0   0   15    5    1  319    1   86   11   0  1558   1   1  0  98
>  19    1   0   50    8    1  552    4  192   22   0  4406   4   2  0  94
>
> Any thoughts?
> This message posted from opensolaris.org
> _______________________________________________
> perf-discuss mailing list
> [email protected]
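
To put a number on the imbalance in the quoted mpstat snapshot, here is a
minimal Python sketch. It assumes "busy" means usr + sys for each CPU, and
the helper names parse_mpstat and busy_by_cpu are made up for illustration.
It only summarizes the figures David posted; it does not explain them.

```python
# Rough summary of the mpstat snapshot quoted in this thread.
# Assumption: "busy" is usr + sys; the helper names are illustrative only.

MPSTAT = """\
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 14 0 211 2976 1957 2245 84 375 156 0 20874 17 9 0 74
1 0 0 149 98 2 2612 89 583 73 0 19032 16 11 0 73
2 12 0 184 86 6 2523 76 589 76 0 17215 13 9 0 77
3 0 0 96 650 581 2387 64 530 85 0 13249 11 7 0 82
8 56 0 11 6 1 581 2 227 25 0 1401 2 2 0 97
9 0 0 6 4 1 550 0 111 8 0 398 1 1 0 98
10 5 0 8 28 25 546 0 44 14 0 165 0 1 0 99
11 0 0 16 390 388 219 0 23 18 0 75 0 1 0 99
16 52 0 13 10 7 223 1 22 5 0 212 0 1 0 99
17 0 0 5 4 1 322 0 34 5 0 525 0 1 0 99
18 0 0 15 5 1 319 1 86 11 0 1558 1 1 0 98
19 1 0 50 8 1 552 4 192 22 0 4406 4 2 0 94
"""


def parse_mpstat(text):
    """Turn whitespace-separated mpstat output into a list of dicts of ints."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, map(int, row))) for row in rows]


def busy_by_cpu(rows):
    """Map CPU id to usr + sys percent."""
    return {row["CPU"]: row["usr"] + row["sys"] for row in rows}


if __name__ == "__main__":
    busy = busy_by_cpu(parse_mpstat(MPSTAT))
    total = sum(busy.values())
    # Print the busiest CPUs first, with each CPU's share of all busy time.
    for cpu, pct in sorted(busy.items(), key=lambda kv: -kv[1]):
        share = 100.0 * pct / total
        print(f"cpu {cpu:2d}: busy {pct:2d}%  ({share:4.1f}% of busy time)")
```

Pasting a fresh snapshot into MPSTAT (header line plus one line per CPU)
gives the same summary for another sample.
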
--
[EMAIL PROTECTED], Solaris Kernel Development, http://blogs.sun.com/sherrym