I just realised the size of this box.

Doing the mdb ::memstat stuff might be ill-advised...

Check it manually first to see what sort of impact it has on the box, as 
it could well tie up a whole core for up to a minute...
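
For what it's worth, a rough sketch of what "check it manually first" could look like. The memstat_once name and the MDB override are made up for illustration, and the 60-second interval and log path are just guesses at something gentle enough for a box this size:

```shell
#!/bin/sh
# Sketch only: memstat_once is a hypothetical helper name, not an existing tool.
# MDB can be overridden (e.g. for a dry run); it defaults to the real mdb.
MDB=${MDB:-mdb}

# Run a single ::memstat against the live kernel and print its output.
memstat_once() {
    echo ::memstat | "$MDB" -k
}

# Manual check, run by hand as root in the global zone:
#   ptime sh -c 'echo ::memstat | mdb -k'
# If that returns quickly enough, a slow timestamped loop is one option:
#   while sleep 60; do date; memstat_once; done >> /var/tmp/memstat.log
```

Only start the loop once the one-off run has shown how long a single pass takes.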

Nathan.

Nathan Kroenert wrote:
> Hey Peter,
> 
> Indeed, the vmstat output is very unhelpful in this context...
> 
> There are a few really simple things you can do, though.
> 
> On the global zone, kick off a prstat -s size and a prstat -s rss
> 
> get them logging to files, and interrogate them when things go belly up.
> 
> Now: If the memory grab happens in a few seconds, that might be 
> problematic, as the default interval of prstat is 5 seconds. It might 
> happen and disappear before you even get to see it. If this is the case, 
> you could (at the expense of some CPU) have prstat's interval set to 1s.
> 
> But, I'd suspect this will be enough to help you see it.
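
A minimal sketch of the prstat logging described above, taking one timestamped snapshot per call instead of leaving a long-running prstat going. The prstat_snapshot name, the PRSTAT override, the 20-process cut-off and the log paths are all assumptions for illustration:

```shell
#!/bin/sh
# Sketch only: prstat_snapshot is a hypothetical helper, not an existing tool.
# PRSTAT can be overridden for a dry run; it defaults to the real prstat.
PRSTAT=${PRSTAT:-prstat}

# Append one timestamped prstat report to a log file.
#   $1 = sort key (size or rss), $2 = log file
prstat_snapshot() {
    key=$1
    logfile=$2
    {
        date '+%Y/%m/%d %H:%M:%S'
        # -c: print the report as plain lines instead of redrawing the screen
        # -s: sort by the given key, -n 20: top 20 processes
        # "1 1": one-second interval, a single report
        "$PRSTAT" -c -s "$key" -n 20 1 1
    } >> "$logfile"
}

# Example loop, to be run in the global zone:
#   while sleep 5; do
#       prstat_snapshot size /var/tmp/prstat-size.log
#       prstat_snapshot rss  /var/tmp/prstat-rss.log
#   done
```

The date line ahead of each report makes it easier to line the logs up against the vmstat timestamps afterwards.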
> 
> Other than this, you have some other options using DTrace, but it's kind 
> of like swatting a fly with a bus... And it'll be very much non-trivial. 
> ;)  Think per brk() operation tracing, buffer sizes and all that sort of 
> stuff... Nasty.
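
As a much cruder starting point than full per-brk() size tracing, something like the following only counts brk() calls per process over a short window. It is a sketch under assumptions: brk_counts and the DTRACE override are invented names, and it says nothing about memory obtained by other means (mmap, shared memory), which may well be what Oracle is using:

```shell
#!/bin/sh
# Sketch only: brk_counts is a hypothetical helper, not an existing tool.
# DTRACE can be overridden for a dry run; it defaults to the real dtrace.
DTRACE=${DTRACE:-dtrace}

# Count brk() system calls per pid/execname for N seconds (default 10),
# then exit; dtrace prints the aggregation when the script exits.
brk_counts() {
    secs=${1:-10}
    "$DTRACE" -q -n "
        syscall::brk:entry { @calls[pid, execname] = count(); }
        tick-${secs}s { exit(0); }"
}

# Example, as root in the global zone:
#   brk_counts 30
```

A process that suddenly tops this list during one of the bad minutes would at least be a candidate worth a closer look.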
> 
> Another thing to consider (just in case the memory utilisation is 
> NOT in userspace) is to kick off something like:
> 
> while sleep 5
> do
>     echo ::memstat
> done | mdb -k
> 
> or something to that effect, so you can see if it is indeed kernel (and, 
> if you have a late enough kernel patch, if it's ZFS's ARC).
> 
> Hope this helps!
> 
> Nathan.
> 
> Peter Keating wrote:
>> All,
>>    I am trying to isolate a performance issue and was wondering if 
>> anyone could help.  Below is some "vmstat -p" output from an M4000 
>> (64GB, 4x SPARC64 VII, S10_u7 - 139555-08) running 5 local zones (each 
>> running Oracle, one with grid control and one with SAP).  It appears 
>> that at various times, something suddenly increases its real memory 
>> requirements.  When this occurs the system is somewhat unresponsive, 
>> so isolating the process(es) is problematic; it's often over before we 
>> hear about it.
>>  
>> I was thinking that DTrace (or something else) might be able to 
>> identify the process.
>>  
>> Any ideas?
>> Peter
>>  
>>  
>>  
>> #date       time interval     swap    free   re    mf    fr      de    sr epi epo epf  api   apo   apf fpi fpo  fpf
>> 2010/03/04 14:08       60 17794448 1577664 1071  4927     1       0     0   0   0   0  134     0     0   0   1    1
>> 2010/03/04 14:09       60 18106176 1909744 1988 11769     2       0     0   0   0   0  204     0     0   0   2    2
>> 2010/03/04 14:10       60 17775872 1462440 1686  9215     1       0     0   0   0   0   46     0     0   0   1    1
>> 2010/03/04 14:11       60 17764936 1439960 1312  6824     0       0     0   0   0   0   39     0     0   0   0    0
>> 2010/03/04 14:12       60 18185640 1816304 1396  8137     1       0     0   0   0   0  332     0     0   0   1    1
>> 2010/03/04 14:13       60 17528096 1197368 1336  7575     0    3728     0   0   0   0  106     0     0   1   0    0
>> 2010/03/04 14:14       60 17622328 1280640 1728 13445  4347 1026760  8174   0   0   1  167  3571  3840   0   5  506
>> 2010/03/04 14:15       60 16889408 1282600 1427  9631  3393   13032  4778   1   0   1  225  2948  3128   0   0  264
>> 2010/03/04 14:16       60 17258256 1445952 1038  5746     1       0     0   1   0   0   36     0     0   0   1    1
>> 2010/03/04 14:17       60 17065032 1330008  673  4888 14682 1026760 63296   0   0  48  222  9858 11364 247   6 3270
>> 2010/03/04 14:18       60 15389112 1688640  324  1659 30164    2424  8792   2   0  12   20 29690 29415   1   2  737
>> 2010/03/04 14:19       60 14553768 1703528  275  1432  1065       0     0   8   0   0  104  1380  1065   2   0    0
>> 2010/03/04 14:20       60 16028776 3364696 2439  7346 11422       0     0  27   0   0  863 13294 11421  15   1    1
>> 2010/03/04 14:21       60 16672000 3968296 1944  6821     0       0     0  14   0   0 2081     0     0   3   0    0
>> 2010/03/04 14:22       60 16201328 3181632 1746  9441     1       0     0   1   0   0 1540     0     0   5   1    1
>> 2010/03/04 14:23       60 15626224 2233352 1332  6039     0       0     0   1   0   0  301     0     0   2   0    0
>> 2010/03/04 14:24       60 15253440 1808192 1398  5951     0       0     0  24   0   0  269     0     0  12   0    0
>> 2010/03/04 14:25       60 15241024 1777448 1472  5677     0       0     0   3   0   0  142     0     0   2   0    0
>> 2010/03/04 14:26       60 15280624 1785088 1102  5276     0       0     0   2   0   0  221     0     0   0   0    0
>> 2010/03/04 14:27       60 15208128 1683320 1483  6312     0       0     0   0   0   0  172     0     0   0   0    0
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> ug-msosug mailing list
>> ug-msosug at opensolaris.org
>> http://mail.opensolaris.org/mailman/listinfo/ug-msosug
