Re: [dtrace-discuss] LatencyTop

2008-01-21 Thread Colin Burgess
I don't think anyone believes that measuring latency is a bad idea.

Note Brendan's comments at http://osnews.com/permalink?296801

Encouraging customers to look at latencies for performance analysis is
really important.
...
If this tool does get customers to think more carefully about latency
metrics, then that will certainly be valuable. All roads lead to DTrace.

However, forcing people to patch their kernel to do it is not going to
attract neophytes, no matter how simple the user-space tool is.

That's the beauty of dynamic tracing systems.  You can answer questions that
the original designers of the OS didn't anticipate, or didn't have time to
think about, anyone asking.

Cheers,

Colin

Aubrey Li wrote: 

On Jan 19, 2008 10:30 AM, Roman Shaposhnik [EMAIL PROTECTED] wrote:
 On Fri, 2008-01-18 at 16:33 -0500, Colin Burgess wrote: 
  I see Intel has released a new tool.  Oh, it requires some patches to 
  the kernel to record 
  latency times.  Good thing people don't mind patching their kernels, eh?

  
  So who can write the equivalent latencytop.d the fastest? ;-) 
  
  http://www.latencytop.org/
 
 What I find interesting about these projects that Intel 
 spawns for Linux (PowerTOP, LatencyTOP and a couple of others)
 is that regardless of internal implementation they are 
 very useful end user tools. Here at Sun we seem to be 
 missing interest in creating such things. Which is a bit of a 
 shame. They are ideal vehicles for disseminating DTrace 
 knowledge and exposing neophytes to the raw power of DTrace. 
 
 To be fair, Gregg's DTraceToolkit helps in that respect, but
 still it sets the bar pretty high for anybody who would 
 like to use it. 
 
 It is easy to poke fun at LatencyTOP, but asking the right 
 question could sometimes be even more important than 
 being able to deliver the answer. 
 
 Just my 2c. 
 
 Thanks, 
 Roman. 
 
 P.S. I was able to extend the battery time of my Linux laptop 
 1.5x using PowerTOP. Can the same thing be done with DTrace? 
 Perhaps it can, but I don't think I can code it up. 

Solaris PowerTOP is almost done. 
http://www.opensolaris.org/os/project/tesla/Work/Powertop/

-Aubrey 
Intel OpenSolaris Team 


-- 

[EMAIL PROTECTED]
___
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org


Re: [dtrace-discuss] LatencyTop

2008-01-21 Thread Aubrey Li
On Jan 19, 2008 9:04 AM, Colin Burgess [EMAIL PROTECTED] wrote:
 Well, I see that Brendan did reply to the OSNews link to this.  He basically
 shot them down for hardcoding the instrumentation - as he should have! :-)

 Shame on Intel - they should know better!

 Colin

I'm not a member of the Linux LatencyTOP team, and I haven't had a chance
to see how this tool is implemented.
But I totally disagree that it's a shame.
Regardless of its internal implementation, it's interesting and very useful.
At the least, it helps end users visualize system latencies.

-Aubrey
Intel OpenSolaris Team


[dtrace-discuss] DTrace, Solaris 9, JVM

2008-01-21 Thread Z W
Hi Gurus

Quick question.
I have JVM 1.5 running on Solaris 9.
Can DTrace be used with such a configuration, or is DTrace only for Solaris
10?

Thanks

Re: [dtrace-discuss] DTrace, Solaris 9, JVM

2008-01-21 Thread Rayson Ho
On Jan 21, 2008 2:58 PM, Z W [EMAIL PROTECTED] wrote:
 Can DTrace be used with such a configuration or is DTrace only for Solaris
 10 ?

DTrace is only available in Solaris 10 and later; it isn't available for
Solaris 9.

Time to upgrade ;-)

Rayson





[dtrace-discuss] tcptop error: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name

2008-01-21 Thread Fletcher Cocquyt
Hi, I am trying to debug the bottleneck(s) in a Solaris 10
Mailman/SpamAssassin/Sendmail VMware VM and get the following error from
tcptop:

[EMAIL PROTECTED]:~ 1:35pm 103 # ./tcptop
dtrace: failed to compile script /dev/fd/11: line 40: failed to resolve
SS_TCP_FAST_ACCEPT: Unknown variable name

thanks for any insight,
Fletcher.



Re: [dtrace-discuss] tcptop error: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name

2008-01-21 Thread Chris Yoder (kernel)
On Mon, Jan 21, 2008 at 01:39:03PM -0800, Fletcher Cocquyt wrote:
 Hi, I am trying to debug the bottleneck(s) in a Solaris 10
 Mailman/SpamAssassin/Sendmail VMware VM and get the following error from
 tcptop:
 
 [EMAIL PROTECTED]:~ 1:35pm 103 # ./tcptop
 dtrace: failed to compile script /dev/fd/11: line 40: failed to resolve
 SS_TCP_FAST_ACCEPT: Unknown variable name

From
http://mail.opensolaris.org/pipermail/dtrace-discuss/2006-January/001021.html

 I think SS_TCP_FAST_ACCEPT has been changed to SS_DIRECT
 by a recent putback, although I was unable to isolate it. (no time!)
 Changing my local copy of tcpsnoop.d to use SS_DIRECT fixed
 the problem for me.


Re: [dtrace-discuss] tcptop error: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name

2008-01-21 Thread Fletcher Cocquyt
Follow-up - this system has a lot of kernel activity and I/O - (top typically
shows CPU  50% kernel) - but hotkernel blorked with this (even though
load avg was only ~2 and the command line is responsive):

[EMAIL PROTECTED]:~ 1:41pm 114 # ./hotkernel 
Sampling... Hit Ctrl-C to end.
dtrace: processing aborted: Abort due to systemic unresponsiveness

FUNCTION                                                COUNT   PCNT

I'm working my way down the toolkit list - any help on pinpointing the
bottlenecks with the appropriate 1st pass tools appreciated.

Here is some iotop output - nothing surprising here - sendmail, spamd and
mailman (python) are generating I/O:

2008 Jan 21 13:49:54,  load: 1.35,  disk_r: 32 KB,  disk_w:   2424 KB

  UID    PID   PPID CMD          DEVICE  MAJ MIN D    BYTES
    0  13413  13412 sendmail     sd0      61   0 W     2048
    0  13411  13406 sendmail     sd0      61   0 W     4096
    0  13409  13370 sendmail     sd0      61   0 W     5120
    0      3      0 fsflush      sd0      61   0 W     8192
    0  13420      1 sendmail     sd0      61   0 W    22528
  555   3809   3140 spamd        sd0      61   0 R    32768
    0  13419    496 sendmail     sd0      61   0 W    41984
    0  13412    496 sendmail     sd0      61   0 W    44032
    0  13370    496 sendmail     sd0      61   0 W    50688
    0  13413      1 sendmail     sd0      61   0 W    51712
    0  13406    496 sendmail     sd0      61   0 W    71680
    0  13414    496 sendmail     sd0      61   0 W    96256
   35  24406  24400 python2.4    sd0      61   0 W   172032
    0      0      0 sched        sd0      61   0 W   318464
  555   3809   3140 spamd        sd0      61   0 W   405504
   35  24409  24400 python2.4    sd0      61   0 W  1006592
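
To see which commands dominate the sample above, the rows can be totalled
per command and direction. The Python sketch below does that for the rows
pasted from this message. It is only an illustration: the function name
bytes_by_command is hypothetical, and the parser assumes nine
whitespace-separated fields per row, so a command name containing spaces
would be skipped.

```python
from collections import defaultdict

# Rows copied from the iotop sample in this message (header line omitted).
SAMPLE = """\
    0  13413  13412 sendmail     sd0      61   0 W     2048
    0  13411  13406 sendmail     sd0      61   0 W     4096
    0  13409  13370 sendmail     sd0      61   0 W     5120
    0      3      0 fsflush      sd0      61   0 W     8192
    0  13420      1 sendmail     sd0      61   0 W    22528
  555   3809   3140 spamd        sd0      61   0 R    32768
    0  13419    496 sendmail     sd0      61   0 W    41984
    0  13412    496 sendmail     sd0      61   0 W    44032
    0  13370    496 sendmail     sd0      61   0 W    50688
    0  13413      1 sendmail     sd0      61   0 W    51712
    0  13406    496 sendmail     sd0      61   0 W    71680
    0  13414    496 sendmail     sd0      61   0 W    96256
   35  24406  24400 python2.4    sd0      61   0 W   172032
    0      0      0 sched        sd0      61   0 W   318464
  555   3809   3140 spamd        sd0      61   0 W   405504
   35  24409  24400 python2.4    sd0      61   0 W  1006592
"""


def bytes_by_command(text):
    """Sum the BYTES column per (command, direction) pair."""
    totals = defaultdict(int)
    for line in text.splitlines():
        fields = line.split()
        # Assumed layout: UID PID PPID CMD DEVICE MAJ MIN D BYTES
        if len(fields) != 9 or not fields[0].isdigit():
            continue  # skip headers, blank lines, and anything unexpected
        cmd, direction, nbytes = fields[3], fields[7], int(fields[8])
        totals[(cmd, direction)] += nbytes
    return dict(totals)


# Print the largest consumers first.
ranked = sorted(bytes_by_command(SAMPLE).items(), key=lambda kv: -kv[1])
for (cmd, direction), total in ranked:
    print(f"{cmd:<12} {direction} {total:>9}")
```

On this sample the python2.4 (Mailman) writes come out as the largest
single group, ahead of spamd and the combined sendmail writes.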

Ideally I'd like to know what the fixable (tunable) bottlenecks are on a
system that otherwise has plenty of CPU and memory available

Thanks




Re: [dtrace-discuss] tcptop error: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name

2008-01-21 Thread Brendan Gregg - Sun Microsystems
G'Day,

On Mon, Jan 21, 2008 at 01:39:03PM -0800, Fletcher Cocquyt wrote:
 Hi, I am trying to debug the bottleneck(s) in a Solaris 10
 Mailman/SpamAssassin/Sendmail VMware VM and get the following error from
 tcptop:
 
 [EMAIL PROTECTED]:~ 1:35pm 103 # ./tcptop
 dtrace: failed to compile script /dev/fd/11: line 40: failed to resolve
 SS_TCP_FAST_ACCEPT: Unknown variable name

Try a newer version of tcptop.  The latest has:

# grep -n SS_TCP_FAST_ACCEPT tcptop
66:# 20-Apr-2006 Fixed SS_TCP_FAST_ACCEPT bug in build 31+.
157: * 0x0020 has been hardcoded. It was SS_TCP_FAST_ACCEPT, but was

Latest version of the toolkit (0.99) here:
http://www.brendangregg.com/dtrace.html#DTraceToolkit

Brendan

-- 
Brendan
[CA, USA]


Re: [dtrace-discuss] tcptop error: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name

2008-01-21 Thread Brendan Gregg - Sun Microsystems
On Mon, Jan 21, 2008 at 01:55:36PM -0800, Fletcher Cocquyt wrote:
 Followup - this system has a lot of kernel activity and I/O - (top typically
 shows CPU  50% kernel) - but the hotkernel blorked with this (even though
 load avg was only ~2 and command line is responsive):
 
 [EMAIL PROTECTED]:~ 1:41pm 114 # ./hotkernel 
 Sampling... Hit Ctrl-C to end.
 dtrace: processing aborted: Abort due to systemic unresponsiveness

The system is so busy DTrace has decided to play it safe and abort...

Based on a few hunches, try these:

- interstat 1
  look for a network driver burning CPU

- pidpersec.d from the DTraceToolkit
  (or sar -c 1 100 if DTrace won't behave)
  look for lots of short lived processes

- procsystime -coT from the DTraceToolkit
  look for frequent syscalls burning CPU time

- dtrace -n 'profile-101 { @[stack(5)] = count(); }'
  (this has a slower profile rate than hotuser)
  look for hot kernel stacks

Brendan

-- 
Brendan
[CA, USA]


Re: [dtrace-discuss] tcptop error: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name

2008-01-21 Thread Brendan Gregg - Sun Microsystems
On Mon, Jan 21, 2008 at 02:17:46PM -0800, Fletcher Cocquyt wrote:
 Replaced SS_TCP_FAST_ACCEPT with SS_DIRECT in tcptop per the thread you
 cited - now I get a new error:
 
 [EMAIL PROTECTED]:~ 2:14pm 133 # ./tcptop
 dtrace: failed to compile script /dev/fd/11: line 163: failed to resolve
 `tcp_g_q: Unknown symbol name
 
 I got it from here:
 http://www.brendangregg.com/DTrace/tcptop
 is that not up to date?

Sorry about that - I've kept the DTraceToolkit bundle up to date, but not
individual copies of those scripts in other locations.  I'll either update
that copy, or link it to the DTraceToolkit bundle when I get a chance.

Stefan Parvu has an up to date HTML browsable version of the toolkit here:

http://www.nbl.fi/~nbl97/solaris/dtrace/dtt_testing.html

Click on 0.99.

Brendan

-- 
Brendan
[CA, USA]


Re: [dtrace-discuss] tcptop error: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name

2008-01-21 Thread Brendan Gregg - Sun Microsystems
On Mon, Jan 21, 2008 at 02:16:02PM -0800, Brendan Gregg - Sun Microsystems wrote:
 On Mon, Jan 21, 2008 at 01:55:36PM -0800, Fletcher Cocquyt wrote:
  Followup - this system has a lot of kernel activity and I/O - (top typically
  shows CPU  50% kernel) - but the hotkernel blorked with this (even though
  load avg was only ~2 and command line is responsive):
  
  [EMAIL PROTECTED]:~ 1:41pm 114 # ./hotkernel 
  Sampling... Hit Ctrl-C to end.
  dtrace: processing aborted: Abort due to systemic unresponsiveness
 
 The system is so busy DTrace has decided to play it safe and abort...
 
 Based on a few hunches, try these:
 
 - interstat 1
   look for a network driver burning CPU
 
Sorry (typo) - intrstat

Brendan

-- 
Brendan
[CA, USA]


Re: [dtrace-discuss] tcptop error: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name

2008-01-21 Thread Brendan Gregg - Sun Microsystems
On Mon, Jan 21, 2008 at 02:48:47PM -0800, Fletcher Cocquyt wrote:
 Forgive me, where do I find 'interstat' ?
 
 Also, where can I get Sun::Solaris::Kstat for prustat?

It's probably already under /usr/perl5/5.8.4/lib - it's a vendor (Sun)
supplied package.

prustat was written as a demo tool - it might be useful, but it will probably
fail due to a kernel change.  I wrote it when I was a customer to make a
point to Sun that this is the sort of tool that customers would like.
It turns out that supporting this tool for customers would require stable
network providers for DTrace, a project that is still in progress.

I never put prustat into the DTraceToolkit because it wasn't stable enough,
despite it providing key resource utilisations by process (which is wonderful,
and made possible by DTrace).  If anyone hasn't seen it, it looks like this:

# prustat -ct 20 5
   PID   %CPU   %Mem  %Disk   %Net  COMM
 22301  78.84   3.16   0.00   0.00  setiathome
 22635   4.09   0.20  69.11   0.00  tar   
   440   2.76  45.39   0.00   0.00  Xsun   
  2618   0.31  14.34   0.00   0.00  mozilla-bin 
 22640   3.87   1.49   0.12   0.00  dtrace   
   582   2.04   2.16   0.00   0.00  gnome-terminal
   576   0.02   2.80   0.00   0.00  nautilus  
  2299   0.33   1.99   0.00   0.00  acroread   
 22641   0.00   0.00   1.84   0.00  upsmonitor  
   578   0.37   1.46   0.00   0.00  gnome-panel  
   574   0.41   1.31   0.00   0.00  metacity  
  6504   0.00   1.23   0.00   0.00  nautilus-throbb
   593   0.04   1.05   0.00   0.00  mixer_applet2
   556   0.00   1.05   0.00   0.00  gconfd-2  
   549   0.00   0.94   0.00   0.00  gnome-session 
  6510   0.00   0.93   0.00   0.00  nautilus-text-v
   591   0.02   0.83   0.00   0.00  galf-server
 21551   0.00   0.56   0.00   0.00  dtterm
  4789   0.10   0.45   0.00   0.00  vncviewer
   553   0.00   0.43   0.00   0.00  gnome-volcheck

The screen updates like William LeFebvre's top.

Let me stress again - prustat was written to demonstrate an idea, but is
currently unstable as a tool.

Brendan

-- 
Brendan
[CA, USA]


Re: [dtrace-discuss] tcptop error: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name

2008-01-21 Thread Fletcher Cocquyt
Thanks - prustat works great once I point it at Sun's perl (I was using a
newer install).
I'm going to record some snapshots when the contention is happening...

What if I wanted to quantify the latency (wait times) due to DNS lookups? I
suspect I could benefit from a local caching install, but I want a
"before" picture so I can show how much better it is using a local DNS
cache...
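
One rough way to get such a before-and-after picture is to time resolver
calls from the mail host itself and bucket them into power-of-two ranges,
the way DTrace's quantize() does. The Python sketch below is only an
illustration under stated assumptions: the function names (bucket,
time_lookups, report) are made up for this example, the host list is
whatever the caller supplies, and it measures lookups made by this script
rather than those made by sendmail or spamd. Failed lookups are timed too,
since a timeout also blocks the caller.

```python
import socket
import time
from collections import Counter


def bucket(ns):
    """Return the power-of-two lower bound for a latency in nanoseconds."""
    return 0 if ns < 1 else 1 << (int(ns).bit_length() - 1)


def time_lookups(hosts, resolve=socket.gethostbyname,
                 clock=time.perf_counter_ns):
    """Time one lookup per host and return a Counter of latency buckets.

    resolve and clock are parameters so the timing logic can be exercised
    without touching the network.
    """
    hist = Counter()
    for host in hosts:
        start = clock()
        try:
            resolve(host)
        except OSError:
            pass  # socket.gaierror and socket.herror are OSError subclasses
        hist[bucket(clock() - start)] += 1
    return hist


def report(hist):
    """Format the histogram as text, smallest bucket first."""
    lines = []
    for low in sorted(hist):
        lines.append(f"{low:>12} ns | {'@' * hist[low]} {hist[low]}")
    return "\n".join(lines)
```

Calling print(report(time_lookups(hosts))) with the same host list before
and after installing a local cache gives two histograms to compare. Answers
served from a name-service cache such as nscd will look faster than lookups
that reach a remote DNS server, so repeated runs against the same names may
understate the uncached latency.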

Thanks,
Fletcher
