Re: [dtrace-discuss] LatencyTop
I don't think anyone disagrees that measuring latency is a good idea. Note Brendan's comments in http://osnews.com/permalink?296801:

> Encouraging customers to look at latencies for performance analysis is really important. ... If this tool does get customers to think more carefully about latency metrics, then that will certainly be valuable.

All roads lead to DTrace. However, forcing people to patch their kernel to do it is not going to attract neophytes, no matter how simple the user-space tool is.

That's the beauty of dynamic tracing systems. You can answer questions that the original designers of the OS didn't anticipate (or didn't have time to think about) that you would ask.

Cheers,
Colin

Aubrey Li wrote:
> On Jan 19, 2008 10:30 AM, Roman Shaposhnik [EMAIL PROTECTED] wrote:
>> On Fri, 2008-01-18 at 16:33 -0500, Colin Burgess wrote:
>>> I see Intel has released a new tool. Oh, it requires some patches to the kernel to record latency times. Good thing people don't mind patching their kernels, eh?
>>>
>>> So who can write the equivalent latencytop.d the fastest? ;-)
>>>
>>> http://www.latencytop.org/
>>
>> What I find interesting about these projects that Intel spawns for Linux (PowerTOP, LatencyTOP and a couple of others) is that regardless of internal implementation they are very useful end-user tools. Here at Sun we seem to be missing interest in creating such things. Which is a bit of a shame. They are ideal vehicles for disseminating DTrace knowledge and exposing neophytes to the raw power of DTrace.
>>
>> To be fair, Gregg's DTraceToolkit helps in that respect, but it still sets the bar pretty high for anybody who would like to use it.
>>
>> It is easy to poke fun at LatencyTOP, but asking the right question could sometimes be even more important than being able to deliver the answer.
>>
>> Just my 2c.
>>
>> Thanks,
>> Roman.
>>
>> P.S. I was able to extend the battery time of my Linux laptop 1.5x using PowerTOP. Can the same thing be done with DTrace? Perhaps it can, but I don't think I can code it up.
>
> Solaris PowerTOP is almost done.
> http://www.opensolaris.org/os/project/tesla/Work/Powertop/
>
> -Aubrey
> Intel OpenSolaris Team

--
[EMAIL PROTECTED]
___
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org
Re: [dtrace-discuss] LatencyTop
On Jan 19, 2008 9:04 AM, Colin Burgess [EMAIL PROTECTED] wrote:
> Well I see that Brendan did reply to the OSNews link to this. He basically shot them down for hardcoding the instrumentation - as he should have! :-)
>
> Shame on Intel - they should know better!
>
> Colin

I'm not a member of the Linux LatencyTOP team, and I haven't had a chance to see how this tool is implemented. But I totally disagree that it's a shame. Regardless of its internal implementation, it is interesting and very useful. At the very least, it helps end users visualize system latencies.

-Aubrey
Intel OpenSolaris Team
[dtrace-discuss] DTrace, Solaris 9, JVM
Hi Gurus,

Quick question: I have JVM 1.5 running on Solaris 9. Can DTrace be used with such a configuration, or is DTrace only for Solaris 10?

Thanks
Re: [dtrace-discuss] DTrace, Solaris 9, JVM
On Jan 21, 2008 2:58 PM, Z W [EMAIL PROTECTED] wrote:
> Can DTrace be used with such a configuration or is DTrace only for Solaris 10 ?

DTrace is only for Solaris 10. Time to upgrade ;-)

Rayson

> Thanks
[dtrace-discuss] tcptop error: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name
Hi,

I am trying to debug the bottleneck(s) in a Solaris 10 Mailman/SpamAssassin/Sendmail VMware VM and get the following error from tcptop:

    [EMAIL PROTECTED]:~ 1:35pm 103 # ./tcptop
    dtrace: failed to compile script /dev/fd/11: line 40: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name

Thanks for any insight,
Fletcher.
Re: [dtrace-discuss] tcptop error: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name
On Mon, Jan 21, 2008 at 01:39:03PM -0800, Fletcher Cocquyt wrote:
> Hi, I am trying to debug the bottleneck(s) in a Solaris 10 Mailman/Spamassassin/Sendmail VMWare VM and get the following error from tcptop:
>
> [EMAIL PROTECTED]:~ 1:35pm 103 # ./tcptop
> dtrace: failed to compile script /dev/fd/11: line 40: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name

From http://mail.opensolaris.org/pipermail/dtrace-discuss/2006-January/001021.html:

> I think SS_TCP_FAST_ACCEPT has been changed to SS_DIRECT by a recent putback, although I was unable to isolate it. (no time!) Changing my local copy of tcpsnoop.d to use SS_DIRECT fixed the problem for me.
Re: [dtrace-discuss] tcptop error: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name
Followup - this system has a lot of kernel activity and I/O (top typically shows CPU 50% kernel), but hotkernel blorked with this, even though the load average was only ~2 and the command line is responsive:

    [EMAIL PROTECTED]:~ 1:41pm 114 # ./hotkernel
    Sampling... Hit Ctrl-C to end.
    dtrace: processing aborted: Abort due to systemic unresponsiveness
    FUNCTION                                        COUNT   PCNT

I'm working my way down the toolkit list - any help on pinpointing the bottlenecks with the appropriate first-pass tools is appreciated.

Here is some iotop output. Nothing surprising here: sendmail, spamd and mailman (python) are generating I/O:

    2008 Jan 21 13:49:54,  load: 1.35,  disk_r:     32 KB,  disk_w:   2424 KB

      UID    PID   PPID CMD        DEVICE  MAJ MIN D     BYTES
        0  13413  13412 sendmail   sd0      61   0 W      2048
        0  13411  13406 sendmail   sd0      61   0 W      4096
        0  13409  13370 sendmail   sd0      61   0 W      5120
        0      3      0 fsflush    sd0      61   0 W      8192
        0  13420      1 sendmail   sd0      61   0 W     22528
      555   3809   3140 spamd      sd0      61   0 R     32768
        0  13419    496 sendmail   sd0      61   0 W     41984
        0  13412    496 sendmail   sd0      61   0 W     44032
        0  13370    496 sendmail   sd0      61   0 W     50688
        0  13413      1 sendmail   sd0      61   0 W     51712
        0  13406    496 sendmail   sd0      61   0 W     71680
        0  13414    496 sendmail   sd0      61   0 W     96256
       35  24406  24400 python2.4  sd0      61   0 W    172032
        0      0      0 sched      sd0      61   0 W    318464
      555   3809   3140 spamd      sd0      61   0 W    405504
       35  24409  24400 python2.4  sd0      61   0 W   1006592

Ideally I'd like to know what the fixable (tunable) bottlenecks are on a system that otherwise has plenty of CPU and memory available.

Thanks
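One quick way to read the iotop sample in the message above is to total the bytes per command and direction. The following is a minimal Python sketch, assuming the rows are whitespace-separated in the order UID PID PPID CMD DEVICE MAJ MIN D BYTES; the function name `bytes_by_command` is made up for illustration, and only a handful of rows from the sample are included.

```python
from collections import defaultdict

# A few rows copied from the iotop sample in this thread, columns separated.
SAMPLE = """\
UID PID PPID CMD DEVICE MAJ MIN D BYTES
0 13413 13412 sendmail sd0 61 0 W 2048
555 3809 3140 spamd sd0 61 0 R 32768
35 24406 24400 python2.4 sd0 61 0 W 172032
0 0 0 sched sd0 61 0 W 318464
555 3809 3140 spamd sd0 61 0 W 405504
35 24409 24400 python2.4 sd0 61 0 W 1006592
"""


def bytes_by_command(text):
    """Sum the BYTES column per (CMD, direction) pair."""
    totals = defaultdict(int)
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, blank lines, and anything not shaped like a data row.
        if len(fields) != 9 or not fields[0].isdigit():
            continue
        cmd, direction, nbytes = fields[3], fields[7], int(fields[8])
        totals[(cmd, direction)] += nbytes
    return dict(totals)


if __name__ == "__main__":
    ranked = sorted(bytes_by_command(SAMPLE).items(),
                    key=lambda item: item[1], reverse=True)
    for (cmd, direction), total in ranked:
        print(f"{cmd:<12} {direction} {total:>10}")
```

Grouping by command rather than by PID makes it easier to see which service dominates the writes when many short-lived sendmail processes each contribute a little.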
Re: [dtrace-discuss] tcptop error: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name
G'Day,

On Mon, Jan 21, 2008 at 01:39:03PM -0800, Fletcher Cocquyt wrote:
> Hi, I am trying to debug the bottleneck(s) in a Solaris 10 Mailman/Spamassassin/Sendmail VMWare VM and get the following error from tcptop:
>
> [EMAIL PROTECTED]:~ 1:35pm 103 # ./tcptop
> dtrace: failed to compile script /dev/fd/11: line 40: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name

Try a newer version of tcptop. The latest has:

    # grep -n SS_TCP_FAST_ACCEPT tcptop
    66:# 20-Apr-2006 Fixed SS_TCP_FAST_ACCEPT bug in build 31+.
    157: * 0x0020 has been hardcoded. It was SS_TCP_FAST_ACCEPT, but was

Latest version of the toolkit (0.99) here:

http://www.brendangregg.com/dtrace.html#DTraceToolkit

Brendan

--
Brendan [CA, USA]
Re: [dtrace-discuss] tcptop error: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name
On Mon, Jan 21, 2008 at 01:55:36PM -0800, Fletcher Cocquyt wrote:
> Followup - this system has a lot of kernel activity and I/O - (top typically shows CPU 50% kernel) - but the hotkernel blorked with this (even though load avg was only ~2 and command line is responsive):
>
> [EMAIL PROTECTED]:~ 1:41pm 114 # ./hotkernel
> Sampling... Hit Ctrl-C to end.
> dtrace: processing aborted: Abort due to systemic unresponsiveness

The system is so busy DTrace has decided to play it safe and abort...

Based on a few hunches, try these:

- interstat 1
  look for a network driver burning CPU
- pidpersec.d from the DTraceToolkit (or sar -c 1 100 if DTrace won't behave)
  look for lots of short lived processes
- procsystime -coT from the DTraceToolkit
  look for frequent syscalls burning CPU time
- dtrace -n 'profile-101 { @[stack(5)] = count(); }'
  (this has a slower profile rate than hotuser)
  look for hot kernel stacks

Brendan

--
Brendan [CA, USA]
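For readers unfamiliar with the last one-liner in the list above: the profile-101 probe fires 101 times per second on each CPU, stack(5) records up to five kernel stack frames, and the aggregation counts how often each distinct stack was seen. The Python sketch below only imitates that counting step to show what the resulting table means; `count_stacks` is a made-up name and the frame names are invented placeholders, not real kernel functions.

```python
from collections import Counter


def count_stacks(samples, depth=5):
    """Count identical stacks after truncating each to its top `depth` frames."""
    return Counter(tuple(stack[:depth]) for stack in samples)


# Invented samples: each list is one sampled stack, innermost frame first.
samples = [
    ["func_a", "func_b", "func_c"],
    ["func_a", "func_b", "func_c"],
    ["func_d", "func_e"],
    ["func_a", "func_b", "func_c", "func_f", "func_g", "func_h"],
    ["func_a", "func_b", "func_c", "func_f", "func_g", "func_i"],
]

if __name__ == "__main__":
    for stack, hits in count_stacks(samples).most_common():
        print(hits, " <- ".join(stack))
```

Note how the last two samples differ only in their sixth frame, so with a depth of five they collapse into one entry. A shallow depth keeps the output short at the cost of merging distinct callers.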
Re: [dtrace-discuss] tcptop error: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name
On Mon, Jan 21, 2008 at 02:17:46PM -0800, Fletcher Cocquyt wrote:
> Replaced SS_TCP_FAST_ACCEPT with SS_DIRECT in tcptop per the thread you cited - now I get a new error:
>
> [EMAIL PROTECTED]:~ 2:14pm 133 # ./tcptop
> dtrace: failed to compile script /dev/fd/11: line 163: failed to resolve `tcp_g_q: Unknown symbol name
>
> I got it from here: http://www.brendangregg.com/DTrace/tcptop
> is that not up to date?

Sorry about that - I've kept the DTraceToolkit bundle up to date, but not individual copies of those scripts in other locations. I'll either update that copy, or link it to the DTraceToolkit bundle when I get a chance.

Stefan Parvu has an up to date HTML browsable version of the toolkit here:

http://www.nbl.fi/~nbl97/solaris/dtrace/dtt_testing.html

Click on 0.99.

Brendan

--
Brendan [CA, USA]
Re: [dtrace-discuss] tcptop error: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name
On Mon, Jan 21, 2008 at 02:16:02PM -0800, Brendan Gregg - Sun Microsystems wrote:
> On Mon, Jan 21, 2008 at 01:55:36PM -0800, Fletcher Cocquyt wrote:
>> Followup - this system has a lot of kernel activity and I/O - (top typically shows CPU 50% kernel) - but the hotkernel blorked with this (even though load avg was only ~2 and command line is responsive):
>>
>> [EMAIL PROTECTED]:~ 1:41pm 114 # ./hotkernel
>> Sampling... Hit Ctrl-C to end.
>> dtrace: processing aborted: Abort due to systemic unresponsiveness
>
> The system is so busy DTrace has decided to play it safe and abort...
>
> Based on a few hunches, try these:
>
> - interstat 1
>   look for a network driver burning CPU

Sorry (typo) - intrstat

Brendan

--
Brendan [CA, USA]
Re: [dtrace-discuss] tcptop error: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name
On Mon, Jan 21, 2008 at 02:48:47PM -0800, Fletcher Cocquyt wrote:
> Forgive me, where do I find 'interstat' ?
> Also, where can I get Sun::Solaris::Kstat for prustat?

It's probably already under /usr/perl5/5.8.4/lib - it's a vendor (Sun) supplied package.

prustat was written as a demo tool - it might be useful, but it will probably fail due to a kernel change. I wrote it when I was a customer to make a point to Sun that this is the sort of tool that customers would like.

It turns out that supporting this tool for customers would require stable network providers for DTrace, a project that is still in progress. I never put prustat into the DTraceToolkit because it wasn't stable enough, despite it providing key resource utilisations by process (which is wonderful, and made possible by DTrace).

If anyone hasn't seen it, it looks like this:

    # prustat -ct 20 5
      PID   %CPU   %Mem  %Disk   %Net  COMM
    22301  78.84   3.16   0.00   0.00  setiathome
    22635   4.09   0.20  69.11   0.00  tar
      440   2.76  45.39   0.00   0.00  Xsun
     2618   0.31  14.34   0.00   0.00  mozilla-bin
    22640   3.87   1.49   0.12   0.00  dtrace
      582   2.04   2.16   0.00   0.00  gnome-terminal
      576   0.02   2.80   0.00   0.00  nautilus
     2299   0.33   1.99   0.00   0.00  acroread
    22641   0.00   0.00   1.84   0.00  upsmonitor
      578   0.37   1.46   0.00   0.00  gnome-panel
      574   0.41   1.31   0.00   0.00  metacity
     6504   0.00   1.23   0.00   0.00  nautilus-throbb
      593   0.04   1.05   0.00   0.00  mixer_applet2
      556   0.00   1.05   0.00   0.00  gconfd-2
      549   0.00   0.94   0.00   0.00  gnome-session
     6510   0.00   0.93   0.00   0.00  nautilus-text-v
      591   0.02   0.83   0.00   0.00  galf-server
    21551   0.00   0.56   0.00   0.00  dtterm
     4789   0.10   0.45   0.00   0.00  vncviewer
      553   0.00   0.43   0.00   0.00  gnome-volcheck

The screen updates like William LeFebvre's top.

Let me stress again - prustat was written to demonstrate an idea, but is currently unstable as a tool.

Brendan

--
Brendan [CA, USA]
Re: [dtrace-discuss] tcptop error: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name
Thanks - prustat works great once I point it at Sun's perl (I was using a newer install).

I'm going to record some snapshots when the contention is happening...

What if I wanted to quantify the latency (wait times) due to DNS lookups? I suspect I could benefit from a local caching install, but I want a "before" picture so I can show how much better it is using a local DNS cache...

Thanks,
Fletcher
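For the "before" picture of DNS lookup latency asked about above, one low-tech option that needs no DTrace is to time name resolutions from a script, run it before and after installing the local cache, and compare the numbers. The sketch below is only an illustration: `time_lookups` and the host name are made up, and it measures lookups made by the script itself through `socket.getaddrinfo`, not the lookups that sendmail or spamd actually perform.

```python
import socket
import time


def time_lookups(hosts, resolver=socket.getaddrinfo):
    """Return {host: seconds} for one resolution attempt per host.

    Failed lookups are timed too, since timeouts are part of the latency.
    """
    timings = {}
    for host in hosts:
        start = time.perf_counter()
        try:
            resolver(host, None)
        except OSError:
            pass  # socket.gaierror is a subclass of OSError
        timings[host] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    # Placeholder name; substitute hosts your mail server really resolves.
    for host, seconds in time_lookups(["localhost"]).items():
        print(f"{host:<20} {seconds * 1000:8.2f} ms")
```

The resolver is passed in as a parameter so the timing loop can be exercised without network access. A single attempt per host is noisy, so in practice you would repeat the run several times and compare distributions rather than single values.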