Re: [dtrace-discuss] Memory leak scripts

2008-07-02 Thread Fletcher Cocquyt
Can you please provide a reference for disassembling malloc on Solaris 10?
I am also pursuing the previous suggestion of a Python provider - this one
seems to be against Python 2.5:
http://blogs.sun.com/binujp/resource/pydtrace/diffs

Thanks,
Fletcher

On 7/1/08 9:48 PM, Sanjeev Bagewadi [EMAIL PROTECTED] wrote:

 Hello Fletcher,
 
 From the error, it looks like DTrace is not able to recognize it as a probe.
 DTrace needs a signature for the function to be detected as a probe.
 Probably this is missing in the case of malloc.
 
 Just to double-check this, you could disassemble malloc and check if we
 have a 'push' instruction at the beginning.
 
 Thanks and regards,
 Sanjeev.
 
 Fletcher Cocquyt wrote:
 Hola, I am trying to isolate the memory leak I suspect in a mailman
 installation - I found:
 http://blogs.sun.com/sanjeevb/date/200506
 
 It gives an error:
 
 [EMAIL PROTECTED]:~ 9:21am 65 # ./memleak.d 10312
 dtrace: failed to compile script ./memleak.d: line 3: probe
 description pid10312:libc.so.1:malloc:entry does not match any probes
 
 I am on SunOS 5.10 Generic_127112-07 i86pc i386 i86pc
 
 Are there some better scripts for isolating memory leaks?
 
 thanks
 Fletch.
 
 
 ___
 dtrace-discuss mailing list
 dtrace-discuss@opensolaris.org
   
 




___
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org


Re: [dtrace-discuss] Memory leak scripts

2008-07-02 Thread Fletcher Cocquyt
Looks OK:

[EMAIL PROTECTED]:~ 7:22am 60 # !nm
nm /bin/python | egrep malloc
[3597]  | 134599012|   0|FUNC |GLOB |0|UNDEF  |malloc
[690]   | 0|   0|FILE |LOCL |0|ABS|obmalloc.c



On 7/2/08 6:11 AM, rickey c weisner [EMAIL PROTECTED] wrote:

 Fletcher,
 First confirm that malloc is in your binary.
 
 arwen:nm a.out | grep malloc
 [70]| 134547228| 0|FUNC |GLOB |0|UNDEF  |malloc
 
 Then key on any malloc.
 Something like:
 pid$target::malloc:return,
 pid$target::memalign:return,
 pid$target::realloc:return,
 pid$target::valloc:return
 
 rick
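
The nm check above can be scripted when several binaries need inspecting. What follows is only a minimal sketch in Python, assuming the pipe-separated Solaris nm layout pasted in this thread (index, value, size, type, bind, other, section, name). The function name `symbol_sections` is made up for illustration and is not part of any toolkit.

```python
def symbol_sections(nm_output, symbol):
    """Return the section field (e.g. 'UNDEF') for each row naming `symbol`.

    Assumes the pipe-separated layout shown in this thread:
    [index] | value | size | type | bind | other | section | name
    """
    sections = []
    for line in nm_output.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) == 8 and fields[7] == symbol:
            sections.append(fields[6])
    return sections


# Rows copied from the nm output posted in this thread.
sample = """\
[3597]  | 134599012|   0|FUNC |GLOB |0|UNDEF  |malloc
[690]   | 0|   0|FILE |LOCL |0|ABS|obmalloc.c
"""

# UNDEF typically means the binary only references malloc; the definition
# is resolved at run time from a shared object such as libc.so.1.
print(symbol_sections(sample, "malloc"))
```

An UNDEF entry is consistent with keying the probe on the function name alone (pid$target::malloc:return) and letting DTrace find whichever module defines it.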

-- 
Fletcher Cocquyt
Senior Systems Administrator
Information Resources and Technology (IRT)
Stanford University School of Medicine

Email: [EMAIL PROTECTED]
Phone: (650) 724-7485




Re: [dtrace-discuss] Memory leak scripts

2008-07-02 Thread Fletcher Cocquyt
Sanjeev, I get this with your new version:

[EMAIL PROTECTED]:~ 7:26am 64 # ./memleak2.d 7560
dtrace: failed to compile script ./memleak2.d: line 3: probe description
pid7560:libc.so.1:malloc:0 does not match any probes

[EMAIL PROTECTED]:~ 7:27am 65 # ps -ef | grep 7560
 mailman  7560   718   0 06:24:11 ?   0:05 /bin/python
/opt/mailman-2.1.9/bin/qrunner --runner=BounceRunner:0:1 -s

thanks

On 7/2/08 4:29 AM, Sanjeev Bagewadi [EMAIL PROTECTED] wrote:

 Fletcher,
 
 Mark Durney hit a similar problem, and while I was working with him and
 talking to my colleague, he pointed out that we could use function:offset
 notation when we are using the pid provider.
 
 So, I have modified the script to enable the first instruction of malloc.
 
 Attached is the script. Please try it out and let me know if it works.
 If it does I shall update my blog to reflect it.
 
 NOTE: If there are more functions which fail (for :entry), please replace
 entry with 0.
 
 Thanks and regards,
 Sanjeev.
 
 Sanjeev Bagewadi wrote:
 Fletcher,
 
 You could attach mdb to the running process and disassemble the routine
 in question :
 -- snip --
 # mdb -p pid
 malloc::dis
 libc.so.1`malloc:   pushl  %ebp
 libc.so.1`malloc+1: movl   %esp,%ebp
 libc.so.1`malloc+3: pushl  %ebx
 libc.so.1`malloc+4: pushl  %esi
 libc.so.1`malloc+5: pushl  %edi
 libc.so.1`malloc+6: call   +0x5 libc.so.1`malloc+0xb
 libc.so.1`malloc+0xb:   popl   %ebx
 libc.so.1`malloc+0xc:   addl   $0x88fe1,%ebx
 -- snip --
 
 So, in my case notice that the first instruction is pushl.
 
 Thanks and regards,
 Sanjeev.
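
The check Sanjeev describes (is the first instruction a push?) can be done by eye, but here is a small sketch for doing it on saved mdb output. It assumes `::dis` lines of the form 'module`symbol+offset: mnemonic operands' as in the snippet above. `first_mnemonic` is a hypothetical helper name, not an mdb or DTrace facility.

```python
def first_mnemonic(dis_output):
    """Return the mnemonic of the first disassembled instruction, or None.

    Assumes mdb '::dis' lines such as:  libc.so.1`malloc:   pushl  %ebp
    Lines without a module`symbol label before the colon are ignored.
    """
    for line in dis_output.splitlines():
        label, sep, rest = line.partition(":")
        if sep and "`" in label and rest.split():
            return rest.split()[0]
    return None


# First two lines of the disassembly quoted above.
sample = """\
libc.so.1`malloc:   pushl  %ebp
libc.so.1`malloc+1: movl   %esp,%ebp
"""

mnemonic = first_mnemonic(sample)
print(mnemonic, mnemonic is not None and mnemonic.startswith("push"))
```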



Re: [dtrace-discuss] Memory leak scripts

2008-07-02 Thread Fletcher Cocquyt
+0x156
  python`Py_Main+0xa6b
  python`main+0x17
  python`_start+0x80

  0  42249   free:entry Ptr=0x86d78f0




On 7/2/08 7:46 AM, rickey c weisner [EMAIL PROTECTED] wrote:

 Fletcher,
 This looks suspicious. Perhaps your malloc is not in libc?
 [690]   | 0|   0|FILE |LOCL |0|ABS|obmalloc.c
 Remove the libc.so.1 from your probe description.
 rick
 


Re: [dtrace-discuss] Memory leak scripts - analysis

2008-07-02 Thread Fletcher Cocquyt
OK, maybe this is significant in the context of explaining why my python
(mailman) processes seem to grow abnormally?
If the libc malloc is not being called, why not, and is that an important issue?


[EMAIL PROTECTED]:~ 2:17pm 54 # ldd /bin/python
libresolv.so.2 =>  /lib/libresolv.so.2
libsocket.so.1 =>  /lib/libsocket.so.1
libnsl.so.1 =>     /lib/libnsl.so.1
librt.so.1 =>      /lib/librt.so.1
libdl.so.1 =>      /lib/libdl.so.1
libm.so.2 =>       /lib/libm.so.2
libc.so.1 =>       /lib/libc.so.1
libmp.so.2 =>      /lib/libmp.so.2
libmd.so.1 =>      /lib/libmd.so.1
libscf.so.1 =>     /lib/libscf.so.1
libaio.so.1 =>     /lib/libaio.so.1
libdoor.so.1 =>    /lib/libdoor.so.1
libuutil.so.1 =>   /lib/libuutil.so.1
libgen.so.1 =>     /lib/libgen.so.1

This is Python 2.5.2, built with no configure options besides --prefix on
Solaris 10.

Config.log excerpt:
configure:14433: checking for --with-pymalloc
configure:14453: result: yes

thanks

On 7/2/08 7:46 AM, rickey c weisner [EMAIL PROTECTED] wrote:

 Fletcher,
 This looks suspicious. Perhaps your malloc is not in libc?
 [690]   | 0|   0|FILE |LOCL |0|ABS|obmalloc.c
 Remove the libc.so.1 from your probe description.
 rick
 


Re: [dtrace-discuss] Memory leak scripts - analysis

2008-07-02 Thread Fletcher Cocquyt
   -   4K rwx--[ anon ]
CFBB   4   4   4   -   4K rwx--[ anon ]
CFBC   4   4   -   -   4K r--s-  dev:61,0 ino:47763
CFBC4000 156 156   -   -   4K r-x--  ld.so.1
CFBFB000   4   4   4   -   4K rwx--  ld.so.1
CFBFC000   8   8   8   -   4K rwx--  ld.so.1
 --- --- --- ---
total Kb   28724   28044   23440   -


On 7/2/08 2:58 PM, rickey c weisner [EMAIL PROTECTED] wrote:

 Fletcher,
 libc malloc being called or not only had to do with the
 naming of your probe.
 Looking at your configure options:
 --with-pymalloc
 This implies to me that Python has its own malloc.
 
 I do not recall, but why do you think you have a memory leak?
 
 Just because the process grows over time and does not diminish
 in size does not necessarily mean a memory leak. How do you measure
 the size of your process, and are you examining the virtual size
 or the RSS? The virtual size only grows upward, but I would expect
 it to eventually stabilize. RSS will go up and
 down over time. I would be more concerned with RSS than virtual size,
 except for the possibility of exceeding a 4 GB address space for 32-bit
 applications. pmap -xs would be interesting.
 
 rick
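
To compare virtual size against RSS across snapshots, the pmap rows can be totalled with a few lines of Python. This is a rough sketch only. It assumes each mapping row reads Address, Kbytes, RSS, Anon, Locked, Pgsz, Mode, Mapped File (the column order the rows in this thread appear to follow) and that '-' means zero. `pmap_totals` is a hypothetical name, and the addresses in the sample are illustrative.

```python
def pmap_totals(pmap_output):
    """Sum the Kbytes, RSS and Anon columns of 'pmap -xs' style rows.

    Assumed row layout: Address Kbytes RSS Anon Locked Pgsz Mode File.
    Header, separator and 'total' rows are skipped because their second
    column is not a plain number.
    """
    totals = {"kbytes": 0, "rss": 0, "anon": 0}
    for line in pmap_output.splitlines():
        cols = line.split()
        if len(cols) < 7 or not cols[1].isdigit():
            continue
        for key, raw in zip(("kbytes", "rss", "anon"), cols[1:4]):
            totals[key] += 0 if raw == "-" else int(raw)
    return totals


# Rows modeled on the tail of the pmap output posted in this thread.
sample = """\
CFBB0000       4       4       4       -   4K rwx--    [ anon ]
CFBC4000     156     156       -       -   4K r-x--  ld.so.1
CFBFB000       4       4       4       -   4K rwx--  ld.so.1
CFBFC000       8       8       8       -   4K rwx--  ld.so.1
total Kb   28724   28044   23440       -
"""

print(pmap_totals(sample))
```

Running this on snapshots taken some hours apart would show whether the Anon total keeps climbing or levels off.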
 

[dtrace-discuss] Memory leak scripts

2008-07-01 Thread Fletcher Cocquyt
Hola, I am trying to isolate the memory leak I suspect in a mailman
installation - I found:
http://blogs.sun.com/sanjeevb/date/200506

It gives an error:

[EMAIL PROTECTED]:~ 9:21am 65 # ./memleak.d 10312
dtrace: failed to compile script ./memleak.d: line 3: probe description
pid10312:libc.so.1:malloc:entry does not match any probes

I am on SunOS 5.10 Generic_127112-07 i86pc i386 i86pc

Are there some better scripts for isolating memory leaks?

thanks
Fletch.

Re: [dtrace-discuss] Memory leak scripts

2008-07-01 Thread Fletcher Cocquyt
Yes:
[EMAIL PROTECTED]:~ 10:02am 52 # ps -ef | grep 10312
 mailman 10312 22726   0 09:13:19 ?   0:05 /bin/python
/opt/mailman-2.1.9/bin/qrunner --runner=VirginRunner:0:1 -s

This is the error for no such process:

[EMAIL PROTECTED]:~ 10:04am 53 # ./memleak.d 666
dtrace: failed to compile script ./memleak.d: line 3: failed to grab process
666
[EMAIL PROTECTED]:~ 10:04am 54 # ps -ef | grep 666
root 20386 19893   0 10:04:49 pts/1   0:00 grep 666
[EMAIL PROTECTED]:~ 10:04am 55 #

I'm hoping there is a fresher script than this 3yr old one I found via the
top google hit for:  dtrace script for memory leak

The 2nd and third hits are now this thread - gah!

I know memory leaks are a non-trivial problem - but the rate of this one is
so egregious as to require twice daily restarts of mailman - I like the
logic behind checking the alloc/free calls and matching them up...

Any tips appreciated -
Thanks,
Fletcher
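
The alloc/free matching idea mentioned above can be sketched independently of DTrace. The following is not memleak.d, only a minimal Python illustration of the bookkeeping, assuming the trace has already been reduced to (event, pointer, size) tuples. The tuple layout and the name `unfreed` are assumptions made for this example.

```python
def unfreed(events):
    """Return {pointer: size} for allocations never matched by a free.

    `events` is an iterable of ("malloc", ptr, size) or ("free", ptr, None)
    tuples, in the order they were traced.
    """
    live = {}
    for kind, ptr, size in events:
        if kind == "malloc":
            live[ptr] = size          # remember the outstanding allocation
        elif kind == "free":
            live.pop(ptr, None)       # matched: no longer outstanding
    return live


# Hypothetical trace; only the first pointer value appears in this thread.
trace = [
    ("malloc", 0x86d78f0, 64),
    ("malloc", 0x86d7940, 128),
    ("free", 0x86d78f0, None),
]

for ptr, size in unfreed(trace).items():
    print(hex(ptr), size)
```

Anything still in the result when tracing stops is only a leak candidate, since the process may simply not have freed it yet.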



On 7/1/08 9:41 AM, Michael Schuster [EMAIL PROTECTED] wrote:

 Fletcher Cocquyt wrote:
 Hola, I am trying to isolate the memory leak I suspect in a mailman
 installation - I found:
 http://blogs.sun.com/sanjeevb/date/200506
 
 It gives an error:
 
 [EMAIL PROTECTED]:~ 9:21am 65 # ./memleak.d 10312
 dtrace: failed to compile script ./memleak.d: line 3: probe description
 pid10312:libc.so.1:malloc:entry does not match any probes
 
 this begs the question:
 is there a process with pid 10312?
 
 Michael



Re: [dtrace-discuss] tcptop error: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name

2008-02-26 Thread Fletcher Cocquyt
I wanted to post a closing message for this thread.
I resolved the system contention on this Solaris VM - although it was not by
way of DTrace.
It turns out the VMware settings in the vmx file for this Solaris VM were not
optimal:

 memsize = 2048 (old file)
 sched.mem.max = 256 (old file) - (If sched.mem.max is smaller than memsize,
 the balloon driver can start consuming memory (especially if the Guest
 Operating system application has peaky memory usage). However, this setting
 can cause the balloon driver to retain its hold on memory continuously, even
 if the Guest Operating System requires it again. This causes the Guest
 Operating System to start swapping and slow down considerably.)

Now I recognize that the vmware-memctld process consuming so much CPU was a red
flag for this.
Once the two settings were brought into line (by using VC and checking
Memory resources unlimited), the VM functioned 100x better (responsiveness,
workload throughput, etc.).
Thanks
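
For anyone wanting to audit other guests for the same mismatch, here is a small sketch that flags a vmx file where sched.mem.max is lower than memsize. It assumes simple 'key = "value"' lines with both values in MB. `ballooning_risk` is a made-up helper name, not a VMware tool.

```python
def ballooning_risk(vmx_text):
    """Return True when sched.mem.max is set lower than memsize.

    Assumes plain 'key = value' lines, values optionally double-quoted,
    and both settings expressed in MB.
    """
    settings = {}
    for line in vmx_text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip().strip('"')
    try:
        memsize = int(settings["memsize"])
        mem_max = int(settings["sched.mem.max"])
    except (KeyError, ValueError):
        return False  # a setting is absent or not a plain number
    return mem_max < memsize


# Values from the old vmx file described above.
old_vmx = 'memsize = "2048"\nsched.mem.max = "256"\n'
print(ballooning_risk(old_vmx))
```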


On 1/21/08 2:16 PM, Brendan Gregg - Sun Microsystems [EMAIL PROTECTED]
wrote:

 On Mon, Jan 21, 2008 at 01:55:36PM -0800, Fletcher Cocquyt wrote:
 Followup - this system has a lot of kernel activity and I/O - (top typically
 shows CPU  50% kernel) - but the hotkernel blorked with this (even though
 load avg was only ~2 and command line is responsive):
 
 [EMAIL PROTECTED]:~ 1:41pm 114 # ./hotkernel
 Sampling... Hit Ctrl-C to end.
 dtrace: processing aborted: Abort due to systemic unresponsiveness
 
 The system is so busy DTrace has decided to play it safe and abort...
 
 Based on a few hunches, try these:
 
 - intrstat 1
 look for a network driver burning CPU
 
 - pidpersec.d from the DTraceToolkit
 (or sar -c 1 100 if DTrace won't behave)
 look for lots of short lived processes
 
 - procsystime -coT from the DTraceToolkit
 look for frequent syscalls burning CPU time
 
 - dtrace -n 'profile-101 { @[stack(5)] = count(); }'
 (this has a slower profile rate than hotuser)
 look for hot kernel stacks
 
 Brendan




Re: [dtrace-discuss] where is a working tcptop for Solaris 10 8/07 s10x_u4wos_12b X86?

2008-01-25 Thread Fletcher Cocquyt
This is my version
10:46am 61  more /etc/release
Solaris 10 8/07 s10x_u4wos_12b X86
   Copyright 2007 Sun Microsystems, Inc.  All Rights Reserved.
Use is subject to license terms.
Assembled 16 August 2007



Did you say what version of Solaris you are on?

Thanks





[dtrace-discuss] where is a working tcptop?

2008-01-24 Thread Fletcher Cocquyt
Hi, I tried both:
http://www.nbl.fi/~nbl97/solaris/dtrace/099html/Net/tcptop_snv.html and
http://www.nbl.fi/~nbl97/solaris/dtrace/099html/Net/tcptop.html

they both give this error:

./tcptop_nevada
./tcptop_nevada[80]: syntax error at line 86 : `' unmatched

Is there a central dtrace repository under SVN revision control?

Thanks

-----Original Message-----
From: Brendan Gregg - Sun Microsystems [mailto:[EMAIL PROTECTED] 
Sent: Monday, January 21, 2008 2:25 PM
To: Fletcher Cocquyt
Cc: [EMAIL PROTECTED]; dtrace-discuss@opensolaris.org
Subject: Re: [dtrace-discuss] tcptop error: failed to resolve
SS_TCP_FAST_ACCEPT: Unknown variable name

On Mon, Jan 21, 2008 at 02:17:46PM -0800, Fletcher Cocquyt wrote:
 Replaced SS_TCP_FAST_ACCEPT with SS_DIRECT in tcptop per the thread you
 cited - now I get a new error:
 
 [EMAIL PROTECTED]:~ 2:14pm 133 # ./tcptop
 dtrace: failed to compile script /dev/fd/11: line 163: failed to resolve
 `tcp_g_q: Unknown symbol name
 
 I got it from here:
 http://www.brendangregg.com/DTrace/tcptop
 is that not up to date?

Sorry about that - I've kept the DTraceToolkit bundle up to date, but not
individual copies of those scripts in other locations.  I'll either update
that copy, or link it to the DTraceToolkit bundle when I get a chance.

Stefan Parvu has an up to date HTML browsable version of the toolkit here:

http://www.nbl.fi/~nbl97/solaris/dtrace/dtt_testing.html

Click on 0.99.

Brendan

-- 
Brendan
[CA, USA]




Re: [dtrace-discuss] where is a working tcptop?

2008-01-24 Thread Fletcher Cocquyt
Same error with the '-latest':

[EMAIL PROTECTED]:~ 1:58pm 55 # DTraceToolkit-0.99/Net/tcptop
dtrace: failed to compile script /dev/fd/11: line 166: failed to resolve
`tcp_g_q: Unknown symbol name

I'd like to get this working as network captures are showing retransmits...
Thanks


-----Original Message-----
From: Brendan Gregg - Sun Microsystems [mailto:[EMAIL PROTECTED] 
Sent: Thursday, January 24, 2008 11:37 AM
To: Fletcher Cocquyt
Cc: [EMAIL PROTECTED]; dtrace-discuss@opensolaris.org
Subject: Re: where is a working tcptop?

G'Day Fletcher,

On Thu, Jan 24, 2008 at 10:40:43AM -0800, Fletcher Cocquyt wrote:
 Hi, I tried both:
 http://www.nbl.fi/~nbl97/solaris/dtrace/099html/Net/tcptop_snv.html and
 http://www.nbl.fi/~nbl97/solaris/dtrace/099html/Net/tcptop.html
 
 they both give this error:
 
 ./tcptop_nevada
 ./tcptop_nevada[80]: syntax error at line 86 : `' unmatched

Hmm, sounds like a HTML-izing bug.  The latest version should always be
here:

http://www.brendangregg.com/DTraceToolkit-latest.tar.gz

Brendan



Re: [dtrace-discuss] where is a working tcptop?

2008-01-24 Thread Fletcher Cocquyt
Different error with that one:

[EMAIL PROTECTED]:~ 2:49pm 53 # DTraceToolkit-0.99/Net/tcptop_snv 
dtrace: failed to compile script /dev/fd/11: line 198: probe description
fbt:ip:tcp_xchg:entry does not match any probes

Retransmit rate is low (9/3) - but the fact I'm seeing any warrants
further analysis

Thanks



-----Original Message-----
From: Brendan Gregg - Sun Microsystems [mailto:[EMAIL PROTECTED] 
Sent: Thursday, January 24, 2008 2:48 PM
To: Fletcher Cocquyt
Cc: [EMAIL PROTECTED]; dtrace-discuss@opensolaris.org
Subject: Re: where is a working tcptop?

G'Day Fletcher,

On Thu, Jan 24, 2008 at 02:39:19PM -0800, Fletcher Cocquyt wrote:
 Same error with the '-latest':
 
 [EMAIL PROTECTED]:~ 1:58pm 55 # DTraceToolkit-0.99/Net/tcptop
 dtrace: failed to compile script /dev/fd/11: line 166: failed to resolve
 `tcp_g_q: Unknown symbol name

That's not quite the latest:

   DTraceToolkit-0.99 MANPATH=Man man tcptop
   ...
   OS
Solaris 10 3/05

   DTraceToolkit-0.99 MANPATH=Man man tcptop_snv
   ...
   OS
Solaris Nevada / OpenSolaris, circa late 2007

Try tcptop_snv.  I put the OS field in the man pages for this rev, not
only to point out Solaris version support, but also for MacOS X and other
OSes with DTrace.


 I'd like to get this working as network captures are showing
 retransmits...
 Thanks

This won't help directly with retransmits.  What is your retransmit ratio?

Brendan



[dtrace-discuss] tcptop error: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name

2008-01-21 Thread Fletcher Cocquyt
Hi, I am trying to debug the bottleneck(s) in a Solaris 10
Mailman/SpamAssassin/Sendmail VMware VM and get the following error from
tcptop:

[EMAIL PROTECTED]:~ 1:35pm 103 # ./tcptop
dtrace: failed to compile script /dev/fd/11: line 40: failed to resolve
SS_TCP_FAST_ACCEPT: Unknown variable name

thanks for any insight,
Fletcher.

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Colin Burgess
Sent: Friday, January 18, 2008 1:33 PM
To: dtrace-discuss@opensolaris.org
Subject: [dtrace-discuss] LatencyTop

I see Intel has released a new tool.  Oh, it requires some patches to 
the kernel to record
latency times.  Good thing people don't mind patching their kernels, eh?

So who can write the equivalent latencytop.d the fastest? ;-)

http://www.latencytop.org/

-- 
[EMAIL PROTECTED]



Re: [dtrace-discuss] tcptop error: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name

2008-01-21 Thread Fletcher Cocquyt
Followup - this system has a lot of kernel activity and I/O - (top typically
shows CPU  50% kernel) - but the hotkernel blorked with this (even though
load avg was only ~2 and command line is responsive):

[EMAIL PROTECTED]:~ 1:41pm 114 # ./hotkernel 
Sampling... Hit Ctrl-C to end.
dtrace: processing aborted: Abort due to systemic unresponsiveness

FUNCTION                                                COUNT   PCNT

I'm working my way down the toolkit list - any help on pinpointing the
bottlenecks with the appropriate 1st pass tools appreciated.

Here is some iotop output - nothing surprising here - sendmail, spamd and
mailman (python) are generating I/O:

2008 Jan 21 13:49:54,  load: 1.35,  disk_r: 32 KB,  disk_w:   2424 KB

  UID    PID   PPID CMD              DEVICE  MAJ MIN D      BYTES
    0  13413  13412 sendmail         sd0      61   0 W       2048
    0  13411  13406 sendmail         sd0      61   0 W       4096
    0  13409  13370 sendmail         sd0      61   0 W       5120
    0      3      0 fsflush          sd0      61   0 W       8192
    0  13420      1 sendmail         sd0      61   0 W      22528
  555   3809   3140 spamd            sd0      61   0 R      32768
    0  13419    496 sendmail         sd0      61   0 W      41984
    0  13412    496 sendmail         sd0      61   0 W      44032
    0  13370    496 sendmail         sd0      61   0 W      50688
    0  13413      1 sendmail         sd0      61   0 W      51712
    0  13406    496 sendmail         sd0      61   0 W      71680
    0  13414    496 sendmail         sd0      61   0 W      96256
   35  24406  24400 python2.4        sd0      61   0 W     172032
    0      0      0 sched            sd0      61   0 W     318464
  555   3809   3140 spamd            sd0      61   0 W     405504
   35  24409  24400 python2.4        sd0      61   0 W    1006592

Ideally I'd like to know what the fixable (tunable) bottlenecks are on a
system that otherwise has plenty of CPU and memory available

Thanks




Re: [dtrace-discuss] tcptop error: failed to resolve SS_TCP_FAST_ACCEPT: Unknown variable name

2008-01-21 Thread Fletcher Cocquyt
Thanks - prustat works great once I point it at Sun's perl (I was using a
newer install).
I'm going to record some snapshots when the contention is happening...

What if I wanted to quantify the latency (wait times) due to DNS lookups? I
suspect I could benefit from a local caching install, but I want a "before"
picture so I can show how much better it is using a local DNS cache...

Thanks,
Fletcher
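
One crude way to get such a "before" picture from the application side, without DTrace, is to time the resolver calls directly. This is only a sketch under stated assumptions: `time_lookups` is a hypothetical helper, it measures wall-clock time as seen by one client process, and any name-service caching already on the host will colour the numbers.

```python
import socket
import time


def time_lookups(names, resolve=socket.gethostbyname):
    """Return {name: seconds} of wall-clock time spent in each lookup.

    Failed lookups are timed too, since a slow failure still costs the caller.
    `resolve` can be swapped out, for example to test without a network.
    """
    timings = {}
    for name in names:
        start = time.perf_counter()
        try:
            resolve(name)
        except OSError:  # socket.gaierror and socket.herror are subclasses
            pass
        timings[name] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    # 'localhost' is used so the example does not depend on outside DNS.
    for name, seconds in time_lookups(["localhost"]).items():
        print(f"{name}: {seconds * 1000:.2f} ms")
```

Running the same list of names before and after installing a local cache would give two sets of timings to compare.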

-----Original Message-----
From: Brendan Gregg - Sun Microsystems [mailto:[EMAIL PROTECTED] 
Sent: Monday, January 21, 2008 3:38 PM
To: Fletcher Cocquyt
Cc: dtrace-discuss@opensolaris.org
Subject: Re: [dtrace-discuss] tcptop error: failed to
resolve SS_TCP_FAST_ACCEPT: Unknown variable name

On Mon, Jan 21, 2008 at 02:48:47PM -0800, Fletcher Cocquyt wrote:
 Forgive me, where do I find 'interstat' ?
 
 Also, where can I get Sun::Solaris::Kstat for prustat?

It's probably already under /usr/perl5/5.8.4/lib - it's a vendor (Sun)
supplied package.

prustat was written as a demo tool - it might be useful, but it will probably
fail due to a kernel change.  I wrote it when I was a customer to make a
point to Sun that this is the sort of tool that customers would like.
It turns out that supporting this tool for customers would require stable
network providers for DTrace, a project that is still in progress.

I never put prustat into the DTraceToolkit because it wasn't stable enough,
despite it providing key resource utilisations by process (which is wonderful,
and made possible by DTrace).  If anyone hasn't seen it, it looks like this:

# prustat -ct 20 5
   PID   %CPU   %Mem  %Disk   %Net  COMM
 22301  78.84   3.16   0.00   0.00  setiathome
 22635   4.09   0.20  69.11   0.00  tar   
   440   2.76  45.39   0.00   0.00  Xsun   
  2618   0.31  14.34   0.00   0.00  mozilla-bin 
 22640   3.87   1.49   0.12   0.00  dtrace   
   582   2.04   2.16   0.00   0.00  gnome-terminal
   576   0.02   2.80   0.00   0.00  nautilus  
  2299   0.33   1.99   0.00   0.00  acroread   
 22641   0.00   0.00   1.84   0.00  upsmonitor  
   578   0.37   1.46   0.00   0.00  gnome-panel  
   574   0.41   1.31   0.00   0.00  metacity  
  6504   0.00   1.23   0.00   0.00  nautilus-throbb
   593   0.04   1.05   0.00   0.00  mixer_applet2
   556   0.00   1.05   0.00   0.00  gconfd-2  
   549   0.00   0.94   0.00   0.00  gnome-session 
  6510   0.00   0.93   0.00   0.00  nautilus-text-v
   591   0.02   0.83   0.00   0.00  galf-server
 21551   0.00   0.56   0.00   0.00  dtterm
  4789   0.10   0.45   0.00   0.00  vncviewer
   553   0.00   0.43   0.00   0.00  gnome-volcheck

the screen updates like William LeFebvre's top.

Let me stress again - prustat was written to demonstrate an idea, but is
currently unstable as a tool.

Brendan



Re: [dtrace-discuss] DTraceTools Update

2007-09-05 Thread Fletcher Cocquyt
Re: not enough test servers
Can't DTrace testing and development be done on virtual machines?
Doesn't DTrace behave the same on a Solaris 10 virtual machine (e.g. VMware's
free server)? And yes, as far as I know there is currently no way to
create a SPARC VM, but x86-based OSes are well represented.
I'm keen to test out VMware Lab Manager, which purports to be the solution
for rapid deployment of whole sets of test systems.
 
Thanks for continuing the development of the DTrace tools

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Brendan Gregg -
Sun Microsystems
Sent: Wednesday, September 05, 2007 6:04 PM
To: Gary Gendel
Cc: dtrace-discuss@opensolaris.org
Subject: Re: [dtrace-discuss] DTraceTools Update

G'Day Folks,

Plans to update the DTraceTools (DTraceToolkit)? Yes. Development has
been happening, but I haven't wanted to upload a new version without
addressing the tcp* scripts first, somehow. They have exposed an issue
with versioning of unstable scripts and supported OSes, which I'll use
this thread to discuss at some length for anyone interested.

...

Firstly, most of the DTraceToolkit *is* up to date with the latest snv builds,
since most of the DTraceToolkit uses stable DTrace providers (as it should).

Some stable providers are not yet available, and until then we are in an
awkward place -- people on older (and newer) builds may find some of the
fbt-based scripts don't work.

I'm currently thinking that it would be practical to only support the
following,

Solaris 10 3/05
OpenSolaris latest build
MacOS X Leopard
[insert OSes here after DTrace is ported]

I've already made changes to the man pages to show which operating systems
each script will run on.

This means that the tcp* scripts need updating to support the latest
OpenSolaris builds (and updating, and updating, as things keep changing).
Of course, life will be somewhat easier when stable networking providers
exist, and the tcp* scripts can use their probes (although, I'm expecting
tcpsnoop and tcptop to need more than just the network providers to become
stable).

Several people have asked about the tcp* scripts on Solaris 10 6/06 and
other Solaris builds (builds in between 3/05 and the latest OpenSolaris).
I've wanted to have a go at fixing these scripts for these minor releases -
but since moving to the US I've found it harder to re-acquire a pile of test
servers to support them (SPARC and x86 servers for every Solaris 10
release == a lot of servers, space and electricity). The desire is there,
but the servers are not; not to mention that it will probably eat up a lot
of my spare time to port these.

Now, if I or someone else does eventually port the tcp* scripts, that then
presents a versioning issue in the DTraceToolkit, and I'd prefer not to
have fat ugly scripts in a THIS VERSION, THAT VERSION style as Nathan
has mentioned. I'm thinking the way ahead would be a Versions directory
with entire ported copies of the scripts, e.g.,

/opt/DTT# ls -1 Net/tcp* Net/Versions/tcp*
Net/tcpsnoop
Net/tcpsnoop.d
Net/tcpstat.d
Net/tcptop
Net/tcpwdist.d
Net/Versions/tcpsnoop.sol10u2.d
Net/Versions/tcpsnoop.sol10u3.d
Net/Versions/tcpsnoop.sol10u3
Net/Versions/tcptop.sol10u3

Remember, there won't be many scripts in these Versions directories, just
those *fbt* based scripts that have broken, so it won't be that common to
need to poke around there.

However, what happens if I have a *stable* provider based script, and want
to enhance it to use newer DTrace features (like multiple aggregations)?
I would end up with two or more versions, one for Solaris 10 3/05 (without
the enhancements), one for the latest OpenSolaris (with the enhancements),
and possibly another for MacOS X (with whatever they support so far), and
maybe another for Linux (when they port DTrace :-).

Would some of these get moved to the Versions directory (forcing people
to frequently look in there)? Do I write a wrapper for every script, isaexec
style? Do I deal with it in the installer script, symlinking the correct
version based on your OS? Do I have ugly ifdef THIS VERSION statements
throughout the scripts?... I don't know what to do yet, but it won't be
long before I'll need an answer (I do want to start using some of the new
DTrace features, as well as supporting those on Solaris 10 3/05). Ideas?
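
To make the installer/wrapper option concrete, here is a minimal sketch of how a selector might prefer a ported copy from a Versions directory and fall back to the default script. It is not part of the DTraceToolkit. The OS tags (such as sol10u3) follow the listing earlier in this message, while `pick_script` and its `exists` parameter are hypothetical names introduced for illustration.

```python
import os


def pick_script(directory, name, os_tag, exists=os.path.exists):
    """Return the OS-specific copy of `name` if present, else the default.

    Assumed layout (from the listing earlier in this message):
      <directory>/<name>                      default script
      <directory>/Versions/<root>.<os_tag><ext>  ported copy
    so 'tcpsnoop.d' with tag 'sol10u3' maps to 'Versions/tcpsnoop.sol10u3.d'.
    """
    root, ext = os.path.splitext(name)
    ported = os.path.join(directory, "Versions", f"{root}.{os_tag}{ext}")
    if exists(ported):
        return ported
    return os.path.join(directory, name)


# With no ported copy on disk this falls back to the default path.
print(pick_script("Net", "tcpsnoop.d", "sol10u3"))
```

A wrapper or installer could derive the tag from /etc/release and then exec or symlink the chosen path; how to derive the tag reliably is left open here.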

Until I know of a sensible way to do it, I could add scripts with an x
after their name - for extended (and rename them when we think of
something better). Eg,

hotuser         # Solaris 10 3/05 (uses ustack() + perl)
hotuserx        # Latest OpenSolaris (uses ufunc() and umod())

hotuser would be the most glaring example, since the code will become
trivial if I can use ufunc() and umod() instead. I don't know which version
MacOS X would run (need to check if it has ufunc() and umod())...

I should stress that this issue is only for