Re: [dtrace-discuss] Why doesn't ustack() always output the function name?

2008-10-23 Thread Yasufumi CHIDA
Hello, Chad,

Thank you for taking an interest in the details.
By "inconsistent results from different programs" I meant the following:

dtrace ustack() showed output from some programs with function names such as
_start(), main(), libc.so.1`_open(), and so on,
but from other programs only hexadecimal addresses, without any function
names.


e.g.

(with name)

bash setpgrp 13
libc.so.1`_syscall6+0x1c
35c6c
34fa8
bash`execute_command_internal+0x414
bash`execute_command+0x50
bash`reader_loop+0x220
bash`main+0x90c
bash`_start+0x108




(without name)  ... another D script
'/usr/openwin/lib/X11/XtErrorDB' failed
febbcf78
febb05a0
fec97b38
fec97a78
fedbbffc
fedbbeac
fedbbe40
fedc0220
fedc037c
fed8fb6c
fed8f2f8
fed8f290
cf3f8
3f648
d1c98
5c658




That's the inconsistency I meant. I still have no idea...

Yasufumi
--
This message posted from opensolaris.org
___
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org


Re: [dtrace-discuss] Why doesn't ustack() always output the function name?

2008-10-23 Thread Casper . Dik

> Hello, Chad,
>
> Thank you for taking an interest in the details.
> By "inconsistent results from different programs" I meant the following:
> dtrace ustack() showed output from some programs with function names
> such as _start(), main(), libc.so.1`_open(), and so on,
> but from other programs only hexadecimal addresses, without any function
> names.

dtrace needs to be able to open the process; if the process has terminated,
all you get are hexadecimal addresses.

DTrace is async: first it generates the probe output in probe
context (synchronously); then later it pretty-prints the output, outside of
probe context (after the fact).
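To illustrate the point (a toy Python sketch, not DTrace code; all names
here are invented): the raw addresses are captured at probe-fire time, but
the name lookup happens afterward, and if the process's symbol table has
vanished by then, only raw hex can be printed.

```python
# Toy model of deferred symbolization (all names invented).
symtab = {0x10400: "main", 0x10800: "_start"}  # per-process symbol table

def capture_ustack():
    """Probe context: record raw program-counter values only."""
    return [0x10400, 0x10800, 0x20000]

def symbolize(addrs, table):
    """Userland, after the fact: translate addresses if the table survives."""
    if table is None:                       # process already exited
        return [hex(a) for a in addrs]
    return [table.get(a, hex(a)) for a in addrs]

frames = capture_ustack()
print(symbolize(frames, symtab))  # process still alive: names where known
print(symbolize(frames, None))    # process gone: hex addresses only
```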

Casper



Re: [dtrace-discuss] Why doesn't ustack() always output the function name?

2008-10-23 Thread Adam Leventhal
On Oct 23, 2008, at 2:20 AM, [EMAIL PROTECTED] wrote:
>> Thank you for taking an interest in the details.
>> By "inconsistent results from different programs" I meant the following:
>> dtrace ustack() showed output from some programs with function names
>> such as _start(), main(), libc.so.1`_open(), and so on,
>> but from other programs only hexadecimal addresses, without any function
>> names.
>
> dtrace needs to be able to open the process; if the process has
> terminated, all you get are hexadecimal addresses.
>
> DTrace is async: first it generates the probe output in probe
> context (synchronously); then later it pretty-prints the output, outside
> of probe context (after the fact).

To be a little clearer, the ustack() action records addresses when the probe
fires, but those addresses aren't converted to symbols until later in
userland, and at that point the process from which the stack came may have
gone away. Here's how I solve this problem:

/* some probe where you take a ustack() */
{
        ustack();
        procs[pid] = 1;
}

syscall::rexit:entry
/procs[pid]/
{
        procs[pid] = 0;
        stop();
        system("prun %d", pid);
}

What this does is catch processes as they call the exit system call and
force them to pause long enough for us to gather ustack() data. This isn't
foolproof, since applications that exit abnormally will escape.
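For completeness, here is a hedged sketch of a full script following that
pattern, with pid-provider malloc standing in as a hypothetical probe of
interest; stop() and system() are destructive actions, so the destructive
option must be enabled:

```d
#!/usr/sbin/dtrace -s
#pragma D option destructive

/* Hypothetical probe of interest; substitute your own. */
pid$target::malloc:entry
{
        ustack();
        procs[pid] = 1;
}

/*
 * Catch traced processes as they exit and pause them long enough
 * for the consumer to resolve the recorded addresses to symbols.
 */
syscall::rexit:entry
/procs[pid]/
{
        procs[pid] = 0;
        stop();
        system("prun %d", pid);
}
```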

Adam

--
Adam Leventhal, Fishworks    http://blogs.sun.com/ahl



[dtrace-discuss] fbt probes creation issue

2008-10-23 Thread Aubrey Li
Hi list,

From the fbt driver code, it's easy to see that fbt probes are
created after the fbt module is loaded: fbt_provide_module is called
whenever a module is loaded, and it creates the fbt probes for all of that
module's functions.

But how are fbt probes created for the modules that were loaded before the
fbt module itself?

There must be some magic I missed.
Could you please shed some light on it?

Thanks,
-Aubrey


Re: [dtrace-discuss] fbt probes creation issue

2008-10-23 Thread Adam Leventhal
On Oct 23, 2008, at 7:22 AM, Aubrey Li wrote:
> But how are fbt probes created for the modules that were loaded before
> the fbt module itself?

It happens in the call to dtrace_open(), which calls dtrace_probe_provide().
This iterates over all providers and calls each provider's
dtps_provide_module callback for every module.
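In rough pseudocode (a compressed sketch, not the actual OpenSolaris source,
though the names mirror the real ones):

```
dtrace_probe_provide(desc):
    for each registered provider prv:
        prv->dtps_provide(prv->arg, desc)           /* unanchored probes */
        for each currently loaded module ctl:
            prv->dtps_provide_module(prv->arg, ctl) /* fbt creates entry/
                                                       return probes for
                                                       ctl's functions */
```

So once a consumer opens /dev/dtrace, fbt's dtps_provide_module callback runs
over every module already in memory, not just modules loaded after fbt itself.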

Adam

--
Adam Leventhal, Fishworks    http://blogs.sun.com/ahl



Re: [dtrace-discuss] Difference between dtrace 1.1 and 1.3

2008-10-23 Thread Adam Leventhal
Hey Benoit,

Can you send me (in private mail) the output of your command adding -x debug?
Also, have you tried fully specifying the library as libc.so.1?
Thanks.
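That is, naming the module by its full name, along these lines (sketch of
the fully specified form of the script's first probe):

```d
pid$1:libc.so.1:malloc:entry
{
        self->trace = 1;
        self->size = arg0;
}
```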

Adam

On Oct 23, 2008, at 8:15 AM, benoit wrote:

> Hi all;
>
> The following script works well with dtrace 1.1 but returns the
> following error with dtrace 1.3:
>
> dtrace: failed to compile script log-memory.d: line 3: probe
> description pid25659:libc:malloc:entry does not match any probes
>
> Thanks for any suggestion.
> -benoit
>
>
> #!/usr/sbin/dtrace -s
>
> pid$1:libc:malloc:entry
> {
>         self->trace = 1;
>         self->size = arg0;
> }
>
> pid$1:libc:malloc:return
> /self->trace == 1 && self->size > $2/
> {
>         printf("Ptr=0x%p Size=%d TS=%d AllocTime=%Y", arg1,
>             self->size, timestamp, walltimestamp);
>         ustack();
>         self->trace = 0;
>         self->size = 0;
>         allocated[arg1] = 1;
> }




--
Adam Leventhal, Fishworks    http://blogs.sun.com/ahl



[dtrace-discuss] Network I/O Delays on T5220

2008-10-23 Thread Patrick Pinchera
I'm working with a colleague on a network latency problem he is having
on a T5220 server.  Basically, his server reads some data off the
network interface via sockets from a different hardware device, and then
the T5220 responds with some data written out to the same network
interface.

There is a timing requirement: the reads happen about every 80
milliseconds, and the responding writes need to happen and be complete
within about 240 milliseconds.  On other hardware (and even on Solaris 8),
typical responses were in the 20 msec range, with no latency issues at
all.

What he is seeing is that the write response happens OK for a while, then it
slows down to where the response (socket write from the T5220) takes longer
than 240 msecs.

He used the socketsnoop.d script from the DTrace Toolkit, and had the 
following observations:

> The problem in a nutshell is that there is a message (controls for
> hardware) sent about every 80 msec from the T5220/Solaris v10 to an
> MVME6100/VxWorks v5.5.1 via TCP/IP over the e1000g1 interface.  The
> message needs to get to the MVME6100 within 240 msec of being sent or
> the signal processor will run out of commands.  The process sending the
> message is running at a real time priority and only one other controller
> process has a higher real time priority.  From a dtrace that monitored
> fbt:ip:tcp_output:entry, I see 5 socket writes each time my message is
> sent:
>
> TIME              WALLTIME                 UID    PID  DIR   SIZE  CMD
> (wrapped on next line)
>
> 499194.679681840  2008 Oct 20 14:23:35.61    0  21542   W  14600
> /home/cm/block3/v66_block3/20081015/rdp/bin/rc_radar_sched
>
> 499194.680047812  2008 Oct 20 14:23:35.62    0  21542   W  14600
> /home/cm/block3/v66_block3/20081015/rdp/bin/rc_radar_sched
>
> 499194.680408073  2008 Oct 20 14:23:35.62    0  21542   W  14600
> /home/cm/block3/v66_block3/20081015/rdp/bin/rc_radar_sched
>
> 499194.680613575  2008 Oct 20 14:23:35.62    0  21542   W  14600
> /home/cm/block3/v66_block3/20081015/rdp/bin/rc_radar_sched
>
> 499194.680732307  2008 Oct 20 14:23:35.62    0  21542   W   4160
> /home/cm/block3/v66_block3/20081015/rdp/bin/rc_radar_sched
>
> When the problem occurred I saw only 3 writes, followed by 2 writes
> attributed to sched:
>
> 499194.761630757  2008 Oct 20 14:23:35.70    0  21542   W  14600
> /home/cm/block3/v66_block3/20081015/rdp/bin/rc_radar_sched
>
> 499194.761992335  2008 Oct 20 14:23:35.70    0  21542   W  14600
> /home/cm/block3/v66_block3/20081015/rdp/bin/rc_radar_sched
>
> 499194.762465836  2008 Oct 20 14:23:35.70    0  21542   W  14600
> /home/cm/block3/v66_block3/20081015/rdp/bin/rc_radar_sched
>
> 499194.763077839  2008 Oct 20 14:23:35.70    0      0   W  14600  sched
> 499194.763134514  2008 Oct 20 14:23:35.70    0      0   W   4160  sched

The 5 socket write()'s above (four at 14600 bytes, and the fifth at 4160
bytes) from the rc_radar_sched process are normal, and within the time
window we need. But when it takes longer than 240 msecs, we only see 3
socket write()'s at 14600 bytes from rc_radar_sched, followed by two
socket write()'s from the sched process (the first is 14600 bytes, and
the second 4160). This is when the delay happens for subsequent write()'s.

What could be going on here? Any more data we should dtrace?
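One hypothetical next step (a sketch; the probe is the same
fbt:ip:tcp_output:entry already being monitored, but the aggregation is
invented for illustration): record the gap between consecutive tcp_output
calls per process, so the stall shows up as outliers in the distribution.

```d
/* Distribution of inter-write gaps (nanoseconds) per process. */
fbt:ip:tcp_output:entry
/last[pid] != 0/
{
        @gaps[pid, execname] = quantize(timestamp - last[pid]);
}

fbt:ip:tcp_output:entry
{
        last[pid] = timestamp;
}
```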

Thanks,
Pat





Re: [dtrace-discuss] query re memory profiling

2008-10-23 Thread Ryan
> Hi
> I have a DTrace script (attached) which examines memory usage, using
> fbt:genunix:anon_resvmem:entry & return.
>
> Seems to work well; tested with a small app that malloced 65K,
> malloced 200M, and mmaped a 16M file, and it appeared
> to get things correct.

But is it tested with calls to free()?

> Basically I am looking at an app that seems to wander off the
> reservation in terms of memory usage, so I wanted to
> rule out any bugs in the script.

AFAICT ::anon_unresvmem:entry/self->size > 0/ would never fire, because
self->size is only non-zero during calls to anon_resvmem. That's probably why
there seems to be a memory leak.

> Is there a way to get the stack of the calling app at the point in
> time of the anon_resvmem entry & return?

ustack();

Other thoughts:

Most of those self-> variables (uid, pid, etc.) seem unnecessary (though
harmless), because the values are just as easy to figure out from the
:::return as from the :::entry.

Out of curiosity, why the test for non-NULL pr_psargs? Should unresvmem test 
for it as well?

It seems like you'd want something like this:

fbt:genunix:anon_resvmem:entry
/arg1 != 0/
{
        self->size = arg0;
        self->args = (char *)curpsinfo->pr_psargs;
}

fbt:genunix:anon_resvmem:return
/self->args != 0 && arg1 != 0/
{
        printf(...);
        total[pid] += self->size;
        self->args = 0;
}

fbt:genunix:anon_unresvmem:entry
{
        self->size = arg0;
        self->args = (char *)curpsinfo->pr_psargs;
}

fbt:genunix:anon_unresvmem:return
/self->args != 0/
{
        printf(...);
        total[pid] -= self->size;
        self->args = 0;
}

Regards,
Ryan