Re: [dtrace-discuss] Why doesn't ustack() always output the function name?
Hello, Chad,

Thank you for taking an interest in the details. In my case, the
inconsistency was between the results from different programs: dtrace
ustack() showed results from some programs with names such as _start(),
main(), libc.so.1`_open(), and so on, but from other programs only
hexadecimal addresses, with no function names. For example:

(with names)

  bash setpgrp 13
    libc.so.1`_syscall6+0x1c
    35c6c
    34fa8
    bash`execute_command_internal+0x414
    bash`execute_command+0x50
    bash`reader_loop+0x220
    bash`main+0x90c
    bash`_start+0x108

(without names, from another D script)

  ... '/usr/openwin/lib/X11/XtErrorDB' failed
    febbcf78
    febb05a0
    fec97b38
    fec97a78
    fedbbffc
    fedbbeac
    fedbbe40
    fedc0220
    fedc037c
    fed8fb6c
    fed8f2f8
    fed8f290
    cf3f8
    3f648
    d1c98
    5c658

That is the inconsistency I meant. I still have no idea...

Yasufumi
Re: [dtrace-discuss] Why doesn't ustack() always output the function name?
> Hello, Chad,
>
> Thank you for taking an interest in the details. In my case, the
> inconsistency was between the results from different programs: dtrace
> ustack() showed results from some programs with names such as
> _start(), main(), libc.so.1`_open(), and so on, but from other
> programs only hexadecimal addresses, with no function names.

dtrace needs to be able to open the process; if the process has
terminated, all you get are hexadecimal addresses.

DTrace is asynchronous: first it generates the probe output in probe
context (synchronously); then, later, it pretty-prints the output
outside of probe context (after the fact).

Casper
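[A minimal sketch that tends to reproduce this behavior -- my own
illustration, not from the original thread: take a ustack() at process
exit, so the process is usually gone by the time dtrace(1M) tries to
resolve symbols, and only raw addresses are printed.]

  syscall::rexit:entry
  {
          /* Addresses are recorded now, in probe context... */
          ustack();
          /* ...but symbol lookup happens later in userland, when the
             process may no longer exist to have its symbols read. */
  }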
Re: [dtrace-discuss] Why doesn't ustack() always output the function name?
On Oct 23, 2008, at 2:20 AM, [EMAIL PROTECTED] wrote:

>> Thank you for taking an interest in the details. In my case, the
>> inconsistency was between the results from different programs: dtrace
>> ustack() showed results from some programs with names such as
>> _start(), main(), libc.so.1`_open(), and so on, but from other
>> programs only hexadecimal addresses, with no function names.
>
> dtrace needs to be able to open the process; if the process has
> terminated, all you get are hexadecimal addresses.
>
> DTrace is asynchronous: first it generates the probe output in probe
> context (synchronously); then, later, it pretty-prints the output
> outside of probe context (after the fact).

To be a little clearer, the ustack() action records addresses when the
probe fires, but those addresses aren't converted to symbols until
later in userland, and at that point the process from which the stack
came may have gone away.

Here's how I solve this problem:

  some probe where you take a ustack()
  {
          ustack();
          procs[pid] = 1;
  }

  syscall::rexit:entry
  /procs[pid]/
  {
          procs[pid] = 0;
          stop();
          system("prun %d", pid);
  }

What this does is catch processes as they call the exit system call and
force them to pause long enough for us to gather ustack() data. This
isn't foolproof, since applications that exit abnormally will escape.

Adam

--
Adam Leventhal, Fishworks            http://blogs.sun.com/ahl
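[One note for anyone running this -- my addition, not part of Adam's
message: stop() and system() are destructive actions, so the script
must be run with destructive actions enabled, e.g.:]

  # -w enables the destructive actions stop() and system();
  # the script file name here is hypothetical.
  dtrace -w -s catch-exit.d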
[dtrace-discuss] fbt probes creation issue
Hi list,

From the fbt driver it's easy to see that fbt probes are created after
the fbt module is loaded: fbt_provide_module() is called whenever a
module is loaded, and it creates fbt probes for all of that module's
functions. But how are fbt probes created for the modules that were
loaded before the fbt module itself? There must be some magic I missed.
Could you please shed some light on this?

Thanks,
-Aubrey
Re: [dtrace-discuss] fbt probes creation issue
On Oct 23, 2008, at 7:22 AM, Aubrey Li wrote:

> But how are fbt probes created for the modules that were loaded
> before the fbt module itself?

It happens in the call to dtrace_open(), which calls
dtrace_probe_provide(). This iterates over all providers and calls each
provider's dtps_provide_module callback for every loaded module.

Adam

--
Adam Leventhal, Fishworks            http://blogs.sun.com/ahl
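[A way to see this from the command line -- my own hedged example, with
an arbitrary module name: listing probes opens the DTrace device, which
is exactly the dtrace_open() path described above, so fbt probes show
up even for a module loaded at boot, long before fbt.]

  # Listing forces dtrace_open() -> dtrace_probe_provide(),
  # so fbt provides probes for the already-loaded module.
  dtrace -l -P fbt -m ufs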
Re: [dtrace-discuss] Difference between dtrace 1.1 and 1.3
Hey Benoit,

Can you send me (in private mail) the output of your command with
-x debug added? Also, have you tried fully specifying the library as
libc.so.1?

Thanks.

Adam

On Oct 23, 2008, at 8:15 AM, benoit wrote:

> Hi all,
>
> The following script works well with dtrace 1.1 but returns the
> following error with dtrace 1.3:
>
>   dtrace: failed to compile script log-memory.d: line 3: probe
>   description pid25659:libc:malloc:entry does not match any probes
>
> Thanks for any suggestion.
> -benoit
>
>   #!/usr/sbin/dtrace -s
>
>   pid$1:libc:malloc:entry
>   {
>           self->trace = 1;
>           self->size = arg0;
>   }
>
>   pid$1:libc:malloc:return
>   /self->trace == 1 && self->size >= $2/
>   {
>           printf("Ptr=0x%p Size=%d TS=%d AllocTime=%Y", arg1,
>               self->size, timestamp, walltimestamp);
>           ustack();
>           self->trace = 0;
>           self->size = 0;
>           allocated[arg1] = 1;
>   }

--
Adam Leventhal, Fishworks            http://blogs.sun.com/ahl
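[For reference, the fully specified form Adam suggests would look like
this -- just his suggestion restated as code; whether it resolves the
1.3 error is untested here:]

  /* Name the library by its full object name instead of "libc". */
  pid$1:libc.so.1:malloc:entry
  {
          self->trace = 1;
          self->size = arg0;
  }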
[dtrace-discuss] Network I/O Delays on T5220
I'm working with a colleague on a network latency problem he is having
on a T5220 server. Basically, his server reads some data off the
network interface via sockets from a different hardware device, and
then responds with some data written out to the same network interface.
There is a timing requirement: the reads happen about every 80
milliseconds, and the responding writes need to happen and complete
within about 240 milliseconds. On other hardware (and even on
Solaris 8), typical responses are in the 20 msec range, with no latency
issues at all.

What he is seeing is that the write response happens OK for a while,
then it slows down to the point where the response (the socket write
from the T5220) takes longer than 240 msecs. He used the socketsnoop.d
script from the DTrace Toolkit and made the following observations:

The problem in a nutshell is that there is a message (controls for
hardware) sent about every 80 msec from the T5220/Solaris 10 to an
MVME6100/VxWorks 5.5.1 via TCP/IP over the e1000g1 interface. The
message needs to get to the MVME6100 within 240 msec of being sent or
the signal processor will run out of commands. The process sending the
message is running at a real-time priority, and only one other
controller process has a higher real-time priority.

From a dtrace that monitored fbt:ip:tcp_output:entry, I see 5 socket
writes each time my message is sent:

  TIME              WALLTIME                 UID PID   DIR SIZE  CMD
  499194.679681840  2008 Oct 20 14:23:35.61  0   21542 W   14600 /home/cm/block3/v66_block3/20081015/rdp/bin/rc_radar_sched
  499194.680047812  2008 Oct 20 14:23:35.62  0   21542 W   14600 /home/cm/block3/v66_block3/20081015/rdp/bin/rc_radar_sched
  499194.680408073  2008 Oct 20 14:23:35.62  0   21542 W   14600 /home/cm/block3/v66_block3/20081015/rdp/bin/rc_radar_sched
  499194.680613575  2008 Oct 20 14:23:35.62  0   21542 W   14600 /home/cm/block3/v66_block3/20081015/rdp/bin/rc_radar_sched
  499194.680732307  2008 Oct 20 14:23:35.62  0   21542 W    4160 /home/cm/block3/v66_block3/20081015/rdp/bin/rc_radar_sched

When the problem occurred, I saw only 3 writes, followed by 2
references to sched:

  499194.761630757  2008 Oct 20 14:23:35.70  0   21542 W   14600 /home/cm/block3/v66_block3/20081015/rdp/bin/rc_radar_sched
  499194.761992335  2008 Oct 20 14:23:35.70  0   21542 W   14600 /home/cm/block3/v66_block3/20081015/rdp/bin/rc_radar_sched
  499194.762465836  2008 Oct 20 14:23:35.70  0   21542 W   14600 /home/cm/block3/v66_block3/20081015/rdp/bin/rc_radar_sched
  499194.763077839  2008 Oct 20 14:23:35.70  0   0     W   14600 sched
  499194.763134514  2008 Oct 20 14:23:35.70  0   0     W    4160 sched

The 5 socket write()'s above (four at 14600 bytes, and the fifth at
4160 bytes) from the rc_radar_sched process are normal, and within the
time window we need. But when it takes longer than 240 msecs, we only
see 3 socket write()'s at 14600 bytes from rc_radar_sched, followed by
two socket write()'s from the sched process (the first 14600 bytes, the
second 4160). This is when the delay happens for subsequent write()'s.

What could be going on here? Any more data we should dtrace?

Thanks,
Pat
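[One possible next measurement -- my suggestion, not from the thread:
quantize the gaps between successive tcp_output() calls attributable to
the writer, to see where the stall sits. A sketch, assuming the process
name rc_radar_sched from the output above; note that the writes charged
to "sched" happen in kernel context and would not match an execname
predicate, so they land outside this measurement.]

  BEGIN
  {
          /* Seed with script start time; the first recorded gap is
             measured from here and can be ignored. */
          last = timestamp;
  }

  fbt:ip:tcp_output:entry
  /execname == "rc_radar_sched"/
  {
          @["ns between tcp_output() calls"] =
              quantize(timestamp - last);
          last = timestamp;
  }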
Re: [dtrace-discuss] query re memory profiling
Hi,

> I have a Dtrace script (attached) which examines memory usage, using
> fbt:genunix:anon_resvmem:entry and :return. It seems to work well:
> tested with a small app that malloc'ed 65K, malloc'ed 200M, and
> mmap'ed a 16M file, and it appeared to get things correct.

But is it tested with calls to free()?

> Basically I am looking at an app that seems to wander off the
> reservation in terms of memory usage, so I wanted to rule out any
> bugs in the script.

AFAICT ::anon_unresvmem:entry /self->size > 0/ would never fire,
because self->size is only non-zero during calls to anon_resvmem.
That's probably why there seems to be a memory leak.

> Is there a way to get the stack of the calling app at the point in
> time of the anon_resvmem:entry and :return?

ustack();

Other thoughts: most of those self-> variables (uid, pid, etc.) seem
unnecessary (though harmless), because the values are just as easy to
figure out from the :::return as from the :::entry. Out of curiosity,
why the test for non-NULL pr_psargs? Should unresvmem test for it as
well? It seems like you'd want something like this:

  fbt:genunix:anon_resvmem:entry
  /arg1 != 0/
  {
          self->size = arg0;
          self->args = (char *)curpsinfo->pr_psargs;
  }

  fbt:genunix:anon_resvmem:return
  /self->args != 0 && arg1 != 0/
  {
          printf(...);
          total[pid] += self->size;
          self->args = 0;
  }

  fbt:genunix:anon_unresvmem:entry
  {
          self->size = arg0;
          self->args = (char *)curpsinfo->pr_psargs;
  }

  fbt:genunix:anon_unresvmem:return
  /self->args != 0/
  {
          printf(...);
          total[pid] -= self->size;
          self->args = 0;
  }

Regards,
Ryan
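[Expanding on the terse "ustack();" answer above -- my addition, a
sketch rather than anything from the thread: the user stack can be used
directly as an aggregation key, which gives per-call-site reservation
totals without any printf plumbing.]

  /* Sum reserved bytes by process and user stack at reservation time. */
  fbt:genunix:anon_resvmem:entry
  {
          @resv[execname, ustack()] = sum(arg0);
  }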