Re: [dtrace-discuss] Is there a way to clear Dtrace arrays?

2008-12-10 Thread Adam Leventhal
> Ignore my request. Even if there was a way to clear an array, there  
> is no function that allows me to print out that array. Only an  
> aggregation does that.

True, and I'd argue that an aggregation is probably the right data structure
for what you're doing.

Adam

--
Adam Leventhal, Fishworks                        http://blogs.sun.com/ahl

___
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org


Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-10 Thread Marcelo Leal
Hello all...
 Thanks a lot for the answers! I think the problem is "almost" fixed. All the
DTrace documentation says to use predicates to guarantee the relation between
the start/done probes... Max was the only one paying attention when reading the
docs. ;-)
 But I'm still getting weird numbers:

Wed Dec 10 08:36:33 BRST 2008
Wed Dec 10 08:46:55 BRST 2008

-- cut here --
Tracing... Hit Ctrl-C to end.
^C
NFSv3 read/write distributions (us):

  read
           value  - Distribution - count
               2 | 0
               4 | 631
               8 | 145603
              16 | 155926
              32 | 15970
              64 | 6111
             128 | 942
             256 | 372
             512 | 883
            1024 | 1649
            2048 | 1090
            4096 | 8278
            8192 | 24605
           16384 | 8868
           32768 | 1694
           65536 | 304
          131072 | 63
          262144 | 27
          524288 | 31
         1048576 | 43
         2097152 | 3
         4194304 | 0

  write
           value  - Distribution - count
             128 | 0
             256 | 1083
             512 | 32622
            1024 | 70353
            2048 | 70851
            4096 | 47906
            8192 | 44898
           16384 | 20481
           32768 | 5633
           65536 | 1605
          131072 | 1339
          262144 | 957
          524288 | 380
         1048576 | 143
         2097152 | 21
         4194304 | 0

NFSv3 read/write by host (total us):

  x.16.0.x        3647019
  x.16.0.x        8734488
  x.16.0.x       50890034
  x.16.0.x       68852624
  x.16.0.x       70407241
  x.16.0.x       79028592
  x.16.0.x       83013688
  x.16.0.x      104045281
  x.16.0.x      105245138
  x.16.0.x      124286383
  x.16.0.x      154526695
  x.16.0.x      194419023
  x.16.0.x      221794650
  x.16.0.x      259302970
  x.16.0.x      289694542
  x.16.0.x      290207418
  x.16.0.x      487983050
  x.16.0.x     1267135728

NFSv3 read/write top 10 files (total us):

  /teste/file1       95870303
  /teste/file2      104212948
  /teste/file3      104311607
  /teste/file4      121076447
  /teste/file5      137687236
  /teste/file6      160895273
  /teste/file7      180765880
  /teste/file8      198827114
  /teste/file9      372380414
  /teste/file10    1126991407
-- cut here --
 Max, it will be difficult to disable processors on that machine (production).
 Thanks again!

 Leal
[http://www.eall.com.br/blog]
-- 
This message posted from opensolaris.org

Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-10 Thread [EMAIL PROTECTED]
Marcelo Leal wrote:
>> Marcelo Leal wrote:
>>> Hello all...
>>>  Thanks a lot for the answers! I think the problem is "almost" fixed.
>>> Every dtrace documentation says to use predicates to guarantee the
>>> relation between the start/done probes... Max was the only one paying
>>> attention reading the docs. ;-)
>>>  But i'm still getting weird numbers:
>>
>> Which numbers don't look right?  3 of your reads took between 2 and 4
>> milliseconds, most were between 8 and 16 nanoseconds.  21 writes took
>> between 2 and 4 milliseconds, the most amount of time spent doing
>> read/write by host is about 1.2 seconds, and teste/file10 took about
>> 1.1 seconds.
>> Looks pretty good to me.(?).  I'm curious about what you were expecting
>> to see.
>
>  The problem is the total numbers...
>  1267135728 and 1126991407, for example.
>  21 and 19 minutes in a ten minutes trace.
>  Or am i missing something?
>
When I do the arithmetic, I get about 1.2 seconds for the first number,
and 1.1 seconds for the second number.  These numbers are in 
nanoseconds, no?
So, 1267135728/(1000*1000*1000) = 1.267... seconds.
max


Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-10 Thread Marcelo Leal
> Marcelo Leal wrote:
> > Hello all...
> >  Thanks a lot for the answers! I think the problem is "almost" fixed.
> > Every dtrace documentation says to use predicates to guarantee the
> > relation between the start/done probes... Max was the only one paying
> > attention reading the docs. ;-)
> >
> Actually, this is not from reading the docs.  It is from being burnt a
> few times by getting "impossible" time values.  By the way, I am replying
> to you and to the mailing list, but messages to you are getting bounced.

 Oh, it seems I need to update my profile. ;-)

> >  But i'm still getting weird numbers:
> >
> Which numbers don't look right?  3 of your reads took between 2 and 4
> milliseconds, most were between 8 and 16 nanoseconds.  21 writes took
> between 2 and 4 milliseconds, the most amount of time spent doing
> read/write by host is about 1.2 seconds, and teste/file10 took about
> 1.1 seconds.
> Looks pretty good to me.(?).  I'm curious about what you were expecting
> to see.
 
 The problem is the totals...
 1267135728 and 1126991407, for example.
 That is 21 and 19 minutes in a ten-minute trace.
 Or am I missing something?

 Leal
[http://www.eall.com.br/blog]

> >  Max, will be difficult disable processors on that machine (production).
> >
> Yes.  I understand.
> Regards,
> max
>
> >  Thanks again!
> >
> >  Leal
> > [http://www.eall.com.br/blog]


Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-10 Thread [EMAIL PROTECTED]
Marcelo Leal wrote:
> Hello all...
>  Thanks a lot for the answers! I think the problem is "almost" fixed. Every
> dtrace documentation says to use predicates to guarantee the relation between
> the start/done probes... Max was the only one paying attention reading the
> docs. ;-)
>
Actually, this is not from reading the docs.  It is from being burnt a
few times by getting "impossible" time values.  By the way, I am replying
to you and to the mailing list, but messages to you are getting bounced.

>  But i'm still getting weird numbers:
>
Which numbers don't look right?  3 of your reads took between 2 and 4
milliseconds, most were between 8 and 16 nanoseconds.  21 writes took
between 2 and 4 milliseconds, the most amount of time spent doing
read/write by host is about 1.2 seconds, and teste/file10 took about
1.1 seconds.
Looks pretty good to me.(?).  I'm curious about what you were expecting
to see.


Re: [dtrace-discuss] Round four: Re: code review req: 6750659 drti.o crashes app due to corrupt environment

2008-12-10 Thread Roland Mainz
Mike Gerdts wrote:
> 
> I believe that I have incorporated all of the feedback given
> (thanks!).  Changes since the 2008-11-16 version include:
> 
> - ksh style & coding standards compliance in test script (Roland)
> - "dof_init_debug == B_FALSE" vs. "!dof_init_debug" (Adam)
> 
> The updated webrev is at:
> 
> http://cr.opensolaris.org/~mgerdts/6750659-2008-12-03/

Quick review (patch code is quoted with "> "):
-- snip --
[snip]
> +
> +PATH=/usr/bin:/usr/sbin:$PATH

Is "export" not needed in this case, e.g. shouldn't subprocesses inherit
PATH, too ?

> +if (( $# != 1 )); then
> +   print -u2 'expected one argument: '
> +   exit 2
> +fi
> +
> +#
> +# jdtrace does not implement the -h option that is required to generate
> +# C header files.
> +#
> +if [[ "$1" == */jdtrace ]]; then
> +   exit 0

Is the zero return code (= "success") Ok in this case ?

> +fi
> +
> +dtrace="$1"
> +startdir="$PWD"
> +dir=$(mktemp -td drtiXX)
> +if (( $? != 0 )); then
> +   print -u2 'Could not create safe temporary directory'
> +   exit 2
> +fi
> +
> +cd "$dir"
> +
> +cat > Makefile <<EOF
> +all: main
> +
> +main: main.o prov.o
> +   \$(CC) -o main main.o prov.o
> +
> +main.o: main.c prov.h
> +   \$(CC) -c main.c
> +
> +prov.h: prov.d
> +   $dtrace -h -s prov.d
> +
> +prov.o: prov.d main.o
> +   $dtrace -G -32 -s prov.d main.o
> +EOF
> +
> +cat > prov.d <<EOF
> +provider tester {
> +   probe entry();
> +};
> +EOF
> +
> +cat > main.c <<EOF
> +#include 
> +#include 
> +#include "prov.h"
> +
> +int
> +main(int argc, char **argv, char **envp)
> +{
> +   envp[0] = (char*)0xff;
> +   TESTER_ENTRY();
> +   return 0;

ISO C defines |EXIT_SUCCESS| (it's defined as |#define EXIT_SUCCESS
(0)|) ... I don't know whether it's better or worse in this case...

> +}
> +EOF
> +
> +make > /dev/null
> +status=$?
> +if (( $status != 0 )) ; then

Umpf... please remove the extra '$' - see
http://www.opensolaris.org/os/project/shell/shellstyle/#avoid_unneccesary_string_number_conversions
-- snip --

The remaining stuff looks good... :-)



Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) [EMAIL PROTECTED]
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 3992797
 (;O/ \/ \O;)


Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-10 Thread Marcelo Leal
I think (us) is microseconds. There is one division by "1000" in the source 
code...
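
For reference, the unit question can be checked with a few lines of arithmetic. The sketch below is in Python and is not part of the wiki script; it assumes the totals really are microseconds, as the "(us)" label suggests, and uses the largest per-host and per-file totals from the posted output. The helper name `to_minutes` is made up for illustration.

```python
# Largest per-host and per-file totals from the posted output.
TOTALS = {"top host": 1267135728, "/teste/file10": 1126991407}


def to_minutes(value, units_per_second):
    """Convert a raw total to minutes, given how many units make one second."""
    return value / units_per_second / 60.0


for name, total in TOTALS.items():
    as_us = to_minutes(total, 1_000_000)       # reading the total as microseconds
    as_ns = to_minutes(total, 1_000_000_000)   # reading the total as nanoseconds
    print(f"{name}: {as_us:.1f} min if us, {as_ns * 60:.2f} s if ns")
```

Read as microseconds, the totals come out at roughly 21 and 19 minutes; read as nanoseconds, they would be a little over one second each, which is where the two interpretations in this thread diverge.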

 Leal
[http://www.eall.com.br/blog]


Re: [dtrace-discuss] Round four: Re: code review req: 6750659 drti.o crashes app due to corrupt environment

2008-12-10 Thread Adam Leventhal
I'd like to put an end to this discussion. Mike has done enough work,  
and these tests are more than up to the task. If there are further  
comments on the code, that's fine, but we can suffer through tests  
that are functionally correct, but stylistically less than perfect.

Adam


--
Adam Leventhal, Fishworks                        http://blogs.sun.com/ahl



Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-10 Thread [EMAIL PROTECTED]
Hi Marcelo,

Marcelo Leal wrote:
> I think (us) is microseconds. There is one division by "1000" on the source 
> code...
>
>   
Oops.  You're right.  I did not see that.  (That might explain
the 4-8 nanosecond I/Os, which I did think seemed pretty fast.
They are actually 4-8 microseconds.)  So, you want to know how
you can spend 19 or 20 minutes in a 10 minute trace?
You have multiple CPUs, so each CPU can be working in parallel
on different I/O requests.  If you have 8 CPUs, the average time
spent by each CPU would be about 2.5 minutes.  This does sound a little
high to me, but not extreme.  If you got 80 minutes, I would be
concerned that all CPUs are working on requests all the time.
It might be difficult to correlate per CPU, as I suspect the start and
done probes for a given I/O could fire on different CPUs.

max
>  Leal
> [http://www.eall.com.br/blog]
>   



Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-10 Thread Marcelo Leal
OK, but is that a bug, or is it supposed to work like that?
 Can we not use DTrace on multiprocessor systems?
 Sorry, but I don't get it...
__
 Leal
[http://www.eall.com.br/blog]


Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-10 Thread [EMAIL PROTECTED]
Hi Marcelo,

Marcelo Leal wrote:
> Ok, but that is a bug, or should work like that? 
>  We can not use dtrace on multiple processors systems?
>  Sorry, but i don't get it...
>   
I don't consider this a bug.  I think it depends on what you are trying
to measure.
The script you are using measures latency for read/write operations
across all cpus.  There is nothing wrong with the sum of the times
being longer than the total time you are tracing, if there are multiple
cpus.
If you wanted to measure latency per cpu,
the script would need to be changed.

So, what are you trying to measure?
I think more interesting is to find out why some I/O operations
take longer than others.  For this, you can use speculative tracing.
Trace all function entries and returns, but only commit if the time
spent is longer than the longest time found thus far.

I'm sure others have ideas about this...

max

> __
>  Leal
> [http://www.eall.com.br/blog]
>   



Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-10 Thread Marcelo Leal
Sorry, but I do not agree.
 We are talking about an NFSv3 provider, and not about how many CPUs there are
on the system. I do not have the knowledge to discuss the implementation
details with you, but from a user's point of view, I think those numbers
don't make sense. If the number of CPUs matters for the start/done probes of
the NFS provider, I think it will for all other DTrace providers.
 Thanks a lot for your answers, Max!

__
Leal
[http://www.eall.com.br/blog]


Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-10 Thread Jim Mauro
No bug here - we can absolutely use DTrace on MP systems,
reliably and with confidence.

The script output shows some nasty outliers for a small percentage
of the reads and writes happening on the server. Time to take a closer
look at the IO subsystem. I'd start with "iostat -znx 1", and see what
the queues look like, IO rates, and service times.

I'd also have a look at the network. Download nicstat and run it
(go to blogs.sun.com and search for "nicstat" - it's easy to find).

What are you using for an on-disk file system? UFS or ZFS?

/jim



Marcelo Leal wrote:
> Ok, but that is a bug, or should work like that? 
>  We can not use dtrace on multiple processors systems?
>  Sorry, but i don't get it...
> __
>  Leal
> [http://www.eall.com.br/blog]
>   


Re: [dtrace-discuss] Printing the main script responsible for writing a file

2008-12-10 Thread Jonathan Adams
On Mon, Nov 24, 2008 at 06:03:44AM -0800, debabrata das wrote:
> Hi all,
> 
> I am new to dtrace, but I have an urgent requirement. I would like to know the 
> name of the program which is responsible for updating a file. I have written the 
> following small program:
> 
> #!/usr/sbin/dtrace -s
> 
> syscall::open*:entry
> /pid == $1/
> {
>     printf("%s %s %d", execname, copyinstr(arg0), pid);
> }
> 
> Now if I run this then I get the following output:
> 
> -bash-3.00$ ./text.d | grep deba
> dtrace: script './text.d' matched 2 probes
>   8402 open64:entry bash /tmp/deba.txt 18786
> 
> Now I want to know which script is updating this file. I would like to 
> see the output of "ptree 18786". I am not sure how to get this structure of ptree 
> or the name of the script.

you want curpsinfo->pr_psargs;  it's a string which contains the string that
"ptree" outputs, which is the "argv" array, separated by spaces, truncated
to 80 characters.
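
As a rough illustration of what that string looks like, here is a small Python model of the description above (the argv entries joined by spaces, truncated to 80 characters). The function name `psargs` and the example command lines are made up for illustration; in a D script the value would simply be read from `curpsinfo->pr_psargs`.

```python
PSARGS_LIMIT = 80  # truncation length quoted in the reply above


def psargs(argv, limit=PSARGS_LIMIT):
    """Model of the described string: argv joined by spaces, cut at `limit` characters."""
    return " ".join(argv)[:limit]


# Hypothetical command line for a script that writes /tmp/deba.txt.
print(psargs(["/bin/bash", "./update_report.sh", "/tmp/deba.txt"]))

# A long argument list is cut off at the limit, so trailing arguments are lost.
print(len(psargs(["/usr/bin/perl"] + ["--option=value"] * 20)))
```

Because of the truncation, a script name that appears late in a long command line may not be visible in the string.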

Cheers,
- jonathan



[dtrace-discuss] How to tell FPU activity on a CMT processor? Which tool, what to look at?

2008-12-10 Thread Paul Clayton
Okay, I admit to confusion and a certain amount of defeat.

I have numerous T2000  boxes and I want to see how much floating point activity 
is taking place on them in order to tell if I should move applications to a T2 
based box.

So I go looking around.

First up is the cooltst application from 
http://cooltools.sunsource.net/cooltst/ which I try to download. All I can get 
is a Zip file instead of the cooltst_v3.tar.bz2 file mentioned in the download 
section of the page. So the question here is how/where to get a version of 
cooltst that runs on a T2000?

Next up is other mechanisms to get FPU info.

So I go searching the web and find Darryl Gove's web page, 
http://blogs.sun.com/d/entry/using_dtrace_to_locate_floating, that talks about 
using kstat and dtrace to look for 'fpu_unfinished_traps'. So I use both the kstat 
and dtrace options that Darryl mentions, and both come up with zero (0) 
counts! 

So I start wondering about the count actually being zero and go looking some 
more.

Then I find http://www.informit.com/articles/article.aspx?p=1161980&seqNum=3 
and read through section 4.3.8, where it mentions "One of the metrics that kstat 
reports is the number of emulated floating-point instructions. Not all 
floating-point operations are performed in hardware; some have been left to 
software.".

So now I am wondering if there are any pure hardware-based FPU transactions 
taking place that would NOT show up in the 'fpu_unfinished_traps' counts on a 
CMT T1 chip based server? And if so, how do I find out about those operations?

So, the question is, lacking a cooltst that is usable on a T2000, what is the 
full story on floating point math on a CMT, and where/how can data about its 
usage be gotten?

My thanks for any and all help in this!

pdc


[dtrace-discuss] dtrace (1M) manpage/ docs seem incorrect re 32/64 bit

2008-12-10 Thread David Holmes
The dtrace(1M) manpage seems to be incorrect, it states:

   -32 | -64

 The D compiler produces programs using the native data model of the
 operating system kernel. You can use the isainfo -b command to
 determine the current operating system data model. If the -32 option
 is specified, dtrace forces the D compiler to compile a D program
 using the 32-bit data model. If the -64 option is specified, dtrace
 forces the D compiler to compile a D program using the 64-bit data
 model. These options are typically not required as dtrace selects the
 native data model as the default. The data model affects the sizes of
 integer types and other language properties. D programs compiled for
 either data model can be executed on both 32-bit and 64-bit kernels.
 The -32 and -64 options also determine the ELF file format (ELF32 or
 ELF64) produced by the -G option.

If I read this correctly, then doing "dtrace -G -o foo ..." will default 
to 32-bit on a 32-bit system and 64-bit on a 64-bit system. That does not 
seem to be the case, however: it always seems to do a 32-bit build if -64 is 
not specified.

Further, on systems where bug 6456626 has been fixed, -G will try to guess 
whether to do a 32-bit or 64-bit compile, so the docs need to be updated to 
reflect this as well.

Cheers,
David Holmes


Re: [dtrace-discuss] How to tell FPU activity on a CMT processor? Which tool, what to look at?

2008-12-10 Thread Rayson Ho
On 12/11/08, Paul Clayton <[EMAIL PROTECTED]> wrote:
> Next up is other mechanisms to get FPU info.
>
> So I go searching the web and find Darryl Gove's web page of 
> http://blogs.sun.com/d/entry/using_dtrace_to_locate_floating that talks to 
> using kstat and dtrace to look for 'fpu_unfinished_traps'. So I use both 
> kstat and dtrace options that Darryl mentions and both options come up with 
> zero (0) counts!


Take a look at this doc:

http://www.sun.com/blueprints/1205/819-5144.pdf


> So now I am wondering if there is any pure hardware based FPU transactions 
> taking place that would NOT show up in the 'fpu_unfinished_traps' counts on a 
> CMT T1 chip based server? And if so, how do I find out about those operations?

You can read the full micro architecture of the T1, specifically
"OpenSPARC T1 Micro Architecture Specification", chapter 7:
Floating-Point Unit.

http://www.opensparc.net/opensparc-t1/index.html

Rayson


>
> So, the question is, lacking cooltst that I can get which is usable on a 
> T2000, what is the full story on floating point math on a CMT and where/how 
> can data about it's usage be gotten?
>
> My thanks for any and all help in this!
>
> pdc


Re: [dtrace-discuss] How to tell FPU activity on a CMT processor? Which tool, what to look at?

2008-12-10 Thread Clayton, Paul D
Rayson..

My thanks for the pointers to the two documents. Very interesting
reading. 

For the emulated instructions, I now have my answers to tell what is
happening. In the case currently under investigation the only emulation
is FSQRTD, and that stands at 884K but is not currently growing, so some
app other than the one in question uses FSQRTD.

But in reading the T1 specification, there appears to be a lot of FPU
'stuff' that takes place within the single instance FPU with no
emulation. 

So part of my question remains. 

How can I find out about FPU usage for non-emulated instructions on a T1
chipset?

I have to wonder about the command "/usr/sbin/cpustat -c
pic0=FP_instr_cnt,pic1=Instr_cnt,sys 1 10" and if this would deal with
hardware based FPU usage or is another option for the emulated
functions? The document does not provide a clear hint on this subject.

Take care.

pdc



Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-10 Thread [EMAIL PROTECTED]
Hi Marcelo,

Marcelo Leal wrote:
> Sorry, but i do not agree.
>  We are talking about a NFSv3 provider, and not about how many cpu's there 
> are on the system. I do not have the knowledge to discuss with you the 
> aspects about the implementation, but as a user point of view, i think that 
> numbers don't make sense. If the fact that the number of cpu's is important 
> for the start/done for the NFS provider, i think it will for all other dtrace 
> providers. 
>   
I have found that almost all "bugs" with dtrace are either in the scripting
or in the interpretation of the output.  The mechanism used to implement
dtrace is quite simple, and for the script you are running, I don't believe
you are hitting any bug in the implementation of dtrace.  Also, using dtrace
to examine the system is like using a microscope.  You need to know the
big picture first, before you can determine what you need to trace.
Otherwise you can end up getting a lot of detail about something that
has nothing to do with the problem you are experiencing.  In this instance,
as Jim says, iostat will give you a better view of the big picture.

As for the 20 minutes total time spent for I/O in a script running 10
minutes, this could happen even on a single CPU.  For example, you start
the script and immediately 10 I/O requests are made.  Now, let's say all
10 requests are queued (i.e., block, waiting for some resource).  If they
all finish just before the end of the 10 minute period, the total elapsed
time would be about 100 minutes.  So long as the total time divided by the
number of operations does not exceed 10 minutes, the output is reasonable.
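
The point about overlapping requests can be made concrete with a small simulation. This Python sketch is not the wiki script; it only models what happens when per-request latencies are summed while the requests overlap in wall-clock time. The list of (start, done) pairs, in minutes, is a hypothetical stand-in for the start/done probe firings.

```python
def summed_latency(requests):
    """Sum of (done - start) over all requests, as a summing aggregation would report."""
    return sum(done - start for start, done in requests)


def wall_clock(requests):
    """Elapsed time from the first start to the last completion."""
    return max(done for _, done in requests) - min(start for start, _ in requests)


# Ten requests issued at t=0 that all complete at t=10 minutes (the scenario above).
requests = [(0.0, 10.0)] * 10

total = summed_latency(requests)   # accumulated latency across all requests
elapsed = wall_clock(requests)     # how long the trace actually ran
print(total, elapsed, total / len(requests), total / elapsed)
```

The last figure, summed latency divided by elapsed time, is the average number of requests in flight. Assuming the two timestamps in the original post bracket the trace (about 622 seconds), the top host's total of roughly 1267 seconds would correspond to about two outstanding requests on average.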

I suspect the outliers (the I/Os that are taking 2-4 seconds) are due to
queuing at the disk driver, but they could be due to scheduling as well.
(I would have to look at exactly when the start and done probes fire to
determine this.  But fortunately, source code is available to determine
this.)  As I mentioned in an earlier post, speculative tracing can be used
to determine the code path taken for the longest I/O.  I have written a
script that might work, but have no way to test it at this time.
If you are interested, I'll post it.
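
To read the outlier rows, it helps to recall that each row of a quantize() distribution is a power-of-two bucket: the printed value is the lower bound, and the next row's value is the upper bound. Below is a small Python sketch of that bucketing for positive values, assuming the latencies are in microseconds; the helper names are illustrative only.

```python
def quantize_bucket(value):
    """Lower bound of the power-of-two bucket holding a positive integer value."""
    if value < 1:
        raise ValueError("this sketch only handles positive values")
    return 1 << (value.bit_length() - 1)


def bucket_range_seconds(lower_us):
    """Bucket [lower, 2 * lower) in microseconds, expressed in seconds."""
    return lower_us / 1e6, 2 * lower_us / 1e6


# A hypothetical 3-second I/O lands in the 2097152 row of the posted output.
row = quantize_bucket(3_000_000)
print(row, bucket_range_seconds(row))
```

Under the microsecond assumption, the 2097152 row covers roughly 2.1 to 4.2 seconds, which matches the "2-4 seconds" outliers mentioned above.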

Jim, as for taking this offline, that is ok for you, but my posts to
Marcelo are bouncing...

max


>  Thanks a lot for your answers Max!
>
> __
> Leal
> [http://www.eall.com.br/blog]
>   
