Re: [dtrace-discuss] Guidelines for Long Running DTrace Scripts

2009-03-28 Thread Marcelo Leal
Hello there...
 I need to implement something similar, and before starting I thought I would
look here first. That is the good thing about being part of such a community. ;-)
 As there is only one reply to Ben's question, and a good one, I wonder if we
could work on a prototype together and maybe create a general framework for
this, then publish the resulting FMA/DTrace scripts and the processing scripts.
I'm thinking of using Orca to do the plotting, so the necessary steps for that
would be included too.
 Ben did not say specifically which DTrace script he wants to run in "daemon"
mode. So I will explain my case, and ask Ben if he can do the same, so we can
implement it together.
 I want the following information for each ZFS dataset (FS or VOL), regarding
NFS operations, for all NFS servers:

 1 - Total requests (reads and writes);
 2 - Latency for each operation;
 3 - Total sync operations (ZIL);
 4 - And the spa_sync information too.

 It would be nice to see whether, for "some reason", we have more requests than
we can handle... but I don't think that is possible (in the end they would all
complete, just with big latency times, I guess). Anyway...
 
 Obviously, we do not need to capture *all* the NFS operations, for example, but
we do need something representative. Maybe aggregate in memory, and persist to
disk from time to time. I don't know if there is some kind of "timer" in DTrace
to activate and deactivate probes.
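
 On the "timer" question: DTrace's profile provider has tick-N probes (for
example tick-10s) that fire at a fixed interval, and they are commonly used to
print and reset aggregations periodically. The "aggregate in memory, persist
from time to time" idea can also be sketched on the processing side. The
following is only a minimal sketch under assumptions: the class name, the
dataset names and the CSV layout are all hypothetical, and a StringIO buffer
stands in for a file opened in append mode.

```python
import csv
import io
from collections import defaultdict


class LatencyAggregator:
    """Keeps per-(dataset, operation) counters in memory between flushes."""

    def __init__(self):
        self.count = defaultdict(int)
        self.total_us = defaultdict(int)

    def record(self, dataset, op, latency_us):
        key = (dataset, op)
        self.count[key] += 1
        self.total_us[key] += latency_us

    def flush(self, out, timestamp):
        """Write one CSV row per key to `out`, then reset the counters."""
        writer = csv.writer(out)
        for key in sorted(self.count):
            dataset, op = key
            n = self.count[key]
            # Columns: timestamp, dataset, operation, requests, mean latency (us)
            writer.writerow([timestamp, dataset, op, n, self.total_us[key] // n])
        self.count.clear()
        self.total_us.clear()


agg = LatencyAggregator()
agg.record("pool/fs1", "read", 120)    # hypothetical samples
agg.record("pool/fs1", "read", 80)
agg.record("pool/vol1", "write", 900)

buf = io.StringIO()                    # stands in for an append-mode file
agg.flush(buf, timestamp=600)
print(buf.getvalue(), end="")
```

 Calling flush() from a periodic timer keeps memory bounded and leaves one row
per dataset and operation per interval, which is a shape a plotting tool can
consume.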

 ps.: But a real-time monitor (like Analytics) would be very nice! ;-)

 That's it. I await your comments, and thanks a lot for your time!

 Leal
[ http://www.eall.com.br ]
-- 
This message posted from opensolaris.org
___
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org


Re: [dtrace-discuss] NFS Block Monitor

2009-01-24 Thread Marcelo Leal
FYI (Version 0.3):

http://www.eall.com.br/blog/?p=970

Leal
[ http://www.eall.com.br/blog ]

> Hello all..
> I did some tests to understand the behaviour of ZFS
> and slog (SSD), and to understand the workload I
> implemented a simple program to visualize the
> data blocks (read/write).
> I'm posting the link here in case somebody wants
> to try it.
>  http://www.eall.com.br/blog/?p=906
> 
>  Thanks a lot for your time.
> 
>  Leal
> [http://www.eall.com.br/blog]


[dtrace-discuss] NFS Block Monitor

2009-01-12 Thread Marcelo Leal
Hello all..
 I did some tests to understand the behaviour of ZFS and slog (SSD), and to
understand the workload I implemented a simple program to visualize the data
blocks (read/write).
 I'm posting the link here in case somebody wants to try it.
 http://www.eall.com.br/blog/?p=906

 Thanks a lot for your time.

 Leal
[http://www.eall.com.br/blog]


Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-11 Thread Marcelo Leal
No, I'm not thinking about the numbers as "good" or "bad" for now. Because of
that little bug in the first script, I'm just trying to work out whether the
numbers are OK. ;-)
 Like Max said, the total time of all the I/Os can be greater than the tracing
period. The only problem was the "two" days in the first script's output, but
now 20 minutes can be "right".
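
 A small worked example of that point: when several operations are in flight at
the same time, their latencies add up to more than the wall-clock window. The
(start, end) times below are made up for illustration only.

```python
# Hypothetical (start, end) times in seconds for three NFS operations
# that were in flight at the same time inside a 10-second trace window.
ops = [(0.0, 8.0), (1.0, 9.0), (2.0, 10.0)]

window = max(end for _, end in ops) - min(start for start, _ in ops)
total_latency = sum(end - start for start, end in ops)

print(f"trace window:   {window:.1f} s")
print(f"summed latency: {total_latency:.1f} s")
```

 Here three overlapping 8-second operations sum to 24 seconds inside a
10-second window, so a per-host or per-file total larger than the trace period
is not impossible by itself.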
 Assuming that, I can continue investigating to work out whether those numbers
are good or not, and work to understand/fix them.
 Thanks a lot for your answers, Jim and Max.
 
 ps.: Max, I just updated my profile, and I'm receiving your mails.

 __
Leal
[http://www.eall.com.br/blog]


Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-10 Thread Marcelo Leal
Sorry, but I do not agree.
 We are talking about an NFSv3 provider, not about how many CPUs there are on
the system. I do not have the knowledge to discuss the implementation details
with you, but from a user's point of view, I think those numbers don't make
sense. If the number of CPUs matters for the start/done probes of the NFS
provider, I think it will matter for all the other DTrace providers too.
 Thanks a lot for your answers, Max!

__
Leal
[http://www.eall.com.br/blog]


Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-10 Thread Marcelo Leal
OK, but is that a bug, or is it supposed to work like that?
 Can we not use DTrace on multiprocessor systems?
 Sorry, but I don't get it...
__
 Leal
[http://www.eall.com.br/blog]


Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-10 Thread Marcelo Leal
I think (us) is microseconds. There is a division by 1000 in the source
code...
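
 To make the units concrete, here is a small sketch of that conversion and of
how a value lands in a power-of-two row of the distribution. It assumes the
elapsed time starts out as an integer number of nanoseconds; the function names
and the sample value are hypothetical.

```python
def ns_to_us(elapsed_ns):
    """Integer division, as in `elapsed / 1000` on integer nanoseconds."""
    return elapsed_ns // 1000


def quantize_bucket(value):
    """Lower bound of the power-of-two bucket that holds a positive value."""
    return 1 << (value.bit_length() - 1)


elapsed_ns = 12_345_678              # hypothetical elapsed time in nanoseconds
elapsed_us = ns_to_us(elapsed_ns)    # microseconds after the division by 1000
print(elapsed_us, quantize_bucket(elapsed_us))
```

 With these inputs the operation took 12345 us and would be counted in the 8192
row, which covers values from 8192 up to (but not including) 16384.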

 Leal
[http://www.eall.com.br/blog]


Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-10 Thread Marcelo Leal
> Marcelo Leal wrote:
> > Hello all...
> >  Thanks a lot for the answers! I think the problem is "almost" fixed.
> > Every dtrace documentation says to use predicates to guarantee the
> > relation between the start/done probes... Max was the only one paying
> > attention reading the docs. ;-)
> >
> Actually, this is not from reading the docs.  It is from being burnt a
> few times by getting "impossible" time values.  By the way, I am
> replying to you and to the mailing list, but messages to you are
> getting bounced.

 Oh, it seems I need to update my profile. ;-)

> >  But i'm still getting weird numbers:
> >
> Which numbers don't look right?  3 of your reads took between 2 and 4
> milliseconds, most were between 8 and 16 nanoseconds.  21 writes took
> between 2 and 4 milliseconds, the most amount of time spent doing
> read/write by host is about 1.2 seconds, and teste/file10 took about
> 1.1 seconds.  Looks pretty good to me.(?).  I'm curious about what you
> were expecting to see.
 
 The problem is the total numbers...
 1267135728 and 1126991407, for example.
 That is 21 and 19 minutes in a ten-minute trace.
 Or am I missing something?
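
 The arithmetic behind those figures, as a short sketch. The two totals and the
two time stamps are the ones from the trace output; the dictionary labels are
only for illustration, and the totals are taken to be in microseconds as the
"(total us)" headings say.

```python
from datetime import datetime

# Start and end stamps printed around the trace (both are BRST, same day).
start = datetime(2008, 12, 10, 8, 36, 33)
end = datetime(2008, 12, 10, 8, 46, 55)
window_s = (end - start).total_seconds()

# Largest "by host" total and largest "top 10 files" total, in microseconds.
totals_us = {"busiest host": 1267135728, "/teste/file10": 1126991407}

print(f"trace window: {window_s / 60:.1f} min")
for name, us in totals_us.items():
    print(f"{name}: {us / 1e6 / 60:.1f} min")
```

 The window is 622 seconds (about 10.4 minutes), while the two totals come to
about 21.1 and 18.8 minutes.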

 Leal
[http://www.eall.com.br/blog]

> 
> > Wed Dec 10 08:36:33 BRST 2008
> > Wed Dec 10 08:46:55 BRST 2008
> >
> > [... quoted nfsv3rwtime.d output snipped; same as in the original message ...]
> >
> 
> >  Max, will be difficult disable processors on that machine (production).
> >   
> Yes.  I understand. 
> Regards,
> max
> 
> >  Thanks again!
> >
> >  Leal
> > [http://www.eall.com.br/blog]
> >   
> 


Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-10 Thread Marcelo Leal
Hello all...
 Thanks a lot for the answers! I think the problem is "almost" fixed. All the
DTrace documentation says to use predicates to guarantee the relation between
the start/done probes... Max was the only one paying attention when reading the
docs. ;-)
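
 A small simulation of why the start/done relation matters. This is only a
sketch of one way "impossible" values can appear, not a claim about the actual
cause here: it assumes start times are stored in a table keyed by a request id
and that an entry which was never set reads as 0, as an unset D variable does.
The event tuples, ids and timestamps are hypothetical.

```python
def total_elapsed(events, guarded):
    """Sum done-minus-start times over (kind, xid, timestamp_ns) events."""
    start = {}
    total = 0
    for kind, xid, ts in events:
        if kind == "start":
            start[xid] = ts
        elif kind == "done":
            began = start.get(xid, 0)      # unset entries read as 0
            if guarded and began == 0:
                continue                   # no matching start was seen: skip
            total += ts - began
            start.pop(xid, None)
    return total


# Hypothetical timestamps in nanoseconds. Request 7 started before tracing
# began, so only its "done" event is observed.
events = [
    ("done", 7, 5_000_000_000_000),
    ("start", 8, 5_000_000_100_000),
    ("done", 8, 5_000_000_350_000),
]

print(total_elapsed(events, guarded=False))  # includes one bogus huge value
print(total_elapsed(events, guarded=True))   # only the matched pair
```

 Without the guard, the unmatched "done" contributes its whole timestamp to the
sum; with the guard, only the 250000 ns of the matched pair is counted.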
 But I'm still getting weird numbers:

Wed Dec 10 08:36:33 BRST 2008
Wed Dec 10 08:46:55 BRST 2008

-- cut here --
Tracing... Hit Ctrl-C to end.
^C
NFSv3 read/write distributions (us):

  read
           value  ------------- Distribution ------------- count
               2 |                                         0
               4 |                                         631
               8 |@@@@@@@@@@@@@@@@                         145603
              16 |@@@@@@@@@@@@@@@@@                        155926
              32 |@@                                       15970
              64 |@                                        6111
             128 |                                         942
             256 |                                         372
             512 |                                         883
            1024 |                                         1649
            2048 |                                         1090
            4096 |@                                        8278
            8192 |@@@                                      24605
           16384 |@                                        8868
           32768 |                                         1694
           65536 |                                         304
          131072 |                                         63
          262144 |                                         27
          524288 |                                         31
         1048576 |                                         43
         2097152 |                                         3
         4194304 |                                         0

  write
           value  ------------- Distribution ------------- count
             128 |                                         0
             256 |                                         1083
             512 |@@@@                                     32622
            1024 |@@@@@@@@@                                70353
            2048 |@@@@@@@@@@                               70851
            4096 |@@@@@@                                   47906
            8192 |@@@@@@                                   44898
           16384 |@@@                                      20481
           32768 |@                                        5633
           65536 |                                         1605
          131072 |                                         1339
          262144 |                                         957
          524288 |                                         380
         1048576 |                                         143
         2097152 |                                         21
         4194304 |                                         0

NFSv3 read/write by host (total us):

  x.16.0.x         3647019
  x.16.0.x         8734488
  x.16.0.x        50890034
  x.16.0.x        68852624
  x.16.0.x        70407241
  x.16.0.x        79028592
  x.16.0.x        83013688
  x.16.0.x       104045281
  x.16.0.x       105245138
  x.16.0.x       124286383
  x.16.0.x       154526695
  x.16.0.x       194419023
  x.16.0.x       221794650
  x.16.0.x       259302970
  x.16.0.x       289694542
  x.16.0.x       290207418
  x.16.0.x       487983050
  x.16.0.x      1267135728

NFSv3 read/write top 10 files (total us):

  /teste/file1   95870303
  /teste/file2  104212948
  /teste/file3  104311607
  /teste/file4  121076447
  /teste/file5  137687236
  /teste/file6  160895273
  /teste/file7  180765880
  /teste/file8  198827114
  /teste/file9  372380414
  /teste/file10 1126991407
-- cut here --
 Max, it will be difficult to disable processors on that machine (production).
 Thanks again!

 Leal
[http://www.eall.com.br/blog]

Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-09 Thread Marcelo Leal
A bit of both... ;-)
 I was investigating a "possible" performance problem, and I'm not sure whether
it is in the NFS server or not.
 So I ran into those weird numbers. I think one thing is not related to the
other, but we need to fix whatever the problem is with the script or the
provider, to have confidence in the tool. Apart from those numbers, the other
latency values seem to be fine, don't you agree?

 Leal
[http://www.eall.com.br/blog]


Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-09 Thread Marcelo Leal
Oops, that would be a nice test, but it is something I cannot do. ;-)

[http://www.eall.com.br/blog]

 Leal.


Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-09 Thread Marcelo Leal
Hello Jim, this is not a benchmark. I changed the filenames for privacy...
 This is an NFS server, yes.

# uname -a
SunOS test 5.11 snv_89 i86pc i386 i86pc

# cat /etc/release
   Solaris Express Community Edition snv_89 X86
   Copyright 2008 Sun Microsystems, Inc.  All Rights Reserved.
Use is subject to license terms.
  Assembled 06 May 2008


Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-09 Thread Marcelo Leal
Hello Jim!
 Actually I can repeat it... every time I ran a D script to collect some data
I got some (what do you call them? nasty :) values. Look:

Fri Dec  5 10:19:32 BRST 2008
Fri Dec  5 10:29:34 BRST 2008


NFSv3 read/write distributions (us):

  read
           value  ------------- Distribution ------------- count
               2 |                                         0
               4 |                                         1092
               8 |@@@@@@@@@@@@@@@@@                        93773
              16 |@@@@@@@@@@@@                             64481
              32 |@@                                       11713
              64 |@                                        7590
             128 |                                         1156
             256 |                                         698
             512 |                                         1394
            1024 |                                         1729
            2048 |                                         805
            4096 |@                                        2732
            8192 |@@@                                      14893
           16384 |@@                                       9351
           32768 |@                                        2988
           65536 |                                         647
          131072 |                                         119
          262144 |                                         29
          524288 |                                         30
         1048576 |                                         28
         2097152 |                                         0

  write
           value  ------------- Distribution ------------- count
              64 |                                         0
             128 |                                         8
             256 |                                         2418
             512 |@@@                                      22679
            1024 |@@@@                                     28442
            2048 |@@@@@@@@                                 59887
            4096 |@@@@@@@@@                                68852
            8192 |@@@@@@@@@                                65152
           16384 |@@@@                                     32224
           32768 |@@                                       11554
           65536 |                                         3162
          131072 |                                         1100
          262144 |                                         446
          524288 |                                         105
         1048576 |                                         70
         2097152 |                                         11
         4194304 |                                         0
         8388608 |                                         0
        16777216 |                                         0
        33554432 |                                         0
        67108864 |                                         0
       134217728 |                                         0
       268435456 |                                         0
       536870912 |                                         0
      1073741824 |                                         0
      2147483648 |                                         0
      4294967296 |                                         0
      8589934592 |                                         0
     17179869184 |                                         0
     34359738368 |                                         0
     68719476736 |                                         0
    137438953472 |                                         11
    274877906944 |                                         0

NFSv3 read/write by host (total us):

  x.16.0.x         4707246
  x.16.0.x        28397213
  x.16.0.x        40901275
  x.16.0.x        68333664
  x.16.0.x        89357734
  x.16.0.x       125890329
  x.16.0.x       127848295
  x.16.0.x       132248305
  x.16.0.x       135161278
  x.16.0.x       138579146
  x.16.0.x       146275507
  x.16.0.x       156761509
  x.16.0.x       566154684
  x.16.0.x    185948455950
  x.16.0.x    186184056503
  x.16.0.x    186341816343
  x.16.0.x   1488962592532

Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-08 Thread Marcelo Leal
36 hours... ;-))

 Leal.


Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-08 Thread Marcelo Leal
Hello Jim,
-- cut here --
Qui Dez  4 19:08:39 BRST 2008
Qui Dez  4 19:18:02 BRST 2008

-- cut here --
NFSv3 read/write distributions (us):

  read
           value  ------------- Distribution ------------- count
               2 |                                         0
               4 |@@@@                                     22108
               8 |@@@@@@@@@@@@@@@                          80611
              16 |@@@@@@@@@@@@                             66331
              32 |@@                                       11497
              64 |@                                        4939
             128 |                                         979
             256 |                                         727
             512 |                                         788
            1024 |                                         1663
            2048 |                                         496
            4096 |@                                        3389
            8192 |@@@                                      14518
           16384 |@                                        4856
           32768 |                                         742
           65536 |                                         119
          131072 |                                         38
          262144 |                                         9
          524288 |                                         25
         1048576 |                                         7
         2097152 |                                         0

  write
           value  ------------- Distribution ------------- count
              64 |                                         0
             128 |                                         55
             256 |@@                                       8750
             512 |@@@@@@@@@@@@@@                           52926
            1024 |@@@@@@@@@                                34370
            2048 |@@@@@@@                                  24610
            4096 |@@@                                      12136
            8192 |@@@                                      10819
           16384 |@                                        4181
           32768 |                                         1198
           65536 |                                         811
          131072 |                                         793
          262144 |                                         278
          524288 |                                         26
         1048576 |                                         2
         2097152 |                                         0
         4194304 |                                         0
         8388608 |                                         0
        16777216 |                                         0
        33554432 |                                         0
        67108864 |                                         0
       134217728 |                                         0
       268435456 |                                         0
       536870912 |                                         0
      1073741824 |                                         0
      2147483648 |                                         0
      4294967296 |                                         0
      8589934592 |                                         0
     17179869184 |                                         0
     34359738368 |                                         0
     68719476736 |                                         1
    137438953472 |                                         0

NFSv3 read/write by host (total us):

  x.16.0.x         1987595
  x.16.0.x         2588201
  x.16.0.x        20370903
  x.16.0.x        21400116
  x.16.0.x        25208119
  x.16.0.x        28874221
  x.16.0.x        32523821
  x.16.0.x        41103342
  x.16.0.x        43934153
  x.16.0.x        51819379
  x.16.0.x        57477455
  x.16.0.x        57679165
  x.16.0.x        59575938
  x.16.0.x        95072158
  x.16.0.x       305615207
  x.16.0.x       349252742
  x.16.0.x    131175486635

NFSv3 read/write top 10 files (total us):

  /teste/file1   29942610
  /teste/file2   32180289
  /teste

Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-08 Thread Marcelo Leal
Hello,

> Are you referring to nfsv3rwsnoop.d?
> 
> The TIME(us) value from that script is not a latency
> measurement,
> it's just a time stamp.
> 
> If you're referring to a different script, let us
> know specifically
> which script.

 Sorry, when I wrote "latency", I assumed you would know that I was talking
about the "nfsv3rwtime.d" script. Sorry... I mean, that is the script on the
wiki page for looking at the latencies.
 The sections:
 "NFSv3 read/write by host (total us):"
 and
 "NFSv3 read/write top 10 files (total us):"

 are showing those numbers...

 Thanks a lot for your answer!

 Leal.
> 
> /jim
> 
> 
> Marcelo Leal wrote:
> > Hello there,
> >  Ten minutes of trace (latency), using the nfs dtrace script from
> > nfsv3 provider wiki page, i got total numbers (us) like:
> >  131175486635
> >   ???
> >
> >  thanks!
> >   


[dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?

2008-12-04 Thread Marcelo Leal
Hello there,
 After ten minutes of tracing (latency), using the NFS DTrace script from the
nfsv3 provider wiki page, I got total numbers (us) like:
 131175486635
  ???

 thanks!