Re: [dtrace-discuss] Guidelines for Long Running DTrace Scripts
Hello there... I need to implement something similar, and before starting I thought I would look here first; that is the good part of being in such a community. ;-) Since there is only one reply to Ben's question, and a good one, I wonder if we could work on a prototype together and maybe create a general framework for this, then publish the resulting FMA/DTrace scripts and the processing scripts. I'm thinking of using orca for the plotting, so the steps needed for that would be included too. Ben did not say specifically which DTrace script he wants to run in "daemon" mode, so I will describe my case and ask Ben to do the same, and we can implement it together. I want the following information for each ZFS dataset (FS or VOL), regarding NFS operations, for all NFS servers:

1 - Total requests (reads and writes);
2 - Latency for each operation;
3 - Total sync operations (ZIL);
4 - The spa_sync information as well.

It would be nice to see whether, for "some reason", we get more requests than we can handle... but I don't think that is possible (in the end everything would complete, just with big latency times, I guess). Anyway... obviously we do not need to capture, for example, *all* the NFS operations, but we do need something representative. Maybe aggregate in memory and persist to disk from time to time. I don't know if there is some kind of "timer" in DTrace to activate and deactivate probes.

ps.: A real-time monitor (like Analytics) would be very nice! ;-)

That's it. I await your comments, and thanks a lot for your time!

Leal
[ http://www.eall.com.br ]
--
This message posted from opensolaris.org
___
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org
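On the "timer" question: the profile provider's tick-N probes fire at a fixed interval and are commonly used for exactly this aggregate-and-flush cycle. Below is a minimal, untested sketch of such a collector, assuming the nfsv3 provider from the wiki (probe and argument names per that provider's documentation); the per-path key is only an approximation of "per dataset":

```d
#!/usr/sbin/dtrace -s
/*
 * Sketch of a long-running collector: count NFSv3 reads/writes per
 * file path in memory, and every 60 seconds print and reset the
 * aggregation so a wrapper script can persist the output to disk.
 * The tick-60s probe is the "timer" asked about above.
 */
#pragma D option quiet

nfsv3:::op-read-start,
nfsv3:::op-write-start
{
	/* args[1] is the nfsv3opinfo_t; noi_curpath is the file path */
	@ops[args[1]->noi_curpath, probename] = count();
}

tick-60s
{
	printf("--- %Y ---\n", walltimestamp);
	printa("%s %s %@d\n", @ops);
	trunc(@ops);	/* reset so each block covers one interval */
}
```

Run under nohup or an SMF service with stdout redirected; each 60-second block can then be post-processed offline (e.g. fed to orca for plotting).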
Re: [dtrace-discuss] NFS Block Monitor
FYI (Version 0.3): http://www.eall.com.br/blog/?p=970

Leal
[ http://www.eall.com.br/blog ]

> Hello all...
> I did some tests to understand the behaviour of ZFS and slog (SSD),
> and to understand the workload I implemented a simple piece of
> software to visualize the data blocks (read/write).
> I'm posting the link here in case somebody wants to try it.
> http://www.eall.com.br/blog/?p=906
>
> Thanks a lot for your time.
>
> Leal
> [http://www.eall.com.br/blog]
[dtrace-discuss] NFS Block Monitor
Hello all... I did some tests to understand the behaviour of ZFS and slog (SSD), and to understand the workload I implemented a simple piece of software to visualize the data blocks (read/write). I'm posting the link here in case somebody wants to try it: http://www.eall.com.br/blog/?p=906

Thanks a lot for your time.

Leal
[http://www.eall.com.br/blog]
Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?
No, I'm not thinking about the numbers as "good" or "bad" for now. Because of that little bug in the first script, I'm just trying to work out whether the numbers are plausible at all. ;-) As Max said, the sum of all the I/O times can be greater than the tracing period. The only real problem was the "two days" from the first script; 20 minutes, by contrast, can be "right". Assuming that, I can keep investigating to decide whether those numbers are good or not, and then work to understand/fix them. Thanks a lot for your answers, Jim and Max.

ps.: Max, I just updated my profile, and I'm receiving your mails.

__
Leal
[http://www.eall.com.br/blog]
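Max's point that summed I/O time can exceed the tracing period is just concurrency: if 60 clients each spend 10 seconds waiting during a 10-second trace, the summed wait time is 600 seconds. A hypothetical sketch (not from the wiki scripts) to watch the outstanding-operation depth directly, assuming the nfsv3 provider:

```d
#!/usr/sbin/dtrace -s
/*
 * Track how many NFSv3 reads/writes are outstanding at once.
 * Whenever this exceeds 1, the sum of per-operation latencies can
 * legitimately exceed wall-clock time.  NOTE: the global counter is
 * updated racily across CPUs, so treat the result as an estimate.
 */
#pragma D option quiet

nfsv3:::op-read-start,
nfsv3:::op-write-start
{
	outstanding++;
	@maxconc["max concurrent ops"] = max(outstanding);
}

nfsv3:::op-read-done,
nfsv3:::op-write-done
/outstanding > 0/
{
	outstanding--;
}
```

The aggregation prints automatically on Ctrl-C; a maximum above 1 confirms that per-host totals are sums over overlapping operations, not elapsed time.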
Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?
Sorry, but I do not agree. We are talking about an NFSv3 provider, not about how many CPUs there are in the system. I do not have the knowledge to discuss the implementation details with you, but from a user's point of view, I think those numbers don't make sense. If the number of CPUs matters for the start/done probes of the NFS provider, then I think it will matter for all the other DTrace providers too. Thanks a lot for your answers, Max!

__
Leal
[http://www.eall.com.br/blog]
Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?
OK, but is that a bug, or is it supposed to work like that? Can we not use DTrace on multiprocessor systems? Sorry, but I don't get it...

__
Leal
[http://www.eall.com.br/blog]
Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?
I think (us) is microseconds. There is a division by 1000 in the source code...

Leal
[http://www.eall.com.br/blog]
Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?
> Marcelo Leal wrote:
> > Hello all...
> > Thanks a lot for the answers! I think the problem is "almost" fixed.
> > Every piece of DTrace documentation says to use predicates to
> > guarantee the relation between the start/done probes... Max was the
> > only one paying attention while reading the docs. ;-)
>
> Actually, this is not from reading the docs. It is from being burnt a
> few times by getting "impossible" time values. By the way, I am
> replying to you and to the mailing list, but messages to you are
> getting bounced.

Oh, it seems I need to update my profile. ;-)

> > But I'm still getting weird numbers:
>
> Which numbers don't look right? 3 of your reads took between 2 and 4
> milliseconds, most were between 8 and 16 nanoseconds. 21 writes took
> between 2 and 4 milliseconds, the most time spent doing read/write by
> a host is about 1.2 seconds, and teste/file10 took about 1.1 seconds.
> Looks pretty good to me. (?) I'm curious about what you were expecting
> to see.

The problem is the total numbers... 1267135728 and 1126991407, for example: 21 and 19 minutes in a ten-minute trace. Or am I missing something?

Leal
[http://www.eall.com.br/blog]

> > Wed Dec 10 08:36:33 BRST 2008
> > Wed Dec 10 08:46:55 BRST 2008
> >
> > -- cut here --
> > Tracing... Hit Ctrl-C to end.
> > ^C
> > NFSv3 read/write distributions (us):
> > [distributions and per-host totals trimmed]
> >
> > NFSv3 read/write top 10 files (total us):
> >
> >   /teste/file1    95870303
> >   /teste/file2   104212948
> >   /teste/file3   104311607
> >   /teste/file4   121076447
> >   /teste/file5   137687236
> >   /teste/file6   160895273
> >   /teste/file7   180765880
> >   /teste/file8   198827114
> >   /teste/file9   372380414
> >   /teste/file10 1126991407
> > -- cut here --
> >
> > Max, it will be difficult to disable processors on that machine
> > (production).
>
> Yes. I understand.
> Regards,
> max

> > Thanks again!
> >
> > Leal
> > [http://www.eall.com.br/blog]
Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?
Hello all... Thanks a lot for the answers! I think the problem is "almost" fixed. Every piece of DTrace documentation says to use predicates to guarantee the relation between the start/done probes... Max was the only one paying attention while reading the docs. ;-) But I'm still getting weird numbers:

Wed Dec 10 08:36:33 BRST 2008
Wed Dec 10 08:46:55 BRST 2008

-- cut here --
Tracing... Hit Ctrl-C to end.
^C
NFSv3 read/write distributions (us):

  read
           value  ------------- Distribution ------------- count
               2 |                                         0
               4 |                                         631
               8 |@@@@@@@@@@@@@@@@                         145603
              16 |@@@@@@@@@@@@@@@@@                        155926
              32 |@@                                       15970
              64 |@                                        6111
             128 |                                         942
             256 |                                         372
             512 |                                         883
            1024 |                                         1649
            2048 |                                         1090
            4096 |@                                        8278
            8192 |@@@                                      24605
           16384 |@                                        8868
           32768 |                                         1694
           65536 |                                         304
          131072 |                                         63
          262144 |                                         27
          524288 |                                         31
         1048576 |                                         43
         2097152 |                                         3
         4194304 |                                         0

  write
           value  ------------- Distribution ------------- count
             128 |                                         0
             256 |                                         1083
             512 |@@@@                                     32622
            1024 |@@@@@@@@@                                70353
            2048 |@@@@@@@@@@                               70851
            4096 |@@@@@@                                   47906
            8192 |@@@@@@                                   44898
           16384 |@@@                                      20481
           32768 |@                                        5633
           65536 |                                         1605
          131072 |                                         1339
          262144 |                                         957
          524288 |                                         380
         1048576 |                                         143
         2097152 |                                         21
         4194304 |                                         0

NFSv3 read/write by host (total us):

  x.16.0.x     3647019
  x.16.0.x     8734488
  x.16.0.x    50890034
  x.16.0.x    68852624
  x.16.0.x    70407241
  x.16.0.x    79028592
  x.16.0.x    83013688
  x.16.0.x   104045281
  x.16.0.x   105245138
  x.16.0.x   124286383
  x.16.0.x   154526695
  x.16.0.x   194419023
  x.16.0.x   221794650
  x.16.0.x   259302970
  x.16.0.x   289694542
  x.16.0.x   290207418
  x.16.0.x   487983050
  x.16.0.x  1267135728

NFSv3 read/write top 10 files (total us):

  /teste/file1    95870303
  /teste/file2   104212948
  /teste/file3   104311607
  /teste/file4   121076447
  /teste/file5   137687236
  /teste/file6   160895273
  /teste/file7   180765880
  /teste/file8   198827114
  /teste/file9   372380414
  /teste/file10 1126991407
-- cut here --

Max, it will be difficult to disable processors on that machine (production).

Thanks again!

Leal
[http://www.eall.com.br/blog]
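For reference, the predicate pattern under discussion is the one used by the wiki's nfsv3rwtime.d: stamp each operation at its start probe, keyed by the RPC transaction id, and only compute a latency in the done probe when that start was actually observed. A condensed sketch of that pattern, assuming the nfsv3 provider:

```d
#!/usr/sbin/dtrace -s
/*
 * Start/done predicate pattern: record a start timestamp keyed by
 * the transaction id (noi_xid), and aggregate in the done probe
 * only when the matching start was seen, so unmatched done probes
 * cannot produce "impossible" time values.
 */
#pragma D option quiet

nfsv3:::op-read-start,
nfsv3:::op-write-start
{
	start[args[1]->noi_xid] = timestamp;
}

nfsv3:::op-read-done,
nfsv3:::op-write-done
/start[args[1]->noi_xid] != 0/
{
	/* nanoseconds / 1000 gives the "(us)" seen in the output */
	@rw[probename == "op-read-done" ? "read" : "write"] =
	    quantize((timestamp - start[args[1]->noi_xid]) / 1000);
	start[args[1]->noi_xid] = 0;
}
```

Without the predicate, a done probe firing for an operation that started before tracing began would subtract a zero start time and yield a huge bogus latency.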
Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?
Some of both... ;-) I was investigating a "possible" performance problem, and I'm not sure whether the NFS server is the cause or not. That is when I ran into those weird numbers. I think one thing is not related to the other, but we need to fix whatever the problem is, in the script or the provider, to have confidence in the tool. Apart from those totals, the other latency values seem fine, don't you agree?

Leal
[http://www.eall.com.br/blog]
Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?
Oops, that would be a nice test, but it is something I cannot do. ;-)

[http://www.eall.com.br/blog]
Leal.
Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?
Hello Jim, this is not a benchmark. I changed the filenames for privacy... This is an NFS server, yes.

# uname -a
SunOS test 5.11 snv_89 i86pc i386 i86pc
# cat /etc/release
Solaris Express Community Edition snv_89 X86
Copyright 2008 Sun Microsystems, Inc. All Rights Reserved.
Use is subject to license terms.
Assembled 06 May 2008
Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?
Hello Jim! Actually, I can reproduce it... every time I ran a D script to collect some data, I got some (how do you call it? nasty :) values. Look:

Fri Dec 5 10:19:32 BRST 2008
Fri Dec 5 10:29:34 BRST 2008

NFSv3 read/write distributions (us):

  read
           value  ------------- Distribution ------------- count
               2 |                                         0
               4 |                                         1092
               8 |@@@@@@@@@@@@@@@@@                        93773
              16 |@@@@@@@@@@@@                             64481
              32 |@@                                       11713
              64 |@                                        7590
             128 |                                         1156
             256 |                                         698
             512 |                                         1394
            1024 |                                         1729
            2048 |                                         805
            4096 |@                                        2732
            8192 |@@@                                      14893
           16384 |@@                                       9351
           32768 |@                                        2988
           65536 |                                         647
          131072 |                                         119
          262144 |                                         29
          524288 |                                         30
         1048576 |                                         28
         2097152 |                                         0

  write
           value  ------------- Distribution ------------- count
              64 |                                         0
             128 |                                         8
             256 |                                         2418
             512 |@@@                                      22679
            1024 |@@@@                                     28442
            2048 |@@@@@@@@                                 59887
            4096 |@@@@@@@@@                                68852
            8192 |@@@@@@@@@                                65152
           16384 |@@@@                                     32224
           32768 |@@                                       11554
           65536 |                                         3162
          131072 |                                         1100
          262144 |                                         446
          524288 |                                         105
         1048576 |                                         70
         2097152 |                                         11
         4194304 |                                         0
         8388608 |                                         0
        16777216 |                                         0
        33554432 |                                         0
        67108864 |                                         0
       134217728 |                                         0
       268435456 |                                         0
       536870912 |                                         0
      1073741824 |                                         0
      2147483648 |                                         0
      4294967296 |                                         0
      8589934592 |                                         0
     17179869184 |                                         0
     34359738368 |                                         0
     68719476736 |                                         0
    137438953472 |                                         11
    274877906944 |                                         0

NFSv3 read/write by host (total us):

  x.16.0.x        4707246
  x.16.0.x       28397213
  x.16.0.x       40901275
  x.16.0.x       68333664
  x.16.0.x       89357734
  x.16.0.x      125890329
  x.16.0.x      127848295
  x.16.0.x      132248305
  x.16.0.x      135161278
  x.16.0.x      138579146
  x.16.0.x      146275507
  x.16.0.x      156761509
  x.16.0.x      566154684
  x.16.0.x   185948455950
  x.16.0.x   186184056503
  x.16.0.x   186341816343
  x.16.0.x  1488962592532
Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?
36 hours... ;-))

Leal.
Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?
Hello Jim,

-- cut here --
Thu Dec 4 19:08:39 BRST 2008
Thu Dec 4 19:18:02 BRST 2008
-- cut here --

NFSv3 read/write distributions (us):

  read
           value  ------------- Distribution ------------- count
               2 |                                         0
               4 |@@@@                                     22108
               8 |@@@@@@@@@@@@@@@                          80611
              16 |@@@@@@@@@@@@                             66331
              32 |@@                                       11497
              64 |@                                        4939
             128 |                                         979
             256 |                                         727
             512 |                                         788
            1024 |                                         1663
            2048 |                                         496
            4096 |@                                        3389
            8192 |@@@                                      14518
           16384 |@                                        4856
           32768 |                                         742
           65536 |                                         119
          131072 |                                         38
          262144 |                                         9
          524288 |                                         25
         1048576 |                                         7
         2097152 |                                         0

  write
           value  ------------- Distribution ------------- count
              64 |                                         0
             128 |                                         55
             256 |@@                                       8750
             512 |@@@@@@@@@@@@@@                           52926
            1024 |@@@@@@@@@                                34370
            2048 |@@@@@@@                                  24610
            4096 |@@@                                      12136
            8192 |@@@                                      10819
           16384 |@                                        4181
           32768 |                                         1198
           65536 |                                         811
          131072 |                                         793
          262144 |                                         278
          524288 |                                         26
         1048576 |                                         2
         2097152 |                                         0
         4194304 |                                         0
         8388608 |                                         0
        16777216 |                                         0
        33554432 |                                         0
        67108864 |                                         0
       134217728 |                                         0
       268435456 |                                         0
       536870912 |                                         0
      1073741824 |                                         0
      2147483648 |                                         0
      4294967296 |                                         0
      8589934592 |                                         0
     17179869184 |                                         0
     34359738368 |                                         0
     68719476736 |                                         1
    137438953472 |                                         0

NFSv3 read/write by host (total us):

  x.16.0.x       1987595
  x.16.0.x       2588201
  x.16.0.x      20370903
  x.16.0.x      21400116
  x.16.0.x      25208119
  x.16.0.x      28874221
  x.16.0.x      32523821
  x.16.0.x      41103342
  x.16.0.x      43934153
  x.16.0.x      51819379
  x.16.0.x      57477455
  x.16.0.x      57679165
  x.16.0.x      59575938
  x.16.0.x      95072158
  x.16.0.x     305615207
  x.16.0.x     349252742
  x.16.0.x  131175486635

NFSv3 read/write top 10 files (total us):

  /teste/file1  29942610
  /teste/file2  32180289
  /teste
Re: [dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?
Hello,

> Are you referring to nfsv3rwsnoop.d?
>
> The TIME(us) value from that script is not a latency measurement,
> it's just a time stamp.
>
> If you're referring to a different script, let us know specifically
> which script.

Sorry, when I wrote "latency" I assumed you would know I was talking about the "nfsv3rwtime.d" script. Sorry... I mean, that is the script on the wiki page for seeing latencies. The "NFSv3 read/write by host (total us):" and "NFSv3 read/write top 10 files (total us):" sections are showing those numbers... Thanks a lot for your answer!

Leal.

> /jim
>
> Marcelo Leal wrote:
> > Hello there,
> > In ten minutes of tracing (latency), using the NFS DTrace script
> > from the nfsv3 provider wiki page, I got total numbers (us) like:
> > 131175486635
> > ???
> >
> > thanks!
[dtrace-discuss] Is the nfs dtrace script right (from nfsv3 provider wiki)?
Hello there, in ten minutes of tracing (latency), using the NFS DTrace script from the nfsv3 provider wiki page, I got total numbers (us) like: 131175486635 ???

thanks!