Re: [zfs-discuss] How to find out the zpool of an uberblock printed with the fbt:zfs:uberblock_update: probes?

2009-01-09 Thread Bernd Finger
Marcelo,

I just finished writing up my test results. Hopefully it will answer 
most of your questions. You can find it on my blog at the following permalink:

http://blogs.sun.com/blogfinger/entry/zfs_and_the_uberblock_part

Regards,

Bernd

Marcelo Leal wrote:
>> Marcelo,
>  Hello there... 
>> I did some more tests.
> 
> You are getting very useful information with your tests. Thanks a lot!!
> 
>> I found that not each uberblock_update() is also
>> followed by a write to 
>> the disk (although the txg is increased every 30
>> seconds for each of the 
>> three zpools of my 2008.11 system). In these cases,
>> ub_rootbp.blk_birth 
>> stays at the same value while txg is incremented by
>> 1.
>   Are you sure about that? I mean, what I could understand from the 
> on-disk format is that there is a 1:1 correlation between txg, creation time, 
> and uberblock. Each time there is a write to the pool, we have another "state" 
> of the filesystem. Actually, we just need another valid uberblock when we 
> change the filesystem state (write to it). 
>  
>> But each sync command on the OS level is followed by
>> a 
>> vdev_uberblock_sync() directly after the
>> uberblock_update() and then by 
>> four writes to the four uberblock copies (one per
>>  copy) on disk.
>  Hmm, maybe the uberblock_update is not really important in our discussion... 
> ;-)
>  
>> And a change to one or more files in any pool during
>> the 30 seconds 
>> interval is also followed by a vdev_uberblock_sync()
>> of that pool at the 
>> end of the interval.
> 
>  So, what is the uberblock_update? 
>> So on my system (a web server) during time when there
>> is enough activity 
>> that each uberblock_update() is followed by
>> vdev_uberblock_sync(),
>>
>> I get:
>>   2 writes per minute (*60)
> 
>  I'm totally lost... 2 writes per minute?
> 
>> =  120 writes per hour (*24)
>> = 2880 writes per day
>> but only each 128th time to the same block ->
>> = 22.5 writes to the same block on the drive per day.
>>
>> If we take the lower number of max. writes in the
>> referenced paper which 
>> is 10.000, we get 10.000/22.5 = 444.4 days or one
>> year and 79 days.
>>
>> For 100.000, we get 4444.4 days or more than 12
>> years.
> 
>  Ok, but I think the number is 10.000. 100.000 would require static wear 
> leveling, and that is a non-trivial implementation for USB pen drives, right?
>> During times without http access to my server, only
>> about each 5th to 
>> 10th uberblock_update() is followed by
>> vdev_uberblock_sync() for rpool, 
>> and much less for the two data pools, which means
>> that the corresponding 
>> uberblocks on disk will be skipped for writing (if I
>> did not overlook 
>> anything), and the device will likely be worn out
>> later.
>  I need to know what uberblock_update is... it seems not to be related to 
> txg, sync of disks, labels, anything... ;-) 
> 
>  Thanks a lot Bernd.
> 
>  Leal
> [http://www.eall.com.br/blog]
>> Regards,
>>
>> Bernd
>>
>> Marcelo Leal wrote:
>>> Hello Bernd,
>>>  Now I see your point... ;-)
>>>  Well, following some "very simple" math:
>>>
>>>  - One txg every 5 seconds = 17280/day;
>>>  - Each txg writing 1MB (L0-L3) = 17GB/day
>>>  
>>>  In the paper the math was 10 years = (2.7 * the
>>> size of the USB drive) writes per day, right? 
>>>  So, for a 4GB drive, that would be ~10GB/day. Then, just
>>> the label updates would make our USB drive live for 5
>>> years... and if each txg updates 5MB of data, our
>>> drive would live for just a year.
>>>  Help, I'm not good with numbers... ;-)
>>>
>>>  Leal
>>> [http://www.eall.com.br/blog]



Re: [zfs-discuss] How to find out the zpool of an uberblock printed with the fbt:zfs:uberblock_update: probes?

2009-01-07 Thread Marcelo Leal
> Marcelo,
 Hello there... 
> 
> I did some more tests.

You are getting very useful information with your tests. Thanks a lot!!

> 
> I found that not each uberblock_update() is also
> followed by a write to 
> the disk (although the txg is increased every 30
> seconds for each of the 
> three zpools of my 2008.11 system). In these cases,
> ub_rootbp.blk_birth 
> stays at the same value while txg is incremented by
> 1.
  Are you sure about that? I mean, what I could understand from the 
on-disk format is that there is a 1:1 correlation between txg, creation time, 
and uberblock. Each time there is a write to the pool, we have another "state" 
of the filesystem. Actually, we just need another valid uberblock when we 
change the filesystem state (write to it). 
 
> 
> But each sync command on the OS level is followed by
> a 
> vdev_uberblock_sync() directly after the
> uberblock_update() and then by 
> four writes to the four uberblock copies (one per
>  copy) on disk.
 Hmm, maybe the uberblock_update is not really important in our discussion... 
;-)
 
> And a change to one or more files in any pool during
> the 30 seconds 
> interval is also followed by a vdev_uberblock_sync()
> of that pool at the 
> end of the interval.

 So, what is the uberblock_update? 
> 
> So on my system (a web server) during time when there
> is enough activity 
> that each uberblock_update() is followed by
> vdev_uberblock_sync(),
> 
> I get:
>   2 writes per minute (*60)

 I'm totally lost... 2 writes per minute?

> =  120 writes per hour (*24)
> = 2880 writes per day
> but only each 128th time to the same block ->
> = 22.5 writes to the same block on the drive per day.
> 
> If we take the lower number of max. writes in the
> referenced paper which 
> is 10.000, we get 10.000/22.5 = 444.4 days or one
> year and 79 days.
> 
> For 100.000, we get 4444.4 days or more than 12
> years.

 Ok, but I think the number is 10.000. 100.000 would require static wear 
leveling, and that is a non-trivial implementation for USB pen drives, right?
> 
> During times without http access to my server, only
> about each 5th to 
> 10th uberblock_update() is followed by
> vdev_uberblock_sync() for rpool, 
> and much less for the two data pools, which means
> that the corresponding 
> uberblocks on disk will be skipped for writing (if I
> did not overlook 
> anything), and the device will likely be worn out
> later.
 I need to know what uberblock_update is... it seems not to be related to txg, 
sync of disks, labels, anything... ;-) 

 Thanks a lot Bernd.

 Leal
[http://www.eall.com.br/blog]
> 
> Regards,
> 
> Bernd
> 
> Marcelo Leal wrote:
> > Hello Bernd,
> >  Now I see your point... ;-)
> >  Well, following some "very simple" math:
> > 
> >  - One txg every 5 seconds = 17280/day;
> >  - Each txg writing 1MB (L0-L3) = 17GB/day
> >  
> >  In the paper the math was 10 years = (2.7 * the
> > size of the USB drive) writes per day, right? 
> >  So, for a 4GB drive, that would be ~10GB/day. Then, just
> > the label updates would make our USB drive live for 5
> > years... and if each txg updates 5MB of data, our
> > drive would live for just a year.
> >  Help, I'm not good with numbers... ;-)
> > 
> >  Leal
> > [http://www.eall.com.br/blog]
> 


Re: [zfs-discuss] How to find out the zpool of an uberblock printed with the fbt:zfs:uberblock_update: probes?

2009-01-07 Thread Bernd Finger
Marcelo,

I did some more tests.

I found that not each uberblock_update() is also followed by a write to 
the disk (although the txg is increased every 30 seconds for each of the 
three zpools of my 2008.11 system). In these cases, ub_rootbp.blk_birth 
stays at the same value while txg is incremented by 1.

But each sync command on the OS level is followed by a 
vdev_uberblock_sync() directly after the uberblock_update() and then by 
  four writes to the four uberblock copies (one per copy) on disk.

And a change to one or more files in any pool during the 30 seconds 
interval is also followed by a vdev_uberblock_sync() of that pool at the 
end of the interval.

So on my system (a web server) during time when there is enough activity 
that each uberblock_update() is followed by vdev_uberblock_sync(),

I get:
  2 writes per minute (*60)
=  120 writes per hour (*24)
= 2880 writes per day
but only each 128th time to the same block ->
= 22.5 writes to the same block on the drive per day.

If we take the lower number of max. writes in the referenced paper which 
is 10.000, we get 10.000/22.5 = 444.4 days or one year and 79 days.

For 100.000, we get 4444.4 days or more than 12 years.

During times without http access to my server, only about each 5th to 
10th uberblock_update() is followed by vdev_uberblock_sync() for rpool, 
and much less for the two data pools, which means that the corresponding 
uberblocks on disk will be skipped for writing (if I did not overlook 
anything), and the device will likely be worn out later.
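A minimal DTrace sketch for quantifying this (an illustration, not from the original posting; it assumes the fbt:zfs probes for uberblock_update() and vdev_uberblock_sync() are available as in the traces in this thread, which can be checked with dtrace -l):

/* Count calls to uberblock_update() and vdev_uberblock_sync()
   in 10-minute intervals; comparing the two counters during idle
   and busy periods shows how often an uberblock update actually
   leads to on-disk uberblock writes. */
fbt:zfs:uberblock_update:entry,
fbt:zfs:vdev_uberblock_sync:entry
{
        @calls[probefunc] = count();
}

tick-600s
{
        printa(@calls);
        trunc(@calls);
}

During busy periods the two counts should track each other closely; a large gap indicates txgs whose uberblocks were not written out.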

Regards,

Bernd

Marcelo Leal wrote:
> Hello Bernd,
>  Now I see your point... ;-)
>  Well, following some "very simple" math:
> 
>  - One txg every 5 seconds = 17280/day;
>  - Each txg writing 1MB (L0-L3) = 17GB/day
>  
>  In the paper the math was 10 years = (2.7 * the size of the USB drive) 
> writes per day, right? 
>  So, for a 4GB drive, that would be ~10GB/day. Then, just the label updates 
> would make our USB drive live for 5 years... and if each txg updates 5MB of 
> data, our drive would live for just a year.
>  Help, I'm not good with numbers... ;-)
> 
>  Leal
> [http://www.eall.com.br/blog]



Re: [zfs-discuss] How to find out the zpool of an uberblock printed with the fbt:zfs:uberblock_update: probes?

2009-01-07 Thread Marcelo Leal
Hello Bernd,
 Now I see your point... ;-)
 Well, following some "very simple" math:

 - One txg every 5 seconds = 17280/day;
 - Each txg writing 1MB (L0-L3) = 17GB/day
 
 In the paper the math was 10 years = (2.7 * the size of the USB drive) writes 
per day, right? 
 So, for a 4GB drive, that would be ~10GB/day. Then, just the label updates would 
make our USB drive live for 5 years... and if each txg updates 5MB of data, our 
drive would live for just a year.
 Help, I'm not good with numbers... ;-)
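A rough cross-check of the paper's figure (a sketch, not part of the original posting; it assumes ideal wear leveling and a 10-year horizon of 3650 days): writing 2.7 times the device size per day for 10 years corresponds to about 2.7 * 3650, i.e. roughly 10.000 full overwrites of the device, which matches the lower of the two write-cycle figures discussed in this thread.

/* DTrace has no floating point, so compute 2.7 * 3650 in tenths. */
BEGIN
{
        printf("full-device overwrites in 10 years: ~%d\n", (27 * 3650) / 10);
        exit(0);
}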

 Leal
[http://www.eall.com.br/blog]


Re: [zfs-discuss] How to find out the zpool of an uberblock printed with the fbt:zfs:uberblock_update: probes?

2009-01-07 Thread Bernd Finger
Marcelo,

The problem I mentioned is the limited number of write cycles 
of flash memory chips. The following document, published in June 2007 by 
a USB flash drive vendor, says the guaranteed number of write cycles for 
their USB flash drives is between 10.000 and 100.000:

http://www.corsairmemory.com/_faq/FAQ_flash_drive_wear_leveling.pdf

Some vendors (including the one mentioned above) state that their drives 
use wear leveling, so that write activity is distributed over a larger 
area and the same cells are not overwritten again and again.

Regards,

Bernd

Marcelo Leal wrote:
>> Hi,
> 
>  Hello Bernd,
> 
>> After I published a blog entry about installing
>> OpenSolaris 2008.11 on a 
>> USB stick, I read a comment about a possible issue
>> with wearing out 
>> blocks on the USB stick after some time because ZFS
>> overwrites its 
>> uberblocks in place.
>  I did not understand well what you are trying to say with "wearing out 
> blocks", but in fact the uberblocks are not overwritten in place. The pattern 
> you noticed with the dtrace script is the update of the uberblock, which is 
> maintained in an array of 128 elements (1K each, just one active at a time). 
> Each physical vdev has four labels (256K structures) L0, L1, L2, and L3: two 
> at the beginning and two at the end.
>  Because the labels are at fixed locations on disk, this is the only update 
> where zfs does not use COW, but a two-staged update. IIRC, the update is L0 
> and L2, and after that L1 and L3.
>  Take a look:
> 
>  
> http://cvs.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/vdev_label.c
> 
>  So:
>  - The label is overwritten (in a two-staged update);
>  - The uberblock is not overwritten, but written to a new element of the 
> array. So the transition from one uberblock (txg and timestamp) to another is 
> atomic.
> 
>  I'm deploying a USB solution too, so if you can clarify the problem, I would 
> appreciate it. 
> 
> ps.: I did look at your blog, but did not see any comments about that, and the 
> comments section is closed. ;-)
> 
>  Leal
> [http://www.eall.com.br/blog]
> 


Re: [zfs-discuss] How to find out the zpool of an uberblock printed with the fbt:zfs:uberblock_update: probes?

2009-01-06 Thread Marcelo Leal
> Hi,

 Hello Bernd,

> 
> After I published a blog entry about installing
> OpenSolaris 2008.11 on a 
> USB stick, I read a comment about a possible issue
> with wearing out 
> blocks on the USB stick after some time because ZFS
> overwrites its 
> uberblocks in place.
 I did not understand well what you are trying to say with "wearing out 
blocks", but in fact the uberblocks are not overwritten in place. The pattern 
you noticed with the dtrace script is the update of the uberblock, which is 
maintained in an array of 128 elements (1K each, just one active at a time). Each 
physical vdev has four labels (256K structures) L0, L1, L2, and L3: two at the 
beginning and two at the end.
 Because the labels are at fixed locations on disk, this is the only update where 
zfs does not use COW, but a two-staged update. IIRC, the update is L0 and L2, and 
after that L1 and L3.
 Take a look:

 
http://cvs.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/vdev_label.c

 So:
 - The label is overwritten (in a two-staged update);
 - The uberblock is not overwritten, but written to a new element of the 
array. So the transition from one uberblock (txg and timestamp) to another is 
atomic.
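To make the layout concrete, here is a small sketch (not from the original posting; the constants are taken from the ZFS on-disk format document referenced elsewhere in this thread and assume 512-byte sectors): each 256K label carries a 128-slot uberblock array starting at offset 128K, 1K per slot, and the active slot is txg modulo 128.

/* Map a txg to its uberblock slot and on-disk location.
   For txg 26747 (from the trace quoted further down): slot 26747 % 128 = 123,
   label offset 128K + 123 * 1K = 251K, i.e. 512-byte block 502 in label L0
   and block 512 + 502 = 1014 in label L1, matching the "sched 502 1024" and
   "sched 1014 1024" writes in that trace. */
BEGIN
{
        this->txg  = 26747;
        this->slot = this->txg % 128;
        this->off  = (128 * 1024) + (this->slot * 1024);
        printf("txg %d -> slot %d -> label offset %d (block %d in L0, block %d in L1)\n",
            this->txg, this->slot, this->off,
            this->off / 512, (256 * 1024) / 512 + this->off / 512);
        exit(0);
}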

 I'm deploying a USB solution too, so if you can clarify the problem, I would 
appreciate it. 

ps.: I did look at your blog, but did not see any comments about that, and the 
comments section is closed. ;-)

 Leal
[http://www.eall.com.br/blog]

> 
> I tried to get more information about how updating
> uberblocks works with 
> the following dtrace script:
> 
> /* io:genunix::start */
> io:genunix:default_physio:start,
> io:genunix:bdev_strategy:start,
> io:genunix:biodone:done
> {
> printf ("%d %s %d %d", timestamp, execname,
>  args[0]->b_blkno, 
> args[0]->b_bcount);
> }
> 
> fbt:zfs:uberblock_update:entry
> {
> printf ("%d (%d) %d, %d, %d, %d, %d, %d, %d, %d",
>  timestamp,
>  args[0]->ub_timestamp,
>  args[0]->ub_rootbp.blk_prop, args[0]->ub_guid_sum,
> args[0]->ub_rootbp.blk_birth,
>  args[0]->ub_rootbp.blk_fill,
> args[1]->vdev_id, args[1]->vdev_asize,
>  args[1]->vdev_psize,
>  args[2]);
> }
> 
> The output shows the following pattern after most of
> the 
> uberblock_update events:
> 
> 0  34404 uberblock_update:entry 244484736418912
>  (1231084189) 
> 9226475971064889345, 4541013553469450828, 26747, 159,
> 0, 0, 0, 26747
> 0   6668bdev_strategy:start 244485190035647
>  sched 502 1024
> 0   6668bdev_strategy:start 244485190094304
>  sched 1014 1024
> 0   6668bdev_strategy:start 244485190129133
>  sched 39005174 1024
> 0   6668bdev_strategy:start 244485190163273
>  sched 39005686 1024
> 0   6656  biodone:done 244485190745068
>  sched 502 1024
> 0   6656  biodone:done 244485191239190
>  sched 1014 1024
> 0   6656  biodone:done 244485191737766
>  sched 39005174 1024
> 0   6656  biodone:done 244485192236988
>  sched 39005686 1024
> ...
> 0  34404   uberblock_update:entry
>  244514710086249 
> (1231084219) 9226475971064889345, 4541013553469450828,
> 26747, 159, 0, 0, 
> 0, 26748
> 0  34404   uberblock_update:entry
>  244544710086804 
> (1231084249) 9226475971064889345, 4541013553469450828,
> 26747, 159, 0, 0, 
> 0, 26749
> ...
> 0  34404   uberblock_update:entry
>  244574740885524 
> (1231084279) 9226475971064889345, 4541013553469450828,
> 26750, 159, 0, 0, 
> 0, 26750
> 0   6668 bdev_strategy:start 244575189866189
>  sched 508 1024
> 0   6668 bdev_strategy:start 244575189926518
>  sched 1020 1024
> 0   6668 bdev_strategy:start 244575189961783
>  sched 39005180 1024
> 0   6668 bdev_strategy:start 244575189995547
>  sched 39005692 1024
> 0   6656   biodone:done 244575190584497
>  sched 508 1024
> 0   6656   biodone:done 244575191077651
>  sched 1020 1024
> 0   6656   biodone:done 244575191576723
>  sched 39005180 1024
> 0   6656   biodone:done 244575192077070
>  sched 39005692 1024
> I am not a dtrace or zfs expert, but to me it looks
> like in many cases, 
> an uberblock update is followed by a write of 1024
> bytes to four 
> different disk blocks. I also found that the four
> block numbers are 
> incremented with always even numbers (256, 258, 260,
> ...) 127 times and 
> then the first block is written again. Which would
> mean that for a txg 
> of 50.000, the four uberblock copies have been written
> 50.000/127=393 
> times (Correct?).
> 
> What I would like to find out is how to access fields
> from arg1 (this is 
> the data of type vdev in:
> 
> int uberblock_update(uberblock_t *ub, vdev_t *rvd,
> uint64_t txg)
> 
> ). When using the fbt:zfs:uberblock_update:entry
> probe, its elements are 
> always 0, as you can see in the above output. When
> using the 
> fbt:zfs:uberblock_update:return probe, I am getting
> an error message 
> like the following:
> 
> dtrace: failed to compile script
> zfs-uberblock-report-04.d: line 14: 
> operator -> must be applied to a pointer
> 
> Any idea how to access the fields of vdev, or how to print out the pool 
> name associated to an uberblock_update event?

Re: [zfs-discuss] How to find out the zpool of an uberblock printed with the fbt:zfs:uberblock_update: probes?

2009-01-04 Thread Richard Elling
Bernd Finger wrote:
> Hi,
>
> After I published a blog entry about installing OpenSolaris 2008.11 on a 
> USB stick, I read a comment about a possible issue with wearing out 
> blocks on the USB stick after some time because ZFS overwrites its 
> uberblocks in place.
>
> I tried to get more information about how updating uberblocks works with 
> the following dtrace script:
>
> /* io:genunix::start */
> io:genunix:default_physio:start,
> io:genunix:bdev_strategy:start,
> io:genunix:biodone:done
> {
> printf ("%d %s %d %d", timestamp, execname, args[0]->b_blkno, 
> args[0]->b_bcount);
> }
>
> fbt:zfs:uberblock_update:entry
> {
> printf ("%d (%d) %d, %d, %d, %d, %d, %d, %d, %d", timestamp,
>   args[0]->ub_timestamp,
>   args[0]->ub_rootbp.blk_prop, args[0]->ub_guid_sum,
>   args[0]->ub_rootbp.blk_birth, args[0]->ub_rootbp.blk_fill,
>   args[1]->vdev_id, args[1]->vdev_asize, args[1]->vdev_psize,
>   args[2]);
> }
>
> The output shows the following pattern after most of the 
> uberblock_update events:
>
>0  34404 uberblock_update:entry 244484736418912 (1231084189) 
> 9226475971064889345, 4541013553469450828, 26747, 159, 0, 0, 0, 26747
>0   6668bdev_strategy:start 244485190035647 sched 502 1024
>0   6668bdev_strategy:start 244485190094304 sched 1014 1024
>0   6668bdev_strategy:start 244485190129133 sched 39005174 1024
>0   6668bdev_strategy:start 244485190163273 sched 39005686 1024
>0   6656  biodone:done 244485190745068 sched 502 1024
>0   6656  biodone:done 244485191239190 sched 1014 1024
>0   6656  biodone:done 244485191737766 sched 39005174 1024
>0   6656  biodone:done 244485192236988 sched 39005686 1024
>
> ...
>0  34404   uberblock_update:entry 244514710086249 
> (1231084219) 9226475971064889345, 4541013553469450828, 26747, 159, 0, 0, 
> 0, 26748
>0  34404   uberblock_update:entry 244544710086804 
> (1231084249) 9226475971064889345, 4541013553469450828, 26747, 159, 0, 0, 
> 0, 26749
> ...
>0  34404   uberblock_update:entry 244574740885524 
> (1231084279) 9226475971064889345, 4541013553469450828, 26750, 159, 0, 0, 
> 0, 26750
>0   6668 bdev_strategy:start 244575189866189 sched 508 1024
>0   6668 bdev_strategy:start 244575189926518 sched 1020 1024
>0   6668 bdev_strategy:start 244575189961783 sched 39005180 1024
>0   6668 bdev_strategy:start 244575189995547 sched 39005692 1024
>0   6656   biodone:done 244575190584497 sched 508 1024
>0   6656   biodone:done 244575191077651 sched 1020 1024
>0   6656   biodone:done 244575191576723 sched 39005180 1024
>0   6656   biodone:done 244575192077070 sched 39005692 1024
>
> I am not a dtrace or zfs expert, but to me it looks like in many cases, 
> an uberblock update is followed by a write of 1024 bytes to four 
> different disk blocks. I also found that the four block numbers are 
> incremented with always even numbers (256, 258, 260, ...) 127 times and 
> then the first block is written again. Which would mean that for a txg 
> of 50.000, the four uberblock copies have been written 50.000/127=393 
> times (Correct?).
>   

The uberblocks are stored in a circular queue: 128 entries @ 1k.  The method
is described in the on-disk specification document.  I applaud your effort
to reverse-engineer this :-)
http://www.opensolaris.org/os/community/zfs/docs/ondiskformat0822.pdf

I've done some research in this area by measuring the actual I/O to each
block on the disk.  This can be done with TNF or dtrace -- for any
workload.  I'd be interested in hearing about your findings, especially if
you record block update counts for real workloads.
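For example, a minimal io-provider script along those lines (a sketch, not Richard's own tooling) could record per-block write counts for a live workload:

/* Count writes per disk block number (b_blkno), per device.  Run for a
   representative period and stop with Ctrl-C; the aggregation
   (device, block number, count) is printed automatically on exit.
   The most frequently rewritten blocks should include the label
   and uberblock areas discussed in this thread. */
io:::start
/!(args[0]->b_flags & B_READ)/
{
        @writes[args[1]->dev_statname, args[0]->b_blkno] = count();
}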

Note: wear leveling algorithms for specific devices do not seem to be
publicly available :-(  But the enterprise SSDs seem to be gravitating
towards using DRAM write caches anyway.
 -- richard



Re: [zfs-discuss] How to find out the zpool of an uberblock printed with the fbt:zfs:uberblock_update: probes?

2009-01-04 Thread Andrew Gabriel
Bernd Finger wrote:
> Hi,
>
> After I published a blog entry about installing OpenSolaris 2008.11 on a 
> USB stick, I read a comment about a possible issue with wearing out 
> blocks on the USB stick after some time because ZFS overwrites its 
> uberblocks in place.

The flash controllers used on solid state disks implement wear leveling, 
to ensure hot blocks don't prematurely wear out on the flash device. 
Wear leveling will move hot logical blocks around in the physical flash 
memory, so one part doesn't wear out much faster than the rest of it.

I would presume flash on USB sticks does something similar, but I don't 
know that for sure.

-- 
Andrew


[zfs-discuss] How to find out the zpool of an uberblock printed with the fbt:zfs:uberblock_update: probes?

2009-01-04 Thread Bernd Finger
Hi,

After I published a blog entry about installing OpenSolaris 2008.11 on a 
USB stick, I read a comment about a possible issue with wearing out 
blocks on the USB stick after some time because ZFS overwrites its 
uberblocks in place.

I tried to get more information about how updating uberblocks works with 
the following dtrace script:

/* io:genunix::start */
io:genunix:default_physio:start,
io:genunix:bdev_strategy:start,
io:genunix:biodone:done
{
printf ("%d %s %d %d", timestamp, execname, args[0]->b_blkno, 
args[0]->b_bcount);
}

fbt:zfs:uberblock_update:entry
{
printf ("%d (%d) %d, %d, %d, %d, %d, %d, %d, %d", timestamp,
  args[0]->ub_timestamp,
  args[0]->ub_rootbp.blk_prop, args[0]->ub_guid_sum,
  args[0]->ub_rootbp.blk_birth, args[0]->ub_rootbp.blk_fill,
  args[1]->vdev_id, args[1]->vdev_asize, args[1]->vdev_psize,
  args[2]);
}

The output shows the following pattern after most of the 
uberblock_update events:

   0  34404 uberblock_update:entry 244484736418912 (1231084189) 
9226475971064889345, 4541013553469450828, 26747, 159, 0, 0, 0, 26747
   0   6668bdev_strategy:start 244485190035647 sched 502 1024
   0   6668bdev_strategy:start 244485190094304 sched 1014 1024
   0   6668bdev_strategy:start 244485190129133 sched 39005174 1024
   0   6668bdev_strategy:start 244485190163273 sched 39005686 1024
   0   6656  biodone:done 244485190745068 sched 502 1024
   0   6656  biodone:done 244485191239190 sched 1014 1024
   0   6656  biodone:done 244485191737766 sched 39005174 1024
   0   6656  biodone:done 244485192236988 sched 39005686 1024

...
   0  34404   uberblock_update:entry 244514710086249 
(1231084219) 9226475971064889345, 4541013553469450828, 26747, 159, 0, 0, 
0, 26748
   0  34404   uberblock_update:entry 244544710086804 
(1231084249) 9226475971064889345, 4541013553469450828, 26747, 159, 0, 0, 
0, 26749
...
   0  34404   uberblock_update:entry 244574740885524 
(1231084279) 9226475971064889345, 4541013553469450828, 26750, 159, 0, 0, 
0, 26750
   0   6668 bdev_strategy:start 244575189866189 sched 508 1024
   0   6668 bdev_strategy:start 244575189926518 sched 1020 1024
   0   6668 bdev_strategy:start 244575189961783 sched 39005180 1024
   0   6668 bdev_strategy:start 244575189995547 sched 39005692 1024
   0   6656   biodone:done 244575190584497 sched 508 1024
   0   6656   biodone:done 244575191077651 sched 1020 1024
   0   6656   biodone:done 244575191576723 sched 39005180 1024
   0   6656   biodone:done 244575192077070 sched 39005692 1024

I am not a dtrace or zfs expert, but to me it looks like in many cases, 
an uberblock update is followed by a write of 1024 bytes to four 
different disk blocks. I also found that the four block numbers are 
incremented with always even numbers (256, 258, 260, ...) 127 times and 
then the first block is written again. Which would mean that for a txg 
of 50.000, the four uberblock copies have been written 50.000/127=393 
times (Correct?).

What I would like to find out is how to access fields from arg1 (this is 
the data of type vdev in:

int uberblock_update(uberblock_t *ub, vdev_t *rvd, uint64_t txg)

). When using the fbt:zfs:uberblock_update:entry probe, its elements are 
always 0, as you can see in the above output. When using the 
fbt:zfs:uberblock_update:return probe, I am getting an error message 
like the following:

dtrace: failed to compile script zfs-uberblock-report-04.d: line 14: 
operator -> must be applied to a pointer

Any idea how to access the fields of vdev, or how to print out the pool 
name associated to an uberblock_update event?
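One possible approach (a sketch, untested; it assumes vdev_t has a vdev_spa pointer and spa_t a spa_name field, as declared in vdev_impl.h and spa_impl.h of the OpenSolaris source) is to go through the vdev's spa pointer in the :entry probe. The :return probe fails because for fbt return probes args[1] is the function's (integer) return value, not the vdev pointer, so -> cannot be applied to it.

/* Print pool name, txg, and root block pointer birth txg for each
   uberblock update.  The field names vdev_spa and spa_name are taken
   from the source headers and should be verified for this build. */
fbt:zfs:uberblock_update:entry
{
        printf("%Y pool=%s txg=%d blk_birth=%d\n",
            walltimestamp,
            stringof(args[1]->vdev_spa->spa_name),
            args[2],
            args[0]->ub_rootbp.blk_birth);
}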

Regards,

Bernd