>-----Original Message-----
>From: Brandon High [mailto:bh...@freaks.com]
>Sent: Monday, May 10, 2010 3:12 PM
>
>On Mon, May 10, 2010 at 1:53 PM, Geoff Nordli <geo...@gnaa.net> wrote:
>> You are right, I didn't look at that property, and instead I was
>> focused on the record size property.
>
>zvols don't have a recordsize - That's a property of filesystem datasets, not
>volumes.

Awesome, that makes things a lot clearer now :)

>
>> When I look at the stmfadm list-lu -v .... it shows me the block size
>> of "512". I am running NexentaCore 3.0 (b134+). I wonder if the
>> default size has changed with different versions.
>
>I see what you're referring to. The iscsi block size, which is what the LUN reports
>to the initiator as its block size, vs. the block size written to disk.

So in essence this is the disk "sector" size, which again makes sense. Are people actually changing this value?

>
>
>> As long as you are using a multiple of the file system block size,
>> then alignment shouldn't be a problem with iscsi based zvols. When
>> using a zvol, comstar stores the metadata in a zvol object instead of
>> the first part of the volume.
>
>There can be an "off by one" error which will cause small writes to span blocks. If
>the data is not block aligned, then a 4k write causes two read/modify/writes (on
>zfs two blocks have to be read then written and block pointers updated) whereas
>an aligned write will not require the existing data to be read. This is assuming that
>the zvol block size = VM fs block size = 4k. In the case where the zvol block size is
>a multiple of the VM fs block size (eg 4k VM fs, 8k zvol), then writing one fs block
>will always require a read for an aligned filesystem, but could require two for an
>unaligned fs if the VM fs block spans two zvol blocks.
>
>There's been a lot of discussion about this lately with the introduction of WD's 4k
>sector drives, since they have a 512b sector emulation mode.
>

Doesn't this alignment have more to do with aligning writes to the stripe/segment size of a traditional storage array? The articles I am reading suggest creating a small unused partition to take up the space up to 127 bytes (assuming a 128-byte segment), then creating the real partition from the 128th sector going forward. I am not sure how this would happen with zfs.

Thanks for clearing up my misconceptions.

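To check my understanding of the read/modify/write counting described above, here is a rough Python sketch. It only does the block arithmetic and does not model ZFS itself. The function name blocks_touched is made up for illustration, and the 63-sector partition offset is my own assumption (the usual legacy MBR layout), not something taken from this thread.

```python
def blocks_touched(offset, length, block_size):
    """Count zvol blocks covered by a write of `length` bytes at `offset`.

    Returns (full, partial). A fully overwritten block needs no read of the
    old data; a partially overwritten block needs a read/modify/write.
    """
    first = offset // block_size
    last = (offset + length - 1) // block_size
    full = partial = 0
    for b in range(first, last + 1):
        # Portion of the write that lands inside block b
        start = max(offset, b * block_size)
        end = min(offset + length, (b + 1) * block_size)
        if end - start == block_size:
            full += 1
        else:
            partial += 1
    return full, partial


SECTOR = 512  # assumed sector size reported to the initiator

cases = [
    ("4k write, 4k zvol block, aligned", 0, 4096, 4096),
    ("4k write, 4k zvol block, fs starts at sector 63", 63 * SECTOR, 4096, 4096),
    ("4k write, 8k zvol block, aligned", 0, 4096, 8192),
    ("4k write, 8k zvol block, spans two blocks", 6144, 4096, 8192),
]

for label, offset, length, block_size in cases:
    full, partial = blocks_touched(offset, length, block_size)
    print(f"{label}: {full} full block(s), {partial} read/modify/write(s)")
```

If I have this right, the aligned 4k-on-4k case needs no read at all, the misaligned one needs two, and the 8k zvol cases need one or two depending on whether the fs block straddles a zvol block boundary.
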
Geoff