On Feb 15, 2012, at 11:26 AM, Garrett D'Amore wrote:
> On Feb 15, 2012, at 11:05 AM, Richard Lowe wrote:
>
>> In that case, it obviously should be zpool create -o blocksize=...
>> I'm guessing it's trying to be per-vdev.
The -o and -O options set zpool or zfs parameters (zpool get ...). The ashift is
derived from the physical block size and is stored in the label, but most of the
label contents are neither visible as zpool parameters nor changeable. So it does
not really fit as a -o or -O option.
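For example (pool name and device path are illustrative, and zdb output details
vary by release), the only places ashift shows up are the label and the cached
config, not the property interface:

    zpool get all tank | grep ashift            # nothing: ashift is not a property
    zdb -C tank | grep ashift                   # per-top-level-vdev ashift in the cached config
    zdb -l /dev/rdsk/c0t0d0s0 | grep ashift     # ashift as stored in the vdev label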
There is another discontinuity wrt block sizes for volumes. Both of the following
are legal:

    zfs create -b 4k -V 1mb zwimming/zvol
    zfs create -o volblocksize=4k -V 1mb zwimming/zvol
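And unlike ashift, the result is visible through the normal property interface
(same example dataset as above):

    zfs get volblocksize zwimming/zvol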
Clearly, ashift cannot be changed once the top-level vdev is created. There are
other optional parameters that are only changeable at creation (e.g. utf8only),
but these are also available in the get output and are stored along with other
parameters.
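For illustration (the pool/dataset name is made up), utf8only behaves the way
ashift would need to, yet still shows up as a property:

    zfs create -o utf8only=on tank/fs
    zfs get utf8only tank/fs            # visible in the get output
    zfs set utf8only=off tank/fs        # fails: only settable at creation time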
In theory, ashift can be different for each top-level vdev. Dunno what
implications that might have on real use cases.
>> I'm completely unconvinced that integrating the current hack as a
>> stop-gap is even somewhat sensible. When done properly, none of this
>> UI is necessary, or even a good idea.
Agree, in principle, which is why I haven't done it before, preferring to use one
of the many other workarounds that are relatively easy to implement.
Unfortunately, those are guru-class workarounds, not for mere mortals. We need to
offer some sort of easy override for mortals, especially those unlucky enough to
have purchased lying devices.
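As a sketch of what the guru-class route looks like (this assumes an sd driver
that honors physical-block-size in sd-config-list; the padded vendor/product
string and the exact syntax below are examples to verify against your release),
you teach sd to report the real physical sector size before creating the pool:

    # /kernel/drv/sd.conf -- vendor/product string is just an example
    sd-config-list = "ATA     WDC WD20EARS-00M", "physical-block-size:4096";

    # re-read the driver config (or reboot), then create the pool;
    # zpool create should then pick ashift=12 on its own
    update_drv -vf sd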
> I disagree. Having an override option is useful when the disk drive lies,
> and we don't know about it yet. We do need a grey list somewhere for these
> outliers. I can also see reasons to use an even bigger block size -- think
> SSDs which have different cell properties. :-)
Pretty much everything is optimized for 4KB today, because that is where the
industry leader, NTFS, lives.
On Feb 15, 2012, at 11:45 AM, Bill Sommerfeld wrote:
> last fall I did a little experimentation with a pair of WD20EARS drives (which
> are 2TB "advanced format" drives with 4K physical and 512 byte logical
> sectors).
>
> In my admittedly not particularly demanding and not particularly scientific
> tests, I saw little difference in performance between ashift=9 (512) and
> ashift=12 (4K) but significantly increased disk space usage, so I left them as
> ashift=9.
This is not surprising, but for a whole lot of other reasons… HDDs are not fast
:-)

But a lot depends on the RAID configuration. I have a theory that says 4KB
sectors are the end of raidz for general-purpose, low-disk-count systems. If the
average file size is less than N, then the actual allocation looks more like
mirroring (RAID-1E, not RAID-1). So it is not unexpected to see "more space used"
for raidz on 4KB sector disks than on 512 byte sector disks.
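A quick back-of-the-envelope example of that theory (5-disk raidz1 assumed,
allocation padding ignored): a 4KB block with ashift=9 is written as 8 x 512-byte
data sectors plus 2 x 512-byte parity sectors, about 25% overhead. The same 4KB
block with ashift=12 is 1 x 4KB data sector plus 1 x 4KB parity sector, 100%
overhead, which is exactly the space cost of a mirror.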
NB, ESG recently published a study that says 1 enterprise-class SSD has the
performance of 25 FC disks when running Oracle DB. Game over for people trying to
get performance out of HDDs.
> Has anyone taken a more thorough look at this to better characterize the space
> vs. speed tradeoff?
The more interesting use case is SSDs, especially consumer-class SSDs that expect
4KB-aligned I/O and code that assumption into their wear-leveling algorithms. I
suspect that the default 512 byte (lying) sector size results in faster wear-out
for low-end flash SSDs.
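If you want to see what a device is actually claiming before trusting the default
(device path is illustrative, and smartmontools is an add-on package), something
like:

    smartctl -i /dev/rdsk/c0t0d0 | grep -i 'sector size'
    # Advanced Format drives typically report 512 bytes logical, 4096 bytes physical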
-- richard
--
DTrace Conference, April 3, 2012,
http://wiki.smartos.org/display/DOC/dtrace.conf
ZFS Performance and Training
[email protected]
+1-760-896-4422