On Feb 15, 2012, at 11:26 AM, Garrett D'Amore wrote:
> On Feb 15, 2012, at 11:05 AM, Richard Lowe wrote:
>
>> In that case, it obviously should be zpool create -o blocksize=...
>> I'm guessing it's trying to be per-vdev.
The -o and -O options set zpool or zfs parameters (zpool get ...). The ashift is
derived from the physical block size and is stored in the label, but most of the
label contents are neither visible as zpool parameters nor changeable. So it does
not really fit as a -o or -O option.
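For example (pool name and device path are illustrative, and zdb output details
vary by release), the only places ashift shows up are the label and the cached
config, not the property interface:

    zpool get all tank | grep ashift            # nothing: ashift is not a property
    zdb -C tank | grep ashift                   # per-top-level-vdev ashift in the cached config
    zdb -l /dev/rdsk/c0t0d0s0 | grep ashift     # ashift as stored in the vdev label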
There is another discontinuity wrt block sizes for volumes. Both of the following
are legal:

    zfs create -b 4k -V 1mb zwimming/zvol
    zfs create -o volblocksize=4k -V 1mb zwimming/zvol
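And unlike ashift, the result is visible through the normal property interface
(same example dataset as above):

    zfs get volblocksize zwimming/zvol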
Clearly, ashift cannot be changed once the top-level vdev is created. There are
other optional parameters that are only changeable at creation (e.g. utf8only),
but these are also available in the get output and are stored along with other
parameters.
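For illustration (the pool/dataset name is made up), utf8only behaves the way
ashift would need to, yet still shows up as a property:

    zfs create -o utf8only=on tank/fs
    zfs get utf8only tank/fs            # visible in the get output
    zfs set utf8only=off tank/fs        # fails: only settable at creation time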
In theory, ashift can be different for each top-level vdev. Dunno what
implications that might have on real use cases.
>> I'm completely unconvinced that integrating the current hack as a
>> stop-gap is even somewhat sensible. When done properly, none of this
>> UI is necessary, or even a good idea.
Agree, in principle, which is why I haven't done it before, preferring to use one
of the many other workarounds that are relatively easy to implement.
Unfortunately, those are guru-class workarounds, not for mere mortals. We need to
offer some sort of easy override for mortals, especially those unlucky enough to
have purchased lying devices.
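As a sketch of what the guru-class route looks like (this assumes an sd driver
that honors physical-block-size in sd-config-list; the padded vendor/product
string and the exact syntax below are examples to verify against your release),
you teach sd to report the real physical sector size before creating the pool:

    # /kernel/drv/sd.conf -- vendor/product string is just an example
    sd-config-list = "ATA     WDC WD20EARS-00M", "physical-block-size:4096";

    # re-read the driver config (or reboot), then create the pool;
    # zpool create should then pick ashift=12 on its own
    update_drv -vf sd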
> I disagree. Having an override option is useful when the disk drive lies,
> and we don't know about it yet. We do need a grey list somewhere for these
> outliers. I can also see reasons to use an even bigger block size -- think
> SSDs which have different cell properties. :-)
Pretty much everything is optimized for 4KB today, because that is where the
industry leader, NTFS, lives.
On Feb 15, 2012, at 11:45 AM, Bill Sommerfeld wrote:
> last fall I did a little experimentation with a pair of WD20EARS drives (which
> are 2TB "advanced format" drives with 4K physical and 512 byte logical
> sectors).
>
> In my admittedly not particularly demanding and not particularly scientific
> tests, I saw little difference in performance between ashift=9 (512) and
> ashift=12 (4K) but significantly increased disk space usage, so I left them as
> ashift=9.
This is not surprising, but for a whole lot of other reasons… HDDs are not fast
:-)

But a lot depends on the RAID configuration. I have a theory that says 4KB
sectors are the end of raidz for general-purpose, low-disk-count systems. If the
average file size is less than N, then the actual allocation looks more like
mirroring (RAID-1E, not RAID-1). So it is not unexpected to see "more space used"
for raidz on 4KB sector disks than on 512 byte sector disks.
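A quick back-of-the-envelope example of that theory (5-disk raidz1 assumed,
allocation padding ignored): a 4KB block with ashift=9 is written as 8 x 512-byte
data sectors plus 2 x 512-byte parity sectors, about 25% overhead. The same 4KB
block with ashift=12 is 1 x 4KB data sector plus 1 x 4KB parity sector, 100%
overhead, which is exactly the space cost of a mirror.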
NB, ESG recently published a study that says 1 enterprise-class SSD has the
performance of 25 FC disks when running Oracle DB. Game over for people trying to
get performance out of HDDs.
> Has anyone taken a more thorough look at this to better characterize the space
> vs. speed tradeoff?
The more interesting use case is SSDs, especially consumer-class SSDs that expect
4KB-aligned I/O and code that assumption into their wear-leveling algorithms. I
suspect that the default 512 byte (lying) sector size results in faster wear-out
for low-end flash SSDs.
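If you want to see what a device is actually claiming before trusting the default
(device path is illustrative, and smartmontools is an add-on package), something
like:

    smartctl -i /dev/rdsk/c0t0d0 | grep -i 'sector size'
    # Advanced Format drives typically report 512 bytes logical, 4096 bytes physical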
-- richard
--
DTrace Conference, April 3, 2012,
http://wiki.smartos.org/display/DOC/dtrace.conf
ZFS Performance and Training
[email protected]
+1-760-896-4422