> On Mar 15, 2018, at 9:48 PM, Mike Gerdts <mike.ger...@joyent.com> wrote:
>
> On Thu, Mar 15, 2018 at 9:10 PM, Richard Elling <richard.ell...@richardelling.com> wrote:
>
>> On Mar 15, 2018, at 1:30 PM, Mike Gerdts <mike.ger...@joyent.com> wrote:
>>
>> On Thu, Mar 15, 2018 at 3:00 PM, Matthew Ahrens <mahr...@delphix.com> wrote:
>> Yes, I agree and that all sounds great (including "zfs set
>> refreservation=auto" to get back to the originally-computed refreservation).
>> A shame that we didn't catch this when implementing "zfs clone" back in the
>> day.
>>
>> I assume that refreservation will continue to be a non-inheritable property,
>> and that "refreservation=auto" is just a shortcut for "refreservation=123GB"
>> (or whatever the right number is). So if you set it to "auto", "zfs get"
>> will show "123GB". And changing the volsize will do whatever it does today.
>>
>> Pretty much, but currently you can't set refreservation to a value greater
>> than volsize. The largest explicit value that is allowed is still volsize.
>
> That is a simple bug to fix and I thought we already had a fix, but perhaps
> only in ZoL? In any case, the fix needs to be in openzfs. IMHO, we really
> don't need to check the requested refreservation against volsize at all.
>
> I had started down that route, then convinced myself that it may be there for
> good reason. Can you explain how a refreservation greater than the size
> calculated by zvol_volsize_to_reservation() is useful? I couldn't come up
> with a way (aside from a potential bug in that copies is not always accounted
> for).

Sure, raidz skip blocks are not accounted for. In part this is because skip
blocks are assigned at the SPA layer, whereas reservations are at the DSL layer.

The pathological example is raidz2 on 4kn disks with volblocksize=8k (default).
The predicted reservation is 8k per block (logical) plus 8k parity = 16k, but
the actual allocated space is 24k. The DSL "free" space assumes 16k, so it
overestimates the usable space. Thus you can run out of allocated space in the
pool before hitting refreservation, which is a bad thing. One way to inoculate
is to increase refreservation to a value greater than volsize.

Note: there are several other remedies to this pathological example, but they
aren't pertinent to this discussion.

 -- richard

>
> Overwriting a zvol, taking a snapshot, then overwriting it again doesn't seem
> to reduce usedbyrefreservation.
>
> # zfs create -V 100m zones/t/100m
> # dd if=/dev/zero of=/dev/zvol/rdsk/zones/t/100m bs=1024k
> write: I/O error
> 101+0 records in
> 101+0 records out
> 104857600 bytes transferred in 0.280976 secs (373191031 bytes/sec)
>
> # zfs get -p space,refreservation,volsize zones/t/100m
> NAME          PROPERTY              VALUE         SOURCE
> zones/t/100m  name                  zones/t/100m  -
> zones/t/100m  available             270941458432  -
> zones/t/100m  used                  110362624     -
> zones/t/100m  usedbysnapshots       0             -
> zones/t/100m  usedbydataset         105127936     -
> zones/t/100m  usedbyrefreservation  5234688       -
> zones/t/100m  usedbychildren        0             -
> zones/t/100m  refreservation        110362624     local
> zones/t/100m  volsize               104857600     local
>
> # zfs snapshot zones/t/100m@1
> # dd if=/dev/zero of=/dev/zvol/rdsk/zones/t/100m bs=1024k
> write: I/O error
> 101+0 records in
> 101+0 records out
> 104857600 bytes transferred in 0.421021 secs (249055635 bytes/sec)
>
> # zfs get -p space,refreservation,volsize zones/t/100m
> NAME          PROPERTY              VALUE         SOURCE
> zones/t/100m  name                  zones/t/100m  -
> zones/t/100m  available             270835998720  -
> zones/t/100m  used                  215490560     -
> zones/t/100m  usedbysnapshots       105127936     -
> zones/t/100m  usedbydataset         105127936     -
> zones/t/100m  usedbyrefreservation  5234688       -
> zones/t/100m  usedbychildren        0             -
> zones/t/100m  refreservation        110362624     local
> zones/t/100m  volsize               104857600     local
>
>
> Frankly, I have some concerns about these numbers from before the snapshot
> (same output as first set above, just trimmed)
>
> # zfs get -p space,refreservation,volsize zones/t/100m
> NAME          PROPERTY              VALUE         SOURCE
> zones/t/100m  used                  110362624     -
> zones/t/100m  usedbydataset         105127936     -
> zones/t/100m  usedbyrefreservation  5234688       -
> zones/t/100m  refreservation        110362624     local
> zones/t/100m  volsize               104857600     local
>
> In that output, used and refreservation are equal, but usedbyrefreservation is
> 5112 KiB. I would have expected that all of the 5376 KiB of metadata (used -
> volsize) would have been counted against the refreservation, bringing
> usedbyrefreservation to 0.
>
> Mike
> openzfs <https://openzfs.topicbox.com/latest> / openzfs-developer <https://openzfs.topicbox.com/groups/developer/members> / Permalink <https://openzfs.topicbox.com/groups/developer/discussions/Te3d593ba00521b6d-Mfc214744f84ec3b19fcbf6af>
> Delivery options <https://openzfs.topicbox.com/groups>

------------------------------------------
openzfs: openzfs-developer
Permalink: https://openzfs.topicbox.com/groups/developer/discussions/Te3d593ba00521b6d-M0dfed5d7451e4ea21218fff2
Delivery options: https://openzfs.topicbox.com/groups
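
For readers who want to check the raidz2 arithmetic in the reply above (16k predicted vs. 24k allocated per 8k block), here is a rough sketch. It assumes that raidz pads each allocation to a multiple of (nparity + 1) sectors, which is where the skip sectors come from. The function names and the 6-disk vdev width are illustrative assumptions; the thread does not state the width.

```python
def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def raidz_predicted(psize, ashift, ncols, nparity):
    """Bytes for data plus parity only (no skip sectors)."""
    data = ceil_div(psize, 1 << ashift)        # data sectors
    rows = ceil_div(data, ncols - nparity)     # stripe rows needed
    return (data + nparity * rows) << ashift   # each row carries nparity parity sectors


def raidz_allocated(psize, ashift, ncols, nparity):
    """Bytes actually allocated, padded to a multiple of (nparity + 1) sectors."""
    sectors = raidz_predicted(psize, ashift, ncols, nparity) >> ashift
    padded = ceil_div(sectors, nparity + 1) * (nparity + 1)
    return padded << ashift


# Assumed layout: 6-wide raidz2 on 4kn disks (ashift=12), volblocksize=8k.
print(raidz_predicted(8192, 12, 6, 2))   # 16384
print(raidz_allocated(8192, 12, 6, 2))   # 24576
```

Under these assumptions each 8k block costs 8k more than the 16k the reservation accounts for, which is the gap a larger refreservation would have to cover.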