Evan Layton wrote: > Keith Mitchell wrote: >> >> >> Evan Layton wrote: >>> Ethan Quach wrote: >>>> >>>> >>>> Evan Layton wrote: >>>>> Ethan Quach wrote: >>>>>> >>>>>> >>>>>> Evan Layton wrote: >>>>>>> jeanm wrote: >>>>>>>> Looks OK to me. Maybe some comments explaining the B_FALSE and >>>>>>>> what it does? >>>>>>>> I looked in the zfs code but that's kind of awkward. >>>>>>> >>>>>>> Probably a good idea to add that since it's a bit convoluted... >>>>>>> >>>>>>> I've added: >>>>>>> /* >>>>>>> * The boolean set to B_FALSE and passed to zfs_destroy_snaps() >>>>>>> tells >>>>>>> * zfs to process and destroy the snapshots now. Otherwise ZFS will >>>>>>> * wait to destroy these snapshots until a zfs release is called >>>>>>> and >>>>>>> * the references for all the snapshots have been released. >>>>>>> */ >>>>>> >>>>>> nit - for the second sentence, "Otherwise the call will potentially >>>>>> return where the snapshot isn't actually destroyed yet, and ZFS is >>>>>> waiting until all the references to the snapshot have been released >>>>>> before actually destroying the snapshot." >>>>>> >>>>> >>>>> Much better! I'll make that change. >>>> >>>> Hmm, I may have made an incorrect assumption on the behavior >>>> though. Will the call actually return and ZFS waits, or wait before >>>> returning? >>> >>> If I understand the comments and the code correctly, ZFS returns >>> without >>> actually destroying the snapshots. It waits to do that until all the >>> references are released and the zfs_release has been called. >> Devil's advocate: What's the problem with using B_TRUE if this is how >> it works? If someone has truly set a hold on a BE snapshot, it seems >> more appropriate to let ZFS clean it up later when whoever put the >> hold on it is done with it. As long as it gets cleaned up eventually >> & automatically... And if we don't want people putting holds on it, >> would it be more appropriate to set the ZFS property such that normal >> users can't put a hold on it? > > There are a couple of problems with this. > > This approach has us returning form the zfs_destroy_snap call without > knowing if or when the snapshot will be destroyed. We think we're done > but may not be... > > Also this approach adds the need to place a zfs release in the > appropriate spots so that the last reference is released. With the > changes done for 6803121 when a snapshot is taken a reference is > automatically added. This is used to keep the snapshot around until > the zfs release is called. If we don't call it or call it in the wrong > place we end up returning from the zfs_destroy_snap believing that the > snapshot will be destroyed but since we either didn't call release or > it wasn't in the right spot, we then have no way to know that the > destroy didn't happen. The use of B_FALSE here forces the destroy to > happen now and not at some later point when zfs release is called. > What if we have a bunch of snapshots we're destroying and we have all > of them stacked up waiting for the zfs release. Then when we call the > zfs release it starts destroying them but one of them fails for some > reason, I'm not sure how strait forward it would be to map that back > to which snapshot destroy call we need to reference to return the > error. It may not be as big a deal as I'm thinking right now but why > add another call (zfs release) to do the actual destroy later if we > really don't need to? I think I am either misunderstanding the ZFS changes, or misunderstanding this specific instance of using a BE snapshot. From what I understand:
zfs_create will create a snapshot, and set the user reference count to 1 zfs_release will decrement the user reference count, and destroy the snapshot if it brings the reference count to 0 zfs_hold will increment the reference count zfs_destroy will succeed if reference count is 1, and fail otherwise. If the snapshot being destroyed was created during this invocation of beadm, I see no problem with immediately destroying it. However, I think the appropriate call would be zfs_release, as this fits the new model more appropriately. Unless I misunderstand the new functionality, or the proposed fix in the CR is not what was implemented, zfs_release should delete the snapshot without a separate call to zfs_destroy. If the snapshot being destroyed was NOT created during this invocation of beadm, then it is not safe to assume that no user has placed a hold on it, and thus, immediate destruction is the incorrect choice here. Again, zfs_release should be used here. In either case, by calling zfs_release, we can then immediately check the user reference count or return value, and immediately know whether (a) the snapshot was destroyed or (b) the snapshot is still alive due to a hold. Related question: Are there any situations where failure to delete a snapshot would preclude us from finishing a call to beadm? > > -evan
