Hi Evan,

As discussed over the phone, I now understand the changes and agree with 
them. Thanks for being patient with me!
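
For my own notes, here's a toy C model of the reference-count semantics
discussed below. This is not the real libzfs API (all names here are made
up); it just models the behavior as I now understand it: creating a snapshot
takes an implicit user reference, a zfs hold adds one, a zfs release drops
one and destroys the snapshot when the count hits zero, and an immediate
(non-deferred) destroy fails while extra holds exist.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model only -- NOT the real libzfs API. Models the user-reference
 * semantics discussed in this thread.
 */
typedef struct {
	int refs;       /* user reference count */
	bool destroyed; /* has the snapshot been reclaimed? */
} toy_snap_t;

/* Snapshot creation takes one implicit reference (per 6803121). */
void
toy_create(toy_snap_t *s)
{
	s->refs = 1;
	s->destroyed = false;
}

/* Models "zfs hold": adds a user reference. */
void
toy_hold(toy_snap_t *s)
{
	s->refs++;
}

/* Models "zfs release": drops a reference; destroys on the last one. */
void
toy_release(toy_snap_t *s)
{
	if (--s->refs == 0)
		s->destroyed = true;
}

/*
 * Models an immediate (defer == B_FALSE) destroy: succeeds only when
 * the creator's implicit reference is the last one left. Returns 0 on
 * success, -1 if outstanding holds prevent destruction.
 */
int
toy_destroy_now(toy_snap_t *s)
{
	if (s->refs > 1)
		return (-1);
	s->refs = 0;
	s->destroyed = true;
	return (0);
}
```

In these terms, passing B_FALSE to zfs_destroy_snaps() corresponds to the
immediate toy_destroy_now() path, while the deferred (B_TRUE) path
corresponds to the snapshot lingering until the final toy_release().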

- Keith

Evan Layton wrote:
> Keith Mitchell wrote:
>>
>>
>> Evan Layton wrote:
>>> Keith Mitchell wrote:
>>>>
>>>>
>>>> Evan Layton wrote:
>>>>> Ethan Quach wrote:
>>>>>>
>>>>>>
>>>>>> Evan Layton wrote:
>>>>>>> Ethan Quach wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> Evan Layton wrote:
>>>>>>>>> jeanm wrote:
>>>>>>>>>> Looks OK to me. Maybe some comments explaining the B_FALSE 
>>>>>>>>>> and what it does?
>>>>>>>>>> I looked in the zfs code but that's kind of awkward.
>>>>>>>>>
>>>>>>>>> Probably a good idea to add that since it's a bit convoluted...
>>>>>>>>>
>>>>>>>>> I've added:
>>>>>>>>> /*
>>>>>>>>>  * The boolean set to B_FALSE and passed to zfs_destroy_snaps()
>>>>>>>>>  * tells zfs to process and destroy the snapshots now. Otherwise
>>>>>>>>>  * ZFS will wait to destroy these snapshots until a zfs release
>>>>>>>>>  * is called and the references for all the snapshots have been
>>>>>>>>>  * released.
>>>>>>>>>  */
>>>>>>>>
>>>>>>>> nit - for the second sentence: "Otherwise the call will
>>>>>>>> potentially return while the snapshot isn't actually destroyed
>>>>>>>> yet, and ZFS is waiting until all the references to the snapshot
>>>>>>>> have been released before actually destroying it."
>>>>>>>>
>>>>>>>
>>>>>>> Much better! I'll make that change.
>>>>>>
>>>>>> Hmm, I may have made an incorrect assumption about the behavior,
>>>>>> though. Will the call actually return while ZFS waits, or will it
>>>>>> wait before returning?
>>>>>
>>>>> If I understand the comments and the code correctly, ZFS returns 
>>>>> without
>>>>> actually destroying the snapshots. It waits to do that until all the
>>>>> references are released and the zfs_release has been called.
>>>> Devil's advocate: What's the problem with using B_TRUE if this is 
>>>> how it works? If someone has truly set a hold on a BE snapshot, it 
>>>> seems more appropriate to let ZFS clean it up later when whoever 
>>>> put the hold on it is done with it. As long as it gets cleaned up 
>>>> eventually & automatically... And if we don't want people putting 
>>>> holds on it, would it be more appropriate to set the ZFS property 
>>>> such that normal users can't put a hold on it?
>>>
>>> There are a couple of problems with this.
>>>
>>> This approach has us returning from the zfs_destroy_snap call 
>>> without knowing if or when the snapshot will be destroyed. We think 
>>> we're done, but we may not be...
>>>
>>> Also, this approach adds the need to place a zfs release in the 
>>> appropriate spots so that the last reference is released. With the 
>>> changes done for 6803121, a reference is automatically added when a 
>>> snapshot is taken. This is used to keep the snapshot around until 
>>> the zfs release is called. If we don't call it, or call it in the 
>>> wrong place, we end up returning from the zfs_destroy_snap believing 
>>> that the snapshot will be destroyed, but since we either didn't call 
>>> release or didn't call it in the right spot, we have no way to know 
>>> that the destroy didn't happen. The use of B_FALSE here forces the 
>>> destroy to happen now and not at some later point when zfs release 
>>> is called. What if we have a bunch of snapshots we're destroying and 
>>> all of them are stacked up waiting for the zfs release? When we call 
>>> the zfs release it starts destroying them, but if one of them fails 
>>> for some reason, I'm not sure how straightforward it would be to map 
>>> that back to the snapshot destroy call we need to reference to 
>>> return the error. It may not be as big a deal as I'm thinking right 
>>> now, but why add another call (zfs release) to do the actual destroy 
>>> later if we really don't need to?
>> I think I am either misunderstanding the ZFS changes, or 
>> misunderstanding this specific instance of using a BE snapshot. From 
>> what I understand:
>>
>> zfs_create will create a snapshot, and set the user reference count to 1
>> zfs_release will decrement the user reference count, and destroy the 
>> snapshot if it brings the reference count to 0
>> zfs_hold will increment the reference count
>> zfs_destroy will succeed if reference count is 1, and fail otherwise.
>
> We're using zfs_destroy_snap here.
>
>>
>> If the snapshot being destroyed was created during this invocation of 
>> beadm, I see no problem with immediately destroying it. However, I 
>> think the appropriate call would be zfs_release, as this fits the new 
>> model more appropriately. Unless I misunderstand the new 
>> functionality, or the proposed fix in the CR is not what was 
>> implemented, zfs_release should delete the snapshot without a 
>> separate call to zfs_destroy.
>>
>
> As I understand things, with the new changes both the 
> zfs_destroy_snap() call and the zfs_release call are needed.
>
>> If the snapshot being destroyed was NOT created during this 
>> invocation of beadm, then it is not safe to assume that no user has 
>> placed a hold on it, and thus, immediate destruction is the incorrect 
>> choice here. Again, zfs_release should be used here.
>
> We are always dealing with either snapshots created during this 
> invocation of beadm, snapshots related directly to the clone we are 
> specifically destroying, or a snapshot that was specifically 
> requested to be destroyed.
>
>>
>> In either case, by calling zfs_release, we can then immediately check 
>> the user reference count or return value, and immediately know 
>> whether (a) the snapshot was destroyed or (b) the snapshot is still 
>> alive due to a hold.
>
> We still need the zfs_destroy_snap call. I really don't want to have 
> to add an additional call to zfs_release, which is what would be 
> required if we set the boolean to true. Plus, it wouldn't buy us 
> anything to have zfs wait for the reference count to drop, since 
> we've already gone through the process of making sure that this 
> snapshot doesn't have any dependencies before we get to this point. 
> Additionally, if there are still references, we don't want to destroy 
> this snapshot. If we use the B_TRUE and zfs_release route here, it 
> wouldn't destroy the snapshot right then. If we then exit and another 
> call comes into the library that releases that reference, the 
> snapshot would automatically be destroyed out from underneath us, 
> which may not be exactly what we wanted.
>
>>
>> Related question: Are there any situations where failure to delete a 
>> snapshot would preclude us from finishing a call to beadm?
>
> Yes. If we fail to destroy the snapshot that a BE's datasets were 
> cloned from, then we fail the destroy and end up with a snapshot 
> hanging around that should have been removed.
>
>
> The short answer to all of this is that the use of B_FALSE here 
> defaults us to the original behavior, which is what we want. If 
> there's still a reference to this snapshot, then something in our 
> code has failed and we should catch it.
>
>>>
>>> -evan
>