Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the 
following link:
https://bugzilla.lustre.org/show_bug.cgi?id=10589



(In reply to comment #27)
> > int ldlm_resource_get_unused(ldlm_resource *res, list_head *head,
> > __u64 bits)
> 
> > This method gathers all the unused locks on the resource @res into
> > @head. Only those inodebits locks that match the policy @bits are
> > added to @head. Further attempts to find or cancel locks inserted
> > into the list @head will neither find them nor be able to cancel
> > them.
> 
> I think it should be applicable to any lock, not just inodebit locks.

Yes, it is applicable to any lock, but @bits is checked for inodebits
locks only.
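
For illustration only, here is a minimal user-space sketch of the
semantics described above. It is not the real Lustre code: model_lock,
model_res and model_resource_get_unused are invented stand-ins, and the
"later lookups will not find these locks" guarantee is modelled simply
by detaching the gathered locks from the resource list and flagging
them.

#include <stdio.h>

struct model_lock {
    struct model_lock *next;
    unsigned long long ibits;   /* inodebits policy, 0 for non-IBITS locks */
    int in_use;                 /* still referenced by readers/writers     */
    int being_canceled;         /* set once gathered for a batched cancel  */
};

struct model_res {
    struct model_lock *locks;   /* all locks on this resource */
};

/* Move matching unused locks from @res onto the @head list. */
static int model_resource_get_unused(struct model_res *res,
                                     struct model_lock **head,
                                     unsigned long long bits)
{
    struct model_lock **prev = &res->locks;
    int count = 0;

    while (*prev != NULL) {
        struct model_lock *lck = *prev;

        /* Skip busy or already-gathered locks; @bits is only applied
         * to inodebits locks, other lock types always qualify. */
        if (lck->in_use || lck->being_canceled ||
            (lck->ibits != 0 && (lck->ibits & bits) == 0)) {
            prev = &lck->next;
            continue;
        }

        /* Detach from the resource list and mark it, so later lookups
         * neither find it nor try to cancel it a second time. */
        *prev = lck->next;
        lck->being_canceled = 1;
        lck->next = *head;
        *head = lck;
        count++;
    }
    return count;
}

int main(void)
{
    struct model_lock a = { .ibits = 0x1 };               /* unused, matches  */
    struct model_lock b = { .ibits = 0x2, .in_use = 1 };  /* busy, is skipped */
    struct model_res res = { .locks = &a };
    struct model_lock *head = NULL;

    a.next = &b;
    b.next = NULL;

    printf("gathered %d lock(s)\n",
           model_resource_get_unused(&res, &head, 0x1));
    return 0;
}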

> You want this same feature for extent locks too, for example (on destroy).
> perhaps it should take policy here and gather all locks that intersect
> policy-wise.
> In fact I see you use this function from OSC too, and bits there just
> make zero sense.

Right, but there is no defect here: for non-inodebits locks @bits is
simply ignored.

> Elsewhere in mdc* osc_destroy:
> there is no point in searching locks when OBD_CONNECT_EARLY_CANCEL,
> I think? Or do you think we can save on the blocking AST?

A blocking AST will be sent by the server anyway, but we can save on
the later cancel RPC.
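
To make the RPC accounting in that answer concrete, here is a tiny,
hypothetical model (all names are invented, and it assumes one unused
lock cached per destroyed object):

#include <stdio.h>
#include <stdbool.h>

struct model_stats {
    int rpcs_sent;
    int blocking_asts_received;
};

/* One destroy of an object that has a single unused lock cached. */
static void model_destroy(bool early_cancel, struct model_stats *st)
{
    st->rpcs_sent++;                    /* the destroy RPC itself */

    /* The server revokes the lock either way, so a blocking AST
     * arrives in both schemes. */
    st->blocking_asts_received++;

    if (!early_cancel)
        st->rpcs_sent++;                /* separate cancel RPC in reply
                                         * to the blocking AST */
    /* With early cancel the lock was already dropped locally, the
     * blocking AST is answered with an error (treated as a legal race
     * on the server), and no cancel RPC follows. */
}

int main(void)
{
    struct model_stats with = { 0 }, without = { 0 };

    model_destroy(true, &with);
    model_destroy(false, &without);

    printf("with early cancel:    %d RPC(s)\n", with.rpcs_sent);
    printf("without early cancel: %d RPC(s)\n", without.rpcs_sent);
    return 0;
}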

> Then for such a case cancel ASTs should be sent to the servers (and
> ldlm_resource_cancel_unused must not set the cancelling flags); failing
> to send these ASTs will cause the server to still send blocking ASTs
> for these locks (which the client no longer has), and then to evict
> the client because it never replies with a cancel.

That does not seem correct: the server will get a blocking-AST reply
with an EINVAL error, which is currently treated as a legal race in
ldlm_handle_ast_error(). After that the server does not wait for a
cancel; it just cancels its own lock and proceeds without evicting the
client. Meanwhile, the cancel RPC is not sent back to the server, which
saves one RPC.
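
A minimal user-space model of that server-side path (the real logic
lives in ldlm_handle_ast_error(); the enum, function name and the exact
set of reply codes below are my assumptions):

#include <stdio.h>
#include <errno.h>

enum ast_outcome {
    AST_WAIT_FOR_CANCEL,    /* normal path: client will send a cancel RPC  */
    AST_CANCEL_LOCALLY,     /* legal race: lock already gone on the client */
    AST_EVICT_CLIENT,       /* client is misbehaving or unreachable        */
};

static enum ast_outcome model_handle_blocking_ast_reply(int reply_rc)
{
    switch (reply_rc) {
    case 0:
        return AST_WAIT_FOR_CANCEL;
    case -EINVAL:
        /* The client could not find the lock by its handle: it was
         * already cancelled locally (e.g. gathered for a batched
         * cancel).  Treat this as a legal race, cancel the server-side
         * lock and proceed -- no eviction, and no cancel RPC expected. */
        return AST_CANCEL_LOCALLY;
    default:
        return AST_EVICT_CLIENT;
    }
}

int main(void)
{
    printf("reply rc=-EINVAL -> outcome %d (cancel locally, no eviction)\n",
           model_handle_blocking_ast_reply(-EINVAL));
    return 0;
}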
 
> In lustre_swab_ldlm_request:
> Why no swabbing for lock handles themselves if there are more than 2?

Lock handles are opaque cookies and are not swabbed.
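
A hypothetical sketch of that rule (not the real
lustre_swab_ldlm_request(); the struct layout, field names and handle
count are invented): numeric fields get byte-swapped for a cross-endian
peer, while the handle cookies pass through untouched no matter how
many of them the request carries.

#include <stdio.h>
#include <stdint.h>

struct model_handle {
    uint64_t cookie;            /* opaque to the peer: never swabbed */
};

struct model_ldlm_request {
    uint32_t lock_flags;
    uint32_t lock_count;
    struct model_handle lock_handle[4];   /* size is illustrative only */
};

static uint32_t swab32(uint32_t v)
{
    return ((v & 0x000000ffU) << 24) | ((v & 0x0000ff00U) << 8) |
           ((v & 0x00ff0000U) >> 8)  | ((v & 0xff000000U) >> 24);
}

/* Swab only the numeric fields; leave the opaque handles alone. */
static void model_swab_ldlm_request(struct model_ldlm_request *req)
{
    req->lock_flags = swab32(req->lock_flags);
    req->lock_count = swab32(req->lock_count);
    /* lock_handle[] is intentionally untouched: the cookies are only
     * ever interpreted by the node that generated them, so byte-
     * swapping them would be wrong as well as unnecessary. */
}

int main(void)
{
    struct model_ldlm_request req = { .lock_flags = 0x1, .lock_count = 3 };

    req.lock_handle[0].cookie = 0x1122334455667788ULL;
    model_swab_ldlm_request(&req);
    printf("lock_count after swab: 0x%x, handle[0] unchanged: 0x%llx\n",
           req.lock_count, (unsigned long long)req.lock_handle[0].cookie);
    return 0;
}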

> Locking:
> > While we are sending a batched cancel request, a lock may be
> > cancelled on the server in parallel. Then this lock will not be
> > found by its handle on the server when our batched cancel request
> > is handled. This is also considered a valid race.
> 
> This cannot happen. The server won't cancel a lock without
> confirmation from the client, so either it degenerates into the 1st
> case in the locking section, or the lock is already canceled
> client-side, in which case we won't find it on the client and won't
> send the handle to the server.

Yes, this repeats the 1st case of the locking section; I will remove
this one.
