Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the following link: https://bugzilla.lustre.org/show_bug.cgi?id=10589
(In reply to comment #27)

> > int ldlm_resource_get_unused(ldlm_resource *res, list_head *head,
> >                              __u64 bits)
> >
> > This method gathers all the unused locks on the resource @res into
> > @head. Only those inodebits locks that match the policy @bits are
> > added to @head. Further attempts to find or cancel the locks inserted
> > into the list @head will neither find them nor be able to cancel them.
>
> I think it should be applicable to any lock, not just inodebits locks.

Yes, it is; @bits is only checked for inodebits locks (a sketch of this
is appended at the end of this mail).

> You want this same feature for extent locks too, for example (on
> destroy). Perhaps it should take a policy here and gather all locks
> that intersect policy-wise. In fact I see you use this function from
> OSC too, and bits there just make zero sense.

Right, no defect here.

> Elsewhere in mdc*, osc_destroy: there is no point in searching locks
> when OBD_CONNECT_EARLY_CANCEL, I think? Or do you think we can save on
> the blocking AST?

The blocking AST will be sent by the server anyway, but we can save on
the later cancel RPC.

> Then for such a case cancelling ASTs should be sent to servers (and
> ldlm_resource_cancel_unused must not set the cancelling flags); failing
> to send these ASTs will still cause the server to send blocking ASTs
> for these locks (which the client no longer has), and then to evict the
> client because it never replies with a cancel.

That does not seem correct. The server will get a blocking-AST reply
with an EINVAL error, which ldlm_handle_ast_error() currently treats as
a legal race. After that the server does not wait for a cancel: it
simply cancels its own copy of the lock and proceeds without evicting
the client. Meanwhile no cancel RPC is sent back to the server, which
saves one RPC (see the second sketch at the end of this mail).

> In lustre_swab_ldlm_request: why no swabbing for the lock handles
> themselves if there are more than 2?

Lock handles are opaque and are not swabbed (see the third sketch at the
end of this mail).

> Locking:
>
> > While we are sending a batched cancel request, a lock may be
> > cancelled on the server in parallel. Then this lock will not be found
> > by its handle on the server when our batched cancel request is
> > handled. This is also considered a valid race.
>
> This cannot happen. The server won't cancel a lock without confirmation
> from the client, so either it degenerates into the 1st case in the
> locking section, or the lock is already cancelled client-side and then
> we won't find it on the client and won't send the handle to the server.

Yes, this is a repeat of the 1st case of the locking section; I will
remove it.
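
First sketch: the gathering logic with the @bits check. This is plain
userspace C with invented stand-in types (sketch_lock, sketch_resource
and friends are made up for this mail, not the real LDLM structures), so
read it as an illustration of the behaviour described above, not as the
patch itself:

    #include <stddef.h>
    #include <stdint.h>

    /* Simplified stand-ins for the LDLM structures -- illustration only. */
    enum sketch_lock_type { SKETCH_PLAIN, SKETCH_EXTENT, SKETCH_IBITS };

    struct sketch_lock {
        enum sketch_lock_type  type;
        uint64_t               ibits;       /* meaningful for SKETCH_IBITS only */
        int                    in_use;      /* still has readers/writers */
        int                    gathered;    /* already handed to a cancel batch */
        struct sketch_lock    *res_next;    /* the resource's granted list */
        struct sketch_lock    *batch_next;  /* the cancel batch being built */
    };

    struct sketch_resource {
        struct sketch_lock *granted;
    };

    /* Move every unused lock on @res onto the caller's batch list @head.
     * The @bits filter applies to inodebits locks only; extent and plain
     * locks are taken regardless of @bits.  Marking a lock "gathered" is
     * what keeps later lookups and cancels from finding it again. */
    static int sketch_resource_get_unused(struct sketch_resource *res,
                                          struct sketch_lock **head,
                                          uint64_t bits)
    {
        struct sketch_lock *lock;
        int count = 0;

        for (lock = res->granted; lock != NULL; lock = lock->res_next) {
            if (lock->in_use || lock->gathered)
                continue;
            if (lock->type == SKETCH_IBITS && !(lock->ibits & bits))
                continue;
            lock->gathered = 1;
            lock->batch_next = *head;
            *head = lock;
            count++;
        }
        return count;
    }

The second "continue" is the whole point: only inodebits locks consult
@bits, which is why passing bits from OSC is harmless even though it
means nothing there, and marking the lock gathered is what hides it from
later find/cancel attempts.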
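
Second sketch: the server-side handling of the EINVAL blocking-AST reply
that the early-cancel case relies on. The helpers srv_cancel_lock() and
srv_evict_client() are placeholders invented here; the real decision is
made in ldlm_handle_ast_error():

    #include <errno.h>

    struct srv_lock;                    /* opaque in this sketch */

    /* Invented placeholders for the real server-side actions. */
    void srv_cancel_lock(struct srv_lock *lock);
    void srv_evict_client(struct srv_lock *lock);

    /* Decide what to do with the return code of a blocking AST.
     * -EINVAL means the client no longer knows the lock handle (it has
     * already dropped the lock locally, e.g. it went out in a batched
     * cancel), so this is a legal race: cancel the server copy and keep
     * going.  No cancel RPC will follow from the client, which saves
     * one RPC.  Only a genuine failure leads to eviction. */
    void srv_handle_blocking_ast_reply(struct srv_lock *lock, int rc)
    {
        if (rc == 0)
            return;                     /* client got the AST, will cancel */

        if (rc == -EINVAL) {
            srv_cancel_lock(lock);      /* legal race: no eviction */
            return;
        }

        srv_evict_client(lock);         /* truly unresponsive client */
    }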
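
Third sketch: why the lock handle array needs no swabbing. The struct
below is a mock, not the real ldlm_request layout; the point is only
that a handle is a cookie interpreted solely by the node that generated
it, so it is echoed back byte-for-byte and has no wire byte order to
fix:

    #include <stdint.h>

    #define MOCK_LOCKREQ_HANDLES 2      /* may grow with batched cancel */

    struct mock_lustre_handle { uint64_t cookie; };   /* opaque to the peer */

    struct mock_ldlm_request {
        uint32_t                  lock_flags;
        uint32_t                  lock_count;
        struct mock_lustre_handle lock_handle[MOCK_LOCKREQ_HANDLES];
    };

    static uint32_t swab32(uint32_t x)
    {
        return ((x & 0x000000ffu) << 24) | ((x & 0x0000ff00u) << 8) |
               ((x & 0x00ff0000u) >> 8)  | ((x & 0xff000000u) >> 24);
    }

    /* Swab only the numeric wire fields.  The handles are deliberately
     * left alone no matter how many the request carries: only the node
     * that generated a handle ever interprets it, and that node sees
     * its own byte order, so there is nothing to fix up. */
    static void mock_swab_ldlm_request(struct mock_ldlm_request *rq)
    {
        rq->lock_flags = swab32(rq->lock_flags);
        rq->lock_count = swab32(rq->lock_count);
        /* rq->lock_handle[]: intentionally not swabbed */
    }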
