Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the
following link:
https://bugzilla.lustre.org/show_bug.cgi?id=10717
Was this race fixed in post-1.4.5.8 releases with the addition of the
change below, or is the observed failure understood to have been
caused by a different event?
int ldlm_cancel_lru(struct ldlm_namespace *ns, ldlm_sync_t sync)
{
        ...
        list_for_each_entry_safe(lock, next, &ns->ns_unused_list, l_lru) {
                LASSERT(!lock->l_readers && !lock->l_writers);

                /* If we have chosen to cancel this lock voluntarily, we
                 * better send cancel notification to server, so that it
                 * frees appropriate state. This might lead to a race where
                 * while we are doing cancel here, server is also silently
                 * cancelling this lock. */
>>>             lock->l_flags &= ~LDLM_FL_CANCEL_ON_BLOCK;
                ...
        }
        ...
}
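
For what it's worth, below is a minimal, self-contained sketch of how I read
the intent of that change. Everything here (demo_lock, demo_send_cancel_rpc,
demo_lru_cancel, and the flag value) is a hypothetical stand-in, not the real
Lustre 1.4.x code; the only point it illustrates is that clearing
LDLM_FL_CANCEL_ON_BLOCK before a voluntary LRU cancel makes the client send
an explicit cancel notification to the server instead of relying on
cancel-on-block, at the cost of possibly racing with a silent server-side
cancel.

#include <stdio.h>

/* Hypothetical, simplified flag and lock; not the real Lustre definitions. */
#define LDLM_FL_CANCEL_ON_BLOCK 0x01

struct demo_lock {
        unsigned long l_flags;
};

/* Stand-in for sending an explicit cancel RPC so the server frees its
 * state for this lock. */
static void demo_send_cancel_rpc(struct demo_lock *lock)
{
        printf("client: sending cancel RPC so the server drops its state\n");
}

/* With CANCEL_ON_BLOCK set, the client would skip the cancel RPC and rely
 * on the server to reap the lock on a blocking callback.  Clearing the flag
 * before a voluntary LRU cancel forces the explicit notification, which is
 * what the quoted change appears to do. */
static void demo_lru_cancel(struct demo_lock *lock)
{
        lock->l_flags &= ~LDLM_FL_CANCEL_ON_BLOCK;
        if (!(lock->l_flags & LDLM_FL_CANCEL_ON_BLOCK))
                demo_send_cancel_rpc(lock);
}

int main(void)
{
        struct demo_lock lock = { .l_flags = LDLM_FL_CANCEL_ON_BLOCK };
        demo_lru_cancel(&lock);
        return 0;
}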