Re: [ceph] what's going on with d_rehash() in splice_dentry()?
On Mon, Mar 07, 2016 at 06:25:13AM -0500, Sage Weil wrote: > That looks okay, but changing d_rehash to d_add still means you're doing > te d_instantiate(dn, NULL) in the d_unhashed case; is there a reason you > changed that line? Is the dentry_rcuwalk_invalidate in __d_instantiate is > important before rehashing? d_add() gets uninlined in my queue, along with some locking massage to avoid extra bouncing of ->d_lock we do for no good reason. So that call is also going away. > > As an aside, tracking back to the originating fs method is > > painful as hell ;-/ I _think_ that rehash can be hit during ->lookup() > > returning a negative, but I wouldn't bet a dime on it not happening from > > other methods... AFAICS, the change should be OK regardless of what > > it's been called from, but... _ouch_. Is is documented anywhere public? > > It is a pain to follow, yes. FWIW this whole block is predicated in > req->r_locked_dir being non-NULL (i.e., VFS holds dir->i_mutex), which is > only true for lookup, create operations (mkdir/mknod/symlink/etc.), > atomic_open, and the .get_name export op. There's not much documentation > beyond a description of the meaning of fields (e.g. r_locked_dir) in > fs/ceph/mds_client.h ... Yes. Now consider the plans to make ->i_mutex an rwsem and weaken it for ->lookup() (i.e. held only shared when it's held for lookup alone). IOW, the aforementioned queue... I would rather avoid d_rehash() in ->lookup() instances; the tricky part is that we want to avoid parallel lookups on the *same* name, so we need an "in parallel lookup" state for dentries and mechanism allowing to wait for it to be finished. d_add()/d_splice_alias() on such a dentry would take it out of that state and I would rather avoid dealing with it in d_rehash()... I was going to post the series last week, got sidetracked on various fun things I'd found (ecryptfs and lustre). Hopefully will get all of that sorted out and the whole series posted for review tomorrow or on Wednesday...
Re: [ceph] what's going on with d_rehash() in splice_dentry()?
On Mon, 7 Mar 2016, Al Viro wrote: > On Wed, Mar 02, 2016 at 11:00:01AM +0800, Yan, Zheng wrote: > > > > This code dates back to when Ceph was originally upstreamed, so the > > > history is murky, but I expect at that point I wanted to avoid hashing in > > > the no-lease case. But I don't think it matters. We should just remove > > > the prehash argument from splice_dentry entirely. > > > > > > Zheng, does that sound right? > > > > Yes. I think we can remove the d_rehash(dn) call and rehash parameter. > > Another question in the same general area: > /* null dentry? */ > if (!rinfo->head->is_target) { > dout("fill_trace null dentry\n"); > if (d_really_is_positive(dn)) { > ceph_dir_clear_ordered(dir); > dout("d_delete %p\n", dn); > d_delete(dn); > } else { > dout("d_instantiate %p NULL\n", dn); > d_instantiate(dn, NULL); > if (have_lease && d_unhashed(dn)) > d_rehash(dn); > update_dentry_lease(dn, rinfo->dlease, > session, > req->r_request_started); > } > goto done; > } > What's that d_instantiate() about? We have just checked that it's > negative; what's the point of setting ->d_inode to NULL again? Would it > be OK if we just do > } else { > if (have_lease && d_unhashed(dn)) > d_add(dn, NULL); > update_dentry_lease(dn, rinfo->dlease, > session, > req->r_request_started); > } > in there? That looks okay, but changing d_rehash to d_add still means you're doing te d_instantiate(dn, NULL) in the d_unhashed case; is there a reason you changed that line? Is the dentry_rcuwalk_invalidate in __d_instantiate is important before rehashing? > As an aside, tracking back to the originating fs method is > painful as hell ;-/ I _think_ that rehash can be hit during ->lookup() > returning a negative, but I wouldn't bet a dime on it not happening from > other methods... AFAICS, the change should be OK regardless of what > it's been called from, but... _ouch_. Is is documented anywhere public? It is a pain to follow, yes. FWIW this whole block is predicated in req->r_locked_dir being non-NULL (i.e., VFS holds dir->i_mutex), which is only true for lookup, create operations (mkdir/mknod/symlink/etc.), atomic_open, and the .get_name export op. There's not much documentation beyond a description of the meaning of fields (e.g. r_locked_dir) in fs/ceph/mds_client.h ... sage
Re: [ceph] what's going on with d_rehash() in splice_dentry()?
On Wed, Mar 02, 2016 at 11:00:01AM +0800, Yan, Zheng wrote: > > This code dates back to when Ceph was originally upstreamed, so the > > history is murky, but I expect at that point I wanted to avoid hashing in > > the no-lease case. But I don't think it matters. We should just remove > > the prehash argument from splice_dentry entirely. > > > > Zheng, does that sound right? > > Yes. I think we can remove the d_rehash(dn) call and rehash parameter. Another question in the same general area: /* null dentry? */ if (!rinfo->head->is_target) { dout("fill_trace null dentry\n"); if (d_really_is_positive(dn)) { ceph_dir_clear_ordered(dir); dout("d_delete %p\n", dn); d_delete(dn); } else { dout("d_instantiate %p NULL\n", dn); d_instantiate(dn, NULL); if (have_lease && d_unhashed(dn)) d_rehash(dn); update_dentry_lease(dn, rinfo->dlease, session, req->r_request_started); } goto done; } What's that d_instantiate() about? We have just checked that it's negative; what's the point of setting ->d_inode to NULL again? Would it be OK if we just do } else { if (have_lease && d_unhashed(dn)) d_add(dn, NULL); update_dentry_lease(dn, rinfo->dlease, session, req->r_request_started); } in there? As an aside, tracking back to the originating fs method is painful as hell ;-/ I _think_ that rehash can be hit during ->lookup() returning a negative, but I wouldn't bet a dime on it not happening from other methods... AFAICS, the change should be OK regardless of what it's been called from, but... _ouch_. Is is documented anywhere public?
Re: [ceph] what's going on with d_rehash() in splice_dentry()?
> On Mar 1, 2016, at 22:50, Sage Weil wrote: > > Hi Al, > > On Fri, 26 Feb 2016, Al Viro wrote: >> You have, modulo printks and BUG_ON(), >> { >>struct dentry *realdn; >>/* dn must be unhashed */ >>if (!d_unhashed(dn)) >>d_drop(dn); >>realdn = d_splice_alias(in, dn); >>if (IS_ERR(realdn)) { >>if (prehash) >>*prehash = false; /* don't rehash on error */ >>dn = realdn; /* note realdn contains the error */ >>goto out; >>} else if (realdn) { >>dput(dn); >>dn = realdn; >>} >>if ((!prehash || *prehash) && d_unhashed(dn)) >>d_rehash(dn); >> >> When d_splice_alias() returns NULL it has hashed the dentry you'd given it; >> when it returns a different dentry, that dentry is also returned hashed. >> IOW, d_rehash(dn) in there should never be called. >> >> If you have a case when it _is_ called, you've found a bug somewhere and >> I'd like to see details. AFAICS, the whole prehash thing appears to be >> pointless - even the place where we modify *prehash, since in that case >> we return ERR_PTR() and the only caller passing non-NULL prehash >> (&have_lease) >> buggers off on such return value past all code that would look at have_lease >> value. > > Right. > >> One possible reading is that you want to prevent hashing in !have_lease >> case of >>dn = splice_dentry(dn, in, &have_lease); >> If that's the case, you might have a problem, since it will be hashed no >> matter what... > > In this case it doesn't actually matter if it is hashed or not, since > we will look at the lease state on the dentry before trusting it... > > This code dates back to when Ceph was originally upstreamed, so the > history is murky, but I expect at that point I wanted to avoid hashing in > the no-lease case. But I don't think it matters. We should just remove > the prehash argument from splice_dentry entirely. > > Zheng, does that sound right? Yes. I think we can remove the d_rehash(dn) call and rehash parameter. Regards Yan, Zheng
Re: [ceph] what's going on with d_rehash() in splice_dentry()?
Hi Al, On Fri, 26 Feb 2016, Al Viro wrote: > You have, modulo printks and BUG_ON(), > { > struct dentry *realdn; > /* dn must be unhashed */ > if (!d_unhashed(dn)) > d_drop(dn); > realdn = d_splice_alias(in, dn); > if (IS_ERR(realdn)) { > if (prehash) > *prehash = false; /* don't rehash on error */ > dn = realdn; /* note realdn contains the error */ > goto out; > } else if (realdn) { > dput(dn); > dn = realdn; > } > if ((!prehash || *prehash) && d_unhashed(dn)) > d_rehash(dn); > > When d_splice_alias() returns NULL it has hashed the dentry you'd given it; > when it returns a different dentry, that dentry is also returned hashed. > IOW, d_rehash(dn) in there should never be called. > > If you have a case when it _is_ called, you've found a bug somewhere and > I'd like to see details. AFAICS, the whole prehash thing appears to be > pointless - even the place where we modify *prehash, since in that case > we return ERR_PTR() and the only caller passing non-NULL prehash (&have_lease) > buggers off on such return value past all code that would look at have_lease > value. Right. > One possible reading is that you want to prevent hashing in !have_lease > case of > dn = splice_dentry(dn, in, &have_lease); > If that's the case, you might have a problem, since it will be hashed no > matter what... In this case it doesn't actually matter if it is hashed or not, since we will look at the lease state on the dentry before trusting it... This code dates back to when Ceph was originally upstreamed, so the history is murky, but I expect at that point I wanted to avoid hashing in the no-lease case. But I don't think it matters. We should just remove the prehash argument from splice_dentry entirely. Zheng, does that sound right? Thanks! sage
[ceph] what's going on with d_rehash() in splice_dentry()?
You have, modulo printks and BUG_ON(), { struct dentry *realdn; /* dn must be unhashed */ if (!d_unhashed(dn)) d_drop(dn); realdn = d_splice_alias(in, dn); if (IS_ERR(realdn)) { if (prehash) *prehash = false; /* don't rehash on error */ dn = realdn; /* note realdn contains the error */ goto out; } else if (realdn) { dput(dn); dn = realdn; } if ((!prehash || *prehash) && d_unhashed(dn)) d_rehash(dn); When d_splice_alias() returns NULL it has hashed the dentry you'd given it; when it returns a different dentry, that dentry is also returned hashed. IOW, d_rehash(dn) in there should never be called. If you have a case when it _is_ called, you've found a bug somewhere and I'd like to see details. AFAICS, the whole prehash thing appears to be pointless - even the place where we modify *prehash, since in that case we return ERR_PTR() and the only caller passing non-NULL prehash (&have_lease) buggers off on such return value past all code that would look at have_lease value. One possible reading is that you want to prevent hashing in !have_lease case of dn = splice_dentry(dn, in, &have_lease); If that's the case, you might have a problem, since it will be hashed no matter what... PS: the proof that d_splice_alias() always hashes is simple - if you exclude the places where it returns ERR_PTR(), you are left with d_rehash(dentry); return NULL; in the very end, __d_move(new, dentry, false); ... return new; and int err = __d_unalias(inode, dentry, new); ... // err turned out to be zero return new; The first one is obvious - we return NULL after an explicit d_rehash() of the argument. __d_move() is guaranteed to return with its first argument hashed due to __d_drop(dentry); __d_rehash(dentry, d_hash(target->d_parent, target->d_name.hash)); (dentry here refers to the first argument of __d_move() - it's our 'new'). And zero-returning __d_unalias() ends up calling __d_move(), with the third argument of __d_unalias() ending up as the first one of __d_move(). So in both remaining cases we return a dentry that has just been hashed.