On Thu, 5 Mar 2015 06:05:20 +0000 Al Viro <v...@zeniv.linux.org.uk> wrote:

> On Thu, Mar 05, 2015 at 04:21:21PM +1100, NeilBrown wrote:
> > Hi Al (and others),
> > 
> >  I wonder if you could look over this patchset.
> >  It allows RCU-walk to follow symlinks in many common cases,
> >  thus removing a surprising performance hit caused by using symlinks.
> > 
> >  The last could of patches make changes to XFS and NFS to support
> >  this but I haven't forwarded to the relevant lists yet.
> >  If/when the early code meets with approval I'll do that.
> > 
> >  The first patch almost certainly needs to be changed.  I originally
> >  wrote this code when filesystems could see inside nameidata.
> >  It is now opaque so the simplest solution was to provide an
> >  accessor function.
> >  Maybe I should as a 'flags' arg to ->follow_link?? Or have
> >  ->follow_link and ->follow_link_rcu ??
> >  What do you suggest?
> 
> Umm...  Some observations:
>       * now ->follow_link() can be called in RCU mode, which means
> that it can race with fs shutdown; not a problem, except that now it
> joins ->lookup() et.al. in "if some data structure is needed in RCU
> case of that, make sure it's not destroyed without an RCU delay somewhere
> between the entry into ->kill_sb() and destruction.
>       * highmem pages in symlinks: that BS shouldn't be allowed at
> all.  Just make sure that at least for those filesystems symlink inodes
> get mapping_set_gfp_mask(&inode->i_data, GFP_KERNEL) and be done with that.
>       * are you sure that security_inode_follow_link() is OK to call in
> RCU mode?
>       * what warranties are you giving for the lifetime of strings
> passed to nd_set_link()?  Right now it's "should not be freed until the
> matching ->put_link()"; what happens for RCU mode?
>       * really nasty one: creat(2) on a dangling symlink.  What's to
> preserve the last component if you get into that symlink in RCU mode?
> 
> TBH, I'm less than fond of passing nameidata to ->follow_link() at all,
> flags or no flags.  We could kill current->link_count and
> current->total_link_count, replacing them with one void * current->nameidata
> and taking counters into struct nameidata itself.  Have places like e.g.
> kern_path_locked() do
>       struct nameidata nd, *saved = set_nameidata(&nd);
>       ...
>       set_nameidata(saved);
> with set_nameidata(p) doing this:
>       old = current->nameidata;
>       current->nameidata = p;
>       if (p) {
>               if (!old) {
>                       p->link_count = 0;
>                       p->total_link_count = 0;
>               } else {
>                       p->link_count = old->link_count;
>                       p->total_link_count = old->total_link_count;
>               }
>       }
>       return old;
> 
> Then nd_set_link() et.al. would use current->nameidata instead of an
> explicitly passed pointer and ->follow_link() instances wouldn't need
> that opaque pointer passed to them at all.

Wow, thanks for the quick response!!!

I'll look into all those issues and get back to you when I have something
coherent to say.

NeilBrown

Attachment: pgp_fcu9BHu14.pgp
Description: OpenPGP digital signature

Reply via email to