Re: [git pull] more vfs bits

2015-02-22 Thread Sedat Dilek
On Sun, Feb 22, 2015 at 10:37 AM, Al Viro  wrote:
> On Sun, Feb 22, 2015 at 10:32:02AM +0100, Sedat Dilek wrote:
>
>> How do you test?
>
> Mostly xfstests and LTP, plus assorted tests depending on the areas touched...
> Any extra testing is welcome - the more, the merrier...

My base is Linux-v3.19-9526-ga135c717d5cd...

...plus vfs.git#for-linus up to...

commit 377a2340606837e7a1245280b0738e194335e0a1
"autofs4 copy_dev_ioctl(): keep the value of ->size we'd used for allocation"

In the attached tarballs you will find everything you need (see post-scriptum).

- Sedat -

P.S.: Output of ls-lR.txt

$ cat ls-lR.txt
.:
total 20
drwxr-xr-x 2 wearefam wearefam 4096 Feb 22 12:37 configs
drwxr-xr-x 2 wearefam wearefam 4096 Feb 22 14:12 fio
-rw-r--r-- 1 wearefam wearefam0 Feb 22 14:15 ls-lR.txt
drwxr-xr-x 2 wearefam wearefam 4096 Feb 22 14:11 ltp
drwxr-xr-x 4 wearefam wearefam 4096 Feb 22 12:40 patches
drwxr-xr-x 2 wearefam wearefam 4096 Feb 22 14:13 scripts

./configs:
total 124
-rw-r--r-- 1 wearefam wearefam 125234 Feb 22 10:55
config-3.19.0-9526.2-iniza-small

./fio:
total 132
-rw-r--r-- 1 wearefam wearefam 61638 Feb 22 14:10
dmesg_3.19.0-9526.2-iniza-small_after-fio-2.2.5.txt
-rw-r--r-- 1 wearefam wearefam 60066 Feb 22 14:09
dmesg_3.19.0-9526.2-iniza-small_before-fio-2.2.5.txt
-rw-r--r-- 1 wearefam wearefam  1980 Feb 22 14:10
dmesg_3.19.0-9526.2-iniza-small_fio-2.2.5.diff
-rw-r--r-- 1 wearefam wearefam  1921 Feb 22 14:10
results_fio-2.2.5_3.19.0-9526.2-iniza-small.txt

./ltp:
total 660
-rw-r--r-- 1 wearefam wearefam  60066 Feb 22 14:05
dmesg_3.19.0-9526.2-iniza-small_after-ltp-20150119.txt
-rw-r--r-- 1 wearefam wearefam  56791 Feb 22 13:40
dmesg_3.19.0-9526.2-iniza-small_before-ltp-20150119.txt
-rw-r--r-- 1 wearefam wearefam   3699 Feb 22 14:05
dmesg_3.19.0-9526.2-iniza-small_ltp-20150119.diff
-rw-r--r-- 1 wearefam wearefam 550587 Feb 22 14:05
results_ltp-20150119_3.19.0-9526.2-iniza-small.txt

./patches:
total 8
drwxr-xr-x 2 wearefam wearefam 4096 Feb 22 01:46 rhashtable-fixes-linux-3.20-rc1
drwxr-xr-x 2 wearefam wearefam 4096 Feb 22 12:40
syscalls-umount01-fixes-ltp-full-20150119

./patches/rhashtable-fixes-linux-3.20-rc1:
total 4
-rw-r--r-- 1 wearefam wearefam 1273 Feb 22 01:44
rhashtable-initialize-all-rhashtable-walker-members.patch

./patches/syscalls-umount01-fixes-ltp-full-20150119:
total 4
-rw-r--r-- 1 wearefam wearefam 1342 Feb 22 11:48
0001-syscalls-umount01-Give-a-hint-on-failure-with-EBUSY.patch

./scripts:
total 8
-rwxr-xr-x 1 wearefam wearefam 1544 Feb 22 14:12 fio-testcase.sh
-rwxr-xr-x 1 wearefam wearefam  696 Feb 22 14:13 ltp-testcase.sh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git pull] more vfs bits

2015-02-22 Thread David Howells
You might want to remove dentry_inode_once().  I can still see it in your
vfs/for-linus branch.

It might also be worth renaming d_dentry() to d_backing_dentry() to match
d_backing_inode().

David
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git pull] more vfs bits

2015-02-22 Thread David Howells
Linus Torvalds  wrote:

> So the ACCESS_ONCE() thing is more special than just "done under RCU".
> It's more like "really special case done without any of the normal
> locking _or_ any of the normal RCU checks".
> 
> That said, the overhead of using ACCESS_ONCE() is basically nil, so
> it's not like we couldn't just start doing more of them, and make it
> be more of a "any time we're under RCU" kind of thing.

Some functions access ->d_inode more than once.  Wouldn't that potentially
increase the number of load instructions?  Admittedly, calls to
dentry->d_inode could be replaced with inode = dentry->d_inode, then use
inode.

> Yeah, I think "d_backing_store_inode()" would probably be more along
> the lines, but that's a mouthful. Maybe shortened to
> "d_backing_inode()"?

Sounds more reasonable than d_opened_inode().  d_actual_inode() might also
work.  d_lower_inode() might work too.

David
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git pull] more vfs bits

2015-02-22 Thread David Howells
Linus Torvalds  wrote:

>  - dentry_inode*() is supposed to be "the inode that would be used if
> the dentry was opened"
>
>What part of "dentry_inode()" implies "if the dentry was opened" to
> you? Nothing. The name is fundamentally bad.

That because I wasn't thinking of it that way because it's used in a lot more
places than just opening code.  Audit, for example.

> And what *possible* situation could make that "_once()" version ever be
> valid? None. It's bogus. It's crap. It's insane. There is no way that it is
> *ever* a valid question to even ask. If the dentry is so unstable that you
> can't safely look at the inode, you had damn well better never ask "ok, what
> would the inode be if I opened this random pointer"?

There were originally some uses of dentry_inode_once(), but I think they
dropped out when I removed most of fs/*.c from consideration by the scripts.

>  - fs_inode*() is supposed to be "this is the inode that the native
> filesystem uses".

Yes.

> So of the four new helpers, I really don't see any of them as "good".
> I think "dentry_inode()" could remain, but even there I think the name
> should specify *what* it is ("d_opened_inode()"?  I don't like that name
> either,

That's also a poor choice.  The inode isn't even opened necessarily.  If it is
opened and you have the struct file *, you should almost certainly be using
file_inode().

David
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git pull] more vfs bits

2015-02-22 Thread David Howells
Al Viro  wrote:

> FWIW, I probably should've collapsed the per-fs patches together, or
> held them back until the next cycle; I understand why David did them
> that way (less painful rebasing that stuff)

Actually, that's not why I did it that way.  What I wanted was to get the
wrappers upstream in this window then I could push the individual per-fs
commits to the managers of those filesystems to go through their trees in the
next cycle.

I didn't rebase the scripted commits since I can just rerun the script.

David
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git pull] more vfs bits

2015-02-22 Thread Sedat Dilek
On Sun, Feb 22, 2015 at 10:37 AM, Al Viro  wrote:
> On Sun, Feb 22, 2015 at 10:32:02AM +0100, Sedat Dilek wrote:
>
>> How do you test?
>
> Mostly xfstests and LTP, plus assorted tests depending on the areas touched...
> Any extra testing is welcome - the more, the merrier...

As said I wanted to test with unionmount-testsuite from Git HEAD master.

But...

The README is out-of-date and...

...my python3 throws me a...

$ LC_ALL=C ./run --ov --set-up 2>&1 | tee
../unionmount-testsuite-log_ov-set-up_$(uname -r).txt
Traceback (most recent call last):
  File "./run", line 107, in 
set_up(ctx)
  File 
"/home/wearefam/src/unionmount-testsuite/unionmount-testsuite-git/set_up.py",
line 26, in set_up
os.sync()
AttributeError: 'module' object has no attribute 'sync'

$ dpkg -l | grep python3
ii  python3
3.2.3-0ubuntu1.2interactive
high-level object-oriented language (default python3 version)
ii  python3-minimal
3.2.3-0ubuntu1.2minimal subset of
the Python language (default python3 version)
ii  python3.2
3.2.3-0ubuntu3.6Interactive
high-level object-oriented language (version 3.2)
ii  python3.2-minimal
3.2.3-0ubuntu3.6Minimal subset of
the Python language (version 3.2)

AFAICS I had reported already the README needs a refresh.

I will continue with LTP-lite testing and fio.

- Sedat -
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git pull] more vfs bits

2015-02-22 Thread Al Viro
On Sun, Feb 22, 2015 at 10:32:02AM +0100, Sedat Dilek wrote:

> How do you test?

Mostly xfstests and LTP, plus assorted tests depending on the areas touched...
Any extra testing is welcome - the more, the merrier...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git pull] more vfs bits

2015-02-22 Thread Sedat Dilek
On Sun, Feb 22, 2015 at 9:51 AM, Al Viro  wrote:
> On Sat, Feb 21, 2015 at 07:16:16PM -0800, Linus Torvalds wrote:
>> On Sat, Feb 21, 2015 at 6:51 PM, Al Viro  wrote:
>> >
>> > Umm...  Works for me.  Let's do it this way, then:
>> > * rename those guys through the whole series
>> > * leave the "annotate the filesystems" bits to sit in a vfs.git
>> > branch
>> > * slap trylock_super() + bugfixes I'd been doing today
>> > (procfs and debugfs symlink removals racing with follow_link, oopsable;
>> > double-copy in autofs dev_ioctl.c, with length not rechecked after
>> > copying, theoretically oopsable + reasonably likely data leak) on top of 
>> > queue
>> > * feed it through local tests and send an updated pull request 
>> > later
>> > tonight.
>>
>> Ok, I guess I can live with that.
>
> Looks like it survived all the local beating so far...
>

How do you test?

I would like to contribute by testing...
By running LTP-lite testsuite and fio v2.2.5.
Maybe, I can also test with unionmount-testsuite the overlayfs part.
Sounds that good to you?

- Sedat -

> Stuff in there: assorted fixes, multilayer overlayfs from Miklos, beginning
> of David's series (long-term goal being to have VFS understand stacking
> relationships, rather than kludging that up in a bunch of places; for now
> it's mostly infrastructure pieces along with some optimizations that could
> be standalone).  Pushed, please pull from the usual place -
> git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus
>
> Shortlog:
> Al Viro (10):
>   switch ll_lookup_finish_locks() and ll_revalidate_it_finish() to inode
>   configfs: configfs_create() init callback is never NULL and it never 
> fails
>   configfs: fold create_dir() into its only caller
>   configfs_add_file: fold into its sole caller
>   don't bother with most of the bad_file_ops methods
>   hypfs: switch to read_iter/write_iter
>   Documentation/filesystems/Locking: ->get_sb() is long gone
>   debugfs: leave freeing a symlink body until inode eviction
>   procfs: fix race between symlink removals and traversals
>   autofs4 copy_dev_ioctl(): keep the value of ->size we'd used for 
> allocation
>
> Bastien Nocera (1):
>   coredump: Fix typo in comment
>
> David Howells (14):
>   configfs: Fix potential NULL d_inode dereference
>   Infiniband: Fix potential NULL d_inode dereference
>   VFS: Introduce inode-getting helpers for layered/unioned fs environments
>   VFS: Add a whiteout dentry type
>   VFS: Add a fallthrough flag for marking virtual dentries
>   VFS: Split DCACHE_FILE_TYPE into regular and special types
>   Apparmor: mediated_filesystem() should use dentry->d_sb not inode->i_sb
>   Apparmor: Use d_is_positive/negative() rather than testing 
> dentry->d_inode
>   TOMOYO: Use d_is_dir() rather than d_inode and S_ISDIR()
>   Smack: Use d_is_positive() rather than testing dentry->d_inode
>   SELinux: Use d_is_positive() rather than testing dentry->d_inode
>   VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to 
> d_is_*(dentry)
>   Cachefiles: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions
>   fanotify: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions
>
> Kinglong Mee (1):
>   fs/aio.c: Remove duplicate function name in pr_debug messages
>
> Konstantin Khlebnikov (1):
>   trylock_super(): replacement for grab_super_passive()
>
> Miklos Szeredi (16):
>   ovl: check whiteout while reading directory
>   ovl: make path-type a bitmap
>   ovl: dont replace opaque dir
>   ovl: add mutli-layer infrastructure
>   ovl: helper to iterate layers
>   ovl: multi-layer readdir
>   ovl: multi-layer lookup
>   ovl: check whiteout on lowest layer as well
>   ovl: lookup ENAMETOOLONG on lower means ENOENT
>   ovl: allow statfs if no upper layer
>   ovl: mount: change order of initialization
>   ovl: improve mount helpers
>   ovl: make upperdir optional
>   ovl: support multiple lower layers
>   ovl: add testsuite to docs
>   ovl: document lower layer ordering
>
> Omar Sandoval (1):
>   posix_acl: fix reference leaks in posix_acl_create
>
> Rasmus Villemoes (1):
>   autofs4: Wrong format for printing dentry
>
> Seunghun Lee (1):
>   ovl: Prevent rw remount when it should be ro mount
>
> hujianyang (5):
>   ovl: Cleanup redundant blank lines
>   ovl: Use macros to present ovl_xattr
>   ovl: Fix kernel panic while mounting overlayfs
>   ovl: Fix opaque regression in ovl_lookup
>   ovl: discard independent cursor in readdir()
>
> Diffstat:
>  Documentation/filesystems/Locking  |   2 -
>  Documentation/filesystems/overlayfs.txt|  28 +
>  arch/s390/hypfs/inode.c|  53 +-
>  drivers/infiniband/hw/ipath/ipath_fs.c |   2 +-
>  drivers/infiniband/hw/qib/qib_fs.c |   2 +-
> 

Re: [git pull] more vfs bits

2015-02-22 Thread Al Viro
On Sat, Feb 21, 2015 at 07:16:16PM -0800, Linus Torvalds wrote:
> On Sat, Feb 21, 2015 at 6:51 PM, Al Viro  wrote:
> >
> > Umm...  Works for me.  Let's do it this way, then:
> > * rename those guys through the whole series
> > * leave the "annotate the filesystems" bits to sit in a vfs.git
> > branch
> > * slap trylock_super() + bugfixes I'd been doing today
> > (procfs and debugfs symlink removals racing with follow_link, oopsable;
> > double-copy in autofs dev_ioctl.c, with length not rechecked after
> > copying, theoretically oopsable + reasonably likely data leak) on top of 
> > queue
> > * feed it through local tests and send an updated pull request later
> > tonight.
> 
> Ok, I guess I can live with that.

Looks like it survived all the local beating so far...

Stuff in there: assorted fixes, multilayer overlayfs from Miklos, beginning
of David's series (long-term goal being to have VFS understand stacking
relationships, rather than kludging that up in a bunch of places; for now
it's mostly infrastructure pieces along with some optimizations that could
be standalone).  Pushed, please pull from the usual place -
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus

Shortlog:
Al Viro (10):
  switch ll_lookup_finish_locks() and ll_revalidate_it_finish() to inode
  configfs: configfs_create() init callback is never NULL and it never fails
  configfs: fold create_dir() into its only caller
  configfs_add_file: fold into its sole caller
  don't bother with most of the bad_file_ops methods
  hypfs: switch to read_iter/write_iter
  Documentation/filesystems/Locking: ->get_sb() is long gone
  debugfs: leave freeing a symlink body until inode eviction
  procfs: fix race between symlink removals and traversals
  autofs4 copy_dev_ioctl(): keep the value of ->size we'd used for 
allocation

Bastien Nocera (1):
  coredump: Fix typo in comment

David Howells (14):
  configfs: Fix potential NULL d_inode dereference
  Infiniband: Fix potential NULL d_inode dereference
  VFS: Introduce inode-getting helpers for layered/unioned fs environments
  VFS: Add a whiteout dentry type
  VFS: Add a fallthrough flag for marking virtual dentries
  VFS: Split DCACHE_FILE_TYPE into regular and special types
  Apparmor: mediated_filesystem() should use dentry->d_sb not inode->i_sb
  Apparmor: Use d_is_positive/negative() rather than testing dentry->d_inode
  TOMOYO: Use d_is_dir() rather than d_inode and S_ISDIR()
  Smack: Use d_is_positive() rather than testing dentry->d_inode
  SELinux: Use d_is_positive() rather than testing dentry->d_inode
  VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry)
  Cachefiles: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions
  fanotify: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions

Kinglong Mee (1):
  fs/aio.c: Remove duplicate function name in pr_debug messages

Konstantin Khlebnikov (1):
  trylock_super(): replacement for grab_super_passive()

Miklos Szeredi (16):
  ovl: check whiteout while reading directory
  ovl: make path-type a bitmap
  ovl: dont replace opaque dir
  ovl: add mutli-layer infrastructure
  ovl: helper to iterate layers
  ovl: multi-layer readdir
  ovl: multi-layer lookup
  ovl: check whiteout on lowest layer as well
  ovl: lookup ENAMETOOLONG on lower means ENOENT
  ovl: allow statfs if no upper layer
  ovl: mount: change order of initialization
  ovl: improve mount helpers
  ovl: make upperdir optional
  ovl: support multiple lower layers
  ovl: add testsuite to docs
  ovl: document lower layer ordering

Omar Sandoval (1):
  posix_acl: fix reference leaks in posix_acl_create

Rasmus Villemoes (1):
  autofs4: Wrong format for printing dentry

Seunghun Lee (1):
  ovl: Prevent rw remount when it should be ro mount

hujianyang (5):
  ovl: Cleanup redundant blank lines
  ovl: Use macros to present ovl_xattr
  ovl: Fix kernel panic while mounting overlayfs
  ovl: Fix opaque regression in ovl_lookup
  ovl: discard independent cursor in readdir()

Diffstat:
 Documentation/filesystems/Locking  |   2 -
 Documentation/filesystems/overlayfs.txt|  28 +
 arch/s390/hypfs/inode.c|  53 +-
 drivers/infiniband/hw/ipath/ipath_fs.c |   2 +-
 drivers/infiniband/hw/qib/qib_fs.c |   2 +-
 drivers/staging/lustre/lustre/llite/dcache.c   |  12 +-
 drivers/staging/lustre/lustre/llite/file.c |   8 +-
 .../staging/lustre/lustre/llite/llite_internal.h   |   4 +-
 drivers/staging/lustre/lustre/llite/namei.c|  12 +-
 fs/9p/vfs_inode.c  |   2 +-
 fs/aio.c   |   6 +-
 fs/autofs4/dev-ioctl.c |   8 +-
 fs/autofs4/expire.c

Re: [git pull] more vfs bits

2015-02-21 Thread Linus Torvalds
On Sat, Feb 21, 2015 at 6:51 PM, Al Viro  wrote:
>
> Umm...  Works for me.  Let's do it this way, then:
> * rename those guys through the whole series
> * leave the "annotate the filesystems" bits to sit in a vfs.git
> branch
> * slap trylock_super() + bugfixes I'd been doing today
> (procfs and debugfs symlink removals racing with follow_link, oopsable;
> double-copy in autofs dev_ioctl.c, with length not rechecked after
> copying, theoretically oopsable + reasonably likely data leak) on top of queue
> * feed it through local tests and send an updated pull request later
> tonight.

Ok, I guess I can live with that.

   Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git pull] more vfs bits

2015-02-21 Thread Al Viro
On Sat, Feb 21, 2015 at 06:19:45PM -0800, Linus Torvalds wrote:

> > dentry_inode -> something. d_opened_inode() might do, but I'm not sure -
> > still sounds a bit wrong to me.  What it's about is "the actual fs object
> > behind this name, maybe from upper fs, maybe showing through from underlying
> > layer"
> 
> Yeah, I think "d_backing_store_inode()" would probably be more along
> the lines, but that's a mouthful. Maybe shortened to
> "d_backing_inode()"?

Umm...  Works for me.  Let's do it this way, then:
* rename those guys through the whole series
* leave the "annotate the filesystems" bits to sit in a vfs.git
branch
* slap trylock_super() + bugfixes I'd been doing today
(procfs and debugfs symlink removals racing with follow_link, oopsable;
double-copy in autofs dev_ioctl.c, with length not rechecked after
copying, theoretically oopsable + reasonably likely data leak) on top of queue
* feed it through local tests and send an updated pull request later
tonight.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git pull] more vfs bits

2015-02-21 Thread Linus Torvalds
On Sat, Feb 21, 2015 at 6:02 PM, Al Viro  wrote:
>
> I'm somewhat tempted to do this:
> fs_inode -> d_inode
> fs_inode_once ->d_inode_rcu (it's not quite ->d_revalidate()-only, there's
> a bit in autofs ->d_manage() as well)

Ok, those at least match our existing naming logic (ie "d_inode()"
would match what we did to "d_count()").

I'm not sure about d_inode_rcu(), for the simple reason that even when
we're doing RCU walking, most of the time we have *not* used the
ACCESS_ONCE() model, we instead end up just using the regular d_inode
and then check the sequence count.

I think.

So the ACCESS_ONCE() thing is more special than just "done under RCU".
It's more like "really special case done without any of the normal
locking _or_ any of the normal RCU checks".

That said, the overhead of using ACCESS_ONCE() is basically nil, so
it's not like we couldn't just start doing more of them, and make it
be more of a "any time we're under RCU" kind of thing.

> dentry_inode -> something. d_opened_inode() might do, but I'm not sure -
> still sounds a bit wrong to me.  What it's about is "the actual fs object
> behind this name, maybe from upper fs, maybe showing through from underlying
> layer"

Yeah, I think "d_backing_store_inode()" would probably be more along
the lines, but that's a mouthful. Maybe shortened to
"d_backing_inode()"?

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git pull] more vfs bits

2015-02-21 Thread Al Viro
On Sun, Feb 22, 2015 at 02:02:07AM +, Al Viro wrote:
> Hmm...  ..._once() variants are trivially dropped, IMO.  dentry_inode_once()
> is so bloody special that it *SHOULD* stick out; we don't have any places
> like that, anyway.
> 
> I'm somewhat tempted to do this:
> fs_inode -> d_inode
> fs_inode_once ->d_inode_rcu (it's not quite ->d_revalidate()-only, there's
> a bit in autofs ->d_manage() as well)
> dentry_inode -> something. d_opened_inode() might do, but I'm not sure -
> still sounds a bit wrong to me.  What it's about is "the actual fs object
> behind this name, maybe from upper fs, maybe showing through from underlying
> layer".  It's not always opened; it's what we'd get if we opened it (and
> hadn't triggered any copyups, that is).  E.g. sys_getxattr() would want to
> use that, even if nobody has opened that sucker yet, etc.

*snort*

d_inode/d_inode_rcu/[d_]inode_here(), perhaps? ;-)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git pull] more vfs bits

2015-02-21 Thread Al Viro
On Sat, Feb 21, 2015 at 05:34:23PM -0800, Linus Torvalds wrote:
> On Sat, Feb 21, 2015 at 4:51 PM, Al Viro  wrote:
> >
> > Looking at that queue, it might make sense to hold back everything in that
> > series past "fanotify: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions"
> > for now
> 
> Hmm. Even I'd pull just that, quite frankly, I just think it's
> *confusing* to have those badly named "helpers", that were introduced
> earlier in that series.
> 
> These guys are currently all teh same thing, but even if they weren't,
> the naming is not helpful, and not sane:
>  - fs_inode
>  - fs_inode_once
>  - dentry_inode
>  - dentry_inode_once
> 
> Let's walk through them:
> 
>  - dentry_inode*() is supposed to be "the inode that would be used if
> the dentry was opened"
> 
>What part of "dentry_inode()" implies "if the dentry was opened" to
> you? Nothing. The name is fundamentally bad.
> 
>   And what *possible* situation could make that "_once()" version ever
> be valid? None. It's bogus. It's crap. It's insane. There is no way
> that it is *ever* a valid question to even ask. If the dentry is so
> unstable that you can't safely look at the inode, you had damn well
> better never ask "ok, what would the inode be if I opened this random
> pointer"?
> 
>So one of them is badly named, and the other one is fundamentally
> not a valid operation at all, as far as I can tell.
> 
>  - fs_inode*() is supposed to be "this is the inode that the native
> filesystem uses".
> 
>So again, I think the naming is horrible, since it doesn't really
> follow the normal dentry helper routine names. But I'm sure we have
> other cases where we screwed that up, so whatever..
> 
>The "_once()" naming is doubly bad, as explained elsewhere. What
> possible situation merits using that helper? If it's just
> revalidate(), then make it about that.
> 
>But more importantly, this is the one where I don't see how it
> could ever possibly be anything but "dentry->d_inode". I'd much rather
> just leave that.
> 
> So of the four new helpers, I really don't see any of them as "good".
> I think "dentry_inode()" could remain, but even there I think the name
> should specify *what* it is ("d_opened_inode()"? I don't like that
> name either, but at least it would try to explain what the point is,
> rather than having to look up a comment above the function definition
> to figure out what the point is)
> 
> The strongest argument I've seen for them existing at all was that
> "markers for what has been looked at". But that's something that
> belongs in a development tree, not as a series to confuse others with.

Hmm...  ..._once() variants are trivially dropped, IMO.  dentry_inode_once()
is so bloody special that it *SHOULD* stick out; we don't have any places
like that, anyway.

I'm somewhat tempted to do this:
fs_inode -> d_inode
fs_inode_once ->d_inode_rcu (it's not quite ->d_revalidate()-only, there's
a bit in autofs ->d_manage() as well)
dentry_inode -> something. d_opened_inode() might do, but I'm not sure -
still sounds a bit wrong to me.  What it's about is "the actual fs object
behind this name, maybe from upper fs, maybe showing through from underlying
layer".  It's not always opened; it's what we'd get if we opened it (and
hadn't triggered any copyups, that is).  E.g. sys_getxattr() would want to
use that, even if nobody has opened that sucker yet, etc.
dentry_inode_once -> RIP

It's still greppable ([-]>d_inode\> will do it) and IMO it's better than
fs_inode().  And yes, the churn issue remains, but IMO having a pair of
inlined helpers (d_inode(dentry) and d_inode_rcu(dentry)) in dcache.h is
not too horrible per se.

Comments?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git pull] more vfs bits

2015-02-21 Thread Linus Torvalds
On Sat, Feb 21, 2015 at 4:51 PM, Al Viro  wrote:
>
> Looking at that queue, it might make sense to hold back everything in that
> series past "fanotify: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions"
> for now

Hmm. Even I'd pull just that, quite frankly, I just think it's
*confusing* to have those badly named "helpers", that were introduced
earlier in that series.

These guys are currently all teh same thing, but even if they weren't,
the naming is not helpful, and not sane:
 - fs_inode
 - fs_inode_once
 - dentry_inode
 - dentry_inode_once

Let's walk through them:

 - dentry_inode*() is supposed to be "the inode that would be used if
the dentry was opened"

   What part of "dentry_inode()" implies "if the dentry was opened" to
you? Nothing. The name is fundamentally bad.

  And what *possible* situation could make that "_once()" version ever
be valid? None. It's bogus. It's crap. It's insane. There is no way
that it is *ever* a valid question to even ask. If the dentry is so
unstable that you can't safely look at the inode, you had damn well
better never ask "ok, what would the inode be if I opened this random
pointer"?

   So one of them is badly named, and the other one is fundamentally
not a valid operation at all, as far as I can tell.

 - fs_inode*() is supposed to be "this is the inode that the native
filesystem uses".

   So again, I think the naming is horrible, since it doesn't really
follow the normal dentry helper routine names. But I'm sure we have
other cases where we screwed that up, so whatever..

   The "_once()" naming is doubly bad, as explained elsewhere. What
possible situation merits using that helper? If it's just
revalidate(), then make it about that.

   But more importantly, this is the one where I don't see how it
could ever possibly be anything but "dentry->d_inode". I'd much rather
just leave that.

So of the four new helpers, I really don't see any of them as "good".
I think "dentry_inode()" could remain, but even there I think the name
should specify *what* it is ("d_opened_inode()"? I don't like that
name either, but at least it would try to explain what the point is,
rather than having to look up a comment above the function definition
to figure out what the point is)

The strongest argument I've seen for them existing at all was that
"markers for what has been looked at". But that's something that
belongs in a development tree, not as a series to confuse others with.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git pull] more vfs bits

2015-02-21 Thread Al Viro
On Sat, Feb 21, 2015 at 05:14:37PM -0800, Linus Torvalds wrote:

> .. and this is the one that makes no sense to me.
> 
> It's the common case, and I don't see how it *possibly* adds any
> value. The "I want the inode of this dentry" is traditionally done as
> "dentry->d_inode".
> 
> What is the *upside* of the wrapper?

AFAICS, having yet-to-be-annotated cases stick out...

BTW, the goal this series is aiming at probably ought to be spelled out
more clearly: there's a bunch of stacking-related stuff (overlayfs and
ecryptfs in the tree, at least unionmount and aufs outside) that could
benefit from having the notion "this dentry covers that stack of
dentries from underlying fs layers" supported sanely by VFS, rather
than having it open-coded in one way or another.  And every place like
that ends up in incestous relationship with VFS; it was annoying while
it had been just ecryptfs, but it's getting worse now.  Moreover, the
details of behaviour overlayfs ends up having to rely upon are both
potentially brittle *and* leaving quite a few things not working properly
(starting with /proc/*/fd/* readlink, etc.)  The goal behind
all that massage is to have that notion (stacking) understood by VFS.

And no, it's not related to the question of annotating ->d_inode accesses -
just something that wasn't quite obvious from David's description.
IMO it's worth spelling out somewhere in this thread...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git pull] more vfs bits

2015-02-21 Thread Linus Torvalds
On Sat, Feb 21, 2015 at 4:18 PM, David Howells  wrote:
>
> So what I want to do is:
>
>  (1) Introduce wrappers around accesses to ->d_inode.  These can be used to
>  mark such accesses as being classified into two categories:
>
>  (A) Some accesses should only consider the dentry they're given.
>  Filesystems would fall into this category when dealing with their own
>  dentries and inodes.

.. and this is the one that makes no sense to me.

It's the common case, and I don't see how it *possibly* adds any
value. The "I want the inode of this dentry" is traditionally done as
"dentry->d_inode".

What is the *upside* of the wrapper?

>  (B) Other accesses should fall through from the dentry they're given to
>  an underlying layer.  A lot of non-filesystem accesses, including
>  those in the VFS side of pathwalk, fall into this category, as do
>  accesses to external objects made by filesystems such as overlayfs
>  and ecryptfs.

Again, if they actually want something *else* than the "native inode",
then at that point a wrapper makes sense.

If you actually want to document that "this use wants the underlying
inode", then go wild. But that is *different* from all the common uses
inside random filesystems that just want "what's the inode of this
dentry".

IOW, my argument is that I cannot see *any* possible reason to wrap
the normal "give me the inode associated with this dentry".

I *can* see a reason to wrap - and document - the cases that are *not*
that kind of simple thing. I *can* see the reason for saying "give me
the inode of the lower layer", or "give me the inode, or NULL if it's
a whiteout entry".

But it's just that empty "fs_inode()" wrapper itself that I just don't
see the point of.

Every single low-level filesystem that does all the actual core
lookup/mkdir/create/whatever operations care about the native inode.
You seem to even kind of acknowledge that in the naming: "fs_inode()".
And my argument is that there is never any possible reason why that
would ever be anything but "dentry->d_inode".

So the wrapper doesn't actually help anything, and it *does* obfuscate
things. It makes people go "whats' the difference between "fs_inode()"
and "fs_inode_once()". It quite possible makes people think it's an
expensive operation. Who knows. It seems pointless.

>  (6) Add a DCACHE_WHITEOUT_TYPE type variant for dentries that will in the
>  future be recognised by the VFS as a 'negative' type, but can be used by
>  the overlay or unionmount to note a difference to an ordinary negative
>  type dentry that can also be a fallthrough (at least in unionmount if
>  this ever makes it).


Right. And a helper/wrapper makes sense for those cases, and actually
clarifies that "this particular code knows and cares about whiteout
entries".

I'm not arguing against those kinds of wrappers that actually
_clarify_ things. I'm arguing against random wrappers that don't, and
that really fundamentally seem like they cannot possibly ever have any
other valid value.

A low-level filesystem may have a real inode associated with a
whiteout dentry, it could be a device inode with a zero major number
or some other special inode (or it might not be an inode at alln - it
could easily also be just a directory entry type). Such a filesystem
would clearly care about the inode. and would actually really care
about "dentry->d_inode".

And I actually think that having "dentry->d_inode" is *clearer* than
"fs_ionode(dentry)". It's clear that that is a naked native access.
It's not some hidden abstracted case. That's *the* inode associated
with the dentry.

>  (7) Where feasible, replace if-statements that check the positivity or
>  negativity of dentries by doing if (...->d_inode) with checks on the type
>  of the dentry.

.. and this is again the kind of wrapper I think is *good*. It's
abstracting some real issue.

I don't object at all to abstracting out "ok, a dentry is negative if
it has a NULL inode, or if it is marked as a whiteout entry". I think
that writing

if (d_negative(dentry)) ...

is more readable than

if (!dentry->d_inode ||  (dentry->d_flags & DCACHE_WHITEOUT)) ..

or variations of that (I guess it's "IS_WHITEOUT(dentry->d_inode)" right now).

So I'm not against helpers that do something meaningful.

And there are downsides to arbitrary random wrappers/helpers. Churn.
Harder to see what the code actually *does*. Bad naming (dentry
operations are generally called "d_xyz()" or "dentry_xyz()").  Does it
do soemthign else? Are there rules for calling this? All the mental
rules you have to have, and that *change* just because you change the
syntax.

>> Now, that was true in the "bad old days" when we just used
>> ACCESS_ONCE(dentry->d_inode) too, but at least in that old model we
>> don't have some idiotic wrapper around it.
>
> I can't make ACCESS_ONCE(fs_inode(dentry)) work if fs_inode() is an inline
> function.  I might be abl

Re: [git pull] more vfs bits

2015-02-21 Thread Al Viro
On Sun, Feb 22, 2015 at 12:23:06AM +, David Howells wrote:
> Linus Torvalds  wrote:
> 
> > Also explain why that crap was done one file at a time?
> 
> Because it wasn't.  Here's the script for your perusal.  Al cherry-picked the
> output, so you won't find everything the script produces in Al's pull request.
> 
> Breaking it down into one commit per fs makes it easier to review the
> individual chunks.
> 
> David

[snip]

BTW, the original queue was in
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs file-pin.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git pull] more vfs bits

2015-02-21 Thread Al Viro
On Sat, Feb 21, 2015 at 02:45:34PM -0800, Linus Torvalds wrote:
> So I've pulled this, but quite frankly, I think I will unpull it again
> unless you actually state *what* the advantages of this pointless
> series of endless patches are.

> And no, that "explanation" in commit b717805b3c8b is not an
> explanation. Why should filesystems have to know/care. Any use of
> "fs_inode()" clearly does *not* specify upper/lower, so what the f*ck
> is the advantage of that helper wrt just making the rule be that
> "dentry->d_inode" is that unspecified thing.
> 
> Explain it, or that crap gets undone.

Use of fs_inode does not specify upper/lower; it's choice of dentry_inode() vs.
fs_inode().  And d_inode() side of that pile is what I didn't cherry-pick
from David's tree - it was too massive already.

FWIW, I probably should've collapsed the per-fs patches together, or
held them back until the next cycle; I understand why David did them
that way (less painful rebasing that stuff), but once they switch from
"try to stay sane while developing" to "into the tree it goes" mode,
they ought to have been collapsed together.  And that's something I should
have done - my fault.

Filesystems (non-layered ones) should *not* care; this fs_inode() thing is
exactly what we do right now and, unlike dentry_inode(), it is intended to
stay that way.  And yes, we could've bloody well have left it as ->d_inode;
it's no more than an annotation.  What it gives is a way to keep track of
the accesses that remain to be annotated.

Looking at that queue, it might make sense to hold back everything in that
series past "fanotify: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions"
for now - keep in in vfs.git for the next cycle.  My reason for pushing those
now had been no better than wanting to be rid of the trivial bits to make
life a bit easier for the next couple of months, avoiding fun on on each
conflict with every actively changing filesystem.  With the bulk of such
places annotated at once, the work would be limited to VFS proper plus
a bit of whack-a-mole in the end of cycle, pretty much consisting of annotating
the new ->d_inode instances in filesystems (all to fs_inode(), of course).
But that should've been spelled out in pull request *and* done in one
commit instead of that long tail.  Again, my apologies.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git pull] more vfs bits

2015-02-21 Thread David Howells
Linus Torvalds  wrote:

> Also explain why that crap was done one file at a time?

Because it wasn't.  Here's the script for your perusal.  Al cherry-picked the
output, so you won't find everything the script produces in Al's pull request.

Breaking it down into one commit per fs makes it easier to review the
individual chunks.

David
---
#!/usr/bin/perl -w
use strict;

open(my $fd, "<$0") || die $0;
my @script = <$fd>;
close($fd);

my @file_system_types;
open(my $g, 'git grep -l "struct file_system_type.*=" |') ||
die "Can't grep for filesystem type";
@file_system_types = <$g>;
close($g);

my @excludes = (
"fs/attr.c",
"fs/dcache.c",
"fs/exportfs/expfs.c",
"fs/file_table.c",
"fs/notify/",
"fs/locks.c",
"fs/namei.c",
"fs/namespace.c",
"fs/open.c",
"fs/utimes.c",
"fs/xattr.c",
"include/linux/",
"Documentation/filesystems/vfs.txt",
);

my @treat_as_fs = (
"drivers/staging/lustre",
"fs/kernfs",
"fs/libfs.c",
"fs/quota/dquot.c",
"ipc",
"kernel/relay.c",
"kernel/trace",
);

###
#
# Find the filesystems and classify them according to whether they occupy a
# directory or a file in the source.
#
###
my %fs_names = ();
my %fs_dirs = ();
my %fs_files = ();

# Miscellaneous convenience sets
my %fs_misc = (
#"arch" => [],
#"drivers" => [],
#"fs" => [],
#"security" => []
);

my @fs_single = ();

fs_file: foreach my $file (@file_system_types) {
chomp $file;
foreach my $ex (@excludes, @treat_as_fs) {
next fs_file if (substr($file, 0, length($ex)) eq $ex);
}

# Handle whole-directory filesystems
if ($file =~ m!^fs/([a-z0-9]+)/.*[.]c!) {
my $dir = substr($file, 0, rindex($file, "/"));
my $name = $1;
$fs_names{$name} = $dir;
$fs_dirs{$dir} = [];
next;
}

#next if ($file =~ m!^drivers/staging/lustre!);

# Handle single-file filesystems
$fs_files{$file} = [];
}

foreach my $path (@treat_as_fs) {
if ($path =~ /[.][ch]/) {
$fs_files{$path} = [];
} else {
my $name = substr($path, rindex($path, "/") + 1);
$fs_names{$name} = $path;
$fs_dirs{$path} = [];
}
}

my @to_fs_inode = sort(keys(%fs_dirs));

###
#
# Find all occurrences of files containing "->d_inode" and divide them amongst
# the various filesystems and non-filesystems.
#
###
my @occurrences;
open($g, 'git grep -l "[-]>d_inode" |') ||
die "Can't grep for ->d_inode";
@occurrences = <$g>;
close($g);

my %non_fs = ();

file: foreach my $file (@occurrences) {
chomp $file;
foreach my $ex (@excludes) {
next file if (substr($file, 0, length($ex)) eq $ex);
}

foreach my $path (@to_fs_inode) {
if (index($file, $path) == 0) {
#print $file, " found in ", $path, "\n";
push @{$fs_dirs{$path}}, $file;
next file;
}
}

if (exists($fs_files{$file})) {
foreach my $path (keys(%fs_misc)) {
if (index($file, $path) == 0) {
push @{$fs_misc{$path}}, $file;
delete $fs_files{$file};
next file;
}
}
}

if (exists($fs_files{$file})) {
push @{$fs_files{$file}}, $file;
next file;
}

if ($file =~ m!include/trace/events/([_a-zA-Z0-9]+)[.]h!) {
my $fs = $1;
if (exists($fs_names{$fs})) {
push @{$fs_dirs{$fs_names{$fs}}}, $file;
next;
}
}

#print $file, " not found\n";
$non_fs{$file} = [ $file ];
}

foreach my $path (sort(keys(%fs_files))) {
push @fs_single, @{$fs_files{$path}};
}

###
#
# Summarise how the filesystem file sets will be split up
#
###
my $summarise = 0;
if ($summarise) {
foreach my $path (sort(keys(%fs_dirs))) {
print $path, ":\n";
foreach my $file (@{$fs_dirs{$path}}) {
print "\t", $file, "\n";
}
}

foreach my $path (sort(keys(%fs_misc))) {
print $path, "-single-fs:\n";
foreach my $file (@{$fs_misc{$path}}) {
print "\t", $file, "\n";
}
}

print "single-fs:\n";
foreach my $path (sort(keys(%fs_files))) {
foreach my $file (@{$fs_files{$path}}) {
print "\t", $file, "\n";
}
}

print "non-filesystem:\n";
foreach my $path (sort(keys(%non_fs))) {
foreach my $file (@{$non_fs{$path}}) {
print "\t", $file, "\n";
}
}

print "\n";
}

#

Re: [git pull] more vfs bits

2015-02-21 Thread David Howells
Linus Torvalds  wrote:

> > Assorted stuff from this cycle.  The big ones here are multilayer
> > overlayfs from Miklos and beginning of sorting ->d_inode accesses out from
> > David.
> 
> So I've pulled this, but quite frankly, I think I will unpull it again
> unless you actually state *what* the advantages of this pointless
> series of endless patches are.

Let me describe the problem I'm trying to solve first.

Code that accessed ->d_inode may need to look at some other dentry/inode then
the dentry it is given when the filesystem it is operating on is a layer in
some sort of union, but it sees through the filter of the top union/overlay
layer.

Take overlayfs as an example.  Regular files get dummy dentries and inodes
created in the top layer, but overlayfs has to pull a trick to go around the
VFS and repoint open files at the lower/source layer or the upper/workspace
layer rather than the overlay layer.  Access attempts outside of open files
(eg. utime) get trapped by the dummy inode and dealt with there.

Now this brings me to a particular problem: file->f_path and file->f_inode
point to either the upper layer vfsmount, dentry and inode or the lower layer
vfsmount, dentry and inode.  This means that anything that needs those
parameters does not get to see the fact that an overlay was involved.  Such
things include security LSMs, /proc, notifications, locks and leases.

So what I want to do is:

 (1) Introduce wrappers around accesses to ->d_inode.  These can be used to
 mark such accesses as being classified into two categories:

 (A) Some accesses should only consider the dentry they're given.
 Filesystems would fall into this category when dealing with their own
 dentries and inodes.

 (B) Other accesses should fall through from the dentry they're given to
 an underlying layer.  A lot of non-filesystem accesses, including
 those in the VFS side of pathwalk, fall into this category, as do
 accesses to external objects made by filesystems such as overlayfs
 and ecryptfs.

 Because there are a lot of ->d_inode calls, my intention is to classify
 them by giving them an appropriate wrapper.  Type (A) get wrapped with
 fs_inode{,_once}() and type (B) get wrapped with dentry_inode{,_once}().

 Wrapping them in this way makes it easier to find the cases that are more
 problematic.  SELinux, for example, might need to look at *both* layers
 when assessing the security label to be levied upon an overlay object.

 (2) Introduce wrappers around accesses to a given dentry where that dentry
 might not be used directly but rather fall through (similar to (1B)
 above).

 (3) Start off with wrappers that simply pass dentry or dentry->d_inode
 straight through.

 (4) Add an extra pointer into struct dentry that can be pointed at a lower
 layer - and will critically *pin* that lower layer.

 This will avoid the need for file->f_path to directly pin the dentry to
 which file->f_inode refers.  This then allows file->f_path to point to
 the top layer whilst file->f_inode points to an underlay file.

 (6) Add a DCACHE_WHITEOUT_TYPE type variant for dentries that will in the
 future be recognised by the VFS as a 'negative' type, but can be used by
 the overlay or unionmount to note a difference to an ordinary negative
 type dentry that can also be a fallthrough (at least in unionmount if
 this ever makes it).

 (7) Where feasible, replace if-statements that check the positivity or
 negativity of dentries by doing if (...->d_inode) with checks on the type
 of the dentry.

 Also, where feasible, replace S_ISxxx checks on inode type with checks of
 the dentry type field.  This cuts down on the number of accesses to the
 inode we need to do - which is good if we have to check indirectly.

 To this end, I split DCACHE_FILE_TYPE to differentiate regular and
 special types.

 (5) Make the fallthrough logic work.  You set DCACHE_FALLTHRU on a dentry and
 this tells d_dentry() and dentry_inode() to bypass the given dentry and
 use dentry->layer.lower and dentry->layer.lower->d_inode instead - but
 does not affect the operation of fs_inode().

 Note that the DCACHE_FALLTHRU flag and dentry->layer do not belong to the
 filesystem that owns the dentry on which they are set, but rather belong
 to the facility (be it overlayfs or unionmount) that is managing the
 union.


So the patch series isn't complete as it stands.  I'm trying to do the
wrapping first to reduce the number of ->d_inode accesses that need special
consideration.

> Now, that was true in the "bad old days" when we just used
> ACCESS_ONCE(dentry->d_inode) too, but at least in that old model we
> don't have some idiotic wrapper around it.

I can't make ACCESS_ONCE(fs_inode(dentry)) work if fs_inode() is an inline
function.  I might be able to make it work if it's a macro.  I also do

Re: [git pull] more vfs bits

2015-02-21 Thread Linus Torvalds
On Sat, Feb 21, 2015 at 2:45 PM, Linus Torvalds
 wrote:
>
> Explain it, or that crap gets undone.

Also explain why that crap was done one file at a time?

I'm getting really tired of people trying to inflate their commit
counts with tricks like this. What was the advantage of doing the same
thing over-and-over one file at a time? It makes things more
manageable exactly *why*?

 Linus "grumpy as hell" Torvalds
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git pull] more vfs bits

2015-02-21 Thread Linus Torvalds
On Fri, Feb 20, 2015 at 7:34 PM, Al Viro  wrote:
> Assorted stuff from this cycle.  The big ones here are multilayer
> overlayfs from Miklos and beginning of sorting ->d_inode accesses out from
> David.

So I've pulled this, but quite frankly, I think I will unpull it again
unless you actually state *what* the advantages of this pointless
series of endless patches are.

There is absolutely no sane reason to use this crap, as far as I can
tell. The new "fs_inode_once()" thing is just stupid. It's named for
what it does, not *why*, and there is no hint to the filesystem as to
why it should use "fs_inode_once()" vs "fs_inode()".

Now, that was true in the "bad old days" when we just used
ACCESS_ONCE(dentry->d_inode) too, but at least in that old model we
don't have some idiotic wrapper around it.

Dammit, if we add wrapper and "helper" functions, they should *help*,
not confuse. This thing is just badly named, and there is no actual
real explanation for why it exists in the first place, nor for when to
use one or the other. There is just an endless series of patches with
pointless churn.

And no, that "explanation" in commit b717805b3c8b is not an
explanation. Why should filesystems have to know/care. Any use of
"fs_inode()" clearly does *not* specify upper/lower, so what the f*ck
is the advantage of that helper wrt just making the rule be that
"dentry->d_inode" is that unspecified thing.

Explain it, or that crap gets undone.

I'm annoyed, because shit like this that comes in at the end of the
merge window when everybody and their dog sends me random crap on the
Friday afternoon before the merge window closes is just annoying as
hell.

Yes, I work weekends too, but there is really *no* excuse for
last-minute pull requests that don't have good explanations for them.
Explanations for why they are last-minute, and explanations for why
they exist at all and why I should waste my Saturday on pulling them.

Today has been a huge waste of time for me, and reading through this
was just the last drop.

 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/