Re: [Patch net] socket: close race condition between sock_close() and sockfs_setattr()

2018-06-10 Thread Cong Wang
On Sun, Jun 10, 2018 at 12:27 PM, David Miller  wrote:
> I'm applying this for now; it is at least a step towards fixing
> all of these issues.
>
> If it is really offensive, I can revert, just tell me.

Thanks, David!

Unless something is fundamentally broken, there is no
reason to revert it. The only risk here is a possible deadlock,
but I am ready to fix any deadlock reports caused by this
patch. :)


Re: [Patch net] socket: close race condition between sock_close() and sockfs_setattr()

2018-06-10 Thread David Miller
From: Cong Wang 
Date: Thu,  7 Jun 2018 13:39:49 -0700

> fchownat() doesn't hold a refcount on the fd until it figures out that
> the fd is really needed (otherwise it is ignored), and it releases the
> refcount after resolving the path. This means sock_close() can race
> with sockfs_setattr(), which leads to a NULL pointer dereference since
> we typically set sock->sk to NULL in ->release().
> 
> As pointed out by Al, this is unique to sockfs, so we can fix it in the
> socket layer by acquiring inode_lock in sock_close() and checking
> sock->sk against NULL in sockfs_setattr().
> 
> sock_release() is called in many places, but only the sock_close()
> path matters here. Fortunately, this should not affect a normal
> sock_close(), since it is only called when the last fd refcount is
> gone. It only affects a sock_close() with a parallel sockfs_setattr()
> in progress, which is not common.
> 
> Fixes: 86741ec25462 ("net: core: Add a UID field to struct sock.")
> Reported-by: shankarapailoor 
> Cc: Tetsuo Handa 
> Cc: Lorenzo Colitti 
> Cc: Al Viro 
> Signed-off-by: Cong Wang 

I'm applying this for now; it is at least a step towards fixing
all of these issues.

If it is really offensive, I can revert, just tell me.
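
For reference, the approach described in the changelog maps onto net/socket.c
roughly as sketched below. This is only a sketch reconstructed from the
description above, not a verbatim copy of the applied commit; in particular,
the helper taking an explicit inode argument is one way to let sock_close()
pass the inode down while every other sock_release() caller skips the lock.

static int sockfs_setattr(struct dentry *dentry, struct iattr *iattr)
{
        int err = simple_setattr(d_inode(dentry), iattr);

        if (!err && (iattr->ia_valid & ATTR_UID)) {
                struct socket *sock = SOCKET_I(d_inode(dentry));

                /* ->release() may already have run and cleared sock->sk;
                 * the NULL check closes the window on the setattr side. */
                if (sock->sk)
                        sock->sk->sk_uid = iattr->ia_uid;
                else
                        err = -ENOENT;
        }

        return err;
}

static void __sock_release(struct socket *sock, struct inode *inode)
{
        if (sock->ops) {
                struct module *owner = sock->ops->owner;

                /* Only the sock_close() path passes an inode; holding
                 * inode_lock() across ->release() serializes it against
                 * sockfs_setattr(), which runs under the same lock. */
                if (inode)
                        inode_lock(inode);
                sock->ops->release(sock);
                if (inode)
                        inode_unlock(inode);

                sock->ops = NULL;
                module_put(owner);
        }
        /* ... remainder of the old sock_release() body ... */
}

void sock_release(struct socket *sock)
{
        __sock_release(sock, NULL);     /* other callers: no inode, no lock */
}

static int sock_close(struct inode *inode, struct file *filp)
{
        __sock_release(SOCKET_I(inode), inode);
        return 0;
}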


Re: [Patch net] socket: close race condition between sock_close() and sockfs_setattr()

2018-06-07 Thread Al Viro
On Thu, Jun 07, 2018 at 03:15:15PM -0700, Cong Wang wrote:

> > You do realize that the same ->setattr() can be called by way of
> > chown() on /proc/self/fd/<n>, right?  What would you do there -
> > bump refcount on that struct file when traversing that symlink and
> > hold it past the end of pathname resolution, until... what exactly?
> 
> I was thinking about this:
> 
> error = user_path_at(dfd, ...); // hold dfd when needed
>
> if (!error) {
>         error = chown_common(&path, mode);
>         path_put(&path);  // release dfd if held
> 
> With this, we can guarantee that ->release() can only be called
> after chown_common(), which is after ->setattr() too.

No, we can't.  You are assuming that there *is* a dfd and that it points
to the opened socket we are going to operate upon.  That is not guaranteed.
At all.  If e.g. 42 is a file descriptor of an opened socket, plain chown(2)
on /proc/self/fd/42 will trigger that ->setattr().  No dfd in sight.
We do run across an opened file at some point, all right - when we traverse
the symlink in procfs.  You can't bump ->f_count there.  Even leaving aside
the "where would I stash the reference to that file?" and "how long would I
hold it?", you can't bump ->f_count on another process's files.  That would
bugger the expectations of close(2) callers.
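
To make the scenario concrete, here is a hypothetical userspace sketch of
the situation Al describes: the fd number only appears inside the procfs
path, so chown(2) never receives a descriptor it could pin, while another
thread closes the socket underneath it.

#include <pthread.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

static int sfd;

static void *chown_via_procfs(void *arg)
{
        char path[64];

        /* No dfd here: the socket is only named through the symlink. */
        snprintf(path, sizeof(path), "/proc/self/fd/%d", sfd);
        chown(path, 1000, 1000);
        return NULL;
}

int main(void)
{
        pthread_t t;

        sfd = socket(AF_INET, SOCK_DGRAM, 0);
        pthread_create(&t, NULL, chown_via_procfs, NULL);
        close(sfd);     /* may run ->release() while ->setattr() is in flight */
        pthread_join(t, NULL);
        return 0;
}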


Re: [Patch net] socket: close race condition between sock_close() and sockfs_setattr()

2018-06-07 Thread Cong Wang
On Thu, Jun 7, 2018 at 3:04 PM, Al Viro  wrote:
> On Thu, Jun 07, 2018 at 02:45:58PM -0700, Cong Wang wrote:
>> On Thu, Jun 7, 2018 at 2:26 PM, Al Viro  wrote:
>> > On Thu, Jun 07, 2018 at 01:39:49PM -0700, Cong Wang wrote:
>> >> fchownat() doesn't hold a refcount on the fd until it figures out
>> >> that the fd is really needed (otherwise it is ignored), and it
>> >> releases the refcount after resolving the path. This means
>> >> sock_close() can race with sockfs_setattr(), which leads to a NULL
>> >> pointer dereference since we typically set sock->sk to NULL in
>> >> ->release().
>> >>
>> >> As pointed out by Al, this is unique to sockfs, so we can fix it in
>> >> the socket layer by acquiring inode_lock in sock_close() and
>> >> checking sock->sk against NULL in sockfs_setattr().
>> >
>> > That looks like a massive overkill - it's way heavier than it should be.
>>
>> I don't see any other quick way to fix this. My initial thought was
>> to keep that refcount until path_put(), but apparently you don't like
>> it either.
>
> You do realize that the same ->setattr() can be called by way of
> chown() on /proc/self/fd/<n>, right?  What would you do there -
> bump refcount on that struct file when traversing that symlink and
> hold it past the end of pathname resolution, until... what exactly?

I was thinking about this:

error = user_path_at(dfd, ...); // hold dfd when needed

if (!error) {
        error = chown_common(&path, mode);
        path_put(&path);  // release dfd if held

With this, we can guarantee that ->release() can only be called
after chown_common(), which is after ->setattr() too.
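
Spelled out, the idea would look something like the hypothetical sketch
below: pin the struct file behind dfd for the whole operation so that
->release() cannot run until chown_common() (and thus ->setattr()) has
finished. Names and the exact call sequence are illustrative only, not
the applied fix, and Al's reply above explains why this still leaves the
/proc/self/fd/<n> case uncovered.

static int fchownat_pinned(int dfd, const char __user *filename,
                           uid_t user, gid_t group)
{
        struct fd f = fdget_raw(dfd);   /* hold dfd when needed */
        struct path path;
        int error;

        /* mnt_want_write()/permission details omitted for brevity */
        error = user_path_at(dfd, filename, LOOKUP_FOLLOW, &path);
        if (!error) {
                error = chown_common(&path, user, group);
                path_put(&path);
        }

        fdput(f);                       /* release dfd only after ->setattr() */
        return error;
}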


Re: [Patch net] socket: close race condition between sock_close() and sockfs_setattr()

2018-06-07 Thread Al Viro
On Thu, Jun 07, 2018 at 02:45:58PM -0700, Cong Wang wrote:
> On Thu, Jun 7, 2018 at 2:26 PM, Al Viro  wrote:
> > On Thu, Jun 07, 2018 at 01:39:49PM -0700, Cong Wang wrote:
> >> fchownat() doesn't hold a refcount on the fd until it figures out
> >> that the fd is really needed (otherwise it is ignored), and it
> >> releases the refcount after resolving the path. This means
> >> sock_close() can race with sockfs_setattr(), which leads to a NULL
> >> pointer dereference since we typically set sock->sk to NULL in
> >> ->release().
> >>
> >> As pointed out by Al, this is unique to sockfs, so we can fix it in
> >> the socket layer by acquiring inode_lock in sock_close() and
> >> checking sock->sk against NULL in sockfs_setattr().
> >
> > That looks like a massive overkill - it's way heavier than it should be.
> 
> I don't see any other quick way to fix this. My initial thought was
> to keep that refcount until path_put(), but apparently you don't like
> it either.

You do realize that the same ->setattr() can be called by way of
chown() on /proc/self/fd/<n>, right?  What would you do there -
bump refcount on that struct file when traversing that symlink and
hold it past the end of pathname resolution, until... what exactly?


Re: [Patch net] socket: close race condition between sock_close() and sockfs_setattr()

2018-06-07 Thread Cong Wang
On Thu, Jun 7, 2018 at 2:26 PM, Al Viro  wrote:
> On Thu, Jun 07, 2018 at 01:39:49PM -0700, Cong Wang wrote:
>> fchownat() doesn't hold a refcount on the fd until it figures out that
>> the fd is really needed (otherwise it is ignored), and it releases the
>> refcount after resolving the path. This means sock_close() can race
>> with sockfs_setattr(), which leads to a NULL pointer dereference since
>> we typically set sock->sk to NULL in ->release().
>>
>> As pointed out by Al, this is unique to sockfs, so we can fix it in the
>> socket layer by acquiring inode_lock in sock_close() and checking
>> sock->sk against NULL in sockfs_setattr().
>
> That looks like a massive overkill - it's way heavier than it should be.

I don't see any other quick way to fix this. My initial thought was
to keep that refcount until path_put(), but apparently you don't like
it either.


> And it's very likely to trigger shitloads of deadlock warnings, some
> possibly even true.

I did audit the inode_lock usage in networking and I don't see any
deadlock. Of course, there could be some non-networking code that uses
the socket API which I missed. But _generally_, a socket doesn't
have a pointer to retrieve this inode.
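
For background, the sockfs inode and the struct socket live in one
allocation, which is how sockfs_setattr() reaches the struct socket from
the inode it is given. A sketch of the helpers (as defined in
include/net/sock.h around that time; exact details may vary by version):

struct socket_alloc {
        struct socket socket;
        struct inode  vfs_inode;
};

static inline struct socket *SOCKET_I(struct inode *inode)
{
        return &container_of(inode, struct socket_alloc, vfs_inode)->socket;
}

static inline struct inode *SOCK_INODE(struct socket *socket)
{
        return &container_of(socket, struct socket_alloc, socket)->vfs_inode;
}

Whether anything else ever takes this particular inode's lock while calling
back into the socket layer is exactly what the audit above is about.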


Re: [Patch net] socket: close race condition between sock_close() and sockfs_setattr()

2018-06-07 Thread Al Viro
On Thu, Jun 07, 2018 at 01:39:49PM -0700, Cong Wang wrote:
> fchownat() doesn't hold a refcount on the fd until it figures out that
> the fd is really needed (otherwise it is ignored), and it releases the
> refcount after resolving the path. This means sock_close() can race
> with sockfs_setattr(), which leads to a NULL pointer dereference since
> we typically set sock->sk to NULL in ->release().
> 
> As pointed out by Al, this is unique to sockfs, so we can fix it in the
> socket layer by acquiring inode_lock in sock_close() and checking
> sock->sk against NULL in sockfs_setattr().

That looks like a massive overkill - it's way heavier than it should be.
And it's very likely to trigger shitloads of deadlock warnings, some
possibly even true.