Re: [Patch net] socket: close race condition between sock_close() and sockfs_setattr()
On Sun, Jun 10, 2018 at 12:27 PM, David Miller wrote:
> I'm applying this for now, it is at least a step towards fixing
> all of these issues.
>
> If it is really offensive, I can revert, just tell me.

Thanks, David!

Unless there is something fundamentally broken, there is no reason to
revert it. The only risk here is some possible deadlock, but I am ready
to fix any deadlock report caused by this patch. :)
Re: [Patch net] socket: close race condition between sock_close() and sockfs_setattr()
From: Cong Wang
Date: Thu, 7 Jun 2018 13:39:49 -0700

> fchownat() doesn't even hold refcnt of fd until it figures out
> fd is really needed (otherwise is ignored) and releases it after
> it resolves the path. This means sock_close() could race with
> sockfs_setattr(), which leads to a NULL pointer dereference
> since typically we set sock->sk to NULL in ->release().
>
> As pointed out by Al, this is unique to sockfs. So we can fix this
> in socket layer by acquiring inode_lock in sock_close() and
> checking against NULL in sockfs_setattr().
>
> sock_release() is called in many places, only the sock_close()
> path matters here. And fortunately, this should not affect normal
> sock_close() as it is only called when the last fd refcnt is gone.
> It only affects sock_close() with a parallel sockfs_setattr() in
> progress, which is not common.
>
> Fixes: 86741ec25462 ("net: core: Add a UID field to struct sock.")
> Reported-by: shankarapailoor
> Cc: Tetsuo Handa
> Cc: Lorenzo Colitti
> Cc: Al Viro
> Signed-off-by: Cong Wang

I'm applying this for now, it is at least a step towards fixing
all of these issues.

If it is really offensive, I can revert, just tell me.
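For readers who want the shape of the fix without opening the diff, here is a
minimal user-space sketch of the locking pattern the changelog describes. It is
not the kernel patch: struct model_socket, model_sock_close() and
model_sockfs_setattr() are hypothetical names, a pthread mutex stands in for
inode_lock, and returning -ENOENT for an already-closed socket is an assumption
of this sketch.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sock; only the UID field is modelled. */
struct model_sock {
    unsigned int uid;
};

/* Hypothetical stand-in for struct socket plus its inode lock. */
struct model_socket {
    pthread_mutex_t lock;     /* plays the role of inode_lock */
    struct model_sock *sk;    /* becomes NULL on release, like sock->sk */
};

/* Close path: detach sk while holding the lock. */
void model_sock_close(struct model_socket *sock)
{
    pthread_mutex_lock(&sock->lock);
    sock->sk = NULL;
    pthread_mutex_unlock(&sock->lock);
}

/*
 * setattr path: take the same lock and check sk against NULL before
 * dereferencing it, so a concurrent close cannot pull it away mid-update.
 */
int model_sockfs_setattr(struct model_socket *sock, unsigned int uid)
{
    int err = 0;

    pthread_mutex_lock(&sock->lock);
    if (sock->sk)
        sock->sk->uid = uid;
    else
        err = -ENOENT;        /* assumed error code for a closed socket */
    pthread_mutex_unlock(&sock->lock);

    return err;
}
```

Calling model_sockfs_setattr() before model_sock_close() updates the UID;
calling it afterwards returns an error instead of dereferencing NULL. Because
both paths serialize on one lock, whichever side wins the race leaves the other
with a consistent view of sk.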
Re: [Patch net] socket: close race condition between sock_close() and sockfs_setattr()
On Thu, Jun 07, 2018 at 03:15:15PM -0700, Cong Wang wrote:
> > You do realize that the same ->setattr() can be called by way of
> > chown() on /proc/self/fd/, right? What would you do there -
> > bump refcount on that struct file when traversing that symlink and
> > hold it past the end of pathname resolution, until... what exactly?
>
> I was thinking about this:
>
>         error = user_path_at(dfd,); // hold dfd when needed
>
>         if (!error) {
>                 error = chown_common(&path, mode);
>                 path_put(&path); // release dfd if held
>
> With this, we can guarantee ->release() is only possibly called
> after chown_common() which is after ->setattr() too.

No, we can't. You are assuming that there *is* dfd and that it points
to the opened socket we are going to operate upon. That is not guaranteed.
At all. If e.g. 42 is a file descriptor of an opened socket, plain
chown(2) on /proc/self/fd/42 will trigger that ->setattr(). No dfd in sight.

We do run across an opened file at some point, all right - when
we traverse the symlink in procfs. You can't bump ->f_count there.
Even leaving aside the "where would I stash the reference to that file?"
and "how long would I hold it?", you can't bump ->f_count on other process'
files. That would bugger the expectations of close(2) callers.
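The procfs route described above can be exercised from user space with nothing
but a path-based chown(2). The sketch below is only an illustration:
chown_socket_via_procfs() is a hypothetical helper name, and it assumes a Linux
system with procfs mounted at /proc. It chowns the socket to the caller's own
effective IDs, so it should not need privileges, though the result still
depends on the local environment.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a socket, then chown(2) it through its /proc/self/fd/<n> symlink.
 * Returns 0 on success or a negative errno value on failure.
 */
int chown_socket_via_procfs(void)
{
    char path[64];
    int ret = 0;
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);

    if (fd < 0)
        return -errno;

    snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);

    /* No dfd is involved here: this is a plain path-based chown(2). */
    if (chown(path, geteuid(), getegid()) < 0)
        ret = -errno;

    close(fd);
    return ret;
}
```

The pathname lookup never takes a descriptor argument; the only link to the
open socket is the symlink traversal in procfs, which is why holding a
reference on dfd cannot cover this case.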
Re: [Patch net] socket: close race condition between sock_close() and sockfs_setattr()
On Thu, Jun 7, 2018 at 3:04 PM, Al Viro wrote:
> On Thu, Jun 07, 2018 at 02:45:58PM -0700, Cong Wang wrote:
>> On Thu, Jun 7, 2018 at 2:26 PM, Al Viro wrote:
>> > On Thu, Jun 07, 2018 at 01:39:49PM -0700, Cong Wang wrote:
>> >> fchownat() doesn't even hold refcnt of fd until it figures out
>> >> fd is really needed (otherwise is ignored) and releases it after
>> >> it resolves the path. This means sock_close() could race with
>> >> sockfs_setattr(), which leads to a NULL pointer dereference
>> >> since typically we set sock->sk to NULL in ->release().
>> >>
>> >> As pointed out by Al, this is unique to sockfs. So we can fix this
>> >> in socket layer by acquiring inode_lock in sock_close() and
>> >> checking against NULL in sockfs_setattr().
>> >
>> > That looks like a massive overkill - it's way heavier than it should be.
>>
>> I don't see any other quick way to fix this. My initial thought is
>> to keep that refcnt until path_put(), apparently you don't like it
>> either.
>
> You do realize that the same ->setattr() can be called by way of
> chown() on /proc/self/fd/, right? What would you do there -
> bump refcount on that struct file when traversing that symlink and
> hold it past the end of pathname resolution, until... what exactly?

I was thinking about this:

        error = user_path_at(dfd,); // hold dfd when needed

        if (!error) {
                error = chown_common(&path, mode);
                path_put(&path); // release dfd if held

With this, we can guarantee ->release() is only possibly called
after chown_common() which is after ->setattr() too.
Re: [Patch net] socket: close race condition between sock_close() and sockfs_setattr()
On Thu, Jun 07, 2018 at 02:45:58PM -0700, Cong Wang wrote:
> On Thu, Jun 7, 2018 at 2:26 PM, Al Viro wrote:
> > On Thu, Jun 07, 2018 at 01:39:49PM -0700, Cong Wang wrote:
> >> fchownat() doesn't even hold refcnt of fd until it figures out
> >> fd is really needed (otherwise is ignored) and releases it after
> >> it resolves the path. This means sock_close() could race with
> >> sockfs_setattr(), which leads to a NULL pointer dereference
> >> since typically we set sock->sk to NULL in ->release().
> >>
> >> As pointed out by Al, this is unique to sockfs. So we can fix this
> >> in socket layer by acquiring inode_lock in sock_close() and
> >> checking against NULL in sockfs_setattr().
> >
> > That looks like a massive overkill - it's way heavier than it should be.
>
> I don't see any other quick way to fix this. My initial thought is
> to keep that refcnt until path_put(), apparently you don't like it
> either.

You do realize that the same ->setattr() can be called by way of
chown() on /proc/self/fd/, right? What would you do there -
bump refcount on that struct file when traversing that symlink and
hold it past the end of pathname resolution, until... what exactly?
Re: [Patch net] socket: close race condition between sock_close() and sockfs_setattr()
On Thu, Jun 7, 2018 at 2:26 PM, Al Viro wrote:
> On Thu, Jun 07, 2018 at 01:39:49PM -0700, Cong Wang wrote:
>> fchownat() doesn't even hold refcnt of fd until it figures out
>> fd is really needed (otherwise is ignored) and releases it after
>> it resolves the path. This means sock_close() could race with
>> sockfs_setattr(), which leads to a NULL pointer dereference
>> since typically we set sock->sk to NULL in ->release().
>>
>> As pointed out by Al, this is unique to sockfs. So we can fix this
>> in socket layer by acquiring inode_lock in sock_close() and
>> checking against NULL in sockfs_setattr().
>
> That looks like a massive overkill - it's way heavier than it should be.

I don't see any other quick way to fix this. My initial thought is
to keep that refcnt until path_put(), apparently you don't like it
either.

> And it's very likely to trigger shitloads of deadlock warnings, some
> possibly even true.

I do audit the inode_lock usage in networking, I don't see any
deadlock, of course, there could be some non-networking code uses
socket API that I missed.

But _generally_, socket doesn't have a pointer to retrieve this
inode.
Re: [Patch net] socket: close race condition between sock_close() and sockfs_setattr()
On Thu, Jun 07, 2018 at 01:39:49PM -0700, Cong Wang wrote:
> fchownat() doesn't even hold refcnt of fd until it figures out
> fd is really needed (otherwise is ignored) and releases it after
> it resolves the path. This means sock_close() could race with
> sockfs_setattr(), which leads to a NULL pointer dereference
> since typically we set sock->sk to NULL in ->release().
>
> As pointed out by Al, this is unique to sockfs. So we can fix this
> in socket layer by acquiring inode_lock in sock_close() and
> checking against NULL in sockfs_setattr().

That looks like a massive overkill - it's way heavier than it should be.
And it's very likely to trigger shitloads of deadlock warnings, some
possibly even true.