Hi Doug,

> -----Original Message-----
> From: Doug Ledford <dledf...@redhat.com>
> Sent: Thursday, August 23, 2018 11:56 AM
> To: Parav Pandit <pa...@mellanox.com>; Jason Gunthorpe <j...@ziepe.ca>;
> Eric Biggers <ebigg...@kernel.org>
> Cc: linux-r...@vger.kernel.org; dasaratharaman.chandramo...@intel.com;
> Leon Romanovsky <leo...@mellanox.com>; linux-kernel@vger.kernel.org;
> Mark Bloch <ma...@mellanox.com>; Moni Shoua <mo...@mellanox.com>;
> syzkaller-b...@googlegroups.com; syzbot
> <syzbot+29ee8f76017ce6cf0...@syzkaller.appspotmail.com>
> Subject: Re: [RDMA bug] KASAN: use-after-free Read in __list_del_entry_valid
> (4)
>
> On Thu, 2018-08-23 at 16:39 +0000, Parav Pandit wrote:
> > > -----Original Message-----
> > > From: Jason Gunthorpe <j...@ziepe.ca>
> > > Sent: Thursday, August 23, 2018 9:55 AM
> > > To: Eric Biggers <ebigg...@kernel.org>
> > > Cc: Doug Ledford <dledf...@redhat.com>; linux-r...@vger.kernel.org;
> > > dasaratharaman.chandramo...@intel.com; Leon Romanovsky
> > > <leo...@mellanox.com>; linux-kernel@vger.kernel.org; Mark Bloch
> > > <ma...@mellanox.com>; Moni Shoua <mo...@mellanox.com>; Parav Pandit
> > > <pa...@mellanox.com>; syzkaller-b...@googlegroups.com; syzbot
> > > <syzbot+29ee8f76017ce6cf0...@syzkaller.appspotmail.com>
> > > Subject: Re: [RDMA bug] KASAN: use-after-free Read in
> > > __list_del_entry_valid (4)
> > >
> > > On Wed, Aug 22, 2018 at 11:16:31PM -0700, Eric Biggers wrote:
> > > > Hello RDMA / InfiniBand maintainers,
> > > >
> > > > This is an RDMA bug and it still occurs on Linus' tree as of today
> > > > (commit 815f0ddb346c1960).
> > > >
> > > > I've also simplified the reproducer for it; see below after the
> > > > original report.
> > > > Apparently it involves a race between RDMA_USER_CM_CMD_RESOLVE_IP
> > > > and RDMA_USER_CM_CMD_LISTEN.
> > >
> > > That is an amazing reproducer!
> > >
> > > I have a feeling this is the same cause as all the other syzkaller bugs
> > > in this code:
> > > lack of any sane locking at all :\
> > >
> > > We've talked about chucking a big lock around this whole thing, but
> > > nobody has done it yet.. It isn't so simple.
> > >
> >
> > I had some code which reduces three locks (handler_lock, qp_mutex,
> > id_lock) to a single mutex to protect the cm_id, and which protects every
> > exported symbol of rdmacm that works on a cm_id.
> > But it is not ready enough to post as a patch yet. A lot of tests are
> > required before I get there, and some refactoring too before that.
>
> Does it finally address the fact that the rdmacm code was written so that it
> was always synchronous but RoCE src gid (I think that's what it was, I'm
> typing this from long ago memory) lookup broke that assumption?
>

I am not sure. To me it is unlikely, because rdma_resolve_route() for
InfiniBand is not synchronous either, since it needs to query the SA.
But qp_mutex existed long before that, and keeping it as a third lock,
separate from id_lock and handler_lock, doesn't provide any performance
improvement.
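
To make the single-mutex idea concrete, here is a minimal userspace sketch.
It is not the rdmacm code and not the unposted patch: the struct, the state
names and the functions (demo_id, DEMO_*, demo_resolve_ip, demo_listen) are
all hypothetical, and a pthread mutex stands in for a kernel mutex. It only
shows the shape of the approach, where every entry point takes the same lock
before checking and changing the ID's state, so a resolve and a listen on the
same ID cannot interleave.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical states, only loosely modelled on a connection-manager ID. */
enum demo_state { DEMO_ADDR_BOUND, DEMO_ADDR_QUERY, DEMO_LISTEN };

struct demo_id {
	pthread_mutex_t lock;   /* one lock for all state of this ID */
	enum demo_state state;
	int on_listen_list;     /* stands in for membership in a listen list */
};

static void demo_id_init(struct demo_id *id)
{
	pthread_mutex_init(&id->lock, NULL);
	id->state = DEMO_ADDR_BOUND;
	id->on_listen_list = 0;
}

/* Start address resolution; only valid from DEMO_ADDR_BOUND. */
static int demo_resolve_ip(struct demo_id *id)
{
	int ret = -1;

	pthread_mutex_lock(&id->lock);
	if (id->state == DEMO_ADDR_BOUND) {
		id->state = DEMO_ADDR_QUERY;
		ret = 0;
	}
	pthread_mutex_unlock(&id->lock);
	return ret;
}

/* Start listening; only valid from DEMO_ADDR_BOUND. */
static int demo_listen(struct demo_id *id)
{
	int ret = -1;

	pthread_mutex_lock(&id->lock);
	if (id->state == DEMO_ADDR_BOUND) {
		/* State check and list update happen under the same lock. */
		id->state = DEMO_LISTEN;
		id->on_listen_list = 1;
		ret = 0;
	}
	/* Invariant: list membership implies the LISTEN state. */
	assert(!id->on_listen_list || id->state == DEMO_LISTEN);
	pthread_mutex_unlock(&id->lock);
	return ret;
}
```

With this shape, whichever of the two calls takes the lock first wins and the
other fails its state check, so the ID is never on the listen list while in a
non-listening state. The real code has many more states and callbacks that
run asynchronously, which is where the refactoring effort would go.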

> --
> Doug Ledford <dledf...@redhat.com>
> GPG KeyID: B826A3330E572FDD
> Key fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD