On Wed, Dec 08, 2010 at 04:55:22PM +0200, Nir Muchtar wrote:
> On Tue, 2010-12-07 at 14:29 -0700, Jason Gunthorpe wrote:
> 
> > What you've done in your v2 patch won't work if the table you are
> > dumping is too large, once you pass sk_rmem_alloc for the netlink
> > socket it will deadlock. The purpose of dump_start is to avoid that
> > deadlock. (review my past messages on the subject)
> > 
> > Your v1 patch wouldn't deadlock, but it would fail to dump with
> > ENOMEM, and provides an avenue to build an unprivileged kernel OOM
> > DOS.
> > 
> > The places in the kernel that don't use dump_start have to stay under
> > sk_rmem_alloc.
> > 
> > Jason
> 
> Sorry, I still need some clarifications...
> When you say deadlocks, do you mean when calling malloc with a lock or
> when overflowing a socket receive buffer?
> For the second case, when we use netlink_unicast, the skbuff is sent and
> freed. It is transferred to the userspace's socket using netlink_sendskb
> and accumulated in its recv buff.
> 
> Are you referring to a deadlock there? I still fail to see the issue.
> Why would the kernel socket recv buff reach a limit? Could you please
> elaborate?

Netlink is all driven from user space syscalls, so the call chain
looks like:

sendmsg()
[..]
ibnl_rcv_msg
cma_get_stats
[..]
ibnl_unicast
[..]
netlink_attachskb
(now we block on the socket recv queue once it fills)

The deadlock is that userspace is sitting in sendmsg() while the
kernel is sleeping in netlink_attachskb waiting for the recvbuf to
empty.

User space cannot call recvmsg() while it is blocked in sendmsg(),
so it all goes boom.
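
As a rough user space analogy (not the netlink code path itself), the
sketch below fills one end of a socketpair() that nobody reads from.
The writer is non-blocking, so instead of sleeping it gets EAGAIN at
the point where a blocking writer would be stuck forever. The helper
name fill_until_would_block and the 1024-byte chunk size are made up
for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Write into one end of a socketpair without ever reading the other end.
 * Returns the number of successful write() calls before the kernel
 * refused more data, or -1 on an unexpected error. *would_block is set
 * to 1 when the loop stopped on EAGAIN/EWOULDBLOCK, which is where a
 * blocking writer would have gone to sleep.
 */
static long fill_until_would_block(int *would_block)
{
    int sv[2];
    char chunk[1024];
    long sent = 0;
    int flags;

    assert(would_block != NULL);
    *would_block = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    /* Non-blocking, so the demo reports the stall instead of hanging. */
    flags = fcntl(sv[0], F_GETFL, 0);
    if (flags < 0 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    memset(chunk, 0, sizeof(chunk));

    /* The iteration cap is only a safety net for unusual environments. */
    while (sent < (1L << 16)) {
        ssize_t n = write(sv[0], chunk, sizeof(chunk));

        if (n < 0) {
            if (errno == EINTR)
                continue;
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                *would_block = 1;
            break;
        }
        sent++;
    }

    close(sv[0]);
    close(sv[1]);
    return *would_block ? sent : -1;
}
```

In the netlink case the process that would have to drain the queue is
the same one sitting in sendmsg(), so nothing ever makes room.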

Even if cma_get_stats were executed from a kernel thread and
ibnl_rcv_msg returned back to userspace, you would still hold the
dev_list mutex while calling ibnl_unicast, which can sleep waiting on
userspace. That creates an easy DoS against the RDMA CM (I can write
a program that causes the kernel to hold the mutex indefinitely).

You can't hold the mutex while sleeping for userspace, so you have to
unlock it. If you unlock it you have to fix up your position when you
re-lock it. If you can fix up your position then you can use
dump_start.
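
A minimal user space sketch of that unlock / fix up / re-lock pattern,
assuming the position can be kept as a plain index. Everything here is
hypothetical (struct dump_cursor, dump_chunk, dump_all, the int flag
standing in for the dev_list mutex); it is not the netlink dump API,
only the shape of a resumable dump. A real list can change between
chunks, so the fix-up step would need more than an index.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_LEN 10   /* hypothetical table size */
#define CHUNK_CAP 4    /* stand-in for the room left in the receive buffer */

static int table[TABLE_LEN] = {10, 11, 12, 13, 14, 15, 16, 17, 18, 19};
static int table_locked;   /* stand-in for the dev_list mutex */

static void table_lock(void)   { assert(!table_locked); table_locked = 1; }
static void table_unlock(void) { assert(table_locked);  table_locked = 0; }

/* Resume position, kept by the caller between chunks. */
struct dump_cursor {
    size_t next;
};

/*
 * Copy at most cap entries starting at cur->next. The lock is held only
 * for the copy and is always dropped before returning, so nothing
 * sleeps on the reader while holding it. Returns 0 when the dump is done.
 */
static size_t dump_chunk(struct dump_cursor *cur, int *out, size_t cap)
{
    size_t n = 0;

    table_lock();
    while (cur->next < TABLE_LEN && n < cap)
        out[n++] = table[cur->next++];
    table_unlock();
    return n;
}

/* Reader side: drain one chunk, then ask for the next one. */
static size_t dump_all(int *dst, size_t dst_len, size_t *chunks)
{
    struct dump_cursor cur = { 0 };
    int buf[CHUNK_CAP];
    size_t total = 0, n, i;

    *chunks = 0;
    while ((n = dump_chunk(&cur, buf, CHUNK_CAP)) > 0) {
        assert(!table_locked);   /* lock is never held while draining */
        for (i = 0; i < n && total < dst_len; i++)
            dst[total++] = buf[i];
        (*chunks)++;
    }
    return total;
}
```

Each call produces a bounded amount of data and returns, which is what
keeps the producer under the receive buffer limit.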

I don't see malloc being a concern anywhere in what you've done...

Jason