Re: [for-next 7/7] IB/mlx5: Implement fragmented completion queue (CQ)
> On Feb 23, 2018, at 9:13 PM, Saeed Mahameed wrote:
>
>> On Thu, 2018-02-22 at 16:04 -0800, Santosh Shilimkar wrote:
>> Hi Saeed
>>
>>> On 2/21/2018 12:13 PM, Saeed Mahameed wrote:
>>> From: Yonatan Cohen
>>>
>>> The current implementation of create CQ requires contiguous
>>> memory; such a requirement is problematic once memory is
>>> fragmented or the system is low on memory, as it causes
>>> failures in dma_zalloc_coherent().
>>>
>>> This patch implements a new scheme of fragmented CQs to overcome
>>> this issue by introducing a new type, 'struct mlx5_frag_buf_ctrl',
>>> to allocate fragmented buffers rather than contiguous ones.
>>>
>>> Base the Completion Queues (CQs) on this new fragmented buffer.
>>>
>>> It fixes the following crashes:
>>> kworker/29:0: page allocation failure: order:6, mode:0x80d0
>>> CPU: 29 PID: 8374 Comm: kworker/29:0 Tainted: G OE 3.10.0
>>> Workqueue: ib_cm cm_work_handler [ib_cm]
>>> Call Trace:
>>> [<>] dump_stack+0x19/0x1b
>>> [<>] warn_alloc_failed+0x110/0x180
>>> [<>] __alloc_pages_slowpath+0x6b7/0x725
>>> [<>] __alloc_pages_nodemask+0x405/0x420
>>> [<>] dma_generic_alloc_coherent+0x8f/0x140
>>> [<>] x86_swiotlb_alloc_coherent+0x21/0x50
>>> [<>] mlx5_dma_zalloc_coherent_node+0xad/0x110 [mlx5_core]
>>> [<>] ? mlx5_db_alloc_node+0x69/0x1b0 [mlx5_core]
>>> [<>] mlx5_buf_alloc_node+0x3e/0xa0 [mlx5_core]
>>> [<>] mlx5_buf_alloc+0x14/0x20 [mlx5_core]
>>> [<>] create_cq_kernel+0x90/0x1f0 [mlx5_ib]
>>> [<>] mlx5_ib_create_cq+0x3b0/0x4e0 [mlx5_ib]
>>>
>>> Signed-off-by: Yonatan Cohen
>>> Reviewed-by: Tariq Toukan
>>> Signed-off-by: Leon Romanovsky
>>> Signed-off-by: Saeed Mahameed
>>> ---
>>
>> Jason mentioned this patch to me off-list. We were seeing a
>> similar issue with SRQs & QPs, so I am wondering whether you have
>> any plans to make a similar change for other resources too, so
>> that they don't rely on higher-order page allocation for icm
>> tables.
>>
>
> Hi Santosh,
>
> Adding Majd,
>
> Which ULP is in question? How big are the QPs/SRQs you create that
> lead to this problem?
>
> For icm tables we already allocate only order-0 pages:
> see alloc_system_page() in
> drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
>
> But for kernel RDMA SRQ and QP buffers there is room for
> improvement.
>
> Majd, do you know if we have any near-future plans for this?

It's in our plans to move all the buffers to use order-0 pages.

Santosh,
Is this RDS? Do you have a persistent failure with some configuration?
Can you please share more information?

Thanks

>
>> Regards,
>> Santosh
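For context, the idea behind the fragmented buffer discussed above is to replace one large dma_zalloc_coherent() call, which needs a high-order contiguous allocation (the order:6 request in the trace), with many page-sized allocations. The sketch below is a minimal, self-contained illustration of that pattern using the generic DMA API; the structure and function names are hypothetical and this is not the actual mlx5 'struct mlx5_frag_buf_ctrl' implementation, which per the commit message also teaches the CQ code to index into the right fragment.

/*
 * Minimal sketch (not the mlx5 implementation): allocate a buffer as an
 * array of PAGE_SIZE fragments so that no high-order contiguous
 * allocation is needed.  All names here are hypothetical.
 */
#include <linux/dma-mapping.h>
#include <linux/errno.h>
#include <linux/kernel.h>
#include <linux/slab.h>

struct frag {
	void		*buf;
	dma_addr_t	map;
};

struct frag_buf {
	struct frag	*frags;
	int		nfrags;
};

static void frag_buf_free(struct device *dev, struct frag_buf *fbuf)
{
	int i;

	for (i = 0; i < fbuf->nfrags; i++)
		if (fbuf->frags[i].buf)
			dma_free_coherent(dev, PAGE_SIZE,
					  fbuf->frags[i].buf,
					  fbuf->frags[i].map);
	kfree(fbuf->frags);
}

static int frag_buf_alloc(struct device *dev, size_t size,
			  struct frag_buf *fbuf)
{
	int i;

	fbuf->nfrags = DIV_ROUND_UP(size, PAGE_SIZE);
	fbuf->frags  = kcalloc(fbuf->nfrags, sizeof(*fbuf->frags), GFP_KERNEL);
	if (!fbuf->frags)
		return -ENOMEM;

	for (i = 0; i < fbuf->nfrags; i++) {
		/* Each fragment is a single order-0 page, so this cannot
		 * fail for lack of contiguous memory. */
		fbuf->frags[i].buf = dma_alloc_coherent(dev, PAGE_SIZE,
							&fbuf->frags[i].map,
							GFP_KERNEL);
		if (!fbuf->frags[i].buf) {
			frag_buf_free(dev, fbuf);
			return -ENOMEM;
		}
	}
	return 0;
}

The trade-off is that consumers can no longer address the whole buffer as one linear region; lookups (for example of a CQE by index) must first select the fragment and then the offset within it.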
Re: [for-next 4/6] net/mlx5: FPGA, Add basic support for Innova
> On Jun 10, 2017, at 1:24 AM, Doug Ledford wrote:
>
>> On Wed, 2017-06-07 at 13:21 -0600, Jason Gunthorpe wrote:
>>> On Wed, Jun 07, 2017 at 10:13:43PM +0300, Saeed Mahameed wrote:
>>>
>>> No !!
>>> I am just showing you that the ib_core eventually will end up
>>> calling mlx5_core to create a QP.
>>> so mlx5_core can create the QP itself, since it is the one
>>> eventually creating QPs.
>>> we just call mlx5_core_create_qp directly.
>>
>> Which is building an RDMA ULP inside a driver without using the core
>> code :(
>
> Aren't the transmit/receive queues of the Ethernet netdevice on
> mlx4/mlx5 hardware QPs too? Those bypass the RDMA subsystem entirely.
> Just because something uses a QP on hardware that does *everything*
> via QPs doesn't necessarily mean it must go through the RDMA subsystem.
>
> Now, the fact that the content of the packets is basically a RoCE
> packet does make things a bit fuzzier, but if their packets are
> specially crafted RoCE packets that aren't really intended to be fully
> RoCE spec compliant (maybe they don't support all the options of
> normal RoCE QPs), then I can see hiding them from the larger RoCE
> portion of the RDMA stack.
>
>>>> This keeps getting more ugly :( What about security? What if user
>>>> space sends some raw packets to the FPGA - can it reprogram the
>>>> IPSEC settings or worse?
>>>
>>> No such thing. This QP is only for internal driver/HW
>>> communications, as it is faster than the existing command interface.
>>> it is not meant to be exposed for any raw user space usage at all,
>>> without a proper standard API adapter of course.
>>
>> I'm not asking about the QP, I'm asking what happens after the NIC
>> part. You use RoCE packets to control the FPGA. What prevents
>> userspace from forcibly constructing RoCE packets and sending them to
>> the FPGA? How does the FPGA know for certain the packet came from the
>> kernel QP and not someplace else?
>
> This is a valid concern.
>
>> This is especially true for mlx nics as there are many raw packet
>> bypass mechanisms available to userspace.

All of the raw packet bypass mechanisms are restricted to CAP_NET_RAW,
and thus malicious users can't simply open a raw packet QP and send
packets to the FPGA.

> Right. The question becomes: Does the firmware filter outgoing raw ETH
> QPs such that a nefarious user could not send a crafted RoCE packet
> that the bump on the wire would intercept and accept?
>
> --
> Doug Ledford
> GPG KeyID: B826A3330E572FDD
> Key fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD
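The CAP_NET_RAW restriction mentioned above is the standard kernel capability gate on raw packet QP creation. The sketch below shows the kind of check being referred to; the helper name is hypothetical, while capable() and IB_QPT_RAW_PACKET are the real kernel symbols. It does not, by itself, answer Doug's question about whether the firmware additionally filters crafted RoCE packets sent on such QPs.

/*
 * Sketch of the privilege gate being referred to: creation of a raw
 * packet QP is refused unless the caller has CAP_NET_RAW.  The helper
 * name is hypothetical.
 */
#include <linux/capability.h>
#include <linux/errno.h>
#include <rdma/ib_verbs.h>

static int raw_packet_qp_allowed(enum ib_qp_type qp_type)
{
	if (qp_type == IB_QPT_RAW_PACKET && !capable(CAP_NET_RAW))
		return -EPERM;	/* unprivileged users cannot open raw ETH QPs */

	return 0;
}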