Public bug reported:

SRU Justification:
[Impact]

Backport RDMA DMABUF functionality.

Nvidia is developing a high-performance networking solution for real customers. The solution is built on the Ubuntu 22.04 LTS release and its distro kernel (lowlatency flavour). This "dma_buf" patchset consists of patches that were merged upstream in 5.16 and 5.17 and that allow buffers to be shared between drivers, improving performance by reducing data copying. The new functionality is isolated such that existing users will not execute these new code paths.

* The first 3 patches add a new API to the RDMA subsystem that allows drivers to get a pinned dmabuf memory region without requiring an implementation of the move_notify callback.
  https://lore.kernel.org/all/20211012120903.96933-1-galpr...@amazon.com/

* The remaining patches add support for DMABUF when creating a devx umem. devx umems are quite similar to MRs except that they cannot be revoked, so this uses the dmabuf pinned memory flow. Several mlx5dv flows require a umem and cannot work with an MR.
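As a rough illustration of the API shape described above, the following kernel-side sketch (not buildable as a standalone module) shows how a driver would obtain a pinned dmabuf region. `ib_umem_dmabuf_get_pinned()` is the entry point these patches add in include/rdma/ib_umem.h; the wrapper function and its parameters are illustrative only:

```c
/* Sketch: registering a pinned dmabuf-backed memory region from a
 * kernel RDMA driver.  Error handling is abbreviated. */
#include <rdma/ib_umem.h>

static struct ib_umem *register_dmabuf_region(struct ib_device *dev,
					      unsigned long offset,
					      size_t length, int dmabuf_fd,
					      int access_flags)
{
	struct ib_umem_dmabuf *umem_dmabuf;

	/*
	 * Unlike ib_umem_dmabuf_get(), the pinned variant takes no
	 * dma_buf_attach_ops and therefore needs no move_notify
	 * callback: the pages stay pinned for the lifetime of the
	 * mapping, which is what a non-revocable devx umem requires.
	 */
	umem_dmabuf = ib_umem_dmabuf_get_pinned(dev, offset, length,
						dmabuf_fd, access_flags);
	if (IS_ERR(umem_dmabuf))
		return ERR_CAST(umem_dmabuf);

	return &umem_dmabuf->umem;
}
```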
  https://lore.kernel.org/all/0-v1-bd147097458e+ede-umem_dmabuf_...@nvidia.com/

[Test Plan]

SW Configuration:

• Download the CUDA 12.2 run file (https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=20.04&target_type=runfile_local)
• Install using the open kernel modules, i.e.
  # sh ./cuda_12.2.2_535.104.05_linux.run -m=kernel-open
• Clone perftest from https://github.com/linux-rdma/perftest
• cd perftest
• export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64:$LD_LIBRARY_PATH
• export LIBRARY_PATH=/usr/local/cuda-12.2/lib64:$LIBRARY_PATH
• run: ./autogen.sh ; ./configure CUDA_H_PATH=/usr/local/cuda/include/cuda.h ; make

# Start server
$ ./ib_write_bw -d mlx5_2 -F --use_cuda=0 --use_cuda_dmabuf

# Start client
$ ./ib_write_bw -d mlx5_3 -F --use_cuda=1 --use_cuda_dmabuf localhost

[Where problems could occur?]

** Affects: linux (Ubuntu)
   Importance: Undecided
       Status: Incomplete

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2040526

Title:
  Backport DMABUF functionality

Status in linux package in Ubuntu:
  Incomplete
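The perftest steps from the test plan can be sketched as the script below. The CUDA install path and the RDMA device names (mlx5_2/mlx5_3) are site-specific assumptions copied from the steps above; the script only prints the two command lines so it can be sanity-checked on a machine without RDMA hardware:

```shell
#!/bin/sh
# Sketch of the perftest loopback run from the test plan above.
# Assumes the CUDA 12.2 runfile was installed to its default prefix.
CUDA_HOME=/usr/local/cuda-12.2
export LD_LIBRARY_PATH="$CUDA_HOME/lib64:$LD_LIBRARY_PATH"
export LIBRARY_PATH="$CUDA_HOME/lib64:$LIBRARY_PATH"

SERVER_DEV=mlx5_2   # server side, CUDA device 0 (--use_cuda=0)
CLIENT_DEV=mlx5_3   # client side, CUDA device 1 (--use_cuda=1)

# Print the two command lines instead of executing them; drop the echos
# to actually run the server (in the background) and then the client.
echo "server: ./ib_write_bw -d $SERVER_DEV -F --use_cuda=0 --use_cuda_dmabuf"
echo "client: ./ib_write_bw -d $CLIENT_DEV -F --use_cuda=1 --use_cuda_dmabuf localhost"
```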
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2040526/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp