[Bug 2055082] Re: IB peer memory feature regressed in 6.5
This bug is awaiting verification that the linux-raspi/6.5.0-1014.17 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results.

If the problem is solved, change the tag 'verification-needed-mantic-linux-raspi' to 'verification-done-mantic-linux-raspi'. If the problem still exists, change the tag 'verification-needed-mantic-linux-raspi' to 'verification-failed-mantic-linux-raspi'.

If verification is not done within 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

** Tags added: kernel-spammed-mantic-linux-raspi-v2 verification-needed-mantic-linux-raspi

--
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2055082

Title:
  IB peer memory feature regressed in 6.5

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2055082/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
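The tag change requested in the message above follows a fixed naming pattern, so it can be sketched in a few lines of Python. This is only an illustration of the naming rule: the function names are hypothetical, and nothing here talks to Launchpad.

```python
def verification_tag(series, package, outcome):
    """Build 'verification-<outcome>-<series>-<package>'.

    outcome is 'done' (problem solved) or 'failed' (problem still exists),
    matching the two cases described in the notification.
    """
    if outcome not in ("done", "failed"):
        raise ValueError("outcome must be 'done' or 'failed'")
    return f"verification-{outcome}-{series}-{package}"


def updated_tags(tags, series, package, solved):
    """Return a copy of tags with the 'verification-needed' tag replaced."""
    needed = f"verification-needed-{series}-{package}"
    replacement = verification_tag(series, package, "done" if solved else "failed")
    # Tags other than the 'verification-needed' one are left untouched.
    return [replacement if tag == needed else tag for tag in tags]


if __name__ == "__main__":
    tags = [
        "kernel-spammed-mantic-linux-raspi-v2",
        "verification-needed-mantic-linux-raspi",
    ]
    print(updated_tags(tags, "mantic", "linux-raspi", solved=True))
```

The same helper covers the linux package by passing "linux" instead of "linux-raspi".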
[Bug 2055082] Re: IB peer memory feature regressed in 6.5
This bug was fixed in the package linux - 6.5.0-27.28

---
linux (6.5.0-27.28) mantic; urgency=medium

  * mantic/linux: 6.5.0-27.28 -proposed tracker (LP: #2055584)

  * Packaging resync (LP: #1786013)
    - [Packaging] drop ABI data
    - [Packaging] update annotations scripts
    - debian.master/dkms-versions -- update from kernel-versions (main/2024.03.04)

  * CVE-2024-26597
    - net: qualcomm: rmnet: fix global oob in rmnet_policy

  * CVE-2024-26599
    - pwm: Fix out-of-bounds access in of_pwm_single_xlate()

  * Drop ABI checks from kernel build (LP: #2055686)
    - [Packaging] Remove in-tree abi checks

  * Cranky update-dkms-versions rollout (LP: #2055685)
    - [Packaging] remove update-dkms-versions
    - Move debian/dkms-versions to debian.master/dkms-versions
    - [Packaging] Replace debian/dkms-versions with $(DEBIAN)/dkms-versions

  * linux: please move erofs.ko (CONFIG_EROFS for EROFS support) from
    linux-modules-extra to linux-modules (LP: #2054809)
    - UBUNTU [Packaging]: Include erofs in linux-modules instead of
      linux-modules-extra

  * performance: Scheduler: ratelimit updating of load_avg (LP: #2053251)
    - sched/fair: Ratelimit update to tg->load_avg

  * IB peer memory feature regressed in 6.5 (LP: #2055082)
    - SAUCE: RDMA/core: Introduce peer memory interface

  * linux-tools-common: man page of usbip[d] is misplaced (LP: #2054094)
    - [Packaging] rules: Put usbip manpages in the correct directory

  * CVE-2024-23851
    - dm: limit the number of targets and parameter size area

  * CVE-2024-23850
    - btrfs: do not ASSERT() if the newly created subvolume already got read

  * x86: performance: tsc: Extend watchdog check exemption to 4-Sockets
    platform (LP: #2054699)
    - x86/tsc: Extend watchdog check exemption to 4-Sockets platform

  * linux: please move dmi-sysfs.ko (CONFIG_DMI_SYSFS for SMBIOS support) from
    linux-modules-extra to linux-modules (LP: #2045561)
    - [Packaging] Move dmi-sysfs.ko into linux-modules

  * Fix AMD brightness issue on AUO panel (LP: #2054773)
    - drm/amdgpu: make damage clips support configurable

  * Mantic update: upstream stable patchset 2024-02-28 (LP: #2055199)
    - f2fs: explicitly null-terminate the xattr list
    - pinctrl: lochnagar: Don't build on MIPS
    - ALSA: hda - Fix speaker and headset mic pin config for CHUWI CoreBook XPro
    - mptcp: fix uninit-value in mptcp_incoming_options
    - wifi: cfg80211: lock wiphy mutex for rfkill poll
    - wifi: avoid offset calculation on NULL pointer
    - wifi: mac80211: handle 320 MHz in ieee80211_ht_cap_ie_to_sta_ht_cap
    - debugfs: fix automount d_fsdata usage
    - nvme-core: fix a memory leak in nvme_ns_info_from_identify()
    - drm/amd/display: update dcn315 lpddr pstate latency
    - drm/amdgpu: Fix cat debugfs amdgpu_regs_didt causes kernel null pointer
    - smb: client, common: fix fortify warnings
    - blk-mq: don't count completed flush data request as inflight in case of
      quiesce
    - nvme-core: check for too small lba shift
    - hwtracing: hisi_ptt: Handle the interrupt in hardirq context
    - hwtracing: hisi_ptt: Don't try to attach a task
    - ASoC: wm8974: Correct boost mixer inputs
    - arm64: dts: rockchip: fix rk356x pcie msg interrupt name
    - ASoC: Intel: Skylake: Fix mem leak in few functions
    - ASoC: nau8822: Fix incorrect type in assignment and cast to restricted
      __be16
    - ASoC: Intel: Skylake: mem leak in skl register function
    - ASoC: cs43130: Fix the position of const qualifier
    - ASoC: cs43130: Fix incorrect frame delay configuration
    - ASoC: rt5650: add mutex to avoid the jack detection failure
    - ASoC: Intel: skl_hda_dsp_generic: Drop HDMI routes when HDMI is not
      available
    - nouveau/tu102: flush all pdbs on vmm flush
    - ASoC: amd: yc: Add DMI entry to support System76 Pangolin 13
    - ASoC: hdac_hda: Conditionally register dais for HDMI and Analog
    - net/tg3: fix race condition in tg3_reset_task()
    - ASoC: da7219: Support low DC impedance headset
    - nvme: introduce helper function to get ctrl state
    - nvme: prevent potential spectre v1 gadget
    - arm64: dts: rockchip: Fix PCI node addresses on rk3399-gru
    - drm/amdgpu: Add NULL checks for function pointers
    - drm/exynos: fix a potential error pointer dereference
    - drm/exynos: fix a wrong error checking
    - hwmon: (corsair-psu) Fix probe when built-in
    - LoongArch: Preserve syscall nr across execve()
    - clk: rockchip: rk3568: Add PLL rate for 292.5MHz
    - clk: rockchip: rk3128: Fix HCLK_OTG gate register
    - jbd2: correct the printing of write_flags in jbd2_write_superblock()
    - jbd2: increase the journal IO's priority
    - drm/crtc: Fix uninit-value bug in drm_mode_setcrtc
    - neighbour: Don't let neigh_forced_gc() disable preemption for long
    - platform/x86: intel-vbtn: Fix missing tablet-mode-switch events
    - jbd2: fix soft lockup in journal_finish_inode_data_buffe
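The changelog above references Launchpad bugs as "(LP: #NNNNNNN)" and security fixes by CVE identifier. A minimal sketch of how one might pull those references out of changelog text, for example to confirm that LP: #2055082 is listed, could look like this. The function names are hypothetical and the regular expressions only assume the two notations visible in this message.

```python
import re

# Patterns for the two reference styles that appear in the changelog text.
LP_RE = re.compile(r"LP: #(\d+)")
CVE_RE = re.compile(r"CVE-\d{4}-\d+")


def changelog_refs(text):
    """Return (sorted LP bug numbers, sorted CVE ids) found in text."""
    lp_bugs = sorted({int(number) for number in LP_RE.findall(text)})
    cves = sorted(set(CVE_RE.findall(text)))
    return lp_bugs, cves


def mentions_bug(text, bug_number):
    """True if the changelog text references the given Launchpad bug."""
    return bug_number in changelog_refs(text)[0]


if __name__ == "__main__":
    excerpt = (
        "* IB peer memory feature regressed in 6.5 (LP: #2055082) "
        "- SAUCE: RDMA/core: Introduce peer memory interface "
        "* CVE-2024-23851 - dm: limit the number of targets and parameter size area"
    )
    print(changelog_refs(excerpt))
    print(mentions_bug(excerpt, 2055082))
```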
[Bug 2055082] Re: IB peer memory feature regressed in 6.5
= Verification =

$ cat /proc/version
Linux version 6.5.0-27-generic (buildd@lcy02-amd64-059) (x86_64-linux-gnu-gcc-13 (Ubuntu 13.2.0-4ubuntu3) 13.2.0, GNU ld (GNU Binutils for Ubuntu) 2.41) #28-Ubuntu SMP PREEMPT_DYNAMIC Thu Mar 7 18:21:00 UTC 2024

ubuntu@ubuntu:~/autotest-client-tests/ubuntu_performance_gpudirect_rdma/nvidia-peermem-test$ ./nvidia-peermem-test.sh -m peermem
Repository: 'Types: deb
URIs: https://ppa.launchpadcontent.net/canonical-nvidia/perftest+cuda/ubuntu/
Suites: mantic
Components: main
'
Description:
Used internal for kernel regression testing
More info: https://launchpad.net/~canonical-nvidia/+archive/ubuntu/perftest+cuda
Adding repository.
Found existing deb entry in /etc/apt/sources.list.d/canonical-nvidia-ubuntu-perftest_cuda-mantic.sources
Hit:1 http://archive.ubuntu.com/ubuntu mantic InRelease
Hit:2 http://archive.ubuntu.com/ubuntu mantic-updates InRelease
Hit:3 http://archive.ubuntu.com/ubuntu mantic-security InRelease
Hit:4 http://archive.ubuntu.com/ubuntu mantic-backports InRelease
Hit:5 http://archive.ubuntu.com/ubuntu mantic-proposed InRelease
Hit:6 https://ppa.launchpadcontent.net/canonical-nvidia/perftest+cuda/ubuntu mantic InRelease
Hit:7 https://ppa.launchpadcontent.net/dannf/dannf/ubuntu mantic InRelease
Reading package lists... Done
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
perftest is already the newest version (24.01.0+0.38-1+perftest+cuda.1~ubuntu23.10.1).
0 upgraded, 0 newly installed, 0 to remove and 10 not upgraded.
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
opensm is already the newest version (3.3.23-2).
0 upgraded, 0 newly installed, 0 to remove and 10 not upgraded.
--use_cuda= Use CUDA specific device for GPUDirect RDMA testing
Perftest doesn't supports CUDA tests with inline messages: inline size set to 0
* Waiting for client to connect... *
Perftest doesn't supports CUDA tests with inline messages: inline size set to 0
initializing CUDA
initializing CUDA
Listing all CUDA devices in system:
CUDA device 0: PCIe address is 07:00
CUDA device 1: PCIe address is 0F:00
CUDA device 2: PCIe address is 47:00
CUDA device 3: PCIe address is 4E:00
CUDA device 4: PCIe address is 87:00
CUDA device 5: PCIe address is 90:00
CUDA device 6: PCIe address is B7:00
CUDA device 7: PCIe address is BD:00
Picking device No. 1
[pid = 15582, dev = 1] device name = [NVIDIA A100-SXM4-40GB]
creating CUDA Ctx
Listing all CUDA devices in system:
CUDA device 0: PCIe address is 07:00
CUDA device 1: PCIe address is 0F:00
CUDA device 2: PCIe address is 47:00
CUDA device 3: PCIe address is 4E:00
CUDA device 4: PCIe address is 87:00
CUDA device 5: PCIe address is 90:00
CUDA device 6: PCIe address is B7:00
CUDA device 7: PCIe address is BD:00
Picking device No. 0
[pid = 15576, dev = 0] device name = [NVIDIA A100-SXM4-40GB]
creating CUDA Ctx
making it the current CUDA Ctx
cuMemAlloc() of a 16777216 bytes GPU buffer
allocated GPU buffer address at 7c014600 pointer=0x7c014600
---
RDMA_Write BW Test
Dual-port : OFF    Device : mlx5_6
Number of qps : 1    Transport type : IB
Connection type : RC    Using SRQ : OFF
PCIe relax order: ON
making it the current CUDA Ctx
cuMemAlloc() of a 16777216 bytes GPU buffer
allocated GPU buffer address at 7a08b400 pointer=0x7a08b400
---
RDMA_Write BW Test
Dual-port : OFF    Device : mlx5_2
Number of qps : 1    Transport type : IB
Connection type : RC    Using SRQ : OFF
PCIe relax order: ON
ibv_wr* API : ON
TX depth: 128
CQ Moderation : 100
Mtu : 4096[B]
Link type : IB
Max inline data : 0[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---
ibv_wr* API : ON
CQ Moderation : 100
Mtu : 4096[B]
Link type : IB
Max inline data : 0[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---
local address: LID 0x01 QPN 0x0029 PSN 0x6763b2 RKey 0x180eef VAddr 0x007a08b480
local address: LID 0x02 QPN 0x36ef PSN 0x149b7b RKey 0x180efd VAddr 0x007c014680
remote address: LID 0x02 QPN 0x36ef PSN 0x149b7b RKey 0x180efd VAddr 0x007c014680
---
#bytes #iterations BW peak[MB/sec] BW average[MB/sec] MsgRate[Mpps]
remote address: LID 0x01 QPN 0x0029 PSN 0x6763b2 RKey 0x180eef VAddr 0x007a08b480
-
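The verification above starts by confirming the running kernel with `cat /proc/version`. A minimal sketch of automating that check follows, assuming the fix is present from 6.5.0-27 onward in the mantic generic flavour (the version named in this bug). The function names are hypothetical, and the comparison is not meaningful for other flavours such as linux-raspi, which use a different ABI numbering.

```python
import re

# "6.5.0-27-generic" -> numbers (6, 5, 0, 27), flavour "generic"
_RELEASE_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)-(\d+)(?:-([\w.+]+))?$")


def running_release(proc_version):
    """Extract the release string from the contents of /proc/version."""
    match = re.match(r"Linux version (\S+)", proc_version)
    if match is None:
        raise ValueError("unrecognised /proc/version format")
    return match.group(1)


def split_release(release):
    """Split a release string into (version numbers, flavour or None)."""
    match = _RELEASE_RE.match(release)
    if match is None:
        raise ValueError(f"unrecognised release string: {release!r}")
    numbers = tuple(int(part) for part in match.group(1, 2, 3, 4))
    return numbers, match.group(5)


def at_least(proc_version, fixed="6.5.0-27"):
    """True if the running generic kernel is at or above the fixed release."""
    numbers, flavour = split_release(running_release(proc_version))
    if flavour != "generic":
        # Assumption: only the generic flavour shares this ABI numbering.
        raise ValueError("comparison only defined for the generic flavour")
    fixed_numbers, _ = split_release(fixed)
    return numbers >= fixed_numbers


if __name__ == "__main__":
    sample = "Linux version 6.5.0-27-generic (buildd@lcy02-amd64-059) #28-Ubuntu SMP"
    print(running_release(sample), at_least(sample))
```

On a live system the input string would come from reading /proc/version instead of the sample literal.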
[Bug 2055082] Re: IB peer memory feature regressed in 6.5
This bug is awaiting verification that the linux/6.5.0-27.28 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results.

If the problem is solved, change the tag 'verification-needed-mantic-linux' to 'verification-done-mantic-linux'. If the problem still exists, change the tag 'verification-needed-mantic-linux' to 'verification-failed-mantic-linux'.

If verification is not done within 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

** Tags added: kernel-spammed-mantic-linux-v2 verification-needed-mantic-linux
[Bug 2055082] Re: IB peer memory feature regressed in 6.5
** Changed in: linux (Ubuntu Mantic)
   Importance: Undecided => Medium

** Changed in: linux (Ubuntu Mantic)
       Status: In Progress => Fix Committed