Hi,

We tried to reproduce the issue with kernel 'linux-image-unsigned-5.15.0-59-generic_5.15.0-59.65_amd64.deb' from the -proposed repository. The issue is not observed.
After the link down/up sequence, nvme controllers 4, 68, 69 and 70 reconnect successfully.

[ 793.550538] nvme nvme4: queue 0: timeout request 0x0 type 4
[ 793.550544] nvme nvme4: starting error recovery
[ 793.552141] nvme nvme4: failed nvme_keep_alive_end_io error=10
[ 793.567947] nvme nvme4: Reconnecting in 10 seconds...
[ 794.574539] nvme nvme70: queue 0: timeout request 0x0 type 4
[ 794.574543] nvme nvme70: starting error recovery
[ 794.574544] nvme nvme68: queue 0: timeout request 0x0 type 4
[ 794.574548] nvme nvme69: queue 0: timeout request 0x0 type 4
[ 794.574549] nvme nvme68: starting error recovery
[ 794.574550] nvme nvme69: starting error recovery
[ 794.574768] nvme nvme70: failed nvme_keep_alive_end_io error=10
[ 794.574793] nvme nvme69: failed nvme_keep_alive_end_io error=10
[ 794.574877] nvme nvme68: failed nvme_keep_alive_end_io error=10
[ 794.591403] nvme nvme70: Reconnecting in 10 seconds...
[ 794.591628] nvme nvme69: Reconnecting in 10 seconds...
[ 794.594555] nvme nvme68: Reconnecting in 10 seconds...
[ 796.631586] IPv6: ADDRCONF(NETDEV_CHANGE): eno33np0: link becomes ready
[ 803.632108] nvme nvme4: creating 64 I/O queues.
[ 803.668542] nvme nvme4: mapped 64/0/0 default/read/poll queues.
[ 803.671517] nvme nvme4: Successfully reconnected (1 attempt)
[ 804.655794] nvme nvme70: queue_size 128 > ctrl sqsize 64, clamping down
[ 804.655886] nvme nvme70: creating 64 I/O queues.
[ 804.655961] nvme nvme68: queue_size 128 > ctrl sqsize 64, clamping down
[ 804.655994] nvme nvme69: queue_size 128 > ctrl sqsize 64, clamping down
[ 804.656042] nvme nvme68: creating 64 I/O queues.
[ 804.656043] nvme nvme69: creating 64 I/O queues.
[ 804.669742] nvme nvme69: mapped 64/0/0 default/read/poll queues.
[ 804.669761] nvme nvme70: mapped 64/0/0 default/read/poll queues.
[ 804.669773] nvme nvme68: mapped 64/0/0 default/read/poll queues.
[ 804.685893] nvme nvme70: Successfully reconnected (1 attempt)
[ 804.702605] nvme nvme69: Successfully reconnected (1 attempt)
[ 804.722602] nvme nvme68: Successfully reconnected (1 attempt)

** Tags removed: verification-needed-jammy
** Tags added: verification-done-jammy

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1989990

Title:
  [SRU] Ubuntu 22.04 - NVMe TCP - Host fails to reconnect to target
  after link down/link up sequence

Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Jammy:
  In Progress

Bug description:
  [Impact]
  An Ubuntu 22.04 host fails to reconnect to the NVMe TCP target after a
  link down event if the number of queues has changed while the link
  was down.

  [Fix]
  The following upstream patch set helps address the issue.

  1. nvmet: Expose max queues to configfs
  https://git.infradead.org/nvme.git/commit/2c4282742d049e2a5ab874e2b359a2421b9377c2

  2. nvme-tcp: Handle number of queue changes
  https://git.infradead.org/nvme.git/commit/516204e486a19d03962c2757ef49782e6c1cacf4

  3. nvme-rdma: Handle number of queue changes
  https://git.infradead.org/nvme.git/commit/e800278c1dc97518eab1970f8f58a5aad52b0f86

  The patch in point 2 above helps address the failure to reconnect in
  the NVMe TCP scenario.

  In addition, the following patch addresses an error code parsing issue
  in the reconnect sequence.

  nvme-fabrics: parse nvme connect Linux error codes
  https://git.infradead.org/nvme.git/commit/ec9e96b5230148294c7abcaf3a4c592d3720b62d

  [Test Plan]
  1. Boot into the Ubuntu 22.04 kernel without the fix.
  2. Establish a connection to the NVMe TCP target.
  3. Toggle the NIC link and bring the link up after 10 seconds. While
     the NIC link is down, increase the number of queues assigned to the
     controller on the target.
  4. Observe that the connection to the target is lost and that, after
     the link comes up, the controller on the host tries to re-establish
     the connection.
  5. With the patch, reconnection succeeds with the higher number of
     queues.

  [Where problems could occur]
  Regression risk is low to medium.

  [Other Info]
  Test Kernel Source
  https://code.launchpad.net/~mreed8855/ubuntu/+source/linux/+git/jammy/+ref/lp_1989990_nvme_tcp

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1989990/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
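
For anyone repeating the verification, the check made by eye on the dmesg excerpt in this message (every controller that logs "starting error recovery" later logs "Successfully reconnected") could be scripted roughly as follows. This is only a sketch: the function name `reconnect_summary` and the sample text are hypothetical, and the regular expressions assume the exact message wording seen in the log quoted in this message.

```python
import re

# Patterns assume the message wording seen in the dmesg excerpt in this thread.
START_RE = re.compile(r"\[\s*(\d+\.\d+)\] nvme (nvme\d+): starting error recovery")
DONE_RE = re.compile(
    r"\[\s*(\d+\.\d+)\] nvme (nvme\d+): Successfully reconnected \((\d+) attempt"
)


def reconnect_summary(dmesg_text):
    """Map each controller that started error recovery to
    (seconds until reconnect, attempts), or None if no reconnect was logged.

    Hypothetical helper, not part of any existing tool.
    """
    # If a controller starts recovery more than once, the last start wins.
    started = {m.group(2): float(m.group(1)) for m in START_RE.finditer(dmesg_text)}
    summary = {ctrl: None for ctrl in started}
    for m in DONE_RE.finditer(dmesg_text):
        ts, ctrl, attempts = float(m.group(1)), m.group(2), int(m.group(3))
        if ctrl in started and ts >= started[ctrl]:
            summary[ctrl] = (round(ts - started[ctrl], 3), attempts)
    return summary


# Illustrative sample: nvme4 lines are taken from the log in this message;
# nvme70 is deliberately left without a reconnect line to show the None case.
SAMPLE = """\
[ 793.550544] nvme nvme4: starting error recovery
[ 793.567947] nvme nvme4: Reconnecting in 10 seconds...
[ 794.574543] nvme nvme70: starting error recovery
[ 803.671517] nvme nvme4: Successfully reconnected (1 attempt)
"""

if __name__ == "__main__":
    for ctrl, result in sorted(reconnect_summary(SAMPLE).items()):
        print(ctrl, "reconnected" if result else "did NOT reconnect", result)
```

Feeding it the text captured from dmesg after the link toggle should list any controller that never logged a successful reconnect as None, which would be the failing case described in the bug.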