** Description changed:

  [Impact]
  Ubuntu 22.04 host fails to reconnect successfully to the NVMe TCP target
  after link down event if the number of queues have changed post link down.

  [Fix]
  Following upstream patch set helps address the issue.

  1. nvmet: Expose max queues to configfs
  https://git.infradead.org/nvme.git/commit/2c4282742d049e2a5ab874e2b359a2421b9377c2

  2. nvme-tcp: Handle number of queue changes
  https://git.infradead.org/nvme.git/commit/516204e486a19d03962c2757ef49782e6c1cacf4

  3. nvme-rdma: Handle number of queue changes
  https://git.infradead.org/nvme.git/commit/e800278c1dc97518eab1970f8f58a5aad52b0f86

  The patch in Point 2 above helps address the failure to reconnect in
  NVMe TCP scenario. Also, following patch addresses error code parsing
  issue in the reconnect sequence.

  nvme-fabrics: parse nvme connect Linux error codes
  https://git.infradead.org/nvme.git/commit/ec9e96b5230148294c7abcaf3a4c592d3720b62d

  [Test Plan]

+ 1. Boot into Ubuntu 22.04 kernel without fix.
- 1. Boot into Ubuntu 22.04 kernel without fix.

+ 2. Establish connection to NVMe TCP target.
- 2. Establish connection to powerstore and create more than 70 NVMe
- controllers ( >64 controllers)

+ 3. Toggle NIC link and bring link up after 10 seconds. When the NIC
+ link is down, on the target increase the number of queues assigned to
+ the controller.
- nvme connect -t tcp -a <target address> -n <target nqn> -D
-
- Observe that nvme controllers > 64 get assigned 8 queues.
-
- 3. Delete few controllers so that total number of controllers becomes <
- 64. This results in higher number of queues becoming available to
- remaining NVMe controllers.
-
- nvme disconnect -d <nvme controller>
-
- 4. Toggle NIC link and bring link up after 10 seconds.
-
- 5. Observe that connection to target is lost and after link comes up,
+ 4. Observe that connection to target is lost and after link comes up,
  controller from host tries to re-establish connection.

- 6. With patch, reconnection succeeds with higher number of queues.
+ 5. With patch, reconnection succeeds with higher number of queues

  [Where problems could occur]
  Regression risk is low to medium.

  [Other Info]
  Test Kernel Source
  https://code.launchpad.net/~mreed8855/ubuntu/+source/linux/+git/jammy/+ref/lp_1989990_nvme_tcp
** Information type changed from Private to Public

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1989990

Title:
  [SRU] Ubuntu 22.04 - NVMe TCP - Host fails to reconnect to target
  after link down/link up sequence

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Jammy:
  In Progress

Bug description:
  [Impact]
  An Ubuntu 22.04 host fails to reconnect to the NVMe TCP target after a
  link down event if the number of queues has changed while the link
  was down.

  [Fix]
  The following upstream patch set helps address the issue.

  1. nvmet: Expose max queues to configfs
  https://git.infradead.org/nvme.git/commit/2c4282742d049e2a5ab874e2b359a2421b9377c2

  2. nvme-tcp: Handle number of queue changes
  https://git.infradead.org/nvme.git/commit/516204e486a19d03962c2757ef49782e6c1cacf4

  3. nvme-rdma: Handle number of queue changes
  https://git.infradead.org/nvme.git/commit/e800278c1dc97518eab1970f8f58a5aad52b0f86

  The patch in point 2 above addresses the failure to reconnect in the
  NVMe TCP scenario. In addition, the following patch addresses an error
  code parsing issue in the reconnect sequence.

  nvme-fabrics: parse nvme connect Linux error codes
  https://git.infradead.org/nvme.git/commit/ec9e96b5230148294c7abcaf3a4c592d3720b62d

  [Test Plan]

  1. Boot into the Ubuntu 22.04 kernel without the fix.

  2. Establish a connection to the NVMe TCP target.

  3. Toggle the NIC link and bring the link up after 10 seconds. While
  the NIC link is down, increase the number of queues assigned to the
  controller on the target.

  4. Observe that the connection to the target is lost and that, after
  the link comes up, the controller on the host tries to re-establish
  the connection.

  5. With the patch, reconnection succeeds with the higher number of
  queues.

  [Where problems could occur]
  Regression risk is low to medium.
  [Other Info]
  Test Kernel Source
  https://code.launchpad.net/~mreed8855/ubuntu/+source/linux/+git/jammy/+ref/lp_1989990_nvme_tcp
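
The host-side part of the [Test Plan] (connect, toggle the NIC link, wait 10 seconds, bring the link back) could be scripted roughly as below. This is only a sketch under assumptions, not part of the original report: the target address, NQN and interface name are placeholders, the `run` and `reconnect_test` helpers are hypothetical names, and the queue-count change on the target still has to be made by hand while the link is down. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch of the host-side test-plan steps. All values are placeholders
# (assumptions); override them through the environment for a real run.
TARGET_ADDR="${TARGET_ADDR:-192.0.2.10}"                 # hypothetical target IP
TARGET_NQN="${TARGET_NQN:-nqn.2022-09.example:subsys1}"  # hypothetical NQN
NIC="${NIC:-eth0}"                                       # hypothetical interface
LINK_DOWN_SECS="${LINK_DOWN_SECS:-10}"                   # from the test plan
DRY_RUN="${DRY_RUN:-1}"                                  # 1 = print commands only

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reconnect_test() {
    # Step 2: establish the connection to the NVMe TCP target.
    run nvme connect -t tcp -a "$TARGET_ADDR" -n "$TARGET_NQN"

    # Step 3: take the link down. While it is down, increase the number
    # of queues on the target (manual, target-specific step).
    run ip link set dev "$NIC" down
    run sleep "$LINK_DOWN_SECS"
    run ip link set dev "$NIC" up

    # Steps 4-5: inspect the controller state after the link returns.
    run nvme list-subsys
}

reconnect_test
```

Running it with `DRY_RUN=0` as root would execute the commands for real; the reconnect attempts themselves should then be visible in the kernel log on the host.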