An update: the patch has been released to the -proposed pocket and is available in kernel 4.4.0-1075-aws. To enable the proposed repository, please see https://wiki.ubuntu.com/Testing/EnableProposed. The plan is to release this kernel in the first week of February, once all tests and validations of the proposed package have finished.
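
To check whether a machine is already running a kernel that carries the patch, something like the following could be used. This is only a sketch: the helper names are made up for illustration, and it assumes that any 4.4.0 aws kernel with an ABI number of 1075 or higher includes the change. Only 4.4.0-1075-aws itself is confirmed here.

```python
import re

# Release named in this message as carrying the patch.
PATCHED_RELEASE = "4.4.0-1075-aws"


def parse_release(release):
    """Split a string like '4.4.0-1075-aws' into ((4, 4, 0), 1075, 'aws')."""
    m = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)-(\d+)-(\w+)", release)
    if m is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    major, minor, patch, abi, flavour = m.groups()
    return (int(major), int(minor), int(patch)), int(abi), flavour


def has_polling_patch(release):
    """Assumption: same base version and flavour, ABI >= 1075, means patched."""
    version, abi, flavour = parse_release(release)
    p_version, p_abi, p_flavour = parse_release(PATCHED_RELEASE)
    return version == p_version and flavour == p_flavour and abi >= p_abi
```

On a running system the input would be the output of `uname -r`, or `platform.release()` from Python. Other series and flavours are reported as not patched simply because this sketch knows nothing about them.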
With this kernel, when a timeout occurs the driver polls the completion queue to check whether the I/O that timed out has in fact already completed. Our tests showed that, for this bug, the completion is there, which seems to indicate a missed interrupt. So a kernel with the patch mitigates the effects of the timeouts: they no longer lead to aborts. The following message will be observed in dmesg:

[39630.417191] nvme 0000:00:04.0: I/O 0 QID 2 timeout, completion polled

Thanks,
Guilherme

--
https://bugs.launchpad.net/bugs/1788035
Title: nvme: avoid cqe corruption
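
P.S. For anyone who wants to count how often the mitigation kicks in, here is a rough sketch that picks the "completion polled" lines out of saved dmesg output. The regular expression is modelled only on the single sample line quoted in this message, and the function name is made up for illustration, so other kernels may format the line differently.

```python
import re

# Pattern modelled on the one sample line quoted in this message (assumption).
POLLED_RE = re.compile(
    r"nvme (?P<device>\S+): I/O (?P<tag>\d+) QID (?P<qid>\d+) "
    r"timeout, completion polled"
)


def find_polled_timeouts(log_text):
    """Return (device, io_tag, queue_id) for each 'completion polled' line."""
    events = []
    for line in log_text.splitlines():
        m = POLLED_RE.search(line)
        if m:
            events.append(
                (m.group("device"), int(m.group("tag")), int(m.group("qid")))
            )
    return events


sample = (
    "[39630.417191] nvme 0000:00:04.0: "
    "I/O 0 QID 2 timeout, completion polled"
)
print(find_polled_timeouts(sample))  # [('0000:00:04.0', 0, 2)]
```

The text passed in would typically be the captured output of `dmesg`. Timeout lines that do not end in "completion polled" are ignored by design.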