It looks like the service is failing because your controller is
resetting, which appears to take several minutes. I'm not sure how the
nvme-cli tools are designed to handle such a long reset, but my first
guess would be to increase the kernel rport timeout, which appears from
the log output to be around 30 seconds. For your hardware, it seems
that timeout should be more than 180 seconds.
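
As a rough sketch of what raising that timeout could look like: the
Linux FC transport class exposes a per-rport dev_loss_tmo attribute
under /sys/class/fc_remote_ports. The helper below writes a new value
to every rport found there. The function name and the 300-second value
are only illustrative. It is also an assumption on my part that this
attribute governs the 30-second rport removal in the log, and the log
does not settle whether it affects the separate "dev_loss_tmo (60)"
reported by the nvme driver.

```python
from pathlib import Path

# Standard FC transport class directory. It is a parameter so the helper
# can be tried against a scratch directory first.
DEFAULT_SYSFS_ROOT = "/sys/class/fc_remote_ports"


def set_dev_loss_tmo(seconds, sysfs_root=DEFAULT_SYSFS_ROOT):
    """Write `seconds` to the dev_loss_tmo attribute of every FC rport.

    Returns the list of attribute paths that were updated. Assumption:
    this attribute controls the rport timeout seen in the log. Writing
    to the real sysfs tree needs root, and any OSError is left to
    propagate to the caller.
    """
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")

    updated = []
    for attr in sorted(Path(sysfs_root).glob("rport-*/dev_loss_tmo")):
        attr.write_text(f"{seconds}\n")
        updated.append(str(attr))
    return updated
```

Calling set_dev_loss_tmo(300) as root would apply the value to the
rports that currently exist. Values written through sysfs do not
survive a reboot, so a persistent setting would need a udev rule or a
driver option instead.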

Apr 07 11:45:10 ICTM1608S01H1 root[2894793]: JD: Resetting controller A
Apr 07 11:45:28 ICTM1608S01H1 kernel: lpfc 0000:af:00.1: 5:(0):6172 NVME rescanned DID x3d0a00 port_state x2
Apr 07 11:45:28 ICTM1608S01H1 kernel: lpfc 0000:18:00.1: 1:(0):6172 NVME rescanned DID x3d0a00 port_state x2
Apr 07 11:45:28 ICTM1608S01H1 kernel: nvme nvme5: NVME-FC{4}: controller connectivity lost. Awaiting Reconnect
Apr 07 11:45:28 ICTM1608S01H1 kernel: nvme nvme1: NVME-FC{0}: controller connectivity lost. Awaiting Reconnect
Apr 07 11:45:28 ICTM1608S01H1 systemd-udevd[2895178]: fc_udev_device: Process 'systemctl --no-block start nvmf-connect@--device=none\t--transp>
Apr 07 11:45:28 ICTM1608S01H1 systemd-udevd[2895178]: fc_udev_device: Process 'systemctl --no-block start nvmf-connect@--device=none\t--transp>
Apr 07 11:45:28 ICTM1608S01H1 kernel: nvme nvme5: NVME-FC{4}: io failed due to lldd error 6
Apr 07 11:45:28 ICTM1608S01H1 kernel: nvme nvme1: NVME-FC{0}: io failed due to lldd error 6
Apr 07 11:45:29 ICTM1608S01H1 kernel: lpfc 0000:af:00.0: 4:(0):6172 NVME rescanned DID x011400 port_state x2
Apr 07 11:45:29 ICTM1608S01H1 kernel: lpfc 0000:18:00.0: 0:(0):6172 NVME rescanned DID x011400 port_state x2
Apr 07 11:45:29 ICTM1608S01H1 kernel: nvme nvme4: NVME-FC{1}: controller connectivity lost. Awaiting Reconnect
Apr 07 11:45:29 ICTM1608S01H1 kernel: nvme nvme8: NVME-FC{5}: controller connectivity lost. Awaiting Reconnect
Apr 07 11:45:29 ICTM1608S01H1 systemd-udevd[2895178]: fc_udev_device: Process 'systemctl --no-block start nvmf-connect@--device=none\t--transp>
Apr 07 11:45:29 ICTM1608S01H1 systemd-udevd[2895178]: fc_udev_device: Process 'systemctl --no-block start nvmf-connect@--device=none\t--transp>
Apr 07 11:45:29 ICTM1608S01H1 kernel: nvme nvme4: NVME-FC{1}: io failed due to lldd error 6
Apr 07 11:45:29 ICTM1608S01H1 kernel: nvme nvme8: NVME-FC{5}: io failed due to lldd error 6
Apr 07 11:45:59 ICTM1608S01H1 kernel:  rport-10:0-9: blocked FC remote port time out: removing rport
Apr 07 11:45:59 ICTM1608S01H1 kernel:  rport-16:0-9: blocked FC remote port time out: removing rport
Apr 07 11:45:59 ICTM1608S01H1 kernel:  rport-15:0-9: blocked FC remote port time out: removing rport
Apr 07 11:45:59 ICTM1608S01H1 kernel:  rport-12:0-9: blocked FC remote port time out: removing rport
Apr 07 11:46:28 ICTM1608S01H1 kernel: nvme nvme5: NVME-FC{4}: dev_loss_tmo (60) expired while waiting for remoteport connectivity.
Apr 07 11:46:28 ICTM1608S01H1 kernel: nvme nvme5: Removing ctrl: NQN "nqn.1992-08.com.netapp:5700.600a098000d8580e000000005c0136a2"
Apr 07 11:46:28 ICTM1608S01H1 kernel: nvme nvme1: NVME-FC{0}: dev_loss_tmo (60) expired while waiting for remoteport connectivity.
Apr 07 11:46:28 ICTM1608S01H1 kernel: nvme nvme1: Removing ctrl: NQN "nqn.1992-08.com.netapp:5700.600a098000d8580e000000005c0136a2"
Apr 07 11:46:29 ICTM1608S01H1 kernel: nvme nvme4: NVME-FC{1}: dev_loss_tmo (60) expired while waiting for remoteport connectivity.
Apr 07 11:46:29 ICTM1608S01H1 kernel: nvme nvme4: Removing ctrl: NQN "nqn.1992-08.com.netapp:5700.600a098000d8580e000000005c0136a2"
Apr 07 11:46:29 ICTM1608S01H1 kernel: nvme nvme8: NVME-FC{5}: dev_loss_tmo (60) expired while waiting for remoteport connectivity.
Apr 07 11:46:29 ICTM1608S01H1 kernel: nvme nvme8: Removing ctrl: NQN "nqn.1992-08.com.netapp:5700.600a098000d8580e000000005c0136a2"
Apr 07 11:47:07 ICTM1608S01H1 systemd-udevd[2896874]: fc_udev_device: Process 'systemctl --no-block start nvmf-connect@--device=none\t--transp>
Apr 07 11:47:07 ICTM1608S01H1 systemd-udevd[2896874]: fc_udev_device: Process 'systemctl --no-block start nvmf-connect@--device=none\t--transp>
Apr 07 11:47:08 ICTM1608S01H1 systemd-udevd[2896872]: fc_udev_device: Process 'systemctl --no-block start nvmf-connect@--device=none\t--transp>
Apr 07 11:47:08 ICTM1608S01H1 systemd-udevd[2896874]: fc_udev_device: Process 'systemctl --no-block start nvmf-connect@--device=none\t--transp>
Apr 07 11:49:56 ICTM1608S01H1 root[2899783]: JD: Controller A online
Apr 07 11:50:04 ICTM1608S01H1 root[2899924]: nvme-subsys0 - NQN=nqn.1992-08.com.netapp:5700.600a098000d8580e000000005c0136a2
Apr 07 11:50:04 ICTM1608S01H1 root[2899924]: \
Apr 07 11:50:04 ICTM1608S01H1 root[2899924]:  +- nvme2 fc traddr=nn-0x200200a098d8580e:pn-0x202300a098d8580e host_traddr=nn-0x20000090fadcc5ce>
Apr 07 11:50:04 ICTM1608S01H1 root[2899924]:  +- nvme3 fc traddr=nn-0x200200a098d8580e:pn-0x201300a098d8580e host_traddr=nn-0x200000109b8f2b8d>
Apr 07 11:50:04 ICTM1608S01H1 root[2899924]:  +- nvme6 fc traddr=nn-0x200200a098d8580e:pn-0x202300a098d8580e host_traddr=nn-0x200000109b8f2b8e>
Apr 07 11:50:04 ICTM1608S01H1 root[2899924]:  +- nvme7 fc traddr=nn-0x200200a098d8580e:pn-0x201300a098d8580e host_traddr=nn-0x20000090fadcc5cd>
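
To make the timing in the log above easier to follow, here is a small
sketch that computes the intervals between the key events. The
timestamps are copied from the log. The event names and the helper
function are my own labels, not anything the tools emit.

```python
from datetime import datetime

# Time-of-day stamps copied from the log (all on Apr 07).
EVENTS = {
    "reset_started": "11:45:10",      # "JD: Resetting controller A"
    "connectivity_lost": "11:45:28",  # first "controller connectivity lost"
    "rport_removed": "11:45:59",      # "blocked FC remote port time out"
    "dev_loss_expired": "11:46:28",   # first "dev_loss_tmo (60) expired"
    "controller_online": "11:49:56",  # "JD: Controller A online"
}


def seconds_between(start, end):
    """Return the whole seconds elapsed between two named log events."""
    fmt = "%H:%M:%S"
    t0 = datetime.strptime(EVENTS[start], fmt)
    t1 = datetime.strptime(EVENTS[end], fmt)
    return int((t1 - t0).total_seconds())


for start, end in [
    ("connectivity_lost", "rport_removed"),
    ("connectivity_lost", "dev_loss_expired"),
    ("connectivity_lost", "controller_online"),
    ("reset_started", "controller_online"),
]:
    print(f"{start} -> {end}: {seconds_between(start, end)} s")
```

By these timestamps the rports are removed 31 seconds after
connectivity is lost, the nvme controllers are removed after 60
seconds, and controller A only reports online after 268 seconds. For
this particular reset, a timeout would therefore need to exceed roughly
270 seconds to keep the connections alive.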

https://bugs.launchpad.net/bugs/1874270

Title:
  NVMe/FC connections fail to reestablish after controller is reset
