> What is the bad CSTS bit? CSTS.RDY?
The reset will be triggered by the result of nvme_should_reset():
static bool nvme_should_reset(struct nvme_dev *dev, u32 csts)
{

⇥/* If true, indicates loss of adapter communication, possibly by a
⇥ * NVMe Subsystem reset.
⇥ */
⇥bool nssro = dev->subsystem && (csts & NVME_CSTS_NSSRO);
This csts value is read from the controller's CSTS register in nvme_timeout():
static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
{
...
⇥u32 csts = readl(dev->bar + NVME_REG_CSTS);
...
⇥/*
⇥ * Reset immediately if the controller is failed
⇥ */
⇥if (nvme_should_reset(dev, csts)) {
⇥⇥nvme_warn_reset(dev, csts);
⇥⇥nvme_dev_disable(dev, false);
⇥⇥nvme_reset_ctrl(&dev->ctrl);
Again, here's the message printed by nvme_warn_reset:
Aug 26 15:01:27 testhost kernel: nvme nvme4: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
From include/linux/nvme.h:
⇥NVME_REG_CSTS⇥= 0x001c,⇥/* Controller Status */
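
To answer the question directly, the CSTS=0x3 from the log can be decoded by hand. Below is a minimal userspace sketch, assuming the CSTS bit layout from the NVMe specification. The CSTS_* macros are redefined here so the snippet builds outside the kernel (they are meant to mirror the kernel's NVME_CSTS_* values), and csts_print() and the other helper names are hypothetical, not kernel functions:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed CSTS bit layout, per the NVMe specification. */
#define CSTS_RDY	(1u << 0)	/* Ready */
#define CSTS_CFS	(1u << 1)	/* Controller Fatal Status */
#define CSTS_SHST_MASK	(3u << 2)	/* Shutdown Status (2-bit field) */
#define CSTS_NSSRO	(1u << 4)	/* NVM Subsystem Reset Occurred */
#define CSTS_PP		(1u << 5)	/* Processing Paused */

/* Hypothetical helpers: each one tests a single field of a raw CSTS value. */
static bool csts_ready(uint32_t csts)
{
	return (csts & CSTS_RDY) != 0;
}

static bool csts_fatal(uint32_t csts)
{
	return (csts & CSTS_CFS) != 0;
}

static unsigned int csts_shst(uint32_t csts)
{
	return (csts & CSTS_SHST_MASK) >> 2;
}

static bool csts_nssro(uint32_t csts)
{
	return (csts & CSTS_NSSRO) != 0;
}

static bool csts_paused(uint32_t csts)
{
	return (csts & CSTS_PP) != 0;
}

/* Print every decoded field of a raw CSTS value on one line. */
void csts_print(uint32_t csts)
{
	printf("CSTS=0x%x: RDY=%d CFS=%d SHST=%u NSSRO=%d PP=%d\n",
	       (unsigned int)csts, csts_ready(csts), csts_fatal(csts),
	       csts_shst(csts), csts_nssro(csts), csts_paused(csts));
}
```

Calling csts_print(0x3) reports RDY and CFS set and NSSRO clear. Under that layout the bit that stands out is CFS (Controller Fatal Status), not RDY.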
- Tyler