On Sun, Jan 04, 2026 at 10:34:55AM -0700, Theo de Raadt wrote:
> Mark Kettenis <[email protected]> wrote:
>
> > I'm 100% sure that I am booting the correct kernel. The checksum
> > calculated by that code above is the same. But for some reason the
> > checksum that we read back from the hibernation info on disk is
> > all-zeroes. So something is going wrong. Will dig deeper when I have
> > time.
>
> Is it just the checksum field --- or has the signature sector not
> actually made it onto disk?
>
> There is this messy thing in subr_hibernate.c around 1954
>
> /* Allow the disk to settle */
> delay(500000);
>
> A few days ago I asked Mike about this again. Apparently this was a workaround
> for an old system, and we should not do it anymore. That was probably ahci.
> But why did we need it back then?

This was added well before ahci hibernate was working at all, so
it must have been for wdc.

>
> These new systems are nvme. Do we have a situation where the last hibernate
> write operation gets skipped in subr_hibernate.c, or do we have low-level
> side-effect-free io functions which don't do their job? Is
> nvme_hibernate_io() failing the last write to disk?

Looking at the nvme shutdown code again, I realise we're not deleting
the hibernate i/o queue, which we're supposed to do as part of the
normal shutdown procedure. Perhaps without that the controller isn't
flushing all the data out to non-volatile storage. We don't issue a
flush command after the last hibernate write, but we shouldn't have
to.

Maybe this will help? (only compile tested)

Index: nvme.c
===================================================================
RCS file: /cvs/src/sys/dev/ic/nvme.c,v
diff -u -p -r1.125 nvme.c
--- nvme.c 16 Dec 2025 00:24:55 -0000 1.125
+++ nvme.c 5 Jan 2026 04:17:41 -0000
@@ -557,6 +574,12 @@ nvme_shutdown(struct nvme_softc *sc)
 		printf("%s: unable to delete q, disabling\n", DEVNAME(sc));
 		goto disable;
 	}
+#ifdef HIBERNATE
+	if (nvme_q_delete(sc, sc->sc_hib_q) != 0) {
+		printf("%s: unable to delete hib q, disabling\n", DEVNAME(sc));
+		goto disable;
+	}
+#endif
 	cc = nvme_read4(sc, NVME_CC);
 	CLR(cc, NVME_CC_SHN_MASK);