On Sat, Mar 20, 2010 at 05:03:16PM -0400, Steven Bellovin wrote:
 > > Let me see if I can find my first note on the subject -- it might
 > > give a clue about the date of any changes.
 >
 > Turns out that I sendpr-ed it in September: kern/42104.

I even responded to the PR, not that I had any useful ideas at the
time.

That sounds like maybe the problem is not on the suspend side but on
the resume side, that is, that stuff is being written out before (some
layer of) the disk subsystem is ready to go again. With vanilla FFS
such writes should be synchronous so it should be (relatively) easy to
figure out what's going on. Do you feel like trying out dtrace? :-)

On the other hand, if fsck thinks the inode for a named pipe is
unallocated (or particularly, has duplicate blocks, since pipes
shouldn't have blocks at all)... that means that whatever went wrong
went wrong when the pipe was created, not when something exited and
removed it. And with vanilla FFS, those are synchronous writes and
they should happen in quick succession; if the inode didn't get
written but the directory did, something's more badly wrong than just
the disk not being ready yet.

And I strongly suspect that the pipe creation isn't tied to
suspending, that is, the pipe should have been created long before you
suspended and should not in general be removed and recreated by
suspending. And that means either something is severely wrong in
general and you're only seeing it after crashing due to suspend (which
is possible, but seems not too likely) or the suspend cycle is
actively writing garbage and corrupting the fs.

Meanwhile, getting traps while dumping is Very Strange (TM). Do we
have any kind of debug code that can checksum memory before and after
the suspend? I wonder if something ACPI-related is garbaging memory.

-- 
David A. Holland
dholl...@netbsd.org