On 25/11/2017 12:54, Scott Long wrote:
> Why is overloading EIO so bad? brelse() will call bdirty() when a BIO_WRITE
> command has failed with EIO. Calling bdirty() has the effect of retrying the
> I/O. This disregards the fact that disk drivers only return EIO when they’ve
> decided that the I/O cannot be retried. It has no termination condition for
> the retries, and will endlessly retry I/O in vain; I’ve seen this quite
> frequently. It also disregards the fact that I/O marked as B_PAGING can’t be
> retried in this fashion, and will trigger a panic. Because we pretend that
> EIO can be retried, we are left with a system that is very fragile when I/O
> actually does fail. Instead of adding more special cases and blurred lines, I
> want to go back to enforcing strict contracts between the layers and force
> the core parts of the system to respect those contracts and handle errors
> properly, instead of just retrying and hoping for the best.

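To make the quoted failure mode concrete, here is a minimal userspace sketch, not kernel code. Every name in it (model_buf, model_write, release_redirty, release_final, flush) is hypothetical and exists only for illustration; the only real identifier is EIO from <errno.h>. It assumes a driver that has already given up and fails every write, and compares re-dirtying the buffer on error with treating the error as final.

```c
/* Userspace model only; none of this is FreeBSD kernel code. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer; not the kernel's struct buf. */
struct model_buf {
    bool dirty;     /* still needs to be written out */
    bool invalid;   /* contents given up on after a hard failure */
    int  error;     /* last error reported by the driver, 0 if none */
    int  attempts;  /* number of writes issued so far */
};

/* Hypothetical driver that has given up: every write fails with EIO. */
static int model_write(struct model_buf *bp)
{
    assert(bp != NULL);
    bp->attempts++;
    return EIO;
}

/* Policy A: re-dirty the buffer on error, i.e. treat EIO as retriable. */
static void release_redirty(struct model_buf *bp, int error)
{
    bp->error = error;
    bp->dirty = (error != 0);
}

/* Policy B: treat EIO as final; record the error and stop retrying. */
static void release_final(struct model_buf *bp, int error)
{
    bp->error = error;
    bp->dirty = false;
    bp->invalid = (error != 0);
}

/*
 * Write the buffer until it is clean. The cap is an artifact of the model
 * so that it terminates; policy A has no stopping condition of its own.
 * Returns the number of writes issued.
 */
static int flush(struct model_buf *bp,
                 void (*release)(struct model_buf *, int), int cap)
{
    while (bp->dirty && bp->attempts < cap)
        release(bp, model_write(bp));
    return bp->attempts;
}
```

In this model, flushing a dirty buffer with release_redirty issues writes until the artificial cap is hit and leaves the buffer dirty, while release_final issues exactly one write and reports EIO to the caller.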
I agree with your intention. But let's not project what I consider to be a bug in the buffer cache code onto all consumers of the bio/GEOM interface. In fact, I am quite surprised that there is any code that treats EIO as retriable. I don't know of any other such code except for specialized disk recovery tools.

-- 
Andriy Gapon
_______________________________________________
[email protected] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-geom
