On Wed, Feb 10, 2021, 9:26 PM Keith Busch <kbu...@kernel.org> wrote:

> On Thu, Feb 11, 2021 at 12:38:48PM +0900, Minwoo Im wrote:
> > On 21-02-11 12:00:11, Keith Busch wrote:
> > > But I would prefer to see advanced retry tied to real errors that can be
> > > retried, like if we got an EBUSY or EAGAIN errno or something like that.
> >
> > I have seen a thread [1] about ACRE. Forgive me if I misunderstood this
> > thread or missed something after this thread. It looks like the CRD field in
> > the CQE can be set for any NVMe error state, which means it *may* depend on
> > the device status.
>
> Right! Setting CRD values is at the controller's discretion for any
> error status as long as the host enables ACRE.
>
> > And this patch just introduced an internal temporary error state of
> > the controller by returning Command Interrupted status.
>
> It's just purely synthetic, though. I was hoping something more natural
> could trigger the status. That might not provide the deterministic
> scenario you're looking for, though.
>
> I'm not completely against using QEMU as a development/test vehicle for
> corner cases like this, but we are introducing a whole lot of knobs
> recently, and you practically need to be a QEMU developer to even find
> them. We probably should step up the documentation in the wiki along
> with these types of features.
>

I'd love that too... I need to test FreeBSD's nvme driver for different
error conditions. I know QEMU can help, but it's a bit obscure.

Warner

> > I think, at this stage, we can go with some errors in the middle of the
> > AIO (nvme_aio_err()) for advanced retry. Aren't AIO errors retryable
> > and supposed to be retried?
>
> Sure, we can assume that receiving an error in the AIO callback means
> the lower layers exhausted available recovery mechanisms.
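
For anyone testing a host driver against this, the CRD/ACRE behaviour discussed above can be sketched roughly as below. This is only a minimal illustration, assuming the status field has already been shifted down past the phase tag bit (so SC starts at bit 0, CRD sits in bits 12:11 and DNR in bit 14) and that the CRDT1..CRDT3 values from Identify Controller are in units of 100 ms. The helper name retry_delay_ms and its parameters are hypothetical and are not taken from QEMU, Linux, or FreeBSD.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field positions in the CQE status field after dropping the phase tag. */
#define STATUS_CRD_SHIFT 11          /* Command Retry Delay, bits 12:11 */
#define STATUS_CRD_MASK  0x3u
#define STATUS_DNR       (1u << 14)  /* Do Not Retry */

/*
 * Hypothetical helper: decide how long the host should wait before
 * retrying a failed command.
 *
 * status       - CQE status field, phase tag already shifted out
 * acre_enabled - true if the host enabled ACRE via Host Behavior Support
 * crdt         - CRDT1..CRDT3 from Identify Controller (100 ms units)
 *
 * Returns -1 if the command must not be retried, otherwise the delay
 * in milliseconds (0 means "retry immediately").
 */
static int retry_delay_ms(uint16_t status, bool acre_enabled,
                          const uint16_t crdt[3])
{
    unsigned crd;

    if (status & STATUS_DNR)
        return -1;

    crd = (status >> STATUS_CRD_SHIFT) & STATUS_CRD_MASK;

    /* CRD is only meaningful when ACRE is enabled; 0 means no delay. */
    if (!acre_enabled || crd == 0)
        return 0;

    /* CRD values 1..3 select CRDT1..CRDT3 respectively. */
    return (int)crdt[crd - 1] * 100;
}
```

With Command Interrupted (SC 0x21) and CRD set to 1, a controller reporting CRDT1 = 1 would ask for a 100 ms delay. A driver that ignores CRD, or that retries when DNR is set, is the kind of bug a deterministic QEMU knob would help catch.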