RE: [PATCH v4 00/14] hw/block/nvme: Support Namespace Types and Zoned Namespace Command Set

2020-09-29 Thread Matias Bjorling


> -----Original Message-----
> From: Klaus Jensen 
> Sent: Tuesday, 29 September 2020 20.00
> To: Keith Busch 
> Cc: Damien Le Moal ; Fam Zheng
> ; Kevin Wolf ; qemu-
> bl...@nongnu.org; Niklas Cassel ; Klaus Jensen
> ; qemu-de...@nongnu.org; Alistair Francis
> ; Philippe Mathieu-Daudé ;
> Matias Bjorling 
> Subject: Re: [PATCH v4 00/14] hw/block/nvme: Support Namespace Types and
> Zoned Namespace Command Set
> 
> On Sep 29 10:29, Keith Busch wrote:
> > On Tue, Sep 29, 2020 at 12:46:33PM +0200, Klaus Jensen wrote:
> > > It is unmistakably clear that you are invalidating my arguments
> > > about portability and endianness issues by suggesting that we just
> > > remove persistent state and deal with it later, but persistence is
> > > the killer feature that sets the QEMU emulated device apart from
> > > other emulation options. It is not about using emulation in
> > > production (because yeah, why would you?), but persistence is what
> > > makes it possible to develop and test "zoned FTLs" or something that
> > > requires recovery at power up.
> > > This is what allows testing of how your host software deals with
> > > opened zones being transitioned to FULL on power up and the
> > > persistent tracking of LBA allocation (in my series) can be used to
> > > properly test error recovery if you lost state in the app.
> >
> > Hold up -- why does an OPEN zone transition to FULL on power up? The
> > spec suggests it should be CLOSED. The spec does appear to support
> > going to FULL on a NVM Subsystem Reset, though. Actually, now that I'm
> > looking at this part of the spec, these implicit transitions seem a
> > bit less clear than I expected. I'm not sure it's clear enough to
> > evaluate qemu's compliance right now.
> >
> > But I don't see what testing these transitions has to do with having a
> > persistent state. You can reboot your VM without tearing down the
> > running QEMU instance. You can also unbind the driver or shutdown the
> > controller within the running operating system. That should make those
> > implicit state transitions reachable in order to exercise your FTL's
> > recovery.
> >
> 
> Oh dear - don't "spec" with me ;)
> 
> NVMe v1.4 Section 7.3.1:
> 
> An NVM Subsystem Reset is initiated when:
>   * Main power is applied to the NVM subsystem;
>   * A value of 4E564D65h ("NVMe") is written to the NSSR.NSSRC
> field;
>   * Requested using a method defined in the NVMe Management
> Interface specification; or
>   * A vendor specific event occurs.
> 
> In the context of QEMU, "Main power" is tearing down QEMU and starting it
> from scratch. Just like on a "real" host, unbinding the driver, rebooting or
> shutting down the controller does not cause a subsystem reset (and does not
> cause the zones to change state). And since the device does not indicate
> support for the optional NSSR.NSSRC register, that way to initiate a subsystem
> reset cannot be used.
> 
> The reason for moving to FULL is that write pointer updates are not persisted
> on each advancement, only when the zone state changes. So zones that were
> opened might have valid data, but invalid write pointer.
> So the device transitions them to FULL as it is allowed to.
> 

What about when one must also recover from intermediate states (i.e.,
open/closed upon power loss)? For example, I would hope a real SSD
implementation does not transition zones to full when it has thousands open
simultaneously. That could be a disaster for the PE cycles, with a lot of media
going to waste. One would want applications to support that kind of failure
mode as well.
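
To make the two recovery behaviors in this thread concrete, here is a minimal
sketch in C. It is not the code from the series: the type names, the
`wp_valid` flag, and the `zone_recover` helper are all hypothetical and only
illustrate the policy choice. A zone that was open at power loss either goes
to FULL (the write pointer was not persisted, so it cannot be trusted) or, if
the write pointer is known to be good, goes to CLOSED so the remaining
capacity stays usable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical zone states; the names are illustrative, not QEMU's. */
enum zone_state {
    ZS_EMPTY,
    ZS_IMPLICIT_OPEN,
    ZS_EXPLICIT_OPEN,
    ZS_CLOSED,
    ZS_FULL
};

/* Assumed policy knob: what to do with zones found open at power up. */
enum recovery_policy {
    RECOVER_TO_FULL,    /* write pointer not trusted, give up the rest */
    RECOVER_TO_CLOSED   /* keep the zone writable if the pointer is good */
};

struct zone {
    enum zone_state state;
    uint64_t start;     /* first LBA of the zone */
    uint64_t cap;       /* number of writable LBAs */
    uint64_t wp;        /* write pointer as last persisted */
    int wp_valid;       /* assumption: nonzero if wp was persisted safely */
};

/* Apply the chosen policy to one zone after a simulated power cycle. */
static void zone_recover(struct zone *z, enum recovery_policy policy)
{
    assert(z != NULL);

    /* Only zones that were open at power loss need a transition. */
    if (z->state != ZS_IMPLICIT_OPEN && z->state != ZS_EXPLICIT_OPEN) {
        return;
    }

    if (policy == RECOVER_TO_CLOSED && z->wp_valid) {
        /* The write pointer is trustworthy: keep the remaining capacity. */
        z->state = ZS_CLOSED;
    } else {
        /*
         * The write pointer may be stale. This sketch parks it at the end
         * of the zone; the zone accepts no more writes until it is reset.
         */
        z->wp = z->start + z->cap;
        z->state = ZS_FULL;
    }
}
```

With RECOVER_TO_CLOSED, a zone whose write pointer was persisted keeps its
unwritten capacity, which is the case raised above for devices with many
zones open at once. Anything with an untrusted write pointer still falls back
to FULL.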


RE: [PATCH v4 00/14] hw/block/nvme: Support Namespace Types and Zoned Namespace Command Set

2020-09-29 Thread Matias Bjorling


> -----Original Message-----
> From: Klaus Jensen 
> Sent: Tuesday, 29 September 2020 20.36
> To: Matias Bjorling 
> Cc: Keith Busch ; Damien Le Moal
> ; Fam Zheng ; Kevin Wolf
> ; qemu-block@nongnu.org; Niklas Cassel
> ; Klaus Jensen ; qemu-
> de...@nongnu.org; Alistair Francis ; Philippe
> Mathieu-Daudé 
> Subject: Re: [PATCH v4 00/14] hw/block/nvme: Support Namespace Types and
> Zoned Namespace Command Set
> 
> On Sep 29 18:17, Matias Bjorling wrote:
> >
> >
> > > -----Original Message-----
> > > From: Klaus Jensen 
> > > Sent: Tuesday, 29 September 2020 20.00
> > > To: Keith Busch 
> > > Cc: Damien Le Moal ; Fam Zheng
> > > ; Kevin Wolf ; qemu-
> > > bl...@nongnu.org; Niklas Cassel ; Klaus
> > > Jensen ; qemu-de...@nongnu.org; Alistair
> > > Francis ; Philippe Mathieu-Daudé
> > > ; Matias Bjorling 
> > > Subject: Re: [PATCH v4 00/14] hw/block/nvme: Support Namespace Types
> > > and Zoned Namespace Command Set
> > >
> > > On Sep 29 10:29, Keith Busch wrote:
> > > > On Tue, Sep 29, 2020 at 12:46:33PM +0200, Klaus Jensen wrote:
> > > > > It is unmistakably clear that you are invalidating my arguments
> > > > > about portability and endianness issues by suggesting that we
> > > > > just remove persistent state and deal with it later, but
> > > > > persistence is the killer feature that sets the QEMU emulated
> > > > > device apart from other emulation options. It is not about using
> > > > > emulation in production (because yeah, why would you?), but
> > > > > persistence is what makes it possible to develop and test "zoned
> > > > > > FTLs" or something that requires recovery at power up.
> > > > > This is what allows testing of how your host software deals with
> > > > > opened zones being transitioned to FULL on power up and the
> > > > > persistent tracking of LBA allocation (in my series) can be used
> > > > > to properly test error recovery if you lost state in the app.
> > > >
> > > > Hold up -- why does an OPEN zone transition to FULL on power up?
> > > > The spec suggests it should be CLOSED. The spec does appear to
> > > > support going to FULL on a NVM Subsystem Reset, though. Actually,
> > > > now that I'm looking at this part of the spec, these implicit
> > > > transitions seem a bit less clear than I expected. I'm not sure
> > > > it's clear enough to evaluate qemu's compliance right now.
> > > >
> > > > But I don't see what testing these transitions has to do with
> > > > having a persistent state. You can reboot your VM without tearing
> > > > down the running QEMU instance. You can also unbind the driver or
> > > > shutdown the controller within the running operating system. That
> > > > should make those implicit state transitions reachable in order to
> > > > exercise your FTL's recovery.
> > > >
> > >
> > > Oh dear - don't "spec" with me ;)
> > >
> > > NVMe v1.4 Section 7.3.1:
> > >
> > > An NVM Subsystem Reset is initiated when:
> > >   * Main power is applied to the NVM subsystem;
> > > >   * A value of 4E564D65h ("NVMe") is written to the NSSR.NSSRC
> > > > field;
> > >   * Requested using a method defined in the NVMe Management
> > > Interface specification; or
> > >   * A vendor specific event occurs.
> > >
> > > In the context of QEMU, "Main power" is tearing down QEMU and
> > > starting it from scratch. Just like on a "real" host, unbinding the
> > > driver, rebooting or shutting down the controller does not cause a
> > > subsystem reset (and does not cause the zones to change state). And
> > > > since the device does not indicate support for the optional
> > > > NSSR.NSSRC register, that way to initiate a subsystem reset cannot be
> > > > used.
> > >
> > > The reason for moving to FULL is that write pointer updates are not
> > > persisted on each advancement, only when the zone state changes. So
> > > zones that were opened might have valid data, but invalid write pointer.
> > > So the device transitions them to FULL as it is allowed to.
> > >
> >
> > How about when one must also recover from intermediate states (i.e.,
> > open/closed upon power loss). For example, I don't hope a real SSD
> > implementation