RE: [PATCH 00/10] hw/block/nvme: namespace types and zoned namespaces

2020-06-30 Thread Dmitry Fomichev


> -----Original Message-----
> From: Klaus Jensen 
> Sent: Tuesday, June 30, 2020 4:30 PM
> To: Niklas Cassel 
> Cc: qemu-block@nongnu.org; Klaus Jensen ;
> qemu-de...@nongnu.org; Keith Busch ; Max Reitz
> ; Kevin Wolf ; Javier Gonzalez
> ; Maxim Levitsky ;
> Philippe Mathieu-Daudé ; Dmitry Fomichev
> ; Damien Le Moal
> ; Matias Bjorling 
> Subject: Re: [PATCH 00/10] hw/block/nvme: namespace types and zoned
> namespaces
> 
> On Jun 30 12:59, Niklas Cassel wrote:
> > On Tue, Jun 30, 2020 at 12:01:29PM +0200, Klaus Jensen wrote:
> > > From: Klaus Jensen 
> > >
> > > Hi all,
> >
> > Hello Klaus,
> >
> 
> Hi Niklas,
> 
> > >
> > >   * the controller uses timers to autonomously finish zones (wrt. FRL)
> >
> > AFAICT, Dmitry's patches do this as well.
> >
> 
> Hmm, yeah. Something is going on at least. It's not really clear to me
> why it works or what is happening with that admin completion queue
> timer, but I'll dig through it.
> 
> > >
> > > I've been on paternity leave for a month, so I haven't been around to
> > > review Dmitry's patches, but I have started that process now. I would
> > > also be happy to work with Dmitry & Friends on merging our versions to
> > > get the best of both worlds if it makes sense.
> > >
> > > This series and all preparatory patch sets (the ones I've been posting
> > > yesterday and today) are available on my GitHub[2]. Unfortunately
> > > Patchew got screwed up in the middle of me sending patches and it never
> > > picked up v2 of "hw/block/nvme: support multiple namespaces" because it
> > > was getting late and I made a mistake with the CC's. So my posted series
> > > don't apply according to Patchew, but they actually do if you follow the
> > > Based-on's (... or just grab [2]).
> > >
> > >
> > >   [1]: Message-Id: <20200617213415.22417-1-dmitry.fomic...@wdc.com>
> > >   [2]: https://github.com/birkelund/qemu/tree/for-master/nvme
> > >
> > >
> > > Based-on: <20200630043122.1307043-1-...@irrelevant.dk>
> > > ("[PATCH 0/3] hw/block/nvme: bump to v1.4")
> >
> > Is this the only patch series that this series depends on?
> >
> > In the beginning of the cover letter, you mentioned
> > "NVMe v1.4 mandatory support", "SGLs", "multiple namespaces",
> > and "and mostly just overall clean up".
> >
> 
> No, it's a string of series that each has a Based-on tag (that is, "[PATCH
> 0/3] hw/block/nvme: bump to v1.4" has another Based-on tag that points
> to the dependency of that). The point was to have patchew nicely apply
> everything, but it broke midway...
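[Editor's note: for readers unfamiliar with the convention, the dependency chain described here works through Based-on trailers in each cover letter, which tools like Patchew resolve bottom-up. A sketch of such a chain, with placeholder message-ids that are illustrative only:

```
[PATCH 00/10] hw/block/nvme: namespace types and zoned namespaces
Based-on: <cover-letter-message-id of "bump to v1.4">

[PATCH 0/3] hw/block/nvme: bump to v1.4
Based-on: <cover-letter-message-id of "support multiple namespaces" v2>

[PATCH v2] hw/block/nvme: support multiple namespaces
(applies directly to QEMU master)
```

Each trailer names the series that must be applied first, so following the chain reconstructs the full stack.]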
> 
> As Philippe pointed out, all of the patch sets are integrated in the
> GitHub tree, applied to QEMU master.
> 
> >
> > I think that you have done a great job getting the NVMe
> > driver out of a frankenstate, and made it compliant with
> > a proper spec (NVMe 1.4).
> >
> > I'm also a big fan of the refactoring so that the driver
> > handles more than one namespace, and the new bus model.
> >
> 
> Well, thanks! :)
> 
> > I know that you first sent your
> > "nvme: support NVMe v1.3d, SGLs and multiple namespaces"
> > patch series July, last year.
> >
> > Looking at your outstanding patch series on patchwork:
> > https://patchwork.kernel.org/project/qemu-devel/list/?submitter=188679
> >
> > (Feel free to correct me if I have misunderstood anything.)
> >
> > I see that these are related to your patch series from July last year:
> > hw/block/nvme: bump to v1.3
> > hw/block/nvme: support scatter gather lists
> > hw/block/nvme: support multiple namespaces
> > hw/block/nvme: bump to v1.4
> >
> 
> Yeah this stuff has been around for a while so the history on patchwork
> is a mess.
> 
> >
> > This patch series seems minor and could probably be merged immediately:
> > hw/block/nvme: handle transient dma errors
> >
> 
> Sure, but it's nicer in combination with the previous series
> ("hw/block/nvme: AIO and address mapping refactoring"). What I /can/ do
> is rip out "hw/block/nvme: allow multiple aios per command" as that
> patch might require more time for reviews. The rest of that series consists
> of cleanups and a couple of bug fixes.
> 
> >
> > This patch series looks a bit weird:
> > hw/block/nvme: AIO and address mapping refactoring
> >
> > Since it looks like a V1 post, and was first posted yesterday.
> > However, 2 out of the 17 patches in it are Acked-by Keith.
> > (Perhaps some of your previously posted patches were put inside
> > this new patch series?)
> >
> 
> Yes, that, and reviewers requested a lot of separation, so basically the
> patch set ballooned.
> 
> >
> > This patch series:
> > hw/block/nvme: namespace types and zoned namespaces
> >
> > Which was first posted today. Up until earlier today, we hadn't seen
> > any patches from you related to ZNS (only overall NVMe cleanups).
> > Dmitry's ZNS patches have been on the list since 2020-06-16.
> >
> 
> Yeah, as I mentioned in my cover letter, I was on leave, so I wasn't
> around for the big ZNS release day either. But, honestly, I think this
> is irrelevant - code should be merged based on technical reasons (not
> technicalities).

Re: [PATCH v2 12/18] hw/block/nvme: Simulate Zone Active excursions

2020-06-30 Thread Alistair Francis
On Wed, Jun 17, 2020 at 2:52 PM Dmitry Fomichev  wrote:
>
> Added a Boolean flag to turn on simulation of Zone Active Excursions.
> If the flag, "active_excursions", is set to true, the driver will try
> to finish one of the currently open zones if the max active zones
> limit is about to be exceeded.
>
> Signed-off-by: Dmitry Fomichev 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  hw/block/nvme.c | 24 +++-
>  hw/block/nvme.h |  1 +
>  2 files changed, 24 insertions(+), 1 deletion(-)
>
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 05a7cbcfcc..a29cbfcc96 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -540,6 +540,26 @@ static void nvme_auto_transition_zone(NvmeCtrl *n, NvmeNamespace *ns,
>  {
>  NvmeZone *zone;
>
> +if (n->params.active_excursions && adding_active &&
> +n->params.max_active_zones &&
> +ns->nr_active_zones == n->params.max_active_zones) {
> +zone = nvme_peek_zone_head(ns, ns->closed_zones);
> +if (zone) {
> +/*
> + * The namespace is at the limit of active zones.
> + * Try to finish one of the currently active zones
> + * to make the needed active zone resource available.
> + */
> +nvme_aor_dec_active(n, ns);
> +nvme_assign_zone_state(n, ns, zone, NVME_ZONE_STATE_FULL);
> +zone->d.za &= ~(NVME_ZA_FINISH_RECOMMENDED |
> +NVME_ZA_RESET_RECOMMENDED);
> +zone->d.za |= NVME_ZA_FINISHED_BY_CTLR;
> +zone->tstamp = 0;
> +trace_pci_nvme_zone_finished_by_controller(zone->d.zslba);
> +}
> +}
> +
>  if (implicit && n->params.max_open_zones &&
>  ns->nr_open_zones == n->params.max_open_zones) {
>  zone = nvme_remove_zone_head(n, ns, ns->imp_open_zones);
> @@ -2631,7 +2651,7 @@ static int nvme_zoned_init_ns(NvmeCtrl *n, NvmeNamespace *ns, int lba_index,
>  /* MAR/MOR are zeroes-based, 0xffffffff means no limit */
>  ns->id_ns_zoned->mar = cpu_to_le32(n->params.max_active_zones - 1);
>  ns->id_ns_zoned->mor = cpu_to_le32(n->params.max_open_zones - 1);
> -ns->id_ns_zoned->zoc = 0;
> +ns->id_ns_zoned->zoc = cpu_to_le16(n->params.active_excursions ? 0x2 : 0);
>  ns->id_ns_zoned->ozcs = n->params.cross_zone_read ? 0x01 : 0x00;
>
>  ns->id_ns_zoned->lbafe[lba_index].zsze = cpu_to_le64(n->params.zone_size);
> @@ -2993,6 +3013,8 @@ static Property nvme_props[] = {
>  DEFINE_PROP_INT32("max_active", NvmeCtrl, params.max_active_zones, 0),
>  DEFINE_PROP_INT32("max_open", NvmeCtrl, params.max_open_zones, 0),
>  DEFINE_PROP_BOOL("cross_zone_read", NvmeCtrl, params.cross_zone_read, true),
> +DEFINE_PROP_BOOL("active_excursions", NvmeCtrl, params.active_excursions,
> + false),
>  DEFINE_PROP_UINT8("fill_pattern", NvmeCtrl, params.fill_pattern, 0),
>  DEFINE_PROP_END_OF_LIST(),
>  };
> diff --git a/hw/block/nvme.h b/hw/block/nvme.h
> index f5a4679702..8a0aaeb09a 100644
> --- a/hw/block/nvme.h
> +++ b/hw/block/nvme.h
> @@ -15,6 +15,7 @@ typedef struct NvmeParams {
>
>  bool zoned;
>  bool cross_zone_read;
> +bool active_excursions;
>  uint8_t fill_pattern;
>  uint32_t zamds_bs;
>  uint64_t zone_size;
> --
> 2.21.0
>
>



Re: [PATCH v2 11/18] hw/block/nvme: Introduce max active and open zone limits

2020-06-30 Thread Alistair Francis
On Wed, Jun 17, 2020 at 3:07 PM Dmitry Fomichev  wrote:
>
> Added two module properties, "max_active" and "max_open" to control
> the maximum number of zones that can be active or open. Once these
> variables are set to non-default values, the driver checks these
> limits during I/O and returns Too Many Active or Too Many Open
> command status if they are exceeded.
>
> Signed-off-by: Hans Holmberg 
> Signed-off-by: Dmitry Fomichev 
> ---
>  hw/block/nvme.c | 183 +++-
>  hw/block/nvme.h |   4 ++
>  2 files changed, 185 insertions(+), 2 deletions(-)
>
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 2e03b0b6ed..05a7cbcfcc 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -120,6 +120,87 @@ static void nvme_remove_zone(NvmeCtrl *n, NvmeNamespace *ns, NvmeZoneList *zl,
>  zone->prev = zone->next = 0;
>  }
>
> +/*
> + * Take the first zone out from a list, return NULL if the list is empty.
> + */
> +static NvmeZone *nvme_remove_zone_head(NvmeCtrl *n, NvmeNamespace *ns,
> +NvmeZoneList *zl)
> +{
> +NvmeZone *zone = nvme_peek_zone_head(ns, zl);
> +
> +if (zone) {
> +--zl->size;
> +if (zl->size == 0) {
> +zl->head = NVME_ZONE_LIST_NIL;
> +zl->tail = NVME_ZONE_LIST_NIL;
> +} else {
> +zl->head = zone->next;
> +ns->zone_array[zl->head].prev = NVME_ZONE_LIST_NIL;
> +}
> +zone->prev = zone->next = 0;
> +}
> +
> +return zone;
> +}
> +
> +/*
> + * Check if we can open a zone without exceeding open/active limits.
> + * AOR stands for "Active and Open Resources" (see TP 4053 section 2.5).
> + */
> +static int nvme_aor_check(NvmeCtrl *n, NvmeNamespace *ns,
> + uint32_t act, uint32_t opn)
> +{
> +if (n->params.max_active_zones != 0 &&
> +ns->nr_active_zones + act > n->params.max_active_zones) {
> +trace_pci_nvme_err_insuff_active_res(n->params.max_active_zones);
> +return NVME_ZONE_TOO_MANY_ACTIVE | NVME_DNR;
> +}
> +if (n->params.max_open_zones != 0 &&
> +ns->nr_open_zones + opn > n->params.max_open_zones) {
> +trace_pci_nvme_err_insuff_open_res(n->params.max_open_zones);
> +return NVME_ZONE_TOO_MANY_OPEN | NVME_DNR;
> +}
> +
> +return NVME_SUCCESS;
> +}
> +
> +static inline void nvme_aor_inc_open(NvmeCtrl *n, NvmeNamespace *ns)
> +{
> +assert(ns->nr_open_zones >= 0);
> +if (n->params.max_open_zones) {
> +ns->nr_open_zones++;
> +assert(ns->nr_open_zones <= n->params.max_open_zones);
> +}
> +}
> +
> +static inline void nvme_aor_dec_open(NvmeCtrl *n, NvmeNamespace *ns)
> +{
> +if (n->params.max_open_zones) {
> +assert(ns->nr_open_zones > 0);
> +ns->nr_open_zones--;
> +}
> +assert(ns->nr_open_zones >= 0);
> +}
> +
> +static inline void nvme_aor_inc_active(NvmeCtrl *n, NvmeNamespace *ns)
> +{
> +assert(ns->nr_active_zones >= 0);
> +if (n->params.max_active_zones) {
> +ns->nr_active_zones++;
> +assert(ns->nr_active_zones <= n->params.max_active_zones);
> +}
> +}
> +
> +static inline void nvme_aor_dec_active(NvmeCtrl *n, NvmeNamespace *ns)
> +{
> +if (n->params.max_active_zones) {
> +assert(ns->nr_active_zones > 0);
> +ns->nr_active_zones--;
> +assert(ns->nr_active_zones >= ns->nr_open_zones);
> +}
> +assert(ns->nr_active_zones >= 0);
> +}
> +
>  static void nvme_assign_zone_state(NvmeCtrl *n, NvmeNamespace *ns,
>  NvmeZone *zone, uint8_t state)
>  {
> @@ -454,6 +535,24 @@ static void nvme_enqueue_req_completion(NvmeCQueue *cq, NvmeRequest *req)
>  timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
>  }
>
> +static void nvme_auto_transition_zone(NvmeCtrl *n, NvmeNamespace *ns,
> +bool implicit, bool adding_active)
> +{
> +NvmeZone *zone;
> +
> +if (implicit && n->params.max_open_zones &&
> +ns->nr_open_zones == n->params.max_open_zones) {
> +zone = nvme_remove_zone_head(n, ns, ns->imp_open_zones);
> +if (zone) {
> +/*
> + * Automatically close this implicitly open zone.
> + */
> +nvme_aor_dec_open(n, ns);
> +nvme_assign_zone_state(n, ns, zone, NVME_ZONE_STATE_CLOSED);
> +}
> +}
> +}
> +
>  static uint16_t nvme_check_zone_write(NvmeZone *zone, uint64_t slba,
>  uint32_t nlb)
>  {
> @@ -531,6 +630,23 @@ static uint16_t nvme_check_zone_read(NvmeCtrl *n, NvmeZone *zone, uint64_t slba,
>  return status;
>  }
>
> +static uint16_t nvme_auto_open_zone(NvmeCtrl *n, NvmeNamespace *ns,
> +NvmeZone *zone)
> +{
> +uint16_t status = NVME_SUCCESS;
> +uint8_t zs = nvme_get_zone_state(zone);
> +
> +if (zs == NVME_ZONE_STATE_EMPTY) {
> +nvme_auto_transition_zone(n, ns, true, true);
> +status = nvme_aor_check(n, ns, 1, 1);
> +} else if (zs == NVME_ZONE_STATE_CLOSED) {
> +nvme_auto_transition_zone(n, 

Re: [PATCH v2 08/18] hw/block/nvme: Make Zoned NS Command Set definitions

2020-06-30 Thread Alistair Francis
On Wed, Jun 17, 2020 at 2:51 PM Dmitry Fomichev  wrote:
>
> Define values and structures that are needed to support Zoned
> Namespace Command Set (NVMe TP 4053) in the PCI NVMe controller emulator.
>
> All new protocol definitions are located in include/block/nvme.h
> and everything added that is specific to this implementation is kept
> in hw/block/nvme.h.
>
> In order to improve scalability, all open, closed and full zones
> are organized in separate linked lists. Consequently, almost all
> zone operations don't require scanning of the entire zone array
> (which potentially can be quite large) - it is only necessary to
> enumerate one or more zone lists. Zone lists are designed to be
> position-independent so they can be persisted to the backing file
> as part of the zone metadata. The NvmeZoneList struct defined in
> this patch serves as the head of every zone list.
>
> The NvmeZone structure encapsulates the NvmeZoneDescriptor defined
> in the Zoned Command Set specification and adds a few more fields
> that are internal to this implementation.
>
> Signed-off-by: Niklas Cassel 
> Signed-off-by: Hans Holmberg 
> Signed-off-by: Ajay Joshi 
> Signed-off-by: Matias Bjorling 
> Signed-off-by: Shin'ichiro Kawasaki 
> Signed-off-by: Alexey Bogoslavsky 
> Signed-off-by: Dmitry Fomichev 
> ---
>  hw/block/nvme.h  | 130 +++
>  include/block/nvme.h | 119 ++-
>  2 files changed, 248 insertions(+), 1 deletion(-)
>
> diff --git a/hw/block/nvme.h b/hw/block/nvme.h
> index 0d29f75475..2c932b5e29 100644
> --- a/hw/block/nvme.h
> +++ b/hw/block/nvme.h
> @@ -3,12 +3,22 @@
>
>  #include "block/nvme.h"
>
> +#define NVME_DEFAULT_ZONE_SIZE   128 /* MiB */
> +#define NVME_DEFAULT_MAX_ZA_SIZE 128 /* KiB */
> +
>  typedef struct NvmeParams {
>  char *serial;
>  uint32_t num_queues; /* deprecated since 5.1 */
>  uint32_t max_ioqpairs;
>  uint16_t msix_qsize;
>  uint32_t cmb_size_mb;
> +
> +bool zoned;
> +bool cross_zone_read;
> +uint8_t fill_pattern;
> +uint32_t zamds_bs;
> +uint64_t zone_size;
> +uint64_t zone_capacity;
>  } NvmeParams;
>
>  typedef struct NvmeAsyncEvent {
> @@ -17,6 +27,8 @@ typedef struct NvmeAsyncEvent {
>
>  enum NvmeRequestFlags {
>  NVME_REQ_FLG_HAS_SG   = 1 << 0,
> +NVME_REQ_FLG_FILL = 1 << 1,
> +NVME_REQ_FLG_APPEND   = 1 << 2,
>  };
>
>  typedef struct NvmeRequest {
> @@ -24,6 +36,7 @@ typedef struct NvmeRequest {
>  BlockAIOCB  *aiocb;
>  uint16_t status;
>  uint16_t flags;
> +uint64_t fill_ofs;
>  NvmeCqe cqe;
>  BlockAcctCookie acct;
>  QEMUSGList  qsg;
> @@ -61,11 +74,35 @@ typedef struct NvmeCQueue {
>  QTAILQ_HEAD(, NvmeRequest) req_list;
>  } NvmeCQueue;
>
> +typedef struct NvmeZone {
> +NvmeZoneDescr   d;
> +uint64_t tstamp;
> +uint32_t next;
> +uint32_t prev;
> +uint8_t rsvd80[8];
> +} NvmeZone;
> +
> +#define NVME_ZONE_LIST_NIL UINT_MAX
> +
> +typedef struct NvmeZoneList {
> +uint32_t head;
> +uint32_t tail;
> +uint32_t size;
> +uint8_t rsvd12[4];
> +} NvmeZoneList;
> +
>  typedef struct NvmeNamespace {
>  NvmeIdNs id_ns;
>  uint32_t nsid;
>  uint8_t csi;
>  QemuUUID uuid;
> +
> +NvmeIdNsZoned   *id_ns_zoned;
> +NvmeZone *zone_array;
> +NvmeZoneList *exp_open_zones;
> +NvmeZoneList *imp_open_zones;
> +NvmeZoneList *closed_zones;
> +NvmeZoneList *full_zones;
>  } NvmeNamespace;
>
>  static inline NvmeLBAF *nvme_ns_lbaf(NvmeNamespace *ns)
> @@ -100,6 +137,7 @@ typedef struct NvmeCtrl {
>  uint32_t num_namespaces;
>  uint32_t max_q_ents;
>  uint64_t ns_size;
> +
>  uint8_t *cmbuf;
>  uint32_t irq_status;
>  uint64_t host_timestamp; /* Timestamp sent by the host */
> @@ -107,6 +145,12 @@ typedef struct NvmeCtrl {
>
>  HostMemoryBackend *pmrdev;
>
> +int zone_file_fd;
> +uint32_t num_zones;
> +uint64_t zone_size_bs;
> +uint64_t zone_array_size;
> +uint8_t zamds;
> +
>  NvmeNamespace   *namespaces;
>  NvmeSQueue  **sq;
>  NvmeCQueue  **cq;
> @@ -121,6 +165,86 @@ static inline uint64_t nvme_ns_nlbas(NvmeCtrl *n, NvmeNamespace *ns)
>  return n->ns_size >> nvme_ns_lbads(ns);
>  }
>
> +static inline uint8_t nvme_get_zone_state(NvmeZone *zone)
> +{
> +return zone->d.zs >> 4;
> +}
> +
> +static inline void nvme_set_zone_state(NvmeZone *zone, enum NvmeZoneState state)
> +{
> +zone->d.zs = state << 4;
> +}
> +
> +static inline uint64_t nvme_zone_rd_boundary(NvmeCtrl *n, NvmeZone *zone)
> +{
> +return zone->d.zslba + n->params.zone_size;
> +}
> +
> +static inline uint64_t 

Re: [PATCH 00/10] hw/block/nvme: namespace types and zoned namespaces

2020-06-30 Thread Klaus Jensen
On Jun 30 08:42, Keith Busch wrote:
> On Tue, Jun 30, 2020 at 04:09:46PM +0200, Philippe Mathieu-Daudé wrote:
> > What I see doable for the following days is:
> > - hw/block/nvme: Fix I/O BAR structure [3]
> > - hw/block/nvme: handle transient dma errors
> > - hw/block/nvme: bump to v1.3
> 
> 
> These look like sensible patches to rebase future work on, IMO. The 1.3
> updates had been prepared a while ago, at least.

I think Philippe's "hw/block/nvme: Fix I/O BAR structure" series is a
no-brainer. It just needs to get in asap.

The "hw/block/nvme: handle transient dma errors" series would really
benefit from most of the patches in my "hw/block/nvme: AIO and address
mapping refactoring" series. The elephant in the room is the AIO part
(the "hw/block/nvme: allow multiple aios per command" patch), so I will
get rid of it and leave the cleanup patches there and post it as a new
series together with the "handle transient dma errors" fixes. This would
make it a series of around ~17-18 patches, but I think they are all
quite reviewable.

The bump to v1.3 should also pretty much be ready for merging.



Re: [PATCH 00/10] hw/block/nvme: namespace types and zoned namespaces

2020-06-30 Thread Klaus Jensen
On Jun 30 12:59, Niklas Cassel wrote:
> On Tue, Jun 30, 2020 at 12:01:29PM +0200, Klaus Jensen wrote:
> > From: Klaus Jensen 
> > 
> > Hi all,
> 
> Hello Klaus,
> 

Hi Niklas,

> > 
> >   * the controller uses timers to autonomously finish zones (wrt. FRL)
> 
> AFAICT, Dmitry's patches do this as well.
> 

Hmm, yeah. Something is going on at least. It's not really clear to me
why it works or what is happening with that admin completion queue
timer, but I'll dig through it.

> > 
> > I've been on paternity leave for a month, so I haven't been around to
> > review Dmitry's patches, but I have started that process now. I would
> > also be happy to work with Dmitry & Friends on merging our versions to
> > get the best of both worlds if it makes sense.
> > 
> > This series and all preparatory patch sets (the ones I've been posting
> > yesterday and today) are available on my GitHub[2]. Unfortunately
> > Patchew got screwed up in the middle of me sending patches and it never
> > picked up v2 of "hw/block/nvme: support multiple namespaces" because it
> > was getting late and I made a mistake with the CC's. So my posted series
> > don't apply according to Patchew, but they actually do if you follow the
> > Based-on's (... or just grab [2]).
> > 
> > 
> >   [1]: Message-Id: <20200617213415.22417-1-dmitry.fomic...@wdc.com>
> >   [2]: https://github.com/birkelund/qemu/tree/for-master/nvme
> > 
> > 
> > Based-on: <20200630043122.1307043-1-...@irrelevant.dk>
> > ("[PATCH 0/3] hw/block/nvme: bump to v1.4")
> 
> Is this the only patch series that this series depends on?
> 
> In the beginning of the cover letter, you mentioned
> "NVMe v1.4 mandatory support", "SGLs", "multiple namespaces",
> and "and mostly just overall clean up".
> 

No, it's a string of series that each has a Based-on tag (that is, "[PATCH
0/3] hw/block/nvme: bump to v1.4" has another Based-on tag that points
to the dependency of that). The point was to have patchew nicely apply
everything, but it broke midway...

As Philippe pointed out, all of the patch sets are integrated in the
GitHub tree, applied to QEMU master.

> 
> I think that you have done a great job getting the NVMe
> driver out of a frankenstate, and made it compliant with
> a proper spec (NVMe 1.4).
> 
> I'm also a big fan of the refactoring so that the driver
> handles more than one namespace, and the new bus model.
> 

Well, thanks! :)

> I know that you first sent your
> "nvme: support NVMe v1.3d, SGLs and multiple namespaces"
> patch series July, last year.
> 
> Looking at your outstanding patch series on patchwork:
> https://patchwork.kernel.org/project/qemu-devel/list/?submitter=188679
> 
> (Feel free to correct me if I have misunderstood anything.)
> 
> I see that these are related to your patch series from July last year:
> hw/block/nvme: bump to v1.3
> hw/block/nvme: support scatter gather lists
> hw/block/nvme: support multiple namespaces
> hw/block/nvme: bump to v1.4
> 

Yeah this stuff has been around for a while so the history on patchwork
is a mess.

> 
> This patch series seems minor and could probably be merged immediately:
> hw/block/nvme: handle transient dma errors
> 

Sure, but it's nicer in combination with the previous series
("hw/block/nvme: AIO and address mapping refactoring"). What I /can/ do
is rip out "hw/block/nvme: allow multiple aios per command" as that
patch might require more time for reviews. The rest of that series consists
of cleanups and a couple of bug fixes.

> 
> This patch series looks a bit weird:
> hw/block/nvme: AIO and address mapping refactoring
> 
> Since it looks like a V1 post, and was first posted yesterday.
> However, 2 out of the 17 patches in it are Acked-by Keith.
> (Perhaps some of your previously posted patches were put inside
> this new patch series?)
> 

Yes, that, and reviewers requested a lot of separation, so basically the
patch set ballooned.

> 
> This patch series:
> hw/block/nvme: namespace types and zoned namespaces
> 
> Which was first posted today. Up until earlier today, we hadn't seen
> any patches from you related to ZNS (only overall NVMe cleanups).
> Dmitry's ZNS patches have been on the list since 2020-06-16.
> 

Yeah, as I mentioned in my cover letter, I was on leave, so I wasn't
around for the big ZNS release day either. But, honestly, I think this
is irrelevant - code should be merged based on technical reasons (not
technicalities).

> 
> Just a friendly suggestion, how about:
> 
> 1) We get your
> 
> hw/block/nvme: bump to v1.3
> hw/block/nvme: support scatter gather lists
> hw/block/nvme: support multiple namespaces
> hw/block/nvme: bump to v1.4
> 
> patch series merged.
> 

Blowing my own horn here, but yeah, it seems like everyone would like to
see this merged.

> 2) We get Dmitry's patch series merged.
> 
> Shared 4th) If there is any feature you find missing in Dmitry's patch series,
> perhaps you could send patches to add what you are missing.
>

Looks like the two versions are pretty much on par in terms of features.


Re: [PATCH v2 06/18] hw/block/nvme: Define trace events related to NS Types

2020-06-30 Thread Alistair Francis
On Wed, Jun 17, 2020 at 2:46 PM Dmitry Fomichev  wrote:
>
> A few trace events are defined that are relevant to implementing
> Namespace Types (NVMe TP 4056).
>
> Signed-off-by: Dmitry Fomichev 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  hw/block/trace-events | 11 +++
>  1 file changed, 11 insertions(+)
>
> diff --git a/hw/block/trace-events b/hw/block/trace-events
> index 423d491e27..3f3323fe38 100644
> --- a/hw/block/trace-events
> +++ b/hw/block/trace-events
> @@ -39,8 +39,13 @@ pci_nvme_create_cq(uint64_t addr, uint16_t cqid, uint16_t 
> vector, uint16_t size,
>  pci_nvme_del_sq(uint16_t qid) "deleting submission queue sqid=%"PRIu16""
>  pci_nvme_del_cq(uint16_t cqid) "deleted completion queue, cqid=%"PRIu16""
>  pci_nvme_identify_ctrl(void) "identify controller"
> +pci_nvme_identify_ctrl_csi(uint8_t csi) "identify controller, csi=0x%"PRIx8""
>  pci_nvme_identify_ns(uint16_t ns) "identify namespace, nsid=%"PRIu16""
> +pci_nvme_identify_ns_csi(uint16_t ns, uint8_t csi) "identify namespace, 
> nsid=%"PRIu16", csi=0x%"PRIx8""
>  pci_nvme_identify_nslist(uint16_t ns) "identify namespace list, 
> nsid=%"PRIu16""
> +pci_nvme_identify_nslist_csi(uint16_t ns, uint8_t csi) "identify namespace 
> list, nsid=%"PRIu16", csi=0x%"PRIx8""
> +pci_nvme_list_ns_descriptors(void) "identify namespace descriptors"
> +pci_nvme_identify_cmd_set(void) "identify i/o command set"
>  pci_nvme_getfeat_vwcache(const char* result) "get feature volatile write 
> cache, result=%s"
>  pci_nvme_getfeat_numq(int result) "get feature number of queues, result=%d"
>  pci_nvme_setfeat_numq(int reqcq, int reqsq, int gotcq, int gotsq) "requested 
> cq_count=%d sq_count=%d, responding with cq_count=%d sq_count=%d"
> @@ -59,6 +64,8 @@ pci_nvme_mmio_stopped(void) "cleared controller enable bit"
>  pci_nvme_mmio_shutdown_set(void) "shutdown bit set"
>  pci_nvme_mmio_shutdown_cleared(void) "shutdown bit cleared"
>  pci_nvme_cmd_supp_and_effects_log_read(void) "commands supported and effects 
> log read"
> +pci_nvme_css_nvm_cset_selected_by_host(uint32_t cc) "NVM command set 
> selected by host, bar.cc=0x%"PRIx32""
> +pci_nvme_css_all_csets_sel_by_host(uint32_t cc) "all supported command sets 
> selected by host, bar.cc=0x%"PRIx32""
>
>  # nvme traces for error conditions
>  pci_nvme_err_invalid_dma(void) "PRP/SGL is too small for transfer size"
> @@ -72,6 +79,9 @@ pci_nvme_err_invalid_admin_opc(uint8_t opc) "invalid admin 
> opcode 0x%"PRIx8""
>  pci_nvme_err_invalid_lba_range(uint64_t start, uint64_t len, uint64_t limit) 
> "Invalid LBA start=%"PRIu64" len=%"PRIu64" limit=%"PRIu64""
>  pci_nvme_err_invalid_effects_log_offset(uint64_t ofs) "commands supported 
> and effects log offset must be 0, got %"PRIu64""
>  pci_nvme_err_invalid_effects_log_len(uint32_t len) "commands supported and 
> effects log size is 4096, got %"PRIu32""
> +pci_nvme_err_change_css_when_enabled(void) "changing CC.CSS while controller 
> is enabled"
> +pci_nvme_err_only_nvm_cmd_set_avail(void) "setting 110b CC.CSS, but only NVM 
> command set is enabled"
> +pci_nvme_err_invalid_iocsci(uint32_t idx) "unsupported command set 
> combination index %"PRIu32""
>  pci_nvme_err_invalid_del_sq(uint16_t qid) "invalid submission queue 
> deletion, sid=%"PRIu16""
>  pci_nvme_err_invalid_create_sq_cqid(uint16_t cqid) "failed creating 
> submission queue, invalid cqid=%"PRIu16""
>  pci_nvme_err_invalid_create_sq_sqid(uint16_t sqid) "failed creating 
> submission queue, invalid sqid=%"PRIu16""
> @@ -127,6 +137,7 @@ pci_nvme_ub_db_wr_invalid_cqhead(uint32_t qid, uint16_t 
> new_head) "completion qu
>  pci_nvme_ub_db_wr_invalid_sq(uint32_t qid) "submission queue doorbell write 
> for nonexistent queue, sqid=%"PRIu32", ignoring"
>  pci_nvme_ub_db_wr_invalid_sqtail(uint32_t qid, uint16_t new_tail) 
> "submission queue doorbell write value beyond queue size, sqid=%"PRIu32", 
> new_head=%"PRIu16", ignoring"
>  pci_nvme_unsupported_log_page(uint16_t lid) "unsupported log page 
> 0x%"PRIx16""
> +pci_nvme_ub_unknown_css_value(void) "unknown value in cc.css field"
>
>  # xen-block.c
>  xen_block_realize(const char *type, uint32_t disk, uint32_t partition) "%s 
> d%up%u"
> --
> 2.21.0
>
>



[PATCH v2 11/12] block/nvme: Simplify nvme_create_queue_pair() arguments

2020-06-30 Thread Philippe Mathieu-Daudé
nvme_create_queue_pair() doesn't require BlockDriverState anymore.
Replace it by BDRVNVMeState and AioContext to simplify.

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Philippe Mathieu-Daudé 
---
 block/nvme.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index 010286e8ad..90b2e00e8d 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -208,12 +208,12 @@ static void nvme_free_req_queue_cb(void *opaque)
 qemu_mutex_unlock(&q->lock);
 }
 
-static NVMeQueuePair *nvme_create_queue_pair(BlockDriverState *bs,
+static NVMeQueuePair *nvme_create_queue_pair(BDRVNVMeState *s,
+ AioContext *aio_context,
  int idx, int size,
  Error **errp)
 {
 int i, r;
-BDRVNVMeState *s = bs->opaque;
 Error *local_err = NULL;
 NVMeQueuePair *q;
 uint64_t prp_list_iova;
@@ -232,8 +232,7 @@ static NVMeQueuePair *nvme_create_queue_pair(BlockDriverState *bs,
 q->s = s;
 q->index = idx;
 qemu_co_queue_init(&q->free_req_queue);
-q->completion_bh = aio_bh_new(bdrv_get_aio_context(bs),
-  nvme_process_completion_bh, q);
+q->completion_bh = aio_bh_new(aio_context, nvme_process_completion_bh, q);
 r = qemu_vfio_dma_map(s->vfio, q->prp_list_pages,
   s->page_size * NVME_NUM_REQS,
   false, &prp_list_iova);
@@ -637,7 +636,8 @@ static bool nvme_add_io_queue(BlockDriverState *bs, Error **errp)
 NvmeCmd cmd;
 int queue_size = NVME_QUEUE_SIZE;
 
-q = nvme_create_queue_pair(bs, n, queue_size, errp);
+q = nvme_create_queue_pair(s, bdrv_get_aio_context(bs),
+   n, queue_size, errp);
 if (!q) {
 return false;
 }
@@ -682,6 +682,7 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
  Error **errp)
 {
 BDRVNVMeState *s = bs->opaque;
+AioContext *aio_context = bdrv_get_aio_context(bs);
 int ret;
 uint64_t cap;
 uint64_t timeout_ms;
@@ -742,7 +743,7 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
 
 /* Set up admin queue. */
 s->queues = g_new(NVMeQueuePair *, 1);
-s->queues[QUEUE_INDEX_ADMIN] = nvme_create_queue_pair(bs, 0,
+s->queues[QUEUE_INDEX_ADMIN] = nvme_create_queue_pair(s, aio_context, 0,
   NVME_QUEUE_SIZE,
   errp);
 if (!s->queues[QUEUE_INDEX_ADMIN]) {
-- 
2.21.3




[PATCH v2 07/12] block/nvme: Replace qemu_try_blockalign0 by qemu_try_blockalign/memset

2020-06-30 Thread Philippe Mathieu-Daudé
In the next commit we'll get rid of qemu_try_blockalign().
To ease review, first replace qemu_try_blockalign0() by explicit
calls to qemu_try_blockalign() and memset().

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Philippe Mathieu-Daudé 
---
 block/nvme.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index 7ebd5be1f3..5b0bb9a8d7 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -174,12 +174,12 @@ static void nvme_init_queue(BlockDriverState *bs, NVMeQueue *q,
 
 bytes = ROUND_UP(nentries * entry_bytes, s->page_size);
 q->head = q->tail = 0;
-q->queue = qemu_try_blockalign0(bs, bytes);
-
+q->queue = qemu_try_blockalign(bs, bytes);
 if (!q->queue) {
 error_setg(errp, "Cannot allocate queue");
 return;
 }
+memset(q->queue, 0, bytes);
 r = qemu_vfio_dma_map(s->vfio, q->queue, bytes, false, &q->iova);
 if (r) {
 error_setg(errp, "Cannot map queue");
@@ -223,11 +223,12 @@ static NVMeQueuePair *nvme_create_queue_pair(BlockDriverState *bs,
 if (!q) {
 return NULL;
 }
-q->prp_list_pages = qemu_try_blockalign0(bs,
+q->prp_list_pages = qemu_try_blockalign(bs,
   s->page_size * NVME_QUEUE_SIZE);
 if (!q->prp_list_pages) {
 goto fail;
 }
+memset(q->prp_list_pages, 0, s->page_size * NVME_QUEUE_SIZE);
 qemu_mutex_init(&q->lock);
 q->s = s;
 q->index = idx;
@@ -521,7 +522,7 @@ static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
 .cdw10 = cpu_to_le32(0x1),
 };
 
-id = qemu_try_blockalign0(bs, sizeof(*id));
+id = qemu_try_blockalign(bs, sizeof(*id));
 if (!id) {
 error_setg(errp, "Cannot allocate buffer for identify response");
 goto out;
@@ -531,8 +532,9 @@ static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
 error_setg(errp, "Cannot map buffer for DMA");
 goto out;
 }
-cmd.prp1 = cpu_to_le64(iova);
 
+memset(id, 0, sizeof(*id));
+cmd.prp1 = cpu_to_le64(iova);
if (nvme_cmd_sync(bs, s->queues[QUEUE_INDEX_ADMIN], &cmd)) {
 error_setg(errp, "Failed to identify controller");
 goto out;
@@ -1282,11 +1284,11 @@ static int coroutine_fn nvme_co_pdiscard(BlockDriverState *bs,
 
 assert(s->nr_queues > 1);
 
-buf = qemu_try_blockalign0(bs, s->page_size);
+buf = qemu_try_blockalign(bs, s->page_size);
 if (!buf) {
 return -ENOMEM;
 }
-
+memset(buf, 0, s->page_size);
 buf->nlb = cpu_to_le32(bytes >> s->blkshift);
 buf->slba = cpu_to_le64(offset >> s->blkshift);
 buf->cattr = 0;
-- 
2.21.3




[PATCH v2 09/12] block/nvme: Simplify nvme_init_queue() arguments

2020-06-30 Thread Philippe Mathieu-Daudé
nvme_init_queue() doesn't require BlockDriverState anymore.
Replace it by BDRVNVMeState to simplify.

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Philippe Mathieu-Daudé 
---
 block/nvme.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index 8b4d957a8e..c28c08b3e3 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -165,10 +165,9 @@ static QemuOptsList runtime_opts = {
 },
 };
 
-static void nvme_init_queue(BlockDriverState *bs, NVMeQueue *q,
+static void nvme_init_queue(BDRVNVMeState *s, NVMeQueue *q,
 int nentries, int entry_bytes, Error **errp)
 {
-BDRVNVMeState *s = bs->opaque;
 size_t bytes;
 int r;
 
@@ -251,14 +250,14 @@ static NVMeQueuePair *nvme_create_queue_pair(BlockDriverState *bs,
 req->prp_list_iova = prp_list_iova + i * s->page_size;
 }
 
-nvme_init_queue(bs, &q->sq, size, NVME_SQ_ENTRY_BYTES, &local_err);
+nvme_init_queue(s, &q->sq, size, NVME_SQ_ENTRY_BYTES, &local_err);
 if (local_err) {
     error_propagate(errp, local_err);
     goto fail;
 }
 q->sq.doorbell = &s->regs->doorbells[idx * 2 * s->doorbell_scale];
 
-nvme_init_queue(bs, &q->cq, size, NVME_CQ_ENTRY_BYTES, &local_err);
+nvme_init_queue(s, &q->cq, size, NVME_CQ_ENTRY_BYTES, &local_err);
 if (local_err) {
 error_propagate(errp, local_err);
 goto fail;
-- 
2.21.3




[PATCH v2 10/12] block/nvme: Replace BDRV_POLL_WHILE by AIO_WAIT_WHILE

2020-06-30 Thread Philippe Mathieu-Daudé
BDRV_POLL_WHILE() is defined as:

  #define BDRV_POLL_WHILE(bs, cond) ({  \
  BlockDriverState *bs_ = (bs); \
  AIO_WAIT_WHILE(bdrv_get_aio_context(bs_), \
 cond); })

As we will remove the BlockDriverState use in the next commit,
start by using the exploded version of BDRV_POLL_WHILE().

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Philippe Mathieu-Daudé 
---
 block/nvme.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/nvme.c b/block/nvme.c
index c28c08b3e3..010286e8ad 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -493,6 +493,7 @@ static void nvme_cmd_sync_cb(void *opaque, int ret)
 static int nvme_cmd_sync(BlockDriverState *bs, NVMeQueuePair *q,
  NvmeCmd *cmd)
 {
+AioContext *aio_context = bdrv_get_aio_context(bs);
 NVMeRequest *req;
 int ret = -EINPROGRESS;
 req = nvme_get_free_req(q);
@@ -501,7 +502,7 @@ static int nvme_cmd_sync(BlockDriverState *bs, NVMeQueuePair *q,
  }
 nvme_submit_command(q, req, cmd, nvme_cmd_sync_cb, &ret);
 
-BDRV_POLL_WHILE(bs, ret == -EINPROGRESS);
+AIO_WAIT_WHILE(aio_context, ret == -EINPROGRESS);
 return ret;
 }
 
-- 
2.21.3




[PATCH v2 12/12] block/nvme: Use per-queue AIO context

2020-06-30 Thread Philippe Mathieu-Daudé
To be able to use multiple queues on the same hardware,
we need to have each queue able to receive IRQ notifications
in the correct AIO context.
The AIO context and the notification handler have to be specific
to each queue, not to the block driver. Move aio_context and
irq_notifier from BDRVNVMeState to NVMeQueuePair.
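The mechanism that makes this move work is that the IRQ handler only receives the notifier pointer, so embedding the notifier in the per-queue structure lets the handler recover *its own* queue via container_of(), as nvme_handle_event() does after this patch. A minimal sketch of that pattern (the struct and handler below are simplified stand-ins, not the real QEMU types):

```c
#include <stddef.h>

/* Simplified container_of(), as found in QEMU's osdep.h. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

typedef struct { int dummy_fd; } EventNotifier;

typedef struct NVMeQueuePair {
    int index;
    EventNotifier irq_notifier;  /* moved here from BDRVNVMeState */
} NVMeQueuePair;

/* The handler gets only the notifier; because the notifier is embedded
 * per queue, container_of() now yields the owning queue rather than a
 * single global driver state. */
static int handler_queue_index(EventNotifier *n)
{
    NVMeQueuePair *q = container_of(n, NVMeQueuePair, irq_notifier);
    return q->index;
}
```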

Signed-off-by: Philippe Mathieu-Daudé 
---
Since v1: Moved irq_notifier to NVMeQueuePair
---
 block/nvme.c | 71 +++-
 1 file changed, 37 insertions(+), 34 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index 90b2e00e8d..e7b9ecec41 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -60,6 +60,8 @@ typedef struct {
 
 typedef struct {
 QemuMutex   lock;
+AioContext *aio_context;
+EventNotifier irq_notifier;
 
 /* Read from I/O code path, initialized under BQL */
 BDRVNVMeState   *s;
@@ -107,7 +109,6 @@ QEMU_BUILD_BUG_ON(offsetof(NVMeRegs, doorbells) != 0x1000);
 #define QUEUE_INDEX_IO(n)   (1 + n)
 
 struct BDRVNVMeState {
-AioContext *aio_context;
 QEMUVFIOState *vfio;
 NVMeRegs *regs;
 /* The submission/completion queue pairs.
@@ -120,7 +121,6 @@ struct BDRVNVMeState {
 /* How many uint32_t elements does each doorbell entry take. */
 size_t doorbell_scale;
 bool write_cache_supported;
-EventNotifier irq_notifier;
 
 uint64_t nsze; /* Namespace size reported by identify command */
 int nsid;  /* The namespace id to read/write data. */
@@ -227,11 +227,17 @@ static NVMeQueuePair *nvme_create_queue_pair(BDRVNVMeState *s,
 if (!q->prp_list_pages) {
 goto fail;
 }
+r = event_notifier_init(&q->irq_notifier, 0);
+if (r) {
+error_setg(errp, "Failed to init event notifier");
+goto fail;
+}
 memset(q->prp_list_pages, 0, s->page_size * NVME_QUEUE_SIZE);
 qemu_mutex_init(&q->lock);
 q->s = s;
 q->index = idx;
 qemu_co_queue_init(&q->free_req_queue);
+q->aio_context = aio_context;
 q->completion_bh = aio_bh_new(aio_context, nvme_process_completion_bh, q);
 r = qemu_vfio_dma_map(s->vfio, q->prp_list_pages,
   s->page_size * NVME_NUM_REQS,
@@ -325,7 +331,7 @@ static void nvme_put_free_req_locked(NVMeQueuePair *q, NVMeRequest *req)
 static void nvme_wake_free_req_locked(NVMeQueuePair *q)
 {
if (!qemu_co_queue_empty(&q->free_req_queue)) {
-replay_bh_schedule_oneshot_event(q->s->aio_context,
+replay_bh_schedule_oneshot_event(q->aio_context,
 nvme_free_req_queue_cb, q);
 }
 }
@@ -492,7 +498,6 @@ static void nvme_cmd_sync_cb(void *opaque, int ret)
 static int nvme_cmd_sync(BlockDriverState *bs, NVMeQueuePair *q,
  NvmeCmd *cmd)
 {
-AioContext *aio_context = bdrv_get_aio_context(bs);
 NVMeRequest *req;
 int ret = -EINPROGRESS;
 req = nvme_get_free_req(q);
@@ -501,7 +506,7 @@ static int nvme_cmd_sync(BlockDriverState *bs, NVMeQueuePair *q,
  }
 nvme_submit_command(q, req, cmd, nvme_cmd_sync_cb, &ret);
 
-AIO_WAIT_WHILE(aio_context, ret == -EINPROGRESS);
+AIO_WAIT_WHILE(q->aio_context, ret == -EINPROGRESS);
 return ret;
 }
 
@@ -621,14 +626,16 @@ static bool nvme_poll_queues(BDRVNVMeState *s)
 
 static void nvme_handle_event(EventNotifier *n)
 {
-BDRVNVMeState *s = container_of(n, BDRVNVMeState, irq_notifier);
+NVMeQueuePair *q = container_of(n, NVMeQueuePair, irq_notifier);
+BDRVNVMeState *s = q->s;
 
 trace_nvme_handle_event(s);
 event_notifier_test_and_clear(n);
 nvme_poll_queues(s);
 }
 
-static bool nvme_add_io_queue(BlockDriverState *bs, Error **errp)
+static bool nvme_add_io_queue(BlockDriverState *bs,
+  AioContext *aio_context, Error **errp)
 {
 BDRVNVMeState *s = bs->opaque;
 int n = s->nr_queues;
@@ -636,8 +643,7 @@ static bool nvme_add_io_queue(BlockDriverState *bs, Error **errp)
 NvmeCmd cmd;
 int queue_size = NVME_QUEUE_SIZE;
 
-q = nvme_create_queue_pair(s, bdrv_get_aio_context(bs),
-   n, queue_size, errp);
+q = nvme_create_queue_pair(s, aio_context, n, queue_size, errp);
 if (!q) {
 return false;
 }
@@ -672,7 +678,8 @@ static bool nvme_add_io_queue(BlockDriverState *bs, Error **errp)
 static bool nvme_poll_cb(void *opaque)
 {
 EventNotifier *e = opaque;
-BDRVNVMeState *s = container_of(e, BDRVNVMeState, irq_notifier);
+NVMeQueuePair *q = container_of(e, NVMeQueuePair, irq_notifier);
+BDRVNVMeState *s = q->s;
 
 trace_nvme_poll_cb(s);
 return nvme_poll_queues(s);
@@ -693,12 +700,6 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
qemu_co_queue_init(&s->dma_flush_queue);
 s->device = g_strdup(device);
 s->nsid = namespace;
-s->aio_context = bdrv_get_aio_context(bs);
-ret = event_notifier_init(&s->irq_notifier, 0);
-if (ret) {
-error_setg(errp, "Failed to init event notifier");
-return ret;
-

[PATCH v2 05/12] block/nvme: Rename local variable

2020-06-30 Thread Philippe Mathieu-Daudé
We are going to modify the code in the next commit. Renaming
the 'resp' variable to 'id' first makes the next commit easier
to review. No logical changes.

Signed-off-by: Philippe Mathieu-Daudé 
---
 block/nvme.c | 19 +--
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index 28762d7ee8..b9760ff04f 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -510,8 +510,8 @@ static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
 BDRVNVMeState *s = bs->opaque;
 NvmeIdCtrl *idctrl;
 NvmeIdNs *idns;
+uint8_t *id;
 NvmeLBAF *lbaf;
-uint8_t *resp;
 uint16_t oncs;
 int r;
 uint64_t iova;
@@ -520,14 +520,14 @@ static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
 .cdw10 = cpu_to_le32(0x1),
 };
 
-resp = qemu_try_blockalign0(bs, sizeof(NvmeIdCtrl));
-if (!resp) {
+id = qemu_try_blockalign0(bs, sizeof(NvmeIdCtrl));
+if (!id) {
 error_setg(errp, "Cannot allocate buffer for identify response");
 goto out;
 }
-idctrl = (NvmeIdCtrl *)resp;
-idns = (NvmeIdNs *)resp;
-r = qemu_vfio_dma_map(s->vfio, resp, sizeof(NvmeIdCtrl), true, &iova);
+idctrl = (NvmeIdCtrl *)id;
+idns = (NvmeIdNs *)id;
+r = qemu_vfio_dma_map(s->vfio, id, sizeof(NvmeIdCtrl), true, &iova);
 if (r) {
 error_setg(errp, "Cannot map buffer for DMA");
 goto out;
@@ -554,8 +554,7 @@ static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
 s->supports_write_zeroes = !!(oncs & NVME_ONCS_WRITE_ZEROS);
 s->supports_discard = !!(oncs & NVME_ONCS_DSM);
 
-memset(resp, 0, 4096);
-
+memset(id, 0, 4096);
 cmd.cdw10 = 0;
 cmd.nsid = cpu_to_le32(namespace);
if (nvme_cmd_sync(bs, s->queues[QUEUE_INDEX_ADMIN], &cmd)) {
@@ -587,8 +586,8 @@ static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
 
 s->blkshift = lbaf->ds;
 out:
-qemu_vfio_dma_unmap(s->vfio, resp);
-qemu_vfree(resp);
+qemu_vfio_dma_unmap(s->vfio, id);
+qemu_vfree(id);
 }
 
 static bool nvme_poll_queues(BDRVNVMeState *s)
-- 
2.21.3




[PATCH v2 06/12] block/nvme: Use union of NvmeIdCtrl / NvmeIdNs structures

2020-06-30 Thread Philippe Mathieu-Daudé
We allocate a unique chunk of memory, then use it for two
different structures. By using a union, we make it clear that
the data is overlapping (and we can remove the casts).
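The payoff of the union is that one allocation serves both identify commands while the compiler tracks both views, with no pointer casts. A minimal sketch with simplified stand-ins for NvmeIdCtrl/NvmeIdNs (the field names and 4KiB sizes below are illustrative, not the full spec structures):

```c
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins: two 4KiB identify data structures that the
 * controller returns into the same buffer for different commands. */
typedef struct { uint32_t nn; uint8_t pad[4092]; } IdCtrl;
typedef struct { uint64_t nsze; uint8_t pad[4088]; } IdNs;

/* One allocation, two overlapping views: sizeof the union is the size
 * of its largest member, so a single buffer serves both commands. */
typedef union {
    IdCtrl ctrl;
    IdNs ns;
} Id;

static uint64_t read_nsze(Id *id)
{
    /* No (NvmeIdNs *) cast needed, unlike the previous code. */
    return id->ns.nsze;
}
```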

Suggested-by: Stefan Hajnoczi 
Signed-off-by: Philippe Mathieu-Daudé 
---
 block/nvme.c | 31 +++
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index b9760ff04f..7ebd5be1f3 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -508,9 +508,10 @@ static int nvme_cmd_sync(BlockDriverState *bs, NVMeQueuePair *q,
 static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
 {
 BDRVNVMeState *s = bs->opaque;
-NvmeIdCtrl *idctrl;
-NvmeIdNs *idns;
-uint8_t *id;
+union {
+NvmeIdCtrl ctrl;
+NvmeIdNs ns;
+} *id;
 NvmeLBAF *lbaf;
 uint16_t oncs;
 int r;
@@ -520,14 +521,12 @@ static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
 .cdw10 = cpu_to_le32(0x1),
 };
 
-id = qemu_try_blockalign0(bs, sizeof(NvmeIdCtrl));
+id = qemu_try_blockalign0(bs, sizeof(*id));
 if (!id) {
 error_setg(errp, "Cannot allocate buffer for identify response");
 goto out;
 }
-idctrl = (NvmeIdCtrl *)id;
-idns = (NvmeIdNs *)id;
-r = qemu_vfio_dma_map(s->vfio, id, sizeof(NvmeIdCtrl), true, &iova);
+r = qemu_vfio_dma_map(s->vfio, id, sizeof(*id), true, &iova);
 if (r) {
 error_setg(errp, "Cannot map buffer for DMA");
 goto out;
@@ -539,22 +538,22 @@ static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
 goto out;
 }
 
-if (le32_to_cpu(idctrl->nn) < namespace) {
+if (le32_to_cpu(id->ctrl.nn) < namespace) {
 error_setg(errp, "Invalid namespace");
 goto out;
 }
-s->write_cache_supported = le32_to_cpu(idctrl->vwc) & 0x1;
-s->max_transfer = (idctrl->mdts ? 1 << idctrl->mdts : 0) * s->page_size;
+s->write_cache_supported = le32_to_cpu(id->ctrl.vwc) & 0x1;
+s->max_transfer = (id->ctrl.mdts ? 1 << id->ctrl.mdts : 0) * s->page_size;
 /* For now the page list buffer per command is one page, to hold at most
  * s->page_size / sizeof(uint64_t) entries. */
 s->max_transfer = MIN_NON_ZERO(s->max_transfer,
   s->page_size / sizeof(uint64_t) * s->page_size);
 
-oncs = le16_to_cpu(idctrl->oncs);
+oncs = le16_to_cpu(id->ctrl.oncs);
 s->supports_write_zeroes = !!(oncs & NVME_ONCS_WRITE_ZEROS);
 s->supports_discard = !!(oncs & NVME_ONCS_DSM);
 
-memset(id, 0, 4096);
+memset(id, 0, sizeof(*id));
 cmd.cdw10 = 0;
 cmd.nsid = cpu_to_le32(namespace);
if (nvme_cmd_sync(bs, s->queues[QUEUE_INDEX_ADMIN], &cmd)) {
@@ -562,11 +561,11 @@ static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
 goto out;
 }
 
-s->nsze = le64_to_cpu(idns->nsze);
-lbaf = &idns->lbaf[NVME_ID_NS_FLBAS_INDEX(idns->flbas)];
+s->nsze = le64_to_cpu(id->ns.nsze);
+lbaf = &id->ns.lbaf[NVME_ID_NS_FLBAS_INDEX(id->ns.flbas)];
 
-if (NVME_ID_NS_DLFEAT_WRITE_ZEROES(idns->dlfeat) &&
-NVME_ID_NS_DLFEAT_READ_BEHAVIOR(idns->dlfeat) ==
+if (NVME_ID_NS_DLFEAT_WRITE_ZEROES(id->ns.dlfeat) &&
+NVME_ID_NS_DLFEAT_READ_BEHAVIOR(id->ns.dlfeat) ==
 NVME_ID_NS_DLFEAT_READ_BEHAVIOR_ZEROES) {
 bs->supported_write_flags |= BDRV_REQ_MAY_UNMAP;
 }
-- 
2.21.3




[PATCH v2 08/12] block/nvme: Replace qemu_try_blockalign(bs) by qemu_try_memalign(pg_sz)

2020-06-30 Thread Philippe Mathieu-Daudé
qemu_try_blockalign() is a generic API that calls back into the
block driver to return its page alignment. As we call it from
within the very same driver, we already know the page alignment
stored in our state. Remove the indirection and use the value from
BDRVNVMeState.
This change is required to later remove the BlockDriverState
argument, to make nvme_init_queue() per hardware, and not per
block driver.
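Under the hood this boils down to page-aligned allocation with a known alignment; a sketch of the replacement pattern, using POSIX posix_memalign() in place of QEMU's qemu_try_memalign() (try_memalign below is a stand-in, not the QEMU symbol):

```c
#define _POSIX_C_SOURCE 200112L
#include <stdlib.h>
#include <stdint.h>

/* Stand-in for qemu_try_memalign(): the caller passes the page size it
 * already knows (s->page_size in BDRVNVMeState) instead of asking the
 * block layer for it; NULL is returned on failure rather than aborting. */
static void *try_memalign(size_t alignment, size_t size)
{
    void *buf = NULL;
    if (posix_memalign(&buf, alignment, size) != 0) {
        return NULL;
    }
    return buf;
}
```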

Signed-off-by: Philippe Mathieu-Daudé 
---
 block/nvme.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index 5b0bb9a8d7..8b4d957a8e 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -174,7 +174,7 @@ static void nvme_init_queue(BlockDriverState *bs, NVMeQueue *q,
 
 bytes = ROUND_UP(nentries * entry_bytes, s->page_size);
 q->head = q->tail = 0;
-q->queue = qemu_try_blockalign(bs, bytes);
+q->queue = qemu_try_memalign(s->page_size, bytes);
 if (!q->queue) {
 error_setg(errp, "Cannot allocate queue");
 return;
@@ -223,7 +223,7 @@ static NVMeQueuePair *nvme_create_queue_pair(BlockDriverState *bs,
 if (!q) {
 return NULL;
 }
-q->prp_list_pages = qemu_try_blockalign(bs,
+q->prp_list_pages = qemu_try_memalign(s->page_size,
   s->page_size * NVME_QUEUE_SIZE);
 if (!q->prp_list_pages) {
 goto fail;
@@ -522,7 +522,7 @@ static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
 .cdw10 = cpu_to_le32(0x1),
 };
 
-id = qemu_try_blockalign(bs, sizeof(*id));
+id = qemu_try_memalign(s->page_size, sizeof(*id));
 if (!id) {
 error_setg(errp, "Cannot allocate buffer for identify response");
 goto out;
@@ -1140,7 +1140,7 @@ static int nvme_co_prw(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
 return nvme_co_prw_aligned(bs, offset, bytes, qiov, is_write, flags);
 }
 trace_nvme_prw_buffered(s, offset, bytes, qiov->niov, is_write);
-buf = qemu_try_blockalign(bs, bytes);
+buf = qemu_try_memalign(s->page_size, bytes);
 
 if (!buf) {
 return -ENOMEM;
@@ -1284,7 +1284,7 @@ static int coroutine_fn nvme_co_pdiscard(BlockDriverState *bs,
 
 assert(s->nr_queues > 1);
 
-buf = qemu_try_blockalign(bs, s->page_size);
+buf = qemu_try_memalign(s->page_size, s->page_size);
 if (!buf) {
 return -ENOMEM;
 }
-- 
2.21.3




[PATCH v2 03/12] block/nvme: Let nvme_create_queue_pair() fail gracefully

2020-06-30 Thread Philippe Mathieu-Daudé
As nvme_create_queue_pair() is allowed to fail, replace the
alloc() calls by try_alloc() to avoid aborting QEMU.
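The pattern is the usual one for fallible constructors: try-variants that return NULL instead of aborting inside the allocator, with a single unwind path. A generic sketch with libc allocators standing in for g_try_new0()/qemu_try_blockalign0() (QueuePair and create_queue_pair are illustrative names):

```c
#include <stdlib.h>

typedef struct {
    void *prp_list_pages;
} QueuePair;

/* Return NULL on any allocation failure instead of aborting QEMU;
 * partially-built state is released via a single fail path. */
static QueuePair *create_queue_pair(size_t bytes)
{
    QueuePair *q = calloc(1, sizeof(*q));
    if (!q) {
        return NULL;
    }
    q->prp_list_pages = malloc(bytes);
    if (!q->prp_list_pages) {
        goto fail;
    }
    return q;
fail:
    free(q);
    return NULL;
}
```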

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Philippe Mathieu-Daudé 
---
 block/nvme.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index 8c30a5fee2..e1893b4e79 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -213,14 +213,22 @@ static NVMeQueuePair *nvme_create_queue_pair(BlockDriverState *bs,
 int i, r;
 BDRVNVMeState *s = bs->opaque;
 Error *local_err = NULL;
-NVMeQueuePair *q = g_new0(NVMeQueuePair, 1);
+NVMeQueuePair *q;
 uint64_t prp_list_iova;
 
+q = g_try_new0(NVMeQueuePair, 1);
+if (!q) {
+return NULL;
+}
+q->prp_list_pages = qemu_try_blockalign0(bs,
+  s->page_size * NVME_QUEUE_SIZE);
+if (!q->prp_list_pages) {
+goto fail;
+}
qemu_mutex_init(&q->lock);
 q->s = s;
 q->index = idx;
 qemu_co_queue_init(&q->free_req_queue);
-q->prp_list_pages = qemu_blockalign0(bs, s->page_size * NVME_NUM_REQS);
 q->completion_bh = aio_bh_new(bdrv_get_aio_context(bs),
   nvme_process_completion_bh, q);
 r = qemu_vfio_dma_map(s->vfio, q->prp_list_pages,
-- 
2.21.3




[PATCH v2 04/12] block/nvme: Define QUEUE_INDEX macros to ease code review

2020-06-30 Thread Philippe Mathieu-Daudé
Use definitions instead of '0' or '1' indexes. This will also
be useful when using multiple queues later.
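The two macros this patch introduces encode the NVMe queue layout: the admin queue pair occupies slot 0 and I/O queue pair n lives at slot 1 + n, so intent becomes visible at every call site:

```c
/* Queue index layout from this patch: admin queue first, then the
 * I/O queues, so s->queues[QUEUE_INDEX_IO(0)] is the first I/O queue. */
#define QUEUE_INDEX_ADMIN   0
#define QUEUE_INDEX_IO(n)   (1 + n)
```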

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Philippe Mathieu-Daudé 
---
 block/nvme.c | 33 +++--
 1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index e1893b4e79..28762d7ee8 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -103,6 +103,9 @@ typedef volatile struct {
 
 QEMU_BUILD_BUG_ON(offsetof(NVMeRegs, doorbells) != 0x1000);
 
+#define QUEUE_INDEX_ADMIN   0
+#define QUEUE_INDEX_IO(n)   (1 + n)
+
 struct BDRVNVMeState {
 AioContext *aio_context;
 QEMUVFIOState *vfio;
@@ -531,7 +534,7 @@ static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
 }
 cmd.prp1 = cpu_to_le64(iova);
 
-if (nvme_cmd_sync(bs, s->queues[0], &cmd)) {
+if (nvme_cmd_sync(bs, s->queues[QUEUE_INDEX_ADMIN], &cmd)) {
 error_setg(errp, "Failed to identify controller");
 goto out;
 }
@@ -555,7 +558,7 @@ static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
 
 cmd.cdw10 = 0;
 cmd.nsid = cpu_to_le32(namespace);
-if (nvme_cmd_sync(bs, s->queues[0], &cmd)) {
+if (nvme_cmd_sync(bs, s->queues[QUEUE_INDEX_ADMIN], &cmd)) {
 error_setg(errp, "Failed to identify namespace");
 goto out;
 }
@@ -644,7 +647,7 @@ static bool nvme_add_io_queue(BlockDriverState *bs, Error **errp)
 .cdw10 = cpu_to_le32(((queue_size - 1) << 16) | (n & 0xFFFF)),
 .cdw11 = cpu_to_le32(0x3),
 };
-if (nvme_cmd_sync(bs, s->queues[0], &cmd)) {
+if (nvme_cmd_sync(bs, s->queues[QUEUE_INDEX_ADMIN], &cmd)) {
 error_setg(errp, "Failed to create io queue [%d]", n);
 nvme_free_queue_pair(q);
 return false;
@@ -655,7 +658,7 @@ static bool nvme_add_io_queue(BlockDriverState *bs, Error **errp)
 .cdw10 = cpu_to_le32(((queue_size - 1) << 16) | (n & 0xFFFF)),
 .cdw11 = cpu_to_le32(0x1 | (n << 16)),
 };
-if (nvme_cmd_sync(bs, s->queues[0], &cmd)) {
+if (nvme_cmd_sync(bs, s->queues[QUEUE_INDEX_ADMIN], &cmd)) {
 error_setg(errp, "Failed to create io queue [%d]", n);
 nvme_free_queue_pair(q);
 return false;
@@ -739,16 +742,18 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
 
 /* Set up admin queue. */
 s->queues = g_new(NVMeQueuePair *, 1);
-s->queues[0] = nvme_create_queue_pair(bs, 0, NVME_QUEUE_SIZE, errp);
-if (!s->queues[0]) {
+s->queues[QUEUE_INDEX_ADMIN] = nvme_create_queue_pair(bs, 0,
+  NVME_QUEUE_SIZE,
+  errp);
+if (!s->queues[QUEUE_INDEX_ADMIN]) {
 ret = -EINVAL;
 goto out;
 }
 s->nr_queues = 1;
 QEMU_BUILD_BUG_ON(NVME_QUEUE_SIZE & 0xF000);
 s->regs->aqa = cpu_to_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE);
-s->regs->asq = cpu_to_le64(s->queues[0]->sq.iova);
-s->regs->acq = cpu_to_le64(s->queues[0]->cq.iova);
+s->regs->asq = cpu_to_le64(s->queues[QUEUE_INDEX_ADMIN]->sq.iova);
+s->regs->acq = cpu_to_le64(s->queues[QUEUE_INDEX_ADMIN]->cq.iova);
 
 /* After setting up all control registers we can enable device now. */
 s->regs->cc = cpu_to_le32((ctz32(NVME_CQ_ENTRY_BYTES) << 20) |
@@ -839,7 +844,7 @@ static int nvme_enable_disable_write_cache(BlockDriverState *bs, bool enable,
 .cdw11 = cpu_to_le32(enable ? 0x01 : 0x00),
 };
 
-ret = nvme_cmd_sync(bs, s->queues[0], &cmd);
+ret = nvme_cmd_sync(bs, s->queues[QUEUE_INDEX_ADMIN], &cmd);
 if (ret) {
 error_setg(errp, "Failed to configure NVMe write cache");
 }
@@ -1056,7 +1061,7 @@ static coroutine_fn int nvme_co_prw_aligned(BlockDriverState *bs,
 {
 int r;
 BDRVNVMeState *s = bs->opaque;
-NVMeQueuePair *ioq = s->queues[1];
+NVMeQueuePair *ioq = s->queues[QUEUE_INDEX_IO(0)];
 NVMeRequest *req;
 
uint32_t cdw12 = (((bytes >> s->blkshift) - 1) & 0xFFFF) |
@@ -1171,7 +1176,7 @@ static coroutine_fn int nvme_co_pwritev(BlockDriverState *bs,
 static coroutine_fn int nvme_co_flush(BlockDriverState *bs)
 {
 BDRVNVMeState *s = bs->opaque;
-NVMeQueuePair *ioq = s->queues[1];
+NVMeQueuePair *ioq = s->queues[QUEUE_INDEX_IO(0)];
 NVMeRequest *req;
 NvmeCmd cmd = {
 .opcode = NVME_CMD_FLUSH,
@@ -1202,7 +1207,7 @@ static coroutine_fn int nvme_co_pwrite_zeroes(BlockDriverState *bs,
   BdrvRequestFlags flags)
 {
 BDRVNVMeState *s = bs->opaque;
-NVMeQueuePair *ioq = s->queues[1];
+NVMeQueuePair *ioq = s->queues[QUEUE_INDEX_IO(0)];
 NVMeRequest *req;
 
uint32_t cdw12 = ((bytes >> s->blkshift) - 1) & 0xFFFF;
@@ -1255,7 +1260,7 @@ static int coroutine_fn nvme_co_pdiscard(BlockDriverState *bs,
  int bytes)
 {
 BDRVNVMeState *s = bs->opaque;
-NVMeQueuePair *ioq = s->queues[1];

[PATCH v2 00/12] block/nvme: Various cleanups required to use multiple queues

2020-06-30 Thread Philippe Mathieu-Daudé
Hi,

This series is mostly code rearrangement (cleanups) to be
able to split the hardware code from the block driver code,
to be able to use multiple queues on the same hardware, or
multiple block drivers on the same hardware.

Missing review: 1, 2, 5, 6, 8, 12.

Since v1:
- rebased
- use SCALE_MS definition
- added Stefan's R-b
- addressed Stefan's review comments
  - use union { NvmeIdCtrl / NvmeIdNs }
  - move irq_notifier to NVMeQueuePair
  - removed patches depending on "a traceable hardware state
object instead of BDRVNVMeState".

Please review,

Phil.

$ git backport-diff -u v1
Key:
[----] : patches are identical
[####] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/12:[down] 'block/nvme: Replace magic value by SCALE_MS definition'
002/12:[----] [--] 'block/nvme: Avoid further processing if trace event not enabled'
003/12:[0002] [FC] 'block/nvme: Let nvme_create_queue_pair() fail gracefully'
004/12:[----] [-C] 'block/nvme: Define QUEUE_INDEX macros to ease code review'
005/12:[down] 'block/nvme: Rename local variable'
006/12:[down] 'block/nvme: Use union of NvmeIdCtrl / NvmeIdNs structures'
007/12:[0011] [FC] 'block/nvme: Replace qemu_try_blockalign0 by qemu_try_blockalign/memset'
008/12:[0004] [FC] 'block/nvme: Replace qemu_try_blockalign(bs) by qemu_try_memalign(pg_sz)'
009/12:[----] [-C] 'block/nvme: Simplify nvme_init_queue() arguments'
010/12:[----] [-C] 'block/nvme: Replace BDRV_POLL_WHILE by AIO_WAIT_WHILE'
011/12:[0010] [FC] 'block/nvme: Simplify nvme_create_queue_pair() arguments'
012/12:[0056] [FC] 'block/nvme: Use per-queue AIO context'

Philippe Mathieu-Daudé (12):
  block/nvme: Replace magic value by SCALE_MS definition
  block/nvme: Avoid further processing if trace event not enabled
  block/nvme: Let nvme_create_queue_pair() fail gracefully
  block/nvme: Define QUEUE_INDEX macros to ease code review
  block/nvme: Rename local variable
  block/nvme: Use union of NvmeIdCtrl / NvmeIdNs structures
  block/nvme: Replace qemu_try_blockalign0 by qemu_try_blockalign/memset
  block/nvme: Replace qemu_try_blockalign(bs) by
qemu_try_memalign(pg_sz)
  block/nvme: Simplify nvme_init_queue() arguments
  block/nvme: Replace BDRV_POLL_WHILE by AIO_WAIT_WHILE
  block/nvme: Simplify nvme_create_queue_pair() arguments
  block/nvme: Use per-queue AIO context

 block/nvme.c | 186 ---
 1 file changed, 103 insertions(+), 83 deletions(-)

-- 
2.21.3




[PATCH v2 02/12] block/nvme: Avoid further processing if trace event not enabled

2020-06-30 Thread Philippe Mathieu-Daudé
Avoid further processing if TRACE_NVME_SUBMIT_COMMAND_RAW is
not enabled. This is an untested attempt at performance optimization.
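The idea is the standard early-return guard: skip the per-byte formatting loop entirely when nobody is listening. A generic sketch, with a plain flag and counter standing in for QEMU's trace_event_get_state_backends() and trace_nvme_submit_command_raw():

```c
#include <stdbool.h>

/* Stand-in for trace_event_get_state_backends(TRACE_...). */
static bool trace_enabled = false;
/* Counts stand-in calls to the per-byte trace point. */
static int trace_calls;

static void trace_command(const unsigned char *cmd, int len)
{
    if (!trace_enabled) {
        return;  /* no per-byte work when the event is disabled */
    }
    for (int i = 0; i < len; i++) {
        trace_calls++;  /* stands in for trace_nvme_submit_command_raw() */
    }
}
```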

Signed-off-by: Philippe Mathieu-Daudé 
---
 block/nvme.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/nvme.c b/block/nvme.c
index 2f5e3c2adf..8c30a5fee2 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -441,6 +441,9 @@ static void nvme_trace_command(const NvmeCmd *cmd)
 {
 int i;
 
+if (!trace_event_get_state_backends(TRACE_NVME_SUBMIT_COMMAND_RAW)) {
+return;
+}
 for (i = 0; i < 8; ++i) {
 uint8_t *cmdp = (uint8_t *)cmd + i * 8;
 trace_nvme_submit_command_raw(cmdp[0], cmdp[1], cmdp[2], cmdp[3],
-- 
2.21.3




[PATCH v2 01/12] block/nvme: Replace magic value by SCALE_MS definition

2020-06-30 Thread Philippe Mathieu-Daudé
Use self-explicit SCALE_MS definition instead of magic value.
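For context: QEMU clocks tick in nanoseconds, so a millisecond timeout must be scaled by 1,000,000 — the magic value this patch replaces with the named constant from qemu/timer.h. A minimal sketch of the computation (deadline_ns is an illustrative helper, not a QEMU function):

```c
#include <stdint.h>

/* SCALE_MS as defined in QEMU's include/qemu/timer.h: nanoseconds
 * per millisecond. */
#define SCALE_MS 1000000LL

/* Compute an absolute nanosecond deadline from a ms timeout, as
 * nvme_init() does while polling CSTS.RDY. */
static int64_t deadline_ns(int64_t now_ns, int64_t timeout_ms)
{
    return now_ns + timeout_ms * SCALE_MS;
}
```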

Signed-off-by: Philippe Mathieu-Daudé 
---
 block/nvme.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/nvme.c b/block/nvme.c
index 374e268915..2f5e3c2adf 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -715,7 +715,7 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
 /* Reset device to get a clean state. */
 s->regs->cc = cpu_to_le32(le32_to_cpu(s->regs->cc) & 0xFE);
 /* Wait for CSTS.RDY = 0. */
-deadline = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + timeout_ms * 1000000ULL;
+deadline = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + timeout_ms * SCALE_MS;
 while (le32_to_cpu(s->regs->csts) & 0x1) {
 if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
 error_setg(errp, "Timeout while waiting for device to reset (%"
-- 
2.21.3




Re: [PATCH v2 05/18] hw/block/nvme: Introduce the Namespace Types definitions

2020-06-30 Thread Keith Busch
On Tue, Jun 30, 2020 at 10:02:15AM +, Niklas Cassel wrote:
> On Mon, Jun 29, 2020 at 07:12:47PM -0700, Alistair Francis wrote:
> > On Wed, Jun 17, 2020 at 2:47 PM Dmitry Fomichev  
> > wrote:
> > > +uint16_t    ctrlid;
> > 
> > Shouldn't this be CNTID?
> 
> From the NVMe spec:
> https://nvmexpress.org/wp-content/uploads/NVM-Express-1_4-2019.06.10-Ratified.pdf
> 
> Figure 241:
> Controller  Identifier  (CNTID)
> 
> So you are correct, this is the official abbreviation.
> 
> I guess that I tried wanted to keep it in sync with Linux:
> https://github.com/torvalds/linux/blob/master/include/linux/nvme.h#L974
> 
> Which uses ctrlid.
> 
> 
> Looking further at the NVMe spec:
> In Figure 247 (Identify Controller Data Structure) they use other names
> for fields:
> 
> Controller  ID  (CNTLID)
> Controller Attributes (CTRATT)
> 
> I can understand if they want to have unique names for fields, but it
> seems like they have trouble deciding how to abbreviate controller :)
> 
> Personally I think that ctrlid is more obvious that we are talking about
> a controller and not a count. But I'm fine regardless.

They shouldn't have shortened controller to "CNT". For those of us that
can't help but pronounce these as words, that is a vulgarity in English.



Re: [PATCH v3 3/4] hw/block/nvme: Fix pmrmsc register size

2020-06-30 Thread Klaus Jensen
On Jun 30 17:16, Philippe Mathieu-Daudé wrote:
> On 6/30/20 5:10 PM, Andrzej Jakowski wrote:
> > On 6/30/20 4:04 AM, Philippe Mathieu-Daudé wrote:
> >> The Persistent Memory Region Controller Memory Space Control
> >> register is 64-bit wide. See 'Figure 68: Register Definition'
> >> of the 'NVM Express Base Specification Revision 1.4'.
> >>
> >> Fixes: 6cf9413229 ("introduce PMR support from NVMe 1.4 spec")
> >> Reported-by: Klaus Jensen 
> >> Reviewed-by: Klaus Jensen 
> >> Signed-off-by: Philippe Mathieu-Daudé 
> >> ---
> >> Cc: Andrzej Jakowski 
> >> ---
> >>  include/block/nvme.h | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/include/block/nvme.h b/include/block/nvme.h
> >> index 71c5681912..82c384614a 100644
> >> --- a/include/block/nvme.h
> >> +++ b/include/block/nvme.h
> >> @@ -21,7 +21,7 @@ typedef struct QEMU_PACKED NvmeBar {
> >>  uint32_t    pmrsts;
> >>  uint32_t    pmrebs;
> >>  uint32_t    pmrswtp;
> >> -uint32_t    pmrmsc;
> >> +uint64_t    pmrmsc;
> >>  } NvmeBar;
> >>  
> >>  enum NvmeCapShift {
> >> -- 2.21.3
> > 
> > This is good catch, though I wanted to highlight that this will still 
> > need to change as this register is not aligned properly and thus not in
> > compliance with spec.
> 
> I was wondering the odd alignment too. So you are saying at some time
> in the future the spec will be updated to correct the alignment?
> 
> Should we use this instead?
> 
>   uint32_t    pmrmsc;
>  +uint32_t    pmrmsc_upper32; /* the spec defines this, but *
>  + * only the low 32 bits are used */
> 
> Or alternatively an unnamed struct:
> 
>  -uint32_t    pmrmsc;
>  +struct {
>  +uint32_t pmrmsc;
>  +uint32_t pmrmsc_upper32; /* the spec defines this, but *
>  +  * only the low 32 bits are used */
>  +};
> 
> > 
> > Reviewed-by Andrzej Jakowski 
> > 
> 

I'm also not sure what you mean Andrzej. The odd alignment is exactly
what the spec (v1.4) says?
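The "odd alignment" both sides are pointing at can be checked mechanically: per the NVMe 1.4 register map, PMRMSC starts at 0xE14, so a 64-bit field there is only 4-byte aligned and the struct must stay packed. A compile-time sketch of just the PMR block (offsets relative to 0xE00; PmrRegs is an illustrative struct, not the QEMU NvmeBar):

```c
#include <stddef.h>
#include <stdint.h>

/* PMR register block per NVMe 1.4, offsets relative to 0xE00.
 * The packed attribute is required because pmrmsc is 64-bit wide
 * yet sits at a 4-byte-aligned offset. */
struct __attribute__((packed)) PmrRegs {
    uint32_t pmrcap;   /* +0x00 */
    uint32_t pmrctl;   /* +0x04 */
    uint32_t pmrsts;   /* +0x08 */
    uint32_t pmrebs;   /* +0x0C */
    uint32_t pmrswtp;  /* +0x10 */
    uint64_t pmrmsc;   /* +0x14: 64-bit wide, but not 8-byte aligned */
};
```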



Re: [PATCH v2 05/18] hw/block/nvme: Introduce the Namespace Types definitions

2020-06-30 Thread Niklas Cassel
On Tue, Jun 30, 2020 at 06:57:16AM +0200, Klaus Jensen wrote:
> On Jun 18 06:34, Dmitry Fomichev wrote:
> > From: Niklas Cassel 
> > 
> > Define the structures and constants required to implement
> > Namespace Types support.
> > 
> > Signed-off-by: Niklas Cassel 
> > Signed-off-by: Dmitry Fomichev 
> > ---
> >  hw/block/nvme.h  |  3 ++
> >  include/block/nvme.h | 75 +---
> >  2 files changed, 73 insertions(+), 5 deletions(-)
> > 
> > diff --git a/hw/block/nvme.h b/hw/block/nvme.h
> > index 4f0dac39ae..4fd155c409 100644
> > --- a/hw/block/nvme.h
> > +++ b/hw/block/nvme.h
> > @@ -63,6 +63,9 @@ typedef struct NvmeCQueue {
> >  
> >  typedef struct NvmeNamespace {
> >  NvmeIdNs    id_ns;
> > +uint32_t    nsid;
> > +uint8_t     csi;
> > +QemuUUID    uuid;
> >  } NvmeNamespace;
> >  
> >  static inline NvmeLBAF *nvme_ns_lbaf(NvmeNamespace *ns)
> > diff --git a/include/block/nvme.h b/include/block/nvme.h
> > index 6a58bac0c2..5a1e5e137c 100644
> > --- a/include/block/nvme.h
> > +++ b/include/block/nvme.h
> > @@ -50,6 +50,11 @@ enum NvmeCapMask {
> >  CAP_PMR_MASK   = 0x1,
> >  };
> >  
> > +enum NvmeCapCssBits {
> > +CAP_CSS_NVM= 0x01,
> > +CAP_CSS_CSI_SUPP   = 0x40,
> > +};
> > +
> >  #define NVME_CAP_MQES(cap)  (((cap) >> CAP_MQES_SHIFT)   & CAP_MQES_MASK)
> >  #define NVME_CAP_CQR(cap)   (((cap) >> CAP_CQR_SHIFT)& CAP_CQR_MASK)
> >  #define NVME_CAP_AMS(cap)   (((cap) >> CAP_AMS_SHIFT)& CAP_AMS_MASK)
> > @@ -101,6 +106,12 @@ enum NvmeCcMask {
> >  CC_IOCQES_MASK  = 0xf,
> >  };
> >  
> > +enum NvmeCcCss {
> > +CSS_NVM_ONLY= 0,
> > +CSS_ALL_NSTYPES = 6,
> 
> Maybe we could call this CSS_CSI, since it just specifies that one or
> more command sets are supported, not that ALL namespace types are
> supported.

The enum name here is CcCss, so this represents CC.CSS,
which specifies which Command Sets to enable,
not which Command Sets that are supported.

(Supported Command Sets are defined by CAP.CSS and the
I/O Command Set data structure.)

So it indeed says to enable ALL command sets supported by the
controller. (Although for the CSI case, you need to check the
I/O Command Set data structure to know what is actually supported.)


However, I agree, the name CSS_ALL_NSTYPES is a bit misleading.
ALL_SUPPORTED_CSI would have been a more precise name.
However, simply naming it CSS_CSI, like you suggest, is more intuitive,
and is what we use in the Linux kernel patches, so let's use that :)
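To make the CC.CSS vs CAP.CSS distinction above concrete, here is a sketch of the accessors from the patch under discussion, together with the enum values being debated (CC.CSS occupies bits 6:4 of the CC register per the NVMe spec; the shift/mask values are from that layout):

```c
#include <stdint.h>

/* CC.CSS field layout: bits 6:4 of the Controller Configuration
 * register select which supported command set(s) to enable. */
#define CC_CSS_SHIFT 4
#define CC_CSS_MASK  0x7

#define NVME_CC_CSS(cc)  (((cc) >> CC_CSS_SHIFT) & CC_CSS_MASK)
#define NVME_SET_CC_CSS(cc, val) \
    (cc |= (uint32_t)((val) & CC_CSS_MASK) << CC_CSS_SHIFT)

enum NvmeCcCss {
    CSS_NVM_ONLY   = 0,
    CSS_CSI        = 6,  /* enable all supported I/O command sets */
    CSS_ADMIN_ONLY = 7,
};
```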


Kind regards,
Niklas

> 
> Otherwise,
> Reviewed-by: Klaus Jensen 
> 
> > +CSS_ADMIN_ONLY  = 7,
> > +};
> > +
> >  #define NVME_CC_EN(cc) ((cc >> CC_EN_SHIFT) & CC_EN_MASK)
> >  #define NVME_CC_CSS(cc)((cc >> CC_CSS_SHIFT)& CC_CSS_MASK)
> >  #define NVME_CC_MPS(cc)((cc >> CC_MPS_SHIFT)& CC_MPS_MASK)
> > @@ -109,6 +120,21 @@ enum NvmeCcMask {
> >  #define NVME_CC_IOSQES(cc) ((cc >> CC_IOSQES_SHIFT) & CC_IOSQES_MASK)
> >  #define NVME_CC_IOCQES(cc) ((cc >> CC_IOCQES_SHIFT) & CC_IOCQES_MASK)
> >  
> > +#define NVME_SET_CC_EN(cc, val) \
> > +(cc |= (uint32_t)((val) & CC_EN_MASK) << CC_EN_SHIFT)
> > +#define NVME_SET_CC_CSS(cc, val)\
> > +(cc |= (uint32_t)((val) & CC_CSS_MASK) << CC_CSS_SHIFT)
> > +#define NVME_SET_CC_MPS(cc, val)\
> > +(cc |= (uint32_t)((val) & CC_MPS_MASK) << CC_MPS_SHIFT)
> > +#define NVME_SET_CC_AMS(cc, val)\
> > +(cc |= (uint32_t)((val) & CC_AMS_MASK) << CC_AMS_SHIFT)
> > +#define NVME_SET_CC_SHN(cc, val)\
> > +(cc |= (uint32_t)((val) & CC_SHN_MASK) << CC_SHN_SHIFT)
> > +#define NVME_SET_CC_IOSQES(cc, val) \
> > +(cc |= (uint32_t)((val) & CC_IOSQES_MASK) << CC_IOSQES_SHIFT)
> > +#define NVME_SET_CC_IOCQES(cc, val) \
> > +(cc |= (uint32_t)((val) & CC_IOCQES_MASK) << CC_IOCQES_SHIFT)
> > +
> >  enum NvmeCstsShift {
> >  CSTS_RDY_SHIFT  = 0,
> >  CSTS_CFS_SHIFT  = 1,
> > @@ -482,10 +508,41 @@ typedef struct NvmeIdentify {
> >  uint64_trsvd2[2];
> >  uint64_tprp1;
> >  uint64_tprp2;
> > -uint32_tcns;
> > -uint32_trsvd11[5];
> > +uint8_t cns;
> > +uint8_t rsvd4;
> > +uint16_tctrlid;
> > +uint16_tnvmsetid;
> > +uint8_t rsvd3;
> > +uint8_t csi;
> > +uint32_trsvd12[4];
> >  } NvmeIdentify;
> >  
> > +typedef struct NvmeNsIdDesc {
> > +uint8_t nidt;
> > +uint8_t nidl;
> > +uint16_trsvd2;
> > +} NvmeNsIdDesc;
> > +
> > +enum NvmeNidType {
> > +NVME_NIDT_EUI64 = 0x01,
> > +NVME_NIDT_NGUID = 0x02,
> > +NVME_NIDT_UUID  = 0x03,
> > +NVME_NIDT_CSI   = 0x04,
> > +};
> > +
> > +enum NvmeNidLength {
> > +NVME_NIDL_EUI64 = 8,
> > +NVME_NIDL_NGUID = 16,
> > +NVME_NIDL_UUID  = 16,
> > +NVME_NIDL_CSI   = 1,
> > +};
> > +
> > +enum NvmeCsi {
> > +NVME_CSI_NVM   

Re: [PATCH 00/10] hw/block/nvme: namespace types and zoned namespaces

2020-06-30 Thread Keith Busch
On Tue, Jun 30, 2020 at 04:09:46PM +0200, Philippe Mathieu-Daudé wrote:
> What I see doable for the following days is:
> - hw/block/nvme: Fix I/O BAR structure [3]
> - hw/block/nvme: handle transient dma errors
> - hw/block/nvme: bump to v1.3


These look like sensible patches to rebase future work on, IMO. The 1.3
updates had been prepared a while ago, at least.



Re: [PATCH v3 3/4] hw/block/nvme: Fix pmrmsc register size

2020-06-30 Thread Philippe Mathieu-Daudé
On 6/30/20 5:10 PM, Andrzej Jakowski wrote:
> On 6/30/20 4:04 AM, Philippe Mathieu-Daudé wrote:
>> The Persistent Memory Region Controller Memory Space Control
>> register is 64-bit wide. See 'Figure 68: Register Definition'
>> of the 'NVM Express Base Specification Revision 1.4'.
>>
>> Fixes: 6cf9413229 ("introduce PMR support from NVMe 1.4 spec")
>> Reported-by: Klaus Jensen 
>> Reviewed-by: Klaus Jensen 
>> Signed-off-by: Philippe Mathieu-Daudé 
>> ---
>> Cc: Andrzej Jakowski 
>> ---
>>  include/block/nvme.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/block/nvme.h b/include/block/nvme.h
>> index 71c5681912..82c384614a 100644
>> --- a/include/block/nvme.h
>> +++ b/include/block/nvme.h
>> @@ -21,7 +21,7 @@ typedef struct QEMU_PACKED NvmeBar {
>>  uint32_tpmrsts;
>>  uint32_tpmrebs;
>>  uint32_tpmrswtp;
>> -uint32_tpmrmsc;
>> +uint64_tpmrmsc;
>>  } NvmeBar;
>>  
>>  enum NvmeCapShift {
>> -- 2.21.3
> 
> This is a good catch, though I wanted to highlight that this will still
> need to change, as this register is not aligned properly and thus not in
> compliance with the spec.

I was wondering about the odd alignment too. So you are saying that at some
time in the future the spec will be updated to correct the alignment?

Should we use this instead?

  uint32_t pmrmsc;
 +uint32_t pmrmsc_upper32; /* the spec defines this, but *
 +                         * only the low 32 bits are used */

Or eventually an unnamed struct:

 -uint32_t pmrmsc;
 +struct {
 +    uint32_t pmrmsc;
 +    uint32_t pmrmsc_upper32; /* the spec defines this, but *
 +                              * only the low 32 bits are used */
 +};

> 
> Reviewed-by: Andrzej Jakowski 
> 




Re: [PATCH v3 3/4] hw/block/nvme: Fix pmrmsc register size

2020-06-30 Thread Andrzej Jakowski
On 6/30/20 4:04 AM, Philippe Mathieu-Daudé wrote:
> The Persistent Memory Region Controller Memory Space Control
> register is 64-bit wide. See 'Figure 68: Register Definition'
> of the 'NVM Express Base Specification Revision 1.4'.
> 
> Fixes: 6cf9413229 ("introduce PMR support from NVMe 1.4 spec")
> Reported-by: Klaus Jensen 
> Reviewed-by: Klaus Jensen 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> Cc: Andrzej Jakowski 
> ---
>  include/block/nvme.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/block/nvme.h b/include/block/nvme.h
> index 71c5681912..82c384614a 100644
> --- a/include/block/nvme.h
> +++ b/include/block/nvme.h
> @@ -21,7 +21,7 @@ typedef struct QEMU_PACKED NvmeBar {
>  uint32_tpmrsts;
>  uint32_tpmrebs;
>  uint32_tpmrswtp;
> -uint32_tpmrmsc;
> +uint64_tpmrmsc;
>  } NvmeBar;
>  
>  enum NvmeCapShift {
> -- 2.21.3

This is a good catch, though I wanted to highlight that this will still
need to change, as this register is not aligned properly and thus not in
compliance with the spec.

Reviewed-by: Andrzej Jakowski 



Re: [PATCH 00/10] hw/block/nvme: namespace types and zoned namespaces

2020-06-30 Thread Niklas Cassel
On Tue, Jun 30, 2020 at 12:01:29PM +0200, Klaus Jensen wrote:
> From: Klaus Jensen 
> 
> Hi all,

Hello Klaus,

> 
> This series adds support for TP 4056 ("Namespace Types") and TP 4053
> ("Zoned Namespaces") and is an alternative implementation to the one
> submitted by Dmitry[1].
> 
> While I don't want this to end up as a discussion about the merits of
> each version, I want to point out a couple of differences from Dmitry's
> version. At a glance, my version
> 
>   * builds on my patch series that adds fairly complete NVMe v1.4
> mandatory support, as well as nice-to-have features such as SGLs,
> multiple namespaces and mostly just overall clean up. This finally
> brings the nvme device into a fairly compliant state on which we can
> add new features. I've tried hard to get these compliance and
> clean-up patches merged for a long time (in parallel with developing
> the emulation of NST and ZNS) and I would be really sad to see them
> by-passed since they have already been through many iterations and
> already carry Acked- and Reviewed-by's for the bulk of the
> patches. I think the nvme device is already in a "frankenstate" wrt.
> the implemented nvme version and the features it currently supports,
> so I think this kind of cleanup is long overdue.
> 
>   * uses an attached blockdev and standard blk_aio for persistent zone
> info. This is the same method used in our patches for Write
> Uncorrectable and (separate and extended lba) metadata support, but
> I've left those optional features out for now to ease the review
> process.
> 
>   * relies on the universal dulbe support added in ("hw/block/nvme: add
> support for dulbe") and sparse images for handling reads in gaps
> (above write pointer and below ZSZE); that is - the size of the
> underlying blockdev is in terms of ZSZE, not ZCAP
> 
>   * the controller uses timers to autonomously finish zones (wrt. FRL)

AFAICT, Dmitry's patches do this as well.

> 
> I've been on paternity leave for a month, so I haven't been around to
> review Dmitry's patches, but I have started that process now. I would
> also be happy to work with Dmitry & Friends on merging our versions to
> get the best of both worlds if it makes sense.
> 
> This series and all preparatory patch sets (the ones I've been posting
> yesterday and today) are available on my GitHub[2]. Unfortunately
> Patchew got screwed up in the middle of me sending patches and it never
> picked up v2 of "hw/block/nvme: support multiple namespaces" because it
> was getting late and I made a mistake with the CC's. So my posted series
> don't apply according to Patchew, but they actually do if you follow the
> Based-on's (... or just grab [2]).
> 
> 
>   [1]: Message-Id: <20200617213415.22417-1-dmitry.fomic...@wdc.com>
>   [2]: https://github.com/birkelund/qemu/tree/for-master/nvme
> 
> 
> Based-on: <20200630043122.1307043-1-...@irrelevant.dk>
> ("[PATCH 0/3] hw/block/nvme: bump to v1.4")

Is this the only patch series that this series depends on?

In the beginning of the cover letter, you mentioned
"NVMe v1.4 mandatory support", "SGLs", "multiple namespaces",
and "and mostly just overall clean up".

> 
> Klaus Jensen (10):
>   hw/block/nvme: support I/O Command Sets
>   hw/block/nvme: add zns specific fields and types
>   hw/block/nvme: add basic read/write for zoned namespaces
>   hw/block/nvme: add the zone management receive command
>   hw/block/nvme: add the zone management send command
>   hw/block/nvme: add the zone append command
>   hw/block/nvme: track and enforce zone resources
>   hw/block/nvme: allow open to close transitions by controller
>   hw/block/nvme: allow zone excursions
>   hw/block/nvme: support reset/finish recommended limits
> 
>  block/nvme.c  |6 +-
>  hw/block/nvme-ns.c|  397 +-
>  hw/block/nvme-ns.h|  148 +++-
>  hw/block/nvme.c   | 1676 +++--
>  hw/block/nvme.h   |   76 +-
>  hw/block/trace-events |   43 +-
>  include/block/nvme.h  |  252 ++-
>  7 files changed, 2469 insertions(+), 129 deletions(-)
> 
> -- 
> 2.27.0
> 

I think that you have done a great job getting the NVMe
driver out of a frankenstate, and made it compliant with
a proper spec (NVMe 1.4).

I'm also a big fan of the refactoring so that the driver
handles more than one namespace, and the new bus model.

I know that you first sent your
"nvme: support NVMe v1.3d, SGLs and multiple namespaces"
patch series July, last year.

Looking at your outstanding patch series on patchwork:
https://patchwork.kernel.org/project/qemu-devel/list/?submitter=188679

(Feel free to correct me if I have misunderstood anything.)

I see that these are related to your patch series from July last year:
hw/block/nvme: bump to v1.3
hw/block/nvme: support scatter gather lists
hw/block/nvme: support multiple namespaces
hw/block/nvme: bump to v1.4


This patch series seems minor and could probably 

Re: [PATCH v2 05/18] hw/block/nvme: Introduce the Namespace Types definitions

2020-06-30 Thread Niklas Cassel
On Mon, Jun 29, 2020 at 07:12:47PM -0700, Alistair Francis wrote:
> On Wed, Jun 17, 2020 at 2:47 PM Dmitry Fomichev  wrote:
> >
> > From: Niklas Cassel 
> >
> > Define the structures and constants required to implement
> > Namespace Types support.
> >
> > Signed-off-by: Niklas Cassel 
> > Signed-off-by: Dmitry Fomichev 
> > ---
> >  hw/block/nvme.h  |  3 ++
> >  include/block/nvme.h | 75 +---
> >  2 files changed, 73 insertions(+), 5 deletions(-)
> >
> > diff --git a/hw/block/nvme.h b/hw/block/nvme.h
> > index 4f0dac39ae..4fd155c409 100644
> > --- a/hw/block/nvme.h
> > +++ b/hw/block/nvme.h
> > @@ -63,6 +63,9 @@ typedef struct NvmeCQueue {
> >
> >  typedef struct NvmeNamespace {
> >  NvmeIdNsid_ns;
> > +uint32_tnsid;
> > +uint8_t csi;
> > +QemuUUIDuuid;
> >  } NvmeNamespace;
> >
> >  static inline NvmeLBAF *nvme_ns_lbaf(NvmeNamespace *ns)
> > diff --git a/include/block/nvme.h b/include/block/nvme.h
> > index 6a58bac0c2..5a1e5e137c 100644
> > --- a/include/block/nvme.h
> > +++ b/include/block/nvme.h
> > @@ -50,6 +50,11 @@ enum NvmeCapMask {
> >  CAP_PMR_MASK   = 0x1,
> >  };
> >
> > +enum NvmeCapCssBits {
> > +CAP_CSS_NVM= 0x01,
> > +CAP_CSS_CSI_SUPP   = 0x40,
> > +};
> > +
> >  #define NVME_CAP_MQES(cap)  (((cap) >> CAP_MQES_SHIFT)   & CAP_MQES_MASK)
> >  #define NVME_CAP_CQR(cap)   (((cap) >> CAP_CQR_SHIFT)& CAP_CQR_MASK)
> >  #define NVME_CAP_AMS(cap)   (((cap) >> CAP_AMS_SHIFT)& CAP_AMS_MASK)
> > @@ -101,6 +106,12 @@ enum NvmeCcMask {
> >  CC_IOCQES_MASK  = 0xf,
> >  };
> >
> > +enum NvmeCcCss {
> > +CSS_NVM_ONLY= 0,
> > +CSS_ALL_NSTYPES = 6,
> > +CSS_ADMIN_ONLY  = 7,
> > +};
> > +
> >  #define NVME_CC_EN(cc) ((cc >> CC_EN_SHIFT) & CC_EN_MASK)
> >  #define NVME_CC_CSS(cc)((cc >> CC_CSS_SHIFT)& CC_CSS_MASK)
> >  #define NVME_CC_MPS(cc)((cc >> CC_MPS_SHIFT)& CC_MPS_MASK)
> > @@ -109,6 +120,21 @@ enum NvmeCcMask {
> >  #define NVME_CC_IOSQES(cc) ((cc >> CC_IOSQES_SHIFT) & CC_IOSQES_MASK)
> >  #define NVME_CC_IOCQES(cc) ((cc >> CC_IOCQES_SHIFT) & CC_IOCQES_MASK)
> >
> > +#define NVME_SET_CC_EN(cc, val) \
> > +(cc |= (uint32_t)((val) & CC_EN_MASK) << CC_EN_SHIFT)
> > +#define NVME_SET_CC_CSS(cc, val)\
> > +(cc |= (uint32_t)((val) & CC_CSS_MASK) << CC_CSS_SHIFT)
> > +#define NVME_SET_CC_MPS(cc, val)\
> > +(cc |= (uint32_t)((val) & CC_MPS_MASK) << CC_MPS_SHIFT)
> > +#define NVME_SET_CC_AMS(cc, val)\
> > +(cc |= (uint32_t)((val) & CC_AMS_MASK) << CC_AMS_SHIFT)
> > +#define NVME_SET_CC_SHN(cc, val)\
> > +(cc |= (uint32_t)((val) & CC_SHN_MASK) << CC_SHN_SHIFT)
> > +#define NVME_SET_CC_IOSQES(cc, val) \
> > +(cc |= (uint32_t)((val) & CC_IOSQES_MASK) << CC_IOSQES_SHIFT)
> > +#define NVME_SET_CC_IOCQES(cc, val) \
> > +(cc |= (uint32_t)((val) & CC_IOCQES_MASK) << CC_IOCQES_SHIFT)
> > +
> >  enum NvmeCstsShift {
> >  CSTS_RDY_SHIFT  = 0,
> >  CSTS_CFS_SHIFT  = 1,
> > @@ -482,10 +508,41 @@ typedef struct NvmeIdentify {
> >  uint64_trsvd2[2];
> >  uint64_tprp1;
> >  uint64_tprp2;
> > -uint32_tcns;
> > -uint32_trsvd11[5];
> > +uint8_t cns;
> > +uint8_t rsvd4;
> > +uint16_tctrlid;
> 
> Shouldn't this be CNTID?

From the NVMe spec:
https://nvmexpress.org/wp-content/uploads/NVM-Express-1_4-2019.06.10-Ratified.pdf

Figure 241:
Controller  Identifier  (CNTID)

So you are correct, this is the official abbreviation.

I guess that I wanted to keep it in sync with Linux:
https://github.com/torvalds/linux/blob/master/include/linux/nvme.h#L974

Which uses ctrlid.


Looking further at the NVMe spec:
In Figure 247 (Identify Controller Data Structure) they use other names
for fields:

Controller  ID  (CNTLID)
Controller Attributes (CTRATT)

I can understand if they want to have unique names for fields, but it
seems like they have trouble deciding how to abbreviate controller :)

Personally I think that ctrlid makes it more obvious that we are talking
about a controller and not a count. But I'm fine regardless.


Kind regards,
Niklas

> 
> Alistair
> 
> > +uint16_tnvmsetid;
> > +uint8_t rsvd3;
> > +uint8_t csi;
> > +uint32_trsvd12[4];
> >  } NvmeIdentify;
> >
> > +typedef struct NvmeNsIdDesc {
> > +uint8_t nidt;
> > +uint8_t nidl;
> > +uint16_trsvd2;
> > +} NvmeNsIdDesc;
> > +
> > +enum NvmeNidType {
> > +NVME_NIDT_EUI64 = 0x01,
> > +NVME_NIDT_NGUID = 0x02,
> > +NVME_NIDT_UUID  = 0x03,
> > +NVME_NIDT_CSI   = 0x04,
> > +};
> > +
> > +enum NvmeNidLength {
> > +NVME_NIDL_EUI64 = 8,
> > +NVME_NIDL_NGUID = 16,
> > +NVME_NIDL_UUID  = 16,
> > +NVME_NIDL_CSI   = 1,
> > +};
> > +
> > +enum NvmeCsi {
> > +NVME_CSI_NVM= 0x00,
> > +};
> > +

Re: [PATCH 00/10] hw/block/nvme: namespace types and zoned namespaces

2020-06-30 Thread Philippe Mathieu-Daudé
On 6/30/20 2:59 PM, Niklas Cassel wrote:
> On Tue, Jun 30, 2020 at 12:01:29PM +0200, Klaus Jensen wrote:
>> From: Klaus Jensen 
>>
>> Hi all,
> 
> Hello Klaus,
> 
>>
>> This series adds support for TP 4056 ("Namespace Types") and TP 4053
>> ("Zoned Namespaces") and is an alternative implementation to the one
>> submitted by Dmitry[1].
>>
>> While I don't want this to end up as a discussion about the merits of
>> each version, I want to point out a couple of differences from Dmitry's
>> version. At a glance, my version
>>
>>   * builds on my patch series that adds fairly complete NVMe v1.4
>> mandatory support, as well as nice-to-have features such as SGLs,
>> multiple namespaces and mostly just overall clean up. This finally
>> brings the nvme device into a fairly compliant state on which we can
>> add new features. I've tried hard to get these compliance and
>> clean-up patches merged for a long time (in parallel with developing
>> the emulation of NST and ZNS) and I would be really sad to see them
>> by-passed since they have already been through many iterations and
>> already carry Acked- and Reviewed-by's for the bulk of the
>> patches. I think the nvme device is already in a "frankenstate" wrt.
>> the implemented nvme version and the features it currently supports,
>> so I think this kind of cleanup is long overdue.
>>
>>   * uses an attached blockdev and standard blk_aio for persistent zone
>> info. This is the same method used in our patches for Write
>> Uncorrectable and (separate and extended lba) metadata support, but
>> I've left those optional features out for now to ease the review
>> process.
>>
>>   * relies on the universal dulbe support added in ("hw/block/nvme: add
>> support for dulbe") and sparse images for handling reads in gaps
>> (above write pointer and below ZSZE); that is - the size of the
>> underlying blockdev is in terms of ZSZE, not ZCAP
>>
>>   * the controller uses timers to autonomously finish zones (wrt. FRL)
> 
> AFAICT, Dmitry's patches do this as well.
> 
>>
>> I've been on paternity leave for a month, so I haven't been around to
>> review Dmitry's patches, but I have started that process now. I would
>> also be happy to work with Dmitry & Friends on merging our versions to
>> get the best of both worlds if it makes sense.
>>
>> This series and all preparatory patch sets (the ones I've been posting
>> yesterday and today) are available on my GitHub[2]. Unfortunately
>> Patchew got screwed up in the middle of me sending patches and it never
>> picked up v2 of "hw/block/nvme: support multiple namespaces" because it
>> was getting late and I made a mistake with the CC's. So my posted series
>> don't apply according to Patchew, but they actually do if you follow the
>> Based-on's (... or just grab [2]).
>>
>>
>>   [1]: Message-Id: <20200617213415.22417-1-dmitry.fomic...@wdc.com>
>>   [2]: https://github.com/birkelund/qemu/tree/for-master/nvme
>>
>>
>> Based-on: <20200630043122.1307043-1-...@irrelevant.dk>
>> ("[PATCH 0/3] hw/block/nvme: bump to v1.4")
> 
> Is this the only patch series that this series depends on?
> 
> In the beginning of the cover letter, you mentioned
> "NVMe v1.4 mandatory support", "SGLs", "multiple namespaces",
> and "and mostly just overall clean up".
> 
>>
>> Klaus Jensen (10):
>>   hw/block/nvme: support I/O Command Sets
>>   hw/block/nvme: add zns specific fields and types
>>   hw/block/nvme: add basic read/write for zoned namespaces
>>   hw/block/nvme: add the zone management receive command
>>   hw/block/nvme: add the zone management send command
>>   hw/block/nvme: add the zone append command
>>   hw/block/nvme: track and enforce zone resources
>>   hw/block/nvme: allow open to close transitions by controller
>>   hw/block/nvme: allow zone excursions
>>   hw/block/nvme: support reset/finish recommended limits
>>
>>  block/nvme.c  |6 +-
>>  hw/block/nvme-ns.c|  397 +-
>>  hw/block/nvme-ns.h|  148 +++-
>>  hw/block/nvme.c   | 1676 +++--
>>  hw/block/nvme.h   |   76 +-
>>  hw/block/trace-events |   43 +-
>>  include/block/nvme.h  |  252 ++-
>>  7 files changed, 2469 insertions(+), 129 deletions(-)
>>
>> -- 
>> 2.27.0
>>
> 
> I think that you have done a great job getting the NVMe
> driver out of a frankenstate, and made it compliant with
> a proper spec (NVMe 1.4).
> 
> I'm also a big fan of the refactoring so that the driver
> handles more than one namespace, and the new bus model.
> 
> I know that you first sent your
> "nvme: support NVMe v1.3d, SGLs and multiple namespaces"
> patch series July, last year.
> 
> Looking at your outstanding patch series on patchwork:
> https://patchwork.kernel.org/project/qemu-devel/list/?submitter=188679
> 
> (Feel free to correct me if I have misunderstood anything.)
> 
> I see that these are related to your patch series from July last year:
> hw/block/nvme: bump 

[PATCH v7 15/17] hw/sd/sdcard: Correctly display the command name in trace events

2020-06-30 Thread Philippe Mathieu-Daudé
Some ACMDs were incorrectly displayed. Fix this by remembering
whether we are processing an ACMD (with current_cmd_is_acmd) and
adding the sd_current_cmd_name() helper, which displays the correct
name regardless of whether it is a CMD or an ACMD.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 3e9faa8add..eb549a52e1 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -125,6 +125,7 @@ struct SDState {
 uint8_t pwd[16];
 uint32_t pwd_len;
 uint8_t function_group[6];
+bool current_cmd_is_acmd;
 uint8_t current_cmd;
 /* True if we will handle the next command as an ACMD. Note that this does
  * *not* track the APP_CMD status bit!
@@ -1718,6 +1719,8 @@ int sd_do_command(SDState *sd, SDRequest *req,
   req->cmd);
 req->cmd &= 0x3f;
 }
+sd->current_cmd = req->cmd;
+sd->current_cmd_is_acmd = sd->expecting_acmd;
 
 if (sd->card_status & CARD_IS_LOCKED) {
 if (!cmd_valid_while_locked(sd, req->cmd)) {
@@ -1745,7 +1748,6 @@ int sd_do_command(SDState *sd, SDRequest *req,
 /* Valid command, we can update the 'state before command' bits.
  * (Do this now so they appear in r1 responses.)
  */
-sd->current_cmd = req->cmd;
 sd->card_status &= ~CURRENT_STATE;
 sd->card_status |= (last_state << 9);
 }
@@ -1806,6 +1808,15 @@ send_response:
 return rsplen;
 }
 
+static const char *sd_current_cmd_name(SDState *sd)
+{
+if (sd->current_cmd_is_acmd) {
+return sd_acmd_name(sd->current_cmd);
+} else {
+return sd_cmd_name(sd->current_cmd);
+}
+}
+
 static void sd_blk_read(SDState *sd, uint64_t addr, uint32_t len)
 {
 trace_sdcard_read_block(addr, len);
@@ -1844,7 +1855,7 @@ void sd_write_data(SDState *sd, uint8_t value)
 return;
 
 trace_sdcard_write_data(sd->proto_name,
-sd_acmd_name(sd->current_cmd),
+sd_current_cmd_name(sd),
 sd->current_cmd, value);
 switch (sd->current_cmd) {
 case 24:   /* CMD24:  WRITE_SINGLE_BLOCK */
@@ -1998,7 +2009,7 @@ uint8_t sd_read_data(SDState *sd)
 io_len = (sd->ocr & (1 << 30)) ? HWBLOCK_SIZE : sd->blk_len;
 
 trace_sdcard_read_data(sd->proto_name,
-   sd_acmd_name(sd->current_cmd),
+   sd_current_cmd_name(sd),
sd->current_cmd, io_len);
 switch (sd->current_cmd) {
 case 6:/* CMD6:   SWITCH_FUNCTION */
-- 
2.21.3




[PATCH v7 16/17] hw/sd/sdcard: Display offset in read/write_data() trace events

2020-06-30 Thread Philippe Mathieu-Daudé
Having 'base address' and 'relative offset' displayed
separately is more helpful than the absolute address.

Reviewed-by: Peter Maydell 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 8 
 hw/sd/trace-events | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index eb549a52e1..304fa4143a 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -1855,8 +1855,8 @@ void sd_write_data(SDState *sd, uint8_t value)
 return;
 
 trace_sdcard_write_data(sd->proto_name,
-sd_current_cmd_name(sd),
-sd->current_cmd, value);
+sd_current_cmd_name(sd), sd->current_cmd,
+sd->data_start, sd->data_offset, value);
 switch (sd->current_cmd) {
 case 24:   /* CMD24:  WRITE_SINGLE_BLOCK */
 sd->data[sd->data_offset ++] = value;
@@ -2009,8 +2009,8 @@ uint8_t sd_read_data(SDState *sd)
 io_len = (sd->ocr & (1 << 30)) ? HWBLOCK_SIZE : sd->blk_len;
 
 trace_sdcard_read_data(sd->proto_name,
-   sd_current_cmd_name(sd),
-   sd->current_cmd, io_len);
+   sd_current_cmd_name(sd), sd->current_cmd,
+   sd->data_start, sd->data_offset, io_len);
 switch (sd->current_cmd) {
 case 6:/* CMD6:   SWITCH_FUNCTION */
 ret = sd->data[sd->data_offset ++];
diff --git a/hw/sd/trace-events b/hw/sd/trace-events
index d0cd7c6ec4..946923223b 100644
--- a/hw/sd/trace-events
+++ b/hw/sd/trace-events
@@ -51,8 +51,8 @@ sdcard_lock(void) ""
 sdcard_unlock(void) ""
 sdcard_read_block(uint64_t addr, uint32_t len) "addr 0x%" PRIx64 " size 0x%x"
 sdcard_write_block(uint64_t addr, uint32_t len) "addr 0x%" PRIx64 " size 0x%x"
-sdcard_write_data(const char *proto, const char *cmd_desc, uint8_t cmd, uint8_t value) "%s %20s/ CMD%02d value 0x%02x"
-sdcard_read_data(const char *proto, const char *cmd_desc, uint8_t cmd, uint32_t length) "%s %20s/ CMD%02d len %" PRIu32
+sdcard_write_data(const char *proto, const char *cmd_desc, uint8_t cmd, uint64_t address, uint32_t offset, uint8_t value) "%s %20s/ CMD%02d addr 0x%" PRIx64 " ofs 0x%" PRIx32 " val 0x%02x"
+sdcard_read_data(const char *proto, const char *cmd_desc, uint8_t cmd, uint64_t address, uint32_t offset, uint32_t length) "%s %20s/ CMD%02d addr 0x%" PRIx64 " ofs 0x%" PRIx32 " len %" PRIu32
 sdcard_set_voltage(uint16_t millivolts) "%u mV"
 
 # milkymist-memcard.c
-- 
2.21.3




[PATCH v7 14/17] hw/sd/sdcard: Make iolen unsigned

2020-06-30 Thread Philippe Mathieu-Daudé
From: Philippe Mathieu-Daudé 

I/O request length cannot be negative.

Signed-off-by: Philippe Mathieu-Daudé 
---
v4: Use uint32_t (pm215)
---
 hw/sd/sd.c | 2 +-
 hw/sd/trace-events | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 364a6d1fcd..3e9faa8add 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -1981,7 +1981,7 @@ uint8_t sd_read_data(SDState *sd)
 {
 /* TODO: Append CRCs */
 uint8_t ret;
-int io_len;
+uint32_t io_len;
 
 if (!sd->blk || !blk_is_inserted(sd->blk) || !sd->enable)
 return 0x00;
diff --git a/hw/sd/trace-events b/hw/sd/trace-events
index 5f09d32eb2..d0cd7c6ec4 100644
--- a/hw/sd/trace-events
+++ b/hw/sd/trace-events
@@ -52,7 +52,7 @@ sdcard_unlock(void) ""
 sdcard_read_block(uint64_t addr, uint32_t len) "addr 0x%" PRIx64 " size 0x%x"
 sdcard_write_block(uint64_t addr, uint32_t len) "addr 0x%" PRIx64 " size 0x%x"
 sdcard_write_data(const char *proto, const char *cmd_desc, uint8_t cmd, uint8_t value) "%s %20s/ CMD%02d value 0x%02x"
-sdcard_read_data(const char *proto, const char *cmd_desc, uint8_t cmd, int length) "%s %20s/ CMD%02d len %d"
+sdcard_read_data(const char *proto, const char *cmd_desc, uint8_t cmd, uint32_t length) "%s %20s/ CMD%02d len %" PRIu32
 sdcard_set_voltage(uint16_t millivolts) "%u mV"
 
 # milkymist-memcard.c
-- 
2.21.3




[PATCH v7 12/17] hw/sd/sdcard: Simplify cmd_valid_while_locked()

2020-06-30 Thread Philippe Mathieu-Daudé
cmd_valid_while_locked() only needs to read SDRequest->cmd,
pass it directly and make it const.

Reviewed-by: Peter Maydell 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 723e66bbf2..2946fe3040 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -1678,7 +1678,7 @@ static sd_rsp_type_t sd_app_command(SDState *sd,
 return sd_illegal;
 }
 
-static int cmd_valid_while_locked(SDState *sd, SDRequest *req)
+static int cmd_valid_while_locked(SDState *sd, const uint8_t cmd)
 {
 /* Valid commands in locked state:
  * basic class (0)
@@ -1689,13 +1689,12 @@ static int cmd_valid_while_locked(SDState *sd, SDRequest *req)
  * Anything else provokes an "illegal command" response.
  */
 if (sd->expecting_acmd) {
-return req->cmd == 41 || req->cmd == 42;
+return cmd == 41 || cmd == 42;
 }
-if (req->cmd == 16 || req->cmd == 55) {
+if (cmd == 16 || cmd == 55) {
 return 1;
 }
-return sd_cmd_class[req->cmd] == 0
-|| sd_cmd_class[req->cmd] == 7;
+return sd_cmd_class[cmd] == 0 || sd_cmd_class[cmd] == 7;
 }
 
 int sd_do_command(SDState *sd, SDRequest *req,
@@ -1721,7 +1720,7 @@ int sd_do_command(SDState *sd, SDRequest *req,
 }
 
 if (sd->card_status & CARD_IS_LOCKED) {
-if (!cmd_valid_while_locked(sd, req)) {
+if (!cmd_valid_while_locked(sd, req->cmd)) {
 sd->card_status |= ILLEGAL_COMMAND;
 sd->expecting_acmd = false;
 qemu_log_mask(LOG_GUEST_ERROR, "SD: Card is locked\n");
-- 
2.21.3




[PATCH v7 07/17] hw/sd/sdcard: Move sd->size initialization

2020-06-30 Thread Philippe Mathieu-Daudé
Move sd->size initialization earlier to make the following
patches easier to review.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 871c30a67f..078b0e81ee 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -558,12 +558,13 @@ static void sd_reset(DeviceState *dev)
 
 sect = sd_addr_to_wpnum(size) + 1;
 
+sd->size = size;
 sd->state = sd_idle_state;
 sd->rca = 0x;
 sd_set_ocr(sd);
 sd_set_scr(sd);
 sd_set_cid(sd);
-sd_set_csd(sd, size);
+sd_set_csd(sd, sd->size);
 sd_set_cardstatus(sd);
 sd_set_sdstatus(sd);
 
@@ -574,7 +575,6 @@ static void sd_reset(DeviceState *dev)
 memset(sd->function_group, 0, sizeof(sd->function_group));
 sd->erase_start = 0;
 sd->erase_end = 0;
-sd->size = size;
 sd->blk_len = HWBLOCK_SIZE;
 sd->pwd_len = 0;
 sd->expecting_acmd = false;
-- 
2.21.3




[PATCH v7 17/17] hw/sd/sdcard: Simplify realize() a bit

2020-06-30 Thread Philippe Mathieu-Daudé
There is no need to check twice whether sd->blk is set.

Reviewed-by: Peter Maydell 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 304fa4143a..8ef6715665 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -2154,12 +2154,12 @@ static void sd_realize(DeviceState *dev, Error **errp)
 return;
 }
 
-if (sd->blk && blk_is_read_only(sd->blk)) {
-error_setg(errp, "Cannot use read-only drive as SD card");
-return;
-}
-
 if (sd->blk) {
+if (blk_is_read_only(sd->blk)) {
+error_setg(errp, "Cannot use read-only drive as SD card");
+return;
+}
+
 ret = blk_set_perm(sd->blk, BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE,
BLK_PERM_ALL, errp);
 if (ret < 0) {
-- 
2.21.3




[PATCH v7 13/17] hw/sd/sdcard: Constify sd_crc*()'s message argument

2020-06-30 Thread Philippe Mathieu-Daudé
CRC functions don't modify the buffer argument,
make it const.

Reviewed-by: Alistair Francis 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 2946fe3040..364a6d1fcd 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -255,11 +255,11 @@ static const int sd_cmd_class[SDMMC_CMD_MAX] = {
 7,  7, 10,  7,  9,  9,  9,  8,  8, 10,  8,  8,  8,  8,  8,  8,
 };
 
-static uint8_t sd_crc7(void *message, size_t width)
+static uint8_t sd_crc7(const void *message, size_t width)
 {
 int i, bit;
 uint8_t shift_reg = 0x00;
-uint8_t *msg = (uint8_t *) message;
+const uint8_t *msg = (const uint8_t *)message;
 
 for (i = 0; i < width; i ++, msg ++)
 for (bit = 7; bit >= 0; bit --) {
@@ -271,11 +271,11 @@ static uint8_t sd_crc7(void *message, size_t width)
 return shift_reg;
 }
 
-static uint16_t sd_crc16(void *message, size_t width)
+static uint16_t sd_crc16(const void *message, size_t width)
 {
 int i, bit;
 uint16_t shift_reg = 0x;
-uint16_t *msg = (uint16_t *) message;
+const uint16_t *msg = (const uint16_t *)message;
 width <<= 1;
 
 for (i = 0; i < width; i ++, msg ++)
-- 
2.21.3




[PATCH v7 06/17] hw/sd/sdcard: Restrict Class 6 commands to SCSD cards

2020-06-30 Thread Philippe Mathieu-Daudé
Only SCSD cards support Class 6 (Block Oriented Write Protection)
commands.

  "SD Specifications Part 1 Physical Layer Simplified Spec. v3.01"

  4.3.14 Command Functional Difference in Card Capacity Types

  * Write Protected Group

  SDHC and SDXC do not support write-protected groups. Issuing
  CMD28, CMD29 and CMD30 generates the ILLEGAL_COMMAND error.

Reviewed-by: Peter Maydell 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 7e0d684aca..871c30a67f 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -922,6 +922,11 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req)
 sd->multi_blk_cnt = 0;
 }
 
+if (sd_cmd_class[req.cmd] == 6 && FIELD_EX32(sd->ocr, OCR, CARD_CAPACITY)) {
+/* Only Standard Capacity cards support class 6 commands */
+return sd_illegal;
+}
+
 switch (req.cmd) {
 /* Basic commands (Class 0 and Class 1) */
 case 0:/* CMD0:   GO_IDLE_STATE */
-- 
2.21.3




[PATCH v7 05/17] hw/sd/sdcard: Do not switch to ReceivingData if address is invalid

2020-06-30 Thread Philippe Mathieu-Daudé
Only move the state machine to ReceivingData if there is no
pending error. This avoids later OOB access while processing
queued commands.

  "SD Specifications Part 1 Physical Layer Simplified Spec. v3.01"

  4.3.3 Data Read

  Read command is rejected if BLOCK_LEN_ERROR or ADDRESS_ERROR
  occurred and no data transfer is performed.

  4.3.4 Data Write

  Write command is rejected if BLOCK_LEN_ERROR or ADDRESS_ERROR
  occurred and no data transfer is performed.

WP_VIOLATION errors are not modified: the error bit is set, we
stay in receive-data state, wait for a stop command. All further
data transfer is ignored. See the check on sd->card_status at the
beginning of sd_read_data() and sd_write_data().

Fixes: CVE-2020-13253
Cc: Prasad J Pandit 
Reported-by: Alexander Bulekov 
Buglink: https://bugs.launchpad.net/qemu/+bug/1880822
Signed-off-by: Philippe Mathieu-Daudé 
---
v4: Only modify ADDRESS_ERROR, not WP_VIOLATION (pm215)
---
 hw/sd/sd.c | 34 ++
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 04451fdad2..7e0d684aca 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -1167,13 +1167,15 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req)
 case 17:   /* CMD17:  READ_SINGLE_BLOCK */
 switch (sd->state) {
 case sd_transfer_state:
-sd->state = sd_sendingdata_state;
-sd->data_start = addr;
-sd->data_offset = 0;
 
 if (sd->data_start + sd->blk_len > sd->size) {
 sd->card_status |= ADDRESS_ERROR;
+return sd_r1;
 }
+
+sd->state = sd_sendingdata_state;
+sd->data_start = addr;
+sd->data_offset = 0;
 return sd_r1;
 
 default:
@@ -1184,13 +1186,15 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req)
 case 18:   /* CMD18:  READ_MULTIPLE_BLOCK */
 switch (sd->state) {
 case sd_transfer_state:
-sd->state = sd_sendingdata_state;
-sd->data_start = addr;
-sd->data_offset = 0;
 
 if (sd->data_start + sd->blk_len > sd->size) {
 sd->card_status |= ADDRESS_ERROR;
+return sd_r1;
 }
+
+sd->state = sd_sendingdata_state;
+sd->data_start = addr;
+sd->data_offset = 0;
 return sd_r1;
 
 default:
@@ -1230,14 +1234,17 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req)
 /* Writing in SPI mode not implemented.  */
 if (sd->spi)
 break;
+
+if (sd->data_start + sd->blk_len > sd->size) {
+sd->card_status |= ADDRESS_ERROR;
+return sd_r1;
+}
+
 sd->state = sd_receivingdata_state;
 sd->data_start = addr;
 sd->data_offset = 0;
 sd->blk_written = 0;
 
-if (sd->data_start + sd->blk_len > sd->size) {
-sd->card_status |= ADDRESS_ERROR;
-}
 if (sd_wp_addr(sd, sd->data_start)) {
 sd->card_status |= WP_VIOLATION;
 }
@@ -1257,14 +1264,17 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req)
 /* Writing in SPI mode not implemented.  */
 if (sd->spi)
 break;
+
+if (sd->data_start + sd->blk_len > sd->size) {
+sd->card_status |= ADDRESS_ERROR;
+return sd_r1;
+}
+
 sd->state = sd_receivingdata_state;
 sd->data_start = addr;
 sd->data_offset = 0;
 sd->blk_written = 0;
 
-if (sd->data_start + sd->blk_len > sd->size) {
-sd->card_status |= ADDRESS_ERROR;
-}
 if (sd_wp_addr(sd, sd->data_start)) {
 sd->card_status |= WP_VIOLATION;
 }
-- 
2.21.3
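The reordering in the patch above follows one defensive pattern: fully validate a request before mutating the card's state machine, so no later code path can ever observe a half-committed transfer. A minimal standalone sketch of that pattern (all names here are hypothetical stand-ins, not the QEMU structures):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum state { TRANSFER, SENDING_DATA };

struct card {
    enum state state;
    uint64_t size, blk_len, data_start;
    uint32_t status;
};

#define ADDRESS_ERROR (1u << 30)

/* Reject an out-of-range read before entering the data-transfer
 * state, so the read path never sees a bogus data_start. */
static bool start_read(struct card *c, uint64_t addr)
{
    if (addr + c->blk_len > c->size) {
        c->status |= ADDRESS_ERROR;
        return false;               /* state machine untouched */
    }
    c->state = SENDING_DATA;
    c->data_start = addr;
    return true;
}
```

The key property, and the essence of the CVE fix, is that the error branch returns before any field of the state machine is written.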




[PATCH v7 08/17] hw/sd/sdcard: Call sd_addr_to_wpnum where it is used, consider zero size

2020-06-30 Thread Philippe Mathieu-Daudé
Avoid setting the 'sect' variable just once (its name is
confusing anyway). Directly set 'sd->wpgrps_size'. Special-case
a size of zero.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 078b0e81ee..e5adcc8055 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -556,8 +556,6 @@ static void sd_reset(DeviceState *dev)
 }
 size = sect << 9;
 
-sect = sd_addr_to_wpnum(size) + 1;
-
 sd->size = size;
 sd->state = sd_idle_state;
 sd->rca = 0x0000;
@@ -570,7 +568,11 @@ static void sd_reset(DeviceState *dev)
 
 g_free(sd->wp_groups);
 sd->wp_switch = sd->blk ? blk_is_read_only(sd->blk) : false;
-sd->wpgrps_size = sect;
+if (sd->size) {
+sd->wpgrps_size = sd_addr_to_wpnum(sd, sd->size) + 1;
+} else {
+sd->wpgrps_size = 1;
+}
 sd->wp_groups = bitmap_new(sd->wpgrps_size);
 memset(sd->function_group, 0, sizeof(sd->function_group));
 sd->erase_start = 0;
-- 
2.21.3




[PATCH v7 10/17] hw/sd/sdcard: Check address is in range

2020-06-30 Thread Philippe Mathieu-Daudé
As a defense, assert if the requested address is out of the card area.

Suggested-by: Peter Maydell 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 548745614e..5d1b314a32 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -537,8 +537,10 @@ static void sd_response_r7_make(SDState *sd, uint8_t *response)
 stl_be_p(response, sd->vhs);
 }
 
-static inline uint64_t sd_addr_to_wpnum(uint64_t addr)
+static uint64_t sd_addr_to_wpnum(SDState *sd, uint64_t addr)
 {
+assert(addr < sd->size);
+
 return addr >> (HWBLOCK_SHIFT + SECTOR_SHIFT + WPGROUP_SHIFT);
 }
 
@@ -773,8 +775,8 @@ static void sd_erase(SDState *sd)
 erase_end *= HWBLOCK_SIZE;
 }
 
-erase_start = sd_addr_to_wpnum(erase_start);
-erase_end = sd_addr_to_wpnum(erase_end);
+erase_start = sd_addr_to_wpnum(sd, erase_start);
+erase_end = sd_addr_to_wpnum(sd, erase_end);
 sd->erase_start = 0;
 sd->erase_end = 0;
 sd->csd[14] |= 0x40;
@@ -791,7 +793,7 @@ static uint32_t sd_wpbits(SDState *sd, uint64_t addr)
 uint32_t i, wpnum;
 uint32_t ret = 0;
 
-wpnum = sd_addr_to_wpnum(addr);
+wpnum = sd_addr_to_wpnum(sd, addr);
 
 for (i = 0; i < 32; i++, wpnum++, addr += WPGROUP_SIZE) {
 if (addr < sd->size && test_bit(wpnum, sd->wp_groups)) {
@@ -833,7 +835,7 @@ static void sd_function_switch(SDState *sd, uint32_t arg)
 
 static inline bool sd_wp_addr(SDState *sd, uint64_t addr)
 {
-return test_bit(sd_addr_to_wpnum(addr), sd->wp_groups);
+return test_bit(sd_addr_to_wpnum(sd, addr), sd->wp_groups);
 }
 
 static void sd_lock_command(SDState *sd)
@@ -1345,7 +1347,7 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req)
 }
 
 sd->state = sd_programming_state;
-set_bit(sd_addr_to_wpnum(addr), sd->wp_groups);
+set_bit(sd_addr_to_wpnum(sd, addr), sd->wp_groups);
 /* Bzzztt  Operation complete.  */
 sd->state = sd_transfer_state;
 return sd_r1b;
@@ -1364,7 +1366,7 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req)
 }
 
 sd->state = sd_programming_state;
-clear_bit(sd_addr_to_wpnum(addr), sd->wp_groups);
+clear_bit(sd_addr_to_wpnum(sd, addr), sd->wp_groups);
 /* Bzzztt  Operation complete.  */
 sd->state = sd_transfer_state;
 return sd_r1b;
-- 
2.21.3
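The shift constants used by sd_addr_to_wpnum() above make it a pure arithmetic mapping from byte address to write-protect group number: 9 + 5 + 7 = 21 bits of shift, i.e. one group per 2 MiB. A tiny standalone sketch of that mapping (without the new SDState range assertion the patch adds):

```c
#include <assert.h>
#include <stdint.h>

/* Shift values as defined in hw/sd/sd.c */
#define HWBLOCK_SHIFT 9   /* 512-byte hardware blocks */
#define SECTOR_SHIFT  5   /* 32 blocks = 16 KiB sectors */
#define WPGROUP_SHIFT 7   /* 128 sectors = 2 MiB write-protect groups */

/* Map a byte address to its write-protect group number; a sketch of
 * sd_addr_to_wpnum() with the range check left out for brevity. */
static uint64_t addr_to_wpnum(uint64_t addr)
{
    return addr >> (HWBLOCK_SHIFT + SECTOR_SHIFT + WPGROUP_SHIFT);
}
```

This also shows why a zero-size card needs special-casing elsewhere: with size 0 there is no valid address, yet at least one bit is still allocated for the bitmap.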




[PATCH v7 11/17] hw/sd/sdcard: Update the SDState documentation

2020-06-30 Thread Philippe Mathieu-Daudé
Add more descriptive comments to keep a clear separation
between static properties and runtime-changeable state.

Suggested-by: Peter Maydell 
Reviewed-by: Peter Maydell 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 5d1b314a32..723e66bbf2 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -103,11 +103,14 @@ struct SDState {
 uint32_t card_status;
 uint8_t sd_status[64];
 
-/* Configurable properties */
+/* Static properties */
+
 uint8_t spec_version;
 BlockBackend *blk;
 bool spi;
 
+/* Runtime changeables */
+
 uint32_t mode;/* current card mode, one of SDCardModes */
 int32_t state;/* current card state, one of SDCardStates */
 uint32_t vhs;
-- 
2.21.3




[PATCH v7 03/17] hw/sd/sdcard: Move some definitions to use them earlier

2020-06-30 Thread Philippe Mathieu-Daudé
Move some definitions to use them earlier.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index cac8d7d828..4816b4a462 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -80,6 +80,12 @@ enum SDCardStates {
 sd_disconnect_state,
 };
 
+#define HWBLOCK_SHIFT   9   /* 512 bytes */
+#define SECTOR_SHIFT5   /* 16 kilobytes */
+#define WPGROUP_SHIFT   7   /* 2 megs */
+#define CMULT_SHIFT 9   /* 512 times HWBLOCK_SIZE */
+#define WPGROUP_SIZE(1 << (HWBLOCK_SHIFT + SECTOR_SHIFT + WPGROUP_SHIFT))
+
 struct SDState {
 DeviceState parent_obj;
 
@@ -367,12 +373,6 @@ static void sd_set_cid(SDState *sd)
 sd->cid[15] = (sd_crc7(sd->cid, 15) << 1) | 1;
 }
 
-#define HWBLOCK_SHIFT  9   /* 512 bytes */
-#define SECTOR_SHIFT   5   /* 16 kilobytes */
-#define WPGROUP_SHIFT  7   /* 2 megs */
-#define CMULT_SHIFT9   /* 512 times HWBLOCK_SIZE */
-#define WPGROUP_SIZE   (1 << (HWBLOCK_SHIFT + SECTOR_SHIFT + WPGROUP_SHIFT))
-
 static const uint8_t sd_csd_rw_mask[16] = {
 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xfc, 0xfe,
-- 
2.21.3




[PATCH v7 09/17] hw/sd/sdcard: Special case the -ENOMEDIUM error

2020-06-30 Thread Philippe Mathieu-Daudé
As we have no interest in the underlying block geometry,
directly call blk_getlength(). We have to care about machines
creating an SD card with no drive attached (probably incorrect
API use). Simply emit a warning when such Frankenstein cards
of zero size are reset.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 28 
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index e5adcc8055..548745614e 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -545,18 +545,30 @@ static inline uint64_t sd_addr_to_wpnum(uint64_t addr)
 static void sd_reset(DeviceState *dev)
 {
 SDState *sd = SD_CARD(dev);
-uint64_t size;
-uint64_t sect;
 
 trace_sdcard_reset();
 if (sd->blk) {
-blk_get_geometry(sd->blk, &sect);
-} else {
-sect = 0;
-}
-size = sect << 9;
+int64_t size = blk_getlength(sd->blk);
+
+if (size == -ENOMEDIUM) {
+/*
+ * FIXME blk should be set once per device in sd_realize(),
+ * and we shouldn't be checking it in sd_reset(). But this
+ * is how the reparent currently works.
+ */
+char *id = object_get_canonical_path_component(OBJECT(dev));
+
+warn_report("sd-card '%s' created with no drive.",
+id ? id : "unknown");
+g_free(id);
+size = 0;
+}
+assert(size >= 0);
+sd->size = size;
+} else {
+sd->size = 0;
+}
 
-sd->size = size;
 sd->state = sd_idle_state;
 sd->rca = 0x0000;
 sd_set_ocr(sd);
-- 
2.21.3




[PATCH v7 04/17] hw/sd/sdcard: Use the HWBLOCK_SIZE definition

2020-06-30 Thread Philippe Mathieu-Daudé
Replace the following different uses of the same value by
the same HWBLOCK_SIZE definition:
  - 512 (magic value)
  - 0x200 (magic value)
  - 1 << HWBLOCK_SHIFT

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 4816b4a462..04451fdad2 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -81,6 +81,7 @@ enum SDCardStates {
 };
 
 #define HWBLOCK_SHIFT   9   /* 512 bytes */
+#define HWBLOCK_SIZE(1 << HWBLOCK_SHIFT)
 #define SECTOR_SHIFT5   /* 16 kilobytes */
 #define WPGROUP_SHIFT   7   /* 2 megs */
 #define CMULT_SHIFT 9   /* 512 times HWBLOCK_SIZE */
@@ -129,7 +130,7 @@ struct SDState {
 uint32_t blk_written;
 uint64_t data_start;
 uint32_t data_offset;
-uint8_t data[512];
+uint8_t data[HWBLOCK_SIZE];
 qemu_irq readonly_cb;
 qemu_irq inserted_cb;
 QEMUTimer *ocr_power_timer;
@@ -410,7 +411,7 @@ static void sd_set_csd(SDState *sd, uint64_t size)
 ((HWBLOCK_SHIFT << 6) & 0xc0);
 sd->csd[14] = 0x00;/* File format group */
 } else {   /* SDHC */
-size /= 512 * KiB;
+size /= HWBLOCK_SIZE * KiB;
 size -= 1;
 sd->csd[0] = 0x40;
 sd->csd[1] = 0x0e;
@@ -574,7 +575,7 @@ static void sd_reset(DeviceState *dev)
 sd->erase_start = 0;
 sd->erase_end = 0;
 sd->size = size;
-sd->blk_len = 0x200;
+sd->blk_len = HWBLOCK_SIZE;
 sd->pwd_len = 0;
 sd->expecting_acmd = false;
 sd->dat_lines = 0xf;
@@ -685,7 +686,7 @@ static const VMStateDescription sd_vmstate = {
 VMSTATE_UINT32(blk_written, SDState),
 VMSTATE_UINT64(data_start, SDState),
 VMSTATE_UINT32(data_offset, SDState),
-VMSTATE_UINT8_ARRAY(data, SDState, 512),
+VMSTATE_UINT8_ARRAY(data, SDState, HWBLOCK_SIZE),
 VMSTATE_UNUSED_V(1, 512),
 VMSTATE_BOOL(enable, SDState),
 VMSTATE_END_OF_LIST()
@@ -754,8 +755,8 @@ static void sd_erase(SDState *sd)
 
 if (FIELD_EX32(sd->ocr, OCR, CARD_CAPACITY)) {
 /* High capacity memory card: erase units are 512 byte blocks */
-erase_start *= 512;
-erase_end *= 512;
+erase_start *= HWBLOCK_SIZE;
+erase_end *= HWBLOCK_SIZE;
 }
 
 erase_start = sd_addr_to_wpnum(erase_start);
@@ -1149,7 +1150,7 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req)
 case 16:   /* CMD16:  SET_BLOCKLEN */
 switch (sd->state) {
 case sd_transfer_state:
-if (req.arg > (1 << HWBLOCK_SHIFT)) {
+if (req.arg > HWBLOCK_SIZE) {
 sd->card_status |= BLOCK_LEN_ERROR;
 } else {
 trace_sdcard_set_blocklen(req.arg);
@@ -1961,7 +1962,7 @@ uint8_t sd_read_data(SDState *sd)
 if (sd->card_status & (ADDRESS_ERROR | WP_VIOLATION))
 return 0x00;
 
-io_len = (sd->ocr & (1 << 30)) ? 512 : sd->blk_len;
+io_len = (sd->ocr & (1 << 30)) ? HWBLOCK_SIZE : sd->blk_len;
 
 trace_sdcard_read_data(sd->proto_name,
sd_acmd_name(sd->current_cmd),
-- 
2.21.3




[PATCH v7 02/17] hw/sd/sdcard: Update coding style to make checkpatch.pl happy

2020-06-30 Thread Philippe Mathieu-Daudé
From: Philippe Mathieu-Daudé 

To make the next commit easier to review, clean this code first.

Reviewed-by: Peter Maydell 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 24 
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 97a9d32964..cac8d7d828 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -1170,8 +1170,9 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req)
 sd->data_start = addr;
 sd->data_offset = 0;
 
-if (sd->data_start + sd->blk_len > sd->size)
+if (sd->data_start + sd->blk_len > sd->size) {
 sd->card_status |= ADDRESS_ERROR;
+}
 return sd_r1;
 
 default:
@@ -1186,8 +1187,9 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req)
 sd->data_start = addr;
 sd->data_offset = 0;
 
-if (sd->data_start + sd->blk_len > sd->size)
+if (sd->data_start + sd->blk_len > sd->size) {
 sd->card_status |= ADDRESS_ERROR;
+}
 return sd_r1;
 
 default:
@@ -1232,12 +1234,15 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req)
 sd->data_offset = 0;
 sd->blk_written = 0;
 
-if (sd->data_start + sd->blk_len > sd->size)
+if (sd->data_start + sd->blk_len > sd->size) {
 sd->card_status |= ADDRESS_ERROR;
-if (sd_wp_addr(sd, sd->data_start))
+}
+if (sd_wp_addr(sd, sd->data_start)) {
 sd->card_status |= WP_VIOLATION;
-if (sd->csd[14] & 0x30)
+}
+if (sd->csd[14] & 0x30) {
 sd->card_status |= WP_VIOLATION;
+}
 return sd_r1;
 
 default:
@@ -1256,12 +1261,15 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req)
 sd->data_offset = 0;
 sd->blk_written = 0;
 
-if (sd->data_start + sd->blk_len > sd->size)
+if (sd->data_start + sd->blk_len > sd->size) {
 sd->card_status |= ADDRESS_ERROR;
-if (sd_wp_addr(sd, sd->data_start))
+}
+if (sd_wp_addr(sd, sd->data_start)) {
 sd->card_status |= WP_VIOLATION;
-if (sd->csd[14] & 0x30)
+}
+if (sd->csd[14] & 0x30) {
 sd->card_status |= WP_VIOLATION;
+}
 return sd_r1;
 
 default:
-- 
2.21.3




[PATCH v7 01/17] MAINTAINERS: Cc qemu-block mailing list

2020-06-30 Thread Philippe Mathieu-Daudé
From: Philippe Mathieu-Daudé 

We forgot to include the qemu-block mailing list while adding
this section in commit 076a0fc32a7. Fix this.

Suggested-by: Paolo Bonzini 
Signed-off-by: Philippe Mathieu-Daudé 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index dec252f38b..9ad876c4a7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1628,6 +1628,7 @@ F: hw/ssi/xilinx_*
 
 SD (Secure Card)
 M: Philippe Mathieu-Daudé 
+L: qemu-block@nongnu.org
 S: Odd Fixes
 F: include/hw/sd/sd*
 F: hw/sd/core.c
-- 
2.21.3




[PATCH v7 00/17] hw/sd/sdcard: Fix CVE-2020-13253 & cleanups

2020-06-30 Thread Philippe Mathieu-Daudé
Patches 5 & 6 fix CVE-2020-13253.
The rest are (accumulated) cleanups.

Since v6: Handle -ENOMEDIUM error
Since v5: Fix incorrect use of sd_addr_to_wpnum() in sd_reset()

Missing review:
[PATCH 01/15] MAINTAINERS: Cc qemu-block mailing list
[PATCH 03/15] hw/sd/sdcard: Move some definitions to use them
[PATCH 04/15] hw/sd/sdcard: Use the HWBLOCK_SIZE definition
[PATCH 05/15] hw/sd/sdcard: Do not switch to ReceivingData if
[PATCH 07/15] hw/sd/sdcard: Move sd->size initialization
[PATCH 08/15] hw/sd/sdcard: Call sd_addr_to_wpnum where used, consider zero size
[PATCH 09/15] hw/sd/sdcard: Special case the -ENOMEDIUM error
[PATCH 10/15] hw/sd/sdcard: Check address is in range
[PATCH 14/15] hw/sd/sdcard: Make iolen unsigned
[PATCH 15/15] hw/sd/sdcard: Correctly display the command name in trace

$ git backport-diff -u v6
$ git backport-diff -u sd_cve_2020_13253-v6 -r origin/master..
Key:
[----] : patches are identical
[dddd] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/17:[----] [--] 'MAINTAINERS: Cc qemu-block mailing list'
002/17:[----] [--] 'hw/sd/sdcard: Update coding style to make checkpatch.pl happy'
003/17:[----] [--] 'hw/sd/sdcard: Move some definitions to use them earlier'
004/17:[----] [--] 'hw/sd/sdcard: Use the HWBLOCK_SIZE definition'
005/17:[----] [--] 'hw/sd/sdcard: Do not switch to ReceivingData if address is invalid'
006/17:[----] [--] 'hw/sd/sdcard: Restrict Class 6 commands to SCSD cards'
007/17:[down] 'hw/sd/sdcard: Move sd->size initialization'
008/17:[down] 'hw/sd/sdcard: Call sd_addr_to_wpnum where it is used, consider zero size'
009/17:[down] 'hw/sd/sdcard: Special case the -ENOMEDIUM error'
010/17:[0004] [FC] 'hw/sd/sdcard: Check address is in range'
011/17:[----] [--] 'hw/sd/sdcard: Update the SDState documentation'
012/17:[----] [--] 'hw/sd/sdcard: Simplify cmd_valid_while_locked()'
013/17:[----] [--] 'hw/sd/sdcard: Constify sd_crc*()'s message argument'
014/17:[----] [--] 'hw/sd/sdcard: Make iolen unsigned'
015/17:[----] [--] 'hw/sd/sdcard: Correctly display the command name in trace events'
016/17:[----] [--] 'hw/sd/sdcard: Display offset in read/write_data() trace events'
017/17:[----] [--] 'hw/sd/sdcard: Simplify realize() a bit'

Philippe Mathieu-Daudé (17):
  MAINTAINERS: Cc qemu-block mailing list
  hw/sd/sdcard: Update coding style to make checkpatch.pl happy
  hw/sd/sdcard: Move some definitions to use them earlier
  hw/sd/sdcard: Use the HWBLOCK_SIZE definition
  hw/sd/sdcard: Do not switch to ReceivingData if address is invalid
  hw/sd/sdcard: Restrict Class 6 commands to SCSD cards
  hw/sd/sdcard: Move sd->size initialization
  hw/sd/sdcard: Call sd_addr_to_wpnum where it is used, consider zero
size
  hw/sd/sdcard: Special case the -ENOMEDIUM error
  hw/sd/sdcard: Check address is in range
  hw/sd/sdcard: Update the SDState documentation
  hw/sd/sdcard: Simplify cmd_valid_while_locked()
  hw/sd/sdcard: Constify sd_crc*()'s message argument
  hw/sd/sdcard: Make iolen unsigned
  hw/sd/sdcard: Correctly display the command name in trace events
  hw/sd/sdcard: Display offset in read/write_data() trace events
  hw/sd/sdcard: Simplify realize() a bit

 hw/sd/sd.c | 189 +
 MAINTAINERS|   1 +
 hw/sd/trace-events |   4 +-
 3 files changed, 124 insertions(+), 70 deletions(-)

-- 
2.21.3




Re: [PATCH v2 10/18] hw/block/nvme: Support Zoned Namespace Command Set

2020-06-30 Thread Klaus Jensen
On Jun 18 06:34, Dmitry Fomichev wrote:
> The driver has been changed to advertise NVM Command Set when "zoned"
> driver property is not set (default) and Zoned Namespace Command Set
> otherwise.
> 
> Handlers for three new NVMe commands introduced in Zoned Namespace
> Command Set specification are added, namely for Zone Management
> Receive, Zone Management Send and Zone Append.
> 
> Driver initialization code has been extended to create a proper
> configuration for zoned operation using driver properties.
> 
> Read/Write command handler is modified to only allow writes at the
> write pointer if the namespace is zoned. For Zone Append command,
> writes implicitly happen at the write pointer and the starting write
> pointer value is returned as the result of the command. Read Zeroes

s/Read Zeroes/Write Zeroes

> handler is modified to add zoned checks that are identical to those
> done as a part of Write flow.
> 
> The code to support for Zone Descriptor Extensions is not included in
> this commit and the driver always reports ZDES 0. A later commit in
> this series will add ZDE support.
> 
> This commit doesn't yet include checks for active and open zone
> limits. It is assumed that there are no limits on either active or
> open zones.
> 

And s/driver/device ;)

> Signed-off-by: Niklas Cassel 
> Signed-off-by: Hans Holmberg 
> Signed-off-by: Ajay Joshi 
> Signed-off-by: Chaitanya Kulkarni 
> Signed-off-by: Matias Bjorling 
> Signed-off-by: Aravind Ramesh 
> Signed-off-by: Shin'ichiro Kawasaki 
> Signed-off-by: Adam Manzanares 
> Signed-off-by: Dmitry Fomichev 
> ---
>  hw/block/nvme.c | 963 ++--
>  1 file changed, 933 insertions(+), 30 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 453f4747a5..2e03b0b6ed 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -37,6 +37,7 @@
>  #include "qemu/osdep.h"
>  #include "qemu/units.h"
>  #include "qemu/error-report.h"
> +#include "crypto/random.h"
>  #include "hw/block/block.h"
>  #include "hw/pci/msix.h"
>  #include "hw/pci/pci.h"
> @@ -69,6 +70,98 @@
>  
>  static void nvme_process_sq(void *opaque);
>  
> +/*
> + * Add a zone to the tail of a zone list.
> + */
> +static void nvme_add_zone_tail(NvmeCtrl *n, NvmeNamespace *ns, NvmeZoneList *zl,
> +NvmeZone *zone)
> +{
> +uint32_t idx = (uint32_t)(zone - ns->zone_array);
> +
> +assert(nvme_zone_not_in_list(zone));
> +
> +if (!zl->size) {
> +zl->head = zl->tail = idx;
> +zone->next = zone->prev = NVME_ZONE_LIST_NIL;
> +} else {
> +ns->zone_array[zl->tail].next = idx;
> +zone->prev = zl->tail;
> +zone->next = NVME_ZONE_LIST_NIL;
> +zl->tail = idx;
> +}
> +zl->size++;
> +}
> +
> +/*
> + * Remove a zone from a zone list. The zone must be linked in the list.
> + */
> +static void nvme_remove_zone(NvmeCtrl *n, NvmeNamespace *ns, NvmeZoneList *zl,
> +NvmeZone *zone)
> +{
> +uint32_t idx = (uint32_t)(zone - ns->zone_array);
> +
> +assert(!nvme_zone_not_in_list(zone));
> +
> +--zl->size;
> +if (zl->size == 0) {
> +zl->head = NVME_ZONE_LIST_NIL;
> +zl->tail = NVME_ZONE_LIST_NIL;
> +} else if (idx == zl->head) {
> +zl->head = zone->next;
> +ns->zone_array[zl->head].prev = NVME_ZONE_LIST_NIL;
> +} else if (idx == zl->tail) {
> +zl->tail = zone->prev;
> +ns->zone_array[zl->tail].next = NVME_ZONE_LIST_NIL;
> +} else {
> +ns->zone_array[zone->next].prev = zone->prev;
> +ns->zone_array[zone->prev].next = zone->next;
> +}
> +
> +zone->prev = zone->next = 0;
> +}
> +
> +static void nvme_assign_zone_state(NvmeCtrl *n, NvmeNamespace *ns,
> +NvmeZone *zone, uint8_t state)
> +{
> +if (!nvme_zone_not_in_list(zone)) {
> +switch (nvme_get_zone_state(zone)) {
> +case NVME_ZONE_STATE_EXPLICITLY_OPEN:
> +nvme_remove_zone(n, ns, ns->exp_open_zones, zone);
> +break;
> +case NVME_ZONE_STATE_IMPLICITLY_OPEN:
> +nvme_remove_zone(n, ns, ns->imp_open_zones, zone);
> +break;
> +case NVME_ZONE_STATE_CLOSED:
> +nvme_remove_zone(n, ns, ns->closed_zones, zone);
> +break;
> +case NVME_ZONE_STATE_FULL:
> +nvme_remove_zone(n, ns, ns->full_zones, zone);
> +}
> +   }
> +
> +nvme_set_zone_state(zone, state);
> +
> +switch (state) {
> +case NVME_ZONE_STATE_EXPLICITLY_OPEN:
> +nvme_add_zone_tail(n, ns, ns->exp_open_zones, zone);
> +break;
> +case NVME_ZONE_STATE_IMPLICITLY_OPEN:
> +nvme_add_zone_tail(n, ns, ns->imp_open_zones, zone);
> +break;
> +case NVME_ZONE_STATE_CLOSED:
> +nvme_add_zone_tail(n, ns, ns->closed_zones, zone);
> +break;
> +case NVME_ZONE_STATE_FULL:
> +nvme_add_zone_tail(n, ns, ns->full_zones, zone);
> +break;
> +default:
> +zone->d.za = 0;
> +   

Re: [PATCH v2 09/18] hw/block/nvme: Define Zoned NS Command Set trace events

2020-06-30 Thread Klaus Jensen
On Jun 18 06:34, Dmitry Fomichev wrote:
> The Zoned Namespace Command Set / Namespace Types implementation that
> is being introduced in this series adds a good number of trace events.
> Combine all tracepoint definitions into a separate patch to make
> reviewing more convenient.
> 
> Signed-off-by: Dmitry Fomichev 

I would prefer that LBAs was reported in hex, but it's just personal
preference.

> ---
>  hw/block/trace-events | 41 +
>  1 file changed, 41 insertions(+)
> 
> diff --git a/hw/block/trace-events b/hw/block/trace-events
> index 3f3323fe38..984db8a20c 100644
> --- a/hw/block/trace-events
> +++ b/hw/block/trace-events
> @@ -66,6 +66,31 @@ pci_nvme_mmio_shutdown_cleared(void) "shutdown bit cleared"
>  pci_nvme_cmd_supp_and_effects_log_read(void) "commands supported and effects 
> log read"
>  pci_nvme_css_nvm_cset_selected_by_host(uint32_t cc) "NVM command set 
> selected by host, bar.cc=0x%"PRIx32""
>  pci_nvme_css_all_csets_sel_by_host(uint32_t cc) "all supported command sets 
> selected by host, bar.cc=0x%"PRIx32""
> +pci_nvme_open_zone(uint64_t slba, uint32_t zone_idx, int all) "open zone, 
> slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32""
> +pci_nvme_close_zone(uint64_t slba, uint32_t zone_idx, int all) "close zone, 
> slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32""
> +pci_nvme_finish_zone(uint64_t slba, uint32_t zone_idx, int all) "finish 
> zone, slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32""
> +pci_nvme_reset_zone(uint64_t slba, uint32_t zone_idx, int all) "reset zone, 
> slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32""
> +pci_nvme_offline_zone(uint64_t slba, uint32_t zone_idx, int all) "offline 
> zone, slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32""
> +pci_nvme_set_descriptor_extension(uint64_t slba, uint32_t zone_idx) "set 
> zone descriptor extension, slba=%"PRIu64", idx=%"PRIu32""
> +pci_nvme_zone_reset_recommended(uint64_t slba) "slba=%"PRIu64""
> +pci_nvme_zone_reset_internal_op(uint64_t slba) "slba=%"PRIu64""
> +pci_nvme_zone_finish_recommended(uint64_t slba) "slba=%"PRIu64""
> +pci_nvme_zone_finish_internal_op(uint64_t slba) "slba=%"PRIu64""
> +pci_nvme_zone_finished_by_controller(uint64_t slba) "slba=%"PRIu64""
> +pci_nvme_zd_extension_set(uint32_t zone_idx) "set descriptor extension for 
> zone_idx=%"PRIu32""
> +pci_nvme_power_on_close(uint32_t state, uint64_t slba) "zone 
> state=%"PRIu32", slba=%"PRIu64" transitioned to Closed state"
> +pci_nvme_power_on_reset(uint32_t state, uint64_t slba) "zone 
> state=%"PRIu32", slba=%"PRIu64" transitioned to Empty state"
> +pci_nvme_power_on_full(uint32_t state, uint64_t slba) "zone state=%"PRIu32", 
> slba=%"PRIu64" transitioned to Full state"
> +pci_nvme_zone_ae_not_enabled(int info, int log_page, int nsid) "zone async 
> event not enabled, info=0x%"PRIx32", lp=0x%"PRIx32", nsid=%"PRIu32""
> +pci_nvme_zone_ae_not_cleared(int info, int log_page, int nsid) "zoned async 
> event not cleared, info=0x%"PRIx32", lp=0x%"PRIx32", nsid=%"PRIu32""

Can we use uintxx_t's here?

> +pci_nvme_zone_aen_not_requested(uint32_t oaes) "zone descriptor AEN are not 
> requested by host, oaes=0x%"PRIx32""
> +pci_nvme_getfeat_aen_cfg(uint64_t res) "reporting async event config 
> res=%"PRIu64""
> +pci_nvme_setfeat_zone_info_aer_on(void) "zone info change notices enabled"
> +pci_nvme_setfeat_zone_info_aer_off(void) "zone info change notices disabled"
> +pci_nvme_changed_zone_log_read(uint16_t nsid) "changed zone list log of ns 
> %"PRIu16""

nsid should be uint32_t.

> +pci_nvme_reporting_changed_zone(uint64_t zslba, uint8_t za) 
> "zslba=%"PRIu64", attr=0x%"PRIx8""
> +pci_nvme_empty_changed_zone_list(void) "no changes zones to report"

s/changes/changed

> +pci_nvme_mapped_zone_file(char *zfile_name, int ret) "mapped zone file %s, 
> error %d"
>  
>  # nvme traces for error conditions
>  pci_nvme_err_invalid_dma(void) "PRP/SGL is too small for transfer size"
> @@ -77,10 +102,25 @@ pci_nvme_err_invalid_ns(uint32_t ns, uint32_t limit) 
> "invalid namespace %u not w
>  pci_nvme_err_invalid_opc(uint8_t opc) "invalid opcode 0x%"PRIx8""
>  pci_nvme_err_invalid_admin_opc(uint8_t opc) "invalid admin opcode 0x%"PRIx8""
>  pci_nvme_err_invalid_lba_range(uint64_t start, uint64_t len, uint64_t limit) 
> "Invalid LBA start=%"PRIu64" len=%"PRIu64" limit=%"PRIu64""
> +pci_nvme_err_capacity_exceeded(uint64_t zone_id, uint64_t nr_zones) "zone 
> capacity exceeded, zone_id=%"PRIu64", nr_zones=%"PRIu64""

Change the name to pci_nvme_err_ZONE_capacity_exceeded maybe?

> +pci_nvme_err_unaligned_zone_cmd(uint8_t action, uint64_t slba, uint64_t 
> zslba) "unaligned zone op 0x%"PRIx32", got slba=%"PRIu64", zslba=%"PRIu64""
> +pci_nvme_err_invalid_zone_state_transition(uint8_t state, uint8_t action, 
> uint64_t slba, uint8_t attrs) "0x%"PRIx32"->0x%"PRIx32", slba=%"PRIu64", 
> attrs=0x%"PRIx32""
> +pci_nvme_err_write_not_at_wp(uint64_t slba, uint64_t zone, uint64_t wp) 
> "writing at slba=%"PRIu64", zone=%"PRIu64", but wp=%"PRIu64""
> 
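The sizing complaints in this review (nsid is uint32_t, use matching specifiers) come down to pairing each fixed-width argument with the right inttypes.h PRI macro, exactly as the trace-events format strings must. A small self-contained illustration (fmt_trace is a hypothetical helper, not QEMU code):

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Format a 32-bit nsid and a 64-bit slba with PRI macros matched to
 * their types, as the review asks for the zoned trace events. */
static int fmt_trace(char *buf, size_t n, uint32_t nsid, uint64_t slba)
{
    return snprintf(buf, n, "nsid=%" PRIu32 " slba=0x%" PRIx64, nsid, slba);
}
```

Using a mismatched pair (say, printing a uint32_t with a plain %d, or a uint64_t with PRIx32) is undefined behavior in variadic calls, which is why the reviewer flags each one individually.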

Re: [PATCH v2 08/18] hw/block/nvme: Make Zoned NS Command Set definitions

2020-06-30 Thread Klaus Jensen
On Jun 30 13:44, Klaus Jensen wrote:
> On Jun 18 06:34, Dmitry Fomichev wrote:
> > Define values and structures that are needed to support Zoned
> > Namespace Command Set (NVMe TP 4053) in PCI NVMe controller emulator.
> > 
> > All new protocol definitions are located in include/block/nvme.h
> > and everything added that is specific to this implementation is kept
> > in hw/block/nvme.h.
> > 
> > In order to improve scalability, all open, closed and full zones
> > are organized in separate linked lists. Consequently, almost all
> > zone operations don't require scanning of the entire zone array
> > (which potentially can be quite large) - it is only necessary to
> > enumerate one or more zone lists. Zone lists are designed to be
> > position-independent as they can be persisted to the backing file
> > as a part of zone metadata. NvmeZoneList struct defined in this patch
> > serves as a head of every zone list.
> > 
> > NvmeZone structure encapsulates NvmeZoneDescriptor defined in Zoned
> > Command Set specification and adds a few more fields that are
> > internal to this implementation.
> > 
> > Signed-off-by: Niklas Cassel 
> > Signed-off-by: Hans Holmberg 
> > Signed-off-by: Ajay Joshi 
> > Signed-off-by: Matias Bjorling 
> > Signed-off-by: Shin'ichiro Kawasaki 
> > Signed-off-by: Alexey Bogoslavsky 
> > Signed-off-by: Dmitry Fomichev 
> > ---
> >  hw/block/nvme.h  | 130 +++
> >  include/block/nvme.h | 119 ++-
> >  2 files changed, 248 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/block/nvme.h b/hw/block/nvme.h
> > index 0d29f75475..2c932b5e29 100644
> > --- a/hw/block/nvme.h
> > +++ b/hw/block/nvme.h
> > @@ -121,6 +165,86 @@ static inline uint64_t nvme_ns_nlbas(NvmeCtrl *n, NvmeNamespace *ns)
> >  return n->ns_size >> nvme_ns_lbads(ns);
> >  }
> >  
> > +static inline uint8_t nvme_get_zone_state(NvmeZone *zone)
> > +{
> > +return zone->d.zs >> 4;
> > +}
> > +
> > +static inline void nvme_set_zone_state(NvmeZone *zone, enum NvmeZoneState state)
> > +{
> > +zone->d.zs = state << 4;
> > +}
> > +
> > +static inline uint64_t nvme_zone_rd_boundary(NvmeCtrl *n, NvmeZone *zone)
> > +{
> > +return zone->d.zslba + n->params.zone_size;
> > +}
> > +
> > +static inline uint64_t nvme_zone_wr_boundary(NvmeZone *zone)
> > +{
> > +return zone->d.zslba + zone->d.zcap;
> > +}
> 
> Everything working on zone->d needs leXX_to_cpu() conversions.

Disregard this. I see from the following patches that you keep zone->d
in cpu endianess and convert on zone management receive.

Sorry!
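The point conceded here reflects a common device-emulation pattern: keep descriptors in CPU endianness internally and convert only where they cross the wire, e.g. when building the Zone Management Receive payload. A minimal, portable sketch of that conversion step (put_le64 is a stand-in for QEMU's cpu_to_le64()/stq_le_p(); serializing byte-by-byte yields little-endian order regardless of host endianness):

```c
#include <assert.h>
#include <stdint.h>

/* Serialize a 64-bit value into a buffer in little-endian byte order,
 * independent of the host's native endianness. */
static void put_le64(uint8_t *buf, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        buf[i] = (uint8_t)(v >> (8 * i));
    }
}
```

Doing this once at the boundary keeps all internal comparisons and arithmetic (write-pointer checks, zone boundaries) free of leXX_to_cpu() noise.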



Re: [PATCH v2 08/18] hw/block/nvme: Make Zoned NS Command Set definitions

2020-06-30 Thread Klaus Jensen
On Jun 18 06:34, Dmitry Fomichev wrote:
> Define values and structures that are needed to support Zoned
> Namespace Command Set (NVMe TP 4053) in PCI NVMe controller emulator.
> 
> All new protocol definitions are located in include/block/nvme.h
> and everything added that is specific to this implementation is kept
> in hw/block/nvme.h.
> 
> In order to improve scalability, all open, closed and full zones
> are organized in separate linked lists. Consequently, almost all
> zone operations don't require scanning of the entire zone array
> (which potentially can be quite large) - it is only necessary to
> enumerate one or more zone lists. Zone lists are designed to be
> position-independent as they can be persisted to the backing file
> as a part of zone metadata. NvmeZoneList struct defined in this patch
> serves as a head of every zone list.
> 
> NvmeZone structure encapsulates NvmeZoneDescriptor defined in Zoned
> Command Set specification and adds a few more fields that are
> internal to this implementation.
> 
> Signed-off-by: Niklas Cassel 
> Signed-off-by: Hans Holmberg 
> Signed-off-by: Ajay Joshi 
> Signed-off-by: Matias Bjorling 
> Signed-off-by: Shin'ichiro Kawasaki 
> Signed-off-by: Alexey Bogoslavsky 
> Signed-off-by: Dmitry Fomichev 
> ---
>  hw/block/nvme.h  | 130 +++
>  include/block/nvme.h | 119 ++-
>  2 files changed, 248 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/block/nvme.h b/hw/block/nvme.h
> index 0d29f75475..2c932b5e29 100644
> --- a/hw/block/nvme.h
> +++ b/hw/block/nvme.h
> @@ -3,12 +3,22 @@
>  
>  #include "block/nvme.h"
>  
> +#define NVME_DEFAULT_ZONE_SIZE   128 /* MiB */
> +#define NVME_DEFAULT_MAX_ZA_SIZE 128 /* KiB */
> +
>  typedef struct NvmeParams {
>  char *serial;
>  uint32_t num_queues; /* deprecated since 5.1 */
>  uint32_t max_ioqpairs;
>  uint16_t msix_qsize;
>  uint32_t cmb_size_mb;
> +
> +bool            zoned;
> +bool            cross_zone_read;
> +uint8_t         fill_pattern;
> +uint32_t        zamds_bs;

Rename to zasl.

> +uint64_t        zone_size;
> +uint64_t        zone_capacity;
>  } NvmeParams;
>  
>  typedef struct NvmeAsyncEvent {
> @@ -17,6 +27,8 @@ typedef struct NvmeAsyncEvent {
>  
>  enum NvmeRequestFlags {
>  NVME_REQ_FLG_HAS_SG   = 1 << 0,
> +NVME_REQ_FLG_FILL = 1 << 1,
> +NVME_REQ_FLG_APPEND   = 1 << 2,
>  };
>  
>  typedef struct NvmeRequest {
> @@ -24,6 +36,7 @@ typedef struct NvmeRequest {
>  BlockAIOCB  *aiocb;
>  uint16_t        status;
>  uint16_t        flags;
> +uint64_t        fill_ofs;
>  NvmeCqe cqe;
>  BlockAcctCookie acct;
>  QEMUSGList  qsg;
> @@ -61,11 +74,35 @@ typedef struct NvmeCQueue {
>  QTAILQ_HEAD(, NvmeRequest) req_list;
>  } NvmeCQueue;
>  
> +typedef struct NvmeZone {
> +NvmeZoneDescr   d;
> +uint64_t        tstamp;
> +uint32_t        next;
> +uint32_t        prev;
> +uint8_t rsvd80[8];
> +} NvmeZone;
> +
> +#define NVME_ZONE_LIST_NIL    UINT_MAX
> +
> +typedef struct NvmeZoneList {
> +uint32_t        head;
> +uint32_t        tail;
> +uint32_t        size;
> +uint8_t rsvd12[4];
> +} NvmeZoneList;
> +
>  typedef struct NvmeNamespace {
>  NvmeIdNs        id_ns;
>  uint32_t        nsid;
>  uint8_t         csi;
>  QemuUUID        uuid;
> +
> +NvmeIdNsZoned   *id_ns_zoned;
> +NvmeZone*zone_array;
> +NvmeZoneList*exp_open_zones;
> +NvmeZoneList*imp_open_zones;
> +NvmeZoneList*closed_zones;
> +NvmeZoneList*full_zones;
>  } NvmeNamespace;
>  
>  static inline NvmeLBAF *nvme_ns_lbaf(NvmeNamespace *ns)
> @@ -100,6 +137,7 @@ typedef struct NvmeCtrl {
>  uint32_t        num_namespaces;
>  uint32_t        max_q_ents;
>  uint64_t        ns_size;
> +
>  uint8_t         *cmbuf;
>  uint32_t        irq_status;
>  uint64_t        host_timestamp; /* Timestamp sent by the host */
> @@ -107,6 +145,12 @@ typedef struct NvmeCtrl {
>  
>  HostMemoryBackend *pmrdev;
>  
> +int             zone_file_fd;
> +uint32_t        num_zones;
> +uint64_t        zone_size_bs;
> +uint64_t        zone_array_size;
> +uint8_t         zamds;

Rename to zasl.

> +
>  NvmeNamespace   *namespaces;
>  NvmeSQueue  **sq;
>  NvmeCQueue  **cq;
> @@ -121,6 +165,86 @@ static inline uint64_t nvme_ns_nlbas(NvmeCtrl *n, NvmeNamespace *ns)
>  return n->ns_size >> nvme_ns_lbads(ns);
>  }
>  
> +static inline uint8_t nvme_get_zone_state(NvmeZone *zone)
> +{
> +return zone->d.zs >> 4;
> +}
> +
> +static inline void nvme_set_zone_state(NvmeZone *zone, enum NvmeZoneState state)
> +{
> +zone->d.zs = state << 4;
> +}
> +
> +static inline uint64_t nvme_zone_rd_boundary(NvmeCtrl *n, NvmeZone *zone)
> +{
> +return zone->d.zslba + n->params.zone_size;
> +}
> +
> 

Re: [PATCH v2 07/18] hw/block/nvme: Add support for Namespace Types

2020-06-30 Thread Klaus Jensen
On Jun 18 06:34, Dmitry Fomichev wrote:
> From: Niklas Cassel 
> 
> Namespace Types introduce a new command set, "I/O Command Sets",
> that allows the host to retrieve the command sets associated with
> a namespace. Introduce support for the command set, and enable
> detection for the NVM Command Set.
> 
> Signed-off-by: Niklas Cassel 
> Signed-off-by: Dmitry Fomichev 
> ---
>  hw/block/nvme.c | 210 ++--
>  hw/block/nvme.h |  11 +++
>  2 files changed, 216 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 03b8deee85..453f4747a5 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -686,6 +686,26 @@ static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeIdentify *c)
>  prp1, prp2);
>  }
>  
> +static uint16_t nvme_identify_ctrl_csi(NvmeCtrl *n, NvmeIdentify *c)
> +{
> +uint64_t prp1 = le64_to_cpu(c->prp1);
> +uint64_t prp2 = le64_to_cpu(c->prp2);
> +static const int data_len = NVME_IDENTIFY_DATA_SIZE;
> +uint32_t *list;
> +uint16_t ret;
> +
> +trace_pci_nvme_identify_ctrl_csi(c->csi);
> +
> +if (c->csi == NVME_CSI_NVM) {
> +list = g_malloc0(data_len);
> +ret = nvme_dma_read_prp(n, (uint8_t *)list, data_len, prp1, prp2);
> +g_free(list);
> +return ret;
> +} else {
> +return NVME_INVALID_FIELD | NVME_DNR;
> +}
> +}
> +
>  static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeIdentify *c)
>  {
>  NvmeNamespace *ns;
> @@ -701,11 +721,42 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeIdentify *c)
>  }
>  
>  ns = &n->namespaces[nsid - 1];
> +assert(nsid == ns->nsid);
>  
>  return nvme_dma_read_prp(n, (uint8_t *)&ns->id_ns, sizeof(ns->id_ns),
>  prp1, prp2);
>  }
>  
> +static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeIdentify *c)
> +{
> +NvmeNamespace *ns;
> +uint32_t nsid = le32_to_cpu(c->nsid);
> +uint64_t prp1 = le64_to_cpu(c->prp1);
> +uint64_t prp2 = le64_to_cpu(c->prp2);
> +static const int data_len = NVME_IDENTIFY_DATA_SIZE;
> +uint32_t *list;
> +uint16_t ret;
> +
> +trace_pci_nvme_identify_ns_csi(nsid, c->csi);
> +
> +if (unlikely(nsid == 0 || nsid > n->num_namespaces)) {
> +trace_pci_nvme_err_invalid_ns(nsid, n->num_namespaces);
> +return NVME_INVALID_NSID | NVME_DNR;
> +}
> +
> +ns = &n->namespaces[nsid - 1];
> +assert(nsid == ns->nsid);
> +
> +if (c->csi == NVME_CSI_NVM) {
> +list = g_malloc0(data_len);
> +ret = nvme_dma_read_prp(n, (uint8_t *)list, data_len, prp1, prp2);
> +g_free(list);
> +return ret;
> +} else {
> +return NVME_INVALID_FIELD | NVME_DNR;
> +}
> +}
> +
>  static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeIdentify *c)
>  {
>  static const int data_len = NVME_IDENTIFY_DATA_SIZE;
> @@ -733,6 +784,99 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeIdentify *c)
>  return ret;
>  }
>  
> +static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeIdentify *c)
> +{
> +static const int data_len = NVME_IDENTIFY_DATA_SIZE;
> +uint32_t min_nsid = le32_to_cpu(c->nsid);
> +uint64_t prp1 = le64_to_cpu(c->prp1);
> +uint64_t prp2 = le64_to_cpu(c->prp2);
> +uint32_t *list;
> +uint16_t ret;
> +int i, j = 0;
> +
> +trace_pci_nvme_identify_nslist_csi(min_nsid, c->csi);
> +
> +if (c->csi != NVME_CSI_NVM) {
> +return NVME_INVALID_FIELD | NVME_DNR;
> +}
> +
> +list = g_malloc0(data_len);
> +for (i = 0; i < n->num_namespaces; i++) {
> +if (i < min_nsid) {
> +continue;
> +}
> +list[j++] = cpu_to_le32(i + 1);
> +if (j == data_len / sizeof(uint32_t)) {
> +break;
> +}
> +}
> +ret = nvme_dma_read_prp(n, (uint8_t *)list, data_len, prp1, prp2);
> +g_free(list);
> +return ret;
> +}
> +
> +static uint16_t nvme_list_ns_descriptors(NvmeCtrl *n, NvmeIdentify *c)
> +{
> +NvmeNamespace *ns;
> +uint32_t nsid = le32_to_cpu(c->nsid);
> +uint64_t prp1 = le64_to_cpu(c->prp1);
> +uint64_t prp2 = le64_to_cpu(c->prp2);
> +void *buf_ptr;
> +NvmeNsIdDesc *desc;
> +static const int data_len = NVME_IDENTIFY_DATA_SIZE;
> +uint8_t *buf;
> +uint16_t status;
> +
> +trace_pci_nvme_list_ns_descriptors();
> +
> +if (unlikely(nsid == 0 || nsid > n->num_namespaces)) {
> +trace_pci_nvme_err_invalid_ns(nsid, n->num_namespaces);
> +return NVME_INVALID_NSID | NVME_DNR;
> +}
> +
> +ns = &n->namespaces[nsid - 1];
> +assert(nsid == ns->nsid);
> +
> +buf = g_malloc0(data_len);
> +buf_ptr = buf;
> +
> +desc = buf_ptr;
> +desc->nidt = NVME_NIDT_UUID;
> +desc->nidl = NVME_NIDL_UUID;
> +buf_ptr += sizeof(*desc);
> +memcpy(buf_ptr, ns->uuid.data, NVME_NIDL_UUID);
> +buf_ptr += NVME_NIDL_UUID;
> +
> +desc = buf_ptr;
> +desc->nidt = NVME_NIDT_CSI;
> +desc->nidl = 

[PATCH v3 3/4] hw/block/nvme: Fix pmrmsc register size

2020-06-30 Thread Philippe Mathieu-Daudé
The Persistent Memory Region Controller Memory Space Control
register is 64-bit wide. See 'Figure 68: Register Definition'
of the 'NVM Express Base Specification Revision 1.4'.

Fixes: 6cf9413229 ("introduce PMR support from NVMe 1.4 spec")
Reported-by: Klaus Jensen 
Reviewed-by: Klaus Jensen 
Signed-off-by: Philippe Mathieu-Daudé 
---
Cc: Andrzej Jakowski 
---
 include/block/nvme.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/block/nvme.h b/include/block/nvme.h
index 71c5681912..82c384614a 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -21,7 +21,7 @@ typedef struct QEMU_PACKED NvmeBar {
 uint32_t    pmrsts;
 uint32_t    pmrebs;
 uint32_t    pmrswtp;
-uint32_t    pmrmsc;
+uint64_t    pmrmsc;
 } NvmeBar;
 
 enum NvmeCapShift {
-- 
2.21.3




[PATCH v3 0/4] hw/block/nvme: Fix I/O BAR structure

2020-06-30 Thread Philippe Mathieu-Daudé
Improvements for the I/O BAR structure:
- correct pmrmsc register size (Klaus)
- pack structures
- align to 4KB

Since v2:
- Added Klaus' tags with correct address

$ git backport-diff -u v2
Key:
[----] : patches are identical
[dddd] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/4:[----] [--] 'hw/block/nvme: Update specification URL'
002/4:[----] [--] 'hw/block/nvme: Use QEMU_PACKED on hardware/packet structures'
003/4:[----] [--] 'hw/block/nvme: Fix pmrmsc register size'
004/4:[----] [--] 'hw/block/nvme: Align I/O BAR to 4 KiB'

Philippe Mathieu-Daudé (4):
  hw/block/nvme: Update specification URL
  hw/block/nvme: Use QEMU_PACKED on hardware/packet structures
  hw/block/nvme: Fix pmrmsc register size
  hw/block/nvme: Align I/O BAR to 4 KiB

 include/block/nvme.h | 42 ++
 hw/block/nvme.c  |  7 +++
 2 files changed, 25 insertions(+), 24 deletions(-)

-- 
2.21.3




[PATCH v3 2/4] hw/block/nvme: Use QEMU_PACKED on hardware/packet structures

2020-06-30 Thread Philippe Mathieu-Daudé
These structures describe either hardware registers or commands
('packets') sent to the hardware. To forbid the compiler from
optimizing and changing field alignment, mark the structures as
packed.

Reviewed-by: Klaus Jensen 
Signed-off-by: Philippe Mathieu-Daudé 
---
 include/block/nvme.h | 38 +++---
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/include/block/nvme.h b/include/block/nvme.h
index 1720ee1d51..71c5681912 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -1,7 +1,7 @@
 #ifndef BLOCK_NVME_H
 #define BLOCK_NVME_H
 
-typedef struct NvmeBar {
+typedef struct QEMU_PACKED NvmeBar {
 uint64_t    cap;
 uint32_t    vs;
 uint32_t    intms;
@@ -377,7 +377,7 @@ enum NvmePmrmscMask {
 #define NVME_PMRMSC_SET_CBA(pmrmsc, val)   \
 (pmrmsc |= (uint64_t)(val & PMRMSC_CBA_MASK) << PMRMSC_CBA_SHIFT)
 
-typedef struct NvmeCmd {
+typedef struct QEMU_PACKED NvmeCmd {
 uint8_t opcode;
 uint8_t fuse;
 uint16_tcid;
@@ -422,7 +422,7 @@ enum NvmeIoCommands {
 NVME_CMD_DSM= 0x09,
 };
 
-typedef struct NvmeDeleteQ {
+typedef struct QEMU_PACKED NvmeDeleteQ {
 uint8_t opcode;
 uint8_t flags;
 uint16_tcid;
@@ -432,7 +432,7 @@ typedef struct NvmeDeleteQ {
 uint32_trsvd11[5];
 } NvmeDeleteQ;
 
-typedef struct NvmeCreateCq {
+typedef struct QEMU_PACKED NvmeCreateCq {
 uint8_t opcode;
 uint8_t flags;
 uint16_tcid;
@@ -449,7 +449,7 @@ typedef struct NvmeCreateCq {
 #define NVME_CQ_FLAGS_PC(cq_flags)  (cq_flags & 0x1)
 #define NVME_CQ_FLAGS_IEN(cq_flags) ((cq_flags >> 1) & 0x1)
 
-typedef struct NvmeCreateSq {
+typedef struct QEMU_PACKED NvmeCreateSq {
 uint8_t opcode;
 uint8_t flags;
 uint16_tcid;
@@ -474,7 +474,7 @@ enum NvmeQueueFlags {
 NVME_Q_PRIO_LOW = 3,
 };
 
-typedef struct NvmeIdentify {
+typedef struct QEMU_PACKED NvmeIdentify {
 uint8_t opcode;
 uint8_t flags;
 uint16_tcid;
@@ -486,7 +486,7 @@ typedef struct NvmeIdentify {
 uint32_trsvd11[5];
 } NvmeIdentify;
 
-typedef struct NvmeRwCmd {
+typedef struct QEMU_PACKED NvmeRwCmd {
 uint8_t opcode;
 uint8_t flags;
 uint16_tcid;
@@ -528,7 +528,7 @@ enum {
 NVME_RW_PRINFO_PRCHK_REF= 1 << 10,
 };
 
-typedef struct NvmeDsmCmd {
+typedef struct QEMU_PACKED NvmeDsmCmd {
 uint8_t opcode;
 uint8_t flags;
 uint16_tcid;
@@ -547,7 +547,7 @@ enum {
 NVME_DSMGMT_AD  = 1 << 2,
 };
 
-typedef struct NvmeDsmRange {
+typedef struct QEMU_PACKED NvmeDsmRange {
 uint32_tcattr;
 uint32_tnlb;
 uint64_tslba;
@@ -569,14 +569,14 @@ enum NvmeAsyncEventRequest {
 NVME_AER_INFO_SMART_SPARE_THRESH= 2,
 };
 
-typedef struct NvmeAerResult {
+typedef struct QEMU_PACKED NvmeAerResult {
 uint8_t event_type;
 uint8_t event_info;
 uint8_t log_page;
 uint8_t resv;
 } NvmeAerResult;
 
-typedef struct NvmeCqe {
+typedef struct QEMU_PACKED NvmeCqe {
 uint32_tresult;
 uint32_trsvd;
 uint16_tsq_head;
@@ -634,7 +634,7 @@ enum NvmeStatusCodes {
 NVME_NO_COMPLETE = 0xffff,
 };
 
-typedef struct NvmeFwSlotInfoLog {
+typedef struct QEMU_PACKED NvmeFwSlotInfoLog {
 uint8_t afi;
 uint8_t reserved1[7];
 uint8_t frs1[8];
@@ -647,7 +647,7 @@ typedef struct NvmeFwSlotInfoLog {
 uint8_t reserved2[448];
 } NvmeFwSlotInfoLog;
 
-typedef struct NvmeErrorLog {
+typedef struct QEMU_PACKED NvmeErrorLog {
 uint64_terror_count;
 uint16_tsqid;
 uint16_tcid;
@@ -659,7 +659,7 @@ typedef struct NvmeErrorLog {
 uint8_t resv[35];
 } NvmeErrorLog;
 
-typedef struct NvmeSmartLog {
+typedef struct QEMU_PACKED NvmeSmartLog {
 uint8_t critical_warning;
 uint8_t temperature[2];
 uint8_t available_spare;
@@ -693,7 +693,7 @@ enum LogIdentifier {
 NVME_LOG_FW_SLOT_INFO   = 0x03,
 };
 
-typedef struct NvmePSD {
+typedef struct QEMU_PACKED NvmePSD {
 uint16_tmp;
 uint16_treserved;
 uint32_tenlat;
@@ -713,7 +713,7 @@ enum {
 NVME_ID_CNS_NS_ACTIVE_LIST = 0x2,
 };
 
-typedef struct NvmeIdCtrl {
+typedef struct QEMU_PACKED NvmeIdCtrl {
 uint16_tvid;
 uint16_tssvid;
 uint8_t sn[20];
@@ -807,7 +807,7 @@ enum NvmeFeatureIds {
 NVME_SOFTWARE_PROGRESS_MARKER   = 0x80
 };
 
-typedef struct NvmeRangeType {
+typedef struct QEMU_PACKED NvmeRangeType {
 uint8_t type;
 uint8_t attributes;
 uint8_t rsvd2[14];
@@ -817,13 +817,13 @@ typedef struct NvmeRangeType {
 uint8_t rsvd48[16];
 } NvmeRangeType;
 
-typedef struct NvmeLBAF {
+typedef struct QEMU_PACKED NvmeLBAF {
 uint16_tms;
 uint8_t ds;
 uint8_t rp;
 } NvmeLBAF;
 
-typedef struct NvmeIdNs {
+typedef struct QEMU_PACKED NvmeIdNs {
 uint64_tnsze;
 uint64_tncap;
 uint64_tnuse;
-- 
2.21.3




[PATCH v3 4/4] hw/block/nvme: Align I/O BAR to 4 KiB

2020-06-30 Thread Philippe Mathieu-Daudé
Simplify the NVMe emulated device by aligning the I/O BAR to 4 KiB.

Reviewed-by: Klaus Jensen 
Signed-off-by: Philippe Mathieu-Daudé 
---
 include/block/nvme.h | 2 ++
 hw/block/nvme.c  | 5 ++---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/block/nvme.h b/include/block/nvme.h
index 82c384614a..4e1cea576a 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -22,6 +22,7 @@ typedef struct QEMU_PACKED NvmeBar {
 uint32_t    pmrebs;
 uint32_t    pmrswtp;
 uint64_t    pmrmsc;
+uint8_t reserved[484];
 } NvmeBar;
 
 enum NvmeCapShift {
@@ -879,6 +880,7 @@ enum NvmeIdNsDps {
 
 static inline void _nvme_check_size(void)
 {
+QEMU_BUILD_BUG_ON(sizeof(NvmeBar) != 4096);
 QEMU_BUILD_BUG_ON(sizeof(NvmeAerResult) != 4);
 QEMU_BUILD_BUG_ON(sizeof(NvmeCqe) != 16);
 QEMU_BUILD_BUG_ON(sizeof(NvmeDsmRange) != 16);
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 6628d0a4ba..2aa54bc20e 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -55,7 +55,6 @@
 #include "nvme.h"
 
 #define NVME_MAX_IOQPAIRS 0xffff
-#define NVME_REG_SIZE 0x1000
 #define NVME_DB_SIZE  4
 #define NVME_CMB_BIR 2
 #define NVME_PMR_BIR 2
@@ -1322,7 +1321,7 @@ static void nvme_mmio_write(void *opaque, hwaddr addr, uint64_t data,
 NvmeCtrl *n = (NvmeCtrl *)opaque;
 if (addr < sizeof(n->bar)) {
 nvme_write_bar(n, addr, data, size);
-} else if (addr >= 0x1000) {
+} else {
 nvme_process_db(n, addr, data);
 }
 }
@@ -1416,7 +1415,7 @@ static void nvme_init_state(NvmeCtrl *n)
 {
 n->num_namespaces = 1;
 /* add one to max_ioqpairs to account for the admin queue pair */
-n->reg_size = pow2ceil(NVME_REG_SIZE +
+n->reg_size = pow2ceil(sizeof(NvmeBar) +
2 * (n->params.max_ioqpairs + 1) * NVME_DB_SIZE);
 n->namespaces = g_new0(NvmeNamespace, n->num_namespaces);
 n->sq = g_new0(NvmeSQueue *, n->params.max_ioqpairs + 1);
-- 
2.21.3




[PATCH v3 1/4] hw/block/nvme: Update specification URL

2020-06-30 Thread Philippe Mathieu-Daudé
At some point the URL changed; update it so that other developers
do not have to search for it.

Reviewed-by: Klaus Jensen 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/block/nvme.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 1aee042d4c..6628d0a4ba 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -11,7 +11,7 @@
 /**
  * Reference Specs: http://www.nvmexpress.org, 1.2, 1.1, 1.0e
  *
- *  http://www.nvmexpress.org/resources/
+ *  https://nvmexpress.org/developers/nvme-specification/
  */
 
 /**
-- 
2.21.3




Re: [PATCH v2 2/4] hw/block/nvme: Use QEMU_PACKED on hardware/packet structures

2020-06-30 Thread Klaus Jensen
On Jun 30 12:37, Philippe Mathieu-Daudé wrote:
> These structures describe either hardware registers or commands
> ('packets') sent to the hardware. To forbid the compiler from
> optimizing and changing field alignment, mark the structures as
> packed.
> 
> Signed-off-by: Philippe Mathieu-Daudé 

Reviewed-by: Klaus Jensen 

> ---
>  include/block/nvme.h | 38 +++---
>  1 file changed, 19 insertions(+), 19 deletions(-)
> 
> diff --git a/include/block/nvme.h b/include/block/nvme.h
> index 1720ee1d51..71c5681912 100644
> --- a/include/block/nvme.h
> +++ b/include/block/nvme.h
> @@ -1,7 +1,7 @@
>  #ifndef BLOCK_NVME_H
>  #define BLOCK_NVME_H
>  
> -typedef struct NvmeBar {
> +typedef struct QEMU_PACKED NvmeBar {
>  uint64_t    cap;
>  uint32_t    vs;
>  uint32_t    intms;
> @@ -377,7 +377,7 @@ enum NvmePmrmscMask {
>  #define NVME_PMRMSC_SET_CBA(pmrmsc, val)   \
>  (pmrmsc |= (uint64_t)(val & PMRMSC_CBA_MASK) << PMRMSC_CBA_SHIFT)
>  
> -typedef struct NvmeCmd {
> +typedef struct QEMU_PACKED NvmeCmd {
>  uint8_t opcode;
>  uint8_t fuse;
>  uint16_tcid;
> @@ -422,7 +422,7 @@ enum NvmeIoCommands {
>  NVME_CMD_DSM= 0x09,
>  };
>  
> -typedef struct NvmeDeleteQ {
> +typedef struct QEMU_PACKED NvmeDeleteQ {
>  uint8_t opcode;
>  uint8_t flags;
>  uint16_tcid;
> @@ -432,7 +432,7 @@ typedef struct NvmeDeleteQ {
>  uint32_trsvd11[5];
>  } NvmeDeleteQ;
>  
> -typedef struct NvmeCreateCq {
> +typedef struct QEMU_PACKED NvmeCreateCq {
>  uint8_t opcode;
>  uint8_t flags;
>  uint16_tcid;
> @@ -449,7 +449,7 @@ typedef struct NvmeCreateCq {
>  #define NVME_CQ_FLAGS_PC(cq_flags)  (cq_flags & 0x1)
>  #define NVME_CQ_FLAGS_IEN(cq_flags) ((cq_flags >> 1) & 0x1)
>  
> -typedef struct NvmeCreateSq {
> +typedef struct QEMU_PACKED NvmeCreateSq {
>  uint8_t opcode;
>  uint8_t flags;
>  uint16_tcid;
> @@ -474,7 +474,7 @@ enum NvmeQueueFlags {
>  NVME_Q_PRIO_LOW = 3,
>  };
>  
> -typedef struct NvmeIdentify {
> +typedef struct QEMU_PACKED NvmeIdentify {
>  uint8_t opcode;
>  uint8_t flags;
>  uint16_tcid;
> @@ -486,7 +486,7 @@ typedef struct NvmeIdentify {
>  uint32_trsvd11[5];
>  } NvmeIdentify;
>  
> -typedef struct NvmeRwCmd {
> +typedef struct QEMU_PACKED NvmeRwCmd {
>  uint8_t opcode;
>  uint8_t flags;
>  uint16_tcid;
> @@ -528,7 +528,7 @@ enum {
>  NVME_RW_PRINFO_PRCHK_REF= 1 << 10,
>  };
>  
> -typedef struct NvmeDsmCmd {
> +typedef struct QEMU_PACKED NvmeDsmCmd {
>  uint8_t opcode;
>  uint8_t flags;
>  uint16_tcid;
> @@ -547,7 +547,7 @@ enum {
>  NVME_DSMGMT_AD  = 1 << 2,
>  };
>  
> -typedef struct NvmeDsmRange {
> +typedef struct QEMU_PACKED NvmeDsmRange {
>  uint32_tcattr;
>  uint32_tnlb;
>  uint64_tslba;
> @@ -569,14 +569,14 @@ enum NvmeAsyncEventRequest {
>  NVME_AER_INFO_SMART_SPARE_THRESH= 2,
>  };
>  
> -typedef struct NvmeAerResult {
> +typedef struct QEMU_PACKED NvmeAerResult {
>  uint8_t event_type;
>  uint8_t event_info;
>  uint8_t log_page;
>  uint8_t resv;
>  } NvmeAerResult;
>  
> -typedef struct NvmeCqe {
> +typedef struct QEMU_PACKED NvmeCqe {
>  uint32_tresult;
>  uint32_trsvd;
>  uint16_tsq_head;
> @@ -634,7 +634,7 @@ enum NvmeStatusCodes {
>  NVME_NO_COMPLETE = 0xffff,
>  };
>  
> -typedef struct NvmeFwSlotInfoLog {
> +typedef struct QEMU_PACKED NvmeFwSlotInfoLog {
>  uint8_t afi;
>  uint8_t reserved1[7];
>  uint8_t frs1[8];
> @@ -647,7 +647,7 @@ typedef struct NvmeFwSlotInfoLog {
>  uint8_t reserved2[448];
>  } NvmeFwSlotInfoLog;
>  
> -typedef struct NvmeErrorLog {
> +typedef struct QEMU_PACKED NvmeErrorLog {
>  uint64_terror_count;
>  uint16_tsqid;
>  uint16_tcid;
> @@ -659,7 +659,7 @@ typedef struct NvmeErrorLog {
>  uint8_t resv[35];
>  } NvmeErrorLog;
>  
> -typedef struct NvmeSmartLog {
> +typedef struct QEMU_PACKED NvmeSmartLog {
>  uint8_t critical_warning;
>  uint8_t temperature[2];
>  uint8_t available_spare;
> @@ -693,7 +693,7 @@ enum LogIdentifier {
>  NVME_LOG_FW_SLOT_INFO   = 0x03,
>  };
>  
> -typedef struct NvmePSD {
> +typedef struct QEMU_PACKED NvmePSD {
>  uint16_tmp;
>  uint16_treserved;
>  uint32_tenlat;
> @@ -713,7 +713,7 @@ enum {
>  NVME_ID_CNS_NS_ACTIVE_LIST = 0x2,
>  };
>  
> -typedef struct NvmeIdCtrl {
> +typedef struct QEMU_PACKED NvmeIdCtrl {
>  uint16_tvid;
>  uint16_tssvid;
>  uint8_t sn[20];
> @@ -807,7 +807,7 @@ enum NvmeFeatureIds {
>  NVME_SOFTWARE_PROGRESS_MARKER   = 0x80
>  };
>  
> -typedef struct NvmeRangeType {
> +typedef struct QEMU_PACKED NvmeRangeType {
>  uint8_t type;
>  uint8_t attributes;
>  uint8_t rsvd2[14];
> @@ -817,13 +817,13 @@ 

Re: [PATCH 2/4] migration: Add block-bitmap-mapping parameter

2020-06-30 Thread Dr. David Alan Gilbert
* Max Reitz (mre...@redhat.com) wrote:
> This migration parameter allows mapping block node names and bitmap
> names to aliases for the purpose of block dirty bitmap migration.
> 
> This way, management tools can use different node and bitmap names on
> the source and destination and pass the mapping of how bitmaps are to be
> transferred to qemu (on the source, the destination, or even both with
> arbitrary aliases in the migration stream).
> 
> Suggested-by: Vladimir Sementsov-Ogievskiy 
> Signed-off-by: Max Reitz 
> ---
>  qapi/migration.json|  83 +++-
>  migration/migration.h  |   3 +
>  migration/block-dirty-bitmap.c | 372 -
>  migration/migration.c  |  29 +++
>  4 files changed, 432 insertions(+), 55 deletions(-)
> 
> diff --git a/qapi/migration.json b/qapi/migration.json
> index d5000558c6..5aeae9bea8 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -507,6 +507,44 @@
>'data': [ 'none', 'zlib',
>  { 'name': 'zstd', 'if': 'defined(CONFIG_ZSTD)' } ] }
>  
> +##
> +# @BitmapMigrationBitmapAlias:
> +#
> +# @name: The name of the bitmap.
> +#
> +# @alias: An alias name for migration (for example the bitmap name on
> +# the opposite site).
> +#
> +# Since: 5.1
> +##
> +{ 'struct': 'BitmapMigrationBitmapAlias',
> +  'data': {
> +  'name': 'str',
> +  'alias': 'str'
> +  } }
> +
> +##
> +# @BitmapMigrationNodeAlias:
> +#
> +# Maps a block node name and the bitmaps it has to aliases for dirty
> +# bitmap migration.
> +#
> +# @node-name: A block node name.
> +#
> +# @alias: An alias block node name for migration (for example the
> +# node name on the opposite site).
> +#
> +# @bitmaps: Mappings for the bitmaps on this node.
> +#
> +# Since: 5.1
> +##
> +{ 'struct': 'BitmapMigrationNodeAlias',
> +  'data': {
> +  'node-name': 'str',
> +  'alias': 'str',
> +  'bitmaps': [ 'BitmapMigrationBitmapAlias' ]
> +  } }
> +
>  ##
>  # @MigrationParameter:
>  #
> @@ -641,6 +679,18 @@
>  #  will consume more CPU.
>  #  Defaults to 1. (Since 5.0)
>  #
> +# @block-bitmap-mapping: Maps block nodes and bitmaps on them to
> +#  aliases for the purpose of dirty bitmap migration.  Such
> +#  aliases may for example be the corresponding names on the
> +#  opposite site.
> +#  The mapping must be one-to-one and complete: On the source,
> +#  migrating a bitmap from a node when either is not mapped
> +#  will result in an error.  On the destination, similarly,
> +#  receiving a bitmap (by alias) from a node (by alias) when
> +#  either alias is not mapped will result in an error.
> +#  By default, all node names and bitmap names are mapped to
> +#  themselves. (Since 5.1)
> +#
>  # Since: 2.4
>  ##
>  { 'enum': 'MigrationParameter',
> @@ -655,7 +705,8 @@
> 'multifd-channels',
> 'xbzrle-cache-size', 'max-postcopy-bandwidth',
> 'max-cpu-throttle', 'multifd-compression',
> -   'multifd-zlib-level' ,'multifd-zstd-level' ] }
> +   'multifd-zlib-level' ,'multifd-zstd-level',
> +   'block-bitmap-mapping' ] }
>  
>  ##
>  # @MigrateSetParameters:
> @@ -781,6 +832,18 @@
>  #  will consume more CPU.
>  #  Defaults to 1. (Since 5.0)
>  #
> +# @block-bitmap-mapping: Maps block nodes and bitmaps on them to
> +#  aliases for the purpose of dirty bitmap migration.  Such
> +#  aliases may for example be the corresponding names on the
> +#  opposite site.
> +#  The mapping must be one-to-one and complete: On the source,
> +#  migrating a bitmap from a node when either is not mapped
> +#  will result in an error.  On the destination, similarly,
> +#  receiving a bitmap (by alias) from a node (by alias) when
> +#  either alias is not mapped will result in an error.
> +#  By default, all node names and bitmap names are mapped to
> +#  themselves. (Since 5.1)
> +#
>  # Since: 2.4
>  ##
>  # TODO either fuse back into MigrationParameters, or make
> @@ -811,7 +874,8 @@
>  '*max-cpu-throttle': 'int',
>  '*multifd-compression': 'MultiFDCompression',
>  '*multifd-zlib-level': 'int',
> -'*multifd-zstd-level': 'int' } }
> +'*multifd-zstd-level': 'int',
> +'*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ] } }

That's a hairy type for a migration parameter!
I'm curious what 'info migrate_parameters' does in hmp or what happens
if you try and set it?

Dave

>  
>  ##
>  # @migrate-set-parameters:
> @@ -957,6 +1021,18 @@
>  #  will consume more CPU.
>  #  Defaults to 1. (Since 5.0)
>  #
> +# @block-bitmap-mapping: Maps block nodes and bitmaps on them to
> +#  aliases for the purpose of dirty bitmap migration.  Such
> +#  aliases may for example be the corresponding names on 

Re: [PATCH v2 3/4] hw/block/nvme: Fix pmrmsc register size

2020-06-30 Thread Klaus Jensen
On Jun 30 12:37, Philippe Mathieu-Daudé wrote:
> The Persistent Memory Region Controller Memory Space Control
> register is 64-bit wide. See 'Figure 68: Register Definition'
> of the 'NVM Express Base Specification Revision 1.4'.
> 
> Fixes: 6cf9413229 ("introduce PMR support from NVMe 1.4 spec")
> Reported-by: Klaus Jensen 
> Signed-off-by: Philippe Mathieu-Daudé 

(if possible, please change the Reported-by to my Samsung address)

Reviewed-by: Klaus Jensen 

> ---
> Cc: Andrzej Jakowski 
> Cc: Keith Busch 
> ---
>  include/block/nvme.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/block/nvme.h b/include/block/nvme.h
> index 71c5681912..82c384614a 100644
> --- a/include/block/nvme.h
> +++ b/include/block/nvme.h
> @@ -21,7 +21,7 @@ typedef struct QEMU_PACKED NvmeBar {
>  uint32_t    pmrsts;
>  uint32_t    pmrebs;
>  uint32_t    pmrswtp;
> -uint32_t    pmrmsc;
> +uint64_t    pmrmsc;
>  } NvmeBar;
>  
>  enum NvmeCapShift {
> -- 
> 2.21.3
> 
> 



Re: [PATCH v2 1/4] hw/block/nvme: Update specification URL

2020-06-30 Thread Klaus Jensen
On Jun 30 12:37, Philippe Mathieu-Daudé wrote:
> At some point the URL changed; update it so that other developers
> do not have to search for it.
> 
> Signed-off-by: Philippe Mathieu-Daudé 

Reviewed-by: Klaus Jensen 

> ---
>  hw/block/nvme.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 1aee042d4c..6628d0a4ba 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -11,7 +11,7 @@
>  /**
>   * Reference Specs: http://www.nvmexpress.org, 1.2, 1.1, 1.0e
>   *
> - *  http://www.nvmexpress.org/resources/
> + *  https://nvmexpress.org/developers/nvme-specification/
>   */
>  
>  /**
> -- 
> 2.21.3
> 
> 



Re: [PATCH v6 00/15] hw/sd/sdcard: Fix CVE-2020-13253 & cleanups

2020-06-30 Thread no-reply
Patchew URL: https://patchew.org/QEMU/20200630100342.27625-1-f4...@amsat.org/



Hi,

This series failed the docker-quick@centos7 build test. Please find the
testing commands and their output below. If you have Docker installed,
you can probably reproduce it locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

qemu-system-aarch64: /tmp/qemu-test/src/hw/sd/sd.c:546: sd_addr_to_wpnum: 
Assertion `addr <= sd->size' failed.
Broken pipe
/tmp/qemu-test/src/tests/qtest/libqtest.c:175: kill_qemu() detected QEMU death 
from signal 6 (Aborted) (core dumped)
ERROR - too few tests run (expected 66, got 0)
make: *** [check-qtest-aarch64] Error 1
make: *** Waiting for unfinished jobs
  TESTiotest-qcow2: 080
  TESTiotest-qcow2: 086
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=4350b95811964ca0b89e805d9baa18ad', '-u', 
'1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', 
'-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 
'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', 
'/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-hzd3yz_c/src/docker-src.2020-06-30-06.28.34.4417:/var/tmp/qemu:z,ro',
 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=4350b95811964ca0b89e805d9baa18ad
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-hzd3yz_c/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    15m5.621s
user    0m8.700s


The full log is available at
http://patchew.org/logs/20200630100342.27625-1-f4...@amsat.org/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [PATCH v9 05/34] qcow2: Process QCOW2_CLUSTER_ZERO_ALLOC clusters in handle_copied()

2020-06-30 Thread Max Reitz
On 28.06.20 13:02, Alberto Garcia wrote:
> When writing to a qcow2 file there are two functions that take a
> virtual offset and return a host offset, possibly allocating new
> clusters if necessary:
> 
>- handle_copied() looks for normal data clusters that are already
>  allocated and have a reference count of 1. In those clusters we
>  can simply write the data and there is no need to perform any
>  copy-on-write.
> 
>- handle_alloc() looks for clusters that do need copy-on-write,
>  either because they haven't been allocated yet, because their
>  reference count is != 1 or because they are ZERO_ALLOC clusters.
> 
> The ZERO_ALLOC case is a bit special because those are clusters that
> are already allocated and they could perfectly be dealt with in
> handle_copied() (as long as copy-on-write is performed when required).
> 
> In fact, there is extra code specifically for them in handle_alloc()
> that tries to reuse the existing allocation if possible and frees them
> otherwise.
> 
> This patch changes the handling of ZERO_ALLOC clusters so the
> semantics of these two functions are now like this:
> 
>- handle_copied() looks for clusters that are already allocated and
>  which we can overwrite (NORMAL and ZERO_ALLOC clusters with a
>  reference count of 1).
> 
>- handle_alloc() looks for clusters for which we need a new
>  allocation (all other cases).
> 
> One important difference after this change is that clusters found
> in handle_copied() may now require copy-on-write, but this will be
> necessary anyway once we add support for subclusters.
> 
> Signed-off-by: Alberto Garcia 
> Reviewed-by: Eric Blake 
> ---
>  block/qcow2-cluster.c | 256 +++---
>  1 file changed, 141 insertions(+), 115 deletions(-)

Reviewed-by: Max Reitz 



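The new division of labour between handle_copied() and handle_alloc() can be sketched as a tiny predicate. This is an illustration only, with simplified stand-in types — not the actual qcow2 code; the enum names merely mirror the QCOW2_CLUSTER_* constants:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the qcow2 cluster types (illustrative only). */
typedef enum {
    CLUSTER_UNALLOCATED,
    CLUSTER_NORMAL,
    CLUSTER_ZERO_ALLOC,   /* preallocated, but reads back as zeroes */
    CLUSTER_COMPRESSED,
} ClusterType;

/*
 * After the patch: handle_copied() takes any cluster that is already
 * allocated and safe to overwrite in place -- NORMAL and ZERO_ALLOC
 * clusters with a reference count of 1.  Everything else needs a new
 * allocation and goes through handle_alloc().
 */
static bool handled_by_copied(ClusterType type, unsigned refcount)
{
    return (type == CLUSTER_NORMAL || type == CLUSTER_ZERO_ALLOC)
        && refcount == 1;
}
```

Before the patch, the ZERO_ALLOC branch above would have fallen through to handle_alloc() despite the cluster already being allocated.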


[PATCH v2 4/4] hw/block/nvme: Align I/O BAR to 4 KiB

2020-06-30 Thread Philippe Mathieu-Daudé
Simplify the emulated NVMe device by aligning the I/O BAR to 4 KiB.

Reviewed-by: Klaus Jensen 
Signed-off-by: Philippe Mathieu-Daudé 
---
v2: Do not include 'cmd_set_specfic' (Klaus)
---
 include/block/nvme.h | 2 ++
 hw/block/nvme.c  | 5 ++---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/block/nvme.h b/include/block/nvme.h
index 82c384614a..4e1cea576a 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -22,6 +22,7 @@ typedef struct QEMU_PACKED NvmeBar {
 uint32_t pmrebs;
 uint32_t pmrswtp;
 uint64_t pmrmsc;
+uint8_t reserved[484];
 } NvmeBar;
 
 enum NvmeCapShift {
@@ -879,6 +880,7 @@ enum NvmeIdNsDps {
 
 static inline void _nvme_check_size(void)
 {
+QEMU_BUILD_BUG_ON(sizeof(NvmeBar) != 4096);
 QEMU_BUILD_BUG_ON(sizeof(NvmeAerResult) != 4);
 QEMU_BUILD_BUG_ON(sizeof(NvmeCqe) != 16);
 QEMU_BUILD_BUG_ON(sizeof(NvmeDsmRange) != 16);
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 6628d0a4ba..2aa54bc20e 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -55,7 +55,6 @@
 #include "nvme.h"
 
 #define NVME_MAX_IOQPAIRS 0xffff
-#define NVME_REG_SIZE 0x1000
 #define NVME_DB_SIZE  4
 #define NVME_CMB_BIR 2
 #define NVME_PMR_BIR 2
@@ -1322,7 +1321,7 @@ static void nvme_mmio_write(void *opaque, hwaddr addr, 
uint64_t data,
 NvmeCtrl *n = (NvmeCtrl *)opaque;
 if (addr < sizeof(n->bar)) {
 nvme_write_bar(n, addr, data, size);
-} else if (addr >= 0x1000) {
+} else {
 nvme_process_db(n, addr, data);
 }
 }
@@ -1416,7 +1415,7 @@ static void nvme_init_state(NvmeCtrl *n)
 {
 n->num_namespaces = 1;
 /* add one to max_ioqpairs to account for the admin queue pair */
-n->reg_size = pow2ceil(NVME_REG_SIZE +
+n->reg_size = pow2ceil(sizeof(NvmeBar) +
2 * (n->params.max_ioqpairs + 1) * NVME_DB_SIZE);
 n->namespaces = g_new0(NvmeNamespace, n->num_namespaces);
 n->sq = g_new0(NvmeSQueue *, n->params.max_ioqpairs + 1);
-- 
2.21.3
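Why the padding matters can be sketched numerically: once sizeof(NvmeBar) is exactly 4 KiB, the doorbell registers start right after the register file (which is why the `addr >= 0x1000` test becomes a plain `else`), and reg_size rounds the whole MMIO window up to a power of two. A rough illustration — pow2ceil() here is a naive stand-in for QEMU's helper, and the queue counts are made up:

```c
#include <assert.h>
#include <stdint.h>

#define NVME_DB_SIZE 4

/* Naive stand-in for QEMU's pow2ceil(): round up to a power of two. */
static uint64_t pow2ceil(uint64_t value)
{
    uint64_t p = 1;
    while (p < value) {
        p <<= 1;
    }
    return p;
}

/*
 * With the reserved[484] padding, sizeof(NvmeBar) is exactly 4096, so
 * the doorbells begin immediately after the BAR header.  One extra
 * queue pair accounts for the admin queues.
 */
static uint64_t nvme_reg_size(uint64_t bar_header_size, unsigned max_ioqpairs)
{
    return pow2ceil(bar_header_size +
                    2 * (max_ioqpairs + 1) * (uint64_t)NVME_DB_SIZE);
}
```

For example, 64 I/O queue pairs need 4096 + 2 * 65 * 4 = 4616 bytes, rounded up to an 8 KiB region.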




[PATCH v2 1/4] hw/block/nvme: Update specification URL

2020-06-30 Thread Philippe Mathieu-Daudé
At some point the URL changed; update it to save other
developers from having to search for it.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/block/nvme.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 1aee042d4c..6628d0a4ba 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -11,7 +11,7 @@
 /**
  * Reference Specs: http://www.nvmexpress.org, 1.2, 1.1, 1.0e
  *
- *  http://www.nvmexpress.org/resources/
+ *  https://nvmexpress.org/developers/nvme-specification/
  */
 
 /**
-- 
2.21.3




[PATCH v2 3/4] hw/block/nvme: Fix pmrmsc register size

2020-06-30 Thread Philippe Mathieu-Daudé
The Persistent Memory Region Controller Memory Space Control
register is 64-bit wide. See 'Figure 68: Register Definition'
of the 'NVM Express Base Specification Revision 1.4'.

Fixes: 6cf9413229 ("introduce PMR support from NVMe 1.4 spec")
Reported-by: Klaus Jensen 
Signed-off-by: Philippe Mathieu-Daudé 
---
Cc: Andrzej Jakowski 
Cc: Keith Busch 
---
 include/block/nvme.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/block/nvme.h b/include/block/nvme.h
index 71c5681912..82c384614a 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -21,7 +21,7 @@ typedef struct QEMU_PACKED NvmeBar {
 uint32_t pmrsts;
 uint32_t pmrebs;
 uint32_t pmrswtp;
-uint32_t pmrmsc;
+uint64_t pmrmsc;
 } NvmeBar;
 
 enum NvmeCapShift {
-- 
2.21.3




[PATCH v2 2/4] hw/block/nvme: Use QEMU_PACKED on hardware/packet structures

2020-06-30 Thread Philippe Mathieu-Daudé
These structures describe either hardware registers or
commands ('packets') sent to the hardware. To forbid
the compiler from optimizing and changing field alignment,
mark the structures as packed.

Signed-off-by: Philippe Mathieu-Daudé 
---
 include/block/nvme.h | 38 +++---
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/include/block/nvme.h b/include/block/nvme.h
index 1720ee1d51..71c5681912 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -1,7 +1,7 @@
 #ifndef BLOCK_NVME_H
 #define BLOCK_NVME_H
 
-typedef struct NvmeBar {
+typedef struct QEMU_PACKED NvmeBar {
 uint64_t cap;
 uint32_t vs;
 uint32_t intms;
@@ -377,7 +377,7 @@ enum NvmePmrmscMask {
 #define NVME_PMRMSC_SET_CBA(pmrmsc, val)   \
 (pmrmsc |= (uint64_t)(val & PMRMSC_CBA_MASK) << PMRMSC_CBA_SHIFT)
 
-typedef struct NvmeCmd {
+typedef struct QEMU_PACKED NvmeCmd {
 uint8_t opcode;
 uint8_t fuse;
 uint16_t cid;
@@ -422,7 +422,7 @@ enum NvmeIoCommands {
 NVME_CMD_DSM= 0x09,
 };
 
-typedef struct NvmeDeleteQ {
+typedef struct QEMU_PACKED NvmeDeleteQ {
 uint8_t opcode;
 uint8_t flags;
 uint16_t cid;
 uint16_t sqid;
 uint32_t rsvd11[5];
 } NvmeDeleteQ;
 
-typedef struct NvmeCreateCq {
+typedef struct QEMU_PACKED NvmeCreateCq {
 uint8_t opcode;
 uint8_t flags;
 uint16_t cid;
@@ -449,7 +449,7 @@ typedef struct NvmeCreateCq {
 #define NVME_CQ_FLAGS_PC(cq_flags)  (cq_flags & 0x1)
 #define NVME_CQ_FLAGS_IEN(cq_flags) ((cq_flags >> 1) & 0x1)
 
-typedef struct NvmeCreateSq {
+typedef struct QEMU_PACKED NvmeCreateSq {
 uint8_t opcode;
 uint8_t flags;
 uint16_t cid;
@@ -474,7 +474,7 @@ enum NvmeQueueFlags {
 NVME_Q_PRIO_LOW = 3,
 };
 
-typedef struct NvmeIdentify {
+typedef struct QEMU_PACKED NvmeIdentify {
 uint8_t opcode;
 uint8_t flags;
 uint16_t cid;
@@ -486,7 +486,7 @@ typedef struct NvmeIdentify {
 uint32_t rsvd11[5];
 } NvmeIdentify;
 
-typedef struct NvmeRwCmd {
+typedef struct QEMU_PACKED NvmeRwCmd {
 uint8_t opcode;
 uint8_t flags;
 uint16_t cid;
@@ -528,7 +528,7 @@ enum {
 NVME_RW_PRINFO_PRCHK_REF= 1 << 10,
 };
 
-typedef struct NvmeDsmCmd {
+typedef struct QEMU_PACKED NvmeDsmCmd {
 uint8_t opcode;
 uint8_t flags;
 uint16_t cid;
@@ -547,7 +547,7 @@ enum {
 NVME_DSMGMT_AD  = 1 << 2,
 };
 
-typedef struct NvmeDsmRange {
+typedef struct QEMU_PACKED NvmeDsmRange {
 uint32_t cattr;
 uint32_t nlb;
 uint64_t slba;
@@ -569,14 +569,14 @@ enum NvmeAsyncEventRequest {
 NVME_AER_INFO_SMART_SPARE_THRESH= 2,
 };
 
-typedef struct NvmeAerResult {
+typedef struct QEMU_PACKED NvmeAerResult {
 uint8_t event_type;
 uint8_t event_info;
 uint8_t log_page;
 uint8_t resv;
 } NvmeAerResult;
 
-typedef struct NvmeCqe {
+typedef struct QEMU_PACKED NvmeCqe {
 uint32_t result;
 uint32_t rsvd;
 uint16_t sq_head;
@@ -634,7 +634,7 @@ enum NvmeStatusCodes {
 NVME_NO_COMPLETE = 0xffff,
 };
 
-typedef struct NvmeFwSlotInfoLog {
+typedef struct QEMU_PACKED NvmeFwSlotInfoLog {
 uint8_t afi;
 uint8_t reserved1[7];
 uint8_t frs1[8];
@@ -647,7 +647,7 @@ typedef struct NvmeFwSlotInfoLog {
 uint8_t reserved2[448];
 } NvmeFwSlotInfoLog;
 
-typedef struct NvmeErrorLog {
+typedef struct QEMU_PACKED NvmeErrorLog {
 uint64_t error_count;
 uint16_t sqid;
 uint16_t cid;
@@ -659,7 +659,7 @@ typedef struct NvmeErrorLog {
 uint8_t resv[35];
 } NvmeErrorLog;
 
-typedef struct NvmeSmartLog {
+typedef struct QEMU_PACKED NvmeSmartLog {
 uint8_t critical_warning;
 uint8_t temperature[2];
 uint8_t available_spare;
@@ -693,7 +693,7 @@ enum LogIdentifier {
 NVME_LOG_FW_SLOT_INFO   = 0x03,
 };
 
-typedef struct NvmePSD {
+typedef struct QEMU_PACKED NvmePSD {
 uint16_t mp;
 uint16_t reserved;
 uint32_t enlat;
@@ -713,7 +713,7 @@ enum {
 NVME_ID_CNS_NS_ACTIVE_LIST = 0x2,
 };
 
-typedef struct NvmeIdCtrl {
+typedef struct QEMU_PACKED NvmeIdCtrl {
 uint16_t vid;
 uint16_t ssvid;
 uint8_t sn[20];
@@ -807,7 +807,7 @@ enum NvmeFeatureIds {
 NVME_SOFTWARE_PROGRESS_MARKER   = 0x80
 };
 
-typedef struct NvmeRangeType {
+typedef struct QEMU_PACKED NvmeRangeType {
 uint8_t type;
 uint8_t attributes;
 uint8_t rsvd2[14];
@@ -817,13 +817,13 @@ typedef struct NvmeRangeType {
 uint8_t rsvd48[16];
 } NvmeRangeType;
 
-typedef struct NvmeLBAF {
+typedef struct QEMU_PACKED NvmeLBAF {
 uint16_t ms;
 uint8_t ds;
 uint8_t rp;
 } NvmeLBAF;
 
-typedef struct NvmeIdNs {
+typedef struct QEMU_PACKED NvmeIdNs {
 uint64_t nsze;
 uint64_t ncap;
 uint64_t nuse;
-- 
2.21.3
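The effect QEMU_PACKED (which expands to `__attribute__((packed))`) has on layout can be seen with a minimal pair of structs. The field names below are illustrative, not the real NVMe command layout:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative only -- not the actual NVMe structures. */
struct cmd_natural {
    uint8_t  opcode;
    uint64_t prp1;   /* the compiler may insert padding before this field */
};

struct __attribute__((packed)) cmd_packed {
    uint8_t  opcode;
    uint64_t prp1;   /* no padding: this field starts at byte offset 1 */
};
```

With natural alignment a typical 64-bit ABI pads the struct to 16 bytes; the packed variant is exactly the 9 bytes the on-the-wire layout requires, which is what lets the QEMU_BUILD_BUG_ON() size checks hold.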




[PATCH v2 0/4] hw/block/nvme: Fix I/O BAR structure

2020-06-30 Thread Philippe Mathieu-Daudé
Improvements for the I/O BAR structure:
- correct pmrmsc register size (Klaus)
- pack structures
- align to 4KB

Philippe Mathieu-Daudé (4):
  hw/block/nvme: Update specification URL
  hw/block/nvme: Use QEMU_PACKED on hardware/packet structures
  hw/block/nvme: Fix pmrmsc register size
  hw/block/nvme: Align I/O BAR to 4 KiB

 include/block/nvme.h | 42 ++
 hw/block/nvme.c  |  7 +++
 2 files changed, 25 insertions(+), 24 deletions(-)

-- 
2.21.3




Re: [PATCH 1/4] migration: Prevent memleak by ...params_test_apply

2020-06-30 Thread Dr. David Alan Gilbert
* Max Reitz (mre...@redhat.com) wrote:
> The created structure is not really a proper QAPI object, so we cannot
> and will not free its members.  Strings therein should therefore not be
> duplicated, or we will leak them.
> 
> Signed-off-by: Max Reitz 
> ---
>  migration/migration.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 481a590f72..47c7da4e55 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1336,12 +1336,12 @@ static void 
> migrate_params_test_apply(MigrateSetParameters *params,
>  
>  if (params->has_tls_creds) {
>  assert(params->tls_creds->type == QTYPE_QSTRING);
> -dest->tls_creds = g_strdup(params->tls_creds->u.s);
> +dest->tls_creds = params->tls_creds->u.s;
>  }
>  
>  if (params->has_tls_hostname) {
>  assert(params->tls_hostname->type == QTYPE_QSTRING);
> -dest->tls_hostname = g_strdup(params->tls_hostname->u.s);
> +dest->tls_hostname = params->tls_hostname->u.s;
>  }

Yeh I think I agree.

Reviewed-by: Dr. David Alan Gilbert 

>  
>  if (params->has_max_bandwidth) {
> -- 
> 2.26.2
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
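The leak pattern being fixed can be sketched in isolation. The struct and helper names below are hypothetical, and plain malloc/strcpy stands in for glib's g_strdup:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical scratch structure whose members are never freed. */
struct params {
    char *tls_creds;
};

/* The fix: borrow the caller's pointer -- no allocation, nothing leaks. */
static void apply_borrowing(struct params *dest, char *creds)
{
    dest->tls_creds = creds;
}

/* The bug: duplicate into a struct nobody frees -- the copy leaks. */
static void apply_leaking(struct params *dest, const char *creds)
{
    char *copy = malloc(strlen(creds) + 1);
    strcpy(copy, creds);
    dest->tls_creds = copy;
}
```

Since migrate_params_test_apply() fills a temporary that is not a proper QAPI object and is never passed to a free routine, borrowing is the correct choice.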




Re: [PATCH v9 02/34] qcow2: Convert qcow2_get_cluster_offset() into qcow2_get_host_offset()

2020-06-30 Thread Alberto Garcia
On Tue 30 Jun 2020 12:19:42 PM CEST, Max Reitz wrote:
>> @@ -537,8 +542,6 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, 
>> uint64_t offset,
>>  bytes_needed = bytes_available;
>>  }
>>  
>> -*cluster_offset = 0;
>> -
>
> You drop this line without replacement now.  That means that
> *host_offset is no longer set to 0 if the L1 entry is out of bounds or
> empty (which causes this function to return QCOW2_CLUSTER_UNALLOCATED
> and no error).  Was that intentional?

Hmm, no, it wasn't intentional.

It does not have any side effect, but I should explicitly set
it to 0. I'll fix it in the next version.

Berto



Re: [PATCH v2 06/18] hw/block/nvme: Define trace events related to NS Types

2020-06-30 Thread Klaus Jensen
On Jun 18 06:34, Dmitry Fomichev wrote:
> A few trace events are defined that are relevant to implementing
> Namespace Types (NVMe TP 4056).
> 
> Signed-off-by: Dmitry Fomichev 

Reviewed-by: Klaus Jensen 

> ---
>  hw/block/trace-events | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/hw/block/trace-events b/hw/block/trace-events
> index 423d491e27..3f3323fe38 100644
> --- a/hw/block/trace-events
> +++ b/hw/block/trace-events
> @@ -39,8 +39,13 @@ pci_nvme_create_cq(uint64_t addr, uint16_t cqid, uint16_t 
> vector, uint16_t size,
>  pci_nvme_del_sq(uint16_t qid) "deleting submission queue sqid=%"PRIu16""
>  pci_nvme_del_cq(uint16_t cqid) "deleted completion queue, cqid=%"PRIu16""
>  pci_nvme_identify_ctrl(void) "identify controller"
> +pci_nvme_identify_ctrl_csi(uint8_t csi) "identify controller, csi=0x%"PRIx8""
>  pci_nvme_identify_ns(uint16_t ns) "identify namespace, nsid=%"PRIu16""
> +pci_nvme_identify_ns_csi(uint16_t ns, uint8_t csi) "identify namespace, 
> nsid=%"PRIu16", csi=0x%"PRIx8""
>  pci_nvme_identify_nslist(uint16_t ns) "identify namespace list, 
> nsid=%"PRIu16""
> +pci_nvme_identify_nslist_csi(uint16_t ns, uint8_t csi) "identify namespace 
> list, nsid=%"PRIu16", csi=0x%"PRIx8""
> +pci_nvme_list_ns_descriptors(void) "identify namespace descriptors"
> +pci_nvme_identify_cmd_set(void) "identify i/o command set"
>  pci_nvme_getfeat_vwcache(const char* result) "get feature volatile write 
> cache, result=%s"
>  pci_nvme_getfeat_numq(int result) "get feature number of queues, result=%d"
>  pci_nvme_setfeat_numq(int reqcq, int reqsq, int gotcq, int gotsq) "requested 
> cq_count=%d sq_count=%d, responding with cq_count=%d sq_count=%d"
> @@ -59,6 +64,8 @@ pci_nvme_mmio_stopped(void) "cleared controller enable bit"
>  pci_nvme_mmio_shutdown_set(void) "shutdown bit set"
>  pci_nvme_mmio_shutdown_cleared(void) "shutdown bit cleared"
>  pci_nvme_cmd_supp_and_effects_log_read(void) "commands supported and effects 
> log read"
> +pci_nvme_css_nvm_cset_selected_by_host(uint32_t cc) "NVM command set 
> selected by host, bar.cc=0x%"PRIx32""
> +pci_nvme_css_all_csets_sel_by_host(uint32_t cc) "all supported command sets 
> selected by host, bar.cc=0x%"PRIx32""
>  
>  # nvme traces for error conditions
>  pci_nvme_err_invalid_dma(void) "PRP/SGL is too small for transfer size"
> @@ -72,6 +79,9 @@ pci_nvme_err_invalid_admin_opc(uint8_t opc) "invalid admin 
> opcode 0x%"PRIx8""
>  pci_nvme_err_invalid_lba_range(uint64_t start, uint64_t len, uint64_t limit) 
> "Invalid LBA start=%"PRIu64" len=%"PRIu64" limit=%"PRIu64""
>  pci_nvme_err_invalid_effects_log_offset(uint64_t ofs) "commands supported 
> and effects log offset must be 0, got %"PRIu64""
>  pci_nvme_err_invalid_effects_log_len(uint32_t len) "commands supported and 
> effects log size is 4096, got %"PRIu32""
> +pci_nvme_err_change_css_when_enabled(void) "changing CC.CSS while controller 
> is enabled"
> +pci_nvme_err_only_nvm_cmd_set_avail(void) "setting 110b CC.CSS, but only NVM 
> command set is enabled"
> +pci_nvme_err_invalid_iocsci(uint32_t idx) "unsupported command set 
> combination index %"PRIu32""
>  pci_nvme_err_invalid_del_sq(uint16_t qid) "invalid submission queue 
> deletion, sid=%"PRIu16""
>  pci_nvme_err_invalid_create_sq_cqid(uint16_t cqid) "failed creating 
> submission queue, invalid cqid=%"PRIu16""
>  pci_nvme_err_invalid_create_sq_sqid(uint16_t sqid) "failed creating 
> submission queue, invalid sqid=%"PRIu16""
> @@ -127,6 +137,7 @@ pci_nvme_ub_db_wr_invalid_cqhead(uint32_t qid, uint16_t 
> new_head) "completion qu
>  pci_nvme_ub_db_wr_invalid_sq(uint32_t qid) "submission queue doorbell write 
> for nonexistent queue, sqid=%"PRIu32", ignoring"
>  pci_nvme_ub_db_wr_invalid_sqtail(uint32_t qid, uint16_t new_tail) 
> "submission queue doorbell write value beyond queue size, sqid=%"PRIu32", 
> new_head=%"PRIu16", ignoring"
>  pci_nvme_unsupported_log_page(uint16_t lid) "unsupported log page 
> 0x%"PRIx16""
> +pci_nvme_ub_unknown_css_value(void) "unknown value in cc.css field"
>  
>  # xen-block.c
>  xen_block_realize(const char *type, uint32_t disk, uint32_t partition) "%s 
> d%up%u"
> -- 
> 2.21.0
> 
> 



[PATCH v6 11/15] hw/sd/sdcard: Constify sd_crc*()'s message argument

2020-06-30 Thread Philippe Mathieu-Daudé
CRC functions don't modify the buffer argument,
so make it const.

Reviewed-by: Alistair Francis 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 0fd672357c..2238ba066d 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -255,11 +255,11 @@ static const int sd_cmd_class[SDMMC_CMD_MAX] = {
 7,  7, 10,  7,  9,  9,  9,  8,  8, 10,  8,  8,  8,  8,  8,  8,
 };
 
-static uint8_t sd_crc7(void *message, size_t width)
+static uint8_t sd_crc7(const void *message, size_t width)
 {
 int i, bit;
 uint8_t shift_reg = 0x00;
-uint8_t *msg = (uint8_t *) message;
+const uint8_t *msg = (const uint8_t *)message;
 
 for (i = 0; i < width; i ++, msg ++)
 for (bit = 7; bit >= 0; bit --) {
@@ -271,11 +271,11 @@ static uint8_t sd_crc7(void *message, size_t width)
 return shift_reg;
 }
 
-static uint16_t sd_crc16(void *message, size_t width)
+static uint16_t sd_crc16(const void *message, size_t width)
 {
 int i, bit;
 uint16_t shift_reg = 0x0000;
-uint16_t *msg = (uint16_t *) message;
+const uint16_t *msg = (const uint16_t *)message;
 width <<= 1;
 
 for (i = 0; i < width; i ++, msg ++)
-- 
2.21.3
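For reference, the CRC7 used over SD command, CID and CSD frames (generator x^7 + x^3 + 1, MSB first) can be written as a standalone sketch. This is an independent illustration, not sd_crc7() verbatim; the classic sanity check is the CMD0 frame 40 00 00 00 00, whose CRC is 0x4a, giving the well-known trailing byte 0x95:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC7 over an SD frame, MSB first, polynomial x^7 + x^3 + 1. */
static uint8_t crc7(const uint8_t *msg, size_t len)
{
    uint8_t crc = 0;                 /* CRC kept in the low 7 bits */

    for (size_t i = 0; i < len; i++) {
        for (int bit = 7; bit >= 0; bit--) {
            int inv = ((msg[i] >> bit) & 1) ^ ((crc >> 6) & 1);
            crc = (crc << 1) & 0x7f;
            if (inv) {
                crc ^= 0x09;         /* feedback taps: x^3 + 1 */
            }
        }
    }
    return crc;
}
```

On the wire the 7-bit CRC is shifted left one position and the end bit 1 is appended, which is how 0x4a becomes 0x95.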




Re: [PATCH v9 02/34] qcow2: Convert qcow2_get_cluster_offset() into qcow2_get_host_offset()

2020-06-30 Thread Max Reitz
On 28.06.20 13:02, Alberto Garcia wrote:
> qcow2_get_cluster_offset() takes an (unaligned) guest offset and
> returns the (aligned) offset of the corresponding cluster in the qcow2
> image.
> 
> In practice none of the callers need to know where the cluster starts
> so this patch makes the function calculate and return the final host
> offset directly. The function is also renamed accordingly.
> 
> There is a pre-existing exception with compressed clusters: in this
> case the function returns the complete cluster descriptor (containing
> the offset and size of the compressed data). This does not change with
> this patch but it is now documented.
> 
> Signed-off-by: Alberto Garcia 
> Reviewed-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/qcow2.h |  4 ++--
>  block/qcow2-cluster.c | 42 +++---
>  block/qcow2.c | 24 +++-
>  3 files changed, 32 insertions(+), 38 deletions(-)

[...]

> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index 4b5fc8c4a7..9ab41cb728 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c

[...]

> @@ -537,8 +542,6 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, 
> uint64_t offset,
>  bytes_needed = bytes_available;
>  }
>  
> -*cluster_offset = 0;
> -

You drop this line without replacement now.  That means that
*host_offset is no longer set to 0 if the L1 entry is out of bounds or
empty (which causes this function to return QCOW2_CLUSTER_UNALLOCATED
and no error).  Was that intentional?

Max





[PATCH v6 10/15] hw/sd/sdcard: Simplify cmd_valid_while_locked()

2020-06-30 Thread Philippe Mathieu-Daudé
cmd_valid_while_locked() only needs to read SDRequest->cmd,
so pass the command directly and make it const.

Reviewed-by: Peter Maydell 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index ba4d0e0597..0fd672357c 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -1664,7 +1664,7 @@ static sd_rsp_type_t sd_app_command(SDState *sd,
 return sd_illegal;
 }
 
-static int cmd_valid_while_locked(SDState *sd, SDRequest *req)
+static int cmd_valid_while_locked(SDState *sd, const uint8_t cmd)
 {
 /* Valid commands in locked state:
  * basic class (0)
@@ -1675,13 +1675,12 @@ static int cmd_valid_while_locked(SDState *sd, 
SDRequest *req)
  * Anything else provokes an "illegal command" response.
  */
 if (sd->expecting_acmd) {
-return req->cmd == 41 || req->cmd == 42;
+return cmd == 41 || cmd == 42;
 }
-if (req->cmd == 16 || req->cmd == 55) {
+if (cmd == 16 || cmd == 55) {
 return 1;
 }
-return sd_cmd_class[req->cmd] == 0
-|| sd_cmd_class[req->cmd] == 7;
+return sd_cmd_class[cmd] == 0 || sd_cmd_class[cmd] == 7;
 }
 
 int sd_do_command(SDState *sd, SDRequest *req,
@@ -1707,7 +1706,7 @@ int sd_do_command(SDState *sd, SDRequest *req,
 }
 
 if (sd->card_status & CARD_IS_LOCKED) {
-if (!cmd_valid_while_locked(sd, req)) {
+if (!cmd_valid_while_locked(sd, req->cmd)) {
 sd->card_status |= ILLEGAL_COMMAND;
 sd->expecting_acmd = false;
 qemu_log_mask(LOG_GUEST_ERROR, "SD: Card is locked\n");
-- 
2.21.3




[PATCH v6 08/15] hw/sd/sdcard: Check address is in range

2020-06-30 Thread Philippe Mathieu-Daudé
As a defense, assert if the requested address is out of the card area.

Suggested-by: Peter Maydell 
Signed-off-by: Philippe Mathieu-Daudé 
---
v6: call sd_addr_to_wpnum on 'size - 1' in reset()
---
 hw/sd/sd.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 22392e5084..c6742c884d 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -537,8 +537,10 @@ static void sd_response_r7_make(SDState *sd, uint8_t 
*response)
 stl_be_p(response, sd->vhs);
 }
 
-static inline uint64_t sd_addr_to_wpnum(uint64_t addr)
+static uint64_t sd_addr_to_wpnum(SDState *sd, uint64_t addr)
 {
+assert(addr <= sd->size);
+
 return addr >> (HWBLOCK_SHIFT + SECTOR_SHIFT + WPGROUP_SHIFT);
 }
 
@@ -575,7 +577,7 @@ static void sd_reset(DeviceState *dev)
 sd_set_cardstatus(sd);
 sd_set_sdstatus(sd);
 
-sect = sd_addr_to_wpnum(size) + 1;
+sect = sd_addr_to_wpnum(sd, size - 1) + 1;
 g_free(sd->wp_groups);
 sd->wp_switch = sd->blk ? blk_is_read_only(sd->blk) : false;
 sd->wpgrps_size = sect;
@@ -759,8 +761,8 @@ static void sd_erase(SDState *sd)
 erase_end *= HWBLOCK_SIZE;
 }
 
-erase_start = sd_addr_to_wpnum(erase_start);
-erase_end = sd_addr_to_wpnum(erase_end);
+erase_start = sd_addr_to_wpnum(sd, erase_start);
+erase_end = sd_addr_to_wpnum(sd, erase_end);
 sd->erase_start = 0;
 sd->erase_end = 0;
 sd->csd[14] |= 0x40;
@@ -777,7 +779,7 @@ static uint32_t sd_wpbits(SDState *sd, uint64_t addr)
 uint32_t i, wpnum;
 uint32_t ret = 0;
 
-wpnum = sd_addr_to_wpnum(addr);
+wpnum = sd_addr_to_wpnum(sd, addr);
 
 for (i = 0; i < 32; i++, wpnum++, addr += WPGROUP_SIZE) {
 if (addr < sd->size && test_bit(wpnum, sd->wp_groups)) {
@@ -819,7 +821,7 @@ static void sd_function_switch(SDState *sd, uint32_t arg)
 
 static inline bool sd_wp_addr(SDState *sd, uint64_t addr)
 {
-return test_bit(sd_addr_to_wpnum(addr), sd->wp_groups);
+return test_bit(sd_addr_to_wpnum(sd, addr), sd->wp_groups);
 }
 
 static void sd_lock_command(SDState *sd)
@@ -1331,7 +1333,7 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, 
SDRequest req)
 }
 
 sd->state = sd_programming_state;
-set_bit(sd_addr_to_wpnum(addr), sd->wp_groups);
+set_bit(sd_addr_to_wpnum(sd, addr), sd->wp_groups);
 /* Bzzztt  Operation complete.  */
 sd->state = sd_transfer_state;
 return sd_r1b;
@@ -1350,7 +1352,7 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, 
SDRequest req)
 }
 
 sd->state = sd_programming_state;
-clear_bit(sd_addr_to_wpnum(addr), sd->wp_groups);
+clear_bit(sd_addr_to_wpnum(sd, addr), sd->wp_groups);
 /* Bzzztt  Operation complete.  */
 sd->state = sd_transfer_state;
 return sd_r1b;
-- 
2.21.3
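The shift chain behind sd_addr_to_wpnum() is easy to check in isolation: 9 + 5 + 7 = 21 bits, i.e. one write-protect group per 2 MiB, which is also why reset() must now pass size - 1 — size itself is one past the last valid address, so it would fail the new assertion. A sketch using the same constants:

```c
#include <assert.h>
#include <stdint.h>

#define HWBLOCK_SHIFT 9   /* 512-byte blocks */
#define SECTOR_SHIFT  5   /* 16 KiB sectors */
#define WPGROUP_SHIFT 7   /* 2 MiB write-protect groups */

/* Map a byte address to its write-protect group number. */
static uint64_t addr_to_wpnum(uint64_t addr)
{
    return addr >> (HWBLOCK_SHIFT + SECTOR_SHIFT + WPGROUP_SHIFT);
}
```

For a 64 MiB card, passing size - 1 yields group 31, and adding one gives the 32 groups the bitmap must cover — the same value the old `sd_addr_to_wpnum(size) + 1` computed, but without tripping the range assertion.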




[PATCH v6 13/15] hw/sd/sdcard: Correctly display the command name in trace events

2020-06-30 Thread Philippe Mathieu-Daudé
Some ACMDs were incorrectly displayed. Fix this by remembering
whether we are processing an ACMD (with current_cmd_is_acmd) and
adding the sd_current_cmd_name() helper, which displays the
correct name regardless of whether it is a CMD or an ACMD.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 504228198e..de194841a7 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -125,6 +125,7 @@ struct SDState {
 uint8_t pwd[16];
 uint32_t pwd_len;
 uint8_t function_group[6];
+bool current_cmd_is_acmd;
 uint8_t current_cmd;
 /* True if we will handle the next command as an ACMD. Note that this does
  * *not* track the APP_CMD status bit!
@@ -1704,6 +1705,8 @@ int sd_do_command(SDState *sd, SDRequest *req,
   req->cmd);
 req->cmd &= 0x3f;
 }
+sd->current_cmd = req->cmd;
+sd->current_cmd_is_acmd = sd->expecting_acmd;
 
 if (sd->card_status & CARD_IS_LOCKED) {
 if (!cmd_valid_while_locked(sd, req->cmd)) {
@@ -1731,7 +1734,6 @@ int sd_do_command(SDState *sd, SDRequest *req,
 /* Valid command, we can update the 'state before command' bits.
  * (Do this now so they appear in r1 responses.)
  */
-sd->current_cmd = req->cmd;
 sd->card_status &= ~CURRENT_STATE;
 sd->card_status |= (last_state << 9);
 }
@@ -1792,6 +1794,15 @@ send_response:
 return rsplen;
 }
 
+static const char *sd_current_cmd_name(SDState *sd)
+{
+if (sd->current_cmd_is_acmd) {
+return sd_acmd_name(sd->current_cmd);
+} else {
+return sd_cmd_name(sd->current_cmd);
+}
+}
+
 static void sd_blk_read(SDState *sd, uint64_t addr, uint32_t len)
 {
 trace_sdcard_read_block(addr, len);
@@ -1830,7 +1841,7 @@ void sd_write_data(SDState *sd, uint8_t value)
 return;
 
 trace_sdcard_write_data(sd->proto_name,
-sd_acmd_name(sd->current_cmd),
+sd_current_cmd_name(sd),
 sd->current_cmd, value);
 switch (sd->current_cmd) {
 case 24:   /* CMD24:  WRITE_SINGLE_BLOCK */
@@ -1984,7 +1995,7 @@ uint8_t sd_read_data(SDState *sd)
 io_len = (sd->ocr & (1 << 30)) ? HWBLOCK_SIZE : sd->blk_len;
 
 trace_sdcard_read_data(sd->proto_name,
-   sd_acmd_name(sd->current_cmd),
+   sd_current_cmd_name(sd),
sd->current_cmd, io_len);
 switch (sd->current_cmd) {
 case 6:/* CMD6:   SWITCH_FUNCTION */
-- 
2.21.3




[PATCH v6 15/15] hw/sd/sdcard: Simplify realize() a bit

2020-06-30 Thread Philippe Mathieu-Daudé
We don't need to check twice whether sd->blk is set.

Reviewed-by: Peter Maydell 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 7f973f6763..e1bba887b2 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -2140,12 +2140,12 @@ static void sd_realize(DeviceState *dev, Error **errp)
 return;
 }
 
-if (sd->blk && blk_is_read_only(sd->blk)) {
-error_setg(errp, "Cannot use read-only drive as SD card");
-return;
-}
-
 if (sd->blk) {
+if (blk_is_read_only(sd->blk)) {
+error_setg(errp, "Cannot use read-only drive as SD card");
+return;
+}
+
 ret = blk_set_perm(sd->blk, BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE,
BLK_PERM_ALL, errp);
 if (ret < 0) {
-- 
2.21.3




[PATCH v6 05/15] hw/sd/sdcard: Do not switch to ReceivingData if address is invalid

2020-06-30 Thread Philippe Mathieu-Daudé
Only move the state machine to ReceivingData if there is no
pending error. This avoids a later out-of-bounds access while
processing queued commands.

  "SD Specifications Part 1 Physical Layer Simplified Spec. v3.01"

  4.3.3 Data Read

  Read command is rejected if BLOCK_LEN_ERROR or ADDRESS_ERROR
  occurred and no data transfer is performed.

  4.3.4 Data Write

  Write command is rejected if BLOCK_LEN_ERROR or ADDRESS_ERROR
  occurred and no data transfer is performed.

WP_VIOLATION errors are not modified: the error bit is set, we
stay in receive-data state, wait for a stop command. All further
data transfer is ignored. See the check on sd->card_status at the
beginning of sd_read_data() and sd_write_data().

Fixes: CVE-2020-13253
Cc: Prasad J Pandit 
Reported-by: Alexander Bulekov 
Buglink: https://bugs.launchpad.net/qemu/+bug/1880822
Signed-off-by: Philippe Mathieu-Daudé 
---
v4: Only modify ADDRESS_ERROR, not WP_VIOLATION (pm215)
---
 hw/sd/sd.c | 34 ++
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 04451fdad2..7e0d684aca 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -1167,13 +1167,15 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, 
SDRequest req)
 case 17:   /* CMD17:  READ_SINGLE_BLOCK */
 switch (sd->state) {
 case sd_transfer_state:
-sd->state = sd_sendingdata_state;
-sd->data_start = addr;
-sd->data_offset = 0;
 
 if (sd->data_start + sd->blk_len > sd->size) {
 sd->card_status |= ADDRESS_ERROR;
+return sd_r1;
 }
+
+sd->state = sd_sendingdata_state;
+sd->data_start = addr;
+sd->data_offset = 0;
 return sd_r1;
 
 default:
@@ -1184,13 +1186,15 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, 
SDRequest req)
 case 18:   /* CMD18:  READ_MULTIPLE_BLOCK */
 switch (sd->state) {
 case sd_transfer_state:
-sd->state = sd_sendingdata_state;
-sd->data_start = addr;
-sd->data_offset = 0;
 
 if (sd->data_start + sd->blk_len > sd->size) {
 sd->card_status |= ADDRESS_ERROR;
+return sd_r1;
 }
+
+sd->state = sd_sendingdata_state;
+sd->data_start = addr;
+sd->data_offset = 0;
 return sd_r1;
 
 default:
@@ -1230,14 +1234,17 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, 
SDRequest req)
 /* Writing in SPI mode not implemented.  */
 if (sd->spi)
 break;
+
+if (sd->data_start + sd->blk_len > sd->size) {
+sd->card_status |= ADDRESS_ERROR;
+return sd_r1;
+}
+
 sd->state = sd_receivingdata_state;
 sd->data_start = addr;
 sd->data_offset = 0;
 sd->blk_written = 0;
 
-if (sd->data_start + sd->blk_len > sd->size) {
-sd->card_status |= ADDRESS_ERROR;
-}
 if (sd_wp_addr(sd, sd->data_start)) {
 sd->card_status |= WP_VIOLATION;
 }
@@ -1257,14 +1264,17 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, 
SDRequest req)
 /* Writing in SPI mode not implemented.  */
 if (sd->spi)
 break;
+
+if (sd->data_start + sd->blk_len > sd->size) {
+sd->card_status |= ADDRESS_ERROR;
+return sd_r1;
+}
+
 sd->state = sd_receivingdata_state;
 sd->data_start = addr;
 sd->data_offset = 0;
 sd->blk_written = 0;
 
-if (sd->data_start + sd->blk_len > sd->size) {
-sd->card_status |= ADDRESS_ERROR;
-}
 if (sd_wp_addr(sd, sd->data_start)) {
 sd->card_status |= WP_VIOLATION;
 }
-- 
2.21.3
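The essence of the fix is ordering: validate the request before any state is mutated, so a rejected read or write leaves the state machine untouched. A minimal sketch of the pattern, with simplified, made-up types rather than the emulator's actual ones:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { ST_TRANSFER, ST_SENDING_DATA };

struct card {
    int      state;
    uint64_t size;
    uint64_t blk_len;
    uint64_t data_start;
    bool     address_error;
};

/* Reject out-of-range reads *before* touching the state machine. */
static bool start_read(struct card *c, uint64_t addr)
{
    if (addr + c->blk_len > c->size) {
        c->address_error = true;
        return false;          /* state unchanged: no later OOB access */
    }
    c->state = ST_SENDING_DATA;
    c->data_start = addr;
    return true;
}
```

Before the patch the state transition happened first, so a command with ADDRESS_ERROR set could still leave the card in the data-transfer state and be exploited by subsequent queued commands.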




[PATCH v6 04/15] hw/sd/sdcard: Use the HWBLOCK_SIZE definition

2020-06-30 Thread Philippe Mathieu-Daudé
Replace the following different uses of the same value by
the same HWBLOCK_SIZE definition:
  - 512 (magic value)
  - 0x200 (magic value)
  - 1 << HWBLOCK_SHIFT

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 4816b4a462..04451fdad2 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -81,6 +81,7 @@ enum SDCardStates {
 };
 
 #define HWBLOCK_SHIFT   9   /* 512 bytes */
+#define HWBLOCK_SIZE(1 << HWBLOCK_SHIFT)
 #define SECTOR_SHIFT5   /* 16 kilobytes */
 #define WPGROUP_SHIFT   7   /* 2 megs */
 #define CMULT_SHIFT 9   /* 512 times HWBLOCK_SIZE */
@@ -129,7 +130,7 @@ struct SDState {
 uint32_t blk_written;
 uint64_t data_start;
 uint32_t data_offset;
-uint8_t data[512];
+uint8_t data[HWBLOCK_SIZE];
 qemu_irq readonly_cb;
 qemu_irq inserted_cb;
 QEMUTimer *ocr_power_timer;
@@ -410,7 +411,7 @@ static void sd_set_csd(SDState *sd, uint64_t size)
 ((HWBLOCK_SHIFT << 6) & 0xc0);
 sd->csd[14] = 0x00;/* File format group */
 } else {   /* SDHC */
-size /= 512 * KiB;
+size /= HWBLOCK_SIZE * KiB;
 size -= 1;
 sd->csd[0] = 0x40;
 sd->csd[1] = 0x0e;
@@ -574,7 +575,7 @@ static void sd_reset(DeviceState *dev)
 sd->erase_start = 0;
 sd->erase_end = 0;
 sd->size = size;
-sd->blk_len = 0x200;
+sd->blk_len = HWBLOCK_SIZE;
 sd->pwd_len = 0;
 sd->expecting_acmd = false;
 sd->dat_lines = 0xf;
@@ -685,7 +686,7 @@ static const VMStateDescription sd_vmstate = {
 VMSTATE_UINT32(blk_written, SDState),
 VMSTATE_UINT64(data_start, SDState),
 VMSTATE_UINT32(data_offset, SDState),
-VMSTATE_UINT8_ARRAY(data, SDState, 512),
+VMSTATE_UINT8_ARRAY(data, SDState, HWBLOCK_SIZE),
 VMSTATE_UNUSED_V(1, 512),
 VMSTATE_BOOL(enable, SDState),
 VMSTATE_END_OF_LIST()
@@ -754,8 +755,8 @@ static void sd_erase(SDState *sd)
 
 if (FIELD_EX32(sd->ocr, OCR, CARD_CAPACITY)) {
 /* High capacity memory card: erase units are 512 byte blocks */
-erase_start *= 512;
-erase_end *= 512;
+erase_start *= HWBLOCK_SIZE;
+erase_end *= HWBLOCK_SIZE;
 }
 
 erase_start = sd_addr_to_wpnum(erase_start);
@@ -1149,7 +1150,7 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, 
SDRequest req)
 case 16:   /* CMD16:  SET_BLOCKLEN */
 switch (sd->state) {
 case sd_transfer_state:
-if (req.arg > (1 << HWBLOCK_SHIFT)) {
+if (req.arg > HWBLOCK_SIZE) {
 sd->card_status |= BLOCK_LEN_ERROR;
 } else {
 trace_sdcard_set_blocklen(req.arg);
@@ -1961,7 +1962,7 @@ uint8_t sd_read_data(SDState *sd)
 if (sd->card_status & (ADDRESS_ERROR | WP_VIOLATION))
 return 0x00;
 
-io_len = (sd->ocr & (1 << 30)) ? 512 : sd->blk_len;
+io_len = (sd->ocr & (1 << 30)) ? HWBLOCK_SIZE : sd->blk_len;
 
 trace_sdcard_read_data(sd->proto_name,
sd_acmd_name(sd->current_cmd),
-- 
2.21.3




[PATCH v6 03/15] hw/sd/sdcard: Move some definitions to use them earlier

2020-06-30 Thread Philippe Mathieu-Daudé
Move some definitions to use them earlier.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index cac8d7d828..4816b4a462 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -80,6 +80,12 @@ enum SDCardStates {
 sd_disconnect_state,
 };
 
+#define HWBLOCK_SHIFT   9   /* 512 bytes */
+#define SECTOR_SHIFT5   /* 16 kilobytes */
+#define WPGROUP_SHIFT   7   /* 2 megs */
+#define CMULT_SHIFT 9   /* 512 times HWBLOCK_SIZE */
+#define WPGROUP_SIZE(1 << (HWBLOCK_SHIFT + SECTOR_SHIFT + WPGROUP_SHIFT))
+
 struct SDState {
 DeviceState parent_obj;
 
@@ -367,12 +373,6 @@ static void sd_set_cid(SDState *sd)
 sd->cid[15] = (sd_crc7(sd->cid, 15) << 1) | 1;
 }
 
-#define HWBLOCK_SHIFT  9   /* 512 bytes */
-#define SECTOR_SHIFT   5   /* 16 kilobytes */
-#define WPGROUP_SHIFT  7   /* 2 megs */
-#define CMULT_SHIFT9   /* 512 times HWBLOCK_SIZE */
-#define WPGROUP_SIZE   (1 << (HWBLOCK_SHIFT + SECTOR_SHIFT + WPGROUP_SHIFT))
-
 static const uint8_t sd_csd_rw_mask[16] = {
 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xfc, 0xfe,
-- 
2.21.3




[PATCH v6 07/15] hw/sd/sdcard: Initialize constant values first

2020-06-30 Thread Philippe Mathieu-Daudé
Reorder initialization code, constant values first.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 32 
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 871c30a67f..22392e5084 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -556,22 +556,6 @@ static void sd_reset(DeviceState *dev)
 }
 size = sect << 9;
 
-sect = sd_addr_to_wpnum(size) + 1;
-
-sd->state = sd_idle_state;
-sd->rca = 0x0000;
-sd_set_ocr(sd);
-sd_set_scr(sd);
-sd_set_cid(sd);
-sd_set_csd(sd, size);
-sd_set_cardstatus(sd);
-sd_set_sdstatus(sd);
-
-g_free(sd->wp_groups);
-sd->wp_switch = sd->blk ? blk_is_read_only(sd->blk) : false;
-sd->wpgrps_size = sect;
-sd->wp_groups = bitmap_new(sd->wpgrps_size);
-memset(sd->function_group, 0, sizeof(sd->function_group));
 sd->erase_start = 0;
 sd->erase_end = 0;
 sd->size = size;
@@ -581,6 +565,22 @@ static void sd_reset(DeviceState *dev)
 sd->dat_lines = 0xf;
 sd->cmd_line = true;
 sd->multi_blk_cnt = 0;
+sd->state = sd_idle_state;
+sd->rca = 0x0000;
+
+sd_set_ocr(sd);
+sd_set_scr(sd);
+sd_set_cid(sd);
+sd_set_csd(sd, size);
+sd_set_cardstatus(sd);
+sd_set_sdstatus(sd);
+
+sect = sd_addr_to_wpnum(size) + 1;
+g_free(sd->wp_groups);
+sd->wp_switch = sd->blk ? blk_is_read_only(sd->blk) : false;
+sd->wpgrps_size = sect;
+sd->wp_groups = bitmap_new(sd->wpgrps_size);
+memset(sd->function_group, 0, sizeof(sd->function_group));
 }
 
 static bool sd_get_inserted(SDState *sd)
-- 
2.21.3




[PATCH v6 14/15] hw/sd/sdcard: Display offset in read/write_data() trace events

2020-06-30 Thread Philippe Mathieu-Daudé
Having 'base address' and 'relative offset' displayed
separately is more helpful than the absolute address.

Reviewed-by: Peter Maydell 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 8 
 hw/sd/trace-events | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index de194841a7..7f973f6763 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -1841,8 +1841,8 @@ void sd_write_data(SDState *sd, uint8_t value)
 return;
 
 trace_sdcard_write_data(sd->proto_name,
-sd_current_cmd_name(sd),
-sd->current_cmd, value);
+sd_current_cmd_name(sd), sd->current_cmd,
+sd->data_start, sd->data_offset, value);
 switch (sd->current_cmd) {
 case 24:   /* CMD24:  WRITE_SINGLE_BLOCK */
 sd->data[sd->data_offset ++] = value;
@@ -1995,8 +1995,8 @@ uint8_t sd_read_data(SDState *sd)
 io_len = (sd->ocr & (1 << 30)) ? HWBLOCK_SIZE : sd->blk_len;
 
 trace_sdcard_read_data(sd->proto_name,
-   sd_current_cmd_name(sd),
-   sd->current_cmd, io_len);
+   sd_current_cmd_name(sd), sd->current_cmd,
+   sd->data_start, sd->data_offset, io_len);
 switch (sd->current_cmd) {
 case 6:/* CMD6:   SWITCH_FUNCTION */
 ret = sd->data[sd->data_offset ++];
diff --git a/hw/sd/trace-events b/hw/sd/trace-events
index d0cd7c6ec4..946923223b 100644
--- a/hw/sd/trace-events
+++ b/hw/sd/trace-events
@@ -51,8 +51,8 @@ sdcard_lock(void) ""
 sdcard_unlock(void) ""
 sdcard_read_block(uint64_t addr, uint32_t len) "addr 0x%" PRIx64 " size 0x%x"
 sdcard_write_block(uint64_t addr, uint32_t len) "addr 0x%" PRIx64 " size 0x%x"
-sdcard_write_data(const char *proto, const char *cmd_desc, uint8_t cmd, uint8_t value) "%s %20s/ CMD%02d value 0x%02x"
-sdcard_read_data(const char *proto, const char *cmd_desc, uint8_t cmd, uint32_t length) "%s %20s/ CMD%02d len %" PRIu32
+sdcard_write_data(const char *proto, const char *cmd_desc, uint8_t cmd, uint64_t address, uint32_t offset, uint8_t value) "%s %20s/ CMD%02d addr 0x%" PRIx64 " ofs 0x%" PRIx32 " val 0x%02x"
+sdcard_read_data(const char *proto, const char *cmd_desc, uint8_t cmd, uint64_t address, uint32_t offset, uint32_t length) "%s %20s/ CMD%02d addr 0x%" PRIx64 " ofs 0x%" PRIx32 " len %" PRIu32
 sdcard_set_voltage(uint16_t millivolts) "%u mV"
 
 # milkymist-memcard.c
-- 
2.21.3




[PATCH v6 02/15] hw/sd/sdcard: Update coding style to make checkpatch.pl happy

2020-06-30 Thread Philippe Mathieu-Daudé
From: Philippe Mathieu-Daudé 

To make the next commit easier to review, clean this code first.

Reviewed-by: Peter Maydell 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 24 
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 97a9d32964..cac8d7d828 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -1170,8 +1170,9 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req)
 sd->data_start = addr;
 sd->data_offset = 0;
 
-if (sd->data_start + sd->blk_len > sd->size)
+if (sd->data_start + sd->blk_len > sd->size) {
 sd->card_status |= ADDRESS_ERROR;
+}
 return sd_r1;
 
 default:
@@ -1186,8 +1187,9 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req)
 sd->data_start = addr;
 sd->data_offset = 0;
 
-if (sd->data_start + sd->blk_len > sd->size)
+if (sd->data_start + sd->blk_len > sd->size) {
 sd->card_status |= ADDRESS_ERROR;
+}
 return sd_r1;
 
 default:
@@ -1232,12 +1234,15 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req)
 sd->data_offset = 0;
 sd->blk_written = 0;
 
-if (sd->data_start + sd->blk_len > sd->size)
+if (sd->data_start + sd->blk_len > sd->size) {
 sd->card_status |= ADDRESS_ERROR;
-if (sd_wp_addr(sd, sd->data_start))
+}
+if (sd_wp_addr(sd, sd->data_start)) {
 sd->card_status |= WP_VIOLATION;
-if (sd->csd[14] & 0x30)
+}
+if (sd->csd[14] & 0x30) {
 sd->card_status |= WP_VIOLATION;
+}
 return sd_r1;
 
 default:
@@ -1256,12 +1261,15 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req)
 sd->data_offset = 0;
 sd->blk_written = 0;
 
-if (sd->data_start + sd->blk_len > sd->size)
+if (sd->data_start + sd->blk_len > sd->size) {
 sd->card_status |= ADDRESS_ERROR;
-if (sd_wp_addr(sd, sd->data_start))
+}
+if (sd_wp_addr(sd, sd->data_start)) {
 sd->card_status |= WP_VIOLATION;
-if (sd->csd[14] & 0x30)
+}
+if (sd->csd[14] & 0x30) {
 sd->card_status |= WP_VIOLATION;
+}
 return sd_r1;
 
 default:
-- 
2.21.3




[PATCH v6 12/15] hw/sd/sdcard: Make iolen unsigned

2020-06-30 Thread Philippe Mathieu-Daudé
From: Philippe Mathieu-Daudé 

I/O request length cannot be negative.

Signed-off-by: Philippe Mathieu-Daudé 
---
v4: Use uint32_t (pm215)
---
 hw/sd/sd.c | 2 +-
 hw/sd/trace-events | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 2238ba066d..504228198e 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -1967,7 +1967,7 @@ uint8_t sd_read_data(SDState *sd)
 {
 /* TODO: Append CRCs */
 uint8_t ret;
-int io_len;
+uint32_t io_len;
 
 if (!sd->blk || !blk_is_inserted(sd->blk) || !sd->enable)
 return 0x00;
diff --git a/hw/sd/trace-events b/hw/sd/trace-events
index 5f09d32eb2..d0cd7c6ec4 100644
--- a/hw/sd/trace-events
+++ b/hw/sd/trace-events
@@ -52,7 +52,7 @@ sdcard_unlock(void) ""
 sdcard_read_block(uint64_t addr, uint32_t len) "addr 0x%" PRIx64 " size 0x%x"
 sdcard_write_block(uint64_t addr, uint32_t len) "addr 0x%" PRIx64 " size 0x%x"
 sdcard_write_data(const char *proto, const char *cmd_desc, uint8_t cmd, uint8_t value) "%s %20s/ CMD%02d value 0x%02x"
-sdcard_read_data(const char *proto, const char *cmd_desc, uint8_t cmd, int length) "%s %20s/ CMD%02d len %d"
+sdcard_read_data(const char *proto, const char *cmd_desc, uint8_t cmd, uint32_t length) "%s %20s/ CMD%02d len %" PRIu32
 sdcard_set_voltage(uint16_t millivolts) "%u mV"
 
 # milkymist-memcard.c
-- 
2.21.3




[PATCH 10/10] hw/block/nvme: support reset/finish recommended limits

2020-06-30 Thread Klaus Jensen
Add the rrl and frl device parameters. The parameters specify the number
of seconds before the device may perform an internal operation to
"clear" the Reset Zone Recommended and Finish Zone Recommended
attributes respectively.

When the attributes are set is governed by the rrld and frld parameters
(Reset/Finish Recommended Limit Delay). The Reset Zone Recommended Delay
starts when a zone becomes full. The Finish Zone Recommended Delay
starts when the zone is first activated.  When the limits are reached,
the attributes are cleared again and the process is restarted.

If zone excursions are enabled (they are by default), when the Finish
Recommended Limit is reached, the device will finish the zone.

Signed-off-by: Klaus Jensen 
---
 hw/block/nvme-ns.c| 105 ++
 hw/block/nvme-ns.h|  13 ++
 hw/block/nvme.c   |  49 +---
 hw/block/nvme.h   |   7 +++
 hw/block/trace-events |   3 +-
 5 files changed, 160 insertions(+), 17 deletions(-)

diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index 3b9fa91c7af8..7f9b1d526197 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -25,6 +25,7 @@
 #include "hw/qdev-properties.h"
 #include "hw/qdev-core.h"
 
+#include "trace.h"
 #include "nvme.h"
 #include "nvme-ns.h"
 
@@ -48,6 +49,91 @@ const char *nvme_zs_to_str(NvmeZoneState zs)
 return NULL;
 }
 
+static void nvme_ns_process_timer(void *opaque)
+{
+NvmeNamespace *ns = opaque;
+BusState *s = qdev_get_parent_bus(&ns->parent_obj);
+NvmeCtrl *n = NVME(s->parent);
+NvmeZone *zone;
+
+trace_pci_nvme_ns_process_timer(ns->params.nsid);
+
+int64_t next_timer = INT64_MAX, now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+
+QTAILQ_FOREACH(zone, &ns->zns.resources.lru_open, lru_entry) {
+int64_t activated_ns = now - zone->stats.activated_ns;
+if (activated_ns < ns->zns.frld_ns) {
+next_timer = MIN(next_timer, zone->stats.activated_ns +
+ ns->zns.frld_ns);
+
+break;
+}
+
+if (activated_ns < ns->zns.frld_ns + ns->zns.frl_ns) {
+NVME_ZA_SET_FZR(zone->zd.za, 0x1);
+nvme_zone_changed(n, ns, zone);
+
+next_timer = MIN(next_timer, now + ns->zns.frl_ns);
+
+continue;
+}
+
+if (zone->wp_staging != le64_to_cpu(zone->zd.wp)) {
+next_timer = now + 500;
+continue;
+}
+
+nvme_zone_excursion(n, ns, zone, NULL);
+}
+
+QTAILQ_FOREACH(zone, &ns->zns.resources.lru_active, lru_entry) {
+int64_t activated_ns = now - zone->stats.activated_ns;
+if (activated_ns < ns->zns.frld_ns) {
+next_timer = MIN(next_timer, zone->stats.activated_ns +
+ ns->zns.frld_ns);
+
+break;
+}
+
+if (activated_ns < ns->zns.frld_ns + ns->zns.frl_ns) {
+NVME_ZA_SET_FZR(zone->zd.za, 0x1);
+nvme_zone_changed(n, ns, zone);
+
+next_timer = MIN(next_timer, now + ns->zns.frl_ns);
+
+continue;
+}
+
+nvme_zone_excursion(n, ns, zone, NULL);
+}
+
+QTAILQ_FOREACH(zone, &ns->zns.lru_finished, lru_entry) {
+int64_t finished_ns = now - zone->stats.finished_ns;
+if (finished_ns < ns->zns.rrld_ns) {
+next_timer = MIN(next_timer, zone->stats.finished_ns +
+ ns->zns.rrld_ns);
+
+break;
+}
+
+if (finished_ns < ns->zns.rrld_ns + ns->zns.rrl_ns) {
+NVME_ZA_SET_RZR(zone->zd.za, 0x1);
+nvme_zone_changed(n, ns, zone);
+
+next_timer = MIN(next_timer, now + ns->zns.rrl_ns);
+
+continue;
+}
+
+NVME_ZA_SET_RZR(zone->zd.za, 0x0);
+}
+
+if (next_timer != INT64_MAX) {
+timer_mod(ns->zns.timer, next_timer);
+}
+}
+
 static int nvme_ns_blk_resize(BlockBackend *blk, size_t len, Error **errp)
 {
Error *local_err = NULL;
@@ -262,6 +348,21 @@ static void nvme_ns_init_zoned(NvmeNamespace *ns)
 
 id_ns->ncap = ns->zns.info.num_zones * ns->params.zns.zcap;
 
+id_ns_zns->rrl = ns->params.zns.rrl;
+id_ns_zns->frl = ns->params.zns.frl;
+
+if (ns->params.zns.rrl || ns->params.zns.frl) {
+ns->zns.rrl_ns = ns->params.zns.rrl * NANOSECONDS_PER_SECOND;
+ns->zns.rrld_ns = ns->params.zns.rrld * NANOSECONDS_PER_SECOND;
+ns->zns.frl_ns = ns->params.zns.frl * NANOSECONDS_PER_SECOND;
+ns->zns.frld_ns = ns->params.zns.frld * NANOSECONDS_PER_SECOND;
+
+ns->zns.timer = timer_new_ns(QEMU_CLOCK_VIRTUAL,
+ nvme_ns_process_timer, ns);
+
+QTAILQ_INIT(>zns.lru_finished);
+}
+
 id_ns_zns->mar = cpu_to_le32(ns->params.zns.mar);
 id_ns_zns->mor = cpu_to_le32(ns->params.zns.mor);
 
@@ -515,6 +616,10 @@ static Property nvme_ns_props[] = {
 DEFINE_PROP_UINT16("zns.ozcs", 

[PATCH v6 01/15] MAINTAINERS: Cc qemu-block mailing list

2020-06-30 Thread Philippe Mathieu-Daudé
From: Philippe Mathieu-Daudé 

We forgot to include the qemu-block mailing list while adding
this section in commit 076a0fc32a7. Fix this.

Suggested-by: Paolo Bonzini 
Signed-off-by: Philippe Mathieu-Daudé 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index dec252f38b..9ad876c4a7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1628,6 +1628,7 @@ F: hw/ssi/xilinx_*
 
 SD (Secure Card)
 M: Philippe Mathieu-Daudé 
+L: qemu-block@nongnu.org
 S: Odd Fixes
 F: include/hw/sd/sd*
 F: hw/sd/core.c
-- 
2.21.3




[PATCH v6 09/15] hw/sd/sdcard: Update the SDState documentation

2020-06-30 Thread Philippe Mathieu-Daudé
Add more descriptive comments to keep a clear separation
between static properties and runtime-changeable state.

Suggested-by: Peter Maydell 
Reviewed-by: Peter Maydell 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index c6742c884d..ba4d0e0597 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -103,11 +103,14 @@ struct SDState {
 uint32_t card_status;
 uint8_t sd_status[64];
 
-/* Configurable properties */
+/* Static properties */
+
 uint8_t spec_version;
 BlockBackend *blk;
 bool spi;
 
+/* Runtime changeables */
+
 uint32_t mode;/* current card mode, one of SDCardModes */
 int32_t state;/* current card state, one of SDCardStates */
 uint32_t vhs;
-- 
2.21.3




[PATCH v6 06/15] hw/sd/sdcard: Restrict Class 6 commands to SCSD cards

2020-06-30 Thread Philippe Mathieu-Daudé
Only SCSD cards support Class 6 (Block Oriented Write Protection)
commands.

  "SD Specifications Part 1 Physical Layer Simplified Spec. v3.01"

  4.3.14 Command Functional Difference in Card Capacity Types

  * Write Protected Group

  SDHC and SDXC do not support write-protected groups. Issuing
  CMD28, CMD29 and CMD30 generates the ILLEGAL_COMMAND error.

Reviewed-by: Peter Maydell 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sd/sd.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 7e0d684aca..871c30a67f 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -922,6 +922,11 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, 
SDRequest req)
 sd->multi_blk_cnt = 0;
 }
 
+if (sd_cmd_class[req.cmd] == 6 && FIELD_EX32(sd->ocr, OCR, CARD_CAPACITY)) {
+/* Only Standard Capacity cards support class 6 commands */
+return sd_illegal;
+}
+
 switch (req.cmd) {
 /* Basic commands (Class 0 and Class 1) */
 case 0:/* CMD0:   GO_IDLE_STATE */
-- 
2.21.3




[PATCH v6 00/15] hw/sd/sdcard: Fix CVE-2020-13253 & cleanups

2020-06-30 Thread Philippe Mathieu-Daudé
Patches 5 & 6 fix CVE-2020-13253.
The rest are (accumulated) cleanups.

Since v5: Fix incorrect use of sd_addr_to_wpnum() in sd_reset()

Missing review:
[PATCH 01/15] MAINTAINERS: Cc qemu-block mailing list
[PATCH 03/15] hw/sd/sdcard: Move some definitions to use them
[PATCH 04/15] hw/sd/sdcard: Use the HWBLOCK_SIZE definition
[PATCH 05/15] hw/sd/sdcard: Do not switch to ReceivingData if
[PATCH 07/15] hw/sd/sdcard: Initialize constant values first
[PATCH 08/15] hw/sd/sdcard: Check address is in range
[PATCH 12/15] hw/sd/sdcard: Make iolen unsigned
[PATCH 13/15] hw/sd/sdcard: Correctly display the command name in

$ git backport-diff -u v5
Key:
[] : patches are identical
[] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/15:[] [--] 'MAINTAINERS: Cc qemu-block mailing list'
002/15:[] [--] 'hw/sd/sdcard: Update coding style to make checkpatch.pl 
happy'
003/15:[] [--] 'hw/sd/sdcard: Move some definitions to use them earlier'
004/15:[] [--] 'hw/sd/sdcard: Use the HWBLOCK_SIZE definition'
005/15:[] [--] 'hw/sd/sdcard: Do not switch to ReceivingData if address is 
invalid'
006/15:[] [--] 'hw/sd/sdcard: Restrict Class 6 commands to SCSD cards'
007/15:[] [--] 'hw/sd/sdcard: Initialize constant values first'
008/15:[0004] [FC] 'hw/sd/sdcard: Check address is in range'
009/15:[] [--] 'hw/sd/sdcard: Update the SDState documentation'
010/15:[] [--] 'hw/sd/sdcard: Simplify cmd_valid_while_locked()'
011/15:[] [--] 'hw/sd/sdcard: Constify sd_crc*()'s message argument'
012/15:[] [--] 'hw/sd/sdcard: Make iolen unsigned'
013/15:[] [--] 'hw/sd/sdcard: Correctly display the command name in trace 
events'
014/15:[] [--] 'hw/sd/sdcard: Display offset in read/write_data() trace 
events'
015/15:[] [--] 'hw/sd/sdcard: Simplify realize() a bit'

Philippe Mathieu-Daudé (15):
  MAINTAINERS: Cc qemu-block mailing list
  hw/sd/sdcard: Update coding style to make checkpatch.pl happy
  hw/sd/sdcard: Move some definitions to use them earlier
  hw/sd/sdcard: Use the HWBLOCK_SIZE definition
  hw/sd/sdcard: Do not switch to ReceivingData if address is invalid
  hw/sd/sdcard: Restrict Class 6 commands to SCSD cards
  hw/sd/sdcard: Initialize constant values first
  hw/sd/sdcard: Check address is in range
  hw/sd/sdcard: Update the SDState documentation
  hw/sd/sdcard: Simplify cmd_valid_while_locked()
  hw/sd/sdcard: Constify sd_crc*()'s message argument
  hw/sd/sdcard: Make iolen unsigned
  hw/sd/sdcard: Correctly display the command name in trace events
  hw/sd/sdcard: Display offset in read/write_data() trace events
  hw/sd/sdcard: Simplify realize() a bit

 hw/sd/sd.c | 173 +++--
 MAINTAINERS|   1 +
 hw/sd/trace-events |   4 +-
 3 files changed, 109 insertions(+), 69 deletions(-)

-- 
2.21.3




[PATCH 02/10] hw/block/nvme: add zns specific fields and types

2020-06-30 Thread Klaus Jensen
Add new fields, types and data structures for TP 4053 ("Zoned Namespaces").

Signed-off-by: Klaus Jensen 
---
 include/block/nvme.h | 186 +--
 1 file changed, 180 insertions(+), 6 deletions(-)

diff --git a/include/block/nvme.h b/include/block/nvme.h
index 637be0ddd2fc..ddf948132272 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -465,7 +465,8 @@ enum NvmeCmbmscMask {
 #define NVME_CMBSTS_CBAI(cmbsts) (cmsts & 0x1)
 
 enum NvmeCommandSet {
-NVME_IOCS_NVM = 0x0,
+NVME_IOCS_NVM   = 0x0,
+NVME_IOCS_ZONED = 0x2,
 };
 
 enum NvmeSglDescriptorType {
@@ -552,6 +553,11 @@ enum NvmeIoCommands {
 NVME_CMD_COMPARE= 0x05,
 NVME_CMD_WRITE_ZEROES   = 0x08,
 NVME_CMD_DSM= 0x09,
+
+/* Zoned Command Set */
+NVME_CMD_ZONE_MGMT_SEND = 0x79,
+NVME_CMD_ZONE_MGMT_RECV = 0x7a,
+NVME_CMD_ZONE_APPEND= 0x7d,
 };
 
 typedef struct NvmeDeleteQ {
@@ -664,6 +670,82 @@ enum {
 NVME_RW_PRINFO_PRCHK_REF= 1 << 10,
 };
 
+typedef struct NvmeZoneAppendCmd {
+uint8_t opcode;
+uint8_t flags;
+uint16_tcid;
+uint32_tnsid;
+uint32_trsvd8[2];
+uint64_tmptr;
+NvmeCmdDptr dptr;
+uint64_tzslba;
+uint16_tnlb;
+uint8_t rsvd50;
+uint8_t control;
+uint32_tilbrt;
+uint16_tlbat;
+uint16_tlbatm;
+} NvmeZoneAppendCmd;
+
+typedef struct NvmeZoneManagementSendCmd {
+uint8_t opcode;
+uint8_t flags;
+uint16_tcid;
+uint32_tnsid;
+uint32_trsvd8[4];
+NvmeCmdDptr dptr;
+uint64_tslba;
+uint32_trsvd48;
+uint8_t zsa;
+uint8_t zsflags;
+uint16_trsvd54;
+uint32_trsvd56[2];
+} NvmeZoneManagementSendCmd;
+
+#define NVME_CMD_ZONE_MGMT_SEND_SELECT_ALL(zsflags) ((zsflags) & 0x1)
+
+typedef enum NvmeZoneManagementSendAction {
+NVME_CMD_ZONE_MGMT_SEND_CLOSE   = 0x1,
+NVME_CMD_ZONE_MGMT_SEND_FINISH  = 0x2,
+NVME_CMD_ZONE_MGMT_SEND_OPEN= 0x3,
+NVME_CMD_ZONE_MGMT_SEND_RESET   = 0x4,
+NVME_CMD_ZONE_MGMT_SEND_OFFLINE = 0x5,
+NVME_CMD_ZONE_MGMT_SEND_SET_ZDE = 0x10,
+} NvmeZoneManagementSendAction;
+
+typedef struct NvmeZoneManagementRecvCmd {
+uint8_t opcode;
+uint8_t flags;
+uint16_tcid;
+uint32_tnsid;
+uint8_t rsvd8[16];
+NvmeCmdDptr dptr;
+uint64_tslba;
+uint32_tnumdw;
+uint8_t zra;
+uint8_t zrasp;
+uint8_t zrasf;
+uint8_t rsvd55[9];
+} NvmeZoneManagementRecvCmd;
+
+typedef enum NvmeZoneManagementRecvAction {
+NVME_CMD_ZONE_MGMT_RECV_REPORT_ZONES  = 0x0,
+NVME_CMD_ZONE_MGMT_RECV_EXTENDED_REPORT_ZONES = 0x1,
+} NvmeZoneManagementRecvAction;
+
+typedef enum NvmeZoneManagementRecvActionSpecificField {
+NVME_CMD_ZONE_MGMT_RECV_LIST_ALL  = 0x0,
+NVME_CMD_ZONE_MGMT_RECV_LIST_ZSE  = 0x1,
+NVME_CMD_ZONE_MGMT_RECV_LIST_ZSIO = 0x2,
+NVME_CMD_ZONE_MGMT_RECV_LIST_ZSEO = 0x3,
+NVME_CMD_ZONE_MGMT_RECV_LIST_ZSC  = 0x4,
+NVME_CMD_ZONE_MGMT_RECV_LIST_ZSF  = 0x5,
+NVME_CMD_ZONE_MGMT_RECV_LIST_ZSRO = 0x6,
+NVME_CMD_ZONE_MGMT_RECV_LIST_ZSO  = 0x7,
+} NvmeZoneManagementRecvActionSpecificField;
+
+#define NVME_CMD_ZONE_MGMT_RECEIVE_PARTIAL 0x1
+
 typedef struct NvmeDsmCmd {
 uint8_t opcode;
 uint8_t flags;
@@ -702,13 +784,15 @@ enum NvmeAsyncEventRequest {
 NVME_AER_INFO_SMART_RELIABILITY = 0,
 NVME_AER_INFO_SMART_TEMP_THRESH = 1,
 NVME_AER_INFO_SMART_SPARE_THRESH= 2,
+NVME_AER_INFO_NOTICE_ZONE_DESCR_CHANGED = 0xef,
 };
 
 typedef struct NvmeAerResult {
-uint8_t event_type;
-uint8_t event_info;
-uint8_t log_page;
-uint8_t resv;
+uint8_t  event_type;
+uint8_t  event_info;
+uint8_t  log_page;
+uint8_t  resv;
+uint32_t nsid;
 } NvmeAerResult;
 
 typedef struct NvmeCqe {
@@ -775,6 +859,14 @@ enum NvmeStatusCodes {
 NVME_CONFLICTING_ATTRS  = 0x0180,
 NVME_INVALID_PROT_INFO  = 0x0181,
 NVME_WRITE_TO_RO= 0x0182,
+NVME_ZONE_BOUNDARY_ERROR= 0x01b8,
+NVME_ZONE_IS_FULL   = 0x01b9,
+NVME_ZONE_IS_READ_ONLY  = 0x01ba,
+NVME_ZONE_IS_OFFLINE= 0x01bb,
+NVME_ZONE_INVALID_WRITE = 0x01bc,
+NVME_TOO_MANY_ACTIVE_ZONES  = 0x01bd,
+NVME_TOO_MANY_OPEN_ZONES= 0x01be,
+NVME_INVALID_ZONE_STATE_TRANSITION = 0x01bf,
 NVME_WRITE_FAULT= 0x0280,
 NVME_UNRECOVERED_READ   = 0x0281,
 NVME_E2E_GUARD_ERROR= 0x0282,
@@ -868,6 +960,46 @@ enum {
 NVME_EFFECTS_UUID_SEL   = 1 << 19,
 };
 
+typedef enum NvmeZoneType {
+NVME_ZT_SEQ = 0x2,
+} NvmeZoneType;
+
+typedef enum NvmeZoneState {
+NVME_ZS_ZSE  = 0x1,
+NVME_ZS_ZSIO = 0x2,
+NVME_ZS_ZSEO = 0x3,
+NVME_ZS_ZSC  = 0x4,
+NVME_ZS_ZSRO = 0xd,
+NVME_ZS_ZSF  = 0xe,
+NVME_ZS_ZSO  = 0xf,
+} NvmeZoneState;
+
+typedef struct 

[PATCH 08/10] hw/block/nvme: allow open to close transitions by controller

2020-06-30 Thread Klaus Jensen
Allow the controller to release open resources by transitioning
implicitly and explicitly opened zones to closed. This is done using a
naive "least recently opened" strategy. Some workloads may behave very
badly with this, but for the purpose of testing how software deals with
this it is acceptable for now.

Signed-off-by: Klaus Jensen 
---
 hw/block/nvme-ns.c|   3 +
 hw/block/nvme-ns.h|   5 ++
 hw/block/nvme.c   | 176 +++---
 hw/block/nvme.h   |   5 ++
 hw/block/trace-events |   5 ++
 5 files changed, 147 insertions(+), 47 deletions(-)

diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index 5a55a0191f55..3b9fa91c7af8 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -269,6 +269,9 @@ static void nvme_ns_init_zoned(NvmeNamespace *ns)
 ns->params.zns.mar + 1 : ns->zns.info.num_zones;
 ns->zns.resources.open = ns->params.zns.mor != 0x ?
 ns->params.zns.mor + 1 : ns->zns.info.num_zones;
+
+QTAILQ_INIT(&ns->zns.resources.lru_open);
+QTAILQ_INIT(&ns->zns.resources.lru_active);
 }
 
 static void nvme_ns_init(NvmeNamespace *ns)
diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index 5660934d6199..6d3a6dc07cd8 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -39,6 +39,8 @@ typedef struct NvmeZone {
 uint8_t *zde;
 
 uint64_t wp_staging;
+
+QTAILQ_ENTRY(NvmeZone) lru_entry;
 } NvmeZone;
 
 typedef struct NvmeNamespace {
@@ -69,6 +71,9 @@ typedef struct NvmeNamespace {
 struct {
 uint32_t open;
 uint32_t active;
+
+QTAILQ_HEAD(, NvmeZone) lru_open;
+QTAILQ_HEAD(, NvmeZone) lru_active;
 } resources;
 } zns;
 } NvmeNamespace;
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index d5d521954cfc..f7b4618bc805 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1187,6 +1187,41 @@ static void nvme_update_zone_descr(NvmeNamespace *ns, NvmeRequest *req,
 nvme_req_add_aio(req, aio);
 }
 
+static uint16_t nvme_zrm_transition(NvmeCtrl *n, NvmeNamespace *ns,
+NvmeZone *zone, NvmeZoneState to,
+NvmeRequest *req);
+
+static uint16_t nvme_zrm_release_open(NvmeCtrl *n, NvmeNamespace *ns,
+  NvmeRequest *req)
+{
+NvmeZone *candidate;
+NvmeZoneState zs;
+
+trace_pci_nvme_zone_zrm_release_open(nvme_cid(req), ns->params.nsid);
+
+QTAILQ_FOREACH(candidate, &ns->zns.resources.lru_open, lru_entry) {
+zs = nvme_zs(candidate);
+
+trace_pci_nvme_zone_zrm_candidate(nvme_cid(req), ns->params.nsid,
+  nvme_zslba(candidate),
+  nvme_wp(candidate), zs);
+
+/* skip explicitly opened zones */
+if (zs == NVME_ZS_ZSEO) {
+continue;
+}
+
+/* the zone cannot be closed if it is currently writing */
+if (candidate->wp_staging != nvme_wp(candidate)) {
+continue;
+}
+
+return nvme_zrm_transition(n, ns, candidate, NVME_ZS_ZSC, req);
+}
+
+return NVME_TOO_MANY_OPEN_ZONES;
+}
+
 /*
  * nvme_zrm_transition validates zone state transitions under the constraint of
  * the Number of Active and Open Resources (NAR and NOR) limits as reported by
@@ -1195,52 +1230,59 @@ static void nvme_update_zone_descr(NvmeNamespace *ns, NvmeRequest *req,
  * The function does NOT change the Zone Attribute field; this must be done by
  * the caller.
  */
-static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone,
-NvmeZoneState to)
+static uint16_t nvme_zrm_transition(NvmeCtrl *n, NvmeNamespace *ns,
+NvmeZone *zone, NvmeZoneState to,
+NvmeRequest *req)
 {
 NvmeZoneState from = nvme_zs(zone);
+uint16_t status;
 
-/* fast path */
-if (from == to) {
-return NVME_SUCCESS;
-}
+trace_pci_nvme_zone_zrm_transition(nvme_cid(req), ns->params.nsid,
+   nvme_zslba(zone), nvme_zs(zone), to);
 
 switch (from) {
 case NVME_ZS_ZSE:
 switch (to) {
+case NVME_ZS_ZSE:
+return NVME_SUCCESS;
+
 case NVME_ZS_ZSRO:
 case NVME_ZS_ZSO:
 case NVME_ZS_ZSF:
-nvme_zs_set(zone, to);
-return NVME_SUCCESS;
+goto out;
 
 case NVME_ZS_ZSC:
 if (!ns->zns.resources.active) {
+trace_pci_nvme_err_too_many_active_zones(nvme_cid(req));
 return NVME_TOO_MANY_ACTIVE_ZONES;
 }
 
 ns->zns.resources.active--;
 
-nvme_zs_set(zone, to);
+QTAILQ_INSERT_TAIL(&ns->zns.resources.lru_active, zone, lru_entry);
 
-return NVME_SUCCESS;
+goto out;
 
 case NVME_ZS_ZSIO:
 case NVME_ZS_ZSEO:
 if 

[PATCH 03/10] hw/block/nvme: add basic read/write for zoned namespaces

2020-06-30 Thread Klaus Jensen
This adds basic read and write for zoned namespaces.

A zoned namespace is created by setting the iocs parameter to 0x2 and
supplying a zero-sized blockdev for zone info persistent state
(zns.zoneinfo parameter) and the zns.zcap parameter to specify the
individual zone capacities. The namespace device will compute the
resulting zone size to be the next power of two and fit in as many zones
as possible on the underlying namespace blockdev.

If the zone info blockdev pointed to by zns.zoneinfo is non-zero in size
it will be assumed to contain existing zone state.

Signed-off-by: Klaus Jensen 
---
 hw/block/nvme-ns.c| 227 +-
 hw/block/nvme-ns.h| 103 
 hw/block/nvme.c   | 361 +++---
 hw/block/nvme.h   |   1 +
 hw/block/trace-events |  10 ++
 5 files changed, 677 insertions(+), 25 deletions(-)

diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index ae051784caaf..9a08b2ba0fb2 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -28,6 +28,26 @@
 #include "nvme.h"
 #include "nvme-ns.h"
 
+const char *nvme_zs_str(NvmeZone *zone)
+{
+return nvme_zs_to_str(nvme_zs(zone));
+}
+
+const char *nvme_zs_to_str(NvmeZoneState zs)
+{
+switch (zs) {
+case NVME_ZS_ZSE:  return "ZSE";
+case NVME_ZS_ZSIO: return "ZSIO";
+case NVME_ZS_ZSEO: return "ZSEO";
+case NVME_ZS_ZSC:  return "ZSC";
+case NVME_ZS_ZSRO: return "ZSRO";
+case NVME_ZS_ZSF:  return "ZSF";
+case NVME_ZS_ZSO:  return "ZSO";
+}
+
+return NULL;
+}
+
 static int nvme_ns_blk_resize(BlockBackend *blk, size_t len, Error **errp)
 {
Error *local_err = NULL;
@@ -57,6 +77,171 @@ static int nvme_ns_blk_resize(BlockBackend *blk, size_t len, Error **errp)
return 0;
 }
 
+static int nvme_ns_init_blk_zoneinfo(NvmeNamespace *ns, size_t len,
+ Error **errp)
+{
+NvmeZone *zone;
+NvmeZoneDescriptor *zd;
+uint64_t zslba;
+int ret;
+
+BlockBackend *blk = ns->zns.info.blk;
+
+Error *local_err = NULL;
+
+for (int i = 0; i < ns->zns.info.num_zones; i++) {
+zslba = i * nvme_ns_zsze(ns);
+zone = nvme_ns_get_zone(ns, zslba);
+zd = &zone->zd;
+
+zd->zt = NVME_ZT_SEQ;
+nvme_zs_set(zone, NVME_ZS_ZSE);
+zd->zcap = ns->params.zns.zcap;
+zone->wp_staging = zslba;
+zd->wp = zd->zslba = cpu_to_le64(zslba);
+}
+
+ret = nvme_ns_blk_resize(blk, len, &local_err);
+if (ret) {
+error_propagate_prepend(errp, local_err,
+"could not resize zoneinfo blockdev: ");
+return ret;
+}
+
+for (int i = 0; i < ns->zns.info.num_zones; i++) {
+zd = &ns->zns.info.zones[i].zd;
+
+ret = blk_pwrite(blk, i * sizeof(NvmeZoneDescriptor), zd,
+ sizeof(NvmeZoneDescriptor), 0);
+if (ret < 0) {
+error_setg_errno(errp, -ret, "blk_pwrite: ");
+return ret;
+}
+}
+
+return 0;
+}
+
+static int nvme_ns_setup_blk_zoneinfo(NvmeNamespace *ns, Error **errp)
+{
+NvmeZone *zone;
+NvmeZoneDescriptor *zd;
+BlockBackend *blk = ns->zns.info.blk;
+uint64_t perm, shared_perm;
+int64_t len, zoneinfo_len;
+
+Error *local_err = NULL;
+int ret;
+
+perm = BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE;
+shared_perm = BLK_PERM_ALL;
+
+ret = blk_set_perm(blk, perm, shared_perm, &local_err);
+if (ret) {
+error_propagate_prepend(errp, local_err, "blk_set_perm: ");
+return ret;
+}
+
+zoneinfo_len = ROUND_UP(ns->zns.info.num_zones *
+sizeof(NvmeZoneDescriptor), BDRV_SECTOR_SIZE);
+
+len = blk_getlength(blk);
+if (len < 0) {
+error_setg_errno(errp, -len, "blk_getlength: ");
+return len;
+}
+
+if (len) {
+if (len != zoneinfo_len) {
+error_setg(errp, "zoneinfo size mismatch "
+   "(expected %"PRIu64" bytes; was %"PRIu64" bytes)",
+   zoneinfo_len, len);
+error_append_hint(errp, "Did you change the zone size or "
+  "zone descriptor size?\n");
+return -1;
+}
+
+for (int i = 0; i < ns->zns.info.num_zones; i++) {
+zone = &ns->zns.info.zones[i];
+zd = &zone->zd;
+
+ret = blk_pread(blk, i * sizeof(NvmeZoneDescriptor), zd,
+sizeof(NvmeZoneDescriptor));
+if (ret < 0) {
+error_setg_errno(errp, -ret, "blk_pread: ");
+return ret;
+} else if (ret != sizeof(NvmeZoneDescriptor)) {
+error_setg(errp, "blk_pread: short read");
+return -1;
+}
+
+zone->wp_staging = nvme_wp(zone);
+
+switch (nvme_zs(zone)) {
+case NVME_ZS_ZSE:
+case NVME_ZS_ZSF:
+case NVME_ZS_ZSRO:
+case NVME_ZS_ZSO:
+  

[PATCH 05/10] hw/block/nvme: add the zone management send command

2020-06-30 Thread Klaus Jensen
Add the Zone Management Send command.

Signed-off-by: Klaus Jensen 
---
 hw/block/nvme.c   | 461 ++
 hw/block/nvme.h   |   4 +
 hw/block/trace-events |  12 ++
 3 files changed, 477 insertions(+)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 7e943dece352..a4527ad9840e 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -748,6 +748,11 @@ static void nvme_submit_aio(NvmeAIO *aio)
 }
 
 break;
+
+case NVME_AIO_OPC_DISCARD:
+aio->aiocb = blk_aio_pdiscard(blk, aio->offset, aio->len, nvme_aio_cb,
+  aio);
+break;
 }
 }
 
@@ -1142,6 +1147,46 @@ static void nvme_update_zone_info(NvmeNamespace *ns, NvmeRequest *req,
 nvme_req_add_aio(req, aio);
 }
 
+static void nvme_update_zone_descr(NvmeNamespace *ns, NvmeRequest *req,
+NvmeZone *zone)
+{
+uint64_t zslba = -1;
+QEMUIOVector *iov = g_new0(QEMUIOVector, 1);
+NvmeAIO *aio = g_new0(NvmeAIO, 1);
+
+*aio = (NvmeAIO) {
+.opc = NVME_AIO_OPC_WRITE,
+.blk = ns->zns.info.blk,
+.payload = iov,
+.offset = ns->zns.info.num_zones * sizeof(NvmeZoneDescriptor),
+.req = req,
+.flags = NVME_AIO_INTERNAL,
+};
+
+qemu_iovec_init(iov, 1);
+
+if (zone) {
+zslba = nvme_zslba(zone);
+trace_pci_nvme_update_zone_descr(nvme_cid(req), ns->params.nsid,
+ zslba);
+
+aio->offset += nvme_ns_zone_idx(ns, zslba) * nvme_ns_zdes_bytes(ns);
+qemu_iovec_add(iov, zone->zde, nvme_ns_zdes_bytes(ns));
+} else {
+trace_pci_nvme_update_zone_descr(nvme_cid(req), ns->params.nsid,
+ zslba);
+
+for (int i = 0; i < ns->zns.info.num_zones; i++) {
+qemu_iovec_add(iov, ns->zns.info.zones[i].zde,
+nvme_ns_zdes_bytes(ns));
+}
+}
+
+aio->len = iov->size;
+
+nvme_req_add_aio(req, aio);
+}
+
 static void nvme_aio_write_cb(NvmeAIO *aio, void *opaque, int ret)
 {
 NvmeRequest *req = aio->req;
@@ -1206,6 +1251,49 @@ static void nvme_rw_cb(NvmeRequest *req, void *opaque)
 nvme_enqueue_req_completion(cq, req);
 }
 
+static void nvme_zone_mgmt_send_reset_cb(NvmeRequest *req, void *opaque)
+{
+NvmeSQueue *sq = req->sq;
+NvmeCtrl *n = sq->ctrl;
+NvmeCQueue *cq = n->cq[sq->cqid];
+NvmeNamespace *ns = req->ns;
+
+trace_pci_nvme_zone_mgmt_send_reset_cb(nvme_cid(req), nvme_nsid(ns));
+
+g_free(opaque);
+
+nvme_enqueue_req_completion(cq, req);
+}
+
+static void nvme_aio_zone_reset_cb(NvmeAIO *aio, void *opaque, int ret)
+{
+NvmeRequest *req = aio->req;
+NvmeZone *zone = opaque;
+NvmeNamespace *ns = req->ns;
+
+uint64_t zslba = nvme_zslba(zone);
+uint64_t zcap = nvme_zcap(zone);
+
+if (ret) {
+return;
+}
+
+trace_pci_nvme_aio_zone_reset_cb(nvme_cid(req), ns->params.nsid, zslba);
+
+nvme_zs_set(zone, NVME_ZS_ZSE);
+NVME_ZA_CLEAR(zone->zd.za);
+
+zone->zd.wp = zone->zd.zslba;
+zone->wp_staging = zslba;
+
+nvme_update_zone_info(ns, req, zone);
+
+if (ns->blk_state) {
+bitmap_clear(ns->utilization, zslba, zcap);
+nvme_ns_update_util(ns, zslba, zcap, req);
+}
+}
+
 static void nvme_aio_cb(void *opaque, int ret)
 {
 NvmeAIO *aio = opaque;
@@ -1336,6 +1424,377 @@ static uint16_t nvme_flush(NvmeCtrl *n, NvmeRequest *req)
 return NVME_NO_COMPLETE;
 }
 
+static uint16_t nvme_zone_mgmt_send_close(NvmeCtrl *n, NvmeRequest *req,
+NvmeZone *zone)
+{
+NvmeNamespace *ns = req->ns;
+
+trace_pci_nvme_zone_mgmt_send_close(nvme_cid(req), nvme_nsid(ns),
+nvme_zslba(zone), nvme_zs_str(zone));
+
+
+switch (nvme_zs(zone)) {
+case NVME_ZS_ZSIO:
+case NVME_ZS_ZSEO:
+nvme_zs_set(zone, NVME_ZS_ZSC);
+
+nvme_update_zone_info(ns, req, zone);
+
+return NVME_NO_COMPLETE;
+
+case NVME_ZS_ZSC:
+return NVME_SUCCESS;
+
+default:
+break;
+}
+
+trace_pci_nvme_err_invalid_zone_condition(nvme_cid(req), nvme_zslba(zone),
+  nvme_zs(zone));
+return NVME_INVALID_ZONE_STATE_TRANSITION | NVME_DNR;
+}
+
+static uint16_t nvme_zone_mgmt_send_finish(NvmeCtrl *n, NvmeRequest *req,
+NvmeZone *zone)
+{
+NvmeNamespace *ns = req->ns;
+
+trace_pci_nvme_zone_mgmt_send_finish(nvme_cid(req), nvme_nsid(ns),
+ nvme_zslba(zone), nvme_zs_str(zone));
+
+
+switch (nvme_zs(zone)) {
+case NVME_ZS_ZSIO:
+case NVME_ZS_ZSEO:
+case NVME_ZS_ZSC:
+case NVME_ZS_ZSE:
+nvme_zs_set(zone, NVME_ZS_ZSF);
+
+nvme_update_zone_info(ns, req, zone);
+
+return NVME_NO_COMPLETE;
+
+case NVME_ZS_ZSF:
+return NVME_SUCCESS;
+
+default:
+break;
+}
+
+trace_pci_nvme_err_invalid_zone_condition(nvme_cid(req), 

[PATCH 09/10] hw/block/nvme: allow zone excursions

2020-06-30 Thread Klaus Jensen
Allow the controller to release active resources by transitioning zones
to the full state.
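The changed-zone bookkeeping that accompanies an excursion can be sketched in
a few lines of Python. The 511-entry cap is my reading of a 4 KiB Changed
Zone List log page (16-bit count plus up to 511 64-bit zone identifiers) and
should be treated as an assumption, not a value quoted from this series:

```python
# Sketch of nvme_zone_changed(): record changed zones until the list
# overflows, then latch the count at 0xffff until the host reads the log.
CHANGED_ZONE_LIST_MAX_IDS = 511  # assumed log page capacity

def zone_changed(changed_list, zslba):
    if changed_list["num_ids"] == 0xFFFF:
        return  # already overflowed; nothing more to record
    if changed_list["num_ids"] < CHANGED_ZONE_LIST_MAX_IDS:
        changed_list["ids"].append(zslba)
        changed_list["num_ids"] += 1
    else:
        changed_list["ids"].clear()
        changed_list["num_ids"] = 0xFFFF

lst = {"num_ids": 0, "ids": []}
zone_changed(lst, 0x0)
zone_changed(lst, 0x1000)
```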

Signed-off-by: Klaus Jensen 
---
 hw/block/nvme-ns.h|   2 +
 hw/block/nvme.c   | 171 ++
 hw/block/trace-events |   4 +
 include/block/nvme.h  |  10 +++
 4 files changed, 174 insertions(+), 13 deletions(-)

diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index 6d3a6dc07cd8..6acda5c2cf3f 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -75,6 +75,8 @@ typedef struct NvmeNamespace {
 QTAILQ_HEAD(, NvmeZone) lru_open;
 QTAILQ_HEAD(, NvmeZone) lru_active;
 } resources;
+
+NvmeChangedZoneList changed_list;
 } zns;
 } NvmeNamespace;
 
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index f7b4618bc805..6db6daa62bc5 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -859,10 +859,11 @@ static void nvme_process_aers(void *opaque)
 
 req = n->aer_reqs[n->outstanding_aers];
 
-result = (NvmeAerResult *) &req->cqe.dw0;
+result = (NvmeAerResult *) &req->cqe.qw0;
 result->event_type = event->result.event_type;
 result->event_info = event->result.event_info;
 result->log_page = event->result.log_page;
+result->nsid = event->result.nsid;
 g_free(event);
 
 req->status = NVME_SUCCESS;
@@ -874,8 +875,9 @@ static void nvme_process_aers(void *opaque)
 }
 }
 
-static void nvme_enqueue_event(NvmeCtrl *n, uint8_t event_type,
-   uint8_t event_info, uint8_t log_page)
+static void nvme_enqueue_event(NvmeCtrl *n, NvmeNamespace *ns,
+   uint8_t event_type, uint8_t event_info,
+   uint8_t log_page)
 {
 NvmeAsyncEvent *event;
 
@@ -893,6 +895,11 @@ static void nvme_enqueue_event(NvmeCtrl *n, uint8_t event_type,
 .log_page   = log_page,
 };
 
+if (event_info == NVME_AER_INFO_NOTICE_ZONE_DESCR_CHANGED) {
+assert(ns);
+event->result.nsid = ns->params.nsid;
+}
+
+QTAILQ_INSERT_TAIL(&n->aer_queue, event, entry);
 n->aer_queued++;
 
@@ -1187,15 +1194,50 @@ static void nvme_update_zone_descr(NvmeNamespace *ns, NvmeRequest *req,
 nvme_req_add_aio(req, aio);
 }
 
+static void nvme_zone_changed(NvmeCtrl *n, NvmeNamespace *ns, NvmeZone *zone)
+{
+uint16_t num_ids = le16_to_cpu(ns->zns.changed_list.num_ids);
+
+trace_pci_nvme_zone_changed(ns->params.nsid, nvme_zslba(zone));
+
+if (num_ids < NVME_CHANGED_ZONE_LIST_MAX_IDS) {
+ns->zns.changed_list.ids[num_ids] = zone->zd.zslba;
+ns->zns.changed_list.num_ids = cpu_to_le16(num_ids + 1);
+} else {
+memset(&ns->zns.changed_list, 0x0, sizeof(NvmeChangedZoneList));
+ns->zns.changed_list.num_ids = cpu_to_le16(0xffff);
+}
+
+nvme_enqueue_event(n, ns, NVME_AER_TYPE_NOTICE,
+   NVME_AER_INFO_NOTICE_ZONE_DESCR_CHANGED,
+   NVME_LOG_CHANGED_ZONE_LIST);
+}
+
 static uint16_t nvme_zrm_transition(NvmeCtrl *n, NvmeNamespace *ns,
 NvmeZone *zone, NvmeZoneState to,
 NvmeRequest *req);
 
+static void nvme_zone_excursion(NvmeCtrl *n, NvmeNamespace *ns, NvmeZone *zone,
+NvmeRequest *req)
+{
+trace_pci_nvme_zone_excursion(ns->params.nsid, nvme_zslba(zone),
+  nvme_zs_str(zone));
+
+assert(nvme_zrm_transition(n, ns, zone, NVME_ZS_ZSF, req) == NVME_SUCCESS);
+
+NVME_ZA_SET_ZFC(zone->zd.za, 0x1);
+
+nvme_zone_changed(n, ns, zone);
+
+nvme_update_zone_info(ns, req, zone);
+}
+
 static uint16_t nvme_zrm_release_open(NvmeCtrl *n, NvmeNamespace *ns,
   NvmeRequest *req)
 {
 NvmeZone *candidate;
 NvmeZoneState zs;
+uint16_t status;
 
 trace_pci_nvme_zone_zrm_release_open(nvme_cid(req), ns->params.nsid);
 
@@ -1216,12 +1258,73 @@ static uint16_t nvme_zrm_release_open(NvmeCtrl *n, NvmeNamespace *ns,
 continue;
 }
 
-return nvme_zrm_transition(n, ns, candidate, NVME_ZS_ZSC, req);
+status = nvme_zrm_transition(n, ns, candidate, NVME_ZS_ZSC, req);
+if (status) {
+return status;
+}
+
+nvme_update_zone_info(ns, req, candidate);
+return NVME_SUCCESS;
 }
 
 return NVME_TOO_MANY_OPEN_ZONES;
 }
 
+static uint16_t nvme_zrm_release_active(NvmeCtrl *n, NvmeNamespace *ns,
+NvmeRequest *req)
+{
+NvmeIdNsZns *id_ns_zns = nvme_ns_id_zoned(ns);
+NvmeZone *candidate = NULL;
+NvmeZoneDescriptor *zd;
+NvmeZoneState zs;
+
+trace_pci_nvme_zone_zrm_release_active(nvme_cid(req), ns->params.nsid);
+
+/* bail out if Zone Active Excursions are not permitted */
+if (!(le16_to_cpu(id_ns_zns->zoc) & NVME_ID_NS_ZNS_ZOC_ZAE)) {
+trace_pci_nvme_zone_zrm_excursion_not_allowed(nvme_cid(req),
+  ns->params.nsid);
+

[PATCH 07/10] hw/block/nvme: track and enforce zone resources

2020-06-30 Thread Klaus Jensen
Move all zone transition rules to a single state machine that also
manages zone resources.
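To make the accounting concrete, here is a small Python model of the resource
bookkeeping the new state machine enforces (a sketch of the rules only; the
class name and error strings are illustrative, not the patch's API):

```python
# Toy model of the zone resource accounting that nvme_zrm_transition
# enforces.  MAR/MOR are 0's-based; 0xffffffff means "no limit".
class ZoneResources:
    def __init__(self, mar, mor, num_zones):
        self.active = mar + 1 if mar != 0xFFFFFFFF else num_zones
        self.open = mor + 1 if mor != 0xFFFFFFFF else num_zones

    def open_zone(self):
        # ZSE -> ZSIO/ZSEO consumes one active and one open resource
        if not self.active:
            raise RuntimeError("Too Many Active Zones")
        if not self.open:
            raise RuntimeError("Too Many Open Zones")
        self.active -= 1
        self.open -= 1

    def close_zone(self):
        # ZSIO/ZSEO -> ZSC releases only the open resource
        self.open += 1

    def finish_zone(self, was_open):
        # -> ZSF releases the active resource (and the open one if open)
        if was_open:
            self.open += 1
        self.active += 1

res = ZoneResources(mar=0, mor=0, num_zones=8)  # one active, one open
res.open_zone()                                 # consumes both
```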

Signed-off-by: Klaus Jensen 
---
 hw/block/nvme-ns.c |  17 ++-
 hw/block/nvme-ns.h |   7 ++
 hw/block/nvme.c| 304 -
 3 files changed, 242 insertions(+), 86 deletions(-)

diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index 68996c2f0e72..5a55a0191f55 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -262,8 +262,13 @@ static void nvme_ns_init_zoned(NvmeNamespace *ns)
 
 id_ns->ncap = ns->zns.info.num_zones * ns->params.zns.zcap;
 
-id_ns_zns->mar = 0xffffffff;
-id_ns_zns->mor = 0xffffffff;
+id_ns_zns->mar = cpu_to_le32(ns->params.zns.mar);
+id_ns_zns->mor = cpu_to_le32(ns->params.zns.mor);
+
+ns->zns.resources.active = ns->params.zns.mar != 0xffffffff ?
+ns->params.zns.mar + 1 : ns->zns.info.num_zones;
+ns->zns.resources.open = ns->params.zns.mor != 0xffffffff ?
+ns->params.zns.mor + 1 : ns->zns.info.num_zones;
 }
 
 static void nvme_ns_init(NvmeNamespace *ns)
@@ -426,6 +431,12 @@ static int nvme_ns_check_constraints(NvmeCtrl *n, NvmeNamespace *ns, Error
 return -1;
 }
 
+if (ns->params.zns.mor > ns->params.zns.mar) {
+error_setg(errp, "maximum open resources (MOR) must be less "
+   "than or equal to maximum active resources (MAR)");
+return -1;
+}
+
 break;
 
 default:
@@ -499,6 +510,8 @@ static Property nvme_ns_props[] = {
 DEFINE_PROP_UINT8("zns.zdes", NvmeNamespace, params.zns.zdes, 0),
 DEFINE_PROP_UINT16("zns.zoc", NvmeNamespace, params.zns.zoc, 0),
 DEFINE_PROP_UINT16("zns.ozcs", NvmeNamespace, params.zns.ozcs, 0),
+DEFINE_PROP_UINT32("zns.mar", NvmeNamespace, params.zns.mar, 0xffffffff),
+DEFINE_PROP_UINT32("zns.mor", NvmeNamespace, params.zns.mor, 0xffffffff),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index 5940fb73e72b..5660934d6199 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -29,6 +29,8 @@ typedef struct NvmeNamespaceParams {
 uint8_t  zdes;
 uint16_t zoc;
 uint16_t ozcs;
+uint32_t mar;
+uint32_t mor;
 } zns;
 } NvmeNamespaceParams;
 
@@ -63,6 +65,11 @@ typedef struct NvmeNamespace {
 uint64_t  num_zones;
 NvmeZone *zones;
 } info;
+
+struct {
+uint32_t open;
+uint32_t active;
+} resources;
 } zns;
 } NvmeNamespace;
 
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 6b394d374c8e..d5d521954cfc 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1187,6 +1187,155 @@ static void nvme_update_zone_descr(NvmeNamespace *ns, NvmeRequest *req,
 nvme_req_add_aio(req, aio);
 }
 
+/*
+ * nvme_zrm_transition validates zone state transitions under the constraint of
+ * the Maximum Active and Open Resources (MAR and MOR) limits as reported by
+ * the Identify Namespace Data Structure.
+ *
+ * The function does NOT change the Zone Attribute field; this must be done by
+ * the caller.
+ */
+static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone,
+NvmeZoneState to)
+{
+NvmeZoneState from = nvme_zs(zone);
+
+/* fast path */
+if (from == to) {
+return NVME_SUCCESS;
+}
+
+switch (from) {
+case NVME_ZS_ZSE:
+switch (to) {
+case NVME_ZS_ZSRO:
+case NVME_ZS_ZSO:
+case NVME_ZS_ZSF:
+nvme_zs_set(zone, to);
+return NVME_SUCCESS;
+
+case NVME_ZS_ZSC:
+if (!ns->zns.resources.active) {
+return NVME_TOO_MANY_ACTIVE_ZONES;
+}
+
+ns->zns.resources.active--;
+
+nvme_zs_set(zone, to);
+
+return NVME_SUCCESS;
+
+case NVME_ZS_ZSIO:
+case NVME_ZS_ZSEO:
+if (!ns->zns.resources.active) {
+return NVME_TOO_MANY_ACTIVE_ZONES;
+}
+
+if (!ns->zns.resources.open) {
+return NVME_TOO_MANY_OPEN_ZONES;
+}
+
+ns->zns.resources.active--;
+ns->zns.resources.open--;
+
+nvme_zs_set(zone, to);
+
+return NVME_SUCCESS;
+
+default:
+return NVME_INVALID_ZONE_STATE_TRANSITION | NVME_DNR;
+}
+
+case NVME_ZS_ZSEO:
+switch (to) {
+case NVME_ZS_ZSIO:
+return NVME_INVALID_ZONE_STATE_TRANSITION | NVME_DNR;
+default:
+break;
+}
+
+/* fallthrough */
+
+case NVME_ZS_ZSIO:
+switch (to) {
+case NVME_ZS_ZSEO:
+nvme_zs_set(zone, to);
+return NVME_SUCCESS;
+
+case NVME_ZS_ZSE:
+case NVME_ZS_ZSF:
+case NVME_ZS_ZSRO:
+case NVME_ZS_ZSO:
+ns->zns.resources.active++;
+
+/* fallthrough */
+
+case NVME_ZS_ZSC:
+

[PATCH 00/10] hw/block/nvme: namespace types and zoned namespaces

2020-06-30 Thread Klaus Jensen
From: Klaus Jensen 

Hi all,

This series adds support for TP 4056 ("Namespace Types") and TP 4053
("Zoned Namespaces") and is an alternative implementation to the one
submitted by Dmitry[1].

While I don't want this to end up as a discussion about the merits of
each version, I want to point out a couple of differences from Dmitry's
version. At a glance, my version

  * builds on my patch series that adds fairly complete NVMe v1.4
mandatory support, as well as nice-to-have features such as SGLs,
multiple namespaces and mostly just overall clean up. This finally
brings the nvme device into a fairly compliant state on which we can
add new features. I've tried hard to get these compliance and
clean-up patches merged for a long time (in parallel with developing
the emulation of NST and ZNS) and I would be really sad to see them
by-passed since they have been through many iterations and already
carry Acked-by's and Reviewed-by's for the bulk of the
patches. I think the nvme device is already in a "frankenstate" wrt.
the implemented nvme version and the features it currently supports,
so I think this kind of cleanup is long overdue.

  * uses an attached blockdev and standard blk_aio for persistent zone
info. This is the same method used in our patches for Write
Uncorrectable and (separate and extended lba) metadata support, but
I've left those optional features out for now to ease the review
process.

  * relies on the universal dulbe support added in ("hw/block/nvme: add
support for dulbe") and sparse images for handling reads in gaps
(above write pointer and below ZSZE); that is - the size of the
underlying blockdev is in terms of ZSZE, not ZCAP

  * the controller uses timers to autonomously finish zones (wrt. FRL)

I've been on paternity leave for a month, so I haven't been around to
review Dmitry's patches, but I have started that process now. I would
also be happy to work with Dmitry & Friends on merging our versions to
get the best of both worlds if it makes sense.

This series and all preparatory patch sets (the ones I've been posting
yesterday and today) are available on my GitHub[2]. Unfortunately
Patchew got screwed up in the middle of me sending patches and it never
picked up v2 of "hw/block/nvme: support multiple namespaces" because it
was getting late and I made a mistake with the CC's. So my posted series
don't apply according to Patchew, but they actually do if you follow the
Based-on's (... or just grab [2]).


  [1]: Message-Id: <20200617213415.22417-1-dmitry.fomic...@wdc.com>
  [2]: https://github.com/birkelund/qemu/tree/for-master/nvme


Based-on: <20200630043122.1307043-1-...@irrelevant.dk>
("[PATCH 0/3] hw/block/nvme: bump to v1.4")
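For reference, a hypothetical invocation wiring up a zoned namespace with the
parameters introduced in this series (property names are taken from the
patches; the image sizes, bus topology, and the iocs value of 0x2 for the
Zoned Namespace Command Set are my assumptions):

```shell
# Back the namespace data and the persistent zone info with raw images
qemu-img create -f raw zns.img 8G
qemu-img create -f raw zoneinfo.img 8M

qemu-system-x86_64 -machine q35 \
    -drive id=nvm0,file=zns.img,format=raw,if=none \
    -drive id=zi0,file=zoneinfo.img,format=raw,if=none \
    -device nvme,serial=deadbeef \
    -device nvme-ns,drive=nvm0,iocs=0x2,zns.zcap=4096,zns.zoneinfo=zi0,zns.mar=0xffffffff,zns.mor=0xffffffff
```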

Klaus Jensen (10):
  hw/block/nvme: support I/O Command Sets
  hw/block/nvme: add zns specific fields and types
  hw/block/nvme: add basic read/write for zoned namespaces
  hw/block/nvme: add the zone management receive command
  hw/block/nvme: add the zone management send command
  hw/block/nvme: add the zone append command
  hw/block/nvme: track and enforce zone resources
  hw/block/nvme: allow open to close transitions by controller
  hw/block/nvme: allow zone excursions
  hw/block/nvme: support reset/finish recommended limits

 block/nvme.c  |6 +-
 hw/block/nvme-ns.c|  397 +-
 hw/block/nvme-ns.h|  148 +++-
 hw/block/nvme.c   | 1676 +++--
 hw/block/nvme.h   |   76 +-
 hw/block/trace-events |   43 +-
 include/block/nvme.h  |  252 ++-
 7 files changed, 2469 insertions(+), 129 deletions(-)

-- 
2.27.0




[PATCH 06/10] hw/block/nvme: add the zone append command

2020-06-30 Thread Klaus Jensen
Add the Zone Append command.
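The essential contract of Zone Append is that the host never chooses the LBA:
the controller assigns the zone's staged write pointer and returns it in the
completion (cqe.qw0 in this patch). A toy Python model of that contract,
ignoring I/O and zone state checks:

```python
# Toy model of Zone Append: the write lands at the zone's staged write
# pointer, which is advanced immediately so that concurrent appends get
# distinct LBAs; the assigned LBA is returned to the host.
def zone_append(zone, nlb):
    if zone["wp_staging"] + nlb > zone["zslba"] + zone["zcap"]:
        raise ValueError("Zone Boundary Error")
    alba = zone["wp_staging"]   # LBA assigned by the controller
    zone["wp_staging"] += nlb   # advance before the I/O completes
    return alba

zone = {"zslba": 0x1000, "zcap": 0x100, "wp_staging": 0x1000}
print(hex(zone_append(zone, 8)))  # -> 0x1000
print(hex(zone_append(zone, 8)))  # -> 0x1008
```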

Signed-off-by: Klaus Jensen 
---
 hw/block/nvme.c   | 106 ++
 hw/block/nvme.h   |   3 ++
 hw/block/trace-events |   2 +
 3 files changed, 111 insertions(+)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index a4527ad9840e..6b394d374c8e 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1294,6 +1294,12 @@ static void nvme_aio_zone_reset_cb(NvmeAIO *aio, void *opaque, int ret)
 }
 }
 
+static void nvme_zone_append_cb(NvmeRequest *req, void *opaque)
+{
+trace_pci_nvme_zone_append_cb(nvme_cid(req), le64_to_cpu(req->cqe.qw0));
+nvme_rw_cb(req, opaque);
+}
+
 static void nvme_aio_cb(void *opaque, int ret)
 {
 NvmeAIO *aio = opaque;
@@ -1424,6 +1430,104 @@ static uint16_t nvme_flush(NvmeCtrl *n, NvmeRequest *req)
 return NVME_NO_COMPLETE;
 }
 
+static uint16_t nvme_do_zone_append(NvmeCtrl *n, NvmeRequest *req,
+NvmeZone *zone)
+{
+NvmeAIO *aio;
+NvmeNamespace *ns = req->ns;
+
+uint64_t zslba = nvme_zslba(zone);
+uint64_t wp = zone->wp_staging;
+
+size_t len;
+uint16_t status;
+
+req->cqe.qw0 = cpu_to_le64(wp);
+req->slba = wp;
+
+len = req->nlb << nvme_ns_lbads(ns);
+
+trace_pci_nvme_zone_append(nvme_cid(req), zslba, wp, req->nlb);
+
+status = nvme_check_rw(n, req);
+if (status) {
+goto invalid;
+}
+
+status = nvme_check_zone_write(n, req->slba, req->nlb, req, zone);
+if (status) {
+goto invalid;
+}
+
+switch (nvme_zs(zone)) {
+case NVME_ZS_ZSE:
+case NVME_ZS_ZSC:
+nvme_zs_set(zone, NVME_ZS_ZSIO);
+default:
+break;
+}
+
+status = nvme_map(n, len, req);
+if (status) {
+goto invalid;
+}
+
+aio = g_new0(NvmeAIO, 1);
+*aio = (NvmeAIO) {
+.opc = NVME_AIO_OPC_WRITE,
+.blk = ns->blk,
+.offset = req->slba << nvme_ns_lbads(ns),
+.req = req,
+.cb = nvme_aio_zone_write_cb,
+.cb_arg = zone,
+};
+
+if (req->qsg.sg) {
+aio->len = req->qsg.size;
+aio->flags |= NVME_AIO_DMA;
+} else {
+aio->len = req->iov.size;
+}
+
+nvme_req_add_aio(req, aio);
+nvme_req_set_cb(req, nvme_zone_append_cb, zone);
+
+zone->wp_staging += req->nlb;
+
+return NVME_NO_COMPLETE;
+
+invalid:
+block_acct_invalid(blk_get_stats(ns->blk), BLOCK_ACCT_WRITE);
+return status;
+}
+
+static uint16_t nvme_zone_append(NvmeCtrl *n, NvmeRequest *req)
+{
+NvmeZone *zone;
+NvmeZoneAppendCmd *zappend = (NvmeZoneAppendCmd *) &req->cmd;
+NvmeNamespace *ns = req->ns;
+uint64_t zslba = le64_to_cpu(zappend->zslba);
+
+if (!nvme_ns_zoned(ns)) {
+return NVME_INVALID_OPCODE | NVME_DNR;
+}
+
+if (zslba & (nvme_ns_zsze(ns) - 1)) {
+trace_pci_nvme_err_invalid_zslba(nvme_cid(req), zslba);
+return NVME_INVALID_FIELD | NVME_DNR;
+}
+
+req->nlb = le16_to_cpu(zappend->nlb) + 1;
+
+zone = nvme_ns_get_zone(ns, zslba);
+if (!zone) {
+trace_pci_nvme_err_invalid_zone(nvme_cid(req), req->slba);
+return NVME_INVALID_FIELD | NVME_DNR;
+}
+
+return nvme_do_zone_append(n, req, zone);
+}
+
 static uint16_t nvme_zone_mgmt_send_close(NvmeCtrl *n, NvmeRequest *req,
 NvmeZone *zone)
 {
@@ -2142,6 +2246,8 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req)
 return nvme_zone_mgmt_send(n, req);
 case NVME_CMD_ZONE_MGMT_RECV:
 return nvme_zone_mgmt_recv(n, req);
+case NVME_CMD_ZONE_APPEND:
+return nvme_zone_append(n, req);
 default:
 trace_pci_nvme_err_invalid_opc(req->cmd.opcode);
 return NVME_INVALID_OPCODE | NVME_DNR;
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index 757277d339bf..6b4eb0098450 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -53,6 +53,8 @@ static const NvmeEffectsLog nvme_effects[] = {
 [NVME_CMD_ZONE_MGMT_RECV]   = NVME_EFFECTS_CSUPP,
 [NVME_CMD_ZONE_MGMT_SEND]   = NVME_EFFECTS_CSUPP |
 NVME_EFFECTS_LBCC,
+[NVME_CMD_ZONE_APPEND]  = NVME_EFFECTS_CSUPP |
+NVME_EFFECTS_LBCC,
 }
 },
 };
@@ -177,6 +179,7 @@ static inline bool nvme_req_is_write(NvmeRequest *req)
 switch (req->cmd.opcode) {
 case NVME_CMD_WRITE:
 case NVME_CMD_WRITE_ZEROES:
+case NVME_CMD_ZONE_APPEND:
 return true;
 default:
 return false;
diff --git a/hw/block/trace-events b/hw/block/trace-events
index 1da48d1c29d0..0dfc6e22008e 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -50,6 +50,8 @@ pci_nvme_admin_cmd(uint16_t cid, uint16_t sqid, uint8_t opcode) "cid %"PRIu16" s
pci_nvme_rw(uint16_t cid, const char *verb, uint32_t nsid, uint32_t nlb, uint64_t count, uint64_t lba) "cid %"PRIu16" %s nsid %"PRIu32" nlb %"PRIu32" count %"PRIu64" lba 0x%"PRIx64""
 pci_nvme_rw_cb(uint16_t cid, uint32_t nsid) "cid %"PRIu16" nsid %"PRIu32""
 pci_nvme_write_zeroes(uint16_t cid, 

[PATCH 04/10] hw/block/nvme: add the zone management receive command

2020-06-30 Thread Klaus Jensen
Add the Zone Management Receive command.
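For context, the receive path is what a guest uses to read back zone state,
e.g. via nvme-cli's zns plugin (hypothetical device path; flag spellings vary
by nvme-cli version):

```shell
# Zone Management Receive: dump zone descriptors for the namespace
nvme zns report-zones /dev/nvme0n1          # all zone descriptors
nvme zns report-zones /dev/nvme0n1 -d 4     # only the first few zones
```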

Signed-off-by: Klaus Jensen 
---
 hw/block/nvme-ns.c|  33 +--
 hw/block/nvme-ns.h|   9 ++-
 hw/block/nvme.c   | 130 ++
 hw/block/nvme.h   |   6 ++
 hw/block/trace-events |   1 +
 include/block/nvme.h  |   5 ++
 6 files changed, 179 insertions(+), 5 deletions(-)

diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index 9a08b2ba0fb2..68996c2f0e72 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -99,6 +99,10 @@ static int nvme_ns_init_blk_zoneinfo(NvmeNamespace *ns, size_t len,
 zd->zcap = ns->params.zns.zcap;
 zone->wp_staging = zslba;
 zd->wp = zd->zslba = cpu_to_le64(zslba);
+
+if (ns->params.zns.zdes) {
+zone->zde = g_malloc0(nvme_ns_zdes_bytes(ns));
+}
 }
 
 ret = nvme_ns_blk_resize(blk, len, _err);
@@ -128,7 +132,7 @@ static int nvme_ns_setup_blk_zoneinfo(NvmeNamespace *ns, Error **errp)
 NvmeZoneDescriptor *zd;
 BlockBackend *blk = ns->zns.info.blk;
 uint64_t perm, shared_perm;
-int64_t len, zoneinfo_len;
+int64_t len, zoneinfo_len, zone_len;
 
 Error *local_err = NULL;
 int ret;
@@ -142,8 +146,9 @@ static int nvme_ns_setup_blk_zoneinfo(NvmeNamespace *ns, Error **errp)
 return ret;
 }
 
-zoneinfo_len = ROUND_UP(ns->zns.info.num_zones *
-sizeof(NvmeZoneDescriptor), BDRV_SECTOR_SIZE);
+zone_len = sizeof(NvmeZoneDescriptor) + nvme_ns_zdes_bytes(ns);
+zoneinfo_len = ROUND_UP(ns->zns.info.num_zones * zone_len,
+BDRV_SECTOR_SIZE);
 
 len = blk_getlength(blk);
 if (len < 0) {
@@ -177,6 +182,23 @@ static int nvme_ns_setup_blk_zoneinfo(NvmeNamespace *ns, Error **errp)
 
 zone->wp_staging = nvme_wp(zone);
 
+if (ns->params.zns.zdes) {
+uint16_t zde_bytes = nvme_ns_zdes_bytes(ns);
+int64_t offset = ns->zns.info.num_zones *
+sizeof(NvmeZoneDescriptor);
+ns->zns.info.zones[i].zde = g_malloc(zde_bytes);
+
+ret = blk_pread(blk, offset + i * zde_bytes,
+ns->zns.info.zones[i].zde, zde_bytes);
+if (ret < 0) {
+error_setg_errno(errp, -ret, "blk_pread: ");
+return ret;
+} else if (ret != zde_bytes) {
+error_setg(errp, "blk_pread: short read");
+return -1;
+}
+}
+
 switch (nvme_zs(zone)) {
 case NVME_ZS_ZSE:
 case NVME_ZS_ZSF:
@@ -185,7 +207,8 @@ static int nvme_ns_setup_blk_zoneinfo(NvmeNamespace *ns, Error **errp)
 continue;
 
 case NVME_ZS_ZSC:
-if (nvme_wp(zone) == nvme_zslba(zone)) {
+if (nvme_wp(zone) == nvme_zslba(zone) &&
+!NVME_ZA_ZDEV(zd->za)) {
 nvme_zs_set(zone, NVME_ZS_ZSE);
 continue;
 }
@@ -231,6 +254,7 @@ static void nvme_ns_init_zoned(NvmeNamespace *ns)
 
 for (int i = 0; i <= id_ns->nlbaf; i++) {
 id_ns_zns->lbafe[i].zsze = cpu_to_le64(pow2ceil(ns->params.zns.zcap));
+id_ns_zns->lbafe[i].zdes = ns->params.zns.zdes;
 }
 
 ns->zns.info.num_zones = nvme_ns_nlbas(ns) / nvme_ns_zsze(ns);
@@ -472,6 +496,7 @@ static Property nvme_ns_props[] = {
 DEFINE_PROP_UINT8("iocs", NvmeNamespace, params.iocs, 0x0),
 DEFINE_PROP_DRIVE("zns.zoneinfo", NvmeNamespace, zns.info.blk),
 DEFINE_PROP_UINT64("zns.zcap", NvmeNamespace, params.zns.zcap, 0),
+DEFINE_PROP_UINT8("zns.zdes", NvmeNamespace, params.zns.zdes, 0),
 DEFINE_PROP_UINT16("zns.zoc", NvmeNamespace, params.zns.zoc, 0),
 DEFINE_PROP_UINT16("zns.ozcs", NvmeNamespace, params.zns.ozcs, 0),
 DEFINE_PROP_END_OF_LIST(),
diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index 7dcf0f02a07f..5940fb73e72b 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -26,13 +26,15 @@ typedef struct NvmeNamespaceParams {
 
 struct {
 uint64_t zcap;
+uint8_t  zdes;
 uint16_t zoc;
 uint16_t ozcs;
 } zns;
 } NvmeNamespaceParams;
 
 typedef struct NvmeZone {
-NvmeZoneDescriptor zd;
+NvmeZoneDescriptor  zd;
+uint8_t *zde;
 
 uint64_t wp_staging;
 } NvmeZone;
@@ -152,6 +154,11 @@ static inline void nvme_zs_set(NvmeZone *zone, NvmeZoneState zs)
 zone->zd.zs = zs << 4;
 }
 
+static inline size_t nvme_ns_zdes_bytes(NvmeNamespace *ns)
+{
+return ns->params.zns.zdes << 6;
+}
+
 static inline bool nvme_ns_zone_wp_valid(NvmeZone *zone)
 {
 switch (nvme_zs(zone)) {
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 4ec3b3029388..7e943dece352 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1528,6 +1528,134 @@ static uint16_t nvme_rwz(NvmeCtrl *n, NvmeRequest *req)
 return nvme_do_rw(n, req);
 }
 
+static 

[PATCH 01/10] hw/block/nvme: support I/O Command Sets

2020-06-30 Thread Klaus Jensen
From: Klaus Jensen 

Implement support for TP 4056 ("Namespace Types"). This adds the 'iocs'
(I/O Command Set) device parameter to the nvme-ns device.

Signed-off-by: Klaus Jensen 
---
 block/nvme.c  |   6 +-
 hw/block/nvme-ns.c|  24 +++--
 hw/block/nvme-ns.h|  11 +-
 hw/block/nvme.c   | 226 +-
 hw/block/nvme.h   |  52 ++
 hw/block/trace-events |   6 +-
 include/block/nvme.h  |  53 --
 7 files changed, 285 insertions(+), 93 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index 05485fdd1189..e7fe0c7accd1 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -333,7 +333,7 @@ static inline int nvme_translate_error(const NvmeCqe *c)
 {
 uint16_t status = (le16_to_cpu(c->status) >> 1) & 0xFF;
 if (status) {
-trace_nvme_error(le32_to_cpu(c->result),
+trace_nvme_error(le32_to_cpu(c->dw0),
  le16_to_cpu(c->sq_head),
  le16_to_cpu(c->sq_id),
  le16_to_cpu(c->cid),
@@ -495,7 +495,7 @@ static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
 {
 BDRVNVMeState *s = bs->opaque;
 NvmeIdCtrl *idctrl;
-NvmeIdNs *idns;
+NvmeIdNsNvm *idns;
 NvmeLBAF *lbaf;
 uint8_t *resp;
 uint16_t oncs;
@@ -512,7 +512,7 @@ static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
 goto out;
 }
 idctrl = (NvmeIdCtrl *)resp;
-idns = (NvmeIdNs *)resp;
+idns = (NvmeIdNsNvm *)resp;
r = qemu_vfio_dma_map(s->vfio, resp, sizeof(NvmeIdCtrl), true, &iova);
 if (r) {
 error_setg(errp, "Cannot map buffer for DMA");
diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index 7c825c38c69d..ae051784caaf 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -59,8 +59,16 @@ static int nvme_ns_blk_resize(BlockBackend *blk, size_t len, Error **errp)
 
 static void nvme_ns_init(NvmeNamespace *ns)
 {
-NvmeIdNs *id_ns = >id_ns;
+NvmeIdNsNvm *id_ns;
 
+int unmap = blk_get_flags(ns->blk) & BDRV_O_UNMAP;
+
+ns->id_ns[NVME_IOCS_NVM] = g_new0(NvmeIdNsNvm, 1);
+id_ns = nvme_ns_id_nvm(ns);
+
+ns->iocs = ns->params.iocs;
+
+id_ns->dlfeat = unmap ? 0x9 : 0x0;
 id_ns->lbaf[0].ds = ns->params.lbads;
 
 id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(ns));
@@ -130,8 +138,7 @@ static int nvme_ns_init_blk_state(NvmeNamespace *ns, Error **errp)
 return 0;
 }
 
-static int nvme_ns_init_blk(NvmeCtrl *n, NvmeNamespace *ns, NvmeIdCtrl *id,
-Error **errp)
+static int nvme_ns_init_blk(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
 {
 uint64_t perm, shared_perm;
 
@@ -174,7 +181,8 @@ static int nvme_ns_init_blk(NvmeCtrl *n, NvmeNamespace *ns, NvmeIdCtrl *id,
 return 0;
 }
 
-static int nvme_ns_check_constraints(NvmeNamespace *ns, Error **errp)
+static int nvme_ns_check_constraints(NvmeCtrl *n, NvmeNamespace *ns, Error
+ **errp)
 {
 if (!ns->blk) {
 error_setg(errp, "block backend not configured");
@@ -191,11 +199,11 @@ static int nvme_ns_check_constraints(NvmeNamespace *ns, Error **errp)
 
 int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
 {
-if (nvme_ns_check_constraints(ns, errp)) {
+if (nvme_ns_check_constraints(n, ns, errp)) {
 return -1;
 }
 
-if (nvme_ns_init_blk(n, ns, >id_ctrl, errp)) {
+if (nvme_ns_init_blk(n, ns, errp)) {
 return -1;
 }
 
@@ -210,7 +218,8 @@ int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
  * With a state file in place we can enable the Deallocated or
  * Unwritten Logical Block Error feature.
  */
-ns->id_ns.nsfeat |= 0x4;
+NvmeIdNsNvm *id_ns = nvme_ns_id_nvm(ns);
+id_ns->nsfeat |= 0x4;
 }
 
 if (nvme_register_namespace(n, ns, errp)) {
@@ -239,6 +248,7 @@ static Property nvme_ns_props[] = {
 DEFINE_PROP_UINT32("nsid", NvmeNamespace, params.nsid, 0),
 DEFINE_PROP_UINT8("lbads", NvmeNamespace, params.lbads, BDRV_SECTOR_BITS),
 DEFINE_PROP_DRIVE("state", NvmeNamespace, blk_state),
+DEFINE_PROP_UINT8("iocs", NvmeNamespace, params.iocs, 0x0),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index eb901acc912b..4124f20f1cef 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -21,6 +21,7 @@
 
 typedef struct NvmeNamespaceParams {
 uint32_t nsid;
+uint8_t  iocs;
 uint8_t  lbads;
 } NvmeNamespaceParams;
 
@@ -30,8 +31,9 @@ typedef struct NvmeNamespace {
 BlockBackend *blk_state;
 int32_t  bootindex;
 int64_t  size;
+uint8_t  iocs;
 
-NvmeIdNsid_ns;
+void *id_ns[256];
 NvmeNamespaceParams params;
 
 unsigned long *utilization;
@@ -50,9 +52,14 @@ static inline uint32_t nvme_nsid(NvmeNamespace *ns)
 return -1;
 }
 
+static inline NvmeIdNsNvm *nvme_ns_id_nvm(NvmeNamespace *ns)
+{
+return 

Re: [PATCH v5 08/15] hw/sd/sdcard: Check address is in range

2020-06-30 Thread Philippe Mathieu-Daudé
On 6/26/20 7:43 PM, Philippe Mathieu-Daudé wrote:
> On 6/26/20 6:40 PM, Philippe Mathieu-Daudé wrote:
>> As a defense, assert if the requested address is out of the card area.
>>
>> Suggested-by: Peter Maydell 
>> Signed-off-by: Philippe Mathieu-Daudé 
>> ---
>>  hw/sd/sd.c | 18 ++
>>  1 file changed, 10 insertions(+), 8 deletions(-)
>>
>> diff --git a/hw/sd/sd.c b/hw/sd/sd.c
>> index 22392e5084..2689a27b49 100644
>> --- a/hw/sd/sd.c
>> +++ b/hw/sd/sd.c
>> @@ -537,8 +537,10 @@ static void sd_response_r7_make(SDState *sd, uint8_t *response)
>>  stl_be_p(response, sd->vhs);
>>  }
>>  
>> -static inline uint64_t sd_addr_to_wpnum(uint64_t addr)
>> +static uint64_t sd_addr_to_wpnum(SDState *sd, uint64_t addr)
>>  {
>> +assert(addr < sd->size);
> 
> This should be:
> 
>assert(addr <= sd->size);

No, the current code is correct...

> 
>> +
>>  return addr >> (HWBLOCK_SHIFT + SECTOR_SHIFT + WPGROUP_SHIFT);
>>  }
>>  
>> @@ -575,7 +577,7 @@ static void sd_reset(DeviceState *dev)
>>  sd_set_cardstatus(sd);
>>  sd_set_sdstatus(sd);
>>  
>> -sect = sd_addr_to_wpnum(size) + 1;
>> +sect = sd_addr_to_wpnum(sd, size) + 1;

... but here this should be:

sect = sd_addr_to_wpnum(sd, size - 1) + 1;

>>  g_free(sd->wp_groups);
>>  sd->wp_switch = sd->blk ? blk_is_read_only(sd->blk) : false;
>>  sd->wpgrps_size = sect;
>> @@ -759,8 +761,8 @@ static void sd_erase(SDState *sd)
>>  erase_end *= HWBLOCK_SIZE;
>>  }
>>  
>> -erase_start = sd_addr_to_wpnum(erase_start);
>> -erase_end = sd_addr_to_wpnum(erase_end);
>> +erase_start = sd_addr_to_wpnum(sd, erase_start);
>> +erase_end = sd_addr_to_wpnum(sd, erase_end);
>>  sd->erase_start = 0;
>>  sd->erase_end = 0;
>>  sd->csd[14] |= 0x40;
>> @@ -777,7 +779,7 @@ static uint32_t sd_wpbits(SDState *sd, uint64_t addr)
>>  uint32_t i, wpnum;
>>  uint32_t ret = 0;
>>  
>> -wpnum = sd_addr_to_wpnum(addr);
>> +wpnum = sd_addr_to_wpnum(sd, addr);
>>  
>>  for (i = 0; i < 32; i++, wpnum++, addr += WPGROUP_SIZE) {
>>  if (addr < sd->size && test_bit(wpnum, sd->wp_groups)) {
>> @@ -819,7 +821,7 @@ static void sd_function_switch(SDState *sd, uint32_t arg)
>>  
>>  static inline bool sd_wp_addr(SDState *sd, uint64_t addr)
>>  {
>> -return test_bit(sd_addr_to_wpnum(addr), sd->wp_groups);
>> +return test_bit(sd_addr_to_wpnum(sd, addr), sd->wp_groups);
>>  }
>>  
>>  static void sd_lock_command(SDState *sd)
>> @@ -1331,7 +1333,7 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req)
>>  }
>>  
>>  sd->state = sd_programming_state;
>> -set_bit(sd_addr_to_wpnum(addr), sd->wp_groups);
>> +set_bit(sd_addr_to_wpnum(sd, addr), sd->wp_groups);
>>  /* Bzzztt  Operation complete.  */
>>  sd->state = sd_transfer_state;
>>  return sd_r1b;
>> @@ -1350,7 +1352,7 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req)
>>  }
>>  
>>  sd->state = sd_programming_state;
>> -clear_bit(sd_addr_to_wpnum(addr), sd->wp_groups);
>> +clear_bit(sd_addr_to_wpnum(sd, addr), sd->wp_groups);
>>  /* Bzzztt  Operation complete.  */
>>  sd->state = sd_transfer_state;
>>  return sd_r1b;
>>
> 


