Re: [PATCH for-6.0 v2 3/8] hw/block/nvme: fix the nsid 'invalid' value
Hi Klaus, On 4/5/21 7:54 PM, Klaus Jensen wrote: > From: Klaus Jensen > > The `nvme_nsid()` function returns '-1' (h) when the given > namespace is NULL. Since h is actually a valid namespace > identifier (the "broadcast" value), change this to be '0' since that > actually *is* the invalid value. > > Signed-off-by: Klaus Jensen > Reviewed-by: Gollu Appalanaidu > --- > hw/block/nvme-ns.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h > index 9ab7894fc83e..82340c4b2574 100644 > --- a/hw/block/nvme-ns.h > +++ b/hw/block/nvme-ns.h > @@ -96,7 +96,7 @@ static inline uint32_t nvme_nsid(NvmeNamespace *ns) > return ns->params.nsid; > } > > -return -1; > +return 0; For 6.1 can you add a NVME_NSID_INVALID definition along NVME_NSID_BROADCAST and use it here? > } > > static inline bool nvme_ns_shared(NvmeNamespace *ns) >
Re: [PATCH for-6.0 v2 7/8] hw/block/nvme: fix handling of private namespaces
On 21-04-05 19:54:51, Klaus Jensen wrote: > From: Klaus Jensen > > Prior to this patch, if a private nvme-ns device (that is, a namespace > that is not linked to a subsystem) is wired up to an nvme-subsys linked > nvme controller device, the device fails to verify that the namespace id > is unique within the subsystem. NVM Express v1.4b, Section 6.1.6 ("NSID > and Namespace Usage") states that because the device supports Namespace > Management, "NSIDs *shall* be unique within the NVM subsystem". > > Additionally, prior to this patch, private namespaces are not known to > the subsystem and the namespace is considered exclusive to the > controller with which it is initially wired up to. However, this is not > the definition of a private namespace; per Section 1.6.33 ("private > namespace"), a private namespace is just a namespace that does not > support multipath I/O or namespace sharing, which means "that it is only > able to be attached to one controller at a time". > > Fix this by always allocating namespaces in the subsystem (if one is > linked to the controller), regardsless of the shared/private status of > the namespace. Whether or not the namespace is shareable is controlled > by a new `shared` nvme-ns parameter. > > Finally, this fix allows the nvme-ns `subsys` parameter to be removed, > since the `shared` parameter now serves the purpose of attaching the > namespace to all controllers in the subsystem upon device realization. > It is invalid to have an nvme-ns namespace device with a linked > subsystem without the parent nvme controller device also being linked to > one and since the nvme-ns devices will unconditionally be "attached" (in > QEMU terms that is) to an nvme controller device through an NvmeBus, the > nvme-ns namespace device can always get a reference to the subsystem of > the controller it is explicitly (using 'bus=' parametr) or implicitly > attaching to. > > Fixes: e570768566b3 ("hw/block/nvme: support for shared namespace in > subsystem") > Cc: Minwoo Im > Signed-off-by: Klaus Jensen > Reviewed-by: Gollu Appalanaidu Reviewed-by: Minwoo Im Thanks for the fix.
Re: [PATCH 0/6] Add debug interface to kick/call on purpose
在 2021/4/6 上午4:00, Dongli Zhang 写道: On 4/1/21 8:47 PM, Jason Wang wrote: 在 2021/3/30 下午3:29, Dongli Zhang 写道: On 3/28/21 8:56 PM, Jason Wang wrote: 在 2021/3/27 上午5:16, Dongli Zhang 写道: Hi Jason, On 3/26/21 12:24 AM, Jason Wang wrote: 在 2021/3/26 下午1:44, Dongli Zhang 写道: The virtio device/driver (e.g., vhost-scsi or vhost-net) may hang due to the loss of doorbell kick, e.g., https://urldefense.com/v3/__https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg01711.html__;!!GqivPVa7Brio!KS3pAU2cKjz4wgI4QSlE-YsJPhPG71nkE5_tGhaOf7mi_xvNxbvKkfn03rk5BNDLSEU$ ... or due to the loss of IRQ, e.g., as fixed by linux kernel commit fe200ae48ef5 ("genirq: Mark polled irqs and defer the real handler"). This patch introduces a new debug interface 'DeviceEvent' to DeviceClass to help narrow down if the issue is due to loss of irq/kick. So far the new interface handles only two events: 'call' and 'kick'. Any device (e.g., virtio/vhost or VFIO) may implement the interface (e.g., via eventfd, MSI-X or legacy IRQ). The 'call' is to inject irq on purpose by admin for a specific device (e.g., vhost-scsi) from QEMU/host to VM, while the 'kick' is to kick the doorbell on purpose by admin at QEMU/host side for a specific device. This device can be used as a workaround if call/kick is lost due to virtualization software (e.g., kernel or QEMU) issue. We may also implement the interface for VFIO PCI, e.g., to write to VFIOPCIDevice->msi_vectors[i].interrupt will be able to inject IRQ to VM on purpose. This is considered future work once the virtio part is done. Below is from live crash analysis. Initially, the queue=2 has count=15 for 'kick' eventfd_ctx. Suppose there is data in vring avail while there is no used available. We suspect this is because vhost-scsi was not notified by VM. In order to narrow down and analyze the issue, we use live crash to dump the current counter of eventfd for queue=2. crash> eventfd_ctx 8f67f6bbe700 struct eventfd_ctx { kref = { refcount = { refs = { counter = 4 } } }, wqh = { lock = { { rlock = { raw_lock = { val = { counter = 0 } } } } }, head = { next = 0x8f841dc08e18, prev = 0x8f841dc08e18 } }, count = 15, ---> eventfd is 15 !!! flags = 526336 } Now we kick the doorbell for vhost-scsi queue=2 on purpose for diagnostic with this interface. { "execute": "x-debug-device-event", "arguments": { "dev": "/machine/peripheral/vscsi0", "event": "kick", "queue": 2 } } The counter is increased to 16. Suppose the hang issue is resolved, it indicates something bad is in software that the 'kick' is lost. What do you mean by "software" here? And it looks to me you're testing whether event_notifier_set() is called by virtio_queue_notify() here. If so, I'm not sure how much value could we gain from a dedicated debug interface like this consider there're a lot of exisinting general purpose debugging method like tracing or gdb. I'd say the path from virtio_queue_notify() to event_notifier_set() is only a very small fraction of the process of virtqueue kick which is unlikey to be buggy. Consider usually the ioeventfd will be offloaded to KVM, it's more a chance that something is wrong in setuping ioeventfd instead of here. Irq is even more complicated. Thank you very much! I am not testing whether event_notifier_set() is called by virtio_queue_notify(). The 'software' indicates the data processing and event notification mechanism involved with virtio/vhost PV driver frontend. E.g., while VM is waiting for an extra IRQ, vhost side did not trigger IRQ, suppose vring_need_event() erroneously returns false due to corrupted ring buffer status. So there could be several factors that may block the notification: 1) eventfd bug (ioeventfd vs irqfd) 2) wrong virtqueue state (either driver or device) 3) missing barriers (either driver or device) 4) Qemu bug (irqchip and routing) ... This is not only about whether notification is blocked. It can also be used to help narrow down and understand if there is any suspicious issue in blk-mq/scsi/netdev/napi code. The PV drivers are not only drivers following virtio spec. It is closely related to many of other kernel components. Suppose IO was recovered after we inject an IRQ to vhost-scsi on purpose, we will be able to analyze what may happen along the IO completion path starting from when /where the IRQ is injected ... perhaps the root cause is not with virtio but blk-mq/scsi (this is just an example). In addition, this idea should help for vfio-pci. Suppose the developer for a specific device driver suspects IO/networking hangs because of loss if IRQ, we will be able to verify if that assumption is correct by injecting an IRQ on purpose. Therefore, this i
Re: [PATCH 0/6] Add debug interface to kick/call on purpose
On 4/1/21 8:47 PM, Jason Wang wrote: > > 在 2021/3/30 下午3:29, Dongli Zhang 写道: >> >> On 3/28/21 8:56 PM, Jason Wang wrote: >>> 在 2021/3/27 上午5:16, Dongli Zhang 写道: Hi Jason, On 3/26/21 12:24 AM, Jason Wang wrote: > 在 2021/3/26 下午1:44, Dongli Zhang 写道: >> The virtio device/driver (e.g., vhost-scsi or vhost-net) may hang due to >> the loss of doorbell kick, e.g., >> >> https://urldefense.com/v3/__https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg01711.html__;!!GqivPVa7Brio!KS3pAU2cKjz4wgI4QSlE-YsJPhPG71nkE5_tGhaOf7mi_xvNxbvKkfn03rk5BNDLSEU$ >> >> >> >> ... or due to the loss of IRQ, e.g., as fixed by linux kernel commit >> fe200ae48ef5 ("genirq: Mark polled irqs and defer the real handler"). >> >> This patch introduces a new debug interface 'DeviceEvent' to DeviceClass >> to help narrow down if the issue is due to loss of irq/kick. So far the >> new >> interface handles only two events: 'call' and 'kick'. Any device (e.g., >> virtio/vhost or VFIO) may implement the interface (e.g., via eventfd, >> MSI-X >> or legacy IRQ). >> >> The 'call' is to inject irq on purpose by admin for a specific device >> (e.g., >> vhost-scsi) from QEMU/host to VM, while the 'kick' is to kick the >> doorbell >> on purpose by admin at QEMU/host side for a specific device. >> >> >> This device can be used as a workaround if call/kick is lost due to >> virtualization software (e.g., kernel or QEMU) issue. >> >> We may also implement the interface for VFIO PCI, e.g., to write to >> VFIOPCIDevice->msi_vectors[i].interrupt will be able to inject IRQ to VM >> on purpose. This is considered future work once the virtio part is done. >> >> >> Below is from live crash analysis. Initially, the queue=2 has count=15 >> for >> 'kick' eventfd_ctx. Suppose there is data in vring avail while there is >> no >> used available. We suspect this is because vhost-scsi was not notified by >> VM. In order to narrow down and analyze the issue, we use live crash to >> dump the current counter of eventfd for queue=2. >> >> crash> eventfd_ctx 8f67f6bbe700 >> struct eventfd_ctx { >> kref = { >> refcount = { >> refs = { >> counter = 4 >> } >> } >> }, >> wqh = { >> lock = { >> { >> rlock = { >> raw_lock = { >> val = { >> counter = 0 >> } >> } >> } >> } >> }, >> head = { >> next = 0x8f841dc08e18, >> prev = 0x8f841dc08e18 >> } >> }, >> count = 15, ---> eventfd is 15 !!! >> flags = 526336 >> } >> >> Now we kick the doorbell for vhost-scsi queue=2 on purpose for diagnostic >> with this interface. >> >> { "execute": "x-debug-device-event", >> "arguments": { "dev": "/machine/peripheral/vscsi0", >> "event": "kick", "queue": 2 } } >> >> The counter is increased to 16. Suppose the hang issue is resolved, it >> indicates something bad is in software that the 'kick' is lost. > What do you mean by "software" here? And it looks to me you're testing > whether > event_notifier_set() is called by virtio_queue_notify() here. If so, I'm > not > sure how much value could we gain from a dedicated debug interface like > this > consider there're a lot of exisinting general purpose debugging method > like > tracing or gdb. I'd say the path from virtio_queue_notify() to > event_notifier_set() is only a very small fraction of the process of > virtqueue > kick which is unlikey to be buggy. Consider usually the ioeventfd will be > offloaded to KVM, it's more a chance that something is wrong in setuping > ioeventfd instead of here. Irq is even more complicated. Thank you very much! I am not testing whether event_notifier_set() is called by virtio_queue_notify(). The 'software' indicates the data processing and event notification mechanism involved with virtio/vhost PV driver frontend. E.g., while VM is waiting for an extra IRQ, vhost side did not trigger IRQ, suppose vring_need_event() erroneously returns false due to corrupted ring buffer status. >>> >>> So there could be several factors that may block the notification: >>> >>> 1) eventfd bug (ioeventfd vs irqfd) >>> 2) wrong virtqueue state (either driver or device) >>> 3) missing barriers (either driver or device) >>> 4) Qemu bug (irqchip and routing) >>> ... >> This is not only about whether notification is blocked. >> >> It can also be used to help narrow down and understand if there is any >> suspicious iss
Re: [PATCH for-6.0 v2 0/8] hw/block/nvme: misc fixes
On Mon, Apr 05, 2021 at 07:54:44PM +0200, Klaus Jensen wrote: > From: Klaus Jensen > > Various fixes for 6.0. > > v2: > - "hw/block/nvme: fix handling of private namespaces" > update documentation (Gollu) > - add a patch for missing copyright headers Series looks fine. Reviewed-by: Keith Busch
[PATCH for-6.0 v2 5/8] hw/block/nvme: fix warning about legacy namespace configuration
From: Klaus Jensen Remove the unused BlockConf from the controller structure and fix the constraint checking to actually check the right BlockConf and issue the warning. Signed-off-by: Klaus Jensen Reviewed-by: Gollu Appalanaidu --- hw/block/nvme.h | 1 - hw/block/nvme.c | 2 +- 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/hw/block/nvme.h b/hw/block/nvme.h index c610ab30dc5c..1570f65989a7 100644 --- a/hw/block/nvme.h +++ b/hw/block/nvme.h @@ -166,7 +166,6 @@ typedef struct NvmeCtrl { NvmeBar bar; NvmeParams params; NvmeBus bus; -BlockConfconf; uint16_tcntlid; boolqs_created; diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 6d31d5b17a0b..de0e726dfdd8 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -5813,7 +5813,7 @@ static void nvme_check_constraints(NvmeCtrl *n, Error **errp) params->max_ioqpairs = params->num_queues - 1; } -if (n->conf.blk) { +if (n->namespace.blkconf.blk) { warn_report("drive property is deprecated; " "please use an nvme-ns device instead"); } -- 2.31.1
[PATCH for-6.0 v2 2/8] hw/block/nvme: fix missing string representation for ns attachment
From: Klaus Jensen Add the missing nvme_adm_opc_str entry for the Namespace Attachment command. Signed-off-by: Klaus Jensen Reviewed-by: Gollu Appalanaidu --- hw/block/nvme.h | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/block/nvme.h b/hw/block/nvme.h index 5b0031b11db2..9edc86d79e98 100644 --- a/hw/block/nvme.h +++ b/hw/block/nvme.h @@ -86,6 +86,7 @@ static inline const char *nvme_adm_opc_str(uint8_t opc) case NVME_ADM_CMD_SET_FEATURES: return "NVME_ADM_CMD_SET_FEATURES"; case NVME_ADM_CMD_GET_FEATURES: return "NVME_ADM_CMD_GET_FEATURES"; case NVME_ADM_CMD_ASYNC_EV_REQ: return "NVME_ADM_CMD_ASYNC_EV_REQ"; +case NVME_ADM_CMD_NS_ATTACHMENT:return "NVME_ADM_CMD_NS_ATTACHMENT"; case NVME_ADM_CMD_FORMAT_NVM: return "NVME_ADM_CMD_FORMAT_NVM"; default:return "NVME_ADM_CMD_UNKNOWN"; } -- 2.31.1
[PATCH for-6.0 v2 7/8] hw/block/nvme: fix handling of private namespaces
From: Klaus Jensen Prior to this patch, if a private nvme-ns device (that is, a namespace that is not linked to a subsystem) is wired up to an nvme-subsys linked nvme controller device, the device fails to verify that the namespace id is unique within the subsystem. NVM Express v1.4b, Section 6.1.6 ("NSID and Namespace Usage") states that because the device supports Namespace Management, "NSIDs *shall* be unique within the NVM subsystem". Additionally, prior to this patch, private namespaces are not known to the subsystem and the namespace is considered exclusive to the controller with which it is initially wired up to. However, this is not the definition of a private namespace; per Section 1.6.33 ("private namespace"), a private namespace is just a namespace that does not support multipath I/O or namespace sharing, which means "that it is only able to be attached to one controller at a time". Fix this by always allocating namespaces in the subsystem (if one is linked to the controller), regardsless of the shared/private status of the namespace. Whether or not the namespace is shareable is controlled by a new `shared` nvme-ns parameter. Finally, this fix allows the nvme-ns `subsys` parameter to be removed, since the `shared` parameter now serves the purpose of attaching the namespace to all controllers in the subsystem upon device realization. It is invalid to have an nvme-ns namespace device with a linked subsystem without the parent nvme controller device also being linked to one and since the nvme-ns devices will unconditionally be "attached" (in QEMU terms that is) to an nvme controller device through an NvmeBus, the nvme-ns namespace device can always get a reference to the subsystem of the controller it is explicitly (using 'bus=' parametr) or implicitly attaching to. Fixes: e570768566b3 ("hw/block/nvme: support for shared namespace in subsystem") Cc: Minwoo Im Signed-off-by: Klaus Jensen Reviewed-by: Gollu Appalanaidu --- hw/block/nvme-ns.h | 10 +--- hw/block/nvme-subsys.h | 7 +-- hw/block/nvme.h| 39 + include/block/nvme.h | 1 + hw/block/nvme-ns.c | 74 + hw/block/nvme-subsys.c | 28 -- hw/block/nvme.c| 123 ++--- hw/block/trace-events | 1 - 8 files changed, 113 insertions(+), 170 deletions(-) diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h index 82340c4b2574..fb0a41f912e7 100644 --- a/hw/block/nvme-ns.h +++ b/hw/block/nvme-ns.h @@ -29,6 +29,7 @@ typedef struct NvmeZone { typedef struct NvmeNamespaceParams { bool detached; +bool shared; uint32_t nsid; QemuUUID uuid; @@ -60,8 +61,8 @@ typedef struct NvmeNamespace { const uint32_t *iocs; uint8_t csi; uint16_t status; +int attached; -NvmeSubsystem *subsys; QTAILQ_ENTRY(NvmeNamespace) entry; NvmeIdNsZoned *id_ns_zoned; @@ -99,11 +100,6 @@ static inline uint32_t nvme_nsid(NvmeNamespace *ns) return 0; } -static inline bool nvme_ns_shared(NvmeNamespace *ns) -{ -return !!ns->subsys; -} - static inline NvmeLBAF *nvme_ns_lbaf(NvmeNamespace *ns) { NvmeIdNs *id_ns = &ns->id_ns; @@ -225,7 +221,7 @@ static inline void nvme_aor_dec_active(NvmeNamespace *ns) } void nvme_ns_init_format(NvmeNamespace *ns); -int nvme_ns_setup(NvmeNamespace *ns, Error **errp); +int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp); void nvme_ns_drain(NvmeNamespace *ns); void nvme_ns_shutdown(NvmeNamespace *ns); void nvme_ns_cleanup(NvmeNamespace *ns); diff --git a/hw/block/nvme-subsys.h b/hw/block/nvme-subsys.h index aafa04b84829..24132edd005c 100644 --- a/hw/block/nvme-subsys.h +++ b/hw/block/nvme-subsys.h @@ -14,7 +14,7 @@ OBJECT_CHECK(NvmeSubsystem, (obj), TYPE_NVME_SUBSYS) #define NVME_SUBSYS_MAX_CTRLS 32 -#define NVME_SUBSYS_MAX_NAMESPACES 256 +#define NVME_MAX_NAMESPACES 256 typedef struct NvmeCtrl NvmeCtrl; typedef struct NvmeNamespace NvmeNamespace; @@ -24,7 +24,7 @@ typedef struct NvmeSubsystem { NvmeCtrl*ctrls[NVME_SUBSYS_MAX_CTRLS]; /* Allocated namespaces for this subsystem */ -NvmeNamespace *namespaces[NVME_SUBSYS_MAX_NAMESPACES + 1]; +NvmeNamespace *namespaces[NVME_MAX_NAMESPACES + 1]; struct { char *nqn; @@ -32,7 +32,6 @@ typedef struct NvmeSubsystem { } NvmeSubsystem; int nvme_subsys_register_ctrl(NvmeCtrl *n, Error **errp); -int nvme_subsys_register_ns(NvmeNamespace *ns, Error **errp); static inline NvmeCtrl *nvme_subsys_ctrl(NvmeSubsystem *subsys, uint32_t cntlid) @@ -54,7 +53,7 @@ static inline NvmeNamespace *nvme_subsys_ns(NvmeSubsystem *subsys, return NULL; } -assert(nsid && nsid <= NVME_SUBSYS_MAX_NAMESPACES); +assert(nsid && nsid <= NVME_MAX_NAMESPACES); return subsys->namespaces[nsid]; } diff --git a/hw/block/nvme.h b/hw/block/nvme.h index 1570f65989a7..644143597a0f 100644 --- a/hw/block/nvme.h +
[PATCH for-6.0 v2 6/8] hw/block/nvme: update dmsrl limit on namespace detachment
From: Klaus Jensen The Non-MDTS DMSRL limit must be recomputed when namespaces are detached. Fixes: 645ce1a70cb6 ("hw/block/nvme: support namespace attachment command") Signed-off-by: Klaus Jensen Reviewed-by: Gollu Appalanaidu --- hw/block/nvme.c | 17 + 1 file changed, 17 insertions(+) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index de0e726dfdd8..3dc51f407671 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -4876,6 +4876,21 @@ static uint16_t nvme_aer(NvmeCtrl *n, NvmeRequest *req) return NVME_NO_COMPLETE; } +static void __nvme_update_dmrsl(NvmeCtrl *n) +{ +int nsid; + +for (nsid = 1; nsid <= NVME_MAX_NAMESPACES; nsid++) { +NvmeNamespace *ns = nvme_ns(n, nsid); +if (!ns) { +continue; +} + +n->dmrsl = MIN_NON_ZERO(n->dmrsl, +BDRV_REQUEST_MAX_BYTES / nvme_l2b(ns, 1)); +} +} + static void __nvme_select_ns_iocs(NvmeCtrl *n, NvmeNamespace *ns); static uint16_t nvme_ns_attachment(NvmeCtrl *n, NvmeRequest *req) { @@ -4925,6 +4940,8 @@ static uint16_t nvme_ns_attachment(NvmeCtrl *n, NvmeRequest *req) } nvme_ns_detach(ctrl, ns); + +__nvme_update_dmrsl(ctrl); } /* -- 2.31.1
[PATCH for-6.0 v2 8/8] hw/block/nvme: add missing copyright headers
From: Klaus Jensen Add missing license/copyright headers to the nvme-dif.{c,h} files. Signed-off-by: Klaus Jensen --- hw/block/nvme-dif.h | 10 ++ hw/block/nvme-dif.c | 10 ++ 2 files changed, 20 insertions(+) diff --git a/hw/block/nvme-dif.h b/hw/block/nvme-dif.h index 5a8e37c8525b..524faffbd7a0 100644 --- a/hw/block/nvme-dif.h +++ b/hw/block/nvme-dif.h @@ -1,3 +1,13 @@ +/* + * QEMU NVM Express End-to-End Data Protection support + * + * Copyright (c) 2021 Samsung Electronics Co., Ltd. + * + * Authors: + * Klaus Jensen + * Gollu Appalanaidu + */ + #ifndef HW_NVME_DIF_H #define HW_NVME_DIF_H diff --git a/hw/block/nvme-dif.c b/hw/block/nvme-dif.c index e6f04faafb5f..81b0a4cb1382 100644 --- a/hw/block/nvme-dif.c +++ b/hw/block/nvme-dif.c @@ -1,3 +1,13 @@ +/* + * QEMU NVM Express End-to-End Data Protection support + * + * Copyright (c) 2021 Samsung Electronics Co., Ltd. + * + * Authors: + * Klaus Jensen + * Gollu Appalanaidu + */ + #include "qemu/osdep.h" #include "hw/block/block.h" #include "sysemu/dma.h" -- 2.31.1
[PATCH for-6.0 v2 4/8] hw/block/nvme: fix controller namespaces array indexing
From: Klaus Jensen The controller namespaces array being 0-indexed requires 'nsid - 1' everywhere. Something that is easy to miss. Align the controller namespaces array with the subsystem namespaces array such that both are 1-indexed. Signed-off-by: Klaus Jensen Reviewed-by: Gollu Appalanaidu --- hw/block/nvme.h | 8 hw/block/nvme.c | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/hw/block/nvme.h b/hw/block/nvme.h index 9edc86d79e98..c610ab30dc5c 100644 --- a/hw/block/nvme.h +++ b/hw/block/nvme.h @@ -217,7 +217,7 @@ typedef struct NvmeCtrl { * Attached namespaces to this controller. If subsys is not given, all * namespaces in this list will always be attached. */ -NvmeNamespace *namespaces[NVME_MAX_NAMESPACES]; +NvmeNamespace *namespaces[NVME_MAX_NAMESPACES + 1]; NvmeSQueue **sq; NvmeCQueue **cq; NvmeSQueue admin_sq; @@ -232,7 +232,7 @@ static inline NvmeNamespace *nvme_ns(NvmeCtrl *n, uint32_t nsid) return NULL; } -return n->namespaces[nsid - 1]; +return n->namespaces[nsid]; } static inline bool nvme_ns_is_attached(NvmeCtrl *n, NvmeNamespace *ns) @@ -253,7 +253,7 @@ static inline void nvme_ns_attach(NvmeCtrl *n, NvmeNamespace *ns) uint32_t nsid = nvme_nsid(ns); assert(nsid && nsid <= NVME_MAX_NAMESPACES); -n->namespaces[nsid - 1] = ns; +n->namespaces[nsid] = ns; } static inline void nvme_ns_detach(NvmeCtrl *n, NvmeNamespace *ns) @@ -261,7 +261,7 @@ static inline void nvme_ns_detach(NvmeCtrl *n, NvmeNamespace *ns) uint32_t nsid = nvme_nsid(ns); assert(nsid && nsid <= NVME_MAX_NAMESPACES); -n->namespaces[nsid - 1] = NULL; +n->namespaces[nsid] = NULL; } static inline NvmeCQueue *nvme_cq(NvmeRequest *req) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index c54ec3c9523c..6d31d5b17a0b 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -5915,7 +5915,7 @@ int nvme_register_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp) return -1; } } else { -if (n->namespaces[nsid - 1]) { +if (n->namespaces[nsid]) { error_setg(errp, "namespace id '%d' is already in use", nsid); return -1; } -- 2.31.1
[PATCH for-6.0 v2 3/8] hw/block/nvme: fix the nsid 'invalid' value
From: Klaus Jensen The `nvme_nsid()` function returns '-1' (h) when the given namespace is NULL. Since h is actually a valid namespace identifier (the "broadcast" value), change this to be '0' since that actually *is* the invalid value. Signed-off-by: Klaus Jensen Reviewed-by: Gollu Appalanaidu --- hw/block/nvme-ns.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h index 9ab7894fc83e..82340c4b2574 100644 --- a/hw/block/nvme-ns.h +++ b/hw/block/nvme-ns.h @@ -96,7 +96,7 @@ static inline uint32_t nvme_nsid(NvmeNamespace *ns) return ns->params.nsid; } -return -1; +return 0; } static inline bool nvme_ns_shared(NvmeNamespace *ns) -- 2.31.1
[PATCH for-6.0 v2 1/8] hw/block/nvme: fix pi constraint check
From: Klaus Jensen Protection Information can only be enabled if there is at least 8 bytes of metadata. Signed-off-by: Klaus Jensen Reviewed-by: Gollu Appalanaidu --- hw/block/nvme-ns.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c index 7f8d139a8663..ca04ee1bacfb 100644 --- a/hw/block/nvme-ns.c +++ b/hw/block/nvme-ns.c @@ -394,7 +394,7 @@ static int nvme_ns_check_constraints(NvmeNamespace *ns, Error **errp) return -1; } -if (ns->params.pi && !ns->params.ms) { +if (ns->params.pi && ns->params.ms < 8) { error_setg(errp, "at least 8 bytes of metadata required to enable " "protection information"); return -1; -- 2.31.1
[PATCH for-6.0 v2 0/8] hw/block/nvme: misc fixes
From: Klaus Jensen Various fixes for 6.0. v2: - "hw/block/nvme: fix handling of private namespaces" update documentation (Gollu) - add a patch for missing copyright headers Klaus Jensen (8): hw/block/nvme: fix pi constraint check hw/block/nvme: fix missing string representation for ns attachment hw/block/nvme: fix the nsid 'invalid' value hw/block/nvme: fix controller namespaces array indexing hw/block/nvme: fix warning about legacy namespace configuration hw/block/nvme: update dmsrl limit on namespace detachment hw/block/nvme: fix handling of private namespaces hw/block/nvme: add missing copyright headers hw/block/nvme-dif.h| 10 +++ hw/block/nvme-ns.h | 12 ++-- hw/block/nvme-subsys.h | 7 +- hw/block/nvme.h| 45 ++--- include/block/nvme.h | 1 + hw/block/nvme-dif.c| 10 +++ hw/block/nvme-ns.c | 76 ++ hw/block/nvme-subsys.c | 28 hw/block/nvme.c| 142 + hw/block/trace-events | 1 - 10 files changed, 156 insertions(+), 176 deletions(-) -- 2.31.1
[PULL for-6.0 1/2] hw/block/nvme: remove description for zoned.append_size_limit
From: Niklas Cassel The description was originally removed in commit 578d914b263c ("hw/block/nvme: align zoned.zasl with mdts") together with the removal of the zoned.append_size_limit parameter itself. However, it was (most likely accidentally), re-added in commit f7dcd31885cb ("hw/block/nvme: add non-mdts command size limit for verify"). Remove the description again, since the parameter it describes, zoned.append_size_limit, no longer exists. Signed-off-by: Niklas Cassel Signed-off-by: Klaus Jensen --- hw/block/nvme.c | 8 1 file changed, 8 deletions(-) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index c54ec3c9523c..08c204d46c43 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -91,14 +91,6 @@ * the minimum memory page size (CAP.MPSMIN). The default value is 0 (i.e. * defaulting to the value of `mdts`). * - * - `zoned.append_size_limit` - * The maximum I/O size in bytes that is allowed in Zone Append command. - * The default is 128KiB. Since internally this this value is maintained as - * ZASL = log2( / ), some values assigned - * to this property may be rounded down and result in a lower maximum ZA - * data size being in effect. By setting this property to 0, users can make - * ZASL to be equal to MDTS. This property only affects zoned namespaces. - * * nvme namespace device parameters * * - `subsys` -- 2.31.1
[PULL for-6.0 0/2] emulated nvme fixes
From: Klaus Jensen Hi Peter, The following changes since commit 25d75c99b2e5941c67049ee776efdb226414f4c6: Merge remote-tracking branch 'remotes/xtensa/tags/20210403-xtensa' into staging (2021-04-04 21:48:45 +0100) are available in the Git repository at: git://git.infradead.org/qemu-nvme.git tags/nvme-fixes-for-6.0-pull-request for you to fetch changes up to 498114b37bc99fddcfc24b92bff7f1323fb32672: hw/block/nvme: expose 'bootindex' property (2021-04-05 19:33:04 +0200) emulated nvme fixes Joelle van Dyne (1): hw/block/nvme: expose 'bootindex' property Niklas Cassel (1): hw/block/nvme: remove description for zoned.append_size_limit hw/block/nvme.c | 16 +++- 1 file changed, 3 insertions(+), 13 deletions(-) -- 2.31.1
[PULL for-6.0 2/2] hw/block/nvme: expose 'bootindex' property
From: Joelle van Dyne The check for `n->namespace.blkconf.blk` always fails because this is in the initialization function. Signed-off-by: Joelle van Dyne Reviewed-by: Klaus Jensen Signed-off-by: Klaus Jensen --- hw/block/nvme.c | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 08c204d46c43..7244534a89e9 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -6328,11 +6328,9 @@ static void nvme_instance_init(Object *obj) { NvmeCtrl *n = NVME(obj); -if (n->namespace.blkconf.blk) { -device_add_bootindex_property(obj, &n->namespace.blkconf.bootindex, - "bootindex", "/namespace@1,0", - DEVICE(obj)); -} +device_add_bootindex_property(obj, &n->namespace.blkconf.bootindex, + "bootindex", "/namespace@1,0", + DEVICE(obj)); object_property_add(obj, "smart_critical_warning", "uint8", nvme_get_smart_warning, -- 2.31.1
Re: [PATCH for-6.0 7/7] hw/block/nvme: fix handling of private namespaces
On Apr 5 18:02, Gollu Appalanaidu wrote: > On Wed, Mar 24, 2021 at 09:09:07PM +0100, Klaus Jensen wrote: > > From: Klaus Jensen > > > > Prior to this patch, if a private nvme-ns device (that is, a namespace > > that is not linked to a subsystem) is wired up to an nvme-subsys linked > > nvme controller device, the device fails to verify that the namespace id > > is unique within the subsystem. NVM Express v1.4b, Section 6.1.6 ("NSID > > and Namespace Usage") states that because the device supports Namespace > > Management, "NSIDs *shall* be unique within the NVM subsystem". > > > > Additionally, prior to this patch, private namespaces are not known to > > the subsystem and the namespace is considered exclusive to the > > controller with which it is initially wired up to. However, this is not > > the definition of a private namespace; per Section 1.6.33 ("private > > namespace"), a private namespace is just a namespace that does not > > support multipath I/O or namespace sharing, which means "that it is only > > able to be attached to one controller at a time". > > > > Fix this by always allocating namespaces in the subsystem (if one is > > linked to the controller), regardsless of the shared/private status of > > the namespace. Whether or not the namespace is shareable is controlled > > by a new `shared` nvme-ns parameter. > > > > Finally, this fix allows the nvme-ns `subsys` parameter to be removed, > > since the `shared` parameter now serves the purpose of attaching the > > namespace to all controllers in the subsystem upon device realization. > > It is invalid to have an nvme-ns namespace device with a linked > > subsystem without the parent nvme controller device also being linked to > > one and since the nvme-ns devices will unconditionally be "attached" (in > > QEMU terms that is) to an nvme controller device through an NvmeBus, the > > nvme-ns namespace device can always get a reference to the subsystem of > > the controller it is explicitly (using 'bus=' parametr) or implicitly > > attaching to. > > > > Fixes: e570768566b3 ("hw/block/nvme: support for shared namespace in > > subsystem") > > Cc: Minwoo Im > > Signed-off-by: Klaus Jensen > > --- > > hw/block/nvme-ns.h | 10 ++-- > > hw/block/nvme-subsys.h | 7 ++- > > hw/block/nvme.h| 39 +- > > include/block/nvme.h | 1 + > > hw/block/nvme-ns.c | 74 +++ > > hw/block/nvme-subsys.c | 28 --- > > hw/block/nvme.c| 112 + > > hw/block/trace-events | 1 - > > 8 files changed, 106 insertions(+), 166 deletions(-) > > > > > > static Property nvme_ns_props[] = { > > DEFINE_BLOCK_PROPERTIES(NvmeNamespace, blkconf), > > -DEFINE_PROP_LINK("subsys", NvmeNamespace, subsys, TYPE_NVME_SUBSYS, > > - NvmeSubsystem *), > > DEFINE_PROP_BOOL("detached", NvmeNamespace, params.detached, false), > > +DEFINE_PROP_BOOL("shared", NvmeNamespace, params.shared, false)i, > > Nice change point, hope we need update the usage, removing "subsys" from > nvme-ns device params and adding "shared" param? > Good catch, thanks. Fixing that up for a v2. signature.asc Description: PGP signature
Re: [PATCH for-6.0 0/7] hw/block/nvme: misc fixes
On Wed, Mar 24, 2021 at 09:09:00PM +0100, Klaus Jensen wrote: From: Klaus Jensen Various fixes for 6.0. Klaus Jensen (7): hw/block/nvme: fix pi constraint check hw/block/nvme: fix missing string representation for ns attachment hw/block/nvme: fix the nsid 'invalid' value hw/block/nvme: fix controller namespaces array indexing hw/block/nvme: fix warning about legacy namespace configuration hw/block/nvme: update dmsrl limit on namespace detachment hw/block/nvme: fix handling of private namespaces hw/block/nvme-ns.h | 12 ++-- hw/block/nvme-subsys.h | 7 +-- hw/block/nvme.h| 45 ++ include/block/nvme.h | 1 + hw/block/nvme-ns.c | 76 hw/block/nvme-subsys.c | 28 - hw/block/nvme.c| 131 - hw/block/trace-events | 1 - 8 files changed, 129 insertions(+), 172 deletions(-) -- Reviewed-by: Gollu Appalanaidu 2.31.0
Re: [PATCH for-6.0 7/7] hw/block/nvme: fix handling of private namespaces
On Wed, Mar 24, 2021 at 09:09:07PM +0100, Klaus Jensen wrote: From: Klaus Jensen Prior to this patch, if a private nvme-ns device (that is, a namespace that is not linked to a subsystem) is wired up to an nvme-subsys linked nvme controller device, the device fails to verify that the namespace id is unique within the subsystem. NVM Express v1.4b, Section 6.1.6 ("NSID and Namespace Usage") states that because the device supports Namespace Management, "NSIDs *shall* be unique within the NVM subsystem". Additionally, prior to this patch, private namespaces are not known to the subsystem and the namespace is considered exclusive to the controller with which it is initially wired up to. However, this is not the definition of a private namespace; per Section 1.6.33 ("private namespace"), a private namespace is just a namespace that does not support multipath I/O or namespace sharing, which means "that it is only able to be attached to one controller at a time". Fix this by always allocating namespaces in the subsystem (if one is linked to the controller), regardsless of the shared/private status of the namespace. Whether or not the namespace is shareable is controlled by a new `shared` nvme-ns parameter. Finally, this fix allows the nvme-ns `subsys` parameter to be removed, since the `shared` parameter now serves the purpose of attaching the namespace to all controllers in the subsystem upon device realization. It is invalid to have an nvme-ns namespace device with a linked subsystem without the parent nvme controller device also being linked to one and since the nvme-ns devices will unconditionally be "attached" (in QEMU terms that is) to an nvme controller device through an NvmeBus, the nvme-ns namespace device can always get a reference to the subsystem of the controller it is explicitly (using 'bus=' parametr) or implicitly attaching to. Fixes: e570768566b3 ("hw/block/nvme: support for shared namespace in subsystem") Cc: Minwoo Im Signed-off-by: Klaus Jensen --- hw/block/nvme-ns.h | 10 ++-- hw/block/nvme-subsys.h | 7 ++- hw/block/nvme.h| 39 +- include/block/nvme.h | 1 + hw/block/nvme-ns.c | 74 +++ hw/block/nvme-subsys.c | 28 --- hw/block/nvme.c| 112 + hw/block/trace-events | 1 - 8 files changed, 106 insertions(+), 166 deletions(-) diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h index 82340c4b2574..fb0a41f912e7 100644 --- a/hw/block/nvme-ns.h +++ b/hw/block/nvme-ns.h @@ -29,6 +29,7 @@ typedef struct NvmeZone { typedef struct NvmeNamespaceParams { bool detached; +bool shared; uint32_t nsid; QemuUUID uuid; @@ -60,8 +61,8 @@ typedef struct NvmeNamespace { const uint32_t *iocs; uint8_t csi; uint16_t status; +int attached; -NvmeSubsystem *subsys; QTAILQ_ENTRY(NvmeNamespace) entry; NvmeIdNsZoned *id_ns_zoned; @@ -99,11 +100,6 @@ static inline uint32_t nvme_nsid(NvmeNamespace *ns) return 0; } -static inline bool nvme_ns_shared(NvmeNamespace *ns) -{ -return !!ns->subsys; -} - static inline NvmeLBAF *nvme_ns_lbaf(NvmeNamespace *ns) { NvmeIdNs *id_ns = &ns->id_ns; @@ -225,7 +221,7 @@ static inline void nvme_aor_dec_active(NvmeNamespace *ns) } void nvme_ns_init_format(NvmeNamespace *ns); -int nvme_ns_setup(NvmeNamespace *ns, Error **errp); +int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp); void nvme_ns_drain(NvmeNamespace *ns); void nvme_ns_shutdown(NvmeNamespace *ns); void nvme_ns_cleanup(NvmeNamespace *ns); diff --git a/hw/block/nvme-subsys.h b/hw/block/nvme-subsys.h index aafa04b84829..24132edd005c 100644 --- a/hw/block/nvme-subsys.h +++ b/hw/block/nvme-subsys.h @@ -14,7 +14,7 @@ OBJECT_CHECK(NvmeSubsystem, (obj), TYPE_NVME_SUBSYS) #define NVME_SUBSYS_MAX_CTRLS 32 -#define NVME_SUBSYS_MAX_NAMESPACES 256 +#define NVME_MAX_NAMESPACES 256 typedef struct NvmeCtrl NvmeCtrl; typedef struct NvmeNamespace NvmeNamespace; @@ -24,7 +24,7 @@ typedef struct NvmeSubsystem { NvmeCtrl*ctrls[NVME_SUBSYS_MAX_CTRLS]; /* Allocated namespaces for this subsystem */ -NvmeNamespace *namespaces[NVME_SUBSYS_MAX_NAMESPACES + 1]; +NvmeNamespace *namespaces[NVME_MAX_NAMESPACES + 1]; struct { char *nqn; @@ -32,7 +32,6 @@ typedef struct NvmeSubsystem { } NvmeSubsystem; int nvme_subsys_register_ctrl(NvmeCtrl *n, Error **errp); -int nvme_subsys_register_ns(NvmeNamespace *ns, Error **errp); static inline NvmeCtrl *nvme_subsys_ctrl(NvmeSubsystem *subsys, uint32_t cntlid) @@ -54,7 +53,7 @@ static inline NvmeNamespace *nvme_subsys_ns(NvmeSubsystem *subsys, return NULL; } -assert(nsid && nsid <= NVME_SUBSYS_MAX_NAMESPACES); +assert(nsid && nsid <= NVME_MAX_NAMESPACES); return subsys->namespaces[nsid]; } diff --git a/hw/block/nvme.h b/hw/block/nvme.h index 1570f65989a7..644143597a0f 100644 --- a/hw/block/nvme.h +++ b/hw/block/nvme.h @
Re: [PATCH for-6.0 0/7] hw/block/nvme: misc fixes
On Mar 24 21:09, Klaus Jensen wrote: > From: Klaus Jensen > > Various fixes for 6.0. > > Klaus Jensen (7): > hw/block/nvme: fix pi constraint check > hw/block/nvme: fix missing string representation for ns attachment > hw/block/nvme: fix the nsid 'invalid' value > hw/block/nvme: fix controller namespaces array indexing > hw/block/nvme: fix warning about legacy namespace configuration > hw/block/nvme: update dmsrl limit on namespace detachment > hw/block/nvme: fix handling of private namespaces > > hw/block/nvme-ns.h | 12 ++-- > hw/block/nvme-subsys.h | 7 +-- > hw/block/nvme.h| 45 ++ > include/block/nvme.h | 1 + > hw/block/nvme-ns.c | 76 > hw/block/nvme-subsys.c | 28 - > hw/block/nvme.c| 131 - > hw/block/trace-events | 1 - > 8 files changed, 129 insertions(+), 172 deletions(-) > Ping on patches [3,4,7/7] :) signature.asc Description: PGP signature