Re: [PATCH 13/14] nvme: use the changed namespaces list log to clear ns data changed AENs
On Tue, Jun 05, 2018 at 06:47:33AM +0200, Christoph Hellwig wrote: > But let's assume we don't want to use the list due to this concern: > we'd still have to read the log page, as per the NVMe spec the only > think clearing a pending AEN is reading the associated log page. > We'd then need to still do the full scan using identify. Is this what > we want? If you think this is important for reliability we could > just ignore the log page. That sounds good. Let's have the driver read the log page to re-arm the event as you've done, but don't rely on the contents for namespace enumeration. The rest of your series looks good to me.
Re: [PATCH 13/14] nvme: use the changed namespaces list log to clear ns data changed AENs
On Mon, Jun 04, 2018 at 01:59:09PM -0600, Keith Busch wrote: > > Per section 5.2 we need to issue the corresponding log page to clear an > > AEN, so for a namespace data changed AEN we need to read the changed > > namespace list log. And once we read that log anyway we might as well > > use it to optimize the rescan. > > > > Signed-off-by: Christoph Hellwig > > I'm a little concerned about this. Userspace might be reading the same > log page. Since the contents of the page may change each time its read, > it's possible the driver will see only a subset of changed namespaces > at the point it gets to read the log page, missing a chance to > revalidate others. I agree with the concern, but I don't think it really is uniqueue here. We allow userspace to send all kind sof harmful commands, even queue for queue creation and deletion. But let's assume we don't want to use the list due to this concern: we'd still have to read the log page, as per the NVMe spec the only think clearing a pending AEN is reading the associated log page. We'd then need to still do the full scan using identify. Is this what we want? If you think this is important for reliability we could just ignore the log page.
Re: [PATCH 13/14] nvme: use the changed namespaces list log to clear ns data changed AENs
On Sat, May 26, 2018 at 12:27:34PM +0200, Christoph Hellwig wrote: > Per section 5.2 we need to issue the corresponding log page to clear an > AEN, so for a namespace data changed AEN we need to read the changed > namespace list log. And once we read that log anyway we might as well > use it to optimize the rescan. > > Signed-off-by: Christoph Hellwig I'm a little concerned about this. Userspace might be reading the same log page. Since the contents of the page may change each time its read, it's possible the driver will see only a subset of changed namespaces at the point it gets to read the log page, missing a chance to revalidate others.
Re: [PATCH 13/14] nvme: use the changed namespaces list log to clear ns data changed AENs
Looks good, Reviewed-by: Johannes Thumshirn -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH 13/14] nvme: use the changed namespaces list log to clear ns data changed AENs
On Sat, May 26, 2018 at 12:05:02PM +, Popuri, Sriram wrote: > Reading the spec it looks like ns log is alternate approach: > > "Namespace Attribute Changed: The Identify Namespace data structure for one > or more namespaces, as well as the Namespace List returned when the Identify > command is issued with the CNS field set to 02h, have changed. Host software > may use this event as an indication that it should read the Identify > Namespace data structures for each namespace to determine what has changed. > > Alternatively, host software may request the Changed Namespace List (Log > Identifier 04h) to determine which namespaces in this controller have changed > Identify Namespace information since the last time the log page was read." > > My question is: Does the target needs to implement log page 04h as this is an > optional log page and the text above suggests its used in alternate way? Section 5.2 clearly states that each AER comes with a log page to clear it, and the above does not contain any language to explicitly override it. So my reading of the spec is that to clear the event you need to read the log page, but to find out what has changed you might issue Identify commands instead. > If target is not required to implement, then I guess your change should still > work because if log page 04h fails, it fails back to rescan. Correct? Independ of what the spec actually requires, the code tries to robust as possible and falls back to doing the manual scan. > I think you can optimize this by checking the "Log Page Identifier" field in > your result and accordingly setting EVENT_NS_CHANGED. I assume target would > clear this if log page 04h is not supported. We can try that, but I'm not sure what it is going to buy us.
RE: [PATCH 13/14] nvme: use the changed namespaces list log to clear ns data changed AENs
Reading the spec it looks like ns log is alternate approach: "Namespace Attribute Changed: The Identify Namespace data structure for one or more namespaces, as well as the Namespace List returned when the Identify command is issued with the CNS field set to 02h, have changed. Host software may use this event as an indication that it should read the Identify Namespace data structures for each namespace to determine what has changed. Alternatively, host software may request the Changed Namespace List (Log Identifier 04h) to determine which namespaces in this controller have changed Identify Namespace information since the last time the log page was read." My question is: Does the target needs to implement log page 04h as this is an optional log page and the text above suggests its used in alternate way? If target is not required to implement, then I guess your change should still work because if log page 04h fails, it fails back to rescan. Correct? I think you can optimize this by checking the "Log Page Identifier" field in your result and accordingly setting EVENT_NS_CHANGED. I assume target would clear this if log page 04h is not supported. "23:16 Log Page Identifier: Indicates the log page associated with the asynchronous event. This log page needs to be read by the host to clear the event" Regards, ~Sriram -Original Message- From: Linux-nvme On Behalf Of Christoph Hellwig Sent: Saturday, May 26, 2018 3:58 PM To: linux-n...@lists.infradead.org Cc: Jens Axboe ; Keith Busch ; linux-block@vger.kernel.org; Sagi Grimberg ; Hannes Reinecke Subject: [PATCH 13/14] nvme: use the changed namespaces list log to clear ns data changed AENs Per section 5.2 we need to issue the corresponding log page to clear an AEN, so for a namespace data changed AEN we need to read the changed namespace list log. And once we read that log anyway we might as well use it to optimize the rescan. Signed-off-by: Christoph Hellwig --- drivers/nvme/host/core.c | 51 drivers/nvme/host/nvme.h | 2 ++ 2 files changed, 49 insertions(+), 4 deletions(-) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 06cd04dcffbc..1ae77428a1a5 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -3194,6 +3194,42 @@ static void nvme_scan_ns_sequential(struct nvme_ctrl *ctrl, unsigned nn) nvme_remove_invalid_namespaces(ctrl, nn); } +static bool nvme_scan_changed_ns_log(struct nvme_ctrl *ctrl) { + size_t log_size = NVME_MAX_CHANGED_NAMESPACES * sizeof(__le32); + __le32 *log; + int error, i; + bool ret = false; + + log = kzalloc(log_size, GFP_KERNEL); + if (!log) + return false; + + error = nvme_get_log(ctrl, NVME_LOG_CHANGED_NS, log, log_size); + if (error) { + dev_warn(ctrl->device, + "reading changed ns log failed: %d\n", error); + goto out_free_log; + } + + if (log[0] == cpu_to_le32(0x)) + goto out_free_log; + + for (i = 0; i < NVME_MAX_CHANGED_NAMESPACES; i++) { + u32 nsid = le32_to_cpu(log[i]); + + if (nsid == 0) + break; + dev_info(ctrl->device, "rescanning namespace %d.\n", nsid); + nvme_validate_ns(ctrl, nsid); + } + ret = true; + +out_free_log: + kfree(log); + return ret; +} + static void nvme_scan_work(struct work_struct *work) { struct nvme_ctrl *ctrl = @@ -3206,6 +3242,12 @@ static void nvme_scan_work(struct work_struct *work) WARN_ON_ONCE(!ctrl->tagset); + if (test_and_clear_bit(EVENT_NS_CHANGED, &ctrl->events)) { + if (nvme_scan_changed_ns_log(ctrl)) + goto out_sort_namespaces; + dev_info(ctrl->device, "rescanning namespaces.\n"); + } + if (nvme_identify_ctrl(ctrl, &id)) return; @@ -3213,14 +3255,15 @@ static void nvme_scan_work(struct work_struct *work) if (ctrl->vs >= NVME_VS(1, 1, 0) && !(ctrl->quirks & NVME_QUIRK_IDENTIFY_CNS)) { if (!nvme_scan_ns_list(ctrl, nn)) - goto done; + goto out_free_id; } nvme_scan_ns_sequential(ctrl, nn); - done: +out_free_id: + kfree(id); +out_sort_namespaces: down_write(&ctrl->namespaces_rwsem); list_sort(NULL, &ctrl->namespaces, ns_cmp); up_write(&ctrl->namespaces_rwsem); - kfree(id); } /* @@ -3340,7 +3383,7 @@ static void nvme_handle_aen_notice(struct nvme_ctrl *ctrl, u32 result) { switch ((result & 0xff00) >> 8) { case NVME_AER_NOTICE_NS_CHANGED: - dev_info(ctrl->device, "rescanning\n"); + set_bit(EVENT_NS_CHANGED, &ctrl->events); nvme_queue_scan(ctrl); break; case NVME_AER_NOTICE_FW_ACT_STARTING: dif