Re: [dm-devel] [for-4.16 PATCH v2 0/5] block, nvme, dm: allow DM multipath to use NVMe's error handler

2018-01-02 Thread Mike Snitzer
On Tue, Jan 02 2018 at  6:29pm -0500,
Keith Busch  wrote:

> Instead of hiding NVMe path-related errors, the NVMe driver needs to
> encode an appropriate generic block status from an NVMe status.
> 
> We already do this translation whether or not CONFIG_NVME_MULTIPATH is
> set, so I think it's silly that NVMe native multipathing has a second
> status decoder. That just doubles the work if we need to handle any new
> NVMe status codes in the future.
> 
> I have a counter-proposal below that unifies the NVMe-to-block status
> translation and consolidates the common code for determining whether an
> error is a path failure. This should work for both NVMe and DM, and DM
> won't need any NVMe specifics.

I'm happy with this approach.

FYI, I did recently invert the logic of dm-mpath.c:noretry_error() and
staged it for 4.16, see:
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.16&id=806f913a543e30d0d608a823b11613bbeeba1f6d
But I can drop that patch and rebase accordingly.

Also, I think blk_retryable_error() would be a better name than
blk_path_failure(). I don't feel strongly about it, though.

> I can split this into a series if there's an indication this is OK and
> satisfies the need.

I don't think it needs to be a series; having it as a single patch shows
why it makes sense to elevate the retryable-error check to block core.

But maybe Jens will be able to give you more guidance on what he'd like
to see.

I'd very much like to see this land in 4.16.

Thanks!
Mike

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [for-4.16 PATCH v2 0/5] block, nvme, dm: allow DM multipath to use NVMe's error handler

2018-01-02 Thread Keith Busch
Instead of hiding NVMe path-related errors, the NVMe driver needs to
encode an appropriate generic block status from an NVMe status.

We already do this translation whether or not CONFIG_NVME_MULTIPATH is
set, so I think it's silly that NVMe native multipathing has a second
status decoder. That just doubles the work if we need to handle any new
NVMe status codes in the future.
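
For reference, the translation we already do lives in
drivers/nvme/host/core.c and has roughly this shape (an abridged sketch,
not the verbatim function; the exact case list is beside the point):

	static blk_status_t nvme_error_status(struct request *req)
	{
		/* Map the NVMe status code to a generic block status. */
		switch (nvme_req(req)->status & 0x7ff) {
		case NVME_SC_SUCCESS:
			return BLK_STS_OK;
		case NVME_SC_CAP_EXCEEDED:
			return BLK_STS_NOSPC;
		case NVME_SC_ONCS_NOT_SUPPORTED:
			return BLK_STS_NOTSUPP;
		case NVME_SC_WRITE_FAULT:
		case NVME_SC_READ_ERROR:
			return BLK_STS_MEDIUM;
		case NVME_SC_RESERVATION_CONFLICT:
			return BLK_STS_NEXUS;
		default:
			return BLK_STS_IOERR;
		}
	}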

I have a counter-proposal below that unifies the NVMe-to-block status
translation and consolidates the common code for determining whether an
error is a path failure. This should work for both NVMe and DM, and DM
won't need any NVMe specifics.

I can split this into a series if there's an indication this is OK and
satisfies the need.
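
To be concrete, the blk_path_failure() helper the patch below calls can
be little more than the inverse of dm-mpath's noretry_error(). A minimal
sketch (where exactly it lands in block core, e.g. as a static inline in
a blk header, is open for discussion):

	/*
	 * Sketch: every status in the list below is a target-side
	 * error, so anything else could be a path failure and should
	 * be retried (the inverse of dm-mpath's noretry_error()).
	 */
	static inline bool blk_path_failure(blk_status_t error)
	{
		switch (error) {
		case BLK_STS_NOTSUPP:
		case BLK_STS_NOSPC:
		case BLK_STS_TARGET:
		case BLK_STS_NEXUS:
		case BLK_STS_MEDIUM:
			return false;
		}
		return true;
	}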

---
diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index f7810cc869ac..f6f7a1aefc53 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -1475,21 +1475,6 @@ static void activate_path_work(struct work_struct *work)
activate_or_offline_path(pgpath);
 }
 
-static int noretry_error(blk_status_t error)
-{
-   switch (error) {
-   case BLK_STS_NOTSUPP:
-   case BLK_STS_NOSPC:
-   case BLK_STS_TARGET:
-   case BLK_STS_NEXUS:
-   case BLK_STS_MEDIUM:
-   return 1;
-   }
-
-   /* Anything else could be a path failure, so should be retried */
-   return 0;
-}
-
 static int multipath_end_io(struct dm_target *ti, struct request *clone,
blk_status_t error, union map_info *map_context)
 {
@@ -1508,7 +1493,7 @@ static int multipath_end_io(struct dm_target *ti, struct request *clone,
 * request into dm core, which will remake a clone request and
 * clone bios for it and resubmit it later.
 */
-   if (error && !noretry_error(error)) {
+   if (error && blk_path_failure(error)) {
struct multipath *m = ti->private;
 
r = DM_ENDIO_REQUEUE;
@@ -1544,7 +1529,7 @@ static int multipath_end_io_bio(struct dm_target *ti, struct bio *clone,
unsigned long flags;
int r = DM_ENDIO_DONE;
 
-   if (!*error || noretry_error(*error))
+   if (!*error || !blk_path_failure(*error))
goto done;
 
if (pgpath)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f837d666cbd4..25ef349fd4e4 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -190,20 +190,18 @@ static inline bool nvme_req_needs_retry(struct request *req)
 
 void nvme_complete_rq(struct request *req)
 {
-   if (unlikely(nvme_req(req)->status && nvme_req_needs_retry(req))) {
-   if (nvme_req_needs_failover(req)) {
-   nvme_failover_req(req);
-   return;
-   }
+   blk_status_t status = nvme_error_status(req);
 
+   if (unlikely(status != BLK_STS_OK && nvme_req_needs_retry(req))) {
+   if (nvme_failover_req(req, status))
+   return;
if (!blk_queue_dying(req->q)) {
nvme_req(req)->retries++;
blk_mq_requeue_request(req, true);
return;
}
}
-
-   blk_mq_end_request(req, nvme_error_status(req));
+   blk_mq_end_request(req, status);
 }
 EXPORT_SYMBOL_GPL(nvme_complete_rq);
 
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 1218a9fca846..fa3d96780258 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -19,11 +19,16 @@ module_param(multipath, bool, 0644);
 MODULE_PARM_DESC(multipath,
"turn on native support for multiple controllers per subsystem");
 
-void nvme_failover_req(struct request *req)
+bool nvme_failover_req(struct request *req, blk_status_t status)
 {
struct nvme_ns *ns = req->q->queuedata;
unsigned long flags;
 
+   if (!(req->cmd_flags & REQ_NVME_MPATH))
+   return false;
+   if (!blk_path_failure(status))
+   return false;
+
	spin_lock_irqsave(&ns->head->requeue_lock, flags);
	blk_steal_bios(&ns->head->requeue_list, req);
	spin_unlock_irqrestore(&ns->head->requeue_lock, flags);
@@ -31,52 +36,6 @@ void nvme_failover_req(struct request *req)
 
nvme_reset_ctrl(ns->ctrl);
	kblockd_schedule_work(&ns->head->requeue_work);
-}
-
-bool nvme_req_needs_failover(struct request *req)
-{
-   if (!(req->cmd_flags & REQ_NVME_MPATH))
-   return false;
-
-   switch (nvme_req(req)->status & 0x7ff) {
-   /*
-* Generic command status:
-*/
-   case NVME_SC_INVALID_OPCODE:
-   case NVME_SC_INVALID_FIELD:
-   case NVME_SC_INVALID_NS:
-   case NVME_SC_LBA_RANGE:
-   case NVME_SC_CAP_EXCEEDED:
-   case NVME_SC_RESERVATION_CONFLICT:
-   return false;
-
-   /*
-* I/O command set specific error.  Unfortunately these values are
-* reused for fabrics commands, but those should never get here.
-*/
-   case 

[dm-devel] [PATCH -next] dm raid: make local symbol raid_sets static

2018-01-02 Thread Wei Yongjun
Fixes the following sparse warning:

drivers/md/dm-raid.c:33:1: warning:
 symbol 'raid_sets' was not declared. Should it be static?

Signed-off-by: Wei Yongjun 
---
 drivers/md/dm-raid.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index a96ca44..7ef469e 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -30,7 +30,7 @@
 #define MIN_RAID456_JOURNAL_SPACE (4*2048)
 
 /* Global list of all raid sets */
-LIST_HEAD(raid_sets);
+static LIST_HEAD(raid_sets);
 
 static bool devices_handle_discard_safely = false;
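
For background, LIST_HEAD() in <linux/list.h> is a definition, not just
a declaration:

	#define LIST_HEAD(name) \
		struct list_head name = LIST_HEAD_INIT(name)

so a bare LIST_HEAD(raid_sets) at file scope creates a global symbol
that no header declares, which is exactly what sparse flags. Making it
static gives the list internal linkage without changing behavior.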

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel