Re: [dm-devel] [PATCH 0/5] Failover criteria unification

2018-01-04 Thread Mike Snitzer
On Thu, Jan 04 2018 at  6:47pm -0500,
Keith Busch  wrote:

> On Thu, Jan 04, 2018 at 06:36:27PM -0500, Mike Snitzer wrote:
> > Right, I dropped that patch since it'd have only resulted in conflicts
> > come merge time.  As such, this series can easily go through the nvme
> > tree to Jens.
> 
> It looks like you can also touch up dm to allow it to multipath nvme
> even if CONFIG_NVME_MULTIPATH is set. It may be useful since native NVMe
> doesn't multipath namespaces across subsystems, and some crack smoking
> people want to create something that requires that.

I had a "CONFIG_NVME_MULITIPATH=Y" typo in the dm-mpath.c commit that
added the CONFIG_NVME_MULTIPATH mutual exclusion in dm-4.16; so I
just dropped it via rebase.

So now the underlying nvme devices (already multipath'd by NVMe) should
be discoverable by multipathd.  Who am I to judge crack smokers?
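
For reference, a mutual exclusion like the one described could take roughly
this shape in dm-mpath's path setup (an illustrative sketch only, not the
actual dm-4.16 commit):

    /* Hypothetical sketch: refuse to claim NVMe devices when native
     * NVMe multipathing is built in, so that only one multipath
     * layer owns the paths. */
    static bool mpath_refuses_nvme(struct block_device *bdev)
    {
            return IS_ENABLED(CONFIG_NVME_MULTIPATH) &&
                   !strncmp(bdev->bd_disk->disk_name, "nvme", 4);
    }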

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [PATCH 0/5] Failover criteria unification

2018-01-04 Thread Keith Busch
On Thu, Jan 04, 2018 at 06:36:27PM -0500, Mike Snitzer wrote:
> Right, I dropped that patch since it'd have only resulted in conflicts
> come merge time.  As such, this series can easily go through the nvme
> tree to Jens.

It looks like you can also touch up dm to allow it to multipath nvme
even if CONFIG_NVME_MULTIPATH is set. It may be useful since native NVMe
doesn't multipath namespaces across subsystems, and some crack smoking
people want to create something that requires that.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [PATCH 5/5] dm mpath: Use blk_retryable

2018-01-04 Thread Mike Snitzer
On Thu, Jan 04 2018 at  5:46pm -0500,
Keith Busch  wrote:

> Uses common code for determining if an error should be retried on
> an alternate path.
> 
> Signed-off-by: Keith Busch 

Acked-by: Mike Snitzer 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [PATCH 2/5] nvme/multipath: Consult blk_status_t for failover

2018-01-04 Thread Mike Snitzer
On Thu, Jan 04 2018 at  5:46pm -0500,
Keith Busch  wrote:

> This removes nvme multipath's own status decoding for deciding whether
> failover is needed, using instead the generic blk_status_t that was
> translated earlier. Abstracting away from the raw NVMe status means nvme
> status decoding exists in just one place.
> 
> Signed-off-by: Keith Busch 

Acked-by: Mike Snitzer 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [PATCH 3/5] block: Provide blk_status_t decoding for retryable errors

2018-01-04 Thread Mike Snitzer
On Thu, Jan 04 2018 at  5:46pm -0500,
Keith Busch  wrote:

> This patch provides a common helper for deciding whether a block status
> may be retried, so the various entities that need this answer do not
> have to duplicate the decision.
> 
> Signed-off-by: Keith Busch 

Acked-by: Mike Snitzer 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [PATCH 0/5] Failover criteria unification

2018-01-04 Thread Mike Snitzer
On Thu, Jan 04 2018 at  5:46pm -0500,
Keith Busch  wrote:

> The nvme native multipath code provided a separate NVMe status decoder,
> complicating maintenance since new status codes need to be accounted for
> in two places. It was already diverging from the generic nvme status
> decoder, which has implications for other components that rely on
> accurate generic block errors.
> 
> This series unifies common code between nvme and device-mapper multipath
> so that the failover fate of a command is decided the same way in both.
> 
> Mike:
> 
> I split this up because I thought there'd be trouble merging the dm
> mpath update with the inverted retry logic, but I think you may have
> rebased the dm-4.16 without that patch, as I'm not seeing it in the most
> current branch.

Right, I dropped that patch since it'd have only resulted in conflicts
come merge time.  As such, this series can easily go through the nvme
tree to Jens.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [PATCH 4/5] nvme/multipath: Use blk_retryable

2018-01-04 Thread Mike Snitzer
On Thu, Jan 04 2018 at  5:46pm -0500,
Keith Busch  wrote:

> Uses common code for determining if an error should be retried on
> an alternate path.
> 
> Signed-off-by: Keith Busch 

Acked-by: Mike Snitzer 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [PATCH 1/5] nvme: Add more command status translation

2018-01-04 Thread Mike Snitzer
On Thu, Jan 04 2018 at  5:46pm -0500,
Keith Busch  wrote:

> This adds more NVMe status code translations to blk_status_t values,
> and captures all the current status codes NVMe multipath uses.
> 
> Signed-off-by: Keith Busch 

Acked-by: Mike Snitzer 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 5/5] dm mpath: Use blk_retryable

2018-01-04 Thread Keith Busch
Uses common code for determining if an error should be retried on
an alternate path.

Signed-off-by: Keith Busch 
---
 drivers/md/dm-mpath.c | 19 ++-----------------
 1 file changed, 2 insertions(+), 17 deletions(-)

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index f7810cc869ac..3eb9500db1b3 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -1475,21 +1475,6 @@ static void activate_path_work(struct work_struct *work)
activate_or_offline_path(pgpath);
 }
 
-static int noretry_error(blk_status_t error)
-{
-   switch (error) {
-   case BLK_STS_NOTSUPP:
-   case BLK_STS_NOSPC:
-   case BLK_STS_TARGET:
-   case BLK_STS_NEXUS:
-   case BLK_STS_MEDIUM:
-   return 1;
-   }
-
-   /* Anything else could be a path failure, so should be retried */
-   return 0;
-}
-
 static int multipath_end_io(struct dm_target *ti, struct request *clone,
blk_status_t error, union map_info *map_context)
 {
@@ -1508,7 +1493,7 @@ static int multipath_end_io(struct dm_target *ti, struct request *clone,
 * request into dm core, which will remake a clone request and
 * clone bios for it and resubmit it later.
 */
-   if (error && !noretry_error(error)) {
+   if (error && blk_retryable(error)) {
struct multipath *m = ti->private;
 
r = DM_ENDIO_REQUEUE;
@@ -1544,7 +1529,7 @@ static int multipath_end_io_bio(struct dm_target *ti, struct bio *clone,
unsigned long flags;
int r = DM_ENDIO_DONE;
 
-   if (!*error || noretry_error(*error))
+   if (!*error || !blk_retryable(*error))
goto done;
 
if (pgpath)
-- 
2.13.6

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 4/5] nvme/multipath: Use blk_retryable

2018-01-04 Thread Keith Busch
Uses common code for determining if an error should be retried on
an alternate path.

Signed-off-by: Keith Busch 
---
 drivers/nvme/host/multipath.c | 14 +-------------
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index ae9abb600c0f..93bb72b6efb6 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -37,19 +37,7 @@ bool nvme_req_needs_failover(struct request *req, blk_status_t error)
 {
if (!(req->cmd_flags & REQ_NVME_MPATH))
return false;
-
-   switch (error) {
-   case BLK_STS_NOTSUPP:
-   case BLK_STS_NOSPC:
-   case BLK_STS_TARGET:
-   case BLK_STS_NEXUS:
-   case BLK_STS_MEDIUM:
-   case BLK_STS_PROTECTION:
-   return false;
-   }
-
-   /* Everything else could be a path failure, so should be retried */
-   return true;
+   return blk_retryable(error);
 }
 
 void nvme_kick_requeue_lists(struct nvme_ctrl *ctrl)
-- 
2.13.6

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 1/5] nvme: Add more command status translation

2018-01-04 Thread Keith Busch
This adds more NVMe status code translations to blk_status_t values,
and captures all the current status codes NVMe multipath uses.

Signed-off-by: Keith Busch 
---
 drivers/nvme/host/core.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 2a69d735efbc..f1cf1f6c5b28 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -156,14 +156,20 @@ static blk_status_t nvme_error_status(struct request *req)
case NVME_SC_SUCCESS:
return BLK_STS_OK;
case NVME_SC_CAP_EXCEEDED:
+   case NVME_SC_LBA_RANGE:
return BLK_STS_NOSPC;
+   case NVME_SC_BAD_ATTRIBUTES:
case NVME_SC_ONCS_NOT_SUPPORTED:
+   case NVME_SC_INVALID_OPCODE:
+   case NVME_SC_INVALID_FIELD:
+   case NVME_SC_INVALID_NS:
return BLK_STS_NOTSUPP;
case NVME_SC_WRITE_FAULT:
case NVME_SC_READ_ERROR:
case NVME_SC_UNWRITTEN_BLOCK:
case NVME_SC_ACCESS_DENIED:
case NVME_SC_READ_ONLY:
+   case NVME_SC_COMPARE_FAILED:
return BLK_STS_MEDIUM;
case NVME_SC_GUARD_CHECK:
case NVME_SC_APPTAG_CHECK:
-- 
2.13.6

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 3/5] block: Provide blk_status_t decoding for retryable errors

2018-01-04 Thread Keith Busch
This patch provides a common helper for deciding whether a block status
may be retried, so the various entities that need this answer do not
have to duplicate the decision.

Signed-off-by: Keith Busch 
---
 include/linux/blk_types.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index a1e628e032da..b6a8723b493c 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -39,6 +39,22 @@ typedef u8 __bitwise blk_status_t;
 
 #define BLK_STS_AGAIN  ((__force blk_status_t)12)
 
+static inline bool blk_retryable(blk_status_t error)
+{
+   switch (error) {
+   case BLK_STS_NOTSUPP:
+   case BLK_STS_NOSPC:
+   case BLK_STS_TARGET:
+   case BLK_STS_NEXUS:
+   case BLK_STS_MEDIUM:
+   case BLK_STS_PROTECTION:
+   return false;
+   }
+
+   /* Anything else could be a path failure, so should be retried */
+   return true;
+}
+
 struct blk_issue_stat {
u64 stat;
 };
-- 
2.13.6

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 2/5] nvme/multipath: Consult blk_status_t for failover

2018-01-04 Thread Keith Busch
This removes nvme multipath's own status decoding for deciding whether
failover is needed, using instead the generic blk_status_t that was
translated earlier. Abstracting away from the raw NVMe status means nvme
status decoding exists in just one place.

Signed-off-by: Keith Busch 
---
 drivers/nvme/host/core.c  |  9 +
 drivers/nvme/host/multipath.c | 44 ---
 drivers/nvme/host/nvme.h  |  4 ++--
 3 files changed, 15 insertions(+), 42 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f1cf1f6c5b28..4afc681f7089 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -196,8 +196,10 @@ static inline bool nvme_req_needs_retry(struct request *req)
 
 void nvme_complete_rq(struct request *req)
 {
-   if (unlikely(nvme_req(req)->status && nvme_req_needs_retry(req))) {
-   if (nvme_req_needs_failover(req)) {
+   blk_status_t status = nvme_error_status(req);
+
+   if (unlikely(status != BLK_STS_OK && nvme_req_needs_retry(req))) {
+   if (nvme_req_needs_failover(req, status)) {
nvme_failover_req(req);
return;
}
@@ -208,8 +210,7 @@ void nvme_complete_rq(struct request *req)
return;
}
}
-
-   blk_mq_end_request(req, nvme_error_status(req));
+   blk_mq_end_request(req, status);
 }
 EXPORT_SYMBOL_GPL(nvme_complete_rq);
 
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 1218a9fca846..ae9abb600c0f 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -33,46 +33,18 @@ void nvme_failover_req(struct request *req)
kblockd_schedule_work(&ns->head->requeue_work);
 }
 
-bool nvme_req_needs_failover(struct request *req)
+bool nvme_req_needs_failover(struct request *req, blk_status_t error)
 {
if (!(req->cmd_flags & REQ_NVME_MPATH))
return false;
 
-   switch (nvme_req(req)->status & 0x7ff) {
-   /*
-* Generic command status:
-*/
-   case NVME_SC_INVALID_OPCODE:
-   case NVME_SC_INVALID_FIELD:
-   case NVME_SC_INVALID_NS:
-   case NVME_SC_LBA_RANGE:
-   case NVME_SC_CAP_EXCEEDED:
-   case NVME_SC_RESERVATION_CONFLICT:
-   return false;
-
-   /*
-* I/O command set specific error.  Unfortunately these values are
-* reused for fabrics commands, but those should never get here.
-*/
-   case NVME_SC_BAD_ATTRIBUTES:
-   case NVME_SC_INVALID_PI:
-   case NVME_SC_READ_ONLY:
-   case NVME_SC_ONCS_NOT_SUPPORTED:
-   WARN_ON_ONCE(nvme_req(req)->cmd->common.opcode ==
-   nvme_fabrics_command);
-   return false;
-
-   /*
-* Media and Data Integrity Errors:
-*/
-   case NVME_SC_WRITE_FAULT:
-   case NVME_SC_READ_ERROR:
-   case NVME_SC_GUARD_CHECK:
-   case NVME_SC_APPTAG_CHECK:
-   case NVME_SC_REFTAG_CHECK:
-   case NVME_SC_COMPARE_FAILED:
-   case NVME_SC_ACCESS_DENIED:
-   case NVME_SC_UNWRITTEN_BLOCK:
+   switch (error) {
+   case BLK_STS_NOTSUPP:
+   case BLK_STS_NOSPC:
+   case BLK_STS_TARGET:
+   case BLK_STS_NEXUS:
+   case BLK_STS_MEDIUM:
+   case BLK_STS_PROTECTION:
return false;
}
 
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index ea1aa5283e8e..cf8548c1d7ec 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -401,7 +401,7 @@ extern const struct block_device_operations nvme_ns_head_ops;
 
 #ifdef CONFIG_NVME_MULTIPATH
 void nvme_failover_req(struct request *req);
-bool nvme_req_needs_failover(struct request *req);
+bool nvme_req_needs_failover(struct request *req, blk_status_t error);
 void nvme_kick_requeue_lists(struct nvme_ctrl *ctrl);
 int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl,struct nvme_ns_head *head);
 void nvme_mpath_add_disk(struct nvme_ns_head *head);
@@ -421,7 +421,7 @@ struct nvme_ns *nvme_find_path(struct nvme_ns_head *head);
 static inline void nvme_failover_req(struct request *req)
 {
 }
-static inline bool nvme_req_needs_failover(struct request *req)
+static inline bool nvme_req_needs_failover(struct request *req, blk_status_t error)
 {
return false;
 }
-- 
2.13.6

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 0/5] Failover criteria unification

2018-01-04 Thread Keith Busch
The nvme native multipath code provided a separate NVMe status decoder,
complicating maintenance since new status codes need to be accounted for
in two places. It was already diverging from the generic nvme status
decoder, which has implications for other components that rely on
accurate generic block errors.

This series unifies common code between nvme and device-mapper multipath
so that the failover fate of a command is decided the same way in both.
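
A minimal sketch of the decision point the series converges on, using the
blk_retryable() helper from patch 3 and the call-site logic from patches 4
and 5 (the two wrapper function names below are illustrative only):

    /* nvme/multipath side (cf. patch 4): fail over only requests marked
     * REQ_NVME_MPATH whose generic status is retryable. */
    static bool nvme_side_needs_failover(struct request *req, blk_status_t error)
    {
            return (req->cmd_flags & REQ_NVME_MPATH) && blk_retryable(error);
    }

    /* dm-mpath side (cf. patch 5): requeue onto another path only when
     * the same generic predicate says the error is retryable. */
    static int dm_side_end_io(blk_status_t error)
    {
            return (error && blk_retryable(error)) ? DM_ENDIO_REQUEUE : DM_ENDIO_DONE;
    }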

Mike:

I split this up because I thought there'd be trouble merging the dm
mpath update with the inverted retry logic, but I think you may have
rebased the dm-4.16 without that patch, as I'm not seeing it in the most
current branch.

Keith Busch (5):
  nvme: Add more command status translation
  nvme/multipath: Consult blk_status_t for failover
  block: Provide blk_status_t decoding for retryable errors
  nvme/multipath: Use blk_retryable
  dm mpath: Use blk_retryable

 drivers/md/dm-mpath.c | 19 ++-
 drivers/nvme/host/core.c  | 15 +++
 drivers/nvme/host/multipath.c | 44 ++-
 drivers/nvme/host/nvme.h  |  4 ++--
 include/linux/blk_types.h | 16 
 5 files changed, 33 insertions(+), 65 deletions(-)

-- 
2.13.6

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] dm bufio: fix shrinker scans when (nr_to_scan < retain_target)

2018-01-04 Thread Mike Snitzer
This was already included as of v4.15-rc4 via commit fbc7c07ec2 ("dm
bufio: fix shrinker scans when (nr_to_scan < retain_target)")

I even cc'd you on the relevant pull request that I sent to Linus, see:
https://www.redhat.com/archives/dm-devel/2017-December/msg00119.html

Mike

On Thu, Jan 04 2018 at  2:04pm -0500,
Suren Baghdasaryan  wrote:

> Dear kernel maintainers, I know it was close to the holiday season when I
> sent this patch last month, so a delay was expected. Could you please
> take a look at it and provide your feedback?
> Thanks!
> 
> On Wed, Dec 6, 2017 at 9:27 AM, Suren Baghdasaryan  wrote:
> > When the system is under memory pressure, the dm bufio shrinker often
> > reclaims only one buffer per scan. This change fixes the following two
> > issues in the dm bufio shrinker that cause this behavior:
> >
> > 1. ((nr_to_scan - freed) <= retain_target) condition is used to
> > terminate slab scan process. This assumes that nr_to_scan is equal
> > to the LRU size, which might not be correct because do_shrink_slab()
> > in vmscan.c calculates nr_to_scan using multiple inputs.
> > As a result when nr_to_scan is less than retain_target (64) the scan
> > will terminate after the first iteration, effectively reclaiming one
> > buffer per scan and making scans very inefficient. This hurts vmscan
> > performance, especially because the mutex is acquired/released every time
> > dm_bufio_shrink_scan() is called.
> > New implementation uses ((LRU size - freed) <= retain_target)
> > condition for scan termination. LRU size can be safely determined
> > inside __scan() because this function is called after dm_bufio_lock().
> >
> > 2. do_shrink_slab() uses value returned by dm_bufio_shrink_count() to
> > determine number of freeable objects in the slab. However dm_bufio
> > always retains retain_target buffers in its LRU and will terminate
> > a scan when this mark is reached. Therefore returning the entire LRU size
> > from dm_bufio_shrink_count() is misleading because that does not
> > represent the number of freeable objects that slab will reclaim during
> > a scan. Returning (LRU size - retain_target) better represents the
> > number of freeable objects in the slab. This way do_shrink_slab()
> > returns 0 when (LRU size < retain_target) and vmscan will not try to
> > scan this shrinker, avoiding scans that will not reclaim any memory.
> >
> > Test: tested using Android device running
> > /system/extras/alloc-stress that generates memory pressure
> > and causes intensive shrinker scans
> >
> > Signed-off-by: Suren Baghdasaryan 
> > ---
> >  drivers/md/dm-bufio.c | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
> > index b8ac5917..c546b567f3b5 100644
> > --- a/drivers/md/dm-bufio.c
> > +++ b/drivers/md/dm-bufio.c
> > @@ -1611,7 +1611,8 @@ static unsigned long __scan(struct dm_bufio_client *c, unsigned long nr_to_scan,
> > int l;
> > struct dm_buffer *b, *tmp;
> > unsigned long freed = 0;
> > -   unsigned long count = nr_to_scan;
> > +   unsigned long count = c->n_buffers[LIST_CLEAN] +
> > + c->n_buffers[LIST_DIRTY];
> > unsigned long retain_target = get_retain_buffers(c);
> >
> > for (l = 0; l < LIST_SIZE; l++) {
> > @@ -1647,8 +1648,11 @@ static unsigned long
> >  dm_bufio_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
> >  {
> > struct dm_bufio_client *c = container_of(shrink, struct dm_bufio_client, shrinker);
> > +   unsigned long count = READ_ONCE(c->n_buffers[LIST_CLEAN]) +
> > + READ_ONCE(c->n_buffers[LIST_DIRTY]);
> > +   unsigned long retain_target = get_retain_buffers(c);
> >
> > -   return READ_ONCE(c->n_buffers[LIST_CLEAN]) + READ_ONCE(c->n_buffers[LIST_DIRTY]);
> > +   return (count < retain_target) ? 0 : (count - retain_target);
> >  }
> >
> >  /*
> > --
> > 2.15.0.531.g2ccb3012c9-goog
> >

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [PATCH v6 0/2] dax, dm: stop requiring dax for device-mapper

2018-01-04 Thread Mike Snitzer
On Wed, Nov 29 2017 at  1:59pm -0500,
Dan Williams  wrote:

> Changes since v5 [1]:
> * Make DAX_DRIVER select DAX to simplify the Kconfig dependencies
>   (Michael)
> * Rebase on 4.15-rc1 and add new IS_ENABLED(CONFIG_DAX_DRIVER) checks in
>   drivers/md/dm-log-writes.c.
> 
> [1]: https://lists.01.org/pipermail/linux-nvdimm/2017-September/012569.html
> 
> ---
> 
> Hi Mike,
> 
> Bart points out that the DAX core is unconditionally enabled if
> device-mapper is enabled. Add some config machinery and some
> stub/static-inline routines to allow dax infrastructure to be deleted
> from device-mapper at compile time.

Hey Dan,

Did you want me to pick this up for 4.16 or were you thinking of sending
the changes through a different tree?

I'm fine with picking these up so long as you don't expect the dax.h and
Kconfig changes to become a source of conflict come merge time.

Let me know, thanks!
Mike

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] dm bufio: fix missed destroy of mutex c->lock

2018-01-04 Thread Mike Snitzer
I already staged a similar fix, see:
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.16&id=cfb0bd8b25eb90faa7cbbad9a52ad2c33102123e

On Thu, Jan 04 2018 at  9:08am -0500,
Xiongwei Song  wrote:

> The mutex c->lock is initialized in dm_bufio_client_create; however,
> it is not destroyed before the dm_bufio_client structure is freed in
> dm_bufio_client_destroy.
> 
> Signed-off-by: Xiongwei Song 
> ---
>  drivers/md/dm-bufio.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
> index c546b567f3b5..53c0d5d2039a 100644
> --- a/drivers/md/dm-bufio.c
> +++ b/drivers/md/dm-bufio.c
> @@ -1811,6 +1811,7 @@ void dm_bufio_client_destroy(struct dm_bufio_client *c)
>   BUG_ON(c->n_buffers[i]);
>  
>   dm_io_client_destroy(c->dm_io);
> + mutex_destroy(&c->lock);
>   kfree(c);
>  }
>  EXPORT_SYMBOL_GPL(dm_bufio_client_destroy);
> -- 
> 2.15.1
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [for-4.16 PATCH v2 0/5] block, nvme, dm: allow DM multipath to use NVMe's error handler

2018-01-04 Thread Mike Snitzer
On Thu, Jan 04 2018 at  5:26am -0500,
Christoph Hellwig  wrote:

> On Tue, Jan 02, 2018 at 04:29:43PM -0700, Keith Busch wrote:
> > Instead of hiding NVMe path related errors, the NVMe driver needs to
> > code an appropriate generic block status from an NVMe status.
> > 
> > We already do this translation whether or not CONFIG_NVME_MULTIPATHING is
> > set, so I think it's silly NVMe native multipathing has a second status
> > decoder. This just doubles the work if we need to handle any new NVMe
> > status codes in the future.
> > 
> > I have a counter-proposal below that unifies NVMe-to-block status
> > translations, and combines common code for determining if an error is a
> > path failure. This should work for both NVMe and DM, and DM won't need
> > NVMe specifics.
> > 
> > I can split this into a series if there's indication this is ok and
> > satisfies the need.
> 
> You'll need to update nvme_error_status to account for all errors
> handled in nvme_req_needs_failover

These weren't accounted for (I may have mis-categorized them in this
patch, so please feel free to correct):

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 25ef349fd4e4..835a9c60200e 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -165,10 +165,18 @@ static blk_status_t nvme_error_status(struct request *req)
case NVME_SC_ACCESS_DENIED:
case NVME_SC_READ_ONLY:
return BLK_STS_MEDIUM;
+   case NVME_SC_INVALID_OPCODE:
+   case NVME_SC_INVALID_FIELD:
+   case NVME_SC_INVALID_NS:
+   case NVME_SC_LBA_RANGE:
+   case NVME_SC_CAP_EXCEEDED:
+   case NVME_SC_BAD_ATTRIBUTES:
+   return BLK_STS_TARGET;
case NVME_SC_GUARD_CHECK:
case NVME_SC_APPTAG_CHECK:
case NVME_SC_REFTAG_CHECK:
case NVME_SC_INVALID_PI:
+   case NVME_SC_COMPARE_FAILED:
return BLK_STS_PROTECTION;
case NVME_SC_RESERVATION_CONFLICT:
return BLK_STS_NEXUS;

And of those, these aren't currently used:

NVME_SC_LBA_RANGE
NVME_SC_CAP_EXCEEDED
NVME_SC_BAD_ATTRIBUTES
NVME_SC_COMPARE_FAILED

At least not that I can see.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [PATCH] dm bufio: fix missed destroy of mutex c->lock

2018-01-04 Thread Xiongwei Song
2018-01-04 22:08 GMT+08:00 Xiongwei Song :
> The mutex c->lock is initialized in dm_bufio_client_create; however,
> it is not destroyed before the dm_bufio_client structure is freed in
> dm_bufio_client_destroy.
>
> Signed-off-by: Xiongwei Song 
> ---
>  drivers/md/dm-bufio.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
> index c546b567f3b5..53c0d5d2039a 100644
> --- a/drivers/md/dm-bufio.c
> +++ b/drivers/md/dm-bufio.c
> @@ -1811,6 +1811,7 @@ void dm_bufio_client_destroy(struct dm_bufio_client *c)
> BUG_ON(c->n_buffers[i]);
>
> dm_io_client_destroy(c->dm_io);
> +   mutex_destroy(&c->lock);
> kfree(c);
>  }
>  EXPORT_SYMBOL_GPL(dm_bufio_client_destroy);
> --
> 2.15.1
>

Sorry for the noise. I added an incorrect email address.
Please ignore that message. Sorry again.

Yours sincerely,
Xiongwei

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH] dm bufio: fix missed destroy of mutex c->lock

2018-01-04 Thread Xiongwei Song
The mutex c->lock is initialized in dm_bufio_client_create; however,
it is not destroyed before the dm_bufio_client structure is freed in
dm_bufio_client_destroy.

Signed-off-by: Xiongwei Song 
---
 drivers/md/dm-bufio.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index c546b567f3b5..53c0d5d2039a 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -1811,6 +1811,7 @@ void dm_bufio_client_destroy(struct dm_bufio_client *c)
BUG_ON(c->n_buffers[i]);
 
dm_io_client_destroy(c->dm_io);
+   mutex_destroy(&c->lock);
kfree(c);
 }
 EXPORT_SYMBOL_GPL(dm_bufio_client_destroy);
-- 
2.15.1

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [for-4.16 PATCH v2 1/5] block: establish request failover callback

2018-01-04 Thread Mike Snitzer
On Thu, Jan 04 2018 at  5:28am -0500,
Christoph Hellwig  wrote:

> On Fri, Dec 29, 2017 at 03:19:04PM -0500, Mike Snitzer wrote:
> > On Fri, Dec 29 2017 at  5:10am -0500,
> > Christoph Hellwig  wrote:
> > 
> > > On Tue, Dec 26, 2017 at 10:22:53PM -0500, Mike Snitzer wrote:
> > > > All requests allocated from a request_queue with this callback set can
> > > > failover their requests during completion.
> > > > 
> > > > This callback is expected to use the blk_steal_bios() interface to
> > > > transfer a request's bios back to an upper-layer bio-based
> > > > request_queue.
> > > > 
> > > > This will be used by both NVMe multipath and DM multipath.  Without it
> > > > DM multipath cannot get access to NVMe-specific error handling that NVMe
> > > > core provides in nvme_complete_rq().
> > > 
> > > And the whole point is that it should not get any such access.
> > 
> > No the whole point is you hijacked multipathing for little to no gain.
> 
> That is your idea.  In the end there have been a lot of complaints about
> dm-multipath, and there was a lot of discussion about how to do things better,
> with a broad agreement on this approach.  Up to the point where Hannes
> has started considering doing something similar for scsi.

All the "DM multipath" complaints I heard at LSF were fixable and pretty
superficial.  Some less so, but Hannes had a vision for addressing
various SCSI stuff (which really complicated DM multipath).

But I'd really rather not dwell on all the history of NVMe native
multipathing's evolution.  It isn't productive (other than to
acknowledge that there are far more efficient and productive ways to
coordinate such a change).

> And to be honest if this is the tone you'd like to set for technical
> discussions I'm not really interested.  Please calm down and stick
> to a technical discussion.

I think you'd probably agree that you've repeatedly derailed or avoided
technical discussion if it got into "DM multipath".  But again I'm not
looking to dwell on how dysfunctional this has been.  I really do
appreciate your technical expertise.  Sadly, I cannot say I feel you think
similarly of me.

I will say that I'm human, as such I have limits on what I'm willing to
accept.  You leveraged your position to the point where it has started
to feel like you were lording over me.  Tough to accept that.   It makes
my job _really_ feel like "work".  All I've ever been trying to do
(since accepting the reality of "NVMe native multipathing") is bridge
the gap from the old solution to the new solution.  I'm not opposed to the
new solution, it just needs to mature without being the _only_ way to
provide the feature (NVMe multipathing).  Hopefully we can have productive
exchanges moving forward.

There are certainly some challenges associated with trying to allow a
kernel to support both NVMe native multipathing and DM multipathing.
E.g. would an NVMe device scan multipath blacklist be doable/acceptable?

I'd also like to understand if your vision for NVMe's ANA support will
model something like scsi_dh?  Meaning ANA is a capability that, when
attached, augments the behavior of the NVMe device but that it is
otherwise internal to the device and upper layers will get the benefit
of the ANA handler being attached.  Also, I'm curious to know if you see
that as needing to be tightly coupled to multipathing?  If so, that is the next
interface point hurdle.

In the end I really think that DM multipath can help make NVMe native
multipath very robust (and vice-versa).

Mike

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [for-4.16 PATCH v2 0/5] block, nvme, dm: allow DM multipath to use NVMe's error handler

2018-01-04 Thread Mike Snitzer
On Thu, Jan 04 2018 at  5:26am -0500,
Christoph Hellwig  wrote:

> On Tue, Jan 02, 2018 at 04:29:43PM -0700, Keith Busch wrote:
> > Instead of hiding NVMe path related errors, the NVMe driver needs to
> > code an appropriate generic block status from an NVMe status.
> > 
> > We already do this translation whether or not CONFIG_NVME_MULTIPATHING is
> > set, so I think it's silly NVMe native multipathing has a second status
> > decoder. This just doubles the work if we need to handle any new NVMe
> > status codes in the future.
> > 
> > I have a counter-proposal below that unifies NVMe-to-block status
> > translations, and combines common code for determining if an error is a
> > path failure. This should work for both NVMe and DM, and DM won't need
> > NVMe specifics.
> > 
> > I can split this into a series if there's indication this is ok and
> > satisfies the need.
> 
> You'll need to update nvme_error_status to account for all errors
> handled in nvme_req_needs_failover, and you will probably have to
> add additional BLK_STS_* code.  But if this is all that the rage was
> about I'm perfectly fine with it.

Glad you're fine with it.  I thought you'd balk at this too.  Mainly
because I was unaware nvme_error_status() existed, so I thought any
amount of new NVMe error translation for upper-layer consumption would
be met with resistance.

Keith arrived at this approach based on an exchange we had in private.

I gave him context for DM multipath's need to access the code NVMe uses
to determine if an NVMe-specific error is retryable or not.  I explained
how SCSI uses scsi_dh error handling and
drivers/scsi/scsi_lib.c:__scsi_error_from_host_byte() to establish a
"differentiated IO error", and then
drivers/md/dm-mpath.c:noretry_error() consumes the resulting BLK_STS_*.
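
In code terms, the pre-series flow being described looks roughly like this
(a conceptual sketch with an illustrative function name; noretry_error()
is the dm-mpath helper shown being removed in patch 5 of the series):

    /* The low-level driver (SCSI or NVMe) translates its native status
     * into a generic blk_status_t once; dm-mpath then consults only that
     * generic code to decide whether to retry on another path. */
    static int mpath_requeue_decision(blk_status_t error)
    {
            if (error && !noretry_error(error))
                    return DM_ENDIO_REQUEUE;   /* retry on another path */
            return DM_ENDIO_DONE;              /* surface the error */
    }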

Armed with this context Keith was able to take his NVMe knowledge and
arrive at something you're fine with.  Glad it worked out.

Thanks,
Mike

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [for-4.16 PATCH v2 1/5] block: establish request failover callback

2018-01-04 Thread Christoph Hellwig
On Fri, Dec 29, 2017 at 03:19:04PM -0500, Mike Snitzer wrote:
> On Fri, Dec 29 2017 at  5:10am -0500,
> Christoph Hellwig  wrote:
> 
> > On Tue, Dec 26, 2017 at 10:22:53PM -0500, Mike Snitzer wrote:
> > > All requests allocated from a request_queue with this callback set can
> > > failover their requests during completion.
> > > 
> > > This callback is expected to use the blk_steal_bios() interface to
> > > transfer a request's bios back to an upper-layer bio-based
> > > request_queue.
> > > 
> > > This will be used by both NVMe multipath and DM multipath.  Without it
> > > DM multipath cannot get access to NVMe-specific error handling that NVMe
> > > core provides in nvme_complete_rq().
> > 
> > And the whole point is that it should not get any such access.
> 
> No the whole point is you hijacked multipathing for little to no gain.

That is your idea.  In the end there have been a lot of complaints about
dm-multipath, and there was a lot of discussion about how to do things better,
with a broad agreement on this approach.  Up to the point where Hannes
has started considering doing something similar for scsi.

And to be honest if this is the tone you'd like to set for technical
discussions I'm not really interested.  Please calm down and stick
to a technical discussion.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [for-4.16 PATCH v2 0/5] block, nvme, dm: allow DM multipath to use NVMe's error handler

2018-01-04 Thread Christoph Hellwig
On Tue, Jan 02, 2018 at 04:29:43PM -0700, Keith Busch wrote:
> Instead of hiding NVMe path related errors, the NVMe driver needs to
> code an appropriate generic block status from an NVMe status.
> 
> We already do this translation whether or not CONFIG_NVME_MULTIPATHING is
> set, so I think it's silly NVMe native multipathing has a second status
> decoder. This just doubles the work if we need to handle any new NVMe
> status codes in the future.
> 
> I have a counter-proposal below that unifies NVMe-to-block status
> translations, and combines common code for determining if an error is a
> path failure. This should work for both NVMe and DM, and DM won't need
> NVMe specifics.
> 
> I can split this into a series if there's indication this is ok and
> satisfies the need.

You'll need to update nvme_error_status to account for all errors
handled in nvme_req_needs_failover, and you will probably have to
add additional BLK_STS_* code.  But if this is all that the rage was
about I'm perfectly fine with it.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel