Re: [dm-devel] [PATCH 08/12] libmultipath: wait one seconds for more uevents in uevent_listen() in uevents burst situations

2017-01-03 Thread tang . junhui
Hello Ben,

I know the issue of socket's buffer fills up, 
but I think uevent_can_discard() totally process in memory,
it is light-weight and low cpu consumption, and it reduce the 
iteration count in merging, the test result is also good in
massive multipath devices scene. 

But if you still stick to it, I will revert it to uevents 
processing thread, and call it in uevent_dispatch() before 
calling merge_uevq(), actually, I'm going to do that.

Regards
Tang Junhui





发件人: "Benjamin Marzinski" 
收件人: tang.jun...@zte.com.cn, 
抄送:   tang.wenj...@zte.com.cn, zhang.ka...@zte.com.cn, 
dm-devel@redhat.com, bart.vanass...@sandisk.com, mwi...@suse.com
日期:   2017/01/04 06:38
主题:   Re: [dm-devel] [PATCH 08/12] libmultipath: wait one seconds for 
more uevents in uevent_listen() in uevents burst situations
发件人: dm-devel-boun...@redhat.com



On Tue, Dec 27, 2016 at 04:03:25PM +0800, tang.jun...@zte.com.cn wrote:
> From: tang.junhui 
> 
> The more uevents are merged, the higher efficiency program will 
performs.
> So, do not process uevents after receiving immediately in uevents burst
> situations, but continue wait 1 seconds for more uevents except that too
> much uevents (2048) have already been received or too much time eclipse
> (60 seconds).

The issue I have with doing this in uevent_listen is that we seperated
receiving the uevents from servicing the uevents specificially to make
sure what we received them as fast as possible. The udev monitor code
is all based on a NETLINK socket. If our socket's receive buffer fills
up, there is no flow control. Events just start getting dropped, which
does cut down on processing time, but not in a way we would like.

This issue applies to a lesser extent to you previous two patches. I
don't really see the benefit of making sure that we drop the uevents
as soon as possible.  As long as we don't process them, that should
be good enough, right?

Now, maybe you put a lot of thought into your timeouts, and feel
confident that we will start processing well before the receive buffer
fills up. But if so, some comments on that would be reassuring, because
from the commit message, they seem fairly arbitrary to me.

-Ben

> 
> Change-Id: I763d491540e8114a81d12d603281540a81502742
> Signed-off-by: tang.junhui 
> ---
>  libmultipath/uevent.c | 35 +--
>  1 file changed, 33 insertions(+), 2 deletions(-)
> 
> diff --git a/libmultipath/uevent.c b/libmultipath/uevent.c
> index cc10d65..b0b05e9 100644
> --- a/libmultipath/uevent.c
> +++ b/libmultipath/uevent.c
> @@ -39,6 +39,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
> 
> @@ -490,11 +491,37 @@ struct uevent *uevent_from_udev_device(struct 
udev_device *dev)
>return uev;
>  }
> 
> +bool uevent_burst(struct timeval *start_time, int events)
> +{
> +  struct timeval diff_time, end_time;
> +  unsigned long speed;
> +  unsigned long eclipse_ms;
> +
> +  gettimeofday(&end_time, NULL);
> +  timersub(&end_time, start_time, &diff_time);
> +
> +  eclipse_ms = diff_time.tv_sec * 1000 + diff_time.tv_usec 
/ 1000;
> +  if (eclipse_ms == 0)
> +  return true;
> +  /* max wait 60s */
> +  if (eclipse_ms > 60*1000) {
> +  condlog(1, "burst continued =%lu ms, 
stoped", eclipse_ms);
> +  return false;
> +  }
> +
> +  speed = (events * 1000) / eclipse_ms;
> +  if (speed > 10)
> +  return true;
> +
> +  return false;
> +}
> +
>  int uevent_listen(struct udev *udev)
>  {
>int err = 2;
>struct udev_monitor *monitor = NULL;
>int fd, socket_flags, events;
> +  struct timeval start_time;
>int need_failback = 1;
>int timeout = 30;
>LIST_HEAD(uevlisten_tmp);
> @@ -548,6 +575,7 @@ int uevent_listen(struct udev *udev)
>}
> 
>events = 0;
> +  gettimeofday(&start_time, NULL);
>while (1) {
>struct uevent *uev;
>struct udev_device *dev;
> @@ -562,7 +590,8 @@ int uevent_listen(struct udev *udev)
>errno = 0;
>fdcount = poll(&ev_poll, 1, 
poll_timeout);
>if (fdcount && ev_poll.revents & POLLIN) 
{
> -  timeout = 0;
> +  timeout = 
uevent_burst(&start_time, events + 1) ? 1:0;
> +
>dev = 
udev_monitor_receive_device(monitor);
>if (!dev) {
> condlog(0, "failed getting udev device");
> @@ -578,7 +607,8 @@ int uevent_l

Re: [dm-devel] [PATCH 01/12] libmultipath: add wwid for "struct uevent" to record wwid of uevent

2017-01-03 Thread tang . junhui
Hello Ben,

> Right now, multipath users are allowed configure devices to set the wwid
> based on any udev environment variable (or even use a callout, although
> this is deprecated). With this patch, that breaks.
Does WWID obtained by different methods change? If it changes, we would 
better to modify code to keep it no change. 

> If the udev sets
> ID_SERIAL for a device, that is its wwid, right? 
Yes

> Do you know if rbd
> devices have ID_SERIAL set? 
WWID has different label in uevents for different devices, I only test for
SCSI devices. Now we do not support rbd divice for uevents merging, these 
device process as old way, it has no harm in logic. If we need to merge 
rbd uevents for these devices, we can add code to get wwid from uevents
and it can be supported easily. 


Regards
Tang Junhui





发件人: "Benjamin Marzinski" 
收件人: tang.jun...@zte.com.cn, 
抄送:   christophe.varo...@opensvc.com, h...@suse.de, mwi...@suse.com, 
bart.vanass...@sandisk.com, dm-devel@redhat.com, zhang.ka...@zte.com.cn, 
tang.wenj...@zte.com.cn
日期:   2017/01/04 06:03
主题:   Re: [PATCH 01/12] libmultipath: add wwid for "struct uevent" to 
record wwid of uevent



On Tue, Dec 27, 2016 at 04:03:18PM +0800, tang.jun...@zte.com.cn wrote:
> From: tang.junhui 
> 
> Add "char *wwid" to point WWID of uevent. This member identifies
> the LUN ID which the path belongs to, and it is used for merging
> uevents. WWID possibly did not exist in uevent yet, so ->wwid
> would be NULL, those uevents would not be merged, but be proccessed
> as old way.

Right now, multipath users are allowed configure devices to set the wwid
based on any udev environment variable (or even use a callout, although
this is deprecated). With this patch, that breaks. If the udev sets
ID_SERIAL for a device, that is its wwid, right?  Do you know if rbd
devices have ID_SERIAL set? If so, this change will break them.  Even if
this change doesn't break any devices in their default configurations,
we would need to disallow changing how the wwid is set for this patch
to be safe.

-Ben 

> 
> Change-Id: Ie6b076363b3735dc7de10184b27fa799b499af0e
> Signed-off-by: tang.junhui 
> ---
>  libmultipath/uevent.c | 2 ++
>  libmultipath/uevent.h | 1 +
>  2 files changed, 3 insertions(+)
> 
> diff --git a/libmultipath/uevent.c b/libmultipath/uevent.c
> index 7edcce1..ef1bafe 100644
> --- a/libmultipath/uevent.c
> +++ b/libmultipath/uevent.c
> @@ -424,6 +424,8 @@ struct uevent *uevent_from_udev_device(struct 
udev_device *dev)
>uev->devpath = 
uev->envp[i] + 8;
>if (strcmp(name, "ACTION") == 0)
>uev->action = 
uev->envp[i] + 7;
> +  if (strcmp(name, "ID_SERIAL") == 0)
> +  uev->wwid = uev->envp[i] 
+ 10;
>i++;
>if (i == HOTPLUG_NUM_ENVP - 1)
>break;
> diff --git a/libmultipath/uevent.h b/libmultipath/uevent.h
> index 9d22dcd..7bfccef 100644
> --- a/libmultipath/uevent.h
> +++ b/libmultipath/uevent.h
> @@ -22,6 +22,7 @@ struct uevent {
>char *devpath;
>char *action;
>char *kernel;
> +  char *wwid;
>unsigned long seqnum;
>char *envp[HOTPLUG_NUM_ENVP];
>  };
> -- 
> 2.8.1.windows.1
> 


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Re: [dm-devel] [PATCH v2 1/1] block: fix blk_queue_split() resource exhaustion

2017-01-03 Thread NeilBrown
On Tue, Jan 03 2017, Jack Wang wrote:

> 2016-12-23 12:45 GMT+01:00 Lars Ellenberg :
>> On Fri, Dec 23, 2016 at 09:49:53AM +0100, Michael Wang wrote:
>>> Dear Maintainers
>>>
>>> I'd like to ask for the status of this patch since we hit the
>>> issue too during our testing on md raid1.
>>>
>>> Split remainder bio_A was queued ahead, following by bio_B for
>>> lower device, at this moment raid start freezing, the loop take
>>> out bio_A firstly and deliver it, which will hung since raid is
>>> freezing, while the freezing never end since it waiting for
>>> bio_B to finish, and bio_B is still on the queue, waiting for
>>> bio_A to finish...
>>>
>>> We're looking for a good solution and we found this patch
>>> already progressed a lot, but we can't find it on linux-next,
>>> so we'd like to ask are we still planning to have this fix
>>> in upstream?
>>
>> I don't see why not, I'd even like to have it in older kernels,
>> but did not have the time and energy to push it.
>>
>> Thanks for the bump.
>>
>> Lars
>>
> Hi folks,
>
> As Michael mentioned, we hit a bug this patch is trying to fix.
> Neil suggested another way to fix it.  I attached below.
> I personal prefer Neil's version as it's less code change, and straight 
> forward.
>
> Could you share your comments, we can get one fix into mainline.
>
> Thanks,
> Jinpu
> From 69a4829a55503e496ce9c730d2c8e3dd8a08874a Mon Sep 17 00:00:00 2001
> From: NeilBrown 
> Date: Wed, 14 Dec 2016 16:55:52 +0100
> Subject: [PATCH] block: fix deadlock between freeze_array() and wait_barrier()
>
> When we call wait_barrier, we might have some bios waiting
> in current->bio_list, which prevents the array_freeze call to
> complete. Those can only be internal READs, which have already
> passed the wait_barrier call (thus incrementing nr_pending), but
> still were not submitted to the lower level, due to generic_make_request
> logic to avoid recursive calls. In such case, we have a deadlock:
> - array_frozen is already set to 1, so wait_barrier unconditionally waits, so
> - internal READ bios will not be submitted, thus freeze_array will
> never completes.
>
> To fix this, modify generic_make_request to always sort bio_list_on_stack
> first with lowest level, then higher, until same level.
>
> Sent to linux-raid mail list:
> https://marc.info/?l=linux-raid&m=148232453107685&w=2
>

This should probably also have

  Inspired-by: Lars Ellenberg 

or something that, as I was building on Lars' ideas when I wrote this.

It would also be worth noting in the description that this addresses
issues with dm and drbd as well as md.

In fact, I think that with this patch in place, much of the need for the
rescue_workqueue won't exist any more.  I cannot promise it can be
removed completely, but it should be to hard to make it optional and
only enabled for those few block devices that will still need it.
The rescuer should only be needed for a bioset which can be allocated
From twice in the one call the ->make_request_fn.  This would include
raid0 for example, though raid0_make_reqest could be re-written to not
use a loop and to just call generic_make_request(bio) if bio != split.

Thanks,
NeilBrown


> Suggested-by: NeilBrown 
> Signed-off-by: Jack Wang 
> ---
>  block/blk-core.c | 20 
>  1 file changed, 20 insertions(+)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 9e3ac56..47ef373 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -2138,10 +2138,30 @@ blk_qc_t generic_make_request(struct bio *bio)
>   struct request_queue *q = bdev_get_queue(bio->bi_bdev);
>  
>   if (likely(blk_queue_enter(q, __GFP_DIRECT_RECLAIM) == 0)) {
> + struct bio_list lower, same, hold;
> +
> + /* Create a fresh bio_list for all subordinate requests 
> */
> + bio_list_init(&hold);
> + bio_list_merge(&hold, &bio_list_on_stack);
> + bio_list_init(&bio_list_on_stack);
>  
>   ret = q->make_request_fn(q, bio);
>  
>   blk_queue_exit(q);
> + /* sort new bios into those for a lower level
> +  * and those for the same level
> +  */
> + bio_list_init(&lower);
> + bio_list_init(&same);
> + while ((bio = bio_list_pop(&bio_list_on_stack)) != NULL)
> + if (q == bdev_get_queue(bio->bi_bdev))
> + bio_list_add(&same, bio);
> + else
> + bio_list_add(&lower, bio);
> + /* now assemble so we handle the lowest level first */
> + bio_list_merge(&bio_list_on_stack, &lower);
> + bio_list_merge(&bio_list_on_stack, &same);
> + bio_list_merge(&bio_list_on_stack, &hold);
>  
>   bio = bi

Re: [dm-devel] [PATCH 09/12] multipathd: merge uevents before proccessing

2017-01-03 Thread tang . junhui
Hello Ben,

Yes, a *_safe list traversal method can meet the needs,
I will modify it and simplify the codes.

Thanks,
Tang Junhui




发件人: "Benjamin Marzinski" 
收件人: tang.jun...@zte.com.cn, 
抄送:   tang.wenj...@zte.com.cn, zhang.ka...@zte.com.cn, 
dm-devel@redhat.com, bart.vanass...@sandisk.com, mwi...@suse.com
日期:   2017/01/04 08:39
主题:   Re: [dm-devel] [PATCH 09/12] multipathd: merge uevents before 
proccessing
发件人: dm-devel-boun...@redhat.com



On Tue, Dec 27, 2016 at 04:03:26PM +0800, tang.jun...@zte.com.cn wrote:
> From: tang.junhui 
> 
> These uevents are going to be merged:
> 1) uevents come from paths and
> 2) uevents type is same and
> 3) uevents type is addition or deletion and
> 4) uevents wwid is same.

This is just a nit, and I might be missing something subtle here, but it
seems like instead of adding list_for_some_entry_reverse, and then
breaking the abstraction to manually get previous entries, you could
have just added list_for_some_entry_reverse_safe in your earlier patch,
and hid the work of traversing a list while removing elements behind the
well understood abstraction of a *_safe list traversal method.

-Ben

> 
> Change-Id: I05ee057391c092aa0c5f989b7a4f9cb550bb4d98
> Signed-off-by: tang.junhui 
> ---
>  libmultipath/uevent.c | 125 
+-
>  1 file changed, 114 insertions(+), 11 deletions(-)
> 
> diff --git a/libmultipath/uevent.c b/libmultipath/uevent.c
> index b0b05e9..114068c 100644
> --- a/libmultipath/uevent.c
> +++ b/libmultipath/uevent.c
> @@ -85,6 +85,20 @@ struct uevent * alloc_uevent (void)
>return uev;
>  }
> 
> +void
> +uevq_cleanup(struct list_head *tmpq)
> +{
> +  struct uevent *uev, *tmp;
> +
> +  list_for_each_entry_safe(uev, tmp, tmpq, node) {
> +  list_del_init(&uev->node);
> +
> +  if (uev->udev)
> + udev_device_unref(uev->udev);
> +  FREE(uev);
> +  }
> +}
> +
>  bool
>  uevent_can_discard(char *devpath, char *kernel)
>  {
> @@ -125,6 +139,103 @@ uevent_can_discard(char *devpath, char *kernel)
>return false;
>  }
> 
> +bool
> +merge_need_stop(struct uevent *earlier, struct uevent *later)
> +{
> +  /*
> +   * dm uevent do not try to merge with left uevents
> +   */
> +  if (!strncmp(later->kernel, "dm-", 3))
> +  return true;
> +
> +  /*
> +   * we can not make a jugement without wwid,
> +   * so it is sensible to stop merging
> +   */
> +  if (!earlier->wwid || !later->wwid)
> +  return true;
> +  /*
> +   * uevents merging stoped
> +   * when we meet an opposite action uevent from the same 
LUN to AVOID
> +   * "add path1 |remove path1 |add path2 |remove path2 |add 
path3"
> +   * to merge as "remove path1, path2" and "add path1, 
path2, path3"
> +   * OR
> +   * "remove path1 |add path1 |remove path2 |add path2 
|remove path3"
> +   * to merge as "add path1, path2" and "remove path1, 
path2, path3"
> +   * SO
> +   * when we meet a non-change uevent from the same LUN
> +   * with the same wwid and different action
> +   * it would be better to stop merging.
> +   */
> +  if (!strcmp(earlier->wwid, later->wwid) &&
> +  strcmp(earlier->action, later->action) &&
> +  strcmp(earlier->action, "change") &&
> +  strcmp(later->action, "change"))
> +  return true;
> +
> +  return false;
> +}
> +
> +bool
> +uevent_can_merge(struct uevent *earlier, struct uevent *later)
> +{
> +  /* merge paths uevents
> +   * whose wwids exsit and are same
> +   * and actions are same,
> +   * and actions are addition or deletion
> +   */
> +  if (earlier->wwid && later->wwid &&
> +  !strcmp(earlier->wwid, later->wwid) &&
> +  !strcmp(earlier->action, later->action) &&
> +  strncmp(earlier->action, "change", 6) &&
> +  strncmp(earlier->kernel, "dm-", 3)) {
> +  return true;
> +  }
> +
> +  return false;
> +}
> +
> +void
> +uevent_merge(struct uevent *later, struct list_head *tmpq)
> +{
> +  struct uevent *earlier, *temp;
> +  /*
> +   * compare the uevent with earlier uevents
> +   */
> +  list_for_some_entry_reverse(earlier, &later->node, tmpq, 
node) {
> +next_earlier_node:
> +  if (merge_need_stop(earlier, later))
> +  break;
> +   

Re: [dm-devel] [PATCH 10/12] libmultipath: filter uevents before proccessing

2017-01-03 Thread tang . junhui
Hello Ben,

I add wwid judgment for safe.
I think twice, and as you say, it is safe enough without
this judgment, I will delete these wwid judgment.

Tanks
Tang Junhui



发件人: "Benjamin Marzinski" 
收件人: tang.jun...@zte.com.cn, 
抄送:   tang.wenj...@zte.com.cn, zhang.ka...@zte.com.cn, 
dm-devel@redhat.com, bart.vanass...@sandisk.com, mwi...@suse.com
日期:   2017/01/04 09:28
主题:   Re: [dm-devel] [PATCH 10/12] libmultipath: filter uevents before 
proccessing
发件人: dm-devel-boun...@redhat.com



On Tue, Dec 27, 2016 at 04:03:27PM +0800, tang.jun...@zte.com.cn wrote:
> From: tang.junhui 
> 
> Before merging uevents, these uevents are going to be filtered:
> Change or addition uevent of a removed path (it indicate by an
> deletion uevent occurred later).
> 

I think it's safe to remove add and change uevents if they are followed
by a remove event, regardless of whether or not they have a wwid, as
long as the kernel name is the same.  We only get the remove event when
the device is gone. Processing the add and change events will never get
us anything in these cases, because there is no device to act on.

-Ben

> Change-Id: If00c2c2e23ea466c1d4643c541ed2d8f9a0c8dea
> Signed-off-by: tang.junhui 
> ---
>  libmultipath/uevent.c | 55 
+++
>  1 file changed, 55 insertions(+)
> 
> diff --git a/libmultipath/uevent.c b/libmultipath/uevent.c
> index 114068c..424f272 100644
> --- a/libmultipath/uevent.c
> +++ b/libmultipath/uevent.c
> @@ -140,6 +140,28 @@ uevent_can_discard(char *devpath, char *kernel)
>  }
> 
>  bool
> +uevent_can_filter(struct uevent *earlier, struct uevent *later)
> +{
> +
> +  /*
> +   * filter earlier uvents if path has removed later, eg:
> +   * "add path3 |chang path3 |add path2 |remove path3"
> +   * can filter as:
> +   * "add path2 |remove path3"
> +   * uevent "add path3" and "chang path3" are filtered out
> +   */
> +  if (earlier->wwid && later->wwid &&
> +  !strcmp(earlier->wwid, later->wwid) &&
> +  strncmp(later->kernel, "dm-", 3) &&
> +  !strcmp(later->action, "remove") &&
> +  !strcmp(earlier->kernel, later->kernel)) 
{
> +  return true;
> +  }
> +
> +  return false;
> +}
> +
> +bool
>  merge_need_stop(struct uevent *earlier, struct uevent *later)
>  {
>/*
> @@ -196,6 +218,38 @@ uevent_can_merge(struct uevent *earlier, struct 
uevent *later)
>  }
> 
>  void
> +uevent_filter(struct uevent *later, struct list_head *tmpq)
> +{
> +  struct uevent *earlier, *temp;
> +  /*
> +   * compare the uevent with earlier uevents
> +   */
> +  list_for_some_entry_reverse(earlier, &later->node, tmpq, 
node) {
> +next_earlier_node:
> +  /*
> +   * filter unnessary earlier uevents by 
the later uevent
> +   */
> +  if (uevent_can_filter(earlier, later)) {
> +  condlog(2, "uevent: 
%s-%s-%s has removed by uevent: %s-%s-%s, filtered",
> + earlier->action, earlier->kernel, earlier->wwid,
> + later->action, later->kernel, later->wwid);
> +
> +  temp = earlier;
> +  earlier = 
list_entry(earlier->node.prev, typeof(struct uevent), node);
> + list_del_init(&temp->node);
> +  if (temp->udev)
> + udev_device_unref(temp->udev);
> +  FREE(temp);
> +
> +  if (earlier == 
list_entry(tmpq, typeof(struct uevent), node))
> +  break;
> +  else
> +  goto 
next_earlier_node;
> +  }
> +  }
> +}
> +
> +void
>  uevent_merge(struct uevent *later, struct list_head *tmpq)
>  {
>struct uevent *earlier, *temp;
> @@ -232,6 +286,7 @@ merge_uevq(struct list_head *tmpq)
>struct uevent *later;
> 
>list_for_each_entry_reverse(later, tmpq, node) {
> +  uevent_filter(later, tmpq);
>uevent_merge(later, tmpq);
>}
>  }
> -- 
> 2.8.1.windows.1
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Re: [dm-devel] [PATCH 11/12] multipathd: proccess merged uevents

2017-01-03 Thread tang . junhui
Hello Ben

Good idea,
I will modify the patch to filter these change uevents.

Thanks
Tang Junhui



发件人: "Benjamin Marzinski" 
收件人: tang.jun...@zte.com.cn, 
抄送:   tang.wenj...@zte.com.cn, zhang.ka...@zte.com.cn, 
dm-devel@redhat.com, bart.vanass...@sandisk.com, mwi...@suse.com
日期:   2017/01/04 09:10
主题:   Re: [dm-devel] [PATCH 11/12] multipathd: proccess merged uevents
发件人: dm-devel-boun...@redhat.com



On Tue, Dec 27, 2016 at 04:03:28PM +0800, tang.jun...@zte.com.cn wrote:
> From: "tang.junhui" 
> 
> After filtering and merging, then uevents are processed in
> uev_trigger(), firstly, each of merged uevents would be processed one
> by one with need_do_map in value of 0. Finally, the node “uev” itself
> would be processed with need_do_map in value of 1, which would reload
> kernel table, and create or remove the dm device.

Wait, doesn't this mean that you would change

"add path1 | change path1 | add path2"

to

"change path1 | add path1, path2"

It doesn't make sense to process a change event before an add event.
Looking at uev_update_path, the three things it currently does all can
be safely ignored if you don't process the add event until after you
receive the change event. They are:

disable path on changed wwid: the multipath device hasn't been created
yet, so it will simply be created with the new wwid. That's fine.

attempt to get the pathinfo again if you failed when adding it: Either
you don't have the necessary udev information in the add event, in which
case that uevent won't get merged in the first place and they will
remain in order, or you do have the information, and there's no point in
processing the change event in that case.

change the Read-Only state of the device: since both events (the change
and the add event) have already happened, when you look at the device
for the add event, you will already get the current read-only state.

So, as far as I can tell, if you have change events in your queue as
well as an add event for the same device, you might as well just filter
out the change events, since processing it immediately after the add
event will never actually change anything, and processing it beforehand
makes no sense.

-Ben

> 
> Change-Id: Ifb4de0ca36180a8af16729c2f70bc250858877bd
> Signed-off-by: tang.junhui 
> ---
>  multipathd/main.c | 28 +++-
>  1 file changed, 15 insertions(+), 13 deletions(-)
> 
> diff --git a/multipathd/main.c b/multipathd/main.c
> index 15a6175..0b18f6c 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -1092,6 +1092,7 @@ uev_trigger (struct uevent * uev, void * 
trigger_data)
>  {
>int r = 0;
>struct vectors * vecs;
> +  struct uevent *merge_uev, *tmp;
> 
>vecs = (struct vectors *)trigger_data;
> 
> @@ -1123,20 +1124,21 @@ uev_trigger (struct uevent * uev, void * 
trigger_data)
>}
> 
>/*
> -   * path add/remove event
> +   * path add/remove/change event, add/remove maybe merged
> */
> -  if (!strncmp(uev->action, "add", 3)) {
> -  r = uev_add_path(uev, vecs, 1);
> -  goto out;
> -  }
> -  if (!strncmp(uev->action, "remove", 6)) {
> -  r = uev_remove_path(uev, vecs, 1);
> -  goto out;
> -  }
> -  if (!strncmp(uev->action, "change", 6)) {
> -  r = uev_update_path(uev, vecs);
> -  goto out;
> -  }
> +  list_for_each_entry_safe(merge_uev, tmp, 
&uev->merge_node, node) {
> +  if (!strncmp(merge_uev->action, "add", 
3))
> +  r += 
uev_add_path(merge_uev, vecs, 0);
> +  if (!strncmp(merge_uev->action, "remove", 
6))
> +  r += 
uev_remove_path(merge_uev, vecs, 0);
> +  }
> +
> +  if (!strncmp(uev->action, "add", 3))
> +  r += uev_add_path(uev, vecs, 1);
> +  if (!strncmp(uev->action, "remove", 6))
> +  r += uev_remove_path(uev, vecs, 1);
> +  if (!strncmp(uev->action, "change", 6))
> +  r += uev_update_path(uev, vecs);
> 
>  out:
>return r;
> -- 
> 2.8.1.windows.1
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Re: [dm-devel] [PATCH 10/12] libmultipath: filter uevents before proccessing

2017-01-03 Thread Benjamin Marzinski
On Tue, Dec 27, 2016 at 04:03:27PM +0800, tang.jun...@zte.com.cn wrote:
> From: tang.junhui 
> 
> Before merging uevents, these uevents are going to be filtered:
> Change or addition uevent of a removed path (it indicate by an
> deletion uevent occurred later).
> 

I think it's safe to remove add and change uevents if they are followed
by a remove event, regardless of whether or not they have a wwid, as
long as the kernel name is the same.  We only get the remove event when
the device is gone. Processing the add and change events will never get
us anything in these cases, because there is no device to act on.

-Ben

> Change-Id: If00c2c2e23ea466c1d4643c541ed2d8f9a0c8dea
> Signed-off-by: tang.junhui 
> ---
>  libmultipath/uevent.c | 55 
> +++
>  1 file changed, 55 insertions(+)
> 
> diff --git a/libmultipath/uevent.c b/libmultipath/uevent.c
> index 114068c..424f272 100644
> --- a/libmultipath/uevent.c
> +++ b/libmultipath/uevent.c
> @@ -140,6 +140,28 @@ uevent_can_discard(char *devpath, char *kernel)
>  }
>  
>  bool
> +uevent_can_filter(struct uevent *earlier, struct uevent *later)
> +{
> +
> + /*
> +  * filter earlier uvents if path has removed later, eg:
> +  * "add path3 |chang path3 |add path2 |remove path3"
> +  * can filter as:
> +  * "add path2 |remove path3"
> +  * uevent "add path3" and "chang path3" are filtered out
> +  */
> + if (earlier->wwid && later->wwid &&
> + !strcmp(earlier->wwid, later->wwid) &&
> + strncmp(later->kernel, "dm-", 3) &&
> + !strcmp(later->action, "remove") &&
> + !strcmp(earlier->kernel, later->kernel)) {
> + return true;
> + }
> +
> + return false;
> +}
> +
> +bool
>  merge_need_stop(struct uevent *earlier, struct uevent *later)
>  {
>   /*
> @@ -196,6 +218,38 @@ uevent_can_merge(struct uevent *earlier, struct uevent 
> *later)
>  }
>  
>  void
> +uevent_filter(struct uevent *later, struct list_head *tmpq)
> +{
> + struct uevent *earlier, *temp;
> + /*
> +  * compare the uevent with earlier uevents
> +  */
> + list_for_some_entry_reverse(earlier, &later->node, tmpq, node) {
> +next_earlier_node:
> + /*
> +  * filter unnessary earlier uevents by the later uevent
> +  */
> + if (uevent_can_filter(earlier, later)) {
> + condlog(2, "uevent: %s-%s-%s has removed by uevent: 
> %s-%s-%s, filtered",
> + earlier->action, earlier->kernel, earlier->wwid,
> + later->action, later->kernel, later->wwid);
> +
> + temp = earlier;
> + earlier = list_entry(earlier->node.prev, typeof(struct 
> uevent), node);
> + list_del_init(&temp->node);
> + if (temp->udev)
> + udev_device_unref(temp->udev);
> + FREE(temp);
> +
> + if (earlier ==  list_entry(tmpq, typeof(struct uevent), 
> node))
> + break;
> + else
> + goto next_earlier_node;
> + }
> + }
> +}
> +
> +void
>  uevent_merge(struct uevent *later, struct list_head *tmpq)
>  {
>   struct uevent *earlier, *temp;
> @@ -232,6 +286,7 @@ merge_uevq(struct list_head *tmpq)
>   struct uevent *later;
>  
>   list_for_each_entry_reverse(later, tmpq, node) {
> + uevent_filter(later, tmpq);
>   uevent_merge(later, tmpq);
>   }
>  }
> -- 
> 2.8.1.windows.1
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [PATCH 11/12] multipathd: proccess merged uevents

2017-01-03 Thread Benjamin Marzinski
On Tue, Dec 27, 2016 at 04:03:28PM +0800, tang.jun...@zte.com.cn wrote:
> From: "tang.junhui" 
> 
> After filtering and merging, then uevents are processed in
> uev_trigger(), firstly, each of merged uevents would be processed one
> by one with need_do_map in value of 0. Finally, the node “uev” itself
> would be processed with need_do_map in value of 1, which would reload
> kernel table, and create or remove the dm device.

Wait, doesn't this mean that you would change

"add path1 | change path1 | add path2"

to

"change path1 | add path1, path2"

It doesn't make sense to process a change event before an add event.
Looking at uev_update_path, the three things it currently does all can
be safely ignored if you don't process the add event until after you
receive the change event. They are:

disable path on changed wwid: the multipath device hasn't been created
yet, so it will simply be created with the new wwid. That's fine.

attempt to get the pathinfo again if you failed when adding it: Either
you don't have the necessary udev information in the add event, in which
case that uevent won't get merged in the first place and they will
remain in order, or you do have the information, and there's no point in
processing the change event in that case.

change the Read-Only state of the device: since both events (the change
and the add event) have already happened, when you look at the device
for the add event, you will already get the current read-only state.

So, as far as I can tell, if you have change events in your queue as
well as an add event for the same device, you might as well just filter
out the change events, since processing it immediately after the add
event will never actually change anything, and processing it beforehand
makes no sense.

-Ben

> 
> Change-Id: Ifb4de0ca36180a8af16729c2f70bc250858877bd
> Signed-off-by: tang.junhui 
> ---
>  multipathd/main.c | 28 +++-
>  1 file changed, 15 insertions(+), 13 deletions(-)
> 
> diff --git a/multipathd/main.c b/multipathd/main.c
> index 15a6175..0b18f6c 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -1092,6 +1092,7 @@ uev_trigger (struct uevent * uev, void * trigger_data)
>  {
>   int r = 0;
>   struct vectors * vecs;
> + struct uevent *merge_uev, *tmp;
>  
>   vecs = (struct vectors *)trigger_data;
>  
> @@ -1123,20 +1124,21 @@ uev_trigger (struct uevent * uev, void * trigger_data)
>   }
>  
>   /*
> -  * path add/remove event
> +  * path add/remove/change event, add/remove maybe merged
>*/
> - if (!strncmp(uev->action, "add", 3)) {
> - r = uev_add_path(uev, vecs, 1);
> - goto out;
> - }
> - if (!strncmp(uev->action, "remove", 6)) {
> - r = uev_remove_path(uev, vecs, 1);
> - goto out;
> - }
> - if (!strncmp(uev->action, "change", 6)) {
> - r = uev_update_path(uev, vecs);
> - goto out;
> - }
> + list_for_each_entry_safe(merge_uev, tmp, &uev->merge_node, node) {
> + if (!strncmp(merge_uev->action, "add", 3))
> + r += uev_add_path(merge_uev, vecs, 0);
> + if (!strncmp(merge_uev->action, "remove", 6))
> + r += uev_remove_path(merge_uev, vecs, 0);
> + }
> +
> + if (!strncmp(uev->action, "add", 3))
> + r += uev_add_path(uev, vecs, 1);
> + if (!strncmp(uev->action, "remove", 6))
> + r += uev_remove_path(uev, vecs, 1);
> + if (!strncmp(uev->action, "change", 6))
> + r += uev_update_path(uev, vecs);
>  
>  out:
>   return r;
> -- 
> 2.8.1.windows.1
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Re: [dm-devel] [PATCH 09/12] multipathd: merge uevents before proccessing

2017-01-03 Thread Benjamin Marzinski
On Tue, Dec 27, 2016 at 04:03:26PM +0800, tang.jun...@zte.com.cn wrote:
> From: tang.junhui 
> 
> These uevents are going to be merged:
> 1) uevents come from paths and
> 2) uevents type is same and
> 3) uevents type is addition or deletion and
> 4) uevents wwid is same.

This is just a nit, and I might be missing something subtle here, but it
seems like instead of adding list_for_some_entry_reverse, and then
breaking the abstraction to manually get previous entries, you could
have just added list_for_some_entry_reverse_safe in your earlier patch,
and hid the work of traversing a list while removing elements behind the
well understood abstraction of a *_safe list traversal method.

-Ben

> 
> Change-Id: I05ee057391c092aa0c5f989b7a4f9cb550bb4d98
> Signed-off-by: tang.junhui 
> ---
>  libmultipath/uevent.c | 125 
> +-
>  1 file changed, 114 insertions(+), 11 deletions(-)
> 
> diff --git a/libmultipath/uevent.c b/libmultipath/uevent.c
> index b0b05e9..114068c 100644
> --- a/libmultipath/uevent.c
> +++ b/libmultipath/uevent.c
> @@ -85,6 +85,20 @@ struct uevent * alloc_uevent (void)
>   return uev;
>  }
>  
> +void
> +uevq_cleanup(struct list_head *tmpq)
> +{
> + struct uevent *uev, *tmp;
> +
> + list_for_each_entry_safe(uev, tmp, tmpq, node) {
> + list_del_init(&uev->node);
> +
> + if (uev->udev)
> + udev_device_unref(uev->udev);
> + FREE(uev);
> + }
> +}
> +
>  bool
>  uevent_can_discard(char *devpath, char *kernel)
>  {
> @@ -125,6 +139,103 @@ uevent_can_discard(char *devpath, char *kernel)
>   return false;
>  }
>  
> +bool
> +merge_need_stop(struct uevent *earlier, struct uevent *later)
> +{
> + /*
> +  * dm uevent do not try to merge with left uevents
> +  */
> + if (!strncmp(later->kernel, "dm-", 3))
> + return true;
> +
> + /*
> +  * we can not make a jugement without wwid,
> +  * so it is sensible to stop merging
> +  */
> + if (!earlier->wwid || !later->wwid)
> + return true;
> + /*
> +  * uevents merging stoped
> +  * when we meet an opposite action uevent from the same LUN to AVOID
> +  * "add path1 |remove path1 |add path2 |remove path2 |add path3"
> +  * to merge as "remove path1, path2" and "add path1, path2, path3"
> +  * OR
> +  * "remove path1 |add path1 |remove path2 |add path2 |remove path3"
> +  * to merge as "add path1, path2" and "remove path1, path2, path3"
> +  * SO
> +  * when we meet a non-change uevent from the same LUN
> +  * with the same wwid and different action
> +  * it would be better to stop merging.
> +  */
> + if (!strcmp(earlier->wwid, later->wwid) &&
> + strcmp(earlier->action, later->action) &&
> + strcmp(earlier->action, "change") &&
> + strcmp(later->action, "change"))
> + return true;
> +
> + return false;
> +}
> +
> +bool
> +uevent_can_merge(struct uevent *earlier, struct uevent *later)
> +{
> + /* merge paths uevents
> +  * whose wwids exsit and are same
> +  * and actions are same,
> +  * and actions are addition or deletion
> +  */
> + if (earlier->wwid && later->wwid &&
> + !strcmp(earlier->wwid, later->wwid) &&
> + !strcmp(earlier->action, later->action) &&
> + strncmp(earlier->action, "change", 6) &&
> + strncmp(earlier->kernel, "dm-", 3)) {
> + return true;
> + }
> +
> + return false;
> +}
> +
> +void
> +uevent_merge(struct uevent *later, struct list_head *tmpq)
> +{
> + struct uevent *earlier, *temp;
> + /*
> +  * compare the uevent with earlier uevents
> +  */
> + list_for_some_entry_reverse(earlier, &later->node, tmpq, node) {
> +next_earlier_node:
> + if (merge_need_stop(earlier, later))
> + break;
> + /*
> +  * try to merge earlier uevents to the later uevent
> +  */
> + if (uevent_can_merge(earlier, later)) {
> + condlog(3, "merged uevent: %s-%s-%s with uevent: 
> %s-%s-%s",
> + earlier->action, earlier->kernel, earlier->wwid,
> + later->action, later->kernel, later->wwid);
> + temp = earlier;
> +
> + earlier = list_entry(earlier->node.prev, typeof(struct 
> uevent), node);
> + list_move(&temp->node, &later->merge_node);
> +
> + if (earlier ==  list_entry(tmpq, typeof(struct uevent), 
> node))
> + break;
> + else
> + goto next_earlier_node;
> + }
> + }
> +}
> +
> +void
> +merge_uevq(struct list_head *tmpq)
> +{
> + struct uevent *later;
> +
> + list_for_each_entry_reverse(later, tmpq, node) {
> + uevent_merge(later, tmpq);
> + }
> +}
> +
> 

Re: [dm-devel] [PATCH 08/12] libmultipath: wait one seconds for more uevents in uevent_listen() in uevents burst situations

2017-01-03 Thread Benjamin Marzinski
On Tue, Dec 27, 2016 at 04:03:25PM +0800, tang.jun...@zte.com.cn wrote:
> From: tang.junhui 
> 
> The more uevents are merged, the higher efficiency program will performs.
> So, do not process uevents after receiving immediately in uevents burst
> situations, but continue wait 1 seconds for more uevents except that too
> much uevents (2048) have already been received or too much time eclipse
> (60 seconds).

The issue I have with doing this in uevent_listen is that we seperated
receiving the uevents from servicing the uevents specificially to make
sure what we received them as fast as possible. The udev monitor code
is all based on a NETLINK socket. If our socket's receive buffer fills
up, there is no flow control. Events just start getting dropped, which
does cut down on processing time, but not in a way we would like.

This issue applies to a lesser extent to you previous two patches. I
don't really see the benefit of making sure that we drop the uevents
as soon as possible.  As long as we don't process them, that should
be good enough, right?

Now, maybe you put a lot of thought into your timeouts, and feel
confident that we will start processing well before the receive buffer
fills up. But if so, some comments on that would be reassuring, because
from the commit message, they seem fairly arbitrary to me.

-Ben

> 
> Change-Id: I763d491540e8114a81d12d603281540a81502742
> Signed-off-by: tang.junhui 
> ---
>  libmultipath/uevent.c | 35 +--
>  1 file changed, 33 insertions(+), 2 deletions(-)
> 
> diff --git a/libmultipath/uevent.c b/libmultipath/uevent.c
> index cc10d65..b0b05e9 100644
> --- a/libmultipath/uevent.c
> +++ b/libmultipath/uevent.c
> @@ -39,6 +39,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  
> @@ -490,11 +491,37 @@ struct uevent *uevent_from_udev_device(struct 
> udev_device *dev)
>   return uev;
>  }
>  
> +bool uevent_burst(struct timeval *start_time, int events)
> +{
> + struct timeval diff_time, end_time;
> + unsigned long speed;
> + unsigned long eclipse_ms;
> +
> + gettimeofday(&end_time, NULL);
> + timersub(&end_time, start_time, &diff_time);
> +
> + eclipse_ms = diff_time.tv_sec * 1000 + diff_time.tv_usec / 1000;
> + if (eclipse_ms == 0)
> + return true;
> + /* max wait 60s */
> + if (eclipse_ms > 60*1000) {
> + condlog(1, "burst continued =%lu ms, stoped", eclipse_ms);
> + return false;
> + }
> +
> + speed = (events * 1000) / eclipse_ms;
> + if (speed > 10)
> + return true;
> +
> + return false;
> +}
> +
>  int uevent_listen(struct udev *udev)
>  {
>   int err = 2;
>   struct udev_monitor *monitor = NULL;
>   int fd, socket_flags, events;
> + struct timeval start_time;
>   int need_failback = 1;
>   int timeout = 30;
>   LIST_HEAD(uevlisten_tmp);
> @@ -548,6 +575,7 @@ int uevent_listen(struct udev *udev)
>   }
>  
>   events = 0;
> + gettimeofday(&start_time, NULL);
>   while (1) {
>   struct uevent *uev;
>   struct udev_device *dev;
> @@ -562,7 +590,8 @@ int uevent_listen(struct udev *udev)
>   errno = 0;
>   fdcount = poll(&ev_poll, 1, poll_timeout);
>   if (fdcount && ev_poll.revents & POLLIN) {
> - timeout = 0;
> + timeout = uevent_burst(&start_time, events + 1) ? 1:0;
> +
>   dev = udev_monitor_receive_device(monitor);
>   if (!dev) {
>   condlog(0, "failed getting udev device");
> @@ -578,7 +607,8 @@ int uevent_listen(struct udev *udev)
>   }
>   list_add_tail(&uev->node, &uevlisten_tmp);
>   events++;
> - continue;
> + if(events < 2048)
> + continue;
>   }
>   if (fdcount < 0) {
>   if (errno == EINTR)
> @@ -600,6 +630,7 @@ int uevent_listen(struct udev *udev)
>   pthread_mutex_unlock(uevq_lockp);
>   events = 0;
>   }
> + gettimeofday(&start_time, NULL);
>   timeout = 30;
>   }
>   need_failback = 0;
> -- 
> 2.8.1.windows.1
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [PATCH 01/12] libmultipath: add wwid for "struct uevent" to record wwid of uevent

2017-01-03 Thread Benjamin Marzinski
On Tue, Dec 27, 2016 at 04:03:18PM +0800, tang.jun...@zte.com.cn wrote:
> From: tang.junhui 
> 
> Add "char *wwid" to point WWID of uevent. This member identifies
> the LUN ID which the path belongs to, and it is used for merging
> uevents. WWID possibly did not exist in uevent yet, so ->wwid
> would be NULL, those uevents would not be merged, but be proccessed
> as old way.

Right now, multipath users are allowed configure devices to set the wwid
based on any udev environment variable (or even use a callout, although
this is deprecated). With this patch, that breaks. If the udev sets
ID_SERIAL for a device, that is its wwid, right?  Do you know if rbd
devices have ID_SERIAL set? If so, this change will break them.  Even if
this change doesn't break any devices in their default configurations,
we would need to disallow changing how the wwid is set for this patch
to be safe.

-Ben 

> 
> Change-Id: Ie6b076363b3735dc7de10184b27fa799b499af0e
> Signed-off-by: tang.junhui 
> ---
>  libmultipath/uevent.c | 2 ++
>  libmultipath/uevent.h | 1 +
>  2 files changed, 3 insertions(+)
> 
> diff --git a/libmultipath/uevent.c b/libmultipath/uevent.c
> index 7edcce1..ef1bafe 100644
> --- a/libmultipath/uevent.c
> +++ b/libmultipath/uevent.c
> @@ -424,6 +424,8 @@ struct uevent *uevent_from_udev_device(struct udev_device 
> *dev)
>   uev->devpath = uev->envp[i] + 8;
>   if (strcmp(name, "ACTION") == 0)
>   uev->action = uev->envp[i] + 7;
> + if (strcmp(name, "ID_SERIAL") == 0)
> + uev->wwid = uev->envp[i] + 10;
>   i++;
>   if (i == HOTPLUG_NUM_ENVP - 1)
>   break;
> diff --git a/libmultipath/uevent.h b/libmultipath/uevent.h
> index 9d22dcd..7bfccef 100644
> --- a/libmultipath/uevent.h
> +++ b/libmultipath/uevent.h
> @@ -22,6 +22,7 @@ struct uevent {
>   char *devpath;
>   char *action;
>   char *kernel;
> + char *wwid;
>   unsigned long seqnum;
>   char *envp[HOTPLUG_NUM_ENVP];
>  };
> -- 
> 2.8.1.windows.1
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [PATCH] multipathd: fix SIGUSR2 handling

2017-01-03 Thread Benjamin Marzinski
On Thu, Dec 29, 2016 at 03:56:14PM +0800, tang.jun...@zte.com.cn wrote:
> From: "tang.junhui" 
> 
> SIGUSR2 is not blocked by default, it would cause bellow conflict
> in removing of the last path:

ACK

-Ben

> -
> uevent processing thread | waiter thread
> -|---
> uev_remove_path()|   waiteventloop()
>   lock(&vecs->lock)  |
>   ev_remove_path()   | 
> dm_queue_if_no_path()|> dm_task_run() 
> flush_map()  |  |
>   remove_map_and_stop_waiter()   |  -> 
> pthread_cleanup_push(cleanup_lock, &waiter->vecs->lock);
> _remove_map()| 
> lock(&waiter->vecs->lock)//wait for the locker 
>   stop_waiter_thread()   | 
> pthread_cancel(thread)   |  
> pthread_kill(thread,SIGUSR2)-|--> sigusr2 (int sig)
>  |  condlog()
>  |  fprintf() //it has test 
> cancel point
>  |  cleanup_lock() //thread 
> cancel, and pop, which unlock the 
>  |   locker 
> of uevent processing thread 
> --
> Since SIGUSR2 is only needed when waiter thread running in dm_task_run(),
> and it has already been unblocked before dm_task_run(), so this patch block 
> SIGUSR2 for other times.
> 
> Change-Id: I8c46292dc954415764ebbe054498b4f7ea53c1c6
> Signed-off-by: tang.junhui 
> ---
>  multipathd/main.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/multipathd/main.c b/multipathd/main.c
> index adc3258..fe69abd 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -2181,6 +2181,12 @@ sigusr2 (int sig)
>  static void
>  signal_init(void)
>  {
> + sigset_t set;
> +
> + sigemptyset(&set);
> + sigaddset(&set, SIGUSR2);
> + pthread_sigmask(SIG_SETMASK, &set, NULL);
> +
>   signal_set(SIGHUP, sighup);
>   signal_set(SIGUSR1, sigusr1);
>   signal_set(SIGUSR2, sigusr2);
> -- 
> 2.8.1.windows.1
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] deterministic io throughput in multipath

2017-01-03 Thread Benjamin Marzinski
On Mon, Dec 26, 2016 at 09:42:48AM +, Muneendra Kumar M wrote:
> Hi Ben,
> 
> If there are two paths on a dm-1 say sda and sdb as below.
> 
> #  multipath -ll
>   mpathd (3600110d001ee7f0102050001cc0b6751) dm-1 SANBlaze,VLUN MyLun
>   size=8.0M features='0' hwhandler='0' wp=rw
>   `-+- policy='round-robin 0' prio=50 status=active
> |- 8:0:1:0  sda 8:48 active ready  running
> `- 9:0:1:0  sdb 8:64 active ready  running  
> 
> And on sda if iam seeing lot of errors due to which the sda path is 
> fluctuating from failed state to active state and vicevera.
> 
> My requirement is something like this if sda is failed for more then 5 times 
> in a hour duration ,then I want to keep the sda in failed state for few hours 
> (3hrs)
> 
> And the data should travel only thorugh sdb path.
> Will this be possible with the below parameters.

No. delay_watch_checks sets how may path checks you watch a path that
has recently come back from the failed state. If the path fails again
within this time, multipath device delays it.  This means that the delay
is always trigger by two failures within the time limit.  It's possible
to adapt this to count numbers of failures, and act after a certain
number within a certain timeframe, but it would take a bit more work.

delay_wait_checks doesn't guarantee that it will delay for any set
length of time.  Instead, it sets the number of consecutive successful
path checks that must occur before the path is usable again. You could
set this for 3 hours of path checks, but if a check failed during this
time, you would restart the 3 hours over again.

-Ben

> Can you just let me know what values I should add for delay_watch_checks and 
> delay_wait_checks.
> 
> Regards,
> Muneendra.
> 
> 
> 
> -Original Message-
> From: Muneendra Kumar M 
> Sent: Thursday, December 22, 2016 11:10 AM
> To: 'Benjamin Marzinski' 
> Cc: dm-devel@redhat.com
> Subject: RE: [dm-devel] deterministic io throughput in multipath
> 
> Hi Ben,
> 
> Thanks for the reply.
> I will look into this parameters will do the internal testing and let you 
> know the results.
> 
> Regards,
> Muneendra.
> 
> -Original Message-
> From: Benjamin Marzinski [mailto:bmarz...@redhat.com] 
> Sent: Wednesday, December 21, 2016 9:40 PM
> To: Muneendra Kumar M 
> Cc: dm-devel@redhat.com
> Subject: Re: [dm-devel] deterministic io throughput in multipath
> 
> Have you looked into the delay_watch_checks and delay_wait_checks 
> configuration parameters?  The idea behind them is to minimize the use of 
> paths that are intermittently failing.
> 
> -Ben
> 
> On Mon, Dec 19, 2016 at 11:50:36AM +, Muneendra Kumar M wrote:
> >Customers using Linux host (mostly RHEL host) using a SAN network for
> >block storage, complain the Linux multipath stack is not resilient to
> >handle non-deterministic storage network behaviors. This has caused many
> >customer move away to non-linux based servers. The intent of the below
> >patch and the prevailing issues are given below. With the below design we
> >are seeing the Linux multipath stack becoming resilient to such network
> >issues. We hope by getting this patch accepted will help in more Linux
> >server adoption that use SAN network.
> > 
> >I have already sent the design details to the community in a different
> >mail chain and the details are available in the below link.
> > 
> >
> > [1]https://urldefense.proofpoint.com/v2/url?u=https-3A__www.redhat.com_archives_dm-2Ddevel_2016-2DDecember_msg00122.html&d=DgIDAw&c=IL_XqQWOjubgfqINi2jTzg&r=E3ftc47B6BGtZ4fVaYvkuv19wKvC_Mc6nhXaA1sBIP0&m=vfwpVp6e1KXtRA0ctwHYJ7cDmPsLi2C1L9pox7uexsY&s=q5OI-lfefNC2CHKmyUkokgiyiPo_Uj7MRu52hG3MKzM&e=
> >  .
> > 
> >Can you please go through the design and send the comments to us.
> > 
> > 
> > 
> >Regards,
> > 
> >Muneendra.
> > 
> > 
> > 
> > 
> > 
> > References
> > 
> >Visible links
> >1. 
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__www.redhat.com_ar
> > chives_dm-2Ddevel_2016-2DDecember_msg00122.html&d=DgIDAw&c=IL_XqQWOjub
> > gfqINi2jTzg&r=E3ftc47B6BGtZ4fVaYvkuv19wKvC_Mc6nhXaA1sBIP0&m=vfwpVp6e1K
> > XtRA0ctwHYJ7cDmPsLi2C1L9pox7uexsY&s=q5OI-lfefNC2CHKmyUkokgiyiPo_Uj7MRu
> > 52hG3MKzM&e=
> 
> > --
> > dm-devel mailing list
> > dm-devel@redhat.com
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__www.redhat.com_ma
> > ilman_listinfo_dm-2Ddevel&d=DgIDAw&c=IL_XqQWOjubgfqINi2jTzg&r=E3ftc47B6BGtZ4fVaYvkuv19wKvC_Mc6nhXaA1sBIP0&m=vfwpVp6e1KXtRA0ctwHYJ7cDmPsLi2C1L9pox7uexsY&s=UyE46dXOrNTbPz_TVGtpoHl3J3h_n0uYhI4TI-PgyWg&e=

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [PATCH v1 12/54] dm: limit the max bio size as BIO_MAX_PAGES * PAGE_SIZE

2017-01-03 Thread Mike Snitzer
On Tue, Dec 27 2016 at 10:56am -0500,
Ming Lei  wrote:

> For BIO based DM, some targets aren't ready for dealing with
> bigger incoming bio than 1Mbyte, such as crypt target.
> 
> Signed-off-by: Ming Lei 
> ---
>  drivers/md/dm.c | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 3086da5664f3..6139bf7623f7 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -899,7 +899,16 @@ int dm_set_target_max_io_len(struct dm_target *ti, 
> sector_t len)
>   return -EINVAL;
>   }
>  
> - ti->max_io_len = (uint32_t) len;
> + /*
> +  * BIO based queue uses its own splitting. When multipage bvecs
> +  * is switched on, size of the incoming bio may be too big to
> +  * be handled in some targets, such as crypt.
> +  *
> +  * When these targets are ready for the big bio, we can remove
> +  * the limit.
> +  */
> + ti->max_io_len = min_t(uint32_t, len,
> +(BIO_MAX_PAGES * PAGE_SIZE));
>  
>   return 0;
>  }
> -- 
> 2.7.4

dm_set_target_max_io_len() is already meant to be called by the .ctr
hook for each DM target.  So why not just have the dm-crypt target (and
other targets if needed) pass your reduced $len?

That way only targets that need to be fixed (e.g. dm-crypt) impose this
limit.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [RFC PATCH v2] crypto: Add IV generation algorithms

2017-01-03 Thread Gilad Ben-Yossef
Hi Binoy,

On Tue, Dec 13, 2016 at 02:19:09PM +0530, Binoy Jayan wrote:
> Currently, the iv generation algorithms are implemented in dm-crypt.c.
> The goal is to move these algorithms from the dm layer to the kernel
> crypto layer by implementing them as template ciphers so they can be
> implemented in hardware for performance. As part of this patchset, the
> iv-generation code is moved from the dm layer to the crypto layer and
> adapt the dm-layer to send a whole 'bio' (as defined in the block layer)
> at a time. Each bio contains the in memory representation of physically
> contiguous disk blocks. The dm layer sets up a chained scatterlist of
> these blocks split into physically contiguous segments in memory so that
> DMA can be performed. The iv generation algorithms implemented in geniv.c
> include plain, plain64, essiv, benbi, null, lmk and tcw.
>

Good idea. I wanted to test the patch but alas it does not apply cleanly.
You seem to have a blank line at the end of files and other small
transgressions that makes checkpatch grumpy.



Also...

>
> Not-signed-off-by: Binoy Jayan 


What is Not-signed-off-by ? :-)

Thanks,
Gilad Ben-Yossef

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel