Re: [dm-devel] [PATCH v2 02/13] dm-mpath: Avoid that path removal can trigger an infinite loop
On 04/27/2017 07:11 PM, Bart Van Assche wrote: > If blk_get_request() fails check whether the failure is due to > a path being removed. If that is the case fail the path by > triggering a call to fail_path(). This patch avoids that the > following scenario can be encountered while removing paths: > * CPU usage of a kworker thread jumps to 100%. > * Removing the dm device becomes impossible. > > Delay requeueing if blk_get_request() returns -EBUSY or > -EWOULDBLOCK because in these cases immediate requeuing is > inappropriate. > > Signed-off-by: Bart Van Assche > Cc: Hannes Reinecke > Cc: Christoph Hellwig > Cc: > --- > drivers/md/dm-mpath.c | 17 - > 1 file changed, 12 insertions(+), 5 deletions(-) > Reviewed-by: Hannes Reinecke Cheers, Hannes -- Dr. Hannes ReineckeTeamlead Storage & Networking h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH v2 01/13] dm-mpath: Split activate_path()
On 04/27/2017 07:11 PM, Bart Van Assche wrote: > This patch does not change any functionality but makes the next > patch in this series easier to read. > > Signed-off-by: Bart Van Assche > Cc: Hannes Reinecke > Cc: Christoph Hellwig > Cc: > --- > drivers/md/dm-mpath.c | 17 - > 1 file changed, 12 insertions(+), 5 deletions(-) > Reviewed-by: Hannes Reinecke Cheers, Hannes -- Dr. Hannes Reinecke, Teamlead Storage & Networking h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg)
[dm-devel] [mdadm PATCH] Create: tell udev md device is not ready when first created.
When an array is created the content is not initialized, so it could have remnants of an old filesystem or md array etc on it. udev will see this and might try to activate it, which is almost certainly not what is wanted. So create a mechanism for mdadm to communicate with udev to tell it that the device isn't ready. This mechanism is the existence of a file /run/mdadm/created-mdXXX where mdXXX is the md device name. When creating an array, mdadm will create the file. A new udev rule file, 01-md-raid-creating.rules, will detect the presence of that file and set ENV{SYSTEMD_READY}="0". This is fairly uniformly used to suppress actions based on the contents of the device. Signed-off-by: NeilBrown --- Assemble.c | 2 +- Build.c | 2 +- Create.c| 9 +++- Incremental.c | 4 ++-- Makefile| 4 ++-- lib.c | 29 + mdadm.h | 4 +++- mdopen.c| 52 - udev-md-raid-creating.rules | 7 ++ 9 files changed, 86 insertions(+), 27 deletions(-) create mode 100644 udev-md-raid-creating.rules diff --git a/Assemble.c b/Assemble.c index d6beb23da9c5..a9442c8ce73b 100644 --- a/Assemble.c +++ b/Assemble.c @@ -1478,7 +1478,7 @@ try_again: name = strchr(name, ':')+1; mdfd = create_mddev(mddev, name, ident->autof, trustworthy, - chosen_name); + chosen_name, 0); } if (mdfd < 0) { st->ss->free_super(st); diff --git a/Build.c b/Build.c index 11ba12f4ae7d..665d9067b8d6 100644 --- a/Build.c +++ b/Build.c @@ -109,7 +109,7 @@ int Build(char *mddev, struct mddev_dev *devlist, /* We need to create the device. It can have no name. 
*/ map_lock(&map); mdfd = create_mddev(mddev, NULL, c->autof, LOCAL, - chosen_name); + chosen_name, 0); if (mdfd < 0) { map_unlock(&map); return 1; diff --git a/Create.c b/Create.c index 6ca092449880..df1bc20c635b 100644 --- a/Create.c +++ b/Create.c @@ -605,7 +605,7 @@ int Create(struct supertype *st, char *mddev, /* We need to create the device */ map_lock(&map); - mdfd = create_mddev(mddev, name, c->autof, LOCAL, chosen_name); + mdfd = create_mddev(mddev, name, c->autof, LOCAL, chosen_name, 1); if (mdfd < 0) { map_unlock(&map); return 1; @@ -620,6 +620,7 @@ int Create(struct supertype *st, char *mddev, chosen_name); close(mdfd); map_unlock(&map); + udev_unblock(); return 1; } mddev = chosen_name; @@ -1053,9 +1054,15 @@ int Create(struct supertype *st, char *mddev, pr_err("not starting array - not enough devices.\n"); } close(mdfd); + /* Give udev a moment to process the Change event caused +* by the close. +*/ + usleep(100*1000); + udev_unblock(); return 0; abort: + udev_unblock(); map_lock(&map); abort_locked: map_remove(&map, fd2devnm(mdfd)); diff --git a/Incremental.c b/Incremental.c index 28f1f7734956..63ed4fa1a88d 100644 --- a/Incremental.c +++ b/Incremental.c @@ -321,7 +321,7 @@ int Incremental(struct mddev_dev *devlist, struct context *c, /* Couldn't find an existing array, maybe make a new one */ mdfd = create_mddev(match ? 
match->devname : NULL, - name_to_use, c->autof, trustworthy, chosen_name); + name_to_use, c->autof, trustworthy, chosen_name, 0); if (mdfd < 0) goto out_unlock; @@ -1605,7 +1605,7 @@ static int Incremental_container(struct supertype *st, char *devname, ra->name, c->autof, trustworthy, - chosen_name); + chosen_name, 0); } if (only && (!mp || strcmp(mp->devnm, only) != 0)) continue; diff --git a/Makefile b/Makefile index 685069612617..021d3adf3ed1 100644 --- a/Makefile +++ b/Makefile @@ -256,8 +256,8 @@ install-man: mdadm.8 md.4 mdadm.conf.5 mdmon.8 $(INSTALL) -D -m 644 md.4 $(DESTDIR)$(MAN4DIR)/md.4 $(INSTALL) -D -m 644 mdadm.conf.5 $(DESTDIR)$(MAN5DIR)/mdadm.conf.5 -install-udev: udev-md-raid-arrays.rules udev-md-raid-assembly.rules - @for file in 63-md-raid-arrays.rules 64-md-raid-assembly.rules ; \ +install-udev: udev-md-raid-arrays.rules udev-md-raid-assembly.rules udev-md-raid-creating
Re: [dm-devel] [mdadm PATCH 4/4] Create: tell udev device is not ready when first created.
On Wed, Apr 26 2017, Peter Rajnoha wrote: > On 04/20/2017 11:35 PM, NeilBrown wrote: >> On Thu, Apr 20 2017, Jes Sorensen wrote: > ... >>> Second, isn't this going to be racey if you have multiple arrays >>> running? I am wondering if we cannot find a solution that relies on a >>> permanently installed udev rule that we enable/disable with systemctl >>> and use the device name as an argument? >> >> There would only be a problematic race if an array is being created at >> the same time that another array is being assembled. Is that at all >> likely? They tend to happen at different parts of the usage cycle... I >> guess we shouldn't assume though. >> >> I admit that blocking *all* arrays was a short cut. We need to create >> the udev rule before the new device is first activated, and keep it in >> existence until after the array has been started and the fd on /dev/mdXX >> has been closed - and then a bit more because udev doesn't respond >> instantly. >> In some cases we don't know the name of the md device until >> create_mddev() chooses one with find_free_devnum(). >> So if we want to block udev from handling a device, we need to put the >> block into effect (e.g. create the rules.d file) in create_mddev() just >> before the call to open_dev_excl(). >> >> If we wanted a more permanent udev rule, we would need to record the >> devices that should be ignored in the filesystem somewhere else. >> Maybe in /run/mdadm. >> e.g. >> >> KERNEL=="md*", TEST="/run/mdadm/creating-$kernel", ENV{SYSTEMD_READY}="0" >> >> Then we could have something like the following (untested) in mdadm. >> Does that seem more suitable? >> > > Yes, please, if possible, go for a permanent udev rule surely - this > will make it much easier for other foreign tools to hook in properly if > needed and it would also be much easier to debug. I'm leaning towards the files-in-/run/mdadm approach too. I'll make a proper patch. 
> > But, wouldn't it be better if we could just pass this information ("not > initialized yet") as RUN_ARRAY md ioctl parameter? In that case the md > driver in kernel could add the variable to the uevent it generates which > userspace udev rules could check for easily. This way, we don't need to > hassle with creating files in filesystem and the information would be > directly a part of the uevent the md kernel driver generates (so > directly accessible in udev rules too). Also, possibly adding more > variables for other future scenarios if needed. When would we clear the "not initialised yet" flag in the kernel, and how? And would that be enough? When mdadm creates an md array, at least 3 uevents get generated. The first is generated when the md device object is created, either by opening the /dev/mdXX file, or by writing magic to somewhere in sysfs. The second is generated when the array is started and the content is visible. The third is generated when mdadm closes the file descriptor. It opens /dev/mdXX with O_RDWR and performs ioctls on this, and then closes it. Because udev uses inotify to watch for a 'close of a writable file descriptor', this generates another event. We need to ensure that none of these cause udev to run anything that inspects the contents of the array. Of the three, only the second one is directly under the control of the md module, so only that one can add an environment variable. It might be possible to avoid the last one (by not opening for write), but I cannot see a way to avoid the first one. I don't think that making a file appear in /run is really very different from making a variable appear in a uevent. If the variable were describing the event itself that would be different, but it would be describing the state of the device. So the only important difference is "which is easier to work with". I think creating and deleting a file is easier than setting and clearing a variable. 
Thanks, NeilBrown > > We use something similar in device-mapper already where we pass flags > (besides other things) as a device-mapper ioctl parameter which then > gets into the uevent as DM_COOKIE uevent variable (which in turn the > udev rules can decode into individual flags). This method proved to be > working well for us to solve the problem of uninitialized data area so > we don't get into trouble with scans from udev (see also dm_ioctl > structure in usr/include/linux/dm-ioctl.h and the dm_kobject_uevent > function in drivers/md/dm.c for the kernel part; in udev rules the flags > are then various DM_UDEV_* variables we check for) > > -- > Peter
Re: [dm-devel] [BUG] multipath-tools: missing internal values (multipath -t)
On Thu, Apr 27, 2017 at 09:10:09AM +0200, Christophe Varoqui wrote: >Hi, >The config output should be usable, so a comment could be printed for >the keywords you suggest printing as the default for. >Example: ># no_path_retry >or simply ># no_path_retry Makes sense. Thanks -Ben >On Tue, Apr 25, 2017 at 8:42 PM, Benjamin Marzinski ><bmarz...@redhat.com> wrote: > > On Tue, Apr 25, 2017 at 04:26:06PM +0200, Xose Vazquez Perez wrote: > > Hi, > > > > These values: > > What do you think of these? > > no_path_retry > > The default if no_path_retry isn't set is to use the features line to > determine what to do when there are no paths. we could have it > print something like > > no_path_retry > > > dev_loss_tmo > > The default here is to keep whatever the sysfs value currently is. We > could print > > dev_loss_tmo > > or > > dev_loss_tmo > > or something else. I'm open to ideas. > > > reservation_key > > The default here is nothing > > reservation_key > > probably makes the most sense > > > partition_delimiter > > The default here is an empty string > > partition_delimiter "" > > is probably the right answer. > > > delay_watch_checks > > delay_wait_checks > > These clearly have well defined default values and should be set. > > -Ben > > are missing from the "defaults section" of multipath -t output.
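Putting the two suggestions together, the "defaults section" of multipath -t could print unset keywords as comments along these lines. This is purely an illustrative sketch of the proposed convention, not actual multipath -t output, and the parenthetical default descriptions paraphrase the thread above:

```
defaults {
	# no_path_retry      (default: behavior taken from the features line)
	# dev_loss_tmo       (default: keep the current sysfs value)
	# reservation_key    (default: none)
	partition_delimiter ""
}
```

The point of the comment form is that the emitted config stays loadable as-is while still documenting every keyword the daemon knows about.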
Re: [dm-devel] [PATCH v2 11/13] dm-mpath: Micro-optimize the hot path
On Thu, Apr 27 2017 at 1:11P -0400, Bart Van Assche wrote: > Instead of checking MPATHF_QUEUE_IF_NO_PATH, > MPATHF_SAVED_QUEUE_IF_NO_PATH and the no_flush flag to decide whether > or not to push back a request if there are no paths available, only > clear MPATHF_QUEUE_IF_NO_PATH in queue_if_no_path() if no_flush has > not been set. The result is that only a single bit has to be tested > in the hot path to decide whether or not a request must be pushed > back and also that m->lock does not have to be taken in the hot path. > > Signed-off-by: Bart Van Assche > Reviewed-by: Hannes Reinecke This patch was very demanding to review... all that old must_push_back() code was very tedious. Happy to see it go. BUT I'm left nervous that something might regress. Not because of something overlooked (though that could happen)... just that it's a particularly delicate portion of the mpath target's suspend/resume handling. Kudos to you for tackling this. Staged in dm-4.12 for 4.12 inclusion (will push it out once I go back over what I've staged)
Re: [dm-devel] [PATCH v2 05/13] dm-mpath: Make it easier to analyze requeuing behavior
On Thu, 2017-04-27 at 15:29 -0400, Mike Snitzer wrote: > Documentation/process/coding-style.rst says: > "Coming up with good debugging messages can be quite a challenge; and once > you have them, they can be a huge help for remote troubleshooting. However > debug message printing is handled differently than printing other non-debug > messages. While the other pr_XXX() functions print unconditionally, > pr_debug() does not; it is compiled out by default, unless either DEBUG is > defined or CONFIG_DYNAMIC_DEBUG is set." > > So I assume you're leveraging DYNAMIC_DEBUG. > > Anyway, I'm not liking the idea of making this debugging part of the > mpath code. But if there is a convincing argument for it please > elaborate. > > Are you finding that things are going wrong on production systems and > enabling pr_debug() in these paths would have, or has, saved you? Hello Mike, If the dm-mpath driver is busy with requeuing requests in a loop today there is no way to figure out why that continuous requeuing happens. The pr_debug() statements introduced by this patch provide an easy way to figure out on development systems why the requeuing happens. Please note that there is no guarantee on production systems that CONFIG_DYNAMIC_DEBUG=y. Bart.
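For reference, on a development kernel built with CONFIG_DYNAMIC_DEBUG=y the pr_debug() call sites this series adds could be toggled at runtime through the dynamic debug control file, roughly as follows (run as root, with debugfs mounted at /sys/kernel/debug):

```
echo 'file dm-mpath.c +p' > /sys/kernel/debug/dynamic_debug/control
# ... reproduce the requeuing loop, inspect dmesg ...
echo 'file dm-mpath.c -p' > /sys/kernel/debug/dynamic_debug/control
```

Without CONFIG_DYNAMIC_DEBUG, pr_debug() compiles to nothing unless DEBUG is defined for the file, which is exactly the caveat Bart raises about production systems.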
Re: [dm-devel] [PATCH v2 04/13] dm-rq: Adjust requeuing delays
On Thu, 2017-04-27 at 15:16 -0400, Mike Snitzer wrote: > This call to blk_mq_delay_run_hw_queue(), while unconvincing and suspect, > is being introduced via block core during the 4.12 merge. > > But in general, this tweaking of the timeouts in such a short period > speaks to indecision and leaves me unconvinced of what the _best_ values > to use are. > > Let's revisit this after the merge window closes, we can tweak the 100ms > up to 500ms in both locations if you _really_ prefer that. Hello Mike, The 100ms was a copy/paste from the SCSI initiator code. Later I realized that that is too fast in the dm core, e.g. when the underlying paths are busy due to SCSI error handling. Bart.
Re: [dm-devel] [PATCH v2 05/13] dm-mpath: Make it easier to analyze requeuing behavior
On Thu, Apr 27 2017 at 1:11P -0400, Bart Van Assche wrote: > When debugging the dm-mpath driver it is important to know what > decisions have been taken with regard to requeuing. Hence this > patch that adds pr_debug() statements that report what decisions > have been taken. > > Signed-off-by: Bart Van Assche > Reviewed-by: Hannes Reinecke > Cc: Christoph Hellwig I've not used pr_debug() before... I generally sprinkle debugging when I'm actively chasing an issue and then throw it away once I sort the problem out. Documentation/process/coding-style.rst says: "Coming up with good debugging messages can be quite a challenge; and once you have them, they can be a huge help for remote troubleshooting. However debug message printing is handled differently than printing other non-debug messages. While the other pr_XXX() functions print unconditionally, pr_debug() does not; it is compiled out by default, unless either DEBUG is defined or CONFIG_DYNAMIC_DEBUG is set." So I assume you're leveraging DYNAMIC_DEBUG. Anyway, I'm not liking the idea of making this debugging part of the mpath code. But if there is a convincing argument for it please elaborate. Are you finding that things are going wrong on production systems and enabling pr_debug() in these paths would have, or has, saved you?
Re: [dm-devel] [PATCH v2 04/13] dm-rq: Adjust requeuing delays
On Thu, Apr 27 2017 at 1:11P -0400, Bart Van Assche wrote: > Reduce the requeue delay in dm_requeue_original_request() from 5s > to 0.5s to avoid that this delay slows down failover or failback. > Increase the requeue delay in dm_mq_queue_rq() from 0.1s to 0.5s > to reduce the system load if immediate requeuing has been requested > by the dm driver. > > Signed-off-by: Bart Van Assche > Reviewed-by: Hannes Reinecke > Cc: Christoph Hellwig > --- > drivers/md/dm-rq.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c > index 0b081d170087..c53debdcd7dc 100644 > --- a/drivers/md/dm-rq.c > +++ b/drivers/md/dm-rq.c > @@ -280,7 +280,7 @@ static void dm_requeue_original_request(struct > dm_rq_target_io *tio, bool delay_ > if (!rq->q->mq_ops) > dm_old_requeue_request(rq); > else > - dm_mq_delay_requeue_request(rq, delay_requeue ? 5000 : 0); > + dm_mq_delay_requeue_request(rq, delay_requeue ? 500/*ms*/ : 0); > > rq_completed(md, rw, false); > } This one was already changed to 100ms via commit 06eb061f that I already staged for 4.12, see: https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.12&id=06eb061f48594aa369f6e852b352410298b317a8 > @@ -755,7 +755,7 @@ static int dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx, > /* Undo dm_start_request() before requeuing */ > rq_end_stats(md, rq); > rq_completed(md, rq_data_dir(rq), false); > - blk_mq_delay_run_hw_queue(hctx, 100/*ms*/); > + blk_mq_delay_run_hw_queue(hctx, 500/*ms*/); > return BLK_MQ_RQ_QUEUE_BUSY; > } > This call to blk_mq_delay_run_hw_queue(), while unconvincing and suspect, is being introduced via block core during the 4.12 merge. But in general, this tweaking of the timeouts in such a short period speaks to indecision and leaves me unconvinced of what the _best_ values to use are. Let's revisit this after the merge window closes, we can tweak the 100ms up to 500ms in both locations if you _really_ prefer that. 
Mike
[dm-devel] [PATCH v2 13/13] dm, dm-mpath: Make it easier to detect unintended I/O request flushes
I/O errors triggered by multipathd incorrectly not enabling the no-flush flag for DM_DEVICE_SUSPEND or DM_DEVICE_RESUME are hard to debug. Add more logging to make it easier to debug this. Signed-off-by: Bart Van Assche Reviewed-by: Hannes Reinecke --- drivers/md/dm-mpath.c | 25 + drivers/md/dm.c | 3 +++ 2 files changed, 24 insertions(+), 4 deletions(-) diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c index 6cab04a0e565..1971952396cf 100644 --- a/drivers/md/dm-mpath.c +++ b/drivers/md/dm-mpath.c @@ -442,6 +442,23 @@ static struct pgpath *choose_pgpath(struct multipath *m, size_t nr_bytes) } /* + * Note: dm_report_eio() is a macro instead of a function to make pr_debug() + * report the function name and line number of the function from which it has + * been invoked. + */ +#define dm_report_eio(m) \ +({ \ + struct mapped_device *md = dm_table_get_md((m)->ti->table); \ + \ + pr_debug("%s: returning EIO; QIFNP = %d; SQIFNP = %d; DNFS = %d\n", \ +dm_device_name(md),\ +test_bit(MPATHF_QUEUE_IF_NO_PATH, &(m)->flags),\ +test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &(m)->flags), \ +dm_noflush_suspending((m)->ti)); \ + -EIO; \ +}) + +/* * Map cloned requests (request-based multipath) */ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq, @@ -466,7 +483,7 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq, pr_debug("no path - requeueing\n"); return DM_MAPIO_DELAY_REQUEUE; } - return -EIO;/* Failed */ + return dm_report_eio(m);/* Failed */ } else if (test_bit(MPATHF_QUEUE_IO, &m->flags) || test_bit(MPATHF_PG_INIT_REQUIRED, &m->flags)) { return pg_init_all_paths(m) == 0 ? 
DM_MAPIO_REQUEUE : @@ -542,7 +559,7 @@ static int __multipath_map_bio(struct multipath *m, struct bio *bio, struct dm_m if (!pgpath) { if (test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags)) return DM_MAPIO_REQUEUE; - return -EIO; + return dm_report_eio(m); } mpio->pgpath = pgpath; @@ -1484,7 +1501,7 @@ static int do_end_io(struct multipath *m, struct request *clone, if (atomic_read(&m->nr_valid_paths) == 0 && !test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags)) - r = -EIO; + r = dm_report_eio(m); return r; } @@ -1527,7 +1544,7 @@ static int do_end_io_bio(struct multipath *m, struct bio *clone, if (atomic_read(&m->nr_valid_paths) == 0 && !test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags)) - return -EIO; + return dm_report_eio(m); /* Queue for the daemon to resubmit */ dm_bio_restore(get_bio_details_from_bio(clone), clone); diff --git a/drivers/md/dm.c b/drivers/md/dm.c index 7161c7804363..9f0aed0e1bf0 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -2158,6 +2158,9 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, */ if (noflush) set_bit(DMF_NOFLUSH_SUSPENDING, &md->flags); + else + pr_debug("%s: suspending without no-flush flag\n", +dm_device_name(md)); /* * This gets reverted if there's an error later and the targets -- 2.12.2 -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH v2 01/13] dm-mpath: Split activate_path()
This patch does not change any functionality but makes the next patch in this series easier to read. Signed-off-by: Bart Van Assche Cc: Hannes Reinecke Cc: Christoph Hellwig Cc: --- drivers/md/dm-mpath.c | 17 - 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c index 7f223dbed49f..eff7ecea1d2a 100644 --- a/drivers/md/dm-mpath.c +++ b/drivers/md/dm-mpath.c @@ -111,7 +111,8 @@ typedef int (*action_fn) (struct pgpath *pgpath); static struct workqueue_struct *kmultipathd, *kmpath_handlerd; static void trigger_event(struct work_struct *work); -static void activate_path(struct work_struct *work); +static void activate_or_offline_path(struct pgpath *pgpath); +static void activate_path_work(struct work_struct *work); static void process_queued_bios(struct work_struct *work); /*--- @@ -136,7 +137,7 @@ static struct pgpath *alloc_pgpath(void) if (pgpath) { pgpath->is_active = true; - INIT_DELAYED_WORK(&pgpath->activate_path, activate_path); + INIT_DELAYED_WORK(&pgpath->activate_path, activate_path_work); } return pgpath; @@ -1437,10 +1438,8 @@ static void pg_init_done(void *data, int errors) spin_unlock_irqrestore(&m->lock, flags); } -static void activate_path(struct work_struct *work) +static void activate_or_offline_path(struct pgpath *pgpath) { - struct pgpath *pgpath = - container_of(work, struct pgpath, activate_path.work); struct request_queue *q = bdev_get_queue(pgpath->path.dev->bdev); if (pgpath->is_active && !blk_queue_dying(q)) @@ -1449,6 +1448,14 @@ static void activate_path(struct work_struct *work) pg_init_done(pgpath, SCSI_DH_DEV_OFFLINED); } +static void activate_path_work(struct work_struct *work) +{ + struct pgpath *pgpath = + container_of(work, struct pgpath, activate_path.work); + + activate_or_offline_path(pgpath); +} + static int noretry_error(int error) { switch (error) { -- 2.12.2 -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH v2 04/13] dm-rq: Adjust requeuing delays
Reduce the requeue delay in dm_requeue_original_request() from 5s to 0.5s to avoid that this delay slows down failover or failback. Increase the requeue delay in dm_mq_queue_rq() from 0.1s to 0.5s to reduce the system load if immediate requeuing has been requested by the dm driver. Signed-off-by: Bart Van Assche Reviewed-by: Hannes Reinecke Cc: Christoph Hellwig --- drivers/md/dm-rq.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c index 0b081d170087..c53debdcd7dc 100644 --- a/drivers/md/dm-rq.c +++ b/drivers/md/dm-rq.c @@ -280,7 +280,7 @@ static void dm_requeue_original_request(struct dm_rq_target_io *tio, bool delay_ if (!rq->q->mq_ops) dm_old_requeue_request(rq); else - dm_mq_delay_requeue_request(rq, delay_requeue ? 5000 : 0); + dm_mq_delay_requeue_request(rq, delay_requeue ? 500/*ms*/ : 0); rq_completed(md, rw, false); } @@ -755,7 +755,7 @@ static int dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx, /* Undo dm_start_request() before requeuing */ rq_end_stats(md, rq); rq_completed(md, rq_data_dir(rq), false); - blk_mq_delay_run_hw_queue(hctx, 100/*ms*/); + blk_mq_delay_run_hw_queue(hctx, 500/*ms*/); return BLK_MQ_RQ_QUEUE_BUSY; } -- 2.12.2
[dm-devel] [PATCH v2 11/13] dm-mpath: Micro-optimize the hot path
Instead of checking MPATHF_QUEUE_IF_NO_PATH, MPATHF_SAVED_QUEUE_IF_NO_PATH and the no_flush flag to decide whether or not to push back a request if there are no paths available, only clear MPATHF_QUEUE_IF_NO_PATH in queue_if_no_path() if no_flush has not been set. The result is that only a single bit has to be tested in the hot path to decide whether or not a request must be pushed back and also that m->lock does not have to be taken in the hot path. Signed-off-by: Bart Van Assche Reviewed-by: Hannes Reinecke --- drivers/md/dm-mpath.c | 70 --- 1 file changed, 11 insertions(+), 59 deletions(-) diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c index 286d74593b2c..5a3200332eb1 100644 --- a/drivers/md/dm-mpath.c +++ b/drivers/md/dm-mpath.c @@ -442,47 +442,6 @@ static struct pgpath *choose_pgpath(struct multipath *m, size_t nr_bytes) } /* - * Check whether bios must be queued in the device-mapper core rather - * than here in the target. - * - * If m->queue_if_no_path and m->saved_queue_if_no_path hold the - * same value then we are not between multipath_presuspend() - * and multipath_resume() calls and we have no need to check - * for the DMF_NOFLUSH_SUSPENDING flag. 
- */ -static bool __must_push_back(struct multipath *m) -{ - return ((test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags) != -test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags)) && - dm_noflush_suspending(m->ti)); -} - -static bool must_push_back_rq(struct multipath *m) -{ - bool r; - unsigned long flags; - - spin_lock_irqsave(&m->lock, flags); - r = (test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags) || -__must_push_back(m)); - spin_unlock_irqrestore(&m->lock, flags); - - return r; -} - -static bool must_push_back_bio(struct multipath *m) -{ - bool r; - unsigned long flags; - - spin_lock_irqsave(&m->lock, flags); - r = __must_push_back(m); - spin_unlock_irqrestore(&m->lock, flags); - - return r; -} - -/* * Map cloned requests (request-based multipath) */ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq, @@ -503,7 +462,7 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq, pgpath = choose_pgpath(m, nr_bytes); if (!pgpath) { - if (must_push_back_rq(m)) { + if (test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags)) { pr_debug("no path - requeueing\n"); return DM_MAPIO_DELAY_REQUEUE; } @@ -581,9 +540,9 @@ static int __multipath_map_bio(struct multipath *m, struct bio *bio, struct dm_m } if (!pgpath) { - if (!must_push_back_bio(m)) - return -EIO; - return DM_MAPIO_REQUEUE; + if (test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags)) + return DM_MAPIO_REQUEUE; + return -EIO; } mpio->pgpath = pgpath; @@ -675,7 +634,7 @@ static int queue_if_no_path(struct multipath *m, bool queue_if_no_path, else clear_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags); } - if (queue_if_no_path) + if (queue_if_no_path || dm_noflush_suspending(m->ti)) set_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags); else clear_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags); @@ -1526,12 +1485,9 @@ static int do_end_io(struct multipath *m, struct request *clone, if (mpio->pgpath) fail_path(mpio->pgpath); - if (!atomic_read(&m->nr_valid_paths)) { - if (!test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags)) { - 
if (!must_push_back_rq(m)) - r = -EIO; - } - } + if (atomic_read(&m->nr_valid_paths) == 0 && + !test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags)) + r = -EIO; return r; } @@ -1572,13 +1528,9 @@ static int do_end_io_bio(struct multipath *m, struct bio *clone, if (mpio->pgpath) fail_path(mpio->pgpath); - if (!atomic_read(&m->nr_valid_paths)) { - if (!test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags)) { - if (!must_push_back_bio(m)) - return -EIO; - return DM_ENDIO_REQUEUE; - } - } + if (atomic_read(&m->nr_valid_paths) == 0 && + !test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags)) + return -EIO; /* Queue for the daemon to resubmit */ dm_bio_restore(get_bio_details_from_bio(clone), clone); -- 2.12.2 -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH v2 09/13] dm-mpath: Verify locking assumptions at runtime
Verify at runtime that __pg_init_all_paths() is called with multipath.lock held if lockdep is enabled. Signed-off-by: Bart Van Assche Reviewed-by: Hannes Reinecke --- drivers/md/dm-mpath.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c index fc3e7f028110..4e4d53faf2d4 100644 --- a/drivers/md/dm-mpath.c +++ b/drivers/md/dm-mpath.c @@ -298,6 +298,8 @@ static int __pg_init_all_paths(struct multipath *m) struct pgpath *pgpath; unsigned long pg_init_delay = 0; + lockdep_assert_held(&m->lock); + if (atomic_read(&m->pg_init_in_progress) || test_bit(MPATHF_PG_INIT_DISABLED, &m->flags)) return 0; -- 2.12.2
[dm-devel] [PATCH v2 12/13] dm-mpath: Introduce assign_bit()
This patch does not change any functionality but makes the code easier to read. Signed-off-by: Bart Van Assche Reviewed-by: Hannes Reinecke --- drivers/md/dm-mpath.c | 37 - 1 file changed, 16 insertions(+), 21 deletions(-) diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c index 5a3200332eb1..6cab04a0e565 100644 --- a/drivers/md/dm-mpath.c +++ b/drivers/md/dm-mpath.c @@ -613,6 +613,14 @@ static void process_queued_bios(struct work_struct *work) blk_finish_plug(&plug); } +static void assign_bit(bool value, long nr, unsigned long *addr) +{ + if (value) + set_bit(nr, addr); + else + clear_bit(nr, addr); +} + /* * If we run out of usable paths, should we queue I/O or error it? */ @@ -622,23 +630,12 @@ static int queue_if_no_path(struct multipath *m, bool queue_if_no_path, unsigned long flags; spin_lock_irqsave(&m->lock, flags); - - if (save_old_value) { - if (test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags)) - set_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags); - else - clear_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags); - } else { - if (queue_if_no_path) - set_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags); - else - clear_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags); - } - if (queue_if_no_path || dm_noflush_suspending(m->ti)) - set_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags); - else - clear_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags); - + assign_bit((save_old_value && + test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags)) || + (!save_old_value && queue_if_no_path), + MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags); + assign_bit(queue_if_no_path || dm_noflush_suspending(m->ti), + MPATHF_QUEUE_IF_NO_PATH, &m->flags); spin_unlock_irqrestore(&m->lock, flags); if (!queue_if_no_path) { @@ -1596,10 +1593,8 @@ static void multipath_resume(struct dm_target *ti) unsigned long flags; spin_lock_irqsave(&m->lock, flags); - if (test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags)) - set_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags); - else - clear_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags); + 
assign_bit(test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags), + MPATHF_QUEUE_IF_NO_PATH, &m->flags); spin_unlock_irqrestore(&m->lock, flags); } -- 2.12.2 -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH v2 10/13] dm: Introduce enum dm_queue_mode
Introduce an enumeration type for the queue mode. This patch does not change any functionality but makes the dm code easier to read. Signed-off-by: Bart Van Assche Reviewed-by: Hannes Reinecke --- drivers/md/dm-core.h | 2 +- drivers/md/dm-ioctl.c | 2 +- drivers/md/dm-mpath.c | 5 - drivers/md/dm-table.c | 14 +++--- drivers/md/dm.c | 14 +- drivers/md/dm.h | 12 +++- include/linux/device-mapper.h | 14 -- 7 files changed, 37 insertions(+), 26 deletions(-) diff --git a/drivers/md/dm-core.h b/drivers/md/dm-core.h index 136fda3ff9e5..b92f74d9a982 100644 --- a/drivers/md/dm-core.h +++ b/drivers/md/dm-core.h @@ -47,7 +47,7 @@ struct mapped_device { struct request_queue *queue; int numa_node_id; - unsigned type; + enum dm_queue_mode type; /* Protect queue and type against concurrent access. */ struct mutex type_lock; diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c index 4da6fc6b1ffd..9b1ce440fc70 100644 --- a/drivers/md/dm-ioctl.c +++ b/drivers/md/dm-ioctl.c @@ -1268,7 +1268,7 @@ static int populate_table(struct dm_table *table, return dm_table_complete(table); } -static bool is_valid_type(unsigned cur, unsigned new) +static bool is_valid_type(enum dm_queue_mode cur, enum dm_queue_mode new) { if (cur == new || (cur == DM_TYPE_BIO_BASED && new == DM_TYPE_DAX_BIO_BASED)) diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c index 4e4d53faf2d4..286d74593b2c 100644 --- a/drivers/md/dm-mpath.c +++ b/drivers/md/dm-mpath.c @@ -90,7 +90,7 @@ struct multipath { atomic_t pg_init_in_progress; /* Only one pg_init allowed at once */ atomic_t pg_init_count; /* Number of times pg_init called */ - unsigned queue_mode; + enum dm_queue_mode queue_mode; struct mutex work_mutex; struct work_struct trigger_event; @@ -1707,6 +1707,9 @@ static void multipath_status(struct dm_target *ti, status_type_t type, case DM_TYPE_MQ_REQUEST_BASED: DMEMIT("queue_mode mq "); break; + default: + WARN_ON_ONCE(true); + break; } } } diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c index 
92dbc85af53a..6692571336c7 100644 --- a/drivers/md/dm-table.c +++ b/drivers/md/dm-table.c @@ -30,7 +30,7 @@ struct dm_table { struct mapped_device *md; - unsigned type; + enum dm_queue_mode type; /* btree table */ unsigned int depth; @@ -821,19 +821,19 @@ void dm_consume_args(struct dm_arg_set *as, unsigned num_args) } EXPORT_SYMBOL(dm_consume_args); -static bool __table_type_bio_based(unsigned table_type) +static bool __table_type_bio_based(enum dm_queue_mode table_type) { return (table_type == DM_TYPE_BIO_BASED || table_type == DM_TYPE_DAX_BIO_BASED); } -static bool __table_type_request_based(unsigned table_type) +static bool __table_type_request_based(enum dm_queue_mode table_type) { return (table_type == DM_TYPE_REQUEST_BASED || table_type == DM_TYPE_MQ_REQUEST_BASED); } -void dm_table_set_type(struct dm_table *t, unsigned type) +void dm_table_set_type(struct dm_table *t, enum dm_queue_mode type) { t->type = type; } @@ -875,7 +875,7 @@ static int dm_table_determine_type(struct dm_table *t) struct dm_target *tgt; struct dm_dev_internal *dd; struct list_head *devices = dm_table_get_devices(t); - unsigned live_md_type = dm_get_md_type(t->md); + enum dm_queue_mode live_md_type = dm_get_md_type(t->md); if (t->type != DM_TYPE_NONE) { /* target already set the table's type */ @@ -984,7 +984,7 @@ static int dm_table_determine_type(struct dm_table *t) return 0; } -unsigned dm_table_get_type(struct dm_table *t) +enum dm_queue_mode dm_table_get_type(struct dm_table *t) { return t->type; } @@ -1035,7 +1035,7 @@ bool dm_table_all_blk_mq_devices(struct dm_table *t) static int dm_table_alloc_md_mempools(struct dm_table *t, struct mapped_device *md) { - unsigned type = dm_table_get_type(t); + enum dm_queue_mode type = dm_table_get_type(t); unsigned per_io_data_size = 0; struct dm_target *tgt; unsigned i; diff --git a/drivers/md/dm.c b/drivers/md/dm.c index 78706a04bab4..7161c7804363 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -1797,13 +1797,13 @@ void 
dm_unlock_md_type(struct mapped_device *md) mutex_unlock(&md->type_lock); } -void dm_set_md_type(struct mapped_device *md, unsigned type) +void dm_set_md_type(struct mapped_device *md, enum dm_queue_mode type) { BUG_ON(!mutex_is_locked(&md->type_lock)); md->type = type; } -
[dm-devel] [PATCH v2 07/13] dm, persistence: Remove an unused argument from dm_block_manager_create()
The 'cache_size' argument of dm_block_manager_create() has never been used. Hence remove it and also the definitions of the constants passed as 'cache_size' argument. Signed-off-by: Bart Van Assche Reviewed-by: Hannes Reinecke --- drivers/md/dm-cache-metadata.c| 3 --- drivers/md/dm-era-target.c| 2 -- drivers/md/dm-thin-metadata.c | 2 -- drivers/md/persistent-data/dm-block-manager.c | 1 - drivers/md/persistent-data/dm-block-manager.h | 2 +- 5 files changed, 1 insertion(+), 9 deletions(-) diff --git a/drivers/md/dm-cache-metadata.c b/drivers/md/dm-cache-metadata.c index 6735c8d6a445..8568dbd50ba4 100644 --- a/drivers/md/dm-cache-metadata.c +++ b/drivers/md/dm-cache-metadata.c @@ -27,8 +27,6 @@ #define MIN_CACHE_VERSION 1 #define MAX_CACHE_VERSION 2 -#define CACHE_METADATA_CACHE_SIZE 64 - /* * 3 for btree insert + * 2 for btree lookup used within space map @@ -535,7 +533,6 @@ static int __create_persistent_data_objects(struct dm_cache_metadata *cmd, { int r; cmd->bm = dm_block_manager_create(cmd->bdev, DM_CACHE_METADATA_BLOCK_SIZE << SECTOR_SHIFT, - CACHE_METADATA_CACHE_SIZE, CACHE_MAX_CONCURRENT_LOCKS); if (IS_ERR(cmd->bm)) { DMERR("could not create block manager"); diff --git a/drivers/md/dm-era-target.c b/drivers/md/dm-era-target.c index 9fab33b113c4..b13751d78b9e 100644 --- a/drivers/md/dm-era-target.c +++ b/drivers/md/dm-era-target.c @@ -254,7 +254,6 @@ static struct dm_block_validator sb_validator = { * Low level metadata handling *--*/ #define DM_ERA_METADATA_BLOCK_SIZE 4096 -#define DM_ERA_METADATA_CACHE_SIZE 64 #define ERA_MAX_CONCURRENT_LOCKS 5 struct era_metadata { @@ -615,7 +614,6 @@ static int create_persistent_data_objects(struct era_metadata *md, int r; md->bm = dm_block_manager_create(md->bdev, DM_ERA_METADATA_BLOCK_SIZE, -DM_ERA_METADATA_CACHE_SIZE, ERA_MAX_CONCURRENT_LOCKS); if (IS_ERR(md->bm)) { DMERR("could not create block manager"); diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c index a15091a0d40c..0f0251d0d337 100644 
--- a/drivers/md/dm-thin-metadata.c +++ b/drivers/md/dm-thin-metadata.c @@ -77,7 +77,6 @@ #define THIN_SUPERBLOCK_MAGIC 27022010 #define THIN_SUPERBLOCK_LOCATION 0 #define THIN_VERSION 2 -#define THIN_METADATA_CACHE_SIZE 64 #define SECTOR_TO_BLOCK_SHIFT 3 /* @@ -686,7 +685,6 @@ static int __create_persistent_data_objects(struct dm_pool_metadata *pmd, bool f int r; pmd->bm = dm_block_manager_create(pmd->bdev, THIN_METADATA_BLOCK_SIZE << SECTOR_SHIFT, - THIN_METADATA_CACHE_SIZE, THIN_MAX_CONCURRENT_LOCKS); if (IS_ERR(pmd->bm)) { DMERR("could not create block manager"); diff --git a/drivers/md/persistent-data/dm-block-manager.c b/drivers/md/persistent-data/dm-block-manager.c index 8589e0a14068..ea15d220ced7 100644 --- a/drivers/md/persistent-data/dm-block-manager.c +++ b/drivers/md/persistent-data/dm-block-manager.c @@ -378,7 +378,6 @@ struct dm_block_manager { struct dm_block_manager *dm_block_manager_create(struct block_device *bdev, unsigned block_size, -unsigned cache_size, unsigned max_held_per_thread) { int r; diff --git a/drivers/md/persistent-data/dm-block-manager.h b/drivers/md/persistent-data/dm-block-manager.h index 3627d1b7667a..e728937f376a 100644 --- a/drivers/md/persistent-data/dm-block-manager.h +++ b/drivers/md/persistent-data/dm-block-manager.h @@ -33,7 +33,7 @@ void *dm_block_data(struct dm_block *b); struct dm_block_manager; struct dm_block_manager *dm_block_manager_create( struct block_device *bdev, unsigned block_size, - unsigned cache_size, unsigned max_held_per_thread); + unsigned max_held_per_thread); void dm_block_manager_destroy(struct dm_block_manager *bm); unsigned dm_bm_block_size(struct dm_block_manager *bm); -- 2.12.2 -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH v2 08/13] dm: Verify suspend_locking assumptions at runtime
Ensure that the assumptions about the caller holding suspend_lock are checked at runtime if lockdep is enabled. Signed-off-by: Bart Van Assche Reviewed-by: Hannes Reinecke --- drivers/md/dm-table.c | 4 drivers/md/dm.c | 9 - 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c index 3ad16d9c9d5a..92dbc85af53a 100644 --- a/drivers/md/dm-table.c +++ b/drivers/md/dm-table.c @@ -1661,6 +1661,8 @@ static void suspend_targets(struct dm_table *t, enum suspend_mode mode) int i = t->num_targets; struct dm_target *ti = t->targets; + lockdep_assert_held(&t->md->suspend_lock); + while (i--) { switch (mode) { case PRESUSPEND: @@ -1708,6 +1710,8 @@ int dm_table_resume_targets(struct dm_table *t) { int i, r = 0; + lockdep_assert_held(&t->md->suspend_lock); + for (i = 0; i < t->num_targets; i++) { struct dm_target *ti = t->targets + i; diff --git a/drivers/md/dm.c b/drivers/md/dm.c index dfb75979e455..78706a04bab4 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -1686,11 +1686,10 @@ static void event_callback(void *context) wake_up(&md->eventq); } -/* - * Protected by md->suspend_lock obtained by dm_swap_table(). - */ static void __set_size(struct mapped_device *md, sector_t size) { + lockdep_assert_held(&md->suspend_lock); + set_capacity(md->disk, size); i_size_write(md->bdev->bd_inode, (loff_t)size << SECTOR_SHIFT); @@ -2140,8 +2139,6 @@ static void unlock_fs(struct mapped_device *md) * If __dm_suspend returns 0, the device is completely quiescent * now. There is no request-processing activity. All new requests * are being added to md->deferred list. 
- * - * Caller must hold md->suspend_lock */ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, unsigned suspend_flags, long task_state, @@ -2357,6 +2354,8 @@ static void __dm_internal_suspend(struct mapped_device *md, unsigned suspend_fla { struct dm_table *map = NULL; + lockdep_assert_held(&md->suspend_lock); + if (md->internal_suspend_count++) return; /* nested internal suspend */ -- 2.12.2 -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH v2 06/13] dm-rq: Check blk_mq_register_dev() return value
blk_mq_register_dev() can fail. Hence check the return value of that function. Signed-off-by: Bart Van Assche Reviewed-by: Hannes Reinecke Cc: Christoph Hellwig --- drivers/md/dm-rq.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c index ba5694be55a4..3ff7280f5dc5 100644 --- a/drivers/md/dm-rq.c +++ b/drivers/md/dm-rq.c @@ -813,10 +813,14 @@ int dm_mq_init_request_queue(struct mapped_device *md, struct dm_table *t) dm_init_md_queue(md); /* backfill 'mq' sysfs registration normally done in blk_register_queue */ - blk_mq_register_dev(disk_to_dev(md->disk), q); + err = blk_mq_register_dev(disk_to_dev(md->disk), q); + if (err) + goto free_queue; return 0; +free_queue: + blk_cleanup_queue(q); out_tag_set: blk_mq_free_tag_set(md->tag_set); out_kfree_tag_set: -- 2.12.2 -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH v2 03/13] dm-mpath: Delay requeuing while path initialization is in progress
Requeuing a request immediately while path initialization is ongoing causes high CPU usage, something that is undesired. Hence delay requeuing while path initialization is in progress. Signed-off-by: Bart Van Assche Reviewed-by: Hannes Reinecke Cc: Christoph Hellwig Cc: --- drivers/md/dm-mpath.c | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c index cc41e34607c3..d739688246a0 100644 --- a/drivers/md/dm-mpath.c +++ b/drivers/md/dm-mpath.c @@ -322,13 +322,16 @@ static int __pg_init_all_paths(struct multipath *m) return atomic_read(&m->pg_init_in_progress); } -static void pg_init_all_paths(struct multipath *m) +static int pg_init_all_paths(struct multipath *m) { unsigned long flags; + int ret; spin_lock_irqsave(&m->lock, flags); - __pg_init_all_paths(m); + ret = __pg_init_all_paths(m); spin_unlock_irqrestore(&m->lock, flags); + + return ret; } static void __switch_pg(struct multipath *m, struct priority_group *pg) @@ -485,7 +488,6 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq, struct request **__clone) { struct multipath *m = ti->private; - int r = DM_MAPIO_REQUEUE; size_t nr_bytes = blk_rq_bytes(rq); struct pgpath *pgpath; struct block_device *bdev; @@ -504,8 +506,8 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq, return -EIO;/* Failed */ } else if (test_bit(MPATHF_QUEUE_IO, &m->flags) || test_bit(MPATHF_PG_INIT_REQUIRED, &m->flags)) { - pg_init_all_paths(m); - return r; + return pg_init_all_paths(m) == 0 ? DM_MAPIO_REQUEUE : + DM_MAPIO_DELAY_REQUEUE; } memset(mpio, 0, sizeof(*mpio)); -- 2.12.2 -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH v2 02/13] dm-mpath: Avoid that path removal can trigger an infinite loop
If blk_get_request() fails check whether the failure is due to a path being removed. If that is the case fail the path by triggering a call to fail_path(). This patch avoids that the following scenario can be encountered while removing paths: * CPU usage of a kworker thread jumps to 100%. * Removing the dm device becomes impossible. Delay requeueing if blk_get_request() returns -EBUSY or -EWOULDBLOCK because in these cases immediate requeuing is inappropriate. Signed-off-by: Bart Van Assche Cc: Hannes Reinecke Cc: Christoph Hellwig Cc: --- drivers/md/dm-mpath.c | 17 - 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c index eff7ecea1d2a..cc41e34607c3 100644 --- a/drivers/md/dm-mpath.c +++ b/drivers/md/dm-mpath.c @@ -490,6 +490,7 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq, struct pgpath *pgpath; struct block_device *bdev; struct dm_mpath_io *mpio = get_mpio(map_context); + struct request_queue *q; struct request *clone; /* Do we need to select a new pgpath? */ @@ -512,13 +513,19 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq, mpio->nr_bytes = nr_bytes; bdev = pgpath->path.dev->bdev; - - clone = blk_get_request(bdev_get_queue(bdev), - rq->cmd_flags | REQ_NOMERGE, - GFP_ATOMIC); + q = bdev_get_queue(bdev); + clone = blk_get_request(q, rq->cmd_flags | REQ_NOMERGE, GFP_ATOMIC); if (IS_ERR(clone)) { /* EBUSY, ENODEV or EWOULDBLOCK: requeue */ - return r; + pr_debug("blk_get_request() returned %ld%s - requeuing\n", +PTR_ERR(clone), blk_queue_dying(q) ? +" (path offline)" : ""); + if (blk_queue_dying(q)) { + atomic_inc(&m->pg_init_in_progress); + activate_or_offline_path(pgpath); + return DM_MAPIO_REQUEUE; + } + return DM_MAPIO_DELAY_REQUEUE; } clone->bio = clone->biotail = NULL; clone->rq_disk = bdev->bd_disk; -- 2.12.2 -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH v2 00/13] Device mapper and dm-mpath patches
Hello Mike, The patches in this series are: * A few fixes for bugs in the dm-mpath driver I ran into while testing this driver. * A resend of various dm / dm-mpath patches I had posted before but for which I'm still waiting for a review from you. Please consider at least the dm-mpath bug fixes for kernel v4.12. Thanks, Bart. Changes compared to v1: - Renamed activate_path() to activate_or_offline_path(). - Added Reviewed-by tags to the patches that got a positive review. Bart Van Assche (13): dm-mpath: Split activate_path() dm-mpath: Avoid that path removal can trigger an infinite loop dm-mpath: Delay requeuing while path initialization is in progress dm-rq: Adjust requeuing delays dm-mpath: Make it easier to analyze requeuing behavior dm-rq: Check blk_mq_register_dev() return value dm, persistence: Remove an unused argument from dm_block_manager_create() dm: Verify suspend_locking assumptions at runtime dm-mpath: Verify locking assumptions at runtime dm: Introduce enum dm_queue_mode dm-mpath: Micro-optimize the hot path dm-mpath: Introduce assign_bit() dm, dm-mpath: Make it easier to detect unintended I/O request flushes drivers/md/dm-cache-metadata.c| 3 - drivers/md/dm-core.h | 2 +- drivers/md/dm-era-target.c| 2 - drivers/md/dm-ioctl.c | 2 +- drivers/md/dm-mpath.c | 190 +- drivers/md/dm-rq.c| 15 +- drivers/md/dm-table.c | 18 ++- drivers/md/dm-thin-metadata.c | 2 - drivers/md/dm.c | 26 ++-- drivers/md/dm.h | 12 +- drivers/md/persistent-data/dm-block-manager.c | 1 - drivers/md/persistent-data/dm-block-manager.h | 2 +- include/linux/device-mapper.h | 14 +- 13 files changed, 151 insertions(+), 138 deletions(-) -- 2.12.2 -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH v2 05/13] dm-mpath: Make it easier to analyze requeuing behavior
When debugging the dm-mpath driver it is important to know what decisions have been taken with regard to requeuing. Hence this patch that adds pr_debug() statements that report what decisions have been taken. Signed-off-by: Bart Van Assche Reviewed-by: Hannes Reinecke Cc: Christoph Hellwig --- drivers/md/dm-mpath.c | 21 ++--- drivers/md/dm-rq.c| 5 - 2 files changed, 22 insertions(+), 4 deletions(-) diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c index d739688246a0..fc3e7f028110 100644 --- a/drivers/md/dm-mpath.c +++ b/drivers/md/dm-mpath.c @@ -501,8 +501,10 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq, pgpath = choose_pgpath(m, nr_bytes); if (!pgpath) { - if (must_push_back_rq(m)) + if (must_push_back_rq(m)) { + pr_debug("no path - requeueing\n"); return DM_MAPIO_DELAY_REQUEUE; + } return -EIO;/* Failed */ } else if (test_bit(MPATHF_QUEUE_IO, &m->flags) || test_bit(MPATHF_PG_INIT_REQUIRED, &m->flags)) { @@ -1425,17 +1427,23 @@ static void pg_init_done(void *data, int errors) /* Activations of other paths are still on going */ goto out; + pr_debug("pg_init_in_progress = %d\n", +atomic_read(&m->pg_init_in_progress)); + if (test_bit(MPATHF_PG_INIT_REQUIRED, &m->flags)) { if (delay_retry) set_bit(MPATHF_PG_INIT_DELAY_RETRY, &m->flags); else clear_bit(MPATHF_PG_INIT_DELAY_RETRY, &m->flags); - if (__pg_init_all_paths(m)) + if (__pg_init_all_paths(m)) { + pr_debug("__pg_init_all_paths() reported that initialization is ongoing\n"); goto out; + } } clear_bit(MPATHF_QUEUE_IO, &m->flags); + pr_debug("Processing queued I/O list\n"); process_queued_io_list(m); /* @@ -1932,8 +1940,11 @@ static int multipath_busy(struct dm_target *ti) struct pgpath *pgpath; /* pg_init in progress */ - if (atomic_read(&m->pg_init_in_progress)) + if (atomic_read(&m->pg_init_in_progress)) { + pr_debug("pg_init_in_progress = %d\n", +atomic_read(&m->pg_init_in_progress)); return true; + } /* no paths available, for blk-mq: rely on IO mapping to delay 
requeue */ if (!atomic_read(&m->nr_valid_paths) && test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags)) @@ -1980,6 +1991,10 @@ static int multipath_busy(struct dm_target *ti) busy = false; } + + if (busy) + pr_debug("all active paths are busy\n"); + return busy; } diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c index c53debdcd7dc..ba5694be55a4 100644 --- a/drivers/md/dm-rq.c +++ b/drivers/md/dm-rq.c @@ -737,8 +737,10 @@ static int dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx, dm_put_live_table(md, srcu_idx); } - if (ti->type->busy && ti->type->busy(ti)) + if (ti->type->busy && ti->type->busy(ti)) { + pr_debug("ti->type->busy()\n"); return BLK_MQ_RQ_QUEUE_BUSY; + } dm_start_request(md, rq); @@ -756,6 +758,7 @@ static int dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx, rq_end_stats(md, rq); rq_completed(md, rq_data_dir(rq), false); blk_mq_delay_run_hw_queue(hctx, 500/*ms*/); + pr_debug("DM_MAPIO_REQUEUE\n"); return BLK_MQ_RQ_QUEUE_BUSY; } -- 2.12.2 -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH v2 07/21] crypto: shash, caam: Make use of the new sg_map helper function
On 26/04/17 09:56 PM, Herbert Xu wrote: > On Tue, Apr 25, 2017 at 12:20:54PM -0600, Logan Gunthorpe wrote: >> Very straightforward conversion to the new function in the caam driver >> and shash library. >> >> Signed-off-by: Logan Gunthorpe >> Cc: Herbert Xu >> Cc: "David S. Miller" >> --- >> crypto/shash.c| 9 ++--- >> drivers/crypto/caam/caamalg.c | 8 +++- >> 2 files changed, 9 insertions(+), 8 deletions(-) >> >> diff --git a/crypto/shash.c b/crypto/shash.c >> index 5e31c8d..5914881 100644 >> --- a/crypto/shash.c >> +++ b/crypto/shash.c >> @@ -283,10 +283,13 @@ int shash_ahash_digest(struct ahash_request *req, >> struct shash_desc *desc) >> if (nbytes < min(sg->length, ((unsigned int)(PAGE_SIZE)) - offset)) { >> void *data; >> >> -data = kmap_atomic(sg_page(sg)); >> -err = crypto_shash_digest(desc, data + offset, nbytes, >> +data = sg_map(sg, 0, SG_KMAP_ATOMIC); >> +if (IS_ERR(data)) >> +return PTR_ERR(data); >> + >> +err = crypto_shash_digest(desc, data, nbytes, >>req->result); >> -kunmap_atomic(data); >> +sg_unmap(sg, data, 0, SG_KMAP_ATOMIC); >> crypto_yield(desc->flags); >> } else >> err = crypto_shash_init(desc) ?: > > Nack. This is an optimisation for the special case of a single > SG list entry. In fact in the common case the kmap_atomic should > disappear altogether in the no-highmem case. So replacing it > with sg_map is not acceptable. What you seem to have missed is that sg_map is just a thin wrapper around kmap_atomic. Perhaps with a future check for a mappable page. This change should have zero impact on performance. Logan -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions
On 27/04/17 12:53 AM, Christoph Hellwig wrote: > I think you'll need to follow the existing kmap semantics and never > fail the iomem version either. Otherwise you'll have a special case > that's almost never used that has a different error path. > > Again, wrong way. Suddenly making things fail for your special case > that normally don't fail is a receipe for bugs. I don't disagree but these restrictions make the problem impossible to solve? If there is iomem behind a page in an SGL and someone tries to map it, we either have to fail or we break iomem safety which was your original concern. Logan -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions
On 27/04/17 09:27 AM, Jason Gunthorpe wrote: > On Thu, Apr 27, 2017 at 08:53:38AM +0200, Christoph Hellwig wrote: > How about first switching as many call sites as possible to use > sg_copy_X_buffer instead of kmap? Yeah, I could look at doing that first. One problem is we might get more Naks of the form of Herbert Xu's who might be concerned with the performance implications. These are definitely a bit more invasive changes than thin wrappers around kmap calls. > A random audit of Logan's series suggests this is actually a fairly > common thing. It's not _that_ common but there are a significant fraction. One of my patches actually did this to two places that seemed to be reimplementing the sg_copy_X_buffer logic. Thanks, Logan -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH v1] dm-crypt: replace custom implementation of hex2bin()
From: Andy Shevchenko There is no need to have a duplication of the generic library, i.e. hex2bin(). Replace the open coded variant. Signed-off-by: Andy Shevchenko --- drivers/md/dm-crypt.c | 27 ++- 1 file changed, 2 insertions(+), 25 deletions(-) diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c index 70f765f50c59..ebf9e72d479b 100644 --- a/drivers/md/dm-crypt.c +++ b/drivers/md/dm-crypt.c @@ -1814,30 +1814,6 @@ static void kcryptd_queue_crypt(struct dm_crypt_io *io) queue_work(cc->crypt_queue, &io->work); } -/* - * Decode key from its hex representation - */ -static int crypt_decode_key(u8 *key, char *hex, unsigned int size) -{ - char buffer[3]; - unsigned int i; - - buffer[2] = '\0'; - - for (i = 0; i < size; i++) { - buffer[0] = *hex++; - buffer[1] = *hex++; - - if (kstrtou8(buffer, 16, &key[i])) - return -EINVAL; - } - - if (*hex != '\0') - return -EINVAL; - - return 0; -} - static void crypt_free_tfms_aead(struct crypt_config *cc) { if (!cc->cipher_tfm.tfms_aead) @@ -2136,7 +2112,8 @@ static int crypt_set_key(struct crypt_config *cc, char *key) kzfree(cc->key_string); cc->key_string = NULL; - if (cc->key_size && crypt_decode_key(cc->key, key, cc->key_size) < 0) + /* Decode key from its hex representation. */ + if (cc->key_size && hex2bin(cc->key, key, cc->key_size) < 0) goto out; r = crypt_setkey(cc); -- 2.11.0 -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions
On Thu, Apr 27, 2017 at 08:53:38AM +0200, Christoph Hellwig wrote: > > The main difficulty we > > have now is that neither of those functions are expected to fail and we > > need them to be able to in cases where the page doesn't map to system > > RAM. This patch series is trying to address it for users of scatterlist. > > I'm certainly open to other suggestions. > > I think you'll need to follow the existing kmap semantics and never > fail the iomem version either. Otherwise you'll have a special case > that's almost never used that has a different error path. How about first switching as many call sites as possible to use sg_copy_X_buffer instead of kmap? A random audit of Logan's series suggests this is actually a fairly common thing. eg drivers/mmc/host/sdhci.c is only doing this: buffer = sdhci_kmap_atomic(sg, &flags); memcpy(buffer, align, size); sdhci_kunmap_atomic(buffer, &flags); drivers/scsi/mvsas/mv_sas.c is this: + to = sg_map(sg_resp, 0, SG_KMAP_ATOMIC); + memcpy(to, + slot->response + sizeof(struct mvs_err_info), + sg_dma_len(sg_resp)); + sg_unmap(sg_resp, to, 0, SG_KMAP_ATOMIC); etc. Lots of other places seem similar, if not sometimes a little bit more convoluted.. Switching all the trivial cases to use copy might bring more clarity as to what is actually required for the remaining few users? If there are only a few then it may no longer matter if the API is not idyllic. Jason -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH] dm ioctl: prevent stack leak in dm ioctl call
On Tue, Apr 25, 2017 at 05:33:19PM -0700, Adrian Salido wrote: > Will update the patch to be clear So at the moment, we're leaning towards just: param->data_size = offsetof(struct dm_ioctl, data) to replace param->data_size = sizeof(*param); in the caller. Alasdair -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH 02/13] dm-mpath: Avoid that path removal can trigger an infinite loop
On Thu, Apr 27 2017 at 11:13am -0400, Hannes Reinecke wrote: > On 04/27/2017 05:11 PM, Bart Van Assche wrote: > > On Thu, 2017-04-27 at 07:46 +0200, Hannes Reinecke wrote: > >> On 04/26/2017 08:37 PM, Bart Van Assche wrote: > >>> + clone = blk_get_request(q, rq->cmd_flags | REQ_NOMERGE, GFP_ATOMIC); > >>> if (IS_ERR(clone)) { > >>> /* EBUSY, ENODEV or EWOULDBLOCK: requeue */ > >>> - return r; > >>> + pr_debug("blk_get_request() returned %ld%s - requeuing\n", > >>> + PTR_ERR(clone), blk_queue_dying(q) ? > >>> + " (path offline)" : ""); > >>> + if (blk_queue_dying(q)) { > >>> + atomic_inc(&m->pg_init_in_progress); > >>> + activate_path(pgpath); > >>> + return DM_MAPIO_REQUEUE; > >>> + } > >>> + return DM_MAPIO_DELAY_REQUEUE; > >>> } > >>> clone->bio = clone->biotail = NULL; > >>> clone->rq_disk = bdev->bd_disk; > >> > >> At the very least this does warrant some inline comments. > >> Why do we call activate_path() here, seeing that the queue is dying? > > > > Hello Hannes, > > > > activate_path() is not only able to activate a path but can also change > > the state of a path to offline. The body of the activate_path() function > > makes that clear and that is why I had not added a comment above the > > activate_path() call: > > > > static void activate_path(struct pgpath *pgpath) > > { > > struct request_queue *q = bdev_get_queue(pgpath->path.dev->bdev); > > > > if (pgpath->is_active && !blk_queue_dying(q)) > > scsi_dh_activate(q, pg_init_done, pgpath); > > else > > pg_init_done(pgpath, SCSI_DH_DEV_OFFLINED); > > } > > > So why not call 'pg_init_done()' directly and avoid the confusion? Doing so is sprinkling more SCSI specific droppings in code that should be increasingly transport agnostic. Might be worth renaming activate_path() to activate_or_offline_path() ? Mike -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH 02/13] dm-mpath: Avoid that path removal can trigger an infinite loop
On 04/27/2017 05:11 PM, Bart Van Assche wrote: > On Thu, 2017-04-27 at 07:46 +0200, Hannes Reinecke wrote: >> On 04/26/2017 08:37 PM, Bart Van Assche wrote: >>> + clone = blk_get_request(q, rq->cmd_flags | REQ_NOMERGE, GFP_ATOMIC); >>> if (IS_ERR(clone)) { >>> /* EBUSY, ENODEV or EWOULDBLOCK: requeue */ >>> - return r; >>> + pr_debug("blk_get_request() returned %ld%s - requeuing\n", >>> +PTR_ERR(clone), blk_queue_dying(q) ? >>> +" (path offline)" : ""); >>> + if (blk_queue_dying(q)) { >>> + atomic_inc(&m->pg_init_in_progress); >>> + activate_path(pgpath); >>> + return DM_MAPIO_REQUEUE; >>> + } >>> + return DM_MAPIO_DELAY_REQUEUE; >>> } >>> clone->bio = clone->biotail = NULL; >>> clone->rq_disk = bdev->bd_disk; >> >> At the very least this does warrant some inline comments. >> Why do we call activate_path() here, seeing that the queue is dying? > > Hello Hannes, > > activate_path() is not only able to activate a path but can also change > the state of a path to offline. The body of the activate_path() function > makes that clear and that is why I had not added a comment above the > activate_path() call: > > static void activate_path(struct pgpath *pgpath) > { > struct request_queue *q = bdev_get_queue(pgpath->path.dev->bdev); > > if (pgpath->is_active && !blk_queue_dying(q)) > scsi_dh_activate(q, pg_init_done, pgpath); > else > pg_init_done(pgpath, SCSI_DH_DEV_OFFLINED); > } > So why not call 'pg_init_done()' directly and avoid the confusion? Cheers, Hannes -- Dr. Hannes ReineckeTeamlead Storage & Networking h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH 02/13] dm-mpath: Avoid that path removal can trigger an infinite loop
On Thu, 2017-04-27 at 07:46 +0200, Hannes Reinecke wrote: > On 04/26/2017 08:37 PM, Bart Van Assche wrote: > > + clone = blk_get_request(q, rq->cmd_flags | REQ_NOMERGE, GFP_ATOMIC); > > if (IS_ERR(clone)) { > > /* EBUSY, ENODEV or EWOULDBLOCK: requeue */ > > - return r; > > + pr_debug("blk_get_request() returned %ld%s - requeuing\n", > > +PTR_ERR(clone), blk_queue_dying(q) ? > > +" (path offline)" : ""); > > + if (blk_queue_dying(q)) { > > + atomic_inc(&m->pg_init_in_progress); > > + activate_path(pgpath); > > + return DM_MAPIO_REQUEUE; > > + } > > + return DM_MAPIO_DELAY_REQUEUE; > > } > > clone->bio = clone->biotail = NULL; > > clone->rq_disk = bdev->bd_disk; > > At the very least this does warrant some inline comments. > Why do we call activate_path() here, seeing that the queue is dying? Hello Hannes, activate_path() is not only able to activate a path but can also change the state of a path to offline. The body of the activate_path() function makes that clear and that is why I had not added a comment above the activate_path() call: static void activate_path(struct pgpath *pgpath) { struct request_queue *q = bdev_get_queue(pgpath->path.dev->bdev); if (pgpath->is_active && !blk_queue_dying(q)) scsi_dh_activate(q, pg_init_done, pgpath); else pg_init_done(pgpath, SCSI_DH_DEV_OFFLINED); } Bart. -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH 3.18-stable] dm bufio: hide bogus warning
On Fri, Apr 21, 2017 at 03:41:10PM +0200, Arnd Bergmann wrote:
> mips-gcc-5.3 warns about correct code on linux-3.18 and earlier:
>
> In file included from ../include/linux/blkdev.h:4:0,
>                  from ../drivers/md/dm-bufio.h:12,
>                  from ../drivers/md/dm-bufio.c:9:
> ../drivers/md/dm-bufio.c: In function 'alloc_buffer':
> ../include/linux/sched.h:1975:56: warning: 'noio_flag' may be used
> uninitialized in this function [-Wmaybe-uninitialized]
>   current->flags = (current->flags & ~PF_MEMALLOC_NOIO) | flags;
>                                                           ~^~~
> ../drivers/md/dm-bufio.c:325:11: note: 'noio_flag' was declared here
>
> The warning disappeared on later kernels with this commit: be0c37c985ed
> ("MIPS: Rearrange PTE bits into fixed positions.") I assume this only
> happened because it changed some inlining decisions.
>
> On 3.18.y, we can shut up the warning by adding an extra initialization.
>
> Signed-off-by: Arnd Bergmann

Now applied, thanks.

greg k-h
Re: [dm-devel] [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions
On 26/04/17 01:44 AM, Christoph Hellwig wrote:
> I think we'll at least need a draft of those to make sense of these
> patches. Otherwise they just look very clumsy.

Ok, I'll work up a draft proposal and send it in a couple of days. But
without a lot of cleanup such as this series, it's not even going to be
able to compile.

> I'm sorry but this API is just a trainwreck. Right now we have the
> nice little kmap_atomic API, which never fails and has a very nice
> calling convention where we just pass back the return address, but does
> not support sleeping inside the critical section.
>
> And kmap, which may fail and requires the original page to be passed
> back. Anything that mixes these two concepts up is simply a non-starter.

Ok, well for starters I think you are mistaken about kmap being able to
fail. I'm having a hard time finding many users of that function that
bother to check for an error when calling it.

The main difficulty we have now is that neither of those functions is
expected to fail, and we need them to be able to in cases where the page
doesn't map to system RAM. This patch series is trying to address it for
users of scatterlist. I'm certainly open to other suggestions.

I also have to disagree that kmap and kmap_atomic are all that "nice".
Except for the sleeping restriction and performance, they effectively do
the same thing. And it was necessary to write a macro wrapper around
kunmap_atomic to ensure that users of that function don't screw it up.
(See 597781f3e5.) I'd say the kmap/kmap_atomic functions are the
trainwreck, and I'm trying to do my best to clean up a few cases.

There are a fair number of cases in the kernel that do something like:

	if (something)
		x = kmap(page);
	else
		x = kmap_atomic(page);
	...
	if (something)
		kunmap(page);
	else
		kunmap_atomic(x);

Which just seems cumbersome to me.

In any case, if you can accept an sg_kmap and sg_kmap_atomic API, just
say so and I'll make the change. But I'll still need a flags variable
for SG_MAP_MUST_NOT_FAIL to support legacy cases that have no fail path,
and both of those functions will need to be pretty nearly replicas of
each other.

Logan
Re: [dm-devel] [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions
On 26/04/17 02:59 AM, wrote:
> Good to know that somebody is working on this. Those problems troubled
> us as well.

Thanks Christian. It's a daunting problem and there's a lot of work to
do before we will ever be where we need to be, so any help, even an ack,
is greatly appreciated.

Logan
Re: [dm-devel] [BUG] multipath-tools: missing internal values (multipath -t)
Hi,

The config output should be usable, so a comment could be printed for
the keywords you suggest printing a default for. Example:

# no_path_retry

or simply

# no_path_retry

On Tue, Apr 25, 2017 at 8:42 PM, Benjamin Marzinski wrote:
> On Tue, Apr 25, 2017 at 04:26:06PM +0200, Xose Vazquez Perez wrote:
> > Hi,
> >
> > These values:
>
> What do you think of these?
>
> > no_path_retry
>
> The default if no_path_retry is not set is to use the features line to
> determine what to do when there are no paths. We could have it
> print something like
>
> no_path_retry
>
> > dev_loss_tmo
>
> The default here is to keep whatever the sysfs value currently is. We
> could print
>
> dev_loss_tmo
>
> or
>
> dev_loss_tmo
>
> or something else. I'm open to ideas.
>
> > reservation_key
>
> The default here is nothing.
>
> reservation_key
>
> probably makes the most sense.
>
> > partition_delimiter
>
> The default here is an empty string.
>
> partition_delim ""
>
> is probably the right answer.
>
> > delay_watch_checks
> > delay_wait_checks
>
> These clearly have well defined default values and should be set.
>
> -Ben
>
> > are missing from the "defaults section" of multipath -t output.
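[Editor's note: the commented-defaults idea the thread converges on could render roughly like this in the `multipath -t` defaults section. This is a hypothetical mock-up of the proposed output, not what multipath actually prints; the comment wording is invented for illustration.]

```
defaults {
	# no_path_retry        (default taken from the features line)
	# dev_loss_tmo         (default keeps the current sysfs value)
	# reservation_key      (default: unset)
	partition_delimiter ""
	delay_watch_checks no
	delay_wait_checks no
}
```

Keywords with a well-defined default appear as normal settings, while keywords whose default depends on other state are emitted as comments, so the file remains directly reusable as a config.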