Re: [dm-devel] [PATCH v4 13/18] nvme: Add pr_ops read_reservation support
On 2/24/23 3:04 PM, kernel test robot wrote: > >drivers/nvme/host/pr.c: In function 'block_pr_type_from_nvme': >>> drivers/nvme/host/pr.c:43:24: warning: implicit conversion from 'enum >>> nvme_pr_type' to 'enum pr_type' [-Wenum-conversion] > 43 | return NVME_PR_EXCLUSIVE_ACCESS_ALL_REGS; > |^~~ Should have been PR_EXCLUSIVE_ACCESS_ALL_REGS. Will fix. -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH v4 10/18] nvme: Add helper to send pr command
On 3/5/23 3:28 PM, Chaitanya Kulkarni wrote: >> >> +static int nvme_send_pr_command(struct block_device *bdev, >> +struct nvme_command *c, void *data, unsigned int data_len) >> +{ >> +if (IS_ENABLED(CONFIG_NVME_MULTIPATH) && >> +bdev->bd_disk->fops == &nvme_ns_head_ops) >> +return nvme_send_ns_head_pr_command(bdev, c, data, data_len); > below else is not needed after above return.. Will fix. Thanks. -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] dm crypt: initialize tasklet in crypt_io_init()
Hi, On 3/7/2023 3:31 AM, Mike Snitzer wrote: > On Mon, Mar 06 2023 at 8:49P -0500, > Hou Tao wrote: > >> From: Hou Tao >> >> When neither no_read_workqueue nor no_write_workqueue are enabled, >> tasklet_trylock() in crypt_dec_pending() may still return false due to >> an uninitialized state, and dm-crypt will do io completion in io_queue >> instead of current context unnecessarily. > Have you actually experienced this? Yes. I had written a bpftrace script to check the completion context of blkdev_bio_end_io_simple() when doing direct io read on dm-crypt device. The expected context should be unbound workers of crypt_queue, but sometimes the context is the bound worker of io_queue. > >> Fix it by initializing io->tasklet in crypt_io_init(). > Really would rather avoid always calling tasklet_init(). But I can > optimize it away with a later patch. My first though was "io->tasklet.state = 0", but it may be fragile because it operated on the internal status of tasklet, so I switch to tasklet_init(). > > Mike -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH] dm-crypt: add the "high_priority" flag
在 2023/3/7 3:01, Mikulas Patocka 写道: On Mon, 6 Mar 2023, Mikulas Patocka wrote: Hi, Thanks a lot for your review! It's ok to fix the softlockup, but for async write encrypt, kcryptd_crypt_write_io_submit will add bio to write_tree, and once we call cond_resched before every kcryptd_io_write, the write performance may be poor while we meet a high cpu usage scene. Hi To fix this problem, find the PID of the process "dmcrypt_write" and change its priority to -20, for example "renice -n -20 -p 34748". This is the proper way how to fix it; locking up the process for one second is not. We used to have high-priority workqueues by default, but it caused audio playback skipping, so we had to revert it - see f612b2132db529feac4f965f28a1b9258ea7c22b. Perhaps we should add an option to have high-priority kernel threads? Mikulas Here I'm sending a patch that makes it optional to raise the priority of dm-crypt workqueues and the write thread. Test it, if it helps for you. Thanks, will test it! Mikulas From: Mikulas Patocka It was reported that dm-crypt performs badly when the system is loaded[1]. So I'm adding an option "high_priority" that sets the workqueues and the write_thread to nice level -20. This used to be the default in the past, but it caused audio skipping, so it had to be reverted - see the commit f612b2132db529feac4f965f28a1b9258ea7c22b. This commit makes it optional, so that normal users won't be harmed by it. [1] https://listman.redhat.com/archives/dm-devel/2023-February/053410.html Signed-off-by: Mikulas Patocka --- Documentation/admin-guide/device-mapper/dm-crypt.rst |5 +++ drivers/md/dm-crypt.c| 28 +-- 2 files changed, 25 insertions(+), 8 deletions(-) Index: linux-2.6/drivers/md/dm-crypt.c === --- linux-2.6.orig/drivers/md/dm-crypt.c +++ linux-2.6/drivers/md/dm-crypt.c @@ -133,9 +133,9 @@ struct iv_elephant_private { * and encrypts / decrypts at the same time. */ enum flags { DM_CRYPT_SUSPENDED, DM_CRYPT_KEY_VALID, -DM_CRYPT_SAME_CPU, DM_CRYPT_NO_OFFLOAD, -DM_CRYPT_NO_READ_WORKQUEUE, DM_CRYPT_NO_WRITE_WORKQUEUE, -DM_CRYPT_WRITE_INLINE }; +DM_CRYPT_SAME_CPU, DM_CRYPT_HIGH_PRIORITY, +DM_CRYPT_NO_OFFLOAD, DM_CRYPT_NO_READ_WORKQUEUE, +DM_CRYPT_NO_WRITE_WORKQUEUE, DM_CRYPT_WRITE_INLINE }; enum cipher_flags { CRYPT_MODE_INTEGRITY_AEAD, /* Use authenticated mode for cipher */ @@ -3087,7 +3087,7 @@ static int crypt_ctr_optional(struct dm_ struct crypt_config *cc = ti->private; struct dm_arg_set as; static const struct dm_arg _args[] = { - {0, 8, "Invalid number of feature args"}, + {0, 9, "Invalid number of feature args"}, }; unsigned int opt_params, val; const char *opt_string, *sval; @@ -3114,6 +3114,8 @@ static int crypt_ctr_optional(struct dm_ else if (!strcasecmp(opt_string, "same_cpu_crypt")) set_bit(DM_CRYPT_SAME_CPU, &cc->flags); + else if (!strcasecmp(opt_string, "high_priority")) + set_bit(DM_CRYPT_HIGH_PRIORITY, &cc->flags); else if (!strcasecmp(opt_string, "submit_from_crypt_cpus")) set_bit(DM_CRYPT_NO_OFFLOAD, &cc->flags); @@ -3352,18 +3354,22 @@ static int crypt_ctr(struct dm_target *t } ret = -ENOMEM; - cc->io_queue = alloc_workqueue("kcryptd_io/%s", WQ_MEM_RECLAIM, 1, devname); + cc->io_queue = alloc_workqueue("kcryptd_io/%s", WQ_MEM_RECLAIM | + test_bit(DM_CRYPT_HIGH_PRIORITY, &cc->flags) * WQ_HIGHPRI, + 1, devname); if (!cc->io_queue) { ti->error = "Couldn't create kcryptd io queue"; goto bad; } if (test_bit(DM_CRYPT_SAME_CPU, &cc->flags)) - cc->crypt_queue = alloc_workqueue("kcryptd/%s", WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM, + cc->crypt_queue = alloc_workqueue("kcryptd/%s", WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM | + test_bit(DM_CRYPT_HIGH_PRIORITY, &cc->flags) * WQ_HIGHPRI, 1, devname); else cc->crypt_queue = alloc_workqueue("kcryptd/%s", - WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM | WQ_UNBOUND, + WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM | WQ_UNBOUND | + test_bit(DM_CRYPT_HIGH_PRIORITY, &cc->flags) * WQ_HIGHPRI, num_online_cpus(), devname); if (!cc->crypt_queue) { ti->error = "Couldn't create kcryptd queue"; @@ -3380,6 +3386,8 @@ static int crypt_ctr(struct dm_target
Re: [dm-devel] dm-crypt: add cond_resched() to dmcrypt_write()
在 2023/3/7 0:33, Mike Snitzer 写道: On Mon, Mar 06 2023 at 11:17P -0500, Mikulas Patocka wrote: The loop in dmcrypt_write may be running for unbounded amount of time, thus we need cond_resched() in it. This commit fixes the following warning: [ 3391.153255][ C12] watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [dmcrypt_write/2:2897] ... [ 3391.258372][ C12] CPU: 12 PID: 2897 Comm: dmcrypt_write/2 Tainted: G L5.10.0+ #8 [ 3391.267288][ C12] Hardware name: Huawei TaiShan 2280 V2/BC82AMDDA, BIOS 1.75 04/26/2021 [ 3391.275428][ C12] pstate: 8049 (Nzcv daif +PAN -UAO -TCO BTYPE=--) [ 3391.282102][ C12] pc : blk_attempt_bio_merge.part.6+0x38/0x158 [ 3391.288080][ C12] lr : blk_attempt_plug_merge+0xc0/0x1b0 [ 3391.293540][ C12] sp : 8000718abc30 [ 3391.297530][ C12] x29: 8000718abc30 x28: [ 3391.303509][ C12] x27: 2020361d9ac0 x26: 0001 [ 3391.309488][ C12] x25: 0001 x24: 8000718abd08 [ 3391.315467][ C12] x23: 0020a10dbb00 x22: 0001 [ 3391.321445][ C12] x21: 8000718abe20 x20: 0020a10dbb00 [ 3391.327424][ C12] x19: 0020aed7b040 x18: [ 3391.333402][ C12] x17: x16: fde0 [ 3391.339381][ C12] x15: 1000 x14: [ 3391.345359][ C12] x13: 2000 x12: 2060011f9070 [ 3391.351338][ C12] x11: 0001 x10: 0002 [ 3391.357316][ C12] x9 : 800010586c38 x8 : 202009a7f200 [ 3391.363294][ C12] x7 : 8000718abd00 x6 : 20200df33750 [ 3391.369274][ C12] x5 : 0001 x4 : [ 3391.375252][ C12] x3 : 0001 x2 : 0020a10dbb00 [ 3391.381230][ C12] x1 : 0020aed7b040 x0 : 1000 [ 3391.387210][ C12] Call trace: [ 3391.390338][ C12] blk_attempt_bio_merge.part.6+0x38/0x158 [ 3391.395970][ C12] blk_attempt_plug_merge+0xc0/0x1b0 [ 3391.401085][ C12] blk_mq_submit_bio+0x398/0x550 [ 3391.405856][ C12] submit_bio_noacct+0x308/0x380 [ 3391.410630][ C12] dmcrypt_write+0x1e4/0x208 [dm_crypt] [ 3391.416005][ C12] kthread+0x130/0x138 [ 3391.419911][ C12] ret_from_fork+0x10/0x18 Signed-off-by: Mikulas Patocka Reported-by: yangerkun Fixes: dc2676210c42 ("dm crypt: offload writes to thread") Cc: sta...@vger.kernel.org --- drivers/md/dm-crypt.c |1 + 1 file changed, 1 insertion(+) Index: linux-2.6/drivers/md/dm-crypt.c === --- linux-2.6.orig/drivers/md/dm-crypt.c +++ linux-2.6/drivers/md/dm-crypt.c @@ -1954,6 +1954,7 @@ pop_from_list: io = crypt_io_from_node(rb_first(&write_tree)); rb_erase(&io->rb_node, &write_tree); kcryptd_io_write(io); + cond_resched(); } while (!RB_EMPTY_ROOT(&write_tree)); blk_finish_plug(&plug); } I suggested the same here: https://listman.redhat.com/archives/dm-devel/2023-February/053402.html Yet yangerkun resisted it: https://listman.redhat.com/archives/dm-devel/2023-February/053410.html And stuck with their more throttled approach (which looks wrong to me, shouldn't it be using time_is_after_jiffies!?) Hi, --> ||| start_time start_time + HZ jiffies (start_time + HZ) < jiffies means since start_time, we have pass 1s, so we need to cond_resched(see writeback_sb_inodes and blk_mq_do_dispatch_sched)... Thanks, Kun. https://listman.redhat.com/archives/dm-devel/2023-March/053473.html Anyway, I'm inclined to take your patch, I share your views: https://listman.redhat.com/archives/dm-devel/2023-March/053470.html Mike -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] dm crypt: initialize tasklet in crypt_io_init()
On Mon, Mar 06 2023 at 8:49P -0500, Hou Tao wrote: > From: Hou Tao > > When neither no_read_workqueue nor no_write_workqueue are enabled, > tasklet_trylock() in crypt_dec_pending() may still return false due to > an uninitialized state, and dm-crypt will do io completion in io_queue > instead of current context unnecessarily. Have you actually experienced this? > Fix it by initializing io->tasklet in crypt_io_init(). Really would rather avoid always calling tasklet_init(). But I can optimize it away with a later patch. Mike -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH] dm-zero: support discards
This patch adds discard support to dm-zero. The discards are ignored. It is useful when the user combines dm-zero with other discard-supporting targets in the same table; without dm-zero support, discards would be disabled for the whole combined device. Signed-off-by: Mikulas Patocka --- drivers/md/dm-table.c |9 - drivers/md/dm-zero.c |4 +++- 2 files changed, 11 insertions(+), 2 deletions(-) Index: linux-2.6/drivers/md/dm-zero.c === --- linux-2.6.orig/drivers/md/dm-zero.c +++ linux-2.6/drivers/md/dm-zero.c @@ -27,6 +27,7 @@ static int zero_ctr(struct dm_target *ti * Silently drop discards, avoiding -EOPNOTSUPP. */ ti->num_discard_bios = 1; + ti->discards_supported = true; return 0; } @@ -45,6 +46,7 @@ static int zero_map(struct dm_target *ti zero_fill_bio(bio); break; case REQ_OP_WRITE: + case REQ_OP_DISCARD: /* writes get silently dropped */ break; default: @@ -59,7 +61,7 @@ static int zero_map(struct dm_target *ti static struct target_type zero_target = { .name = "zero", - .version = {1, 1, 0}, + .version = {1, 2, 0}, .features = DM_TARGET_NOWAIT, .module = THIS_MODULE, .ctr= zero_ctr, Index: linux-2.6/drivers/md/dm-table.c === --- linux-2.6.orig/drivers/md/dm-table.c +++ linux-2.6/drivers/md/dm-table.c @@ -1670,8 +1670,15 @@ int dm_calculate_queue_limits(struct dm_ blk_set_stacking_limits(&ti_limits); - if (!ti->type->iterate_devices) + if (!ti->type->iterate_devices) { + if (ti->discards_supported) { + ti_limits.max_discard_sectors = UINT_MAX; + ti_limits.max_hw_discard_sectors = UINT_MAX; + ti_limits.discard_granularity = 512; + ti_limits.discard_alignment = 0; + } goto combine_limits; + } /* * Combine queue limits of all the devices this target uses. -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH] [PATCH] libmultipath: return 'ghost' state when port is in transition
On Mon, Mar 06, 2023 at 12:46:50PM +0100, Martin Wilck wrote: > Hi Brian, > > On Sat, 2023-03-04 at 12:49 -0800, Brian Bunker wrote: > > > > > The checking for standby is 14 years old, and says that TUR returns > > a unit attention when the path is in standby. I am not sure why that > > wouldn’t be handled by this code above: I would think there should be > > just one unit attention for each I_T_L when an ALUA state change > > happens.Not sure how it exceeds the retry count. > > > > if (key == 0x6) { > > /* Unit Attention, retry */ > > if (—retry_tur) > > goto retry; > > } > > > > From my perspective failing a path for ALUA state standby is > > reasonable > > since it is not an active state. > > I think the historic rationale for using GHOST is that some storage > arrays, in particular active-passive configurations, may keep certain > port groups in STANDBY state. If STANDBY was classified as FAILED, > "multipath -ll" would show all paths of such port groups as FAILED, > which would confuse users. > > That's what I meant before, multipath's GHOST can mean multiple things > depending on the actual hardware in use, explicit/implicit ALUA, etc. > Given that today basically every hardware supports ALUA, we could > probably do better. But changing the semantics in the current situation > is risky and error prone. I am sympathetic to Martin's view that GHOST is an ambiguous state, and it's not at all clear that in means "temporarily between states". In fact, it ususally doesn't. On the other hand, if we can be pretty certain that devices won't keep paths in the TRANSITIONING state for an extended time, but we can't be certain what the end state will be, I do see the rationale for not failing them preemtively. I wonder if PATH_PENDING makes more sense. We would retain the existing state until the path left the TRANSITIONING state. The question is, are you trying to make paths that are transitioning out of a failed state come back sooner, or are you trying to keep paths that were in a active state from being prevemtively failed. Using PATH_PENDING won't fix the first case, only the second. PATH_PENDING makes sure that if IO to the path does start failing, the checker won't keep setting the path back to an active state again. It also avoids the another GHOST issue, where the path would end up being grouped with any actually passive paths, which isn't what we're looking for. The one problem I can think of off the top of my head would be that if the device was held in the TRANSISTIONING state for a long time, multipathd would keep checking it constantly, since PATH_PENDING is really meant for cases where the checker hasn't completed yet, and we just want to keep looking for the result. I suppose it would be possible to add another state that worked just like pending (and could even get converted to PATH_PENDING if there was no other state to be converted to) but didn't cause us to retigger the checker so quickly. But if devices really will only be in TRANSITIONING for a short time, it might not even be an issue we have to worry about. Thoughts? -Ben > > Regards > Martin -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] dm-crypt: fix softlockup in dmcrypt_write
On Mon, 6 Mar 2023, Bart Van Assche wrote: > On 3/6/23 10:02, Mikulas Patocka wrote: > > On Tue, 28 Feb 2023, yangerkun wrote: > > > It's ok to fix the softlockup, but for async write encrypt, > > > kcryptd_crypt_write_io_submit will add bio to write_tree, and once we > > > call cond_resched before every kcryptd_io_write, the write performance > > > may be poor while we meet a high cpu usage scene. > > > > Hi > > > > To fix this problem, find the PID of the process "dmcrypt_write" and > > change its priority to -20, for example "renice -n -20 -p 34748". > > > > This is the proper way how to fix it; locking up the process for one > > second is not. > > > > We used to have high-priority workqueues by default, but it caused audio > > playback skipping, so we had to revert it - see > > f612b2132db529feac4f965f28a1b9258ea7c22b. > > > > Perhaps we should add an option to have high-priority kernel threads? > > Would calling cond_resched() every n iterations instead of every iteration > help? From mm/swapfile.c: > > if (unlikely(--latency_ration < 0)) { > cond_resched(); > latency_ration = LATENCY_LIMIT; > } > > Thanks, > > Bart. I think that if this helps, it is really a bug in the scheduler... It shouldn't switch tasks so often. Mikulas -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH] dm-crypt: add the "high_priority" flag
On Mon, 6 Mar 2023, Mikulas Patocka wrote: > > Hi, > > > > Thanks a lot for your review! > > > > It's ok to fix the softlockup, but for async write encrypt, > > kcryptd_crypt_write_io_submit will add bio to write_tree, and once we > > call cond_resched before every kcryptd_io_write, the write performance > > may be poor while we meet a high cpu usage scene. > > Hi > > To fix this problem, find the PID of the process "dmcrypt_write" and > change its priority to -20, for example "renice -n -20 -p 34748". > > This is the proper way how to fix it; locking up the process for one > second is not. > > We used to have high-priority workqueues by default, but it caused audio > playback skipping, so we had to revert it - see > f612b2132db529feac4f965f28a1b9258ea7c22b. > > Perhaps we should add an option to have high-priority kernel threads? > > Mikulas Here I'm sending a patch that makes it optional to raise the priority of dm-crypt workqueues and the write thread. Test it, if it helps for you. Mikulas From: Mikulas Patocka It was reported that dm-crypt performs badly when the system is loaded[1]. So I'm adding an option "high_priority" that sets the workqueues and the write_thread to nice level -20. This used to be the default in the past, but it caused audio skipping, so it had to be reverted - see the commit f612b2132db529feac4f965f28a1b9258ea7c22b. This commit makes it optional, so that normal users won't be harmed by it. [1] https://listman.redhat.com/archives/dm-devel/2023-February/053410.html Signed-off-by: Mikulas Patocka --- Documentation/admin-guide/device-mapper/dm-crypt.rst |5 +++ drivers/md/dm-crypt.c| 28 +-- 2 files changed, 25 insertions(+), 8 deletions(-) Index: linux-2.6/drivers/md/dm-crypt.c === --- linux-2.6.orig/drivers/md/dm-crypt.c +++ linux-2.6/drivers/md/dm-crypt.c @@ -133,9 +133,9 @@ struct iv_elephant_private { * and encrypts / decrypts at the same time. */ enum flags { DM_CRYPT_SUSPENDED, DM_CRYPT_KEY_VALID, -DM_CRYPT_SAME_CPU, DM_CRYPT_NO_OFFLOAD, -DM_CRYPT_NO_READ_WORKQUEUE, DM_CRYPT_NO_WRITE_WORKQUEUE, -DM_CRYPT_WRITE_INLINE }; +DM_CRYPT_SAME_CPU, DM_CRYPT_HIGH_PRIORITY, +DM_CRYPT_NO_OFFLOAD, DM_CRYPT_NO_READ_WORKQUEUE, +DM_CRYPT_NO_WRITE_WORKQUEUE, DM_CRYPT_WRITE_INLINE }; enum cipher_flags { CRYPT_MODE_INTEGRITY_AEAD, /* Use authenticated mode for cipher */ @@ -3087,7 +3087,7 @@ static int crypt_ctr_optional(struct dm_ struct crypt_config *cc = ti->private; struct dm_arg_set as; static const struct dm_arg _args[] = { - {0, 8, "Invalid number of feature args"}, + {0, 9, "Invalid number of feature args"}, }; unsigned int opt_params, val; const char *opt_string, *sval; @@ -3114,6 +3114,8 @@ static int crypt_ctr_optional(struct dm_ else if (!strcasecmp(opt_string, "same_cpu_crypt")) set_bit(DM_CRYPT_SAME_CPU, &cc->flags); + else if (!strcasecmp(opt_string, "high_priority")) + set_bit(DM_CRYPT_HIGH_PRIORITY, &cc->flags); else if (!strcasecmp(opt_string, "submit_from_crypt_cpus")) set_bit(DM_CRYPT_NO_OFFLOAD, &cc->flags); @@ -3352,18 +3354,22 @@ static int crypt_ctr(struct dm_target *t } ret = -ENOMEM; - cc->io_queue = alloc_workqueue("kcryptd_io/%s", WQ_MEM_RECLAIM, 1, devname); + cc->io_queue = alloc_workqueue("kcryptd_io/%s", WQ_MEM_RECLAIM | + test_bit(DM_CRYPT_HIGH_PRIORITY, &cc->flags) * WQ_HIGHPRI, + 1, devname); if (!cc->io_queue) { ti->error = "Couldn't create kcryptd io queue"; goto bad; } if (test_bit(DM_CRYPT_SAME_CPU, &cc->flags)) - cc->crypt_queue = alloc_workqueue("kcryptd/%s", WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM, + cc->crypt_queue = alloc_workqueue("kcryptd/%s", WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM | + test_bit(DM_CRYPT_HIGH_PRIORITY, &cc->flags) * WQ_HIGHPRI, 1, devname); else cc->crypt_queue = alloc_workqueue("kcryptd/%s", - WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM | WQ_UNBOUND, + WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM | WQ_UNBOUND | + test_bit(DM_CRYPT_HIGH_PRIORITY, &cc->flags) * WQ_HIGHPRI, num_online_cpus(), devname); if (!cc->crypt_queue) { ti->error = "Couldn't create kcryptd queue"; @@ -3380,6 +3386,8 @@ static int c
Re: [dm-devel] dm-crypt: fix softlockup in dmcrypt_write
On 3/6/23 10:02, Mikulas Patocka wrote: On Tue, 28 Feb 2023, yangerkun wrote: It's ok to fix the softlockup, but for async write encrypt, kcryptd_crypt_write_io_submit will add bio to write_tree, and once we call cond_resched before every kcryptd_io_write, the write performance may be poor while we meet a high cpu usage scene. Hi To fix this problem, find the PID of the process "dmcrypt_write" and change its priority to -20, for example "renice -n -20 -p 34748". This is the proper way how to fix it; locking up the process for one second is not. We used to have high-priority workqueues by default, but it caused audio playback skipping, so we had to revert it - see f612b2132db529feac4f965f28a1b9258ea7c22b. Perhaps we should add an option to have high-priority kernel threads? Would calling cond_resched() every n iterations instead of every iteration help? From mm/swapfile.c: if (unlikely(--latency_ration < 0)) { cond_resched(); latency_ration = LATENCY_LIMIT; } Thanks, Bart. -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] dm-crypt: fix softlockup in dmcrypt_write
On Tue, 28 Feb 2023, yangerkun wrote: > > > 在 2023/2/28 2:06, Mike Snitzer 写道: > > On Mon, Feb 27 2023 at 1:03P -0500, > > Mike Snitzer wrote: > > > >> On Mon, Feb 27 2023 at 12:55P -0500, > >> Mike Snitzer wrote: > >> > >>> On Sun, Feb 26 2023 at 8:31P -0500, > >>> yangerkun wrote: > >>> > > > 在 2023/2/26 10:01, Bart Van Assche 写道: > > On 2/22/23 19:19, yangerkun wrote: > >> @@ -1924,6 +1926,10 @@ static int dmcrypt_write(void *data) > >> BUG_ON(rb_parent(write_tree.rb_node)); > >> + if (time_is_before_jiffies(start_time + HZ)) { > >> + schedule(); > >> + start_time = jiffies; > >> + } > > > > Why schedule() instead of cond_resched()? > > cond_resched may not really schedule, which may trigger the problem too, > but > it seems after 1 second, it may never happend? > >>> > >>> I had the same question as Bart when reviewing your homegrown > >>> conditional schedule(). Hopefully you can reproduce this issue? If > >>> so, please see if simply using cond_resched() fixes the issue. > >> > >> This seems like a more appropriate patch: > >> > >> diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c > >> index 87c5706131f2..faba1be572f9 100644 > >> --- a/drivers/md/dm-crypt.c > >> +++ b/drivers/md/dm-crypt.c > >> @@ -1937,6 +1937,7 @@ static int dmcrypt_write(void *data) > >>io = crypt_io_from_node(rb_first(&write_tree)); > >>rb_erase(&io->rb_node, &write_tree); > >>kcryptd_io_write(io); > >> + cond_resched(); > >>} while (!RB_EMPTY_ROOT(&write_tree)); > >>blk_finish_plug(&plug); > >>} > > > > > > or: > > > > diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c > > index 87c5706131f2..3ba2fd3e4358 100644 > > --- a/drivers/md/dm-crypt.c > > +++ b/drivers/md/dm-crypt.c > > @@ -1934,6 +1934,7 @@ static int dmcrypt_write(void *data) > > */ > > blk_start_plug(&plug); > > do { > > + cond_resched(); > > io = crypt_io_from_node(rb_first(&write_tree)); > > rb_erase(&io->rb_node, &write_tree); > > kcryptd_io_write(io); > > Hi, > > Thanks a lot for your review! > > It's ok to fix the softlockup, but for async write encrypt, > kcryptd_crypt_write_io_submit will add bio to write_tree, and once we > call cond_resched before every kcryptd_io_write, the write performance > may be poor while we meet a high cpu usage scene. Hi To fix this problem, find the PID of the process "dmcrypt_write" and change its priority to -20, for example "renice -n -20 -p 34748". This is the proper way how to fix it; locking up the process for one second is not. We used to have high-priority workqueues by default, but it caused audio playback skipping, so we had to revert it - see f612b2132db529feac4f965f28a1b9258ea7c22b. Perhaps we should add an option to have high-priority kernel threads? Mikulas > kcryptd_crypt_write_io_submit will wakeup write_thread once there is a > empty write_tree, and dmcrypt_write will peel the old write_tree to > submit bio, so there can not exist too many bio in write_tree. Then I > choose yield cpu before the 'while' that submit bio... > > Thanks, > Kun. > > -- > dm-devel mailing list > dm-devel@redhat.com > https://listman.redhat.com/mailman/listinfo/dm-devel > -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH] [PATCH] libmultipath: return 'ghost' state when port is in transition
On Tue, Feb 21, 2023 at 12:56:43PM -0800, Brian Bunker wrote: > A test unit ready command sent on a path in standby state will not result > in a failed path. The same should be true for a path in the > transitioning state. > > Signed-off-by: Brian Bunker br...@purestorage.com Reviewed-by: Benjamin Marzinski > --- > libmultipath/checkers/tur.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/libmultipath/checkers/tur.c b/libmultipath/checkers/tur.c > index 551dc4f0..fff7987b 100644 > --- a/libmultipath/checkers/tur.c > +++ b/libmultipath/checkers/tur.c > @@ -179,10 +179,11 @@ retry: > else if (key == 0x2) { > /* Not Ready */ > /* Note: Other ALUA states are either UP or DOWN */ > - if( asc == 0x04 && ascq == 0x0b){ > + if (asc == 0x04 && > + (ascq == 0x0b || ascq == 0x0a)) { > /* > * LOGICAL UNIT NOT ACCESSIBLE, > -* TARGET PORT IN STANDBY STATE > +* TARGET PORT IN STANDBY OR TRANSITION STATE > */ > *msgid = CHECKER_MSGID_GHOST; > return PATH_GHOST; > -- > > -- > dm-devel mailing list > dm-devel@redhat.com > https://listman.redhat.com/mailman/listinfo/dm-devel -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] dm-crypt: add cond_resched() to dmcrypt_write()
On Mon, Mar 06 2023 at 11:17P -0500, Mikulas Patocka wrote: > The loop in dmcrypt_write may be running for unbounded amount of time, > thus we need cond_resched() in it. > > This commit fixes the following warning: > > [ 3391.153255][ C12] watchdog: BUG: soft lockup - CPU#12 stuck for 23s! > [dmcrypt_write/2:2897] > ... > [ 3391.258372][ C12] CPU: 12 PID: 2897 Comm: dmcrypt_write/2 Tainted: G > L5.10.0+ #8 > [ 3391.267288][ C12] Hardware name: Huawei TaiShan 2280 V2/BC82AMDDA, BIOS > 1.75 04/26/2021 > [ 3391.275428][ C12] pstate: 8049 (Nzcv daif +PAN -UAO -TCO BTYPE=--) > [ 3391.282102][ C12] pc : blk_attempt_bio_merge.part.6+0x38/0x158 > [ 3391.288080][ C12] lr : blk_attempt_plug_merge+0xc0/0x1b0 > [ 3391.293540][ C12] sp : 8000718abc30 > [ 3391.297530][ C12] x29: 8000718abc30 x28: > [ 3391.303509][ C12] x27: 2020361d9ac0 x26: 0001 > [ 3391.309488][ C12] x25: 0001 x24: 8000718abd08 > [ 3391.315467][ C12] x23: 0020a10dbb00 x22: 0001 > [ 3391.321445][ C12] x21: 8000718abe20 x20: 0020a10dbb00 > [ 3391.327424][ C12] x19: 0020aed7b040 x18: > [ 3391.333402][ C12] x17: x16: fde0 > [ 3391.339381][ C12] x15: 1000 x14: > [ 3391.345359][ C12] x13: 2000 x12: 2060011f9070 > [ 3391.351338][ C12] x11: 0001 x10: 0002 > [ 3391.357316][ C12] x9 : 800010586c38 x8 : 202009a7f200 > [ 3391.363294][ C12] x7 : 8000718abd00 x6 : 20200df33750 > [ 3391.369274][ C12] x5 : 0001 x4 : > [ 3391.375252][ C12] x3 : 0001 x2 : 0020a10dbb00 > [ 3391.381230][ C12] x1 : 0020aed7b040 x0 : 1000 > [ 3391.387210][ C12] Call trace: > [ 3391.390338][ C12] blk_attempt_bio_merge.part.6+0x38/0x158 > [ 3391.395970][ C12] blk_attempt_plug_merge+0xc0/0x1b0 > [ 3391.401085][ C12] blk_mq_submit_bio+0x398/0x550 > [ 3391.405856][ C12] submit_bio_noacct+0x308/0x380 > [ 3391.410630][ C12] dmcrypt_write+0x1e4/0x208 [dm_crypt] > [ 3391.416005][ C12] kthread+0x130/0x138 > [ 3391.419911][ C12] ret_from_fork+0x10/0x18 > > Signed-off-by: Mikulas Patocka > Reported-by: yangerkun > Fixes: dc2676210c42 ("dm crypt: offload writes to thread") > Cc: sta...@vger.kernel.org > > --- > drivers/md/dm-crypt.c |1 + > 1 file changed, 1 insertion(+) > > Index: linux-2.6/drivers/md/dm-crypt.c > === > --- linux-2.6.orig/drivers/md/dm-crypt.c > +++ linux-2.6/drivers/md/dm-crypt.c > @@ -1954,6 +1954,7 @@ pop_from_list: > io = crypt_io_from_node(rb_first(&write_tree)); > rb_erase(&io->rb_node, &write_tree); > kcryptd_io_write(io); > + cond_resched(); > } while (!RB_EMPTY_ROOT(&write_tree)); > blk_finish_plug(&plug); > } > I suggested the same here: https://listman.redhat.com/archives/dm-devel/2023-February/053402.html Yet yangerkun resisted it: https://listman.redhat.com/archives/dm-devel/2023-February/053410.html And stuck with their more throttled approach (which looks wrong to me, shouldn't it be using time_is_after_jiffies!?) https://listman.redhat.com/archives/dm-devel/2023-March/053473.html Anyway, I'm inclined to take your patch, I share your views: https://listman.redhat.com/archives/dm-devel/2023-March/053470.html Mike -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH v2] dm-crypt: fix softlockup in dmcrypt_write
From: yangerkun We meet a softlockup: localhost login: [ 3391.153255][ C12] watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [dmcrypt_write/2:2897] ... [ 3391.258372][ C12] CPU: 12 PID: 2897 Comm: dmcrypt_write/2 Tainted: G L5.10.0+ #8 [ 3391.267288][ C12] Hardware name: Huawei TaiShan 2280 V2/BC82AMDDA, BIOS 1.75 04/26/2021 [ 3391.275428][ C12] pstate: 8049 (Nzcv daif +PAN -UAO -TCO BTYPE=--) [ 3391.282102][ C12] pc : blk_attempt_bio_merge.part.6+0x38/0x158 [ 3391.288080][ C12] lr : blk_attempt_plug_merge+0xc0/0x1b0 [ 3391.293540][ C12] sp : 8000718abc30 [ 3391.297530][ C12] x29: 8000718abc30 x28: [ 3391.303509][ C12] x27: 2020361d9ac0 x26: 0001 [ 3391.309488][ C12] x25: 0001 x24: 8000718abd08 [ 3391.315467][ C12] x23: 0020a10dbb00 x22: 0001 [ 3391.321445][ C12] x21: 8000718abe20 x20: 0020a10dbb00 [ 3391.327424][ C12] x19: 0020aed7b040 x18: [ 3391.333402][ C12] x17: x16: fde0 [ 3391.339381][ C12] x15: 1000 x14: [ 3391.345359][ C12] x13: 2000 x12: 2060011f9070 [ 3391.351338][ C12] x11: 0001 x10: 0002 [ 3391.357316][ C12] x9 : 800010586c38 x8 : 202009a7f200 [ 3391.363294][ C12] x7 : 8000718abd00 x6 : 20200df33750 [ 3391.369274][ C12] x5 : 0001 x4 : [ 3391.375252][ C12] x3 : 0001 x2 : 0020a10dbb00 [ 3391.381230][ C12] x1 : 0020aed7b040 x0 : 1000 [ 3391.387210][ C12] Call trace: [ 3391.390338][ C12] blk_attempt_bio_merge.part.6+0x38/0x158 [ 3391.395970][ C12] blk_attempt_plug_merge+0xc0/0x1b0 [ 3391.401085][ C12] blk_mq_submit_bio+0x398/0x550 [ 3391.405856][ C12] submit_bio_noacct+0x308/0x380 [ 3391.410630][ C12] dmcrypt_write+0x1e4/0x208 [dm_crypt] [ 3391.416005][ C12] kthread+0x130/0x138 [ 3391.419911][ C12] ret_from_fork+0x10/0x18 dmcrypt_write will be wakeup once there is a not empty write_tree, and before really submit_bio, the old write_tree will be peeled, so there may not so much bio, but every time we check RB_EMPTY_ROOT, it may be false since the heavy load write, so the softlockup can trigger. We can add cond_resched every time before or after kcryptd_io_write, but dmcrypt_write seems performance sensitive since it will submit all pending write, so we choose cond_resched after 1s before the 'while' submit bio. Fixes: dc2676210c42 ("dm crypt: offload writes to thread") Signed-off-by: yangerkun --- drivers/md/dm-crypt.c | 6 ++ 1 file changed, 6 insertions(+) v1->v2: change schedule() to cond_resched() diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c index 40cb1719ae4d..6a675257e00e 100644 --- a/drivers/md/dm-crypt.c +++ b/drivers/md/dm-crypt.c @@ -1894,6 +1894,7 @@ static int dmcrypt_write(void *data) { struct crypt_config *cc = data; struct dm_crypt_io *io; + unsigned long start_time = jiffies; while (1) { struct rb_root write_tree; @@ -1916,6 +1917,7 @@ static int dmcrypt_write(void *data) schedule(); + start_time = jiffies; set_current_state(TASK_RUNNING); spin_lock_irq(&cc->write_thread_lock); goto continue_locked; @@ -1927,6 +1929,10 @@ static int dmcrypt_write(void *data) BUG_ON(rb_parent(write_tree.rb_node)); + if (time_is_before_jiffies(start_time + HZ)) { + cond_resched(); + start_time = jiffies; + } /* * Note: we cannot walk the tree here with rb_next because * the structures may be freed when kcryptd_io_write is called. -- 2.31.1 -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH] dm crypt: initialize tasklet in crypt_io_init()
From: Hou Tao When neither no_read_workqueue nor no_write_workqueue are enabled, tasklet_trylock() in crypt_dec_pending() may still return false due to an uninitialized state, and dm-crypt will do io completion in io_queue instead of current context unnecessarily. Fix it by initializing io->tasklet in crypt_io_init(). Fixes: 8e14f610159d ("dm crypt: do not call bio_endio() from the dm-crypt tasklet") Signed-off-by: Hou Tao --- drivers/md/dm-crypt.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c index 3aeeb8f2802f..caee6ce3b79f 100644 --- a/drivers/md/dm-crypt.c +++ b/drivers/md/dm-crypt.c @@ -238,6 +238,7 @@ static void crypt_endio(struct bio *clone); static void kcryptd_queue_crypt(struct dm_crypt_io *io); static struct scatterlist *crypt_get_sg_data(struct crypt_config *cc, struct scatterlist *sg); +static void kcryptd_crypt_tasklet(unsigned long work); static bool crypt_integrity_aead(struct crypt_config *cc); @@ -1725,6 +1726,7 @@ static void crypt_io_init(struct dm_crypt_io *io, struct crypt_config *cc, io->sector = sector; io->error = 0; io->ctx.r.req = NULL; + tasklet_init(&io->tasklet, kcryptd_crypt_tasklet, (unsigned long)&io->work); io->integrity_metadata = NULL; io->integrity_metadata_from_pool = false; atomic_set(&io->io_pending, 0); @@ -2226,7 +2228,6 @@ static void kcryptd_queue_crypt(struct dm_crypt_io *io) * it is being executed with irqs disabled. */ if (in_hardirq() || irqs_disabled()) { - tasklet_init(&io->tasklet, kcryptd_crypt_tasklet, (unsigned long)&io->work); tasklet_schedule(&io->tasklet); return; } -- 2.29.2 -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH] dm-crypt: add cond_resched() to dmcrypt_write()
The loop in dmcrypt_write may be running for unbounded amount of time, thus we need cond_resched() in it. This commit fixes the following warning: [ 3391.153255][ C12] watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [dmcrypt_write/2:2897] ... [ 3391.258372][ C12] CPU: 12 PID: 2897 Comm: dmcrypt_write/2 Tainted: G L5.10.0+ #8 [ 3391.267288][ C12] Hardware name: Huawei TaiShan 2280 V2/BC82AMDDA, BIOS 1.75 04/26/2021 [ 3391.275428][ C12] pstate: 8049 (Nzcv daif +PAN -UAO -TCO BTYPE=--) [ 3391.282102][ C12] pc : blk_attempt_bio_merge.part.6+0x38/0x158 [ 3391.288080][ C12] lr : blk_attempt_plug_merge+0xc0/0x1b0 [ 3391.293540][ C12] sp : 8000718abc30 [ 3391.297530][ C12] x29: 8000718abc30 x28: [ 3391.303509][ C12] x27: 2020361d9ac0 x26: 0001 [ 3391.309488][ C12] x25: 0001 x24: 8000718abd08 [ 3391.315467][ C12] x23: 0020a10dbb00 x22: 0001 [ 3391.321445][ C12] x21: 8000718abe20 x20: 0020a10dbb00 [ 3391.327424][ C12] x19: 0020aed7b040 x18: [ 3391.333402][ C12] x17: x16: fde0 [ 3391.339381][ C12] x15: 1000 x14: [ 3391.345359][ C12] x13: 2000 x12: 2060011f9070 [ 3391.351338][ C12] x11: 0001 x10: 0002 [ 3391.357316][ C12] x9 : 800010586c38 x8 : 202009a7f200 [ 3391.363294][ C12] x7 : 8000718abd00 x6 : 20200df33750 [ 3391.369274][ C12] x5 : 0001 x4 : [ 3391.375252][ C12] x3 : 0001 x2 : 0020a10dbb00 [ 3391.381230][ C12] x1 : 0020aed7b040 x0 : 1000 [ 3391.387210][ C12] Call trace: [ 3391.390338][ C12] blk_attempt_bio_merge.part.6+0x38/0x158 [ 3391.395970][ C12] blk_attempt_plug_merge+0xc0/0x1b0 [ 3391.401085][ C12] blk_mq_submit_bio+0x398/0x550 [ 3391.405856][ C12] submit_bio_noacct+0x308/0x380 [ 3391.410630][ C12] dmcrypt_write+0x1e4/0x208 [dm_crypt] [ 3391.416005][ C12] kthread+0x130/0x138 [ 3391.419911][ C12] ret_from_fork+0x10/0x18 Signed-off-by: Mikulas Patocka Reported-by: yangerkun Fixes: dc2676210c42 ("dm crypt: offload writes to thread") Cc: sta...@vger.kernel.org --- drivers/md/dm-crypt.c |1 + 1 file changed, 1 insertion(+) Index: linux-2.6/drivers/md/dm-crypt.c === --- linux-2.6.orig/drivers/md/dm-crypt.c +++ linux-2.6/drivers/md/dm-crypt.c @@ -1954,6 +1954,7 @@ pop_from_list: io = crypt_io_from_node(rb_first(&write_tree)); rb_erase(&io->rb_node, &write_tree); kcryptd_io_write(io); + cond_resched(); } while (!RB_EMPTY_ROOT(&write_tree)); blk_finish_plug(&plug); } -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH v2] dm-crypt: fix softlockup in dmcrypt_write
On Mon, 6 Mar 2023, yangerkun wrote: > From: yangerkun > > We meet a softlockup: > > localhost login: [ 3391.153255][ C12] watchdog: BUG: soft lockup - CPU#12 > stuck for 23s! [dmcrypt_write/2:2897] > ... > [ 3391.258372][ C12] CPU: 12 PID: 2897 Comm: dmcrypt_write/2 Tainted: G > L5.10.0+ #8 > [ 3391.267288][ C12] Hardware name: Huawei TaiShan 2280 V2/BC82AMDDA, BIOS > 1.75 04/26/2021 > [ 3391.275428][ C12] pstate: 8049 (Nzcv daif +PAN -UAO -TCO BTYPE=--) > [ 3391.282102][ C12] pc : blk_attempt_bio_merge.part.6+0x38/0x158 > [ 3391.288080][ C12] lr : blk_attempt_plug_merge+0xc0/0x1b0 > [ 3391.293540][ C12] sp : 8000718abc30 > [ 3391.297530][ C12] x29: 8000718abc30 x28: > [ 3391.303509][ C12] x27: 2020361d9ac0 x26: 0001 > [ 3391.309488][ C12] x25: 0001 x24: 8000718abd08 > [ 3391.315467][ C12] x23: 0020a10dbb00 x22: 0001 > [ 3391.321445][ C12] x21: 8000718abe20 x20: 0020a10dbb00 > [ 3391.327424][ C12] x19: 0020aed7b040 x18: > [ 3391.333402][ C12] x17: x16: fde0 > [ 3391.339381][ C12] x15: 1000 x14: > [ 3391.345359][ C12] x13: 2000 x12: 2060011f9070 > [ 3391.351338][ C12] x11: 0001 x10: 0002 > [ 3391.357316][ C12] x9 : 800010586c38 x8 : 202009a7f200 > [ 3391.363294][ C12] x7 : 8000718abd00 x6 : 20200df33750 > [ 3391.369274][ C12] x5 : 0001 x4 : > [ 3391.375252][ C12] x3 : 0001 x2 : 0020a10dbb00 > [ 3391.381230][ C12] x1 : 0020aed7b040 x0 : 1000 > [ 3391.387210][ C12] Call trace: > [ 3391.390338][ C12] blk_attempt_bio_merge.part.6+0x38/0x158 > [ 3391.395970][ C12] blk_attempt_plug_merge+0xc0/0x1b0 > [ 3391.401085][ C12] blk_mq_submit_bio+0x398/0x550 > [ 3391.405856][ C12] submit_bio_noacct+0x308/0x380 > [ 3391.410630][ C12] dmcrypt_write+0x1e4/0x208 [dm_crypt] > [ 3391.416005][ C12] kthread+0x130/0x138 > [ 3391.419911][ C12] ret_from_fork+0x10/0x18 > > dmcrypt_write will be wakeup once there is a not empty write_tree, and > before really submit_bio, the old write_tree will be peeled, so there > may not so much bio, but every time we check RB_EMPTY_ROOT, it may be > false since the heavy load write, so the softlockup can trigger. > > We can add cond_resched every time before or after kcryptd_io_write, but > dmcrypt_write seems performance sensitive since it will submit all > pending write, so we choose cond_resched after 1s before the 'while' > submit bio. > > Fixes: dc2676210c42 ("dm crypt: offload writes to thread") > Signed-off-by: yangerkun Hi Thanks for the bug report. I think that we don't have to count time and call cond_resched() only after being stuck for one second, we should call it always. There is no performance problem when calling cond_resched() - it does nothing if the scheduler hasn't decided that scheduling is needed. cond_resched() should be called in the do-while loop because this loop can run for unbounded amount of time too. Mikulas -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH] [PATCH] libmultipath: return 'ghost' state when port is in transition
Hi Brian, On Sat, 2023-03-04 at 12:49 -0800, Brian Bunker wrote: > > > The checking for standby is 14 years old, and says that TUR returns > a unit attention when the path is in standby. I am not sure why that > wouldn’t be handled by this code above: I would think there should be > just one unit attention for each I_T_L when an ALUA state change > happens.Not sure how it exceeds the retry count. > > if (key == 0x6) { > /* Unit Attention, retry */ > if (—retry_tur) > goto retry; > } > > From my perspective failing a path for ALUA state standby is > reasonable > since it is not an active state. I think the historic rationale for using GHOST is that some storage arrays, in particular active-passive configurations, may keep certain port groups in STANDBY state. If STANDBY was classified as FAILED, "multipath -ll" would show all paths of such port groups as FAILED, which would confuse users. That's what I meant before, multipath's GHOST can mean multiple things depending on the actual hardware in use, explicit/implicit ALUA, etc. Given that today basically every hardware supports ALUA, we could probably do better. But changing the semantics in the current situation is risky and error prone. Regards Martin -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] Thin pool CoW latency
On Sun, Mar 5, 2023 at 8:40 PM Demi Marie Obenour < d...@invisiblethingslab.com> wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA512 > > Like Eric, I am very concerned about CoW latency and throughput. I > am almost certain that allocating new blocks and snapshot copy-on-write > are _the_ hot paths in Qubes OS. In particular, I suspect that > workloads such as building an image in a throwaway VM or installing > packages onto a root volume that had just been shapshotted are dominated > by metadata operations, rather than by in-place updates. I suspect that > frequently-snapshotted volumes will observe similar behavior in general. > > Yes, provisioning and breaking sharing are relatively slow operations. As discussed with Eric I'm not intending to change how either of these operations is implemented. If the performance profile is not suitable for your application your company can either do some work to improve it yourselves, or select a different solution. - Joe -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [announce] thin-provisioning-tools v1.0.0-rc1
Hi Eric, On Fri, Mar 3, 2023 at 9:21 PM Eric Wheeler wrote: > > It would be nice to get people testing the new improvements: > > Do you think it can make it for the 6.3 merge window that is open? > Doubtful. The bulk of the changes are in dm-bufio, which is used by a lot of targets. So I want to do a lot more testing before pushing upstream. > Did thinp v2 get dropped, or just turn into the patchset above? > > I've shelved thinp v2 in favour of userland approach I outlined. > I think io_uring, and ublk have shown us that this is viable. That way > > a snapshot copy-on-write, or dm-cache data migration, which are very > > slow operations can be done with ordinary userland code. > > Would be nice to minimize CoW latency somehow if going to userspace > increases that a notable amount. CoW for spinning disks is definitely > slow, but NVMe's are pretty quick to copy a 64k chunk. > Yes, minimising latency would be good. I don't mind performing the CoW within kernel, but I don't want to handle the metadata updates in kernel. > > For the fast paths, layering will be removed by having userland give > > the kernel instruction to execute for specific regions of the virtual > > device (ie. remap to here). > > Maybe you just answered my question of latency? > Yep. > > > The kernel driver will have nothing specific to thin/cache etc. I'm not > > sure how many of the current dm-targets would fit into this model, but > > I'm sure thin provisioning, caching, linear, and stripe can. > > To be clear, linear and stripe would stay in the kernel? Linear and stripe would not need a call out to userland to service. And very little of thin/cache io would either. - Joe -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel