Re: [dm-devel] [PATCH v4 13/18] nvme: Add pr_ops read_reservation support

2023-03-06 Thread Mike Christie
On 2/24/23 3:04 PM, kernel test robot wrote:
> 
>drivers/nvme/host/pr.c: In function 'block_pr_type_from_nvme':
>>> drivers/nvme/host/pr.c:43:24: warning: implicit conversion from 'enum 
>>> nvme_pr_type' to 'enum pr_type' [-Wenum-conversion]
>   43 | return NVME_PR_EXCLUSIVE_ACCESS_ALL_REGS;
>  |^~~
Should have been PR_EXCLUSIVE_ACCESS_ALL_REGS. Will fix.

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel



Re: [dm-devel] [PATCH v4 10/18] nvme: Add helper to send pr command

2023-03-06 Thread Mike Christie
On 3/5/23 3:28 PM, Chaitanya Kulkarni wrote:
>>   
>> +static int nvme_send_pr_command(struct block_device *bdev,
>> +struct nvme_command *c, void *data, unsigned int data_len)
>> +{
>> +if (IS_ENABLED(CONFIG_NVME_MULTIPATH) &&
>> +bdev->bd_disk->fops == &nvme_ns_head_ops)
>> +return nvme_send_ns_head_pr_command(bdev, c, data, data_len);
> below else is not needed after above return..


Will fix. Thanks.

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel



Re: [dm-devel] dm crypt: initialize tasklet in crypt_io_init()

2023-03-06 Thread Hou Tao
Hi,

On 3/7/2023 3:31 AM, Mike Snitzer wrote:
> On Mon, Mar 06 2023 at  8:49P -0500,
> Hou Tao  wrote:
>
>> From: Hou Tao 
>>
>> When neither no_read_workqueue nor no_write_workqueue are enabled,
>> tasklet_trylock() in crypt_dec_pending() may still return false due to
>> an uninitialized state, and dm-crypt will do io completion in io_queue
>> instead of current context unnecessarily.
> Have you actually experienced this?
Yes. I had written a bpftrace script to check the completion context of
blkdev_bio_end_io_simple() when doing direct io read on dm-crypt device. The
expected context should be unbound workers of crypt_queue, but sometimes the
context is the bound worker of io_queue.
>
>> Fix it by initializing io->tasklet in crypt_io_init().
> Really would rather avoid always calling tasklet_init(). But I can
> optimize it away with a later patch.
My first though was "io->tasklet.state = 0", but it may be fragile because it
operated on the internal status of tasklet, so I switch to tasklet_init().
>
> Mike

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel



Re: [dm-devel] [PATCH] dm-crypt: add the "high_priority" flag

2023-03-06 Thread yangerkun



在 2023/3/7 3:01, Mikulas Patocka 写道:



On Mon, 6 Mar 2023, Mikulas Patocka wrote:


Hi,

Thanks a lot for your review!

It's ok to fix the softlockup, but for async write encrypt,
kcryptd_crypt_write_io_submit will add bio to write_tree, and once we
call cond_resched before every kcryptd_io_write, the write performance
may be poor while we meet a high cpu usage scene.


Hi

To fix this problem, find the PID of the process "dmcrypt_write" and
change its priority to -20, for example "renice -n -20 -p 34748".

This is the proper way how to fix it; locking up the process for one
second is not.

We used to have high-priority workqueues by default, but it caused audio
playback skipping, so we had to revert it - see
f612b2132db529feac4f965f28a1b9258ea7c22b.

Perhaps we should add an option to have high-priority kernel threads?

Mikulas


Here I'm sending a patch that makes it optional to raise the priority
of dm-crypt workqueues and the write thread. Test it, if it helps for you.


Thanks, will test it!


Mikulas



From: Mikulas Patocka 

It was reported that dm-crypt performs badly when the system is loaded[1].
So I'm adding an option "high_priority" that sets the workqueues and the
write_thread to nice level -20. This used to be the default in the past,
but it caused audio skipping, so it had to be reverted - see the commit
f612b2132db529feac4f965f28a1b9258ea7c22b. This commit makes it optional,
so that normal users won't be harmed by it.

[1] https://listman.redhat.com/archives/dm-devel/2023-February/053410.html

Signed-off-by: Mikulas Patocka 

---
  Documentation/admin-guide/device-mapper/dm-crypt.rst |5 +++
  drivers/md/dm-crypt.c|   28 
+--
  2 files changed, 25 insertions(+), 8 deletions(-)

Index: linux-2.6/drivers/md/dm-crypt.c
===
--- linux-2.6.orig/drivers/md/dm-crypt.c
+++ linux-2.6/drivers/md/dm-crypt.c
@@ -133,9 +133,9 @@ struct iv_elephant_private {
   * and encrypts / decrypts at the same time.
   */
  enum flags { DM_CRYPT_SUSPENDED, DM_CRYPT_KEY_VALID,
-DM_CRYPT_SAME_CPU, DM_CRYPT_NO_OFFLOAD,
-DM_CRYPT_NO_READ_WORKQUEUE, DM_CRYPT_NO_WRITE_WORKQUEUE,
-DM_CRYPT_WRITE_INLINE };
+DM_CRYPT_SAME_CPU, DM_CRYPT_HIGH_PRIORITY,
+DM_CRYPT_NO_OFFLOAD, DM_CRYPT_NO_READ_WORKQUEUE,
+DM_CRYPT_NO_WRITE_WORKQUEUE, DM_CRYPT_WRITE_INLINE };
  
  enum cipher_flags {

CRYPT_MODE_INTEGRITY_AEAD,  /* Use authenticated mode for cipher */
@@ -3087,7 +3087,7 @@ static int crypt_ctr_optional(struct dm_
struct crypt_config *cc = ti->private;
struct dm_arg_set as;
static const struct dm_arg _args[] = {
-   {0, 8, "Invalid number of feature args"},
+   {0, 9, "Invalid number of feature args"},
};
unsigned int opt_params, val;
const char *opt_string, *sval;
@@ -3114,6 +3114,8 @@ static int crypt_ctr_optional(struct dm_
  
  		else if (!strcasecmp(opt_string, "same_cpu_crypt"))

set_bit(DM_CRYPT_SAME_CPU, &cc->flags);
+   else if (!strcasecmp(opt_string, "high_priority"))
+   set_bit(DM_CRYPT_HIGH_PRIORITY, &cc->flags);
  
  		else if (!strcasecmp(opt_string, "submit_from_crypt_cpus"))

set_bit(DM_CRYPT_NO_OFFLOAD, &cc->flags);
@@ -3352,18 +3354,22 @@ static int crypt_ctr(struct dm_target *t
}
  
  	ret = -ENOMEM;

-   cc->io_queue = alloc_workqueue("kcryptd_io/%s", WQ_MEM_RECLAIM, 1, 
devname);
+   cc->io_queue = alloc_workqueue("kcryptd_io/%s", WQ_MEM_RECLAIM |
+  test_bit(DM_CRYPT_HIGH_PRIORITY, 
&cc->flags) * WQ_HIGHPRI,
+  1, devname);
if (!cc->io_queue) {
ti->error = "Couldn't create kcryptd io queue";
goto bad;
}
  
  	if (test_bit(DM_CRYPT_SAME_CPU, &cc->flags))

-   cc->crypt_queue = alloc_workqueue("kcryptd/%s", 
WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM,
+   cc->crypt_queue = alloc_workqueue("kcryptd/%s", 
WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM |
+ test_bit(DM_CRYPT_HIGH_PRIORITY, 
&cc->flags) * WQ_HIGHPRI,
  1, devname);
else
cc->crypt_queue = alloc_workqueue("kcryptd/%s",
- WQ_CPU_INTENSIVE | 
WQ_MEM_RECLAIM | WQ_UNBOUND,
+ WQ_CPU_INTENSIVE | 
WQ_MEM_RECLAIM | WQ_UNBOUND |
+ test_bit(DM_CRYPT_HIGH_PRIORITY, 
&cc->flags) * WQ_HIGHPRI,
  num_online_cpus(), devname);
if (!cc->crypt_queue) {
ti->error = "Couldn't create kcryptd queue";
@@ -3380,6 +3386,8 @@ static int crypt_ctr(struct dm_target

Re: [dm-devel] dm-crypt: add cond_resched() to dmcrypt_write()

2023-03-06 Thread yangerkun



在 2023/3/7 0:33, Mike Snitzer 写道:

On Mon, Mar 06 2023 at 11:17P -0500,
Mikulas Patocka  wrote:


The loop in dmcrypt_write may be running for unbounded amount of time,
thus we need cond_resched() in it.

This commit fixes the following warning:

[ 3391.153255][   C12] watchdog: BUG: soft lockup - CPU#12 stuck for 23s! 
[dmcrypt_write/2:2897]
...
[ 3391.258372][   C12] CPU: 12 PID: 2897 Comm: dmcrypt_write/2 Tainted: G   
  L5.10.0+ #8
[ 3391.267288][   C12] Hardware name: Huawei TaiShan 2280 V2/BC82AMDDA, BIOS 
1.75 04/26/2021
[ 3391.275428][   C12] pstate: 8049 (Nzcv daif +PAN -UAO -TCO BTYPE=--)
[ 3391.282102][   C12] pc : blk_attempt_bio_merge.part.6+0x38/0x158
[ 3391.288080][   C12] lr : blk_attempt_plug_merge+0xc0/0x1b0
[ 3391.293540][   C12] sp : 8000718abc30
[ 3391.297530][   C12] x29: 8000718abc30 x28: 
[ 3391.303509][   C12] x27: 2020361d9ac0 x26: 0001
[ 3391.309488][   C12] x25: 0001 x24: 8000718abd08
[ 3391.315467][   C12] x23: 0020a10dbb00 x22: 0001
[ 3391.321445][   C12] x21: 8000718abe20 x20: 0020a10dbb00
[ 3391.327424][   C12] x19: 0020aed7b040 x18: 
[ 3391.333402][   C12] x17:  x16: fde0
[ 3391.339381][   C12] x15: 1000 x14: 
[ 3391.345359][   C12] x13: 2000 x12: 2060011f9070
[ 3391.351338][   C12] x11: 0001 x10: 0002
[ 3391.357316][   C12] x9 : 800010586c38 x8 : 202009a7f200
[ 3391.363294][   C12] x7 : 8000718abd00 x6 : 20200df33750
[ 3391.369274][   C12] x5 : 0001 x4 : 
[ 3391.375252][   C12] x3 : 0001 x2 : 0020a10dbb00
[ 3391.381230][   C12] x1 : 0020aed7b040 x0 : 1000
[ 3391.387210][   C12] Call trace:
[ 3391.390338][   C12]  blk_attempt_bio_merge.part.6+0x38/0x158
[ 3391.395970][   C12]  blk_attempt_plug_merge+0xc0/0x1b0
[ 3391.401085][   C12]  blk_mq_submit_bio+0x398/0x550
[ 3391.405856][   C12]  submit_bio_noacct+0x308/0x380
[ 3391.410630][   C12]  dmcrypt_write+0x1e4/0x208 [dm_crypt]
[ 3391.416005][   C12]  kthread+0x130/0x138
[ 3391.419911][   C12]  ret_from_fork+0x10/0x18

Signed-off-by: Mikulas Patocka 
Reported-by: yangerkun 
Fixes: dc2676210c42 ("dm crypt: offload writes to thread")
Cc: sta...@vger.kernel.org

---
  drivers/md/dm-crypt.c |1 +
  1 file changed, 1 insertion(+)

Index: linux-2.6/drivers/md/dm-crypt.c
===
--- linux-2.6.orig/drivers/md/dm-crypt.c
+++ linux-2.6/drivers/md/dm-crypt.c
@@ -1954,6 +1954,7 @@ pop_from_list:
io = crypt_io_from_node(rb_first(&write_tree));
rb_erase(&io->rb_node, &write_tree);
kcryptd_io_write(io);
+   cond_resched();
} while (!RB_EMPTY_ROOT(&write_tree));
blk_finish_plug(&plug);
}



I suggested the same here:
https://listman.redhat.com/archives/dm-devel/2023-February/053402.html

Yet yangerkun resisted it:
https://listman.redhat.com/archives/dm-devel/2023-February/053410.html

And stuck with their more throttled approach (which looks wrong to me,
shouldn't it be using time_is_after_jiffies!?)


Hi,

-->
  |||
  start_time start_time + HZ jiffies

(start_time + HZ) < jiffies means since start_time, we have pass 1s, so 
we need to cond_resched(see writeback_sb_inodes and 
blk_mq_do_dispatch_sched)...


Thanks,
Kun.


https://listman.redhat.com/archives/dm-devel/2023-March/053473.html

Anyway, I'm inclined to take your patch, I share your views:
https://listman.redhat.com/archives/dm-devel/2023-March/053470.html




Mike


--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] dm crypt: initialize tasklet in crypt_io_init()

2023-03-06 Thread Mike Snitzer
On Mon, Mar 06 2023 at  8:49P -0500,
Hou Tao  wrote:

> From: Hou Tao 
> 
> When neither no_read_workqueue nor no_write_workqueue are enabled,
> tasklet_trylock() in crypt_dec_pending() may still return false due to
> an uninitialized state, and dm-crypt will do io completion in io_queue
> instead of current context unnecessarily.

Have you actually experienced this?

> Fix it by initializing io->tasklet in crypt_io_init().

Really would rather avoid always calling tasklet_init(). But I can
optimize it away with a later patch.

Mike

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel



[dm-devel] [PATCH] dm-zero: support discards

2023-03-06 Thread Mikulas Patocka
This patch adds discard support to dm-zero. The discards are ignored.
It is useful when the user combines dm-zero with other discard-supporting
targets in the same table; without dm-zero support, discards would be
disabled for the whole combined device.

Signed-off-by: Mikulas Patocka 

---
 drivers/md/dm-table.c |9 -
 drivers/md/dm-zero.c  |4 +++-
 2 files changed, 11 insertions(+), 2 deletions(-)

Index: linux-2.6/drivers/md/dm-zero.c
===
--- linux-2.6.orig/drivers/md/dm-zero.c
+++ linux-2.6/drivers/md/dm-zero.c
@@ -27,6 +27,7 @@ static int zero_ctr(struct dm_target *ti
 * Silently drop discards, avoiding -EOPNOTSUPP.
 */
ti->num_discard_bios = 1;
+   ti->discards_supported = true;
 
return 0;
 }
@@ -45,6 +46,7 @@ static int zero_map(struct dm_target *ti
zero_fill_bio(bio);
break;
case REQ_OP_WRITE:
+   case REQ_OP_DISCARD:
/* writes get silently dropped */
break;
default:
@@ -59,7 +61,7 @@ static int zero_map(struct dm_target *ti
 
 static struct target_type zero_target = {
.name   = "zero",
-   .version = {1, 1, 0},
+   .version = {1, 2, 0},
.features = DM_TARGET_NOWAIT,
.module = THIS_MODULE,
.ctr= zero_ctr,
Index: linux-2.6/drivers/md/dm-table.c
===
--- linux-2.6.orig/drivers/md/dm-table.c
+++ linux-2.6/drivers/md/dm-table.c
@@ -1670,8 +1670,15 @@ int dm_calculate_queue_limits(struct dm_
 
blk_set_stacking_limits(&ti_limits);
 
-   if (!ti->type->iterate_devices)
+   if (!ti->type->iterate_devices) {
+   if (ti->discards_supported) {
+   ti_limits.max_discard_sectors = UINT_MAX;
+   ti_limits.max_hw_discard_sectors = UINT_MAX;
+   ti_limits.discard_granularity = 512;
+   ti_limits.discard_alignment = 0;
+   }
goto combine_limits;
+   }
 
/*
 * Combine queue limits of all the devices this target uses.
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel



Re: [dm-devel] [PATCH] [PATCH] libmultipath: return 'ghost' state when port is in transition

2023-03-06 Thread Benjamin Marzinski
On Mon, Mar 06, 2023 at 12:46:50PM +0100, Martin Wilck wrote:
> Hi Brian,
> 
> On Sat, 2023-03-04 at 12:49 -0800, Brian Bunker wrote:
> > > 
> > The checking for standby is 14 years old, and says that TUR returns
> > a unit attention when the path is in standby. I am not sure why that
> > wouldn’t be handled by this code above: I would think there should be
> > just one unit attention for each I_T_L when an ALUA state change
> > happens.Not sure how it exceeds the retry count.
> > 
> > if (key == 0x6) {
> >     /* Unit Attention, retry */
> >     if (—retry_tur)
> >     goto retry;
> > }
> > 
> > From my perspective failing a path for ALUA state standby is
> > reasonable
> > since it is not an active state.
> 
> I think the historic rationale for using GHOST is that some storage
> arrays, in particular active-passive configurations, may keep certain
> port groups in STANDBY state. If STANDBY was classified as FAILED,
> "multipath -ll" would show all paths of such port groups as FAILED,
> which would confuse users.
> 
> That's what I meant before, multipath's GHOST can mean multiple things
> depending on the actual hardware in use, explicit/implicit ALUA, etc.
> Given that today basically every hardware supports ALUA, we could
> probably do better. But changing the semantics in the current situation
> is risky and error prone.

I am sympathetic to Martin's view that GHOST is an ambiguous state, and
it's not at all clear that in means "temporarily between states". In
fact, it ususally doesn't.  On the other hand, if we can be pretty
certain that devices won't keep paths in the TRANSITIONING state for an
extended time, but we can't be certain what the end state will be, I do
see the rationale for not failing them preemtively. 

I wonder if PATH_PENDING makes more sense.  We would retain the existing
state until the path left the TRANSITIONING state.  The question is, are
you trying to make paths that are transitioning out of a failed state
come back sooner, or are you trying to keep paths that were in a active
state from being prevemtively failed.  Using PATH_PENDING won't fix the
first case, only the second.

PATH_PENDING makes sure that if IO to the path does start failing, the
checker won't keep setting the path back to an active state again.  It
also avoids the another GHOST issue, where the path would end up being
grouped with any actually passive paths, which isn't what we're looking
for.

The one problem I can think of off the top of my head would be that if
the device was held in the TRANSISTIONING state for a long time,
multipathd would keep checking it constantly, since PATH_PENDING is
really meant for cases where the checker hasn't completed yet, and we
just want to keep looking for the result. I suppose it would be possible
to add another state that worked just like pending (and could even get
converted to PATH_PENDING if there was no other state to be converted
to) but didn't cause us to retigger the checker so quickly.  But if
devices really will only be in TRANSITIONING for a short time, it might
not even be an issue we have to worry about.

Thoughts?

-Ben

> 
> Regards
> Martin
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] dm-crypt: fix softlockup in dmcrypt_write

2023-03-06 Thread Mikulas Patocka



On Mon, 6 Mar 2023, Bart Van Assche wrote:

> On 3/6/23 10:02, Mikulas Patocka wrote:
> > On Tue, 28 Feb 2023, yangerkun wrote:
> > > It's ok to fix the softlockup, but for async write encrypt,
> > > kcryptd_crypt_write_io_submit will add bio to write_tree, and once we
> > > call cond_resched before every kcryptd_io_write, the write performance
> > > may be poor while we meet a high cpu usage scene.
> > 
> > Hi
> > 
> > To fix this problem, find the PID of the process "dmcrypt_write" and
> > change its priority to -20, for example "renice -n -20 -p 34748".
> > 
> > This is the proper way how to fix it; locking up the process for one
> > second is not.
> > 
> > We used to have high-priority workqueues by default, but it caused audio
> > playback skipping, so we had to revert it - see
> > f612b2132db529feac4f965f28a1b9258ea7c22b.
> > 
> > Perhaps we should add an option to have high-priority kernel threads?
> 
> Would calling cond_resched() every n iterations instead of every iteration
> help? From mm/swapfile.c:
> 
>   if (unlikely(--latency_ration < 0)) {
>   cond_resched();
>   latency_ration = LATENCY_LIMIT;
>   }
> 
> Thanks,
> 
> Bart.

I think that if this helps, it is really a bug in the scheduler... It 
shouldn't switch tasks so often.

Mikulas
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel



[dm-devel] [PATCH] dm-crypt: add the "high_priority" flag

2023-03-06 Thread Mikulas Patocka



On Mon, 6 Mar 2023, Mikulas Patocka wrote:

> > Hi,
> > 
> > Thanks a lot for your review!
> > 
> > It's ok to fix the softlockup, but for async write encrypt, 
> > kcryptd_crypt_write_io_submit will add bio to write_tree, and once we 
> > call cond_resched before every kcryptd_io_write, the write performance 
> > may be poor while we meet a high cpu usage scene.
> 
> Hi
> 
> To fix this problem, find the PID of the process "dmcrypt_write" and 
> change its priority to -20, for example "renice -n -20 -p 34748".
> 
> This is the proper way how to fix it; locking up the process for one 
> second is not.
> 
> We used to have high-priority workqueues by default, but it caused audio 
> playback skipping, so we had to revert it - see 
> f612b2132db529feac4f965f28a1b9258ea7c22b.
> 
> Perhaps we should add an option to have high-priority kernel threads?
> 
> Mikulas

Here I'm sending a patch that makes it optional to raise the priority 
of dm-crypt workqueues and the write thread. Test it, if it helps for you.

Mikulas



From: Mikulas Patocka 

It was reported that dm-crypt performs badly when the system is loaded[1]. 
So I'm adding an option "high_priority" that sets the workqueues and the 
write_thread to nice level -20. This used to be the default in the past, 
but it caused audio skipping, so it had to be reverted - see the commit 
f612b2132db529feac4f965f28a1b9258ea7c22b. This commit makes it optional, 
so that normal users won't be harmed by it.

[1] https://listman.redhat.com/archives/dm-devel/2023-February/053410.html

Signed-off-by: Mikulas Patocka 

---
 Documentation/admin-guide/device-mapper/dm-crypt.rst |5 +++
 drivers/md/dm-crypt.c|   28 +--
 2 files changed, 25 insertions(+), 8 deletions(-)

Index: linux-2.6/drivers/md/dm-crypt.c
===
--- linux-2.6.orig/drivers/md/dm-crypt.c
+++ linux-2.6/drivers/md/dm-crypt.c
@@ -133,9 +133,9 @@ struct iv_elephant_private {
  * and encrypts / decrypts at the same time.
  */
 enum flags { DM_CRYPT_SUSPENDED, DM_CRYPT_KEY_VALID,
-DM_CRYPT_SAME_CPU, DM_CRYPT_NO_OFFLOAD,
-DM_CRYPT_NO_READ_WORKQUEUE, DM_CRYPT_NO_WRITE_WORKQUEUE,
-DM_CRYPT_WRITE_INLINE };
+DM_CRYPT_SAME_CPU, DM_CRYPT_HIGH_PRIORITY,
+DM_CRYPT_NO_OFFLOAD, DM_CRYPT_NO_READ_WORKQUEUE,
+DM_CRYPT_NO_WRITE_WORKQUEUE, DM_CRYPT_WRITE_INLINE };
 
 enum cipher_flags {
CRYPT_MODE_INTEGRITY_AEAD,  /* Use authenticated mode for cipher */
@@ -3087,7 +3087,7 @@ static int crypt_ctr_optional(struct dm_
struct crypt_config *cc = ti->private;
struct dm_arg_set as;
static const struct dm_arg _args[] = {
-   {0, 8, "Invalid number of feature args"},
+   {0, 9, "Invalid number of feature args"},
};
unsigned int opt_params, val;
const char *opt_string, *sval;
@@ -3114,6 +3114,8 @@ static int crypt_ctr_optional(struct dm_
 
else if (!strcasecmp(opt_string, "same_cpu_crypt"))
set_bit(DM_CRYPT_SAME_CPU, &cc->flags);
+   else if (!strcasecmp(opt_string, "high_priority"))
+   set_bit(DM_CRYPT_HIGH_PRIORITY, &cc->flags);
 
else if (!strcasecmp(opt_string, "submit_from_crypt_cpus"))
set_bit(DM_CRYPT_NO_OFFLOAD, &cc->flags);
@@ -3352,18 +3354,22 @@ static int crypt_ctr(struct dm_target *t
}
 
ret = -ENOMEM;
-   cc->io_queue = alloc_workqueue("kcryptd_io/%s", WQ_MEM_RECLAIM, 1, 
devname);
+   cc->io_queue = alloc_workqueue("kcryptd_io/%s", WQ_MEM_RECLAIM |
+  test_bit(DM_CRYPT_HIGH_PRIORITY, 
&cc->flags) * WQ_HIGHPRI,
+  1, devname);
if (!cc->io_queue) {
ti->error = "Couldn't create kcryptd io queue";
goto bad;
}
 
if (test_bit(DM_CRYPT_SAME_CPU, &cc->flags))
-   cc->crypt_queue = alloc_workqueue("kcryptd/%s", 
WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM,
+   cc->crypt_queue = alloc_workqueue("kcryptd/%s", 
WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM |
+ 
test_bit(DM_CRYPT_HIGH_PRIORITY, &cc->flags) * WQ_HIGHPRI,
  1, devname);
else
cc->crypt_queue = alloc_workqueue("kcryptd/%s",
- WQ_CPU_INTENSIVE | 
WQ_MEM_RECLAIM | WQ_UNBOUND,
+ WQ_CPU_INTENSIVE | 
WQ_MEM_RECLAIM | WQ_UNBOUND |
+ 
test_bit(DM_CRYPT_HIGH_PRIORITY, &cc->flags) * WQ_HIGHPRI,
  num_online_cpus(), devname);
if (!cc->crypt_queue) {
ti->error = "Couldn't create kcryptd queue";
@@ -3380,6 +3386,8 @@ static int c

Re: [dm-devel] dm-crypt: fix softlockup in dmcrypt_write

2023-03-06 Thread Bart Van Assche

On 3/6/23 10:02, Mikulas Patocka wrote:

On Tue, 28 Feb 2023, yangerkun wrote:

It's ok to fix the softlockup, but for async write encrypt,
kcryptd_crypt_write_io_submit will add bio to write_tree, and once we
call cond_resched before every kcryptd_io_write, the write performance
may be poor while we meet a high cpu usage scene.


Hi

To fix this problem, find the PID of the process "dmcrypt_write" and
change its priority to -20, for example "renice -n -20 -p 34748".

This is the proper way how to fix it; locking up the process for one
second is not.

We used to have high-priority workqueues by default, but it caused audio
playback skipping, so we had to revert it - see
f612b2132db529feac4f965f28a1b9258ea7c22b.

Perhaps we should add an option to have high-priority kernel threads?


Would calling cond_resched() every n iterations instead of every 
iteration help? From mm/swapfile.c:


if (unlikely(--latency_ration < 0)) {
cond_resched();
latency_ration = LATENCY_LIMIT;
}

Thanks,

Bart.

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel



Re: [dm-devel] dm-crypt: fix softlockup in dmcrypt_write

2023-03-06 Thread Mikulas Patocka


On Tue, 28 Feb 2023, yangerkun wrote:

> 
> 
> 在 2023/2/28 2:06, Mike Snitzer 写道:
> > On Mon, Feb 27 2023 at  1:03P -0500,
> > Mike Snitzer  wrote:
> > 
> >> On Mon, Feb 27 2023 at 12:55P -0500,
> >> Mike Snitzer  wrote:
> >>
> >>> On Sun, Feb 26 2023 at  8:31P -0500,
> >>> yangerkun  wrote:
> >>>
> 
> 
>  在 2023/2/26 10:01, Bart Van Assche 写道:
> > On 2/22/23 19:19, yangerkun wrote:
> >> @@ -1924,6 +1926,10 @@ static int dmcrypt_write(void *data)
> >>    BUG_ON(rb_parent(write_tree.rb_node));
> >> +    if (time_is_before_jiffies(start_time + HZ)) {
> >> +    schedule();
> >> +    start_time = jiffies;
> >> +    }
> >
> > Why schedule() instead of cond_resched()?
> 
>  cond_resched may not really schedule, which may trigger the problem too,
> but
>  it seems after 1 second, it may never happend?
> >>>
> >>> I had the same question as Bart when reviewing your homegrown
> >>> conditional schedule().  Hopefully you can reproduce this issue?  If
> >>> so, please see if simply using cond_resched() fixes the issue.
> >>
> >> This seems like a more appropriate patch:
> >>
> >> diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
> >> index 87c5706131f2..faba1be572f9 100644
> >> --- a/drivers/md/dm-crypt.c
> >> +++ b/drivers/md/dm-crypt.c
> >> @@ -1937,6 +1937,7 @@ static int dmcrypt_write(void *data)
> >>io = crypt_io_from_node(rb_first(&write_tree));
> >>rb_erase(&io->rb_node, &write_tree);
> >>kcryptd_io_write(io);
> >> +  cond_resched();
> >>} while (!RB_EMPTY_ROOT(&write_tree));
> >>blk_finish_plug(&plug);
> >>}
> > 
> > 
> > or:
> > 
> > diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
> > index 87c5706131f2..3ba2fd3e4358 100644
> > --- a/drivers/md/dm-crypt.c
> > +++ b/drivers/md/dm-crypt.c
> > @@ -1934,6 +1934,7 @@ static int dmcrypt_write(void *data)
> >  */
> > blk_start_plug(&plug);
> > do {
> > +   cond_resched();
> > io = crypt_io_from_node(rb_first(&write_tree));
> > rb_erase(&io->rb_node, &write_tree);
> > kcryptd_io_write(io);
> 
> Hi,
> 
> Thanks a lot for your review!
> 
> It's ok to fix the softlockup, but for async write encrypt, 
> kcryptd_crypt_write_io_submit will add bio to write_tree, and once we 
> call cond_resched before every kcryptd_io_write, the write performance 
> may be poor while we meet a high cpu usage scene.

Hi

To fix this problem, find the PID of the process "dmcrypt_write" and 
change its priority to -20, for example "renice -n -20 -p 34748".

This is the proper way how to fix it; locking up the process for one 
second is not.

We used to have high-priority workqueues by default, but it caused audio 
playback skipping, so we had to revert it - see 
f612b2132db529feac4f965f28a1b9258ea7c22b.

Perhaps we should add an option to have high-priority kernel threads?

Mikulas

> kcryptd_crypt_write_io_submit will wakeup write_thread once there is a 
> empty write_tree, and dmcrypt_write will peel the old write_tree to 
> submit bio, so there can not exist too many bio in write_tree. Then I 
> choose yield cpu before the 'while' that submit bio...
> 
> Thanks,
> Kun.
> 
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://listman.redhat.com/mailman/listinfo/dm-devel
> --
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [PATCH] [PATCH] libmultipath: return 'ghost' state when port is in transition

2023-03-06 Thread Benjamin Marzinski
On Tue, Feb 21, 2023 at 12:56:43PM -0800, Brian Bunker wrote:
> A test unit ready command sent on a path in standby state will not result
> in a failed path. The same should be true for a path in the
> transitioning state.
> 
> Signed-off-by: Brian Bunker br...@purestorage.com

Reviewed-by: Benjamin Marzinski 

> ---
>  libmultipath/checkers/tur.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/libmultipath/checkers/tur.c b/libmultipath/checkers/tur.c
> index 551dc4f0..fff7987b 100644
> --- a/libmultipath/checkers/tur.c
> +++ b/libmultipath/checkers/tur.c
> @@ -179,10 +179,11 @@ retry:
> else if (key == 0x2) {
> /* Not Ready */
> /* Note: Other ALUA states are either UP or DOWN */
> -   if( asc == 0x04 && ascq == 0x0b){
> +   if (asc == 0x04 &&
> +   (ascq == 0x0b || ascq == 0x0a)) {
> /*
>  * LOGICAL UNIT NOT ACCESSIBLE,
> -* TARGET PORT IN STANDBY STATE
> +* TARGET PORT IN STANDBY OR TRANSITION STATE
>  */
> *msgid = CHECKER_MSGID_GHOST;
> return PATH_GHOST;
> --
> 
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://listman.redhat.com/mailman/listinfo/dm-devel
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel



Re: [dm-devel] dm-crypt: add cond_resched() to dmcrypt_write()

2023-03-06 Thread Mike Snitzer
On Mon, Mar 06 2023 at 11:17P -0500,
Mikulas Patocka  wrote:

> The loop in dmcrypt_write may be running for unbounded amount of time, 
> thus we need cond_resched() in it.
> 
> This commit fixes the following warning:
> 
> [ 3391.153255][   C12] watchdog: BUG: soft lockup - CPU#12 stuck for 23s! 
> [dmcrypt_write/2:2897]
> ...
> [ 3391.258372][   C12] CPU: 12 PID: 2897 Comm: dmcrypt_write/2 Tainted: G 
> L5.10.0+ #8
> [ 3391.267288][   C12] Hardware name: Huawei TaiShan 2280 V2/BC82AMDDA, BIOS 
> 1.75 04/26/2021
> [ 3391.275428][   C12] pstate: 8049 (Nzcv daif +PAN -UAO -TCO BTYPE=--)
> [ 3391.282102][   C12] pc : blk_attempt_bio_merge.part.6+0x38/0x158
> [ 3391.288080][   C12] lr : blk_attempt_plug_merge+0xc0/0x1b0
> [ 3391.293540][   C12] sp : 8000718abc30
> [ 3391.297530][   C12] x29: 8000718abc30 x28: 
> [ 3391.303509][   C12] x27: 2020361d9ac0 x26: 0001
> [ 3391.309488][   C12] x25: 0001 x24: 8000718abd08
> [ 3391.315467][   C12] x23: 0020a10dbb00 x22: 0001
> [ 3391.321445][   C12] x21: 8000718abe20 x20: 0020a10dbb00
> [ 3391.327424][   C12] x19: 0020aed7b040 x18: 
> [ 3391.333402][   C12] x17:  x16: fde0
> [ 3391.339381][   C12] x15: 1000 x14: 
> [ 3391.345359][   C12] x13: 2000 x12: 2060011f9070
> [ 3391.351338][   C12] x11: 0001 x10: 0002
> [ 3391.357316][   C12] x9 : 800010586c38 x8 : 202009a7f200
> [ 3391.363294][   C12] x7 : 8000718abd00 x6 : 20200df33750
> [ 3391.369274][   C12] x5 : 0001 x4 : 
> [ 3391.375252][   C12] x3 : 0001 x2 : 0020a10dbb00
> [ 3391.381230][   C12] x1 : 0020aed7b040 x0 : 1000
> [ 3391.387210][   C12] Call trace:
> [ 3391.390338][   C12]  blk_attempt_bio_merge.part.6+0x38/0x158
> [ 3391.395970][   C12]  blk_attempt_plug_merge+0xc0/0x1b0
> [ 3391.401085][   C12]  blk_mq_submit_bio+0x398/0x550
> [ 3391.405856][   C12]  submit_bio_noacct+0x308/0x380
> [ 3391.410630][   C12]  dmcrypt_write+0x1e4/0x208 [dm_crypt]
> [ 3391.416005][   C12]  kthread+0x130/0x138
> [ 3391.419911][   C12]  ret_from_fork+0x10/0x18
> 
> Signed-off-by: Mikulas Patocka 
> Reported-by: yangerkun 
> Fixes: dc2676210c42 ("dm crypt: offload writes to thread")
> Cc: sta...@vger.kernel.org
> 
> ---
>  drivers/md/dm-crypt.c |1 +
>  1 file changed, 1 insertion(+)
> 
> Index: linux-2.6/drivers/md/dm-crypt.c
> ===
> --- linux-2.6.orig/drivers/md/dm-crypt.c
> +++ linux-2.6/drivers/md/dm-crypt.c
> @@ -1954,6 +1954,7 @@ pop_from_list:
>   io = crypt_io_from_node(rb_first(&write_tree));
>   rb_erase(&io->rb_node, &write_tree);
>   kcryptd_io_write(io);
> + cond_resched();
>   } while (!RB_EMPTY_ROOT(&write_tree));
>   blk_finish_plug(&plug);
>   }
> 

I suggested the same here:
https://listman.redhat.com/archives/dm-devel/2023-February/053402.html

Yet yangerkun resisted it:
https://listman.redhat.com/archives/dm-devel/2023-February/053410.html

And stuck with their more throttled approach (which looks wrong to me,
shouldn't it be using time_is_after_jiffies!?)
https://listman.redhat.com/archives/dm-devel/2023-March/053473.html

Anyway, I'm inclined to take your patch, I share your views:
https://listman.redhat.com/archives/dm-devel/2023-March/053470.html

Mike

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel



[dm-devel] [PATCH v2] dm-crypt: fix softlockup in dmcrypt_write

2023-03-06 Thread yangerkun
From: yangerkun 

We meet a softlockup:

localhost login: [ 3391.153255][   C12] watchdog: BUG: soft lockup - CPU#12 
stuck for 23s! [dmcrypt_write/2:2897]
...
[ 3391.258372][   C12] CPU: 12 PID: 2897 Comm: dmcrypt_write/2 Tainted: G   
  L5.10.0+ #8
[ 3391.267288][   C12] Hardware name: Huawei TaiShan 2280 V2/BC82AMDDA, BIOS 
1.75 04/26/2021
[ 3391.275428][   C12] pstate: 8049 (Nzcv daif +PAN -UAO -TCO BTYPE=--)
[ 3391.282102][   C12] pc : blk_attempt_bio_merge.part.6+0x38/0x158
[ 3391.288080][   C12] lr : blk_attempt_plug_merge+0xc0/0x1b0
[ 3391.293540][   C12] sp : 8000718abc30
[ 3391.297530][   C12] x29: 8000718abc30 x28: 
[ 3391.303509][   C12] x27: 2020361d9ac0 x26: 0001
[ 3391.309488][   C12] x25: 0001 x24: 8000718abd08
[ 3391.315467][   C12] x23: 0020a10dbb00 x22: 0001
[ 3391.321445][   C12] x21: 8000718abe20 x20: 0020a10dbb00
[ 3391.327424][   C12] x19: 0020aed7b040 x18: 
[ 3391.333402][   C12] x17:  x16: fde0
[ 3391.339381][   C12] x15: 1000 x14: 
[ 3391.345359][   C12] x13: 2000 x12: 2060011f9070
[ 3391.351338][   C12] x11: 0001 x10: 0002
[ 3391.357316][   C12] x9 : 800010586c38 x8 : 202009a7f200
[ 3391.363294][   C12] x7 : 8000718abd00 x6 : 20200df33750
[ 3391.369274][   C12] x5 : 0001 x4 : 
[ 3391.375252][   C12] x3 : 0001 x2 : 0020a10dbb00
[ 3391.381230][   C12] x1 : 0020aed7b040 x0 : 1000
[ 3391.387210][   C12] Call trace:
[ 3391.390338][   C12]  blk_attempt_bio_merge.part.6+0x38/0x158
[ 3391.395970][   C12]  blk_attempt_plug_merge+0xc0/0x1b0
[ 3391.401085][   C12]  blk_mq_submit_bio+0x398/0x550
[ 3391.405856][   C12]  submit_bio_noacct+0x308/0x380
[ 3391.410630][   C12]  dmcrypt_write+0x1e4/0x208 [dm_crypt]
[ 3391.416005][   C12]  kthread+0x130/0x138
[ 3391.419911][   C12]  ret_from_fork+0x10/0x18

dmcrypt_write will be wakeup once there is a not empty write_tree, and
before really submit_bio, the old write_tree will be peeled, so there
may not so much bio, but every time we check RB_EMPTY_ROOT, it may be
false since the heavy load write, so the softlockup can trigger.

We can add cond_resched every time before or after kcryptd_io_write, but
dmcrypt_write seems performance sensitive since it will submit all
pending write, so we choose cond_resched after 1s before the 'while'
submit bio.

Fixes: dc2676210c42 ("dm crypt: offload writes to thread")
Signed-off-by: yangerkun 
---
 drivers/md/dm-crypt.c | 6 ++
 1 file changed, 6 insertions(+)

v1->v2:
change schedule() to cond_resched()

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 40cb1719ae4d..6a675257e00e 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1894,6 +1894,7 @@ static int dmcrypt_write(void *data)
 {
struct crypt_config *cc = data;
struct dm_crypt_io *io;
+   unsigned long start_time = jiffies;
 
while (1) {
struct rb_root write_tree;
@@ -1916,6 +1917,7 @@ static int dmcrypt_write(void *data)
 
schedule();
 
+   start_time = jiffies;
set_current_state(TASK_RUNNING);
spin_lock_irq(&cc->write_thread_lock);
goto continue_locked;
@@ -1927,6 +1929,10 @@ static int dmcrypt_write(void *data)
 
BUG_ON(rb_parent(write_tree.rb_node));
 
+   if (time_is_before_jiffies(start_time + HZ)) {
+   cond_resched();
+   start_time = jiffies;
+   }
/*
 * Note: we cannot walk the tree here with rb_next because
 * the structures may be freed when kcryptd_io_write is called.
-- 
2.31.1

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel



[dm-devel] [PATCH] dm crypt: initialize tasklet in crypt_io_init()

2023-03-06 Thread Hou Tao
From: Hou Tao 

When neither no_read_workqueue nor no_write_workqueue are enabled,
tasklet_trylock() in crypt_dec_pending() may still return false due to
an uninitialized state, and dm-crypt will do io completion in io_queue
instead of current context unnecessarily.

Fix it by initializing io->tasklet in crypt_io_init().

Fixes: 8e14f610159d ("dm crypt: do not call bio_endio() from the dm-crypt 
tasklet")
Signed-off-by: Hou Tao 
---
 drivers/md/dm-crypt.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 3aeeb8f2802f..caee6ce3b79f 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -238,6 +238,7 @@ static void crypt_endio(struct bio *clone);
 static void kcryptd_queue_crypt(struct dm_crypt_io *io);
 static struct scatterlist *crypt_get_sg_data(struct crypt_config *cc,
 struct scatterlist *sg);
+static void kcryptd_crypt_tasklet(unsigned long work);
 
 static bool crypt_integrity_aead(struct crypt_config *cc);
 
@@ -1725,6 +1726,7 @@ static void crypt_io_init(struct dm_crypt_io *io, struct 
crypt_config *cc,
io->sector = sector;
io->error = 0;
io->ctx.r.req = NULL;
+   tasklet_init(&io->tasklet, kcryptd_crypt_tasklet, (unsigned 
long)&io->work);
io->integrity_metadata = NULL;
io->integrity_metadata_from_pool = false;
atomic_set(&io->io_pending, 0);
@@ -2226,7 +2228,6 @@ static void kcryptd_queue_crypt(struct dm_crypt_io *io)
 * it is being executed with irqs disabled.
 */
if (in_hardirq() || irqs_disabled()) {
-   tasklet_init(&io->tasklet, kcryptd_crypt_tasklet, 
(unsigned long)&io->work);
tasklet_schedule(&io->tasklet);
return;
}
-- 
2.29.2

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel



[dm-devel] [PATCH] dm-crypt: add cond_resched() to dmcrypt_write()

2023-03-06 Thread Mikulas Patocka
The loop in dmcrypt_write may be running for unbounded amount of time, 
thus we need cond_resched() in it.

This commit fixes the following warning:

[ 3391.153255][   C12] watchdog: BUG: soft lockup - CPU#12 stuck for 23s! 
[dmcrypt_write/2:2897]
...
[ 3391.258372][   C12] CPU: 12 PID: 2897 Comm: dmcrypt_write/2 Tainted: G   
  L5.10.0+ #8
[ 3391.267288][   C12] Hardware name: Huawei TaiShan 2280 V2/BC82AMDDA, BIOS 
1.75 04/26/2021
[ 3391.275428][   C12] pstate: 8049 (Nzcv daif +PAN -UAO -TCO BTYPE=--)
[ 3391.282102][   C12] pc : blk_attempt_bio_merge.part.6+0x38/0x158
[ 3391.288080][   C12] lr : blk_attempt_plug_merge+0xc0/0x1b0
[ 3391.293540][   C12] sp : 8000718abc30
[ 3391.297530][   C12] x29: 8000718abc30 x28: 
[ 3391.303509][   C12] x27: 2020361d9ac0 x26: 0001
[ 3391.309488][   C12] x25: 0001 x24: 8000718abd08
[ 3391.315467][   C12] x23: 0020a10dbb00 x22: 0001
[ 3391.321445][   C12] x21: 8000718abe20 x20: 0020a10dbb00
[ 3391.327424][   C12] x19: 0020aed7b040 x18: 
[ 3391.333402][   C12] x17:  x16: fde0
[ 3391.339381][   C12] x15: 1000 x14: 
[ 3391.345359][   C12] x13: 2000 x12: 2060011f9070
[ 3391.351338][   C12] x11: 0001 x10: 0002
[ 3391.357316][   C12] x9 : 800010586c38 x8 : 202009a7f200
[ 3391.363294][   C12] x7 : 8000718abd00 x6 : 20200df33750
[ 3391.369274][   C12] x5 : 0001 x4 : 
[ 3391.375252][   C12] x3 : 0001 x2 : 0020a10dbb00
[ 3391.381230][   C12] x1 : 0020aed7b040 x0 : 1000
[ 3391.387210][   C12] Call trace:
[ 3391.390338][   C12]  blk_attempt_bio_merge.part.6+0x38/0x158
[ 3391.395970][   C12]  blk_attempt_plug_merge+0xc0/0x1b0
[ 3391.401085][   C12]  blk_mq_submit_bio+0x398/0x550
[ 3391.405856][   C12]  submit_bio_noacct+0x308/0x380
[ 3391.410630][   C12]  dmcrypt_write+0x1e4/0x208 [dm_crypt]
[ 3391.416005][   C12]  kthread+0x130/0x138
[ 3391.419911][   C12]  ret_from_fork+0x10/0x18

Signed-off-by: Mikulas Patocka 
Reported-by: yangerkun 
Fixes: dc2676210c42 ("dm crypt: offload writes to thread")
Cc: sta...@vger.kernel.org

---
 drivers/md/dm-crypt.c |1 +
 1 file changed, 1 insertion(+)

Index: linux-2.6/drivers/md/dm-crypt.c
===
--- linux-2.6.orig/drivers/md/dm-crypt.c
+++ linux-2.6/drivers/md/dm-crypt.c
@@ -1954,6 +1954,7 @@ pop_from_list:
io = crypt_io_from_node(rb_first(&write_tree));
rb_erase(&io->rb_node, &write_tree);
kcryptd_io_write(io);
+   cond_resched();
} while (!RB_EMPTY_ROOT(&write_tree));
blk_finish_plug(&plug);
}

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel



Re: [dm-devel] [PATCH v2] dm-crypt: fix softlockup in dmcrypt_write

2023-03-06 Thread Mikulas Patocka



On Mon, 6 Mar 2023, yangerkun wrote:

> From: yangerkun 
> 
> We meet a softlockup:
> 
> localhost login: [ 3391.153255][   C12] watchdog: BUG: soft lockup - CPU#12 
> stuck for 23s! [dmcrypt_write/2:2897]
> ...
> [ 3391.258372][   C12] CPU: 12 PID: 2897 Comm: dmcrypt_write/2 Tainted: G 
> L5.10.0+ #8
> [ 3391.267288][   C12] Hardware name: Huawei TaiShan 2280 V2/BC82AMDDA, BIOS 
> 1.75 04/26/2021
> [ 3391.275428][   C12] pstate: 8049 (Nzcv daif +PAN -UAO -TCO BTYPE=--)
> [ 3391.282102][   C12] pc : blk_attempt_bio_merge.part.6+0x38/0x158
> [ 3391.288080][   C12] lr : blk_attempt_plug_merge+0xc0/0x1b0
> [ 3391.293540][   C12] sp : 8000718abc30
> [ 3391.297530][   C12] x29: 8000718abc30 x28: 
> [ 3391.303509][   C12] x27: 2020361d9ac0 x26: 0001
> [ 3391.309488][   C12] x25: 0001 x24: 8000718abd08
> [ 3391.315467][   C12] x23: 0020a10dbb00 x22: 0001
> [ 3391.321445][   C12] x21: 8000718abe20 x20: 0020a10dbb00
> [ 3391.327424][   C12] x19: 0020aed7b040 x18: 
> [ 3391.333402][   C12] x17:  x16: fde0
> [ 3391.339381][   C12] x15: 1000 x14: 
> [ 3391.345359][   C12] x13: 2000 x12: 2060011f9070
> [ 3391.351338][   C12] x11: 0001 x10: 0002
> [ 3391.357316][   C12] x9 : 800010586c38 x8 : 202009a7f200
> [ 3391.363294][   C12] x7 : 8000718abd00 x6 : 20200df33750
> [ 3391.369274][   C12] x5 : 0001 x4 : 
> [ 3391.375252][   C12] x3 : 0001 x2 : 0020a10dbb00
> [ 3391.381230][   C12] x1 : 0020aed7b040 x0 : 1000
> [ 3391.387210][   C12] Call trace:
> [ 3391.390338][   C12]  blk_attempt_bio_merge.part.6+0x38/0x158
> [ 3391.395970][   C12]  blk_attempt_plug_merge+0xc0/0x1b0
> [ 3391.401085][   C12]  blk_mq_submit_bio+0x398/0x550
> [ 3391.405856][   C12]  submit_bio_noacct+0x308/0x380
> [ 3391.410630][   C12]  dmcrypt_write+0x1e4/0x208 [dm_crypt]
> [ 3391.416005][   C12]  kthread+0x130/0x138
> [ 3391.419911][   C12]  ret_from_fork+0x10/0x18
> 
> dmcrypt_write will be wakeup once there is a not empty write_tree, and
> before really submit_bio, the old write_tree will be peeled, so there
> may not so much bio, but every time we check RB_EMPTY_ROOT, it may be
> false since the heavy load write, so the softlockup can trigger.
> 
> We can add cond_resched every time before or after kcryptd_io_write, but
> dmcrypt_write seems performance sensitive since it will submit all
> pending write, so we choose cond_resched after 1s before the 'while'
> submit bio.
> 
> Fixes: dc2676210c42 ("dm crypt: offload writes to thread")
> Signed-off-by: yangerkun 

Hi

Thanks for the bug report.

I think that we don't have to count time and call cond_resched() only 
after being stuck for one second, we should call it always. There is no 
performance problem when calling cond_resched() - it does nothing if the 
scheduler hasn't decided that scheduling is needed.

cond_resched() should be called in the do-while loop because this loop can 
run for unbounded amount of time too.

Mikulas
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel



Re: [dm-devel] [PATCH] [PATCH] libmultipath: return 'ghost' state when port is in transition

2023-03-06 Thread Martin Wilck
Hi Brian,

On Sat, 2023-03-04 at 12:49 -0800, Brian Bunker wrote:
> > 
> The checking for standby is 14 years old, and says that TUR returns
> a unit attention when the path is in standby. I am not sure why that
> wouldn’t be handled by this code above: I would think there should be
> just one unit attention for each I_T_L when an ALUA state change
> happens.Not sure how it exceeds the retry count.
> 
> if (key == 0x6) {
>     /* Unit Attention, retry */
>     if (—retry_tur)
>     goto retry;
> }
> 
> From my perspective failing a path for ALUA state standby is
> reasonable
> since it is not an active state.

I think the historic rationale for using GHOST is that some storage
arrays, in particular active-passive configurations, may keep certain
port groups in STANDBY state. If STANDBY was classified as FAILED,
"multipath -ll" would show all paths of such port groups as FAILED,
which would confuse users.

That's what I meant before, multipath's GHOST can mean multiple things
depending on the actual hardware in use, explicit/implicit ALUA, etc.
Given that today basically every hardware supports ALUA, we could
probably do better. But changing the semantics in the current situation
is risky and error prone.

Regards
Martin

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] Thin pool CoW latency

2023-03-06 Thread Joe Thornber
On Sun, Mar 5, 2023 at 8:40 PM Demi Marie Obenour <
d...@invisiblethingslab.com> wrote:

> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA512
>
> Like Eric, I am very concerned about CoW latency and throughput.  I
> am almost certain that allocating new blocks and snapshot copy-on-write
> are _the_ hot paths in Qubes OS.  In particular, I suspect that
> workloads such as building an image in a throwaway VM or installing
> packages onto a root volume that had just been shapshotted are dominated
> by metadata operations, rather than by in-place updates.  I suspect that
> frequently-snapshotted volumes will observe similar behavior in general.
>
>
Yes, provisioning and breaking sharing are relatively slow operations.  As
discussed with Eric
I'm not intending to change how either of these operations is implemented.
If the performance
profile is not suitable for your application your company can either do
some work to improve it yourselves, or
select a different solution.

- Joe
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [announce] thin-provisioning-tools v1.0.0-rc1

2023-03-06 Thread Joe Thornber
Hi Eric,

On Fri, Mar 3, 2023 at 9:21 PM Eric Wheeler 
wrote:

>
> It would be nice to get people testing the new improvements:
>
> Do you think it can make it for the 6.3 merge window that is open?
>

Doubtful.  The bulk of the changes are in dm-bufio, which is used by a lot
of targets.  So
I want to do a lot more testing before pushing upstream.



> Did thinp v2 get dropped, or just turn into the patchset above?
>
>
I've shelved thinp v2 in favour of userland approach I outlined.

 > I think io_uring, and ublk have shown us that this is viable.  That way

> > a snapshot copy-on-write, or dm-cache data migration, which are very
> > slow operations can be done with ordinary userland code.
>
> Would be nice to minimize CoW latency somehow if going to userspace
> increases that a notable amount.  CoW for spinning disks is definitely
> slow, but NVMe's are pretty quick to copy a 64k chunk.
>

Yes, minimising latency would be good.  I don't mind performing the CoW
within kernel, but I don't want to
handle the metadata updates in kernel.


> > For the fast paths, layering will be removed by having userland give
> > the kernel instruction to execute for specific regions of the virtual
> > device (ie. remap to here).
>
> Maybe you just answered my question of latency?
>

Yep.


>
> > The kernel driver will have nothing specific to thin/cache etc. I'm not
> > sure how many of the current dm-targets would fit into this model, but
> > I'm sure thin provisioning, caching, linear, and stripe can.
>
> To be clear, linear and stripe would stay in the kernel?


Linear and stripe would not need a call out to userland to service.  And
very little of thin/cache io would either.

 - Joe
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel